Tidyverse 🪐 to Polars 🐻‍❄️: My Notes

I found Polars syntax quite similar to dplyr, and the way we can chain functions makes it even more familiar! It was fun learning the nuances; now it’s time to put them into practice! Wish me luck! 🍀

Gemini 1.5 Flash Better Than RAG? Let’s Check It Out In R!

Overall, I am quite impressed with the responses! With minimal prompt engineering and document cleaning, it was able to return accurate responses, and it even separated different conditions and provided appropriate treatment options. It also answered tricky questions correctly that our RAG could not. It definitely has potential!

Llama, Llama, Oh Give Me A Sign. What’s In The Latest IDSA Guideline?

Wow, what a journey, and more to come! We learned how to perform simple RAG with an LLM and even ventured into LangChain territory. It wasn’t as scary as some people said! The documentation is fantastic. Best of all, we did it ALL in R with Reticulate, without leaving RStudio! Not only can we read IDSA Guidelines, we can also use an LLM to help us retrieve information!

V_s__l_ng M_ss_ng D_t_ W_th D_G & S_m_l_t__n

MCAR, MAR, MNAR, all so confusing. But with DAG, oh so amusing! Many technical words, I don’t understand, but with simulation, I am a fan! Join me in exploring missing mechanisms, learn I will with great optimism.

S.P.I.C.E of Causal Inference

The SUTVA, Positivity, Identifiability, Consistency, Exchangeability of Causal Inference: the essential ingredients that help us bring out the true flavor of the causal model. Here is my understanding of each assumption (main course) with examples (side dish), accompanied by simulation (paired with beverages). Bon Appétit!

Clearer Understanding of 95% Confidence Interval Through The Lens of Simulation

I’m now more confident in my understanding of the 95% confidence interval, but less certain about confidence intervals in general, knowing that we can’t be sure whether our current interval includes the true population parameter. On a brighter note, if we have the correct confidence interval, it could still encompass the true parameter even when it’s not statistically significant. I find that quite refreshing.
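To make the "95%" part concrete, here is a minimal sketch of a coverage simulation, written in Python rather than R. Everything in it is an assumption for illustration: the true mean, the known standard deviation, the sample size, and the function name `ci_coverage` are all made up, and it uses a simple z-interval with known sigma rather than whatever the original post used.

```python
import random
from statistics import NormalDist, fmean


def ci_coverage(true_mean=10.0, sigma=2.0, n=30, reps=2000, level=0.95, seed=1):
    """Fraction of simulated intervals that contain the true mean.

    Assumes normal data with a KNOWN sigma, so a z-interval is used.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        # Does this particular interval cover the true parameter?
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(f"Share of intervals covering the true mean: {coverage:.3f}")
```

Across many repetitions the share should land near 0.95, but any single interval either contains the true mean or it doesn’t, which is exactly the uncertainty described above.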

Calculating Number Needed to Treat/Harm (NNT/H) with Odds Ratio

We learned how to convert the pooled odds ratio from a random-effects model and subsequently calculate the number needed to treat (NNT) or harm (NNH). It’s important to understand that without knowing the event proportions in either the treatment or control groups, we cannot accurately estimate the absolute risk reduction for an individual study or for a meta-analysis. Fascinating indeed! Every day is a school day! 🙌
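As a rough sketch of why the control event proportion matters, the standard conversion from an odds ratio to an absolute risk difference can be written in a few lines of Python. The function names and the example numbers are hypothetical, and the sketch assumes the event is an undesirable outcome, so a lower risk in the treated group counts as a benefit.

```python
def treated_risk_from_or(odds_ratio, control_risk):
    """Convert an odds ratio to the treated-group risk, given the control risk."""
    return odds_ratio * control_risk / (1 - control_risk + odds_ratio * control_risk)


def nnt_from_or(odds_ratio, control_risk):
    """Return (number needed, label), where label is 'NNT' or 'NNH'.

    Assumes the event is a bad outcome: risk reduction -> NNT, risk increase -> NNH.
    """
    treated_risk = treated_risk_from_or(odds_ratio, control_risk)
    arr = control_risk - treated_risk  # absolute risk reduction
    if arr == 0:
        raise ValueError("No risk difference; NNT/NNH is undefined.")
    return 1 / abs(arr), "NNT" if arr > 0 else "NNH"


# The same odds ratio gives very different answers at different control risks.
for control_risk in (0.05, 0.20, 0.50):
    value, label = nnt_from_or(odds_ratio=0.5, control_risk=control_risk)
    print(f"control risk {control_risk:.2f}: {label} = {value:.2f}")
```

The loop shows the point of the post: the odds ratio alone does not pin down the NNT, because the answer shifts with the assumed control event proportion.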

Approaches to Calculating Number Needed to Treat (NNT) with Meta-Analysis

Here, we have demonstrated three different methods for calculating NNT with meta-analysis data. I learned a lot from this experience, and I hope you find it enjoyable and informative as well. Thank you, @wwrighID, for initiating the discussion and providing a pivotal example by using the highest-weight control event proportion to back-calculate ARR and, eventually, NNT. I also want to express my gratitude to @DrToddLee for contributing a brilliant method of pooling a single proportion from the control group for further estimation. Special thanks to @MatthewBJane, the meta-analysis maestro, for guiding me toward the correct equation to calculate event proportions, with weights estimated by the random-effects model. 🙏

An Educational Stroll With Stan - Part 4

What an incredible journey it has been! I’m thoroughly enjoying working with Stan code, even though I don’t yet grasp all the intricacies. We’ve already tackled simple linear and logistic regressions and delved into the application of Bayes’ theorem. Now, let’s turn our attention to the fascinating world of Mixed-Effects Models, also known as Hierarchical Models.

An Educational Stroll With Stan - Part 3

Diving into this, we’re exploring how using numbers to express our certainty/uncertainty, especially with medical results, can help sharpen our estimated ‘posterior value’ and offer a solid base for learning and discussions. We often talk about specifics like sensitivity without the nitty-gritty math, but crafting our own priors and using a dash of Bayes and visuals can really spotlight how our initial guesses shift. Sure, learning this takes patience, but once it clicks, it’s a game-changer – continuous learning for the win!

An Educational Stroll With Stan - Part 2

I learned a great deal throughout this journey. In the second part, I gained knowledge about implementing logistic regression in Stan. I also learned the significance of data type declarations for obtaining accurate estimates, how to use the posterior to predict new data, and what the generated quantities block in Stan is for. Moreover, having a friend who is well-versed in Bayesian statistics proves invaluable when delving into the Bayesian realm! Very fun indeed!

An Educational Stroll With Stan - Part 1

There is a lot to learn about Bayesian statistics, but it’s fun, exciting, and flexible! I thoroughly enjoyed the beginning of this journey. There will be learning curves, but there are so many great people and resources out there to help us get closer to understanding the Bayesian way.

Cracking the Code: Unveiling the Hidden Language of USB HID Keyboards!

Sending key presses to another device using software that emulates a keyboard, but isn't a physical keyboard, is a fascinating concept. We understand that in the Linux/Unix environment and with Python, this can be accomplished through low-level programming. But can the R programming language achieve the same feat? If it can, then how does it work?

Exploring Interaction Effects and S-Learners

Interaction adventures through simulations and gradient boosting trees using the S-learner approach. I hadn’t realized that LightGBM and XGBoost could reveal interaction terms without explicit specification. Quite intriguing!

Hugging Face 🤗, with a warm embrace, meet R️ ❤️

I’m delighted that R users can have access to the incredible Hugging Face pre-trained models. In this demonstration, we provide a straightforward example of how to utilize them for sentiment analysis using GPT-generated synthetic data from evaluation comments. Let’s go!

Unraveling the Effects: Collider Adjustments in Logistic Regression

Simulating a binary dataset, coupled with an understanding of the logit link and the linear formula, is truly fascinating! However, we must exercise caution regarding our adjustments, as they can potentially divert us from the true findings. I advocate for transparency in Directed Acyclic Graphs (DAGs) and emphasize the sequence: causal model -> estimator -> estimand.

From TakeOut to TakeIn: The Savings Simulator

Saving can be enjoyable! If you’re planning to cut down on takeout orders, why not use past data to simulate your savings? Let it inspire and motivate your future dining-in decisions! 👍

What Happens If Our Model Adjustment Includes A Collider?

Beware of what we adjust. As we have demonstrated, adjusting for a collider variable can lead to a biased estimate in our analysis. If a collider is included in the model, relying solely on AIC/BIC for model selection may provide misleading results and give us a false sense of achievement.
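Here is a minimal sketch of that warning in Python, using only the standard library. The data-generating process is an assumption made up for illustration (a true effect of 0.5 of `x` on `y`, and a collider `c` caused by both), and the helper names are hypothetical; it is not the simulation from the original post.

```python
import random


def simple_slope(y, x):
    """Slope of y ~ x (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_adjusted_for(y, x, c):
    """Coefficient of x in y ~ x + c (with intercept), via the normal equations."""
    n = len(y)
    mx, mc, my = sum(x) / n, sum(c) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    scc = sum((a - mc) ** 2 for a in c)
    sxc = sum((a - mx) * (b - mc) for a, b in zip(x, c))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    scy = sum((a - mc) * (b - my) for a, b in zip(c, y))
    det = sxx * scc - sxc ** 2
    return (scc * sxy - sxc * scy) / det


rng = random.Random(42)
n = 20000
true_effect = 0.5  # assumed causal effect of x on y
x = [rng.gauss(0, 1) for _ in range(n)]
y = [true_effect * xi + rng.gauss(0, 1) for xi in x]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]  # collider: x -> c <- y

unadjusted = simple_slope(y, x)
adjusted = slope_adjusted_for(y, x, c)
print(f"true effect: {true_effect}")
print(f"y ~ x      : {unadjusted:.3f}")
print(f"y ~ x + c  : {adjusted:.3f}")
```

Under these assumptions the unadjusted slope recovers the true effect, while the collider-adjusted coefficient is pulled well away from it and even flips sign, although the model with `c` fits `y` better.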

Front-door Adjustment

Front-door adjustment: a superhero method for handling unobserved confounding by using mediators (if present) to estimate causal effects accurately.

Seeking Inspiration from Random Learning

I didn’t want to read the textbook in sequence. Hence, I figured that if I read a paragraph a day in a random chapter, I might be able to benefit from random learning!

Math Puzzle #1

How to solve this… 2 ? 1 ? 6 ? 6 ? 200 ? 50 = 416.56

The 100 Prisoners Problem

Brief Introduction: The 100 prisoners problem is a probability theory and combinatorics problem. In this challenge, 100 numbered prisoners must find their own numbers in one of 100 drawers in order to survive.

Rules:
- We have 100 prisoners, labeled 1, 2 … 100 on their clothes.
- We have a room filled with 100 boxes, labeled 1, 2 … 100 on the outside of the boxes.
- Inside each box, there is a number from 1, 2 … 100.
- Only 1 prisoner may enter the room at a time.
- Each prisoner may open up to 50 boxes and cannot communicate with other prisoners.
- If a prisoner finds their number, they will exit the room and will not be able to talk to the other prisoners.
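The rules above can be sketched as a small simulation in Python. The excerpt does not say which strategy the prisoners use, so this sketch assumes the well-known cycle-following strategy (start at the box with your own label, then follow the number found inside). The function names are hypothetical, and prisoners and boxes are labeled 0 to 99 here instead of 1 to 100 for convenience.

```python
import random


def all_prisoners_succeed(n, max_opens, rng):
    """One trial of the cycle-following strategy; True if every prisoner finds their number."""
    boxes = list(range(n))
    rng.shuffle(boxes)  # boxes[i] is the number hidden inside box i
    for prisoner in range(n):
        current = prisoner  # start at the box carrying the prisoner's own label
        for _ in range(max_opens):
            if boxes[current] == prisoner:
                break  # found their own number
            current = boxes[current]  # next, open the box named by the number just seen
        else:
            return False  # ran out of attempts, so the whole group fails
    return True


def success_rate(n=100, max_opens=50, trials=2000, seed=7):
    """Estimate the probability that all prisoners succeed."""
    rng = random.Random(seed)
    wins = sum(all_prisoners_succeed(n, max_opens, rng) for _ in range(trials))
    return wins / trials


rate = success_rate()
print(f"Estimated group success rate: {rate:.3f}")
```

With this strategy the group succeeds whenever the random arrangement has no cycle longer than 50, which happens roughly 31% of the time, far better than opening boxes at random.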