March 2020

The beta distribution

Posted on March 31, 2020 (updated May 4, 2021) by LV

Contents: Definition; Using the mean; Using ranges for the prior; Informative vs. uninformative priors; Mean, median, variance; Highest Density Interval; References; Addenda

Definition: The Beta distribution is a probability distribution over probabilities. It is the conjugate prior of the binomial distribution (if the prior and the posterior distribution are in the same family, the prior and posterior…
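
As a hedged illustration of the conjugacy mentioned in this excerpt (the symbols \(\alpha\), \(\beta\), \(\theta\), \(n\) and \(y\) are assumed notation, not taken from the post): if the prior on a binomial success probability \(\theta\) is \(\mathrm{Beta}(\alpha, \beta)\) and \(y\) successes are observed in \(n\) trials, the posterior stays in the Beta family:

\[
\theta \sim \mathrm{Beta}(\alpha, \beta), \quad y \mid \theta \sim \mathrm{Binomial}(n, \theta) \;\Longrightarrow\; \theta \mid y \sim \mathrm{Beta}(\alpha + y,\; \beta + n - y)
\]

For example, a \(\mathrm{Beta}(2, 2)\) prior combined with 7 successes in 10 trials gives a \(\mathrm{Beta}(9, 5)\) posterior.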

RMarkdown to WordPress

Posted on March 31, 2020 (updated May 3, 2021) by LV

One option is to create your markdown document in R, using \(\LaTeX\), knit it to HTML, then copy the file content into WordPress and add the following JavaScript code at the top: This will load images into the blog post itself and preserve the \(\LaTeX\) formatting. I prefer to use the MathJax-LaTeX plugin, and use the…

One-way ANOVA

Posted on March 31, 2020 (updated May 3, 2021) by LV

For a simple exercise to understand one-way ANOVA, I will use the data set red.cell.folate from the package ISwR (see the book “Introductory Statistics with R” by Peter Dalgaard), but will also generate my own data. And now (drum roll) … it’s time to run the ANOVA. Let’s look at what this ANOVA table means…

Information criteria: AIC, AICc, BIC

Posted on March 27, 2020 (updated May 3, 2021) by LV

Information-theoretic approaches view inference as a problem of model selection. The best model is the one that has the least information loss relative to the true model. Information criteria (IC) are estimates of the Kullback-Leibler information loss, which cannot be calculated for real-life models. The best-known IC is the Akaike IC…
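
For reference, the usual textbook forms of the three criteria named in the title can be sketched as below. The notation is an assumption rather than the post's own: \(\hat{L}\) is the maximized likelihood of the model, \(k\) the number of estimated parameters, and \(n\) the sample size.

\[
\begin{aligned}
\mathrm{AIC} &= -2\ln\hat{L} + 2k \\
\mathrm{AICc} &= \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \\
\mathrm{BIC} &= -2\ln\hat{L} + k\ln n
\end{aligned}
\]

In each case lower values indicate the preferred model, and the AICc correction term shrinks toward zero as \(n\) grows relative to \(k\).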

Does the p-value overestimate the strength of evidence?

Posted on March 26, 2020 (updated May 3, 2021) by LV

Thom Baguley points to the standardized or minimum LR (p. 381) to answer this question. The minimum LR represents a worst-case scenario for the null in that it compares the LR for the null against the MLE of the observed data, i.e. the most likely (strongest) possible hypothesis supported by the data, and is defined as …

Inference via likelihood

Posted on March 24, 2020 (updated May 3, 2021) by LV

The likelihood school affirms that likelihood ratios are the basic tool of inference. The likelihood is the (conjugate) probability of observed data D, conditional on the hypothesis A being true. Given two hypotheses, A and B, it is meaningless to assess evidence except by comparing the evidence favoring hypothesis A over hypothesis B….

Classical statistical inference and its discontents

Posted on March 23, 2020 (updated May 3, 2021) by LV

“Classical” statistical inference in medicine is usually synonymous with frequentist inference, the central element of which is null hypothesis significance testing (NHST). Even though that was not its original intent, NHST is used in practice to evaluate the evidence for or against a hypothesis, due to confusion in mixing Fisher’s approach with that of…

Likelihood and Probability

Posted on March 17, 2020 (updated May 3, 2021) by LV

The difference between probability and likelihood is central to understanding MLE, among other things. Randy Gallistel has posted a succinct treatment of this topic: The distinction between probability and likelihood is fundamentally important: Probability attaches to possible results; likelihood attaches to hypotheses [my emphasis]. Explaining this distinction is the purpose of this first column. Possible results…

Welcome to staRt!

Posted on March 12, 2020 (updated May 21, 2021) by LV

Welcome to staRt, the introductory R Cookbook. I am Monsieur Gustave H, your concierge. On this corner of the internet I am storing, for my own future reference, random tidbits and the occasional whole recipe about data science and statistics using the programming language R in the RStudio environment. There will be the occasional…