This is a second brain of mine that keeps all the stats knowledge I am reviewing (or learning).
I have always admired understanding the "mathy" details behind these models, and I have a fear of misusing them. So here are some notes I put together that delve into the math behind these models. However, this is by no means a pure mathematical guide to these topics, as I'm not formally trained in math or theoretical statistics. Feel free to let me know if anything is incorrect and I will be happy to make changes. Happy learning! :)
Review for Estimation Theory:
- What are estimators: How sample mean and variance can be understood as estimators (almost done, https://kefangpsych.github.io/StatsReview/sample_mean_and_variance_as_estimators.html)
In psychological statistics, the sample mean and variance are typically introduced as summary statistics within descriptive statistics. This is intuitive and useful for understanding (except maybe for the n-1 in the sample variance). However, their role becomes less straightforward when transitioning to inferential statistics, such as z-tests or t-tests, where these statistics are used to construct the test statistics.
In this tutorial, I try to lay out the basic procedure of statistical modeling (inferential statistics) as 4 steps: model specification, estimation, statistical inference, and model diagnosis and evaluation. Specifically, we will focus on the estimation step, which deals with the problem of how to estimate the unknown parameters of the proposed data-generating process (DGP) from sample data. For example, when estimating the average height of NYU students based on data from 10 students, one could either calculate the sample mean to make a guess or, more simplistically, assume it to be 5'10". This section introduces fundamental concepts such as estimation, estimands, estimators, and estimates, along with the properties of estimators that make some more effective than others (e.g., why calculating the sample mean is generally preferable to assuming a fixed height of 5'10").
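To make the height example concrete, here is a minimal simulation sketch (not taken from the tutorial itself). The true mean of 68 inches, the standard deviation of 3 inches, the normal DGP, and the function name `simulate_estimators` are all assumptions I picked for illustration. It compares the two estimators by their bias and mean squared error over many repeated samples of 10 students.

```python
import random
import statistics


def simulate_estimators(true_mean=68.0, true_sd=3.0, n=10, reps=5000, seed=0):
    """Compare 'sample mean' with 'always guess 5 ft 10 in' as estimators.

    All numbers here are illustrative assumptions, not real NYU data.
    """
    rng = random.Random(seed)
    fixed_guess = 70.0  # 5'10" expressed in inches

    # Draw many samples of size n and record the sample mean of each one.
    mean_estimates = []
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean_estimates.append(statistics.fmean(sample))

    # Bias: average estimate minus the estimand (the true mean).
    bias_mean = statistics.fmean(mean_estimates) - true_mean
    # Mean squared error: average squared distance from the estimand.
    mse_mean = statistics.fmean([(m - true_mean) ** 2 for m in mean_estimates])

    # The fixed guess never varies, so its MSE is just its squared bias.
    bias_fixed = fixed_guess - true_mean
    mse_fixed = bias_fixed ** 2

    return {
        "bias_mean": bias_mean,
        "mse_mean": mse_mean,
        "bias_fixed": bias_fixed,
        "mse_fixed": mse_fixed,
    }


if __name__ == "__main__":
    for name, value in simulate_estimators().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the sample mean is roughly unbiased with an MSE near 3²/10 = 0.9, while the fixed guess is off by 2 inches every time. Note that the fixed guess would win if the true mean happened to be exactly 70, which is why we judge an estimator by how it behaves across possible DGPs and not by one lucky case.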
In the final part, we apply this framework to show that the familiar calculations of the sample mean and variance actually serve as estimators (specifically, method-of-moments and least-squares estimators) of the population expectation and variance. This discussion will also explain the use of n−1 in variance calculations and point out situations where using n might be appropriate instead. We will also show, somewhat paradoxically, that whereas the sample variance on average correctly estimates the variance of the DGP, the sample standard deviation does not. This foundation paves the way for future topics, including statistical inference about sample means and variances (given our educated guesses, how confident are we?), and more complex estimation problems like least squares and maximum likelihood estimation in linear regression.
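The claims about n−1 and the standard deviation can be checked with a small simulation. This is only a sketch under assumptions of my own choosing (a normal DGP with standard deviation 2, samples of size 5, and the hypothetical function name `simulate_variance_estimators`). It also includes one situation where dividing by n is unbiased, namely when the true mean is known; I am assuming this is among the cases the tutorial has in mind.

```python
import math
import random


def simulate_variance_estimators(true_mean=0.0, true_sd=2.0, n=5,
                                 reps=20000, seed=1):
    """Average several variance/SD estimators over many simulated samples."""
    rng = random.Random(seed)
    totals = {
        "var_n_minus_1": 0.0,     # sum of squares around x-bar, divided by n-1
        "var_n": 0.0,             # sum of squares around x-bar, divided by n
        "sd_n_minus_1": 0.0,      # square root of the n-1 variance
        "var_known_mean_n": 0.0,  # sum of squares around the TRUE mean, divided by n
    }
    for _ in range(reps):
        x = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = sum(x) / n
        ss_sample = sum((xi - xbar) ** 2 for xi in x)
        ss_known = sum((xi - true_mean) ** 2 for xi in x)

        totals["var_n_minus_1"] += ss_sample / (n - 1)
        totals["var_n"] += ss_sample / n
        totals["sd_n_minus_1"] += math.sqrt(ss_sample / (n - 1))
        totals["var_known_mean_n"] += ss_known / n

    # Averages over repetitions approximate each estimator's expectation.
    return {name: total / reps for name, total in totals.items()}


if __name__ == "__main__":
    print("true variance = 4.0, true sd = 2.0")
    for name, value in simulate_variance_estimators().items():
        print(f"{name}: {value:.3f}")
```

With these settings the n−1 version should average close to the true variance of 4, the n version should come out too small (around 4 × 4/5 = 3.2), and the square root of the n−1 variance should average below the true standard deviation of 2. The last point follows from Jensen's inequality: the square root is concave, so the expectation of the root is smaller than the root of the expectation.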
- OLS estimators of regression & Gauss-Markov assumptions (why we need them)
Review for Inference Theory:
- One-sample z-test: basics of hypothesis testing, effect size, and power (https://kefangpsych.github.io/StatsReview/one-sample_z-test)
In inference, the trio - hypothesis testing, effect size, and power analysis - are all super important and super connected. I put together this handy guide because I noticed something kinda off about the way we usually learn statistics. In my own statistical training, courses left power analysis for dessert, like it's an afterthought. Problem is, that makes it harder to see how it fits in with everything else, especially when you dive into complicated models that involve juggling a bunch of tests at once. So, what I've done here is stick to the basics. Think of it as training wheels: we're going to focus on the simplest type of hypothesis test, the one-sample z-test. I'll walk through it step by step, breaking down the math so it's easier to digest, with the aim of understanding not just the what, but also the why and the how in the formal language of math. I wrote this guide because I have always hoped to learn stats in a more rigorous, "mathy" way while keeping it intuitive and accessible. And I believe that bringing big theoretical concepts like power down to a simple example can sometimes be the best way to learn. It's like building a toy house before you build the real one - once you've got the fundamentals nailed, everything else kind of falls into place. So I hope this guide provides the logic and math details behind the concepts in a way that's way less intimidating and way more fun.
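As a small taste of how the trio fits together, here is a minimal sketch of a two-sided one-sample z-test and its power, using only Python's standard library. The function names (`z_test`, `z_test_power`), the IQ-style example numbers, and the choice of a two-sided test at alpha = 0.05 are my own assumptions for illustration and not necessarily the setup used in the guide.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample z-test with known population sd `sigma`.

    Returns the z statistic and the two-sided p-value.
    """
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


def z_test_power(effect_size, n, alpha=0.05):
    """Power of the two-sided z-test for a standardized effect size.

    effect_size is d = (mu1 - mu0) / sigma, where mu1 is the true mean.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is normal with mean d * sqrt(n).
    shift = effect_size * math.sqrt(n)
    # Probability of landing in either rejection region.
    return STD_NORMAL.cdf(-z_crit + shift) + STD_NORMAL.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Hypothetical example: 25 scores averaging 103, H0 mean 100, sigma 15.
    z, p = z_test(sample_mean=103, mu0=100, sigma=15, n=25)
    print(f"z = {z:.2f}, p = {p:.3f}")

    # Power grows with both the effect size and the sample size.
    for n in (8, 16, 32, 64):
        print(f"d = 0.5, n = {n}: power = {z_test_power(0.5, n):.3f}")
```

One thing this makes visible: the same quantity, d times the square root of n, drives both the test statistic and the power, which is why effect size and sample size cannot really be discussed separately from the test itself. When the effect size is zero, the power formula simply returns alpha, the false positive rate.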
- One-sample Chi-square test: test for variance