2086 Lecture 4 Central Limit Theorem And Confidence Intervals

Lecture Note CLT: Lecture 4 Notes (Part I).pdf Lecture Note CI: Lecture 4 Notes (Part II).pdf

Previous: 2086 Lecture 3 - Estimation and Maximum Likelihood Next: 2086 Lecture 5 - Hypothesis Testing

Central limit theorem (CLT)

The central limit theorem (CLT) is one of the most important results in statistics. It states that whatever the population's probability distribution is (Binomial, Uniform, Bernoulli, and so on), provided it has finite variance, if we draw many samples and calculate the mean of each one, the distribution of those sample means approaches a normal distribution as the sample size increases.

Fact 1 (Central Limit Theorem): Let $Y_1, \ldots, Y_n$ be independent and identically distributed (i.i.d.) random variables (RVs) with $\mathbb{E}[Y_i] = \mu$ and $\mathbb{V}[Y_i] = \sigma^2 < \infty$. Then, for large $n$,

\sum_{i=1}^n Y_i \stackrel{d}{\to} N(n\mu, n\sigma^2)

The sample mean is $\hat{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$. Using the scaling rules for expectation and variance, $\mathbb{E}[cY] = c\mathbb{E}[Y]$ and $\mathbb{V}[cY] = c^2\mathbb{V}[Y]$, with $c = \frac{1}{n}$, the mean becomes $n\mu / n = \mu$ and the variance becomes $n\sigma^2 / n^2 = \sigma^2 / n$. The CLT can therefore be restated for the sample mean:

\hat{Y} \stackrel{d}{\to} N(\mu, \frac{\sigma^2}{n})

The greater the sample size $n$, the smaller the variance of the sample mean.
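
The claim can be checked numerically. The following is a minimal simulation sketch, not part of the lecture material: it assumes a Uniform(0, 1) population (so $\mu = 1/2$ and $\sigma^2 = 1/12$), and the helper name `sample_means`, the seed, and the values of `n` and `reps` are illustrative choices.

```python
import random
import statistics

def sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from `n` i.i.d. draws."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

# Assumed population: Uniform(0, 1), so mu = 1/2 and sigma^2 = 1/12.
mu, sigma2 = 0.5, 1 / 12
n, reps = 50, 5000

means = sample_means(lambda rng: rng.random(), n, reps)

# The sample means should centre on mu with variance close to sigma^2 / n.
print("mean of sample means:    ", statistics.fmean(means), "theory:", mu)
print("variance of sample means:", statistics.pvariance(means), "theory:", sigma2 / n)

# If the sample means are roughly normal, about 95% of them should fall
# within 1.96 standard errors of mu.
se = (sigma2 / n) ** 0.5
within = sum(abs(m - mu) <= 1.96 * se for m in means) / reps
print("fraction within 1.96 SE: ", within)
```

Rerunning with a larger `n` should shrink the variance of the sample means in proportion to $1/n$.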

Interval Estimation

A point estimate returns a single best guess for the parameter, which says nothing about how uncertain that guess is given our limited sample size. So rather than giving only a best guess, we return an interval of values that is likely to cover the true parameter.

This can be denoted as:

T(\mathbf{y}) = \left( \hat{\theta}^{-}(\mathbf{y}), \, \hat{\theta}^{+}(\mathbf{y}) \right) \subset \mathbb{R}

An interval constructed in this way is called a confidence interval.

Confidence Interval (CI)

A confidence interval is denoted $T(\mathbf{y})$. We say that $T(\mathbf{y})$ is a $100(1 - \alpha)\%$ confidence interval when:

\mathbb{P}(\theta \in T(\mathbf{y})) = 1 - \alpha

This means that if we draw many different samples from the population and compute a $100(1 - \alpha)\%$ confidence interval from each one, about $100(1 - \alpha)\%$ of those intervals will include the true parameter $\theta$. For example, with $\alpha = 0.05$, about 95% of the intervals will cover $\theta$.
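
This repeated-sampling interpretation can be illustrated by simulation. The sketch below is an assumption-laden example rather than lecture material: it uses a normal population with known variance and the known-variance interval for a normal mean given in the next section. The function name `coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, alpha, reps, seed=0):
    """Fraction of simulated 100(1 - alpha)% CIs that contain the true mean mu."""
    rng = random.Random(seed)
    # z_{alpha/2}: the standard normal value with upper-tail probability alpha/2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # Draw a fresh sample and build its interval around the sample mean.
        mu_hat = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if mu_hat - half_width <= mu <= mu_hat + half_width:
            hits += 1
    return hits / reps

# With alpha = 0.05 the reported fraction should be close to 0.95.
print(coverage(mu=10.0, sigma=2.0, n=25, alpha=0.05, reps=4000))
```

Any single interval either contains $\theta$ or it does not; the $100(1 - \alpha)\%$ figure describes the long-run fraction over repeated samples.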

CI for Normal Mean with Known Variance

The formula for the CI when the variance is known is:

\left( \hat{\mu} - z_{\alpha/2} \sqrt{\frac{\sigma^2}{n}}, \; \hat{\mu} + z_{\alpha/2} \sqrt{\frac{\sigma^2}{n}} \right)

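Here $\hat{\mu}$ is the sample mean, and $z_{\alpha/2}$ can be read from a z-table: find the entry where $P(Z > z) = \alpha/2$ (equivalently, $P(Z \le z) = 1 - \alpha/2$) and read off the value of $z$. For a 95% interval, $\alpha = 0.05$ and $z_{\alpha/2} \approx 1.96$.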
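
Instead of a z-table, the critical value can be computed directly. The following is a small sketch using Python's standard-library `statistics.NormalDist`. The function names `z_critical` and `normal_mean_ci`, the data in `y`, and the assumed known variance `sigma2=0.04` are made up for illustration.

```python
from statistics import NormalDist, fmean

def z_critical(alpha):
    """Return z_{alpha/2}, the value with P(Z > z) = alpha/2 for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def normal_mean_ci(y, sigma2, alpha=0.05):
    """100(1 - alpha)% CI for a normal mean when the variance sigma2 is known."""
    n = len(y)
    mu_hat = fmean(y)  # point estimate of the mean
    half_width = z_critical(alpha) * (sigma2 / n) ** 0.5
    return mu_hat - half_width, mu_hat + half_width

# Made-up measurements; the variance 0.04 is assumed known for this example.
y = [4.9, 5.3, 5.1, 4.7, 5.0, 5.4, 4.8, 5.2]

print(round(z_critical(0.05), 3))  # 1.96
print(normal_mean_ci(y, sigma2=0.04))
```

A smaller `alpha` gives a larger $z_{\alpha/2}$ and therefore a wider interval.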

CI for Difference of Normal Means with Known Variances

For two samples A and B, the formula for the CI of the difference in means $\mu_A - \mu_B$, when both variances are known, is:

\left( \hat{\mu}_A - \hat{\mu}_B - z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}, \; \hat{\mu}_A - \hat{\mu}_B + z_{\alpha/2} \sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}} \right)
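
The two-sample interval above can be computed in the same way as the one-sample case. This is a minimal sketch only: the function name `diff_means_ci`, both data sets, and the variances assumed known (`0.04` for each group) are hypothetical.

```python
from statistics import NormalDist, fmean

def diff_means_ci(y_a, y_b, sigma2_a, sigma2_b, alpha=0.05):
    """100(1 - alpha)% CI for mu_A - mu_B when both variances are known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    diff = fmean(y_a) - fmean(y_b)           # estimate of mu_A - mu_B
    # Standard error of the difference of two independent sample means.
    se = (sigma2_a / len(y_a) + sigma2_b / len(y_b)) ** 0.5
    return diff - z * se, diff + z * se

# Made-up samples; the variances are assumed known for this example.
y_a = [5.1, 4.9, 5.3, 5.0, 5.2]
y_b = [4.6, 4.8, 4.5, 4.7]

low, high = diff_means_ci(y_a, y_b, sigma2_a=0.04, sigma2_b=0.04)
print(low, high)
```

If the resulting interval excludes 0, the data suggest that the two population means differ at the chosen confidence level.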
