We will be following Seeing Theory, a fantastic tool created by students at Brown University.

- Introduction to Probability
- Compound Probability
- Probability Distributions
- Frequentist Inference
- Bayesian Inference
- Regression Analysis

🐑 You flip a coin twice. Assume the probability that the coin lands on heads is $p$. What is the probability of getting at least one head?
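One way to sanity-check the answer, $1 - (1-p)^2$, is a quick simulation (the trial count and seed below are arbitrary choices):

```python
import random

def at_least_one_head(p, trials=100_000, seed=0):
    """Estimate P(at least one head in two flips) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first, second = rng.random() < p, rng.random() < p
        hits += first or second
    return hits / trials

p = 0.5
print(at_least_one_head(p))   # close to the closed form below
print(1 - (1 - p) ** 2)       # P(at least one head) = 1 - P(two tails) = 0.75
```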

What is the mathematical definition of independence?

$$ P(A \cap B) = P(A)P(B) $$

🐑 Using the definition of expectation, calculate the expectation of a single coin flip.

$$ E(X) = \sum_{x \in X(\Omega)} xP(X=x) $$
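Coding heads as $1$ and tails as $0$, the definition gives $E(X) = 1 \cdot p + 0 \cdot (1-p) = p$. A minimal sketch of that sum:

```python
def expectation(pmf):
    """E(X) = sum of x * P(X = x) over the support of X."""
    return sum(x * px for x, px in pmf.items())

# Coin flip with heads coded as 1, tails as 0
p = 0.3
coin = {1: p, 0: 1 - p}
print(expectation(coin))  # 0.3, i.e. E(X) = p
```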

🐑 Compute the variance of a die roll, i.e. a uniform random variable over the sample space $\Omega = \{1, 2, 3, 4, 5, 6\}$.
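Using $\operatorname{Var}(X) = E(X^2) - E(X)^2$, the answer works out to $35/12$; exact fractions keep the arithmetic honest:

```python
from fractions import Fraction

omega = range(1, 7)
p = Fraction(1, 6)                      # uniform over the six faces
ex = sum(x * p for x in omega)          # E(X) = 7/2
ex2 = sum(x * x * p for x in omega)     # E(X^2) = 91/6
var = ex2 - ex ** 2                     # Var(X) = E(X^2) - E(X)^2 = 35/12
print(ex, var)                          # 7/2 35/12
```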

🐑 Prove that $(A\cup B)^c = A^c \cap B^c$

🐑 In Poker, here are examples of possible hands:

- Royal Flush: A, K, Q, J, 10 all in the same suit.
- Straight Flush: Five cards in a sequence, all in the same suit.
- Four of a Kind: Four cards of the same rank, plus any fifth card.
- Full House: 3 of a kind with a pair.

Calculate the probabilities of the above hands.
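A sketch of the counting arguments, using `math.comb` (here "straight flush" follows the usual convention of excluding royal flushes, and aces count as high or low for straights):

```python
from math import comb

total = comb(52, 5)                          # 2,598,960 five-card hands

counts = {
    "royal flush": 4,                        # one per suit
    "straight flush": 4 * 10 - 4,            # 10 straights per suit, minus the 4 royals
    "four of a kind": 13 * 48,               # pick the rank, then any fifth card
    "full house": 13 * comb(4, 3) * 12 * comb(4, 2),  # triple's rank and suits, then the pair's
}

for hand, count in counts.items():
    print(f"{hand}: {count}/{total} = {count / total:.2e}")
```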

🐑 What is the definition of conditional probability?

🐑 Using the definition of conditional probability, prove Bayes Rule:

$$ P(A \mid B) = \dfrac{P(B \mid A) P(A)}{P(B)} $$

🐑 You have two coins in a bag: a biased coin and a fair coin. The biased coin lands on heads with probability 0.95. You pick one coin at random and flip it three times. Define the following events:

$ A = \{ \text{Picking the biased coin} \}$

$ B = \{ \text{Flipping 3 heads out of 3 total flips} \}$

Compute $P(A \mid B)$ using Bayes Rule.
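A numeric check of the computation, assuming each coin is equally likely to be drawn:

```python
p_A = 0.5                      # prior: coins drawn with equal probability
p_B_given_A = 0.95 ** 3        # biased coin: three heads in three flips
p_B_given_not_A = 0.5 ** 3     # fair coin: three heads in three flips

# Law of total probability for the denominator, then Bayes' Rule
p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 4))   # 0.8728
```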

Discrete vs. continuous (countable vs. uncountable)

🐑 We load on a plane 100 packages whose weights are independent random variables that are uniformly distributed between 5 and 50 kilograms. What is the probability that the total weight will exceed 3000 kilograms?
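By the central limit theorem the total weight $S$ is approximately normal with mean $100 \cdot 27.5 = 2750$ and variance $100 \cdot (50-5)^2/12 = 16875$. A sketch comparing that approximation against a Monte Carlo estimate (trial count and seed are arbitrary choices):

```python
import math
import random

# Monte Carlo estimate of P(S > 3000) for 100 Uniform(5, 50) weights
rng = random.Random(0)
trials = 20_000
p_mc = sum(
    sum(rng.uniform(5, 50) for _ in range(100)) > 3000
    for _ in range(trials)
) / trials

# CLT approximation: S ~ Normal(mu, sigma^2) approximately
mu = 100 * (5 + 50) / 2               # 2750
sigma = math.sqrt(100 * 45 ** 2 / 12)  # ~129.9
z = (3000 - mu) / sigma
p_clt = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))
print(p_mc, p_clt)  # both ≈ 0.027
```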

Graph your data! Check out the Datasaurus Dozen inspired by Anscombe's Quartet.

Derivation of matrix form of OLS.
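A standard sketch of that derivation: write the model as $y = X\beta + \epsilon$ and minimize the residual sum of squares,

$$ \hat\beta = \arg\min_\beta \, (y - X\beta)^\top (y - X\beta). $$

Expanding the objective and setting its gradient to zero gives the normal equations:

$$ -2X^\top y + 2X^\top X \hat\beta = 0 \quad \Longrightarrow \quad \hat\beta = (X^\top X)^{-1} X^\top y, $$

assuming $X^\top X$ is invertible.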

We can also fit OLS by hand.

You have data on the grades of 10 students in primary school and high school. You would like to estimate the relationship between the grades. Assume that high school grades are linearly related to primary school grades with some idiosyncratic error, $\epsilon$.

- Primary school: 5, 2, 3, 4, 8, 9, 10, 8, 5, 6
- High school: 6, 4, 3, 4, 6, 7, 8, 9, 3, 5

The model is $\text{High} = \alpha + \beta \, \text{Primary} + \epsilon$

Estimate the value of $\alpha$ and $\beta$ by hand.
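For simple regression the closed-form solution is $\hat\beta = S_{xy}/S_{xx}$ and $\hat\alpha = \bar y - \hat\beta \bar x$. A quick check on the data above:

```python
xs = [5, 2, 3, 4, 8, 9, 10, 8, 5, 6]   # primary school grades
ys = [6, 4, 3, 4, 6, 7, 8, 9, 3, 5]    # high school grades

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # 41.0
s_xx = sum((x - x_bar) ** 2 for x in xs)                       # 64.0
beta = s_xy / s_xx              # 41/64 = 0.640625
alpha = y_bar - beta * x_bar    # 1.65625
print(alpha, beta)
```

These match the `statsmodels` coefficients in the regression output further down (1.6562 and 0.6406).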

- Covariance
- Correlation coefficient

🐑 Argue that for given random variables $X$ and $Y$, the correlation lies between $-1$ and $1$.
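One standard argument applies the Cauchy–Schwarz inequality to the centered variables:

$$ |\operatorname{Cov}(X, Y)| = \left| E\big[(X - EX)(Y - EY)\big] \right| \le \sqrt{E\big[(X - EX)^2\big] \, E\big[(Y - EY)^2\big]} = \sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}, $$

so $|\rho_{XY}| = |\operatorname{Cov}(X, Y)| \big/ \sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)} \le 1$.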

Correlation does not imply causation. See this website for examples.

🐑 Independence implies zero correlation but zero correlation does not imply independence.
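A classic counterexample for the second claim: take $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. Then $\operatorname{Cov}(X, Y) = E(X^3) - E(X)E(X^2) = 0$, yet $Y$ is a function of $X$:

```python
from fractions import Fraction

support = [-1, 0, 1]
p = Fraction(1, 3)                            # X uniform on {-1, 0, 1}; Y = X^2

e_x = sum(x * p for x in support)             # E(X) = 0
e_y = sum(x ** 2 * p for x in support)        # E(Y) = 2/3
e_xy = sum(x * x ** 2 * p for x in support)   # E(XY) = E(X^3) = 0
cov = e_xy - e_x * e_y
print(cov)                                    # 0 -> correlation is zero

# But X and Y are dependent: P(X=0, Y=0) = 1/3, while P(X=0)P(Y=0) = 1/9
```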

In [10]:

```
import seaborn as sns

sns.regplot(x="primary", y="high", data=grades);
```

In [17]:

```
import statsmodels.formula.api as smf

results = smf.ols('high ~ primary', data=grades).fit()
results.summary()
```

```
/Applications/anaconda/lib/python3.6/site-packages/scipy/stats/stats.py:1390: UserWarning: kurtosistest only valid for n>=20 ... continuing anyway, n=10
```

Out[17]:

| Dep. Variable: | high | R-squared: | 0.682 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.643 |
| Method: | Least Squares | F-statistic: | 17.17 |
| Date: | Sun, 09 Sep 2018 | Prob (F-statistic): | 0.00323 |
| Time: | 18:12:14 | Log-Likelihood: | -15.198 |
| No. Observations: | 10 | AIC: | 34.40 |
| Df Residuals: | 8 | BIC: | 35.00 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 1.6562 | 1.007 | 1.645 | 0.138 | -0.665 | 3.977 |
| primary | 0.6406 | 0.155 | 4.144 | 0.003 | 0.284 | 0.997 |

| Omnibus: | 0.901 | Durbin-Watson: | 2.214 |
|---|---|---|---|
| Prob(Omnibus): | 0.637 | Jarque-Bera (JB): | 0.411 |
| Skew: | 0.465 | Prob(JB): | 0.814 |
| Kurtosis: | 2.655 | Cond. No. | 17.1 |