Merge pull request #23 from BruciiZ/main

Modified bayes and hierarchical modeling chapters
rafalab · Dec 25, 2024 · edc838c
2 parents 83c0f32 + 7fb3574
commit edc838c

Showing 2 changed files with 6 additions and 3 deletions.
diff --git a/inference/bayes.qmd b/inference/bayes.qmd
@@ -224,9 +224,12 @@ To compute a posterior distribution and construct a credible interval, we define
 ```{r}
 theta <- 0
 tau <- 0.035
-sigma <- results$se
+sigma_n <- results$se
+sigma <- one_poll_per_pollster |>
+  summarise(sigma = sd(spread)) %>%
+  pull(sigma)
 x_bar <- results$avg
-B <- sigma^2 / (sigma^2 + tau^2)
+B <- sigma_n^2 / (sigma_n^2 + tau^2)
 
 posterior_mean <- B*theta + (1 - B)*x_bar
 posterior_se <- sqrt(1/(1/sigma^2 + 1/tau^2))

diff --git a/inference/hierarchical-models.qmd b/inference/hierarchical-models.qmd
@@ -55,7 +55,7 @@ Although we know this bias term affects our polls, we have no way of knowing wha
 Suppose we are collecting data from one pollster and we assume there is no general bias. The pollster collects several polls with a sample size of $N$, so we observe several measurements of the spread $X_1, \dots, X_J$. Suppose the real proportion for Hillary is $p$ and the difference is $\mu$. The urn model theory tells us that these random variables are normally distributed, with expected value $\mu$ and standard error $2 \sqrt{p(1-p)/N}$:
 
 $$
-X_j \sim \mbox{N}\left(\mu, \sqrt{p(1-p)/N}\right)
+X_j \sim \mbox{N}\left(\mu, 2\sqrt{p(1-p)/N}\right)
 $$ 
 
 We use the index $j$ to represent the different polls conducted by this pollster. Below is a simulation for six polls assuming the spread is 2.1 and $N$ is 2,000:
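
The factor of 2 restored in the second hunk can be checked with a short derivation. This is a sketch under two assumptions not stated in the diff: that each poll reports the spread as $X_j = 2\hat{p}_j - 1$, where $\hat{p}_j$ (notation introduced here) is the sample proportion in poll $j$, and that "2.1" means 2.1 percentage points, so $\mu = 0.021$.

$$
\begin{aligned}
\mbox{SE}[X_j] &= \mbox{SE}[2\hat{p}_j - 1] = 2\,\mbox{SE}[\hat{p}_j] = 2\sqrt{p(1-p)/N} \\
p &= \frac{1 + \mu}{2} = \frac{1 + 0.021}{2} = 0.5105 \\
\mbox{SE}[X_j] &= 2\sqrt{\frac{0.5105 \times 0.4895}{2000}} \approx 0.0224
\end{aligned}
$$

Without the factor of 2, the formula gives the standard error of $\hat{p}_j$ (about 0.0112), not of the spread, so it would understate the poll-to-poll variability by half.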