From ca51740eb1412130a49b045f96b59c5c7e74822b Mon Sep 17 00:00:00 2001
From: bsenst
Date: Sun, 22 Jan 2023 22:24:31 +0100
Subject: [PATCH] fix small typo

---
 book/ch-intro.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/book/ch-intro.html b/book/ch-intro.html
index b1ce8ff..d271de9 100644
--- a/book/ch-intro.html
+++ b/book/ch-intro.html
@@ -1304,7 +1304,7 @@

1.4 Discrete random variables: An

“It’s raining, I’m going to take the ….”

Suppose that our research goal is to estimate the probability, call it \(\theta\), of the word “umbrella” appearing in this sentence, versus any other word. If the sentence is completed with the word “umbrella”, we will refer to it as a success; any other completion will be referred to as a failure. This is an example of a binomial random variable: given \(n\) trials, there can be only two possible outcomes in each trial, a success or a failure, and there is some true unknown probability \(\theta\) of success that we want to estimate. When the number of trials is one, the random variable is said to have a Bernoulli distribution.

One way to empirically estimate this probability of success is to carry out a cloze task. In a cloze task, subjects are asked to complete a fragment of the original sentence, such as “It’s raining, I’m going to take the …”. The predictability or cloze probability of “umbrella” is then calculated as the proportion of times that the target word “umbrella” was produced as an answer by subjects.

-Assume for simplicity that \(10\) subjects are asked to complete the above sentence; each subject does this task only once. This gives us independent responses from \(10\) trials that are either coded a success (“umbrella” was produced) or as a failure (some other word was produced). We can sum up the number of sucesses to calculate how many of the 10 trials had “umbrella” as a response. For example, if \(8\) instances of “umbrella” are produced in \(10\) trials, we would estimate the cloze probability of producing “umbrella” would be \(8/10\).

+Assume for simplicity that \(10\) subjects are asked to complete the above sentence; each subject does this task only once. This gives us independent responses from \(10\) trials that are either coded a success (“umbrella” was produced) or as a failure (some other word was produced). We can sum up the number of successes to calculate how many of the 10 trials had “umbrella” as a response. For example, if \(8\) instances of “umbrella” are produced in \(10\) trials, we would estimate the cloze probability of producing “umbrella” would be \(8/10\).

We can repeatedly generate simulated sequences of the number of successes in R (later on we will demonstrate how to generate such random sequences of simulated data). Here is a case where we run the same experiment \(20\) times (the sample size is \(10\) each time).

##  [1] 7 6 5 7 4 4 5 3 3 6 6 4 3 4 7 2 5 4 5 5