Probability & Statistics

Probability is an area of mathematics that started much later than most, partly because it can be rather counterintuitive to many people, even mathematicians: guessing how likely it is that something will happen is not something we are very good at.

Simple situations are easy: if you throw two fair dice, the probability of a double six is 1/36, because there are 6 × 6 = 36 possible combinations, all equally likely. But more complex situations are less straightforward. Consider the following example. Suppose that in the city where you live, about 1 person in 10 has blue eyes. Now take a group of 1,000 mothers in the city, 500 of whom have 1 child and 500 of whom have 2 children. If you picked a name at random from the list of these 1,000 mothers, you would have a probability of 1/2 of selecting a mother of 2 children. The following argument suggests that this probability might be different among mothers of blue-eyed children.

Here is the argument. Consider only mothers who have at least one blue-eyed child. Typically, only 50 of the 500 children in single-child families have blue eyes. Of the 1,000 children in the 500 two-child families, typically 100 have blue eyes. So there are (typically) 150 children with blue eyes. If we ask these 150 blue-eyed children how many children their mother has, 100 of them will answer that their mother has two. So among the families with at least one blue-eyed child, the probability that the mother has two children is 100/150 = 2/3, which is higher than the 1/2 we had earlier. How can this be?
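One way to start untangling the argument is to redo its arithmetic while keeping track of what is actually being counted. The Python sketch below is only an illustration, not the course's own solution: it tallies the expected number of blue-eyed children (as the argument does) and, separately, the expected number of mothers with at least one blue-eyed child. The second tally assumes that siblings' eye colours are independent, which the text does not state; the variable names are made up for this example.

```python
from fractions import Fraction

P_BLUE = Fraction(1, 10)     # from the text: about 1 person in 10 has blue eyes
ONE_CHILD_MOTHERS = 500      # from the text
TWO_CHILD_MOTHERS = 500      # from the text

# Tally used in the argument: expected number of blue-eyed CHILDREN.
blue_kids_one = ONE_CHILD_MOTHERS * P_BLUE        # children in one-child families
blue_kids_two = 2 * TWO_CHILD_MOTHERS * P_BLUE    # children in two-child families
share_of_kids = blue_kids_two / (blue_kids_one + blue_kids_two)

# Alternative tally: expected number of MOTHERS with at least one blue-eyed child.
# Assumption (not in the text): the two siblings' eye colours are independent.
mothers_one = ONE_CHILD_MOTHERS * P_BLUE
mothers_two = TWO_CHILD_MOTHERS * (1 - (1 - P_BLUE) ** 2)
share_of_mothers = mothers_two / (mothers_one + mothers_two)

print("Share of blue-eyed children whose mother has two children:", share_of_kids)
print("Share of such mothers who have two children:", share_of_mothers)
```

Under these assumptions both shares come out above 1/2, but they are not the same number, because one counts children and the other counts mothers.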

Puzzles like these are not easy to sort out if you don't have much experience with probability. The situation described in the example is rather contrived, and you are not very likely to encounter it in practice. Yet similar arguments do occur in discussions and in the media, where estimates of "how likely it is that X is true" are bandied about all the time. An example: athlete Y has just learned that she tested positive in a random test for prohibited performance-enhancing drugs. The test used is very reliable: on average, it gives the wrong result only once per thousand tests. How likely do you think it is that Y is innocent, as she claims?
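The question about athlete Y cannot be settled from the test's reliability alone; it also depends on how common doping is among the athletes tested, which the text does not say. The following Python sketch applies Bayes' rule under two stated assumptions: the once-per-thousand error rate applies equally to clean and to doping athletes, and the prevalence values tried are purely hypothetical. The function name is likewise invented for this illustration.

```python
from fractions import Fraction

def prob_innocent_given_positive(prevalence, error_rate):
    """Bayes' rule: probability that an athlete is clean, given a positive test."""
    false_positive = (1 - prevalence) * error_rate   # clean athlete, wrong result
    true_positive = prevalence * (1 - error_rate)    # doping athlete, correct result
    return false_positive / (false_positive + true_positive)

ERROR_RATE = Fraction(1, 1000)  # from the text: wrong once per thousand tests

# Hypothetical fractions of tested athletes who dope (not given in the text).
for prevalence in (Fraction(1, 1000), Fraction(1, 100), Fraction(1, 10)):
    p = prob_innocent_given_positive(prevalence, ERROR_RATE)
    print(f"prevalence {float(prevalence):.3f}: "
          f"P(innocent | positive) = {float(p):.4f}")
```

With a hypothetical prevalence of 1 in 1,000, the function returns exactly 1/2: under that assumption a positive result from this very reliable test would still leave even odds that Y is innocent.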

Part 1 of this unit will discuss probability in some more detail, and will, in particular, answer the two questions above. In Part 2 we concentrate on statistics, the mathematical science that uses probability arguments for practical applications and studies.

Links to the problem sets on this page will become available gradually, following the course schedule.

Lecture notes

Lecture summaries

Problem sets