# Probability: Sample Spaces

I’ve posted a few games lately (see here, here and here). While I think they are a very good way to become interested in some of the avenues of math research, a few people have also come to me with questions about their classes. So I decided to write a script to help with understanding some elementary probability theory, focusing on discrete sample spaces.

In statistics, any process of observation is referred to as an experiment.
The set of all possible outcomes of an experiment is called the sample space, and it is usually denoted by S. Each outcome in a sample space is called an element of the sample space. An event is a subset of the sample space: the set of outcomes for which the event occurs. Two events are said to be mutually exclusive if they have no elements in common.

As in set theory, we can form new events by performing operations like unions, intersections and complements on other events. If A and B are any two subsets of a sample space S, then their union A ∪ B is the subset of S that contains all the elements that are in A, in B, or in both; their intersection A ∩ B is the subset of S that contains all the elements that are in both A and B; the complement A’ of A is the subset of S that contains all the elements of S that are not in A.
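These operations map directly onto Python's built-in `set` type. Here is a small sketch; the die-roll sample space and the events `A` and `B` are my own illustrative choices.

```python
# Sample space for one roll of a six-sided die (an illustrative choice).
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # event: the roll is even
B = {4, 5, 6}   # event: the roll is greater than 3

union = A | B          # elements in A, in B, or in both
intersection = A & B   # elements in both A and B
complement_A = S - A   # elements of S that are not in A

print(union)
print(intersection)
print(complement_A)

# A and its complement have no elements in common,
# so they are mutually exclusive events.
print(A.isdisjoint(complement_A))
```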

A probability is a function that assigns real numbers to events of a sample space. The following are the axioms of probability that apply when the sample space is discrete (finite or countably infinite).

Axiom 1: The probability of an event is a non-negative real number; that is P(A) ≥ 0 for any subset A of S.
Axiom 2: The probability of the entire sample space is 1; that is P(S) = 1.
Axiom 3: If A1, A2, A3, … is a finite or infinite sequence of mutually exclusive events of S, then

P(A1 ∪ A2 ∪ A3 ∪ …) = P(A1) + P(A2) + P(A3) + …

If A and B are any two events in a sample space S and P(A) ≠ 0, the conditional probability of B given A is

P(B | A) = P(A ∩ B) / P(A)

Two events A and B are independent if and only if P(A ∩ B) = P(A) ∙ P(B).
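Here is a small Python sketch of conditional probability, P(B | A) = P(A ∩ B) / P(A), and of the product test for independence, P(A ∩ B) = P(A) ∙ P(B). It assumes a fair die, so every outcome is equally likely and a probability is just a ratio of counts. The helper names `prob`, `conditional` and `independent` are my own.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # one roll of a fair die; outcomes assumed equally likely


def prob(event):
    """P(event) when every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))


def conditional(b, a):
    """P(B | A) = P(A ∩ B) / P(A); only defined when P(A) != 0."""
    if prob(a) == 0:
        raise ValueError("P(A) must be non-zero")
    return prob(a & b) / prob(a)


def independent(x, y):
    """Product test: P(X ∩ Y) == P(X) * P(Y)."""
    return prob(x & y) == prob(x) * prob(y)


A = {2, 4, 6}      # the roll is even
B = {4, 5, 6}      # the roll is greater than 3
C = {1, 2, 3, 4}   # the roll is at most 4

print(conditional(B, A))   # 2/3
print(independent(A, B))   # False, since 1/3 != 1/2 * 1/2
print(independent(A, C))   # True, since 1/3 == 1/2 * 2/3
```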

# Understanding Bayes’ Theorem

I’ve finished a script that helps understand Bayes’ Theorem.

If we have a set of mutually exclusive (aka non-overlapping) sets Bi for i ∈ {0, 1, 2, …, n} for some integer n, then the union of these sets forms a sample space. Let’s call the sample space S. Suppose that we also have some set (also known as an event) A which is also a subset of S. Bayes’ Theorem considers the probability that one of these mutually exclusive events (one of the Bi’s) caused the observed event (A).

This probability can be calculated by the formula

 Pr(Bj | A) = Pr(Bj) Pr(A | Bj) / [ Σi Pr(Bi) Pr(A | Bi) ]

where the sum in the denominator runs over all i in {0, 1, …, n}.

The theorem helps us determine the probability of the event Bj given A, or in plainer English, the probability that the event Bj is the cause that gives rise to the observed event A. The numerator is the product of the probability of the causal event, Pr(Bj), and the conditional probability of the observed event given the causal event, Pr(A | Bj). This numerator could be replaced by its equivalent, the probability of the set A ∩ Bj. Likewise, the denominator is the sum (over all the causal events) of the probability of each causal event times the conditional probability of the observed event given that particular causal event. Each term in this denominator could be replaced by its equivalent, Pr(A ∩ Bi), and these terms sum to the total probability of A because each pair of the Bi’s is mutually exclusive and together they cover S. So we are able to replace the probability of A with Σi Pr(Bi) Pr(A | Bi) because of the law of total probability.
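Written out step by step, the argument above looks like this. It is a sketch that assumes Pr(A) ≠ 0 and Pr(Bi) ≠ 0 for every i, so that all the conditional probabilities are defined.

```latex
\begin{align*}
\Pr(B_j \mid A)
  &= \frac{\Pr(A \cap B_j)}{\Pr(A)}
     && \text{definition of conditional probability} \\
  &= \frac{\Pr(B_j)\,\Pr(A \mid B_j)}{\Pr(A)}
     && \text{since } \Pr(A \cap B_j) = \Pr(B_j)\,\Pr(A \mid B_j) \\
  &= \frac{\Pr(B_j)\,\Pr(A \mid B_j)}{\sum_{i} \Pr(A \cap B_i)}
     && \text{the } B_i \text{ are mutually exclusive and their union is } S \\
  &= \frac{\Pr(B_j)\,\Pr(A \mid B_j)}{\sum_{i} \Pr(B_i)\,\Pr(A \mid B_i)}
     && \text{same substitution applied to each term}
\end{align*}
```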

An example that would use Bayes’ Theorem is analyzing the results of an election. The set of mutually exclusive events could be membership in a political party (Democrat, Republican, or Independent). The observed event could be a vote for a particular individual. The conditional probabilities could be the percentage of each party that voted for this individual. If we want to calculate how much each party contributed to the individual’s election, we’d use Bayes’ Theorem.
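To make the election example concrete, here is a small Python sketch. All of the numbers are made up purely for illustration (they are not real election data), and the names `party_share`, `voted_for` and `posterior` are my own.

```python
from fractions import Fraction

# Hypothetical numbers, invented for illustration only.
# Pr(Bi): share of all voters belonging to each party.
party_share = {
    "Democrat": Fraction(2, 5),
    "Republican": Fraction(7, 20),
    "Independent": Fraction(1, 4),
}
# Pr(A | Bi): fraction of each party that voted for the candidate.
voted_for = {
    "Democrat": Fraction(3, 5),
    "Republican": Fraction(3, 10),
    "Independent": Fraction(1, 2),
}

# Denominator of Bayes' Theorem: Pr(A), the candidate's overall vote share.
total = sum(party_share[p] * voted_for[p] for p in party_share)

# Pr(Bi | A): given a vote for the candidate, the probability it came from party i.
posterior = {p: party_share[p] * voted_for[p] / total for p in party_share}

print("Pr(A) =", total)
for party, value in posterior.items():
    print(f"Pr({party} | A) = {value}")
```

With these made-up inputs the candidate's overall share is 47/100, and the three posteriors add up to 1, as they must.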

The script I’ve written to help understand Bayes’ Theorem works as follows:
– A set of mutually exclusive sets is randomly generated (the number of sets also varies). These sets are called Bi for i ∈ {0, …, n}.
– A set A is randomly generated from the union of the Bi’s.
– A table is displayed showing Pr(Bi) for each i on line 1 and Pr(A | Bi) for each i on line 2.

– The user is given the option to select which of the mutually exclusive sets they would like to use to calculate the probability that this set caused the event A.
– Once a set is chosen, the user clicks the “Calculate Conditional” button and Bayes’ Theorem gives the result.
– If the “show work” checkbox was checked, then the steps used in this calculation are also shown.
– All work is done using fractions to give an idea of where the numbers come from.
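
The steps above can be sketched in Python roughly as follows. This is not the script itself, just a minimal sketch under my own assumptions: the outcomes are the integers 0 to 11 and are equally likely, the helper names `make_partition` and `bayes` are hypothetical, and a fixed `j` stands in for the user's selection.

```python
import random
from fractions import Fraction


def make_partition(rng, size=12, max_sets=4):
    """Split the outcomes 0..size-1 into 2..max_sets non-empty, non-overlapping sets."""
    outcomes = list(range(size))
    rng.shuffle(outcomes)
    n_sets = rng.randint(2, max_sets)
    # Distinct cut points strictly inside the list keep every set non-empty.
    cuts = sorted(rng.sample(range(1, size), n_sets - 1))
    bounds = [0] + cuts + [size]
    return [set(outcomes[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]


def bayes(partition, a, j):
    """Return Pr(Bj | A) plus the two table rows, assuming equally likely outcomes."""
    size = sum(len(b) for b in partition)
    priors = [Fraction(len(b), size) for b in partition]             # Pr(Bi)
    likelihoods = [Fraction(len(a & b), len(b)) for b in partition]  # Pr(A | Bi)
    total = sum(p * l for p, l in zip(priors, likelihoods))          # Pr(A)
    return priors[j] * likelihoods[j] / total, priors, likelihoods


rng = random.Random(0)  # seeded so the run is repeatable
partition = make_partition(rng)
sample_space = set().union(*partition)
# A is a random non-empty subset of the union of the Bi's.
a = set(rng.sample(sorted(sample_space), rng.randint(1, len(sample_space))))

j = 0  # stands in for the set the user selects
posterior, priors, likelihoods = bayes(partition, a, j)
print("Pr(Bi):    ", [str(p) for p in priors])
print("Pr(A | Bi):", [str(l) for l in likelihoods])
print(f"Pr(B{j} | A) = {posterior}")
```

Because the outcomes are equally likely here, the result can be sanity-checked by counting: Pr(Bj | A) should equal the number of elements of A that lie in Bj divided by the size of A.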

Other Blogs that have covered this topic:
Better Explained
Bayes’ Theorem-qed