# Basic Probability Concepts

Definition 1: Typically in the field of statistics we study data that results from experiments. An experiment can be considered to be a series of trials, each with a particular outcome. An event is a collection of outcomes corresponding to some result in the experiment. The number of outcomes in event E (i.e. the number of elements in set E) is written as |E|. The set of all possible outcomes is called the sample space, often designated S. An event is then simply a subset of the sample space. The probability P(E) of the event E is |E| / |S|, assuming S is finite and not empty and all outcomes are equally likely.

Example 1: Consider the simple experiment of tossing a coin twice. What is the probability that the coin comes up heads both times?

The sample space S = {HH, HT, TH, TT} and the required event E = {HH}. Thus the probability that the coin is heads both times is P(E) = |E| / |S| = ¼, or 25%.
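As a rough check of Example 1, the following Python sketch enumerates the sample space and applies Definition 1 directly. The variable names (`sample_space`, `event`, `p_event`) are illustrative choices, not part of the original text, and the sketch assumes a fair coin so that all four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space S for two coin tosses: every ordered pair of H/T.
sample_space = {"".join(toss) for toss in product("HT", repeat=2)}

# Event E: heads on both tosses.
event = {outcome for outcome in sample_space if outcome == "HH"}

# P(E) = |E| / |S| (Definition 1), assuming equally likely outcomes.
p_event = Fraction(len(event), len(sample_space))

print(sorted(sample_space), p_event)  # ['HH', 'HT', 'TH', 'TT'] 1/4
```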

Observation: We now state the fundamental properties of probability, using the usual set notation (see Sets for a quick review of this notation).

Property 1:

1. 0 ≤ P(A) ≤ 1
2. P(Ø) = 0
3. P(S) = 1
4. P(A′) = 1 – P(A), where A′ = S – A
5. P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Proof: Simple consequences of Definition 1.

Example 2: Consider the experiment of drawing one card from a standard deck of 52 cards. What is the probability of drawing either a spade or face card?

There are 13 spades and 12 face cards, but 3 of these face cards are also spades, which we should not count twice. Thus, there are 13 spades and 9 non-spade face cards for a total of 22 cards out of 52. The probability is therefore 22/52. We now show how to calculate the result using Property 1e.

Let A = the event that a spade is drawn and B = the event that a face card (King, Queen or Jack) is drawn. P(A) = 13/52, P(B) = 12/52 and P(A ∩ B) = 3/52. Thus the probability of drawing either a spade or face card is P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 13/52 + 12/52 – 3/52 = 22/52.
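Both approaches in Example 2 can be compared with a short Python sketch that builds the deck as a set and counts cards. The representation of a card as a `(rank, suit)` tuple and the helper `prob` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

# A standard 52-card deck as (rank, suit) pairs.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = set(product(ranks, suits))

# A = a spade is drawn, B = a face card (King, Queen or Jack) is drawn.
A = {card for card in deck if card[1] == "spades"}
B = {card for card in deck if card[0] in {"J", "Q", "K"}}

def prob(event):
    """P(E) = |E| / |S| for equally likely cards."""
    return Fraction(len(event), len(deck))

direct = prob(A | B)                          # count the union directly
by_property = prob(A) + prob(B) - prob(A & B)  # inclusion-exclusion

print(direct, by_property)  # 11/26 11/26
```

`Fraction` reduces automatically, so 22/52 is printed as 11/26.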

Definition 2: The probability that an event A occurs assuming that event B occurs is called the conditional probability of A given B and is denoted P(A|B).

Observation: By Definitions 1 and 2,

P(A|B) = |A ∩ B| / |B| = P(A ∩ B) / P(B)

assuming B ≠ Ø.

Property 2:

1. P(A|B) ∙ P(B) = P(A ∩ B) = P(B|A) ∙ P(A)
2. P(A|B) = P(B|A) ∙ P(A) / P(B), called Bayes’ Theorem
3. P(A) = P(A|B) ∙ P(B) + P(A|B′) ∙ P(B′), called the Law of Total Probability

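Proof: The first assertion is a restatement of the last observation. The second assertion is a consequence of two applications of the first since

P(A|B) = P(A ∩ B) / P(B) = P(B|A) ∙ P(A) / P(B)

We now prove the third assertion. Since A = (A ∩ B) ∪ (A ∩ B′) and (A ∩ B) ∩ (A ∩ B′) = Ø, by Properties 1b and 1e,

P(A) = P(A ∩ B) + P(A ∩ B′) – P(Ø) = P(A ∩ B) + P(A ∩ B′)

Now by Property 2a, applied to each term,

P(A) = P(A|B) ∙ P(B) + P(A|B′) ∙ P(B′)

which proves the third assertion.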
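The three parts of Property 2 can be spot-checked numerically. The sketch below reuses the card events from Example 2 (A = spade, B = face card) purely as a test case; the helpers `prob` and `cond` are hypothetical names, and a check on one example is of course not a substitute for the proof.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
S = set(product(ranks, suits))                    # sample space
A = {c for c in S if c[1] == "spades"}            # spade drawn
B = {c for c in S if c[0] in {"J", "Q", "K"}}     # face card drawn
B_comp = S - B                                    # B' = S - B

def prob(event):
    """P(E) = |E| / |S|."""
    return Fraction(len(event), len(S))

def cond(event, given):
    """P(event | given) = |event ∩ given| / |given|, given must be non-empty."""
    return Fraction(len(event & given), len(given))

# Property 2a: P(A|B) P(B) = P(A ∩ B) = P(B|A) P(A)
assert cond(A, B) * prob(B) == prob(A & B) == cond(B, A) * prob(A)
# Property 2b (Bayes' Theorem)
assert cond(A, B) == cond(B, A) * prob(A) / prob(B)
# Property 2c (Law of Total Probability)
assert prob(A) == cond(A, B) * prob(B) + cond(A, B_comp) * prob(B_comp)

print(cond(A, B), cond(B, A), prob(A))  # 1/4 3/13 1/4
```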

Example 3: Consider the experiment of picking two balls at random without replacement from a bag which contains 3 reds and 2 blacks. What is the probability that both balls are red?

Let A = a red ball is taken on the first draw and B = a red ball is taken on the second draw. The probability that the first draw is red is P(A) = 3/5. The probability that the second draw is red given the first draw is red is P(B|A) = 2/4 = ½. From Property 2a, we see that the probability that both draws are red is

P(A ∩ B) = P(B|A) ∙ P(A) = ½ ∙ 3/5 = 3/10 = 30%.
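Example 3 can also be verified by listing every ordered pair of draws without replacement. In this sketch the balls are labelled `(colour, index)` so that they are distinguishable; that labelling and the variable names are assumptions introduced for illustration.

```python
from fractions import Fraction
from itertools import permutations

# 3 red and 2 black balls, labelled so each ball is distinct.
balls = [("red", i) for i in range(3)] + [("black", i) for i in range(2)]

# All ordered draws of two different balls (sampling without replacement).
draws = list(permutations(balls, 2))

first_red = [d for d in draws if d[0][0] == "red"]       # event A
both_red = [d for d in first_red if d[1][0] == "red"]    # event A ∩ B

p_A = Fraction(len(first_red), len(draws))
p_B_given_A = Fraction(len(both_red), len(first_red))
p_both = Fraction(len(both_red), len(draws))

# Property 2a: P(A ∩ B) = P(B|A) P(A)
assert p_both == p_B_given_A * p_A

print(p_A, p_B_given_A, p_both)  # 3/5 1/2 3/10
```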

Definition 3: Two events A and B are independent if P(A ∩ B) = P(A) ∙ P(B).

Property 3: Two events A and B with P(B) > 0 are independent if and only if P(A) = P(A|B).

Proof: A and B are independent if and only if P(A∩B) = P(A) ∙ P(B), which by Property 2a is true if and only if P(A|B) ∙ P(B) = P(A) ∙ P(B), which in turn is true if and only if P(A|B) = P(A).

Observation: A and B are independent if B’s occurring (or not occurring) has no influence on A’s occurring, i.e. it doesn’t increase or decrease the probability of A occurring. By Property 3, A and B are independent if and only if P(B|A) = P(B), and so it also follows that if A and B are independent then A’s occurring has no influence on B’s occurring either.

Example 4: Repeat the experiment from Example 3, but this time we put the ball picked on the first draw back in the bag before drawing a second ball (i.e. sampling with replacement).

Since P(B|A) = 3/5 = P(B), A and B are independent, and so it follows that

P(A ∩ B) = P(A) ∙ P(B) = 3/5 ∙ 3/5 = 9/25 = 36%.
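For Example 4, sampling with replacement means the second draw may repeat the first, so the ordered draws form a full Cartesian product. The sketch below, with illustrative ball labels and variable names, checks Definition 3 on that sample space.

```python
from fractions import Fraction
from itertools import product

# 3 red and 2 black balls, labelled so each ball is distinct.
balls = [("red", i) for i in range(3)] + [("black", i) for i in range(2)]

# Sampling with replacement: the same ball may be drawn twice.
draws = list(product(balls, repeat=2))

A = {d for d in draws if d[0][0] == "red"}   # first draw is red
B = {d for d in draws if d[1][0] == "red"}   # second draw is red

def prob(event):
    return Fraction(len(event), len(draws))

# Definition 3: A and B are independent if P(A ∩ B) = P(A) P(B).
independent = prob(A & B) == prob(A) * prob(B)
p_both = prob(A & B)

print(independent, p_both, float(p_both))  # True 9/25 0.36
```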

Example 5: You have two bags, one containing 3 red and 2 black balls, the other containing 1 red, 1 blue and 2 black balls. You pick a bag at random and then pick a ball from that bag at random. What is the probability that the ball picked is red?

Let A = event that the first bag is picked and let B = event that a red ball is drawn. By Property 2c,

P(B) = P(B|A) ∙ P(A) + P(B|A′) ∙ P(A′) = .6(.5) + .25(.5) = 42.5%.
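The Law of Total Probability in Example 5 amounts to a weighted sum over the bags. The following sketch encodes each bag as a dictionary of colour counts; the dictionary layout and the names `bags`, `p_bag` and `p_red` are assumptions made for this illustration.

```python
from fractions import Fraction

# Contents of each bag as colour -> number of balls.
bags = {
    "bag1": {"red": 3, "black": 2},
    "bag2": {"red": 1, "blue": 1, "black": 2},
}

# Each bag is equally likely to be picked.
p_bag = Fraction(1, len(bags))

# P(red) = sum over bags of P(red | bag) * P(bag)
p_red = sum(
    Fraction(contents.get("red", 0), sum(contents.values())) * p_bag
    for contents in bags.values()
)

print(p_red, float(p_red))  # 17/40 0.425
```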

Example 6: Suppose you roll a die 12 times. What is the probability that the number 1 will not appear on any of the throws? What is the probability that the number 1 will appear on at least one of the 12 throws?

The 12 throws represent 12 independent events. The probability of throwing a 1 on any single trial is 1/6 and so the probability of not throwing a 1 on any single trial is 1 – 1/6 = 5/6 (by Property 1d). Thus the probability of not throwing a 1 on any of the 12 throws is (5/6)^12 ≈ 11.2% (by Definition 3).

The probability that the number 1 will appear at least once is simply 1 – 11.2% = 88.8% (by Property 1d). This is equivalent to 1 – (1 – 1/6)^12.
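The arithmetic in Example 6 is easy to reproduce. This sketch uses exact fractions and only rounds when printing; the variable names are illustrative and a fair six-sided die is assumed.

```python
from fractions import Fraction

n_throws = 12
p_not_one = 1 - Fraction(1, 6)        # Property 1d: 5/6 per throw

# Independent throws: multiply the per-throw probabilities.
p_no_ones = p_not_one ** n_throws     # (5/6)^12
p_at_least_one = 1 - p_no_ones        # Property 1d again

print(f"{float(p_no_ones):.1%} {float(p_at_least_one):.1%}")  # 11.2% 88.8%
```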

### 6 Responses to Basic Probability Concepts

1. James says:

Hi Charles
This is an excellent website, it appears very carefully put together.
I have a question for the page http://www.real-statistics.com/probability-functions/basic-probability-concepts/

Please could you clarify what is meant by the different terms “definition, property, observation, assertion”?
The only problem is with the last of these terms, it appears that “assertion” = “property” though this isn’t stated anywhere. Once the seed of doubt is in one’s mind the rest of the page becomes quite difficult reading.
Many thanks

• Charles says:

Hi James,

Definition is used to define key terms. This usage is quite standard in mathematics.
Property is used to describe facts which can be proved mathematically. These are sometimes called Theorems. The term “property” is quite standard in mathematics.
Observation is used to explain or illustrate concepts described on the webpage.

The word “assertion” as used on this webpage refers to Property 2. The phrase “the first assertion” is just a way of referring to Property 2(a). The phrase “the second assertion” is just a way of referring to Property 2(b). The phrase “the third assertion” is just a way of referring to Property 2(c). Again this use of the word “assertion” is quite common in mathematics.

Charles

2. Pete says:

Hi Charles
You’ve written above:
“Definition 2: The probability that an event occurs assuming that occurs is called the conditional probability of A given B and is denoted P(A|B).”

Should this be:
“Definition 2: The probability that an event A occurs assuming that B occurs is called the conditional probability of A given B and is denoted P(A|B).”

or is it the other way round? I can never remember!!
A very useful website – thank you.

• Charles says:

Thanks Pete,
I have now corrected the typo. Thanks again for catching the error. Glad you like the site.
Charles

3. Pete says:

As to Example 6, I take it that the probability of rolling a 1 all 12 times is (1/6)^12, and that rolling a 1 exactly twice is (1/6) * (1/6) * (5/6)^10, correct?

But what if, instead of rolling a 1 at least once in the twelve throws, it was at least twice?

How is that computed?

Pete

• Charles says:

Pete,
Actually rolling a 1 exactly twice has probability C(12,2) * (1/6)^2 * (5/6)^10. See the webpage on the Binomial Distribution.
Rolling a 1 at least twice has probability = 1 – P(0) – P(1) where P(0) = probability of rolling a 1 zero times = (5/6)^12 and P(1) = probability of rolling a 1 exactly one time = C(12,1) * (1/6) * (5/6)^11.
Charles