5. Probability
Union and intersection, conditional, independence
In statistics, it's probability, not luck. In this lesson, we will cover the basic types of probability and how to find them from a data set. We will also discuss independence and dependence, that is, whether one event prevents another event from happening, makes it more likely to happen, or neither. Overall, probability is the chance that an event will happen.
A. Union and Intersection
1. Union probabilities essentially mean "What is the probability of either event A happening OR event B happening?"
The notation for this is P (A ∪ B), where P means "the probability of" and the event is whatever is inside the parentheses. Inside the parentheses, A means event A, "∪" means "or," and B means event B. So, translating this into English, we get "What is the probability of event A or event B happening?"
The formula for solving unions is P (A ∪ B) = P(A) + P(B) - P (A ∩ B)
Translating it to English, the probability of getting event A or event B equals the probability of event A plus the probability of event B, minus the probability of getting BOTH event A and event B. We subtract the overlap because outcomes where both events happen would otherwise be counted twice. This is also known as the general addition rule.
However, note that subtracting P (A ∩ B) is unnecessary if the events are mutually exclusive, because it will equal 0 anyway. But what is mutual exclusivity? Mutual exclusivity is when both events cannot happen at the same time. Only 1 of them can happen at a time. For example, when flipping a coin, we can't get BOTH heads and tails. The probability of getting heads is 1/2, and tails is 1/2, but the probability of getting both heads and tails in one flip is 0.
So in the coin example, it is simply P (A ∪ B) = P(A) + P(B)
We don't need the "P (A ∩ B)" because this will just be 0 anyway.
2. Intersection probabilities mean "What is the probability of event A happening AND event B happening?"
Introduced above, the notation for this is P (A ∩ B)
The general formula for solving this is P (A ∩ B) = P (A) x P(B | A), also known as the multiplication rule. Here, P(B | A) means the probability of event B given that event A has happened (will get to this in the later sections).
When the events are independent of each other, P(B | A) is just P (B), so the formula simplifies to P (A ∩ B) = P (A) x P (B).
Furthermore, if we are provided with a two-way table, we don't actually have to calculate anything, as we just have to find the box that includes both events, find the quantity, and divide by the total quantity of the whole table which is usually in the bottom right. This is the safest option because, with this method, we won't have to test if our scenario is independent or dependent. It is also the easiest.
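As a quick check of the addition rule and the multiplication rule (in its P (A) x P(B | A) form), here is a minimal Python sketch. The function names union_probability and intersection_probability are made up for illustration, and the two-flip example at the end is an added assumption (two fair coin flips), not something taken from the lesson's data.

```python
from fractions import Fraction

def union_probability(p_a, p_b, p_a_and_b=Fraction(0)):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def intersection_probability(p_a, p_b_given_a):
    """Multiplication rule: P(A and B) = P(A) x P(B | A)."""
    return p_a * p_b_given_a

# Coin example from the text: heads and tails are mutually exclusive,
# so P(heads and tails) = 0 on a single flip.
p_heads = Fraction(1, 2)
p_tails = Fraction(1, 2)
p_heads_or_tails = union_probability(p_heads, p_tails, Fraction(0))

# Illustrative assumption: two fair flips, heads on both.
# The flips are independent, so P(heads 2nd | heads 1st) is still 1/2.
p_two_heads = intersection_probability(Fraction(1, 2), Fraction(1, 2))

print(p_heads_or_tails)  # 1
print(p_two_heads)       # 1/4
```

Exact fractions are used instead of decimals so the results can be compared with hand calculations without rounding.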
Now Let's Apply Fantasy Football!
Example #1
This is a 2-way table I made that shows the consensus 2024 Fantasy Football draft in PPR leagues from FantasyPros, ordered by ADP (Average Draft Position). For example, the top 3 draft picks are CMC with an ADP of 1.0, CeeDee Lamb with an ADP of 2.0, and Tyreek Hill with an ADP of 3.2, so the order would be CMC 1st pick, Lamb 2nd pick, then Tyreek 3rd pick. After all 180 players were ordered, I split them into 15 groups of 12 according to their ADP to represent a 12-man league, which means 12 picks per round, with 15 rounds of drafting.
The table has 6 columns, one for each position, and 15 rows, one for each round. Note that there is also a "total" column all the way on the right, which gives the total number of players drafted in each round regardless of position. It is always 12. The "total" row at the bottom gives the total number of players drafted at each position. In the bottom right, we see the total number of players drafted across all 15 rounds and all positions. Basically, this is everyone in our population. We know our table is correct if the values in the "total" row and the values in the "total" column each add up to the population number, which is 180.
Note: These are just ADPs, and your own fantasy draft, although very similar to this, will be different. But for practical purposes, we can pretend that this was a real draft for a made-up league we had.
Prompt #1: Suppose we randomly select just 1 player from the whole draft without regard to any rounds. What is the probability that we select a Wide Receiver?
Work/Explanation: We can notate this as P (Wide Receiver). To solve this, we divide our quantity (the total number of WRs) by the total quantity (everyone). Since we have a 2-way table, both numbers can be read straight from its boxes.
The blue circle tells us the total number of WRs, and the purple circle tells us the total number of players in the draft without regard to position.
So we simply just divide these two numbers to get our answer.
Answer: Assuming we are selecting randomly from the draft, the probability of us selecting a Wide Receiver, P(Wide Receiver), is 66/180, or 36.67%.
Prompt #2: Suppose we randomly select just 1 player from the whole draft without regard to any rounds. What is the probability that we select a Wide Receiver AND a Runningback?
Work/Explanation: Whenever we see the word "and," we have to stop and think if these two events are mutually exclusive. The prompt says we select just 1 player, meaning that we only have 1 pick, so is it possible to get BOTH a WR and an RB in one pick? Are these two events mutually exclusive?
Answer: The prompt says that we are only selecting 1 player from the draft, making it impossible to get BOTH a WR and an RB in one pick. Therefore, these events are mutually exclusive. The probability of this intersection event, or P (Wide Receiver ∩ Runningback), is 0.
Prompt #3: Suppose we randomly select just 1 player from the whole draft without regard to any rounds. What is the probability that we select a Wide Receiver OR a Runningback?
Work/Explanation: This is now a union scenario, where we apply the addition rule with P (Wide Receiver ∪ Runningback ) = P(Wide Receiver) + P(Runningback) - P (Wide Receiver ∩ Runningback). But we just explained that because getting both a WR and an RB in one pick is impossible, the intersection probability is 0, so we can leave out the last part of our equation.
So now, it's just P (Wide Receiver ∪ Runningback ) = P (Wide Receiver) + P(Runningback).
We know that P (Wide Receiver) is 66/180 from Prompt #1, so now we need to find P (Runningback). Using the same process, by looking at the total row at the bottom, we find that this is 53/180.
Finally, we just plug in our numbers: P (Wide Receiver ∪ Runningback)= 66/180 + 53/180
Answer: Assuming we are selecting randomly from the draft, the probability of us selecting a Wide Receiver OR a Runningback, or P (Wide Receiver ∪ Runningback), is 119/180, or 66.11%.
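The arithmetic for Prompts #1 and #3 can be double-checked with a short Python sketch. The counts (66 WRs, 53 RBs, 180 players) come from the table totals quoted in the prompts; the variable names are just for illustration.

```python
from fractions import Fraction

total_players = 180
wide_receivers = 66
running_backs = 53

p_wr = Fraction(wide_receivers, total_players)
p_rb = Fraction(running_backs, total_players)

# One pick cannot be both a WR and an RB, so the intersection is 0.
p_wr_and_rb = Fraction(0)

# General addition rule.
p_wr_or_rb = p_wr + p_rb - p_wr_and_rb

print(f"P(WR) = {p_wr} = {float(p_wr):.2%}")
print(f"P(WR or RB) = {p_wr_or_rb} = {float(p_wr_or_rb):.2%}")
```

Note that Fraction reduces automatically, so 66/180 is displayed as 11/30.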
Prompt #4: Suppose we randomly select just 1 player from the whole draft without regard to any rounds. What is the probability that we select a Wide Receiver AND an 8th-rounder?
Work/Explanation: This is an intersection scenario, so we have to figure out if these events are mutually exclusive. Let's think about this. While it is not possible to get both a WR and an RB in one pick, it is possible to get a WR and an 8th-rounder in one pick: an 8th-round WR. In other words, a player can't be both an RB and a WR, but a player can be both a WR and an 8th-rounder. Therefore, these two events are not mutually exclusive, and the probability does not have to be 0.
Next, we recognize that we have a 2-way table, so we don't actually have to work through any formulas. We can just find the box where these two events intersect.
This is where these two events intersect, and we get our number: 6. There are six 8th-round WRs in this draft, and there are 180 players in total. So to answer the prompt, we just divide these two numbers.
Answer: Assuming we are selecting randomly from the draft, the probability of us selecting a Wide Receiver AND an 8th-rounder, or P (Wide Receiver ∩ 8th-rounder), is 6/180, or 3.33%.
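Reading a box off a 2-way table can be sketched in Python as a dictionary lookup. Only boxes quoted in this lesson are included here (the full table is not reproduced), and the names cell_counts and p_cell are hypothetical.

```python
from fractions import Fraction

TOTAL_PLAYERS = 180  # bottom-right box of the table

# (position, round) -> number of players.
# Only boxes quoted in the lesson; the QB box is used in a later prompt.
cell_counts = {
    ("WR", 8): 6,
    ("QB", 7): 1,
}

def p_cell(position, draft_round):
    """P(position AND round): the box count divided by the grand total."""
    return Fraction(cell_counts[(position, draft_round)], TOTAL_PLAYERS)

p_wr_and_8th = p_cell("WR", 8)
print(f"{p_wr_and_8th} = {float(p_wr_and_8th):.2%}")  # 1/30 = 3.33%
```

Because the box already counts players who satisfy both events, no independence check is needed before dividing.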
End of Example
B. Conditional Probabilities
Conditional probabilities include the word "given." So, given that event B happens, what is the probability that event A happens?
The notation looks like this: P(A | B ), where "P" means the probability of, "A" is event A, the " | " means given, and "B" of course means event B.
The formula to solve this probability is P(A | B ) = P (A ∩ B) / P (B)
And because P (A ∩ B) = P (A) x P (B | A), we can rewrite this formula for convenience as P (A | B) = [ P(A) x P(B | A)] / P (B)
If we are given a 2-way table like in the first example, we don't have to rely on this formula. We can simply take the count in the box where events A and B intersect and divide it by the total count for event B. In conditional probabilities, we don't divide by the whole total; we divide only by the total for the given event. We will demonstrate this in the example.
C. Complementary Probabilities
Complementary probabilities literally just mean, "What are the chances that this event DOESN'T happen?"
The notation is P (A ^c), where "P" is probability, "A" is event A, and "^c" means "doesn't happen." The formula is P (A ^c) = 1 - P(A)
But it doesn't have to be just event A. The complement rule works with any type of probability, such as union, intersection, and conditional: we just subtract that probability from 1.
Now Let's Apply Fantasy Football!
(This is the same 2-way table as in the previous example)
Prompt #1: Suppose we drafted at random. What is the probability that our pick was a Quarterback given that the pick was in the 7th round?
Work/Explanation: Because we have a table here, we won't need to resort to our equation. Now, our prompt says GIVEN that the pick was in the 7th round. So, we don't care about any other round here besides the 7th round, so let's block everything else out.
This is what our table looks like now that it's simplified. We are only looking at the 7th round.
And now, we see that in this round that consists of 12 total players, only 1 of them is a QB.
With this information, we can do 1 /12 to get our answer.
Answer: Assuming we randomly selected from the draft, the probability that our pick was a Quarterback given that the pick was in the 7th round, or P ( Quarterback | 7th round), is 1/12 or 8.33%.
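To see that the table shortcut agrees with the formula P(A | B ) = P (A ∩ B) / P (B), here is a small Python sketch using the numbers from this prompt (1 QB in the 7th round, 12 players in the round, 180 players overall). The variable names are illustrative only.

```python
from fractions import Fraction

total_players = 180
seventh_round_players = 12  # "total" column for round 7
seventh_round_qbs = 1       # QB box in the round 7 row

# Formula route: P(QB | 7th) = P(QB and 7th) / P(7th)
p_qb_and_7th = Fraction(seventh_round_qbs, total_players)
p_7th = Fraction(seventh_round_players, total_players)
p_qb_given_7th = p_qb_and_7th / p_7th

# Table shortcut: keep only the 7th-round row and divide within it.
shortcut = Fraction(seventh_round_qbs, seventh_round_players)

print(p_qb_given_7th, shortcut)  # 1/12 1/12
```

The grand total of 180 cancels out in the formula route, which is why blocking out the other rounds gives the same answer.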
Prompt #2: Suppose we drafted at random. What is the probability that our pick was NOT a Quarterback given that the pick was in the 7th round?
Work/Explanation: The keyword that tells us this is a complementary probability is the word "not." With any complementary probability, all we do is subtract our probability value from 1.
Now, we already know from the previous prompt that the probability of our pick being a Quarterback given it was in the 7th round was 1/12. So to find the complement of this, all we need to do is calculate 1 - (1/12).
Visually on our 2-way table, it looks like this. Before, we had our red circle on the "1," which means there is 1 quarterback in the round. But now, since we want a pick that is not a quarterback, our interest is in every other position in the round.
Answer: Assuming we selected randomly from the draft, the probability that our pick was NOT a Quarterback given that the pick was in the 7th round is 11/12, or 91.67%.
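The complement calculation in this prompt can be checked two ways in Python: by subtracting from 1, and by counting the non-QBs in the round directly. This is only a sketch using the 1-in-12 figure from the prompt; the variable names are hypothetical.

```python
from fractions import Fraction

p_qb_given_7th = Fraction(1, 12)

# Complement rule: P(not A) = 1 - P(A)
p_not_qb_given_7th = 1 - p_qb_given_7th

# Cross-check by counting: 12 players in the round, 1 of them is a QB.
by_counting = Fraction(12 - 1, 12)

print(p_not_qb_given_7th, f"{float(p_not_qb_given_7th):.2%}")  # 11/12 91.67%
```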
Example #2
But what if we don't have a table? Sometimes we aren't given any sort of diagram but are rather just told the probabilities by themselves. Here, we will need to use the equations.
Prompt #3: In the 2023 NFL season, 21 out of the top 35 RBs (ranked in ppr ppg with min 7 games) were 25 years old and under. Of this age group, 8 out of the 21 averaged at least 15 ppr ppg. For the top 35 RBs in the 2023 season, what was the probability of averaging at least 15 ppr ppg AND being 25 years old and under? These two events are dependent.
Work/Explanation: These two events are dependent, and this scenario is an intersection probability because of the word "and."
Here is the formula: P (A ∩ B) = P (A) x P (B | A)
We will replace with words: P ( ≤ 25 years old ∩ averages ≥ 15 ppr ppg ) = P ( ≤ 25 years old ) x P ( averages ≥ 15 ppr ppg | ≤ 25 years old )
Now, because 21 out of the top 35 RBs were ≤ 25 years old, the probability of this is 21/35.
Additionally, because 8 out of these 21 RBs averaged ≥ 15 ppr ppg, the probability of averaging ≥ 15 ppr ppg given ≤ 25 years old is 8/21.
Finally, we input our numbers in: P ( ≤ 25 years old ∩ averages ≥ 15 ppr ppg ) = (21/35) x (8/21).
Answer: For the top 35 RBs (ranked in ppr ppg with min 7 games) in the 2023 season, the probability of averaging at least 15 ppr ppg AND being 25 years old and under was 168/735, which simplifies to 8/35, or 22.86%.
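The multiplication in this prompt can be verified with a few lines of Python. The counts (35, 21, 8) are the ones given in the prompt; the variable names are made up for this sketch.

```python
from fractions import Fraction

top_rbs = 35
young_rbs = 21           # 25 years old and under
young_high_scorers = 8   # of those 21, averaged >= 15 ppr ppg

p_young = Fraction(young_rbs, top_rbs)
p_high_given_young = Fraction(young_high_scorers, young_rbs)

# Multiplication rule for dependent events: P(A and B) = P(A) x P(B | A)
p_young_and_high = p_young * p_high_given_young

# Cross-check: 8 of the 35 RBs satisfy both conditions.
direct_count = Fraction(young_high_scorers, top_rbs)

print(p_young_and_high, f"{float(p_young_and_high):.2%}")  # 8/35 22.86%
```

The 21s cancel, which is why the product equals the direct count of 8 out of 35.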
End of Example
D. Independence and Dependence
Events being independent of one another means that the outcome of event A does not affect the probability of event B happening. For example, successive coin flips are independent. Getting tails on your first flip does not increase or decrease the chances of you getting heads on the second flip.
On the flip side, events are dependent when the outcome of event A does affect the probability of event B happening. For example, say you are playing cards and need all 4 Kings to win the game. You have 3 so far and need 1 more. There are 6 cards left in the deck, one of which is the last King, meaning that the probability of getting a King on the next selection is 1/6. Your peer selects a card and gets a Jack. Now, the probability of getting a King is 1/5. Because event A, your peer not getting the King, increased the probability of event B, getting the King on the next turn, these events are dependent.
Recognizing our scenario as either independent or dependent is crucial because it affects our equations. As mentioned earlier, if event A and event B are independent, then P (A ∩ B) = P (A) x P (B).
If our events are dependent, then P (A ∩ B) = P (A) x P (B | A).
Note: There isn't any harm in using P (A ∩ B) = P (A) x P (B | A) for independent scenarios, because P (B | A) simply equals P (B) there, but it is unnecessary to include the given. On the other hand, if we use P (A ∩ B) = P (A) x P (B) for dependent scenarios, we will get our answer wrong. This is the more serious mistake.
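One way to test independence numerically is to compare P (A ∩ B) with P (A) x P (B): the events are independent exactly when the two are equal. The sketch below applies that check to numbers already quoted in this lesson (66 WRs, 12 players in the 8th round, 6 8th-round WRs, 180 players). The function name is_independent is hypothetical.

```python
from fractions import Fraction

def is_independent(p_a, p_b, p_a_and_b):
    """Events are independent exactly when P(A and B) == P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

p_wr = Fraction(66, 180)
p_8th = Fraction(12, 180)
p_wr_and_8th = Fraction(6, 180)

print(is_independent(p_wr, p_8th, p_wr_and_8th))  # False

# Using P(A) x P(B) on these dependent events gives the wrong answer:
print(p_wr * p_8th)   # 11/450
print(p_wr_and_8th)   # 1/30
```

In this table, being a WR and being an 8th-rounder are dependent, so the shortcut formula would understate the true probability of 1/30.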
Now Let's Apply Fantasy Football!
Example #1
Prompt #1: You are simultaneously waiting for the draft order in both of your Fantasy Football Leagues, where one is a 16-man League (meaning 16 possible slots in 1st round) and the other is an 8-man League (meaning 8 possible slots in 1st round). Are these events independent or dependent? What is the probability that you get the first overall pick for both leagues?
Work/Explanation: These two leagues are not connected to each other at all. You getting this drafting position in League A does not affect you getting this other drafting position in League B. Therefore, these two events are independent, and we can simplify our formula:
P (A ∩ B) = P (A) x P (B).
Because League A has 16 slots, the probability of getting the first overall pick is 1/16. And because League B has 8 slots, the probability of getting the first overall pick is 1/8.
P (1st pick in League A ∩ 1st pick in League B) = P (1st pick in League A ) x P (1st pick in League B)
= (1/16) x (1/8)
Answer: These events are independent of one another. The probability of getting the 1st overall pick in both leagues is 1/128 or 0.78%.
Prompt #2: You are simultaneously waiting for the draft order in 3 of your Fantasy Football Leagues, where one is a 16-man League (meaning 16 possible slots in 1st round), one is an 8-man League (meaning 8 possible slots in 1st round), and one is a 12-man League (meaning 12 possible slots in 1st round). Are these events independent or dependent? What is the probability that you get the first overall pick in all 3 leagues?
Work/Explanation: These 3 leagues are not connected to each other at all. Your draft position in one league does not affect your draft position in any other league. Therefore, these three events are independent, and we can simplify our formula while adding another event: P (A ∩ B ∩ C) = P (A) x P (B) x P(C)
Because League A has 16 slots, the probability of getting the first overall pick is 1/16. And because League B has 8 slots, the probability of getting the first overall pick is 1/8. And because League C has 12 slots, the probability of getting the first overall pick is 1/12.
P (1st pick in League A ∩ 1st pick in League B ∩ 1st pick in League C) = P (1st overall pick in League A ) x P (1st overall pick in League B) x P (1st overall pick in League C) = (1/16) x (1/8) x (1/12)
Answer: These events are independent of one another. The probability of getting the 1st overall pick in all 3 leagues is 1/1536 or 0.07%.
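Both league prompts follow the same pattern, so they can be sketched with one small Python function. It assumes, as the prompts do, that each league assigns draft slots at random and independently of the others; the name p_first_pick_in_all is hypothetical.

```python
from fractions import Fraction
from math import prod

def p_first_pick_in_all(league_sizes):
    """P(1st overall pick in every league), assuming independent random draws."""
    return prod(Fraction(1, size) for size in league_sizes)

two_leagues = p_first_pick_in_all([16, 8])
three_leagues = p_first_pick_in_all([16, 8, 12])

print(two_leagues, f"{float(two_leagues):.2%}")      # 1/128 0.78%
print(three_leagues, f"{float(three_leagues):.2%}")  # 1/1536 0.07%
```

Adding another independent league just multiplies in one more factor, which is why the probability shrinks so quickly.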
Example #2
Prompt #1: You and your 11 friends are randomly deciding Fantasy Football draft positions in your 12-man league. You decide by selecting paper slips from a box, where each slip has a number from 1-12 on it. There are no duplicates. You are going to select 2nd. You want the 6th overall pick. The person in front of you draws a slip and gets the 2nd overall pick. Are these events independent or dependent? What is the probability that you get the 6th overall pick after the friend in front of you got the 2nd overall pick?
Work/Explanation: In this example, we are only dealing with one league. The question is, does the friend in front of you getting the 2nd overall pick affect your probability of getting the 6th overall pick? The answer is yes.
Imagine that you were first up to draw a paper slip. The probability of you getting the 6th overall pick is 1/12. (1 slip with #6, in a group of 12 slips)
But in this scenario, after your friend gets the 2nd overall pick with his random selection, the probability of you getting the 6th overall pick is now 1/11. (1 slip with #6, in a group of 11 slips)
In this case, your friend not getting the draft position you wanted increased the chances of you getting it.
Answer: These events are dependent. As the second selector, the probability of getting the 6th overall pick given that your friend before you got the 2nd overall pick is 1/11.
Note: What would make this problem independent would be to put each slip back after it is drawn. For example, after the friend before you draws their paper slip, they would place it back into the box. Now, as the second selector, your probability of getting a single desired draft pick is still 1/12, because the person before you put theirs back. But we know that this is not how Fantasy Football draft position drawings work, as putting back your slip risks another person getting the same slip that you got, meaning that there could be multiple people in one position. For example, if your paper said #2, meaning 2nd overall pick, and you placed it back, the person after you could draw that same slip, and both of you would then have the 2nd overall pick.
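The difference between drawing without and with replacement can be confirmed by listing every equally likely (friend's slip, your slip) pair in Python. This is a sketch under the prompt's assumption that every slip is equally likely to be drawn; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

slips = range(1, 13)  # slips numbered 1-12

def p_you_get_6_given_friend_got_2(draws):
    """Among equally likely (friend, you) draws where the friend drew #2,
    return the fraction in which you drew #6."""
    friend_got_2 = [d for d in draws if d[0] == 2]
    you_got_6 = [d for d in friend_got_2 if d[1] == 6]
    return Fraction(len(you_got_6), len(friend_got_2))

# Without replacement: the two slips must be different.
without_replacement = p_you_get_6_given_friend_got_2(list(permutations(slips, 2)))

# With replacement: the friend's slip goes back in the box.
with_replacement = p_you_get_6_given_friend_got_2(list(product(slips, repeat=2)))

print(without_replacement)  # 1/11
print(with_replacement)     # 1/12
```

Without replacement the friend's draw shrinks the pool to 11 slips, so the events are dependent; with replacement the pool stays at 12, so your probability is unchanged.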
End of Example
2 Takeaways for Fantasy Football
1. Independence
Many players look at a player's game log and right away view their touchdowns. Some might say, "He's scored a touchdown in each of his last 6 games, so he has to score one this week too," or "He hasn't scored a touchdown in 3 weeks, which means that he is due for one. Get ready for a monster game." But in reality, just like coin flips, where heads and tails don't affect each other, getting a touchdown in the last game does not affect the chances of getting a touchdown in the next.
Yes, we want to look at a player's history of games to predict his next performance, but merely looking at his touchdowns is not sufficient. We should instead look at other metrics like his snap or target shares, for example. In conclusion, don't be fooled by your league mates who make these sorts of statements.
2. Dependence
In Fantasy Football drafts, it's useful to predict which players our league mates will draft because that helps us decide whether to grab players early or wait until the next round. For example, you are in the 6th round of your draft, and you need a QB. Jordan Love catches your eye, but you wonder if he will still be available to you in the 7th round. You see that all of your league mates already have QBs, which means that the probability of one of them selecting Jordan Love is low. This scenario is dependent because whether or not a league mate already has a QB affects how likely he is to draft an additional QB.
If all of your league mates already have a QB, then they are most likely not going to draft Jordan Love in the 6th round, meaning that you have a high chance of being able to wait one more round and get him in the 7th.