Introduction to Statistics – Simpson’s Paradox
Simpson’s Paradox
Video Transcript
- Simpson’s paradox is a trap and you can easily fall into it and
- once you know about it, it’s hard to believe any summary
- statistics that anyone ever tells you.
- It’s really cool though.
- Okay so Simpson’s paradox deals with aggregating smaller
- data sets into larger ones and it’s when conclusions
- drawn about the smaller data sets are actually the opposite
- of conclusions drawn from the larger data sets.
- And you’ll see that it occurs when there is a lurking variable
- and uneven-sized groups being combined, okay, so
- there are these two ingredients the lurking variable and
- the uneven-sized groups.
- Let me give you an example.
- This is about kidney stone treatments, okay.
- So there are two treatments, treatment A and treatment B and
- then this is the rate of success of the treatment.
- So clearly,
- it looks like treatment B is more effective right,
- because 83% of people who have treatment B are cured, right?
- 289 out of the 350 people who took treatment B were cured,
- whereas with treatment A, it was only 78%.
- So which treatment would you take?
- It looks like treatment B.
- Now I’m gonna give you a little bit more information.
- What I will tell you is that if you have a small kidney stone,
- then which treatment is more effective for you?
- Treatment A, because you’re cured 93% of the time.
- If you have a large kidney stone,
- which treatment is more effective for you?
- Treatment A, because it’s 73% versus 69%.
- So treatment A is more effective for you whether you have a small
- kidney stone or a large kidney stone.
- But if you look at both,
- if you combine the data together, it looks like
- treatment B is more effective because 83% of the people who took
- treatment B were cured versus 78% of those who took treatment A.
- So what happened?
- I give you more information and
- we get the opposite results, right?
- No matter whether you have small stones or large stones,
- treatment A is more effective,
- whereas if you just combine the information together,
- it looks like treatment B is more effective.
- Okay, so the answer is, this is a case of Simpson’s paradox.
- There’s a lurking variable which is the size of the kidney stone,
- and there are uneven sized groups being combined.
- In particular,
- since most of the small stone patients took treatment B and
- since most of the large stone patients took treatment A,
- you end up combining basically group two and group three.
- You end up combining the small stone people who took
- treatment B to the large stone patients that took treatment A,
- which is not fair, okay.
- So when you combine all that stuff together,
- you actually lost a whole lot of information and
- you got the opposite result from the one you were supposed to get.
- Okay, so why do you think they give people treatment B at
- all, right?
- So if you have a small kidney stone, you get treatment B
- whereas, clearly, treatment A is the better treatment, and they
- give it pretty much only to people with large kidney stones.
- And so the answer might be that treatment B is less invasive,
- it might have fewer side effects, be less expensive,
- maybe it’s more easily available.
- So there’s a clear reason why you would see data like this and
- it’s very clear that treatment A is more effective.
- It’s just that when you did this particular calculation that
- combines everything together, there is a case of Simpson’s
- paradox and you got the wrong conclusion.
- Here’s another example.
- We have two airlines, Blue Yonder Airlines and
- Consolidated Messenger.
- And what I’m showing you is the delay rate on Blue Yonder
- is 13.3% and then Consolidated Messenger’s delay rate is 10.9%.
- So which one would you rather take?
- Clearly Consolidated Messenger, because there are fewer delays.
- But now I’m gonna give you more information.
- I’m gonna break it down by city here and
- I’m giving you the delay rate for each city and
- you can see that for every single city, Blue Yonder
- actually has a lower delay rate than Consolidated Messenger.
- So what happened?
- Well as it turns out, Consolidated Messenger flies
- mostly out of Phoenix, right, that’s where most of its
- flights are, out of Phoenix where the delay rate is 7.9%.
- Whereas Blue Yonder flies mostly out of Seattle, right, that’s
- where a lot of its flights are, with a much higher delay rate
- because Seattle is a much bigger city, so delay rates are higher.
- So when you combine everything together,
- you’re essentially comparing Blue Yonder’s Seattle rate
- to Consolidated Messenger’s Phoenix rate which is not fair.
- And so that is again a case of Simpson’s Paradox,
- with a lurking variable which is the city, and
- uneven sized groups being combined.
- So this is Simpson’s Paradox: it deals with
- aggregating smaller data sets into larger ones and
- it’s where conclusions drawn from the smaller data sets
- are actually the opposite of conclusions drawn from
- the larger data set and it occurs when there
- is a lurking variable and uneven sized groups being combined.
- You have to have those two ingredients.
- And after you hear about Simpson’s Paradox, you’re less
- likely to believe any statistics that you hear in the news.
- So I often wonder whether there’s a lurking variable that
- would explain things in exactly the opposite way of what’s
- reported.
- So I hope you enjoyed hearing about Simpson’s paradox.
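The kidney stone example can be checked with a few lines of arithmetic. The sketch below is one way to do it, not the instructor's own code. The transcript only states that 289 of 350 patients on treatment B were cured, plus the rounded rates (93%, 73%, 69%, 78%, 83%). The per-group counts used here are an assumption: they are the figures usually quoted for this kidney stone example and are consistent with every number in the transcript, but the slide itself is not reproduced in the text. The names `COUNTS`, `success_rate`, `pooled`, and `shows_reversal` are made up for illustration.

```python
# (cured, treated) for each treatment and stone size.
# From the transcript: treatment B cured 289 of 350 overall, and the rounded
# rates 93% (A, small), 73% vs 69% (large), 78% vs 83% (overall).
# ASSUMPTION: the individual counts below are not stated in the transcript;
# they are the commonly quoted figures for this example and match those rates.
COUNTS = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(cured, treated):
    """Fraction of treated patients who were cured."""
    return cured / treated


def pooled(groups):
    """Add up cured and treated counts across all stone sizes."""
    cured = sum(c for c, _ in groups.values())
    treated = sum(t for _, t in groups.values())
    return cured, treated


def shows_reversal(counts, first, second):
    """True if `first` beats `second` in every subgroup but loses once pooled."""
    wins_each_group = all(
        success_rate(*counts[first][g]) > success_rate(*counts[second][g])
        for g in counts[first]
    )
    loses_pooled = (
        success_rate(*pooled(counts[first]))
        < success_rate(*pooled(counts[second]))
    )
    return wins_each_group and loses_pooled


for treatment, groups in COUNTS.items():
    for size, (cured, treated) in groups.items():
        rate = success_rate(cured, treated)
        print(f"Treatment {treatment}, {size} stones: {cured}/{treated} = {rate:.0%}")
    cured, treated = pooled(groups)
    print(f"Treatment {treatment}, pooled: {cured}/{treated} = {success_rate(cured, treated):.0%}")

print("Reversal (A wins per group, B wins pooled):", shows_reversal(COUNTS, "A", "B"))
```

Under these assumed counts, the uneven group sizes are easy to see: 263 of treatment A's 350 patients had large stones, while 270 of treatment B's 350 had small stones. The pooled rate is a weighted average of the per-group rates, so A's pooled figure is dominated by the harder cases and B's by the easier ones, which is what flips the comparison.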