Probability and Random Variables – Central Limit Theorem

Central Limit Theorem

Well, let me tell you about the central limit theorem and
the normal distribution.
Now, every day a bunch of customers go into a store,
each with their own kooky distribution
of how much they wanna spend.
Like, this guy wants to spend, sometimes a little bit,
sometimes a lot, but not much in between.
This person spends a little bit, a medium amount, or
a lot, sometimes an awful lot.
And this guy, he just likes ties.
So he always buys ties, and they’re always
somewhere in this price range.
Now, the question is, at the end of the day,
the store counts the total amount of money that
they collect from all of these customers.
So how much total do they spend at the store?
And let’s assume we have a lot of customers in the store.
And we even have some students taking this course going
to the store, and bringing their computers to the store just for
fun, while they’re learning data science.
And each of them has their own distribution
of how much they spend.
Every evening, the store counts their total profit, and every
day it was slightly different than it was the day before.
And now we ask the store to
plot the distribution of its profits.
Here’s what it looks like.
A bump.
And we ask them what the center of the bump is, and magically as
it turns out, the center of this distribution of total profits
is exactly the sum of the means of the individual customers.
My goodness, that is cool.
Now, start over again.
Now, this is a totally different store with totally different
customers.
In fact, it’s on the other side of the world.
Everything’s totally different.
And now we ask,
what is the distribution of this store’s profits?
And rather interestingly, ha, ha,
they also end up with a very similar looking sort of bump.
Totally different store.
Totally different customers, and
yet, a single bump with the same shape.
And the mean has the same formula too.
It’s the sum of the customers’ individual means.
But what the heck is going on here?
Are these stores in cahoots?
Did they talk to each other and
negotiate the distribution of their sales?
No way.
It turns out, and get this, it always happens.
The same bump with the same shape, right?
It might be stretched out a bit, or it might be squeezed a bit,
or scaled a bit, or
shifted a bit, but it’s really the same shape.
So this is called the normal distribution.
And its shape is given by this particular formula, and
it’s always the same formula.
It’s always the same shape.
And the formula has a mean in it, mu, the mean of this thing.
And the standard deviation sigma,
that’s the measure of the spread of that distribution.
Now if you know those two things, if you know the mean and
you know the sigma, you have the whole formula.
So you know the whole shape of that curve.
And it always integrates to one.
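
The formula shown in the video is not reproduced in the transcript. As a minimal sketch, assuming it is the standard normal density with mean mu and standard deviation sigma, the code below writes that density out and checks numerically that the area under it is one. The helper names `normal_pdf` and `integrate`, and the values chosen for mu and sigma, are illustrative and not from the lecture.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


mu, sigma = 3.0, 2.0  # arbitrary illustrative values, not from the lecture

# Integrate over mu +/- 10 sigma; the tails beyond that are negligible.
area = integrate(lambda x: normal_pdf(x, mu, sigma),
                 mu - 10 * sigma, mu + 10 * sigma)
print("area under the curve:", round(area, 6))
```

Changing `mu` only shifts the curve and changing `sigma` only stretches or squeezes it, so the area should stay at one for any valid choice.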
Now funny things happen when you fiddle with the mean and
the variance there.
So you can get these very kinda peaky normal distributions
with small standard deviations.
Or you could get these very broad ones with large standard
deviations.
And the mean can actually be any value it wants to be.
And the standard deviation, as long as it’s positive,
can be anything.
And there’s that formula again.
So, as long as you know the mean and the standard deviation,
you got the full shape.
Now, the cool thing is this fact that I told you,
that the sum of a large number of independent random variables
is approximately normal.
And this is actually called the central limit theorem.
And this is one of the most famous theorems in the world,
the central limit theorem.
So, even though all of the different customers had
a totally different distribution,
when you add them up, the sum is approximately normal.
So let’s have X1 through Xn be independent random variables.
Their means are mu 1 through mu n.
The standard deviations are sigma 1 through sigma n.
Now I take their sum.
Okay, this is the total sales for the store.
And the Xs are the sales for the individual customers.
And now as it turns out, that sum is approximately
normal with mean, which is the sum of the means.
So the mean of the sum is the sum of the means.
And the variance of the sum turns out to be
the sum of the variances of the independent random variables.
Of course, the standard deviation is just the square
root of the variance.
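
The two rules just stated (the mean of the sum is the sum of the means, and the variance of the sum is the sum of the variances) can be checked with a small simulation. This is a rough sketch rather than anything from the lecture: the three customer types and their spending distributions are invented for illustration, loosely following the customers described earlier, and the store is assumed to see 20 customers of each type per day.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical customer types: (sampler, true mean, true variance).
customers = [
    # Spends either a little or a lot, nothing in between.
    (lambda: random.choice([5.0, 95.0]), 50.0, 2025.0),
    # The tie buyer: always somewhere in one price range.
    (lambda: random.uniform(20.0, 30.0), 25.0, 100.0 / 12.0),
    # Usually modest, sometimes an awful lot.
    (lambda: random.expovariate(1.0 / 40.0), 40.0, 1600.0),
]

store = customers * 20  # assumed: 20 independent customers of each type


def simulate_day(customer_list):
    """Total spent in one day: one independent draw per customer."""
    return sum(draw() for draw, _, _ in customer_list)


totals = [simulate_day(store) for _ in range(5000)]

expected_mean = sum(mean for _, mean, _ in store)
expected_var = sum(var for _, _, var in store)

print("mean of totals:    ", statistics.mean(totals), "expected:", expected_mean)
print("variance of totals:", statistics.variance(totals), "expected:", expected_var)
```

The simulated values should land close to the expected ones but not exactly on them, since 5000 simulated days is still a finite sample.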
Now, this theorem only works when
the variables are independent.
So you just have to make sure that that’s true.
And that’s true for
sales because each customer comes into the store and
doesn’t worry about what another customer is doing.
Now the larger n is, the closer to normal,
that sum is.
And it turns out that if the Xi’s are actually normal in
the first place, then their sum is normal anyway.
Exactly normal.
So if you start out with weird distributions though,
then n needs to be a bit larger to look normal.
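
One way to see how n affects closeness to normal is to track skewness, which is zero for a normal distribution. The sketch below is an illustration under assumptions, not part of the lecture: it uses exponential variables as the "weird" starting distribution, for which the sum of n independent draws has skewness 2 / sqrt(n).

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable


def skewness(xs):
    """Sample skewness: the average cubed standardized deviation."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n


def sum_of_draws(n):
    """Sum of n independent exponential(1) draws, a strongly skewed distribution."""
    return sum(random.expovariate(1.0) for _ in range(n))


skew_by_n = {}
for n in (1, 5, 50):
    sums = [sum_of_draws(n) for _ in range(20_000)]
    skew_by_n[n] = skewness(sums)
    print("n =", n,
          "sample skewness:", round(skew_by_n[n], 2),
          "theory:", round(2 / math.sqrt(n), 2))
```

The skewness shrinks toward zero as n grows, but slowly, which matches the remark that weird starting distributions need a larger n before the sum looks normal.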
Now, because of the central limit theorem,
this distribution and this formula pops up all the time.
Now, this formula is not something that a person made up,
it’s something that exists in nature.
It’s just as natural as the patterns you might find when
looking at water waves or seashells.
It’s something that comes with the earth.
It’s the distribution of the amount of rain in
Boston over the year.
It’s the distribution of test grades
assuming no systematic cheating.
And it’s the distribution of anything that’s a sum of
independent events.
This is like E equals mc squared for probability.
You might say, well, but I thought you
already told us about sums of independent random variables?
Doesn’t the binomial distribution come from a sum of
independent random variables?
Well, as it turns out, the limiting binomial distribution
is actually normal, so isn’t that lovely?
So, if I start out with just one trial with probability 0.5,
the binomial distribution is pretty boring,
it just looks like this, right?
This is just a fair coin with a single coin flip.
Half the time you’ll get 0, half the time you’ll get 1.
Then when I start flipping more coins
it starts to look a little bit more normal.
Flip 3 coins.
4 coins.
5, and then we can flip 10 coins, 100 coins, and
1000 coins.
And then you’ll see what looks like very much a beautiful
normal looking distribution.
So the limiting binomial is a normal.
Isn’t that lovely?
Okay, so, yes, so that’s the point.
For large n, the binomial distribution actually becomes
the normal distribution.
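
The coin-flip sequence above can be checked numerically. As a hedged sketch, the code below compares the binomial probabilities for a fair coin with a normal density that has the matching mean n p and standard deviation sqrt(n p (1 - p)), and reports the largest absolute gap between the two for each n. The function names are illustrative and not from the lecture.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials), computed in log space for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def max_gap(n, p=0.5):
    """Largest absolute gap between the binomial pmf and the matching normal pdf."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
               for k in range(n + 1))


gaps = {n: max_gap(n) for n in (1, 10, 100, 1000)}
for n, gap in gaps.items():
    print("n =", n, "largest gap:", gap)
```

The gap shrinks as the number of flips grows, which is the sense in which the binomial "becomes" normal: for any finite n it is an approximation, and the approximation improves with n.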

Data Science Essentials & Machine Learning
