Probability and Random Variables – Discrete Random Variables

Discrete Random Variables

Video transcript

Let’s talk about discrete random variables.
Now, the kind of quintessential discrete random
variable is the value on a die when you roll it.
Now, the name of the random variable is X,
it’s the value of that die when you roll it.
Now, little x is gonna be an outcome.
So here it’s gonna take values either 1, 2, 3, 4, 5 or 6.
And then the probability that random variable X takes
outcome value little x is denoted this way, okay?
So this is the probability that random variable
X will have outcome little x.
And the probabilities have to sum to 1.
But for the roll of a die, the outcomes are equally probable,
they all have probability 1/6.
So make sure you understand the notation here, okay?
So this says that the probability
that random variable X takes outcome 4 is 1/6th.
And this table is called the probability mass function.
It tells you, for each outcome, what is
the probability of attaining that outcome.
Now, here the probability mass function is constant,
it’s always 1/6.
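
As a minimal sketch of the table just described, the fair die's probability mass function can be written as a Python dictionary that maps each outcome to its probability. The name `fair_die_pmf` is illustrative and does not come from the lecture.

```python
from fractions import Fraction

# PMF of a fair six-sided die: every outcome has probability 1/6.
# The name fair_die_pmf is illustrative, not from the lecture.
fair_die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# P(X = 4): look up the probability of outcome 4.
p_four = fair_die_pmf[4]

# A valid PMF must sum to exactly 1.
total = sum(fair_die_pmf.values())

print(p_four)  # 1/6
print(total)   # 1
```

Using `Fraction` keeps the arithmetic exact, so the sum-to-one check does not depend on floating-point rounding.
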
But there are other random variables where it’s not
constant.
So here, for instance, is a PMF for a very weird die.
This die, instead of having values 1, 2, 3, 4, 5, or
6 on its sides, it has 10, 20, 30, 40, 50, and 60.
And it’s also a weighted die,
where not all the probabilities are 1/6.
So here, one of the probabilities is 3/12 and
another is 1/12.
So just double check for yourself that the probabilities
all add up to 1, and it looks like they do.
Well, okay, so let’s go back and review the general notation.
So the random variable is called capital X, okay, and
the possible outcomes are called little x, and the probability
for outcome little x is denoted like this, okay?
So, this is the probability that random variable X equals
outcome little x, little x3, say, and that equals P3.
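
The slide notation is not captured in the transcript. One common way to write it, assuming the outcomes are labeled $x_1, \dots, x_n$ with probabilities $p_1, \dots, p_n$, is:

```latex
P(X = x_i) = p_i, \qquad p_i \ge 0, \qquad \sum_{i=1}^{n} p_i = 1
```
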
Okay, so when I’m talking about a discrete random variable,
I am talking about its PMF.
You cannot refer to a discrete random variable without
at least thinking about what its PMF is because that’s
what defines probability in this context.
Okay.
So, armed with this notation, let us talk about how to
summarize a discrete random variable.
Now, it’s important that when I refer to a random variable,
I’m referring to its probability mass function.
Let us talk about how one might summarize a PMF
without having to present the whole thing.
The whole thing could be very big,
it could be very overwhelming, and we just want one or
two numbers that really summarize what it looks like.
Okay, so how are we gonna compute the mean of the PMF,
mean of the random variable, and
how are we gonna compute the spread of it?
So let’s talk about the mean.
The mean is a measure of centrality for
the probability mass function.
What is the middle number of the PMF?
So we’ll use this die’s PMF, which is constant here.
And this is the die that
has the labeled sides 10, 20, 30, 40, 50, 60.
Okay, so what is the average outcome that you should get when
you roll this die over and over again?
And I’m sure you all know the answer, which is that it’s 35,
which is right in the middle there.
And here’s the computation you sorta did in your head in order
to get that.
You multiplied each outcome
by the probability that you get that outcome.
And then you add them all up.
So that’s the computation you did to get to the middle
number there.
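
The computation just described can be sketched in a few lines of Python, assuming the die with sides 10 through 60 and equal probabilities of 1/6. The variable names are illustrative.

```python
from fractions import Fraction

# Die with sides labeled 10, 20, ..., 60, each with probability 1/6.
outcomes = [10, 20, 30, 40, 50, 60]
pmf = {x: Fraction(1, 6) for x in outcomes}

# Multiply each outcome by its probability, then add the products.
mean = sum(x * p for x, p in pmf.items())

print(mean)  # 35
```
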
So I can write that in general notation this way.
I can say that it’s outcome 1 times the probability of
outcome 1 plus outcome 2 times the probability of outcome 2
and so on.
And then I can write that in summation notation like this.
Okay, it’s the sum over i for outcome i and
probability of outcome i.
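
The slide itself is not part of the transcript. Assuming outcomes $x_1, \dots, x_n$ with probabilities $p_i = P(X = x_i)$, the expanded form and the summation form described here would read:

```latex
\mu = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n = \sum_{i=1}^{n} x_i \, p_i
```
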
Okay, and that’s the formula for the mean of a discrete random
variable. Okay, so let’s try this die.
So its PMF is slightly different, but
we can just apply the formula.
So it’s each outcome times its probability of occurring,
add them all up, and that is the mean.
Okay, and then here,
when I did that, it was a little bit less than 35.
And you can see why that is by looking at the probability
mass function.
See, we’ve got more mass on the smaller values here, right?
We moved some of the mass down lower, so we get a smaller mean.
Okay, so we’re now likely to choose 20 more often, so
that lowers the average.
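
The transcript does not record the full table for the weighted die. It says only that one probability is 3/12, another is 1/12, and that 20 now comes up more often. The PMF below is therefore an assumption chosen to match those hints, not the lecture's actual table; it only illustrates how shifting mass toward smaller outcomes pulls the mean below 35.

```python
from fractions import Fraction

# ASSUMED weights: the transcript gives only 3/12 and 1/12 and says 20 is
# more likely. Placing 1/12 on 50 and 2/12 elsewhere is a hypothetical choice.
weighted_pmf = {
    10: Fraction(2, 12),
    20: Fraction(3, 12),
    30: Fraction(2, 12),
    40: Fraction(2, 12),
    50: Fraction(1, 12),
    60: Fraction(2, 12),
}

# Check that the assumed probabilities form a valid PMF.
total = sum(weighted_pmf.values())

# Same formula as before: sum of outcome times probability.
weighted_mean = sum(x * p for x, p in weighted_pmf.items())

print(total)                 # 1
print(float(weighted_mean))  # 32.5
```
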
Okay, more practice.
Here’s a new random variable, and here is its PMF.
And you can sorta look at that for
a bit, and realize that the values in the middle are more
likely than the values at the extremes, okay?
And you can see that from looking at these
numbers directly.
But see, what if there were a heck of a lot more numbers,
and what if the table was sort of hundreds of
lines longer, thousands of lines long, then you wouldn’t be
able to look at all the numbers and figure out that some of the ones
in the middle are higher than the ones at the extremes.
Okay, so one way to handle that is to actually visualize it.
So I’m gonna do a bar chart of the PMF.
So for outcome zero, I plotted the probability to get zero.
For outcome 1, I plotted the probability to get 1, and so on.
And you can see the nice PMF without having to try and
summarize a table of numbers in your head.
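
A rough text-only version of such a bar chart is sketched below. The transcript states only P(0) = 0.03 and P(1) = 0.14 for this random variable; the other three probabilities are an assumption, namely the values that reproduce the mean (2.45) and variance (1.0675) quoted later if the outcomes run from 0 to 4. The function name `text_bar_chart` is hypothetical.

```python
# ASSUMED PMF: only P(0) = 0.03 and P(1) = 0.14 appear in the transcript.
# The remaining values are chosen to be consistent with the quoted mean
# (2.45) and variance (1.0675), assuming outcomes 0 through 4.
pmf = {0: 0.03, 1: 0.14, 2: 0.36, 3: 0.29, 4: 0.18}

def text_bar_chart(pmf, scale=100):
    """Return one line per outcome, with a bar proportional to its probability."""
    lines = []
    for outcome, prob in sorted(pmf.items()):
        # With scale=100, each '#' stands for one percentage point.
        bar = "#" * round(prob * scale)
        lines.append(f"{outcome} | {bar} {prob:.2f}")
    return lines

chart = text_bar_chart(pmf)
for line in chart:
    print(line)
```

Even in this crude form, the longest bars sit at the middle outcomes, which is the pattern that is hard to spot in a long table.
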
Okay, more practice.
So computing the mean using the formula that I discussed
earlier with you, and sanity check, does it look right?
Is 2.45 in the center of the distribution?
Yes, it is, so that looks good.
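
The same check can be sketched in Python. As an assumption, the PMF below uses the two probabilities stated in the transcript (0.03 and 0.14) plus three values inferred from the quoted mean and variance, with outcomes taken to be 0 through 4.

```python
# ASSUMED PMF: P(0) and P(1) are from the transcript; the rest are inferred.
pmf = {0: 0.03, 1: 0.14, 2: 0.36, 3: 0.29, 4: 0.18}

# Mean: sum over outcomes of outcome times probability.
mean = sum(x * p for x, p in pmf.items())

# Sanity check: the mean must lie between the smallest and largest outcome.
looks_right = min(pmf) <= mean <= max(pmf)

print(round(mean, 2))  # 2.45
print(looks_right)     # True
```
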
Okay, so there’s the formula for the mean,
that’s what you just learned.
Now the mean is also called the expectation, by the way, or
the expected value.
And I’m gonna use both terminologies and both notations
kind of throughout, so let’s just remember that they stand
for the same thing. Okay, so now that
we have a measure of the center of the distribution, let us try
to get some way of measuring the spread of the distribution.
Here’s my random variable.
This is the PMF for my random variable, and I want some
measure of the spread of this thing around the mean.
I wanna know how spread out it is, so let me try a guess.
So here’s my guess.
I take the distance of each outcome xi from the mean,
that’s the first step.
Now, remember that xi could be on either side of the mean, so
I need the absolute value here, right, to compute distance.
And then I’m going to multiply each distance
by how often it occurs, and I will call that the spread.
Okay, so if the distances are very large, fairly often,
then this thing will be large.
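
The guess is shown on a slide that the transcript does not capture. Assuming the mean is written $\mu$ and $p_i = P(X = x_i)$, it would read roughly as follows:

```latex
\text{spread (first guess)} = \sum_{i=1}^{n} p_i \, \lvert x_i - \mu \rvert
```
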
Okay, so what do you think of that?
Cool?
Well, so this thing, it’s a good guess but
it’s not quite what I’m looking for.
But it is something that is only a little bit different.
But the intuition for this thing holds for
the real definition of the spread that I’m going to use.
So here’s the real definition.
Instead of computing the mean distance from the mean outcome,
it’s actually the mean squared distance, okay?
So, since you’re still looking at distances from the center of
the distribution this really is the measure of the spread,
right?
This is the official definition of the variance of
a distribution.
Okay, so when you look at this,
what you should see is distances from the mean of
the distribution weighted by their probability of occurring.
And that is the variance.
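
Written out under the same assumed notation ($p_i = P(X = x_i)$ and $\mu$ for the mean), the definition described here is:

```latex
\operatorname{Var}(X) = \sum_{i=1}^{n} p_i \, (x_i - \mu)^2, \qquad \mu = \sum_{i=1}^{n} x_i \, p_i
```
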
Let’s do this computation here.
Let’s compute the variance of X.
Here are the outcomes and here are the probabilities.
And here is the formula that we just derived.
Now, let’s look at this top line here.
We have probability of 0.03 times the distance of x,
which is zero minus 2.45, which is the mean we computed earlier.
And then we square that, the squared distance from the mean.
Okay, good.
So that’s for the first term, so that’s this term, and
now let’s do the rest of the terms.
Now here’s the second term.
0.14 is the probability of that outcome
times the distance of 1 from 2.45 squared.
And then I just put some dot dot dots just so
we don’t have to write them all out.
And then here is the last term for that line right there.
And you get 1.0675.
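
The term-by-term computation can be sketched as follows. Only the first two probabilities (0.03 and 0.14), the mean 2.45, and the result 1.0675 come from the transcript; the remaining probabilities are an assumption that reproduces those numbers when the outcomes are 0 through 4.

```python
# ASSUMED PMF: P(0) and P(1) are from the transcript; the rest are inferred.
pmf = {0: 0.03, 1: 0.14, 2: 0.36, 3: 0.29, 4: 0.18}

# Mean, as computed earlier in the lecture (2.45).
mean = sum(x * p for x, p in pmf.items())

# First two terms, as walked through in the transcript.
first_term = 0.03 * (0 - mean) ** 2
second_term = 0.14 * (1 - mean) ** 2

# Variance: probability-weighted squared distance from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())

print(round(variance, 4))  # 1.0675
```
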
Now that’s great.
So we’ve got the variance, but
the problem with the variance is that it’s not in units that
really make any sense, cuz it’s in units of distance squared.
So that’s why we wanna talk about the standard deviation.
Now, the standard deviation is just the square root
of the variance.
Okay, so I put it here.
Standard deviation, it’s also written sigma,
it’s just the square root of the variance.
And that is in units that make sense.
So it’s back in dollars again, not dollars squared.
Okay, so the value here is 1.033 and you can
actually measure that along the horizontal axis here, and it
makes sense because it’s in the same units as the outcomes are.
So if these are in dollars,
the standard deviation is in dollars.
And you can see that by moving away from the mean here,
the mean’s 2.45, which is sort of right there.
And you can see what one standard deviation will get you.
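
A short sketch of the last step, using the variance and mean quoted in the transcript. The interval of one standard deviation on either side of the mean is added here only for illustration.

```python
import math

variance = 1.0675  # value quoted in the transcript
mean = 2.45        # value quoted in the transcript

# Standard deviation: the square root of the variance,
# which puts it back in the same units as the outcomes.
std_dev = math.sqrt(variance)

# One standard deviation on either side of the mean (illustrative).
one_sd_interval = (mean - std_dev, mean + std_dev)

print(round(std_dev, 3))                           # 1.033
print(tuple(round(v, 2) for v in one_sd_interval))  # (1.42, 3.48)
```
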
Data Science Essentials & Machine Learning