We say that $(X, Y)$ is a random vector if $(X, Y)$ is a function valued on $\mathbb{R}^2$; it can also be called a two-dimensional random variable.
We will generalize the concept of CDF/PMF/PDF to random vectors:

- Joint CDF/PMF/PDF
- Marginal CDF/PMF/PDF
- Conditional PMF/PDF
Definition: Joint Cumulative Distribution Function
For a random vector $(X, Y)$, either discrete or continuous, its joint cumulative distribution function is defined as

$$F(x, y) = P(X \le x, Y \le y), \quad (x, y) \in \mathbb{R}^2.$$
Tip
Treat $\{X \le x\}$ as an event $A$, and $\{Y \le y\}$ as an event $B$; then $F(x, y) = P(AB)$. It follows that for all $x$ and $y$, we have

$$F(-\infty, y) = F(x, -\infty) = 0 \quad \text{and} \quad F(+\infty, +\infty) = 1.$$
We can use the joint CDF to compute $P(x_1 < X \le x_2,\ y_1 < Y \le y_2)$ for any $x_1 < x_2$ and $y_1 < y_2$:

$$P(x_1 < X \le x_2,\ y_1 < Y \le y_2) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1).$$
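This inclusion-exclusion identity is easy to verify numerically. Below is a minimal Python sketch (an addition to the notes, assuming numpy is available) using two independent $U(0,1)$ variables, for which $F(x, y) = xy$:

```python
import numpy as np

rng = np.random.default_rng(0)
xy = rng.uniform(size=(100_000, 2))  # independent U(0,1) pairs

def F(x, y):
    """Joint CDF of two independent U(0,1) variables: F(x, y) = x * y."""
    return np.clip(x, 0, 1) * np.clip(y, 0, 1)

x1, x2, y1, y2 = 0.2, 0.7, 0.1, 0.5
# Inclusion-exclusion identity for P(x1 < X <= x2, y1 < Y <= y2)
p_cdf = F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)
p_mc = np.mean((xy[:, 0] > x1) & (xy[:, 0] <= x2) &
               (xy[:, 1] > y1) & (xy[:, 1] <= y2))
print(p_cdf, p_mc)  # both close to 0.5 * 0.4 = 0.2
```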
With the joint CDF defined, it is straightforward to define the
marginal CDF:
Definition: Marginal Cumulative Distribution Function
Let $F(x, y)$ be the joint CDF of $(X, Y)$, $F_X(x)$ be the CDF of $X$ without considering $Y$, and $F_Y(y)$ be the CDF of $Y$ without considering $X$. Then for all $x$ and $y$,

$$F_X(x) = F(x, +\infty) = \lim_{y \to +\infty} F(x, y), \qquad F_Y(y) = F(+\infty, y) = \lim_{x \to +\infty} F(x, y).$$

$F_X$ and $F_Y$ are called the marginal cumulative distribution functions of $X$ and $Y$, respectively.
Example
Suppose that the joint CDF of $(X, Y)$ is given by a formula involving an unknown constant. Determine the value of the constant, and obtain the marginal CDFs of $X$ and $Y$.

Solution. By the basic properties of the joint CDF ($F(+\infty, +\infty) = 1$ and $F(x, -\infty) = F(-\infty, y) = 0$), we obtain equations for the constant. Solving these gives its value. The marginal CDFs then follow from $F_X(x) = F(x, +\infty)$ and $F_Y(y) = F(+\infty, y)$.
Definition: Joint and Marginal Probability Mass Function
For a discrete random vector $(X, Y)$, let $\mathcal{X}$ and $\mathcal{Y}$ be the supports of $X$ and $Y$, respectively. Then the joint probability mass function of $(X, Y)$ is defined as

$$p(x, y) = P(X = x, Y = y), \quad x \in \mathcal{X},\ y \in \mathcal{Y}.$$

The marginal probability mass function of $X$ is the PMF of $X$ without considering $Y$:

$$p_X(x) = P(X = x) = \sum_{y \in \mathcal{Y}} p(x, y).$$

Similarly, for $Y$ we have

$$p_Y(y) = P(Y = y) = \sum_{x \in \mathcal{X}} p(x, y).$$

The joint PMF satisfies:

- Non-negativity: $p(x, y) \ge 0$.
- Normalization: $\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) = 1$.
Example
Two dice are tossed independently. Let $X$ be the smaller number of points and $Y$ be the larger number of points. If both dice show the same number, say $k$ points, then $X = Y = k$.

1. Find the joint PMF of $(X, Y)$.
2. Find the marginal PMF of $X$.

Solution. Each of the $36$ ordered outcomes has probability $1/36$. For $i = j$, the event $\{X = i, Y = i\}$ corresponds to the single outcome $(i, i)$; for $i < j$, the event $\{X = i, Y = j\}$ corresponds to the two ordered outcomes $(i, j)$ and $(j, i)$. Hence

$$p(i, j) = \begin{cases} 1/36, & 1 \le i = j \le 6, \\ 2/36, & 1 \le i < j \le 6, \\ 0, & \text{otherwise}. \end{cases}$$

The marginal PMF of $X$ is

$$p_X(i) = \sum_{j=i}^{6} p(i, j) = \frac{1}{36} + \frac{2(6 - i)}{36} = \frac{13 - 2i}{36}, \quad i = 1, \dots, 6.$$
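As a sanity check on the derived PMFs, here is a small simulation sketch (an addition to the notes, assuming numpy; the pairs chosen for printing are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.integers(1, 7, size=(200_000, 2))   # two independent fair dice
x, y = d.min(axis=1), d.max(axis=1)         # X = smaller, Y = larger

# Empirical joint PMF vs derived values: 1/36 on the diagonal, 2/36 above it
for i, j in [(1, 1), (2, 5), (3, 6)]:
    print(i, j, np.mean((x == i) & (y == j)))   # ~0.0278, ~0.0556, ~0.0556

# Empirical marginal of X vs (13 - 2i)/36
for i in range(1, 7):
    print(i, np.mean(x == i), (13 - 2 * i) / 36)
```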
Definition: Joint Probability Density Function
$(X, Y)$ is said to be a continuous random vector if there exists a non-negative function $f(x, y)$, defined for all $(x, y) \in \mathbb{R}^2$, satisfying that for any region $D \subseteq \mathbb{R}^2$,

$$P\big((X, Y) \in D\big) = \iint_D f(x, y)\, dx\, dy.$$

$f(x, y)$ is called the joint probability density function of $(X, Y)$. Particularly, we have the joint CDF of $(X, Y)$ to be

$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du.$$

It follows that

$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\, \partial y}$$

wherever the partial derivative exists.

Similar to the PDF of a single r.v., the joint PDF is not itself a probability. Instead, it reflects the degree to which the probability is concentrated around $(x, y)$. The joint PDF is typically visualized with a surface plot or a contour plot, which help in intuitively understanding the distribution and the relationship between $X$ and $Y$.
Definition: Marginal Probability Density Function
Suppose $(X, Y)$ is a continuous random vector with joint PDF $f(x, y)$. Then the marginal probability density function of $X$, i.e., the PDF of $X$ without considering $Y$, is

$$f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\, dy.$$

Likewise, the marginal PDF of $Y$ is

$$f_Y(y) = \int_{-\infty}^{+\infty} f(x, y)\, dx.$$
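The marginalization integral can be checked numerically. The sketch below (an addition to the notes, assuming numpy/scipy; the correlation $0.5$ is an arbitrary test value) integrates a bivariate normal joint PDF over $y$ and compares against the known $N(0, 1)$ marginal:

```python
import numpy as np
from scipy import integrate, stats

rho = 0.5
f = stats.multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]]).pdf

def f_X(x):
    """Marginal PDF of X: integrate the joint PDF over y."""
    val, _ = integrate.quad(lambda y: f([x, y]), -np.inf, np.inf)
    return val

for x in [-1.0, 0.0, 2.0]:
    print(f_X(x), stats.norm.pdf(x))  # the two columns agree: marginal is N(0, 1)
```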
Caution
A joint PDF uniquely determines the marginal PDFs; however, the reverse is not true.
Note
Suppose that the joint PDF of a random vector $(X, Y)$ is given by a formula involving an unknown constant on some region, and $0$ elsewhere. Try to determine the constant and find the marginal PDF of $X$.

Solution. By the normalization property $\iint_{\mathbb{R}^2} f(x, y)\, dx\, dy = 1$, the value of the constant follows. The marginal PDF of $X$ is $f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\, dy$ if $x$ is in the support of $X$, otherwise $f_X(x) = 0$.
Definition: Conditional Probability Mass/Density Function
For a discrete random vector $(X, Y)$ with joint PMF $p(x, y)$, the conditional probability mass function of $X$ given $Y = y$ is defined as

$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{p(x, y)}{p_Y(y)}$$

for all values of $y$ s.t. $p_Y(y) > 0$.

For a continuous random vector $(X, Y)$ with joint PDF $f(x, y)$, the conditional probability density function of $X$ given $Y = y$ is defined as

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$$

for all values of $y$ s.t. $f_Y(y) > 0$.
The conditional PMF/PDF also satisfy the non-negativity and
normalization properties.
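For the discrete case, conditioning amounts to renormalizing a column (or row) of the joint PMF table. A minimal sketch (an addition to the notes, assuming numpy; the table values are made up for illustration):

```python
import numpy as np

# A small joint PMF table: p[i, j] = P(X = i, Y = j), entries sum to 1
p = np.array([[0.10, 0.20, 0.10],
              [0.05, 0.25, 0.30]])

p_Y = p.sum(axis=0)            # marginal PMF of Y (sum over x)
p_X_given_Y = p / p_Y          # column j holds P(X = x | Y = j)

print(p_X_given_Y.sum(axis=0)) # each column sums to 1 (normalization property)
```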
Tip
Conditioning on $\{Y = y\}$ can be understood as conditioning on $\{y < Y \le y + \varepsilon\}$ where $\varepsilon \to 0$. Consider the conditional CDF:

$$P(X \le x \mid y < Y \le y + \varepsilon) = \frac{P(X \le x,\ y < Y \le y + \varepsilon)}{P(y < Y \le y + \varepsilon)} \approx \frac{\int_{-\infty}^{x} f(u, y)\,\varepsilon\, du}{f_Y(y)\,\varepsilon} = \int_{-\infty}^{x} \boxed{\frac{f(u, y)}{f_Y(y)}}\, du.$$

The part in the box is exactly the conditional PDF. By the definition of conditional PDF:

$$f(x, y) = f_{X|Y}(x \mid y)\, f_Y(y).$$

Taking the integration w.r.t. $y$ on both sides, the marginal distribution of $X$ can be expressed as

$$f_X(x) = \int_{-\infty}^{+\infty} f_{X|Y}(x \mid y)\, f_Y(y)\, dy.$$
Relationship Between Two Random Variables
Definition: Independence of Random Variables
Let $F(x, y)$ be the joint CDF of $(X, Y)$, and $F_X(x)$, $F_Y(y)$ be the marginal CDFs of $X$ and $Y$. If for all $(x, y)$ we have

$$F(x, y) = F_X(x)\, F_Y(y),$$

then we say that r.v.s $X$ and $Y$ are mutually independent.

For discrete r.v.s, if they are independent, then the PMF satisfies

$$p(x, y) = p_X(x)\, p_Y(y).$$

For continuous r.v.s, if they are independent, then the PDF satisfies

$$f(x, y) = f_X(x)\, f_Y(y).$$
Example
The PDF of a standard normal random variable is $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. We already know that $\int_{-\infty}^{+\infty} e^{-x^2/2}\, dx = \sqrt{2\pi}$; how is this value obtained?

Solution. Let r.v.s $X \sim N(0, 1)$, $Y \sim N(0, 1)$, and $X$ and $Y$ be independent. Then the joint PDF of $X$ and $Y$ is

$$f(x, y) = \varphi(x)\,\varphi(y) = \frac{1}{2\pi} e^{-(x^2 + y^2)/2}.$$

Since the joint PDF must integrate to $1$, we have

$$\iint_{\mathbb{R}^2} \frac{1}{2\pi} e^{-(x^2 + y^2)/2}\, dx\, dy = 1.$$

Surprisingly, this double integral can be evaluated even though the single integral could not. To evaluate it, we convert to polar coordinates using $x = r\cos\theta$, $y = r\sin\theta$, and $dx\, dy = r\, dr\, d\theta$. Then we have

$$\int_0^{2\pi}\!\!\int_0^{+\infty} \frac{1}{2\pi} e^{-r^2/2}\, r\, dr\, d\theta = \int_0^{2\pi} \frac{1}{2\pi}\, d\theta \cdot \left[-e^{-r^2/2}\right]_0^{+\infty} = 1,$$

which confirms that $\int_{-\infty}^{+\infty} e^{-x^2/2}\, dx = \sqrt{2\pi}$.
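The value $\sqrt{2\pi}$ can also be confirmed by direct numerical integration; a one-line sketch (an addition to the notes, assuming scipy):

```python
import numpy as np
from scipy import integrate

val, _ = integrate.quad(lambda x: np.exp(-x**2 / 2), -np.inf, np.inf)
print(val, np.sqrt(2 * np.pi))  # both ~2.5066
```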
Example
Suppose that the number of people who enter a shopping mall on a randomly selected weekday follows a Poisson distribution with parameter $\lambda$. Each person who enters the shopping mall is a male with probability $0.2$ and a female with probability $0.8$. Show that the numbers of males and females entering the shopping mall are independent Poisson random variables with parameters $0.2\lambda$ and $0.8\lambda$, respectively.

Solution. Let $N$ be the total number of people that enter the shopping mall; then $N \sim \text{Poisson}(\lambda)$, i.e.

$$P(N = n) = e^{-\lambda} \frac{\lambda^n}{n!}, \quad n = 0, 1, 2, \dots$$

Let $X$ and $Y$ be the numbers of males and females, so $X + Y = N$. Given that $N = n$, it follows that $X \mid N = n \sim B(n, 0.2)$, so that

$$P(X = i, Y = j \mid N = i + j) = \binom{i + j}{i}\, 0.2^i\, 0.8^j.$$

Therefore,

$$P(X = i, Y = j) = \binom{i + j}{i}\, 0.2^i\, 0.8^j \cdot e^{-\lambda} \frac{\lambda^{i + j}}{(i + j)!} = e^{-0.2\lambda} \frac{(0.2\lambda)^i}{i!} \cdot e^{-0.8\lambda} \frac{(0.8\lambda)^j}{j!}.$$

With the joint PMF, it is not difficult to determine the marginal PMFs, i.e., $X \sim \text{Poisson}(0.2\lambda)$ and $Y \sim \text{Poisson}(0.8\lambda)$. Since the joint PMF factors into the product of the two marginal PMFs, the independence is proved.
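This "Poisson thinning" result is easy to see in simulation. A sketch (an addition to the notes, assuming numpy; $\lambda = 10$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 10.0
n = rng.poisson(lam, size=200_000)        # total visitors per day
males = rng.binomial(n, 0.2)              # each visitor is male w.p. 0.2
females = n - males

print(males.mean(), males.var())          # both ~0.2 * lam = 2  (Poisson)
print(females.mean(), females.var())      # both ~0.8 * lam = 8  (Poisson)
print(np.corrcoef(males, females)[0, 1])  # ~0: consistent with independence
```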
If $X$ and $Y$ are independent r.v.s, then for any functions $g$ and $h$, we have

$$E[g(X)\, h(Y)] = E[g(X)]\, E[h(Y)].$$

Proof. W.l.o.g., we show the continuous case. Suppose that $X$ and $Y$ have joint PDF $f(x, y)$; then

$$E[g(X) h(Y)] = \iint g(x) h(y) f(x, y)\, dx\, dy = \int g(x) f_X(x)\, dx \int h(y) f_Y(y)\, dy = E[g(X)]\, E[h(Y)].$$

Let $g(X) = X$ and $h(Y) = Y$; then $E[XY] = E[X]\, E[Y]$ for independent $X$ and $Y$. What if $X$ and $Y$ are not independent? Let us consider the difference between $E[XY]$ and $E[X]E[Y]$:

$$E[XY] - E[X]\, E[Y].$$

So if $E[XY] - E[X]E[Y] \ne 0$, then $X$ and $Y$ cannot be independent. Therefore we can use it as a measure of the relationship between $X$ and $Y$.
Definition: Covariance
The covariance between $X$ and $Y$, denoted by $\text{Cov}(X, Y)$, is defined by

$$\text{Cov}(X, Y) = E\big[(X - E[X])(Y - E[Y])\big] = E[XY] - E[X]\, E[Y].$$

If $X$ and $Y$ are independent, then $\text{Cov}(X, Y) = 0$. However, if $\text{Cov}(X, Y) = 0$, $X$ and $Y$ may not be independent; we can only say that $X$ and $Y$ are uncorrelated.

The covariance has the following properties (several are verified numerically below):

- When $\text{Cov}(X, Y) > 0$, $X$ and $Y$ tend to vary in the same direction, and when $\text{Cov}(X, Y) < 0$ they tend to vary in the opposite direction.
- Covariance-variance relationship: $\text{Cov}(X, X) = \text{Var}(X)$, $\text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y) \pm 2\,\text{Cov}(X, Y)$.
- Symmetry: $\text{Cov}(X, Y) = \text{Cov}(Y, X)$.
- Constants cannot covary: $\text{Cov}(X, c) = 0$ for any constant $c$.
- Pulling out constants: $\text{Cov}(aX, bY) = ab\,\text{Cov}(X, Y)$.
- Distributivity: $\text{Cov}(X + Y, Z) = \text{Cov}(X, Z) + \text{Cov}(Y, Z)$.
- Bilinear property: $\text{Cov}\!\left(\sum_{i} a_i X_i,\ \sum_{j} b_j Y_j\right) = \sum_{i}\sum_{j} a_i b_j\, \text{Cov}(X_i, Y_j)$.
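Since these identities hold exactly for sample moments as well, they can be checked on simulated data. A sketch (an addition to the notes, assuming numpy; the dependence of `y` on `x` is an arbitrary construction):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # correlated with x by construction
z = rng.normal(size=100_000)

cov = lambda a, b: np.mean(a * b) - a.mean() * b.mean()

print(cov(3 * x, 2 * y), 6 * cov(x, y))      # pulling out constants
print(cov(x + y, z), cov(x, z) + cov(y, z))  # distributivity
print(np.var(x + y),
      np.var(x) + np.var(y) + 2 * cov(x, y)) # covariance-variance relationship
```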
Definition: Correlation Coefficient
The correlation coefficient between $X$ and $Y$, denoted by $\rho_{XY}$ or $\text{Corr}(X, Y)$, is defined by

$$\rho_{XY} = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}}.$$

$\rho_{XY}$ is a normalized version of the covariance, and is a dimensionless quantity.
Definition: Conditional Expectation
The conditional expectation of a random variable $Y$ given another random variable $X$ is the expected value of $Y$ when the value of $X$ is known, denoted as $E[Y \mid X = x]$. For a discrete random vector $(X, Y)$, the conditional expectation of $Y$ given $X = x$ is

$$E[Y \mid X = x] = \sum_{y} y\, p_{Y|X}(y \mid x).$$

For a continuous random vector $(X, Y)$, the conditional expectation of $Y$ given $X = x$ is

$$E[Y \mid X = x] = \int_{-\infty}^{+\infty} y\, f_{Y|X}(y \mid x)\, dy,$$

where $f_{Y|X}(y \mid x)$ is the conditional PDF of $Y$ given $X = x$.

$E[Y \mid X = x]$ is typically a function of $x$, which can be denoted as $g(x)$. Then $g(X) = E[Y \mid X]$ is a function of $X$, which is a random variable. In the special case where $X$ and $Y$ are independent, $E[Y \mid X] = E[Y]$ is a constant.
Definition: Law of Total Expectation
For a random vector $(X, Y)$, if $E[Y]$ exists, then

$$E[Y] = E\big[E[Y \mid X]\big].$$

Specifically, for a discrete random vector $(X, Y)$,

$$E[Y] = \sum_{x} E[Y \mid X = x]\, p_X(x).$$

For a continuous random vector $(X, Y)$,

$$E[Y] = \int_{-\infty}^{+\infty} E[Y \mid X = x]\, f_X(x)\, dx.$$
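A quick simulation check of the law of total expectation (an addition to the notes, assuming numpy; the model $Y \mid X = x \sim N(2x, 1)$ is an arbitrary example, for which $E[E[Y \mid X]] = 2\,E[X]$):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(5.0, size=200_000)
y = rng.normal(loc=2.0 * x, scale=1.0)  # Y | X = x ~ N(2x, 1), so E[Y | X] = 2X

print(y.mean())          # direct estimate of E[Y]
print((2.0 * x).mean())  # E[E[Y | X]] -- matches, ~10
```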
Function of Multiple Random Variables
First we consider the continuous case. The most general solution is to derive the CDF of $Z = g(X, Y)$ starting from the definition of CDF:

$$F_Z(z) = P\big(g(X, Y) \le z\big) = \iint_{g(x, y) \le z} f(x, y)\, dx\, dy.$$
Continuous Convolution Formula
Let $f(x, y)$ be the PDF of random vector $(X, Y)$, and $f_X$, $f_Y$ be the marginal PDFs of $X$ and $Y$, respectively. Then, the PDF of $Z = X + Y$ is

$$f_Z(z) = \int_{-\infty}^{+\infty} f(x, z - x)\, dx = \int_{-\infty}^{+\infty} f(z - y, y)\, dy.$$

Specifically, if $X$ and $Y$ are independent, then

$$f_Z(z) = \int_{-\infty}^{+\infty} f_X(x)\, f_Y(z - x)\, dx = \int_{-\infty}^{+\infty} f_X(z - y)\, f_Y(y)\, dy,$$

where the two integrals are called the convolution of $f_X$ and $f_Y$, denoted as $f_X * f_Y$.

Proof.

$$F_Z(z) = P(X + Y \le z) = \int_{-\infty}^{+\infty} \int_{-\infty}^{z - x} f(x, y)\, dy\, dx \overset{u = x + y}{=} \int_{-\infty}^{z} \int_{-\infty}^{+\infty} f(x, u - x)\, dx\, du,$$

so differentiating w.r.t. $z$ gives $f_Z(z) = \int_{-\infty}^{+\infty} f(x, z - x)\, dx$. Similarly, we can show the other form.
Question
Let $X$ and $Y$ be independent standard normal random variables, and let $Z = X + Y$. Try to determine which distribution $Z$ follows.

Solution. Since $X$ and $Y$ are independent, by the convolution formula,

$$f_Z(z) = \int_{-\infty}^{+\infty} \varphi(x)\, \varphi(z - x)\, dx = \frac{1}{2\pi} e^{-z^2/4} \int_{-\infty}^{+\infty} e^{-(x - z/2)^2}\, dx = \frac{1}{2\sqrt{\pi}}\, e^{-z^2/4}.$$

This suggests that $Z \sim N(0, 2)$.
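The convolution integral can also be evaluated numerically and compared with the $N(0, 2)$ density. A sketch (an addition to the notes, assuming scipy):

```python
import numpy as np
from scipy import integrate, stats

def f_Z(z):
    """Convolution of two standard normal PDFs, evaluated numerically."""
    val, _ = integrate.quad(lambda x: stats.norm.pdf(x) * stats.norm.pdf(z - x),
                            -np.inf, np.inf)
    return val

for z in [0.0, 1.0, 2.5]:
    print(f_Z(z), stats.norm.pdf(z, loc=0, scale=np.sqrt(2)))  # agree: Z ~ N(0, 2)
```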
General Results about the Sum of Independent Normal Random Variables

Let $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ be two independent random variables. Then

$$X + Y \sim N(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2).$$

More generally, if random variables $X_1, \dots, X_n$ are independent and $X_i \sim N(\mu_i, \sigma_i^2)$, then

$$\sum_{i=1}^{n} a_i X_i \sim N\!\left(\sum_{i=1}^{n} a_i \mu_i,\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$

In summary, a linear combination of independent normal random variables still follows a normal distribution.
The discrete case is similar.
Discrete Convolution Formula
Let $X$ and $Y$ be two discrete random variables; for simplicity, assume that the supports of $X$ and $Y$ are both $\{0, 1, 2, \dots\}$. Then the PMF of $Z = X + Y$ is ($k = 0, 1, 2, \dots$):

$$P(Z = k) = \sum_{i=0}^{k} P(X = i,\ Y = k - i).$$

Specifically, if $X$ and $Y$ are independent, then

$$P(Z = k) = \sum_{i=0}^{k} P(X = i)\, P(Y = k - i),$$

which is the discrete convolution formula.
Tip
For example, if $X \sim \text{Poisson}(\lambda_1)$ and $Y \sim \text{Poisson}(\lambda_2)$ are independent, then $X + Y \sim \text{Poisson}(\lambda_1 + \lambda_2)$ can be shown accordingly.
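The discrete convolution formula maps directly onto `np.convolve`. The sketch below (an addition to the notes, assuming numpy/scipy; $\lambda_1 = 3$, $\lambda_2 = 5$ are arbitrary test values) verifies the Poisson additivity claim numerically:

```python
import numpy as np
from scipy import stats

k = np.arange(60)
p1 = stats.poisson.pmf(k, 3.0)  # PMF of X ~ Poisson(3)
p2 = stats.poisson.pmf(k, 5.0)  # PMF of Y ~ Poisson(5)

pmf_sum = np.convolve(p1, p2)[:60]  # discrete convolution of the two PMFs
print(np.max(np.abs(pmf_sum - stats.poisson.pmf(k, 8.0))))  # ~0: sum is Poisson(8)
```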
For a sum of random variables $X_1 + X_2 + \cdots + X_n$, each time we add one more random variable, we have to calculate a convolution, and the distribution of the sum will eventually converge to a normal distribution. This is known as the Central Limit Theorem:
Central Limit Theorem (For i.i.d. Random Variables)
Let $X_1, X_2, \dots$ be a sequence of i.i.d. (independent and identically distributed) random variables with $E[X_i] = \mu$ and $\text{Var}(X_i) = \sigma^2 < \infty$. Let $S_n = \sum_{i=1}^{n} X_i$, and consider the standardized version of $S_n$:

$$Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$$

As $n \to \infty$, $Z_n$ converges in distribution to a standard normal random variable, that is:

$$\lim_{n \to \infty} P(Z_n \le z) = \Phi(z)$$

for all $z \in \mathbb{R}$.
Tip
The CLT does not require the $X_i$ to follow any specific distribution beyond being i.i.d. with finite mean and variance. The CLT explains why many measures in reality are normally distributed: they are typically the combined effect of multiple factors.

The rule of thumb is that if $n \ge 30$, we can use the standard normal distribution as the distribution of $Z_n$.
Example
A disk has free space of 330 megabytes. Is it likely to be sufficient for 300 independent images, if each image has an expected size of 1 Mb with a standard deviation of 0.5 Mb?

Solution. We have $n = 300$, $\mu = 1$, and $\sigma = 0.5$. As $n$ is large, the CLT applies to the total size $S_n = \sum_{i=1}^{300} X_i$. Therefore, the probability of sufficient space is

$$P(S_n \le 330) = P\!\left(\frac{S_n - n\mu}{\sigma\sqrt{n}} \le \frac{330 - 300}{0.5\sqrt{300}}\right) \approx \Phi(3.46) \approx 0.9997.$$
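The same computation in code (an addition to the notes, assuming scipy):

```python
from math import sqrt
from scipy import stats

n, mu, sigma = 300, 1.0, 0.5
z = (330 - n * mu) / (sigma * sqrt(n))  # standardize the total size
print(z, stats.norm.cdf(z))             # z ~ 3.46, probability ~ 0.9997
```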
The binomial variable $X \sim B(n, p)$ represents a special case of $S_n = X_1 + \cdots + X_n$, where $X_i \overset{\text{i.i.d.}}{\sim} \text{Bernoulli}(p)$. In this case, the exact distribution of $S_n$ is $B(n, p)$, and the CLT gives the approximation

$$\frac{X - np}{\sqrt{np(1 - p)}} \overset{\text{approx.}}{\sim} N(0, 1), \quad \text{i.e.,} \quad X \overset{\text{approx.}}{\sim} N\!\big(np,\ np(1 - p)\big).$$

This is called the normal approximation to the binomial distribution.
Poisson Approximation to Binomial Distribution
$B(n, p)$ can be approximated by $\text{Poisson}(np)$. This approximation works well when $n$ is large and $p$ is small, e.g. $n \ge 100$ and $np \le 10$.
Normal Approximation to Binomial Distribution
$B(n, p)$ can be approximated by $N\big(np,\ np(1 - p)\big)$. This approximation works well when $np \ge 10$ and $n(1 - p) \ge 10$.
Tip
When $p$ is small, the Poisson approximation is better, while the normal approximation is better for moderate $p$, i.e., when both $np$ and $n(1 - p)$ are large.
If $X \sim B(n, p)$ and we want to calculate $P(X = k)$ with the normal approximation, we need to apply a continuity correction. This correction is needed when we approximate a discrete distribution by a continuous one. The essential reason is that $P(X = k)$ may be positive when $X$ is discrete, but it always equals $0$ for a continuous $X$. The continuity correction is to expand the interval by $0.5$ in each direction:

$$P(X = k) = P(k - 0.5 < X < k + 0.5) \approx P(k - 0.5 < X' < k + 0.5), \quad \text{where } X' \sim N\!\big(np,\ np(1 - p)\big).$$
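The following sketch (an addition to the notes, assuming scipy) shows how well the corrected normal approximation matches an exact binomial probability, here for $P(X = 50)$ with $n = 100$, $p = 0.5$:

```python
import numpy as np
from scipy import stats

n, p, k = 100, 0.5, 50
exact = stats.binom.pmf(k, n, p)

loc, scale = n * p, np.sqrt(n * p * (1 - p))
# Continuity correction: P(X = k) ~ P(k - 0.5 < X' < k + 0.5), X' ~ N(np, np(1-p))
approx = stats.norm.cdf(k + 0.5, loc, scale) - stats.norm.cdf(k - 0.5, loc, scale)
print(exact, approx)  # ~0.0796 vs ~0.0797: the correction works well
```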
The CDF of $\max(X, Y)$ and $\min(X, Y)$ (Continuous)

Let $f(x, y)$ be the PDF of random vector $(X, Y)$, with joint CDF $F(x, y)$. Then, the CDFs of $\max(X, Y)$ and $\min(X, Y)$ are

$$F_{\max}(z) = P(X \le z,\ Y \le z) = F(z, z),$$

$$F_{\min}(z) = P(X \le z) + P(Y \le z) - P(X \le z,\ Y \le z) = F_X(z) + F_Y(z) - F(z, z).$$

Let $F_X$ and $F_Y$ be the marginal CDFs of $X$ and $Y$; then if $X$ and $Y$ are independent, we have

$$F_{\max}(z) = F_X(z)\, F_Y(z), \qquad F_{\min}(z) = 1 - \big(1 - F_X(z)\big)\big(1 - F_Y(z)\big).$$

Specifically, if $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F$ with PDF $f$, then the CDFs and PDFs of $\max_i X_i$ and $\min_i X_i$ are

$$F_{\max}(z) = F(z)^n, \qquad f_{\max}(z) = n\, F(z)^{n-1} f(z),$$

$$F_{\min}(z) = 1 - \big(1 - F(z)\big)^n, \qquad f_{\min}(z) = n\, \big(1 - F(z)\big)^{n-1} f(z).$$

Proof. When $X$ and $Y$ are independent,

$$F_{\max}(z) = P\big(\max(X, Y) \le z\big) = P(X \le z,\ Y \le z) = P(X \le z)\, P(Y \le z) = F_X(z)\, F_Y(z),$$

and likewise for the minimum via $P\big(\min(X, Y) > z\big) = P(X > z)\, P(Y > z)$.
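These formulas are easy to validate by simulation. A sketch (an addition to the notes, assuming numpy) using $n = 5$ i.i.d. $U(0, 1)$ variables, for which $F(z) = z$ on $[0, 1]$:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=(200_000, 5))  # 5 i.i.d. U(0,1) per row
z = 0.6

print(np.mean(u.max(axis=1) <= z), z**5)            # F_max(z) = F(z)^n
print(np.mean(u.min(axis=1) <= z), 1 - (1 - z)**5)  # F_min(z) = 1 - (1 - F(z))^n
```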
Multivariate Normal Distribution
Definition: Bivariate Normal Distribution
Random vector $(X, Y)$ is said to be bivariate normally distributed with means $\mu_1, \mu_2$ and variances $\sigma_1^2, \sigma_2^2$, and with correlation coefficient $\rho$, if the joint PDF of $(X, Y)$ is given by

$$f(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\!\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_1)^2}{\sigma_1^2} - \frac{2\rho (x - \mu_1)(y - \mu_2)}{\sigma_1 \sigma_2} + \frac{(y - \mu_2)^2}{\sigma_2^2} \right] \right\}.$$

It can also be expressed as

$$f(\mathbf{x}) = \frac{1}{2\pi |\boldsymbol{\Sigma}|^{1/2}} \exp\!\left\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\},$$

where $\boldsymbol{\mu} = (\mu_1, \mu_2)^\top$ is the mean vector and

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}$$

is the (variance-)covariance matrix. Specifically, if $\mu_1 = \mu_2 = 0$, $\sigma_1^2 = \sigma_2^2 = 1$, and $\rho = 0$, then it is said to be a standard bivariate normal distribution, i.e. $N(\mathbf{0}, \mathbf{I}_2)$.

For an $n$-dimensional random vector $\mathbf{X} = (X_1, \dots, X_n)^\top$, the mean vector is

$$E[\mathbf{X}] = \big(E[X_1], \dots, E[X_n]\big)^\top.$$

The covariance matrix is

$$\boldsymbol{\Sigma} = \big[\text{Cov}(X_i, X_j)\big]_{n \times n},$$

e.g. for a $2$-dimensional $\mathbf{X} = (X_1, X_2)^\top$:

$$\boldsymbol{\Sigma} = \begin{pmatrix} \text{Var}(X_1) & \text{Cov}(X_1, X_2) \\ \text{Cov}(X_2, X_1) & \text{Var}(X_2) \end{pmatrix}.$$
Common properties of bivariate normal distribution
If $(X, Y) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ where $\boldsymbol{\mu} = (\mu_1, \mu_2)^\top$, then $X \sim N(\mu_1, \sigma_1^2)$, $Y \sim N(\mu_2, \sigma_2^2)$, and $\text{Corr}(X, Y) = \rho$.

Generally, if random variables $X$ and $Y$ are uncorrelated, we do not necessarily have that $X$ and $Y$ are independent. However, uncorrelatedness does imply independence if $X$ and $Y$ jointly follow a bivariate normal distribution.
If $(X, Y) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with correlation coefficient $\rho$, then $X + Y \sim N\big(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2 + 2\rho\sigma_1\sigma_2\big)$, $X - Y \sim N\big(\mu_1 - \mu_2,\ \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2\big)$.
If $(X, Y) \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then for any constants $a, b$ not both zero, $aX + bY \sim N\big(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2 + 2ab\rho\sigma_1\sigma_2\big)$.

More generally, for any real matrix $\mathbf{A}$ of appropriate dimensions, we have $\mathbf{A}\mathbf{X} \sim N\big(\mathbf{A}\boldsymbol{\mu},\ \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}^\top\big)$.
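This closure under linear maps can be checked empirically. A sketch (an addition to the notes, assuming numpy; $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$, and $\mathbf{A}$ are arbitrary test values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
X = rng.multivariate_normal(mu, Sigma, size=200_000)

A = np.array([[1.0, 1.0],
              [2.0, -1.0]])
Y = X @ A.T                    # rows are A @ x for each sample x

print(Y.mean(axis=0), A @ mu)          # empirical mean vs A mu
print(np.cov(Y.T), A @ Sigma @ A.T)    # empirical covariance vs A Sigma A^T
```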