A population refers to the entire set of individuals,
objects, events, or measurements that share a common characteristic and
are of interest in a particular study. Any numerical characteristic of a
population is a parameter. A sample is a subset of the
population, which is used to make inferences about the population.
When the form of the population distribution is known but contains
unknown parameters, inference about these parameters is referred to as
parametric statistical inference.
Generally, we are interested in some quantitative index $X$ of the population. The value of $X$ for each individual in the population may be different, and these values collectively form a probability distribution. So the population is usually expressed as a random variable $X$ following a specific distribution, i.e., $X \sim F(x)$ or $X \sim f(x)$. In statistics, the distribution of $X$ is often unknown, but we can typically assume a specific form based on context, leaving only certain parameters of the distribution unknown. So the population can be expressed as $X \sim F(x; \theta_1, \ldots, \theta_k)$ or $X \sim f(x; \theta_1, \ldots, \theta_k)$.
If $n$ individuals are selected from the population, their quantitative indices are recorded as $X_1, X_2, \ldots, X_n$, which is called a sample of size $n$, and $n$ is the sample size. The process of selecting individuals from the population is called sampling. The goal is to make inference about the whole population based on a sample, and this requires special attention during sampling to ensure that the samples drawn are sufficiently representative.
The simplest method to ensure representativeness is simple random
sampling. As its name suggests, the core of simple random sampling
is to select samples in a completely random manner, ensuring that every
individual in the population has an equal probability of being
chosen. Under this sampling method, $X_1, X_2, \ldots, X_n$ are independent and follow the same distribution as the population $X$. This is a fundamental assumption of many statistical methods.
Before sampling, we have no idea who will be sampled; after sampling, the observed values $x_1, x_2, \ldots, x_n$ of $X_1, X_2, \ldots, X_n$ are obtained, and those values are called the sample observed values.
Definition: Statistic
If $X_1, X_2, \ldots, X_n$ is a sample from population $X$ and $g(x_1, x_2, \ldots, x_n)$ is an $n$-ary function, define the random variable
$$T = g(X_1, X_2, \ldots, X_n);$$
then $T$ is called a statistic if $g$ does not involve any unknown parameter.
Common Statistics
Sample Mean
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Sample Variance and Sample Standard Deviation
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad S = \sqrt{S^2}$$
Order Statistics
$$X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)},$$
where $X_{(1)} = \min\{X_1, \ldots, X_n\}$ and $X_{(n)} = \max\{X_1, \ldots, X_n\}$.
Sample $p$-quantile
A number $m_p$ that exceeds at most $100p\%$ of the sample and is exceeded by at most $100(1-p)\%$ of the sample. Specifically, $m_{0.5}$ is the sample median, $m_{0.25}$ ($m_{0.75}$) is the sample lower (upper) quartile or the sample first (third) quartile, also written as $Q_1$ ($Q_3$). $Q_3 - Q_1$ is called the sample interquartile range, or sample IQR for short.
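For a quick numerical illustration, these common statistics can be computed with NumPy; the sample values below are made up for demonstration.

```python
import numpy as np

# A hypothetical sample of size n = 8 (made-up values, for illustration only)
x = np.array([4.2, 3.7, 5.1, 4.8, 3.9, 4.4, 5.3, 4.0])
n = len(x)

x_bar = x.mean()                     # sample mean
s2 = x.var(ddof=1)                   # sample variance (divides by n - 1)
s = np.sqrt(s2)                      # sample standard deviation
order_stats = np.sort(x)             # order statistics X_(1) <= ... <= X_(n)
q1, med, q3 = np.quantile(x, [0.25, 0.5, 0.75])   # sample quartiles and median
iqr = q3 - q1                        # sample interquartile range

print(x_bar, s2, s, order_stats, med, iqr)
```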
Each statistic is a random variable because it is computed from random
data. After sampling, the observed value of a statistic can be
obtained.
The distribution of a statistic is called the sampling
distribution, which is required to construct confidence intervals
and perform hypothesis testing later.
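To make the idea of a sampling distribution concrete, the sketch below repeatedly draws samples of size $n$ from an assumed $N(\mu, \sigma^2)$ population and looks at the resulting values of $\bar{X}$; the parameters are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, n_reps = 10.0, 2.0, 25, 5000   # assumed population parameters

# Each row is one sample of size n; each sample yields one observed value of X-bar
sample_means = rng.normal(mu, sigma, size=(n_reps, n)).mean(axis=1)

# The sampling distribution of X-bar is centered at mu with standard deviation sigma/sqrt(n)
print(sample_means.mean(), sample_means.std(ddof=1), sigma / np.sqrt(n))
```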
Parameter Estimation: Point Estimation
Properties of Point Estimation
Definition: Point Estimation
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$, and $\theta_1, \ldots, \theta_k$ are the unknown parameters. If $\hat{\theta}_i = \hat{\theta}_i(X_1, \ldots, X_n)$ is a statistic used to estimate $\theta_i$, then $\hat{\theta}_i(X_1, \ldots, X_n)$ is called an estimator of $\theta_i$.
Plugging in the sample observed values, $\hat{\theta}_i(x_1, \ldots, x_n)$ is called an estimate or estimated value of $\theta_i$. Both $\hat{\theta}_i(X_1, \ldots, X_n)$ and $\hat{\theta}_i(x_1, \ldots, x_n)$ can be abbreviated as $\hat{\theta}_i$.
Before introducing the methods of point estimation, we first
introduce some properties used to compare multiple point estimators.
Since $\hat{\theta}$ is a statistic, i.e., a random variable, a natural criterion is to see whether it underestimates or overestimates $\theta$ on average.
Definition: Unbiasedness
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$, and $\hat{\theta} = \hat{\theta}(X_1, \ldots, X_n)$ is an estimator of $\theta$. $\Theta$ is the parameter space of $\theta$, i.e., the set of values that $\theta$ can take. If for all $\theta \in \Theta$, we have
$$E(\hat{\theta}) = \theta,$$
then $\hat{\theta}$ is called an unbiased estimator of $\theta$. Otherwise, it is called a biased estimator of $\theta$. $E(\hat{\theta}) - \theta$ is called the bias of the estimator. If the bias is not $0$, but converges to $0$ as $n \to \infty$, then $\hat{\theta}$ is called an asymptotically unbiased estimator.
Tip
No matter what the population distribution is, if the population mean $\mu$ and population variance $\sigma^2$ exist, then $\bar{X}$ and $S^2$ are unbiased estimators of $\mu$ and $\sigma^2$, respectively.
Proof. Since $E(X_i) = \mu$ and $\operatorname{Var}(X_i) = \sigma^2$,
$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu, \qquad \operatorname{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^2}{n}.$$
So
$$E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = E\left[\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right] = n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right) = (n-1)\sigma^2.$$
Hence
$$E(S^2) = \frac{1}{n-1} E\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \sigma^2.$$
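A quick simulation check of this tip (with an arbitrarily chosen normal population): averaging $\bar{X}$ and $S^2$ over many simulated samples should come close to $\mu$ and $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, n_reps = 3.0, 4.0, 10, 20000   # assumed population parameters

samples = rng.normal(mu, np.sqrt(sigma2), size=(n_reps, n))
x_bar = samples.mean(axis=1)             # sample mean of each replicate
s2 = samples.var(axis=1, ddof=1)         # sample variance with the n - 1 denominator

# Averages over replicates approximate E(X-bar) and E(S^2)
print(x_bar.mean(), s2.mean())           # close to 3.0 and 4.0, respectively
```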
Unbiasedness suggests that an estimator fluctuates around the true parameter. Considering the stability of estimation, we would like the magnitude of the fluctuation to be as small as possible.
Definition: Relative Efficiency
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$, and $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators of $\theta$. $\Theta$ is the parameter space. If for all $\theta \in \Theta$, we have
$$\operatorname{Var}(\hat{\theta}_1) \le \operatorname{Var}(\hat{\theta}_2),$$
and
$$\operatorname{Var}(\hat{\theta}_1) < \operatorname{Var}(\hat{\theta}_2)$$
for at least one $\theta \in \Theta$, then $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$.
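For intuition, here is an illustrative comparison under an assumed normal population: both $\bar{X}$ and the single observation $X_1$ are unbiased estimators of $\mu$, but $\bar{X}$ has variance $\sigma^2/n$ while $X_1$ has variance $\sigma^2$, so $\bar{X}$ is more efficient.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, n_reps = 0.0, 1.0, 20, 50000   # assumed population parameters

samples = rng.normal(mu, sigma, size=(n_reps, n))
est1 = samples.mean(axis=1)   # estimator 1: the sample mean
est2 = samples[:, 0]          # estimator 2: just the first observation

# Both are centered at mu, but the sample mean fluctuates far less
print(est1.mean(), est2.mean())             # both close to 0.0
print(est1.var(ddof=1), est2.var(ddof=1))   # about sigma^2/n vs. sigma^2
```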
The last property is about the convergence of an estimator as the sample size $n$ approaches infinity.
Definition: Consistency
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$, and $\hat{\theta}_n = \hat{\theta}(X_1, \ldots, X_n)$ is an estimator of $\theta$. $\Theta$ is the parameter space. If for all $\theta \in \Theta$ and all $\varepsilon > 0$, we have
$$\lim_{n \to \infty} P\big(|\hat{\theta}_n - \theta| \ge \varepsilon\big) = 0,$$
i.e., $\hat{\theta}_n \xrightarrow{P} \theta$ as $n \to \infty$, then $\hat{\theta}_n$ is called a consistent estimator of $\theta$.
Tip
An (asymptotically) unbiased estimator may not be a consistent estimator.
A consistent estimator may not be an (asymptotically) unbiased estimator.
Tip
If $\hat{\theta}_n$ is an asymptotically unbiased estimator of $\theta$ and $\operatorname{Var}(\hat{\theta}_n) \to 0$ as $n \to \infty$, then $\hat{\theta}_n$ is a consistent estimator of $\theta$.
Proof.
Chebyshev's Inequality
Let $Y$ be a random variable whose mean $E(Y)$ and variance $\operatorname{Var}(Y)$ both exist. Then for any $\varepsilon > 0$,
$$P\big(|Y - E(Y)| \ge \varepsilon\big) \le \frac{\operatorname{Var}(Y)}{\varepsilon^2}.$$
Therefore, by Chebyshev's inequality, for any $\varepsilon > 0$,
$$P\big(|\hat{\theta}_n - E(\hat{\theta}_n)| \ge \varepsilon/2\big) \le \frac{4\operatorname{Var}(\hat{\theta}_n)}{\varepsilon^2} \to 0 \quad \text{as } n \to \infty.$$
Since the bias $E(\hat{\theta}_n) - \theta \to 0$, for $n$ large enough $|E(\hat{\theta}_n) - \theta| < \varepsilon/2$, so
$$P\big(|\hat{\theta}_n - \theta| \ge \varepsilon\big) \le P\big(|\hat{\theta}_n - E(\hat{\theta}_n)| \ge \varepsilon/2\big) \to 0.$$
Tip
No matter what the population distribution is, if the population mean $\mu$ and population variance $\sigma^2$ exist, then $\bar{X}$ and $S^2$ are consistent estimators of $\mu$ and $\sigma^2$, respectively.
Proof. By the weak LLN, we have $\bar{X} \xrightarrow{P} \mu$ and $\frac{1}{n}\sum_{i=1}^{n} X_i^2 \xrightarrow{P} E(X^2)$, so
$$S^2 = \frac{n}{n-1}\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2\right) \xrightarrow{P} E(X^2) - \mu^2 = \sigma^2.$$
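A small simulation of this tip (with an arbitrarily chosen normal population): as $n$ grows, the observed $\bar{X}$ and $S^2$ stabilize near $\mu$ and $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma2 = 5.0, 9.0                  # assumed population parameters

# As the sample size grows, x-bar and s^2 settle near mu and sigma^2
for n in [10, 100, 1000, 100000]:
    x = rng.normal(mu, np.sqrt(sigma2), size=n)
    print(n, x.mean(), x.var(ddof=1))
```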
Method of Moments
Definition: Moments
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$. The $k$-th population moment and the $k$-th sample moment are defined as
$$\mu_k = E(X^k), \qquad A_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k.$$
The $k$-th population central moment and the $k$-th sample central moment are defined as
$$\nu_k = E\big[(X - E(X))^k\big], \qquad B_k = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^k.$$
Then we can choose the values of the unknown parameters $\theta_1, \ldots, \theta_m$ s.t. the population moments match the sample moments.
The population moments can be expressed as functions of the parameters $\theta_1, \ldots, \theta_m$, and we set the population moments equal to the sample moments:
$$\mu_k(\theta_1, \ldots, \theta_m) = A_k, \qquad k = 1, 2, \ldots, m.$$
The solution is denoted $\hat{\theta}_1, \ldots, \hat{\theta}_m$. For $j = 1, \ldots, m$, $\hat{\theta}_j$ is called the moment estimator of $\theta_j$.
Tip
No matter what the population distribution is, if the population mean $\mu$ and population variance $\sigma^2$ exist, then the 1st sample moment $A_1 = \bar{X}$ and the 2nd sample central moment $B_2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ are the moment estimators of $\mu$ and $\sigma^2$, respectively.
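As a small worked sketch of the moment method (an illustrative example with an assumed Uniform$(0, \theta)$ population, for which $E(X) = \theta/2$, so matching the first moment gives $\hat{\theta} = 2\bar{X}$):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true = 6.0                                  # assumed true parameter
x = rng.uniform(0, theta_true, size=500)          # simulated sample

# Match E(X) = theta/2 with the 1st sample moment A_1 = x.mean()
theta_mom = 2 * x.mean()

# Moment estimators of mu and sigma^2 for any population: A_1 and B_2
mu_mom = x.mean()
sigma2_mom = x.var(ddof=0)                        # 2nd sample central moment (divides by n)

print(theta_mom, mu_mom, sigma2_mom)
```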
Method of Maximum Likelihood
The central idea of the method of maximum likelihood is to find the
parameter values that maximize the likelihood of observing the given
data.
For a simple random sample $X_1, X_2, \ldots, X_n$ from the population $X$, the sample observed values are $x_1, x_2, \ldots, x_n$. With different values of the parameters, the likelihood of observing $x_1, x_2, \ldots, x_n$ is different. We would estimate the parameters with the values that maximize the likelihood of observing $x_1, x_2, \ldots, x_n$.
Definition: Likelihood Function and Maximum Likelihood Estimator
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$ and the sample observed values are $x_1, x_2, \ldots, x_n$. $\Theta$ is the parameter space. The likelihood function
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta), \qquad \theta \in \Theta,$$
is a function of $\theta$, measuring the likelihood of observing $x_1, x_2, \ldots, x_n$; i.e., it is the joint PMF/PDF of $X_1, X_2, \ldots, X_n$ evaluated at the observed values. If there exists $\hat{\theta} = \hat{\theta}(x_1, \ldots, x_n)$ s.t.
$$L(\hat{\theta}) = \max_{\theta \in \Theta} L(\theta),$$
then $\hat{\theta}(x_1, \ldots, x_n)$ is the maximum likelihood estimate of $\theta$, and the corresponding estimator $\hat{\theta}(X_1, \ldots, X_n)$ is the maximum likelihood estimator of $\theta$. Maximizing $L(\theta)$ is equivalent to maximizing the log-likelihood function
$$\ell(\theta) = \ln L(\theta) = \sum_{i=1}^{n} \ln f(x_i; \theta).$$
Example
$X_1, X_2, \ldots, X_n$ is a simple random sample from the population $X \sim N(\mu, \sigma^2)$. Derive the maximum likelihood estimators of the unknown parameters $\mu$ and $\sigma^2$.
Solution. The likelihood function is
$$L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right).$$
The log-likelihood function is
$$\ell(\mu, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2.$$
Then
$$\frac{\partial \ell}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0, \qquad \frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2}\sum_{i=1}^{n}(x_i - \mu)^2 = 0.$$
The solution is
$$\hat{\mu} = \bar{x}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$
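As a numerical cross-check of this example (an illustrative sketch using SciPy's general-purpose optimizer on simulated data), maximizing the log-likelihood numerically should reproduce the closed-form solution $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i}(x_i - \bar{x})^2$.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
x = rng.normal(loc=2.0, scale=1.5, size=200)   # simulated data from an assumed N(2, 1.5^2)

def neg_log_lik(params):
    mu, log_sigma = params                     # optimize log(sigma) so that sigma stays positive
    return -np.sum(norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)))

res = minimize(neg_log_lik, x0=[0.0, 0.0])
mu_hat, sigma2_hat = res.x[0], np.exp(2 * res.x[1])

# Compare with the closed-form MLEs: x-bar and the 1/n sample variance
print(mu_hat, x.mean())
print(sigma2_hat, x.var(ddof=0))
```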
Tip
Under mild regularity conditions, the moment estimators and maximum likelihood estimators are consistent and asymptotically unbiased estimators. For large samples, a maximum likelihood estimator has an approximately normal distribution. This property is known as asymptotic normality, and it helps us construct interval estimates for a parameter.
Parameter Estimation: Confidence Interval
Given the sample observed values, a point estimate provides a concrete estimated value of the parameter. However, the accuracy of this estimate is not conveyed by the point estimate itself. To address this issue, we introduce interval estimation.
The general idea of interval estimation is to find two statistics $\hat{\theta}_L = \hat{\theta}_L(X_1, \ldots, X_n)$ and $\hat{\theta}_U = \hat{\theta}_U(X_1, \ldots, X_n)$ with $\hat{\theta}_L < \hat{\theta}_U$, and use the random interval $(\hat{\theta}_L, \hat{\theta}_U)$ to estimate the range of $\theta$.
Since $(\hat{\theta}_L, \hat{\theta}_U)$ is a random interval while $\theta$ is a fixed value, $(\hat{\theta}_L, \hat{\theta}_U)$ may not cover the true value of $\theta$. If the width of the interval is large, then it has a higher probability of covering the true value of $\theta$, but its precision is relatively low.
Therefore, we need to find a balance between the coverage probability and the precision. The confidence interval is a widely used form of interval estimation.
Definition: Confidence Interval
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$. For all $\theta \in \Theta$, if there exist two statistics $\hat{\theta}_L = \hat{\theta}_L(X_1, \ldots, X_n)$ and $\hat{\theta}_U = \hat{\theta}_U(X_1, \ldots, X_n)$ s.t.
$$P\big(\hat{\theta}_L < \theta < \hat{\theta}_U\big) \ge 1 - \alpha,$$
then $(\hat{\theta}_L, \hat{\theta}_U)$ is called a confidence interval of $\theta$ with confidence level $1 - \alpha$, or simply a $1 - \alpha$ confidence interval. $\hat{\theta}_L$ and $\hat{\theta}_U$ are called the confidence lower limit and confidence upper limit, respectively.
Example
$X_1, X_2, \ldots, X_n$ is a simple random sample from $N(\mu, \sigma^2)$ with $\sigma^2$ known; suppose that the value of $\mu$ is unknown. Derive the confidence interval of the unknown parameter $\mu$.
Solution. A natural idea to construct a confidence interval is to start from a point estimator and form an interval by adding and subtracting a quantity around the point estimate. For this example, a good point estimator of $\mu$ is $\bar{X}$: it can be obtained by both the method of moments and the method of maximum likelihood, and it is an unbiased estimator of $\mu$.
By the definition of CI, we would like to determine the value of $\delta$, s.t.
$$P\big(\bar{X} - \delta < \mu < \bar{X} + \delta\big) \ge 1 - \alpha,$$
with an equivalent expression
$$P\big(|\bar{X} - \mu| < \delta\big) \ge 1 - \alpha.$$
Obviously, the value of $\delta$ needs to be determined by referring to the distribution of $\bar{X}$, which is
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right), \quad \text{i.e.,} \quad Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1).$$
So $\delta$ can be determined by
$$P\!\left(|Z| < \frac{\delta}{\sigma/\sqrt{n}}\right) \ge 1 - \alpha,$$
where $Z \sim N(0, 1)$. The $\delta$ satisfying the equation above and with the minimum width of the interval is
$$\delta = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},$$
where $z_{\alpha/2}$ is the upper $\alpha/2$-quantile of the standard normal distribution, whose value can be found in the table. Therefore, the CI of $\mu$ is
$$\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right).$$
With the value of $\alpha$ and the observed values of the sample, the confidence interval can be calculated.
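A short sketch of this calculation (with made-up observations and an assumed known $\sigma$):

```python
import numpy as np
from scipy.stats import norm

x = np.array([10.2, 9.8, 10.5, 10.1, 9.6, 10.3, 9.9, 10.4])  # hypothetical observed values
sigma = 0.5                       # assumed known population standard deviation
alpha = 0.05                      # 95% confidence level

n = len(x)
x_bar = x.mean()
z = norm.ppf(1 - alpha / 2)       # upper alpha/2 quantile of N(0, 1)
delta = z * sigma / np.sqrt(n)

print(x_bar - delta, x_bar + delta)   # CI for mu when sigma is known
```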
Example
Following the above example, derive the CI of the unknown parameter $\mu$ if $\sigma^2$ is unknown.
Solution. Following the same rationale as the case when $\sigma^2$ is known, we still start from $\bar{X}$ and try to find $\delta$ s.t.
$$P\big(|\bar{X} - \mu| < \delta\big) \ge 1 - \alpha,$$
where the unknown $\sigma$ is replaced by $S$, the consistent estimator of $\sigma$. Then the value of $\delta$ can be determined based on the distribution of
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}.$$
Unlike $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, $T$ no longer follows a standard normal distribution. The exact distribution of $T$ is defined to be the Student's $t$-distribution. And $T$ approximately follows the standard normal distribution when the sample size is large.
Therefore, with $\delta = z_{\alpha/2}\,S/\sqrt{n}$ determined based on $N(0, 1)$, the corresponding CI is called the large sample confidence interval, which is
$$\left(\bar{X} - z_{\alpha/2}\frac{S}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{S}{\sqrt{n}}\right).$$
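A companion sketch for the unknown-$\sigma^2$ case (same made-up observations as before): it computes both the exact interval based on the Student's $t$-distribution and the large sample interval that replaces the $t$ critical value with the normal one.

```python
import numpy as np
from scipy.stats import norm, t

x = np.array([10.2, 9.8, 10.5, 10.1, 9.6, 10.3, 9.9, 10.4])  # hypothetical observed values
alpha = 0.05

n = len(x)
x_bar, s = x.mean(), x.std(ddof=1)

# Exact CI based on the Student's t-distribution with n - 1 degrees of freedom
t_crit = t.ppf(1 - alpha / 2, df=n - 1)
print(x_bar - t_crit * s / np.sqrt(n), x_bar + t_crit * s / np.sqrt(n))

# Large sample CI: use the standard normal critical value instead
z = norm.ppf(1 - alpha / 2)
print(x_bar - z * s / np.sqrt(n), x_bar + z * s / np.sqrt(n))
```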
Construction of Confidence Interval: A General Method
Let $X_1, X_2, \ldots, X_n$ be a simple random sample from the population $X$. $\hat{\theta}$ is an unbiased estimator of $\theta$ and the standard deviation of $\hat{\theta}$ is $\sigma_{\hat{\theta}}$. This is known as the standard error of $\hat{\theta}$. If $\hat{\theta}$ exactly follows a normal distribution, i.e., $\hat{\theta} \sim N(\theta, \sigma_{\hat{\theta}}^2)$, and $\sigma_{\hat{\theta}}$ does not depend on any unknown parameter, then an exact $1 - \alpha$ CI of $\theta$ is
$$\left(\hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}}\right).$$
If $\hat{\theta}$ only approximately follows a normal distribution or $\sigma_{\hat{\theta}}$ depends on unknown parameters, and $\hat{\sigma}_{\hat{\theta}}$ is a consistent estimator of $\sigma_{\hat{\theta}}$, then a large sample $1 - \alpha$ CI of $\theta$ is
$$\left(\hat{\theta} - z_{\alpha/2}\,\hat{\sigma}_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\hat{\sigma}_{\hat{\theta}}\right).$$
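As one concrete instance of this general recipe (an illustrative example): for a Bernoulli($p$) population, $\hat{p} = \bar{X}$ is unbiased with standard error $\sqrt{p(1-p)/n}$, which depends on the unknown $p$, so a large sample CI plugs in a consistent estimate of the standard error.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
x = rng.binomial(1, 0.3, size=400)   # simulated Bernoulli(0.3) sample, for illustration
alpha = 0.05

n = len(x)
p_hat = x.mean()                              # unbiased estimator of p
se_hat = np.sqrt(p_hat * (1 - p_hat) / n)     # consistent estimate of the standard error
z = norm.ppf(1 - alpha / 2)

print(p_hat - z * se_hat, p_hat + z * se_hat)   # large sample CI for p
```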
Title: Probability and Statistics for Engineering Lecture 12-14 Notes