Bivariate Distributions Bivariate distributions and their properties
Definition
A bivariate distribution is the probability distribution of a pair of random variables defined on the same sample space. It assigns probabilities to pairs of values rather than to single values.
For discrete random variables $X$ and $Y$, the bivariate distribution is given by the joint probability mass function:
$$p(x, y) = P(X = x,\, Y = y)$$
For continuous random variables, it is given by the joint probability density function $f(x, y)$ such that
$$P\big((X, Y) \in A\big) = \iint_A f(x, y)\,dx\,dy$$
for any region $A$ in the plane.
The key property is that the joint distribution completely describes the probabilistic relationship between the two variables, from which marginal distributions, conditional distributions, expectation, covariance, and correlation can be derived.
Main Content
1. Joint Distribution
Joint probability distribution of two variables
The joint distribution shows how the two variables behave together. For discrete variables, we list the probabilities for every possible pair $(x, y)$. For continuous variables, we use a density function over a region in the $xy$-plane.
Probability over regions
In a bivariate setting, we can compute probabilities for events such as $X \le Y$, $X + Y \le 1$, or more complex geometric regions. This matters because it extends one-variable probability to two dimensions.
Discrete bivariate distribution
Suppose $X$ and $Y$ represent the number of defects in two different machine parts. The joint probability table, with rows indexed by the value of $X$ and columns by the value of $Y$, may look like this:
| $X \backslash Y$ | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.10 | 0.05 | 0.02 |
| 1 | 0.15 | 0.20 | 0.08 |
| 2 | 0.12 | 0.18 | 0.10 |
Each entry is $p(x, y) = P(X = x,\, Y = y)$, and the total of all probabilities must be 1.
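The checks described above can be sketched in a few lines of Python. This is only an illustration: it assumes that rows of the table correspond to values of $X$ and columns to values of $Y$, and the helper names `is_valid_pmf` and `event_probability` are made up for this example.

```python
import math

# Joint pmf from the table above.
# Assumption: row index = value of X, column index = value of Y.
JOINT = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.08],
    [0.12, 0.18, 0.10],
]

def is_valid_pmf(table, tol=1e-9):
    """Return True if every entry is non-negative and the entries sum to 1."""
    entries = [p for row in table for p in row]
    return all(p >= 0 for p in entries) and math.isclose(
        sum(entries), 1.0, abs_tol=tol
    )

def event_probability(table, event):
    """Sum p(x, y) over all pairs (x, y) for which event(x, y) is true."""
    return sum(
        p
        for x, row in enumerate(table)
        for y, p in enumerate(row)
        if event(x, y)
    )

valid = is_valid_pmf(JOINT)
p_equal = event_probability(JOINT, lambda x, y: x == y)         # P(X = Y)
p_sum_le_1 = event_probability(JOINT, lambda x, y: x + y <= 1)  # P(X + Y <= 1)
```

With this table, $P(X = Y) = 0.10 + 0.20 + 0.10 = 0.40$ and $P(X + Y \le 1) = 0.10 + 0.05 + 0.15 = 0.30$. Passing the event as a function keeps the same helper usable for any region of pairs.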
Continuous bivariate distribution
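If $X$ and $Y$ are continuous, then the joint density function $f(x, y)$ must satisfy:
- $f(x, y) \ge 0$ for all $(x, y)$
- $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$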
For example, if $f(x, y) > 0$ for $0 \le x \le y \le 1$ and $f(x, y) = 0$ elsewhere, then the density is supported only in the triangular region above the line $y = x$ within the unit square.
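As a concrete illustration, one density with this kind of triangular support is the uniform density $f(x, y) = 2$ on $0 \le x \le y \le 1$. This particular form is an assumption chosen for the sketch, not necessarily the density intended above. Its normalization can be checked by integrating over the triangle, with $y$ running from the line $y = x$ up to $1$:

$$\int_0^1 \int_x^1 2\,dy\,dx = \int_0^1 2(1 - x)\,dx = \Big[\,2x - x^2\,\Big]_0^1 = 1$$

The constant is 2 because the triangle has area $\tfrac{1}{2}$, so a uniform density must equal the reciprocal of that area.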
2. Marginal Distribution
Distribution of one variable alone
The marginal distribution of $X$ or $Y$ is obtained from the joint distribution by “summing out” or “integrating out” the other variable. It tells us the behavior of one variable regardless of the other.
Importance of marginalization
Marginal distributions are essential because even when two variables are related, we often want the distribution of each variable separately. These are useful for understanding averages, spread, and probability behavior independently.
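For discrete variables
$$p_X(x) = \sum_{y} p(x, y), \qquad p_Y(y) = \sum_{x} p(x, y)$$
Using the table above, with rows indexed by $x$ and columns by $y$:
$$p_X(0) = 0.17, \quad p_X(1) = 0.43, \quad p_X(2) = 0.40$$
$$p_Y(0) = 0.37, \quad p_Y(1) = 0.43, \quad p_Y(2) = 0.20$$
For continuous variables
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$
If the joint density is supported in a bounded region, the limits of integration are restricted to that region.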
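For the discrete case, marginalization amounts to row and column sums. The following minimal sketch assumes the defect table from the joint distribution section, with rows taken as values of $X$ and columns as values of $Y$; the function name `marginals` is illustrative.

```python
# Joint pmf from the defect table.
# Assumption: row index = value of X, column index = value of Y.
JOINT = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.08],
    [0.12, 0.18, 0.10],
]

def marginals(table):
    """Return (p_X, p_Y) as lists, obtained by summing rows and columns."""
    p_x = [sum(row) for row in table]        # sum over y for each fixed x
    p_y = [sum(col) for col in zip(*table)]  # sum over x for each fixed y
    return p_x, p_y

p_x, p_y = marginals(JOINT)
```

Up to floating-point rounding, `p_x` is (0.17, 0.43, 0.40) and `p_y` is (0.37, 0.43, 0.20), and each marginal sums to 1.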
Visual idea
Joint distribution in 2D:
Y ^
|
| • • •
| • • • •
| • • •
| • •
+-----------------> X
Marginal of X:
probability spread along X axis after ignoring Y
Marginal of Y:
probability spread along Y axis after ignoring X
3. Conditional Distribution and Dependence
Conditional probability and conditional distribution
The conditional distribution of one variable given the value of the other shows how the probability structure changes when one variable is known. It is one of the most powerful ideas in bivariate analysis.
Dependence and independence
If the conditional distribution of $Y$ given $X = x$ is the same as the marginal distribution of $Y$ for every $x$, then $X$ and $Y$ are independent. Otherwise, they are dependent. This concept explains whether one variable carries information about the other.
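Conditional distribution for discrete variables
$$p_{Y \mid X}(y \mid x) = \frac{p(x, y)}{p_X(x)}, \qquad p_X(x) > 0$$
Conditional density for continuous variables
$$f_{Y \mid X}(y \mid x) = \frac{f(x, y)}{f_X(x)}, \qquad f_X(x) > 0$$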
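In the discrete case, conditioning on $X = x$ means taking one row of the joint table and rescaling it by the marginal $p_X(x)$ so that it sums to 1. The sketch below assumes the defect table with rows as values of $X$ and columns as values of $Y$; `conditional_y_given_x` is a hypothetical helper name.

```python
# Joint pmf from the defect table.
# Assumption: row index = value of X, column index = value of Y.
JOINT = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.08],
    [0.12, 0.18, 0.10],
]

def conditional_y_given_x(table, x):
    """Return the conditional pmf of Y given X = x as a list over y."""
    p_x = sum(table[x])  # marginal probability P(X = x)
    if p_x == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return [p / p_x for p in table[x]]

cond = conditional_y_given_x(JOINT, 1)  # distribution of Y given X = 1
```

Here `cond` is approximately (0.349, 0.465, 0.186), that is, each entry of the row for $X = 1$ divided by 0.43. The explicit zero check mirrors the requirement that the conditioning event has positive probability.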
Independence
Two random variables $X$ and $Y$ are independent if and only if:
$$p(x, y) = p_X(x)\,p_Y(y)$$
for all $(x, y)$ in the discrete case, or
$$f(x, y) = f_X(x)\,f_Y(y)$$
for all $(x, y)$ in the continuous case.
Example of dependence
If the height and weight of individuals are studied, taller people tend to weigh more, so the two variables are dependent.
Example of independence
If one variable measures the result of a coin toss and another measures the result of an unrelated die roll, then the two variables are independent.
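A factorization check for discrete tables can be sketched as follows. The coin and die are assumed fair, the defect table is assumed to have rows for $X$ and columns for $Y$, and `is_independent` is an illustrative name rather than a library function. The comparison uses a small tolerance because the probabilities are floating-point numbers.

```python
import math

def marginals(table):
    """Return (p_X, p_Y) by summing rows and columns of the joint table."""
    p_x = [sum(row) for row in table]
    p_y = [sum(col) for col in zip(*table)]
    return p_x, p_y

def is_independent(table, tol=1e-9):
    """Check whether p(x, y) equals p_X(x) * p_Y(y) for every cell."""
    p_x, p_y = marginals(table)
    return all(
        math.isclose(p, p_x[i] * p_y[j], abs_tol=tol)
        for i, row in enumerate(table)
        for j, p in enumerate(row)
    )

# Fair coin and fair die (assumed): the joint pmf is the product of marginals.
COIN = [0.5, 0.5]
DIE = [1 / 6] * 6
coin_die = [[pc * pd for pd in DIE] for pc in COIN]

# Defect table from the joint distribution section.
defects = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.08],
    [0.12, 0.18, 0.10],
]

coin_die_independent = is_independent(coin_die)
defects_independent = is_independent(defects)
```

The coin and die table passes the check. The defect table does not: for example, $p(0, 0) = 0.10$ while $p_X(0)\,p_Y(0) = 0.17 \times 0.37 = 0.0629$. A single mismatching cell is enough to rule out independence.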
Working / Process
1. Identify the variables and their type
- Decide whether the variables are discrete or continuous.
- Determine the possible values or region of the pair $(X, Y)$.
2. Write the joint distribution
- For discrete variables, form a joint probability table.
- For continuous variables, specify the joint density function $f(x, y)$.
- Check that probabilities sum to 1 or the density integrates to 1.
3. Derive required quantities
- Find marginal distributions by summing or integrating.
- Find conditional distributions using division by the marginal.
- Compute expectation, variance, covariance, and correlation if needed.
- Use these results to analyze dependence, shape, and spread.
Example workflow
Suppose the joint pmf is known for two discrete variables.
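- First, list all possible pairs $(x, y)$.
- Then compute the marginals $p_X(x)$ and $p_Y(y)$ by summing across rows or columns.
- Next, compute $E(X)$, $E(Y)$, and $E(XY)$.
- After that, calculate $\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$.
- Finally, test whether $p(x, y) = p_X(x)\,p_Y(y)$ for every pair to check independence.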
Key properties often studied
Non-negativity
- probabilities/densities are never negative
Normalization
- total probability equals 1
Marginality
- one-variable distributions obtained from the joint distribution
Conditionality
- distribution of one variable given the other
Dependence structure
- measured by covariance and correlation
Useful formulas
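- Expectation: $E(X) = \sum_{x} x\,p_X(x)$ in the discrete case, or $E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$ in the continuous case
- Joint expectation: $E(XY) = \sum_{x} \sum_{y} x y\,p(x, y)$ or $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y\,f(x, y)\,dx\,dy$
- Covariance: $\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$
- Correlation: $\rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X\,\sigma_Y}$, where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$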
These quantities summarize how strongly and in what direction the two variables move together.
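To see these summaries on actual numbers, here is a minimal sketch that evaluates them for the defect table from the joint distribution section. It assumes rows are values of $X$ and columns are values of $Y$, and `expectation` is a hypothetical helper that computes $E[g(X, Y)]$ for any function $g$.

```python
import math

# Joint pmf from the defect table.
# Assumption: row index = value of X, column index = value of Y.
JOINT = [
    [0.10, 0.05, 0.02],
    [0.15, 0.20, 0.08],
    [0.12, 0.18, 0.10],
]

def expectation(table, g):
    """E[g(X, Y)]: sum of g(x, y) * p(x, y) over all pairs (x, y)."""
    return sum(
        g(x, y) * p
        for x, row in enumerate(table)
        for y, p in enumerate(row)
    )

e_x = expectation(JOINT, lambda x, y: x)       # E(X)
e_y = expectation(JOINT, lambda x, y: y)       # E(Y)
e_xy = expectation(JOINT, lambda x, y: x * y)  # E(XY)

cov = e_xy - e_x * e_y                         # Cov(X, Y)
var_x = expectation(JOINT, lambda x, y: x**2) - e_x**2
var_y = expectation(JOINT, lambda x, y: y**2) - e_y**2
corr = cov / math.sqrt(var_x * var_y)          # correlation coefficient
```

For this table, $E(X) = 1.23$, $E(Y) = 0.83$, $E(XY) = 1.12$, and $\mathrm{Cov}(X, Y) = 1.12 - 1.23 \times 0.83 = 0.0991$, which gives a correlation of roughly 0.19: a weak positive association between the two defect counts.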
Advantages / Applications
Describes relationships between two variables
Bivariate distributions provide a mathematical framework for studying how two measurements interact. This is crucial in examining associations, trends, and joint patterns.
Supports advanced statistical analysis
They form the basis for correlation, regression, multivariate modeling, hypothesis testing, and predictive analytics. Much of applied statistics builds directly on these ideas.
Useful in many real-world areas
Applications include economics (income and spending), biology (height and weight), quality control (two defect counts), meteorology (temperature and humidity), and finance (returns on two assets).
Example applications
- In healthcare, age and blood pressure may be studied together.
- In business, advertising spend and sales revenue can be analyzed jointly.
- In engineering, load and failure time can be modeled using bivariate distributions.
- In education, study time and exam score can be examined for dependence.
Why they are important
Bivariate distributions help answer questions such as:
- Does knowing one variable help predict the other?
- Are two variables independent?
- What is the probability that both variables fall in a certain range?
- How does one variable change when the other changes?
Summary
- A bivariate distribution gives the joint behavior of two random variables.
- It can be discrete or continuous and is the basis for marginals, conditionals, and dependence.
- Important terms to remember: joint distribution, marginal distribution, conditional distribution, independence, covariance, correlation.