Probability Density Function
Definition
A Probability Density Function (PDF) is a mathematical function that describes how the probability of a continuous random variable is distributed over its possible values. Unlike a discrete distribution, where each individual value has its own probability, the PDF gives the "density" of probability at each point; the probability that the variable falls within a specific range is the area under the curve over that range. The total area under the curve of a PDF is always equal to 1.
Main Content
1. Continuous Random Variables
- Continuous variables can take any value within a range (e.g., height, weight, time).
- Because there are uncountably many possible values, the probability of any single exact value is zero; therefore, we calculate probability over intervals.
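The point about exact values can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that $X$ follows a standard normal distribution and uses `statistics.NormalDist` from the standard library; the point `x0` and the helper `prob_within` are hypothetical names. As the interval around `x0` shrinks, the probability shrinks toward zero.

```python
from statistics import NormalDist

# Assumption for illustration: X follows a standard normal distribution.
X = NormalDist(mu=0.0, sigma=1.0)
x0 = 1.0  # the exact point we are interested in (hypothetical choice)


def prob_within(half_width: float) -> float:
    """P(x0 - half_width <= X <= x0 + half_width), computed from the CDF."""
    return X.cdf(x0 + half_width) - X.cdf(x0 - half_width)


half_widths = [0.5, 0.1, 0.01, 0.001]
probs = [prob_within(h) for h in half_widths]

# Each narrower interval carries less probability than the previous one.
for h, p in zip(half_widths, probs):
    print(f"half-width {h}: probability {p:.6f}")
```

An interval of zero width, `prob_within(0.0)`, has probability exactly zero, which is why only intervals carry probability for a continuous variable.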
2. The Total Area Property
- For any valid PDF, the area under the curve over the entire range of the variable must equal exactly 1.
- This reflects the fact that the variable is certain to take some value in its range: the whole sample space has probability 1 (100%).
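The total area property can be checked numerically. The sketch below assumes a standard normal PDF as the example and uses a simple midpoint rule; the helper names `normal_pdf` and `integrate` are hypothetical. Since the normal PDF is defined on the whole real line, the range is cut off at $[-10, 10]$, where the remaining tail area is negligible.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Height f(x) of the normal PDF with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Truncating at +/- 10 standard deviations is an assumption of this sketch.
total_area = integrate(normal_pdf, -10.0, 10.0)
print(f"Area under the curve: {total_area:.6f}")
```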
3. Visual Representation
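- The shape of the PDF curve shows where values are most likely to occur (around the highest point) and least likely to occur (where the curve is lowest).
- The height of the curve at any point $x$ is denoted as $f(x)$. This height is a density, not a probability, so it can be greater than 1 as long as the total area remains 1.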
f(x)
^
|     ____
|    /    \
|   /      \
|__/        \__
+---------------> x
[Area under curve = 1]
(This diagram represents a typical Bell Curve or Normal Distribution PDF.)
Working / Process
1. Defining the Function
- Identify the range $[a, b]$ for which the random variable is defined.
- Ensure that $f(x) \geq 0$ for all $x$, as a probability density cannot be negative.
2. Normalization
- Integrate the function over the total range to check that the total area equals 1.
- $\int_{a}^{b} f(x) \,dx = 1$
- If the integral equals some other finite positive value $c$, divide the function by $c$ to normalize it.
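The normalization step can be sketched in Python as follows. The unnormalized function $g(x) = x^2$ on $[0, 3]$ is a hypothetical example chosen here, not one from the notes, and `integrate` is an illustrative midpoint-rule helper. The exact integral of $g$ over $[0, 3]$ is 9, so the normalized PDF is $f(x) = x^2 / 9$.

```python
def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


def g(x: float) -> float:
    """Hypothetical non-negative function that is not yet normalized."""
    return x ** 2


a, b = 0.0, 3.0
area = integrate(g, a, b)  # close to the exact value 9
scale = 1.0 / area         # constant that makes the total area equal 1


def f(x: float) -> float:
    """Normalized PDF: g scaled by 1/area inside [a, b], zero outside."""
    return scale * g(x) if a <= x <= b else 0.0


print(f"Area before normalization: {area:.6f}")
print(f"Area after normalization:  {integrate(f, a, b):.6f}")
```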
3. Calculating Probability for Intervals
- To find the probability that $X$ falls between two values $p$ and $q$, calculate the definite integral of the PDF between those limits.
- $P(p \leq X \leq q) = \int_{p}^{q} f(x) \,dx$
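As a worked illustration of the interval formula, the sketch below assumes an exponential distribution with rate $\lambda = 0.5$, whose PDF is $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$. The rate, the limits $p = 1$ and $q = 3$, and the helper names are all choices made for this example. The numerical integral is compared with the closed form $e^{-\lambda p} - e^{-\lambda q}$.

```python
import math

RATE = 0.5  # hypothetical rate parameter (lambda) of the exponential PDF


def exponential_pdf(x: float) -> float:
    """f(x) = RATE * exp(-RATE * x) for x >= 0, and 0 otherwise."""
    return RATE * math.exp(-RATE * x) if x >= 0.0 else 0.0


def integrate(f, a: float, b: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


p, q = 1.0, 3.0
numeric = integrate(exponential_pdf, p, q)            # area under the PDF
exact = math.exp(-RATE * p) - math.exp(-RATE * q)     # closed-form answer

print(f"P({p} <= X <= {q}) numeric: {numeric:.6f}")
print(f"P({p} <= X <= {q}) exact:   {exact:.6f}")
```

Both values agree to many decimal places, at roughly 0.3834.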
Advantages / Applications
- Used extensively in Statistics and Machine Learning to model data distributions, such as normal, exponential, or uniform distributions.
- Essential in Quality Control to determine the likelihood of a product falling outside of tolerance levels (e.g., measuring the thickness of a steel plate).
- Crucial in Risk Management for calculating the probability of rare extreme events (tail risk) in financial markets.
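The quality control use case can be sketched with `statistics.NormalDist` from the Python standard library. All numbers here are hypothetical: plate thickness is assumed to be normally distributed with a mean of 10.0 mm and a standard deviation of 0.1 mm, and the tolerance band is assumed to be 9.75 mm to 10.25 mm.

```python
from statistics import NormalDist

# Hypothetical process: thickness in millimetres, assumed normally distributed.
thickness = NormalDist(mu=10.0, sigma=0.1)
lower_limit, upper_limit = 9.75, 10.25  # hypothetical tolerance band

# Probability of falling inside the band is the area under the PDF between
# the limits, obtained here as a difference of CDF values.
p_within = thickness.cdf(upper_limit) - thickness.cdf(lower_limit)
p_outside = 1.0 - p_within

print(f"P(within tolerance)  = {p_within:.4f}")
print(f"P(outside tolerance) = {p_outside:.4f}")
```

Under these assumptions about 1.2% of plates would fall outside tolerance, since the limits sit 2.5 standard deviations from the mean.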
Summary
- The PDF represents the probability distribution of a continuous random variable.
- The probability that the variable falls within an interval is the area under the curve over that interval; the probability of any single exact value is zero.
- The total area under the entire curve must always be 1.
- Important terms: Continuous Random Variable, Definite Integral, Normalization, Sample Space, and Probability Density.