Probability Density Function

Definition

A Probability Density Function (PDF) is a function that describes how probability is distributed over the values of a continuous random variable. Unlike a discrete distribution, where each individual value has its own probability, a PDF gives the "density" of probability at each point; actual probabilities are obtained by integrating the density over a range of values. A PDF is never negative, and the total area under its curve is always equal to 1.


Main Content

1. Continuous Random Variables

  • Continuous variables can take any value within a range (e.g., height, weight, time).
  • Because there are uncountably many possible values, the probability of any single exact value is zero; therefore, we calculate probability over intervals.
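
The point about exact values can be illustrated numerically. The sketch below assumes a standard normal variable (an illustrative choice, not one fixed by these notes) and a hypothetical helper `normal_interval_probability`; it shrinks an interval around $x = 1$ and prints the probability, which falls toward zero as the interval narrows.

```python
import math


def normal_interval_probability(center, half_width):
    """P(center - half_width <= X <= center + half_width) for a standard normal X."""
    lower = center - half_width
    upper = center + half_width
    # Difference of the standard normal CDF, written with the error function:
    # Phi(b) - Phi(a) = 0.5 * (erf(b / sqrt(2)) - erf(a / sqrt(2)))
    return 0.5 * (math.erf(upper / math.sqrt(2)) - math.erf(lower / math.sqrt(2)))


# Shrink the interval around x = 1 (assumed example point) and watch the
# probability shrink with it.
for half_width in (0.5, 0.05, 0.005, 0.0005):
    p = normal_interval_probability(1.0, half_width)
    print(f"half-width {half_width}: probability {p:.6f}")
```

For narrow intervals the probability is roughly the density at the point multiplied by the interval width, which is why it vanishes as the width goes to zero.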

2. The Total Area Property

  • For any valid PDF, the area under the curve over the entire range of the variable must equal exactly 1.
  • This ensures that the probabilities of all possible outcomes together account for 100% of the sample space.
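
A minimal sketch of checking the total area property numerically. The density $f(x) = 3x^2$ on $[0, 1]$ and the helper names `midpoint_integral` and `candidate_pdf` are assumptions made for illustration; the integral is approximated with a simple midpoint rule rather than computed exactly.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def candidate_pdf(x):
    # Hypothetical density used only for illustration: f(x) = 3x^2 on [0, 1].
    return 3 * x ** 2


total_area = midpoint_integral(candidate_pdf, 0.0, 1.0)
# Allow a small tolerance because the integral is only approximated.
is_normalized = abs(total_area - 1.0) < 1e-6
print(f"total area = {total_area:.6f}, normalized: {is_normalized}")
```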

3. Visual Representation

  • The shape of the PDF curve shows where values are most concentrated (highest point) and least concentrated (lowest point).
  • The height of the curve at any point $x$ is denoted as $f(x)$. This height is a density, not a probability, so it can be greater than 1.
       f(x)
        ^
        |     ____
        |    /    \
        |   /      \
        |__/        \__
        +---------------> x
       [Area under curve = 1]

(This diagram sketches a typical bell curve, the PDF of a normal distribution.)


Working / Process

1. Defining the Function

  • Identify the range $[a, b]$ on which the random variable is defined (for an unbounded variable, the limits may be $-\infty$ or $\infty$).
  • Ensure that $f(x) \geq 0$ for all $x$, as probability density cannot be negative.

2. Normalization

  • Integrate the function over the total range to ensure the total area equals 1.
  • $\int_{a}^{b} f(x) \,dx = 1$
  • If the integral equals some other finite positive value $c$, divide the function by $c$ (that is, scale it by the constant $1/c$) to normalize it.
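
A worked sketch of the normalization step, using a hypothetical function chosen only for illustration: suppose $g(x) = x(1 - x)$ on $[0, 1]$.

  • $\int_{0}^{1} x(1 - x) \,dx = \left[ \frac{x^2}{2} - \frac{x^3}{3} \right]_{0}^{1} = \frac{1}{2} - \frac{1}{3} = \frac{1}{6}$
  • The area is $\frac{1}{6}$, not 1, so $g(x)$ is not yet a valid PDF.
  • Multiplying by the constant 6 gives $f(x) = 6x(1 - x)$, for which $\int_{0}^{1} 6x(1 - x) \,dx = 1$.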

3. Calculating Probability for Intervals

  • To find the probability that $X$ falls between two values $p$ and $q$, calculate the definite integral of the PDF between those limits.
  • $P(p \leq X \leq q) = \int_{p}^{q} f(x) \,dx$
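
The interval formula can be sketched in code as follows. The example assumes an exponential distribution with a hypothetical rate of 0.5 and limits $p = 1$, $q = 3$; none of these values come from the notes. It approximates the definite integral with a midpoint rule and compares the result with the known closed form for the exponential distribution.

```python
import math

RATE = 0.5  # hypothetical rate parameter (lambda) chosen for the example


def exponential_pdf(x):
    """Exponential density: f(x) = RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0


def interval_probability(pdf, p, q, n=10_000):
    """Approximate P(p <= X <= q) as the midpoint-rule integral of pdf over [p, q]."""
    h = (q - p) / n
    return h * sum(pdf(p + (i + 0.5) * h) for i in range(n))


p, q = 1.0, 3.0
numeric = interval_probability(exponential_pdf, p, q)
# Closed form for the exponential distribution: exp(-RATE * p) - exp(-RATE * q)
exact = math.exp(-RATE * p) - math.exp(-RATE * q)
print(f"numeric = {numeric:.6f}, exact = {exact:.6f}")
```

Numerical integration is used here so that the same helper works for any density; when an antiderivative is available, as it is for the exponential case, the closed form is the simpler choice.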

Advantages / Applications

  • Used extensively in Statistics and Machine Learning to model data distributions, such as normal, exponential, or uniform distributions.
  • Essential in Quality Control to determine the likelihood of a product falling outside of tolerance levels (e.g., measuring the thickness of a steel plate).
  • Crucial in Risk Management for calculating the probability of rare extreme events (tail risk) in financial markets.
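
As a rough illustration of the quality control use case, the sketch below assumes plate thickness follows a normal distribution with a mean of 10.0 mm, a standard deviation of 0.1 mm, and a tolerance band of 9.75 mm to 10.25 mm. All of these figures, and the helper `normal_cdf`, are hypothetical and chosen only for the example.

```python
import math


def normal_cdf(x, mean, std_dev):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std_dev * math.sqrt(2))))


# Hypothetical process parameters and tolerance limits (millimetres).
MEAN_MM = 10.0
STD_DEV_MM = 0.1
LOWER_MM, UPPER_MM = 9.75, 10.25

# P(within tolerance) is the area under the PDF between the two limits.
p_within = normal_cdf(UPPER_MM, MEAN_MM, STD_DEV_MM) - normal_cdf(LOWER_MM, MEAN_MM, STD_DEV_MM)
p_outside = 1.0 - p_within
print(f"P(out of tolerance) = {p_outside:.4f}")
```

The limits sit 2.5 standard deviations from the mean, so the out-of-tolerance probability is the combined area of the two tails beyond that distance.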

Summary

  • The PDF represents the probability distribution of a continuous random variable.
  • The probability that the variable falls within an interval is found by calculating the area under the curve over that interval; the probability of any single exact value is zero.
  • The total area under the entire curve must always be 1.
  • Important terms: Continuous Random Variable, Definite Integral, Normalization, Sample Space, and Probability Density.