Skewness and Kurtosis

Definition

Skewness and kurtosis are statistical measures used to describe the "shape" of a probability distribution. While the mean and standard deviation describe the center and spread of data, skewness measures the lack of symmetry, and kurtosis measures the "tailedness" of the distribution, that is, how prone it is to producing extreme values (outliers).


Main Content

1. Skewness (The Measure of Asymmetry)

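  • Skewness quantifies how asymmetric a distribution is about its mean. The symmetrical bell curve of the normal distribution is the usual reference shape.
  • A symmetrical distribution has a skewness of zero; if one tail is longer than the other, the distribution is skewed toward that tail (positive for a longer right tail, negative for a longer left tail).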
   Positive Skew (Right)      Symmetrical (Zero)      Negative Skew (Left)
       _                              _                           _
      / \                            / \                         / \
     /   \_                         /   \                      _/   \
    /      \___                    /     \                 ___/      \
  _/           \_____            _/       \_         _____/           \_
   (Long Tail on Right)         (Equal Tails)         (Long Tail on Left)

2. Kurtosis (The Measure of Tails)

  • Kurtosis measures the "tailedness" of a distribution, specifically focusing on the frequency of extreme values (outliers) compared to a normal distribution.
  • It determines whether the data has heavy tails or light tails.
      Leptokurtic (High)    Mesokurtic (Normal)    Platykurtic (Low)
              _                      _
             / \                    / \                  _____
             | |                   /   \                /     \
            /   \                 /     \              /       \
     ______/     \______       __/       \__          /         \
       (Heavy Tails)          (Normal Tails)         (Light Tails)

3. Relationship to Normal Distribution

  • A normal distribution is the benchmark: it has a skewness of 0 and a kurtosis of 3. "Excess kurtosis" is defined as kurtosis minus 3, so a normal distribution has an excess kurtosis of 0.
  • Understanding these shapes helps researchers judge whether their data is approximately normal, which is an assumption of many parametric statistical tests.
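
The benchmark values can be checked empirically. The following is a minimal sketch, assuming a simulated standard normal sample of 100,000 points and population-style (divide by n) moments; the helper name standardized_moment and the seed are illustrative choices, not part of the notes above.

```python
import random

def standardized_moment(data, order):
    """k-th central moment divided by the standard deviation to the power k."""
    n = len(data)
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / n  # population variance
    central = sum((x - mean) ** order for x in data) / n
    return central / variance ** (order / 2)

# Assumption: a fixed seed so the run is repeatable.
rng = random.Random(0)
sample = [rng.gauss(0, 1) for _ in range(100_000)]

skewness = standardized_moment(sample, 3)  # expected to be close to 0
kurtosis = standardized_moment(sample, 4)  # expected to be close to 3

print(f"skewness = {skewness:.3f}")
print(f"kurtosis = {kurtosis:.3f} (excess = {kurtosis - 3:.3f})")
```

Because the sample is finite, the printed values will only approximate 0 and 3; larger samples should land closer to the benchmark.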

Working / Process

1. Calculating Skewness

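  • Compute the mean, median, and standard deviation of your data set.
  • Use Pearson’s second (median-based) coefficient of skewness: Skewness = 3 * (Mean - Median) / Standard Deviation. This is a quick approximation; the moment-based definition instead divides the third central moment by the standard deviation cubed.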
  • A positive result indicates a right-skewed distribution, and a negative result indicates a left-skewed distribution.
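
As an illustration of the formula above, here is a minimal sketch using Python's standard statistics module. The function name and the small data sets are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the notes do not say which version to use.

```python
import statistics

def pearson_median_skewness(data):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)  # assumption: sample standard deviation
    if sd == 0:
        raise ValueError("standard deviation is zero; skewness is undefined")
    return 3 * (mean - median) / sd

# Hypothetical data: one large value stretches the right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
# Mirroring the data flips the long tail to the left.
left_skewed = [-x for x in right_skewed]

print(pearson_median_skewness(right_skewed))  # positive: mean (4.7) > median (3)
print(pearson_median_skewness(left_skewed))   # negative: mean (-4.7) < median (-3)
```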

2. Calculating Kurtosis

  • Compute the fourth central moment: the average of (x - Mean)^4 over all observations.
  • Divide this value by the standard deviation raised to the power of four to find the Pearson Kurtosis coefficient.
  • Compare the result to 3 (or 0 for excess kurtosis) to categorize the shape of the curve.
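
The three steps above can be sketched as follows. This is one possible implementation, assuming population (divide by n) moments; the function name and the example values are hypothetical.

```python
def pearson_kurtosis(data):
    """Fourth central moment divided by the standard deviation to the fourth power."""
    n = len(data)
    mean = sum(data) / n
    # Second and fourth central moments (population form, divide by n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    if m2 == 0:
        raise ValueError("variance is zero; kurtosis is undefined")
    return m4 / m2 ** 2  # sd^4 equals the variance squared

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, variance 4
kurtosis = pearson_kurtosis(values)
excess = kurtosis - 3

print(kurtosis)  # 2.78125 (m4 = 44.5, sd^4 = 16)
print(excess)    # -0.21875, slightly below the normal benchmark
```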

3. Interpreting Results

  • Analyze the numerical output: as a rule of thumb, skewness between -0.5 and 0.5 indicates fairly symmetrical data.
  • Check the kurtosis value: values above 3 (leptokurtic) indicate heavier tails and more frequent extreme values than a normal distribution, while values below 3 (platykurtic) indicate lighter tails and fewer extreme values.
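
These rules of thumb can be turned into a small helper. The sketch below is illustrative only: the function name, the labels, and the tol parameter (how close to 3 counts as mesokurtic) are assumptions, and the thresholds are conventions rather than strict cutoffs.

```python
def describe_shape(skewness, kurtosis, tol=0.1):
    """Label a distribution using the rule-of-thumb thresholds.

    tol is a hypothetical tolerance: sample kurtosis rarely equals 3 exactly.
    """
    if -0.5 <= skewness <= 0.5:
        symmetry = "fairly symmetrical"
    elif skewness > 0.5:
        symmetry = "right-skewed"
    else:
        symmetry = "left-skewed"

    if abs(kurtosis - 3) <= tol:
        tails = "mesokurtic"
    elif kurtosis > 3:
        tails = "leptokurtic"   # heavier tails than normal
    else:
        tails = "platykurtic"   # lighter tails than normal

    return symmetry, tails

print(describe_shape(0.2, 3.0))   # ('fairly symmetrical', 'mesokurtic')
print(describe_shape(1.2, 4.5))   # ('right-skewed', 'leptokurtic')
print(describe_shape(-0.8, 2.0))  # ('left-skewed', 'platykurtic')
```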

Advantages / Applications

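  • Risk Management: In finance, negative skewness signals that large losses are more likely than large gains of the same size, while high kurtosis signals that extreme moves in either direction, such as market crashes, occur more often than a normal distribution would predict.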
  • Data Quality: Analysts use these metrics to identify data entry errors or anomalies that deviate from expected trends.
  • Statistical Modeling: Knowing the shape of the data helps analysts choose models whose assumptions match it, reducing the risk of biased predictions.

Summary

Skewness and kurtosis are descriptive statistics that capture the shape of a data distribution beyond its center and spread. Skewness indicates whether the longer tail lies to the left or right, while kurtosis indicates how prevalent extreme values are in the tails. Together, they help statisticians assess whether data is approximately normally distributed, supporting sound analysis and decision-making in fields like finance, engineering, and science.

Important terms: Symmetrical, Mean, Outliers, Normal Distribution, Standard Deviation.