Correlation coefficient

Definition

The correlation coefficient, usually denoted by r, is a number that measures the degree and direction of linear association between two variables. Its value lies between -1 and +1:

  • r = +1 means perfect positive correlation
  • r = -1 means perfect negative correlation
  • r = 0 means no linear correlation

A correlation coefficient closer to +1 or -1 indicates a stronger relationship, while a value near 0 indicates a weaker or no linear relationship.


Main Content

1. Meaning and Nature of Correlation Coefficient

  • Correlation coefficient shows direction and strength: It tells us whether variables increase together, decrease together, or move in opposite directions. If the value is positive, both variables tend to rise and fall together. If it is negative, one variable tends to increase while the other decreases. The size of the number shows how strong the relationship is.

Example: If hours studied and marks obtained have a correlation coefficient of 0.85, there is a strong positive relationship: more study time is generally associated with higher marks. The coefficient alone does not show that studying causes the higher marks.

  • It measures linear relationship: The most commonly used correlation coefficient is Pearson’s correlation coefficient, which measures linear association. That means it works best when the relationship between variables can be approximated by a straight line. If the data follow a curved pattern, the correlation may be low even when the variables are strongly related.

Example: Height increases with age during childhood and then levels off, so a single linear correlation between age and height may not fully capture the pattern.
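The point about curved patterns can be illustrated with a small sketch. The data here are hypothetical (y is exactly x squared on a symmetric range of x), and `pearson_r` is an illustrative helper written for this example, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y is fully determined by x, but along a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)
```

For this symmetric data the positive and negative deviations cancel, so r comes out at zero even though y is completely determined by x.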

2. Types of Correlation Coefficient

  • Positive correlation: When one variable increases, the other also increases. The coefficient is greater than 0.
    Example: Income and expenditure often show positive correlation.

  • Negative correlation: When one variable increases, the other decreases. The coefficient is less than 0.
    Example: Speed and time taken to cover a fixed distance show negative correlation.

  • Zero or no correlation: There is no consistent linear relationship between the variables. The coefficient is around 0.
    Example: Shoe size and intelligence generally have no meaningful correlation.

3. Interpretation of Correlation Values

  • Magnitude indicates strength: The closer the value is to +1 or -1, the stronger the linear relationship. As a rough rule of thumb, values of about 0.7 to 1.0 (or -0.7 to -1.0) indicate strong correlation, while values of about 0.1 to 0.3 (or -0.1 to -0.3) indicate weak correlation. These cut-offs are conventions and vary between fields.

  • Sign indicates direction: The positive sign means both variables move in the same direction, while the negative sign means they move in opposite directions.
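The reading of sign and magnitude can be sketched as a small helper. The function name `describe_correlation` is hypothetical, the 0.7 and 0.1 cut-offs come from the rule of thumb above, and the "moderate" label for the range in between is an assumption added for illustration.

```python
def describe_correlation(r):
    """Return a rough verbal label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"  # assumed label for the in-between range
    elif size >= 0.1:
        strength = "weak"
    else:
        # Too close to 0 for the sign to be meaningful.
        return "little or no linear correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

print(describe_correlation(0.85))  # strong positive
print(describe_correlation(-0.2))  # weak negative
print(describe_correlation(0.03))  # little or no linear correlation
```

Such labels are only a reading aid; the context of the data should decide whether a given value is meaningful.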

Visual idea of correlation patterns

Positive correlation

y
|       *
|     *
|   *
| *
|________________ x

Negative correlation

y
| *
|   *
|     *
|       *
|________________ x

No clear linear correlation

y
| *   *    *
|    *   *
|  *    *
|     *    *
|________________ x

Working / Process

1. Collect paired data

  • Obtain values of the two variables for each observation.
  • Example: For 5 students, record their study hours and marks.
  • The data must be matched in pairs because correlation compares corresponding values.

2. Choose the appropriate correlation method

  • Use Pearson’s correlation coefficient for numerical data with a linear relationship.
  • Use Spearman’s rank correlation when data are ranked or when the relationship is monotonic but not necessarily linear.
  • The choice depends on the type of data and the nature of the relationship.
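The difference between the two methods can be seen on data that rise steadily but not along a straight line. The sketch below uses hypothetical data (y is x cubed) and illustrative helpers; the simple `ranks` function assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: monotonic (always increasing) but not linear.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]

r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)
print(round(r, 2), round(rho, 2))
```

Here Spearman's coefficient is 1 because the ranks agree perfectly, while Pearson's r is lower (about 0.94) because the points do not lie on a straight line.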

3. Compute and interpret the coefficient

  • Apply the formula or statistical tool to find the coefficient.
  • Interpret the result based on sign and magnitude.
  • Always check whether the value is meaningful in the context of the data and whether the relationship is actually linear.

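Basic formula for Pearson’s correlation coefficient

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y} $$

Where:

  • $\operatorname{Cov}(X, Y)$ = covariance of variables $X$ and $Y$
  • $\sigma_X$ = standard deviation of $X$
  • $\sigma_Y$ = standard deviation of $Y$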

This formula shows that correlation is standardized covariance, meaning it measures association on a fixed scale from -1 to +1.
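
For hand calculation, the ratio of covariance to the product of the two standard deviations can be written out in terms of the data. The notation here is assumed for illustration: $n$ paired observations $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$. The common divisor in the covariance and in the two standard deviations ($n$ or $n - 1$) cancels, leaving:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

Because the divisor cancels, r is the same whether population or sample formulas are used for the covariance and standard deviations, as long as the same choice is made for all three.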

Example calculation idea

Suppose:

Student    Study Hours    Marks
A          2              40
B          4              55
C          5              65
D          7              78
E          9              90

Here, as study hours increase, marks also increase. Pearson’s correlation coefficient computed from these five pairs is approximately +0.997, very close to +1, showing a strong positive linear relationship.
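
The calculation for the five students can be checked with a short sketch that uses only the standard library. It computes the covariance and the two standard deviations separately and then divides; the variable names are illustrative, and the sample (n - 1) divisor is an assumption that does not affect r.

```python
import math

hours = [2, 4, 5, 7, 9]
marks = [40, 55, 65, 78, 90]

n = len(hours)
mean_h = sum(hours) / n
mean_m = sum(marks) / n

# Sample covariance and standard deviations (n - 1 in the denominator).
cov = sum((h - mean_h) * (m - mean_m) for h, m in zip(hours, marks)) / (n - 1)
sd_h = math.sqrt(sum((h - mean_h) ** 2 for h in hours) / (n - 1))
sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in marks) / (n - 1))

# Correlation is covariance divided by the product of standard deviations.
r = cov / (sd_h * sd_m)
print(round(r, 3))  # 0.997
```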


Advantages / Applications

Helps understand relationships quickly

It provides a simple numerical summary of how two variables are related, making it easier to analyze patterns in data.

Useful in prediction and decision-making

If two variables are strongly correlated, one variable may help predict the other, although correlation by itself does not establish that one causes the other. For example, businesses may use the correlation between advertising expenditure and sales to plan budgets.

Widely used in many fields

The correlation coefficient is applied in economics, psychology, medicine, education, finance, and research to study relationships such as:

  • income and spending
  • study time and performance
  • blood pressure and age
  • rainfall and crop yield

Summary

  • The correlation coefficient is a number that shows the strength and direction of the linear relationship between two variables.
  • Its value ranges from -1 to +1.
  • It is commonly used to study whether variables move together, move oppositely, or have no linear relationship.
  • Important terms to remember: positive correlation, negative correlation, zero correlation, Pearson’s correlation coefficient, linear relationship.