Bayes' Theorem
Definition
Bayes' theorem is a mathematical formula that gives the conditional probability of an event based on prior knowledge of related events.
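The standard form is:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
Where:
- $P(A \mid B)$ = probability of event $A$ occurring given that $B$ has occurred
- $P(B \mid A)$ = probability of event $B$ occurring given that $A$ has occurred
- $P(A)$ = prior probability of event $A$
- $P(B)$ = total probability of event $B$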
This formula is used to reverse conditional probability. In many practical problems, we know the probability of evidence given a cause and want the probability of the cause given the evidence.
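As a minimal sketch of this reversal, the function below takes the probability of the evidence given the cause, the prior probability of the cause, and the total probability of the evidence, and returns the probability of the cause given the evidence. The function name, the parameter names, and the numbers in the usage line are illustrative assumptions, not part of these notes.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Return P(cause | evidence) = P(evidence | cause) * P(cause) / P(evidence)."""
    for name, value in (("likelihood", likelihood), ("prior", prior), ("evidence", evidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1], got {value}")
    if evidence == 0.0:
        raise ValueError("evidence probability must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers: P(evidence | cause) = 0.8, P(cause) = 0.3, P(evidence) = 0.5
posterior = bayes_posterior(likelihood=0.8, prior=0.3, evidence=0.5)
print(round(posterior, 2))  # 0.48
```

The check on the evidence probability mirrors the requirement that the conditioning event must have non-zero probability.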
Main Content
1. First Concept: Conditional Probability
- Conditional probability means the probability of one event happening when another event is already known to have happened.
- It is written as:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
provided that $P(B) > 0$.
Explanation
Conditional probability is the foundation of Bayes' theorem. It tells us how the probability of an event changes once we know that another event has occurred.
For example, suppose we draw a card from a standard deck. If we know the card is a face card, the probability that it is a king changes from the overall probability of kings in the deck. This is because the sample space is now restricted to face cards only.
Example
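Let:
- Event $A$ = "the card is a king"
- Event $B$ = "the card is a face card"
Then:
$$P(A \cap B) = \frac{4}{52}, \qquad P(B) = \frac{12}{52}$$
So,
$$P(A \mid B) = \frac{4/52}{12/52} = \frac{4}{12} = \frac{1}{3}$$
This means that if a card is known to be a face card, the probability that it is a king is $\frac{1}{3}$.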
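The same answer can be checked by counting. The sketch below builds a 52-card deck, restricts the sample space to face cards, and counts the kings inside it. The variable names are illustrative, and face cards are taken to be jacks, queens, and kings.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
FACE_RANKS = {"J", "Q", "K"}

# Build the 52-card deck as (rank, suit) pairs.
deck = list(product(RANKS, SUITS))

# Restrict the sample space to face cards (the conditioning event).
face_cards = [card for card in deck if card[0] in FACE_RANKS]

# Within that restricted space, count the kings.
kings_among_faces = [card for card in face_cards if card[0] == "K"]

p_king = Fraction(sum(1 for rank, _ in deck if rank == "K"), len(deck))
p_king_given_face = Fraction(len(kings_among_faces), len(face_cards))

print(p_king)             # 1/13
print(p_king_given_face)  # 1/3
```

Knowing that the card is a face card raises the probability of a king from 1/13 to 1/3, because 40 of the 52 cards are no longer possible.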
Why it matters
Conditional probability is the bridge between raw probability and updated probability. Without it, Bayes' theorem would not work.
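The step from conditional probability to Bayes' theorem can be made explicit. The short derivation below assumes two events $A$ and $B$ with $P(A) > 0$ and $P(B) > 0$, and writes the joint probability $P(A \cap B)$ in two ways using the definition of conditional probability.

$$
\begin{aligned}
P(A \cap B) &= P(A \mid B)\, P(B) \\
P(A \cap B) &= P(B \mid A)\, P(A) \\
\Rightarrow \quad P(A \mid B)\, P(B) &= P(B \mid A)\, P(A) \\
\Rightarrow \quad P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)}
\end{aligned}
$$

The last line is Bayes' theorem: both conditional probabilities describe the same joint event, so one can be solved for in terms of the other.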
2. Second Concept: Prior, Likelihood, and Posterior
Prior probability
- $P(A)$ is the initial belief about an event before new evidence is observed.
Likelihood
- $P(B \mid A)$ is the probability of observing the evidence assuming the event is true.
Posterior probability
- $P(A \mid B)$ is the updated probability after observing the evidence.
Explanation
Bayes' theorem is often understood using these three ideas:
Prior
- $P(A)$: what we believed before
Likelihood
- $P(B \mid A)$: how likely the new evidence is under that belief
Posterior
- $P(A \mid B)$: what we believe after seeing the evidence
Bayes' theorem combines these into a single update rule:
$$P(A \mid B) \propto P(B \mid A)\, P(A)$$
Then the result is normalized using the total probability of the evidence, $P(B)$.
Example: Medical Test
Suppose a disease affects 1% of the population.
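- Prior probability of disease: $P(D) = 0.01$
Suppose a test has:
- 95% sensitivity: $P(+ \mid D) = 0.95$
- 90% specificity: $P(- \mid \neg D) = 0.90$
That means the false positive rate is:
$$P(+ \mid \neg D) = 1 - 0.90 = 0.10$$
Now if a person tests positive, we want the posterior probability that they actually have the disease:
$$P(D \mid +) = \frac{P(+ \mid D)\, P(D)}{P(+)}$$
First compute $P(+)$:
$$P(+) = P(+ \mid D)\, P(D) + P(+ \mid \neg D)\, P(\neg D) = (0.95)(0.01) + (0.10)(0.99) = 0.0095 + 0.099 = 0.1085$$
Now:
$$P(D \mid +) = \frac{0.0095}{0.1085} \approx 0.0876$$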
So even after a positive test, the probability of having the disease is about 8.76%, not 95%.
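One way to see why the answer is so low is to count people instead of multiplying probabilities. The sketch below applies the 1% prevalence, 95% sensitivity, and 90% specificity from this example to a cohort of 100,000 people. The cohort size is a hypothetical choice made only to keep the counts whole.

```python
population = 100_000          # hypothetical cohort size, chosen for round numbers
prevalence_pct = 1            # 1% of people have the disease
sensitivity_pct = 95          # 95% of sick people test positive
specificity_pct = 90          # 90% of healthy people test negative

sick = population * prevalence_pct // 100                    # 1,000
healthy = population - sick                                  # 99,000

true_positives = sick * sensitivity_pct // 100               # 950
false_positives = healthy * (100 - specificity_pct) // 100   # 9,900

all_positives = true_positives + false_positives             # 10,850
posterior = true_positives / all_positives

print(true_positives, false_positives)  # 950 9900
print(f"{posterior:.4f}")               # 0.0876
```

Only 950 of the 10,850 positive results belong to people who are actually sick, which is where the 8.76% comes from.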
Why this is important
This example shows that the posterior depends not only on the test accuracy but also on the prior probability. If a disease is rare, many positive results may still be false positives.
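To make the dependence on the prior concrete, the sketch below keeps the test fixed at 95% sensitivity and 90% specificity and varies only the prevalence. The function name and the list of prevalence values are illustrative assumptions.

```python
def posterior_given_positive(prevalence: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) via Bayes' theorem with total probability."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same test (95% sensitivity, 90% specificity), different priors.
for prevalence in (0.001, 0.01, 0.10, 0.50):
    p = posterior_given_positive(prevalence, sensitivity=0.95, specificity=0.90)
    print(f"prevalence={prevalence:>5.1%}  posterior={p:.1%}")
```

With these inputs the posterior rises from roughly 0.9% at a prevalence of 0.1% to roughly 90.5% at a prevalence of 50%, even though the test itself never changes.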
3. Third Concept: Total Probability and Reversal of Cause and Effect
- Bayes' theorem uses the law of total probability to calculate the overall chance of evidence.
- It helps reverse conditional reasoning, moving from "evidence given cause" to "cause given evidence".
- This reversal is useful in classification, diagnosis, and inference problems.
Explanation
Often we know the probability of observing some data under different possible causes. Bayes' theorem lets us estimate which cause is most likely after seeing the data.
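If events $A_1, A_2, \dots, A_n$ form a partition of the sample space, then:
$$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\, P(A_i)$$
So Bayes' theorem can be written as:
$$P(A_k \mid B) = \frac{P(B \mid A_k)\, P(A_k)}{\sum_{i=1}^{n} P(B \mid A_i)\, P(A_i)}$$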
This form is very important when there are multiple possible explanations for the same evidence.
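The multiple-cause form can be sketched as a small function that takes a prior and a likelihood for every possible cause and returns a posterior for each. The function name, the dictionary layout, and the two-cause numbers in the usage line are hypothetical choices made for illustration.

```python
from typing import Dict, Hashable


def posterior_over_partition(
    priors: Dict[Hashable, float], likelihoods: Dict[Hashable, float]
) -> Dict[Hashable, float]:
    """Return P(cause | evidence) for every cause in a partition.

    priors[c]      = P(c); the values should sum to 1
    likelihoods[c] = P(evidence | c)
    """
    if priors.keys() != likelihoods.keys():
        raise ValueError("priors and likelihoods must cover the same causes")
    if abs(sum(priors.values()) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 for the causes to form a partition")

    # Numerator of Bayes' theorem for each cause.
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Law of total probability: P(evidence) is the sum of the joint terms.
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("the evidence has zero probability under every cause")
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical two-cause example: priors 0.7 / 0.3, likelihoods 0.1 / 0.6.
result = posterior_over_partition({"A1": 0.7, "A2": 0.3}, {"A1": 0.1, "A2": 0.6})
print(result)
```

In this made-up case the posteriors come out at about 0.28 for A1 and 0.72 for A2: the less likely cause beforehand becomes the more likely one once the evidence is seen.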
Example: Faulty Machine
Suppose a factory has three machines producing items:
- Machine 1 produces 50% of the items
- Machine 2 produces 30% of the items
- Machine 3 produces 20% of the items
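Their defect rates are:
- Machine 1: 2%
- Machine 2: 3%
- Machine 3: 5%
If a defective item is found, what is the probability it came from Machine 3?
First calculate total defect probability:
$$P(\text{defective}) = (0.02)(0.50) + (0.03)(0.30) + (0.05)(0.20) = 0.010 + 0.009 + 0.010 = 0.029$$
Now:
$$P(M_3 \mid \text{defective}) = \frac{P(\text{defective} \mid M_3)\, P(M_3)}{P(\text{defective})} = \frac{(0.05)(0.20)}{0.029} = \frac{0.010}{0.029} \approx 0.345$$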
So there is about a 34.5% chance the defective item came from Machine 3.
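The calculation can be reproduced with exact fractions. In the sketch below the production shares (50%, 30%, 20%) come from the example, while the defect rates of 2%, 3%, and 5% are an assumption: they are values that reproduce the 34.5% figure, and any rates in the same 2:3:5 ratio would give the same posterior. The variable names are illustrative.

```python
from fractions import Fraction

# Production shares from the example.
share = {1: Fraction(50, 100), 2: Fraction(30, 100), 3: Fraction(20, 100)}
# ASSUMED defect rates (2%, 3%, 5%); they reproduce the 34.5% result.
defect_rate = {1: Fraction(2, 100), 2: Fraction(3, 100), 3: Fraction(5, 100)}

# Law of total probability: overall chance that a random item is defective.
p_defective = sum(defect_rate[m] * share[m] for m in share)

# Bayes' theorem for each machine.
posterior = {m: defect_rate[m] * share[m] / p_defective for m in share}

print(p_defective)  # 29/1000
for m, p in posterior.items():
    print(m, p, f"{float(p):.3f}")
```

Under these assumed rates, Machine 3 gets 10/29 (about 0.345). Machine 1 gets the same 10/29, because its larger share of production offsets its lower defect rate, and Machine 2 gets 9/29.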
Visual flow
Given evidence $B$, Bayes' theorem helps determine the most likely cause:
Possible causes -> Evidence observed -> Updated probabilities
A1, A2, A3 -> B -> P(A1|B), P(A2|B), P(A3|B)
This is the central idea behind Bayesian inference.
Working / Process
1. Identify the event and the evidence
- Decide what the unknown event is, such as having a disease, choosing a machine, or being in a particular category.
- Identify the evidence or observation that has been found.
2. Find the prior probability and likelihood
- Determine the prior probability of the event before seeing evidence.
- Determine the probability of the evidence assuming the event is true.
3. Apply Bayes' theorem and compute the posterior
- Use:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
- If needed, compute $P(B)$ using total probability.
- Simplify the result to obtain the updated probability.
Example process
For a medical test:
- Step 1: Event = disease present, evidence = positive test
- Step 2: Use prevalence as the prior and the test's sensitivity as the likelihood
- Step 3: Calculate posterior probability of disease after a positive result
General computation structure
Posterior = (Likelihood x Prior) / Evidence
Another simple illustration
Prior belief + New data -> Updated belief
This updating process can be repeated whenever more data becomes available.
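Repeated updating can be sketched by feeding each posterior back in as the next prior. The example below reuses the medical test numbers (1% prevalence, 95% sensitivity, 10% false positive rate) and applies three positive results in a row. It assumes the repeated test results are independent given the person's true disease status, which is a simplification, and the function and variable names are illustrative.

```python
def update(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """One Bayesian update: return the posterior after observing the evidence."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


belief = 0.01  # prior: 1% prevalence, as in the medical test example
history = [belief]

# Three positive results in a row; each is ASSUMED independent given disease status.
for _ in range(3):
    belief = update(belief, p_evidence_if_true=0.95, p_evidence_if_false=0.10)
    history.append(belief)

print([round(b, 3) for b in history])
```

Under that independence assumption the belief moves from 0.01 to about 0.088, then about 0.477, then about 0.896, so each new positive result shifts the probability further.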
Advantages / Applications
Medical diagnosis
- Helps estimate the actual probability of a disease after test results, especially when diseases are rare.
Machine learning and AI
- Used in Naive Bayes classifiers, spam filtering, text categorization, and predictive modeling.
Decision-making under uncertainty
- Helps make better choices when outcomes are uncertain and information is incomplete.
Fault detection and quality control
- Used to identify likely causes of defects or failures in manufacturing systems.
Risk assessment and finance
- Helps evaluate probabilities of loss, default, fraud, and other uncertain events.
Legal and forensic analysis
- Assists in updating the probability of hypotheses based on evidence.
Scientific inference
- Used in research to revise hypotheses as new data is collected.
Weather forecasting
- Helps improve predictions by updating beliefs based on observed atmospheric conditions.
Summary
- Bayes' theorem is a rule for updating probability using new evidence.
- It connects prior probability, likelihood, and posterior probability.
- It is useful for reversing conditional probability in real-world problems.
- Important terms to remember: prior, likelihood, posterior, conditional probability, total probability.