Indexing Techniques: Bitmap Indices
Definition
A bitmap index is a database index that uses a sequence of bits to represent the presence or absence of a value in a column for each row in a table. For every distinct value in the indexed column, the system maintains one bitmap vector. Each bit in the vector corresponds to a row position in the table: the bit is set to 1 if that row contains the corresponding value and to 0 otherwise.
For example, if a column Gender has values Male and Female, the database may create two bitmap vectors:
- One bitmap for Male
- One bitmap for Female
If row 1 is Male, row 2 is Female, and row 3 is Male, the bitmaps may look like:
Male → 1 0 1
Female → 0 1 0
This compact representation allows very fast evaluation of queries using bitwise operations.
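The Gender example can be reproduced with a minimal Python sketch. The function name build_bitmap_index and the use of plain lists of 0s and 1s are illustrative assumptions; a real database packs the bits into machine words rather than storing one list element per row.

```python
def build_bitmap_index(column):
    """Return {value: list of 0/1 bits}, one bit per row, in row order."""
    index = {}
    for row, value in enumerate(column):
        # Create an all-zero bitmap the first time a value is seen.
        bitmap = index.setdefault(value, [0] * len(column))
        bitmap[row] = 1
    return index


gender = ["Male", "Female", "Male"]  # rows 1, 2, 3 from the example
index = build_bitmap_index(gender)

for value, bitmap in index.items():
    print(value, "->", " ".join(map(str, bitmap)))
# Male -> 1 0 1
# Female -> 0 1 0
```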
Main Content
1. Bitmap Index Structure
- A bitmap index stores a separate bit vector for each distinct value in a column.
- The length of each bitmap equals the number of rows in the table, and each bit position represents one specific row.
Bitmap indices are designed around the idea of mapping values to binary presence information. Suppose a table has 8 rows and a column Status with three values: New, Pending, and Closed. The index may store:
Row: 1 2 3 4 5 6 7 8
New: 1 0 0 1 0 0 1 0
Pending: 0 1 0 0 1 0 0 0
Closed: 0 0 1 0 0 1 0 1
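If a query asks for all Pending rows, the database simply reads the Pending bitmap and finds the positions of the 1s. If the query asks for rows that are either Pending or Closed, the database performs a bitwise OR on the two bitmaps, giving 0 1 1 0 1 1 0 1. A bitwise AND of two bitmaps from the same column is always all zeros, because each row holds exactly one value; AND becomes useful when combining bitmaps from different columns.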
Bitmap indices can also be implemented with compression techniques to reduce storage overhead. Because bitmaps often contain long runs of 0s or 1s, compression methods such as run-length encoding or word-aligned compression can make them very space-efficient.
- Point 1: Each distinct value in the column gets its own bitmap vector.
- Point 2: The position of each bit corresponds directly to a row in the table.
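To illustrate why long runs compress well, here is a rough sketch of plain run-length encoding. The helper names rle_encode and rle_decode and the sample bitmap are hypothetical, and production systems typically use word-aligned schemes that differ in detail from this simple (bit, run length) form.

```python
from itertools import groupby


def rle_encode(bits):
    """Collapse a list of bits into (bit, run_length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]


def rle_decode(runs):
    """Expand (bit, run_length) pairs back into the original bit list."""
    bits = []
    for bit, length in runs:
        bits.extend([bit] * length)
    return bits


# Hypothetical bitmap for a rare value: a single 1 among 2000 rows.
sparse = [0] * 1000 + [1] + [0] * 999
runs = rle_encode(sparse)

print(runs)  # [(0, 1000), (1, 1), (0, 999)]
```

Here 2000 bits shrink to three pairs. A bitmap whose bits alternate frequently would not benefit, which is why compression pays off mainly for sparse or clustered bitmaps.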
2. Query Processing with Bitwise Operations
- Bitmap indices are exceptionally fast because they use CPU-level bitwise operations such as AND, OR, and NOT.
- Complex queries can be evaluated by combining several bitmaps without scanning the actual table rows first.
For example, consider a table with columns Region, Product, and Year. Suppose a user wants:
SELECT *
FROM Sales
WHERE Region = 'North'
AND Product = 'Laptop'
AND Year = 2024;
The database can fetch the bitmap for Region='North', the bitmap for Product='Laptop', and the bitmap for Year=2024, then compute:
FinalBitmap = Region_North AND Product_Laptop AND Year_2024
The result bitmap identifies exactly which rows satisfy all conditions. This approach is highly efficient because bitwise operations are performed on whole machine words: on a 64-bit CPU, a single AND instruction combines 64 row positions at once.
Bitmap indices are particularly strong for multi-condition queries, especially when the indexed attributes are low-cardinality. They work well in OLAP systems where queries may filter data by many dimensions and then aggregate results.
- Point 1: Queries are answered through bitwise logical operations on bitmaps.
- Point 2: Multiple conditions can be combined efficiently without scanning all rows.
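The three-way AND from the Sales query can be sketched in Python, using an arbitrary-precision integer as the bit vector (bit i stands for row i, counting from 0). The sample data and the helpers to_bitmap and matching_rows are assumptions made for illustration, not part of any particular database.

```python
def to_bitmap(column, value):
    """Pack a column into an int: bit i is 1 when row i holds `value`."""
    bitmap = 0
    for row, cell in enumerate(column):
        if cell == value:
            bitmap |= 1 << row
    return bitmap


def matching_rows(bitmap):
    """List the 0-based row positions whose bit is set."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical Sales table with six rows.
region = ["North", "South", "North", "North", "East", "North"]
product = ["Laptop", "Laptop", "Phone", "Laptop", "Laptop", "Laptop"]
year = [2024, 2024, 2024, 2023, 2024, 2024]

# FinalBitmap = Region_North AND Product_Laptop AND Year_2024
final_bitmap = (
    to_bitmap(region, "North")
    & to_bitmap(product, "Laptop")
    & to_bitmap(year, 2024)
)

print(matching_rows(final_bitmap))  # [0, 5]
```

Only after the final bitmap is known does the table need to be touched, and then only for rows 0 and 5.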
3. Suitability, Benefits, and Limitations
- Bitmap indices are best for columns with few distinct values, such as gender, status, region, or category.
- They are not ideal for high-cardinality columns like employee ID or phone number because too many bitmaps would be required.
One major reason bitmap indices are valuable is their excellent performance in read-intensive analytical workloads. They allow quick filtering and can drastically reduce I/O. They also support efficient set-oriented query processing, which is useful in data warehouses where many users run large analytical queries.
However, bitmap indices have some limitations. Inserting, updating, or deleting rows can be expensive: changing a row's value means clearing a bit in one bitmap and setting it in another, and when the bitmaps are compressed the affected segments may have to be decompressed and re-encoded. They are therefore less suitable for transaction-heavy OLTP systems where rows change frequently. Another limitation is size: the index needs one bitmap per distinct value, so it can become very large for high-cardinality data.
A simple example of suitability:
- Gender column: good for bitmap index
- OrderStatus column: good for bitmap index
- CustomerID column: poor choice for bitmap index
- Point 1: Best for low-cardinality, read-heavy data.
- Point 2: Less suitable for frequent updates and high-cardinality columns.
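The effect of cardinality on size can be made concrete with a back-of-the-envelope calculation. This sketch assumes uncompressed bitmaps (one bit per row per distinct value) and a hypothetical table of one million rows; compression would shrink these figures considerably, so treat them as an upper bound.

```python
def uncompressed_index_bytes(num_rows, num_distinct):
    """Size of an uncompressed bitmap index: one bit per row per distinct value."""
    return num_rows * num_distinct // 8


ROWS = 1_000_000  # hypothetical table size

gender_bytes = uncompressed_index_bytes(ROWS, 2)          # Gender: 2 values
customer_bytes = uncompressed_index_bytes(ROWS, ROWS)     # CustomerID: unique per row

print(gender_bytes)    # 250000 (about 250 KB)
print(customer_bytes)  # 125000000000 (about 125 GB)
```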
Working / Process
1. Identify the indexed column and its distinct values
The database examines the column to determine all unique values. For instance, if the column is Department, the values may be HR, IT, Sales, and Finance. A separate bitmap is created for each value.
2. Assign a bit position to every row
Each row in the table gets a fixed position in the bitmap. If the table has 10 rows, each bitmap has 10 bits. The bit at a given position is set to 1 only if that row contains the corresponding value.
3. Evaluate queries using bitwise logic
When a query is submitted, the database retrieves the relevant bitmaps and combines them with operations like AND, OR, and NOT. The resulting bitmap points to the qualifying rows, which are then fetched from the table if necessary.
Example workflow:
Table rows: 1 2 3 4 5 6
Department:
HR 1 0 0 1 0 0
IT 0 1 0 0 1 0
Sales 0 0 1 0 0 1
Query:
WHERE Department = 'IT'
Result bitmap:
0 1 0 0 1 0
The rows at positions 2 and 5 match the query.
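The same workflow can be traced end to end in a short Python sketch. It assumes integers as bit vectors (bit 0 is row 1) and a hypothetical helper rows_of; the NOT case shows one practical detail, namely that the complement must be masked to the number of rows.

```python
NUM_ROWS = 6
department = ["HR", "IT", "Sales", "HR", "IT", "Sales"]  # rows 1..6

# Steps 1 and 2: one bitmap per distinct value, one bit position per row.
index = {}
for row, value in enumerate(department):
    index[value] = index.get(value, 0) | (1 << row)

# Mask with a 1 for every existing row, needed to keep NOT within the table.
ALL_ROWS = (1 << NUM_ROWS) - 1


def rows_of(bitmap):
    """Return the 1-based row numbers whose bit is set."""
    return [row + 1 for row in range(NUM_ROWS) if (bitmap >> row) & 1]


# Step 3: evaluate predicates with bitwise logic.
it_rows = rows_of(index["IT"])                      # Department = 'IT'
hr_or_sales = rows_of(index["HR"] | index["Sales"]) # Department IN ('HR', 'Sales')
not_it = rows_of(~index["IT"] & ALL_ROWS)           # Department <> 'IT'

print(it_rows)      # [2, 5]
print(hr_or_sales)  # [1, 3, 4, 6]
print(not_it)       # [1, 3, 4, 6]
```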
Advantages / Applications
- Very fast query evaluation for low-cardinality attributes and analytical workloads
- Efficient combination of multiple conditions using bitwise AND, OR, and NOT
- Reduced disk I/O and improved performance in data warehousing and decision support systems
Bitmap indices are widely used in business intelligence, reporting systems, online analytical processing (OLAP), and large-scale data warehouses. They are especially effective when queries involve filtering by multiple dimensions such as time, location, product type, and customer segment. They also help with star-schema queries, where fact tables are filtered by several dimension attributes.
Another major advantage is compactness when compressed. Since many bitmaps contain long runs of repeated bits, compression can make them very storage-efficient. This is one reason they are attractive for large datasets where query speed matters more than frequent updates.
Summary
- Bitmap indices store one bit vector per distinct column value.
- They make query processing fast by using bitwise operations on bitmaps.
- They are best suited for low-cardinality, read-intensive analytical databases.
- Important terms to remember: bitmap index, bit vector, low cardinality, bitwise AND, bitwise OR, OLAP, compressed bitmap