Data

Data in Business Intelligence

Definition

Data refers to raw, unorganized facts, figures, symbols, or observations that represent the physical world or digital events. In the context of Business Intelligence (BI), data serves as the foundational raw material that, when processed, analyzed, and interpreted, provides the insights necessary for informed business decision-making.


Main Content

1. Data Hierarchy (DIKW Pyramid)

  • Data acts as the base layer; it is objective and unorganized (e.g., a list of daily transaction amounts).
  • Information is the next level, created by giving context to data (e.g., "Total sales for Tuesday were $5,000").
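  • Knowledge is the application of information: understanding why a pattern occurs (e.g., "We sell more on Tuesdays because of our weekly discount").
  • Wisdom is the top level, where knowledge guides action (e.g., deciding whether to extend the weekly discount to other days).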
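
The step from data to information can be sketched in a few lines of Python. This is only an illustration: the transaction values and the helper name `total_by_day` are made up for the example, chosen so that the result matches the Tuesday figure quoted above.

```python
from collections import defaultdict

# Data: raw (weekday, amount) pairs. The values are invented for illustration.
transactions = [
    ("Mon", 1200.0),
    ("Tue", 3000.0),
    ("Tue", 2000.0),
    ("Wed", 900.0),
]


def total_by_day(rows):
    """Information: give the raw amounts context by summing them per weekday."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day] += amount
    return dict(totals)


totals = total_by_day(transactions)
print(f"Total sales for Tuesday were ${totals['Tue']:,.0f}")

# Finding the strongest day is still information. Knowledge is the explanation
# a person attaches to it, such as linking the peak to a weekly discount.
best_day = max(totals, key=totals.get)
print(f"Highest-selling day: {best_day}")
```

The code stops at the information level on purpose: explaining why Tuesday is the strongest day needs business context that is not in the numbers themselves.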

2. Types of Data

  • Structured Data: Highly organized data that fits neatly into rows and columns, such as tables in SQL databases or Excel spreadsheets (e.g., Customer IDs, product prices).
  • Unstructured Data: Data that lacks a pre-defined format, such as social media posts, emails, videos, and images. It requires additional processing (for example, text or image analysis) before it can be analyzed.
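
A small Python sketch can make the difference concrete. The records, the sample post, and the price pattern below are all assumptions made for the example. The point is that structured fields can be used directly, while free text has to be parsed first.

```python
import re

# Structured: every record has the same named fields, so a calculation
# can read the "price" column directly.
products = [
    {"customer_id": 101, "product": "mug", "price": 8.50},
    {"customer_id": 102, "product": "lamp", "price": 24.00},
]
average_price = sum(p["price"] for p in products) / len(products)

# Unstructured: free text has no fields. Values must first be extracted,
# here with a simple pattern for dollar amounts.
post = "Loved the lamp, but $24 felt steep. The mug at $8.50 is a bargain!"
prices_mentioned = [float(m) for m in re.findall(r"\$(\d+(?:\.\d+)?)", post)]

print(average_price)
print(prices_mentioned)
```

A regular expression is a very light form of processing. Real social media text, images, or video usually need far heavier techniques, which is why unstructured data is harder to bring into BI tools.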

3. Data Characteristics (The Three Vs)

  • Volume: The massive scale of data generated every second that modern BI systems must handle.
  • Variety: The diverse formats of data, ranging from sensor logs to customer feedback forms.
  • Velocity: The speed at which new data is generated and must be processed to remain relevant.

Working / Process

1. Data Collection

  • Organizations gather data from various sources such as Point of Sale (POS) systems, CRM software, and website cookies.
  • This step focuses on ensuring the accuracy of incoming data to avoid "garbage in, garbage out" scenarios.
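
One way to guard against "garbage in, garbage out" is to check each record as it arrives. The sketch below is a minimal example under assumed conditions: the field names `store_id` and `amount`, the rules, and the sample records are hypothetical and do not come from any specific POS or CRM system.

```python
def validate_record(record):
    """Return a list of problems found in one incoming record (empty if clean)."""
    problems = []
    if not record.get("store_id"):
        problems.append("missing store_id")
    amount = record.get("amount")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        problems.append("amount is not a number")
    elif amount < 0:
        problems.append("amount is negative")
    return problems


# Hypothetical incoming records, three of them deliberately faulty.
incoming = [
    {"store_id": "S1", "amount": 19.99},
    {"store_id": "", "amount": 5.00},
    {"store_id": "S2", "amount": "n/a"},
    {"store_id": "S3", "amount": -4.0},
]

accepted = [r for r in incoming if not validate_record(r)]
rejected = [(r, validate_record(r)) for r in incoming if validate_record(r)]

print(len(accepted), "accepted,", len(rejected), "rejected")
```

Keeping the rejected records together with their reasons, rather than silently dropping them, makes it possible to trace problems back to the source system.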

2. Data Processing and Cleaning

  • Raw data is transformed into a usable format by removing duplicates, correcting errors, and handling missing values.
  • This ensures the data is consistent and reliable for BI analytical tools.
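
The three cleaning tasks named above can be sketched with the Python standard library. The sample rows and field names are assumptions for illustration. Filling gaps with the median is just one possible strategy; dropping the row or flagging it for review are equally valid choices.

```python
from statistics import median

# Hypothetical raw rows containing a duplicate, inconsistent text, and a gap.
raw = [
    {"order_id": 1, "region": " north ", "amount": 120.0},
    {"order_id": 1, "region": " north ", "amount": 120.0},  # duplicate row
    {"order_id": 2, "region": "SOUTH", "amount": None},     # missing value
    {"order_id": 3, "region": "South", "amount": 80.0},
]


def clean(rows):
    # 1. Remove duplicates, keeping the first row seen for each order_id.
    seen, unique = set(), []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            unique.append(dict(row))  # copy so the raw input is not modified

    # 2. Correct inconsistent formatting in the region field.
    for row in unique:
        row["region"] = row["region"].strip().title()

    # 3. Fill missing amounts with the median of the known amounts.
    #    This assumes at least one amount is present.
    known = [r["amount"] for r in unique if r["amount"] is not None]
    fill = median(known)
    for row in unique:
        if row["amount"] is None:
            row["amount"] = fill
    return unique


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

After cleaning, " north ", "SOUTH", and "South" collapse into two consistent region names, so a later report grouped by region will not split one region into several rows.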

3. Data Visualization and Analysis

  • Processed data is converted into visual formats like charts, heatmaps, and dashboards.
  • This allows stakeholders to identify patterns and trends quickly.

[Data Source] --> [Cleaning] --> [Analysis] --> [Dashboard/Insight]
    (Raw)         (Refined)      (Pattern)          (Decision)
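
The last two stages of the flow, analysis and dashboard, can be sketched as follows. This is a toy example under stated assumptions: the sales figures are invented, and a text bar chart stands in for a real dashboard tool.

```python
from collections import Counter

# Hypothetical cleaned (weekday, amount) rows, i.e. the output of the
# cleaning stage.
cleaned_sales = [("Mon", 1200), ("Tue", 5000), ("Wed", 900), ("Tue", 500)]


def analyze(rows):
    """Analysis: aggregate amounts per weekday to expose the pattern."""
    totals = Counter()
    for day, amount in rows:
        totals[day] += amount
    return totals


def render_bar_chart(totals, width=20):
    """Dashboard stand-in: one text bar per day, scaled to the largest total."""
    peak = max(totals.values())
    lines = []
    for day, total in totals.most_common():  # largest total first
        bar = "#" * round(total / peak * width)
        lines.append(f"{day} {bar} {total}")
    return "\n".join(lines)


totals = analyze(cleaned_sales)
chart = render_bar_chart(totals)
print(chart)
```

Sorting the bars from largest to smallest is a deliberate choice: the strongest day appears first, which is the kind of pattern a stakeholder should be able to spot at a glance.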

Advantages / Applications

  • Enhanced Decision Making: Businesses move from intuition-based decisions to evidence-based strategies.
  • Operational Efficiency: Identifying bottlenecks in supply chains or production lines through real-time data monitoring.
  • Customer Personalization: Using behavioral data to provide customized recommendations and targeted marketing campaigns.

Summary

Data is the essential raw input for Business Intelligence, serving as the building block for turning organizational activities into actionable knowledge. By collecting, cleaning, and visualizing structured and unstructured data, companies can improve operational performance and gain a competitive advantage in the marketplace. Important terms to remember include Structured Data, Unstructured Data, Data Cleaning, and the DIKW Pyramid.