4.1.2 Data Transformation

Once the raw data is available, it must be processed before it can be used for analysis.

In this phase the data is verified, transformed, filtered, and consolidated for its intended analytical use case.

This phase can involve the following tasks:

  • Filtering, cleansing, de-duplicating, validating, and authenticating the data.
  • Enriching the data with metadata.
  • Performing calculations, translations, or summarizations based on the raw data. This can include changing row and column headers for consistency, converting currencies or other units of measurement, editing text strings, and more.
  • Conducting audits to ensure data quality and compliance.
  • Removing, encrypting, or protecting data governed by industry or governmental regulators.
  • Formatting the data into tables or joined tables to match the schema of the target data warehouse.
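
Several of the tasks above can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions, not a reference implementation: the column names, the COLUMN_MAP and RATES_TO_USD tables, the exchange rate, and the transform function are all hypothetical and chosen for the example. It renames headers, validates and de-duplicates rows, converts currencies, masks a sensitive field, enriches each row with metadata, and emits rows shaped for an assumed target schema.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical exchange rates to USD; real rates would come from a reference source.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}

# Hypothetical mapping from raw column headers to the target schema's column names.
COLUMN_MAP = {
    "id": "order_id",
    "Amt": "amount",
    "Cur": "currency",
    "Cust Email": "customer_email",
}


def transform(raw_rows, source_name):
    """Rename, validate, de-duplicate, convert, mask, and enrich raw rows."""
    seen_ids = set()
    clean = []
    for raw in raw_rows:
        # Change column headers for consistency.
        row = {COLUMN_MAP.get(key, key): value for key, value in raw.items()}

        # Validate: require an id, a known currency, and a numeric amount.
        if row.get("order_id") is None or row.get("currency") not in RATES_TO_USD:
            continue
        try:
            amount = float(row["amount"])
        except (KeyError, TypeError, ValueError):
            continue

        # De-duplicate on the order id, keeping the first occurrence.
        if row["order_id"] in seen_ids:
            continue
        seen_ids.add(row["order_id"])

        # Convert the amount to a single currency.
        amount_usd = round(amount * RATES_TO_USD[row["currency"]], 2)

        # Protect regulated data: store a hash instead of the raw email.
        # An unsalted hash is only a simple illustration of masking; it is
        # not sufficient protection on its own.
        email = str(row.get("customer_email", "")).strip().lower()
        email_hash = hashlib.sha256(email.encode("utf-8")).hexdigest()

        # Enrich with metadata and shape the row for the target table.
        clean.append({
            "order_id": row["order_id"],
            "amount_usd": amount_usd,
            "customer_email_hash": email_hash,
            "source": source_name,
            "loaded_at": datetime.now(timezone.utc).isoformat(),
        })
    return clean


sample_rows = [
    {"id": 1, "Amt": "10.00", "Cur": "EUR", "Cust Email": "A@Example.com "},
    {"id": 1, "Amt": "10.00", "Cur": "EUR", "Cust Email": "a@example.com"},  # duplicate id
    {"id": 2, "Amt": "abc", "Cur": "USD", "Cust Email": "b@example.com"},    # invalid amount
    {"id": 3, "Amt": "5", "Cur": "GBP", "Cust Email": "c@example.com"},      # unknown currency
    {"id": 4, "Amt": "7.5", "Cur": "USD", "Cust Email": "d@example.com"},
]

clean_rows = transform(sample_rows, "orders_export")
for clean_row in clean_rows:
    print(clean_row["order_id"], clean_row["amount_usd"], clean_row["source"])
```

Of the five sample rows, only the first and the last survive: the duplicate, the row with a non-numeric amount, and the row with an unknown currency are dropped. In a real pipeline, rejected rows would normally be logged or routed to a quarantine table so that they can be audited rather than silently discarded.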