WHAT IS ETL?
“EXTRACT … TRANSFORM … LOAD”



Eng. Ismail El Gayar
Software Engineer
WHY ETL?

• Companies need a way to analyze their data in order to make critical business decisions.
• A transactional database is not designed to answer complex business questions.
• A data warehouse provides a common data repository.
• ETL provides a method of moving data from various sources into a data warehouse.
ETL CONCEPT
• A company's data may be scattered across different locations and stored in different formats.
• ETL allows you to:
    ◦ Migrate the data into a data warehouse.
    ◦ Convert the various formats and types to adhere to one consistent system.
• ETL is a predefined process for accessing and manipulating source data and loading it into a target database.
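
As a small illustration of converting various formats to one consistent system, the sketch below normalizes date strings to ISO 8601. The list of source formats and the function name `to_iso_date` are assumptions made for this example, not part of the original slides. Slash-separated dates are assumed to be day-first.

```python
from datetime import datetime

# Hypothetical formats used by different source systems (an assumption).
SOURCE_DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%Y%m%d"]


def to_iso_date(raw: str) -> str:
    """Try each known source format and return the date as YYYY-MM-DD."""
    text = raw.strip()
    for fmt in SOURCE_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


if __name__ == "__main__":
    for value in ["2024-03-05", "05/03/2024", "20240305"]:
        print(value, "->", to_iso_date(value))
```

Values that match none of the listed formats raise an error instead of being guessed, so bad source data is caught before it reaches the warehouse.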
ETL REQUIREMENTS
• Any ETL architecture must meet the following requirements:
    ◦ Business Requirements
    ◦ Compliance Requirements
    ◦ Data Profiling
    ◦ Data Security
    ◦ Data Integration
    ◦ Right Data at the Right Time
    ◦ Archiving & Lineage
    ◦ Final End User Delivery Interface
    ◦ Available Skills
    ◦ Legacy Licenses
    ◦ Alignment with the overall Enterprise Architecture
THE ETL PROCESS

• Extract: the process of reading data from a database.
• Transform: the process of converting data from one form to another.
• Load: the process of writing data into the target database.
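
The three steps can be sketched end to end with Python's built-in `sqlite3` module. This is only a minimal illustration: the table names (`orders`, `fact_orders`), the columns, and the sample rows are hypothetical, and both the source and the target are in-memory databases.

```python
import sqlite3


def extract(source):
    """Extract: read raw rows from the source database."""
    return source.execute("SELECT id, name, amount_cents FROM orders").fetchall()


def transform(rows):
    """Transform: tidy customer names and convert cents to currency units."""
    return [(row_id, name.strip().title(), cents / 100)
            for row_id, name, cents in rows]


def load(target, rows):
    """Load: write the transformed rows into the target database."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    target.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    target.commit()


def run_demo():
    # Hypothetical source system with two raw orders.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, name TEXT, amount_cents INTEGER)")
    source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                       [(1, "  alice ", 1250), (2, "BOB", 300)])
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT id, customer, amount FROM fact_orders ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Keeping each step in its own function mirrors the slide: any one stage can be replaced (a different source, a different rule set, a different target) without touching the other two.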
EXTRACT

• Gathering the data
    ◦ Raw data that was written directly to disk
    ◦ Data written to flat files or relational tables from structured source systems
    ◦ Data can be read multiple times, if needed.

• Cleansing the data
    ◦ Eliminate duplicate or fragmented data
    ◦ Exclude unwanted or unneeded information
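
The cleansing step might look like the following sketch. The field names (`id`, `email`), the rule that a record missing a required field counts as fragmented, and the function name `cleanse` are all assumptions chosen for illustration.

```python
def cleanse(records, required=("id", "email")):
    """Drop fragmented and duplicate records and keep only the needed fields."""
    seen = set()
    clean = []
    for rec in records:
        # Fragmented record: a required field is missing or empty.
        if any(rec.get(field) in (None, "") for field in required):
            continue
        # Duplicate record: same id and same email, ignoring letter case.
        key = (rec["id"], rec["email"].lower())
        if key in seen:
            continue
        seen.add(key)
        # Exclude unneeded information by copying only the required fields.
        clean.append({field: rec[field] for field in required})
    return clean


if __name__ == "__main__":
    raw = [
        {"id": 1, "email": "a@x.com", "debug": "trace"},
        {"id": 1, "email": "A@X.com"},   # duplicate of the first record
        {"id": 2, "email": ""},          # fragmented: empty email
        {"id": 3, "email": "c@x.com"},
    ]
    print(cleanse(raw))
```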
TRANSFORM

• Preparing the data to be housed in the data warehouse.
• Converting the extracted data
    ◦ Using rules and lookup tables
    ◦ Combining data
    ◦ Verification/validity checks
    ◦ Standardization
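
A hedged sketch of these transformation steps for a single row is shown below. The lookup table contents, the field names, and the rule that amounts must be non-negative are hypothetical choices for this example.

```python
# Hypothetical lookup table mapping free-text country values to ISO codes.
COUNTRY_LOOKUP = {
    "us": "US", "usa": "US", "united states": "US",
    "eg": "EG", "egypt": "EG",
}


def transform_row(row):
    """Apply lookup, validity checks, and standardization to one row."""
    # Lookup table: translate the source value into a standard code.
    code = COUNTRY_LOOKUP.get(row["country"].strip().lower())
    if code is None:
        raise ValueError(f"unknown country: {row['country']!r}")
    # Validity check: the amount must parse as a non-negative number.
    amount = float(row["amount"])
    if amount < 0:
        raise ValueError(f"negative amount: {amount}")
    # Standardization: consistent name casing and two-decimal amounts.
    return {
        "customer": row["customer"].strip().title(),
        "country": code,
        "amount": round(amount, 2),
    }


if __name__ == "__main__":
    print(transform_row({"customer": " ismail ", "country": "Egypt", "amount": "19.999"}))
```

Rows that fail a check raise an error here. A real pipeline might instead route them to a reject table for review, which is a design choice outside the scope of the slides.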
LOAD

• Storing the transformed data in the data warehouse.

• Batch or real-time processing

• The warehouse can follow a star schema or a snowflake schema.
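
A batch load into a very small star schema could be sketched as follows, again using the built-in `sqlite3` module. The schema (one `dim_customer` dimension table and one `fact_sales` fact table) and the function name `load_star` are assumptions for illustration only.

```python
import sqlite3


def load_star(conn, rows):
    """Batch-load (customer name, amount) pairs into a tiny star schema."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS dim_customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT UNIQUE
        );
        CREATE TABLE IF NOT EXISTS fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            amount      REAL
        );
    """)
    with conn:  # the whole batch is committed (or rolled back) together
        for name, amount in rows:
            # Insert the dimension row only if this customer is new.
            conn.execute(
                "INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (name,)
            )
            (customer_id,) = conn.execute(
                "SELECT customer_id FROM dim_customer WHERE name = ?", (name,)
            ).fetchone()
            # The fact row references the dimension through its key.
            conn.execute(
                "INSERT INTO fact_sales (customer_id, amount) VALUES (?, ?)",
                (customer_id, amount),
            )


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    load_star(warehouse, [("Alice", 10.0), ("Bob", 5.0), ("Alice", 2.5)])
    print(warehouse.execute(
        "SELECT d.name, SUM(f.amount) FROM fact_sales f "
        "JOIN dim_customer d USING (customer_id) "
        "GROUP BY d.name ORDER BY d.name"
    ).fetchall())
```

A snowflake schema would further normalize the dimension table into additional related tables; the loading pattern stays the same.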
ETL FLOW

[Figure: ETL flow diagram]
ADVANTAGES OF ETL TOOLS
• Simpler, faster, and cheaper development
• Most ETL tools provide a metadata repository that synchronizes metadata from various sources.
• Most ETL tools deliver good performance, even for very large datasets.
• Most ETL tools provide impact analysis for any proposed schema changes.
• Most ETL tools have built-in connectors for all the major RDBMSs.
ADVANTAGES OF ETL TOOLS (CONTINUED)
• Most ETL tools allow reuse of existing complex programs.
• Several ETL tools offer a visual development environment.
• Most ETL tools offer built-in schedulers, sequencers, and documentation.
• Several ETL tools offer performance optimization options such as parallel processing and complex load balancing.
POPULAR ETL TOOLS

Tool                                         Company
InfoSphere DataStage                         IBM
Informatica                                  Informatica Corp
DT/Studio                                    Embarcadero Technologies
Ab Initio                                    Ab Initio Software Corp
Oracle Warehouse Builder                     Oracle
Microsoft SQL Server Integration Services    Microsoft
Transformation Manager                       ETL Solutions
THANK YOU
