SlideShare a Scribd company logo
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Automate pipelines:
Tools for building ETL pipelines
Mark Kromer
Microsoft Sr. Program Manager
Information Management
@kromerbigdata
ETL Pipeline Objectives
• Consume hybrid disparate data (Extract)
• On-prem + Cloud
• Structured, un-structured, semi-structured
• Calculate and format data for analytics (Transform)
• Transform, aggregate, join, normalize
• Address large-scale Big Data requirements (Scale / Load)
• Scale-up or Scale-out data movement and transformation
• Operationalize (Automate)
• Create pipeline orchestrations for different org requirements
• Manage & monitor multiple pipelines
Hybrid Data Integration Pattern 1:
Analyze blog comments
Azure SQL Database
SQL Server
(on-premises)
Data Management
Gateway Req’d for ADF
Azure Data Factory (PaaS)
Capture blog comments via API
Drop into Blob
Store
Copy & lookup
Power BI Dashboard
Visualize and analyze
SSIS (self-hosted)
Transform via SPROC (ELT)
Transform via Dataflow (ETL)
Hybrid Data Integration Pattern 2:
Sentiment Analysis with Machine Learning
Azure Data Factory
Power BI
Blob Storage
Azure Functions
Hybrid Data Integration Pattern 3:
Modern Data Warehouse
Daily flat files
OLTP DB
Tables
Analytical
Schemas
AML: Churn
Model
Customer Call Details
Azure Data Factory (PaaS)
SSIS (self-hosted)
Social Media
(un/semi structured)
SQL Server Integration Services (SSIS)
SSIS is a platform for building enterprise-grade data integration solutions
User-friendly code-free authoring / management client tools:
SQL Server Data Tools (SSDT) SQL Server Management Studio (SSMS)
Wealth of connectors + rich transformations to
Extract, Transform, and Load (ETL) data between various sources and destinations, on premises and in the cloud
Low Total Cost of Ownership (TCO)
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Azure Feature Pack
Provides Azure connectivity components for SSIS
1. Move / transfer hybrid data between various sources and destinations, on premises
and in Azure
2. Develop ELT workflows with Big Data transformations / processing in Azure
3. Combine traditional ETL and modern ELT workflows spanning on-premises and Azure
SSIS Azure Feature Pack Features
SSIS Azure Feature Pack contains:
1. Connection Managers
1. Azure Subscription Connection Manager
2. Azure Storage Connection Manager
3. ADLS Connection Manager (NEW)
2. Control Flows / Tasks
1. Azure Blob Upload / Download Tasks
2. Azure HDInsight Hive / Pig Tasks
3. Azure HDInsight Create / Delete Cluster Tasks
4. Azure SQL DW Upload Task (NEW)
3. Data Flows
1. Azure Blob Source / Destination
2. ADLS Source / Destination (NEW)
4. Azure Blob Enumerator
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
SQL DW + SSIS – Quick intro
SQL DW is Microsoft’s scale-out database in the
cloud
Built on Massively Parallel Processing (MPP)
architecture
Capable of processing huge volumes of relational and
non-relational data.
It divides data and processing capability across
multiple nodes
Control Node receives, optimizes, and distributes
requests to Compute Nodes that work in parallel.
There are 2 ways to load data into SQL DW using
SSIS:
Front-loading through Control Node with data flows
Back-loading through Compute Nodes with PolyBase
Azure SQL DW Upload Task – Typical scenario
Azure Blob Storage SQL DW
Cloud
On-Premise
SQL Server Flat File SSIS Machine
0. Export to a flat file
1. Upload to Azure Blob
2. Create an external table
3. Trigger PolyBase to load data
Azure SQL DW Upload Task automates steps 1 – 2 – 3 below:
Azure SQL DW Upload Task
On Azure SQL DW Upload Task Editor, you can
1. Name and describe a create / insert table task
2. Select and configure UTF-8-encoded text file(s) as your data
source
3. Select and configure Azure Storage Connection Manager +
new / existing blob container as your data staging area
4. Select and configure ADO.NET Connection Manager for SQL
DW + new / existing table as your data destination
5. Map source and destination columns for the create / insert
table task
6. Define metadata / data types for source columns
Azure SQL DW Upload Task
Following configurations on Azure SQL
DW Upload Task Editor, T-SQL script
that triggers PolyBase to load data
from your Azure Blob Storage into SQL
DW will be automatically generated
You can still manually edit this auto-
generated T-SQL script to customize it
for your particular needs
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
RAW DATA,
DATA CHAOS
REFINED, ORGANIZED
DATA
DATA CLARITY,
BETTER DECISIONS
DATA MOVEMENT DATA TRANSFORMATIONS BUSINESS INTELLIGENCE
AZURE DATA FACTORY
HYBRID DATA INTEGRATION AT SCALE
Customer
Profitability
Sentiment
Analysis
Market
Basket
Analysis
Machine Learning, Big Data Analytics, SQL, NoSQL,
Data Warehouse , Data Lake
ADF: Orchestrate data services at scale with fully managed Data
Integration cloud service
PREPARE TRAN
AN
INGEST
SQL
<>
SQL
DATA SOURCES
{ }
SQL
• Create, schedule, orchestrate, and manage data pipelines
• Visualize data lineage
• Connect to on-premises and cloud data sources
• Monitor data pipeline health
• Automate cloud resource management
• Move relational data for Hadoop processing
• Transform with Hive, Pig, PySpark, SQL SPROC or custom code
Cloud Analytics – Common Challenges
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
ELT with Apache Spark Activity from ADF Pipeline
Create new pipeline
and HDI Cluster
Linked Service for
Spark from Azure
Portal
Invoke Python script
from Spark activity to
transform data at scale
Schedule, monitor and
manage pipeline from
ADF
Verify results and
perform analytics from
Jupyter notebooks /
PBI
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
ELT with Azure Data Lake from ADF Pipeline
Create new pipeline
and Azure Data Lake
Analytics Linked
Service from Azure
Portal
Perform data
transformations at
scale with U-SQL script
Schedule, monitor and
manage pipeline from
ADF
Verify results and
perform analytics from
ADLA
Monitor & Manage Pipelines
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Microsoft Data Integration Roadmap
• SSIS
• SQL Server 2017
• SSIS on Linux
• Scale-out
• ADF
• SSIS as a Cloud “Integration Runtime”
• Code-free web-based user experience
• Control Flow orchestration + Data Flow steps
• On-Demand Spark Cluster
Microsoft Data Integration Pipelines: Azure Data Factory and SSIS
Ad

More Related Content

What's hot (20)

Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
Databricks
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
Mark Kromer
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
BizTalk360
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview SlidesMicrosoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Mark Kromer
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
Databricks
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
James Serra
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
David Giard
 
Azure Data Engineering.pptx
Azure Data Engineering.pptxAzure Data Engineering.pptx
Azure Data Engineering.pptx
priyadharshini626440
 
Azure datafactory
Azure datafactoryAzure datafactory
Azure datafactory
Dimko Zhluktenko
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
James Serra
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineering
Novita Sari
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
Sergio Zenatti Filho
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
Snowflake Computing
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
Michael Rainey
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
James Serra
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
Kent Graziano
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
inovex GmbH
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
Data Con LA
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
Databricks
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
Databricks
 
Azure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the CloudAzure Data Factory ETL Patterns in the Cloud
Azure Data Factory ETL Patterns in the Cloud
Mark Kromer
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
BizTalk360
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview SlidesMicrosoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Mark Kromer
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
Databricks
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
James Serra
 
Azure data factory
Azure data factoryAzure data factory
Azure data factory
David Giard
 
Azure data platform overview
Azure data platform overviewAzure data platform overview
Azure data platform overview
James Serra
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
Summary introduction to data engineering
Summary introduction to data engineeringSummary introduction to data engineering
Summary introduction to data engineering
Novita Sari
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
Snowflake Computing
 
Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
Michael Rainey
 
Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)Azure Synapse Analytics Overview (r1)
Azure Synapse Analytics Overview (r1)
James Serra
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
Kent Graziano
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
inovex GmbH
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
Data Con LA
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
Databricks
 

Viewers also liked (14)

Semantics for food and agriculture: the GODAN Action map of data standards
Semantics for food and agriculture: the GODAN Action map of data standardsSemantics for food and agriculture: the GODAN Action map of data standards
Semantics for food and agriculture: the GODAN Action map of data standards
Valeria Pesce
 
Inventory of data standards for food & agriculture
Inventory of data standards for food & agricultureInventory of data standards for food & agriculture
Inventory of data standards for food & agriculture
Valeria Pesce
 
Sharing Agricultural Events Information: When and where is that workshop?
Sharing Agricultural Events Information: When and where is that workshop?Sharing Agricultural Events Information: When and where is that workshop?
Sharing Agricultural Events Information: When and where is that workshop?
Gauri Salokhe
 
How to describe a dataset. Interoperability issues
How to describe a dataset. Interoperability issuesHow to describe a dataset. Interoperability issues
How to describe a dataset. Interoperability issues
Valeria Pesce
 
The agINFRA Linked Data layer
The agINFRA Linked Data layerThe agINFRA Linked Data layer
The agINFRA Linked Data layer
Valeria Pesce
 
Semantic challenges in sharing dataset metadata and creating federated datase...
Semantic challenges in sharing dataset metadata and creating federated datase...Semantic challenges in sharing dataset metadata and creating federated datase...
Semantic challenges in sharing dataset metadata and creating federated datase...
Valeria Pesce
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
Valeria Pesce
 
Data Modeling & Data Integration
Data Modeling & Data IntegrationData Modeling & Data Integration
Data Modeling & Data Integration
DATAVERSITY
 
Attivio Predictions 2017
Attivio Predictions 2017Attivio Predictions 2017
Attivio Predictions 2017
Attivio
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabularies
Valeria Pesce
 
The path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesThe path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial Services
Hortonworks
 
A global linked and open data infrastructure for agricultural development
A global linked and open data infrastructure for agricultural developmentA global linked and open data infrastructure for agricultural development
A global linked and open data infrastructure for agricultural development
Valeria Pesce
 
Cognitive Search for Knowledge Management
Cognitive Search for Knowledge ManagementCognitive Search for Knowledge Management
Cognitive Search for Knowledge Management
Attivio
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612
Mark Tabladillo
 
Semantics for food and agriculture: the GODAN Action map of data standards
Semantics for food and agriculture: the GODAN Action map of data standardsSemantics for food and agriculture: the GODAN Action map of data standards
Semantics for food and agriculture: the GODAN Action map of data standards
Valeria Pesce
 
Inventory of data standards for food & agriculture
Inventory of data standards for food & agricultureInventory of data standards for food & agriculture
Inventory of data standards for food & agriculture
Valeria Pesce
 
Sharing Agricultural Events Information: When and where is that workshop?
Sharing Agricultural Events Information: When and where is that workshop?Sharing Agricultural Events Information: When and where is that workshop?
Sharing Agricultural Events Information: When and where is that workshop?
Gauri Salokhe
 
How to describe a dataset. Interoperability issues
How to describe a dataset. Interoperability issuesHow to describe a dataset. Interoperability issues
How to describe a dataset. Interoperability issues
Valeria Pesce
 
The agINFRA Linked Data layer
The agINFRA Linked Data layerThe agINFRA Linked Data layer
The agINFRA Linked Data layer
Valeria Pesce
 
Semantic challenges in sharing dataset metadata and creating federated datase...
Semantic challenges in sharing dataset metadata and creating federated datase...Semantic challenges in sharing dataset metadata and creating federated datase...
Semantic challenges in sharing dataset metadata and creating federated datase...
Valeria Pesce
 
Data discovery through federated dataset catalogs
Data discovery through federated dataset catalogsData discovery through federated dataset catalogs
Data discovery through federated dataset catalogs
Valeria Pesce
 
Data Modeling & Data Integration
Data Modeling & Data IntegrationData Modeling & Data Integration
Data Modeling & Data Integration
DATAVERSITY
 
Attivio Predictions 2017
Attivio Predictions 2017Attivio Predictions 2017
Attivio Predictions 2017
Attivio
 
Dataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabulariesDataset description: DCAT and other vocabularies
Dataset description: DCAT and other vocabularies
Valeria Pesce
 
The path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesThe path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial Services
Hortonworks
 
A global linked and open data infrastructure for agricultural development
A global linked and open data infrastructure for agricultural developmentA global linked and open data infrastructure for agricultural development
A global linked and open data infrastructure for agricultural development
Valeria Pesce
 
Cognitive Search for Knowledge Management
Cognitive Search for Knowledge ManagementCognitive Search for Knowledge Management
Cognitive Search for Knowledge Management
Attivio
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612
Mark Tabladillo
 
Ad

Similar to Microsoft Data Integration Pipelines: Azure Data Factory and SSIS (20)

SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
Mark Kromer
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
James Serra
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure Infrastructures
CCG
 
Azure SQL DWH
Azure SQL DWHAzure SQL DWH
Azure SQL DWH
Shy Engelberg
 
Afternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesAfternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data Services
CCG
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
CCG
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the Cloud
Mark Kromer
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
James Serra
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data Week
Mark Kromer
 
SQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any DataSQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any Data
Stéphane Fréchette
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
Alex Ivy
 
Day 1 - Technical Bootcamp azure synapse analytics
Day 1 - Technical Bootcamp azure synapse analyticsDay 1 - Technical Bootcamp azure synapse analytics
Day 1 - Technical Bootcamp azure synapse analytics
Armand272
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Trivadis
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)
Michael Rys
 
Microsoft Fabric data warehouse by dataplatr
Microsoft Fabric data warehouse by dataplatrMicrosoft Fabric data warehouse by dataplatr
Microsoft Fabric data warehouse by dataplatr
ajaykumar405166
 
AZURE Data Related Services
AZURE Data Related ServicesAZURE Data Related Services
AZURE Data Related Services
Ruslan Drahomeretskyy
 
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive sessionMicrosoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Travis Wright
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Lace Lofranco
 
Azure SQL Data Warehouse
Azure SQL Data Warehouse Azure SQL Data Warehouse
Azure SQL Data Warehouse
Antonios Chatzipavlis
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
Ross McNeely
 
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the CloudSQL Saturday Redmond 2019 ETL Patterns in the Cloud
SQL Saturday Redmond 2019 ETL Patterns in the Cloud
Mark Kromer
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
James Serra
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure Infrastructures
CCG
 
Afternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data ServicesAfternoons with Azure - Azure Data Services
Afternoons with Azure - Azure Data Services
CCG
 
Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage Data Analytics Meetup: Introduction to Azure Data Lake Storage
Data Analytics Meetup: Introduction to Azure Data Lake Storage
CCG
 
Microsoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the CloudMicrosoft Azure BI Solutions in the Cloud
Microsoft Azure BI Solutions in the Cloud
Mark Kromer
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
James Serra
 
Azure Data Factory for Azure Data Week
Azure Data Factory for Azure Data WeekAzure Data Factory for Azure Data Week
Azure Data Factory for Azure Data Week
Mark Kromer
 
SQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any DataSQL Server 2014 Faster Insights from Any Data
SQL Server 2014 Faster Insights from Any Data
Stéphane Fréchette
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
Alex Ivy
 
Day 1 - Technical Bootcamp azure synapse analytics
Day 1 - Technical Bootcamp azure synapse analyticsDay 1 - Technical Bootcamp azure synapse analytics
Day 1 - Technical Bootcamp azure synapse analytics
Armand272
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Trivadis
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)
Michael Rys
 
Microsoft Fabric data warehouse by dataplatr
Microsoft Fabric data warehouse by dataplatrMicrosoft Fabric data warehouse by dataplatr
Microsoft Fabric data warehouse by dataplatr
ajaykumar405166
 
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive sessionMicrosoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Microsoft ignite 2018 SQL server 2019 big data clusters - deep dive session
Travis Wright
 
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Microsoft Ignite AU 2017 - Orchestrating Big Data Pipelines with Azure Data F...
Lace Lofranco
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
Ross McNeely
 
Ad

More from Mark Kromer (20)

Fabric Data Factory Pipeline Copy Perf Tips.pptx
Fabric Data Factory Pipeline Copy Perf Tips.pptxFabric Data Factory Pipeline Copy Perf Tips.pptx
Fabric Data Factory Pipeline Copy Perf Tips.pptx
Mark Kromer
 
Build data quality rules and data cleansing into your data pipelines
Build data quality rules and data cleansing into your data pipelinesBuild data quality rules and data cleansing into your data pipelines
Build data quality rules and data cleansing into your data pipelines
Mark Kromer
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
Mark Kromer
 
Data cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flowsData cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flows
Mark Kromer
 
Data cleansing and data prep with synapse data flows
Data cleansing and data prep with synapse data flowsData cleansing and data prep with synapse data flows
Data cleansing and data prep with synapse data flows
Mark Kromer
 
Mapping Data Flows Training April 2021
Mapping Data Flows Training April 2021Mapping Data Flows Training April 2021
Mapping Data Flows Training April 2021
Mark Kromer
 
Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021
Mark Kromer
 
Data Lake ETL in the Cloud with ADF
Data Lake ETL in the Cloud with ADFData Lake ETL in the Cloud with ADF
Data Lake ETL in the Cloud with ADF
Mark Kromer
 
Azure Data Factory Data Wrangling with Power Query
Azure Data Factory Data Wrangling with Power QueryAzure Data Factory Data Wrangling with Power Query
Azure Data Factory Data Wrangling with Power Query
Mark Kromer
 
Azure Data Factory Data Flow Performance Tuning 101
Azure Data Factory Data Flow Performance Tuning 101Azure Data Factory Data Flow Performance Tuning 101
Azure Data Factory Data Flow Performance Tuning 101
Mark Kromer
 
Data Quality Patterns in the Cloud with ADF
Data Quality Patterns in the Cloud with ADFData Quality Patterns in the Cloud with ADF
Data Quality Patterns in the Cloud with ADF
Mark Kromer
 
Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)
Mark Kromer
 
Data quality patterns in the cloud with ADF
Data quality patterns in the cloud with ADFData quality patterns in the cloud with ADF
Data quality patterns in the cloud with ADF
Mark Kromer
 
Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005
Mark Kromer
 
Data Quality Patterns in the Cloud with Azure Data Factory
Data Quality Patterns in the Cloud with Azure Data FactoryData Quality Patterns in the Cloud with Azure Data Factory
Data Quality Patterns in the Cloud with Azure Data Factory
Mark Kromer
 
ADF Mapping Data Flows Level 300
ADF Mapping Data Flows Level 300ADF Mapping Data Flows Level 300
ADF Mapping Data Flows Level 300
Mark Kromer
 
ADF Mapping Data Flows Training V2
ADF Mapping Data Flows Training V2ADF Mapping Data Flows Training V2
ADF Mapping Data Flows Training V2
Mark Kromer
 
ADF Mapping Data Flows Training Slides V1
ADF Mapping Data Flows Training Slides V1ADF Mapping Data Flows Training Slides V1
ADF Mapping Data Flows Training Slides V1
Mark Kromer
 
ADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview MigrationADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview Migration
Mark Kromer
 
Azure Data Factory Data Flow Limited Preview for January 2019
Azure Data Factory Data Flow Limited Preview for January 2019Azure Data Factory Data Flow Limited Preview for January 2019
Azure Data Factory Data Flow Limited Preview for January 2019
Mark Kromer
 
Fabric Data Factory Pipeline Copy Perf Tips.pptx
Fabric Data Factory Pipeline Copy Perf Tips.pptxFabric Data Factory Pipeline Copy Perf Tips.pptx
Fabric Data Factory Pipeline Copy Perf Tips.pptx
Mark Kromer
 
Build data quality rules and data cleansing into your data pipelines
Build data quality rules and data cleansing into your data pipelinesBuild data quality rules and data cleansing into your data pipelines
Build data quality rules and data cleansing into your data pipelines
Mark Kromer
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
Mark Kromer
 
Data cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flowsData cleansing and prep with synapse data flows
Data cleansing and prep with synapse data flows
Mark Kromer
 
Data cleansing and data prep with synapse data flows
Data cleansing and data prep with synapse data flowsData cleansing and data prep with synapse data flows
Data cleansing and data prep with synapse data flows
Mark Kromer
 
Mapping Data Flows Training April 2021
Mapping Data Flows Training April 2021Mapping Data Flows Training April 2021
Mapping Data Flows Training April 2021
Mark Kromer
 
Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021Mapping Data Flows Perf Tuning April 2021
Mapping Data Flows Perf Tuning April 2021
Mark Kromer
 
Data Lake ETL in the Cloud with ADF
Data Lake ETL in the Cloud with ADFData Lake ETL in the Cloud with ADF
Data Lake ETL in the Cloud with ADF
Mark Kromer
 
Azure Data Factory Data Wrangling with Power Query
Azure Data Factory Data Wrangling with Power QueryAzure Data Factory Data Wrangling with Power Query
Azure Data Factory Data Wrangling with Power Query
Mark Kromer
 
Azure Data Factory Data Flow Performance Tuning 101
Azure Data Factory Data Flow Performance Tuning 101Azure Data Factory Data Flow Performance Tuning 101
Azure Data Factory Data Flow Performance Tuning 101
Mark Kromer
 
Data Quality Patterns in the Cloud with ADF
Data Quality Patterns in the Cloud with ADFData Quality Patterns in the Cloud with ADF
Data Quality Patterns in the Cloud with ADF
Mark Kromer
 
Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)Azure Data Factory Data Flows Training (Sept 2020 Update)
Azure Data Factory Data Flows Training (Sept 2020 Update)
Mark Kromer
 
Data quality patterns in the cloud with ADF
Data quality patterns in the cloud with ADFData quality patterns in the cloud with ADF
Data quality patterns in the cloud with ADF
Mark Kromer
 
Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005Azure Data Factory Data Flows Training v005
Azure Data Factory Data Flows Training v005
Mark Kromer
 
Data Quality Patterns in the Cloud with Azure Data Factory
Data Quality Patterns in the Cloud with Azure Data FactoryData Quality Patterns in the Cloud with Azure Data Factory
Data Quality Patterns in the Cloud with Azure Data Factory
Mark Kromer
 
ADF Mapping Data Flows Level 300
ADF Mapping Data Flows Level 300ADF Mapping Data Flows Level 300
ADF Mapping Data Flows Level 300
Mark Kromer
 
ADF Mapping Data Flows Training V2
ADF Mapping Data Flows Training V2ADF Mapping Data Flows Training V2
ADF Mapping Data Flows Training V2
Mark Kromer
 
ADF Mapping Data Flows Training Slides V1
ADF Mapping Data Flows Training Slides V1ADF Mapping Data Flows Training Slides V1
ADF Mapping Data Flows Training Slides V1
Mark Kromer
 
ADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview MigrationADF Mapping Data Flow Private Preview Migration
ADF Mapping Data Flow Private Preview Migration
Mark Kromer
 
Azure Data Factory Data Flow Limited Preview for January 2019
Azure Data Factory Data Flow Limited Preview for January 2019Azure Data Factory Data Flow Limited Preview for January 2019
Azure Data Factory Data Flow Limited Preview for January 2019
Mark Kromer
 

Recently uploaded (20)

GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
James Anderson
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
CSUC - Consorci de Serveis Universitaris de Catalunya
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
Viam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdfViam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdf
camilalamoratta
 
Agentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community MeetupAgentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community Meetup
Manoj Batra (1600 + Connections)
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
James Anderson
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptxTop 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
Top 5 Benefits of Using Molybdenum Rods in Industrial Applications.pptx
mkubeusa
 
Building the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdfBuilding the Customer Identity Community, Together.pdf
Building the Customer Identity Community, Together.pdf
Cheryl Hung
 
Dark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanizationDark Dynamism: drones, dark factories and deurbanization
Dark Dynamism: drones, dark factories and deurbanization
Jakub Šimek
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
machines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdfmachines-for-woodworking-shops-en-compressed.pdf
machines-for-woodworking-shops-en-compressed.pdf
AmirStern2
 
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptxReimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
Reimagine How You and Your Team Work with Microsoft 365 Copilot.pptx
John Moore
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
Viam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdfViam product demo_ Deploying and scaling AI with hardware.pdf
Viam product demo_ Deploying and scaling AI with hardware.pdf
camilalamoratta
 
Mastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B LandscapeMastering Testing in the Modern F&B Landscape
Mastering Testing in the Modern F&B Landscape
marketing943205
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 

Microsoft Data Integration Pipelines: Azure Data Factory and SSIS

  • 2. Automate pipelines: Tools for building ETL pipelines Mark Kromer Microsoft Sr. Program Manager Information Management @kromerbigdata
  • 3. ETL Pipeline Objectives • Consume hybrid disparate data (Extract) • On-prem + Cloud • Structured, un-structured, semi-structured • Calculate and format data for analytics (Transform) • Transform, aggregate, join, normalize • Address large-scale Big Data requirements (Scale / Load) • Scale-up or Scale-out data movement and transformation • Operationalize (Automate) • Create pipeline orchestrations for different org requirements • Manage & monitor multiple pipelines
  • 4. Hybrid Data Integration Pattern 1: Analyze blog comments Azure SQL Database SQL Server (on-premises) Data Management Gateway Req’d for ADF Azure Data Factory (PaaS) Capture blog comments via API Drop into Blob Store Copy & lookup Power BI Dashboard Visualize and analyze SSIS (self-hosted) Transform via SPROC (ELT) Transform via Dataflow (ETL)
  • 5. Hybrid Data Integration Pattern 2: Sentiment Analysis with Machine Learning Azure Data Factory Power BI Blob Storage Azure Functions
  • 6. Hybrid Data Integration Pattern 3: Modern Data Warehouse Daily flat files OLTP DB Tables Analytical Schemas AML: Churn Model Customer Call Details Azure Data Factory (PaaS) SSIS (self-hosted) Social Media (un/semi structured)
  • 7. SQL Server Integration Services (SSIS) SSIS is a platform for building enterprise-grade data integration solutions User-friendly code-free authoring / management client tools: SQL Server Data Tools (SSDT) SQL Server Management Studio (SSMS) Wealth of connectors + rich transformations to Extract, Transform, and Load (ETL) data between various sources and destinations, on premises and in the cloud Low Total Cost of Ownership (TCO)
  • 19. Azure Feature Pack Provides Azure connectivity components for SSIS 1. Move / transfer hybrid data between various sources and destinations, on premises and in Azure 2. Develop ELT workflows with Big Data transformations / processing in Azure 3. Combine traditional ETL and modern ELT workflows spanning on-premises and Azure
  • 20. SSIS Azure Feature Pack Features SSIS Azure Feature Pack contains: 1. Connection Managers 1. Azure Subscription Connection Manager 2. Azure Storage Connection Manager 3. ADLS Connection Manager (NEW) 2. Control Flows / Tasks 1. Azure Blob Upload / Download Tasks 2. Azure HDInsight Hive / Pig Tasks 3. Azure HDInsight Create / Delete Cluster Tasks 4. Azure SQL DW Upload Task (NEW) 3. Data Flows 1. Azure Blob Source / Destination 2. ADLS Source / Destination (NEW) 4. Azure Blob Enumerator
  • 22. SQL DW + SSIS – Quick intro SQL DW is Microsoft’s scale-out database in the cloud Built on Massively Parallel Processing (MPP) architecture Capable of processing huge volumes of relational and non-relational data. It divides data and processing capability across multiple nodes Control Node receives, optimizes, and distributes requests to Compute Nodes that work in parallel. There are 2 ways to load data into SQL DW using SSIS: Front-loading through Control Node with data flows Back-loading through Compute Nodes with PolyBase
  • 23. Azure SQL DW Upload Task – Typical scenario Azure Blob Storage SQL DW Cloud On-Premise SQL Server Flat File SSIS Machine 0. Export to a flat file 1. Upload to Azure Blob 2. Create an external table 3. Trigger PolyBase to load data Azure SQL DW Upload Task automates steps 1 – 2 – 3 below:
  • 24. Azure SQL DW Upload Task On Azure SQL DW Upload Task Editor, you can 1. Name and describe a create / insert table task 2. Select and configure UTF-8-encoded text file(s) as your data source 3. Select and configure Azure Storage Connection Manager + new / existing blob container as your data staging area 4. Select and configure ADO.NET Connection Manager for SQL DW + new / existing table as your data destination 5. Map source and destination columns for the create / insert table task 6. Define metadata / data types for source columns
  • 25. Azure SQL DW Upload Task Following configurations on Azure SQL DW Upload Task Editor, T-SQL script that triggers PolyBase to load data from your Azure Blob Storage into SQL DW will be automatically generated You can still manually edit this auto- generated T-SQL script to customize it for your particular needs
  • 27. RAW DATA, DATA CHAOS REFINED, ORGANIZED DATA DATA CLARITY, BETTER DECISIONS DATA MOVEMENT DATA TRANSFORMATIONS BUSINESS INTELLIGENCE AZURE DATA FACTORY HYBRID DATA INTEGRATION AT SCALE Customer Profitability Sentiment Analysis Market Basket Analysis Machine Learning, Big Data Analytics, SQL, NoSQL, Data Warehouse , Data Lake
  • 28. ADF: Orchestrate data services at scale with fully managed Data Integration cloud service PREPARE TRAN AN INGEST SQL <> SQL DATA SOURCES { } SQL • Create, schedule, orchestrate, and manage data pipelines • Visualize data lineage • Connect to on-premises and cloud data sources • Monitor data pipeline health • Automate cloud resource management • Move relational data for Hadoop processing • Transform with Hive, Pig, PySpark, SQL SPROC or custom code
  • 29. Cloud Analytics – Common Challenges
  • 31. ELT with Apache Spark Activity from ADF Pipeline Create new pipeline and HDI Cluster Linked Service for Spark from Azure Portal Invoke Python script from Spark activity to transform data at scale Schedule, monitor and manage pipeline from ADF Verify results and perform analytics from Jupyter notebooks / PBI
  • 33. ELT with Azure Data Lake from ADF Pipeline Create new pipeline and Azure Data Lake Analytics Linked Service from Azure Portal Perform data transformations at scale with U-SQL script Schedule, monitor and manage pipeline from ADF Verify results and perform analytics from ADLA
  • 34. Monitor & Manage Pipelines
  • 52. Microsoft Data Integration Roadmap • SSIS • SQL Server 2017 • SSIS on Linux • Scale-out • ADF • SSIS as a Cloud “Integration Runtime” • Code-free web-based user experience • Control Flow orchestration + Data Flow steps • On-Demand Spark Cluster
  翻译: