Azure Data Factory Essentials Training. Learn How to Build a Complete ETL Solution in Azure Data Factory & How to Integrate Pipelines with Azure Databricks
This course introduces Azure Data Factory and how it can help with batch processing of data. Through hands-on activities, quizzes, and a project, students will learn how Data Factory integrates many other technologies to build a complete ETL solution, including a CI/CD pipeline in Azure DevOps. The course also covers several Data Factory topics required for exam DP-203: Data Engineering on Microsoft Azure.
Learn by Doing
At the end of this course, students will have the opportunity to submit a project that will help them understand how ADF works, what its components are, and how to integrate ADF with Databricks.
Student key takeaways:
- The student should understand how ADF orchestrates the features of other technologies to transform or analyze data.
- The student should be able to explain and use the components that make up ADF.
- The student should be able to integrate two or more technologies using ADF.
- The student should be able to confidently create moderately complex data-driven pipelines.
- The student should be able to develop a CI/CD pipeline in Azure DevOps to deploy Data Factory pipelines.
What You’ll Learn:
- Introduction to Azure Data Factory. You will understand how it can be used to integrate many other technologies with an ever-growing list of connectors.
- How to set up a Data Factory from scratch using the Azure Portal and PowerShell.
- Activities and components that make up Data Factory. This includes Pipelines, Datasets, Triggers, Linked Services, and more.
- How to transform, ingest, and integrate data code-free using Mapping Data Flows.
- How to integrate Azure Data Factory and Databricks. We’ll cover how to authenticate and run a few notebooks from within ADF.
- Azure Data Factory deployment using Azure DevOps for continuous integration and continuous delivery (CI/CD).
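As a taste of the PowerShell setup covered in the course, creating a Data Factory from scratch can be sketched roughly as follows. This is a minimal sketch, not the course's exact script: it assumes the Az PowerShell module is installed and you have already signed in with `Connect-AzAccount`, and the resource group name, factory name, and region are placeholder values.

```powershell
# Sketch only — assumes the Az module is installed and Connect-AzAccount has been run.
# All names and the region below are placeholders.

# 1. Create a resource group to hold the factory.
New-AzResourceGroup -Name "rg-adf-essentials" -Location "eastus"

# 2. Create the Data Factory (V2) instance in that group.
New-AzDataFactoryV2 -ResourceGroupName "rg-adf-essentials" `
                    -Name "adf-essentials-demo" `
                    -Location "eastus"
```

Note that the factory name must be globally unique across Azure; the same result can also be achieved through the Azure Portal, as shown in the course.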
Data Factory Essentials Training – Outline
- Introduction
- Modules introduction
- Getting Started
- Understand Azure Data Factory Components
- Ingesting and Transforming Data with Azure Data Factory
- Integrate Azure Data Factory with Databricks
- Continuous Integration and Continuous Delivery (CI/CD) for Azure Data Factory
- Getting Started
- Sign up for your Azure free account
- Setting up a Budget
- How to set up Azure Data Factory
- Azure Portal
- PowerShell
- Azure Data Factory Components
- Linked Services
- Pipelines
- Datasets
- Data Factory Activities
- Parameters
- Pipeline Parameters
- Activity Parameters
- Global Parameters
- Triggers
- Integration Runtimes (IR)
- Azure IR
- Self-hosted IR
- Linked Self-Hosted IR
- Azure-SSIS IR
- Quiz
- Ingesting and Transforming Data
- Ingesting Data using Copy Activity into Data Lake Store Gen2
- How to Copy Parquet Files from AWS S3 to Azure SQL Database
- Creating ADF Linked Service for Azure SQL Database
- How to Grant Permissions on Azure SQL DB to Data Factory Managed Identity
- Ingesting Parquet File from S3 into Azure SQL Database
- Copy Parquet Files from AWS S3 into Data Lake and Azure SQL Database (intro)
- Copy Parquet Files from AWS S3 into Data Lake and Azure SQL Database
- Monitoring ADF Pipeline Execution
- Transforming Data with Mapping Data Flow
- Mapping Data Flow Walk-through
- Identifying Transformations in Mapping Data Flow
- Multiple Inputs/Outputs
- Schema Modifier
- Formatters
- Row Modifier
- Destination
- Adding source to a Mapping Data Flow
- Defining Source Type: Dataset vs. Inline
- Defining Source Options
- Spinning Up Data Flow Spark Cluster
- Defining Data Source Input Type
- Defining Data Schema
- Optimizing Loads with Partitions
- Data Preview from Source Transformation
- How to add a Sink to a Mapping Data Flow
- How to Execute a Mapping Data Flow
- Quiz
- Integrate Azure Data Factory with Databricks
- Project Walk-through
- How to Create Azure Databricks and Import Notebooks
- How to Transfer Data Using Databricks and Data Factory
- Validating Data Transfer in Databricks and Data Factory
- How to Use ADF to Orchestrate Data Transformation Using a Databricks Notebook
- Quiz
- Continuous Integration and Continuous Delivery (CI/CD) for Azure Data Factory
- How to Create an Azure DevOps Organization and Project
- How to Create a Git Repository in Azure DevOps
- How to Link Data Factory to Azure DevOps Repository
- How to Version Azure Data Factory with Branches
- Data Factory Release Workflow
- Merging Data Factory Code to Collaboration Branch
- How to Create a CI/CD pipeline for Data Factory in Azure DevOps
- How to Execute a Release Pipeline in Azure DevOps for ADF
- Quiz