From the course: Apache Airflow Essential Training
Creating and running a pipeline with an S3 hook
- [Instructor] After performing all of the setup, we are now finally ready to take a look at our DAG code. Notice that on line 6 we've imported the PostgresHook, and on line 7 we've imported the S3Hook. We'll be using both of these hooks in this demo. On lines 14 through 24, I've defined a function that reads an S3 file from the bucket that we specify. This function takes in the bucket_name and the file_key as input arguments. We then instantiate an S3 hook on line 15 using the connection ID aws_conn_s3. This is the connection that we just set up; remember, it has the right credentials to access S3 buckets. On line 17, I call s3_hook.read_key to read the contents of the file in the specified bucket. The content of this file will be available as bytes, so on lines 19 and 20 I convert it to a string. We then perform pd.read_csv by converting the file content to a StringIO…
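To make those steps concrete, here is a minimal sketch of the kind of function the transcript describes. The connection ID aws_conn_s3 comes from the lesson; the function name read_s3_file and the bytes-handling check are illustrative assumptions, not the instructor's exact code.

```python
from io import StringIO

import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def read_s3_file(bucket_name, file_key):
    # Instantiate the hook with the connection that holds the S3 credentials
    s3_hook = S3Hook(aws_conn_id="aws_conn_s3")

    # read_key fetches the object's contents; in some provider versions this
    # arrives as bytes, so decode to a string before parsing
    content = s3_hook.read_key(key=file_key, bucket_name=bucket_name)
    if isinstance(content, bytes):
        content = content.decode("utf-8")

    # Wrap the CSV text in StringIO so pandas can read it like a file object
    return pd.read_csv(StringIO(content))
```

In a DAG, a function like this would typically be wrapped in a PythonOperator or an @task-decorated task so the bucket name and file key can be passed in at runtime.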
Contents
- Setting up for a PostgreSQL pipeline with hooks (2m 28s)
- Creating and running a pipeline with PostgreSQL hooks (6m 28s)
- Setting up access to Amazon S3 buckets (4m 32s)
- Setting up a connection to Amazon S3 buckets (2m 5s)
- Creating and running a pipeline with an S3 hook (5m 43s)