From the course: Apache Airflow Essential Training

Creating and running a pipeline with an S3 hook

- [Instructor] After having performed all of the setup, we are now finally ready to take a look at our DAG code. Notice, on line 6, we've imported the PostgresHook, and on line 7, we've imported the S3 hook. We'll be using both of these hooks in this demo. On lines 14 through 24, I've defined a function that will read an S3 file from the bucket that we have specified. This function takes in the bucket_name and the file_key as input arguments. We then instantiate an S3 hook on line 15 using the connection ID aws_conn_s3. This is the connection that we just set up. Remember that this connection has the right credentials to access S3 buckets. On line 17, I call s3_hook.read_key to read the contents of the file in the specified bucket. The content of this file is available in bytes, so on lines 19 and 20, I convert this content to a string. We then perform pd.read_csv by converting the file content to a StringIO…
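A minimal sketch of what the read function described above might look like. The connection ID aws_conn_s3 and the use of S3Hook.read_key and pd.read_csv come from the transcript; the function name, argument names, and the defensive bytes-to-string handling are illustrative assumptions, not the instructor's exact code.

```python
from io import StringIO

import pandas as pd
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def read_s3_file(bucket_name: str, file_key: str) -> pd.DataFrame:
    # Instantiate the hook with the connection set up earlier;
    # this connection holds the credentials used to access S3.
    s3_hook = S3Hook(aws_conn_id="aws_conn_s3")

    # Read the object's contents from the given bucket.
    file_content = s3_hook.read_key(key=file_key, bucket_name=bucket_name)

    # Depending on the provider version, the content may come back as
    # bytes or as an already-decoded string; decode defensively so the
    # rest of the function always works with a string.
    if isinstance(file_content, bytes):
        file_content = file_content.decode("utf-8")

    # Wrap the CSV text in a StringIO buffer so pandas can parse it
    # as if it were a file on disk.
    return pd.read_csv(StringIO(file_content))
```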
