Description
There are a lot of ETL tools out there and sometimes they can be overwhelming, especially when you simply want to copy a file from point A to B. So today, I am going to show you how to extract a CSV…
Summary
- Extracting data from an FTP server using Google Cloud Functions and loading it into BigQuery: there are a lot of ETL tools out there, and they can sometimes be overwhelming, especially when you simply want to copy a file from point A to B.
- So today, I am going to show you how to extract a CSV file from an FTP server (Extract), modify it (Transform), and automatically load it into a Google BigQuery table (Load) using Python 3.6 and Google Cloud Functions.
- At the time of writing, Cloud Functions (CF) isn’t available in every Google data-centre region, so check the list of supported locations to see where Cloud Functions is enabled.
- Please note that the FTP server I was working with had multiple CSVs, each representing transaction data for a different day.
- Since we are enabling “auto-detection”, the BigQuery table doesn’t need a schema when you create it; the schema is inferred from the data in the CSV file.
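
To make the three steps concrete, here is a minimal sketch of how the Extract, Transform and Load stages could fit together in one Cloud Function. It is an illustration under assumptions, not the exact code from the post: the function names, the `source_file` column, the HTTP-style `main(request)` entry point, and the environment variable names (`FTP_HOST`, `FTP_USER`, `FTP_PASSWORD`, `FTP_FILE`, `BQ_TABLE`) are all hypothetical. It also assumes the `google-cloud-bigquery` package is listed in the function's `requirements.txt`.

```python
import csv
import io
import os
from ftplib import FTP


def download_csv(host, user, password, remote_path):
    """Extract: fetch one CSV file from the FTP server into memory as bytes."""
    buffer = io.BytesIO()
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    return buffer.getvalue()


def add_source_column(raw_csv, source_name):
    """Transform (hypothetical example): record which file each row came from."""
    rows = list(csv.reader(io.StringIO(raw_csv.decode("utf-8"))))
    if not rows:
        return raw_csv  # nothing to transform in an empty file
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(rows[0] + ["source_file"])  # extend the header row
    for row in rows[1:]:
        writer.writerow(row + [source_name])
    return out.getvalue().encode("utf-8")


def load_to_bigquery(csv_bytes, table_id):
    """Load: append the CSV to a BigQuery table and let BigQuery infer the schema."""
    # Imported here so the rest of the module works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig()
    job_config.source_format = bigquery.SourceFormat.CSV
    job_config.skip_leading_rows = 1  # the first row is the header
    job_config.autodetect = True  # the "auto-detection" mentioned above
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
    job = client.load_table_from_file(
        io.BytesIO(csv_bytes), table_id, job_config=job_config
    )
    job.result()  # block until the load job finishes (raises on failure)


def main(request):
    """Hypothetical HTTP entry point; all settings come from environment variables."""
    remote_path = os.environ["FTP_FILE"]
    raw = download_csv(
        os.environ["FTP_HOST"],
        os.environ["FTP_USER"],
        os.environ["FTP_PASSWORD"],
        remote_path,
    )
    transformed = add_source_column(raw, os.path.basename(remote_path))
    load_to_bigquery(transformed, os.environ["BQ_TABLE"])
    return "ok"
```

Keeping the transform as a pure function over bytes means it can be checked locally without an FTP server or a BigQuery project. With several daily CSVs on the server, the same three functions could be called once per file, for example by listing the directory with `ftp.nlst()` first.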