Company: Infinity Works
Adam Dewberry is a consultant data engineer, specialising in building cloud data platforms. Along with engineering, he is responsible for up-skilling customers’ staff, working alongside them to help them own and expand their new data infrastructure.
In a pre-Covid world, he would often be found hitch-hiking across the Balkans and eastern Europe; in a peri-Covid world, he’s learning to skateboard.
How to rapidly deploy an AWS & Snowflake cloud data platform at scale
This hands-on workshop will teach you how to create and roll out a scalable data platform in AWS and Snowflake using Terraform. The session will cover how to build all of the cloud infrastructure required to automate data ingestion and exports between AWS and Snowflake through code, set up deployment pipelines and finally connect an analytics dashboard to derive insights and instant value.
Part 1 Snowflake in the console — creating a simple data warehouse with imports and exports
- Setting up a Snowflake account
- Creating resources:
- Users & roles
- Databases, schemas & tables
- Getting local data into Snowflake
- Querying data in Snowflake
- Getting data out of Snowflake to your local machine
- Automating data ingestion from S3
- Account integrations: connecting Snowflake to AWS S3
- IAM Roles & Policies
- Stages: Where the cloud source data lives
- Landing data with Snowpipes
- Pushing data to AWS from the CLI
- Automating data exports from Snowflake to S3
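The S3 automation steps in Part 1 come down to a handful of Snowflake statements: a storage integration, an external stage, a Snowpipe, and an unload for exports. The Python sketch below only assembles the SQL text and does not connect to Snowflake. All object names, the bucket URL and the role ARN are hypothetical placeholders rather than values from the workshop, and the CSV file format is an assumption.

```python
"""Sketch: SQL behind an S3 -> Snowflake ingestion path and an export.

Every identifier, bucket URL and ARN passed in is a hypothetical placeholder.
"""


def ingestion_statements(integration, stage, pipe, table, bucket_url, role_arn):
    """Return SQL for a storage integration, an external stage and a Snowpipe."""
    # The integration tells Snowflake which IAM role to assume and which
    # S3 locations it is allowed to read from.
    create_integration = (
        f"CREATE STORAGE INTEGRATION {integration}\n"
        "  TYPE = EXTERNAL_STAGE\n"
        "  STORAGE_PROVIDER = 'S3'\n"
        "  ENABLED = TRUE\n"
        f"  STORAGE_AWS_ROLE_ARN = '{role_arn}'\n"
        f"  STORAGE_ALLOWED_LOCATIONS = ('{bucket_url}');"
    )
    # The stage points at the bucket and uses the integration for access.
    create_stage = (
        f"CREATE STAGE {stage}\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        f"  URL = '{bucket_url}';"
    )
    # The pipe copies newly arrived files from the stage into the table.
    create_pipe = (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {table} FROM @{stage}\n"
        "  FILE_FORMAT = (TYPE = 'CSV');"
    )
    return [create_integration, create_stage, create_pipe]


def export_statement(table, stage, prefix):
    """Return SQL that unloads a table to a path under an external stage."""
    return f"COPY INTO @{stage}/{prefix} FROM {table} FILE_FORMAT = (TYPE = 'CSV');"


if __name__ == "__main__":
    statements = ingestion_statements(
        integration="S3_INT",
        stage="RAW_STAGE",
        pipe="LANDING_PIPE",
        table="RAW.PUBLIC.EVENTS",
        bucket_url="s3://example-bucket/landing/",
        role_arn="arn:aws:iam::123456789012:role/example-role",
    )
    for statement in statements:
        print(statement, end="\n\n")
    print(export_statement("RAW.PUBLIC.EVENTS", "RAW_STAGE", "exports/"))
```

The statements are returned in dependency order: the stage refers to the integration, and the pipe refers to the stage.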
Part 2 Deploying Snowflake With Terraform
- The power of infrastructure as code
- What is Terraform and how does it work?
- Deploying Snowflake resources through Terraform
- Setting up the codebase
- State resources
- Terraforming resources to automate data ingestion from S3 to Snowflake:
- Users and roles
- Databases and schemas
- Integrations & IAM
- Pipes & Notifications
- Automating data exports from Snowflake to S3 with Terraform
- Terraform modules — Have the heavy lifting done for you
- Snow Cannon
- Data warehouse deployment speed record
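For the "Integrations & IAM" step, the AWS side is an IAM role that Snowflake is allowed to assume. A minimal sketch of that role's trust policy, built as a Python dictionary, is shown here. The two inputs correspond to the `STORAGE_AWS_IAM_USER_ARN` and `STORAGE_AWS_EXTERNAL_ID` properties reported by `DESC INTEGRATION` for a storage integration; the example values are made up, and the function name is hypothetical.

```python
"""Sketch: trust policy letting Snowflake assume an IAM role for S3 access."""
import json


def snowflake_trust_policy(snowflake_user_arn, external_id):
    """Build a trust policy for the given Snowflake IAM user and external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only the Snowflake-managed IAM user may assume the role...
                "Principal": {"AWS": snowflake_user_arn},
                "Action": "sts:AssumeRole",
                # ...and only when it presents the integration's external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values; real ones come from DESC INTEGRATION.
    policy = snowflake_trust_policy(
        "arn:aws:iam::123456789012:user/example-snowflake-user",
        "EXAMPLE_EXTERNAL_ID",
    )
    print(json.dumps(policy, indent=2))
```

In Terraform, JSON of this shape would typically be supplied as the `assume_role_policy` of an `aws_iam_role` resource.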
Part 3 Data Analytics Dashboards with Docker
- Running Metabase
- Connecting Snowflake to your dashboard
The main goal of this workshop is to learn how to use infrastructure as code and automated deployments to deploy a cloud data platform with AWS and Snowflake. It is a hands-on session where all participants will ultimately produce cloud resources to automate data imports and exports from Snowflake.
The target audience is anyone interested in building cloud data platforms, particularly with AWS and Snowflake, using infrastructure as code and automated deployment pipelines. The session is open to all who enjoy cloud platform and data engineering. It requires some familiarity with platform engineering and, more importantly, a good knowledge of AWS or Snowflake.
Course prerequisites
Accounts to create before the workshop:
- AWS with unrestricted access / deployment privileges that you can use in a local session with the AWS CLI.
- Snowflake account with ACCOUNTADMIN role (create a free trial account and choose the enterprise edition).
- Ubuntu or macOS (or another Unix-based system) strongly preferred
- Windows not recommended
- Minimum (not recommended but acceptable):
- VS Code with the Remote - Containers extension
- Docker v20.10
- AWS Command Line Interface v2.0 (test connectivity to your AWS account via CLI)
- Docker v20.10
- Python >= 3.6
- Terraform v0.13
- SnowSQL CLI v1.2 (test connectivity to your Snowflake account via CLI)
- AWS: working knowledge of S3, IAM roles and policies. DynamoDB is a bonus.
- Previous use of Snowflake is preferred but not essential.
- Working knowledge of Terraform is strongly preferred.
- Working knowledge of Docker is preferred but not essential.
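Before the workshop, a quick check that the command-line tools listed above are installed can save setup time. The sketch below only looks for executables on the PATH; it does not verify versions or test connectivity to AWS or Snowflake. The executable names (`aws`, `docker`, `python3`, `terraform`, `snowsql`) are assumed defaults and may differ on your system.

```python
"""Sketch: report which workshop command-line tools are missing from PATH."""
import shutil

# Assumed executable names for the tools in the prerequisites list.
REQUIRED_TOOLS = ["aws", "docker", "python3", "terraform", "snowsql"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that `which` cannot locate, preserving input order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```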