Shopify Data Platform Engineer
Data Developer · Shopify
· Permanent · Full-time
· Aug 2019 - Aug 2022
· Ottawa, Canada
Member of the Starscream Runtime - Batch Transformations team, maintaining, scaling and upgrading Shopify’s data ETL platform.
- Added tools to assist in the migration of 3000+ batch transformation flows from Apache Oozie to Airflow.
- Led the migration of 3000+ batch transformation flows from unmanaged Hadoop YARN to Google's Dataproc, coordinating with data science teams. Updated coordinator tools to ensure a seamless transition for our data scientists.
- Helped convert the core of the platform to be compatible with both Python 2 and Python 3.
- Mentored, trained, and onboarded new teammates.
- Expanded my understanding of data systems, reliability, and data warehousing/data lakes.
- Technologies: Python, Spark, Hadoop, GCP, Dataproc, Docker, Oozie, Airflow, Terraform, Datadog, SQL, Kubernetes
- Skills developed:
  - Data pipeline scheduling and automation
  - Continuous deployment pipeline maintenance
  - Troubleshooting and mitigating incidents involving data corruption
  - Refactoring and updating legacy code
  - Test-driven software development