Shopify Data Platform Engineer

Data Developer

Member of the Starscream Runtime - Batch Transformations team, maintaining, scaling and upgrading Shopify’s data ETL platform.

  • Currently building tools to support the migration of 3000+ batch transformation flows from Apache Oozie to Airflow.
  • Led the migration of 3000+ batch transformation flows from unmanaged Hadoop YARN to Google's Dataproc, coordinating with data science teams. Updated coordinator tools to ensure a seamless transition for our data scientists.
  • Helped convert the core of the platform to be compatible with both Python 2 and Python 3.
  • Mentoring, training and onboarding new teammates.
  • Expanding my understanding of data systems, reliability, and data warehousing/data lakes.
  • Technologies: Python, Spark, Hadoop, GCP, Dataproc, Docker, Oozie, Airflow, Terraform, Datadog, SQL, Kubernetes
  • Skills developed:
    • Data pipeline scheduling and automation
    • Continuous deployment pipeline maintenance
    • Troubleshooting and mitigating incidents involving data corruption
    • Refactoring and updating legacy code
    • Test-driven software development