Peloton is looking for a Data Engineer to build our Data Warehouse and Data Pipelines. You will work with multiple teams of passionate and skilled data engineers, architects, and analysts responsible for building batch and streaming data pipelines that process terabytes of data daily and support all of the analytics, business intelligence, data science and reporting data needs across the organization.
Peloton is a cloud-first engineering organization with all of our data infrastructure in AWS, leveraging EMR, AWS Glue, Redshift, S3, and Spark. You will interact with many business teams, including marketing, sales, supply chain, logistics, and finance, and partner with them to scale Peloton’s data infrastructure for future strategic needs.
Responsibilities
- Understand the data needs of different stakeholders across multiple business verticals, including Finance, Marketing, Logistics, and Product.
- Develop the vision and strategy to provide proactive solutions and enable stakeholders to extract insights and value from data.
- Understand end-to-end data interactions and dependencies across complex data pipelines and data transformations, and how they impact business decisions.
- Design best practices for big data processing, data modeling, and warehouse development throughout the company.
Requirements
- Familiar with at least one of the following programming languages: Python or Java.
- Comfortable with the Linux operating system and command-line tools such as Bash.
- Familiar with REST for accessing cloud-based services.
- Excellent knowledge of databases such as PostgreSQL and Redshift.
- Experience with Git, GitHub, JIRA, and Scrum.
- 2+ years building data warehouses and data pipelines, or 3+ years in data-intensive engineering roles.
- Experience with big data architectures and data modeling to efficiently process large volumes of data.
- Background in ETL and data processing, with the ability to transform data to meet business goals.
- Experience developing large data processing pipelines on Apache Spark.
- Experience with Python or Java programming languages.
- Strong understanding of SQL and working knowledge of using SQL (preferably PostgreSQL and Redshift) for various reporting and transformation needs.
- Excellent communication, adaptability and collaboration skills.
- Experience working with Agile methodology and applying Agile to data engineering.
- Experience with Java, JDBC, and the AWS SDK.
Nice to have
- Familiar with the AWS ecosystem, including RDS, Glue, Athena, etc.
- Experience with Apache Hadoop, Hive, Spark, and PySpark.