About Pendulum: Pendulum is leading a revolution to improve physical and mental health by understanding, restoring, and enhancing the human microbiome. Research indicates that the microbiome is linked to many aspects of health, including metabolism, immune function, and overall well-being. Pendulum has created proprietary probiotic pipelines and a unique discovery platform to identify key novel bacterial strains. With innovative manufacturing technology, Pendulum is transforming consumer probiotics into a new therapeutic category. On the strength of significant revenue and customer growth, Pendulum has earned a spot on Forbes Magazine's "Next Billion-Dollar Startups" list. If you are passionate about improving lives and thrive in a collaborative environment, we invite you to learn more about this opportunity.
Position Summary: We are seeking a Lead Data Engineer to join our team and enhance our data infrastructure. This role involves building and maintaining robust data pipelines, managing our data warehouse, and ensuring the reliability and scalability of our data systems. The ideal candidate will possess deep expertise in data engineering and cloud platforms, aligning data practices with business and AI/ML needs.
What You'll Do:
- Design, build, and maintain efficient and scalable ETL pipelines using tools like dbt and Fivetran, alongside orchestration frameworks such as Airflow.
- Develop robust schema designs and data models for efficient querying and data integration.
- Manage and optimize our data warehouse and lakehouse on platforms like Snowflake, ensuring data reliability and performance.
- Implement data validation, cleansing, and anomaly detection processes for data integrity.
- Collaborate with Data Science and Analytics teams to deploy ML models, including their training and evaluation processes.
- Implement monitoring solutions for data pipeline reliability and performance.
- Leverage Docker and Kubernetes to automate data processing workflows.
- Ensure compliance with data governance standards and regulations (GDPR, CCPA) through proper data lineage and secure handling practices.
Knowledge Requirements:
- MSc/PhD in Computer Science or a related field.
- 5+ years of data engineering experience focusing on building and managing data pipelines and data warehouses.
- Expertise in Python and SQL; experience with technologies such as Kafka, Spark, and cloud infrastructure (GCP, AWS) is required.
- Proficiency in orchestration tools like Airflow; container orchestration experience, particularly with Kubernetes, is a plus.
- Strong understanding of data quality management, lineage, and compliance regulations (GDPR, CCPA).
- Ability to lead data projects, make strategic decisions, and adapt in a fast-paced environment.
Salary & Benefits:
- Salary range: $170,000 - $225,000
- Medical, Dental, and Vision coverage
- Commuter Benefits
- Life & Short-Term Disability (STD) Insurance
- 401(k) company match
- Flexible Time Off (FTO)
- Equity