Job Description

Posted on: October 27, 2024

Spotter is a platform for Creators, providing services and software designed to accelerate growth for the world’s best Creators and brands. Creators working with Spotter can access the capital, knowledge, community, and personalized AI software products they need to succeed. With unique insight into how Creators work, the resources they need to grow, and the challenges they face, Spotter is empowering top YouTube Creators to thrive. Spotter has already deployed over $940 million to YouTube Creators to reinvest in themselves and accelerate their growth, with plans to reach $1 billion in investment by 2024. With a premium catalog spanning over 725,000 videos, Spotter generates more than 88 billion monthly watch-time minutes, delivering a unique, scaled media solution to Advertisers and Ad Agencies that is transparent, efficient, and 100% brand safe. For more information about Spotter, please visit https://spotter.com.

The successful candidate will be responsible for processing huge data sets (billions of records) using distributed data processing frameworks such as Apache Spark.

Must Have:

  • Extensive experience with large data sets and creating performant & scalable ETL pipelines using Spark.
  • In-depth understanding of performance bottlenecks in large-scale data processing.

What You’ll Do:

Are you ready to help lead the charge in shaping the data-driven future of Spotter? We're in search of an exceptional Principal Data Engineer who will play a pivotal role in designing, building, and optimizing scalable data infrastructure. You will help us with data pipelines for acquisition and transformation of large datasets, storage and querying optimizations of varying data to support a large range of use cases from Analytics to Creator Products to Operations using traditional and ML focused access patterns. You will be a key player in empowering us to make data-informed decisions that will fuel our innovation and growth.

  • Develop and maintain scalable data pipelines, including ETL pipelines built as both single-node and multi-node solutions
  • Build data quality assurance steps into new and existing pipelines
  • Create derived datasets with augmented properties
  • Work on analytics-ready datasets to power internal and creator-facing tools
  • Troubleshoot issues, working directly with internal data consumers
  • Automate pipeline runs with scheduling and orchestration tools
  • Work with large-scale datasets and various external APIs to enhance data
  • Set up database tables for analytics users to consume data
  • Work with big data technologies to improve data availability and quality in the cloud (AWS)
  • Lead development of projects involving other team members and act as a mentor
  • Actively participate in team discussions about technology, architecture, and solutions, both for new projects and to improve existing code and pipelines
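As a rough illustration of the data-quality assurance steps described above, here is a minimal sketch using pandas (the column names, schema, and row-loss threshold are hypothetical, not part of the role's actual pipelines):

```python
import pandas as pd

def quality_check(df: pd.DataFrame) -> pd.DataFrame:
    """Minimal quality gate for an ETL step: drop exact duplicates,
    reject rows missing required keys, and fail loudly if too many rows are lost."""
    required = ["video_id", "watch_minutes"]  # hypothetical schema
    before = len(df)
    cleaned = df.drop_duplicates().dropna(subset=required)
    # Arbitrary threshold: abort if the gate discards more than half the input
    if before and len(cleaned) / before < 0.5:
        raise ValueError("quality gate failed: lost more than 50% of rows")
    return cleaned

# Toy input: one exact duplicate and one row missing its key
raw = pd.DataFrame({
    "video_id": ["a", "a", "b", "c", None],
    "watch_minutes": [5.0, 5.0, 7.5, 2.0, 3.0],
})
clean = quality_check(raw)
print(len(clean))  # duplicate and null-key rows removed → 3
```

In a Spark pipeline the same idea applies with `dropDuplicates` and `dropna` on a Spark DataFrame; the point is that each pipeline stage validates its output before downstream consumers see it.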

Who You Are:

  • 10+ years of Data Engineering experience (ideally also 2-4 years of software engineering)
  • 5+ years of experience with Apache Spark or Apache Flink
  • 4+ years of experience running software and services in the cloud
  • Proficiency in DataFrame APIs (Pandas and Spark) for parallel and single-node processing
  • Proficiency in advanced languages and techniques (Python, Scala, etc.) with modern data optimized file formats (Parquet, Avro)
  • Proficiency with SQL on RDBMS and data warehouse solutions (e.g., Redshift)
  • Hands-on experience with Data Lake technologies (Delta Lake, Iceberg)
  • Experience acquiring data from external APIs at large scale and in parallel
  • Experience supporting ML/AI projects, including deployed pipelines for computing features and using models for inference on large datasets
  • Bachelor’s degree (or equivalent work experience), preferably in a Computer Science related field

Additional Valued Skills:

  • Experience with YouTube APIs
  • Experience with AWS Glue metastore
  • Experience with Data-Mesh approaches
  • Experience with data cataloging, lineage, and governance tools

Why Spotter:

  • 100% coverage for medical and vision insurance
  • Dental insurance
  • 401(k) matching
  • Stock options
  • Complimentary gym access
  • Autonomy and upward mobility
  • Diverse, equitable, and inclusive culture, where your voice matters

In compliance with local law, we are disclosing the compensation for roles performed in Culver City. Actual salaries may vary based on various factors including skill sets, experience and training, licensure and certifications, and other organizational needs. The overall market range for roles in this area of Spotter is typically: $175K-$215K salary per year. This range is just one component of Spotter’s total compensation package, which may also include an annual discretionary bonus and equity.

Spotter is an equal opportunity employer. We do not discriminate in employment on any basis protected under applicable federal, state, or local laws. Equal access to programs, services, and employment is available to all persons. Those applicants requiring reasonable accommodations as part of the application and/or interview process should notify a representative of the Human Resources Department.
