MindsDB is a fast-growing AI startup headquartered in San Francisco, California. As a leading innovator bringing AI and Data together, our passion is empowering companies to easily build AI capabilities that can Think, Understand and Orchestrate, enabling teams to move from prototyping and experimentation to production in a fast and scalable way. Founded in 2017 by Adam Carrigan and Jorge Torres, and inspired by Iain M. Banks's Culture series, MindsDB started as an open-source project and has grown to be one of the most widely used AI-Data platforms globally, with over 700 contributors worldwide. We are backed by over $55M in funding from Mayfield, Benchmark, Y Combinator, and NVIDIA, and have been recognized by Forbes as one of America's most promising AI companies (2021) and by Gartner as a Cool Vendor for Data and AI (2022).
THE ROLE
As a Machine Learning Engineer, you'll focus on building advanced machine learning solutions for the MindsDB platform, including robust Text-to-SQL systems and optimizing Retrieval Augmented Generation (RAG) for both structured and unstructured data. Your expertise in transformer models and advanced retrieval techniques will be essential in delivering state-of-the-art LLM-driven solutions.
What You’ll Be Working On:
- Researching, building, and evaluating novel LLM-powered enterprise applications.
- Developing robust Text-to-SQL systems and maintaining Retrieval Augmented Generation (RAG) systems for diverse data sources.
- Implementing advanced chunking techniques and applying a thorough understanding of retrieval concepts (e.g., embeddings, query expansion).
- Building agentic and tool-calling systems and employing an 'Evaluation Driven Development' approach with messy datasets.
- Fine-tuning and deploying transformer models (e.g., Llama, OpenAI APIs), creating agent-based applications, and integrating them into production environments.
- Applying proficiency in data structures, algorithms, concurrency, multi-threading, and design patterns to write clean, maintainable code.
- Collaborating closely with engineers and researchers, managing pull requests, and participating in code reviews, while creating design documents and leading architecture discussions.
Requirements/Qualifications
- 3+ years of proven machine learning engineering experience, particularly with LLMs and retrieval-based systems.
- Strong software engineering skills, including data structures, algorithms, and software design.
- Experience working with transformer models, fine-tuning, and deploying them in production settings.
- Ability to build end-to-end ML systems, especially in RAG and agentic contexts, and familiarity with LLM frameworks or a willingness to learn.
- Excellent problem-solving abilities, strong communication skills, and passion for creating innovative AI-driven solutions.
Nice to Have:
- Startup experience with fast-paced adaptability.
- Cloud platform experience (AWS, GCP, Azure) for ML deployments and MLOps knowledge, including CI/CD and model monitoring.
- Experience with big data tools (SQL, NoSQL, Spark) and open-source contributions in LLM or AI projects.
Benefits & Perks:
- Flexible Working Hours
- Competitive Compensation with a salary range of $200,000 - $240,000 USD
- 401(k) with up to 6% matching
- Unlimited PTO
- New Hire Remote Setup budget ($1500)
- Lunch Provided Mon-Fri
- Commuter Budget ($1200/year)
- Monthly (virtual) team events and international in-person company retreats
- Wellbeing/Mental Health leave
DIVERSITY, EQUALITY & INCLUSION
MindsDB is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. MindsDB will give all qualified applicants consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations, and ordinances.