CareerRiver

Data Engineer - Multimodal Systems

Zyphra · San Francisco Bay Area

📍 San Francisco · via Ashby · Posted 2026-03-17
Zyphra is an artificial intelligence company based in San Francisco, California.

THE ROLE:

As a Data Engineer - Multimodal Systems, you will be a core contributor to creating, collecting, and improving Zyphra's datasets and data pipelines across a variety of modalities. Your work will intersect with almost every team at Zyphra. You will be involved in collecting large-scale datasets and in implementing and optimizing highly parallel data pipelines.

YOU'LL WORK ACROSS:

- Large-scale data collection across a variety of modalities (text, audio, image)
- Designing and working with highly efficient, parallelized data processing pipelines across modalities
- Designing and running rigorous experimental ablations to demonstrate the impact of new data improvements

WHAT WE'RE LOOKING FOR / REQUIREMENTS:

- Strong implementation and prototyping ability
- Can take an idea from conception to experimentation quickly
- Works well with others in a fast-paced research setting
- Can rapidly learn new fields and is excited to implement new ideas
- Excellent communication and collaboration skills, with the ability to work effectively on both research and engineering implementation at scale

QUALIFICATIONS / ADDITIONAL SKILLS:

- Experience collecting, handling, and processing large datasets
- Experience with parallel Python programming frameworks such as Dask
- Understanding of the state of the art in dataset curation across modalities
- A generally meticulous nature and a strong interest in actually looking at data and sanity-checking things
- Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
- Understanding of and interest in large-scale, highly parallel data processing pipelines
- Proficiency with PyTorch and Python
- Experience contributing to large pre-existing codebases and rapidly getting up to speed
- Previously published machine learning research in well-respected venues
- Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)

WHY WORK AT ZYPHRA:

- Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals. Deep research and engineering excellence are equally valued
- We strongly value new and crazy ideas and are very willing to bet big on them
- We move as quickly as we can; we aim to keep the bar to impact as low as possible
- We all enjoy what we do and love discussing AI

BENEFITS AND PERKS:

- Comprehensive medical, dental, vision, and FSA plans
- Competitive compensation and 401(k) plan
- Relocation and immigration support on a case-by-case basis
- In-office snacks and meals provided
- Unlimited PTO and company holidays
- In-person team in San Francisco with a collaborative, high-energy environment
