Synthetic Data Engineer (AI Data/Training)
Hyphen Connect Limited · Oregon
Oregon, USA · via Greenhouse · Posted 2026-04-24
We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines and manage the quality of the data that feeds our training loops. Your work will directly support data processing and model training across the organization.
Responsibilities:
Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
Implement automated quality scoring and de-duplication systems.
Manage data pipelines that feed directly into supervised fine-tuning (SFT) and direct preference optimization (DPO) training loops.
Qualifications:
Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
Deep knowledge of prompt engineering for data generation.
Familiarity with dataset distillation and bias mitigation.