Senior AI Platform Engineer

Expel · Remote

📍 Remote · 💰 $142,900 · via Greenhouse · Posted 2026-06-24


You believe great ML systems don't just work: they scale, they recover gracefully, and they give data scientists the confidence to iterate quickly. At Expel, you'll be a key contributor in building and maturing the infrastructure that powers our machine learning and generative AI capabilities. From end-to-end training pipelines to the specialized infrastructure behind production agentic applications, your work will directly shape how fast we can innovate and how reliably our AI systems run. You'll work closely with senior and principal engineers, data scientists, and cross-functional teams to operationalize ML at scale. You bring strong hands-on expertise and a genuine drive to continuously improve the systems and practices around you.

What Expel can do for you

- Give you hard, meaningful problems: building the infrastructure that lets defenders win using AI
- Connect you with a collaborative team of engineers, data scientists, and researchers who care about doing it right
- Offer unlimited PTO (that leadership models and encourages), up to 24 weeks of parental leave, and really excellent health benefits
- Pay you monthly fitness and cell phone stipends, no receipts required
- Support your professional growth with a conference benefit and continuous learning opportunities
- Offer full remote flexibility: work from wherever you do your best work

What you can do for Expel

Build and scale ML infrastructure

- Architect and maintain end-to-end machine learning training pipelines on AWS (SageMaker, EKS, Step Functions) to ensure reliable and reproducible model development and deployment
- Build and maintain infrastructure for production agentic applications using Amazon Bedrock and Bedrock AgentCore, including agent runtimes, memory, secure gateways, and observability at scale
- Contribute to the architectural evolution of our ML platform, including evaluating MLOps tooling and participating in buy vs. build decisions

Operationalize with rigor

- Implement AI/ML governance best practices for model versioning, testing, validation, maintenance, and security
- Integrate MLOps best practices with Expel's SDLC, security, and infrastructure standards, working alongside SRE, Platform Engineering, and Security teams
- Drive quality, reliability, and scalability improvements through thoughtful engineering and monitoring

Collaborate and enable

- Partner with data scientists, software engineers, and stakeholders to operationalize ML models reliably and at scale
- Mentor and support junior engineers; foster a culture of engineering excellence
- Create and maintain documentation, internal tooling, and enablement resources so practitioners across Expel can work effectively with ML systems
- Stay current with the MLOps landscape and bring relevant innovations back to the team

What you should bring with you

Collaboration & communication

- Clear communicator, able to write documentation and explain technical concepts to both engineering and non-technical audiences
- Strong collaborator with engineers, product managers, and business stakeholders
- Demonstrated ability to mentor others and invest in the growth of the people around you
- Balances near-term delivery with longer-term technical quality

Technical depth

- Strong Python proficiency; familiarity with other languages (Go, JS) is a plus
- Solid experience with CI/CD pipelines, infrastructure-as-code, and containerization for ML workloads
- Hands-on experience with cloud-based ML platforms: AWS (SageMaker, Bedrock, Bedrock AgentCore) strongly preferred; GCP (Vertex AI) experience also valued
- Proven experience operationalizing LLMs and building infrastructure for complex agentic applications: agent orchestration, memory, tool calling, RAG architectures
- Familiarity with ML frameworks including Scikit-Learn, PyTorch, Spark, and TensorFlow
- Working knowledge of continuous retraining, concept drift monitoring, and data drift detection in production

Education & experience

- 5+ years of relevant software engineering experience with meaningful focus on ML operations and infrastructure
- Degree in Computer Science, Mathematics, Statistics, Engineering, or a related technical field preferred (or a compelling story)
- Demonstrated track record of delivering impactful ML infrastructure or MLOps projects
- Experience contributing to team practices, standards, or tooling in a collaborative environment

Additional information

The base salary range for this role is between $142,900 USD and $207,200 USD, plus bonus eligibility and equity. We believe in paying transparently and equitably. Your salary will ultimately be based on factors such as your experience, skills, team equity, and market data. You’ll also be eligible for unlimited PTO (which we model and encourage), work location flexibility, up to 24 weeks of parental leave, and really excellent health benefits.

We’re only hiring those authorized to work in the United States.

We’re an Equal Opportunity Employer: You’ll receive consideration for employment without regard to race, sex, color, religion, sexual orientation, gender identity, national origin, protected veteran status, or on the basis of disability. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
