CareerRiver

Sr Lead Site Reliability & Systems Engineer

Cox · Austin, TX

📍 Austin, TX · 💰 $163,400 · via Workday
CareerRiver pulls this listing straight from the employer's hiring system — no recruiter middleman, no reposts. Applying takes you directly to Cox.
Company: Cox Automotive - USA
Job Family Group: Engineering / Product Development
Job Profile: Sr Lead Software Engineer
Management Level: Sr Manager - Non People Leader
Flexible Work Option: Hybrid - Ability to work remotely part of the week
Travel %: No
Work Shift: Day

Compensation
Compensation includes a base salary in the range of $163,400.00 - $272,300.00. The base salary may vary within the anticipated base pay range based on factors such as the ultimate location of the position and the selected candidate’s knowledge, skills, and abilities. Position may be eligible for additional compensation that may include an incentive program.

Job Description

SENIOR LEAD SITE RELIABILITY & SYSTEMS ENGINEER
Platform Engineering | Infrastructure, Reliability & Systems Architecture
Location: Austin (candidates must be based in Austin or willing to relocate for this role)

ABOUT THE ROLE
We are seeking a Senior Lead Site Reliability & Systems Engineer, a versatile technical leader who combines deep SRE expertise with broad systems engineering capability. In this hybrid role you will drive platform reliability, operational excellence, and systems architecture across our infrastructure, ensuring our products are scalable, resilient, and delivered with high velocity. You will partner with engineering, product, and operations teams to embed reliability and sound systems design at every layer of the stack.

KEY RESPONSIBILITIES

Reliability Engineering & Incident Management
- Define and drive the SRE strategy, roadmap, and standards across engineering teams
- Establish and enforce SLOs, SLIs, and error budgets across all production services
- Own the incident management lifecycle: detection, response, resolution, and prevention
- Lead blameless postmortems and translate findings into lasting systemic improvements
- Manage on-call rotations and aggressively reduce toil through automation

Systems Architecture & Design
- Lead the design and evolution of large-scale, distributed systems and platform infrastructure
- Define technical standards, architectural patterns, and engineering best practices org-wide
- Evaluate and recommend technologies and tooling aligned to business and reliability requirements
- Conduct architecture reviews and provide guidance on complex technical trade-offs
- Lead capacity planning, performance engineering, and infrastructure scaling strategies

Platform & Infrastructure
- Build and maintain highly available, fault-tolerant infrastructure on cloud platforms (AWS/GCP/Azure)
- Drive infrastructure-as-code adoption (Terraform) and enforce best practices
- Architect and implement observability platforms: metrics, logging, tracing, and alerting
- Build and improve CI/CD pipelines, deployment automation, and release engineering workflows
- Lead chaos engineering and game day exercises to validate system resilience
- Champion automation across provisioning, testing, deployment, and monitoring workflows

Leadership, Mentorship & Collaboration
- Mentor and grow a team of SREs, platform engineers, and systems engineers
- Partner with DevOps, security, and product teams to align on shared platform goals
- Serve as the technical escalation point for critical infrastructure incidents and outages
- Communicate complex technical concepts clearly to non-technical stakeholders and leadership
- Contribute to build vs. buy evaluations and drive strategic vendor assessments

REQUIRED QUALIFICATIONS
- 8+ years of experience in SRE, systems engineering, platform engineering, or DevOps roles
- 3+ years in a senior or lead capacity with ownership of large-scale, distributed systems
- Deep expertise in at least one major cloud provider (AWS preferred)
- Strong proficiency in Python, Go, Bash, Java, or C++
- Hands-on experience with Kubernetes, container orchestration, and service mesh technologies
- Solid understanding of Linux/Unix internals, networking (TCP/IP, DNS, TLS/SSL, load balancing)
- Proficiency with observability tooling: Datadog, Prometheus/Grafana, Splunk, or equivalent
- Proven track record defining and operating against SLOs and error budgets
- Experience with infrastructure-as-code tools (Terraform required)
- Strong understanding of distributed systems design, security fundamentals, and data governance

PREFERRED QUALIFICATIONS
- Experience with service mesh (Istio, Linkerd) and API gateways (Kong, Apigee)
- Background in systems integration across enterprise middleware, ERP, or CRM platforms
- Familiarity with FinOps practices and cloud cost optimization
- Experience in regulated industries: financial services, automotive, healthcare, or government
- Familiarity with compliance frameworks: SOC 2, ISO 27001, or NIST
- Track record of leading migrations (legacy-to-cloud or monolith-to-microservices)
- Relevant certifications: AWS Solutions Architect, CKA/CKAD, GCP Professional, or Red Hat RHCA

WHAT WE OFFER

Compensation & Benefits
- Competitive base salary + annual bonus
- Comprehensive health, dental, and vision coverage
- 401(k) with company match
- Generous PTO and paid parental leave

Culture & Growth
- Flexible hybrid work model
- Learning & development budget (conferences, certs, courses)
- Engineering-first culture with direct product impact
- Collaborative teams and transparent leadership

Drug Testing
To be employed in this role, you’ll need to clear a pre-employment drug test.
Cox Automotive does not currently administer a pre-employment drug test for marijuana for this position. However, we are a drug-free workplace, so the possession, use or being under the influence of drugs illegal under federal or state law during work hours, on company property and/or in company vehicles is prohibited.

Benefits
The Company offers eligible employees the flexibility to take as much vacation with pay as they deem c
