Head of Customer Reliability & Support
Nexus Cognitive
We are seeking a Head of Customer Reliability & Support to build and lead a new function at the intersection of reliability engineering, technical support, and customer success. This leader will combine the rigor of SRE with the scale of Support to ensure our platform is enterprise-ready, reliable, and trusted, while also providing world-class customer care.
You will establish the foundations of a Customer Reliability Engineering (CRE) practice while overseeing our Support organization. This means defining proactive customer health and reliability programs and ensuring efficient, high-quality responses to day-to-day customer issues. This is a unique opportunity to create a hybrid function that blends proactive reliability engagement and reactive support excellence into one customer-facing team.
Responsibilities
Establish & Lead the Function
- Build the company’s first Customer Reliability & Support team, defining mission, charter, and operating model.
- Blend SRE practices (SLOs, error budgets, post-mortems, readiness criteria) into a customer-facing reliability function.
- Design and scale the Support function (processes, tooling, SLAs, self-service) to meet enterprise expectations.
- Create playbooks and engagement models that span Support operations and strategic reliability engagements.
Customer Reliability & Engagement
- Ensure visibility into Customer Health & Reliability Reviews (CHRs) with strategic accounts: participate, stay informed, and guide outcomes.
- Empower Customer Success Managers and account leaders to own and lead CHRs, while Support provides data, insights, and reliability context.
- Act as the executive technical counterpart when appropriate with customer CTOs, platform heads, and reliability leads.
- Translate recurring customer issues into platform engineering priorities to prevent recurrence.
Support Excellence & Escalation Management
- Oversee global customer support operations, including L1–L3 ticket handling and resolution SLAs.
- Build and lead a dedicated Escalation Management Team to handle high-severity incidents and customer-critical escalations.
- Ensure escalation processes are well defined and metrics-driven, and that they minimize executive firefighting.
- Build a knowledge base and self-service model to improve ticket deflection and customer enablement.
- Implement metrics-driven support management: CSAT, resolution times, backlog health, deflection rates.
- Drive seamless handoffs between Support, CRE, Engineering, and Success teams.
Platform & Engineering Partnership
- Partner with Product and Engineering to ensure the platform is enterprise-grade: scalable, secure, and reliable.
- Bring customer reliability insights into roadmap planning.
- Lead joint post-mortems with customers and internal teams, ensuring accountability and follow-through.
Leadership & Team Development
- Hire and mentor a hybrid team of support engineers, escalation managers, reliability engineers, and customer-facing architects.
- Develop talent to balance technical depth with customer engagement skills.
- Define KPIs that link support excellence and reliability impact directly to retention and expansion.
Requirements
- 7+ years in engineering, support, or technical customer leadership roles; 3+ years leading global teams.
- Proven success leading both customer support and technical reliability/SRE practices.
- Deep knowledge of distributed systems, cloud infrastructure (AWS/GCP/Azure), Kubernetes, and observability practices.
- Strong executive presence: able to build trust with engineering teams and C-level customer stakeholders.
- Track record of driving retention, trust, and expansion through technical engagement.
- Experience scaling support operations (ticketing, knowledge bases, global coverage) in high-growth environments.
- Demonstrated ability to translate reliability into business value (uptime, resilience, cost efficiency).
Nice to Have
- Experience with modern data/AI infrastructure (Kafka, Iceberg, Trino, Spark, Airflow).
- Background in regulated industries (financial services, healthcare, telco).
- Exposure to CRE models (e.g., Google CRE) and customer-facing SRE practices.