Lead Site Reliability Engineer
Tricentis
As a Lead Site Reliability Engineer at Tricentis, you'll drive engineering excellence across core SRE domains, platform, and shared infrastructure. You will work autonomously on strategic initiatives, lead technical programs across teams, and shape the reliability vision across the SRE organisation.
This role sits between Lead and Principal. It is designed for someone ready to operate at a group-wide level while staying hands-on and deeply technical.
Your Impact as a Lead SRE 🚀
Define technical strategy for cross-cutting reliability initiatives, influencing team and org-wide decisions.
Drive architectural improvements in multi-tenant SaaS infrastructure with focus on observability, scalability, and security.
Design systemic solutions to prevent recurrence of production incidents.
Define and evolve SLOs, SLIs, and KPIs (MTTR, MTBF, etc.) for core services and advocate for reliability best practices across teams.
Lead high-impact engineering efforts such as platform standardization, security automation, and self-service tooling.
Mentor senior and lead SREs, grow technical leadership, and raise the bar for system design and operational maturity.
Partner with product, engineering, and business stakeholders to align technical work with company goals and customer outcomes.
As a valuable member of our SRE team, you'll have the opportunity to 💪
Set the technical north star for critical infrastructure domains and coordinate adoption across teams.
Identify and drive systemic improvements in developer experience, security posture, and service reliability.
Evaluate and advocate for or against modern technologies based on tradeoffs, scalability, and business context.
Proactively contribute to hiring senior talent and help raise the technical bar through high-signal interviews and mentorship.
Lead large-scale infrastructure programs that span teams and departments, delivering long-term leverage.
Represent Tricentis at external forums via blog posts, conference talks, or community engagement.
Our Tech Stack 🌐
Azure, AWS, Terraform, GitHub Actions, Kubernetes, DataDog, Prometheus, Grafana, incident.io, Betterstack, Jira
Our Culture 🦄
We value autonomy, technical excellence, and outcome-driven leadership. In our SRE team, you'll find a space where proactive ownership is celebrated, and engineers are empowered to drive real business impact. Collaboration, psychological safety, and continuous learning are the cornerstones of how we work.
About You 🎯
Proven experience leading infrastructure initiatives across multiple teams or business units.
6+ years of engineering experience, with deep hands-on skills in cloud architecture (AWS or Azure), Kubernetes, and infrastructure as code (Terraform).
Expert understanding of system design, fault tolerance, observability, and incident management.
Strong opinions on the modern infrastructure landscape, with the ability to communicate tradeoffs clearly.
Trusted technical voice across disciplines—able to work effectively with engineers, executives, and external stakeholders.
Strong track record of mentoring staff-level engineers and raising engineering quality standards.
Experience driving security-focused or platform-wide initiatives at scale is a plus.
Fluent in English.
You can look forward to:
Flexible working schedule (no core hours)
Learning and career growth opportunities
25 days paid time off
3 sick days
2 days of paid volunteering leave per year to get involved in your local community or in a cause that matters to you
Hybrid work environment, with home-office allowance
Meal allowance
Pension contribution
Life and disability insurance
Paid sickness leave
A team of passionate professionals who are experts in their fields
Events for employees to learn, celebrate and socialize (training sessions, hackathons, parties, sports events, board game gatherings, BBQs) and much more