Staff Software Engineer
Software Engineering
Bengaluru, Karnataka, India
Our Purpose
At Fiddler, we understand the implications of AI and the impact that it has on human lives. Our company was born with the mission of building trust into AI. The rise of Generative AI and Agents has unlocked generalized intelligence but also widened the risk aperture and made it harder to ensure that AI applications are working well. Fiddler enables organizations to get ahead of these issues by helping them deploy trustworthy and transparent AI solutions.
Fiddler partners with AI-first organizations to help build a long-term framework for responsible AI practices, which, in turn, builds trust with their user base. AI engineering, data science, and business teams use Fiddler AI to monitor, evaluate, secure, analyze, and improve their AI solutions to drive better outcomes. Our platform enables engineering teams and business stakeholders alike to understand the "what", "why", and "how" behind AI outcomes.
Our Founders
Fiddler AI was founded by Krishna Gade (engineering leader at Facebook, Pinterest, Twitter, and Microsoft) and Amit Paka (product leader at Microsoft, Samsung, and PayPal, and two-time founder). We are backed by Insight Partners, Lightspeed Venture Partners, and Lux Capital.
Why Join Us
Our team is motivated to help build trust into AI to enable society to harness the power of AI. Joining us means you get to make an impact by ensuring that AI applications at production scale across industries have operational transparency and security. We are an early-stage startup and have a rapidly growing team of intelligent and empathetic doers, thinkers, creators, builders, and everyone in between. The AI and ML industry has a rapid pace of innovation and the learning opportunities here are monumental. This is your chance to be a trailblazer.
Fiddler is recognized as a pioneer in the field of AI Observability and has received numerous accolades, including a place on the 2022 a16z Data50 list, the 2021 CB Insights AI 100 list of most promising startups, and the 2020 Forbes AI 50 list of most promising startups, as well as recognition as a 2020 WEF Technology Pioneer and a 2019 Gartner Cool Vendor in Enterprise AI Governance and Ethical Response. By joining our brilliant (at least we think so) team, you will help pave the way in the AI Observability space.
At Fiddler, the Integrations Team owns the connective tissue between our customers' AI stacks and our observability platform. We build the OpenTelemetry-native SDKs, framework instrumentation, and ingestion pipelines that capture trace, metric, and model data from any environment — predictive models, LLMs, GenAI, and agentic applications — and land it reliably in Fiddler.
This is a uniquely broad role. You'll operate simultaneously as an SDK/instrumentation lead, an OpenTelemetry expert, a distributed-systems architect, and an AI observability platform builder — owning everything from the decorator a developer adds to their agent, through OTLP ingestion, to the normalized telemetry that powers our product. If you're excited about OpenTelemetry, developer-facing SDKs, and meeting customers where their AI actually runs, this is the team.
🚀 What You'll Do
Own the integration surface end-to-end: Design and build the SDKs, instrumentation libraries, and connectors — Fiddler OTel SDK, LangChain/LangGraph/Strands auto-instrumentation, LiteLLM, OTLP ingestion, and ML-platform connectors (Databricks, MLflow) — that bring customer telemetry and model data into a world-class AI observability platform.
Make OpenTelemetry core, not adjacent: Own Fiddler's OTel and OTLP strategy — instrumentation libraries, collectors, exporters, sampling, and the end-to-end tracing path. Engage with the OpenTelemetry/OpenInference ecosystem and contribute back where it strengthens our integrations.
Define agentic semantic conventions and mapping: Build the canonical abstraction layer that maps heterogeneous, non-deterministic agent frameworks (LangChain, LangGraph, Strands, custom Python agents, any OTLP source) onto Fiddler's unified span and metric model — framework-agnostic instrumentation that "just works."
Lead distributed ingestion and processing systems: Architect the services and microservices that receive, normalize, persist, and expose ML + agentic observability signals (e.g., response relevancy, hallucination scores, multi-agent and tool-call traces) from raw trace data at enterprise scale, with strong guarantees around data integrity, ordering, and backpressure.
Build for enterprise scale and trust: Design scalable data infrastructure, services, and APIs that handle high-throughput ingestion, meet compliance needs, and uphold SLAs.
Champion developer experience: Treat our SDKs as products. Drive ergonomics (zero-boilerplate decorators, one-line setup), backward compatibility, versioning, docs, and reference examples so integrations take minutes, not weeks.
Spearhead new evaluation and metric capabilities: Partner with product and customers on discovery, then build the new metrics and evaluation capabilities that evolving GenAI and agentic use cases demand.
Raise operational maturity: Define and evolve reliability and latency SLOs, along with observability, for the ingestion path. Champion improvements to CI/CD, testing frameworks, error handling, efficiency, and resiliency.
Build the team & culture: Take an active role in growing a world-class engineering team — interviewing, evaluating candidates, mentoring, and coaching.
🎯 What We're Looking For
Education & experience: Master's or Bachelor's in Computer Science or a related field, plus 10+ years of industry experience with a demonstrated, solid foundation in software development.
OpenTelemetry depth: Hands-on experience with OpenTelemetry / OTLP — instrumentation, collectors, semantic conventions, sampling, and the broader tracing ecosystem. You understand telemetry generation and the systems that consume it.
Backend & ingestion depth: Deep proficiency with Python and strong command of the technologies that power high-throughput ingestion — Kafka, Postgres, Redis — including the ability to design, build, and debug complex, large-scale streaming and storage systems.
Integration & SDK mindset: Proven experience building developer-facing SDKs, client libraries, plugins, or data connectors, with strong instincts for API and contract design across systems you don't fully control.
Adaptability & ownership: Proven ability to thrive in ambiguity and a fast-paced environment. A self-motivated initiator who takes ownership with high autonomy and confidently fills gaps when the full picture isn't available.
System design & optimization: Strong grasp of distributed systems and the ability to troubleshoot production issues. Nice to have: cloud infrastructure (AWS/GCP, Kubernetes) and specialized databases (ClickHouse/Druid).
Technical leadership & collaboration: Demonstrated ability to plan, execute, and deliver by breaking complex problems into manageable tasks and guiding a small team. Adept at cross-functional collaboration across a geographically distributed team — partnering with product managers, designers, frontend developers, and data scientists, and at times directly with customers and integration partners.
Coaching & mentorship: An excellent collaborator and mentor who raises the technical bar for the whole team and regularly engages in code and design reviews.
Good to Have
Experience deploying and working with ML/LLM models in production, and comfort with modern LLM and agentic frameworks (e.g., LangChain, LangGraph, Strands, HuggingFace, vLLM, LiteLLM) and evaluation frameworks (e.g., Ragas, MLflow).
Familiarity with ML platform ecosystems (Databricks, MLflow, SageMaker, Vertex AI) and the integration patterns that connect them to observability tooling.
Contributions to open-source observability or instrumentation projects (OpenTelemetry, OpenInference, or similar).
The posted range represents the expected salary range for this job requisition and does not include any other potential components of the compensation package and perks previously outlined. Ultimately, in determining pay, we'll consider your experience, leveling, location, and other job-related factors.
Fiddler is proud to be an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. If you require special accommodations in order to complete the interviews or perform job duties, please inform the recruiter at the beginning of the process.
Beware of job scams. Our recruiters use @fiddler.ai email addresses exclusively. In the US, we do not conduct interviews via text or instant message, or ask for sensitive personal information such as bank account or social security numbers.