Principal Engineer - Open Source Data Platform (ODP)
Acceldata
About the Role
We are seeking a Principal Engineer with a minimum of 14 years of experience to serve as a technical visionary and thought leader for the Acceldata Open Data Platform (ODP). In this role, you will define the long-term technical strategy, lead the most complex architectural initiatives, and represent Acceldata in the global open-source and distributed systems community. You will work directly with engineering leadership and executives to shape product direction while driving innovation across the organization.
This is a full-time, on-site position, open to candidates with valid work authorization.
Why Join Us
As a Principal Engineer at Acceldata, you will be at the forefront of shaping the future of enterprise data platforms. You'll define the technical vision for systems that power mission-critical workloads across the world's largest organizations. Your decisions will influence not just our platform, but the broader data ecosystem through open-source contributions and industry leadership.
You'll work alongside Apache members, committers, and industry veterans who are passionate about solving the hardest problems in distributed computing. This is a rare opportunity to combine deep technical impact with strategic influence, building technology that matters while shaping the direction of a growing company in the data observability space.
Responsibilities
- Define and drive the long-term technical strategy and architecture for the Open Data Platform, aligning with business objectives and industry trends.
- Own the design of the most complex, high-impact systems; establish architectural principles and patterns that scale across the organization.
- Identify emerging technologies and industry trends; lead research and development initiatives that position Acceldata at the cutting edge of data platform innovation.
- Serve as a recognized leader in the open-source community; drive Apache project contributions, represent Acceldata at conferences, and influence project roadmaps.
- Collaborate with the CTO, VP of Engineering, and product leadership to translate business strategy into technical execution; provide technical due diligence for strategic initiatives.
- Influence engineering practices, tools, and culture across multiple teams; establish best practices that elevate the entire engineering organization.
- Mentor Staff Engineers and Senior Engineers; develop technical leadership capabilities across the organization.
- Lead resolution of the most challenging technical problems spanning architecture, performance, scalability, and reliability.
- Engage with strategic customers and partners on complex technical discussions; translate customer needs into platform capabilities.
- Drive alignment across engineering, product, and operations on technical decisions with broad organizational impact.
- Work across diverse environments: bare metal, VMs, Kubernetes, multi-cloud, and hybrid architectures at enterprise scale.
Mandatory Skills & Qualifications
- 12+ years of hands-on software development experience, with at least 8 years focused on distributed systems, big data platforms, or data infrastructure.
- Proven track record of leading large-scale technical initiatives from conception to production across multiple teams.
- Expert-level proficiency in Java or Scala; strong skills in Python and systems languages.
- Deep expertise in distributed computing, including consensus protocols, distributed transactions, data replication, partitioning strategies, and optimization with modern table formats.
- Extensive experience architecting and scaling systems using Hadoop, Spark, Hive, Trino, Kafka, Flink, and related technologies at production scale (hundreds to thousands of nodes).
- Demonstrated ability to design and evolve complex systems that handle petabyte-scale data with high availability and performance requirements.
- Expert knowledge of cloud-native architectures, Kubernetes orchestration, and multi-cloud deployment patterns.
- Track record of diagnosing and resolving complex distributed system issues, including performance optimization, resource management, and failure mode analysis.
- Significant contributions to major open-source projects; experience working with distributed global teams and open-source governance models.
- Exceptional written and verbal communication skills; proven ability to influence technical direction across organizations and with external stakeholders.
- Ability to balance long-term technical vision with near-term delivery requirements; experience making build vs. buy decisions.
Desired Skills (Bonus)
- PMC member or committer status in Apache projects (e.g., Spark, Kafka, Hive, Hadoop, Iceberg, Flink, Trino).
- Speaker at major conferences (ApacheCon, Spark Summit, Kafka Summit, QCon, etc.); published papers or widely read technical content.
- Experience building or contributing to query engines, optimizers, or execution frameworks.
- Deep experience with modern lakehouse architectures, table formats (Iceberg, Delta, Hudi), and data mesh patterns.
- Experience with ML infrastructure, feature stores, or MLOps platforms.
- Experience scaling engineering organizations in high-growth environments.
- Master's or PhD in Computer Science, with a focus on distributed systems, databases, or related fields.