Join us at Fastino as we build the next generation of LLMs and agentic systems. Our team, boasting alumni from Google Research, Apple, Stanford, and Cambridge, is on a mission to develop specialized, efficient AI.
Fastino has raised $25M (as featured in TechCrunch) through our seed round and is backed by leading investors including Microsoft, Khosla Ventures, Insight Partners, GitHub CEO Thomas Dohmke, Docker CEO Scott Johnston, and others.
The Senior Engineer will work closely with the two co-founders, steering discussions around product architecture and leading product delivery.

Key Responsibilities:

Architect, design, and build scalable agentic systems that serve millions of users, using LLMs + serverless / cloud-native services.
Manage robustness of agent subsystems: memory & context management, tool invocation, agent orchestration, etc.
Build data pipelines, evaluation frameworks, and testing infrastructure to validate agent performance and behavior.
Implement backend services (APIs, event pipelines, edge / lightweight inference) and coordinate with the frontend team on delivering the end product.
Design systems for observability, telemetry, error handling, fallback policies, and guardrails for agent operations.
Mentor existing engineers; gradually take on technical leadership responsibilities (architecture design and review, code ownership, hiring).
Contribute to internal research or publication efforts as appropriate; stay current with the state of the art in LLMs, agents, RL, planning, etc.
Participate in product roadmap discussions with founders and research team, translating new ideas into deliverable systems.

Requirements:

5+ years of professional software engineering experience, including shipping complex products into production environments at scale.
BS/MS/PhD in Computer Science, Systems Engineering, Machine Learning, or a related technical field, or equivalent industry experience.
Strong foundation in modern systems engineering: distributed systems, service-oriented or microservice architectures, cloud infrastructure (AWS/GCP/Azure), serverless compute, networking, observability, CI/CD, and containerization (Docker, Kubernetes).
Demonstrated ability to design systems for scalability, resilience, performance, and cost efficiency in real-world deployments.
Hands-on experience building and deploying production ML/LLM systems, including inference pipelines, model serving, caching, monitoring, and integration with APIs and third-party services.
Proficiency in Python and at least one systems language (Go, Rust, Java, or C++).
Familiarity with data engineering and pipeline technologies (e.g., Kafka, Airflow, Spark) and modern developer toolchains.
Experience implementing security, compliance, and data privacy best practices for production systems.

Why Join Us?

Supportive Environment: Benefit from the resources of Microsoft and venture funding, collaborating with top-tier talent from renowned universities.
Top-Tier Compute: Enjoy a dedicated GPU cluster for research.
Impactful Work: Your contributions will directly shape the future of AI applications, making technology more accessible, eco-friendly, and developer-friendly.
Competitive Benefits: Receive competitive salary, stock options, health benefits, and more.
