Doubleword is a well-funded, VC-backed startup building an inference platform that provides the cheapest tokens on the market for high-volume batch workloads.

The technical challenge is substantial. We orchestrate thousands of concurrent batch jobs while serving queries with sub-second latency, all within a system where reliability is non-negotiable.

We work directly with our users to shape our technical roadmap. Our focus is clear: provide the cheapest and most scalable tokens on the market while maintaining exceptional reliability and developer experience.

The Role

We are looking for someone who elevates the people around them. Someone genuinely excited by hard problems, who loves discussing technical ideas and makes others better through clarity and energy. Someone who cares deeply about both the craft and the people they practice it with.

You will join a small, high-trust team with real autonomy. You will take ownership of complex problems and influence how we design, build, test, and ship software.

What You’ll Be Doing

You will build and scale our batched inference platform, a distributed system that handles thousands of concurrent batch jobs across multiple LLM deployments.

Tech stack

Rust for core services
TypeScript for user interfaces
PostgreSQL for persistent state
Kubernetes for deployment and orchestration

Core areas of work

Database optimization under high load and concurrent access patterns
Distributed job scheduling and retry logic
Real-time observability and monitoring
Designing for failure from the start to build reliability into the system

What We’re Looking For

Requirements

Technically exceptional

Your skills span domains and technologies. You solve genuinely hard problems and have consistently demonstrated this.

Distributed systems experience

You have delivered distributed systems in production. You understand high-throughput, highly parallel architectures and can point to concrete examples of excellent work.

Pragmatic shipper

You move fast while maintaining stability for a large user base.

Humble

You lead by example. You take accountability quickly and say “I don’t know” when appropriate.

Customer-focused

You start from real user problems and deliver technical solutions. You are a problem solver, not a technology purist.

Nice to have

Experience with our stack: Rust, TypeScript, PostgreSQL, Kubernetes
Experience with LLM inference systems or batch processing infrastructure

Our Engineering Principles

We are technically ambitious. Hard problems energize us.
We move fast. Priorities shift and requirements evolve. You should be excited by rapid iteration.
We choose pragmatic solutions over clever ones. The right answer beats the interesting answer.
We operate in ambiguity. Decisions are made with incomplete information and revised when evidence changes.

Interview Process

Technical Culture Interview

30-minute video call with an engineer. We discuss your experience and alignment with our engineering culture.

Wider Culture Interview

30-minute video call with someone outside the tech team. This focuses on company values and how you work with others.

Technical Design Interview

1-hour video call with members of the engineering team. We present a challenge and collaboratively design a system.

Paid One-Day Work Trial

Spend a day working on a real problem from our Batched Inference Server. This gives you a genuine sense of how we operate, and gives us insight into how you approach real-world problems.

Compensation: $1,000.

Offer

If there is strong mutual alignment, we make an offer and you join us on the journey.

Apply

Email your CV and a short note explaining why this role interests you to careers@doubleword.ai.

This job is no longer accepting applications
