A World-Changing Company
Palantir builds the world's leading software for data-driven decisions & operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, & more.
The Role
Apollo is Palantir's autonomous software management & deployment platform. It enables seamless, continuous delivery of mission-critical software (Foundry, Gotham, AIP) across a vast range of environments: on-prem, public cloud, disconnected (air-gapped) networks, & highly regulated settings (including IL-5 & FedRAMP).
As a Software Engineer on the Apollo team, you'll build & operate a large-scale distributed system to allow the remote operation & maintenance of Kubernetes clusters. Our mission is to extract the entire state of a cluster into a portable, high-performance artifact within minutes, enabling full & almost instant cluster reconstruction from the ground up, all while pushing the limits of speed, reliability, & scale.
You'll design & implement backup & restore solutions for Kubernetes, leveraging proprietary compression infrastructure tailored to Palantir's unique deployment models. You'll also build & optimize our container artifact store, which is based on the OCI (Open Container Initiative) distribution spec, the industry standard for storing & distributing container images & artifacts. You'll own the backbone of every environment Apollo supports, from hyperscalers to Army trucks.
If you're excited by challenges at the intersection of container technologies (such as OCI & Docker), storage, & distributed systems, you'll find opportunities here to dive deep into storage formats & low-level optimizations, where milliseconds matter. As we increasingly automate cluster creation & management on diverse hardware, you'll play a key role in scaling Palantir's presence at the edge & solving tough distributed systems problems. You'll own the full development lifecycle, from idea generation & design through implementation to operation & support, while collaborating closely with both technical & non-technical stakeholders to deliver robust, impactful infrastructure. Using AI tools (Claude Code, Codex, Copilot) is highly encouraged!