Pachyderm: Data Versioning, Data Pipelines, and Data Lineage

Sr. Python Engineer - Integrations at Pachyderm

San Francisco Bay Area or Remote / Remote
Job Type
1+ years
Joe Doliner

About the role

About Pachyderm

At Pachyderm, we're building an open-source, enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. Our system, developed with open-source roots, shifts the paradigm of data science workflows by providing reproducibility, data provenance, and the opportunity for true collaboration. Pachyderm uses modern technologies like Docker and Kubernetes to build an entirely new method of analyzing data. Offered both as an in-house solution and as a hosted service, Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines while empowering users to use any language, framework, or tool they want. If you want to learn more about our grand vision, read what has become our "manifesto."

Pachyderm is a rapidly growing, early-stage company funded by top VCs: Benchmark, Decibel, M12, and Y Combinator. Like many modern companies, Pachyderm embraces a “remote-first” approach to growing our team. It gives us a huge advantage in hiring top, diverse talent across the country while giving our team members the flexibility to work from anywhere.

You can check out our product on GitHub because it’s open-source and try our cloud service for free.

The Role

Love Docker, Python, Golang, and distributed systems?

Pachyderm is hiring distributed systems engineers to help lay the foundation for key integrations with the core Pachyderm platform, a distributed version-controlled filesystem and data processing engine. You’ll solve hard systems problems every day and build the abstractions and APIs that enable rapid integration with products such as machine learning frameworks, model serving frameworks, notebooks, and IDEs.

While your primary focus will be building the core product, you’ll also have direct exposure to users, enterprise customers, and partners. At Pachyderm, feedback from OSS users and customers is a major driver of our product roadmap. As part of the integrations team, your input will have a significant influence on the product direction. You will own the relationships with our key partners and several key software components that help drive Pachyderm adoption.

Pachyderm is a small team right now, so you'd be getting in on the ground floor and would have an enormous impact on the success and direction of the company and product.

We offer significant equity, full benefits, and all the usual startup perks.


2+ years of experience working in distributed systems, data infrastructure, back-end systems, or related development work. A major contribution to a prominent, related open-source project is a plus and can replace work experience in some circumstances (e.g. you’re a student just finishing your degree).

While not a strict requirement, Python development experience is a bonus. Programming languages are just part of your arsenal, and we’ve found that great engineers have no problem learning new tools.

Must have strong communication skills when talking about technical concepts. Our interview process strongly tests for communication, as we have a very collaborative work environment where many parts of the codebase interact in complex ways. Things change quickly as our product develops, and breaking down major features into smaller, more easily executable PRs is an essential skill.

Why you should join Pachyderm