Data Versioning, Data Pipelines, and Data Lineage
Pachyderm is the leader in data versioning and pipelines for MLOps. We’re building the data foundation that allows data science teams to automate and scale their machine learning lifecycle while guaranteeing reproducibility. With over $40 million in three rounds of funding from leading investors like Benchmark, Microsoft M12, and Y Combinator, Pachyderm is committed to building industrial-strength capabilities for data-centric AI. Pachyderm offers a commercial Enterprise Edition and an open-source Community Edition. Pachyderm helps customers get their ML and AI projects to market faster, lower their data processing and storage costs, and meet strict data governance requirements.
Pachyderm is growing fast but is still small, so joining now means getting in on the ground floor and having an enormous impact on the success and direction of the company and product. Pachyderm has always embraced, and will always embrace, a “Remote-first” approach to growing our team. This allows us to hire a diverse group of individuals across the country (and world!) while giving our team members the flexibility to work from anywhere. Being a member of The Pach means joining a supportive team that cares about you, values kindness, and works hard to create an open and transparent workplace.
You'll be working closely with our systems engineering and design teams to ship a core product that will define the future of the business. Your main task will be developing the web application that will become the "face" of the product and one of the primary ways for data scientists to build and manage their data and pipelines. This includes working on both the frontend and the GraphQL APIs.
The long and short of it is, if you're looking to make a big impact on a small team that works on open-source software and delivers an enterprise-grade product, then this role is for you. You can check out our product on GitHub.
We offer significant equity, full benefits, and all the usual startup perks.
We can’t wait to meet you and hope you’ll join The Pach!
At Pachyderm, we're building an open-source, enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. If you want to learn more about our grand vision, read what has become our "manifesto." Our system, developed with open-source roots, shifts the paradigm of data science workflows by providing reproducibility, data provenance, and the opportunity for true collaboration. Pachyderm uses modern technologies like Docker and Kubernetes to build an entirely new method of analyzing data. Offered both as an in-house solution and as a hosted service, Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines, while letting users work with any language, framework, or tool they want.