Pachyderm

Pachyderm: Data Versioning, Data Pipelines, and Data Lineage

Software Development Engineer in Test at Pachyderm

Location
San Francisco, CA OR Remote Anywhere US / Remote
Job Type
Full-time

About the role

About Pachyderm

At Pachyderm, we're building an open-source, enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. If you want to learn more about our grand vision, read what has become our "manifesto." Our system, developed with open-source roots, shifts the paradigm of data science workflows by providing reproducibility, data provenance, and the opportunity for true collaboration. Pachyderm uses modern technologies like Docker and Kubernetes to build an entirely new method of analyzing data. Offered both as an in-house solution and as a hosted service, Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines while empowering users to use any language, framework, or tool they want.

What it’s like being part of The Pach

Pachyderm is a rapidly growing Series B company funded by top VCs: Benchmark, Decibel, M12, and Y Combinator. Pachyderm has always embraced, and will always embrace, a remote-first approach to growing our team. This allows us to hire a diverse group of individuals across the country (and world!) while giving our team members the flexibility to work from anywhere.

Being a member of The Pach means joining a supportive team that cares about you, values kindness, and works hard to create an open and transparent workplace.

Pachyderm is still small, so joining means you are getting in on the ground floor and will have an enormous impact on the success and direction of the company and product.

The Role

Love testing distributed systems? 

Pachyderm is hiring a Software Development Engineer in Test to help us architect and build out the framework for testing the core product - a distributed, version-controlled filesystem and data processing engine. You’ll be working on challenging distributed systems testing problems every day and helping us build a first-of-its-kind containerized data infrastructure platform.

Your primary focus will be building the automation infrastructure, tools, and framework for testing the core product. In addition, you will help lay out sustainable practices within engineering to continually raise the quality bar. At Pachyderm, OSS user and customer feedback is a major driver of our product roadmap, and we believe that everyone within the company should experience that first-hand. In this role, you will be the customer voice within engineering. This is expected to be a development- and automation-focused role, with some amount of manual testing. You will have an outsized impact in making your stakeholders (developers, customer team, and ultimately customers) successful and happy.

In this role, you will use Docker, Kubernetes, Go, Python, CI systems, various cloud providers, and more.

You will:

  • Design, develop, run, and maintain automated testing frameworks, tools, and infrastructure
  • Test the product for performance, resiliency, security, scalability, and reliability
  • Understand the end-to-end configuration, technical dependencies, code paths, and overall behavioral characteristics of the platform
  • Own the performance and longevity benchmarks
  • Analyze and understand existing test coverage and test cases, identifying opportunities for redesign, replacement, reusability, and improvement in efficiency and performance
  • Define and inspire changes to our product with our development engineering team based on feedback from tests and customer issues
  • Develop and contribute to internal and external knowledge bases
  • Care about developer happiness and be a champion for our customers
  • Go above and beyond to ensure customers are getting the most out of their investment in the Pachyderm platform 

Qualifications:

  • Experience working in a continuous integration / continuous delivery development environment
  • Experience working with Kubernetes and Docker automation
  • Experience with automation in distributed systems
  • Strong programming skills and experience (Go, Java, Python, C++)
  • Strong communication skills when discussing technical concepts
  • Professional experience in databases and/or distributed systems
  • BS in CS (or equivalent technical degree) and 5+ years of relevant work experience (QA/Automation/Development)

Benefits:

  • Significant equity, 401(k), and full benefits (100% medical, 99% dental and vision, 50% for all dependents).
  • Flexible PTO - work/life balance is important and we want you to take time off to rejuvenate!
  • Remote-friendly - we were remote before remote was cool, and we intend to keep investing in a remote-first culture.
  • Tons of fun swag and surprise packages sent to your doorstep. 
  • Tech and office stipends - what you buy is yours to keep.
  • Education and donation stipends - we want to support your career growth and the community.
  • Supportive parental leave (see also: work/life balance).
  • Encouraged fun - game days, fun activities, Zoom hangouts, and more (and, when responsible, visits to our home base for team on-sites).

We can’t wait to meet you and hope you’ll join The Pach!
