Data Warehouse for Computer Vision

Distributed Data Systems Founding Engineer

$120K - $200K / 0.50% - 2.00%
San Francisco, CA
3+ years
Sammy Sidhu

About the role

At Eventual, we are looking for key individuals who are excited to build a new product and grow with our company. Our team is small but packs a punch, and we come from backgrounds in genomics, self-driving and high performance computing. We love working on large, ambitious projects but also enjoy a strong sense of camaraderie and never forget to work hard/play hard!

Key Responsibilities

As a Distributed Data Systems Engineer, you will be a founding member of the Eventual team. Your primary responsibilities will be building out Daft (our open-source framework, which we aim to make the de facto solution for building applications on complex data such as images, audio and video) as well as the infrastructure that supports it on the Eventual Cloud Platform.

Some projects that you can expect to work on include:

  1. Query optimization: improving Daft’s distributed execution plans
  2. Code generation: translating high-level Daft Dataframe query plans into instructions that can run on various backends, including Spark, Ray, and the Eventual Cloud
  3. Workload scheduling: building systems that schedule workloads for maximal resource utilization
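As a simplified illustration of the first project area, a classic query optimization is pushing filters beneath projections so that rows are discarded before columns are materialized. This sketch uses hypothetical class and function names for a toy logical plan; it is not Daft's actual internals:

```python
from dataclasses import dataclass

# Toy logical-plan nodes (illustrative names, not Daft's real internals).
@dataclass
class Scan:
    table: str

@dataclass
class Filter:
    child: object
    predicate: str

@dataclass
class Project:
    child: object
    columns: list

def push_down_filters(plan):
    """Rewrite Filter(Project(x)) into Project(Filter(x)) so rows are
    dropped before columns are materialized."""
    if isinstance(plan, Filter) and isinstance(plan.child, Project):
        proj = plan.child
        return Project(Filter(push_down_filters(proj.child), plan.predicate),
                       proj.columns)
    return plan

# Before: filter sits above the projection.
plan = Filter(Project(Scan("images"), ["url", "label"]), "label == 'cat'")
# After: the filter has been pushed below the projection.
optimized = push_down_filters(plan)
```

A real optimizer applies many such rewrite rules (predicate pushdown, projection pruning, join reordering) over a distributed plan, but the rule-as-tree-rewrite shape is the same.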

We are a young startup, so be prepared to wear many hats: tinkering with infrastructure, talking to customers and participating heavily in the core design process of our product!

What we look for

We are looking for a candidate with a strong foundation in systems programming and experience building distributed data systems (e.g. Hadoop, Spark, Dask, Ray).

Our ideal candidate has:

  1. 3+ years of experience building the internals of distributed data systems (workload pipelining, scheduling, networking, fault tolerance, etc.)
  2. Experience with workload orchestrators such as Kubernetes, Mesos or YARN
  3. Strong fundamentals in systems programming and Linux
  4. The ability to work well in small, focused teams with fast iteration and lots of autonomy

Nice to haves that will help in your day-to-day work:

  1. Experience with containerization technologies
  2. Experience building compilers or query optimizers
  3. Experience with GPUs and CUDA
  4. Experience building production machine learning systems

Benefits and Remote Work

We believe in the flexibility of remote work but also in the importance of in-person collaboration, especially at the earliest stages of a startup. We take a flexible hybrid approach, with at least 3 days of in-person work per week, typically Wednesday through Friday, at our office in San Francisco.

We believe in providing employees with best-in-class compensation and benefits, including meal allowances and comprehensive health coverage (medical, dental, vision and more).

About the interview

15-minute phone screen

A short screen over video call with one of our cofounders (either Sammy or Jay) for us to get acquainted, understand your aspirations and evaluate whether the role is a good fit for what you are looking for.

Technical Interviews

Our technical interviews for this role focus on understanding your technical knowledge of distributed data processing.

60-minute data engineering design interview

A technical interview to understand your familiarity with the internals of a distributed data engineering system.

60-minute systems programming interview

A technical interview to understand your familiarity with systems programming and Linux.

Get to know us

As many chats as necessary to get to know us - come have a coffee with our cofounders and existing employees to understand who we are and our goals, motivations and ambitions.

We look forward to meeting you!

About Eventual

Eventual: The Data Warehouse for Computer Vision

Eventual is building an integrated development experience for data scientists and engineers to query, process and build applications on Complex Data (non-tabular data such as images, video, audio and 3D scans).


Daft (https://www.getdaft.io) is our open-source Python dataframe API for working with Complex Data. With Daft, users can query and transform their data interactively in a notebook environment, running workloads such as analytics, data preprocessing and machine learning model training/inference. The same transformations performed on the dataframe can then be deployed as an HTTP service to respond to incoming requests, helping our users go from experimentation to productionization faster than ever before.

Eventual Cloud Platform

The Eventual Cloud Platform provides an integrated development environment for our users to go from local development to production. We provide:

  1. Notebooks for interactive data science with Daft
  2. Fully-managed cluster computing infrastructure to run large distributed Daft workloads
  3. Application deployment as services or automated jobs

About Us

Eventual (YC W22) is funded by investors such as Caffeinated Capital, Array.vc and top angels in the valley from Databricks, Meta and Lyft. Our team has deep expertise in high performance computing, big data technologies, cloud infrastructure and machine learning.

Team Size: 5
Location: San Francisco
Jay Chia
Sammy Sidhu