Task-Specific LLMs that can run 40x faster than OpenAI
This is a founding role. You will work directly with Venkat (the CTO) to engineer an end-to-end framework for building fast, custom LLMs for our customers. You will be expected to contribute across the stack, including model architecture research, latency optimization at both the software and hardware levels, cloud infrastructure, and the product roadmap. You will also work closely with our customers on onboarding, support, and keeping them happy.
We're looking for people with demonstrable experience in deep learning research, cloud computing, and software/hardware co-design. You should also have strong organizational and problem-solving skills.
A Bachelor's degree in computer science or engineering and experience building ML products are required. Graduate-level work in deep learning and transformer models is strongly preferred. If you have experience building a strong engineering culture at an early-stage startup, we'd love to talk with you!
Meru gives developers everything they need to build low-latency AI applications that are hosted on-premises and run on CPUs. We're building a future where AI is democratized, personalized, and private.