
sync. – AI lip sync tool for video content creators

from the team behind the original Wav2Lip library

TL;DR: we’ve built a state-of-the-art lip-sync model – and we’re building towards real-time face-to-face conversations w/ AI indistinguishable from humans 🦾

try our playground here: https://app.synclabs.so/playground

how does it work?

theoretically, our models can support any language — they learn phoneme / viseme mappings: how the most basic units (“tokens”) of the sounds we make map to the shapes our mouths form to produce them. it’s simple, but a start toward learning a foundational understanding of humans from video.
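to make the idea concrete, here’s a toy sketch of a phoneme-to-viseme lookup. the classes and names below are simplified illustrations of the concept — real lip-sync models *learn* this mapping (and its timing) from video rather than using a hand-written table:

```python
# toy illustration: a hand-written phoneme -> viseme lookup.
# the viseme labels here are simplified approximations, not the model's
# learned representation -- models infer these mappings from video.
PHONEME_TO_VISEME = {
    # bilabials: lips pressed together
    "P": "lips_closed", "B": "lips_closed", "M": "lips_closed",
    # labiodentals: lower lip touches upper teeth
    "F": "lip_to_teeth", "V": "lip_to_teeth",
    # rounded vowels
    "OW": "rounded", "UW": "rounded",
    # open vowels
    "AA": "open", "AE": "open",
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels; unknowns fall back to 'neutral'."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

# e.g. the word "move" -> phonemes M UW V
print(phonemes_to_visemes(["M", "UW", "V"]))
# -> ['lips_closed', 'rounded', 'lip_to_teeth']
```

note that many different phonemes collapse onto the same mouth shape — which is exactly why the mapping generalizes across languages.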

why is this useful?

[1] we can dissolve language as a barrier

check out how we used it to dub the entire 2-hour Tucker Carlson interview with Putin speaking fluent English.

imagine millions gaining access to knowledge, entertainment, and connection — regardless of their native tongue.

realtime at the edge takes us further — live multilingual broadcasts + video calls, even walking around Tokyo w/ a Vision Pro 2 speaking English while everyone else speaks Japanese.

[2] we can move the human-computer interface beyond text-based-chat

keyboards / mice are lossy + low-bandwidth. human communication is rich and goes beyond just the words we say. what if we could compute w/ a face-to-face interaction?

maybe embedding context around expressions + body language in inputs / outputs would help us interact w/ computers in a more human way. this thread of research is exciting.

[3] and more

powerful models small enough to run at the edge could unlock a lot:

e.g.

extreme compression for face-to-face video streaming

enhanced, spatial-aware transcription w/ lip-reading

detecting deepfakes in the wild

on-device real-time video translation

etc.

who are we?

Prady Modukuru [CEO] | led product for a research team at Microsoft that made Defender a $350M+ product, taking MSR research into production and moving it from the bottom of the market to #1 in industry evals.

Rudrabha Mukhopadhyay [CTO] | PhD, CVIT @ IIIT-Hyderabad, co-authored Wav2Lip + 20+ major publications w/ 1200+ citations in the last 5 years.

Prajwal K R [CRO] | PhD, VGG @ University of Oxford w/ Andrew Zisserman, prev. Research Scientist @ Meta, authored multiple breakthrough research papers (incl. Wav2Lip) on understanding and generating humans in video.

Pavan Reddy [COO/CFO] | 2x venture-backed founder/operator, built the first smart air purifier in India, prev. monetized sota research @ IIIT-Hyderabad, engineering @ IIT Madras.

how did we meet?

Prajwal + Rudrabha worked together at IIIT-Hyderabad — and became famous by shipping the world’s first model that could sync the lips in any in-the-wild video to any audio, in any language, no training required.

they formed a company w/ Pavan and then worked w/ the university to monetize state-of-the-art research coming out of the labs and bring it to market.

Prady met everyone online — first by hacking together a viral app around their open-source models, then collaborating on product + research for fun, and finally cofounding sync. + going mega-viral.

since then we’ve hacked irl across 4 different countries and both US coasts, and moved into a hacker house in SF together.


what’s our ask?

try out our playground and API and let us know how we can make it easier to understand and simpler to use 😄

play around here: https://app.synclabs.so/playground