
TL;DR: General Instinct distills, quantizes, and deploys large frontier AI models onto constrained hardware such as drones, old PCs, and robots.
Problem:
Physical AI (robots, drones, IoT) teams can now get frontier models to do impressive things in the lab. The hard part is getting those models to work reliably in production.
Some common patterns:
1. Models work well in the cloud but fail on the edge in production.
2. Custom use cases, like robots operating in the wild or in space, require on-device inference.
3. Using Gemini for everything is simply too expensive.
We built a custom, no-code pipeline to solve this problem.
What we’re building:
Our first product: Instinct Edge. Give us a model, a target device, and a latency budget. We return an offline runtime that hits the budget on hardware like Jetson, mobile NPUs, ARM CPUs, Apple Neural Engine, and Snapdragon chips.
Under the hood, this combines compression recipes, custom CUDA / Metal / ARM NEON kernels, and a continuous data-to-serving pipeline.
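To make "compression" concrete, here is a minimal sketch of one widely used technique, symmetric per-tensor int8 quantization, in plain Python. It is an illustration only and not the Instinct Edge recipe: the function names `quantize_int8` and `dequantize` and the sample weights are assumptions made up for this example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative only).

    Maps floats to integers in [-127, 127] using a single scale factor.
    Returns the integer values and the scale needed to recover floats.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from int8 values and a scale."""
    return [v * scale for v in q]


# Hypothetical sample weights, chosen only for the demonstration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest step bounds the per-value error by about scale / 2.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most about half a quantization step per value. Production pipelines typically go further, for example with per-channel scales and calibration data.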
One production example: a multimodal classifier on Jetson Orin NX with 111 ms cold start, 100% of decisions inside a 150 ms budget, and zero cloud calls.
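A claim like "100% of decisions inside a 150 ms budget" can be checked with a simple timing harness. The sketch below is one way to do that, under stated assumptions: `fraction_within_budget`, `measure_latencies_ms`, and `fake_model` are hypothetical names invented for this example, and `fake_model` stands in for a real on-device inference call.

```python
import time


def fraction_within_budget(latencies_ms, budget_ms):
    """Return the share of measured latencies that meet the budget."""
    if not latencies_ms:
        raise ValueError("need at least one latency measurement")
    hits = sum(1 for t in latencies_ms if t <= budget_ms)
    return hits / len(latencies_ms)


def measure_latencies_ms(run_inference, inputs):
    """Time each call with a monotonic clock; results are in milliseconds."""
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        run_inference(x)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def fake_model(x):
    # Hypothetical stand-in for a real on-device inference call.
    return x * 2


latencies = measure_latencies_ms(fake_model, range(100))
share = fraction_within_budget(latencies, budget_ms=150.0)
print(f"{share:.0%} of calls within budget, worst case {max(latencies):.3f} ms")
```

Reporting the worst case alongside the share matters for robots and drones, because a single missed deadline can count for more than a good average.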
How you can help:
If you or anyone you know is running large vision models on the edge and having trouble fitting them onto your hardware, let us know by emailing founders@general-instinct.com!