Under the Hood: How AI Robot Works
Hi, Product Hunters!
This isn't your standard animation app with pre-baked video assets. Everything you see is AI-generated, live.
1. The "Brain": Google Gemini 2.5 Flash
When a user inputs a prompt (e.g., "Do a backflip holding a beer"), we send this intent to Gemini. Instead of returning plain text, the AI generates a JSON object containing executable JavaScript code.
This code contains the specific mathematical formulas (Math.sin, cos, lerp) required to calculate joint angles (shoulders, knees, hips) frame-by-frame.
The AI dynamically selects props, determining whether to render a standard Emoji or generate a custom SVG path.
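To make the flow concrete, here is a minimal sketch of what such a structured response could look like. The field names (poseCode, prop, caption) and the sample values are illustrative assumptions, not the app's actual schema:

```typescript
// Hypothetical shape of the JSON the model returns.
// Field names are invented for illustration.
interface AnimationResponse {
  poseCode: string; // JS function body: (t) => joint angles in degrees
  prop:
    | { kind: "emoji"; char: string }
    | { kind: "svg"; path: string }; // custom SVG path data
  caption: string;
}

// A sample response, as it might arrive from the API:
const raw = `{
  "poseCode": "return { shoulder: Math.sin(t * 2) * 45, knee: 30, hip: 10 };",
  "prop": { "kind": "emoji", "char": "🍺" },
  "caption": "Physics is just a suggestion."
}`;

const res: AnimationResponse = JSON.parse(raw);
console.log(res.prop.kind); // "emoji"
console.log(res.caption);
```

Returning structured JSON rather than free text is what lets the client treat the model's output as data: the pose code, the prop choice, and the caption each land in a known slot.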
2. The "Engine": React & Procedural Animation
The app is built on React 19. The generated code is "compiled" and executed within a requestAnimationFrame loop (targeting 60fps).
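The compile-then-loop step can be sketched like this. The generated string and helper names are assumptions; the browser loop is shown in comments, and a fixed 60fps timestep stands in for requestAnimationFrame here:

```typescript
type Pose = Record<string, number>;

// In the app, this string would come back from Gemini.
const generated =
  "return { shoulder: Math.sin(t * 3) * 60, knee: Math.abs(Math.cos(t * 3)) * 40 };";

// One-time "compile": the string of code becomes a callable function.
const poseAt = new Function("t", generated) as (t: number) => Pose;

// Browser version (roughly):
//   const tick = (now: number) => {
//     applyPose(poseAt(now / 1000)); // update SVG joint transforms
//     requestAnimationFrame(tick);   // browser schedules ~60fps
//   };
//   requestAnimationFrame(tick);

// Sampling a few frames at a 60fps timestep:
const frames: Pose[] = [];
for (let frame = 0; frame < 3; frame++) {
  frames.push(poseAt(frame * (1 / 60)));
}
console.log(frames[0]); // pose at t = 0
```

Because the pose is a pure function of time, every frame is computed fresh; nothing is keyframed or stored.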
We utilize an Inverse Kinematics (IK) approach: the code calculates exactly where the hands and feet need to be, and the 'bones' (SVG lines) connect these points dynamically.
This ensures the animation is always smooth, infinitely scalable (vector/SVG), and completely unique every time.
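A two-bone limb (hip to knee to foot) can be solved analytically with the law of cosines. This is a generic sketch of that technique, not the app's actual solver; the function name and segment lengths are invented:

```typescript
interface Vec { x: number; y: number }

// Given a root joint, a target for the end effector, and the two
// segment lengths, return the mid joint (knee/elbow) position.
function solveLimb(root: Vec, target: Vec, upper: number, lower: number): Vec {
  let dx = target.x - root.x;
  let dy = target.y - root.y;
  // Clamp the root->target distance to the reachable range.
  const d = Math.min(
    Math.max(Math.hypot(dx, dy), Math.abs(upper - lower) + 1e-6),
    upper + lower - 1e-6
  );

  // Law of cosines: angle at the root between the target line
  // and the upper segment.
  const a = Math.acos((upper * upper + d * d - lower * lower) / (2 * upper * d));
  const base = Math.atan2(dy, dx);

  // base - a bends one way; base + a would bend the other.
  return {
    x: root.x + upper * Math.cos(base - a),
    y: root.y + upper * Math.sin(base - a),
  };
}

// The SVG "bones" are then just two lines: root->knee and knee->foot.
const hip = { x: 0, y: 0 };
const foot = { x: 0, y: 80 }; // within reach of 50 + 50
const knee = solveLimb(hip, foot, 50, 50);
console.log(knee); // a point 50 units from both hip and foot
```

Because the joints are points and the bones are vectors between them, the whole figure re-renders crisply at any scale.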
3. The "Voice": Contextual Captions
Beyond movement, Gemini also generates a matching, witty caption for social sharing. This caption is context-aware, adapting to both the specific action and the selected Robot "Skin" (e.g., speaking like a Samurai, a Cyberpunk hacker, or a Cat).
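One way to contextualize that request is to fold the skin's persona into the prompt. The persona text and skin names below are invented for illustration:

```typescript
// Hypothetical persona snippets keyed by robot skin.
const personas: Record<string, string> = {
  samurai: "Speak in stoic, honor-bound one-liners.",
  cyberpunk: "Speak like a jaded netrunner, heavy on slang.",
  cat: "Speak like a smug cat. Mention napping if possible.",
};

function captionPrompt(action: string, skin: keyof typeof personas): string {
  return (
    `Write one witty caption (max 80 chars) for a robot that just did: ` +
    `"${action}". ${personas[skin]}`
  );
}

console.log(captionPrompt("a backflip holding a beer", "samurai"));
```

The same action then yields a differently voiced caption per skin, with no extra model calls beyond the one request.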
4. Sharing: Client-side GIF Encoding
To share these moments as GIFs, we capture the SVG state frame-by-frame into a memory buffer. We use gifenc to encode these frames into an optimized GIF blob entirely in the browser. No heavy server-side processing required!
Tech Stack:
Frontend: React 19, TypeScript, Tailwind CSS
AI: Google Gemini API (via @google/genai SDK)
Icons: Lucide React
Rendering: Raw SVG manipulation