Launch HN: Golpo (YC S25) – AI-generated explainer videos

24 skar01 38 8/13/2025, 5:11:35 PM video.golpoai.com ↗

Hey HN! We’re Shraman and Shreyas Kar, building Golpo (https://video.golpoai.com), an AI generator for whiteboard-style explainer videos, capable of creating videos from any document or prompt.

We’ve always made videos to communicate any concept and felt like it was the clearest way to communicate. But making good videos was time-consuming and tedious. It required planning, scripting, recording, editing, syncing voice with visuals. Even a 2-minute video could take hours.

AI video tools are impressive at generating cinematic scenes and flashy content, but struggle to explain a product demo, walk through a complex workflow, or teach a technical topic. People still spend hours making explainer videos manually because existing AI tools aren’t built for learning or clarity.

Our solution is Golpo. Our video generation engine generates time-aligned graphics with spoken narration that are good for onboarding, training, product walkthroughs, and education. It’s fast, scalable, and built from the ground up to help people understand complex ideas through simple storytelling.

Here’s a demo: https://www.youtube.com/watch?v=C_LGM0dEyDA#t=7.

Golpo is built specifically for use cases involving explaining, learning, and onboarding. In our (obviously biased!) opinion, it feels authentic and engaging in a way no other AI video generator does.

Golpo can generate videos in over 190 languages. After it generates a video, you can fully customize its animations by just describing the changes you want to see in each motion graphic it generates in natural language.

It was challenging to get this to work! Initially, we used a code-generation approach with Manim, where we fine-tuned a language model to emit Python animation scripts directly from the input text. While promising for small examples, this quickly became brittle, and the generated code usually contained broken imports, unsupported transforms, and poor timing alignment between narration and visuals. Debugging and regenerating these scripts was often slower than creating them manually.

We also explored training a custom diffusion-based video model, but found it impractical for our needs. Diffusion could produce high-fidelity cinematic scenes, but generating coherent sequences beyond about 30 seconds was unreliable without complex stitching, making edits required regenerating large portions of the video, and visuals frequently drifted from the instructional intent, especially for abstract or technical topics. Also, we did not have the compute to scale this.

Existing state-of-the-art systems like Sora and Veo 3 face similar limitations: they are optimized for cinematic storytelling, not step-by-step educational content, and they lack both the deterministic control needed for time-aligned narration and the scalability for 5–10 minute explainers.

In the end, we took a different path of training a reinforcement learning agent to “draw” whiteboard strokes, step-by-step, optimized for clear, human-like explanations. This worked well because the action space was simple and the environment was not overly complex, allowing the agent to learn efficient, precise, and consistent drawing behaviors.

Here are some sample videos that Golpo generated:

https://www.youtube.com/watch?v=33xNoWHYZGA (Whiteboard Gym - the tech behind Golpo itself)

https://www.youtube.com/watch?v=w_ZwKhptUqI (How do RNNs work?)

https://www.youtube.com/watch?v=RxFKo-2sWCM (function pointers in C)

https://golpo-podcast-inputs.s3.us-east-2.amazonaws.com/file... (basic intro to Gödel's theorem)

You can try Golpo here: https://video.golpoai.com, and we will set you up with 2 credits. We’d love your feedback, especially on what feels off, what you’d want to control, and how you might use it. Comments welcome!

Comments (38)

delbronski · 1h ago

Wow, I was skeptical at first, but the result was pretty awesome!

Congrats! Cool product.

Feedback: I tried making a product explainer video for a tree planting rover I’m working on. The rover looked different in every scene. I can imagine this kind of consistency may be more difficult to get right. Maybe if I had uploaded a photo of how the rover looks it may have helped. In one scene the rover looks like an actual rover, in the other it looks like a humanoid robot.

But still, super impressed!

skar01 · 14m ago

Thanks! We are working on the consistency.

adi4213 · 1h ago

This is neat but I wasn’t able to get it to work (server overloaded is what the browser app said) I’d also recommend registering a custom domain in Supabase so the Google SSO shows the golpo domain - which is a small, but professional-signaling affordance

tk90 · 34m ago

Pretty cool, especially the voice and background music - feels just right.

I asked it about pointers in Rust. The transcript and images were great, very approachable!

"Do not let your computer sleep" -> is this using GPU on my machine or something?

reactordev · 1h ago

This is actually pretty amazing. Not only does it work, it’s good. At least from the demo videos. YMMV.

What I always wanted to do was to teach what I know but I lack the time commitment to get it out. This might be a way…

skar01 · 1h ago

Thank you so much!

OG_BME · 42m ago

I created a video on the free tier, the shareable link didn't work (404), I upgraded to be able to download it, and it seems to have disappeared? It says "Still generating" in my Library.

The video UUID starts with "f5fbd6c7", hopefully that's sufficient to identify me!

skar02 · 38m ago

Sorry about that! I found your video. Should I put it here or DM it to you (can you do DM in Hacker News?) ? You could also email me at shreyas2@stanford.edu, and I can send it there

Lienetic · 39m ago

This is really interesting, definitely going to give it a try! What do you think is the actual market opportunity here? How'd you land on this being something that people really want and actually becoming a large, defensible business?

drawnwren · 37m ago

I'm sure someone else has mentioned this but your video on the main page correctly has GRPO the first time it's introduced but then every time you mention it after that -- you've swapped it to GPRO.

mclau157 · 1h ago

I have used AI in the past to learn a topic but by creating a GUI with input sliders and output that I can see how things change when I change parameters, this could work here where people can basically ask "what if x happens" and see the result which also makes them feel in control of the learning

ishita159 · 1h ago

Planning to add links as input anytime soon?

I would love to add a link to my product docs, upload some images and have it generate an onboarding video of the platform.

skar02 · 1h ago

Yes, very soon. We already support this via API and will add to our platform too!

ceroxylon · 59m ago

The generated graphic in the linked demo for "Training materials that captivate" is a sketch of someone looking forlorn while holding a piece of paper. Is there a way to do in-line edits to the generated result to polish out things like this?

typs · 1h ago

If that demo video is how it actually works, this is a pretty amazing technical feat. I’m definitely going to try this out.

Edit: I've used. It's amazing. I'm going to be using this a lot.

skar01 · 1h ago

Hey also, if you want to suggest a video, we could try generating one and reply here with a link! Just tell us what you want the video to be about!!

cube2222 · 1h ago

Hey, kudos for the product / demo on the website - it managed to keep me engaged to watch it till the end.

I’m mostly curious how it fairs with more complex topics and doing actually informative (rather than just “plain background”) illustrations.

Like a video explaining transformer attention in LLMs, to stay on the AI topic?

skar01 · 21m ago

Yeah so it actually does pretty well. Here are some sample videos:

https://www.youtube.com/watch?v=33xNoWHYZGA&t=1s https://www.youtube.com/watch?v=w_ZwKhptUqI

WasimBhai · 1h ago

I have 2 credits but it won't let me generate a video. Founders, if you are around, you may want to debug.

skar02 · 55m ago

Huh, that's odd. Could you DM me your email?

skar01 · 10m ago

Or just email us at founders@golpoai.com

poly2it · 1h ago

The creator tier ($99.99/mo) lists "15 seconds" as a perk. Does this mean the maximum video length is 15 seconds?

bangaladore · 1h ago

Given that the next tier up is "Create longer/more detailed video (up to 4 min long)", I'd guess you are right.

Seems like this is pretty useless unless you pay 200$ per month. Which may be a reasonable number for the clearly commercial / enterprise use case, but I'm just not certain what you can do wtih the lower tiers.

skar02 · 1h ago

One of the founders here! No it's not. The max video length is up to 2 min, which is also the case in any non-free tier. You just have the 15-second option for that tier (which people need for things like FB ads)

nextworddev · 55m ago

Has anyone tried prompting VEO to create these videos

skar02 · 53m ago

We have! Veo I believe, can't do more than 8-second videos, and when prompted they aren't very coherent in our experience.

nextworddev · 4m ago

oh had no idea. will try your product

KaoruAoiShiho · 1h ago

Did NotebookLM just come out with this? Very tough to compete with google.

CalRobert · 1h ago

So it eats concepts and makes videos?

One is reminded of smbc

https://www.seekpng.com/png/detail/213-2132749_gulpo-decal-f...

skar02 · 48m ago

Haha! The name actually comes from the word story in Bengali.

metalliqaz · 1h ago

So... if I had the enterprise accounts for various LLM services, could I dupe this company with a basic upload page and a nice big prompt?

Wolf_Larsen · 1h ago

Its not that simple, but it would be straight forward to duplicate the outputs of this with a simple LLM + ffmpeg workflow. They did mention a custom model on the landing page, and if they've trained one then you would be spending much more money on each output than they are. Because without a fine-tuned model there would be a lot of inference done for QA and refinement of each prompt | clip | frame .

Lienetic · 1h ago

I'm curious - do you feel differently about some of these coding and coding-adjacent tools out there like Cursor and Lovable?

metalliqaz · 3m ago

no, not really. I think they are massively over-valued but in the tech world... what else is new? I view those tools as mostly a convenience. They are integrating things into nice easy packages to use. That's the value.

With this... eh. Most people don't need to make more than one or two explainer videos, so are they going to take on a new monthly fee for that? And then there are power users who do it all the time, but almost surely have their own workflow put together that is customized to exactly what they want.

At any point, one of the big players could introduce this as a feature for their main product.

subhro · 32m ago

From one Kar to another, দূর্দান্ত গল্প Congratulations.

skar02 · 15m ago

Thanks!

metalliqaz · 1h ago

My suggestion would be to re-think the demo videos. I have only watched most of the way into the "function pointers in C" example. If I didn't already know C well, I would not be able to follow that. The technical diagrams don't stay on the screen long enough for new learners to process the information. These videos probably look fantastic to the person who wrote the document it summarizes, but to a newbie the information is fleeting and hard to follow. The machine doesn't understand that the screen shouldn't be completely wiped all the time while it follows the narrative. Some visuals should be static for paragraphs, or stay visible while detail marked up around it. For a true master of the art, see 3blue1brown.

bangaladore · 1h ago

> For a true master of the art, see 3blue1brown.

I agree. Rather than (what I assume is) E2E text -> video/audio output, it seems like training a model on how to utilize the community fork of manim which 3blue1brown uses for videos would produce a better result.

[1] https://github.com/ManimCommunity/manim/

Nginx Introduces Native Support for Acme Protocol (blog.nginx.org)

29 years later, Settlers II gets Amiga release (gamingretro.co.uk)

I chose OCaml as my primary language (xvw.lol)

FFmpeg 8.0 adds Whisper support (code.ffmpeg.org)

April Fools 2014: The *Real* Test Driven Development (2014) (testing.googleblog.com)

Launch HN: Golpo (YC S25) – AI-generated explainer videos (video.golpoai.com)

Cross-Site Request Forgery (words.filippo.io)

PYX: The next step in Python packaging (astral.sh)

So what's the difference between plotted and printed artwork? (lostpixels.io)

ReadMe (YC W15) Is Hiring a Developer Experience PM (readme.com)

Pebble Time 2* Design Reveal (ericmigi.com)

Coalton Playground: Type-Safe Lisp in the Browser (abacusnoir.com)

PCIe 8.0 Announced by the PCI-Sig Will Double Throughput Again – ServeTheHome (servethehome.com)

OpenIndiana: Community-Driven Illumos Distribution (openindiana.org)

DoubleAgents: Fine-Tuning LLMs for Covert Malicious Tool Calls (pub.aimind.so)

This website is for humans (localghost.dev)

New treatment eliminates bladder cancer in 82% of patients (news.keckmedicine.org)

The Mary Queen of Scots Channel Anamorphosis: A 3D Simulation (charlespetzold.com)

We caught companies making it harder to delete your personal data online (themarkup.org)

A case study in bad hiring practice and how to fix it (tomkranz.com)

Man develops rare condition after ChatGPT query over stopping eating salt (theguardian.com)

Claude says “You're absolutely right!” about everything (github.com)

Gartner's Grift Is About to Unravel (dx.tips)

Honky-Tonk Tokyo (2020) (afar.com)

Mesmerizing Hypnoloid, a Kinetic Desktop Sculpture (core77.com)

Claude Sonnet 4 now supports 1M tokens of context (anthropic.com)

Bezier-rs – algorithms for Bézier segments and shapes (graphite.rs)

Nearly 1 in 3 Starlink satellites detected within the SKA-Low frequency band (astrobites.org)

Pebble Time 2 Design Reveal [video] (youtube.com)

F-Droid build servers can't build modern Android apps due to outdated CPUs

The Rock Art of Serrania De La Lindosa (earthasweknowit.com)

Supporting org.apache.xml.security in graalVM (guust.ysebie.be)

How Stock Options Work (web.stanford.edu)

Steam payment headaches grow as PayPal is no longer usable for much of the world (pcgamer.com)

Search all text in New York City (alltext.nyc)

Fingerjigger (fingerjigger.com)

Improving Geographical Resilience for Distributed Open Source Teams with Freon (soatok.blog)

US national debt reaches a record $37T, the Treasury Department reports (apnews.com)

Fennel libraries as single files (2023) (andreyor.st)

How Well Do Coding Agents Use Your Library? (stackbench.ai)

The Incompleteness of Ethics (aeon.co)

Visualizing quaternions: An explorable video series (2018) (eater.net)

A gentle introduction to anchor positioning (webkit.org)

How Much Is Trump Profiting Off the Presidency? (newyorker.com)

From Here? (dirtyfeed.org)

A Comprehensive Survey of Self-Evolving AI Agents [pdf] (arxiv.org)

The Missing Protocol: Let Me Know (deanebarker.net)

When DEF CON partners with the U.S. Army (jackpoulson.substack.com)

Study: Social media probably can't be fixed (arstechnica.com)

Show HN: Building a web search engine from scratch with 3B neural embeddings (blog.wilsonl.in)

Launch HN: Golpo (YC S25) – AI-generated explainer videos

Comments (38)

April Fools 2014: The Real Test Driven Development (2014) (testing.googleblog.com)