Show HN: HuMo AI – Multi-modal human-centric video generator (text+image+audio)
What it does

• Text→Video with controllable motion and scene composition.
• Image→Video to animate a still with natural movement and camera motion.
• Audio-visual sync for speech-driven lip movement and rhythm-matched motion.
• Multi-modal fusion: combine text + reference images + audio in one run (see the sketch after this list).
• Export-ready output: high resolution (up to 4K) in common aspect ratios.
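To make the fusion part concrete, here's a minimal sketch of what one combined run could look like from Python. Everything in it is hypothetical: the endpoint, field names, and the generate_video helper are placeholders for illustration, not the actual HuMo API.

    # Hypothetical client sketch -- names and endpoint are illustrative,
    # not the real HuMo API.
    from pathlib import Path
    import requests

    def generate_video(prompt: str, ref_image: Path, audio: Path,
                       resolution: str = "1920x1080") -> bytes:
        """Send one multi-modal request: text + reference image + audio."""
        with ref_image.open("rb") as img, audio.open("rb") as wav:
            resp = requests.post(
                "https://api.humo.example/v1/generate",  # placeholder URL
                data={"prompt": prompt, "resolution": resolution},
                files={"reference_image": img, "audio": wav},
            )
        resp.raise_for_status()
        return resp.content  # encoded video bytes

    video = generate_video(
        "a product explainer host speaking to camera",
        Path("host.png"), Path("voiceover.wav"))
    Path("out.mp4").write_bytes(video)

The point is the shape of the workflow: one request carries all three modalities instead of chaining separate T2V, motion, and lip-sync tools.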
Why

Creative teams often juggle multiple tools: one for T2V, another for in-between motion, a third for lip-sync. I wanted a single studio that keeps identity consistent and aligns visuals with audio, which is useful for product explainers, character spots, and quick social posts.
What’s different

• Built around multi-modal conditioning rather than single-input T2V.
• Emphasis on identity/subject preservation across the whole clip.
• Frame-level audio alignment for more natural lips and motion (sketched below).
• Workflow extras like shot lists and playbooks to speed up iteration.
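On frame-level alignment: the idea is that each video frame is conditioned on the slice of audio that co-occurs with it, rather than on a single clip-level embedding. A rough sketch of that bookkeeping, with assumed shapes and rates (this is the general technique, not HuMo's internals):

    # Sketch of frame-level audio alignment -- assumed approach,
    # not HuMo's actual implementation.
    import numpy as np

    def align_audio_to_frames(audio_feats: np.ndarray, audio_hz: float,
                              n_frames: int, fps: float) -> np.ndarray:
        """Pool audio features into one conditioning vector per video frame.

        audio_feats: (T_audio, D) features sampled at audio_hz (e.g. 50 Hz).
        Returns (n_frames, D), one vector per frame.
        """
        per_frame = []
        for i in range(n_frames):
            # Audio window covering this frame's time span [i/fps, (i+1)/fps).
            start = int(i / fps * audio_hz)
            end = max(start + 1, int((i + 1) / fps * audio_hz))
            per_frame.append(audio_feats[start:end].mean(axis=0))
        return np.stack(per_frame)

    feats = np.random.randn(250, 768)  # 5 s of 50 Hz audio features
    cond = align_audio_to_frames(feats, 50.0, n_frames=120, fps=24.0)
    assert cond.shape == (120, 768)    # one conditioning vector per frame

Conditioning per frame this way is what lets lip shapes and motion track the rhythm of the audio instead of drifting against it.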