Hi HN, we are Zaid, Muhammad and Hammad, the co-founders of Uplift AI (https://upliftai.org). We build models that speak underserved languages — today: Urdu, Sindhi, and Balochi.

A billion people worldwide can't read. In countries like Pakistan – the 5th most populous country – 42% of adults are illiterate. This holds back the entire economy: patients can't read medical reports, parents can't help with homework, banks can't go fully digital, farmers can't research best practices, and people memorize smartphone app button sequences. Voice AI interfaces can fix all of this, and we think this will perhaps be one of the great benefits of modern AI.

Right now, existing voice models barely work for these languages, and big tech is moving slowly.

Uplift AI was originally a side project to make datasets for translation and voice models. For us it was a "cool side-thing" to work on, not an "important full-time thing" to work on. With some initial data we hacked together an Urdu voice bot on WhatsApp and gave it to one domestic worker. In two days 800 people were using it. When we dug deeper into understanding the users, we learned that text interfaces don't work for so many people. So we started Uplift AI to solve this problem full-time.

The most challenging part is that all the building blocks needed for great voice models are broken for these languages. For example, if you are creating a speech synthesis model, you will scrape a lot of data from YouTube and auto-label it using a transcription model… all very easy to do in English. But it doesn't work in under-served languages because the transcription models are not accurate.
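To make "not accurate" concrete: one standard way to check auto-labels is to hand-transcribe a small sample and compute the character error rate (CER) of the model's output against it. The sketch below is a generic illustration, not our actual tooling; the function names and the 0.1 threshold are assumptions picked for the example.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def keep_clip(reference: str, auto_label: str, max_cer: float = 0.1) -> bool:
    """Hypothetical filter: keep a clip only if its auto-label is close
    enough to the hand-made reference. The threshold is an assumption."""
    return cer(reference, auto_label) <= max_cer
```

If the average CER on the hand-labeled sample is high, auto-labeling the rest of the scraped audio will mostly add noise.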

There are many other challenges. For example, when you hire human transcribers to label the data, they often don't have spell checkers for their languages, and this creates lots of noise in the data… making it hard to train models with little data. There are many more challenges in phonemes, silence detection, diacritization, etc.
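One small, well-known piece of this kind of label noise is that the same Urdu word can be typed with different Unicode code points depending on the keyboard. A minimal sketch of a normalization pass, assuming Urdu text only; the two-entry mapping is illustrative, not a complete table and not our internal tooling. It would be wrong for Sindhi, where some of these Arabic letters are the correct ones.

```python
import unicodedata

# Illustrative mapping of Arabic code points that often show up in Urdu text
# to the forms conventionally used in Urdu. Deliberately incomplete.
URDU_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize_label(text: str) -> str:
    """Reduce spurious variation in a transcript before training on it."""
    # NFC composes sequences such as alef + combining madda into one code point.
    text = unicodedata.normalize("NFC", text)
    # Map look-alike Arabic letters to their Urdu counterparts.
    text = text.translate(URDU_CHAR_MAP)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())
```

This does not replace a spell checker. It only removes variation that is invisible to the transcriber but looks like different tokens to a model.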

We solve these problems by making great internal tooling to help with data labeling. Also, we source our own data and don't buy it. This is counterintuitive, but it is a big advantage over companies that buy data and then train on it. By sourcing our own data we create the right data distributions and get much better models with much less data. By doing the entire thing in-house (data, labeling, training, deploying), we are able to make much faster progress.

Today we publicly offer text-to-speech APIs for Urdu, Sindhi, and Balochi. Here's a video which shows this: https://www.loom.com/share/dcd5020967444c228e9c127151e7a9f5.

Khan Academy is using our tech to dub videos to Urdu (https://ur.khanacademy.org).

Our models excel at informational use cases (like AI bots) but need more work on emotive use cases like poetry.

We have been giving a lot of people private access in beta mode, and today are launching our models publicly. We believe this will be the fastest way for us to learn about areas that are not performing well so we can fix them quickly.

We'd love to hear from all of you, especially around your experiences with under-served languages (not just the Pakistani ones we're starting with) and your comments in general.

Comments (12)

nojs · 3m ago

Nice, this is really needed. Would be cool to see some of the less common regional Chinese dialects, which are widely spoken and often the only language people speak. And even just more accurate regional accents for Mandarin.

moinism · 9m ago

Congrats on the launch! Having support for regional voices is going to open up so many opportunities.

zaidqureshi · 1m ago

Agreed!

pavlov · 1h ago

Nice! Clearly a big and underserved market for voice AI solutions.

Would be nice to have some code examples for using your TTS API with Pipecat.

zaidqureshi · 1h ago

I still have to make that. I did make one for LiveKit, which uses our WebSocket API designed for real-time conversation:

https://docs.upliftai.org/tutorials/livekit-voice-agent

akshayp29 · 1h ago

Pretty cool! Do you think the model would be good at other under-served languages as well? Or is it hypertuned to just these?

zaidqureshi · 1h ago

The model itself can work well for new languages. It's just that gathering data and keeping its quality high is what we have to figure out as we scale across languages.

Currently the model is only given data for these languages so it doesn't know anything else.

akshayp29 · 1h ago

Cool - makes sense!

sanman8119 · 1h ago

Would love to see Malayalam here one day!

zaidqureshi · 1h ago

Yes! I will keep track of this comment for the day we do :P

yorwba · 57m ago

Unless that happens within a week or so, this thread will be locked and you won't be able to reply anymore.

It would be good to have a company blog with an RSS feed that people can subscribe to for updates.

zaidqureshi · 48m ago

ah, created a quick google form for language requests! https://forms.gle/XA6nZbmBNK5K7GJv5

Launch HN: Uplift (YC S25) – Voice models for under-served languages
