Show HN: Localize React apps without rewriting code
React app localization typically requires implementing i18n frameworks, extracting text to JSON files, and wrapping components in translation tags - essentially rewriting your entire codebase before you can even start translating.
Our React bundler plugin eliminates this friction entirely. You add it to an existing React app, specify which languages you want, and it automatically makes your app multilingual without touching a single line of your component code.
Here's a video showing how it works: https://www.youtube.com/watch?v=sSo2ERxAvB4. The docs are at https://lingo.dev/en/compiler, and sample apps are at https://github.com/lingodotdev/lingo.dev/tree/main/demo.
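To give a feel for the setup, here's roughly what wiring the plugin into a Vite project looks like. The import path and option names below are written from memory and may not match the current API exactly, so treat this as a sketch and defer to the docs above.

```ts
// vite.config.ts - rough sketch; option names are approximate, see lingo.dev/compiler
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";
import lingoCompiler from "lingo.dev/compiler";

export default defineConfig(() =>
  // Wrap the existing config: components stay in the source language,
  // target locales are generated at build time.
  lingoCompiler.vite({
    sourceLocale: "en",
    targetLocales: ["es", "ja"],
  })({
    plugins: [react()],
  })
);
```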
Last year, a dev from our Twitter community told us: "I don't want to wrap every React component with `<T>` tags or extract strings to JSON. Can I just wrap the entire React app and make it multilingual?"
Our first reaction was "That's not how i18n works in React." But a couple of hours later, we found ourselves deep in a technical rabbit hole, wondering: what if that actually was possible?
That question led us to build the "localization compiler" - a middleware for React that plugs into the codebase, processes the Abstract Syntax Tree of the React code, deterministically locates translatable elements, feeds every context boundary into LLMs, and bakes the translations back into the build, making the UI multilingual in seconds.
Everything happens locally at build time, keeping the React project as the source of truth. No code modifications, no extraction, and no maintenance of separate translation files are needed; overrides, however, are still possible via data-lingo-* attributes.
Building this was trickier than we expected. Beyond traversing React/JS abstract syntax trees, we had to solve some challenging problems. We wanted to find a way to deterministically group elements that should be translated together, so, for example, a phrase wrapped in the `<a>` link tag wouldn't get mistranslated because it was processed in isolation. We also wanted to detect inline function calls and handle them gracefully during compile-time code generation.
For example, our localization compiler identifies this entire text block as a single translation unit, preserving the HTML structure and context for the LLM:
```
function WelcomeMessage() {
  return (
    <div>
      Welcome to <i>our platform</i>! <a href="/start">Get started</a> today.
    </div>
  );
}
```
The biggest challenge was making our compiler compatible with Hot Module Replacement. This allows developers to code in English while instantly seeing the UI in Spanish or Japanese, which is invaluable for catching layout issues caused by text that expands or contracts in different languages.
For performance, we implemented aggressive caching that stores AST analysis results between runs and only reprocesses components that have changed. Incremental builds stay fast even on large codebases, since at any point in time a developer touches only a limited number of components, and we heavily parallelized the LLM calls.
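As a simplified illustration of the incremental part (this is a sketch with made-up names, not the actual implementation):

```ts
import { createHash } from "node:crypto";

// Sketch: cache AST analysis + translations per file content hash, so only
// components that changed since the last build get reprocessed.
type Unit = { source: string; context: string };

const cache = new Map<string, Unit[]>();

async function processFile(
  code: string,
  extractUnits: (code: string) => Unit[],             // AST analysis (expensive)
  translateBatch: (units: Unit[]) => Promise<Unit[]>  // batched LLM calls
): Promise<Unit[]> {
  const key = createHash("sha256").update(code).digest("hex");
  const hit = cache.get(key);
  if (hit) return hit; // unchanged component: skip both analysis and LLM calls

  const translated = await translateBatch(extractUnits(code));
  cache.set(key, translated);
  return translated;
}

// Changed files are then processed concurrently (e.g. via Promise.all),
// which is where parallelizing the LLM calls pays off.
```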
This approach was technically possible before LLMs, but practically useless, since for precise translations you'd still need human translators familiar with the product domain. However, now, with context-aware models, we can generate decent translations automatically.
We're excited about finally making it production ready and sharing this with the HN community.
Run `npm i lingo.dev`, check out the docs at lingo.dev/compiler, try breaking it, and let us know what you think about this approach to React i18n!
This application does not handle many important considerations for translation, such as pluralization. In many languages there are multiple ways to pluralize words; Russian has many different plural forms. More problems will occur when you have words within words.
There is no way to do this without changing your codebase. I think what would work better is if you could generate ICU-compliant JSON.
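For anyone who hasn't run into this: Russian needs four CLDR plural categories, which is exactly what ICU MessageFormat (and hence ICU-compliant JSON) encodes. A quick illustration using built-in APIs:

```ts
// CLDR plural categories for Russian, via the built-in Intl API:
const ru = new Intl.PluralRules("ru-RU");
ru.select(1);   // "one"   -> "1 товар"
ru.select(3);   // "few"   -> "3 товара"
ru.select(5);   // "many"  -> "5 товаров"
ru.select(1.5); // "other" -> "1,5 товара"

// The corresponding ICU MessageFormat string an ICU-compliant JSON file would carry:
const msg =
  "{count, plural, one {# товар} few {# товара} many {# товаров} other {# товара}}";
```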
How are you supposed to have this work in Japanese when it's RTL instead of LTR? That will introduce both UI and localization challenges.
I think using AI to do translation will be fine for startups, but I'm not sure how well this will work on real production apps. I think significant work will be required to actually get this working:
https://stelejs.com
Besides pluralization (e.g. Arabic having 6 forms: zero/one/two/few/many/other), it turned out that number internationalization and currency conversion are the big challenges the community wants to address next.
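For numbers and currencies, the formatting half is already handled by the platform (actual currency conversion, i.e. exchange rates, is a separate problem); for example:

```ts
const n = 1234567.89;

new Intl.NumberFormat("de-DE", { style: "currency", currency: "EUR" }).format(n);
// "1.234.567,89 €"

new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" }).format(n);
// "$1,234,567.89"

new Intl.NumberFormat("ja-JP", { style: "currency", currency: "JPY" }).format(n);
// "￥1,234,568" (yen has no minor unit, so the amount is rounded)
```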
> create ICU compliant JSON.
I think this is an excellent idea. I have a feeling that in the future we will need ICU v2.0, sort of, but unfortunately it's an extremely hard problem and the probability of failure is pretty high (it looks like Project Fluent is not actively maintained anymore: https://github.com/projectfluent/fluent).
Depends on the medium. EPUB 2.0 (and later revisions) specifically supports vertical RTL text for use-cases like Japanese novels. Additionally, many novel reading websites support toggling between vertical and horizontal text. Vertical text implicitly switches to RTL text direction.
Of course, this is not a general use case. But saying "modern Japanese is LTR" is not quite accurate. Computer / digital media is commonly LTR and horizontal, but a single step outside exposes one to vertical text, and RTL text in physical newspapers, literature, comics, a subset of textbooks, and handwritten signs that try to look "traditional" in style.
And since we had to exclude certain terms like "Lingo.dev Compiler" itself from i18n, we've shipped support for data-lingo-skip and data-lingo-override-<locale-code> as well.
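To make that concrete, usage looks roughly like this. The markup and copy are illustrative, and I'm assuming the override value goes in the attribute itself; see the docs for the exact semantics.

```tsx
export function Footer() {
  return (
    <footer>
      {/* data-lingo-skip: keep the product name as-is in every locale */}
      <span data-lingo-skip>Lingo.dev Compiler</span>
      {/* data-lingo-override-<locale>: pin a hand-written string for one locale */}
      <a href="/start" data-lingo-override-es="Comenzar ahora">
        Get started
      </a>
    </footer>
  );
}
```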
Regarding using LLMs for production content localization, I recommend checking out how Reddit translates their entire user-generated content base in 35 languages using AI:
https://techcrunch.com/2024/09/25/reddit-is-bringing-ai-powe...
If it already works great for Reddit today, I believe it's safe to assume it will become accessible to the wider audience soon as well.
1. Targeting fbt (Meta's internal i18n tool)
2. Used CST (<3 ast-grep) instead of AST - really useful here IMO esp. for any heuristic-based checks.
3. Fun fact: this was made entirely on my phone (~2.5h) while I was walking around Tokyo. Voice prompting + o1-pro. Why? My friend was working on porting fbt to TS and said he was planning to build this. I wanted to one-up him + convince him to start using LLMs =)
One thing you should be aware of is that, for Japanese at least, localization is far from just translating the text. There are lots and lots of Japan-specific cultural nuances you have to take into account for web users, often down to having an entirely different design for your landing page, because those simply convert better when certain things are done that typically aren't done on non-Japanese websites.
Notta (multi-lingual meeting transcriptions + reports) is a great example if you compare their Japanese [1] and English [2] landing pages.
Note how drastically different the landing pages are. Furthermore, even linguistically, Japanese remains a challenge for proper context-dependent interpretation. Gemini 2.5 likely performs best for this thanks to Shane Gu [3], who's put tons of work into having it perform well for Japanese (as well as other "tough" languages).
[0] https://github.com/f8n-ai/fbtee-migrate
[1] https://www.notta.ai (Japanese version)
[2] https://www.notta.ai/en (English version)
[3] https://x.com/shaneguML
> localization is far from just translating the text
For sure, that's spot on.
What I'm excited about the most is that the linguistic/cultural aspects are close to being solved by LLMs, including Gemini 2.5, which got a huge performance boost vs the previous iteration. So automated approaches make more sense now and have a chance of becoming the default, reducing i18n maintenance down to zero - and as a dev, I can't help but be excited about that.
P.S. fbt is great by the way, as is the team behind it. It's a shame it's archived on GitHub and isn't actively maintained anymore.
I hate the current React i18n solutions, and the fact that they only work at runtime, as opposed to Angular's build-time i18n solution.
If your compiler could plug into existing localization workflows in large organizations, that would be great (i.e., extraction, loading from configuration).
We support larger org workflows with the Lingo.dev Engine product, but that's not the point: Lingo.dev Compiler is unrelated to that, 100% free and open source.
We started with a thought: what if i18n is actually meant to be build-time and LLM-powered, and that's enough for it to be precise? Not today, but in the future, it feels like this type of solution could elegantly solve i18n at scale in software, as opposed to the existing sophisticated workflows.
WDYT?
The best solution right now is prompt engineering: it turns out AI can be tuned to provide top-quality results with the correct system prompt / few-shot setup, and custom prompts can be provided in the compiler config.
Longer term, I want this to never be an issue, and I feel we'll get there together with the help from the open source community!
* worth exactly what you paid for it ;)
1. `data-lingo-skip` - excludes a JSX node from i18n
2. `data-lingo-override-<locale code>` - overrides the version in the <locale code> language with a custom value
3. also `data-lingo-context`
(docs, perhaps, aren't yet the best, but here they are: https://lingo.dev/compiler/configuration/advanced)
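A rough usage sketch for the context hint (illustrative markup; the docs above are authoritative on the exact behavior):

```tsx
export function CheckoutButton() {
  return (
    // Assumed usage: extra context for the LLM travels with the element it describes.
    <button data-lingo-context="Label on the button that submits the checkout form">
      Place order
    </button>
  );
}
```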
I’d say it just takes a few prompts in Cursor or a similar tool.
Then, you simply ask it to translate into other languages. Here’s how I did it for one of my projects - a quantum optics simulator: https://p.migdal.pl/blog/2025/04/vibe-translating-quantum-fl...
Doing it at runtime might make sense for a typical translation. But for scientific (or engineering) content, we often want to verify the output. Translating in production can be wonderful, hilarious, or just inconsistent.
(Here's a battle-tested prompt example we've found works pretty nicely with o3 + Claude 3.7: https://lingo.dev/cli/extract-keys)
> Then, you simply ask it to translate into other languages.
Yep! With Lingo.dev Compiler though, we were scratching our own itch, particularly the maintenance of the localized code. It turned out extraction is fine, but further down the road we found ourselves digging through the code and jumping back and forth between the code and the i18n files.
I think it won't be a problem anymore after "Just In Time software" becomes a thing, and vibe coding tools seem to be getting us closer to that point.
Great example!
Few things to put on your roadmap if they aren't on it yet:
- Would like it if we could set the model per language. I'm sure you do your best trying to find the best one for each language, but in our experience some of them aren't optimal yet.
- Multiple source languages would be cool. Example: It can make sense to have JA as source for KO but EN as source for FR. Or probably better, sending both (e.g. EN + JA when doing KO).
- MCP doesn't seem to be working (we posted a log on the discord)
- We seem to have spotted cases where key names were taken into account a little too much when translating, but understand this is super hard to tune and can be fixed by improving our keys.
Typically, quality changes significantly with the right translation fine-tuning settings, so send me a DM with your current setup and we'll help you out in a couple of minutes.
Alternatively, Lingo.dev CLI is open source and you can give it a try with your own API key/model, and if your preferred provider ID isn't yet supported - pull requests are welcome, let's add it! (adding new providers is pretty simple).
Checking your MCP scenario right now, but meanwhile regarding the keys: they're indeed important and are great ways to give the LLMs another tip regarding the meaning of the label and its intent.
https://github.com/benmerckx/find-jsx-strings
We believe automatic discovery + i18n processing is the most natural next step for i18n on the web, since LLMs now exist.
And we feel that not only will industry-standard i18n libraries like i18next or react-intl adopt it soon, but frameworks like Next.js or Remix themselves will make it one of their core features.
We originally built Lingo.dev Compiler scratching our own itch, but we're really excited to see how the industry will evolve from here!
It will remain 100% free and open-source. We're already dogfooding it on our website and app, so if you'd like to join and contribute at some point, we'd be very happy!
Auto-translated sentences are awkward and I feel extremely insulted every time someone chooses to impose this garbage watered-down version of their products on me.
Hire a translator or don't localize your site.
That's exactly what we want to solve.
Here's the thing:
It turned out, AI translates better than humans when provided with enough correct context. Both macro context, like what the product does, and micro context, like what the component represents on screen and how it relates to other components.
As a result, algorithms extract the needed contextual hints, and a correctly configured LLM model finishes the rest.
This is definitionally untrue. Humans define human language; a "correct" translation is one that an experienced translator would write.
That doesn't mean LLMs have become as good as human translators, but rather that corporations have set up a system that treats translators as if they were machines, and then we act surprised when machines are better at acting machine-like than humans.
In particular, having it on the user side makes it fully opt-in: the user has full control and accepts the quality as it is, whereas server-side auto-translation is your responsibility when shit hits the fan.
1. PostHog has a great tool that lets developers "watch the video" of how users interact with their app's UI. It turns out automated Chrome plugins/built-in features often mess up the HTML so much that apps simply crash. I've seen devs adding translate="no" [0] in bulk to their apps because of this, so Chrome's built-in auto-translation isn't the best solution (yet).
2. Product/marketing folks want users to see content in their language immediately after landing on the website.
3. App developers often want to control what users see, update it, and rephrase it.
If I had to guess, I'd say the approach the Lingo.dev Compiler package takes today should end up being a natural part of frameworks like Remix, Next.js, and Vue.
[0] https://www.w3schools.com/tags/att_translate.asp
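For reference, the opt-out mentioned in point 1 is just the standard HTML attribute, which React passes straight through:

```tsx
// Opt a brand name (or a whole subtree) out of browser auto-translation:
export function Brand() {
  return <span translate="no">Lingo.dev</span>;
}
```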
Also, I doubt other translators work by localizing <p> elements one by one, without context. The entire HTML is localized, semantic and all. I fail to see how translating JSX instead of HTML can improve the situation much.
Typically, a human would need to be educated about these aspects to translate perfectly. In the future, in my opinion, humans will be educating—or configuring—the AI to do that.
The "localization compiler", which we've built to solve our own problem in the first place, is just a handy bunch of scripts aimed to help extract needed contextual hints that would then be passed on to the [preconfigured] LLM for translation, and it should go beyond just the names of the tags.
FWIW, by saying AI translations I don't mean Google Translate or machine translation tech that browsers come with. I mean actual foundational AI models that OpenAI, Anthropic, Google, Meta, Mistral and others are developing.
The difference is significant, and there's nothing worse than a half-assed robotic translation produced by MT.
2. Regarding "AI translates better than humans." I think some commenters have already mentioned this, but the point is that outsourced translations can be worse than what LLMs can produce today, because when translations are outsourced, nobody seems to care about educating the native speaker about the product and the UI. And localizing the UI, which consists of thousands of chunks of text, is nontrivial for a human. On the flip side, a correctly configured LLM, when provided with enough relevant contextual tips, shows outstanding results.
If not, then don't.
Yeah, I'd prefer no translation over bad translation.
It's like if someone requested a feature and you gave them the first thing an LLM spewed out when asked to code it, without review.
You should at least have someone on your team be able to understand the program's output and correct it when things inevitably sound off.
Also, if a website offers a language I take that as an indication that the organization is prepared to deal with speakers of that language/people from the country in question (customer support, shipping, regional/legal concerns). Whether the site offers a certain language is a useful signal to figure this out quickly, and if poking around reveals machine translation into dozens of languages, it's a signal that they're probably not prepared to provide reliable services/support.
I love "Translate this page" in Chrome, better than nothing.
If you're bilingual you must know this feeling of reading an awful translation; of knowing someone wanted to offer their product to people speaking your language but couldn't be bothered to do it well, and so used google translate and called it a day, thinking those dumb users won't notice the slop they're feeding them. Fuck that.
Vite, Rollup, webpack, esbuild, Rspack, Rolldown, Farm - supporting these should be reasonably straightforward, and we expect pull requests adding other setups soon.
That's a great question!
Very cool
Conceptually, we're relying on common sense assumptions about how developers structure JSX. We assume you write reasonably semantic markup where the visual hierarchy matches the code structure - no CSS tricks that make the UI render completely different from what the JSX suggests.
This let us create translation boundaries that make intuitive sense to both developers and AI models.
And then you just translated them. The English text, essentially, becomes your string ID.
It worked super well, with very low friction for programmers. You didn't have to invent an identifier when writing code, and then switch to a horrible JSON with indefinitely long lists of strings.
But somehow, the world chose the horribly bad "string IDs" method of translation. Sigh.
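A quick side-by-side of the two conventions, using a hypothetical t() lookup (not any particular library's API):

```ts
// Hypothetical lookup function, just to contrast the two styles:
declare function t(key: string): string;

// Source-text-as-ID (gettext style): the English copy is the key, and an
// untranslated string falls back to readable English.
t("Welcome to our platform!");

// String-ID style: every string needs an invented identifier plus a matching
// entry in a separate JSON catalog.
t("home.hero.welcomeMessage"); // -> "Welcome to our platform!" lives in en.json
```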
I think there's a chance compile-time, AST/CST solutions might be the ultimate, O(1) i18n approach that doesn't distract. Ideally it should come out of the box with the framework, but perhaps this future is a little bit too far away just yet.
So we usually try to use both terms at the same time, often interchangeably, though translation is ultimately a subset of localization.
Unsure if I communicated it well, but unlike auto-translators such as Google Translate, this project leverages a context-aware LLM to recreate the meaning and intent of the original text in another language.
I think you are just misrepresenting the capabilities of an auto-translator LLM.
Not the point here, but is there any move yet in React to separate the presentation from the logic [1]?
1 - https://martinfowler.com/eaaDev/SeparatedPresentation.html
Sometimes your presentation varies depending on the data, in ways that are ultimately Turing-complete. Any domain-specific-language is eventually going to grow to incorporate some kind of logic.
React seems to have found a sweet spot for that with JSX, which presents as if it's mostly HTML with some Javascript mixed in. (In reality, it's actually Javascript with HTML-esque syntactic sugar, but it works very hard to present the illusion.) That means that it's working in two well-understood and widely-supported languages, rather than creating yet another presentation language.
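Concretely, the sugar is thin; with the automatic JSX runtime, the element below compiles to roughly the plain function call that follows it:

```tsx
import { jsx, jsxs } from "react/jsx-runtime";

// What you write:
const banner = <div className="hero">Welcome to <i>our platform</i>!</div>;

// Roughly what it compiles to:
const compiled = jsxs("div", {
  className: "hero",
  children: ["Welcome to ", jsx("i", { children: "our platform" }), "!"],
});
```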
HTML+CSS has deep flaws as a presentation language, but it's also universal. I don't expect React to reinvent that particular wheel.