Gemma 3 270M: The compact model for hyper-efficient AI

142 meetpateltech 51 8/14/2025, 4:08:36 PM developers.googleblog.com ↗

Comments (51)

simonw · 45m ago
This model is a LOT of fun. It's absolutely tiny - just a 241MB download - and screamingly fast, and hallucinates wildly about almost everything.

Here's one of dozens of results I got for "Generate an SVG of a pelican riding a bicycle". For this one it decided to write a poem:

  +-----------------------+
  |   Pelican Riding Bike |
  +-----------------------+
  |  This is the cat!  |
  |  He's got big wings and a happy tail.  |
  |  He loves to ride his bike!  |
  +-----------------------+
  |   Bike lights are shining bright.  |
  |   He's got a shiny top, too!  |
  |   He's ready for adventure!  |
  +-----------------------+
There are a bunch more attempts in this Gist, some of which do at least include an SVG tag albeit one that doesn't render anything: https://gist.github.com/simonw/25e7b7afd6a63a2f15db48b3a51ec...

I'm looking forward to seeing people fine-tune this in a way that produces useful output for selected tasks, which should absolutely be feasible.

roughly · 8m ago
I audibly laughed at this one: https://gist.github.com/simonw/25e7b7afd6a63a2f15db48b3a51ec... where it generates a… poem? Song? And then proceeds to explain how each line contributes to the SVG, concluding with:

> This SVG code provides a clear and visually appealing representation of a pelican riding a bicycle in a scenic landscape.

0x00cl · 12m ago
I see you are using Ollama's GGUFs. By default it will download the Q4_0 quantization. Try `gemma3:270m-it-bf16` instead, or you can also use Unsloth's GGUFs: `hf.co/unsloth/gemma-3-270m-it-GGUF:16`

You'll get better results.
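For example, via the Ollama Python client (a rough sketch; assumes a local Ollama daemon, and the model tag is the one above):

  import ollama

  # Pull the BF16 weights instead of the default Q4_0 quant
  ollama.pull("gemma3:270m-it-bf16")

  # Re-run the pelican prompt against the higher-precision weights
  response = ollama.chat(
      model="gemma3:270m-it-bf16",
      messages=[{"role": "user", "content": "Generate an SVG of a pelican riding a bicycle"}],
  )
  print(response["message"]["content"])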

ertgbnm · 31m ago
He may generate useless tokens, but boy can he generate a LOT of tokens.
lucb1e · 25m ago
He? I know some Gemmas and it's distinctly a female name; is Gemma a boy's name where you're from?
ertgbnm · 18m ago
I don't really gender LLMs in my head in general. I guess Gemma is a female name; I only gendered it in the joke because I think it makes it funnier, especially since it's just "a little guy". I know they're giving gendered names to these models now, but I think it's a bit weird to gender them when interacting with them.
jgalt212 · 21m ago
Perhaps the poster was referring to Simon, not Gemma.
marinhero · 38m ago
Serious question but if it hallucinates about almost everything, what's the use case for it?
yifanl · 1m ago
It's funny. That's subjective, of course, but if it fits for you, it's arguably more useful than Claude.
simonw · 26m ago
Fine-tuning for specific tasks. I'm hoping to see some good examples of that soon - the blog entry mentions things like structured text extraction, so maybe something like "turn this text about an event into an iCal document" might work?
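Something like this sketch, maybe (hypothetical prompt; the base model would almost certainly need fine-tuning before this works reliably):

  import ollama

  event_text = "Team standup next Tuesday at 10am in the main conference room."

  # The kind of structured-extraction prompt a fine-tune could target
  response = ollama.chat(
      model="gemma3:270m",
      messages=[{
          "role": "user",
          "content": "Convert this event description into an iCalendar VEVENT:\n\n" + event_text,
      }],
  )
  print(response["message"]["content"])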
zamadatix · 26m ago
I feel like the blog post, and the GP comment, do a good job of explaining how it's built to be a small model easily fine-tuned for narrow tasks, rather than used for general tasks out of the box. The latter is guaranteed to hallucinate heavily at this size, but that doesn't mean every specific task it's fine-tuned for would. Some examples given were fine-tuning it to efficiently and quickly route a query to the right place to actually be handled, or tuning it to do sentiment analysis of content.

An easily fine-tunable tiny model might actually be one of the better uses of local LLMs I've seen yet. Rather than trying to be a small model that's great at everything, it's a tiny model you can quickly tune to do one specific thing decently, extremely fast, and locally on pretty much anything.
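A toy sketch of that routing idea (the labels and model tag here are made up, and a real deployment would fine-tune on labeled examples first):

  import ollama

  ROUTES = ["billing", "tech_support", "sales", "other"]

  def route_query(query: str) -> str:
      # Ask the (ideally fine-tuned) model to emit exactly one routing label
      prompt = (
          f"Classify this query as one of {ROUTES}. Reply with the label only.\n\n"
          f"Query: {query}"
      )
      response = ollama.chat(
          model="gemma3:270m",
          messages=[{"role": "user", "content": prompt}],
      )
      label = response["message"]["content"].strip()
      return label if label in ROUTES else "other"

  print(route_query("My invoice was charged twice this month"))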

striking · 27m ago
It's intended for finetuning on your actual use case, as the article shows.
deadbabe · 25m ago
Games where you need NPCs to talk random gibberish.
luckydata · 9m ago
Because that's not the job it was designed to do, as you would know by reading the article.
numpad0 · 20m ago
robotic parrots?
rotexo · 37m ago
An army of troll bots to shift the Overton Window?
ants_everywhere · 28m ago
oh no now we'll never hear the end of how LLMs are just statistical word generators
iLoveOncall · 28m ago
Nothing, just like pretty much all models you can run on consumer hardware.
cyanydeez · 27m ago
This message brought to you by OpenAI: we're useless, but at least there's a pay gate indicating quality!
nico · 15m ago
Could be interesting to use in a RAG setup, and also to finetune it

For sure it won't generate great SVGs, but it might be a really good conversational model

luckydata · 9m ago
The article says it's not a good conversational model but can be used for data extraction and classification as two examples.
layer8 · 22m ago
> It's absolutely tiny - just a 241MB download

That still requires more than 170 floppy disks for installation.

mdp2021 · 34m ago
> For this one it decided to write a poem

Could it be tamed with good role-system prompt crafting? (Besides fine-tuning.)

campbel · 37m ago
Do you take requests? We need to see how well this model works with some fine-tuning :D
volkk · 33m ago
I was looking at the demo and reading the bedtime story it generated, and even there, there was confusion about the sprite and the cat. It switched subjects instantly, making for a confusing paragraph. What's the point of this model?
cyanydeez · 28m ago
The question is whether you can make a fine-tuned version and spam any given forum within an hour with the most attuned but garbage content.
canyon289 · 1h ago
Hi all, I built these models with a great team and am thrilled to get them out to you. They're available for download across the open model ecosystem, so give them a try!

From our side we designed these models to be strong for their size out of the box, with the goal that you'll all finetune them for your use case. With the small size they'll fit on a wide range of hardware and cost much less to finetune. You can try finetuning them yourself in a free Colab in under 5 minutes.
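Roughly, a quick LoRA fine-tune looks like this (a sketch with an illustrative dataset, model id, and hyperparameters; the Colab has the full version):

  from datasets import load_dataset
  from peft import LoraConfig
  from trl import SFTConfig, SFTTrainer

  # Any small chat-formatted dataset works for a first experiment
  dataset = load_dataset("trl-lib/Capybara", split="train[:1000]")

  trainer = SFTTrainer(
      model="google/gemma-3-270m-it",
      train_dataset=dataset,
      args=SFTConfig(output_dir="gemma-270m-finetune", max_steps=100),
      peft_config=LoraConfig(r=16, lora_alpha=32, target_modules="all-linear"),
  )
  trainer.train()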

For picking a Gemma size, here's a video I recorded earlier this year on the 1B to 27B sizes, with 270M being the newest addition:

https://www.youtube.com/watch?v=qcjrduz_YS8

Hacker News disclaimer: I really like working at Google, so with that said, all my opinions here are my own; I'm a researcher, so I'll largely focus on technical questions, and I'll share what I can.

NorwegianDude · 3m ago
The Gemma 3 models are great! One of the few models that can write Norwegian decently, and the instruction following is in my opinion good for most cases. I do however have some issues that might be related to censorship that I hope will be fixed if there is ever a Gemma 4. Maybe you have some insight into why this is happening?

I run a game where players can post messages. It's a game where players can kill each other, and people often send threats along the lines of "I will kill you". I tell Gemma that it should classify each message as either game-related or a real-life threat, that the message comes from a game where players can kill each other and threats are part of the game, and that it should mark the message as game-related if it's unclear which one it is. That does not work well. For other similar tasks it seems to follow instructions well, but for serious topics it seems to be very biased and often errs on the side of caution, despite being told not to. Sometimes it even spits out some help lines to contact.

I guess this is because it was trained to be safe, and that affects its ability to follow instructions for this? Or am I completely off here?
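For reference, my setup is roughly this (reconstructed as a sketch; the model tag and labels here are placeholders):

  import ollama

  SYSTEM = (
      "You classify chat messages from an online game where players can kill "
      "each other, so threats like 'I will kill you' are usually game-related. "
      "Answer with exactly one label: GAME or REAL_LIFE. If unclear, answer GAME."
  )

  def classify(message: str) -> str:
      response = ollama.chat(
          model="gemma3:27b",  # placeholder tag
          messages=[
              {"role": "system", "content": SYSTEM},
              {"role": "user", "content": message},
          ],
      )
      return response["message"]["content"].strip()

  # Often comes back REAL_LIFE (or a help line) despite the instructions
  print(classify("I will kill you at the north gate tonight"))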

simonw · 25m ago
Do you have any practical examples of fine-tuned variants of this that you can share? A description would be great, but a demo or even downloadable model weights (GGUF ideally) would be even better.
beoberha · 51m ago
Awesome work! I’m really bullish on small models and think they have the most potential to change our daily lives. Can’t wait to play around with this
cgdl · 25m ago
Very cool. For the INT4 QAT model, what is the recommended precision for the activations and for the key and values stored in KV cache?
tmaly · 37m ago
Are there any fine tuning in a box type options available in the cloud for this? This is amazing work, thank you.
fibers · 48m ago
Great job. Do you know how well it performs in sanity checks with NER since it is on the press release page?
ActorNightly · 1h ago
How does the 270 perform with coding?

I use Gemma27b currently with a custom agent wrapper and its working pretty well.

chrismustcode · 48m ago
I’d be stunned if a 270m model could code with any proficiency.

If you have an iPhone with the semi-annoying autocomplete that’s a 34m transformer.

Can’t imagine a model (even if it’s a good team behind it) to do coding with 8x the parameters of a next 3/4 word autocomplete.

VirusNewbie · 16m ago
hi Ravin, fellow Googler here. Curious if you can share here (or internally?) how these models were trained. Wondering if you face all the chaos the large models have during training?
mrcwinn · 1m ago
Apple should be doing this. Unless their plan is to replace their search deal with an AI deal -- it's just crazy to me how absent Apple is. Tim Cook said, "it's ours to take" but they really seem to be grasping at the wind right now. Go Google!
jasonjmcghee · 31m ago
I'm _very_ interested to see what this can be fine-tuned to do.

I've heard folks say a number of times that neuromuscular control / locomotion (or whatever) takes hundreds of millions of parameters rather than billions.

lemonish97 · 51m ago
Never thought I'd run an LLM released in 2025 on my phone, in full BF16. With ~80 tps on an iPhone 16 Pro, btw.
elAhmo · 45m ago
How do you actually run this on an iPhone?
CharlesW · 36m ago
With something like PocketPal AI (https://github.com/a-ghorbani/pocketpal-ai). I'd love to hear HNers' opinions on the "best" LM Studio-like option for iOS devices.
jtbayly · 28m ago
Can somebody give me a link to a tutorial on how I would go about fine-tuning this?

Also, what sorts of things might I consider fine-tuning it for?

JLCarveth · 21m ago
This was linked at the end of Google's announcement: https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-t...

Not sure how much data is needed to realistically fine-tune something like this and get useful output.
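The gist of the Unsloth flow, if I'm reading the docs right (a sketch; the exact model tag and settings may differ):

  from unsloth import FastLanguageModel

  model, tokenizer = FastLanguageModel.from_pretrained(
      model_name="unsloth/gemma-3-270m-it",
      max_seq_length=2048,
      load_in_4bit=False,  # small enough to train in higher precision
  )

  # Attach LoRA adapters so only a small set of weights is trained
  model = FastLanguageModel.get_peft_model(
      model,
      r=16,
      target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
  )
  # ...then train with TRL's SFTTrainer on your examples.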

44za12 · 1h ago
I’ve had great luck with all gemma 3 variants, on certain tasks it the 27B quantized version has worked as well as 2.5 flash. Can’t wait to get my hands dirty with this one.
whinvik · 29m ago
Curious: are there real-world use cases where people have finetuned such tiny models and put them into production?
cyanydeez · 26m ago
9gag.com commenter
Alex-Programs · 55m ago
This is cool. I'm looking forward to trying it - I wonder what it'll be useful for.
dcreater · 40m ago
I've been saying we need sub-1B models for the edge, so thanks for this.

I am however disappointed that no examples or benchmarks are provided to get a sense of performance. It's a given that benchmark values would be lower than Gemma 3n's, but having a sense of the performance-vs-size curve and a comparison to existing small models is needed.