Gemini 2.5 Flash Image

fariszr · 3h ago

This is the gpt 4 moment for image editing models. Nano banana aka gemini 2.5 flash is insanely good. It made a 171 elo point jump in lmarena!

Just search nano banana on Twitter to see the crazy results. An example. https://x.com/D_studioproject/status/1958019251178267111
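
For a sense of scale on that 171-point jump: under the classic Elo expected-score formula, a rating gap of D points gives the higher-rated side an expected win rate of 1 / (1 + 10^(-D/400)). A minimal sketch, assuming LMArena scores can be read on the usual 400-point Elo scale (the arena fits its own rating model, so this is only an approximation; the function name is made up for illustration):

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the classic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# A 171-point gap, as quoted in the comment above.
print(f"{expected_win_rate(171):.3f}")
```

Under that assumption, a 171-point lead works out to the new model being preferred in roughly 73% of head-to-head votes.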

qingcharles · 1h ago

I've been testing it for several weeks. It can produce results that are truly epic, but it's still a case of rerolling the prompt a dozen times to get an image you can use. It's not God. It's definitely an enormous step though, and totally SOTA.

spaceman_2020 · 1h ago

If you compare to the amount of effort required in Photoshop to achieve the same results, still a vast improvement

qingcharles · 1h ago

I work in Photoshop all day, and I 100% agree. Also, I just retried a task that wouldn't work last night on nano-banana and it worked first time on the released model, so I'm wondering if there were some changes to the released version?

druskacik · 1h ago

Is it because the model is not good enough at following the prompt, or because the prompt is unclear?

Something similar has been the case with text models. People write vague instructions and are dissatisfied when the model does not correctly guess their intentions. With image models it's even harder for the model to guess right without enough detail.

qingcharles · 1h ago

No, my prompts are very, very clear. It just won't follow them sometimes. Also this model seems to prefer shorter prompts, in my experience.

dcre · 2h ago

Alarming hands on the third one: it can't decide which way they're facing. But Gemini didn't introduce that, it's there in the base image.

ceroxylon · 2h ago

It seems like every combination of "nano banana" is registered as a domain with their own unique UI for image generation... are these all middle actors playing credit arbitrage using a popular model name?

bonoboTP · 2h ago

I'd assume they are just fake, take your money and use a different model under the hood. Because they already existed before the public release. I doubt that their backend rolled the dice on LMArena until nano-banana popped up. And that was the only way to use it until today.

ceroxylon · 2h ago

Agreed, I didn't mean to imply that they were even attempting to run the actual nano banana, even through LMarena.

There is a whole spectrum of potential sketchiness to explore with these, since I see a few "sign in with Google" buttons that remind me of phishing landing pages.

vunderba · 1h ago

They're almost all scams. Nano banana AI image generator sites were showing up when this model was still only available in LM Arena.

koakuma-chan · 2h ago

Why is it called nano banana?

ehsankia · 1h ago

Before a model is announced, they use codenames on the arenas. If you look online, you can see people posting about new secret models and people trying to guess whose model it is.

Jensson · 2h ago

Engineers often have silly project names internally, then some marketing team rewrites the name for public release.

ZephyrBlu · 2h ago

I'm pretty sure it's because an image of a banana under a microscope generated by the model went super viral

93po · 36m ago

Completely agree - I make logos for my github projects for fun, and the last time I tried SOTA image generation for logos, it was consistently ignoring instructions and not doing anything close to what i was asking for. Google's new release today did it near flawlessly, exactly how I wanted it, in a single prompt. A couple more prompts for tweaking (centering it, rotating it slightly) got it perfect. This is awesome.

rplnt · 1h ago

Oh no, even more mis-scaled product images.

echelon · 2h ago

> This is the gpt 4 moment for image editing models.

No it's not.

We've had rich editing capabilities since gpt-image-1, this is just faster and looks better than the (endearingly? called) "piss filter".

Flux Kontext, SeedEdit, and Qwen Edit are all also image editing models that are robustly capable. Qwen Edit especially.

Flux Kontext and Qwen are also possible to fine tune and run locally.

Qwen (and its video gen sister Wan) are also Apache licensed. It's hard not to cheer Alibaba on given how open they are compared to their competitors.

We've left the days of Dall-E, Stable Diffusion, and Midjourney of "prompt-only" text to image generation.

It's also looking like tools like ComfyUI are less and less necessary as those capabilities are moving into the model layer itself.

raincole · 2h ago

In other words, this is the gpt 4 moment for image editing models.

Gpt4 isn't "fundamentally different" from gpt3.5. It's just better. That's the exact point the parent commenter was trying to make.

retinaros · 2h ago

did you see the generated pic demis posted on X? it looks like slop from 2 years ago. https://x.com/demishassabis/status/1960355658059891018

raincole · 2h ago

I've tested it on Google AI Studio since it's available to me (which has only been for a few hours, so take it with a grain of salt). The prompt comprehension is uncannily good.

My test is to go to https://unsplash.com/s/photos/random, pick two random images, and send them both with "integrate the subject from the second image into the first image" as the prompt. I think Gemini 2.5 is doing far better than ChatGPT (admittedly ChatGPT was the trailblazer on this path). FluxKontext seems unable to do that at all. Not sure if I was using it wrong, but it always only considers one image at a time for me.

Edit: Honestly it might not be the "gpt4 moment." It's better at combining multiple images, but now I don't think it's better than ChatGPT at understanding elaborate text prompts.

vunderba · 1h ago

I've updated the GenAI Image comparison site (which focuses heavily on strict text-to-image prompt adherence) to reflect the new Google Gemini 2.5 Flash model (aka nano-banana).

https://genai-showdown.specr.net

This model gets 8 of the 12 prompts correct and easily comes within striking distance of the best-in-class models Imagen and gpt-image-1 and is a significant upgrade over the old Gemini Flash 2.0 model. The reigning champ, gpt-image-1, only manages to edge out Flash 2.5 on the maze and 9-pointed star.

What's honestly most astonishing to me is how long gpt-image-1 has remained at the top of the class - closing in on half a year which is basically a lifetime in this field. Though fair warning, gpt-image-1 is borderline useless as an "editor" since it almost always changes the whole image instead of doing localized inpainting-style edits like Kontext, Qwen, or Nano-Banana.

Comparison of gpt-image-1, flash, and imagen.

https://genai-showdown.specr.net?models=OPENAI_4O%2CIMAGEN_4...

gundmc · 17m ago

> Though fair warning, gpt-image-1 is borderline useless as an "editor" since it almost always changes the whole image instead of doing localized inpainting-style edits like Kontext, Qwen, or Nano-Banana.

Came into this thread looking for this post. It's a great way to compare prompt adherence across models. Have you considered adding editing capabilities in a similar way given the recent trend of inpainting-style prompting?

bn-l · 17m ago

You need a separate benchmark for editing of course

esamust · 7m ago

Strange. I was excited to play around with the 2.5 flash image after testing the nano banana in LMarena, but the results are not at all the same? So I went back to LMarena to replicate my earlier tests but it's way worse than when it was nano banana? Did I miss something?

skybrian · 1h ago

Like most image generators, it didn’t pass the piano keyboard test. (Black keys are wrong.)

https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%...

pbhjpbhj · 37m ago

Are there models whose vector space includes ideas: not just words/media, but also not-entirely-corporeal aspects?

So when generating a video of someone playing a keyboard the model would incorporate the idea of repeating groups of 8 tones, which is a fixed ideational aspect which might not be strongly represented in words adjacent to "piano".

It seems like models need help with knowing what should be static, or homomorphic, across or within images associated with the same word vectors and that words alone don't provide a strong enough basis [*1] for this.

*1 - it's so hard to find non-conflicting words, obviously I don't mean basis as in basis vectors, though there is some weak analogy.

joombaga · 1h ago

What is the piano keyboard test? Your link requires granting AI Studio access to Google Drive, which I do not want to do.

raincole · 1h ago

Just ask it to generate a correct piano keyboard. It's something the current gen of image generator AIs fail at.
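
What "correct" means here can be pinned down precisely: each octave has 12 keys, 7 white and 5 black, with the black keys in alternating groups of two and three. A minimal sketch of that ground truth, assuming a standard 88-key piano running from A0 to C8 (MIDI notes 21 to 108); the function names are just for illustration:

```python
# Pitch classes (MIDI note number mod 12) that are black keys: C#, D#, F#, G#, A#.
BLACK_PITCH_CLASSES = {1, 3, 6, 8, 10}


def key_colors(first_midi: int = 21, last_midi: int = 108) -> str:
    """Return one 'W' or 'B' per key, left to right, for the given MIDI range."""
    return "".join(
        "B" if note % 12 in BLACK_PITCH_CLASSES else "W"
        for note in range(first_midi, last_midi + 1)
    )


def black_group_sizes(colors: str) -> list[int]:
    """Sizes of the black-key clusters; two adjacent white keys separate clusters."""
    return [seg.count("B") for seg in colors.split("WW") if "B" in seg]


keyboard = key_colors()
print(key_colors(60, 71))           # one octave, C4 to B4
print(black_group_sizes(keyboard))  # a lone A#0, then alternating 2s and 3s
```

A generated image passes if its black keys follow that 2-3-2-3 grouping across the whole keyboard; most failures show evenly spaced black keys or a group of the wrong size.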

Workaccount2 · 1h ago

The selling point of this model really seems to be its consistency between generations rather than its raw generating ability.

for instance:

https://aistudio.google.com/app/prompts/1gTG-D92MyzSKaKUeBu2...

mikepurvis · 1h ago

Interesting! I feel like that's maybe similar to the business of being able to correctly generate images of text— it looks like the idea of a keyboard to a non-musician, but is immediately wrong to someone who is actually familiar with it at all.

I wonder if the bot is forced to generate something new— certainly for a prompt like that it would be acceptable to just pick the first result off a google image search and be like "there, there's your picture of a piano keyboard".

vunderba · 1h ago

Anything that is heavily periodic can definitely trip up image gen - that being said, I just used Flux Kontext T2I and got pretty close (disregard the hammers though, since that's a right mess). Only towards the upper register did it start to make mistakes.

https://imgur.com/a/fyX42my

carimura · 52m ago

or my "hands with palms facing down" test.... no matter how hard I try it just can't get open hands, palms down.

vunderba · 29m ago

It's probably just a matter of rerolling a few times. I was able to get it around 25% of the time.

https://imgur.com/a/H9gH3Zy
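
To put a number on "rerolling a few times": if each generation succeeds independently with probability p, the chance of at least one usable image in n attempts is 1 - (1 - p)^n. A rough sketch, taking the 25% figure above as p and assuming attempts are independent (which real rerolls with the same prompt may not be):

```python
import math


def chance_of_success(p: float, attempts: int) -> float:
    """Probability of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - p) ** attempts


def attempts_needed(p: float, target: float) -> int:
    """Smallest number of tries whose success chance reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


p = 0.25  # per-attempt success rate reported in the comment above
print(f"4 attempts:  {chance_of_success(p, 4):.2f}")
print(f"12 attempts: {chance_of_success(p, 12):.2f}")
print(f"attempts for a 95% chance: {attempts_needed(p, 0.95)}")
```

At a 25% hit rate the expected number of attempts is 1/p = 4, but getting to a 95% chance of a good result takes 11, which is in line with the "dozen rerolls" reported elsewhere in the thread.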

pbhjpbhj · 32m ago

I guess the vast majority of images have the palms the other way, and that this biases the output. It's like how we misinterpret images to generate optical illusions, because we're expecting valid 3D structures (Escher's staircases, say).

vunderba · 28m ago

Yes - it's the same reason generating a 5-leaf clover fails - massive amounts of training data that predispose the model against it.

psbp · 1h ago

Doesn't pass the analog clock test either.

cubefox · 47m ago

Like most image models, except GPT-4o, it also didn't pass the wooden Penrose triangle test. (It creates normal triangles.)

notsylver · 3h ago

I digitised our family photos but a lot of them were damaged (shifted colours, spills, fingerprints on film, spots) that are difficult to correct for so many images. I've been waiting for image gen to catch up enough to be able to repair them all in bulk without changing details, especially faces. This looks very good at restoring images without altering details or adding them where they are missing, so it might finally be time.

Almondsetat · 2h ago

All of the defects you have listed can be automatically fixed by using a film scanner with ICE and a software that automatically performs the scan and the restoration like Vuescan. Feeding hundreds (thousands?) of photos to an experimental proprietary cloud AI that will give you back subpar compressed pictures with who knows how many strange artifacts seems unnecessary

notsylver · 2h ago

I scanned everything into 48-bit RAW and treat those as the originals, including the IR scan for ICE and a lower quality scan of the metadata. The problem is sharing them: important images I manually repair and export as JPEG, which is time-consuming (15-30 minutes per image, there are about 14000 total), so if it's "generic family gathering picture #8228" I would rather let AI repair it, assuming it doesn't butcher faces and other important details. Until then I made a script that exports the raws with basic cropping and colour correction, but it can't fix the colours, which is the biggest issue.

exe34 · 1h ago

this reminds me of a joke we used to tell as kids when there was a new Photoshop version coming out - "this one will remove the cow from the picture and we'll finally see what great-grandpa looked like!"

bjackman · 48m ago

I don't really understand the point of this usecase. Like, can't you also imagine what the photos might look like without the damage? Same with AI upscaling in phone cameras... if I want a hypothetical idea of what something in the distance might look like, I can just... imagine it?

I think we will eventually have AI based tools that are just doing what a skilled human user would do in Photoshop, via tool-use. This would make sense to me. But just having AI generate a new image with imagined details just seems like a waste of time.

zwog · 2h ago

Do you happen to know some software to repair/improve video files? I'm in the process of digitizing a couple of Video 2000 and VHS cassettes of childhood memories for my mom, who has started suffering from dementia. I have a pretty streamlined setup for digitizing the videos but I'd like to improve the quality a bit.

nycdatasci · 1h ago

I've used products from topazlabs.com for the same problem and have generally been happy with them.

qingcharles · 1h ago

Topaz is probably the SOTA in video restoration, but it can definitely fuck shit up. Use carefully and sparingly and check all the output for weird AI glitches.

notsylver · 2h ago

I didn't do any videos, just pictures, but considering how little I found for pictures I doubt you'll find much

actionfromafar · 2h ago

VHSdecode if you want a rabbit hole.

Barbing · 2h ago

Hope it works well for you!

In my eyes, one specific example they show (“Prompt: Restore photo”) deeply AI-ifies the woman’s face. Sure it’ll improve over time of course.

indigodaddy · 2h ago

Another question/concern for me: if I restore an old picture of my Gramma, will my Gramma (or a Gramma that looks strikingly similar) ever pop up on other people's "give me a random Gramma" prompts?

notsylver · 2h ago

I tried a dozen or so images. For some it definitely failed (altering details, leaving damage behind, needing a second attempt to get a better result) but on others it did great. With a human in the loop approving the AI version or marking it for manual correction I think it would save a lot of time.

This is the first image I tried:

https://i.imgur.com/MXgthty.jpeg (before)

https://i.imgur.com/Y5lGcnx.png (after)

Sure, I could manually correct that quite easily and would do a better job, but that image is not important to us, it would just be nicer to have it than not.

I'll probably wait for the next version of this model before committing to doing it, but it's exciting that we're almost there.

qingcharles · 1h ago

Being pragmatic, the after is a good restoration. There is nothing really lost (except some sharpness that could be put back). The main failing of AI is on faces because our brains are so hardwired to see any changes or weirdness. This is the sort of image that is perfect for AI because the subject's face is already occluded.

danielbln · 2h ago

That time had arrived a few months ago already with Flux Kontext (https://bfl.ai/models/flux-kontext).

reaperducer · 2h ago

> I've been waiting for image gen to catch up enough to be able to repair them all in bulk without changing details, especially faces.

I've been waiting for that, too. But I'm also not interested in feeding my entire extended family's visual history into Google for it to monetize. It's wrong for me to violate their privacy that way, and also creepy to me.

Am I correct to worry that any pictures I send into this system will be used for "training?" Is my concern overblown, or should I keep waiting for AI on local hardware to get better?

crustaceansoup · 2h ago

I tried to reproduce the fork/spaghetti example and the fashion bubble example, and neither looks anything like what they present. The outputs are very consistent, too. I am copying/pasting the images out of the advertisement page so they may be lower resolution than the original inputs, but otherwise I'm using the same prompts and getting a wildly different result.

It does look like I'm using the new model, though. I'm getting image editing results that are well beyond what the old stuff was capable of.

mortenjorck · 1h ago

The output consistency is interesting. I just went through half a dozen generations of my standard image model challenge, (to date I have yet to see a model that can render piano keyboard octaves correctly, and Gemini 2.5 Flash Image is no different in that regard), and as best I can tell, there are no changes at all between successive attempts: https://g.co/gemini/share/a0e1e264b5e9

This is in stark contrast to ChatGPT, where an edit prompt typically yields both requested and unrequested changes to the image; here it seems to be neither.

__rito__ · 1h ago

I am glad that I never decided to become a photoshop pro. I often contemplated it, and it seemed attractive for a while, but I'm glad that I decided against it. RIP r/photoshopbattles.

It was in the endless list of new shiny 'skills' that feel good to have. Now I can use nano-banana instead. Other models will soon follow, I am sure.

esafak · 1h ago

Retouching is an art. To the pro, this is just another tool to increase efficiency. You pay them not just for knowing how to use Photoshop, but for exercising good judgement. That said, I imagine this will shrink the field, since fewer retouchers will be able to do the same work, unless the amount of work goes up commensurately. Will people get more retouching done if the price goes down? Not sure.

ctippett · 1h ago

Interesting take. I'm a programmer, but learned Photoshop in the early 2000s and had a blast making and editing images for fun. Sure, the generative models today can do a far better job than anything I could come up with, but that doesn't detract from the experience and skills I picked up over the years.

If anything, knowing Photoshop (I use Affinity Designer/Photo these days) is actually incredibly useful to finesse the output produced by AI. No regrets.

SoKamil · 1h ago

If you had commented this a decade ago, I would have said that at least you own the program and the skills in case Google decides to turn off the lights or ask a prohibitive price. Now you need to pay a subscription for PS, and maybe some decent open-weight model will be released.

stefs · 2m ago

qwen3 is open weights and offers passable image generation

echelon · 1h ago

Programming and everything else will eventually fall to automation, too. It's just a matter of time.

Engineering probably takes a while (5 years? 10 years?) because errors multiply and technical debt stacks up.

In images, that's not so much of a big deal. You can re-roll. The context and consequences are small. In programs, bad code leads to an unmaintainable mess and you're stuck with it.

But eventually this will catch up with us too.

quantumHazer · 1h ago

Both of you are wrong and this is not good discussion level for HN

casey2 · 1h ago

If being wrong isn't good discussion for HN then they should delete the site

echelon · 1h ago

I'm unclear as to which side of the argument you're taking.

If you think that these tools don't automate most existing graphics design work, you're gravely mistaken.

The question is whether this increases the amount of work to be done because more people suddenly need these skills. I'm of the opinion that this does in fact increase demand. Suddenly your mom and pop plumbing business will want Hollywood level VFX for their ads, and that's just the start.

matsemann · 2h ago

Half the time I ask Gemini to generate some image it claims it doesn't have the capability. And in general I've felt it's so hard to actually use the features Google announce? Like, a third of them is in one product, some in another which I can't use, and no idea what or where I should pay to get access. So confusing.

Al-Khwarizmi · 2h ago

Yeah, in fact the website says "Try it in Gemini" and I'm not sure if I'm already trying it or not - if I choose Gemini 2.5 Flash in the regular Gemini UI, I'm using this?

throwup238 · 2h ago

It’s going to be a messy rollout as usual. The web app (gemini.google.com) shows “Images with Imagen” for me under tools for 2.5 flash but I just tried a few image edits and remixes in the iOS app and it looks like it’s been updated to this model.

oliwary · 2h ago

Also very confused at this... It told me "I'm unable to create images of specific individuals in different settings." I wish it would at least say somewhere which model we are using at the moment.

sega_sai · 2h ago

I think not. Because at least in the aistudio there is a dedicated gemini-2.5-flash-image-preview model. So I am assuming it is not available in the standard gemini chat window.

jeffbee · 21m ago

It's not in the Gemini app or site at all. You have to use AI Studio or another means. Yes, this is all very confusing on Google's part.

adidoit · 3h ago

Very impressive.

I have to say while I'm deeply impressed by these text to image models, there's a part of me that's also wary of their impact. Just look at the comments beneath the average Facebook post.

postalcoder · 3h ago

I have been testing google's SynthID for images and while it isn't perfect, it is very good, insofar that I felt some relief from that same creeping dread over what these images will do to perceived reality.

It survives a lot of transformation like compression, cropping, and resizing. It even survives over alterations like color filtering and overpainting.

sigmar · 3h ago

facebook isn't going to implement detection though. Many (if not most) of the viral pictures are AI-generated. and facebook is incentivized to let their users get fooled to generate endless scrolling

qingcharles · 1h ago

They already did. Certainly on the backend. For a while they were surfacing it, but I think it's gone again. But Meta is definitely onto this.

paul7986 · 3h ago

Along with those being fooled there are many comments saying this is fake, AI trash and etc. That portion of the commenters are teaching the ignorant and soon no one will believe what they see on the Internet as real.

bonsai_bar · 2h ago

> soon no one will believe what they see on the Internet as real.

Now is that so bad?

MitPitt · 2h ago

Facebook comments are obviously botted too

bee_rider · 1h ago

I dunno, I thought so for a while, but I’m beginning to suspect this is a very optimistic view of humanity.

knicholes · 3h ago

I got scammed for $15k BTC last weekend during the (failed) SpaceX Launch. I believed the deepfake of Elon and transferred it over. The tech is very convincing, and the attacks are increasingly sophisticated.

yifanl · 3h ago

This presumes that you're okay with giving the real Elon your wallet but not a fake Elon, but why?

Jensson · 2h ago

Because it isn't worth real Elon's time to run these scams.

pil0u · 1h ago

I got scammed similarly (although only $10, because I tested first), because 1. it was on YouTube, on a channel called "SpaceX" with a verified logo, 2. with hundreds of thousands of viewers live, 3. with a believable speech from Mr. Musk standing next to his rockets (and knowing his interest in cryptocurrencies).

This happened as I was genuinely searching for the actual live stream of SpaceX.

I am ashamed, even more so because I even posted the live stream link on Hacker News (!). Fortunately it was flagged early and I apologized personally to dang.

This was a terrible experience for me, on many levels. I never thought I would fall in such a trap, being very aware of the tech, reading about similar stories etc.

dvh · 38m ago

I am flabbergasted that you both got scammed. I would understand if this was two years ago, but now? Do people really not know about these scams? I can already see downvotes coming for victim blaming, but this is really shocking to me. Notice that there isn't a "tell hn: don't get scammed by deep fake crypto Elon" because the people who usually post also consider this general knowledge. That's why it's so effective, I guess. In a similar manner there will never be a "tell hn: don't drink acid it will burn your intestines": the danger is so obvious that nobody feels the need to post it, and because nobody is posting it, people get scammed. I don't know what the solution to that is. How should you tell people what everybody should already know?

I remember being at a machining workshop where the instructor was telling us such obvious things. Obvious things are obvious until they aren't, and then somebody gets hurt.

pil0u · 22m ago

To be fair, if that was only $10 it's because it was more of a "let's see if that works". It was believable enough to try this out.

The point of my message was to "tell hn: it could happen to people in this community".

bn-l · 15m ago

Hey it takes courage to admit to it. That’s admirable.

fxtentacle · 2h ago

Plot twist: It wasn't a deepfake.

You sent your wallet to the real Elon and he used it as he saw fit. ;)

pjerem · 2h ago

That’s what they said : they have been scammed !

kamranjon · 3h ago

Would you consider writing a blog post about this experience? I'm incredibly interested in learning more details about how this unfolded.

qingcharles · 1h ago

I think the comment is a joke. Their bio is satirical at least :)

lucasmullens · 18m ago

I'm pretty sure the comment wasn't a joke? I saw the stream last week, it was very impressive use of AI, I didn't realize it was AI until he started talking about doubling crypto.

What about the bio is satirical? I'm pretty sure that's sincere too.

paul7986 · 2h ago

Well just go on this guy's lawn and you will find your answer lol

Imustaskforhelp · 3h ago

Please pardon me since I don't know if this is satirical or not. I'd wish if you could clarify it.

Because if this is real, then the world is cooked

If not, then the fact is that I think it might be real, and the only reason I believe it's a joke is that you are on hackernews, so I think that either you are joking or the tech has gotten so convincing that even people on hackernews (which I hold to a fair standard) are getting scammed.

I have a lot of questions if true and I am sorry for your loss if that's true and this isn't satire but I'd love it if you could tell me if its a satirical joke or not.

bauruine · 3h ago

I guess it was something like [0] The Nigerian prince is now a deep fake Elon but the concept is the same. You need to send some money to get way more back.

[0]: https://www.ncsc.admin.ch/ncsc/en/home/aktuell/im-fokus/2023...

Imustaskforhelp · 3h ago

hm, but isn't it wild thinking that elon is talking to you and asking you for 15k , like bro has the money of his lifetime, why would he ask you?

It doesn't make that much sense idk

atrus · 2h ago

I remember watching the SpaceX channel on youtube, which isn't a legit source. AI Elon basically says "I want to help make bitcoin more popular, let me show you how easy it is to transfer money around with btc. Send me $X and I'll send you back $2X!" It's very in line with a typical elon message (I'll give you 1 million to vote R), it's on a channel called SpaceX. It's pretty believable.

Granted I played Runescape and EvE as a kid, so any double-isk scams are immediate redflags.

Imustaskforhelp · 1h ago

Now I have never played runescape but have heard of this legendary game in references.

For some reason, my mind confused runescape with neopets from the odd1sout video which I think is a good watch.

Scams That Should be Illegal : https://www.youtube.com/watch?v=XyoBNHqah30

empath75 · 51m ago

It's only believable to the extent that I believe that Musk would actually run such a transparently obvious scam.

Jensson · 2h ago

Even Elon could lose his credit card or something, the story they spin is always something like that "I am rich but in a pickle, please send some money here and then I'll send you back 10x as much tomorrow when I get back to my account", but of course they never send it back.

Edit: But of course Elon would call someone he knows rather than a stranger, rich people know a lot of people so of course they would never contact you about this.

tantalor · 2h ago

That's an "advance fee" scam.

https://en.wikipedia.org/wiki/Advance-fee_scam

runarberg · 1h ago

There are a lot of people on the internet, and every individual on the internet is in a unique situation. Chances are some of them are very likely to be persuaded by a scam which seems obvious to you.

Parent’s story is very believable, even if parent made this particular story up (which I personally don‘t think is the case) this has probably happened to somebody.

Imustaskforhelp · 22m ago

Ya maybe I didn't get their tone correctly, which is why I was seriously asking whether they were joking or not.

If they aren't joking, I apologize.

jaredklewis · 3h ago

This comment is perfect.

latchkey · 3h ago

As always, it is the replies that make it worth it. GopherGeyser strikes again!

AbraKdabra · 2h ago

I don't mean to be rude, but this sounds like natural selection doing its work.

umbra07 · 2h ago

That's the sort of statement that remains extremely rude even if you try and prefix it with "I don't mean to be rude".

AbraKdabra · 59m ago

It's not rude if it's the truth.

Also he's a troll so...

lionkor · 3h ago

Not to victim-shame or anything, but that sounds like more than one safety mechanism failed, the convincing tech only being a rather small part of it?

hansonkd · 3h ago

I think the biggest failure is on the part of the companies hosting these streams.

It's been a while, but I remember seeing streams for Elon offering to "double your bitcoin" and the reasoning was he wanted to increase the adoption and load test the network. Just send some bitcoin to some address and he will send it back double!

But the thing was it was on youtube. Hosted on an imposter Tesla page. The stream had been going on for hours and had over ten thousand people watching live. If you searched "Elon Musk Bitcoin" During the stream on Google, Google actually pushed that video as the first result.

Say what you want about the victims of the scam, but I think it should be pretty easy for youtube or other streaming companies to have a simple rule to simply filter all live streams with Elon Musk + (Crypto|BTC|etc) in the title and be able to filter all youtube pages with "Tesla" "SpaceX" etc in the title.
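
The rule described above can be sketched in a few lines. This is only an illustration of the idea, not how YouTube or any platform actually moderates: the keyword lists are guesses, `channel_is_official` stands in for whatever allowlist of genuine brand channels a platform would keep, and a real system would route hits to human review rather than block outright.

```python
import re

# Hypothetical keyword lists; a real deployment would need far broader coverage.
PERSON = re.compile(r"\belon\s+musk\b", re.IGNORECASE)
CRYPTO = re.compile(r"\b(crypto\w*|btc|bitcoin|eth|ethereum|doge|dogecoin)\b", re.IGNORECASE)
BRAND = re.compile(r"\b(tesla|spacex)\b", re.IGNORECASE)


def should_flag(title: str, channel_name: str, channel_is_official: bool) -> bool:
    """Flag a live stream for review under the simple rule from the comment above."""
    # Rule 1: the title mentions Elon Musk together with a crypto keyword.
    title_hit = bool(PERSON.search(title) and CRYPTO.search(title))
    # Rule 2: the channel uses a brand name but is not a known official channel.
    impostor_hit = bool(BRAND.search(channel_name)) and not channel_is_official
    return title_hit or impostor_hit


print(should_flag("Elon Musk LIVE: BTC giveaway", "SpaceX", False))
print(should_flag("Starship flight test", "SpaceX", True))
```

Scammers would adapt with misspellings and lookalike characters, so a filter like this would reduce the volume rather than eliminate it.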

lionkor · 3h ago

I feel like somehow that would lessen it, but not really help much? There are obviously people with too much money in BTC who are trying to take any gamble to increase its value. It sounds like a deeper societal issue.

jfoster · 1h ago

You are right that they might never be able to get it to 0, but shouldn't they lessen it if a simple measure like the one described can prevent a bunch of people from getting fooled by the scam?

michelb · 3h ago

These SpaceX scams are rampant on youtube and highly, highly lucrative. It’s crazy and you have to be very vigilant, as whatever is promised lines up with Elon’s MO.

rangerelf · 2h ago

Why would anyone give them any money AT ALL?

It's not like they're poor or struggling.

Am I missing something?

nickthegreek · 2h ago

it requires zero vigilance if you dont play the game.

amatajohn · 2h ago

the modern turing test:

am i getting scammed by a billionaire or an AI billionaire?

DonHopkins · 1h ago

If you believe anything the actual Elon says, then you have nobody to blame but yourself. That's not a sophisticated attack, you're just extremely gullible.

lucasmullens · 13m ago

Come on, don't be mean. Imagine saying this in person to someone who just told you they got scammed. "You're just extremely gullible" is just so mean...show some empathy.

conradkay · 1h ago

I don't think he's the gullible one, check their bio ;)

UltraSane · 2h ago

On the balance of probabilities it being a scam is vastly more likely than Elon actually wanting to contact you. Why would Elon need $15k in bitcoin?

It seems like money naturally flows from the gullible to the Machiavellian.

pennaMan · 3h ago

hey, I got a bridge to sell you, was $20k but we can lower it to $15k if you pay in BTC

testplzignore · 3h ago

You're paying too much for your bridges man. Who's your bridge guy?

dkiebd · 2h ago

That wasn’t a bridge.

77pt77 · 1h ago

Was the bridge built by a genius like Elon though?

nikanj · 2h ago

The comments are probably AI-generated too, because a site that seems to have lots of other people on it is more appealing than an empty wasteland

lifthrasiir · 3h ago

FYI, this is the famed nano-banana model, which has now been renamed to gemini-2.5-flash-image-preview in LMArena.

Mistletoe · 3h ago

https://medium.com/data-science-in-your-pocket/what-is-googl...

For people like me that don’t know what nano-banana is.

mock-possum · 3h ago

Wow I hate the ‘voice’ in that article - big if true though.

daemonologist · 2h ago

I suspect the "voice" is a language model with a bad system prompt. (Possibly the author's own words run through an LLM, to be charitable.)

3036e4 · 2h ago

It's medium.com. YouTube comments quality text packaged as clickbait articles for some revenue share. It was always slop, even without LLMs. Do they even bother with paying human authors now or is the entire site just generated? That would probably be cheaper and improve quality.

debugnik · 1h ago

> Do they even bother with paying human authors now

I thought Medium was a stuck up blogging platform. Other than for paid subscriptions, why would they pay bloggers? Are they trying to become the next HuffPost or something?

seydor · 1h ago

I mean they are going to have to rename their AI because gemini.com is going to IPO soon.

"Banana" would be a nice name for their AI, and they could freely claim it's bananas.

postscapes1 · 3h ago

This is what i came here to find out. Thanks.

mkl · 3h ago

That lamp example is pretty impressive (though it's hard to know how cherry-picked it is). The lamp is plugged in, it's lighting the things in the scene, it's casting shadows.

mortsnort · 32m ago

At $0.02 per image, it's prohibitively expensive for many use-cases. For comparison, the cheapest Flux model (Schnell) is $0.003 per image.

bn-l · 14m ago

Schnell isn’t AR and doesn’t do editing.

kemyd · 2h ago

I don't get the hype. Tested it with the same prompts I used with Midjourney, and the results are worse than in Midjourney a year ago. What am I missing?

bonoboTP · 2h ago

The hype is about image editing, not pure text-to-image. Upload an input image, say what you want changed, get the output. That's the idea. Much better preservation of characters and objects.

appenz · 2h ago

I tested it against Flux Pro Kontext (also image editing) and while it's a very different style and approach I overall like Flux better. More focus on image consistency, adjusts the lighting correctly, fixes contradictions in the image.

qingcharles · 1h ago

I've been testing it against Flux Pro Kontext for several weeks. I would say it beats Flux in a majority of tests, but Flux still surprises from time-to-time. Banana definitely isn't the best 100% of the time -- it falls a bit short of that. Evolution, not revolution.

vunderba · 25m ago

Agreed. I find myself alternating between Qwen Image Edit 20B, Kontext, and now Flash 2.5 depending on the situation and style. And of course, Flash isn't open-weights, so if you need more control / less censorship then you're SOL.

SirMaster · 2h ago

Can it edit the photo at the original resolution?

Most of my photos these days are 48MP and I don't want to lose a ton of resolution just to edit them.

qingcharles · 1h ago

I don't know. All the testing I've done has output the standard 1024x1024 that all these models are set to output. You might be able to alter the output params on the API or AI Studio.

kemyd · 2h ago

Thanks for clarifying this. That makes a lot more sense.

vunderba · 1h ago

Midjourney hasn't been SOTA for over a year. Even the latest release of version 7 scores extremely low on prompt adherence, only managing to get 2 out of 12 prompts correct. Even Flux Dev running locally consistently outperforms it.

Here's a comparison of Flux Dev, MJ, Imagen, and Flash 2.5.

https://genai-showdown.specr.net/?models=FLUX_1D%2CMIDJOURNE...

That being said, if image fidelity is absolutely paramount and/or your prompts are relatively simple - Midjourney can still be fun to experiment with particularly if you crank up the weirdness / chaos parameters.

cdrini · 2h ago

Hmm, I think the hype is mainly for image editing, not generating. Although note I haven't used it! How are you testing it?

kemyd · 2h ago

I tested it with two prompts:

// In this one, Gemini doesn't understand what "cinematic" is

"A cinematic underwater shot of a turtle gracefully swimming in crystal-clear water [...]"

// In this one, the reflection in the water in the background has different buildings

"A modern city where raindrops fall upward into the clouds instead of down, pedestrians calmly walking [...]"

Midjourney created both perfectly.

echelon · 2h ago

As others have said, this is an image editing model.

Editing models do not excel at aesthetic, but they can take your Midjourney image, adjust the composition, and make it perfect.

These types of models are the Adobe killer.

kemyd · 2h ago

Noted that! The editing capabilities are impressive. I was excited for image gen because of the API (Midjourney doesn't have it yet).

echelon · 1h ago

David Holz mentioned on Twitter that he was considering a Midjourney API. They're obviously providing it to Meta now, so it might become more broadly available after Midjourney becomes the default image gen for Meta products.

Midjourney wins on aesthetic for sure. Nothing else comes close. Midjourney images are just beautiful to behold.

David's ambition is to beat Google to building a world model you can play games in. He views the image and video business as a temporary intermediate to that end game.

qingcharles · 1h ago

It actually has impressive image generating ability, IMO. I think the two things go hand-in-hand. Its prompt adherence can be weaker than other models, though.

abdusco · 3h ago

I love that it's substantially faster than ChatGPT's image generation. ChatGPT takes ages, so slow that the app tells you not to wait and sends you a notification when the generation finishes.

andrewinardeer · 2h ago

"Generate an image of OpenAI investors after using Gemini 2.5 Flash Image"

radarsat1 · 3h ago

I've had a task in mind for a while now that I've wanted to do with this latest crop of very capable instruction-following image editors.

Without going into detail, basically the task boils down to, "generate exactly image 1, but replace object A with the object depicted in image 2."

Image 2 is some front-facing generic version of the object. Ideally I want the model to place this object perfectly in the scene, replacing the existing object, which I would identify ideally by specifying its exact position, but otherwise by describing very well what to do.

For models that can't accept multiple images, I've tried a variation where I put a blue box around the object that I want to replace, and paste the object that I want it to put there at the bottom of the image on its own.

I've tried some older models, and ChatGPT, also qwen-image last week, and just now, this one. They all fail at it. To be fair, this model got pretty damn close, it replaced the wrong object in the scene, but it was close to the right position, and the object was perfectly oriented and lit. But it was wrong. (Using the bounding box method.. it should have been able to identify exactly what I wanted to do. Instead it removed the bounding box and replaced a different object in a different but close-by position.)

Are there any models that have been specifically trained to be able to infill or replace specific locations in an image with reference to an example image? Or is this just like a really esoteric task?

So far all the in-filling models I've found are only based on text inputs.

rushingcreek · 3h ago

Yes! There is a model called ACE++ from Alibaba that is specifically trained to replace masked areas with a reference image. We use it in https://phind.design. It does seem like a very esoteric and uncommon task though.

ceroxylon · 2h ago

I don't think it is that esoteric, that sounds like deepfake 101. If you don't mind answering, does Phind do anything to prevent / mitigate this?

bawana · 12m ago

Google is eating adobe

j_m_b · 3h ago

If this can do character consistency, that's huge. Just make it do the same for video...

ACCount37 · 3h ago

It's probably built on reused "secret sauce" from the video generation models.

beyonddream · 3h ago

“Internal server error

Sorry, there seems to be an error. Please try again soon.”

Never thought I would ever see this on a Google-owned website!

lionkor · 3h ago

A cheap quip would be "it's vibe-coded", but that might actually very well be the case at this point!

reaperducer · 1h ago

> Never thought I would ever see this on a Google-owned website!

Really? Google used to be famous not only for its errors, but for its creative error pages. I used to have a google.com bookmark that would send an animated 418.

dpoloncsak · 4h ago

I've been looking for a whitepaper or something. So far I've found this...which is not a whitepaper but seems relevant

https://developers.googleblog.com/en/introducing-gemini-2-5-...

It seems like this was 'nano-banana' all along

lemonish97 · 3h ago

Yes, they mention that the model is aka nano-banana in the blogpost

chadcmulligan · 2h ago

This is technically impressive though I really wish they'd choose other professions to automate than graphic design.

dangoodmanUT · 1h ago

It’s what data is available, they’re not targeting graphic design

reaperducer · 2h ago

AI is supposed to set us all free. Yet, so far all the tech companies have done is eliminate the jobs of the lowest-paid people (artists, writers, photographers, designers) and transfer that money to billionaires. Yay.

anthonypasq · 1h ago

[Plows] are supposed to set us all free. Yet, so far all the tech companies have done is eliminate the jobs of the lowest-paid people ([field hands]) and transfer that money to landowners. Yay.

reaperducer · 1h ago

If you can't understand the difference, perhaps consult one of your AI chat overlords.

throitallaway · 28m ago

History repeats itself. Productivity gains the last ~half century have mostly made their way to the top.

qoez · 4h ago

Anyone know how it handles '1920s nazi officer'? They stopped doing humans for a while but now I see they're back so I wonder how they're handling the criticism they got from that

napo · 3h ago

it said: "I can create images about lots of things but not that. Can I try a different one for you?"

napo · 3h ago

when giving more context it replied:

""" Unfortunately, I can't generate images of people. My purpose is to be helpful and harmless, and creating realistic images of humans can be misused in ways that are harmful. This is a safety policy that helps prevent the generation of deepfakes, non-consensual imagery, and other problematic content.

If you'd like to try a different image prompt, I can help you create images of a wide range of other subjects, such as animals, landscapes, objects, or abstract concepts. """

bastawhiz · 3h ago

What a weird rejection. You have to scroll pretty far in the article to see an example output that doesn't have a realistic depiction of a person.

geysersam · 3h ago

It's unfortunate they can't just explain the real reason they don't want to generate the image:

"Unfortunately I'm not able to generate images that might cause bad PR for Alphabet(tm) or subsidiaries. Is there anything else I can generate for you?"

int_19h · 1h ago

If you want that kind of thing, Qwen3 delivers:

https://www.reddit.com/r/LocalLLaMA/comments/1mx1pkt/qwen3_m...

tanaros · 3h ago

The rejection message doesn’t seem to be accurate. I tried “happy person” as a prompt in AI Studio and it generated a happy human without any complaints.

It’s possible that they relaxed the safety filtering to allow humans but forgot to update the error message.

Der_Einzige · 3h ago

The moment the weights are on huggingface someone will orthogonalize/abliterate the model and make it uncensored.

rvnx · 3h ago

BigBanana would be a good name for that future OnlyFans model

martythemaniak · 3h ago

What is a "1920s nazi officer"? What do they look like?

sorokod · 2h ago

The SA article has some photos

https://en.m.wikipedia.org/wiki/Sturmabteilung

detaro · 3h ago

brown uniform, red armband with swastika was the usual SA look in the 1920s.

rvnx · 3h ago

Mh. Apparently like this if we ask AI:

https://postimg.cc/xX9K3kLP

...

anotheryou · 2h ago

Super cheap generation but expensive image upload, do I read that right?

https://openrouter.ai/google/gemini-2.5-flash-image-preview

daviding · 1h ago

Not sure. If the Flash image output is $30/M [1] then that's pretty similar to gpt-image-1 costs. So a faster and better model perhaps but not really cheaper?

[1] https://developers.googleblog.com/en/introducing-gemini-2-5-...

dangoodmanUT · 1h ago

That’s like .12 cents per image uploaded

johnfn · 2h ago

I naively went onto Gemini in order to try to use the new model and had what I could only describe as the worst conversation I've had with an AI since GPT 3.5[1]. Is this really the model that's on top of the leaderboard right now? This feels about 500 ELO points worse than my typical conversation with GPT 5.

Edit: OK, OK, I actually got it to work, and yes, I admit the results are incredible[2]. I honestly have no idea what happened with Pro 2.5 the first time.

[1]: https://g.co/gemini/share/5767894ee3bc

[2]: https://g.co/gemini/share/a48c00eb6089

GaggiX · 2h ago

"Google AI Studio" and select the model

byteknight · 2h ago

Are you doing roleplay?

johnfn · 2h ago

What?

SpaceManNabs · 1h ago

sometimes these bots just go awry. i wish you could checkpoint spots in a conversation so you could replay from that point, maybe with a push in the latent space or a new seed.

modeless · 3h ago

This model is very impressive. Yesterday (as nano-banana) I gave it a photo of an indoor scene with a picture hanging on a wall, and asked it to replace the picture on the wall with a copy of the whole photo. It worked perfectly the first time.

It didn't succeed in doing the same recursively, but it's still clearly a huge advance in image models.

asadm · 1h ago

this is amazing. I just wish models would have more non-textual controls. I don't want to TYPE my instructions. We need a better UI for editing images with AI.

mh- · 15m ago

Can you expand on that? What would ideal look like to you?

simianwords · 2h ago

I like it but it is very restricted. I can't modify people's faces etc.

jawns · 3h ago

I was able to upload my kids' back-to-school photos and ask nano-banana to turn them into a goth, an '80s workout girl, and a tracksuit mafioso. The results were incredibly believable, and I was able to prank my mom with them!

elorant · 3h ago

I have a certain use case for such image generators. Feed them an entire news article I fetch from bbc and ask it to create an image to accompany the article. Thus far only midjourney managed to understand context. And now this, which is even more impressive. We live in interesting times.

vunderba · 16m ago

I think most of the SOTA models could probably handle this but you'd probably get better results using a pipeline:

1. Reduce article to a synopsis using an LLM

2. Generate 4-5 varying description prompts from the synopsis

3. Feed the prompts to an imagegen model

Though I'd wager that gpt-image-1 (in ChatGPT), being multimodal, could probably manage it as well.
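
A rough sketch of the three-step pipeline above, in Python. The function `article_to_images` and its three callable parameters (`summarize`, `describe`, `render`) are hypothetical stand-ins for whatever LLM and imagegen clients you actually use; no real API is assumed here.

```python
from typing import Callable, List


def article_to_images(
    article: str,
    summarize: Callable[[str], str],            # step 1: article -> synopsis (an LLM call)
    describe: Callable[[str, int], List[str]],  # step 2: synopsis -> n description prompts
    render: Callable[[str], bytes],             # step 3: prompt -> image bytes (imagegen call)
    n_prompts: int = 4,
) -> List[bytes]:
    """Run the synopsis -> prompts -> images pipeline and return one image per prompt."""
    synopsis = summarize(article)
    prompts = describe(synopsis, n_prompts)
    return [render(prompt) for prompt in prompts]
```

Passing the model calls in as plain functions keeps the sketch independent of any one provider, and makes it easy to try the same prompts against several imagegen models.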

oracleclyde · 2h ago

I just tried it inside Gemini with a Medium article. Here's my prompt: "Read the article at this url and provide a hero image that incapsulates the message the author wants to convey: https://bioneers.org/supreme-oligarchy-billionaires-supreme-..."

The response was a pretty good summary of the article, along with an image that, dagnabbit, read the assignment.

shashankpritam · 1h ago

Are men not attractive? Or perhaps for Google, this blog is a targeted content? But who is it targeting? I would like to see the reasoning behind using all women images (at the least the top/first ones) to show off the model capabilities. I have noticed this trend in the image manipulation business a lot.

rd · 1h ago

The average man finds the average woman more attractive than the average woman finds the average man. Replace attractive with (eye-catching/attention-grabbing/motivating/retention-boosting).

shashankpritam · 1h ago

Oh, in that case, it makes sense. Also, I think men/women consume different kinds of media and this is one of those "men dominated" corners of the internet. I also think due to training data bias there could be some difference in quality with different subjects. So, they might be showing off their best of best.

zoeysmithe · 1h ago

Because tech is largely male dominated and has inherent sexism/patriarchy, and images of women, especially conventionally attractive ones, are perceived as aiding sales.

Also women are seen as more cooperative and submissive, hence so many home assistants and AI being women's voices/femme coded.

stuckinhell · 3h ago

Is this the "nano banana" thing the art ai world was going crazy about recently ?

SweetSoftPillow · 3h ago

Yes it is

simedw · 3h ago

The model is only available in AI Studio when I set my VPN to the USA (I’m located in the UK).

kumarm · 3h ago

Seems to be failing at API Calls right now with "You exceeded your current quota, please check your plan and billing details. For more information on this error,"

Hope they get API issues resolved soon.

mindprince · 3h ago

What is the difference between Gemini Flash Image models and the Imagen models?

og_kalu · 3h ago

Imagen is a diffusion text to image model. You write some text that describes your image, you get an image out and that's it.

Flash Image is an image (and text) predicting large language model. In a similar fashion to how trained LLMs can manipulate/morph text, this can do that for images as well. Things like style transfer, character consistency etc.

You can communicate with it in a way you can't for imagen, and it has a better overall world understanding.

raincole · 2h ago

Imagen: Stable Diffusion, but by Google

Gemini Flash Image: ChatGPT image, but by Google

patates · 3h ago

It seems that they still block access from Europe, or from Germany at least.

beklein · 3h ago

It works fine in OpenRouter

elorant · 3h ago

I can access it from Greece through AI Studio just fine.

punkpeye · 3h ago

Use one of the router services

Narciss · 3h ago

Use it on fal.ai

kumarm · 3h ago

Since API currently is not working (seems rate limits not set for Image Generation yet) I tried on fal.

Definitely inferior to results I see on AI Studio and image generation time is 6s on AI Studio vs 30 seconds on Fal.AI

echelon · 2h ago

> Definitely inferior to results

Quality or latency?

kridsdale1 · 3h ago

Get less contradictory regulations, then.

kneegerm · 3h ago

They vote [well they don't] for it, then they complain, then they downvote and seethe. The European experience.

rvnx · 3h ago

In EU they forbid us newspapers from non-approved countries, impose cookies banners everywhere, and now block porn. Soon they will forbid some AI models which have not passed EU censorship ("safety") validation. Because we all know that governments (or even Google with Android) are better at knowing what is the safest for you.

https://digital-strategy.ec.europa.eu/en/news/eu-rules-gener...

krige · 3h ago

How do you do, fellow europeans?

08764276596 · 2h ago

Inconceivable that anyone would dare to criticize the regime. Have you already filed a report, comrade?

cchance · 1h ago

did they actually roll it out i cant seem to find the option to use it

Edit: Nevermind its not in gemini for everyone yet, its in aistudio though

therealmarv · 3h ago

What is the max input and output resolution of images?

This is why I'm sticking mostly to Adobe Photoshop's AI editing because there are no restrictions in that regard.

qingcharles · 1h ago

In my testing it has been stuck at 1024x1024. Have to upscale with something...

abdusco · 3h ago

Around 1 megapixel, AFAICT.

TrousersHoisted · 1h ago

What is the "flash image?" I don't see anything downloadable there...

bsenftner · 2h ago

All these image models are time vampires and need to be looked at with very suspicious eyes. Try to make a room - that's easy, now try to make multiple views of the same room - next to impossible. If one is intending to use these image models for anything that requires consistency of imagery, forget it.

ragazzina · 52m ago

"Can you make a version of this picture where I wear the best possible sunglasses for my face shape?"

made me realize that AI image modification is now technically flawless, utterly devoid of taste, and that I myself am a rather unattractive fellow.

sandreas · 2h ago

I wonder if this could be used for preprocessing documents before doing OCR...

mclau157 · 2h ago

I could see this destroying a lot of jobs like photography, editing, marketing, etc.

bityard · 2h ago

These jobs won't go away. Power tools didn't destroy carpentry. Computers didn't destroy math. But workers who don't embrace these new tools will probably get left behind by those who do.

keepamovin · 3h ago

Those examples are gorgeous and amazing. This is really cool.

sam1234apter · 1h ago

this model is awesome - now anyone can build photo ai apps

Narciss · 3h ago

Nano banana is here!

t_mahmood · 2h ago

After the rugpull of Android, are we really going to trust Google with anything?

jeffbee · 1h ago

What does the first phrase even mean?

jfoster · 1h ago

I think it's a reference to this & similar things:

https://9to5google.com/2025/08/25/android-apps-developer-ver...

t_mahmood · 1h ago

Yes, this is what I was talking about

t_mahmood · 1h ago

I think I should have used the word enshittification of Android. And, I need to brush up my writing, it's getting progressively worse.

dboreham · 1h ago

Hmm...assumed this was a model shipped on a flash drive...

uejfiweun · 2h ago

This is pretty remarkable, I'm having a lot of fun playing around with this. Kudos to Google.

yuchana · 2h ago

The progress is insanely good, but imagine the competition between engineers, especially since there are many people taking up courses in AI and CS.

lyu07282 · 3h ago

still fails at analog clocks, if anyone else was also wondering

asdev · 3h ago

Looks like AI image generation is converging to a local maximum as well

casey2 · 1h ago

4 out of 7 images show a woman, 1 out of 7 shows a man. I feel like this is trying to advertise power over women to men. Which makes it evil.

GaggiX · 3h ago

An image seems to be 256 tokens looking at the AI Studio tab, so you can generate 3906.25 images per 1M tokens. That seems like a lot, if I'm not wrong in some way.

Edit: the blog post is now loading and reports "1290 output tokens per image" even though on the AI studio it said something different.
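
The arithmetic, sketched in Python for both token counts. The 256 and 1290 figures are the ones quoted above; the $30 per 1M output tokens price is the figure mentioned elsewhere in this thread and is treated here as an assumption, not verified pricing. The helper names are made up for illustration.

```python
TOKENS_PER_MILLION = 1_000_000


def images_per_million(tokens_per_image: int) -> float:
    """How many images fit into one million output tokens."""
    return TOKENS_PER_MILLION / tokens_per_image


def cost_per_image(tokens_per_image: int, usd_per_million: float) -> float:
    """Cost of one image in USD at a given price per million output tokens."""
    return tokens_per_image * usd_per_million / TOKENS_PER_MILLION


print(images_per_million(256))                # 3906.25
print(round(images_per_million(1290), 1))     # 775.2
print(round(cost_per_image(1290, 30.0), 4))   # 0.0387 (assumes $30 per 1M output tokens)
```

So at 1290 tokens per image the count drops to roughly 775 images per 1M tokens, or a bit under 4 cents per image if the $30/M assumption holds.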

runarberg · 1h ago

Still fails the “full glass of wine” test, and still shows many of the artifacts typical of AI generated images like nonsensical text, misplacement of objects, etc.

To be honest I am kind of glad. As AI generated images proliferate, I am hoping it will be easier for humans to call them out as AI.

idiotsecant · 1h ago

This is going to be so helpful for all the poorly photoshopped Chinese junk eBay listings.

awestroke · 3h ago

Internal server error. lol
