Gemini 2.5 Flash Image
284 points by meetpateltech | 136 comments | 8/26/2025, 2:01:46 PM | deepmind.google ↗
Developers Announcement: https://developers.googleblog.com/en/introducing-gemini-2-5-...
Just search nano banana on Twitter to see the crazy results. An example. https://x.com/D_studioproject/status/1958019251178267111
There is a whole spectrum of potential sketchiness to explore with these, since I see a few "sign in with Google" buttons that remind me of phishing landing pages.
No it's not.
We've had rich editing capabilities since gpt-image-1; this is just faster, and it looks better than the (endearingly?) named "piss filter".
Flux Kontext, SeedEdit, and Qwen Edit are all also image editing models that are robustly capable. Qwen Edit especially.
Flux Kontext and Qwen are also possible to fine tune and run locally.
We've left behind the Dall-E, Stable Diffusion, and Midjourney days of "prompt-only" text-to-image generation.
It's also looking like tools like ComfyUI are less and less necessary as those capabilities are moving into the model itself.
GPT-4 isn't "fundamentally different" from GPT-3.5. It's just better. That's exactly the point the parent commenter was trying to make.
My test is going to https://unsplash.com/s/photos/random, picking two random images, and sending both with "integrate the subject from the second image into the first image" as the prompt. I think Gemini 2.5 does far better than ChatGPT here (admittedly, ChatGPT was the trailblazer on this path). Flux Kontext seems unable to do it at all. Not sure if I was using it wrong, but it only ever considers one image at a time for me.
To my eye, one specific example they show ("Prompt: Restore photo") deeply AI-ifies the woman's face. Sure, it'll improve over time.
I've been waiting for that, too. But I'm also not interested in feeding my entire extended family's visual history into Google for it to monetize. It's wrong for me to violate their privacy that way, and it also feels creepy.
Am I correct to worry that any pictures I send into this system will be used for "training"? Is my concern overblown, or should I keep waiting for AI on local hardware to get better?
I have to say while I'm deeply impressed by these text to image models, there's a part of me that's also wary of their impact. Just look at the comments beneath the average Facebook post.
It survives a lot of transformations like compression, cropping, and resizing. It even survives other alterations like color filtering and overpainting.
Now is that so bad?
You sent your wallet to the real Elon and he used it as he saw fit. ;)
It's not like they're poor or struggling.
Am I missing something?
Because if this is real, then the world is cooked.
If not, then the fact that I think it might be real is telling. The only reason I believe it's a joke is that you're on Hacker News, so either you're joking, or the tech has gotten so convincing that even people on Hacker News (which I hold to a fair standard) are getting scammed.
I have a lot of questions if it's true, and I'm sorry for your loss if it's true and this isn't satire, but I'd love it if you could tell me whether it's a satirical joke or not.
[0]: https://www.ncsc.admin.ch/ncsc/en/home/aktuell/im-fokus/2023...
It doesn't make that much sense idk
Granted, I played RuneScape and EVE as a kid, so any double-your-ISK scam is an immediate red flag.
Edit: But of course Elon would call someone he knows rather than a stranger; rich people know a lot of people, so they would never contact you about this.
https://en.wikipedia.org/wiki/Advance-fee_scam
It's been a while, but I remember seeing streams of Elon offering to "double your bitcoin"; the reasoning was that he wanted to increase adoption and load-test the network. Just send some bitcoin to an address and he'd send it back doubled!
But the thing was, it was on YouTube, hosted on an impostor Tesla page. The stream had been going for hours and had over ten thousand people watching live. If you searched "Elon Musk Bitcoin" on Google during the stream, Google actually pushed that video as the first result.
Say what you want about the victims of the scam, but it should be pretty easy for YouTube or other streaming companies to apply a simple rule: filter all live streams with Elon Musk + (Crypto|BTC|etc.) in the title, and likewise any channel with "Tesla", "SpaceX", etc. in the name.
It seems like money naturally flows from the gullible to the Machiavellian.
Am I getting scammed by a billionaire or an AI billionaire?
For people like me who don't know what nano-banana is:
https://openrouter.ai/google/gemini-2.5-flash-image-preview
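The link above is OpenRouter's listing for the model; OpenRouter exposes it through an OpenAI-compatible chat completions endpoint. Here's a minimal sketch of *building* (not sending) an edit request for that model, assuming the standard `image_url` content-part format with an inline base64 data URL. The model id comes from the URL above; the instruction text and placeholder image bytes are illustrative.

```python
import base64
import json

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "google/gemini-2.5-flash-image-preview"

def build_edit_request(image_bytes: bytes, instruction: str) -> dict:
    """Build an OpenAI-style chat payload carrying one inline PNG image."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode()
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    }

# Placeholder bytes stand in for a real PNG file read from disk.
payload = build_edit_request(b"\x89PNG...", "Remove the lamp post from the background")
print(json.dumps(payload)[:40])
```

Actually sending it means POSTing this JSON to `OPENROUTER_URL` with an `Authorization: Bearer <your OpenRouter API key>` header; the edited image comes back in the assistant message.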
Most of my photos these days are 48MP and I don't want to lose a ton of resolution just to edit them.
// In this one, Gemini doesn't understand what "cinematic" is
"A cinematic underwater shot of a turtle gracefully swimming in crystal-clear water [...]"
// In this one, the reflection in the water in the background has different buildings
"A modern city where raindrops fall upward into the clouds instead of down, pedestrians calmly walking [...]"
Midjourney created both perfectly.
Editing models don't excel at aesthetics, but they can take your Midjourney image, adjust the composition, and make it perfect.
These types of models are the Adobe killer.
Without going into detail, basically the task boils down to, "generate exactly image 1, but replace object A with the object depicted in image 2."
Image 2 is a generic, front-facing version of the object. Ideally I want the model to place this object perfectly in the scene, replacing the existing object; ideally I'd specify the existing object's position exactly, but otherwise I'd just describe very precisely what to do.
For models that can't accept multiple images, I've tried a variation where I put a blue box around the object that I want to replace, and paste the object that I want it to put there at the bottom of the image on its own.
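The single-image workaround described here — mark the target with a blue box, paste the reference object below the scene — is easy to script. A toy sketch using plain pixel grids (a real version would use Pillow); all sizes, coordinates, and the marker color are illustrative assumptions:

```python
BLUE = (0, 0, 255)
WHITE = (255, 255, 255)

def blank(w, h, color=WHITE):
    """A w-by-h image as a list of pixel rows."""
    return [[color for _ in range(w)] for _ in range(h)]

def draw_box(img, x0, y0, x1, y1, color=BLUE):
    """Draw a one-pixel rectangle outline marking the object to replace."""
    for x in range(x0, x1 + 1):
        img[y0][x] = color
        img[y1][x] = color
    for y in range(y0, y1 + 1):
        img[y][x0] = color
        img[y][x1] = color

def stack_reference(scene, ref):
    """Append the reference-object image below the scene, padding narrow rows."""
    w = max(len(scene[0]), len(ref[0]))
    return [row + [WHITE] * (w - len(row)) for row in scene + ref]

scene = blank(8, 6)
draw_box(scene, 2, 1, 5, 4)              # mark the object to swap out
composite = stack_reference(scene, blank(4, 3))
print(len(composite), len(composite[0]))  # 9 8
```

The composite then goes to the model as a single image, with a prompt along the lines of "replace the object inside the blue box with the object shown at the bottom".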
I've tried some older models, ChatGPT, qwen-image last week, and just now this one. They all fail at it. To be fair, this model got pretty damn close: it replaced the wrong object in the scene, but one close to the right position, and the object was perfectly oriented and lit. Still wrong, though. (Using the bounding-box method, it should have been able to identify exactly what I wanted. Instead it removed the bounding box and replaced a different object in a different but nearby position.)
Are there any models that have been specifically trained to be able to infill or replace specific locations in an image with reference to an example image? Or is this just like a really esoteric task?
So far all the in-filling models I've found are only based on text inputs.
"Sorry, there seems to be an error. Please try again soon."
Never thought I would ever see this on a Google-owned website!
https://developers.googleblog.com/en/introducing-gemini-2-5-...
It seems like this was 'nano-banana' all along.
It didn't succeed in doing the same recursively, but it's still clearly a huge advance in image models.
The response was a pretty good summary of the article, along with an image that, dagnabbit, read the assignment.
Hope they get API issues resolved soon.
Flash Image is an image- (and text-) predicting large language model. Much as trained LLMs can manipulate and morph text, this one can do the same for images: style transfer, character consistency, etc.
You can communicate with it in a way you can't for imagen, and it has a better overall world understanding.
Gemini Flash Image: ChatGPT image, but by Google
This is why I'm sticking mostly with Adobe Photoshop's AI editing: it has no restrictions in that regard.
Definitely inferior to the results I see on AI Studio, and image generation takes 6s on AI Studio vs. 30 seconds on Fal.AI.
Quality or latency?
https://digital-strategy.ec.europa.eu/en/news/eu-rules-gener...
""" Unfortunately, I can't generate images of people. My purpose is to be helpful and harmless, and creating realistic images of humans can be misused in ways that are harmful. This is a safety policy that helps prevent the generation of deepfakes, non-consensual imagery, and other problematic content.
If you'd like to try a different image prompt, I can help you create images of a wide range of other subjects, such as animals, landscapes, objects, or abstract concepts. """
It’s possible that they relaxed the safety filtering to allow humans but forgot to update the error message.
"Unfortunately I'm not able to generate images that might cause bad PR for Alphabet(tm) or subsidiaries. Is there anything else I can generate for you?"
https://en.m.wikipedia.org/wiki/Sturmabteilung
https://postimg.cc/xX9K3kLP
...
Edit: the blog post is now loading and reports "1290 output tokens per image", even though AI Studio said something different.