Our new architecture analyzes and decomposes images into a code-like intermediate representation called layouts — an internal visual language that captures the composition and structure of any image.
Our intermediate image representation is designed for transparency and control. Rather than hiding the model's understanding behind a black box, we expose its internal representation, or code, to enable direct manipulation of visual elements. Users can now move, resize, add, remove or replace objects with granular control.
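To make the idea of an editable layout concrete, here is a minimal sketch of what such a representation could look like. The announcement does not describe the actual format, so everything here is an assumption made for illustration: the `LayoutObject` and `Layout` classes, their fields, and the method names are hypothetical. The sketch only shows how move, resize, add, remove and replace become plain data edits once the image is described as named objects with bounding boxes.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class LayoutObject:
    # Hypothetical fields: a text description plus a pixel bounding box.
    name: str
    description: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class Layout:
    # Hypothetical container: objects keyed by a unique name.
    objects: dict = field(default_factory=dict)

    def add(self, obj: LayoutObject) -> None:
        if obj.name in self.objects:
            raise ValueError(f"object {obj.name!r} already exists")
        self.objects[obj.name] = obj

    def remove(self, name: str) -> None:
        del self.objects[name]

    def move(self, name: str, dx: int, dy: int) -> None:
        obj = self.objects[name]
        self.objects[name] = replace(obj, x=obj.x + dx, y=obj.y + dy)

    def resize(self, name: str, scale: float) -> None:
        obj = self.objects[name]
        self.objects[name] = replace(
            obj,
            width=round(obj.width * scale),
            height=round(obj.height * scale),
        )

    def replace_object(self, name: str, description: str) -> None:
        # Keep the bounding box, swap what occupies it.
        obj = self.objects[name]
        self.objects[name] = replace(obj, description=description)


layout = Layout()
layout.add(LayoutObject("dog", "a brown dog", 40, 120, 100, 80))
layout.add(LayoutObject("ball", "a red ball", 200, 160, 30, 30))

layout.move("dog", dx=20, dy=-10)            # move
layout.resize("ball", scale=2.0)             # resize
layout.replace_object("dog", "a black cat")  # replace
layout.remove("ball")                        # remove
```

In a real system the edited layout would presumably be handed back to the model to re-render the image; that step is not described in the announcement and is left out here.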
moorjani · 3h ago
very cool, surely it's better than nano banana
mgh83 · 3h ago
exciting to see new tools that don't just give you a one-off random answer but let you mess with the details!
hunterloftis · 3h ago
Hey, I worked on this & totally agree - I want my tools to be editors, not slot machines.
My focus has been on the beta "Edit" feature, tucked away in the top-right corner when you're looking at a single image. It lets you directly manipulate the image as both a spatial canvas and a semantic structure.