Google launches Nano Banana 2 late at night, and the image-generation community is in an uproar: Pro-level 4K output at half the price

Google has launched its next-generation image generation model, Nano Banana 2, which offers very fast generation, strong multilingual text handling, real-time web search, and 4K output. The model has performed well across evaluations, ranking first worldwide in image generation and third in image editing. Its output price is $0.0672 per image, only half that of the Pro version, a significant development for the design field.

Keeping up its weekly release cadence, Google has once again dropped a bombshell late at night.

Just now, the strongest image generation model Nano Banana 2 has emerged, backed by the brand new Gemini 3.1 Flash Image.

Not only is it incredibly fast at generating images, but it also has stronger multilingual text processing capabilities and can connect to the internet in real-time, producing 4K masterpieces in one go.

As soon as it was released, the internet was flooded with demos of its startling capabilities.

A single sentence can generate a game UI; a casual sketch can be turned into a web UI; it can output 20 comic strips at once without batting an eye.

Netizens exclaimed: Designers are doomed!

Moreover, the Chinese characters NB2 renders are remarkably stable, a clean break from the illegible scrawl of earlier models.

Major reviews have further solidified Nano Banana 2's position as the top image generation model.

In the benchmark tests conducted by Artificial Analysis, it easily secured the global first place.

In terms of image editing capabilities, it ranks third, only behind GPT Image 1.5 and Nano Banana Pro.

In Image Arena, NB2's text-to-image generation also topped the charts, scoring 1279 Elo points, with image editing capabilities second only to GPT Image.

In Google's official evaluation, NB2 (with Thinking + text search + image search enabled) not only comprehensively outperformed competitors like GPT-Image 1.5 and Grok Imagine Image Pro in overall preference, visual quality, and accuracy of infographics, but even surpassed the big brother Nano Banana Pro.

Moreover, its output price is only $0.0672 per image, just half of Pro's price.
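To put the per-image price in perspective, here is a minimal Python sketch of batch cost. The $0.0672 figure comes from the article. The Pro price is only an assumption derived from the "half of Pro's price" wording (2 × $0.0672), not an official number, and `batch_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Figure quoted in the article: output price per image for Nano Banana 2.
NB2_PRICE_PER_IMAGE = Decimal("0.0672")
# Assumption: "half of Pro's price" taken literally, so Pro is 2x NB2.
# This is derived from the article's wording, not an official Pro price.
PRO_PRICE_PER_IMAGE_ASSUMED = NB2_PRICE_PER_IMAGE * 2


def batch_cost(num_images: int, price_per_image: Decimal) -> Decimal:
    """Return the total output cost in USD for a batch of images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return price_per_image * num_images


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        nb2 = batch_cost(n, NB2_PRICE_PER_IMAGE)
        pro = batch_cost(n, PRO_PRICE_PER_IMAGE_ASSUMED)
        print(f"{n:>6} images: NB2 ${nb2:.2f} vs Pro (assumed) ${pro:.2f}")
```

`Decimal` is used instead of `float` so that small per-image prices add up without binary rounding noise.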

Fast, but not just fast

Without further ado, let's take a look at its killer features.

The first is world knowledge.

Nano Banana 2 has fully integrated the Gemini knowledge base and real-time web search.

If you ask it to draw a real building, it will first search online for visual reference materials to understand what the place actually looks like, and then render it in the style you specify.

This understanding also allows it to directly help you create infographics, turn notes into diagrams, and generate data visualizations.

For example, ask it to generate a science infographic about the water cycle.

The model chose a bird's-eye view from directly above, clearly laying out each step of the water cycle from left to right on a clean light gray textured background.

It also drew simple black hand-drawn arrows on the background to guide the viewer's eye, with soft and even lighting, almost no shadow interference, creating an overall educational feel that is not dull, at a level you could directly use in a classroom.

Infographic depicting the water cycle from a flat overhead view

Now let's look at this comparison chart of cloud types.

It adopts a triptych format, placing cumulus, stratus, and cirrus clouds in three separate panels, each with a dramatically styled sky as the background, accompanied by prominent label text. The overall style is high-contrast American comic art, with clear information and maximum visual impact.

An infographic comparing different types of clouds in a triptych

In this example, the model first searches for real photos of the Château de Clos Lucé (the French château where Leonardo da Vinci spent his final years) as reference, and then reinterprets it in a vibrant synthetic cubism style.

The model not only accurately restored the basic structural features of the building but also infused the essence of cubism's multi-perspective collage and geometric deconstruction, while strictly adhering to the requirement of "no text."

This is the difference brought by "world knowledge"—it knows what this castle looks like, rather than fabricating it out of thin air.

Château de Clos Lucé in synthetic cubism style

What's even more impressive is that Google has specifically created an application called "Window Seat" to showcase this capability.

Specifically, they let Nano Banana 2 access the knowledge base and network image search, combining real scenes from around the world with real-time weather data to generate realistic airplane window views.

It's like giving you a global trip without leaving your home, and every frame of the view outside the window is based on real geographical and meteorological information, not just randomly put together.

The second is text rendering and translation.

One of the biggest weaknesses of AI image generation has been text that comes out as illegible scrawl, but Nano Banana 2 improves significantly here.

The generated text is accurate and clear, making it completely sufficient for marketing posters and greeting cards.

Look at the following set of images.

The first image is a cinematic close-up filled with natural elements, showcasing a beautiful sign made from recycled eco-friendly materials, featuring local birds and flowers, with the text "Native Wildlife: Please Observe from a Distance" written in elegant handwritten font at the bottom. Soft diffused light filters through the leaves of nearby ferns, with a background of vibrant green plants blurred out.

The second image completes the scene localization with just one sentence—transforming the entire concept into an Indian scene, with all text translated into Hindi, and even the vegetation and lighting atmosphere adjusted accordingly. This "one-click localization" capability is incredibly useful for creators producing global content.

Localized version of the "Native Wildlife" sign

Similarly, Google has also provided a cool demonstration for this capability—"Global Ad Localizer." This global advertising localization tool can directly translate advertising materials into different language versions. It not only renders the translated text but also synchronously adjusts the visual elements in the images to fit the target market.

4K Creative Blockbuster, Better Image Quality

The speed has increased, and the quality has not dropped; this is what truly excites people about Nano Banana 2.

First, there is a significant improvement in subject consistency.

Specifically, a workflow can maintain the characteristics of up to 5 characters and the high fidelity of 14 objects.

What does this mean? You will understand after looking at the image below.

14 uniquely styled characters and props appear together in a farm scene, happily playing, creating a fun, quirky, and joyful atmosphere.

The key is that each character and prop strictly maintains its original characteristics and image, with no "face changes" or "mix-ups."

Fun and joyful characters and props in the farm

Now, let's look at this more narrative example.

The story of three furry friends building a treehouse is divided into 6 chapters. The entire story is thrilling and exciting, ending with a joyful moment.

Most impressively, the clothing and appearance of the three characters stay consistent across all 6 images, while expressions and camera angles vary from image to image, and each character appears exactly once per image.

This is a boon for creators who need to create continuous narratives—finally, they don't have to discover that "the main character has changed their face" every time they generate an image.

Furry friends building a treehouse

Next, this application called "Pet Passport" can be described as "a happiness generator for pet owners."

Here, you only need to upload a photo of your pet, and the model can take your furry friend on a global adventure, checking in at various famous landmarks.

Moreover, it comes with various creative control settings, allowing you to customize different styles and effects.

The key is that no matter which destination you go to, the pet's appearance can remain highly consistent.

Second, instruction following has become more precise.

The subtle details you have in mind can now be better captured by the model. Complex descriptions are no longer "freely interpreted" into something else.

Third, the specifications have also been maximized.

From 512px to 4K, various aspect ratios are available for you to choose from.

Notably, 512px is a newly added resolution tier, optimized for low-latency, high-volume scenarios. If your workflow requires rapid iteration over a large number of images, this tier can help you maximize efficiency.

In terms of aspect ratios, in addition to the common ones, new extreme ratios of 4:1, 1:4, 8:1, and 1:8 have been added, so banner ads, tall vertical images, and feed cards can be generated natively without cropping in post-processing.
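As a rough illustration of what these ratios mean in pixels, the sketch below computes a width and height for each new ratio and picks the ratio closest to a target banner size. It assumes the resolution tier bounds the longer side of the image. The model's actual output sizes may be chosen differently, and `dimensions` and `closest_ratio` are hypothetical helper names, not part of any Google API.

```python
import math
from fractions import Fraction

# The four extreme ratios named in the article (width:height).
NEW_RATIOS = {
    "4:1": Fraction(4, 1),
    "1:4": Fraction(1, 4),
    "8:1": Fraction(8, 1),
    "1:8": Fraction(1, 8),
}


def dimensions(ratio: Fraction, long_edge: int) -> tuple[int, int]:
    """Return (width, height) with the longer side equal to long_edge.

    Assumption: the resolution tier (e.g. 512) bounds the longer side.
    """
    if ratio >= 1:  # landscape or square
        return long_edge, round(long_edge / ratio)
    return round(long_edge * ratio), long_edge


def closest_ratio(width: int, height: int) -> str:
    """Pick the listed ratio nearest to a target size.

    Ratios are compared in log space so that 4:1 and 1:4 are treated
    symmetrically around 1:1.
    """
    target = math.log(width / height)
    return min(NEW_RATIOS, key=lambda k: abs(math.log(NEW_RATIOS[k]) - target))


if __name__ == "__main__":
    for name, ratio in NEW_RATIOS.items():
        print(name, dimensions(ratio, 512))
    print(closest_ratio(970, 250))  # a wide banner-style target
```

Choosing the nearest native ratio first keeps any remaining crop small, which is the practical benefit the article describes.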

For developers, there is also a new feature that greatly impacts image quality: configurable Thinking Level.

You can manually adjust the "thinking depth" of the model before generating images—the default is the lowest level, prioritizing speed.

After switching to advanced or dynamic mode, the model will perform more thorough reasoning on complex prompts before rendering, significantly improving output quality and adherence to instructions.

Finally, the visual quality itself has also jumped to a new level.

The light and shadow are more vivid, the textures are richer, and the details are sharper.

For example, in the aerial image of a misty valley below.

You can see the entire canyon from a very high bird's-eye view, with the foreground being dark water surrounded by a bright green field, dotted with scattered trees and shrubs at the edge of the field, and a narrow winding path disappearing into the distance among the green hills on the right.

In the depths of the valley, a light blue-gray lake extends between the densely vegetated towering mountains, with peaks hidden in the low-hanging mist.

The main color transitions from the lush green of the foreground to darker and softer tones in the distance, with the water surface reflecting the gloomy sky, and heavy clouds creating a soft diffuse lighting effect. The overall scene exudes a rugged beauty reminiscent of the Scottish Highlands, with a sense of tranquility and untamed wilderness.

Aerial view of a misty green valley

Now, let's look at this pop art fashion portrait.

The image is shot from a slightly low angle and features a young person with a dark skin tone dressed in an extremely eye-catching suit.

The fabric is printed with bold electric blue swirling wave patterns, interspersed with huge bright pink concentric circles that overlap and radiate outward.

The oversized lapel suit jacket is paired with bell sleeves, and underneath is a neatly pressed yellow collared shirt, with wide-leg pants dramatically flaring out towards the ground.

Bright yellow heart-shaped sunglasses, huge pink circular earrings, and a defiant pose with hands on hips against a pure, uniform sky blue background make the entire image look like a visual bomb exploded from the pop art universe.

Moreover, no matter what aspect ratio you request, the model can output it accurately.

Highly stylized pop art fashion portrait in different aspect ratios

The First Test on the Internet: "Imagining" the Whole World from a Single Frame

Since the emergence of Nano Banana, people around the world have generated over a billion images with it.

Google DeepMind built a demo through vibe coding to showcase NB2's strong understanding of the real world.

In each frame, NB2 can only see the previous image and can "imagine" the upcoming scene, with a coherence that is simply outrageous.

Now, a large number of netizens have shared stunning demo tests of Nano Banana 2.

With a simple prompt, it perfectly recreated Belfast in the 1970s.

You can also take a screenshot of any map, and NB2 will turn it into a cartoon-style panoramic image.

Upload a book cover, and NB2 can directly produce a page displaying jellyfish from the book.

In text rendering, NB2 has taken a new leap, accurately producing fonts for manuscripts, whiteboards, posters, and more.

Let NB2 generate a newspaper of today's technology news. By searching online, it directly outputs the front page of the news, although there are some minor issues with the details.

In another demo, NB2 again shows a commanding lead in text generation.

Moreover, the portraits generated by NB2 are more realistic, making it difficult to distinguish between real and fake with the naked eye.

In a comparison image, NB2 is more detailed and powerful in character portrayal in games.

There are also various creative images such as container displays and spiral staircases, where NB2 performs exceptionally well.

Given an anime-style image, NB2 can convert it into a GTA-style picture in one click.

A 3D miniature scene generated by NB2 reproduces the setting very realistically.

An infographic made by NB2 is rich in detail.

a16z partner Justine Moore discovered during testing that NB2 has improved capabilities in infographics, advertisements, action shots, and even cartoon generation, and the speed is very fast.

Under the same prompt, NB2 can better follow instructions, generating results that are more realistic than GPT Image 1.5.

Here are some official demos from Google's DeepMind:

Where can it be used?

Having seen everything from world knowledge and text rendering to 14-object fidelity and 4K output, you are probably eager to try it out.

The good news is that Nano Banana 2 has been rolled out across Google's entire product line:

Gemini App is the most direct entry point.

Nano Banana 2 will fully replace Nano Banana Pro in the Fast, Thinking, and Pro models. Users wanting the "top configuration" can still switch back to Nano Banana Pro by selecting "Regenerate Image" from the three-dot menu.

By the way: The number of images generated is limited each day.

Users who have not subscribed to a Google AI plan can generate a maximum of 100 images within 24 hours; for those subscribed to Google AI Plus, Pro, or Ultra, this limit increases to 1,000. For most people that is sufficient, while heavy creators may want to consider a subscription.
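For a sense of when a subscription starts to matter, this small sketch estimates how many days a batch would take under each cap. The limits are the figures quoted above. How the 24-hour window resets is not stated, so the sketch simply assumes a full quota is available each day, and `days_needed` is a hypothetical helper name.

```python
# Daily caps quoted in the article (images per 24 hours).
DAILY_LIMITS = {
    "no subscription": 100,
    "Google AI Plus/Pro/Ultra": 1_000,
}


def days_needed(total_images: int, daily_limit: int) -> int:
    """Return the number of days needed to generate total_images.

    Assumption: the full daily_limit is available on every day.
    """
    if total_images < 0 or daily_limit <= 0:
        raise ValueError("total_images must be >= 0 and daily_limit > 0")
    # Integer ceiling division without floating-point rounding.
    return -(-total_images // daily_limit)


if __name__ == "__main__":
    batch = 2_500  # example workload, chosen only for illustration
    for plan, limit in DAILY_LIMITS.items():
        print(f"{plan}: {days_needed(batch, limit)} day(s) for {batch} images")
```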

Google's core business, Search, is also included. This covers the Google app, AI Mode on mobile and desktop, and Google Lens.

Developers can access the preview version in AI Studio, Gemini API, and Vertex AI, and Google Antigravity is also supported.

Flow users benefit directly: Nano Banana 2 has become the default model, available to everyone without consuming credits.

Google Ads has also integrated this, automatically providing smart suggestions when creating ad campaigns.

Conclusion

In summary, Google's strategy this time is quite clear:

Use Nano Banana 2 to cover the daily needs of the vast majority of users—fast, accurate, visually appealing, capable of searching and translating;

Reserve Nano Banana Pro for professional scenarios that have extremely high requirements for factual accuracy.

Rather than forcing a choice between Pro and a compromise, it gives most users flagship-level capabilities without sacrificing speed.
