Head-to-Head with GPT-4! Google Launches "All-Powerful" AI Model Gemini, Adaptable to Environments from Mobile Phones to Data Centers

Google has launched a highly versatile AI model called Gemini, which can run on smartphones, in the cloud, and in data centers, and which competes directly with GPT-4. Gemini has advanced reasoning capabilities and can weigh difficult questions more carefully before answering. It comes in three versions, Nano, Pro, and Ultra, each suited to different application scenarios. Gemini is a native multimodal model that handles both text and images, with faster speed and higher efficiency. It will be integrated into Google Search and will support various other Google services. Gemini outperforms GPT-4 in 30 of 32 industry benchmark tests.

Google has taken an important step toward catching up with OpenAI in applying artificial intelligence (AI) technology, launching a highly versatile AI model that can run on smartphones, in the cloud, and in data centers, and that competes directly with GPT-4.

On Wednesday, December 6th, Google officially released Gemini, its new generation of large language model (LLM), to the public. Google calls it the "largest and most versatile AI model" it has ever created, with advanced reasoning capabilities that let it weigh difficult questions more carefully before answering. Setting it apart from competing LLMs from other companies, Google emphasizes that Gemini is the most flexible model because it comes in different sizes for a range of generative AI applications.

The lightest version, Gemini Nano, can run offline directly on smartphones. The more powerful Gemini Pro can perform a wide range of tasks and will support Google's AI services, including Bard, its ChatGPT-style chatbot, while enhancing Google services such as Gmail, Maps, Docs, and YouTube. The top version, Gemini Ultra, is the most powerful LLM Google has created so far and is designed primarily for data centers and enterprise applications.

Eli Collins, Vice President of Product at DeepMind, Google's AI research organization, said that Gemini's versatility means it "can run on all devices from mobile devices to large-scale data centers." He said that Google has long wanted to create a new generation of AI models that act more like helpful collaborators than mere intelligent software, and that Gemini brings Google one step closer to this vision.

Currently, Gemini is only available in English, but Google will soon release versions in other languages. Google CEO Sundar Pichai said that Gemini represents a new era of AI. Ultimately, Gemini will be integrated with more Google products, including the search engine, advertising products, Chrome browser, and more.

Gemini Nano available for phones and computers on Wednesday; Gemini Pro to power Bard; Gemini Ultra to launch next year

As for the rollout timeline, starting this Wednesday, Android developers can sign up to use Gemini Nano to build Gemini-powered apps for smartphones and computers. Google said that Gemini can be enabled immediately on its flagship phone, the Pixel 8 Pro, to deliver new generative AI features such as summarizing the key points of phone call recordings.

Gemini Pro will power Bard starting this Wednesday, enabling more advanced reasoning, planning, and understanding. It will be available in English in 170 countries and regions, but not in the UK or elsewhere in Europe, where Google said it is working with local regulators. Starting next Wednesday, December 13th, Google will offer Gemini Pro to cloud customers through Google Cloud on its Vertex AI and AI Studio platforms.

Gemini Ultra will first be available to developers and enterprise customers, and the details of this version will be announced next week. Google plans to open up Gemini Ultra applications to the public on a large scale early next year.

Google also plans to release Bard Advanced, an advanced version of Bard powered by Gemini Ultra, early next year. Before the public launch, Google will run a testing program to improve Bard Advanced.

The following image from Google shows the three versions of the Gemini family.

Gemini outperforms GPT-4 in 30 out of 32 industry benchmark tests

Google is openly measuring Gemini against GPT-4. Before the release, it ran a series of tests using standard industry benchmarks. Google claims that Gemini Pro outperformed OpenAI's GPT-3.5 in six of eight tests. In benchmarks covering general language understanding, reasoning, mathematics, and coding, Gemini surpassed OpenAI's latest model, GPT-4, in seven of eight.

Google also evaluated its latest generative AI product, AlphaCode 2, which can interpret and generate code. In competitive programming, AlphaCode 2 outperformed 85% of competition participants.

Demis Hassabis, CEO of DeepMind, said that Google ran 32 comprehensive benchmark tests comparing Gemini and GPT-4, ranging from multi-task language understanding to Python code generation. Gemini "far surpassed" GPT-4 in 30 of the 32.

The screenshot from Google's report below shows the comparison of scores between Gemini Pro and Ultra and other LLMs such as GPT-4 and GPT-3.5 in multiple-choice questions, math problems, Python code tasks, and reading comprehension.

Gemini is a native multimodal model trained on Google's high-performance cloud chip TPU v5p

Google claims that Gemini is a "native multimodal" AI model. This means it was pre-trained as a multimodal model from the start and can handle tasks based on both text and image prompts. For example, parents can help their children with homework by uploading an image of a math problem along with a photo of the child's attempted solution on a worksheet. Gemini can read the answers, work out why they are correct or incorrect, and explain the concepts that need further clarification.

Google said that its "Search Generative Experience," which uses generative AI technology, will be integrated with Gemini's new features next year.

Google acknowledges that Gemini may still produce false or fabricated information. Collins said this remains an unsolved research problem, but he added that Gemini has undergone the most comprehensive safety evaluation of any Google AI model to date. To evaluate Gemini's safety, Google conducted adversarial testing on the model, using input prompts that simulate users with malicious intent to help researchers check the model for hate speech and political bias. These tests include "Real Toxicity Prompts," a set of over 100,000 prompts extracted from the internet.

Google emphasizes that Gemini's AI tools are highly efficient and fast. The model was trained on a new version of Google's in-house cloud chips, Tensor Processing Units (TPUs). The TPU v5p is more powerful still, training existing models 2.8 times faster than the previous generation. It is designed for training and running large models in data centers.

Amin Vahdat, Vice President of Google Machine Learning, said that this approach has given Google "a new understanding of future standard AI infrastructure." Google still uses third-party AI chips to run the Gemini model.

The following image provided by Google shows rows of Google Cloud TPU v5p AI accelerator supercomputers in a Google data center.