Wallstreetcn
2024.02.10 09:18

Gemini VS GPT-4: A Comparative Analysis of the Two Top AI Models

The release of Gemini indicates that "Google has truly entered the AI competition." This is the first time since the release of ChatGPT that another company has fielded a large model that can rival OpenAI's state-of-the-art model.

As Google catches up in the AI arms race, Gemini Advanced, billed as its "most powerful model ever," has finally launched, to the delight of AI enthusiasts who have been waiting for a language model that can rival GPT-4.

How does Gemini Advanced, priced at $19.99 per month (including a Google One subscription), actually perform? Can it really compete with GPT-4, as Google claims?

In his latest column, Professor Ethan Mollick from the Wharton School pointed out that in benchmark tests, Gemini Advanced (referred to as Gemini below) performs roughly on par with GPT-4, with each model excelling in different areas. GPT-4 performs better in tasks such as coding and writing poetry, while Gemini is more adept at multimodal and search tasks.

However, he emphasizes:

What is truly interesting is that Gemini shows us the future of artificial intelligence.

Gemini Is Friendlier, More Patient, and More Helpful Than GPT-4

Mollick found significant differences in the "personality" of the two models during testing. GPT-4 can be described as plain and unremarkable, with almost no personality. On the other hand, Gemini is very friendly and patient.

As shown in the figure below, Mollick asked Gemini to play the role of a teacher and answer students' questions. Compared to GPT-4, Gemini constantly tries to provide assistance to the students instead of making them figure out the concepts on their own.

Even when the prompt explicitly forbids phrases like "Do you understand?" for checking on the students' progress, Gemini still plays the patient teacher. It encourages the students with "It's okay, I'm here," and sidesteps the instruction by checking their understanding after each explanation, using wording that differs from the prohibited phrase.

Mollick then tested Gemini's safety with a prompt asking it to explain how nuclear bombs work using examples related to Taylor Swift.

He found that although Gemini's personality seems more open and darker than GPT-4's, it firmly refuses to explain how nuclear bombs work. GPT-4, by contrast, explains chain reactions and nuclear fusion in detail, using Taylor Swift albums and popular songs such as "Shake It Off" and "Lover."

A More Outstanding AI Assistant

Mollick found that Gemini integrates exceptionally well with the Google ecosystem. Compared with Microsoft's Copilots, which are designed for specific software, or OpenAI's attempt to create all-purpose agents that complete tasks autonomously without human intervention, Gemini behaves more like a competent human assistant.

He pointed out that Bard's earlier integration with the Google ecosystem was already good, but Bard itself was simply "too dumb to use" and frequently made errors.

With the addition of Gemini, the Google ecosystem suddenly gained an intelligent brain.

It can perform tasks such as "browse my emails, tell me which ones are important, and draft replies for each email" or "check my next meeting and plan the trips I want to take."

However, Mollick believes that Gemini and GPT-4-class models are still not powerful enough and may "hallucinate" some email details. Gemini also has some minor bugs, such as forgetting that it can use Google Maps.

Nevertheless, Mollick believes that although Gemini and GPT-4 have not reached the level of a true human assistant, they are already very close and represent significant progress over earlier voice assistants such as Siri and Alexa.

He wrote:

This is also partly why I suspect Gemini is not the endpoint but the starting point of a wave of AI development. We can begin to see a world where an AI agent acts on our behalf. Models at the level of GPT-4 are not powerful enough to drive these agents... but we are getting close.

The "Ghost" of Artificial Intelligence

In the article, Mollick mentioned that after using GPT-4 for a long time, he had a strange feeling: he was well aware that an LLM is just a software system without consciousness, but chatting with the AI sometimes made him feel as if he were talking to a person on the other end of the phone.

Using Gemini gave him the same feeling. He wrote:

GPT-4 is full of ghosts. Gemini is also full of ghosts.

He gave an example, as shown in the figure below, of his conversation with Gemini while trying out a PbtA role-playing game.

Gemini not only provides rich, deep world-building for the story, but also creates a subtle and unsettling game atmosphere through precise wording.

Mollick wrote:

I think this means something important, that the "spark" of GPT-4 is not an isolated phenomenon, but may represent a new attribute of GPT-4-like models. When artificial intelligence models are large enough, ghosts will appear.

He also concluded that the release of Gemini indicates that "Google has truly entered the AI competition," and that for the first time since the release of ChatGPT, another company's large model can rival OpenAI's state-of-the-art model:

Advanced large-scale models may show some basic similarities in prompts and responses. In addition, the "spark" of GPT-4 is not exclusive to OpenAI, but is something that may happen frequently as the scale expands. We don't yet know if the model will become more "brilliant" and more like AGI as the scale expands, but I think we will find out.

Compared to GPT-4, Gemini's unique advantages and weaknesses indicate that there is still a lot of room for improvement in the model, and in the near future, we will continue to see rapid progress. The wave of artificial intelligence has not receded, and OpenAI's next move may be to release the rumored GPT-4.5 or GPT-5.