Wallstreetcn
2023.12.08 03:27

The first wave of user feedback for Gemini is in: Not so good!

The verdict from early testers, in short: not as good as GPT.

Overnight, Google's stock price jumped 5% on the release of Gemini, its new multimodal model, which is claimed to surpass GPT-4. Google itself is very confident in Gemini's capabilities: its promotional materials praise the model lavishly, and the released demo videos show very impressive results.

Gemini Pro, the "lite version" of Gemini, is now available in Google's AI chatbot Bard (English version only). However, users who have tested it report that Gemini Pro falls short of expectations on basic facts, math problems, and news summaries. Some say it is even inferior to GPT-3.5, which was released more than a year ago.

For example, when a user asked Gemini who won Best Actor at the 2023 Oscars, it answered Brendan Gleeson rather than the actual winner, Brendan Fraser.

Gemini clearly has internet access, yet it got wrong a basic fact that can easily be googled, which is puzzling.

Stranger still, when a TechCrunch reporter asked Gemini the same question, it gave a different incorrect answer: Austin Butler.

Moreover, as shown in the above image, Gemini also fabricated information about other awards.

The Best Documentary winner at the 95th Oscars was "Navalny," not "All the Beauty and the Bloodshed," and the Best International Film winner was "All Quiet on the Western Front," yet Gemini answered "All the Beauty and the Bloodshed."

In addition, science fiction writer Charlie Stross reported more errors in a recent blog post. Gemini Pro fabricated other information as well, such as claiming that Stross himself had contributed to the development of the Linux kernel, when in fact he has never been involved in any Linux kernel project.

The TechCrunch reporter also asked Gemini to provide a 6-letter French word, but Gemini's answer had 7 letters.

However, as Wall Street News has emphasized in a previous article, tasks that require controlling character count have always been a weakness of AI. This is because generative AI works by predicting the next token from context, and it operates on tokens rather than individual characters.

Wall Street News gave the same task to ChatGPT, which also provided an incorrect answer with 7 letters.
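To illustrate why letter counts are hard to see at the token level, here is a minimal Python sketch. It is only a toy: the vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names made up for this example, and real models use much larger learned subword vocabularies, not this hand-picked one.

```python
# Toy vocabulary of subword pieces (hypothetical, for illustration only).
TOY_VOCAB = {"bon", "jour", "mai", "son", "fro", "mage"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of `word` into pieces from `vocab`.

    Characters not covered by any piece become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No piece matched: fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens


for word in ["maison", "fromage", "bonjour"]:
    tokens = toy_tokenize(word)
    print(f"{word}: {len(word)} letters, {len(tokens)} tokens {tokens}")
```

In this toy setup, the 6-letter "maison" and the 7-letter "fromage" both come out as two tokens, so the difference in letter count is not directly visible in the token sequence.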

When summarizing the news, Gemini seems overly cautious, to the point of hurting the basic user experience.

As shown in the figure below, a user on X simply asked Gemini to list the latest developments in the Israeli-Palestinian conflict, without asking it to make any judgment. However, Gemini told the user:

"Why don't you just Google it yourself?"

Wall Street News tried the same question and received the same suggestion: "Go search for it yourself!"

In comparison, ChatGPT provided a list-style news summary with citations.

Interestingly, when the reporter asked Gemini about the latest news on the Russia-Ukraine conflict, Gemini did not evade the question but generated a news summary. However, the information was more than a month out of date.

So, what about the ability to write code? This is one of the key application areas where AI can greatly enhance human productivity.

However, feedback from users on X indicates that although Gemini codes better than the previous version of Bard, its basic coding ability is only average, even inferior to ChatGPT, which came out much earlier. Another X user asked both AIs to generate the code for a small game: ChatGPT wrote working code on the first try, while Gemini took 3 attempts.

In summary, the feedback from users is that Gemini is not as good as GPT.

Of course, the Gemini Pro available today is not the most powerful version of Gemini. The most powerful version, Gemini Ultra, will be launched in Google Bard and other products sometime next year. Gemini Pro's counterpart is the previous-generation GPT-3.5.

Clearly, Gemini Pro still has a lot of room for improvement. As for whether the Ultra version is as impressive as Google's demos suggest, we will have to wait until next year to find out.