"Stealing" data, using Tesla GPUs, what has Musk turned AI into?

Huxiu
2024.08.16 00:21

Elon Musk's Grok2 AI model has improved significantly and now ranks third in overall capabilities. It has also added image functionality and does especially well on mathematical and commonsense questions. The anonymous model sus-column-r, whose release attracted attention, is actually xAI's Grok2. Grok2's win rate table shows strong results in large model competitions, basically on par with the most advanced AI models on the market.

Negative review X.PIN (ID: chaping321), author: Shichao, editor: Jiangjiang, cover image from: Visual China

Elon Musk's Grok2 AI model has made significant improvements in performance and features.

• 🚀 Grok2 performs well in the large model arena, ranking third in overall capabilities.

• 🖼️ Added image functionality, collaborated with FLUX.1, and excelled in creative abilities.

• 🧠 Performs well in mathematical and common sense problems, and competes with GPT-4o.

There was a strange incident recently in the large model arena, where an anonymous model with the code sus-column-r suddenly emerged.

No one knows where it came from, but in the past month, it has been performing exceptionally well...

With over 10,000 votes, it has managed to break into the top rankings. As of now, its overall capabilities are tied for third place with the GPT-4o (API version) released on May 13.

Seeing this momentum, many speculated at the time that this could be the mysterious "Strawberry Q*" project from OpenAI that has yet to be publicly announced.

Yesterday, however, the mystery that had puzzled everyone finally got an answer. It was not OpenAI that revealed it, but Elon Musk, who is often overlooked as a player in AI.

The anonymous sus-column-r model is actually Grok2, the new model xAI had been preparing to launch, and it is now available in a version for X members.

The point of entering it anonymously in the large model arena was simply to build buzz for yesterday's launch.

For example, in the blog post announcing the release of Grok2, they openly showed off the record accumulated by sus-column-r, and even published a head-to-head win rate table. By their own account, apart from Google's Gemini 1.5 Pro, every other model, whether GPT-4o or Claude3.5 Sonnet, has lost to Grok2.

Of course, on paper Grok2 also did quite well. Judging by the various benchmark figures, its capabilities are similar to those of the most advanced AI models on the market, so Grok can now be counted among the top large models in the industry.

However, whether a model is good or not cannot be solely judged based on these data. The most important thing is to look at everyone's actual user experience.

Compared to the previous Grok1.5, which could only joke around with text, the most obvious upgrade in Grok2 this time is the addition of image functionality.

But unlike other companies developing multimodal AI on their own, this time Musk surprisingly chose to cooperate with others.

The partner should ring a bell for some of you: it's FLUX.1, which we wrote about just a couple of days ago.

Seeing this, Shi Chao didn't have very high expectations for Grok2's image functionality. After all, it's an AI that was tested only a few days ago...

But unexpectedly, FLUX.1 integrated into Grok2 did spark some different ideas.

It's not that its performance skyrocketed overnight. Rather, Grok2's creativity stands out compared with other image AIs on the market.

For example, given a Disney princess prompt, Grok2 is much bolder than other models in what it is willing to draw.

Moreover, Grok2 can directly satirize its own boss, like this image of Musk who loves sweets and gains weight.

In the blink of an eye, it can even travel to the world of Game of Thrones and engage in role-playing.

What's even more interesting is that some netizens directly used the graphics generated by Grok2 and combined them with AI to create videos.

However, Grok2's unrestrained parody of various public figures and cartoon characters definitely carries risks.

For example, some netizens created images of Mario smoking and drinking, as well as watching the 9/11 attacks, which prompted calls for legal action from Nintendo.

In addition to the new image function, it's necessary to test Grok2's enhanced capabilities and performance.

Considering that the model currently live on X is only the slightly weaker mini version, Shi Chao tested the more powerful Grok2 in the large model arena and compared it with the latest version of GPT-4o.

The first test started with mistakes AI commonly makes. Many people will have heard that large models recently failed en masse at "comparing decimals."

This time, Shi Chao recreated the classic scenario, asking both models to compare 9.5 and 9.11.

Surprisingly, GPT-4o was still stuck: it got the correct answer, but its reasoning was completely confused. Grok2, on the other hand, gave a logical and well-founded answer.
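For reference, the comparison itself is simple arithmetic. The short Python sketch below contrasts a numeric comparison with a version-number-style reading of the same two strings. The version-number reading is one commonly suggested explanation for why models claim 9.11 is larger; it is an assumption here, not something these tests establish, and both function names are made up for illustration.

```python
from decimal import Decimal


def compare_as_decimals(a: str, b: str) -> str:
    """Return whichever string is the larger number (numeric comparison)."""
    return a if Decimal(a) > Decimal(b) else b


def compare_as_versions(a: str, b: str) -> str:
    """Return whichever string is 'larger' when each dot-separated
    part is read as a separate integer, as in software version numbers.
    This is a hypothetical model of the mistake, not a known mechanism."""
    def key(s: str) -> tuple:
        return tuple(int(part) for part in s.split("."))
    return a if key(a) > key(b) else b


print(compare_as_decimals("9.5", "9.11"))  # 9.5, since 9.50 > 9.11
print(compare_as_versions("9.5", "9.11"))  # 9.11, since 11 > 5
```

Read as numbers, 9.5 is larger; read as version-style parts, 9.11 wins. That gap is why the question makes a handy sanity check.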

There was also a classic counting problem, where GPT-4o still surprised everyone by counting "I grabbed the handle" as 5 handles, while Grok2 remained stable in its performance.

However, when asked about the meaning of a certain sentence, Grok2 seemed a bit hesitant and rambled on without hitting the key points. In contrast, GPT-4o easily explained it in a few simple sentences.

Next, Shi Chao tested them on some basic general knowledge questions, such as "Who is Li Zhengdao," and both answers were relatively accurate. However, for some reason, GPT-4o tended to be lazy and ended its responses abruptly. Grok2's answers, by contrast, were quite detailed each time and thoughtfully organized into categories.

In any case, when Shi Chao actually uses it, he really feels the improvement in Grok2's capabilities.

Furthermore, according to the official data, this time in the field of mathematics, Grok2 is also quite proficient.

So I dug out the math problem that Grok got wrong before, which was a derivative problem.

It turns out Musk wasn't misleading us: both models worked this problem out clearly.

And to transform Grok into its current state, Musk has contributed more than a little behind the scenes.

But what's interesting is that Musk's main approach has been to fleece his own other companies...

First, staffing: xAI has only around 50 employees in total, 11 of whom work at Tesla, 6 of those on the Autopilot team, with no apparent effort to avoid suspicion.

According to the Wall Street Journal, even the GPUs originally intended for Tesla were requested by Musk to be prioritized for supply to xAI, and he even grandly stated that Tesla currently has no use for them, just storing them in the warehouse.

After raiding Tesla, Musk wasn't satisfied and extended his reach to X.

Just the other day, the tech outlet TechCrunch reported that, in order to "quietly" train AI on user data, X sneakily changed users' default settings during an update, and that users had to log in to the web version specifically to turn the setting off...

However, constantly stealing like this is bound to lead to lawsuits. Musk and his X have been sued by Tesla's shareholders and several national data protection agencies.

Currently, the Tesla case has been heard in a court in Delaware.

And the other data protection agencies have sued X, temporarily halting the use of user data to train Grok. X may even face a fine of "4% of platform revenue".

But in any case, in the field of large models, Musk's Grok has truly caught up with the big players. Compared with other large models, whether in image generation or other basic capabilities, Grok2 is not bad at all, and even has a character of its own. It is said that xAI will further integrate Grok into the X platform and will also release a preview version with multimodal understanding.

For some reason, Shi Chao is already looking forward to what surprises Musk can come up with next...

Source: X, WSJ, TechCrunch