Surpassing GPT-4 in all aspects! Anthropic has launched its fastest and most powerful AI model, Claude 3, capable of summarizing 150,000 words.

Anthropic, an AI startup backed by Alphabet (Google's parent company), has launched its latest AI model, Claude 3, which can summarize up to 150,000 words, compared with about 3,000 words for OpenAI. The model is also Anthropic's first with multimodal input, allowing users to upload images and files.

On Monday, March 4, the artificial intelligence company Anthropic launched Claude 3, a new model family and chatbot comprising three models: Opus, Sonnet, and Haiku. The company claims these are the fastest and most powerful products it has developed to date.

Founded by former research executives from OpenAI, Anthropic has completed five rounds of financing totaling $7.3 billion in the past year. Its products compete directly with OpenAI's, and it has won backing from major companies such as Alphabet, Salesforce, and Amazon. In that time, Anthropic has grown from a promising startup into one of the most closely watched companies in AI.

Notably, Claude 3 can process and summarize large amounts of text: up to 150,000 English words, roughly the length of a long novel such as "Moby Dick" or "Harry Potter and the Deathly Hallows". In comparison, OpenAI's GPT-4 can only summarize about 3,000 words. Anthropic has also introduced the ability to upload images and files for the first time.

Claude 3 Outperforms OpenAI at Long-Text Processing

The company states that Claude 3 Opus is the most powerful among the three models, excelling in handling complex problems and logical reasoning, surpassing OpenAI's GPT-4 and Google's Gemini Ultra.

The other models, Sonnet and Haiku, have relatively limited processing capabilities or features, but they are more cost-effective compared to Opus, making them suitable for users or businesses that do not require the advanced features of Opus.

According to Anthropic co-founder Daniela Amodei and the company's statement, the Claude 3 models have the following features:

1) Multimodal support and text processing capabilities:

Claude 3 is Anthropic's first model with multimodal capabilities, able to take inputs such as documents and images. Users can upload images and files for analysis, which greatly expands the model's range of applications.
Claude 3 can summarize up to 150,000 English words, far exceeding OpenAI's 3,000 words. It can also produce output in different formats according to user needs, such as memos, letters, or stories.
Claude 3 has a more refined understanding of user intent and context, providing more accurate and relevant responses by analyzing the semantics, context, and emotional tone of the input.

2) Improved Risk Understanding:

Amodei stated that Claude 3 is better than its predecessor at understanding the risks around sensitive or controversial topics, and can more accurately judge when to answer and when to exercise caution. The previous model, Claude 2, was too conservative and sometimes refused to respond to sensitive or controversial questions when it did not need to. Claude 3 aims to remain safe and cautious while reducing unnecessary refusals, making the model more flexible and practical.

Regarding availability, Anthropic said that Sonnet and Opus have been available in 159 countries and regions since Monday, and that Haiku will be released soon.

In terms of the team, Amodei revealed that the company has adopted a layered team structure in developing core AI models. The core development team consists of 60 to 80 people responsible for algorithm and architecture design of the model. The technical support team consists of 120 to 150 people responsible for programming, data processing, testing, and deployment.

In the final iteration of the model, 30 to 35 people participated directly in development, while the overall support team reached about 150 people. The team directly involved in core development is thus relatively small, with a much larger group supporting it.

Text Alone Is Not Enough: AI Models Need Multimodal Functionality

In the past year, generative AI has become a focal point in the business and technology sectors, rapidly penetrating various fields including education, online travel, healthcare, and online advertising. AI topics have also been repeatedly discussed in major corporate earnings conference calls.

According to PitchBook data, investment in AI reached a record $29.1 billion in 2023, with transaction volume increasing by over 260% year over year, demonstrating strong investor confidence in the potential of AI.

While AI is advancing rapidly, Brad Lightcap, COO of OpenAI, pointed out that using only text and code as inputs and outputs for AI models is insufficient, and that AI should come closer to natural human perception and interaction. He stated:

"The world is multimodal. Daily human experience involves not only text but also various sensory inputs such as images, sounds, and more. Therefore, using only text and code as inputs and outputs for artificial intelligence models is not enough."

"To better mimic human perception and interaction, AI models need to be able to process and generate various types of data. By integrating multiple modalities, artificial intelligence models can provide richer, more realistic experiences and applications, closer to human natural perception and interaction."

However, as AI models become increasingly complex, especially with the introduction of multimodal features such as image generation, new risks and challenges also arise. For example, Google recently took down the AI image generator in its Gemini chatbot after it produced historically inaccurate and problematic images, which drew widespread attention on social media.

Unlike Google's Gemini, Anthropic's Claude 3 does not have the ability to generate images. It only allows users to upload images and other documents for analysis, thereby reducing the risks and controversies associated with auto-generated content to some extent.

Amodei also acknowledges:

"Of course, no model is perfect. I think it is very important to make this clear in advance. When developing models, we not only pursue performance and functionality but also prioritize the security and reliability of the models. Of course, despite rigorous development and testing, models may occasionally make mistakes and produce inaccurate or unpredictable outputs in certain situations."