Track Hyper | Snapdragon 8 Elite: Qualcomm's most powerful performance platform in history

Wallstreetcn
2024.11.10 01:47

The amazing applications of AI in the mobile phone industry originate from here

Author: Zhou Yuan / Wall Street News

Qualcomm's new-generation Snapdragon flagship mobile platform delivers performance that rivals PC-class chips, redefining what "performance" means.

The performance of mobile consumer chips is catching up to that of PC chips, which is unprecedented.

On October 22, Qualcomm released the Snapdragon 8 Elite, the most powerful flagship mobile platform since the company's founding. Its technology pours out like a gushing spring, leaving it with few rivals in the industry.

This generation of the Snapdragon 8 flagship does not follow the Snapdragon 8 Gen X naming convention in use since 2021; instead, it is called the Snapdragon 8 Elite. Why is that?

Because the Snapdragon 8 Elite uses the same CPU architecture as the Snapdragon X Elite—Qualcomm's self-developed Oryon CPU architecture, abandoning the Kryo CPU architecture previously used in mobile chipsets.

This computing platform (not just an SoC) features an all-big-core design that Qualcomm has never used before. Its overall performance, energy efficiency, and AI capabilities reach a new height, building on the Snapdragon 8 Gen 2 (which reduced power consumption) and the Snapdragon 8 Gen 3 (which enhanced AI performance).

Overall, the technical focus of the Snapdragon 8 Elite is aimed at breaking through the AI experience on the smartphone side.

The stunning AI experiences launched by the Android camp this year, such as Honor's "One Sentence Matter" edge AI agent released on October 30, which can deconstruct and automatically fulfill the actual needs implied by users' vague intentions, come from the underlying technical capabilities of the Snapdragon 8 Elite; the offline communication feature of the Xiaomi 15 Pro also reflects the shadow of the NB-NTN (Non-Terrestrial Network) satellite communication technology in the Snapdragon 8 Elite.

Full Big-Core Architecture Dominates Chip Design

When evaluating the performance of any chip, there are three dimensions, collectively known as PPA.

That is, Power (energy consumption), Performance, and Area (die size). Among these, energy consumption ranks first, followed by performance; area ranks third and matters mainly for cost-related reasons.

This excellent characteristic is also inherited by the Snapdragon 8 Elite: based on Geekbench test results, its CPU improves single- and multi-core performance by 45%, overall energy efficiency by 44%, and overall energy savings by 27%; GPU performance and energy efficiency both improve by 45%, all measured against the Snapdragon 8 Gen 3.

Like the Snapdragon 8 Gen 3 and Snapdragon 8 Gen 2, the Snapdragon 8 Elite is manufactured by TSMC. Unlike the previous two generations, this flagship platform uses TSMC's second-generation 3nm process (N3E), the same process used for Apple's A18 series and MediaTek's Dimensity 9400.

This generation's mobile flagship platform is not just a simple SoC; it is referred to as a computing platform because Qualcomm has packaged more than 40 different components together.

In addition to the CPU and NPU, it also includes RF, transceivers, power management, ultrasonic fingerprint recognition, and mobile connectivity chips, among others, providing comprehensive mobile, AI inference, integrated applications (such as imaging, gaming, screen unlocking, etc.), and communication connectivity capabilities.

Qualcomm has named it the Snapdragon 8 Elite, echoing the laptop chip Snapdragon X Elite launched in 2023, because it brings the Snapdragon X Elite's Oryon CPU architecture to a mobile platform for the first time, here in its second generation.

The Oryon CPU architecture is primarily designed to meet the growing demand for AI performance.

Therefore, the Snapdragon 8 Elite is a significant technological iteration on the Snapdragon 8 Gen 3, Qualcomm's first mobile AI chip designed for edge generative AI, making it a solid AI mobile chip integration platform.

The biggest difference from all previous flagship mobile SoC chips from Qualcomm is that this is a mobile platform integrating more than 40 different functional chips, and its CPU architecture adopts an all-big core design for the first time, transitioning from Kryo to Oryon.

Based on the second-generation self-developed Oryon CPU architecture, the Snapdragon 8 Elite is equipped with two prime cores clocked at up to 4.32GHz, paired with six performance cores clocked at up to 3.53GHz, which is very close to the 3.62GHz clock speed of the ultra-large core in MediaTek's Dimensity 9400.

In other words, in terms of clock speed, the ultra-large core of the Dimensity 9400 is only on par with the performance cores of the Snapdragon 8 Elite.

The clock speed of the Snapdragon 8 Elite's two prime cores is already comparable to that of PC-class CPUs, hence its powerful performance. Qualcomm even proudly stated that the cores based on the second-generation Oryon CPU architecture are more powerful than Intel's highly anticipated Lunar Lake PC processor.

“How does the second-generation Oryon CPU compare to the best PC products from competitors (referring to Intel)?” said Qualcomm CEO Cristiano Amon. “Compared to competitors, our CPU performance has improved by 62%, which is much faster than Intel's recently released products, while energy efficiency has increased by 190%.”

From the perspective of CPU architecture, the Snapdragon 8 Elite adopts 2 prime cores and 6 performance cores, with small cores eliminated. This means that the Snapdragon 8 Gen 3 is Qualcomm's last mobile platform to use a tri-cluster CPU architecture.

Arm's big.LITTLE architecture, launched in 2011, now officially exits the historical stage of the Snapdragon flagship mobile platform, marking the arrival of the all-big-core era, in which CPU design is dominated by an all-big-core structure.

What improvements are there in CPU and NPU?

Although the Snapdragon 8 Elite adopts an Oryon CPU architecture similar to the Snapdragon X Elite's, it uses the second generation of Oryon. So, what are the differences?

Qualcomm has made special improvements specifically for mobile platforms. In addition to different CPU configurations, the other optimizations mainly focus on enhancing cache.

The L1 cache for each prime core and each performance core has been increased to 192KB and 128KB, respectively, totaling 1152KB, which exceeds 1MB (1024KB); at the same time, the L2 cache has been increased to 24MB, with 12MB dedicated to the two prime cores and 12MB shared among the six performance cores.
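The cache totals quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures from the paragraph; the constant and function names are illustrative and not Qualcomm terminology.

```python
# Core counts and per-core L1 sizes as quoted in the article.
PRIME_CORES = 2
PERFORMANCE_CORES = 6
L1_PER_PRIME_KB = 192
L1_PER_PERFORMANCE_KB = 128


def total_l1_kb():
    """Sum the L1 cache across all eight cores, in KB."""
    return (PRIME_CORES * L1_PER_PRIME_KB
            + PERFORMANCE_CORES * L1_PER_PERFORMANCE_KB)


def total_l2_mb(prime_cluster_mb=12, performance_cluster_mb=12):
    """Sum the two L2 pools (prime cluster + performance cluster), in MB."""
    return prime_cluster_mb + performance_cluster_mb


print(total_l1_kb())         # 2 * 192 + 6 * 128
print(total_l1_kb() > 1024)  # True if the total exceeds 1MB
print(total_l2_mb())
```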

According to Qualcomm, this is a brand new microarchitecture featuring "instant wake" functionality, which reduces the frequent power cycling of each CPU core.

Previously, the Kryo CPU architecture used by Qualcomm involved a "Power-Up Sequence" that required resetting code to prepare the cores for operation. However, now, with the "instant wake" technology, the cores can immediately execute the next instruction, eliminating the delays caused by the power-up sequence, thereby further enhancing operational efficiency.

At the same time, the LPDDR5X memory supported by the Snapdragon 8 Elite has a data rate of 10.7Gbps, and its clock frequency has reached 5.33GHz, improvements of 11.04% and 26.90% over the Snapdragon 8 Gen 3's 4.8GHz and the Snapdragon 8 Gen 2's 4.2GHz, respectively.
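The two percentages follow directly from the quoted memory frequencies. A minimal sketch of the calculation, with an illustrative helper name:

```python
def percent_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


NEW_PLATFORM_GHZ = 5.33  # memory frequency quoted for the new platform
GEN3_GHZ = 4.8           # Snapdragon 8 Gen 3
GEN2_GHZ = 4.2           # Snapdragon 8 Gen 2

print(f"vs Gen 3: {percent_gain(NEW_PLATFORM_GHZ, GEN3_GHZ):.2f}%")
print(f"vs Gen 2: {percent_gain(NEW_PLATFORM_GHZ, GEN2_GHZ):.2f}%")
```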

Qualcomm stated that the second-generation Qualcomm Oryon CPU microarchitecture and new memory technology will ultimately bring an excellent user experience to the Snapdragon 8 Elite, including faster application launch speeds, seamless multitasking, and advanced generative AI capabilities.

That said, compared to the new CPU architecture and memory system brought by the Snapdragon 8 Elite, AI is the more attention-grabbing focus of this mobile platform.

Since we are talking about AI performance, we cannot overlook the AI computing dedicated chip "Hexagon NPU," which Qualcomm first adopted in the Snapdragon 8 Gen 2. This is the core of Qualcomm's AI engine.

What improvements have been made to the Hexagon NPU in this generation of mobile flagship platform?

First, the number of scalar and vector accelerators has been increased: there are 8 scalar cores and 6 vector cores; second, data throughput has been enhanced across the board; third, there is a tensor accelerator, akin to a super-large core, which improves overall NPU performance and efficiency by 45% and doubles the token generation rate on foundational large language models.

Broken down by task, the tensor accelerator mainly handles acceleration of large vision models (LVM); the scalar accelerator is responsible for accelerating large language models (LLM); and the vector accelerator handles long-context support. Together, they enhance overall computational capability while supporting ultra-long text and LLM acceleration.

Judging by the response speed of some popular large language model applications, the Snapdragon 8 Elite can process over 70 tokens per second, while the Snapdragon 8 Gen 3 achieves 20 tokens per second (for a 7-billion-parameter LLM).
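To put those token rates in user-facing terms, the sketch below estimates how long a reply would take at each speed. The 350-token reply length is a hypothetical figure chosen for illustration, and since the article says "over 70", the new-platform time is an upper bound.

```python
def seconds_to_generate(num_tokens, tokens_per_second):
    """Time to produce a reply at a steady token generation rate."""
    return num_tokens / tokens_per_second


def speedup(new_rate, old_rate):
    """How many times faster the new rate is than the old one."""
    return new_rate / old_rate


REPLY_TOKENS = 350   # hypothetical reply length, not from the article
NEW_RATE = 70        # tokens/s quoted for the new platform (a lower bound)
GEN3_RATE = 20       # tokens/s quoted for the Snapdragon 8 Gen 3

print(seconds_to_generate(REPLY_TOKENS, NEW_RATE))   # seconds on the new platform
print(seconds_to_generate(REPLY_TOKENS, GEN3_RATE))  # seconds on the Gen 3
print(speedup(NEW_RATE, GEN3_RATE))
```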

The AI capabilities of this generation of mobile platforms are based on a significantly upgraded Hexagon NPU, which can support the construction of personalized multimodal AI agents on the edge. This is particularly crucial for enhancing user experience.

The Snapdragon 8 Elite can provide support at the base level for multimodal models, including automatic speech recognition (ASR), large language models (LLM), large vision models (LVM), and the new multimodal large model (LMM). Through heterogeneous computing, these AI models can run on different cores of the Qualcomm AI engine.

These technological capabilities can bring unprecedented new experiences to smartphone users.

For example, using the smartphone's sensors and cameras, a personal neural network can be built locally on the edge based on the user's daily preferences, allowing the AI personal agent to understand user needs more effectively, akin to a real human assistant.

AI Personal Agent and Image Elimination

In terms of user experience, thanks to the Hexagon NPU module, smartphones can understand the images displayed on the screen, even grasping complex user intentions and possessing the technical capability to provide immediate solutions.

For instance, when a user points the smartphone camera at something they want to learn about and asks the phone a question, the phone can utilize the real-time camera feed to perform a deconstruction analysis and provide an answer.

There are also more complex applications, such as when a user speaks a sentence to the smartphone, the smartphone has the capability to deconstruct the user's vague intentions and automatically fulfill the user's needs throughout the process. This essentially provides users with a highly "human-like" AI personal assistant, offering a new experience reminiscent of science fiction scenarios.

This experience has already been realized in the newly launched Magic 7 series, equipped with the built-in Honor AI personal agent YOYO and MagicOS 9.0, which was unveiled on October 31. The Magic 7 series is powered by the Snapdragon 8 Elite.

Honor claims that with smartphones running MagicOS 9.0, users only need to say "a sentence" to accomplish complex tasks such as ordering food or canceling hidden subscription fees. This greatly expands the intelligent experience of AI smartphones and marks a significant step forward compared to OPPO's preference for AI photo editing and Xiaomi's focus on AI photography.

The new AI experience of this smart terminal is actually based on the powerful underlying AI technology of the Snapdragon 8 Elite's Hexagon NPU.

By understanding complex user intentions, the Snapdragon 8 Elite can deliver relatively demanding AI experiences, while simpler tasks like removing unwanted pedestrians from static photos are a piece of cake.

However, Qualcomm, as a technology giant, will not stop there. This time, the AI capabilities of the Snapdragon 8 Elite have also been extended to the video domain.

Qualcomm has equipped its AI engine Hexagon NPU with a collaborative hardware module: AI ISP (Image Signal Processor).

The main function of the AI ISP is to enhance computational photography performance, such as running more processing pipelines in the RAW domain.

This means that when the AI ISP performs shooting actions like autofocus, automatic white balance correction, and automatic exposure, it supports AI-assisted enhancement features, ultimately achieving better imaging performance, such as improved image quality (higher clarity or brightness, better color balance) and higher frame rate videos.

In addition, Qualcomm has combined two Micro NPUs, two AI ISPs, one DSP (Digital Signal Processor), and one memory to form the Qualcomm Sensing Hub, resulting in a 60% overall AI performance improvement and a 45% increase in AI inference speed.

In terms of specifications, the pixel throughput of the AI ISP has increased by 33%, reaching 4.3 billion pixels per second; at the same time, this ISP can support up to three cameras of up to 48 megapixels each, capturing at 30FPS with zero shutter lag.
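The quoted throughput lines up with the triple-camera figure: three 48-megapixel streams at 30 frames per second come to roughly 4.3 billion pixels per second. Whether Qualcomm derives its headline number this way is an assumption; the sketch below only shows that the arithmetic is consistent, and also backs out the previous-generation throughput implied by a 33% gain.

```python
def pixels_per_second(cameras, megapixels, fps):
    """Aggregate pixel rate for several identical camera streams."""
    return cameras * megapixels * 1_000_000 * fps


def baseline_from_gain(new_value, gain_percent):
    """Value before an improvement of `gain_percent` percent."""
    return new_value / (1 + gain_percent / 100)


triple_camera_rate = pixels_per_second(cameras=3, megapixels=48, fps=30)
print(triple_camera_rate / 1e9)  # in billions of pixels per second

# Implied earlier throughput, in billions of pixels per second,
# if 4.3 billion represents a 33% increase.
print(round(baseline_from_gain(4.3, 33), 2))
```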

Where does the so-called new AI-assisted enhancement feature manifest? It's simple: it can achieve 60fps real-time video shooting quality at 4K resolution.

So, how do the Hexagon NPU and AI ISP collaborate? What is their role?

Qualcomm uses Hexagon Direct Link technology to achieve collaboration between the two, allowing the Hexagon NPU to directly access the native raw data from the ISP sensor, utilizing the NPU's technical capabilities to assist the ISP in faster image segmentation (Insight AI), further understanding the various elements in the image, and achieving quicker "blurring" or "object removal."

Yes, this is similar to the AI photo removal feature introduced starting with the OPPO Find X7 series.

This time, on the Snapdragon 8 Elite, video object removal has also been realized: simply select the object you want to erase in a 30FPS video, and it can be removed.

Thanks to the powerful performance of the Hexagon NPU and its tight collaboration with the AI ISP, the entire process runs on the device, eliminating the need for cloud processing and the latency that comes with it, for a top-notch experience.

GPU Slicing Architecture and Offline Communication

The upgrade focus of each generation of Snapdragon mobile platforms, in addition to CPU, NPU, and ISP, also includes the GPU and modem.

Among them, the GPU is a traditional strength of the Snapdragon flagship mobile platform, which is why the industry claims that the Snapdragon mobile platform offers a GPU with a CPU purchase.

Perhaps due to the abundance of technology categories, Qualcomm still did not give the new-generation Adreno GPU of the Snapdragon 8 Elite a catchy marketing name.

The new Adreno GPU of the Snapdragon 8 Elite adopts a slice architecture for the first time, dividing the shader cores and other fixed-function blocks into three slices, each running at 1.1GHz (compared to 900MHz in the previous generation) and all receiving unified scheduling from the command processor.

When rendering complex scenes, it can keep 12MB of data in a graphics cache directly on the GPU, reducing the need to send additional graphics data to the Snapdragon 8 Elite's system memory (RAM), resulting in lower latency, smoother application performance, longer battery life, clearer graphics, and more realistic 3D environments.

This design approach is similar to NVIDIA's GPC/TPC/SM hierarchical structure, AMD's CU compute units, and Intel's Render Slice (the core component of the Xe-GPU architecture).

Among them, Intel's Render Slice includes 4 Xe-Cores and 1 ray tracing unit, as well as other IPs such as geometry pipelines, rasterization pipelines, samplers, and pixel backends, forming the foundation of Intel's Arc GPU.

Through this design, the Snapdragon 8 Elite allows for more dynamic resource allocation, higher clock speeds, and better load balancing, while also reducing power consumption by shutting down slices.
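A toy model can make the slice idea concrete. The sketch below computes the clock uplift from the quoted frequencies and then picks how many of the three slices to keep powered for a given workload. The `slices_needed` policy is purely hypothetical, an illustration of power gating in general rather than Qualcomm's actual scheduler.

```python
import math

SLICE_COUNT = 3  # number of GPU slices quoted in the article


def clock_gain_percent(new_mhz, old_mhz):
    """Relative clock increase, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100.0


def slices_needed(load_fraction, slice_count=SLICE_COUNT):
    """Hypothetical policy: power just enough slices to cover the load.

    `load_fraction` is the share of full GPU capacity the workload needs
    (0.0 to 1.0). At least one slice stays on so the GPU can respond.
    """
    if not 0.0 <= load_fraction <= 1.0:
        raise ValueError("load_fraction must be between 0.0 and 1.0")
    return max(1, math.ceil(load_fraction * slice_count))


print(round(clock_gain_percent(1100, 900), 1))  # 1.1GHz vs 900MHz
for load in (0.2, 0.5, 1.0):
    active = slices_needed(load)
    print(f"load {load:.0%}: {active} active, {SLICE_COUNT - active} shut down")
```

Under this toy policy a light workload runs on a single slice while the other two stay off, which is the kind of saving the slice design is meant to enable.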

According to data provided by Qualcomm, the new GPU launched this time has improved performance by 40%, energy efficiency by 40%, and ray tracing performance by 35% (thanks to the upgraded Snapdragon Elite Gaming technology).

Ray tracing is, put simply, the simulation of various light effects, such as reflection and refraction, scattering and dispersion. It brings lifelike lighting, reflections, and illumination to mobile games, achieving exquisite graphics that closely resemble real environmental lighting.

To enhance the gaming experience, the Snapdragon 8 Elite brings a core capability of Unreal Engine 5, the Nanite solution, to an edge mobile platform for the first time, while also upgrading Unreal Engine 5 support to version 5.3, much as the Snapdragon 8 Gen 2 previously brought Unreal Engine 5's Metahuman framework to the edge.

The Nanite solution allows developers to use high-polygon models in games and real-time rendering projects without significantly impacting performance; the Metahuman framework is dedicated to creating realistic digital human characters.

By adopting a new Virtualized Geometry System, Nanite enables even low-end machines to run complex large models. This is crucial for improving the accessibility and performance of games and real-time rendering projects.

The value of this system lies in its intelligence, as it only processes and renders details that the human eye can observe, using a highly compressed data format, thereby greatly reducing rendering pressure.
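The principle of rendering only detail the eye can see can be illustrated with a classic level-of-detail (LOD) selection by screen-space error. This is a generic sketch, not Nanite's actual algorithm, which works on clusters of triangles inside Unreal Engine. The function names, the 60-degree field of view, the 1440-pixel screen height, and the one-pixel threshold are all assumptions made for illustration.

```python
import math


def projected_error_px(geometric_error, distance, fov_y_deg, screen_height_px):
    """Size on screen, in pixels, of a world-space error seen at `distance`."""
    view_height = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    return geometric_error * screen_height_px / view_height


def pick_lod(lods, distance, fov_y_deg=60.0, screen_height_px=1440,
             threshold_px=1.0):
    """Pick the coarsest LOD whose simplification error stays invisible.

    `lods` is a list of (triangle_count, geometric_error) tuples ordered
    from finest to coarsest. Falls back to the finest LOD if none qualifies.
    """
    chosen = lods[0]
    for lod in lods:
        error_px = projected_error_px(lod[1], distance, fov_y_deg,
                                      screen_height_px)
        if error_px <= threshold_px:
            chosen = lod  # a coarser LOD whose error is still under a pixel
    return chosen


# Hypothetical model with three detail levels.
model_lods = [(1_000_000, 0.001), (100_000, 0.01), (10_000, 0.1)]
for d in (2.0, 20.0, 200.0):
    print(d, pick_lod(model_lods, d)[0])
```

The farther the object, the fewer triangles are needed before the simplification error drops below one pixel, which is why distant high-polygon models can be drawn cheaply.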

Wall Street News noted that the current generation Adreno GPU still only supports OpenGL ES 3.2 and Vulkan 1.3, just like the Snapdragon 8 Gen 2, aimed at enhancing the graphics processing efficiency of mobile devices for large mobile games.

In terms of communication connectivity, the Snapdragon 8 Elite supports AI-enhanced 5G and Wi-Fi connectivity, integrating the Snapdragon X80 5G modem and RF system, which includes a second-generation 5G AI processor.

The Snapdragon X80 5G modem boasts several industry firsts: first to support downlink 6-carrier aggregation, first to support 6 Rx receiver paths, first to support AI/5G-A integration, first to support AI multi-antenna management, first to support CPE AI-enhanced communication, and first to support NB-NTN (non-terrestrial network) satellite communication.

It is worth mentioning that the Xiaomi 15 Pro uses the Snapdragon 8 Elite and is the first to feature the Xiaomi Star Communication System, allowing the phone to achieve two-way calls within a radius of 3.5 kilometers without any network. This feature, along with the Snapdragon 8 Elite's support for NB-NTN (non-terrestrial network) satellite communication technology—which allows communication connections in areas without terrestrial network coverage—seems quite similar, doesn't it?