Track Hyper | Huawei and Apple Enter the Field of Edge AI with Large Models

New Hope for Innovation in Intelligent Terminal Technology.

Apple and Huawei, two heavyweights of the smart-device industry, have recently entered the race to run large AI models on the device itself (edge AI), giving smart terminals, and smartphones in particular, fresh hope and momentum to pull out of their slump.

Apple GPT is an AI tool that Apple built on its self-developed Ajax framework; it is currently in small-scale internal testing. Huawei, for its part, demonstrated edge AI model capabilities through its intelligent assistant Xiaoyi in HarmonyOS 4, unveiled at the HDC 2023 developer conference on August 4.

Before them, Meta, OpenAI, Qualcomm, Google, Tencent, and Baidu had already launched, or announced plans to launch, applications or technical frameworks that support edge AI models, and together they are rapidly shaping the technical and market prospects of this new direction.

HarmonyOS 4: What are the capabilities of edge AI models?

On August 4, Huawei officially released HarmonyOS 4. HarmonyOS is a distributed operating system designed for the Internet of Things (IoT), supporting various terminal devices such as smartphones, tablets, smart wearables, and smart screens.

In the fourth version of this well-known IoT operating system, edge AI capability, that is, running large models on the device itself (for example, a smartphone), has become a key focus.

"Today, we have entered the era of large models, and Huawei's PanGu large model will empower the HarmonyOS ecosystem," said Yu Chengdong, Executive Director, CEO of Huawei's Consumer Business Group, and CEO of the Intelligent Automotive Solutions Business Unit. "With the underlying capabilities of PanGu, Huawei will bring users a brand-new AI experience in intelligent terminal interaction, productivity, and personalized services."

Xiaoyi, the system's intelligent assistant, has been upgraded with large-model capabilities to improve interaction, productivity, and personalized services. Text generation and summarization are standard features of large-model applications, and HarmonyOS 4 is no exception.

Through Xiaoyi, the system can also recognize objects and text in images, read textual content, and connect to more services.

Ziad Asghar, Senior Vice President of Product Management and Head of AI at Qualcomm, believes that large models will rapidly reshape human-computer interaction.

At first glance, the interaction changes in HarmonyOS 4 may not look significant. Building on voice interaction, the assistant now also accepts text, images, and documents as input. Users can talk to the AI in everyday language, and Xiaoyi automatically completes the requested task, something Apple has already achieved with Siri.

However, as the first intelligent assistant built on a large AI model, Xiaoyi understands the semantics of natural language more deeply.

For example, Apple's Siri needs clear, precisely worded voice instructions, whereas HarmonyOS 4 can understand instructions whose meaning is less explicit, so smartphones running HarmonyOS 4 can complete tasks more accurately and quickly. In addition, because the large model generalizes well and the assistant is connected to a range of services and specific scenarios, Xiaoyi can reach the service a user wants faster than before.

For example, when a user receives a promotional poster and asks Xiaoyi to handle it, the assistant can recognize the address on the poster and offer a navigation button, or save the phone number on the poster as a contact.

When a user is browsing a lengthy English news article, Xiaoyi can quickly read it, translate it, and summarize it. It can also answer questions about the article.

HarmonyOS 4 also has generative capabilities. For example, it can draft various kinds of business email or generate images, and its AI drawing function can turn the user's own photos stored on the device into images in a variety of styles.

As users keep interacting with Xiaoyi, its AI capabilities will continue to improve. This interaction data stays on the device to protect user privacy.

Personalization is another focus. With prolonged use, Xiaoyi remembers more, understands its owner better, and offers more thoughtful suggestions. For example, it can provide comprehensive travel information before a trip and local information after arrival, as well as personalized recommendations based on the user's habits.

According to Huawei, these new capabilities of Xiaoyi will be available for public testing and experience in late August.

This is not Huawei's first attempt to bring large AI models to mobile devices. In March of this year, Huawei released the P60 smartphone with an intelligent image search feature. The feature is based on multimodal large-model technology and runs on the device by using a miniaturized version of the model.

Apple Moves Slowly, Qualcomm's Enthusiasm Rises

Huawei is not the only smart terminal maker eyeing on-device large AI models. Apple is also in the game.

Apple is quietly developing a tool called Apple GPT, built on its self-developed Ajax framework.

So far, however, few details about Apple GPT are available, and outsiders can hardly tell what distinguishes its technology or its applications. Apple, for its part, has made clear that the future direction of Apple GPT has not yet been decided.

Previously, during the second-quarter earnings conference call, Apple CEO Tim Cook acknowledged the enormous potential of AI but emphasized the need for further consideration on how to use AI technology. Cook stated that Apple has integrated AI technology into its products and services and will continue to do so in the future.

Apple's Siri, the world's first NLP-based intelligent assistant for the consumer market, has been around for 12 years. Next to "younger" assistants such as Huawei's newly upgraded Xiaoyi, Xiaomi's Xiaoai, and Baidu's Xiaodu, Siri not only seems less intelligent but comes across as somewhat "dim-witted."

The aging Siri is widely expected to be the vehicle through which Apple deploys GPT-style capabilities, although Apple has not confirmed this. Even so, there are many signs that Apple is moving into on-device AI models. In January of this year, Apple launched a program to add digital narration to Apple Books, generating high-quality AI audio narration from written text. In the iOS 17 update, Apple used a Transformer language model to improve the keyboard's predictive text and autocorrect.

Furthermore, the new AirPods Pro use machine learning (ML) for an adaptive audio mode that adjusts the volume automatically according to the surrounding environment. iPadOS 17 uses ML models to recognize fields in PDFs, and Vision Pro uses ML, specifically an encoder-decoder neural network, to create digital avatars of users.

How Apple is positioning itself in the AI race is too complex a question to answer in a few hundred words. However, the engineers on Siri's development team show little real motivation to fix Siri's "stupidity," which reflects a reality of "organizational barriers and lack of ambition" that may slow Apple's rollout of on-device AI models.

Nevertheless, Apple's slow pace does not change the fact that on-device AI models have immense potential in smart terminals. Qualcomm, Meta, OpenAI, Google, Amazon, Tencent, and Baidu are all working on lightweight deployment of AI models on mobile devices.

In the fourth week of July, OpenAI released the ChatGPT app for Android, so that its mobile app now covers both iOS and Android. Meta is working with Qualcomm so that, starting in 2024, its open-source large model Llama 2 can run on flagship smartphones and PCs. Qualcomm says it is committed to moving more generative AI use cases to the edge, and that AI models with more than 1 billion parameters can already run on smartphones with performance and accuracy close to those in the cloud.
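To give a sense of why parameter count and model miniaturization matter on a phone, the weight storage of a model can be estimated as parameters times bytes per parameter. The sketch below is only back-of-the-envelope arithmetic: the function name and the list of precisions are illustrative choices, not figures from Qualcomm or Meta, and it ignores activations, caches, and runtime overhead. The 7-billion-parameter entry corresponds to the smallest publicly released Llama 2 size.

```python
# Rough estimate of the memory needed just to store a model's weights.
# Assumption: memory = parameter count x bytes per parameter; activations,
# caches, and runtime overhead are deliberately ignored.

# Bytes used per parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    models = [("1B parameters", 1e9), ("7B parameters", 7e9)]
    for label, params in models:
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gb(params, precision)
            print(f"{label} at {precision}: about {size:.1f} GB of weights")
```

Under these assumptions, a 1-billion-parameter model needs about 2 GB for its weights at 16-bit precision, while a 7-billion-parameter model needs about 14 GB at 16-bit and about 3.5 GB at 4-bit, which is why compression is central to on-device deployment.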

Google, Tencent, Baidu, and others have tightly integrated model compression techniques into their frameworks and tools for deploying models on mobile devices.

Of these companies, Qualcomm is particularly enthusiastic. Cristiano Amon, CEO of Qualcomm, emphasized in a statement that the ability to run AI models on smartphones rather than on cloud servers gives the company a chance to reach a "turning point" and drive future growth.

"In short, we are in a unique position to shape and leverage the upcoming Gen AI opportunities on devices," Amon said.

It is still uncertain when the overall decline in the smartphone market will stop. However, the influx of enterprise-side (B2B) players into the on-device AI model race brings new hope for reshaping the applications and market landscape of this increasingly sluggish consumer electronics category.