From Transformer to Brain-Computer Interface, Apple's press conference reveals much more than just MR.

Could Apple be the company that best combines brain-computer interfaces with AI applications?

Critics have complained about the high price and poor practicality of Apple's MR device, Vision Pro, which requires an external battery. However, according to Jianzhi Research, Apple's presentation last night exceeded expectations.

Apple has always prided itself on not just showcasing technology, but applying all of its cutting-edge tech to create the best possible user experience. And with Vision Pro, Apple has done it again.

Controller-free gesture control in mid-air, seamless screen switching between devices, smooth UI control with real-time feedback, 3D cameras, and more: each interaction looks simple on its own, but together they amount to a groundbreaking application of technology.

At WWDC23, Apple did not dwell on flashy technology for its own sake; it focused on how cutting-edge tech is used in product interaction. What impressed Jianzhi Research most was the real-time feedback Vision Pro gives when the user controls it with both hands in the air. Anyone who has used wireless devices knows how frustrating latency can be, but judging from Apple's promotional footage, these operations appear to be almost entirely real-time.

The market attributes this to the multiple cameras built into Vision Pro. However, Jianzhi Research found an answer this morning on Twitter from a neurotechnology designer at Apple, and it is far more complex than just cameras.

According to Sterling Crispin, an Apple neurotechnology designer, "This monitors changes in the pupil and can be used to predict future behavior. This technology has been validated and is a very cool experience. In mixed reality or virtual reality experiences, AI models attempt to predict whether users are curious, distracted, afraid, attentive, recalling past experiences, or in other cognitive states. These can be inferred through measurements such as eye tracking, brain activity, heart rate and rhythm, muscle activity, blood density, blood pressure, skin conductivity, and more."

At the same time, Sterling Crispin also mentioned that Vision Pro uses machine learning to monitor signals from the body and brain to predict human emotions, creating a more suitable virtual environment to enhance the user experience.

This shows that Vision Pro's brain-computer-interface-style prediction of user behavior is a groundbreaking development for XR devices. Until now, XR development has focused on hardware upgrades, while the biggest problem has always been the difficulty of improving interaction and immersion. Apple has exceeded expectations in this area, even though its hardware specifications had already circulated widely in the market.

Apple: People Shouldn't Be Symbols in Cyberpunk

Other details highlight Apple's philosophy: it does not want people to become mere symbols in a cyberpunk world.

In a creative touch, Apple renders the wearer's eyes on the front of Vision Pro (Mr. Fu Peng, who is always keen on "black tech", changed his avatar right away), emphasizing the role of eye contact when people communicate with each other.

This is not the first time Apple has emphasized the importance of eye contact in conversation. Several versions ago, FaceTime began using AI to adjust the position of the eyes in video, so that a user looking at the screen appears to be looking toward the other person.

Vision Pro takes eye contact further. When someone is facing the wearer, the two can talk directly through the MR device without the wearer having to take it off. When the wearer's pupil behavior changes, the wearer can re-enter the immersive virtual space.

When you use FaceTime on Vision Pro, the other party sees a real-time rendering of you generated with deep learning, and digital content is integrated into the real world through the spatial operating system visionOS.

Apple's Own AI is Here

The market believes that Apple's event contained nothing about AI models, but this, too, is mistaken.

Jianzhi Research found that in its introduction of the newly upgraded iOS 17, Apple announced that it uses Transformer language models for keyboard input and for speech-to-text dictation.

With the Transformer model, Apple can keep improving the experience and accuracy as each user types. The keyboard automatically corrects grammar and offers real-time predictive text recommendations during typing; tapping the space bar adds an entire word or completes a sentence, making text input faster than ever. Dictation uses a brand-new speech recognition model, which further improves accuracy.

The Transformer is the basis of a series of large models, such as OpenAI's GPT series. Given Apple's usual privacy policy, it is not surprising that this technology runs entirely on the device.

Apple's introduction of language prediction models into keyboard input and speech recognition is a prime example of combining cutting-edge technology with applications. The Transformer is the strongest underlying technical support for human-machine dialogue assistants, and in Jianzhi Research's view Apple is the first company to embed this technology in a mobile operating system. iOS 17 will bring a new experience in speech and text input. We can also expect that Apple is likely to bring on-device LLMs to users next year. This expectation of on-device deployment rests partly on Apple's strong concern for user privacy and partly on the considerable expertise Apple has accumulated in processing hardware. Large models should be more effective when software and hardware are integrated.

Just as the Mac ushered in the era of personal computing and the iPhone ushered in mobile computing, Vision Pro will usher in the era of spatial computing.

According to Jianzhi Research, the emergence of Vision Pro will truly open up a new era of computing, and generative AI may also complement Apple's MR devices, given that real-time rendering in MR has far greater computational requirements than generative AI. In a previous article, "AI+XR Will Become the Next Battlefield for Mobile Devices", we argued that the rapid development of generative AI, combined with MR, will bring a comprehensive upgrade of mobile products. The biggest gains should come in application content innovation, which will go a long way toward solving the current shortage of popular XR content.

Conclusion

For seven years, Apple has applied its mountain-moving capabilities to every detail of Vision Pro. This kind of software and hardware integration is not something anyone can easily imitate; in every small problem, one can see how much effort Apple has put into solving it.

This is also why Vision Pro still exceeded Jianzhi Research's expectations, even though the hardware BOM list for Apple's MR device was circulating everywhere before the release.