Microsoft competes head-on with Google's AI search service, Windows AI assistant upgraded significantly: Copilot can analyze website content, engage in GPT-like voice conversations
Microsoft has launched a series of new Copilot features. Copilot Vision can interpret the content of popular websites; its data is deleted immediately after each conversation, and it cannot process paywalled or sensitive content. Copilot Daily provides voice news summaries, for which Microsoft pays partner publishers such as Reuters and the Financial Times. Think Deeper can reason through complex problems and may be backed by OpenAI's o1 model. Personalization recommends ways to use Copilot based on a user's past interactions. Bing generative search, Microsoft's challenge to Google's AI Overviews, went live in the United States on Tuesday.
Author: Li Dan
Source: Hard AI
Microsoft made a big move this Tuesday, giving its personal artificial intelligence (AI) assistant Copilot a major upgrade. The company has started rolling out a series of new Copilot features to all Windows users, including a new tool that can understand what is on the user's screen and answer questions about it. At the same time, Microsoft's web search engine Bing officially launched its AI-generated summary feature, directly challenging Google's similar AI search feature, AI Overviews.
Starting Tuesday, October 1, Eastern Time, Microsoft launched a new version of the Copilot app for Apple's iOS, Google's Android, Windows, and the web, claiming that all of these apps have a more "unique" and "heartwarming" style. Microsoft has also brought Copilot to Meta's messaging platform WhatsApp, allowing users to chat with it via direct messages (DMs), much as they would with other bots on the platform.
Copilot Vision Interprets Website Content and Deletes Data Immediately After Conversation
Copilot Vision is the most prominent of the new Copilot features. As the name suggests, it can see the content users are viewing on their personal computers (PCs), specifically the websites they visit in Microsoft's Edge browser.
Copilot Vision is a new, experimental, opt-in feature exclusive to Copilot Pro subscribers. Users can allow it to analyze the text and images on a web page and answer questions about that content. For example, a user can ask how to make a dish shown in an image, and Copilot will provide a recipe.
Users can summon Copilot Vision by typing "@copilot" in the Edge address bar, similar to the search technology Google provides on Android systems and Chrome browsers. However, Microsoft states that Copilot Vision is more powerful than previous screen analysis features and places a greater emphasis on privacy.
Microsoft claims that Copilot Vision can suggest next steps for users, answer their questions, help guide them towards what they want to do, and assist in completing tasks, all through natural language conversation. For example, if a user wants to decorate a new apartment, Copilot Vision can help search for furniture, find suitable color palettes, consider all choices from carpets to blankets, and even suggest how to arrange items the user is looking at.
Regarding privacy, Microsoft emphasizes that conversation data from Copilot Vision is deleted immediately after the conversation ends. The processed audio, images, and text will not be stored or used for model training, at least not in this preview version. Additionally, Copilot Vision is limited in the types of websites it can analyze and interpret: it works only on pre-approved "popular" websites. Currently, Microsoft prevents the feature from processing paywalled content and "sensitive" content, but it has not disclosed what counts as sensitive.
Copilot Vision is currently available only in the United States. Microsoft stated that the feature will respect websites' "machine-readable controls for AI," such as rules that prohibit bots from scraping data for AI training. However, the company has not said exactly which controls Vision will respect, and several are currently in use. We have requested clarification from Microsoft.
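One of the most widely used "machine-readable controls" of this kind is a site's robots.txt file. The sketch below is only an illustration of how such a control can be checked, using Python's standard `urllib.robotparser`; it does not describe how Copilot Vision works. The robots.txt content, the `ExampleAIBot` and `SomeBrowser` user agents, and the `may_fetch` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from an in-memory string, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    print("ExampleAIBot allowed:", may_fetch(ROBOTS_TXT, "ExampleAIBot", url))
    print("SomeBrowser allowed:", may_fetch(ROBOTS_TXT, "SomeBrowser", url))
```

Under these made-up rules, the AI crawler is refused and an ordinary visitor is allowed. Which real-world controls Copilot Vision honors remains unconfirmed.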
Copilot Daily Provides Voice News Summaries; Microsoft Pays Partner Publishers
Microsoft released a new feature called Copilot Daily on Tuesday, providing users with voice summaries of weather and current events. To support this feature, Microsoft will pay publishers for the content featured in Copilot Daily.
Amazon and Google's voice assistants, Alexa and Google Assistant, have long been providing similar daily briefings. Microsoft stated that Copilot Daily "can alleviate people's familiar sense of information overload," and that it is "concise, simple, and easy to understand, extracting content only from authorized sources." Over time, the feature will provide reminders and customization options.
Microsoft has so far launched Copilot Daily only in the United States and the United Kingdom. Reuters, Axel Springer, Hearst Magazines, USA Today Network, and the Financial Times are partnering with Microsoft to provide content for the feature. Microsoft plans to "quickly" add more paid publishers and expand Copilot Daily to more countries.
Think Deeper Reasons Through Complex Problems, May Be Backed by OpenAI's o1 Model
Similar to Vision, Copilot's new feature, Think Deeper, aims to make Microsoft's AI assistant more flexible.
Microsoft stated that Think Deeper enables Copilot to reason through more complex problems, thanks to "reasoning models" that take more time to reason and provide step-by-step answers. Microsoft did not disclose which reasoning models are involved, saying only that the feature uses "OpenAI's latest model, with some modifications by Microsoft." Media reports speculate that this refers to a customized version of OpenAI's o1 model.
Starting this Tuesday, Think Deeper is being made available to a limited number of users in Copilot Labs in Australia, Canada, New Zealand, the United States, and the United Kingdom.
Copilot Voice Resembles ChatGPT's Advanced Voice Mode
Another new feature, Copilot Voice, is launching first in English in New Zealand, Canada, Australia, the United Kingdom, and the United States. Microsoft has added four synthesized voices, allowing users to choose which voice Copilot uses in conversation. Like the Advanced Voice Mode that OpenAI provides for ChatGPT, Copilot Voice can recognize the user's tone during a conversation and respond accordingly. Users can also interrupt at any time while Copilot Voice is responding, just as in a human conversation.
However, there are limitations on the usage duration of Copilot Voice. Microsoft stated that subscribers to Copilot Pro can get more Copilot Voice conversation time, but the specific duration is "variable" and depends on demand.
Personalization Uses Past Interactions to Recommend Copilot Usage
Microsoft said that with the new personalization settings enabled, Copilot will soon align more closely with the user's preferences. It will draw on the user's past interactions and history, as well as interactions with other Microsoft apps and services, to recommend ways to use Copilot. This can help users get started by offering both a convenient guide to Copilot's practical functions and conversation starters.
The personalization feature of Copilot can be turned off in the Copilot settings menu on Windows and will not be launched in the UK and EU countries in the near future.
Bing Generative Search Launches in the US on Tuesday
In July of this year, Microsoft launched a trial version of Bing Generative Search, and this week, the service officially launched for all users in the US. The easiest way to access it is by searching "Bing generative search" on Bing.
Microsoft said it will introduce an option that makes it easier for users to trigger Bing Generative Search for "information queries."
Bing Generative Search is seen as Microsoft's response to Google's AI search. As Wall Street News previously reported, Google unveiled three major innovations in search at its 2024 Google I/O developer conference in May. One of them is the AI-generated summary feature called AI Overviews, which displays a generated summary at the top of search results.
Bing Generative Search combines multiple AI models to generate summary content in response to search queries. For example, when a user searches for "What is an Italian Western film?" Bing Generative Search will display a summary of the history and examples of this type of film, along with source links.
Microsoft believes that Bing Generative Search goes beyond simply finding answers. The company stated that it can understand search queries, review millions of information sources, dynamically match content, and present search results in a new AI-generated layout that more effectively meets the user's search intent.
![](https://wpimg-wscn.awtmt.com/ef284b39-ac5f-479b-a97a-7c68b2dda087.jpeg)
![](https://wpimg-wscn.awtmt.com/5ec72c9a-a165-419f-bf10-0fc4acda1fe0.jpeg)