According to people familiar with the matter, Microsoft Corporation (MSFT.US) has reached an agreement with HarperCollins Publishers, a subsidiary of News Corporation (NWS.US), to use the publisher's non-fiction books to train its artificial intelligence models, with the aim of improving model quality and performance. The deal covers only selected backlist titles used for model training and does not involve generating new books, and authors can choose whether to participate. Microsoft intends to feed the HarperCollins titles into an AI model it has not yet announced, expanding its pool of high-quality text and improving the model's accuracy and expertise.

Microsoft declined to comment, but HarperCollins confirmed the agreement, saying it will allow limited use of selected non-fiction backlist titles to train AI models. The publisher said that part of its role is to present authors with opportunities for their consideration while protecting the underlying value of their works and the revenue and royalties it shares with them. It added that the agreement, with its limited scope and clear restrictions on model output that respect authors' rights, achieves that goal.

Technology companies, Microsoft among them, have been seeking more high-quality text to train AI models. They license data ranging from social media posts to news articles to make their programs more accurate and better at answering questions or providing expertise on specific topics. Notably, News Corporation had previously signed an agreement with OpenAI, allowing it to use content from several of its publications.
Microsoft has also collaborated with multiple publishers on AI projects. Additionally, earlier this year, Google reached a $60 million agreement with Reddit, enabling the search giant to use content from a large number of subreddits to train its AI models. However, some publishers object to AI companies using their content without permission and have filed lawsuits. For example, The New York Times has sued OpenAI and Microsoft, accusing them of copyright infringement.

In summary, the agreement between Microsoft and HarperCollins marks another step in technology companies' search for high-quality text to train AI models. How to respect authors' rights while using these resources remains a challenge that publishers and technology companies must face together.