ANT GROUP initiates "Privacy Revolution"

Wallstreetcn
2024.07.06 11:07

Promoting the secure circulation of data

AI is accelerating the entire industry towards the era of data privacy.

On July 5, at the 2024 World Artificial Intelligence Conference, ANT GROUP launched the first product of ANT Privacy Computing: the "YinYu Cloud" large model privacy computing platform.

According to Wang Lei, CEO of ANT Privacy Computing, the "YinYu Cloud" large model privacy computing platform mainly provides two capabilities: large model privacy hosting and large model privacy inference.

Privacy hosting mainly solves the problem of protecting the intellectual property (IP) of large models. When a large model is deployed in the cloud, it is converted into a privacy-protected form so that the model's IP cannot be stolen by others. Large model privacy inference mainly addresses the protection of access information, ensuring that this information stays private throughout the entire inference process.

Currently, high-quality data supply and secure circulation have become the primary challenges for large models to enter vertical industry applications. When large models are applied in vertical industries, many enterprises address data security challenges through private deployment, which not only increases the operation and service costs of enterprises but also affects the efficiency and quality of external services.

Professional data is usually distributed among different institutions and enterprises, making it difficult to share due to its high value and confidentiality. At the same time, there are trust barriers between enterprises, large model manufacturers, and users: enterprises are concerned about data leakage, manufacturers are concerned about model asset security, and users are concerned about personal privacy risks.

Wang Lei revealed that the YinYu Cloud platform will provide end-to-end data security services, covering the entire process from building to serving large models. The platform will provide privacy computing for pre-training, fine-tuning, evaluation, inference, and user interaction of large models, ensuring the secure circulation of data between providers and users. The platform will also provide full-chain development tools, including privacy retrieval, prompts, and process orchestration.

Wei Tao, Vice President and Chief Technology Security Officer of ANT GROUP, and Chairman of ANT Privacy Computing, believes that data supply determines the upper limit of the application capabilities of large models, while privacy computing technology determines the upper limit of cross-domain data supply. When large models transition from general to professional applications, from technical imagination to industrial productivity, they must address the challenges of scarce high-quality data sets and professional data blockages. Otherwise, as an "intellectual engine," large models will only end up idling.

At the end of May, ANT GROUP announced a technology strategy centered on AI and data technology, and established Zhejiang ANT Privacy Computing Technology Co., Ltd., which will provide products and services related to privacy computing, including an end-to-end data security guarantee, a hardware-software integrated computing acceleration solution, and a privacy computing cloud service platform to promote secure and trusted low-cost cross-cloud and cross-terminal data circulation.

The following is an edited transcript of Wallstreetcn's conversation with Wei Tao, Vice President and Chief Technology Security Officer of ANT GROUP and Chairman of ANT Privacy Computing, and Wang Lei, CEO of ANT Privacy Computing:

Question: How do you view the relationship between cost, security, and performance?

Wang Lei: Security always comes with a cost, and there are two fundamental aspects to consider. First, from a business perspective, the question is whether the security benefits brought by privacy computing technology are sufficient to offset the cost. For example, in data breach incidents we have seen losses amounting to millions of dollars, so if the cost of security measures is lower than the potential losses, these measures are acceptable. Second, from a technical perspective, the related costs will gradually decrease as the technology continues to iterate. In addition, privacy computing products need to be tiered according to security requirements. Data of relatively low value does not need high-cost security measures. Technical measures should be tiered to match the classification of data security. A security strategy is most economical and reasonable when the cost of the security measures matches the value of the data.

Question: How should we understand the claim that service costs decrease after privacy computing is added? Also, adding privacy computing seems to add an extra step. How will this affect the efficiency of the entire data flow: will it improve or reduce it?

Wei Tao: Looking at the technical chain alone, costs will definitely be higher, but once human, technical, and compliance factors across the entire chain are taken into account, the total cost is actually lower. Although plaintext computing may seem simple at first, a data leak brings huge losses, including lost commercial interests and legal risks. The development of privacy computing will trigger a revolution. Currently, many data sources are reluctant to share data because they are concerned about leakage. Privacy computing technology can enable data that previously could not flow to circulate securely, thereby fully realizing the value of the data.

Question: With the emergence of large models, people are generally concerned about the speed and price of computing power. In the past two years, many independent privacy computing vendors have found it increasingly difficult to do business. Given your strategy and approach, in which specific business scenarios do customers usually adopt privacy computing?

Wei Tao: In the past two years, the privacy computing industry has made many attempts, mainly in the so-called "bucketing" stage, achieving point-to-point connections. PSI (Private Set Intersection) is one of the most widely used technologies, allowing two organizations to compute the intersection of their user groups while protecting the privacy of their users. Although this technology performs well in verifying individual links, its application scope is relatively limited, and it has not yet achieved full-chain protection of data circulation.
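To make the idea of private set intersection concrete, the following is a minimal toy sketch of one well-known construction, a Diffie-Hellman-style protocol based on commutative blinding. It is not the method used by ANT Privacy Computing or the YinYu platform, and it is not secure for real use: the modulus `P`, the function names, and the sample user lists are all assumptions chosen for illustration, and both parties are simulated inside one function.

```python
import hashlib
import math
import secrets

# Toy modulus (a Mersenne prime). An assumption for illustration only;
# real deployments use vetted groups such as standard elliptic curves.
P = 2**127 - 1


def hash_to_group(item: str) -> int:
    """Map an identifier to a nonzero residue modulo P."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret_key() -> int:
    """Pick a random exponent coprime to P - 1, so blinding is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2  # range 2 .. P-2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(items, key):
    """Hash each item into the group and raise it to the secret key."""
    return [pow(hash_to_group(item), key, P) for item in items]


def reblind(values, key):
    """Raise already-blinded values to a second secret key."""
    return [pow(value, key, P) for value in values]


def private_set_intersection(a_items, b_items):
    """Simulate both parties; returns the intersection as learned by party A."""
    a_key, b_key = new_secret_key(), new_secret_key()
    a_once = blind(a_items, a_key)         # A sends H(x)^a to B
    a_twice = reblind(a_once, b_key)       # B returns H(x)^(ab), same order
    b_once = blind(b_items, b_key)         # B sends H(y)^b to A
    b_twice = set(reblind(b_once, a_key))  # A computes H(y)^(ba)
    # Exponentiation commutes, so equal items yield equal double-blinded values.
    return {item for item, value in zip(a_items, a_twice) if value in b_twice}


if __name__ == "__main__":
    bank_users = ["alice", "bob", "carol"]
    retailer_users = ["bob", "carol", "dave"]
    print(sorted(private_set_intersection(bank_users, retailer_users)))
```

In this sketch each party only ever sees the other's identifiers in blinded form, which is the property the interview describes: the two organizations learn which users they share without exchanging their full user lists in plaintext.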

Throughout the entire research and development process, data source parties still have significant concerns about data leakage, and these have not been effectively resolved. Current technological applications still lack depth and breadth. If the technology of the "bucketing" stage were expanded to large-scale applications, the cost would be very high, the process as a whole would lack consistent guarantees, and the risks would not be effectively controlled.

Wang Lei: There are two main reasons why the commercialization of privacy computing is no longer as popular as it was. First, privacy computing technology is currently suited mainly to small-scale applications and is costly, which makes it difficult to scale and therefore difficult to reduce costs. Only by expanding the scale can costs be expected to decrease. Second, the traditional business model mainly involves selling software, and this high-cost delivery model is not conducive to the application and promotion of privacy computing technology. The ultimate goal of privacy computing is to promote the secure circulation of data.

After establishing the new company, we have also been thinking deeply about this issue. On one hand, we plan to adopt a cloud-based model, with an upcoming series of products that includes YinYu Cloud. We believe that only through cloud services can data truly circulate at large scale and be applied to more complex scenarios, thereby achieving scalability and cost reduction. At the same time, we will also launch related products on the client side to achieve end-cloud collaboration.

On the other hand, we hope to establish a business model that is accountable for results. This means that throughout the data circulation process, we can ensure data security, reduce costs and legal risks from a full-chain perspective. We aim to continuously generate revenue in this process, as data value is guaranteed, thereby profiting from the value of data.

We hope to introduce an insurance company, which can serve two purposes: first, as an independent third party, to assess the security of products in advance and provide data security insurance; second, to provide post-event protection in the event of unforeseen black swan incidents. This mechanism will promote the healthy operation of the entire industry. Only when the business model operates healthily can technological innovation and iteration continue to develop healthily.

Question: In recent years, the importance of privacy computing technology has been widely recognized in the market, but there is a divergence on whether it is an indispensable technology at the technical level. Some experts point out that although the cost of privacy computing is high, there may be alternative technologies with higher cost-effectiveness. Is there really an urgent need for privacy computing to gain widespread recognition from market institutions? What obstacles does it need to overcome?

Wei Tao: The development trajectory of privacy computing technology is quite similar to the photovoltaic industry. When photovoltaic technology was first introduced, it was costly and could not be immediately popularized in all industries. However, as high-demand industries took the lead in adoption and promoted large-scale production, costs gradually decreased. When the cost of photovoltaic power generation dropped to a critical point comparable to coal-fired power generation, it began to be widely used.

Privacy computing follows the same pattern. It will first be applied in high-value data and scenarios. Although the problems solved by privacy computing are not limited to high-value data, the current data leakage issue is very serious. Data from many institutions is being traded on the dark web, leading to serious consequences. However, this is just the tip of the iceberg, as a large amount of data trading in the domestic black market is no longer limited to the dark web, which is a very dangerous phenomenon. Data leaks are occurring on a large scale, causing significant harm to society, and traditional technological paths cannot effectively safeguard data security.

Once industries with high-value data and scenarios have established privacy computing systems and achieved the scale needed to reduce costs, privacy computing will be able to serve more industries. We believe that the critical point for each industry scenario is when the cost of privacy computing drops to around 5% of the value of data circulation, which will enable widespread adoption.
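The critical point described above can be written as a simple ratio check. The sketch below is illustrative only: the function name, the default threshold of exactly 0.05 (the interview says "around 5%"), and the sample figures are assumptions, not numbers given by ANT GROUP.

```python
def adoption_is_viable(privacy_cost: float,
                       circulation_value: float,
                       threshold: float = 0.05) -> bool:
    """Return True when privacy computing cost is at or below the given
    share of the value created by circulating the data."""
    if circulation_value <= 0:
        # No value from circulation means no cost level is justified.
        return False
    return privacy_cost / circulation_value <= threshold


if __name__ == "__main__":
    # Hypothetical scenario: data circulation worth 1,000,000 units.
    print(adoption_is_viable(40_000, 1_000_000))   # cost is 4% of value
    print(adoption_is_viable(120_000, 1_000_000))  # cost is 12% of value
```

Under these assumed figures, the first scenario falls below the threshold and the second does not, which mirrors the photovoltaic analogy: adoption spreads once scale pushes the cost ratio under the critical point.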

Wang Lei: Let me add that although we often mention secure computing, privacy computing is still the industry consensus. We mention privacy computing less often now, but not in order to hype a new concept. In most people's impression, privacy computing is mainly about combining secure multi-party computation with federated learning, with the added assurance that participants cannot steal data from each other. In fact, many application scenarios in large-scale data circulation do not look like this.

To give a very practical example, what risks does public data face when it is opened up? When data needs to be opened to the public on an external network, no data fusion may be involved, but the risks are still huge, which is why data holders dare not open it. For example, how can we ensure the security of data when transferring it from the government intranet to an external platform? Even if the operation and maintenance company is trustworthy, can its operation and maintenance personnel also be trusted? Could they easily steal data by removing hard drives or by other means? In addition, during data processing and use, even personnel with normal permissions may pose a risk of data leakage.

Therefore, a great deal of truly valuable data has been difficult to open up because its holders do not dare to release it. Privacy-preserving computation ensures that operators cannot steal the data, allowing it to be genuinely opened and circulated.

Privacy computing was previously applied to only a small part of the entire data circulation process. When data truly circulates on a large scale, we need to achieve multi-party data fusion. We believe that privacy-preserving computation is the next generation of privacy computing, and we hope it will solve the real problems encountered when data circulates at larger scale.