Guo Tiannan of Westlake University: Upending traditional experiments with three data pillars and closed-loop learning for AI virtual cells

生物谷
2025.03.30 02:40

Professor Guo Tiannan from Westlake University discussed the development of Artificial Intelligence Virtual Cells (AIVC), emphasizing their potential in biomedical research. AIVC combines multimodal data with AI technology to build more complex models of cellular function, which may in the future supplement or even replace some traditional experiments through high-throughput simulations. Despite the promising outlook, key questions remain, such as how to select the appropriate "culture medium" and which cell types to prioritize for virtual cultivation, before AIVC can be fully applied in drug development and disease research.

Cells are the basic units of life, crucial for understanding health, aging, and disease, and serve as important tools in drug development and synthetic biology. However, cell-based experiments are resource-intensive and variable, leading to reproducibility issues in biomedical research.

While the first carbon-based cells emerged after billions of years of evolution, the development of the first silicon-based cells today presents transformative opportunities for the scientific community. The concept of virtual cells or digital cells was proposed around 2000, initially relying on traditional low-throughput biochemical experiments to quantify the spatiotemporal changes of substances involved in specific biological processes. These early models used differential equations and stochastic simulations to model specific cellular processes. Groundbreaking whole-cell virtual models, such as those for mycoplasma, Escherichia coli, and Saccharomyces cerevisiae, were primarily based on prior knowledge. However, they lacked well-designed matching perturbation omics data and spatiotemporal imaging data. Despite the pioneering significance of these early models, they had limitations in comprehensively capturing the dynamic characteristics and complexities of living cells, highlighting the need for more comprehensive data integration and advanced modeling approaches.
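To make the two early modeling styles concrete, here is a minimal Python sketch. It is not taken from any of the whole-cell models mentioned above. It assumes a single hypothetical mRNA species with a constant synthesis rate `k_syn` and a first-order degradation rate `k_deg`; these names and any values passed to them are purely illustrative. The first function is a stochastic simulation (Gillespie's algorithm), and the second integrates the matching differential equation dn/dt = k_syn - k_deg * n with a simple forward-Euler step.

```python
import random


def gillespie_birth_death(k_syn, k_deg, n0, t_end, seed=0):
    """Stochastic simulation of mRNA copy number (hypothetical toy process).

    Two reactions: synthesis at constant rate k_syn, and degradation at
    rate k_deg per molecule. Returns the event times and copy numbers.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn            # propensity of synthesis
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_syn + a_deg
        if a_total == 0:
            break                # nothing can happen any more
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def ode_birth_death(k_syn, k_deg, n0, t_end, dt=0.01):
    """Forward-Euler integration of dn/dt = k_syn - k_deg * n."""
    n = float(n0)
    for _ in range(int(round(t_end / dt))):
        n += dt * (k_syn - k_deg * n)
    return n
```

Calling `gillespie_birth_death(10.0, 0.1, 0, 50.0)` yields one noisy trajectory, while `ode_birth_death(10.0, 0.1, 0, 100.0)` approaches the deterministic steady state k_syn / k_deg. Real whole-cell models couple many thousands of such processes, which is where the need for richer data comes in.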

Recent advancements in high-throughput technology and artificial intelligence (AI) have paved the way for more complex virtual cell simulations.

In December 2024, Professor Stephen Quake of Stanford University and colleagues published a paper in Cell proposing the concept of Artificial Intelligence Virtual Cells (AIVC)【1】, which combines AI with multimodal data to create a comprehensive computational model of cellular functions. These AI virtual cells are expected to enable precise and scalable computer-simulated experiments, potentially supplementing or even replacing traditional experiments in certain cases through high-throughput simulations, thereby revolutionizing biomedical research.

Despite the promising prospects of AIVC, several key questions remain unresolved. Just as culture media nourish biological cells, what kind of "media" is ideal for nurturing these digital entities? Which cell types should we prioritize for virtual cultivation?

Addressing these questions is crucial for fully realizing the potential of AIVC in drug development, disease modeling, and fundamental biological research. As we enter this new era of cell modeling, the scientific community should collaborate to establish standards and best practices for developing and validating AIVC.

On March 25, 2025, researcher Guo Tiannan of Westlake University published an editorial in Cell Research titled "Grow AI virtual cells: three data pillars and closed-loop learning."

The article proposes that the evolution and development of AIVC rely on three key data pillars: prior knowledge, static architecture, and dynamic states. Combined with deep learning algorithms, these pillars form the foundation for the development of AIVC.

Figure: The three key pillars for developing AIVC are prior knowledge, static architecture, and dynamic states. AI algorithms integrate these data to simulate cellular behavior, for example in models of organisms such as Escherichia coli and yeast and of various cell lines. The figure also shows AIVC evolving through a closed-loop active learning system, in which computational predictions guide automated experiments, with a particular focus on perturbation omics.

Imagine cultivating a "virtual cell" in a computer that can simulate the growth, metabolism, and even carcinogenesis of real cells, helping scientists predict drug effects and elucidate disease mechanisms. This seemingly science fiction scenario is becoming a reality with the advancement of artificial intelligence (AI).

The dilemma of traditional cell experiments: the dual challenges of cost and uncertainty

Cells are the basic units of life, but traditional experiments face two major challenges:

High resource consumption: A single experiment takes weeks and requires expensive reagents and precision instruments;

Low reproducibility: Experiments are affected by environmental fluctuations and operational differences, leading to a "reproducibility crisis" in the global research community.

AI virtual cells: A step towards silicon-based life

From the first concept of "virtual cells/digital cells" in 2000 to today's artificial intelligence virtual cells (AIVC), the team led by Guo Tiannan has proposed three core pillars for constructing a "digital twin" of cells:

  1. Prior knowledge: The "intelligent melting pot" of vast literature

This pillar integrates a century of biomedical research, including 240 million papers and 3D molecular structure databases. This existing human knowledge serves as a "cell encyclopedia," giving the AI foundational principles of cell biology. Much as ChatGPT learned from a vast body of human text, AIVC is meant to absorb the accumulated knowledge about cells.

  2. Static architecture: A nano-scale "panoramic map" of cells

Combining cryo-electron microscopy, super-resolution microscopy, and spatial omics technologies to map the precise three-dimensional structures of organelles and protein networks, with a resolution of 5-10 nanometers.

  3. Dynamic state: Capturing every frame of life’s changes

Tracking molecular dynamics during processes such as cell development and carcinogenesis; using perturbation techniques (such as gene editing and drug stimulation) to generate large amounts of data to train AI to predict cellular behavior.

Technological breakthrough: When multi-omics meets deep learning

The team led by Guo Tiannan further proposed a "closed-loop learning" framework:

  1. Data Fusion: A Transformer model integrates text, image, and proteomics data;

  2. Dynamic Inference: A diffusion model simulates cellular state transitions and predicts the effects of drug interventions;

  3. Self-Evolution: Each virtual experiment's results feed back into model optimization, forming iterative upgrades.

Future Applications: From Precision Medicine to Synthetic Biology

  1. Drug Development: Virtual screening of anti-cancer drug combinations to shorten R&D cycles;

  2. Disease Decoding: Simulating the abnormal aggregation of proteins implicated in Alzheimer’s disease;

  3. Cell Factories: Designing artificial cells for efficient insulin production.

Conclusion and Outlook

When creating and nurturing AIVC in the digital petri dish of modern biomedical research, we must carefully consider the "nutrients" that nourish their growth. The three data pillars proposed in the article (prior knowledge, static architecture, and dynamic states) constitute the necessary "culture medium" for these computationally simulated entities. Among them, perturbation-based omics data, including transcriptomics, proteomics, and metabolomics, serve as the key "growth factor."

To efficiently generate such rich perturbation data, the author envisions that a closed-loop active learning system will become the next evolutionary step. Inspired by autonomous chemical laboratories, these systems will seamlessly integrate AI-driven predictions with robotic experiments. Like a skilled gardener, they will identify knowledge gaps, design targeted experiments, and continuously deepen our understanding of cellular complexity. The journey from static models to adaptive, self-optimizing Artificial Intelligence Virtual Cells is expected to revolutionize drug discovery, disease modeling, and fundamental biological research. The author also suggests a low-hanging fruit in this journey—creating and nurturing a virtual yeast cell may be a feasible option.
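The closed-loop idea described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the author's system. The "experiment" is a hypothetical dose-response curve (`hidden_response`) standing in for a robotic perturbation assay. The "model" is plain linear interpolation, and the distance to the nearest measured dose serves as a crude proxy for model uncertainty. All function names here are made up for illustration. In each round, the loop picks the untested dose where the model knows least, runs the experiment there, and folds the result back into the data.

```python
def hidden_response(dose):
    """Hypothetical stand-in for a wet-lab perturbation experiment."""
    return dose ** 2 / (1.0 + dose ** 2)


def predict(dose, observed):
    """Predict the response by linear interpolation between measured doses."""
    pts = sorted(observed.items())
    if dose <= pts[0][0]:
        return pts[0][1]
    if dose >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= dose <= x1:
            w = (dose - x0) / (x1 - x0)
            return y0 + w * (y1 - y0)


def uncertainty(dose, observed):
    """Crude uncertainty proxy: distance to the closest measured dose."""
    return min(abs(dose - x) for x in observed)


def closed_loop(candidates, experiment, n_rounds, seed_doses):
    """Alternate between choosing the least-known dose and measuring it."""
    observed = {d: experiment(d) for d in seed_doses}
    for _ in range(n_rounds):
        untested = [d for d in candidates if d not in observed]
        if not untested:
            break
        # "Identify the knowledge gap": the dose farthest from any data.
        next_dose = max(untested, key=lambda d: uncertainty(d, observed))
        # "Run the targeted experiment" and feed the result back.
        observed[next_dose] = experiment(next_dose)
    return observed


def max_error(candidates, observed, truth):
    """Worst-case prediction error over the candidate doses."""
    return max(abs(predict(d, observed) - truth(d)) for d in candidates)
```

For example, with `candidates = [i * 0.25 for i in range(17)]` and seed doses `[0.0, 4.0]`, a handful of rounds is enough for the worst-case prediction error to drop well below its starting value. A real system would replace the interpolation with a learned model, the distance heuristic with a principled uncertainty estimate, and the stand-in function with automated perturbation omics experiments.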

As we stand on this exciting frontier, the collaborative efforts of the scientific community are crucial to fully realize the potential of AIVC and to advance the future of computational simulation in the life sciences.