An American AI developer's first-hand experience: open-source large models lag behind closed-source ones, with lower efficiency and weaker optimization.

Why are open-source large models actually the most expensive? Following the recent open-source release of Llama 3, an interview with American AI entrepreneur Arsenii Shatokhin went viral online (https://weixin.qq.com/sph/AZM8h34Jm). Shatokhin, founder of the AI agent company VRSEN, said that enterprises running open-source large models themselves operate less efficiently than those using closed-source APIs. "We only have one or two clients with sufficient resources to fine-tune or run the 70-billion-parameter Llama open-source model."

Shatokhin has worked in the AI industry for six years and is a well-known AI entrepreneur in the U.S. His current startup, VRSEN, builds AI agents on top of large models for enterprise clients, with the aim of improving metrics such as sales conversion rates. So far, he has provided AI solutions for several well-known companies, including Cisco, StripePMA, and HUGO PFOHE.

After Llama 3 was open-sourced, Arsenii Shatokhin quickly identified its practicality issues. "Llama 3 is much larger than any previously released open-source model. Even now, we only have one or two clients with enough resources to fine-tune or even just run this 70-billion-parameter model."
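A rough back-of-envelope estimate helps show why so few clients can host a model of this size. The sketch below computes the memory needed to hold 70 billion parameters at several numeric precisions. The bytes-per-parameter values and the 80 GB accelerator size are illustrative assumptions, not figures from the interview, and the estimate covers weights only: inference caches, activations, and fine-tuning state (gradients, optimizer state) would add to it.

```python
import math

PARAMS = 70e9  # 70 billion parameters, the figure quoted in the interview

# Assumed storage cost per parameter for common numeric formats (not from the article).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus(memory_gb, gpu_memory_gb=80.0):
    """Smallest number of accelerators whose combined memory fits the weights.

    The 80 GB default is a hypothetical device size chosen for illustration.
    """
    return math.ceil(memory_gb / gpu_memory_gb)


for fmt, nbytes in BYTES_PER_PARAM.items():
    gb = weight_memory_gb(PARAMS, nbytes)
    print(f"{fmt}: about {gb:.0f} GB of weights, at least {min_gpus(gb)} x 80 GB device(s)")
```

Under these assumptions, 16-bit weights alone come to roughly 140 GB, which already exceeds a single 80 GB device before any serving or fine-tuning overhead is counted.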

For his clients, using this open-source large model is actually less efficient than using closed-source commercial models. He argued that the APIs of closed-source models are better optimized. "These APIs are specifically built for the models and are optimized as much as possible. You only pay for what you use, with no additional costs." In contrast, building a comparable optimization system for open-source models is "extremely complex."

The debate between open source and closed source has recently become a hot topic in the large model industry. In contrast to the broad support that open-source systems such as Linux and Android have enjoyed, a growing number of AI professionals are expressing support for closed-source models and pointing out various problems with open-source ones.

"Open-source large models will fall further behind," Robin Li, founder, chairman, and CEO of Baidu, stated in a recent speech. "People used to think open-source was cheaper, but in the context of large models, open-source is actually the most expensive."

Today's open-source large models differ significantly from traditional open-source systems. In developer communities, many users point out that these models are not truly open source: only the parameters are released, while the training code, data, and algorithms remain closed. The result is a "black box" that leads to several issues:

1) Difficult problem-solving: Open-source models only provide APIs and downloads. Developers can't see a single line of source code, making it hard to diagnose and fix issues when they arise.

2) High resource consumption for post-pretraining: Open-source large models are like unfinished houses, far from plug-and-play. Post-pretraining requires massive computing resources. As Shatokhin noted, most companies lack the computing power to fine-tune and run them. In contrast, closed-source commercial models are optimized for immediate use.

3) Security risks: Overseas open-source models haven't undergone security testing. Ensuring safety requires additional fine-tuning, which not only introduces risks but also increases costs.

Because current open-source large models are not "truly open" but merely "released," they can't benefit from collective improvements like traditional open-source systems. Over time, the gap between open-source and closed-source models will widen.

Recently, Fei-Fei Li, co-director of the Stanford Institute for Human-Centered AI (HAI), and her team released the AI Index report. Across 10 major model evaluations, open-source models lagged behind closed-source ones on every benchmark. On AgentBench, which measures model application and agent capabilities, closed-source models scored 4, while open-source models scored only 0.96, a gap of more than 300% relative to the open-source score.
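The size of that gap can be checked with simple arithmetic. The sketch below takes the two AgentBench scores quoted above and computes the gap relative to the open-source score, along with the ratio between the two. The function name is illustrative, and measuring the gap against the open-source score is an assumption about how the percentage was derived.

```python
def relative_gap_percent(closed_score, open_score):
    """Gap between the two scores, as a percentage of the open-source score."""
    return (closed_score - open_score) / open_score * 100


closed_score = 4.0   # closed-source AgentBench score quoted in the article
open_score = 0.96    # open-source AgentBench score quoted in the article

gap = relative_gap_percent(closed_score, open_score)
ratio = closed_score / open_score

print(f"Relative gap: {gap:.1f}%")
print(f"Closed-source score is {ratio:.2f}x the open-source score")
```

With the rounded scores above, the relative gap works out to roughly 317% and the closed-source score is a little over four times the open-source one, so the "300%" figure is best read as an approximation.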