
Wu Dao 3.0: China's Version of GPT-5

Cem Dilmegani
updated on Nov 12, 2025

When chip sanctions hit China’s AI industry, Beijing Academy of Artificial Intelligence (BAAI) did something unexpected. Instead of building bigger models, they built smaller ones.

Wu Dao 3.0, announced in July 2023, abandons the “bigger is better” race. No more trillion-parameter models. BAAI now focuses on compact, efficient models that startups can actually use, even with limited hardware.

Why BAAI Changed Direction

Wu Dao 2.0 made headlines in 2021 with 1.75 trillion parameters, claiming to rival GPT-3. Two years later, BAAI quietly shelved that approach.

The reasons:

  • US chip sanctions limited access to advanced GPUs
  • Training costs for mega-models became prohibitive
  • Chinese government policy shifted toward practical applications over prestige projects
  • Market reality showed most companies need specialized tools, not general-purpose giants

The new strategy: build a collection of smaller models (called Aquila) that work together. Think microservices instead of monoliths.

What Wu Dao 3.0 Actually Is

Wu Dao 3.0 isn’t a single model. It’s an ecosystem of specialized AI tools released under the Aquila brand:

AquilaChat: Dialogue Models

Two sizes available:

  • 7 billion parameters: Competes with LLaMA 7B and similar open-source models
  • 33 billion parameters: Targets more complex conversations

Both trained on Chinese (40%) and English (60%) text. The smaller version runs on consumer hardware—you don’t need a data center.

BAAI claims AquilaChat 7B outperforms comparable international models, though independent benchmarks remain limited.
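For a sense of what "runs on consumer hardware" looks like in practice, here is a minimal loading sketch. It assumes the checkpoint is published on the Hugging Face Hub under an id like `BAAI/AquilaChat-7B`, and the single-turn prompt template is a generic illustration, not BAAI's documented format:

```python
# Sketch: loading a 7B chat checkpoint with Hugging Face transformers.
# The Hub id and prompt template are illustrative assumptions.
def build_prompt(user_message: str) -> str:
    """Wrap a user message in a minimal single-turn chat template (illustrative)."""
    return f"Human: {user_message}\nAssistant:"

def run_demo(model_id: str = "BAAI/AquilaChat-7B") -> str:
    # Imported lazily so the sketch can be read without the heavy dependency installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    inputs = tokenizer(
        build_prompt("Introduce Wu Dao 3.0 in three sentences."),
        return_tensors="pt",
    )
    output = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(run_demo())
```

On a machine with enough RAM (roughly 16 GB or more for a 7B model in half precision), this runs without a data center, which is the point of BAAI's smaller-model strategy.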

AquilaCode: Text-to-Code Generation

Still in development. Early versions can generate:

  • Basic algorithms (Fibonacci sequences, sorting)
  • Simple games
  • Utility scripts

Not yet at the level of GitHub Copilot or GPT-4’s coding abilities, but improving. BAAI targets developers who need code generation in Chinese technical contexts.
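To make "basic algorithms and utility scripts" concrete, these hand-written snippets (not AquilaCode output) show the level of task early text-to-code models are typically asked to produce:

```python
# Hand-written examples of the routine tasks early code models are benchmarked on:
# a Fibonacci generator and a simple sorting routine.
def fibonacci(n: int) -> list[int]:
    """Return the first n Fibonacci numbers."""
    seq = []
    a, b = 0, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

def insertion_sort(items: list[int]) -> list[int]:
    """Sort a copy of items with insertion sort."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result

print(fibonacci(8))               # → [0, 1, 1, 2, 3, 5, 8, 13]
print(insertion_sort([3, 1, 2]))  # → [1, 2, 3]
```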

Wu Dao Vision Series

A collection of computer vision models, not a single system:

EVA (1 billion parameters): Focuses on visual representation learning. Trained on public datasets, it set new state-of-the-art results in:

  • Image recognition
  • Video action detection
  • Object detection
  • Segmentation tasks

Open source, unlike competitors that keep vision models proprietary.

EVA-CLIP: BAAI claims this is the best open-source CLIP alternative available. Handles image-text matching for search and retrieval.
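CLIP-style matching works by embedding images and captions into a shared vector space and ranking candidates by cosine similarity. The toy sketch below uses tiny synthetic vectors in place of real encoder outputs to show the retrieval step:

```python
# Sketch of CLIP-style image-text matching: both encoders map inputs into a
# shared embedding space, and retrieval ranks candidates by cosine similarity.
# The embeddings here are tiny synthetic vectors, not real EVA-CLIP outputs.
import numpy as np

def cosine_rank(image_vec: np.ndarray, text_vecs: np.ndarray) -> list[int]:
    """Return caption indices sorted from best to worst match for one image."""
    img = image_vec / np.linalg.norm(image_vec)
    txt = text_vecs / np.linalg.norm(text_vecs, axis=1, keepdims=True)
    scores = txt @ img  # cosine similarity per caption
    return list(np.argsort(-scores))

# Toy example: caption 1 points in nearly the same direction as the image.
image = np.array([0.9, 0.1, 0.0])
captions = np.array([
    [0.0, 1.0, 0.0],   # "a cat indoors"
    [1.0, 0.2, 0.0],   # "a dog on grass" — closest direction
    [0.0, 0.0, 1.0],   # "a city skyline"
])
print(cosine_rank(image, captions))  # → [1, 0, 2]
```

A real system replaces the synthetic vectors with encoder outputs but keeps exactly this ranking step for search and retrieval.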

Painter: Implements “in-context” visual learning—show it examples, and it learns new visual tasks without retraining. Similar to how GPT-3 does in-context learning for text.

vid2vid-zero: Zero-shot video editing tool. Edit videos based on text descriptions without training on specific video editing datasets.

Emu (multimodal models): Handles both images and text in a single model. Use cases include image captioning, visual question answering, and content generation.

FlagOpen: The Infrastructure Layer

BAAI has also enhanced the FlagOpen platform they launched in early 2023. This system offers parallel training techniques, faster inference, evaluation tools, and data processing utilities, essentially providing everything needed to develop large AI models. 1

When Wu Dao 2.0 first debuted at the Beijing Zhiyuan Conference, its creators displayed Chinese poems and drawings generated by the model.2 Following that event, BAAI built a virtual student, Hua Zhibing, powered by Wu Dao. Drawing on the model's knowledge base and learning capabilities, she can write poems, draw, and compose music.

Although these capabilities are not highlighted for Wu Dao 3.0, they are worth keeping in mind if you are deciding between Wu Dao 2.0 and Wu Dao 3.0 for your enterprise.

Poems generated by Wu Dao 2.03

Zero-shot learning benchmarks

  1. ImageNet: Achieves state-of-the-art zero-shot performance, surpassing OpenAI’s CLIP.
  2. UC Merced Land-Use: Records the highest zero-shot accuracy in aerial land-use classification, outperforming CLIP.

Few-shot learning benchmark

  1. SuperGLUE (FewGLUE): Outperforms GPT-3, achieving the best few-shot learning results.

Knowledge and language understanding benchmarks

  1. LAMA Knowledge Detection: Demonstrates superior factual knowledge retrieval, surpassing AutoPrompt.
  2. LAMBADA Cloze Test: Exceeds Microsoft Turing-NLG in reading comprehension and context understanding.

Text-to-Image and Image-to-Text retrieval benchmarks

  1. MS COCO (Text-to-Image generation): Outperforms OpenAI’s DALL·E in generating images from text descriptions.
  2. MS COCO (English Image-Text retrieval): Surpasses OpenAI’s CLIP and Google ALIGN in retrieving images from captions (and vice versa).
  3. MS COCO (Multilingual Image-Text retrieval): Outperforms UC2 and M3P in multilingual image-text retrieval.
  4. Multi30K (Multilingual Image-Text retrieval): Also surpasses UC2 and M3P, confirming its strong multilingual multimodal capabilities.

Wu Dao 3.0 vs. OpenAI GPT

Here’s a comparison of Wu Dao 3.0’s LLMs and various OpenAI models, based on BAAI’s own reporting.4 More detailed and up-to-date comparisons are not possible because Wu Dao lacks recent, consistent public benchmarks.

Long Context Performance

Testing across four tasks:

  • VCSUM (Chinese summarization)
  • LSHT (Chinese long-sequence handling)
  • HotpotQA (English multi-hop reasoning)
  • 2WikiMQA (English multi-document QA)

Long-context performance of LLMs5

Reasoning performance benchmark

Testing across six tasks:

  • bAbI #16 and CLUTRR (inductive reasoning)
  • bAbI #15 and EntailmentBank (deductive reasoning)
  • αNLI (abductive reasoning)
  • E-Care (causal reasoning)

Reasoning task performance of LLMs6

If you want to use Wu Dao, you can download it for free and set it up on your own machine.7


