CMC OpenAI (C-OpenAI), a member of CMC Technology Group, has announced two foundational components for developing a Vietnamese legal virtual assistant: the large language model CMC-AI-Legal-32B and the evaluation benchmark VLegal-Bench.

According to the company, mastering both the model and the evaluation framework reflects its strategy to advance national technology goals under Resolution No. 57-NQ/TW on breakthroughs in science, technology, innovation, and digital transformation. The effort also aligns with Vietnam’s broader orientation toward developing core AI technologies such as Vietnamese-language LLMs, virtual assistants, and domain-specific AI systems.

Fine-tuning CMC-AI-Legal-32B: A model built for complex legal reasoning

At the center of the announcement is CMC-AI-Legal-32B, a Vietnamese legal large language model fine-tuned by C-OpenAI for Vietnam’s legal context. On the VLegal-Bench evaluation suite, the model achieved top overall performance, ranking first in six out of twenty-two tasks and excelling in multi-layered legal reasoning and argumentation.

The research team noted that while general-purpose commercial models such as GPT-4o, Claude, and Gemini perform strongly in information retrieval, their effectiveness declines sharply in structured legal reasoning. In contrast, models trained with Vietnamese legal data and citation standards demonstrate distinct advantages in accuracy and relevance.

C-OpenAI confirmed that it will publicly release the source code, datasets, and evaluation procedures at https://vlegalbench.cmcai.vn/, inviting domestic and international experts to collaborate in refining the toolkit.

“C-OpenAI pursues the vision of building an open and secure AI platform developed by Vietnam, focusing on Vietnamese LLMs and sector-specific AI applications. We aim to cultivate a community of enterprises and developers to co-create and distribute applications built on this platform,” said Dang Van Tu, CEO of C-OpenAI.

The VLegal-Bench evaluation benchmark developed by C-OpenAI, published as a preprint on arXiv (hosted by Cornell University, USA).

VLegal-Bench: Vietnam’s benchmark for legal AI evaluation

The development team explained that training and evaluating high-quality legal LLMs requires a benchmark tailored to Vietnam’s linguistic and legal system. International benchmarks, they noted, cannot replicate these unique conditions, making it essential for domestic researchers to build datasets and tasks from the ground up.

VLegal-Bench comprises 10,450 labeled data samples with reference answers, divided into twenty-two tasks across five levels of increasing reasoning complexity. These range from identifying legal articles and structuring provisions within hierarchical frameworks, to performing multi-step legal reasoning, generating contextually appropriate legal content, and evaluating ethics, fairness, and bias.

Designed to reflect Vietnam’s civil law system, the benchmark includes hierarchical legal structures, amendment relationships, scope of application, and inter-article references. Each sample links directly to verified legal documents, ensuring transparency and reproducibility in evaluation.
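To make the structure described above concrete, here is a minimal sketch of how one benchmark record might be represented. It is an illustration only: the class name, field names, task names, and document identifiers are hypothetical and are not taken from the published VLegal-Bench data format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSample:
    """One hypothetical benchmark record; all field names are illustrative."""
    sample_id: str
    task: str              # name of the task the sample belongs to (assumed)
    level: int             # reasoning level, 1 (simplest) to 5 (most complex)
    question: str
    reference_answer: str
    source_document: str   # identifier of the linked legal document (assumed)

    def __post_init__(self):
        # The article describes five levels, so reject anything outside 1..5.
        if not 1 <= self.level <= 5:
            raise ValueError(f"level must be between 1 and 5, got {self.level}")


def samples_per_level(samples):
    """Count how many samples fall into each reasoning level."""
    return Counter(sample.level for sample in samples)


# Placeholder records; the text and identifiers are not real benchmark data.
samples = [
    BenchmarkSample("s1", "article_identification", 1,
                    "placeholder question", "placeholder answer", "doc-001"),
    BenchmarkSample("s2", "multi_step_reasoning", 3,
                    "placeholder question", "placeholder answer", "doc-002"),
    BenchmarkSample("s3", "article_identification", 1,
                    "placeholder question", "placeholder answer", "doc-001"),
]
level_counts = samples_per_level(samples)
print(level_counts)
```

Keeping the source document identifier on every record is what would allow an evaluator to trace each reference answer back to the legal text it relies on.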

Vietnam’s leading legal AI evaluation benchmark - VLegal-Bench by C-OpenAI.

A foundation for fair, transparent legal AI assessment

“Developing VLegal-Bench was a demanding process,” said Nguyen Tien Dong, Chief Technical Officer of C-OpenAI. “We rebuilt nearly the entire pipeline - from text collection and standardization to reference labeling and task design - to ensure both technical rigor and legal accuracy.”

The research team behind the VLegal-Bench project.

He added that the greatest challenge lay in maintaining precision, reproducibility, and alignment with international LLM evaluation standards. “Through close collaboration between legal professionals and AI engineers, we created a high-quality dataset and tested it across twenty-two models, contributing a reliable benchmark for evaluating Vietnamese legal virtual assistants,” he said.

Building Vietnam’s AI ecosystem through collaboration

According to C-OpenAI, VLegal-Bench serves as a “competitive arena” for comparing open-source, closed-source, and domain-specific models. Its automated evaluation pipeline is designed to keep scoring consistent and to reduce human subjectivity. Results are published on public leaderboards to encourage transparency and research collaboration.
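As a rough illustration of how automated scoring can feed a leaderboard, the sketch below grades predictions against reference answers and ranks models by their mean per-task accuracy. The article does not specify the metrics VLegal-Bench uses, so the case-insensitive exact-match rule, the function names, and the model and task names here are all assumptions.

```python
from collections import defaultdict


def exact_match(prediction: str, reference: str) -> bool:
    """Assumed scoring rule: case-insensitive match after trimming whitespace."""
    return prediction.strip().lower() == reference.strip().lower()


def build_leaderboard(results):
    """Rank models by mean per-task accuracy.

    `results` is an iterable of (model, task, prediction, reference) tuples.
    Returns a list of (model, score) pairs, best score first; ties are
    broken alphabetically by model name.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, task, prediction, reference in results:
        total[(model, task)] += 1
        correct[(model, task)] += exact_match(prediction, reference)

    # Average the per-task accuracies so every task carries equal weight.
    per_model = defaultdict(list)
    for (model, task), count in total.items():
        per_model[model].append(correct[(model, task)] / count)

    board = [(model, sum(accs) / len(accs)) for model, accs in per_model.items()]
    return sorted(board, key=lambda row: (-row[1], row[0]))


# Placeholder predictions; model and task names are hypothetical.
results = [
    ("model_a", "task_1", "Yes", "yes"),
    ("model_a", "task_2", "B", "A"),
    ("model_b", "task_1", "no", "yes"),
    ("model_b", "task_2", "C", "A"),
]
leaderboard = build_leaderboard(results)
for model, score in leaderboard:
    print(f"{model}: {score:.2f}")
```

Because every model is scored by the same function on the same samples, the ranking can be reproduced by anyone holding the predictions and reference answers. Generative tasks would need a softer metric than exact match.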

From an academic standpoint, the project’s technical report has been released as a preprint on the arXiv repository. C-OpenAI also collaborates with international partners, including Professor Binh Vu’s research group at SRH Heidelberg University in Germany, to strengthen peer-review quality and align with global standards.

CMC’s technology foundation powering C-OpenAI’s R&D and AI transformation initiatives.

At the group level, Nguyen Trung Chinh, Chairman of CMC Technology Group, reaffirmed the company’s long-term strategy of investing in core AI technologies and developing specialized models.

“The leadership of CMC Technology Group takes great pride in the C-OpenAI team’s achievements,” Chinh said. “With over a decade of sustained investment in research and development, CMC remains committed to our mission of creating twenty-five core technologies made by CMC. This foundation empowers us to deliver world-class AI transformation solutions fully developed by Vietnamese engineers on our journey to become a global AI-X company.”

Looking ahead, C-OpenAI plans to expand its open-source repositories, datasets, and leaderboard systems for VLegal-Bench, rolling out phased releases through 2026. These initiatives aim to establish a shared, verifiable platform that supports the development of trustworthy Vietnamese AI applications across domains.

C-OpenAI also reaffirmed its commitment to nurturing Vietnam’s AI community through open collaboration and knowledge-sharing initiatives.

PV