Hung shared the information while speaking to the media on the sidelines of Vietnam AI Day, held in HCM City on December 5.
At the event, VinAI announced for the first time PhoGPT, an open-source research project on large language models for the Vietnamese language. The project aims to develop ChatGPT-like models that work in Vietnamese and reflect Vietnamese culture.
PhoGPT understands and writes Vietnamese better than previous-generation language technologies. The model was also trained from scratch on a Vietnamese dataset, without relying on any other model in the world, which helps ensure that Vietnam masters advanced core technology.
PhoGPT is an open-source project that keeps pace with other open-source large language models around the world, such as Meta’s Llama and Mistral AI’s Mistral, which were developed to compete with OpenAI’s ChatGPT.
Experts who assessed PhoGPT-7B5-Instruct against the closed-source ChatGPT (GPT-3.5-turbo) and other open-source models concluded that PhoGPT ranks second only to ChatGPT in all evaluation categories. The development team is continuing to improve the model and may expand the project to other languages, especially those of Southeast Asia.
VinAI’s Hung affirmed that the company developed PhoGPT from the very beginning, independently of other models in the world. Because the model is open source, the entire Vietnamese community can use and improve it.
One goal of open-sourcing the project is to lay a foundation so that people do not have to waste time repeating the same work. Instead, they can spend that time further developing the PhoGPT large language model, because VinAI alone cannot handle every application. This will also help build a high-quality open-source community around Vietnamese-language technology, encouraging many other companies to join in and apply it.
According to Hung, large language models demand high capabilities and large computing platforms, which are very costly. One of VinAI’s aims is to optimize such large language models, creating both more accurate models and compact models that can run on smaller computing platforms, thereby helping reduce production costs. This is one of VinAI’s key focuses, and it addresses a problem faced not only by Vietnam but by the entire world, especially Southeast Asian countries.
Le My