AI breakthrough brings Welsh into the digital age

Welsh speakers could soon benefit from a powerful new artificial intelligence model designed to understand and use the language in everyday services such as healthcare, education and law.
The project, led by University College London in partnership with Bangor University and technology giant NVIDIA, has trained a new Welsh-language AI on the UK’s most powerful supercomputer, Isambard-AI, based in Bristol.
It is part of the UK-LLM initiative, which is creating “sovereign AI” models for UK languages. The new model is the first to show strong reasoning ability in Welsh.
The aim of the model is to make public services more accessible, while also supporting efforts to grow the number of speakers under the Welsh Government’s Cymraeg 2050 strategy.
Bilingual services
Researchers say the technology could help organisations in Wales – from hospitals and schools to shops and broadcasters – provide bilingual services more easily. It may also be used to support people learning Welsh by making resources more widely available.
Gruffudd Prys, of Bangor University’s Welsh language technology centre, brings around two decades of experience in Welsh language technology to the collaboration.
He and his team are helping to check the accuracy of machine-translated training data and hand-translated evaluation data, as well as assessing how the model handles Welsh nuances that AI usually struggles with, such as the way consonants at the beginning of Welsh words mutate depending on nearby words.
Prys said: “AI shows enormous potential to help with second-language acquisition of Welsh as well as for enabling native speakers to improve their language skills.
“The aim is to ensure that Welsh remains a living, breathing language that continues to develop with the times. AI shows enormous potential to help both native speakers and learners.”
Irish
The project builds on earlier models developed for UK languages, with plans to expand into Irish, Scottish Gaelic, Cornish and Scots. International collaborations are also expected, with the same techniques applied to under-represented languages in Africa and Asia.
The Welsh model and its training data will be made publicly available so that developers, public services and businesses can adapt it for their own needs.
“This collaboration with NVIDIA and Bangor University enabled us to create new training data and train a new model in record time, accelerating our goal to build the best-ever language model for Welsh,” said Pontus Stenetorp, professor of natural language processing and deputy director of the Centre for Artificial Intelligence at University College London.
“Our aim is to take the insights gained from the Welsh model and apply them to other minority languages, in the UK and across the globe.”


No LLM has any “reasoning ability”. It predicts the next word based on previous words. That’s all.