Online translation has a bad press – but could turbo-charge how much Welsh we see around us
Ben Screen, Translation Manager in NHS Wales
Online translation tools, or ‘Machine Translation’ (MT for short), often get very bad press. Tools such as Google Translate and Microsoft Translator are the cause of many a post on social media about laughable signs, painful mistranslations and hilarious gaffes.
But there’s much more to these tools than this and we need to understand the wider context of their use, as harnessing these tools could have a huge impact on any effort to promote the use of Welsh in society.
Getting the gist
Tools such as Google Translate and Microsoft Translator have a huge impact worldwide. Looking beyond Wales, TAUS (Translation Automation User Society), a think-tank based in the Netherlands, showed in a recent webinar that MT systems actually see more content than human translators do, accounting for hundreds of millions of words annually.
Users worldwide are clearly taking advantage of the technology to get the gist of text in a language they can’t read. It can now be used to understand restaurant menus, street signs, posters and websites, as well as all sorts of other digital information and communications such as emails and electronic documents.
In the EU, for example, eTranslation, an MT system for the languages of the EU that was launched in 2017 for use by translators and EU citizens, has so far received over 253 million requests and given a rough translation of 166 million pages. MT is also proving useful as a research tool, allowing researchers and lawyers, for example, to find relevant literature in foreign languages. It has also been used in crises such as the earthquake in Haiti in 2010, where a system for Haitian Creole was developed rapidly to assist humanitarian efforts there.
The use of ‘raw’ or uncorrected MT, then, is huge and must not be underestimated; the contribution that MT has made to bridging cultural gaps and fostering understanding is likely immeasurable due to its widespread use in this context.
There are also suggestions that MT is being used in this way in Wales. One example is the Microsoft Office suite, where Microsoft Translator can offer an automatic translation of emails so that people who don’t speak Welsh can get the gist of emails written in Welsh.
The other advantage to this is that where this does happen, Welsh speaking staff do not have to switch to English, and by doing so are using the language in a professional context and are not buckling under the pressure to use English.
The CERI chat-bot in the field of healthcare is another great case study. The chat-bot was created using IBM’s Watson artificial intelligence (AI) engine and provides information about COVID-19 for patients in the Cwm Taf Morgannwg Health Board area. MT was used in the backend to translate users’ requests written in Welsh, so that the chat-bot could understand Welsh input. Without MT developed by IBM in co-operation with the Health Board’s Welsh Language Services Team, it may not have been bilingual at all.
A further example is social media. Twitter uses Google Translate, and Facebook its own MT system, but both can offer translation from Welsh into English of comments, posts and Tweets allowing those who can’t read Welsh to understand social media posts in Welsh. Whilst the results are sometimes amusing, the fact that these systems can often give a decent translation of colloquial Welsh never ceases to amaze me given the sheer variety of ways spoken Welsh can be written.
My own friends and family, some of whom don’t speak Welsh, respond to posts I’ve made in Welsh by commenting in English, meaning I don’t always need to think about which language to post in. This is huge for the use of Welsh online. Again then we see how useful a tool it is and how it can have an extremely positive cultural impact.
A word used several times above was gist, defined as “the most important pieces of information about something, or general information without details” by the Cambridge Online Dictionary. We can extend this slightly to account for its wider use in the context of MT, that is getting the main message without the person needing a full, grammatically correct account of what was said.
What this also means is that for MT to be useful, it doesn’t need to be perfect. It only needs to give users the information they need for their specific situation. Where the information is critical, and where important decisions are to be taken based on that information, then a human translator must be part of the process. However, for day-to-day use, MT really is an incredibly powerful tool.
It’s not all Google
Of course, there are some really bad MT systems out there for Welsh. It’s important to bear in mind here that by now there are a large number of systems available for translation between English and Welsh, some online and some app based, developed by different developers. Taking a quick look on my ‘Play Store’ on my Android phone for example reveals that there are several free apps claiming to translate between English and Welsh. When we see the really bad and incomprehensible signs and posters online, it could well be that it’s these that are responsible.
But over the last 5 years there’s been a huge development in the field in the form of ‘Neural Machine Translation’, a form of AI for translation that has greatly improved the quality of the output, for Welsh included. Systems such as Google Translate, Microsoft Translator, Amazon Translate, the MT system within Facebook as well as some others, all use this newer and much better method when translating between English and Welsh.
Suffice to say then that the field of MT for Welsh is wider than it was a few years ago and that the quality offered varies depending on where the translation came from.
That MT is being used so widely for gist, as alluded to above, suggests that the output from some of these tools is quite often acceptable enough. This is where a second use of MT comes in: as a tool for professional translators.
Again, for MT to be useful, it doesn’t need to be perfect. In this context it need only be close enough that correcting the output from the system (a process called ‘post-editing’ in the field) takes less effort, and is therefore quicker, than translating from scratch. Where the translation process has been boosted in this way, the productivity of the translator using it also rises, meaning more words translated.
Once again, it is hard to overstate how powerful a tool MT is in this context. There is now a wide body of academic research showing that when translators use the right MT system with the right kind of documents, the translation process is cognitively easier, easier in terms of typing effort and quicker overall.
My own research has also shown this for Welsh, as has research by Canolfan Bedwyr of Bangor University in collaboration with the brilliant translation company Cymen in Caernarfon. In my own research, the productivity of the professional translators using Google Translate to translate technical texts rose on average by 168%.
As part of a project I am currently doing, it seems Google Translate in particular has continued to improve. For texts in the field of HR as part of the project, several large documents totalling tens of thousands of words between them were translated, reviewed and formatted with the help of MT and another tool called translation memory software in an average of three hours per document, with most of the output coming from the MT system. Without the help of MT, it would have taken more than double that time.
So, what does this mean for Cymraeg 2050?
The ambitious strategy Cymraeg 2050: A Million Welsh speakers also includes a very important theme, which is increasing the use of Welsh. As the strategy recognises, people also need to be using the language daily for the long-term future of Welsh to be safe.
Good translators have a huge role to play in this. Protecting and increasing the use of Welsh means translating shopping and banking apps, all sorts of websites from various fields and for various services, restaurant menus and marketing for all sorts of businesses, bills and letters, leaflets, online questionnaires and forms, and yes, reports and documents “that nobody reads” that they do, in fact, read.
None of the above can be achieved without good translators, committed to the job, trained and qualified to do the job and using the right tools to do the job. And this is the main point of this article: Using the right tools to do the job.
Let’s take the example above of a 168% increase in productivity, achieved using even a generic MT system as opposed to a field or ‘domain’ specific one for translation in specific contexts. This could mean each translator doing up to double what they’re currently able to do, at no additional cost, when translating certain types of documents. When considered alongside the full range of tools translators now have at their disposal, such as translation memory software to reuse previous translations, terminology management software, and workflow and accounting software specifically for translation, there’s great potential.
A further aspect of this is that MT systems are now completely integrated into translation and localization processes for many a multinational app and technology company, as well as for multinational companies in most fields that translate their content into other languages such as e-commerce, finance, education, health, science and technology, the leisure and fitness industry, you name it.
By having good MT systems for Welsh and translators able and prepared to use them to ‘post-edit’ as well as translate, the prospect of translating into a minority language like Welsh is much more attractive to large companies whose products and services we’d like to have in Welsh but currently don’t.
There are some caveats to this. First of all, data. These systems to be at their very best need huge amounts of data in the form of translations between two languages, between English and Welsh in our case. The more electronic translation data of this kind, the better the systems will be.
Even better if this data is from a specific domain, like health. This can in part be achieved by increasing our use of a technology called translation memory software (mentioned above), and being prepared to share the data it holds. Were there a national translation system for all translators in Wales to work in and save their translations (or at least such a system for large sections of the public sector like local government), the number of translations we’d have to leverage for MT would be enormous, not to mention all the other benefits that come with sharing previous translations between ourselves. This is a debate for another day though.
Secondly, using MT in professional contexts will increase productivity but this does not have to mean a decrease in quality, where quality here also includes readability and the ‘naturalness’ of the translations. There is some research now which is starting to show that using MT to translate may cause priming effects and the phenomenon of under-editing, where the translator does not correct the literal translations from the systems. However, this can be ‘trained out’ of translators and can be picked up at review stage (any translation ought to be reviewed as a text in its own right, and using MT to speed up the process shouldn’t change this). Also, it’s important to bear in mind that when using MT to translate technical documents and documents with a short lifespan, people don’t always expect them to be works of art anyway.
Thirdly, and as alluded to above, translators need to be trained to use these tools. Again, recent research shows that the best translators who use MT have a specific set of skills above and beyond translation as well as specialist knowledge about MT. This ought to be available to translators in Wales.
Fourthly, it shouldn’t be underestimated what implementing MT tools into workflows in organizations will entail. It wouldn’t be a huge upheaval but knowledge of change management and introducing organizational innovation wouldn’t go amiss, as well as knowledge around arguing the business and financial case for training and software costs. Once implemented however, the benefits are considerable.
Finally, I’m not just talking about free MT online here; implementing MT would mean paying a comparatively small subscription fee (where the texts inputted are also protected, which is important for cyber-security and data confidentiality), as well as organizations developing or augmenting their own systems with their own translation data.
A helping hand
The phrase I used at the start of this article was ‘changing the terms of the debate’. MT in Wales is not always understood by all, and despite promising projects in universities and the all-important support for it from government, many of the Welsh-speaking general public still see it as a nuisance tool.
I hope I’ve argued in this short article that there’s more to it than this and a huge body of academic research to support its wider implementation. By looking at MT as a tool for gist, and as a tool for professional translators when it comes to publishing and displaying content, we can start to see the enormous potential it has for increasing the use of Welsh in society.
There is still of course a lot these systems can’t do. Examples include dealing with most forms of non-literal language (or non-compositional meaning, to use the technical term) such as metaphor, idioms and collocations, dealing with certain shades of meaning, mastering sentence structure and how some concepts tend to be worded in a language, ensuring textual cohesion and using contextual knowledge to resolve ambiguity (especially with generic MT systems), to name a few.
This is where the added value of a professional translator is important and why humans are still essential to the process. But that still leaves the general, more literal or even mundane texts where MT systems can offer help.
Using MT more widely will turbo-charge how much Welsh we see around us and will at the same time shine a light on the skilled work that good professional translators actually do.
Let’s take the helping hand MT offers and get as much Welsh out there as possible.
Ben Screen is a Translation Manager in NHS Wales, having completed a PhD in the evaluation of translation technology in professional translation contexts at the School of Welsh in Cardiff University in 2018. He’s published several peer-reviewed papers in the field and his book for trainee translators is due to be published by the University of Wales Press towards the end of the year.
I very much agree that software translation is useful for ‘gist’ understanding. I have very limited Welsh, a few 100 words, yet I am doing a PhD on the Mabinogi (the first prose stories ever, which were created in mediaeval Wales). Some books and articles I need are in Welsh but Google tells me what the titles mean, or a selected quote. If I need more I plod through with GPC. /// I do wish the categories about Welsh were less rigid. I could not class myself as a ‘Yes’ Welsh speaker but I’m not ‘No’ either. Mostly what I…
Oh and I use the online translator for emails. Gmail even has a link at the top of the email to translate the message. On the other topic of being categorised our family use a small collection of words in everyday life. But we cannot class ourselves as ‘Welsh speakers’. We are I think, Welshish.
Iawn! We need formal recognition of learners and basic Welsh speakers. If our real numbers were known it would make the politicians take notice and shut up the dead language idiots.
We need to encourage people like me to use more bits of Welsh in our everyday conversations.
Combining MT with text-to-speech programmes like IVONA will be essential in the future. I have used MT to translate articles from English and Spanish, post-edited them and created instantaneous mp3 audio files using IVONA for my Welsh learners. Skype uses MT and text-to-speech technology to enable speakers of different languages to communicate orally in their respective languages (like the Star Trek Universal Translator). Cymraeg needs to be there sooner rather than later.
Worth reading.
An excellent and extremely interesting article. Thanks, Ben.
It is an interesting idea, and there is a lot of potential in machine translation, but of course this exists in a real world where there are financial pressures, and it could be very tempting to use it in a way that de-skills translators into proof-readers of the output of Google Translate or equivalent. One potential issue is that widespread use of machine translation could overwhelm in quantity the output of human-written text for a smaller language such as Cymraeg, so that the input for future language technologies becomes based more on previous translations than on the original language. It’s…
In this article’s photo, nation.cymru is teasing us monoglots with a question that’s not answered in the text.
With my very limited knowledge, I’d translate the English in the photo as “Ai cyfeithiad cywir yw hwn?”. Or, with the emphasis switched around, “Ai hwn sy’n gyfeithiad cywir?”. So tell me, bilingual commentators, who’s doing better, Google or me?
I find it handy; it’s good to be able to get a hand with reading comments in Welsh. I fear I will never be fluent, or good at writing in Welsh, but I love the language and even speaking it a little means a lot to me.