Chartered Institute of Linguists

CIOL AI Updates


Introduction


Since we published CIOL AI Voices, CIOL Council and delegates at our 2024 Conference days have underlined the importance of helping linguists keep abreast of developments in AI through a regularly refreshed resource hub. Here are some useful recent resources for linguists, which we will add to and update regularly, alongside future CIOL Voices pieces.


Latest AI News & Articles for Linguists


110 new languages are coming to Google Translate

Google is using AI to add 110 new languages to Google Translate, including Cantonese, NKo and Tamazight, in its largest expansion of the service to date. The additions form part of the 1,000 Languages Initiative, announced in 2022 as a commitment to build AI models that will support the 1,000 most spoken languages around the world, and are powered by Google's PaLM 2 large language model.


Slator

AI Translation ‘Extremely Unsuitable’ for Manga, Japan Association of Translators Says

Slator reports that the Japan Association of Translators (JAT) issued a bilingual statement in June 2024 to express “strong reservations” over the use of high-volume AI translation of manga, noting that AI has not demonstrated a consistently high-quality approach to nuance, cultural background, and character traits — all of which play a major part in manga.

“Based on our experience and subject-matter expertise, it is the opinion of this organization that AI translation is extremely unsuitable for translating high-context, story-centric writing, such as novels, scripts, and manga,” JAT wrote. “Our organization is deeply concerned that the public and private sector initiative to use AI for high-volume translation and export of manga will damage Japan’s soft power.”


SAFE AI Task Force Guidance and Ethical Principles on AI and Interpreting Services


This guidance from the SAFE AI Task Force in the USA establishes four fundamental principles as a durable, resilient and sustainable framework for the language industry. The four principles are drawn from the ethical, professional practices of both high- and low-resource languages, and are intended to drive legal protections and to promote innovation in fairness and equity in design and delivery, so that all can benefit from the potential of AI interpreting products.

A broad cross-section of stakeholders participated in designing the Interpreting SAFE AI framework, which is intended as guidance for policymakers, tech companies/vendors, language service agencies/providers, interpreters, interpreting educators, and end-users.


Brave new booth - UN Today


This article, Interpreting at the dawn of the age of AI, in UN Today (the official staff magazine of the UN) discusses the limitations of AI in language services, particularly interpreting. It argues that AI, in its current form, cannot replace human interpreters due to the inherent complexities of human communication, which involve not just speech but also non-verbal cues and cultural context.

Experts highlight that AI systems 'mimic' rather than truly interpret language, and they lack the emotional intelligence and ethical decision-making required for sensitive interpreting situations. The article also raises concerns about AI's potential biases and inaccuracies, urging a cautious approach to adopting AI interpreting solutions and emphasising the irreplaceable value of human interpreters' skills and empathy.


Generative AI and Audiovisual translation


Audiovisual Translators Europe

Noting the increasing use of generative AI in many industries and for a variety of highly creative tasks, AVTE (Audiovisual Translators Europe) has issued its own Statement on AI regulation.

AVTE notes that audiovisual translation is a form of creative writing and that it is not against new technologies per se, acknowledging that they can benefit translators as long as they are used to improve human output and to make work more ergonomic and efficient. What it is against is the theft of human work, the spread of misinformation, and the unethical misuse of generative AI by translation companies and content producers.


No-one left behind, no language left behind, no book left behind


CEATL, the European Council of Literary Translators' Associations, has published its stance on generative AI. CEATL notes that since the beginning of 2023, the spectacular evolution of artificial intelligence, and in particular the explosion in the use of generative AI in all areas of creation, has raised fundamental questions and sparked intense debate.

While professional organisations are coordinating to exert as much influence as possible on negotiations regarding the legal framework for these technologies (see in particular the statement co-signed by thirteen federations of authors’ and performers’ organisations), CEATL has drafted its own statement detailing its stance on the use of generative AIs in the field of literary translation.


More AI News & Articles for Linguists


AI Chatbots Will Never Stop Hallucinating

This article discusses the phenomenon of “hallucination” in AI-generated content, where large language models (LLMs) produce outputs that don’t align with reality. It highlights that LLMs are designed first and foremost to generate plausible responses rather than to guarantee factual accuracy, which makes some errors inevitable. The article suggests that to minimize hallucinations, AI tools need to be paired with fact-checking systems and human supervision. It also explores the limitations and risks of LLMs arising from marketing hype, the constraints of data storage and processing, and the inevitable trade-offs between speed, responsiveness, calibration and accuracy.


A red flag for AI translation

This article discusses how the quality of data used to train large language models (LLMs) affects their performance in different languages, especially those with less original high-quality content on the web. It reports on a recent paper by researchers from Amazon and UC Santa Barbara, who found that a lot of the data for less well-resourced languages was machine-translated by older AIs, resulting in lower-quality output and more errors. The article also explores the implications of this finding for the future of generative AI and the challenges of ensuring data quality and diversity.


multilingual.com

Generative AI - The future of translation expertise

This article explores the transformative potential of generative AI in the translation industry. It illustrates how translators may be able to enhance their work quality and efficiency using generative AI tools (notably the OpenAI Translator plugin for Trados Studio), and highlights the importance of 'prompt' design in achieving desired outputs. The article emphasises, however, that generative AI will augment rather than replace human translators by automating routine tasks, and it encourages translators to adapt to and adopt AI as these tools herald new opportunities in the field.
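
To make the role of prompt design concrete, here is a minimal sketch of an LLM translation call using the OpenAI Python client. The model name, glossary and instruction wording are illustrative assumptions, not the plugin's actual implementation:

```python
# Minimal sketch of prompt design for LLM translation.
# Illustrative only: the model name, glossary and instructions are
# assumptions, not the OpenAI Translator plugin's implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate(text: str, source: str, target: str, glossary: dict[str, str]) -> str:
    # Register, terminology and output format all go into the prompt;
    # this is where most of the quality control happens.
    terms = "\n".join(f"- {s} -> {t}" for s, t in glossary.items())
    prompt = (
        f"Translate the following {source} text into {target}.\n"
        f"Use these approved term translations:\n{terms}\n"
        "Preserve formatting and tags. Output only the translation, "
        "with no explanations or notes.\n\n"
        f"{text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; substitute whichever is available
        messages=[{"role": "user", "content": prompt}],
        temperature=0,   # favour consistency over creative variation
    )
    return response.choices[0].message.content.strip()
```

Small changes to the instruction block (register, audience, "output only the translation") can matter as much to the result as the choice of model, which is the article's point about prompt design.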


Slator

Are Large Language Models Good at Translation Post-Editing?

The article discusses a study on the use of large language models (LLMs), specifically GPT-4 and GPT-3.5-turbo, for post-editing machine translations. The study assessed the quality of post-edited translations across various language pairs and found that GPT-4 effectively improves translation quality and can apply knowledge-based or culture-specific customizations. However, it also noted that GPT-4 can produce hallucinated edits, necessitating caution and verification. The study suggests that LLM-based post-editing could enhance machine-generated translations' reliability and interpretability, but also poses challenges in fidelity and accuracy.
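
As a rough illustration of what LLM-based post-editing involves, the sketch below asks a model to revise a raw machine translation against its source. The prompt wording and the review heuristic are our assumptions, not the study's actual setup:

```python
# Sketch of LLM-based post-editing: the model receives the source text and
# a raw machine translation and returns a corrected translation.
# Illustrative only; not the prompt used in the study discussed above.
from openai import OpenAI

client = OpenAI()

def post_edit(source: str, mt_output: str, src: str, tgt: str) -> str:
    prompt = (
        f"You are an expert {src}-to-{tgt} translation post-editor.\n"
        f"Source ({src}): {source}\n"
        f"Machine translation ({tgt}): {mt_output}\n"
        "Fix any errors in the machine translation, changing as little as "
        "possible. Output only the corrected translation."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # the study compared GPT-4 and GPT-3.5-turbo
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    edited = response.choices[0].message.content.strip()
    # The study found GPT-4 can produce hallucinated edits, so flag heavy
    # rewrites for human verification rather than trusting them blindly.
    if abs(len(edited) - len(mt_output)) > 0.5 * len(mt_output):
        print("Warning: heavy rewrite - verify against the source.")
    return edited
```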


The Conversation

Who will write the rules for AI? How nations are racing to regulate artificial intelligence

This article from The Conversation analyses the three main models of AI regulation that are emerging in the world: Europe’s comprehensive and human-centric approach, China’s tightly targeted and pragmatic approach, and America’s dramatic and innovation-driven approach. It also examines the potential benefits and drawbacks of each model, and the implications for global cooperation and competition on AI.


Slator

Amazon Flags Problem of Using Web-Scraped Machine-Translated Data in LLM Training

Amazon researchers have discovered that a significant portion of web content is machine translated, often poorly and with bias. In their study, they created a large corpus of sentences in 90 languages to analyze their characteristics. They found that multi-way translations (the same sentence appearing in parallel across three or more languages) are generally of lower quality than 2-way translations, suggesting a higher prevalence of machine translation. The researchers warn about the potential pitfalls of using low-quality, machine-translated web-scraped content for training Large Language Models (LLMs).
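
To make "multi-way" concrete, the sketch below counts how many languages each aligned sentence appears in at once. The data structures and the three-language threshold are our simplification of the paper's idea, not its actual code:

```python
# Simplified sketch of the multi-way parallelism idea: sentences that appear
# aligned across many languages at once are more likely to be machine
# translated. The data and threshold below are illustrative assumptions.

# Each aligned tuple maps language codes to translations of the same sentence.
aligned_tuples = [
    {"en": "Click here to subscribe now.", "de": "...", "fr": "...",
     "it": "...", "pt": "...", "nl": "..."},               # 6-way parallel
    {"en": "The committee met on Tuesday.", "de": "..."},  # 2-way parallel
]

def multiwayness(tup: dict[str, str]) -> int:
    """Number of languages a sentence appears in simultaneously."""
    return len(tup)

for tup in aligned_tuples:
    n = multiwayness(tup)
    # The study treats 3+ languages as multi-way; such content skewed
    # lower-quality and more formulaic, suggesting machine translation.
    label = "likely MT" if n >= 3 else "possibly human"
    print(f"{tup['en']!r}: {n}-way -> {label}")
```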


Generative AI: Friend Or Foe For The Translation Industry?

This article discusses the potential impact of generative AI (GenAI) on the translation industry, noting the high degree of automation and machine learning already embedded in translation workflows. It argues that GenAI will bring not a radical disruption but an acceleration of the existing trend towards embedding automation and re-imagining translation workflows. The author also suggests that content creators will need translation professionals more than ever, both to handle the increase in content volumes and to navigate the assessment, review and quality control of GenAI-generated content.


Intento - Generative AI for Translation in 2024



This new study from Intento explores the dynamic landscape of Generative AI (GenAI), and the comparative performance of new models with enhanced translation capabilities. Following updates from leading providers such as Anthropic, Google, and OpenAI, this study selected nine large language models (LLMs) and eight specialized Machine Translation models to assess their performance in English-to-Spanish and English-to-German translations.

The methodology employed involved using a portion of a Machine Translation dataset, focusing on general domain translations as well as domain-specific translations in Legal and Healthcare for English-to-German. To optimise the use of LLMs for translation, the researchers crafted prompts that minimized extraneous explanations, a common trait in conversationally designed LLMs. Despite these efforts, models still produced translations with unnecessary clarifications, which had to be addressed through post-processing.
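
That post-processing step is easy to picture: something along the lines of the sketch below strips the conversational wrapping that LLMs tend to add around a translation. The patterns are illustrative guesses, not Intento's actual rules:

```python
import re

# Sketch of post-processing that strips conversational padding from LLM
# translations. The patterns are illustrative, not Intento's actual rules.
PREFIX = re.compile(
    r"^(sure|certainly|of course)?[,!.]?\s*(here is|here's)\s+(the|your)?"
    r"\s*translation\s*:?\s*",
    re.IGNORECASE,
)
TRAILING_NOTE = re.compile(r"\n+(note|please note|explanation)\b.*\Z",
                           re.IGNORECASE | re.DOTALL)

def clean_translation(raw: str) -> str:
    text = raw.strip()
    text = PREFIX.sub("", text)         # drop "Sure! Here is the translation:"
    text = TRAILING_NOTE.sub("", text)  # drop trailing translator's notes
    return text.strip().strip('"')      # drop quotation marks around the output

print(clean_translation('Sure! Here is the translation: "Guten Morgen."'))
# -> Guten Morgen.
```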

The findings revealed that while LLMs are slower than specialized models, they offer competitive pricing and show promise in translation quality. The study used 'semantic similarity' scores to evaluate translations, with several models achieving top-tier performance. However, challenges such as hallucinations, terminology issues, and overly literal translations were identified across different LLMs. The research concluded that while specialized MT models lead in-domain translations, LLMs are making significant strides and could become more domain-specific in the future. For professional translators, these insights underscore the evolving capabilities and potential of GenAI in the translation industry.
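
The 'semantic similarity' scoring itself can be sketched in a few lines: embed the source and the candidate translation with a multilingual sentence encoder and compare them by cosine similarity. The model named below is a common open-source choice, used here as an assumption rather than the study's actual metric:

```python
# Sketch of semantic-similarity evaluation: embed source and translation
# with a multilingual encoder and compare by cosine similarity.
# (Illustrative; the study's exact scoring setup may differ.)
from sentence_transformers import SentenceTransformer, util

# A widely used multilingual model - an assumption, not the study's choice.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

source = "The patient should take the medication twice daily."
translation = "El paciente debe tomar el medicamento dos veces al día."

embeddings = model.encode([source, translation])
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"Semantic similarity: {score:.3f}")  # closer to 1.0 = closer in meaning
```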


Stanford University - 2024 AI Index Report


The 2024 Index is Stanford's most comprehensive to date and arrives at an important moment when AI’s influence on society has never been more pronounced. This year, they have broadened their scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development.

Key themes are:

  1. AI beats humans on some tasks, but not on all.
  2. Industry continues to dominate frontier AI research.
  3. Frontier models (i.e. cutting edge) get way more expensive.
  4. The US leads China, the EU, and the UK as the leading source of top AI models.
  5. Robust and standardized evaluations of Large Language Model (LLM) responsibility are seriously lacking.
  6. Generative AI investment has 'skyrocketed'.
  7. AI makes workers more productive and leads to higher quality work.
  8. Scientific progress is accelerating even further, thanks to AI.
  9. The number of AI regulations in the United States is sharply increasing.
  10. People across the globe are more cognizant of AI’s potential impact—and more nervous.

This edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and a new chapter dedicated to AI’s impact on science and medicine. The AI Index report tracks, collates, distills, and visualizes a wide range of the latest data related to artificial intelligence (AI).


Elon University - Forecasting the impact of artificial intelligence by 2040



This report from Elon University, a US liberal arts, sciences and postgraduate university, predicts significant future upheavals due to AI, which will require a reimagining of human identity and potentially a restructuring of society.

The report combines public and expert views. Both groups expressed concerns over AI's effects on privacy, inequality, employment, and civility, while also acknowledging potential benefits in efficiency and safety.

Analysis of several commissioned expert essays revealed five emerging themes: 1) Redefining humanity, 2) Restructuring societies, 3) Potential weakening of human agency, 4) Human misuse of AI and 5) Anticipated benefits across various sectors.

Mixed opinions on AI's overall future impact highlight very real concerns about privacy and employment, alongside a more positive outlook for healthcare advancements and leisure time.


Microsoft New Future of Work Report



This new report from Microsoft examines the impact of large language models (LLMs) on the future of work, especially in the domains of information work, critical thinking, human-AI collaboration, complex and creative tasks, team collaboration and communication, knowledge management, and social implications. It synthesizes recent research from Microsoft and other sources to provide insights and recommendations on how to leverage LLMs to create a new and better future of work with AI.

The report covers:

  1. The productivity and quality effects of LLMs.
  2. The challenges and opportunities of 'prompting' and interacting with LLMs.
  3. The potential of LLMs to augment and provoke critical thinking.
  4. Design principles and frameworks for effective human-AI collaboration.
  5. Domain-specific applications and implications of LLMs in software engineering, medicine, social science, and education.
  6. The ways LLMs can support team collaboration and communication.
  7. The impact of LLMs on knowledge management and organizational change.
  8. The ethical and societal issues raised by LLMs.

The report also provides examples of how LLMs are being used and developed at Microsoft and elsewhere.

It also flags (on p36) the important concept of an increased risk of “moral crumple zones”. It points out that studies of past 'automations' teach us that when new technologies are poorly integrated within work and organisational arrangements, workers can unfairly take the blame when a crisis or disaster unfolds. This can occur when automated systems only hand over to humans at the worst possible moments, when it is very difficult to spot or correct the problem before it is too late.

This could be compounded by 'monitoring and takeover challenges' (set out on p35), where jobs might increasingly require individuals to oversee what intelligent systems are doing and intervene when needed. However, studies reveal potential challenges: monitoring requires vigilance, yet people struggle to maintain attention on monitoring tasks for more than half an hour, even when they are highly motivated.

These are challenges linguists may well face, alongside the many new possibilities and opportunities that this report calls out.


UK Government's Generative AI Framework: A guide for using generative AI in government


This document provides a practical framework for civil servants who want to use generative AI. It covers the potential benefits, limitations and risks of generative AI, as well as the technical, ethical and legal considerations involved in building and deploying generative AI solutions in a government context.

The framework is divided into three main sections: Understanding generative AI, which explains what generative AI is, how it works and what it can and cannot do; Building generative AI solutions, which outlines the practical steps and best practices for developing and implementing generative AI projects; and Using generative AI safely and responsibly, which covers the key issues of security, data protection, privacy, ethics, regulation and governance that need to be addressed when using generative AI.

It sets out ten principles that should guide the safe, responsible and effective use of generative AI in government and public sector organisations.

These are:

  1. You should know what generative AI is and what its limitations are
  2. You should use generative AI lawfully, ethically and responsibly
  3. You should know how to keep generative AI tools secure
  4. You should have meaningful human control at the right stage
  5. You should understand how to manage the full generative AI lifecycle
  6. You should use the right tool for the job
  7. You should be open and collaborative
  8. You should work with commercial colleagues from the start
  9. You should have the skills and expertise needed to build and use generative AI
  10. You should use these principles alongside your organisation’s policies and have the right assurance in place

The framework also provides links to relevant resources, tools and support, as well as a set of posters with the key messages boldly set out.
A transparent framework is clearly to be welcomed. From the perspective of linguists working with UK government, these principles also give a useful framework for accountability and a means to ask reasonable questions about policies and practices that may affect their work.


Read the 'CIOL AI Voices' White Paper



Download the PDF here.