Researchers Are Breaking Ancient Language Barriers With AI

Deciphering ancient languages and texts has challenged archaeologists for generations. Now, researchers are using artificial intelligence to quickly translate ancient texts into English, including cuneiform tablets and Egyptian hieroglyphs.

In a new Oxford Academic report, a group of AI developers details how they were able to use natural language processing (NLP) to translate cuneiform tablets from Akkadian into English.

"Hundreds of thousands of clay tablets inscribed in the cuneiform script document the political, social, economic, and scientific history of ancient Mesopotamia," the report said. "Yet, most of these documents remain untranslated and inaccessible due to their sheer number and limited quantity of experts able to read them."

But using AI to read ancient text is not as simple as running a picture through a ChatGPT plugin.

"The main challenge for us is the lack of a large amount of data," Gai Gutherz, who co-authored the report, told Decrypt in an interview. "There is only a small amount of data you could use to train the models; we managed to get to tens of thousands [of] examples."

Gutherz, a software engineer at Google, said that while translating Akkadian proved more challenging than translating a modern language such as Spanish, there was still a relatively large amount of data to pull from for an ancient language, because Akkadian was widely spoken and written in its time.

"Akkadian is a very important language. It was the common language in the old Middle East and Mesopotamia," Gutherz said. "People in Mesopotamia spoke different languages and used Akkadian to communicate. It was used as English is today."

Cuneiform, meanwhile, originated around 3400 BCE and is one of the earliest writing systems. It was used to write several ancient languages, including Sumerian, Akkadian, Hittite, Aramaic, and Old Persian.

One of the oldest surviving literary works, the Epic of Gilgamesh, discovered in 1853, was written over 4,000 years ago in the Akkadian language using cuneiform script.

In May, an Italian research team published a paper detailing how AI can be used to detect ancient sites for archaeological discovery in the Mesopotamian floodplains. Earlier this month, researchers used AI to discover Nasca geoglyphs, etchings of humans and animals, in Peru.

Another project looking to help modern researchers understand ancient languages and text is Google Fabricius. Fabricius lets users decode ancient Egyptian hieroglyphs into English using online tools developed by the tech giant.

"The easiest way to understand hieroglyphs is to imagine that they are the ancient Egyptian equivalent of emojis," Fabricius said.

The Oxford report said that one problem in translating these ancient texts is finding a complete tablet. Clay tablets are rarely preserved in their entirety, and as a result, both neural machine translation and human translation suffer from a lack of context.

"It's a relatively old language that has been extinct for over 2,000 years and has many examples," Gutherz said. "So it's easier to work with compared to languages that you have just dozens of translated texts."

The researchers working with Gutherz developed a website, The Babylonian Engine, to showcase their technology. While the name may sound like the evil robot overlord of a cyberpunk novel, the project aims to translate various ancient languages, starting with Akkadian.

As Gutherz explained, the model behind the Babylonian Engine, known as Akkademia, is open source and available to view on the project's GitHub.

"We have a lot to learn from ancient history; the first letters and books were written in Akkadian and other ancient languages," Gutherz said. "I think it's super interesting to try to make it more accessible and translate [Akkadian] into English and other languages that people speak today."