As India celebrates its 69th Republic Day, Microsoft on Thursday announced that it is using Artificial Intelligence (AI) and Deep Neural Networks to improve real-time language translation for Hindi, Bengali and Tamil. With Deep Neural Networks-powered language translation, the results are more accurate and sound more natural.
“Microsoft celebrates the diversity of languages in India and wants to make the vast Internet even more accessible. We have supported Indian languages in computing for over two decades, and, more recently, have made significant strides on voice-based access and machine translation across languages,” said Sundar Srinivasan, General Manager-AI and Research, Microsoft India.
“Today’s launch is a testament of our quest to bring cutting-edge Machine Learning (ML) technology to democratise access to information for everyone in India,” Srinivasan added. Users can access Deep Neural Networks-enhanced Indian language translation while browsing any website on the Microsoft Edge browser, on Bing search and the Bing Translator website, as well as in Microsoft Office 365 products like Word, Excel, PowerPoint, Outlook and Skype.
The Microsoft Translator app on Android and iOS can recognise and translate languages from text, speech and even photos. Since the early 2000s, Microsoft has been a pioneer of the traditional Statistical Machine Translation (SMT) paradigm for translating global as well as Indian languages. Deep Neural Networks have now been incorporated into the translation of complex Indian languages to bring more accuracy and fluency.
While SMT is limited to translating a word within the local context of a few surrounding words, Deep Neural Networks operate differently: they can encode more granular concepts such as gender (feminine, masculine, neutral), politeness level (slang, casual, written, formal) and type of word (verb, noun, adjective).
For accurate translations, the system demands millions of parallel sentences in each language pair, in all permutations and combinations. “However, Indian languages, constituting of Dravidian and Aryan subdivisions, are complicated. The complexities increase while translating languages for India, where 29 different states have 22 official languages,” Microsoft said in a statement.
Adding to the challenges was the dearth of digital content in Indian languages that could be pulled from the Internet to train the neural networks. “Six Indian languages are part of top 20 global languages by population. Ironically, these languages are not on top of the digital content list. There’s not enough material on Internet that we could use to train the system,” explained Krishna Doss Mohan, Senior Programme Manager, Microsoft India, who is part of the team that works on Indian languages.
Despite the obstacles, Deep Neural Networks-powered translation systems have shown significant improvement in both automatic and human evaluation metrics. “More specifically, we have witnessed at least 20 percent improvement in translation quality for all Indic languages currently supported by Microsoft,” the company said.
To break the language barrier, Microsoft started working with Indian languages two decades ago and launched “Project Bhasha” in 1998 to accelerate computing in Indian languages. “We have come a long way since then, supporting text input in all 22 constitutionally-recognised Indian languages across our products, and Windows interface support in 12 languages,” the company said. Bhashaindia.com, which provides computing tools for Indic languages, has received 40 million hits to date.