Ministry of Electronics and Information Technology, Government of India

AI - Shruti

Automatic Speech Recognition (ASR) is a subfield of Artificial Intelligence (AI) in which a computer recognizes spoken words and converts them into written text. The process is also commonly referred to as "speech-to-text" or "transcription." It can be applied to live speech or to audio recordings. In short, ASR is the technology that lets users dictate text into an application through voice input.

Text-to-Speech (TTS) suits any application that plays audio of human speech to users. It converts arbitrary strings, words, and sentences into the sound of a person speaking them. The technology uses neural network techniques to deliver a human-like, engaging, and personalized user experience.

This service may be needed to dictate eFile notations and transcribe them to text, or to translate written notes into another language and speak them; for example, a note may be dictated in English and the eFile notation written in Hindi, or vice versa. Other uses include filling out APARs, video conferencing, IVRS-based systems, user forms, and voicebots.

A user department or ministry using an NIC Cloud VM can request ASR and TTS by filling out a service request form the first time it applies for the service on the MeghRaj Cloud of NIC, briefly describing the use case. The request must include a User Request Letter. The user has to select the language of the ASR input voice. If the user requires translation with ASR, the user also has to provide the translation language.
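
The request requirements above can be sketched as a small checklist in code. This is only an illustration, not the actual NIC service request form or any NIC API: the class name `AsrTtsServiceRequest`, its field names, and the helper `missing_items` are hypothetical and chosen here purely to mirror the items listed in the paragraph.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AsrTtsServiceRequest:
    """Hypothetical model of the items a first-time ASR/TTS request needs."""
    use_case: str                       # brief description of the use case
    request_letter_attached: bool       # User Request Letter included?
    asr_input_language: str             # language of the ASR input voice
    needs_translation: bool = False     # is translation with ASR required?
    translation_language: Optional[str] = None


def missing_items(req: AsrTtsServiceRequest) -> List[str]:
    """Return the required items that are still missing from the request."""
    problems: List[str] = []
    if not req.use_case.strip():
        problems.append("use case description")
    if not req.request_letter_attached:
        problems.append("User Request Letter")
    if not req.asr_input_language.strip():
        problems.append("ASR input language")
    # A translation language is only required when translation is requested.
    if req.needs_translation and not (req.translation_language or "").strip():
        problems.append("translation language")
    return problems
```

For instance, a request that asks for translation but does not name a target language would be flagged with a single missing item, "translation language", while a request without translation needs only the use case, the letter, and the input language.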

Registered Cloud users may submit a Service Request (SR) to avail of the above service, whereas new users (i.e., users not yet registered for the cloud) are requested to first apply for Cloud Registration by following the On-boarding procedure.

