“Voice as an interface is very powerful. It is very behaviour changing.” But for Kumar Rangarajan, Co-founder & CEO at Slang Labs, the use of voice and voice assistants on devices like Alexa or even our phones is limiting.
That’s exactly what Slang Labs wants to change with Slang Conva, a voice-assistant-as-a-service platform. The product enables other brands to create multilingual voice assistants and add them to their own apps, thus enhancing the user experience.
The dominance of English on the Indian internet has, over the years, made it tough for the majority, who are not comfortable in the language, to navigate or make full use of user interfaces. To make matters worse, regional language keyboards are not easy to use. All this makes voice assistants the best solution for those who are not digital natives.
“I was actually a non-believer in Alexa. I was like who wants to talk to a speaker. But I saw my parents, and my kids loved it. I realised that Alexa was not something that they were afraid of because it’s a voice they could talk to. It’s not alien or a new concept,” Rangarajan tells indianexpress.com, explaining how they thought of this idea.
But in his view, Alexa and other smart speakers are a limited use case for voice, given that most user transactions take place on apps. “What if we could marry this new experience of voice and connect it with these apps and make them more accessible and friendly to consumers. And to take it one step further, if users can talk to these in multiple languages,” Rangarajan explains.
Bengaluru-based Slang Labs was started back in 2017 by Rangarajan and his two co-founders, Giridhar Murthy and Satish Gupta, all once colleagues at IBM. Murthy and Rangarajan previously built Little Eye Labs, which was acquired by Facebook back in 2013.
Their quest for a plug-and-play voice assistant solution for apps resulted in a “deep tech product” which Slang Labs believes can help brands create “multilingual in-app assistance”. “Our system enables any brand to consume these pre-built voice assistants inside their apps in a matter of minutes,” Rangarajan explains.
For now, the company is focusing on four domains — e-commerce, travel, insurance and recruitment. Without revealing names, the company claims some brands will soon go live with voice assistants based on its platform.
“With Slang Conva, we have taken a domain by domain approach. For example in the E-commerce world, we have taken all the data of things that we have now learned and built out the lower level AI models that are required to create this experience. We built the most common models which are required. Essentially these assistants are shared, repeatable components,” he says, adding that it takes just days to go live.
Because Slang Labs builds the domain-specific models, its clients only need to customise the parts that are specific to them. For e-commerce companies, for example, the primary concern is their own items, or SKUs (stock keeping units). By using a service like Slang Conva, they can focus their voice assistant on those SKUs without having to worry about the nuances of building the system itself.
Right now, the platform supports five languages: Indian Accent English, Hindi, Tamil, Kannada and Malayalam. The company plans to add Gujarati, Bengali, Marathi and Telugu as the next set of languages.
Slang Conva allows a local language request to be translated into English for easier processing in the back end. “Even though the user is speaking in their own language, say Tamil, the assistant implicitly converts everything into English. So from the perspective of the app, they can get all the inputs that they want in English. This is actually a big challenge for a lot of brands, because you can’t maintain your SKUs in every single language,” Rangarajan points out.
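To illustrate the idea of normalising a spoken request into English before it reaches the app, here is a minimal Python sketch. It is not Slang Labs' implementation or API: the glossary, the catalogue, the SKU codes and the function names are all hypothetical, and a real assistant would rely on speech recognition and translation models rather than a lookup table. The point is only that the app's catalogue can stay in English while users speak Tamil or Hindi.

```python
from typing import Optional

# Hypothetical glossary mapping native-language item names to English terms.
# A real system would use speech and translation models, not a fixed table.
GLOSSARY = {
    "வெங்காயம்": "onion",   # Tamil
    "தக்காளி": "tomato",    # Tamil
    "प्याज": "onion",        # Hindi
    "टमाटर": "tomato",       # Hindi
}

# Hypothetical catalogue, maintained in English only.
CATALOGUE = {"onion": "SKU-001", "tomato": "SKU-002"}


def to_english(term: str) -> str:
    """Return the English term for a recognised word, or the cleaned input."""
    cleaned = term.strip().lower()
    return GLOSSARY.get(cleaned, cleaned)


def find_sku(term: str) -> Optional[str]:
    """Look up a SKU using the English form of the spoken term."""
    return CATALOGUE.get(to_english(term))


if __name__ == "__main__":
    for spoken in ["தக்காளி", "प्याज", "Onion", "mango"]:
        print(spoken, "->", find_sku(spoken))
```

In this sketch the app only ever sees English keys, so adding a new language means extending the glossary rather than duplicating the catalogue.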
The company says it built its own Natural Language Processing (NLP) stack after starting out with third-party solutions and soon running into their limitations. The idea was not just to build its own NLP, but to ensure that it was also programmable, so that it could scale across different types of apps and different domains.
“We wanted to ensure that voice experiences work very, very fast; that the real-time experience is good so that as soon as you finish speaking, you get a response.”