@build
Conversations with founders, engineers, and investors building great products and companies. Hosted by @arish
The future is unevenly distributed - AI and people with speech disabilities
Arun Munje likes to quote William Gibson, who said the future is already here; it's just not evenly distributed. Arun believes that many of the recent advances in technology have left certain segments of the population behind: people with certain diseases and disabilities. And in fact, these are the very same people these advanced technologies could probably help the most, making a big difference in their lives. But there are not too many startups, or even big companies, paying attention to this segment of the population.
Arun Munje
@Arun · 2:58
So when I took a pause to think about my next venture some time ago, I was overwhelmed with how fast technology had been growing. I saw lots of trends with self driving cars, artificial intelligence, augmented reality, virtual reality
Arun Munje
@Arun · 4:33
But it's part of the vocabulary that they use. There are many similar-sounding words that are the same phonetically but actually mean different things based on the context. Like I just hesitated mid-sentence there. So we have to smartly clean those parts out, so they are not part of what needs to be decoded. Broadly, there are three areas that come into play in order to address these.
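As a rough illustration of the cleanup step Arun describes, here is a hypothetical sketch (not the actual pipeline, and the filler list and function name are assumptions) of stripping hesitations and repeats from a raw transcript before it is handed to a decoder:

```python
import re

# Hypothetical filler words treated as hesitation noise, not content.
FILLERS = {"uh", "um", "er", "hmm"}

def clean_disfluencies(text: str) -> str:
    """Remove filler words and collapse immediate word repetitions."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    kept = [w for w in words if w not in FILLERS]
    out = []
    for w in kept:
        # "want want" style stutters often mark hesitation; keep one copy.
        if not out or out[-1] != w:
            out.append(w)
    return " ".join(out)

print(clean_disfluencies("I, uh, um, want want to open the the door"))
# → "i want to open the door"
```

A real system would also need the context-based disambiguation Arun mentions for words that sound alike; this sketch only shows the hesitation-cleaning part.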
Thank you, Arun, for explaining some of the complexities involved in AI and audio. With that understanding, I think it becomes clear that what you are trying to do, the challenge you have embarked upon, is so much harder. What might be helpful is if you could give us some examples, if you have any, of what kind of audio data you get and what it looks like after it has been processed by I Hear You.
Arun Munje
@Arun · 0:55
For sure, listening to a sample would give you a better appreciation of the problem we are trying to address. Actually, some time ago there was a race between companies like IBM and Microsoft. They were trying to reach human parity in speech recognition, meaning their best-case scenario was to reach the level of speech recognition that humans already have.
Vikas Gupta
@Vikas · 2:00
And I think in the interview with Kara Swisher, they talk about how, in the future, the biggest change over the next 100 years is that algorithms and AI are going to be the key decision makers and key drivers of how our lives get transformed. So what you're doing is very relevant to the news and conversations in that sense. And maybe I have a question for you, which is: I think we live in a world that, for all said and done, is a capitalist world.
Vikas Gupta
@Vikas · 0:13
To add the link to the conversation I mentioned, the one with Kara Swisher and Daniel Karan. Let's see if I'm able to do this.
So Arun, I have to confess I could not understand at all what Anand was trying to say in the clip that you posted. And to me, if I, as a kind of normal human, have such a hard time understanding it, I can only imagine the complexity involved in training an AI to be able to understand it.
Arun Munje
@Arun · 0:26
Yeah, regarding what was actually said in the example that I played before: basically he said, "It is really very good for learning the computer programming language." So maybe you can go back and listen to it, and now that you know what he said, still try to gauge how difficult decoding some of these might be.
Arun Munje
@Arun · 2:16
And what motivated us more is that when we started trying this out with some other speakers and started seeing good results there too, we realized that there could be a lot more people who could actually make use of this. Just in the US, there are about 7.5 million people who have some kind of speech difficulty. And this is not counting the many people with thick accents, whose way of speaking a regular speech engine may not catch with high accuracy.
Arun Munje
@Arun · 0:24
Oh, by the way, thanks, Vikas, for the link to the interview. It was really thought-provoking, and it was good to hear their views about AI's ability to cut through noise and see even better than humans would be able to, and the role it could play in the future. It was very interesting. Thanks for the link.
Thank you for sharing that example. I have one last question for you. And again, thank you for your time and for sharing all your ideas and what you're working on with all of us here. After this last question, I'm going to open it up so people who are listening can ask you questions as well. So my last question is: is the government, in Canada (you're based in Canada) or in the US, doing anything to fund these kinds of initiatives?
Arun Munje
@Arun · 0:57
Yes, we have been received quite well in the places we have gone to. We've got some support from Canadian agencies like CANARIE, who help us with infrastructure and other in-kind services. There are more grants and aid that we can leverage, and we are still looking to expand more, both to scale and also to invest more in the technology; we have just begun to scratch the surface. We are exploring other advances like gesture recognition, muscle movement, and even brainwaves to try to improve the quality.
Thank you, Arun. What you're building is truly amazing. Wish you all the best and all the success in your mission. Everyone who's listening, please join the conversation and let Arun know if you have any questions or comments, or if you just want to show your support. You can hit the reply button, that's the yellow plus button, and join in. And Arun, thank you again for being here as well.
Yogesh Tiwari
@YogeshTiwari · 3:13
Hello, Arish. Hello, Arun. How are you? This is Yogesh Tiwari. It's a wonderful thing that you guys are doing there. And listening to what you guys are trying to do, I just remembered a story about a person who was trying to set up a voice-activated alarm on his door. He was trying to say a word or a phrase that was supposed to open the door for him, and he recorded it on the machine.
Sumit Gupta
@sumit · 0:43
Arun, it was fascinating to hear about the product that you're building. I was also listening to the earlier thread around potential commercial uses for something like this. Just curious whether you've thought about something like this for speech therapy. Like you mentioned, there are many people who either have thick accents or, due to various conditions, have challenges with speech. And I wonder if one commercial exploration of technology like this could be around speech therapy; many children go through developmental stages where they go to speech therapists, etc.
phil spade
@Phil · 0:42
Arun, I just want to second everybody else's accolades here for you. I just think this is outstanding, and what a great endeavor. I really applaud this initiative. And, you know, I grew up with a couple of kids in the neighborhood who were not able to really speak. You couldn't understand what they were saying, and one of the kids was deaf. I think something like this would really, really have helped them, and potentially still can. So please keep it up.
Deborah Pardes
@DBPardes · 1:10
Arun, this is a great conversation. And after you played the audio and then told us what it meant, I did, in fact, understand it. And I want to know, in terms of the technology: is this similar to the way the mind helps us recognize a word even if the letters are not in order, because the mind has been taught to look at every third letter and identify the word?
Arun Munje
@Arun · 1:26
We have Arish, which can allow speech therapists to assign specific training exercises to the speakers. And on the other hand, we can create and export reports of exactly which sounds the AI is having more trouble with for that speaker, and they can actually monitor this over a period of time. These are currently being exposed as APIs, so hopefully they will be integrated into some system in the future.
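As a hypothetical sketch of the kind of per-sound report Arun mentions (the real API is not shown here; the function name and data shape are assumptions), one could tally how often the recognizer misses each target phoneme for a given speaker:

```python
from collections import Counter

def trouble_report(pairs):
    """pairs: list of (target_phoneme, recognized_correctly) observations
    for one speaker. Returns each phoneme's error rate, so a therapist
    could see which sounds need targeted exercises."""
    errors = Counter(p for p, ok in pairs if not ok)
    totals = Counter(p for p, _ in pairs)
    return {p: errors[p] / totals[p] for p in totals}

# Toy observations: "s" misrecognized 2 of 3 times, "r" 1 of 2 times.
samples = [("s", False), ("s", False), ("s", True), ("r", True), ("r", False)]
report = trouble_report(samples)
```

Tracking these rates over time, as Arun describes, would just mean storing one such report per session and comparing them.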
Arun Munje
@Arun · 2:01
So he's able to decode it much more clearly, while our brains, for people who are not familiar with Anand, are not yet trained for that. So what we are trying to do is use the fact that the AI has more time at hand to very quickly speed up the process of training itself. And that way it could act the same way it would for any other person it learns from.
Karan Dev
@Karan.Dev · 0:29
Hey, Arish. Arun, thank you so much for this fascinating interview. Arun, it's truly remarkable, the work that you are doing. And I just had one question. You mentioned that IBM and Microsoft are also developing speech recognition software, or have developed it, if I'm not mistaken. But I was wondering, what does the competitive landscape look like? Is there a lot of competition? And what are some of the other similar technologies that are aiding your research and development?
Sreeja V
@Wordsmith · 1:22
And I'm sure that as speech AI technology evolves, it can also help with aspects of caregiving, a concern that most families with children with disabilities grapple with in terms of long-term caregiving. I look forward to hearing from you from time to time on the progress you're making, and wish you all the very best.
Arun Munje
@Arun · 3:37
Also, answering your question as to what other big companies, like Google, are doing in this area: they have started a program where they are basically trying to collect more voice samples from people whose speech is not clear. And, just as AI normally works, they are trying to collect enough of those voices and see if they can decode them the way we are doing.
Arun Munje
@Arun · 0:54
Hey, Yogesh, it's interesting that you brought up Hindi and that kind of Sanskrit-based speech. Actually, if everyone were used to that language, which is more scientific and has very few exceptions in spellings and in sounds, the work of the AI would probably have been simpler. But anyway, there are concepts where people have tried to extract the phonetics out of regular English, or any language, for that matter.
Arun Munje
@Arun · 1:00
Hi, Ramiya. Yes, definitely, I Hear You working as a supportive tool for communication and for education is definitely part of the plan. There are so many different types of integrations that can be done. Like, we've already been working on providing closed captioning through a remote communication service like Zoom while speakers are talking to their remote teacher, or, if the teacher is live, on how to communicate directly with them.