Dictation software has been around for absolutely ages, all the way back to early Dragon software, which used to take hours and hours to learn an individual’s voice, and even then wasn’t particularly good.
These tools have come on in leaps and bounds, with modern phones chattering to the internet in order to do the analysis and proper transcription, but even then they’re not real time at true conversational speed over more than a short sentence.
So when one claims to be absolutely amazing at transcribing, and able to do it for meetings with multiple voices, I have to say I was more than a little suspicious, and this is exactly what Otter AI claims.
Now, I didn’t want it for meetings. I wanted it for writing blog posts like this one, but I wanted to just rant while wearing a headset and for it to keep up with me, rather than having to stop and start or talk in a slightly stilted fashion. I find that even the Google or Amazon stuff only gets about 9 words in 10 right and often gets sentences scrambled, usually because it interprets something technologically or geekily specific that I’ve said as a generic word.
So I was introduced to Otter AI, and once I got past the forced way they are selling it, it turned out to be a very clever mixture of relatively new and old-style technology.
When you use it, they say that you’re supposed to dictate into it, and then it will do magic to make everything amazing.
Don’t get overexcited. What it actually does is use standard speech-to-text for what you’re seeing in real time, where it gets the normal 9 out of 10 words right. Then, once you’ve finished your meeting/rant, it sends your audio up to the internet and has another go at it with a lot more accuracy, as it can do it at its leisure.
This turns out to be a brilliant idea, although they should really sell it better rather than just confusing the hell out of you.
Their target audience is obviously people in long meetings or meeting note-takers, not people who just want a decent, accurate, natural voice dictation app, and the UI reflects that. The first few times I used it, I desperately kept trying to stop and edit the text I was working on, but that is not how you do it.
The ordinary real-time speech-to-text conversion just acts as a general guide to what you have been saying; it’s not the final product. Just say what you have got to say, save it, and let it clean everything up. Once you’ve got the hang of that as a process, it’s a brilliant tool.
It’s amazing for dictating long-form pieces or even quick blogs and notes. It has a good export feature which, while not perfect, will happily export my monologues to WordPress and dump them into a standard post which I can then edit.
However, again, their market positioning is very off for my kind of usage. It seems to be priced only for people who are doing hours upon hours of meetings, whereas people like me who just want it for dictation are never going to come off the free tier. They’ve got no reason to. So they should really introduce, say, a £5 a month tier with some slight advantages (say, an upgraded export to blog platforms). Without that, people like me will never have a genuine reason to purchase it.
All in all, though, if you’re wanting to write blog posts, or just take notes that you can then export as text, I couldn’t recommend it more.

