As a kid, I was enamoured with science fiction programs like Star Trek, where computers were smart, willing collaborators with humanity in the pursuit of knowledge (or of whatever alien race Captain Kirk or Commander Riker just happened to be interested in romantically!). Although a touch interface was always available to control the amazing technology, a fully immersive voice interface was a reality as well, and it perfectly complemented the more ‘hands on’ experience. This voice interface made possible the ‘universal translator’ that removed the barrier of language, and it normalised being able to tell the computer to “replicate” a “Tea, Earl Grey” out of seemingly thin air. In short, this technology was no less magical than seeing Harry Potter wandering the halls of Hogwarts. The Star Trek ‘vision’ of the future was, I suspect, a very real inspiration for the thousands of people who have worked on speech-to-text and voice control technology over the years. So where are we at right now in making this fiction into a reality?
Speech-to-text software like Dragon Dictate, and assistant programs like Siri (for Apple devices) and Google Assistant (for Android devices), have come a long way in the last 20 years. I remember a friend who received an early version of Dragon for Christmas and spent many hours reading to it to train it to understand her voice. This same process now takes about 5 minutes (and no time at all for mobile assistant programs). We are very close to having commercially available voice translation software that, over time, will rival Star Trek’s ‘Universal Translator’. The voice control functions on our mobile devices are getting smarter and smarter, allowing us to write emails and texts, play music, set our alarms, or ring or video chat with people without ever touching our phone or tablet!
The technology does, however, have quite a few limitations at present. A pet peeve of mine is that it often isn’t as interactive as I would wish, meaning I’ve got to unlock the phone and fiddle around with it to do what I want. Voice recognition has only a limited set of pathways to work with, and it is impossible to do anything outside this limited design. For example, on an iPhone I can ask Siri to play a song for me, but I cannot do the same for video. This frustration will, hopefully, only be temporary, as each version of iOS (the operating system used on Apple devices) expands the actions and interactions that are possible using Siri’s voice control.
This expansion of what voice control can be used for, and how, can sometimes be surprising. Before it was released in the ‘Home’ app on iOS, I would never have thought of being able to control my lighting system with simple voice commands. Now that this technology has been invented, and I’ve bought a Hue Lighting hub and globes, I use it every day, and miss it when it isn’t available! Who knows what will become indispensable tech in my future, and how science fiction will inspire future science fact? … It could be flying wheelchairs, robots, jet packs, 3D printing replicators… Whatever comes, I am certain that being able to control these devices by simply talking to them through an immersive voice interface will be part of this future as well.
Bring it on!
For more information about Dragon Dictate check out this Wikipedia article, with links to the official site for the software: https://en.wikipedia.org/wiki/DragonDictate?wprov=sfsi1
For more information about voice control on Apple devices, check out the official Apple page: http://www.apple.com/au/ios/siri/
For more information about Google Assistant, check out the official Google page: https://assistant.google.com