This week I upgraded to Parallels 6 because it was once again part of the MacUpdate bundle. Primarily I use it for Dragon NaturallySpeaking when my hands are getting achy. But it always leaves me feeling like the battered spouse who keeps going back to her husband because she’s sure that this time it’ll be different. Bottom line: these products still don’t work.
On the Parallels side, the problem is simple and, one would think, fixable: Parallels just can’t keep track of the microphone. Every time I quit and relaunch Parallels, I have to completely reconfigure the microphone in both Mac OS X and the virtual machine. Parallels simply is not able to remember its microphone settings. Here’s a hint for version 7: microphones aren’t disk drives. Don’t make me configure sound input or output or assign USB devices to the host or guest operating systems. Treat audio devices like network ports, which just work without my having to think about them.
On the NaturallySpeaking side, the problems are virtually everything except speech recognition, and that’s more than you might think. Many of the issues with NaturallySpeaking are problems that PowerSecretary had resolved over 15 years ago on 100 MHz PowerBooks. For example, I shouldn’t have to keep correcting “parallels six” to “Parallels 6”. After I’ve used that phrase once or twice, the speech recognition program should learn it and recognize it and not bother me about it again. But instead I have to keep correcting it and correcting it and correcting it every single time. NaturallySpeaking does not learn from its mistakes. It has some sort of training mode that is a complete waste of time. There is no reason for this to be separate from the normal dictation process. I should not have to think about which words I’m going to use in an article and carefully train them before dictating. NaturallySpeaking should simply learn from the corrections I make in the process of dictation.
Second, NaturallySpeaking does not integrate well with the Windows operating system. It works better in some programs than in others, but its ability to navigate and select text in a document ranges from poor to abominable. In Firefox, it’s manageable: the insertion point is often off by a character or two, but it usually lands somewhere close to where it should be. In Chrome, it simply doesn’t work at all. Even in relatively well-supported programs like Firefox and Microsoft Word, cursor placement is still not nearly accurate enough to use for editing. NaturallySpeaking is adequate for dictating a first draft when your RSI is flaring up, but the second draft has to be typed by hand.
There are other aspects of the recognition that Nuance really needs to do a better job with. For example, the phrase “go to end of line” is one of the recognized commands, except that half the time NaturallySpeaking hears it as “code to end of line” and inserts that text, which you then have to delete before repeating the command. Other commands such as “delete previous word” and “correct that” are also frequently misrecognized. Most annoying of all is when NaturallySpeaking correctly recognizes the command but merely inserts its text rather than performing the requested action. Unlike free-form text recognition, recognizing commands, even in the middle of free-form text, is really easy. Systems with far less CPU power than my virtual machine have done this reliably for a couple of decades now. There is no reason NaturallySpeaking should ever get these commands wrong.
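To see why closed-vocabulary command recognition is so much easier than free-form dictation, consider a minimal sketch in Python. The tiny command list and the similarity cutoff here are my own illustration, not Nuance’s actual grammar; the point is that with a small, fixed set of phrases, even a garbled near-miss like “code to end of line” can be mapped back to the intended command with trivial computation:

```python
import difflib

# Illustrative fixed command grammar -- a handful of editing commands,
# nothing like the full command set a real dictation product supports.
COMMANDS = [
    "go to end of line",
    "delete previous word",
    "correct that",
]

def classify(phrase, cutoff=0.8):
    """Return the matching command, or None if the phrase is free-form text.

    With a closed vocabulary this small, a simple string-similarity score
    is enough to recover commands even from slightly garbled recognition.
    """
    matches = difflib.get_close_matches(phrase.lower(), COMMANDS,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(classify("go to end of line"))    # exact command -> matched
print(classify("code to end of line"))  # near-miss still maps to the command
print(classify("the quick brown fox"))  # ordinary dictation -> None
```

A real recognizer works on acoustic hypotheses rather than final text, of course, but the asymmetry is the same: scoring an utterance against a few dozen fixed phrases is vastly cheaper and more reliable than open-vocabulary transcription.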
There are also times when NaturallySpeaking recognizes a command that isn’t there. This doesn’t happen with built-in commands such as “go to end of line” but rather when it’s trying to interpret words I’m speaking as references to links on the screen. For instance, earlier while I was dictating this post, NaturallySpeaking attempted to move the post into the trash because it misinterpreted the word “to” as a click on the page’s “Move to Trash” link, and as I was dictating this very sentence it did it again.
Yes, I know there’s a set of confusing and misleading options hiding in the preferences to turn some of this functionality off. Half of them shouldn’t be options in the first place. The other half should have the opposite defaults. This reeks of classic Windows user interface design, where features are thrown into the program and rubbed in the user’s face just to show how clever the developers are. Nuance still hasn’t learned the lesson that simplicity is power, that just because you can do something doesn’t mean you should, and that a program should simply do the right thing rather than asking the user whether they want the right thing or the wrong thing.
The bottom line is that Nuance has over-emphasized the purely algorithmic speech recognizer, while not adequately addressing the failures of its user interface. The actual recognition of words within Dragon NaturallySpeaking is pretty good and has been for several versions now. However, the user interface remains atrocious. There is colossal room for improvement left in this product without touching the actual recognition engine. Unfortunately, I see little to no evidence that Nuance knows that or cares about it. If you look at the release notes for Dragon NaturallySpeaking 11, there’s a list of features and the usual promise of increased recognition accuracy but nothing about accurate cursor placement, universal application support, real-time training from the actual text that’s dictated, or properly distinguishing between commands and spoken text.