Users should not need to "train" to use voice-recognition software. If I am told that something will react to my voice commands, then the only constraint I will accept is a list of commands I can use (I understand that the developers cannot possibly have allowed for all the available words in the dictionary).
Other than that, if my neighbor can understand what I say, then voice-recognition software should be able to as well. If it doesn't, then haul it back to the lab and work on it until it does.
Yes, I know, this is a very difficult field, and regional accents must be very hard to contend with. Nevertheless, in an era when people have become accustomed to seeing one-button solutions to everything (because of films, books, and even everyday workplace experience), it is hardly surprising that users cannot comprehend why voice-recognition software cannot "recognize" their own voice immediately. After all, that's what it says on the box, no?
Pascal.