I think some speech recognition systems rely too heavily on language model priors. That works well for routine tasks, where most of what the speaker says is easy to predict, but it fails when the speaker's words are specific or unusual.
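For example, try saying: "OK Google, the thick round box jumped over the hazy bog." The sentence sounds close to the familiar "the quick brown fox jumped over the lazy dog," so a recognizer that leans too hard on its prior may transcribe the familiar sentence instead of what was actually said.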
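To make the failure mode concrete, here is a toy sketch of how a decoder might combine an acoustic score with a language model prior in the log domain. Everything in it is an assumption for illustration: the two candidate transcripts, their scores, and the `lm_weight` parameter are made up, and no real recognizer is claimed to use these numbers or this exact formula.

```python
# Hypothetical log scores for two candidate transcripts (illustrative only).
# "acoustic": log P(audio | words), how well the words match the sound.
# "lm":       log P(words), how likely the words are under the prior.
CANDIDATES = {
    "the thick round box jumped over the hazy bog": {"acoustic": -10.0, "lm": -40.0},
    "the quick brown fox jumped over the lazy dog": {"acoustic": -14.0, "lm": -8.0},
}


def decode(candidates, lm_weight):
    """Return the transcript maximizing acoustic + lm_weight * lm."""
    return max(
        candidates,
        key=lambda text: candidates[text]["acoustic"]
        + lm_weight * candidates[text]["lm"],
    )


# A small weight on the prior lets the acoustic evidence win.
print(decode(CANDIDATES, lm_weight=0.05))
# A large weight on the prior pulls the result toward the familiar sentence.
print(decode(CANDIDATES, lm_weight=1.0))
```

With these made-up scores, the unusual sentence wins only while the prior's weight stays low. Once the weight is high enough, the familiar sentence wins even though it matches the audio less well.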