Hi,
I second Marsha.
TestComplete has no means of generating speech, nor can it check that generated audio contains the correct words. In theory, I can imagine an approach where TestComplete drives, say, Windows Media Player to play some predefined sound file and then intercepts the output stream from your tested application to check that the expected audio output is generated. However, this approach looks complex, brittle and unreliable.
If your application can redirect the generated audio to a file, then you should be able to feed it predefined sound input files and/or images of text and compare the generated output with expected, pre-saved outputs.
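For the file comparison itself, here is a minimal sketch in plain Python (standard library only, not a TestComplete API). It assumes the application writes uncompressed 16-bit PCM WAV files; the function names, the file paths in the usage note and the tolerance parameter are all hypothetical, and a sample-by-sample check like this only works if the application produces the same audio on every run.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Return (channels, frame_rate, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"{path}: only 16-bit PCM is handled in this sketch")
        channels = wav.getnchannels()
        frame_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return channels, frame_rate, samples


def wav_matches(actual_path, expected_path, tolerance=0):
    """Compare two WAV files sample by sample.

    tolerance is the largest allowed absolute difference per sample
    (0 means the audio data must be identical).
    """
    a_channels, a_rate, a_samples = read_pcm16(actual_path)
    e_channels, e_rate, e_samples = read_pcm16(expected_path)
    # Different format or length means the outputs cannot match.
    if (a_channels, a_rate, len(a_samples)) != (e_channels, e_rate, len(e_samples)):
        return False
    return all(abs(a - e) <= tolerance for a, e in zip(a_samples, e_samples))
```

A test step could then call something like wav_matches("actual_output.wav", "expected_output.wav", tolerance=2) after the application has written its output. If the generated audio varies slightly between runs, a small non-zero tolerance may help, but if it varies in length or timing this kind of comparison will not be enough.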