"Smart, invisible regex" sounds like a lot of bs... could you give a more technical explanation?
Also, the Whisper model doesn't really have a context window; it already segments the audio into chunks with a certain amount of overlap between them. I'm having a hard time understanding what you are trying to say here.
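
To make the chunking point concrete, here is a minimal sketch of splitting a long recording into fixed-length windows that overlap. The 30-second window matches Whisper's input length and 16 kHz is the sample rate it expects, but the 5-second overlap and the helper name `chunk_bounds` are assumptions for illustration only; how a particular app or library actually splits the audio may differ.

```python
def chunk_bounds(n_samples, chunk_len, overlap):
    """Return (start, end) sample indices of overlapping chunks covering n_samples.

    Hypothetical helper: consecutive chunks share `overlap` samples.
    """
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be in [0, chunk_len)")
    step = chunk_len - overlap  # how far each new chunk advances
    bounds = []
    start = 0
    while start < n_samples:
        end = min(start + chunk_len, n_samples)
        bounds.append((start, end))
        if end == n_samples:
            break  # the last chunk reached the end of the audio
        start += step
    return bounds


SAMPLE_RATE = 16_000  # Whisper expects 16 kHz audio

# A 70-second file, 30-second windows, 5 seconds of overlap (assumed value).
bounds = chunk_bounds(70 * SAMPLE_RATE, 30 * SAMPLE_RATE, 5 * SAMPLE_RATE)
print([(s / SAMPLE_RATE, e / SAMPLE_RATE) for s, e in bounds])
# → [(0.0, 30.0), (25.0, 55.0), (50.0, 70.0)]
```

Only one window has to be in memory at a time, so the length of the file does not by itself determine peak memory use.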
This is just plain wrong. I have my own Whisper app in the App Store (on iOS, where memory is very limited), and there are no problems at all with longer audio or video files.