General speech recognition is not the same problem as radiology dictation. Anatomical language, measurements, abbreviations, dictated corrections, and double negatives create a domain where generic models fail in clinically meaningful ways.
The problems are specific:
- Technical vocabulary rarely appears in general corpora.
- Measurements must become structured clinical notation.
- Implicit punctuation has to be inferred from reporting context.
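
To make the measurement point concrete, here is a minimal sketch of turning a dictated measurement into compact notation. It is an illustration under stated assumptions, not the system described here: the word tables, the function names `spoken_number` and `normalize_measurement`, and the `3.2 x 1.5 cm` output style are all hypothetical.

```python
# Hypothetical spoken-number tables; a production system would cover far more forms.
DIGITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
          "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
         "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
         "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
WHOLE = {**DIGITS, **TEENS, **TENS}

# Assumed unit spellings and the abbreviations they map to.
UNIT_ABBREVIATIONS = {"millimeter": "mm", "millimeters": "mm",
                      "centimeter": "cm", "centimeters": "cm"}


def spoken_number(words):
    """Convert number words such as ['three', 'point', 'two'] to '3.2'."""
    if not words:
        raise ValueError("empty number")
    if "point" in words:
        i = words.index("point")
        whole, frac = words[:i], words[i + 1:]
    else:
        whole, frac = words, []
    try:
        # Naive: whole-number words are summed, so 'twenty one' gives 21.
        total = sum(WHOLE[w] for w in whole)
        # Words after 'point' are read digit by digit.
        decimals = "".join(str(DIGITS[w]) for w in frac)
    except KeyError as err:
        raise ValueError(f"unrecognized number word: {err.args[0]}") from None
    return f"{total}.{decimals}" if decimals else str(total)


def normalize_measurement(phrase):
    """Turn 'three point two by one point five centimeters' into '3.2 x 1.5 cm'."""
    tokens = phrase.lower().split()
    if not tokens or tokens[-1] not in UNIT_ABBREVIATIONS:
        raise ValueError("phrase must end with a known unit")
    unit = UNIT_ABBREVIATIONS[tokens[-1]]
    dimensions, current = [], []
    for token in tokens[:-1]:
        if token == "by":  # 'by' separates dimensions
            dimensions.append(spoken_number(current))
            current = []
        else:
            current.append(token)
    dimensions.append(spoken_number(current))
    return " x ".join(dimensions) + " " + unit


print(normalize_measurement("three point two by one point five centimeters"))
# prints: 3.2 x 1.5 cm
```

A rule-based pass like this only works on isolated phrases. It does not address the harder part of the problem, which is recognizing where a measurement starts and ends inside running dictation.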
The useful path was not piling rules onto a generic model. We trained for the domain and added a controlled vocabulary, so each radiologist can keep the output predictable.
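
One way to picture a per-radiologist controlled vocabulary is a small table of preferred renderings applied to the transcript. The sketch below assumes that interpretation. The `apply_vocabulary` function and the sample entries are hypothetical and are not taken from the product.

```python
import re


def apply_vocabulary(text, vocabulary):
    """Replace each spoken form in `vocabulary` with its preferred rendering.

    `vocabulary` maps a lowercase spoken form to the exact output the
    radiologist wants. Matching is case-insensitive and on word boundaries.
    """
    if not vocabulary:
        return text
    # Longest entries first, so multi-word terms win over shorter overlaps.
    keys = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in vocabulary.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Hypothetical entries for one radiologist.
vocabulary = {
    "t two": "T2",
    "flair": "FLAIR",
    "hyper intense": "hyperintense",
}

print(apply_vocabulary("Hyper intense signal on t two and flair.", vocabulary))
# prints: hyperintense signal on T2 and FLAIR.
```

Keeping the table as plain data means the same spoken form always yields the same rendering for a given radiologist, which is the predictability the approach is after.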