A typical computer uses a standard keyboard with more than 100 keys. Many of these have a secondary function, activated by modifiers such as Shift and Ctrl. This is more than enough to encode the entire alphabet in upper- and lowercase, the numbers 0 to 9, a selection of everyday symbols, and common functions that interact with the operating system.
On the other hand, a stenotype machine has fewer than 25 keys, which is not enough for all the letters of the English alphabet, never mind the numbers and punctuation marks. That is workable because the operator is more interested in the sound of a word than its spelling, and pressing several keys together allows a speed of more than 200 words per minute while moving the hands as little as possible.
Incidentally, the one punctuation mark on the device is an asterisk, used to mark corrections. In some messaging applications, where messages can’t be recalled, users will typically type an asterisk in a new message underneath, followed by the corrected word.
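To make the shortage of keys a little more concrete, here is a minimal Python sketch of how a handful of keys can stand in for letters that have no key of their own. The seven left-hand consonant keys and the chord assignments below are my assumption, based on one widely taught English steno convention; they are not taken from any particular machine, and other theories may assign them differently. The function names are purely illustrative.

```python
# Assumed left-hand consonant keys on an English stenotype layout.
LEFT_KEYS = "STKPWHR"

# Assumed chords (keys pressed together) for initial consonants
# that have no dedicated key. One common convention, not the only one.
LEFT_CHORDS = {
    "D": "TK",
    "B": "PW",
    "L": "HR",
    "F": "TP",
    "M": "PH",
    "N": "TPH",
    "G": "TKPW",
}


def keys_for_initial(letter: str) -> str:
    """Return the key or chord used for an initial consonant."""
    letter = letter.upper()
    if letter in LEFT_KEYS:
        return letter  # the letter has its own key
    if letter in LEFT_CHORDS:
        return LEFT_CHORDS[letter]  # the letter is built from several keys
    raise KeyError(f"no chord listed for {letter!r} in this sketch")


def chord_count(keys: str) -> int:
    """Count the non-empty combinations of the given keys."""
    return 2 ** len(keys) - 1


print(keys_for_initial("d"))
print(chord_count(LEFT_KEYS))
```

Seven keys already give 127 possible combinations, which is why so few keys can cover far more than seven sounds.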
However, the stenotype is now decades old, and technology has moved on. Below is a video about live subtitling for proceedings in Parliament.
This application uses voice recognition. However, it’s far easier to program a computer to understand just one voice than many, so an operator listens through headphones to the words spoken on TV and repeats them.
You’ll notice from the video that the operator speaks in something of a monotone, regardless of how passionate the MPs are feeling, and this helps the software to provide a consistent result. Punctuation also needs to be added manually, as do changes of speaker; colour codes are often used to help viewers work out which person said what.
Such software is also available for home users. For a period when I had RSI, I used Dragon NaturallySpeaking to give my fingers a rest. It worked to a high standard, I found, even straight out of the box and with a Scottish accent. However, it produces its best results when connected to the Internet, as it can then benefit from deep learning techniques. Without a connection, the audio is processed locally and there’s a noticeable decrease in quality.