Thanks to Frank Lowney from the Digital Innovation Group at Georgia College & State University for this informative guest post. If you're interested in captioning your videos, you'll find this interesting. Lowney describes how to use the Enhanced Dictation feature in MacOS X 10.9 (Mavericks), combined with Audio Hijack and Soundflower, to turn recorded audio into a text file. This can be extremely handy for anyone who needs to create captions for a video but lacks the transcribed text.

The pressure is on to make screencasts and other online video more accessible. One important aspect of that challenge is to make video more accessible to persons who are deaf or have difficulty hearing. For video content creators, this means providing a transcript or, better, providing subtitles to that video so that dialogue may be viewed in the same context as the video. The problem is that many videos are created without a script that is followed closely by the speakers in that video. Indeed, many important videos are created in ad hoc fashion (interviews, panel discussions, conference presentations and the like) where scripts would be totally inappropriate.

Creating text from speech has become essential to meeting these expectations, especially where all one has to work with is the speech in the audio track of a video. Speech to text (STT) is a bit more difficult than text to speech (TTS), which has been in use much longer. MacOS X recently introduced Dictation (speech-to-text) as a feature usable in any application that takes text as input. This is quite an advance over having to purchase a two hundred dollar application to accomplish the same end. However, the first iteration of this system required an internet connection so that speech could be uploaded to Apple's servers, where it would be turned into text. This created delays and was difficult to use for substantial bodies of text. Dictation was then given a significant boost in MacOS X 10.9 (Mavericks) with the introduction of Enhanced Dictation, which enables offline use and continuous dictation with live feedback.

Still, this is a system that assumes a live speaker. There is no obviously easy way to route speech from a recorded file through Apple's Dictation system to produce usable text. You can, in fact, route the speech in an audio file through Apple's speech-to-text subsystem and render very usable text output. It isn't intuitive or Apple-easy, but it is something that anyone can accomplish with a bit of determination.

The application at the center of this process is Audio Hijack Pro by Rogue Amoeba ($32 USD). There are two things to set up with this app. The first is to identify the source of the audio. It could be any app that emits audio, but I used QuickTime Player X. Thus, I set that app as the audio source as follows:

This will capture the audio from anything that this app plays. My sample audio is from NPR and contains a dramatic reading by noted actor Sam Waterston. It looks like this in QuickTime Player X:

This configuration will grab all the audio from QuickTime Player X as it plays the "NPR Gettsyberg Address" audio file.

Next, we use Audio Hijack Pro to send that audio to Soundflower (free). To do that, we go to the Effects tab and choose Auxiliary Device Output from the 4FX menu. The Auxiliary Device Output plug-in enables us to choose the previously installed Soundflower as the recipient of the hijacked audio as follows:

Once installed, Soundflower becomes an input/output option in your Sound preference pane and everywhere else audio sources and destinations can be specified. In other words, it becomes an integral part of your sound system in MacOS X.

Finally, we set the Dictation input to be Soundflower as follows:

At this point, any audio played by QuickTime Player X will be routed to Soundflower and will thus become available to any application that accepts text input and has a Start Dictation menu item. The following screencast illustrates this process from start to finish:

Do you have your own solution for this that you've been using? Please comment below and share what you've learned.

I've set up the system to turn audio hijacked by AHJ into dictation, which is transcribed to text via Mavericks' Enhanced Dictation feature. It works perfectly to transcribe text to a TextEdit, TextWrangler or Word document, as long as I select the QuickTime source in AHJ, click the "Hijack" button, play a QT movie that is on my desktop, and immediately click on the TextEdit, TextWrangler or Word window and hit fn, fn. When I hit fn, fn, the audio from my internal speaker cuts out and text is transcribed after a lag of a second or two.
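Dictation delivers a plain block of text, so one step remains before that text can serve as subtitles: breaking it into short, timed cues. The following is a minimal sketch of that step in Python, not part of the workflow described in this post. It assumes the speaker talks at a roughly even pace and that you know the length of the audio; the function names `format_timestamp` and `transcript_to_srt` and the 42-character cue width are hypothetical choices made for illustration. The output follows the common SubRip (.srt) layout of a cue number, a time range, and the cue text.

```python
import textwrap


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def transcript_to_srt(text: str, duration_seconds: float, max_chars: int = 42) -> str:
    """Split a transcript into cues and spread them evenly over the audio.

    Assumption: even pacing. Real captions need timings checked against
    the actual speech, so treat the result as a first draft.
    """
    # Collapse stray whitespace, then wrap into cues of at most max_chars.
    cues = textwrap.wrap(" ".join(text.split()), width=max_chars)
    if not cues:
        return ""
    step = duration_seconds / len(cues)
    blocks = []
    for i, cue in enumerate(cues):
        start = format_timestamp(i * step)
        end = format_timestamp((i + 1) * step)
        blocks.append(f"{i + 1}\n{start} --> {end}\n{cue}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

To use it, paste the dictated text into a string, call `transcript_to_srt(text, duration_seconds)` with the length of the recording, and save the result with an `.srt` extension. Because the timings are only evenly spaced guesses, expect to adjust them by hand in a subtitle editor.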