
Transcribing sound files (YES!) and then getting the text into ID (How?)

I apologize for this post as I only know half of what I want to know, so I’m posting this as an open-ended question for some brilliant person to come and finish the technique. But first, the background.

Did you know that Adobe Soundbooth CS4 now has a terrific transcribing feature? This means you can import a sound file and the Transcribe command will convert the words in the file into text.

Now, it’s not perfect, and it can actually be the source of a fun party game. For instance, can you tell what the following words are supposed to be?

It’s actually the opening words to the latest episode of the InDesign Secrets podcast. And it’s supposed to read:

“…independent resource for all things InDesign-ine-ine-ine-ine-ine. It’s true…”

But the bottom line is that if you import an audio file into Soundbooth and then click the Transcribe button, you get a sort-of transcription of the words.

If you click on each of the words, you will jump to that portion of the audio waveform that contains that text. It’s supposed to make it easier for editors to find things. And (I think) it can be used to create subtitles in Flash videos.

But here’s my problem.

The text is not selectable as one complete story. Each word is a separate marker in the box.

Now, you can export the text, but the file format is XML.

I know that InDesign can import XML, but when I import the XML file I get the following structure in the XML pane:

And when I open the XML file in a text editor I get:

(The text goes on and on.)

Now, I KNOW there is real text in there, somewhere. I just need someone who understands this stuff to give me some hints as to what I can do to extract it.
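As a starting point, here is a minimal Python sketch of one way to pull the character data out of an XML file without knowing its exact structure. It assumes the words are stored as element text; if the export keeps them in attributes instead, it would need adjusting. The sample XML and its `transcript`/`word` names are invented for illustration and are not the actual Soundbooth format.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_source: str) -> str:
    """Return all character data in an XML document as one space-separated string."""
    root = ET.fromstring(xml_source)
    # itertext() walks every element in document order and yields its text.
    pieces = (chunk.strip() for chunk in root.itertext())
    # Drop whitespace-only chunks (indentation between tags) and join the rest.
    return " ".join(piece for piece in pieces if piece)


# Hypothetical sample: element and attribute names are assumptions, not Soundbooth's.
sample = """<transcript>
  <word start="0.0">independent</word>
  <word start="0.6">resource</word>
  <word start="1.1">for</word>
</transcript>"""

print(extract_text(sample))
```

The result is plain text that could be saved to a file and placed in InDesign like any other story. Using a parser rather than stripping tags with search-and-replace also decodes entities such as `&amp;` along the way.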


Sandee Cohen


13 Comments on “Transcribing sound files (YES!) and then getting the text into ID (How?)”

  1. GREP might be able to help you strip all of the tags out, but I can’t see enough of the XML. Sandee, can you post the XML file from your export? Someone here might be able to tear it apart and put it together again.

  2. Sandee, I figured it out.

    In the Soundbooth metadata panel menu there is an option to copy speech transcription. Then just paste it into InDesign and you are good to go.

  3. Does anyone know a good way to automatically take the transcript from Soundbooth or Premiere and use it to caption a movie file? It seems there should be a good link between Premiere Pro/Soundbooth and Flash but I haven’t found it.


  4. How clean does the sound bite need to be for the transcribing feature to work? If there is some background noise (like pumps running), or the speaker has a southern accent, do you think it will still work?

  5. In my initial tests I’ve found the feature to have an accuracy of less than 30-40%. If the transcription generated all the cue points and made it easy to caption or subtitle a movie it might be worth it. As it is, I’m sending transcripts to India.


  6. I am in the process of setting up a website (my first!) for the Laurel Senior Friendship Club (a club of local residents who are 55 and up). We have a membership of 580. Among these are seniors who have vision problems and hearing problems. I am looking for a tool I can insert into the website or individual webpages that could convert text into sound and sound into text (which would cover the vision and hearing problems).

    Have you come across anything recently that would help in this?
