Thanks for coming to InDesignSecrets.com, the world's #1 resource for all things InDesign!

InDesign Secrets Video: Extracting Images From Word Documents

For InDesign users there are a few certainties in life: death, taxes, and the fact that somewhere along the line you’re going to have to deal with Microsoft Word documents.

And when you place content from Word into InDesign, you have to decide how you want to handle any images that came along with the text. By default, InDesign will import inline graphics along with the text. And those images become embedded in your InDesign files. This can be useful since you don’t have to deal with the chore of placing the images separately. But it also means that you can’t edit those images, and if they’re large they can bloat your InDesign file size.

Fortunately, you can quickly unembed images that came from placed Word docs, using the technique that David Blatner shows in his latest InDesign Secrets video on lynda.com.

In the video, David also demonstrates how to easily extract all the images from a Word document so you can edit them, place them in other InDesign documents, etc.

So check out the video and take control over the images that come from your Microsoft Word documents. But when it comes to death and taxes, you’re still on your own.

10 Comments on “InDesign Secrets Video: Extracting Images From Word Documents”

  1. This trick works most of the time to extract raster images such as tiffs, jpgs or gifs; and some vectors… but if there is a graphic placed in Word that used a special plug-in for Word (e.g. dedicated software for graphing or drawing chemical equations), the media folder may only have the placeholder images of the pictures in question.

    To get to the source files in these circumstances, there is another subfolder that appears called “embeddings” and the files appear as .bin files. To open these files, the suffix has to match the appropriate software that created the original, and if working cross-platform (e.g. using this trick on a Mac to edit images created by a client using Windows) it doesn’t always work.

    This came up recently when I had to try to make usable vector files of graphs that were embedded in a Word file but had been created using the Efofex FX Graph plug-in.

  2. Ah, but some of us are card-carrying Word minimalists. We find Word documents create so many hassles, we bring in as little as possible directly from the Great Beast of Redmond.

    Cutting back on the formatting imported typically means those images don’t come at all. But that’s no problem. Word 2008 for Macs (and I presume other versions) has a control-click option for images called Save as Picture. That lets you save them as files and, like all good boys do, run them through Photoshop to clean them up and change them from whatever gosh-awful specs Word has them in.

    ——-

    I’ll offer another hint that can be a real timesaver. When ID imports Word documents, it seems to let it get away with placing formatting instructions that it would never let a user do, thus creating a problem that users can’t directly correct. I’ve had every paragraph in an imported Word document come with a hyperlink (tiny blue, underlined type) attached to every bit of text. I could cover it up by assigning a different style, but I couldn’t get rid of it.

    And just a few days ago, I found that every bit of text in my ID index had Footnote Reference style attached, which meant superscripts. Most of the index entries had other formatting attached that overrode that. Unfortunately, the commas between page numbers didn’t. The result was a mess.

    No changes I made to the index style definitions changed that. Eliminating them and creating another did no good. When Word messes up an ID document, it really messes it up. Even cutting and pasting an index from another book and importing its styles did not get rid of the problem. Yes, I could create a Normal style that overrode that rogue styling, but that would have to be done every time I redid the index.

    Then I discovered that the Footnote Reference style was not used in the book for footnotes. ID lets you simply superscript them without a style. That Footnote Reference character style was yet another alien import from Word. ID, apparently unable to resist anything that a Word document demands, had bizarrely imposed it on the Index in a way that no user tweaks could fix.

    Once the Word root of the ill was discovered, the fix was easy. I simply deleted that Footnote Reference character style, taking great care to answer “No” when ID asked if I wanted to preserve that style’s formatting.

    That’s the long of it. The short is that, if you find yourself wrestling with impossible-to-correct formatting issues in a document imported from Word, look for styles brought in from Word and delete those styles totally and utterly. If you’ve used those styles other places in your document, you’ll need to give that text new styles.

  3. The links from the article on extracting images from Word documents are incorrect. They take the viewer to a video on Quick Apply. Very interesting, but not what I wanted to see.

  4. Hi Glenna and Bob-

    Sorry, the linking is a bit tricky. When the new InDesignSecrets videos are posted at lynda they are available only through the main link to the course. After a week, I could edit the post to point to the specific video, but I’ve been hesitant to do that since the link would break if/when lynda.com puts it behind the paywall. So I just link to the course in each case, since that will never break. But I might reconsider since I don’t want to cause frustration/confusion for anyone.

    In the meantime, here’s how to get to any free video: Follow the link to the course. Then scroll down and on the left side, you’ll see the complete listing of videos. Find the video you want in that list and click it to watch. Here’s a direct link to the video on extracting images from Word docs: http://goo.gl/6HKhI7

  5. There’s a quicker and better way to get your hands on images in Word files. The way to get the images out before placing the document in InDesign depends on whether the Word document type is doc or docx.

    DOCX
    docx documents are zip files and the images can be extracted from there. With a decent unzipper such as 7zip, open the docx archive directly. If you don’t have 7zip or a similar utility, change the docx type to zip and open that zip with your standard OS unzipper. The images are in the /word/media folder. Most images are in .EMF format, which you can place directly in InDesign. It pays to have a peek in those .emf files: many (if not most/all) are in fact PDFs. Just change the .emf type to .pdf and you can place that. Those PDFs sometimes don’t look right; you can solve that by opening the PDF in Acrobat and saving it from there.

    DOC
    To get the images out of a doc document, save the document as a web page (in Word: Save As > Save as type > Web page). Word creates a new folder using the document’s name and dumps various files in there. You’ll get every image in two formats: its native format (jpg, png, etc.) and an .EMZ version. Many (if not all) .emz files are zip files themselves, and when you unzip them you often get PDF files of very high quality.

    Peter

    • Hi Peter, thank you for the comments. I’m not sure what you mean by “quicker and better way,” because I actually show that docx trick in the video above, along with the unembed method, which is (I think) faster than Word’s Save as Web Page method. But it’s always good to learn all the tricks!

  6. David, I clearly didn’t read past the unembed trick and didn’t see that you describe the docx trick. Sorry about that! There may be a disadvantage in the unembed trick though: I think that when you place the images along with the Word file, you get lower-res versions of the images. The docx route gives high-res images. I must admit that I didn’t investigate that in detail though. But I did see a few times some shabby graphics which improved greatly after retrieving high-res versions.

    Peter
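
For anyone who would rather script the docx trick discussed in the comments than unzip by hand, here is a minimal Python sketch. It assumes the layout described above (images under word/media, plug-in source files under word/embeddings) and copies both folders out of the archive. It also peeks at the first bytes of each file to guess its real type, which is one way to check whether an .emf is actually a PDF. The function names and the example file names are hypothetical, and the signature list is deliberately short, so treat this as a starting point rather than a finished tool.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Folders inside a .docx archive mentioned in the comments above.
ASSET_PREFIXES = ("word/media/", "word/embeddings/")

# A few well-known file signatures; anything else is reported as "unknown".
SIGNATURES = [
    (b"%PDF", "pdf"),
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole"),  # OLE compound file
]


def sniff_format(data: bytes) -> str:
    """Guess a file's type from its leading bytes, ignoring its extension."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return "unknown"


def extract_docx_assets(docx_path, out_dir):
    """Copy media and embeddings out of a .docx.

    Returns a dict mapping each extracted path (relative to the archive's
    word/ folder) to the type guessed by sniff_format.
    """
    out = Path(out_dir)
    found = {}
    with zipfile.ZipFile(docx_path) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.startswith(ASSET_PREFIXES):
                continue
            rel = PurePosixPath(info.filename).relative_to("word")
            if ".." in rel.parts:
                continue  # skip entries that try to escape the output folder
            data = archive.read(info)
            target = out.joinpath(*rel.parts)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
            found[str(rel)] = sniff_format(data)
    return found


# Example call (file names are placeholders):
# for name, kind in extract_docx_assets("report.docx", "report_assets").items():
#     print(name, kind)
```

A file reported as "pdf" can be renamed to .pdf before placing it, as Peter suggests. Files reported as "ole" or "unknown" in the embeddings folder are the .bin files described in the first comment, which still need the software that created them.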
