This article is from May 30, 2011, and is no longer current.

Creating EPUB in InDesign CS5.5: Beware the WebKit Bug!

InDesign CS5.5 brings many improvements to the creation of EPUB files. A posting here revealed some of the improvements. However, I recently ran into a bug that occurs when exporting EPUB from InDesign CS5.5 but not from InDesign CS5. It has been described as a WebKit encoding bug.

I was working on creating two eBooks, and I had started them in InDesign CS5. When InDesign CS5.5 was completed, I happily opened them in the new version because using CS5.5 required a lot less postprocessing of the files. The chapters passed EPUB validation, but when I tried to preview some of them on my iPad (and my iPhone) in Apple’s iBooks eBook reader, I saw this strange message (shown in a screen capture from my iPhone):

WebKit Error Message

I’m relatively new to creating EPUB files, but I’ve learned quite a bit with the help of books like Elizabeth Castro’s EPUB Straight to the Point, Gabriel Powell’s webinars, and Anne-Marie’s Lynda.com videos. With this error, though, I hit a wall because I had not seen it documented anywhere. More confusingly, the same files worked fine when exported out of InDesign CS5, and passed EPUB validation. For a few weeks, I was stymied. Because the error appeared at the beginning of a chapter, I assumed that the error was at that location.

Fortunately, I was able to attend and present at the InDesignSecretsLive.com Print and ePublishing Conference last week. There I was able to show my problem to EPUB gurus Ron Bilodeau of O’Reilly Media and Gabriel Powell. They described it as a WebKit encoding bug.

Ron opened up the InDesign CS5.5-generated EPUB in Oxygen, an excellent XML and EPUB editor. The error message actually points to line numbers in the XHTML code. Note in the error message shown above that the first error occurs on line 17. In Oxygen (or another editor like TextWrangler when you turn on line numbers) it appears like this.

Shy Entity in CS5.5 EPUB

The “shy” character here refers to a discretionary hyphen at some places in the text. Here is the problem as described in a blog posting Ron pointed me to:

This is a very common XHTML mistake, now growing in visibility much due to the Google Chrome boom. Google Chrome is based on Webkit, an open source browser engine also used in Apple’s Safari; Webkit is very restrict [sic] on XHTML rules.

This particular error is caused due to common HTML entities usage on XHTML outputs, which follows XML entities rules. Basically means you are using a &nbsp;-like entity, when in XHTML you should use a &#160; [XML-encoded] entity.

The posting also shows a chart of HTML and XML entities which differ:

Table of HTML & XML Entities

The bug doesn’t cause the EPUB to fail validation. And it won’t show up in eReaders with a different rendering engine, like Adobe Digital Editions. But it will show up in eBook readers like iBooks, which use the WebKit rendering engine. It may happen with characters other than a discretionary hyphen, but I haven’t investigated that.

In InDesign CS5, when the EPUB was exported, such entities were not included in the EPUB. The same passage of text is shown below in the EPUB XHTML generated by InDesign CS5:

No Shy Entity in CS5

At the conference we were told by Chris Kitchener, the InDesign product manager, that EPUB export was totally rewritten in InDesign CS5.5. Apparently, the fix for the WebKit bug that was in InDesign CS5 EPUBs was accidentally dropped in InDesign CS5.5. I’ve passed the bug report on to him.

The fix in this case is to identify where the bug is and remove it. In my case, I searched my chapters for discretionary hyphens (not used in EPUB) and removed them. You could also edit the XHTML code.
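
If you would rather edit the XHTML than the InDesign file, one option is to convert named HTML entities such as &shy; into numeric character references such as &#173;, which an XML parser can resolve without a DTD. The Python sketch below is only an illustration of that idea, not part of my own workflow. The function name to_numeric_entities is made up for this example, and it assumes you have already extracted the chapter's XHTML as a text string.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references like &shy; or &nbsp; (not numeric ones like &#173;).
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_entities(xhtml: str) -> str:
    """Replace HTML named entities with numeric character references.

    Hypothetical helper for illustration: XML built-ins and unknown
    names are left exactly as they were.
    """
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return _ENTITY.sub(swap, xhtml)


if __name__ == "__main__":
    print(to_numeric_entities("dis&shy;cre&shy;tion&shy;ary T.&nbsp;S. Eliot &amp; co."))
```

The numeric forms mean the same characters, so the soft hyphens and non-breaking spaces stay in the text; only their spelling in the markup changes.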

And what does the word “shy” mean? I found this reference which explained it:

The ISO Latin 1 character code, also known as ISO 8859-1, and the ISO 8859 character sets in general, contain a character named soft hyphen, abbreviated SHY, code value 255 in octal. In general, the ISO 8859 standards specify the characters and their codes only, not the use of the characters. However, soft hyphen is one of the few exceptions.

Steve Werner is a trainer, consultant, and co-author (with David Blatner and Christopher Smith) of InDesign for QuarkXPress Users and Moving to InDesign. He has worked in the graphic arts industry for more than 20 years and was the training manager for ten years at Rapid Lasergraphics. He has taught computer graphics classes since 1988.
  • Given the WebKit bug, do you really benefit from using the improved EPUB tools if you then have to use something like Oxygen to search for these errors and correct them?

    Obviously this will likely be fixed in an update, and hopefully soon. So much about EPUB evolution and HTML5/CSS3 is changing; where is the standard?

    Thanks for this enlightenment

    Sincerely,

    Stephen W. Cannon

  • Steve Werner says:

    There’s a relatively easy fix for the bug. I’m sure Adobe will fix it soon.

    It’s still much easier to create a good file from InDesign CS5.5 than earlier versions. There were several bugs and limitations in InDesign 4 and 5 to be overcome. The ability to set output order with the Articles panel is reason enough to upgrade if you spend much time creating EPUB files.

    There will be a lot of flux in the next couple years as the standards evolve, but things are changing very fast, and things should be much easier soon. But it will probably still be necessary to do some pre- and postprocessing for a while to come using scripting and a knowledge of CSS.

  • Eden Maxwell says:

    Hi Steve:

    Thanks for the time-saving heads-up on this nuisance issue. Your information will also defuse much frustration.

  • Jeremy says:

    I think it should be called the “InDesign CS5.5 Webkit bug” to avoid any impression that it’s Webkit’s fault!

  • Answering the question of “is it still better to use CS5.5 for epub, even though you need to fix things like this”: Ron and others made it clear that almost every EPUB needs to be tweaked after export, no matter what version of InDesign you’re using. Ron ran through 5 or 10 different find/change routines he regularly did with Oxygen, even on 5.5 epubs. The point is that even with this fix, there’s a lot less fixing required from 5.5.

  • Dan Rodney says:

    I found this problem when I exported my first real ePub in CS5.5 (which had non-breaking spaces).

    @naomikennedy on Twitter suggests there’s an extra space after the XHTML 1.1 in the doctype.

    https://twitter.com/#!/naomikennedy/status/72105682362580993

    I haven’t had time to investigate this myself so I’m not sure what the proper fix is. What does everyone else think?

  • Dan Rodney says:

    Looks like you’ll have to copy/paste the above twitter link, somehow it didn’t make it through properly.

  • Kai Ruebsamen says:

    I had the same problem some weeks ago. In my tests, Steve’s problem occurs only if a discretionary hyphen is not displayed as a printable hyphen in InDesign. In my case a find/change in InDesign solved the problem and had no effect on the print version. Maybe it’s a bug, but in my opinion a minor one.

    Dan’s problem seems much bigger, because a find/change for the non-breaking spaces affects the print version as well :(

  • Jeremy says:

    I have a ridiculous admission to make. Although I’ve been using Oxygen for quite a while, I still haven’t figured out how to make changes to multiple XHTML files in one go.

    Yeah, I’ve been busy, on other stuff, but strange that it didn’t seem obvious at all (to me) how to do it.

    But if I did know how to do that, it wouldn’t be a big deal at all to simply do a find/change.

  • Phil Frank says:

    Jeremy,

    The reason you haven’t figured out how to make changes to multiple XHTML files in one go in Oxygen is that until about two weeks ago, it wasn’t possible. It’s only with the new release 12.2 that a new option has been added that allows this.

  • Jeremy says:

    Phil, you’re absolutely right! Thank you so much, this makes a huge difference.

    The funny thing is, I’m doing exactly the same in Oxygen 12.2 as I would have tried in the earlier versions — it’s as if it “thought” it was able to do a find/replace in multiple open files, but wasn’t in fact able to carry it off.

    Thanks again — this is great!

    Jeremy

  • Jamie McKee says:

    Thanks for bringing this up Steve. I had the same problem when I created my first ePub out of CS5.5, and was stumped as to why it was working fine in ADE and others, but displayed, as Ron called them at PePCon, “the dreaded pink box” in iBooks. In my case, it wasn’t the “shy” character, but the “nbsp” character (the non-breaking space). So here’s my question…and the reason nbsp showed up in my ePubs: How are folks handling names like T. S. Eliot, so that the T. and the S. always stay together? I put in a non-breaking space for my print versions, but can’t do this for ePub. Sigh. I hope the state of typography gets improved as ePub evolves.

  • Naomi says:

    As Ron mentioned, I discovered that the problem actually has nothing to do with non-breaking spaces. I do a bit of iPhone development on the side and when I first hit this error I knew that nbsp wasn’t the problem. (I had used non-breaking spaces in iBooks before)

    But the file passed ePubCheck, so…FlightCrew to the rescue. It flagged a slightly incorrect DTD, with “-//W3C//DTD XHTML 1.1 //EN” instead of “-//W3C//DTD XHTML 1.1//EN” (notice the extra space after 1.1).

    I figured it was worth a shot and (miraculously) once I corrected this, no more undefined entities. This has fixed the error in every ePub I’ve exported from 5.5 (which is a lot).

    I would be grateful if someone could explain why though…

    I would recommend following the #eprdctn hashtag on twitter. Lots of these problems are solved there.

  • Peter Sorotokin says:

    Naomi,

    It makes sense to me now! Contrary to what others say about HTML entity references, using them in XHTML is OK, as long as the XHTML DTD that defines them is included. However, as you pointed out, the SYSTEM identifier of the XHTML DTD contains an extra space. Epubcheck and the Adobe EPUB renderer both key off the DTD URL (which is also given and which is correct) to find the XHTML DTD, but iBooks apparently uses the SYSTEM ID instead and does not look at the DTD URL, so it does not recognize the DTD as the XHTML DTD. Without the DTD, HTML entity references cannot be resolved.

  • Alan Gilbertson says:

    @Jeremy, I tend to side with Peter S. in considering this a webkit/iBooks deficiency, rather than InDesign.

    DTD declarations should be honored, else why bother having them in the first place?

  • Peter Sorotokin says:

    Extra space in DTD is still a bug in InDesign, WebKit should be able to use SYSTEM ID if it so chooses (not sure what is the reason behind it, but it is legal).

  • Naomi says:

    Thanks for explaining Peter! I should have posted that solution a long time ago. I just submitted a bug report to Adobe as it seems (from this post) that they haven’t posted/patched this error.

    I would point out that I am still removing all discretionary hyphens (“shy”) before export because a lot of print designers use them as a “no break” hack at the beginning of a word and I suspect that it doesn’t work that way in ePub files. It was less work to just add discretionary hyphens to my Find and Replace script than to work out how the different devices handled them.

  • Kiyo Toma says:

    Thanks to everyone for bringing this to the InDesign team’s attention. Special thanks to Naomi for discovering the root cause! We’ve already logged this as a bug, so no need for any more bug submissions. I can’t comment on an exact date for a fix, but the current plan is to roll this into a patch, along with other fixes we have in the pipeline. Stay tuned.

  • Jeremy says:

    Is this not ridiculously easy to fix? Does it involve more than someone opening a file and deleting a space?

    Forgive my paranoia — every time I spend half a million bucks calling a plumber in, I wonder whether all he did was turn off a tap!

  • Command_Q says:

    I can’t for the life of me make this edit. I delete the extra space in Dreamweaver, save it with the original name, zip the files, and change the extension to .epub, but I cannot open the ebook in anything. Any insight anyone can offer would be greatly appreciated.

  • The unzip/rezip and renaming only works on Windows. If you’re on a Mac, you have to use an Applescript or a Terminal command. Are you on a Mac?

  • Also, you can easily run the Find/Change without unzipping anything by opening the EPUB with the Archive Browser panel in Oxygen Author. Choose Multi-file search, click the “Opened Archive” for the scope, and it’s just a straight find/change. Takes about 2 seconds.

    https://www.oxygenxml.com/xml_author.html

  • Jorina says:

    I’ve come across this warning when exporting my ePub.

    The file was exported but one or more problems were detected:
    TOC: 1 skipped (TOC entry has incorrect nesting level)

    I would appreciate it if someone could advise me on this issue.

  • Linda says:

    I am getting an error when exporting as .mobi and .pub. My InDesign 5.5 stops working and closes every single time. It worked when I only had the first chapter, but now that I have added all 8, ID quits every time. Any ideas from all you geniuses? Thanks!!

  • Brad Siefert says:

    Thank you! This was killing me.

  • Don Cox says:

    Thanks for a great article, Steve. I’ve been beating myself up over this issue for a couple of days. I’d located the chart you show above, but was still having some issues getting things to work. Oxygen may make the error correction process more fluid.

  • Sacha Heck says:

    With the last InDesign Update 7.5.2, Adobe fixed the bug. Take a look at the release notes here: https://kb2.adobe.com/cps/919/cpsid_91983.html

  • Ben says:

    I’ve just spent my whole day on this “nbsp” error.
    It took me 5 minutes after reading this post to understand that nbsp = non-breaking space.

    Yeah, I know, I’m a bit slow. But now, I have a really nice and really cool ePub thanks to you ! And I’m happy :.)

    Thanks Steve
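
For readers stuck on an InDesign CS5.5 build without the later fix, the DOCTYPE correction Naomi describes and the repackaging step that tripped up Command_Q can be done in one pass with a script. What follows is a minimal Python sketch under stated assumptions, not anything supplied by Adobe or by the commenters: the function name fix_epub_doctype is invented for this example, and it assumes the XHTML files use an ASCII-compatible encoding such as UTF-8. It writes a new EPUB instead of changing the original, and it stores the mimetype entry first and uncompressed, as the EPUB container format expects.

```python
import re
import zipfile

# The malformed identifier from the comments: note the space before //EN.
BAD_DOCTYPE_ID = re.compile(rb"-//W3C//DTD XHTML 1\.1\s+//EN")
GOOD_DOCTYPE_ID = b"-//W3C//DTD XHTML 1.1//EN"


def fix_epub_doctype(src_path: str, dst_path: str) -> int:
    """Copy an EPUB, repairing the XHTML 1.1 DOCTYPE in each content file.

    Hypothetical helper for illustration. Returns the number of files changed.
    """
    changed = 0
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        # "mimetype" must be the first entry and must not be compressed.
        dst.writestr("mimetype", src.read("mimetype"),
                     compress_type=zipfile.ZIP_STORED)
        for info in src.infolist():
            if info.filename == "mimetype" or info.is_dir():
                continue
            data = src.read(info.filename)
            if info.filename.lower().endswith((".xhtml", ".html", ".htm")):
                fixed = BAD_DOCTYPE_ID.sub(GOOD_DOCTYPE_ID, data)
                if fixed != data:
                    changed += 1
                    data = fixed
            dst.writestr(info.filename, data,
                         compress_type=zipfile.ZIP_DEFLATED)
    return changed


if __name__ == "__main__":
    import sys
    count = fix_epub_doctype(sys.argv[1], sys.argv[2])
    print(f"Repaired DOCTYPE in {count} file(s)")
```

Run it with the original EPUB and a new output name as the two arguments, then validate the result again before loading it into iBooks.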
