Invalid characters in index

    • #105186

      I need urgent help, if possible.
      I’m working on a 480-page book that will appear in print and as ebooks. It’s been indexed — markers came in with Word files — and everything is showing up fine in the Index panel.
      The first time I attempted to generate the index, ID churned through the list, and then I got the “invalid characters” message. I went through the list and noted incorrect glyphs that had replaced the numerous diacriticals used in composer names throughout the book. I found the correct glyphs and replaced the incorrect ones. I also found instances of “<0092>” or similar syntax, which represents the Unicode value of the intended character. I fixed those by replacing the code with the correct glyph.
      The second time I attempted to generate the index, I got the “invalid characters” message again. I’ve gone through the Index list twice now, both Levels 1 and 2, and I can’t find anything else to fix.
      Like most projects, this one is down to the wire. Any advice would be helpful. I really need to be able to generate the index, place it on pages, and print it out for proofing.
      Thanks.

    • #105188
      Peter Kahrel
      Participant

      I had the same experience a few times, and it turned out to be curly quotes. Double ones for sure; I forget whether single ones were a problem too. After I replaced the curly quotes with a temporary character, the index generated fine. After generation, I changed the temporary character back to quotes in the generated index. Double quotes aren’t always a problem, which is the confusing part. Accented characters have never been a problem in my experience.
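
      For anyone who wants to try the swap-and-revert idea on exported topic text, here is a minimal Python sketch. It is not an InDesign script and does not touch the Index panel; the placeholder tokens and the function names `protect` and `restore` are hypothetical, chosen only for illustration, and the tokens are assumed not to occur anywhere in the real text.

```python
# Hypothetical placeholder tokens, assumed not to appear in the real topic text.
PLACEHOLDERS = {
    "\u201C": "@@LDQ@@",  # left double curly quote
    "\u201D": "@@RDQ@@",  # right double curly quote
    "\u2018": "@@LSQ@@",  # left single curly quote
    "\u2019": "@@RSQ@@",  # right single curly quote / apostrophe
}


def protect(text: str) -> str:
    """Swap each curly quote for its temporary token."""
    for quote, token in PLACEHOLDERS.items():
        text = text.replace(quote, token)
    return text


def restore(text: str) -> str:
    """Swap each temporary token back to the original curly quote."""
    for quote, token in PLACEHOLDERS.items():
        text = text.replace(token, quote)
    return text


topic = "\u201CEroica\u201D Symphony"
print(protect(topic))
print(restore(protect(topic)) == topic)
```

      Using a distinct token per quote character is what makes the revert step unambiguous; a single shared temporary character would lose the difference between opening and closing quotes.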

    • #105200

      Thanks, Peter. There weren’t many quote marks and apostrophes, but I did as you recommended and I’m still having issues. I’ve even gone through and substituted safe, temporary sequences for other suspicious characters, such as virgules, backslashes, colons, and exclamation points (composers!), with no success.

      I think the issue may lie in the fact that we began with RTF files that were exported from old PageMaker and InDesign files (c. 2000), from the first edition of this book. The author wanted to use some text from the first edition and a subsequent followup (2007), so it was necessary to port out those old files for him to work with in Word. Even though I used paragraph and character styles religiously even back then, I’m sure that between the different fonts and versions of fonts used, the inevitable cut-and-paste from online sources that authors do, and any other kind of cruft between PM/ID/Word, there’s no easy way to diagnose the issue. I think that the altered accented characters (Á for ?, for example) may have been a result of different glyphs in different fonts?

      The other reason for using the old files was that the two previous books had been indexed with PM and ID, and the index markers from those books came over, which was a pleasant surprise, but clearly, not that pleasant now.

      I have looked at the information about the plug-ins and scripts that are available for cleaning up an index, and I can’t determine which would be appropriate, and more importantly, straightforward for me to use. I’ve been so pleased with your scripts–especially the compose3 script that allows me to create accented characters in fonts that don’t have those glyphs, and the footnotes to endnotes script, which allowed me to convert 1200-some footnotes to endnotes in a very long book. If you have advice as to which script might resolve this index issue, I’d be happy to purchase it. My husband was the indexer, and all I’m trying to do now is to generate the index so we can see it on the pages and more easily proof and then clean it up.

      Thanks so much for your help.

    • #105201
      Peter Kahrel
      Participant

      Hard to say. Can you send me that file? Please use [email protected] and I’ll have a look.

      P.

    • #110014

      Peter, thank you for finding the invalid character, a small square, in the topic names. As you explained, it was Unicode character 11 (U+0011, a control character), which you can search for in the InDesign documents (in the GREP tab of Find/Change) using \x{0011}. The script you wrote found and deleted these characters in the book documents and also deleted the leading spaces that had created separate topics, AND it successfully reverted the codes I had added in the Index panel for what might have been other suspect characters when the index was generated.
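
      This is not Peter’s script, which ran inside InDesign. It is only a rough Python sketch of the same kind of check, for topic names that have been exported as plain text. The function names `find_control_chars` and `clean_topic` and the sample topics are made up for illustration. It flags any character in Unicode category Cc (control characters, which includes U+0011) and strips leading spaces.

```python
import unicodedata


def find_control_chars(topic: str) -> list[tuple[int, str]]:
    """Return (position, code point) for every control character in a topic name."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(topic)
        if unicodedata.category(ch) == "Cc"
    ]


def clean_topic(topic: str) -> str:
    """Remove control characters, then strip leading spaces.

    Note: category Cc also covers tabs and line breaks, so those are removed too.
    """
    kept = "".join(ch for ch in topic if unicodedata.category(ch) != "Cc")
    return kept.lstrip(" ")


# Hypothetical sample topics; the first has a leading space and a stray U+0011.
topics = [" Bart\u0011\u00f3k, B\u00e9la", "Dvo\u0159\u00e1k, Anton\u00edn"]

for topic in topics:
    hits = find_control_chars(topic)
    if hits:
        print(repr(topic), "contains", hits, "and cleans to", repr(clean_topic(topic)))
```

      Accented letters pass through untouched, because they belong to letter categories rather than Cc, which matches Peter’s observation that accented characters are not the problem.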

      The result was a clean, well-formed index list that I can work with in exporting the book to EPUB and MOBI now, with a minimum of intervention that is content-related at this point.

      So, so grateful. I hope I can provide someone with this level of assistance some day.

    • #110017

      How wonderful! Thank you Peter. (He has helped me out many times as well, just yesterday in fact.)

      You can learn more about Peter on his personal site, and download all of his wonderful InDesign scripts (please pay attention to the little Donate button at the bottom of many of these pages). https://www.kahrel.plus.com/

      And thank you Shawn for posting the follow-up! I’m curious, how did a small square end up in the topic names, do you know?

      AM

    • #110054

      Anne-Marie, I do not know. Because the content was exported from old PM and ID files, into which old Word files had been placed, I suspect something was present in the original Word files. I’m working on the EPUB and MOBI conversions now, so will try to find that answer when I’m clear of this project! : )
