Build a biblical reference index with GREP


This topic contains 4 replies, has 3 voices, and was last updated by  nick lamme 4 years, 12 months ago.

  • #53998

    nick lamme
    Member

    Hi All,

    I don't know how many of you have worked in publishing academic (or non-academic) biblical literature. If you have, then you have had to work with biblical references aplenty, and you may have been tasked with building not only a subject index and perhaps a name index, but (cringe and cry) also a biblical reference index. Normally, this might be built by the author or an indexer, but if your author can't, you can't hire a good indexer, and there's no intern to abuse, then you've got to do it. When I started learning about GREP, I jumped for joy, because I saw the potential. All biblical references are patterns. However, they can be irregular in nature. For example, you can have:

    Romans 3:15 or Ro. 3:15 or Rom. 3:15 or Ro. 3.15 or Rom. 3.1-5, 6,7, 18, etc. You get the point. Pair that with any combination of references separated by a semicolon and life just got more complicated.

    Without GREP, there are ways to tackle this. You can go through and apply a character style to every single reference individually, and then use a script like indexmatic (find it here: http://indesignsecrets.com/free) to populate an index. But that is a pain. You could also use the index feature of InDesign, but you have to use a workaround. You probably have to use a workaround anyway if you are already building more than one index (subject and author, for example). But there is a faster way with GREP. After hours of painful trial and error (thank you RegExr for providing an absolutely beautiful environment in which to write and dynamically test my expressions!!!! Check out http://www.gskinner.com/RegExr), here is the expression I have written:

    (\d+\s)?(\w+?\.?\s\d+[[:punct:]]\d+)(.\d+)?([,\s\d]+(.\d+)?)?

    It breaks down into the following parts:

    (\d+\s)? (\w+?\.?\s\d+[[:punct:]]\d+) (.\d+)? ([,\s\d]+(.\d+)?)?

    Part 1 accounts for books that begin with numbers (e.g. 2 Timothy), but not all will.

    Part 2 is the main expression. It accounts for the name of the book, a period (if there is one… some books are spelled out completely and others are too short to merit an abbreviation, like Job), a space, the chapter number, followed by punctuation (either a period or a colon), then the verse number.

    Part 3 takes care of a range of verses, so not Rom. 2:3 (covered in parts 1 and 2), but Rom. 2:3-8.

    Part 4 is written as flexibly as possible to account for anything that comes after the pattern covered by parts 1-3. For example, you could have a verse or range of verses followed by a comma, a space (or maybe no space), and another verse or another range of verses or even another chapter and verses, etc. For example: Rom. 2:3-8, 15 or Rom. 2:3-8, 3:6.
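For anyone who wants to try the four parts outside InDesign, here is a minimal Python sketch of the same expression. It is not the InDesign setup itself: Python's `re` module has no POSIX classes, so `[[:punct:]]` is narrowed to `[.:]` (an assumption based on the "period or colon" description above), and `find_refs` and the sample sentence are made-up names for illustration.

```python
import re

# Python port of the forum expression. Assumption: [[:punct:]] is replaced
# by [.:] because Python's re module does not support POSIX classes.
BIB_REF = re.compile(
    r"(\d+\s)?"                # Part 1: optional leading book number ("2 Timothy")
    r"(\w+?\.?\s\d+[.:]\d+)"   # Part 2: book, optional period, chapter, separator, verse
    r"(.\d+)?"                 # Part 3: optional range end ("-8")
    r"([,\s\d]+(.\d+)?)?"      # Part 4: optional trailing verses (", 15")
)

def find_refs(text):
    """Return every match, trimmed (Part 4 can swallow a trailing space)."""
    return [m.group(0).strip() for m in BIB_REF.finditer(text)]

sample = "See Rom. 2:3-8, 15; 2 Timothy 3:16 and Job 1:1."
for ref in find_refs(sample):
    print(ref)
```

Note the `.strip()`: because Part 4 accepts whitespace, a match in running text can end with a stray space, which is also worth watching for when applying a character style in InDesign.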

    With this, I run a Find/Change operation and apply a character style with no formatting (e.g. “bib style”) to all of my biblical references. I get to do it to all of them at the same time, as opposed to hunting them down one by one. I haven't tried, but I suspect that I could simply create a GREP style in the paragraph style and automatically apply this to everything. Then, with a great script called indexmatic, I pull all those references into one place and begin laying out the index.

    My guess is that maybe someone could hone this expression or has even come up with a better way of making an index. This is the way I do it on a regular basis. I at least hope that someone can benefit from it, or at least modify the expression to fit their particular need. Thanks for letting me share.

  • #54005

    Tournier
    Member

    Hello

    This is a very good regex, and I am sure I will need it one day (but for French biblical references, sometimes with Roman numerals: I Cor., XV, 14-17 ; II Rois, XVIII, 2 [second Roman numerals in small caps]).

    For the index, without using a character style, I use a JavaScript by Peter Kahrel: Chain_GREP_queries (download here). Using Test mode, “the script collects all instances matched by the Find What expressions in all selected queries and lists these matches in a new document.” To have a look at the result, you can see my blog, indigrep, here.

    Afterwards, you can build the index with another script such as IndexBrutal (Marc Autret) or one by P. Kahrel (here).

    Best and thanks

  • #54010

    nick lamme
    Member

    Wow, Tournier,

    I feel silly. I didn't even take Roman numerals into account. It shouldn't be too hard to add. And in English, if I were working with older texts, it would be an issue. And thank you for the Chain_GREP_queries script. I was unaware of it, but I am going to try it out the next time. As we say here in Costa Rica, pura vida.
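As a rough sketch of what "adding Roman numerals" might look like, the Python port below lets Part 1 accept either digits or a run of I, V and X. This is a hypothetical variant, not something tested in InDesign: `[[:punct:]]` is again narrowed to `[.:]` for Python, the name `ROMAN_REF` is made up, and it only covers a leading book number (II Cor. 5:17), not the French form with Roman chapter numbers (I Cor., XV, 14-17).

```python
import re

# Hypothetical variant: Part 1 accepts digits OR a Roman numeral made of I, V, X.
# Assumption: [[:punct:]] is replaced by [.:] for Python's re module.
ROMAN_REF = re.compile(
    r"((?:\d+|[IVX]+)\s)?"     # Part 1: "2 " or "II "
    r"(\w+?\.?\s\d+[.:]\d+)"   # Part 2: book, chapter, separator, verse
    r"(.\d+)?"                 # Part 3: optional range end
    r"([,\s\d]+(.\d+)?)*"      # Part 4: any number of trailing verses
)

for ref in ["II Cor. 5:17", "1 Cor. 5:17", "I Kings 18:2-4"]:
    print(ROMAN_REF.match(ref).group(0))
```

One caveat: a capital "I" directly before a word and a chapter:verse pair would also be picked up as a book number, so the results would still need a quick check by eye.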

  • #54011

    Jongware
    Member

    Nick,

    ([,\s\d]+(.\d+)?)?

    will match “, 12.3” but stop after the first comma (and the number after that). “?” is zero or one time.

    Change to

    ([,\s\d]+(.\d+)?)*

    to accommodate for a verse list of any length; “*” stands for zero or more times.
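The difference is easiest to see with a list that contains more than one range. The Python sketch below compares the two endings on the same reference; as before, `[[:punct:]]` is narrowed to `[.:]` because Python's `re` has no POSIX classes, and the names `HEAD`, `TAIL`, `once` and `many` are made up for illustration.

```python
import re

# Parts 1-3 of the expression, ported to Python. Assumption: [.:] stands in
# for InDesign's [[:punct:]].
HEAD = r"(\d+\s)?(\w+?\.?\s\d+[.:]\d+)(.\d+)?"
# Part 4 without its quantifier.
TAIL = r"([,\s\d]+(.\d+)?)"

once = re.compile(HEAD + TAIL + "?")  # original: Part 4 at most one time
many = re.compile(HEAD + TAIL + "*")  # suggested fix: Part 4 zero or more times

ref = "Rom. 2:3-8, 10-12, 15-17"
print(once.match(ref).group(0))  # Rom. 2:3-8, 10-12
print(many.match(ref).group(0))  # Rom. 2:3-8, 10-12, 15-17
```

A plain list such as ", 6,7, 18" is already covered by the character class on its own; the quantifier only matters once a second range (or a second chapter and verse) follows.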

  • #54018

    nick lamme
    Member

    Jongware, thank you. That worked great.
