Build a biblical reference index with GREP

nick lamme

This topic has 4 replies, 3 voices, and was last updated 14 years, 5 months ago by nick lamme.


Author

Posts
- November 26, 2009 at 1:38 am #53998
  
  nick lamme
  Member
  
  Hi All,
  
  I don't know how many of you have worked in publishing academic (or non-academic) biblical literature. If you have, then you have had to work with biblical references aplenty. You may also have been tasked with building not only a subject index and perhaps a name index, but (cringe) also a biblical reference index. Normally this would be built by the author or an indexer, but if your author can't, you can't hire a good indexer, and there's no intern to abuse, then you've got to do it. When I started learning about GREP, I jumped for joy, because I saw the potential. All biblical references are patterns. However, they can be irregular in nature. For example, you can have:
  
  Romans 3:15 or Ro. 3:15 or Rom. 3:15 or Ro. 3.15 or Rom. 3.1-5, 6,7, 18, etc. You get the point. Pair that with any combination of references separated by semicolons and life just got more complicated.
  
  Without GREP, there are ways to tackle this. You can go through and apply a character style to every single reference individually, and then use a script like IndexMatic (find it here: https://creativepro.com/free) to populate an index. But that is a pain. You could also use the index feature of InDesign, but you have to use a workaround. You probably have to use a workaround anyway if you are already building more than one index (subject and author, for example). But there is a faster way with GREP. After hours of painful trial and error (thank you RegExr for providing an absolutely beautiful environment in which to write and dynamically test my expressions! Check out https://www.gskinner.com/RegExr), here is the expression I have written:
  
  (\d+\s)?(\w+?\.?\s\d+[[:punct:]]\d+)(.\d+)?([,\s\d]+(.\d+)?)?
  
  It breaks down into the following parts:
  
  (\d+\s)? (\w+?\.?\s\d+[[:punct:]]\d+) (.\d+)? ([,\s\d]+(.\d+)?)?
  
  Part 1 accounts for books that begin with numbers (e.g. 2 Timothy), but not all will.
  
  Part 2 is the main expression. It accounts for the name of the book, a period (if there is one; some books are spelled out completely and others, like Job, are too short to merit an abbreviation), a space, the chapter number, punctuation (either a period or a colon), then the verse number.
  
  Part 3 takes care of a range of verses, so not Rom. 2:3 (covered in parts 1 and 2), but Rom. 2:3-8.
  
  Part 4 is written as flexibly as possible to account for anything that comes after the pattern covered by parts 1-3. For example, you could have a verse or range of verses followed by a comma, a space (or maybe no space), and another verse, another range of verses, or even another chapter and verses. For example: Rom. 2:3-8, 15 or Rom. 2:3-8, 3:6.
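
To make the four parts concrete, here is a minimal sketch of the same expression in Python rather than InDesign GREP. It rests on a few assumptions: Python's `re` module has no `[[:punct:]]` class, so `[.:]` stands in for the chapter/verse separator; the escapes are reconstructed from the description of each part; and `find_references` is a hypothetical helper name. Part 4 tends to swallow trailing spaces and commas, so the helper trims them.

```python
import re

# Assumption: [.:] replaces [[:punct:]], which Python's re does not support.
REFERENCE = re.compile(
    r"(\d+\s)?"               # Part 1: optional leading book number, e.g. "2 "
    r"(\w+?\.?\s\d+[.:]\d+)"  # Part 2: book, optional period, chapter, separator, verse
    r"(.\d+)?"                # Part 3: optional end of a verse range, e.g. "-8"
    r"([,\s\d]+(.\d+)?)?"     # Part 4: optional trailing verses
)

def find_references(text):
    """Return every match, with trailing commas and whitespace trimmed."""
    return [m.group(0).rstrip(", \t\n") for m in REFERENCE.finditer(text)]

sample = "See Rom. 2:3-8 and 2 Timothy 3:16 for details."
print(find_references(sample))
# ['Rom. 2:3-8', '2 Timothy 3:16']
```

One limitation worth knowing: because Part 4 accepts digits and spaces, a numbered book that directly follows a comma (as in "Rom. 2:3, 2 Tim. 3:16") can lose its leading number to the previous match.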
  
  With this, I then run a Find/Change operation and apply a character style with no formatting (e.g. “bib style”) to all of my biblical references. I just get to do it to all of them at the same time, as opposed to hunting them down one by one. I haven't tried it, but I suspect that I could simply create a GREP style in the paragraph style and have it applied to everything automatically. Then, with a great script called IndexMatic, I pull all those references into one place and begin laying out the index.
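
The "pull all those references into one place" step can be pictured outside InDesign with a rough sketch like the one below. This is not how IndexMatic works internally; it only illustrates the idea of collecting every match together with the page it came from. The `pages` mapping and the `build_index` name are hypothetical, and `[.:]` again stands in for `[[:punct:]]`, which Python's `re` lacks.

```python
import re
from collections import defaultdict

# Assumption: same pattern as in the post, with [.:] in place of [[:punct:]].
REFERENCE = re.compile(r"(\d+\s)?(\w+?\.?\s\d+[.:]\d+)(.\d+)?([,\s\d]+(.\d+)?)?")

def build_index(pages):
    """Map each matched reference to the sorted page numbers it appears on.

    `pages` is a hypothetical {page_number: page_text} mapping.
    """
    index = defaultdict(set)
    for page_number, text in pages.items():
        for match in REFERENCE.finditer(text):
            reference = match.group(0).rstrip(", \t\n")
            index[reference].add(page_number)
    # Sort entries alphabetically; a real index would sort in canonical book order.
    return {ref: sorted(nums) for ref, nums in sorted(index.items())}

pages = {
    12: "Paul argues in Rom. 3:15 that ...",
    47: "Compare Rom. 3:15 with Job 3:1.",
}
for reference, page_numbers in build_index(pages).items():
    print(reference, page_numbers)
# Job 3:1 [47]
# Rom. 3:15 [12, 47]
```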
  
  My guess is that maybe someone could hone this expression or has even come up with a better way of making an index. This is the way I do it on a regular basis. I at least hope that someone can benefit from it, or at least modify the expression to fit their particular need. Thanks for letting me share.
- November 26, 2009 at 7:35 am #54005
  
  LAURENT TOURNIER
  Member
  
  Hello
  
  This is a very good regex, and I am sure I will need it one day (but for French biblical references, sometimes with Roman numerals: I Cor., XV, 14-17 ; II Rois, XVIII, 2 [second Roman numerals in small caps]).
  
  For indexing without a character style, I use one of Peter Kahrel's JavaScripts: Chain_GREP_queries (download here). In Test mode, “the script collects all instances matched by the Find What expressions in all selected queries and lists these matches in a new document.” To see what the result looks like, have a look at my blog, indigrep, here.
  
  Afterwards, you can build the index with another script, such as IndexBrutal (Marc Autret) or one of P. Kahrel's (here).
  
  Best and thanks
- November 26, 2009 at 1:53 pm #54010
  
  nick lamme
  Member
  
  Wow, Tournier,
  
  I feel silly. I didn't even take Roman numerals into account. It shouldn't be too hard to add. And in English, if I were working with older texts, it would be an issue. And thank you for the Chain_GREP_queries script. I was unaware of it, but I am going to try it out the next time. As we say here in Costa Rica, pura vida.
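
As a rough idea of what "adding Roman numerals" could look like for English references such as "II Timothy 3:16", the sketch below widens Part 1 to accept either an Arabic number or a numeral built from I, V, and X. This extension is an assumption, not something tested in InDesign, and it does not cover the French form with a Roman chapter number ("I Cor., XV, 14-17"). As before, `[.:]` stands in for `[[:punct:]]` because Python's `re` has no POSIX classes.

```python
import re

# Hypothetical extension of Part 1: an Arabic number OR a Roman numeral.
REFERENCE = re.compile(
    r"((?:\d+|[IVX]+)\s)?"    # Part 1: "2 " or "II "
    r"(\w+?\.?\s\d+[.:]\d+)"  # Part 2: book, chapter, separator, verse
    r"(.\d+)?"                # Part 3: optional end of a verse range
    r"([,\s\d]+(.\d+)?)?"     # Part 4: optional trailing verses
)

text = "See II Timothy 3:16 and I Cor. 15:14-17."
refs = [m.group(0).rstrip(", \t\n") for m in REFERENCE.finditer(text)]
print(refs)
# ['II Timothy 3:16', 'I Cor. 15:14-17']
```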
- November 26, 2009 at 2:10 pm #54011
  
  Theunis De Jong
  Member
  
  Nick,
  
  ([,\s\d]+(.\d+)?)?
  
  will match “, 12.3” but stop after the first comma (and the number after that). “?” is zero or one time.
  
  Change to
  
  ([,\s\d]+(.\d+)?)*
  
  to accommodate a verse list of any length; “*” stands for zero or more times.
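
The difference between the two quantifiers shows up most clearly when a list contains more than one range. Here is a small sketch in Python, assuming `[.:]` in place of `[[:punct:]]` (which Python's `re` lacks); the `HEAD` and `TAIL` names are only for illustration.

```python
import re

HEAD = r"(\d+\s)?(\w+?\.?\s\d+[.:]\d+)(.\d+)?"  # Parts 1-3
TAIL = r"([,\s\d]+(.\d+)?)"                     # Part 4 without its quantifier

once = re.compile(HEAD + TAIL + "?")  # original: tail matched zero or one time
many = re.compile(HEAD + TAIL + "*")  # suggested: tail matched zero or more times

text = "Rom. 2:3-8, 10-12, 15-17"
print(once.match(text).group(0))  # Rom. 2:3-8, 10-12
print(many.match(text).group(0))  # Rom. 2:3-8, 10-12, 15-17
```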
- November 26, 2009 at 4:46 pm #54018
  
  nick lamme
  Member
  
  Jongware, thank you. That worked great.