Find All Headings Based on a Pattern
Scott wrote:
Is there a script or GREP code to locate a single line paragraph in a document?
After some back and forth email with Scott, I determined that what he meant was that he wanted to search to find anything that looked like a heading — a single sentence paragraph, but without punctuation. This is an important distinction because GREP cannot find “a single line” — that is, it can’t see where line breaks are, so it can’t see if a sentence fits on one line or breaks across two.
But finding a single sentence paragraph like a heading is not difficult at all, though the code looks weird to the beginner:
^[^.!?]+$
Let’s break this down a bit:
- The caret means “the pattern begins at the beginning of the paragraph.”
- The stuff inside the brackets means that it’s going to be “any one of these” (a choice of any one)
- The caret inside the bracket means “not” — in other words, it’s “not any of these”
- The plus means one or more of the previous thing (“one or more characters that are not a dot, exclamation point, or question mark”)
- And finally the dollar sign means the pattern ends at the end of the paragraph
So it matches any whole paragraph that does not contain one of these punctuation marks. If you want to exclude other characters, just add them to that list inside the brackets.
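If you want to sanity-check the pattern outside InDesign, here is a rough sketch in Python. It assumes each string in the list stands in for one paragraph. The names find_headings and sample are made up for illustration, and Python's re module is not InDesign's GREP engine, so treat this as an approximation rather than a guarantee of what InDesign will find.

```python
import re

# Same pattern as in the post. Assumption: each string below is one paragraph,
# so ^ and $ line up with the start and end of that paragraph.
HEADING_PATTERN = re.compile(r"^[^.!?]+$")

def find_headings(paragraphs):
    """Return the paragraphs that contain no period, exclamation point, or question mark."""
    return [p for p in paragraphs if HEADING_PATTERN.search(p)]

sample = [
    "Getting Started",
    "This is body text. It has two sentences.",
    "Why use GREP?",
    "Installing the fonts",
]

headings = find_headings(sample)
print(headings)
```

Note that "Why use GREP?" is skipped: it reads like a heading, but the question mark rules it out.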
Scott wrote back and said that someone else gave him this code, and that it worked okay, too:
^.{1,20}$
But that GREP code simply means “any paragraph that contains between 1 and 20 characters.” So that might find a short paragraph that you didn’t intend.
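To see how a length-only test can misfire, here is a small Python sketch. It assumes each string is one paragraph, and the sample text is invented for illustration. Python's re module stands in for InDesign's GREP here, so the behavior is only approximate.

```python
import re

# The length-based pattern: any paragraph of 1 to 20 characters.
SHORT_PATTERN = re.compile(r"^.{1,20}$")

sample = [
    "Getting Started",                               # short heading
    "It works.",                                     # short body sentence
    "A Much Longer Heading About Paragraph Styles",  # heading over 20 characters
]

short_matches = [p for p in sample if SHORT_PATTERN.search(p)]
print(short_matches)
```

The short body sentence is caught, and the long heading is missed, because the pattern looks only at length.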
Granted, no code will be perfect for all situations. If your headings do, in fact, have punctuation in them, then my code won’t work either. But it’s a start!
Hi David,
Nice post but like you said, if the heading does have punctuation your GREP won’t work.
I think this is because you are using both the ^ and $ symbols. So you’re basically saying that the entire paragraph (it’s the entire paragraph because you are specifying both start and endpoint) cannot contain . or ! or ?
If you were to use this one you’d have more luck:
.+\w[^.?!]$
So here I am saying:
.+ find a series of characters (any characters so punctuation can be included if available)
\w ending with one word character (an uppercase or lowercase letter, a digit, or an underscore). I am including this one because a headline without punctuation will end in a word character, right? (You could even add the code for an extra space, if present.)
[^.?!] followed by one final character that is not a period, question mark, or exclamation point
$ at the end of the paragraph
Try it, should work.
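For anyone who wants to try this variant outside InDesign, here is a rough Python sketch. It assumes each string is one paragraph, and the sample text is invented. Python's re module is only a stand-in for InDesign's GREP engine.

```python
import re

# The variant from this comment: punctuation is allowed inside the paragraph,
# but the last character must not be . ? or ! and the one before it must be
# a word character.
VARIANT_PATTERN = re.compile(r".+\w[^.?!]$")

sample = [
    "Dr. Smith's Guide",                         # punctuation inside, none at the end
    "This is body text. It has two sentences.",  # ends with a period
    "Why use GREP?",                             # ends with a question mark
    "Getting Started",
]

variant_matches = [p for p in sample if VARIANT_PATTERN.search(p)]
print(variant_matches)
```

A heading that ends with punctuation, such as "Why use GREP?", is still skipped. Since the pattern needs at least three characters, a two-character paragraph would not match either.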
Nice idea.
Often you can even just search for more than one paragraph return in a row, as writers tend to mark “this is a heading” with extra paragraph returns in Word.
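A rough Python sketch of that idea, assuming "\r" stands in for a paragraph return and using invented sample text:

```python
import re

# Assumption: "\r" marks a paragraph return, and headings were set off
# by an extra empty paragraph before them.
text = (
    "Intro text.\r\r"
    "Getting Started\r"
    "Body text here.\r\r"
    "Next Steps\r"
    "More body text."
)

# A paragraph that follows two or more returns in a row is a heading candidate.
candidates = re.findall(r"\r{2,}([^\r]+)", text)
print(candidates)
```

This only helps if the writer was consistent about adding the extra returns.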
Hi All,
I have an InDesign document to which I have applied a TOC. Its pages have headings and subheadings. I am looking for a GREP formula to convert subheadings into headings.
Example:
Main Heading: Signal Relays…………………. 14
Sub headings
IMF ……………………………………….. 14
P2 …………………………………………14
FX2 …………………………………………14
FT2 / FU2 ………………………………….14
D2N V23105 ………………………………….15
MT2 …………………………………………15
P1 V23026 ………………………………….15
Reed DIP / SIL ………………………………….16
Cradle …………………………………………16
TSC …………………………………………16
OUAZ / T81 ………………………………….16
I want the main heading’s page number to be shown as the range 14-16.
Your response will be highly appreciated.
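To make the requested result concrete, here is a hedged Python sketch that works on the TOC text outside InDesign. It assumes every entry is a title, then dot leaders, then a page number on one line. The function name page_range and the shortened leaders in the sample are illustrative only, and this is not an InDesign GREP solution.

```python
import re

# Assumption: title, then a run of periods and/or ellipsis characters, then the page.
ENTRY = re.compile(r"^(.+?)\s*[.…]+\s*(\d+)$")

def page_range(entries):
    """Return 'first-last' (or a single page number) for the given TOC lines."""
    pages = []
    for line in entries:
        m = ENTRY.match(line.strip())
        if m:
            pages.append(int(m.group(2)))
    if not pages:
        return ""
    first, last = min(pages), max(pages)
    return str(first) if first == last else f"{first}-{last}"

# A few of the subheading entries from the example, with the leaders shortened.
subheadings = [
    "IMF ….. 14",
    "FT2 / FU2 ….14",
    "D2N V23105 ….15",
    "OUAZ / T81 ….16",
]

result = page_range(subheadings)
print(result)
```

The smallest and largest page numbers among the subheadings give the range for the main heading.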