Swap Names Around with GREP and Find/Change
Joe wrote:
I have about two hundred text boxes which are captions of student pictures produced by Bob Stucky’s contact sheet script. The captions are actually the file name in this format: LastName_FirstName.jpg. I need to change to FirstName (space) LastName (no file extension).
Sounds like a job for super GREP! Try this: Open the Find/Change dialog box and switch to the GREP tab (this is in CS3 and later). Type this into the Find what field:
^(\w+)_(\w+)\.jpg
Type this into the Change to field:
$2 $1
Now click Change All. That should do it!
Here’s how to decode the grep codes:
- ^ means “beginning of paragraph”
- (\w+) means “remember what the first group of one or more word characters are”
- _ is literal — that is, it’s just looking for an underscore character
- \. means a period/dot (you have to “escape” that character with the backslash before it)
- jpg is also just literal
- $2 $1 means “take the second word group you found (the stuff in parentheses), then type a space, then type the first word group you found”
There, that wasn’t so hard, was it?
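If you want to try the pattern outside InDesign first, here is a minimal Python sketch of the same swap. It assumes Python's re module treats this simple pattern the same way InDesign's GREP tab does. Two differences to watch: Python writes the replacement as \2 \1 instead of $2 $1, and it needs re.MULTILINE before ^ will match at the start of every line. The name swap_name is made up for this example.

```python
import re

# Same pattern as the Find what field; MULTILINE lets ^ match at each line start.
PATTERN = re.compile(r"^(\w+)_(\w+)\.jpg", re.MULTILINE)

def swap_name(caption: str) -> str:
    """Turn 'LastName_FirstName.jpg' into 'FirstName LastName'."""
    # Python uses \2 and \1 where InDesign's Change to field uses $2 and $1.
    return PATTERN.sub(r"\2 \1", caption)

print(swap_name("Smith_John.jpg"))
```

Captions that don't fit the pattern come back unchanged, which is also what Change All does with text it can't match.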
For Mac OS X users:
If you need to list all the files in a Finder folder:
- Select the files
- Edit > Copy or Cmd-C
- Open the app TextEdit
- Cmd-Shift-T to switch to plain text formatting, or Format > Make Plain Text
- Edit > Paste or Cmd-V: the list of file names appears.
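A scripted alternative to the copy-and-paste steps, as a rough sketch only: the Python below prints the file names in a folder, one per line. The function name list_file_names and the "." placeholder folder are assumptions for illustration; point it at the real picture folder.

```python
from pathlib import Path

def list_file_names(folder: str) -> list[str]:
    """Return the names of the files (not subfolders) in a folder, sorted."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())

if __name__ == "__main__":
    # "." (the current folder) is only a placeholder path.
    print("\n".join(list_file_names(".")))
```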
Eugene emailed me offlist with a very good point: What if the names include non-word characters (characters that the \w won’t pick up), such as “Joe O’Hara” or “Jason Briggs-Jones”? In that case, perhaps the find code should be:
^(.+)_(.+)\.jpg
or
^([-'\w]+)_([-'\w]+)\.jpg
(The first uses the dot instead of \w, which lets InDesign find any character at all. Seems kind of extreme to me, so the second one limits it to just hyphens, apostrophes, and word characters. But both seem to work.)
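To see the difference between the three find codes, here is a small Python sketch that runs each one against two test captions. It assumes Python's re module is close enough to InDesign's GREP for these patterns; the captions, the labels, and the swap helper are made up for the test, and the apostrophe is a plain straight one.

```python
import re

# The original find code plus the two alternatives suggested above.
PATTERNS = {
    "word chars only": r"^(\w+)_(\w+)\.jpg",
    "any character": r"^(.+)_(.+)\.jpg",
    "hyphen, apostrophe, word": r"^([-'\w]+)_([-'\w]+)\.jpg",
}

def swap(pattern: str, caption: str) -> str:
    """Apply one find code; captions that don't match come back unchanged."""
    return re.sub(pattern, r"\2 \1", caption)

for label, pattern in PATTERNS.items():
    for caption in ("O'Hara_Joe.jpg", "Briggs-Jones_Jason.jpg"):
        print(f"{label:26} {caption:24} -> {swap(pattern, caption)}")
```

In this sketch the \w version leaves both captions untouched, while the other two swap them.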
You can also use:
^(\S+)_(\S+)\.jpg
\S matches any character except whitespace (spaces, tabs, and so on).
Well now that the morning has passed and I’ve thought about it, filenames wouldn’t usually have apostrophes, but could have hyphens.
So anyway, I like the second one there David, it’s a good one.
I don’t know how to use the [[:punct:]] POSIX class in GREP, so I didn’t really know how to search for punctuation.
I can’t get the second one to work though? Not sure why.
This one worked for “O’Hare_John-Joe.jpg”
^(\w+[-'\w]\w+)_(\w+[-']\w+)\.jpg
But I guess you can’t get all possible variables for names in there – can you?
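A quick way to probe that question is to test the pattern against a plainer name as well. The Python sketch below assumes re behaves like InDesign's GREP here; the matches helper and the second caption are made up, and the apostrophe is a straight one.

```python
import re

# The pattern from the comment above, unchanged.
PATTERN = re.compile(r"^(\w+[-'\w]\w+)_(\w+[-']\w+)\.jpg")

def matches(caption: str) -> bool:
    """True if the caption fits the pattern from its first character."""
    return PATTERN.match(caption) is not None

for caption in ("O'Hare_John-Joe.jpg", "Smith_John.jpg"):
    print(caption, matches(caption))
```

The second group insists on a hyphen or apostrophe inside the first name, so in this sketch a plain "Smith_John.jpg" is not matched. A pattern tailored that closely to one name tends to miss the ordinary ones.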
Something’s wrong: the antislash disappeared.
The correct regex is:
^(\S+)_(\S+)\.jpg
[[:punct:]] matches any punctuation:
^([[:punct:]\w]+)_([[:punct:]\w]+)\.jpg
The first regex is better.
I’m not sure why the backslash (or “antislash”) is not working sometimes in these comments. It looks like you need to type it two times in a row to make it appear. I guess our CMS strips it out if you only type it once. I’ve edited your posts, Laurent.
What about names like “de la Vega” or “Wilcox III”? Since the underscore character is so rare, I’d go with this:
^(.+)_(.+)\.jpg
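Here is a rough Python check of how names with spaces fare, comparing this pattern with the \S version from earlier in the comments. The file names are guesses at what such captions might look like, and the sketch assumes Python's re module matches these patterns the way InDesign does.

```python
import re

# Anything, an underscore, anything, then ".jpg".
ANY_CHARS = re.compile(r"^(.+)_(.+)\.jpg")
# The non-space variant, for comparison.
NON_SPACE = re.compile(r"^(\S+)_(\S+)\.jpg")

def swap(pattern: re.Pattern, caption: str) -> str:
    """Swap the two groups; captions that don't match come back unchanged."""
    return pattern.sub(r"\2 \1", caption)

for caption in ("de la Vega_Diego.jpg", "Wilcox III_Bob.jpg"):
    print(swap(ANY_CHARS, caption), "|", swap(NON_SPACE, caption))
```

In this sketch the dot version swaps both names, while the \S version leaves them as they were because of the spaces in the last name.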
Hi Bob, that’s what I said in post 2, I emailed David about the hole in the grep code for names.
I pretty much thought the code was solid, but David thinks it’s extreme.
Seems to be a hot topic, though.
I’d go for this GREP (there’s much less backtracking):
^([^_]+)_(.+)\.jpg
I’d give «(•).(•)» a go, but that’s just because it looks way cuter than anything you guys came up with.
On a more serious note: if you work on a copy to run the GREP, there’s no reason not to use as rigorous a replace as possible. I wouldn’t want to end up with a name screwed up because I was too cautious.