Massaging Text with InDesign (Getting the Text You Want from the Text You Have)
Sometimes the text I need is hidden inside a bunch of other text. For example, here’s a web page that lists, among many other things, a bunch of emails I need:
The blue column in the middle is the list of emails. I’m sure there are web utilities that let you grab just one column, and there’s that cool trick of Option/Alt-dragging in MS Word to get a column… but I want to show you how you can use InDesign to pull text out of a bunch of data like this.
I copied and pasted the whole page into InDesign, and you can see that it’s a mess:
But I also notice patterns! In this case, we need to remove all the lines that have just the user name followed by a bunch of tabs. So I whip up this quick grep:
That looks for a word followed by four tabs, followed by a return, followed by the same (duplicate) word. It then replaces it with the word itself. Obviously, your grep would be different, depending on what kind of pattern you find.
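For readers who think in regular expressions, the same idea can be sketched outside InDesign. This is a rough Python equivalent, not the exact query from the screenshot: the pattern, the sample text, and the function name `collapse_duplicates` are all assumptions made up for illustration. (In InDesign's own GREP dialog, the return would be written `\r` and the replacement backreference `$1`.)

```python
import re

# Hypothetical sample: each record starts with a line holding only the
# user name and four tabs, followed by the real data line.
pasted = (
    "jdoe\t\t\t\t\n"
    "jdoe\tJane Doe\tjdoe@example.com\tEditor\n"
    "bsmith\t\t\t\t\n"
    "bsmith\tBob Smith\tbsmith@example.com\tDesigner\n"
)

# A word, four tabs, a line break, then the same word again (\1).
# The trailing \b keeps "jo" from matching the start of "john".
DUPLICATE_LINE = re.compile(r"(\w+)\t{4}\n\1\b")


def collapse_duplicates(text: str) -> str:
    """Replace each 'name + tabs + return + name' run with just the name."""
    return DUPLICATE_LINE.sub(r"\1", text)


cleaned = collapse_duplicates(pasted)
print(cleaned)
```

As in the post, the pattern only fits this particular mess; different pasted data would need a different expression.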
I click Change All, and end up with a nice clean list, with one email on each line:
In order to grab just the email addresses, I need to put them in a single column. So I select all the text and choose Table > Convert Text to Table:
The result is a nice, orderly table:
Now it’s easy to select all the columns I don’t need and delete them, leaving me with a single-column table. Choose Table > Convert Table to Text and I end up with what I wanted all along:
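The table round trip amounts to "split every line on tabs, keep one field." A minimal Python sketch of that idea follows, assuming tab-separated text and a made-up helper name `extract_column`; the sample rows and the column index are hypothetical, not taken from the screenshots.

```python
def extract_column(text: str, index: int) -> list[str]:
    """Return the tab-separated field at `index` from every non-empty line."""
    column = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if index < len(fields):  # ignore lines that are too short
            column.append(fields[index])
    return column


# Hypothetical cleaned-up text, one record per line.
rows = (
    "jdoe\tJane Doe\tjdoe@example.com\tEditor\n"
    "bsmith\tBob Smith\tbsmith@example.com\tDesigner\n"
)

emails = extract_column(rows, 2)
print("\n".join(emails))
```

The InDesign table does the same job visually, which is handy when you are not sure in advance which column you want.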
Sounds like a lot of work, but the point of this post is: InDesign can do all kinds of text massage/processing that many people leave to text editors or word processors. InDesign is where I’m fastest and most comfortable, so here’s where I do it!
Why are the emails blurred in the Web screenshot if they are then visible in the text frame? ;-)
Good catch, Branislav. It is because I could not edit the text on the web page, but I did change all the personal information in the other screen shots to protect people.
Love it. I do this sort of thing all the time with HTML code to help build large tables of data or even building XML files for InDesign or online purposes. InDesign + GREP = <3.
Hello,
Can I propose another method? Using Peter Kahrel’s script chain_grep_queries, it is easy to select the text you want. As you know, with Test mode “the script collects all instances matched by the Find What expressions in all selected queries and lists these matches in a new document.” In your example, you just had to write a regex to find e-mails.
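To illustrate the "just write a regex to find e-mails" idea without the script, here is a minimal Python sketch. The pattern is a deliberately loose assumption, not a full address validator and not the expression chain_grep_queries would use; the sample text and the name `find_emails` are hypothetical.

```python
import re

# Loose pattern: local part, "@", then a domain with at least one dot.
# Good enough for harvesting, not for validating addresses.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_emails(text: str) -> list[str]:
    """Collect every e-mail-like match, in order of appearance."""
    return EMAIL.findall(text)


# Hypothetical pasted text, still in its messy state.
messy = (
    "jdoe\t\t\t\t\n"
    "jdoe\tJane Doe\tjdoe@example.com\tEditor\n"
    "bsmith\t\t\t\t\n"
    "bsmith\tBob Smith\tbsmith@example.com\tDesigner\n"
)

print("\n".join(find_emails(messy)))
```

Matching the addresses directly skips the clean-up and table steps, at the cost of relying on a pattern that may miss unusual addresses.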
In my opinion, a very interesting plugin you’ve described here
I agree fully: InDesign is marvellous for many things. Please allow me a little joke: I like InDesign also because its frames have comfortable handles! ;)
Broaden the frame of the 2nd picture until every paragraph fits nicely into one line.
Double-click into the frame (Cursor becomes active).
Search Text for ^t^t^t^t^p
Change Text to ^t
Edit > Select all.
Table > Convert Text to Table
Select the rows you don’t need and delete them.
Because: At least this text was not such a mess – it has a nice pattern just from the beginning!
Hi, guys :)
A little bit off-topic, but since we have once again come to the mighty powers of The GREP :), what GREP-related (ID-specific) learning resources would you recommend for a novice?
@Anton: Check out the resources here on our grep page:
https://creativepro.com/grep