Using GREP on specific parts of speech, etc

    • #91252

      I’ve never used GREP but I want to get into it. I wonder if it’s possible to use GREP to automate the process of mimicking a 17th-century style document, complete with long s-es, capitalized and italicized nouns, etc. Here’s an example of what I’m looking to do:

      https://goo.gl/hlwDWB

      One of the problems is turning the s-es into long s-es, excepting those that appear at the end of a word or before an apostrophe mid-word.
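
      A minimal sketch of that rule, written in Python rather than InDesign's GREP dialect (the pattern syntax there may differ). The function name `to_long_s` is made up for illustration, and the rule is only the simplified one stated above: a lowercase s becomes a long s whenever another letter follows it directly, which leaves word-final s and s before an apostrophe alone.

```python
import re

# A lowercase "s" that is immediately followed by a letter.
# [^\W\d_] matches any letter (a word character that is not a digit or "_"),
# so an s at the end of a word, or before an apostrophe or other
# punctuation, is not matched.
MEDIAL_S = re.compile(r"s(?=[^\W\d_])")

def to_long_s(text: str) -> str:
    """Replace medial lowercase s with the long s (U+017F)."""
    return MEDIAL_S.sub("\u017f", text)

if __name__ == "__main__":
    print(to_long_s("Thus possess wisdom's blessings"))
```

      Capital S is left untouched, and period-specific exceptions are not handled, so the result still needs proofreading.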

      Another is italicizing all the nouns, excepting those that have a small caps style applied.

      And actually, one other thing I need to do is change all instances of “(space):” to “(nonbreaking space):” – since these old docs tend to put a space on either side of a colon, yet you don’t want the line to break at the space before. Easy enough to do a find and replace but even better if I can automate it in the paragraph style.
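
      The find-and-replace version of that can be sketched as below, again in Python rather than InDesign (where the replacement would use InDesign's own code for a nonbreaking space). `protect_colons` is a hypothetical name; the lookahead swaps only an ordinary space that sits directly before a colon.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

def protect_colons(text: str) -> str:
    """Turn an ordinary space that directly precedes a colon into a no-break space."""
    # The lookahead (?=:) checks for the colon without consuming it,
    # so only the space itself is replaced.
    return re.sub(r" (?=:)", NBSP, text)

if __name__ == "__main__":
    sample = "Of the Stone : its Nature"
    print(repr(protect_colons(sample)))
```

      The space after the colon is left as a normal, breakable space.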

      I don’t know that it’s possible to tell InDesign to isolate parts of speech, as if it could always reliably distinguish between nouns and verbs etc, but I think the long S thing could at least be done with GREP.

      Part of what I love about old texts like this is the quirkiness and arbitrariness of it, like you can imagine the typesetter laboring over it the old-fashioned way, line by line…so I know it’s a tough look to automate, but obviously the less I have to do by hand, the better.

      Thanks to anybody who can help me with this…

    • #91258
      David Blatner
      Keymaster

      GREP cannot really see parts of speech; that’s too much to ask.
      The regular find/change could probably do much of what you need… finding “ss” and replacing it with a different character.
      But what really caught my eye is your comment about “arbitrariness”… any kind of automated solution will, by definition, be the opposite of arbitrariness. :-)

      • #91460

        Just getting back to this…thanks for the replies!

        Yeah, I know you can’t automate arbitrariness…lol…I was just saying I like that quality of these old docs, and I know there would always be some degree of hand-finishing to replicate it, but some of it could probably/hopefully be automated. I’ve been using find/change and going thru the doc, and it’s not that big of a deal since I’ve just been doing docs that are a couple pages.

    • #91270

      Hello John,

      Likely the medial s can be scripted. It can also be handled 99% of the way within a font, if the font is coded to do so. Same with the non-breaking spaces before/after certain punctuation.

      Do note that the rules for the medial s are dependent upon era and country. They changed with time and even then were not adamantly adhered to. So even if a script could be written to do the substitution, what would be proper for one text from one print establishment may not wholly apply to another text from even the same era and most likely from another country.

      Setting handwritten manuscripts from a given era/country also adds another layer of variability to the medial s issue. Most of the handwritten manuscripts I have seen (even French manuscripts) use the regular s character or an odd mix of regular and medial s depending upon, I guess, the mood of the writer. Space before/after punctuation also varies.

      Proper nouns may be able to be caught with GREP or a script…but the text had better be carefully checked afterwards.
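
      As a rough sketch of what such a script might look like (Python here, with the made-up name `capitalized_candidates`): it only lists capitalized words that do not begin a sentence, as candidates to review by hand. It assumes the nouns are already capitalized in the text, and it cannot tell a noun from any other capitalized word.

```python
import re

# A capital letter followed by one or more lowercase letters, as a whole word.
CAPITALIZED = re.compile(r"\b[A-Z][a-z]+\b")

def capitalized_candidates(text: str) -> list[str]:
    """Return capitalized words that are not at the start of a sentence."""
    found = []
    for match in CAPITALIZED.finditer(text):
        # Look at what precedes the word, ignoring whitespace.
        before = text[:match.start()].rstrip()
        # Skip words at the start of the text or right after sentence-ending
        # punctuation: they are capitalized by ordinary convention.
        if not before or before[-1] in ".!?":
            continue
        found.append(match.group())
    return found

if __name__ == "__main__":
    sample = "The Name of the Stone is hidden. Seek the Ineffable Name."
    print(capitalized_candidates(sample))
```

      A noun that happens to open a sentence is skipped, so those still have to be caught by eye.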

      I am finishing up a font right now that handles 99% of the medial s and punctuation thing. It is tailored to a certain country’s rules (France) and a given era (early 19th century). Handling the italicized proper nouns and catching proper noun abbreviations that end with s (which require the medial s), though, is going to be a manual slog through the text even if I do a find/change.

      Mike

      • #91461

        Very interesting…actually I’d been using, in the first place, Adobe Caslon Pro, just for the huge variety of available characters and the versatility, along with an initial cap called Goudy…then I went for a more antique look with the Igino Marini Fell Types and a cap called Grimeswade. Both of those had the long s in the character set, but I guess they aren’t coded to auto-insert the long s; I have to do it myself with find/change. But, then I just got a font called 1786 GLC Fournier, and it DOES appear to make all the s’es into the long s except those at the end of a word – so I see what you mean about that.

        Thanks for all the info about the variations in old texts. The text I’ve mainly been using for inspiration/guidance is an old copy of Elias Ashmole’s Theatrum Chemicum Britannicum from the 1650’s, printed in England. (Too early for Caslon I guess, might be one of the old Dutch Fell types.) The general rule there seems to be long s’es except at the end of a word or before an apostrophe, and I like how that looks, so I’ve gone with that. BUT, even within that one book there are huge variations not only in the typesetting, but even in the spelling of words – you’ll see the same word spelled three different ways on the same page. Which adds a kind of charm, because again, I imagine the typesetter doing it all by hand, and maybe some of the variations just depend on what he had available at the moment.

        Again the general rule in that text seems to be that you capitalize/italicize nouns, but it’s not strictly adhered to at all; sometimes adjectives, sometimes there’s no apparent rhyme or reason whatsoever; but I just decided to simplify it to capitalizing and italicizing nouns and maybe an occasional adjective if it can be considered part of a proper noun, like “Ineffable Name.” But yeah, I guess that does have to be done by hand, or like you say, maybe you can automate but then you’d still have to go through and hand check.

        Let me know if you’re selling your font, sounds like it’s right up my alley. Actually I’d love to design one of these myself someday…

    • #91466

      Hi John,

      In that book (Elias’ book) there are cases where the 1786 GLC Fournier font won’t do the long s correctly. So do proofread carefully. I don’t know about the text from your first post, though.

      And yep, there was a great amount of inconsistency. The sheer amount of italics mixed in can drive one nuts. I do like setting old manuscripts from time to time–it’s a good break from the monotonous work I generally do. But boy, it takes so much concentration I would hate to do it day in and day out.

      One thing I found that helps me when the text source is an OCR’d PDF is to use a small voice recorder and record myself reading the text. Then I play it back as I silently read the text in the layout and pause to make corrections. Once in a while I need to pause the recording of my reading aloud of the PDF to get difficult passages down pat, then start the recorder and read some more.

      I will eventually put the font up for sale, likely via MyFonts or another distributor. But I don’t know when that will be as I need to roll some more changes into a couple of the styles (I’ve only used the regular and bold + their italics in the manuscript I am setting currently). Here are a couple screen shots of text from the Elias book using the font I have spent way too much time on…

      https://www.dropbox.com/s/k1j75bt4i20jgsh/capture-000722.png?raw=1

      https://www.dropbox.com/s/2f492npcagozmme/capture-000723.png?raw=1

      • #91470

        Thanks for the samples using the Elias book no less – love the little serif notch in the middle of the l’s. I’m always looking for more fonts in that kind of style.

        Yep, I noticed with Fournier that e.g. if an s appears at the end of a word but is followed by a parenthesis, it’s still the long s.

        In looking closely at the Theatrum I saw that regular italics and “swash” italics were used haphazardly – I started off by trying to replicate pages from that book as closely as possible, same line breaks and everything, and Adobe Caslon had virtually all the versions of the characters used, so I was able to get really close. Then I applied that style to some stuff I’d written like that thing I posted before – that’s using Adobe Caslon.

        I wound up resetting that in the IM Fell English font, which I think I like better – looks dirtier and older (though I could always dirty up a clean font like Caslon). It only has one style of italics, no swash alternates, but that’s fine:

        https://www.dropbox.com/s/29iht6qt8bru2p9/MM2017-2.jpg?dl=0

        The dense, concentrated look of it, packed with footnotes and busy with all the typographical variations and flourishes and archaic spellings, seems suited to the abstruse, obscure, frustrating mysticism of the text, which is what I was going for. It’s a total struggle to read, but it would be anyway…lol.

        So that’s what I was looking to automate, but I think we’ve all concluded that there’s no way around hand-finishing for such a hand-done look…
