April 30 2009 • 10:46 AM

Quick GREP To Superscript Ordinals

M.V. wrote:

I have some numbers with ordinals (21st, 54th, 2nd etc.). I would like to search and replace by GREP all of those “st”, “nd”, “rd” and “th” and change their format to OpenType superscript.

You are absolutely correct that this is a great case for GREP! Fortunately, it’s not hard to do. First you need to write an “or” expression: (st|nd|rd|th). Those are vertical bars (also called pipes) between the strings you’re looking for. Next, make sure they’re the last thing in the word by putting a \> (backslash greater than) after the expression. Finally, you only want to find those strings when there’s a number (digit) before them, right? So add a “positive lookbehind” at the beginning: (?<=\d). When you’re done, the full grep code is:

(?<=\d)(th|st|nd|rd)\>

You can use that in the GREP tab of the Find/Change dialog box, or as the basis of a GREP Style (in CS4). Just apply a character style setting the found text to superscript.
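
If you want to try the expression outside InDesign first, here is a minimal Python sketch. It is only an approximation of what InDesign does: Python's `re` module has no `\>` end-of-word anchor, so `\b` (word boundary) stands in for it, and the `<sup>` tags are a stand-in for applying a superscript character style. The function name `superscript_ordinals` is made up for this example.

```python
import re

# Assumption: \b replaces InDesign's \> because Python's re module lacks \>.
ORDINAL_SUFFIX = re.compile(r"(?<=\d)(th|st|nd|rd)\b")


def superscript_ordinals(text):
    """Wrap ordinal suffixes that directly follow a digit in <sup> tags.

    The <sup> markup is hypothetical; in InDesign you would apply a
    character style to the found text instead.
    """
    return ORDINAL_SUFFIX.sub(r"<sup>\1</sup>", text)


if __name__ == "__main__":
    print(superscript_ordinals("the 21st, 54th and 2nd entries; the first stays as is"))
```

Words such as “first” are left alone because the lookbehind requires a digit immediately before the suffix.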

One note, however: If you’re using an OpenType font, you may be tempted to use the OpenType Superior/Superscript feature (in the Position pop-up menu, in the Basic Character Formats pane of the Character Style Options dialog box). Sounds great, but it may not work. Why? Because some OpenType fonts don’t support that style. For example Myriad Pro — shown on the first line in the image above — doesn’t do squat with normal English ordinals. (It only works with “o” and “a”). However, Minion Pro and Chaparral Pro both (also shown above) work great.

Want more GREP tips? Check out my new GREP title at Lynda.com!

26 Responses discussing this post. Add yours below.

  1. April 30th, 2009 • 11:28 am • Link

    This brings up an off-topic pet peeve of mine. I don’t know when ordinals set in superscript were introduced to the computer world. The first time I ran into it was when Word introduced their auto-correct feature and all of a sudden, every document I got from somebody using the program as a plain typewriter seemed to be covered in the ordinals set in superscript.

    As a sometime historian used to spending hours in old French books, I find ordinals set in superscript a commonplace occurrence, but it’s not something I use in modern writing. It is an archaic form of typesetting. In its discussion of ordinals, The Chicago Manual of Style (15th edition) always sets them as normal type. Only in its discussion of documenting titles (see §17.52) are superscript ordinals addressed; there it states that superscript ordinals “may” be changed to normal type.

  2. April 30th, 2009 • 12:00 pm • Link

    Sometimes the old guys get it right, and it makes sense to turn an archaism into standard practice (so it is no longer an archaism).

    I think the practice of using superscripts for ordinal numbers is logical, because ’st’, ‘nd’, ‘th’, etc. don’t function like ordinary text — we do not read ‘1st’ as “onest”, ‘2nd’ as “twond”, and so on. The superscript serves to indicate that the digit and suffix together make a wholly new symbol that stands on its own for a single abstract thing.

  3. Scott
    April 30th, 2009 • 12:20 pm • Link

    I have to agree with Old Jeremy.

    The old guys got it right. Superscripting the letter pair puts the emphasis on the important part, the number, which makes for easier reading without errors. Same goes for the registered and trademark symbols. They can get in the way otherwise.

    I would also say the same for “old style figures.” I’ve recently been reading through a lot of 100 year old engineering books and have found that old style figures are far easier to read (with their baseline shifts up and down) than “modern” style figures in large tables (which are now as difficult to read quickly as all caps, for me).

    Scott

    P.S. I just noticed the comment text uses old style figures (at least on my Mac running Safari v4 beta). Nice touch!

  4. Chris V.
    April 30th, 2009 • 12:22 pm • Link

    This is a huuuge help! I’ve been wanting something similar for quite a while now, but didn’t know where to start.

    I work on tons of chemical formulas, so it’s not uncommon to see H2O, CH4, CO4, etc. riddled throughout the text.

    By adapting what you just posted, I was able to come up with:
    (?<=.)(0|1|2|3|4|5|6|7|8|9)+

    I think I’m on the right path, but I still have an issue with the following sentence in the copy:
    —-

    The formula is: 2H2O+CH4O.

    —-
    The GREP finds that first number before the H, which I don’t want it to. Any idea how I might fix the GREP for that so that it only finds numbers inside or at the end of a word? Directly after a period would be OK as well.

  5. April 30th, 2009 • 1:05 pm • Link

    If you’re looking for any digit coming immediately after any capital letter, you might try this:

    (?<=\u)\d

  6. Chris V.
    April 30th, 2009 • 1:13 pm • Link

    D’oh, I feel foolish for having looked for something so complicated like I had.

    I just adapted what you gave me to include lowercase letters and periods.

    (?<=[\l\u]|\.)\d+

    Thanks, Jeremy for the beginning point. =)
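
The pattern from comment 6 can be checked against the sample sentence from comment 4 with a short Python sketch. This is an approximation under stated assumptions: Python's `re` has no `\l` or `\u`, so `[A-Za-z]` stands in for “any lowercase or uppercase letter” (accented letters are not covered), and the function name `find_subscript_digits` is invented for the example.

```python
import re

# Assumption: [A-Za-z.] approximates InDesign's [\l\u]|\. lookbehind.
SUBSCRIPT_DIGITS = re.compile(r"(?<=[A-Za-z.])\d+")


def find_subscript_digits(text):
    """Return every run of digits that directly follows a letter or a period."""
    return SUBSCRIPT_DIGITS.findall(text)


if __name__ == "__main__":
    # The leading "2" follows a space, so it is not matched.
    print(find_subscript_digits("The formula is: 2H2O+CH4O."))
```

One thing to watch: because a period is allowed before the digits, the decimals in a number such as 3.14 would be matched too.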

  7. April 30th, 2009 • 5:37 pm • Link

    Almost every title we produce has these characters as normal instead of superscript. This is the first thing we change (superscript to normal).

  8. May 1st, 2009 • 1:46 am • Link

    I suspect that some typesetting conventions became established on the basis of “what typewriters couldn’t do”.

    Years ago I used an ancient IBM (running DOS) to write my student thesis, which contained a lot of individual italic letters. The university “house rules” decreed that whatever would appear as italics when professionally printed should be underlined in a typewritten manuscript. So I had to go through it and change all the italic letters into underlined letters — to indicate that they should be in italics.

    Not quite so long ago, a well-known television news anchor man came into possession of a note purportedly written on a typewriter in the 1960s by a well-known politician. He didn’t notice that it contained a superscript ‘th’, and so was unlikely to have been produced by a typewriter of that vintage!

  9. May 1st, 2009 • 8:17 am • Link

    David, you’re rocking with good GREP stuff these days, thanks!

    I’m with Old Jeremy here: ordinals are rather special stuff, conceptually-pronounciationally (pardon!), and the old-style way of superscripting them is The One True Way.

    Modern-day style manuals are of . . . highly uneven quality, as they often are polluted by many bad, modern practices, and therefore shouldn’t be taken as biblical commandments where *proper* typography is concerned.

  10. Kenneth Trettin
    May 2nd, 2009 • 2:29 pm • Link

    This all seems to be a matter of taste and opinion. My taste and opinion dictate that when editing a manuscript for publication I spell it out if it is one hundred or less: first, second, third, fourth, etc. It is simple: in my opinion, May second reads better than May 2nd.

  11. Nadya Miloserdova
    May 3rd, 2009 • 11:17 pm • Link

    This technique may also be useful when you need to write about square millimeters, etc.
    Find the string mm2 (or mm3) and lift the digit into superscript with the following Find/Change Query (either store it for future use, or use it as a GREP style):
    Find What: ([question mark]
    Change to: +superscript (leave this field empty, change the formatting only)
    The same is applicable to sq.cm, sq.m, etc.

  12. Nadya Miloserdova
    May 3rd, 2009 • 11:33 pm • Link

    An addition to my previous post

    Somehow my Find What string turned out to be taboo for the IDSecrets.com website application module.
    I wanted to write:
    ([question mark][less sign]=mm)(2|3)[more sign]

    Here “less sign” and “more sign” are interpreted as opening and closing tags. For the same reason I use “question mark” instead of “?”.

    So, anybody who wants to try my idea, please replace these words with the aforementioned signs. No square brackets [] are needed, either.
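
For anyone who wants to experiment with the mm2/mm3 idea outside InDesign, here is a hedged Python sketch. It assumes the query was meant to end with an end-of-word anchor (the comment system may have eaten a backslash before the final sign), and it uses `\b` because Python's `re` has no `\>`. The name `find_unit_exponents` is hypothetical.

```python
import re

# Assumption: the intended query is "a 2 or 3 that directly follows mm and
# ends the word"; \b stands in for InDesign's \> end-of-word anchor.
UNIT_EXPONENT = re.compile(r"(?<=mm)[23]\b")


def find_unit_exponents(text):
    """Return the exponent digits that would be lifted into superscript."""
    return UNIT_EXPONENT.findall(text)


if __name__ == "__main__":
    print(find_unit_exponents("an area of 25 mm2 and a volume of 10 mm3"))
```

The same shape should carry over to other units by changing the literal inside the lookbehind.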

  13. ep
    May 4th, 2009 • 5:05 am • Link

    Great tips, thanks.
    I would add that you can search for accented characters like in french ordinals: 1ère or 2ème by adding these searches to your ‘or’ expression:
    [[=e=]]me|[[=e=]]re

  14. David Blatner
    May 4th, 2009 • 6:06 am • Link

    Yes, as Nadya found out, these comments don’t like opening angle-brackets because the system thinks you’re writing HTML! If you want to type a < then type & l t ; (with no spaces between the letters).

  15. Troy Cole
    May 22nd, 2009 • 5:52 am • Link

    Thank you everyone. This is very helpful and exactly what I was looking for. I’m currently laying out a French document which has a lot of instances of numbers such as 1er, 2e, 3ième, etc.

    Using the comments above I’ve modified the search string to

    (?<=\d|_)(e|er|ième|er/ème)\>

    This pretty much does what I need it to, but one thing that I have not been able to figure out is how to search for a string that includes a forward slash. One of the items I need to search for is “_____er/ème” (the document is an agreement and the underscores represent a field to be filled in with a numeral).

    Is there an easy solution to including a forward slash in the search?

    Thanks,
    Troy

  16. ep
    May 22nd, 2009 • 7:07 am • Link

    Troy, you just ‘escape’ it with a backslash. Just type backslash then slash (I can’t type it here).
    You might also need to use the POSIX form for the accented é

    [[=e=]]

    The escape rule is the same for periods and many other characters including backslashes themselves.

  17. Troy Cole
    May 25th, 2009 • 4:35 am • Link

    Hi ep,

    Thanks for your reply. I figured that there must be something like this to work around the forward slash. However, I still can’t quite get this to work. I’ve modified my search to the following

    (?<=\d|_)(e|er|ième|er\/ème)\>

    And I’ve tried a number of variations, all with the same results. It will find and replace up to the second “er” but that’s where it stops. If anyone has any thoughts on the detail I’m missing here, I would love to hear it.

    For the record, the accented characters seemed to work just fine, without any modification.

    Troy

  18. ep
    May 25th, 2009 • 5:11 am • Link

    Try putting the longest string first in your list of alternatives. GREP tries the alternatives in the order you enter them and stops at the first one that matches. Also, the forward slash in ‘er/ème’ is treated as a word end, which is why your search for ‘er’ stops there.

    Hope this helps.

  19. Troy Cole
    May 25th, 2009 • 7:27 am • Link

    Interesting! Changing the order worked like a charm!

    This is my first venture into the world of GREP. From what I’ve seen it looks like it’s just the tip of the iceberg.

    Thanks a bunch for your help with this.
    Much appreciated!

    Troy
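
The ordering effect discussed in comments 17 to 19 can be reproduced in a few lines of Python. This is a sketch under assumptions, not InDesign's engine: `\b` replaces the `\>` end-of-word anchor, the slash needs no escaping in Python, and the names `SHORT_FIRST`, `LONG_FIRST`, and `first_match` are invented for the example.

```python
import re

# Assumption: \b stands in for InDesign's \> end-of-word anchor.
# Same alternatives, different order.
SHORT_FIRST = re.compile(r"(?<=\d|_)(e|er|ième|er/ème)\b")
LONG_FIRST = re.compile(r"(?<=\d|_)(er/ème|ième|er|e)\b")


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    sample = "_____er/ème"
    # Short alternatives first: "er" succeeds because "/" ends the word,
    # so the longer "er/ème" alternative is never tried.
    print(first_match(SHORT_FIRST, sample))
    # Longest alternative first: the whole suffix is found.
    print(first_match(LONG_FIRST, sample))
```

Alternation is tried left to right and the engine keeps the first alternative that lets the whole pattern succeed, which is why listing the longest string first fixes the search.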

  20. Ian
    October 3rd, 2010 • 2:43 am • Link

    Re: Peter Hertzmann’s first comment on this topic, I have to say that in old manuscripts the superscript ordinal *is* used, e.g. in Latin we have “5o” for “quinto”, etc.

    I suppose printers of the period simply didn’t have the option of superscript, or perhaps found it easier not to bother with it unless really necessary.

    Whatever the case, it looks better to use superscript for these things. Thanks, David, for the tip.

  21. Amy
    October 26th, 2010 • 9:48 am • Link

    This is awesome! Exactly what I was looking for.

  22. Neil
    February 21st, 2011 • 8:33 am • Link

    Just because some idiot at Microsoft thought superscript looked cute is no reason to demolish the visual unity of text documents. This Victorian affectation is ugly and pointless and in many cases fouls up the line spacing of documents, particularly on the Web. It’s “standard” only at Microsoft (check Wikipedia for a useful history) and should be deprecated in every case.

  23. Cheryl
    February 7th, 2012 • 9:58 pm • Link

    Thank you so much, this is just what I am looking for.

  24. Elizabeth O
    February 22nd, 2012 • 12:02 pm • Link

    I’m using GREP for chemical expressions, but can’t get it to work for glucose (C6H12O6). In other words, I can’t seem to figure out how to get the second and third set of numbers to subscript. In all cases—H2O, O2, CO2—I can get the subscripts to work, but in this case, I can only get the first 6 to subscript. As soon as I add more to the pattern, it all goes to poop. Suggestions?

  25. February 22nd, 2012 • 12:16 pm • Link

    @Elizabeth: There is a long discussion about this over on this other page: http://indesignsecrets.com/auto-format-superscript-and-subscript-numbers-using-grep-styles.php
    But I find this grep style works well:
    (?<=\u|\l)\d+(?=\u|\b)

  26. Elizabeth O
    February 22nd, 2012 • 12:42 pm • Link

    I think I may be too much of a newbie on this GREP stuff, but that’s not working for me. (I’m working within the Paragraph styles with a subscript Character style.) I did find that page, but would like to know how the GREP expression would look for glucose specifically. I think that may help me to learn this bit by mind-boggling bit.

    Thank you, David!
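
As a rough check of the pattern from comment 25 against glucose, here is a Python sketch. It only shows how a comparable expression behaves in Python's `re` engine and says nothing definite about InDesign: `[A-Z]` and `[a-z]` approximate `\u` and `\l`, the `<sub>` tags are hypothetical stand-ins for a subscript character style, and `subscript_formula` is a made-up name.

```python
import re

# Assumption: [A-Za-z] approximates InDesign's \u|\l lookbehind and [A-Z]
# approximates \u in the lookahead. No space after "(?<" is allowed.
FORMULA_DIGITS = re.compile(r"(?<=[A-Za-z])\d+(?=[A-Z]|\b)")


def subscript_formula(formula):
    """Wrap each digit run that follows a letter in <sub> tags."""
    return FORMULA_DIGITS.sub(lambda m: f"<sub>{m.group(0)}</sub>", formula)


if __name__ == "__main__":
    for formula in ("C6H12O6", "H2O", "CO2", "2H2O"):
        print(formula, "->", subscript_formula(formula))
```

In this sketch all three digit groups in C6H12O6 are matched, including the two-digit 12, because `\d+` takes the whole run of digits and the lookahead accepts either a following capital letter or the end of the word.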
