HTML(5) and text-level semantics

As an absolute type nut and militant web standards advocate, one of the most exciting things that HTML5 brings for me is not the new structural elements like <header>, <aside> et al (although they are pretty awesome) but rather the text-level semantics it brings with the addition and redefinition of certain elements.

The best way to explain them is probably to take a look at the following excerpt:


Hi there, I’m Harry Roberts, I am a web developer at BSkyB. I specialise in web standards, accessibility, Ruby, design and build, mobile development, typography and more. I have been in the industry for three four years.

Please note: I am not a programmer.

[Image: A photo of me]
Above: A photo of me.

I write at my personal blog CSS Wizardry and tweet at @csswizardry. I love the uppercase R in Helvetica. My motto on web development is ‘Always code like you’re working in a team, even when you’re not.’ I absolutely love the web.

I am also an advocate of clearing floats the clean way. Please note the width and overflow properties in the code below:

.wrapper{
  width:940px;
  margin:0 auto;
  padding:10px;
  overflow:hidden;
}

This is, admittedly, a very forced bit of writing, but I had to write in such a way as to properly, and in context, use a large set of semantic text-level elements. These are, in order of appearance:

  1. The <del> element
  2. The <s> element
  3. The <ins> element
  4. The <strong> element
  5. The <small> element
  6. The <b> element
  7. The <cite> element
  8. The <i> element
  9. The <q> element
  10. The <em> element
  11. The <code> element
  12. The <mark> element

The <del> element

...I specialise in web standards, accessibility, <del datetime="2011-01-23T10:07:25+00:00">Ruby,</del> design and build, mobile development, typography

The <del> element indicates a removal from a document; it shows that the text inside it has no place in the document. You could physically remove the text, but for the sake of transparency you can leave it in and show that it does not belong. Its datetime attribute records when the text was deleted.

In this case, I do not know Ruby and as such it has no place in the text. I used the <del> element to show this.

The <s> element

I have been in the industry for <s>three</s> four years.

This element is very similar to the <del> element and their differences are subtle. Where <del> marks incorrect information that should not be in the document, the <s> element represents information that is no longer accurate or relevant (e.g. out of date). Here, it used to be true that I had been in the industry for three years, but that has been replaced by four. The information is not incorrect per se, merely no longer relevant.
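A price change is another common case for <s>. The sketch below uses invented product copy and prices, purely for illustration: the old price was correct when it was written but no longer applies, so it gets an <s>, whereas a <del> would record an edit to the document itself.

```html
<!-- Hypothetical product copy: the item and prices are made up -->
<p>
  Helvetica poster, A2.
  <s>Recommended retail price: £20.00</s>
  Now selling for just £15.00.
</p>
```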

The <ins> element

<ins datetime="2011-01-23T10:07:25+00:00"><strong>Please note:</strong> I am not a programmer.</ins>

The <ins> element represents text that has been inserted into the document after it has been published. Here I am inserting text to explain why the Ruby mention earlier was removed. Note that it takes the same datetime attribute as the <del> element.

This isn’t its only use, however. I frequently use the <ins> element in articles to show addenda and updates.
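As a minimal sketch of that kind of update (the date and wording are invented for illustration), an <ins> can wrap a whole paragraph appended to an article after publication:

```html
<article>
  <p>Original article copy goes here.</p>

  <!-- Hypothetical addendum; the datetime value is an assumed example -->
  <ins datetime="2011-02-01T09:00:00+00:00">
    <p><strong>Update:</strong> I have since corrected the example above.</p>
  </ins>
</article>
```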

The <strong> element

<ins datetime="2011-01-23T10:07:25+00:00"><strong>Please note:</strong> I am not a programmer.</ins>

We should all be familiar with the <strong> element. It represents strong importance, where the content is more important than its surroundings. Here it is important that you know that I am not a programmer, as was accidentally stated above.

The <small> element

<small><b>Above:</b> A photo of me.</small>

The <small> element has been redefined to represent small print and side comments. Here it is used to describe the picture above it. Its usage is, luckily, fairly obvious. Any time you have supporting information for a larger piece, mark it up as a <small>.

The <b> element

<small><b>Above:</b> A photo of me.</small>

The once loathed <b> element has been redefined in HTML5 to represent any text whose appearance is offset from its surroundings, often by means of bolding. There are occasions when you’d want bold text but without any extra importance, such as <strong> would add. I also use the <b> element for marking up the origins of quotes, e.g.:

<blockquote>
  <p>&ldquo;
    A lie gets halfway around the world before the truth has a chance to get its pants on.&rdquo;
    <b>Sir Winston Churchill</b>
  </p>
</blockquote>

The <cite> element

I write at my personal blog <a href="/"><cite>CSS Wizardry</cite></a> and

The <cite> element is used to represent a mention of or reference to a body of work, such as a book, an article, a painting and more. It is not, according to the HTML spec, used for marking up the names of sources of quotes (as above, I use the <b> element).

So whenever you reference a name of a film or song or website or sculpture or article, mark it up with the <cite> element.
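A quick sketch of that rule in practice. The sentences are invented for illustration; only the titles go inside the <cite>, never the name of a person:

```html
<!-- Illustrative sentences; only the title of the work is wrapped in <cite> -->
<p>My favourite book on type is <cite>The Elements of Typographic Style</cite>.</p>
<p>I watched the documentary <cite>Helvetica</cite> again last night.</p>
```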

The <i> element

I love the uppercase <i>R</i> in Helvetica.

The newly redefined <i> element is another slightly confusing one. The usage is any piece of text that may be spoken with a slightly different inflection but bears no extra importance. The best way to tell whether you need the <i> element or not is to say it aloud.

Here I’m marking up a single letter, because if I were to speak that sentence aloud the R would have a slightly different tone applied. The <i> element can also be applied to full words and phrases.
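As far as I read the spec, the <i> element also covers things like idiomatic phrases from another language and taxonomic names. A rough sketch, with sentences made up for illustration; the lang attribute tells the browser which language the phrase is in:

```html
<!-- Illustrative sentences, not taken from the excerpt above -->
<p>The design has a certain <i lang="fr">je ne sais quoi</i>.</p>
<p>The common daisy is <i lang="la">Bellis perennis</i>.</p>
```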

The <q> element

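My motto on web development is <q>Always code like you&rsquo;re working in a team, even when you&rsquo;re not.</q>

The <q> element is simply used to mark up inline quotations; quotes that are in the context of surrounding copy. Note that the quotation marks themselves do not belong in the markup: the spec leaves them to the browser, which adds them around the contents of the <q> when rendering.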

The <em> element

I absolutely <em>love</em> the web.

Again, the <em> element is one we should all be familiar with. It denotes stress emphasis. If you read the example aloud you can hear how the <em> element places the stress on the word love.
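Moving the <em> moves the stress, and with it the meaning. A small sketch reusing the sentence from the excerpt (the second variant is mine, added for illustration):

```html
<!-- Stress on the verb: it is love, not mere liking -->
<p>I absolutely <em>love</em> the web.</p>

<!-- Hypothetical variant: stress on the speaker, as opposed to someone else -->
<p><em>I</em> absolutely love the web.</p>
```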

The <code> element

Please note the <code>width</code> and <code>overflow</code> properties in the code below:

The <code> element is very self-explanatory: it simply represents pieces of code.

N.B. There is a much larger array of elements used to denote code, inputs and outputs too detailed to go into here. Please refer to the HTML spec for these.
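For a brief taste of those, here is a minimal sketch using <kbd> for user input, <samp> for program output and <var> for a variable. The command, its output and the sentence about widths are made-up examples:

```html
<!-- Illustrative only: the command and its output are invented -->
<p>
  Type <kbd>ls</kbd> and press <kbd>Enter</kbd>. The terminal should reply
  with something like <samp>index.html style.css</samp>.
</p>
<p>The wrapper is <var>n</var> columns wide, each set to <code>60px</code>.</p>
```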

The <mark> element

<pre><code>.wrapper{
  <mark>width:940px;</mark>
  margin:0 auto;
  padding:10px;
  <mark>overflow:hidden;</mark>
}</code></pre>

The <mark> element is a really nice new element introduced in HTML5. Its purpose is simply to highlight. You could highlight each occurrence of a search term in a search-results page as HTML5 Doctor do or—as I like to do—highlight specific references to code that is in a larger block. This allows me to give the code context, but also highlight the relevant snippet that I am talking about.
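The search-results case might look something like the sketch below. The query, page titles and URLs are all hypothetical:

```html
<!-- Hypothetical results page for the query "float" -->
<p>Results for &ldquo;float&rdquo;:</p>
<ul>
  <li><a href="/clearing-floats/">Clearing <mark>float</mark>s the clean way</a></li>
  <li><a href="/layout/">Why a <mark>float</mark> collapses its parent</a></li>
</ul>
```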


So there we have an array of highly semantic and really nifty text-level elements to use in your work; some old, some new, some modified but all useful.

There are more than I’ve outlined here, and I may revisit the blog post to add them, but the ones I’ve covered are the ones I find most commonly occurring. In the meantime, why not give the HTML spec a quick read?

By Harry Roberts on Sunday, January 23rd, 2011 in Web Development. 25 Comments


25 Responses to ‘HTML(5) and text-level semantics’


  1. James Young said on 23 January, 2011 at 11:55 am

    Nice write up and examples, good to see the “lesser” elements get some coverage too!


  2. Eli Dupuis said on 23 January, 2011 at 7:13 pm

    Simple and informative. Apparently, I’ve been using the <cite> element incorrectly! Thanks for the quick refresher and inspiration.


  3. Maximilian said on 23 January, 2011 at 7:54 pm

    Nice review. Thanks!


  4. PhilD said on 23 January, 2011 at 9:03 pm

    Thanks for the explanations.
    I think I should be able to remember most of them now
    Cheers


  5. Gabe Casalett said on 23 January, 2011 at 10:59 pm

    This will be a good resource going forward. Thanks a lot!


  6. Josh Littlejohn said on 24 January, 2011 at 7:34 am

    In the case of providing your picture and a caption, wouldn’t the ‘figure’ and ‘figcaption’ elements be more appropriate?

    Thanks for the article. I hope all these new inline elements provide us designers much more freedom in expression when it comes to content.


  7. Thierry Mauduit said on 24 January, 2011 at 8:34 am

    Thanx for this. I would just add an “outline:0;” for clearing floats because of a display bug in Firefox from time to time…


  8. Patrick Samphire said on 24 January, 2011 at 8:39 am

    I must admit, I’m not really convinced by the definition of ‘i’. Is there really any part of speech that is not inflected slightly differently from the rest? Unless you speak like a robot, you use different inflections all the time, and you’d end up marking-up most of your text with ‘i’ elements. This seems more like an attempt to shoehorn in an existing element than to provide anything useful.


  9. Horia Dragomir said on 24 January, 2011 at 8:52 am

    All this time, I’ve been using both strong and b the same way you describe and the back-end guys have given me hell over it.

    Nice to see I’m not crazy after all.


  10. Dominik Hahn said on 24 January, 2011 at 10:21 am

    I’d use <figure> and <figcaption> instead of <small> in the example above because this way the supporting information is tied to the piece of information.


  11. Ade said on 24 January, 2011 at 8:07 pm

    I’ve been following the stricter definitions from the HTML5 spec for about a year or so. The fact that it has more to say about semantics than previous specs addresses an obvious gap; one which I think led to a lot of misinterpretation and, worse, misinformation.

    Another good article, Harry.


  12. jitendra vyas said on 25 January, 2011 at 3:13 am

    Very good explanation. Thanks for this. I also liked your quote ‘Always code like you’re working in a team, even when you’re not.’


  13. CHJJ said on 25 January, 2011 at 5:38 pm

    The “i” element is less confusing than it seems.

    i.e. “text read in an alternate mood.”: When you think about it, this is what italic text has always meant in almost any piece of writing. There were always non-apparent semantics surrounding italic text that went only implicitly acknowledged.

    @Patrick, I tend to use “i” in any situation where I could potentially justify the use of “finger-quotes” in normal speech, or in other words, when I would be saying a particular word that I wasn’t comfortable pronouncing normally. That’s just one use case for me. It’s surprisingly useful.


  14. Brent Lagerman said on 26 January, 2011 at 5:20 am

    what? since when has overflow:hidden cleared a float? can you point me to a more detailed explanation of how this works

    brent


  15. Eric Oyen said on 27 January, 2011 at 11:52 am

    I was reading this page using voiceover. some of the text attributes were invisible to me. the <del> element was completely invisible. I didn’t even know that those entries were to be removed from the text.

    btw, I am a mac user and use voiceover. color highlights, bold, italics, other enhancements are all invisible to me. the only way I could “see” them is if I turned on “text attributes” which would make an ordinary html/5 document unnecessarily verbose. Since I cannot read braille yet (and have no braille device), I have no clear idea if these same markers would be visible or otherwise in braille.

    also, I have no clear idea how other screen access software (jaws, window eyes, thunder, system acces and NVDA) would react to these. more input from the blind community is needed.


  16. Charles Marshall said on 28 January, 2011 at 12:24 am

    I was viewing this in FireFox 2.0.0.6.
    It displayed “Ruby” (the example <del> element)
    as a strike out font.
    Or should I say striken? It looks exactly like legislative statues.
    Hey wait, isn’t this blog all about text attributes?
    I don’t even know how to turn them off.


  17. Brent French said on 29 January, 2011 at 10:40 pm

    Informative web design please keep up the tutorials!


  18. Andy said on 13 September, 2011 at 9:12 am

    Really nice explanation of the semantic markup for text. Should definitely help people gaining a deeper understanding of the various tags.

    One thing about the <i> tag, it is also intended for use with foreign words, or at least that is my understanding. So it should be used whenever you say something in a language other than the one the main text is written in.


  19. Chris said on 13 September, 2011 at 9:22 am

    I’m a little confused about your usage of <i>. You’re saying that you’d use it when there is no semantic relevance, but you’d say it different out loud?

    I sort of get your logic, although I’m struggling to see the relevance with respect to web design. (normal) people don’t read markup out loud—unless you’re listening to a screen reader, in which case (as is my understanding) you’d need to use an <em> to highlight that emphasis. So if there’s no meaning to the <i>, why not just use font-style?


  20. Harry Roberts said on 13 September, 2011 at 9:42 am

    @Chris:

    Certainly you could use something like <span class="italicised">foo</span> where you want a different inflection but no extra importance, but that’s really verbose. The spec has simply repurposed the defunct <i> element to do just this without the need to write all that extra markup/CSS. <i> used to mean italics where now I guess you could say it means inflection <– I just validly used the <i> element there!

    Screenreaders are a grey area as some don’t differentiate between em and i but the spec dictates this is the correct usage so if you want to do it by the book they do have different meanings.

    If it’s italicised but not emphasised you probably want an <i> :)

    H


  21. Chris said on 13 September, 2011 at 9:58 am

    @Harry

    Gotcha. To be honest, I hadn’t clocked that <i> had been un-deprecated in HTML5.

    After years of having it beaten into my head that using <i>, <b> and <s> for styling is a terrible idea, it’ll probably take some conscious thought to get myself using them again. It still reeks to me of using a presentational tag for styling purposes, but if it’s valid usage then good; if I’m honest, it’ll be nice having a few more shorthand tags available to use.


  22. Christian Krammer said on 15 September, 2011 at 6:52 am

    Thanks for the round-up. Although I already tinkered quite much with HTML5 there is always something new to learn. And whenever you think you know everything there comes a chunk of info – like your article – that teaches you the opposite.


  23. marc said on 21 September, 2011 at 3:25 pm

    Thanks for your comprehensive explanation. Let me see if I understood it right: if – for example – I have some products listed on a page with the price, and one product is at some point out of stock (and will remain out of stock because there is no more): would I use or <del> to indicated, that price and order-link are obsolete (and replace them using with a text stating that the product is out of stock)?


  24. Sid Anand said on 4 October, 2011 at 6:25 am

    HTML5 is simply awesome.

    Good tips, will be useful for sure, bookmarked.

    Thanks



Hi there, I am Harry Roberts. I am a 21 year old web developer from the UK. I Tweet and write about web standards, typography, best practices and everything in between. You should browse and search my archives and follow me on Twitter, 7,791 people do.
