Learn HTML5 the right way

What follows is an excerpt from HTML5 & CSS3 for the Real World, by Alexis Goldstein, Louis Lazaris and Estelle Weyl.

This post was originally published in 2013 and was updated in April 2016.

As you learn HTML5 and add new techniques to your toolbox, you’re likely going to want to build yourself boilerplate, from which you can begin all your HTML5-based projects. We encourage this, and you may also consider using one of the many online sources that provide a basic HTML5 starting point for you.[1]

In this project, however, we want to build our code from scratch and explain each piece as we go along. Of course, it would be impossible for even the most fantastical and unwieldy sample site we could dream up to include every new element or technique, so we’ll also explain some new features that don’t fit into the project. This way, you’ll be familiar with a wide set of options when deciding how to build your HTML5 and CSS3 websites and web apps, so you’ll be able to use this book as a quick reference for a number of techniques.

Let’s start simple, with a bare-bones HTML5 page:

<!doctype html>

<html lang="en">
<head>
  <meta charset="utf-8">

  <title>The HTML5 Herald</title>
  <meta name="description" content="The HTML5 Herald">
  <meta name="author" content="SitePoint">

  <link rel="stylesheet" href="css/styles.css?v=1.0">

  <!--[if lt IE 9]>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv.js"></script>
  <![endif]-->
</head>

<body>
  <script src="js/scripts.js"></script>
</body>
</html>

With that basic template in place, let’s now examine some of the significant parts of the markup and how these might differ from how HTML was written prior to HTML5.

The Doctype

First, we have the Document Type Declaration, or doctype. This is simply a way to tell the browser—or any other parser—what type of document it’s looking at. In the case of HTML files, it means the specific version and flavor of HTML. The doctype should always be the first item at the top of any HTML file. Many years ago, the doctype declaration was an ugly and hard-to-remember mess. For XHTML 1.0 Strict:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

And for HTML4 Transitional:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">

Although that long string of text at the top of our documents hasn’t really hurt us (other than forcing our sites’ viewers to download a few extra bytes), HTML5 has done away with that indecipherable eyesore. Now all you need is this:

<!doctype html>

Simple, and to the point. The doctype can be written in uppercase, lowercase, or mixed case. You’ll notice that the “5” is conspicuously missing from the declaration. Although the current iteration of web markup is known as “HTML5,” it really is just an evolution of previous HTML standards—and future specifications will simply be a development of what we have today.

Because browsers are usually required to support all existing content on the Web, there’s no reliance on the doctype to tell them which features should be supported in a given document. In other words, the doctype alone is not going to make your pages HTML5-compliant. It’s really up to the browser to do this. In fact, you can use one of those two older doctypes with new HTML5 elements on the page and the page will render the same as it would if you used the new doctype.

The html Element

Next up in any HTML document is the html element, which has not changed significantly with HTML5. In our example, we’ve included the lang attribute with a value of en, which specifies that the document is in English. In XHTML-based syntax, you’d be required to include an xmlns attribute. In HTML5, this is no longer needed, and even the lang attribute is unnecessary for the document to validate or function correctly.

So here’s what we have so far, including the closing </html> tag:

<!doctype html>
<html lang="en">

</html>

The head Element

The next part of our page is the <head> section. The first line inside the head is the one that defines the character encoding for the document. This is another element that’s been simplified since XHTML and HTML4, and is an optional feature, but recommended. In the past, you may have written it like this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

HTML5 improves on this by reducing the character encoding <meta> tag to the bare minimum:

<meta charset="utf-8">

In nearly all cases, utf-8 is the value you’ll be using in your documents. A full explanation of character encoding is beyond the scope of this chapter, and it probably won’t be that interesting to you, either. Nonetheless, if you want to delve a little deeper, you can read up on the topic on W3C or WHATWG.

Note: Encoding Declaration

To ensure that all browsers read the character encoding correctly, the entire character encoding declaration must be included somewhere within the first 512 characters of your document. It should also appear before any content-based elements (like the <title> element that follows it in our example site).

There’s much more we could write about this subject, but we want to keep you awake—so we’ll spare you those details! For now, we’re content to accept this simplified declaration and move on to the next part of our document:

<title>The HTML5 Herald</title>
<meta name="description" content="The HTML5 Herald">
<meta name="author" content="SitePoint">

<link rel="stylesheet" href="css/styles.css?v=1.0">

In these lines, HTML5 barely differs from previous syntaxes. The page title (the only mandatory element inside the head) is declared the same as it always was, and the meta tags we’ve included are merely optional examples to indicate where these would be placed; you could put as many valid meta elements here as you like.

The key part of this chunk of markup is the stylesheet, which is included using the customary link element. There are no required attributes for link other than hrefand rel. The type attribute (which was common in older versions of HTML) is not necessary, nor was it ever needed to indicate the content type of the stylesheet.

Leveling the Playing Field

The next element in our markup requires a bit of background information before it can be introduced. HTML5 includes a number of new elements, such as article and section, which we’ll be covering later on. You might think this would be a major problem for older browser support for unrecognized elements, but you’d be wrong. This is because the majority of browsers don’t actually care what tags you use. If you had an HTML document with a recipe tag (or even a ziggy tag) in it, and your CSS attached some styles to that element, nearly every browser would proceed as if this were totally normal, applying your styling without complaint.

Of course, such a hypothetical document would fail to validate and may have accessibility problems, but it would render correctly in almost all browsers—the exception being old versions of Internet Explorer (IE). Prior to version 9, IE prevented unrecognized elements from receiving styling. These mystery elements were seen by the rendering engine as “unknown elements,” so you were unable to change the way they looked or behaved. This includes not only our imagined elements, but also any elements that had yet to be defined at the time those browser versions were developed. That means (you guessed it) the new HTML5 elements.

The good news is, at the time of writing, most people still using a version of IE are using version 9 or higher, and very few are on version 9, so this is not a big problem for most developers anymore; however, if a big chunk of your audience is still using IE8 or earlier, you’ll have to take action to ensure your designs don’t fall apart.

Fortunately, there’s a solution: a very simple piece of JavaScript originally developed by John Resig. Inspired by an idea by Sjoerd Visscher, it can make the new HTML5 elements styleable in older versions of IE.

We’ve included this so-called “HTML5 shiv”[2] in our markup as a script tag surrounded by conditional comments. Conditional comments are a proprietary feature implemented in Internet Explorer in version 9 and earlier. They provide you with the ability to target specific versions of that browser with scripts or styles. The following conditional comment is telling the browser that the enclosed markup should only appear to users viewing the page with Internet Explorer prior to version 9:

<!--[if lt IE 9]>
<script src="https://cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv.js"></script>
<![endif]-->

It should be noted that if you’re using a JavaScript library that deals with HTML5 features or the new APIs, it’s possible that it will already have the HTML5-enabling script present; in this case, you could remove reference to the script. One example of this would be Modernizr, a JavaScript library that detects modern HTML and CSS features. Modernizr gives you the option to include code that enables the HTML5 elements in older versions of IE, so the shiv would be redundant. We take a closer look at Modernizr in Appendix A.

Note: Not Everyone Can Benefit from the HTML5 Shiv

Of course, there’s still a group of users unable to benefit from the HTML5 shiv: those who have for one reason or another disabled JavaScript and are using IE8 or lower. As web designers, we’re constantly told that the content of our sites should be fully accessible to all users, even those without JavaScript. But it’s not as bad as it seems. A number of studies have shown that the number of users who have JavaScript disabled is low enough to be of little concern, especially when you factor in how few of those will be using IE8 or lower.

In a study published in October, 2013, the UK Government Digital Service determined that users browsing government web services in the UK with JavaScript disabled or unavailable was 1.1%. In another study conducted on the Yahoo Developer Network(published in October 2010), users with JavaScript disabled amounted to around 1% of total traffic worldwide.

The Rest is History

Looking at the rest of our starting template, we have the usual body element along with its closing tag and the closing html tag. We also have a reference to a JavaScript file inside a script element.

Much like the link tag discussed earlier, the <script> tag does not require that you declare a type attribute. If you ever wrote XHTML, you might remember your scripttags looking like this:

<script src="js/scripts.js" type="text/javascript"></script>

Since JavaScript is, for all practical purposes, the only real scripting language used on the Web, and since all browsers will assume that you’re using JavaScript even when you don’t explicitly declare that fact, the type attribute is unnecessary in HTML5 documents:

<script src="js/scripts.js"></script>

We’ve put the script element at the bottom of our page to conform to best practices for embedding JavaScript. This has to do with the page-loading speed; when a browser encounters a script, it will pause downloading and rendering the rest of the page while it parses the script. This results in the page appearing to load much more slowly when large scripts are included at the top of the page before any content. It’s why most scripts should be placed at the very bottom of the page, so that they’ll only be parsed after the rest of the page has loaded.

In some cases, however, (such as with the HTML5 shiv) the script may need to be placed in the head of your document, because you want it to take effect before the browser starts rendering the page.

Now that you’ve learnt how to create a HTML template, check out our screencast on creating JavaScript templates with Handlebars. Handlebars is great for interactive web apps that need to update frequently. In this screencast we’ll be looking at how Handlebars works, and then work through a small project together.