Document Structure

An HTML document is a tree of elements, including a head and body, headings, paragraphs, lists, etc. Form elements are discussed in section Forms.

Document Element: HTML

The HTML document element consists of a head and a body, much like a memo or a mail message. The head contains the title and optional elements. The body is a text flow consisting of paragraphs, lists, and other elements.

Head: HEAD

The head of an HTML document is an unordered collection of information about the document. For example:

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HEAD>
<TITLE>Introduction to HTML</TITLE>
</HEAD>
...

Title: TITLE

Every HTML document must contain a TITLE element.

The title should identify the contents of the document in a global context. A short title, such as "Introduction", may be meaningless out of context. A title such as "Introduction to HTML Elements" is more appropriate. (12)

A user agent may display the title of a document in a history list or as a label for the window displaying the document. This differs from headings (section Headings: H1 ... H6), which are typically displayed within the body text flow.

Base Address: BASE

The optional BASE element provides a base address for interpreting relative URLs when the document is read out of context (see section Hyperlinks). The value of the HREF attribute must be an absolute URI.
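
As a rough illustration of how a base address is applied, the sketch below resolves relative URLs against an absolute URI of the kind a BASE element would supply. The address `http://www.example.org/docs/intro.html` and the helper name `resolve` are assumptions made for this example; they do not come from this specification.

```python
from urllib.parse import urljoin

# Hypothetical absolute URI, as it might appear in <BASE HREF="...">.
BASE_HREF = "http://www.example.org/docs/intro.html"


def resolve(relative_url, base=BASE_HREF):
    """Interpret a relative URL against the document's base address."""
    return urljoin(base, relative_url)


print(resolve("images/fig1.gif"))
print(resolve("../index.html"))
```

A document saved to a local disk and read out of context would resolve both links against the original location rather than the local file.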

Keyword Index: ISINDEX

The ISINDEX element indicates that the user agent should allow the user to search an index by giving keywords. See section Queries and Indexes for details.

Link: LINK

The LINK element represents a hyperlink (see section Hyperlinks). Any number of LINK elements may occur in the HEAD element of an HTML document. It has the same attributes as the A element (see section Anchor: A).

The LINK element is typically used to indicate authorship, related indexes and glossaries, older or more recent versions, document hierarchy, associated resources such as style sheets, etc.

Associated Meta-information: META

The META element is an extensible container for use in identifying specialized document meta-information. Meta-information has two main functions:

to provide a means to discover that the data set exists and how it might be obtained or accessed; and
to document the content, quality, and features of a data set, indicating its fitness for use.

Each META element specifies a name/value pair. If multiple META elements are provided with the same name, their combined contents, concatenated as a comma-separated list, form the value associated with that name. (13)

HTTP servers may read the content of the document HEAD to generate header fields corresponding to any elements defining a value for the attribute HTTP-EQUIV. (14)

Attributes of the META element:

HTTP-EQUIV: binds the element to an HTTP header field. An HTTP server may use this information to process the document. In particular, it may include a header field in the responses to requests for this document: the header name is taken from the HTTP-EQUIV attribute value, and the header value is taken from the value of the CONTENT attribute. HTTP header names are not case sensitive.
NAME: specifies the name of the name/value pair. If not present, HTTP-EQUIV gives the name.
CONTENT: specifies the value of the name/value pair.

Examples

If the document contains:

<META HTTP-EQUIV="Expires"
      CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">
<meta http-equiv="Keywords" CONTENT="Fred">
<META HTTP-EQUIV="Reply-to"
      content="[email protected] (Roy Fielding)">
<Meta Http-equiv="Keywords" CONTENT="Barney">

then the server may include the following header fields:

Expires: Tue, 04 Dec 1993 21:29:02 GMT
Keywords: Fred, Barney
Reply-to: [email protected] (Roy Fielding)

as part of the HTTP response to a `GET' or `HEAD' request for that document.

An HTTP server must not use the META element to form an HTTP response header unless the HTTP-EQUIV attribute is present.

An HTTP server may disregard any META elements that specify information controlled by the HTTP server, for example `Server', `Date', and `Last-modified'.
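
The rules above can be sketched in a few lines of Python. This is only one possible reading, not a prescribed server implementation: the class and function names are invented for the example, the `NAME="Author"` element is a hypothetical addition that shows a META element without HTTP-EQUIV being skipped, and the standard library parser stands in for whatever parser a real server would use.

```python
from html.parser import HTMLParser


class MetaHeaderCollector(HTMLParser):
    """Collect name/value pairs from META elements that carry HTTP-EQUIV."""

    def __init__(self):
        super().__init__()
        self.values = {}    # lowercased header name -> list of CONTENT values
        self.spelling = {}  # lowercased header name -> first spelling seen

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names and attribute names.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        if not name:
            return  # without HTTP-EQUIV, no header field may be formed
        key = name.lower()  # HTTP header names are not case sensitive
        self.spelling.setdefault(key, name)
        self.values.setdefault(key, []).append(attributes.get("content") or "")


def meta_headers(document):
    """Return (header name, header value) pairs in order of first appearance."""
    collector = MetaHeaderCollector()
    collector.feed(document)
    collector.close()
    return [(collector.spelling[key], ", ".join(values))
            for key, values in collector.values.items()]


SAMPLE = """
<META HTTP-EQUIV="Expires"
      CONTENT="Tue, 04 Dec 1993 21:29:02 GMT">
<meta http-equiv="Keywords" CONTENT="Fred">
<META NAME="Author" CONTENT="ignored by the server">
<Meta Http-equiv="keywords" CONTENT="Barney">
"""

for name, value in meta_headers(SAMPLE):
    print(f"{name}: {value}")
```

Values that share a header name are joined with a comma and a space, and names are compared without regard to case.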

Next Id: NEXTID

The NEXTID element is included for historical reasons only. HTML documents should not contain NEXTID elements.

The NEXTID element gives a hint for the name to use for a new A element when editing an HTML document. It should be distinct from all NAME attribute values on A elements. For example:

<NEXTID N=Z27>

Body: BODY

The BODY element contains the text flow of the document, including headings, paragraphs, lists, etc.

For example:

<BODY>
<h1>Important Stuff</h1>
<p>Explanation about important stuff...
</BODY>

Headings: H1 ... H6

The six heading elements, H1 through H6, denote section headings. Although the order and occurrence of headings are not constrained by the HTML DTD, documents should not skip levels (for example, from H1 to H3), as converting such documents to other representations is often problematic.
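
A checking tool could flag skipped levels along the following lines. This is a minimal sketch under two assumptions that are not part of this specification: the first heading in a document is accepted at any level, and only increases of more than one level count as a skip. The names used are hypothetical.

```python
from html.parser import HTMLParser


class HeadingLevelChecker(HTMLParser):
    """Record places where a heading jumps down more than one level."""

    def __init__(self):
        super().__init__()
        self.previous_level = 0  # 0 means no heading seen yet
        self.skips = []          # list of (previous level, new level)

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; match h1 through h6 only.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if self.previous_level and level > self.previous_level + 1:
                self.skips.append((self.previous_level, level))
            self.previous_level = level


def find_skipped_levels(document):
    checker = HeadingLevelChecker()
    checker.feed(document)
    checker.close()
    return checker.skips


print(find_skipped_levels("<H1>Title</H1><H3>Detail</H3>"))
```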

Example of use:

<H1>This is a heading</H1>
Here is some text
<H2>Second level heading</H2>
Here is some more text.

Typical renderings are:

H1: Bold, very-large font, centered. One or two blank lines above and below.
H2: Bold, large font, flush-left. One or two blank lines above and below.
H3: Italic, large font, slightly indented from the left margin. One or two blank lines above and below.
H4: Bold, normal font, indented more than H3. One blank line above and below.
H5: Italic, normal font, indented as H4. One blank line above.
H6: Bold, indented same as normal text, more than H5. One blank line above.

Block Structuring Elements

Block structuring elements include paragraphs, lists, and block quotes. They must not contain heading elements, but they may contain phrase markup, and in some cases, they may be nested.

Paragraph: P

The P element indicates a paragraph. The exact indentation, leading space, etc. of a paragraph are not specified and may be a function of other tags, style sheets, etc.

Typically, paragraphs are surrounded by a vertical space of one line or half a line. The first line in a paragraph is indented in some cases.

Example of use:

<H1>This Heading Precedes the Paragraph</H1>
<P>This is the text of the first paragraph.
<P>This is the text of the second paragraph. Although you do not 
need to start paragraphs on new lines, maintaining this 
convention facilitates document maintenance.</P>
<P>This is the text of a third paragraph.</P>

Preformatted Text: PRE

The PRE element represents a character cell block of text and is suitable for text that has been formatted for a monospaced font.

The PRE tag may be used with the optional WIDTH attribute. The WIDTH attribute specifies the maximum number of characters for a line and allows the HTML user agent to select a suitable font and indentation.

Within preformatted text:

Line breaks within the text are rendered as a move to the beginning of the next line. (15)
Anchor elements and phrase markup may be used. (16)
Elements that define paragraph formatting (headings, address, etc.) must not be used. (17)
The horizontal tab character (code position 9 in the HTML document character set) must be interpreted as the smallest positive number of spaces which will leave the number of characters so far on the line as a multiple of 8. Documents should not contain tab characters, as they are not supported consistently.
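
The tab rule can be worked through with a short sketch. The function name `expand_tabs` and its `tab_stop` parameter are chosen for this example only; the specification fixes the tab stop at 8.

```python
def expand_tabs(line, tab_stop=8):
    """Replace each tab with spaces up to the next multiple of tab_stop."""
    out = []
    column = 0  # number of characters so far on the line
    for ch in line:
        if ch == "\t":
            # Always at least one space, at most tab_stop spaces.
            spaces = tab_stop - (column % tab_stop)
            out.append(" " * spaces)
            column += spaces
        else:
            out.append(ch)
            column += 1
    return "".join(out)


print(repr(expand_tabs("ab\tc")))
```

A tab that follows two characters becomes six spaces, while a tab that follows exactly eight characters becomes a full eight spaces, since zero spaces would not be a positive number.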

Example of use:

<PRE>
Line 1.
       Line 2 is to the right of line 1.     <a href="abc">abc</a>
       Line 3 aligns with line 2.            <a href="def">def</a>
</PRE>

Example and Listing: XMP, LISTING

The XMP and LISTING elements are similar to the PRE element, but they have a different syntax. Their content is declared as CDATA, which means that no markup except the end-tag open delimiter-in-context is recognized (see 9.6 "Delimiter Recognition" of [SGML]). (18)

Since CDATA declared content has a number of unfortunate interactions with processing techniques and tends to be used and implemented inconsistently, HTML documents should not contain XMP or LISTING elements -- the PRE tag is more expressive and more consistently supported.

The LISTING element should be rendered so that at least 132 characters fit on a line. The XMP element should be rendered so that at least 80 characters fit on a line but is otherwise identical to the LISTING element. (19)

Address: ADDRESS

The ADDRESS element contains such information as address, signature and authorship, often at the beginning or end of the body of a document.

Typically, the ADDRESS element is rendered in an italic typeface and may be indented.

Example of use:

<ADDRESS>
Newsletter editor<BR>
J.R. Brown<BR>
JimquickPost News, Jimquick, CT 01234<BR>
Tel (123) 456 7890
</ADDRESS>

Block Quote: BLOCKQUOTE

The BLOCKQUOTE element contains text quoted from another source.

A typical rendering might be a slight extra left and right indent, and/or italic font. The BLOCKQUOTE typically provides space above and below the quote.

Single-font rendition may reflect the quotation style of Internet mail by putting a vertical line of graphic characters, such as the greater than symbol (>), in the left margin.
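
A single-font rendition in the mail style might be produced as in the sketch below. The line width of 60, the `"> "` prefix, and the function name are assumptions made for illustration; this specification does not prescribe them.

```python
import textwrap


def render_blockquote(text, width=60, prefix="> "):
    """Wrap quoted text and mark every line in the left margin."""
    paragraph = " ".join(text.split())  # collapse line breaks and extra spaces
    lines = textwrap.wrap(paragraph, width=width,
                          initial_indent=prefix, subsequent_indent=prefix)
    return "\n".join(lines)


QUOTE = """Soft you now, the fair Ophelia. Nymph, in thy orisons, be all
my sins remembered."""

print(render_blockquote(QUOTE, width=40))
```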

Example of use:

I think the play ends
<BLOCKQUOTE>
<P>Soft you now, the fair Ophelia. Nymph, in thy orisons, be all 
my sins remembered.
</BLOCKQUOTE>
but I am not sure.

List Elements

HTML includes a number of list elements. They may be used in combination; for example, an OL may be nested in an LI element of a UL.

The COMPACT attribute suggests that a compact rendering be used.

Unordered List: UL, LI

The UL represents a list of items -- typically rendered as a bulleted list.

The content of a UL element is a sequence of LI elements. For example:

<UL>
<LI>First list item
<LI>Second list item
 <p>second paragraph of second item
<LI>Third list item
</UL>

Ordered List: OL

The OL element represents an ordered list of items, sorted by sequence or order of importance. It is typically rendered as a numbered list.

The content of an OL element is a sequence of LI elements. For example:

<OL>
<LI>Click the Web button to open URI window.
<LI>Enter the URI number in the text field of the Open URI 
window. The Web document you specified is displayed.
  <ol>
   <li>substep 1
   <li>substep 2
  </ol>
<LI>Click highlighted text to move from one link to another.
</OL>

Directory List: DIR

The DIR element is similar to the UL element. It represents a list of short items, typically up to 20 characters each. Items in a directory list may be arranged in columns, typically 24 characters wide.

The content of a DIR element is a sequence of LI elements. Nested block elements are not allowed in the content of DIR elements. For example:

<DIR>
<LI>A-H<LI>I-M
<LI>M-R<LI>S-Z
</DIR>

Menu List: MENU

The MENU element is a list of items with typically one line per item. The menu list style is typically more compact than the style of an unordered list.

The content of a MENU element is a sequence of LI elements. Nested block elements are not allowed in the content of MENU elements. For example:

<MENU>
<LI>First item in the list.
<LI>Second item in the list.
<LI>Third item in the list.
</MENU>

Definition List: DL, DT, DD

A definition list is a list of terms and corresponding definitions. Definition lists are typically formatted with the term flush-left and the definition, formatted paragraph style, indented after the term.

The content of a DL element is a sequence of DT elements and/or DD elements, usually in pairs. Multiple DT elements may be paired with a single DD element. Documents should not contain multiple consecutive DD elements.

Example of use:

<DL>
<DT>Term<DD>This is the definition of the first term.
<DT>Term<DD>This is the definition of the second term.
</DL>

If the DT term does not fit in the DT column (typically one third of the display area), it may be extended across the page with the DD section moved to the next line, or it may be wrapped onto successive lines of the left hand column.

The optional COMPACT attribute suggests that a compact rendering be used, because the list items are small and/or the entire list is large.

Unless the COMPACT attribute is present, an HTML user agent may leave white space between successive DT, DD pairs. The COMPACT attribute may also reduce the width of the left-hand (DT) column.

<DL COMPACT>
<DT>Term<DD>This is the first definition in compact format.
<DT>Term<DD>This is the second definition in compact format.
</DL>

Phrase Markup

Phrases may be marked up according to idiomatic usage, typographic appearance, or for use as hyperlink anchors.

User agents must render highlighted phrases distinctly from plain text. Additionally, EM content must be rendered as distinct from STRONG content, and B content must be rendered as distinct from I content.

Phrase elements may be nested within the content of other phrase elements; however, HTML user agents may render nested phrase elements indistinctly from non-nested elements:

plain <B>bold <I>italic</I></B> may be rendered 
the same as plain <B>bold </B><I>italic</I>

Idiomatic Elements

Phrases may be marked up to indicate certain idioms. (20)

Citation: CITE

The CITE element is used to indicate the title of a book or other citation. It is typically rendered as italics. For example:

He just couldn't get enough of <cite>The Grapes of Wrath</cite>.

Code: CODE

The CODE element indicates an example of code, typically rendered in a mono-spaced font. The CODE element is intended for short words or phrases of code; the PRE block structuring element (section Preformatted Text: PRE) is more appropriate for multiple-line listings. For example:

The expression <code>x += 1</code>
is short for <code>x = x + 1</code>.

Emphasis: EM

The EM element indicates an emphasized phrase, typically rendered as italics. For example:

A singular subject <em>always</em> takes a singular verb.

Keyboard: KBD

The KBD element indicates text typed by a user, typically rendered in a mono-spaced font. This is commonly used in instruction manuals. For example:

Enter <kbd>FIND IT</kbd> to search the database.

Sample: SAMP

The SAMP element indicates a sequence of literal characters, typically rendered in a mono-spaced font. For example:

The only word containing the letters <samp>mt</samp> is dreamt.

Strong Emphasis: STRONG

The STRONG element indicates strong emphasis, typically rendered in bold. For example:

<strong>STOP</strong>, or I'll say "<strong>STOP</strong>" again!

Variable: VAR

The VAR element indicates a placeholder variable, typically rendered as italic. For example:

Type <SAMP>html-check <VAR>file</VAR> | more</SAMP>
to check <VAR>file</VAR> for markup errors.

Typographic Elements

Typographic elements are used to specify the format of marked text.

Typical renderings for idiomatic elements may vary between user agents. If a specific rendering is necessary -- for example, when referring to a specific text attribute as in "The italic parts are mandatory" -- a typographic element can be used to ensure that the intended typography is used where possible. (21)

Bold: B

The B element indicates bold text. Where bold typography is unavailable, an alternative representation may be used.

Italic: I

The I element indicates italic text. Where italic typography is unavailable, an alternative representation may be used.

Teletype: TT

The TT element indicates teletype (monospaced) text. Where a teletype font is unavailable, an alternative representation may be used.

Anchor: A

The A element indicates a hyperlink anchor (see section Hyperlinks). At least one of the NAME and HREF attributes should be present. Attributes of the A element:

HREF

gives the URI of the head anchor of a hyperlink.

NAME

gives the name of the anchor, and makes it available as a head of a hyperlink.

TITLE

suggests a title for the destination resource (advisory only). The TITLE attribute may be used:

for display prior to accessing the destination resource, for example, as a margin note or in a small box while the mouse is over the anchor, or while the document is being loaded;
for resources that do not include a title, such as graphics, plain text and Gopher menus, for use as a window title.

REL

The REL attribute gives the relationship(s) described by the hyperlink. The value is a whitespace separated list of relationship names. The semantics of link relationships are not specified in this document.

REV

same as the REL attribute, but the semantics of the relationship are in the reverse direction. A link from A to B with REL="X" expresses the same relationship as a link from B to A with REV="X". An anchor may have both REL and REV attributes.

URN

specifies a preferred, more persistent identifier for the head anchor of the hyperlink. The syntax and semantics of the URN attribute are not yet specified.

METHODS

specifies methods to be used in accessing the destination, as a whitespace-separated list of names. The set of applicable names is a function of the scheme of the URI in the HREF attribute. For reasons similar to those for the TITLE attribute, it may be useful to include the information in advance in the link. For example, the HTML user agent may choose a different rendering as a function of the methods allowed; something that is searchable, for instance, may get a different icon.

Line Break: BR

The BR element specifies a line break between words (see section Characters, Words, and Paragraphs). For example:

<P> Pease porridge hot<BR>
Pease porridge cold<BR>
Pease porridge in the pot<BR>
Nine days old.

Horizontal Rule: HR

The HR element is a divider between sections of text; typically a full width horizontal rule or equivalent graphic. For example:

<HR>
<ADDRESS>February 8, 1995, CERN</ADDRESS>
</BODY>

Image: IMG

The IMG element refers to an image or icon via a hyperlink (see section Simultaneous Presentation of Image Resources).

HTML user agents may process the value of the ALT attribute as an alternative to processing the image resource indicated by the SRC attribute. (22)

Attributes of the IMG element:

ALIGN

alignment of the image with respect to the text baseline.

`TOP' specifies that the top of the image aligns with the tallest item on the line containing the image.
`MIDDLE' specifies that the center of the image aligns with the baseline of the line containing the image.
`BOTTOM' specifies that the bottom of the image aligns with the baseline of the line containing the image.

ALT

text to use in place of the referenced image resource, for example due to processing constraints or user preference.

ISMAP

indicates an image map (see section Image Maps).

SRC

specifies the URI of the image resource. (23)

Examples of use:

<IMG SRC="triangle.xbm" ALT="Warning:"> Be sure 
to read these instructions.

<a href="http://machine/htbin/imagemap/sample">
<IMG SRC="sample.xbm" ISMAP>
</a>
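
One way a text-only user agent might process ALT in place of the image resource is sketched below. The class and function names are hypothetical, and the choice to emit nothing for an image without ALT is an assumption of this example rather than a requirement stated here.

```python
from html.parser import HTMLParser


class AltTextRenderer(HTMLParser):
    """Render a fragment as text, substituting ALT values for images."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:  # images without ALT contribute no text in this sketch
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def render_text_only(fragment):
    renderer = AltTextRenderer()
    renderer.feed(fragment)
    renderer.close()
    # Collapse runs of white space, including line breaks, to single spaces.
    return " ".join("".join(renderer.parts).split())


print(render_text_only('<IMG SRC="triangle.xbm" ALT="Warning:"> Be sure\n'
                       'to read these instructions.'))
```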
