W3C NOTE-XSL-and-CSS-19980911


Using XSL and CSS together

W3C Note, 11 September 1998

This version
http://www.w3.org/TR/1998/NOTE-XSL-and-CSS-19980911
Latest version
http://www.w3.org/TR/NOTE-XSL-and-CSS
Authors
Håkon Lie, Bert Bos

Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Status of this document

This document is a NOTE made available by the W3 Consortium for discussion purposes. This indicates no endorsement of its content, nor that the Consortium has allocated, is allocating, or will allocate any resources to the issues addressed by the NOTE.

The note is published in the hope that it may provide a useful viewpoint for understanding the relation between various Web specifications.

Comments should be sent to the authors.

Abstract

This W3C Note describes how XSL [1] and CSS [2] can be used together. In particular, it discusses how XSL can be used as a bridge between complex XML-based documents and the CSS formatting model. It gives an outline of a system for displaying documents in XML-based formats as human-readable, or human-audible, text. To use the CSS properties in the language of XSL, it is necessary to invent an XML-based syntax, compatible with XSL, to represent CSS's properties. No new CSS properties, or other formatting semantics, are defined in this document.

Introduction

CSS is a powerful and easy-to-use formatting language. The two levels defined to date, CSS1 and CSS2, offer a wealth of formatting properties, and the next level promises to add even more. CSS is implemented by many programs, and the experience from those implementations is being fed back into the development of more advanced formatting properties.

CSS, however, is only a formatting language: it attaches style properties to the elements of a source document. It lacks facilities commonly found in report generators, mail-merge programs, etc., for massaging a set of data into a human-readable format. It assumes that process has been done by an external program. In effect, that is how much of the information on the Web today is produced: information from a database at the server side is extracted and put into an HTML template, which is sent to a client (browser) and formatted and displayed.

With the advent of XML, the expectation is that in many cases the original data, rather than the HTML representation of it, will be sent to the client. This gives the client a richer data-set to work with, but data transformations may be necessary. XSL will be able to perform these transformations.

We can see several ways of using XSL and CSS together:

  1. Using XSL on the server to transform XML data into HTML documents with CSS style sheets. This has the benefit of being backwards compatible with a large installed base of User Agents. In the short term, this is likely to be the most common combination of XSL and CSS. Due to the well-known semantics of HTML, this is also the best way to ensure that information in lesser-known XML formats is accessible (e.g., to people using different output media).
  2. Using XSL on the server to transform XML data into XML documents with CSS style sheets. XML, unlike HTML, comes with no formatting conventions and will always need a style sheet to be displayed. This method requires new functionality in User Agents. Unless there are XSL sheets provided for different media, this may lead to accessibility problems.
  3. Using XSL to generate HTML/CSS on the client side. The content is passed through HTML/CSS to take advantage of current implementations, but is never made available in this form. This method requires new functionality in User Agents.
  4. Transform directly to "CSS formatting objects". Compared to the previous method, this method is more direct as the content isn't converted to/from HTML. It requires new functionality in User Agents. Unless there are XSL sheets provided for different media, this may lead to accessibility problems.

This note only concerns itself with the last method. This document shows how the set of "CSS objects" might be defined.

XSL basics

The XSL language is still under development. At the time of writing, it is a W3C Working Draft. All syntax shown here is therefore tentative, and only meant to introduce the concepts.

The bulk of an XSL sheet is a series of pattern-action rules. The patterns are similar to CSS's selectors (in function, not necessarily in syntax), but the action part may create an arbitrary number of "objects." The action part of the rule is called the "template" in XSL, and a template and a pattern together are referred to as a "template rule."

An author of an XSL sheet selects a set of objects suitable for the task at hand. The set of objects could be anything for which a specification exists that defines their syntax inside XSL templates; below we show how that specification might look for CSS. The objects need not be formatting objects: they could, e.g., be objects that create SMIL elements or RDF elements. In principle, when an XML syntax for a data format already exists, it should be fairly easy to derive an XSL template format from that.

An XSL sheet looks like an XML document with a mixture of two kinds of elements: those defined by XSL and those defined by the object language.

The result of applying all matching patterns to a document recursively is a tree of objects. The resulting tree of objects is then interpreted, top-down, according to the definition of each object. If they are (hypothetical) HTML objects, they will produce an HTML document, probably one HTML element for each HTML object. If they are, as in this note, CSS objects, they will produce a certain rendering on screen or in some other medium.

To give a simple example, the template rule below shows how one XML element ("partnumber") expands to a series of CSS objects, with the content of the XML element expanded inside it. "Process-children" indicates the place where the content of partnumber should be put. The elements "template" and "process-children" are defined by XSL; "chunk" is a CSS object (defined by this note). The non-XSL elements must be prefixed with a short string that ends in a colon, to distinguish them from XSL keywords; in this note we've used "css:".

<template match="partnumber">
  <css:chunk display="block"
            font-weight="bold"
            margin-top="20px">
    <css:chunk display="inline"
              color="red">
      Part number:<css:space/>
    </css:chunk>
    <css:chunk color="green">
      <process-children/>
    </css:chunk>
  </css:chunk> <!-- end of block -->
</template>

CSS also supports aural renderings. A similar template that produces CSS objects for audio output might be:

<template match="partnumber">
  <css:chunk speak="normal"
            voice-family="female"
            cue-before="partnumber-jingle.au"
            pause-after="15ms">
    Part number:
    <process-children/>
  </css:chunk>
</template>

These examples only serve to give the flavor of XSL. XSL supports much more powerful transformations than these two examples show.
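
To make the pattern-action idea more concrete, the first partnumber rule above can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the tuple representation of objects, the names RULES, default_rule and apply_rules, and the pass-through behavior for unmatched elements are all invented for illustration and are not part of XSL or of this note.

```python
import xml.etree.ElementTree as ET

# Hypothetical object representation: (name, properties, children).
# Children are either strings (text) or nested objects.

def partnumber_rule(element, process_children):
    """Mimics the visual 'partnumber' template rule from the note."""
    return ("chunk",
            {"display": "block", "font-weight": "bold", "margin-top": "20px"},
            [("chunk", {"display": "inline", "color": "red"}, ["Part number: "]),
             ("chunk", {"color": "green"}, process_children(element))])

# Pattern -> template. Real XSL patterns are richer than element names.
RULES = {"partnumber": partnumber_rule}

def default_rule(element, process_children):
    # Assumption: elements without a matching rule pass their content through.
    return ("chunk", {}, process_children(element))

def apply_rules(element):
    """Apply the matching rule to an element, recursing via process_children."""
    def process_children(el):
        out = []
        if el.text and el.text.strip():
            out.append(el.text.strip())
        for child in el:
            out.append(apply_rules(child))
            if child.tail and child.tail.strip():
                out.append(child.tail.strip())
        return out
    rule = RULES.get(element.tag, default_rule)
    return rule(element, process_children)

source = ET.fromstring("<item><partnumber>A-123</partnumber></item>")
tree = apply_rules(source)
```

The result is a tree of objects whose root corresponds to the item element and whose single child is the block chunk generated for partnumber. Interpreting that tree (rendering it) is a separate step.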

Declaring the CSS objects

As explained above, XSL is designed to be used with different sets of objects. The result-ns attribute at the top of the XSL sheet declares the short string that is used as the prefix (we've chosen "css" in this note, but we could have used "fo", or "r", or anything else), and another attribute then binds the prefix to the definition of the objects:

<stylesheet
    result-ns="css"
    xmlns:css="http://www.w3.org/TR/XSL-for-CSS">
  ... rest of sheet, with CSS objects...
</stylesheet>
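
As a rough illustration of how a processor might resolve that declaration, the following Python sketch reads the result-ns attribute from the root element and looks up the URI bound to that prefix. The function name result_namespace and the sample sheet content are assumptions made for this example; only the result-ns attribute and the xmlns:css binding come from the note.

```python
import io
import xml.etree.ElementTree as ET

# A minimal sheet using the tentative syntax from the note.
SHEET = b"""<stylesheet
    result-ns="css"
    xmlns:css="http://www.w3.org/TR/XSL-for-CSS">
  <template match="para">
    <css:chunk display="block"/>
  </template>
</stylesheet>"""

def result_namespace(sheet_bytes):
    """Return (prefix, namespace URI) declared for the result objects."""
    prefixes = {}
    root = None
    events = ET.iterparse(io.BytesIO(sheet_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload      # namespace declaration seen
            prefixes[prefix] = uri
        elif root is None:
            root = payload             # first "start" event is the root element
    prefix = root.get("result-ns")
    return prefix, prefixes.get(prefix)

prefix, uri = result_namespace(SHEET)
```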

Note: as explained earlier, the syntax of XSL is still being developed. Although there will be ways to write selectors (patterns) and templates, and to declare the set of objects, the syntax will not be frozen until XSL is issued as a W3C Recommendation.

Principles for creating the CSS objects

CSS doesn't have an XML syntax, which makes defining its XSL template syntax slightly harder than it would be for, e.g., SMIL or MathML. Below are a few principles for the conversion, and some examples of the result.

CSS has properties like font and border, but also font-size and border-top, which allow small aspects of a font or border to be specified. The CSS objects for XSL could either maintain this redundancy, or allow only one way to set a property. To minimize the number of surprises for people using the CSS objects, allowing both the shorthand and the individual properties is probably advisable.

CSS properties become XML attributes in the XSL syntax. Some CSS properties (font-family, content) accept both quoted strings and keywords. In the XSL syntax that would become font-family="'Times Roman', serif", which invites errors. Some possible ways to avoid double quoting are given below.

The main CSS object is called chunk. A chunk has properties and usually some text content and/or embedded objects (often other chunks). If the output medium is visual, a chunk typically produces a single box, although it may also produce multiple boxes, if its 'display' property is 'inline', or no boxes at all, if 'display' is 'none'.

Some auxiliary objects may be necessary, either variants of chunks with extra functionality (e.g., anchor), or objects to get around restrictions on XSL syntax (e.g., switch). A chunk is reminiscent of the {...}-block of a normal CSS rule. For example:

{
  font-size: 10px;
  color: #FB9;
  text-indent: 1em
}

would become:

<css:chunk
  font-size="10px"
  color="#FB9"
  text-indent="1em">
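
The mapping is mechanical enough to sketch in Python. The helper below, declarations_to_chunk, is a hypothetical name, and it assumes a simple block of "property: value" declarations separated by semicolons, with no comments and no semicolons inside strings.

```python
from xml.sax.saxutils import quoteattr

def declarations_to_chunk(block, prefix="css"):
    """Turn a simple '{ prop: value; ... }' block into a chunk start tag."""
    body = block.strip().strip("{}")
    attrs = []
    for declaration in body.split(";"):
        if not declaration.strip():
            continue  # skip the empty piece after a trailing ';'
        name, _, value = declaration.partition(":")
        # quoteattr adds the surrounding quotes and escapes &, < and >.
        attrs.append(f"{name.strip()}={quoteattr(value.strip())}")
    return f"<{prefix}:chunk " + " ".join(attrs) + ">"

tag = declarations_to_chunk("{ font-size: 10px; color: #FB9; text-indent: 1em }")
```

Here tag holds the same start tag as above, written on a single line.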

Pseudo-classes

Pseudo-classes in CSS serve to select elements based on information other than what can be learned from the source document. Examples are ":active," ":visited," and ":hover." One way to handle them is with a switch object that contains chunks for all possible states, and let the renderer switch between the chunks, based on the truth of some condition attached to them. For example:

<css:switch text-decoration="underline"
            background="red"
            font-style="italic">
  <css:chunk condition="active | hover"
              color="...">...</css:chunk>
  <css:chunk condition="visited"
              color="...">...</css:chunk>
</css:switch>

The switch object contains the properties common to the alternatives, and each alternative has a condition attribute that contains the condition under which this chunk is displayed. (If there is more than one URL, a condition like "visited" also needs a way to indicate which URL is visited.)
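
How a renderer might choose among the alternatives can be sketched as follows. This is an illustration only: the function select_alternative, the reading of "a | b" as "a or b", the first-match-wins order, the fallback to the common properties, and the color values are all assumptions, since the note leaves these details open.

```python
def select_alternative(common, alternatives, state):
    """Pick the first alternative whose condition holds in `state`.

    common:       properties set on the switch object
    alternatives: list of (condition, properties) pairs, in document order
    state:        set of pseudo-class names currently true, e.g. {"hover"}
    """
    for condition, properties in alternatives:
        # Assumption: "a | b" means "a or b".
        names = {name.strip() for name in condition.split("|")}
        if names & state:
            # The alternative's properties are added to the common ones.
            return {**common, **properties}
    # Assumption: with no matching alternative, only the common properties apply.
    return dict(common)

common = {"text-decoration": "underline", "background": "red", "font-style": "italic"}
alternatives = [
    ("active | hover", {"color": "yellow"}),  # hypothetical color
    ("visited", {"color": "gray"}),           # hypothetical color
]
hovered = select_alternative(common, alternatives, {"hover"})
```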

Note that the "first-child" pseudo-class is handled by XSL's patterns directly.

Pseudo-elements

Pseudo-elements in CSS refer to parts of a displayed element for which there is, or can be, no mark-up in the source document. ":before" and ":after" are used to insert new elements where the source didn't have any, and ":first-letter" and ":first-line" refer to the first letter/line of a block box as actually displayed on the screen.

Since XSL templates allow an arbitrary number of objects to be created, the ":before" and ":after" are automatically catered for. The ":first-letter" and ":first-line" probably need something like the switch object above. For example:

<css:compound font="12pt Times"
              line-height="1.2"
              text-align="left">
  <css:first-line font-variant="small-caps"
                  color="green"/>
  <process-children/>
</css:compound>

The compound object is like a normal chunk, but it may have two special children, first-line and first-letter, that hold the properties of the pseudo-elements.

Page box

For paged media, CSS2 allows the characteristics of the pages to be described with @-rules. Since XSL has no place for global declarations (at least not in the August 1998 draft), the best place to put them is probably near the root of the generated object tree. An @page-rule might translate to a page object:

<template match="/">
  <css:page size="landscape"
            margin="1.5in 1in"
            marks="crop"/>
  <css:page name="left"
            .../>
  <css:page name="rotated"
            .../>
</template>

Media types

Selection of the output medium could be handled outside of XSL, as it is for CSS level 1. That means that to write an XSL sheet for two media, say print and screen, one has to write two sheets. They could still import a sheet with common rules.

Web-fonts

Another @-rule in CSS is the one for defining Web-fonts. These, again, should probably become objects that are attached close to the root of the object tree:

<template match="/">
  <css:font font-family="Pantani"
            panose1="4726402695"
            font-style="all" />
</template>

Replaced elements

Replaced elements could be handled with a replaced object, which has the combined attributes of the chunk and the object element from HTML:

<css:replaced src="/Icons/w3c_home.png"
              type="image/png"
              params="..."
              border="solid red"
              .../>

Interactive elements

Hyperlink source anchors and form elements have not only style, but also a behavior when a user activates them. In CSS that behavior doesn't need to be specified, since the displayed boxes on the screen have a direct link to elements in the source, and those elements come with their own semantics. Because of the transformation that takes place in XSL templates, that back-link is not available, and the transformation needs to carry any behavior information forward from the source elements to the generated objects. How this can best be done is still an open issue. Introducing objects like anchor (like chunk, but with an extra href attribute), and form, input, etc. may be a solution.

Counters & the content property

To make the syntax slightly easier to read, it may be possible to make counters into counter objects. XSL allows literal text to be inserted in the templates directly, obviating the need for the content property.

There are several possibilities for defining whitespace handling inside templates. The easiest seems to be to define that whitespace in templates is not significant, except as a separator between words. Another way to express this is to say that whitespace is collapsed: leading and trailing whitespace is removed, and any other sequence of whitespace characters is replaced by a single space. To get extra spaces, newlines or tabs, they have to be inserted explicitly. Below we will use space to insert a space and newline to insert a hard line break. XSL will probably provide a generic text object for that (somewhat similar to the pre of HTML), in which case space and newline can be defined as XSL macros.

<template match="fig">
  <css:chunk text-align="center">
    Figure
    <css:space/>
    <css:counter name="figno" style="upper-alpha"
                 font-weight="bold"
                 color="blue" />
    <css:newline/>
    <process-children/>
  </css:chunk>
</template>
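
The whitespace rule described in this section can be sketched in Python. The sentinels SPACE and NEWLINE stand in for the space and newline objects, and the names collapse_whitespace and render are hypothetical. Treating each run of literal text separately is an assumption of this sketch.

```python
def collapse_whitespace(text):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(text.split())

# Stand-ins for the explicit space and newline objects.
SPACE, NEWLINE = object(), object()

def render(parts):
    """Join template content; only SPACE and NEWLINE add explicit whitespace."""
    out = []
    for part in parts:
        if part is SPACE:
            out.append(" ")
        elif part is NEWLINE:
            out.append("\n")
        else:
            out.append(collapse_whitespace(part))
    return "".join(out)

# "A" stands for a counter rendered in the upper-alpha style.
text = render(["\n    Figure\n    ", SPACE, "A", NEWLINE, "  A   caption  "])
```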

XSL provides a predefined object that inserts the value of an attribute. Using that, the attr(...) function of CSS would be replaced by

<value-of expr="attribute(...)" />

Other string-valued properties

Font-family is another property that uses a mixture of quoted strings and keywords. It could be split into two. That would make the inheritance model different, but avoid quotes inside quotes:

<css:chunk font-family="Helvetica, gill sans"
            generic-font-family="sans-serif">...</css:chunk>

text-align also accepts strings; a similar split might be possible.
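
A sketch of the font-family split, assuming the five generic family keywords of CSS2 and a value in which no font name contains a comma. The function name split_font_family is hypothetical. If several generic keywords appear, this sketch simply keeps the last one.

```python
# The generic font families defined by CSS2.
GENERIC_FAMILIES = {"serif", "sans-serif", "cursive", "fantasy", "monospace"}

def split_font_family(value):
    """Split a CSS font-family value into (named families, generic family)."""
    named, generic = [], None
    for item in value.split(","):
        item = item.strip()
        if item in GENERIC_FAMILIES:
            generic = item                   # unquoted keyword
        elif item:
            named.append(item.strip("'\""))  # drop the CSS quotes
    return ", ".join(named), generic

families, generic = split_font_family("Helvetica, 'gill sans', sans-serif")
```

Note that a quoted name such as 'serif' is kept as a named family, because only the unquoted keyword matches the generic set.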

Specialized objects

It may lead to more readable sheets if a small number of convenience objects, in the form of "subclasses" or "curried" versions of chunk, are added. Obvious candidates are hr (horizontal rule, a chunk with certain border properties preset), block (a chunk with 'display' preset to 'block'), and inline (analogous for 'inline').

Here is an example of a template rule that uses those derived objects. It formats para elements from the source document as CSS block elements with a red pilcrow sign at the end.

<template match="para">
  <css:block margin-top="1.2em">
    <process-children />
    <css:inline color="#F00">¶</css:inline>
  </css:block>
</template>
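
The idea of "curried" objects has a direct analogue in Python's functools.partial. In the sketch below, chunk is a hypothetical constructor and the dictionary layout is an assumption. Property names use underscores because hyphens are not allowed in Python keyword arguments.

```python
from functools import partial

def chunk(*children, **properties):
    """Hypothetical constructor for a CSS chunk object."""
    return {"object": "chunk", "properties": properties, "children": list(children)}

# Convenience objects: chunk with the 'display' property preset.
block = partial(chunk, display="block")
inline = partial(chunk, display="inline")

# Rough counterpart of the para template, with literal text as the content.
para = block("Some text", inline("¶", color="#F00"), margin_top="1.2em")
```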

Acknowledgments

The authors wish to thank Tim Berners-Lee, James Clark, Martin Dürst, Chris Lilley, Vincent Quint and Steve Zilles for their comments on this Note. Still, any views expressed in this Note are solely those of the authors.

References

[1]
Extensible Style Language, XSL, is under development within W3C. Current plans indicate that XSL will become a W3C Proposed Recommendation in the middle of 1999. This note is based on the 18 August 1998 draft (http://www.w3.org/TR/1998/WD-xsl-19980818).
[2]
Cascading Style Sheets, CSS, is defined in two W3C Recommendations, Level 1 (http://www.w3.org/TR/REC-CSS1) and Level 2 (http://www.w3.org/TR/REC-CSS2).