This report is based on notes taken during the workshop. There is also the original agenda, a list of participants and a report on day 2.
Jean-François delivered the opening statement, in which he talked a little about the history of W3C in Europe, which started only on April 7, 1995, and about the fact that this workshop, besides being important for its content, was also the first activity of its kind organized by W3C in Europe.
As chairman of the workshop, Steven described the structure of the workshop, his own role in it, and some of the goals, including two that he found especially important: avoiding chaos on the Web and ensuring device-independence.
David described himself as a `radical designer', i.e., one that doesn't care about style sheets but about results. He would use whatever means were available to achieve the desired appearance.
He stressed the importance of `the market' and referred to a book by Geoffrey Moore, Crossing the Chasm. One should look at examples in history, such as the T-Ford, which eventually lost its popularity because its availability in only a single color (black) drove people away. Similarly, cellular phones, once also available in dignified black only, are now on the street in all colors and shapes and are a huge success.
The Web has three tasks, which partly overlap: information, exchange, and entertainment. The latter is David's area of interest. A stylized graph of the number of people that accept Web technology in a particular period shows the form of a bell curve: at first, growth is small, then it starts to rise until it reaches a peak, after which only `late adopters' join in and the growth gradually goes back to zero again. Halfway on the up-slope is the year 1995, which has seen huge growth at least in part because of the `Netscape extensions', which, after all, primarily increase the entertainment value of the Web.
For a Web-site that focuses on entertainment (or `experience'), a 3-act paradigm is a useful, if not the only viable model: first the people have to be drawn into the single entrance, then they can explore the interior, finally they exit through the single exit.
Eventually, David estimates, information will be only a small part of the Web, exchange is three times as large, and entertainment will be three times larger again.
David proposes a two-level approach to HTML (and related) development: first develop HTML 3.0 `Lite', which lays down the foundations, then allow the market to develop HTML 3.0 `Pro', which will mainly be for entertainment. The latter will never be fixed, since the market will demand something new all the time, and as quickly lose interest in old things.
In the entertainment field there won't be a dominant party, because demands change so fast. In the exchange area there will be a huge battle between software vendors. On the information side, the W3C can be the main player, with tasks such as defining HTML and style sheets.
A designer knows better than a reader how things should be presented, except of course for special needs, such as those of color-blind people.
HTML and style sheets should ideally provide a design grid, because tables aren't enough; different kinds of containers can have different functions on the screen; standard fonts should be the same size on every platform. The designer wants control over every pixel, but also over absolute length: 12pt should be 1/6 inch independent of resolution.
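As a rough illustration of the resolution independence asked for here, the following Python sketch converts an absolute length in points into device pixels. It assumes the common DTP convention of 72 points to the inch (so that 12pt is 1/6 inch); the function names are made up for this example and were not part of the talk.

```python
POINTS_PER_INCH = 72  # assumption: the DTP convention of 72 points to the inch


def points_to_inches(points: float) -> float:
    """Absolute length in inches, independent of any device."""
    return points / POINTS_PER_INCH


def points_to_pixels(points: float, dpi: float) -> int:
    """Device pixels needed to render `points` at a resolution of `dpi`."""
    return round(points_to_inches(points) * dpi)
```

The same 12pt length thus needs a different number of pixels on each device, which is why a size given in pixels alone cannot guarantee a fixed physical size.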
Control of colors is very difficult, but at least the designer should be able to specify which images and which colors in those images are the most important. An explicit load order for all images on a page could help, as could control over dithering.
Some of the technical people in the audience remarked that on-the-fly changes to color maps and dither matrices are in general impossible. The most that can be achieved is a dither method for the whole of the window and possibly a toggle to turn dithering off per image.
Anti-aliasing is badly needed. David estimates that he currently spends a third of his time anti-aliasing fonts by hand.
Unlimited overlays are also a necessity. Although some of the effects can be achieved by preparing separate images and using clickable image maps, an overlay would give much more control, plus the opportunity to reuse images and text fragments.
Even more effort can be saved if some image operations can be specified in the style, such as tinting of a gray image with a particular color, rotations by 90°, and flops.
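The image operations mentioned here are simple enough to sketch. The Python fragment below is only an illustration, not a proposal for style sheet syntax: it assumes a gray image stored as rows of values from 0 to 255, takes `flop' to mean a left-to-right mirror, and uses hypothetical function names.

```python
def tint(gray, color):
    """Tint a gray image with an (r, g, b) color; white becomes the full color."""
    r, g, b = color
    return [[(v * r // 255, v * g // 255, v * b // 255) for v in row]
            for row in gray]


def rotate90(image):
    """Rotate an image clockwise by 90 degrees."""
    return [list(column) for column in zip(*image[::-1])]


def flop(image):
    """Mirror an image left to right."""
    return [row[::-1] for row in image]
```

With operations like these available in the style, one stored gray image could be reused in several colors and orientations instead of being prepared by hand each time.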
HTML Pro will not be written by hand. Its function is best described by the advice to the reader: sit back and enjoy. Pages come automatically and there will be video in the background (yes, people will create unusable pages with it, but a designer will know what to do with it). The goal is user experience, i.e., entertainment, not information.
Control over all aspects of whitespace is essential, likewise over rules. Text wrapping around non-rectangular areas will be in HTML Pro as well.
For the immediate future, the Consortium should start by creating (or acquiring) a set of good, hinted screen fonts, in serif, sans and mono-spaced form, and make it available for free on every platform. Such a font may actually be on offer now, for an affordable price and without copyright restrictions.
DSSSL contains both a transformation language and a formatting language. Originally the transformation was needed to make certain kinds of styles possible (such as tables of contents). The query language now takes care of that, but the transformation language survives because it is useful in its own right.
DSSSL shouldn't constrain the style and it shouldn't need extra mark-up in the SGML document. The DSSSL output is described geometrically and it can be created both for on-line presentation and for print. The standard currently has more support for print.
Both simple and complex designs should be possible, and the styles should be suitable for batch formatting as well as interactive applications. Existing systems should be able to support DSSSL with only minimal changes (a DSSSL parser is obviously needed).
The layout should be described precisely, up to a certain point. Page-fidelity cannot be a goal. Algorithms, such as for line breaking, are not specified by DSSSL.
Internationalization is a goal, though there are still some problems, notably with Arabic.
The language is strictly declarative, which is achieved by adopting a functional subset of Scheme. Interactive style sheet editors must be possible.
Of course, DSSSL conforms to, or works with, the relevant standards, such as SGML, HyTime and SPDL, but it doesn't depend on any of them. This also helps marketing: DSSSL acceptance should be a result of real implementations; its development is therefore driven more by needs than elegance.
The formatting model is based on flow objects. Each flow object produces one or more rectangular areas. (How many and how they look is determined by a back-end, not by DSSSL.) There are two kinds: inline and displayed, and there are several types of each.
There are a few dozen flow object types (or `classes'), each with its own set of characteristics, a few hundred characteristics in all. The set of classes is open ended: new ones can be added. A mechanism based on public identifiers (like in SGML) is used for this. Characteristics are always attached to a flow object, never directly to the areas that the back-end formatter may produce.
A DSSSL style sheet very precisely describes a function from SGML to a flow object tree. It allows partial style sheets to be combined (`cascaded' as in CSS): some rule may override some other rule, based on implicit and explicit priorities, but there is no blending between conflicting styles.
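The `override, never blend' behavior can be shown with a small Python sketch. It is not DSSSL syntax and not taken from the talk: rules are assumed to be plain tuples of (element, characteristic, value, priority), and the tie-break (the later rule wins among equal priorities) is an assumption made only for this example.

```python
def resolve(rules, element):
    """Return the winning value of each characteristic for `element`.

    For every characteristic exactly one rule wins: the one with the
    highest priority, or (assumed here) the latest one among equals.
    Values of conflicting rules are never mixed.
    """
    winners = {}  # characteristic -> ((priority, order), value)
    for order, (target, characteristic, value, priority) in enumerate(rules):
        if target != element:
            continue
        key = (priority, order)
        if characteristic not in winners or key > winners[characteristic][0]:
            winners[characteristic] = (key, value)
    return {name: value for name, (_, value) in winners.items()}
```

Two rules that set the font size of a heading to 24 and 18 therefore yield 24 or 18, depending on priority, but never something in between.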
DSSSL Lite is a subset of DSSSL: it is DSSSL with a particular choice of features. There are some extensions in the form of new flow objects. DSSSL Lite should be well suited for on-line display.
Creating a good, extensible style language is hard!
The definition in terms of flow objects, as an intermediate step between the document and the final output, is a very useful one.
Somebody in the audience asked about implementations. Nothing fixed is known about this, though some vendors are working on it. Glenn Adams said that his company, Stonehand, thinks of using the DSSSL flow object model, but with CSS as a language.
EBT currently uses its own style language, but it thinks the language is also compatible with the flow object idea.
One reason for developing CSS is certainly `damage control': avoiding an uncontrolled growth of HTML extensions. But CSS has other goals as well.
CSS should be a simple language, readable and writable after looking at a few examples. Yet it should have power comparable to the average DTP product, maybe with some limits caused by its focus on on-line presentation. E.g., it needs no columns, at least on the screen. CSS supports stream-based (or `incremental') formatting where possible.
In accordance with the Web philosophy, CSS offers both readers and authors control over the style, with the same language for both. But to protect authors, they can assign a priority to their styles, so that readers can only change the style by turning off the complete style sheet.
A style sheet can describe styles for multiple media, e.g., not only screen, but also paper and speech.
An immediate goal of level 1 is to replicate at least all the functionality that is currently available in various browser-dependent `extensions.'
The approximate time line until now is:
A style sheet is a collection of rules, each with three parts: a selector, a property and a value. The selector matches an HTML element by GI, class, or ID. For matching visited/unvisited links the selector can include `pseudo-classes'. Matching can also be restricted by an element's context (i.e., ancestors).
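How such a selector might be matched against an element can be sketched in Python. This is a simplified illustration, not the CSS draft itself: the class names `Element` and `Simple` are hypothetical, pseudo-classes are left out, and a contextual selector is assumed to be a list of simple selectors with the outermost ancestor first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    gi: str                              # generic identifier, e.g. "EM"
    cls: Optional[str] = None            # CLASS attribute, if any
    id: Optional[str] = None             # ID attribute, if any
    parent: Optional["Element"] = None


@dataclass
class Simple:
    gi: Optional[str] = None
    cls: Optional[str] = None
    id: Optional[str] = None

    def matches(self, el: Element) -> bool:
        """A part that is left out (None) matches anything."""
        return ((self.gi is None or self.gi == el.gi)
                and (self.cls is None or self.cls == el.cls)
                and (self.id is None or self.id == el.id))


def selector_matches(selector, el: Element) -> bool:
    """Match the last simple selector on `el`, the others on its ancestors."""
    *context, subject = selector
    if not subject.matches(el):
        return False
    ancestor = el.parent
    for wanted in reversed(context):
        while ancestor is not None and not wanted.matches(ancestor):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True
```

A rule would then pair such a selector with a property and a value, and apply to every element for which `selector_matches` returns true.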
The style language allows a lot of effect with just a few lines. The formatting model is based on boxes that stack on top of each other.
A simple interactive style editor (using HTML forms) is running.
The blink tag scared some people, the font tag scared some more. It seems there is no leader anymore and the result is frustration. But for a new style standard to be accepted, it is necessary that more browsers support it, that cross-platform font problems are solved, and that there is reference code available.
It also helps to have a style editor and conversion tools to convert between DTP programs and CSS.
The level 1 definitions should be finished this year. The Arena browser will be updated to support it. We will be able to show that CSS isn't hard, not for designers and not for implementers.
To further ease acceptance we could introduce the style attribute for every element of HTML, allowing in-line style specifications in a somewhat cleaner and more powerful way than with ad-hoc attributes/elements.
One of the goals of the research in which Cécile partakes is a WYSIWYG editor for structured documents. It supports interactive formatting and there has already been some experience with it.
The presentation is defined by two things, called the style and the layout. The style contains the formatting properties that depend mainly on the logical structure (elements and attributes) while the layout concerns the physical appearance. The style includes font, font size, color, etc; the layout describes pages, columns, etc.
The P language allows automatic formatting and helps to maintain a homogeneous presentation. Characteristics of the style and the layout can be specific or generic, and context dependent or inherited.
Typographical properties depend on both the logical and the physical structure. For interactivity that leads to some constraints. Formatting must be possible in an incremental way and there are some device dependencies.
P supports complex structures such as tables, mathematical formulas and trees. It can also offer multiple views of the same document. A separate language, called `S', is used to define the structure of a document. (S is not SGML).
The Thot editor (formerly Grif, but Grif is now a commercial product) is based on P. Its formatting process is split up into several steps, starting with assignment of properties to logical cells, line breaking, and so forth. The underlying model is one of boxes that can be lined up or nested in several ways. Each box has an origin and a size, plus four axes: the vertical middle, the horizontal middle, the horizontal reference axis, and the vertical reference axis.
To line up the boxes, P offers elastic and fixed dimensions. There are also special boxes for decorations.
Several types of counters can be defined and used to generate text. Boxes may be filled with generated text and/or text from the document. Characteristics can be made conditional on counters and other things.
The style produces an intermediate physical structure, called abstract picture (because style properties are partially evaluated) which is similar to the flow object tree of DSSSL.
There will probably be an attempt to implement CSS in P. The plan is to produce an editor/browser provisionally called `Tamaya' that handles HTML and CSS and is fully WYSIWYG.
Formatting with P relies on a rich box model and separates the structure and the style. The result is portability, re-use of styles, and the possibility to have multiple views for one document.
For CSS, Cécile recommends keeping it simple. Nevertheless, good style editors are needed for its acceptance.
The Web is in danger of going to hell, according to Kevin.
Originally, SGML was meant to convey content and structure, not presentation. The same with the Web. And it did catch on because content was so easy to make.
Unfortunately, content gets boring eventually and that's when the extensions were introduced. But they mix content and presentation, and they are also non-standard.
A quick count of the speed with which HTML is breaking down: Netscape 1.1 introduced 4 non-standard tags and 23 attributes, Microsoft another 3 tags and 7 attributes, and so it goes on.
Not only are new tags being invented, people also abuse existing tags for the wrong purpose. Multiple title elements in a document cause some browsers to display different titles while the document is being loaded. Tables are used for layout purposes instead of for tabular data. Blank images are used to introduce whitespace, etc.
The conclusion must be that to most users SGML means nothing and that, therefore, the original purpose of the Web has failed.
Style sheets can save the Web. But they must be suitable for both manual and automatic creation. They must be easy to learn and author, and follow an intuitive model. A style sheet must be compact, so that people have a reasonable excuse for not using extensions. Some minimum level must be standardized and supported everywhere.
The language is constrained by the tools in use today: simple text editors combined with the HTML mark-up language. The future will be different: there will be VRML systems, style editors, etc.
Some people have predicted that `browsers will be the OS's of tomorrow'. This is wrong! The current browser-centric view must be turned into a media-centric view. Instead of a monolithic browser, we need to have portable, mobile code. We also need two-way communication, including automatic push & pull of documents. Eventually, the user will be creating as well as experiencing.
Today's formatting is based on boxes. Netscape's frames are a first step towards different media in cells of a grid (simple compound documents). What's needed are much more powerful frames, allowing multiple layers and relationships between them.
Text should be flowing automatically from frame to frame. Different media can be `stitched' together in different ways, they can be merged, as well as divided. One such `stitch' (relationship) is frames layered on top of each other. Of course, the user should be allowed to manipulate the layers on-line.
There are dozens of geometric and other operations that could be applied to images or frames: synchronize, crop, resize, rotate, tint, etc. The important message, however, is that HTML should not be used as a language for frames, layers, or hypermedia glue.
What's needed is an HTML that is independent of resolution, independent of hardware, independent of presentation, and independent of media.
The Web must allow generic media parts. It should be modular, but with rich relationships between the modules. Together the modules will form `ecologies'.
A module has six levels: apart from the relationships, there will be a summary, logic parts (programs), presentations (style sheets), structure (SGML), and content. Modules `communicate' with each other on each of these levels.
Communication between server (database) and user is not enough; there will also have to be communication directly between users.
Media can be linked together in multiple ways.
A `home page' should really be a collection of media. Browsers should be modifiable. But this requires everybody to stop hacking and start again with a good design. Style sheets are part of that design.
The web should eventually see universal access, but also universal authoring.
Dave started with a list of requirements that a style sheet language on the Web had to meet, without trying to define a concrete syntax.
Style sheets must be easy to learn and use. And the writer must have the option of defining the style either in-line in an (HTML) document, in the document's header, or externally in a separate, linked file.
The style language must be suitable for use in a WYSIWYG editor for HTML documents and/or styles and it must be easy to implement. The performance of a browser or editor must be taken into account as well: the style language must help to achieve high performance and low latency.
There must also be room for extensions.
Formatting can be based on a set of presentation objects, such as a block, a line, a table, each with its own properties, such as margin, coupled objects, etc. The system is object oriented. Properties are intensional: they are specified as expressions over other properties.
The presentation objects have hooks or call-backs. They can be used in a WYSIWYG editing environment, based on the Model-View-Controller (MVC) model.
Counters for such things as lists and figures should be scoped and they should have an explicit name. Initial values can be provided, as well as conditions on which a counter is reset or incremented.
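A named counter with an initial value and with reset and increment conditions could look like the following Python sketch. The class and its parameters are hypothetical, and scoping is approximated by resetting the counter whenever a given element starts; Dave deliberately did not define a concrete syntax.

```python
class Counter:
    """A named counter with an initial value and reset/increment conditions."""

    def __init__(self, name, initial=0, increment_on=(), reset_on=()):
        self.name = name
        self.initial = initial
        self.increment_on = set(increment_on)
        self.reset_on = set(reset_on)
        self.value = initial

    def visit(self, gi):
        """Update the counter for an element with generic identifier `gi`."""
        if gi in self.reset_on:
            self.value = self.initial
        if gi in self.increment_on:
            self.value += 1
        return self.value


# Figures are numbered per chapter: H1 resets the counter, FIG increments it.
figures = Counter("figure", increment_on={"FIG"}, reset_on={"H1"})
numbers = [figures.visit(gi) for gi in ["H1", "FIG", "FIG", "P", "H1", "FIG"]]
```

Here the first figure after each new H1 is numbered 1 again, which is the effect a counter scoped to the chapter would have.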
A tree of the SGML structure for a document can be used to attach the style rules to. Which nodes to attach the rules to is determined by conditions, such as the value of an SGML attribute. Since it may be possible that several rules apply, some form of conflict resolution based on precedence is needed.
The rows and the columns in a table can be organized into groups. A cell inherits not only from the row and the column, but also from a row group and the column group.
The properties of a cell include things like size, frame style for each of its four edges, and also whether the cell has a fixed size (and thus needs a scrollbar if the contents are larger than the cell).
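The inheritance of cell properties from rows, columns and their groups can be pictured as a chained lookup. The Python sketch below is an assumption-laden illustration: the talk did not say in which order the sources are consulted, so the order used here (cell, row, row group, column, column group) and the property names in the usage are invented for the example.

```python
from collections import ChainMap


def cell_properties(cell, row, row_group, column, column_group):
    """Combine property dictionaries; earlier arguments take precedence.

    The precedence order is an assumption made for this sketch.
    """
    return ChainMap(cell, row, row_group, column, column_group)
```

A lookup such as `cell_properties(...)["align"]` returns the value from the first source that defines the property, so a cell only needs to state what differs from its row and column.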
The Arena browser will be further developed to serve as a testbed for HTML editing.
It is necessary to start simple, but with an eye to the future. The style language should be easy. The relation between the style language and scripting languages needs to be investigated (hence the `hooks' in the style language!).
Some people in the audience suggested that the style language could do even more: it might also be used to define navigators, or to define which additional documents should be printed (and how) when one document from a web was printed.
To be able to match the style rules (either CSS or DSSSL) to the HTML elements, the HTML parser must keep some context. The context can consist of no more than a stack of open elements, or it can be the entire document tree.
The advantage of a stack is that it is usually fairly small. A disadvantage is that it cannot handle DSSSL. Keeping the entire tree needs much more memory, but it also offers more possibilities, such as querying and outlining.
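The cheaper of the two options, a stack of open elements, can be sketched with Python's standard `html.parser` module. The class name `ContextTracker` is hypothetical, and the sketch assumes that every element has an explicit end tag; a real HTML parser also has to deal with empty elements and omitted end tags.

```python
from html.parser import HTMLParser


class ContextTracker(HTMLParser):
    """Keep only the stack of open elements while parsing."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements, outermost first
        self.seen = []    # (tag, ancestors) as available to a style matcher

    def handle_starttag(self, tag, attrs):
        # The ancestors of the new element are exactly the open elements.
        self.seen.append((tag, tuple(self.stack)))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass
```

At each start tag the matcher sees the element's ancestors but nothing about siblings or later content, which is the limitation noted above; keeping the whole tree removes it at the cost of memory.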
When designing the display engine, a number of things have to be weighed against each other. For example, is it better to display a document quickly, before the style is known, or is it more important to display the document only as the author intended it? If connections are scarce, is it more important to get the images, or the style sheet?
Font handling is a very difficult area, because of the differences between platforms. Emacs runs on many different platforms, so the difficulties are extra noticeable. Solutions can maybe be found in systems for distributed fonts, a free font created by W3C and available on all platforms, and a logical font model. The model used by X is bad, the MS-Windows model could work, but for Emacs-W3 Bill created a new version of the Emacs font handling system.
The generic fonts in CSS need to be described better. Maybe a separate W3C technical report on this is needed. The user could also have a role in this: the way fonts are selected could be controlled by the user. The question remains what constitutes a successful font substitution: is a scaled 12pt Courier preferable to a non-scaled 12pt Helvetica?
Colors are also dependent on the window system. A platform dependent color specification should be allowed. The list of color names in CSS should at least include the ANSI color names.
Maybe something to think about: when two areas of different color overlap, could the colors be blended?
The user must be notified that a style has been applied. He must also be given the option of turning a style off, or of selecting a different one. Maybe he could even turn off specific parts of the cascade of styles.
The style sheets should also contain meta-info, such as the name of the designer, the date and the version.
To reduce eye strain, a full redraw of the screen should be avoided as much as possible. Reparsing the document should never be done.
CSS has no support for multiple media yet. One way of adding it could be with something like a pseudo-class: H1[device=ANSI]...
Audio output needs its own set of properties. The properties could define a `personality', built from male/female, soft...shout, pitch, speed, etc. Another option is to let the browser down-load a set of phonemes.
George started by saying that his experience with style sheets in NaviPress had shown him that they had chosen the wrong implementation.
Currently, properties are attached to elements, and the properties inherit. One of the problems he has found is that people find it difficult to understand. The problems start already with HTML: people don't see why there is a <B> as well as a <STRONG>.
To solve these problems, it may be better to let the users choose only from very high level styles, and not let them see the low-level properties at all.
Basing the Web on Unicode is a large step towards making the Web world-wide, but there is still a large number of difficult problems to solve. The list is much longer than the examples below.
To create some letters, it is necessary to combine non-spacing marks with base letters. For some letters it is even necessary to combine several marks. The order of the marks is sometimes significant, sometimes not. Unicode has around 200 of these non-spacing characters. They are put after a base character with which they should combine. When the order is significant, the marks follow an `inside-out rule', i.e., the one visually closest to the base letter is placed first.
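When the order of marks matters can be checked with Python's standard `unicodedata` module, which exposes the canonical combining class of each character. The sketch below is an illustration added to these notes, with hypothetical function names: marks of different classes may be reordered without changing the text, marks of the same class may not.

```python
import unicodedata


def combining_classes(text):
    """Canonical combining class of each character (0 for base characters)."""
    return [unicodedata.combining(ch) for ch in text]


def canonically_equal(a, b):
    """True if two strings are the same after canonical decomposition."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Dot below (class 220) and circumflex (class 230): order is not significant.
below_then_above = "a\u0323\u0302"
above_then_below = "a\u0302\u0323"

# Acute and circumflex are both class 230: order is significant.
acute_then_circumflex = "a\u0301\u0302"
circumflex_then_acute = "a\u0302\u0301"
```

In this example `canonically_equal` treats the first pair as the same text and the second pair as different, which is where the inside-out rule is needed.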
Arabic is not only written right to left, but it also combines the character's glyphs in complicated ways. The shape of a letter depends very much on the surrounding letters' shapes. This happens in other languages as well. The mapping from character codes to glyphs is thus a 1-to-many mapping. A typical Arabic font has more than 200 ligatures.
In Thai, the vowels are non-spacing, as are the special tone marks. It shows that counting composite symbols which fit into a single display cell is a difficult process; the number of cells can't be computed from the number of coded characters consumed by a string, although the number of bytes can be simply computed (for Unicode UCS-2 strings, # bytes = 2 * # character codes).
Bidirectional text has its own problems. In Arabic, the Hindu digits are used, written left-to-right within the overall right-to-left text. The correct way to represent a phone number is like this: (123)4567, even though the parentheses would normally turn round and lead to a display like this: 4567(123). Explicit directional overrides, in the form of directional mark characters, must be inserted between or around such words to resolve directional ambiguities.
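The difference between the three counts can be made concrete with a short Python sketch using the standard `unicodedata` module. It is a rough approximation only: it assumes that every character that is not a non-spacing or enclosing mark occupies one cell, which ignores ligatures and other script behavior, and the function names are hypothetical.

```python
import unicodedata


def display_cells(text):
    """Approximate cell count: every character that is not a mark."""
    return sum(1 for ch in text
               if unicodedata.category(ch) not in ("Mn", "Me"))


def ucs2_bytes(text):
    """Byte length in UCS-2: two bytes per character code."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("character outside the range of UCS-2")
    return 2 * len(text)


# A Thai consonant followed by a vowel sign and a tone mark.
word = "\u0e17\u0e35\u0e48"
```

For this word the three counts are all different: three character codes, six bytes in UCS-2, but only a single display cell under the assumption above.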
In HTML, the explicit direction marks are better handled via element structure (tags), instead of by the special Unicode BIDI control characters {RLE, LRE, RLO, LRO, PDF}.
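The directional properties involved can be inspected with Python's standard `unicodedata` module. The following sketch is an illustration only, with hypothetical function names: it lists the five control characters named above and shows how a converter might strip them once the direction is expressed in the element structure instead.

```python
import unicodedata

# The five explicit directional formatting characters named in the text.
BIDI_CONTROLS = {
    "LRE": "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "RLE": "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "PDF": "\u202c",  # POP DIRECTIONAL FORMATTING
    "LRO": "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "RLO": "\u202e",  # RIGHT-TO-LEFT OVERRIDE
}


def bidi_classes(text):
    """Bidirectional class of each character, e.g. 'AL' for an Arabic letter."""
    return [unicodedata.bidirectional(ch) for ch in text]


def strip_bidi_controls(text):
    """Remove the explicit controls, e.g. after moving direction into markup."""
    controls = set(BIDI_CONTROLS.values())
    return "".join(ch for ch in text if ch not in controls)
```

Parentheses come out as neutral characters (class `ON`), which is why their direction depends on the surrounding text and a phone number can end up displayed in the wrong order.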
Arabic occasionally also needs explicit control over the joining of glyphs. Sometimes two letters that normally combine should not be combined, or vice versa. Unicode uses special ZERO WIDTH JOINER and ZERO WIDTH NON-JOINER characters to achieve this control.
In Devanagari, the display order of vowels can be different from the logical order. This movement of vowels may be long distance as it may occur over multiple consonants which are combined into a conjunct ligature form.
Even in Western fonts the same font size doesn't mean that characters are the same size. When combining Roman and non-Roman text, the respective relations between body and ascender/descender may be so different that a different font size must be used for one of them. To solve this, it may be necessary to have a mechanism to express font sizes (and other font parameters) for individual fonts which are used together in a font set.
Multiple fonts are needed to display a multi-lingual text, but to make sure that all missing glyphs are displayed (in some way) some sort of backup font should also be available.
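Selecting fonts for multi-lingual text, with a backup for missing glyphs, might be sketched as follows in Python. Everything concrete here is an assumption made for illustration: the font names, the representation of a font's coverage as ranges of character codes, and the function names are hypothetical.

```python
from itertools import groupby


def pick_font(ch, font_set, backup):
    """First font in the set that covers `ch`, else the backup font."""
    for name, ranges in font_set:
        if any(low <= ord(ch) <= high for low, high in ranges):
            return name
    return backup


def font_runs(text, font_set, backup):
    """Split text into consecutive runs that use the same font."""
    return [(font, "".join(chars))
            for font, chars in groupby(
                text, key=lambda ch: pick_font(ch, font_set, backup))]


# Hypothetical font set: a Roman font and an Arabic font.
FONT_SET = [
    ("roman", [(0x0020, 0x024F)]),
    ("arabic", [(0x0600, 0x06FF)]),
]
```

A character that none of the fonts covers falls through to the backup font, so that it is at least displayed in some way instead of being silently dropped.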
The `World-Wide' in WWW is still a long way off.
An internationalized style language should avoid directional terms such as left, right, top and bottom when referring to logical directions. For example the start of a line may be at the left, the right, or the top depending on whether it is Roman, Arabic, or vertically set Japanese text.
Character counts cannot be relied upon for measuring physical distances or for counting display cells due to the display behavior of combining (non-spacing) characters and other complex script behavior.
The mapping from characters to glyphs has to be a many-to-many mapping, and instead of a font, a designer must specify a font set, a logical grouping of physical fonts.
Different languages may express the same style in different ways. For example, `emphasis' may use italics, bold, glyph decorations, or an alternate script style according to the writing system.
Since the style sheet can contain content characters as well, it, too, must be internationalized. A language tag is needed along with a mechanism for determining the style sheet's character encoding.
Most of Louis' talk was accompanied by a demonstration, which was impressive, but unfortunately made it very difficult to take notes, as the room was darkened.
Meta-design gives control to the designers, even across media. It uses visual languages to establish relations between objects. The objects form a derivation tree, which in turn translates into a set of media objects plus constraints. The constraints are solved dynamically to arrive at the final presentation.
The relations can be content-sensitive as well as context-sensitive, where the content is the document itself and the context includes factors such as window size, aspect ratio, etc. The relations can also be `user-sensitive', honoring the different interests of different readers.
The document is no longer static, but it becomes a dynamic interface to information. The dynamics even include things like simulation.
The visual language allows WYSIWYG authoring. A number of intelligent tools help the author or designer to create the documents and the visuals he wants.
The demonstration showed the viewer and the editor in action. By loading a different set of constraints the look of a page could be switched from the WIRED-look to Scientific American and others. Changing the window size made the program calculate the best design for the new environment, which could mean that the number of columns was reduced, that images were moved elsewhere, etc.
The authoring environment relies very much on drag and drop. It does a lot of things automatically and it re-uses earlier work.