Graphical composition languages and VR

A language, and graphic interface, is needed for objects which are 3 dimensional (or quasi 3 dimensional) compositions of graphical objects. This could be prototyped easily under NeXTStep, for example, using the class library, making a new quasi 3d view, or even using Renderman 3d facilities.

Here are a few messages about this, a discussion which slips into VR. Mail me with any more input! Included here:

Distributed Interactive Simulation
Labyrinth

From [email protected]

I wrote most of the internet chess server (which can currently be
reached with "telnet ics.uoknor.edu 5000").  A number of other
programmers have written interfaces that run on your local
PC/MacIntosh/Xterminal/Next.  The interface parses the output from the
chess server and displays a pretty chess board on your screen.  It
also allows the user to make moves with the mouse.  These mouse
strokes are converted into algebraic chess move notation.  Some of the
interfaces show a dynamically changing clock, indicating the amount of
time remaining for each player.
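
The conversion from mouse strokes to move notation described above can be sketched roughly as follows. This is a minimal Python illustration, not the code of any actual interface: the function names, the 40-pixel square size, the assumption that White is drawn at the bottom, and the use of simple coordinate notation (such as "e2e4") rather than full standard algebraic notation are all assumptions.

```python
def pixel_to_square(x: int, y: int, square_size: int = 40) -> str:
    """Map a pixel position on an 8x8 board (White at the bottom) to a square name."""
    file_index = x // square_size        # 0..7, left to right = files a..h
    rank_index = 7 - (y // square_size)  # screen y grows downward, ranks grow upward
    if not (0 <= file_index < 8 and 0 <= rank_index < 8):
        raise ValueError("click is outside the board")
    return "abcdefgh"[file_index] + str(rank_index + 1)


def drag_to_move(press, release, square_size: int = 40) -> str:
    """Turn a mouse press/release pair of (x, y) pixels into a move like 'e2e4'."""
    return (pixel_to_square(press[0], press[1], square_size)
            + pixel_to_square(release[0], release[1], square_size))
```

A real interface would also need to know which piece stands on the source square in order to produce standard algebraic notation; that step is omitted here.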

Each of these interfaces is a custom-made special purpose-program for
connecting to this chess server.  A similar set of interfaces has been
created for use with the go server.

This whole set-up is really cumbersome for a number of reasons.
(1) It's very difficult to set up other game servers with graphics,
because a whole set of special purpose interface programs must be
written, one for each major platform. (2) Any change in the chess
server must be backward compatible with all the interfaces out there.
This limits the way the server can develop.  (3) When a separate
new platform emerges, a new interface must be written for each of the
internet games with graphics.

All of these problems could be solved by a standard language for
expressing the simple types of graphics needed for these games,
combined with a set of interfaces to interpret this language on a
variety of platforms.  Postscript suffers from at least two drawbacks.
(1) It's extremely verbose -- since many users connect through slow
lines, this is a problem. (2) Full postscript is a HUGE language that is
very difficult to implement, and 99% of it is unnecessary for these
purposes.

I believe that the development of such a set of tools would spur a
tremendous explosion of new simple graphics games and applications
that could run through telnet the way the chess server does now.  It
would be possible for one person to write a new game (such as double
bughouse chess) without having to write a half dozen graphics
interfaces.  Many really cool things change from being impossible to
being quite feasible.  (The PLATO system developed in the 70s at the
University of Illinois had some of these properties: simple graphics
available to all users, fast interaction among a large pool of users.
The result was the development of a number of very popular and
engrossing interactive games.)

So, my question is: does a language with these properties already
exist?  If not, how do we go about creating it?  This whole idea seems
to fit quite well into the philosophy of Mosaic, which is a standard
interface to the net that runs on all platforms.  If it emerges, would
this new type of network interaction be built into Mosaic?

Please forward this message to anybody else who you think would have
some useful insight on this problem.  Thanks.

Daniel Sleator

Professor of Computer Science

Carnegie Mellon University

From Tim Berners-Lee

Date: Thu, 27 Jan 94 11:25:56 +0100


Professor Sleator brings up a very pertinent question which has also
arisen in the World-Wide Web, and which I feel is a very important next step. I will add my own slant on it.

I would like to see a graphics composition language which allows a structured display to be composed on the screen from a number of items, which may be
local or remote.  For example, I would like to see a chess screen made up
of a composition of a basic board, and a bunch of pieces, all identified
by their network or local addresses (URIs).  These would be cached in
practice, so to redraw the board would be the very rapid operation of
respecifying the structure.  Another example is a picture of a conference
room composed of a background, an overhead projector screen which is in
fact a whiteboard, or IRC, session, and people sitting around the table
which are GIFs, or videos for those with cameras.  We have the formats
for the basic graphics (though only TIFF a la NeXT has the much-needed
 transparency channel I believe).  We need a composition language.
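
To make the idea of composing cached, URI-addressed items concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the class names, the `fake_fetch` stand-in for network retrieval, and the example.org URIs are all hypothetical and do not correspond to any existing format or client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Placement:
    """One referenced item and where it sits in the composition."""
    uri: str
    x: int
    y: int


@dataclass
class Composition:
    """A declarative scene: a background plus items, all named by URI."""
    background: str
    items: List[Placement] = field(default_factory=list)


class ResourceCache:
    """Fetches each URI at most once; `fetch` stands in for a network call."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}
        self.fetch_count = 0

    def get(self, uri: str) -> bytes:
        if uri not in self._store:
            self._store[uri] = self._fetch(uri)
            self.fetch_count += 1
        return self._store[uri]


def render(scene: Composition, cache: ResourceCache) -> List[Tuple[int, int, bytes]]:
    """Resolve a composition into a back-to-front draw list of (x, y, data)."""
    draw_list = [(0, 0, cache.get(scene.background))]
    draw_list += [(p.x, p.y, cache.get(p.uri)) for p in scene.items]
    return draw_list


def fake_fetch(uri: str) -> bytes:
    """Hypothetical placeholder for retrieving the graphic behind a URI."""
    return ("<data for " + uri + ">").encode()


cache = ResourceCache(fake_fetch)
board = "http://example.org/chess/board.gif"
pawn = "http://example.org/chess/white-pawn.gif"
first = render(Composition(board, [Placement(pawn, 4, 6)]), cache)
# Moving the pawn only respecifies the structure; nothing is fetched again.
second = render(Composition(board, [Placement(pawn, 4, 4)]), cache)
```

Because the scene is pure data, redrawing after a move is a matter of building a new structure and resolving it against the cache.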

There will be those who suggest adapting PDF, which is basically
 postscript with the ability to embed other formats.
Maybe the renderman format would have something to give us.
There will be those who suggest augmenting TIFF, using its general
 tagging.
[There are those who say it should be CGM ]
There will be those who say it should be SGML.
There will be those who say that HyTime ought to be used for this,
 as that was what it was intended for more or less.
There will be those who feel like writing it in an afternoon from
 scratch.

 (By the way, I disagree that postscript is too terribly
 _big_ -- with display postscript coming with X, and already doing a
 neat job as the screen interface in this system as I write,  if
 it was what we wanted I would go for it. I feel though that
 we want something very declarative, rather than procedural,
 as a base.  I would like it to support lazy evaluation, so
 that when a bit of embedded video becomes covered by another
 window, the bandwidth can be saved, or if I can't see the
 output of a simulation, the CPU can be saved.  "If you never
 have a dream, then you never have a dream come true".)
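
The lazy evaluation described here can be illustrated with a small Python sketch. It is deliberately simplified and rests on assumptions: layers are opaque rectangles listed back to front, a layer counts as hidden only when a single layer above it covers it completely, and the names `Rect`, `Layer`, and `draw_visible` are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def covers(self, other: "Rect") -> bool:
        """True if this rectangle completely contains `other`."""
        return (self.x <= other.x and self.y <= other.y
                and self.x + self.w >= other.x + other.w
                and self.y + self.h >= other.y + other.h)


@dataclass
class Layer:
    bounds: Rect
    produce: Callable[[], str]  # possibly expensive: a video stream, a simulation


def draw_visible(layers: List[Layer]) -> List[str]:
    """Draw back to front, producing content only for layers not fully hidden."""
    drawn = []
    for index, layer in enumerate(layers):
        hidden = any(above.bounds.covers(layer.bounds)
                     for above in layers[index + 1:])
        if not hidden:
            drawn.append(layer.produce())  # the producer runs only when needed
    return drawn


calls = []


def video_frame() -> str:
    calls.append("video")
    return "video frame"


def window() -> str:
    calls.append("window")
    return "window contents"


# The window sits above the video and covers it entirely.
layers = [Layer(Rect(10, 10, 100, 80), video_frame),
          Layer(Rect(0, 0, 200, 200), window)]
result = draw_visible(layers)
```

Since the video producer is never called while it is covered, neither bandwidth nor CPU is spent on it.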
 
 Clearly some 3d or fake 3d ability is useful.  Taking the
 object-oriented approach, I would imagine objects suitable
 for composition responding to, according to their sophistication,
 methods

 	RenderSelf();
 	RenderSelf(viewed_from_x,y,z);
 	RenderSelf(viewed_from_x,y,z, lighting_conditions);
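
As a rough illustration of objects answering these three method forms according to their sophistication, here is a minimal Python sketch. The class names, the method names (Python has no overloading by argument count, so each form gets its own name), and the returned strings are all assumptions made for the example.

```python
class FlatPicture:
    """Least sophisticated object: it only has one fixed 2d picture."""

    def __init__(self, name):
        self.name = name

    def render_self(self):
        return self.name + " (fixed 2d picture)"


class Solid(FlatPicture):
    """An object that can also be drawn from a given viewpoint."""

    def render_self_from(self, x, y, z):
        return "%s seen from (%g, %g, %g)" % (self.name, x, y, z)


class LitSolid(Solid):
    """An object that additionally takes lighting conditions into account."""

    def render_self_lit(self, x, y, z, lighting):
        return "%s seen from (%g, %g, %g) under %s" % (self.name, x, y, z, lighting)


def render(obj, viewpoint, lighting):
    """Call the richest rendering method the object responds to."""
    if hasattr(obj, "render_self_lit"):
        return obj.render_self_lit(*viewpoint, lighting)
    if hasattr(obj, "render_self_from"):
        return obj.render_self_from(*viewpoint)
    return obj.render_self()  # always the same picture, whatever the viewpoint
```

The compositor asks for the most it can use and each object degrades gracefully to what it can provide.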


The results would, with HTTP, be returned in any format the
client could handle, so those who could handle video might get
back a video stream, those which can handle 3d might get back 3d,
defaulting down to 2d with transparency or just plain GIFs.
(I guess we rule out 3d video, but video with transparency
would be cool -- using color separation overlay to paste
people from their own blue painted conference room into
a common virtual environment).
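
The fallback from richer to plainer formats might look something like the following Python sketch. It is an assumption-laden illustration rather than a description of how HTTP negotiation is actually specified: the preference order is invented, and "model/x-3d" is a hypothetical type name used only as a placeholder for some 3d format.

```python
# Richest representation first; the order itself is an assumption.
PREFERENCE = ["video/mpeg", "model/x-3d", "image/tiff", "image/gif"]


def choose_representation(available, accepted, preference=PREFERENCE):
    """Pick the richest format the object offers that the client can handle.

    `available` is what the object can produce, `accepted` is what the
    client says it can display. Returns None if there is no overlap.
    """
    for fmt in preference:
        if fmt in available and fmt in accepted:
            return fmt
    return None
```

For example, an object offering video, TIFF, and GIF would be delivered as a plain GIF to a client that accepts only GIFs, and as video to a client that accepts everything.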

An object which only had a 2d representation would always
return the same picture, with the result that, for example,
when one rotated the conference room, the people would
always be facing one. That is a reasonable compromise,
the sort of thing which makes practical systems really work,
and turn smoothly into ultimate systems with time, bandwidth
and money.

I would like dynamic editing, so that the
chairperson (chairobject? ;-) can mouse drag Prof. Sleator to the
head of the table
to explain his ideas, and he can drop documents into the
overhead projector, or one can drag/drop the chess pieces.
We would have a great basis for construction of networked
VR, graphical MUDs, and cyberspace would never be the same again.
This would be totally in keeping with web tradition in
being a powerful means of communication with an incredibly
intuitive interface, and with the ability to grow in
sophistication limited only by the imagination, and all based
on not very much.

In fact, as far as many of the requirements go, such as the format
negotiation and the embedding of graphics by reference, much of it is
there in differing amounts in different clients, and spec'd out.
The missing thing is the composition language, with its 3d elements.
Which is why Prof. Sleator's message hits the nail on the
head.

I would encourage a discussion of this, an evaluation of what
is there, and some rapid work toward some prototypes.  I would
strongly encourage a view of this as just another sort of
object, with various possible representations (just like
basic images -- in fact this is an image object, embeddable
in a document).  Let's go for it...

Tim Berners-Lee

From Lou Burnard [email protected]

Date: Thu, 27 Jan 1994 11:27:18 +0000

>I would like to see a graphics composition language which allows a structured  
>display to be composed on the screen from a number of items, which may be
>local or remote.  For example, I would like to see a chess screen made up
>There will be those who suggest adapting PDF, which is basically
>There will be those who suggest augmenting TIFF, using its general
>There will be those who say it should be SGML.
>There will be those who say that HyTime ought to be used for this,
>There will be those who feel like writing it in an afternoon from
> scratch.


Actually, I think most graphics people would say it should be CGM.
Indeed, I think there are some who would say it *is* CGM. 

But I am no expert in this field...

regards

Lou

From: Dave_Raggett <[email protected]>

Date: Thu, 27 Jan 94 15:42:24 GMT

I too have been putting thought into a similar scheme. It seems to me
that the processing power of workstations and high end PCs is now
good enough to support platform independent VR with Web style global
hyperlinks. Like Tim, I believe that the way forward will naturally lead
to non-proprietary VR, and think now is the time to start exploring
how we can do this.

A key to allowing effective platform independence is to use logical
descriptions so that viewers can fill in the details according to their own
rendering capabilities. As an example, you could describe a room in terms of
the polygon defining the floor plan, the height of the walls, and categories
for the textures of floor, walls and ceiling. Hierarchical descriptions of
wall textures could include: raw color and a link to the tiling pattern for
an explicit design of wall paper. Low power systems would use plain walls,
saving the cost of retrieving and patterning the walls. Fractal techniques
offer interesting possibilities too.
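
A logical room description of this kind could be sketched as follows. This is a minimal Python illustration under assumptions: the field names, the `can_tile` capability flag, and the example pattern URI are invented, and no actual interchange format is implied.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Texture:
    """Hierarchical surface description: a category, a raw color, and
    optionally a link to a tiling pattern for capable viewers."""
    category: str
    color: str
    pattern_uri: Optional[str] = None


@dataclass
class Room:
    floor_plan: List[Tuple[float, float]]  # polygon vertices, in order
    wall_height: float
    floor: Texture
    walls: Texture
    ceiling: Texture


def surface_for(texture: Texture, can_tile: bool) -> str:
    """Let the viewer fill in detail according to its own capabilities."""
    if can_tile and texture.pattern_uri is not None:
        return "tile " + texture.pattern_uri
    return "fill " + texture.color  # low power systems use plain walls


def floor_area(room: Room) -> float:
    """Area of the floor polygon, computed with the shoelace formula."""
    pts = room.floor_plan
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


room = Room(
    floor_plan=[(0, 0), (4, 0), (4, 3), (0, 3)],
    wall_height=2.5,
    floor=Texture("carpet", "grey"),
    walls=Texture("wallpaper", "cream", "http://example.org/patterns/stripes"),
    ceiling=Texture("plaster", "white"),
)
```

The same description serves both kinds of client: a capable viewer retrieves and tiles the pattern, while a low power one never asks for it.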

Shared models would avoid the need to download detailed models, e.g. for
wall paper, window and door fittings, chairs, tables, carpets etc. These
models, by using well known names can be retrieved over the net and cached
for subsequent use. The models would include hierarchical levels of detail.
This is important for "distancing" and reducing the load on lower power
clients. In addition to appearance, models could include behaviours
defined by scripts, e.g. sound of a clock ticking, the way a door opens
and functional calculators, radios and televisions.
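
Choosing among hierarchical levels of detail might be sketched like this. It is a minimal Python illustration with assumed names and distance thresholds; the rule that a low power client steps down one level is also an assumption made for the example.

```python
from typing import List, Tuple

# (maximum viewing distance, model name), most detailed first.
# The thresholds and names are hypothetical.
CHAIR_LEVELS: List[Tuple[float, str]] = [
    (5.0, "chair-detailed"),
    (20.0, "chair-simple"),
    (float("inf"), "chair-box"),
]


def select_detail(levels: List[Tuple[float, str]], distance: float,
                  low_power: bool = False) -> str:
    """Pick the model for this distance; low power clients go one level coarser."""
    index = len(levels) - 1  # fall back to the coarsest level
    for i, (max_distance, _name) in enumerate(levels):
        if distance <= max_distance:
            index = i
            break
    if low_power:
        index = min(index + 1, len(levels) - 1)
    return levels[index][1]
```

Distant objects and weak clients both end up with cheaper models, while the scene description itself stays the same.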

Full VR needs expensive I/O devices, but we could get by with side-ways
movement of the mouse (cursor keys) to turn left or right and up-down
movement of the mouse to move forwards and backwards in the scene. I believe
that allowing a progression from simple to sophisticated I/O devices with the
same VR interchange formats will be critical to broad take up of VR.
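
The mouse mapping described above can be sketched in a few lines of Python. The names, the turn rate and speed constants, and the sign conventions (moving the mouse up the screen gives a negative `dy` and moves the viewer forwards) are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0  # radians; 0 means facing along the +x axis


def apply_mouse(pose: Pose, dx: float, dy: float,
                turn_rate: float = 0.01, speed: float = 0.05) -> Pose:
    """Sideways mouse movement turns; up-down movement moves forwards/backwards."""
    heading = pose.heading + dx * turn_rate
    step = -dy * speed  # screen y grows downward, so mouse-up is negative dy
    return Pose(pose.x + step * math.cos(heading),
                pose.y + step * math.sin(heading),
                heading)
```

A more sophisticated input device would simply supply a richer pose update while the scene format stays unchanged.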

So far I have outlined a way in which you could click on an HTML link and
appear in a VR museum and wander around at will. Pushing on doors would
correspond to clicking on hypertext links in HTML. The next step is to get to
meet other people in these VR environments. The trick here is to wrap
real-time video images of people's faces onto 3D models of their heads.
This has already been done by a research group at ATR in Japan. Our library
couldn't find any relevant patents, so it looks like there are no problems
in defining non-proprietary protocols/interchange formats for this approach.

The bandwidth needed is minimised by taking advantage of the 3D models
to compress movements. By wrapping the video image of a face onto a 3D model,
you get excellent treatment of facial details, as needed for good non-verbal
communication, while minimizing the number of polygons needed.

The effectiveness of this approach has been demonstrated by Disney who
project video images onto a rubber sheet deformed by a mask pushing out
of the plane. Needless to say, there remain some research issues here ...

The first step in achieving this vision is to start work on a lightweight
interchange format for VR environments and to experiment with viewers
and http. A starting point is to pool info on available software tools
we could use to get off the ground.

Regards,

Dave Raggett (looking forward to the Web's VR version of the Vatican Exhibit). 

-----------------------------------------------------------------------------
Hewlett Packard Laboratories,           +44 272 228046
Bristol, England                        [email protected]

From: [email protected] (Vinay Kumar)

Date: Thu, 27 Jan 94 11:07:43 PST

Very interesting indeed. Recently i saw a running demo from General Magic's "MagicCap" UI environment. It seems to do a lot of the stuff mentioned by others on this list earlier (assuming i understand the emails correctly of course). MagicCap UI shows a downtown view on the desktop; using a mouse one could navigate (VR style) around houses, rooms, hallways, libraries, etc... One could customize wall colors, wall papers, posters, and other artifacts in and outside the rooms. Drag and drop feature is supported. Linking of objects is thru drag and drop. However i am not sure if linking of objects over distributed networks is supported. They claim everything in their environment is an "object" and almost every object could be linked to any other object. I will recommend everyone on W3 to at least take a look at this product. (I apologize if this sounds like an infomercial on General Magic's product, certainly didn't mean it that way). In essence, it makes a lot of sense to view W3 alternatively in a "non-document" centric manner as well. At this point, i am not sure what the best way to do this in W3 is; certainly W3 is powerful and flexible enough to allow us to accomplish such a thing. Sounds like there is a need for a multimedia-scripting-and-synchronization language (whatever that means....). Shall get back to you on this more after careful thinking.

Vinay Kumar

[email protected]

CGM

From: [email protected] (lofton henderson)

Date: Wed, 2 Feb 1994 13:12:49 -0700

I have recently seen some pieces of email dialog on the subject
of a Universal Network Graphics Language.  I have comments to
offer on a couple of aspects of the issue.

It is interesting that no one has mentioned the format that seems
to be the obvious solution -- Computer Graphics Metafile (CGM).
The mail dialog that I have seen so far proposes PostScript/PDF,
TIFF, NAPLPS, and various other private formats.

CGM:1992, especially with the pending completion of Amendment 2
(Application structuring) has all of the features named, except
for 3D.  A 3D extensions project is being studied now by ISO
graphics standards committees.

It is declarative (as opposed to procedural), and it is a highly
efficient device- and application- independent graphics format.
The Version 3 definition of CGM:1992 is roughly as capable as
PostScript level 2 in graphical expressive power.

It is a composite vector/raster format, so it preserves editability
and the ability to manipulate the picture (as opposed to TIFF).
Virtually all commonly encountered 2D graphical primitives can be
translated directly into CGM elements.  Scanned images can be 
embedded as tiled raster elements.

It has two flavors of structuring.  "Segments" are a graphical
efficiency mechanism, for saving and reusing sets of primitives
(which can be instantiated with different attributes, transformations,
etc).  
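
The general idea of saving a set of primitives once and reusing it with different transformations can be illustrated with a short Python sketch. This is not CGM syntax or any CGM library's API: the `SegmentStore` class, its methods, and the primitive representation are all invented for the example, and only translation and uniform scaling are shown.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Primitive = Tuple[str, List[Point]]  # e.g. ("polygon", [(0, 0), (1, 0), (0.5, 1)])


class SegmentStore:
    """Named, reusable groups of primitives (a hypothetical stand-in for segments)."""

    def __init__(self) -> None:
        self._segments: Dict[str, List[Primitive]] = {}

    def define(self, name: str, primitives: List[Primitive]) -> None:
        """Save a set of primitives under a name for later reuse."""
        self._segments[name] = list(primitives)

    def instantiate(self, name: str, dx: float = 0.0, dy: float = 0.0,
                    scale: float = 1.0) -> List[Primitive]:
        """Return a scaled, then translated, copy of the saved primitives."""
        return [(kind, [(x * scale + dx, y * scale + dy) for x, y in points])
                for kind, points in self._segments[name]]
```

A chess pawn, for instance, would be defined once and instantiated eight times per side with different offsets.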

The Application Structuring of Version 4 metafiles (Amendment 2,
anticipated completion summer 1994) lists among its target
capabilities "network distributed graphical applications", 
interactive electronic manuals, etc.  It includes not only the
ability to divide the metafile into pieces of application (as
opposed to graphical) significance, but also includes picture
and structure directory features.  The pictures can be completely
indexed, "objects" are randomly accessible, and any variety of
Structure Attributes can be defined and attached to the structures.

It is an ISO standard (ISO/IEC 8632), and has been around for 7 years
(it was republished in 1992).  It has been designated as the graphical
content architecture by a number of electronic documentation initiatives:
the ISO Office Document Architecture standard (ODA); the electronic
document programs of US DoD (CALS) and international commercial
aviation (ATA/AIA); it is the graphical basis of the ATA "intelligent
graphics" and intelligent electronic documents program; etc.  It is
also a European standard (EN) and national standard of most of the
main industrialized countries.

There certainly would seem to be some advantage to using ISO standards
where suitable ones exist (I have worked on ISO graphics standards 
committees for a dozen years, principally on CGM).

It is widely implemented (but not always well) -- the CALS Test Network
(CTN) lists several hundred products that claim CGM support.

A testing and certification service for products has just been
established, by the National Institute of Standards and Technology
(NIST, a part of the US Department of Commerce).

It is certainly worth looking into, as it seems an ideal candidate for
your projected application.  I'd like to offer a couple of other remarks
in closing.

Firstly, some of the mail has talked about time, the nature of time,
and synchronization.  For that aspect of the problem, there is an
ISO standard, HyTime, whose purpose is exactly synchronization and
integration.

Secondly, there are numerous standards efforts underway to look at
the whole picture of interactive electronic documents, distributed
multimedia, etc:  MHEG (Multimedia hypermedia experts group),
HyperODA, IMA (Interactive Multimedia association), OMG, SC24/PREMO
(Presentation Environment for Multimedia objects), to name just
a few (in this very crowded field).

Finally, it seems to me that there is some confusion in the mail
between the graphics formats used to support a set of requirements
for distributed interactive network graphics, and the tools that
actually provide the services.  There is obviously some relationship,
but the solutions will be forthcoming more quickly if the separation
is kept clear.  My specialty is formats, so I've limited the bulk of 
my comments to that topic.

Regards,
Lofton Henderson.

Henderson Software Inc.
1919 14th St., Suite 610
Boulder, CO   80302 
USA
ph:  (+1) 303-442-6570
fx:  (+1) 303-442-6572
Internet:  [email protected];  or, [email protected]

(As you might guess from the 2nd Internet address, our business is CGM).

Distributed Interactive Simulation

Virtual Reality

Anyone got any leads on this and standards in use? The following was from the IETF list.

From: Margaret <[email protected]>

Date: Wed, 02 Feb 94 09:00:27 EDT

The Distributed Interactive Simulation (DIS) standards being developed under IEEE are for linking simulations at multiple locations to create realistic, complex, virtual "worlds" for the simulation of interactive activities. Our work over the past 5 years has focused on connecting military simulations; however, the DIS technology is applicable to entertainment, medicine, education,... The FAA is even joining our workshops.

The next workshop is March 14-18 in Orlando. To get more information on this subject, please contact Caroline LaFave at 407-658-5518 or [email protected].

Margaret Loper, PM DIS Standards

Institute for Simulation and Training

3280 Progress Drive

Orlando, FL 32826

407-658-5517

Labyrinth

From: [email protected] (Mark D. Pesce)
Subject: VR and WWW - LABYRINTH Project...
To: [email protected]
Date: Thu, 17 Mar 1994 10:28:39 -0800 (PST)

Tim -

I was on www.info.cern.ch today and saw a discussion of VR and WWW. Well, it's already being worked on, here in SF, and I hope that we'll be able to contribute it to the public domain before Summer is over. In any case, we'll be showing it off at SIGGRAPH (hopefully) as part of the SIGKIDS exposition, to show how VR can make WWW sites like the U.S. Library of Congress more navigable to everyone, not just children.

I've enclosed a short document describing the rationale and goals of the Labyrinth project. Any comments you care to give would be greatly appreciated.

Any help/pointers you can give to people doing similar work would be greatly appreciated. Thanks for all your great work on WWW! We're glad to be able to add to it.

Mark Pesce

Network Zero

San Francisco, California, USA

[email protected]

Tim BL