The Jigsaw resource factory is a modular piece of software that runs behind the scenes and creates HTTPResource instances out of existing data. The factory currently knows about the files and directories of the underlying file system, but you can extend it to handle other kinds of objects.
This document describes when the factory is called and how it maps any existing data source to HTTP exportable resources.
Each running server has a resource factory attached to it (which it might share with other servers, but this is not relevant here). Any resource can call its server's factory to create a resource out of an existing object. Currently, the only resource that does so is w3c.jigsaw.resources.StoreContainer, which is the base class for most resource containers (such as the one exporting directories).
When queried for a URL component at lookup time, the directory resource first checks its children resource store for a matching resource. If such a resource is found, it is returned as the target of the lookup. Otherwise, if the directory is flagged as extensible, the directory resource derives a file name from the resource's identifier and asks the resource factory for a wrapping resource instance. If the factory builds such a resource successfully, the directory resource installs it as one of its children resources and manages its persistence.
Let's walk through this algorithm with an example. Suppose there is a directory resource User which wraps an underlying file-system directory named User. This directory resource will usually be created empty (with no children resources). At some point, a client will ask for, say, User/Overview.html. The lookup process starts, and after some iterations comes to the point where it looks for Overview.html in the directory resource User. The directory resource looks into its children resources to find it. As none is found, it goes to the resource factory and asks it to construct a resource for the file Overview.html. If a resource is returned (which depends on the factory configuration), the directory resource plugs the newly created resource into its resource store and returns it as the target of the lookup.
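The lookup steps described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not Jigsaw's actual code: LookupSketch, Resource, Factory, Directory, and the factoryCalls counter are hypothetical names, the children store is a plain in-memory map, and the URL component is used directly as the file name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins for illustration only; none of these are Jigsaw classes.
class LookupSketch {
    /** A resource, reduced here to its identifier. */
    static final class Resource {
        final String identifier;
        Resource(String identifier) { this.identifier = identifier; }
    }

    /** The factory either builds a wrapping resource or declines. */
    interface Factory {
        Optional<Resource> index(String directoryPath, String fileName);
    }

    /** A directory resource with an in-memory stand-in for its children store. */
    static final class Directory {
        private final String path;
        private final boolean extensible;
        private final Factory factory;
        private final Map<String, Resource> children = new HashMap<>();
        int factoryCalls = 0; // counts factory queries, for the demonstration only

        Directory(String path, boolean extensible, Factory factory) {
            this.path = path;
            this.extensible = extensible;
            this.factory = factory;
        }

        Optional<Resource> lookup(String component) {
            // 1. An already-known child is returned directly.
            Resource known = children.get(component);
            if (known != null) {
                return Optional.of(known);
            }
            // 2. A non-extensible directory never consults the factory.
            if (!extensible) {
                return Optional.empty();
            }
            // 3. Ask the factory; install the result as a child if one is built.
            factoryCalls++;
            Optional<Resource> built = factory.index(path, component);
            built.ifPresent(r -> children.put(component, r));
            return built;
        }
    }

    public static void main(String[] args) {
        // Assumed configuration: this factory only knows about .html files.
        Factory htmlOnly = (dir, name) -> name.endsWith(".html")
                ? Optional.of(new Resource(name))
                : Optional.<Resource>empty();
        Directory user = new Directory("User", true, htmlOnly);

        Resource first = user.lookup("Overview.html").get();
        Resource second = user.lookup("Overview.html").get();
        System.out.println(first == second);   // true: the same instance is returned
        System.out.println(user.factoryCalls); // 1: the factory was queried only once
    }
}
```

The second lookup never reaches the factory, because the first one installed the new resource among the directory's children.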
One important note here: because resources are persistent objects (they persist across Jigsaw invocations), a resource that wraps an existing object is created only once in the whole lifetime of the server. This means that changing the factory configuration after a resource has been indexed has no effect on resources that have already been created. This is one of the features that makes the server fast: indexing an existing object into a resource might be a costly process (it involves querying multiple databases, such as the extensions and directory templates databases). Caching the result of this operation allows the server to concentrate on its real work, which is to serve data back to clients. You may still, however, want to change the resource factory configuration and re-index part of your information space with the new options. Currently the only way to do that is to delete the resources to be re-indexed and have them recreated through the normal mechanism.
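The effect of this create-once behaviour can be illustrated with a small Java sketch. Everything here is hypothetical: ReindexSketch is not a Jigsaw class, the store is an in-memory map rather than a persistent one, and a single configuredContentType field stands in for the whole factory configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Jigsaw code.
class ReindexSketch {
    // Stand-in for one factory setting: the content type given to newly indexed files.
    String configuredContentType = "text/plain";

    // Stand-in for the resource store: resource name -> content type fixed at indexing time.
    private final Map<String, String> store = new HashMap<>();

    /** Returns the stored value, indexing the object only if it is not yet in the store. */
    String lookup(String name) {
        return store.computeIfAbsent(name, n -> configuredContentType);
    }

    /** Deleting a resource is what allows it to be re-indexed on the next lookup. */
    void delete(String name) {
        store.remove(name);
    }

    public static void main(String[] args) {
        ReindexSketch server = new ReindexSketch();
        System.out.println(server.lookup("Overview.html")); // text/plain

        // Changing the configuration does not touch the already indexed resource.
        server.configuredContentType = "text/html";
        System.out.println(server.lookup("Overview.html")); // text/plain

        // Deleting the resource lets the next lookup re-index it with the new setting.
        server.delete("Overview.html");
        System.out.println(server.lookup("Overview.html")); // text/html
    }
}
```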
The factory is defined in terms of a set of indexers. Each container resource may specify which indexer to use to index its content through its indexer attribute, which should be the name of a registered indexer. You could, for example, implement a MailMessageIndexer that creates resources out of a Berkeley-style mailbox file, and have a MailResource use it to export the mailbox.
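As a rough picture of how a set of named indexers could be organised, here is a minimal Java sketch. The names IndexerRegistrySketch, Indexer, register, and resolve are hypothetical and do not come from Jigsaw, and returning an empty result for an unregistered name is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; the interface and method names are not Jigsaw's.
class IndexerRegistrySketch {
    /** An indexer turns the name of an existing object into a description of a resource. */
    interface Indexer {
        String index(String objectName);
    }

    private final Map<String, Indexer> indexers = new HashMap<>();

    void register(String name, Indexer indexer) {
        indexers.put(name, indexer);
    }

    /** Resolves the value of a container's indexer attribute to a registered indexer. */
    Optional<Indexer> resolve(String indexerAttribute) {
        return Optional.ofNullable(indexers.get(indexerAttribute));
    }

    public static void main(String[] args) {
        IndexerRegistrySketch factory = new IndexerRegistrySketch();
        factory.register("default", name -> "file resource for " + name);
        factory.register("mail", name -> "mail resource for " + name);

        // A mail container would name the "mail" indexer in its indexer attribute.
        System.out.println(
                factory.resolve("mail").map(i -> i.index("inbox")).orElse("no resource"));
        // An unregistered name yields no indexer in this sketch.
        System.out.println(factory.resolve("news").isPresent());
    }
}
```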
The default indexer class in the current Jigsaw release is w3c.jigsaw.indexer.SampleResourceIndexer, which knows only about files and directories. It creates resources by maintaining two databases: the extensions database is used to index files, while the directories database is used to index directories.
When the sample resource indexer is called to index a normal file, the first thing it does is split the file name into its raw name plus its set of extensions. So, for example, if the file to be indexed is foo.en.html.gz, the raw name will be foo, and the set of extensions will be {en, html, gz}.
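The split described above can be sketched in a few lines of Java. This is an illustration only: FileNameSplitSketch and SplitName are hypothetical names, and the handling of edge cases (names starting with a dot, empty extensions) is an assumption rather than documented Jigsaw behaviour.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not the indexer's actual code.
class FileNameSplitSketch {
    /** The result of a split: the raw name plus the extensions in their original order. */
    static final class SplitName {
        final String rawName;
        final List<String> extensions;
        SplitName(String rawName, List<String> extensions) {
            this.rawName = rawName;
            this.extensions = extensions;
        }
    }

    static SplitName split(String fileName) {
        int firstDot = fileName.indexOf('.');
        if (firstDot < 0) {
            // No dot at all: the whole name is the raw name.
            return new SplitName(fileName, Collections.<String>emptyList());
        }
        String rawName = fileName.substring(0, firstDot);
        List<String> extensions = new ArrayList<>();
        for (String ext : fileName.substring(firstDot + 1).split("\\.")) {
            if (!ext.isEmpty()) { // assumption: empty segments such as "a..b" are ignored
                extensions.add(ext);
            }
        }
        return new SplitName(rawName, extensions);
    }

    public static void main(String[] args) {
        SplitName s = split("foo.en.html.gz");
        System.out.println(s.rawName + " " + s.extensions); // foo [en, html, gz]
    }
}
```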
It then takes each extension description record and checks whether it defines a resource class. In a typical setting, only the html extension will have an associated resource class, which is likely to be the FileResource class. This gives the indexer the class of the resource to build for the given file, so the indexer carries on by creating an empty instance of this class. It then creates a set of default attribute values, starting with the following predefined attributes:
identifier: defaults to the file name,
directory: defaults to the file's directory,
last-modified: defaults to the last-modified time of the file,
content-length: defaults to the length of the file.
Then, for each of the file's extensions, it looks into the associated database record and fills in the remaining attributes. The html extension record, for example, might define the default value for content-type to be text/html. The en extension record will probably define the content-language default value to be en, and finally the gz extension record will probably state that the resource's content-encoding default value should be x-gzip. Once the set of default attribute values is constructed, the resource is initialized and returned.
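The way the class and the default attribute values are assembled for foo.en.html.gz can be sketched as follows. This is a minimal illustration, not the indexer's implementation: DefaultAttributesSketch, ExtensionRecord, and the method names are hypothetical, the record layout is assumed, and two behaviours the text does not specify are chosen arbitrarily here (the first extension that defines a class wins, and an extension record never overrides an attribute that is already set).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; the record layout is an assumption, not Jigsaw's data model.
class DefaultAttributesSketch {
    /** One extension record: an optional resource class plus default attribute values. */
    static final class ExtensionRecord {
        final String resourceClass; // null when the record defines no class
        final Map<String, String> defaults = new LinkedHashMap<>();
        ExtensionRecord(String resourceClass) { this.resourceClass = resourceClass; }
        ExtensionRecord with(String attribute, String value) {
            defaults.put(attribute, value);
            return this;
        }
    }

    /** A sample extensions database matching the example in the text. */
    static Map<String, ExtensionRecord> sampleDatabase() {
        Map<String, ExtensionRecord> db = new HashMap<>();
        db.put("html", new ExtensionRecord("w3c.jigsaw.resources.FileResource")
                .with("content-type", "text/html"));
        db.put("en", new ExtensionRecord(null).with("content-language", "en"));
        db.put("gz", new ExtensionRecord(null).with("content-encoding", "x-gzip"));
        return db;
    }

    /** Picks the class of the first extension that defines one (ordering is an assumption). */
    static Optional<String> resourceClassFor(List<String> extensions,
                                             Map<String, ExtensionRecord> db) {
        for (String ext : extensions) {
            ExtensionRecord record = db.get(ext);
            if (record != null && record.resourceClass != null) {
                return Optional.of(record.resourceClass);
            }
        }
        return Optional.empty();
    }

    /** Builds the predefined attributes, then fills in the rest from each extension record. */
    static Map<String, String> defaultsFor(String directory, String fileName,
                                           long lastModified, long length,
                                           List<String> extensions,
                                           Map<String, ExtensionRecord> db) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("identifier", fileName);
        attrs.put("directory", directory);
        attrs.put("last-modified", Long.toString(lastModified));
        attrs.put("content-length", Long.toString(length));
        for (String ext : extensions) {
            ExtensionRecord record = db.get(ext);
            if (record == null) {
                continue; // unknown extensions contribute nothing
            }
            for (Map.Entry<String, String> e : record.defaults.entrySet()) {
                attrs.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return attrs;
    }

    public static void main(String[] args) {
        Map<String, ExtensionRecord> db = sampleDatabase();
        List<String> extensions = Arrays.asList("en", "html", "gz");
        System.out.println(resourceClassFor(extensions, db).orElse("no class"));
        System.out.println(defaultsFor("User", "foo.en.html.gz", 0L, 1234L, extensions, db));
    }
}
```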
When the factory is called to index a directory, it examines its directory templates database. This database allows the web admin to map directory names to specific sub-classes of resources.
For each directory template, the web admin first specifies an appropriate resource class. A typical setting might specify, for example, that all directories named Putable should be exported by an instance of PutableDirectory.
The class attached to a directory template need not be a sub-class of DirectoryResource. You can specify, for example, that directories named CVS should be exported through a CvsDirectoryResource, which will provide you with a form-based interface to CVS.
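The directory templates database can be pictured as a simple mapping from directory names to resource class names, as in the Java sketch below. DirectoryTemplateSketch and its methods are hypothetical, class names are kept as plain strings, and falling back to DirectoryResource when no template matches is an assumption of this sketch rather than something the text states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Jigsaw's directory templates implementation.
class DirectoryTemplateSketch {
    // Assumption: directories without a matching template use this class.
    static final String FALLBACK_CLASS = "DirectoryResource";

    private final Map<String, String> templates = new HashMap<>();

    /** Defines (or redefines) the template for directories with the given name. */
    void define(String directoryName, String resourceClass) {
        templates.put(directoryName, resourceClass);
    }

    /** Returns the class name that should export a directory with the given name. */
    String classFor(String directoryName) {
        return templates.getOrDefault(directoryName, FALLBACK_CLASS);
    }

    public static void main(String[] args) {
        DirectoryTemplateSketch db = new DirectoryTemplateSketch();
        db.define("Putable", "PutableDirectory");
        db.define("CVS", "CvsDirectoryResource");

        System.out.println(db.classFor("CVS"));     // CvsDirectoryResource
        System.out.println(db.classFor("Putable")); // PutableDirectory
        System.out.println(db.classFor("User"));    // DirectoryResource
    }
}
```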
Configuring the Jigsaw factory consists of editing the set of indexers and, for each indexer, editing the extensions and directory templates databases. This can be done entirely through the administration application. This section describes how this works; you might also want to check the configuration tutorial.
When you connect to the Jigsaw admin server through the JigAdm application, you'll see that each opened server has a node named indexers. At installation time, this will only display the default indexer, which knows about the usual MIME types.
Open the default indexer node, and its extensions database. This displays the sorted list of currently defined extensions. To remove an extension record, select it by clicking on its name, and press the Delete Resource button (at the bottom of the right panel): the extension record is deleted from the database. To edit a particular extension record, select it. At the top of the right panel you can see a number of buttons; click on the Attributes button. This will bring up a form containing all the default attribute values for the extension. This form changes depending on the class that you have attached to the extension (an extension with no class applies to all resources, so it lets you edit the HTTPResource attribute values). You can change any of these values, which will be provided as default attribute values for resources wrapping a file that matches this particular extension.
To define a new extension, select the extensions node. This will pop up a form asking you for the extension name (the identifier field at the top) and the class. Let's say you want to define the extension ps for exporting application/postscript files. Type in the name of the extension (here ps), attach the w3c.jigsaw.resources.FileResource class to it, then click on the Add Resource button. Select the newly created extension and click on the upper Attributes button. This will pop up the attribute editor. State that the default value for content-type is application/postscript, and press the Commit button. You are done: all files having the ps extension will be exported through a FileResource whose default value for the content-type attribute will be application/postscript.
Now, let's create some directory templates. Open the directories node. This will display the sorted list of currently defined templates. To remove a directory template, just select it and press the Delete Resource button (at the bottom of the right panel). To edit the attributes of a directory template, click on its name and select the Attributes sheet. This will display the set of attributes for the directory template itself.