Playing with the new Validator API
A new release of the W3C Markup Validator is usually good news for Web designers: it means bug fixes, better documentation, more document types supported, and little usability improvements to make the life of HTML coders easier. The latest version of the validator, released this week, is also good news for developers, giving us a new toy to play with, and build upon.
Meet the validator API: a simple way for applications to use the power of the validator. The validator has had something a little similar for a while: a proprietary XML output format, which has done a great job as a proof of concept, and is now being deprecated for something more powerful.
The validator API gives you validation of a document in two little steps:
- Send an HTTP query (there are tons of libraries doing this, in every language) to a validation service: either your own, if you have the validator installed on your server (fast and easy...), or the one at W3C (free, but slower).
- Parse the XML-based response. The response also happens to be SOAP 1.2 compliant, which brings the benefit of a lot of available parsing libraries.
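For the first step, here is a minimal Python sketch using only the standard library. It assumes the public service lives at `https://validator.w3.org/check` and that the SOAP response is requested with `output=soap12`; the function names, the User-Agent string and the UTF-8 decoding are illustrative choices, not part of the validator itself. Point `endpoint` at your own installation if you have one.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint of the public W3C service; replace with your own install.
VALIDATOR_ENDPOINT = "https://validator.w3.org/check"


def build_check_url(document_uri, endpoint=VALIDATOR_ENDPOINT):
    """Return the URL that asks the validator to check document_uri.

    The "uri" and "output=soap12" parameters are assumptions about the
    service's query interface; check the API documentation for your version.
    """
    query = urlencode({"uri": document_uri, "output": "soap12"})
    return f"{endpoint}?{query}"


def fetch_validation_response(document_uri, endpoint=VALIDATOR_ENDPOINT, timeout=30):
    """Send the HTTP query and return the raw XML response as text."""
    request = Request(
        build_check_url(document_uri, endpoint),
        # Hypothetical User-Agent; identify your own tool here.
        headers={"User-Agent": "validator-api-sketch/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        # Assumes the response body is UTF-8 encoded.
        return response.read().decode("utf-8")
```

Calling `fetch_validation_response("http://www.example.com/")` should hand back the XML document to parse in the second step.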
Having an XML response means that any other format (XHTML, Atom, whatever you like) is just a transformation away. Just match validity for a quick peek at the validity, and loop on <xsl:for-each select='m:error' … for more meaty results.
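If you would rather not write a stylesheet, the same two lookups can be sketched in Python with the standard library's ElementTree. This assumes the `m:` prefix maps to the namespace `http://www.w3.org/2005/10/markup-validator` and that each `m:error` carries `m:line`, `m:col` and `m:message` children. The `SAMPLE` document below is hand-written for illustration and is not actual validator output.

```python
import xml.etree.ElementTree as ET

# Assumed namespace behind the "m:" prefix in the validator's response.
NS = {"m": "http://www.w3.org/2005/10/markup-validator"}

# Hand-written, abbreviated response used only to exercise the parser.
SAMPLE = """
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Body>
    <m:markupvalidationresponse xmlns:m="http://www.w3.org/2005/10/markup-validator">
      <m:uri>http://www.example.com/</m:uri>
      <m:validity>false</m:validity>
      <m:errors>
        <m:errorcount>1</m:errorcount>
        <m:errorlist>
          <m:error>
            <m:line>12</m:line>
            <m:col>7</m:col>
            <m:message>end tag for "p" omitted</m:message>
          </m:error>
        </m:errorlist>
      </m:errors>
    </m:markupvalidationresponse>
  </env:Body>
</env:Envelope>
"""


def _to_int(text):
    """Convert a numeric string to int, or return None if it is not numeric."""
    text = text.strip()
    return int(text) if text.isdigit() else None


def summarize(soap_xml):
    """Return (is_valid, errors); errors is a list of (line, col, message)."""
    root = ET.fromstring(soap_xml)
    # Quick peek: the m:validity element holds "true" or "false".
    validity = root.find(".//m:validity", NS)
    is_valid = validity is not None and (validity.text or "").strip() == "true"
    # Meatier results: loop over every m:error element.
    errors = []
    for error in root.iterfind(".//m:error", NS):
        errors.append((
            _to_int(error.findtext("m:line", default="", namespaces=NS)),
            _to_int(error.findtext("m:col", default="", namespaces=NS)),
            error.findtext("m:message", default="", namespaces=NS).strip(),
        ))
    return is_valid, errors


is_valid, errors = summarize(SAMPLE)
for line, col, message in errors:
    print(f"line {line}, column {col}: {message}")
```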
Why not just scrape the XHTML results of the validator? Because, while still experimental, the SOAP output is well documented, and will not be changed without warning once stable. It's also very similar to the APIs of the CSS and Feed validators, meaning a lot of existing code can easily be reused.
Is XSLT still too much work? Just use one of the libraries available. In Perl for example, thanks to Struan Donald's WebService::Validator::HTML::W3C, the work is pretty much reduced to three lines of code:
    use WebService::Validator::HTML::W3C;
    my $v = WebService::Validator::HTML::W3C->new();
    $v->validate("http://www.example.com/");
This gets us a nice object to play with. We now know whether the checked document passed validation ($v->is_valid), we can access all the validation errors ($v->errors), etc.
Validating a batch of documents can be done in a few more lines of code: loop on a list of URIs, sleep() between requests to avoid overloading the validation service, and voilà!
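As a rough illustration of that loop, here is a small Python sketch. The `check` argument is a hypothetical callable that validates one URI and returns whatever result you like (a function wrapping your HTTP query and parser, for instance); the one-second default delay is an arbitrary assumption, not a documented limit of the service.

```python
import time


def validate_batch(uris, check, delay_seconds=1.0, sleep=time.sleep):
    """Call check(uri) for each URI, pausing between consecutive requests.

    check is a caller-supplied (hypothetical) function that validates one
    URI. The sleep parameter exists so the pause can be replaced in tests.
    """
    results = {}
    for index, uri in enumerate(uris):
        if index > 0:
            # Pause before every request except the first one,
            # to avoid overloading the validation service.
            sleep(delay_seconds)
        try:
            results[uri] = check(uri)
        except Exception as exc:
            # Record the failure and carry on with the remaining URIs.
            results[uri] = exc
    return results
```

Storing the exception instead of raising it means one unreachable document does not abort the whole batch; whether that is the right trade-off depends on what you do with the results.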
And if neither Perl nor XSLT is your cup of tea, no problem! Whether you're juggling with Java or rolling on Rails, we are waiting to be wooed away by your implementation, and we would love to list your libraries on the validator's site.
The next release is only weeks away. The documentation is here. Ready?
I've been playing around with it, trackback doesn't seem to work so here is my post about it: http://www.joostdevalk.nl/blog/w3c-validator-api/.
What about a WSDL for the validator? It would be much easier to use the service if there were one; e.g., using the Delphi WSDL importer I could get the interface automatically.
Is there a regression test suite (valid and invalid HTML with expected test result) with which to test a validator?
This would make it much easier for native XHTML validators to be produced and kept in step.
Is there a modern/recent recommended PHP snippet for calling the Validator, or even a library?