Wednesday, August 24, 2005

The OPS Blog Sample Application, Part II

This is a follow-up to The OPS Blog Sample Application, Part I, published a few months ago in this blog. Part I covered basics such as persistence, document formats, and XML-RPC. In this installment, we continue the development of this sample application for OPS 3.0.


First of all, the OPS Blog application has been upgraded to the latest and greatest best practices related to XForms and the PFC (for more information about the changes between OPS 2.8 and 3.0 beta, please visit this page). This means using the new XForms engine, but also relying more on the cool XML submission feature built into the PFC.

Security and Authentication

One issue to tackle is figuring out how to perform authentication of the XML-RPC requests. With the Blogger and MetaWebLog APIs, username and password are embedded within the body of the XML-RPC message. I thought that the best solution would be to use eXist's support for users, rather than implementing our own user management. It turns out that user management functions in eXist are available through XQuery extension functions, so accessing them does not require modifying the XML:DB processors of OPS. For example, to check whether a user is authenticated, you can run the following query:

xquery version "1.0";
<authenticated>{xmldb:authenticate('xmldb:exist:///db/orbeon/blog-example/', 'ebruchez', 'private')}</authenticated>

The result document is <authenticated>true</authenticated> or <authenticated>false</authenticated>. Simple enough: if the result is false, the application simply returns an XML-RPC fault message telling the calling application that authentication failed.
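To make that last step concrete, here is a minimal Python sketch (not OPS code) that reads the result document and, on failure, serializes an XML-RPC fault using the standard library. The function name, the fault code 401, and the fault message are assumptions made for this example; the values the blog application actually returns are not shown here.

```python
import xml.etree.ElementTree as ET
import xmlrpc.client


def fault_if_not_authenticated(result_xml, fault_code=401):
    """Return an XML-RPC fault response (a string) if authentication failed,
    or None if the <authenticated> document contains 'true'.

    The fault code and message are illustrative assumptions.
    """
    root = ET.fromstring(result_xml)
    if root.tag == "authenticated" and (root.text or "").strip() == "true":
        return None
    fault = xmlrpc.client.Fault(fault_code, "Authentication failed")
    # Passing a Fault to dumps() yields a <methodResponse> wrapping a <fault>.
    return xmlrpc.client.dumps(fault)
```

A caller would send the returned string back as the body of the XML-RPC response, and continue with normal processing when the function returns None.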

It is fairly easy to plug in your own authentication code, since all the authentication logic is under the data-access directory.

Let's now go further. There are several aspects to users and authentication to consider in the blog application:

  • XML-RPC access. All accesses to the application through the XML-RPC APIs should be authenticated. They should also be authorized: in theory, not all existing users may be authorized to read and write all the blog posts of the system, for example. At the moment we associate a particular blog with a user, and all read accesses only return blogs and posts belonging to the user making the request. If a user is not the owner of a blog, they will receive an empty list of blogs. However, post creation is currently not restricted. This should be improved in the future.

  • Blog owners. A blog should belong to one or more users. Such users should be able to perform administrative tasks on that blog, such as changing its name, creating categories, editing comments, creating new blogs, etc.

  • System administrator. A system administrator must be able to create new users, and possibly perform the tasks that regular blog administrators can perform as well.

Note that the HTML and RSS views of the blogs do not require authentication at all, as a blog is usually meant to be public.

Users Administration Page

This page shows the list of users in the ops-blog group within eXist. It also allows you to create new users in that group and to delete existing users. It is implemented with XForms, by using the very cool XForms replace="instance" feature to update the list of users as you go. The page is declared as follows:

  <page path-info="/blog/admin/users" model="admin/edit-users/edit-users.xpl" view="admin/edit-users/edit-users.xsl"/>

Backend functionality is implemented as services with actions declared in the PFC, for example:

  <page path-info="/blog/admin/add-user" view="admin/edit-users/edit-users.xpl">
    <action action="admin/edit-users/add-user-action.xpl"/>
  </page>
  <page path-info="/blog/admin/delete-user" view="admin/edit-users/edit-users.xpl">
    <action action="admin/edit-users/delete-user-action.xpl"/>
  </page>

Note that a page view is specified in order to return the updated list of users as the resulting instance, and that an action pipeline is used to actually perform the "add" or "delete" actions. Heed this, as it is an important "best practice" of OPS 3.0!

[Screenshot: Users Administration Page]

Blogs Administration Page

This page is very similar to the user administration page: it shows the list of existing blogs, allowing for editing and deleting them as well as creating new blogs. It nicely handles blog categories as well.

[Screenshot: Blogs Administration Page]

Comments Page

The "Recent Posts" page now also handles comments with XForms. I decided to use a single OPS page to handle showing the latest posts as well as showing a single post with attached comments. The page looks quite nice already:

[Screenshot: Comments Page]

URL Structure

A REST-style application should display nice-looking URLs. I propose the following structure of paths and parameters for URLs:

  • /{user}/ -> default blog or list of blogs
  • /{user}/{blog-id}
  • /{user}/{blog-id}?category=123
  • /{user}/{blog-id}?format=rss
  • /{user}/{blog-id}?format=atom
  • /{user}/{blog-id}/{post-id}
  • /{user}/{blog-id}/{post-id}?format=rss
  • /{user}/{blog-id}/{post-id}?format=atom
  • /{user}/{blog-id}/{post-id}?format=rss&category=123
  • /admin/users
  • /admin/blogs

Path information and parameters are easily extracted by the PFC into an XML submission, as follows:

  <page id="post" path-info="/blog/([^/]+)/([^/]+)/([^/]+)" matcher="oxf:perl5-matcher"
        default-submission="recent-posts/recent-posts-default-submission.xml"
        model="recent-posts/recent-posts-model.xpl" view="recent-posts/recent-posts-view.xpl">
    <setvalue ref="/form/username" matcher-group="1"/>
    <setvalue ref="/form/blog-id" matcher-group="2"/>
    <setvalue ref="/form/post-id" matcher-group="3"/>
    <setvalue ref="/form/format" parameter="format"/>
    <setvalue ref="/form/category" parameter="category"/>
  </page>

The default-submission attribute specifies the XML document used if no external submission is provided, which is the case here. The setvalue elements then fill out this XML template with information from the URL. Note the improved setvalue syntax: OPS 2.8 featured a param element, which did not allow specifying which regular expression group was to be chosen (except implicitly by order) and did not allow extracting URL parameters.
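For readers who want to see the mechanics outside of OPS, the following Python sketch mimics what the PFC does with this declaration. It is only an approximation under stated assumptions: the content of the default submission document is hypothetical (the real recent-posts-default-submission.xml is not shown here), the function name is made up, and I assume the regular expression must match the whole path.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit

# Same pattern as the path-info attribute above.
POST_PATH = re.compile(r"/blog/([^/]+)/([^/]+)/([^/]+)")

# Hypothetical stand-in for recent-posts-default-submission.xml.
DEFAULT_SUBMISSION = (
    "<form><username/><blog-id/><post-id/><format/><category/></form>"
)


def build_submission(url):
    """Fill the default submission from a URL; return None if the path does not match."""
    parts = urlsplit(url)
    match = POST_PATH.fullmatch(parts.path)
    if match is None:
        return None
    params = parse_qs(parts.query)
    form = ET.fromstring(DEFAULT_SUBMISSION)
    # Regular expression groups 1 to 3 fill the first three elements.
    for group, name in enumerate(("username", "blog-id", "post-id"), start=1):
        form.find(name).text = match.group(group)
    # URL parameters are copied only when they are present.
    for name in ("format", "category"):
        if name in params:
            form.find(name).text = params[name][0]
    return form
```

Calling build_submission with a path such as /blog/{user}/{blog-id}/{post-id}?format=rss returns a form element whose children carry the three path segments and the format parameter, which is the kind of document the page model then receives.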

Look and Feel

In OPS, you usually separate page model from page view, according to the MVC architecture. The page model produces an XML document consumed by the page view. The page view itself is split into a page-specific view, and a site-specific epilogue which can apply common formatting to all pages.

One common feature of blogging software is that users are allowed to customize the appearance of their blog. In the OPS Blog sample, we would like at some point to allow users to configure their own page view stylesheet for that purpose. We also need to produce similar, but not identical, output for RSS and other feeds. The solution to both issues is to replace the single XSLT stylesheet for the page view with an XPL pipeline which dynamically selects the page view stylesheet to use:

  • HTML View Stylesheet. This stylesheet formats blog data into a standard HTML view.

  • Feeds Stylesheets. There is one stylesheet per feed type, including RSS 2, Atom, etc. For now, only RSS 2 is supported.

  • User-Defined Stylesheets. In the future, those will allow user-defined HTML views.
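The selection logic itself is simple. Here is a rough Python sketch of the decision the view pipeline has to make; the real implementation is an XPL pipeline, and the stylesheet file names and function name below are hypothetical.

```python
# Hypothetical file names; the actual stylesheets are not named in this post.
FEED_STYLESHEETS = {"rss": "recent-posts-rss.xsl"}
DEFAULT_HTML_STYLESHEET = "recent-posts-html.xsl"


def select_view_stylesheet(fmt=None, user_stylesheet=None):
    """Pick the page view stylesheet for a request.

    fmt is the value of the 'format' URL parameter (None or empty for HTML).
    user_stylesheet is a per-user override, which is future work in the blog sample.
    """
    if fmt:
        try:
            return FEED_STYLESHEETS[fmt]
        except KeyError:
            # Only RSS is registered for now, so e.g. 'atom' is rejected.
            raise ValueError(f"unsupported feed format: {fmt}") from None
    return user_stylesheet or DEFAULT_HTML_STYLESHEET
```

Keeping the feed stylesheets in a lookup table means that adding Atom later only requires registering one more entry.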

What Next?

There has been progress since Part I, but the application is not quite done yet. Here is a list of upcoming tasks:

  • Blog Administration Page. It wouldn't hurt to make the page look nicer. There are also some minor bugs to fix. Finally, proper authorization must be implemented.

  • Users Administration Page. Similar comments apply.

  • Comments Page. Comments preview and submission must be implemented.

  • Validation. Form validation must be improved. XML validation must also be improved to validate all data-access queries.

Once we get there, we'll be quite close to a pretty nice blog application based entirely on XML technologies, from XForms to XSLT to native XML storage!

As usual, the source code is available from CVS, under src/examples/web/examples/blog. It is also available from the unstable builds.

Tuesday, August 16, 2005

OPS Stack Traces

OPS, like many Java platforms and applications, has to deal with exceptional conditions occurring at runtime: an XML file may be ill-formed and cause parsing errors; a Page Flow configuration may be incorrect; an XSLT stylesheet or an XForms page may contain incorrect markup; and so on.

In such cases a Java exception is raised and propagated up the Java method call stack. For practical reasons, such as adding more meaningful error messages, the initial exception is sometimes encapsulated within other exceptions. Finally, a top-level exception handler catches and formats the exceptions for presentation to the developer or for logging purposes.

Java stack traces are certainly useful for the developer who writes Java code: they provide the name of the Java source file as well as line numbers. However, in most situations the developer writing an application on top of OPS did not write the offending Java code, and instead cares about XML file names and location information related to those files. This is why, whenever possible, OPS provides location data wrapped into a ValidationException Java exception.

This was pretty good, but a recent blog entry by Sylvain Wallez has motivated some improvements to the error messages provided by OPS. A hierarchy of location data is in fact provided during the execution of the OPS Page Flow Controller and XPL pipelines in general. Collecting and displaying this information amounts to creating an OPS stack trace showing the sequence of calls that have occurred until the exception was raised. This is absolutely invaluable information for the OPS developer.
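The idea of collecting location data along a chain of wrapped exceptions can be illustrated with a short Python sketch. This is not the OPS implementation, which is written in Java around ValidationException; the class, the function names, and the file names below are made up for the example.

```python
class LocatedError(Exception):
    """Exception carrying XML location data (illustrative, not the OPS class)."""

    def __init__(self, message, system_id, line, column):
        super().__init__(message)
        self.location = (system_id, line, column)


def location_trace(exc):
    """Walk the chain of causes and collect locations, outermost call first."""
    trace = []
    while exc is not None:
        location = getattr(exc, "location", None)
        if location is not None:
            trace.append(location)
        exc = exc.__cause__
    return trace


def run_stylesheet():
    # Hypothetical error raised while processing a stylesheet.
    raise LocatedError("unknown element", "view.xsl", 12, 5)


def run_pipeline():
    try:
        run_stylesheet()
    except LocatedError as inner:
        # Wrap the initial exception, adding the caller's own location.
        raise LocatedError("processor failed", "model.xpl", 30, 9) from inner
```

A top-level handler that catches the exception raised by run_pipeline can call location_trace on it and print one line per XML location, which is the kind of listing an OPS stack trace provides.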

Most of the code was already there and it was a breeze to enhance OPS to display stack traces more meaningful to an OPS user, such as the following:

This does not mean that Java stack traces are no longer shown. On the contrary: after a few cosmetic changes, the Java exceptions look much nicer:

  • Java stack traces are split into their different request components. For example, an exception may have been produced by a request on the OPSServlet, forwarded again to OPSServlet, then going through the OPS example portal's OPSPortlet. In this case, the stack trace is split into three parts for clarity.

  • Some coloring is applied to the class names to easily distinguish OPS classes from third-party classes.

  • As OPS users may have noticed, Java stack traces can be quite long due to the streamed nature of the execution of XPL pipelines. Exceptions are now initially folded, except for the first few lines, and expandable on click.
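The folding and coloring described above boil down to two small decisions per stack trace. The Python sketch below shows them in isolation; in OPS the rendering is done by error.xsl, so the function names, the CSS class names, the number of visible lines, and the org.orbeon. package prefix used to recognize OPS classes are all assumptions for illustration.

```python
def fold_trace(frames, visible=3):
    """Split stack frames into (always shown, initially hidden)."""
    return frames[:visible], frames[visible:]


def css_class_for(frame, ops_prefix="org.orbeon."):
    """Return a CSS class name used to color a frame by its origin."""
    return "ops-class" if frame.startswith(ops_prefix) else "third-party-class"
```

The hidden part of the trace would then be rendered in a collapsed block that expands on click.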

Of course, as was the case before, the layout of the OPS stack trace and Java stack traces can be customized: simply edit error.xpl and error.xsl.

More good news: all this is now in the nightly builds, and will soon be in OPS 3.0 final as well.

Of course, we welcome any suggestions as to how the current error reporting in OPS could be improved. Don't be shy!