Monday, October 21, 2013

Supporting permissions in your persistence API implementation

Up to version 4.2, implementations of the persistence API didn't need to worry about permissions; permissions were entirely checked by the part of Form Runner that was calling the persistence API. This changed with version 4.3, which introduced owner/group-based permissions. Let's see what changed:
  1. Without this feature, i.e. up to version 4.2, you could determine whether a user can perform a certain operation just based on the user's roles, the operation (create, read, update, or delete), and the app/form names. For instance, we could know if Alice was authorized to read data for the License application form even before calling the persistence API.
  2. This changes with the owner/group-based permissions: it could be that Alice can read all the data for the License application form just based on her roles (maybe she's an admin, or in charge of processing applications); but if not, she might still be authorized to access some of the data for the License application form, for instance the data she created, or data created by other users in the same group, depending on how permissions were set up.
This second point means that some filtering needs to happen, based on who the user is. Say Alice accesses the License application form summary page, and the system was set up so she can only see and access her own applications. In that case, the summary page might show the 5 applications she filled out, amongst the thousands that may be stored in the database. It would be unreasonable for the implementation of the persistence API to return all the applications, and count on the caller to do the filtering: as more data is entered into the system, this would become increasingly inefficient. Hence, the job of checking permissions has been in part shifted to the implementation of the persistence API. Now:
  • The search API only returns documents the current user has access to and, for each document, specifies which operations the user can perform through the operations attribute on the document element in the query response.
  • This same information is also returned when reading a document. In that case, since the document is returned in the body, the list of operations is returned "out-of-band" through the Orbeon-Operations header.
To make it easier for implementations of the persistence API to deal with permissions, Orbeon Forms offers a few helper functions, provided as static methods that can be called from Java, implemented in FormRunnerPermissionsOps.scala. All the static methods mentioned below are in the org.orbeon.oxf.fr.FormRunner class. 
  • If you're implementing search, you could use those operations as follows:
    1. First, you might want to ask: is Alice authorized to access all license applications just based on her roles? The static method javaAuthorizedOperationsBasedOnRoles() answers this question.
    2. If the answer is positive, then you can return all the data, just as was done in 4.2 and earlier.
    3. However, if the answer is negative (there are no operations Alice can do on license applications just based on her roles), then you'll want to filter out and return only the applications Alice directly created, or that were created by another user in her group, depending on the configuration.
    4. Finally, when returning applications Alice has access to, you can find what operations she can perform on each document by calling javaAllAuthorizedOperations().
  • If you're implementing the read operation, as mentioned earlier, you're expected to return the operations the user has access to through the Orbeon-Operations header. If you wish, you can delegate this task to the setAllAuthorizedOperationsHeader() method, which will figure out the list of operations and set the header for you.
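The search steps above can be sketched as follows. This is only an illustration of the decision flow, written in TypeScript so it runs on its own: every type and function in it is hypothetical, the two helper functions merely stand in for the questions that javaAuthorizedOperationsBasedOnRoles() and javaAllAuthorizedOperations() answer, and their real signatures are not shown here. A real implementation would also push the owner/group filter into the database query rather than filter an in-memory array.

```typescript
// Hypothetical model: none of these types or functions are Orbeon APIs.
type Operation = "create" | "read" | "update" | "delete";

interface User { username: string; group: string; roles: string[] }
interface StoredDoc { id: string; owner: string; group: string }
interface SearchHit { id: string; operations: Operation[] }

// Assumed shape of the permissions configured for one form.
interface FormPermissions {
  byRole: { [role: string]: Operation[] }; // operations granted by a role
  owner: Operation[];                      // operations granted to the document's creator
  groupMember: Operation[];                // operations granted to users in the creator's group
}

function unique(ops: Operation[]): Operation[] {
  return ops.filter((op, i) => ops.indexOf(op) === i);
}

// Stand-in for the question javaAuthorizedOperationsBasedOnRoles() answers.
function operationsBasedOnRoles(perms: FormPermissions, user: User): Operation[] {
  let ops: Operation[] = [];
  for (const role of user.roles) ops = ops.concat(perms.byRole[role] || []);
  return unique(ops);
}

// Stand-in for the question javaAllAuthorizedOperations() answers for one document.
function allOperations(perms: FormPermissions, user: User, doc: StoredDoc): Operation[] {
  let ops = operationsBasedOnRoles(perms, user);
  if (doc.owner === user.username) ops = ops.concat(perms.owner);
  if (doc.group === user.group) ops = ops.concat(perms.groupMember);
  return unique(ops);
}

function search(perms: FormPermissions, user: User, docs: StoredDoc[]): SearchHit[] {
  // Steps 1 to 3: if roles alone grant something, every document is visible;
  // otherwise keep only documents the user created or that belong to the user's group.
  const rolesSuffice = operationsBasedOnRoles(perms, user).length > 0;
  const visible = rolesSuffice
    ? docs
    : docs.filter((d) => d.owner === user.username || d.group === user.group);
  // Step 4: attach the operations allowed on each returned document.
  return visible
    .map((d) => ({ id: d.id, operations: allOperations(perms, user, d) }))
    .filter((hit) => hit.operations.length > 0);
}

const licensePerms: FormPermissions = {
  byRole: { clerk: ["read", "update"] },
  owner: ["read", "update"],
  groupMember: ["read"],
};
const storedDocs: StoredDoc[] = [
  { id: "1", owner: "alice", group: "north" },
  { id: "2", owner: "bob", group: "north" },
  { id: "3", owner: "carol", group: "south" },
];
const alice: User = { username: "alice", group: "north", roles: [] };
const dana: User = { username: "dana", group: "south", roles: ["clerk"] };

const aliceHits = search(licensePerms, alice, storedDocs); // documents 1 and 2 only
const danaHits = search(licensePerms, dana, storedDocs);   // all three documents
```

Alice has no role-based operations, so she only gets her own document (read and update) and Bob's document from her group (read). Dana's clerk role grants operations by itself, so she sees everything.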

Sunday, October 13, 2013

Unification of the relational persistence layers

XForms by itself doesn't fully solve the question of how to persist data. It provides solid foundations that we can leverage to persist data, but intentionally doesn't answer questions such as "how is the data organized?" or "how is data saved, retrieved, or searched in an actual database, say Oracle?". From the start, we aimed for Form Builder and Form Runner to work as much as possible out-of-the-box, which meant that we had to answer those questions. We did this by:

  • Defining an API for persistence, which Form Builder and Form Runner use, so both form definitions and form data are saved through this API.
  • Providing built-in implementations of this API for some of the most popular databases.
On the implementation side, we started with the eXist XML database; it ships with Orbeon Forms, and enables you to have a solution that works out-of-the-box without any setup required. Then, we added implementations for two relational databases: Oracle and MySQL. Those two databases are almost opposites in terms of features: Oracle is very rich and provides tons of extensions, while MySQL is a much "simpler" database. This led us to implement things quite differently on those two databases, so we went with two completely separate implementations, which didn't really share any code.

Then, with version 4.3, we added support for DB2, and continued having separate code for each database, but found this approach not to be sustainable. It was very much a maintenance problem, as every fix, new feature, or performance improvement done in one version of the code had to be ported to the two other versions. This meant that the 3 implementations were not fully in sync: for instance, in 4.3 some features were added and were only available in DB2 and MySQL, but not Oracle, while some performance improvements were available for Oracle but not DB2 and MySQL. Making any type of change felt like a huge task, let alone thinking of supporting more databases.

In the upcoming Orbeon Forms 4.4, we resolved this by creating a unified implementation, which works across relational databases. The difficulty was in finding ways of doing things that didn't rely on unique database features, but were efficient enough on all databases. There are still places in the code where we need to do something different depending on the database, but those are now the exception rather than the rule.

While having a unified implementation for relational databases doesn't provide any new user-facing feature, this will allow us to improve that implementation faster, and to add support for new relational databases (did I hear someone mention SQL Server and PostgreSQL?).

Wednesday, October 9, 2013

More dynamic control bindings in Form Builder

Recently, a user asked us why, in Form Builder, there is a separate Number control in the toolbox. Is this the same as an Input Field set to a Decimal datatype?

Our answer was, unfortunately: "Not really!". When you add a Number control, you are in fact adding an fr:number XBL component, which is able to format numbers back and forth with customizable decimal and grouping separators, and to top it off supports a prefix and suffix. In short, it has more features than just a plain input field, and it is part of a set of so-called typed controls.

Form Builder Typed Controls
Obviously, that situation is not ideal, as there are two incompatible ways of inserting a number field in Form Builder!

For the longest time we have wanted to have fully dynamic XBL bindings, depending on datatypes. This is a task that we haven't yet completed.

But it occurred to us that we could already go one step in the right direction, by providing this functionality at the Form Builder level. And this is possible because Form Builder knows about datatypes at design time.

So that's just what we have done, and now, from a user's perspective, the Number control in the toolbox is just a shortcut for inserting an Input Field with Decimal datatype, as you would have expected.

What's even better is that we now have a general underlying mechanism to support this type of mapping. At the low level, we simply say that the XBL fr:number component must be used when we actually have a plain fr:number element in the form, or an xf:input bound to a decimal datatype. We express this with a CSS selector:
<xbl:binding element="fr|number, xf|input:xxf-type('xs:decimal')">
(We implemented a full CSS 3 selector parser along the way!)

This change will be available in the upcoming Orbeon Forms 4.4.

Tuesday, October 8, 2013

CoffeeScript: Create objects referencing other properties

In CoffeeScript, you can easily create an object and set the value of some of its properties:

    body =
        title: 'Document body'
        element: $('body')
        width: $('body').width()

The width property is used to cache the width of the body. But the way it is defined isn't ideal as $('body') is duplicated. This can be a concern in terms of performance and DRYness. Instead, you could write:

    body = {}
    body.title = 'Document body'
    body.element = $('body')
    body.width = body.element.width()

I find the lack of indentation makes this code less clear. Also body. needs to be repeated on every line. Using the do keyword, you can introduce an indentation, which makes the code clearer, but this doesn't solve the lack of DRYness:

    body = {}; do ->
        body.title = 'Document body'
        body.element = $('body')
        body.width = body.element.width()

Can JavaScript's new operator solve all our problems? Let's try:

    body = new ->
        @title = 'Document body'
        @element = $('body')
        @width = @element.width()

This looks like the perfect solution… unless the value of the last property is an object, in which case body will be equal to that object. For instance, in the following case, body will point to the document (the body's parent), which, obviously, isn't what we want. This is a result of CoffeeScript functions returning the value of the last expression, and the way new works in JavaScript when new is invoked on a function that returns an object.

    # Doesn't work, as body will point to the document
    body = new ->
        @title = 'Document body'
        @element = $('body')
        @width = @element.width()
        @parent = @element.parent()
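To see the rule in isolation, here is a small sketch in TypeScript of roughly what the compiled functions do, assuming CoffeeScript turns the last assignment into a return statement. The names and the plain object standing in for the jQuery result are hypothetical, and the functions are typed as any only to keep the sketch short.

```typescript
// Hypothetical stand-in for the object that $('body').parent() would return.
const parentStandIn = { kind: "document" };

// Last expression is an object: `new` hands back that object instead of `this`.
const ReturnsObject: any = function (this: any) {
  this.title = "Document body";
  return (this.parent = parentStandIn);
};

// Last expression is a primitive (like a width): `new` ignores it and returns `this`.
const ReturnsNumber: any = function (this: any) {
  this.title = "Document body";
  return (this.width = 1024);
};

const broken = new ReturnsObject(); // is parentStandIn itself, so it has no title
const fine = new ReturnsNumber();   // the new object, with title and width set
```

This is why the version ending with @width worked while the version ending with @parent does not.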

Until CoffeeScript adds some syntactic sugar not to return the value of the last expression, you can get around this by explicitly adding a return at the end of the function, either when the last property is an object or, if you prefer to avoid a possible mistake, in all cases:

    body = new ->
        @title = 'Document body'
        @element = $('body')
        @width = @element.width()
        @parent = @element.parent()
        return

With the return, I find this code loses some of its elegance, as it looks more like a sequence of statements than the definition of a data structure. Also, this is error-prone, as leaving out the return will work in some cases and not others. Alternatively, you can use an anonymous class, and define the function as its constructor:

    body = new class
        constructor: ->
            @title = 'Document body'
            @element = $('body')
            @width = @element.width()

This makes the code somewhat heavier and harder to understand, especially if you're not declaring many classes in your code. My favorite solution leverages Underscore's _.tap. That function is generally used when chaining operations, but is at its core very simple: it applies the function passed as its second argument to the object passed as its first argument, and returns the first argument. Using it we can write:

    body = _.tap {}, (o) ->
        o.title = 'Document body'
        o.element = $('body')
        o.width = o.element.width()
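For readers who don't use Underscore, the behaviour relied on here can be sketched in a few lines of TypeScript. This is a minimal re-implementation based on the description above, not Underscore's source, and the plain values standing in for the jQuery calls are hypothetical.

```typescript
// Minimal re-implementation of the behaviour described above; not Underscore's source.
function tap<T>(obj: T, interceptor: (o: T) => unknown): T {
  interceptor(obj); // run the function for its side effects on obj
  return obj;       // whatever the function returned is discarded
}

interface BodyInfo { title?: string; width?: number; parent?: object }

const documentStandIn = { kind: "document" }; // hypothetical stand-in for $('body').parent()

const bodyInfo = tap<BodyInfo>({}, (o) => {
  o.title = "Document body";
  o.width = 1024;                      // hypothetical stand-in for o.element.width()
  return (o.parent = documentStandIn); // an object as last value, as CoffeeScript would return
});
```

Because tap always returns its first argument, it doesn't matter whether the last property is an object: bodyInfo is the populated object in every case.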

Now, that is a solution I can use.

Monday, October 7, 2013

Autosave

Glitches happen. While you're filling out a form, your browser might crash, your computer might be disconnected from the power or run out of power, your Internet connection might go down, or the server might have a technical issue. When such a glitch happens, you're likely to lose whatever work you've done since you last saved. Hence was born the Save early, Save often mantra. But why would you have to do what software can automatically do for you?

In Orbeon Forms 4.3, we introduced the autosave feature. When autosave is enabled, as the name implies, your work is automatically saved in the background as you interact with the form. Since we want to keep the distinction between data you explicitly saved and data that was autosaved, we refer to the latter as drafts. You can access drafts from the summary page, on which they are clearly marked as such. Also, if you go back to a form for which you have a draft, you'll be asked whether you'd like to continue your work from that draft.

In Orbeon Forms 4.3, autosave is available for MySQL and DB2, and needs to be enabled through properties. In the upcoming Orbeon Forms 4.4, this feature will be available and enabled by default for all relational databases supported by Orbeon Forms.

Tuesday, October 1, 2013

Spreadsheet-like forms

Many of us are familiar with using spreadsheets with rows and columns of cells. Sometimes we also want to gather data in a form that behaves similarly to a spreadsheet, with data categorized in two dimensions using regular rows and columns. In some cases, the default behavior of XForms may not work exactly the way users expect in a spreadsheet.

This example form walks you through the steps to make XForms work like a spreadsheet. We gather typical income and expense information for an organization for a year, with both row and column totals. We add features such as allowing for empty (“null”) values but still check for valid data in each cell.

The initial form is shown in figure 1:

Figure 1: Income and expenses for four quarters
It gives users several points of feedback:
  1. If they enter any invalid decimal character, an alert appears next to the cell.
  2. Empty cells do not show alerts even though the data is not a valid decimal number.
  3. Rows show the total annual income and total annual expenses.
  4. Columns show the net income for each quarter and the annual net income in the bottom right cell of the table.
Figure 2 shows a version of the form with the data filled in and an error in one of the input fields.
Figure 2: Alert shown on invalid decimal value
In Q4 Expenses the user typed the letter “L” instead of the number “1”. There are several features that we have added to the form to get this functionality.

Allowing empty initial values

One of the problems is that empty fields are not valid decimal numbers. To get around this we have two options. The first option is to create a custom data type, called for example decimalOrNull, that allows empty values in each of the cells. We do this by defining a custom XML Schema data type and putting it in our XForms model.
<xs:simpleType name="decimalOrNull">
    <xs:restriction base="listOfDecimals">
        <xs:maxLength value="1"/>
    </xs:restriction>
</xs:simpleType>
<xs:simpleType name="listOfDecimals">
    <xs:list itemType="xs:decimal"/>
</xs:simpleType>
Our other option is to use the XForms implementation of the decimal data type, which also allows empty values. Many people, like me until recently, are not aware that XForms has its own versions of the standard numeric data types, which allow empty values. This is useful in cases like ours, in particular so that the form does not show errors when it is first rendered to the user.

We use the XForms <xf:bind> element to create binding rules that apply to all the input fields. Here is what the instance that holds the saved data looks like:
<xf:instance xmlns="" id="save-data">
    <data>
        <income>
            <q1/>
            <q2/>
            <q3/>
            <q4/>
        </income>
        <expenses>
            <q1/>
            <q2/>
            <q3/>
            <q4/>
        </expenses>
    </data>
</xf:instance>
And here is our binding rule for our custom data type:
<xf:bind id="decimal-or-null"
         ref="instance('save-data')/*/*" type="decimalOrNull"/>
or alternatively, with the built-in XForms data type for decimal values:
<xf:bind ref="instance('save-data')/*/*" type="xf:decimal"/>
Note that the prefix is xf not xs for this data type!

Because we use wildcards in our path, these binds apply to all quarters in both the income and expenses in our save-data instance.

Adding calculations

Now we take a look at some options for showing calculations. One way to do this is to put <xf:output value=""/> elements in the table cells. But this means that each output needs to have its own formula, and we don’t take advantage of the regularity of our data.

As an alternative design we can do the calculations directly in bind rules. First we separate the data that is to be saved from the calculations. We put all our calculations in a separate instance:
<xf:instance xmlns="" id="calculations">
    <data>
        <net-income>
          <q1/>
          <q2/>
          <q3/>
          <q4/>
        </net-income>
        <income-total/>
        <expenses-total/>
        <net-annual/>
    </data>
</xf:instance>
The remaining binding rules are for creating the totals in the right column and the differences in the last row. We use a predicate to only include the non-empty values in our total calculations. For example here is the total of income:
<xf:bind ref="instance('calculations')/income-total"
         calculate="sum(
            (instance('save-data')/income/*)
            [string() castable as xs:decimal], 0)"/>
Note that [string() castable as xs:decimal] removes all empty and invalid values from the total.
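To make the effect of the predicate concrete, here is a rough TypeScript model of the same total. It is only a sketch: the regular expression approximates the lexical form that castable as xs:decimal accepts (sign, digits, optional decimal point, no exponent), the function names are made up for this illustration, and JavaScript numbers are floating point rather than true decimals.

```typescript
// Approximates `string() castable as xs:decimal`; surrounding whitespace is ignored.
const DECIMAL_LEXICAL = /^[+-]?(\d+(\.\d*)?|\.\d+)$/;

function castableAsDecimal(cell: string): boolean {
  return DECIMAL_LEXICAL.test(cell.trim());
}

// Mirrors sum(cells[castable], 0): drop empty and invalid cells, then add up the rest.
// The initial value 0 plays the role of the second argument of sum().
function sumOfValidCells(cells: string[]): number {
  return cells.filter(castableAsDecimal).reduce((total, cell) => total + Number(cell), 0);
}

const incomeCells = ["100", "", "2L0", "50.5"]; // q1 to q4, with one empty and one mistyped cell
const incomeTotal = sumOfValidCells(incomeCells); // 150.5
```

The empty q2 and the mistyped q3 are skipped, so the total still shows a number while the invalid cell gets its own alert.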

The quarterly net income calculations can be done using four distinct formulas. But if we use the element name consistently in the income, expense and calculation areas we can write just a single bind rule to make all four calculations.
<xf:bind ref="instance('calculations')/net-income/*"
    calculate="
    for $quarter in name()
    return
       instance('save-data')/income/*
         [name() = $quarter and string() castable as xs:decimal] -
       instance('save-data')/expenses/*
         [name() = $quarter and string() castable as xs:decimal]
    "/>
In this bind rule the calculation for each quarter is done by selecting all the net-income calculation sub-elements (q1, q2, q3 and q4). The rule then subtracts the expense for that quarter from the income to get the net-income for that quarter. It does this by adding the predicate to return only the current quarter from all the elements returned by the wildcard expression. As long as the three data structures have the same quarters a single bind rule does all the work of four distinct formulas!
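One consequence worth spelling out: when either the income or the expense for a quarter is empty or invalid, its predicate selects nothing, and, assuming XPath 2.0's rule that arithmetic on an empty sequence yields an empty sequence, the net income for that quarter comes out empty rather than partial. The TypeScript sketch below models this with null standing in for the empty sequence; the function names and sample figures are hypothetical.

```typescript
// Approximation of `string() castable as xs:decimal` (sign, digits, optional point, no exponent).
const DECIMAL_PATTERN = /^[+-]?(\d+(\.\d*)?|\.\d+)$/;

// Returns the cell as a number, or null to play the role of XPath's empty sequence.
function decimalOrEmpty(cell: string | undefined): number | null {
  const text = (cell || "").trim();
  return DECIMAL_PATTERN.test(text) ? Number(text) : null;
}

type QuarterCells = { [quarter: string]: string };

// One rule for all quarters: look up income and expenses by the same element name.
function netIncomeByQuarter(income: QuarterCells, expenses: QuarterCells, quarters: string[]) {
  const net: { [quarter: string]: number | null } = {};
  for (const quarter of quarters) {
    const earned = decimalOrEmpty(income[quarter]);
    const spent = decimalOrEmpty(expenses[quarter]);
    // Empty operand in, empty result out, so the net cell stays blank.
    net[quarter] = earned === null || spent === null ? null : earned - spent;
  }
  return net;
}

const netByQuarter = netIncomeByQuarter(
  { q1: "1000", q2: "", q3: "800", q4: "900" },
  { q1: "400", q2: "300", q3: "8L0", q4: "1000" },
  ["q1", "q2", "q3", "q4"]
);
// q1: 600, q2: null (income missing), q3: null (expense invalid), q4: -100
```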

What is beautiful about this example is that you don’t have to write a large amount of JavaScript code to detect when one of the cells changes. All the dependency calculations are created from the binding rules for you. This is a good example of how the built-in XForms dependency graph keeps your XForms code short and easy to maintain.

Note that this example uses the custom Orbeon currency control. We decided that since the users know that all the data is currency, we did not need to show the `$` prefix. You can get a similar behavior by using the xf:input control with both the format and unformat attributes.
<xf:input ref="myNumber"
  xxf:format="format-number(.[string() castable as xs:decimal], '$#,###')"
  xxf:unformat="replace(replace(., '\$', ''), ',', '')"/>
What you can see is that with a little work, we can make forms work much like a spreadsheet. But unlike a typical spreadsheet, we can also use XForms binding rules to get field-by-field data checks and warnings right when you tab out of each cell. This instant feedback makes it easier for users to get their data right before they save it.

A full listing of the form is available here.

Formulas for Summing Done Right

References

W3C Specification on XForms Custom Data Types