Photo by Brad Montgomery
New lines and indentation fall in the category of so-called whitespace. Whitespace consists of anything that looks “blank”, including actual space characters and line breaks.
<book>
    <details>
        <title/>
        <author/>
        <language/>
        <link/>
        <rating/>
        <publication-year/>
        <review/>
        <image filename="" mediatype="" size=""/>
    </details>
    <notes>
        <note/>
    </notes>
</book>
We have noticed that, especially for large forms, whitespace can take up a significant amount of memory in compiled form definitions. So for Orbeon Forms 4.4, we looked into how to improve that situation.
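To get a feel for how much of an indented document is pure whitespace, here is a minimal sketch in Python. It is not how Orbeon Forms measures memory: the `whitespace_only_chars` helper and the sample document are illustrative assumptions, and counting characters is only a rough proxy for memory held by a compiled form definition.

```python
import xml.etree.ElementTree as ET

# A small indented document, similar in spirit to the example above.
INDENTED = """<book>
    <details>
        <title/>
        <author/>
    </details>
    <notes>
        <note/>
    </notes>
</book>"""

# The same document with all indentation and new lines removed.
COMPACT = "<book><details><title/><author/></details><notes><note/></notes></book>"


def whitespace_only_chars(root):
    """Count characters in text nodes that contain nothing but whitespace."""
    total = 0
    for el in root.iter():
        # ElementTree keeps text before the first child in .text
        # and text following an element in .tail.
        for chunk in (el.text, el.tail):
            if chunk and not chunk.strip():
                total += len(chunk)
    return total


indented_count = whitespace_only_chars(ET.fromstring(INDENTED))
compact_count = whitespace_only_chars(ET.fromstring(COMPACT))
print(f"indented: {indented_count} whitespace-only characters out of {len(INDENTED)}")
print(f"compact:  {compact_count} whitespace-only characters out of {len(COMPACT)}")
```

Every level of nesting adds a new line plus indentation for each element, so the share of whitespace grows with the depth and size of the document.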
The trick is to remove whitespace only where it is not needed, because in some cases you do want to keep at least some of it. For example, you can remove most indentation and new lines, but consider this HTML fragment:
<p>This is a <b>great</b> moment.</p>
Some space after “a” and before “moment” needs to be there, although in most cases that space could be collapsed or normalized to a single space (there are exceptions). It would certainly be wrong to remove all the spaces.
A different example is the HTML <pre> tag, within which all the whitespace should be preserved, including indentation and new lines.
To address this, we implemented a configurable whitespace stripper for form definitions, with the intent of removing as much whitespace as we can while keeping it where it is needed.
The result is that for very large forms, it is possible to save over 20% of the memory used by the compiled form definition compared with Orbeon Forms 4.3.1. One very large form definition had over 15 MB of waste due to whitespace!
These savings are especially important if you have a large number of distinct form definitions.
We still hope to improve on this in the future. For example, it is probably possible to save memory within form data as well.