Qpy

Qpy provides a convenient mechanism for generating safely-quoted XML text from Python code. It does this by implementing a quoted-no-more string data type and a slight modification of the Python compiler.

Quoting

XML reserves 5 characters ("<", ">", "&", quote and apostrophe) so that they can be used as markup delimiters. When a document needs to use these characters for some other purpose, they must be escaped, that is, replaced by an equivalent entity or character reference. This package defines an xml_quote() function that, for a string argument, returns a string with these 5 characters replaced by their equivalents: for example, "<" becomes "&lt;".
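As a rough illustration of this kind of escaping, Python's standard library offers html.escape(), which replaces the same five characters. It is used here only as a stand-in: it is not Qpy's xml_quote(), it returns a plain string, and the reference it picks for a given character (the apostrophe, for instance) may differ from the one Qpy emits.

```python
import html

# html.escape() is a standard-library stand-in for the escaping step only;
# it is not part of Qpy.
sample = 'a < b & "c" > \'d\''
escaped = html.escape(sample, quote=True)

print(sample)
print(escaped)

# After escaping, the only reserved character left is "&", and only as the
# first character of an entity or character reference.
assert "<" not in escaped and ">" not in escaped
assert '"' not in escaped and "'" not in escaped
```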

When assembling an XML (or similar markup such as HTML) document, it is important to remember to quote everything that should be quoted, such as text that comes from a database or some (untrusted) outside source. In the case of web pages, underquoting is dangerous, as it leaves the door open for cross-site scripting and other attacks.

It would be nice if you could assemble your document as a string and then call xml_quote() on it at the end, just to make sure that everything was quoted, but this generally results in over-quoting, where you lose the intended markup structure. For web pages, over-quoting produces a result that is ugly, but much safer than the underquoted alternative.
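The difference between underquoting, over-quoting and correct quoting can be seen in a small sketch. It uses the standard library's html.escape() as a stand-in for xml_quote(), and the input string is a made-up example of untrusted text.

```python
import html

user_text = "<script>alert(1)</script>"  # hypothetical untrusted input

# Underquoting: the untrusted text is pasted in raw, so its markup is live.
underquoted = "<p>" + user_text + "</p>"

# Quoting the whole assembled document at the end is safe, but the intended
# <p> markup gets escaped too (over-quoting).
overquoted = html.escape(underquoted)

# Quoting each untrusted piece as it is inserted keeps the markup intact.
correct = "<p>" + html.escape(user_text) + "</p>"

for label, doc in [
    ("underquoted", underquoted),
    ("overquoted", overquoted),
    ("correct", correct),
]:
    print(label + ":", doc)
```

The over-quoted result displays the tags as visible text, which is ugly but harmless; the underquoted one would hand the script to the browser.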

Programs that produce XML documents must keep track of what has already been quoted and what has not, and mistakes are common. Our objective is to make quoting errors rare, especially underquoting errors.

The Quoted-No-More Class: xml

Our xml_quote() function always returns an xml instance. The class named "xml" is a subclass of Python's unicode string class. An instance of xml is a string that is known to need no more XML quoting. When the xml_quote() function gets an xml instance as an argument, it just returns the instance immediately, without any changes. When the xml_quote() function gets None as an argument, it always returns an empty xml instance. All other arguments to xml_quote() are converted to unicode strings and then the reserved characters are escaped to produce the resulting xml instance.
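The three cases above can be sketched in a few lines. SketchXml and sketch_quote are hypothetical stand-ins written for this illustration, not Qpy's code; the sketch uses Python 3's str as the string type and html.escape() for the escaping step, so the exact references produced may differ from Qpy's.

```python
import html


class SketchXml(str):
    """Hypothetical stand-in for the xml class: text needing no more quoting."""


def sketch_quote(value):
    """Mimic the three cases described for xml_quote()."""
    if isinstance(value, SketchXml):
        return value              # already quoted: returned unchanged
    if value is None:
        return SketchXml("")      # None becomes an empty instance
    # Anything else: convert to text, then escape the reserved characters.
    return SketchXml(html.escape(str(value), quote=True))


already = SketchXml("<b>bold</b>")
print(sketch_quote(already))      # unchanged
print(repr(sketch_quote(None)))   # empty
print(sketch_quote("<b>"))        # escaped
print(sketch_quote(42))           # converted to text
```

Because the result is always a SketchXml, quoting a value twice is harmless: the second call sees an instance and returns it as-is.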

The xml class defines some functions that make it easy to build quoted documents.

When an xml instance is combined with another object using the + operator, the result is the xml instance formed by concatenating the quoted operands. The value of the expression xml('<x>') + '<' is equal to the value of xml('<x>&lt;'). When an xml instance is used as a format string with the % operator, the (non-number) arguments to the format string are quoted as they are used.
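One way such operators could be written is sketched below. SketchXml is a hypothetical stand-in, not the real xml class; in particular, passing numbers through unquoted so that formats like %d keep working is an assumption about how "non-number" arguments are treated.

```python
import html
from numbers import Number


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class (not the real code)."""

    @staticmethod
    def quote(value):
        if isinstance(value, SketchXml):
            return value
        if value is None:
            return SketchXml("")
        return SketchXml(html.escape(str(value), quote=True))

    def __add__(self, other):
        # The right operand is quoted before concatenation.
        return SketchXml(str(self) + str(SketchXml.quote(other)))

    def __radd__(self, other):
        # Covers plain_string + SketchXml, where the left operand is unquoted.
        return SketchXml(str(SketchXml.quote(other)) + str(self))

    def __mod__(self, args):
        # Quote non-number arguments; numbers pass through so %d still works.
        def prepare(arg):
            return arg if isinstance(arg, Number) else SketchXml.quote(arg)

        if isinstance(args, tuple):
            prepared = tuple(prepare(a) for a in args)
        elif isinstance(args, dict):
            prepared = {key: prepare(val) for key, val in args.items()}
        else:
            prepared = prepare(args)
        return SketchXml(str(self) % prepared)


print(SketchXml("<x>") + "<")
print(SketchXml("<p>%s has %d items</p>") % ("<Bob>", 3))
```

In both cases the markup supplied as a SketchXml survives, while the plain strings mixed into it are escaped.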

The xml class includes a join() method that quotes the items in the sequence before joining them. The common case of using an empty xml instance to join a sequence is implemented in the join_xml() function. The join_str() function acts the same way, except that it does not escape any characters.
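The joining behaviour can be sketched the same way. All names here (SketchXml, sketch_quote, sketch_join_xml, sketch_join_str) are hypothetical stand-ins for the corresponding Qpy names, and the sketch assumes that the non-escaping join simply converts each item to text.

```python
import html


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class."""

    def join(self, items):
        # Quote every item, then join with this (already quoted) separator.
        quoted = [sketch_quote(item) for item in items]
        return SketchXml(str.join(self, quoted))


def sketch_quote(value):
    if isinstance(value, SketchXml):
        return value
    if value is None:
        return SketchXml("")
    return SketchXml(html.escape(str(value), quote=True))


def sketch_join_xml(items):
    # The common case: join with an empty separator, quoting each item.
    return SketchXml("").join(items)


def sketch_join_str(items):
    # Same joining, but nothing is escaped (assumed: items become text).
    return "".join(str(item) for item in items)


items = ["a<b", SketchXml("<br />")]
print(SketchXml(", ").join(items))
print(sketch_join_xml(items))
print(sketch_join_str(items))
```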

The Qpy Compiler

The Qpy compiler is a Python compiler with an added preprocessor that can best be understood as a source-code transformation. The transformation is limited to the definitions of certain functions we call "templates". An xml template is designated in qpy source code by :xml just after the function name in the function's definition. For example, this is an xml template:

    def f:xml(x):
        "<div>"
        x
        "</div>"

The Qpy preprocessor essentially replaces this by:

    from qpy import xml as _qpy_xml, join_xml as _qpy_join_xml
    def f(x):
        qpy_accumulation = []
        qpy_append = qpy_accumulation.append
        qpy_append(_qpy_xml("<div>"))
        qpy_append(x)
        qpy_append(_qpy_xml("</div>"))
        return _qpy_join_xml(qpy_accumulation)

There are two main things going on here. One is that every string-literal in the body of the function is wrapped by the xml constructor. The assumption is that a literal string, provided by the programmer, does not need any more quoting. The other part of the conversion is that expression values are accumulated on a local list, and the default return value is the xml instance formed by concatenating these values, after quoting them.

The values returned by f are xml instances, and here are some samples:

    f(None)           ⇒ "<div></div>"              None becomes "".
    f("<hr />")       ⇒ "<div>&lt;hr /&gt;</div>"  Quoting happens.
    f(1)              ⇒ "<div>1</div>"             Converted.
    f(xml("<hr />"))  ⇒ "<div><hr /></div>"        Already quoted.

The nice thing about this is that the expressions appearing in a template, possibly including values provided from outside sources, will always be quoted unless they are already instances of the xml class. If the programmer makes a mistake with respect to quoting, it will very likely appear as over-quoting instead of lurking as a security problem.
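The sample results above can be reproduced without Qpy installed by writing out the expanded form of f by hand against small stand-ins. SketchXml, sketch_quote and sketch_join_xml are hypothetical substitutes for xml, xml_quote and join_xml, with html.escape() assumed for the escaping step.

```python
import html


class SketchXml(str):
    """Hypothetical stand-in for Qpy's xml class."""


def sketch_quote(value):
    if isinstance(value, SketchXml):
        return value
    if value is None:
        return SketchXml("")
    return SketchXml(html.escape(str(value), quote=True))


def sketch_join_xml(items):
    # Quote each accumulated value, then concatenate.
    return SketchXml("".join(sketch_quote(item) for item in items))


def f(x):
    # Hand-written equivalent of the preprocessor output for the template f.
    accumulation = []
    append = accumulation.append
    append(SketchXml("<div>"))   # literal in the template: trusted markup
    append(x)                    # expression value: quoted when joined
    append(SketchXml("</div>"))
    return sketch_join_xml(accumulation)


for arg in (None, "<hr />", 1, SketchXml("<hr />")):
    print(repr(arg), "=>", f(arg))
```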

Templates can't have normal Python docstrings after the arguments, since a string literal there would be treated like any other string literal in the template body: we just use comments.

A template may also be designated by :str, instead of :xml, appearing just after the function name. The difference is that a str template accumulates the values of expression statements and returns the join_str() of the list, and there is no XML-quoting.
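By analogy with the xml case, a str template can be pictured as the same accumulate-and-join pattern with the quoting removed. The sketch below is an assumption about the general shape, not the preprocessor's actual output; g and sketch_join_str are hypothetical names, and the sketch assumes each accumulated value is simply converted to text.

```python
def sketch_join_str(items):
    # Stand-in for join_str(): concatenate without escaping anything.
    return "".join(str(item) for item in items)


def g(x):
    # Hand-written picture of what a template "def g:str(x)" with the body
    # "<div>", x, "</div>" might expand to.
    accumulation = []
    append = accumulation.append
    append("<div>")
    append(x)
    append("</div>")
    return sketch_join_str(accumulation)


print(g("<hr />"))
print(g(1))
```

Since nothing is escaped, a str template suits output that is not markup; using one for XML gives up the protection against underquoting.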

Templates can be nested arbitrarily along with other functions. A template's code transformation does not apply inside ordinary functions that are defined inside the template body.

Using Qpy

Source code files that include templates should be named with a .qpy suffix and placed in a python package directory. The package __init__.py should contain the following lines to make sure that the compiled versions of the qpy modules are up-to-date:

    from qpy.compile import compile_qpy_files
    compile_qpy_files(__path__[0])

The qpcheck.py Utility

This package also includes qpcheck.py, a script that looks for unknown names and unused imports in directories containing python and qpy source code.

Example

An example package is included in the distribution. To run it, just import the qpy.example.example1 module. The example shows the layout of a package, including the required __init__.py and a .qpy module.

Content-in-code instead of code-in-content.

Most template systems are designed to embed program-like value-substitution and control flow into what would otherwise be static content. Qpy (like Quixote's PTL templates) uses the opposite pattern, embedding static content in what would otherwise be an ordinary program. This program-centric pattern is especially attractive when the content maintenance team is the same as the programming team.
