mike watkins dot ca : June 4 2007 Archives

June 04 2007

Harper: Won't meet Kyoto

In Europe today Stephen Harper said his government wouldn’t even try to meet Canada’s Kyoto obligations.

In other news, the sun also rose.

Both events, although unrelated, were entirely predictable.

Harper, as former leader of the Canadian Alliance, former policy chief of the Reform Party, and head of the right-wing National Citizens Coalition, has always fought against even recognizing climate change as a serious issue. If he had the power to do so, and he does to some degree, Harper would ignore climate change and Kyoto and would allow a business-as-usual approach to the file.

That’s essentially what he is doing.

Python Web Application Diary, Part Four

In part three of this series we determined that Entry and Journal classes will be required, and we started to look at how straight Python classes could become full database participants in a Durus object database.

Our object model so far is very simplistic; let's add a healthy dose of constraints, both to aid in testing and to prevent unintended (mis)use down the road.

I promise we'll get to webby things soon enough; we are just waiting for the chorus to come around again on the guitar. (All apologies to Arlo Guthrie.)

Specifications: Contracts for Busy Developers

The QP package has a wonderful module, qp.lib.spec, which deserves more attention, whether or not QP itself gets any.

spec provides an easy way to declare or specify what various object attributes should contain, without littering your code with miles of asserts and other tests. Examples will make things clear - let's take our too-simple Entry object and beef it up. First the original database-aware object:

from durus.persistent import PersistentObject

class Entry(PersistentObject):
    title = None
    text = None
    author = None
    created = None

Out of control: Clearly the object as it's described presents a problem - it's just an empty 'bag' with no constraints on what a user/developer might attempt to do with it. For example, whether intended or not, our dumb object allows for all sorts of questionable attribute assignments, as shown in this interactive session:

->> e = Entry()
->> e.title = 'my title'  # so far so good
->> e.title = None        # probably reasonable
->> e.title = 123.456     # not at all what we want
->> e.created = 3.1415
->> e.some_new_attribute = "foo"

The dynamic nature of Python is both blessing and curse at times; the above object requires lots of additional code to ensure that data it manages is what the developer intended, leading to significant code expansion via tests and assertions, reducing readability along the way.
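To make that cost concrete, here is a minimal sketch of the hand-written alternative in plain Python, with no Durus or QP involved. The class name CheckedEntry and its setters are hypothetical and exist only for illustration.

```python
import datetime


class CheckedEntry(object):
    """Hypothetical stand-in for Entry that validates by hand."""

    def __init__(self):
        self.title = None
        self.created = None

    def set_title(self, value):
        # Every attribute needs its own explicit check like this one.
        if value is not None and not isinstance(value, str):
            raise TypeError('title must be a string or None, got %r' % (value,))
        self.title = value

    def set_created(self, value):
        if not isinstance(value, datetime.datetime):
            raise TypeError('created must be a datetime, got %r' % (value,))
        self.created = value


e = CheckedEntry()
e.set_title('my title')   # accepted
e.set_title(None)         # accepted
try:
    e.set_title(123.456)  # rejected by the hand-written check
except TypeError as error:
    rejected = str(error)
```

Two attributes already cost a dozen lines of checking, and the rules are buried inside method bodies rather than stated up front.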

In control: There is another way. Let's use spec and apply specifications and some helper methods:

from durus.persistent import PersistentObject
from qp.lib.spec import datetime_with_tz, spec, string
from qp.lib.spec import add_getters_and_setters
# Assumes the application's User class has already been imported.

class Entry(PersistentObject):

    title_is = spec(
        (string, None),
        'A short description of the entry')
    text_is = spec(
        (string, None),
        'The full text of the entry')
    author_is = spec(
        User,
        'The individual responsible for the content of the entry')
    created_is = datetime_with_tz

add_getters_and_setters(Entry)

A spec can be a simple type assignment (see created_is above), or it can make use of the spec function, which allows for a certain amount of self-documentation that can be very helpful. Arguments supplied in a tuple imply the either specification; you can also spell it out: either(string, None).

Now let's run through the same interactive session as before:
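The "any one of these" reading of a tuple can be sketched in a few lines of plain Python. This is a simplified stand-in, not QP's implementation: the function name matches is hypothetical, and it only understands types, None, and tuples of those.

```python
def matches(value, specification):
    """Return True if value satisfies the (simplified) specification."""
    if specification is None:
        # None in a specification means "the value may be None".
        return value is None
    if isinstance(specification, tuple):
        # A tuple is satisfied when any one of its options is satisfied.
        return any(matches(value, option) for option in specification)
    if isinstance(specification, type):
        return isinstance(value, specification)
    raise ValueError('unsupported specification: %r' % (specification,))


title_spec = (str, None)
results = [matches(value, title_spec) for value in ('my title', None, 123.456)]
```

Here results records that the string and None are accepted while the float is not.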

->> e = Entry()
->> e.set_title('my title')
->> e.set_title(None)
->> e.set_title(123.456)
Traceback (most recent call last):
  File "<input>", line 2, in <module>
  File "/usr/local/lib/python2.5/site-packages/qp/lib/spec.py", line 725, in f
    require(value, getattr(klass, name + '_is'))
  File "/usr/local/lib/python2.5/site-packages/qp/lib/spec.py", line 171, in require
    raise TypeError(error)
TypeError:
  Expected: (string, None)
  A short description of the entry
  Got: 123.456

Aha! For the cost of a pair of get and set methods (no moaning please: we didn't even have to write the getter and setter ourselves, and they don't clutter the code), we've effectively constrained what type of data can be assigned to the title attribute of Entry, while preserving the easy-to-read nature of the original, simple code.
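For the curious, generating accessors from attributes named with an _is suffix might look roughly like the sketch below. It is modelled only on the require(value, getattr(klass, name + '_is')) line visible in the traceback above; the names check, add_accessors and Note are hypothetical, and the real module certainly does more.

```python
def check(value, specification):
    """Raise TypeError unless value is an instance of one of the allowed types."""
    allowed = specification if isinstance(specification, tuple) else (specification,)
    for option in allowed:
        if option is None:
            if value is None:
                return
        elif isinstance(value, option):
            return
    raise TypeError('Expected: %r Got: %r' % (specification, value))


def add_accessors(klass):
    """Attach get_x and set_x methods for every class attribute named x_is."""
    names = [key[:-3] for key in vars(klass) if key.endswith('_is')]
    for name in names:
        # name=name binds the current attribute name to each generated method.
        def getter(self, name=name):
            return getattr(self, name, None)

        def setter(self, value, name=name):
            check(value, getattr(klass, name + '_is'))
            setattr(self, name, value)

        setattr(klass, 'get_' + name, getter)
        setattr(klass, 'set_' + name, setter)


class Note(object):
    title_is = (str, None)
    size_is = int


add_accessors(Note)

n = Note()
n.set_title('my title')
n.set_size(3)
try:
    n.set_title(123.456)  # rejected, so the earlier title survives
except TypeError:
    title_after_bad_set = n.get_title()
```

The class body stays a plain list of declarations; all of the checking lives in one place.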

To demonstrate the utility of spec, I've pulled a number of examples from Dulcinea and my own code. As you can see, there is great flexibility provided:

date_is = datetime

approvals_is = spec(
    sequence(DulcineaUser, set),
    "The users who agree the issue is resolved.")

issues_is = spec(
    mapping({string:Issue}, PersistentDict),
    "Mapping of issue IDs to issues.")

id_is = spec(
    pattern('^[-A-Za-z0-9_@.]*$'),
    "unique among users here")

# some specs are referred to over and over again

datetime_without_tz = both(datetime, with_attribute(tzinfo=None))
datetime_with_tz = both(datetime, with_attribute(tzinfo=no(None)))
email_pattern = pattern(r"^.+@.+\..{2,4}$")
existing_user = both(User, with_attribute(id=no(None)))
hex_pattern = pattern('[a-fA-F0-9]*$')

# reuse them

email_is = email_pattern
id_is = spec(
    hex_pattern,
    'a lower case alphanumeric pattern')

# use the specifications in tests (more powerful and cleaner than
# testing for type and instance alone)

require(thing, either(list, tuple))
match(a_user, existing_user) # returns boolean

My experience is that you can create fairly complex specifications that remain very readable. Have a complex object that has to be "just so" before it is committed to a database in a transaction? Specify everything, check it for sanity with a one-line assertion, assert get_spec_problems(theobject_instance) == [], and you are done.
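The idea of checking a whole object in one pass and getting back a list of problems can be illustrated with a small stand-in. The helper collect_problems, the Draft class and the DRAFT_SPECIFICATIONS mapping are all hypothetical and handle only plain types; this is not how get_spec_problems is implemented.

```python
def collect_problems(instance, specifications):
    """Return one message per attribute whose value has the wrong type."""
    problems = []
    for name, allowed in sorted(specifications.items()):
        value = getattr(instance, name, None)
        if not isinstance(value, allowed):
            problems.append('%s: expected %r, got %r' % (name, allowed, value))
    return problems


class Draft(object):
    def __init__(self, title, text):
        self.title = title
        self.text = text


DRAFT_SPECIFICATIONS = {'title': str, 'text': str}

good = Draft('Part Four', 'Specifications and tests')
bad = Draft('Part Four', 42)
good_problems = collect_problems(good, DRAFT_SPECIFICATIONS)
bad_problems = collect_problems(bad, DRAFT_SPECIFICATIONS)
```

An empty list means the object is sane; anything else names the offending attributes.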

Testing, Testing, One Two Three

As you might imagine, it becomes easier to write unit tests when our objects are so highly specified. Sancho, a unit testing framework from the same development shop as QP, is designed for projects and teams who prefer to leave code in a working state all or most of the time.

Tests live in ./test, one level down from the objects being tested, and there is no __init__.py. A utility, urun.py, will execute one test supplied on the command line, or all tests in the test subdirectories of the current directory and below. Let's write one for Entry:

# /www/lib/parlez/test/utest_journal.py
from parlez.journal import Entry
from sancho.utest import UTest, raises
# Assumes the application's User class has also been imported.

entry_text = '''This is a blog entry.\n\n*We hope you like it*.'''

class EntryTest(UTest):

    def init_test(self):
        Entry()

    def entry_test(self):
        joe = User('joe')
        e = Entry()
        e.set_author(joe)
        # a string causes a TypeError, authors must be User instances
        raises(TypeError, e.set_author, 'Joe')
        assert e.get_author() == joe
        assert e.get_created() == e.get_stamp()
        e.set_text(entry_text)
        assert e.get_text() == entry_text
        e.set_stamp()
        assert e.get_created() != e.get_stamp()

class JournalTest(UTest):
    # we'll write this shortly, before Journal!
    pass

if __name__ == '__main__':
    EntryTest()
    JournalTest()

Run urun.py from the command line or from your editor, and this is the result:

# /www/lib/parlez/test% urun.py
./utest_journal.py: EntryTest:

The absence of tracebacks indicates successful test(s).
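The "this call must fail" style of check used in the test above can be approximated in plain Python. The helper raises_exception and the toy set_author function below are hypothetical stand-ins written for this illustration; they are not Sancho's code.

```python
def raises_exception(expected, function, *args, **kwargs):
    """Return True if calling function raises expected, otherwise False."""
    try:
        function(*args, **kwargs)
    except expected:
        return True
    return False


def set_author(author):
    # Hypothetical setter that only accepts a dictionary describing a user.
    if not isinstance(author, dict):
        raise TypeError('author must be a dict, got %r' % (author,))
    return author


string_rejected = raises_exception(TypeError, set_author, 'Joe')
dict_rejected = raises_exception(TypeError, set_author, {'id': 'joe'})
```

Passing the callable and its arguments separately lets the helper make the call itself, so it can catch the exception.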

In part five of this series we'll start to write the HTML (remember, this article series is apparently about web development with QP) and other user interfaces for our Entry object.

Python Web Application Diary, Part Three

In part two of this series we created a location and file system hierarchy for application library objects, UI and other components, and did the same for an actual application by using a script, mkqpapp.py, that automates those tasks.

Today let's start writing code -- we'll begin by defining basic objects for managing weblog or journal entries, and then we'll move on to showing how QP and Durus make defining and publishing your Python objects as easy as, well, py.

Basic Data Elements

As discussed in part one, this tutorial / web application project will result in a basic weblog or on-line journal application. Let's break down a weblog into its most basic data elements:

  1. A weblog is a collection of writing, generally presented in chronological fashion. A weblog could be considered a diary or journal, so let's use the term Journal to describe its function.
  2. A journal usually, but not always, represents the thoughts and opinions of a single author.
  3. Each item in a journal can be considered an article or a post - let's use a more generic term and call each item in the journal an Entry. Entries are typically short bits of text, so let's enter and store them as such. Each entry may have a title, and may include other information including dates relating to when the Entry was created, made available to readers, or changed -- but the principal information is the entry itself.

Our first classes

Turning to Python then, we could easily represent Entry as:

class Entry(object):
    title = None
    text = None
    created = None

We could then use the class:

import datetime

e = Entry()
e.title = 'Python Web Application Diary, Part Three'
e.text = 'Hello, Bruce, my name is Bruce.'
e.created = datetime.datetime.now()

That was pretty simple, no? Simplicity can be both a boon and a pain in the butt, and experienced developers will recognize at least two significant problems with our still too-simple Entry object:

  1. There is no way of easily persisting this data (saving it so that it's available later when we need it).
  2. The current design doesn't warn or otherwise prevent someone from intentionally or accidentally storing data we don't expect, such as:
e = Entry()
e.title = datetime.datetime.now()
e.created = 'Python Web Application Humour'

Kicking Entry Up a Notch - Persistence

Let's first look at the issue of persistence. Keeping our journal entries around for future display (or edits) could be done by:

  • Saving the data into individual files
  • Saving the data in a relational (SQL) database such as Postgres, Oracle, MySQL or MS SQL Server
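As a rough illustration of the first option, the sketch below writes one entry per file using the standard json module. The helper names save_entry and load_entry, the file naming scheme, and the choice of JSON are assumptions made for this example only.

```python
import json
import os
import tempfile


def save_entry(directory, name, title, text):
    """Write one entry to its own JSON file and return the file's path."""
    path = os.path.join(directory, name + '.json')
    with open(path, 'w') as handle:
        json.dump({'title': title, 'text': text}, handle)
    return path


def load_entry(path):
    """Read an entry back as a plain dictionary."""
    with open(path) as handle:
        return json.load(handle)


# A temporary directory keeps the example from touching real data.
directory = tempfile.mkdtemp()
path = save_entry(directory, 'part-three', 'Part Three', 'Hello, Bruce.')
loaded = load_entry(path)
```

Note that what comes back is a dictionary, not an Entry; turning it into one again is extra work the application has to do.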

To SQL or not to SQL, that is the question

Most often these days, a developer will turn by default to a SQL database to store and manage persistent data. While there is nothing wrong with this, introducing SQL into the mix does complicate matters somewhat. SQL types are not exactly analogous to Python data types, and accessing and updating data held in a SQL repository can often require lots of tedious SQL code, in addition to your Python code and objects.

To ease the friction or so-called impedance mismatch between Python and SQL, various Object Relational Mappers (ORMs) have appeared on the Python scene. While ORMs like SQLObject and SQLAlchemy do make using SQL-based data within a Python application somewhat more convenient, it's equally true that not all applications need the added complexity and there are other alternatives which can be useful to Python programmers regardless of complexity.
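A small sketch shows the kind of hand-written translation involved. It uses the standard library's sqlite3 module with an in-memory database; the table layout and the decision to store the timestamp as ISO-formatted text are assumptions chosen for this example.

```python
import datetime
import sqlite3


class Entry(object):
    title = None
    text = None
    created = None


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE entry (title TEXT, text TEXT, created TEXT)')

e = Entry()
e.title = 'Part Three'
e.text = 'Hello, Bruce, my name is Bruce.'
e.created = datetime.datetime(2007, 6, 4, 12, 0, 0)

# Python -> SQL: the datetime has to be converted to text by hand.
connection.execute(
    'INSERT INTO entry (title, text, created) VALUES (?, ?, ?)',
    (e.title, e.text, e.created.isoformat()))

# SQL -> Python: rows come back as tuples and must be unpacked by hand.
row = connection.execute('SELECT title, text, created FROM entry').fetchone()
restored = Entry()
restored.title, restored.text = row[0], row[1]
restored.created = datetime.datetime.strptime(row[2], '%Y-%m-%dT%H:%M:%S')
```

Three attributes already need a schema, two SQL statements and a date conversion in each direction.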

QP doesn't enforce a particular data persistence approach upon a developer, but it does make a choice for you which you can then consciously choose to ignore.

Durus, a Python Object Database

Rather than deal with the impedance mismatch between Python and SQL, QP by default uses Durus, a Python object database, to persist application data.

The truly neat thing about Durus, for Python users, is that you almost know how to use it now, sight unseen.

Let's take our dirt-simple Entry object and make it database aware. You'll recall the basic object looked like this:

class Entry(object):
    title = None
    text = None
    created = None

An Entry object able to participate in the Durus object database looks like this:

from durus.persistent import PersistentObject

class Entry(PersistentObject):
    title = None
    text = None
    created = None

As you can see, other than subclassing a special Durus type, PersistentObject, there are no outward differences. We can therefore make a straightforward claim: Durus is the database you already know.

We'll revisit Durus and object persistence in a future installment. In Part Four of this series we shall take our too-simple, but persistent, object and show how specifications can add useful constraints. We'll also take our first look at Sancho, a unit testing framework.