Durus®

Durus is a persistent object system for applications written in the Python programming language.

Durus was written by the MEMS Exchange software development team at the Corporation for National Research Initiatives (CNRI). Durus is designed to be the storage component for Python-powered web sites, and it provides the features that we need for this purpose, and no more.

Overview

Durus offers an easy way to use and maintain a consistent collection of object instances used by one or more processes. Access to persistent instances, and changes to them, are managed through a cached Connection instance, which includes commit() and abort() methods so that changes are transactional.

Quick Demo

Run durus -s in one window. This starts a Durus storage server using a temporary file and listening for clients on localhost port 2972. Run durus -c in another window. This connects to the storage server on port 2972 on localhost. When you start, you have access to only one dictionary-like persistent object, "root". If you make changes to items of root and run connection.commit(), the changes are written to the (in this case, temporary) file. If you make changes to items of root and then run connection.abort(), the items revert to the values they had at the last commit.

Run another durus -c in a third window, and you can see how committed changes to root in the first client are available in the second client when it starts. Subsequent changes committed in any client are visible in any other client that synchronizes by calling either connection.abort() or connection.commit().

You can stop the server with Control-C or by running durus -s --stop. You can stop the clients with Control-D or by your usual method of ending a Python interactive session.

This demonstrates simple transactional behavior, but not persistence, since the temporary file is removed as soon as the durus server is stopped.

To see how persistence works, follow the same procedure again, except add --file test.durus to the command that starts the server. Make changes to items of root, run connection.commit(), and then stop the server with durus -s --stop. The changes to root are stored in test.durus, so you will see them again when you restart the server with the --file test.durus option.

Finally, note that you can run durus -c --file test.durus (after stopping the durus server) to use the file storage directly and exclusively. Everything works the same way as before, except that no server is involved.

Both the durus -s and durus -c commands accept --help command line options that explain more about their usage.

Using Durus in a Program

To use Durus, a Python program needs to make a Storage instance and a Connection instance. For the Storage instance, you have two choices: FileStorage or ClientStorage. If your program is to be one of several processes accessing a shared collection of objects, then you want ClientStorage. If your program has no competition, then choose FileStorage. There is only one Connection class, and the constructor takes a Storage instance as an argument.

Example using FileStorage to open a Connection to a file::

    from durus.file_storage import FileStorage
    from durus.connection import Connection
    connection = Connection(FileStorage("test.durus"))

When the argument to the connection constructor is a string, it treats it as the name of a FileStorage file, so the shortcut way to open a Connection to a file is this::

    from durus.connection import Connection
    connection = Connection("test.durus")

Example using ClientStorage to open a Connection to a Durus server::

    from durus.client_storage import ClientStorage
    from durus.connection import Connection
    connection = Connection(ClientStorage())

Note that the ClientStorage constructor supports the address keyword that you can use to specify the address to use. The value must be either a (host, port) tuple or a string giving a path to use for a unix domain socket. If you provide the address you should be sure to start the storage server the same way. The durus command line tool also supports options to specify the address.

The Connection instance has a get_root() method that you can use to obtain the root object.

In your program, you can make changes to the root object attributes, and call connection.commit() or connection.abort() to lock in or revert changes made since the last commit. The root object is actually an instance of durus.persistent_dict.PersistentDict, which means that it can be used like a regular dict, except that changes will be managed by the Connection. There is a similar class, durus.persistent_list.PersistentList that provides list-like behavior, except managed by the Connection.

PersistentList and PersistentDict both inherit from durus.persistent.Persistent, and this is the key to making your own classes participate in the Durus persistence system. Just add Persistent to class A's list of bases, and your instances will know how to manage changes to their attributes through a Connection. To actually store an instance x of A in the storage, though, you need to commit a reference to x in some object that is already stored in the database. The root object is always there, for example, so you can do something like this::

    # Assume mymodule defines A as a subclass of Persistent.
    from mymodule import A
    x = A()
    root = connection.get_root()  # connection set as shown above.
    root["sample"] = x            # root is dict-like
    connection.commit()           # Now x is stored.

Subsequent changes to x, or to new A instances put on attributes of x, and so on, will all be managed by the Connection just as for the root object. This management of the Persistent instance continues as long as the instance is in the storage. Sometimes, though, we wish to remove "garbage" Persistent instances from the storage so that the file can be smaller. This garbage collection can be done manually by calling the Connection's pack() method. If you are using a storage server to share a Storage, you can use the gcinterval argument to tell it to take care of garbage collection automatically.

Non-Persistent Containers

When you change an attribute of a Persistent instance, the fact that the instance has been changed is noted with the Connection, so that the Connection knows what instances need to be stored on the next commit(). The same change-tracking occurs automatically when you make dict-like changes to PersistentDict instances or list-like changes to PersistentList instances. If, however, you make changes to a non-persistent container, even if it is the value of an attribute of a Persistent instance, the changes are not automatically noted with the Connection. To make sure that your changes do get saved, you must call the _p_note_change() method of the Persistent instance that refers to the changed non-persistent container. You can see an example of this by looking at the source code of PersistentDict and PersistentList, both of which maintain a non-persistent container on a "data" attribute, shadow the methods of the underlying container, and add calls to self._p_note_change() in every method that makes changes.
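The difference between a tracked and an untracked change can be illustrated with a toy model. The sketch below does not use Durus at all: the Tracked class and its changed flag are hypothetical stand-ins for a Persistent instance and the Connection's bookkeeping, and only the method name _p_note_change() is borrowed from Durus.

```python
class Tracked:
    """Toy stand-in for a Persistent instance (not the Durus implementation)."""

    def __init__(self):
        # Bypass our own __setattr__ so construction does not count as a change.
        object.__setattr__(self, "changed", False)

    def __setattr__(self, name, value):
        # Attribute assignment is the only kind of change this object can observe.
        object.__setattr__(self, name, value)
        object.__setattr__(self, "changed", True)

    def _p_note_change(self):
        # Explicit notification, for changes that bypass __setattr__.
        object.__setattr__(self, "changed", True)


def demo():
    obj = Tracked()
    obj.tags = ["a"]                           # attribute assignment: noticed
    after_assignment = obj.changed

    object.__setattr__(obj, "changed", False)  # pretend a commit happened
    obj.tags.append("b")                       # in-place change to a plain list: not noticed
    after_append = obj.changed

    obj._p_note_change()                       # say so explicitly
    after_note = obj.changed
    return after_assignment, after_append, after_note
```

Here demo() returns (True, False, True): the append() call never passes through the owning object, which is why an explicit _p_note_change() call is needed after changing a non-persistent container in place.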

Frequently Asked Questions about Durus

How do I back up a database? Do I need to shut down the storage server first?

It is safe to just copy the file. Data is only appended to the file and the FileStorage class can detect if the last transaction has been truncated. There is no need to shut down the storage server first. Because writes are isolated, a utility such as rsync can be used for efficient backups. If you are setting up a backup system for Durus database files, consider rsync (version 3.0.2 or higher) with the --append and/or --append-verify flags.

I made a change in one client but it is not visible in another client.

You need to call commit() or abort() in the second client in order to see the new data. This behavior is necessary to provide data consistency.

My client has received a ConflictError or ReadConflictError. What must it do to recover?

The client must call abort() and restart the transaction. Note that it must not keep partial results in local variables, for example, since the data it was using before the conflict was out of date.
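The recovery rule above can be sketched as a small retry helper. This is an illustration under assumptions, not part of Durus: run_transaction and its parameters are hypothetical names, the connection is any object that provides get_root(), commit() and abort(), and the exception classes to catch are passed in by the caller (with Durus these would be ConflictError and ReadConflictError, which we assume are importable from durus.error).

```python
def run_transaction(connection, work, conflict_errors, max_attempts=3):
    """Call work(root) and commit, restarting from scratch after a conflict.

    connection: any object with get_root(), commit() and abort()
    work: callable that takes the root object and returns a result
    conflict_errors: exception class (or tuple of classes) that signals a conflict
    """
    for attempt in range(max_attempts):
        try:
            # Start again from the root on every attempt, so that nothing
            # computed during a failed attempt is reused.
            result = work(connection.get_root())
            connection.commit()
            return result
        except conflict_errors:
            # Discard local changes and pick up the other client's commits.
            connection.abort()
    raise RuntimeError("gave up after %d conflicting attempts" % max_attempts)
```

Keeping the whole unit of work inside one callable is the design point: the callable receives the root afresh on each attempt, so there is no place for stale partial results to survive in local variables.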

When does a write conflict occur?

If, in the period since the last time client A called commit() or abort(), client A has accessed an attribute of a PersistentObject instance X and some other client has also committed changes to X, then a ConflictError exception will be raised when client A tries to commit any changes. This prevents client A from committing changes that are based on out-of-date data.
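The rule can be made concrete with a toy model. The sketch below is not Durus code and does not claim to mirror how Durus detects conflicts internally: it uses a simple per-object version number, a common way to implement this kind of check, and every name in it (ToyStore, ToyClient, ToyConflictError) is hypothetical.

```python
class ToyConflictError(Exception):
    """Hypothetical stand-in for a conflict error."""


class ToyStore:
    """Shared store that maps a name to a (version, value) pair."""

    def __init__(self):
        self.records = {}

    def load(self, name):
        return self.records.get(name, (0, None))


class ToyClient:
    def __init__(self, store):
        self.store = store
        self.seen = {}      # name -> version observed during this transaction
        self.pending = {}   # name -> uncommitted new value

    def read(self, name):
        if name in self.pending:
            return self.pending[name]
        version, value = self.store.load(name)
        self.seen.setdefault(name, version)
        return value

    def write(self, name, value):
        version, _ = self.store.load(name)
        self.seen.setdefault(name, version)
        self.pending[name] = value

    def abort(self):
        # Forget everything observed or changed in this transaction.
        self.seen.clear()
        self.pending.clear()

    def commit(self):
        if self.pending:
            # Any accessed object that has a newer committed version is stale.
            stale = sorted(name for name, version in self.seen.items()
                           if self.store.load(name)[0] != version)
            if stale:
                raise ToyConflictError(stale)
            for name, value in self.pending.items():
                self.store.records[name] = (self.seen[name] + 1, value)
        self.abort()  # begin a fresh transaction


def demo():
    store = ToyStore()
    a, b = ToyClient(store), ToyClient(store)
    a.read("X")          # client A accesses X
    b.write("X", 1)
    b.commit()           # another client commits a change to X
    a.write("Y", 2)      # A's own change is to a different object
    try:
        a.commit()
    except ToyConflictError as exc:
        a.abort()
        return exc.args[0]
    return []
```

In this model demo() returns ["X"]: client A only read X and changed Y, yet its commit is refused because what it read may have influenced what it wrote.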

When does a read conflict occur?

The exact conditions under which a ReadConflictError is raised are complicated so the source code is probably the best reference. In essence, a read conflict occurs when a client tries to load data from the storage server that is inconsistent with other data that it has accessed since the last commit() or abort().

For example, suppose a client examines object A, and then a second client modifies object A and object B and commits. If the first client now tries to load object B, it will get a read conflict error, because the state of object A that it has already accessed is not consistent with the new state of B.

I've made changes to my object model. How do I update an existing database?

We have found that a separate database update script works well.

I need to find all objects of a certain class in order to update their attributes.

If you can't easily find them by following the object graph then you can use the gen_every_instance() function from the durus.connection module. Note that this is expensive since it iterates over every record in the database. We use it only for making data model changes.
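As a sketch of what the body of such an update script might look like, the helper below gives a default value to every instance that lacks a newly introduced attribute. The function name and its parameters are hypothetical; it accepts any iterable of objects, so in a real script the iterable would typically be the generator returned by gen_every_instance(connection, SomeClass), with a connection.commit() after the call.

```python
def add_missing_attribute(instances, name, default):
    """Give every instance that lacks attribute `name` the value `default`.

    instances: any iterable of objects
    Returns the number of instances that were updated.
    """
    updated = 0
    for obj in instances:
        if not hasattr(obj, name):
            # Plain attribute assignment; on a Persistent instance this is
            # the kind of change that is noted automatically.
            setattr(obj, name, default)
            updated += 1
    return updated
```

Because instances that already have the attribute are skipped, running the script a second time changes nothing, which makes it safe to rerun after an interrupted update.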

I want to rename a class. How do I update the database?

Here is an example script::

    import new_module
    import old_module

    # Put the new class where the old class was.
    old_module.OldClass = new_module.NewClass

    # Open the connection.
    from durus.connection import Connection
    from durus.connection import gen_every_instance, touch_every_reference
    connection = Connection("myfile.durus")

    # Make sure that every instance of the class is marked as changed.
    for obj in gen_every_instance(connection, old_module.OldClass):
        obj._p_note_change()

    # Make sure that every referring instance is marked as changed.
    touch_every_reference(connection, 'OldClass')

    connection.commit()

If your database structure is clear enough, you should iterate over every instance using something more direct than gen_every_instance(). If you have a direct way to iterate over the instances that contain references to instances of OldClass, do that instead of touch_every_reference() and call _p_note_change() on each of them.

I want to rename a module that includes a PersistentObject subclass. How do I update the database?

This is basically the same as changing a class name. A useful trick is to assign to ``sys.modules`` directly. For example, in your update DB script you could do something like::

    import sys
    import newmodule
    sys.modules["oldmodule"] = newmodule
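Why the alias works can be shown with the standard pickle module alone, since a pickle records a class by its module name and class name. The sketch below does not touch Durus; the module names demo_oldmodule and demo_newmodule and the class Thing are invented for the demonstration, and the modules are built in memory so that the example is self-contained.

```python
import pickle
import sys
import types


def make_module(name):
    """Create and register an in-memory module that defines a class Thing."""
    module = types.ModuleType(name)
    module.Thing = type("Thing", (object,), {"__module__": name})
    sys.modules[name] = module
    return module


def demo():
    # 1. Pickle an instance while its class lives in "demo_oldmodule".
    old = make_module("demo_oldmodule")
    instance = old.Thing()
    instance.value = 42
    data = pickle.dumps(instance)

    # 2. The module is renamed: the old name is no longer registered.
    del sys.modules["demo_oldmodule"]
    new = make_module("demo_newmodule")

    # 3. Alias the old name to the new module, then load the old pickle.
    sys.modules["demo_oldmodule"] = new
    loaded = pickle.loads(data)
    return type(loaded) is new.Thing, loaded.value
```

demo() returns (True, 42): the old pickle still names demo_oldmodule, but the lookup now lands in the new module. The alias only helps while it is in place, which is why the update script also needs to mark the affected instances as changed and commit, so that the stored records are rewritten under the new module name.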

Is Durus thread-safe?

Durus Connections using ClientStorage should be safe to use in multiple threads in the same process as long as no individual Connection or ClientStorage instance or any of the PersistentObject instances obtained from that Connection are accessed from more than one thread. Our applications use multiple processes when they need multiple threads of control.
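If you do use threads, one way to respect this rule is to give each thread its own connection. The helper below is a minimal sketch and not part of Durus: PerThreadConnections is a hypothetical name, and the factory is any zero-argument callable that returns a new connection, for example a function that returns Connection(ClientStorage()).

```python
import threading


class PerThreadConnections:
    """Hand each thread its own connection, created on first use."""

    def __init__(self, factory):
        self._factory = factory          # zero-argument callable returning a new connection
        self._local = threading.local()  # separate attribute namespace per thread

    def get(self):
        connection = getattr(self._local, "connection", None)
        if connection is None:
            connection = self._factory()
            self._local.connection = connection
        return connection
```

The helper only keeps the connections apart. It remains the caller's job not to pass persistent instances obtained from one thread's connection to another thread.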

Can a Durus application use multiple Durus databases?

Yes, and this may be a good idea if you have data sets that are independent. The thing you can't do is put a reference to an object from Connection A into an object from Connection B.

Does Durus support "undo"?

No. If you need this capability, you should handle it directly in your application-level code.

History

Durus was inspired by ZODB and ZEO, the database and client/server code produced by our friends at Zope, Inc. Like ZODB, Durus implements a pickle storage system and manages instances of a Persistent class in a transactional way. Like ZEO, Durus supports access to a shared storage from connections in multiple processes.

The code of Durus shares some small fragments and some names with code in ZODB/ZEO, but we do not consider Durus to be a modification of ZODB/ZEO.

Durus was developed by the MEMS Exchange software team starting in 2003:
Anton Benard
David Binger
Roger Masse
Neil Schemenauer

Neil initiated work on Durus and made it fast.

Other Contributors:
Andrew Kuchling
Patrick K. O'Brien
Jesús Cea Avión
John Belmonte
Andrew Bettison
Sergio Álvarez Muñoz
Matthew Scott
