What Happened to Atop?

Several people have asked, "Why is Axiom better than Atop?", and "Why Atop in the first place, if you're abandoning it now?"

This is a working draft of an answer to these and related questions. It probably sounds pretty reactive and defensive, and it is, because it's an exercise in apologetics for previous decisions that may not have made sense to certain people. As you may guess, some of the questions are not so much frequently asked by the public as persistently asked by a small audience. A more positive spin on this whole thing can be found in AxiomProject.

Use Case

Q: What was the rationale behind developing a custom persistence system in the first place?

A: In the applications we want to create, Divmod has a lot of heterogeneous data. The most basic example of this is that messages from email, RSS, Jabber, MSN, AIM, IRC (etc) have custom data associated with them, yet conform to a single API which allows certain client programs (such as structured information extractors) to access them simply as "messages". In order to implement this in a database, we needed the notion of an object that implements an interface in the database.

These messages would be stored in large groups, in the case of the user's "inbox", perhaps hundreds of thousands or millions at a time.

We are also not totally sure about our schema, and needed a system with support for explicit but automatic upgrading, which can run interleaved with the actual application, so that we don't have to schedule downtime for schema upgrades.

This led to three requirements:

  • explicit support for database-efficient, very large collections (which, due to the event-driven nature of our server, needed to not just be efficient but also be loaded in small queries that would complete in ~constant time)
  • automatic upgrading of database state from one version of the code to the next
  • the ability to store a 'reference to any' and have a 'collection of any', rather than a relation to a particular other table.
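
The first requirement (small queries whose cost does not grow with the collection) can be sketched with Python's standard sqlite3 module. This is a generic illustration, not how Atop or Axiom actually page through data; the message table, its columns, and iter_batches are hypothetical names. Each query seeks via the primary key and touches at most batch_size rows, so its cost is governed roughly by the batch size rather than by the size of the collection.

```python
import sqlite3


def iter_batches(conn, batch_size=100):
    """Yield lists of (id, subject) rows, at most batch_size rows per query."""
    last_id = 0
    while True:
        # Resume just past the last row already seen instead of using OFFSET,
        # so later batches cost about the same as the first one.
        rows = conn.execute(
            "SELECT id, subject FROM message WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT)")
conn.executemany(
    "INSERT INTO message (subject) VALUES (?)",
    [("message %d" % i,) for i in range(250)],
)

batches = list(iter_batches(conn, batch_size=100))
```

In an event-driven server, each batch would be processed in its own short step so that no single query blocks the event loop for long.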

Curiosity

Q: What are the differences between Axiom and Atop, broadly speaking?

A: Atop was a very thick layer over a dictionary-based mapping, providing an API to load and store objects by ID, manage transactions, atomically manipulate stored files on disk, and perform queries on large collections known as "pools", which had "indexes". While pools were superficially like tables, only indexes could be queried, and objects typically contained large amounts of data beyond their indexed values. Atop had in-memory transactions to make sure that Python objects were always consistent with the transaction state of the database on disk. Atop provided an upgrade system whereby users could register upgrade functions to run over objects of a particular type when they were loaded from disk, automatically making them current with the state of the code manipulating them. Because Atop was based around a key/value mapping, any object could easily store a reference (or set of references) to any other, simply by making note of its ID.

Axiom is a very thin layer over an embedded SQL database, providing an API to load and store objects by ID (or by query), manage transactions, atomically manipulate stored files on disk, and perform queries over arbitrary attributes of declared "Item" classes, which correspond directly to SQL tables. Axiom also has in-memory transactions like Atop. Axiom also provides an upgrade system by appending a version number to every table name it creates for an object, but it is much more explicit; rather than taking arbitrary Python objects, it takes 'row' objects which, for a particular version, are required to conform to the exact schema for the table of that version; an upgrader can never be 'surprised' by data it did not expect to find.

Atop allowed users to store objects of any type using automatic persistence. Axiom places a STRONG emphasis on explicit schemas; users cannot accidentally assign any attribute in memory that will not be stored on disk; all attributes must be explicitly declared somewhere, and inheritance is forbidden, to provide the simplest possible mapping between tables and objects.
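
The "cannot accidentally assign" rule can be illustrated in plain Python. The sketch below is a stand-in, not Axiom's implementation: ExplicitItem, its schema tuple, and the Message class are hypothetical names, and nothing here is stored on disk. It only shows the behaviour described above, where assigning an undeclared attribute fails immediately instead of silently creating in-memory state.

```python
class ExplicitItem:
    """Hypothetical base class: only declared attributes may be assigned."""

    schema = ()  # names of the declared attributes

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Reject anything that was not declared in the class's schema.
        if name not in type(self).schema:
            raise AttributeError(
                "%r is not a declared attribute of %s"
                % (name, type(self).__name__)
            )
        object.__setattr__(self, name, value)


class Message(ExplicitItem):
    schema = ("sender", "subject")


msg = Message(sender="alice@example.com", subject="hello")

try:
    msg.scratch = "temporary data"  # undeclared, so it is rejected
    rejected = False
except AttributeError:
    rejected = True
```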

Axiom also allows users to store references to arbitrary Items (database objects) from other Items. To do this, Axiom creates 2 rows for each object; one row in a central 'objects' table, noting its type, and one row in a 'data' table, where user attributes are stored. However, queries which refer to attributes of a particular Item type do not need to load data from this central table, since they can determine the type of the requested object from context. It is only consulted when an object is loaded by ID.
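
The two-row layout can be sketched directly in SQL via Python's standard sqlite3 module. The table names (objects, email, note), the columns, and the helpers create and load_by_id are assumptions made for this example; Axiom's real table layout is not reproduced here. Loading by ID consults the central table to find the type, while a query against a known type goes straight to that type's data table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Central table: one row per stored object, recording only its type.
conn.execute("CREATE TABLE objects (oid INTEGER PRIMARY KEY, type_name TEXT NOT NULL)")
# One data table per type; note.about is a reference to an object of any type.
conn.execute("CREATE TABLE email (oid INTEGER PRIMARY KEY, subject TEXT)")
conn.execute("CREATE TABLE note (oid INTEGER PRIMARY KEY, body TEXT, about INTEGER)")

COLUMNS = {"email": ("subject",), "note": ("body", "about")}


def create(type_name, **attrs):
    """Insert the central row first, then the data row under the same ID."""
    oid = conn.execute(
        "INSERT INTO objects (type_name) VALUES (?)", (type_name,)
    ).lastrowid
    names = COLUMNS[type_name]
    sql = "INSERT INTO %s (oid, %s) VALUES (?, %s)" % (
        type_name, ", ".join(names), ", ".join("?" * len(names)))
    conn.execute(sql, (oid,) + tuple(attrs[n] for n in names))
    return oid


def load_by_id(oid):
    """Loading by ID is the only path that needs the central table."""
    type_name = conn.execute(
        "SELECT type_name FROM objects WHERE oid = ?", (oid,)
    ).fetchone()[0]
    data = conn.execute(
        "SELECT * FROM %s WHERE oid = ?" % type_name, (oid,)
    ).fetchone()
    return type_name, data


email_id = create("email", subject="hello")
note_id = create("note", body="follow up", about=email_id)

# Follow the untyped reference: the central table says what it points at.
note_type, note_row = load_by_id(note_id)
target_type, target_row = load_by_id(note_row[2])

# A typed query never touches the central table.
typed_query = conn.execute("SELECT subject FROM email").fetchall()
```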

Competition

Q: Why not start with SQLite at the beginning?

A: We looked at it. SQLite 1.0, which was around when we started on Atop, was just awful. SQLite 3.2, the version we are now using, is incredibly awesome by comparison. The operational issues (discussed below) were a significant motivator to go with BSDDB as an "embedded database" in the first place.

Q: Why not [Object database: (ZODB|Durus|Cog|...)] in the first place / now rather than Axiom?

A: None of these databases had "large, scalable collection" as a basic type. ZODB does have ZCatalog, but it is an add-on. Atop had pools, Axiom has queries (and axiom.sequence.List). This is an important distinction. Also, they are even more deeply tied to Pickle as a persistence format than Atop was - see below about Pickle.

Q: Why not [O/R Mapper: (SQLObject|PyDO|Django|...)]?

A: SQLObject and PyDO seem to have poor support for transactions. Django doesn't have 'reference to arbitrary type'. All are designed for multithreaded operation connected to a big database rather than embedded use for synchronous, fast, in-process queries. None have schema upgrading built in.

Q: Why not [RDBMS: (PostgreSQL|MySQL|Firebird)] as a backend for this rewrite?

A: We want our system to be able to run on the client as well as the server, to simplify synchronization and message delivery. Also, we plan to run all our applications in clusters, which means we do not have the luxury of customized DBA work on each one; we want to just have a single command to set up a new system, and that means an embedded database. (Those of you that asked about Firebird: although Firebird is in-process, it seems to be configured and maintained much more like a traditional RDBMS than like an embedded database; for example, every application using Firebird reads the same system-wide configuration file, and if it's not present, the library helpfully exits your process for you.)

That said, we *are* eventually interested in supporting other back-ends. We haven't had an opportunity to integrate any patches related to that because it's not a top priority, but we will eventually.

Q: The SQLite homepage says it's inappropriate for server-based systems with lots of data!

A: We use it in a rather unusual way; we don't keep all of our data in one big database. It might not be the most performant thing in the world but given the other constraints discussed above, it's still far better than a traditional RDBMS, and performance has not been negatively impacted.

Religion

Q: Why a relational database? Don't you guys hate relational databases?

A: Let me set the record straight here: relational == good. SQL == bad. It is unfortunate that relational databases universally have SQL as their interface.

We switched to a relational database because it has better support for ad-hoc queries and for multifaceted queries. Axiom's 'store.query' interface seems to do a decent job of preserving an object-based view of the world without sacrificing performance when generating SQL queries. Basically, to get the same functionality, we would have had to build a relational engine on top of BSDDB, which is what we were slowly doing. Think of SQLite as a shortcut.

What we had in Atop was a potentially good idea which worked for many simple cases but which required us to develop our own stack all the way up from a basic B-Tree. Missing elements of that stack meant missing features, some of which were rapidly becoming critical for our applications.

SQLite implements most of the hard stuff - paging, loading, indexes, querying - and the concerns which make SQL databases generally unsuited to navigational structures (mostly "really high query latency") don't apply in its case. If it had offered a custom API or a query language other than SQL, we would still have used it, as long as it implemented those same features.

Q: Don't relational databases perform poorly on hierarchical and sequential datasets, like threads, mailboxes and the like?

A: SQLite is a bit slower than BSDDB in the best case, yes, BUT, taking into account performance hogs like Atop's custom indexing scheme (implemented in about 3 days, never optimized) and our large objects stored with pickle, SQLite is a clear winner here. Hierarchical queries can be optimized by flattening to a top node for shallow hierarchies (and extremely deep hierarchies and navigational structures have different optimization problems entirely).
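
One way to read "flattening to a top node" is that every message stores a direct pointer to the first message of its thread, in addition to its immediate parent. The sketch below uses Python's standard sqlite3 module and is only an assumption about what such a layout could look like; the message table, its parent and root columns, and the post helper are hypothetical. With the root column indexed, a whole thread comes back from one query instead of a walk down the tree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message ("
    " id INTEGER PRIMARY KEY,"
    " parent INTEGER,"         # immediate parent; NULL for a thread's first message
    " root INTEGER NOT NULL,"  # flattened pointer to the thread's top node
    " subject TEXT)"
)
conn.execute("CREATE INDEX message_root ON message (root)")


def post(subject, parent=None):
    """Store a message, copying the thread root from its parent if it has one."""
    if parent is None:
        new_id = conn.execute(
            "INSERT INTO message (parent, root, subject) VALUES (NULL, 0, ?)",
            (subject,),
        ).lastrowid
        # A top-level message is its own root.
        conn.execute("UPDATE message SET root = id WHERE id = ?", (new_id,))
        return new_id
    root = conn.execute(
        "SELECT root FROM message WHERE id = ?", (parent,)
    ).fetchone()[0]
    return conn.execute(
        "INSERT INTO message (parent, root, subject) VALUES (?, ?, ?)",
        (parent, root, subject),
    ).lastrowid


top = post("plans")
reply = post("re: plans", parent=top)
nested = post("re: re: plans", parent=reply)
other = post("unrelated")

# One indexed query returns the whole thread, however the replies nest.
thread = [
    row[0]
    for row in conn.execute(
        "SELECT subject FROM message WHERE root = ? ORDER BY id", (top,)
    )
]
```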

Navigational structure can still be implemented with Axiom, using the 'reference to any' type described in several places on this page. It seems to be reasonably efficient. I might not use it as the basis for a frame-based ontology representation engine, but if you need FramerD or OpenCyc, you know where to find them :-).

Q: Isn't BSDDB awesome? Why'd you abandon it?

A: For what it is: yes. However, what it is, is a fairly special-purpose tool. Also, it is VERY hard to handle correctly. After literally years of experimentation, we have determined the documentation to be incorrect on a few very important points related to reliability. I think we managed a good deal better than Subversion did, but the SVN community's endemic problems with BSDDB are well-documented.

SQLite *may* be slightly less reliable if used perfectly correctly; however, there is no hard data to suggest this. What is certain is that it's much easier to manage and much harder to make mistakes related to recovery and failure.

Q: I used the word Pickle in a conversation, and (Glyph and/or JP) shotgunned a handful of Prozac and washed it down with whiskey. Is he / are they okay?

A: Pickle is baaaad news, kids. It's *great* if your tiny little application needs to save some data in a hurry. It generates some BIIIIG problems if you use it on a large scale.

  • Your data tends to calcify along with your application code. It's very difficult to find the links between them in either direction; you don't know what application code your data references, you don't know what data your application code is producing. The fact that it's easy to manage this on a framework level is deceptive - once you start saving real data the rules all change.
  • Explicit is better than implicit. *EVERYTHING* in Pickle is implicit.
  • It is possible - nay, *easy* - to have some totally random part of your application stick temporary data to an in-database object. Once objects 'get dirty' in this fashion, it's nearly impossible to find them, and when you do, the usual way to detect the problem is by having a 'load' operation explode!
  • As a corollary to that, since there are no explicit schemas, if you write an otherwise valid upgrader which does not delete an obsoleted attribute, that attribute will silently sit around forever, bloating your database until the end of time (or until objects you THOUGHT were gone from your data are deleted from your code, only to surprise you by then not loading).
  • cPickle can be coerced to coredump for certain inputs, both on store and on load. Some of these inputs are valid, some are the result of programming bugs. Heaven help you if you ever create a bug in a reduce method.
  • Also, 'regular' Pickle is too slow even for a joke.
  • There are no tools, besides a raw Python prompt, to investigate the contents of a pickle, or to profile the disk or memory usage of different parts of it. They are completely opaque blobs.

We have experienced all of these issues in production and some of them still give us nightmares. While we've never lost any user data, it has certainly made the process of upgrading and enhancing our production server... challenging.
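
The "objects get dirty" problem is easy to reproduce with nothing but the standard pickle module. In this small sketch the Mailbox class and the _render_cache attribute are invented for the example; the behaviour shown (every instance attribute is saved, declared or not) is simply how pickle treats ordinary instances by default.

```python
import pickle


class Mailbox:
    def __init__(self, owner):
        self.owner = owner  # the only attribute the class "means" to store


box = Mailbox("alice")

# Some unrelated part of the application sticks temporary data on the object.
box._render_cache = ["<li>hello</li>"]

# Pickle has no schema to check against, so the stray attribute is saved too.
restored = pickle.loads(pickle.dumps(box))
stray = sorted(vars(restored))
```

Nothing in the stored data or in the class definition records that _render_cache was never supposed to be persistent, which is exactly why such attributes are so hard to track down later.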

And finally, my favorite section:

ZOMG Hax!!1!!

Q: Your above answer for <question> is totally wrong. <database product X> answers it better, and by extension is much better overall. Why don't you rewrite Axiom to look like <database product X>?

A: 'Questions' of this form will be politely ignored, *especially* if product X is ObjectStore or Gemstone - there is *no way* we will ever have the time or inclination to become a full-fledged database company, we're just trying to find the best collection of tools and apply them to suit our needs. Please stop asking!

Q: Why did you abandon Atop! I was using that, I expect support!!!!

A: We never made any representations about Atop's stability, and in fact we actively discouraged people who were not working directly with us from using it. We will gladly maintain your Atop-based application, or migrate it to Axiom, for a reasonable maintenance fee. Seriously people - we are a *company*, not a foundation, and we are already giving away most of our infrastructure. We are NOT going to maintain it for your use cases unless you give us enough money to feed the people doing it.

jethro@divmod.org