David Brown's Blog

David Brown  //  Software engineer/Jazz musician.

May 31 / 8:06am

Jpool up on github

With wonderful tools like git, I figure there's not much reason for me to avoid publishing the code. I'll be pushing the source to the jpool page on GitHub as I work on it.

It's not exactly ready for release, but if anyone wants to play with it, feel free. To build it, you'll need Apache Ant and Ivy installed, as well as version 2.7.4 of the Scala compiler. Ivy will download the rest of the dependencies. Running ant test should run the unit tests.

It only has a couple of commands so far. Every command needs a storage pool reference, which is a URI of the form jpool:file://path. Since the first two slashes are part of the URI syntax, an absolute path (and it should be absolute) gives you three slashes in a row, as in jpool:file:///path.
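To make the slash counting concrete, here is a minimal Python sketch that builds a pool reference from a filesystem path. The helper name pool_uri is made up for illustration and is not part of jpool; it only shows how an absolute path ends up with three slashes.

```python
from pathlib import PurePosixPath


def pool_uri(path: str) -> str:
    """Build a jpool:file:// reference from an absolute POSIX path.

    Hypothetical helper, not part of jpool itself.
    """
    p = PurePosixPath(path)
    if not p.is_absolute():
        raise ValueError(f"pool path must be absolute: {path!r}")
    # "file://" supplies two slashes; the absolute path supplies the third.
    return "jpool:file://" + str(p)


print(pool_uri("/backup/pool"))  # jpool:file:///backup/pool
```

Rejecting relative paths up front is a design choice of this sketch, following the advice that the path should be absolute.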

Things you can do are:


  • Store a tarball into the pool: tar -cf - ... | jpool save jpool:file:///path key=value key=value
  • Make a snapshot of a directory: jpool dump jpool:file:///path /path/to/dump key=value key=value
  • List the entries: jpool list jpool:file:///path
  • Extract a tarball: jpool restore jpool:file:///path hash --tar | tar -xf - ...
  • Extract a snapshot: jpool restore jpool:file:///path hash /path/to/restore
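If you want to drive these commands from a script, the argument lists can be assembled as below. This is only a sketch: the function names are invented here, the argument order is copied from the commands listed above, and it assumes a jpool executable is on your PATH if you actually run the result.

```python
import shlex


def dump_argv(pool: str, directory: str, attrs: dict) -> list:
    """Argument list for 'jpool dump', following the form shown above."""
    pairs = [f"{key}={value}" for key, value in attrs.items()]
    return ["jpool", "dump", pool, directory, *pairs]


def restore_argv(pool: str, backup_hash: str, destination: str) -> list:
    """Argument list for restoring a snapshot into a directory."""
    return ["jpool", "restore", pool, backup_hash, destination]


cmd = dump_argv("jpool:file:///backup/pool", "/home", {"host": "hostname", "fs": "root"})
# Print the command rather than running it; pass cmd to subprocess.run to execute.
print(shlex.join(cmd))
```

Building a list instead of a single string avoids shell quoting problems when a path or value contains spaces.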

The tarballs are intended more for archiving than for backup. You should not compress the tarball before saving it. Jpool partially parses the tar headers, compresses the contents, and dedupes the data when storing it in the pool (meaning it will take little space to store similar tarballs).
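The space saving comes from storing identical content only once. The following Python sketch shows the general idea with a toy content-addressed store. It is not jpool's actual storage format: the BlobStore class, the per-file granularity, and the use of zlib are all assumptions made for illustration.

```python
import hashlib
import zlib


class BlobStore:
    """Toy content-addressed store: identical data is kept only once."""

    def __init__(self):
        self._blobs = {}  # hex SHA-1 of the raw data -> compressed bytes

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        if key not in self._blobs:
            # Compress only after hashing, so duplicates are still detected.
            self._blobs[key] = zlib.compress(data)
        return key

    def get(self, key: str) -> bytes:
        return zlib.decompress(self._blobs[key])

    def __len__(self) -> int:
        return len(self._blobs)


store = BlobStore()
first_tarball = [b"README contents\n", b"main.scala v1\n"]
second_tarball = [b"README contents\n", b"main.scala v2\n"]
for member in first_tarball + second_tarball:
    store.put(member)

print(len(store))  # 3: the shared README is stored once
```

This also suggests why pre-compressed input is a bad fit: a small change in a compressed stream tends to alter most of the bytes after it, so two similar tarballs would no longer share content that a store like this could recognise.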

The snapshots are intended for backup. Again, the data is deduped, and jpool remembers file contents so that subsequent backups should be fast. However, these are not incremental backups: each snapshot is a complete snapshot of the tree that shares data with previous snapshots.
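The difference between "complete but sharing data" and "incremental" can be sketched in a few lines of Python. Here each snapshot is a full manifest of the tree, while the file contents live in one shared table. The data layout is an assumption for illustration and says nothing about how jpool really lays out its pool.

```python
import hashlib

blobs = {}  # hex SHA-1 -> file contents, shared by every snapshot


def snapshot(tree: dict) -> dict:
    """Return a complete manifest (path -> content hash) for the tree."""
    manifest = {}
    for path, data in tree.items():
        digest = hashlib.sha1(data).hexdigest()
        blobs.setdefault(digest, data)  # unchanged files add nothing new
        manifest[path] = digest
    return manifest


monday = snapshot({"/etc/hosts": b"127.0.0.1 localhost\n", "/etc/motd": b"hello\n"})
tuesday = snapshot({"/etc/hosts": b"127.0.0.1 localhost\n", "/etc/motd": b"hello again\n"})

# Both manifests list every file, yet only three distinct contents are stored.
print(len(monday), len(tuesday), len(blobs))  # 2 2 3
```

Because every manifest is complete, either snapshot can be restored on its own without replaying a chain of earlier ones.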

Associated with each backup is a set of key=value pairs. You can use whatever you want; I typically use host=hostname fs=root and the like. Using something informative is important because, as far as jpool is concerned, the only meaningful handle is the SHA-1 hash of the backup itself.
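As a rough illustration of why informative pairs help, here is a small Python sketch that parses key=value arguments and checks a backup's pairs against the ones you are looking for. Both function names are hypothetical, and the sketch does not attempt to read jpool's real listing output.

```python
def parse_attrs(args):
    """Turn ['host=hostname', 'fs=root'] into {'host': 'hostname', 'fs': 'root'}."""
    attrs = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        attrs[key] = value
    return attrs


def matches(attrs, wanted):
    """True if every wanted pair is present in attrs with the same value."""
    return all(attrs.get(key) == value for key, value in wanted.items())


backup = parse_attrs(["host=hostname", "fs=root"])
print(backup)                           # {'host': 'hostname', 'fs': 'root'}
print(matches(backup, {"fs": "root"}))  # True
```

With descriptive pairs, finding the right hash becomes a matter of filtering on host and filesystem rather than remembering 40 hex digits.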