[Haskell] Announcing harchive: Backup and restore software in
Haskell.
David Brown
haskell2 at davidb.org
Sun Mar 4 23:40:13 EST 2007
Announcing release 0.1 of "harchive", a program for backing up and
restoring data. I've included a brief feature list below. The code
is available in darcs at:
http://www.davidb.org/darcs/harchive/
The connection isn't all that fast, so if it gets too busy, I'll move
it somewhere else.
This software is in a very early stage, but is at the point where
others may be interested in looking at it. It does demonstrate that
Haskell (at least GHC) is indeed useful for this kind of seemingly
low-level task.
Thanks,
David Brown
Harchive version 0.1.
- Implemented, with support for the following:
- Client/server model. The backup is stored in a file pool with
the hpool program. The hfile program can access this pool over
TCP (no authentication or encryption, so be careful).
- Stores data from multiple backups and multiple machines in a
content-addressable store. Duplicated data, even on separate
machines, will not take additional space. Collisions can be made
arbitrarily improbable by choice of hash function size (though
this is not easily changeable in the current code).
- Pool manager uses Sqlite3-specific capabilities to get efficient
storage of the pool index. The Sqlite3 binding used here is a
custom one, written to take advantage of these capabilities.
- Uses OpenSSL's SHA-1 library. It wouldn't be difficult to use a
different library (there are license issues with OpenSSL).
Generally the program's performance is bound by the speed of
hashing and/or the speed of data compression.
- Uses Duncan Coutts' zlib library to get good zlib speed.
- Linux-dependent. Uses the output of '/sbin/blkid' to map
devices to the UUIDs of their filesystems, giving a persistent
and unique identifier for each filesystem.
- Has a primitive status display ('-v') during backup.
- Able to back up and restore directories/filesystems. Restore
semantics are as accurate as I can get them without extra
strange semantics.
- Multithreaded backup. Allows backup to run at high speed, even
while waiting for cache responses from the server. Does not
need to be built with '-threaded'.
- Restore is reasonably simple. It tends to get tested less, so
it is beneficial to move complexity to the backup side.
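
To illustrate the content-addressable store mentioned in the feature list,
here is a minimal sketch of the idea, not harchive's actual code. It
assumes an in-memory Map instead of the Sqlite3-indexed file pool, and it
uses a 64-bit FNV-1a hash as a dependency-free stand-in for SHA-1. The
names Pool, insertChunk, lookupChunk, poolSize and demo are hypothetical.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Map.Strict as M
import Data.Bits (xor)
import Data.Word (Word64)

-- Assumption: a 64-bit FNV-1a hash stands in for SHA-1 so that the
-- sketch needs nothing beyond the libraries shipped with GHC. A real
-- pool would want a much wider, cryptographic hash.
type Key = Word64

fnv1a :: B.ByteString -> Key
fnv1a = B.foldl' step 14695981039346656037
  where
    -- XOR in the next byte, then multiply by the FNV prime (wraps mod 2^64).
    step h b = (h `xor` fromIntegral b) * 1099511628211

-- Hypothetical in-memory pool: chunk contents indexed by their hash.
newtype Pool = Pool (M.Map Key B.ByteString)

emptyPool :: Pool
emptyPool = Pool M.empty

-- Store a chunk and return its key. If the key is already present the
-- existing entry is kept, so duplicate data takes no additional space.
insertChunk :: B.ByteString -> Pool -> (Key, Pool)
insertChunk chunk (Pool m) =
  (k, Pool (M.insertWith (\_new old -> old) k chunk m))
  where
    k = fnv1a chunk

-- Fetch a chunk by key, as a restore would.
lookupChunk :: Key -> Pool -> Maybe B.ByteString
lookupChunk k (Pool m) = M.lookup k m

-- Number of distinct chunks held in the pool.
poolSize :: Pool -> Int
poolSize (Pool m) = M.size m

-- Two "machines" storing the same block: the keys match and the pool
-- holds a single entry.
demo :: (Bool, Int)
demo = (k1 == k2, poolSize p2)
  where
    (k1, p1) = insertChunk (BC.pack "same block") emptyPool
    (k2, p2) = insertChunk (BC.pack "same block") p1
```

Because the key is derived only from the chunk's contents, it does not
matter which backup or which machine the data came from. The trade-off,
as noted above, is that two different chunks with the same hash would be
treated as one, which is why the hash size matters.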