Recovering svn bdb repositories from endian issues
Update: Dominic has a few tips to make this an even easier process.
The other day I came across the old subversion repository for my thesis [1], and thought it would be a good idea to convert it to git and back it up somewhere. Immediate problem: I couldn’t even read it! This post is a short description of what the problem was, and how it was resolved.
The Problem
The repository uses the old (deprecated) Berkeley DB filesystem type. Let’s try and browse it:
$ svn ls file:///home/mark/_SVNREPOS/thesis
svn: E180001: Unable to connect to a repository at URL 'file:///home/mark/_SVNREPOS/thesis'
svn: E180001: Unable to open an ra_local session to URL
svn: E180001: Unable to open repository 'file:///home/mark/_SVNREPOS/thesis'
svn: E160029: Berkeley DB error for filesystem '/home/mark/_SVNREPOS/thesis/db' while opening environment:
svn: E160029: BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery
svn: E160029: bdb: BDB2530 Ignoring log file: /home/mark/_SVNREPOS/thesis/db/log.0000000018: magic number 88090400, not 40988
svn: E160029: bdb: BDB2527 Invalid log file: log.0000000018: Invalid argument
svn: E160029: bdb: BDB0061 PANIC: Invalid argument
svn: E160029: bdb: BDB1546 unable to join the environment
Oops. The key is in the fragment “magic number 88090400, not 40988”, which looks suspiciously like an endianness issue (88-09-04-00 / 00-04-09-88). The repository was created on a PPC Mac and I’m now trying to access it from an Intel box, so that seems like a reasonable explanation.
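The byte-order hunch is easy to check by hand. Here is a minimal sketch, assuming the numbers in the error message are hexadecimal and that the magic is stored as a 32-bit integer; the function name is made up for illustration:

```python
import struct

# Magic number Berkeley DB expected, as printed in the error message.
# Assumption: it is a 32-bit value shown in hexadecimal.
EXPECTED_MAGIC = 0x00040988


def read_with_wrong_byte_order(value: int) -> int:
    """Write a 32-bit value big-endian (PPC), then read it back little-endian (Intel)."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


seen = read_with_wrong_byte_order(EXPECTED_MAGIC)
print(f"{seen:08x}")  # prints 88090400
```

The result matches the "wrong" magic number in the log message, which is consistent with the log file having been written on a big-endian machine.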
At this point, though, I ran into something of a black hole. It was hard even to find confirmation that this could happen; “bdb is platform-independent” is the prevailing opinion. Some evidence did exist, but it was thin on the ground, which is one reason I’m writing this. One helpful post mentioned that while the main database is platform-independent, the log files are not; hence my problem.
Solution
I no longer had access to a PPC machine, but the solution came when a friend pointed out that QEMU can emulate big-endian architectures, even on a little-endian host. After that, it was just a matter of implementation details.
- Create a PPC guest
  With this post as a guide, grab a pre-configured Debian image, and apply the openbios fix as well. Install subversion on it (apt-get install subversion).
- Copy the repository onto the guest
  QEMU doesn’t provide out-of-the-box shared folders (well, apparently it has a built-in SMB server, but I didn’t investigate), but it does make the host addressable on 10.0.2.2, which is all we need. Tar up the repository and run python -m http.server [2] from the same directory on the host, then download it from the guest with wget http://10.0.2.2:8000/repo.tar.
- Migrate the repository
  The reason we’re here; we’ll convert it to the FSFS format. This is actually slightly fiddly; essentially you create a new repository in the FSFS format, dump from your old repository and load the dump into the new one. Assuming it has been extracted to oldrepo:
    svnadmin create ./fsfsrepo --fs-type fsfs
    svnadmin dump ./oldrepo -q | svnadmin load ./fsfsrepo
- Copy the recovered repository back to the host
  Once again there are probably smoother ways, but I used a quick-and-dirty approach (good old netcat!):
  - (from the host)
    nc -l -p 8000 > fsfsrepo.tar
  - (on the guest)
    tar cvf - fsfsrepo | nc 10.0.2.2 8000
  - (from the host)
- Import your recovered repository into git
  This step is entirely optional of course, but I’m mentioning it for the sake of completeness since that was my goal in this exercise. I used svn2git, which is a wrapper around git-svn.
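If you have more than one repository to rescue, the migration step is the one worth scripting. The following is only a sketch of that create, dump and load sequence, assuming Python 3 and svnadmin are both available on the guest; the function names are hypothetical, and the paths are the ones used above:

```python
import subprocess


def migration_commands(old_repo: str, new_repo: str):
    """Return the argv lists for the create, dump and load steps."""
    create = ["svnadmin", "create", "--fs-type", "fsfs", new_repo]
    dump = ["svnadmin", "dump", "-q", old_repo]
    load = ["svnadmin", "load", new_repo]
    return create, dump, load


def migrate(old_repo: str, new_repo: str) -> None:
    """Create the FSFS repository, then pipe the old repository's dump into it."""
    create, dump, load = migration_commands(old_repo, new_repo)
    subprocess.run(create, check=True)

    # Equivalent of: svnadmin dump ... | svnadmin load ...
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    loader = subprocess.run(load, stdin=dumper.stdout)
    dumper.stdout.close()
    dump_status = dumper.wait()

    if dump_status != 0 or loader.returncode != 0:
        raise RuntimeError(
            f"migration failed (dump exit {dump_status}, load exit {loader.returncode})"
        )
```

Calling migrate("./oldrepo", "./fsfsrepo") on the guest should then do the same work as the two svnadmin commands, with the added benefit of failing loudly if either side of the pipe exits with an error.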
Conclusion
I hope this helps someone else in the future, and at the very least adds Google-weight to the problem. Thanks to the fine folk on Google+ for their suggestions, and Paul for the QEMU tip!
[1] Way to make me feel old. Not only was it in a long-deprecated format, but this was actually the rewrite of my thesis. The first version was written under CVS… (Remember when people were excited about the upcoming release of subversion, and new announcements about it made the front page of Slashdot? Hell, remember Slashdot? I think I’m going to go drink now, and feel like a fossil.)
[2] Or python -m SimpleHTTPServer if you’re still using python 2.