Bookie Recovery

When a bookie crashes, any ledgers with entries on that bookie may become under-replicated. For this reason, we provide a recovery tool that ensures all ledgers which had entries on the failed bookie are fully replicated again. At the moment, this is not an automatic process: the administrator must run the tool manually after noticing that the bookie has died.

To run recovery with zk1.example.com as the ZooKeeper ensemble and bk3.example.com as the failed bookie, run the following:

bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools zk1.example.com:2181 bk3.example.com:3181

You must specify both the host and the port of the failed bookie, as this is how it identifies itself to ZooKeeper. An optional third argument specifies the bookie to replicate to. If it is omitted, as in our example, a random bookie is chosen for each ledger fragment. A ledger fragment is a continuous sequence of entries on a bookie that share the same ensemble.
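To make the notion of a ledger fragment concrete, here is a minimal sketch in Python. It is not BookKeeper code. It assumes, purely for illustration, that a ledger's metadata can be modeled as a map from the first entry id of each fragment to the ensemble serving it; the function name fragments_on_bookie and the sample hosts are hypothetical.

```python
from typing import Dict, List, Tuple


def fragments_on_bookie(
    ensembles: Dict[int, List[str]],
    last_entry_id: int,
    failed: str,
) -> List[Tuple[int, int]]:
    """Return (first_entry, last_entry) for each fragment stored on `failed`.

    `ensembles` is an assumed model of ledger metadata: it maps the first
    entry id of a fragment to the list of bookies (the ensemble) that holds
    the fragment. A fragment runs until the entry before the next key, or
    until `last_entry_id` for the final fragment.
    """
    starts = sorted(ensembles)
    result = []
    for i, start in enumerate(starts):
        end = starts[i + 1] - 1 if i + 1 < len(starts) else last_entry_id
        if failed in ensembles[start]:
            result.append((start, end))
    return result


# Hypothetical ledger whose ensemble changed twice.
example = {
    0: ["bk1.example.com:3181", "bk2.example.com:3181", "bk3.example.com:3181"],
    100: ["bk1.example.com:3181", "bk2.example.com:3181", "bk4.example.com:3181"],
    250: ["bk3.example.com:3181", "bk4.example.com:3181", "bk5.example.com:3181"],
}
affected = fragments_on_bookie(example, 299, "bk3.example.com:3181")
```

In this example the failed bookie bk3.example.com:3181 appears in the first and third ensembles, so those two fragments are the ones that would need to be re-replicated.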

The recovery process is as follows.

  1. The client reads the metadata of active ledgers from ZooKeeper;
  2. From this, the ledgers which contain fragments using the failed bookie in their ensemble are selected;
  3. A recovery process is initiated for each ledger in this list;
    1. The client goes through all ledger fragments in the ledger, selecting those which contain the failed bookie;
    2. A recovery process is initiated for each ledger fragment in this list;
      1. The client selects a bookie to which all entries in the ledger fragment will be replicated;
      2. The client reads the entries that belong to the ledger fragment from other bookies in the ensemble and writes them to the selected bookie;
      3. Once all entries have been replicated, the ZooKeeper metadata for the fragment is updated to reflect the new ensemble;
      4. The fragment is marked as fully replicated in the recovery tool;
    3. Once all ledger fragments are marked as fully replicated, the ledger is marked as fully replicated;
  4. Once all ledgers are marked as fully replicated, bookie recovery is finished.
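
The steps above can be sketched as a small in-memory simulation in Python. This is an illustration of the control flow only, not the recovery tool's implementation. The data model (a dict from ledger id to a map of first entry id to ensemble), the function name recover_bookie, and the replicate callback that stands in for reading and writing entries are all assumptions made for this sketch.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Assumed model: first entry id of a fragment -> ensemble of bookies.
Ensembles = Dict[int, List[str]]


def recover_bookie(
    ledgers: Dict[int, Ensembles],
    failed: str,
    live_bookies: List[str],
    replicate: Callable[[int, int, List[str], str], None],
    target: Optional[str] = None,
) -> List[Tuple[int, int, str]]:
    """Re-replicate every fragment that used `failed`; return what was done."""
    done = []
    # Steps 1-2: from the ledger metadata, select ledgers that use the bookie.
    affected = [
        lid for lid, ens in ledgers.items()
        if any(failed in ensemble for ensemble in ens.values())
    ]
    # Step 3: recover each affected ledger.
    for lid in affected:
        ensembles = ledgers[lid]
        # Step 3.1: go through the fragments, keeping those on the failed bookie.
        for first_entry in sorted(ensembles):
            ensemble = ensembles[first_entry]
            if failed not in ensemble:
                continue
            # Step 3.2.1: use the given target, or pick a random bookie
            # that is not already part of this fragment's ensemble.
            if target is not None:
                chosen = target
            else:
                candidates = [b for b in live_bookies if b not in ensemble]
                if not candidates:
                    raise RuntimeError("no bookie available to replicate to")
                chosen = random.choice(candidates)
            # Step 3.2.2: copy entries from the surviving bookies (callback).
            sources = [b for b in ensemble if b != failed]
            replicate(lid, first_entry, sources, chosen)
            # Step 3.2.3: update the metadata to reflect the new ensemble.
            ensembles[first_entry] = [chosen if b == failed else b for b in ensemble]
            # Step 3.2.4: mark the fragment as fully replicated.
            done.append((lid, first_entry, chosen))
    # Step 4: all affected ledgers have been processed.
    return done
```

Passing target mirrors the optional third argument of the recovery tool; leaving it as None mirrors the default behaviour of choosing a random bookie per fragment. A real implementation would also have to handle concurrent metadata updates and read failures, which this sketch ignores.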