A BookKeeper cluster consists of two main components:
- A ZooKeeper cluster that is used for configuration- and coordination-related tasks
- An ensemble of bookies
We won't provide a full guide to setting up a ZooKeeper cluster here; we recommend that you consult the official ZooKeeper documentation instead.
Cluster metadata setup
Once your ZooKeeper cluster is up and running, some BookKeeper metadata needs to be written to it. Before that can happen, you need to modify each bookie's configuration so that it points to the right ZooKeeper cluster.
On each bookie host, download the BookKeeper package as a tarball, then configure the bookie by setting values in the bookkeeper-server/conf/bk_server.conf config file. The one parameter that you must change is zkServers, which needs to be set to the ZooKeeper connection string for your ZooKeeper cluster.
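As a minimal sketch of the zkServers setting described above, the helper below rewrites the zkServers line in a bookie config file. The function name set_zk_servers and the host names in the usage note are hypothetical placeholders, not part of BookKeeper; substitute the connection string of your own ZooKeeper cluster.

```shell
# Hypothetical helper (not part of BookKeeper): set zkServers in a config file.
# $1 = path to the config file, $2 = ZooKeeper connection string.
set_zk_servers() {
  conf_file="$1"
  zk_connect="$2"
  tmp_file="$(mktemp)"
  # Keep every line except an existing zkServers entry.
  # grep exits non-zero when it selects no lines, so ignore its status.
  grep -v '^zkServers=' "$conf_file" > "$tmp_file" || true
  # Append the new connection string as a comma-separated host:port list.
  printf 'zkServers=%s\n' "$zk_connect" >> "$tmp_file"
  # Write back in place so the original file's permissions are preserved.
  cat "$tmp_file" > "$conf_file"
  rm -f "$tmp_file"
}
```

For example, running set_zk_servers bookkeeper-server/conf/bk_server.conf "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181" would leave a single zkServers line at the end of the file that points at three assumed ZooKeeper hosts.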
A full listing of the configurable parameters available in bookkeeper-server/conf/bk_server.conf can be found in the Configuration reference manual.
Once the bookie's configuration is set, you can set up the cluster metadata by running the following command from any bookie in the cluster:
$ bookkeeper-server/bin/bookkeeper shell metaformat
The metaformat command performs all the necessary ZooKeeper cluster metadata tasks, so it only needs to be run once, and it can be run from any bookie in the BookKeeper cluster.
Once cluster metadata formatting has been completed, your BookKeeper cluster is ready to go!
Starting up bookies
Before you start up your bookies, make sure that all bookie hosts have the correct configuration. You can then start up as many bookies as you'd like to form a cluster by using the bookie command of the bookkeeper CLI tool:
$ bookkeeper-server/bin/bookkeeper bookie
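The configuration check mentioned above could be scripted along these lines. This is only a sketch: the function names has_zk_servers and start_bookie_checked are hypothetical, and the check covers only the zkServers parameter, not the rest of the configuration.

```shell
# Returns 0 if the config file sets a non-empty zkServers value, 1 otherwise.
# Commented-out lines (starting with '#') do not count.
has_zk_servers() {
  grep -Eq '^zkServers=.+' "$1"
}

# Hypothetical wrapper: start a bookie only if zkServers is set.
# $1 = config file to check (defaults to the path used in this guide).
start_bookie_checked() {
  conf_file="${1:-bookkeeper-server/conf/bk_server.conf}"
  if has_zk_servers "$conf_file"; then
    bookkeeper-server/bin/bookkeeper bookie
  else
    echo "zkServers is not set in $conf_file" >&2
    return 1
  fi
}
```

Calling start_bookie_checked from the directory that contains bookkeeper-server runs the same bookie command shown above, but exits with an error instead of starting a bookie whose zkServers value is missing or empty.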
The number of bookies you should run in a BookKeeper cluster depends on the quorum mode that you've chosen, the desired throughput, and the number of clients using the cluster simultaneously.
Increasing the number of bookies will enable higher throughput, and there is no upper limit on the number of bookies.