Bookie Configuration Parameters

This page contains detailed information about configuration parameters used for configuring a bookie server. There is an example in "bookkeeper-server/conf/bk_server.conf".

Server parameters

bookiePortPort that bookie server listens on. The default value is 3181.
journalDirectoryDirectory to which Bookkeeper outputs its write ahead log, ideally on a dedicated device. The default value is "/tmp/bk-txn".
ledgerDirectoriesDirectory to which Bookkeeper outputs ledger snapshots. Multiple directories can be defined, separated by comma, e.g. /tmp/bk1-data,/tmp/bk2-data. Ideally ledger dirs and journal dir are each on a different device, which reduces the contention between random I/O and sequential writes. It is possible to run with a single disk, but performance will be significantly lower.
indexDirectoriesDirectories to store index files. If not specified, bookie will use ledgerDirectories to store index files.
bookieDeathWatchIntervalInterval to check whether a bookie is dead or not, in milliseconds.
gcWaitTimeInterval to trigger next garbage collection, in milliseconds. Since garbage collection is running in the background, running the garbage collector too frequently hurts performance. It is best to set its value high enough if there is sufficient disk capacity.
flushIntervalInterval to flush ledger index pages to disk, in milliseconds. Flushing index files will introduce random disk I/O. Consequently, it is important to have journal dir and ledger dirs each on different devices. However, if it necessary to have journal dir and ledger dirs on the same device, one option is to increment the flush interval to get higher performance. Upon a failure, the bookie will take longer to recover.
numAddWorkerThreadsNumber of threads that should handle write requests. if zero, the writes would be handled by netty threads directly.
numReadWorkerThreadsNumber of threads that should handle read requests. if zero, the reads would be handled by netty threads directly.

NIO server settings

serverTcpNoDelayThis settings is used to enabled/disabled Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting server.tcpnodelay to false to enable Nagle algorithm can provide better performance. Default value is true.

Journal settings

journalMaxSizeMBMaximum file size of journal file, in megabytes. A new journal file will be created when the old one reaches the file size limitation. The default value is 2kB.
journalMaxBackupsMax number of old journal file to keep. Keeping a number of old journal files might help data recovery in some special cases. The default value is 5.
journalPreAllocSizeMBThe space that bookie pre-allocate at a time in the journal.
journalWriteBufferSizeKBSize of the write buffers used for the journal.
journalRemoveFromPageCacheWhether bookie removes pages from page cache after force write. Used to avoid journal pollute os page cache.
journalAdaptiveGroupWritesWhether to group journal force writes, which optimize group commit for higher throughput.
journalMaxGroupWaitMSecMaximum latency to impose on a journal write to achieve grouping.
journalBufferedWritesThresholdMaximum writes to buffer to achieve grouping.
journalFlushWhenQueueEmptyWhether to flush the journal when journal queue is empty. Disabling it would provide sustained journal adds throughput.
numJournalCallbackThreadsThe number of threads that should handle journal callbacks.

Ledger cache settings

openFileLimitMaximum number of ledger index files that can be opened in a bookie. If the number of ledger index files reaches this limit, the bookie starts to flush some ledger indexes from memory to disk. If flushing happens too frequently, then performance is affected. You can tune this number to improve performance according.
pageSizeSize of an index page in ledger cache, in bytes. A larger index page can improve performance when writing page to disk, which is efficient when you have small number of ledgers and these ledgers have a similar number of entries. With a large number of ledgers and a few entries per ledger, a smaller index page would improves memory usage.
pageLimitMaximum number of index pages to store in the ledger cache. If the number of index pages reaches this limit, bookie server starts to flush ledger indexes from memory to disk. Incrementing this value is an option when flushing becomes frequent. It is important to make sure, though, that pageLimit*pageSize is not more than JVM max memory limit; otherwise it will raise an OutOfMemoryException. In general, incrementing pageLimit, using smaller index page would gain better performance in the case of a large number of ledgers with few entries per ledger. If pageLimit is -1, a bookie uses 1/3 of the JVM memory to compute the maximum number of index pages.

Ledger manager settings

ledgerManagerTypeWhat kind of ledger manager is used to manage how ledgers are stored, managed and garbage collected. See BookKeeper Internals for detailed info. Default is flat.
zkLedgersRootPathRoot zookeeper path to store ledger metadata. Default is /ledgers.

Entry Log settings

logSizeLimitMaximum file size of entry logger, in bytes. A new entry log file will be created when the old one reaches the file size limitation. The default value is 2GB.
entryLogFilePreallocationEnabledEnable/Disable entry logger preallocation. Enable this would provide sustained higher throughput and reduce latency impaction.
readBufferSizeBytesThe number of bytes used as capacity for BufferedReadChannel. Default is 512 bytes.
writeBufferSizeBytesThe number of bytes used as capacity for the write buffer. Default is 64KB.

Entry Log compaction settings

minorCompactionIntervalInterval to run minor compaction, in seconds. If it is set to less than or equal to zero, then minor compaction is disabled. Default is 1 hour.
minorCompactionThresholdEntry log files with remaining size under this threshold value will be compacted in a minor compaction. If it is set to less than or equal to zero, the minor compaction is disabled. Default is 0.2
majorCompactionIntervalInterval to run major compaction, in seconds. If it is set to less than or equal to zero, then major compaction is disabled. Default is 1 day.
majorCompactionThresholdEntry log files with remaining size below this threshold value will be compacted in a major compaction. Those entry log files whose remaining size percentage is still higher than the threshold value will never be compacted. If it is set to less than or equal to zero, the major compaction is disabled. Default is 0.8.
compactionMaxOutstandingRequestsThe maximum number of entries which can be compacted without flushing. When compacting, the entries are written to the entrylog and the new offsets are cached in memory. Once the entrylog is flushed the index is updated with the new offsets. This parameter controls the number of entries added to the entrylog before a flush is forced. A higher value for this parameter means more memory will be used for offsets. Each offset consist of 3
longs. This parameter should not be modified unless you know what you're doing.
compactionRateThe rate at which compaction will re-add entries. The unit is adds per second.

Statistics

enableStatisticsEnables the collection of statistics. Default is on.

Auto-replication

openLedgerRereplicationGracePeriodThis is the grace period which the rereplication worker waits before fencing and replicating a ledger fragment which is still being written to upon a bookie failure. The default is 30s.

Read-only mode support

readOnlyModeEnabledEnables/disables the read-only Bookie feature. A bookie goes into read-only mode when it finds integrity issues with stored data. If readOnlyModeEnabled is false, the bookie shuts down if it finds integrity issues. By default it is enabled.

Disk utilization

diskUsageThresholdFraction of the total utilized usable disk space to declare the disk full. The total available disk space is obtained with File.getUsableSpace(). Default is 0.95.
diskCheckIntervalInterval between consecutive checks of disk utilization. Default is 10s.

ZooKeeper parameters

zkServersA list of one or more servers on which zookeeper is running. The server list is comma separated, e.g., zk1:2181,zk2:2181,zk3:2181
zkTimeoutZooKeeper client session timeout in milliseconds. Bookie server will exit if it received SESSION_EXPIRED because it was partitioned off from ZooKeeper for more than the session timeout. JVM garbage collection or disk I/O can cause SESSION_EXPIRED. Increment this value could help avoiding this issue. The default value is 10,000.