A reference guide to all of BookKeeper's configurable parameters


The table below lists parameters that you can set to configure bookies. All configuration takes place in the bk_server.conf file in the bookkeeper-server/conf directory of your BookKeeper installation.
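As a rough illustration of how such a file can be read, the sketch below parses key=value lines in Python. It assumes the file uses plain key=value lines with # comments; the sample text and the parse_conf helper are hypothetical and not part of BookKeeper.

```python
def parse_conf(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical excerpt in the style of bk_server.conf.
sample = """
# bookie settings
bookiePort=3181
journalDirectories=/tmp/bk-journal1,/tmp/bk-journal2
"""

conf = parse_conf(sample)
# Multi-valued settings are comma-separated.
journal_dirs = conf["journalDirectories"].split(",")
```

All values come back as strings, so numeric settings such as bookiePort need an explicit conversion before use.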

Server parameters

Parameter Description Default
bookiePort

The port that the bookie server listens on.

3181
journalDirectories

The directories to which BookKeeper writes its write-ahead log (WAL). You can define multiple directories, separated by commas, for example: journalDirectories=/tmp/bk-journal1,/tmp/bk-journal2. If journalDirectories is set, bookies ignore journalDirectory and use these directories instead.

/tmp/bk-journal
journalDirectory

The directory to which BookKeeper writes its write-ahead log (WAL).

/tmp/bk-txn
allowMultipleDirsUnderSameDiskPartition

Whether the bookie allows multiple ledger, index, or journal directories on the same filesystem disk partition.

indexDirectories

The directories in which index files are stored. If not specified, the value of ledgerDirectories will be used.

/tmp/bk-data
minUsableSizeForIndexFileCreation

The minimum usable space, in bytes, that must be available in the index directory for the bookie to create index files while replaying the journal when it starts in read-only mode.

1073741824
listeningInterface

The network interface that the bookie should listen on. If not set, the bookie will listen on all interfaces.

eth0
advertisedAddress

A specific hostname or IP address that the bookie should advertise to clients. If not set, the bookie advertises its own IP address or hostname, depending on the listeningInterface and useHostNameAsBookieID settings.

eth0
allowLoopback

Whether the bookie is allowed to use a loopback interface as its primary interface (the interface it uses to establish its identity). By default, loopback interfaces are not allowed as the primary interface.

Using a loopback interface as the primary interface usually indicates a configuration error. It’s fairly common in some VPS setups, for example, to not configure a hostname or to have the hostname resolve to 127.0.0.1. If this is the case, then all bookies in the cluster will establish their identities as 127.0.0.1:3181, and only one will be able to join the cluster. For VPSs configured like this, you should explicitly set the listening interface.

false
bookieDeathWatchInterval

The interval, in milliseconds, at which to check whether the bookie is dead.

1000
flushInterval

The interval, in milliseconds, at which ledger index pages are flushed to disk. Flushing index files introduces a lot of random disk I/O. If the journal directory and the ledger directories are on separate devices, flushing does not affect performance. If they share a device, flushing too frequently degrades performance significantly. Increasing the flush interval can improve performance, at the cost of a longer bookie restart after a failure.

100
allowStorageExpansion

Allow the expansion of bookie storage capacity. Newly added ledger and index directories must be empty.

false
useHostNameAsBookieID

Whether the bookie should use its hostname to register with the ZooKeeper coordination service. When false, the bookie will use its IP address for the registration.

false
allowEphemeralPorts

Whether the bookie is allowed to use an ephemeral port (port 0) as its server port. By default, an ephemeral port is not allowed. Using an ephemeral port as the service port usually indicates a configuration error. However, in unit tests, using an ephemeral port will address port conflict problems and allow running tests in parallel.

false
enableLocalTransport

Whether the bookie listens for BookKeeper clients running in the local JVM.

false
disableServerSocketBind

Whether the bookie skips binding to network interfaces. When enabled, the bookie is available only to BookKeeper clients running in the local JVM.

false
skipListArenaChunkSize

The number of bytes to use as the chunk allocation size for org.apache.bookkeeper.bookie.SkipListArena.

4194304
skipListArenaMaxAllocSize

The maximum size of an allocation served from the skip list arena. Larger allocations are made directly by the JVM to avoid fragmentation.

131072
bookieAuthProviderFactoryClass

The bookie authentication provider factory class name. If this is null, no authentication will take place.
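The identity collision described under allowLoopback can be illustrated with a short Python sketch. The host names and addresses are made up, and bookie_identity is a hypothetical helper that only mimics the address:port form mentioned above; it is not how a bookie actually computes its ID.

```python
import ipaddress


def bookie_identity(ip, port=3181):
    """Build an identity string in the address:port form used above."""
    return f"{ip}:{port}"


def is_loopback(ip):
    """Return True if the address belongs to a loopback range."""
    return ipaddress.ip_address(ip).is_loopback


# Hypothetical hosts: two of them resolve their hostname to 127.0.0.1.
resolved = {
    "bookie-a": "127.0.0.1",
    "bookie-b": "127.0.0.1",
    "bookie-c": "10.0.0.7",
}

# Three hosts, but only two distinct identities.
identities = {bookie_identity(ip) for ip in resolved.values()}
misconfigured = sorted(h for h, ip in resolved.items() if is_loopback(ip))
```

Because bookie-a and bookie-b would both claim 127.0.0.1:3181, only one of them could join the cluster, which is why a loopback primary interface is rejected unless allowLoopback is set.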

Garbage collection settings

Parameter Description Default
gcWaitTime

The interval, in milliseconds, between garbage collection runs. Because garbage collection runs in the background, running it too frequently hurts performance. If there is enough disk capacity, a longer interval is better.

1000
gcOverreplicatedLedgerWaitTime

The interval, in milliseconds, between garbage collection runs for overreplicated ledgers. This should not run very frequently, because it reads the metadata for all ledgers on the bookie from ZooKeeper.

86400000
numAddWorkerThreads

The number of threads that handle write requests. If 0, writes are handled by Netty threads directly.

1
numReadWorkerThreads

The number of threads that handle read requests. If 0, reads are handled by Netty threads directly.

1
isForceGCAllowWhenNoSpace

Whether force compaction is allowed when the disk is full or almost full. Forcing GC may get some space back, but may also fill up disk space more quickly. This is because new log files are created before GC, while old garbage log files are deleted after GC.

false

TLS settings

Parameter Description Default
tlsProvider

TLS Provider (JDK or OpenSSL)

OpenSSL
tlsProviderFactoryClass

The fully qualified name of the factory class that provides security.

org.apache.bookkeeper.security.SSLContextFactory
tlsClientAuthentication

Whether the server requires TLS client authentication.

true
tlsKeyStoreType

Bookie Keystore type.

JKS
tlsKeyStore

Bookie Keystore location (path).

tlsKeyStorePasswordPath

Bookie Keystore password path, if the keystore is protected by a password.

tlsTrustStoreType

Bookie Truststore type.

tlsTrustStore

Bookie Truststore location (path).

tlsTrustStorePasswordPath

Bookie Truststore password path, if the truststore is protected by a password.

Long poll request parameter settings

Parameter Description Default
numLongPollWorkerThreads

The number of threads that should handle long poll requests.

10
requestTimerTickDurationMs

The tick duration in milliseconds for long poll requests.

10
requestTimerNumTicks

The number of ticks per wheel for the long poll request timer.

1024
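The two timer settings combine into the time span covered by one rotation of the timer wheel. The Python sketch below works through that arithmetic, assuming generic hashed-wheel-timer behaviour in which a delay longer than one rotation needs extra full rounds. The function names are hypothetical and this is not BookKeeper's implementation.

```python
def wheel_span_ms(tick_ms, ticks_per_wheel):
    """Time covered by one full rotation of the wheel."""
    return tick_ms * ticks_per_wheel


def rounds_and_slot(delay_ms, tick_ms, ticks_per_wheel):
    """Full rotations and slot offset for a delay (generic wheel model)."""
    ticks = -(-delay_ms // tick_ms)  # ceiling division
    return ticks // ticks_per_wheel, ticks % ticks_per_wheel


# With the defaults listed above: 10 ms ticks and 1024 ticks per wheel.
span = wheel_span_ms(10, 1024)
placement = rounds_and_slot(15000, 10, 1024)
```

With the defaults, one rotation covers 10240 ms, so in this model a 15-second delay needs one full round plus 476 ticks.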

AutoRecovery settings

Parameter Description Default
auditorPeriodicBookieCheckInterval

The time interval between auditor bookie checks, in seconds. The auditor bookie check checks ledger metadata to see which bookies should contain entries for each ledger. If a bookie that should contain entries is unavailable, then the ledger containing that entry is marked for recovery. Setting this to 0 disables the periodic check. Bookie checks will still run when a bookie fails. The default is once per day.

86400
rereplicationEntryBatchSize

The number of entries that a replication will rereplicate in parallel.

10
openLedgerRereplicationGracePeriod

The grace period, in seconds, that the replication worker waits before fencing and replicating a ledger fragment that’s still being written to upon bookie failure.

30
autoRecoveryDaemonEnabled

Whether the bookie also starts the autorecovery service itself.

lostBookieRecoveryDelay

How long to wait, in seconds, before starting autorecovery of a lost bookie.

0

Netty server settings

Parameter Description Default
serverTcpNoDelay

This setting enables or disables Nagle’s algorithm, which improves the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network.

If you are sending many small messages, such that more than one can fit in a single IP packet, setting serverTcpNoDelay to false to enable Nagle’s algorithm can provide better performance.

true
serverSockKeepalive

This setting is used to send keep-alive messages on connection-oriented sockets.

true
serverTcpLinger

The socket linger timeout on close. When enabled, a close or shutdown will not return until all queued messages for the socket have been successfully sent or the linger timeout has been reached. Otherwise, the call returns immediately and the closing is done in the background.

0
byteBufAllocatorSizeInitial

The initial buffer size of the receive ByteBuf allocator.

65536
byteBufAllocatorSizeMin

The minimum buffer size of the receive ByteBuf allocator.

65536
byteBufAllocatorSizeMax

The maximum buffer size of the receive ByteBuf allocator.

1048576
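The serverTcpNoDelay and serverSockKeepalive settings correspond to standard socket options. The Python sketch below sets the equivalent options on a plain, unconnected TCP socket to show what each one toggles. It uses only the standard socket module and makes no claim about how the bookie's Netty server applies them.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # TCP_NODELAY=1 disables Nagle's algorithm (like serverTcpNoDelay=true).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # SO_KEEPALIVE=1 enables keep-alive probes (like serverSockKeepalive=true).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Read the options back; a non-zero value means the option is enabled.
    nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    keepalive = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
finally:
    sock.close()
```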

Journal settings

Parameter Description Default
journalFormatVersionToWrite

The journal format version to write. Available formats are 1 to 5:
1: no header
2: a header section was added
3: ledger key was introduced
4: fencing key was introduced
5: header expanded to 512 bytes, and writes padded to align to the sector size configured by journalAlignmentSize

The default is 4. To enable the padding-writes feature, set the journal version to 5. To disable padding writes, set the journal version back to 4. This feature is available in version 4.5.0 and later.

4
journalMaxSizeMB

The maximum size of a journal file, in megabytes. A new journal file is created when the current one reaches this limit.

2048
journalMaxBackups

The maximum number of old journal files to keep. Keeping some old journal files can help with data recovery in special cases.

5
journalPreAllocSizeMB

How much space to pre-allocate at a time in the journal, in megabytes.

16
journalWriteBufferSizeKB

Size of the write buffers used for the journal.

64
journalRemoveFromPageCache

Whether to remove pages from the page cache after a force write.

false
journalAdaptiveGroupWrites

Whether to group journal force writes, which optimizes group commit for higher throughput.

true
journalMaxGroupWaitMSec

Maximum latency to impose on a journal write to achieve grouping.

200
journalBufferedWritesThreshold

Maximum writes to buffer to achieve grouping.

524288
numJournalCallbackThreads

The number of threads that should handle journal callbacks.

1
journalAlignmentSize

All the journal writes and commits should be aligned to given size. If not, zeros will be padded to align to given size.

512
journalBufferedEntriesThreshold

The maximum number of entries to buffer before a journal write, to achieve grouping.

0
journalFlushWhenQueueEmpty

Whether to flush the journal when the journal queue is empty.

false
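The padding behaviour described for journalAlignmentSize comes down to rounding a write size up to the next multiple of the alignment. The Python sketch below works through that arithmetic. The function names are hypothetical and it only models the size calculation, not the journal writer itself.

```python
def padded_size(write_bytes, alignment=512):
    """Round a write size up to the next multiple of the alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (write_bytes + alignment - 1) // alignment * alignment


def padding_bytes(write_bytes, alignment=512):
    """Number of zero bytes needed to reach the aligned size."""
    return padded_size(write_bytes, alignment) - write_bytes


# A 1300-byte write is padded up to three 512-byte sectors.
aligned = padded_size(1300)
zeros = padding_bytes(1300)
```

A write that is already a multiple of the alignment, such as 1024 bytes, needs no padding.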

Ledger storage settings

Parameter Description Default
ledgerStorageClass

Ledger storage implementation class

org.apache.bookkeeper.bookie.SortedLedgerStorage
ledgerDirectories

The directories to which BookKeeper writes ledger snapshots. You can define multiple directories, separated by commas, for example /tmp/data-dir1,/tmp/data-dir2.

/tmp/bk1-data,/tmp/bk2-data
auditorPeriodicCheckInterval

The time interval, in seconds, at which the auditor will check all ledgers in the cluster. By default this runs once a week.

Set this to 0 to disable the periodic check completely. Note that periodic checking will put extra load on the cluster, so it should not be run more frequently than once a day.

604800
sortedLedgerStorageEnabled

Whether sorted-ledger storage is enabled.

true
skipListSizeLimit

The size limit of the skip list data in EntryMemTable, in bytes (64 MB by default).

67108864

Ledger cache settings

Parameter Description Default
openFileLimit

The maximum number of ledger index files that can be open in the bookie server. When the number of open ledger index files reaches this limit, the bookie server starts to swap some ledgers from memory to disk. Swapping too frequently affects performance. Tune this number according to your requirements.

900
pageSize

The size of an index page in the ledger cache, in bytes. A larger index page can improve the performance of writing pages to disk, which is efficient when you have a small number of ledgers with a similar number of entries each. If you have a large number of ledgers and each ledger has few entries, a smaller index page improves memory usage.

8192
pageLimit

The number of index pages provided in the ledger cache. When the number of index pages reaches this limit, the bookie server starts to swap some ledgers from memory to disk. You can increase this value if swapping becomes more frequent, but make sure that pageLimit*pageSize does not exceed the JVM's maximum memory, otherwise you will get an OutOfMemoryError. In general, increasing pageLimit and using a smaller index page gives better performance when there are many ledgers with few entries each. If pageLimit is -1, the bookie server uses 1/3 of the JVM memory to compute the limit on the number of index pages.

-1
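The relationship between pageSize, pageLimit, and JVM memory can be checked with a little arithmetic. The Python sketch below assumes that the automatic limit is simply one third of the JVM maximum memory divided by the page size. That reading of the description, the helper names, and the 3 GiB heap are all assumptions for illustration.

```python
def index_cache_bytes(page_limit, page_size=8192):
    """Memory taken by the index pages when the cache is full."""
    return page_limit * page_size


def derived_page_limit(jvm_max_bytes, page_size=8192):
    """Assumed automatic limit: one third of JVM memory, in pages."""
    return (jvm_max_bytes // 3) // page_size


# Hypothetical bookie JVM with a 3 GiB maximum heap.
jvm_max = 3 * 1024 ** 3
auto_limit = derived_page_limit(jvm_max)
cache_bytes = index_cache_bytes(auto_limit)
```

Under these assumptions the cache holds 131072 pages of 8192 bytes, or 1 GiB in total. An explicit pageLimit can be checked the same way against the JVM's maximum memory.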

Ledger manager settings

Parameter Description Default
ledgerManagerType

The ledger manager type, which defines how ledgers are stored, managed, and garbage collected. See the Ledger Manager guide for more details.

flat
zkLedgersRootPath

The root ZooKeeper path for ledger metadata. The ZooKeeper-based ledger manager uses this parameter as the root znode under which all ledgers are stored.

/ledgers

Entry log settings

Parameter Description Default
logSizeLimit

The maximum size of an entry log file, in bytes. A new entry log file is created when the current one reaches this limit.

2147483648
entryLogFilePreallocationEnabled

Whether entry logger preallocation is enabled.

true
flushEntrylogBytes

Entry log flush interval, in bytes. Setting this to 0 or less disables this feature and makes flush happen on log rotation. Flushing in smaller chunks but more frequently reduces spikes in disk I/O. Flushing too frequently may negatively affect performance.

0
readBufferSizeBytes

The capacity allocated for BufferedReadChannels, in bytes.

512
writeBufferSizeBytes

The number of bytes used as capacity for the write buffer.

65536

Entry log compaction settings

Parameter Description Default
compactionRate

The rate at which compaction will read entries. The unit is adds per second.

1000
minorCompactionThreshold

The threshold for minor compaction. Entry log files whose remaining size percentage falls below this threshold are compacted in a minor compaction. If set to less than zero, minor compaction is disabled.

0.2
minorCompactionInterval

The interval, in seconds, at which minor compaction runs. If set to less than zero, minor compaction is disabled.

compactionMaxOutstandingRequests

Set the maximum number of entries which can be compacted without flushing. When compacting, the entries are written to the entrylog and the new offsets are cached in memory. Once the entrylog is flushed the index is updated with the new offsets. This parameter controls the number of entries added to the entrylog before a flush is forced. A higher value for this parameter means more memory will be used for offsets. Each offset consists of 3 longs. This parameter should not be modified unless you know what you’re doing.

100000
majorCompactionThreshold

The threshold for major compaction. Entry log files whose remaining size percentage falls below this threshold are compacted in a major compaction. Entry log files whose remaining size percentage is still above the threshold are never compacted. If set to less than zero, major compaction is disabled.

0.8
majorCompactionInterval

Interval to run major compaction, in seconds. If it is set to less than zero, the major compaction is disabled.

86400
isThrottleByBytes

Whether to throttle compaction by bytes (true) or by entries (false).

false
compactionRateByEntries

Set the rate at which compaction will read entries. The unit is adds per second.

1000
compactionRateByBytes

Set the rate at which compaction will read entries. The unit is bytes added per second.

1000000
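The two compaction thresholds can be read as a simple eligibility rule on the fraction of an entry log that is still in use. The Python sketch below encodes that rule with the default thresholds from the table. The eligible_passes function is hypothetical and models only the threshold comparison, not the scheduling driven by the interval settings.

```python
def eligible_passes(remaining_ratio, minor_threshold=0.2, major_threshold=0.8):
    """Return the compaction passes that would pick up an entry log.

    remaining_ratio is the fraction of the log still holding live data.
    A negative threshold disables the corresponding pass.
    """
    passes = []
    if minor_threshold >= 0 and remaining_ratio < minor_threshold:
        passes.append("minor")
    if major_threshold >= 0 and remaining_ratio < major_threshold:
        passes.append("major")
    return passes


mostly_garbage = eligible_passes(0.1)  # below both thresholds
half_live = eligible_passes(0.5)       # only below the major threshold
mostly_live = eligible_passes(0.9)     # above both thresholds
```

An entry log that is 90% live is never compacted under the defaults, which matches the description of majorCompactionThreshold.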

Statistics

Parameter Description Default
enableStatistics

Whether statistics are enabled for the bookie.

true
statsProviderClass

Stats provider class.

org.apache.bookkeeper.stats.CodahaleMetricsProvider

Read-only mode support

Parameter Description Default
readOnlyModeEnabled

Whether the bookie supports a read-only mode. If readOnlyModeEnabled=true and all configured ledger directories are full, the bookie switches to read-only mode and serves only read requests. Otherwise the bookie shuts down. Read-only mode is disabled by default.

false
forceReadOnlyBookie

Whether the bookie is forced to start in read-only mode.

false
persistBookieStatusEnabled

Whether to persist the bookie status locally on disk, so that bookies keep their status across restarts.

Disk utilization

Parameter Description Default
diskUsageThreshold

The maximum fraction of disk space that can be used in each ledger directory. The default is 0.95, meaning that at most 95% of the disk can be used, after which nothing more is written to that partition. If all ledger directory partitions are full, the bookie switches to read-only mode if readOnlyModeEnabled=true is set; otherwise it shuts down. Valid values are between 0 and 1 (exclusive).

0.95
diskUsageLwmThreshold

The low water mark threshold for disk usage. A disk is considered full when the usage threshold is exceeded, and returns to the non-full state when usage falls below the low water mark threshold. This prevents the disk from flipping back and forth between these states when concurrent writes and compaction are happening. It also prevents the bookie from switching frequently between read-only and read-write states in the same cases.

0.9
diskUsageWarnThreshold

The warning threshold for disk usage. When the usage of a ledger directory exceeds this fraction, the bookie logs a warning.

0.95
diskCheckInterval

The interval, in milliseconds, at which the usage of the ledger directories is checked.

10000
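The interplay of diskUsageThreshold and diskUsageLwmThreshold is a form of hysteresis: the disk becomes full above the upper threshold and becomes writable again only below the lower one. The Python sketch below models that with the default values from the table. The DiskState class is hypothetical and is not the bookie's actual monitor.

```python
class DiskState:
    """Track the full/non-full state of one ledger directory."""

    def __init__(self, full_threshold=0.95, lwm_threshold=0.90):
        self.full_threshold = full_threshold
        self.lwm_threshold = lwm_threshold
        self.full = False

    def update(self, usage):
        """Apply one usage sample (a fraction from 0 to 1); return the state."""
        if usage > self.full_threshold:
            self.full = True
        elif usage < self.lwm_threshold:
            self.full = False
        # Between the two thresholds the previous state is kept.
        return self.full


disk = DiskState()
samples = [0.80, 0.96, 0.93, 0.91, 0.89, 0.93]
states = [disk.update(u) for u in samples]
```

The disk stays full at 0.93 and 0.91 after crossing 0.95, and the later 0.93 sample does not make it full again, so small fluctuations around one threshold do not flip the state.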

ZooKeeper parameters

Parameter Description Default
zkServers

A list of one or more servers on which ZooKeeper is running. The server list is a comma-separated list of values, for example zkServers=zk1:2181,zk2:2181,zk3:2181.

localhost:2181
zkTimeout

The ZooKeeper client session timeout, in milliseconds. The bookie server exits if it receives SESSION_EXPIRED because it was partitioned off from ZooKeeper for longer than the session timeout. JVM garbage collection and disk I/O can cause SESSION_EXPIRED; increasing this value can help avoid this issue.

10
zkRetryBackoffStartMs

The ZooKeeper client backoff retry start time, in milliseconds.

1000
zkRetryBackoffMaxMs

The ZooKeeper client backoff retry maximum time, in milliseconds.

10000
zkEnableSecurity

Whether to set ACLs on every node written to ZooKeeper, so that only allowed users can read and write BookKeeper metadata stored in ZooKeeper. For ACLs to work, you need to set up ZooKeeper JAAS authentication. All bookies and clients need to share the same user, which is usually done using Kerberos authentication. See the ZooKeeper documentation.

false
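The zkRetryBackoffStartMs and zkRetryBackoffMaxMs settings bound the delay between ZooKeeper retries. The Python sketch below assumes a plain doubling schedule capped at the maximum. That schedule is an assumption for illustration, and the real client may behave differently, for example by adding random jitter. The backoff_delays helper is hypothetical.

```python
def backoff_delays(attempts, start_ms=1000, max_ms=10000):
    """Delay before each retry: doubles from start_ms, capped at max_ms."""
    delays = []
    delay = start_ms
    for _ in range(attempts):
        delays.append(min(delay, max_ms))
        delay *= 2
    return delays


# With the defaults listed above, the cap is reached on the fifth retry.
schedule = backoff_delays(6)
```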

An entry is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Entries are also known as records.

A ledger is a sequence of entries written to BookKeeper. Entries are written sequentially to ledgers and at most once, giving ledgers append-only semantics.

A bookie is an individual BookKeeper storage server.

Bookies store the content of ledgers and act as a distributed ensemble.

Autorecovery is a subsystem that runs in the background on bookies to ensure that ledgers are fully replicated even if one bookie from the ensemble is down.

Striping is the process of distributing BookKeeper ledgers to sub-groups of bookies rather than to all bookies in a BookKeeper ensemble.

Striping is essential to ensuring fast performance.

A journal file stores BookKeeper transaction logs.

Fencing occurs when a reader forces a ledger to close, preventing any further entries from being written to the ledger.

A record is a sequence of bytes (plus some metadata) written to a BookKeeper ledger. Records are also known as entries.