This is the fifth release of BookKeeper as an Apache Top Level Project!
The 4.5.0 release incorporates hundreds of fixes, improvements, and new features since the previous major release, 4.4.0, which was released over a year ago. It is a big milestone for the Apache BookKeeper community, converging three main branches (Salesforce, Twitter and Yahoo).
Apache BookKeeper users are encouraged to upgrade to 4.5.0. The technical details of this release are summarized below.
The main features in 4.5.0 cover the following areas:
- Dependencies Upgrade
- Public API
Here is a list of dependencies upgraded in 4.5.0:
- Moved the development from Java 7 to Java 8.
- Upgrade Protobuf to
- Upgrade ZooKeeper from
- Upgrade Netty to
- Upgrade Guava to
- Upgrade SLF4J to
- Upgrade Codahale to
Prior to this release, Apache BookKeeper supported only simple DIGEST-MD5 authentication.
With this release of Apache BookKeeper, a number of features are introduced that can be used, together or separately, to secure a BookKeeper cluster.
The following security features are currently supported.
- Authentication of connections to bookies from clients, using either TLS or SASL (Kerberos).
- Authentication of connections from clients, bookies, and autorecovery daemons to ZooKeeper, when using ZooKeeper-based ledger managers.
- Encryption of data transferred between bookies and clients, between bookies and autorecovery daemons using
It’s worth noting that these security features are optional: non-secured clusters are supported, as well as a mix of authenticated, unauthenticated, encrypted and non-encrypted clients.
For more details, have a look at BookKeeper Security.
There are multiple new client features introduced in 4.5.0.
The Ledger API is the low-level API provided by BookKeeper for interacting with ledgers in a BookKeeper cluster.
It is simple but not flexible about ledger id or entry id generation. Apache BookKeeper introduces LedgerHandleAdv as an extension of the existing LedgerHandle for advanced usage. The new LedgerHandleAdv allows applications to provide the ledger-id and to assign the entry-id when adding entries.
See Ledger Advanced API for more details.
Long Poll is a main feature that DistributedLog uses to achieve low-latency tailing.
This big feature has been merged back in 4.5.0 and is available to BookKeeper users.
This feature includes two main changes: one is LastAddConfirmed piggyback, and the other is a new long poll read API.
The first change piggybacks the latest LastAddConfirmed on the read response, so your LastAddConfirmed is advanced automatically as long as your read traffic continues. It significantly reduces the traffic spent on explicitly polling LastAddConfirmed and hence reduces the end-to-end latency.
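The piggyback idea can be sketched with a toy reader model. This is a minimal illustration, not the BookKeeper client: the class `LacPiggybackSketch`, the `ReadResponse` type and its fields are hypothetical names chosen for this example.

```java
/** Toy model of a reader that learns LastAddConfirmed (LAC) from read responses. */
final class LacPiggybackSketch {

    /** Hypothetical read response: the entry that was read plus the LAC piggybacked on it. */
    static final class ReadResponse {
        final long entryId;
        final long piggybackedLac;

        ReadResponse(long entryId, long piggybackedLac) {
            this.entryId = entryId;
            this.piggybackedLac = piggybackedLac;
        }
    }

    // The highest entry id this reader currently knows to be confirmed.
    private long lastAddConfirmed;

    LacPiggybackSketch(long initialLac) {
        this.lastAddConfirmed = initialLac;
    }

    /** Advances the local LAC if the response carries a newer one; it never moves backwards. */
    void onReadResponse(ReadResponse response) {
        lastAddConfirmed = Math.max(lastAddConfirmed, response.piggybackedLac);
    }

    /** In this model, entries up to and including the LAC may be read. */
    boolean canRead(long entryId) {
        return entryId <= lastAddConfirmed;
    }

    long getLastAddConfirmed() {
        return lastAddConfirmed;
    }

    public static void main(String[] args) {
        LacPiggybackSketch reader = new LacPiggybackSketch(0L);
        reader.onReadResponse(new ReadResponse(0L, 7L));
        System.out.println("LAC after one read: " + reader.getLastAddConfirmed());
    }
}
```

Because each response refreshes the reader's view of LastAddConfirmed, a reader that keeps reading rarely needs a separate request just to learn how far it may read.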
The second change provides a new long poll read API, allowing tailing reads without polling LastAddConfirmed every time readers exhaust the known entries.
Although the long poll API brings great latency improvements on tailing reads, it is still a very low-level primitive.
It is still recommended to use a high-level API (e.g. the DistributedLog API) for tailing and streaming use cases.
See Streaming Reads for more details.
Prior to 4.5.0, the LAC was only advanced when subsequent entries were added. If no subsequent entries were added, the last entry written was not visible to readers until the ledger was closed. High-level clients (e.g. DistributedLog) or applications had to work around this by writing some sort of control records to advance the LAC.
In 4.5.0, a new explicit LAC feature is introduced to periodically advance the LAC if no subsequent entries are added. This feature can be enabled by setting explicitLacInterval to a positive value.
There are a lot of performance-related bug fixes and improvements in 4.5.0. These changes include:
- Upgraded Netty from 3.x to 4.x to leverage buffer pooling and reduce memory copies.
- Moved development from Java 7 to Java 8 to take advantage of Java 8 features.
- A lot of improvements around scheduling and threading on
- Delay ensemble change to improve tail latency.
- Parallel ledger recovery to improve the recovery speed.
We outline the following four changes below. For a complete list of performance improvements, please check out the full list of changes at the end.
Netty 4 Upgrade
The major performance improvement introduced in 4.5.0 is the upgrade of Netty from 3.x to 4.x.
For more details, please read the upgrade guide for Netty-related tips when upgrading BookKeeper from 4.4.0 to 4.5.0.
Delay Ensemble Change
Ensemble Change is a feature that Apache BookKeeper uses to achieve high availability. However, it is an expensive metadata operation, especially when Apache BookKeeper is deployed across multiple data centers: losing a data center causes a churn of metadata operations due to ensemble changes.
Delay Ensemble Change is introduced in 4.5.0 to overcome this problem. With this feature enabled, an Ensemble Change only occurs when clients can’t receive enough valid responses to satisfy the ack-quorum constraint. This feature improves the tail latency.
To enable this feature, please set
true on your clients.
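The decision rule described above can be sketched as a small function. This is a simplified model under stated assumptions, not BookKeeper's implementation: it only counts failed bookies in the write quorum, it assumes that without the feature any failed write triggers an ensemble change, and the class and parameter names are hypothetical.

```java
/** Simplified model of when a write failure forces an ensemble change. */
final class DelayEnsembleChangeSketch {

    /**
     * @param writeQuorum   number of bookies each entry is written to
     * @param ackQuorum     number of acknowledgements required before an add is confirmed
     * @param failedBookies number of bookies in the write quorum that failed to respond
     * @param delayEnabled  whether the delay-ensemble-change behaviour is switched on
     * @return true if the model says the ensemble has to change now
     */
    static boolean shouldChangeEnsemble(int writeQuorum, int ackQuorum,
                                        int failedBookies, boolean delayEnabled) {
        if (failedBookies <= 0) {
            return false; // nothing failed, nothing to replace
        }
        if (!delayEnabled) {
            return true; // assumption: any failure triggers a change when the feature is off
        }
        // With the feature on, change only when the healthy bookies
        // can no longer satisfy the ack quorum.
        return writeQuorum - failedBookies < ackQuorum;
    }

    public static void main(String[] args) {
        // One failed bookie out of a write quorum of 3 with an ack quorum of 2.
        System.out.println("feature off: " + shouldChangeEnsemble(3, 2, 1, false));
        System.out.println("feature on:  " + shouldChangeEnsemble(3, 2, 1, true));
    }
}
```

In this model a single slow or failed bookie no longer costs a metadata update as long as the ack quorum is still reachable, which is where the tail-latency benefit comes from.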
Parallel Ledger Recovery
BookKeeper clients recover entries one by one during ledger recovery. If a ledger has a very large volume of traffic, it will have a large number of entries to recover when client failures occur. BookKeeper introduces parallel ledger recovery in 4.5.0 to allow batch recovery, which improves ledger recovery speed.
To enable this feature, please set
true on your clients. You can also set
to control the batch size of recovery read.
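As a rough illustration of batch recovery, the sketch below splits a range of entry ids into batches that could each be read in one round instead of one entry at a time. The helper `batches` and its parameters are hypothetical, and the sketch assumes the range to recover is known up front purely for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a range of entry ids into fixed-size batches; illustrative only. */
final class RecoveryBatchSketch {

    /**
     * Returns inclusive {start, end} pairs covering [firstEntry, lastEntry],
     * each spanning at most batchSize entries. An empty range yields no batches.
     */
    static List<long[]> batches(long firstEntry, long lastEntry, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long start = firstEntry; start <= lastEntry; start += batchSize) {
            long end = Math.min(start + batchSize - 1, lastEntry);
            result.add(new long[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] batch : batches(0L, 9L, 4)) {
            System.out.println("recover entries " + batch[0] + ".." + batch[1]);
        }
    }
}
```

With a batch size of 1 the model degenerates to the entry-by-entry behaviour; larger batches trade more in-flight reads for fewer round trips.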
Prior to 4.5.0, bookies could only be configured with one journal device. If you wanted high write bandwidth, you could RAID multiple disks into one device and mount that device as the journal directory. However, because there is only one journal thread, this approach doesn’t actually improve the write bandwidth.
BookKeeper introduces multiple journal directories support in 4.5.0. Users can configure multiple devices for journal directories.
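One way to picture why several journals help: if each ledger is pinned to one journal, writes for different ledgers can proceed on different journal threads and devices. The sketch below assumes a simple modulo mapping from ledger id to journal index; the mapping and the names are assumptions for illustration, not a statement of how bookies actually assign ledgers.

```java
/** Hypothetical mapping from a ledger id to one of several journals. */
final class JournalSelectionSketch {

    /** Returns an index in [0, journalCount), also for negative ledger ids. */
    static int journalIndex(long ledgerId, int journalCount) {
        if (journalCount <= 0) {
            throw new IllegalArgumentException("journalCount must be positive");
        }
        // floorMod keeps the result non-negative, unlike the % operator.
        return (int) Math.floorMod(ledgerId, (long) journalCount);
    }

    public static void main(String[] args) {
        int journals = 3;
        for (long ledgerId = 0; ledgerId < 6; ledgerId++) {
            System.out.println("ledger " + ledgerId + " -> journal " + journalIndex(ledgerId, journals));
        }
    }
}
```

Pinning a ledger to a single journal keeps the entries of that ledger in one ordered log while still spreading the overall write load.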
To enable this feature, please use
journalDirectories rather than
Apache BookKeeper supports pluggable metadata stores. By default, it uses Apache ZooKeeper as its metadata store. Among the ZooKeeper-based ledger manager implementations, HierarchicalLedgerManager is the most popular and widely adopted ledger manager. However, it has a major limitation: it assumes that the ledger-id is a 32-bit integer. It limits the number of ledgers to
LongHierarchicalLedgerManager is introduced to overcome this limitation.
See Ledger Manager for more details.
Weight-based placement policy
Region-Aware placement policies are the two available placement policies in the BookKeeper client. They place ensembles based
on users’ configured network topology. However, they both assume that all nodes are equal.
Weight-based placement is introduced in 4.5.0 to improve the existing placement policies.
Weight-based placement was not built as a separate policy. It is built into the existing placement policies.
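To illustrate what dropping the "all nodes are equal" assumption means, here is a minimal sketch of weight-proportional selection. It is not the BookKeeper placement code: the class, the bookie names and the choice of free disk space as the weight are assumptions for this example (the BOOKKEEPER-950 entry in the change list mentions storage capacity).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Picks a node with probability proportional to its weight; illustrative only. */
final class WeightedSelectionSketch {

    /**
     * @param weights non-negative weight per node, e.g. free disk space
     * @param roll    a value in [0, 1), such as Random.nextDouble()
     * @return the node whose cumulative weight interval contains roll * totalWeight
     */
    static String pick(Map<String, Long> weights, double roll) {
        long total = 0;
        for (long weight : weights.values()) {
            total += weight;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        double target = roll * total;
        long cumulative = 0;
        String last = null;
        for (Map.Entry<String, Long> entry : weights.entrySet()) {
            cumulative += entry.getValue();
            last = entry.getKey();
            if (target < cumulative) {
                return last;
            }
        }
        return last; // only reached if rounding pushes target to the upper edge
    }

    public static void main(String[] args) {
        Map<String, Long> freeSpaceGb = new LinkedHashMap<>();
        freeSpaceGb.put("bookie-a", 100L);
        freeSpaceGb.put("bookie-b", 300L);
        System.out.println("picked " + pick(freeSpaceGb, new Random().nextDouble()));
    }
}
```

With these weights the second node is chosen about three times as often as the first, so larger nodes absorb proportionally more ledgers.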
If you are using
Region-Aware, you can simply enable
weight-based placement by setting
Customized Ledger Metadata
A Map<String, byte[]> is introduced in ledger metadata in 4.5.0. Clients are now allowed to pass in a key/value map when creating ledgers.
This customized ledger metadata can later be used by a user-defined placement policy. This extends the flexibility of the BookKeeper API.
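A minimal sketch of how such a map might be built and later consumed. The key name `tenant`, the helper methods and the fallback value are hypothetical, and the sketch assumes the values are byte arrays; it does not call any BookKeeper API.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Builds and reads a key/value metadata map; names are hypothetical. */
final class CustomMetadataSketch {

    /** Builds the kind of map a client could attach when creating a ledger. */
    static Map<String, byte[]> tenantMetadata(String tenant) {
        Map<String, byte[]> metadata = new HashMap<>();
        metadata.put("tenant", tenant.getBytes(StandardCharsets.UTF_8));
        return metadata;
    }

    /** What a user-defined policy might do: decode a value to steer its decision. */
    static String tenantOf(Map<String, byte[]> metadata) {
        byte[] raw = metadata.get("tenant");
        return raw == null ? "default" : new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, byte[]> metadata = tenantMetadata("analytics");
        System.out.println("ledger belongs to tenant " + tenantOf(metadata));
    }
}
```

Keeping values as raw bytes leaves the encoding up to the application, so both sides need to agree on it (UTF-8 in this sketch).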
Add Prometheus stats provider
Add more tools in BookieShell
BookieShell is the tool provided by Apache BookKeeper to operate clusters. There are multiple important tools introduced in 4.5.0, for example,
For the complete list of commands in BookieShell, please read the BookKeeper CLI tool reference.
Full list of changes
- [BOOKKEEPER-552] - 64 Bits Ledger ID Generation
- [BOOKKEEPER-553] - New LedgerManager for 64 Bits Ledger ID Management in ZooKeeper
- [BOOKKEEPER-588] - SSL support
- [BOOKKEEPER-873] - Enhance CreatedLedger API to accept ledgerId as input
- [BOOKKEEPER-949] - Allow entryLog creation even when bookie is in RO mode for compaction
- [BOOKKEEPER-965] - Long Poll: Changes to the Write Path
- [BOOKKEEPER-997] - Wire protocol change for supporting long poll
- [BOOKKEEPER-1017] - Create documentation for ZooKeeper ACLs
- [BOOKKEEPER-1086] - Ledger Recovery - Refactor PendingReadOp
- [BOOKKEEPER-1087] - Ledger Recovery - Add a parallel reading request in PendingReadOp
- [BOOKKEEPER-1088] - Ledger Recovery - Add a ReadEntryListener to callback on individual request
- [BOOKKEEPER-1089] - Ledger Recovery - allow batch reads in ledger recovery
- [BOOKKEEPER-1092] - Ledger Recovery - Add Test Case for Parallel Ledger Recovery
- [BOOKKEEPER-1093] - Piggyback LAC on ReadResponse
- [BOOKKEEPER-1094] - Long Poll - Server and Client Side Changes
- [BOOKKEEPER-1095] - Long Poll - Client side changes
- [BOOKKEEPER-852] - Release LedgerDescriptor and master-key objects when not used anymore
- [BOOKKEEPER-903] - MetaFormat BookieShell Command is not deleting UnderReplicatedLedgers list from the ZooKeeper
- [BOOKKEEPER-907] - for ReadLedgerEntriesCmd, EntryFormatter should be configurable and HexDumpEntryFormatter should be one of them
- [BOOKKEEPER-908] - Case to handle BKLedgerExistException
- [BOOKKEEPER-924] - addEntry() is susceptible to spurious wakeups
- [BOOKKEEPER-927] - Extend BOOKKEEPER-886 to LedgerHandleAdv too (BOOKKEEPER-886: Allow to disable ledgers operation throttling)
- [BOOKKEEPER-933] - ClientConfiguration always inherits System properties
- [BOOKKEEPER-938] - LedgerOpenOp should use digestType from metadata
- [BOOKKEEPER-939] - Fix typo in bk-merge-pr.py
- [BOOKKEEPER-940] - Fix findbugs warnings after bumping to java 8
- [BOOKKEEPER-952] - Fix RegionAwarePlacementPolicy
- [BOOKKEEPER-955] - in BookKeeperAdmin listLedgers method currentRange variable is not getting updated to next iterator when it has run out of elements
- [BOOKKEEPER-956] - HierarchicalLedgerManager doesn't work for ledgerid of length 9 and 10 because of order issue in HierarchicalLedgerRangeIterator
- [BOOKKEEPER-958] - ZeroBuffer readOnlyBuffer returns ByteBuffer with 0 remaining bytes for length > 64k
- [BOOKKEEPER-959] - ClientAuthProvider and BookieAuthProvider Public API used Protobuf Shaded classes
- [BOOKKEEPER-976] - Fix license headers with "Copyright 2016 The Apache Software Foundation"
- [BOOKKEEPER-980] - BookKeeper Tools doesn't process the argument correctly
- [BOOKKEEPER-981] - NullPointerException in RackawareEnsemblePlacementPolicy while running in Docker Container
- [BOOKKEEPER-984] - BookieClientTest.testWriteGaps tested
- [BOOKKEEPER-986] - Handle Memtable flush failure
- [BOOKKEEPER-987] - BookKeeper build is broken due to the shade plugin for commit ecbb053e6e
- [BOOKKEEPER-988] - Missing license headers
- [BOOKKEEPER-989] - Enable travis CI for bookkeeper git
- [BOOKKEEPER-999] - BookKeeper client can leak threads
- [BOOKKEEPER-1013] - Fix findbugs errors on latest master
- [BOOKKEEPER-1018] - Allow client to select older V2 protocol (no protobuf)
- [BOOKKEEPER-1020] - Fix Explicit LAC tests on master
- [BOOKKEEPER-1021] - Improve the merge script to handle github reviews api
- [BOOKKEEPER-1031] - ReplicationWorker.rereplicate fails to call close() on ReadOnlyLedgerHandle
- [BOOKKEEPER-1044] - Entrylogger is not readding rolled logs back to the logChannelsToFlush list when exception happens while trying to flush rolled logs
- [BOOKKEEPER-1047] - Add missing error code in ZK setData return path
- [BOOKKEEPER-1058] - Ignore already deleted ledger on replication audit
- [BOOKKEEPER-1061] - BookieWatcher should not do ZK blocking operations from ZK async callback thread
- [BOOKKEEPER-1065] - OrderedSafeExecutor should only have 1 thread per bucket
- [BOOKKEEPER-1071] - BookieRecoveryTest is failing due to a Netty4 IllegalReferenceCountException
- [BOOKKEEPER-1072] - CompactionTest is flaky when disks are almost full
- [BOOKKEEPER-1073] - Several stats provider related changes.
- [BOOKKEEPER-1074] - Remove JMX Bean
- [BOOKKEEPER-1075] - BK LedgerMetadata: more memory-efficient parsing of configs
- [BOOKKEEPER-1076] - BookieShell should be able to read the 'FENCE' entry in the log
- [BOOKKEEPER-1077] - BookKeeper: Local Bookie Journal and ledger paths
- [BOOKKEEPER-1079] - shell lastMark throws NPE
- [BOOKKEEPER-1098] - ZkUnderreplicationManager can build up an unbounded number of watchers
- [BOOKKEEPER-1101] - BookKeeper website menus not working under https
- [BOOKKEEPER-1102] - org.apache.bookkeeper.client.BookKeeperDiskSpaceWeightedLedgerPlacementTest.testDiskSpaceWeightedBookieSelectionWithBookiesBeingAdded is unreliable
- [BOOKKEEPER-1103] - LedgerMetadataCreateTest bug in ledger id generation causes intermittent hang
- [BOOKKEEPER-1104] - BookieInitializationTest.testWithDiskFullAndAbilityToCreateNewIndexFile testcase is unreliable
- [BOOKKEEPER-612] - RegionAwarePlacement Policy
- [BOOKKEEPER-748] - Move fence requests out of read threads
- [BOOKKEEPER-757] - Ledger Recovery Improvement
- [BOOKKEEPER-759] - bookkeeper: delay ensemble change if it doesn't break ack quorum requirement
- [BOOKKEEPER-772] - Reorder read sequnce
- [BOOKKEEPER-874] - Explict LAC from Writer to Bookies
- [BOOKKEEPER-881] - upgrade surefire plugin to 2.19
- [BOOKKEEPER-887] - Allow to use multiple bookie journals
- [BOOKKEEPER-922] - Create a generic (K,V) map to store ledger metadata
- [BOOKKEEPER-935] - Publish sources and javadocs to Maven Central
- [BOOKKEEPER-937] - Upgrade protobuf to 2.6
- [BOOKKEEPER-944] - Multiple issues and improvements to BK Compaction.
- [BOOKKEEPER-945] - Add counters to track the activity of auditor and replication workers
- [BOOKKEEPER-946] - Provide an option to delay auto recovery of lost bookies
- [BOOKKEEPER-961] - Assing read/write request for same ledger to a single thread
- [BOOKKEEPER-962] - Add more journal timing stats
- [BOOKKEEPER-963] - Allow to use multiple journals in bookie
- [BOOKKEEPER-964] - Add concurrent maps and sets for primitive types
- [BOOKKEEPER-966] - change the bookieServer cmdline to make conf-file and option co-exist
- [BOOKKEEPER-968] - Entry log flushes happen on log rotation and cause long spikes in IO utilization
- [BOOKKEEPER-970] - Bump zookeeper version to 3.5
- [BOOKKEEPER-971] - update bk codahale stats provider version
- [BOOKKEEPER-998] - Increased the max entry size to 5MB
- [BOOKKEEPER-1001] - Make LocalBookiesRegistry.isLocalBookie() public
- [BOOKKEEPER-1002] - BookieRecoveryTest can run out of file descriptors
- [BOOKKEEPER-1003] - Fix TestDiskChecker so it can be used on /dev/shm
- [BOOKKEEPER-1004] - Allow bookie garbage collection to be triggered manually from tests
- [BOOKKEEPER-1007] - Explicit LAC: make the interval configurable in milliseconds instead of seconds
- [BOOKKEEPER-1008] - Move to netty4
- [BOOKKEEPER-1010] - Bump up Guava version to 20.0
- [BOOKKEEPER-1022] - Make BookKeeperAdmin implement AutoCloseable
- [BOOKKEEPER-1039] - bk-merge-pr.py ask to run findbugs and rat before merge
- [BOOKKEEPER-1046] - Avoid long to Long conversion in OrderedSafeExecutor task submit
- [BOOKKEEPER-1048] - Use ByteBuf in LedgerStorageInterface
- [BOOKKEEPER-1050] - Cache journalFormatVersionToWrite when starting Journal
- [BOOKKEEPER-1051] - Fast shutdown for GarbageCollectorThread
- [BOOKKEEPER-1052] - Print autorecovery enabled or not in bookie shell
- [BOOKKEEPER-1053] - Upgrade RAT maven version to 0.12 and ignore Eclipse project files
- [BOOKKEEPER-1055] - Optimize handling of masterKey in case it is empty
- [BOOKKEEPER-1056] - Removed PacketHeader serialization/deserialization allocation
- [BOOKKEEPER-1063] - Use executure.execute() instead of submit() to avoid creation of unused FutureTask
- [BOOKKEEPER-1066] - Introduce GrowableArrayBlockingQueue
- [BOOKKEEPER-1068] - Expose ByteBuf in LedgerEntry to avoid data copy
- [BOOKKEEPER-1069] - If client uses V2 proto, set the connection to always decode V2 messages
- [BOOKKEEPER-1083] - Improvements on OrderedSafeExecutor
- [BOOKKEEPER-1084] - Make variables finale if necessary
- [BOOKKEEPER-1085] - Introduce the AlertStatsLogger
- [BOOKKEEPER-1090] - Use LOG.isDebugEnabled() to avoid unexpected allocations
- [BOOKKEEPER-1096] - When ledger is deleted, along with leaf node all the eligible branch nodes also should be deleted in ZooKeeper.
- [BOOKKEEPER-390] - Provide support for ZooKeeper authentication
- [BOOKKEEPER-391] - Support Kerberos authentication of bookkeeper
- [BOOKKEEPER-575] - Bookie SSL support
- [BOOKKEEPER-670] - Longpoll Read & Piggyback Support
- [BOOKKEEPER-912] - Allow EnsemblePlacementPolicy to choose bookies using ledger custom data (multitenancy support)
- [BOOKKEEPER-928] - Add custom client supplied metadata field to LedgerMetadata
- [BOOKKEEPER-930] - Option to disable Bookie networking
- [BOOKKEEPER-941] - Introduce Feature Switches For controlling client and server behavior
- [BOOKKEEPER-948] - Provide an option to add more ledger/index directories to a bookie
- [BOOKKEEPER-950] - Ledger placement policy to accomodate different storage capacity of bookies
- [BOOKKEEPER-969] - Security Support
- [BOOKKEEPER-983] - BookieShell Command for LedgerDelete
- [BOOKKEEPER-991] - bk shell - Get a list of all on disk files
- [BOOKKEEPER-992] - ReadLog Command Enhancement
- [BOOKKEEPER-1019] - Support for reading entries after LAC (causal consistency driven by out-of-band communications)
- [BOOKKEEPER-1034] - When all disks are full, start Bookie in RO mode if RO mode is enabled
- [BOOKKEEPER-1067] - Add Prometheus stats provider
- [BOOKKEEPER-932] - Move to JDK 8
- [BOOKKEEPER-931] - Update the committers list on website
- [BOOKKEEPER-996] - Apache Rat Check Failures
- [BOOKKEEPER-1012] - Shade and relocate Guava
- [BOOKKEEPER-1027] - Cleanup main README and main website page
- [BOOKKEEPER-1038] - Fix findbugs warnings and upgrade to 3.0.4
- [BOOKKEEPER-1043] - Upgrade Apache Parent Pom Reference to latest version
- [BOOKKEEPER-1054] - Add gitignore file
- [BOOKKEEPER-1059] - Upgrade to SLF4J-1.7.25
- [BOOKKEEPER-1060] - Add utility to use SafeRunnable from Java8 Lambda
- [BOOKKEEPER-1070] - bk-merge-pr.py use apache-rat:check goal instead of rat:rat
- [BOOKKEEPER-1091] - Remove Hedwig from BookKeeper website page
- [BOOKKEEPER-967] - Create new testsuite for testing RackAwareEnsemblePlacementPolicy using ScriptBasedMapping.
- [BOOKKEEPER-1045] - Execute tests in different JVM processes
- [BOOKKEEPER-1064] - ConcurrentModificationException in AuditorLedgerCheckerTest
- [BOOKKEEPER-1078] - Local BookKeeper enhancements for testability
- [BOOKKEEPER-1097] - GC test when no WritableDirs
- [BOOKKEEPER-943] - Reduce log level of AbstractZkLedgerManager for register/unregister ReadOnlyLedgerHandle