Replacing byte[] and ByteBuffer compareTo methods with JDK builtins #199

danielcranford · 2022-07-06T15:49:58Z

Resolves #198

…tins.

danielcranford · 2022-07-06T16:10:00Z

Ahh, missed the fact Arrays.compare(byte[], byte[]) was added in JDK 9. ByteByffer.compareTo() still works. ByteBuffer.wrap(byte[]).compareTo(ByteBuffer.wrap(byte[]) works as an alternative to Arrays.compare though adds allocation cost to the compare

cgilliard

Test cases cover these changes.

benalexau · 2023-02-03T05:23:10Z

This is a more complicated proposal than it first appears.

The PR modifies two implementations of the BufferProxy.compare(T, T) method. This method provides comparator functionality should null be passed in as the comparator to Dbi.iterate(Txn, KeyRange, Comparator). A comparator is needed for the CursorIterator to facilitate its iteration logic. CursorIterator is a relatively uncommonly used API and there is no equivalent in the LMDB native library.

LMDB native code does not call the provided comparator. While a user is free to provide their own Java-side comparator when creating a Dbi, most do not given this would impose a significant overhead. The majority of users carefully configure the Dbi with the correct flags for their key type and let native code on the LMDB side handle ordering. The JavaDocs for CursorIterable make reference to this:

If iterating over keys stored with {@link DbiFlags#MDB_INTEGERKEY} you must provide a Java comparator when constructing the {@link Dbi} or this class. It is more efficient to use a comparator only with this class, as this avoids LMDB calling back into Java code to perform the integer key comparison.

Where we run into issues is when using an MDB_INTEGERKEY with a CursorIterator. This is because what we need is a Java comparator that will treat buffers containing integer keys in the same manner as LMDB would. ByteBuffer.compareTo(..) delegates to Byte.compare(byte, byte) and Java bytes are signed. So we end up with a different order.

There are a few options available, such as better clarifying the requirements in the JavaDocs of BufferProxy.compare(T, T), requiring the user to instantiate the CursorIterator with an integer-aware comparator, or automatically doing the latter if the Dbi is using an integer key. However I am wondering how much of an issue this presently is given the limited occasions these comparators are called and the existing code already handles integers.

benalexau · 2023-02-03T22:06:51Z

After thinking about this some more I believe the better approach is to store a mandatory Comparator in the Dbi at construction time. If the user does not provide a Comparator - which they very rarely would - the BufferProxy can select an optimal implementation based on the Dbi flags. So if there are integer keys it would use an approach like the existing code, otherwise it can fall back to a built-in Java method.

Storing a Comparator in the Dbi is the most natural approach given all keys in the Dbi must align with that Comparator for the CursorIterator to function correctly and there is no reason to provide a different Comparator for a particular invocation of Dbi.iterate(Txn, KeyRange, Comparator). Removing the Comparator parameter from the iterate method would remove this overloaded variant and potential API usage confusion.

It would be natural to only accept a single Comparator in a Dbi and use a boolean flag to denote whether it should be invoked from the LMDB native library side. The majority of the time this would be false for efficiency reasons, with the Comparator only used for CursorIterator instances if the user requests one.

The only downside I can presently see to this is it would represent a breaking change by removing the aforementioned method and some minor variations to the Env.openDbi(...) methods. Nevertheless it would allow more efficient Comparators and a more concise and clearer Dbi API. It's worth doing this but I will need to defer it until 0.9.0 to comply with semantic versioning expectations.

I'll leave this PR open as a reminder to address this when preparing 0.9.0.

benalexau · 2023-02-04T04:47:36Z

Following the release of 0.8.3 I switched the target development version to 0.9.0 and implemented my prior comment. The main changes are:

A default Comparator is provided by each BufferProxy (just as it always has been)
A single Comparator must be associated with each Dbi when that database is opened
There is no longer support for a different Comparator when creating a CursorIterator
Clearer JavaDocs about how the Comparator is used
ByteBufferProxy detects if DbiFlags includes MDB_INTEGERKEY and uses the same custom logic as it always has (otherwise it uses the ByteBuffer.compare(..) method)
Various internals have been cleaned up (eg Txn no longer exposes the BufferProxy etc)

I decided against making other BufferProxy implementations (eg ByteArrayProxy) similarly differentiate between the existing custom Comparators and any inbuilt alternatives. In the case of Netty, we are already delegating to its compareTo method. In the case of byte[]s, the previously-mentioned absence of Arrays.compare(byte[], byte[]) in Java 8, plus the allocation costs of delegating through to ByteBuffers, would likely exceed any performance gain. If it is ultimately required it is easy enough to implement without further API changes, and/or for end users to provide their own BufferProxy implementation.

Replacing byte[] and ByteBuffer proxy compareTo methods with JDK buil…

c782f20

…tins.

benalexau force-pushed the master branch 2 times, most recently from 0a78414 to a1162e1 Compare December 10, 2022 01:29

cgilliard approved these changes Feb 2, 2023

View reviewed changes

benalexau closed this in 3524995 Feb 4, 2023

benalexau added a commit that referenced this pull request Feb 4, 2023

JavaDoc correction (#199)

9e46f9d

benalexau added a commit that referenced this pull request Feb 4, 2023

JavaDoc corrections (#199)

7a9b30e

benalexau added a commit that referenced this pull request Feb 4, 2023

JavaDoc corrections (#199)

0dc6780

ivenhov mentioned this pull request Jan 22, 2024

ByteBuffer sorting changed in 0.9.0 #230

Closed

at055612 mentioned this pull request Mar 5, 2025

The new MDB_UNSIGNEDKEY DbiFlag should probably be the default behaviour #249

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Replacing byte[] and ByteBuffer compareTo methods with JDK builtins #199

Replacing byte[] and ByteBuffer compareTo methods with JDK builtins #199

Uh oh!

danielcranford commented Jul 6, 2022

Uh oh!

danielcranford commented Jul 6, 2022

Uh oh!

cgilliard left a comment

Uh oh!

benalexau commented Feb 3, 2023

Uh oh!

benalexau commented Feb 3, 2023

Uh oh!

benalexau commented Feb 4, 2023

Uh oh!

Uh oh!

Replacing byte[] and ByteBuffer compareTo methods with JDK builtins #199

Replacing byte[] and ByteBuffer compareTo methods with JDK builtins #199

Uh oh!

Conversation

danielcranford commented Jul 6, 2022

Uh oh!

danielcranford commented Jul 6, 2022

Uh oh!

cgilliard left a comment

Choose a reason for hiding this comment

Uh oh!

benalexau commented Feb 3, 2023

Uh oh!

benalexau commented Feb 3, 2023

Uh oh!

benalexau commented Feb 4, 2023

Uh oh!

Uh oh!