
Binary quantization #82

Merged — 61 commits merged into main on Sep 19, 2024
Conversation

@irevoire (Member) commented Jul 4, 2024

Related issue

Fixes #69

What does this PR do?

  • Introduce a new UnalignedVector generic type to replace the UnalignedF32Slice type we were using before
    • It's parametrized by a Codec
    • Supports the unaligned f32 codec (equivalent to the old UnalignedF32Slice type)
    • Supports the binary quantized slice by converting each f32 to a single 0 or 1 depending on whether its value was negative or positive
    • It can convert the binary quantized slice back to a Vec<f32> quickly using SIMD
  • Provide new distance traits to binary quantize the Euclidean, Manhattan, and Angular distances, respectively named BinaryQuantizedEuclidean, BinaryQuantizedManhattan, and BinaryQuantizedAngular
  • To keep the relevancy good while binary quantizing, we had to re-develop two_means to « un-binary quantize » the vectors before searching for the centroids
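As a rough illustration of what the binary quantized codec does (these helper names are hypothetical, not arroy's actual API), each dimension keeps only its sign, 64 dimensions are packed into one u64 word, and decoding expands each bit back to a ±1.0 float:

```rust
/// Binary quantize a vector: a strictly positive value becomes a 1 bit,
/// anything else a 0 bit, packing 64 dimensions per u64 word.
fn binary_quantize(vector: &[f32]) -> Vec<u64> {
    vector
        .chunks(64)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u64, |word, (i, &v)| {
                if v > 0.0 { word | (1u64 << i) } else { word }
            })
        })
        .collect()
}

/// Decode the packed bits back into floats, mapping 1 → 1.0 and 0 → -1.0.
fn dequantize(words: &[u64], dimensions: usize) -> Vec<f32> {
    (0..dimensions)
        .map(|i| if (words[i / 64] >> (i % 64)) & 1 == 1 { 1.0 } else { -1.0 })
        .collect()
}
```

A 1024-dimension f32 vector (4096 bytes) shrinks to sixteen u64 words (128 bytes) under this scheme; the real implementation additionally uses SIMD for the decoding step.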

Warning

Below is the initial investigation of the different methods we could use to make binary quantization work without losing too much relevancy.

First version

Just binary quantize every operation.

Here's the measured relevancy:
[image: measured relevancy results]

Second version to improve relevancy

One issue we found out is that when binary quantizing vectors, we basically end up creating a bunch of clusters in two dimensions. It would look like this:
[image: binary quantized vectors clustered in two dimensions]

All vectors will end up on one of the four corners of the square.

The more dimensions we have, the more clusters we'll get.
The number of clusters is 2^nb_dimensions.
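This 2^nb_dimensions bound can be checked with a small sketch (illustrative helpers, not part of the PR): every vector collapses to its per-dimension sign pattern, so in 3 dimensions at most 2^3 = 8 distinct codes can ever appear.

```rust
use std::collections::HashSet;

/// Collapse a vector to its sign pattern: one bit per dimension.
fn sign_code(vector: &[f32]) -> u32 {
    vector.iter().enumerate().fold(0u32, |code, (i, &v)| {
        if v > 0.0 { code | (1u32 << i) } else { code }
    })
}

/// Count how many distinct clusters a set of vectors falls into
/// after binary quantization.
fn distinct_clusters(vectors: &[Vec<f32>]) -> usize {
    vectors.iter().map(|v| sign_code(v)).collect::<HashSet<_>>().len()
}
```

Two vectors with very different magnitudes but the same signs, such as (1, 1, 1) and (2, 3, 4), land in the same cluster, which is exactly the information loss the following experiments try to work around.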

By making the internal computation of two_means not use the binary quantized vectors but instead use the real vectors, here are the results:

[image: relevancy results with two_means computed on the real vectors]

Warning

It’s actually worse than the initial version.

Third idea to improve relevancy

In the second solution I was computing the two_means loop with non-binary-quantized distances, which greatly improved the relevancy.
But then the output of two_means was binary quantized again.
We should try to compute the normal on non-binary-quantized distances as well, and then binary quantize this vector right before storing it in the DB.
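The idea can be sketched like this (hypothetical helper names; the real two_means and split-node code in arroy differ): everything about the split plane is computed on the real f32 vectors, and quantization only happens as the very last step, when the normal is written to the database.

```rust
/// Compute the split-plane normal as the difference of the two centroids
/// found by two_means — entirely on the real f32 vectors.
fn hyperplane_normal(centroid_a: &[f32], centroid_b: &[f32]) -> Vec<f32> {
    centroid_a.iter().zip(centroid_b).map(|(a, b)| a - b).collect()
}

/// Binary quantization happens only here, right before the split node
/// is stored in the database.
fn quantize_for_storage(normal: &[f32]) -> Vec<bool> {
    normal.iter().map(|&v| v > 0.0).collect()
}
```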

[image: relevancy results for the third idea]

Note

This improved the relevancy by almost 10 points of recall in the worst case over the previous best solution.

With the bits being [0:1], the relevancy is terrible: [image: relevancy results with [0:1] bits]
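One plausible intuition for why decoding the bits as [-1:1] beats [0:1] (my reading, not stated in the PR): with a {0, 1} decoding, dimensions where both vectors were negative contribute nothing to a dot product, while a {-1, 1} decoding still counts them as agreement. A tiny illustration:

```rust
/// Plain dot product between two f32 vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Decode stored bits back into floats: a 1 bit becomes 1.0, a 0 bit
/// becomes `zero_value` (0.0 for a [0:1] decoding, -1.0 for [-1:1]).
fn decode(bits: &[bool], zero_value: f32) -> Vec<f32> {
    bits.iter().map(|&b| if b { 1.0 } else { zero_value }).collect()
}
```

For two identical sign patterns like [0, 0, 1], the [0:1] decoding only scores the single shared positive dimension, whereas the [-1:1] decoding scores agreement on all three dimensions.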

Fourth solution to improve relevancy

Store the non-binary-quantized distance in the SplitNode in the database directly.
This greatly increases the size of the database (still way less than the non-binary-quantized distance, though).
The search may become slower as well.

The results are bad and I can't explain why:
[image: relevancy results for the fourth solution]

I made another branch in case we want to investigate further: #84


In conclusion

The best version is the third one.

The next steps are:

  • Optimize all the binary quantized versions
  • Merge + make a release
  • Merge in Meilisearch + find a way to change the distances => We'll talk about that with Louis in two weeks, after our vacations

Next step in Arroy:

  1. Add the size of the DB in the relevancy benchmarks
  2. Overfetch search results (between x3 and x6)
  3. Compare ourselves to qdrant
  4. Optimize performances

@irevoire irevoire marked this pull request as ready for review September 16, 2024 14:52
@irevoire irevoire added enhancement New feature or request breaking Something that will break in the next release indexing Everything related to indexing performance labels Sep 16, 2024
@Kerollmops (Member) left a comment:

It looks perfect to me 👌 That is a wonderful job that you've done here @irevoire 👏
Thank you!

@Kerollmops Kerollmops merged commit 2386594 into main Sep 19, 2024
8 checks passed
@Kerollmops Kerollmops deleted the binary-quantization branch September 19, 2024 13:16