Graph-based vector indices explained through the "FES theorem"

7 semihsalihoglu 1 7/23/2025, 6:53:00 PM blog.kuzudb.com ↗

Comments (1)

semihsalihoglu · 7h ago
I wrote a blog post on the HNSW vector index design, perhaps the most popular vector index design adopted by databases at this point. The post is based on several lectures I gave in a graduate course at UWaterloo last fall. This is intended for people who are interested in understanding how these indices work internally.

My goal was to explain the intuitions behind HNSW indices as a natural relaxation of two prior indices: kd trees and the (not much appreciated) sa trees.

I also place these three vector indices in a framework that I call the "FES Theorem", which states that any vector index design can provide at most two of the following three properties: - Fast: returns vectors that are similar to a query vector q quickly. - Exact: correctly returns the most similar vectors to q (instead of "approximate" indices that can make mistakes) - Scalable: can index vectors with large number of dimensions, e.g., 1000s of dimensions.

Kd trees, sa trees, and HNSW satisfy each 2 possible combinations of these 3 properties.

Needless to say, I intentionally picked the term "FES Theorem" to sound like the famous "CAP Theorem". Fes (Turkish) or a fez (English), just like cap, is a headdress. You can see a picture in the post.

I hope you find the explanation of HNSW as a sequence of relaxation of kd trees useful.

Enjoy!