Pulsejet has 7 types of indexes. These indexes have different characteristics and are optimized for different kinds of input.
We are working on a custom index implementation, which will become Pulsejet’s de facto index implementation.
Index Types
FlatL2 - Flat Index with L2 distance calculation
- Exact search using L2 (Euclidean) distance.
FlatIP - Flat Index with Inner Product similarity (equal to Cosine Similarity when vectors are normalized to unit length)
- Exact search using the Inner Product.
HNSW - Hierarchical Navigable Small Worlds
IVFFlat - Inverted File Index - Flat
IVFScalar - Inverted File Index - Scalar
IVFPQ - Inverted File Index - Product Quantized
OPQPQ - Optimized PQ chained with Flat Product Quantization
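To make the difference between the two flat index types concrete, here is a minimal pure-Python sketch of exact (brute-force) search under both metrics. It is not Pulsejet's implementation; the function names `flat_search`, `l2_distance`, `inner_product`, and `normalize` are hypothetical and exist only for illustration. It also shows that inner product ranks results like cosine similarity only after the vectors are normalized to unit length.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]


def flat_search(vectors, query, k, metric="l2"):
    """Exact search: score every stored vector, return the top-k ids.

    metric="l2": smaller distance is better.
    metric="ip": larger inner product is better.
    """
    if metric == "l2":
        scored = [(l2_distance(v, query), i) for i, v in enumerate(vectors)]
    elif metric == "ip":
        # Negate so that an ascending sort puts the largest product first.
        scored = [(-inner_product(v, query), i) for i, v in enumerate(vectors)]
    else:
        raise ValueError(f"unknown metric: {metric!r}")
    scored.sort()
    return [i for _, i in scored[:k]]


vectors = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
query = [1.0, 0.1]

l2_ids = flat_search(vectors, query, k=2, metric="l2")
ip_ids = flat_search(vectors, query, k=2, metric="ip")
cos_ids = flat_search(
    [normalize(v) for v in vectors], normalize(query), k=2, metric="ip"
)
print(l2_ids, ip_ids, cos_ids)
```

With raw vectors, the long vector `[3.0, 3.0]` wins under inner product purely because of its magnitude; after normalization the ranking follows direction only, which is what cosine similarity measures.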
How do I select my index type?
Below you can find how to select each index type, along with its general starting requirements.
Whichever index you select, Pulsejet will optimize your indexes into better ones over time, within the given constraints, unless you disable the optimizer. Learn more about Optimizer Structure.
You can select the FlatL2 index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "FlatL2",
"on_disk": true
},
"optimizer_config": {}
}
You can select the FlatIP index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "FlatIP",
"on_disk": true
},
"optimizer_config": {}
}
You can select the HNSW index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "HNSW",
"on_disk": true
},
"optimizer_config": {}
}
You can select the IVFFlat index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "IVFFlat",
"on_disk": true
},
"optimizer_config": {}
}
You can select the IVFScalar index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "IVFScalar",
"on_disk": true
},
"optimizer_config": {}
}
You can select the IVFPQ index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "IVFPQ",
"on_disk": true
},
"optimizer_config": {}
}
You can select the OPQPQ index at collection creation with:
{
"name": "sift1m",
"vector_config": {
"size": 128,
"index_type": "OPQPQ",
"on_disk": true
},
"optimizer_config": {}
}
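The collection configurations above differ only in the `index_type` field, so they can be generated from one helper. The following is a minimal sketch, assuming the JSON shape shown in the examples; the helper name `collection_config` and the validation rules are hypothetical, and how the payload is sent to Pulsejet (client library or endpoint) is not covered here.

```python
import json

# The seven index type names listed in this section.
INDEX_TYPES = (
    "FlatL2",
    "FlatIP",
    "HNSW",
    "IVFFlat",
    "IVFScalar",
    "IVFPQ",
    "OPQPQ",
)


def collection_config(name, size, index_type, on_disk=True):
    """Build a collection-creation payload shaped like the examples above.

    Hypothetical helper: it only assembles and checks the dictionary,
    it does not talk to a Pulsejet server.
    """
    if index_type not in INDEX_TYPES:
        raise ValueError(
            f"unknown index_type {index_type!r}; expected one of {INDEX_TYPES}"
        )
    if size <= 0:
        raise ValueError("size must be a positive vector dimension")
    return {
        "name": name,
        "vector_config": {
            "size": size,
            "index_type": index_type,
            "on_disk": on_disk,
        },
        # Left empty, as in the examples above.
        "optimizer_config": {},
    }


payload = collection_config("sift1m", 128, "HNSW")
print(json.dumps(payload, indent=2))
```

Validating the name up front catches typos such as `"IVFFLAT"` before the request is ever built, which is cheaper than debugging a rejected collection.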