Create the client
Create the client with different host and port
Creating Collection
Since we will insert vectors into our storage, we first build VectorParams, which describes the vector configuration for our collection.
Build vector configuration for our collection
For the available index types, please see Index Types. Please try out the ANODE index, our new invention.
size indicates the vector size, and index_type is one of the indexes documented at Index Types. Let’s start with HNSW for now.
Next, let’s create a collection. Collections are like tables: they let us keep vectors and their indexes in the same location for data locality.
You can learn more in the storage system documentation.
Create collection
List Collections
Now we can take a look at the existing collections:
List collections
Fetch example data
Now it is time to download the SIFT1M dataset to our system as an example:
Download and separate Sift1M dataset
xq and xb are the query and base vectors, respectively.
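SIFT1M is commonly distributed as .fvecs files, where each vector is stored as a little-endian 32-bit integer dimension followed by that many 32-bit floats. As a rough sketch of what loading such a file involves, assuming that layout (read_fvecs is an illustrative helper name and not part of Pulsejet):

```python
import struct

def read_fvecs(path):
    """Read an .fvecs file into a list of float lists.

    Each record is a little-endian int32 dimension d followed by d float32
    values. read_fvecs is a hypothetical helper, not a Pulsejet function.
    """
    vectors = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            if dim <= 0:
                raise ValueError("invalid vector dimension")
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors
```

For the full dataset a NumPy-based reader would be considerably faster; this version only makes the file layout explicit.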
Now let’s prepare our vector data for insertion:
Prepare vector data for insertion.
Inserting Vectors
Now we are ready for insertion. Let’s insert these vectors:
Insert vectors into Pulsejet
Searching Vectors
Now let’s move on to search. After a short period, the indexes should start to be populated. Let’s use our query vector to get some similarity matches.
Search by a single vector
You can also search multiple vectors at the same time with the client.search_multi method.
limit returns the top-K most similar vectors (in our case, the top 100).
If the distances are too far apart, you may not get exactly 100 results in your experiments. This happens when our clustering is too tight for your data, or when there is not enough vector similarity for the trained index to return that many matches.
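To make the role of limit concrete, here is a minimal exhaustive top-K search in plain Python. It is only a reference for what "top-K most similar" means, assuming Euclidean distance; it is not how Pulsejet searches, and top_k_bruteforce is an illustrative name.

```python
import heapq
import math

def top_k_bruteforce(query, base, limit):
    """Return the `limit` closest (index, distance) pairs by Euclidean distance.

    Exhaustive reference search for illustration only; an index such as HNSW
    approximates this result without scanning every vector.
    """
    scored = ((i, math.dist(query, vec)) for i, vec in enumerate(base))
    return heapq.nsmallest(limit, scored, key=lambda pair: pair[1])

base = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
matches = top_k_bruteforce([0.9, 0.1], base, limit=2)
print([index for index, _ in matches])  # prints [1, 0]
```

Unlike this exhaustive scan, which always returns min(limit, number of vectors) results, an approximate index may return fewer matches than requested.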
Extract vectors
Search output's first match
Bigger datasets and performance
If you are thinking about loading millions or billions of vectors into Pulsejet, it is better to use AsyncPulsejetClient. You can head to Quickstart - Async Client to reduce the time it takes to insert vectors.
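The general idea behind faster asynchronous loading can be sketched with asyncio alone: split the vectors into batches and keep a bounded number of insert requests in flight. Everything below is an assumption for illustration; insert_batch is a placeholder coroutine supplied by the caller and fake_insert simulates it, so none of these names reflect the actual AsyncPulsejetClient API.

```python
import asyncio

def chunked(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

async def insert_all(insert_batch, vectors, batch_size=1000, max_in_flight=4):
    """Insert vectors in batches with at most max_in_flight requests open.

    insert_batch is a placeholder coroutine; it stands in for whatever
    insert call the async client provides.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(batch):
        async with gate:
            return await insert_batch(batch)

    results = await asyncio.gather(
        *(guarded(batch) for batch in chunked(vectors, batch_size))
    )
    return sum(results)

async def fake_insert(batch):
    # Stand-in for a network call; returns how many vectors were "stored".
    await asyncio.sleep(0)
    return len(batch)

print(asyncio.run(insert_all(fake_insert, [[0.0] * 4] * 2500)))  # prints 2500
```

The semaphore caps concurrency so a very large dataset does not open thousands of simultaneous requests; suitable values for batch_size and max_in_flight depend on the deployment.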