Prerequisites
Ensure you have the following Python packages installed:
- Pulsejet
- Ollama
- NLTK
Setting Up the RAG System
First, let’s import the necessary libraries and set up our Pulsejet client.
Indexing Documents
Now, let’s create functions to chunk text and index documents.
Searching and Generating Answers
Now, let’s create functions to search for similar documents and generate answers.
Running the RAG Application
Finally, let’s put it all together.
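As a rough illustration of how the pieces fit together, here is a minimal sketch of the index, search, and answer flow. It assumes `client` is an already-constructed Pulsejet client and uses only the two client calls named in this post. The collection name, the embedding model name, the metadata layout, and the helper names (`index_chunks`, `build_prompt`, `answer`) are assumptions made for the example. The shape of the search results is not described here, so the caller supplies a hypothetical `extract_texts` function that turns results into a list of strings.

```python
from typing import Any, Callable, Iterable, List

EMBEDDING_MODEL = "nomic-embed-text"  # assumed; any Ollama embedding model works
LLM_MODEL = "llama3.1"                # the post mentions LLaMA 3.1
COLLECTION_NAME = "rag_demo"          # hypothetical collection name

Embedder = Callable[[str], List[float]]


def ollama_embed(text: str) -> List[float]:
    """Embed text with Ollama (imported lazily so the sketch loads without it)."""
    import ollama
    return ollama.embeddings(model=EMBEDDING_MODEL, prompt=text)["embedding"]


def ollama_generate(prompt: str) -> str:
    """Generate a completion with Ollama and return the response text."""
    import ollama
    return ollama.generate(model=LLM_MODEL, prompt=prompt)["response"]


def index_chunks(client: Any, chunks: Iterable[str], source: str,
                 embed: Embedder = ollama_embed) -> int:
    """Insert one vector per chunk and return how many were inserted."""
    count = 0
    for i, chunk in enumerate(chunks):
        # Assumed metadata layout: keep the chunk text so it can be used as context.
        meta = {"source": source, "chunk_id": str(i), "content": chunk}
        client.insert_single(COLLECTION_NAME, embed(chunk), meta)
        count += 1
    return count


def build_prompt(question: str, contexts: List[str]) -> str:
    """Combine the retrieved chunks and the question into a single prompt."""
    joined = "\n\n".join(contexts)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:")


def answer(client: Any, question: str,
           extract_texts: Callable[[Any], List[str]], limit: int = 3,
           embed: Embedder = ollama_embed,
           generate: Callable[[str], str] = ollama_generate) -> str:
    """Embed the question, retrieve similar chunks, and generate an answer."""
    results = client.search_single(COLLECTION_NAME, embed(question),
                                   limit=limit, filter=None)
    return generate(build_prompt(question, extract_texts(results)))
```

Passing `embed` and `generate` as parameters keeps the flow easy to try with stand-in functions before Ollama and Pulsejet are running.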
- Document Indexing with Pulsejet:
  - We use client.insert_single(collection_name, embedding, meta) to insert each document chunk’s embedding and metadata into Pulsejet.
  - The insert_single method stores the vector (embedding) along with its associated metadata.
- Vector Search with Pulsejet:
  - We use client.search_single(collection_name, query_embedding, limit=limit, filter=None) to find similar documents.
  - This method performs a similarity search in the vector space and returns the most relevant documents, up to the given limit.
- Embedding Generation with Ollama:
  - We use ollama.embeddings(model=EMBEDDING_MODEL, prompt=text) to generate embeddings for both documents and queries.
  - The vector size is determined automatically from the embedding model’s output.
- Text Generation with Ollama:
  - We use ollama.generate(model=LLM_MODEL, prompt=prompt) to generate answers based on the retrieved context.
  - This uses the LLaMA 3.1 model to produce the response.
- Collection Management:
  - The ensure_collection_exists() function checks whether the collection already exists before attempting to create it, which avoids a redundant create call.
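The indexing notes above talk about document chunks, so here is one possible way to produce them. This is a minimal sketch, not necessarily the chunking used in this post: it groups whole sentences into chunks of at most `max_chars` characters. The function names and the default size are assumptions. The default splitter is a naive regular expression so the example runs on its own; since NLTK is listed in the prerequisites, `nltk.tokenize.sent_tokenize` can be passed as `splitter` instead, provided its tokenizer data has been downloaded.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_text(text: str, max_chars: int = 500,
               splitter: Callable[[str], List[str]] = split_sentences) -> List[str]:
    """Group consecutive sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in splitter(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries keeps each chunk readable when it is later pasted into the prompt as context.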
Remember to replace "files/" with the path to your document folder. If you want to use the Art Deco building files, download them from the rag_art_deco GitHub repository and place them in your files2/ folder.
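For reference, reading the documents from that folder could look like the sketch below. It assumes the documents are plain UTF-8 `.txt` files; the `load_documents` helper and the `DOCS_DIR` constant are names made up for this example.

```python
from pathlib import Path
from typing import Dict

DOCS_DIR = "files/"  # replace with the path to your document folder


def load_documents(folder: str = DOCS_DIR, pattern: str = "*.txt") -> Dict[str, str]:
    """Return a mapping of file name to file contents for matching files."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Document folder not found: {root}")
    # Sort the paths so documents are always processed in the same order.
    return {p.name: p.read_text(encoding="utf-8") for p in sorted(root.glob(pattern))}
```

Failing early on a missing folder makes a wrong path obvious before any embeddings are computed.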