3 Ways Vector Databases Take Your LLM Use Cases to the Next Level
Tim saw someone wearing elegant metallic-finish, double-colored leather shoes with colored laces. He wanted to ask about the brand, but he couldn't find the person after the meeting. Tim was sure these were not regular 'off-the-shelf' shoes; they seemed special. He went to his favorite merchandise marketplace and typed into the search box: "metallic finish burgundy, black, formal leather shoes with double-colored laces in size 10, male." He was thrilled to find the shoe he wanted as the third result on the list, and he bought matching socks right away.
This is an example of how semantic search can help businesses such as e-commerce, retail, and marketplaces drive customers toward the 'action', the purchase. It would have been challenging with a traditional keyword search.
Behind the success of semantic search, vector databases are among the real heroes, delivering high performance even at high query volumes.
Vector databases:
Vector databases treat vectors, or embeddings, as first-class citizens. They gained popularity because they can store embeddings of unstructured data, such as images, text, video, audio, or event logs, and make it easy to perform semantic searches across them.
If the solution is designed well, vector databases combined with LLMs can handle large-scale, high-dimensional data, enabling more nuanced, context-aware, and efficient natural-language-understanding applications.
Vector databases + Large Language Models
Vector databases and LLMs complement each other very well. Below are the three most effective ways in which vector databases amplify LLM use cases and ensure better ROI.
1. As a knowledge base providing enterprise 'context', aka RAG (Retrieval-Augmented Generation). In this case, the vector database acts as a knowledge extension for the LLM and can be queried to retrieve existing, similar information (context) from the knowledge base. This also eliminates the need to use sensitive enterprise data to train or fine-tune the LLM. Every time a question is asked (a minimal sketch follows this list):
· The question is converted into an embedding (using the same embedding model used to index the knowledge base).
· The embedding is used to retrieve relevant context (documents) from the vector database.
· An LLM prompt is created with the help of this context.
· A response is generated. The enterprise-specific context helps the LLM produce accurate output.
Use cases: Document discovery, Chatbots, Q&A.
Key benefits: Avoids using sensitive data for model training/fine-tuning. Cheaper than fine-tuning LLMs. Keeps the knowledge base updated in near real time.
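To make the flow concrete, here is a minimal, self-contained sketch. The `embed` function, `ToyVectorDB` class, and `llm_generate` callable are toy stand-ins assumed for illustration, not any specific product's API; a real system would use an embedding model, a vector database client, and your LLM provider's SDK.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: a deterministic random unit vector per text.
    A real system would call an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    vec = rng.standard_normal(dim)
    return vec / np.linalg.norm(vec)

class ToyVectorDB:
    """In-memory stand-in for a vector database (cosine-similarity search)."""
    def __init__(self) -> None:
        self.vectors: list[np.ndarray] = []
        self.texts: list[str] = []

    def add(self, text: str) -> None:
        self.vectors.append(embed(text))
        self.texts.append(text)

    def search(self, query_vec: np.ndarray, top_k: int = 3) -> list[str]:
        if not self.vectors:
            return []
        # Dot product of unit vectors == cosine similarity.
        scores = np.stack(self.vectors) @ query_vec
        return [self.texts[i] for i in np.argsort(scores)[::-1][:top_k]]

def answer_with_rag(question: str, db: ToyVectorDB, llm_generate) -> str:
    query_vec = embed(question)                    # 1. embed the question
    context = "\n".join(db.search(query_vec))      # 2. retrieve enterprise context
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Question: {question}")             # 3. build a grounded prompt
    return llm_generate(prompt)                    # 4. generate the response

# Demo with a pass-through "LLM" that just echoes the grounded prompt:
db = ToyVectorDB()
db.add("Refunds are processed within 5 business days.")
print(answer_with_rag("What is the refund policy?", db, llm_generate=lambda p: p))
```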
2. Acting as long-term LLM memory. This makes it possible to retrieve the last N messages relevant to the current message from the entire chat history, which can span multiple simultaneous sessions and historical interactions. It also helps to bypass the context-length (token) limitations of the LLM and gives you more control. The key steps (sketched below) are:
· The user asks a query.
· The system retrieves the relevant stored messages from the vector database and passes them to the LLM along with the query.
· The LLM response is generated and shared with the user. The response embedding (with history) is also stored in the vector database.
Use cases: Knowledge discovery, Chatbots.
Key benefits: Bypasses the token-length limitations of the LLM and helps handle conversation topic changes.
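A minimal sketch of the memory loop, reusing the toy `embed` and `ToyVectorDB` stand-ins from the RAG sketch above; `llm_generate` again stands in for your LLM call.

```python
def chat_with_memory(user_msg: str, memory: ToyVectorDB, llm_generate,
                     top_k: int = 5) -> str:
    # 1. Retrieve the top-k past messages most relevant to the new one,
    #    across all sessions, instead of carrying the entire chat history.
    relevant_history = memory.search(embed(user_msg), top_k=top_k)

    # 2. Only that relevant slice goes into the prompt, keeping it
    #    within the model's token limit.
    prompt = ("Relevant conversation history:\n"
              + "\n".join(relevant_history)
              + f"\n\nUser: {user_msg}\nAssistant:")
    reply = llm_generate(prompt)

    # 3. Store both turns so future queries can retrieve them.
    memory.add(f"User: {user_msg}")
    memory.add(f"Assistant: {reply}")
    return reply
```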
3. Caching previous LLM queries and responses. When a query is fired, create its embedding and do a cache lookup before invoking the LLM. This ensures a quick response and saves money on computation as well as LLM usage. The key steps (sketched below) are:
· The user asks a question.
· An embedding is created and a cache lookup is performed.
· If the information is available in the cache, the response is served and the LLM is not invoked.
· If the information is unavailable in the cache, the LLM is invoked and the response is stored in the cache.
Use cases: All use cases such as Document discovery, Information retrieval, Chatbots, and Q&A.
Key benefits: Speeds up responses and optimizes both computational resources and LLM invocation costs.
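A minimal semantic-cache sketch, again reusing the toy `embed` from the first sketch. Unlike an exact-match cache, it treats a query as a hit if its embedding is close enough to a cached query's. The similarity threshold is an assumed value you would tune for your data.

```python
SIMILARITY_THRESHOLD = 0.9  # assumed cutoff for a cache hit; tune per use case

class SemanticCache:
    """Cache keyed by embedding similarity rather than exact text match."""
    def __init__(self) -> None:
        self.entries: list[tuple[np.ndarray, str]] = []

    def lookup(self, query_vec: np.ndarray) -> str | None:
        for vec, response in self.entries:
            if float(vec @ query_vec) >= SIMILARITY_THRESHOLD:
                return response            # cache hit: skip the LLM entirely
        return None

    def store(self, query_vec: np.ndarray, response: str) -> None:
        self.entries.append((query_vec, response))

def cached_answer(question: str, cache: SemanticCache, llm_generate) -> str:
    query_vec = embed(question)
    cached = cache.lookup(query_vec)
    if cached is not None:
        return cached                      # fast path: no LLM cost
    response = llm_generate(question)      # slow path: invoke the LLM
    cache.store(query_vec, response)
    return response
```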
This list doesn’t end here. A vector database works like a buddy to the LLM, helping you optimize security, cost, and performance across use cases. The solution needs to be designed around your business and its specific use cases.
If you are considering investing in a vector database, or planning to use the vector features available in an existing database such as Redis, you should plan to address multiple system-design concerns related to vector databases and LLM use cases, including but not limited to:
· Keyword vs. semantic search: Keyword search is good for finding results that match specific terms, while semantic search is good for finding results relevant to the user's intent. Depending on your use cases, you may need a strategy that leverages the best of both (see the hybrid-search sketch after this list).
· Creating embeddings with cost and time efficiency at scale, without overpaying for GPUs (vs. CPUs) while avoiding latency in the system.
· A strategy for multimodal search, which lets users search for information using multiple modalities, such as text, images, audio, and combinations of them.
· Whether your use case needs precise search results or exploratory ones.
· Whether to invest in a new vector database or use dense_vector support in Elasticsearch, OpenSearch, or Solr.
· Integration with existing ML models and MLOps. How do you ensure models keep performing at their best even at increased scale? You may need to revisit the data pipeline and enable real-time streaming (Kafka/Kinesis/Flink), since real-time or near-real-time predictions, fraud detection, recommendations, and search results will need it.
· There are business-driven issues to consider too: in a marketplace scenario, for example, if a seller adds a new product, how should the system treat it with respect to search and recommendations?
· Many more...
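On the keyword-vs-semantic point above, one common way to combine the two is reciprocal rank fusion (RRF): run a keyword search and a vector search separately, then merge the two ranked lists. The sketch below is a generic illustration, and the document IDs are made up.

```python
def reciprocal_rank_fusion(keyword_ranked: list[str],
                           semantic_ranked: list[str],
                           k: int = 60) -> list[str]:
    """Merge two ranked result lists; k=60 is the commonly used RRF constant."""
    scores: dict[str, float] = {}
    for ranking in (keyword_ranked, semantic_ranked):
        for rank, doc_id in enumerate(ranking):
            # A document earns more score the higher it ranks in each list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "shoe-42" ranks highly in both lists, so fusion puts it first.
print(reciprocal_rank_fusion(
    ["shoe-42", "shoe-07", "sock-13"],   # e.g., BM25 keyword results
    ["shoe-42", "sock-13", "belt-99"]))  # e.g., vector-search results
```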
With the technology still evolving, it's beneficial to have experts who can guide you through system design while preventing redundant future costs.
Credit and gratitude: Thanks to discussions, podcasts, and blogs from the Pinecone team, Dmitry Kan, Sam Partee, Edo Liberty, and many others.