Improved Search Accuracy with New Embedding Models

Embedding Awesomeness: How We're Powering Up Our AI at Visma Spcs

I have been fortunate enough to land a spot as a Data Science Intern at Visma Spcs in Växjö, Sweden. Through our recent AI work at Visma Spcs, I've had the opportunity to experiment with the newly released embedding models from OpenAI. The results were great, and I'm writing here to share our thoughts with you.

What are Embeddings and Why Do They Matter?

Embeddings are a powerful technique in natural language processing (NLP) that transform words, phrases, or even entire documents into numerical vectors. These vectors live in a high-dimensional space where the distances and relationships between them capture the semantic meaning of the original text.

By representing text as numbers, embeddings enable computers to understand language more like humans do, leading to more accurate and sophisticated tasks like search, translation, and question answering. The closer embeddings are to each other, the more semantically similar the text is considered.
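The "closer means more similar" idea is usually measured with cosine similarity. A minimal sketch, using tiny made-up 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: close to 1.0 for similar directions, near 0 for unrelated."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings" -- illustrative only, not output from a real model
cat = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
plane = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, dog))    # high: semantically close
print(cosine_similarity(cat, plane))  # low: semantically distant
```

The relevance scores discussed later in this post are similarity scores of exactly this kind, computed between a question's embedding and each article's embedding.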

OpenAI announced the upcoming release of new embedding models in January, and we in the Innovation and Technology team have been eagerly awaiting their availability through Azure OpenAI.

Recently, text-embedding-3-large and text-embedding-3-small were finally made available, and we jumped straight into testing them.

Initial Testing and Impressions

We had previously been using the ada-002 embedding model. While it has served us well, it has had trouble finding the relevant content in certain cases, for example when retrieving documents containing terms like "client," "customer," or "member," which often have overlapping meanings.

These newly released models are a significant step forward. In our testing, they've consistently outperformed ada-002, resulting in a noticeable accuracy increase in the article suggestions they produce.



These boxplots show how text-3-large, text-3-small, and ada-002 (our current model) distribute their relevance scores across the top 15 articles each model suggests as answers to questions that are difficult for ada-002.

We can observe that both text-3-large and text-3-small have a far wider spread in their scores. If we focus on the top 3 choices of each model, we can see that ada-002 has a very limited span in its relevance scoring: it never drops below 0.8, even at the bottom of its lower quartile, and its top-quartile scores are almost identical across rankings.


Meanwhile, if we look at our new models, we notice that their boxes have significant height, meaning their scores are more widely spread. At rank 1, text-3-large's scores range between roughly 0.3 and 0.75. There is also a more distinct step down from rank 1 for both text-3-large and text-3-small, while for ada-002 the step is almost indistinguishable.

It is our overall impression that text-3-large in particular is a big leap forward in the capability of text embeddings. We served the models a whole battery of ambiguous questions, and text-3-large consistently found the correct answer within its top 2 suggestions. We believe this will have a positive impact on our AI projects' performance, while also allowing us to send back less context, since the model reliably surfaces the correct answer at the top.
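The "send back less context" point boils down to top-k retrieval: if the correct article reliably appears at rank 1 or 2, a smaller k suffices. A minimal sketch with made-up vectors (the data is illustrative, not from our pipeline):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query; return the top-k indices."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                     # cosine similarity per document
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(0)
docs = rng.normal(size=(10, 8))        # 10 hypothetical article embeddings
query = docs[3].copy()                 # a query whose embedding matches article 3

print(top_k(query, docs))              # article 3 ranks first
```

With an embedding model that scores like text-3-large, passing only the top 2 articles to the downstream model keeps prompts short without losing the correct answer.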

OpenAI themselves report big improvements on well-known embedding benchmarks.

Both new models have been trained with a technique called Matryoshka Representation Learning, which gives developers the ability to trade off performance against the cost of using embeddings. According to the MTEB benchmark, the default 3072-dimension text-3-large embedding vector can be shortened to 256 dimensions while still beating the full-size 1536-dimension ada-002 embedding in performance.
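In practice, shortening a Matryoshka-style embedding means truncating the vector and re-normalizing it to unit length (the embeddings API for the text-embedding-3 models also accepts a `dimensions` parameter that does this server-side). A minimal sketch, using a random vector in place of a real model output:

```python
import numpy as np

def shorten_embedding(vec, dims):
    """Truncate a Matryoshka-style embedding to `dims` and re-normalize to unit length."""
    v = np.asarray(vec, dtype=float)[:dims]
    return v / np.linalg.norm(v)

# Stand-in for a full-size text-3-large embedding (3072 dimensions)
full = np.random.default_rng(42).normal(size=3072)
full /= np.linalg.norm(full)

short = shorten_embedding(full, 256)
print(short.shape)             # (256,)
print(np.linalg.norm(short))   # ~1.0
```

Shorter vectors mean smaller indexes and faster similarity search, which is where the performance-versus-cost trade-off comes from.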

Next Steps

We are very much looking forward to further testing these new models and bringing them into production. Dimension reduction also opens up possibilities for tuning performance versus cost, which is of high interest to many developers.

Best regards,

Anna at Visma Spcs Innovation & Technology Team
