Insight Engines: using LLMs for private files
Without a doubt, the most requested use case for LLMs is the ability to safely query an organization's internal knowledge base.
An Insight Engine is essentially an upgraded, AI-powered search tool that not only searches for information but understands it, enabling it to extract valuable insights from the data. This is achieved through the use of advanced artificial intelligence and natural language processing technology, which can analyze and comprehend both structured and unstructured data.
The core function of an Insight Engine is to enhance search by understanding the user's intent and the context behind a query. Instead of merely returning keyword-based matches, it delivers more accurate and relevant results by discerning the searcher's intent and the relationships among different pieces of information.
One of the primary applications of Insight Engines is in enterprise search. They can scan across a plethora of an organization's data sources, such as databases, websites, emails, and documents, and deliver personalized content.
Knowledge management is another area where Insight Engines can be incredibly beneficial, and they can also be used to drive decision-making. They can extract meaningful insights from data, identify trends, patterns, and relationships that may not be immediately apparent, and support organizations in making data-driven decisions.
In the current age of data proliferation, Insight Engines are becoming increasingly vital for organizations. They provide an efficient way to derive useful insights from the growing data volumes. However, the implementation of an Insight Engine needs to be carefully planned, keeping in view the specific needs of the organization and its existing data infrastructure.
Many applications will have this functionality on their development roadmaps, and Dropbox Dash may be the first to market with an AI offering that indexes stored files and makes them available for querying. This is a reasonable solution for individuals and small businesses, as it has a good level of functionality. Expect Microsoft and Google to follow quite soon with similar capabilities for their own storage products.
However, solutions like Dropbox Dash and other service offerings have their drawbacks. For instance, you are restricted to the LLM provided by the vendor. They may deploy a model with an 8k context window, and if a more capable 100k context window model is available, you may find yourself in a tight spot. Although there are code libraries that work around a short context window by chunking documents and stitching the results together, anyone who has experimented with the Claude 100k model would acknowledge that the output quality noticeably differs.
The pace of LLM development is unprecedented, and many would prefer a system that can swap out LLM models, and possibly switch to local models once they achieve performance similar to OpenAI's and Anthropic's. There is also the issue of trust: what if your enterprise information is highly sensitive, or needs to adhere to data regulations such as the GDPR or the yet-to-be-passed American Data Privacy and Protection Act?
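As a sketch of what that swap-ability might look like in practice, the snippet below defines a minimal, hypothetical provider-agnostic interface in Python. The class and function names are illustrative assumptions rather than any particular library's API; the hosted path assumes the openai Python client and a valid API key.

```python
# Minimal sketch of a provider-agnostic chat interface, so the underlying
# LLM can be swapped without touching the rest of the pipeline.
# Names here (ChatModel, OpenAIChat, LocalChat) are illustrative assumptions.
from typing import Protocol


class ChatModel(Protocol):
    def complete(self, prompt: str) -> str:
        ...


class OpenAIChat:
    """Hosted model via the OpenAI API (assumes `pip install openai` and OPENAI_API_KEY)."""

    def __init__(self, model: str = "gpt-4o-mini"):
        from openai import OpenAI
        self.client = OpenAI()
        self.model = model

    def complete(self, prompt: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content


class LocalChat:
    """Placeholder for a locally hosted open-source model (e.g. served via llama.cpp or vLLM)."""

    def __init__(self, generate_fn):
        self.generate_fn = generate_fn  # any callable: str -> str

    def complete(self, prompt: str) -> str:
        return self.generate_fn(prompt)


def answer(model: ChatModel, question: str) -> str:
    # The rest of the Insight Engine depends only on this narrow interface,
    # so swapping vendors (or going local) is a one-line change at the call site.
    return model.complete(question)
```

The point of the narrow interface is that everything downstream (retrieval, prompting, post-processing) stays untouched when the model underneath changes.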
One novel solution is ThirdAI's PocketLLM, a locally run model (a few million to 1B parameters) that trains on your specific data and can do so entirely offline. This is not an enterprise solution, but for personal knowledge bases it could be perfect. The drawback is that whenever you open the app or change the documents you want to interact with, you need to retrain the model, which takes a few minutes. Still, it is a very interesting concept, and I can see models trained on a specific corpus serving LLMs becoming a viable approach in the future.
Thankfully, there are numerous code solutions available, in various frameworks and languages, that let you host and run your own enterprise service.
Of the many ways to achieve this, the factors you weigh up will affect the execution.
It is also important to touch briefly on vector databases.
In short, vector databases store the embedded representations of your documents that retrieval-augmented LLM workflows rely on, and there are many solutions available, each with its own characteristics and functionality depending on your use case.
One advantage of hosting or managing your own vector database is being able to "Chinese wall" particular datasets off from queries, should regulatory or ethical requirements demand it.
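As a minimal sketch of that kind of segregation, the snippet below keeps two datasets in separate Chroma DB collections so a query can only ever touch the collection the caller is entitled to; the collection names and documents are illustrative assumptions.

```python
# Minimal sketch: segregating datasets into separate Chroma collections so a
# query is only ever run against the collection the caller is permitted to see.
# Assumes `pip install chromadb`; collection names and documents are illustrative.
import chromadb

client = chromadb.PersistentClient(path="./insight_engine_db")

# Two walled-off datasets, e.g. a general knowledge base vs. regulated HR records.
general = client.get_or_create_collection(name="general_docs")
restricted = client.get_or_create_collection(name="hr_records")

general.add(ids=["g1"], documents=["Quarterly product roadmap summary ..."])
restricted.add(ids=["r1"], documents=["Employee grievance case notes ..."])


def search(query: str, allowed_collection):
    # The caller's entitlements decide which collection handle is passed in;
    # the query physically cannot cross into the other dataset.
    return allowed_collection.query(query_texts=[query], n_results=3)


results = search("what is on the roadmap?", general)
```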
Below is a conceptual map of an OpenAI-based local Insight Engine, using the open-source vector database Chroma DB. This architecture would suit most locally hosted use cases and relies on API calls to OpenAI. OpenAI does not retain data sent via the API for model training, but the data used in an API call is stored for 30 days for troubleshooting purposes. Should you substitute a local open-source model for the embeddings, then the data sent to OpenAI is only the small subset related to each query, not the full corpus.
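As an illustrative code sketch of that architecture, the snippet below indexes local files into Chroma DB using OpenAI embeddings and then answers a question with an OpenAI chat model. The model names, file paths, and prompt wording are assumptions for the example, not fixed choices.

```python
# Illustrative sketch of the OpenAI + Chroma DB Insight Engine described above.
# Assumes `pip install chromadb openai`, an OPENAI_API_KEY environment variable,
# and plain-text files in ./docs; model names are assumptions, not requirements.
import os

import chromadb
from chromadb.utils import embedding_functions
from openai import OpenAI

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small",
)

chroma = chromadb.PersistentClient(path="./insight_engine_db")
docs = chroma.get_or_create_collection(name="company_docs", embedding_function=openai_ef)

# 1. Index: embed each local file and store it in the vector database.
for filename in os.listdir("./docs"):
    with open(os.path.join("./docs", filename), encoding="utf-8") as f:
        docs.add(ids=[filename], documents=[f.read()])

# 2. Retrieve: embed the question and pull back the most relevant documents.
question = "What is our remote working policy?"
hits = docs.query(query_texts=[question], n_results=3)
context = "\n\n".join(hits["documents"][0])

# 3. Generate: only the question plus the retrieved snippets are sent to OpenAI.
client = OpenAI()
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```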
Conceptually this will work from small to enterprise scale; at the enterprise end, these solutions are best served via one of the established platforms, such as Azure, AWS, or GCP.
For absolute certainty over your data, locally run open-source, or licensed, LLMs are needed. The trade-off is that these do not perform at the same level as serviced models such as OpenAI's, regardless of how smart your prompt-engineering pipeline is. These models are not very far off, however, and it is foreseeable that the current generation of serviced models will become licensable in the future. Also to be considered is that the computational costs may be restrictive, depending on usage and requirements.
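For illustration, the earlier sketch can be made fully local by swapping both the embedding function and the generation step. The snippet below assumes chromadb's bundled sentence-transformers embedding support and a llama.cpp-compatible GGUF model file on disk; the paths and model names are assumptions about your environment, not requirements.

```python
# Fully local variant: no document or query text leaves the machine.
# Assumes `pip install chromadb sentence-transformers llama-cpp-python`
# and a downloaded GGUF model; paths and model names are assumptions.
import chromadb
from chromadb.utils import embedding_functions
from llama_cpp import Llama

local_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # small embedding model that runs locally
)

chroma = chromadb.PersistentClient(path="./insight_engine_db")
docs = chroma.get_or_create_collection(name="local_docs", embedding_function=local_ef)
docs.add(ids=["policy"], documents=["Staff may work remotely up to three days a week ..."])

question = "How many days can staff work remotely?"
context = "\n\n".join(docs.query(query_texts=[question], n_results=3)["documents"][0])

# Generation with a locally hosted open-source model via the llama.cpp bindings.
llm = Llama(model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf")
output = llm(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:",
    max_tokens=256,
)
print(output["choices"][0]["text"])
```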
The pace of innovation in generative AI and the LLM space is incredibly swift. Numerous predicted advancements will transform these architectures, specifically context window sizes, vector database innovations, multi-modality, and decision engineering requirements. All of these will impact current solutions, with turn-key vendors striving to implement the latest technologies to remain competitive despite their risk-averse, leviathan production environments and slow product roadmaps.
Ultimately, the right solution for your organization may revolve around what's best for the present, with the understanding that it may need to be adjusted until the Insight Engine market stabilizes its core offerings and capabilities.
About the Author
Rudy Nausch is a tech leader specializing in AI, LLM, SDLC, and data-driven strategies. With a knack for demystifying complex concepts, Rudy fosters a collaborative atmosphere that drives innovation and delivers value. He welcomes discussions on how his expertise can shape your organization's future. Visit rudynausch.com for more.