🔮 Moving beyond RAG

In this issue:

  1. 2.9x Lower Latency with Prompt Compression
  2. Unified Structure Learning
  3. Is it RAG? Is it FT? No, it’s RAFT!


Meet your new AI-powered data analyst!

Telescope Labs makes quality insights and Data Science more accessible by simplifying the "data to action" journey for everyone.

Want to empower your teams to develop better products and services with the help of AI? Click on the button below and try it out for free.


1. LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Watching: LLMLingua-2 (paper)

What problem does it solve? Prompts are the primary interface for steering Large Language Models (LLMs). However, as prompts become more detailed to guide the model effectively, they also become longer, which drives up latency and token costs and introduces redundancy. Existing approaches to prompt compression often rely on information entropy estimates from a causal language model, but such estimates only consider leftward context, can miss essential information, and are not explicitly aligned with the compression objective.

How does it solve the problem? The proposed approach addresses the limitations of existing prompt compression methods by introducing a data distillation procedure. This procedure derives knowledge from an LLM to compress prompts without losing crucial information. Additionally, the authors introduce an extractive text compression dataset to support the compression task. By formulating prompt compression as a token classification problem and using a Transformer encoder architecture, the model captures essential information from the full bidirectional context, ensuring the faithfulness of the compressed prompt to the original one.
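
To make the token-classification framing concrete, here is a minimal sketch of how an encoder-based compressor can be used at inference time. This is not the authors' code: the checkpoint name, the "keep" label index, and the fixed keep-rate are illustrative assumptions, and the real LLMLingua-2 pipeline handles additional details that this sketch omits.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed checkpoint name: any encoder fine-tuned to label tokens as
# "keep" vs. "drop" would work in its place.
MODEL_NAME = "microsoft/llmlingua-2-xlm-roberta-large-meetingbank"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
model.eval()

def compress(prompt: str, keep_rate: float = 0.5) -> str:
    """Keep the tokens the bidirectional encoder scores as most informative."""
    enc = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits              # (1, seq_len, num_labels)
    keep_scores = logits.softmax(-1)[0, :, 1]     # assumes label 1 == "keep"
    k = max(1, int(keep_rate * keep_scores.numel()))
    keep_idx = keep_scores.topk(k).indices.sort().values  # preserve original order
    kept_ids = enc["input_ids"][0, keep_idx]
    return tokenizer.decode(kept_ids, skip_special_tokens=True)

print(compress("You are a helpful assistant. Please carefully read the ..."))
```

Because the classifier is a bidirectional encoder rather than a causal LM, every keep/drop decision can take the whole prompt into account, which is exactly the faithfulness argument made above.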

What's next? As prompt-based interaction with LLMs becomes increasingly prevalent, efficient and effective prompt compression techniques will be essential for maintaining performance while minimizing computational costs. Further research could explore the application of this approach to a wider range of tasks and LLMs, as well as investigating the potential for integrating prompt compression into the LLM training process itself.


2. mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

Watching: DocOwl 1.5 (paper/code)

What problem does it solve? Multimodal Large Language Models (MLLMs) have shown impressive capabilities in understanding and reasoning about visual documents like forms, receipts, charts, and webpages. However, current MLLMs often struggle with fully capturing the rich structural information present in these documents. Understanding the layout, spatial relationships, and hierarchical organization of elements is crucial for accurately interpreting the semantics of text-rich images.

How does it solve the problem? The researchers propose Unified Structure Learning, which combines structure-aware parsing tasks and multi-grained text localization tasks across various domains. They introduce H-Reducer, a vision-to-text module that preserves layout information while efficiently reducing the length of visual features. This enables the LLM to process high-resolution images more effectively. Additionally, they construct DocStruct4M, a comprehensive training set with structure-aware text sequences and multi-grained text-bounding box pairs, and DocReason25K, a high-quality reasoning tuning dataset for detailed explanations in the document domain.
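
For intuition, below is a conceptual PyTorch sketch of an H-Reducer-style connector: a convolution that merges groups of horizontally adjacent visual features (shrinking the sequence the LLM has to read) and then projects the result into the LLM embedding space. The dimensions, merge factor, and names are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class HReducerSketch(nn.Module):
    """Conceptual sketch of an H-Reducer-style vision-to-text connector."""

    def __init__(self, vis_dim: int = 1024, llm_dim: int = 4096, merge: int = 4):
        super().__init__()
        # 1 x merge kernel: combines `merge` neighbors along the width only,
        # so rows (and hence the vertical layout / reading order) stay intact.
        self.conv = nn.Conv2d(vis_dim, vis_dim,
                              kernel_size=(1, merge), stride=(1, merge))
        self.proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, patches: torch.Tensor, grid_h: int, grid_w: int) -> torch.Tensor:
        # patches: (batch, grid_h * grid_w, vis_dim) from the vision encoder;
        # grid_w is assumed divisible by the merge factor.
        b, n, d = patches.shape
        x = patches.transpose(1, 2).reshape(b, d, grid_h, grid_w)
        x = self.conv(x)                          # (b, d, grid_h, grid_w // merge)
        x = x.flatten(2).transpose(1, 2)          # back to (b, shorter_seq, d)
        return self.proj(x)                       # tokens the LLM can consume

reducer = HReducerSketch()
feats = torch.randn(1, 32 * 32, 1024)             # e.g. a 32x32 patch grid
print(reducer(feats, grid_h=32, grid_w=32).shape)  # torch.Size([1, 256, 4096])
```

The key design point is that the merging happens only along the horizontal axis, so a high-resolution page becomes 4x fewer tokens without scrambling the spatial structure the model is supposed to learn from.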

What's next? The proposed DocOwl 1.5 model achieves state-of-the-art performance on 10 visual document understanding benchmarks, significantly outperforming previous MLLMs with a 7B LLM. This demonstrates the importance of incorporating structure learning in MLLMs for text-rich image understanding. Future research could explore extending this approach to other domains, such as scientific literature, medical records, or legal documents, where structure plays a vital role in comprehension. Additionally, investigating more efficient architectures and training strategies for structure-aware MLLMs could further enhance their practicality and scalability.


3. RAFT: Adapting Language Model to Domain Specific RAG

Watching: RAFT (paper)

What problem does it solve? Large Language Models (LLMs) are typically pretrained on vast amounts of general-domain data. However, when applying these models to specific domains or tasks, it is often necessary to incorporate additional knowledge that is not present in the pretraining data. This can be achieved through techniques like Retrieval-Augmented Generation (RAG) or fine-tuning. The challenge lies in finding the most effective way to integrate this new knowledge into the pretrained model to improve its performance on the target task.

How does it solve the problem? Retrieval Augmented FineTuning (RAFT) is a training recipe that improves a model's ability to answer questions in an "open-book", in-domain setting. Given a question and a set of retrieved documents, RAFT trains the model to ignore documents that do not help answer the question, referred to as "distractor documents", and to quote verbatim the passage from the relevant document that does. In addition, RAFT uses chain-of-thought-style responses, which further strengthen the model's reasoning.
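
A rough sketch of how such a training example might be assembled is shown below. The field names, prompt template, quote markers, and the probability of including the oracle document alongside the distractors are illustrative assumptions; the authors' exact data format may differ.

```python
import json
import random

def build_raft_example(question: str,
                       oracle_doc: str,
                       distractor_docs: list[str],
                       cot_answer: str,
                       p_keep_oracle: float = 0.8) -> dict:
    """Assemble one RAFT-style training record (hypothetical field names).

    With probability p_keep_oracle the oracle (golden) document is mixed in
    with the distractors; otherwise the model sees only distractors, which
    pushes it to ignore irrelevant context rather than copy from it blindly.
    """
    docs = list(distractor_docs)
    if random.random() < p_keep_oracle:
        docs.append(oracle_doc)
    random.shuffle(docs)

    context = "\n\n".join(f"[Document {i + 1}]\n{d}" for i, d in enumerate(docs))
    prompt = (f"{context}\n\nQuestion: {question}\n"
              "Answer using only the relevant document(s); quote the "
              "supporting passage before giving the final answer.")

    # cot_answer is expected to reason step by step and quote the oracle
    # passage verbatim before stating the final answer.
    return {"prompt": prompt, "completion": cot_answer}

example = build_raft_example(
    question="Which protein does drug X inhibit?",
    oracle_doc="...drug X is a selective inhibitor of kinase Y...",
    distractor_docs=["Unrelated trial results...", "A review of kinase Z..."],
    cot_answer="##begin_quote##drug X is a selective inhibitor of kinase Y"
               "##end_quote## Therefore, the answer is kinase Y.",
)
print(json.dumps(example, indent=2))
```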

What's next? The effectiveness of RAFT in improving the performance of pretrained LLMs in domain-specific RAG tasks has been consistently demonstrated across various datasets, including PubMed, HotpotQA, and Gorilla. This suggests that RAFT could serve as a valuable post-training recipe for adapting pretrained LLMs to in-domain RAG tasks. Future research could explore the applicability of RAFT to a wider range of domains and investigate potential improvements to the technique, such as incorporating more sophisticated retrieval methods or exploring alternative ways of guiding the model's attention to relevant information within the retrieved documents.


Papers of the Week:
