New Tool Compares SLMs and LLMs, Finds Smaller Models Can Significantly Reduce Costs

Researchers at the University of Michigan have unveiled a tool that compares small language models (SLMs) with large language models (LLMs) such as OpenAI's ChatGPT. Their findings, available on the arXiv preprint server, show that open-source SLMs can deliver conversational responses comparable to those of resource-intensive LLMs at a fraction of the cost.

Bridging the Gap in AI Model Evaluation

The team developed a first-of-its-kind tool, SLaM (Small Language Model Analysis), which evaluates SLMs and compares them to proprietary LLM application programming interfaces (APIs) in terms of performance and cost. The results were presented at the 2024 IEEE International Symposium on Performance Analysis of Systems and Software.

The Rise of Large Language Models

LLMs have revolutionized applications such as virtual assistants, chatbots, and language translation systems due to their impressive language comprehension and generation capabilities. However, the high cost of training these models, which can run into millions of dollars, restricts their development to tech giants. Smaller companies often find themselves dependent on these costly services.

Evaluating SLMs as a Viable Alternative

"A lot of companies such as Duolingo and Slack are incorporating LLMs like OpenAI's GPT-4 into their products. It's important to rigorously examine whether these models are really the best choice for developers and whether small open models could be effective," said Jason Mars, an associate professor of computer science and engineering at the University of Michigan.

While implementing proprietary LLMs offers speed and convenience, it also brings limitations such as reduced customization, data privacy concerns, unreliable performance during peak usage, and high costs. Open-source SLMs present a promising alternative, but until now, there was no systematic way to compare their performance against well-known LLMs.

Introducing SLaM: The Comparative Analysis Tool

The research team designed SLaM to fill this gap, providing an open-source methodology for evaluating the trade-offs—quality, performance, and cost—between SLMs and LLMs. "We created SLaM and made it open source to fill the void in tools that accelerate and automate comparative analysis of open and closed LLMs on a case-by-case basis," Mars explained.
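The article does not detail SLaM's interface, but the kind of case-by-case comparison it automates can be sketched in a few lines of Python. The harness below is a minimal illustration under assumed interfaces; the model callables, the flat per-query price, and the separate quality-judgment step are placeholders, not SLaM's actual API:

```python
import statistics
import time

def benchmark(model_fn, prompts, cost_per_query):
    """Time each prompt through a model and summarize latency and cost.

    model_fn is any callable mapping a prompt string to a response
    string (a local open-source SLM or a wrapper around a proprietary
    LLM API). cost_per_query is a flat placeholder price; real APIs
    typically bill per token.
    """
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        _ = model_fn(prompt)  # response text; quality is judged separately
        latencies.append(time.perf_counter() - start)
    return {
        "median_latency_s": statistics.median(latencies),
        "latency_stdev_s": statistics.stdev(latencies),  # predictability
        "est_cost_usd": cost_per_query * len(prompts),
    }

# Hypothetical usage: run the same prompts through every candidate model.
# results = {name: benchmark(fn, prompts, price)
#            for name, (fn, price) in candidates.items()}
```

Keeping the model behind a plain callable is the point of such a harness: the same prompts, timing, and cost accounting apply whether the candidate is a local open model or a remote paid API.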

The tool was tested on "daily pep talk," a feature of a real-world AI productivity application under development by Myca AI. The feature draws on users' task lists to deliver daily, personalized encouragement and advice.

Key Findings and Implications

The researchers assessed 29 versions of nine distinct SLMs against OpenAI's GPT-4 within the "daily pep talk" setting. While GPT-4 achieved the highest accuracy according to a human evaluation panel, most of the SLMs delivered responses of similar quality with more predictable latency.

"We were surprised by the high-quality answers provided by these small models. Many times, users could not really differentiate between SLM and LLMs," noted Lingjia Tang, an associate professor of computer science and engineering.

Crucially, the SLMs reduced costs by five to 29 times compared to LLMs, depending on the model used. "This finding has big implications for smaller companies trying to maintain competitiveness in this fierce AI race. With SLaM tools, companies can select smaller open-source models that provide high-quality answers but cost much less, reducing their dependencies on tech giants," added Tang.
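For a sense of what that range means at scale, here is a back-of-the-envelope sketch. The per-query price and query volume are invented placeholders; the article gives no absolute cost figures:

```python
# Back-of-the-envelope view of the reported 5x-29x cost reduction.
# All numbers here are invented placeholders, not figures from the study.
llm_cost_per_query = 0.02   # hypothetical paid-API price per query
queries_per_month = 100_000

llm_monthly = llm_cost_per_query * queries_per_month
for factor in (5, 29):
    print(f"{factor}x cheaper: ${llm_monthly / factor:,.0f}/month "
          f"vs ${llm_monthly:,.0f}/month for the LLM")
```

Under these assumed numbers, a 29x reduction turns a roughly $2,000 monthly API bill into about $69.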


