Futurist: Tiny little LLMs at your fingertips?
One of the things I like to do is look at what is coming. In this case, the conversation is about AI, or machine intelligence as I prefer to call it, and where it is headed. First, I want to acknowledge that the landscape of machine intelligence is still fuzzy. What I mean by fuzzy is that some of the initial applications of machine intelligence have already begun to move into the market while others are still taking shape. I am not an artist. However, I can use various Chatbots and their multimodal capabilities to create images. I created my Futurist logo using DALL-E 2. Using multimodal machine intelligence gives me the freedom to produce professional-looking graphics much faster. Today, we will discuss the future of large language models.
In machine intelligence, one of the interesting conversations is about the size of large language models. Most of them are large in terms of the physical memory they occupy, the GPU access they require, and, ultimately, the amount of shared processing power they need to operate. There are several large language models on the Internet today; naming all of them would take more time than I have. The LLMs are backend systems. The front end is normally a "Chatbot." You can use a Chatbot-like interface to connect with large language models and get information. But beyond information, we now have the concept of generative AI. GenAI, or generative machine intelligence, describes systems that can create things. The addition of multimodal machine intelligence allows these systems to interact with you orally and visually, so you can now input information in different ways. You can show the machine intelligence a picture and ask it to tell you what it sees. With a portable machine intelligence system, like Meta's Ray-Ban glasses, you can look at something and ask Meta what you are looking at.
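As a concrete illustration of that picture-in, description-out flow, here is a minimal sketch in Python. It assumes the Hugging Face transformers library and a small open image-captioning model (Salesforce/blip-image-captioning-base); the file name photo.jpg is hypothetical, and none of this is the system Meta's glasses actually run.

```python
# Minimal sketch: "look at a picture and tell me what you see."
# Assumes the transformers library and a small open captioning model.
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

# photo.jpg is a hypothetical local file, e.g. a frame from a wearable camera.
result = captioner("photo.jpg")
print(result[0]["generated_text"])
```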
That level of integration allows me to identify things that I probably wouldn't have been able to identify previously. Because it is multimodal, I can ask things like, "What bird do I hear?" Which leads me to my prognostication. It is an educated guess at this point, but I can likely move it into a hypothesis very quickly; I simply need to develop testing parameters for my prognostication. That is the simple way an educated guess becomes a hypothesis. My hypothesis is this: the power of machine intelligence will be available to more people with the new releases of smaller models that run locally on a cell phone.
In the short run, over the next 12 months, these models will become increasingly prevalent on cell phones. They will focus on supporting the voice assistants in the two major cell phone operating systems, making "Hey Google" and "Hey Siri" much more powerful. You'll be able to ask questions, and the system can query the local machine intelligence rather than go to the Internet. That will radically reduce the response time for many answers. It will also begin the next phase of the Industrial Revolution that machine intelligence brings to us.
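A minimal sketch of that "local first, Internet second" pattern follows, with both the on-device model and the cloud service stubbed out as hypothetical placeholders; neither function represents a real assistant API.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float

def local_model_answer(question: str) -> Answer:
    # Hypothetical stand-in: would invoke the phone's small on-device model.
    return Answer(text="That call is a northern cardinal.", confidence=0.9)

def cloud_answer(question: str) -> Answer:
    # Hypothetical stand-in: would make a network call to a larger model.
    return Answer(text="(answer fetched from the cloud)", confidence=1.0)

def assistant_answer(question: str) -> str:
    local = local_model_answer(question)
    if local.confidence >= 0.8:         # threshold is illustrative
        return local.text               # no network round trip: near-instant
    return cloud_answer(question).text  # slower, but broader knowledge

print(assistant_answer("What bird do I hear?"))
```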
As we move forward and LLMs land on our cell phones, several services will emerge that, even when we don't fully understand their value, we will put ourselves in a position to use and leverage whenever we can. The first is, of course, transcription. The simple concept is recording meetings automatically and transcribing them so that you have notes to review as well as a fallback audio recording. In a conference room with multiple people speaking, you don't always get the highest recording fidelity. There's a lot of noise in a conference room; if the person beside you shuffles papers, it disrupts your recording. Even so, transcription is the first service these on-device models will begin to empower.
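On-device transcription is already possible with open models. Here is a minimal sketch using OpenAI's open-source Whisper speech-to-text model, which runs locally; it assumes the whisper Python package and ffmpeg are installed, and meeting.wav is a hypothetical recording.

```python
import whisper

# Load a small speech-to-text model onto the device; after the initial
# download of the weights, no network connection is needed.
model = whisper.load_model("base")

# Transcribe the meeting recording; the audio file remains as a fallback.
result = model.transcribe("meeting.wav")
print(result["text"])
```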
Another area where having a large language model on your cell phone helps is translation. So now we have transcription and translation. The advantage over existing cloud-based translation systems is that the large language model can use visual input to translate documents, signs, and other physical objects into something you can understand. But it will also be able to translate conversations. Now, the reality of translation is that there has to be a lag. The person speaks in a language you don't personally speak; the system hears it and begins the translation process, then either plays the result through your earbuds or displays it on the screen for you to read. Either way, it adds time to the process, but the value is incredible. With a large language model, knowing that you are going to the Czech Republic, you can preload the language onto your phone's model. We can argue that you wouldn't sound like a native speaker, but you would be able to speak almost every language with a small delay. Again, there will be a lag, but communication with a lag is not bad. As a kid, I remember the pauses in conversations from the moon: Houston would ask the astronauts a question, and a few seconds later (the radio round trip alone takes roughly 2.6 seconds) the astronauts would respond.
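A minimal sketch of that preloading idea, assuming the Hugging Face transformers library and a small open Czech-to-English model (Helsinki-NLP/opus-mt-cs-en); the first run downloads the weights, and after that translation works entirely offline.

```python
from transformers import pipeline

# Downloading the model before the trip "preloads" Czech; afterwards
# translation runs locally with no Internet connection.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-cs-en")

heard = "Kde je nejbližší restaurace?"   # "Where is the nearest restaurant?"
result = translator(heard)
print(result[0]["translation_text"])
```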
I'll end with one last thing that will be incredibly cool once large language models are readily available on every device: integration with your calendar and activities. The large language model can announce emails and incoming phone calls if you're in the car. If you receive a phone call while driving, it can record that call for you and produce a transcription after the fact. It's hard to take notes in a car without hitting somebody, so automated note-taking is a significant value proposition. The other integration piece: you're on a long drive, and the large language model, integrated with your cell phone's mapping system, knows that you're quite a way from home. At noon, it would be able to say, "Are you ready for lunch?" and scan the map for the types of restaurants you like. "Your favorite restaurant is 15 minutes away. Are you willing to wait?"
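A minimal sketch of that context-aware suggestion, with every signal stubbed out as a hypothetical parameter; a real assistant would pull these from the phone's clock, GPS, and mapping service.

```python
from datetime import datetime

def maybe_suggest_lunch(now: datetime, miles_from_home: float,
                        favorite_cuisine: str, minutes_to_favorite: int) -> str | None:
    # Noon and far from home: offer lunch; otherwise stay quiet.
    if now.hour == 12 and miles_from_home > 50:   # thresholds are illustrative
        return (f"Are you ready for lunch? Your favorite {favorite_cuisine} "
                f"restaurant is {minutes_to_favorite} minutes away. "
                f"Are you willing to wait?")
    return None

print(maybe_suggest_lunch(datetime(2024, 6, 1, 12, 5), 120.0, "barbecue", 15))
```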
Such capabilities are already on the market today, and they're only going to get better. I will borrow from the old Chinese proverb, "May you live in interesting times." I will rewrite that and say, "Oh, to live in machine intelligence-assisted times!"