"Just Upload Everything & AI Will Do Everything"

"Just Upload Everything & AI Will Do Everything"

I often have clients who confidently talk about generative AI like they're experts, but in reality, they're headline readers.

They use the right buzzwords but apply them in the wrong way, often assuming AI is far more magical than it really is.

For example, a common one:

"I’ll just upload everything I have, and AI will do what I need."

As if, magically, a Large Language Model will know exactly what "I need" means and then be able to take all the necessary steps outside of the model to get it done.

That’s not how it works.

Let’s break this down, in the OpenAI vernacular.

You can't “train” GPT.

GPT is a Large Language Model; the acronym stands for Generative Pre-trained Transformer.

That means the model has already been trained (that's the "pre") on massive datasets by OpenAI, at great expense.

We're talking millions of dollars, enormous computing power, and a highly specialised team of machine learning engineers.

You can't just "upload your stuff" and expect the model to learn from it.

What this pre-training has done is teach the model how humans string letters and words together, and (to a degree) what is likely to be fact or not.

But that knowledge is limited to the datasets it was trained on.

What you can do is augment the existing model.

Using OpenAI as the example vendor, there are a few ways to do this.

You can augment or fine-tune a model, but there’s nuance to that.

There are four core ways to tailor a pre-trained model like GPT to your needs:

1. In-Chat Context Prompting

You paste in information or reference materials at the start of a conversation.

It's easy and quick, but there are token limits (roughly 1 token ≈ 0.75 words).

If you exceed the token limit, content gets cut off.
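To make the rule of thumb above concrete, here's a rough sketch that estimates token count from word count. This is only the ~0.75 words-per-token approximation mentioned above, not a real tokeniser; actual counts vary by model and text.

```python
def estimate_tokens(text: str) -> int:
    # Rule of thumb: 1 token is roughly 0.75 words,
    # i.e. about 4 tokens for every 3 words.
    words = len(text.split())
    return round(words / 0.75)

# Quick check before pasting a long document into a chat:
prompt = "Summarise the attached quarterly report in three bullet points."
print(estimate_tokens(prompt))
```

If the estimate is anywhere near the model's context limit, expect the end of your pasted material to be cut off or rejected.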

2. Custom GPTs (ChatGPT Pro Feature)

You upload files, define instructions, and give it a personality or role.

Still, it has file-size and token limits, so it doesn't have infinite "memory" of everything that has been put into it.

3. RAG (Retrieval-Augmented Generation) via API

You don't feed the model everything you have, because it won't fit in the model's token limits.

Instead, you store your data elsewhere (like a database or more technically a "vector store").

When someone asks a question, your system retrieves the relevant chunk of information from the database and injects that into the prompt.

Again, you still have token limits per request, but this way you only feed in what is necessary for the context.
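The retrieve-then-inject pattern described above can be sketched in a few lines. A real system would use embeddings and a vector store; here a toy word-overlap score stands in for similarity search, and all the function names and sample documents are invented for illustration.

```python
def score(question: str, chunk: str) -> int:
    # Toy relevance score: count shared words between question and chunk.
    # A real RAG system would compare embedding vectors instead.
    return len(set(question.lower().split()) & set(chunk.lower().split()))

def retrieve(question: str, chunks: list[str]) -> str:
    # Return the single most relevant chunk, not the whole knowledge base.
    return max(chunks, key=lambda c: score(question, c))

def build_prompt(question: str, chunks: list[str]) -> str:
    # Inject only the retrieved chunk into the prompt sent to the model.
    context = retrieve(question, chunks)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refund policy: refunds are available within 30 days of purchase.",
    "Shipping: orders ship within 2 business days.",
]
print(build_prompt("What is your refund policy?", docs))
```

The key point: the model never sees "everything you have", only the small slice your system decides is relevant to this one question.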

4. Fine-Tuning

This is where you actually do train a version of the model, but only on a narrow domain where you want the model to behave in very specific ways or speak a certain "language" (e.g. legal, medical, or brand voice).

For this you still need lots of clean, structured training data, and money to pay for the compute to train it.

It doesn’t make the model “know everything you uploaded”, it makes it better at handling certain types of prompts.

For instance, if you fine-tune a model on customer service emails, it won’t “know” your company’s entire history or every policy document you have, but it will get better at responding in your tone of voice and following your standard support procedures when prompted.
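To show what "clean, structured training data" looks like in practice, here's a sketch of the JSONL format OpenAI's chat fine-tuning expects: one example conversation per line, each with a list of messages. The conversation content itself is invented for illustration.

```python
import json

# Each fine-tuning example is a short conversation demonstrating
# the tone and procedure you want the model to learn.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a friendly support agent."},
        {"role": "user", "content": "My order hasn't arrived yet."},
        {"role": "assistant", "content": "Sorry to hear that! Could you "
         "share your order number so I can check its status?"},
    ]},
]

# Write one JSON object per line (JSONL), the format the API ingests.
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

You would typically need hundreds or thousands of examples like this, all consistent in tone and structure, before fine-tuning pays off.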

Again, you’re not training from scratch.

You’re fine-tuning a small layer on top of the pre-trained model.

It’s a powerful tool, but it’s not magic, and not necessary for most business use cases.

And much of the same result can often be achieved with the previous three methods.

So no, uploading "everything you have" won't make AI magically do what you want.

But with the right combination of prompting, retrieval, or fine-tuning, you can simulate some pretty smart behaviour.

Yet, this still needs a lot of human input to set up.

Not to mention the extra work if you need AI to interact with other platforms you use, like Microsoft or Google.

Once set up, it will save you a lot of time and money, but the full "I'll just upload everything I have, and AI will do what I need" isn't here just yet!

Don’t let headlines and hype shape your expectations.

Instead, let understanding shape your AI strategy.

Need help choosing the right path?

I do this every day.

Let’s chat.
