From the course: Generative AI vs. Traditional AI
Context windows
- When I was younger, I lived down the street from someone who was great at everything you wanted to be great at in high school. She was a great student, had many friends, and was the lead actress in the school theater. I, meanwhile, led the Computer Enthusiast Social Club while also trying to start a Science Fiction Fan Club. Despite our different interests, we attended the same schools for over a decade. Yet every time I bumped into her, the conversation would start the same way. I would reintroduce myself and remind her that we lived down the street. Then I'd update her on the latest events at the Computer Enthusiast Social Club. I felt like every time our conversation ended, everything we discussed faded from her memory. Then I'd bump into her a week or a month or a year later, and I'd have to start by reintroducing myself. In a sense, the context window for our conversation was very small. We could never have a lengthy conversation because all of our previous conversations were instantly deleted from her memory.

In many ways, large language models suffer from a similar challenge. These systems are not yet powerful enough to maintain the context for all your conversations. So instead, they create a small window to hold your most recent exchanges. That means that if you ask an LLM about the new Star Trek series, the longer the conversation goes, the less of it the model remembers. You can think of it like a window that moves through your conversation: the more you talk, the less it remembers. LLMs try to work around this limitation by making guesses about the facts that have dropped from their memory. It's almost like having a conversation where the LLM stops listening and starts nodding, then tries to make up for this rudeness by guessing about what you already talked about. As you can imagine, this can lead to some awkward hallucinations. When you start the session by asking a question, the system will provide an answer, but if you change the topic, it may forget your previous question.

But these limitations aren't just about being rude. If you're trying to use an LLM to develop a software product, then you need it to remember each of the steps that you took to make improvements. If the system starts to forget where you started, then it's increasingly difficult to develop a product. Also, the more the system knows about you, the more helpful it'll be. Think about any time that you've tried to work with anybody to solve a problem. It would be frustrating to have to repeat everything that you discussed every time you saw that person again.

Now, the way LLMs work is that every time you have a conversation, they create a series of tokens to hold on to that information. Each token can hold a word, a letter, or even a number. Tokens are what the system thinks is the bare minimum it needs to recreate the conversation. Current systems might be able to hold hundreds of thousands of tokens, but in the future, developers hope to create systems that record every token from every conversation you've ever had. These systems will never forget any of your early conversations, so they'll be much more helpful with your latest requests.
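Here's a minimal sketch of that sliding window in Python. Everything in it is illustrative: the one-token-per-word tokenizer is a deliberate simplification (real models use subword tokenizers), and the function names and the token budget are made up for this example rather than taken from any particular model's API.

```python
# A minimal sketch of a sliding context window, assuming a toy
# whitespace tokenizer. Real LLMs use subword tokenizers (e.g. BPE),
# but the trimming idea is the same: once the conversation exceeds
# the token budget, the oldest exchanges fall out of view.

MAX_CONTEXT_TOKENS = 50  # hypothetical; real models allow far more


def count_tokens(text: str) -> int:
    """Toy approximation: one token per whitespace-separated word."""
    return len(text.split())


def trim_to_window(messages: list[str], budget: int = MAX_CONTEXT_TOKENS) -> list[str]:
    """Keep the most recent messages whose total size fits the budget.

    Walks backward from the newest message, so the oldest
    exchanges are the first to be 'forgotten'.
    """
    window: list[str] = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break  # everything earlier than this drops out of memory
        window.append(message)
        used += cost
    window.reverse()  # restore chronological order
    return window


if __name__ == "__main__":
    conversation = [
        "Hi, I'm asking about the new Star Trek series.",
        "Sure! Which season are you interested in?",
        "Season one. Who plays the captain?",
        "Wait, which series were we talking about again?",
    ]
    visible = trim_to_window(conversation, budget=20)
    print(f"{len(visible)} of {len(conversation)} messages still in the window")
```

With the small budget above, only the last two messages survive the trim, so the model can no longer see that the conversation was ever about Star Trek. In practice, systems count tokens with the model's own tokenizer and must also leave room for the reply, but the core behavior is the same: once the budget is spent, the earliest exchanges simply stop being part of what the model can see.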