
Anthropic's latest model can take The Great Gatsby as input


Historically and even today, poor memory has been an impediment to the usefulness of text-generating AI. As a recent piece in The Atlantic aptly puts it, even sophisticated generative text AI like ChatGPT has the memory of a goldfish. Each time the model generates a response, it takes into account only a very limited amount of text — preventing it from, say, summarizing a book or reviewing a major coding project.

But Anthropic’s trying to change that.

Today, the AI research startup announced that it’s expanded the context window for Claude, its flagship text-generating AI model (still in preview), from 9,000 tokens to 100,000 tokens. The context window is the text the model considers before generating additional text, while tokens are the chunks of raw text the model works with (e.g. the word “fantastic” would be split into the tokens “fan,” “tas” and “tic”).

So what’s the significance, exactly? Well, as alluded to earlier, models with small context windows tend to “forget” the content of even very recent conversations — leading them to veer off topic. After a few thousand words or so, they also forget their initial instructions, instead extrapolating their behavior from the last information within their context window rather than the original request.
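To make the "forgetting" concrete, here is a minimal sketch of a fixed-size context window. It is an illustration under loud assumptions, not how Claude or any production model is implemented: the `tokenize` helper treats each whitespace-separated word as one token (real tokenizers split text into subword pieces), and `visible_context` is a hypothetical name for "whatever still fits in the window."

```python
def tokenize(text):
    """Toy tokenizer (assumption): one token per whitespace-separated word."""
    return text.split()


def visible_context(conversation, window_size):
    """Return the most recent `window_size` tokens, i.e. what the model can still 'see'."""
    tokens = tokenize(conversation)
    return tokens[-window_size:]


# An initial instruction followed by 20 placeholder conversation turns.
instruction = "INSTRUCTION: answer only in French."
chat = " ".join(f"turn{i}" for i in range(20))
conversation = instruction + " " + chat

small = visible_context(conversation, window_size=10)
large = visible_context(conversation, window_size=100)

print("INSTRUCTION:" in small)  # False: the instruction has slid out of the window
print("INSTRUCTION:" in large)  # True: the whole conversation still fits
```

With the small window, only the last ten "turns" remain, so anything the model does next can be based only on them, not on the original request.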

Given the benefits of large context windows, it’s not surprising that figuring out ways to expand them has become a major focus of AI labs like OpenAI, which devoted an entire team to the issue. OpenAI’s GPT-4 held the previous crown in terms of context window size, weighing in at 32,000 tokens on the high end, but the improved Claude API blows past that.

With a bigger “memory,” Claude should be able to converse relatively coherently for hours — several days, even — as opposed to minutes. And perhaps more importantly, it should be less likely to go off the rails.

In a blog post, Anthropic touts the other benefits of Claude’s increased context window, including the ability for the model to digest and analyze hundreds of pages of materials. Beyond reading long texts, the upgraded Claude can help retrieve information from multiple documents or even a book, Anthropic says, answering questions that require “synthesis of knowledge” across many parts of the text.

Anthropic lists a few possible use cases:

  • Digesting, summarizing, and explaining documents such as financial statements or research papers
  • Analyzing risks and opportunities for a company based on its annual reports
  • Assessing the pros and cons of a piece of legislation
  • Identifying risks, themes, and different forms of argument across legal documents
  • Reading through hundreds of pages of developer documentation and surfacing answers to technical questions
  • Rapidly prototyping by dropping an entire codebase into the context and intelligently building on or modifying it

“The average person can read 100,000 tokens of text in around five hours, and then they might need substantially longer to digest, remember, and analyze that information,” Anthropic continues. “Claude can now do this in less than a minute. For example, we loaded the entire text of The Great Gatsby into Claude … and modified one line to say Mr. Carraway was ‘a software engineer that works on machine learning tooling at Anthropic.’ When we asked the model to spot what was different, it responded with the correct answer in 22 seconds.”
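For a rough sense of scale, the figures quoted in this article can be run through some back-of-the-envelope arithmetic. The token counts and the five-hour reading time come from the article; the `WORDS_PER_TOKEN` value is an assumption (a common rule of thumb for English text, not a number Anthropic gives), so the words-per-minute result is only approximate.

```python
# Figures quoted in the article.
NEW_WINDOW = 100_000    # Claude's expanded context window, in tokens
OLD_WINDOW = 9_000      # Claude's previous context window, in tokens
GPT4_WINDOW = 32_000    # GPT-4's largest context window, in tokens
READING_HOURS = 5       # Anthropic's estimate for a person reading 100,000 tokens

# Assumption: roughly 0.75 English words per token (rule of thumb only).
WORDS_PER_TOKEN = 0.75

tokens_per_minute = NEW_WINDOW / (READING_HOURS * 60)
words_per_minute = tokens_per_minute * WORDS_PER_TOKEN

growth_vs_old = NEW_WINDOW / OLD_WINDOW
growth_vs_gpt4 = NEW_WINDOW / GPT4_WINDOW

print(round(tokens_per_minute))  # 333
print(round(words_per_minute))   # 250
print(growth_vs_old > 11, growth_vs_gpt4 > 3)
```

Under that assumption, Anthropic's five-hour estimate works out to about 250 words per minute. The new window is a little over 11 times Claude's previous one and a little over three times GPT-4's largest.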

Now, longer context windows don’t solve the other memory-related challenges around large language models. Claude, like most models in its class, can’t retain information from one session to the next. And unlike the human brain, it treats every piece of information as equally important, which makes it a less-than-reliable narrator. Some experts believe that solving these problems will require entirely new model architectures.


For now, though, Anthropic appears to be at the forefront.
