
How ChatGPT and other LLMs work, and where they might go next | WIRED


AI-powered chatbots such as ChatGPT and Google Bard are certainly having a moment: the next generation of conversational software tools promises to do everything from taking over our web searches to producing an endless supply of creative literature to remembering all the world’s knowledge so we don’t have to.

ChatGPT, Google Bard, and other bots like them are examples of large language models, or LLMs, and it’s worth investigating how they work. It means you’ll be able to make better use of them and better appreciate what they’re good at (and what they shouldn’t be trusted with).

Like many artificial intelligence systems, such as those designed to recognize your voice or generate images of cats, LLMs train on vast amounts of data. The companies behind them have been pretty circumspect when it comes to revealing where exactly that data comes from, but there are certain clues we can look out for.

For example, the research paper introducing LaMDA (Language Model for Dialogue Applications), the model on which Bard is built, mentions Wikipedia, “public forums,” and “code documents from sites related to programming such as Q&A sites, tutorials, etc.” Meanwhile, Reddit wants to start charging for access to its 18 years of text conversations, and Stack Overflow just announced plans to start charging as well. The implication here is that LLMs have been making extensive use of both sites up to this point as sources, entirely for free and at the expense of the people who built and used those resources. It’s clear that a lot of what’s publicly available on the web has been scraped and analyzed by LLMs.

LLMs use a combination of machine learning and human input.

OpenAI via David Nield

All of this text data, wherever it comes from, is processed through a neural network, a commonly used type of AI engine made up of multiple nodes and layers. These networks continually adjust the way they interpret and make sense of data based on a number of factors, including the results of past trial and error. Most LLMs use a specific neural network architecture called a transformer, which has some tricks particularly well suited to language processing. (The GPT in ChatGPT stands for Generative Pretrained Transformer.)
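To make that trial-and-error idea concrete, here is a minimal sketch, not the training code of any real LLM, of a tiny PyTorch network being nudged toward better next-token guesses. The vocabulary size, layer sizes, and training pairs are all invented for illustration.

```python
# Toy illustration: a tiny network learns to predict the next token ID from
# the previous one, adjusting its weights each step based on how wrong its
# last guesses were. Real LLMs work on the same principle, at vastly larger scale.
import torch
import torch.nn as nn

vocab_size = 16  # hypothetical, tiny vocabulary
model = nn.Sequential(
    nn.Embedding(vocab_size, 32),
    nn.Linear(32, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Made-up training pairs: (current token, next token)
data = torch.tensor([[1, 2], [2, 3], [3, 4], [4, 1]])

for step in range(200):
    inputs, targets = data[:, 0], data[:, 1]
    logits = model(inputs)           # the model's current guesses
    loss = loss_fn(logits, targets)  # how wrong those guesses are
    optimizer.zero_grad()
    loss.backward()                  # trace the error back through the layers
    optimizer.step()                 # nudge the weights to do better next time
```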

Specifically, a transformer can read large amounts of text, detect patterns in how words and phrases relate to each other, and then make predictions about which words should come next. You may have heard LLMs compared to supercharged autocorrect engines, and that’s actually not too far from the truth: ChatGPT and Bard don’t really “know” anything, but they are very good at figuring out which word follows another, which starts to look like genuine thought and creativity when it reaches a sufficiently advanced stage.
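Here is a deliberately crude sketch of that “supercharged autocomplete” idea: count which word follows which in a body of text, then predict the most likely continuation. The training text below is made up purely for illustration; real models learn far richer patterns over far more data.

```python
# Count word-following-word frequencies (a bigram table), then use them
# to guess the most likely next word after a given word.
from collections import Counter, defaultdict

text = "the cat sat on the mat and the cat slept on the mat"
words = text.split()

following = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen after `word`."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "cat" or "mat", whichever was counted first among ties
```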

One of the key innovations of these transformers is the self-attention mechanism. It’s hard to explain in a single paragraph, but in essence it means that the words in a sentence aren’t considered in isolation but also in relation to one another in a variety of sophisticated ways. That allows for a greater level of comprehension than would otherwise be possible.
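A rough sketch of scaled dot-product self-attention, written with NumPy, shows the basic move: each word’s vector is updated using a weighted mix of every other word’s vector, so words are interpreted in relation to one another rather than in isolation. The vectors, weights, and dimensions here are arbitrary stand-ins, not anyone’s production code.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # how much each word attends to each other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                          # blend the values by attention weight

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                         # 4 "words", 8-dimensional embeddings
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)      # (4, 8): one updated vector per word
```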

There’s a degree of randomness and variation built into the code, which is why you won’t get the same response from a transformer chatbot every time. This autocomplete-style approach also explains how errors can creep in. At a fundamental level, ChatGPT and Google Bard don’t know what’s accurate and what isn’t. They’re looking for responses that seem plausible and natural, and that match up with the data they’ve been trained on.
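The randomness usually comes from how the next word is chosen: rather than always taking the single most likely candidate, the model samples from a probability distribution, often controlled by a “temperature” setting. The candidate words and scores below are invented for illustration only.

```python
# Sketch of temperature sampling: lower temperature makes the choice more
# deterministic, higher temperature makes it more varied, which is why the
# same prompt can produce different answers on different runs.
import numpy as np

rng = np.random.default_rng()
candidates = ["blue", "cloudy", "falling", "banana"]
logits = np.array([3.0, 2.5, 1.0, -2.0])  # made-up raw scores for each candidate

def sample(logits, temperature=0.8):
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                  # convert scores into probabilities
    return rng.choice(len(probs), p=probs)

print("The sky is", candidates[sample(logits)])  # may differ from run to run
```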

