ELI5: large language model
// explanation
What is a large language model?
A large language model (LLM) is a computer brain that learned to understand and write human language by reading tons and tons of text, as if a student had read every book in the world and learned all the patterns of how words go together. [1][2]
Why does it work so well?
It works because it practiced finding patterns in millions of sentences, so it learned that certain words usually go together, like how "peanut butter" goes together more often than "peanut socks." [2][3]
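The "which words go together" idea can be sketched with a toy word counter. This is only an illustration: the tiny `TEXT` sample and the `count_next_words` helper are made up for this example, and a real LLM learns its patterns with a neural network trained on far more text, not with a simple count table.

```python
from collections import Counter, defaultdict

# Tiny made-up "training text" (an assumption, for illustration only).
TEXT = (
    "i like peanut butter . "
    "she eats peanut butter toast . "
    "he wears wool socks . "
    "we bought peanut butter . "
    "they spilled peanut sauce ."
)

def count_next_words(text):
    """For every word, count which words come right after it."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

follows = count_next_words(TEXT)
after_peanut = follows["peanut"]

print(after_peanut["butter"])             # times "butter" came after "peanut"
print(after_peanut["socks"])              # times "socks" came after "peanut"
print(after_peanut.most_common(1)[0][0])  # best guess for the word after "peanut"
```

In this sample, "butter" follows "peanut" several times and "socks" never does, so the counter's best guess after "peanut" is "butter."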
What can it do?
It can answer questions, write stories, have conversations, and help with homework by using the patterns it learned to guess what words should come next. [4]
Is it really thinking?
Not the way you do. It's more like it's playing the ultimate game of "what word comes next," but it's SO good at this game that its answers seem smart and helpful. [1][2]
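The "what word comes next" game can be played by a very small program. The sketch below is a toy under stated assumptions: the `TEXT` sample and the `build_table` and `play_game` helpers are invented for this example, and the program always picks the single most common next word. Real LLMs work on a much bigger scale and use a neural network to score the possible next words, but the loop of guessing one word and then guessing again is the same basic idea.

```python
from collections import Counter, defaultdict

# Tiny made-up "training text" (an assumption, for illustration only).
TEXT = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."

def build_table(text):
    """For every word, count which words come right after it."""
    words = text.split()
    table = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        table[current][nxt] += 1
    return table

def play_game(table, start, max_words=3):
    """Keep guessing the most common next word, one word at a time."""
    sentence = [start]
    for _ in range(max_words):
        options = table.get(sentence[-1])
        if not options:  # this word was never followed by anything: stop
            break
        sentence.append(options.most_common(1)[0][0])
    return " ".join(sentence)

table = build_table(TEXT)
result = play_game(table, "cat")
print(result)
```

Each guess only looks at the previous word, which is why this toy quickly starts going in circles if it runs for longer. A real model looks at everything written so far, which is a big part of why its answers hang together.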
// sources
A large language model (LLM) is a computational model trained on a vast amount of data, designed for natural language processing tasks, especially language ...
Large language models are AI systems capable of understanding and generating human language by processing vast amounts of text data.
Large language models, also known as LLMs, are very large deep learning models that are pre-trained on vast amounts of data.
A large language model (LLM) is a type of artificial intelligence that can generate human language and perform related tasks. These models are trained on ...
... language model using JAX and Flax/NNX. We'll implement token embeddings, transformer blocks, and put everything into a transformer-based large language model.