Glossary

Every abbreviation and piece of jargon we use, explained in plain English. House rule: no three-letter acronym appears in a transmission without being spelled out and linked here.

Agent: An AI system that can plan and carry out multi-step work on its own — browsing, writing files, calling other software — rather than answering a single question.
Alignment: The research problem of making AI systems reliably pursue the goals and values their developers and users intend, rather than unintended ones.
Artificial General Intelligence (AGI): A hypothetical AI capable of matching or exceeding human performance across virtually all cognitive work, rather than excelling at one narrow task.
Benchmark: A standardised test used to compare models — for example, solving maths problems or fixing real software bugs. No single benchmark tells the whole story.
Chain-of-thought: A technique where a model works through a problem step by step in writing before giving its answer, usually improving accuracy on reasoning tasks.
Compute: Shorthand for raw processing power: the chips, electricity and time needed to train or run AI models. A key cost and constraint across the industry.
Context window: The amount of text a model can consider at once, measured in tokens. A bigger window means the model can work with longer documents and conversations.
Distillation: Training a smaller or newer model on the outputs of a larger one, transferring its abilities cheaply. Controversial when done against a rival's model without permission.
Fine-tuning: Further training of an existing model on specific data so it performs better at a particular job.
Frontier model: One of the most capable AI models available at a given moment, typically from the handful of laboratories operating at the leading edge.
GPU: Graphics processing unit: the chip type, originally built for rendering video-game graphics, that now powers most AI training and inference. NVIDIA dominates the market.
Guardrails: Restrictions built into an AI system to block certain outputs or behaviours, from refusing harmful requests to limiting what competitors can ask.
Hallucination: When a model states something false with confidence. Reduced by grounding answers in retrieved documents and by human review.
Inference: Running a trained model to get answers — as opposed to training, which is how the model was built.
Initial public offering (IPO): A private company's first sale of shares to the public, listing it on a stock exchange. Reported valuations before an IPO are claims, not market prices.
Intelligence index: Artificial Analysis's composite score combining several benchmark results into one comparable number per model. The headline column on our model tracker.
Jailbreak: A prompt crafted to trick a model into ignoring its guardrails and producing restricted output.
Large Language Model (LLM): The core technology behind systems like Claude, ChatGPT and Gemini: a model trained on vast amounts of text to predict and generate language.
Market capitalisation: A listed company's total value: share price multiplied by the number of shares.
Mixture of experts (MoE): A model architecture that activates only a relevant subset of its parameters for each token it processes, getting large-model quality at lower running cost.
Model Context Protocol (MCP): An open standard that lets AI models connect to external tools and data sources in a consistent way.
Multimodal: A model that works with more than text — typically images, audio or video, as input, output or both.
Open weights: A model whose trained parameters are published so anyone can run or adapt it on their own machines.
Parameters: The numerical values inside a model that encode what it has learned; counted in billions and often used as a rough measure of model size.
Prompt: The instruction given to a model. For long-running agentic systems, it increasingly matters less than the broader task definition.
Prompt injection: An attack that hides instructions inside content an AI will read — a web page, email or document — to hijack what the AI does next.
Red-teaming: Deliberately attacking your own AI system — probing for jailbreaks, harmful outputs and failures — to find weaknesses before others do.
Reinforcement Learning from Human Feedback (RLHF): A training technique where humans rate model outputs and the model learns to prefer responses people judge as better.
Retrieval-Augmented Generation (RAG): A technique where a model first looks up relevant documents and then answers using them, improving accuracy.
Synthetic data: Training data generated by AI rather than collected from the real world, increasingly used where human-written data is scarce or expensive.
Time to first token: How long a model takes to begin answering. Together with tokens per second (generation speed), it is the standard measure of how fast a model feels.
Token: The unit models read and write — roughly three-quarters of a word in English. Pricing and context windows are measured in tokens.
Valuation: The price investors assign to a company when buying a stake. For private AI firms these figures come from funding rounds and reports, not from open-market trading.