AI Glossary: A to Z Technical Directory
Explore definitions and dynamic coverage analytics for the core concepts shaping artificial intelligence.
A
Accuracy
Accuracy is a classification metric measuring the fraction of total predictions that the model got correct, calculated as the number of correct predictions divided by the total number of predictions.
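As a minimal sketch of the ratio described above, assuming the true and predicted labels arrive as two equal-length Python lists (the function name `accuracy` and the sample labels are illustrative only):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Illustrative labels: 3 of the 4 predictions are correct.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```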
Activation Function
An Activation Function is a mathematical formula applied to the output of a neural network node to determine whether it should be activated (transmit signal) or not. It introduces non-linear properties to the network, allowing it to learn complex patterns instead of just linear transformations.
Active Learning
Active Learning is a semi-supervised learning framework where a machine learning algorithm queries a human annotator to label only the most informative or uncertain data points, minimizing labeling cost.
Adam Optimizer
Adam (Adaptive Moment Estimation) is an optimization algorithm used for training deep learning models. It combines the advantages of RMSProp and Momentum by calculating adaptive learning rates for each parameter based on estimates of the first and second moments of the gradients.
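The update can be sketched for a single scalar parameter as follows. This is an illustrative sketch rather than a library implementation: the function name `adam_minimize`, the quadratic objective, and the hyperparameter values are all assumptions chosen for the example.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient function."""
    x = x0
    m = 0.0  # first-moment estimate (running mean of gradients)
    v = 0.0  # second-moment estimate (running mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # The step is scaled per parameter by the second-moment estimate.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # should land close to 3
```

In a real network, `m` and `v` are kept per parameter, which is what gives each weight its own effective learning rate.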
Adversarial Attack
An Adversarial Attack is a technique that feeds a machine learning model intentionally designed inputs (adversarial examples) to cause it to make a mistake, fail, or hallucinate. In image models, this often involves introducing imperceptible pixel noise that completely alters the classification.
Agentic AI
Agentic AI refers to artificial intelligence systems designed to act autonomously, make decisions, plan workflows, and execute tasks without constant human intervention. Unlike traditional models that only respond to queries, agentic systems use an agentic loop to perceive environments, reason over goals, use tools, and iterate to achieve outcomes.
AGI
Artificial General Intelligence (AGI) represents a theoretical form of AI that possesses the ability to understand, learn, and apply knowledge across any intellectual task at a level equal to or surpassing human capabilities. Unlike narrow AI, AGI is characterized by general reasoning and autonomous adaptability.
AI Agent
An AI Agent is an autonomous entity that perceives its environment through sensors (or inputs) and acts upon that environment using actuators (or tools) to achieve specific goals. An agent relies on a reasoning brain (typically an LLM) to plan and execute multi-step processes.
AI Compute
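AI Compute refers to the computational resources required to train and run inference on large-scale neural networks and machine learning models. It is commonly quantified as the total number of floating-point operations (FLOPs) a workload needs, or as hardware throughput in floating-point operations per second (FLOP/s).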
AI Copilot
An AI Copilot is an interactive assistant integrated directly into workspaces and applications, using Large Language Models to help users write code, draft emails, summarize documents, or execute tasks through natural language commands.
AI Ethics
AI Ethics is a multidisciplinary field of study and governance that addresses the moral concerns, social impacts, and legal dilemmas associated with the development and deployment of artificial intelligence systems.
AI Governance
AI Governance refers to the systemic framework of policies, procedures, compliance standards, and organizational structures established to supervise, monitor, and regulate an organization's AI deployment.
AI Model
An AI Model is a mathematical algorithm trained on a dataset to perform specific tasks like classification, prediction, or text generation. It represents the saved states of a neural network (the weights and biases) after training, which can be deployed to run inference on new, unseen data.
AI Orchestration
AI Orchestration is the process of coordinating and managing multiple AI models, autonomous agents, data retrieval pipelines, and database updates to execute complex, end-to-end enterprise workflows.
AI Safety
AI Safety is a field of research focused on ensuring that artificial intelligence systems behave predictably, avoid causing harm, and remain aligned with human interests. It spans technical alignment, risk mitigation, and the study of existential risk from advanced systems.
AI Search Engine
An AI Search Engine is an information retrieval system that utilizes generative models to synthesize direct answers, summaries, and source citations to queries, rather than just returning a list of links (e.g., Perplexity, Google AI Overviews).
Algorithm
An Algorithm is a step-by-step procedure or set of mathematical rules designed to solve a specific problem or perform a calculation. In AI, algorithms determine how a model processes inputs and updates its parameters during learning.
Algorithmic Bias
Algorithmic Bias (or AI Bias) occurs when a machine learning model generates systematic and repeatable errors that create unfair outcomes, typically due to prejudices or imbalances present in the training datasets.
Alignment
Alignment refers to the process of guiding an AI model's behaviors, responses, and values to match human intents, safety principles, and ethical standards. Unaligned models might generate toxic text, assist in harmful activities, or refuse user inputs.
Answer Engine Optimization
Answer Engine Optimization (AEO) is the process of optimizing web content to be retrieved and displayed as the primary, direct answer by search engine featured snippets and voice assistants (like Siri, Alexa, and Google Assistant).
Anthropic
Anthropic is an AI safety and research company and the creator of the Claude LLM family. It was founded by former OpenAI researchers to build steerable, reliable AI systems, using methods such as Constitutional AI.
Artificial Intelligence
Artificial Intelligence (AI) is a broad field of computer science dedicated to building systems capable of performing tasks that typically require human cognitive function, such as visual perception, speech recognition, decision-making, and translation.
Attention Mechanism
An Attention Mechanism is a technique in neural networks that mimics cognitive attention, allowing the model to focus on specific parts of the input data when generating an output. It enables models to calculate the contextual relationships between distant elements in a sequence.
Auto-GPT
Auto-GPT is an open-source autonomous agent application that showcases the capabilities of Large Language Models (specifically GPT-4) to run independently to achieve a user-defined goal by chaining thoughts and actions in a continuous loop.
Autoencoder
An Autoencoder is a type of unsupervised neural network designed to learn efficient data codings (representations) by training the network to ignore signal noise. It consists of an encoder that compresses the input data, and a decoder that reconstructs the input from the compressed representation.
Autoencoding
Autoencoding is an unsupervised learning approach where a neural network is trained to reconstruct its input values through a lower-dimensional bottleneck, learning efficient representations of the data.
Autonomous Agent
An Autonomous Agent is an AI system designed to operate independently to achieve specific, high-level objectives. It constructs its own sub-tasks, plans sequences of actions, invokes external tools, inspects intermediate results, and corrects mistakes without user guidance.
Autoregressive Model
An Autoregressive Model is an AI model that predicts future values in a sequence based on past values. In LLMs, autoregressive generation works by taking the prompt, predicting the next word, appending that word to the prompt, and repeating the process.
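The predict-append-repeat loop can be sketched with a toy lookup table standing in for the model. Everything here is hypothetical: `NEXT_TOKEN_PROBS` conditions only on the previous token, whereas a real LLM conditions on the whole sequence, and picking the most probable token at each step is just one possible decoding strategy.

```python
# Hypothetical next-token probabilities keyed by the previous token only.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<eos>": 1.0},
}


def generate(prompt_tokens, max_new_tokens=10):
    """Extend a non-empty prompt one token at a time until <eos> or the limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = NEXT_TOKEN_PROBS.get(tokens[-1])
        if probs is None:
            break  # the toy table has no continuation for this token
        next_token = max(probs, key=probs.get)  # pick the most probable token
        if next_token == "<eos>":
            break
        tokens.append(next_token)  # the output becomes part of the next input
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat']
```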
B
Backpropagation
Backpropagation is the primary algorithm used to train neural networks. It works by calculating the gradient of the loss function with respect to the weights of the network, and then propagating that error backward through the layers using the chain rule to update parameters.
Bag of Words
Bag of Words (BoW) is a simplified representation model used in natural language processing and information retrieval. In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and word order but keeping multiplicity.
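A minimal sketch of this representation, assuming a simple lowercase tokenizer based on a regular expression (real pipelines often use more careful tokenization):

```python
import re
from collections import Counter


def bag_of_words(text):
    """Map each lowercase word to its count, discarding word order."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


bow = bag_of_words("The cat sat on the mat")
print(bow["the"])  # 2

# Word order is ignored, so these two texts produce the same bag.
print(bag_of_words("cat sat") == bag_of_words("sat cat"))  # True
```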
Batch Normalization
Batch Normalization is a technique that normalizes the inputs of each layer within a mini-batch during training, stabilizing the learning process and accelerating convergence.
Batch Size
Batch Size is a model training hyperparameter defining the number of training examples processed in a single forward and backward pass before the model's internal parameter weights are updated.
Bayesian Optimization
Bayesian Optimization is a sequential design strategy for global optimization of black-box functions. It is widely used in machine learning to tune hyperparameters, particularly when evaluating the target function is computationally expensive.
BERT
BERT (Bidirectional Encoder Representations from Transformers) is a language model developed by Google in 2018. Unlike autoregressive models, BERT is bidirectional, looking at the words before and after a target word to understand its context.
Bi-Encoder
A Bi-Encoder is a neural network architecture that embeds the query and the candidate document separately into a shared vector space, allowing fast similarity comparisons using mathematical operations like cosine similarity or dot product.
Bias-Variance Tradeoff
The Bias-Variance Tradeoff is a core machine learning concept describing the conflict between a model's ability to minimize bias (errors from simple assumptions) and variance (errors from sensitivity to training data fluctuations). Balancing them is key to avoiding overfitting or underfitting.
Bidirectional LSTM
A Bidirectional LSTM (BiLSTM) is a sequence processing architecture that consists of two LSTMs: one taking the input in a forward direction, and the other taking it in a backward direction. This allows the network to capture both past and future context at any point in the sequence.
Black Box
A Black Box model is an AI or machine learning system whose internal workings, parameters, and decision-making logic are hidden or too complex for humans to interpret or understand (such as deep neural networks with billions of weights).
Blackwell
Blackwell is NVIDIA's high-performance GPU architecture designed specifically to accelerate trillion-parameter large language models, offering massive throughput improvements for AI training and inference workloads.
BM25
Okapi BM25 is a ranking function used by search engines to estimate the relevance of documents to a given search query. It is based on the probabilistic retrieval framework and improves upon TF-IDF by adding term frequency saturation and document length normalization.
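The sketch below shows how saturation and length normalization enter the score. It assumes documents are already tokenized into lists of words, uses one common smoothed form of the IDF term, and picks `k1=1.5` and `b=0.75` as illustrative parameter values. The function name `bm25_scores` and the sample documents are made up for the example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        # Longer-than-average documents are penalized, controlled by b.
        length_norm = 1 - b + b * len(doc) / avg_len
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # The k1 term makes repeated occurrences saturate.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat".split(),
    "dogs chase the cat".split(),
    "stock prices fell sharply".split(),
]
scores = bm25_scores(["cat"], docs)
print(scores)
```

Both of the first two documents contain "cat" once, but the shorter one scores higher because of length normalization, and the third scores zero.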
C
Cascading Agent Failure
Cascading Agent Failure is a critical failure mode in multi-agent systems where an error, hallucination, or logical exception in an upstream worker agent propagates downstream, causing consecutive errors and a total system collapse.
Causal Language Model
A Causal Language Model is an autoregressive model trained to predict the next token in a sequence given only the preceding tokens. It uses attention masking to prevent the model from looking at future tokens during training.
Chain of Thought
Chain of Thought (CoT) prompting is a technique that instructs Large Language Models to write down their step-by-step reasoning process before outputting the final answer. This improves performance on complex reasoning, math, and logic tasks.
Chatbot
A Chatbot is a software application designed to simulate human-like conversations with users, either through text dialogues or voice interfaces, historically powered by rule-based patterns and now by Large Language Models.
ChatGPT
ChatGPT is a conversational artificial intelligence chatbot developed by OpenAI, built on their family of GPT Large Language Models, which pioneered the generative AI consumer wave by providing fluid, human-like dialogue.
Chunking
Chunking is the process of breaking down a large, continuous document into smaller, manageable, and semantically cohesive text fragments (chunks) before indexing them in a vector database.
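One of the simplest strategies is a fixed-size window with overlap, sketched below. Note that this splits on word counts only and does not attempt the semantic cohesion mentioned above; the function name `chunk_words` and the sizes are illustrative assumptions.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into chunks of `chunk_size` words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


chunks = chunk_words("one two three four five six seven eight")
print(chunks)
# ['one two three four five', 'four five six seven eight']
```

The overlap keeps text near a boundary present in both neighboring chunks, so a sentence cut at the boundary is less likely to be lost at retrieval time.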
Claude
Claude is a family of state-of-the-art Large Language Models developed by Anthropic. Highly regarded for its reasoning, coding capabilities, and context window size, Claude models are trained using a methodology called Constitutional AI.
CLIP
CLIP (Contrastive Language-Image Pre-training) is a neural network developed by OpenAI that learns visual concepts from natural language supervision. It is trained on roughly 400 million image-text pairs to match corresponding images and captions in a joint embedding space.
CNN
A Convolutional Neural Network (CNN) is a class of deep neural network most commonly applied to analyzing visual imagery. CNNs use mathematical convolution operations to extract hierarchical features from grid-like structures, making them ideal for image classification, object detection, and computer vision.
Codegen
Codegen (Code Generation) refers to the capability of generative AI models to synthesize executable software code, scripts, or markups from natural language descriptions or existing code contexts.
Cognitive Architecture
Cognitive Architecture is the design blueprint for structuring an autonomous AI Agent. It defines how memory, planning steps, reflection mechanisms, and external tools interact with the core LLM brain to create a persistent agentic loop.
Cold Start Problem
The Cold Start Problem is a challenge in recommender systems where the system struggles to make useful recommendations because it has no prior history, ratings, or interaction logs for a new user or a new item.
Collaborative Filtering
Collaborative Filtering is a technique used by recommendation engines to filter or predict a user's interests by collecting preferences from many users. It assumes that if user A agrees with user B on an issue, user A is more likely to share B's opinion on a different issue.
Computer Vision
Computer Vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos, models can accurately identify and classify objects, and react to what they "see."
Concept Drift
Concept Drift is the phenomenon where the statistical properties of the target variable that a model is predicting change over time in an unforeseen way, causing the model to become inaccurate.
Conditional Random Fields
Conditional Random Fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning for structured prediction. CRFs take context into account when predicting labels for sequence elements.
Constitutional AI
Constitutional AI is an alignment training methodology developed by Anthropic to train helpful and harmless models without relying on human-labeled feedback for harmlessness. The model is given a written list of principles (a constitution) and iteratively critiques and revises its own outputs to align with those principles.
Context Engineering
Context Engineering is the practice of designing, structuring, and optimizing the prompt context window to maximize the accuracy and efficiency of Large Language Models. It focuses on how raw data, historical messages, and systemic rules are retrieved, formatted, and pruned before being sent to the model.
Context Window
The Context Window is the maximum volume of text (measured in tokens) that a Large Language Model can process and consider at any single moment. It contains the prompt instructions, user query, system settings, and memory history.
Contrastive Learning
Contrastive Learning is a self-supervised training technique where a model learns to group similar inputs (positive pairs) close together in embedding space while pushing dissimilar inputs (negative pairs) far apart.
Cosine Similarity
Cosine Similarity is a mathematical metric used to measure the similarity between two vectors in high-dimensional space by calculating the cosine of the angle between them. It is independent of vector magnitude, focusing purely on direction.
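A minimal sketch, assuming vectors are plain Python lists of equal length (the function name is illustrative):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Same direction, different magnitude: similarity is (approximately) 1.
print(cosine_similarity([1, 2], [2, 4]))

# Orthogonal vectors have similarity 0.
print(cosine_similarity([1, 0], [0, 1]))  # 0.0
```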
Cost Function
A Cost Function is a mathematical formula that measures the performance of a machine learning model on the entire dataset. It represents the average of the loss function values computed across all training examples.
CrewAI
CrewAI is an open-source framework designed for orchestrating role-playing autonomous AI agents. It enables developers to structure groups of agents that work together, share memories, delegate tasks, and execute collaborative workflows.
Cross-Encoder
A Cross-Encoder is a neural network architecture used in information retrieval that processes the query and the candidate document together as a single input sequence, computing attention across both to produce a highly accurate relevance score.
Cross-Validation
Cross-Validation is a statistical resampling technique used to evaluate a machine learning model's generalization performance by partitioning the dataset into multiple folds, then repeatedly training on all but one fold and validating on the held-out fold until every fold has served as the validation set.
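The partitioning step can be sketched as a generator of index splits. This is an illustrative k-fold sketch with contiguous, unshuffled folds; in practice the data is usually shuffled first, and the function name `k_fold_indices` is an assumption for the example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(k_fold_indices(6, 3))
for train, val in folds:
    print(train, val)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Each sample appears in exactly one validation fold, and the per-fold scores are typically averaged into a single estimate.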
D
Data Augmentation
Data Augmentation is the practice of artificially increasing the size and diversity of a training dataset by applying transformations (like cropping, rotating, flipping, or paraphrasing) to existing data points.
Data Labeling
Data Labeling is the process of identifying raw data points (such as images, text, or audio files) and appending target category tags (labels) to them to create a labeled dataset for supervised learning.
Data Leakage
Data Leakage is a training error that occurs when information that would not be available at prediction time, such as examples from the test set or a feature derived from the target, is used to train a model. This leads to overly optimistic performance scores during validation, but poor generalization on true unseen data.
Data Preprocessing
Data Preprocessing is the initial phase of cleaning, transforming, and formatting raw input datasets to prepare them for machine learning algorithms.
Dataset
A Dataset is a structured collection of data points, features, and target values used to train, validate, and evaluate machine learning models.
Dataset Curation
Dataset Curation is the process of collecting, cleaning, labeling, filtering, and organizing data to create a high-quality dataset for training or benchmarking machine learning models.
Decision Tree
A Decision Tree is a non-parametric supervised learning method used for both classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
Deep Learning
Deep Learning is a subset of machine learning based on artificial neural networks with multiple layers (hence "deep"). These layers extract high-level features progressively from raw input, enabling automated feature learning without manual engineering.
Deep Reinforcement Learning
Deep Reinforcement Learning (DRL) is a subfield of machine learning that combines reinforcement learning principles (agents, actions, rewards) with deep neural networks to learn decision-making policies for high-dimensional state spaces.
Deepfake
A Deepfake is synthetic media (images, video, or audio) in which a person's face, voice, or body is digitally altered or replaced using deep generative models, typically autoencoders or Generative Adversarial Networks (GANs), to depict them saying or doing things they did not.
Denoising
Denoising is the process of removing noise from a signal (like a digital image or audio track). In generative AI, denoising autoencoders and diffusion networks are trained to reconstruct clean inputs from intentionally corrupted variants.
Dense Model
A Dense Model is a neural network architecture where 100% of the model's parameters are activated and calculated for every single token processed, representing the traditional design of deep neural networks.
Diffusion Model
A Diffusion Model is a class of generative AI models that generate data by learning to reverse a process of gradual noise addition. By starting with random noise and iteratively removing it, the model can generate high-resolution images, video, or audio.
Dimensionality Reduction
Dimensionality Reduction is the process of reducing the number of input variables (features) in a dataset while retaining as much relevant information as possible. It is used to simplify models and visualize high-dimensional datasets.
Discriminator
A Discriminator is a neural network component within a Generative Adversarial Network (GAN) architecture. Its role is to evaluate inputs and classify them as either "real" (originating from the true training dataset) or "fake" (produced by the generator network).
Distillation
Knowledge Distillation is a compression technique where a smaller model (the student) is trained to replicate the behavior and output probabilities of a much larger model (the teacher). This transfers reasoning capabilities into smaller footprints.
Distributed Training
Distributed Training is the practice of partitioning machine learning workloads (data or parameters) across multiple compute processors (GPUs/TPUs) to accelerate training times for large neural networks.
Document Store
A Document Store is a database designed to store, retrieve, and manage document-oriented information, typically formatted as JSON, XML, or PDF structures. In RAG architectures, it holds the raw text associated with vector embeddings.
Dot Product Similarity
Dot Product Similarity is a metric that measures the alignment of two vectors in a high-dimensional space by multiplying corresponding elements and summing the products. Unlike Cosine Similarity, it is sensitive to vector magnitude.
Double Quantization
Double Quantization is a memory-saving process introduced in QLoRA that quantizes the quantization constants themselves, reducing the memory footprint of fine-tuning runs with little to no measurable loss in accuracy.
DPO
Direct Preference Optimization (DPO) is a model alignment technique that bypasses the complex reward-model training phase of RLHF. DPO optimizes the policy directly on preference datasets (chosen vs. rejected responses) using a simple binary cross-entropy loss.
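For a single preference pair, the loss can be sketched as below. The inputs are assumed to be summed log-probabilities of each full response under the policy being trained and under a frozen reference model; the argument names, the value `beta=0.1`, and the sample numbers are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # Implicit rewards: how much more the policy likes a response than the reference.
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # Binary cross-entropy with target "chosen wins": -log(sigmoid(margin)).
    return math.log1p(math.exp(-margin))


# Hypothetical log-probabilities: relative to the reference, the policy
# favors the chosen response, so the loss falls below log(2).
loss = dpo_loss(-10.0, -14.0, -12.0, -12.0)
print(loss)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log(2); training pushes the margin up and the loss down.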
Dropout
Dropout is a regularization technique used in neural networks during training where a fraction of network nodes are randomly deactivated (dropped out) in each forward pass, preventing co-adaptation of features.
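A minimal sketch on a list of activations. It uses the common "inverted dropout" convention, which is an assumption beyond the definition above: surviving activations are divided by the keep probability so their expected value is unchanged, and nothing is dropped at inference time.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Zero each activation with probability p and rescale the survivors."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0:
        return list(activations)  # inference: pass values through unchanged
    keep = 1 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the example repeatable
out = dropout([1.0] * 8, p=0.5, rng=rng)
print(out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```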
E
Early Stopping
Early Stopping is a regularization technique that halts a model's training process when its performance on a separate validation dataset stops improving, even if the training loss continues to decrease.
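The stopping rule is often implemented with a "patience" counter, sketched here over a precomputed list of validation losses. The function name, the patience value, and the loss values are illustrative; a real training loop would compute each loss after an epoch and usually also restore the best checkpoint.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: all epochs were run


# Validation loss improves until epoch 2, then fails to improve twice.
stop = early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.66, 0.5], patience=2)
print(stop)  # 4
```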
Edge AI
Edge AI is the practice of running machine learning models and processing data directly on physical devices at the "edge" of the network (like smartphones, laptops, or IoT devices), rather than relying on centralized cloud servers.
Embedding
An Embedding is a representation of real-world data (words, sentences, images, user profiles) as high-dimensional vectors of real numbers. Embeddings place semantically similar concepts close to each other in vector space.
Embedding Dimension
Embedding Dimension is the coordinate length of the vector used to represent data items in a latent space (e.g. OpenAI's text-embedding-3-small uses 1536 dimensions). It determines the detail capacity of the semantic space.
Ensemble Methods
Ensemble Methods are machine learning techniques that combine predictions from multiple individual models to create a single, more robust prediction. Examples include bagging, boosting, and stacking.
Epoch
An Epoch is a single complete pass of the entire training dataset through a machine learning model. Training typically consists of many epochs to allow the network to refine weights and biases based on multiple passes over the data.
Euclidean Distance
Euclidean Distance is a mathematical metric measuring the straight-line distance between two points in Euclidean space. In vector search, it is used to compare two embedding vectors: the smaller the distance, the more similar the vectors.
Explainable AI
Explainable AI (XAI) is a suite of processes and methods that allow human users to comprehend and trust the results and outputs generated by machine learning algorithms. It aims to demystify the "black box" of deep neural networks.
Exploding Gradient Problem
The Exploding Gradient Problem is a training instability in which gradients grow very large as they are propagated backward through many layers, resulting in unstable, massive parameter updates that prevent model weights from converging.
F
F1 Score
The F1 Score is a statistical metric used to evaluate a classification model's performance. It is calculated as the harmonic mean of precision (exactness) and recall (completeness), which makes it more informative than plain accuracy on datasets with imbalanced classes.
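A minimal sketch for the binary case, assuming labels in two equal-length lists and a designated positive class (the function name and sample labels are illustrative):

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 2 true positives, 1 false positive, 1 false negative:
# precision = recall = 2/3, so F1 is also 2/3.
score = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
print(round(score, 4))  # 0.6667
```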
Feature
A Feature is an individual, measurable property or input variable used by a machine learning model to make predictions. In tabular datasets, features correspond to columns (e.g. square footage, age of home).
Feature Engineering
Feature Engineering is the process of using domain knowledge to select, transform, combine, and manipulate raw variables into highly predictive input features for machine learning algorithms.
Federated Learning
Federated Learning is a decentralized training technique that trains machine learning models across multiple remote edge devices holding local data samples, without exchanging the data itself.
Few-Shot Learning
Few-Shot Learning is a machine learning paradigm where a model is trained or prompted to perform a task using only a small number of training examples. In LLMs, this is achieved by including a few demonstration inputs and outputs directly in the prompt context window.
Fine-Tuning
Fine-Tuning is the process of taking a pre-trained model and training it further on a smaller, specific dataset to adapt it for a particular task or domain. Fine-tuning alters the internal weights of the network, specializing its behavior and tone.
FlashAttention
FlashAttention is a memory-efficient, exact self-attention algorithm that speeds up Transformer training and inference by tiling computations in on-chip GPU SRAM, which reduces reads and writes to the slower high-bandwidth memory (HBM).
Foundation Model
A Foundation Model is a large-scale AI model trained on massive, broad datasets (typically through self-supervised learning) that serves as the baseline starting point for multiple downstream tasks. Examples include GPT-4, LLaMA, and Stable Diffusion.
Fully Connected Layer
A Fully Connected Layer (Dense Layer) is a layer in an artificial neural network where every neuron is connected to all neurons in the previous layer, mapping linear combinations of inputs to outputs.
Function Calling
Function Calling is an LLM capability where the model outputs a structured JSON object containing argument parameters to invoke specific external functions or APIs, enabling LLMs to act as dynamic interfaces for databases and systems.
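On the application side, the structured output is parsed and routed to real code. The sketch below assumes a simple `{"name": ..., "arguments": {...}}` shape; actual providers define their own schemas, and `get_weather`, `TOOLS`, and `dispatch` are hypothetical names used only for illustration.

```python
import json


def get_weather(city):
    """Hypothetical tool; a real one would call an external API."""
    return f"Sunny in {city}"


# Registry of functions the model is allowed to invoke.
TOOLS = {"get_weather": get_weather}


def dispatch(model_output):
    """Parse a JSON tool call and invoke the matching registered function."""
    call = json.loads(model_output)
    func = TOOLS.get(call["name"])
    if func is None:
        raise ValueError(f"unknown function: {call['name']}")
    return func(**call["arguments"])


# Illustrative model output containing the function name and its arguments.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)  # Sunny in Paris
```

Looking the name up in an explicit registry, rather than evaluating model output directly, keeps the model limited to functions the developer has chosen to expose.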
G
GAN
A Generative Adversarial Network (GAN) is a generative AI architecture consisting of two neural networks: a Generator (which creates fake data) and a Discriminator (which evaluates if the data is real or fake). The networks train in competition, forcing the generator to produce high-fidelity data.
Gating Mechanism
A Gating Mechanism is a structural design in neural networks that controls the flow of information through internal pathways using learned multipliers, typically sigmoid-activated values between 0 and 1 applied element-wise (as in LSTM and GRU gates).
GELU
GELU (Gaussian Error Linear Unit) is a smooth activation function that scales input values by the cumulative distribution function of the standard normal distribution, commonly used in BERT and modern Transformers.
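The exact form can be written with the error function from the standard library, as in this short sketch (many frameworks also offer a faster tanh-based approximation, which is not shown here):

```python
import math


def gelu(x):
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for value in (-1.0, 0.0, 1.0):
    print(value, round(gelu(value), 4))
# -1.0 -0.1587
# 0.0 0.0
# 1.0 0.8413
```

Unlike ReLU, small negative inputs produce small negative outputs instead of being cut to zero.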
Gemini
Gemini is a family of highly capable, natively multimodal AI models developed by Google. It is designed from the ground up to process and combine different modalities of information (including text, code, audio, image, and video) seamlessly.
Generalization
Generalization is a machine learning model's ability to make accurate predictions on new, unseen test data that was not present in the dataset used to train the network.
Generative AI
Generative AI refers to algorithms and models designed to generate new, original content, including text, images, music, code, or video. Popular architectures like Transformers, GANs, and Diffusion models serve as the engines powering generative AI platforms.
Generative Engine Optimization
Generative Engine Optimization (GEO) is the modern marketing and search optimization practice of structuring website content so it is successfully retrieved, cited, and recommended by AI search engines and LLM answer systems.
Generative Pre-training
Generative Pre-training is the initial phase of training a Large Language Model on massive, unlabeled text datasets where the model learns token relationships by predicting the next word in sequence.
GGUF
GGUF (GPT-Generated Unified Format) is a file format designed for storing models for inference with llama.cpp. It is optimized to support fast on-device loading and quantization.
GPT
GPT (Generative Pre-trained Transformer) is a decoder-only autoregressive transformer architecture developed by OpenAI. It was pre-trained on massive text datasets to predict next words, pioneering the modern conversational AI era.
GPT-4
GPT-4 is a state-of-the-art multimodal Large Language Model developed by OpenAI, trained on both text and visual inputs to perform complex reasoning, coding, and logical operations.
GPT-ese
GPT-ese (or AI-speak) is a colloquial term for the specific stylistic, overly polite, repetitive, or cliché-ridden writing style characteristic of outputs generated by early Large Language Models without custom alignment.
GPU
A Graphics Processing Unit (GPU) is a specialized processor originally designed to accelerate graphics rendering by performing many arithmetic operations in parallel. Because training neural networks involves massive matrix multiplication, the parallel processing power of GPUs is critical for modern AI workloads.
Gradient Descent
Gradient Descent is an optimization algorithm used to minimize a model's loss function during training. It iteratively calculates the slope (gradient) of the error surface and updates model parameters (weights) in the direction of the steepest descent.
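A one-parameter sketch of the iteration, assuming a hypothetical quadratic loss whose gradient is known in closed form (the function name, learning rate, and step count are illustrative):

```python
def gradient_descent(grad_fn, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_fn(x)  # move in the direction of steepest descent
    return x


# Hypothetical loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 6))  # 3.0
```

With this loss each step shrinks the distance to the minimum by a constant factor of 0.8; a learning rate that is too large would instead overshoot and diverge.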
Graph Neural Network
A Graph Neural Network (GNN) is a class of artificial neural network designed to process data represented as graphs (consisting of nodes and edges), extracting features through message-passing neighborhoods.
Graph RAG
Graph RAG (Graph Retrieval-Augmented Generation) is an advanced retrieval technique that couples vector similarity search with structured Knowledge Graphs. It extracts entities and relationships from documents, building a network to answer complex, global queries.
Greedy Decoding
Greedy Decoding is a sequence generation method where the model always selects the single token with the highest predicted probability at each step during output text generation.
Grounding
Grounding is the process of anchoring an AI model's generated outputs to verifiable real-world facts, external files, or structured databases. It helps keep model outputs factual and traceable to their sources.
Grouped-Query Attention
Grouped-Query Attention (GQA) is an attention variant in which query heads are divided into groups, and each group shares a single Key head and Value head, reducing the memory footprint of the KV cache.
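The sharing pattern can be sketched as a mapping from query heads to key/value heads. The contiguous grouping used here is a common convention but is an assumption, as are the function name and the head counts.

```python
def kv_head_index(query_head, num_query_heads, num_kv_heads):
    """Return the key/value head shared by the given query head."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    if not 0 <= query_head < num_query_heads:
        raise ValueError("query_head out of range")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# 8 query heads sharing 2 key/value heads: groups of 4.
mapping = [kv_head_index(h, num_query_heads=8, num_kv_heads=2) for h in range(8)]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

In this configuration the KV cache stores 2 key/value heads per layer instead of 8, a fourfold reduction relative to standard multi-head attention.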
Guardrails
Guardrails refer to validation layers placed around AI models to intercept inputs (prompts) and outputs (completions). They enforce safety policies and output schemas, and help block toxic content, data leakage, and jailbreaks.
H
Hallucination
Hallucination is a phenomenon where a Large Language Model (LLM) generates outputs that are factually incorrect, nonsensical, or ungrounded in real-world data. It occurs because LLMs predict word probabilities rather than referencing a direct database of facts.
Hidden Layer
A Hidden Layer is a layer of neurons located between the input layer and the output layer in an artificial neural network, responsible for extracting and learning abstract features from input data.
Human-in-the-Loop
Human-in-the-Loop (HITL) is a design pattern in autonomous systems and machine learning workflows that requires human intervention or approval at key checkpoints before executing critical, irreversible, or high-risk actions.
Hyperautomation
Hyperautomation is an enterprise operational strategy focused on identifying, vetting, and automating as many business and IT processes as possible using AI, Robotic Process Automation (RPA), and low-code orchestrations.
Hyperparameter
A Hyperparameter is a configuration variable whose value is set before the machine learning training process begins. Unlike standard parameters (weights and biases) which are learned automatically during training, hyperparameters control the learning behavior itself.
Hyperplane
A Hyperplane is a subspace whose dimension is one less than that of its ambient space. In machine learning, it acts as a decision boundary to separate different data classes.
I
Image Segmentation
Image Segmentation is a computer vision process of partitioning a digital image into multiple segments (sets of pixels), assigning a label to every pixel to outline exact boundaries of objects.
Imbalanced Datasets
An Imbalanced Dataset is a training dataset where one class (or category) is significantly overrepresented compared to other classes, causing models to favor the majority class.
In-Context Learning
In-Context Learning (ICL) is the emergent ability of pre-trained Large Language Models to learn new tasks from examples provided in the prompt context without gradient updates.
Indirect Prompt Injection
Indirect Prompt Injection is a security exploit where an attacker embeds malicious instructions inside untrusted third-party data (like web pages, uploaded PDFs, or emails) that an AI agent is instructed to read. When the agent processes the document, the hidden prompt overrides the system instructions and hijacks the agent.
Inductive Bias
Inductive Bias refers to the assumptions a machine learning algorithm uses to predict outputs for unseen inputs. It prioritizes specific solutions based on the structural design of the model.
Inference
Inference is the process of using a trained AI model to make predictions or generate text based on new inputs. During inference, data flows forward through the neural network to produce an output, without modifying the model's weights.
Intelligent Agent
An Intelligent Agent is an autonomous entity that perceives its environment through inputs, makes rational decisions based on goals, and executes actions using tools to achieve outcomes.
J
K
K-Means Clustering
K-Means Clustering is an unsupervised machine learning algorithm that partitions a dataset into K distinct, non-overlapping clusters by assigning each data point to its nearest centroid.
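A minimal sketch of the assign-then-update loop, assuming fixed starting centroids and a fixed iteration count. The helper names and the sample points are illustrative; practical implementations usually add randomized initialization and a convergence check.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, centroids, iterations=10):
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(points, [(0, 0), (10, 10)])
print(final_centroids)
```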
K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a simple, non-parametric supervised learning algorithm used for classification and regression. It predicts a label from the K closest training points, taking a majority vote for classification or an average for regression.
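A minimal classification sketch, assuming squared Euclidean distance and a small hypothetical training set of `(point, label)` pairs:

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


train = [((1, 1), "a"), ((1, 2), "a"), ((2, 2), "a"), ((8, 8), "b"), ((9, 9), "b")]
print(knn_predict(train, (2, 1)))
print(knn_predict(train, (8, 9)))
```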
Knowledge Graph
A Knowledge Graph is a structured database representing a network of real-world entities (nodes) and their semantic relations (edges), allowing systems to query logical context.
KV Cache
A KV Cache (Key-Value Cache) is an inference-time optimization storing the computed Key and Value attention tensors of past tokens to prevent redundant recalculations in autoregressive decoding.
L
Label
A Label is the target output or correct outcome variable associated with a training example in supervised learning (e.g. labeling a picture as a "dog" or marking an email as "spam").
LangChain
LangChain is an open-source framework designed to simplify the creation of applications using Large Language Models, providing abstractions for chains, prompt templates, memory, and tools.
Latent Space
Latent Space is a multi-dimensional space where raw, complex data (such as images or text) is compressed into mathematical vector representations. In latent space, items that share similar abstract concepts or semantic meanings are mapped closer together.
Layer Normalization
Layer Normalization is a technique that normalizes the activations of a neural network layer across all features for each single training example, stabilizing gradient updates in sequential models.
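A minimal sketch for a single example's feature vector. The `gamma`, `beta`, and `eps` parameters follow the common formulation with a learned scale and shift, and their defaults here are assumptions for illustration.

```python
import math


def layer_norm(features, gamma=None, beta=None, eps=1e-5):
    """Normalize one example across its features, then scale and shift."""
    n = len(features)
    if gamma is None:
        gamma = [1.0] * n
    if beta is None:
        beta = [0.0] * n
    mean = sum(features) / n
    var = sum((x - mean) ** 2 for x in features) / n
    return [g * (x - mean) / math.sqrt(var + eps) + b
            for x, g, b in zip(features, gamma, beta)]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```

The statistics come from the single example itself, so the result does not depend on the rest of the batch.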
Learning Rate
Learning Rate is a fundamental hyperparameter in gradient descent optimizers that determines the step size taken along the negative gradient toward a minimum of the loss function during model training.
Learning Rate Decay
Learning Rate Decay is a training hyperparameter setting that gradually decreases the optimizer's learning rate over epochs, allowing the model to make large updates early and fine adjustments later.
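One common schedule is exponential decay, sketched below. The starting rate of 0.1 and decay factor of 0.5 are arbitrary illustrative values; step and cosine schedules are other options.

```python
def exponential_decay(initial_lr, decay_rate, epoch):
    """Learning rate after `epoch` epochs of multiplicative decay."""
    return initial_lr * decay_rate ** epoch


schedule = [exponential_decay(0.1, 0.5, epoch) for epoch in range(4)]
print(schedule)
```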
Linear Attention
Linear Attention is a class of attention mechanisms designed to approximate the standard self-attention operation in linear time complexity relative to sequence length, bypassing the quadratic memory scaling limits of standard Transformers.
Linear Regression
Linear Regression is a foundational statistical method and supervised learning algorithm used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.
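For a single independent variable, the least-squares line has a closed form. A minimal sketch on made-up data:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Illustrative data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```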
LLaMA
LLaMA (Large Language Model Meta AI) is a family of state-of-the-art open-weights foundation models released by Meta. LLaMA catalyzed the open-source AI developer ecosystem by offering models that could run locally with high efficiency.
LLM
A Large Language Model (LLM) is a type of artificial intelligence model trained on vast amounts of text data to understand, generate, and manipulate natural language. Built on the Transformer architecture, LLMs use billions of parameters to recognize semantic patterns and reasoning relationships.
LLM Evaluation
LLM Evaluation (LLM Eval) is the process of measuring the accuracy, reasoning quality, safety compliance, and formatting correctness of Large Language Model outputs using benchmarks or judge models.
Logistic Regression
Logistic Regression is a foundational classification algorithm that predicts the probability of a binary target variable by passing a linear combination of the input features through a sigmoid function.
Loop Engineering
Loop Engineering is the practice of designing, optimizing, and securing autonomous agent execution loops. In agentic AI, this involves structuring the iteration cycle (such as prompt loops, self-correction runs, and human-in-the-loop triggers) to prevent runaway loops and maximize successful task execution.
LoRA
LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning (PEFT) technique that freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, reducing training VRAM requirements.
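A minimal sketch of the low-rank update, assuming the common formulation W' = W + (alpha / r) · B · A with W frozen. The tiny matrices and helper names are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merged_weight(frozen_w, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * B @ A; only B and A would be trained."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(frozen_w, delta)]


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_in + d_out)


frozen_w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # d_out x rank
lora_a = [[0.5, -0.5]]    # rank x d_in
merged = lora_merged_weight(frozen_w, lora_b, lora_a, alpha=1.0, rank=1)
print(merged)
print(lora_param_count(4096, 4096, 8), "trainable vs", 4096 * 4096, "frozen")
```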
Loss Function
A Loss Function is a mathematical function that measures the discrepancy between a model's predicted output and the actual target value during training. The goal of training is to minimize this loss, adjusting weights based on gradients computed from it.
LSTM
LSTM (Long Short-Term Memory) is a specialized recurrent neural network (RNN) architecture. It uses gating mechanisms (input, output, and forget gates) to manage memory state, mitigating the vanishing gradient problem for sequential data.
LSTM Memory Cell
An LSTM Memory Cell is the core building block of a Long Short-Term Memory network, containing a cell state that acts as a conveyor belt to carry historical information across sequences.
M
Machine Learning
Machine Learning is a branch of artificial intelligence focused on building systems that learn from data, identify patterns, and make decisions with minimal human intervention. It is a broad field that includes deep learning and draws heavily on classical statistics.
Machine Translation
Machine Translation (MT) is a subfield of computational linguistics focused on using artificial intelligence models to automatically translate text or speech from one human language to another.
Masked Language Modeling
Masked Language Modeling (MLM) is a self-supervised training task where a model learns token context by predicting hidden (masked) words in a sentence using surrounding left and right text tokens.
Max Pooling
Max Pooling is a downsampling operation in CNNs. It divides the input feature map into sub-regions and outputs the maximum value from each sub-region, reducing the spatial dimensions.
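A minimal sketch on a single-channel 4x4 input, assuming a 2x2 window with stride 2. The input values are made up for illustration.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Slide a size x size window over the map and keep each window's maximum."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 1, 8],
]
print(max_pool_2d(feature_map))
```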
MCP Client
A Model Context Protocol Client (MCP Client) is an application (such as an IDE, chatbot, or developer platform) that implements MCP to discover, connect to, and invoke tools and data sources exposed by MCP servers.
MCP Server
A Model Context Protocol Server (MCP Server) is a lightweight service that exposes databases, file systems, specific APIs, or local command runtimes to MCP clients through a standardized protocol built on JSON-RPC.
Mean Absolute Error
Mean Absolute Error (MAE) is a loss metric used in regression models that calculates the average of the absolute differences between predicted values and actual target values.
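A minimal sketch of the calculation on made-up predictions and targets:

```python
def mean_absolute_error(predictions, targets):
    """Average of |prediction - target| over all examples."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


predictions = [2.5, 0.0, 2.0, 8.0]
targets = [3.0, -0.5, 2.0, 7.0]
print(mean_absolute_error(predictions, targets))
```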
Mixture of Depths
Mixture of Depths (MoD) is a compute optimization technique where models dynamically route and process only a fraction of tokens through specific layers, skipping computation for simpler tokens.
Mixture of Experts
Mixture of Experts (MoE) is a neural network design that scales parameter count without a proportional increase in compute cost. Instead of activating the entire network for every token, MoE routes inputs to specialized sub-networks ("experts") using a gating router.
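A minimal sketch of top-k routing, assuming scalar inputs and four toy experts written as plain functions. The router logits are hard-coded here, whereas a real router computes them from the token representation.

```python
import math


def top_k_route(router_logits, k=2):
    """Select the k highest-scoring experts and softmax their logits."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    chosen, weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in zip(chosen, weights))


# Hypothetical experts; real experts are feed-forward sub-networks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
router_logits = [0.1, 2.0, -1.0, 1.0]
chosen, weights = top_k_route(router_logits, k=2)
print(chosen, weights, moe_forward(3.0, experts, router_logits))
```

Only two of the four experts run for this input, which is where the compute saving comes from.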
MLOps
MLOps (Machine Learning Operations) is a set of practices, culture, and tools focused on automating and unifying the lifecycle of machine learning models, spanning data collection, training, testing, deployment, and monitoring.
Model Collapse
Model Collapse is a degenerative process affecting generative AI models trained recursively on synthetic data generated by previous generations of AI models. Over iterations, the model loses diversity, starts repeating patterns, and eventually outputs garbage.
Model Context Protocol
Model Context Protocol (MCP) is an open-source standard created by Anthropic that enables AI applications and agents to connect securely to local or remote data sources, developer tools, and API services via a standardized protocol.
Model Drift
Model Drift (or model decay) is the degradation of an AI model's predictive performance in production over time, caused by changes in the statistical properties of real-world input data relative to the training data.
Model Merging
Model Merging is the process of combining two or more fine-tuned models into a single model without retraining or compute-heavy tuning. It averages or otherwise mathematically blends the weights of the models.
Model Registry
A Model Registry is a centralized repository for managing the lifecycle of machine learning models. It stores model weights, parameter logs, version details, and deployment states.
Multi-Agent Orchestration
Multi-Agent Orchestration is the protocol framework that defines how multiple specialized AI agents communicate, delegate sub-tasks, exchange context, and collaborate sequentially or hierarchically to achieve a collective goal.
Multi-Agent System
A Multi-Agent System (MAS) is a computerized system composed of multiple interacting intelligent agents. These agents coordinate, communicate, and collaborate (or compete) with each other to solve complex problems that are beyond the individual capabilities of any single agent.
Multi-Head Attention
Multi-Head Attention is an attention layout in Transformers that projects query, key, and value vectors into multiple subspaces (heads), allowing the model to attend to information from different representation subspaces simultaneously.
Multi-Query Attention
Multi-Query Attention (MQA) is an attention architecture where all query heads share a single Key and Value head to minimize KV cache storage.
Multimodal AI
Multimodal AI refers to systems capable of processing, understanding, and generating multiple types of input and output data modalities simultaneously, such as text, images, audio, video, and code. This mirrors human-like perception across sensory channels.
N
Named Entity Recognition
Named Entity Recognition (NER) is an NLP task that identifies and classifies key elements in text documents into predefined categories (such as names of people, organizations, locations, dates, or product codes).
Neural Architecture Search
Neural Architecture Search (NAS) is an automated process for designing artificial neural networks. By defining a search space, search strategy, and performance metric, NAS algorithms automatically discover optimal layer configurations.
Neural Network
An Artificial Neural Network (ANN) is a computing system inspired by the biological neural networks that constitute animal brains, structured as layers of interconnected nodes that process inputs to produce outputs.
NLP
Natural Language Processing (NLP) is a subfield of computer science and AI concerned with the interactions between computers and human language. It involves training computers to process, analyze, and synthesize large amounts of natural language data.
NormalFloat4
NormalFloat4 (NF4) is an information-theoretically optimal quantile quantization data type for normally distributed data, designed to compress neural network weights to 4-bit precision with minimal loss of accuracy.
NPU
A Neural Processing Unit (NPU) is a specialized processor designed to accelerate the execution of machine learning algorithms, commonly integrated into mobile SoCs and edge hardware.
NVIDIA
NVIDIA is a pioneer of GPU computing, dominating the hardware market for AI acceleration, training, and inference with its high-performance Hopper and Blackwell architectures.
O
Object Detection
Object Detection is a computer vision task that combines image classification and localization, identifying what objects are in an image and outputting bounding boxes around their coordinates.
Observability
Observability in AI refers to the ability to measure, trace, and audit the internal states, reasoning paths, tool execution parameters, and model outputs of an AI system. It enables developers to debug complex reasoning steps and optimize agent behaviors.
One-Hot Encoding
One-Hot Encoding is a data preprocessing technique that converts categorical variables (like "dog", "cat") into binary vector representations where only a single element is 1 (hot) and the rest are 0.
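A minimal sketch, assuming the category order is fixed by sorting the distinct values (an illustrative choice):

```python
def one_hot_encode(values):
    """Return the sorted category list and one binary vector per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = [[1 if index[v] == i else 0 for i in range(len(categories))]
               for v in values]
    return categories, vectors


categories, vectors = one_hot_encode(["dog", "cat", "dog", "bird"])
print(categories)
print(vectors)
```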
One-Shot Learning
One-Shot Learning is a machine learning setup where a model is trained or prompted to perform a task or classify inputs after being shown only a single demonstration example.
OpenAI
OpenAI is an artificial intelligence research and deployment company behind ChatGPT, GPT-4, and Sora, dedicated to building safe and beneficial artificial general intelligence (AGI).
Orchestration Layer
An Orchestration Layer is the control center of an agentic system that manages the execution loop, schedules task transitions, calls external tools, updates state databases, and routes inputs/outputs between the user, tools, and the LLM brain.
Out-of-Distribution
Out-of-Distribution (OOD) data refers to inputs that originate from a different probability distribution than the dataset used to train the machine learning model, often causing models to make confident mistakes.
Overfitting
Overfitting is a common training error where a model learns the details and noise in the training dataset to the extent that it negatively impacts its performance on new, unseen test data. The model performs exceptionally well on training data but fails to generalize.
P
Parameters
Parameters are the internal configuration variables of an AI model that are learned automatically from training data. In a neural network, parameters consist of weights (which determine connection strength) and biases (which offset activation curves).
PEFT
Parameter-Efficient Fine-Tuning (PEFT) is a collection of training methods designed to fine-tune large foundation models by training only a small number of parameters (often newly added ones), while freezing the base model's weights.
Perplexity
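Perplexity is a core evaluation metric in natural language processing that measures how well a probability distribution or language model predicts a sample of text. It is the exponential of the average negative log-likelihood per token, so lower values indicate better predictions.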
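A minimal sketch, assuming the model's probability for each actual token in the sample is already available as a list:

```python
import math


def perplexity(token_probs):
    """exp of the average negative log-probability assigned to each token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# A model that is always choosing uniformly among 4 options has perplexity 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```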
Pre-training
Pre-training is the initial phase of training an AI model on a massive general-purpose dataset (unsupervised or self-supervised), teaching the model basic syntax, grammar, and features before fine-tuning.
Precision
Precision (Positive Predictive Value) is a classification evaluation metric measuring the fraction of predicted positive examples that are actually correct, calculated as true positives divided by all predicted positives.
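A minimal sketch on made-up binary labels. Returning 0.0 when nothing is predicted positive is a convention assumed here.

```python
def precision(y_true, y_pred, positive=1):
    """True positives divided by all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0]
print(precision(y_true, y_pred))  # 2 true positives out of 3 predicted positives
```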
Preference Alignment
Preference Alignment refers to the training process of tuning a Large Language Model's conversational behavior to match human preferences regarding helpfulness, safety guidelines, and formatting style.
Prompt
A Prompt is the textual, visual, or binary input submitted to a generative AI model to initiate and guide the generation of a specific response or action.
Prompt Engineering
Prompt Engineering is the practice of designing, structuring, and refining inputs (prompts) to get optimal, predictable outputs from generative AI models. It involves technique selection like chain-of-thought, few-shot prompting, and system routing.
Prompt Injection
Prompt Injection is a security vulnerability where a malicious user provides input that overrides the pre-configured system instructions or safety alignment filters of a Large Language Model, hijacking its control flow.
PyTorch
PyTorch is an open-source machine learning framework originally developed by Meta AI and now governed by the PyTorch Foundation. It is widely used for building, training, and deploying deep learning models.
Q
QLoRA
Quantized Low-Rank Adaptation (QLoRA) is an advanced parameter-efficient fine-tuning (PEFT) technique that runs LoRA over a base model quantized to 4-bit precision. It uses special formats like NormalFloat4 to maintain model accuracy while drastically reducing VRAM overhead.
Quantization
Quantization is the process of compressing a neural network by reducing the numerical precision of its weights (e.g. converting 16-bit floating-point values to 4-bit integers), lowering VRAM requirements and accelerating inference.
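A minimal sketch of symmetric per-tensor integer quantization, one of several possible schemes. The weight values are made up, and production methods typically quantize in blocks or channels rather than over a whole tensor.

```python
def quantize(weights, bits=8):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]


weights = [0.4, -1.0, 0.2, 0.0]
q8, scale8 = quantize(weights, bits=8)
q4, scale4 = quantize(weights, bits=4)
print(q8, dequantize(q8, scale8))
print(q4, dequantize(q4, scale4))
```

Fewer bits give a coarser grid, so the 4-bit reconstruction is further from the original values than the 8-bit one.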
R
RAG
Retrieval-Augmented Generation (RAG) is a methodology that improves the output of a Large Language Model (LLM) by retrieving material from an authoritative, external knowledge base or Vector Database before generating a response. RAG helps models access up-to-date information and can substantially reduce hallucination.
Random Forest
Random Forest is an ensemble supervised learning algorithm composed of many individual Decision Trees that work together. It trains each tree on a random subset of the data and features, then combines their predictions by majority vote for classification or by averaging for regression.
Reasoning Model
A Reasoning Model (or o1-style model) is an artificial intelligence model trained, typically with reinforcement learning, to work through chain-of-thought steps internally before returning an answer. This allows the model to deliberate, correct mistakes, and evaluate strategies.
Recall
Recall (Sensitivity or True Positive Rate) is a classification evaluation metric measuring the fraction of actual positive examples that the model correctly identified, calculated as true positives divided by all actual positives.
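A minimal sketch on made-up binary labels. Returning 0.0 when there are no actual positives is a convention assumed here.

```python
def recall(y_true, y_pred, positive=1):
    """True positives divided by all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0]
print(recall(y_true, y_pred))  # 2 of the 4 actual positives were found
```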
Recurrent Neural Network
A Recurrent Neural Network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a temporal sequence, allowing it to exhibit temporal dynamic behavior and process variable-length inputs.
Reinforcement Learning
Reinforcement Learning (RL) is a machine learning training paradigm where an agent learns to make decisions by performing actions in an environment to maximize cumulative rewards. The agent learns through trial-and-error feedback.
Repository Intelligence
Repository Intelligence is a capability in AI developer tooling that allows models to index, analyze, and reason over an entire software codebase structure, rather than just reading active, isolated files.
Reranking
Reranking is a secondary step in RAG (Retrieval-Augmented Generation) pipelines where a highly accurate model evaluates and re-orders the candidate documents fetched during initial vector search, ensuring the most relevant context is placed at the top.
Residual Connection
A Residual Connection (or skip connection) is an architectural feature in deep neural networks that adds the input of a block directly to the block's output, so the signal can bypass one or more intermediate layers.
Responsible AI
Responsible AI is a business governance framework that guides how an organization designs, develops, and deploys artificial intelligence systems ethically, ensuring transparency, fairness, privacy, safety, and accountability.
Retrieval Precision
Retrieval Precision is an evaluation metric in RAG systems measuring the fraction of retrieved document chunks that are actually relevant to answering the user query. High retrieval precision prevents prompt clutter and distraction.
Reward Function
A Reward Function is a mathematical formula that defines the goal in reinforcement learning by assigning a numerical score to the states and actions of an agent based on their desirability.
Reward Model
A Reward Model is a neural network trained to score responses generated by an LLM based on human preferences (e.g., helpfulness, safety, format correctness). It is used as the scoring engine in reinforcement learning loops like RLHF.
RLAIF
Reinforcement Learning from AI Feedback (RLAIF) is a model alignment technique where human evaluators are replaced by an AI model (the judge) to generate preference labels for training, lowering alignment training costs.
RLHF
Reinforcement Learning from Human Feedback (RLHF) is a training methodology used to align LLMs with human values and preferences. It uses human evaluations to train a reward model, which then guides the LLM to generate helpful, harmless, and honest outputs.
Rotary Position Embedding
Rotary Position Embedding (RoPE) is a position encoding method that applies position-dependent rotations to query and key vectors, so that attention scores naturally depend on the relative distance between tokens.
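A minimal sketch of the rotation, assuming adjacent dimensions are paired and the commonly used frequency base of 10000. The sample vectors are made up. The final lines check the relative-distance property: shifting both positions by the same amount leaves the dot product unchanged.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair of dimensions by a position-dependent angle."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = position * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]
score_near_start = dot(rope(q, 5), rope(k, 3))
score_shifted = dot(rope(q, 12), rope(k, 10))
print(score_near_start, score_shifted)
```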
S
Scaling Laws
Scaling Laws are empirical power-law relationships that predict how an AI model's performance improves as compute budget, training dataset size, and parameter count increase.
Search Grounding
Search Grounding is a verification technique where a generative AI model is connected to a live web search engine or structured document database. Before generating a response, the model queries the search engine to ground its response in real-time factual data.
Self-Attention
Self-Attention (commonly implemented as scaled dot-product attention) is an attention mechanism that relates different positions of a single sequence to compute a representation of the same sequence, allowing the model to build context-dependent representations of each token.
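A minimal sketch of scaled dot-product self-attention. As a simplifying assumption, the input vectors serve directly as queries, keys, and values, whereas real layers first apply learned projection matrices.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Each position's output is a softmax-weighted mix of all positions."""
    d = len(x[0])
    outputs = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights = softmax(scores)
        outputs.append([sum(w * value[j] for w, value in zip(weights, x))
                        for j in range(d)])
    return outputs


attended = self_attention([[1.0, 0.0], [0.0, 1.0]])
print(attended)
```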
Self-Correction
Self-Correction is an agentic design pattern where an AI agent executes a task, validates the intermediate output (against unit tests, syntax linters, or criteria checklists), and recursively loops to edit and resolve mistakes when errors are found.
Self-Supervised Learning
Self-Supervised Learning is a training paradigm where the model generates its own labels directly from the input data (e.g. masking words and predicting them), allowing training on massive unlabeled datasets without human labeling.
Semantic Search
Semantic Search is an information retrieval technique that seeks to understand the searcher's intent and contextual meaning of terms, rather than just matching keywords. It leverages vector embeddings to find semantically relevant documents.
Sentiment Analysis
Sentiment Analysis is an NLP task that uses classification models to identify and extract subjective information (positive, negative, or neutral tones) from text datasets.
SGD with Momentum
SGD with Momentum is an extension of Stochastic Gradient Descent that accelerates weight updates in the relevant direction by adding a fraction of the previous update vector to the current step.
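A minimal sketch of the update rule on the toy loss f(w) = w², whose gradient is 2w. The learning rate and momentum values are common illustrative defaults, not prescribed ones.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.1, momentum=0.9):
    """Blend the previous update into the new one, then apply it."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


w, v = 1.0, 0.0
trajectory = []
for _ in range(2):
    grad = 2 * w  # gradient of f(w) = w**2
    w, v = sgd_momentum_step(w, v, grad)
    trajectory.append(w)
print(trajectory)
```

The second step moves further than plain gradient descent would, because it carries over 90% of the first update.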
Sigmoid
Sigmoid is a mathematical activation function that maps any real-valued number into a value between 0 and 1, producing an S-shaped curve.
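A minimal sketch of σ(x) = 1 / (1 + e^(−x)), written in two branches so that large negative inputs do not overflow:

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0.0), sigmoid(2.0), sigmoid(-2.0))
```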
Slop
Slop is a colloquial internet slang term for low-quality, hollow, or unverified AI-generated content (including text, images, or search summaries) posted online to attract clicks, often cluttering feeds without providing real human value.
Small Language Model
A Small Language Model (SLM) is a lightweight language model with fewer parameters (typically under 10 billion) trained on highly curated, high-quality datasets. SLMs are designed to run efficiently on local edge devices with low power requirements.
Softmax
Softmax is an activation function that takes a vector of raw real numbers (logits) and normalizes them into a probability distribution where each value lies between 0 and 1, and all values sum to 1.
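A minimal sketch. Subtracting the largest logit before exponentiating is a standard stability trick that leaves the result unchanged.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
```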
Sora
Sora is an advanced text-to-video diffusion model developed by OpenAI, capable of generating high-fidelity, photorealistic video clips up to 60 seconds long from written text prompts.
Sovereign AI
Sovereign AI refers to a nation or organization's strategy to design, train, and deploy artificial intelligence models and infrastructure locally using their own data, computational hardware, and cultural values to maintain digital sovereignty and security.
SpaceX
SpaceX (Space Exploration Technologies Corp.) is an aerospace manufacturer and satellite communications company that integrates advanced autonomous control systems and AI telemetry software, and recently acquired the AI-coding platform Cursor (Anysphere) to accelerate software automation.
Sparse Model
A Sparse Model is a neural network architecture that activates only a specific subset of its total parameters for any given token or input, utilizing routing mechanisms to achieve massive parameter scale without proportional compute costs.
Speculative Decoding
Speculative Decoding is a latency optimization technique that accelerates LLM generation. A smaller, faster drafting model proposes multiple candidate tokens, which are then validated in parallel by the larger target model in a single forward pass.
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is an optimization algorithm that updates a model's weights using the gradient calculated from a single randomly chosen training sample (or a small batch) rather than the entire dataset.
Structured Outputs
Structured Outputs is an LLM generation feature that guarantees model completions adhere strictly to a developer-specified schema (such as JSON Schema or Pydantic models), eliminating syntax parsing errors.
Supervised Instruction Tuning
Supervised Instruction Tuning, commonly called supervised fine-tuning (SFT), is a training phase where a pre-trained base model is fine-tuned on a curated dataset of instruction-response pairs. This teaches the model to understand prompts, adopt an assistant persona, and output responses in a structured format.
Supervised Learning
Supervised Learning is the most common machine learning category, where a model is trained on a labeled dataset. This means each training input is paired with its correct output label, allowing the model to learn mapping relationships.
SwiGLU
SwiGLU is an activation function combining the Gated Linear Unit with Swish activation, used in feed-forward networks of modern Transformer blocks.
Synthetic Data
Synthetic Data is information that is artificially generated by algorithms or computer simulations, rather than being obtained from real-world measurements, often used to train AI models when real data is scarce or sensitive.
System Prompt
A System Prompt (or system instructions) is a set of core instructions provided to an AI model before the user conversation begins, defining the model's persona, boundaries, task rules, and formatting constraints.
T
Technological Singularity
The Technological Singularity is a hypothetical future point in time when technological growth becomes uncontrollable and irreversible, driven by self-improving artificial intelligence systems surpassing human intelligence, resulting in unfathomable changes to human civilization.
Temperature
Temperature is a parameter that controls the randomness and creativity of text generated by an autoregressive language model during inference. Higher values increase randomness, while lower values make outputs more deterministic.
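A minimal sketch of how temperature reshapes the token distribution: the logits are divided by the temperature before the softmax. The logits below are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
neutral = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
print(sharp[0], neutral[0], flat[0])
```

Lower temperatures concentrate probability on the top token, and higher temperatures spread it across the alternatives.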
Tensor
A Tensor is a multi-dimensional mathematical array of numbers that serves as the fundamental data structure for representing inputs, weights, and activations in deep learning frameworks like TensorFlow and PyTorch.
Token
A Token is the fundamental unit of text analyzed or generated by a language model (in English, roughly 3/4 of a word on average). Text is encoded into token IDs before being passed into the neural layers.
Tokenization
Tokenization is the process of breaking down a text string into smaller pieces called tokens (which can be characters, subwords, or full words). Tokenization converts text into numbers that a neural network can process.
Tokenizer
A Tokenizer is a pre-processing component that breaks down raw text strings into discrete units called tokens (words, subwords, or characters) and maps them to numerical integer IDs that can be processed by a neural network.
TPU
A Tensor Processing Unit (TPU) is an application-specific integrated circuit (ASIC) custom-developed by Google specifically to accelerate machine learning workloads, specialized in high-performance matrix math operations.
Training Data
Training Data is the initial dataset used to train a machine learning model, allowing it to learn features, weights, and mathematical relationships by processing inputs and computing adjustments.
Transfer Learning
Transfer Learning is a machine learning technique where a model developed for one task is reused as the starting point for a model on a second, related task, significantly reducing the amount of labeled data and compute needed.
Transformer
A Transformer is a deep learning neural network architecture introduced in 2017 by Google researchers, based entirely on self-attention mechanisms. It processes sequential inputs in parallel, capturing long-range dependencies and serving as the foundation for nearly all modern LLMs.
Turing Test
The Turing Test, originally proposed by Alan Turing in 1950, is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human through text conversation.
U
Underfitting
Underfitting is a training error that occurs when a machine learning model is too simple to capture the underlying structure and patterns in the training dataset, resulting in poor performance on both training and validation data.
Unsupervised Learning
Unsupervised Learning is a machine learning category where a model is trained on an unlabeled dataset. The algorithm attempts to discover hidden structures, groupings, or distributions within the input data without external guidance.
V
Validation Data
Validation Data is a subset of the dataset held back during machine learning training, used to evaluate model progress, tune hyperparameters, and detect overfitting.
Vanishing Gradient Problem
The Vanishing Gradient Problem is a training difficulty in deep neural networks where the gradients of the loss function shrink exponentially as they propagate backward to the early layers, preventing the model weights from updating and learning.
Vector Database
A Vector Database is a specialized storage engine designed to store, index, and query high-dimensional vector embeddings efficiently. It enables fast semantic search and similarity matching using algorithms like HNSW or IVF.
Vision Transformer
A Vision Transformer (ViT) is a neural network architecture that adapts the Transformer attention mechanism for computer vision tasks. By splitting images into grid patches and treating them like tokens in a sentence, ViT learns long-range visual relations.
VLM
A Vision-Language Model (VLM) is a multimodal AI model trained on both images and text, enabling it to answer questions about visual content, describe images, or extract structured data from documents.