AI Glossary: Letter "B"
Explore definitions and dynamic coverage analytics for the core concepts shaping artificial intelligence.
B
Backpropagation
Backpropagation is the standard algorithm for computing the gradients used to train neural networks. It applies the chain rule to propagate the error signal backward from the output layer through each preceding layer, yielding the gradient of the loss function with respect to every weight. An optimizer such as gradient descent then uses those gradients to update the parameters.
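As an illustration, the chain-rule bookkeeping can be sketched on a two-weight scalar network. The network shape (h = tanh(w1 * x), y = w2 * h), the squared-error loss, the learning rate, and all function names are assumptions chosen for this example only.

```python
import math

def forward(w1, w2, x):
    """Two-layer scalar network: h = tanh(w1 * x), y = w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y

def loss_and_grads(w1, w2, x, target):
    """Return the squared-error loss and its gradients w.r.t. w1 and w2."""
    h, y = forward(w1, w2, x)
    loss = (y - target) ** 2
    # Backward pass: apply the chain rule from the loss toward the input.
    dloss_dy = 2.0 * (y - target)
    dloss_dw2 = dloss_dy * h                   # because y = w2 * h
    dloss_dh = dloss_dy * w2
    dloss_dw1 = dloss_dh * (1.0 - h * h) * x   # d tanh(z)/dz = 1 - tanh(z)^2
    return loss, dloss_dw1, dloss_dw2

def train(w1=0.5, w2=-0.3, x=1.5, target=0.8, lr=0.1, steps=200):
    """Plain gradient descent using the backpropagated gradients."""
    for _ in range(steps):
        _, g1, g2 = loss_and_grads(w1, w2, x, target)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2
```

Real frameworks automate the backward pass with automatic differentiation, but the per-layer computation follows the same pattern.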
Bag of Words
Bag of Words (BoW) is a simplified representation model used in natural language processing and information retrieval. In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and word order but keeping multiplicity.
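A minimal sketch of the idea, assuming a naive lowercase tokenizer (the regular expression and the function name `bag_of_words` are choices made for this example, not part of any standard):

```python
import re
from collections import Counter

def bag_of_words(text):
    """Map a text to a multiset of its lowercase word tokens."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)

# Word order is discarded, so these two sentences get the same representation.
a = bag_of_words("The cat sat on the mat")
b = bag_of_words("the mat sat on the cat")
```

Here `a == b` holds, and `a["the"]` is 2 because multiplicity is kept.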
Batch Normalization
Batch Normalization is a technique that normalizes the inputs of each layer using the mean and variance computed over the current mini-batch, then applies a learned scale and shift. This stabilizes the learning process and often accelerates convergence.
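The training-time computation can be sketched in plain Python as follows. The names `gamma` and `beta` (scale and shift, learned in a real network) and the `eps` value are assumptions for illustration; running statistics used at inference time are omitted.

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of examples, each a list of feature values.
    gamma, beta: per-feature scale and shift.
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n  # biased batch variance
        for i, v in enumerate(column):
            out[i][j] = gamma[j] * (v - mean) / math.sqrt(var + eps) + beta[j]
    return out
```

With `gamma` set to ones and `beta` to zeros, each output feature has approximately zero mean and unit variance across the batch.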
Batch Size
Batch Size is a training hyperparameter that defines the number of training examples processed in a single forward and backward pass before the model's weights are updated.
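To make the relationship between batch size and the number of weight updates concrete, here is a small sketch. The helper names are hypothetical, and shuffling is left out for brevity.

```python
import math

def iterate_minibatches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]

def updates_per_epoch(num_examples, batch_size):
    """Number of parameter updates in one pass over the data."""
    return math.ceil(num_examples / batch_size)
```

For 10 examples and a batch size of 4, this gives three batches of sizes 4, 4, and 2, and therefore three updates per epoch.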
Bayesian Optimization
Bayesian Optimization is a sequential design strategy for global optimization of black-box functions. It is widely used in machine learning to tune hyperparameters, particularly when evaluating the target function is computationally expensive.
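One common instantiation, sketched here under several assumptions, fits a Gaussian process surrogate to the evaluations seen so far and picks the next point by maximizing expected improvement. The RBF kernel, its length scale, the zero prior mean, the finite candidate grid, and every function name below are illustrative choices rather than a reference implementation.

```python
import math

def rbf(a, b, length_scale=0.3):
    """Squared-exponential kernel on scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over the best value so far (minimization)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, candidates, initial, n_iter=10):
    """Sequentially evaluate the candidate with the highest expected improvement."""
    xs = list(initial)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = min(ys)
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break
        nxt = max(remaining,
                  key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best))
        xs.append(nxt)
        ys.append(objective(nxt))
    i = ys.index(min(ys))
    return xs[i], ys[i]
```

In hyperparameter tuning, `objective` would be a full training-and-validation run, which is why spending extra computation on choosing each next point is worthwhile.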
BERT
BERT (Bidirectional Encoder Representations from Transformers) is a language model developed by Google in 2018. Unlike autoregressive models, which condition only on preceding tokens, BERT is bidirectional: it uses the words both before and after a target word to model that word's context.
Bi-Encoder
A Bi-Encoder is a neural network architecture that embeds the query and the candidate document separately into a shared vector space, allowing fast similarity comparisons using mathematical operations like cosine similarity or dot product.
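The retrieval step can be sketched as below. The vectors stand in for the outputs of a trained encoder, which is not shown; the function names and the dictionary layout are assumptions for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return document ids sorted from most to least similar to the query.

    doc_vecs maps a document id to its precomputed embedding.
    """
    scores = {doc_id: cosine_similarity(query_vec, vec)
              for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Because documents are embedded independently of the query, their vectors can be computed once and indexed ahead of time; only the query needs encoding at search time.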
Bias-Variance Tradeoff
The Bias-Variance Tradeoff is a core machine learning concept describing the tension between bias (error from overly simple assumptions) and variance (error from sensitivity to fluctuations in the training data). Reducing one typically increases the other, so balancing them is key to avoiding underfitting (high bias) and overfitting (high variance).
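For squared-error loss the tradeoff has a standard closed form. Assuming data generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, and a model $\hat{f}$ fitted on a random training set (these symbols are introduced here for illustration), the expected error at a point $x$ decomposes as:

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The expectations are taken over training sets and noise; only the first two terms depend on the choice of model.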
Bidirectional LSTM
A Bidirectional LSTM (BiLSTM) is a sequence processing architecture that consists of two LSTMs: one taking the input in a forward direction, and the other taking it in a backward direction. This allows the network to capture both past and future context at any point in the sequence.
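A toy sketch of the two-direction structure, using scalar inputs and a scalar hidden state to keep it short. The parameter layout (a dictionary mapping each gate name to a `(w_x, w_h, b)` tuple) and the function names are assumptions for this example; real implementations use vectors and weight matrices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w_x, w_h, b)."""
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x + w_h * h + b
    i = sigmoid(pre("input"))
    f = sigmoid(pre("forget"))
    o = sigmoid(pre("output"))
    g = math.tanh(pre("candidate"))
    c_new = f * c + i * g            # update the cell state
    h_new = o * math.tanh(c_new)     # expose a gated view of it
    return h_new, c_new

def run_lstm(xs, p):
    """Run the cell over a sequence and collect the hidden state at each step."""
    h, c = 0.0, 0.0
    hs = []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hs.append(h)
    return hs

def bilstm(xs, p_fwd, p_bwd):
    """Pair each position's forward state with its backward state."""
    forward = run_lstm(xs, p_fwd)
    backward = run_lstm(xs[::-1], p_bwd)[::-1]  # re-align to input order
    return list(zip(forward, backward))
```

At position t, the forward state summarizes inputs up to t and the backward state summarizes inputs from t to the end, so changing a later input alters only the backward half of earlier positions.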
BitNet
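BitNet is a low-bit neural network architecture designed for highly efficient LLMs, particularly at inference time. The original BitNet uses binary (1-bit) weights, while the BitNet b1.58 variant quantizes weights to ternary values (-1, 0, or 1), or about 1.58 bits per weight. With weights restricted to these values, most of the expensive floating-point multiplications in matrix products reduce to cheap integer additions and subtractions.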
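The ternary case can be sketched as follows, assuming absmean scaling (divide by the mean absolute weight, round, and clip to [-1, 1]). Activations are kept as floats here for simplicity, whereas a real low-bit pipeline would typically quantize them as well; the function names are hypothetical.

```python
def quantize_ternary(weights, eps=1e-8):
    """Quantize a weight matrix to {-1, 0, 1} plus one shared scale."""
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat)  # mean absolute weight
    q = [[max(-1, min(1, round(w / (scale + eps)))) for w in row]
         for row in weights]
    return q, scale

def ternary_matvec(q, scale, x):
    """Matrix-vector product that needs only additions and subtractions."""
    out = []
    for row in q:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
            # w == 0 contributes nothing
        out.append(acc * scale)  # a single rescale per output element
    return out
```

The inner loop contains no multiplications; the only multiply is the final rescale per output element.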
Black Box
A Black Box model is an AI or machine learning system whose internal workings and decision-making logic are hidden from humans or too complex for them to interpret. Deep neural networks with billions of weights are a common example.
Blackwell
Blackwell is NVIDIA's GPU architecture, announced in 2024 as the successor to Hopper. It targets large-scale AI workloads, including training and inference for trillion-parameter large language models.
BM25
Okapi BM25 is a ranking function used by search engines to estimate the relevance of documents to a given search query. It is based on the probabilistic retrieval framework and improves upon TF-IDF by adding term frequency saturation and document length normalization.
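A compact sketch of the scoring function, assuming pre-tokenized documents, the commonly used defaults k1 = 1.5 and b = 0.75, and the non-negative IDF variant ln((N - n + 0.5) / (n + 0.5) + 1). Production search engines differ in tokenization and in some details of the formula.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms.

    docs: list of documents, each a list of tokens.
    k1 controls term-frequency saturation; b controls length normalization.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            length_norm = k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores
```

Saturation shows up in the fraction: as a term's frequency grows, its contribution approaches idf * (k1 + 1) instead of increasing without bound, so a document that repeats a term three times scores less than three times a comparable document that mentions it once.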