Artificial Intelligence Glossary

Accelerator

A class of microprocessors designed to accelerate AI applications.

​

Agents

Software that can perform specific tasks independently and proactively without human intervention, often utilizing a suite of tools like calculators or web browsers.

 

AGI (Artificial General Intelligence)

Though not widely agreed upon, Microsoft researchers have defined AGI as artificial intelligence that is as capable as a human at any intellectual task.

​

Alignment

The task of ensuring that an AI system's goals align with human values.

​

ASI (Artificial Super Intelligence)

Though subject to debate, ASI is commonly defined as artificial intelligence that surpasses the human mind's capabilities.

 

Attention

In neural networks, attention mechanisms help the model focus on relevant parts of the input when producing an output.
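
As a rough illustration, the core of a scaled dot-product attention mechanism can be sketched in Python with NumPy; the array shapes and random values below are made up for the example and not tied to any particular model:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Scores say how much each query position should attend to each key position.
        scores = Q @ K.T / np.sqrt(K.shape[-1])           # (n_queries, n_keys)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the keys
        return weights @ V                                # weighted sum of the values

    # Toy example: 3 query positions, 4 key/value positions, dimension 8.
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(Q, K, V).shape)    # (3, 8)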

​

Back Propagation

An algorithm commonly used in training neural networks; it refers to the method of computing the gradient of the loss function with respect to the network's weights.
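
For a single linear layer with a mean-squared-error loss, the gradient can be worked out by hand; the NumPy sketch below is illustrative only, with made-up data:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))           # 5 examples, 3 features
    y = rng.normal(size=(5, 1))           # targets
    W = rng.normal(size=(3, 1))           # weights of a single linear layer

    pred = X @ W                          # forward pass
    loss = np.mean((pred - y) ** 2)       # mean squared error

    # Backward pass: the chain rule gives the gradient of the loss with respect to W.
    grad_pred = 2 * (pred - y) / len(y)   # dLoss/dpred
    grad_W = X.T @ grad_pred              # dLoss/dW, used to update the weights
    print(grad_W.shape)                   # (3, 1)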

​

Bias

Assumptions made by an AI model about the data. A “bias variance tradeoff” is the balance that must be achieved between assumptions a model makes about the data and the amount a model’s predictions change, given different training data. Inductive bias is the set of assumptions that a machine learning algorithm makes about the underlying distribution of the data.

 

Chain of Thought

In AI, this term is often used to describe an AI model's reasoning steps to arrive at a decision.

​

Chatbot

A computer program designed to simulate human conversation through text or voice interactions. Chatbots often utilize natural language processing techniques to understand user input and provide relevant responses.

 

ChatGPT

A large-scale AI language model developed by OpenAI that generates human-like text.

​

CLIP (Contrastive Language–Image Pretraining)

An AI model developed by OpenAI that connects images and text, allowing it to understand and generate descriptions of images.

​

Compute

The computational resources (like CPU or GPU time) used in training or running AI models.

​

Convolutional Neural Network (CNN)

A deep learning model that processes data with a grid-like topology (e.g., an image) by applying a series of filters. Such models are often used for image recognition tasks.
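
For illustration, the basic filtering operation at the heart of a CNN can be sketched in plain NumPy; the toy image and filter below are made up, and a real CNN learns its filters during training:

    import numpy as np

    def convolve2d(image, kernel):
        # Slide the filter over the image and sum the elementwise products (no padding).
        h, w = kernel.shape
        out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
        return out

    image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
    edge_filter = np.array([[1.0, -1.0]])              # responds to horizontal changes
    print(convolve2d(image, edge_filter))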

​

Data Augmentation

Increasing the amount and diversity of data used for training a model by adding slightly modified copies of existing data.
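
A minimal sketch in NumPy, using a made-up 3x3 "image", showing a few common augmentations:

    import numpy as np

    image = np.arange(9, dtype=float).reshape(3, 3)    # toy grayscale "image"

    # Each augmented copy is a slightly modified version of the original example.
    rng = np.random.default_rng(0)
    augmented = [
        np.fliplr(image),                                  # horizontal flip
        np.rot90(image),                                   # 90-degree rotation
        image + rng.normal(scale=0.1, size=image.shape),   # small random noise
    ]
    print(len(augmented), "augmented copies from one original example")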

​

Deep Learning

A subfield of machine learning that focuses on training neural networks with many layers, enabling them to learn complex patterns.

​

Diffusion

In AI and machine learning, diffusion is a technique for generating new data by gradually adding random noise to real data and training a neural network to predict the reverse of that noising process. Diffusion models generate new data samples similar to the training data.

 

Double Descent

A phenomenon in machine learning in which model performance first improves as complexity increases, then worsens, and then improves again.

​

Embedding

Data representation in a new form, often a vector space. Similar data points have more similar embeddings.
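
As a sketch, the word vectors below are invented, but they show the idea that similar items end up with similar embeddings:

    import numpy as np

    # Hypothetical embeddings: each word mapped to a vector (values made up for the example).
    embeddings = {
        "cat": np.array([0.9, 0.1, 0.3]),
        "dog": np.array([0.8, 0.2, 0.35]),
        "car": np.array([0.1, 0.9, 0.7]),
    }

    def cosine_similarity(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high: related concepts
    print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # lower: unrelated concepts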

​

Emergence/Emergent Behavior (“sharp left turns,” intelligence explosions)

In AI, emergence refers to complex behavior arising from simple rules or interactions. “Sharp left turns” and “intelligence explosions” are speculative scenarios where AI development takes sudden and drastic shifts, often associated with the arrival of AGI.

​

End-to-End Learning

A type of machine learning model that does not require hand-engineered features. The model is fed raw data and expected to learn from these inputs.

​

Expert Systems

An application of artificial intelligence technologies that provides solutions to complex problems within a specific domain.

​

Explainable AI (XAI)

A subfield of AI focused on creating transparent models that provide clear and understandable explanations of their decisions.

​

Fine-tuning

The process of taking a machine learning model that has already been trained on a large dataset (a pre-trained model) and adapting it to a slightly different task or specific domain. During fine-tuning, the model’s parameters are further adjusted using a smaller, task-specific dataset, allowing it to learn task-specific patterns and improve performance on the new task.

​

Forward Propagation

In a neural network, forward propagation is the process where input data is fed into the network and passed through each layer (from the input layer to the hidden layers and finally to the output layer) to produce the output. The network applies weights and biases to the inputs and uses activation functions to generate the final output.
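
A minimal NumPy sketch of a forward pass through a two-layer network; the weights here are random, whereas a trained network would use learned values:

    import numpy as np

    def relu(x):
        return np.maximum(0, x)           # a common activation function

    rng = np.random.default_rng(0)
    x = rng.normal(size=(1, 4))                      # one input example, 4 features

    # Weights and biases of a two-layer network (randomly initialized for the sketch).
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # input layer -> hidden layer
    W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)    # hidden layer -> output layer

    hidden = relu(x @ W1 + b1)     # apply weights, biases, and the activation function
    output = hidden @ W2 + b2      # final output of the forward pass
    print(output.shape)            # (1, 2)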

​

Foundation Model

A large AI model trained on broad data that is meant to be adapted to a range of specific tasks.

​

Generative Adversarial Network (GAN)

A type of machine learning model used to generate new data similar to some existing data. It pits two neural networks against each other: a “generator,” which creates new data, and a “discriminator,” which tries to distinguish that data from real data.

​

Generative AI

A branch of AI focused on creating models that can generate new and original content, such as images, music, or text, based on patterns and examples from existing data.

 

GPT (Generative Pretrained Transformer)

A large-scale AI language model developed by OpenAI that generates human-like text.

 

GPU (Graphics Processing Unit)

A specialized type of microprocessor originally designed to rapidly render images for output to a display. GPUs are also highly efficient at performing the calculations needed to train and run neural networks.

​

Gradient Descent

In machine learning, gradient descent is an optimization method that gradually adjusts a model’s parameters by moving them in the direction that most rapidly decreases its loss function (the negative of the gradient). In linear regression, for example, gradient descent helps find the best-fit line by repeatedly refining the line’s slope and intercept to minimize prediction errors.
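
The linear-regression case can be sketched in a few lines of NumPy; the data points below are made up to roughly follow y = 2x + 1:

    import numpy as np

    # Made-up data roughly following y = 2x + 1.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

    slope, intercept, lr = 0.0, 0.0, 0.01
    for _ in range(5000):
        error = slope * x + intercept - y
        # Gradients of the mean squared error with respect to slope and intercept.
        grad_slope = 2 * np.mean(error * x)
        grad_intercept = 2 * np.mean(error)
        # Step in the direction that reduces the loss.
        slope -= lr * grad_slope
        intercept -= lr * grad_intercept

    print(round(slope, 2), round(intercept, 2))   # close to 2 and 1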

​

Hallucinate/Hallucination

In the context of AI, hallucination refers to the phenomenon in which a model generates content that is not based on actual data or is significantly different from reality.

​

Hidden Layer

Layers of artificial neurons in a neural network that are not directly connected to the input or output.

​

Hyperparameter Tuning

The process of selecting the appropriate values for a machine learning model's hyperparameters (parameters not learned from the data).
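
A minimal sketch of one common approach, grid search scored on a held-out validation set; the data, candidate values, and ridge-regression model are made up for the example:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)
    X_train, y_train, X_val, y_val = X[:30], y[:30], X[30:], y[30:]

    def train_ridge(X, y, alpha):
        # Closed-form ridge regression; alpha is the hyperparameter being tuned.
        return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

    best = None
    for alpha in [0.01, 0.1, 1.0, 10.0]:                  # candidate hyperparameter values
        w = train_ridge(X_train, y_train, alpha)
        val_error = np.mean((X_val @ w - y_val) ** 2)     # score on validation data
        if best is None or val_error < best[1]:
            best = (alpha, val_error)

    print("best alpha:", best[0])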

​

Inference

The process of making predictions with a trained machine learning model.

​

Instruction Tuning

A technique in machine learning where models are fine-tuned based on specific instructions given in the dataset.

​

Large Language Model (LLM)

A type of AI model that can generate human-like text and is trained on a broad dataset.

​

Latent Space

In machine learning, this term refers to the compressed representation of data that a model (like a neural network) creates. Similar data points are closer in latent space.

​

Loss Function (or Cost Function)

A function that a machine learning model seeks to minimize during training. It quantifies how far the model’s predictions are from the actual values.
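
For example, mean squared error is a common loss function for regression; the numbers below are made up:

    import numpy as np

    def mse_loss(predictions, targets):
        # Average squared distance between the model's predictions and the actual values.
        return np.mean((predictions - targets) ** 2)

    print(mse_loss(np.array([2.5, 0.0, 2.0]), np.array([3.0, -0.5, 2.0])))  # ~0.167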

​

Machine Learning

A type of artificial intelligence that allows systems to learn and improve from experience without being explicitly programmed.

​

Mixture of Experts

A machine learning technique where several specialized submodels (the “experts”) are trained, and their predictions are combined in a way that depends on the input.
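
A toy sketch of the idea: two "expert" linear models and a gate whose mixing weights depend on the input; all of the numbers are invented for the example:

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))
        return e / e.sum()

    # Two hypothetical experts, each a simple linear model with its own weights.
    expert_weights = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    gate_weights = np.array([[2.0, -1.0], [-1.0, 2.0]])   # maps the input to gating scores

    x = np.array([0.9, 0.1])                              # one input example
    gate = softmax(gate_weights @ x)                      # input-dependent mixing weights
    expert_outputs = np.array([w @ x for w in expert_weights])
    prediction = gate @ expert_outputs                    # weighted combination of the experts
    print(gate, prediction)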

​

Multimodal

In AI, multimodal refers to models that can understand and generate information across several data types, such as text and images.

​

Natural Language Processing (NLP)

A subfield of AI focused on the interaction between computers and humans through natural language. The ultimate objective of NLP is to read, decipher, understand, and make sense of human language in a valuable way.

 

NeRF (Neural Radiance Fields)

A method for creating a 3D scene from 2D images using a neural network. It can be used for photorealistic rendering, view synthesis, and more.

​

Neural Network

A type of AI model inspired by the human brain. It consists of connected units, or nodes, called neurons, organized in layers. Each neuron takes inputs, does some computation on them, and produces an output.

​

Objective Function

A function that a machine learning model seeks to maximize or minimize during training.

​

Overfitting

A modeling error that occurs when a function is fit too closely to a limited set of data points, resulting in poor predictive performance when applied to unseen data.

​

Parameters

In machine learning, parameters are the internal variables that the model uses to make predictions. They are learned from the training data during the training process. For example, the weights and biases are parameters in a neural network.

​

Pre-training

The initial phase of training a machine learning model, in which it learns general features, patterns, and representations from the data without specific knowledge of the task it will later be applied to. This unsupervised or semi-supervised learning process enables the model to understand the underlying data distribution and extract meaningful features that can be leveraged for subsequent fine-tuning on specific tasks.

​

Prompt

The initial context or instruction that sets the model's task or query.

​

Regularization

In machine learning, regularization is a technique used to prevent overfitting by adding a penalty term to the model’s loss function. This penalty discourages the model from relying excessively on complex patterns in the training data, promoting models that generalize better and are less prone to overfitting.
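
A minimal sketch of an L2 (ridge) penalty added to a mean-squared-error loss; the weights, predictions, and penalty strength are made up:

    import numpy as np

    def l2_regularized_loss(predictions, targets, weights, lam=0.1):
        # Data term plus a penalty that grows with the size of the weights.
        mse = np.mean((predictions - targets) ** 2)
        penalty = lam * np.sum(weights ** 2)      # L2 (ridge) penalty
        return mse + penalty

    w = np.array([0.5, -1.5])
    print(l2_regularized_loss(np.array([1.0, 2.0]), np.array([1.2, 1.8]), w))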

​

Reinforcement Learning

A type of machine learning where an agent learns to make decisions by taking actions in an environment to maximize some reward.

​

RLHF (Reinforcement Learning from Human Feedback)

A method to train an AI model by learning from human feedback on model outputs.

 

Singularity

In the context of AI, the singularity (also known as the technological singularity) refers to a hypothetical future when technological growth becomes uncontrollable and irreversible, leading to unforeseeable changes to human civilization.

​

Supervised Learning

A type of machine learning where the model is provided with labeled training data.

 

Symbolic Artificial Intelligence

A type of AI that utilizes symbolic reasoning to solve problems and represent knowledge.

​

TensorFlow

An open-source machine learning platform developed by Google that is used to build and train machine learning models.

​

TPU (Tensor Processing Unit)

A type of microprocessor developed by Google specifically for accelerating machine learning workloads.

​

Training Data

The dataset used to train a machine learning model.

​

Transfer Learning

A method in machine learning where a pre-trained model is used on a new problem.

​

Transformer

Transformers are a specific type of neural network architecture used primarily for processing sequential data, such as natural language. They are known for their ability to handle long-range dependencies in data, thanks to a mechanism called “attention,” which allows the model to weigh the importance of different inputs when producing an output.

​

Underfitting

A modeling error in statistics and machine learning that occurs when a statistical model or machine learning algorithm cannot adequately capture the underlying structure of the data.

​

Unsupervised Learning

A type of machine learning where the model is not provided with labeled training data, and instead must identify patterns in the data independently.

​

Validation Data

A subset of the dataset used in machine learning that is separate from the training and test datasets. It’s used to tune a model's hyperparameters (i.e., architecture, not weights).

 

XAI (Explainable AI)

A subfield of AI focused on creating transparent models that provide clear and understandable explanations of their decisions.

​

Zero-shot Learning

A type of machine learning where the model makes predictions for conditions not seen during training, without fine-tuning.
