- Chain-of-thought prompting (CoT)
- Chain-of-thought prompting is a technique in which you write prompts that encourage the AI model to work through step-by-step reasoning before arriving at an answer. This technique can improve the accuracy and applicability of AI output (Bubeck et al., 2022). It is especially useful for math problems, logic, or any multi-step decision-making. Some so-called *reasoning models* have this prompting technique built into their system prompt and generate intermediate reasoning steps before producing a final answer.

  #### Example
  - Regular prompt: *Sarah has 5 boxes. Each box has 8 pencils. She gives 15 pencils to her friends. How many pencils does she have now?*
  - With CoT prompting: *Sarah has 5 boxes. Each box has 8 pencils. She gives 15 pencils to her friends. How many pencils does she have now? Let's think step by step.*

  CoT prompting often causes the LLM to consume more tokens, which makes it more computationally expensive.
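
  In practice you can trigger CoT directly in an API call. Below is a minimal sketch, assuming the `openai` Python package and an `OPENAI_API_KEY` in the environment; the model name is illustrative.

  ```python
  # Minimal CoT sketch, assuming the `openai` package and an API key in the
  # environment. The model name is illustrative; use whichever you have access to.
  from openai import OpenAI

  client = OpenAI()

  question = (
      "Sarah has 5 boxes. Each box has 8 pencils. "
      "She gives 15 pencils to her friends. How many pencils does she have now?"
  )

  # Appending "Let's think step by step." is the classic CoT trigger phrase.
  response = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[{"role": "user", "content": question + " Let's think step by step."}],
  )
  print(response.choices[0].message.content)
  ```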
- Context window
- The context window is the maximum number of tokens (words or parts of words) that an AI model can process and consider simultaneously when generating a response. It is essentially the “memory” capacity of the model during an interaction or task. Models with larger context windows can handle larger attachments/prompts/inputs and sustain “memory” of a conversation for longer (Fogarty, 2023).
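
  Because inputs longer than the window are truncated or rejected, it is useful to count tokens before sending a large prompt. Below is a minimal sketch, assuming the `tiktoken` package; the 128K window figure is illustrative rather than tied to a specific model.

  ```python
  # Count tokens to check whether a prompt fits a model's context window.
  # Assumes the `tiktoken` package; the window size below is illustrative.
  import tiktoken

  CONTEXT_WINDOW = 128_000  # e.g., a "128K" model
  enc = tiktoken.get_encoding("cl100k_base")

  prompt = "Summarize the attached report. " * 10_000
  n_tokens = len(enc.encode(prompt))

  print(f"{n_tokens} tokens; fits: {n_tokens <= CONTEXT_WINDOW}")
  ```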
- Cut-off date
- The date on which the datasets are frozen before the training run of a neural network model is called the **cut-off date**. Models cannot recall any data from after their cut-off date. For example, if a model has a cut-off date of 12 Nov 2023, it might be able to recall an event that took place on Nov 11 if it was sufficiently publicized, but it will know nothing about an event that took place on Nov 13, one day after its cut-off date. DeepSeek's V3, for instance, has a cut-off date of July 2024. Most of the time you can ask models for their cut-off dates.

  #### DeepSeek's response to the question "What version of DeepSeek are you?"
  > I am DeepSeek-V3, the latest version of DeepSeek's AI model. My knowledge is up to July 2024, and I possess strong capabilities in natural language processing, reasoning, and code understanding. I support 128K context length, allowing me to handle long conversations and complex documents efficiently. Additionally, I have file-reading capabilities, meaning I can process and analyze uploaded documents in formats like PDF, Word, Excel, PowerPoint, and plain text.
- Diffusion model
- A diffusion model is a type of generative AI model that learns to create data (like an image) by progressively refining random noise into a coherent structure. It's like a sculptor starting with a shapeless block of marble and gradually chipping away to reveal an image. Diffusion models can take a text prompt and generate an image, but they are also capable of editing images when given an image as input. Diffusion models can also produce image sequences for the generation of video.
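
  A minimal text-to-image sketch, assuming the `diffusers` and `torch` packages and a CUDA GPU; the checkpoint name is one public example.

  ```python
  # Text-to-image with a diffusion model. Assumes `diffusers` and `torch`
  # are installed and a CUDA GPU is available; the checkpoint is one example.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  # The pipeline starts from random noise and iteratively denoises it,
  # guided by the text prompt, until a coherent image emerges.
  image = pipe("a lighthouse at dusk, oil painting").images[0]
  image.save("lighthouse.png")
  ```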
- Frontier model
- A frontier model refers to the most advanced, capable, and powerful AI model at the cutting edge of research and development. It represents the current state of the art, pushing the boundaries of what is possible in terms of scale, performance, and general capabilities. Think of it like the frontier on a map: the furthest point of exploration and settlement. In AI, these models are at that furthest point, exploring new territories of capability. GPT-5, Claude Opus 4, and Gemini 2.5 are examples of so-called frontier models.
- LoRA model
- LoRA (Low-Rank Adaptation) is a fine-tuning technique. Its primary purpose is to reduce the number of parameters that need to be trained when adapting a large pre-trained model (like GPT-3, LLaMA, or Stable Diffusion) to a specific task. Instead of fine-tuning the entire model (which can have billions of parameters and require immense computational resources), LoRA freezes the original model weights and trains a much smaller set of additional weights to learn the new ability. This makes the process faster, cheaper, and more accessible. The open weights AI community uses this technique to give models specialized abilities. A classic example of this type of adaptation is teaching a diffusion model to generate anime-style artwork.
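
  A minimal sketch of a LoRA setup, assuming the `peft` and `transformers` packages; the base model and hyperparameters are illustrative.

  ```python
  # Attach LoRA adapters to a frozen pre-trained model. Assumes the `peft`
  # and `transformers` packages; GPT-2 and the hyperparameters are illustrative.
  from transformers import AutoModelForCausalLM
  from peft import LoraConfig, get_peft_model

  base = AutoModelForCausalLM.from_pretrained("gpt2")

  config = LoraConfig(
      r=8,                        # rank of the low-rank update matrices
      lora_alpha=16,              # scaling factor for the update
      target_modules=["c_attn"],  # which layers get adapters (GPT-2 attention)
  )

  model = get_peft_model(base, config)  # original weights stay frozen
  model.print_trainable_parameters()    # only a fraction of a percent is trainable
  ```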
- Multimodal model
- A multimodal model is an AI model capable of processing and generating multiple types of input/output, such as text, images, audio, and video. Multimodal tools (e.g., GPT-4 with vision) can, for example, describe an image, generate captions or code from a diagram, or accept spoken input and answer with text-to-speech audio.
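
  A minimal sketch of a multimodal request (text plus an image in, text out), assuming the `openai` package; the model name and image URL are illustrative.

  ```python
  # Send text and an image in one request. Assumes the `openai` package;
  # the model name and image URL are illustrative.
  from openai import OpenAI

  client = OpenAI()

  response = client.chat.completions.create(
      model="gpt-4o",
      messages=[{
          "role": "user",
          "content": [
              {"type": "text", "text": "Describe this diagram."},
              {"type": "image_url", "image_url": {"url": "https://example.com/diagram.png"}},
          ],
      }],
  )
  print(response.choices[0].message.content)
  ```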
- Open weights
- An **open weights** model is an artificial intelligence (AI) model whose learned parameters (also known as weights) are publicly released by its creators. This allows anyone to download, use, modify, and redistribute the model's core. The choice to open a model's weights has various implications:
  - Transparency & scrutiny: Researchers can "look under the hood" to study how the model works, increasing the auditability of the model.
  - Democratization: It lowers the barrier to entry. Individuals, startups, and researchers who cannot afford the millions of dollars required to train a large model from scratch can instead download a state-of-the-art model and run it on their own hardware.
  - Customization and fine-tuning: Users can take the base model and fine-tune it on their own specific dataset for a specialized task. For example, a legal firm could fine-tune an open-weights language model on case law to create a legal assistant. This is a primary driver of innovation.
  - Privacy and control: Companies can run the model on their own private servers, ensuring that their proprietary data never leaves their premises. This is a key advantage over closed, API-based models (like GPT-4), where data must be sent to a third party.
  - Ecosystem growth: It fosters a community of developers who build tools, applications, and improvements around the model.

  #### Examples of open weights models
  - LLaMA 2 & 3 (Meta): A family of large language models.
  - Mistral & Mixtral (Mistral AI): Powerful models known for their efficient architecture.
  - Bloom (BigScience): A model specifically designed for transparency and multi-lingual support.
  - Stable Diffusion (Stability AI): The best-known open-weights model for image generation.
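
  Because the weights are public, running such a model locally takes only a few lines. A minimal sketch, assuming the `transformers` and `torch` packages and enough memory for the chosen checkpoint.

  ```python
  # Download an open weights model from the Hugging Face Hub and run it
  # locally. Assumes `transformers` and `torch`; the model name is one example.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
  model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

  inputs = tokenizer("Open weights models allow", return_tensors="pt")
  outputs = model.generate(**inputs, max_new_tokens=30)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```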
- Parameters
- In neural networks, the concept of a parameter can be seen in its simplest form in the perceptron. Each connection between two neurons is regulated by a numeric value that weighs the relevance of that connection. This numeric value is called a parameter. Modern neural network systems like LLMs often have billions of parameters. The more parameters a model has, the more complex its behaviour, the more concepts it can encode, and the more computational power it needs.
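
  A minimal sketch of counting the parameters of a small network, assuming the `torch` package.

  ```python
  # Count the parameters of a small network. Assumes the `torch` package.
  import torch.nn as nn

  model = nn.Sequential(
      nn.Linear(784, 256),  # 784*256 weights + 256 biases
      nn.ReLU(),
      nn.Linear(256, 10),   # 256*10 weights + 10 biases
  )

  n_params = sum(p.numel() for p in model.parameters())
  print(n_params)  # 203530 for this tiny network; LLMs have billions
  ```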
- Perceptron
- The *perceptron* is a mathematical model that mimics the behaviour of a biological neuron, invented by Frank Rosenblatt in 1957. Perceptrons are the building blocks of neural networks.

  #### Analogy behind the perceptron
  ![The mathematical model of a neuron.](/images/Rosenblattperceptron.png)

  A neuron is *firing* when the output value `y = 1`.

  [Read more about how perceptrons work](https://towardsdatascience.com/the-concept-of-artificial-neurons-perceptrons-in-neural-networks-fab22249cbfc/)
  [Read more about the history of the perceptron](https://news.cornell.edu/stories/2019/09/professors-perceptron-paved-way-ai-60-years-too-soon)
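
  A minimal sketch of a perceptron in plain `numpy`; here the weights are set by hand (so the neuron acts as a logical AND gate) rather than learned.

  ```python
  # A single perceptron: weighted inputs plus a bias, passed through a
  # step function. Assumes `numpy`; weights are hand-picked, not learned.
  import numpy as np

  def perceptron(x, w, b):
      """Fire (return 1) when the weighted sum of inputs crosses the threshold."""
      return 1 if np.dot(w, x) + b > 0 else 0

  w = np.array([1.0, 1.0])  # one weight per input connection
  b = -1.5                  # bias sets the firing threshold

  for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
      print(x, "->", perceptron(np.array(x), w, b))
  # Only (1, 1) makes the neuron fire: a logical AND gate.
  ```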
- Prompt
- An artificial intelligence (AI) prompt is a question, command, or statement that a person gives to an AI model, such as a large language model, to guide it in generating a specific response. The prompt provides the AI with the necessary context or instructions so it can produce output relevant to your request. Depending on how the prompt is phrased, the AI can generate a variety of outputs, from a single word to a detailed paragraph. It acts as the starting point for the AI's generation process, directing it to create content that matches your intent. Prompts can range from a simple *"Translate this sentence to French"* to complex, multi-part instructions or scenarios.
- Prompt engineering
- Prompt engineering is the practice of designing effective prompts to guide an AI model's output. This involves setting roles, specifying format, adding constraints, or giving examples to improve the quality, tone, or relevance of the response. Prompts are often targeted at a specific LLM and a specific version, and need to be re-checked when new versions of the target LLMs are released. A sketch of these ingredients follows below.

  #### Learn more about prompting techniques
  These two guides can get you to a good level.
  - [Prompting guide](https://www.promptingguide.ai/introduction/tips)
  - [Learnprompting.org](https://learnprompting.org/docs/introduction)

  #### Prompt libraries
  You can improve your prompting skills by studying how others do it. There are many prompt libraries and prompting communities that share their findings. A popular one is [prompts.chat](https://prompts.chat/)
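
  A minimal sketch of assembling an engineered prompt from a role, a format constraint, and a few-shot example; the wording is illustrative.

  ```python
  # Build an engineered prompt from its typical ingredients. All wording
  # here is illustrative; send the resulting string to the model you use.

  # Role: tells the model who it should act as.
  role = "You are a product copywriter."

  # Format constraint: pins down the output structure.
  format_rule = 'Respond in JSON: {"name": ..., "tagline": ...}. Keep the tagline under 10 words.'

  # Few-shot example: shows the model exactly what a good answer looks like.
  example = (
      "Item: solar-powered lantern\n"
      '{"name": "solar-powered lantern", "tagline": "Light anywhere the sun shines."}'
  )

  prompt = "\n\n".join([role, format_rule, example, "Item: noise-cancelling headphones"])
  print(prompt)
  ```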
- Safetensors
- Safetensors is a modern, secure, and fast format for storing model weights (the learned parameters of a neural network). It was developed by Hugging Face to replace less secure formats, and it is the recommended format for sharing models on platforms like the Hugging Face Hub. Advantages include:
  - Faster loading: It is significantly faster to load, especially on GPU, because the tensors are ready to be copied directly to the GPU without any intermediary processing.
  - Lazy loading: It allows for efficient loading of parts of a model rather than the whole thing into memory, which is crucial for large models and multi-GPU setups.
  - Interoperability: It works with multiple ML frameworks.
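
  A minimal sketch of saving and loading weights in this format, assuming the `safetensors` and `torch` packages.

  ```python
  # Save and load tensors in the safetensors format.
  # Assumes the `safetensors` and `torch` packages.
  import torch
  from safetensors.torch import save_file, load_file

  # Save: a flat dict of named tensors is written to disk.
  weights = {"layer1.weight": torch.randn(256, 784), "layer1.bias": torch.zeros(256)}
  save_file(weights, "model.safetensors")

  # Load: tensors come back ready to copy into a model's state dict.
  restored = load_file("model.safetensors")
  print(restored["layer1.weight"].shape)  # torch.Size([256, 784])
  ```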
- System prompt
- A system prompt is a set of instructions provided to the AI model behind the scenes, before user interaction begins. These instructions may be hidden from the user. System prompts set the behavior, tone, or boundaries for how the AI should respond (e.g., “You are a helpful teaching assistant”).
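
  A minimal sketch of setting a system prompt via an API, assuming the `openai` package; the model name is illustrative.

  ```python
  # Set a system prompt alongside the user's message. Assumes the `openai`
  # package; the model name is illustrative.
  from openai import OpenAI

  client = OpenAI()

  response = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[
          # The system message is set by the developer, not shown to the user.
          {"role": "system", "content": "You are a helpful teaching assistant."},
          {"role": "user", "content": "Explain photosynthesis in one paragraph."},
      ],
  )
  print(response.choices[0].message.content)
  ```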
- Temperature
- An AI tool’s temperature setting controls how deterministic or creative that AI model’s output is. Lower values (e.g., 0.2) lead to more focused and consistent answers, while higher values (e.g., 0.8) produce more varied or imaginative responses.
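
  Under the hood, temperature rescales the model's next-token scores (logits) before sampling: low values sharpen the distribution, high values flatten it. A minimal sketch in `numpy` with illustrative logits.

  ```python
  # How temperature reshapes the next-token distribution: logits are divided
  # by the temperature before softmax. Assumes `numpy`; logits are illustrative.
  import numpy as np

  logits = np.array([2.0, 1.0, 0.5])  # scores for three candidate next tokens

  def softmax_with_temperature(logits, temperature):
      scaled = logits / temperature
      exps = np.exp(scaled - scaled.max())  # subtract max for numerical stability
      return exps / exps.sum()

  print(softmax_with_temperature(logits, 0.2))  # ~[0.99, 0.01, 0.00]: nearly always the top token
  print(softmax_with_temperature(logits, 0.8))  # ~[0.69, 0.20, 0.11]: other tokens get real chances
  ```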
- Token
- In the context of AI, especially with Large Language Models (LLMs) like ChatGPT, a token is the basic unit of text that the model processes. For English text, one token is roughly equivalent to 4 characters or ¾ of a word. This means:
  - Very common words are often their own token (e.g., "the", "is", "and").
  - Longer words are broken down into multiple tokens (e.g., "unnecessarily" might be broken into "un", "necessarily").
  - Punctuation and spaces are also tokens.

  This process of breaking text into tokens is called **tokenization**.

  #### Examples of tokenization
  - Sentence: "I'm eating a strawberry."
  - Tokens: `"I"`, `"'m"`, `" eating"`, `" a"`, `" straw"`, `"berry"`, `"."`
  - Code snippet: `def calculate_sum(a, b):`
  - Tokens: `"def"`, `" calculate"`, `"_"`, `"sum"`, `"(a"`, `","`, `" b"`, `"):"`

  You can try the [OpenAI tokenizer tool](https://platform.openai.com/tokenizer) to see for yourself how text is converted into tokens. Tokens also matter from a user's perspective because they are the unit in which the cost of interacting with frontier models is calculated: users of LLM APIs typically pay per million tokens.
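
  A minimal sketch of tokenization in code, assuming the `tiktoken` package (the tokenizer used by several OpenAI models).

  ```python
  # Tokenize a sentence and show the text piece behind each token id.
  # Assumes the `tiktoken` package.
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")

  ids = enc.encode("I'm eating a strawberry.")
  print(ids)                             # the token ids the model actually sees
  print([enc.decode([i]) for i in ids])  # the text fragment behind each id
  ```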
- Training
- In the context of AI, particularly machine learning (ML) and deep learning, training is the fundamental process of teaching a computer program to perform a task by letting it learn from examples, rather than programming it with explicit rules. It is the core of how modern AI systems become "intelligent." These examples come from very large datasets that can contain a significant proportion of the internet. Depending on the model being trained, these datasets might include text, images, audio, video, and other types of media. Training is most often a one-off process in the lifecycle of an AI model, and it is the most expensive phase of the lifecycle in terms of computation. The date on which the datasets are frozen before training is called the **cut-off date**.
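
  A minimal sketch of the core training loop, assuming the `torch` package; the data is random and the task deliberately trivial, but the loop has the same shape at any scale.

  ```python
  # The basic training loop: predict, measure the error, adjust parameters.
  # Assumes the `torch` package; data and task are deliberately trivial.
  import torch
  import torch.nn as nn

  model = nn.Linear(10, 1)  # a tiny model: 10 inputs -> 1 output
  optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
  loss_fn = nn.MSELoss()

  X = torch.randn(100, 10)         # example inputs
  y = X.sum(dim=1, keepdim=True)   # the rule to learn from examples

  for step in range(200):
      optimizer.zero_grad()
      loss = loss_fn(model(X), y)  # how wrong is the model right now?
      loss.backward()              # compute gradients
      optimizer.step()             # nudge parameters to reduce the error

  print(loss.item())  # shrinks toward zero as the model learns the rule
  ```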
Here you can find terms related to the day-to-day of the MDD community.