A secret token used to authenticate your application with the Arcbeam platform. API keys are created in the Arcbeam dashboard and should be stored securely in environment variables.
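Reading the key from an environment variable keeps it out of source control. A minimal sketch, assuming a variable named `ARCBEAM_API_KEY` (a hypothetical name; check the Arcbeam dashboard or docs for the exact one):

```python
import os

# "ARCBEAM_API_KEY" is a hypothetical variable name used for illustration.
api_key = os.environ.get("ARCBEAM_API_KEY", "")
if not api_key:
    # Fail loudly at startup rather than at the first API call.
    print("Warning: ARCBEAM_API_KEY is not set; the client will fail to authenticate.")
```

Set the variable in your shell or deployment config (e.g. `export ARCBEAM_API_KEY=...`) rather than hard-coding the token.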
A curated set of traces grouped together for team review and analysis. Collections allow you to organize interesting or problematic traces, add comments, and collaborate with stakeholders on improvements.
The Arcbeam client library that instruments your LLM application to send traces to the Arcbeam platform. Available for Python (LangChain/LangGraph) with additional framework support coming soon.
An external system connected to Arcbeam, such as a vector database (e.g., pgvector). Data sources enable Arcbeam to link traces with the actual documents retrieved during execution.
A collection of documents from a connected data source (vector database). Datasets represent your knowledge base and can be analyzed to understand document usage and quality.
A label applied to traces to distinguish between different deployment contexts (e.g., “development”, “staging”, “production”). Environments help organize and filter traces in the dashboard.
An individual step within a trace representing a single operation, such as a model call, retrieval, or tool execution. Traces are composed of multiple nodes organized in a parent-child hierarchy.
A complete record of a single LLM interaction, including all steps from input to output. Traces capture model calls, retrievals, tool executions, timing, costs, and errors.
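The parent-child structure of nodes within a trace can be pictured with a toy data model (illustrative only; not the Arcbeam SDK's actual types):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Toy stand-in for one step in a trace (model call, retrieval, tool execution)."""
    name: str
    duration_ms: float
    children: list["Node"] = field(default_factory=list)

def total_duration(node: Node) -> float:
    """Toy roll-up: sum a node's own time with all of its descendants."""
    return node.duration_ms + sum(total_duration(c) for c in node.children)

# One trace: an agent run that retrieves, calls a model, and invokes a tool.
trace_root = Node("agent_run", 5.0, [
    Node("retrieval", 40.0),
    Node("model_call", 800.0, [Node("tool_execution", 120.0)]),
])
print(total_duration(trace_root))  # 965.0
```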
An autonomous system that uses an LLM to reason about tasks, make decisions, and execute actions using tools. Agents can perform multi-step workflows without explicit programming for each step.
The maximum amount of text (measured in tokens) that an LLM can process in a single request, including both input and output. Different models have different context window sizes (e.g., 4K, 8K, 128K tokens).
A numerical vector representation of text that captures semantic meaning. Embeddings enable similarity search in vector databases and are fundamental to RAG systems.
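Similarity between embeddings is commonly measured with cosine similarity. A self-contained sketch with toy 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically close texts should yield nearby vectors.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```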
When an LLM generates information that is factually incorrect, fabricated, or not grounded in its training data or provided context. A key challenge in production LLM applications.
An open-source framework for building applications with LLMs. Provides abstractions for chains, agents, memory, and integrations with various LLM providers and tools.
A library built on LangChain for creating stateful, multi-step agent workflows using graph-based orchestration. Enables complex agent behaviors with cycles and conditional logic.
A technique that combines information retrieval with LLM generation. The system retrieves relevant documents from a knowledge base and includes them in the prompt to ground the LLM’s response in factual information.
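The retrieve-then-prompt flow can be sketched end to end. This is a minimal illustration with a toy word-overlap retriever; a real system would query a vector database with embeddings and send the prompt to an actual model API (both `retrieve` and `build_prompt` are hypothetical helpers, not Arcbeam or LangChain functions):

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    return sorted(documents, key=lambda d: -len(q & set(d.lower().split())))[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the model by placing retrieved documents ahead of the question."""
    joined = "\n".join(f"- {doc}" for doc in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

docs = [
    "Arcbeam collects traces via OpenTelemetry.",
    "Temperature controls randomness of LLM output.",
    "pgvector adds vector search to PostgreSQL.",
]
query = "How does Arcbeam collect traces?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)
```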
Instructions provided to an LLM that define its behavior, role, and constraints for a conversation. System prompts guide the model’s responses and are typically invisible to end users.
A parameter controlling the randomness of LLM outputs. Lower values (e.g., 0.0-0.3) produce more focused and deterministic responses, while higher values (e.g., 0.7-1.0) increase creativity and variation.
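Mechanically, temperature divides the model's logits before the softmax: low values sharpen the distribution toward the top token, high values flatten it. A small sketch of that math (temperature 0 is typically implemented as greedy argmax rather than division):

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Scale logits by 1/temperature, then softmax (numerically stabilized)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # nearly all mass on the top token
hot = softmax_with_temperature(logits, 1.5)   # probabilities much more even
print(round(cold[0], 3), round(hot[0], 3))
```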
The basic unit of text processed by LLMs. Tokens can be words, parts of words, or punctuation. Most LLMs charge based on token usage, and context windows are measured in tokens.
When an LLM decides to invoke an external function or API to retrieve information or perform an action. Tools extend LLM capabilities beyond text generation.
A specialized database optimized for storing and searching embeddings. Vector databases enable semantic search by finding documents with similar meaning rather than exact keyword matches. Examples include pgvector, Pinecone, and Weaviate.
The time elapsed between sending a request and receiving a response. In LLM applications, latency includes model processing time, retrieval time, and network overhead.
An open-source observability framework for collecting traces, metrics, and logs from applications. Arcbeam uses OpenTelemetry as the underlying protocol for trace collection.
A single unit of work within a distributed trace, representing an operation with a start time, duration, and metadata. In OpenTelemetry, spans are the building blocks of traces (similar to Arcbeam nodes).
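The span shape (name, start time, duration, attributes) can be mimicked with a toy context manager. This is illustrative only; the real API is OpenTelemetry's `tracer.start_as_current_span()` from the `opentelemetry-api` package:

```python
import time
from contextlib import contextmanager

spans: list[dict] = []  # toy in-memory exporter

@contextmanager
def span(name: str, **attributes):
    """Record a name, duration, and metadata for one unit of work."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({
            "name": name,
            "duration_ms": (time.perf_counter() - start) * 1000,
            "attributes": attributes,
        })

with span("retrieval", source="pgvector"):
    time.sleep(0.01)  # stand-in for real work

print(spans[0]["name"])
```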
A segment of a larger document split into smaller pieces for embedding and storage in a vector database. Chunking strategies balance context preservation with retrieval precision.
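A fixed-size chunker with overlap is the simplest strategy; production pipelines often split on sentence or token boundaries instead. A minimal sketch (`chunk_text` is a hypothetical helper for illustration):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Slice text into chunk_size-character pieces; each chunk repeats the last
    `overlap` characters of the previous one to preserve context across splits."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 500
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 4 [200, 200, 200, 50]
```

Note the trailing chunk is shorter; larger overlap preserves more context at the cost of more chunks to embed and store.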
The ability to track data from its source through transformations to its final use. In Arcbeam, data lineage connects traces to the specific documents retrieved from vector databases.
Additional information attached to documents, traces, or nodes. Metadata can include timestamps, user IDs, document sources, or custom attributes for filtering and analysis.
A numerical value indicating the relevance of a retrieved document to a query. Higher scores suggest stronger semantic similarity. Also called similarity score or relevance score.
The total number of tokens consumed by an LLM interaction, including both input and output tokens. Used to calculate costs and track resource consumption.
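Cost tracking typically multiplies input and output token counts by per-1K-token prices. A sketch with made-up placeholder prices (not any real model's pricing):

```python
# Hypothetical prices, for illustration only.
PRICE_PER_1K_INPUT = 0.0005   # USD per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.0015  # USD per 1,000 output tokens

def interaction_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one interaction: each direction is billed at its own rate."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

input_tokens, output_tokens = 1200, 300
total_tokens = input_tokens + output_tokens  # token usage = input + output
cost = interaction_cost(input_tokens, output_tokens)
print(total_tokens, round(cost, 6))  # 1500 0.00105
```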