Context Module Reference¶
The central nervous system for intelligent agents, managing memory, knowledge graphs, and context retrieval.
🎯 System Overview¶
The Context Module provides agents with a persistent, searchable, and structured memory system. It is built on a Synchronous Architecture 2.0, ensuring predictable state management and compatibility with modern vector stores and graph databases.
Key Capabilities¶
- Hierarchical Memory: Mimics human memory with a fast, token-limited short-term buffer and effectively unbounded long-term vector storage.
- GraphRAG: Combines unstructured vector search with structured knowledge graph traversal for deep contextual understanding.
- Hybrid Retrieval: Intelligently blends keyword (BM25), vector (dense), and graph (relational) scores for optimal relevance.
- Token Management: Automatic FIFO and importance-based pruning to keep context within LLM window limits.
- Entity Linking: Resolves ambiguities by linking text mentions to unique entities in the knowledge graph.
When to Use
- Memory Persistence: Enabling agents to remember user preferences and history.
- Complex Retrieval: When simple vector search fails to capture relationships.
- Knowledge Graph: Building a structured world model from unstructured text.
🏗️ Architecture Components¶
AgentContext (The Orchestrator)¶
The high-level facade that unifies all context operations. It routes data to the appropriate subsystems (Memory, Graph, Vector Store) and manages the lifecycle of context.
Constructor Parameters¶
- `vector_store` (Required): The backing vector database instance (e.g., FAISS, Weaviate)
- `knowledge_graph` (Optional): The graph store instance for structured knowledge
- `token_limit` (Default: `2000`): The maximum number of tokens allowed in short-term memory before pruning occurs
- `short_term_limit` (Default: `10`): The maximum number of distinct memory items in short-term memory
- `hybrid_alpha` (Default: `0.5`): The weighting factor for retrieval (`0.0` = pure vector, `1.0` = pure graph)
- `use_graph_expansion` (Default: `True`): Whether to fetch neighbors of retrieved nodes from the graph
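A minimal construction sketch using every parameter documented above. The backend settings mirror the examples further down this page, and the specific values (such as `hybrid_alpha=0.7`) are illustrative rather than recommended defaults.

```python
# Hedged sketch: an AgentContext tuned toward graph-heavy retrieval,
# using only the constructor parameters documented above.
from semantica.context import AgentContext
from semantica.vector_store import VectorStore
from semantica.graph_store import GraphStore

vs = VectorStore(backend="faiss", dimension=768)
gs = GraphStore(backend="neo4j", uri="bolt://localhost:7687",
                user="neo4j", password="password")

context = AgentContext(
    vector_store=vs,           # required: backing vector database
    knowledge_graph=gs,        # optional: structured knowledge store
    token_limit=2000,          # token budget for short-term memory
    short_term_limit=10,       # max items kept in short-term memory
    hybrid_alpha=0.7,          # weight retrieval toward the graph
    use_graph_expansion=True,  # also fetch neighbors of retrieved nodes
)
```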
Core Methods¶
| Method | Description |
|---|---|
store(content, ...) | Writes information to memory. Handles auto-detection, write-through to vector store, and entity extraction. |
retrieve(query, ...) | Fetches relevant context using hybrid search (Vector + Graph) and reranking. |
query_with_reasoning(query, llm_provider, ...) | GraphRAG with multi-hop reasoning: Retrieves context, builds reasoning paths, and generates LLM-based natural language responses grounded in the knowledge graph. |
Code Example¶
from semantica.context import AgentContext
from semantica.vector_store import VectorStore
# 1. Initialize
vs = VectorStore(backend="faiss", dimension=768)
context = AgentContext(
vector_store=vs,
token_limit=2000
)
# 2. Store Memory
context.store(
"User is working on a React project.",
conversation_id="session_1",
user_id="user_123"
)
# 3. Retrieve Context
results = context.retrieve("What is the user building?")
# 4. Query with Reasoning (GraphRAG)
from semantica.llms import Groq
import os
llm_provider = Groq(
model="llama-3.1-8b-instant",
api_key=os.getenv("GROQ_API_KEY")
)
result = context.query_with_reasoning(
query="What IPs are associated with security alerts?",
llm_provider=llm_provider,
max_results=10,
max_hops=2
)
print(f"Response: {result['response']}")
print(f"Reasoning Path: {result['reasoning_path']}")
print(f"Confidence: {result['confidence']:.3f}")
AgentMemory (The Storage Engine)¶
Manages the storage and lifecycle of memory items. It implements the Hierarchical Memory pattern.
Features & Functions¶
- Short-Term Memory (Working Memory)
    - Structure: An in-memory list of recent `MemoryItem` objects.
    - Purpose: Provides immediate context for the ongoing conversation.
    - Pruning Logic (see the sketch after this list):
        - FIFO: Removes the oldest items first when limits are reached.
        - Token-Aware: Calculates token counts to ensure the total buffer size stays under `token_limit`.
- Long-Term Memory (Episodic Memory)
    - Structure: Vector embeddings stored in the `vector_store`.
    - Purpose: Persists history indefinitely for semantic retrieval.
    - Synchronization: Automatically syncs with short-term memory during `store()` operations.
- Retention Policy
    - Time-Based: Can automatically delete memories older than `retention_days`.
    - Count-Based: Can limit the total number of memories to `max_memories`.
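The pruning logic can be pictured with the sketch below. It is an illustration of token-aware FIFO eviction, not the module's internal `_prune_short_term_memory()` code, and it assumes a naive whitespace token count where the real implementation may use a proper tokenizer.

```python
# Illustrative sketch of token-aware FIFO pruning (not the library's internal code).
from collections import deque

def prune_short_term(items: deque, token_limit: int, short_term_limit: int) -> None:
    """Drop the oldest items until both the item count and the token budget are met."""
    def total_tokens() -> int:
        # Whitespace split stands in for a real tokenizer here.
        return sum(len(item["content"].split()) for item in items)

    while items and (len(items) > short_term_limit or total_tokens() > token_limit):
        items.popleft()  # FIFO: evict the oldest item first

buffer = deque([
    {"content": "User is working on a React project."},
    {"content": "User prefers TypeScript."},
])
prune_short_term(buffer, token_limit=2000, short_term_limit=10)
```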
Key Methods¶
| Method | Description |
|---|---|
store_vectors() | Handles the low-level interaction with concrete Vector Store implementations. |
_prune_short_term_memory() | Internal algorithm that enforces token and count limits. |
get_conversation_history() | Retrieves a chronological list of interactions for a specific session. |
Code Example¶
# Accessing via AgentContext
memory = context.memory
# Get conversation history
history = memory.get_conversation_history("session_1")
for item in history:
print(f"[{item.timestamp}] {item.content}")
# Get statistics
stats = memory.get_statistics()
print(f"Stored Memories: {stats['total_memories']}")
ContextGraph (The Knowledge Structure)¶
Manages the structured relationships between entities. It provides the "World Model" for the agent.
Features & Functions¶
- Dictionary-Based Interface
    - Design: Uses standard Python dictionaries for nodes and edges, removing dependencies on complex interface classes.
    - Benefit: Simpler serialization and easier integration with external APIs.
- Graph Traversal (see the sketch after this list)
    - Adjacency List: Optimized internal structure for fast neighbor lookups.
    - Multi-Hop Search: Can traverse `k` hops from a starting node to find indirect connections.
- Node & Edge Types
    - Typed Schema: Supports distinct types for nodes (e.g., "Person", "Concept") and edges (e.g., "KNOWS", "RELATED_TO").
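Multi-hop lookup can be sketched as a breadth-first walk over an adjacency list. The function below is an illustrative stand-in for `get_neighbors(node_id, hops)`, not the class's actual implementation.

```python
# Illustrative k-hop neighbor lookup over a plain adjacency list.
from collections import deque

def neighbors_within_hops(adjacency: dict, start: str, hops: int) -> set:
    """Return all node ids reachable from `start` within `hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)
    return seen

adjacency = {"FastAPI": ["Python"], "Python": ["Programming"]}
print(neighbors_within_hops(adjacency, "FastAPI", hops=2))  # {'Python', 'Programming'}
```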
Key Methods¶
| Method | Description |
|---|---|
add_nodes(nodes) | Bulk adds nodes using a list of dictionaries. |
add_edges(edges) | Bulk adds edges using a list of dictionaries. |
get_neighbors(node_id, hops) | Returns connected nodes within a specified distance. |
query(query_str) | Performs keyword-based search specifically on graph nodes. |
Code Example¶
from semantica.context import ContextGraph
graph = ContextGraph()
# Add Nodes
graph.add_nodes([
{
"id": "Python",
"type": "Language",
"properties": {"paradigm": "OO"}
},
{
"id": "FastAPI",
"type": "Framework",
"properties": {"language": "Python"}
}
])
# Add Edges
graph.add_edges([
{
"source_id": "FastAPI",
"target_id": "Python",
"type": "WRITTEN_IN"
}
])
# Find Neighbors
neighbors = graph.get_neighbors("FastAPI", hops=1)
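The `query()` method from the table above is not shown in this example; assuming it takes a plain keyword string as documented, a lookup over the same graph would be:

```python
# Keyword-based search over graph nodes (see the Key Methods table above).
matches = graph.query("Python")
```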
Production Graph Store Integration¶
For production environments, you can replace the in-memory ContextGraph with a persistent GraphStore (Neo4j, FalkorDB) by passing it to the knowledge_graph parameter.
from semantica.context import AgentContext
from semantica.graph_store import GraphStore
# 1. Initialize Persistent Graph Store (Neo4j)
gs = GraphStore(
backend="neo4j",
uri="bolt://localhost:7687",
user="neo4j",
password="password"
)
# 2. Initialize Agent Context with Persistent Graph
context = AgentContext(
vector_store=vs, # Your VectorStore instance
knowledge_graph=gs, # Your persistent GraphStore
use_graph_expansion=True
)
# Now all graph operations (store, retrieve, build_graph) use Neo4j directly.
ContextRetriever (The Search Engine)¶
The retrieval logic that powers the `retrieve()` method. It implements the Hybrid Retrieval algorithm.
Retrieval Strategy¶
- Short-Term Check: Scans the in-memory buffer for immediate, exact-match relevance.
- Vector Search: Queries the `vector_store` for semantically similar long-term memories.
- Graph Expansion:
    - Identifies entities in the query.
    - Finds those entities in the `ContextGraph`.
    - Traverses edges to find related concepts that might not match keywords (e.g., finding "Python" when searching for "Coding").
- Hybrid Scoring (see the sketch below):
    - Formula: `Final_Score = (Vector_Score * (1 - α)) + (Graph_Score * α)`
    - Allows tuning the balance between semantic similarity and structural relevance.
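The scoring formula is easiest to see with numbers. The helper below is an illustrative restatement of the formula above, where `alpha` plays the role of the `hybrid_alpha` constructor parameter; it is not the retriever's internal code.

```python
# Illustrative hybrid score blending: alpha=0.0 is pure vector, alpha=1.0 is pure graph.
def hybrid_score(vector_score: float, graph_score: float, alpha: float = 0.5) -> float:
    return vector_score * (1 - alpha) + graph_score * alpha

# A hit scoring 0.9 on vectors and 0.4 on the graph lands at 0.65 with alpha=0.5.
print(f"{hybrid_score(0.9, 0.4, alpha=0.5):.2f}")  # 0.65
```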
Code Example¶
# The retriever is automatically used by AgentContext.retrieve()
# But can be accessed directly if needed:
retriever = context.retriever
# Perform a manual retrieval
results = retriever.retrieve(
query="web frameworks",
max_results=5
)
GraphRAG with Multi-Hop Reasoning¶
The query_with_reasoning() method extends traditional retrieval by performing multi-hop graph traversal and generating natural language responses using LLMs. This enables deeper understanding of relationships and context-aware answer generation.
How It Works¶
- Context Retrieval: Retrieves relevant context using hybrid search (vector + graph)
- Entity Extraction: Extracts entities from query and retrieved context
- Multi-Hop Reasoning: Traverses knowledge graph up to N hops to find related entities
- Reasoning Path Construction: Builds reasoning chains showing entity relationships
- LLM Response Generation: Generates natural language response grounded in graph context
Key Features¶
- Multi-Hop Reasoning: Traverses graph up to configurable hops (default: 2)
- Reasoning Trace: Shows entity relationship paths used in reasoning
- Grounded Responses: LLM generates answers citing specific graph entities
- Multiple LLM Providers: Supports Groq, OpenAI, HuggingFace, and LiteLLM (100+ LLMs)
- Fallback Handling: Returns context with reasoning path if LLM unavailable
Method Signature¶
def query_with_reasoning(
self,
query: str,
llm_provider: Any, # LLM provider from semantica.llms
max_results: int = 10,
max_hops: int = 2,
**kwargs
) -> Dict[str, Any]:
Parameters:

- `query` (str): User query
- `llm_provider`: LLM provider instance (from `semantica.llms`)
- `max_results` (int): Maximum context results to retrieve (default: 10)
- `max_hops` (int): Maximum graph traversal hops (default: 2)
- `**kwargs`: Additional retrieval options

Returns:

- `response` (str): Generated natural language answer
- `reasoning_path` (str): Multi-hop reasoning trace
- `sources` (List[Dict]): Retrieved context items used
- `confidence` (float): Overall confidence score
- `num_sources` (int): Number of sources retrieved
- `num_reasoning_paths` (int): Number of reasoning paths found
Code Example¶
from semantica.context import AgentContext
from semantica.llms import Groq
from semantica.vector_store import VectorStore
import os
# Initialize context
context = AgentContext(
vector_store=VectorStore(backend="faiss"),
knowledge_graph=kg
)
# Configure LLM provider
llm_provider = Groq(
model="llama-3.1-8b-instant",
api_key=os.getenv("GROQ_API_KEY")
)
# Query with reasoning
result = context.query_with_reasoning(
query="What IPs are associated with security alerts?",
llm_provider=llm_provider,
max_results=10,
max_hops=2
)
# Access results
print(f"Response: {result['response']}")
print(f"\nReasoning Path: {result['reasoning_path']}")
print(f"Confidence: {result['confidence']:.3f}")
Using Different LLM Providers¶
# Groq
from semantica.llms import Groq
llm = Groq(model="llama-3.1-8b-instant", api_key=os.getenv("GROQ_API_KEY"))
# OpenAI
from semantica.llms import OpenAI
llm = OpenAI(model="gpt-4", api_key=os.getenv("OPENAI_API_KEY"))
# LiteLLM (100+ providers)
from semantica.llms import LiteLLM
llm = LiteLLM(model="anthropic/claude-sonnet-4-20250514")
# Use with query_with_reasoning
result = context.query_with_reasoning(
query="Your question here",
llm_provider=llm,
max_hops=3
)
When to Use
- Complex Queries: When simple retrieval doesn't capture relationships
- Explainable AI: When you need to show reasoning paths
- Multi-Hop Questions: "What IPs are associated with alerts that affect users?"
- Grounded Responses: When you need answers citing specific graph entities
EntityLinker (The Connector)¶
Resolves text mentions to unique entities and assigns URIs.
Key Methods¶
| Method | Description |
|---|---|
link_entities(source, target, type) | Creates a link between two entities. |
assign_uri(entity_name, type) | Generates a consistent URI for an entity. |
Code Example¶
from semantica.context import EntityLinker
linker = EntityLinker(knowledge_graph=graph)
# Link two entities
linker.link_entities(
source_entity_id="Python",
target_entity_id="Programming",
link_type="IS_A",
confidence=0.95
)
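`assign_uri()` is listed in the Key Methods table but not demonstrated above; assuming the argument order shown there (`entity_name`, then `type`), a call would look like this:

```python
# Hedged sketch: generate a consistent URI for an entity (argument order assumed from the table).
uri = linker.assign_uri("Python", "Language")
```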
⚙️ Configuration¶
Environment Variables¶
YAML Configuration¶
context:
short_term_limit: 10
retrieval:
hybrid_alpha: 0.5 # 0.0=Vector, 1.0=Graph
max_expansion_hops: 2
📝 Data Structures¶
MemoryItem¶
The fundamental unit of storage.
@dataclass
class MemoryItem:
content: str # The actual text content
timestamp: datetime # When it was created
metadata: Dict # Arbitrary tags (user_id, source, etc.)
embedding: List[float] # The vector representation
entities: List[Dict] # Entities found in this content
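Items are normally created for you by `store()`, but for reference, constructing one by hand from the fields above would look roughly like the following. The import path, the 768-dimension placeholder embedding, and the entity dict keys are assumptions for illustration.

```python
# Hedged sketch: building a MemoryItem manually (store() normally does this for you).
from datetime import datetime
from semantica.context import MemoryItem  # import path assumed

item = MemoryItem(
    content="User is working on a React project.",
    timestamp=datetime.now(),
    metadata={"user_id": "user_123", "source": "chat"},
    embedding=[0.0] * 768,  # placeholder vector; real embeddings come from the vector store
    entities=[{"name": "React", "type": "Framework"}],
)
```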
Graph Node (Dict Format)¶
{
"id": "node_unique_id",
"type": "concept",
"properties": {
"content": "Description of the node",
"weight": 1.0
}
}
Graph Edge (Dict Format)¶
{
"source_id": "origin_node",
"target_id": "destination_node",
"type": "related_to",
"weight": 0.8
}
🧩 Advanced Usage¶
Method Registry (Extensibility)¶
Register custom implementations for graph building, memory management, or retrieval.
Code Example¶
from semantica.context import registry
def custom_graph_builder(entities, relationships):
# Custom logic to build graph
return "my_graph_structure"
# Register the new method
registry.register("graph", "custom_builder", custom_graph_builder)
Configuration Manager¶
Programmatically manage configuration settings.
Code Example¶
```python
from semantica.context.config import context_config

# Update configuration at runtime
context_config.set("retention_days", 60)
```
See Also¶
- Vector Store - The long-term storage backend
- Graph Store - The knowledge graph backend
- Reasoning - Uses context for logic
Cookbook¶
Interactive tutorials to learn context management and GraphRAG:
- Context Module: Practical guide to the context module for AI agents
    - Topics: Agent memory, context graph, hybrid retrieval, entity linking
    - Difficulty: Intermediate
    - Use Cases: Building stateful AI agents, persistent memory systems
- Advanced Context Engineering: Build a production-grade memory system for AI agents
    - Topics: Agent memory, GraphRAG, entity injection, lifecycle management, persistent stores
    - Difficulty: Advanced
    - Use Cases: Production agent systems, advanced memory management