Glossary
A reference for the terms and concepts used in Semantica.
A
Agent : An autonomous AI system that can perceive its environment, reason about information, and take actions to achieve specific goals. In Semantica, agents use knowledge graphs for memory and reasoning.
API (Application Programming Interface) : A set of functions and protocols that allow different software applications to communicate with each other.
Axiom : A statement or rule that is accepted as true without proof, used in ontologies to define logical constraints and relationships.
C
Centrality : A measure of the importance or influence of a node in a graph. Common centrality metrics include PageRank, betweenness centrality, and closeness centrality.
Class : In ontologies, a category or type of entity (e.g., Person, Organization, Location).
Community Detection : The process of identifying groups or clusters of densely connected nodes in a graph.
Conflict Resolution : The process of handling contradictory information from multiple sources in a knowledge graph.
Coreference Resolution : The task of determining when two or more expressions in text refer to the same entity (e.g., "Apple" and "the company" referring to Apple Inc.).
Cypher : A declarative query language for property graph databases, originally developed for Neo4j.
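To make the Centrality entry above concrete, here is a minimal sketch of degree centrality, the simplest centrality metric (it is not one of the metrics named in the entry, and it is not Semantica's implementation). The function name and the sample edges are invented for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalized degree centrality for an undirected edge list.

    Each score is the fraction of the other nodes a node is connected to.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical sample graph: three people linked to one organization.
edges = [("Alice", "Acme"), ("Bob", "Acme"), ("Carol", "Acme"), ("Alice", "Bob")]
scores = degree_centrality(edges)
most_central = max(scores, key=scores.get)
```

In this sample, `Acme` is connected to every other node, so it receives the maximum score of 1.0.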
E
Embedding : A dense vector representation of text, images, or other data that captures semantic meaning in a continuous vector space. Used for similarity search and semantic matching.
Entity : A distinct object or concept in the real world, such as a person, place, organization, or event.
Entity Resolution : The process of determining when two entity mentions refer to the same real-world entity, also known as record linkage or deduplication. It is closely related to entity linking, which maps a mention to an entry in a knowledge base.
Event Detection : The task of identifying and classifying events (e.g., acquisitions, partnerships, announcements) in text.
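The Embedding entry above mentions similarity search. A common way to compare two embeddings is cosine similarity, sketched below with the standard library only. The three-dimensional vectors are invented toy values (real embeddings typically have hundreds of dimensions), and `most_similar` is a hypothetical helper, not a Semantica function.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_u * norm_v)


# Toy embeddings, invented for illustration only.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(query, table):
    """Return the key whose vector is closest to the query's vector."""
    candidates = (key for key in table if key != query)
    return max(candidates, key=lambda key: cosine_similarity(table[query], table[key]))


nearest = most_similar("cat", embeddings)
```

With these toy values, `kitten` is the nearest neighbor of `cat`, while `invoice` points in a nearly orthogonal direction.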
G
Graph : A data structure consisting of nodes (vertices) and edges (relationships) connecting them.
GraphRAG (Graph-Augmented Retrieval-Augmented Generation) : A RAG approach that combines vector search with knowledge graph traversal, so that the context passed to an LLM includes explicitly related entities and facts in addition to textually similar passages.
H
Hybrid Search : A search strategy that combines multiple retrieval methods, typically vector search and keyword search, to improve accuracy.
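One common way to combine the ranked results of a vector search and a keyword search is reciprocal rank fusion (RRF). The sketch below shows that one technique; the glossary does not say which fusion method Semantica uses, and the document IDs are invented. The constant `k=60` is the value conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from two retrievers.
vector_hits = ["doc_b", "doc_a", "doc_c"]
keyword_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Here `doc_a` and `doc_b` lead the fused list because both retrievers returned them, whereas `doc_c` and `doc_d` each appear in only one list.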
I
Inference : The process of deriving new facts or conclusions from existing knowledge using logical rules.
Ingestion : The process of loading data from various sources (files, databases, APIs, streams) into a system for processing.
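As an illustration of the Inference entry above, the following sketch applies a single logical rule (transitivity of one predicate) repeatedly until no new facts appear. The facts, the predicate names, and the function `infer_transitive` are assumptions made for this example; a real reasoner supports far richer rules.

```python
def infer_transitive(facts, predicate):
    """Derive new facts by applying transitivity to one predicate.

    Rule: (a, p, b) and (b, p, c) implies (a, p, c).
    Returns only the facts that were not in the input.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for a, p1, b in list(known):
            if p1 != predicate:
                continue
            for c, p2, d in list(known):
                if p2 == predicate and b == c and (a, predicate, d) not in known:
                    known.add((a, predicate, d))
                    changed = True
    return known - set(facts)


# Hypothetical input facts as (subject, predicate, object) tuples.
facts = [
    ("Cupertino", "located_in", "California"),
    ("California", "located_in", "USA"),
    ("Apple_Inc", "headquartered_in", "Cupertino"),
]

derived = infer_transitive(facts, "located_in")
```

The only derived fact is that Cupertino is located in the USA; the `headquartered_in` fact is untouched because the rule covers a single predicate.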
K
Knowledge Graph (KG) : A structured representation of knowledge using entities (nodes) and relationships (edges). KGs enable reasoning, querying, and semantic analysis of data.
Knowledge Graph Analytics : The application of graph algorithms (e.g., centrality, community detection) to gain insights from the structure of a knowledge graph.
L
LLM (Large Language Model) : A type of artificial intelligence model trained on vast amounts of text data, capable of understanding and generating human-like text.
N
Named Entity Recognition (NER) : The process of identifying and classifying named entities in text into predefined categories such as persons, organizations, locations, dates, and more.
Node : A vertex in a graph representing an entity or concept.
Normalization : The process of standardizing data into a consistent format (e.g., converting dates to ISO format, standardizing entity names).
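The two examples in the Normalization entry, converting dates to ISO format and standardizing entity names, can be sketched as follows. The list of accepted date formats and both function names are assumptions for illustration. Note that `05/03/2024` is read day-first here, a choice a real pipeline would need to make explicitly.

```python
from datetime import datetime

# Candidate input formats, tried in order (an illustrative, incomplete list).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y", "%d %b %Y")


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    cleaned = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_name(name):
    """Collapse runs of whitespace and fold case for comparison."""
    return " ".join(name.split()).casefold()


iso_dates = [normalize_date(s) for s in ("March 5, 2024", "05/03/2024", "2024-03-05")]
same_name = normalize_name("  Apple   Inc. ") == normalize_name("apple inc.")
```

All three date strings normalize to the same ISO value, and the two spellings of the company name compare equal after normalization.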
O
OCR (Optical Character Recognition) : Technology that converts images of text (e.g., scanned documents, photos) into machine-readable text.
Ontology : A formal specification of concepts, relationships, and constraints in a domain, typically expressed in OWL (Web Ontology Language).
OWL (Web Ontology Language) : A W3C standard language for defining and instantiating ontologies on the web.
P
PageRank : An algorithm used to measure the importance of nodes in a graph based on the structure of incoming links.
Pipeline : A sequence of data processing steps that transform raw data into a desired output format.
Property : In ontologies, a relationship or attribute that connects entities or describes their characteristics.
Provenance : Information about the origin, history, and lineage of data, including sources, timestamps, and transformations.
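The PageRank entry above can be illustrated with a short power-iteration sketch. This is a simplified textbook version, not Semantica's implementation; the damping factor of 0.85 is the commonly used default, and the four-node link graph is invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each node to the list of nodes it links to.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three of the four nodes point at C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
top_node = max(ranks, key=ranks.get)
```

In this graph `C` ends up with the highest rank because it has the most incoming links, and `D` has the lowest because nothing links to it.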
R
RAG (Retrieval-Augmented Generation) : A technique that enhances LLM responses by retrieving relevant information from a knowledge base before generating an answer.
RDF (Resource Description Framework) : A W3C standard for representing information about resources in the form of subject-predicate-object triplets.
Reasoning : The process of deriving new knowledge from existing facts using logical rules and inference.
Relationship Extraction : The task of identifying and extracting semantic relationships between entities in text.
S
Semantic : Relating to meaning in language or logic.
Semantic Layer : An abstraction layer that provides a unified, business-friendly view of data by adding context, relationships, and meaning to raw data.
Semantic Network : A knowledge representation that uses a graph structure to represent concepts and their relationships.
SPARQL : The W3C standard query language for RDF data. It plays a role analogous to that of SQL for relational databases.
T
Temporal Graph : A knowledge graph that tracks changes over time, allowing queries about the state of the graph at specific time points.
Triplet : The basic unit of knowledge in RDF (called a "triple" in the W3C specifications), consisting of a subject, predicate, and object (e.g., <Apple_Inc> <founded_by> <Steve_Jobs>).
Triplet Store : A database designed specifically for storing and querying RDF triplets.
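To show how the Triplet and Triplet Store entries fit together, here is a minimal in-memory sketch that stores subject-predicate-object tuples and answers pattern queries in which `None` acts as a wildcard. The `TripletStore` class and its methods are hypothetical, written only for this glossary; production stores add indexing, persistence, and SPARQL support.

```python
class TripletStore:
    """A toy in-memory store for (subject, predicate, object) triplets."""

    def __init__(self):
        self._triplets = set()

    def add(self, subject, predicate, obj):
        self._triplets.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triplets matching the pattern; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            triplet
            for triplet in self._triplets
            if all(want is None or want == got for want, got in zip(pattern, triplet))
        )


store = TripletStore()
store.add("Apple_Inc", "founded_by", "Steve_Jobs")
store.add("Apple_Inc", "founded_by", "Steve_Wozniak")
store.add("Apple_Inc", "headquartered_in", "Cupertino")

founders = [obj for _, _, obj in store.match(subject="Apple_Inc", predicate="founded_by")]
```

The query fixes the subject and predicate and leaves the object open, which is the same idea as a SPARQL triple pattern with one variable.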
V
Vector : A mathematical representation of data as an array of numbers, used in embeddings to capture semantic meaning.
Vector Store : A database optimized for storing and searching high-dimensional vectors, used for semantic similarity search.
Visualization : The graphical representation of data, such as knowledge graphs, embeddings, or analytics.
W
Web Scraping : The automated process of extracting data from websites.
See Also
- Core Concepts - Deep dive into fundamental concepts
- Getting Started - Begin your journey with Semantica
- API Reference - Technical documentation