Ontology
Automated ontology generation, validation, and management system.
🎯 Overview
- Automated Generation: 6-stage pipeline to generate OWL ontologies from raw data
- Inference Engine: Infer classes, properties, and hierarchies from entity patterns
- Evaluation: Assess ontology quality using coverage, completeness, and granularity metrics
- OWL/RDF Export: Export to Turtle, RDF/XML, and JSON-LD formats
When to Use
- Schema Design: When defining the structure of your Knowledge Graph
- Data Modeling: To formalize domain concepts and relationships
- Interoperability: To ensure your data follows standard semantic web practices
⚙️ Algorithms Used
6-Stage Generation Pipeline
The ontology generation process follows these stages:
- Semantic Network Parsing: Extract concepts and patterns from raw entity/relationship data
- YAML-to-Definition: Transform patterns into intermediate class definitions
- Definition-to-Types: Map definitions to OWL types (`owl:Class`, `owl:ObjectProperty`)
- Hierarchy Generation: Build taxonomy trees using transitive closure and cycle detection
- TTL Generation: Serialize to Turtle format using `rdflib`
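The hierarchy stage can be pictured with a small standalone sketch. It is not Semantica's implementation: the function names `find_cycle` and `transitive_closure` and the `(child, parent)` edge-list input are assumptions made for illustration. The sketch rejects cyclic subclass edges first, then computes every direct and indirect superclass of each class.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one cycle as a list of class names, or None if there is none."""
    parents = defaultdict(list)
    for child, parent in edges:
        parents[child].append(parent)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = defaultdict(int)

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for parent in parents[node]:
            if colour[parent] == GREY:  # back edge: parent is already on the path
                return path[path.index(parent):] + [parent]
            if colour[parent] == WHITE:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(parents):
        if colour[node] == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


def transitive_closure(edges):
    """Map each class to the set of all its direct and indirect superclasses."""
    cycle = find_cycle(edges)
    if cycle:
        raise ValueError("cyclic hierarchy: " + " -> ".join(cycle))

    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    closure = {}

    def ancestors(node):
        # Memoised walk up the hierarchy; safe because cycles were rejected above.
        if node not in closure:
            result = set()
            for parent in parents.get(node, ()):
                result.add(parent)
                result |= ancestors(parent)
            closure[node] = result
        return closure[node]

    for child, parent in edges:
        ancestors(child)
        ancestors(parent)
    return closure


# Hypothetical subclass edges, written as (child, parent)
edges = [("Manager", "Employee"), ("Employee", "Person"), ("Person", "Agent")]
closure = transitive_closure(edges)
print(sorted(closure["Manager"]))  # ['Agent', 'Employee', 'Person']
```

Checking for cycles before computing the closure keeps the recursive ancestor walk finite and gives a readable error that names the offending classes.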
Inference Algorithms
The module uses several inference algorithms:
- Class Inference: Clustering entities by type and attribute similarity
- Property Inference: Determining domain/range based on connected entity types
- Hierarchy Inference: `A is_a B` detection based on subset relationships
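Two of the ideas above, property inference and hierarchy inference, can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the module's code: the helper names `infer_domain_range` and `infer_is_a`, and the dictionary keys `id`, `type`, `source`, and `target`, are hypothetical.

```python
from collections import defaultdict


def infer_domain_range(entities, relationships):
    """Collect, per relationship type, the entity types seen at each end."""
    type_of = {e["id"]: e["type"] for e in entities}
    signatures = defaultdict(lambda: {"domain": set(), "range": set()})
    for rel in relationships:
        signatures[rel["type"]]["domain"].add(type_of[rel["source"]])
        signatures[rel["type"]]["range"].add(type_of[rel["target"]])
    return dict(signatures)


def infer_is_a(members):
    """Propose (A, B) meaning `A is_a B` when A's instances are a proper subset of B's."""
    pairs = []
    for a, a_members in members.items():
        for b, b_members in members.items():
            if a != b and a_members < b_members:  # `<` on sets is proper subset
                pairs.append((a, b))
    return sorted(pairs)


# Hypothetical sample data
entities = [
    {"id": "alice", "type": "Person"},
    {"id": "bob", "type": "Person"},
    {"id": "acme", "type": "Organization"},
]
relationships = [
    {"source": "alice", "target": "acme", "type": "works_for"},
    {"source": "bob", "target": "acme", "type": "works_for"},
]
signatures = infer_domain_range(entities, relationships)
print(signatures["works_for"])

# Instances observed for each candidate class
members = {
    "Person": {"alice", "bob", "carol"},
    "Employee": {"alice", "bob"},
    "Manager": {"alice"},
}
print(infer_is_a(members))
```

The subset test also proposes indirect pairs such as `("Manager", "Person")`; a real pipeline would presumably reduce these to direct parent links before building the taxonomy.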
Main Classes
OntologyEngine
Unified orchestration for generation, inference, validation, OWL export, and evaluation.
Methods:
| Method | Description |
|---|---|
| `from_data(data, **options)` | Generate ontology from structured data |
| `from_text(text, provider=None, model=None, **options)` | LLM-based generation from text |
| `validate(ontology, **options)` | Validate ontology consistency |
| `infer_classes(entities, **options)` | Infer classes from entities |
| `infer_properties(entities, relationships, classes, **options)` | Infer properties |
| `evaluate(ontology, **options)` | Evaluate ontology quality |
| `to_owl(ontology, format="turtle")` | Export OWL/RDF serialization |
| `export_owl(ontology, path, format="turtle")` | Save OWL to file |
Quick Start:

    from semantica.ontology import OntologyEngine

    engine = OntologyEngine(base_uri="https://example.org/ontology/")

    # `entities` and `relationships` are collections you have already extracted
    data = {"entities": entities, "relationships": relationships}
    ontology = engine.from_data(data, name="MyOntology")

    # Serialize the generated ontology to a Turtle string
    turtle = engine.to_owl(ontology, format="turtle")
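For readers new to OWL, it may help to see the kind of statements a Turtle export contains: class declarations, subclass links, and object properties with a domain and range. The sketch below builds such a document with plain string formatting rather than `rdflib`, so that it runs without dependencies. It is not the output of `to_owl`; the helper `to_turtle` and its arguments are hypothetical.

```python
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def to_turtle(base_uri, classes, object_properties):
    """Render classes and object properties as a small Turtle document.

    `classes` maps a class name to its parent name (or None);
    `object_properties` maps a property name to a (domain, range) pair.
    """
    lines = [f"@prefix : <{base_uri}> ."]
    lines += [f"@prefix {prefix}: <{uri}> ." for prefix, uri in PREFIXES.items()]
    lines.append("")
    for name, parent in classes.items():
        statement = f":{name} a owl:Class"
        if parent:
            statement += f" ;\n    rdfs:subClassOf :{parent}"
        lines.append(statement + " .")
    for name, (domain, range_) in object_properties.items():
        lines.append(
            f":{name} a owl:ObjectProperty ;\n"
            f"    rdfs:domain :{domain} ;\n"
            f"    rdfs:range :{range_} ."
        )
    return "\n".join(lines) + "\n"


# Hypothetical ontology content
turtle = to_turtle(
    "https://example.org/ontology/",
    {"Person": None, "Organization": None, "Employee": "Person"},
    {"worksFor": ("Employee", "Organization")},
)
print(turtle)
```

Hand-built strings are fine for illustration, but a library serializer also handles escaping, literals, and blank nodes, which this sketch ignores.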
LLMOntologyGenerator
LLM-based ontology generation with multi-provider support (`openai`, `groq`, `deepseek`, `huggingface_llm`).
Example:

    from semantica.ontology import OntologyEngine

    text = "Acme Corp. hired Alice in 2024. Alice works for Acme."

    engine = OntologyEngine()

    # provider and model choose which LLM performs the generation
    ontology = engine.from_text(
        text,
        provider="deepseek",
        model="deepseek-chat",
        name="EmploymentOntology",
        base_uri="https://example.org/employment/",
    )
Environment variables:
OntologyGenerator
Main entry point for the generation pipeline.
Methods:
| Method | Description |
|---|---|
| `generate_ontology(data)` | Run full pipeline |
| `generate_from_schema(schema)` | Generate from explicit schema |
Example:

    from semantica.ontology import OntologyGenerator

    generator = OntologyGenerator(base_uri="http://example.org/onto/")

    # Run the full pipeline on previously extracted entities and relationships
    ontology = generator.generate_ontology({
        "entities": entities,
        "relationships": relationships,
    })

    print(ontology.serialize(format="turtle"))
OntologyEvaluator
Scores ontology quality.
Methods:
| Method | Description |
|---|---|
| `evaluate_ontology(ontology)` | Calculate evaluation metrics |
| `calculate_coverage(ontology, questions)` | Verify coverage |
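One simple way to think about coverage is as the share of competency questions whose required terms all exist in the ontology. The sketch below illustrates that idea only; it is an assumption about what a coverage metric could look like, not the logic of `calculate_coverage`, and the function `coverage` and its input shapes are hypothetical.

```python
def coverage(ontology_terms, questions):
    """Return (score, uncovered) for a set of competency questions.

    `ontology_terms` is an iterable of class and property names.
    `questions` maps each question to the set of terms needed to answer it.
    """
    known = {term.lower() for term in ontology_terms}
    uncovered = {}
    for question, needed in questions.items():
        missing = {term for term in needed if term.lower() not in known}
        if missing:
            uncovered[question] = missing
    if not questions:
        return 1.0, uncovered
    score = (len(questions) - len(uncovered)) / len(questions)
    return score, uncovered


# Hypothetical ontology vocabulary and competency questions
terms = {"Person", "Organization", "worksFor"}
questions = {
    "Who works for a given organization?": {"Person", "Organization", "worksFor"},
    "When was a person hired?": {"Person", "hireDate"},
}
score, uncovered = coverage(terms, questions)
print(score)      # 0.5
print(uncovered)  # {'When was a person hired?': {'hireDate'}}
```

Returning the missing terms alongside the score makes the result actionable: each uncovered question points directly at the class or property that would need to be added.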
ReuseManager
Manages reuse of external ontologies: importing and merging them, and assessing their alignment and compatibility.
Methods:
| Method | Description |
|---|---|
| `import_external_ontology(uri, ontology)` | Load and merge external ontology |
| `evaluate_alignment(uri, ontology)` | Assess alignment and compatibility |
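As a rough intuition for alignment, one could compare the class labels of a local ontology with those of an external vocabulary. The sketch below computes a Jaccard overlap on case-insensitive labels. It is a deliberately naive stand-in, not what `evaluate_alignment` does; the function `label_overlap` is hypothetical, and the external labels are a hand-picked subset of FOAF class names.

```python
def label_overlap(local_labels, external_labels):
    """Compare two sets of class labels, ignoring case."""
    local = {label.lower() for label in local_labels}
    external = {label.lower() for label in external_labels}
    shared = local & external
    union = local | external
    score = len(shared) / len(union) if union else 0.0
    return {"shared": sorted(shared), "jaccard": score}


local_classes = {"Person", "Organization", "Project"}
foaf_classes = {"Person", "Organization", "Agent", "Document"}  # subset of FOAF
report = label_overlap(local_classes, foaf_classes)
print(report["shared"])   # ['organization', 'person']
print(report["jaccard"])  # 0.4
```

Label matching misses synonyms and differently named but equivalent concepts, so a score like this is best read as a quick first signal rather than a verdict on compatibility.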
Unified Engine Examples

    from semantica.ontology import OntologyEngine

    engine = OntologyEngine(base_uri="https://example.org/ontology/")

    # Generate
    ontology = engine.from_data({
        "entities": entities,
        "relationships": relationships,
    })

    # Validate
    result = engine.validate(ontology, reasoner="hermit")
    print("valid=", result.valid, "consistent=", result.consistent)

    # Export
    turtle = engine.to_owl(ontology, format="turtle")
Configuration
Environment Variables
YAML Configuration
Integration Examples
Schema-First Knowledge Graph

    from semantica.ontology import OntologyEngine
    from semantica.kg import KnowledgeGraph

    # 1. Generate the ontology from sample data
    engine = OntologyEngine()
    ontology = engine.from_data(sample_data)

    # 2. Initialize the KG with the ontology as its schema
    kg = KnowledgeGraph(schema=ontology)

    # 3. Add data (validated against the ontology)
    kg.add_entities(full_dataset)  # raises an error if the data violates the schema
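To make "validated against the ontology" concrete, the sketch below checks entities and relationships against a set of allowed classes and property signatures. It illustrates the general idea only and says nothing about how `KnowledgeGraph` validates internally; `check_against_schema` and all data shapes are hypothetical.

```python
def check_against_schema(entities, relationships, classes, properties):
    """Return a list of human-readable schema violations (empty if none).

    `classes` is a set of allowed class names; `properties` maps a
    relationship type to its expected (domain, range) pair.
    """
    errors = []
    type_of = {}
    for entity in entities:
        type_of[entity["id"]] = entity["type"]
        if entity["type"] not in classes:
            errors.append(f"{entity['id']}: unknown class {entity['type']!r}")
    for rel in relationships:
        signature = properties.get(rel["type"])
        if signature is None:
            errors.append(f"{rel['type']}: unknown property")
            continue
        domain, range_ = signature
        if type_of.get(rel["source"]) != domain:
            errors.append(f"{rel['type']}: source {rel['source']} is not a {domain}")
        if type_of.get(rel["target"]) != range_:
            errors.append(f"{rel['type']}: target {rel['target']} is not a {range_}")
    return errors


# Hypothetical schema and data
classes = {"Person", "Organization"}
properties = {"worksFor": ("Person", "Organization")}
entities = [
    {"id": "alice", "type": "Person"},
    {"id": "acme", "type": "Organization"},
    {"id": "x1", "type": "Spaceship"},
]
relationships = [
    {"source": "alice", "target": "acme", "type": "worksFor"},
    {"source": "acme", "target": "alice", "type": "worksFor"},  # reversed ends
]
errors = check_against_schema(entities, relationships, classes, properties)
for error in errors:
    print(error)
```

Collecting all violations before failing, rather than stopping at the first one, tends to be friendlier when loading a large dataset.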
Best Practices
- Reuse Standard Ontologies: Don't reinvent `Person` or `Organization`; import FOAF or Schema.org using `ReuseManager`.
- Validate Early: Run validation during generation to catch logical errors before populating the graph.
- Use Competency Questions: Define what questions your ontology should answer and use `OntologyEvaluator` to verify.
- Version Control: Treat ontologies like code. Use `VersionManager` to track changes.
See Also
- Knowledge Graph Module - The instance data following the ontology
- Reasoning Module - Uses the ontology for inference
- Visualization Module - Visualizing the class hierarchy
Cookbook
Interactive tutorials to learn ontology generation and management:
- Ontology: Define domain schemas and ontologies to structure your data
  - Topics: OWL, RDF, schema design, ontology generation
  - Difficulty: Intermediate
  - Use Cases: Structuring domain knowledge, schema definition
- Unstructured to Ontology: Generate ontologies automatically from unstructured data
  - Topics: Automatic ontology generation, 6-stage pipeline, OWL validation
  - Difficulty: Advanced
  - Use Cases: Domain modeling, automatic schema generation