
Ontology

Automated ontology generation, validation, and management system.


🎯 Overview

  • Automated Generation


    6-stage pipeline to generate OWL ontologies from raw data

  • Inference Engine


    Infer classes, properties, and hierarchies from entity patterns

  • Evaluation


    Assess ontology quality using coverage, completeness, and granularity metrics

  • OWL/RDF Export


    Export to Turtle, RDF/XML, and JSON-LD formats

When to Use

  • Schema Design: When defining the structure of your Knowledge Graph
  • Data Modeling: To formalize domain concepts and relationships
  • Interoperability: To ensure your data follows standard semantic web practices

⚙️ Algorithms Used

6-Stage Generation Pipeline

The ontology generation process follows these stages:

  1. Semantic Network Parsing: Extract concepts and patterns from raw entity/relationship data
  2. YAML-to-Definition: Transform patterns into intermediate class definitions
  3. Definition-to-Types: Map definitions to OWL types (`owl:Class`, `owl:ObjectProperty`)
  4. Hierarchy Generation: Build taxonomy trees using transitive closure and cycle detection
  5. TTL Generation: Serialize to Turtle format using `rdflib`
  6. Validation: Check the generated ontology for consistency before export
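
Stage 4's transitive-closure and cycle-detection steps can be sketched in plain Python. The function names and the edge-list input below are illustrative assumptions, not the library's internals:

```python
# Illustrative sketch of hierarchy generation: subclass edges are
# (child, parent) pairs; names and data layout are assumptions.

def detect_cycle(edges):
    """Return True if the subclass edge list contains a cycle."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for parent in graph.get(node, ()):
            state = color.get(parent, WHITE)
            if state == GRAY:
                return True  # back edge found -> cycle
            if state == WHITE and visit(parent):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(graph))

def transitive_closure(edges):
    """All (descendant, ancestor) pairs implied by direct subclass edges."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)

    closure = set()
    for start in graph:
        stack = list(graph[start])
        while stack:
            ancestor = stack.pop()
            if (start, ancestor) not in closure:
                closure.add((start, ancestor))
                stack.extend(graph.get(ancestor, ()))
    return closure
```

Rejecting cyclic subclass chains before computing the closure keeps the resulting taxonomy a proper tree (or DAG).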

Inference Algorithms

The module uses several inference algorithms:

  • Class Inference: Clustering entities by type and attribute similarity
  • Property Inference: Determining domain/range based on connected entity types
  • Hierarchy Inference: `A is_a B` detection based on subset relationships
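
The subset-based `A is_a B` detection can be sketched as follows; the function name and the input shape are assumptions for illustration, not the module's actual API:

```python
# Hedged sketch of subset-based hierarchy inference: if every instance
# of class A is also an instance of class B, propose "A is_a B".

def infer_is_a(memberships):
    """memberships maps class name -> set of entity ids belonging to it."""
    candidates = []
    for a, instances_a in memberships.items():
        for b, instances_b in memberships.items():
            # Propose "a is_a b" when a's instances form a proper
            # subset of b's instances.
            if a != b and instances_a < instances_b:
                candidates.append((a, b))
    return candidates
```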

Ontology Ingestion

Ingest existing ontology files directly into usable data structures using OntologyIngestor.

Function: `ingest_ontology(source, method="file")`

Arguments:

  • source: File path, directory path, or list of paths
  • method: Ingestion method (default: "file")

Example:

from semantica.ontology import ingest_ontology

# Ingest file
data = ingest_ontology("ontology.ttl")

# Ingest directory
dataset = ingest_ontology("ontologies/")

Main Classes

OntologyEngine

Unified orchestration for generation, inference, validation, OWL export, and evaluation.

Methods:

  • `from_data(data, **options)`: Generate ontology from structured data
  • `from_text(text, provider=None, model=None, **options)`: LLM-based generation from text
  • `validate(ontology, **options)`: Validate ontology consistency
  • `infer_classes(entities, **options)`: Infer classes from entities
  • `infer_properties(entities, relationships, classes, **options)`: Infer properties
  • `evaluate(ontology, **options)`: Evaluate ontology quality
  • `to_owl(ontology, format="turtle")`: Export OWL/RDF serialization
  • `export_owl(ontology, path, format="turtle")`: Save OWL to file

Quick Start:

from semantica.ontology import OntologyEngine

engine = OntologyEngine(base_uri="https://example.org/ontology/")

data = {"entities": entities, "relationships": relationships}
ontology = engine.from_data(data, name="MyOntology")

turtle = engine.to_owl(ontology, format="turtle")

LLMOntologyGenerator

LLM-based ontology generation with multi-provider support (openai, groq, deepseek, huggingface_llm).

Example:

from semantica.ontology import OntologyEngine

text = "Acme Corp. hired Alice in 2024. Alice works for Acme."
engine = OntologyEngine()

ontology = engine.from_text(
    text,
    provider="deepseek",
    model="deepseek-chat",
    name="EmploymentOntology",
    base_uri="https://example.org/employment/",
)

Environment variables:

export OPENAI_API_KEY=...
export GROQ_API_KEY=...
export DEEPSEEK_API_KEY=...

OntologyGenerator

Main entry point for the generation pipeline.

Methods:

  • `generate_ontology(data)`: Run the full pipeline
  • `generate_from_schema(schema)`: Generate from an explicit schema

Example:

from semantica.ontology import OntologyGenerator

generator = OntologyGenerator(base_uri="http://example.org/onto/")
ontology = generator.generate_ontology({
    "entities": entities,
    "relationships": relationships
})
print(ontology.serialize(format="turtle"))

OntologyEvaluator

Scores ontology quality.

Methods:

  • `evaluate_ontology(ontology)`: Calculate evaluation metrics
  • `calculate_coverage(ontology, questions)`: Measure how well the ontology covers a set of competency questions
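
A coverage check in the spirit of `calculate_coverage` could look like the sketch below; the term-overlap scoring rule is an assumption for demonstration, not the library's actual formula:

```python
# Illustrative coverage metric: the fraction of competency questions
# that mention at least one class name from the ontology.

def term_coverage(class_names, questions):
    labels = {name.lower() for name in class_names}
    covered = 0
    for question in questions:
        words = {w.strip("?.,").lower() for w in question.split()}
        if labels & words:
            covered += 1
    return covered / len(questions) if questions else 0.0
```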

ReuseManager

Manages reuse of external ontologies: importing, merging, and alignment.

Methods:

  • `import_external_ontology(uri, ontology)`: Load and merge an external ontology
  • `evaluate_alignment(uri, ontology)`: Assess alignment and compatibility

OntologyIngestor

Handles ingestion of existing ontologies from files and directories.

Methods:

  • `ingest_ontology(file_path)`: Ingest a single ontology file
  • `ingest_directory(directory_path)`: Recursively ingest ontology files from a directory

Unified Engine Examples

from semantica.ontology import OntologyEngine

engine = OntologyEngine(base_uri="https://example.org/ontology/")

# Generate
ontology = engine.from_data({
    "entities": entities,
    "relationships": relationships,
})

# Validate
result = engine.validate(ontology, reasoner="hermit")
print("valid=", result.valid, "consistent=", result.consistent)

# Export
turtle = engine.to_owl(ontology, format="turtle")

Configuration

Environment Variables

export ONTOLOGY_BASE_URI="http://my-org.com/ontology/"
export ONTOLOGY_STRICT_MODE=true
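
These variables might be consumed along the following lines; the fallback values here are illustrative assumptions, not documented defaults:

```python
import os

# Sketch of reading the configuration variables above.
base_uri = os.environ.get("ONTOLOGY_BASE_URI", "http://example.org/ontology/")
strict_mode = os.environ.get("ONTOLOGY_STRICT_MODE", "false").lower() in ("1", "true", "yes")
```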

YAML Configuration

ontology:
  base_uri: "http://example.org/"
  generation:
    min_class_size: 5
    infer_hierarchy: true

Ontology Alignment

Semantica supports mapping and connecting different ontologies to unify data across systems, standards, and domains. This enables cross-system interoperability, allowing a single semantic layer to span multiple standards (e.g., internal models and industry standards).

Alignments are represented using standard RDF predicates such as `owl:equivalentClass`, `owl:equivalentProperty`, and `skos:exactMatch`.

Creating and Managing Alignments

You can create and query alignments programmatically using the OntologyEngine:

from semantica.ontology.engine import OntologyEngine
from semantica.triplet_store.triplet_store import TripletStore

# Setup the store and engine (using Blazegraph as an example)
my_triplet_store = TripletStore(backend="blazegraph")
engine = OntologyEngine(store=my_triplet_store)

# Create an alignment between an internal class and a standard schema
engine.create_alignment(
    source_uri="http://internal.org/ontology/Employee",
    target_uri="http://schema.org/Person",
    predicate="http://www.w3.org/2002/07/owl#equivalentClass"
)

# Retrieve all bidirectional alignments for a specific entity
alignments = engine.get_alignments("http://internal.org/ontology/Employee")

Automated Alignment Suggestions

When importing or merging external ontologies, the ReuseManager can automatically suggest alignments based on heuristic matching (such as identical labels with differing URIs).

from semantica.ontology.reuse_manager import ReuseManager

manager = ReuseManager()

# Merge ontologies and auto-compute alignment suggestions
merged_ontology = manager.merge_ontology_data(
    target=internal_ontology,
    source=industry_ontology,
    compute_alignments=True
)

# Suggestions are stored in merged_ontology["suggested_alignments"]
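
The label-matching heuristic itself can be sketched as a standalone function; the input and output shapes below are assumptions for illustration, not the ReuseManager's internals:

```python
# Sketch of heuristic alignment suggestion: propose an equivalence when
# two ontologies use the same label under different URIs.

EQUIVALENT_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"

def suggest_alignments(target_labels, source_labels):
    """Each argument maps a class URI -> its human-readable label."""
    by_label = {}
    for uri, label in source_labels.items():
        by_label.setdefault(label.lower(), []).append(uri)

    suggestions = []
    for uri, label in target_labels.items():
        for match in by_label.get(label.lower(), []):
            if match != uri:
                suggestions.append(
                    {"source": uri, "target": match, "predicate": EQUIVALENT_CLASS}
                )
    return suggestions
```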

To execute SPARQL queries that use these alignments to retrieve cross-ontology results, see the Triplet Store documentation on alignment-aware queries.

Integration Examples

Schema-First Knowledge Graph

from semantica.ontology import OntologyEngine
from semantica.kg import GraphBuilder, GraphValidator

# 1. Generate Ontology from Sample Data
engine = OntologyEngine()
ontology = engine.from_data(sample_data)

# 2. Extract schema for validation
schema = {
    "entity_types": [c["name"] for c in ontology["classes"]],
    "relationship_types": [p["name"] for p in ontology["properties"]]
}

# 3. Initialize Validator and Builder
validator = GraphValidator(schema=schema, strict=True)
builder = GraphBuilder()

# 4. Build Knowledge Graph
kg = builder.build(full_dataset)

# 5. Validate against Ontology Schema
validation_result = validator.validate(kg)
if validation_result.is_valid:
    print("Knowledge Graph matches the ontology schema!")
else:
    print(f"Validation issues found: {validation_result.issues}")

SKOS Vocabulary Management

Semantica supports SKOS (Simple Knowledge Organization System) vocabularies as first-class semantic assets. SKOS triples are stored in the existing RDF triplet store and queried through the OntologyEngine — no additional packages are required.

Concepts and data model

  • ConceptScheme: `skos:ConceptScheme`
  • Concept: `skos:Concept`
  • Preferred label: `skos:prefLabel`
  • Alternative label: `skos:altLabel`
  • Broader concept: `skos:broader`
  • Narrower concept: `skos:narrower`
  • Related concept: `skos:related`
  • Human-readable definition: `skos:definition`
  • Notation / code: `skos:notation`
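
The data model above can be made concrete as plain (subject, predicate, object) triples, independent of any store backend; the vocabulary URIs below are illustrative:

```python
# SKOS triples for a single concept, spelled out with full predicate IRIs.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

concept = "https://vocab.example.org/colours/red"
scheme = "https://vocab.example.org/colours"

triples = [
    (scheme, RDF_TYPE, SKOS + "ConceptScheme"),
    (concept, RDF_TYPE, SKOS + "Concept"),
    (concept, SKOS + "inScheme", scheme),
    (concept, SKOS + "prefLabel", "Red"),
    (concept, SKOS + "altLabel", "Crimson"),
    (concept, SKOS + "broader", "https://vocab.example.org/colours/warm"),
]
```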

Importing a SKOS vocabulary

Use TripletStore.add_skos_concept() to load individual concepts. The method automatically asserts the parent skos:ConceptScheme triple the first time any concept for that scheme is added.

from semantica.triplet_store import TripletStore

store = TripletStore(backend="blazegraph", endpoint="http://localhost:9999/blazegraph")

SCHEME = "https://vocab.example.org/colours"

store.add_skos_concept(
    concept_uri="https://vocab.example.org/colours/red",
    scheme_uri=SCHEME,
    pref_label="Red",
    alt_labels=["Crimson", "Rouge"],
    broader=["https://vocab.example.org/colours/warm"],
    definition="The colour at the long-wavelength end of the visible spectrum.",
    notation="RED",
)

store.add_skos_concept(
    concept_uri="https://vocab.example.org/colours/blue",
    scheme_uri=SCHEME,
    pref_label="Blue",
    alt_labels=["Azure", "Cerulean"],
)

For bulk ingestion of an existing SKOS/Turtle file use TripletStore.add_triplets() after parsing the file with rdflib:

import rdflib
from semantica.semantic_extract.triplet_extractor import Triplet

g = rdflib.Graph()
g.parse("my_vocabulary.ttl", format="turtle")

triplets = [
    Triplet(subject=str(s), predicate=str(p), object=str(o))
    for s, p, o in g
]
store.add_triplets(triplets)

Listing and searching concepts

Once a vocabulary is loaded, use OntologyEngine to browse and search it:

from semantica.ontology import OntologyEngine

engine = OntologyEngine(store=store)

# 1. List all ConceptSchemes in the store
vocabularies = engine.list_vocabularies()
# [{"uri": "https://vocab.example.org/colours", "label": "Colours"}, ...]

# 2. List every concept in a specific scheme
concepts = engine.list_concepts("https://vocab.example.org/colours")
# [{"uri": "...", "pref_label": "Red", "alt_labels": ["Crimson", "Rouge"]}, ...]

# 3. Case-insensitive substring search across prefLabel and altLabel
results = engine.search_concepts("crimson")
# [{"uri": "https://vocab.example.org/colours/red", "label": "Crimson"}]

# 4. Restrict search to one scheme
results = engine.search_concepts("azure", scheme_uri="https://vocab.example.org/colours")

Building SKOS URIs with NamespaceManager

NamespaceManager provides helpers for constructing well-formed SKOS IRIs:

from semantica.ontology import NamespaceManager

nm = NamespaceManager(base_uri="https://vocab.example.org/")

# Full SKOS predicate URI
nm.get_skos_uri("prefLabel")
# "http://www.w3.org/2004/02/skos/core#prefLabel"

# Slug-based ConceptScheme URI anchored at the base
nm.build_concept_scheme_uri("ISO 3166 Countries")
# "https://vocab.example.org/vocab/iso-3166-countries"
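
A slugging rule consistent with the example above might look like the sketch below; the exact normalization NamespaceManager applies is an assumption here:

```python
import re

# Hypothetical slugify: lowercase, collapse non-alphanumeric runs to
# hyphens, and anchor under the base URI's "vocab/" path.
def build_concept_scheme_uri(base_uri, title):
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{base_uri}vocab/{slug}"
```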

Best Practices

  1. Reuse Standard Ontologies: Don't reinvent Person or Organization; import FOAF or Schema.org using ReuseManager.
  2. Validate Early: Run validation during generation to catch logical errors before populating the graph.
  3. Use Competency Questions: Define what questions your ontology should answer and use OntologyEvaluator to verify.
  4. Version Control: Treat ontologies like code. Use VersionManager to track changes.

See Also

Cookbook

Interactive tutorials to learn ontology generation and management:

  • Ontology: Define domain schemas and ontologies to structure your data
      • Topics: OWL, RDF, schema design, ontology generation
      • Difficulty: Intermediate
      • Use Cases: Structuring domain knowledge, schema definition

  • Unstructured to Ontology: Generate ontologies automatically from unstructured data
      • Topics: Automatic ontology generation, 6-stage pipeline, OWL validation
      • Difficulty: Advanced
      • Use Cases: Domain modeling, automatic schema generation