Frequently Asked Questions¶

Common questions about Semantica and how to use it.

General¶

What is Semantica?¶

Semantica is an open-source framework for building knowledge graphs from unstructured data. It transforms documents, web pages, and databases into structured, queryable knowledge.

What can I do with Semantica?¶

Build knowledge graphs from documents and data
Extract entities and relationships automatically
Power AI applications with structured knowledge
Create semantic search and GraphRAG systems
Integrate multiple data sources into unified graphs

Is Semantica free?¶

Yes! Semantica is open source under the MIT License.

What makes Semantica different?¶

Modular architecture - Use only what you need
Production-ready - Built for scale and reliability
Extensible - Add custom models and components
Open source - Transparent and community-driven

Installation¶

How do I install Semantica?¶

pip install semantica

What Python version do I need?¶

Python 3.8 or higher. Python 3.11+ is recommended.

What are the system requirements?¶

Python 3.8+
4GB+ RAM for basic use
Optional GPU for embeddings and ML models

Getting Started¶

How do I start using Semantica?¶

from semantica.semantic_extract import NERExtractor
from semantica.kg import GraphBuilder

# Extract entities
ner = NERExtractor()
entities = ner.extract("Apple Inc. was founded by Steve Jobs.")

# Build knowledge graph
kg = GraphBuilder().build({"entities": entities})

Where can I find examples?¶

Getting Started Guide - Quick introduction
Cookbook - Practical examples
GitHub Examples - Code samples

Features¶

What data sources does Semantica support?¶

Files: PDF, DOCX, TXT, JSON, CSV
Web: Websites, RSS feeds, APIs
Databases: PostgreSQL, MySQL, Snowflake, MongoDB
Streams: Kafka, RabbitMQ, real-time data

Can I use custom models?¶

Yes! Semantica supports custom: - Entity extraction models - Embedding models - Language models - Custom processors

Does Semantica support GPUs?¶

Yes, Semantica automatically uses GPUs when available for: - Embedding generation - ML model inference - Vector operations

Technical¶

How does Semantica handle large datasets?¶

Batching - Process data in chunks
Streaming - Handle real-time data
Parallel processing - Use multiple cores
Memory management - Efficient resource usage

Can I deploy Semantica in production?¶

Yes! Semantica is production-ready with: - Scalable architecture - Error handling - Monitoring support - Container deployment

How do I customize Semantica?¶

Custom processors - Add new extraction logic
Custom models - Use your own ML models
Plugins - Extend functionality
Configuration - Adjust behavior

Troubleshooting¶

Installation issues¶

Python version: Ensure Python 3.8+
Dependencies: Install with pip install -e .[dev]
Permissions: Use virtual environments

Performance issues¶

Memory: Increase available RAM
GPU: Install CUDA for GPU acceleration
Batching: Use smaller chunk sizes

Common errors¶

Import errors: Check installation path
Model loading: Verify model availability
Memory errors: Reduce batch sizes

Support¶

Where can I get help?¶

GitHub Issues - Report problems
Discussions - Ask questions
Documentation - Browse guides and references

How do I report bugs?¶

Search existing issues first
Create a new issue with details
Include reproduction steps
Add environment information

Can I contribute?¶

Yes! See the Contributing Guide for details on how to help improve Semantica.