Frequently Asked Questions¶
Common questions about Semantica and how to use it.
General¶
What is Semantica?¶
Semantica is an open-source framework for building knowledge graphs from unstructured data. It transforms documents, web pages, and databases into structured, queryable knowledge.
What can I do with Semantica?¶
- Build knowledge graphs from documents and data
- Extract entities and relationships automatically
- Power AI applications with structured knowledge
- Create semantic search and GraphRAG systems
- Integrate multiple data sources into unified graphs
Is Semantica free?¶
Yes! Semantica is open source under the MIT License.
What makes Semantica different?¶
- Modular architecture - Use only what you need
- Production-ready - Built for scale and reliability
- Extensible - Add custom models and components
- Open source - Transparent and community-driven
Installation¶
How do I install Semantica?¶
What Python version do I need?¶
Python 3.8 or higher. Python 3.11+ is recommended.
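To confirm that an interpreter meets this requirement before installing, a check along these lines can help. It is a minimal sketch using only the standard library; the helper name `check_python_version` is illustrative and is not part of Semantica.

```python
import sys

MINIMUM = (3, 8)       # lowest supported version, per this FAQ
RECOMMENDED = (3, 11)  # recommended version, per this FAQ


def check_python_version(version=None):
    """Classify a (major, minor, ...) tuple as 'unsupported', 'supported', or 'recommended'."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if major_minor < MINIMUM:
        return "unsupported"
    if major_minor < RECOMMENDED:
        return "supported"
    return "recommended"


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current}: {check_python_version()}")
```

Run it with the same interpreter you plan to install Semantica into, since a machine can have several Python versions side by side.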
What are the system requirements?¶
- Python 3.8+
- 4GB+ RAM for basic use
- Optional GPU for embeddings and ML models
Getting Started¶
How do I start using Semantica?¶
from semantica.semantic_extract import NERExtractor
from semantica.kg import GraphBuilder
# Extract entities
ner = NERExtractor()
entities = ner.extract("Apple Inc. was founded by Steve Jobs.")
# Build knowledge graph
kg = GraphBuilder().build({"entities": entities})
Where can I find examples?¶
- Getting Started Guide - Quick introduction
- Cookbook - Practical examples
- GitHub Examples - Code samples
Features¶
What data sources does Semantica support?¶
- Files: PDF, DOCX, TXT, JSON, CSV
- Web: Websites, RSS feeds, APIs
- Databases: PostgreSQL, MySQL, Snowflake, MongoDB
- Streams: Kafka, RabbitMQ, real-time data
Can I use custom models?¶
Yes! Semantica supports custom:
- Entity extraction models
- Embedding models
- Language models
- Custom processors
Does Semantica support GPUs?¶
Yes, Semantica automatically uses GPUs when available for:
- Embedding generation
- ML model inference
- Vector operations
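This FAQ does not describe how Semantica detects a GPU, so the sketch below is an independent check rather than Semantica's own logic. It assumes an NVIDIA card and relies on the `nvidia-smi` utility that ships with the NVIDIA driver; the function name `nvidia_gpu_names` is hypothetical. An empty result only means no NVIDIA GPU is visible to the driver tools from this environment.

```python
import shutil
import subprocess


def nvidia_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none can be found."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools are not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # the tool exists but could not query a device
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = nvidia_gpu_names()
    print("GPUs found:", ", ".join(gpus) if gpus else "none")
```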
Technical¶
How does Semantica handle large datasets?¶
- Batching - Process data in chunks
- Streaming - Handle real-time data
- Parallel processing - Use multiple cores
- Memory management - Efficient resource usage
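The batching idea above can be sketched in plain Python. This is a generic illustration, not Semantica's internal implementation: `iter_batches` and `process_in_batches` are hypothetical helpers, and `handle_batch` stands in for whatever extraction step you run on each chunk.

```python
from itertools import islice


def iter_batches(items, batch_size):
    """Yield successive lists of at most batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_in_batches(documents, handle_batch, batch_size=32):
    """Apply handle_batch to each chunk and collect the results in input order."""
    results = []
    for batch in iter_batches(documents, batch_size):
        results.extend(handle_batch(batch))
    return results


if __name__ == "__main__":
    docs = ["first document", "second document", "third document"]
    # Stand-in for a real extraction step: measure each document's length.
    print(process_in_batches(docs, lambda batch: [len(d) for d in batch], batch_size=2))
```

Because `iter_batches` consumes its input lazily, it also works with generators and file iterators, so only one chunk needs to be held in memory at a time.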
Can I deploy Semantica in production?¶
Yes! Semantica is production-ready with:
- Scalable architecture
- Error handling
- Monitoring support
- Container deployment
How do I customize Semantica?¶
- Custom processors - Add new extraction logic
- Custom models - Use your own ML models
- Plugins - Extend functionality
- Configuration - Adjust behavior
Troubleshooting¶
Installation issues¶
- Python version: Ensure Python 3.8+
- Dependencies: Install with pip install -e .[dev]
- Permissions: Use virtual environments
Performance issues¶
- Memory: Increase available RAM
- GPU: Install CUDA for GPU acceleration
- Batching: Use smaller chunk sizes
Common errors¶
- Import errors: Check that Semantica is installed in the Python environment you are running
- Model loading: Verify model availability
- Memory errors: Reduce batch sizes
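For import errors, it often helps to see which interpreter is running and where, if anywhere, it would load the package from. The sketch below uses only the standard library; `locate_package` is a hypothetical helper, and the import name `semantica` is taken from the example earlier on this page.

```python
import importlib.util
import sys


def locate_package(name):
    """Return where a top-level package would be imported from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # origin is None for namespace packages, which have no single file.
    return spec.origin or "(namespace package)"


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("semantica:", locate_package("semantica") or "not importable from this environment")
```

If the interpreter path is not the one inside your virtual environment, the package was most likely installed into a different environment.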
Support¶
Where can I get help?¶
- GitHub Issues - Report problems
- Discussions - Ask questions
- Documentation - Browse guides and references
How do I report bugs?¶
- Search existing issues first
- Create a new issue with details
- Include reproduction steps
- Add environment information
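The environment information for a bug report can be gathered with a short script such as the following. It is a sketch using only the standard library; the function name `collect_environment` is illustrative, and the distribution name `semantica` is an assumption (a development install may be registered under a different name, in which case the version shows as "not installed").

```python
import platform
from importlib import metadata


def collect_environment(distribution="semantica"):
    """Gather basic version and platform details to paste into a bug report."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return {
        "package": distribution,
        "package_version": version,
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```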
Can I contribute?¶
Yes! See the Contributing Guide for details on how to help improve Semantica.