Show HN: Unsiloed – VLMs for Document Ingestion
1 ady9999 0 6/13/2025, 9:36:10 PM unsiloed.ai ↗
I'm excited to introduce Unsiloed Chunker, an open-source Python library designed for efficient document chunking in retrieval-augmented generation (RAG) applications.
Key Features:
Multi-threaded Processing: Speeds up chunking operations by processing multiple documents simultaneously.
Supports Multiple File Types: Handles PDF, DOCX, and PPTX formats.
Flexible Chunking Strategies: Offers fixed-size and page-based chunking methods.
Zero Dependencies: Lightweight and easy to integrate into your projects.
Installation:
    pip install unsiloed-chunker
Usage Example:
    from unsiloed_chunker import Chunker

    chunker = Chunker(file_path="your_document.pdf")
    chunks = chunker.chunk(strategy="fixed_size", chunk_size=500)
    for chunk in chunks:
        print(chunk)

For more details, check out the documentation.
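For anyone unfamiliar with the terms, here is a rough sketch of what fixed-size chunking across several documents can look like in plain Python. This is not Unsiloed's actual implementation: the names `fixed_size_chunks` and `chunk_many` are hypothetical, the sketch works on already-extracted text strings rather than PDF, DOCX, or PPTX files, and it assumes `chunk_size` is a character count.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def fixed_size_chunks(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper for illustration; chunk_size is assumed to be
    measured in characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def chunk_many(
    texts: dict[str, str], chunk_size: int = 500, max_workers: int = 4
) -> dict[str, list[str]]:
    """Chunk several documents concurrently, one task per document.

    Hypothetical helper: `texts` maps a document name to its extracted text.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one chunking task per document, keyed by document name.
        futures = {
            name: pool.submit(fixed_size_chunks, text, chunk_size)
            for name, text in texts.items()
        }
        # Collect results; .result() re-raises any exception from a worker.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    docs = {"a.txt": "x" * 1200, "b.txt": "y" * 300}
    for name, chunks in chunk_many(docs).items():
        print(name, [len(chunk) for chunk in chunks])
```

In CPython, threads mostly help when the per-document work is I/O-bound, such as reading and parsing files; the pure string slicing shown here would not get faster from threading on its own.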
I'd love to hear your feedback and suggestions!