Show HN: Unsiloed – VLMs for Document Ingestion

1 ady9999 0 6/13/2025, 9:36:10 PM unsiloed.ai ↗
I'm excited to introduce Unsiloed Chunker, an open-source Python library designed for efficient document chunking in retrieval-augmented generation (RAG) applications.

Key Features:

Multi-threaded Processing: Speeds up chunking operations by processing multiple documents simultaneously.
Supports Multiple File Types: Handles PDF, DOCX, and PPTX formats.
Flexible Chunking Strategies: Offers fixed-size and page-based chunking methods.
Zero Dependencies: Lightweight and easy to integrate into your projects.

Installation:

pip install unsiloed-chunker

Usage Example:

from unsiloed_chunker import Chunker

chunker = Chunker(file_path="your_document.pdf")
chunks = chunker.chunk(strategy="fixed_size", chunk_size=500)
for chunk in chunks:
    print(chunk)

For more details, check out the documentation.
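
For anyone wondering what fixed-size chunking across several documents looks like conceptually, here is a minimal standalone sketch. It is not Unsiloed Chunker's actual implementation: the helper names `fixed_size_chunks` and `chunk_many` are hypothetical, the inputs are assumed to be already-extracted text strings, and `chunk_size` is assumed to count characters (the library may count something else, such as tokens).

```python
from concurrent.futures import ThreadPoolExecutor


def fixed_size_chunks(text: str, chunk_size: int = 500) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper for illustration; not part of the library's API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the text in strides of chunk_size; the last piece may be shorter.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def chunk_many(texts: list[str], chunk_size: int = 500, workers: int = 4) -> list[list[str]]:
    """Chunk several documents using a thread pool.

    Hypothetical helper; assumes each item is plain text that was
    extracted from a document beforehand.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so result i belongs to texts[i].
        return list(pool.map(lambda t: fixed_size_chunks(t, chunk_size), texts))


if __name__ == "__main__":
    docs = ["first document text " * 10, "second document text " * 5]
    for doc_chunks in chunk_many(docs, chunk_size=50):
        print(len(doc_chunks), "chunks")
```

In Python, threads mostly help when the per-document work is I/O-bound (reading and parsing files), so a real pipeline would likely put the file loading inside the worker function rather than just the string slicing shown here.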

I'd love to hear your feedback and suggestions!
