Show HN: Digger Solo – Semantic File Search and Maps
The file search combines full-text search with semantic search, so you can search text and image files by their meaning / content (even if an image has no descriptive file name). You can start a search query with a tag, e.g. "#jpg cat", to search all your jpg files for cats.
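To illustrate the tag syntax, here is a minimal sketch of how a query like "#jpg cat" could be split into tag filters and free text. This is not Digger Solo's actual parser; the function name `parse_query` and the rule that any token starting with "#" is a tag are assumptions made for illustration.

```python
def parse_query(query: str) -> tuple[list[str], str]:
    """Split a query into '#tag' tokens and the remaining search text.

    Hypothetical helper: assumes every whitespace-separated token that
    starts with '#' (and has at least one more character) is a tag.
    """
    tags: list[str] = []
    words: list[str] = []
    for token in query.split():
        if token.startswith("#") and len(token) > 1:
            tags.append(token[1:].lower())  # drop the '#', normalize case
        else:
            words.append(token)
    return tags, " ".join(words)


tags, text = parse_query("#jpg cat")
print(tags, text)
```

The tags would then narrow the set of candidate files, and the remaining text would go to the full-text and semantic search.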
Files are sorted by content similarity in interactive maps that reveal hidden connections and patterns across your collection (text, image, video & audio files supported).
Tags are inferred from imported file paths and file types.
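As a rough sketch of what path-based tag inference can look like (the exact rules Digger Solo uses are not described here, so the function `infer_tags` and its behavior are assumptions): each folder name in the path becomes a tag, and the file extension becomes a file-type tag.

```python
from pathlib import PurePosixPath


def infer_tags(path: str) -> list[str]:
    """Derive tags from folder names and the file extension.

    Hypothetical rule set: every parent folder becomes a lowercase tag,
    and the extension (without the dot) is appended as a file-type tag.
    """
    p = PurePosixPath(path)
    tags = [part.lower() for part in p.parent.parts if part != "/"]
    ext = p.suffix.lstrip(".").lower()
    if ext:
        tags.append(ext)
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(tags))


print(infer_tags("photos/2023/Holiday/cat.JPG"))
```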
Technicalities:
Digger Solo is quite a complex beast: I am using PyTauri (https://github.com/pytauri/pytauri), which provides Python bindings for Tauri, so the app has a Rust backend and a JS frontend. Python is used to import files, run the dimensionality reduction algorithms (t-SNE, UMAP etc.) and for most of the database logic (using SQLite3). The CLIP model (JINA CLIP v1) runs as an ONNX model using the Rust crate ORT (https://ort.pyke.io/); inference currently supports CPU only.
The semantic search is powered by the SQLite3 extension sqlite-vec (https://github.com/asg017/sqlite-vec), which adds optimized brute-force nearest neighbor vector search (there is no approximate vector index). So Digger Solo is not meant for millions of files, more like a few hundred thousand.
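For readers unfamiliar with the term, brute-force nearest neighbor search means comparing the query embedding against every stored embedding and keeping the closest ones. The sketch below shows the idea in plain Python; it is not the sqlite-vec API, and the function names, the choice of cosine distance and the toy 2-dimensional vectors are assumptions for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_knn(query: list[float],
                    vectors: dict[str, list[float]],
                    k: int) -> list[str]:
    """Scan every stored vector and return the keys of the k closest ones."""
    scored = [(cosine_distance(query, vec), key) for key, vec in vectors.items()]
    scored.sort()  # smallest distance first
    return [key for _, key in scored[:k]]


# Toy embeddings (real CLIP embeddings have hundreds of dimensions).
embeddings = {
    "cat.jpg": [1.0, 0.0],
    "dog.jpg": [0.9, 0.1],
    "invoice.pdf": [0.0, 1.0],
}
print(brute_force_knn([1.0, 0.0], embeddings, k=2))
```

Every query touches every vector, so the cost grows linearly with the number of files. That is why this approach is comfortable for a few hundred thousand files but not for millions.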
The installer contains all model files used (text, vision and audio encoders), which is why the download size is so big. I did not want to have additional downloads after installation.
The Windows version has had problems running the Python modules reliably for some customers (I am working on a fix). So I recommend trying the free version first to see if file importing works before purchasing a key.