Show HN: FSP2 Tested on excerpt "Romeo and Juliet" impressive compresion results

2 Forgret 4 9/11/2025, 11:36:23 AM
Hi HN, I want to share my updated FSP (Find Similar Patterns) v2 text compression algorithm. I tested it on a non-trivial excerpt from Romeo and Juliet, and it achieved impressive results: original size 437 bytes, compressed size 358 bytes, compression ratio 1.22. Unlike traditional methods like LZMA or Huffman, FSP v2 searches for repeating 3–5 character patterns, storing references (REF) alongside literal characters (LITERAL). This allows it to compress real-world text, maintain lossless decompression, and achieve compressed sizes smaller than the original. The algorithm works on any byte stream or text and can scale to larger files, potentially outperforming classical compression on texts with repetitive or near-repetitive patterns. Code and implementation details are available upon request.

GitHub: https://github.com/Ferki-git-creator/fsp

Website(more info): https://ferki-git-creator.github.io/fsp/

If I made a mistake somewhere, please tell me.

Comments (4)

southwindcg · 2h ago
Before we say too much about the performance of your algorithm, encoding/decoding speed and memory use must be considered, especially with very large inputs.

Note that on this particular small sample of text, Zstandard `zstd -13` compresses it to 288 bytes, and with default settings, 292 bytes. Brotli using default settings compresses it to 236 bytes.

Forgret · 2h ago
Thanks, you’re absolutely right — performance needs to be tested on large inputs with proper speed and memory profiling. I’ll run FSP on bigger datasets and compare it directly with zstd, brotli, gzip, etc. If needed, I’ll improve the algorithm to reduce overhead and make it scale better. This was just an early proof-of-concept, but I agree the next step is serious benchmarking.
southwindcg · 2h ago
You've got some stiff competition out there, with companies like Google and Facebook funding development of these algorithms. I think the days of individual coders surpassing the current state of the art are gone. It's in the interest of companies that move huge amounts of data to be as efficient at it as possible.

It's definitely interesting that your method competes with Zip and such though. Keep it up!

Forgret · 2h ago
Thanks for the inspiration! I realize it’s probably crazy to think my algorithm could ever become universally needed, but I’m not giving up. Even if it doesn’t turn into something as large-scale as I once dreamed, I believe it can still lead to something useful — and the journey itself is worth it.