Here's a little text compression algorithm I made just to learn more about how they work. It uses an LLM and Huffman encoding, with the codes '0xx' mapped to a fallback on the LLM's prediction of what should come next (one of its top four predictions). It's not the best ever, but it's pretty OK: it beats Google's Brotli on end size by 10-20% in all the samples I've tried, though it is far more expensive because of the LLM. End size reaches about 30-40% of the original, depending on the text, on the size and quality of the LLM used, and on how well suited that model is to the text. Speed is quite bad, and both sides need the same chunky model to encode and decode, but it's still interesting enough and has been fun to work on.
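To make the '0xx' idea concrete, here is a minimal Python sketch. It is not the code in this repository, and several parts are assumptions made for illustration: it works on characters rather than LLM tokens, `BigramPredictor` is a hypothetical stand-in for the LLM (any deterministic predictor shared by both sides would do), every Huffman code is prefixed with `1` so that `0xx` stays free, and the Huffman table is built from the message itself, so a real format would have to transmit it or fix it in advance.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count


def build_huffman(freqs):
    """Return {symbol: bitstring}. Every code starts with '1' so '0xx' stays free."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return {s: "1" + c for s, c in heap[0][2].items()}


class BigramPredictor:
    """Hypothetical stand-in for the LLM: ranks next characters by bigram counts."""

    def __init__(self, training_text):
        self.table = defaultdict(Counter)
        for prev, nxt in zip(training_text, training_text[1:]):
            self.table[prev][nxt] += 1

    def top4(self, context):
        if not context:
            return []  # nothing to condition on yet
        ranked = self.table.get(context[-1], Counter()).most_common(4)
        return [ch for ch, _ in ranked]


def encode(text, predictor, codes):
    bits = []
    for i, ch in enumerate(text):
        guesses = predictor.top4(text[:i])
        if ch in guesses:
            # Fallback code: '0' followed by the 2-bit rank of the prediction.
            bits.append("0" + format(guesses.index(ch), "02b"))
        else:
            bits.append(codes[ch])  # literal Huffman code, starts with '1'
    return "".join(bits)


def decode(bits, predictor, codes):
    by_code = {c: s for s, c in codes.items()}
    out, pos = [], 0
    while pos < len(bits):
        if bits[pos] == "0":
            idx = int(bits[pos + 1:pos + 3], 2)
            out.append(predictor.top4("".join(out))[idx])
            pos += 3
        else:
            end = pos + 1
            while bits[pos:end] not in by_code:
                if end >= len(bits):
                    raise ValueError("truncated or corrupt bitstream")
                end += 1
            out.append(by_code[bits[pos:end]])
            pos = end
    return "".join(out)


# Both sides must hold the same predictor, just as both need the same LLM.
reference = "the quick brown fox jumps over the lazy dog. " * 3
message = "the lazy fox jumps over the quick dog."
predictor = BigramPredictor(reference)
codes = build_huffman(Counter(message))
bits = encode(message, predictor, codes)
assert decode(bits, predictor, codes) == message
print(len(bits), "bits vs", 8 * len(message), "bits uncompressed")
```

The decoder can only follow a `0xx` code because it replays the same predictions from the text it has already recovered, which is why both sides need an identical model. A stronger predictor puts the true next symbol in its top four more often, so more symbols cost three bits instead of a full Huffman code.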
TallTony-dev/Compression