File_Compressor

A text file compression tool based on Huffman encoding algorithm. The tool enjoys the benefit given by huffman encoding algorithm, i.e, Lossless compression. We can compress the data upto 30-40% without any loss in the data, saving 60-70% space. This comes very handy in case of cloud sharing.

UseCase

Imagine you have to send a 1GB data file, and you only have 500MB of internet data. No need to worry! Huffman encoding algorithm got you covered. After compression the file's size becomes somewhere around 300MB(not guaranteed) which can now be safely sent without any problems.

About the Algorithm

Huffman coding, an optimal prefix-free binary code, is a renowned data compression technique designed by David A. Huffman in 1952. This ingenious algorithm is based on the principle of assigning shorter codes to frequently occurring symbols in a dataset, resulting in more efficient representation and compression. Huffman encoding achieves this by constructing a binary tree, known as the Huffman tree, where each leaf node corresponds to a symbol and is assigned a binary code based on its frequency in the input data. The algorithm employs a priority queue to efficiently merge nodes with the lowest frequencies until a single root node is formed. The resulting binary codes exhibit the desirable property of being prefix-free, ensuring that no code is a prefix of another. This characteristic simplifies decoding, as it enables unambiguous identification of each symbol in the compressed data. It finds its unique applications in the real-world. Notably, in file compression algorithms such as ZIP and in network protocols where efficient data transmission is crucial.

Merits of the tool

Lossless Compression:

Huffman coding is a lossless compression algorithm, meaning that the decompressed data is identical to the original.

Widely Used:

Huffman coding is widely adopted and implemented in various applications, including file compression utilities (e.g., ZIP) and network protocols, attesting to its effectiveness.

Simple and Fast Decoding:

The decoding process in Huffman coding is straightforward and fast. The prefix-free nature of the codes ensures unambiguous decoding, making it efficient for both compression and decompression.

Demerits of the tool

Compression Overhead for Small Files:

For very small files or files with limited redundancy, the overhead introduced by the Huffman coding tree might result in a compressed file size that is larger than the original.

Complexity in Real-Time Applications:

While Huffman coding is generally efficient, its adaptive nature and tree-building process might introduce complexities in real-time applications where encoding and decoding need to be performed quickly.

Limited Compression Improvement with ASCII Text:

For ASCII text, where each character already requires 8 bits, the potential for compression improvement is limited compared to scenarios where symbol frequencies vary widely.

Getting Started

Step1 :

Clone the repository(make sure git is installed)

    git clone https://github.com/dave1725/File_compressor.git

Step2 :

Compile the project source file using g++ or any compiler Make sure MingW is installed for windows and OSX.

For LINUX For linux install g++.
```
  apt-get install g++
```

Compiling the project.

    g++ project.cpp - o project

Executing the project.

    ./project

For Windows Install MinGW from https://sourceforge.net/projects/mingw/

Compiling the project.

    g++ project.cpp -o project

Executing the project.

    ./project

Step3 :

Finally! Run the project executable file.

Authors

Krish Nariya
Aakarsh Lohani
Piyusha Mukherjee
Nivedha Sriram
Priyam Kotnala
Dave Meshak

References

License

This project is licensed under MIT license - Kindly refer the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Screenshot (61).png		Screenshot (61).png
Screenshot (62).png		Screenshot (62).png
Screenshot (63).png		Screenshot (63).png
original.txt		original.txt
project.cpp		project.cpp

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

File_Compressor

UseCase

About the Algorithm

Merits of the tool

Lossless Compression:

Widely Used:

Simple and Fast Decoding:

Demerits of the tool

Compression Overhead for Small Files:

Complexity in Real-Time Applications:

Limited Compression Improvement with ASCII Text:

Getting Started

Step1 :

Step2 :

Step3 :

Authors

References

License

About

Releases

Packages

Languages

License

dave1725/File_compressor

Folders and files

Latest commit

History

Repository files navigation

File_Compressor

UseCase

About the Algorithm

Merits of the tool

Lossless Compression:

Widely Used:

Simple and Fast Decoding:

Demerits of the tool

Compression Overhead for Small Files:

Complexity in Real-Time Applications:

Limited Compression Improvement with ASCII Text:

Getting Started

Step1 :

Step2 :

Step3 :

Authors

References

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages