1. Introduction
2. Methodology
3. Menu
This project is a terminal application written in C++. Its goal is to use data structures to implement a data compression method based on Huffman coding. Huffman coding uses the occurrence probabilities of the symbols in the data to be compressed to assign a variable-length code to each symbol, so that more frequent symbols receive shorter codes.
The implementation uses trees, queues, lists and hash tables. A priority queue handles the reordering of the forest of trees, which is kept in a list of trees. The project steps are listed below.
1. Count the occurrences of each word in the file;
3. Build the tree with the rules presented by Huffman;
4. Replace words with binary encoding;
5. Save the file in binary format;
6. Observe and discuss space gain or loss.
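
The counting, tree-building and code-assignment steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `Node`, `countWords` and `buildCodes` are hypothetical, the standard library's `std::priority_queue` stands in for the project's own priority queue, and raw counts are used as weights rather than normalized frequencies.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node layout for this sketch. Children are indices into a
// vector of nodes, and -1 marks "no child" (a leaf).
struct Node {
    std::string word;   // only meaningful for leaves
    std::size_t freq;   // weight of the subtree
    int left;
    int right;
};

// Count how often each whitespace-separated word occurs in the text.
std::map<std::string, std::size_t> countWords(const std::string& text) {
    std::map<std::string, std::size_t> freq;
    std::istringstream in(text);
    std::string word;
    while (in >> word) ++freq[word];
    return freq;
}

// Build the Huffman tree over words and return one bit string per word.
std::map<std::string, std::string> buildCodes(
        const std::map<std::string, std::size_t>& freq) {
    std::map<std::string, std::string> codes;
    if (freq.empty()) return codes;

    std::vector<Node> nodes;
    typedef std::pair<std::size_t, int> Item;  // (weight, node index)
    // Min-heap: the two lightest trees of the forest come out first.
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > forest;
    for (const auto& kv : freq) {
        nodes.push_back(Node{kv.first, kv.second, -1, -1});
        forest.push(Item(kv.second, static_cast<int>(nodes.size()) - 1));
    }

    // Repeatedly merge the two lightest trees until one tree remains.
    while (forest.size() > 1) {
        Item a = forest.top(); forest.pop();
        Item b = forest.top(); forest.pop();
        nodes.push_back(Node{std::string(), a.first + b.first, a.second, b.second});
        forest.push(Item(a.first + b.first, static_cast<int>(nodes.size()) - 1));
    }
    assert(forest.size() == 1);

    // Walk the tree: a left edge appends '0', a right edge appends '1'.
    std::vector<std::pair<int, std::string> > pending;
    pending.push_back(std::make_pair(forest.top().second, std::string()));
    while (!pending.empty()) {
        std::pair<int, std::string> cur = pending.back();
        pending.pop_back();
        const Node& n = nodes[cur.first];
        if (n.left < 0) {
            // A text with a single distinct word still needs a one-bit code.
            codes[n.word] = cur.second.empty() ? std::string("0") : cur.second;
        } else {
            pending.push_back(std::make_pair(n.left, cur.second + "0"));
            pending.push_back(std::make_pair(n.right, cur.second + "1"));
        }
    }
    return codes;
}
```

For the text `a a a b b c`, this sketch gives the most frequent word `a` a one-bit code and gives `b` and `c` two-bit codes.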
When executing the program, the following options will appear:
| Option | Function |
|---|---|
| 1 | Prints the list of words from the text and their normalized frequencies. |
| 2 | Prints the Huffman tree (in list form) with its associated binary values. |
| 3 | Creates the binary file test.dat in the /src directory. |
| 9 | Ends the program. |
Expected results from executing the options:
Analysis of the compression result showed an increase in file size. This behavior, contrary to what was expected, arises because standard Huffman coding, unlike the variant proposed here, uses characters as the keys for building the tree and assigning binary values. The number of distinct characters is much smaller than the number of possible words in a text. With words as keys, the binary codes grow long (there are many distinct words, and no code may be a prefix of another), to the point that the file size increases.
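
When measuring the space gain or loss, it helps to be explicit about how many bytes an encoded bit string occupies on disk. The sketch below packs a string of `'0'`/`'1'` characters into bytes, eight code bits per byte. The function name `packBits` is hypothetical, and how the project itself writes test.dat is not described here, so this is an illustration under that assumption only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack a string of '0'/'1' characters into bytes, most significant bit
// first. The last byte is padded with zero bits if needed.
std::vector<std::uint8_t> packBits(const std::string& bits) {
    std::vector<std::uint8_t> out((bits.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < bits.size(); ++i) {
        assert(bits[i] == '0' || bits[i] == '1');
        if (bits[i] == '1') {
            out[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return out;
}
```

Packed this way, nine code bits occupy two bytes, whereas storing each bit as its own character would occupy nine. A complete comparison would also have to count whatever table the decoder needs to map codes back to words.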
| Command | Function |
|---|---|
| make clean | Deletes the output of the previous compilation from the build folder |
| make | Compiles the program with g++ and places the result in the build folder |
| make run | Runs the compiled program from the build folder |





