Binary Module
The binary module provides custom functions for saving and loading partitions, CLVs and trees. Additionally, you can save and load any other type of data in custom blocks. Binary files can be defined as sequential or random access. Sequential access files are the better choice when, for example, you know beforehand which data you are persisting and in which order.
Binary files contain the following:
- A main header
- Optionally, a map with the block offsets for random access
- A list of blocks, each one with a header
The main header contains the number of blocks stored and the offset to reach the first block (0 for sequential, and the size of the map for random access).
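As a rough illustration of the main header described above, the following sketch models it with hypothetical types and computes the offset to the first block. The names `sketch_header_t`, `sketch_map_entry_t`, `sketch_first_block_offset` and the `SKETCH_ACCESS_*` constants are assumptions made for this example. They are not the library's actual `pll_binary_header_t` layout or constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PLL_BINARY_ACCESS_SEQUENTIAL / _RANDOM. */
enum { SKETCH_ACCESS_SEQUENTIAL = 0, SKETCH_ACCESS_RANDOM = 1 };

/* Hypothetical map entry: one (block_id, offset) pair per block. */
typedef struct
{
  int block_id;
  long block_offset;
} sketch_map_entry_t;

/* Hypothetical main header (field names are assumptions). */
typedef struct
{
  unsigned int n_blocks;    /* number of blocks currently stored */
  unsigned int max_blocks;  /* map capacity, used for random access only */
  int access_type;          /* sequential or random */
  size_t first_block;       /* offset from the end of the header */
} sketch_header_t;

/* Offset from the end of the main header to the first block:
   0 for sequential files, the size of the map for random access. */
size_t sketch_first_block_offset(int access_type, unsigned int max_blocks)
{
  assert(access_type == SKETCH_ACCESS_SEQUENTIAL ||
         access_type == SKETCH_ACCESS_RANDOM);
  if (access_type == SKETCH_ACCESS_RANDOM)
    return (size_t)max_blocks * sizeof(sketch_map_entry_t);
  return 0;
}
```

In this sketch the map has a fixed capacity, which is why a random access file needs the maximum number of blocks up front.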
The block header contains the following:
- type: Type of block (PLL_BINARY_BLOCK_{PARTITION | CLV | TREE | CUSTOM})
- block_id: Custom block identifier. It is also the index for finding the offset in the random access map. For instance, the client code can define block_id=1..10 for partitions, 11..20 for trees, etc.
- attributes: Special attributes for the block. For example PLL_BINARY_ATTRIB_PARTITION_DUMP_CLV and PLL_BINARY_ATTRIB_PARTITION_DUMP_WGT determine whether CLVs and weights are saved together with the partition data.
- alignment: Memory alignment of the saved data. This is useful to check whether the memory alignment when saving and loading the data is the same or not, and proceed accordingly. A mismatch in the memory alignment becomes a problem when the number of states is not a multiple of the alignment size.
- block_len: Length of the data stored in the block.
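To see why an alignment mismatch matters only for some state counts, consider a sketch in which each per-site vector of `states` doubles is padded up to a whole number of alignment-sized groups. The helpers `sketch_states_padded` and `sketch_layout_compatible` are hypothetical, and the padding rule is an assumption about how aligned buffers are commonly laid out, not a description of the library's internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `states` up to a multiple of the number of
   doubles that fit in `alignment` bytes. */
unsigned int sketch_states_padded(unsigned int states, size_t alignment)
{
  assert(alignment >= sizeof(double) && alignment % sizeof(double) == 0);
  unsigned int per_group = (unsigned int)(alignment / sizeof(double));
  return (states + per_group - 1) / per_group * per_group;
}

/* Returns 1 if data saved with `saved_align` has the same stride as
   buffers using `load_align`, 0 if entries would have to be re-strided. */
int sketch_layout_compatible(unsigned int states,
                             size_t saved_align,
                             size_t load_align)
{
  return sketch_states_padded(states, saved_align) ==
         sketch_states_padded(states, load_align);
}
```

With 8-byte doubles, 4 states occupy 4 entries under both 16-byte and 32-byte alignment, so the layouts match. With 5 states the padded sizes are 6 and 8, so data saved under one alignment cannot be copied verbatim into buffers that use the other.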
To save data, we first need to create the binary file and write the main header:
PLL_EXPORT FILE * pll_binary_create(const char * filename,
                                    pll_binary_header_t * header,
                                    int access_type,
                                    int n_blocks);
- header: Output parameter filled with the initial header information. In general, users do not need to care about it after calling this function.
- access_type: PLL_BINARY_ACCESS_SEQUENTIAL or PLL_BINARY_ACCESS_RANDOM
- n_blocks: If access_type is PLL_BINARY_ACCESS_RANDOM, we need to know or estimate a maximum number of blocks that will be stored in the file such that the space for the map can be allocated.
Returns a file pointer to an open file (must be closed manually after storing the data).
Next, we can store whatever we want in the binary file with pll_binary_*_dump functions. For example, saving custom memory chunks is the most generic case:
PLL_EXPORT int pll_binary_custom_dump(FILE * bin_file,
                                      int block_id,
                                      void * data,
                                      size_t size,
                                      unsigned int attributes);
- bin_file: The file pointer returned by the previous function.
- block_id: A custom block id. If access is random, it should be unique.
- data: The memory block to be saved.
- size: The size of data.
- attributes: Custom attributes for the function. For random access, it should contain at least PLL_BINARY_ATTRIB_UPDATE_MAP such that the map is updated accordingly.
Save operations proceed as follows:
- Update the header (increasing n_blocks by one).
- Update the map if necessary, adding a new entry with block_id and the corresponding offset.
- Create and append the block header.
- Append the block data.
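The four steps above can be sketched with plain `stdio` calls on a toy random access file. Everything here is hypothetical: the `sketch_*` types, the fixed map capacity `SKETCH_MAX_BLOCKS`, and the on-disk layout `[header][map][blocks...]` are assumptions chosen to keep the example small, and they do not reproduce the library's real file format.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAX_BLOCKS 4 /* hypothetical fixed map capacity */

typedef struct { unsigned int n_blocks; } sketch_header_t;
typedef struct { int block_id; long offset; } sketch_map_entry_t;
typedef struct { int block_id; size_t block_len; } sketch_block_header_t;

/* Writes an empty header and an empty map. Returns 1 on success. */
int sketch_create(FILE * f)
{
  sketch_header_t hdr = { 0 };
  sketch_map_entry_t map[SKETCH_MAX_BLOCKS] = { { 0, 0 } };
  return fwrite(&hdr, sizeof hdr, 1, f) == 1 &&
         fwrite(map, sizeof map[0], SKETCH_MAX_BLOCKS, f) == SKETCH_MAX_BLOCKS;
}

/* Appends one block to a file opened for update. Returns 1 on success. */
int sketch_dump(FILE * f, int block_id, const void * data, size_t size)
{
  sketch_header_t hdr;
  assert(f && data);

  /* the new block starts at the current end of the file */
  if (fseek(f, 0, SEEK_END) != 0) return 0;
  long block_offset = ftell(f);
  if (block_offset < 0) return 0;

  /* 1. update the main header (n_blocks + 1) */
  if (fseek(f, 0, SEEK_SET) != 0) return 0;
  if (fread(&hdr, sizeof hdr, 1, f) != 1) return 0;
  if (hdr.n_blocks >= SKETCH_MAX_BLOCKS) return 0; /* map is full */
  unsigned int slot = hdr.n_blocks++;
  if (fseek(f, 0, SEEK_SET) != 0) return 0;
  if (fwrite(&hdr, sizeof hdr, 1, f) != 1) return 0;

  /* 2. update the map with (block_id, offset) */
  sketch_map_entry_t entry = { block_id, block_offset };
  long map_pos = (long)(sizeof hdr + slot * sizeof entry);
  if (fseek(f, map_pos, SEEK_SET) != 0) return 0;
  if (fwrite(&entry, sizeof entry, 1, f) != 1) return 0;

  /* 3. append the block header */
  sketch_block_header_t bh = { block_id, size };
  if (fseek(f, block_offset, SEEK_SET) != 0) return 0;
  if (fwrite(&bh, sizeof bh, 1, f) != 1) return 0;

  /* 4. append the block data */
  return fwrite(data, 1, size, f) == size;
}
```

The file must be opened in a read/write binary mode such as `"w+b"`, since the sketch re-reads the header before rewriting it. For a sequential file, step 2 would simply be skipped.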
To load data, we first need to open the binary file for reading and get the header:
PLL_EXPORT FILE * pll_binary_open(const char * filename,
                                  pll_binary_header_t * header);
- header: Output parameter with the header information contained in the file.
Returns a file pointer to an open file (must be closed manually after reading the data).
Optionally, if the file access is random, we can read the map. That way we know which blocks are stored, their IDs and the corresponding offsets. Arbitrary blocks can still be accessed by ID without reading the map first, but each read operation then has to look up the offset at the beginning of the file before actually reading the block.
PLL_EXPORT pll_block_map_t * pll_binary_get_map(FILE * bin_file,
                                                unsigned int * n_blocks);
- n_blocks: Output parameter with the number of blocks contained in the map.
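Once the map is in memory, finding a block is a matter of searching it for the wanted ID. The sketch below shows that lookup on a hypothetical entry type. `sketch_map_entry_t`, its field names and `sketch_find_offset` are assumptions for illustration and may not match the fields of the real `pll_block_map_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical map entry (field names are assumptions). */
typedef struct
{
  int block_id;
  long block_offset;
} sketch_map_entry_t;

/* Linear search for `block_id` in a map of `n_blocks` entries.
   Returns the stored offset, or -1 if the ID is not in the map. */
long sketch_find_offset(const sketch_map_entry_t * map,
                        unsigned int n_blocks,
                        int block_id)
{
  assert(map != NULL || n_blocks == 0);
  for (unsigned int i = 0; i < n_blocks; ++i)
    if (map[i].block_id == block_id)
      return map[i].block_offset;
  return -1;
}
```

Keeping the map around lets the caller pass the offset directly to the load functions instead of having every read search the file for it again.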
Next, we can read the blocks sequentially or, if the file was defined as random access and we know the ID, access any block directly with the pll_binary_*_load functions. For example, loading custom memory chunks is the most generic case:
PLL_EXPORT void * pll_binary_custom_load(FILE * bin_file,
                                         int block_id,
                                         size_t * size,
                                         unsigned int * type,
                                         unsigned int * attributes,
                                         long int offset);
- block_id: The block identifier, if we know it.
- size: Output parameter with the size of the memory chunk read.
- type: Output parameter with the type of the block read.
- attributes: Output parameter with the attributes of the block read.
- offset: Can be a positive integer if we already know the offset, 0 for reading the next block sequentially, or PLL_BINARY_ACCESS_SEEK for retrieving it from the beginning of the file.
Returns a pointer to the data read from the binary file.
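The three meanings of the `offset` argument can be summarized as a small decision function. This is only a sketch: `SKETCH_ACCESS_SEEK` is a hypothetical stand-in whose value is an assumption (the real value of `PLL_BINARY_ACCESS_SEEK` is not given here), and `sketch_read_mode` is not part of the library.

```c
#include <assert.h>

/* Hypothetical sentinel standing in for PLL_BINARY_ACCESS_SEEK;
   the value -1 is an assumption made for this sketch. */
#define SKETCH_ACCESS_SEEK (-1L)

typedef enum
{
  SKETCH_READ_NEXT,   /* read the next block sequentially */
  SKETCH_READ_AT,     /* jump straight to a known offset */
  SKETCH_READ_LOOKUP  /* search the map in the file for block_id */
} sketch_read_mode_t;

/* Classifies the `offset` argument of a load call. */
sketch_read_mode_t sketch_read_mode(long offset)
{
  if (offset == SKETCH_ACCESS_SEEK)
    return SKETCH_READ_LOOKUP;
  if (offset == 0)
    return SKETCH_READ_NEXT;
  assert(offset > 0); /* any other negative value is invalid here */
  return SKETCH_READ_AT;
}
```

Only the lookup mode depends on `block_id`. In the other two modes the position is already determined by the file cursor or by the explicit offset.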