Contiguous Memory #2

@ericomeehan

This issue was brought to my attention by @camel-cdr.

Hashes and Raw Bytes

When blocks are transferred over the network, the first test a node runs on a foreign block is the network-standard hashing algorithm, which validates the block's integrity. If the resulting hash satisfies the network-prescribed proof-of-work difficulty, we can begin treating the object as a block. However, machines may differ in how they lay out the same data structure in memory. As a result, the raw bytes that one node sends may not fit into the same structure on the recipient's architecture because of a concept called padding. The hash generated from those raw bytes would still satisfy the difficulty requirement, but the data would not cast correctly into the expected data type.

Padding

The CPU does not read memory by the bit, nor by the byte, but by the word. The size of a word depends on the architecture, but generally a 32-bit CPU has a 4-byte word while a 64-bit CPU has an 8-byte word. When the compiler defines the memory layout of a struct, it places each member at an offset that satisfies that member's alignment requirement, which is set by the target platform. The following struct, for example, typically occupies 8 bytes on a 32-bit machine with a 4-byte int, despite holding only 6 bytes of data:

struct example 
{
    char a;
    char b;
    int c;
};

The problem is that the int member requires four bytes of memory, and it is cheaper for the CPU to read that member when it sits inside a single word rather than straddling two. The structure is therefore padded so that the field begins on a word boundary, where the CPU can fetch it in a single access. So instead of storing the struct like this:

Byte:  |  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |
Word:  |        word 0         |        word 1         |
Data:  |char |char |          int          |   empty   |

The structure is padded so that the int falls at the beginning of a word:

Byte:  |  0  |  1  |  2  |  3  |  4  |  5  |  6  |  7  |
Word:  |        word 0         |        word 1         |
Data:  |char |char |   empty   |          int          |

Data Integrity

While the padded layout saves CPU cycles at a small cost in memory, it makes it difficult to maintain the integrity of data handled by multiple nodes. Fortunately, we may override this behavior and require that all blocks on the network be created, stored, and transferred in an unpadded layout:

struct BlockHeaders
{
    unsigned char previous_hash[64];
    unsigned long nonce;
    unsigned char fingerprint[64];
    unsigned char timestamp[20];
    unsigned long size;
} __attribute__((packed));

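The packed attribute, a GCC and Clang extension, tells the compiler to lay out this struct without padding, so every member sits immediately after the previous one. This removes padding as a source of mismatch when raw bytes are sent between nodes. Packing alone does not make the layout identical everywhere: the width of unsigned long (4 bytes on some platforms, 8 on others) and the byte order of the integer fields can still differ between architectures, so nodes must also agree on those before the received bytes can be cast safely.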

Labels: enhancement
