Description
This goes slightly against what I said in #37, but something that would be quite useful
would be a bulk zero-initialisation. There are lots of zero-holes in memory maps,
whether that is because it is a zero-initialised data section, or because there is a long
run of zeros within a sparsely initialised data section.
An extra boot command that can zero initialise arbitrary segments of RAM using
a single packet would reduce the amount of message traffic needed at start-up,
particularly when we have MBs of data that is mostly zeros.
e.g. something like this:
    else if (cmd == StoreZeroCmd) {
      // Store zeros to data memory, starting at addrReg.
      // Size ***in bytes*** to transfer (saves an instruction).
      // Assumed to be a multiple of 4, with addrReg word-aligned.
      uint32_t n = msgIn->args[0];
      uint32_t addrEnd = addrReg + n;
      while (addrReg < addrEnd) {
        *(uint32_t*) addrReg = 0;  // Zero one 32-bit word
        addrReg += 4;
      }
    }
I estimate a total burden of 10-ish instructions added to the bootloader,
and it should be able to fill at about 1 word per 5-ish instructions, though presumably
it would end up being DRAM bandwidth limited.
This is assuming that:
- DRAM is not already zero-initialised: I assume it isn't?
- Bandwidth from host to boards is much less than total bandwidth to DRAMs; we've
  got one PCI Express link at ~1GB/sec, but even with Aesop we have 6 DRAMs
  which offer 12GB/s * 6 = 72 GB/s.
So for a system which is loading multi-GB sections on to DRAM this could
reduce the serial cost quite a bit.
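To put rough numbers on that, here is a back-of-envelope sketch in C. It uses the ~1 GB/s host link and 72 GB/s aggregate DRAM figures quoted above; the function name `loadSeconds`, the purely serial cost model, and the zero fraction passed in are all assumptions for illustration, and it ignores per-packet overhead for the zero-fill commands themselves.

```c
#include <assert.h>

/* Rough estimate of load time in seconds (hypothetical model, not measured).
 *   totalGB      : size of the image to load, in GB
 *   zeroFraction : fraction of the image that is zeros (0..1), an assumption
 *   hostGBps     : host-to-board link bandwidth in GB/s
 *   dramGBps     : aggregate DRAM bandwidth in GB/s
 *   useZeroCmd   : non-zero if zero runs are filled locally on the boards
 */
double loadSeconds(double totalGB, double zeroFraction,
                   double hostGBps, double dramGBps, int useZeroCmd) {
  assert(zeroFraction >= 0.0 && zeroFraction <= 1.0);
  assert(hostGBps > 0.0 && dramGBps > 0.0);
  if (!useZeroCmd) {
    /* Every byte, zero or not, crosses the host link. */
    return totalGB / hostGBps;
  }
  /* Only non-zero data crosses the host link; zeros are written at DRAM speed. */
  double dataTime = totalGB * (1.0 - zeroFraction) / hostGBps;
  double zeroTime = totalGB * zeroFraction / dramGBps;
  return dataTime + zeroTime;
}
```

Under this model, a hypothetical 4 GB image that is 75% zeros would take about 4 s over the host link alone, versus roughly 1.04 s if the zeros are filled on the boards.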
Note that I'm aware that a lot can already be done to support faster loading,
e.g. using multiple threads per DRAM to load, and packing multiple words
into each packet. However, a memset-style boot command would be easy to integrate
into the existing hostlink loaders without adding much complexity, and would also
make more sophisticated loaders faster.
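As a sketch of what the host side might look like, the loader could split an image into data segments and zero segments before sending anything. Everything here is hypothetical: `Segment`, `SegKind`, `planSegments` and the `minRun` threshold are illustrative names, not part of the existing hostlink code, and the actual sending of packets is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment kinds: SEG_DATA would be sent with the existing
 * store commands, SEG_ZERO with a single zero-fill command. */
typedef enum { SEG_DATA, SEG_ZERO } SegKind;

typedef struct {
  SegKind kind;
  size_t start;  /* Index of the first word in the segment */
  size_t len;    /* Length of the segment in words */
} Segment;

/* Split an image of n words into data and zero segments. Zero runs shorter
 * than minRun words stay inside data segments, since a separate command
 * would not pay for itself. Writes at most maxOut segments to out and
 * returns the total number of segments found. */
size_t planSegments(const uint32_t *words, size_t n, size_t minRun,
                    Segment *out, size_t maxOut) {
  assert(minRun > 0);
  size_t count = 0;
  size_t dataStart = 0;  /* Start of the pending data segment */
  size_t i = 0;
  while (i < n) {
    /* Measure the run of zeros starting at i (possibly empty). */
    size_t j = i;
    while (j < n && words[j] == 0) j++;
    if (j - i >= minRun) {
      /* Flush any pending data, then emit the zero segment. */
      if (i > dataStart) {
        if (count < maxOut) out[count] = (Segment){SEG_DATA, dataStart, i - dataStart};
        count++;
      }
      if (count < maxOut) out[count] = (Segment){SEG_ZERO, i, j - i};
      count++;
      dataStart = j;
      i = j;
    } else {
      /* Short zero run or a non-zero word: keep it in the data segment. */
      i = (j > i) ? j : i + 1;
    }
  }
  /* Flush trailing data, if any. */
  if (n > dataStart) {
    if (count < maxOut) out[count] = (Segment){SEG_DATA, dataStart, n - dataStart};
    count++;
  }
  return count;
}
```

A loader would then walk the returned segments, sending data segments as it does today and one zero-fill command per zero segment. Choosing `minRun` is a judgment call that depends on how many words fit in a packet.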
Flagrantly not using the PEP system I literally only just proposed because I don't
have time right now - this is more a reminder to turn this into one if it makes sense.