Skip to content

Streaming Compression

bprail edited this page Feb 9, 2015 · 5 revisions

As much as 40% of the bytes in an event trace can be 0s. A much smaller percentage are 0xff. There may be a simple compression scheme to reduce the 0s to single bits and increase the size of non-zero fields. This is trying to reduce the bytes being written to disk, so that more memory is being reused in instrumentation. Reuse lowers the footprint and can increase cache hit rates.

At 40%, a simple compression of 0s reduces the trace by 28%.

The following code performs this simple compression at ~350MB/s.

for (uint64_t i = 0; i < fsz; i++)
{
    v = fc[i];
    buf <<=1;
    bits++;
    if (v == 0)
    {
        
        
    }
    else
    {
        buf |= 0x1;
        buf <<= 8;
        buf |= v;
        bits += 8;
    }
    
    if (bits > 8)
    {
        v = (buf >> (bits - 8));
        bits -= 8;
        
        wCount ++;
    }
}

Using this simple compression scheme slows execution.

fluidanimate 3.57x -> 5.96x
flushtime 5.5s -> 40s (due to not using file cache)
file is 83% of uncompressed size

Clone this wiki locally