Currently, CellBench stores every result for each step before moving on to the next step. Memory use could be reduced by running each pipeline through to its final result before starting the next, so that less intermediate data has to be held in memory at once.
The difficulty lies in computing each required intermediate result only once and releasing it as soon as it is no longer needed. This corresponds to a depth-first traversal of the tree of pipelines, and it is not obvious how to implement it elegantly.
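
As a rough illustration of the depth-first idea, here is a minimal Python sketch. It is not CellBench's R API. The function `run_depth_first`, the representation of each step as a dictionary of named methods, and the toy arithmetic methods are all assumptions made for the example. Each intermediate result is computed once per tree node and is referenced only while its subtree is being explored, so at most one intermediate per step is alive at any time.

```python
from typing import Any, Callable, Dict, Iterator, Sequence, Tuple

# Hypothetical representation: one dict per step, mapping method name -> function.
Step = Dict[str, Callable[[Any], Any]]


def run_depth_first(
    data: Any, steps: Sequence[Step], path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], Any]]:
    """Yield (method path, final result) for every pipeline, depth-first."""
    if not steps:
        # Leaf of the tree: every step has been applied along this path.
        yield path, data
        return
    first, rest = steps[0], steps[1:]
    for name, method in first.items():
        # Computed exactly once, then shared by all pipelines below this node.
        intermediate = method(data)
        yield from run_depth_first(intermediate, rest, path + (name,))
        # Drop this frame's reference once the whole subtree has finished.
        del intermediate


if __name__ == "__main__":
    # Toy methods standing in for real analysis steps.
    steps = [
        {"add1": lambda x: x + 1, "add2": lambda x: x + 2},
        {"double": lambda x: x * 2, "square": lambda x: x * x},
    ]
    for path, result in run_depth_first(1, steps):
        print(path, result)
```

Because the function is a generator, final results can be consumed or summarised one at a time rather than collected in a list. The trade-off is that the results arrive in traversal order instead of as one table per step, which is part of what makes an elegant integration non-obvious.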