The current implementation supports only sequential, single-threaded execution. The overall design must be reworked before the code can be parallelized.