some reorganization, especially in MatrixStorage; start playing with loop unrolling, always_inline, and __restrict__