There are a variety of tricks for matrix multiplication over finite fields
that usually go under the name of ``greasing''. They involve precomputing
linear combinations of rows, as well as efficient use of the cache memory
architecture.
A speedup here would benefit every computation that uses finite field matrices.
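
To make the precomputation idea concrete, here is a minimal sketch of one common form of greasing, restricted to GF(2) for simplicity. It is an illustration under stated assumptions, not the method of any particular system: rows are packed into Python integers (bit j of a row is the entry in column j), and the names `mul_naive`, `mul_greased` and the grease level `k` are hypothetical choices made for this example. For each block of `k` rows of the right-hand matrix, all 2^k XOR combinations are tabulated once; every row of the left-hand matrix then needs one table lookup per block instead of up to `k` row additions.

```python
def mul_naive(A, B, n_inner):
    """Reference product over GF(2).

    A and B are lists of bit-packed rows; B has n_inner rows.
    Row i of the result is the XOR of the rows B[j] with bit j set in A[i].
    """
    C = []
    for a in A:
        acc = 0
        for j in range(n_inner):
            if (a >> j) & 1:
                acc ^= B[j]
        C.append(acc)
    return C


def mul_greased(A, B, n_inner, k=4):
    """Same product, using precomputed tables (grease level k, an assumption)."""
    C = [0] * len(A)
    for start in range(0, n_inner, k):
        block = B[start:start + k]          # last block may be shorter than k
        mask = (1 << len(block)) - 1

        # table[t] = XOR of block[j] over all bits j set in t.
        # Each entry costs one XOR, reusing an entry computed earlier.
        table = [0] * (1 << len(block))
        for j, row in enumerate(block):
            step = 1 << j
            for t in range(step):
                table[step + t] = table[t] ^ row

        # One lookup per row of A replaces up to len(block) row additions.
        for i, a in enumerate(A):
            C[i] ^= table[(a >> start) & mask]
    return C
```

The table costs 2^k row additions per block, so the trick only pays off when the left-hand matrix has many more than 2^k / k rows; a small `k` also keeps the table inside the cache, which is where the memory-architecture considerations come in. Over larger fields the same idea applies with q^k table entries per block.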