# e.g. ddot42 performs $C_{ij} = A_{ijkl} B_{lk}$ for the entire grid ...
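# A minimal sketch of the contraction described above, assuming NumPy and
# assuming the grid axes trail the tensor indices, i.e. A4 has shape
# (d, d, d, d, *grid) and B2 has shape (d, d, *grid). The name
# `ddot42_sketch` and this array layout are illustrative assumptions,
# not taken from the original code.
import numpy as np


def ddot42_sketch(A4, B2):
    """Return C_ij = A_ijkl B_lk evaluated independently at every grid point."""
    # The ellipsis stands for the (assumed trailing) grid axes, so a single
    # einsum call covers the whole grid without explicit Python loops.
    return np.einsum('ijkl...,lk...->ij...', A4, B2)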