add runtime CPU detection for zarch
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
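
A minimal scalar sketch of the relationship described above, assuming the copies compute a plain (signed) sum. The names `ref_asum` and `ref_sum` are hypothetical and are not the actual kernels; the real code is vectorized, with `vflpsb` taking the absolute value of the single-precision lanes.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar stand-in for an ?asum kernel: accumulates |x[i]|. */
static float ref_asum(size_t n, const float *x)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += fabsf(x[i]);   /* the ABS step (vflpsb in the vector code) */
    return acc;
}

/* The same loop with the absolute-value step removed: a plain sum. */
static float ref_sum(size_t n, const float *x)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];          /* no ABS, signs are preserved */
    return acc;
}
```

The two functions differ only in the line that takes the absolute value, which is why the copies can be described as trivial.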