1. Change the dtype of scale to match the dtype of grad in loss_scale.py.
2. Change the dtype of weight_decay to match the dtype of weight in optimizer.py.
Signed-off-by: leonwanghui <leon.wanghui@huawei.com>