Implements stochastic gradient descent (optionally with momentum). Nesterov momentum is based on the formula from On the importance of initialization and momentum in deep learning.
if (torch_is_installed()) {
## Not run:
optimizer <- optim_ignite_sgd(model$parameters(), lr = 0.1)
optimizer$zero_grad()
loss_fn(model(input), target)$backward()
optimizer$step()
## End(Not run)
}
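For intuition, the update applied at each step can be sketched in plain R (without torch). This is a hypothetical illustration, not the library's implementation; `sgd_momentum_step` is an invented helper, and it assumes the common defaults of zero dampening and no weight decay.

```r
# Illustrative sketch of one SGD step with momentum (hypothetical helper,
# assuming dampening = 0 and weight_decay = 0).
sgd_momentum_step <- function(param, grad, buf, lr, momentum, nesterov = FALSE) {
  buf <- momentum * buf + grad        # update the velocity buffer
  if (nesterov) {
    step <- grad + momentum * buf     # Nesterov: look-ahead correction
  } else {
    step <- buf                       # classical momentum
  }
  list(param = param - lr * step, buf = buf)
}

# One step from param = 1 with grad = 0.5, empty buffer:
res <- sgd_momentum_step(param = 1, grad = 0.5, buf = 0, lr = 0.1, momentum = 0.9)
```

With `nesterov = TRUE`, the gradient is re-added after scaling the buffer, which corresponds to the look-ahead form of the Sutskever et al. formulation referenced above.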