MultinomialDistribution.compute_gradient(g, u, phi)

In order to compute the Euclidean gradient, we first need to derive the gradient of the moments with respect to the variational parameters:
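As a sketch of this step, assume the moments take the usual multinomial form $\overline{u}_i = N e^{\phi_i} / \sum_k e^{\phi_k}$, where $N$ is the number of trials. Both this form and the notation are assumptions made for illustration, not taken from the text above. Differentiating with the quotient rule then gives:

```latex
\mathrm{d}\overline{u}_i
= N \frac{e^{\phi_i}\,\mathrm{d}\phi_i}{\sum_k e^{\phi_k}}
- N \frac{e^{\phi_i} \sum_j e^{\phi_j}\,\mathrm{d}\phi_j}
         {\left(\sum_k e^{\phi_k}\right)^2}
= \overline{u}_i\,\mathrm{d}\phi_i
- \overline{u}_i \sum_j \frac{\overline{u}_j}{N}\,\mathrm{d}\phi_j
```

The second equality only substitutes $\overline{u}_j / N = e^{\phi_j} / \sum_k e^{\phi_k}$ back into each term.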

Now we can apply the chain rule. Given the Riemannian gradient of the variational lower bound with respect to the variational parameters, substitute the above result into the derivative term and reorganize the terms to get the Euclidean gradient:
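A minimal NumPy sketch of this chain-rule step follows. It assumes that the moments are `u = N * softmax(phi)`, with `N` the number of trials, and that `g` holds the coefficients multiplying the moment differentials, so the lower bound changes by `sum_i g[i] * du[i]`. Collecting the coefficient of each `dphi[i]` then gives `grad[i] = u[i] * (g[i] - sum_j g[j] * u[j] / N)`. The helper names `moments` and `euclidean_gradient` and the toy objective are hypothetical and are not part of the library; the finite-difference comparison is only a sanity check.

```python
import numpy as np


def moments(phi, n_trials):
    """Assumed multinomial moments: u = N * softmax(phi)."""
    w = np.exp(phi - np.max(phi))  # shift by the max for numerical stability
    return n_trials * w / np.sum(w)


def euclidean_gradient(g, u, n_trials):
    """Hypothetical helper: grad_i = u_i * (g_i - sum_j g_j u_j / N)."""
    return u * (g - np.dot(g, u) / n_trials)


# Toy lower bound L(phi) = c . u(phi), so the coefficients of du are simply c.
rng = np.random.default_rng(0)
N = 5
phi = rng.normal(size=4)
c = rng.normal(size=4)

u = moments(phi, N)
grad = euclidean_gradient(c, u, N)

# Central finite differences of L with respect to each component of phi.
eps = 1e-6
numeric = np.array([
    (c @ moments(phi + eps * e, N) - c @ moments(phi - eps * e, N)) / (2 * eps)
    for e in np.eye(len(phi))
])

max_abs_error = float(np.max(np.abs(grad - numeric)))
```

Under these assumptions the components of `grad` sum to zero, because the moments always sum to `N` and adding a constant to every component of `phi` leaves them unchanged.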

See also: compute_moments_and_cgf