bayespy.inference.vmp.nodes.gaussian_markov_chain.GaussianMarkovChainDistribution

class bayespy.inference.vmp.nodes.gaussian_markov_chain.GaussianMarkovChainDistribution(N, D)

Implementation of VMP formulas for Gaussian Markov chain

The log probability density function of the prior:

Todo

Fix inputs and their weight matrix in the equations.

$\begin{aligned}
\log p(\mathbf{X} | \boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \mathbf{B}, \boldsymbol{\nu}) =& \log \mathcal{N}(\mathbf{x}_0|\boldsymbol{\mu}, \mathbf{\Lambda}) + \sum^N_{n=1} \log \mathcal{N}( \mathbf{x}_n | \mathbf{Ax}_{n-1} + \mathbf{Bu}_n, \mathrm{diag}(\boldsymbol{\nu})) \\
=& - \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \mathbf{x}_0 + \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x}_0 - \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log|\mathbf{\Lambda}| \\
& - \frac{1}{2} \sum^N_{n=1} \mathbf{x}_n^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{x}_n + \frac{1}{2} \sum^N_{n=1} \mathbf{x}_n^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} \mathbf{x}_{n-1} + \frac{1}{2} \sum^N_{n=1} \mathbf{x}_{n-1}^T\mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{x}_n - \frac{1}{2} \sum^N_{n=1} \mathbf{x}_{n-1}^T\mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} \mathbf{x}_{n-1} \\
& + \frac{1}{2} \sum^N_{n=1} \sum^D_{d=1} \log\nu_d - \frac{1}{2} (N+1) D \log(2\pi) \\
=& \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix}^T \begin{bmatrix} -\frac{1}{2}\mathbf{\Lambda} - \frac{1}{2}\mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) & & & \\ \frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2}\mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2}\mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu}) \\ & & & \frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \mathbf{A} & -\frac{1}{2} \mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix} \\
& + \frac{1}{2} \mathbf{x}_0^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \mathbf{x}_0 - \frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2} \log|\mathbf{\Lambda}| + \frac{1}{2} \sum^N_{n=1} \sum^D_{d=1} \log\nu_d - \frac{1}{2} (N+1) D \log(2\pi)
\end{aligned}$

For simplicity, $\boldsymbol{\nu}$ and $\mathbf{A}$ are assumed not to depend on $n$ in the above equation, but this distribution class also supports time-dependent parameters. To obtain that case, substitute $\boldsymbol{\nu} \leftarrow \boldsymbol{\nu}_n$ and $\mathbf{A} \leftarrow \mathbf{A}_n$ in the equations, where $n=1,\ldots,N$.
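The block-tridiagonal form of the prior can be checked numerically. The following is a minimal NumPy sketch, not BayesPy code: the function names are hypothetical, $\boldsymbol{\nu}$ is treated as a vector of precisions shared by all time steps, and the input term $\mathbf{Bu}_n$ is left out, as in the expanded equation.

```python
import numpy as np

def log_prior_direct(X, mu, Lambda, A, nu):
    """Sum the Gaussian log-densities term by term; X has shape (N+1, D)."""
    N, D = X.shape[0] - 1, X.shape[1]
    r0 = X[0] - mu
    logp = -0.5 * r0 @ Lambda @ r0 + 0.5 * np.linalg.slogdet(Lambda)[1]
    for n in range(1, N + 1):
        r = X[n] - A @ X[n - 1]          # innovation x_n - A x_{n-1}
        logp += -0.5 * np.sum(nu * r**2) + 0.5 * np.sum(np.log(nu))
    return logp - 0.5 * (N + 1) * D * np.log(2 * np.pi)

def log_prior_block(X, mu, Lambda, A, nu):
    """Evaluate the same density through the block-tridiagonal quadratic form."""
    N, D = X.shape[0] - 1, X.shape[1]
    V = np.diag(nu)
    M = np.zeros(((N + 1) * D, (N + 1) * D))
    for n in range(N + 1):
        s = slice(n * D, (n + 1) * D)
        # Diagonal block: -Lambda/2 for x_0, -diag(nu)/2 for the later states.
        M[s, s] += -0.5 * Lambda if n == 0 else -0.5 * V
        if n < N:
            # Every state except the last one also acts as a predecessor.
            M[s, s] += -0.5 * A.T @ V @ A
        if n > 0:
            p = slice((n - 1) * D, n * D)
            M[p, s] += 0.5 * A.T @ V     # block above the diagonal
            M[s, p] += 0.5 * V @ A       # block below the diagonal
    x = X.ravel()                        # stacked [x_0; x_1; ...; x_N]
    linear = X[0] @ Lambda @ mu          # both cross terms, Lambda symmetric
    const = (-0.5 * mu @ Lambda @ mu + 0.5 * np.linalg.slogdet(Lambda)[1]
             + 0.5 * N * np.sum(np.log(nu))
             - 0.5 * (N + 1) * D * np.log(2 * np.pi))
    return x @ M @ x + linear + const

rng = np.random.default_rng(0)
N, D = 4, 3
X = rng.normal(size=(N + 1, D))
mu = rng.normal(size=D)
A = rng.normal(size=(D, D))
nu = rng.uniform(0.5, 2.0, size=D)
L = rng.normal(size=(D, D))
Lambda = L @ L.T + D * np.eye(D)         # symmetric positive definite
assert np.isclose(log_prior_direct(X, mu, Lambda, A, nu),
                  log_prior_block(X, mu, Lambda, A, nu))
```

The dense matrix is built only to make the structure visible; its size grows quadratically with $N$, so it is unsuitable for long chains.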

$\begin{aligned}
u(\mathbf{X}) &= \begin{bmatrix} \begin{bmatrix} \mathbf{x}_0 & \ldots & \mathbf{x}_N \end{bmatrix} \\ \begin{bmatrix} \mathbf{x}_0\mathbf{x}_0^T & \ldots & \mathbf{x}_N\mathbf{x}_N^T \end{bmatrix} \\ \begin{bmatrix} \mathbf{x}_0\mathbf{x}_1^T & \ldots & \mathbf{x}_{N-1}\mathbf{x}_N^T \end{bmatrix} \end{bmatrix} \\
\phi(\boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \boldsymbol{\nu}) &= \begin{bmatrix} \begin{bmatrix} \mathbf{\Lambda} \boldsymbol{\mu} & \mathbf{0} & \ldots & \mathbf{0} \end{bmatrix} \\ \begin{bmatrix} -\frac{1}{2}\mathbf{\Lambda} - \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & \ldots & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) - \frac{1}{2} \mathbf{A}^T\mathrm{diag}(\boldsymbol{\nu})\mathbf{A} & -\frac{1}{2}\mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \\ \begin{bmatrix} \mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) & \ldots & \mathbf{A}^T \mathrm{diag}(\boldsymbol{\nu}) \end{bmatrix} \end{bmatrix} \\
g(\boldsymbol{\mu}, \mathbf{\Lambda}, \mathbf{A}, \boldsymbol{\nu}) &= -\frac{1}{2} \boldsymbol{\mu}^T \mathbf{\Lambda} \boldsymbol{\mu} + \frac{1}{2}\log|\mathbf{\Lambda}| + \frac{1}{2} \sum^N_{n=1}\sum^D_{d=1}\log\nu_d \\
f(\mathbf{X}) &= -\frac{1}{2} (N+1) D \log(2\pi)
\end{aligned}$
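The exponential-family decomposition $\log p(\mathbf{X}) = u(\mathbf{X})^T \phi + g + f$ can be illustrated with plain NumPy arrays. This is a sketch under stated assumptions rather than the class's actual implementation: the helper names are hypothetical, the parameters are constant over time, the inputs $\mathbf{Bu}_n$ are omitted, and $g$ is taken to collect every term that depends only on the parameters, including $-\frac{1}{2}\boldsymbol{\mu}^T\mathbf{\Lambda}\boldsymbol{\mu}$ from the expanded density.

```python
import numpy as np

def moments(X):
    """u(X): states, self outer products, and outer products of neighbours."""
    u1 = X                                          # shape (N+1, D)
    u2 = np.einsum('ni,nj->nij', X, X)              # x_n x_n^T, (N+1, D, D)
    u3 = np.einsum('ni,nj->nij', X[:-1], X[1:])     # x_{n-1} x_n^T, (N, D, D)
    return u1, u2, u3

def natural_parameters(mu, Lambda, A, nu, N):
    """phi for time-independent A and nu (nu holds precisions)."""
    D = len(mu)
    V = np.diag(nu)
    phi1 = np.zeros((N + 1, D))
    phi1[0] = Lambda @ mu
    phi2 = np.tile(-0.5 * V, (N + 1, 1, 1))
    phi2[0] = -0.5 * Lambda
    phi2[:-1] += -0.5 * A.T @ V @ A                 # all states but the last
    phi3 = np.tile(A.T @ V, (N, 1, 1))
    return phi1, phi2, phi3

def log_prior_expfam(X, mu, Lambda, A, nu):
    """Evaluate <u, phi> + g + f."""
    N, D = X.shape[0] - 1, X.shape[1]
    u = moments(X)
    phi = natural_parameters(mu, Lambda, A, nu, N)
    inner = sum(np.sum(u_i * phi_i) for u_i, phi_i in zip(u, phi))
    g = (-0.5 * mu @ Lambda @ mu + 0.5 * np.linalg.slogdet(Lambda)[1]
         + 0.5 * N * np.sum(np.log(nu)))
    f = -0.5 * (N + 1) * D * np.log(2 * np.pi)
    return inner + g + f

def log_prior_direct(X, mu, Lambda, A, nu):
    """Reference value: sum of the Gaussian log-densities."""
    N, D = X.shape[0] - 1, X.shape[1]
    r0 = X[0] - mu
    R = X[1:] - X[:-1] @ A.T                        # rows are x_n - A x_{n-1}
    return (-0.5 * r0 @ Lambda @ r0 + 0.5 * np.linalg.slogdet(Lambda)[1]
            - 0.5 * np.sum(nu * R**2) + 0.5 * N * np.sum(np.log(nu))
            - 0.5 * (N + 1) * D * np.log(2 * np.pi))

rng = np.random.default_rng(0)
N, D = 5, 2
X = rng.normal(size=(N + 1, D))
mu = rng.normal(size=D)
A = rng.normal(size=(D, D))
nu = rng.uniform(0.5, 2.0, size=D)
Lambda = np.array([[2.0, 0.3], [0.3, 1.0]])         # symmetric positive definite
assert np.isclose(log_prior_expfam(X, mu, Lambda, A, nu),
                  log_prior_direct(X, mu, Lambda, A, nu))
```

Storing $u$ and $\phi$ as three arrays with a leading time axis mirrors the three rows of the bracketed vectors above, so the inner product reduces to an elementwise product and sum.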

The log probability density function of the posterior approximation:

$\log q(\mathbf{X}) = \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix}^T \begin{bmatrix} \mathbf{\Phi}_0^{(2)} & \frac{1}{2}\mathbf{\Phi}_1^{(3)} & & & \\ \frac{1}{2}{\mathbf{\Phi}_1^{(3)}}^T & \mathbf{\Phi}_1^{(2)} & \frac{1}{2}\mathbf{\Phi}_2^{(3)} & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2}{\mathbf{\Phi}_{N-1}^{(3)}}^T & \mathbf{\Phi}_{N-1}^{(2)} & \frac{1}{2}\mathbf{\Phi}_N^{(3)} \\ & & & \frac{1}{2}{\mathbf{\Phi}_N^{(3)}}^T & \mathbf{\Phi}_N^{(2)} \end{bmatrix} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \\ \vdots \\ \mathbf{x}_{N-1} \\ \mathbf{x}_N \end{bmatrix} + \ldots$
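Because $\log q(\mathbf{X})$ is quadratic in the stacked state vector, $q$ is a joint Gaussian whose precision matrix is $-2$ times the block-tridiagonal matrix above. The sketch below recovers the expected moments by dense inversion. It is an illustration only and not how BayesPy computes them: the function name is hypothetical, the elided linear term is assumed to be given by first-order natural parameters `phi1` (as in the prior's $\phi$), the diagonal blocks are assumed symmetric, and $N \geq 1$.

```python
import numpy as np

def dense_moments(phi1, phi2, phi3):
    """Expected moments of q from natural parameters, by dense inversion.

    phi1: (N+1, D) linear terms (assumed form of the elided term),
    phi2: (N+1, D, D) diagonal blocks Phi_n^(2), n = 0..N,
    phi3: (N, D, D) off-diagonal blocks Phi_n^(3), n = 1..N.
    """
    N1, D = phi1.shape
    M = np.zeros((N1 * D, N1 * D))
    for n in range(N1):
        s = slice(n * D, (n + 1) * D)
        M[s, s] = phi2[n]
        if n > 0:
            p = slice((n - 1) * D, n * D)
            M[p, s] = 0.5 * phi3[n - 1]       # block above the diagonal
            M[s, p] = 0.5 * phi3[n - 1].T     # block below the diagonal
    cov = np.linalg.inv(-2.0 * M)             # precision matrix is -2 M
    mean = cov @ phi1.ravel()
    second = cov + np.outer(mean, mean)       # E[x x^T] for the stacked vector
    x = mean.reshape(N1, D)                                       # E[x_n]
    xx = np.stack([second[n * D:(n + 1) * D, n * D:(n + 1) * D]
                   for n in range(N1)])                           # E[x_n x_n^T]
    xpxn = np.stack([second[(n - 1) * D:n * D, n * D:(n + 1) * D]
                     for n in range(1, N1)])                      # E[x_{n-1} x_n^T]
    return x, xx, xpxn
```

The three returned arrays have the same layout as $u(\mathbf{X})$. Dense inversion costs $O(N^3 D^3)$ operations, whereas a solver that exploits the block-tridiagonal structure scales linearly in $N$.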

__init__(N, D)

Methods

`__init__`(N, D)
`compute_cgf_from_parents`(u_mu_Lambda, ...)	Compute CGF using the moments of the parents.
`compute_fixed_moments_and_f`(x[, mask])	Compute u(x) and f(x) for given x.
`compute_gradient`(g, u, phi)	Compute the standard gradient with respect to the natural parameters.
`compute_logpdf`(u, phi, g, f, ndims)	Compute E[log p(X)] given E[u], E[phi], E[g] and E[f].
`compute_message_to_parent`(parent, index, u, ...)	Compute a message to a parent.
`compute_moments_and_cgf`(phi[, mask])	Compute the moments and the cumulant-generating function.
`compute_phi_from_parents`(u_mu_Lambda, ...[, ...])	Compute the natural parameters using parents' moments.
`compute_rotation_bound`(u, u_mu_Lambda, u_A_V, R)
`compute_weights_to_parent`(index, weights)	Maps the mask to the plates of a parent.
`plates_from_parent`(index, plates)	Compute the plates using information of a parent node.
`plates_to_parent`(index, plates)	Computes the plates of this node with respect to a parent.
`random`(*params[, plates])	Draw a random sample from the distribution.
`rotate`(u, phi, R[, inv, logdet])
`squeeze`(axis)	Squeeze a plate axis from the distribution.
