Variational Autoencoder (VAE)
A short Intro to VAE
1. Background
There are mainly two types of deep generative models:
- Generative Adversarial Network (GAN)
- Variational Autoencoder (VAE)
We will discuss VAE in this blog. In future blogs, we will venture into the details of GAN.
2. A basic intuition
A VAE is an autoencoder whose encoding distribution is regularised (via variational inference) during training, in order to ensure that its latent space has good properties that allow us to generate new data.
3. Encoder & Decoder
An encoder is an agent that transforms the original feature representation into a new set of features (usually of lower dimension) using selection or extraction, and a decoder is an agent that performs the reverse process. The encoded representations span a feature space of a certain dimensionality, which we call the latent space. Furthermore, we know that some properties/information of the original features may be lost when we encode them, so we categorise transformations as lossy or lossless. We are looking for the encoder/decoder pair that keeps the maximum amount of information when encoding and, therefore, has the minimum reconstruction error when decoding.
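To make this search concrete, it can be written as an optimisation problem (the notation below is introduced here for illustration: $E$ and $D$ denote the families of candidate encoders and decoders, and $\epsilon$ is a reconstruction error such as the squared distance):

$$(e^*, d^*) = \underset{(e, d)\,\in\, E \times D}{\arg\min}\; \epsilon\big(x,\, d(e(x))\big)$$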
4. Autoencoder
An autoencoder is built by setting both the encoder and the decoder as neural networks and learning the best encoding-decoding scheme through an iterative optimisation process. At each iteration, we feed the autoencoder architecture (the encoder followed by the decoder) with some data, compare the encoded-decoded output with the initial data, and backpropagate the error through the architecture to update the weights of the networks. We usually use the Mean Squared Error (MSE) as the loss function for backpropagation. This setup is often compared with PCA; when the encoder and decoder get deeper and more non-linear, the autoencoder can still achieve a strong dimensionality reduction while keeping the reconstruction loss low.
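As an illustration, here is a minimal sketch of such an encoder-decoder pair trained with an MSE reconstruction loss in PyTorch (the layer sizes, latent dimension and optimiser settings are arbitrary choices for the example, not values from this post):

```python
import torch
import torch.nn as nn

# A minimal fully-connected autoencoder: the encoder compresses the input
# to a low-dimensional latent code, the decoder reconstructs it back.
class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()           # Mean Squared Error between input and reconstruction

x = torch.rand(64, 784)          # dummy batch standing in for real data
optimizer.zero_grad()
reconstruction = model(x)
loss = loss_fn(reconstruction, x)
loss.backward()                  # backpropagate the reconstruction error
optimizer.step()                 # update the weights of both networks
```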
There are two main drawbacks of autoencoders:
- The lack of interpretable and exploitable structures in the latent space (lack of regularity)
- Difficulty in reducing a large number of dimensions while keeping the major part of the data structure information in the reduced representations
As a result of these two drawbacks, we may generate meaningless data if we simply encode into a latent space and sample random points from it to decode (which is exactly what a generative model needs to do). This issue leads to the need for regularisation of the latent space distribution, and it is the motivation for the Variational Autoencoder.
VAE in detail
The key idea of the VAE is that, instead of encoding an input as a single point, we encode it as a distribution over the latent space. In essence, we do not have a point-wise estimate but an estimate of the original input's distribution (hence Bayesian inference is of significant help here). The error is then evaluated on a new sample drawn from the estimated distribution and compared with the original sample.
(In practice, the encoded distributions are chosen to be normal, so that the encoder can be trained to return the mean and the covariance matrix that describe these Gaussians. See why this is the case in the mathematical details below.)
Because of the regularisation, we now have an additional term in the loss: the KL divergence. The KL divergence between two Gaussians has a closed form and hence can be computed easily.
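For instance, with the usual choice of a diagonal Gaussian encoder $\mathcal{N}(\mu_x, \sigma_x^2 I)$ and a standard normal prior $\mathcal{N}(0, I)$, the per-sample loss is a reconstruction term plus this regularisation term, and the KL term takes the simple closed form below (the squared-error reconstruction term is just one common choice):

$$\text{loss} = \|x - \hat{x}\|^2 + \mathrm{KL}\big(\mathcal{N}(\mu_x, \sigma_x^2 I)\,\|\,\mathcal{N}(0, I)\big), \qquad \mathrm{KL} = \frac{1}{2}\sum_{j=1}^{d}\Big(\sigma_{x,j}^2 + \mu_{x,j}^2 - 1 - \log \sigma_{x,j}^2\Big)$$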
1. All the math down here
We require two assumptions:
- a latent representation $z$ is sampled from the prior distribution $p(z)$;
- the data $x$ is sampled from the conditional likelihood distribution $p(x|z)$.
Now we note here that the "probabilistic decoder" is naturally defined by $p(x|z)$, which describes the distribution of the decoded data given its latent representation, while the "probabilistic encoder" is defined by $p(z|x)$, which describes the distribution of the latent representation given the data. These two expressions remind us easily of the Bayes Rule, with which we have
$$p(z|x) = \frac{p(x|z)\,p(z)}{p(x)} = \frac{p(x|z)\,p(z)}{\int p(x|u)\,p(u)\,du}$$
The denominator is an intractable integral in general, so the exact posterior cannot be computed directly and must be approximated via variational inference. We define $q_x(z)$, the variational approximation of $p(z|x)$, to be a Gaussian $\mathcal{N}\big(g(x), h(x)\big)$ whose mean $g(x)$ and covariance $h(x)$ are functions of $x$, and we look for the $g$ and $h$ that minimise the KL divergence between $q_x(z)$ and the true posterior.
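Minimising this KL divergence cannot be done directly either, but it is equivalent to maximising the so-called Evidence Lower BOund (ELBO), which only involves quantities we can evaluate. As a sketch of the resulting objective (writing $f(z)$ for the mean of the Gaussian likelihood $p(x|z)$, a standard modelling assumption introduced here for concreteness), we maximise over $f$, $g$ and $h$:

$$(f^*, g^*, h^*) = \underset{(f, g, h)}{\arg\max}\; \Big( \mathbb{E}_{z \sim q_x(z)}\big[\log p(x|z)\big] \;-\; \mathrm{KL}\big(q_x(z)\,\|\,p(z)\big) \Big)$$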
2. Practical idea: Neural Network
Now that we have an optimisation problem which may be solved using neural networks, we still need to address a few issues. First, the entire space of candidate functions $f$, $g$ and $h$ cannot be searched exhaustively, so we restrict each of them to a parametric family represented by a neural network and optimise over the networks' weights.
For simplicity of computation, we often require the covariance matrix $h(x)$ returned by the encoder to be diagonal, so that the encoder only has to output a mean vector and a vector of (log-)variances.
Using these ideas, we can first sample $\zeta \sim \mathcal{N}(0, I)$ and then compute $z = g(x) + h(x)^{1/2} \odot \zeta$ (the reparameterisation trick), so that the stochastic sampling step no longer prevents gradients from flowing back into the encoder during backpropagation.
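As a sketch (assuming, as above, a diagonal Gaussian encoder; the tensor names and shapes here are illustrative only), the reparameterisation trick looks like this in PyTorch:

```python
import torch

def reparameterize(mu, logvar):
    """Sample z ~ N(mu, diag(exp(logvar))) in a differentiable way.

    Instead of sampling z directly (which would block gradients), we sample
    a fixed noise term and shift/scale it with the encoder's outputs.
    """
    std = torch.exp(0.5 * logvar)   # standard deviation from the log-variance
    zeta = torch.randn_like(std)    # zeta ~ N(0, I), independent of the network
    return mu + std * zeta          # z = g(x) + h(x)^(1/2) * zeta

# Illustrative usage with dummy encoder outputs for a batch of 64 samples
mu = torch.zeros(64, 32)
logvar = torch.zeros(64, 32)
z = reparameterize(mu, logvar)      # gradients flow through mu and logvar
```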
Code implementation
To Be Updated
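In the meantime, here is a minimal end-to-end sketch of a VAE in PyTorch tying the pieces above together (the architecture sizes, the latent dimension and the use of a binary cross-entropy reconstruction term are illustrative choices, not prescriptions from this post):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=32):
        super().__init__()
        # Encoder body plus two heads: mean and log-variance of q_x(z)
        self.enc = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.mu_head = nn.Linear(hidden_dim, latent_dim)
        self.logvar_head = nn.Linear(hidden_dim, latent_dim)
        # Decoder maps a latent sample back to the data space
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu_head(h), self.logvar_head(h)
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)   # reparameterisation trick
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term + closed-form KL( N(mu, sigma^2 I) || N(0, I) )
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(64, 784)            # dummy batch with values in [0, 1]
optimizer.zero_grad()
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
loss.backward()
optimizer.step()

# Generating new data: sample from the prior and decode
with torch.no_grad():
    samples = model.dec(torch.randn(16, 32))
```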