Wednesday, December 5, 2018

Do GANs actually learn the distribution? An empirical study

Sanjeev Arora, Yi Zhang

  • GANs do not provide a measure of distributional fit/perplexity
  • Current tests on GANs can rule out the possibility that a GAN is merely copying training examples, but they do not indicate whether it has learned the target distribution
  • Support size: Loosely, the number of distinct items a distribution can produce, e.g. for a set of facial images, the number of distinct faces that combinations of facial features can yield. What is the support size of the distribution a GAN learns? Can a GAN reach low training error while its generated distribution has a small support size, and so fail to learn the support of the target distribution?

Uses a birthday paradox test. When samples are drawn uniformly from a set of size N, the probability that the batch contains a duplicate becomes reasonably high (close to 0.5) once the number of samples drawn exceeds roughly sqrt(N). This can be used to test a GAN. Suppose the training distribution has support size N, and generate a batch of S samples from the GAN. If batches of size S contain a duplicate with reasonable probability, then the support size of the generated distribution is roughly S^2. If S^2 is close to N, then the GAN is approximating the support of the training distribution. The test fails if the training distribution is highly skewed, because the sqrt(N) estimate assumes roughly uniform sampling.
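The arithmetic behind the test can be checked with a small sketch. It assumes uniform sampling, and the function names (collision_probability, smallest_batch_with_collision, estimated_support_size) are made up here for illustration; they are not from the paper.

```python
def collision_probability(support_size: int, batch_size: int) -> float:
    """Probability of at least one duplicate among `batch_size` draws
    made uniformly at random from `support_size` distinct items."""
    if batch_size > support_size:
        return 1.0  # pigeonhole: a duplicate is guaranteed
    p_no_duplicate = 1.0
    for i in range(batch_size):
        # The (i+1)-th draw must avoid the i items already seen.
        p_no_duplicate *= 1.0 - i / support_size
    return 1.0 - p_no_duplicate


def smallest_batch_with_collision(support_size: int, threshold: float = 0.5) -> int:
    """Smallest batch size whose duplicate probability reaches `threshold`."""
    p_no_duplicate = 1.0
    batch_size = 0
    while 1.0 - p_no_duplicate < threshold:
        p_no_duplicate *= 1.0 - batch_size / support_size
        batch_size += 1
    return batch_size


def estimated_support_size(batch_size: int) -> int:
    """Birthday-paradox estimate: duplicates in batches of size S
    suggest a support of roughly S^2 (uniform assumption)."""
    return batch_size ** 2


if __name__ == "__main__":
    # Classic birthday setting: 365 days, 23 people.
    print(collision_probability(365, 23))
    print(smallest_batch_with_collision(365))
    print(estimated_support_size(400))
```

The S^2 figure is only an order-of-magnitude estimate: the exact batch size at which the duplicate probability crosses 0.5 is a small constant factor above sqrt(N).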

  • CIFAR-10: Images are compared using the vector embedding from the final layer of a CNN trained for classification. Duplicates are identified using a Euclidean distance measure between the vector embeddings. The GAN used was a Stacked GAN
  • Compares two measures: 
    • Diversity: Square of the number of samples needed to detect a duplicate with probability > 0.5
    • Discriminator size: Size of embedding vector in final layer of discriminator CNN. 
  • If a GAN is learning the distribution then diversity should be close to discriminator size. 
  • Results on CIFAR-10 indicate that:
    • The GAN is learning a smaller support size than needed to represent the target distribution; however, it is not copying training images
    • The size of the generated distribution’s support (diversity) scales near-linearly with discriminator capacity
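
The duplicate check used for CIFAR-10 above (Euclidean distance between embedding vectors) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the embeddings are assumed to be plain lists of floats already extracted from a classifier, and the names closest_pairs, has_near_duplicate, top_k and threshold are hypothetical.

```python
import math
from itertools import combinations


def closest_pairs(embeddings, top_k=5):
    """Return the `top_k` closest pairs as (distance, i, j) tuples,
    sorted by Euclidean distance between embedding vectors."""
    pairs = [
        (math.dist(a, b), i, j)
        for (i, a), (j, b) in combinations(enumerate(embeddings), 2)
    ]
    pairs.sort()
    return pairs[:top_k]


def has_near_duplicate(embeddings, threshold):
    """True if any two embeddings lie within `threshold` of each other.
    The threshold is an assumed tuning parameter, not a value from the paper."""
    return any(
        math.dist(a, b) <= threshold
        for a, b in combinations(embeddings, 2)
    )


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real ones would come from a CNN's final layer.
    batch = [[0.0, 0.0], [3.0, 4.0], [0.1, 0.0], [10.0, 10.0]]
    for distance, i, j in closest_pairs(batch, top_k=2):
        print(f"samples {i} and {j}: distance {distance:.3f}")
    print(has_near_duplicate(batch, threshold=0.5))
```

Comparing all pairs is quadratic in the batch size, which is acceptable for the batch sizes a birthday test needs. Candidate pairs found this way would still need a visual check, since closeness in embedding space is only a heuristic for two images being duplicates.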
