Deep Learning with a tale of two cities (Part VII/IX): time, coder, and generation
-- From Transformer to Autoencoder --
The transformer architectures covered in W9 motivate further thought on the framework of block implementations on data: ideally, a two-step architecture can be introduced for processing and transforming the information. Fundamental examples include compressed sensing and Principal Component Analysis, where the data is reduced to a very low-dimensional latent representation, known as encoding, followed by a decoding process in which the latent representation is mapped back into a high-dimensional object. The loss composition is also interesting: whereas in all previous contexts we evaluated the goodness of the model against a prediction target, the loss here can be evaluated against the data itself: the decoded-encoded data against the original, to be precise.
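The encode-reduce-decode pipeline and its self-referential loss can be sketched concretely. Below is a minimal, illustrative example using the PCA case mentioned above (a linear autoencoder on toy data); the data and dimensions are placeholders, not anything from the course.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # 200 samples in 10 dimensions (toy data)
X = X - X.mean(axis=0)           # centre the data, as PCA requires

k = 2                            # very low latent dimension
# For the linear case, PCA gives the optimal encoder:
# the top-k right singular vectors of the data matrix.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k].T                     # shared encoder/decoder weights (10 x 2)

Z = X @ W                        # encoding: 10-d input -> 2-d latent code
X_hat = Z @ W.T                  # decoding: 2-d latent -> 10-d reconstruction

# The loss compares the decoded-encoded data against the data itself,
# rather than against a separate prediction target.
reconstruction_loss = np.mean((X - X_hat) ** 2)
print(reconstruction_loss)
```

The same two-step shape carries over to nonlinear autoencoders: only the linear maps are replaced by learnt networks, while the reconstruction loss stays the same.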
What's the point, then? Well, the decoder learns the distribution of the data by mapping the latent representation into the output: this mimics a conditional probability from the latent variable to the output; likewise, the encoder learns the other conditional probability, from the input to the latent variable. For a proper choice of the latent variable, one could then set the probability being learnt to be the one we want to know: for instance, a 32x32 image of a cat, conditional on the latent inputs "32" and "cat".
It then becomes apparent that the decoder, if configured appropriately, could be used as a generator: we draw samples from the learnt conditional distribution. This performs well empirically, and if we ask how to quantify such "goodness", it leads to another topic of interest: the discriminator.
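A hedged sketch of this sampling step: we treat a decoder as a generator by drawing latent codes from a prior and decoding them. The decoder weights here are random placeholders standing in for learnt parameters, purely to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim, data_dim = 2, 10
# Placeholder for a trained decoder's weights (illustrative only).
W_dec = rng.normal(size=(latent_dim, data_dim))

def generate(n):
    # Draw n samples from the latent prior N(0, I)...
    z = rng.normal(size=(n, latent_dim))
    # ...and decode them into data space: sampling from the
    # learnt conditional distribution of outputs given latents.
    return z @ W_dec

samples = generate(5)
print(samples.shape)  # (5, 10): five generated 10-d "data points"
```

With a conditional decoder, one would additionally concatenate the conditioning information (e.g. a class label) to `z` before decoding.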
-- Generative models --
Moving into W11, the last week of the course, we had an overview of state-of-the-art research into Generative Adversarial Networks (GANs), where a discriminator is introduced on top of the generator. The discriminator-generator pair can be thought of as a two-agent game in which both intend to maximise their payoff: for the discriminator, the payoff is the correctness with which it distinguishes generated data from real data, whereas the generator intends to maximise the likelihood of the discriminator getting the generated data wrong, i.e. fooling it.
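The two payoffs can be written down as losses. The sketch below assumes the standard binary cross-entropy formulation, with the discriminator outputting the probability that its input is real; the score values are illustrative numbers, not outputs of any trained model.

```python
import numpy as np

def bce(p, label):
    """Binary cross-entropy between probabilities p and a 0/1 label."""
    p = np.clip(p, 1e-7, 1 - 1e-7)  # guard against log(0)
    return -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

d_real = np.array([0.9, 0.8])  # D's scores on real data (illustrative)
d_fake = np.array([0.2, 0.3])  # D's scores on generated data (illustrative)

# Discriminator payoff: be correct on both sources,
# i.e. label real data 1 and generated data 0.
d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)

# Generator payoff: fool the discriminator,
# i.e. have generated data labelled 1.
g_loss = bce(d_fake, 1.0)
```

Training alternates gradient steps on `d_loss` and `g_loss`; the adversarial dynamic comes from the two losses pulling `d_fake` in opposite directions.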
Arriving at the end of the course, it all seems too good to be true: from a stacked composition of neurons, we are able to learn and construct various kinds of data, from images to audio, and finally arrive at the framework of data generation. Yet, I wonder, what is the cost of all this?
-- The ontology of AI --
"We can be both of God and the Devil, since we are trying to raise the dead against the stream of time."
--- Vermouth (2003, in ep. 309 of Meitantei Conan 名探偵コナン)
Meitantei Conan is a Japanese detective manga written by Gosho Aoyama (青山 剛昌). The quote is spoken by one of the key characters, Vermouth. Back in the early 2000s, AI was still at a very early stage: various optimisation algorithms were yet to be discovered, and neither GANs nor other sophistications were known to practitioners. Yet, in Gosho's script, this line carries rather heavy weight: the development of AI is raising the dead against the stream of time, and generative models are precisely the class of architectures that make this happen.
A broader range of debates has been brought to attention on these matters: how should we use AI, and what ethical or moral traps might we fall into? At the close of this 11-week course on Deep Learning, it is tempting to think about these ontological or philosophical questions on top of the mathematical details we have processed. The technical explanations for AI departing from good use vary: lack of samples, too few training steps, or an unsuitable architecture. We have seen various examples in class, but still, one needs to think more deeply about more fundamental issues, such as the appropriateness of using algorithms to address a certain class of problems at all.
We can be both of God and the Devil. Make good use of AI.