We were shown both the mathematical formulation of GANs (generative adversarial networks) and the algorithmic approach to training them. GANs are notorious for being hard to train, but seeing the two views together makes it much clearer how to train them properly.
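For reference, here is a minimal sketch of the standard alternating-update training loop; the toy architectures, optimisers, and the non-saturating generator loss are illustrative choices, not the specific recipe from the talk:

```python
import torch
import torch.nn as nn

# Toy generator and discriminator; the architectures here are placeholders.
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def training_step(real_batch):
    batch_size = real_batch.shape[0]

    # Discriminator step: push D(real) towards 1 and D(fake) towards 0.
    z = torch.randn(batch_size, 16)
    fake = G(z).detach()
    loss_D = bce(D(real_batch), torch.ones(batch_size, 1)) \
           + bce(D(fake), torch.zeros(batch_size, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: push D(G(z)) towards 1 (the "non-saturating" loss).
    z = torch.randn(batch_size, 16)
    loss_G = bce(D(G(z)), torch.ones(batch_size, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()

# Example: one step on a toy batch of 2-d "real" samples.
losses = training_step(torch.randn(64, 2) + 3.0)
```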
Also, a plug for the AI residency at Microsoft Research Cambridge.
Alhussein Fawzi (from UCLA, and just about to join DeepMind) showed us a neat trick to find a single image which, when applied as a small perturbation to almost any picture, will fool a given neural network classifier; the same perturbation even transfers between networks. Here is the magic image:
Some Christmas challenges:
- Train a recurrent network on tweets, at the word level, then apply it to Charles Dickens. Splice emoji into Dickens. (A minimal word-level RNN sketch follows after this list.)
- Invent a hybrid character-and-word RNN. It could have character-level units, plus short-circuit links between words (inspired by residual networks). This might help when the vocabulary is very different between Dickens and Twitter.
- Could we recompose Dickens into tweets? This sounds like a translation task, but we don’t have a bilingual dataset. Maybe there are word or sentence embeddings that can simply be transposed?
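For the first challenge, here is a minimal sketch of a word-level RNN language model in PyTorch; the toy vocabulary, random batch, and hyperparameters are placeholders rather than a recommended setup:

```python
import torch
import torch.nn as nn

class WordRNN(nn.Module):
    """Word-level language model: embed, run an LSTM, predict the next word."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)

# Toy "tweet" corpus encoded as word indices; in practice use a real tokeniser.
vocab_size = 1000
batch = torch.randint(0, vocab_size, (32, 20))   # 32 sequences of 20 word ids

model = WordRNN(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step: predict each next word from the previous ones.
logits = model(batch[:, :-1])
loss = loss_fn(logits.reshape(-1, vocab_size), batch[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```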
Figuring out how to use TensorFlow:
In a reading group early next term, let’s have a show-and-tell, and a discussion of TensorFlow and PyTorch. Also, there’s a Slack channel for any discussion over the holidays.
Fergal Cotter (Engineering) presented Deep residual learning for image recognition, He, Zhang, Ren, and Sun, 2015 [pdf].
Also, a hint about his own work on “lifting” residual networks to interpret deep nodes in a neural network [Jupyter]
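For anyone new to the He et al. paper, the key idea is that each block learns only a residual correction added to its input, so the identity map is easy to represent. A minimal PyTorch sketch of one block (an illustration, not Fergal's code, and omitting the batch normalisation used in the paper):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = x + F(x): the block learns only the residual F."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.conv2(self.relu(self.conv1(x)))
        return self.relu(x + residual)   # skip connection adds the input back

# Example: pass a toy batch of 16-channel feature maps through one block.
block = ResidualBlock(channels=16)
y = block(torch.randn(8, 16, 32, 32))
```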
Damon Wischik (Computer Lab) briefly described You only look once: unified real-time object detection, Redmon et al., 2015 [arXiv]
Kris Cao (from the Natural Language Processing group in the Computer Laboratory) told us how neural networks are used for natural language processing.
He told us that colourless green ideas sleep furiously; about the debate between unstructured statistical approaches (RNNs) and grammar-based approaches (RNNGs); and about attention models based on memory banks.
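As a rough illustration of the last idea (a generic sketch, not the specific models Kris described), attention scores a query against every slot of a memory bank and reads out a weighted average:

```python
import torch
import torch.nn.functional as F

def attend(query, memory):
    """Dot-product attention: query (d,), memory (n_slots, d) -> weighted sum of slots."""
    scores = memory @ query                 # similarity of the query to each memory slot
    weights = F.softmax(scores, dim=0)      # normalise to a distribution over slots
    return weights @ memory                 # read out a convex combination of the slots

# Toy usage: 5 memory slots of dimension 8.
memory = torch.randn(5, 8)
query = torch.randn(8)
read_vector = attend(query, memory)
```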
Jack Kamm (Genetics) and Johannes Bausch (DAMTP) presented Generating sequences with recurrent neural networks, Graves, 2013 [pdf]
Sebastian Lunz (CCA) and Oliver Leicht (DAMTP) presented Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank, Socher et al., 2013 [pdf]
Ryota Tomioka from MSR told us about AMPNet, a system developed at MSR for asynchronous model-parallel training of neural networks.
Slides to follow…
Damon Wischik from the Computer Laboratory gave a general introduction to backpropagation.
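For anyone who missed it, here is a minimal numpy illustration of the idea (a sketch for illustration, not taken from the talk): run a forward pass through a tiny one-hidden-layer network, then apply the chain rule backwards to get the gradient of the loss with respect to each weight.

```python
import numpy as np

# Tiny network: x -> W1, tanh -> W2 -> prediction, squared-error loss.
rng = np.random.default_rng(0)
x, target = rng.normal(size=3), 1.0
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=4)

# Forward pass, keeping intermediate values for the backward pass.
z = W1 @ x                       # pre-activation of the hidden layer
h = np.tanh(z)                   # hidden activations
y = W2 @ h                       # scalar prediction
loss = 0.5 * (y - target) ** 2

# Backward pass: apply the chain rule from the loss back to each weight.
dy = y - target                  # dloss/dy
dW2 = dy * h                     # dloss/dW2
dh = dy * W2                     # dloss/dh
dz = dh * (1 - np.tanh(z) ** 2)  # dloss/dz, through the tanh
dW1 = np.outer(dz, x)            # dloss/dW1

# A single gradient-descent step.
W1 -= 0.1 * dW1
W2 -= 0.1 * dW2
```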