What convolutional neural networks see

Neural networks, and in particular convolutional neural networks, have been at the heart of many recent research projects with an artistic flavor. Some of the better-known ones have been DeepDream, A Neural Algorithm of Artistic Style (style transfer), deep generator networks, and most recently WaveNet, which learns to generate audio. They've also found their way into many practical applications: everything from self-driving cars to speech-to-text systems and AIs that can play the game of Go. This recent success comes from an ability to accurately recognize and describe images, but the way they do this remains a mystery to most people. We can get a few intuitions about what's really going on by inspecting them, looking inside them, and seeing how they see the world.

What you're looking at is a neural network processing my webcam in real time. It scans the image looking for patterns, or what we call features. The patterns look like these: some are lines, edges, or gradients, really minimal multi-pixel patterns. These responses, which we sometimes call feature maps or activations, show us the presence of those features inside the image. So in the first layer of the network we've discovered edges, gradients, and patterns like that. (The first sketch below shows what one of these first-layer feature detectors does.)

Things get a little more interesting when we repeat this process many times through a sequence of layers. At each layer of the network, we take the feature maps from the previous layer, stack them together into a new volume of data, and do another round of convolution on top of them. The activation maps in the second layer, which we're looking at here, are more interesting because rather than looking for patterns in the raw pixels of the original image, we're now looking for patterns in the activation maps of the previous layer of the network. For example, it might combine vertical edges and horizontal edges to detect corners, which we can think of as higher-level features. (The second sketch below shows this stacking of layers.)

As we do this many times, progressing through every layer of the network, we acquire a higher- and higher-level representation of the image. We go from things like edges and gradients, to corners and grids, to progressively more complex features, maybe things like leaves or fences or door handles, and then to even higher-level features: houses, cars, people, and so on. This process of pushing data through the network over many layers of transformations is why these algorithms are sometimes called deep neural networks, or deep learning. The "deep" just means the network has many layers.

When we finally arrive at the last layer of the network, we have a compact representation of the content of the image, and we can attach one more classification layer on top of it so that we can accurately describe what's inside the image. For example, if I place my phone in front of the camera, it'll say "iPod", or if I place this water bottle in front of the camera, it'll accurately detect a water bottle. (The last sketch below shows such a classification head.)

Now, it can be a bit hard to understand what the feature detectors are looking for, but it turns out there are ways, and there has been some work done on this in the past. Some of the first work came from Matt Zeiler and Rob Fergus in 2013, where they showed patches of actual images that caused certain feature detectors to light up. Another nice resource is the Deep Visualization Toolbox, which was made by Jason Yosinski and was a major inspiration for the visualization software you saw in the last slide.
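To make the first-layer idea concrete, here is a minimal sketch in Python/NumPy of what a single feature detector does: slide a small kernel over the image and record how strongly each patch matches it. The 3×3 vertical-edge kernel and the random stand-in image are illustrative choices, not filters taken from the network in the demo.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a small kernel over a grayscale image and record the
    response at each position -- one feature map per kernel.
    (CNN libraries call this convolution, though strictly speaking
    it is cross-correlation.)"""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)
    return out

# A hand-made vertical-edge kernel: it responds strongly wherever
# dark pixels sit immediately to the left of bright ones.
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

image = np.random.rand(64, 64)          # stand-in for one webcam frame
feature_map = convolve2d(image, vertical_edge)
```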
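Here is a rough sketch, using PyTorch for brevity, of how feature maps get stacked into a volume and convolved again, layer after layer. The channel counts, kernel sizes, and input resolution are arbitrary placeholders; a real network like the one in the demo would be trained, not randomly initialized.

```python
import torch
import torch.nn as nn

# Each layer convolves over the stacked feature maps of the layer
# before it, so later layers see combinations of earlier features
# (e.g. vertical + horizontal edges -> corners).
features = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # raw RGB pixels -> 16 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # layer-1 maps -> 32 higher-level maps
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # and so on, up the hierarchy
    nn.ReLU(),
    nn.MaxPool2d(2),
)

frame = torch.randn(1, 3, 64, 64)   # stand-in for one webcam frame
maps = features(frame)
print(maps.shape)                   # torch.Size([1, 64, 8, 8])
```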
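And a sketch of the final step: attaching a classification layer on top of the compact representation from the last convolutional layer. The shapes and the 1000-class output here are placeholders (loosely modeled on ImageNet-style classifiers); with trained weights and a real label list, the highest-scoring class would be the network's description of the image.

```python
import torch
import torch.nn as nn

# Suppose the convolutional layers have already reduced a frame to a
# compact stack of high-level feature maps (shapes are placeholders).
feature_maps = torch.randn(1, 64, 8, 8)

head = nn.Sequential(
    nn.Flatten(),                 # (1, 64, 8, 8) -> (1, 4096)
    nn.Linear(64 * 8 * 8, 1000),  # one score per class
)

probs = torch.softmax(head(feature_maps), dim=1)
best = probs.argmax(dim=1).item()
# With trained weights and a real label list, `best` would pick out
# a class name such as "iPod" or "water bottle".
```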
If you're interested in learning more about how these impressive algorithms work, or even getting your hands dirty and working with them yourself through a series of practical guides and tutorials, I encourage you to check out ml4a.github.io, an in-progress free online book about machine learning for artists.
