Making Convolutional Networks Shift-Invariant Again (ICML 2019)

Making convolutional networks shift-invariant again. Richard Zhang. Thank you.

These two images are classified correctly; these two are not. The only difference between them is a seemingly innocuous shift. Now, this sensitivity has been analyzed in some great recent work, including at this conference. Inspired by that work, we’re aiming to produce a fix. To do so, we must first ask: why? Why is this happening? How can shift-invariance be lost? After all, we’ve heard this mantra many times: convolutions are shift-equivariant, that is, the features should move with the image, and pooling builds up shift-invariance, giving it a little bit of spatial slack. What we’ll see is that downsampling, or striding, ends up ignoring the Nyquist sampling theorem and aliasing. Now, striding can be embedded in both convolutions and in pooling, but we’ll first start off with an example in pooling.

Here’s a toy signal: 0 0 1 1 0 0 1 1. Let’s max-pool it together: the max of 0 and 0 is 0, the max of 1 and 1 is 1, and so on and so forth, and hopefully you’re checking my math on this. Now, if we shift the signal by one index, we’re going to take the max of 0 and 1, which is 1, the max of 1 and 0, which is 1, and so on and so forth, and we get this output. This answer is very different from the first answer, and they were produced simply by shifting the input signal by a single index. What we’ve done here is break shift-equivariance.
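To make the arithmetic concrete, here is a minimal sketch of that toy example in NumPy (my illustration; the talk shows slides, not code):

    # Max-pool a 1-D signal with stride 2, then do the same on a
    # circularly shifted copy of the signal.
    import numpy as np

    def max_pool_1d(x, kernel=2, stride=2):
        # Take the max over each window, stepping by `stride`.
        return np.array([x[i:i + kernel].max()
                         for i in range(0, len(x) - kernel + 1, stride)])

    x = np.array([0, 0, 1, 1, 0, 0, 1, 1])
    print(max_pool_1d(x))               # [0 1 0 1]
    print(max_pool_1d(np.roll(x, -1)))  # shifted by one index -> [1 1 1 1]

A one-index shift of the input turns the output from 0 1 0 1 into 1 1 1 1: the subsampled max no longer tracks the shift.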
Let’s see if this happens in a more realistic scenario. We can do CIFAR classification using the VGG network, and we can probe each of the internal layers to see how shift-equivariant they are. Shift-equivariance means that you can do a shifting operation and a feature-extraction operation in either order; that is, these two operations should commute. Now, if we look at the first conv layer, it is perfectly shift-equivariant, as we know, and this is a good sanity check for us. But if we go to the first pooling layer, we see this interesting stippling pattern occur. What this means is that even-pixel shifts will give the same representation, shifted over, but odd-pixel shifts will actually give a different representation. And as we go deeper through the network, each pooling layer actually increases the periodicity of the stippling pattern, making things worse and worse. This is because these layers are downsampling without taking the Nyquist sampling theorem into account, and this aliasing is breaking shift-equivariance through the network.
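As a rough illustration of this probe (a PyTorch sketch of my own; the paper measures this systematically over a dataset, and circular shifts plus circular padding are used here to sidestep boundary effects):

    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)
    conv = nn.Conv2d(3, 8, 3, padding=1, padding_mode='circular')
    pool = nn.MaxPool2d(2)

    # Sanity check: a stride-1 conv commutes with shifting (shift-equivariant).
    print(torch.allclose(conv(torch.roll(x, 1, dims=-1)),
                         torch.roll(conv(x), 1, dims=-1), atol=1e-5))  # True

    # An even input shift of 2 is a clean output shift of 1 after stride-2 pooling...
    print(torch.equal(pool(torch.roll(x, 2, dims=-1)),
                      torch.roll(pool(x), 1, dims=-1)))                # True

    # ...but an odd shift of 1 yields a genuinely different representation.
    print(torch.equal(pool(torch.roll(x, 1, dims=-1)), pool(x)))       # False

The even/odd asymmetry in the last two checks is exactly the stippling pattern described above.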
Now, if you recall your undergraduate signal-processing or image-processing course, what you may have been taught is that before downsampling a signal, you need to blur it; you need to blur it as a means of anti-aliasing. In deep learning, historically, what we’ve seen through careful evaluations is that max-pooling actually performs better empirically. What we’d like to do in our work is reconcile classic anti-aliasing with max-pooling. To do so, we need to look at max-pooling a little more closely, and our simple insight is that max-pooling can be divided into two sequential operations: one is taking the max, but doing so in a dense fashion, and the second is subsampling from this intermediate feature map. This allows us to isolate the problem. We can actually keep the first operation, because it’s not aliasing at all; what’s at fault is the subsampling operation, and we can reduce the aliasing in it by adding a blur filter to the intermediate feature map before doing the subsampling. These two operations can be evaluated together, so the bottom row here shows our anti-aliased version of max-pooling, and where the original max-pooling was just one line of code, we can add just a line of code to do so.
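In code, that decomposition looks roughly like the following (a PyTorch sketch of the idea; the official BlurPool implementation supports several fixed filter sizes, and this version uses the [1, 2, 1] binomial filter):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BlurPool2d(nn.Module):
        # Blur with a fixed low-pass filter, then subsample by `stride`.
        def __init__(self, channels, stride=2):
            super().__init__()
            self.stride = stride
            a = torch.tensor([1., 2., 1.])   # binomial filter taps
            k = torch.outer(a, a)
            k = k / k.sum()                  # normalize so brightness is preserved
            # One copy per channel, applied depthwise (groups=channels).
            self.register_buffer('kernel', k.expand(channels, 1, 3, 3).contiguous())

        def forward(self, x):
            x = F.pad(x, (1, 1, 1, 1), mode='reflect')
            return F.conv2d(x, self.kernel, stride=self.stride, groups=x.shape[1])

    # Baseline max-pool vs. the anti-aliased version:
    # take the max densely, then blur and subsample.
    baseline    = nn.MaxPool2d(kernel_size=2, stride=2)
    antialiased = nn.Sequential(nn.MaxPool2d(kernel_size=2, stride=1),  # dense max
                                BlurPool2d(channels=64, stride=2))      # blur + subsample

Both versions halve the spatial resolution; only the anti-aliased one low-pass filters before throwing samples away.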
Now, we were motivated by, and have illustrated this for, max-pooling, but there are other downsampling methods in deep nets as well, such as strided convolution. It’s important to note that strided convolution suffers from the exact same problem of aliasing, and actually the same methodology and the same fix apply to it as well. So we can take any network off the shelf and better anti-alias it.
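The recipe carries over directly (a sketch reusing the BlurPool2d module above): evaluate the convolution densely, then blur and subsample. In the paper the blur is inserted after the nonlinearity, i.e. Conv, then ReLU, then BlurPool; the ReLU is omitted here for brevity.

    # Conv with stride 2  ->  dense conv + anti-aliased downsampling.
    baseline_conv    = nn.Conv2d(64, 128, 3, padding=1, stride=2)
    antialiased_conv = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1, stride=1),
                                     BlurPool2d(channels=128, stride=2))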
We’re going to do that here for a large-scale ImageNet classification task. On the y-axis we’re going to plot shift-invariance, that is, the probability that two shifts of the same image are classified the same, and on the x-axis we plot accuracy. And here’s a whole bunch of networks, such as VGG, DenseNet, the ResNet family, and MobileNet, on these two axes. Now, when anti-aliasing, before I ran this test my expectation was that we would improve the shift-invariance, but perhaps at a cost in accuracy. But to my surprise, for VGG we actually saw an improvement in both, and I ran this for all the other networks and saw a similar trend as well. So adding anti-aliasing not only improved the shift-invariance, it also improved the accuracy. Additionally, we’ve also seen that it improves stability to other perturbations, such as rotations and scaling, as well as robustness to corruptions such as noise.

So if you’re fine-tuning from a pre-trained ImageNet model, for example ResNet-50, I would ask you to consider going to our website and using our anti-aliased version. And if you have this snippet of code, stride equals two, anywhere in your network, perhaps you can also benefit from adding some anti-aliasing.
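For reference, fine-tuning from the released anti-aliased weights looks roughly like this (assuming the antialiased-cnns package from the project website mentioned in the talk; check its README for the exact API):

    import antialiased_cnns

    # Pre-trained, anti-aliased ResNet-50 (API assumed from the project README).
    model = antialiased_cnns.resnet50(pretrained=True)
    # ...then fine-tune `model` on your task as you would a standard ResNet-50.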
OK, I hope to see you at my poster. Thank you.
