Deep Learning with Tensorflow – Recursive Neural Tensor Networks

Hello, and welcome! In this video, we’ll provide an overview of Recursive Neural Tensor Networks, as well as the natural language processing problems that they’re able to solve.

Sentiment Analysis is the task of identifying and extracting subjective information, like emotion or opinion, from a source material. For example, this might involve analyzing a Twitter feed to determine which tweets express a positive feeling, which express a negative feeling, and which are neutral.

In order to classify sentences into different sentiment classes, we’ll need a dataset to use for training. One potential dataset is the Stanford Sentiment Treebank. Each data point is the syntax tree of a Rotten Tomatoes review, and the tree itself and all of its subtrees are labeled with a sentiment value from 1 to 25, where 25 is the most positive review and 1 is the most negative. The dataset was created by Stanford researchers, who used Amazon’s Mechanical Turk platform to assign the values.
Recursive neural models can be used for the sentiment analysis problem. These models are characterized by their use of vector representations: vectors represent the words, as well as every sub-sentence in an input’s syntax tree. The word representations are trained with the model, while the representations of sub-sentences are calculated with a compositionality function, applied bottom-up according to the input’s parse tree. All of the resulting vectors are fed to the same softmax classifier to determine their sentiment.
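As a rough sketch of that pipeline, the snippet below computes a vector for every node of a tiny parse tree bottom-up and scores the root with a shared softmax classifier. The dimensions, random parameters, example words, and the tree itself are placeholders of our own, and the composition shown is just the basic one described next:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_classes = 4, 5                       # toy embedding size and sentiment classes

# Hypothetical word vectors and classifier weights; in the real model both
# are trained, but here they are random placeholders.
word_vecs = {w: rng.standard_normal(d) for w in ["not", "very", "good"]}
W_s = rng.standard_normal((n_classes, d)) * 0.1   # the one shared softmax classifier
W = rng.standard_normal((d, 2 * d)) * 0.1         # basic composition weights

def compose(b, c):
    # Basic composition, spelled out in detail below: p = tanh(W [b; c]).
    return np.tanh(W @ np.concatenate([b, c]))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def node_vector(node):
    """Compute a node's vector bottom-up: leaves are words, inner nodes
    are (left, right) pairs combined by the compositionality function."""
    if isinstance(node, str):
        return word_vecs[node]
    left, right = node
    return compose(node_vector(left), node_vector(right))

tree = ("not", ("very", "good"))              # a tiny parse tree
root_sentiment = softmax(W_s @ node_vector(tree))
# Here only the root is scored; in the model, the same W_s scores every node.
```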
The choice of compositionality function is important, so we’ll present three different types of recursive models, each with a different function.

The first model we’ll look at is the basic Recursive Neural Network. To compute a composition, we start with the two word vectors we want to combine, which we’ll call “b” and “c”. We concatenate “b” and “c” to form a single vector of dimension “two d”. This vector is multiplied by the “d” by “two d” weight matrix “W”, which is the model’s main training parameter. Then a nonlinearity, in this case the hyperbolic tangent function, is applied element-wise to the resulting vector. As a brief note, we’ve omitted the bias for simplicity.
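Spelled out as a minimal NumPy sketch (with an arbitrary dimension d, a randomly initialized W, and the bias omitted as in the video):

```python
import numpy as np

d = 4                                     # embedding size, chosen arbitrarily
rng = np.random.default_rng(0)

# W is the model's main trainable parameter, of shape d by 2d.
W = rng.standard_normal((d, 2 * d)) * 0.1

def compose(b, c):
    """Basic recursive NN composition: p = tanh(W [b; c])."""
    bc = np.concatenate([b, c])           # stack b and c into one 2d-vector
    return np.tanh(W @ bc)                # element-wise hyperbolic tangent

b, c = rng.standard_normal(d), rng.standard_normal(d)
p = compose(b, c)                         # the parent vector is d-dimensional again
```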
Other models use this compositionality function too, like the recursive autoencoder and recursive auto-associative memories. As you can see, the words only interact implicitly, through the nonlinearity, so the compositionality function may not be consistent with linguistic principles. The model also ignores reconstruction loss, since the dataset is large enough to compensate.
Now let’s move on to Matrix-Vector Recursive Neural Networks. This type of model is a linguistically motivated improvement over the basic recursive neural network. The big change is that every word is now represented by both a vector and a “d” by “d” matrix. The compositionality function that you see here takes four objects: lowercase “b” and “c” are the word vectors, while uppercase “B” and “C” are the respective matrices. Lowercase “p1” is the resulting vector, and uppercase “P1” is the resulting matrix. Just as with the basic recursive neural network, a matrix “W” is multiplied with a matrix created from the words’ representations, but in this case that matrix depends much more directly on the relationship between the two input words. The problem with this model is that the number of trainable parameters becomes too large as the vocabulary size increases.
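The sketch below follows the Matrix-Vector formulation from Socher et al.’s MV-RNN paper: each word’s matrix transforms the other word’s vector before the usual composition, and a second weight matrix “W_M” composes the two word matrices. The dimensions and random initialization are again placeholders:

```python
import numpy as np

d = 4
rng = np.random.default_rng(0)

W   = rng.standard_normal((d, 2 * d)) * 0.1   # combines the two transformed vectors
W_M = rng.standard_normal((d, 2 * d)) * 0.1   # combines the two word matrices

def compose_mv(b, B, c, C):
    """Matrix-Vector composition: each word's matrix first transforms the
    other word's vector, and the matrices themselves are also composed."""
    p1 = np.tanh(W @ np.concatenate([C @ b, B @ c]))  # resulting vector p1
    P1 = W_M @ np.vstack([B, C])                      # resulting d-by-d matrix P1
    return p1, P1

b, c = rng.standard_normal(d), rng.standard_normal(d)
B, C = rng.standard_normal((d, d)), rng.standard_normal((d, d))
p1, P1 = compose_mv(b, B, c, C)
```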
The Recursive Neural Tensor Network, or RNTN, uses a powerful fixed-size compositionality function that only takes the words’ vectors as arguments. Words are no longer parameterized by individual matrices; instead, the model adds a “two d” by “two d” by “d” tensor to the function, and this tensor is also trained with the model. Each of the “d” slices captures a different type of composition, so intuitively the RNTN is more capable of learning than the basic recursive neural network.
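Here is a minimal sketch of that composition, assuming the formulation p = tanh([b;c]ᵀ V [b;c] + W [b;c]) from the Socher et al. paper, with placeholder dimensions:

```python
import numpy as np

d = 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d, 2 * d)) * 0.1
# The tensor has d slices, each of shape 2d by 2d; it is trained with the model.
V = rng.standard_normal((d, 2 * d, 2 * d)) * 0.1

def compose_rntn(b, c):
    """RNTN composition: p = tanh([b;c]^T V [b;c] + W [b;c])."""
    bc = np.concatenate([b, c])                          # 2d-vector
    tensor_term = np.einsum("i,kij,j->k", bc, V, bc)     # one scalar per slice
    return np.tanh(tensor_term + W @ bc)

b, c = rng.standard_normal(d), rng.standard_normal(d)
p = compose_rntn(b, c)                                   # parent vector of size d
```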
It turns out that RNTNs outperform the known alternative methods. On the Stanford Sentiment Treebank, the RNTN achieved over eighty-seven percent accuracy in positive/negative phrase classification, and over eighty-five percent accuracy in positive/negative sentence classification. That sentence-classification accuracy is more than three percent higher than that of plain recursive networks.
Recursive Neural Tensor Networks can also be used in other applications, such as parsing natural scenes and parsing natural language, due to the recursive nature of these problems. If you’re interested in learning more about RNTNs, we recommend you follow the link here to a great article by Socher et al. By now, you should understand the intuition behind recursive neural models and Recursive Neural Tensor Networks. Thank you for watching this video.
