
DAY 93-100 DAYS MLCODE: Text Summarization using Sequence-To-Sequence models

February 13, 2019 · 100-Days-Of-ML-Code · blog

This is the 93rd day of our #100daysofMLCode challenge, and we are going to see how sequence-to-sequence models are used for text summarization in the paper Get To The Point: Summarization with Pointer-Generator Networks.

Almost all tasks in Natural Language Processing can be formulated as sequence-to-sequence tasks, e.g. translation, text summarization, etc.

The paper notes that neural sequence-to-sequence models can be used for text summarization, but they suffer from two issues:

  • Summaries are sometimes liable to reproduce factual details inaccurately
  • Summaries tend to repeat themselves

The paper addresses these issues with a hybrid network, the pointer-generator network. It can choose to copy words from the source text via pointing while retaining the ability to generate words from a fixed vocabulary. Below is the architecture of the model (image source: paper).

Pointer-generator model architecture
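Concretely, at each decoder step the model computes a generation probability p_gen and uses it to blend the generator's vocabulary distribution with the attention (copy) distribution over the source positions. The snippet below is a simplified NumPy sketch of that blending step, as I understand it from the paper; the function name, shapes, and toy numbers are my own illustrative assumptions, not the paper's code.

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, source_ids, extended_vocab_size):
    """Blend the vocabulary distribution with the copy (attention) distribution:
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum_{i: w_i = w} a_i
    `source_ids` maps each source position to an id in the extended vocabulary
    (fixed vocab plus per-article OOV words). Illustrative sketch only."""
    # Weighted vocabulary distribution, padded with zeros for the OOV slots.
    final = np.zeros(extended_vocab_size)
    final[: len(vocab_dist)] = p_gen * vocab_dist
    # Scatter-add the weighted attention weights onto their source-word ids,
    # so repeated source words accumulate probability mass.
    np.add.at(final, source_ids, (1.0 - p_gen) * attn_dist)
    return final

# Toy example: fixed vocab of 5 words, 4 source tokens, one article OOV word (id 5).
vocab_dist = np.array([0.1, 0.4, 0.2, 0.2, 0.1])  # generator softmax over fixed vocab
attn_dist = np.array([0.5, 0.2, 0.2, 0.1])        # attention over source positions
source_ids = np.array([3, 5, 1, 3])               # source tokens mapped to extended ids
print(final_distribution(0.7, vocab_dist, attn_dist, source_ids, 6))
```

Because the two distributions each sum to one, the blended result is still a valid probability distribution over the extended vocabulary.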

Comparison of Pointer-Generator vs. Seq-to-Seq with Attention

Let’s compare the sequence-to-sequence-with-attention system with the pointer-generator network:

  1. The pointer-generator network makes it easy to copy words from the source text.
  2. The pointer-generator model is even able to copy out-of-vocabulary words from the source text (see the sketch after this list).
  3. The pointer-generator model is faster to train, requiring fewer training iterations to achieve the same performance as the sequence-to-sequence attention system.
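Point 2 is worth illustrating: because the copy distribution is defined over source positions, the model can emit a word that is not in its fixed vocabulary simply by pointing at it. Below is a hypothetical sketch of how decoded ids in the extended vocabulary might map back to words; the helper and variable names are illustrative, not taken from the paper's code.

```python
def id_to_word(token_id, vocab, article_oovs):
    """Map a decoded id back to a word. Ids beyond the fixed vocabulary
    index into the per-article list of out-of-vocabulary words."""
    if token_id < len(vocab):
        return vocab[token_id]
    return article_oovs[token_id - len(vocab)]

# Fixed vocabulary of 5 words plus one article-specific OOV word.
vocab = ["<unk>", "the", "cat", "sat", "on"]
article_oovs = ["Kandahar"]                  # OOV word found in the source article
print(id_to_word(3, vocab, article_oovs))    # -> "sat" (generated from the fixed vocab)
print(id_to_word(5, vocab, article_oovs))    # -> "Kandahar" (copied from the source)
```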

In this way, the pointer-generator network offers the best of both worlds, combining extraction (pointing) and abstraction (generating). In the next blog, we’ll see how the code works.