DAY 93 - 100 DAYS OF ML CODE: Text Summarization Using Sequence-to-Sequence Models
This is the 93rd day of our #100daysofMLCode challenge, and we are going to see how sequence-to-sequence models are used for text summarization in the paper Get To The Point: Summarization with Pointer-Generator Networks.
Almost every task in Natural Language Processing can be formulated as a sequence-to-sequence task, e.g. translation and text summarization.
The paper observes that neural sequence-to-sequence models can be used for text summarization, but:
- Summaries are sometimes liable to reproduce factual details inaccurately
- Summaries tend to repeat themselves
The paper tries to solve these issues with a hybrid pointer-generator network. This hybrid network can choose to copy words from the source text via pointing while retaining the ability to generate words from a fixed vocabulary. Below is the architecture of the model.
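To make the copy/generate mixture concrete, here is a minimal NumPy sketch of how the final word distribution is formed: a generation probability p_gen interpolates between the vocabulary distribution and the attention distribution over source positions. The sizes, the random stand-in values, the `source_ids` mapping, and the p_gen value are all assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10          # size of the fixed output vocabulary (assumed toy size)
source_len = 5           # number of tokens in the source text
oov_count = 2            # source words not in the fixed vocabulary
extended_size = vocab_size + oov_count  # extended vocabulary: fixed + source OOVs

# P_vocab: generation distribution over the fixed vocabulary
# (in the real model this is the decoder's softmax output)
p_vocab = rng.random(vocab_size)
p_vocab /= p_vocab.sum()

# Attention distribution over source positions (also a softmax in the model)
attention = rng.random(source_len)
attention /= attention.sum()

# Map each source position to an id in the extended vocabulary;
# ids >= vocab_size denote OOV words that can only be copied (assumed mapping)
source_ids = np.array([3, 10, 7, 11, 3])

# p_gen in [0, 1]: probability of generating vs. copying (assumed value;
# the model computes it from the decoder state and context vector)
p_gen = 0.7

# Final distribution:
# P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention over positions where w occurs
final = np.zeros(extended_size)
final[:vocab_size] = p_gen * p_vocab
np.add.at(final, source_ids, (1.0 - p_gen) * attention)  # accumulate copy mass

assert np.isclose(final.sum(), 1.0)  # still a valid probability distribution
```

Note how OOV words (ids 10 and 11 here) receive probability mass only through the copy term, which is exactly why the model can emit source words outside its fixed vocabulary.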
Comparison of Pointer-generator Vs Seq-to-Seq with Attention
Let’s compare the pointer-generator network with the sequence-to-sequence-with-attention system:
- The pointer-generator network makes it easy to copy words from the source text.
- The pointer-generator model is even able to copy out-of-vocabulary words from the source text.
- The pointer-generator model is faster to train, requiring fewer training iterations to reach the same performance as the sequence-to-sequence-with-attention system.
In this way, the pointer-generator network is the best of both worlds, combining both extraction (pointing) and abstraction (generating). In the next