Let’s see how to do conversational response generation using DialoGPT. DialoGPT from Microsoft is a SOTA large-scale pretrained dialogue response generation model for multi-turn conversations.
The responses generated by this model are comparable in quality to human responses under a single-turn conversation Turing test.
Here is the original DialoGPT research paper.
The original source code and instructions for training the model are available in this GitHub repo.
DialoGPT is trained on 147M conversation-like exchanges extracted from Reddit comment chains spanning 2005 through 2017.
DialoGPT was built on the Hugging Face PyTorch transformer and attains performance close to human level in both automatic and human evaluation in single-turn dialogue settings.
DialoGPT’s architecture is based on the GPT-2 model.
GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages.
GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.
It achieves state-of-the-art performance on many language modeling benchmarks and performs rudimentary reading comprehension, machine translation, question answering, and summarization—all without task-specific training.
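To make that objective concrete, here is a minimal sketch of next-word prediction with GPT-2 itself (the model ID and prompt are illustrative, not from the article):

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Score every vocabulary token as a candidate continuation of the prompt
input_ids = tokenizer.encode("The moon orbits the", return_tensors="pt")
with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, sequence_length, vocab_size)

# The logits at the last position rank candidates for the next word
next_token_id = logits[0, -1].argmax().item()
print(tokenizer.decode([next_token_id]))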
Currently, Microsoft has released three models: DialoGPT-small, DialoGPT-medium, and DialoGPT-large (with roughly 117M, 345M, and 762M parameters, respectively).
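All three checkpoints expose the same interface on the Hugging Face hub, so switching sizes only means changing the model ID. A minimal sketch (the walkthrough below uses the small model):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick any of the three released checkpoints; only the ID changes.
model_id = "microsoft/DialoGPT-medium"  # or "microsoft/DialoGPT-small" / "microsoft/DialoGPT-large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)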
Let’s try an interactive DialoGPT conversation:
I strongly recommend using a Colab notebook for this.
Install transformers,
!pip install transformers
We will use the AutoModelForCausalLM class, as DialoGPT is a causal language model.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
Define which DialoGPT model to use. We will use the microsoft/DialoGPT-small model.
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
Now let’s do conversational response generation with DialoGPT interactively for 5 turns,
# Let's chat for 5 lines
for step in range(5):
    # encode the new user input, add the eos_token and return a tensor in PyTorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

    # pretty print last output tokens from bot
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
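The call above decodes greedily, which can make the bot repetitive. If you want more varied replies, model.generate also accepts standard sampling parameters; the values below are illustrative, not tuned:

# Sampling variant of the generate call above (values are illustrative)
chat_history_ids = model.generate(
    bot_input_ids,
    max_length=1000,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,  # sample from the distribution instead of greedy decoding
    top_k=50,        # consider only the 50 most likely next tokens
    top_p=0.95,      # nucleus sampling: keep tokens covering 95% of the probability mass
)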
Here is a conversation with the bot about the Moon,
>> User:Do you know Moon's gravitational force less than earth's.
DialoGPT: I'm pretty sure that's a moon.
>> User:Why there is rabbit shape in moon?
DialoGPT: Why there is a rabbit in the moon?
>> User:yes
DialoGPT: Is this a joke?
>> User:not all
DialoGPT: Some of us are just not as good as others.
>> User:Do you want to go to Moon?
DialoGPT: I'm not a fan of the moon.
My other articles about Huggingface/transformers/BERT,
Faster transformer NLP pipeline using ONNX
Text2TextGeneration pipeline by Huggingface transformers
Question answering using transformers and BERT
How to cluster text documents using BERT
How to do semantic document similarity using BERT
Zero-shot classification using Huggingface transformers
Summarize text document using transformers and BERT
Follow me on Twitter, Instagram, Pinterest, and Tumblr for new post notifications.