Zero-shot classification using Huggingface transformers

Learn how to do zero-shot classification of text using the Huggingface transformers pipeline. Also, see where it fails and how to resolve it.

Now you can do zero-shot classification using the Huggingface transformers pipeline.

The “zero-shot-classification” pipeline takes two parameters: sequence and candidate_labels.

How does the zero-shot classification method work?

  1. The NLP model is trained on a task called Natural Language Inference (NLI).
  2. NLI takes in two sequences, a premise and a hypothesis, and determines whether the premise contradicts the hypothesis, entails it, or neither.
  3. The same NLI concept is applied to zero-shot classification.
  4. The sequence we want to classify is treated as the NLI premise, and each candidate label is turned into a hypothesis.
  5. If the model predicts that the premise entails the hypothesis, we take that as a prediction that the label applies to the text.
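
The steps above can be sketched in plain Python. This is only an illustration of how the premise/hypothesis pairs might be built, not the library's actual code. The helper name build_nli_pairs is hypothetical; the template "This example is {}." is the pipeline's default hypothesis template.

```python
def build_nli_pairs(sequence, candidate_labels, hypothesis_template="This example is {}."):
    """Pair the text (premise) with one hypothesis per candidate label."""
    return [(sequence, hypothesis_template.format(label)) for label in candidate_labels]


pairs = build_nli_pairs(
    "For any budding cricketer, playing with or against MS Dhoni is a big deal",
    ["cricket", "football", "basketball"],
)

# Each pair is what an NLI model would score for entailment.
for premise, hypothesis in pairs:
    print(hypothesis)
```

Each pair would then be scored by the NLI model, and the entailment score for a pair becomes the score of its label.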

Let’s see the pipeline in action

Install transformers in Colab,

!pip install transformers==3.1.0

Import the transformers pipeline,

from transformers import pipeline

Set up the zero-shot-classification pipeline,

classifier = pipeline("zero-shot-classification")

If you want to use GPU,

classifier = pipeline("zero-shot-classification", device=0)
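
If you are not sure whether a GPU is available, a small sketch like the one below can pick the device for you. It assumes PyTorch is the backend; the pipeline treats device=-1 as CPU and device=0 as the first GPU. The pipeline call itself is left as a comment so the snippet runs without downloading a model.

```python
# Assumption: PyTorch is the backend. Fall back to CPU (-1) if PyTorch
# is not installed or no CUDA device is visible.
try:
    import torch
    device = 0 if torch.cuda.is_available() else -1
except ImportError:
    device = -1

print(device)
# classifier = pipeline("zero-shot-classification", device=device)
```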

Now pass the sequence and a list of candidate labels,

sequence = "For any budding cricketer, playing with or against MS Dhoni is a big deal"
candidate_labels = ["cricket", "football", "basketball"]

classifier(sequence, candidate_labels)


{'labels': ['cricket', 'football', 'basketball'],
 'scores': [0.9888357520103455, 0.00587576674297452, 0.005288500338792801],
 'sequence': 'For any budding cricketer, playing with or against MS Dhoni is a big deal'}

Well, it correctly predicts the sport.

Now let’s try the same sequence, but with sports leagues and events as the candidate labels.

sequence = "For any budding cricketer, playing with or against MS Dhoni is a big deal"
candidate_labels = ["Indian Premier league", "English Premier League", "Super Bowl"]

classifier(sequence, candidate_labels)


{'labels': ['Indian Premier league', 'Super Bowl', 'English Premier League'],
 'scores': [0.48798325657844543, 0.27571797370910645, 0.2362987995147705],
 'sequence': 'For any budding cricketer, playing with or against MS Dhoni is a big deal'}

WOW! It also ranks the correct league first, though with a much lower score (about 0.49) than in the previous example.

Multi-class classification

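For multi-class classification, where more than one label can apply to the same text, pass multi_class=True. (Later transformers releases renamed this argument to multi_label.)

Here, each score is independent and lies between 0 and 1, so the scores do not have to sum to 1.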

sequence = "For any budding cricketer, playing with or against MS Dhoni is a big deal"
candidate_labels = ["cricket", "football", "basketball"]

classifier(sequence, candidate_labels, multi_class=True) 


{'labels': ['cricket', 'football', 'basketball'],
 'scores': [0.9787819981575012,
 'sequence': 'For any budding cricketer, playing with or against MS Dhoni is a big deal'}
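
Here is a rough sketch of why the scores behave differently in the two modes. It assumes the default mode applies a softmax to the entailment logits across all labels, while multi_class=True applies a softmax to entailment versus contradiction separately for each label. The logits below are made-up numbers for illustration, not real model output.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical (contradiction, entailment) logits per label.
nli_logits = {
    "cricket": (-2.0, 3.0),
    "football": (1.5, -1.0),
    "basketball": (2.0, -1.5),
}

# Default mode: entailment logits compete across labels, scores sum to 1.
entailment = [logits[1] for logits in nli_logits.values()]
single = dict(zip(nli_logits, softmax(entailment)))

# multi_class=True: each label is scored on its own,
# entailment vs. contradiction, so the scores need not sum to 1.
multi = {label: softmax(list(logits))[1] for label, logits in nli_logits.items()}

print(single)
print(multi)
```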

Sentiment classification:

Let’s see a sentiment classification example,

sequence = "IPL 2020: MS Dhoni loses cool again, confronts umpires in clash against Rajasthan Royals"
candidate_labels = ["positive", "negative"]

classifier(sequence, candidate_labels)

{'labels': ['negative', 'positive'],
 'scores': [0.9861539006233215, 0.01384611614048481],
 'sequence': 'IPL 2020: MS Dhoni loses cool again, confronts umpires in clash against Rajasthan Royals'}

Let’s see when this pipeline fails:

It fails to clearly distinguish between ‘positive’ and ‘negative’ sentiment in this example,

sequence = "Tech Companies in India are having problem raising funds. Nevertheless, they are doing great with customer acquisition"
candidate_labels = ["positive", "negative"]

classifier(sequence, candidate_labels)

The scores are both close to 0.5, so the model cannot clearly separate ‘positive’ from ‘negative’ sentiment here.

{'labels': ['negative', 'positive'],
 'scores': [0.5629852414131165, 0.4370148181915283],
 'sequence': 'Tech Companies in India are having problem raising funds. Nevertheless, they are doing great with customer acquisition'}
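
One simple way to catch cases like this in code is to check how far apart the top two scores are. The sketch below is only a suggestion: the helper is_ambiguous and the 0.2 margin are my own hypothetical choices, not part of the transformers library. The scores are copied from the two sentiment outputs above.

```python
def is_ambiguous(result, min_margin=0.2):
    """Return True when the top two scores are closer than min_margin."""
    scores = sorted(result["scores"], reverse=True)
    return len(scores) > 1 and (scores[0] - scores[1]) < min_margin


# Scores taken from the two sentiment examples in this post.
clear = {"labels": ["negative", "positive"],
         "scores": [0.9861539006233215, 0.01384611614048481]}
unclear = {"labels": ["negative", "positive"],
           "scores": [0.5629852414131165, 0.4370148181915283]}

print(is_ambiguous(clear))
print(is_ambiguous(unclear))
```

The right margin depends on your data, so treat 0.2 as a starting point to tune.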

Let’s improve the results by using a hypothesis template that is more specific to the setting of review sentiment analysis. 

sequence = "Tech Companies in India are having problem raising funds. Nevertheless, they are doing great with customer acquisition"
candidate_labels = ["positive", "negative"]
hypothesis_template = "The sentiment of this review is {}."

classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template)

Now it gives a clearer result,

{'labels': ['positive', 'negative'],
 'scores': [0.7728187441825867, 0.22718125581741333],
 'sequence': 'Tech Companies in India are having problem raising funds. Nevertheless, they are doing great with customer acquisition'}


Let me know in the comments section if you face any issues.

You can follow me on Twitter and Instagram to get notified of new posts.

Happy learning 🙂

By Satyanarayan Bhanja

Machine learning engineer
