Ranking Search Results by Trending Hashtags with a Zero-Shot Classifier

An evening after chores are done is a great time to watch a movie. You get the family together and you go to your favorite streaming service and search for family movies. If it’s December you probably expect to see some Christmas movies in the search results, and if the Olympics are soon you probably would not be surprised to see Cool Runnings listed. These two examples are predictable because we know when Christmas and the Olympics are – but that’s not always the case. To make movie search results timely and relevant we need to be able to consider what topics are currently trending.

Classifying movies based on a trending topic can be difficult because we don’t know what topic might be trending tomorrow. To train a classification model we need a set of labels, but we can’t see into the future to know all of the labels we will need. So how can we make a classifier?

With a zero-shot classifier we can classify text using labels that weren’t available at training time. Given an arbitrary sequence of text, a set of candidate labels, and a hypothesis template, a zero-shot classifier can perform multi-class classification. For each candidate label, the classifier substitutes the label into the hypothesis template. The outputs are values between 0 and 1 that indicate the probability that the text belongs to each respective label (note that unless we opt into multi-label classification, the scores will sum to 1).

Zero-shot Classifier

A zero-shot classifier gives us the ability to classify movies into categories that were not known at training time. Let’s look at an example. The movie Outbreak from 1995 is about doctors trying to find a cure for a deadly virus that’s spreading in a town. Let’s see if we can classify the movie using its tag line. We are going to use a Hugging Face transformers pipeline.

from transformers import pipeline

classifier = pipeline("zero-shot-classification")

sequence = "This animal carries a deadly virus… and the greatest medical crisis in the world is about to happen."
candidate_labels = ['pandemic', 'christmas', 'olympics']
hypothesis_template = "This text is about {}."

print(classifier(sequence, candidate_labels, hypothesis_template=hypothesis_template))

And the output:

{'labels': ['pandemic', 'christmas', 'olympics'],
'scores': [0.9605199098587036, 0.0002500491391401738, 0.00024917611153796315],
'sequence': 'This animal carries a deadly virus… and the greatest medical crisis in the world is about to happen.'}

From the output we see that the sequence was classified as pandemic with 96% confidence. The other two candidate labels both come in a distant second and third. The model seems pretty confident this movie is about a pandemic. That’s what we expected, but how does this work?

A zero-shot classifier uses natural language inference (NLI), sometimes referred to as recognizing textual entailment (RTE). NLI is a natural language processing (NLP) task in which we try to determine the relationship between two sentences: one is called the premise and the other is called the hypothesis. The relationship between the pair is either entailment, contradiction, or neutral (the sentences are irrelevant to each other). The SNLI Corpus is one labeled dataset that can be used to train the NLI model used by a zero-shot classifier.

Here are a few sentence pair examples from the SNLI training data:

| Text | Consensus Judgment | Hypothesis |
| --- | --- | --- |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A black race car starts up in front of a crowd of people. | contradiction | A man is driving down a lonely road. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
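Under the hood, the pipeline pairs the sequence with each filled-in hypothesis, takes each pair’s entailment logit from the NLI model, and (in the default single-label mode) softmaxes those logits across the candidate labels. A minimal sketch of that last step, using made-up logits rather than a real model:

```python
import math

# Hypothetical entailment logits, one per candidate label, as an NLI model
# might produce for the Outbreak tag line (the values here are made up).
entailment_logits = {"pandemic": 5.2, "christmas": -3.1, "olympics": -3.1}

# Softmax across labels turns the logits into scores that sum to 1,
# matching the pipeline's default single-label behavior.
exps = {label: math.exp(logit) for label, logit in entailment_logits.items()}
total = sum(exps.values())
scores = {label: e / total for label, e in exps.items()}
```

Because the softmax is taken across labels, a sequence that strongly entails one hypothesis pushes the other labels’ scores toward zero, which is exactly the shape of the Outbreak output above.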

Putting it Together

This approach was presented in my 2021 Berlin Buzzwords talk titled “Applied MLOps to Maintain Model Freshness on Kubernetes” – here are the slides. In that talk, we described using a zero-shot classifier to classify movies and using those results to rank the search results. The code for this project implements an Apache Flink application that consumes hashtags from Twitter. The hashtags are aggregated and sorted to determine the most popular, or trending, hashtags. Those hashtags are then used as the candidate labels for the classifier, with an indexed list of movie overviews as the sequences. The classification result gives us a value on which we can sort search results.
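As a rough Python sketch of that aggregation step (the project itself does this in an Apache Flink application; the names below are illustrative):

```python
from collections import Counter

def trending_hashtags(hashtags, top_n=3):
    # Count the hashtags observed in a window and keep the most frequent;
    # these become the candidate labels for the zero-shot classifier.
    return [tag for tag, _ in Counter(hashtags).most_common(top_n)]

window = ["#christmas", "#nfl", "#christmas", "#sale", "#christmas", "#nfl"]
trending = trending_hashtags(window)  # "#christmas" is the most frequent
```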

For example, as illustrated in the table below, if “Christmas” is trending and we search for “Family” movies, we will be shown family Christmas movies first.

| “Family” movies | “Family” movies with “Christmas” trending |
| --- | --- |
| The New Adventures of Pinocchio | A Christmas Carol |
| Raising the Bar | Unaccompanied Minors |
| A Christmas Star | Jingle All the Way |
| Christmas Miracle | Jingle All the Way 2 |
| Soul Surfer | The Grinch |
| Air Bud: Golden Receiver | A Christmas Snow |
| Playing with Fire | The 12 Dogs of Christmas |
| | The Nutcracker |
| | Saving Christmas |
| | The Swan Princess Christmas |

The zero-shot classifier is deployed to Kubernetes using KFServing from Kubeflow. This encapsulation provides a lot of useful functionality and saves us from re-inventing the wheel each time we deploy a machine learning service to Kubernetes. Models deployed using KFServing are exposed over a REST interface, making inference calls from other components easy.
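As an illustration, a call against KFServing’s V1 REST protocol might look like the sketch below. The host name, model name, and request payload shape are placeholders, not the project’s actual interface; only the `/v1/models/<name>:predict` path and `{"instances": [...]}` body come from the KFServing V1 data plane.

```python
import json
import urllib.request

def build_request(host, model_name, sequence, labels):
    # KFServing's V1 data plane exposes models at /v1/models/<name>:predict
    # and accepts a JSON body of the form {"instances": [...]}.
    url = f"http://{host}/v1/models/{model_name}:predict"
    payload = {"instances": [{"sequence": sequence, "candidate_labels": labels}]}
    return url, payload

def predict(host, model_name, sequence, labels):
    url, payload = build_request(host, model_name, sequence, labels)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```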

By scoring the model’s search results against a list of human judgments we can establish a baseline for our model’s performance. We can then measure improvements against this baseline as we train new versions of the model. When we are satisfied with a new model’s performance we can simply deploy it using KFServing and make it available in our workflow.
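One common way to turn such human judgments into a single baseline number is NDCG over the ranked results; the sketch below uses made-up relevance judgments.

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: relevant results near the top count more.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(ranked_relevances):
    # Normalize by the ideal (best possible) ordering of the same judgments.
    return dcg(ranked_relevances) / dcg(sorted(ranked_relevances, reverse=True))

# Human relevance judgments (0-3) for each returned movie, in ranked order.
baseline = ndcg([3, 2, 3, 0, 1])
```

A perfectly ordered result list scores 1.0, so a new model version can be compared against the baseline on the same set of judged queries.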

Try it out!

Check out the code to try it for yourself! Just follow the steps in the README to run it locally using docker-compose to start capturing hashtags, classifying movies, and examining search results.

Do get in touch if you need help using machine learning approaches like this to improve search quality!

image from Media Vectors by Vecteezy