
TF-IDF, short for ‘Term Frequency times Inverse Document Frequency’, is a classic relevance scoring equation developed in the 1970s. Its breakthrough was a fast way to measure the relevance of a document to a query, which could then be used to sort search results. Term Frequency measures how much a given document is ‘about’ the query’s terms, and Inverse Document Frequency measures how rare those terms are across the entire corpus. The two are multiplied together, so a document with high aboutness for rare terms scores higher. (See also BM25)
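
To make the multiplication concrete, here is a minimal Python sketch of one textbook variant. It assumes raw term counts for Term Frequency and log(N / df) for Inverse Document Frequency; the function name `tf_idf_score` and the toy corpus are hypothetical, chosen only for illustration. Production search engines typically dampen and smooth both terms, so treat this as a sketch of the idea rather than any particular engine's formula.

```python
import math
from collections import Counter


def tf_idf_score(query_terms, doc_tokens, corpus):
    """Score one document against a query (hypothetical helper).

    Assumed variant: tf = raw count of the term in the document,
    idf = log(N / df), where N is the number of documents and df is
    the number of documents that contain the term.
    """
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        tf = counts[term]                          # 'aboutness'
        df = sum(1 for d in corpus if term in d)   # documents containing the term
        if tf == 0 or df == 0:
            continue                               # term contributes nothing
        idf = math.log(n_docs / df)                # 'rareness'
        score += tf * idf
    return score


# Toy corpus: each document is already tokenized.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
    ["the", "quantum", "computer", "ran", "the", "quantum", "algorithm"],
]
query = ["the", "quantum"]

scores = [tf_idf_score(query, doc, corpus) for doc in corpus]
# Document indices sorted from most to least relevant.
ranked = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```

In this toy corpus, ‘the’ appears in every document, so its rareness is log(3/3) = 0 and it adds nothing to any score. ‘quantum’ appears in only one document, so that document is ranked first.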

Classic Similarity

