So what is an inverted index and how does it work?

LORY
5 min readMay 28, 2023

Explained in 10 minutes.

What is an Inverted index?

You may have seen the below structure.

So yes, it is about the mapping between words and document lists.

And before the “mapping” happens, articles or sentences will be tokenized and then filtered (both combine into a process called “tokenized”) into a word list. then every word will map into a “document list”.

So that when you search “word”, it can quickly find the linked document ids.

But wait, many questions here, how do the “words” being stored and searched? is the “document list” an array? and how does it store? what if some words don’t even exist in the system and don’t even make sense to go through the mapping?

Let’s address the first question.

Trie

So how many trie problems you have solved in leetcode (you may refer to my previous post for details)😃 ? In short, it is a “word tree” for you to quickly know whether a word exists or not.

So you may ask me “Are you saying here is the use case”?

Hmm not exactly. let’s see an example.

--

--

LORY

A channel which focusing on developer growth and self improvement