I learned this data structure from a data scientist during an interview process

LORY
9 min readJun 11, 2023

K-D tree

The interview

These few weeks I went through a couple of interviews, and surprisingly many are data scientists who wanted to join as a Software Engineer (and I don’t know why) role.

So one of the candidates was a data scientist who is very strong in algorithms. we went through 2 leet code Medium (DFS, UF, and graph) then talked about the search improvement he did, from database full-text search (we both agree should NOT use it :), pls drop your comment if disagree), inverted index in ES, and also vector database.

Here is something new to me.

“vector search, I haven’t used it before, could you help me understand how it works and how it helped in your product? thank you.” I asked.

“Yes sure, it is basically to find nearest neighbors, input is a vector, and return a list of similar datasets (closest vectors)”. he explained clearly.

“Ok, thanks. As I understand from the brute force KNN algorithm should be very slow when the dataset goes huge, it computes every 2 points distance and sorting, even if you put min-heap it should not help much it still O(N²)” I said.

“Yes, so we improved it, first using k-d tree and then ANN”. he answered.

“Well, this is something I don’t know, is it a data structure used very often in Machine Learning?”

--

--

LORY

A channel which focusing on developer growth and self improvement