Principal Component Analysis Based Filtering for Scalable, High Precision k-NN Search

Huan Feng; David Eyers; Steven Mills; Yongwei Wu; Zhiyi Huang

doi:10.1109/TC.2017.2748131

Journal article

Peer reviewed

IEEE transactions on computers, Vol.67(2), pp.252-267

01/02/2018

DOI: https://doi.org/10.1109/TC.2017.2748131

Keywords: Algorithm design and analysis; approximate knn search; data filtering; Estimation; Euclidean distance; Filtering; K nearest neighbours; multicore; Multicore processing; parallel algorithms; Principal component analysis; Scalability

Abstract

Approximate k Nearest Neighbours (AkNN) search is widely used in domains such as computer vision and machine learning. However, AkNN search in high-dimensional datasets does not scale well on multicore platforms, due to its large memory footprint. Parallel AkNN search using space subdivision for filtering helps reduce the memory footprint, but its loss of precision is unstable. In this paper, we propose a new data filtering method, PCAF, for parallel AkNN search based on principal component analysis. PCAF improves on previous methods, demonstrating sustained, high scalability for a wide range of high-dimensional datasets on both Intel and AMD multicore platforms. Moreover, PCAF maintains highly precise AkNN search results.

Details

Record Identifier: 9926515374101891
Title: Principal Component Analysis Based Filtering for Scalable, High Precision k-NN Search
Creators: Huan Feng; David Eyers; Steven Mills; Yongwei Wu; Zhiyi Huang
Publication Details: IEEE transactions on computers, Vol.67(2), pp.252-267
Academic Unit: Computer Science
Publisher: IEEE
Date published ; e-published: 01/02/2018
Language: English
Resource Type: Journal article
