Performance Bottlenecks in Manycore Systems: A Case Study on Large Scale Feature Matching within Image Collections

Xiaoxin Tang; Steven Mills; David Eyers; Kai-Cheung Leung; Zhiyi Huang; Minyi Guo; IEEE

doi:10.1109/HPCC.and.EUC.2013.140

Back

Conference proceeding

Performance Bottlenecks in Manycore Systems: A Case Study on Large Scale Feature Matching within Image Collections

Xiaoxin Tang, Steven Mills, David Eyers, Kai-Cheung Leung, Zhiyi Huang, Minyi Guo and IEEE

2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), pp.985-995

01/01/2013

DOI: https://doi.org/10.1109/HPCC.and.EUC.2013.140

Abstract

Computer Science

Computer Science, Information Systems

Computer Science, Theory & Methods

Engineering

Engineering, Electrical & Electronic

Science & Technology

Technology

In memory-intensive algorithms, the problem size is often so large that it cannot fit into the cache of a CPU, and this may result in an excessive number of cache misses, a bottleneck that can easily make seemingly embarrassingly-parallel algorithms such as feature-matching unscalable in many-core systems. To solve this bottleneck, this paper proposes a general Divide-and-Merge methodology, which divides the feature space into several small sub-spaces, so that the shared resources in each sub-space can be satisfied without causing bottlenecks. Experimental results have shown that the Divide-and-Merge methodology reduces the L3 cache misses and time spent on memory-allocation-related system calls, resulting in a 211% performance improvement on an AMD 64-core CPU machine, and 57% and 16% performance improvements on AMD and Intel 16-core machines respectively. Performance results on a modern GPU also show that a well-tuned algorithm with time complexity of O(F-2) is able to defeat a state-of-the-art O(F-1.5) algorithm by 188% for our real-world dataset, which again highlights the huge performance impact of the memory system.

Metrics

1 Record Views

Details

Record Identifier: 9926549634401891
Title: Performance Bottlenecks in Manycore Systems: A Case Study on Large Scale Feature Matching within Image Collections
Creators: Xiaoxin Tang
Steven Mills
David Eyers
Kai-Cheung Leung
Zhiyi Huang
Minyi Guo
IEEE
Publication Details: 2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), pp.985-995
Academic Unit: Computer Science
Publisher: IEEE
Date published ; e-published: 01/01/2013
Language: English
Resource Type; Subtype: Conference proceeding

Performance Bottlenecks in Manycore Systems: A Case Study on Large Scale Feature Matching within Image Collections

Abstract

Related links

Metrics

Details

Usage Policy