An Index Scheme for Similarity Search on Cloud Computing using MapReduce over Docker Container

DT Tri Nguyen, Chan Ho Yong, Xuan Qui Pham, Huu Quoc Nguyen, Ton Thi Kim Loan, Eui Nam Huh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

We consider the problem of similarity search over large datasets in a distributed environment. The proposed framework integrates the VP-Tree algorithm on top of the MapReduce framework to achieve good performance while meeting the system's scalability and fault-tolerance requirements as data scales up. Since the VP-Tree algorithm was originally implemented for partitioning and searching data on local disk, we propose a new approach for using it in a parallel environment. The key point of the VP-Tree algorithm is that it distributes similar data points into groups, thereby reducing the number of data points that need to be scanned during the search stage. Consequently, the response time of the entire system is improved. In addition, we use the open-source computer vision library OpenCV to detect similarity among images in the dataset. We evaluate the performance of our proposed framework using synthetic data to demonstrate the benefits of our approach. The experiment shows that our proposed framework achieves a 57% improvement in response time compared with running the search job on the traditional Hadoop framework. We also compare the running time of our application on Docker containers against a VM-based environment. The results indicate that deploying our system on Docker containers provides higher performance than the VM-based environment in terms of response time.
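
To illustrate how a VP-Tree groups similar points and lets a search skip whole groups, the following is a minimal single-machine sketch in Python. It is not the authors' implementation and does not include the MapReduce, OpenCV, or Docker parts of the framework. The names `VPNode`, `build_vp_tree`, `range_search`, and `euclidean`, the use of Euclidean distance, and the random choice of vantage point are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class VPNode:
    vantage: Point               # the vantage point stored at this node
    radius: float                # median distance from vantage to the rest
    inside: Optional["VPNode"]   # points with distance <= radius
    outside: Optional["VPNode"]  # points with distance > radius


def build_vp_tree(points: Sequence[Point],
                  rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split points into 'near' and 'far' groups around a vantage point."""
    if not points:
        return None
    rng = rng or random.Random(0)  # fixed seed: an assumption, for repeatability
    rest = list(points)
    vantage = rest.pop(rng.randrange(len(rest)))
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    dists = sorted((euclidean(vantage, p), p) for p in rest)
    radius = dists[len(dists) // 2][0]  # median distance is the split threshold
    near = [p for d, p in dists if d <= radius]
    far = [p for d, p in dists if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(near, rng), build_vp_tree(far, rng))


def range_search(node: Optional[VPNode], query: Point, tau: float,
                 out: Optional[List[Point]] = None) -> List[Point]:
    """Return every stored point within distance tau of query."""
    if out is None:
        out = []
    if node is None:
        return out
    d = euclidean(query, node.vantage)
    if d <= tau:
        out.append(node.vantage)
    # Triangle inequality: the 'inside' group can hold a match only if the
    # query ball reaches into the vantage ball, and likewise for 'outside'.
    if d - tau <= node.radius:
        range_search(node.inside, query, tau, out)
    if d + tau >= node.radius:
        range_search(node.outside, query, tau, out)
    return out


if __name__ == "__main__":
    gen = random.Random(1)
    data = [(gen.random(), gen.random()) for _ in range(1000)]
    tree = build_vp_tree(data)
    hits = range_search(tree, (0.5, 0.5), 0.05)
    print(len(hits), "points within 0.05 of (0.5, 0.5)")
```

When the query radius is small relative to a node's split radius, one of the two branches is skipped entirely, which is the source of the reduced scanning described in the abstract. How such a tree would be partitioned across MapReduce tasks is not specified here and is left out of the sketch.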

Original language: English
Title of host publication: ACM IMCOM 2016
Subtitle of host publication: Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450341424
Publication status: Published - 4 Jan 2016
Event: 10th International Conference on Ubiquitous Information Management and Communication, IMCOM 2016 - Danang, Viet Nam
Duration: 4 Jan 2016 to 6 Jan 2016

Publication series

Name: ACM IMCOM 2016: Proceedings of the 10th International Conference on Ubiquitous Information Management and Communication

Conference

Conference: 10th International Conference on Ubiquitous Information Management and Communication, IMCOM 2016
Country/Territory: Viet Nam
City: Danang
Period: 4/01/16 to 6/01/16

Bibliographical note

Publisher Copyright:
© 2016 ACM.

Keywords

  • Index scheme
  • MapReduce
  • Similarity search
