ODYS: A Massively-Parallel Search Engine Using a DB-IR Tightly-Integrated Parallel DBMS

Kyu-Young Whang
Department of Computer Science, KAIST, Korea
December 6, 2012, 11:50 ~ 12:30

Abstract :

Recently, parallel search engines have been implemented on top of scalable distributed file systems (e.g., GFS). However, we claim that building a massively-parallel search engine on a parallel DBMS can be an attractive alternative, since a parallel DBMS supports a higher-level (i.e., SQL-level) interface than a distributed file system while still providing scalability. In this paper, we propose a new approach to building a massively-parallel search engine using a DB-IR tightly-integrated parallel DBMS and demonstrate its commercial-level scalability and performance. In addition, we present a performance model (analytic and experimental) for the parallel search engine, as well as a query model that allows us to treat various types of queries in a uniform manner. We have built a five-node parallel search engine according to the proposed architecture using a DB-IR tightly-integrated DBMS. Through extensive experiments, we show the correctness of the performance model by comparing its output with the experimental results from the five-node engine. The performance model projects that ODYS can handle 1 billion queries per day over 30 billion web pages using only 43,472 nodes with an average query response time of 211 ms, which is comparable to or even better than that of commercial search engines.
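
To put the headline figures in more familiar units, the following is a rough back-of-the-envelope sketch, not the performance model from the talk. It converts 1 billion queries per day into an average queries-per-second rate and then applies Little's law (arrival rate times response time) to estimate how many queries are in flight at once. It assumes load is spread evenly over the day, which real traffic is not, and the function names are made up for this illustration.

```python
# Back-of-the-envelope conversion of the abstract's headline figures.
# Assumption: queries arrive at a uniform rate over the day (no peaks).

QUERIES_PER_DAY = 1_000_000_000   # from the abstract
AVG_RESPONSE_S = 0.211            # 211 ms, from the abstract
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(queries_per_day: int) -> float:
    """Average arrival rate in queries per second."""
    return queries_per_day / SECONDS_PER_DAY


def queries_in_flight(qps: float, response_s: float) -> float:
    """Little's law: mean number in system = arrival rate * mean time in system."""
    return qps * response_s


if __name__ == "__main__":
    qps = average_qps(QUERIES_PER_DAY)
    in_flight = queries_in_flight(qps, AVG_RESPONSE_S)
    print(f"average load: {qps:,.0f} queries/s")
    print(f"average queries in flight: {in_flight:,.0f}")
```

Under these assumptions the average load works out to about 11,574 queries per second, with roughly 2,442 queries being processed at any moment across the whole system. Peak-hour figures would be higher.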


Research Activities :

  • Object-Relational Database Systems, XML and Web Databases, IR and Search Engines
  • Data Warehousing and Data Mining, Advanced Storage Systems, Spatial Databases and GIS
  • Distributed Systems, P2P, Web Services, Multimedia Information Systems, Main-Memory Database Systems, Light-weight Database Systems, Sensor Networks and Stream Databases


Honors and Awards :

  • 2008: KAIST Distinguished Professor
  • 2007: President, Korean Institute of Information Scientists and Engineers
  • 2003-2009: Editor-in-Chief, The VLDB Journal
  • 2009: ACM Fellow
  • 2007: IEEE Fellow