Massive storage systems in cloud computing applications must store and maintain large-scale data. Existing large-scale data centers contain storage systems holding ZB-level data that are multi-source, heterogeneous, and massive. Such storage systems therefore need to provide efficient and fast approximate query services. A well-designed query service can improve the scalability of the storage systems and provide users with high-quality storage services.

 
Yu Hua of the Information Storage and Optical Display Division, Wuhan National Laboratory for Optoelectronics, proposes NEST, an approximate query scheme based on the locality characteristics of massive data. The scheme uses fast and simple hash computation to capture multi-dimensional metadata information and access patterns. The proposed data structure not only maintains the relationships among data items but also significantly improves query accuracy and load balance. Extensive experimental results demonstrate that NEST achieves a lightweight index structure and provides fast and accurate data query service.
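
The summary above does not specify NEST's hash functions or data layout. As a rough illustration of how cheap hash computations can group items by metadata locality, the following Python sketch uses random-projection locality-sensitive hashing, a generic technique chosen here for illustration. The class name `LocalityIndex`, its parameters, and the sample metadata vectors are all hypothetical; this is not the NEST implementation.

```python
import math
import random
from collections import defaultdict


class LocalityIndex:
    """Toy locality-aware index (hypothetical, for illustration only).

    Items whose metadata vectors are close tend to fall into the same
    bucket, so an approximate query inspects only a few buckets.
    """

    def __init__(self, dim, num_tables=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = bucket_width
        # One random projection direction and offset per hash table.
        self.projections = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_tables)
        ]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _keys(self, vector):
        # Project the vector onto each direction and quantize the result.
        for proj, offset in zip(self.projections, self.offsets):
            dot = sum(p * v for p, v in zip(proj, vector))
            yield math.floor((dot + offset) / self.width)

    def insert(self, name, vector):
        for table, key in zip(self.tables, self._keys(vector)):
            table[key].append((name, tuple(vector)))

    def query(self, vector, k=1):
        # Collect candidates from the matching bucket of every table,
        # then rank only those candidates by exact distance.
        candidates = {}
        for table, key in zip(self.tables, self._keys(vector)):
            for name, stored in table.get(key, ()):
                candidates[name] = stored
        ranked = sorted(candidates, key=lambda n: math.dist(candidates[n], vector))
        return ranked[:k]


# Hypothetical metadata vectors: (size, creation time, modification time),
# already scaled to comparable ranges.
index = LocalityIndex(dim=3)
index.insert("report.doc", (1.0, 2.0, 3.0))
index.insert("backup.tar", (50.0, -40.0, 9.0))
print(index.query((1.2, 2.1, 2.9), k=1))
```

The result is approximate by design: a nearby item is missed whenever it shares no bucket with the query, and using more tables or wider buckets trades extra work for better recall.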
 
This work was published at INFOCOM 2013 and was supported in part by the NSFC under Grant 61173043 and by the National Basic Research 973 Program of China under Grant 2011CB302301.
Result: Average query latency with uniform query requests