R3D3: A doubly opportunistic data structure for compressing and indexing massive data

Máté Nagy, J. Tapolcai, Gábor Rétvári

Research output: Contribution to journalArticle


Opportunistic data structures are used extensively in big data practice to break down the massive storage space requirements of processing large volumes of information. A data structure is called (singly) opportunistic if it takes advantage of the redundancy in the input in order to store it in information-theoretically minimum space. Yet, efficient data processing re-quires a separate index alongside the data, whose size often substantially exceeds that of the compressed information. In this paper, we introduce doubly opportunistic data structures to not only attain best possible compression on the input data but also on the index. We present R3D3 that encodes a bitvector of length n and Shannon entropy H0 to nH0 bits and the accompanying index to nH0(1/2 + O(log C/C)) bits, thus attaining provably minimum space (up to small error terms) on both the data and the index, and supports a rich set of queries to arbitrary position in the compressed bitvector in O(C) time when C = o(log n). Our R3D3 prototype attains several times space reduction beyond known compression techniques on a wide range of synthetic and real data sets, while it supports operations on the compressed data at comparable speed.

Original languageEnglish
Pages (from-to)58-66
Number of pages9
JournalInfocommunications Journal
Issue number2
Publication statusPublished - Jan 1 2019



  • Big data
  • Compressed self-indexes
  • Packet forwarding
  • Succinct and compressed data structures

ASJC Scopus subject areas

  • Computer Science(all)
  • Electrical and Electronic Engineering

Cite this