How Cuckoo Filter Can Improve Existing Approximate Matching Techniques

back to overview

Reference

Gupta, V., & Breitinger, F. (2015). How Cuckoo Filter Can Improve Existing Approximate Matching Techniques. Paper presented at the International Conference on Digital Forensics & Cyber Crime (ICDF2C), Seoul, South Korea.

Publication type

Paper in Conference Proceedings

Abstract

In recent years, approximate matching algorithms have be- come an important component in digital forensic research and have been adopted in some other working areas as well. Currently there are several approaches but especially sdhash and mrsh-v2 attract the attention of the community because of their good overall performance (runtime, compression and detection rates). Although both approaches have a quite different proceeding, their final output (the similarity digest) is very similar as both utilize Bloom filters. This data structure was presented in 1970 and thus has been around for a while. Recently, a new data structure was proposed and claimed to be faster and have a smaller memory footprint than Bloom filter -- Cuckoo filter. In this paper we analyze the feasibility of Cuckoo filter for approximate matching algorithms and present a prototype implementation called mrsh-cf which is based on a special version of mrsh-v2 called mrsh-net. We demonstrate that by using Cuckoo filter there is a runtime improve- ment of approximately 37% and also a significantly better false positive rate. The memory footprint of mrsh-cf is 8 times smaller than mrsh-net, while the compression rate is twice than Bloom filter based fingerprint.

Persons

Organizational Units

  • Institute of Information Systems
  • Hilti Chair for Data and Application Security

DOI

http://dx.doi.org/10.1007/978-3-319-25512-5_4