Performance Issues About Context-Triggered Piecewise Hashing

Reference

Breitinger, F., & Baier, H. (2012). Performance Issues About Context-Triggered Piecewise Hashing. Paper presented at the Digital Forensics and Cyber Crime.

Publication type

Paper in Conference Proceedings

Abstract

A hash function is a well-known method in computer science to map arbitrarily large data to bit strings of a fixed short length. This property is used in computer forensics to identify known files on the basis of their hash values. As of today, hash values of files are generated in a preprocessing step and stored in a database; typically a cryptographic hash function like MD5 or SHA-1 is used. Later the investigator computes hash values of the files found on a storage medium and performs lookups in this database. Due to the security properties of cryptographic hash functions, they cannot be used to identify similar files. Therefore Jesse Kornblum proposed a similarity-preserving hash function to identify similar files. This paper discusses the efficiency of Kornblum's approach. We present some enhancements that increase the performance of his algorithm by 55% if applied to a real-life scenario. Furthermore, we discuss some characteristics of a sample Windows XP system that are relevant to the performance of Kornblum's approach.
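
The known-file lookup described in the abstract, and the reason it fails for similar files, can be illustrated with a minimal sketch. It is not taken from the paper: the helper names (sha1_hex, build_known_set, is_known) and the sample file content are assumptions made for illustration, and an in-memory set stands in for the hash database.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hex string."""
    return hashlib.sha1(data).hexdigest()


def build_known_set(files: dict[str, bytes]) -> set[str]:
    """Preprocessing step: hash every known file and store the digests.

    A real deployment would use a database; a set is an assumption
    made here to keep the sketch self-contained.
    """
    return {sha1_hex(content) for content in files.values()}


def is_known(data: bytes, known: set[str]) -> bool:
    """Investigation step: hash a found file and look it up."""
    return sha1_hex(data) in known


# Hypothetical sample data standing in for a known file.
known_files = {"report.txt": b"quarterly report, version 1\n"}
known = build_known_set(known_files)

original = known_files["report.txt"]
# A near-identical file that differs in a single character.
modified = original.replace(b"version 1", b"version 2")

print(is_known(original, known))  # exact copy of a known file
print(is_known(modified, known))  # similar file, different digest
```

Because a cryptographic hash changes completely when a single byte changes, the modified file is not recognized. This is the gap that a similarity-preserving hash function such as Kornblum's is meant to close.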

Persons

Organizational Units

  • Institute of Information Systems
  • Hilti Chair for Data and Application Security

DOI

http://dx.doi.org/10.1007/978-3-642-35515-8_12