Automated Digital Forensic Triage: Rapid Detection of Anti-Forensic Tools
We live in the information age. Our world is interconnected by digital devices and electronic communication. As such, criminals are finding opportunities to exploit our information rich electronic data. In 2014, the estimated annual cost from computer-related crime was more than 800 billion dollars. Examples include the theft of intellectual property, electronic fraud, identity theft and the distribution of illicit material. Digital forensics grew out of necessity to combat computer crime and involves the investigation and analysis of electronic data after a suspected criminal act. Challenges in digital forensics exist due to constant changes in technology. Investigation challenges include exponential growth in the number of cases and the size of targets; for example, forensic practitioners must analyse multi-terabyte cases comprised of numerous digital devices. A variety of applied challenges also exist, due to continual technological advancements; for example, anti-forensic tools, including the malicious use of encryption or data wiping tools, hinder digital investigations by hiding or removing the availability of evidence. In response, the objective of the research reported here was to automate the effective and efficient detection of anti-forensic tools. A design science research methodology was selected as it provides an applied research method to design, implement and evaluate an innovative Information Technology (IT) artifact to solve a specified problem. The research objective require that a system be designed and implemented to perform automated detection of digital artifacts (e.g., data files and Windows Registry entries) on a target data set. The goal of the system is to automatically determine if an anti-forensic tool is present, or absent, in order to prioritise additional in-depth investigation. The system performs rapid forensic triage, suitable for execution against multiple investigation targets, providing an analyst with high-level information regarding potential malicious anti-forensic tool usage. The system is divided into two main stages: 1) Design and implementation of a solution to automate creation of an application profile (application software reference set) of known unique digital artifacts; and 2) Digital artifact matching between the created reference set and a target data set. Two tools were designed and implemented: 1) A live differential analysis tool, named LiveDiff, to reverse engineer application software with a specific emphasis on digital forensic requirements; 2) A digital artifact matching framework, named Vestigium, to correlate digital artifact metadata and detect anti-forensic tool presence. In addition, a forensic data abstraction, named Application Profile XML (APXML), was designed to store and distribute digital artifact metadata. An associated Application Programming Interface (API), named apxml.py, was authored to provide automated processing of APXML documents. Together, the tools provided an automated triage system to detect anti-forensic tool presence on an investigation target. A two-phase approach was employed in order to assess the research products. The first phase of experimental testing involved demonstration in a controlled laboratory environment. First, the LiveDiff tool was used to create application profiles for three anti-forensic tools. The automated data collection and comparison procedure was more effective and efficient than previous approaches. Two data reduction techniques were tested to remove irrelevant operating system noise: application profile intersection and dynamic blacklisting were found to be effective in this regard. Second, the profiles were used as input to Vestigium and automated digital artifact matching was performed against authored known data sets. The results established the desired system functionality and demonstration then led to refinements of the system, as per the cyclical nature of design science. The second phase of experimental testing involved evaluation using two additional data sets to establish effectiveness and efficiency in a real-world investigation scenario. First, a public data set was subjected to testing to provide research reproducibility, as well as to evaluate system effectiveness in a variety of complex detection scenarios. Results showed the ability to detect anti-forensic tools using a different version than that included in the application profile and on a different Windows operating system version. Both are scenarios where traditional hash set analysis fails. Furthermore, Vestigium was able to detect residual and deleted information, even after a tool had been uninstalled by the user. The efficiency of the system was determined and refinements made, resulting in an implementation that can meet forensic triage requirements. Second, a real-world data set was constructed using a collection of second-hand hard drives. The goal was to test the system using unpredictable and diverse data to provide more robust findings in an uncontrolled environment. The system detected one anti-forensic tool on the data set and processed all input data successfully without error, further validating system design and implementation. The key outcome of this research is the design and implementation of an automated system to detect anti-forensic tool presence on a target data set. Evaluation suggested the solution was both effective and efficient, adhering to forensic triage requirements. Furthermore, techniques not previously utilised in forensic analysis were designed and applied throughout the research: dynamic blacklisting and profile intersection removed irrelevant operating system noise from application profiles; metadata matching methods resulted in efficient digital artifact detection and path normalisation aided full path correlation in complex matching scenarios. The system was subjected to rigorous experimental testing on three data sets that comprised more than 10 terabytes of data. The ultimate outcome is a practically implemented solution that has been executed on hundreds of forensic disk images, thousands of Windows Registry hives, more than 10 million data files, and approximately 50 million Registry entries. The research has resulted in the design of a scalable triage system implemented as a set of computer forensic tools.
Advisor: Wolfe, Hank; MacDonell, Stephen
Degree Name: Doctor of Philosophy
Degree Discipline: Information Science
Publisher: University of Otago
Keywords: digital forensics; computer forensics; reverse engineering; triage; windows registry; file system; differential analysis; file system analysis; differential forensic analysis; automated forensic analysis; system-level forensic analysis; digital forensic analysis; metadata; Vestigium; LiveDiff; CellXML; DFXML; RegXML; automated application detection; automated software detection; anti-forensic tools; hacking tools
Research Type: Thesis