Learning Types for Binaries

Zhiwu Xu, Cheng Wen, Shengchao Qin

Research output: Contribution to journal › Article › peer-review


Abstract

Type inference for binary code is a challenging problem, partly because much type-related information is lost during compilation from high-level source code. Most existing research on binary code type inference resorts to program analysis techniques, which can be too conservative to infer types with high accuracy or too heavyweight to be viable in practice. In this paper, we propose a new approach to learning types for recovered variables from their related representative instructions. Our idea is motivated by “duck typing”, where the type of a variable is determined by its features and properties. Our approach first learns a classifier from existing binaries with debug information and then uses this classifier to predict types for new, unseen binaries. We have implemented our approach in a tool called BITY and used it to conduct experiments on the well-known benchmark coreutils (v8.4). The results show that our tool is more precise than the commercial tool Hex-Rays, both in terms of correct types and compatible types.
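The two-phase idea in the abstract (learn from binaries that carry debug information, then predict types for unseen variables from the instructions that use them) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the instruction tokens, the tiny training set, and the choice of a naive Bayes classifier are hypothetical and are not taken from BITY or the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: for each recovered variable, the instructions
# that touch it, paired with the type recorded in debug information.
# The instruction tokens are invented for this sketch.
TRAINING = [
    (["movss", "addss", "mulss"], "float"),
    (["movss", "cvtsi2ss", "divss"], "float"),
    (["mov_dword", "add", "cmp"], "int"),
    (["mov_dword", "imul", "sub"], "int"),
    (["movzx_byte", "cmp_byte", "mov_byte"], "char"),
    (["mov_byte", "movsx_byte", "cmp_byte"], "char"),
]


def train(samples):
    """Count how often each instruction token appears for each type label."""
    counts = defaultdict(Counter)
    labels = Counter()
    vocab = set()
    for instrs, label in samples:
        labels[label] += 1
        counts[label].update(instrs)
        vocab.update(instrs)
    return counts, labels, vocab


def predict(model, instrs):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    counts, labels, vocab = model
    total = sum(labels.values())
    best_label, best_score = None, -math.inf
    for label in sorted(labels):
        score = math.log(labels[label] / total)
        denom = sum(counts[label].values()) + len(vocab)
        for ins in instrs:
            score += math.log((counts[label][ins] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Phase 1: learn from variables whose types are known from debug info.
model = train(TRAINING)

# Phase 2: predict types for variables in a new, stripped binary.
print(predict(model, ["movss", "mulss"]))
print(predict(model, ["mov_byte", "cmp_byte"]))
```

The sketch mirrors the "duck typing" intuition: a variable is assigned the type whose typical instruction usage it most resembles. A real system would need a far richer feature extraction step and a much larger training corpus.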
Original language: English
Journal: Lecture Notes in Computer Science
Publication status: Published - 11 Oct 2017
