As code obfuscation techniques have evolved, driven by both the software protection industry and sophisticated malware, research into deobfuscation techniques has grown as well.
Beyond naive manual static and dynamic analysis, more advanced methodologies have appeared to automate this tedious process, at least partially; symbolic execution, taint analysis and combinations of both are the most representative.
However, the efficacy of these methods depends strongly on the syntactic complexity of the code, which limits how well they scale to more advanced obfuscation: modern mechanisms have been shown to effectively hinder the simplification of obfuscated code on a syntactic level. As a result, the most recent deobfuscation techniques are starting to focus on the code's exposed semantic behavior rather than on its syntactic expression.
The main goal of the talk is to expose the limitations of common code deobfuscation techniques that operate on a syntactic level and to present the current state-of-the-art techniques that address those limitations based on code semantics, with a special focus on modern program synthesis approaches.