Competing with ‘diversity’: Firmware analysis and its challenges
Over the weekend I started work on a new (sort of) project of mine: “Reverse-engineer a random blob of binary recovered from a downloaded firmware image.”
Halfway into it, I realized that this topic, ‘Firmware analysis and its challenges’, doesn’t seem to get the attention it so desperately needs. Or rather, I couldn’t find enough public discussion on the subject, especially given its current state.
What follows is a quick summary of my thoughts on the subject (in a couple of mind-maps) and an *advertisement* for an open-source analysis framework that attempts to address some of these challenges.
This (really) doesn’t paint a pretty picture. Nonetheless, the open-source community has been toiling away at the problem, slowly but steadily. The available tooling, frameworks, plugins etc. for modern binary analysis look something like this.
It looks like a good place to start, but the entire affair is far from trivial, especially when compared to a typical reversing engagement against desktop OSs, web apps/services or cloud targets. Over the course of my exploration, I happened to come across a framework focused on firmware analysis: Avatar2. It attempts to solve some of these challenges with a few neat tricks.
At its heart, it interconnects debuggers, emulators and analysis frameworks by exposing a consistent API (i.e. easy scriptability).
The authors have examples for a bunch of use-cases. The one I particularly love shows off the framework’s state transfer and synchronization capabilities.
A vulnerable Firefox browser: You know that feeling when you need to leverage the strengths of two completely disparate tools to achieve an objective? Yeah, it drives you crazy! In this case, we can leverage symbolic execution to find the bug/vulnerability, but we first have to sift through Firefox’s massive code-base. Evaluating a binary this large directly in angr (a symbolic execution framework) is tantamount to asking for trouble (i.e. highly likely to result in state explosion).
Avatar2 allows us to run the executable concretely in gdb until we reach our function of interest, then transfer the concrete state into angr, symbolize some attacker-controlled arguments to the function and explore until we eventually find a bug. This works because Avatar2’s state transfer and synchronization capabilities can dynamically move state from concretely executed software into a symbolic execution engine.
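To make the handoff above a little more concrete, here is a toy sketch in plain Python. It is not Avatar2’s or angr’s API: `ToyTarget`, `transfer_state` and `symbolize` are hypothetical names I made up for illustration, and the register values and addresses are invented. It only shows the idea of copying registers plus selected memory ranges from a concrete target, then swapping one attacker-controlled argument for a symbolic placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTarget:
    """Hypothetical stand-in for an analysis tool (debugger, symbolic engine, ...)."""
    name: str
    regs: dict = field(default_factory=dict)
    mem: dict = field(default_factory=dict)  # address -> byte value


def transfer_state(src, dst, synced_ranges):
    """Copy all registers plus the listed (start, size) memory ranges from src to dst."""
    dst.regs = dict(src.regs)
    for start, size in synced_ranges:
        for addr in range(start, start + size):
            if addr in src.mem:
                dst.mem[addr] = src.mem[addr]


def symbolize(target, reg_names):
    """Replace concrete register values with named symbolic placeholders."""
    for reg in reg_names:
        target.regs[reg] = f"<sym_{reg}>"


# Pretend the debugger ran concretely up to the function of interest.
# All values below are invented for the example.
gdb = ToyTarget(
    "gdb",
    regs={"pc": 0x401000, "rdi": 0x7FFF0000, "rsi": 16},
    mem={0x7FFF0000 + i: b for i, b in enumerate(b"AAAA")},
)
sym = ToyTarget("symbolic-engine")

# Hand the concrete state over, syncing only the buffer the function reads.
transfer_state(gdb, sym, synced_ranges=[(0x7FFF0000, 4)])

# Treat the length argument as attacker controlled.
symbolize(sym, ["rsi"])

print(sym.regs)
```

The point of syncing only selected ranges is the same one made above: the symbolic side starts deep inside the program with a small, relevant slice of state instead of exploring from the entry point.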
Some features that stand out for me:
- Targets: Treating tools like emulators, debuggers and instrumentation frameworks as Python abstractions through which we interact with the real endpoints looks like a well-thought-out approach to the problem of firmware reversing.
- Internal memory layout representation: A consistent view of a program’s memory is required. Avatar2 provides interfaces for defining and updating the memory layout, which is then pushed to the targets. A target is simply one of several analysis tools, e.g. gdb, OpenOCD, QEMU, PANDA etc.
- Peripheral modeling: Adding simple models of peripherals. For example, you could model a USART interface on a particular vendor’s ARM-based microcontroller. The model receives and transmits input/output over a TCP connection instead of a physical peripheral.
- Plugin system: An Orchestrator for the state transfer and synchronization capabilities, and an Instruction Forwarder that forwards I/O accesses to physical devices, or simply forwards instructions to other targets to obtain a different set of analysis artefacts.
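The peripheral modeling idea from the list above can be sketched in a few lines of plain Python. Again, this is not Avatar2’s peripheral API: `ToyUsart` is a hypothetical class, and the register offsets and status bits are assumptions chosen for the example rather than taken from a specific datasheet. Register reads and writes that firmware would normally perform against hardware are served from a TCP socket instead.

```python
import select
import socket


class ToyUsart:
    """Hypothetical USART model whose data register is backed by a TCP socket."""

    SR, DR = 0x00, 0x04      # assumed offsets of the status and data registers
    RXNE, TXE = 0x20, 0x80   # assumed "receive not empty" / "transmit empty" bits

    def __init__(self, conn, poll_timeout=0.0):
        self.conn = conn
        self.poll_timeout = poll_timeout
        self.rx = bytearray()

    def _poll(self):
        """Pull pending bytes from the socket into the receive buffer."""
        readable, _, _ = select.select([self.conn], [], [], self.poll_timeout)
        if readable:
            self.rx += self.conn.recv(256)

    def read(self, offset):
        """Serve a register read the way an emulator would forward it."""
        if not self.rx:
            self._poll()
        if offset == self.SR:
            return self.TXE | (self.RXNE if self.rx else 0)
        if offset == self.DR and self.rx:
            return self.rx.pop(0)
        return 0

    def write(self, offset, value):
        """Serve a register write; bytes written to DR go out over TCP."""
        if offset == self.DR:
            self.conn.sendall(bytes([value & 0xFF]))


# Wire the model to a loopback TCP connection (the port is picked by the OS).
server = socket.create_server(("127.0.0.1", 0))
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()
usart = ToyUsart(conn, poll_timeout=1.0)

client.sendall(b"hi")                       # "the outside world" sends input
received = bytes([usart.read(ToyUsart.DR), usart.read(ToyUsart.DR)])
usart.write(ToyUsart.DR, ord("!"))          # "firmware" transmits a byte
echoed = client.recv(1)
print(received, echoed)

for s in (client, conn, server):
    s.close()
```

With a model like this in place, you can poke at firmware I/O paths from any TCP client without having the physical board on your desk.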
However, this is still a WIP. We’ll need to see a lot more traction, i.e. more targets being added and more options for tool interoperability. For more info, see the GitHub page:
https://github.com/avatartwo/avatar2
This is just a summary of my experience over the years. I’d love to hear what you have to say on this topic!