29 November 2017

Binary Ninja Recipes

What is the value of a blog if you don't post something from time to time? But what do you publish when you recognize only two kinds of knowledge: things you know, which are therefore trivial, and things you don't know, which you therefore shouldn't be writing about? Well, today is the time for some trivial knowledge - Binary Ninja recipes.

Problem 1: how to develop plugins

For some time I was trying to find an optimal way to structure my development environment for plugins. First, for Binary Ninja to discover and run a plugin, it must be located in the ~/.binaryninja/plugins/ directory (I'm skipping standalone plugins that you can run from anywhere). The obvious solution is to edit it directly there, but that always struck me as inelegant. At first, I was editing files in my project directory and copying them manually, but after a few times it became tedious. So, as the next step, I wrote a universal shell script that took the plugin files and deployed them to the relevant directory in the Binary Ninja tree. That, however, had one tiny flaw - I had to remember to run the deployment. Many times I restarted Binary Ninja, opened a binary and executed the plugin, only to realize I was still running the old version of the code.
My next try was with Binary Ninja's internal plugin system - it can fetch code from a remote git repository and run it locally. Still, it was too complicated for the simple problem I was facing. I asked the good people on the Binary Ninja Slack channel and adjusted my workflow based on a few of their suggestions.

I primarily use git during development, so I can later push things to github.com. I keep two main branches - stable and dev. In addition to that, I now simply symlink my project directory into the Binary Ninja plugin directory. When I want to develop a new feature I switch to the dev branch and get instant deployment for free, and when I just want to use the plugin I check out the stable version. (I told you this was going to be trivial.)
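The symlink step can be scripted too. This is only a minimal sketch: link_plugin is a hypothetical helper, not part of Binary Ninja, and the demo runs against throwaway temporary directories rather than the real ~/.binaryninja/plugins/ directory.

```python
import os
import tempfile
from pathlib import Path


def link_plugin(project_dir, plugins_dir):
    """Symlink project_dir into plugins_dir and return the path of the link."""
    project = Path(project_dir).resolve()
    plugins = Path(plugins_dir).expanduser()
    plugins.mkdir(parents=True, exist_ok=True)
    link = plugins / project.name
    if link.is_symlink():
        link.unlink()  # replace a stale link, never a real directory
    os.symlink(project, link, target_is_directory=True)
    return link


# Demo with temporary directories standing in for the project and plugin dirs.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "annotator"
    project.mkdir()
    link = link_plugin(project, Path(tmp) / "plugins")
    linked_ok = link.is_symlink() and link.resolve() == project.resolve()
```

For real use you would presumably call link_plugin(".", "~/.binaryninja/plugins") once from the project checkout; after that, switching branches is the whole deployment.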

Problem 2: Binary Reader

Now, something more technical. Let's say that, for some reason, you want to read or scan the whole binary you've loaded into Binary Ninja, for example to find some pattern. My initial idea was to do it like this:
# bv stands for BinaryView
for addr in range(bv.start, bv.end):
  b = bv.read(addr, 1)
This approach has a few flaws. First of all, the return type is a string, so if, for example, you want to read 4 bytes and compare them against a value like 0x41414141, you need to unpack them into the correct type. Second, you can't easily move forward and backward. I decided it would be better to use BinaryReader, so I wrote this:
import binaryninja as bn

br = bn.BinaryReader(bv)

while not br.eof:
  f_byte = br.read8()
In theory that should scan every byte of the binary, mapped or not. Every read8() call moves the internal read offset by one byte, and the size of the returned value corresponds to the read function being called (read8(), read16() and so on). There was one small problem with that code - it ended up in an infinite loop. It took me a while to understand what was going on. Basically, if a read steps outside a mapped segment, it returns a null value and stops moving the internal offset, hence the infinite loop. The improved version of the code now looks roughly like this:
br = bn.BinaryReader(bv)

while not br.eof:
  if bv.is_valid_offset(br.offset):
    f_byte = br.read8()
  else:
    br.seek_relative(1)
Now it works smoothly.
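To make the "unpack it into the correct type" remark concrete, here is a small sketch that runs without Binary Ninja. The segments list is a hypothetical stand-in for the mapped regions of a BinaryView, and find_dword is an illustrative helper of mine, not an API call; the gaps between segments are simply never visited, which is the same idea as skipping invalid offsets above.

```python
import struct


def find_dword(segments, value):
    """Yield addresses where the little-endian 32-bit `value` occurs.

    `segments` is a list of (start_address, data_bytes) pairs; it is an
    assumption standing in for the mapped parts of a loaded binary.
    """
    for start, data in segments:
        # Stop 3 bytes before the end so every 4-byte read stays in bounds.
        for off in range(len(data) - 3):
            (dword,) = struct.unpack_from("<I", data, off)
            if dword == value:
                yield start + off


segments = [(0x1000, b"\x00AAAA\x00"), (0x3000, b"AAAAA")]
hits = list(find_dword(segments, 0x41414141))
```

Inside a plugin, the bytes would come from bv.read() or the reader instead of a hard-coded list.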

From now on I will try to write short "How I do things" style posts, especially about Binary Ninja. I've even started drafting something I refuse to call a book, but if I gather enough material on writing Binary Ninja plugins, who knows. Let me know what you think about all of this! Next time I will try to write some more about Binary Ninja plugin repository management.

15 March 2017

Nobody expected 64 bits

Apparently, if you are not mortally embarrassed by the quality of your code, you are releasing it too late [(tm) Silicon Valley]. Or, to use another only-too-often-used quote - "Release early, release often". I've made the mistake of hoarding my tools and code for too long, not releasing them because they weren't perfect. This was obviously a road to nowhere, because if I don't release, nobody uses it. And if nobody uses it, I have no motivation to develop it any further. So, to break this circle, I present you a new version of the function Annotator for Binary Ninja.

The first thing worth mentioning is a new database of function prototypes. To be exact, we now have 4728 prototypes. Big thanks to Zach Riggle for his functions project - this update would not be here if not for him.

The next thing is a virtual stack for the x64 platform - from now on you can also annotate 64-bit applications for Intel/AMD processors.

One small thing that I still need to implement properly is full support for functions operating on floating point types (float, double and long double). Right now they are not properly annotated, and there are two important reasons for that:
On the 32-bit platform, floating point arguments are placed on the stack using instructions like fstp or fst. Sadly, Binary Ninja does not yet have corresponding Low Level IL instructions for those; they just show up as unimplemented(). The moment Binary Ninja starts supporting them, I will add some more parsing code and everything will be fine.
The 64-bit platform is slightly more complicated. First of all, arguments to functions are passed via registers. Integers, pointers and such are passed through 6 registers - RDI, RSI, RDX, RCX, R8 and R9 - and the order matters. Floating point arguments are passed via the XMM0-7 registers. Now, let's imagine that we have two functions, f1(int, float) and f2(float, int). What will the compiler do? Well, on Linux, in the case of f1() the first argument will end up in RDI and the second in XMM0, but in f2() the first argument will end up in XMM0 and the second one in RDI.
"Wait a minute" - you will say - "but this is exactly the same". I'm glad you see the same problem. Just having the state of the registers won't tell us which argument is first and which is second unless you know the types in the first place. The virtual stack does not know types, so until I refactor my code, FP types won't be supported.

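The f1/f2 ambiguity can be shown with a toy model of the register assignment described above. This is a simplified sketch, not Annotator code: assign_registers is a hypothetical helper, it only knows int-like and float/double arguments, and it ignores everything else in the calling convention (long double, structs, arguments spilled to the stack).

```python
INT_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]
FP_REGS = ["XMM%d" % i for i in range(8)]


def assign_registers(arg_types):
    """Return the register used for each argument, in declaration order."""
    ints, fps = iter(INT_REGS), iter(FP_REGS)
    regs = []
    for arg_type in arg_types:
        # float/double take the next free XMM register, everything else
        # (integers, pointers) takes the next free integer register.
        pool = fps if arg_type in ("float", "double") else ints
        regs.append(next(pool))
    return regs


f1 = assign_registers(["int", "float"])  # ['RDI', 'XMM0']
f2 = assign_registers(["float", "int"])  # ['XMM0', 'RDI']
same_register_state = sorted(f1) == sorted(f2)  # True
```

Both calls populate exactly the same registers, so an observer who sees only the register state cannot recover the argument order without the prototype.
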
New updates are planned, so stay tuned! And of course, please let me know what you think about it and report any bugs.

21 February 2017

Annotate all the things

I don't do reverse engineering for a living, but I still like to peek under the hood of binaries from time to time, whether for testing, looking for bugs or just for fun. The problem is that IDA Pro, the de facto standard tool for any reverse engineer, is prohibitively expensive for most people. On top of that, its licensing policy is very annoying and illogical. But enough about IDA Pro - let's talk about a new contender in this field - Binary Ninja.

I'm not going to repeat all the praise this tool is receiving. Instead, you may, for example, read how you can use it to automatically reverse 2000 binaries, or how the underlying Low Level Intermediate Language works. All in all, the platform looks very promising, and I couldn't wait to try it after seeing it for the first time. A couple of months ago I was playing with the Beta, and I pretty much bought it on the first day it was released.

There is one tiny problem with Binary Ninja, however - IDA Pro has been around for years, so it is feature rich and the ecosystem around it is pretty robust. Binja still has a long way to go in this department - there are not that many useful plugins and some features are missing. One thing I've noticed while reversing, for example, is that basic libc functions and system calls are not annotated in any way. There are no prototypes for them and their arguments are not marked. So, instead of complaining, I decided to use the available API and just fix that.

Let's start by defining a problem. For example we have a listing like this:

Not terribly descriptive, right? Well, at least for strcpy() we roughly remember the prototype, so we can quickly find where the arguments are being pushed onto the stack. But what about fchmodat() or sigaction()? Yeah, you need to go back to the man page. How cool would it be to open a binary and get this:

This is exactly what the Annotator plugin does - it iterates through all instructions in the code, building a virtual stack as it goes, but instead of variables it tracks the instructions that pushed a given variable onto the stack. Upon encountering a call to a known function, it uses this virtual stack to annotate those instructions with the matching arguments from the prototype.
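The idea can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the plugin's actual code: instructions are hypothetical (address, mnemonic, operand) tuples, PROTOTYPES is a one-entry stand-in for the prototype database, and arguments are assumed to be pushed right to left, as in the usual 32-bit cdecl convention.

```python
# Hypothetical, one-entry prototype database.
PROTOTYPES = {"strcpy": ["char *dest", "const char *src"]}


def annotate(instructions):
    """Map the address of each argument-pushing instruction to its argument."""
    stack = []  # addresses of push instructions, most recent last
    notes = {}
    for addr, mnemonic, operand in instructions:
        if mnemonic == "push":
            stack.append(addr)
        elif mnemonic == "call" and operand in PROTOTYPES:
            # The most recent push carries the first argument.
            for argument in PROTOTYPES[operand]:
                if not stack:
                    break
                notes[stack.pop()] = argument
    return notes


listing = [
    (0x10, "push", "eax"),     # pushed first -> last argument (src)
    (0x11, "push", "ebx"),     # pushed last  -> first argument (dest)
    (0x12, "call", "strcpy"),
]
notes = annotate(listing)
```

Here notes maps 0x11 to "char *dest" and 0x10 to "const char *src", which is the information the plugin would attach as comments.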

This is the very first release, so it is probably riddled with bugs, not to mention that some features are missing. Right now not all glibc function prototypes are present, because I haven't found a good and reliable way to extract them from headers - instead I'm using a combination of grep, regex and cut with some manual cleanup. That unfortunately takes time. The same goes for system calls, but I should be able to add all the 32-bit Linux ones today. Ah, and you have to run the plugin manually in every function you view - right now there is no way to apply it automatically to all functions. I'm contemplating writing a method that lets the user apply it to the whole underlying call graph, but we will see about that.

Another thing is the quite naive virtual stack implementation - it certainly needs more work to track stack growth more accurately and, for example, to track the number of arguments for functions that take a variable number of arguments (va_arg style). Right now I'm also scanning blocks of code in a linear manner, but in a future version I will probably switch to a recursive mode with stack isolation for each path (well, so far I haven't encountered a situation where function arguments are set up in a different code block than the call itself, but better safe than sorry). The last thing to improve is the number of virtual stacks - first for x64 platforms and later for the ARM architecture.

Please let me know what you think about the extension and report any bugs.

28 August 2015

In search of golden fleece

A key activity when looking for reflected XSS is checking which parameters provided in the request are echoed back in the response. Doing that manually is tedious, and that time can be spent in a more productive way. For example, you can write a Burp extension that will do it for you. So, I present Argonaut.

The extension works in a very simple way - it parses the captured request to extract all parameters (cookies included) and then searches through the response body to see whether the value in question has been echoed back. If so, a short snippet of the match is presented to the user.

Currently parameter parsing is done in quite a dumb way - it works quite well with standard GET and POST parameters, but it is, for example, unable to extract parameter values from JSON or XML and instead tries to find an exact match of the whole payload. That is not very effective, but fixing it is on my TODO list. One more thing to remember - parameter values shorter than 3 characters are ignored (you don't want 300 matches of '1' in the result table).

Hey, but what about escaping, you ask? No worries, I've got this covered. Let's say you are testing a web application written on top of Django. Most likely it uses the Jinja2 template engine, which applies escaping. Argonaut will search the response body for the plain parameter value (let's say test">), but will also apply the various defined transformations/escaping to see if, for example, the application returned 'test&#34;&gt;'.
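The matching logic described above can be sketched as follows. This is not Argonaut's code: find_reflections is a hypothetical helper, only query-string parameters are handled, and Python's html.escape stands in for a real template-engine transformation (it emits &quot; for a double quote, which is not necessarily what a given engine produces).

```python
from html import escape
from urllib.parse import parse_qsl

MIN_LEN = 3  # values shorter than this are ignored, as in the post


def find_reflections(query, body, transforms=None):
    """Return (parameter, transformation) pairs whose value appears in body."""
    if transforms is None:
        transforms = {"plain": lambda value: value, "html": escape}
    hits = []
    for name, value in parse_qsl(query):
        if len(value) < MIN_LEN:
            continue
        for label, transform in transforms.items():
            if transform(value) in body:
                hits.append((name, label))
    return hits


hits = find_reflections("q=test%22%3E&page=1", "<p>test&quot;&gt;</p>")
```

In this example hits is [("q", "html")]: the raw value is absent from the body, its escaped form is present, and the one-character page value is skipped. Adding a transformation is just another entry in the transforms dictionary.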

I chose the Jinja2 example for a reason - truth be told, Jinja2 is the only transformation implemented so far, but the mechanism is in place and I'm planning to add new ones very soon.

There is still work to be done. Some simple tasks will be completed soon - for example, new transformations and some UI work. Others are harder and will have to wait a bit - for example, support for contextual autoescaping libraries and type-dependent parameter extraction. Anyway, stay tuned and let me know what you think.

27 July 2015

Migrating repository

Because code.google.com will finally be shut down really soon, I've moved all my projects to GitHub. That includes JSONDecoder.

14 August 2013

MutProxy

Recently I've had very little time to write anything meaningful. New posts are coming, slowly but steadily. In the meantime, I stumbled upon a short piece of code on Gynvael's page. It reminded me of a project I wrote some years ago for an assessment.
When I finally found it, the code wasn't in a state I'd like to show to anyone. I spent the past few days cleaning it up and expanding it a bit. Today I pushed the code to GitHub. Here, take a look.

So, what does MutProxy do? (Yep, I know the name is neither very original nor brilliant, but come on, I'm not a Junior Creative Director at D'Arcy, I'm just a plain pentester.) It's just a simple proxy/tunnel with the ability to attach functions that alter or log traffic in different ways. A ReadMe does not exist at the moment, so you will have to read the code to determine the functionality. There is some documentation in the code comments :).
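To give a rough idea of what "attaching functions" means, here is a sketch of a mutator chain. The names (apply_mutators, make_logger, upper_mutator) are hypothetical and are not MutProxy's actual interface; the networking part of the proxy is left out entirely.

```python
def upper_mutator(data):
    """Example mutator: alter the traffic by upper-casing it."""
    return data.upper()


def make_logger(log):
    """Build a pass-through mutator that records every chunk it sees."""
    def logger(data):
        log.append(data)
        return data
    return logger


def apply_mutators(data, mutators):
    """Run a chunk of traffic through each attached function in order."""
    for mutate in mutators:
        data = mutate(data)
    return data


log = []
out = apply_mutators(b"get / http/1.1", [make_logger(log), upper_mutator])
```

The logger sees the original bytes because it is attached before the mutator; swapping the order would log the altered traffic instead.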

A lot of work is still to be done - the mutators are very basic and act more as an example than the real deal, the logger is very plain, and documentation does not exist. I'm waiting for more free time. I was also planning to write more about how to force applications to go through your proxy.

18 June 2013

Small update

This is going to be a very short post (let's call it a warmup).
I just wanted to let you know that I've made a small update to JSONDecoder. The changes are mostly cosmetic:

  • Content type check is case insensitive now
  • Decoder is now removing garbage from JSON payload (like }]);)
  • Another Content-type is being checked: text/javascript (twitter uses that)
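
The checks listed above could look roughly like the sketch below. It is not JSONDecoder's code: looks_like_json and decode_lenient are hypothetical names, and application/json is my assumption for the content type that was already being checked.

```python
import json

JSON_TYPES = ("application/json", "text/javascript")


def looks_like_json(content_type):
    """Case-insensitive content type check that ignores parameters."""
    return content_type.split(";")[0].strip().lower() in JSON_TYPES


def decode_lenient(body):
    """Decode the first JSON object or array in body, ignoring the rest."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(body):
        if char in "{[":
            try:
                value, _ = decoder.raw_decode(body, index)
                return value
            except ValueError:
                continue  # not valid JSON from here, try the next bracket
    raise ValueError("no JSON value found")


is_json = looks_like_json("Text/JavaScript; charset=UTF-8")
payload = decode_lenient('{"ok": true}]);')
```

Here is_json is True and payload is {"ok": True}; the trailing ]); is dropped, and garbage in front of the payload is skipped the same way.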
More stuff soon.