Reverse Engineering with JavaScript

Posted by Ole André Vadla Ravnås NowSecure Marketing

Ole André Vadla Ravnås

Security Researcher at NowSecure

Ole is the creator of Frida, an open-source tool for performing dynamic instrumentation of mobile apps, and indulges his passion for reverse-engineering as a security researcher at NowSecure.

August 05, 2017

Research & Threat Intel

I’ve been doing a lot of reverse engineering, and very often there’s this common thing coming up: I would really like to have a tool for this because I need to look at this particular API, or look at what would happen if I do this vs that, etc.

Is it realistic?

It’s quite a lot of work to build a tool from scratch, and existing tools aren’t really suitable for the iterative reversing kind of use-case. I would also like to have a really short feedback loop when doing this, because as a reverser my understanding of the problem tends to evolve as I go along, which means I need to circle back to adapt the tool. If the feedback loop is long I’m just going to waste a lot of time, and in the end it might be better to just do things by hand and go through the painstaking process using conventional tools.

I would also like to have just one toolkit that I learn once and can use on whichever platform I’m currently on. Of course there are platform-specific bits involved, but at least I don’t have to re-learn the basics every time.

oSpy

Back in 2004 I was doing a lot of reverse engineering of protocols. At the time I was focusing on Windows Live Messenger as I was using Linux and wanted reasonable feature parity with my friends using the Windows version. Later I also got myself a Windows Phone, and it turned out only older models were supported by the synchronization software available for Linux users. While doing the ActiveSync reversing I ended up building a tool called oSpy that looks like this:

It’s kind of like Wireshark, but works at the API level, which means you can look at encrypted communications and pretty much anything you’d like. This was written in C#, and I injected code into the target process in order to instrument it, and that code was written in C. This was very powerful as you could trace any function you’d like.

With great power came great complexity

Developing the injected code was a long and tedious process. For each function traced you’d have to write quite a bit of boilerplate. You then had to compile it, restart the target app in case you left it in an unstable state last time, deploy your payload, inject it, and finally wait for the target app to call that API. Well, the feedback loop was pretty horrible. oSpy did help tremendously for the challenges I was facing back then, but it was very slow to evolve, and it was not very portable.

Trying again

Later I had new reversing projects coming up, and I found myself adding temporary hacks to the oSpy code-base. As powerful as the low-level function tracing was, I couldn’t let oSpy turn into a kitchen-sink, and it wasn’t really feasible to keep writing those bits in C. This led me to creating oSpy2, where I tried to get C out of the equation by letting you define functions to be traced declaratively through a config file.

Time was however not my friend

All of my reversing adventures were of recreational nature, and I couldn’t fit much into my not-so-ample spare-time. oSpy2 had a functional engine, but it was Windows-only, and there was a Windows UI that would let you instrument an application using this engine, and then let you save the resulting trace to a file. I started building a cross-platform UI that would let you open and analyze these files, but I didn’t have time to finish it.

A new beginning

Time went by and I couldn’t quite shake the feeling that I should start from scratch one more time, but this time around I should build a cross-platform dynamic binary instrumentation library and make that work really well. I named this library Gum, as it was a GLib-based instrumentation library written in portable C. More time went by, it was 2010, and Frida was born. The idea was to build a logistics layer that would inject a shared library containing Gum into a target process, and let you upload control logic written in JavaScript that would have full access to the Gum APIs, read/write memory, etc.

The next chapter

It is now 2015, and after one foray into the video-conferencing world followed by another co-founding a startup in the music industry, my passion for dynamic binary instrumentation and reversing obviously has had to take a back-seat for quite some time. I did manage to cram some recreational Frida hacking into spare-time and holidays, and considering how many OSes and architectures it now runs on that’s pretty good. There has however been a yearning to be able to spend more time on Frida, to truly unleash its potential and make it the ultimate toolkit for dynamic instrumentation. This is why I’m absolutely stoked to continue this work with NowSecure, where I see it being very useful to mobile specifically, aligning perfectly with the strong multi-platform focus back when it all started.

In the time ahead I am really excited about improving feature parity across OSes and architectures, and making the mobile part of the story even stronger. I also plan on attending ZeroNights 2015, where I will be conducting a workshop on Frida. It will cover all the basics and also some advanced topics, so don’t miss out if you have a chance to attend.

In closing

Let’s do a quick demo. Suppose someone wrote this simple app:

	#include <stdio.h>
	#include <unistd.h>

	void
	f (int n)
	{
	  printf ("Number: %d\n", n);
	}

	int
	main ()
	{
	  int i = 0;

	  printf ("f() is at %p\n", f);

	  while (1)
	  {
	    f (i++);
	    sleep (1);
	  }
	}

       $ clang hello.c -o hello
       $ ./hello
       f() is at 0x106a81ec0
       Number: 0
       Number: 1
       Number: 2
       …

Take note of the address of f(), which is 0x106a81ec0 here. This is the memory address where the function f() starts inside the hello app’s process, and due to Address Space Layout Randomization it will most likely differ from run to run.
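Since the address changes from run to run, copying it by hand gets old quickly. Here is a minimal sketch of pulling it out of the app's startup output instead. The helper name parseAddressOfF is made up for this illustration, and it assumes that %p prints a 0x-prefixed hex value, which is what the run above shows but is not guaranteed on every platform:

```javascript
'use strict';

// Hypothetical helper (not part of Frida): extract the address that the
// hello app prints on startup. Assumes "%p" yields a 0x-prefixed hex value.
function parseAddressOfF(output) {
  const match = /f\(\) is at (0x[0-9a-fA-F]+)/.exec(output);
  return match === null ? null : match[1];
}

// Sample text copied from the run shown above.
const sample = 'f() is at 0x106a81ec0\nNumber: 0\nNumber: 1\n';
console.log(parseAddressOfF(sample));
```

The resulting string could then be handed to ptr() in the agent in place of a literal pasted in by hand.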

Install the latest Node.js 0.12 and a couple of packages:

       $ npm install frida co

Pour some code into modify.js, replacing the hard-coded address of f():

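	'use strict';

	const co = require('co');
	const frida = require('frida');

	let session, script;
	co(function *() {
	  // Attach to the running process named "hello".
	  session = yield frida.attach('hello');
	  // Ship the source of agent() to the target, where it gets evaluated.
	  script = yield session.createScript('(' + agent.toString() + ').call(this);');
	  yield script.load();
	});

	// Runs inside the target process, not inside Node.js.
	function agent() {
	  'use strict';

	  // Replace this address with the one printed by your own run of hello.
	  Interceptor.attach(ptr('0x106a81ec0'), {
	    onEnter: function (args) {
	      // Overwrite the first argument of f() before the function body runs.
	      args[0] = ptr("1337");
	    }
	  });
	}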

Then run it while keeping an eye on the output from the hello app:

       $ node --harmony modify.js

You should see the printed number changing, and once you kill modify.js it should go back to normal:

       Number: 1281
       Number: 1282
       Number: 1337
       Number: 1337
       Number: 1337
       Number: 1337
       Number: 1296
       Number: 1297
       Number: 1298
       …
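One detail in modify.js that is easy to miss: agent() is never called on the Node.js side. Only its source text is sent over, wrapped so that it runs once it is evaluated in the target. The sketch below shows just that mechanism in plain Node.js, with no Frida involved. The ptr and Interceptor objects here are simplified stand-ins written for this illustration, not Frida's real implementations, and the hooks array is likewise hypothetical:

```javascript
'use strict';

// Same shape as the agent in modify.js; only its source text is used below.
function agent() {
  Interceptor.attach(ptr('0x106a81ec0'), {
    onEnter: function (args) {
      args[0] = ptr('1337');
    }
  });
}

// This is the string that modify.js passes to createScript().
const source = '(' + agent.toString() + ').call(this);';

// Stand-ins for the runtime (assumptions for this sketch, NOT Frida's own):
// ptr() wraps a value, and Interceptor.attach() just records the hook.
const hooks = [];
function fakePtr(value) {
  return { value: String(value) };
}
const fakeInterceptor = {
  attach: function (target, callbacks) {
    hooks.push({ target: target, callbacks: callbacks });
  }
};

// Evaluate the source with the stand-ins in scope, the way a runtime would.
const run = new Function('ptr', 'Interceptor', source);
run(fakePtr, fakeInterceptor);

// Simulate the target calling f(5): the hook rewrites the first argument.
const args = [fakePtr('5')];
hooks[0].callbacks.onEnter(args);
console.log(hooks[0].target.value, args[0].value);
```

Because only the text of agent() travels, it cannot close over variables from modify.js; anything it needs has to be inside the function body or provided by the runtime it ends up in.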