Verifying Java Decompilers

A few months ago, in Bytecode Viewer’s GitHub repository, I was discussing the feasibility of decompiling a Java class file, recompiling it, and still getting the same bytecode out. Storyyeller pointed out that, for most decompilers, the recompiled bytecode is unlikely to match (with Krakatau likely coming the closest). This post gives a practical demonstration of that, and shows how to spot decompiler errors with BinDiff.

First, what tools are required? The ones used in this walkthrough are Bytecode Viewer, Krakatau, IDA Pro with BinExport, and BinDiff.

I’m going to use a homework assignment from a few years ago as my example.

I compiled it into out_stock/ and opened the result with Bytecode Viewer. (Since this is a simple example, we can tell visually that the decompiled output is pretty much the same as the original source code.)

I then exported all the classes with Krakatau and compiled them again. As expected, running VotingMachine.class appears to work exactly the same as the original. We could just compare the bytecode manually, but that becomes difficult when dealing with an application that has hundreds or thousands of classes.
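As a coarse first pass, the two output directories can be compared byte for byte to find out which classes deserve a closer look. The following is a minimal sketch, not part of the original workflow: the class name ClassFileCompare and the second directory name out_krakatau/ are assumptions made for illustration (only out_stock/ appears above). A mismatch here does not necessarily mean the decompiler got the code wrong, since constant pool ordering or debug information can differ too.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists .class files that differ between two build outputs.
class ClassFileCompare {

    // Returns the relative paths of .class files under originalDir that are
    // missing from recompiledDir or whose bytes differ.
    static List<String> findMismatches(Path originalDir, Path recompiledDir) throws IOException {
        List<Path> classFiles;
        try (Stream<Path> walk = Files.walk(originalDir)) {
            classFiles = walk
                .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".class"))
                .sorted()
                .collect(Collectors.toList());
        }

        List<String> mismatches = new ArrayList<>();
        for (Path original : classFiles) {
            Path relative = originalDir.relativize(original);
            Path recompiled = recompiledDir.resolve(relative);
            boolean same = Files.isRegularFile(recompiled)
                && Arrays.equals(Files.readAllBytes(original), Files.readAllBytes(recompiled));
            if (!same) {
                mismatches.add(relative.toString());
            }
        }
        return mismatches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ClassFileCompare <originalDir> <recompiledDir>");
            return;
        }
        for (String name : findMismatches(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println("differs: " + name);
        }
    }
}
```

Running it as java ClassFileCompare out_stock out_krakatau would print one line per class that is not byte-identical. It only says which classes differ, not what changed inside them.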

So, we need a more automated way of telling what differences there are between the two classes. This is where BinDiff helps out. Unfortunately, IDA Pro is still needed to use BinDiff.

After exporting the files with BinExport 8 (BinExport 9 does not work with BinDiff), IDA Pro is no longer needed.

At this point there’s a VotingMachine.BinExport for each version: the original class file, and the one compiled from Krakatau’s decompiled output.

Next, I opened both of these BinExport files and created a new diff in BinDiff.

As expected, BinDiff shows that the two programs are almost identical.

They’re not actually 100% identical though. I wonder what could be wrong?

The original class that doesn’t match looks like this:

public class VotingMachine {
    private int demoVotes, repubVotes;

    public void voteDemo() {
        this.demoVotes++;
    }
    public void voteRepub() {
        this.repubVotes++;
    }
    // Other functions removed to save space.
}

And the Krakatau source looks like this:

public class VotingMachine {
    private int demoVotes;
    private int repubVotes;

    public void voteDemo() {
        this.demoVotes = this.demoVotes + 1;
    }
    public void voteRepub() {
        this.repubVotes = this.repubVotes + 1;
    }
    // Other functions removed to save space.
}

It turns out that, by pure luck, I ran into a Krakatau bug on my first try. The Java compiler actually emits different bytecode for this.demoVotes = this.demoVotes + 1; and this.demoVotes++;, even though the two statements have the same effect (don’t ask me why).

combined_voterepub vs voterepub

combined_votedemo vs votedemo
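To reproduce the difference outside of BinDiff, both forms can be put side by side in one small class and disassembled with javap -c. This is a minimal sketch: the class IncrementForms and its member names are made up for illustration and are not part of the homework assignment. As an assumption worth checking against your own javac version, the ++ form usually loads this once and duplicates it with dup, while the expanded form loads this twice.

```java
// Hypothetical demo class: one method per increment form.
class IncrementForms {
    int postfixCount;
    int expandedCount;

    // The form used in the original source.
    void incrementPostfix() {
        this.postfixCount++;
    }

    // The form that appeared in the decompiled output.
    void incrementExpanded() {
        this.expandedCount = this.expandedCount + 1;
    }

    public static void main(String[] args) {
        IncrementForms forms = new IncrementForms();
        forms.incrementPostfix();
        forms.incrementExpanded();
        // Both forms behave the same at runtime; prints "1 1".
        System.out.println(forms.postfixCount + " " + forms.expandedCount);
    }
}
```

After compiling with javac, running javap -c IncrementForms prints the instructions for both methods so they can be compared directly.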

Ironically, only Krakatau has this bug; the output of CFR, Fernflower and Procyon all recompiles to a 100% match with the original.

Summary:

  • Java decompilers aren’t actually that bad
  • BinDiff can help pinpoint and visualize where a decompiler went wrong

Future ideas:

  • Fully automate/script the decompile -> recompile stage with each decompiler, and use BinDiff to identify if the output is correct
  • Write a fuzzer that generates random Java programs to help spot errors
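The fuzzer idea could start from something as small as a seeded generator that emits random field updates, the kind of statement that tripped up Krakatau here. The following is a rough sketch under that assumption: RandomProgramGenerator, the field names, and the statement templates are all invented for illustration, and a real fuzzer would need far broader language coverage.

```java
import java.util.Random;

// Hypothetical generator: emits a small random Java class as source text.
class RandomProgramGenerator {
    private static final String[] FIELDS = {"a", "b", "c"};

    // %1$s is the field name, %2$d a small constant. Indexed arguments let
    // every template share the same String.format call.
    private static final String[] STATEMENT_TEMPLATES = {
        "this.%1$s++;",
        "this.%1$s--;",
        "this.%1$s = this.%1$s + %2$d;",
        "this.%1$s += %2$d;",
        "this.%1$s = -this.%1$s;",
    };

    // Builds the source of a class with methodCount methods. The same seed
    // always yields the same source, so a failing case can be reproduced.
    static String generate(String className, long seed, int methodCount) {
        Random random = new Random(seed);
        StringBuilder source = new StringBuilder();
        source.append("public class ").append(className).append(" {\n");
        for (String field : FIELDS) {
            source.append("    private int ").append(field).append(";\n");
        }
        for (int m = 0; m < methodCount; m++) {
            source.append("\n    public void method").append(m).append("() {\n");
            int statementCount = 1 + random.nextInt(4);
            for (int s = 0; s < statementCount; s++) {
                String template = STATEMENT_TEMPLATES[random.nextInt(STATEMENT_TEMPLATES.length)];
                String field = FIELDS[random.nextInt(FIELDS.length)];
                int constant = 1 + random.nextInt(9);
                source.append("        ")
                      .append(String.format(template, field, constant))
                      .append('\n');
            }
            source.append("    }\n");
        }
        source.append("}\n");
        return source.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("FuzzCase", 42L, 3));
    }
}
```

The printed source would need to be saved as FuzzCase.java before compiling, after which it could go through the same decompile, recompile, and diff steps as VotingMachine.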
