Month: April 2014

IL to abstract syntax tree parser

Just a little project I’m working on as part of a larger one. I thought I’d post some updates here as it progresses. Basically, it turns IL code into an abstract syntax tree, which can then easily be converted into C#, VB or any other language. Here’s what it looks like:

Image
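For readers who can’t see the screenshot, here is a minimal Python sketch of the general idea, not the project’s actual code. It assumes a tiny made-up subset of IL (ldarg, ldc.i4, add, sub, mul, ret) and simulates the evaluation stack to build expression nodes; all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple

@dataclass
class Const:
    value: int

@dataclass
class Arg:
    index: int

@dataclass
class BinOp:
    op: str
    left: Any
    right: Any

@dataclass
class Return:
    value: Optional[Any]

BINARY = {"add": "+", "sub": "-", "mul": "*"}

def il_to_ast(instructions: List[Tuple[str, Optional[int]]]) -> List[Return]:
    """Simulate the IL evaluation stack, pushing tree nodes instead of values."""
    stack: List[Any] = []
    statements: List[Return] = []
    for opcode, operand in instructions:
        if opcode == "ldc.i4":
            stack.append(Const(operand))
        elif opcode == "ldarg":
            stack.append(Arg(operand))
        elif opcode in BINARY:
            right = stack.pop()   # the second operand was pushed last
            left = stack.pop()
            stack.append(BinOp(BINARY[opcode], left, right))
        elif opcode == "ret":
            statements.append(Return(stack.pop() if stack else None))
        else:
            raise NotImplementedError(opcode)
    return statements

def to_source(node: Any) -> str:
    """Print a node as C#-like source text."""
    if isinstance(node, Const):
        return str(node.value)
    if isinstance(node, Arg):
        return f"arg{node.index}"
    if isinstance(node, BinOp):
        return f"({to_source(node.left)} {node.op} {to_source(node.right)})"
    if isinstance(node, Return):
        return "return;" if node.value is None else f"return {to_source(node.value)};"
    raise TypeError(node)

il = [("ldarg", 0), ("ldc.i4", 2), ("mul", None),
      ("ldc.i4", 1), ("add", None), ("ret", None)]
print(to_source(il_to_ast(il)[0]))  # return ((arg0 * 2) + 1);
```

Once the tree exists, emitting VB instead of C# would only mean swapping out to_source, which is the appeal of going through an AST.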

I’ll keep you guys updated, and hopefully soon put it up on GitHub. 🙂


Dissecting ConfuserEx – Invalid metadata

The latest paper in my series. It explains what the Invalid metadata protection does and how it affects decompilers and similar tools. I don’t go over how to fix everything manually, since it’s fairly self-explanatory and de4dot can do it automatically anyway.

Introduction

This protection works by injecting “dummy” metadata into the assembly, which decompilers then try to parse and read. They crash because the metadata doesn’t actually point to anything real.

Read it here: Dissecting ConfuserEx – Invalid metadata

Also added to the Dissecting ConfuserEx paper series blog entry.

Manual global decryption of MSIL methods

Something I’ve wanted to write about for a while, and only recently got around to. This paper shows how you can globally decrypt MSIL methods by dumping them from the JIT.

Introduction

If you’re not familiar with how the JIT compiler in the CLR works, you should probably read http://geekswithblogs.net/ilich/archive/2013/07/09/.net-compilation-part-1.-just-in-time-compiler.aspx or any other article covering it to understand the basics. The reason this is global is that every .NET assembly compiled to IL code needs to be compiled into native machine code at runtime. This is done by either mscorjit.dll (.NET 3.5 and lower) or clrjit.dll (.NET 4 and higher). Both of these have a function called “compileMethod” that does the actual compiling. Basically, the raw IL code is passed to compileMethod, and from there a native method is created. This means that even if an obfuscator or protector completely encrypts the body of a method, it HAS to be passed to compileMethod as a clean buffer of IL code (although obfuscation may still be present). This is why we’re going to take advantage of compileMethod in order to dump the clean bodies.
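To make the idea concrete, here is a toy Python model of the hook chain. It is only a simulation under stated assumptions: real_compile stands in for compileMethod, the “protector” uses a one-byte XOR as pretend encryption, and every name here is hypothetical. Real JIT hooking is done in native code, as the paper describes.

```python
from typing import Callable, Dict

CompileFn = Callable[[str, bytes], bytes]

def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy 'encryption': XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)

def real_compile(name: str, il: bytes) -> bytes:
    """Stand-in for compileMethod: pretend to turn IL into native code."""
    return b"native:" + il

def make_protector_hook(next_compile: CompileFn, key: int) -> CompileFn:
    """The protector decrypts the body just before it reaches the JIT."""
    def hook(name: str, encrypted_il: bytes) -> bytes:
        return next_compile(name, xor_bytes(encrypted_il, key))
    return hook

def make_dump_hook(next_compile: CompileFn, dumped: Dict[str, bytes]) -> CompileFn:
    """Our hook sits between the protector and the JIT and records the IL."""
    def hook(name: str, il: bytes) -> bytes:
        dumped[name] = il
        return next_compile(name, il)
    return hook

clean_body = bytes([0x00, 0x17, 0x2A])        # nop, ldc.i4.1, ret
encrypted_body = xor_bytes(clean_body, 0x5A)  # what would sit in the file

dumped: Dict[str, bytes] = {}
compile_method = make_protector_hook(make_dump_hook(real_compile, dumped), key=0x5A)
compile_method("Main", encrypted_body)

print(dumped["Main"].hex())  # 00172a
```

The point of the model is the ordering: whatever the protector does to the body, the last hook before the compiler always sees plain IL.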

I say this is an almost global decryption method because some protections, such as Agile.NET, use code virtualization. This means the IL is converted to their own custom bytecode and never runs through the JIT compiler, so the method of decryption I’m about to show is useless against this sort of protection. However, if you’re interested in decrypting a custom bytecode such as Agile.NET’s, you should take a look at de4dot.

In this demonstration I will be using the most common ‘template’ for JIT hooking, created by Daniel Pistelli. You can find it here: http://www.codeproject.com/Articles/26060/NET-Internals-and-Code-Injection#JIT_Hooking_Example.

Read it here: Manual global decryption of MSIL methods

Bored? Take a look at this.

If you’re like me, you’ve wondered what some of the classes and methods in the standard mscorlib.dll look like. Yes, you can use Reflector, but it’s a pain and doesn’t really generate info as descriptive as the website below. Anyway, I sometimes sit and browse this site for a while when I’m bored. It’s basically a large library containing the sources of methods from the .NET Framework.

http://www.dotnetframework.org/Search.aspx

If you find anything weird or funny in there feel free to make a comment about it!

Closer look at the native constant mutation in ConfuserEx

In my Dissecting ConfuserEx – x86 switch predicates paper I quickly went over the actual code used, in order to understand the switch jump flow. But I simply debugged it to see the return value, and didn’t go into detail about what the code does. I just thought it wouldn’t be that interesting or relevant to the paper. That’s why I decided to create this blog entry in order to properly explain it. You will need to read the paper in order to understand what I’m talking about here.

Let’s start by looking at an obfuscated piece of C# code:

Image

and the IL equivalent:

Image

Let’s follow the call to the native method with RVA 20F0, made at:

IL_0011: call int32 <Module>::(int32)

in OllyDbg. We’ll find:

003420F0 /. 89E0 MOV EAX,ESP
003420F2 |. 53 PUSH EBX
003420F3 |. 57 PUSH EDI
003420F4 |. 56 PUSH ESI
003420F5 |. 29E0 SUB EAX,ESP
003420F7 |. 83F8 18 CMP EAX,18
003420FA |.- 74 07 JE SHORT 00342103 <-- THIS JUMP SHOULD NOT BE TAKEN
003420FC |. 8B4424 10 MOV EAX,DWORD PTR SS:[ESP+10]
00342100 |. 50 PUSH EAX
00342101 |.- EB 01 JMP SHORT 00342104
00342103 |> 51 PUSH ECX
00342104 |> B8 6D739303 MOV EAX,393736D
00342109 |. 81C0 BEBDB45E ADD EAX,5EB4BDBE
0034210F |. 59 POP ECX
00342110 |. 69C9 2538C0C9 IMUL ECX,ECX,-363FC7DB
00342116 |. 69C9 E5FC94FD IMUL ECX,ECX,-26B031B
0034211C |. 29C8 SUB EAX,ECX
0034211E |. 81C0 B2C98459 ADD EAX,5984C9B2
00342124 |. 5E POP ESI
00342125 |. 5F POP EDI
00342126 |. 5B POP EBX
00342127 \. C3 RETN

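This is the code that returns the value deciding where the switch should jump next. The first few instructions save EBX, EDI and ESI and measure how far ESP moved: three 4-byte pushes give 0xC rather than 0x18, so the JE at 003420FA is not taken. We can therefore skip ahead to: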

003420FC |. 8B4424 10 MOV EAX,DWORD PTR SS:[ESP+10]
00342100 |. 50 PUSH EAX

This is where it moves the value of the passed method parameter into the EAX register, in our case 0x2515CA13. Right after, it pushes the value onto the stack. It then does an unconditional jump to:

00342104 |> B8 6D739303 MOV EAX,393736D
00342109 |. 81C0 BEBDB45E ADD EAX,5EB4BDBE

This should be quite self-explanatory. It moves an immediate value of 0x393736D into EAX, then adds 0x5EB4BDBE to it. EAX now contains: (0x393736D + 0x5EB4BDBE) == 0x6248312B

0034210F |. 59 POP ECX

This pops whatever value is on top of the stack into ECX. In our case it loads 0x2515CA13, pushed by: 00342100 PUSH EAX.

00342110 |. 69C9 2538C0C9 IMUL ECX,ECX,-363FC7DB
00342116 |. 69C9 E5FC94FD IMUL ECX,ECX,-26B031B

Here it does some more arithmetic. It first multiplies whatever is in ECX by -0x363FC7DB and stores the result in ECX. It then multiplies whatever is in ECX by -0x26B031B and stores the result in ECX once again. ECX now contains: (0x2515CA13 * -0x363FC7DB * -0x26B031B), truncated to 32 bits.
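As a quick sanity check of these two multiplications, here is a minimal Python sketch. It assumes plain 32-bit wrap-around, which is what IMUL leaves in the destination register; the helper name u32 is my own and not taken from any tool.

```python
MASK32 = 0xFFFFFFFF

def u32(value: int) -> int:
    """Truncate a Python integer to an unsigned 32-bit register value."""
    return value & MASK32

# OllyDbg prints the immediates as negative numbers; the encoded bytes
# (25 38 C0 C9 and E5 FC 94 FD, little-endian) are their two's complement.
imm1 = u32(-0x363FC7DB)
imm2 = u32(-0x26B031B)

ecx = 0x2515CA13        # value popped into ECX
ecx = u32(ecx * imm1)   # IMUL ECX,ECX,-363FC7DB
ecx = u32(ecx * imm2)   # IMUL ECX,ECX,-26B031B

print(hex(imm1), hex(imm2), hex(ecx))
```

Python integers never overflow, so the explicit mask after each step is what plays the role of the 32-bit register.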

0034211C |. 29C8 SUB EAX,ECX

Subtract ECX from EAX. EAX now contains: 0x6248312B - (0x2515CA13 * -0x363FC7DB * -0x26B031B).

0034211E |. 81C0 B2C98459 ADD EAX,5984C9B2

Add 0x5984C9B2 to EAX. EAX now contains: (0x6248312B - (0x2515CA13 * -0x363FC7DB * -0x26B031B)) + 0x5984C9B2. And now to the beautiful part. All this complicated-looking math, truncated to 32 bits, is actually equal to 2. Knowing this, we can follow the jump flow of the switch as shown in the image below:

Image
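To double-check that the arithmetic really collapses to 2, note that the whole routine is just an affine function of its argument modulo 2^32: the three additive constants fold into one, and the two IMUL immediates fold into a single multiplier. A minimal Python sketch of that folding follows (the names ADDEND, MULTIPLIER and native_predicate are mine):

```python
MASK32 = 0xFFFFFFFF

# Constants from MOV EAX / ADD EAX / ADD EAX, folded into one addend.
ADDEND = (0x393736D + 0x5EB4BDBE + 0x5984C9B2) & MASK32
# The two IMUL immediates, folded into a single multiplier.
MULTIPLIER = (-0x363FC7DB * -0x26B031B) & MASK32

def native_predicate(arg: int) -> int:
    """Return what the native method leaves in EAX for a given argument."""
    return (ADDEND - arg * MULTIPLIER) & MASK32

print(native_predicate(0x2515CA13))  # prints 2
```

Seen this way, the native method computes 0xBBCCFADD - arg * 0xB69AA519 in 32-bit arithmetic, which is all a deobfuscator would need to recover.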

This might all seem a bit overcomplicated for such a simple task as hiding a constant value. But it really isn’t. Splitting a constant up into parts and reassembling it at runtime causes an array of different problems for automatic deobfuscation tools. In my project ConfuserDeobfuscator I created a simple IL emulator in order to fold these “mutated” constants into one value. But doing this in x86 assembly makes it more difficult to emulate. Additionally, these methods are “randomly” generated from a small set of opcodes. The ones shown in this example (IMUL, ADD, SUB) aren’t all of them. Here’s a list of the possible opcodes:

public enum x86OpCode
{
    MOV,
    ADD,
    SUB,
    IMUL,
    DIV,
    NEG,
    NOT,
    XOR,
    POP
}
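To give a feel for what emulating such a method involves, here is a rough Python sketch of a tiny register emulator. It is not the code from ConfuserDeobfuscator: the tuple-based instruction format is made up for illustration, it only knows EAX and ECX, and DIV is left out on purpose because its real x86 semantics (EDX:EAX as the dividend) need more state than this sketch models.

```python
MASK32 = 0xFFFFFFFF

def fold(program, arg):
    """Run (opcode, dst, *sources) tuples and return the final EAX value."""
    regs = {"EAX": 0, "ECX": 0}
    stack = [arg & MASK32]  # the method argument, as pushed by the prologue

    def val(operand):
        # An operand is either a register name or an immediate integer.
        return regs[operand] if isinstance(operand, str) else operand

    for op, dst, *src in program:
        if op == "MOV":
            regs[dst] = val(src[0]) & MASK32
        elif op == "ADD":
            regs[dst] = (regs[dst] + val(src[0])) & MASK32
        elif op == "SUB":
            regs[dst] = (regs[dst] - val(src[0])) & MASK32
        elif op == "IMUL":  # three-operand form: dst = src1 * src2
            regs[dst] = (val(src[0]) * val(src[1])) & MASK32
        elif op == "NEG":
            regs[dst] = (-regs[dst]) & MASK32
        elif op == "NOT":
            regs[dst] = ~regs[dst] & MASK32
        elif op == "XOR":
            regs[dst] = (regs[dst] ^ val(src[0])) & MASK32
        elif op == "POP":
            regs[dst] = stack.pop()
        else:
            raise NotImplementedError(op)
    return regs["EAX"]

# The instructions from the listing above, starting at 00342104.
PROGRAM = [
    ("MOV", "EAX", 0x393736D),
    ("ADD", "EAX", 0x5EB4BDBE),
    ("POP", "ECX"),
    ("IMUL", "ECX", "ECX", -0x363FC7DB),
    ("IMUL", "ECX", "ECX", -0x26B031B),
    ("SUB", "EAX", "ECX"),
    ("ADD", "EAX", 0x5984C9B2),
]

print(fold(PROGRAM, 0x2515CA13))  # prints 2
```

A real tool would additionally have to decode the raw bytes into instructions first, which is where most of the extra difficulty compared to IL comes from.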

But all that aside, here’s what the x86 assembly code above could look like if it were implemented in C#:

Image

 

I hope this gave you some more insight into what the native code actually does. Is there something incorrect in the text? Do you have any questions or feedback? Post it in the comments below. 🙂

Dissecting ConfuserEx paper series

As you might know, I’ve written several papers covering the different protections of Confuser 1.9. Now that Yck1509 (the author of Confuser) has started working on a successor project, I’m really excited to keep up the papers for the new ConfuserEx! It has far more complex obfuscation routines, and also introduces the use of native methods inside the .NET assembly, so hopefully I’ll learn some more x86 writing these. 🙂 So far I’ve covered 2 protections. I’ll try to continue whenever new features are added to the project, and I’ll keep updating this blog entry whenever I release new papers. In the meantime, feel free to read through the list of finished ones:

Enjoy reading them. Feel free to give me feedback or questions in the comments.