Improving Coverage Guided Fuzzing, Using Static Analysis

SMT solvers are powerful, but putting that power to real-world use takes serious computational infrastructure, which is something I don’t have. So I constantly look for new ways to improve fuzzing without heavy computational overhead. There are ways to avoid SMT solvers and still improve code coverage significantly. For example, the idea of transforming the code at compile time in a way that helps guided brute-force fuzzing find its way into deeper paths of the program has gained attention lately[0][1]. But when you are dealing with binaries, you can’t easily transform compare instructions (e.g. “CMP REG, CONSTANT” or string comparisons) into a split variant that lets the fuzzer brute-force the constant operand one byte at a time (in KFLOG I can instrument the execution of CMP without any noticeable overhead, but I will talk about that in another blog post once I have fully implemented and tested it).

But it is still possible to improve the code coverage of a guided fuzzer when it gets stuck on a block with a constant comparison, without SMT solvers or instrumentation of the CMP instruction, by using simple static analysis on the CMP instructions.

Using a simple IDA script, I first enumerate all the CMP instructions with a constant operand, then I enumerate the jump instructions related to each CMP instruction (sometimes a CMP is followed by more than one jump, for example a JZ and then a JA). After that I generate a constant that negates each CMP condition, and save the constant and its negation(s) in a dictionary file. KFUZZ later injects the dictionary content into the input buffer during a fuzzing session. You can improve this further by adding string constants to the dictionary too.
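To make the dictionary idea concrete, here is a minimal Python sketch of the constant-generation step only. It is not the IDA script itself: the enumeration of CMP and jump instructions is assumed to have happened already, and the names `dictionary_entries` and `FLIP_OFFSETS`, as well as the little-endian packing, are illustrative assumptions rather than anything KFUZZ is known to use.

```python
import struct

# Hypothetical table: for "CMP x, C" followed by the given jump, the two offsets
# from C give one operand value that takes the jump and one that does not.
FLIP_OFFSETS = {
    "jz": (0, 1), "je": (0, 1),
    "jnz": (1, 0), "jne": (1, 0),
    "ja": (1, 0), "jg": (1, 0),
    "jae": (0, -1), "jge": (0, -1),
    "jb": (-1, 0), "jl": (-1, 0),
    "jbe": (0, 1), "jle": (0, 1),
}

PACK_FORMATS = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}


def dictionary_entries(constant, width, jumps):
    """Return the constant and its negations as little-endian byte strings.

    constant: immediate operand of the CMP instruction
    width:    operand size in bytes (1, 2, 4 or 8)
    jumps:    mnemonics of the conditional jumps that consume the CMP flags
    """
    mask = (1 << (8 * width)) - 1
    values = [constant & mask]
    for jump in jumps:
        for offset in FLIP_OFFSETS.get(jump.lower(), ()):
            value = (constant + offset) & mask
            if value not in values:  # keep the dictionary free of duplicates
                values.append(value)
    return [struct.pack(PACK_FORMATS[width], value) for value in values]
```

Each returned byte string would become one dictionary token. Jumps that are not in the table simply contribute nothing beyond the constant itself.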

The effect is really impressive. For example, KFUZZ is now able to pass the following test without instrumenting the CMP instruction (to split the constant) while producing good coverage (it can produce full coverage if I don’t stop the fuzzing when it reaches the “return TRUE” line):

// fuzzer test functions, grabbed from the LLVM libFuzzer tests
// https://github.com/llvm-mirror/llvm/tree/master/lib/Fuzzer/test

static volatile INT sink;

BOOLEAN
LongSwitch(
    CONST PUCHAR Data,
    SIZE_T       Size
    ) 
{
    ULONGLONG X;

    if (Size < sizeof(X)) 
        return FALSE;

    memcpy(&X, Data, sizeof(X));
    switch (X) {

        case 1: sink = __LINE__; break;
        case 101: sink = __LINE__; break;
        case 1001: sink = __LINE__; break;
        case 10001: sink = __LINE__; break;
        case 100001: sink = __LINE__; break;
        case 1000001: sink = __LINE__; break;
        case 10000001: sink = __LINE__; break;
        case 100000001: return TRUE;
    }

    return FALSE;
}

LongSwitchCFG

The Driller paper[2] introduced a challenge that emphasizes the use of concolic execution in fuzzing. The challenge is written in a way that a normal guided fuzzer like AFL[3] is unable to complete, because it uses a 4-byte numeric constant and two string constants. A fuzzer has to guess these values correctly to be able to find the bug. I implemented the same challenge in the KTEST driver and gave it a try using a constant dictionary (containing both CMP and string constants, with instrumentation of memory compare routines disabled):

#define DRILLER_TEST_MAGIC  0x9e3779b9
typedef struct _DRILLER_TEST_CONFIG {
    ULONG Magic;
    CHAR Directive[64];
} DRILLER_TEST_CONFIG, *PDRILLER_TEST_CONFIG;

NTSTATUS
DrillerTest(
    IN PUCHAR Buffer,
    IN ULONG  BufferLen
    )
{
    PDRILLER_TEST_CONFIG pConfig;

    if (BufferLen < sizeof(DRILLER_TEST_CONFIG))         
        return STATUS_INVALID_BUFFER_SIZE;
 
    pConfig = (PDRILLER_TEST_CONFIG)Buffer;     
    if (pConfig->Magic != DRILLER_TEST_MAGIC) {
        
        DbgPrint("[KTEST] Bad magic number\n");
        return STATUS_INVALID_PARAMETER;
    }

    if(!strncmp(pConfig->Directive, "crashstring", 12)) {
        
        DbgPrint("[KTEST] passed the DrillerTest (crashstring)!\n");
        return STATUS_SUCCESS;
    }
    else if(!strncmp(pConfig->Directive, "setoption", 10)) {

        /* setoption(config->directives[1]); */
        DbgPrint("[KTEST] passed the DrillerTest! (setoption)\n");
        return STATUS_SUCCESS;
    }
    else {

        /* _default(); */
        DbgPrint("[KTEST] DrillerTest called the _default()\n");
        return STATUS_INVALID_PARAMETER;
    }
}
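As a rough illustration of why a constant dictionary is enough here, the following Python sketch models the checks in DrillerTest and runs a tiny coverage-guided loop that only overwrites the input with dictionary tokens. It is a toy model under stated assumptions, not KFUZZ's mutation engine: `driller_outcome`, `overwrite` and `fuzz_with_dictionary` are hypothetical names, and "coverage" is reduced to the outcome label of each run.

```python
import struct

MAGIC = 0x9E3779B9
CONFIG_SIZE = 4 + 64  # ULONG Magic followed by CHAR Directive[64]


def driller_outcome(buf):
    """Python model of DrillerTest: return a label for the branch that was reached."""
    if len(buf) < CONFIG_SIZE:
        return "too short"
    if struct.unpack_from("<I", buf, 0)[0] != MAGIC:
        return "bad magic"
    directive = bytes(buf[4:4 + 64])
    # strncmp(..., "crashstring", 12) also compares the terminating NUL byte
    if directive[:12] == b"crashstring\x00":
        return "crashstring"
    if directive[:10] == b"setoption\x00":
        return "setoption"
    return "default"


def overwrite(buf, offset, token):
    """Return a copy of buf with token written at offset."""
    out = bytearray(buf)
    out[offset:offset + len(token)] = token
    return bytes(out)


def fuzz_with_dictionary(seed, tokens):
    """Try every token at every offset; queue inputs that reach a new outcome."""
    queue = [seed]
    seen = {driller_outcome(seed)}
    for case in queue:  # the queue grows while we iterate over it
        for token in tokens:
            for offset in range(len(case) - len(token) + 1):
                candidate = overwrite(case, offset, token)
                outcome = driller_outcome(candidate)
                if outcome not in seen:
                    seen.add(outcome)
                    queue.append(candidate)
    return seen


TOKENS = [struct.pack("<I", MAGIC), b"crashstring\x00", b"setoption\x00"]
```

Starting from an all-zero seed, the magic token at offset 0 unlocks the string comparisons, and the two string tokens at offset 4 then reach both remaining branches.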

The first few bytes of the queued files in the demo above are KFUZZ’s internal structure, which contains some scheduling information about the queued test-case.

After that I tested it using a dictionary containing only CMP constants (no string constants) with memory compare routine instrumentation enabled (I started recording in the middle of the session and, sorry, the eye-protector popped up in the middle of the video for 20 seconds).


[0] https://lafintel.wordpress.com/2016/08/15/circumventing-fuzzing-roadblocks-with-compiler-transformations/
[1] http://dl.acm.org/citation.cfm?id=2594450
[2] https://www.internetsociety.org/sites/default/files/blogs-media/driller-augmenting-fuzzing-through-selective-symbolic-execution.pdf
[3] http://lcamtuf.coredump.cx/afl/

KFUZZ, a fuzzer story.

Almost two years ago, I started to write a kernel fuzzer to experiment with various ideas I had in mind. In the process I had to deal with some unexpected challenges that I thought might be useful to share with people who want to go down the same path. I didn’t have a powerful machine or a fuzzing farm to throw a dumb fuzzer at, let it run for a while and expect the monkey to come up with the complete works of Shakespeare, so speed and high coverage were my primary concerns.

KFUZZ (I know! what a generic name ; ) uses a modular design separated into user-mode and kernel-mode components. The user-mode part is responsible for mutating the input and consulting KFLOG (the kernel part) to see whether the input was able to hit any new edge/block. It’s really easy to develop new plugins for new targets, and I have already implemented a bunch of them for fonts and generic driver I/O fuzzing.

The real challenges began to show up in the implementation of the kernel part. I decided to implement the AFL[0] mechanism to trace edge coverage in KFLOG. But there wasn’t any source code to instrument, nor any fast hardware-assisted block tracing technology to trace the control flow (except the slow Intel PT, which I was unaware of at that time), and I was dealing with driver binaries! The solution was (obviously) either binary rewriting or hooking at the basic block level. Binary rewriting is really hard to implement and very error prone, especially when dealing with optimized kernel drivers, where a simple mistake can crash the whole system. The best method I came up with was to mirror the text section, so that instead of correcting every data access in the rewritten text section I could redirect data accesses to the original, unmodified text section, but even this was a really time-consuming task!

So I decided to go with the basic block level (BBL) hooking idea (besides sparing me the pain of correctly implementing binary rewriting, kernel components are loaded only once per system up-time, so there is no need to worry about the overhead of reloading and re-hooking the basic blocks at every input execution cycle). The first stage in BBL hooking is to extract the basic block locations from the binary. I knew that static recovery of the control flow graph is almost impossible, and especially when dealing with Windows binaries you can’t rely on linear disassembly[1] to get a 100% recovered CFG, but I gave it a try using the mighty IDA Pro as a static disassembler after Jakstab[2] and others failed me. IDA was able to produce a correct CFG for almost all the targets (but failed in some cases, like ntoskrnl).

The next thing I had to deal with was the problem of small basic blocks. In order to hook a basic block, it has to be at least 5 bytes long (the size of a long jump instruction), and unfortunately there are lots of small basic blocks out there that you can’t hook this way, and I didn’t want to miss any of them! So I came up with the idea of using an interrupt handler as a mechanism for instrumenting small basic blocks. I overwrote each small block with an illegal instruction (only two bytes long!) and then used a hooked IDT to instrument the block execution. It was tricky to implement, but I got it right.
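The size rule above can be sketched as a small patch-selection function. This is only an illustration under assumptions: it assumes 32-bit x86, a 5-byte `JMP rel32` (opcode E9) for normal blocks and `UD2` (0F 0B) as the two-byte illegal instruction, which the post does not name explicitly. The function name `hook_patch` and its parameters are hypothetical.

```python
import struct

JMP_REL32_SIZE = 5     # E9 + 32-bit relative displacement
UD2 = b"\x0f\x0b"      # assumed two-byte illegal instruction


def hook_patch(block_addr, block_size, trampoline_addr):
    """Return the bytes to write at the start of a basic block, or None.

    Blocks of 5+ bytes get a direct jump to the trampoline; blocks of 2-4
    bytes get an illegal instruction that a hooked #UD handler would catch;
    1-byte blocks cannot be hooked by either method in this sketch.
    """
    if block_size >= JMP_REL32_SIZE:
        # displacement is relative to the address of the next instruction
        rel = (trampoline_addr - (block_addr + JMP_REL32_SIZE)) & 0xFFFFFFFF
        return b"\xe9" + struct.pack("<I", rel)
    if block_size >= len(UD2):
        return UD2
    return None
```

In a real hook the overwritten original bytes also have to be relocated and executed somewhere, which this sketch leaves out entirely.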

Now I had a really fast and very low overhead (almost native!) basic block instrumentation framework. I used statically generated random IDs for each basic block (no need to re-compute them at each edge trace) and also implemented a callback in the edge hit function, so I was able to do some extra work, like recording the control flow graph for each test-case, without any overhead.

I also sacrificed a little bit of memory for more speed: unlike AFL, I don’t have to re-scan the bitmap every time I run a test-case to see if it was able to hit new edges. After locating the current edge position in the bitmap, I use the BTC[3] instruction to check whether it was set prior to this hit, something like this:

movzx   esi, byte ptr [ebx+edx] /* Current edge position in the bitmap */
btc     esi, 0       /* ESI holds 1 for new edge and 0 for an old edge */
add     _g_EdgeCounter, esi
mov     byte ptr [ebx+edx], 1

Then I saved the _g_EdgeCounter variable in shared memory so I could access it directly from the user-mode component, reducing the kernel I/O at each input execution and hence speeding things up a little bit more (I measured it: 10000 shared memory accesses took 53709 ticks/174 us, and 10000 accesses to _g_EdgeCounter using driver I/O took 16778493 ticks/50863 us).
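The bookkeeping around the edge bitmap and the new-edge counter can be modelled in a few lines of Python. This is a sketch, not KFLOG's code: the map size and the AFL-style index formula (current ID XOR previous ID, with the previous ID shifted right by one) are assumptions based on the statement that KFLOG follows the AFL mechanism, and the class name `EdgeMap` is made up for the example.

```python
MAP_SIZE = 1 << 16  # assumed bitmap size, as in stock AFL


class EdgeMap:
    """One byte per edge plus a running count of edges seen for the first time."""

    def __init__(self):
        self.bitmap = bytearray(MAP_SIZE)
        self.edge_counter = 0   # plays the role of _g_EdgeCounter
        self.prev = 0

    def hit(self, block_id):
        """Record the edge from the previous block to block_id."""
        index = (self.prev ^ block_id) % MAP_SIZE
        is_new = self.bitmap[index] ^ 1   # 1 for a new edge, 0 for an old one
        self.edge_counter += is_new
        self.bitmap[index] = 1
        self.prev = block_id >> 1         # keeps A->B distinct from B->A
```

With this layout the fuzzer only has to compare `edge_counter` before and after a run to know whether the test-case found something new, which is the re-scan that the counter avoids.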

The next speed bottleneck was disk I/O. The obvious first solution was a RAM disk, but when you are fuzzing the kernel you can’t rely on RAM, because its content vanishes when you crash the system. In an unsuccessful attempt, I tried to hook KeBugCheck internals and dump the RAM content at the time of the crash. Then I decided to create the RAM disk outside the fuzzing VM and map it inside the VM using a network share, so it could survive a VM crash, and it worked. But I wanted more speed!

When you are fuzzing kernel drivers, they rarely accept a file as input, so it doesn’t matter how you save your test-case content before you run it through the target! To speed up the test-case saving process I created a simple flat filesystem (if you can call it a filesystem ; ), which is just a giant file (rounded up to the sector size) with a single handle opened to it (with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags set). To save a test-case, I rounded its size up to the sector size and used the index variable to save it in the appropriate free position in the filesystem, along with some extra information like its original size. This resulted in a significant speed-up (I’ve lost the measurements, but in my context the speed gain was really impressive), because I didn’t have to go through the kernel and filesystem driver code for opening and closing a file every time I wanted to save a test-case, and I was able to save as many test-cases as I wanted.
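A minimal model of such a flat store might look like the sketch below. It only illustrates the layout idea (sector-aligned records appended to one already-open file); the 512-byte sector size, the 4-byte size header and the names `FlatStore`, `save` and `load` are assumptions, and a plain Python file object stands in for the unbuffered, write-through Windows handle.

```python
import io
import struct

SECTOR = 512                    # assumed sector size
HEADER = struct.Struct("<I")    # hypothetical header: original test-case size


def round_up(size, sector=SECTOR):
    """Round size up to the next multiple of the sector size."""
    return (size + sector - 1) // sector * sector


class FlatStore:
    """Append-only store of sector-aligned records in one open file."""

    def __init__(self, backing):
        self.backing = backing   # any seekable binary file object
        self.next_offset = 0     # next free, sector-aligned position
        self.index = []          # slot -> (offset, original size)

    def save(self, data):
        record = HEADER.pack(len(data)) + data
        padded = record.ljust(round_up(len(record)), b"\x00")
        self.backing.seek(self.next_offset)
        self.backing.write(padded)
        self.index.append((self.next_offset, len(data)))
        self.next_offset += len(padded)
        return len(self.index) - 1

    def load(self, slot):
        offset, size = self.index[slot]
        self.backing.seek(offset + HEADER.size)
        return self.backing.read(size)
```

Usage would be along the lines of `store = FlatStore(open("queue.bin", "w+b"))`, where the file name is just a placeholder. No file is opened or closed per test-case, which is where the saving comes from.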

After successfully testing KFUZZ against real-world examples, I wanted a more systematic approach to evaluate its effectiveness, but neither the LAVA corpora[4] nor EvilCoder[5] were available at that time. So I wrote a very simple random program generator (RPG) to test KFUZZ’s capabilities. Here is a sample function generated by my RPG:

BOOLEAN rgp_2_s[9] = { FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE };
// the following program covers about 14 percentage (19/128) of the input buffer!
NTSTATUS RandProg_2(__in PUCHAR Buffer, __in ULONG BufferLen)
{
    if (BufferLen < 128) return STATUS_UNSUCCESSFUL;

    if ( ( Buffer[0x55] != 0x22 ) || ( Buffer[0x38] > 0x1D || Buffer[0x7A] < 0x57 || Buffer[0x5A] < 0xFA )  )
    {
        if (rgp_2_s[0] != TRUE)
        {
            DbgPrint("[RGF-2] passed stage 0!\n");
            rgp_2_s[0] = TRUE;
        }

        if ( ( Buffer[0x07] < 0x1D || Buffer[0x46] > 0x1D )  )
        {
            if (rgp_2_s[1] != TRUE)
            {
                DbgPrint("[RGF-2] passed stage 1!\n");
                rgp_2_s[1] = TRUE;
            }

            if ( Buffer[0x28] == 0x69  )
            {
                if (rgp_2_s[2] != TRUE)
                {
                    DbgPrint("[RGF-2] passed stage 2!\n");
                    rgp_2_s[2] = TRUE;
                }

                if ( Buffer[0x37] < 0xFA  )
                {
                    if (rgp_2_s[3] != TRUE)
                    {
                        DbgPrint("[RGF-2] passed stage 3!\n");
                        rgp_2_s[3] = TRUE;
                    }

                    if ( ( Buffer[0x0C] != 0x87 || Buffer[0x31] > 0xCB ) || Buffer[0x7F] > 0x66 || ( Buffer[0x7A] != 0xB4 )  )
                    {
                        if (rgp_2_s[4] != TRUE)
                        {
                            DbgPrint("[RGF-2] passed stage 4!\n");
                            rgp_2_s[4] = TRUE;
                        }

                        if ( ( Buffer[0x3C] == 0x09 ) || Buffer[0x54] < 0x8A  )
                        {
                            if (rgp_2_s[5] != TRUE)
                            {
                                DbgPrint("[RGF-2] passed stage 5!\n");
                                rgp_2_s[5] = TRUE;
                            }

                            if ( ( ( Buffer[0x70] < 0x29 ) ) && ( Buffer[0x43] != 0x87 && Buffer[0x09] != 0x9B )  )
                            {
                                if (rgp_2_s[6] != TRUE)
                                {
                                    DbgPrint("[RGF-2] passed stage 6!\n");
                                    rgp_2_s[6] = TRUE;
                                }

                                if ( ( Buffer[0x1D] == 0x90 )  )
                                {
                                    if (rgp_2_s[7] != TRUE)
                                    {
                                        DbgPrint("[RGF-2] passed stage 7!\n");
                                        rgp_2_s[7] = TRUE;
                                    }

                                    if ( Buffer[0x7E] < 0xC6  )
                                    {
                                        if (rgp_2_s[8] != TRUE)
                                        {
                                            DbgPrint("[RGF-2] passed final stage!\n");
                                            rgp_2_s[8] = TRUE;
                                        }

                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
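For readers who want to build something similar, here is a small sketch of a generator that emits nested byte-comparison stages in the same spirit. It is not the RPG used for the function above; the names `random_condition` and `generate_program`, the operator set and the way terms are joined are all assumptions made for illustration.

```python
import random


def random_condition(rng, buffer_len):
    """Build one comparison of a random input byte against a random constant."""
    offset = rng.randrange(buffer_len)
    op = rng.choice(["==", "!=", "<", ">"])
    value = rng.randrange(256)
    return "Buffer[0x%02X] %s 0x%02X" % (offset, op, value)


def generate_program(name, stages, buffer_len=128, seed=0):
    """Return C source for a function with `stages` nested random checks."""
    rng = random.Random(seed)  # seeded so a test program can be regenerated
    lines = [
        "NTSTATUS %s(__in PUCHAR Buffer, __in ULONG BufferLen)" % name,
        "{",
        "    if (BufferLen < %d) return STATUS_UNSUCCESSFUL;" % buffer_len,
    ]
    for stage in range(stages):
        pad = "    " * (stage + 1)
        terms = [random_condition(rng, buffer_len)
                 for _ in range(rng.randint(1, 3))]
        joiner = rng.choice([" || ", " && "])
        lines.append("%sif ( %s )" % (pad, joiner.join(terms)))
        lines.append(pad + "{")
        lines.append('%s    DbgPrint("[%s] passed stage %d!\\n");'
                     % (pad, name, stage))
    for stage in reversed(range(stages)):
        lines.append("    " * (stage + 1) + "}")
    lines.append("    return STATUS_SUCCESS;")
    lines.append("}")
    return "\n".join(lines)
```

Calling `print(generate_program("RandProg_X", 9))` produces a function whose depth a fuzzer can be scored against, since every stage announces itself when first reached.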

Soon after, I added tests from the LLVM libFuzzer test suite to my test driver, KTEST. Here are some results (entirely on the KTEST binary) [6]:

And after adding memory comparison instrumentation to KFUZZ, it was able to pass the following test in 22 minutes.

ktest_strncmp_test

There are still some features missing in KFUZZ, like instrumentation of the CMP instruction, instrumentation of internal memory/string compare loops, or using an SMT solver when KFUZZ gets stuck on an input, which I plan to implement.


[0] http://lcamtuf.coredump.cx/afl/
[1] https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/andriesse
[2] http://tuprints.ulb.tu-darmstadt.de/2338/
[3] http://x86.renejeschke.de/html/file_module_x86_id_23.html
[4] https://seclab.ccs.neu.edu/static/publications/sp2016lava.pdf
[5] http://dl.acm.org/citation.cfm?id=2991103
[6] https://gist.githubusercontent.com/shjalayeri/58a9e8e42d3cc8300fa7cbcce174fa75/raw/4646bdfafd9892027a8cbb8a210b5e1f4f87be97/test.c

Blood Money

This is my write-up for the ransomware data decryption challenge by Amnpardaz (an Iranian AV company; I’ve played with their products before) and ISC (the Iranian Society of Cryptology). I’m not lying, it was really easy compared to the CTFs that I’ve played (and failed 😉 ). There were three steps in total; you get a bunch of files for each step, and you must decrypt them in order to get the next step’s email address.

Step One

We have a screenshot of the ransomware’s window, which says your files got encrypted with AES-256 blah blah blah, a solveme.gif which we must decrypt in order to get the next level’s email address, and a PCAP file which contains the network traffic at infection time. Obviously we must get the encryption key from the PCAP file.

I used forensicPCAP to examine the PCAP file. forensicPCAP is a Python script based on scapy for network forensics. The given PCAP contains 12026 captured packets. I started by looking at the HTTP traffic:

PCAP's web traffic

As I expected there was a lot of it, so I decided to take a look at the DNS traffic to see if there were any unusual DNS requests or any random-looking domains (like what a domain generation algorithm may produce). My suspicion was right, and there was a domain that looked randomly generated in the DNS records list:

DGA domain

The next thing I did was check the DNS packet to find the domain’s IP address:

###[ DNS ]###
           id        = 60305
           qr        = 1L
           opcode    = QUERY
           aa        = 1L
           tc        = 0L
           rd        = 1L
           ra        = 1L
           z         = 0L
           rcode     = ok
           qdcount   = 1
           ancount   = 1
           nscount   = 1
           arcount   = 1
           \qd        \
            |###[ DNS Question Record ]###
            |  qname     = 'www.qwleofjwih.com.'
            |  qtype     = A
            |  qclass    = IN
           \an        \
            |###[ DNS Resource Record ]###
            |  rrname    = 'www.qwleofjwih.com.'
            |  type      = A
            |  rclass    = IN
            |  ttl       = 86400
            |  rdlen     = 4
            |  rdata     = '91.226.213.198'
           \ns        \
            |###[ DNS Resource Record ]###
            |  rrname    = 'qwleofjwih.com.'
            |  type      = NS
            |  rclass    = IN
            |  ttl       = 86400
            |  rdlen     = 20
            |  rdata     = 'ns1.qwleofjwih.com.'
           \ar        \
            |###[ DNS Resource Record ]###
            |  rrname    = 'ns1.qwleofjwih.com.'
            |  type      = A
            |  rclass    = IN
            |  ttl       = 86400
            |  rdlen     = 4
            |  rdata     = '91.226.213.198'

Then I checked the corresponding packets and found a GZIP encoded reply from the web server:

DGA domain's traffic

Decoding the data gave me this:

{id:"23945yr8',bit:"3wkh8C3AU4tdKQ==",p:"vXc1ARd1dNWdFELzZ9TpfuEe8vYwLJcqMzluPHp42TE=",t:"138542922",k:"34VJcOLfchVZTFD/gPmwyQ=="}
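A quick way to get a feel for these fields is to base64-decode them and look at their lengths. The sketch below does only that. The field values are copied from the reply above, `decoded_lengths` is a hypothetical helper, and any mapping of fields to key or IV is a guess: 32 bytes happens to match an AES-256 key and 16 bytes an AES block, but the reply itself does not label them.

```python
import base64

# Field values copied from the decoded reply shown above.
FIELDS = {
    "bit": "3wkh8C3AU4tdKQ==",
    "p": "vXc1ARd1dNWdFELzZ9TpfuEe8vYwLJcqMzluPHp42TE=",
    "k": "34VJcOLfchVZTFD/gPmwyQ==",
}


def decoded_lengths(values):
    """Map each field name to the byte length of its base64-decoded value."""
    return {name: len(base64.b64decode(text)) for name, text in values.items()}


if __name__ == "__main__":
    for name, length in decoded_lengths(FIELDS).items():
        print(name, length, "bytes")
```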

Then I wrote a Python script to decrypt the given GIF file, and deleted two junk bytes from the beginning of the file to get the next level’s e-mail address.

Next level mail address

Step Two

The incident report says the encryption algorithm is AES, and we have solveme-enc.jpg, which again we must decrypt, plus a bunch of files from a compromised Drupal website that was used by the ransomware as some sort of key generation source.

I started by looking at the CHANGELOG of the given Drupal, got the Drupal version, downloaded the original one and started diffing the files. The only noticeable changes were in database.inc:

Drupal comparison

It looks to me like a lousy key generation algorithm. The only obstacle to getting the right key is the time() function, which is used as the seed for the PHP random number generator, so I needed the exact epoch time to get the same key the malware got during the infection process. To get the right time, I checked the given 7zip file, where I could see solveme-enc.jpg’s last modification time.

JPEG's last modification time

Unfortunately it is missing the seconds, so I wrote another simple Python script to generate a PHP script with the epoch times around 10:34 PM. That gave me a file like this, which produced a list of candidate keys when I ran it:

mt_srand(1440957610);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
mt_srand(1440957611);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
mt_srand(1440957612);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
mt_srand(1440957613);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
mt_srand(1440957614);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
mt_srand(1440957615);for ($i=0;$i<128/8;$i++) echo sprintf("%02x", mt_rand(0,255)); echo "</br>";
[...]
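Generating such a file takes only a few lines. The sketch below reproduces the line format shown above for a range of consecutive seeds; the function name `php_candidates` and its parameters are made up for the example, and the start value and count are whatever window you want to cover.

```python
# One PHP statement per candidate seed, in the same format as the lines above.
TEMPLATE = ('mt_srand(%d);for ($i=0;$i<128/8;$i++) '
            'echo sprintf("%%02x", mt_rand(0,255)); echo "</br>";')


def php_candidates(start_epoch, count):
    """Return PHP source that prints one 128-bit key per epoch second."""
    return "\n".join(TEMPLATE % (start_epoch + i) for i in range(count))


if __name__ == "__main__":
    # Example: cover five minutes starting at the first seed shown above.
    print(php_candidates(1440957610, 300))
```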

Again with the help of another Python script, I decrypted the given file with each key until one of them produced a valid JPEG file. Diffing it with the original famous picture revealed the next level’s email.

Next level email address
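The key search itself boils down to "decrypt with each candidate and check for a JPEG header". A minimal sketch of that loop follows. The decryption routine is deliberately passed in as a parameter because the AES mode and padding are not described here; `looks_like_jpeg` and `find_key` are hypothetical names, and checking only the first three bytes is a simplification.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG file starts with these bytes


def looks_like_jpeg(data):
    """Cheap validity check: compare the first bytes with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


def find_key(ciphertext, keys, decrypt):
    """Return the first key whose decryption looks like a JPEG, else None.

    decrypt is a callable (ciphertext, key) -> plaintext supplied by the caller.
    """
    for key in keys:
        if looks_like_jpeg(decrypt(ciphertext, key)):
            return key
    return None
```

Only the first block needs to be decrypted to run this check, which keeps the search fast even with a few hundred candidate keys.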

Update :

@pwnslinger pointed out that no brute force is needed: the exact seed is filemtime(solveme-enc.jpg).

Step Three

Yes, here is the fun part 😉 We are given a PE and a Result.txt file, and we must decrypt Result.txt.

The given PE was packed with a custom packer. I used the old-fashioned PUSHAD/POPAD trick and hardware breakpoints to get to the OEP, and then dumped the unpacked version.

Packed binary

OpenSSL debug strings and asserts helped me a bit to understand the program flow. It starts by generating a 2048-bit RSA key and uses this freshly generated key to encrypt the result.txt content:

.rsrc:00401AF0                 push    ebp
.rsrc:00401AF1                 mov     ebp, esp
.rsrc:00401AF3                 sub     esp, 144h
.rsrc:00401AF9                 push    0
.rsrc:00401AFB                 push    0
.rsrc:00401AFD                 push    10001h
.rsrc:00401B02                 push    800h
.rsrc:00401B07                 call    RSA_generate_key
[...]
.rsrc:00402851                 lea     eax, [ebp+var_38]
.rsrc:00402854                 push    eax             ; output_buffer
.rsrc:00402855                 mov     ecx, [ebp+var_4]
.rsrc:00402858                 push    ecx             ; data size
.rsrc:00402859                 mov     edx, [ebp+var_154]
.rsrc:0040285F                 push    edx             ; rsa key
.rsrc:00402860                 mov     eax, [ebp+var_170]
.rsrc:00402866                 push    eax             ; data
.rsrc:00402867                 call    sub_401F80

Then it encodes the newly generated RSA private key using base64 encoding (function renamed manually after identification):

.rsrc:00402945                 mov     eax, [ebp+var_164]
.rsrc:0040294B                 push    eax
.rsrc:0040294C                 mov     ecx, [ebp+var_18]
.rsrc:0040294F                 push    ecx
.rsrc:00402950                 mov     edx, ds:dword_49DAD4
.rsrc:00402956                 add     edx, ds:dword_49DAD8
.rsrc:0040295C                 push    edx
.rsrc:0040295D                 mov     eax, [ebp+var_30]
.rsrc:00402960                 push    eax
.rsrc:00402961                 call    base64encode
.rsrc:00402966                 add     esp, 10h
.rsrc:00402969                 mov     [ebp+var_10], eax

After that it decodes the word “challenge” (XORed with a 1-byte key) and passes it to this function:

RC4 init phase

The loop counter (256) and the memory writes look suspiciously like an RC4 key initialization phase. With some trial and error I found that my suspicion was right and it is actually the RC4 key initialization. Obviously the next function must be the RC4 encryption function, so I renamed it to rc4_enc (after looking at it and making sure it’s actually RC4 encryption). In the next step, the program encrypts the base64 encoded RSA private key with the RC4 encryption algorithm:

.rsrc:004029B8                 lea     ecx, [ebp+var_148]
.rsrc:004029BE                 push    ecx             ; key
.rsrc:004029BF                 mov     edx, [ebp+var_164]
.rsrc:004029C5                 sub     edx, 1
.rsrc:004029C8                 push    edx             ; priv key base64 size
.rsrc:004029C9                 mov     eax, [ebp+var_18]
.rsrc:004029CC                 push    eax             ; base 64 priv key
.rsrc:004029CD                 call    rc4_enc
.rsrc:004029D2                 add     esp, 0Ch
.rsrc:004029D5                 jmp     loc_402B3B
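For reference, this is what textbook RC4 looks like in Python: a 256-iteration key scheduling loop that shuffles a 256-byte state, followed by the keystream generator. It is the standard algorithm, not a decompilation of the binary, and the names `rc4_init` and `rc4_crypt` are mine. The 256-step loop with swaps in `rc4_init` is the pattern that gives the key initialization away in a disassembly.

```python
def rc4_init(key):
    """Key scheduling: 256 iterations that permute the state array."""
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(key, data):
    """Encrypt or decrypt data (RC4 is symmetric) with the given key bytes."""
    state = rc4_init(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)
```

Because encryption and decryption are the same operation, undoing this stage only requires calling `rc4_crypt` again with the same key, here the decoded word “challenge”.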

After that, the encrypted key is passed to another function, in which the Rijndael_Te1 name immediately reveals that it’s AES encryption:

.rsrc:0040500A                 mov     eax, ds:Rijndael_Te2[eax*4]
.rsrc:00405011                 mov     ebx, esi
.rsrc:00405013                 shr     ebx, 10h
.rsrc:00405016                 and     ebx, 0FFh
.rsrc:0040501C                 xor     eax, ds:Rijndael_Te1[ebx*4]
.rsrc:00405023                 mov     ebx, edx
.rsrc:00405025                 shr     ebx, 18h
.rsrc:00405028                 xor     eax, ds:Rijndael_Te0[ebx*4]
.rsrc:0040502F                 mov     ebx, edi
.rsrc:00405031                 and     ebx, 0FFh
.rsrc:00405037                 xor     eax, ds:Rijndael_Te3[ebx*4]
.rsrc:0040503E                 mov     ebx, ebp
.rsrc:00405040                 xor     eax, [ecx+10h]

The encryption key for the AES phase is the constant hex value “726A5C7C475670706F6862567E465E5C”. The last encryption phase takes place right before writing the result to disk. At first glance I thought it might be a variant of some simple encryption algorithm like TEA, but I was totally wrong; it was way simpler than TEA 🙂 After a while of being unable to identify the encryption algorithm, I decided to reverse engineer it (it is really small 😉 ). Here is the whole code:

.rsrc:00402440 unknown_enc     proc near               ; CODE XREF: sub_402500+892p
.rsrc:00402440
.rsrc:00402440 var_4C          = byte ptr -4Ch
.rsrc:00402440 out_buff        = dword ptr -1Ch
.rsrc:00402440 var_18          = dword ptr -18h
.rsrc:00402440 out_buff_ret    = dword ptr -14h
.rsrc:00402440 var_10          = word ptr -10h
.rsrc:00402440 var_C           = dword ptr -0Ch
.rsrc:00402440 internal_size   = dword ptr -8
.rsrc:00402440 var_4           = dword ptr -4
.rsrc:00402440 inbuff          = dword ptr  8
.rsrc:00402440 size            = dword ptr  0Ch
.rsrc:00402440
.rsrc:00402440                 push    ebp
.rsrc:00402441                 mov     ebp, esp
.rsrc:00402443                 sub     esp, 1Ch
.rsrc:00402446                 push    ebx
.rsrc:00402447                 push    esi
.rsrc:00402448                 push    edi
.rsrc:00402449                 mov     eax, [ebp+size]
.rsrc:0040244C                 sub     eax, 1
.rsrc:0040244F                 mov     [ebp+var_4], eax
.rsrc:00402452                 mov     cx, word ptr [ebp+var_4]
.rsrc:00402456                 mov     [ebp+var_10], cx
.rsrc:0040245A                 mov     edx, [ebp+size]
.rsrc:0040245D                 add     edx, 1
.rsrc:00402460                 mov     [ebp+internal_size], edx
.rsrc:00402463                 mov     eax, [ebp+size]
.rsrc:00402466                 push    eax
.rsrc:00402467                 call    unknown_libname_5 ; Microsoft VisualC 2-10/net runtime
.rsrc:0040246C                 add     esp, 4
.rsrc:0040246F                 mov     [ebp+out_buff], eax
.rsrc:00402472                 mov     ecx, [ebp+out_buff]
.rsrc:00402475                 mov     [ebp+out_buff_ret], ecx
.rsrc:00402478                 pusha
.rsrc:00402479                 mov     esi, [ebp+inbuff]
.rsrc:0040247C                 mov     edi, [ebp+out_buff_ret]
.rsrc:0040247F                 add     edi, [ebp+var_4]
.rsrc:00402482                 inc     edi
.rsrc:00402483                 mov     bx, [ebp+var_10]
.rsrc:00402487                 mov     ebp, [ebp+internal_size]
.rsrc:0040248A
.rsrc:0040248A loc_40248A:                             ; CODE XREF: unknown_enc+A3j
.rsrc:0040248A                 mov     dx, bx          ; size - 1
.rsrc:0040248D                 and     dx, 3
.rsrc:00402491                 mov     ax, 1C7h
.rsrc:00402495                 push    eax
.rsrc:00402496                 sahf
.rsrc:00402497                 jmp     short loc_4024BA ; al = inbuff[i]
.rsrc:00402499 ; ---------------------------------------------------------------------------
.rsrc:00402499
.rsrc:00402499 loc_402499:                             ; CODE XREF: unknown_enc+85j
.rsrc:00402499                 mov     [ebp+var_C], 0
.rsrc:004024A0                 mov     edx, [ebp+var_C]
.rsrc:004024A3                 add     edx, 0Ah
.rsrc:004024A6                 mov     [ebp+var_C], edx
.rsrc:004024A9                 mov     [ebp+var_18], 0Fh
.rsrc:004024B0                 mov     eax, [ebp+var_C]
.rsrc:004024B3                 imul    eax, [ebp+var_18]
.rsrc:004024B7                 mov     [ebp+var_C], eax
.rsrc:004024BA
.rsrc:004024BA loc_4024BA:                             ; CODE XREF: unknown_enc+57j
.rsrc:004024BA                 lodsb                   ; al = inbuff[i]
.rsrc:004024BB                 pushf
.rsrc:004024BC                 db      36h
.rsrc:004024BC                 xor     al, [esp+50h+var_4C] ; 0xc7
.rsrc:004024C1                 xchg    dl, cl          ; 60
.rsrc:004024C3                 jmp     short loc_4024C7
.rsrc:004024C5 ; ---------------------------------------------------------------------------
.rsrc:004024C5                 jmp     short loc_402499
.rsrc:004024C7 ; ---------------------------------------------------------------------------
.rsrc:004024C7
.rsrc:004024C7 loc_4024C7:                             ; CODE XREF: unknown_enc+83j
.rsrc:004024C7                 rol     ah, cl
.rsrc:004024C9                 popf
.rsrc:004024CA                 adc     al, ah          ; 1, 1, 2
.rsrc:004024CC                 xchg    dl, cl
.rsrc:004024CE                 xor     edx, edx
.rsrc:004024D0                 and     eax, 0FFh
.rsrc:004024D5                 add     bx, ax          ; result + size
.rsrc:004024D8                 stosb
.rsrc:004024D9                 mov     cx, dx
.rsrc:004024DC                 pop     eax
.rsrc:004024DD                 jecxz   short loc_4024E7
.rsrc:004024DF                 sub     edi, 2
.rsrc:004024E2                 dec     ebp
.rsrc:004024E3                 jnz     short loc_40248A ; size - 1
.rsrc:004024E5                 jmp     short loc_4024E9
.rsrc:004024E7 ; ---------------------------------------------------------------------------
.rsrc:004024E7
.rsrc:004024E7 loc_4024E7:                             ; CODE XREF: unknown_enc+9Dj
.rsrc:004024E7                 xor     eax, eax
.rsrc:004024E9
.rsrc:004024E9 loc_4024E9:                             ; CODE XREF: unknown_enc+A5j
.rsrc:004024E9                 popa
.rsrc:004024EA                 mov     eax, [ebp+out_buff_ret]
.rsrc:004024ED                 pop     edi
.rsrc:004024EE                 pop     esi
.rsrc:004024EF                 pop     ebx
.rsrc:004024F0                 mov     esp, ebp
.rsrc:004024F2                 pop     ebp
.rsrc:004024F3                 retn
.rsrc:004024F3 unknown_enc     endp

The encryption mechanism is a bit tricky (saving/restoring EFLAGS and using ADC while the carry flag is always set). After some messy prototypes, I wrote the following C code for the encryption phase:

void EncryptBuffer(char* input, char* output, int size)
{
    int i = 0;
    char internal_size = size - 1;

    do
    {
        output[size] = (input[i] ^ 0xc7) + __rol(1, (internal_size & 3)) + 1;
        internal_size += output[size];

        size--;
        i++;
    } while (size >= 0);
}

The hard part was writing the decryption phase of the algorithm: I had to reproduce the same state (the internal_size value) at each iteration to recover the right input. Again, after playing with it, I wrote the decryption algorithm as follows:

void DecryptBuffer(char* input, char* output, int size)
{
    int i = 0;
    char internal_size = size -1;

    do
    {
        output[i] = (input[size] - (__rol(1, (internal_size & 3)) + 1)) ^ 0xc7;
        internal_size += input[size];

        size--;
        i++;
    } while (size >= 0);
}
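
To make the state-tracking idea concrete, here is a minimal, self-contained round-trip sketch. It is not byte-exact to the binary: the names encrypt_buffer, decrypt_buffer and rol8 are mine, rol8 is an assumed 8-bit stand-in for __rol, and I assume the buffer holds exactly n bytes with the ciphertext written in reverse order. The point it illustrates is that the decryptor updates its running state with the ciphertext byte, exactly as the encryptor did, so both sides see the same state at every step.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate an 8-bit value left (assumed stand-in for the post's __rol). */
static uint8_t rol8(uint8_t v, unsigned n)
{
    n &= 7;
    return (uint8_t)((v << n) | (v >> ((8 - n) & 7)));
}

/* Assumption: n input bytes, output written back to front. */
static void encrypt_buffer(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t state = (uint8_t)(n - 1);

    for (size_t i = 0; i < n; i++) {
        size_t j = n - 1 - i;
        out[j] = (uint8_t)((in[i] ^ 0xc7) + rol8(1, state & 3) + 1);
        state = (uint8_t)(state + out[j]); /* state depends on ciphertext */
    }
}

/* Walks the ciphertext in the same order and rebuilds the same state. */
static void decrypt_buffer(const uint8_t *in, uint8_t *out, size_t n)
{
    uint8_t state = (uint8_t)(n - 1);

    for (size_t i = 0; i < n; i++) {
        size_t j = n - 1 - i;
        uint8_t t = (uint8_t)(in[j] - (rol8(1, state & 3) + 1));
        out[i] = (uint8_t)(t ^ 0xc7);
        state = (uint8_t)(state + in[j]); /* same update as the encryptor */
    }
}
```

Because the state is fed by ciphertext bytes only, the decryptor never needs the plaintext to stay in sync, which is what makes the inversion possible.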

The program uses this algorithm to encrypt 3 blocks of 50 bytes each, at offsets 0, 100 and 200 of the encrypted RSA key buffer (from the last AES step), and then writes the encrypted RSA key alongside the RSA-encrypted file content to result.txt. Getting the RSA key was just a matter of running the same encryption chain backwards. By doing this, I got the following RSA private key from the given result.txt:

-----BEGIN RSA PRIVATE KEY-----
MIIEowIBAAKCAQEApibgRXncbyC+YYYkJ63OmsN4H3YfROxKFIKaqV4EIplC5+bA
MdWS6qvCHpoPz3AvhUEXceySjybKIotIY7cbWhO8TANXsUzxsMmuBBXs4ILX8EZv
SiLv5Vva1af4MsXsqYbYM97BL7rOg4bHb8rM1RfmqC50yb+/akLHcCIUjycGFaRd
PNRAijh3QHE/ofr9RpRpROIlameuQEjDBLtYlQi1UxqCu9klmR2/py3Qr7wV20oQ
4o0OhxUrNp0Mzu2Zdg63lmSVWsLcIBzSIReygUTMA8qTZ+0Jl/xB2Vhp4vxLDy5e
WR3sbOD/Q97OXFbx7/F6LP5MNrg2uppR2+2N8QIDAQABAoIBAQCNxeReRBI83LK2
YpCdLuh5FEt+hPs/g2PexmaUGD3tC9uUJ0hd/YBkL3TvScQt2+sgiB8qPZP9BDs8
aJ63PznejbKBJeUAy8f7csvCfrbmB5+cTW2O0+rhSZSb9LyLDmnXadE3yV4MjRjE
EBBDKsfHGKLfZOyQbcY2NI8a9mmWj3RwoFcE5eTP9kqfTY6toEJLwI7VFkFX2XPI
Ottho2bxrlc86elKP5Rfir3hEnWU7nea6anvD1y7DADjXJyXoyBNtRDTHPkDvdhG
4plf7kAYp4Rc4sRPobde2aA37pqinLM6IkAhRhHiSg5V0C0vdBcahcvZ/QBbWJqB
krlt6FWlAoGBANjZb56LxOGoPmOSQcmsZi4UbS77kjk7a/GfdkoIEsHK2DJqLfUk
dXRjzl8o8pnXkdT1FVtCvElJzrcwjrP2tJ3TXbcmvMenbUtjM/PhUYjOsb8xozpE
tMBVSfyV00LoaWlVch18K8I/uG3a14K67MwzBnhcKWKbJrppE8zh2eZfAoGBAMQm
QGItEPS2iW0XzQ37HcHXmBoZ00v8Bx+sGDWWyts+GIAvUnS24skFmB9ILbvjxQ+7
IsrVu4VYCxeRJfSWd5TMcv8SNUAfHLCL26/upX/hnAEdCJoqJWXWcpsHJKgvIFks
KXsEpPxFtJNCpSjGtHB7npYzJHmAuBwkDWEf4c2vAoGAUZ10rz13ul6yLJOtgxQJ
2SoC9f3lSPkeZXBY+wAS3zFTMZZY+bzhIA84awRkWpaR4o7jnNd/Oi43SSdTblRa
IlSdHwPLZXGUZx1NPmr9Xvo8V/N8tb+KMCFpmVFik/oZQnXQX1yOs6t75IzLM/7a
hPhnZQF66gvvBZXqx9/xPQ0CgYBcyaOHTb5JpNfZrXqo9HOdMPmYz0KvHSfZibVi
FFUd5X/9k2U0JRee9HCDy8cmrJaZ3HKW9QhiCcYlfdowm8UxtI1psBlUneMaeO6R
iRjtJ7J+rFdXZjyOsiVAxN5IWRK6XDO7J/VMCUVkrBAo++Z7l17ruoG0oHl3hm51
1XkhrQKBgCIZmRxShE8rS1Y6JzTlToG+FHoBk3UqhzfvlVE/k7EfKucNVmviygTc
1TKbuzYDVsVLOqzjUj/I9SIcFRft8O26WGWfwFqptpBuq7akTS0so/VC5T4//lG5
zOEqsCGMWS4sitVN7/eC3dEYXfN5zoiCRIlAEEq9sngtKqa1d0mM
-----END RSA PRIVATE KEY-----

Decrypting the file content gave me the final email address, which was “padvish-quest@amnpardaz.com”.
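
For reference, undoing the three-block layer mentioned earlier (50-byte blocks at offsets 0, 100 and 200) can be sketched roughly as below. This is only an illustration under my own assumptions: undo_block_layer, block_fn and demo_xor_block are hypothetical names, the blocks are assumed to be transformed in place, and demo_xor_block is a placeholder transform, not the real cipher.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_LEN 50u

/* Any per-block routine with this shape can be plugged in (hypothetical). */
typedef void (*block_fn)(const uint8_t *in, uint8_t *out, size_t n);

/* Placeholder transform used only to exercise the loop; NOT the real cipher. */
static void demo_xor_block(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(in[i] ^ 0xFF);
}

/*
 * Apply `decrypt` to the 50-byte blocks at offsets 0, 100 and 200.
 * Returns 0 on success, -1 if the buffer is too short for the last block.
 */
static int undo_block_layer(uint8_t *buf, size_t len, block_fn decrypt)
{
    static const size_t offsets[] = { 0, 100, 200 };
    uint8_t tmp[BLOCK_LEN];

    if (len < 200 + BLOCK_LEN)
        return -1;

    for (size_t k = 0; k < sizeof offsets / sizeof offsets[0]; k++) {
        decrypt(buf + offsets[k], tmp, BLOCK_LEN);
        memcpy(buf + offsets[k], tmp, BLOCK_LEN);
    }
    return 0;
}
```

In the real chain the block routine would be the decryption function from the previous step, and the result would then go through the AES step in reverse.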

You can get the files from my GitHub account.

Defeating Driver Signing Enforcement, Not That Hard!

These days everybody talks about Driver Signing Enforcement and the ways we can bypass it. J00ru talked about the hard way, and I will tell you about the easy and long-known way. All we need is a signed, vulnerable x64 driver. As we know, loading drivers requires administrator privileges, but these days a normal user with the default UAC settings can silently gain admin privileges without a UAC dialog popping up.

The driver I am talking about is DCR from DriveCrypt. The x64 version is signed and is vulnerable to a write4 bug.

The latest version is no longer vulnerable, but this version still has a valid signature, and that’s enough.

I think it’s obvious that the whole process, escalating from normal user to admin in order to load the vulnerable driver (silently, with one of the UAC bypass methods) and then exploiting it, can be automated programmatically.

You can find the vulnerable version of the driver along with the exploit at “DriveCrypt\x64\Release“.

Introducing MCEDP HoneyClient

MCEDP is a High-Interaction Client Honeypot. Unlike other High-Interaction HoneyClients, which detect malicious servers based on system changes (file system and registry modifications, invoked/killed processes, …), MCEDP uses a new approach: it applies exploit detection methods to detect drive-by downloads at the exploitation stage and dump the malware file. With this approach, MCEDP eliminates some limitations of current HoneyClients and improves the detection speed of High-Interaction Client Honeypots.

More info plus download links…

Windows Kernel Intel x64 SYSRET Vulnerability + Code Signing Bypass Bonus

UPDATE: I’ve just tested the exploit on Windows 2008 R2 SP1 x64; the exploit works like a charm without any modification.

Hi again,

This time I worked in kernel land a little. The Microsoft Windows Kernel Intel x64 SYSRET Vulnerability (MS12-042) had apparently been exploited only by VUPEN, and no PoC or exploit was publicly available. So I decided to work on this challenge just for fun. At first glance it was difficult to get code execution, but after struggling with WinDbg several times I finally succeeded in triggering the bug and getting code execution.

By the way, WinDbg had a stupid bug when single-stepping over SWAPGS! I don’t really know why, but the guest VM always reboots!
I managed to get it to work with IDA Pro + the GDB remote debugging plugin after all!

So, anyway, here is the demonstration:

The shellcode disables code signing and grants NT SYSTEM privileges to a specified application or an already running process (by PID). After successfully running the exploit, I demonstrated installing an unsigned driver (which DbgPrints “Microsoft eats it own dog food” – http://en.wikipedia.org/wiki/Eating_your_own_dog_food) and granting NT SYSTEM privileges to cmd.exe.

*** WARNING: This is only a proof of concept. Although it is programmed to be very reliable, I won’t take any responsibility for any damage or abuse. Sorry kids!

Here is the source code.

Bypassing EMET 3.5’s ROP Mitigations

UPDATE: It seems MS was aware of this kind of bypass, so I bypassed EMET’s ROP mitigations using another EMET implementation mistake. The EMET team forgot about KernelBase.dll and left all of its functions unprotected, so I used @antic0de‘s method for finding the base address of kernelbase.dll at run time, then used the VirtualProtect inside kernelbase.dll, not ntdll.dll or kernel32.dll. You can get the new exploit at the end of this post.

I have managed to bypass EMET 3.5, which was recently released after the Microsoft BlueHat Prize, and wrote a fully functioning exploit for CVE-2011-1260 (I chose this CVE randomly!) with all of EMET’s ROP mitigations enabled.

http://support.microsoft.com/kb/2458544

Demo:

EMET’s ROP mitigations work by hooking certain APIs (like VirtualProtect) with the Shim Engine and monitoring their initialization. I used SHARED_USER_DATA, which is mapped at the fixed address “0x7FFE0000”, to find the KiFastSystemCall address (SystemCallStub at “0x7FFE0300”), so I could now call any syscall! By calling ZwProtectVirtualMemory’s syscall “0x0D7”, I made the shellcode’s memory RWX. After this step I could execute any instruction I wanted. But to execute the actual shellcode (with hooked APIs like “WinExec”) I patched EMET to deactivate it completely. BOOM! You can use both of these methods for generally bypassing EMET’s ROP mitigations in other exploits; all you need is to bypass ASLR.

Here is the asm code that deactivates EMET 3.5, and the actual exploit.