Obfuscating without XOR
Malicious files are generated and spread in the wild over the Internet daily (read: "hourly"). The goal of the attackers is to use files that are:
- not known to signature-based solutions
- not easy to read for the human eye
That’s why many obfuscation techniques exist to fool automated tools and security analysts. In most cases, it’s just a question of time before the obfuscated data is decoded. A classic technique is to use the XOR cipher[1]. This is definitely not a new technique (see a previous diary[2] from 2012) but it is still heavily used, and many tools can automate the search for XOR’d strings. Viper, the binary analysis and management framework, is a good example. It can scan for XOR'd strings easily:
viper tmpnYaBJs > xor -a
[*] Searching for the following strings:
- This Program
- GetSystemDirectory
- CreateFile
- IsBadReadPtr
- IsBadWritePtrGetProcAddress
- LoadLibrary
- WinExec
- CreateFileShellExecute
- CloseHandle
- UrlDownloadToFile
- GetTempPath
- ReadFile
- WriteFile
- SetFilePointer
- GetProcAddr
- VirtualAlloc
- http
[*] Hold on, this might take a while...
[*] Searching XOR
[!] Matched: http with key: 0x74
[*] Searching ROT
viper tmpnYaBJs >
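To illustrate what such a scan does in principle, here is a minimal sketch of a single-byte XOR brute force. It is not Viper's implementation: the helper name findXorKey and the sample buffer are hypothetical, and only one-byte keys are tried.

```javascript
// Minimal sketch: brute-force a single-byte XOR key by looking for a known
// plaintext marker. findXorKey is a hypothetical helper, not part of Viper.
function findXorKey(bytes, marker) {
    for (var key = 1; key < 256; key++) {
        var decoded = '';
        for (var i = 0; i < bytes.length; i++) {
            decoded += String.fromCharCode(bytes[i] ^ key);
        }
        if (decoded.indexOf(marker) !== -1) {
            return key;
        }
    }
    return -1; // no single-byte key reveals the marker
}

// Hypothetical sample: the string "http://example.test" XOR'd with 0x74.
var plain = 'http://example.test';
var sample = [];
for (var i = 0; i < plain.length; i++) {
    sample.push(plain.charCodeAt(i) ^ 0x74);
}

var key = findXorKey(sample, 'http');
console.log(key === -1 ? 'no key found' : 'key: 0x' + key.toString(16));
```

Key 0 is skipped on purpose because it would only match data that is already in clear text.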
Today, many JavaScript or VBS files implement other obfuscation techniques that do not rely on XOR. Yesterday, I found a sample that had such behaviour. A first quick analysis revealed that almost no string was in clear text in the source; a function call was used in place of regular strings, like:
var bcacfdfaebbbfDeck = new ActiveXObject(dbdbfaeefccaee('+L+^%^LK%,LpL(KeL^%z%+%u%u',1));
I took some time to check how the obfuscation was performed. How does it work?
Each character of the obfuscated string is looked up in the $data variable and its position is decreased by one (the step passed as second argument). The character found at this new position is appended to a string of hex codes. Finally, the hex codes are converted into the final string. Example with the first two characters of the string above (positions are counted from 0):
$data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";
- "+" is located at position 20; the character at position 19 (20 - 1) is "5"
- "L" is located at position 5; the character at position 4 (5 - 1) is "7"
- "57" is the hex code for "W"
- etc...
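Seen from the other side, a step of 1 means that each obfuscated character stands for the hex digit that precedes it in the data string, so the whole scheme boils down to a fixed table of sixteen symbols. The sketch below builds that table; the helper name buildTable is hypothetical, and wrap-around is not handled because, with a step of 1, no hex digit sits at the end of the data string.

```javascript
var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

// Map each obfuscated character to the hex digit it stands for, for a given step.
// buildTable is a hypothetical helper written for this illustration.
function buildTable(step) {
    var hexDigits = "0123456789ABCDEF";
    var table = {};
    for (var i = 0; i < hexDigits.length; i++) {
        var digit = hexDigits.charAt(i);
        var encoded = data.charAt(data.indexOf(digit) + step);
        table[encoded] = digit;
    }
    return table;
}

var table = buildTable(1);
var hexCode = table['+'] + table['L'];
console.log(hexCode);                                    // 57
console.log(String.fromCharCode(parseInt(hexCode, 16))); // W
```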
Here is the beautified code from the malicious file:
// Convert a string from hex chars to string.
// In: "575363726970742E7368656C6C"
// Out: "WScript.shell"
function hex2string(hexstring) {
    var bufferin = hexstring.toString();
    var bufferout = '';
    for (var i = 0; i < bufferin.length; i += 2)
        bufferout += String.fromCharCode(parseInt(bufferin.substr(i, 2), 16));
    return bufferout;
}

// Convert the obfuscated string by shifting each char by step positions
function deobfuscate(string, step) {
    var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";
    var bufferout = "";
    var l = data.length - 1;
    var size = string.length;
    for (var i = 0; i < size; i++) {
        var p = data.indexOf(string.charAt(i));
        var p2 = p - step;
        if (p2 < 0) {
            p2 = l - Math.abs(p2);
            var l2 = l - 1;
            if (p2 == l2) p2 = p2 + step;
        }
        bufferout = bufferout + data.charAt(p2);
    }
    // Convert to string
    return hex2string(bufferout);
}
This code:
var s = deobfuscate('%zL(L(Lp^2KNKN^P^z^+Ke^P^+^(Ke^+^KKe^P^p^PKN%u%N%L%NKe%,%0%L',1);
WScript.Echo(s);
Returns (URL defanged here):
hxxp://185.154.52.101/logo.img
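Once the routine is understood, every obfuscated call in a script can be replaced by its clear-text value in one pass. The following is a minimal sketch under stated assumptions: decode is a compact re-implementation that omits the wrap-around branch (not needed when the intermediate string contains only hex digits), inlineStrings is a hypothetical helper name, and the function name passed in is assumed to contain no regular-expression metacharacters.

```javascript
var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

// Compact decoder (assumption: no wrap-around is needed for the input).
function decode(str, step) {
    var hex = '';
    for (var i = 0; i < str.length; i++) {
        hex += data.charAt(data.indexOf(str.charAt(i)) - step);
    }
    var out = '';
    for (var j = 0; j < hex.length; j += 2) {
        out += String.fromCharCode(parseInt(hex.substr(j, 2), 16));
    }
    return out;
}

// Replace every call of the form name('...',N) with the decoded string literal.
// inlineStrings is a hypothetical helper written for this illustration.
function inlineStrings(source, funcName) {
    var pattern = new RegExp(funcName + "\\('([^']*)',\\s*(\\d+)\\)", 'g');
    return source.replace(pattern, function (match, encoded, step) {
        return JSON.stringify(decode(encoded, parseInt(step, 10)));
    });
}

var line = "var bcacfdfaebbbfDeck = new ActiveXObject(dbdbfaeefccaee('+L+^%^LK%,LpL(KeL^%z%+%u%u',1));";
console.log(inlineStrings(line, 'dbdbfaeefccaee'));
// var bcacfdfaebbbfDeck = new ActiveXObject("WScript.shell");
```

Working on the source text with a regular expression, rather than executing the sample, keeps the malicious code from ever running.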
Once you understand how to deobfuscate, it’s easy to write the opposite function. I quickly wrote a function to obfuscate any string with the same technique:
function obfuscate(string, step) {
    var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";
    var bufferout = "";
    var l = data.length - 1;
    var size = string.length;
    for (var i = 0; i < size; i++) {
        // Uppercase hex code of the character (assumes codes from 0x10 to 0xFF)
        var hvalue = Number(string.charCodeAt(i)).toString(16).toUpperCase();
        for (var j = 0; j < 2; j++) {
            var p = data.indexOf(hvalue.charAt(j));
            var p2 = p + step;
            // Wrap around when shifting past the end of data
            if (p2 > l) p2 = p2 - data.length;
            bufferout = bufferout + data.charAt(p2);
        }
    }
    return bufferout;
}
This code:
var foo = obfuscate("https://isc.sans.edu", 1);
WScript.Echo(foo);
Returns:
%zL(L(LpL^^2KNKN%,L^%^KeL^%P%eL^Ke%+%(L+
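A quick way to gain confidence in such a pair of functions is a round-trip check. The sketch below uses compact re-implementations with hypothetical names (encode and decode), restricted to single-byte characters and to the case seen in this sample, where the intermediate string contains only hex digits. It also pads each character code to two hex digits so that characters below 0x10 survive the round trip.

```javascript
var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

// Hex-encode each character, then shift every hex digit forward in data.
function encode(str, step) {
    var out = '';
    for (var i = 0; i < str.length; i++) {
        var hex = str.charCodeAt(i).toString(16).toUpperCase();
        if (hex.length < 2) hex = '0' + hex; // pad codes below 0x10
        for (var j = 0; j < hex.length; j++) {
            out += data.charAt(data.indexOf(hex.charAt(j)) + step);
        }
    }
    return out;
}

// Shift every character back in data, then convert the hex pairs to text.
function decode(str, step) {
    var hex = '';
    for (var i = 0; i < str.length; i++) {
        hex += data.charAt(data.indexOf(str.charAt(i)) - step);
    }
    var out = '';
    for (var j = 0; j < hex.length; j += 2) {
        out += String.fromCharCode(parseInt(hex.substr(j, 2), 16));
    }
    return out;
}

var original = "https://isc.sans.edu";
var encoded = encode(original, 1);
console.log(encoded);
console.log(decode(encoded, 1) === original); // true
```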
Of course, the method analyzed here is just one example. The number of ways to obfuscate data is unlimited...
[1] https://en.wikipedia.org/wiki/XOR_cipher
[2] https://isc.sans.edu/forums/diary/Decoding+Common+XOR+Obfuscation+in+Malicious+Code/13354
Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant
Comments
Just the same, it is a general method with a long history of use in manual ciphers. It is a polyalphabetic substitution cipher. The Wikipedia article says, "The Alberti cipher by Leon Battista Alberti around 1467 is believed to be the first polyalphabetic cipher." Yet I hadn't before read of it being used in malware.
Dick
Anonymous
Jun 23rd 2017