Overview
Disclaimer
First Analysis - Discovering the Platform
Upon installing the software, I began my analysis by examining the main executable with a hex editor. This preliminary look revealed that the file did not contain the plain x86/x64 machine code typically found in native (unmanaged) executables. Instead, it appeared to be either packed or compiled to managed code, which looks very different from a traditional executable.
For context, a typical x86/x64 executable includes easily identifiable elements such as function calls, function prologues and epilogues (the entry and exit sequences of functions), and INT 3 (0xCC) instructions, which serve as debugging traps and are often used as padding between functions.
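The kind of byte patterns described above can also be counted programmatically. The following is a rough sketch, not a reliable classifier: the function name `native_code_hints` is hypothetical, and the patterns are only the common 32-bit `push ebp; mov ebp, esp` prologue in its two encodings plus runs of INT 3 bytes.

```python
def native_code_hints(data: bytes) -> dict:
    """Count a few byte patterns that are common in native x86 code.

    High counts suggest native code; near-zero counts are consistent with
    a managed or packed file. This is a heuristic only.
    """
    return {
        # push ebp; mov ebp, esp (encoding usually emitted by MSVC)
        "prologue_8bec": data.count(b"\x55\x8b\xec"),
        # push ebp; mov ebp, esp (alternative encoding, common with GCC)
        "prologue_89e5": data.count(b"\x55\x89\xe5"),
        # runs of four INT 3 bytes, counted without overlap
        "int3_runs": data.count(b"\xcc" * 4),
    }


if __name__ == "__main__":
    # Tiny synthetic sample: one prologue, a NOP, a RET, then INT 3 padding.
    sample = b"\x55\x8b\xec\x90\xc3" + b"\xcc" * 8
    print(native_code_hints(sample))
```

On a real file, the input would be the raw bytes of the code section rather than a hand-built sample.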
To illustrate the contrast, let's compare a hex view of the file in question with that of a standard executable, highlighting the absence of these elements in the managed code.
And for comparison:
This hypothesis was quickly confirmed by examining the program's imports using dumpbin, a process which can reveal much about the nature of an executable.
The critical observation here was the sole import of mscoree.dll:_CorExeMain. This detail is a definitive indicator of a .NET application, as _CorExeMain is the entry point used by .NET executables to bootstrap the managed execution environment.
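The import table is not the only indicator. A complementary check, different from the dumpbin inspection used here, is to look at the PE data directories: entry 14 is the CLR runtime header, and it is non-empty for .NET assemblies. The sketch below assumes a well-formed PE file; the function name `is_dotnet_pe` is hypothetical.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data directory slot of the CLR runtime header


def is_dotnet_pe(data: bytes) -> bool:
    """Return True if the PE image has a non-empty CLR runtime header entry."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return False
    opt = pe_offset + 4 + 20  # skip PE signature and COFF file header
    if len(data) < opt + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        return False
    entry = dirs + CLR_DIRECTORY_INDEX * 8
    if len(data) < entry + 8:
        return False
    (count,) = struct.unpack_from("<I", data, dirs - 4)  # NumberOfRvaAndSizes
    if count <= CLR_DIRECTORY_INDEX:
        return False
    rva, size = struct.unpack_from("<II", data, entry)
    return rva != 0 and size != 0
```

It would typically be called on the file contents, for example `is_dotnet_pe(open(path, "rb").read())`.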
This discovery set the direction for the rest of the analysis: we are dealing with a .NET-based tool.
Anti-Code Extraction Protection
.NET applications compile code into CIL (Common Intermediate Language), providing a layer of abstraction that can sometimes simplify reverse engineering efforts. Anticipating this, my initial approach was to utilize a .NET code extraction tool like ILSpy to decompile the application. However, it quickly became evident that the application was fortified with anti-reverse engineering techniques that ILSpy could not circumvent.
With ILSpy unable to penetrate the application's defenses, I turned to another decompilation tool, dotPeek, for a second attempt. This endeavor was more successful; dotPeek enabled me to browse the application's structure, revealing its classes and, importantly, the specific class that had piqued my interest.
Although dotPeek granted access to the application's internals, the code I found was heavily obfuscated, complicating any direct analysis within the tool. To tackle this obfuscation more effectively, I exported the code for a more detailed examination using Visual Studio. This environment would provide the enhanced analysis capabilities needed to navigate through the obfuscated code.
Bypassing Special UTF-8 Characters
Successfully exporting the code was a significant step forward, but it immediately surfaced two new challenges: the presence of special UTF-8 characters in both filenames and within the code itself. These characters thwarted Visual Studio's ability to correctly recognize and interpret symbols.
Internally, .NET assemblies reference symbols through metadata tokens rather than by name, which meant that simply renaming the offending symbols to printable strings would fix the problem without breaking any references. An online search for a suitable tool led me to de4dot, an open-source .NET deobfuscator that can, among other things, rename such symbols to readable names, making it a good fit for the problem at hand.
Applying de4dot to my program, I was able to replace the problematic UTF-8 characters. Following this correction, I once again utilized dotPeek to load the binary and then exported it to a C# project, now cleansed of the troublesome UTF-8 characters.
With the UTF-8 obstructions cleared, the code was ready for a deeper analysis. The elimination of these special characters not only made the code more accessible but also paved the way for a more straightforward reverse engineering process.
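The renaming idea itself is simple and can be sketched in a few lines. This is only an illustration of the principle, not de4dot's actual algorithm: the helper names `is_readable` and `build_rename_map` and the `Class<N>` naming scheme are assumptions.

```python
def is_readable(name: str) -> bool:
    """A name is kept if it is plain ASCII and a valid identifier."""
    return name.isascii() and name.isidentifier()


def build_rename_map(names: list, prefix: str = "Class") -> dict:
    """Map every unreadable symbol name to a fresh printable one.

    Readable names are left alone, and generated names never collide
    with a name that already exists.
    """
    taken = {n for n in names if is_readable(n)}
    mapping = {}
    counter = 0
    for name in names:
        if name in mapping or is_readable(name):
            continue
        while f"{prefix}{counter}" in taken:
            counter += 1
        mapping[name] = f"{prefix}{counter}"
        taken.add(mapping[name])
    return mapping


if __name__ == "__main__":
    symbols = ["Program", "\u200b\u202e", "\u0001", "Class0"]
    for old, new in build_rename_map(symbols).items():
        print(repr(old), "->", new)
```

Because references go through tokens, applying such a map only changes the name strings; the code that uses the symbols does not need to be touched.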
Constants Obfuscation
While inspecting a particular decryption class, I noticed an absence of direct constants in the code. Instead, there were numerous calls to a so-called Class170, which piqued my curiosity.
A deeper investigation into this mysterious class revealed several intriguing characteristics:
- It contained various vectors of different data types, including int, long, float, and double.
- The class was responsible for reading and decrypting the DaEi embedded resource, subsequently populating these vectors with the decrypted data.
- The smethod_* functions within the class retrieve an element at a specified index from its corresponding vector, effectively replacing direct constant references in the code.
This approach indicated that the obfuscation technique involved aggregating all constants from the code into vectors, then substituting the original references with calls to these vector-access methods using appropriate indices.
The same obfuscation strategy was applied to strings, rendering them impervious to straightforward extraction via string dumping tools.
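To make the scheme concrete, here is a toy model of such a constant pool. It is a sketch under loud assumptions: the single-byte XOR "encryption", the resource layout, and the names `ConstantTable`, `get_int`, `get_double` and `build_resource` are all invented for illustration and do not reflect the real Class170.

```python
import struct


class ConstantTable:
    """Toy constant pool: decrypts a blob once, then serves values by index."""

    def __init__(self, encrypted_resource: bytes, key: int):
        # Assumed scheme: single-byte XOR. The real decryption is more complex.
        plain = bytes(b ^ key for b in encrypted_resource)
        n_ints, n_doubles = struct.unpack_from("<HH", plain, 0)
        offset = 4
        self._ints = list(struct.unpack_from(f"<{n_ints}i", plain, offset))
        offset += 4 * n_ints
        self._doubles = list(struct.unpack_from(f"<{n_doubles}d", plain, offset))

    # Accessors play the role of the smethod_* functions.
    def get_int(self, index: int) -> int:
        return self._ints[index]

    def get_double(self, index: int) -> float:
        return self._doubles[index]


def build_resource(ints, doubles, key: int) -> bytes:
    """What the obfuscator would do at build time (hypothetical)."""
    plain = struct.pack(f"<HH{len(ints)}i{len(doubles)}d",
                        len(ints), len(doubles), *ints, *doubles)
    return bytes(b ^ key for b in plain)


if __name__ == "__main__":
    table = ConstantTable(build_resource([16, 4096], [0.5], 0x5A), 0x5A)
    # Obfuscated code calls get_int(1) where the original said 4096.
    print(table.get_int(1), table.get_double(0))
```

The point of the model is that no literal appears at the call site: the reader sees only an index, and the value exists only after the pool has been decrypted at runtime.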
Given the complexity of the decryption process observed in Class170, I realized that a purely static analysis would be too time-consuming a way to unravel the obfuscated constants. This led me to look for a dynamic approach instead, and I settled on dnSpy, a powerful .NET debugger, to aid in the decryption efforts.
Bypassing Anti-Debug Techniques
Upon opening the program in dnSpy, I set a breakpoint with the intention of stepping through the network packet decryption process. However, my progress was halted almost immediately by an exception.
A thorough examination of the call stack led me to the culprit: a failed check "class2.method_0()", situated before the troubling call to smethod2. Delving into class2.method_0, I discovered obfuscated code that executed Process.GetProcessById, suggesting its role as an anti-debug mechanism designed to detect and thwart debugging attempts by throwing an exception.
There are generally two strategies to circumvent such anti-debug checks: either bypass the check code by setting a breakpoint just before the check and then jumping to the next instruction upon hitting it, or modify the code to remove the check and patch the executable. Unfortunately, neither approach proved effective in this case due to the check's repetitive invocation within a loop and the program corruption resulting from patching, likely due to additional anti-reverse engineering measures.
Further investigation revealed that the problematic code was executed in a separate thread from the main application thread. By simply freezing this worker thread, I allowed the main thread to proceed unimpeded.
This maneuver successfully bypassed the debugging check, but I was soon met with another exception.
Closer inspection revealed yet another anti-debug trap. Fortunately, this one was simpler to navigate: by setting a breakpoint at line 66 and adjusting the value of num6 to match num7, I was able to circumvent the issue.
With these anti-debugging measures neutralized, the program resumed normal operation, allowing for the decryption of constants and their observation through the watch window.
This process revealed the actual constants used within the code, enabling me to replace the indirect function calls with their direct values. The code suddenly became much clearer and more understandable.
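Once the values have been read from the watch window, the substitution can be scripted rather than done by hand. The following is a minimal sketch, assuming the calls appear in the exported source as `Class170.smethod_N(index)` with a literal integer index; the method numbers and values in the example are made up.

```python
import re

# Matches calls such as Class170.smethod_4(12) with a literal index.
CALL = re.compile(r"Class170\.(smethod_\d+)\((\d+)\)")


def inline_constants(source: str, observed: dict) -> str:
    """Replace constant-pool calls with the values seen in the debugger.

    `observed` maps (method name, index) to the literal text to insert.
    Calls that were never observed are left untouched.
    """
    def replace(match: re.Match) -> str:
        key = (match.group(1), int(match.group(2)))
        return observed.get(key, match.group(0))

    return CALL.sub(replace, source)


if __name__ == "__main__":
    code = ("int size = Class170.smethod_4(12); "
            "string alg = Class170.smethod_6(3);")
    values = {("smethod_4", 12): "16", ("smethod_6", 3): '"AES"'}
    print(inline_constants(code, values))
```

Leaving unknown calls in place keeps the rewrite safe: anything not yet observed remains visibly obfuscated instead of being replaced with a guess.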
Bypassing the Delegates
In the obfuscated code, we encounter the use of delegate functions, essentially functioning as runtime-set function pointers. This obfuscation technique conceals the identity of the actual functions being called, further complicating the reverse engineering process.
However, the use of a debugger significantly simplifies the task of uncovering the true nature of these delegates. By placing a breakpoint on the delegate invocation and stepping into the execution with F11, it's possible to trace directly to the functions these delegates represent.
Through this method, I discovered that:
- Delegate175.smethod_0 is actually invoking public static LogManager.GetLogger(string name)
- Delegate2624.smethod_0 resolves to public static byte[] GetBytes(this string self)
Substituting these delegate calls with their corresponding functions in the code markedly improves its readability and comprehensibility.
By replacing obscure delegate calls with the actual method names and functionalities they encapsulate, we peel away another layer of obfuscation, bringing us closer to the original, unobfuscated logic of the application.
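The indirection can be modelled in a few lines. This is a Python analogue, not the decompiled C#: the `LateBound` base class and the `describe` helper are hypothetical, the class merely borrows the name Delegate175, and Python's `logging.getLogger` stands in for the .NET logger lookup.

```python
import logging


class LateBound:
    """Holder whose target is assigned at runtime, like a delegate field."""
    target = None

    @classmethod
    def invoke(cls, *args):
        # The call site only ever sees this opaque indirection.
        return cls.target(*args)


class Delegate175(LateBound):
    pass


# Somewhere during start-up the obfuscator's runtime wires up the target.
Delegate175.target = logging.getLogger


def describe(holder) -> str:
    """What stepping into the call reveals: the real function behind it."""
    fn = holder.target
    return f"{fn.__module__}.{fn.__qualname__}"


if __name__ == "__main__":
    logger = Delegate175.invoke("network")   # opaque at the call site
    print(describe(Delegate175))             # the function actually called
    print(logger.name)
```

Static reading of the call site says nothing about the target, which is why resolving it at runtime, as done with the debugger, is the quickest route.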
Code Flow Obfuscation
After addressing the obfuscation related to constants and delegate functions, the code structure remained perplexing due to its scrambled execution flow. This obfuscation technique disrupts the natural sequence of operations, making the code difficult to follow at first glance.
Consider an original, straightforward sequence of operations:
The obfuscated structure not only rearranges the original lines of code but also incorporates misleading operations, including unnecessary arithmetic and switch cases that are never executed, to further obscure the logic.
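Since the pattern is easier to recognise once seen, here is a small self-contained model of it. The function is invented for illustration and is unrelated to the application's real code; it shows the same logic once in plain form and once flattened into a dispatcher with shuffled states, opaque arithmetic and a dead branch.

```python
def checksum_plain(data: bytes) -> int:
    """Straightforward version: sum the bytes, then mask the result."""
    total = 0
    for b in data:
        total = (total + b) & 0xFF
    return total ^ 0x5A


def checksum_flattened(data: bytes) -> int:
    """Same logic, flattened into a state machine with junk added."""
    state = 3
    total = i = 0
    junk = 7
    while True:
        if state == 3:                    # real: initialisation
            total, i = 0, 0
            state = junk * 2 - 13         # opaque arithmetic, always 1
        elif state == 1:                  # real: loop condition
            state = 4 if i < len(data) else 2
        elif state == 4:                  # real: loop body
            total = (total + data[i]) & 0xFF
            i += 1
            state = 1
        elif state == 2:                  # real: final step
            return total ^ 0x5A
        else:                             # dead: no path ever reaches this
            total = -1
            state = 3


if __name__ == "__main__":
    packet = b"hello"
    print(checksum_plain(packet) == checksum_flattened(packet))
```

The state numbers carry no meaning of their own; only the transitions between them encode the original order of operations.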
To revert this obfuscated code back to its original, understandable form, one can employ either a static analysis, painstakingly tracing the code's flow to discern the genuine execution order, or adopt a more dynamic approach using debugging tools.
My strategy involved setting breakpoints on each case statement that seemed unaltered by the flow obfuscation. By running the code and observing the sequence in which these breakpoints were triggered, I could deduce the original execution order.
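The breakpoint approach amounts to recording which dispatcher cases run, and in what order. A minimal sketch of that idea follows, with each case modelled as a function that returns the next state; the block functions and state numbers are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Block = Callable[[dict], Optional[int]]


def run_with_trace(blocks: Dict[int, Block], start: int, env: dict) -> List[int]:
    """Run the dispatcher and log every state, like a breakpoint per case."""
    trace: List[int] = []
    state: Optional[int] = start
    while state is not None:
        trace.append(state)
        state = blocks[state](env)
    return trace


def first_hit_order(trace: List[int]) -> List[int]:
    """Order in which each case was first reached."""
    seen: List[int] = []
    for state in trace:
        if state not in seen:
            seen.append(state)
    return seen


# Hypothetical flattened routine: XOR a buffer with a key.
def init(env: dict) -> Optional[int]:
    env["key"] = 0x21
    return 2


def xor_buffer(env: dict) -> Optional[int]:
    env["buf"] = bytes(b ^ env["key"] for b in env["data"])
    return 5


def finish(env: dict) -> Optional[int]:
    env["done"] = True
    return None


def decoy(env: dict) -> Optional[int]:
    env["buf"] = b""
    return 7


if __name__ == "__main__":
    blocks = {2: xor_buffer, 5: finish, 7: init, 9: decoy}
    trace = run_with_trace(blocks, start=7, env={"data": b"abc"})
    print("execution order:", first_hit_order(trace))
    print("never executed:", sorted(set(blocks) - set(trace)))
```

Cases that never appear in the trace are candidates for dead code, although a single run only proves they were not reached for that particular input.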
With the actual execution flow unveiled and after resolving references to Class170 and delegate functions, the genuine, unobfuscated code emerged:
Though the code's structure may still pose some challenges to comprehension, this clarified version enables the use of the rewritten class to decrypt data packets. By treating the encryption mechanism as a "black box," we can now focus on the practical application of the decrypted outputs, sidestepping the need for a granular understanding of the obfuscation techniques initially employed.