Jekyll 2024-12-29T14:45:37+01:00 https://letsreverse.net// let’s reverse Blog about software reverse-engineering and hardware hacking. Mainly focused on Windows, file formats, computer games, mobile applications and IoT devices. Calling Game Functions from Your Own Executable 2024-12-26T00:33:00+01:00 2024-12-26T00:33:00+01:00 https://letsreverse.net/2024/12/26/calling-game-function-from-your-own-executable <p>Programmers write functions all the time, but these days, they rarely rely solely on their own code. In fact, a vast number of functions in most programs are reused from other developers’ work. This is thanks to standard libraries, third-party libraries, and frameworks. These resources often come with clean documentation, helpful README files, and instructions on how to build, install, and use them. They’re incredibly useful and let developers get their projects off the ground quickly.</p> <p>But have you ever wondered how you could call a function embedded in someone else’s program? For example, what if you wanted to move your character in a game, draw a new GUI object, or print something to the in-game chat—directly from your own executable? Since programs and games are often closed-source, they don’t provide documentation listing their functions. Fortunately, this is where reverse engineering steps in to save the day.</p> <p>In a <a href="/2024/04/01/unveiling-hidden-gems-exploring-easter-eggs/">previous post</a> we demonstrated how to find a function responsible for parsing commands and, as a side effect, identified a function that prints text to the console. 
If you haven’t read that post yet, it might be a good idea to check it out before proceeding with this one.</p> <h2 id="understanding-function-calls-and-calling-conventions">Understanding Function Calls and Calling Conventions</h2> <p>When calling a function in your own program, as a programmer, you need to know a few details:</p> <ul> <li>The name of the function</li> <li>The parameters the function accepts</li> <li>The return type (optional but helpful)</li> </ul> <p>However, when dealing with a third-party application, things get more interesting. During compilation, information such as types and function names is often lost (assuming no debug symbols are present and the program is compiled to native machine code). This makes it impossible to call the function by name. Instead, you will need a slightly different set of details:</p> <ul> <li>The function’s location (its address)</li> <li>The parameters the function accepts and the calling convention</li> <li>The return type (and the calling convention to know where the return value is located)</li> </ul> <p>Two questions might come to mind: what is a calling convention, and why is it enough to use just the function name in your own code, without needing the function’s location?</p> <p>When you call a function that accepts parameters, there must be an agreed-upon method for the caller to pass those parameters to the callee. This agreement is called the calling convention. For example, parameters can be passed by pushing them onto the stack, either from left to right or right to left. Another question arises: who cleans the stack after the function executes, the caller or the callee? Parameters may also be passed through registers, depending on the strategy. 
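</p> <p>To make the stack part of such an agreement more tangible, here is a minimal sketch that models the stack as a <code class="language-plaintext highlighter-rouge">std::vector</code>. It is only a toy model, not real machine code: the return address is ignored, and the names <code class="language-plaintext highlighter-rouge">Stack</code>, <code class="language-plaintext highlighter-rouge">push_arguments</code>, <code class="language-plaintext highlighter-rouge">subtract_callee_cleans</code> and <code class="language-plaintext highlighter-rouge">subtract_caller_cleans</code> are made up for this illustration.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of the stack: the back of the vector is the top of the stack.
using Stack = std::vector<std::uint32_t>;

// The caller pushes the arguments of subtract(a, b) from right to left,
// so `a` ends up on top of the stack.
void push_arguments(Stack& stack, std::uint32_t a, std::uint32_t b) {
    stack.push_back(b);
    stack.push_back(a);
}

// Callee-cleanup flavour: the function removes its own arguments
// from the stack before it returns.
std::uint32_t subtract_callee_cleans(Stack& stack) {
    assert(stack.size() >= 2);
    const std::uint32_t a = stack[stack.size() - 1];
    const std::uint32_t b = stack[stack.size() - 2];
    stack.resize(stack.size() - 2); // callee pops both arguments
    return a - b;                   // on x86 the result would travel back in EAX
}

// Caller-cleanup flavour: the function leaves the arguments in place,
// and the caller is expected to remove them after the call.
std::uint32_t subtract_caller_cleans(const Stack& stack) {
    assert(stack.size() >= 2);
    const std::uint32_t a = stack[stack.size() - 1];
    const std::uint32_t b = stack[stack.size() - 2];
    return a - b;
}
```

<p>After <code class="language-plaintext highlighter-rouge">push_arguments(stack, 10, 3)</code>, the callee-cleanup variant returns 7 and leaves the stack empty, while the caller-cleanup variant returns 7 and leaves both values on the stack for the caller to remove. If the two sides disagree about who cleans up, the stack ends up unbalanced, which is why the convention matters.</p> <p>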
Similarly, the function’s return value must have a designated location, such as a specific register or memory location.</p> <p>Respecting the calling convention is critical so the callee knows how to interpret the parameters passed by the caller and where to place the return value. When writing your own application, you don’t usually need to worry about this, because the compiler takes care of it. It ensures that the caller adheres to the calling convention expected by the callee.</p> <p>Similarly, you don’t need to know the function’s address when calling it from your own program because the function name is automatically resolved to its address by the linker.</p> <h2 id="obtaining-required-information">Obtaining required information</h2> <p>First, we need to gather information about the function we want to call. The most important piece is the function’s location. Whether it is a function to attack a monster, collect an item or print a message to the console, the first step is to pinpoint its place in the assembly code.</p> <p>As mentioned earlier, in this post, we are focusing on the function that prints a message to the console. The steps to locate this function were already covered in the <a href="/2024/04/01/unveiling-hidden-gems-exploring-easter-eggs/">previous post</a>, so we will skip them here.</p> <p>As a side note, for this particular game the address of this function will remain the same no matter how many times you restart the game (assuming you are using the same game binary). This is because the operating system always loads the game at the same address, <code class="language-plaintext highlighter-rouge">0x00400000</code>, which is the default base address for x86 Windows executables. If you are curious why this happens, you can learn more <a href="https://devblogs.microsoft.com/oldnewthing/20141003-00/?p=43923">here</a>. 
For modern applications, however, a mechanism known as <a href="https://en.wikipedia.org/wiki/Address_space_layout_randomization">ASLR</a> (Address Space Layout Randomization) is used, which loads the program image at a different base address on each run or system restart. This mechanism won’t be covered here.</p> <p>The prologue of the function we are interested in looks like this:</p> <p><img src="/assets/images/kk_print/print_assembly.png" alt="Print to console assembly" class="centered" /></p> <p>Now we need to determine the number of parameters the function accepts, their types, and the calling convention it uses. Thanks to modern decompilers like Ghidra or IDA, this is relatively straightforward for less complex functions. Take a look at the decompilation output:</p> <p><img src="/assets/images/kk_print/print_decompilation.png" alt="Print to console" class="centered" /></p> <p>As you can see, Ghidra deduced that the function accepts two parameters (a <code class="language-plaintext highlighter-rouge">this</code> pointer and a pointer of undefined type) and uses the <code class="language-plaintext highlighter-rouge">__thiscall</code> calling convention. Additionally, the function doesn’t return anything, so the return type is <code class="language-plaintext highlighter-rouge">void</code>. While modern tools are fantastic for simplifying our work, this is an educational post, so it’s worthwhile to understand how to reach similar conclusions manually.</p> <p>To start, it’s helpful to examine how the function is called: <img src="/assets/images/kk_print/print_call.png" alt="Assembly code to call the print to console" class="centered" /> There are multiple calls to this function, but they mostly follow the same pattern. A PUSH instruction places a pointer to the text to be printed onto the stack, and an address (0x0789d58) is moved into the <code class="language-plaintext highlighter-rouge">ECX</code> register just before the call. 
After the function executes, the caller doesn’t clean the stack, suggesting the callee handles it.</p> <p>From this, we can deduce that the function accepts two parameters: the message to print (pushed onto the stack) and a mysterious address (placed in the <code class="language-plaintext highlighter-rouge">ECX</code> register). Let’s explore the <a href="https://en.wikipedia.org/wiki/X86_calling_conventions">most common x86 calling conventions</a>.</p> <p><strong>thiscall</strong>:</p> <p><code class="language-plaintext highlighter-rouge">On the Microsoft Visual C++ compiler, the this pointer is passed in ECX and it is the callee that cleans the stack, mirroring the stdcall convention used in C for this compiler and in Windows API functions. When functions use a variable number of arguments, it is the caller that cleans the stack (cf. cdecl). </code></p> <p>and for <strong>stdcall</strong>:</p> <p><code class="language-plaintext highlighter-rouge">The stdcall calling convention is a variation on the Pascal calling convention in which the callee is responsible for cleaning up the stack, but the parameters are pushed onto the stack in right-to-left order, as in the _cdecl calling convention. Registers EAX, ECX, and EDX are designated for use within the function. Return values are stored in the EAX register. </code></p> <p>It seems that Ghidra was indeed correct in deducing the calling convention! One last thing to verify is whether the callee cleans the stack.</p> <p><img src="/assets/images/kk_print/print_ret.png" alt="End of the callee" class="centered" /></p> <p>Looking at the last instruction of the callee, we see ret 4. This indicates that after executing this instruction, 4 bytes will be added to the stack pointer, effectively cleaning up the 4 bytes pushed by the caller.</p> <p>You might be wondering what the <code class="language-plaintext highlighter-rouge">this</code> parameter is and why it’s passed to the function as a parameter. 
If you’ve ever used a C-like language that supports object-oriented programming, you’re probably familiar with the concept. Here’s a quick recap:</p> <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="k">class</span> <span class="nc">Cat</span> <span class="p">{</span>
<span class="nl">private:</span>
    <span class="kt">int</span> <span class="n">age</span><span class="p">;</span>
<span class="nl">public:</span>
    <span class="n">Cat</span><span class="p">(</span><span class="kt">int</span> <span class="n">_age</span><span class="p">)</span> <span class="o">:</span> <span class="n">age</span><span class="p">{</span><span class="n">_age</span><span class="p">}</span> <span class="p">{}</span>
    <span class="kt">int</span> <span class="nf">getAge</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">return</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">age</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Cat</span> <span class="n">cat</span><span class="p">{</span><span class="mi">15</span><span class="p">};</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">cat</span><span class="p">.</span><span class="n">getAge</span><span class="p">();</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div> <p>This is a simple C++ program that constructs a <code class="language-plaintext highlighter-rouge">Cat</code> object, sets its age to <code class="language-plaintext highlighter-rouge">15</code>, and then prints the age to the standard output. Nothing too fancy. 
Take a look at the <code class="language-plaintext highlighter-rouge">this-&gt;age</code> expression. In C++, you can use <code class="language-plaintext highlighter-rouge">this</code> inside a class member function to get a pointer to the current class instance, even though it is not explicitly passed as a parameter. This is thanks to the magic of the compiler. On the assembly level the <code class="language-plaintext highlighter-rouge">getAge</code> function might look like this:</p> <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">getAge</span><span class="p">(</span><span class="n">Cat</span><span class="o">*</span> <span class="k">this</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">age</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>And the invocation of the function would look like this:</p> <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="n">Cat</span><span class="o">::</span><span class="n">getAge</span><span class="p">(</span><span class="o">&amp;</span><span class="n">cat</span><span class="p">);</span> </code></pre></div></div> <p>As you can see, even though the <code class="language-plaintext highlighter-rouge">this</code> parameter isn’t explicitly passed in your code, the compiler automatically includes it. That’s why on the assembly level, it looks like the parameter is passed explicitly.</p> <h2 id="call-the-function">Call the function</h2> <p>Armed with the information we gathered earlier, we can finally attempt to call the function. 
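</p> <p>Before involving the game, the core idea can be tried in isolation. The sketch below is portable and deliberately simplified: <code class="language-plaintext highlighter-rouge">target_length</code> stands in for a function inside someone else’s binary, and its address is obtained from the compiler rather than from a disassembler. All names here are hypothetical, and no particular calling convention is specified, so the compiler’s default applies.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for a function that lives in a third-party binary.
// In the real scenario we would only know its address, not its source.
int target_length(const char* msg) {
    return static_cast<int>(std::strlen(msg));
}

// The signature recovered by reverse engineering, expressed as a type.
using TargetFn = int (*)(const char*);

// Pretend this number was read from a disassembler.
std::uintptr_t address_of_target() {
    return reinterpret_cast<std::uintptr_t>(&target_length);
}

// Turn a raw address back into a callable function pointer and invoke it.
int call_by_address(std::uintptr_t address, const char* msg) {
    TargetFn fn = reinterpret_cast<TargetFn>(address);
    return fn(msg);
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">call_by_address(address_of_target(), "hello")</code> behaves exactly like calling <code class="language-plaintext highlighter-rouge">target_length("hello")</code>. The only thing that changes with a real game is where the address comes from and which calling convention the function pointer type has to declare.</p> <p>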
To do this, we will create our own <code class="language-plaintext highlighter-rouge">.dll</code> file, inject it into the game process, and call the function that prints something to the console. Explaining injection in detail deserves its own post, so for now, we will use existing tools. All you need to know is that once you inject a <code class="language-plaintext highlighter-rouge">.dll</code> into a process, it becomes a part of it. This means the <code class="language-plaintext highlighter-rouge">.dll</code> has access to the entire process memory, all the program functions, and more, just as if the game were your own program. If you are interested in learning more about injection, you can read about it <a href="https://en.wikipedia.org/wiki/DLL_injection">here</a>.</p> <p>For this example, we will use the Microsoft Visual Studio IDE and its compiler to create the project. I selected the DLL template as a base. Here is the complete code:</p> <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// dllmain.cpp : Defines the entry point for the DLL application.</span>
<span class="cp">#include</span> <span class="cpf">"pch.h"</span><span class="cp">
</span>
<span class="k">typedef</span> <span class="nf">void</span><span class="p">(</span><span class="n">__thiscall</span><span class="o">*</span> <span class="n">Print_Function</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span> <span class="n">this_ptr</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">msg</span><span class="p">);</span>
<span class="n">Print_Function</span> <span class="n">print_to_console</span> <span class="o">=</span> <span class="p">(</span><span class="n">Print_Function</span><span class="p">)</span><span class="mh">0x0431d40</span><span class="p">;</span>

<span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">msg</span> <span class="o">=</span> <span class="s">"Visit https://letsreverse.net/"</span><span class="p">;</span>

<span class="kt">void</span> <span class="nf">main_func</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">print_to_console</span><span class="p">((</span><span class="kt">void</span><span class="o">*</span><span class="p">)</span><span class="mh">0x0789D58</span><span class="p">,</span> <span class="n">msg</span><span class="p">);</span>
<span class="p">}</span>

<span class="n">BOOL</span> <span class="n">APIENTRY</span> <span class="nf">DllMain</span><span class="p">(</span> <span class="n">HMODULE</span> <span class="n">hModule</span><span class="p">,</span>
                       <span class="n">DWORD</span> <span class="n">ul_reason_for_call</span><span class="p">,</span>
                       <span class="n">LPVOID</span> <span class="n">lpReserved</span>
                     <span class="p">)</span>
<span class="p">{</span>
    <span class="k">switch</span> <span class="p">(</span><span class="n">ul_reason_for_call</span><span class="p">)</span>
    <span class="p">{</span>
    <span class="k">case</span> <span class="n">DLL_PROCESS_ATTACH</span><span class="p">:</span>
        <span class="n">main_func</span><span class="p">();</span>
        <span class="k">break</span><span class="p">;</span>
    <span class="k">case</span> <span class="n">DLL_THREAD_ATTACH</span><span class="p">:</span>
    <span class="k">case</span> <span class="n">DLL_THREAD_DETACH</span><span class="p">:</span>
    <span class="k">case</span> <span class="n">DLL_PROCESS_DETACH</span><span class="p">:</span>
        <span class="k">break</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">TRUE</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div> <p>Once the <code class="language-plaintext highlighter-rouge">.dll</code> is injected, the <code class="language-plaintext highlighter-rouge">DllMain</code> function is triggered with the <code
class="language-plaintext highlighter-rouge">DLL_PROCESS_ATTACH</code> reason. In this entry point, we invoke <code class="language-plaintext highlighter-rouge">main_func</code>, which makes a call to the game’s function. It uses the <code class="language-plaintext highlighter-rouge">this</code> pointer we identified earlier (<code class="language-plaintext highlighter-rouge">0x0789D58</code>) and passes <code class="language-plaintext highlighter-rouge">msg</code>, the message we want to print to the console.</p> <p>The two lines that might seem mysterious at first:</p> <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="nf">void</span><span class="p">(</span><span class="n">__thiscall</span><span class="o">*</span> <span class="n">Print_Function</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span> <span class="n">this_ptr</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span><span class="o">*</span> <span class="n">msg</span><span class="p">);</span> <span class="n">Print_Function</span> <span class="n">print_to_console</span> <span class="o">=</span> <span class="p">(</span><span class="n">Print_Function</span><span class="p">)</span><span class="mh">0x0431d40</span><span class="p">;</span> </code></pre></div></div> <p>are actually straightforward. The first line creates a <code class="language-plaintext highlighter-rouge">typedef</code> for a function pointer named <code class="language-plaintext highlighter-rouge">Print_Function</code>. This function accepts two parameters: <code class="language-plaintext highlighter-rouge">void*</code> and <code class="language-plaintext highlighter-rouge">const char*</code>. 
The convention is <code class="language-plaintext highlighter-rouge">__thiscall</code>, which instructs the compiler to place the first parameter <code class="language-plaintext highlighter-rouge">this</code> in the <code class="language-plaintext highlighter-rouge">ECX</code> register and push the second parameter <code class="language-plaintext highlighter-rouge">msg</code> onto the stack. Additionally, the compiler knows that it doesn’t need to clean up the stack afterward, as this will be handled by the callee. You can find more details about all calling conventions supported by Microsoft’s compiler <a href="https://learn.microsoft.com/en-us/cpp/cpp/argument-passing-and-naming-conventions?view=msvc-170">here</a>.</p> <p>The second line defines <code class="language-plaintext highlighter-rouge">print_to_console</code> as a pointer of that type and tells the compiler that the function it points to is located at address <code class="language-plaintext highlighter-rouge">0x0431d40</code>. We know this from the first image attached to this post.</p> <h2 id="testing-the-dll">Testing the DLL</h2> <p>Finally, we can test our <code class="language-plaintext highlighter-rouge">.dll</code>. For this, we will use <a href="https://www.cheatengine.org/">Cheat Engine</a> as the injector. It’s free and open-source, but be cautious during installation and avoid agreeing to any adware. If you’re unfamiliar with the process of injecting a .dll, check out this short <a href="https://www.youtube.com/watch?v=OmwaAwoUqwQ">video</a>.</p> <p><img src="/assets/images/kk_print/result.png" alt="Result of our work" class="centered" /></p> <p>And there you have it! Our .dll works perfectly, and we can see the desired string displayed in the game console.</p> <p>In a future post, we’ll explore the concept of function hooking, that is, modifying existing game functions. 
This will allow us to alter the function that parses user commands, enabling us to add custom commands and execute specific actions when the user types the appropriate input into the console.</p> Unveiling Hidden Gems: Exploring Easter Eggs Through Game Reverse Engineering 2024-04-01T17:40:00+02:00 2024-04-01T17:40:00+02:00 https://letsreverse.net/2024/04/01/unveiling-hidden-gems-exploring-easter-eggs <p>Have you ever heard of <a href="https://en.wikipedia.org/wiki/Kajko_and_Kokosz">Kajko and Kokosz</a>? They are comic characters created by a Polish comic creator named <a href="https://en.wikipedia.org/wiki/Janusz_Christa">Janusz Christa</a>. Two Slavic warriors travel the world, tackling all sorts of problems with their village along the way. Over the years, various media related to them have been released - comics, computer games, and recently, they even got their own <a href="https://www.netflix.com/title/81263662">TV series on Netflix</a>! Some people also suggest that they are the Slavic version of <a href="https://en.wikipedia.org/wiki/Asterix">Asterix &amp; Obelix</a>, as the characters’ visual appearance and archetypes are similar in both series. 
Those suggestions were, however, never confirmed.</p> <p>Speaking of computer games, in this blog post we will investigate one of them: <a href="https://www.mobygames.com/game/69530/knights-learn-to-fly/">Knights: Learn to Fly</a> published by <a href="https://www.mobygames.com/company/20832/play-publishingcom/">Play Publishing</a> in 2005. In this game you can play as either Kajko or Kokosz. Your task is to collect coins, defeat enemies, and in some levels, even fly on a broom! When I was a child, I liked to press every possible key on the keyboard one by one in each new game and observe the effects. This game was no different. After I pressed <code class="language-plaintext highlighter-rouge">~</code> on my keyboard, a console popped up. Back then, I didn’t understand the purpose of such consoles. I tried writing random messages, thinking that maybe this was some kind of chat, and someone would reply. Unfortunately, no one did, so I gave up and closed this mysterious window, focusing instead on playing the game. Today, armed with knowledge about game reverse engineering, we will solve this riddle!</p> <h2 id="getting-started">Getting started</h2> <p><img src="/assets/images/kk_easter_eggs/empty_console.png" alt="Empty console" class="centered" /></p> <p>After we open the console, we are greeted with the message: <code class="language-plaintext highlighter-rouge">Argon engine featuring Kajko i Kokosz. Mirage Interactive 2005.</code>. We can type commands and execute them by pressing <code class="language-plaintext highlighter-rouge">Enter</code> on the keyboard, just like in a normal command prompt. The question is, what commands are accepted and what can we achieve by using them?</p> <p><img src="/assets/images/kk_easter_eggs/linux.png" alt="Linux commands" class="centered" /></p> <p>To answer this, we will need to take a look inside the game binary. Our task is to find the function that handles the user commands. 
To do so, we will use the following property: after you execute a command, the command itself is printed out to the console, so if you type <code class="language-plaintext highlighter-rouge">ls</code> and press <code class="language-plaintext highlighter-rouge">Enter</code>, <code class="language-plaintext highlighter-rouge">ls</code> will be printed to the console. It seems that before the command handler is executed, the print function is called. We could simply put a breakpoint on the print function and then inspect the call stack. Sounds good, but how do we find the print function? This should actually be easy. Do you remember the greeting message that is shown at the top of the console? It looks like a hardcoded string that should be easy to locate, and that string is probably directly passed to the printing function.</p> <h2 id="analyzing">Analyzing</h2> <p>To analyze the game executable we will use Ghidra, and for live debugging we will use x32dbg. Both of them are open source and free to use. After throwing the binary into Ghidra, we can go to the Window -&gt; Defined Strings menu, and look for the string located at the top of the console.</p> <p><img src="/assets/images/kk_easter_eggs/defined_strings.png" alt="Defined strings" class="centered" /></p> <p>We see that Ghidra found one reference to that string. Let’s go there.</p> <p><img src="/assets/images/kk_easter_eggs/xref.png" alt="xref" class="centered" /></p> <p><img src="/assets/images/kk_easter_eggs/string_decomp.png" alt="Xref in decompilation window" class="centered" /></p> <p>It looks like FUN_00431d40 is the one that prints text to the console. We will rename it to <code class="language-plaintext highlighter-rouge">print_to_console</code>. 
Now we can look up all the places from which the function is called:</p> <p><img src="/assets/images/kk_easter_eggs/references_to_print_to_console.png" alt="References to print to console" class="centered" /></p> <p>As we can see, there are a lot of references to this function. We could analyze them one by one, but that would take a lot of time. A much smarter approach is to attach a debugger now. Basically, we will:</p> <ol> <li>Run the game</li> <li>Attach the debugger</li> <li>Put a breakpoint at FUN_00431d40</li> <li>Write something to the game console</li> <li>Inspect the call stack when the breakpoint triggers, so we can see from where the print was called.</li> </ol> <p>After performing all the steps we see this: <img src="/assets/images/kk_easter_eggs/x64_callstack.png" alt="x64dbg callstack" class="centered" /></p> <p>So our assumption was correct; indeed after we execute any command (<code class="language-plaintext highlighter-rouge">letsreverse</code> in this case), the print function is called, and then the game will return to address 00434FD0. Let’s analyze the function containing this address in Ghidra.</p> <p><img src="/assets/images/kk_easter_eggs/ghidra1.png" alt="function in Ghidra" class="centered" /> <img src="/assets/images/kk_easter_eggs/ghidra2.png" alt="quit fck exec" class="centered" /></p> <p>Interestingly, it looks like some commands are processed prior to printing: <code class="language-plaintext highlighter-rouge">quit</code>, <code class="language-plaintext highlighter-rouge">exec</code> and… <code class="language-plaintext highlighter-rouge">fuck</code>. 
<code class="language-plaintext highlighter-rouge">quit</code> closes the game, <code class="language-plaintext highlighter-rouge">exec</code> executes all the commands that are stored in the <code class="language-plaintext highlighter-rouge">ar.cfg</code> file in the game directory, while <code class="language-plaintext highlighter-rouge">fuck</code> responds with <code class="language-plaintext highlighter-rouge">Don't make me e-mail your mom!</code>. If any of these commands is executed, the command itself won’t be printed to the console. If the entered command is different from those already mentioned, the command itself is printed. Additionally, if the command contains a parameter, such as <code class="language-plaintext highlighter-rouge">letsreverse 1</code>, then another handler at address 00433ed0 is called. This can be seen in the screenshots above.</p> <h2 id="getting-the-commands">Getting the commands</h2> <p><img src="/assets/images/kk_easter_eggs/ghidra3.png" alt="Next handler" class="centered" /></p> <p>The second handler contains a lot of commands. 
I parsed the decompiler output to list all of them:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DropDebug credits screenwidth screenheight bitdepth runworld mirmil loadmodels DrawModels DrawToonEffect drawfps DrawPolygonCount drawsnow SnowPhysics SnowNumFlakes drawsky invisible drawshadowmesh raportsubsets raporttextures sound DrawStencilShadows DrawDepthShadows drawlightrays drawgrass detailedgrass draweffects lightsemitsmoke numconsolelines consolelog eraselog extractusedtextures grasswidth meshtospritecoef shadowdistcoef qualitytextures dupadupa raporttechs numlights ForceSmallerTextures farz fogfar fognear fogr fogg fogb ambientr ambientg ambientb skylightr skylightg skylightb SkyLightRotation SkyLightPitch SkyShadowsRotation SkyShadowsPitch DrawCharacterPaths DrawPaths Dither PlayerBreathe ProjectedShadowDist DrawSnowClouds MeshUpdateCoef DrawSprites DrawScreenFilter FontSizeCoef MeshMaterialAmbient MeshMaterialDiffuse StartMenu NoClipSpeed DrawCameraCoordinates DrawTreeShadows DrawTransparent DrawTranslucent DrawCompass michael UpdateMainGame Gamma GammaR GammaG GammaB Debug SystemDebug AutoScreenshotTime EnableAntialiasing TrilinearFiltering DisableLensFlares DisableInsects TreesAlwaysFullDetail CameraDeg CameraDist CameraUp </code></pre></div></div> <p>A lot of these commands are self-descriptive. The most interesting and intriguing ones, in my opinion, include:</p> <ul> <li><code class="language-plaintext highlighter-rouge">credits</code> - Prints out credits that are normally nowhere to be seen in the game(?)</li> <li><code class="language-plaintext highlighter-rouge">runworld</code> - Allows loading arbitrary game levels</li> <li><code class="language-plaintext highlighter-rouge">mirmil</code> - Modifies the game save file to appear as if you have completed the entire game, unlocking all levels. 
By the way, <code class="language-plaintext highlighter-rouge">Mirmił</code> is the name of the gord’s leader. The gord in which some of the game levels take place is named <code class="language-plaintext highlighter-rouge">Mirmiłowo</code></li> <li><code class="language-plaintext highlighter-rouge">invisible</code> - Makes the game character invisible, so enemies don’t target you</li> <li><code class="language-plaintext highlighter-rouge">michael</code> - Sets some values in the memory. However, it seems these values are never read (as evidenced by both static analysis and setting a hardware breakpoint), so the behavior remains unknown</li> <li><code class="language-plaintext highlighter-rouge">dupadupa</code> - Enables speedhack and possibility to pass walls and obstacles without colliding with them. Intriguing name, especially considering that <code class="language-plaintext highlighter-rouge">dupa</code> means <code class="language-plaintext highlighter-rouge">ass</code> in Polish. You can see the command effect on the video below</li> </ul> <video muted="" autoplay="" controls="" loop=""> <source src="/assets/images/kk_easter_eggs/dupdup.mp4" type="video/mp4" /> </video> <h2 id="future">Future</h2> <p>The reason why those commands were left in the official game release remains unknown. It is unclear whether the authors left them by mistake or intentionally. However, we can make use of them when it comes to modding. In future posts, I will show you how to print your own text to the console, and how to add your own commands.</p> Have you ever heard of Kajko and Kokosz? They are comic characters created by a Polish comic creator named Janusz Christa. Two Slavic warriors travel the world, tackling all sorts of problems with their village along the way. Over the years, various media related to them have been released - comics, computer games, and recently, they even got their own TV series on Netflix! 
Some people also suggest that they are the Slavic version of Asterix &amp; Obelix, as the characters’ visual appearance and archetypes are similar in both series. Those suggestions were, however, never confirmed. Unpacking Encrypted Game Files 2023-12-02T20:12:34+01:00 2023-12-02T20:12:34+01:00 https://letsreverse.net/2023/12/02/unpacking-encrypted-game-files <p>In the <a href="/2023/08/11/unpacking-garfield-game-files/">last post</a> I described a way to unpack the contents of a proprietary format. That format was fairly straightforward, with neither encryption nor compression, so it was possible to grab the files stored inside without even touching a debugger (we did use one, though it was not necessary). This time, however, things will get a little bit more interesting. Today we will tackle a game from the <a href="https://en.wikipedia.org/wiki/Crazy_Chicken">Crazy Chicken</a> series, to be more precise - Crazy Chicken Kart 2 (or Moorhuhn Kart 2 in the original).</p> <h2 id="getting-started">Getting started</h2> <p>First we need to inspect the files in the game’s installation directory to see where the assets are stored. We can see that there is a folder named <code class="language-plaintext highlighter-rouge">data</code> which contains a file named <code class="language-plaintext highlighter-rouge">mhk2-00.dat</code>. It is the largest file, taking about 140MB of space. This will be our target.</p> <p>When we open the file in a hex editor we can see this:</p> <p><img src="/assets/images/moorhuhnkart2/hxd.png" alt="hxd" class="centered" /></p> <h2 id="lets-guess">Let’s guess</h2> <p>We can try to guess the structure of the file without using any debugger. Imagine you are a game developer tasked with writing a parser that will unpack the required files and load them into the game’s memory. 
What kind of information is needed?</p> <ul> <li>File count - it is possible to create a file format without it, for example you can put the file header at the end of the content, and then iterate over elements until you meet EOF (end of file), but generally the file count is a part of a file format</li> <li>Location of entry name/id - There must be a way to identify the entry somehow. It can be, for example, a numeric ID, or a filename just like that of an ordinary file on disk</li> <li>Way to obtain the entry content location and its length</li> </ul> <p>There are plenty of ways such information can be stored. Let’s see what we can assume just by looking at the previous image. The very first 16 bytes form the string <code class="language-plaintext highlighter-rouge">Moorhuhn Kart 2</code>. We can treat it as a file header that confirms we are dealing with the right file. Multiple filenames can be seen. From the beginning of the first filename to the beginning of the next one there are exactly 0x80 bytes of space. The same holds for the following entries when we scroll the view down. For now we can assume that this space is dedicated to describing a particular file entry. Inside each such fragment are probably our two missing elements: the offset to the entry content and its length. There are two 32-bit integers. We can see that one points somewhere much further into the file, while the other is a much smaller integer, so the first one is likely the data offset and the second one the data length.</p> <p><img src="/assets/images/moorhuhnkart2/hxd_markings.png" alt="hxd_markings" class="centered" /></p> <p>Now, just after the <code class="language-plaintext highlighter-rouge">Moorhuhn Kart 2</code> string we can see an integer <code class="language-plaintext highlighter-rouge">0x456</code>. For now assume this is the file count. We see that the first file entry starts at offset <code class="language-plaintext highlighter-rouge">0x40</code>. 
Each entry is <code class="language-plaintext highlighter-rouge">0x80</code> bytes long, and there are <code class="language-plaintext highlighter-rouge">0x456</code> files. So if <code class="language-plaintext highlighter-rouge">0x456</code> is really the file count, the last file entry should end at offset <code class="language-plaintext highlighter-rouge">0x40 + 0x456 * 0x80</code>. Let’s check! And indeed, <code class="language-plaintext highlighter-rouge">0x22B40</code> looks like the beginning of the data and, at the same time, the end of the file entries. To be even more sure, let’s look at the data offset of the first file entry: it is also <code class="language-plaintext highlighter-rouge">0x22B40</code>! This is even better proof that <code class="language-plaintext highlighter-rouge">0x456</code> is indeed the file count and that we are reading the data offset correctly.</p> <p><img src="/assets/images/moorhuhnkart2/22b40.png" alt="22b40" class="centered" /></p> <h2 id="unpacking">Unpacking</h2> <p>Summarizing all the information obtained above, we can write a simple Python script that will iterate over all the file entries, extract their content and save it under the appropriate names.</p> <figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="n">struct</span> <span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span> <span class="nb">file</span> <span class="o">=</span> <span class="nf">open</span><span class="p">(</span><span class="sh">"</span><span class="s">mhk2-00.dat</span><span class="sh">"</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">rb</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># Seek to the file count </span><span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span 
class="mh">0x20</span><span class="p">)</span> <span class="n">filecount</span><span class="p">,</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;H</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">filecount</span><span class="p">):</span> <span class="c1"># Seek to the beginning of the header </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x80</span><span class="p">)</span> <span class="n">name</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">""</span> <span class="c1"># Read null terminated string </span> <span class="nf">while </span><span class="p">(</span><span class="n">a</span> <span class="p">:</span><span class="o">=</span> <span class="p">(</span><span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">1</span><span class="p">)))</span> <span class="o">!=</span> <span class="sa">b</span><span class="sh">"</span><span class="se">\x00</span><span class="sh">"</span><span class="p">:</span> <span class="n">name</span> <span class="o">+=</span> <span class="n">a</span> <span class="nf">print</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="c1"># Seek to the offset and length position </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span 
class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x80</span> <span class="o">+</span> <span class="mh">0x68</span><span class="p">)</span> <span class="n">offset</span><span class="p">,</span> <span class="n">length</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;II</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">8</span><span class="p">))</span> <span class="n">filename</span> <span class="o">=</span> <span class="n">name</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="sh">"</span><span class="s">ascii</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># Create the file directory </span> <span class="nc">Path</span><span class="p">(</span><span class="n">filename</span><span class="p">).</span><span class="n">parent</span><span class="p">.</span><span class="nf">mkdir</span><span class="p">(</span><span class="n">parents</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="c1"># Seek to the file content </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="n">offset</span><span class="p">)</span> <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">wb</span><span class="sh">"</span><span class="p">)</span> <span class="k">as</span> <span class="n">output</span><span 
class="p">:</span> <span class="n">content</span> <span class="o">=</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">)</span> <span class="n">output</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="n">content</span><span class="p">)</span></code></pre></figure> <p>It looks easy, doesn’t it? So far it might be even easier than the format from the previous blog post. However, at the beginning of this post I promised that this would be more interesting, and I didn’t lie.</p> <h2 id="what-is-inside">What is inside?</h2> <p>Inside the root directory there are two items, <code class="language-plaintext highlighter-rouge">config.txt</code> and a directory <code class="language-plaintext highlighter-rouge">mk2</code>. Let’s take a look inside <code class="language-plaintext highlighter-rouge">mk2</code>:</p> <p><img src="/assets/images/moorhuhnkart2/unpacked_content.png" alt="unpacked content" class="centered" /></p> <ul> <li><code class="language-plaintext highlighter-rouge">items</code> - Contains different textures and presumably 3D objects of different game items</li> <li><code class="language-plaintext highlighter-rouge">karts</code> - Animations of all the playable characters in the game</li> <li><code class="language-plaintext highlighter-rouge">lensflares</code> - Textures of flares</li> <li><code class="language-plaintext highlighter-rouge">level0X</code> - Configuration, textures, music, 3D objects and animations for different levels</li> <li><code class="language-plaintext highlighter-rouge">menu</code> - Music and textures to be displayed in the main menu</li> <li><code class="language-plaintext highlighter-rouge">misc</code> - Fonts and HUD textures</li> <li><code class="language-plaintext highlighter-rouge">settings</code> - Encrypted configuration of different karts. 
Looks interesting in terms of modding</li> <li><code class="language-plaintext highlighter-rouge">sfx</code> - Different sound effects: collision, engine, etc.</li> <li><code class="language-plaintext highlighter-rouge">text.csv</code> - Translation of subtitles in different languages - interesting if you want to translate the game</li> </ul> <h2 id="examining-the-results">Examining the results</h2> <p>When we look at the unpacked data it looks almost right. We can see the images and hear sound effects. However, when we open a file with the <code class="language-plaintext highlighter-rouge">txt</code> extension, we are presented with gibberish:</p> <p><img src="/assets/images/moorhuhnkart2/encrypted_file.png" alt="encrypted file" class="centered" /></p> <p>It looks like the authors of the game decided to obfuscate the content of text files somehow; otherwise the text could be easily replaced directly in the .dat file, even without unpacking. As there are no checksums in the .dat file, this could lead to cheating (presumably the .txt files contain the configuration of the speed of different vehicles, etc.). To read the real content of the file we need to take another approach. As you can see, without knowing the algorithm that is used to decipher the content, it is almost impossible to progress further. Even if the algorithm were known, we would still need to obtain the decryption key somehow. Remember, the game must be able to read and understand the file content, which implies that it knows the deciphering procedure. 
At the same time, we have access to the game executable, which means we can discover this procedure too.</p> <h2 id="reversing-the-text-file-decryption-method">Reversing the text file decryption method</h2> <p>By observing the unpacked files we can see that there is a directory for each game level. Inside each of them there are 4 folders: <code class="language-plaintext highlighter-rouge">music</code>, <code class="language-plaintext highlighter-rouge">objects</code>, <code class="language-plaintext highlighter-rouge">settings</code> and <code class="language-plaintext highlighter-rouge">textures8bit</code>. Going deeper, inside <code class="language-plaintext highlighter-rouge">settings</code> we can find 3 more folders: <code class="language-plaintext highlighter-rouge">display</code>, <code class="language-plaintext highlighter-rouge">misc</code> and <code class="language-plaintext highlighter-rouge">objects</code>. Each of them contains a .txt file whose name corresponds to the directory name, so inside <code class="language-plaintext highlighter-rouge">display</code> you can find <code class="language-plaintext highlighter-rouge">display.txt</code>, etc. As mentioned previously, all the <code class="language-plaintext highlighter-rouge">txt</code> files are encrypted.</p> <p>Right now we are not interested in the way the game parses <code class="language-plaintext highlighter-rouge">mhk2-00.dat</code>, as we already know it. By making a list of all string references we can spot references to <code class="language-plaintext highlighter-rouge">objects.txt</code>.</p> <p><img src="/assets/images/moorhuhnkart2/string_references.png" alt="string references" class="centered" /></p> <p>This string is probably used to fetch the configuration file of the game level we want to play. 
Let’s put breakpoints on all the references and run the game.</p> <p><img src="/assets/images/moorhuhnkart2/breakpoint_objects.png" alt="breakpoint objects" class="centered" /></p> <p>After we hit a breakpoint, let’s put another one on the <code class="language-plaintext highlighter-rouge">ReadFile</code> WinAPI call. It is possible that the whole file was read into memory at program start, but loading all the level data at once would be a waste of precious RAM (especially in 2003, when the game was created), so it is more likely that the levels are read from disk as needed. When we resume execution, we hit the breakpoint on <code class="language-plaintext highlighter-rouge">ReadFile</code>, and the buffer is filled with the encrypted data, just as desired.</p> <p><img src="/assets/images/moorhuhnkart2/read_file.png" alt="read file" class="centered" /></p> <p>Interestingly, the first <code class="language-plaintext highlighter-rouge">objects.txt</code> to load is from the level06 directory. Nevertheless, we continue our journey to discover the decryption routine. To do so, we need to put a hardware breakpoint on access at the beginning of the buffer filled with encrypted data.</p> <p><img src="/assets/images/moorhuhnkart2/x64dbg_decryption_routine.png" alt="decryption routine" class="centered" /></p> <p>And boom, we landed in a function that performs XOR and shift operations, which looks like some kind of decryption routine. As it is hard to judge at first glance what exactly this function is doing, it is a good idea to let the decompiler help us. 
After opening the game executable in IDA and navigating to the same address (0x0450384) that we saw in the debugger, we are presented with this view:</p> <p><img src="/assets/images/moorhuhnkart2/normal_ida.png" alt="normal ida" class="centered" /></p> <p>Not really clean, but after renaming some variables and changing their types it looks much better:</p> <p><img src="/assets/images/moorhuhnkart2/clean_ida.png" alt="clean ida" class="centered" /></p> <p>Let’s add this function to our Python script. Take a look at the last <code class="language-plaintext highlighter-rouge">if</code> statement in the script:</p> <figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="n">struct</span> <span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span> <span class="nb">file</span> <span class="o">=</span> <span class="nf">open</span><span class="p">(</span><span class="sh">"</span><span class="s">mhk2-00.dat</span><span class="sh">"</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">rb</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># Seek to the file count </span><span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="mh">0x20</span><span class="p">)</span> <span class="n">filecount</span><span class="p">,</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;H</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span 
class="nf">range</span><span class="p">(</span><span class="n">filecount</span><span class="p">):</span> <span class="c1"># Seek to the beginning of the header </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x80</span><span class="p">)</span> <span class="n">name</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">""</span> <span class="c1"># Read null terminated string </span> <span class="nf">while </span><span class="p">(</span><span class="n">a</span> <span class="p">:</span><span class="o">=</span> <span class="p">(</span><span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">1</span><span class="p">)))</span> <span class="o">!=</span> <span class="sa">b</span><span class="sh">"</span><span class="se">\x00</span><span class="sh">"</span><span class="p">:</span> <span class="n">name</span> <span class="o">+=</span> <span class="n">a</span> <span class="nf">print</span><span class="p">(</span><span class="n">name</span><span class="p">)</span> <span class="c1"># Seek to the offset and length position </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x80</span> <span class="o">+</span> <span class="mh">0x68</span><span class="p">)</span> <span class="n">offset</span><span class="p">,</span> <span class="n">length</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;II</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span 
class="nf">read</span><span class="p">(</span><span class="mi">8</span><span class="p">))</span> <span class="n">filename</span> <span class="o">=</span> <span class="n">name</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="sh">"</span><span class="s">ascii</span><span class="sh">"</span><span class="p">)</span> <span class="c1"># Create the file directory </span> <span class="nc">Path</span><span class="p">(</span><span class="n">filename</span><span class="p">).</span><span class="n">parent</span><span class="p">.</span><span class="nf">mkdir</span><span class="p">(</span><span class="n">parents</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">exist_ok</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="c1"># Seek to the file content </span> <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="n">offset</span><span class="p">)</span> <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">wb</span><span class="sh">"</span><span class="p">)</span> <span class="k">as</span> <span class="n">output</span><span class="p">:</span> <span class="n">content</span> <span class="o">=</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">)</span> <span class="c1"># If filename ends with .txt perform the decryption </span> <span class="k">if</span> <span class="n">filename</span><span class="p">.</span><span class="nf">endswith</span><span class="p">(</span><span class="sh">"</span><span class="s">.txt</span><span class="sh">"</span><span class="p">):</span> <span class="n">key</span> <span 
class="o">=</span> <span class="mh">0x1234</span> <span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="n">content</span><span class="p">:</span> <span class="n">result</span> <span class="o">=</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="p">(</span><span class="n">b</span> <span class="o">^</span> <span class="n">key</span><span class="p">))</span> <span class="o">&amp;</span> <span class="mh">0xFF</span> <span class="n">decrypted_char</span> <span class="o">=</span> <span class="n">result</span> <span class="o">^</span> <span class="p">(</span><span class="n">result</span> <span class="o">^</span> <span class="p">((</span><span class="n">b</span> <span class="o">^</span> <span class="n">key</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="mi">1</span><span class="p">))</span> <span class="o">&amp;</span> <span class="mh">0x55</span> <span class="n">key</span> <span class="o">=</span> <span class="p">(</span><span class="mi">3</span> <span class="o">*</span> <span class="n">key</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xFFFF</span> <span class="n">output</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nf">bytes</span><span class="p">([</span><span class="n">decrypted_char</span><span class="p">]))</span> <span class="k">else</span><span class="p">:</span> <span class="n">output</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="n">content</span><span class="p">)</span></code></pre></figure> <p>And check how it works now:</p> <p><img src="/assets/images/moorhuhnkart2/decrypted_content.png" alt="decrypted content" class="centered" /></p> <p>As you can see our effort was worth it! 
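</p> <p>A side note on the decryption expression in the script above. The sketch below is my reading of it, not something confirmed against the game: the expression appears to XOR each byte with the low byte of the key and then swap every pair of adjacent bits. The function names (<code class="language-plaintext highlighter-rouge">decrypt_original</code>, <code class="language-plaintext highlighter-rouge">swap_adjacent_bits</code>, <code class="language-plaintext highlighter-rouge">decrypt_readable</code>, <code class="language-plaintext highlighter-rouge">encrypt</code>) are made up for illustration, and the <code class="language-plaintext highlighter-rouge">encrypt</code> helper is only the mathematical inverse of the script; whether the game accepts files produced this way has not been tested.</p>

```python
def decrypt_original(data: bytes, key: int = 0x1234) -> bytes:
    """The decryption loop exactly as written in the unpacking script."""
    out = bytearray()
    for b in data:
        result = (2 * (b ^ key)) & 0xFF
        out.append(result ^ (result ^ ((b ^ key) >> 1)) & 0x55)
        key = (3 * key + 2) & 0xFFFF
    return bytes(out)


def swap_adjacent_bits(x: int) -> int:
    """Swap bit 0 with bit 1, bit 2 with bit 3, and so on, within one byte."""
    return ((x * 2) & 0xAA) | ((x >> 1) & 0x55)


def decrypt_readable(data: bytes, key: int = 0x1234) -> bytes:
    """Same result as decrypt_original, written as XOR followed by a bit swap."""
    out = bytearray()
    for b in data:
        out.append(swap_adjacent_bits((b ^ key) & 0xFF))
        key = (3 * key + 2) & 0xFFFF  # same key schedule as in the script
    return bytes(out)


def encrypt(data: bytes, key: int = 0x1234) -> bytes:
    """Assumed inverse: the bit swap undoes itself, so swap first, then XOR."""
    out = bytearray()
    for p in data:
        out.append(swap_adjacent_bits(p) ^ (key & 0xFF))
        key = (3 * key + 2) & 0xFFFF
    return bytes(out)


# Compare both decryption variants on every possible byte value
sample = bytes(range(256)) * 4
print(decrypt_original(sample) == decrypt_readable(sample))
# Round trip on a made-up line of text
print(decrypt_readable(encrypt(b"speed = 10")))
```

<p>Written this way, it is easier to see that the key only changes from byte to byte through the <code class="language-plaintext highlighter-rouge">3 * key + 2</code> update, and that re-encrypting a modified text file should be possible with the same constants.</p> <p>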
Now the text file is decrypted and we are able to read its content.</p> <h2 id="conclusion">Conclusion</h2> <p>Depending on the method used by the game manufacturer, sometimes it is not possible to unpack data files just by guessing the file structure. When we are dealing with obfuscation and/or encryption we need to reverse engineer the executable to obtain the decryption method. Thanks for reading.</p> Unpacking Garfield Game Files 2023-08-11T22:08:44+02:00 2023-08-11T22:08:44+02:00 https://letsreverse.net/2023/08/11/unpacking-garfield-game-files <p>You have probably heard about Garfield, the cat with orange fur that loves sleep and lasagna. It turns out he is the protagonist not only of comics and cartoons, but also of computer games. One such <a href="https://en.wikipedia.org/wiki/Garfield_(video_game)">game</a>, uniquely named <code class="language-plaintext highlighter-rouge">Garfield</code>, was released for PC in 2004 by a company named <code class="language-plaintext highlighter-rouge">The Code Monkeys</code>. In this post, we will take a look at the structure of the files used by the game.</p> <h2 id="getting-started">Getting started</h2> <p><img src="/assets/images/garfield/files.png" alt="Files after instalation" class="centered" /></p> <p>After installation, inside the main directory we can see the <code class="language-plaintext highlighter-rouge">data</code> folder, where all the assets lie. 
The content of <code class="language-plaintext highlighter-rouge">audio</code> and <code class="language-plaintext highlighter-rouge">fmv</code> is easily accessible, since it is not packed and the files inside use widely known formats, such as <code class="language-plaintext highlighter-rouge">.wav</code>. More interesting things, such as images and 3D models, are probably hidden inside the mysterious <code class="language-plaintext highlighter-rouge">.pak</code> files. Since this is most likely a proprietary format, the content of those files is inaccessible to most people, but not to us.</p> <p><img src="/assets/images/garfield/hex1.png" alt="Pak in hex editor" class="centered" /></p> <p>The first thing to do when dealing with an unknown file format is to open the file in your favourite hex editor. As we can see, the first 4 bytes, when converted to ASCII, are equal to “PACK”. As this value is constant across all the <code class="language-plaintext highlighter-rouge">.pak</code> files, it is most likely the header, which lets the game parser know that it is dealing with the right file. In the next bytes we can spot the “PNG” string, and just before it there is a byte equal to 0x89. Together they form the beginning of a <code class="language-plaintext highlighter-rouge">.png</code> image. So we know that there is one packed inside this file, and we also know that it is neither compressed nor encrypted (otherwise we wouldn’t see the raw png header). But how long is it? Does this image have any index or name? Is it the only image in this <code class="language-plaintext highlighter-rouge">.pak</code> file?</p> <h3 id="the-unknown-bytes">The unknown bytes</h3> <p>Between the “PACK” and “PNG” we have 8 unknown bytes. The whole file is 0x1D8018 bytes long. When we interpret the first 4 bytes after “PACK” as a little endian value, it looks like some kind of offset that points almost to the end of the file - 0x1D5518. 
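</p> <p>The interpretation above can be sketched in a few lines of Python. This is only an illustration of the guess made so far, not the game’s actual parser: the helper name <code class="language-plaintext highlighter-rouge">parse_pak_header</code> is made up, and the bytes in <code class="language-plaintext highlighter-rouge">sample</code> are rebuilt from the values quoted in the text rather than read from the real file.</p>

```python
# Hypothetical sketch: the layout is guessed from the hex dump, nothing more.
PAK_MAGIC = b"PACK"


def parse_pak_header(data: bytes) -> tuple[bytes, int]:
    """Return (magic, offset) taken from the first 8 bytes of a .pak file."""
    magic = data[0:4]
    if magic != PAK_MAGIC:
        raise ValueError("not a .pak file: unexpected magic value")
    # Bytes 4..7 interpreted as an unsigned little endian integer
    offset = int.from_bytes(data[4:8], byteorder="little")
    return magic, offset


# Sample header rebuilt from the values quoted in the text (an assumption)
sample = b"PACK" + (0x1D5518).to_bytes(4, byteorder="little")
magic, offset = parse_pak_header(sample)

file_size = 0x1D8018  # total size of the .pak file, as stated in the text
# Distance from the offset to the end of the file
print(magic, hex(offset), hex(file_size - offset))
```

<p>The last printed value is the number of bytes between that offset and the end of the file, which gives a rough idea of how much data sits behind it.</p> <p>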
When we go there, we can see that there is a name of a <code class="language-plaintext highlighter-rouge">.png</code> file, so our suspicions were right. It is indeed an offset to some valuable information. <img src="/assets/images/garfield/hex2.png" alt="Filename in hex editor" class="centered" /> We still have 4 unknown bytes. We could read them in multiple ways: as before, as a 4-byte little endian integer, but maybe we should read them byte by byte? Or maybe there are two 2-byte integers? What does this value represent? We still don’t know how to read the size of that png image. Questions arise, but is there a way to answer them without guessing?</p> <p>It turns out that this can indeed be solved without guessing. Don’t get me wrong. It is perfectly fine to try to solve it by trial and error (which we will partly continue later in this post), especially for a file format that looks this easy, but to make it more entertaining and more educational, let me introduce a more powerful method that will give answers to our questions.</p> <h3 id="fun-with-debugger">Fun with debugger</h3> <p>Let’s attach a debugger to the game. Obviously we don’t have the source code, so we are forced to debug it at the assembly level. For this task I will use <code class="language-plaintext highlighter-rouge">x64dbg</code>. The first question that immediately comes to mind is where to look. The game binary is huge and contains millions of instructions, most of which are irrelevant to us. A good first step is to look at the string references. There is a high probability that at least one of the <code class="language-plaintext highlighter-rouge">.pak</code> filenames is hardcoded and is used as a parameter to the “open file” function. 
<img src="/assets/images/garfield/x64dbg_strings_button.png" alt="x64dbg strnigs button" class="centered" /> To look for string references in <code class="language-plaintext highlighter-rouge">x64dbg</code> we can use the button shown above. Please note that the references are searched per module, so you need to be sure you are in the main one. If you want to change the current module, you can go to the <code class="language-plaintext highlighter-rouge">Symbols</code> tab and then double-click the module you are interested in.</p> <p><img src="/assets/images/garfield/x64dbg_pak_string.png" alt="x64dbg pak string" class="centered" /></p> <p>And here it is. As the function is in the export table (a bit unusual for an <code class="language-plaintext highlighter-rouge">.exe</code> file), we can see its name, so we can be even more certain that we are looking at the correct place.</p> <h2 id="reversing-the-load_pakfile-function">Reversing the <code class="language-plaintext highlighter-rouge">Load_PakFile</code> function</h2> <p><img src="/assets/images/garfield/x64dbg_loadpakfile.png" alt="x64dbg loadpakfile" class="centered" /></p> <p>First, we put a software breakpoint inside this function and resume the program execution. The breakpoint is indeed hit and we can step through the function. Now the question is: what does this function do with the file data, and how does it treat our unknown bytes? Before we answer this, we need to understand how the game transfers bytes from our hard disk to RAM in the first place.</p> <h3 id="reading-file-to-ram">Reading file to RAM</h3> <p>You see, on modern platforms like Windows it is impossible for a user-mode application (like a game) to talk directly to the hardware (like a hard disk). To retrieve information from hardware, apps need to send requests to the kernel. On Windows such requests can be sent using WinAPI functions. 
So if you are creating an application that works with files on Windows, it will eventually use some WinAPI calls under the hood, even if you aren’t aware of that. For the purpose of our task, we are interested in functions that give us access to files. There are probably multiple ways to do this using WinAPI, but the most popular way is to use <code class="language-plaintext highlighter-rouge">CreateFile</code> to open a handle, and then <code class="language-plaintext highlighter-rouge">ReadFile</code> to read data into RAM.</p>
<h3 id="breakpoints-on-api-calls">Breakpoints on API calls</h3>
<p>To put a breakpoint on an API call in <code class="language-plaintext highlighter-rouge">x64dbg</code> you can press Ctrl+G, type the name of the API function to jump to it, and set the breakpoint there. In our case these are <code class="language-plaintext highlighter-rouge">CreateFileA</code> and <code class="language-plaintext highlighter-rouge">ReadFile</code>. Just one more thing to note. On Windows, a WinAPI function that accepts a string as a parameter comes in two versions, ASCII and Unicode. The ASCII ones end with <code class="language-plaintext highlighter-rouge">A</code>, and the Unicode ones with <code class="language-plaintext highlighter-rouge">W</code>. That is why there are two functions for opening a handle to a file, <code class="language-plaintext highlighter-rouge">CreateFileA</code> and <code class="language-plaintext highlighter-rouge">CreateFileW</code>. The ASCII function ends up calling the Unicode function at the end of the day, so if you are not sure which version is used by the application, it is safer to put the breakpoint on the Unicode call first.
However, for this game you can safely put the breakpoint on the ASCII one, as this is the function used by the game.</p>
<h3 id="digging-deeper">Digging deeper</h3>
<p>As you remember, we are now paused at the beginning of <code class="language-plaintext highlighter-rouge">Load_PakFile</code>. After stepping over the third call we stop on <code class="language-plaintext highlighter-rouge">CreateFileA</code>. Looks good so far: this means that this call is there to open a handle to the file. We go back to the function. While going back we also notice that <code class="language-plaintext highlighter-rouge">GetFileSize</code> and <code class="language-plaintext highlighter-rouge">SetFilePointer</code> are called. This might be useful later. <img src="/assets/images/garfield/x64dbg_getting_ack_from_createfile.png" alt="x64dbg getting back from create file" class="centered" /> At this point we are just after the third call. <img src="/assets/images/garfield/x64dbg_just_after_third_call.png" alt="x64dbg just after third call" class="centered" /> Since we have seen that <code class="language-plaintext highlighter-rouge">SetFilePointer</code> is used, I placed a breakpoint there too and continued stepping. And it pays off: after a few calls we stop at <code class="language-plaintext highlighter-rouge">SetFilePointer</code>. From the <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfilepointer">WinAPI documentation</a> we know that this function accepts four parameters. We can look at the stack to see their values.</p>
<p><img src="/assets/images/garfield/x64dbg_stack_setfilepointer.png" alt="x64dbg set file pointer" class="centered" /></p>
<p>The first one is the file handle, a number that may be different on each run. The next two parameters are zero, and the last one is equal to 1, <code class="language-plaintext highlighter-rouge">FILE_CURRENT</code>.
So after this call the file cursor will be moved by zero bytes, counting from the current position. Looks kinda useless. We continue stepping.</p>
<p><img src="/assets/images/garfield/x64dbg_after_set_file_pointer.png" alt="x64dbg after set file pointer" class="centered" /></p>
<p>And we finally land at <code class="language-plaintext highlighter-rouge">ReadFile</code>, the first call that reads file data into RAM. Developers might use different techniques here: for example, they can read the data chunk by chunk, or read everything at once and parse it later in the program. In this case, it looks like they decided to read only as many bytes as they need at the moment. Look at the parameters in the <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-readfile">WinAPI documentation</a> and compare them with what we see on the stack:</p>
<p><img src="/assets/images/garfield/x64dbg_readfile_stack.png" alt="x64dbg readfile stack" class="centered" /></p>
<p>So they want to read 4 bytes from the <code class="language-plaintext highlighter-rouge">.pak</code> file into a data buffer at 0x18FDBC. After we go back to <code class="language-plaintext highlighter-rouge">Load_PakFile</code> we see that the bytes read from the file are compared with <code class="language-plaintext highlighter-rouge">0x4B434150</code>, or <code class="language-plaintext highlighter-rouge">PACK</code> in ASCII. So we were right.
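</p>
<p>The constant is easier to recognize once the byte order is taken into account. Below is a minimal Python sketch, not code taken from the game, showing why the little-endian integer <code class="language-plaintext highlighter-rouge">0x4B434150</code> corresponds to the bytes <code class="language-plaintext highlighter-rouge">PACK</code>. The names <code class="language-plaintext highlighter-rouge">PACK_MAGIC</code> and <code class="language-plaintext highlighter-rouge">has_pack_magic</code> are hypothetical and exist only for this illustration.</p>

```python
import struct

# Value the game compares the first 4 bytes against, as seen in the debugger
PACK_MAGIC = 0x4B434150


def has_pack_magic(first_four: bytes) -> bool:
    """Return True if the 4 bytes, read as a little-endian integer, equal the magic."""
    (value,) = struct.unpack("<I", first_four)
    return value == PACK_MAGIC


# 'P' = 0x50, 'A' = 0x41, 'C' = 0x43, 'K' = 0x4B.
# Little-endian stores the lowest byte first, so the integer packs back to "PACK".
print(struct.pack("<I", PACK_MAGIC))
print(has_pack_magic(b"PACK"))
```

<p>Reading the same four bytes as big-endian would give a different number, which is why the order of the letters looks reversed inside the hexadecimal constant.</p>
<p>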
The <code class="language-plaintext highlighter-rouge">PACK</code> value is a header, so the game parser can do a sanity check and be sure that it is processing a valid <code class="language-plaintext highlighter-rouge">.pak</code> file.</p>
<p><img src="/assets/images/garfield/x64dbg_compare_pack.png" alt="x64dbg compare pack" class="centered" /></p>
<p>Again, we continue with the flow and see another call to <code class="language-plaintext highlighter-rouge">ReadFile</code>. Another 4 bytes are read; as you remember, this is the offset to the first file name in the <code class="language-plaintext highlighter-rouge">.pak</code> file. Then another 4 bytes are read. Those are the “unknown” ones. Now we see that they should be treated as a single integer. It is interesting that the value is shifted left by 6 bits, in other words multiplied by 0x40 (64 in decimal).</p>
<p><img src="/assets/images/garfield/x64dbg_unknown_value.png" alt="x64dbg unknown value" class="centered" /></p>
<p>At the end of the image above there is a call to [eax+108]. This is a wrapper for RtlAllocateHeap (I stepped into it, so I know), which allocates a buffer with a length equal to the value of the unknown bytes shifted left by 6. The next call sets the file cursor to the first file name offset, and then in a loop the game reads the file in 64-byte chunks into the previously allocated buffer. The loop is executed as many times as the value of the 4-byte “unknown” integer.</p>
<p><img src="/assets/images/garfield/x64dbg_loop.png" alt="x64dbg loop" class="centered" /></p>
<h3 id="using-obtained-information">Using obtained information</h3>
<p>We could reverse the game further to check what is done with the data that is now in the allocated buffer. However, we already have some new information that we can apply. Firstly, the previously unknown bytes are most likely the count of files inside the <code class="language-plaintext highlighter-rouge">pak</code> file.
Then, the offset to the first filename is in reality the offset to the file information header, and each file information header entry is probably 64 bytes long (as the game was reading it in chunks of that length).</p>
<h3 id="inspecting-the-file-entry">Inspecting the file entry</h3>
<p><img src="/assets/images/garfield/hex_file_entry.png" alt="Hex file entry" class="centered" /></p>
<p>Now, go back to the binary file at that filename offset. We see that the entry begins with the filename. The length of the name is nowhere to be found, so we can suspect that it should be read until the first null character. The last 8 bytes of the entry look like two 4-byte integers holding the same value, repeated for some reason. That value looks like another offset or a file length. The same goes for the preceding 4 bytes: another offset that might be the start of the file, as it grows with each subsequent entry.</p>
<p>Now we go to the suspected file start offset, and from that place we select the next X bytes, where X is equal to the value of the last 4 bytes of the file entry (the suspected file length). We save those bytes to a separate file with a <code class="language-plaintext highlighter-rouge">png</code> extension and try to open it:</p>
<p><img src="/assets/images/garfield/nopad.png" alt="nopad" class="centered" /></p>
<p>We are greeted with an image.
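</p>
<p>The entry layout inferred so far can be written down as a fixed-size record. The sketch below is only an illustration under assumptions: it treats the first 0x34 bytes of an entry as a null-terminated name field (what sits between the terminator and offset 0x34 is unknown and ignored), followed by the start offset, the length, and the repeated length. The helper names <code class="language-plaintext highlighter-rouge">parse_entry</code> and <code class="language-plaintext highlighter-rouge">list_entries</code> are hypothetical, and the archive it runs on is synthetic, not real game data.</p>

```python
import struct

ENTRY_SIZE = 0x40  # the game reads the table in 64-byte chunks
# Assumed layout: 0x34-byte name field, start offset, length, length repeated
ENTRY = struct.Struct("<52sIII")


def parse_entry(raw: bytes):
    """Split one 64-byte entry into (name, start_offset, length)."""
    name_field, start_offset, length, _length_copy = ENTRY.unpack(raw)
    # The name ends at the first null byte; the rest of the field is ignored
    name = name_field.split(b"\x00", 1)[0].decode("ascii")
    return name, start_offset, length


def list_entries(pak: bytes):
    """Return the parsed entries of a whole .pak image held in memory."""
    _magic, table_offset, file_count = struct.unpack_from("<III", pak, 0)
    entries = []
    for i in range(file_count):
        start = table_offset + i * ENTRY_SIZE
        entries.append(parse_entry(pak[start:start + ENTRY_SIZE]))
    return entries


# Tiny synthetic archive, made up for illustration only
payload = b"hello"
data_offset = 12  # right after the 12-byte header
table_offset = data_offset + len(payload)
entry = ENTRY.pack(b"a.txt", data_offset, len(payload), len(payload))
pak = struct.pack("<4sII", b"PACK", table_offset, 1) + payload + entry

for name, start, length in list_entries(pak):
    print(name, pak[start:start + length])
```

<p>Describing the entry with a single <code class="language-plaintext highlighter-rouge">struct</code> format makes the guessed field sizes explicit, so a wrong guess shows up quickly as nonsense offsets or lengths.</p>
<p>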
Now we can write a simple Python script to automate the unpacking process and to verify that it works for all the files.</p>
<h2 id="code-to-unpack-the-pak-file">Code to unpack the .pak file</h2>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="kn">import</span> <span class="n">struct</span>
<span class="kn">import</span> <span class="n">os</span>
<span class="kn">from</span> <span class="n">pathlib</span> <span class="kn">import</span> <span class="n">Path</span>

<span class="n">destination</span> <span class="o">=</span> <span class="sh">"</span><span class="s">output/</span><span class="sh">"</span>

<span class="c1"># Set the filename to unpack here
</span><span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="sh">"</span><span class="s">startup.pak</span><span class="sh">"</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">rb</span><span class="sh">"</span><span class="p">)</span> <span class="k">as</span> <span class="nb">file</span><span class="p">:</span>
    <span class="n">header</span><span class="p">,</span> <span class="n">files_info</span><span class="p">,</span> <span class="n">file_count</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;III</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">4</span><span class="o">*</span><span class="mi">3</span><span class="p">))</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="n">file_count</span><span class="p">):</span>
        <span class="c1"># Go to header of the file i
</span>        <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="n">files_info</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x40</span><span class="p">)</span>

        <span class="c1"># Read the filename until 0x00 byte
</span>        <span class="n">filename</span> <span class="o">=</span> <span class="sa">b</span><span class="sh">""</span>
        <span class="nf">while </span><span class="p">(</span><span class="n">read_byte</span> <span class="p">:</span><span class="o">=</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span> <span class="o">!=</span> <span class="sa">b</span><span class="sh">'</span><span class="se">\x00</span><span class="sh">'</span><span class="p">:</span>
            <span class="n">filename</span> <span class="o">+=</span> <span class="n">read_byte</span>
        <span class="n">filename</span> <span class="o">=</span> <span class="n">filename</span><span class="p">.</span><span class="nf">decode</span><span class="p">(</span><span class="sh">"</span><span class="s">ascii</span><span class="sh">"</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="sh">"</span><span class="s">Unpacking {}...</span><span class="sh">"</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">filename</span><span class="p">))</span>

        <span class="c1"># Go to the file placement information
</span>        <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="n">files_info</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mh">0x40</span> <span class="o">+</span> <span class="mh">0x34</span><span class="p">)</span>
        <span class="n">start_offset</span><span class="p">,</span> <span class="n">length</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="nf">unpack</span><span class="p">(</span><span class="sh">"</span><span class="s">&lt;II</span><span class="sh">"</span><span class="p">,</span> <span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="mi">4</span><span class="o">*</span><span class="mi">2</span><span class="p">))</span>

        <span class="c1"># Go to the beginning of the file data
</span>        <span class="nb">file</span><span class="p">.</span><span class="nf">seek</span><span class="p">(</span><span class="n">start_offset</span><span class="p">)</span>

        <span class="c1"># Create required directories
</span>        <span class="n">output_path</span> <span class="o">=</span> <span class="n">destination</span> <span class="o">+</span> <span class="n">filename</span>
        <span class="n">os</span><span class="p">.</span><span class="nf">makedirs</span><span class="p">(</span><span class="nc">Path</span><span class="p">(</span><span class="n">output_path</span><span class="p">).</span><span class="n">parent</span><span class="p">,</span> <span class="n">exist_ok</span> <span class="o">=</span> <span class="bp">True</span><span class="p">)</span>

        <span class="c1"># Write the data to file
</span>        <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="n">output_path</span><span class="p">,</span> <span class="n">mode</span><span class="o">=</span><span class="sh">"</span><span class="s">wb</span><span class="sh">"</span><span class="p">)</span> <span class="k">as</span> <span class="n">output_file</span><span class="p">:</span>
            <span class="n">output_file</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="nb">file</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">))</span></code></pre></figure>
<h2 id="what-is-inside">What is inside?</h2>
<p><img
src="/assets/images/garfield/unpacked.png" alt="unpacked" class="centered" /></p>
<p>All files have been unpacked, and it looks like it worked correctly, as the <code class="language-plaintext highlighter-rouge">txt</code> files can be read, and the same goes for the <code class="language-plaintext highlighter-rouge">png</code> images. For the purpose of game modding, the <code class="language-plaintext highlighter-rouge">fonts</code> directory may be interesting, as it contains fonts and dialogs in different languages. For example, if you would like to translate the game to your language, you could modify those dialogs and then repack the files. Interestingly, when I run the game the language is set to Polish and it looks like it cannot be changed, but there is no Polish font in the <code class="language-plaintext highlighter-rouge">fonts</code> folder. It turns out the file named <code class="language-plaintext highlighter-rouge">english.lng</code> contains the translated Polish dialogs.</p>
<p>To prove that this script works for other files, I unpacked some of them, like <code class="language-plaintext highlighter-rouge">attic.pak</code>. Its content is a little bit different and consists of files with extensions such as:</p>
<ul> <li>png - Probably obvious: images of some assets, loading screens etc.</li> <li>rws - The most mysterious file extension; probably contains animations and textures</li> <li>ape - A plain text file that describes where the checkpoints are on the map, plus some information to assist the AI</li> <li>bin - Probably meshes</li> </ul>
<h2 id="conclusion">Conclusion</h2>
<p>As you can see, it was not that hard to make sense of how the <code class="language-plaintext highlighter-rouge">pak</code> file is structured and then to write a script to unpack its content. In future posts I will try to show examples with different difficulty levels (encryption, compression).
Thanks for reading.</p> You have probably heard about Garfield, the cat with orange fur that loves sleep and lasagna. It turns out he is not only the protagonist of comics and cartoons, but also of computer games. One such game, uniquely named Garfield, was released for PC in 2004 by a company named The Code Monkeys. In this post, we will take a look at the structure of the files used by the game.