<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-04-10T12:08:45+08:00</updated><id>/feed.xml</id><title type="html">Braid</title><subtitle>Love WASD, enjoy HJKL.</subtitle><author><name>Doodle</name></author><entry><title type="html">C++ Exception Handling ABI, part 1</title><link href="/%E5%AD%A6%E4%B9%A0/C++-Exception-ABI-part-1/" rel="alternate" type="text/html" title="C++ Exception Handling ABI, part 1" /><published>2026-04-09T00:00:00+08:00</published><updated>2026-04-09T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/C++%20Exception%20ABI-part-1</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/C++-Exception-ABI-part-1/"><![CDATA[<p>This is the first post in a series on C++ exception handling. It builds a basic mental model of how stack unwinding works; the next post fills in the deeper details.</p>

<h2 id="itanium-c-abi">Itanium C++ ABI</h2>

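<p>Any discussion of C++ exception handling has to start with the Itanium C++ ABI. Itanium is a processor architecture that was once meant to replace x86 and has since left the market. The architecture failed, but its by-product, the Itanium C++ ABI, long ago stopped serving only Itanium and gradually became the de facto C++ ABI standard on Unix-like platforms. The runtime behavior of C++ exception handling, RTTI, <code class="language-plaintext highlighter-rouge">dynamic_cast</code> and so on that we see today on x86-64, AArch64 and other platforms follows this specification.</p>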

<p>Although the Itanium ABI was originally designed for C++, its exception handling is built on lower-level components that are all language-independent:</p>

<ul>
  <li>System V ABI</li>
  <li>DWARF</li>
  <li>libunwind</li>
</ul>

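<p>As a result, any language that follows the Itanium ABI can use the same exception handling machinery. The exception handling part of the Itanium C++ ABI can therefore be split into two levels:</p>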

<ul>
  <li>Level 1: Base ABI, which defines the language-independent stack unwinding mechanism</li>
  <li>Level 2: C++ ABI, which adds C++ semantics such as <code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code> on top of Level 1</li>
</ul>

<p>The Base ABI describes the language-independent stack unwinding process and defines the <code class="language-plaintext highlighter-rouge">_Unwind_*</code> API. Common implementations, usually referred to as the <code class="language-plaintext highlighter-rouge">unwinder</code>, are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">libgcc_s.so.1</code> from <code class="language-plaintext highlighter-rouge">libgcc</code> (<code class="language-plaintext highlighter-rouge">libgcc_s</code> is part of the GCC runtime)</li>
  <li><code class="language-plaintext highlighter-rouge">libunwind</code>: https://github.com/libunwind/libunwind</li>
  <li><code class="language-plaintext highlighter-rouge">libunwind</code> from <code class="language-plaintext highlighter-rouge">llvm</code>: https://github.com/llvm/llvm-project/tree/main/libunwind</li>
</ul>

<p>The C++ ABI is tied to the C++ language itself. It defines the <code class="language-plaintext highlighter-rouge">__cxa_*</code> API (for example <code class="language-plaintext highlighter-rouge">__cxa_allocate_exception</code>, <code class="language-plaintext highlighter-rouge">__cxa_throw</code>, <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>) and specifies how these APIs implement the C++ <code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code> syntax. Common implementations are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">libsupc++</code> (support library for C++) from <code class="language-plaintext highlighter-rouge">libstdc++</code>: besides the <code class="language-plaintext highlighter-rouge">__cxa_*</code> API mentioned above, it also provides RTTI and <code class="language-plaintext highlighter-rouge">dynamic_cast</code>. In other words, <code class="language-plaintext highlighter-rouge">libsupc++</code> implements all of the dynamic-type support in C++</li>
  <li><code class="language-plaintext highlighter-rouge">libc++abi</code> from <code class="language-plaintext highlighter-rouge">llvm</code>: https://github.com/llvm/llvm-project/tree/main/libcxxabi</li>
</ul>

<blockquote>
  <p>The names of these libraries are easy to confuse. <code class="language-plaintext highlighter-rouge">libstdc++</code> = the C++ standard library + <code class="language-plaintext highlighter-rouge">libsupc++</code>, where <code class="language-plaintext highlighter-rouge">libsupc++</code> provides the <code class="language-plaintext highlighter-rouge">__cxa_*</code> API needed for exception handling. <code class="language-plaintext highlighter-rouge">libgcc</code> is the low-level runtime support library that compiler-generated code depends on, and it contains the <code class="language-plaintext highlighter-rouge">_Unwind_*</code> API needed for exception handling.</p>

</blockquote>

<table>
  <thead>
    <tr>
      <th>ABI level</th>
      <th>Role</th>
      <th>Common GCC implementation</th>
      <th>Common LLVM implementation</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Level 1: Base ABI (language-independent)</td>
      <td>Defines the <code class="language-plaintext highlighter-rouge">_Unwind_*</code> API and performs stack unwinding</td>
      <td><code class="language-plaintext highlighter-rouge">libgcc_s.so.1</code> (part of <code class="language-plaintext highlighter-rouge">libgcc</code>)</td>
      <td><code class="language-plaintext highlighter-rouge">libunwind</code></td>
    </tr>
    <tr>
      <td>Level 2: C++ ABI (language-specific)</td>
      <td>Maps <code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code> semantics onto the corresponding <code class="language-plaintext highlighter-rouge">__cxa_*</code> API calls</td>
      <td><code class="language-plaintext highlighter-rouge">libsupc++</code> (part of <code class="language-plaintext highlighter-rouge">libstdc++</code>)</td>
      <td><code class="language-plaintext highlighter-rouge">libc++abi</code></td>
    </tr>
  </tbody>
</table>

<p>The following gives a brief overview of what the two levels do; later sections cover each in detail.</p>

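<p>Level 1 specifies how stack unwinding is performed and is a generic, language-independent interface. It consists mainly of the following parts (it is fine if the details do not make sense yet):</p>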

<ul>
  <li>The <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code> exception object structure</li>
  <li>The <code class="language-plaintext highlighter-rouge">_Unwind_*</code> API; only the two most important functions are listed here:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code></li>
      <li><code class="language-plaintext highlighter-rouge">_Unwind_Resume</code></li>
    </ul>
  </li>
  <li>How stack unwinding proceeds: the process is split into a search phase and a cleanup phase, described in detail later</li>
  <li><code class="language-plaintext highlighter-rouge">personality</code>: during unwinding, the <code class="language-plaintext highlighter-rouge">unwinder</code> asks the <code class="language-plaintext highlighter-rouge">personality</code> of the current frame whether it can handle the exception:
    <ul>
      <li>If it can, which <code class="language-plaintext highlighter-rouge">catch</code> block to jump to, that is, at which instruction execution resumes</li>
      <li>If it cannot, whether the current frame needs extra cleanup work, such as destroying objects on the stack</li>
    </ul>
  </li>
</ul>

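<p>To summarize, this level specifies that <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> performs the two-phase unwind, while every language-specific concept involved, such as <code class="language-plaintext highlighter-rouge">catch</code> blocks or destroying objects that go out of scope, is encapsulated by the <code class="language-plaintext highlighter-rouge">personality</code>. This is exactly what lets the ABI support multiple languages and lets them interoperate with C++.</p>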

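<p>Level 2 is what we usually call the C++ ABI. It defines the runtime implementation rules and interfaces for C++ language features, and can be roughly divided into the following parts:</p>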

<ul>
  <li><code class="language-plaintext highlighter-rouge">__cxa_exception</code>: the C++ exception object structure, which embeds the Level 1 <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code></li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_*</code> API: the exception runtime API. In essence, the C++ exception syntax (<code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code>) is lowered to calls to these runtime functions
    <ul>
      <li><code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code></li>
      <li><code class="language-plaintext highlighter-rouge">__cxa_end_catch</code></li>
      <li><code class="language-plaintext highlighter-rouge">__cxa_allocate_exception</code></li>
      <li><code class="language-plaintext highlighter-rouge">__cxa_throw</code></li>
    </ul>
  </li>
  <li>RTTI: support for dynamic types, mainly:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">std::type_info</code></li>
      <li>The <code class="language-plaintext highlighter-rouge">typeid</code> operator</li>
      <li>How types are compared</li>
      <li><code class="language-plaintext highlighter-rouge">dynamic_cast</code></li>
    </ul>
  </li>
</ul>

<blockquote>
  <p>Exception handling has to check whether the type of the thrown exception matches the type declared in a <code class="language-plaintext highlighter-rouge">catch</code> clause, so RTTI takes part in exception handling as well.</p>

</blockquote>
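
<p>The effect of this type matching can be observed from ordinary C++ code. The sketch below is only an illustration; the names <code class="language-plaintext highlighter-rouge">Base</code>, <code class="language-plaintext highlighter-rouge">Derived</code> and the two helper functions are hypothetical. It shows that a handler for a base class matches a thrown derived object, and that no arithmetic conversion is applied when matching a handler.</p>

```cpp
#include <typeinfo>

struct Base { virtual ~Base() = default; };
struct Derived : Base {};

// Throws a Derived and catches it through a Base handler. The handler matches
// because Base is an accessible base class of the thrown type; typeid on the
// polymorphic reference still reports the dynamic type.
bool caught_as_base_is_derived() {
  try {
    throw Derived{};
  } catch (const Base &e) {
    return typeid(e) == typeid(Derived);
  }
  return false;
}

// Handler matching does not apply integral conversions: a thrown int skips
// the catch (long) handler and is caught by the outer catch (int).
bool int_is_not_caught_as_long() {
  try {
    try {
      throw 1;
    } catch (long) {
      return false;
    }
  } catch (int) {
    return true;
  }
  return false;
}
```

<p>Both functions return <code class="language-plaintext highlighter-rouge">true</code>; the comparison between the thrown type and each handler type is the part of exception handling that relies on RTTI.</p>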

<h2 id="level-1-base-abi">Level 1: Base ABI</h2>

<h3 id="_unwind_exception">_Unwind_Exception</h3>

<p>The data structure looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Level 1</span>
<span class="k">struct</span> <span class="nc">_Unwind_Exception</span> <span class="p">{</span>
  <span class="n">_Unwind_Exception_Class</span> <span class="n">exception_class</span><span class="p">;</span> <span class="c1">// an identifier, used to tell whether the exception is native</span>
  <span class="n">_Unwind_Exception_Cleanup_Fn</span> <span class="n">exception_cleanup</span><span class="p">;</span>
  <span class="n">_Unwind_Word</span> <span class="n">private_1</span><span class="p">;</span> <span class="c1">// zero: normal unwind; non-zero: forced unwind, the _Unwind_Stop_Fn function</span>
  <span class="n">_Unwind_Word</span> <span class="n">private_2</span><span class="p">;</span> <span class="c1">// saved stack pointer</span>
<span class="p">}</span> <span class="n">__attribute__</span><span class="p">((</span><span class="n">aligned</span><span class="p">));</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">exception_class</code> and <code class="language-plaintext highlighter-rouge">exception_cleanup</code> are set by the Level 2 API that throws the exception. Level 1 does not care what <code class="language-plaintext highlighter-rouge">exception_class</code> means; it passes the value unchanged to the <code class="language-plaintext highlighter-rouge">personality</code>, which decides whether the current exception is a <code class="language-plaintext highlighter-rouge">native exception</code> or a <code class="language-plaintext highlighter-rouge">foreign exception</code> (roughly speaking, an exception thrown by the C++ runtime is a <code class="language-plaintext highlighter-rouge">native exception</code> and one thrown by another language is a <code class="language-plaintext highlighter-rouge">foreign exception</code>; the end of this post adds a bit more on this).</p>

<p><code class="language-plaintext highlighter-rouge">exception_class</code> identifies which language and runtime the exception object belongs to: the upper four bytes conventionally encode the vendor and the lower four bytes the language. For example, <code class="language-plaintext highlighter-rouge">__cxa_throw</code> in <code class="language-plaintext highlighter-rouge">libc++abi</code> sets <code class="language-plaintext highlighter-rouge">exception_class</code> to the <code class="language-plaintext highlighter-rouge">uint64_t</code> representing <code class="language-plaintext highlighter-rouge">"CLNGC++\0"</code>, while <code class="language-plaintext highlighter-rouge">libsupc++</code> uses the <code class="language-plaintext highlighter-rouge">uint64_t</code> representing <code class="language-plaintext highlighter-rouge">"GNUCC++\0"</code>. <code class="language-plaintext highlighter-rouge">exception_cleanup</code> holds the function that destroys the exception object; it is invoked through <code class="language-plaintext highlighter-rouge">_Unwind_DeleteException</code>, typically by a Level 2 API once a handler has finished with the exception, most notably when the exception is a foreign one.</p>
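
<p>To make the byte layout concrete, here is a small sketch of how such a class value can be packed and split. The helpers <code class="language-plaintext highlighter-rouge">make_exception_class</code>, <code class="language-plaintext highlighter-rouge">vendor_of</code> and <code class="language-plaintext highlighter-rouge">language_of</code> are hypothetical names made up for this illustration; they are not part of any ABI header.</p>

```cpp
#include <cstdint>
#include <string>

// Packs an 8-character tag into a 64-bit value, first character in the most
// significant byte. Hypothetical helper, for illustration only.
constexpr std::uint64_t make_exception_class(const char (&tag)[9]) {
  std::uint64_t value = 0;
  for (int i = 0; i < 8; ++i)
    value = (value << 8) | static_cast<unsigned char>(tag[i]);
  return value;
}

// Upper four bytes: the vendor part, e.g. "GNUC" or "CLNG".
std::string vendor_of(std::uint64_t cls) {
  std::string out;
  for (int shift = 56; shift >= 32; shift -= 8)
    out.push_back(static_cast<char>((cls >> shift) & 0xff));
  return out;
}

// Lower four bytes: the language part, e.g. "C++" followed by a NUL byte,
// which is dropped here.
std::string language_of(std::uint64_t cls) {
  std::string out;
  for (int shift = 24; shift >= 0; shift -= 8) {
    const char c = static_cast<char>((cls >> shift) & 0xff);
    if (c != '\0') out.push_back(c);
  }
  return out;
}
```

<p>With these helpers, <code class="language-plaintext highlighter-rouge">make_exception_class("GNUCC++\0")</code> yields <code class="language-plaintext highlighter-rouge">0x474E5543432B2B00</code> and <code class="language-plaintext highlighter-rouge">make_exception_class("CLNGC++\0")</code> yields <code class="language-plaintext highlighter-rouge">0x434C4E47432B2B00</code>. A personality routine only needs to compare this 64-bit value against its own runtime's constant to tell native exceptions from foreign ones.</p>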

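<p>The information needed during unwinding, such as how to recover the caller frame's IP and SP from the current IP or SP register, is implementation-defined. For ELF, this information is stored in <code class="language-plaintext highlighter-rouge">.eh_frame</code> and <code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code>. These details are not needed to follow the main unwinding flow, so they are covered in the next post.</p>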

<h3 id="api">API</h3>

<p><code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> performs the stack unwinding for an exception. When it succeeds it never returns in the usual sense: control ends up either in the matching <code class="language-plaintext highlighter-rouge">catch</code> block or, in frames that cannot catch the exception, in cleanup code that destroys local objects. It returns to its caller only on failure, for example when no handler is found. The whole process is split into two phases: the <code class="language-plaintext highlighter-rouge">search phase</code> and the <code class="language-plaintext highlighter-rouge">cleanup phase</code>.</p>

<ul>
  <li>In the search phase, the unwinder looks for a <code class="language-plaintext highlighter-rouge">catch</code> that can handle the exception and records the stack pointer of the corresponding frame in <code class="language-plaintext highlighter-rouge">private_2</code>
    <ul>
      <li>Walk the call chain frame by frame, using IP, SP and the other saved registers</li>
      <li>For each frame, skip it if it has no <code class="language-plaintext highlighter-rouge">personality</code>; otherwise call the <code class="language-plaintext highlighter-rouge">personality</code> with <code class="language-plaintext highlighter-rouge">_UA_SEARCH_PHASE</code> as the action argument</li>
      <li>If the <code class="language-plaintext highlighter-rouge">personality</code> returns <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code>, keep searching up the stack</li>
      <li>If the <code class="language-plaintext highlighter-rouge">personality</code> returns <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code>, a matching <code class="language-plaintext highlighter-rouge">catch</code> block was found and the stack pointer of that frame is saved in <code class="language-plaintext highlighter-rouge">private_2</code></li>
      <li>If an error such as an ABI-level mismatch is detected along the way, the search stops</li>
    </ul>
  </li>
  <li>In the cleanup phase, control first goes to the cleanup code (usually destructors of local variables) of each frame that was traversed during the search phase but did not catch the exception, and is finally handed to the <code class="language-plaintext highlighter-rouge">catch</code> block found in the search phase
    <ul>
      <li>Again walk the call chain frame by frame, using IP, SP and the other registers</li>
      <li>For each frame, skip it if it has no <code class="language-plaintext highlighter-rouge">personality</code>; otherwise call the <code class="language-plaintext highlighter-rouge">personality</code> with <code class="language-plaintext highlighter-rouge">_UA_CLEANUP_PHASE</code> as the action argument. The frame marked during the search phase additionally gets <code class="language-plaintext highlighter-rouge">_UA_HANDLER_FRAME</code></li>
      <li>If the <code class="language-plaintext highlighter-rouge">personality</code> returns <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code>, there is no <code class="language-plaintext highlighter-rouge">landing pad</code>, meaning the frame needs no extra handling</li>
      <li>If the <code class="language-plaintext highlighter-rouge">personality</code> returns <code class="language-plaintext highlighter-rouge">_URC_INSTALL_CONTEXT</code>, a <code class="language-plaintext highlighter-rouge">landing pad</code> was found and execution must continue there</li>
      <li>For intermediate frames that were not marked during the search phase, the <code class="language-plaintext highlighter-rouge">landing pad</code> only performs cleanup (usually destroying variables that have gone out of scope) and then calls <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> to return to the cleanup phase</li>
      <li>For the frame marked during the search phase, the <code class="language-plaintext highlighter-rouge">landing pad</code> calls <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>, then runs the code in the <code class="language-plaintext highlighter-rouge">catch</code> block, and finally calls <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>, which destroys the exception object</li>
    </ul>

    <blockquote>
      <p>The <code class="language-plaintext highlighter-rouge">landing pad</code> is introduced in the Level 2 section below; it is a piece of code the compiler generates for a function to handle exceptions. One additional point: the actual jump to the <code class="language-plaintext highlighter-rouge">landing pad</code> is performed by the <code class="language-plaintext highlighter-rouge">unwinder</code>, while where to jump is decided by the <code class="language-plaintext highlighter-rouge">personality</code>.</p>

    </blockquote>
  </li>
</ul>

<p>The <code class="language-plaintext highlighter-rouge">personality</code> is covered in detail in the next post. For now it is enough to know that it connects the Level 1 and Level 2 APIs, and that its main jobs are:</p>

<ul>
  <li>During unwinding, check whether each frame has a matching <code class="language-plaintext highlighter-rouge">catch</code></li>
  <li>In the search phase, return <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code> or <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code> to indicate whether the frame can handle the exception</li>
  <li>In the cleanup phase, return <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code> or <code class="language-plaintext highlighter-rouge">_URC_INSTALL_CONTEXT</code> to indicate whether to jump to a <code class="language-plaintext highlighter-rouge">landing pad</code></li>
</ul>

<p>Besides these, there are a few other commonly used APIs:</p>

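<ul>
  <li><code class="language-plaintext highlighter-rouge">_Unwind_ForcedUnwind</code>: forced unwinding, which skips the search phase and goes straight to the cleanup phase; a typical use is <code class="language-plaintext highlighter-rouge">pthread_cancel</code></li>
  <li><code class="language-plaintext highlighter-rouge">_Unwind_Resume</code>: almost the only Level 1 API that compiler-generated code calls directly. If the current frame cannot catch the exception but has stack objects to clean up first, the cleanup code calls <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> when it is done to continue the cleanup phase</li>
  <li><code class="language-plaintext highlighter-rouge">_Unwind_DeleteException</code>: calls the <code class="language-plaintext highlighter-rouge">exception_cleanup</code> stored in the <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code> to destroy the given exception object</li>
  <li><code class="language-plaintext highlighter-rouge">_Unwind_Backtrace</code>: ignores the <code class="language-plaintext highlighter-rouge">personality</code> and invokes a callback for each frame instead. A typical use is producing a backtrace, as glibc's <code class="language-plaintext highlighter-rouge">backtrace()</code> does (gdb's backtrace works on the same principle, but with gdb's own unwinder): look up the current instruction pointer <code class="language-plaintext highlighter-rouge">%rip</code> in <code class="language-plaintext highlighter-rouge">.eh_frame</code>, compute where the caller's frame is, and repeat</li>
</ul>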
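
<p>As a rough illustration of <code class="language-plaintext highlighter-rouge">_Unwind_Backtrace</code>, the sketch below collects the instruction pointers of the current call chain. It assumes a toolchain whose <code class="language-plaintext highlighter-rouge">&lt;unwind.h&gt;</code> provides <code class="language-plaintext highlighter-rouge">_Unwind_Backtrace</code> and <code class="language-plaintext highlighter-rouge">_Unwind_GetIP</code> (GCC and Clang on Linux both do); the <code class="language-plaintext highlighter-rouge">Trace</code> struct and the function names are hypothetical.</p>

```cpp
#include <unwind.h>

#include <cstdint>

// Fixed-size buffer for the collected instruction pointers (hypothetical type).
struct Trace {
  static constexpr int kMax = 64;
  std::uintptr_t ips[kMax];
  int count = 0;
};

// Called by the unwinder once per frame; no personality routine is involved.
static _Unwind_Reason_Code record_frame(struct _Unwind_Context *ctx, void *arg) {
  auto *trace = static_cast<Trace *>(arg);
  if (trace->count == Trace::kMax)
    return _URC_END_OF_STACK;  // any code other than _URC_NO_REASON stops the walk
  trace->ips[trace->count++] = static_cast<std::uintptr_t>(_Unwind_GetIP(ctx));
  return _URC_NO_REASON;  // keep walking towards the outermost frame
}

// Walks the stack of the calling thread and returns what was recorded.
Trace capture_backtrace() {
  Trace trace;
  _Unwind_Backtrace(record_frame, &trace);
  return trace;
}
```

<p>Turning the recorded addresses into function names is a separate step (for example with <code class="language-plaintext highlighter-rouge">dladdr</code> or an external symbolizer) and is left out of this sketch.</p>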

<p>The overall unwinding logic of <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> looks roughly like this (simplified):</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="n">_Unwind_Reason_Code</span> <span class="nf">unwind_phase1</span><span class="p">(</span><span class="n">unw_context_t</span> <span class="o">*</span><span class="n">uc</span><span class="p">,</span> <span class="n">_Unwind_Context</span> <span class="o">*</span><span class="n">ctx</span><span class="p">,</span>
                                         <span class="n">_Unwind_Exception</span> <span class="o">*</span><span class="n">obj</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// Search phase: unwind and call personality with _UA_SEARCH_PHASE for each frame</span>
  <span class="c1">// until a handler (catch block) is found.</span>
  <span class="n">unw_init_local</span><span class="p">(</span><span class="n">uc</span><span class="p">,</span> <span class="n">ctx</span><span class="p">);</span>
  <span class="k">for</span><span class="p">(;;)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">fdeMissing</span><span class="p">)</span> <span class="k">return</span> <span class="n">_URC_END_OF_STACK</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">step</span><span class="p">(</span><span class="n">ctx</span><span class="p">))</span> <span class="k">return</span> <span class="n">_URC_FATAL_PHASE1_ERROR</span><span class="p">;</span>
    <span class="n">ctx</span><span class="o">-&gt;</span><span class="n">getFdeAndCieFromIP</span><span class="p">();</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">personality</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
    <span class="k">switch</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">personality</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">_UA_SEARCH_PHASE</span><span class="p">,</span> <span class="n">obj</span><span class="o">-&gt;</span><span class="n">exception_class</span><span class="p">,</span> <span class="n">obj</span><span class="p">,</span> <span class="n">ctx</span><span class="p">))</span> <span class="p">{</span>
    <span class="k">case</span> <span class="n">_URC_CONTINUE_UNWIND</span><span class="p">:</span> <span class="k">break</span><span class="p">;</span>
    <span class="k">case</span> <span class="n">_URC_HANDLER_FOUND</span><span class="p">:</span>
      <span class="n">unw_get_reg</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">UNW_REG_SP</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">obj</span><span class="o">-&gt;</span><span class="n">private_2</span><span class="p">);</span>
      <span class="k">return</span> <span class="n">_URC_NO_REASON</span><span class="p">;</span>
    <span class="nl">default:</span> <span class="k">return</span> <span class="n">_URC_FATAL_PHASE1_ERROR</span><span class="p">;</span> <span class="c1">// e.g. stack corruption</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="k">return</span> <span class="n">_URC_NO_REASON</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">static</span> <span class="n">_Unwind_Reason_Code</span> <span class="nf">unwind_phase2</span><span class="p">(</span><span class="n">unw_context_t</span> <span class="o">*</span><span class="n">uc</span><span class="p">,</span> <span class="n">_Unwind_Context</span> <span class="o">*</span><span class="n">ctx</span><span class="p">,</span>
                                         <span class="n">_Unwind_Exception</span> <span class="o">*</span><span class="n">obj</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// Cleanup phase: unwind and call personality with _UA_CLEANUP_PHASE for each frame</span>
  <span class="c1">// until reaching the handler. Restore the register state and transfer control.</span>
  <span class="n">unw_init_local</span><span class="p">(</span><span class="n">uc</span><span class="p">,</span> <span class="n">ctx</span><span class="p">);</span>
  <span class="k">for</span><span class="p">(;;)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">fdeMissing</span><span class="p">)</span> <span class="k">return</span> <span class="n">_URC_END_OF_STACK</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">step</span><span class="p">(</span><span class="n">ctx</span><span class="p">))</span> <span class="k">return</span> <span class="n">_URC_FATAL_PHASE2_ERROR</span><span class="p">;</span>
    <span class="n">ctx</span><span class="o">-&gt;</span><span class="n">getFdeAndCieFromIP</span><span class="p">();</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">personality</span><span class="p">)</span> <span class="k">continue</span><span class="p">;</span>
    <span class="n">_Unwind_Action</span> <span class="n">actions</span> <span class="o">=</span> <span class="n">_UA_CLEANUP_PHASE</span><span class="p">;</span>
    <span class="kt">size_t</span> <span class="n">sp</span><span class="p">;</span>
    <span class="n">unw_get_reg</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">UNW_REG_SP</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">sp</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">sp</span> <span class="o">==</span> <span class="n">obj</span><span class="o">-&gt;</span><span class="n">private_2</span><span class="p">)</span> <span class="n">actions</span> <span class="o">|=</span> <span class="n">_UA_HANDLER_FRAME</span><span class="p">;</span>
    <span class="k">switch</span> <span class="p">(</span><span class="n">ctx</span><span class="o">-&gt;</span><span class="n">personality</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">actions</span><span class="p">,</span> <span class="n">obj</span><span class="o">-&gt;</span><span class="n">exception_class</span><span class="p">,</span> <span class="n">obj</span><span class="p">,</span> <span class="n">ctx</span><span class="p">))</span> <span class="p">{</span>
    <span class="k">case</span> <span class="n">_URC_CONTINUE_UNWIND</span><span class="p">:</span>
      <span class="k">break</span><span class="p">;</span>
    <span class="k">case</span> <span class="n">_URC_INSTALL_CONTEXT</span><span class="p">:</span>
      <span class="n">unw_resume</span><span class="p">(</span><span class="n">ctx</span><span class="p">);</span> <span class="c1">// Return if there is an error</span>
      <span class="k">return</span> <span class="n">_URC_FATAL_PHASE2_ERROR</span><span class="p">;</span>
    <span class="nl">default:</span> <span class="k">return</span> <span class="n">_URC_FATAL_PHASE2_ERROR</span><span class="p">;</span> <span class="c1">// Unknown result code</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="k">return</span> <span class="n">_URC_FATAL_PHASE2_ERROR</span><span class="p">;</span>
<span class="p">}</span>

<span class="n">_Unwind_Reason_Code</span> <span class="nf">_Unwind_RaiseException</span><span class="p">(</span><span class="n">_Unwind_Exception</span> <span class="o">*</span><span class="n">obj</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">unw_context_t</span> <span class="n">uc</span><span class="p">;</span>
  <span class="n">_Unwind_Context</span> <span class="n">ctx</span><span class="p">;</span>
  <span class="n">__unw_getcontext</span><span class="p">(</span><span class="o">&amp;</span><span class="n">uc</span><span class="p">);</span>
  <span class="n">_Unwind_Reason_Code</span> <span class="n">phase1</span> <span class="o">=</span> <span class="n">unwind_phase1</span><span class="p">(</span><span class="o">&amp;</span><span class="n">uc</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">ctx</span><span class="p">,</span> <span class="n">obj</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">phase1</span> <span class="o">!=</span> <span class="n">_URC_NO_REASON</span><span class="p">)</span> <span class="k">return</span> <span class="n">phase1</span><span class="p">;</span>
  <span class="k">return</span> <span class="n">unwind_phase2</span><span class="p">(</span><span class="o">&amp;</span><span class="n">uc</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">ctx</span><span class="p">,</span> <span class="n">obj</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

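<p>This could obviously be done in a single pass. The main reason for walking the stack twice is to avoid doing any real unwinding too early when no <code class="language-plaintext highlighter-rouge">catch</code> can handle the exception. If the search phase finds no frame that can handle the exception, the runtime can terminate the program right away, with the stack of the throw site still intact.</p>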
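
<p>The benefit of the two passes can be seen in a toy model. The sketch below does not touch a real stack; <code class="language-plaintext highlighter-rouge">Frame</code> and <code class="language-plaintext highlighter-rouge">simulate_raise</code> are hypothetical names, and each frame simply states up front what its personality would answer.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for one stack frame and the answers of its personality.
struct Frame {
  std::string name;
  bool has_cleanup;  // frame owns objects that must be destroyed
  bool catches;      // frame has a catch clause matching the exception
};

// Simulates a two-phase unwind over `stack` (innermost frame first) and
// returns a log of the observable actions. Purely illustrative.
std::vector<std::string> simulate_raise(const std::vector<Frame> &stack) {
  std::vector<std::string> log;

  // Phase 1 (search): find the handler frame without modifying anything.
  std::size_t handler = stack.size();
  for (std::size_t i = 0; i < stack.size(); ++i) {
    if (stack[i].catches) {
      handler = i;
      break;
    }
  }
  if (handler == stack.size()) {
    log.push_back("terminate");  // no handler: no cleanup has run yet
    return log;
  }

  // Phase 2 (cleanup): run the cleanups below the handler frame, then land in it.
  for (std::size_t i = 0; i < handler; ++i) {
    if (stack[i].has_cleanup) log.push_back("cleanup " + stack[i].name);
  }
  log.push_back("catch in " + stack[handler].name);
  return log;
}
```

<p>For a stack where only the outermost frame catches, the log contains the cleanups of the inner frames followed by the catch. When no frame catches, the log contains only <code class="language-plaintext highlighter-rouge">terminate</code>, mirroring the fact that nothing has been unwound at that point.</p>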

<h2 id="level-2-c-abi">Level 2: C++ ABI</h2>

<p>Building on Level 1, Level 2 defines the <code class="language-plaintext highlighter-rouge">__cxa_*</code> API (for example <code class="language-plaintext highlighter-rouge">__cxa_allocate_exception</code>, <code class="language-plaintext highlighter-rouge">__cxa_throw</code>, <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>, <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>) and specifies how these APIs implement the C++ <code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code> syntax.</p>

<h3 id="__cxa_exception">__cxa_exception</h3>

<p><code class="language-plaintext highlighter-rouge">__cxa_exception</code> extends <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code> with the information needed for C++ exception semantics.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">__cxa_exception</span> <span class="p">{</span>
  <span class="kt">void</span> <span class="o">*</span><span class="n">reserve</span><span class="p">;</span> <span class="c1">// here on 64-bit platforms</span>
  <span class="kt">size_t</span> <span class="n">referenceCount</span><span class="p">;</span> <span class="c1">// here on 64-bit platforms</span>
  <span class="n">std</span><span class="o">::</span><span class="n">type_info</span> <span class="o">*</span><span class="n">exceptionType</span><span class="p">;</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">exceptionDestructor</span><span class="p">)(</span><span class="kt">void</span> <span class="o">*</span><span class="p">);</span>
  <span class="n">unexpected_handler</span> <span class="n">unexpectedHandler</span><span class="p">;</span> <span class="c1">// by default std::get_unexpected()</span>
  <span class="n">terminate_handler</span> <span class="n">terminateHandler</span><span class="p">;</span> <span class="c1">// by default std::get_terminate()</span>
  <span class="n">__cxa_exception</span> <span class="o">*</span><span class="n">nextException</span><span class="p">;</span> <span class="c1">// linked to the next exception on the thread stack</span>
  <span class="kt">int</span> <span class="n">handlerCount</span><span class="p">;</span> <span class="c1">// incremented in __cxa_begin_catch, decremented in __cxa_end_catch, negated in __cxa_rethrow; last non-dependent performs the clean</span>

  <span class="c1">// The following fields cache information the catch handler found in phase 1.</span>
  <span class="kt">int</span> <span class="n">handlerSwitchValue</span><span class="p">;</span> <span class="c1">// ttypeIndex in libc++abi</span>
  <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">actionRecord</span><span class="p">;</span>
  <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">languageSpecificData</span><span class="p">;</span>
  <span class="kt">void</span> <span class="o">*</span><span class="n">catchTemp</span><span class="p">;</span> <span class="c1">// landingPad</span>
  <span class="kt">void</span> <span class="o">*</span><span class="n">adjustedPtr</span><span class="p">;</span> <span class="c1">// adjusted pointer of the exception object</span>

  <span class="n">_Unwind_Exception</span> <span class="n">unwindHeader</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

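<p>Each thread maintains a stack of currently caught exceptions. <code class="language-plaintext highlighter-rouge">caughtExceptions</code> points to the top of the stack, that is, the most recently caught exception, and <code class="language-plaintext highlighter-rouge">__cxa_exception::nextException</code> points to the next exception in the stack.</p>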

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">__cxa_eh_globals</span> <span class="p">{</span>
  <span class="n">__cxa_exception</span> <span class="o">*</span><span class="n">caughtExceptions</span><span class="p">;</span>
  <span class="kt">unsigned</span> <span class="n">uncaughtExceptions</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">try</span> <span class="p">{</span>
    <span class="k">throw</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="mi">2</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
      <span class="c1">// The global exception stack has two exceptions here.</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
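
<p>The other field, <code class="language-plaintext highlighter-rouge">uncaughtExceptions</code>, is what <code class="language-plaintext highlighter-rouge">std::uncaught_exceptions()</code> reports in common implementations. The sketch below assumes C++17; <code class="language-plaintext highlighter-rouge">Probe</code> and <code class="language-plaintext highlighter-rouge">observe_uncaught_counts</code> are hypothetical names. It records the counter at three points: on a normal scope exit, inside a destructor that runs during unwinding, and inside the handler.</p>

```cpp
#include <exception>
#include <stdexcept>
#include <vector>

// Records std::uncaught_exceptions() at the moment the object is destroyed.
struct Probe {
  std::vector<int> *out;
  ~Probe() { out->push_back(std::uncaught_exceptions()); }
};

std::vector<int> observe_uncaught_counts() {
  std::vector<int> seen;
  seen.reserve(3);  // avoid allocating inside a destructor during unwinding

  { Probe normal_exit{&seen}; }  // destroyed with no exception in flight

  try {
    Probe during_unwind{&seen};
    throw std::runtime_error("boom");  // ~Probe runs in the cleanup phase
  } catch (const std::exception &) {
    // Entering the handler marks the exception as caught again.
    seen.push_back(std::uncaught_exceptions());
  }
  return seen;
}
```

<p>The function returns <code class="language-plaintext highlighter-rouge">{0, 1, 0}</code>: the counter is raised when the exception is thrown and lowered again when the handler is entered.</p>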

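<p>The information needed to actually handle an exception, such as whether a given instruction pointer (IP) lies within a <code class="language-plaintext highlighter-rouge">try-catch</code> range, or whether there are out-of-scope variables whose destructors must run, usually lives in the <code class="language-plaintext highlighter-rouge">language-specific data area</code> (LSDA). This is an implementation detail and is not specified directly by the Level 2 ABI.</p>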

<blockquote>
  <p>In ELF, the LSDA is the <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> section; the next post covers it in detail.</p>

</blockquote>

<h3 id="landing-pad"><strong>Landing pad</strong></h3>

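<p>A <code class="language-plaintext highlighter-rouge">landing pad</code> is a piece of compiler-generated code dedicated to exception handling. It usually performs one of the following three actions (note that each frame performs only one of them):</p>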

<ul>
  <li>The frame cannot catch the exception: call the destructors of out-of-scope variables, or the callbacks registered via <code class="language-plaintext highlighter-rouge">__attribute__((cleanup(...)))</code>, then call <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> to return to the cleanup phase</li>
  <li>The frame can catch the exception: first destroy out-of-scope variables, then call <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>, run the code in the <code class="language-plaintext highlighter-rouge">catch</code> block, and finally call <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code></li>
  <li>The <code class="language-plaintext highlighter-rouge">catch</code> contains a <code class="language-plaintext highlighter-rouge">rethrow</code>: first destroy the local variables defined in the <code class="language-plaintext highlighter-rouge">catch</code> clause, then call <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>, and continue the cleanup phase via <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code></li>
</ul>

<p>If a <code class="language-plaintext highlighter-rouge">try</code> block is followed by several <code class="language-plaintext highlighter-rouge">catch</code> clauses, the LSDA contains several <code class="language-plaintext highlighter-rouge">catch</code> entries. At the code-generation level, however, they are usually merged into a single <code class="language-plaintext highlighter-rouge">landing pad</code>. Before transferring control to the <code class="language-plaintext highlighter-rouge">landing pad</code>, the <code class="language-plaintext highlighter-rouge">personality</code> routine calls <code class="language-plaintext highlighter-rouge">_Unwind_SetGR</code> to place <code class="language-plaintext highlighter-rouge">handlerSwitchValue</code> in the register corresponding to <code class="language-plaintext highlighter-rouge">__builtin_eh_return_data_regno(1)</code> (<code class="language-plaintext highlighter-rouge">%rdx</code> on x86_64). This value tells the <code class="language-plaintext highlighter-rouge">landing pad</code> which exception type matched, so that it can jump to the corresponding <code class="language-plaintext highlighter-rouge">catch</code> block.</p>
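<p>As a small illustration of several <code class="language-plaintext highlighter-rouge">catch</code> clauses sharing one <code class="language-plaintext highlighter-rouge">try</code> block, the sketch below throws a different type depending on its argument and reports which clause ran. The function name <code class="language-plaintext highlighter-rouge">classify</code> is made up for this example. Whether a compiler really emits a single landing pad for it depends on the compiler and optimization level.</p>

```cpp
#include <cassert>
#include <string>

// Throws a different type depending on `which`, then reports which catch
// clause handled it. All three clauses belong to one try block, so a
// compiler will typically emit one landing pad that compares the selector
// register against each clause's switch value.
std::string classify(int which) {
    try {
        if (which == 0) throw 1;
        if (which == 1) throw 2.5;
        throw "text";               // thrown as const char*
    } catch (int) {
        return "int";
    } catch (double) {
        return "double";
    } catch (...) {
        return "other";
    }
}
```

<p>Compiling this with <code class="language-plaintext highlighter-rouge">-S</code> and looking for comparisons against <code class="language-plaintext highlighter-rouge">%rdx</code> is one way to see the dispatch described above.</p>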

<p>A <code class="language-plaintext highlighter-rouge">rethrow</code> is triggered through <code class="language-plaintext highlighter-rouge">__cxa_rethrow</code> while the <code class="language-plaintext highlighter-rouge">catch</code> code is running. The local variables defined in the <code class="language-plaintext highlighter-rouge">catch</code> clause must be destroyed first, and then <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code> is called to balance the <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> made when the <code class="language-plaintext highlighter-rouge">catch</code> started.</p>
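<p>One observable consequence of rethrowing is that the outer handler receives the same exception object rather than a copy. The sketch below checks this by comparing addresses; the function name <code class="language-plaintext highlighter-rouge">rethrow_keeps_object</code> is made up for this example.</p>

```cpp
#include <cassert>

// Returns true when the handler that receives a rethrown exception sees
// the same exception object as the handler that rethrew it.
bool rethrow_keeps_object() {
    const int *inner = nullptr;
    const int *outer = nullptr;
    try {
        try {
            throw 42;
        } catch (const int &x) {
            inner = &x;   // address of the thrown object
            throw;        // on Itanium-ABI targets this becomes a call to __cxa_rethrow
        }
    } catch (const int &y) {
        outer = &y;       // same object, so the same address
    }
    return inner != nullptr && inner == outer;
}
```

<p>Catching by reference matters here: catching by value would copy the <code class="language-plaintext highlighter-rouge">int</code> into a local variable and the addresses would differ.</p>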

<h3 id="api-1">API</h3>

<ul>
  <li>
    <p><code class="language-plaintext highlighter-rouge">__cxa_allocate_exception</code>: when the code contains <code class="language-plaintext highlighter-rouge">throw A();</code>, the compiler-generated code calls this function to allocate a block of memory that holds both the <code class="language-plaintext highlighter-rouge">__cxa_exception</code> header and the <code class="language-plaintext highlighter-rouge">A</code> object. The <code class="language-plaintext highlighter-rouge">__cxa_exception</code> header sits immediately before the <code class="language-plaintext highlighter-rouge">A</code> object, at the lower address. The following function shows the relationship between the program-visible exception object address and <code class="language-plaintext highlighter-rouge">__cxa_exception</code>:</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">static</span> <span class="kt">void</span> <span class="o">*</span><span class="nf">thrown_object_from_cxa_exception</span><span class="p">(</span><span class="n">__cxa_exception</span> <span class="o">*</span><span class="n">exception_header</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">void</span> <span class="o">*&gt;</span><span class="p">(</span><span class="n">exception_header</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>  <span class="c1">// address of A</span>
  <span class="p">}</span>
</code></pre></div>    </div>

    <blockquote>
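      <p>Note that <code class="language-plaintext highlighter-rouge">__cxa_exception</code> is allocated on the heap. The runtime usually also reserves a small block of memory at startup and pre-constructs a <code class="language-plaintext highlighter-rouge">std::bad_alloc</code>, so that an exception can still be thrown when memory allocation fails.</p>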

    </blockquote>
  </li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_throw</code>: uses the relationship above to locate the <code class="language-plaintext highlighter-rouge">__cxa_exception</code> header, fills in its fields (<code class="language-plaintext highlighter-rouge">referenceCount</code>, <code class="language-plaintext highlighter-rouge">unwindHeader.exception_class</code>, <code class="language-plaintext highlighter-rouge">unexpectedHandler</code>, <code class="language-plaintext highlighter-rouge">terminateHandler</code>, <code class="language-plaintext highlighter-rouge">exceptionType</code>, <code class="language-plaintext highlighter-rouge">exceptionDestructor</code>, <code class="language-plaintext highlighter-rouge">unwindHeader.exception_cleanup</code>), and then calls <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> to start stack unwinding</li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>: the compiler emits a call to it at the start of a <code class="language-plaintext highlighter-rouge">catch</code> block. It increments <code class="language-plaintext highlighter-rouge">handlerCount</code> in <code class="language-plaintext highlighter-rouge">__cxa_exception</code>, updates the current thread's global exception stack, and returns the address of the thrown object</li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>: the compiler emits a call to it at the end of a <code class="language-plaintext highlighter-rouge">catch</code> block, or before a <code class="language-plaintext highlighter-rouge">rethrow</code> leaves the block. It decrements <code class="language-plaintext highlighter-rouge">handlerCount</code> in <code class="language-plaintext highlighter-rouge">__cxa_exception</code>; when the count reaches 0, the exception is popped from the global exception stack</li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_rethrow</code>: marks the exception object as rethrown. When <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code> later decrements <code class="language-plaintext highlighter-rouge">handlerCount</code> to 0, the exception object is therefore not destroyed, because it is still needed once <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> resumes the cleanup phase</li>
</ul>
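<p>The bookkeeping done by <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> and <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code> can be modelled in a few lines. This is a toy model, not runtime code: <code class="language-plaintext highlighter-rouge">ModelException</code>, <code class="language-plaintext highlighter-rouge">caught_top</code> and the <code class="language-plaintext highlighter-rouge">model_*</code> functions are hypothetical names, and the rethrow mark, <code class="language-plaintext highlighter-rouge">referenceCount</code> and foreign exceptions are left out.</p>

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for __cxa_exception.
struct ModelException {
    int handlerCount = 0;
    ModelException *nextException = nullptr;
    bool destroyed = false;
};

// Hypothetical stand-in for the per-thread stack of caught exceptions.
thread_local ModelException *caught_top = nullptr;

// Mirrors __cxa_begin_catch: bump the handler count and push the exception.
void model_begin_catch(ModelException *e) {
    e->handlerCount++;
    if (caught_top != e) {            // push unless it is already on top
        e->nextException = caught_top;
        caught_top = e;
    }
}

// Mirrors __cxa_end_catch: drop the handler count, pop when it reaches 0.
void model_end_catch() {
    ModelException *e = caught_top;
    if (--e->handlerCount == 0) {
        caught_top = e->nextException;
        e->destroyed = true;          // a real runtime would release the exception here
    }
}

// Number of exceptions currently on the caught-exception stack.
int model_stack_depth() {
    int n = 0;
    for (ModelException *e = caught_top; e != nullptr; e = e->nextException) n++;
    return n;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">model_begin_catch</code> for two different exceptions in a row reproduces the nested <code class="language-plaintext highlighter-rouge">catch</code> situation, where the stack holds two entries.</p>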

<p>The Level 2 APIs mainly provide the low-level support behind C++ language features. Besides basic <code class="language-plaintext highlighter-rouge">throw</code> / <code class="language-plaintext highlighter-rouge">catch</code>, this includes <code class="language-plaintext highlighter-rouge">std::current_exception</code>, <code class="language-plaintext highlighter-rouge">std::rethrow_exception</code>, <code class="language-plaintext highlighter-rouge">std::get_terminate</code>, and so on. Below is a simplified implementation of <code class="language-plaintext highlighter-rouge">__cxa_throw</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">__cxa_throw</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">thrown</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">type_info</span> <span class="o">*</span><span class="n">tinfo</span><span class="p">,</span> <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">destructor</span><span class="p">)(</span><span class="kt">void</span> <span class="o">*</span><span class="p">))</span> <span class="p">{</span>
  <span class="n">__cxa_exception</span> <span class="o">*</span><span class="n">hdr</span> <span class="o">=</span> <span class="p">(</span><span class="n">__cxa_exception</span> <span class="o">*</span><span class="p">)</span><span class="n">thrown</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
  <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">exceptionType</span> <span class="o">=</span> <span class="n">tinfo</span><span class="p">;</span> <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">exceptionDestructor</span> <span class="o">=</span> <span class="n">destructor</span><span class="p">;</span>
  <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">unexpectedHandler</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">get_unexpected</span><span class="p">();</span>
  <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">terminateHandler</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">get_terminate</span><span class="p">();</span>
  <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">unwindHeader</span><span class="p">.</span><span class="n">exception_class</span> <span class="o">=</span> <span class="p">...;</span>
  <span class="n">__cxa_get_globals</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">uncaughtExceptions</span><span class="o">++</span><span class="p">;</span>
  <span class="n">_Unwind_RaiseException</span><span class="p">(</span><span class="o">&amp;</span><span class="n">hdr</span><span class="o">-&gt;</span><span class="n">unwindHeader</span><span class="p">);</span>
  <span class="c1">// Failed to unwind, e.g. the .eh_frame FDE is absent.</span>
  <span class="n">__cxa_begin_catch</span><span class="p">(</span><span class="o">&amp;</span><span class="n">hdr</span><span class="o">-&gt;</span><span class="n">unwindHeader</span><span class="p">);</span>
  <span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
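<p>The <code class="language-plaintext highlighter-rouge">uncaughtExceptions</code> counter incremented above is what <code class="language-plaintext highlighter-rouge">std::uncaught_exceptions()</code> (C++17) reports. The sketch below samples it from a destructor that runs during unwinding and again from inside a handler. <code class="language-plaintext highlighter-rouge">Probe</code>, <code class="language-plaintext highlighter-rouge">count_during_unwind</code> and <code class="language-plaintext highlighter-rouge">count_inside_handler</code> are names made up for this example.</p>

```cpp
#include <cassert>
#include <exception>

// Records std::uncaught_exceptions() at the moment its destructor runs.
struct Probe {
    int *out;
    explicit Probe(int *o) : out(o) {}
    ~Probe() { *out = std::uncaught_exceptions(); }
};

// Returns the count observed while the stack was being unwound.
int count_during_unwind() {
    int seen = -1;
    try {
        Probe p(&seen);
        throw 1;          // p is destroyed in the cleanup phase
    } catch (int) {
        // Entering the handler marks the exception as caught.
    }
    return seen;
}

// Returns the count observed inside a handler.
int count_inside_handler() {
    try {
        throw 1;
    } catch (int) {
        return std::uncaught_exceptions();
    }
    return -1;            // not reached
}
```

<p>The first function is expected to see one uncaught exception and the second none, which matches the increment in <code class="language-plaintext highlighter-rouge">__cxa_throw</code> and the decrement performed when a handler is entered.</p>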

<h2 id="example">Example</h2>

<p>Let us walk through the whole exception handling flow once more with a concrete example.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">A</span> <span class="p">{</span>
    <span class="o">~</span><span class="n">A</span><span class="p">()</span> <span class="p">{}</span>
<span class="p">};</span>

<span class="kt">void</span> <span class="nf">baz</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">throw</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">A</span> <span class="n">a</span><span class="p">;</span>
    <span class="n">baz</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
        <span class="n">bar</span><span class="p">();</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">x</span><span class="o">++</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The code the compiler generates corresponds roughly to the following pseudocode:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">baz</span><span class="p">()</span> <span class="p">{</span>
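    <span class="kt">int</span> <span class="o">*</span><span class="n">thrown</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)</span><span class="n">__cxa_allocate_exception</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">int</span><span class="p">));</span>  <span class="c1">// points at the thrown object, not the header</span>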
    <span class="o">*</span><span class="n">thrown</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="n">__cxa_throw</span><span class="p">(</span><span class="n">thrown</span><span class="p">,</span> <span class="o">&amp;</span><span class="k">typeid</span><span class="p">(</span><span class="kt">int</span><span class="p">),</span> <span class="nb">nullptr</span><span class="cm">/*destructor*/</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">A</span> <span class="n">a</span><span class="p">;</span>
    <span class="n">baz</span><span class="p">();</span>
    <span class="k">return</span><span class="p">;</span>
<span class="nl">landing_pad:</span>
    <span class="n">a</span><span class="p">.</span><span class="o">~</span><span class="n">A</span><span class="p">();</span>
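    <span class="n">_Unwind_Resume</span><span class="p">(</span><span class="n">obj</span><span class="p">);</span>  <span class="c1">// obj: the in-flight exception object</span>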
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">();</span>
    <span class="k">return</span><span class="p">;</span>
<span class="nl">landing_pad:</span>
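    <span class="kt">int</span> <span class="n">x</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)</span><span class="n">__cxa_begin_catch</span><span class="p">(</span><span class="n">obj</span><span class="p">);</span>
    <span class="n">x</span><span class="o">++</span><span class="p">;</span>
    <span class="n">__cxa_end_catch</span><span class="p">();</span>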
<span class="p">}</span>
</code></pre></div></div>
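<p>The pseudocode for <code class="language-plaintext highlighter-rouge">baz</code> can be turned into real code by calling the runtime entry points directly. This is a sketch that assumes a toolchain whose <code class="language-plaintext highlighter-rouge">&lt;cxxabi.h&gt;</code> declares <code class="language-plaintext highlighter-rouge">__cxa_allocate_exception</code> and <code class="language-plaintext highlighter-rouge">__cxa_throw</code> under the <code class="language-plaintext highlighter-rouge">abi</code> namespace, as the GCC and Clang runtimes on Linux do. The names <code class="language-plaintext highlighter-rouge">baz_manual</code> and <code class="language-plaintext highlighter-rouge">catch_manual</code> are made up for this example.</p>

```cpp
#include <cassert>
#include <cxxabi.h>
#include <typeinfo>

// Hand-written equivalent of `throw 1;`.
[[noreturn]] void baz_manual() {
    // Returns a pointer to the storage for the thrown object; the
    // __cxa_exception header lives just before it.
    void *thrown = abi::__cxa_allocate_exception(sizeof(int));
    *static_cast<int *>(thrown) = 1;
    // int has no destructor, so the third argument is null.
    abi::__cxa_throw(thrown, const_cast<std::type_info *>(&typeid(int)), nullptr);
}

// Catches the manually thrown exception with an ordinary handler.
int catch_manual() {
    try {
        baz_manual();
    } catch (int x) {
        return x;
    }
    return -1;   // not reached
}
```

<p>An ordinary <code class="language-plaintext highlighter-rouge">catch (int)</code> handles the exception, since from the runtime's point of view it is indistinguishable from one produced by a <code class="language-plaintext highlighter-rouge">throw</code> expression.</p>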

<p>The control flow can be summarized in the following steps:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">foo</code> calls <code class="language-plaintext highlighter-rouge">bar</code>, <code class="language-plaintext highlighter-rouge">bar</code> calls <code class="language-plaintext highlighter-rouge">baz</code>, and <code class="language-plaintext highlighter-rouge">baz</code> throws an exception</li>
  <li><code class="language-plaintext highlighter-rouge">baz</code> dynamically allocates a block of memory that holds, in order, a <code class="language-plaintext highlighter-rouge">__cxa_exception</code> object and the thrown <code class="language-plaintext highlighter-rouge">int</code>, then calls <code class="language-plaintext highlighter-rouge">__cxa_throw</code></li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_throw</code> fills in the fields of <code class="language-plaintext highlighter-rouge">__cxa_exception</code> and then calls <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> starts stack unwinding. The first phase searches for a stack frame that can catch an <code class="language-plaintext highlighter-rouge">int</code> exception:</p>

<ul>
  <li>For <code class="language-plaintext highlighter-rouge">bar</code>, <code class="language-plaintext highlighter-rouge">personality</code> is called with <code class="language-plaintext highlighter-rouge">_UA_SEARCH_PHASE</code> and returns <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code>, meaning this frame cannot catch the exception</li>
  <li>For <code class="language-plaintext highlighter-rouge">foo</code>, <code class="language-plaintext highlighter-rouge">personality</code> is called with <code class="language-plaintext highlighter-rouge">_UA_SEARCH_PHASE</code> and returns <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code>, meaning this frame can catch the exception</li>
  <li>The stack pointer of the <code class="language-plaintext highlighter-rouge">foo</code> frame is recorded in <code class="language-plaintext highlighter-rouge">private_2</code>, and the search phase ends</li>
</ul>

<p>At this point it is known that the <code class="language-plaintext highlighter-rouge">foo</code> frame can handle the exception, and the second phase begins the cleanup:</p>

<ul>
  <li>The <code class="language-plaintext highlighter-rouge">bar</code> frame was not marked by the search phase. <code class="language-plaintext highlighter-rouge">personality</code> is called with <code class="language-plaintext highlighter-rouge">_UA_CLEANUP_PHASE</code> and returns <code class="language-plaintext highlighter-rouge">_URC_INSTALL_CONTEXT</code>, meaning the frame has a <code class="language-plaintext highlighter-rouge">landing pad</code></li>
  <li>Control jumps to the <code class="language-plaintext highlighter-rouge">landing pad</code> of the <code class="language-plaintext highlighter-rouge">bar</code> frame; once the cleanup is done, <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> returns to the cleanup phase</li>
  <li>The <code class="language-plaintext highlighter-rouge">foo</code> frame was marked during the search phase. <code class="language-plaintext highlighter-rouge">personality</code> is called with <code class="language-plaintext highlighter-rouge">_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME</code> and returns <code class="language-plaintext highlighter-rouge">_URC_INSTALL_CONTEXT</code>, meaning the frame has a <code class="language-plaintext highlighter-rouge">landing pad</code></li>
  <li>Control jumps to the <code class="language-plaintext highlighter-rouge">landing pad</code> of the <code class="language-plaintext highlighter-rouge">foo</code> frame, which calls <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>, runs the <code class="language-plaintext highlighter-rouge">catch</code> code, and finally calls <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code></li>
</ul>
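<p>The two phases can also be written down as a toy simulation. This is only a model of the control flow described above, not how an unwinder is implemented: <code class="language-plaintext highlighter-rouge">ModelFrame</code> and <code class="language-plaintext highlighter-rouge">model_unwind</code> are hypothetical names, and the two boolean fields stand in for what a real <code class="language-plaintext highlighter-rouge">personality</code> routine would read from the LSDA.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical frame description; in a real binary this information lives
// in the LSDA and is interpreted by the personality routine.
struct ModelFrame {
    std::string name;
    bool has_cleanup;   // frame needs a cleanup landing pad
    bool catches;       // frame has a matching catch clause
};

// Walks frames from innermost to outermost and returns a log of the
// landing pads entered, or "terminate" when no frame has a handler.
std::string model_unwind(const std::vector<ModelFrame> &frames) {
    // Phase 1 (search): nothing is modified, only the handler frame is located.
    std::size_t handler = frames.size();
    for (std::size_t i = 0; i < frames.size(); ++i) {
        if (frames[i].catches) {   // corresponds to _URC_HANDLER_FOUND
            handler = i;
            break;
        }
        // otherwise corresponds to _URC_CONTINUE_UNWIND
    }
    if (handler == frames.size()) return "terminate";

    // Phase 2 (cleanup): enter landing pads up to and including the handler frame.
    std::string log;
    for (std::size_t i = 0; i <= handler; ++i) {
        if (i == handler) {
            log += "catch:" + frames[i].name + ";";     // the _UA_HANDLER_FRAME case
        } else if (frames[i].has_cleanup) {
            log += "cleanup:" + frames[i].name + ";";   // followed by _Unwind_Resume
        }
    }
    return log;
}
```

<p>Feeding it the three frames <code class="language-plaintext highlighter-rouge">baz</code>, <code class="language-plaintext highlighter-rouge">bar</code> (with a cleanup) and <code class="language-plaintext highlighter-rouge">foo</code> (with a handler) yields the cleanup in <code class="language-plaintext highlighter-rouge">bar</code> followed by the catch in <code class="language-plaintext highlighter-rouge">foo</code>.</p>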

<p>The complete <a href="https://godbolt.org/z/TxvWfbh8j">assembly</a> is shown below for comparison (pay particular attention to the <code class="language-plaintext highlighter-rouge">landing pad</code> of <code class="language-plaintext highlighter-rouge">bar</code> and of <code class="language-plaintext highlighter-rouge">foo</code>):</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">A:</span><span class="p">:</span><span class="o">~</span><span class="nf">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">base</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">]:</span>
        <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
        <span class="nf">nop</span>
        <span class="nf">popq</span>    <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">ret</span>
        <span class="nf">.set</span>    <span class="nv">A</span><span class="p">::</span><span class="o">~</span><span class="nv">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">complete</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">],</span><span class="nv">A</span><span class="p">::</span><span class="o">~</span><span class="nv">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">base</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">]</span>
<span class="nf">baz</span><span class="p">():</span>
        <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">4</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
        <span class="nf">call</span>    <span class="nv">__cxa_allocate_exception</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">edx</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">_ZTIi</span><span class="p">,</span> <span class="o">%</span><span class="nb">esi</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">__cxa_throw</span>
<span class="nf">bar</span><span class="p">():</span>
        <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbx</span>
        <span class="nf">subq</span>    <span class="kc">$</span><span class="mi">24</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>
        <span class="nf">call</span>    <span class="nv">baz</span><span class="p">()</span>
        <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">17</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">A</span><span class="p">::</span><span class="o">~</span><span class="nv">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">complete</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">]</span>
        <span class="nf">jmp</span>     <span class="nv">.L6</span>

        <span class="c1">; landing pad of bar</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbx</span>
        <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">17</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">A</span><span class="p">::</span><span class="o">~</span><span class="nv">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">complete</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">]</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rbx</span><span class="p">,</span> <span class="o">%</span><span class="nb">rax</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">_Unwind_Resume</span>
<span class="nl">.L6:</span>
        <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rbx</span>
        <span class="nf">leave</span>
        <span class="nf">ret</span>
<span class="nf">foo</span><span class="p">():</span>
        <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
        <span class="nf">subq</span>    <span class="kc">$</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>
        <span class="nf">call</span>    <span class="nv">bar</span><span class="p">()</span>
        <span class="nf">jmp</span>     <span class="nv">.L12</span>

        <span class="c1">; landing pad of foo</span>
        <span class="nf">cmpq</span>    <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdx</span>
        <span class="nf">je</span>      <span class="nv">.L9</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">_Unwind_Resume</span>
<span class="nl">.L9:</span>
        <span class="c1">; catch block in foo</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">__cxa_begin_catch</span>
        <span class="nf">movl</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">movl</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
        <span class="nf">addl</span>    <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
        <span class="nf">call</span>    <span class="nv">__cxa_end_catch</span>
<span class="nl">.L12:</span>
        <span class="nf">nop</span>
        <span class="nf">leave</span>
        <span class="nf">ret</span>
</code></pre></div></div>

<p>Let us look at the <code class="language-plaintext highlighter-rouge">landing pad</code> of <code class="language-plaintext highlighter-rouge">bar</code> and of <code class="language-plaintext highlighter-rouge">foo</code> in more detail.</p>

<p>For <code class="language-plaintext highlighter-rouge">bar</code>, before control jumps to its <code class="language-plaintext highlighter-rouge">landing pad</code>, <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> has already determined through <code class="language-plaintext highlighter-rouge">personality</code> that <code class="language-plaintext highlighter-rouge">bar</code> cannot handle the exception. Its <code class="language-plaintext highlighter-rouge">landing pad</code> therefore only destroys the objects on the stack and then calls <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> to continue unwinding.</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="c1">; landing pad of bar</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbx</span>
        <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">17</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">A</span><span class="p">::</span><span class="o">~</span><span class="nv">A</span><span class="p">()</span> <span class="p">[</span><span class="nv">complete</span> <span class="nv">object</span> <span class="nv">destructor</span><span class="p">]</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rbx</span><span class="p">,</span> <span class="o">%</span><span class="nb">rax</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">_Unwind_Resume</span>
</code></pre></div></div>

<p>For <code class="language-plaintext highlighter-rouge">foo</code>, before control jumps to its <code class="language-plaintext highlighter-rouge">landing pad</code>, <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> has already determined through <code class="language-plaintext highlighter-rouge">personality</code> that <code class="language-plaintext highlighter-rouge">foo</code> can handle the exception, and which <code class="language-plaintext highlighter-rouge">catch</code> block matches it. This information is passed to the <code class="language-plaintext highlighter-rouge">landing pad</code> in the following two registers:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">; %rax -&gt; exception object, later passed to __cxa_begin_catch</span>
<span class="c1">; %rdx -&gt; type-matching result (the switch value)</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">landing pad</code> compares <code class="language-plaintext highlighter-rouge">%rdx</code> and jumps to the matching <code class="language-plaintext highlighter-rouge">catch</code> block. <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> returns the address of the thrown object, which is the object that <code class="language-plaintext highlighter-rouge">x</code> in the <code class="language-plaintext highlighter-rouge">catch</code> block is initialized from. After <code class="language-plaintext highlighter-rouge">x++</code> runs, <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code> is called to finish handling this exception.</p>

<blockquote>
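  <p>Note that <code class="language-plaintext highlighter-rouge">%rax</code> already holds a pointer to the exception object (the <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code> header). <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> still returns an object address of its own because an address adjustment may be needed to reach the thrown object.</p>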

</blockquote>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="c1">; landing pad of foo</span>
        <span class="c1">; check whether foo can handle the current exception: compare %rdx and jump to the matching catch block</span>
        <span class="nf">cmpq</span>    <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdx</span>
        <span class="nf">je</span>      <span class="nv">.L9</span>                <span class="c1">; go to catch(int)</span>

        <span class="c1">; cannot catch the current exception: keep unwinding via _Unwind_Resume</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">_Unwind_Resume</span>
<span class="nl">.L9:</span>
        <span class="c1">; catch block in foo</span>
        <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
        <span class="nf">call</span>    <span class="nv">__cxa_begin_catch</span>
        <span class="nf">movl</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">movl</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
        <span class="nf">addl</span>    <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">4</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>       <span class="c1">; x++</span>
        <span class="nf">call</span>    <span class="nv">__cxa_end_catch</span>
</code></pre></div></div>
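<p>To check the walkthrough empirically, the same example can be instrumented to record the order of events. In the sketch below the global <code class="language-plaintext highlighter-rouge">trace</code> string and the <code class="language-plaintext highlighter-rouge">run_example</code> helper are additions made for this illustration; the rest mirrors the original source.</p>

```cpp
#include <cassert>
#include <string>

std::string trace;   // records the order of events

struct A {
    ~A() { trace += "~A;"; }
};

void baz() { throw 1; }

void bar() {
    A a;
    baz();
    trace += "bar-returned;";   // never reached: baz always throws
}

void foo() {
    try {
        bar();
    } catch (int x) {
        x++;                    // x is a copy of the thrown int
        trace += "catch(" + std::to_string(x) + ");";
    }
}

// Runs the example once and returns the recorded trace.
std::string run_example() {
    trace.clear();
    foo();
    return trace;
}
```

<p>The destructor of <code class="language-plaintext highlighter-rouge">a</code> is recorded before the handler body, which matches the order of the two landing pads: the cleanup in <code class="language-plaintext highlighter-rouge">bar</code> runs first, then the <code class="language-plaintext highlighter-rouge">catch</code> block in <code class="language-plaintext highlighter-rouge">foo</code>.</p>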

<h2 id="misc">Misc</h2>

<p>Finally, a few miscellaneous notes.</p>

<h3 id="native-exception-vs-foreign-exception">Native exception vs Foreign exception</h3>

<p>As mentioned earlier, <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code> has an <code class="language-plaintext highlighter-rouge">exception_class</code> field.</p>

<p>The Level 1 API does not interpret this field; it passes it unchanged to <code class="language-plaintext highlighter-rouge">personality</code>, which uses the value to distinguish a <code class="language-plaintext highlighter-rouge">native exception</code> from a <code class="language-plaintext highlighter-rouge">foreign exception</code>:</p>

<ul>
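  <li><code class="language-plaintext highlighter-rouge">native exception</code>: an exception thrown by the same C++ ABI runtime. It carries complete C++ type information (RTTI), so the C++ runtime can unwind it correctly and catch it by type.</li>
  <li><code class="language-plaintext highlighter-rouge">foreign exception</code>: an exception produced by non-C++ code, or by a different runtime, that does not follow this C++ ABI's exception handling conventions. It can only be caught by <code class="language-plaintext highlighter-rouge">catch (...)</code>.</li>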
</ul>

<blockquote>
  <p>A typical example of why "the same C++ ABI runtime" matters: an exception thrown by <code class="language-plaintext highlighter-rouge">libstdc++</code> is treated as a <code class="language-plaintext highlighter-rouge">foreign exception</code> by <code class="language-plaintext highlighter-rouge">libc++abi</code>.</p>

</blockquote>
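<p>The distinction can be sketched in a few lines. The two class tags below are the ones <code class="language-plaintext highlighter-rouge">libstdc++</code> (<code class="language-plaintext highlighter-rouge">GNUCC++\0</code>) and <code class="language-plaintext highlighter-rouge">libc++abi</code> (<code class="language-plaintext highlighter-rouge">CLNGC++\0</code>) use; the helper names <code class="language-plaintext highlighter-rouge">make_exception_class</code> and <code class="language-plaintext highlighter-rouge">is_native_for</code> are made up for illustration, and the comparison that ignores the last byte is an assumption modelled on how <code class="language-plaintext highlighter-rouge">libc++abi</code> compares vendor and language only.</p>

```cpp
#include <cassert>
#include <cstdint>

// Pack an 8-byte tag into a 64-bit exception class, first character in the
// most significant byte (vendor in the high 4 bytes, language in the low 4).
constexpr std::uint64_t make_exception_class(const char (&tag)[9]) {
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = (value << 8) | static_cast<unsigned char>(tag[i]);
    }
    return value;
}

constexpr std::uint64_t kLibstdcxxClass = make_exception_class("GNUCC++\0");
constexpr std::uint64_t kLibcxxabiClass = make_exception_class("CLNGC++\0");
static_assert(kLibcxxabiClass == 0x434C4E47432B2B00ULL, "unexpected packing");

// Assumption: the last byte is ignored so that primary and dependent
// exceptions of the same runtime both count as native.
constexpr std::uint64_t kVendorAndLanguageMask = 0xFFFFFFFFFFFFFF00ULL;

// Hypothetical helper: would a runtime whose class is `runtime_class`
// treat an exception tagged `exception_class` as native?
constexpr bool is_native_for(std::uint64_t runtime_class,
                             std::uint64_t exception_class) {
    return (runtime_class & kVendorAndLanguageMask) ==
           (exception_class & kVendorAndLanguageMask);
}
```

<p>With these definitions, <code class="language-plaintext highlighter-rouge">is_native_for(kLibcxxabiClass, kLibstdcxxClass)</code> is false, which is the situation described in the note above.</p>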

<p><code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> and <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code> handle a <code class="language-plaintext highlighter-rouge">native exception</code> and a <code class="language-plaintext highlighter-rouge">foreign exception</code> differently.</p>

<p>The compiler emits a call to <code class="language-plaintext highlighter-rouge">void* __cxa_begin_catch(void *obj)</code> at the start of every <code class="language-plaintext highlighter-rouge">catch</code> block. What it does depends on the kind of exception:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">native exception</code>
    <ul>
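      <li>Increments <code class="language-plaintext highlighter-rouge">handlerCount</code></li>
      <li>Pushes the exception onto the current thread's caught-exception stack and decrements the <code class="language-plaintext highlighter-rouge">uncaught_exception</code> count</li>
      <li>Returns the adjusted pointer to the exception object</li>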
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">foreign exception</code> (which does not necessarily have a <code class="language-plaintext highlighter-rouge">__cxa_exception</code> header)
    <ul>
      <li>Pushes the exception if the current thread's caught-exception stack is empty; otherwise calls <code class="language-plaintext highlighter-rouge">std::terminate</code> (at any given time, the C++ ABI runtime can handle only one <code class="language-plaintext highlighter-rouge">foreign exception</code>)</li>
      <li>Returns <code class="language-plaintext highlighter-rouge">static_cast&lt;_Unwind_Exception *&gt;(obj) + 1</code> (assuming the thrown object is placed immediately after the <code class="language-plaintext highlighter-rouge">_Unwind_Exception</code>)</li>
    </ul>
  </li>
</ul>

<p><code class="language-plaintext highlighter-rouge">void __cxa_end_catch()</code> is called when a <code class="language-plaintext highlighter-rouge">catch</code> block ends or when the exception is rethrown. What it does depends on the kind of exception:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">native exception</code>
    <ul>
      <li>Looks up the exception at the top of the current thread's caught-exception stack and decrements its <code class="language-plaintext highlighter-rouge">handlerCount</code></li>
      <li>When <code class="language-plaintext highlighter-rouge">handlerCount</code> drops to 0 (no handler references it any more), pops it off the caught-exception stack</li>
      <li>When <code class="language-plaintext highlighter-rouge">handlerCount</code> drops to 0, calls <code class="language-plaintext highlighter-rouge">__cxa_free_exception</code> (for a dependent exception, it decrements <code class="language-plaintext highlighter-rouge">referenceCount</code> instead and calls <code class="language-plaintext highlighter-rouge">__cxa_free_exception</code> once that reaches 0)</li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">foreign exception</code>
    <ul>
      <li>Calls <code class="language-plaintext highlighter-rouge">_Unwind_DeleteException</code></li>
      <li>Executes <code class="language-plaintext highlighter-rouge">__cxa_eh_globals::caughtExceptions = nullptr;</code> (the counterpart of <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>: the stack holds only this one exception)</li>
    </ul>
  </li>
</ul>
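<p>The bookkeeping for a <code class="language-plaintext highlighter-rouge">native exception</code> can be modelled with a toy sketch. Every name here (<code class="language-plaintext highlighter-rouge">ToyException</code>, <code class="language-plaintext highlighter-rouge">ToyGlobals</code>, <code class="language-plaintext highlighter-rouge">toy_*</code>) is hypothetical and only mirrors the fields named in the text; it is not the runtime's real code. The "push only if not already on top" check is an assumption about nested handlers, and rethrow is not modelled.</p>

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the per-exception header.
struct ToyException {
    int handlerCount = 0;
    bool freed = false;  // stands in for __cxa_free_exception having run
};

// Toy stand-in for the per-thread exception globals.
struct ToyGlobals {
    std::vector<ToyException*> caughtExceptions;  // top of the stack is back()
    unsigned uncaughtExceptions = 0;
};

// A throw makes one more exception "in flight".
void toy_throw(ToyGlobals& globals) { ++globals.uncaughtExceptions; }

void toy_begin_catch(ToyGlobals& globals, ToyException& exception) {
    ++exception.handlerCount;
    // Assumption: an exception already on top of the stack is not pushed twice.
    if (globals.caughtExceptions.empty() ||
        globals.caughtExceptions.back() != &exception) {
        globals.caughtExceptions.push_back(&exception);
    }
    assert(globals.uncaughtExceptions > 0);
    --globals.uncaughtExceptions;
}

void toy_end_catch(ToyGlobals& globals) {
    assert(!globals.caughtExceptions.empty());
    ToyException* exception = globals.caughtExceptions.back();
    if (--exception->handlerCount == 0) {
        globals.caughtExceptions.pop_back();  // no handler uses it any more
        exception->freed = true;
    }
}
```

<p>A single throw followed by one <code class="language-plaintext highlighter-rouge">toy_begin_catch</code> / <code class="language-plaintext highlighter-rouge">toy_end_catch</code> pair leaves the stack empty and the exception marked as freed.</p>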

<blockquote>
  <p>Note that apart from <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code> and <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>, most <code class="language-plaintext highlighter-rouge">__cxa_*</code> functions cannot handle a <code class="language-plaintext highlighter-rouge">foreign exception</code> (because it has no <code class="language-plaintext highlighter-rouge">__cxa_exception</code> header).</p>

</blockquote>

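<p>That is about it for this part, whose main goal was to walk through the overall flow of exception handling and stack unwinding. The next part starts from <code class="language-plaintext highlighter-rouge">personality</code> and describes in detail how stack unwinding works.</p>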

<h2 id="reference">Reference</h2>

<ul>
  <li><a href="https://maskray.me/blog/2020-12-12-c++-exception-handling-abi">C++ exception handling ABI | MaskRay</a></li>
  <li><a href="https://www.youtube.com/watch?v=_Ivd3qzgT7U">CppCon 2017: Dave Watson “C++ Exceptions and Stack Unwinding”</a></li>
</ul>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[Part 1 on C++ exception handling. This part builds a basic picture of how stack unwinding proceeds; the next part fills in some of the deeper details.]]></summary></entry><entry><title type="html">C++ Exception Handling ABI, part 2</title><link href="/%E5%AD%A6%E4%B9%A0/C++-Exception-ABI-part-2/" rel="alternate" type="text/html" title="C++ Exception Handling ABI, part 2" /><published>2026-04-09T00:00:00+08:00</published><updated>2026-04-09T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/C++%20Exception%20ABI-part-2</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/C++-Exception-ABI-part-2/"><![CDATA[<p>Without further ado, this part aims to fill in what the previous part did not cover in enough detail.</p>

<h2 id="personality">personality</h2>

<p>We start with <code class="language-plaintext highlighter-rouge">personality</code>, which the previous part did not cover in detail. During stack unwinding, <code class="language-plaintext highlighter-rouge">libgcc</code> or <code class="language-plaintext highlighter-rouge">libunwind</code>, acting as the <code class="language-plaintext highlighter-rouge">unwinder</code>, calls the <code class="language-plaintext highlighter-rouge">personality</code> routine for each frame. As the bridge between the Level 1 Base ABI and the Level 2 C++ ABI, it has to tell the <code class="language-plaintext highlighter-rouge">unwinder</code> the following:</p>

<ol>
  <li>In the search phase, whether the current frame has a matching <code class="language-plaintext highlighter-rouge">catch</code> block that can handle the exception</li>
  <li>In the cleanup phase, whether the current frame needs any cleanup and, if so, the address of the corresponding <code class="language-plaintext highlighter-rouge">landing pad</code>, so that the <code class="language-plaintext highlighter-rouge">unwinder</code> can then perform the actual jump</li>
</ol>

<p>The per-frame cleanup logic is the <code class="language-plaintext highlighter-rouge">landing pad</code> introduced in the previous part. Depending on the function, it does one of the following three things:</p>

<ul>
  <li>The exception cannot be caught here: call the destructors of out-of-scope variables, or the callbacks registered through <code class="language-plaintext highlighter-rouge">__attribute__((cleanup(...)))</code>, then call <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> to resume the cleanup phase</li>
  <li>The exception can be caught here: destroy out-of-scope variables first, then call <code class="language-plaintext highlighter-rouge">__cxa_begin_catch</code>, run the code in the <code class="language-plaintext highlighter-rouge">catch</code> block, and finally call <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code></li>
  <li>The <code class="language-plaintext highlighter-rouge">catch</code> block contains a <code class="language-plaintext highlighter-rouge">rethrow</code>: destroy the local variables defined in the <code class="language-plaintext highlighter-rouge">catch</code> clause, call <code class="language-plaintext highlighter-rouge">__cxa_end_catch</code>, then continue the cleanup phase through <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code></li>
</ul>

<p>Different languages, implementations, or architectures may use different <code class="language-plaintext highlighter-rouge">personality</code> routines. For C++ on ELF platforms, the most common one is <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code>.</p>

<p>Before going further into <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code>, we need a bit more background.</p>

<h3 id="gcc_except_table">.gcc_except_table</h3>

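<p>In ELF, the information a language needs to handle exceptions, such as whether a given instruction pointer (IP) lies inside a <code class="language-plaintext highlighter-rouge">try-catch</code> range or whether out-of-scope variables need to be destroyed, is stored in the <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> section. The section is one contiguous run of bytes in the ELF file that holds the exception-handling data of every function; each function's data is the LSDA (Language-specific Data Area) mentioned in the previous part. The overall layout is:</p>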

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">.</span><span class="n">gcc_except_table</span>
<span class="err">├──</span> <span class="n">LSDA</span><span class="p">(</span><span class="n">func1</span><span class="p">)</span>
<span class="err">├──</span> <span class="n">LSDA</span><span class="p">(</span><span class="n">func2</span><span class="p">)</span>
<span class="err">├──</span> <span class="n">LSDA</span><span class="p">(</span><span class="n">func3</span><span class="p">)</span>
<span class="err">└──</span> <span class="p">...</span>
</code></pre></div></div>

<p>Each LSDA in turn consists of the following parts:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">header</code>: the base address for <code class="language-plaintext highlighter-rouge">landing pad</code> offsets, the encoding of the <code class="language-plaintext highlighter-rouge">type table</code>, the encoding of the <code class="language-plaintext highlighter-rouge">call site table</code>, and the start of the <code class="language-plaintext highlighter-rouge">action table</code> (derived from the length of the <code class="language-plaintext highlighter-rouge">call site table</code>)</li>
  <li><code class="language-plaintext highlighter-rouge">call site table</code>: each entry stores four fields, <code class="language-plaintext highlighter-rouge">[start, length, landing_pad_offset, action_record_offset]</code>. When an exception is raised at an address within <code class="language-plaintext highlighter-rouge">[start, start + length)</code>, the entry gives the offset of the corresponding <code class="language-plaintext highlighter-rouge">landing pad</code> entry point and the offset of the first <code class="language-plaintext highlighter-rouge">action</code> in the <code class="language-plaintext highlighter-rouge">action table</code> (0 if there is no <code class="language-plaintext highlighter-rouge">action</code>).</li>
  <li><code class="language-plaintext highlighter-rouge">action table</code>: each entry has two fields, <code class="language-plaintext highlighter-rouge">[switch_value, next_action_offset]</code>, and describes one <code class="language-plaintext highlighter-rouge">action</code> to take for an exception raised in the given range, such as <code class="language-plaintext highlighter-rouge">cleanup/catch/noexcept</code>. <code class="language-plaintext highlighter-rouge">switch_value</code> stores the index in the <code class="language-plaintext highlighter-rouge">type table</code> of the type a <code class="language-plaintext highlighter-rouge">catch</code> handles (0 denotes a <code class="language-plaintext highlighter-rouge">cleanup action</code>); <code class="language-plaintext highlighter-rouge">next_action_offset</code> is the offset of the next <code class="language-plaintext highlighter-rouge">action</code> in the <code class="language-plaintext highlighter-rouge">action table</code> (<code class="language-plaintext highlighter-rouge">0</code> means there is none). Note that all the <code class="language-plaintext highlighter-rouge">action</code> entries for one code range form a singly linked list.</li>
  <li><code class="language-plaintext highlighter-rouge">type table</code>: stores the <code class="language-plaintext highlighter-rouge">RTTI</code> pointer of each type, used to check whether the type of the exception object matches. A null pointer matches every type, that is, <code class="language-plaintext highlighter-rouge">catch (...)</code>.</li>
</ul>

<p>The tables relate to each other as follows.</p>

<p>A function is split into several code ranges by its <code class="language-plaintext highlighter-rouge">try</code> statements. The <code class="language-plaintext highlighter-rouge">call site table</code> records, for each address range, the <code class="language-plaintext highlighter-rouge">landing pad</code> and the <code class="language-plaintext highlighter-rouge">action</code> entries to use when an exception is raised there. The possible combinations of <code class="language-plaintext highlighter-rouge">landing_pad_offset</code> and <code class="language-plaintext highlighter-rouge">action_record_offset</code> in an entry are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">landing_pad_offset</code> is 0: <code class="language-plaintext highlighter-rouge">action_record_offset</code> is 0 as well, and there is no <code class="language-plaintext highlighter-rouge">landing pad</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">landing_pad_offset</code> is non-zero: there is a <code class="language-plaintext highlighter-rouge">landing pad</code>, which contains every possible exception-handling path for this code range, that is, all the <code class="language-plaintext highlighter-rouge">catch</code> clauses (whether or not they can catch the exception) plus any extra <code class="language-plaintext highlighter-rouge">cleanup</code> logic. In that case:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">action_record_offset</code> is 0: the current frame only needs cleanup (destroying local variables, for example)</li>
      <li><code class="language-plaintext highlighter-rouge">action_record_offset</code> is non-zero: there are <code class="language-plaintext highlighter-rouge">action</code> entries, and the entry at <code class="language-plaintext highlighter-rouge">action_record_offset</code> in the <code class="language-plaintext highlighter-rouge">action table</code> is the first <code class="language-plaintext highlighter-rouge">action</code>.</li>
    </ul>
  </li>
</ul>
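<p>The combinations above boil down to a three-way classification. The sketch below uses a hypothetical, already-decoded <code class="language-plaintext highlighter-rouge">CallSiteEntry</code> struct (the real table is an encoded byte stream), and the names <code class="language-plaintext highlighter-rouge">CallSiteKind</code> and <code class="language-plaintext highlighter-rouge">classify</code> are made up for illustration.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of one call site table entry.
struct CallSiteEntry {
    std::uint64_t start;
    std::uint64_t length;
    std::uint64_t landing_pad_offset;    // 0 = no landing pad
    std::uint64_t action_record_offset;  // 0 = no action record
};

enum class CallSiteKind {
    NoLandingPad,  // nothing to run in this frame for this range
    CleanupOnly,   // landing pad exists, but it only performs cleanup
    HasActions     // landing pad exists and an action chain must be walked
};

CallSiteKind classify(const CallSiteEntry& entry) {
    if (entry.landing_pad_offset == 0) {
        return CallSiteKind::NoLandingPad;
    }
    return entry.action_record_offset == 0 ? CallSiteKind::CleanupOnly
                                           : CallSiteKind::HasActions;
}
```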

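<p>The <code class="language-plaintext highlighter-rouge">switch_value</code> of an <code class="language-plaintext highlighter-rouge">action table</code> entry has three cases. A value greater than 0 refers to an entry in the <code class="language-plaintext highlighter-rouge">type table</code>. A value of 0 means the current frame needs to clean up local variables (the same effect as an <code class="language-plaintext highlighter-rouge">action_record_offset</code> of 0 in the <code class="language-plaintext highlighter-rouge">call site table</code> above). A value less than 0 denotes an <code class="language-plaintext highlighter-rouge">exception specification</code>, which is rare in modern C++ (dynamic exception specifications were removed in C++17).</p>

<p>For example, if a code range has two <code class="language-plaintext highlighter-rouge">catch</code> clauses, neither of which can catch the current exception, and additional cleanup is required, the relevant LSDA data looks roughly like this:</p>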

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">call</span> <span class="n">site</span> <span class="n">table</span><span class="o">:</span>
    <span class="n">start</span>
    <span class="n">length</span>
    <span class="n">landing_pad_offset</span><span class="o">:</span> <span class="c1">points to the entry address; that code contains both catch clauses and the cleanup</span>
    <span class="n">action_record_offset</span><span class="o">:</span> <span class="c1">say x, pointing to the first action</span>

<span class="n">action</span> <span class="n">table</span><span class="o">:</span>
    <span class="c1">; entry x: the first catch (type m in the type table)</span>
    <span class="p">[</span><span class="n">switch_value</span> <span class="o">=</span> <span class="n">m</span><span class="p">,</span> <span class="n">next_action_offset</span> <span class="o">=</span> <span class="n">y</span><span class="p">]</span>
    <span class="p">...</span>
    <span class="c1">; entry y: the second catch (type n in the type table)</span>
    <span class="p">[</span><span class="n">switch_value</span> <span class="o">=</span> <span class="n">n</span><span class="p">,</span> <span class="n">next_action_offset</span> <span class="o">=</span> <span class="n">z</span><span class="p">]</span>
    <span class="p">...</span>
    <span class="c1">; entry z: switch_value 0 means cleanup; next_action_offset 0 means no further action</span>
    <span class="p">[</span><span class="n">switch_value</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">next_action_offset</span> <span class="o">=</span> <span class="mi">0</span><span class="p">]</span>

<span class="n">type</span> <span class="n">table</span><span class="o">:</span>
    <span class="c1">; entry m</span>
    <span class="c1">RTTI pointer of the first catch type</span>
    <span class="p">...</span>
    <span class="c1">; entry n</span>
    <span class="c1">RTTI pointer of the second catch type</span>
</code></pre></div></div>

<p>At its core an LSDA is just a byte stream with no explicit struct layout. During unwinding, <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> parses the LSDA (the unwinder is responsible for locating the right LSDA and passing it along) to decide whether the current frame can handle the exception:</p>

<ol>
  <li>Use the frame's instruction pointer (IP) at the point where the exception was thrown (or propagated through) to look up the <code class="language-plaintext highlighter-rouge">call site table</code>, determining which entry covers the current call site and what its first <code class="language-plaintext highlighter-rouge">action</code> is</li>
  <li>Walk the <code class="language-plaintext highlighter-rouge">action</code> list, reading the type index of each <code class="language-plaintext highlighter-rouge">catch</code>. Compare the type of the current exception with the corresponding type in the <code class="language-plaintext highlighter-rouge">type table</code>; a match means the current frame can handle the exception. Otherwise follow <code class="language-plaintext highlighter-rouge">next_action_offset</code> to the next <code class="language-plaintext highlighter-rouge">action</code>.</li>
  <li>If every <code class="language-plaintext highlighter-rouge">action</code> of the current frame has been visited (<code class="language-plaintext highlighter-rouge">next_action_offset</code> is <code class="language-plaintext highlighter-rouge">0</code>) and none can handle the exception, the return value tells the <code class="language-plaintext highlighter-rouge">unwinder</code> that this frame cannot handle it. The <code class="language-plaintext highlighter-rouge">unwinder</code>, inside <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code>, then moves on to the previous frame and repeats the process.</li>
</ol>
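<p>The three steps above can be sketched over a decoded, in-memory model of one LSDA. Everything here is a simplification for illustration: the struct and function names are hypothetical, 1-based indices stand in for byte offsets, types are compared by name (with an empty string standing in for a null RTTI pointer) rather than through RTTI, and negative <code class="language-plaintext highlighter-rouge">switch_value</code> entries are skipped.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct CallSite {
    std::uintptr_t start;        // range is [start, start + length)
    std::uintptr_t length;
    std::uintptr_t landing_pad;  // 0 = no landing pad
    std::size_t first_action;    // 1-based index into actions, 0 = none
};
struct Action {
    int switch_value;            // >0: 1-based type table index, 0: cleanup
    std::size_t next_action;     // 1-based index of the next action, 0 = end
};
struct Lsda {
    std::vector<CallSite> call_sites;
    std::vector<Action> actions;
    std::vector<std::string> types;  // "" models a null RTTI pointer
};
struct ScanResult {
    bool handler_found = false;
    bool needs_cleanup = false;
    std::uintptr_t landing_pad = 0;
    int switch_value = 0;
};

ScanResult scan(const Lsda& lsda, std::uintptr_t ip,
                const std::string& thrown_type) {
    for (const CallSite& cs : lsda.call_sites) {
        // Step 1: find the call site entry that covers ip.
        if (ip < cs.start || ip >= cs.start + cs.length) continue;
        if (cs.landing_pad == 0) return {};
        ScanResult result;
        result.landing_pad = cs.landing_pad;
        if (cs.first_action == 0) {  // landing pad without actions: cleanup
            result.needs_cleanup = true;
            return result;
        }
        // Step 2: walk the action chain.
        for (std::size_t i = cs.first_action; i != 0;
             i = lsda.actions[i - 1].next_action) {
            const Action& action = lsda.actions[i - 1];
            if (action.switch_value == 0) { result.needs_cleanup = true; continue; }
            if (action.switch_value < 0) continue;  // not modelled
            const std::string& type =
                lsda.types[static_cast<std::size_t>(action.switch_value) - 1];
            if (type.empty() || type == thrown_type) {
                result.handler_found = true;
                result.switch_value = action.switch_value;
                return result;
            }
        }
        // Step 3: nothing matched; keep the landing pad only for cleanup.
        if (!result.needs_cleanup) result.landing_pad = 0;
        return result;
    }
    return {};
}
```

<p>In this model, the search phase would only look at <code class="language-plaintext highlighter-rouge">handler_found</code>, while the cleanup phase would also jump to the landing pad when <code class="language-plaintext highlighter-rouge">needs_cleanup</code> is set.</p>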

<p>In other words, different kinds of code map to <code class="language-plaintext highlighter-rouge">landing_pad_offset</code> and <code class="language-plaintext highlighter-rouge">action_record_offset</code> as follows:</p>

<ul>
  <li>A non-<code class="language-plaintext highlighter-rouge">try</code> block with no local variables to destroy: <code class="language-plaintext highlighter-rouge">landing_pad_offset==0 &amp;&amp; action_record_offset==0</code></li>
  <li>A non-<code class="language-plaintext highlighter-rouge">try</code> block with local variables to destroy: <code class="language-plaintext highlighter-rouge">landing_pad_offset!=0 &amp;&amp; action_record_offset==0</code>; the cleanup phase must clean up the current frame before unwinding can continue</li>
  <li>A non-<code class="language-plaintext highlighter-rouge">try</code> block with <code class="language-plaintext highlighter-rouge">__attribute__((cleanup(...)))</code>: <code class="language-plaintext highlighter-rouge">landing_pad_offset!=0 &amp;&amp; action_record_offset==0</code>; same as above</li>
  <li>A <code class="language-plaintext highlighter-rouge">try</code> block: <code class="language-plaintext highlighter-rouge">landing_pad_offset!=0 &amp;&amp; action_record_offset!=0</code>. <code class="language-plaintext highlighter-rouge">landing_pad_offset</code> points to code formed by concatenating the <code class="language-plaintext highlighter-rouge">catch</code> blocks. The corresponding <code class="language-plaintext highlighter-rouge">action table</code> entry has <code class="language-plaintext highlighter-rouge">switch_value &gt; 0</code> and refers to a non-null RTTI pointer in the <code class="language-plaintext highlighter-rouge">type table</code></li>
  <li>A <code class="language-plaintext highlighter-rouge">try</code> block with <code class="language-plaintext highlighter-rouge">catch (...)</code>: same as above. The corresponding <code class="language-plaintext highlighter-rouge">action table</code> entry has <code class="language-plaintext highlighter-rouge">switch_value &gt; 0</code>, and the RTTI pointer in the matching <code class="language-plaintext highlighter-rouge">type table</code> entry is null (meaning <code class="language-plaintext highlighter-rouge">catch (...)</code>)</li>
  <li>A code range in a function with the <code class="language-plaintext highlighter-rouge">noexcept</code> specifier from which an exception could propagate to the caller: <code class="language-plaintext highlighter-rouge">landing_pad_offset!=0 &amp;&amp; action_record_offset!=0</code>. The <code class="language-plaintext highlighter-rouge">landing pad</code> points to a block that calls <code class="language-plaintext highlighter-rouge">std::terminate</code>; the corresponding <code class="language-plaintext highlighter-rouge">action table</code> entry has <code class="language-plaintext highlighter-rouge">switch_value &gt; 0</code>, and the RTTI pointer in the matching <code class="language-plaintext highlighter-rouge">type table</code> entry is null (meaning <code class="language-plaintext highlighter-rouge">catch (...)</code>)</li>
</ul>

<h3 id="__gxx_personality_v0">__gxx_personality_v0</h3>

<p>We can now summarise what <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> does:</p>

<ul>
  <li>It reads the LSDA of each frame during unwinding to check whether that frame has a matching <code class="language-plaintext highlighter-rouge">catch</code></li>
  <li>Search phase:
    <ul>
      <li>Returns <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code>: the current frame cannot handle the exception</li>
      <li>Returns <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code>: the current frame can handle the exception</li>
    </ul>
  </li>
  <li>Cleanup phase:
    <ul>
      <li>Returns <code class="language-plaintext highlighter-rouge">_URC_CONTINUE_UNWIND</code>: there is no <code class="language-plaintext highlighter-rouge">landing pad</code>, so nothing extra needs to be done</li>
      <li>Returns <code class="language-plaintext highlighter-rouge">_URC_INSTALL_CONTEXT</code>: there is a <code class="language-plaintext highlighter-rouge">landing pad</code>, and the <code class="language-plaintext highlighter-rouge">unwinder</code> jumps to that address to continue execution</li>
    </ul>
  </li>
</ul>

<blockquote>
  <p>The flow above leaves out the various error paths; some of them are covered at the end of this article.</p>

</blockquote>

<p>Before transferring control to the <code class="language-plaintext highlighter-rouge">landing pad</code>, the <code class="language-plaintext highlighter-rouge">personality</code> routine calls <code class="language-plaintext highlighter-rouge">_Unwind_SetGR</code> to set two registers, which hold the <code class="language-plaintext highlighter-rouge">_Unwind_Exception *</code> and the <code class="language-plaintext highlighter-rouge">switchValue</code> respectively.</p>

<blockquote>
  <p>These two registers are architecture-specific. They are identified by <code class="language-plaintext highlighter-rouge">__builtin_eh_return_data_regno(0)</code> and <code class="language-plaintext highlighter-rouge">__builtin_eh_return_data_regno(1)</code>; on x86_64 they are <code class="language-plaintext highlighter-rouge">%rax</code> and <code class="language-plaintext highlighter-rouge">%rdx</code>, as in the example from the previous part.</p>

</blockquote>

<p>For a <code class="language-plaintext highlighter-rouge">native exception</code>, when <code class="language-plaintext highlighter-rouge">personality</code> returns <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code> in the search phase, the information parsed from the frame's LSDA is cached. When <code class="language-plaintext highlighter-rouge">personality</code> is called again in the cleanup phase with <code class="language-plaintext highlighter-rouge">actions == (_UA_CLEANUP_PHASE | _UA_HANDLER_FRAME)</code>, it loads the cached values and does not need to parse <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> again.</p>

<p>In the other three cases, <code class="language-plaintext highlighter-rouge">personality</code> has to parse <code class="language-plaintext highlighter-rouge">.gcc_except_table</code>:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">actions &amp; _UA_SEARCH_PHASE</code></li>
  <li><code class="language-plaintext highlighter-rouge">actions &amp; _UA_CLEANUP_PHASE &amp;&amp; actions &amp; _UA_HANDLER_FRAME &amp;&amp; !is_native</code></li>
  <li><code class="language-plaintext highlighter-rouge">actions &amp; _UA_CLEANUP_PHASE &amp;&amp; !(actions &amp; _UA_HANDLER_FRAME)</code></li>
</ul>

<p>A simplified <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_Unwind_Reason_Code</span> <span class="nf">__gxx_personality_v0</span><span class="p">(</span><span class="kt">int</span> <span class="n">version</span><span class="p">,</span> <span class="n">_Unwind_Action</span> <span class="n">actions</span><span class="p">,</span> <span class="kt">uint64_t</span> <span class="n">exceptionClass</span><span class="p">,</span> <span class="n">_Unwind_Exception</span> <span class="o">*</span><span class="n">exc</span><span class="p">,</span> <span class="n">_Unwind_Context</span> <span class="o">*</span><span class="n">ctx</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">scan_results</span> <span class="n">results</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">actions</span> <span class="o">==</span> <span class="p">(</span><span class="n">_UA_CLEANUP_PHASE</span> <span class="o">|</span> <span class="n">_UA_HANDLER_FRAME</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="n">is_native</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="o">*</span><span class="n">hdr</span> <span class="o">=</span> <span class="p">(</span><span class="n">__cxa_exception</span> <span class="o">*</span><span class="p">)(</span><span class="n">exc</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
    <span class="c1">// Load cached results from phase 1.</span>
    <span class="n">results</span><span class="p">.</span><span class="n">switchValue</span> <span class="o">=</span> <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">handlerSwitchValue</span><span class="p">;</span>
    <span class="n">results</span><span class="p">.</span><span class="n">actionRecord</span> <span class="o">=</span> <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">actionRecord</span><span class="p">;</span>
    <span class="n">results</span><span class="p">.</span><span class="n">languageSpecificData</span> <span class="o">=</span> <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">languageSpecificData</span><span class="p">;</span>
    <span class="n">results</span><span class="p">.</span><span class="n">landingPad</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="kt">uintptr_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">hdr</span><span class="o">-&gt;</span><span class="n">catchTemp</span><span class="p">);</span>
    <span class="n">results</span><span class="p">.</span><span class="n">adjustedPtr</span> <span class="o">=</span> <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">adjustedPtr</span><span class="p">;</span>

    <span class="n">_Unwind_SetGR</span><span class="p">(...);</span>
    <span class="n">_Unwind_SetGR</span><span class="p">(...);</span>
    <span class="n">_Unwind_SetIP</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">results</span><span class="p">.</span><span class="n">landingPad</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">_URC_INSTALL_CONTEXT</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="n">scan_eh_tab</span><span class="p">(</span><span class="n">results</span><span class="p">,</span> <span class="n">actions</span><span class="p">,</span> <span class="n">is_native</span><span class="p">,</span> <span class="n">exc</span><span class="p">,</span> <span class="n">ctx</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">results</span><span class="p">.</span><span class="n">reason</span> <span class="o">==</span> <span class="n">_URC_CONTINUE_UNWIND</span> <span class="o">||</span>
      <span class="n">results</span><span class="p">.</span><span class="n">reason</span> <span class="o">==</span> <span class="n">_URC_FATAL_PHASE1_ERROR</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">results</span><span class="p">.</span><span class="n">reason</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">actions</span> <span class="o">&amp;</span> <span class="n">_UA_SEARCH_PHASE</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="o">*</span><span class="n">hdr</span> <span class="o">=</span> <span class="p">(</span><span class="n">__cxa_exception</span> <span class="o">*</span><span class="p">)(</span><span class="n">exc</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
    <span class="c1">// Cache LSDA results in hdr.</span>
    <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">handlerSwitchValue</span> <span class="o">=</span> <span class="n">results</span><span class="p">.</span><span class="n">switchValue</span><span class="p">;</span>
    <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">actionRecord</span> <span class="o">=</span> <span class="n">results</span><span class="p">.</span><span class="n">actionRecord</span><span class="p">;</span>
    <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">languageSpecificData</span> <span class="o">=</span> <span class="n">results</span><span class="p">.</span><span class="n">languageSpecificData</span><span class="p">;</span>
    <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">catchTemp</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="kt">void</span> <span class="o">*&gt;</span><span class="p">(</span><span class="n">results</span><span class="p">.</span><span class="n">landingPad</span><span class="p">);</span>
    <span class="n">hdr</span><span class="o">-&gt;</span><span class="n">adjustedPtr</span> <span class="o">=</span> <span class="n">results</span><span class="p">.</span><span class="n">adjustedPtr</span><span class="p">;</span>
    <span class="k">return</span> <span class="n">_URC_HANDLER_FOUND</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="c1">// _UA_CLEANUP_PHASE</span>
  <span class="n">_Unwind_SetGR</span><span class="p">(...);</span>
  <span class="n">_Unwind_SetGR</span><span class="p">(...);</span>
  <span class="n">_Unwind_SetIP</span><span class="p">(</span><span class="n">ctx</span><span class="p">,</span> <span class="n">results</span><span class="p">.</span><span class="n">landingPad</span><span class="p">);</span>
  <span class="k">return</span> <span class="n">_URC_INSTALL_CONTEXT</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For the full implementation of <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code>, see:</p>

<ul>
  <li><a href="https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_personality.cc">gcc/libstdc++-v3/libsupc++/eh_personality.cc at master · gcc-mirror/gcc</a></li>
  <li><a href="https://github.com/llvm/llvm-project/blob/main/libcxxabi/src/cxa_personality.cpp">llvm-project/libcxxabi/src/cxa_personality.cpp at main · llvm/llvm-project</a></li>
</ul>

<h2 id="eh_frame">.eh_frame</h2>

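<p>With <code class="language-plaintext highlighter-rouge">personality</code> covered, we can fill in another detail the previous part left vague: when the <code class="language-plaintext highlighter-rouge">unwinder</code> learns from <code class="language-plaintext highlighter-rouge">personality</code> that the current frame cannot handle the exception, how does it get from the current frame to the previous one, and how are the relevant registers restored along the way? The information needed for this is stored in the ELF sections <code class="language-plaintext highlighter-rouge">.eh_frame</code> and <code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code>.</p>

<p><code class="language-plaintext highlighter-rouge">.eh_frame</code> stores the rules for "recovering the previous frame from the current one" (known as CFI instructions, for Call Frame Information); it does not store the previous frame's register values directly. Put differently, given the current register state, these rules let the unwinder compute the register values of the previous frame. <code class="language-plaintext highlighter-rouge">.eh_frame</code> consists of a sequence of records of two kinds:</p>

<ul>
  <li>CIE (Common Information Entry): rules shared by a group of functions</li>
  <li>FDE (Frame Description Entry): information about one specific function (or code range)</li>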
</ul>

<p>The structure looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">.</span><span class="n">eh_frame</span><span class="o">:</span>
  <span class="p">[</span><span class="n">CIE</span><span class="p">]</span>
  <span class="p">[</span><span class="n">FDE</span> <span class="o">-&gt;</span> <span class="n">points</span> <span class="n">to</span> <span class="n">a</span> <span class="n">CIE</span><span class="p">]</span>
  <span class="p">[</span><span class="n">FDE</span> <span class="o">-&gt;</span> <span class="n">points</span> <span class="n">to</span> <span class="n">a</span> <span class="n">CIE</span><span class="p">]</span>
  <span class="p">...</span>
</code></pre></div></div>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code> is a sorted lookup table that lets the unwinder binary-search for the FDE covering a given IP; it is not covered here.</p>

</blockquote>

<p>A CIE contains the following fields:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">length</code></li>
  <li><code class="language-plaintext highlighter-rouge">CIE_id</code>: always 0 for a CIE in <code class="language-plaintext highlighter-rouge">.eh_frame</code>, which distinguishes a CIE from an FDE</li>
  <li><code class="language-plaintext highlighter-rouge">version</code></li>
  <li><code class="language-plaintext highlighter-rouge">augmentation string</code></li>
  <li><code class="language-plaintext highlighter-rouge">code_alignment_factor</code>: the factor by which the operands of location-advancing CFI instructions are multiplied</li>
  <li><code class="language-plaintext highlighter-rouge">data_alignment_factor</code>: the factor by which the offset operands of CFI instructions are multiplied</li>
  <li><code class="language-plaintext highlighter-rouge">return_address_register</code>: the register number that stands for the return address (the caller's <code class="language-plaintext highlighter-rouge">%rip</code>)</li>
  <li><code class="language-plaintext highlighter-rouge">augmentation data</code></li>
  <li><code class="language-plaintext highlighter-rouge">initial instructions</code>: CFI instructions that define the initial frame layout rules at function entry</li>
</ul>

<p>Every FDE is associated with a CIE. An FDE contains the following fields:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">length</code></li>
  <li><code class="language-plaintext highlighter-rouge">CIE_pointer</code>: subtracting <code class="language-plaintext highlighter-rouge">CIE_pointer</code> from the position of this field yields the associated CIE</li>
  <li><code class="language-plaintext highlighter-rouge">initial_location</code>: the first code address covered by the FDE</li>
  <li><code class="language-plaintext highlighter-rouge">address_range</code>: the number of bytes covered, so the FDE describes <code class="language-plaintext highlighter-rouge">[initial_location, initial_location + address_range)</code></li>
  <li><code class="language-plaintext highlighter-rouge">augmentation data</code></li>
  <li><code class="language-plaintext highlighter-rouge">CFI instructions</code>: the CFI instructions for this code range</li>
</ul>
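<p>As a small worked example of the <code class="language-plaintext highlighter-rouge">CIE_pointer</code> rule, the sketch below computes the section offset of the CIE that an FDE refers to. The function name <code class="language-plaintext highlighter-rouge">cie_offset_for</code> is made up for illustration, and the sketch assumes the common 4-byte <code class="language-plaintext highlighter-rouge">length</code> field (no 64-bit extended length).</p>

```cpp
#include <cstdint>

// Hypothetical helper, not part of any unwinder API.
// fde_offset:  offset of the FDE record (its length field) inside .eh_frame
// cie_pointer: value stored in the FDE's CIE_pointer field
// Assumes a 4-byte length field, so CIE_pointer sits at fde_offset + 4.
constexpr std::uint64_t cie_offset_for(std::uint64_t fde_offset,
                                       std::uint32_t cie_pointer) {
    const std::uint64_t field_offset = fde_offset + 4;  // where CIE_pointer lives
    return field_offset - cie_pointer;                  // walk backwards to the CIE
}
```

<p>With the numbers from the <code class="language-plaintext highlighter-rouge">readelf -wF</code> dump later in this post, an FDE at offset <code class="language-plaintext highlighter-rouge">0xa8</code> with <code class="language-plaintext highlighter-rouge">CIE_pointer = 0x24</code> resolves to the CIE at <code class="language-plaintext highlighter-rouge">0x88</code>, which matches the <code class="language-plaintext highlighter-rouge">cie=00000088</code> column.</p>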

<p>An FDE may carry an LSDA pointer that points to the corresponding LSDA in <code class="language-plaintext highlighter-rouge">.gcc_except_table</code>. When an exception is thrown, the <code class="language-plaintext highlighter-rouge">unwinder</code> uses <code class="language-plaintext highlighter-rouge">.eh_frame</code> to find the FDE for the current IP during unwinding, takes the LSDA pointer from that FDE and hands it to <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code>, leaving it to the <code class="language-plaintext highlighter-rouge">personality</code> routine to parse the LSDA.</p>

<blockquote>
  <p>Whether the pointer is present depends on the <code class="language-plaintext highlighter-rouge">augmentation string</code> in the CIE; the details are skipped here.</p>

</blockquote>
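<p>For readers who want a little more than "skipped", here is a minimal sketch of how the augmentation string can be read. The letters <code class="language-plaintext highlighter-rouge">z</code>, <code class="language-plaintext highlighter-rouge">P</code>, <code class="language-plaintext highlighter-rouge">L</code> and <code class="language-plaintext highlighter-rouge">R</code> are the ones defined for <code class="language-plaintext highlighter-rouge">.eh_frame</code>; the type <code class="language-plaintext highlighter-rouge">AugmentationInfo</code> and the function <code class="language-plaintext highlighter-rouge">parse_augmentation</code> are hypothetical names, and other letters are simply ignored.</p>

```cpp
#include <string_view>

// Hypothetical summary type, for illustration only.
struct AugmentationInfo {
    bool has_data = false;         // 'z': length-prefixed augmentation data follows
    bool has_personality = false;  // 'P': CIE data holds a personality routine pointer
    bool has_lsda = false;         // 'L': FDEs of this CIE carry an LSDA pointer
    bool has_fde_encoding = false; // 'R': CIE data gives the FDE pointer encoding
};

inline AugmentationInfo parse_augmentation(std::string_view s) {
    AugmentationInfo info;
    // The other letters are only meaningful when the string starts with 'z'.
    if (s.empty() || s.front() != 'z') return info;
    info.has_data = true;
    for (char c : s.substr(1)) {
        switch (c) {
            case 'P': info.has_personality = true; break;
            case 'L': info.has_lsda = true; break;
            case 'R': info.has_fde_encoding = true; break;
            default: break;  // letters not covered by this sketch
        }
    }
    return info;
}
```

<p>Under this reading, a CIE with the string <code class="language-plaintext highlighter-rouge">zPLR</code> belongs to functions whose FDEs have an LSDA pointer, while one with <code class="language-plaintext highlighter-rouge">zR</code> does not.</p>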

<h3 id="cfi">CFI</h3>

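<p>Many CIE and FDE fields have little bearing on our question. For unwinding, the part we care about most is the <code class="language-plaintext highlighter-rouge">instructions</code> field of the FDE, i.e. the CFI instructions (Call Frame Information instructions). They describe "at each point in a function, how to recover the register values of the caller's frame (above all the return address) from the current frame", which is exactly what the <code class="language-plaintext highlighter-rouge">unwinder</code> needs in order to walk back through frames. In assembly they appear as <code class="language-plaintext highlighter-rouge">.cfi_*</code> directives, which the assembler encodes into the CIEs and FDEs of <code class="language-plaintext highlighter-rouge">.eh_frame</code> for the <code class="language-plaintext highlighter-rouge">unwinder</code> to use.</p>

<p>First, a core concept: the CFA (Canonical Frame Address), defined as <strong>the caller's <code class="language-plaintext highlighter-rouge">%rsp</code> just before it calls the current function</strong>. The core job of <code class="language-plaintext highlighter-rouge">.eh_frame</code> is then: no matter which instruction of the function is executing, compute the CFA from the current registers and the CFI instructions in <code class="language-plaintext highlighter-rouge">.eh_frame</code>, and from it the register values of the previous frame (here we mainly care about <code class="language-plaintext highlighter-rouge">%rip</code>, <code class="language-plaintext highlighter-rouge">%rbp</code> and <code class="language-plaintext highlighter-rouge">%rsp</code>).</p>

<p>Let's walk through this flow with a minimal example.</p>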

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">throw</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">g++ -S -O0 test.cpp</code> produces the corresponding assembly. The code for <code class="language-plaintext highlighter-rouge">bar</code> is:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">_Z3barv:</span>
<span class="nl">.LFB0:</span>
    <span class="nf">.cfi_startproc</span>
    <span class="nf">endbr64</span>

    <span class="err">#</span> <span class="nf">prologue</span>
    <span class="nf">pushq</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">.cfi_def_cfa_offset</span> <span class="mi">16</span>
    <span class="nf">.cfi_offset</span> <span class="mi">6</span><span class="p">,</span> <span class="o">-</span><span class="mi">16</span>
    <span class="nf">movq</span> <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">.cfi_def_cfa_register</span> <span class="mi">6</span>

    <span class="err">#</span> <span class="nf">function</span> <span class="nv">body</span>
    <span class="nf">movl</span> <span class="kc">$</span><span class="mi">4</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
    <span class="nf">call</span> <span class="nv">__cxa_allocate_exception@PLT</span>
    <span class="nf">movl</span> <span class="kc">$</span><span class="mi">1</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>
    <span class="nf">movl</span> <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">edx</span>
    <span class="nf">leaq</span> <span class="nv">_ZTIi</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span> <span class="o">%</span><span class="nb">rcx</span>
    <span class="nf">movq</span> <span class="o">%</span><span class="nb">rcx</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">movq</span> <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span> <span class="nv">__cxa_throw@PLT</span>

    <span class="nf">.cfi_endproc</span>
</code></pre></div></div>

<p>One piece of background first: in the DWARF specification (which the <code class="language-plaintext highlighter-rouge">.cfi_*</code> directives follow), registers are identified by number. On x86-64:</p>

<ul>
  <li>register 6 is <code class="language-plaintext highlighter-rouge">%rbp</code></li>
  <li>register 7 is <code class="language-plaintext highlighter-rouge">%rsp</code></li>
</ul>

<p><code class="language-plaintext highlighter-rouge">.cfi_startproc</code> marks the start of the function, and the assembler creates a new FDE in <code class="language-plaintext highlighter-rouge">.eh_frame</code>. Note that the <code class="language-plaintext highlighter-rouge">call</code> instruction in the caller <code class="language-plaintext highlighter-rouge">foo</code> has already pushed the return address by the time <code class="language-plaintext highlighter-rouge">bar</code> starts, so at this point <code class="language-plaintext highlighter-rouge">CFA = %rsp + 8</code>. (Once more: the CFA is the caller's <code class="language-plaintext highlighter-rouge">%rsp</code>.)</p>

<p>Next comes the prologue. <code class="language-plaintext highlighter-rouge">pushq %rbp</code> pushes the caller's <code class="language-plaintext highlighter-rouge">%rbp</code> onto the stack, <code class="language-plaintext highlighter-rouge">%rsp</code> decreases by 8, and so <code class="language-plaintext highlighter-rouge">CFA = %rsp + 16</code>. After <code class="language-plaintext highlighter-rouge">pushq %rbp</code> executes, the <code class="language-plaintext highlighter-rouge">unwinder</code> must be told that the way to compute the CFA has changed. The corresponding CFI directive is <code class="language-plaintext highlighter-rouge">.cfi_def_cfa_offset 16</code>, which updates the offset to 16.</p>

<p>The <code class="language-plaintext highlighter-rouge">unwinder</code> must also be told that the original <code class="language-plaintext highlighter-rouge">%rbp</code> (register 6 in DWARF numbering) has been pushed, i.e. that <code class="language-plaintext highlighter-rouge">%rbp</code> is saved at <code class="language-plaintext highlighter-rouge">CFA - 16</code> (<code class="language-plaintext highlighter-rouge">CFA - 8</code> holds the return address). The corresponding CFI directive is <code class="language-plaintext highlighter-rouge">.cfi_offset 6, -16</code>. During unwinding, this is what allows the <code class="language-plaintext highlighter-rouge">%rbp</code> of the previous frame (<code class="language-plaintext highlighter-rouge">foo</code> in our example) to be restored.</p>

<p>After <code class="language-plaintext highlighter-rouge">movq %rsp, %rbp</code> sets the current frame's <code class="language-plaintext highlighter-rouge">%rbp</code>, the <code class="language-plaintext highlighter-rouge">unwinder</code> must be told that the base register for computing the CFA has changed from <code class="language-plaintext highlighter-rouge">%rsp</code> (register 7) to register 6 (the current frame's <code class="language-plaintext highlighter-rouge">%rbp</code>). The corresponding CFI directive is <code class="language-plaintext highlighter-rouge">.cfi_def_cfa_register 6</code>; the offset keeps the previously set value of 16. From here on, however <code class="language-plaintext highlighter-rouge">%rsp</code> changes (for example when temporaries are pushed), the CFA is always given by <code class="language-plaintext highlighter-rouge">CFA = %rbp + 16</code>.</p>

<p>The code that actually throws the exception is skipped here. Finally, <code class="language-plaintext highlighter-rouge">.cfi_endproc</code> marks the end of the function, which completes the FDE.</p>
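<p>The bookkeeping described above can be modelled as a tiny state machine. The following is only a sketch of the idea, not how <code class="language-plaintext highlighter-rouge">libgcc</code> or <code class="language-plaintext highlighter-rouge">libunwind</code> implement it: the types <code class="language-plaintext highlighter-rouge">CfaRule</code> and <code class="language-plaintext highlighter-rouge">Directive</code> and the function <code class="language-plaintext highlighter-rouge">replay</code> are hypothetical, and only the three CFA-defining directives used in this post are modelled.</p>

```cpp
#include <cstdint>
#include <vector>

// DWARF register numbers on x86-64, as listed above.
constexpr int kRbp = 6;
constexpr int kRsp = 7;

// "CFA = value of register `reg` + `offset`".
struct CfaRule {
    int reg;
    std::int64_t offset;
};

// The three CFA-defining directives that appear in bar and foo.
enum class Op { DefCfa, DefCfaOffset, DefCfaRegister };

struct Directive {
    Op op;
    int reg;              // used by DefCfa and DefCfaRegister
    std::int64_t offset;  // used by DefCfa and DefCfaOffset
};

// Applies the directives in order and returns the resulting CFA rule.
inline CfaRule replay(CfaRule rule, const std::vector<Directive>& directives) {
    for (const Directive& d : directives) {
        switch (d.op) {
            case Op::DefCfa:         rule = {d.reg, d.offset}; break;  // .cfi_def_cfa
            case Op::DefCfaOffset:   rule.offset = d.offset;   break;  // .cfi_def_cfa_offset
            case Op::DefCfaRegister: rule.reg = d.reg;         break;  // .cfi_def_cfa_register
        }
    }
    return rule;
}
```

<p>Starting from the entry rule <code class="language-plaintext highlighter-rouge">{kRsp, 8}</code>, replaying <code class="language-plaintext highlighter-rouge">.cfi_def_cfa_offset 16</code> followed by <code class="language-plaintext highlighter-rouge">.cfi_def_cfa_register 6</code> ends in the rule "register 6 plus 16", i.e. <code class="language-plaintext highlighter-rouge">CFA = %rbp + 16</code>.</p>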

<p><code class="language-plaintext highlighter-rouge">foo</code> is similar, so we only add the epilogue: after <code class="language-plaintext highlighter-rouge">popq %rbp</code>, <code class="language-plaintext highlighter-rouge">%rsp</code> increases by 8 and <code class="language-plaintext highlighter-rouge">CFA = %rsp + 8</code> (since <code class="language-plaintext highlighter-rouge">%rbp</code> has been popped, <code class="language-plaintext highlighter-rouge">CFA = %rbp + 16</code> no longer holds). The directive <code class="language-plaintext highlighter-rouge">.cfi_def_cfa 7, 8</code> tells the <code class="language-plaintext highlighter-rouge">unwinder</code> that <code class="language-plaintext highlighter-rouge">CFA = %rsp + 8</code>, where register 7 is <code class="language-plaintext highlighter-rouge">%rsp</code>.</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="nl">_Z3foov:</span>
<span class="nl">.LFB1:</span>
    <span class="nf">.cfi_startproc</span>
    <span class="nf">endbr64</span>

    <span class="err">#</span> <span class="nf">prologue</span>
    <span class="nf">pushq</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">.cfi_def_cfa_offset</span> <span class="mi">16</span>
    <span class="nf">.cfi_offset</span> <span class="mi">6</span><span class="p">,</span> <span class="o">-</span><span class="mi">16</span>
    <span class="nf">movq</span> <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">.cfi_def_cfa_register</span> <span class="mi">6</span>

    <span class="nf">call</span> <span class="nv">_Z3barv</span>
    <span class="nf">nop</span>

    <span class="err">#</span> <span class="nf">epilogue</span>
    <span class="nf">popq</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">.cfi_def_cfa</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">8</span>
    <span class="nf">ret</span>
    <span class="nf">.cfi_endproc</span>
</code></pre></div></div>

<p>When <code class="language-plaintext highlighter-rouge">bar</code> executes <code class="language-plaintext highlighter-rouge">throw</code>, it ends up calling <code class="language-plaintext highlighter-rouge">__cxa_throw</code>. The <code class="language-plaintext highlighter-rouge">unwinder</code> takes the current instruction pointer <code class="language-plaintext highlighter-rouge">%rip</code> and uses it to find the matching FDE. It then replays all <code class="language-plaintext highlighter-rouge">.cfi_*</code> instructions from the start of the function (<code class="language-plaintext highlighter-rouge">.cfi_startproc</code>) up to the IP at which the exception was thrown. The replay tells the <code class="language-plaintext highlighter-rouge">unwinder</code> what the current CFA is. Knowing the CFA, how does it obtain the register state of the caller <code class="language-plaintext highlighter-rouge">foo</code> at the point where it called this function?</p>

<p>The answer is that the rule recorded by <code class="language-plaintext highlighter-rouge">.cfi_offset 6, -16</code>, together with the fact that the return address sits directly below the CFA, gives the <code class="language-plaintext highlighter-rouge">unwinder</code> both <code class="language-plaintext highlighter-rouge">%rbp</code> and <code class="language-plaintext highlighter-rouge">%rip</code>:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">%rbp</code>: the caller's <code class="language-plaintext highlighter-rouge">%rbp</code> (register 6) is saved in memory at <code class="language-plaintext highlighter-rouge">CFA - 16</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">%rip</code>: the caller's <code class="language-plaintext highlighter-rouge">%rip</code>, i.e. the return address, was pushed by <code class="language-plaintext highlighter-rouge">call</code> before the callee started executing, so it is stored at <code class="language-plaintext highlighter-rouge">CFA - 8</code>.</li>
</ul>

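<p>The <code class="language-plaintext highlighter-rouge">unwinder</code> restores the relevant registers to their state in the caller at the call site, i.e. the state of <code class="language-plaintext highlighter-rouge">foo</code> at its call to <code class="language-plaintext highlighter-rouge">bar</code>. Note that <code class="language-plaintext highlighter-rouge">%rbp</code> and <code class="language-plaintext highlighter-rouge">%rip</code> are loaded from memory at those addresses, while <code class="language-plaintext highlighter-rouge">%rsp</code> is the CFA itself:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">%rbp = *(CFA - 16)</code></li>
  <li><code class="language-plaintext highlighter-rouge">%rip = *(CFA - 8)</code></li>
  <li><code class="language-plaintext highlighter-rouge">%rsp = CFA</code></li>
</ul>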
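<p>Read as memory loads, this restore step can be sketched as follows. Everything here is illustrative: <code class="language-plaintext highlighter-rouge">Registers</code>, <code class="language-plaintext highlighter-rouge">Memory</code> and <code class="language-plaintext highlighter-rouge">step_to_caller</code> are hypothetical names, process memory is faked with a <code class="language-plaintext highlighter-rouge">std::map</code>, and the rules "<code class="language-plaintext highlighter-rouge">%rbp</code> at <code class="language-plaintext highlighter-rouge">CFA - 16</code>, return address at <code class="language-plaintext highlighter-rouge">CFA - 8</code>" are hard-coded for the frame of <code class="language-plaintext highlighter-rouge">bar</code>.</p>

```cpp
#include <cstdint>
#include <map>

struct Registers {
    std::uint64_t rip;
    std::uint64_t rsp;
    std::uint64_t rbp;
};

// Hypothetical stand-in for process memory: address -> 8-byte value.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// One unwind step for a frame whose rules are "rbp at CFA-16, ra at CFA-8".
inline Registers step_to_caller(std::uint64_t cfa, const Memory& mem) {
    Registers caller{};
    caller.rbp = mem.at(cfa - 16);  // value saved by `pushq %rbp`
    caller.rip = mem.at(cfa - 8);   // return address pushed by `call`
    caller.rsp = cfa;               // the CFA is, by definition, the caller's %rsp
    return caller;
}
```

<p>For example, with a made-up CFA of <code class="language-plaintext highlighter-rouge">0x7000</code>, a saved <code class="language-plaintext highlighter-rouge">%rbp</code> stored at <code class="language-plaintext highlighter-rouge">0x6ff0</code> and a return address stored at <code class="language-plaintext highlighter-rouge">0x6ff8</code>, the step yields exactly those two stored values plus <code class="language-plaintext highlighter-rouge">%rsp = 0x7000</code>.</p>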

<p>At this point the <code class="language-plaintext highlighter-rouge">unwinder</code> has walked back from <code class="language-plaintext highlighter-rouge">bar</code> to <code class="language-plaintext highlighter-rouge">foo</code>. Unwinding can then continue: the updated <code class="language-plaintext highlighter-rouge">%rip</code> is used to recover the register state of <code class="language-plaintext highlighter-rouge">foo</code>'s caller.</p>

<p>That covers how the <code class="language-plaintext highlighter-rouge">unwinder</code> walks back through frames. One more question is worth thinking about: why must <code class="language-plaintext highlighter-rouge">callee-saved</code> registers be restored during unwinding, and why is there no need to restore <code class="language-plaintext highlighter-rouge">caller-saved</code> registers?</p>

<p>When <code class="language-plaintext highlighter-rouge">foo</code> calls <code class="language-plaintext highlighter-rouge">bar</code>, <code class="language-plaintext highlighter-rouge">foo</code> itself is responsible for saving any <code class="language-plaintext highlighter-rouge">caller-saved</code> registers it still needs, such as <code class="language-plaintext highlighter-rouge">%rax</code>, <code class="language-plaintext highlighter-rouge">%rcx</code> and <code class="language-plaintext highlighter-rouge">%r8</code>-<code class="language-plaintext highlighter-rouge">%r11</code>. Once execution crosses the <code class="language-plaintext highlighter-rouge">call</code> boundary, <code class="language-plaintext highlighter-rouge">foo</code> has to treat the contents of these registers as garbage. Unwinding from <code class="language-plaintext highlighter-rouge">bar</code> back into <code class="language-plaintext highlighter-rouge">foo</code> is, from <code class="language-plaintext highlighter-rouge">foo</code>'s point of view, just crossing that boundary again: these registers do not matter to the caller <code class="language-plaintext highlighter-rouge">foo</code>, and the <code class="language-plaintext highlighter-rouge">unwinder</code> does not need to restore them.</p>

<p><code class="language-plaintext highlighter-rouge">callee-saved</code> registers are different. The caller <code class="language-plaintext highlighter-rouge">foo</code> expects them to be unchanged whether the callee <code class="language-plaintext highlighter-rouge">bar</code> returns normally or an exception propagates. But when an exception is thrown, the normal flow of <code class="language-plaintext highlighter-rouge">bar</code> is interrupted, and <code class="language-plaintext highlighter-rouge">bar</code> never gets the chance to run its <code class="language-plaintext highlighter-rouge">epilogue</code> and restore them. That is why the <code class="language-plaintext highlighter-rouge">unwinder</code> uses the CFI instructions in <code class="language-plaintext highlighter-rouge">.eh_frame</code> to fulfil this obligation on behalf of <code class="language-plaintext highlighter-rouge">bar</code>, restoring each <code class="language-plaintext highlighter-rouge">callee-saved</code> register the caller relies on.</p>

<h3 id="如何查看eh_frame">How to inspect .eh_frame</h3>

<p>The <code class="language-plaintext highlighter-rouge">.eh_frame</code> section can be inspected and cross-checked as follows:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>readelf <span class="nt">-wF</span> a.out

000000a8 000000000000001c 00000024 FDE <span class="nv">cie</span><span class="o">=</span>00000088 <span class="nv">pc</span><span class="o">=</span>0000000000001169..0000000000001198
   LOC           CFA      rbp   ra
0000000000001169 rsp+8    u     c-8
000000000000116e rsp+16   c-16  c-8
0000000000001171 rbp+16   c-16  c-8

000000c8 000000000000001c 000000cc FDE <span class="nv">cie</span><span class="o">=</span>00000000 <span class="nv">pc</span><span class="o">=</span>0000000000001198..00000000000011a8
   LOC           CFA      rbp   ra
0000000000001198 rsp+8    u     c-8
000000000000119d rsp+16   c-16  c-8
00000000000011a0 rbp+16   c-16  c-8
00000000000011a7 rsp+8    c-16  c-8
</code></pre></div></div>

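<p>The two blocks above are the decoded FDEs of <code class="language-plaintext highlighter-rouge">bar</code> and <code class="language-plaintext highlighter-rouge">foo</code>. Comparing them against the addresses in the final binary helps to make sense of the rows (the instructions are the same as in the assembly listing; only the addresses differ).</p>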
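<p>One way to read such a table: each row stays in effect from its <code class="language-plaintext highlighter-rouge">LOC</code> until the next row begins. The sketch below looks up the CFA rule for a given address. <code class="language-plaintext highlighter-rouge">Row</code> and <code class="language-plaintext highlighter-rouge">cfa_rule_at</code> are hypothetical names, and the rule is kept as the plain string printed by <code class="language-plaintext highlighter-rouge">readelf</code> rather than being evaluated.</p>

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One decoded row of `readelf -wF` output: start address and CFA rule text.
struct Row {
    std::uint64_t loc;
    std::string cfa;
};

// Returns the CFA rule in effect at `ip`: the last row whose LOC is <= ip.
// Rows must be sorted by LOC. Returns an empty string if ip precedes them all.
inline std::string cfa_rule_at(const std::vector<Row>& rows, std::uint64_t ip) {
    std::string rule;
    for (const Row& row : rows) {
        if (row.loc > ip) break;  // rows are sorted, so no later row can match
        rule = row.cfa;
    }
    return rule;
}
```

<p>For <code class="language-plaintext highlighter-rouge">bar</code>, any address from <code class="language-plaintext highlighter-rouge">0x1171</code> up to the end of the function maps to <code class="language-plaintext highlighter-rouge">rbp+16</code>. One detail worth noticing in the disassembly: the return address of the <code class="language-plaintext highlighter-rouge">call</code> at <code class="language-plaintext highlighter-rouge">0x1193</code> is <code class="language-plaintext highlighter-rouge">0x1198</code>, which is already the first byte of <code class="language-plaintext highlighter-rouge">foo</code>, so unwinders generally look up the return address minus one (<code class="language-plaintext highlighter-rouge">0x1197</code> here) to stay inside the calling function.</p>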

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>objdump <span class="nt">-d</span> a.out

0000000000001169 &lt;_Z3barv&gt;:
    1169:       f3 0f 1e fa             endbr64
    116d:       55                      push   %rbp
    116e:       48 89 e5                mov    %rsp,%rbp
    1171:       bf 04 00 00 00          mov    <span class="nv">$0x4</span>,%edi
    1176:       e8 e5 fe ff ff          call   1060 &lt;__cxa_allocate_exception@plt&gt;
    117b:       c7 00 01 00 00 00       movl   <span class="nv">$0x1</span>,<span class="o">(</span>%rax<span class="o">)</span>
    1181:       ba 00 00 00 00          mov    <span class="nv">$0x0</span>,%edx
    1186:       48 8d 0d 13 2c 00 00    lea    0x2c13<span class="o">(</span>%rip<span class="o">)</span>,%rcx        <span class="c"># 3da0 &lt;_ZTIi@CXXABI_1.3&gt;</span>
    118d:       48 89 ce                mov    %rcx,%rsi
    1190:       48 89 c7                mov    %rax,%rdi
    1193:       e8 d8 fe ff ff          call   1070 &lt;__cxa_throw@plt&gt;

0000000000001198 &lt;_Z3foov&gt;:
    1198:       f3 0f 1e fa             endbr64
    119c:       55                      push   %rbp
    119d:       48 89 e5                mov    %rsp,%rbp
    11a0:       e8 c4 ff ff ff          call   1169 &lt;_Z3barv&gt;
    11a5:       90                      nop
    11a6:       5d                      pop    %rbp
    11a7:       c3                      ret
</code></pre></div></div>

<h2 id="misc">Misc</h2>

<p>Finally, a few odds and ends.</p>

<h3 id="exception-propagation">exception propagation</h3>

<p>Some compiler flags that affect exception propagation:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">-fno-exceptions -fno-asynchronous-unwind-tables</code>: neither <code class="language-plaintext highlighter-rouge">.eh_frame</code> nor <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> exists</li>
  <li><code class="language-plaintext highlighter-rouge">-fno-exceptions -fasynchronous-unwind-tables</code>: <code class="language-plaintext highlighter-rouge">.eh_frame</code> exists, <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> does not</li>
  <li><code class="language-plaintext highlighter-rouge">-fexceptions</code>: both <code class="language-plaintext highlighter-rouge">.eh_frame</code> and <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> exist (the default)</li>
</ul>

<p>When an exception propagates from the current function to its caller (whether at Level 1 with <code class="language-plaintext highlighter-rouge">libgcc</code>/<code class="language-plaintext highlighter-rouge">libunwind</code>, or at Level 2 with <code class="language-plaintext highlighter-rouge">libstdc++</code>/<code class="language-plaintext highlighter-rouge">libc++abi</code>):</p>

<ul>
  <li>No <code class="language-plaintext highlighter-rouge">.eh_frame</code>: <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> returns <code class="language-plaintext highlighter-rouge">_URC_END_OF_STACK</code>, and <code class="language-plaintext highlighter-rouge">__cxa_throw</code> calls <code class="language-plaintext highlighter-rouge">std::terminate</code></li>
  <li><code class="language-plaintext highlighter-rouge">.eh_frame</code> present but no LSDA for the current frame: the exception passes through, and destructors of local variables are not run</li>
  <li><code class="language-plaintext highlighter-rouge">.eh_frame</code> present and the current frame has an LSDA, but the <code class="language-plaintext highlighter-rouge">call site table</code> has no entry for the throwing IP: <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> calls <code class="language-plaintext highlighter-rouge">__cxa_call_terminate</code> or <code class="language-plaintext highlighter-rouge">std::terminate</code>. The IP lies outside every range that is allowed to throw, so no <code class="language-plaintext highlighter-rouge">landing pad</code> can be found and the only option is to terminate</li>
  <li><code class="language-plaintext highlighter-rouge">.eh_frame</code> present, the current frame has an LSDA, and the <code class="language-plaintext highlighter-rouge">call site table</code> has an entry for the throwing IP: any cleanup is run and unwinding proceeds to the parent frame. A <code class="language-plaintext highlighter-rouge">landing pad</code> of 0 means the frame needs no extra work and unwinding simply continues. A non-zero <code class="language-plaintext highlighter-rouge">landing pad</code> means there is a cleanup or a <code class="language-plaintext highlighter-rouge">catch</code>; if the exception cannot be caught there, <code class="language-plaintext highlighter-rouge">_Unwind_Resume</code> is called to continue unwinding.</li>
</ul>

<p>And when an exception propagates out of a <code class="language-plaintext highlighter-rouge">noexcept</code> function to its caller:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">-fno-exceptions -fno-asynchronous-unwind-tables</code>: <code class="language-plaintext highlighter-rouge">std::terminate</code> is called</li>
  <li><code class="language-plaintext highlighter-rouge">-fno-exceptions -fasynchronous-unwind-tables</code>: the exception passes through, and destructors of local variables are not run</li>
  <li><code class="language-plaintext highlighter-rouge">-fexceptions</code>: <code class="language-plaintext highlighter-rouge">std::terminate</code> is called</li>
</ul>

<p>When <code class="language-plaintext highlighter-rouge">std::terminate</code> is called, a diagnostic of the form <code class="language-plaintext highlighter-rouge">terminate called after throwing an instance of 'int'</code> is printed. It does not include a <code class="language-plaintext highlighter-rouge">stack trace</code>; if the process handles the <code class="language-plaintext highlighter-rouge">SIGABRT</code> signal, the <code class="language-plaintext highlighter-rouge">signal handler</code> may be able to capture a <code class="language-plaintext highlighter-rouge">stack trace</code>.</p>

<h3 id="noexcept">noexcept</h3>

<p>Finally, let's look at a few <code class="language-plaintext highlighter-rouge">noexcept</code> examples, reusing the sample code from before:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">throw</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">foo</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The exception thrown by <code class="language-plaintext highlighter-rouge">bar</code> ends up in <code class="language-plaintext highlighter-rouge">__cxa_throw</code>, which calls <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> to start unwinding; along the way <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> would be consulted to see whether any frame can handle the exception. Since the code contains no <code class="language-plaintext highlighter-rouge">try/catch</code> at all and needs no cleanup either, the compiler does not generate an LSDA. The search phase walks all the way to the bottom of the stack without finding a <code class="language-plaintext highlighter-rouge">catch handler</code>, <code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> returns <code class="language-plaintext highlighter-rouge">_URC_END_OF_STACK</code>, and <code class="language-plaintext highlighter-rouge">__cxa_throw</code> calls <code class="language-plaintext highlighter-rouge">std::terminate</code>. The backtrace from the core dump looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">bt</span>
<span class="cp">#0  0x00007c2fa169eb2c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007c2fa164527e in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007c2fa16288ff in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007c2fa1aa5ff5 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007c2fa1abb0da in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007c2fa1aa5a55 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007c2fa1abb391 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x000060248ae80198 in bar() ()
#8  0x000060248ae801a5 in foo() ()
#9  0x000060248ae801b5 in main ()
</span></code></pre></div></div>

<p>If we add the <code class="language-plaintext highlighter-rouge">noexcept</code> specifier to <code class="language-plaintext highlighter-rouge">bar</code>, the backtrace from the core dump changes.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">bar</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="k">throw</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">foo</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>During unwinding, when the <code class="language-plaintext highlighter-rouge">unwinder</code> calls <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> for this frame, the routine finds that an exception was thrown inside a <code class="language-plaintext highlighter-rouge">noexcept</code> function and calls <code class="language-plaintext highlighter-rouge">__cxa_call_terminate</code> directly.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">bt</span>
<span class="cp">#0  0x00007f164189eb2c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f164184527e in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f16418288ff in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f1641ca5ff5 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f1641cbb0da in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f1641ca58e6 in __cxa_call_terminate () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f1641cba8ba in __gxx_personality_v0 () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f1641bf4b06 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#8  0x00007f1641bf51f1 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#9  0x00007f1641cbb384 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x0000579e7a320198 in bar() ()
#11 0x0000579e7a3201a5 in foo() ()
#12 0x0000579e7a3201b5 in main ()
</span></code></pre></div></div>

<p>To understand this path, we first need to look at the program's <code class="language-plaintext highlighter-rouge">.gcc_except_table</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>readelf <span class="nt">-x</span> .gcc_except_table a.out

Hex dump of section <span class="s1">'.gcc_except_table'</span>:
  0x00002154 ffff0100                            ....
</code></pre></div></div>

<p>There are only four bytes, which make up the LSDA header:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">ff</code>: encoding of the <code class="language-plaintext highlighter-rouge">landing pad</code> base address; <code class="language-plaintext highlighter-rouge">ff</code> (<code class="language-plaintext highlighter-rouge">DW_EH_PE_omit</code>) means no explicit landing pad base is given.</li>
  <li><code class="language-plaintext highlighter-rouge">ff</code>: encoding of the <code class="language-plaintext highlighter-rouge">type table</code>; <code class="language-plaintext highlighter-rouge">ff</code> means there is no type table (there is no <code class="language-plaintext highlighter-rouge">catch</code>, so no RTTI type matching is needed).</li>
  <li><code class="language-plaintext highlighter-rouge">01</code>: encoding of the <code class="language-plaintext highlighter-rouge">call site table</code>; <code class="language-plaintext highlighter-rouge">01</code> means <code class="language-plaintext highlighter-rouge">uleb128</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">00</code>: the length of the <code class="language-plaintext highlighter-rouge">call site table</code> is 0.</li>
</ol>
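<p>The four bytes can be decoded mechanically. The sketch below handles only this minimal header shape, where both optional fields are omitted and the table length fits in a single ULEB128 byte; <code class="language-plaintext highlighter-rouge">LsdaHeader</code> and <code class="language-plaintext highlighter-rouge">decode_minimal_lsda</code> are hypothetical names, while <code class="language-plaintext highlighter-rouge">0xff</code> and <code class="language-plaintext highlighter-rouge">0x01</code> are the values of <code class="language-plaintext highlighter-rouge">DW_EH_PE_omit</code> and <code class="language-plaintext highlighter-rouge">DW_EH_PE_uleb128</code>.</p>

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kDwEhPeOmit = 0xff;     // DW_EH_PE_omit
constexpr std::uint8_t kDwEhPeUleb128 = 0x01;  // DW_EH_PE_uleb128

// Hypothetical view of the header fields, for illustration only.
struct LsdaHeader {
    std::uint8_t lpstart_encoding;
    std::uint8_t ttype_encoding;
    std::uint8_t call_site_encoding;
    std::uint64_t call_site_table_length;

    bool has_lpstart() const { return lpstart_encoding != kDwEhPeOmit; }
    bool has_type_table() const { return ttype_encoding != kDwEhPeOmit; }
};

// Decodes a 4-byte header in which LPStart and the type table are both omitted.
// A general parser would have to read extra fields when they are present.
inline LsdaHeader decode_minimal_lsda(const std::array<std::uint8_t, 4>& b) {
    assert(b[0] == kDwEhPeOmit);  // no LPStart value follows the first byte
    assert(b[1] == kDwEhPeOmit);  // no type table offset follows the second byte
    assert(b[3] < 0x80);          // length is a one-byte ULEB128 value
    return LsdaHeader{b[0], b[1], b[2], b[3]};
}
```

<p>Feeding it the bytes <code class="language-plaintext highlighter-rouge">ff ff 01 00</code> from the dump gives a header with no landing pad base, no type table, a <code class="language-plaintext highlighter-rouge">uleb128</code>-encoded call site table, and a table length of 0.</p>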

<blockquote>

  <ul>
    <li>In GCC, for a <code class="language-plaintext highlighter-rouge">noexcept</code> function, a possibly-throwing call site unhandled by a try block does not get an entry in the <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> call site table. If the function has no try block, it gets a header-only <code class="language-plaintext highlighter-rouge">.gcc_except_table</code> (4 bytes)</li>
    <li>In Clang, there is a call site entry calling <code class="language-plaintext highlighter-rouge">__clang_call_terminate</code>. The size overhead is larger than GCC’s scheme. Improving this requires LLVM IR work</li>
  </ul>
</blockquote>

<p>Since the <code class="language-plaintext highlighter-rouge">call site table</code> contains no valid entries, <code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> sets the search result for this frame to <code class="language-plaintext highlighter-rouge">found_terminate</code> in both phases of the two-phase unwind.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">while</span> <span class="p">(</span><span class="n">p</span> <span class="o">&lt;</span> <span class="n">info</span><span class="p">.</span><span class="n">action_table</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">_Unwind_Ptr</span> <span class="n">cs_start</span><span class="p">,</span> <span class="n">cs_len</span><span class="p">,</span> <span class="n">cs_lp</span><span class="p">;</span>
  <span class="n">_uleb128_t</span> <span class="n">cs_action</span><span class="p">;</span>

  <span class="c1">// Note that all call-site encodings are "absolute" displacements.</span>
  <span class="n">p</span> <span class="o">=</span> <span class="n">read_encoded_value</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">info</span><span class="p">.</span><span class="n">call_site_encoding</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">cs_start</span><span class="p">);</span>
  <span class="n">p</span> <span class="o">=</span> <span class="n">read_encoded_value</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">info</span><span class="p">.</span><span class="n">call_site_encoding</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">cs_len</span><span class="p">);</span>
  <span class="n">p</span> <span class="o">=</span> <span class="n">read_encoded_value</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">info</span><span class="p">.</span><span class="n">call_site_encoding</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">cs_lp</span><span class="p">);</span>
  <span class="n">p</span> <span class="o">=</span> <span class="n">read_uleb128</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">cs_action</span><span class="p">);</span>

  <span class="c1">// The table is sorted, so if we've passed the ip, stop.</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">ip</span> <span class="o">&lt;</span> <span class="n">info</span><span class="p">.</span><span class="n">Start</span> <span class="o">+</span> <span class="n">cs_start</span><span class="p">)</span>
    <span class="n">p</span> <span class="o">=</span> <span class="n">info</span><span class="p">.</span><span class="n">action_table</span><span class="p">;</span>
  <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">ip</span> <span class="o">&lt;</span> <span class="n">info</span><span class="p">.</span><span class="n">Start</span> <span class="o">+</span> <span class="n">cs_start</span> <span class="o">+</span> <span class="n">cs_len</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">cs_lp</span><span class="p">)</span>
      <span class="n">landing_pad</span> <span class="o">=</span> <span class="n">info</span><span class="p">.</span><span class="n">LPStart</span> <span class="o">+</span> <span class="n">cs_lp</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">cs_action</span><span class="p">)</span>
      <span class="n">action_record</span> <span class="o">=</span> <span class="n">info</span><span class="p">.</span><span class="n">action_table</span> <span class="o">+</span> <span class="n">cs_action</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">goto</span> <span class="n">found_something</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="c1">// If ip is not present in the table, call terminate.  This is for</span>
<span class="c1">// a destructor inside a cleanup, or a library routine the compiler</span>
<span class="c1">// was not expecting to throw.</span>
<span class="n">found_type</span> <span class="o">=</span> <span class="n">found_terminate</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">do_something</span><span class="p">;</span>
</code></pre></div></div>

<p>The complete flow is:</p>

<ol>
  <li>
    <p>In the search phase, the result is set to <code class="language-plaintext highlighter-rouge">found_terminate</code> and <code class="language-plaintext highlighter-rouge">landing_pad</code> is 0. The personality routine caches the current result and returns <code class="language-plaintext highlighter-rouge">_URC_HANDLER_FOUND</code>, which reports that a catch handler was found (even though the actual <code class="language-plaintext highlighter-rouge">landing pad</code> is terminate).</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">if</span> <span class="p">(</span><span class="n">actions</span> <span class="o">&amp;</span> <span class="n">_UA_SEARCH_PHASE</span><span class="p">)</span> <span class="p">{</span>
   <span class="k">if</span> <span class="p">(</span><span class="n">found_type</span> <span class="o">==</span> <span class="n">found_cleanup</span><span class="p">)</span>
     <span class="n">CONTINUE_UNWINDING</span><span class="p">;</span>

   <span class="c1">// For domestic exceptions, we cache data from phase 1 for phase 2.</span>
   <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">foreign_exception</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">save_caught_exception</span><span class="p">(</span><span class="n">ue_header</span><span class="p">,</span> <span class="n">context</span><span class="p">,</span> <span class="n">thrown_ptr</span><span class="p">,</span> <span class="n">handler_switch_value</span><span class="p">,</span>
                           <span class="n">language_specific_data</span><span class="p">,</span> <span class="n">landing_pad</span><span class="p">,</span> <span class="n">action_record</span><span class="p">);</span>
   <span class="p">}</span>
   <span class="k">return</span> <span class="n">_URC_HANDLER_FOUND</span><span class="p">;</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>In the cleanup phase, the cached result is read back and the result is set to <code class="language-plaintext highlighter-rouge">found_terminate</code> again, which ultimately calls <a href="https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_personality.cc#L691">__cxa_call_terminate</a>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">// Shortcut for phase 2 found handler for domestic exception.</span>
 <span class="k">if</span> <span class="p">(</span><span class="n">actions</span> <span class="o">==</span> <span class="p">(</span><span class="n">_UA_CLEANUP_PHASE</span> <span class="o">|</span> <span class="n">_UA_HANDLER_FRAME</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">foreign_exception</span><span class="p">)</span> <span class="p">{</span>
   <span class="n">restore_caught_exception</span><span class="p">(</span><span class="n">ue_header</span><span class="p">,</span> <span class="n">handler_switch_value</span><span class="p">,</span>
                            <span class="n">language_specific_data</span><span class="p">,</span> <span class="n">landing_pad</span><span class="p">);</span>
   <span class="n">found_type</span> <span class="o">=</span> <span class="p">(</span><span class="n">landing_pad</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">?</span> <span class="n">found_terminate</span> <span class="o">:</span> <span class="n">found_handler</span><span class="p">);</span>
   <span class="k">goto</span> <span class="n">install_context</span><span class="p">;</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
</ol>
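<p>The search described above can be modelled in a few lines. This is only a sketch under simplifying assumptions: <code class="language-plaintext highlighter-rouge">CallSite</code>, <code class="language-plaintext highlighter-rouge">Found</code> and <code class="language-plaintext highlighter-rouge">classify</code> are hypothetical names, the table is a plain vector instead of the encoded byte stream in the LSDA, and a non-zero action is assumed to mean a matching catch. It is not the libstdc++ implementation.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cstdint&gt;
#include &lt;vector&gt;

// Hypothetical, simplified model of one call-site record.
struct CallSite {
    std::uintptr_t start;        // offset of the region from the function start
    std::uintptr_t length;       // length of the region
    std::uintptr_t landing_pad;  // 0 means "no landing pad"
    std::uintptr_t action;       // 0 means "no action record" (cleanup only)
};

enum class Found { nothing, terminate, cleanup, handler };

// Same shape as the loop above: entries are sorted by start, and an ip that
// no entry covers is classified as "terminate".
Found classify(const std::vector&lt;CallSite&gt;&amp; table, std::uintptr_t func_start,
               std::uintptr_t ip) {
    for (const CallSite&amp; cs : table) {
        if (ip &lt; func_start + cs.start)
            break;  // sorted table: we have already passed the ip
        if (ip &lt; func_start + cs.start + cs.length) {
            if (cs.landing_pad == 0) return Found::nothing;
            // Assumption: any action record stands for a matching catch.
            return cs.action == 0 ? Found::cleanup : Found::handler;
        }
    }
    return Found::terminate;  // ip not present in the table
}
</code></pre></div></div>

<p>With an empty table, every <code class="language-plaintext highlighter-rouge">ip</code> falls through to the last line, which is the situation discussed in this section.</p>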

<p>Finally, the core dump shows every key component of Itanium C++ ABI exception handling:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">__cxa_throw</code> is an interface defined by the Itanium C++ ABI; <code class="language-plaintext highlighter-rouge">libstdc++</code> provides the implementation.</li>
  <li><code class="language-plaintext highlighter-rouge">_Unwind_RaiseException</code> is the stack unwinding interface defined by the Itanium Base ABI; <code class="language-plaintext highlighter-rouge">libgcc_s</code> provides the implementation, which is based on DWARF unwind information.</li>
  <li><code class="language-plaintext highlighter-rouge">__gxx_personality_v0</code> is responsible for:
    <ul>
      <li>checking, during unwinding, whether each stack frame has a matching catch</li>
      <li>deciding whether to run a landing pad</li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">__cxa_call_terminate</code></li>
  <li>Finally, <code class="language-plaintext highlighter-rouge">abort()</code>, which lives in <code class="language-plaintext highlighter-rouge">glibc</code></li>
</ul>

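<p>That covers most of the material. Much of it is my own digestion of MaskRay's blog posts, so omissions and mistakes are inevitable. Still, I learned a lot along the way, and it was fun.</p>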

<h2 id="reference">Reference</h2>

<ul>
  <li><a href="https://maskray.me/blog/2020-12-12-c++-exception-handling-abi">C++ exception handling ABI | MaskRay</a></li>
  <li><a href="https://maskray.me/blog/2020-11-08-stack-unwinding">Stack unwinding | MaskRay</a></li>
  <li><a href="https://www.youtube.com/watch?v=_Ivd3qzgT7U">CppCon 2017: Dave Watson “C++ Exceptions and Stack Unwinding”</a></li>
</ul>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[Without further ado, this post tries to fill in what the previous one did not cover in enough detail.]]></summary></entry><entry><title type="html">Calling Functions</title><link href="/%E5%AD%A6%E4%B9%A0/Calling-Functions/" rel="alternate" type="text/html" title="Calling Functions" /><published>2026-03-26T00:00:00+08:00</published><updated>2026-03-26T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Calling-Functions</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Calling-Functions/"><![CDATA[<p>I have not had much time to learn new things lately, but over the past couple of days I watched the talk <a href="https://www.youtube.com/watch?v=GydNMuyQzWo">Calling Functions: A Tutorial</a>, which explains in detail how the compiler selects the right function for a call. Seen from this angle, many concepts connect (or rather, common idioms map onto their proper terms), and the design thinking behind the process becomes easier to understand.</p>

<h2 id="overview">Overview</h2>

<p>The whole process breaks down into the steps below, named as in the talk. Note that the process is strictly one-way.</p>

<p><img src="/archive/FunctionCall.png" alt="figure" /></p>

<ul>
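  <li><strong>Name Lookup</strong>: collect all visible candidates with the given name in the current scope. If none is found, continue with the next enclosing scope. The result is the candidate set.</li>
  <li><strong>Template Argument Deduction</strong>: for each function template in the candidate set, deduce its template parameters from the call arguments and add the resulting specialization to the overload set. SFINAE also happens in this step.</li>
  <li><strong>Overload Resolution</strong>: find the best match in the candidate set. This step may apply implicit conversions to the arguments.</li>
  <li><strong>Access Labels</strong>: check whether the best match is accessible at the call site.</li>
  <li><strong>Function Template Specialization</strong>: if the best match comes from a template, choose the function that is finally called from among the specializations of the selected function template.</li>
  <li><strong>Virtual Dispatch</strong>: if the best match is a virtual function, the override in the most derived class of the object is the one that runs.</li>
  <li><strong>Deleting Functions</strong>: check whether the best match has been explicitly deleted with <code class="language-plaintext highlighter-rouge">=delete</code>.</li>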
</ul>

<blockquote>
  <p>The terminology for each phase follows the talk; the exact wording varies slightly with context.</p>

</blockquote>
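<p>The last step can be surprising: a deleted function still takes part in name lookup and overload resolution, and is rejected only after it has been chosen. The sketch below assumes C++17; <code class="language-plaintext highlighter-rouge">can_call_f</code> is a hypothetical detection trait introduced for this illustration and does not come from the talk.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;type_traits&gt;
#include &lt;utility&gt;

inline void f(int) {}     // (1)
void f(double) = delete;  // (2): still a candidate during overload resolution

// Hypothetical trait: true when the expression f(T) is well-formed.
template &lt;typename T, typename = void&gt;
struct can_call_f : std::false_type {};

template &lt;typename T&gt;
struct can_call_f&lt;T, std::void_t&lt;decltype(f(std::declval&lt;T&gt;()))&gt;&gt;
    : std::true_type {};

static_assert(can_call_f&lt;int&gt;::value, "(1) is selected and usable");
static_assert(!can_call_f&lt;double&gt;::value, "(2) is selected, then rejected");
static_assert(!can_call_f&lt;float&gt;::value, "float promotes to double, so (2) wins");
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">float</code> case shows that the compiler does not fall back to (1) once the deleted overload has won.</p>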

<h2 id="name-lookup">Name Lookup</h2>

<p>The first step is name lookup. Its core rules are:</p>

<ul>
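  <li>Unqualified lookup: search scope by scope from the innermost one outward (including names introduced by <code class="language-plaintext highlighter-rouge">using</code> in those scopes) and stop at the first scope that contains the name.</li>
  <li>Qualified lookup: search only in the named class or namespace.</li>
  <li>The difference between the two is whether the name is qualified with <code class="language-plaintext highlighter-rouge">::</code>.</li>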
</ul>

<p>The examples below illustrate these rules and need little further explanation.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (1)</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>
<span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>  <span class="c1">// (2)</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span>     <span class="c1">// Unqualified lookup; calls (1)</span>
    <span class="n">f</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>      <span class="c1">// Unqualified lookup; calls (1)</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">f</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>  <span class="c1">// Qualified lookup; calls (2)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>With unqualified lookup, the search starts at the call site, and the first variable or function name found hides every declaration of that name in the enclosing scopes. Inside <code class="language-plaintext highlighter-rouge">h</code>, lookup therefore searches namespace <code class="language-plaintext highlighter-rouge">N1</code> first, finds <code class="language-plaintext highlighter-rouge">f</code> there, and calls the <code class="language-plaintext highlighter-rouge">f</code> from <code class="language-plaintext highlighter-rouge">N1</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (1)</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>
<span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>  <span class="c1">// (2): Function (2) hides function (1)</span>
<span class="kt">void</span> <span class="n">g</span><span class="p">()</span> <span class="p">{</span> <span class="n">N1</span><span class="o">::</span><span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span> <span class="p">}</span>
<span class="kt">void</span> <span class="nf">h</span><span class="p">()</span> <span class="p">{</span> <span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span> <span class="p">}</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">g</span><span class="p">();</span>  <span class="c1">// Qualified lookup; calls (2)</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">h</span><span class="p">();</span>  <span class="c1">// Unqualified lookup; calls (2)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Note that name lookup also takes variable names into account:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (1)</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>
<span class="k">constexpr</span> <span class="kt">int</span> <span class="n">f</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>  <span class="c1">// (2): Variable (2) hides function (1)</span>
<span class="kt">void</span> <span class="n">g</span><span class="p">()</span> <span class="p">{</span> <span class="n">N1</span><span class="o">::</span><span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span> <span class="p">}</span>
<span class="kt">void</span> <span class="nf">h</span><span class="p">()</span> <span class="p">{</span> <span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span> <span class="p">}</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">g</span><span class="p">();</span> <span class="c1">// Ill-formed, f cannot be used as a function</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">h</span><span class="p">();</span> <span class="c1">// Ill-formed, f cannot be used as a function</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The same hiding applies to class scope: a member declared in a derived class hides every base-class member with the same name, whatever its signature.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Base</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>     <span class="c1">// (1)</span>
    <span class="k">virtual</span> <span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (2)</span>
<span class="p">};</span>

<span class="k">class</span> <span class="nc">Derived</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Base</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">double</span><span class="p">)</span> <span class="k">override</span><span class="p">;</span>  <span class="c1">// (3): Function (3) hides functions (1) and (2)</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Derived</span> <span class="n">d</span><span class="p">{};</span>
    <span class="n">d</span><span class="p">.</span><span class="n">f</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span> <span class="c1">// Calls (3)</span>
<span class="p">}</span>
</code></pre></div></div>
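<p>If the hidden base-class overloads should stay callable, a using-declaration in the derived class puts them back into the lookup result. The sketch below is an illustration only: the classes <code class="language-plaintext highlighter-rouge">Hiding</code> and <code class="language-plaintext highlighter-rouge">Exposing</code> and the two helper functions are hypothetical names, and the functions return a label so the selected overload can be observed.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string_view&gt;

struct Base {
    virtual ~Base() = default;
    virtual std::string_view f(int) { return "Base::f(int)"; }
    virtual std::string_view f(double) { return "Base::f(double)"; }
};

struct Hiding : Base {
    // Declaring any f here hides every Base::f during name lookup.
    std::string_view f(double) override { return "Hiding::f(double)"; }
};

struct Exposing : Base {
    using Base::f;  // brings the Base overloads back into Exposing's scope
    std::string_view f(double) override { return "Exposing::f(double)"; }
};

// 42 is converted to double because f(double) is the only candidate.
inline std::string_view call_hiding() { Hiding d; return d.f(42); }
// Base::f(int) is a candidate again and is an exact match.
inline std::string_view call_exposing() { Exposing d; return d.f(42); }
</code></pre></div></div>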

<h3 id="argument-dependent-lookup">Argument Dependent Lookup</h3>

<p>So far the arguments have been fundamental types. When a call argument has a class type (<code class="language-plaintext highlighter-rouge">struct</code> or <code class="language-plaintext highlighter-rouge">class</code>), the compiler additionally searches the namespace in which that class is declared. In the example below, the type <code class="language-plaintext highlighter-rouge">S</code> comes from <code class="language-plaintext highlighter-rouge">namespace</code> <code class="language-plaintext highlighter-rouge">N1</code>, so name lookup also searches <code class="language-plaintext highlighter-rouge">N1</code>.</p>

<blockquote>
  <p>Remember that ADL only works for user-defined types.</p>

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (1)</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>
<span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>  <span class="c1">// (2)</span>
<span class="k">struct</span> <span class="nc">S</span> <span class="p">{};</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">S</span><span class="p">);</span>  <span class="c1">// (3)</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">S</span> <span class="n">s</span><span class="p">{};</span>
    <span class="n">f</span><span class="p">(</span><span class="n">s</span><span class="p">);</span> <span class="c1">// Argument dependent lookup (ADL); calls (3)</span>
<span class="p">}</span>
</code></pre></div></div>

<blockquote>
  <p>Note that in this example name lookup collects all three <code class="language-plaintext highlighter-rouge">f</code> functions as candidates; only the later Overload Resolution step decides that (3) is the best match.</p>

</blockquote>

<p>A very common example of ADL is <code class="language-plaintext highlighter-rouge">std::swap</code>:</p>

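<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;utility&gt;</span>  <span class="c1">// std::swap</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>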
<span class="k">struct</span> <span class="nc">S</span> <span class="p">{};</span>
<span class="kt">void</span> <span class="nf">swap</span><span class="p">(</span><span class="n">S</span><span class="o">&amp;</span><span class="p">,</span> <span class="n">S</span><span class="o">&amp;</span><span class="p">);</span>  <span class="c1">// (1)</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="kt">void</span> <span class="nf">g</span><span class="p">(</span><span class="n">T</span><span class="o">&amp;</span> <span class="n">a</span><span class="p">,</span> <span class="n">T</span><span class="o">&amp;</span> <span class="n">b</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">swap</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="kt">void</span> <span class="nf">h</span><span class="p">(</span><span class="n">T</span><span class="o">&amp;</span> <span class="n">a</span><span class="p">,</span> <span class="n">T</span><span class="o">&amp;</span> <span class="n">b</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">swap</span><span class="p">;</span>
    <span class="n">swap</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">S</span> <span class="n">s1</span><span class="p">{};</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">S</span> <span class="n">s2</span><span class="p">{};</span>
    <span class="n">g</span><span class="p">(</span><span class="n">s1</span><span class="p">,</span> <span class="n">s2</span><span class="p">);</span> <span class="c1">// Qualified lookup, calls std::swap</span>
    <span class="n">h</span><span class="p">(</span><span class="n">s1</span><span class="p">,</span> <span class="n">s2</span><span class="p">);</span> <span class="c1">// Unqualified lookup, calls swap(S,S)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Inside <code class="language-plaintext highlighter-rouge">g(s1, s2)</code> the call is qualified, so it can only reach <code class="language-plaintext highlighter-rouge">std::swap</code>. Inside <code class="language-plaintext highlighter-rouge">h(s1, s2)</code> the call is unqualified, so ADL additionally finds the swap function in <code class="language-plaintext highlighter-rouge">N1</code>. In other words, ADL only takes effect for unqualified calls. The <code class="language-plaintext highlighter-rouge">using std::swap;</code> in h merely adds <code class="language-plaintext highlighter-rouge">std::swap</code> to the name lookup result; the final choice is made later, during Overload Resolution.</p>
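<p>The difference between the two call styles can be observed at run time. The following is a minimal sketch assuming C++17; the counter <code class="language-plaintext highlighter-rouge">custom_swaps</code>, the member <code class="language-plaintext highlighter-rouge">value</code> and the function names are hypothetical additions that make the chosen overload visible.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;utility&gt;

namespace N1 {
struct S { int value = 0; };
inline int custom_swaps = 0;  // counts calls to N1::swap (illustrative)
inline void swap(S&amp; a, S&amp; b) {
    ++custom_swaps;
    std::swap(a.value, b.value);
}
}  // namespace N1

template &lt;typename T&gt;
void swap_qualified(T&amp; a, T&amp; b) {
    std::swap(a, b);  // qualified: no ADL, N1::swap is never considered
}

template &lt;typename T&gt;
void swap_with_adl(T&amp; a, T&amp; b) {
    using std::swap;  // fallback for types without their own swap
    swap(a, b);       // unqualified: ADL adds N1::swap for N1::S
}
</code></pre></div></div>

<p>For <code class="language-plaintext highlighter-rouge">N1::S</code>, only <code class="language-plaintext highlighter-rouge">swap_with_adl</code> increments the counter. For <code class="language-plaintext highlighter-rouge">int</code>, both functions end up in <code class="language-plaintext highlighter-rouge">std::swap</code>, which is why the using-declaration is kept as a fallback.</p>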

<h3 id="two-phase-lookup">Two-Phase Lookup</h3>

<p>Templates make things a little more complicated. Name lookup happens in two phases:</p>

<ul>
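  <li>Names that do not depend on a template parameter are looked up when the template is defined</li>
  <li>Names that depend on a template parameter are looked up when the template is instantiated</li>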
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (1)</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="kt">void</span> <span class="nf">g</span><span class="p">(</span><span class="n">T</span> <span class="n">t</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">f</span><span class="p">(</span><span class="n">t</span><span class="p">);</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>  <span class="c1">// (2)</span>

<span class="k">namespace</span> <span class="n">N1</span> <span class="p">{</span>
<span class="k">struct</span> <span class="nc">S</span> <span class="p">{};</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">S</span><span class="p">);</span>  <span class="c1">// (3)</span>
<span class="p">}</span>  <span class="c1">// namespace N1</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">N1</span><span class="o">::</span><span class="n">S</span> <span class="n">s</span><span class="p">{};</span>
    <span class="n">g</span><span class="p">(</span><span class="n">s</span><span class="p">);</span>   <span class="c1">// Argument dependent lookup (ADL); calls (3)</span>
    <span class="n">g</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>  <span class="c1">// Regular lookup (no ADL); calls (1)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In the code above, <code class="language-plaintext highlighter-rouge">g(s)</code> calls <code class="language-plaintext highlighter-rouge">f(S)</code> because of ADL, as explained earlier. <code class="language-plaintext highlighter-rouge">g(42)</code>, however, “surprisingly” calls <code class="language-plaintext highlighter-rouge">f(double)</code>. This follows directly from two-phase lookup:</p>

<ul>
  <li>When the function template <code class="language-plaintext highlighter-rouge">g</code> is defined, the compiler performs unqualified lookup for <code class="language-plaintext highlighter-rouge">f(t)</code>, so the candidate set contains only <code class="language-plaintext highlighter-rouge">f(double)</code>. <code class="language-plaintext highlighter-rouge">f(int)</code> is declared after the template definition, so the first phase does not find it.</li>
  <li>When <code class="language-plaintext highlighter-rouge">g(42)</code> is instantiated, names that depend on a template parameter are looked up through ADL only, that is, only in the namespaces and classes associated with <code class="language-plaintext highlighter-rouge">int</code>. Because <code class="language-plaintext highlighter-rouge">int</code> is a built-in type, no additional function is found, and the only candidate, <code class="language-plaintext highlighter-rouge">f(double)</code>, is chosen.</li>
</ul>
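<p>The two bullets above can be checked with a runnable variant of the same example. This sketch assumes a compiler that implements two-phase lookup as the standard describes; the string labels are an addition that makes the selected overload observable.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string_view&gt;

inline std::string_view f(double) { return "f(double)"; }  // (1)

template &lt;typename T&gt;
std::string_view g(T t) {
    return f(t);  // dependent call: ordinary lookup sees only (1) here
}

inline std::string_view f(int) { return "f(int)"; }  // (2): declared too late

namespace N1 {
struct S {};
inline std::string_view f(S) { return "N1::f(S)"; }  // (3): found through ADL
}  // namespace N1
</code></pre></div></div>

<p>A direct call <code class="language-plaintext highlighter-rouge">f(42)</code> written after all declarations still selects (2); only the call made through the template misses it.</p>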

<h2 id="template-argument-deduction">Template Argument Deduction</h2>

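<p>After name lookup, the candidate set contains some non-template functions and some function templates. For the function templates, template argument deduction is performed at this point. See the material in the references for this step; it is not covered separately here.</p>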

<h2 id="overload-resolution">Overload Resolution</h2>

<p>Overload resolution consists of two main steps:</p>

<ol>
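  <li>From the given candidate set, select every function that accepts the given number of arguments and can be called with them (with or without conversions). These are the viable candidates.</li>
  <li><strong>Find the best match among the viable candidates:</strong> determine the one function that matches the given arguments best.</li>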
</ol>

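<p>That is, after Name Lookup and Template Argument Deduction we already have a candidate set. The first step of Overload Resolution picks from it the functions that can actually be called. The listing below shows, for a call to <code class="language-plaintext highlighter-rouge">f</code> with a single argument, some candidates that are viable and some that are not.</p>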

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">Widget</span> <span class="p">{</span>
    <span class="n">Widget</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>
<span class="p">};</span>

<span class="c1">// Viable candidates</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>            <span class="c1">// Exact/identity match</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="k">const</span> <span class="kt">int</span><span class="o">&amp;</span><span class="p">);</span>     <span class="c1">// Trivial conversions</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>         <span class="c1">// Standard conversions</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">Widget</span><span class="p">);</span>         <span class="c1">// User-defined conversions</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span> <span class="o">=</span> <span class="mi">0</span><span class="p">);</span>   <span class="c1">// Default arguments</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">integral</span> <span class="k">auto</span><span class="p">);</span>  <span class="c1">// Matching constraints</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(...);</span>            <span class="c1">// Ellipsis argument</span>

<span class="c1">// Non-viable candidates</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">();</span>                     <span class="c1">// Fewer parameters than arguments</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">double</span><span class="p">);</span>          <span class="c1">// More parameters than arguments</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="p">);</span>          <span class="c1">// No conversion available</span>
<span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">floating_point</span> <span class="k">auto</span><span class="p">);</span>  <span class="c1">// Violated constraints</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">f</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>  <span class="c1">// Call 'f()' with a single 'int' argument</span>
<span class="p">}</span>
</code></pre></div></div>

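<p>Even for a single argument there are this many possible declarations, and the compiler's job is to choose the best match among them. Depending on what the parameter requires of the argument, candidates are ranked from highest to lowest priority:</p>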

<ol>
  <li>Exact/identity match</li>
  <li>Trivial conversion (e.g. int → const int&amp;)</li>
  <li>Promotion (widening of built-in types, e.g. short → int)</li>
  <li>Promotion + trivial conversion</li>
  <li>Standard conversion (e.g. int → float, float → int, Derived → Base, int → short)</li>
  <li>Standard conversion + trivial conversion</li>
  <li>User-defined conversion</li>
  <li>User-defined conversion + trivial conversion</li>
  <li>User-defined conversion + standard conversion</li>
  <li>Ellipsis argument</li>
</ol>

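<p>The compiler prefers the candidate with the higher priority. If several candidates end up with the same priority, the compiler cannot find a best match and reports that <code class="language-plaintext highlighter-rouge">the call is ambiguous</code>.</p>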
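<p>A few of these ranks can be observed with three overloads. This is a minimal sketch assuming C++17; the type <code class="language-plaintext highlighter-rouge">Unrelated</code> and the helper functions are hypothetical names, and each overload returns a label so the choice is visible.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string_view&gt;

inline std::string_view f(int) { return "f(int)"; }
inline std::string_view f(double) { return "f(double)"; }
inline std::string_view f(...) { return "f(...)"; }

struct Unrelated {};  // converts to neither int nor double

// short -&gt; int is a promotion, short -&gt; double only a standard conversion.
inline std::string_view from_short() { short s = 1; return f(s); }
// float -&gt; double is a promotion, float -&gt; int only a standard conversion.
inline std::string_view from_float() { float x = 1.0f; return f(x); }
// Nothing else is viable, so the ellipsis overload is the last resort.
inline std::string_view from_class() { return f(Unrelated{}); }

// f(1L) would not compile: long -&gt; int and long -&gt; double are both standard
// conversions of the same rank, so the call is ambiguous.
</code></pre></div></div>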

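<p>Functions with several parameters are more involved: the single-argument rules are first applied to each argument. A function is the best match if it is better than every other candidate on at least one argument and no worse on all the others. Otherwise the call is ambiguous.</p>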
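<p>The multi-argument rule is easiest to see with two overloads that mirror each other. Again a small sketch with hypothetical helper names; the ambiguous call is shown only in a comment because it does not compile.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string_view&gt;

inline std::string_view f(int, double) { return "f(int, double)"; }  // (1)
inline std::string_view f(double, int) { return "f(double, int)"; }  // (2)

// (1) is an exact match on both arguments, (2) needs two conversions.
inline std::string_view exact_first() { return f(1, 2.0); }
// (2) is an exact match on both arguments, (1) needs two conversions.
inline std::string_view exact_second() { return f(1.0, 2); }

// f(1, 2) would not compile: (1) is better on the first argument and (2) is
// better on the second, so neither is at least as good on every argument.
</code></pre></div></div>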

<p>The detailed rules of overload resolution are considerably more complex; the full process is described on <a href="https://en.cppreference.com/w/cpp/language/overload_resolution.html">cppreference</a>.</p>

<h2 id="access-labels">Access Labels</h2>

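<p>Once overload resolution is done, if the best match is a member function, the compiler checks whether that member function is accessible at the call site. For example:</p>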

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Object</span> <span class="p">{</span>
<span class="nl">public:</span>
    <span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>  <span class="c1">// (1)</span>
<span class="nl">private:</span>
    <span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">double</span><span class="p">);</span>  <span class="c1">// (2)</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Object</span> <span class="n">obj</span><span class="p">{};</span>
    <span class="n">obj</span><span class="p">.</span><span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span>  <span class="c1">// (2) is selected; access violation!</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
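  <li>Name Lookup puts both functions into the candidate set</li>
  <li>Overload Resolution selects <code class="language-plaintext highlighter-rouge">f(double)</code> as the best match</li>
  <li>The access check on <code class="language-plaintext highlighter-rouge">f(double)</code> then fails because the call violates the class's encapsulation, and the compiler reports an error</li>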
</ul>

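<p>Why are the checks done in this order? One important reason: for a given member function name and a given set of arguments, the call should resolve to the same function whether it appears inside or outside the class. Checking accessibility first would clearly break this guarantee.</p>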
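<p>The guarantee can be made visible by issuing the same call from inside the class, where the access check passes. In this sketch the member <code class="language-plaintext highlighter-rouge">call_from_inside</code> is a hypothetical addition to the example class, and the overloads return a label instead of <code class="language-plaintext highlighter-rouge">void</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;string_view&gt;

class Object {
public:
    std::string_view f(int) { return "f(int)"; }  // (1)
    // Same call as obj.f(1.0), but made where (2) is accessible.
    std::string_view call_from_inside() { return f(1.0); }

private:
    std::string_view f(double) { return "f(double)"; }  // (2)
};

// Outside the class, obj.f(1.0) selects the same overload (2) and then fails
// the access check; it does not silently fall back to (1).
</code></pre></div></div>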

<h2 id="function-template-specialization">Function Template Specialization</h2>

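<p>Once overload resolution is done, if the best match comes from a template, the function that is finally called is chosen from among the specializations of the selected function template. Note that function template specializations are not considered during overload resolution. That is:</p>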

<ul>
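  <li>The compiler first performs overload resolution and picks the best match among all primary templates and ordinary functions</li>
  <li>Only after a primary template has been selected does the compiler check whether that primary template has specializations</li>
  <li>The specializations themselves do not take part in overload resolution</li>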
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="p">);</span>  <span class="c1">// (1): primary template</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span><span class="p">);</span>        <span class="c1">// (2): explicit template specialization of (1)</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="o">*</span><span class="p">);</span> <span class="c1">// (3): primary template</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="kt">char</span><span class="o">*</span> <span class="n">cp</span><span class="p">{</span><span class="nb">nullptr</span><span class="p">};</span>
    <span class="n">f</span><span class="p">(</span><span class="n">cp</span><span class="p">);</span>  <span class="c1">// Calls function (3)</span>
<span class="p">}</span>
</code></pre></div></div>

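<p>As the example shows, only the two primary templates (1) and (3) take part in overload resolution. (3) is a better match than (1), so (3) is selected. (2) is a specialization of (1), which was not selected, so (2) is never considered and cannot be called here.</p>

<p>Compare the following two examples to deepen your understanding:</p>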

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="p">);</span>  <span class="c1">// (1): primary template</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="o">*</span><span class="p">);</span> <span class="c1">// (3): primary template</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span><span class="p">);</span>        <span class="c1">// (2): explicit template specialization of (3)</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="kt">char</span><span class="o">*</span> <span class="n">cp</span><span class="p">{</span><span class="nb">nullptr</span><span class="p">};</span>
    <span class="n">f</span><span class="p">(</span><span class="n">cp</span><span class="p">);</span>  <span class="c1">// Calls function (2)</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="p">);</span>  <span class="c1">// (1): primary template</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="n">T</span><span class="o">*</span><span class="p">);</span> <span class="c1">// (3): primary template</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span> <span class="kt">void</span> <span class="n">f</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">*&gt;</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span><span class="p">);</span> <span class="c1">// (2): explicit template specialization of (1)</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="kt">char</span><span class="o">*</span> <span class="n">cp</span><span class="p">{</span><span class="nb">nullptr</span><span class="p">};</span>
    <span class="n">f</span><span class="p">(</span><span class="n">cp</span><span class="p">);</span>  <span class="c1">// Calls function (3)</span>
<span class="p">}</span>
</code></pre></div></div>
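
<p>The declarations above have no bodies, so they cannot be linked and run as written. The following sketch gives each function a body that returns its own label, which makes the effect of declaration order observable. The namespace names and the <code class="language-plaintext highlighter-rouge">call_*</code> helpers are invented for this illustration.</p>

```cpp
#include <cassert>

namespace specialization_first {
template <typename T> int f(T)  { return 1; }  // (1) primary template
template <> int f(char*)        { return 2; }  // (2) specializes (1): only (1) is visible here
template <typename T> int f(T*) { return 3; }  // (3) primary template
}  // namespace specialization_first

namespace specialization_last {
template <typename T> int f(T)  { return 1; }  // (1) primary template
template <typename T> int f(T*) { return 3; }  // (3) primary template
template <> int f(char*)        { return 2; }  // (2) specializes (3): the more specialized match
}  // namespace specialization_last

int call_specialization_first() {
    char* cp = nullptr;
    // Overload resolution picks (3); (3) has no specialization for char.
    return specialization_first::f(cp);
}

int call_specialization_last() {
    char* cp = nullptr;
    // Overload resolution picks (3); its specialization (2) is then used.
    return specialization_last::f(cp);
}

void check_specialization_order() {
    assert(call_specialization_first() == 3);
    assert(call_specialization_last() == 2);
}
```

<p>Both namespaces contain the same three declarations; only their order differs, and that alone decides which primary template (2) belongs to.</p>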

<h2 id="virtual-dispatch">Virtual Dispatch</h2>

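<p>Once overload resolution has finished, if the best match is a virtual member function, the compiler decides to dispatch through the virtual function table and determines the function's offset in that table. The generated code is an indirect call, that is, a call to the N-th entry of some virtual function table. Which class's table is used is decided at run time from the actual (dynamic) type of the object.</p>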
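
<p>A small sketch of the two phases, using hypothetical <code class="language-plaintext highlighter-rouge">Base</code> and <code class="language-plaintext highlighter-rouge">Derived</code> classes that are not part of the original post: the overload (<code class="language-plaintext highlighter-rouge">f(int)</code> vs. <code class="language-plaintext highlighter-rouge">f(double)</code>) is fixed at compile time from the static type, while the class whose override runs is chosen at run time.</p>

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string f(int) const    { return "Base::f(int)"; }
    virtual std::string f(double) const { return "Base::f(double)"; }
};

struct Derived : Base {
    std::string f(int) const override    { return "Derived::f(int)"; }
    std::string f(double) const override { return "Derived::f(double)"; }
};

// Compile time: the static type is Base and the argument is a double,
// so overload resolution selects Base::f(double).
// Run time: the call is dispatched to the override of the dynamic type.
std::string call_through_base(const Base& object) {
    return object.f(1.0);
}

void check_virtual_dispatch() {
    Base base;
    Derived derived;
    assert(call_through_base(base) == "Base::f(double)");
    assert(call_through_base(derived) == "Derived::f(double)");
}
```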

<h2 id="deleting-functions">Deleting Functions</h2>

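<p>At this final step the best match is fully determined, and the compiler checks whether the selected function has been marked <code class="language-plaintext highlighter-rouge">delete</code>. Note that a function marked <code class="language-plaintext highlighter-rouge">delete</code> still takes part in overload resolution, so the best match may itself be a deleted function. The core difference between a function that is not declared at all and one that is declared as <code class="language-plaintext highlighter-rouge">delete</code> is whether it takes part in overload resolution.</p>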

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">f</span><span class="p">(</span><span class="kt">int</span><span class="p">);</span>              <span class="c1">// (1)</span>
<span class="kt">void</span> <span class="n">f</span><span class="p">(</span><span class="kt">double</span><span class="p">)</span> <span class="o">=</span> <span class="k">delete</span><span class="p">;</span>  <span class="c1">// (2)</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">f</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>   <span class="c1">// Calls function (1)</span>
    <span class="n">f</span><span class="p">(</span><span class="mf">1.0</span><span class="p">);</span>  <span class="c1">// Compilation error: Call to deleted function</span>
<span class="p">}</span>
</code></pre></div></div>
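
<p>The difference between "not declared" and "declared as <code class="language-plaintext highlighter-rouge">delete</code>" can be checked at compile time. This is a sketch under assumptions: the two namespaces and the two detection traits are invented for illustration, and the check relies on a call to a deleted function being a substitution failure inside <code class="language-plaintext highlighter-rouge">decltype</code>.</p>

```cpp
#include <type_traits>
#include <utility>

namespace with_deleted {
void f(int) {}            // (1)
void f(double) = delete;  // (2) still a candidate in overload resolution
}  // namespace with_deleted

namespace without_overload {
void f(int) {}            // (1) only; f(double) is not declared at all
}  // namespace without_overload

// Detects whether with_deleted::f(Arg) is a valid call.
template <typename Arg, typename = void>
struct can_call_with_deleted : std::false_type {};
template <typename Arg>
struct can_call_with_deleted<Arg, decltype(with_deleted::f(std::declval<Arg>()))>
    : std::true_type {};

// Detects whether without_overload::f(Arg) is a valid call.
template <typename Arg, typename = void>
struct can_call_without_overload : std::false_type {};
template <typename Arg>
struct can_call_without_overload<Arg, decltype(without_overload::f(std::declval<Arg>()))>
    : std::true_type {};

// int selects (1) in both namespaces.
static_assert(can_call_with_deleted<int>::value, "f(int) is usable");
// double selects the deleted (2), which makes the call ill-formed.
static_assert(!can_call_with_deleted<double>::value, "deleted f(double) is the best match");
// Without (2), a double argument converts to int and calls (1).
static_assert(can_call_without_overload<double>::value, "f(int) accepts a double");
```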

<h2 id="reference">Reference</h2>

<ul>
  <li><a href="https://www.youtube.com/watch?v=GydNMuyQzWo">Calling Functions: A Tutorial - Klaus Iglberger - CppCon 2020</a></li>
  <li><strong>C++ Templates The Complete Guide, 2nd Edition</strong></li>
  <li><a href="https://www.youtube.com/watch?v=wQxj20X-tIU">CppCon 2014: Scott Meyers “Type Deduction and Why You Care”</a></li>
  <li><a href="https://en.cppreference.com/w/cpp/language/overload_resolution.html">Overload resolution - cppreference.com</a></li>
</ul>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[I haven't had much time to learn new things lately, but over the past couple of days I watched the talk Calling Functions: A Tutorial, which explains in detail how the compiler selects the right function for a call. Seen from this angle, many concepts fall into place (or rather, common idioms can be matched to their proper terms), and the design thinking behind the process becomes easier to understand.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 4</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-4/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 4" /><published>2025-11-19T00:00:00+08:00</published><updated>2025-11-19T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-4</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-4/"><![CDATA[<p>I originally planned to go straight to <code class="language-plaintext highlighter-rouge">folly::coro::Task</code>, but the previous post showed so much technique (“术”) that this post steps back to the big picture: what an asynchronous task built on coroutines actually has to implement, and why it has to be implemented that way, the underlying principle (“道”). Along the way I may weave in some <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> material. Much of this post summarizes this <a href="https://www.youtube.com/watch?v=qfKFfQSxvA8">talk</a>; there is also a more detailed <a href="https://www.youtube.com/watch?v=lKUVuaUbRDk">version</a>.</p>

<h3 id="mental-model">Mental Model</h3>

<p>Suppose we want to implement the following task: <code class="language-plaintext highlighter-rouge">main</code> calls <code class="language-plaintext highlighter-rouge">spawn_task</code> to create a task, which in turn calls <code class="language-plaintext highlighter-rouge">outer_function</code>, <code class="language-plaintext highlighter-rouge">middle_function</code>, and finally the innermost <code class="language-plaintext highlighter-rouge">inner_function</code>.</p>

<p><img src="/archive/coroutine-10.png" alt="figure" /></p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">spawn_task</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
    <span class="n">Result</span> <span class="n">r</span> <span class="o">=</span> <span class="n">outer_function</span><span class="p">();</span>
<span class="p">}</span>

<span class="n">Result</span> <span class="nf">outer_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">PartialResult</span> <span class="n">r</span> <span class="o">=</span> <span class="n">middle_function</span><span class="p">();</span>
    <span class="k">return</span> <span class="n">Result</span><span class="o">::</span><span class="n">from_partial_result</span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="p">}</span>

<span class="n">PartialResult</span> <span class="nf">middle_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">r</span> <span class="o">=</span> <span class="n">inner_function</span><span class="p">();</span>
    <span class="k">return</span> <span class="n">PartialResult</span><span class="o">::</span><span class="n">from_io_result</span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="p">}</span>

<span class="n">IoResult</span> <span class="nf">inner_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">data</span> <span class="o">=</span> <span class="n">blocking_io</span><span class="p">(...);</span> <span class="c1">// this could take some time</span>
    <span class="k">return</span> <span class="n">IoResult</span><span class="o">::</span><span class="n">from_io_data</span><span class="p">(</span><span class="n">data</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The innermost <code class="language-plaintext highlighter-rouge">inner_function</code> performs some asynchronous I/O. When that I/O completes, we want to wake up the part of the task that has not finished yet and let it continue.</p>

<p>To run this task asynchronously, every layer may need its interface changed so that <code class="language-plaintext highlighter-rouge">outer_function</code>/<code class="language-plaintext highlighter-rouge">middle_function</code>/<code class="language-plaintext highlighter-rouge">inner_function</code> each become an asynchronous task. Some common ways to do this:</p>

<ul>
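  <li>Threads: start a new thread via <code class="language-plaintext highlighter-rouge">std::async</code> and fetch the result via <code class="language-plaintext highlighter-rouge">std::future</code>. The downside is that thread context switches are heavyweight, and coordinating multiple tasks requires additional synchronization.</li>
  <li>Stackful coroutines (fibers): when an asynchronous operation is needed, the fiber voluntarily yields the thread and suspends so that other fibers can run. When the operation completes, a scheduler resumes the suspended fiber. Switching between fibers is lighter than switching between threads, but each fiber needs some extra stack space.</li>
  <li>Stackless coroutines (coroutines): by implementing the coroutine interfaces, you define when to suspend, and when and by whom the coroutine is resumed. Note, however, that only one coroutine can be suspended at a time. If we suspend <code class="language-plaintext highlighter-rouge">inner_function</code>, then in addition to its own suspend-point state we must also preserve the suspend-point state of its callers <code class="language-plaintext highlighter-rouge">middle_function</code>/<code class="language-plaintext highlighter-rouge">outer_function</code>.</li>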
</ul>

<p>So the core question of this post is: if we implement such an asynchronous task with coroutines, what capabilities do we need to provide?</p>

<h2 id="async-task">Async Task</h2>

<p>We turn the synchronous calls above into coroutines. Each coroutine returns an <code class="language-plaintext highlighter-rouge">Async</code> object, which represents an asynchronous task (Async Task).</p>

<blockquote>
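  <p><code class="language-plaintext highlighter-rouge">folly::coro::Task</code> is essentially the <code class="language-plaintext highlighter-rouge">Async</code> task implemented here. To keep the focus on the core principles of an asynchronous task, we simply leave out some less important details.</p>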

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">inner_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">Async</span><span class="o">&lt;</span><span class="n">PartialResult</span><span class="o">&gt;</span> <span class="n">middle_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">r</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">inner_function</span><span class="p">();</span>
    <span class="k">co_return</span> <span class="n">PartialResult</span><span class="o">::</span><span class="n">from_io_result</span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The code above can be expanded into the following equivalent form:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">inner_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">Async</span><span class="o">&lt;</span><span class="n">PartialResult</span><span class="o">&gt;</span> <span class="n">middle_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">inner_function</span><span class="p">();</span>
    <span class="n">IoResult</span> <span class="n">r</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">awaitable</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">PartialResult</span><span class="o">::</span><span class="n">from_io_result</span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Note that <code class="language-plaintext highlighter-rouge">Async</code>, as an asynchronous task, is the return value of a coroutine. It can in turn be <code class="language-plaintext highlighter-rouge">co_await</code>ed in other coroutines, as with <code class="language-plaintext highlighter-rouge">co_await inner_function()</code> inside <code class="language-plaintext highlighter-rouge">middle_function</code> above.</p>

<p>So <code class="language-plaintext highlighter-rouge">Async</code> is both a <code class="language-plaintext highlighter-rouge">ReturnType</code> and an <code class="language-plaintext highlighter-rouge">Awaitable</code>. That is, <code class="language-plaintext highlighter-rouge">Async</code> must provide the following capabilities:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">ReturnType</code>: serves as the return value of a coroutine function</li>
  <li><code class="language-plaintext highlighter-rouge">Awaitable</code>: can be the operand of <code class="language-plaintext highlighter-rouge">co_await</code></li>
</ul>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">folly::coro::Task</code> itself is a <code class="language-plaintext highlighter-rouge">ReturnType</code>, and it provides the nested class <code class="language-plaintext highlighter-rouge">folly::coro::Task::Awaiter</code> as the <code class="language-plaintext highlighter-rouge">Awaitable</code>; in essence it is the same thing.</p>

</blockquote>

<h3 id="1-how-to-resume-inner-coroutine">1. How to resume inner coroutine</h3>

<p>The first problem to solve is how to resume the part of an asynchronous task that has not finished yet. Using the example above again, here is a possible execution flow (if it is hard to follow, see the earlier posts in this series):</p>

<p>When <code class="language-plaintext highlighter-rouge">middle_function</code> starts executing, it evaluates <code class="language-plaintext highlighter-rouge">co_await inner_function()</code>. If the <code class="language-plaintext highlighter-rouge">Awaitable</code>'s <code class="language-plaintext highlighter-rouge">await_ready</code> decides to suspend and <code class="language-plaintext highlighter-rouge">await_suspend</code> is called, the caller of <code class="language-plaintext highlighter-rouge">middle_function</code> gets an <code class="language-plaintext highlighter-rouge">Async</code> object at that point. In the vast majority of implementations, the <code class="language-plaintext highlighter-rouge">Async</code> object stores the corresponding <code class="language-plaintext highlighter-rouge">coroutine_handle</code>.</p>

<p>A naive idea at this point is to resume from the outside in, that is, to call <code class="language-plaintext highlighter-rouge">resume</code> on this <code class="language-plaintext highlighter-rouge">coroutine_handle</code> to continue execution. This has several problems; keep in mind that <code class="language-plaintext highlighter-rouge">inner_function</code> has not finished yet:</p>

<ul>
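  <li>Once resumed, a coroutine cannot be suspended again (unless it reaches another <code class="language-plaintext highlighter-rouge">co_await</code>).</li>
  <li>From the outside there is no way to know how many times the inner function must be resumed before it completes.</li>
  <li>When the outer coroutine is resumed, the inner result may not be ready yet, so <code class="language-plaintext highlighter-rouge">co_return</code> cannot be executed.</li>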
</ul>

<p>In other words, when we resume the outer coroutine (<code class="language-plaintext highlighter-rouge">middle_function</code> here), we cannot guarantee that <code class="language-plaintext highlighter-rouge">inner_function</code> has finished, so we cannot obtain its return value <code class="language-plaintext highlighter-rouge">r</code>, let alone execute <code class="language-plaintext highlighter-rouge">co_return</code>. And once <code class="language-plaintext highlighter-rouge">middle_function</code> has been resumed, we no longer control whether the coroutine suspends again (that depends on whether it contains further <code class="language-plaintext highlighter-rouge">co_await</code>s).</p>

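<p>The correct <code class="language-plaintext highlighter-rouge">co_await</code> semantics are: when the <code class="language-plaintext highlighter-rouge">co_await</code>ed coroutine has completed and its result is available, wake the caller exactly once.</p>

<p>So when we suspend a coroutine-based asynchronous task, the context we need goes far beyond a single function. We need the call stack of the entire asynchronous task, and we need some way to link the coroutines that call one another so that they run in the expected order. Before looking at the coroutine-based solution, let's first see how ordinary function calls continue the unfinished part of a task.</p>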

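<p>On a function call, the callee's prologue at the assembly level saves the caller's frame base pointer <code class="language-plaintext highlighter-rouge">%rbp</code>:</p>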

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">pushq</span> <span class="o">%</span><span class="nb">rbp</span>
<span class="nf">movq</span> <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
</code></pre></div></div>

<p>When the callee returns, its epilogue restores the caller's frame base pointer <code class="language-plaintext highlighter-rouge">%rbp</code>:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">leave</span>
<span class="nf">ret</span>
</code></pre></div></div>

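<p>As the figure below shows, as calls go from the outermost function to the innermost one, the saved base pointers point from inner to outer and form a singly linked list, which is what we usually call the call stack.</p>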

<p><img src="/archive/coroutine-11.png" alt="figure" /></p>
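
<p>The linked-list shape can be modeled in a few lines. This is only a toy model, not how a real unwinder reads memory: <code class="language-plaintext highlighter-rouge">Frame</code> and <code class="language-plaintext highlighter-rouge">walk_call_stack</code> are hypothetical names, and the <code class="language-plaintext highlighter-rouge">caller</code> pointer plays the role of the saved <code class="language-plaintext highlighter-rouge">%rbp</code>.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a stack frame: `caller` stands in for the saved %rbp.
struct Frame {
    std::string function;  // function that owns this frame
    const Frame* caller;   // frame of the caller; nullptr for the outermost frame
};

// Follows the inner-to-outer links, just like walking saved base pointers.
std::vector<std::string> walk_call_stack(const Frame* innermost) {
    std::vector<std::string> names;
    for (const Frame* frame = innermost; frame != nullptr; frame = frame->caller) {
        names.push_back(frame->function);
    }
    return names;
}

// Builds the chain from the article's example and walks it from the inside out.
std::vector<std::string> example_call_stack() {
    Frame spawn{"spawn_task", nullptr};
    Frame outer{"outer_function", &spawn};
    Frame middle{"middle_function", &outer};
    Frame inner{"inner_function", &middle};
    return walk_call_stack(&inner);
}

void check_call_stack() {
    const std::vector<std::string> expected{
        "inner_function", "middle_function", "outer_function", "spawn_task"};
    assert(example_call_stack() == expected);
}
```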

<p>For ordinary functions, stack frames satisfy the following requirements:</p>

<ul>
  <li>Create the outermost function</li>
  <li>Call an inner function</li>
  <li>Resume the outer function from the inner function</li>
  <li>Pass back the return value</li>
</ul>

<p>A coroutine-based asynchronous task, however, needs to support a few additional operations:</p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Create the outermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Call an inner coroutine, and establish the callee → caller link at <code class="language-plaintext highlighter-rouge">co_await callee</code> (explained later)</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Suspend the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Resume the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Resume the outer coroutine from the inner coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Pass back the final result</li>
</ul>

<p>Next, let's see how an asynchronous task meets these requirements.</p>

<p>Let's analyze what interface this asynchronous task must provide and how that interface supports the operations above. Note that <code class="language-plaintext highlighter-rouge">middle_function</code> is itself a coroutine whose <code class="language-plaintext highlighter-rouge">ReturnType</code> is an <code class="language-plaintext highlighter-rouge">Async</code> object. Inside <code class="language-plaintext highlighter-rouge">middle_function</code> it <code class="language-plaintext highlighter-rouge">co_await</code>s another <code class="language-plaintext highlighter-rouge">Async</code> object, so <code class="language-plaintext highlighter-rouge">Async</code> is also an <code class="language-plaintext highlighter-rouge">Awaitable</code> (any object that supports <code class="language-plaintext highlighter-rouge">co_await</code> is an <code class="language-plaintext highlighter-rouge">Awaitable</code>). <code class="language-plaintext highlighter-rouge">Async</code> therefore has to play both roles: <code class="language-plaintext highlighter-rouge">ReturnType</code> and <code class="language-plaintext highlighter-rouge">Awaitable</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">inner_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">Async</span><span class="o">&lt;</span><span class="n">PartialResult</span><span class="o">&gt;</span> <span class="n">middle_function</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">inner_function</span><span class="p">();</span>
    <span class="n">IoResult</span> <span class="n">r</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">awaitable</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">PartialResult</span><span class="o">::</span><span class="n">from_io_result</span><span class="p">(</span><span class="n">r</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>As a <code class="language-plaintext highlighter-rouge">ReturnType</code>, <code class="language-plaintext highlighter-rouge">Async</code> must keep the handle of the corresponding coroutine. In the example above, the <code class="language-plaintext highlighter-rouge">awaitable</code> object, an <code class="language-plaintext highlighter-rouge">Async&lt;IoResult&gt;</code>, must store the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of <code class="language-plaintext highlighter-rouge">inner_function</code>. The constructor of <code class="language-plaintext highlighter-rouge">Async</code> therefore looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">Async</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
        <span class="n">Async</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">get_return_object</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">auto</span> <span class="n">h</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">);</span>
            <span class="k">return</span> <span class="n">Async</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">{</span><span class="n">h</span><span class="p">};</span>
        <span class="p">}</span>
    <span class="p">};</span>

    <span class="c1">// ReturnType part of Async</span>
    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">self</span><span class="p">;</span>
    <span class="n">Async</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">self</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>
<span class="p">};</span>
</code></pre></div></div>

<p>As an <code class="language-plaintext highlighter-rouge">Awaitable</code>, <code class="language-plaintext highlighter-rouge">Async</code> must be able to build the call chain between coroutines. The key is the <code class="language-plaintext highlighter-rouge">await_suspend</code> interface of the <code class="language-plaintext highlighter-rouge">Awaitable</code>. Its parameter is the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the caller, that is, the coroutine (called <code class="language-plaintext highlighter-rouge">caller</code> from here on) that <code class="language-plaintext highlighter-rouge">co_await</code>s the inner coroutine (called <code class="language-plaintext highlighter-rouge">callee</code> from here on). It is called when the <code class="language-plaintext highlighter-rouge">callee</code> is not ready yet and the <code class="language-plaintext highlighter-rouge">caller</code> wants to be suspended.</p>

<p>In an asynchronous task, the core job of <code class="language-plaintext highlighter-rouge">await_suspend</code> is to establish the link from <code class="language-plaintext highlighter-rouge">callee</code> to <code class="language-plaintext highlighter-rouge">caller</code>. More precisely, it tells the <code class="language-plaintext highlighter-rouge">callee</code>: “<code class="language-plaintext highlighter-rouge">caller</code> is the one that called you; when your result is ready, this is the coroutine to resume.”</p>

<p>So in most asynchronous task implementations, the <code class="language-plaintext highlighter-rouge">await_suspend</code> of the <code class="language-plaintext highlighter-rouge">Awaitable</code> stores the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">caller</code> in the <code class="language-plaintext highlighter-rouge">promise</code> of the <code class="language-plaintext highlighter-rouge">callee</code>. With symmetric transfer, it additionally returns the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">callee</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">Async</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
        <span class="cm">/*...*/</span>
        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">my_caller</span><span class="p">;</span>
    <span class="p">};</span>

    <span class="c1">// ReturnType part of Async</span>
    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">self</span><span class="p">;</span>

    <span class="c1">// Awaitable part of Async</span>
    <span class="kt">bool</span> <span class="nf">await_ready</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">T</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span>

    <span class="k">auto</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">handle</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">self</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">my_caller</span> <span class="o">=</span> <span class="n">handle</span><span class="p">;</span>
        <span class="c1">// Asymmetric transfer will return void</span>
        <span class="c1">// return;</span>
        <span class="c1">// Symmetric transfer will return callee's handle</span>
        <span class="k">return</span> <span class="n">self</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<blockquote>
  <p>The corresponding code in <code class="language-plaintext highlighter-rouge">folly::coro::Task</code>:</p>

  <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>
    <span class="n">FOLLY_NOINLINE</span> <span class="k">auto</span> <span class="nf">await_suspend</span><span class="p">(</span>
        <span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
      <span class="n">DCHECK</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
      <span class="k">auto</span><span class="o">&amp;</span> <span class="n">promise</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">();</span>

      <span class="n">promise</span><span class="p">.</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>

      <span class="k">auto</span><span class="o">&amp;</span> <span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">promise</span><span class="p">.</span><span class="n">getAsyncFrame</span><span class="p">();</span>
      <span class="n">calleeFrame</span><span class="p">.</span><span class="n">setReturnAddress</span><span class="p">();</span>

      <span class="k">if</span> <span class="nf">constexpr</span> <span class="p">(</span><span class="n">detail</span><span class="o">::</span><span class="n">promiseHasAsyncFrame_v</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">auto</span><span class="o">&amp;</span> <span class="n">callerFrame</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">();</span>
        <span class="n">folly</span><span class="o">::</span><span class="n">pushAsyncStackFrameCallerCallee</span><span class="p">(</span><span class="n">callerFrame</span><span class="p">,</span> <span class="n">calleeFrame</span><span class="p">);</span>
        <span class="k">return</span> <span class="n">coro_</span><span class="p">;</span>
      <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
        <span class="n">folly</span><span class="o">::</span><span class="n">resumeCoroutineWithNewAsyncStackRoot</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
        <span class="k">return</span><span class="p">;</span>
      <span class="p">}</span>
    <span class="p">}</span>
</code></pre></div>  </div>

</blockquote>

<p>At this point we have established the link from <code class="language-plaintext highlighter-rouge">callee</code> to <code class="language-plaintext highlighter-rouge">caller</code>. In the figure below, from left to right are <code class="language-plaintext highlighter-rouge">inner_function</code>/<code class="language-plaintext highlighter-rouge">middle_function</code>/<code class="language-plaintext highlighter-rouge">outer_function</code>, and the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of each <code class="language-plaintext highlighter-rouge">caller</code> has been stored in the <code class="language-plaintext highlighter-rouge">promise</code> of its <code class="language-plaintext highlighter-rouge">callee</code>.</p>

<p><img src="/archive/coroutine-12.png" alt="figure" /></p>

<p>Next, let's see how to resume the outer coroutine from the inner one. The previous post already explained this in detail: it is done at the <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code> stage, through symmetric transfer in <code class="language-plaintext highlighter-rouge">Awaitable::await_suspend</code>.</p>

<ol>
  <li>
    <p>First, when the <code class="language-plaintext highlighter-rouge">callee</code> (<code class="language-plaintext highlighter-rouge">inner_function</code> here) finishes, it passes its return value through <code class="language-plaintext highlighter-rouge">promise.return_value</code> or <code class="language-plaintext highlighter-rouge">promise.return_void</code>. It then reaches the final suspend point, <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code>, which returns an <code class="language-plaintext highlighter-rouge">Awaitable</code> object (<code class="language-plaintext highlighter-rouge">ResumeCaller</code> in the figure below).</p>

    <p><img src="/archive/coroutine-13.png" alt="figure" /></p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">Awaitable::await_suspend</code> then performs a symmetric transfer to the <code class="language-plaintext highlighter-rouge">caller</code> (here <code class="language-plaintext highlighter-rouge">middle_function</code>).</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">struct</span> <span class="nc">promise_type</span><span class="p">{</span>
     <span class="cm">/*...*/</span>
     <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">my_caller</span><span class="p">;</span>
     <span class="k">auto</span> <span class="n">final_suspend</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
         <span class="k">return</span> <span class="n">ResumeCaller</span><span class="p">{};</span>
     <span class="p">}</span>
 <span class="p">};</span>

 <span class="k">struct</span> <span class="nc">ResumeCaller</span> <span class="p">{</span>
     <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span> <span class="p">}</span>
     <span class="kt">void</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span> <span class="cm">/* will never be called! */</span> <span class="p">}</span>
     <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
         <span class="c1">// Symmetric Transfer!</span>
         <span class="k">return</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">my_caller</span><span class="p">;</span>
     <span class="p">}</span>
 <span class="p">};</span>
</code></pre></div>    </div>

    <p><img src="/archive/coroutine-14.png" alt="figure" /></p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">middle_function</code> then obtains the return value through <code class="language-plaintext highlighter-rouge">await_resume</code> and continues with the rest of its logic. At this point <code class="language-plaintext highlighter-rouge">inner_function</code> is waiting to be destroyed.</p>

    <p><img src="/archive/coroutine-15.png" alt="figure" /></p>
  </li>
  <li>
    <p>When <code class="language-plaintext highlighter-rouge">middle_function</code> finishes, it destroys the coroutine that corresponds to <code class="language-plaintext highlighter-rouge">inner_function</code>.</p>

    <p><img src="/archive/coroutine-16.png" alt="figure" /></p>
  </li>
</ol>

<p>At this point we have met two of the requirements for coroutines.</p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Create the outermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Call an inner coroutine, establishing the callee → caller relationship at the call</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Suspend the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Resume the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Resume the outer coroutine from the inner coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Deliver the final result</li>
</ul>

<h3 id="2-how-to-perform-async-io-and-resume-innermost-coroutine">2. How to perform async io and resume innermost coroutine</h3>

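<p>The next problem is how to resume the innermost coroutine, the one that performs the IO. Because the actual IO is carried out by the operating system, there has to be some mechanism that resumes the coroutine once the OS notifies the user-space process that the IO has completed.</p>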

<p>In short, at the OS level asynchronous IO is usually done through <code class="language-plaintext highlighter-rouge">libaio</code> or <code class="language-plaintext highlighter-rouge">io_uring</code>, and base libraries typically wrap these to offer easier-to-use async IO interfaces, such as <code class="language-plaintext highlighter-rouge">boost::asio</code> and <code class="language-plaintext highlighter-rouge">folly::SimpleAsyncIO</code>. Below we walk through the main code path of <code class="language-plaintext highlighter-rouge">folly::SimpleAsyncIO</code> to understand how the innermost IO coroutine is resumed after the async IO completes.</p>

<ol>
  <li>
    <p>User code performs an IO operation inside a coroutine. <code class="language-plaintext highlighter-rouge">SimpleAsyncIO</code> uses <code class="language-plaintext highlighter-rouge">libaio</code> by default for the async IO, and <code class="language-plaintext highlighter-rouge">coro::Baton</code> is used to suspend and resume the coroutine.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">Async</span><span class="o">&lt;</span><span class="n">IoResult</span><span class="o">&gt;</span> <span class="n">inner_function</span><span class="p">(</span><span class="kt">int</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span> <span class="n">buf</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">size</span><span class="p">,</span> <span class="kt">off_t</span> <span class="n">start</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">SimpleAsyncIO</span> <span class="n">aio</span><span class="p">;</span>
     <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">aio</span><span class="p">.</span><span class="n">co_pread</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">start</span><span class="p">);</span>
     <span class="k">co_return</span> <span class="n">IoResult</span><span class="o">::</span><span class="n">from</span><span class="p">(</span><span class="n">result</span><span class="p">);</span>
 <span class="p">}</span>

 <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">SimpleAsyncIO</span><span class="o">::</span><span class="n">co_pread</span><span class="p">(</span><span class="kt">int</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span> <span class="n">buf</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">size</span><span class="p">,</span> <span class="kt">off_t</span> <span class="n">start</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Baton</span> <span class="n">done</span><span class="p">;</span>
     <span class="kt">int</span> <span class="n">result</span><span class="p">;</span>
     <span class="n">pread</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">buf</span><span class="p">,</span> <span class="n">size</span><span class="p">,</span> <span class="n">start</span><span class="p">,</span> <span class="p">[</span><span class="o">&amp;</span><span class="n">done</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">result</span><span class="p">](</span><span class="kt">int</span> <span class="n">rc</span><span class="p">)</span> <span class="p">{</span>
         <span class="n">result</span> <span class="o">=</span> <span class="n">rc</span><span class="p">;</span>
         <span class="n">done</span><span class="p">.</span><span class="n">post</span><span class="p">();</span>
     <span class="p">});</span>
     <span class="k">co_await</span> <span class="n">done</span><span class="p">;</span>
     <span class="k">co_return</span> <span class="n">result</span><span class="p">;</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
  <li>
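    <p>In essence, this registers a callback for IO completion and then submits the async IO request to the operating system. That is, <code class="language-plaintext highlighter-rouge">setNotificationCallback</code> arranges for the user's callback (the one that calls <code class="language-plaintext highlighter-rouge">done.post()</code> in the code above) to be scheduled on an Executor when the IO completes.</p>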

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kt">void</span> <span class="n">SimpleAsyncIO</span><span class="o">::</span><span class="n">submitOp</span><span class="p">(</span><span class="n">Function</span><span class="o">&lt;</span><span class="kt">void</span><span class="p">(</span><span class="n">AsyncBaseOp</span><span class="o">*</span><span class="p">)</span><span class="o">&gt;</span> <span class="n">preparer</span><span class="p">,</span>
                              <span class="n">SimpleAsyncIOCompletor</span> <span class="n">completor</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">std</span><span class="o">::</span><span class="n">unique_ptr</span><span class="o">&lt;</span><span class="n">AsyncBaseOp</span><span class="o">&gt;</span> <span class="n">opHolder</span> <span class="o">=</span> <span class="n">getOp</span><span class="p">();</span>
     <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">opHolder</span><span class="p">)</span> <span class="p">{</span>
         <span class="n">completor</span><span class="p">(</span><span class="o">-</span><span class="n">EBUSY</span><span class="p">);</span>
         <span class="k">return</span><span class="p">;</span>
     <span class="p">}</span>

     <span class="c1">// Grab a raw pointer to the op before we create the completion lambda,</span>
     <span class="c1">// since we move the unique_ptr into the lambda and can no longer access</span>
     <span class="c1">// it.</span>
     <span class="n">AsyncBaseOp</span><span class="o">*</span> <span class="n">op</span> <span class="o">=</span> <span class="n">opHolder</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>

     <span class="n">preparer</span><span class="p">(</span><span class="n">op</span><span class="p">);</span>

     <span class="n">op</span><span class="o">-&gt;</span><span class="n">setNotificationCallback</span><span class="p">([</span><span class="k">this</span><span class="p">,</span>
                                  <span class="n">completor</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">completor</span><span class="p">)},</span>
                                  <span class="n">opHolder</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">opHolder</span><span class="p">)}](</span><span class="n">AsyncBaseOp</span><span class="o">*</span> <span class="n">op_</span><span class="p">)</span> <span class="k">mutable</span> <span class="p">{</span>
         <span class="n">CHECK</span><span class="p">(</span><span class="n">op_</span> <span class="o">==</span> <span class="n">opHolder</span><span class="p">.</span><span class="n">get</span><span class="p">());</span>
         <span class="kt">int</span> <span class="n">rc</span> <span class="o">=</span> <span class="n">op_</span><span class="o">-&gt;</span><span class="n">result</span><span class="p">();</span>

         <span class="c1">// When the IO completes, schedule the user callback on the thread pool</span>
         <span class="n">completionExecutor_</span><span class="o">-&gt;</span><span class="n">add</span><span class="p">(</span>
                 <span class="p">[</span><span class="n">rc</span><span class="p">,</span> <span class="n">completor</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">completor</span><span class="p">)}]()</span> <span class="k">mutable</span> <span class="p">{</span> <span class="n">completor</span><span class="p">(</span><span class="n">rc</span><span class="p">);</span> <span class="p">});</span>

         <span class="n">putOp</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">opHolder</span><span class="p">));</span>
     <span class="p">});</span>
     <span class="n">asyncIO_</span><span class="o">-&gt;</span><span class="n">submit</span><span class="p">(</span><span class="n">op</span><span class="p">);</span> <span class="c1">// Submit to the kernel (libaio/io_uring)</span>
 <span class="p">}</span>
</code></pre></div>    </div>

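    <p>We do not care how the async IO itself is carried out. The focus is on how the IO completion event is obtained from the system, so that this callback runs and the coroutine is resumed.</p>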
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">SimpleAsyncIO</code> registers an event listener in its constructor: when IO completes, the kernel makes the fd returned by <code class="language-plaintext highlighter-rouge">pollFd()</code> readable, and the <code class="language-plaintext highlighter-rouge">libeventCallback</code> callback is invoked.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">SimpleAsyncIO</span><span class="o">::</span><span class="n">SimpleAsyncIO</span><span class="p">(</span><span class="n">Config</span> <span class="n">cfg</span><span class="p">)</span>
         <span class="o">:</span> <span class="n">maxRequests_</span><span class="p">(</span><span class="n">cfg</span><span class="p">.</span><span class="n">maxRequests_</span><span class="p">),</span>
           <span class="n">completionExecutor_</span><span class="p">(</span><span class="n">cfg</span><span class="p">.</span><span class="n">completionExecutor_</span><span class="p">),</span>
           <span class="n">terminating_</span><span class="p">(</span><span class="nb">false</span><span class="p">)</span> <span class="p">{</span>
     <span class="c1">// ...</span>
     <span class="k">if</span> <span class="p">(</span><span class="n">cfg</span><span class="p">.</span><span class="n">evb_</span><span class="p">)</span> <span class="p">{</span>
         <span class="n">initHandler</span><span class="p">(</span><span class="n">cfg</span><span class="p">.</span><span class="n">evb_</span><span class="p">,</span> <span class="n">NetworkSocket</span><span class="o">::</span><span class="n">fromFd</span><span class="p">(</span><span class="n">asyncIO_</span><span class="o">-&gt;</span><span class="n">pollFd</span><span class="p">()));</span>
     <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
         <span class="n">evb_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">make_unique</span><span class="o">&lt;</span><span class="n">ScopedEventBaseThread</span><span class="o">&gt;</span><span class="p">();</span>
         <span class="n">initHandler</span><span class="p">(</span><span class="n">evb_</span><span class="o">-&gt;</span><span class="n">getEventBase</span><span class="p">(),</span> <span class="n">NetworkSocket</span><span class="o">::</span><span class="n">fromFd</span><span class="p">(</span><span class="n">asyncIO_</span><span class="o">-&gt;</span><span class="n">pollFd</span><span class="p">()));</span>
     <span class="p">}</span>
     <span class="n">registerHandler</span><span class="p">(</span><span class="n">EventHandler</span><span class="o">::</span><span class="n">READ</span> <span class="o">|</span> <span class="n">EventHandler</span><span class="o">::</span><span class="n">PERSIST</span><span class="p">);</span>
 <span class="p">}</span>

 <span class="kt">void</span> <span class="n">EventHandler</span><span class="o">::</span><span class="n">initHandler</span><span class="p">(</span><span class="n">EventBase</span><span class="o">*</span> <span class="n">eventBase</span><span class="p">,</span> <span class="n">NetworkSocket</span> <span class="n">fd</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">ensureNotRegistered</span><span class="p">(</span><span class="n">__func__</span><span class="p">);</span>
     <span class="n">event_</span><span class="p">.</span><span class="n">eb_event_set</span><span class="p">(</span><span class="n">fd</span><span class="p">.</span><span class="n">data</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">EventHandler</span><span class="o">::</span><span class="n">libeventCallback</span><span class="p">,</span> <span class="k">this</span><span class="p">);</span>
     <span class="n">setEventBase</span><span class="p">(</span><span class="n">eventBase</span><span class="p">);</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">EventBase</code> inside <code class="language-plaintext highlighter-rouge">SimpleAsyncIO</code> runs a while loop that keeps checking for completed events (see <code class="language-plaintext highlighter-rouge">EventBase::loop()</code>) and eventually schedules the user-space callback. The full call chain is:</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">EventBase</span><span class="o">::</span><span class="n">loop</span><span class="p">()</span> <span class="c1">// detects that pollFd is readable</span>
 <span class="err">└─</span> <span class="n">libeventCallback</span> <span class="c1">// libevent invokes this callback</span>
    <span class="err">└─</span> <span class="n">SimpleAsyncIO</span><span class="o">::</span><span class="n">handlerReady</span><span class="p">()</span>
       <span class="err">└─</span> <span class="n">AsyncBase</span><span class="o">::</span><span class="n">pollCompleted</span><span class="p">()</span>
          <span class="err">└─</span> <span class="n">AsyncBase</span><span class="o">::</span><span class="n">doWait</span><span class="p">()</span>
             <span class="err">└─</span> <span class="n">AsyncBase</span><span class="o">::</span><span class="n">complete</span><span class="p">(</span><span class="n">op</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
                <span class="err">└─</span> <span class="n">AsyncBaseOp</span><span class="o">::</span><span class="n">complete</span><span class="p">()</span>
                   <span class="err">└─</span> <span class="n">cb_</span><span class="p">(</span><span class="k">this</span><span class="p">)</span>  <span class="c1">// runs the callback installed earlier via setNotificationCallback</span>
                      <span class="err">└─</span> <span class="n">completionExecutor_</span><span class="o">-&gt;</span><span class="n">add</span><span class="p">(...)</span>  <span class="c1">// schedules the user-space callback, i.e. Baton.post()</span>
</code></pre></div>    </div>

    <p>The key functions are shown below:</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kt">void</span> <span class="n">EventHandler</span><span class="o">::</span><span class="n">libeventCallback</span><span class="p">(</span><span class="n">libevent_fd_t</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">short</span> <span class="n">events</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span> <span class="n">arg</span><span class="p">)</span> <span class="p">{</span>
   <span class="k">auto</span> <span class="n">handler</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">EventHandler</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span>
   <span class="n">assert</span><span class="p">(</span><span class="n">fd</span> <span class="o">==</span> <span class="n">handler</span><span class="o">-&gt;</span><span class="n">event_</span><span class="p">.</span><span class="n">eb_ev_fd</span><span class="p">());</span>
   <span class="p">(</span><span class="kt">void</span><span class="p">)</span><span class="n">fd</span><span class="p">;</span> <span class="c1">// prevent unused variable warnings</span>

   <span class="k">auto</span> <span class="n">observer</span> <span class="o">=</span> <span class="n">handler</span><span class="o">-&gt;</span><span class="n">eventBase_</span><span class="o">-&gt;</span><span class="n">getExecutionObserver</span><span class="p">();</span>
   <span class="k">if</span> <span class="p">(</span><span class="n">observer</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">observer</span><span class="o">-&gt;</span><span class="n">starting</span><span class="p">(</span><span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="kt">uintptr_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">handler</span><span class="p">));</span>
   <span class="p">}</span>

   <span class="c1">// this can't possibly fire if handler-&gt;eventBase_ is nullptr</span>
   <span class="n">handler</span><span class="o">-&gt;</span><span class="n">eventBase_</span><span class="o">-&gt;</span><span class="n">bumpHandlingTime</span><span class="p">();</span>

   <span class="n">handler</span><span class="o">-&gt;</span><span class="n">handlerReady</span><span class="p">(</span><span class="kt">uint16_t</span><span class="p">(</span><span class="n">events</span><span class="p">));</span>

   <span class="k">if</span> <span class="p">(</span><span class="n">observer</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">observer</span><span class="o">-&gt;</span><span class="n">stopped</span><span class="p">(</span><span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="kt">uintptr_t</span><span class="o">&gt;</span><span class="p">(</span><span class="n">handler</span><span class="p">));</span>
   <span class="p">}</span>
 <span class="p">}</span>

 <span class="kt">void</span> <span class="n">SimpleAsyncIO</span><span class="o">::</span><span class="n">handlerReady</span><span class="p">(</span><span class="kt">uint16_t</span> <span class="n">events</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="k">if</span> <span class="p">(</span><span class="n">events</span> <span class="o">&amp;</span> <span class="n">EventHandler</span><span class="o">::</span><span class="n">READ</span><span class="p">)</span> <span class="p">{</span>
         <span class="c1">// All the work (including putting op back on free list) happens in the</span>
         <span class="c1">// notificationCallback, so we can simply drop the ops returned from</span>
         <span class="c1">// pollCompleted. But we must still call it or ops never complete.</span>
         <span class="k">while</span> <span class="p">(</span><span class="n">asyncIO_</span><span class="o">-&gt;</span><span class="n">pollCompleted</span><span class="p">().</span><span class="n">size</span><span class="p">())</span> <span class="p">{</span>
             <span class="p">;</span>
         <span class="p">}</span>
     <span class="p">}</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
</ol>

<p>So at this point we know how to issue async IO from a coroutine and suspend, and how to wake the coroutine up to continue once the IO completes.</p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Create the outermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Call an inner coroutine, establishing the callee → caller relationship at co_await callee</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Suspend the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Resume the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Resume the outer coroutine from the inner coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />Deliver the final result</li>
</ul>

<h3 id="3-symmetric-transfer-reviewed">3. Symmetric transfer reviewed</h3>

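<p>Before tackling the remaining problems, let's revisit one question: when an async task runs other async tasks, how do we make all of them execute in the same context, for example by scheduling both <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code> onto the same thread pool? Or, more generally, how can the <code class="language-plaintext highlighter-rouge">caller</code> pass arbitrary information to the <code class="language-plaintext highlighter-rouge">callee</code>?</p>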

<p>The answer is the same as for establishing the <code class="language-plaintext highlighter-rouge">callee -&gt; caller</code> link: when the <code class="language-plaintext highlighter-rouge">caller</code> executes <code class="language-plaintext highlighter-rouge">co_await callee()</code>, the <code class="language-plaintext highlighter-rouge">await_suspend</code> of the async task's <code class="language-plaintext highlighter-rouge">Awaitable</code> can copy arbitrary information from the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">promise</code> object to the <code class="language-plaintext highlighter-rouge">callee</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">auto</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// Record the caller so that final_suspend can transfer back to it.</span>
    <span class="n">self</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">my_caller</span> <span class="o">=</span> <span class="n">handle</span><span class="p">;</span>
    <span class="c1">// Copy extra context from the caller's promise (this needs a typed handle).</span>
    <span class="n">self</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">some_thing_else</span> <span class="o">=</span> <span class="n">handle</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">some_thing_else</span><span class="p">;</span>
    <span class="c1">// Symmetric transfer: resume the callee.</span>
    <span class="k">return</span> <span class="n">self</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Async tasks implemented with coroutines generally hand control to the <code class="language-plaintext highlighter-rouge">callee</code> through symmetric transfer. Why? The key point is that when the <code class="language-plaintext highlighter-rouge">caller</code> executes <code class="language-plaintext highlighter-rouge">co_await callee()</code>, the <code class="language-plaintext highlighter-rouge">callee</code> has been created but is still suspended, and <code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">false</code>. Note that at this point the <code class="language-plaintext highlighter-rouge">callee</code> may still be missing some context. Immediately afterwards, in <code class="language-plaintext highlighter-rouge">await_suspend</code>, the <code class="language-plaintext highlighter-rouge">caller</code> hands that context to the <code class="language-plaintext highlighter-rouge">callee</code>.</p>

<p>Let's look at what <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code> each expect at this moment:</p>

<ul>
  <li>The <code class="language-plaintext highlighter-rouge">caller</code> wants to be suspended, because the <code class="language-plaintext highlighter-rouge">callee</code> has not finished yet.</li>
  <li>The <code class="language-plaintext highlighter-rouge">callee</code> is already suspended, but it wants to start running.</li>
</ul>

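<p>Therefore, with symmetric transfer, once the context has been passed along, <code class="language-plaintext highlighter-rouge">await_suspend</code> should return the <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> so that the <code class="language-plaintext highlighter-rouge">callee</code> resumes (with asymmetric transfer you would call <code class="language-plaintext highlighter-rouge">resume</code> manually). Seen from another angle, it is this need to pass context, combined with the wish for coroutine control flow to look almost like an ordinary function call, that makes symmetric transfer a natural low-level coroutine mechanism for C++ to provide.</p>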

<h3 id="4-how-to-spawn-a-async-task">4. How to spawn an async task</h3>

<p>Finally, there is the question of how to start an async task and obtain its final result. We use <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> as the example here. <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> starts lazily (the <code class="language-plaintext highlighter-rouge">initial_suspend</code> of its <code class="language-plaintext highlighter-rouge">promise_type</code> returns <code class="language-plaintext highlighter-rouge">suspend_always</code>).</p>

<p>There are two ways to start a <code class="language-plaintext highlighter-rouge">Task</code>:</p>

<p>The first is to start it from a coroutine, that is, to <code class="language-plaintext highlighter-rouge">co_await</code> a Task from inside another <code class="language-plaintext highlighter-rouge">Task</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">callee</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">co_return</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">callee</span><span class="p">();</span>
  <span class="k">co_return</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The main flow is as follows:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">caller</code> reaches <code class="language-plaintext highlighter-rouge">co_await callee()</code></li>
  <li>
    <p>The compiler calls <code class="language-plaintext highlighter-rouge">caller.promise.await_transform(callee())</code> to obtain the corresponding <code class="language-plaintext highlighter-rouge">Awaiter</code>. Note that inside <code class="language-plaintext highlighter-rouge">co_viaIfAsync</code>, the child task's <code class="language-plaintext highlighter-rouge">Executor</code> is set to the parent task's <code class="language-plaintext highlighter-rouge">Executor</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Awaitable</span><span class="p">&gt;</span>
 <span class="k">auto</span> <span class="nf">await_transform</span><span class="p">(</span><span class="n">Awaitable</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span><span class="p">)</span> <span class="p">{</span>
     <span class="n">bypassExceptionThrowing_</span> <span class="o">=</span> <span class="n">bypassExceptionThrowing_</span> <span class="o">==</span> <span class="n">BypassExceptionThrowing</span><span class="o">::</span><span class="n">REQUESTED</span>
                                        <span class="o">?</span> <span class="n">BypassExceptionThrowing</span><span class="o">::</span><span class="n">ACTIVE</span>
                                        <span class="o">:</span> <span class="n">BypassExceptionThrowing</span><span class="o">::</span><span class="n">INACTIVE</span><span class="p">;</span>

     <span class="k">return</span> <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">co_withAsyncStack</span><span class="p">(</span><span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">co_viaIfAsync</span><span class="p">(</span>
             <span class="n">executor_</span><span class="p">.</span><span class="n">get_alias</span><span class="p">(),</span>
             <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">co_withCancellation</span><span class="p">(</span><span class="n">cancelToken_</span><span class="p">,</span>
                                              <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">))));</span>
 <span class="p">}</span>

 <span class="k">friend</span> <span class="k">auto</span> <span class="nf">co_viaIfAsync</span><span class="p">(</span><span class="n">Executor</span><span class="o">::</span><span class="n">KeepAlive</span><span class="o">&lt;&gt;</span> <span class="n">executor</span><span class="p">,</span> <span class="n">Task</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&amp;&amp;</span> <span class="n">t</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">coro_</span><span class="p">);</span>
     <span class="c1">// Child task inherits the awaiting task's executor</span>
     <span class="n">t</span><span class="p">.</span><span class="n">setExecutor</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">executor</span><span class="p">));</span>
     <span class="k">return</span> <span class="n">Awaiter</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">coro_</span><span class="p">,</span> <span class="p">{})};</span>
 <span class="p">}</span>

 <span class="kt">void</span> <span class="nf">setExecutor</span><span class="p">(</span><span class="n">folly</span><span class="o">::</span><span class="n">Executor</span><span class="o">::</span><span class="n">KeepAlive</span><span class="o">&lt;&gt;&amp;&amp;</span> <span class="n">e</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
     <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">executor_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
 <span class="p">}</span>

</code></pre></div>    </div>
  </li>
  <li>The compiler calls <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code>, which is a standard symmetric-transfer implementation:
    <ol>
      <li>Save the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> in the <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">promise</code></li>
      <li>Return the <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, which resumes the callee</li>
      <li>Notably, it also sets up an <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> that records the call relationship between coroutines; the next post covers this in detail</li>
    </ol>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>
 <span class="n">FOLLY_NOINLINE</span> <span class="k">auto</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
     <span class="k">auto</span><span class="o">&amp;</span> <span class="n">promise</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">();</span>

     <span class="c1">// 1. Save the parent coroutine's coroutine_handle</span>
     <span class="n">promise</span><span class="p">.</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>

     <span class="c1">// 2. Set up the AsyncStackFrame</span>
     <span class="k">auto</span><span class="o">&amp;</span> <span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">promise</span><span class="p">.</span><span class="n">getAsyncFrame</span><span class="p">();</span>
     <span class="n">calleeFrame</span><span class="p">.</span><span class="n">setReturnAddress</span><span class="p">();</span>

     <span class="k">if</span> <span class="nf">constexpr</span> <span class="p">(</span><span class="n">detail</span><span class="o">::</span><span class="n">promiseHasAsyncFrame_v</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span><span class="p">)</span> <span class="p">{</span>
         <span class="k">auto</span><span class="o">&amp;</span> <span class="n">callerFrame</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">();</span>
         <span class="c1">// If the parent coroutine also has an AsyncStackFrame, link the child into the call chain</span>
         <span class="n">folly</span><span class="o">::</span><span class="n">pushAsyncStackFrameCallerCallee</span><span class="p">(</span><span class="n">callerFrame</span><span class="p">,</span> <span class="n">calleeFrame</span><span class="p">);</span>
         <span class="k">return</span> <span class="n">coro_</span><span class="p">;</span>
     <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
         <span class="n">folly</span><span class="o">::</span><span class="n">resumeCoroutineWithNewAsyncStackRoot</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
         <span class="k">return</span><span class="p">;</span>
     <span class="p">}</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
  <li>The <code class="language-plaintext highlighter-rouge">callee</code> then starts running on the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">Executor</code></li>
</ol>
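
<p>The symmetric-transfer handshake described in the steps above can be sketched with standard C++20 coroutines alone. This is a minimal illustration, not folly's implementation: <code class="language-plaintext highlighter-rouge">MiniTask</code>, <code class="language-plaintext highlighter-rouge">FinalAwaiter</code> and <code class="language-plaintext highlighter-rouge">runToCompletion</code> are hypothetical names, and executors, cancellation and <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> are left out.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;coroutine&gt;
#include &lt;exception&gt;
#include &lt;utility&gt;

// Hypothetical minimal task type for illustration; this is not folly::coro::Task.
struct MiniTask {
    struct promise_type {
        std::coroutine_handle&lt;&gt; continuation;  // the caller's handle
        int value = 0;

        MiniTask get_return_object() {
            return MiniTask{std::coroutine_handle&lt;promise_type&gt;::from_promise(*this)};
        }
        // Lazy start: the body runs only once someone resumes the coroutine.
        std::suspend_always initial_suspend() noexcept { return {}; }

        struct FinalAwaiter {
            bool await_ready() const noexcept { return false; }
            // When the callee finishes, transfer control back to the caller.
            std::coroutine_handle&lt;&gt; await_suspend(
                    std::coroutine_handle&lt;promise_type&gt; h) const noexcept {
                if (auto caller = h.promise().continuation) {
                    return caller;
                }
                return std::noop_coroutine();  // top-level task: nothing to resume
            }
            void await_resume() const noexcept {}
        };
        FinalAwaiter final_suspend() noexcept { return {}; }
        void return_value(int v) noexcept { value = v; }
        void unhandled_exception() { std::terminate(); }
    };

    explicit MiniTask(std::coroutine_handle&lt;promise_type&gt; h) : coro(h) {}
    MiniTask(MiniTask&amp;&amp; other) noexcept : coro(std::exchange(other.coro, {})) {}
    ~MiniTask() {
        if (coro) coro.destroy();
    }

    // The task acts as its own awaiter to keep the sketch short.
    bool await_ready() const noexcept { return false; }
    std::coroutine_handle&lt;&gt; await_suspend(std::coroutine_handle&lt;&gt; caller) noexcept {
        coro.promise().continuation = caller;  // step 1: remember the caller
        return coro;                           // step 2: symmetric transfer to the callee
    }
    int await_resume() noexcept { return coro.promise().value; }

    std::coroutine_handle&lt;promise_type&gt; coro;
};

MiniTask callee() { co_return 42; }

MiniTask caller() {
    int v = co_await callee();
    co_return v + 1;
}

// Drives a top-level task from a non-coroutine function.
int runToCompletion(MiniTask task) {
    task.coro.resume();
    assert(task.coro.done());
    return task.coro.promise().value;
}
</code></pre></div></div>

<p>Because both <code class="language-plaintext highlighter-rouge">await_suspend</code> functions return a handle instead of calling <code class="language-plaintext highlighter-rouge">resume()</code>, switching between caller and callee does not nest a new stack frame for each transfer.</p>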

<p>The second way is to start a coroutine from a non-coroutine function:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">myTask</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">co_return</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">normalFunction</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">auto</span> <span class="n">future</span> <span class="o">=</span> <span class="n">myTask</span><span class="p">()</span>
      <span class="p">.</span><span class="n">scheduleOn</span><span class="p">(</span><span class="n">executor</span><span class="p">)</span>
      <span class="p">.</span><span class="n">start</span><span class="p">();</span>
  <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">future</span><span class="p">).</span><span class="n">get</span><span class="p">();</span>

  <span class="c1">// or</span>
  <span class="c1">// int result = folly::coro::blockingWait(myTask().scheduleOn(executor));</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The main steps are:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">Task::scheduleOn(executor)</code> returns a <code class="language-plaintext highlighter-rouge">TaskWithExecutor</code> object; the Executor is stored in the coroutine's promise.</li>
  <li>
    <p>Calling <code class="language-plaintext highlighter-rouge">TaskWithExecutor::start()</code> launches a helper coroutine, <code class="language-plaintext highlighter-rouge">startImpl</code></p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">FOLLY_NOINLINE</span> <span class="n">SemiFuture</span><span class="o">&lt;</span><span class="n">lift_unit_t</span><span class="o">&lt;</span><span class="n">StorageType</span><span class="o">&gt;&gt;</span> <span class="n">start</span><span class="p">()</span> <span class="o">&amp;&amp;</span> <span class="p">{</span>
     <span class="n">folly</span><span class="o">::</span><span class="n">Promise</span><span class="o">&lt;</span><span class="n">lift_unit_t</span><span class="o">&lt;</span><span class="n">StorageType</span><span class="o">&gt;&gt;</span> <span class="n">p</span><span class="p">;</span>

     <span class="k">auto</span> <span class="n">sf</span> <span class="o">=</span> <span class="n">p</span><span class="p">.</span><span class="n">getSemiFuture</span><span class="p">();</span>

     <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">).</span><span class="n">startImpl</span><span class="p">(</span>
             <span class="p">[</span><span class="n">promise</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">p</span><span class="p">)](</span><span class="n">Try</span><span class="o">&lt;</span><span class="n">StorageType</span><span class="o">&gt;&amp;&amp;</span> <span class="n">result</span><span class="p">)</span> <span class="k">mutable</span> <span class="p">{</span>
                 <span class="n">promise</span><span class="p">.</span><span class="n">setTry</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">result</span><span class="p">));</span>
             <span class="p">},</span>
             <span class="n">folly</span><span class="o">::</span><span class="n">CancellationToken</span><span class="p">{},</span>
             <span class="n">FOLLY_ASYNC_STACK_RETURN_ADDRESS</span><span class="p">());</span>

     <span class="k">return</span> <span class="n">sf</span><span class="p">;</span>
 <span class="p">}</span>

 <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">F</span><span class="p">&gt;</span>
 <span class="n">detail</span><span class="o">::</span><span class="n">InlineTaskDetached</span> <span class="nf">startImpl</span><span class="p">(</span><span class="n">TaskWithExecutor</span> <span class="n">task</span><span class="p">,</span> <span class="n">F</span> <span class="n">cb</span><span class="p">)</span> <span class="p">{</span>
     <span class="k">try</span> <span class="p">{</span>
         <span class="n">cb</span><span class="p">(</span><span class="k">co_await</span> <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">co_awaitTry</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">task</span><span class="p">)));</span>
     <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
         <span class="n">cb</span><span class="p">(</span><span class="n">Try</span><span class="o">&lt;</span><span class="n">StorageType</span><span class="o">&gt;</span><span class="p">(</span><span class="n">exception_wrapper</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">current_exception</span><span class="p">())));</span>
     <span class="p">}</span>
 <span class="p">}</span>
</code></pre></div>    </div>

    <p>The <code class="language-plaintext highlighter-rouge">startImpl</code> coroutine simply <code class="language-plaintext highlighter-rouge">co_await</code>s the <code class="language-plaintext highlighter-rouge">TaskWithExecutor</code> object. The key step is in <code class="language-plaintext highlighter-rouge">TaskWithExecutor::Awaiter::await_suspend</code>, which calls <code class="language-plaintext highlighter-rouge">promise.executor_-&gt;add(...)</code> to schedule the coroutine to start on the given <code class="language-plaintext highlighter-rouge">Executor</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>
 <span class="kt">void</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
     <span class="k">auto</span><span class="o">&amp;</span> <span class="n">promise</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">();</span>
     <span class="n">DCHECK</span><span class="p">(</span><span class="n">promise</span><span class="p">.</span><span class="n">executor_</span><span class="p">);</span>

     <span class="k">auto</span><span class="o">&amp;</span> <span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">promise</span><span class="p">.</span><span class="n">getAsyncFrame</span><span class="p">();</span>
     <span class="n">calleeFrame</span><span class="p">.</span><span class="n">setReturnAddress</span><span class="p">();</span>

     <span class="n">promise</span><span class="p">.</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>

     <span class="n">promise</span><span class="p">.</span><span class="n">executor_</span><span class="o">-&gt;</span><span class="n">add</span><span class="p">([</span><span class="n">coro</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">,</span> <span class="n">ctx</span> <span class="o">=</span> <span class="n">RequestContext</span><span class="o">::</span><span class="n">saveContext</span><span class="p">()]()</span> <span class="k">mutable</span> <span class="p">{</span>
         <span class="n">RequestContextScopeGuard</span> <span class="n">contextScope</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">ctx</span><span class="p">)};</span>
         <span class="n">folly</span><span class="o">::</span><span class="n">resumeCoroutineWithNewAsyncStackRoot</span><span class="p">(</span><span class="n">coro</span><span class="p">);</span>
     <span class="p">});</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
</ol>
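
<p>The core of this second path, handing the coroutine's resumption to an executor from inside <code class="language-plaintext highlighter-rouge">await_suspend</code> and reporting the result through a callback, can be sketched in standard C++20. Everything here is a hypothetical stand-in: <code class="language-plaintext highlighter-rouge">ToyExecutor</code> is a plain queue rather than a folly executor, <code class="language-plaintext highlighter-rouge">Detached</code> loosely plays the role of the detached helper coroutine, and <code class="language-plaintext highlighter-rouge">ScheduleOn</code> is an invented awaiter.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;coroutine&gt;
#include &lt;deque&gt;
#include &lt;exception&gt;
#include &lt;functional&gt;
#include &lt;utility&gt;

// Hypothetical single-threaded executor: add() queues work, drain() runs it.
struct ToyExecutor {
    std::deque&lt;std::function&lt;void()&gt;&gt; queue;

    void add(std::function&lt;void()&gt; fn) { queue.push_back(std::move(fn)); }

    void drain() {
        while (!queue.empty()) {
            auto fn = std::move(queue.front());
            queue.pop_front();
            fn();
        }
    }
};

// Fire-and-forget coroutine type: starts eagerly and frees its own frame at the end.
struct Detached {
    struct promise_type {
        Detached get_return_object() noexcept { return {}; }
        std::suspend_never initial_suspend() noexcept { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() noexcept {}
        void unhandled_exception() { std::terminate(); }
    };
};

// Awaiter that suspends the coroutine and asks the executor to resume it later.
struct ScheduleOn {
    ToyExecutor* executor;

    bool await_ready() const noexcept { return false; }
    void await_suspend(std::coroutine_handle&lt;&gt; h) const {
        executor-&gt;add([h] { h.resume(); });
    }
    void await_resume() const noexcept {}
};

// Helper coroutine: hop onto the executor, compute, then report via the callback.
template &lt;typename Callback&gt;
Detached startOn(ToyExecutor* executor, int input, Callback cb) {
    co_await ScheduleOn{executor};
    cb(input * 2);
}

int startAndDrain(int input) {
    ToyExecutor executor;
    int result = -1;
    startOn(&amp;executor, input, [&amp;result](int v) { result = v; });
    assert(result == -1);                // nothing past the co_await has run yet
    assert(executor.queue.size() == 1);  // the resumption is waiting in the queue
    executor.drain();                    // the "executor thread" resumes the coroutine
    return result;
}
</code></pre></div></div>

<p>The ordinary function never resumes the coroutine directly; it only observes the result after the executor has run the queued resumption, which mirrors how <code class="language-plaintext highlighter-rouge">start()</code> hands back a future that is fulfilled later.</p>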

<p>That is all there is to starting a coroutine. Once the coroutine finishes, its result is retrieved through <code class="language-plaintext highlighter-rouge">Task::Awaiter::await_resume</code> or <code class="language-plaintext highlighter-rouge">TaskWithExecutor::Awaiter::await_resume</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">T</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">DCHECK</span><span class="p">(</span><span class="n">coro_</span><span class="p">);</span>
    <span class="c1">// Eagerly destroy the coroutine-frame once we have retrieved the result.</span>
    <span class="n">SCOPE_EXIT</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">coro_</span><span class="p">,</span> <span class="p">{}).</span><span class="n">destroy</span><span class="p">();</span>
    <span class="p">};</span>
    <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">result</span><span class="p">()).</span><span class="n">value</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Create the outermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Call the inner coroutine, establishing the callee → caller link at co_await callee</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Suspend the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Resume the innermost coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Resume the outer coroutine from the inner coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />Deliver the final result</li>
</ul>

<h2 id="at-last">At last</h2>

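<p>The previous three posts focused on the low-level coroutine machinery that the C++ standard provides. This post applies that machinery to analyze which interfaces a coroutine-based asynchronous task needs, how to implement them, and why they have to be implemented that way. The machinery is complicated enough that the first three posts had to cover the details before this summary was possible.</p>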

<h2 id="refernce">Reference</h2>

<p><a href="https://www.youtube.com/watch?v=lKUVuaUbRDk">Manage Asynchronous Control Flow With C++ Coroutines - Andreas Weis</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><category term="folly" /><summary type="html"><![CDATA[I originally planned to introduce folly::coro::Task directly, but the previous post showed so many concrete techniques that this one steps back to a high-level view: what a coroutine-based asynchronous task actually has to implement, and why it has to be implemented that way, that is, the principles behind the techniques. Some folly::coro::Task material is woven in along the way. Much of this post summarizes this talk; a more detailed version is also available.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 5</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-5/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 5" /><published>2025-11-19T00:00:00+08:00</published><updated>2025-11-19T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-5</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-5/"><![CDATA[<p>Coroutines make asynchronous code easier to write, but when debugging, the asynchronous call chain between coroutines cannot be reconstructed. This post looks at how <code class="language-plaintext highlighter-rouge">folly::AsyncStackFrame</code> records the call relationships between coroutines. Unless stated otherwise, "coroutine" in this post refers to <code class="language-plaintext highlighter-rouge">folly::coro::Task</code>.</p>

<h2 id="background">Background</h2>

<p>As the previous post showed, the low-level coroutine machinery in the C++ standard lets us establish call relationships between coroutines and thereby build a complete asynchronous task. For example, the <code class="language-plaintext highlighter-rouge">callee</code>'s promise object records the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, so that the <code class="language-plaintext highlighter-rouge">callee</code> can resume the <code class="language-plaintext highlighter-rouge">caller</code> once it finishes.</p>

<p><img src="/archive/coroutine-17.png" alt="figure" /></p>

<p>The first idea: since we can establish the call relationship between <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code>, obtaining the <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> gives access to any object in its promise, including the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code>. The two <code class="language-plaintext highlighter-rouge">coroutine_handle</code>s can then be linked together to build the coroutine call stack.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">TaskPromise</span> <span class="p">{</span>
  <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">continuation</span><span class="p">;</span>
  <span class="n">Try</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">result</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">__coro_frame</span> <span class="p">{</span>
    <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">resume_fn</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
    <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">destroy_fn</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="n">__coroutine_traits_impl</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">&gt;::</span><span class="n">promise_type</span> <span class="n">__promise</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">__suspend_index</span><span class="p">;</span>
    <span class="kt">bool</span> <span class="n">__initial_await_suspend_called</span><span class="p">;</span>
    <span class="c1">// ...</span>
<span class="p">};</span>
</code></pre></div></div>

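<p>However, this approach breaks down once the <code class="language-plaintext highlighter-rouge">T</code> in <code class="language-plaintext highlighter-rouge">folly::coro::Task&lt;T&gt;</code> exceeds a certain alignment threshold: the compiler then inserts some bytes of padding after the two pointers <code class="language-plaintext highlighter-rouge">resume_fn</code> and <code class="language-plaintext highlighter-rouge">destroy_fn</code>. At that point we can no longer reliably locate the promise object in the coroutine frame, or its member variables.</p>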
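
<p>The padding effect can be reproduced with ordinary structs. The sketch below models only the frame prefix (two function pointers followed by the promise); real coroutine frame layout is compiler-specific, and <code class="language-plaintext highlighter-rouge">FrameModel</code>, <code class="language-plaintext highlighter-rouge">SmallPromise</code> and <code class="language-plaintext highlighter-rouge">OverAlignedPromise</code> are hypothetical names used purely for illustration.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// Hypothetical stand-ins for a promise type with normal and with large alignment.
struct SmallPromise {
    void* continuation;
    int result;
};
struct alignas(64) OverAlignedPromise {
    void* continuation;
    int result;
};

// Model of the frame prefix only; an actual coroutine frame is laid out by the compiler.
template &lt;typename Promise&gt;
struct FrameModel {
    void (*resume_fn)(void*);
    void (*destroy_fn)(void*);
    Promise promise;
};

using SmallFrame = FrameModel&lt;SmallPromise&gt;;
using OverAlignedFrame = FrameModel&lt;OverAlignedPromise&gt;;

// Size of the two leading function pointers.
constexpr std::size_t kPointerPrefix = 2 * sizeof(void (*)(void*));

// Bytes of padding between destroy_fn and the promise in each model.
constexpr std::size_t smallFramePadding() {
    return offsetof(SmallFrame, promise) - kPointerPrefix;
}
constexpr std::size_t overAlignedFramePadding() {
    return offsetof(OverAlignedFrame, promise) - kPointerPrefix;
}

void checkLayoutModel() {
    assert(smallFramePadding() == 0);       // promise sits right after the pointers
    assert(overAlignedFramePadding() &gt; 0);  // alignment pushes the promise further out
}
</code></pre></div></div>

<p>A tool that assumes the promise starts immediately after the two pointers would read the wrong bytes for the over-aligned case, which is why this idea does not generalize.</p>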

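<p>The second idea: after obtaining a coroutine's <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, inspect the implementation of its <code class="language-plaintext highlighter-rouge">resume_fn</code> to infer where the promise lives, and from there read its member variables. However, this requires debug information, and different compilers generate different assembly, so it is hard to handle in a uniform way.</p>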

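<p>The third idea: since we can establish the call relationship between <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code>, we can also store that relationship, in some form, in a member variable of the coroutine's promise. Unlike the first idea, it never tries to read the contents of the promise object directly from a <code class="language-plaintext highlighter-rouge">coroutine_handle</code>. In fact, folly's <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> is stored as a member variable in the coroutine's promise. Together with another class, <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>, it forms a linked list that records the coroutine call relationships. Walking this list recovers the complete call stack.</p>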

<p>Before getting into how it works, let's look at an example:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">baz</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_return</span> <span class="n">baz</span><span class="p">();</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_await</span> <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">CPUThreadPoolExecutor</span> <span class="n">executor</span><span class="p">{</span><span class="mi">1</span><span class="p">};</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">blockingWait</span><span class="p">(</span><span class="n">foo</span><span class="p">().</span><span class="n">scheduleOn</span><span class="p">(</span><span class="o">&amp;</span><span class="n">executor</span><span class="p">));</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If we set a breakpoint on <code class="language-plaintext highlighter-rouge">baz</code> in gdb, the call stack may look like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#0  baz () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:375
#1  0x00005555557e067f in bar(_Z3barv.Frame *) (frame_ptr=0x7ffff0005210) at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:378
#2  0x00005555557ecd8d in std::__n4861::coroutine_handle&lt;void&gt;::resume (this=0x7ffff6ff5ec8) at /usr/include/c++/13/coroutine:135
</span><span class="p">...</span>
<span class="cp">#5  0x0000555555813466 in folly::coro::TaskWithExecutor&lt;void&gt;::Awaiter::await_suspend&lt;folly::coro::detail::BlockingWaitPromise&lt;void&gt; &gt;(std::__n4861::coroutine_handle&lt;folly::coro::detail::BlockingWaitPromise&lt;void&gt; &gt;)::{lambda()#1}::operator()() (__closure=0x7ffff6ff60e0)
</span>    <span class="n">at</span> <span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">doodle</span><span class="p">.</span><span class="n">wang</span><span class="o">/</span><span class="n">source</span><span class="o">/</span><span class="n">folly</span><span class="o">/</span><span class="n">folly</span><span class="o">/</span><span class="n">experimental</span><span class="o">/</span><span class="n">coro</span><span class="o">/</span><span class="n">Task</span><span class="p">.</span><span class="n">h</span><span class="o">:</span><span class="mi">526</span>
<span class="p">...</span>
<span class="cp">#10 folly::ThreadPoolExecutor::runTask (this=0x555555b2b750, thread=..., task=...) at /home/doodle.wang/source/folly/folly/executors/ThreadPoolExecutor.cpp:100
#11 0x000055555585de53 in folly::CPUThreadPoolExecutor::threadRun (this=0x555555b2b750, thread=...)
</span>    <span class="n">at</span> <span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">doodle</span><span class="p">.</span><span class="n">wang</span><span class="o">/</span><span class="n">source</span><span class="o">/</span><span class="n">folly</span><span class="o">/</span><span class="n">folly</span><span class="o">/</span><span class="n">executors</span><span class="o">/</span><span class="n">CPUThreadPoolExecutor</span><span class="p">.</span><span class="n">cpp</span><span class="o">:</span><span class="mi">326</span>
<span class="p">...</span>
<span class="cp">#22 0x000055555588a0f8 in std::thread::_Invoker&lt;std::tuple&lt;folly::NamedThreadFactory::newThread(folly::Function&lt;void ()&gt;&amp;&amp;)::{lambda()#1}&gt; &gt;::operator()() (
</span>    <span class="k">this</span><span class="o">=</span><span class="mh">0x7ffff00083a0</span><span class="p">)</span> <span class="n">at</span> <span class="o">/</span><span class="n">usr</span><span class="o">/</span><span class="n">include</span><span class="o">/</span><span class="n">c</span><span class="o">++/</span><span class="mi">13</span><span class="o">/</span><span class="n">bits</span><span class="o">/</span><span class="n">std_thread</span><span class="p">.</span><span class="n">h</span><span class="o">:</span><span class="mi">299</span>
<span class="cp">#23 0x0000555555880fa0 in std::thread::_State_impl&lt;std::thread::_Invoker&lt;std::tuple&lt;folly::NamedThreadFactory::newThread(folly::Function&lt;void ()&gt;&amp;&amp;)::{lambda()#1}&gt; &gt; &gt;::_M_run() (this=0x7ffff0008390) at /usr/include/c++/13/bits/std_thread.h:244
</span><span class="p">...</span>
</code></pre></div></div>

<p>Notice that <code class="language-plaintext highlighter-rouge">foo</code> does not appear in the call stack. Although <code class="language-plaintext highlighter-rouge">main</code> called <code class="language-plaintext highlighter-rouge">foo</code>, <code class="language-plaintext highlighter-rouge">foo</code> was scheduled to run on an <code class="language-plaintext highlighter-rouge">Executor</code>, so only the call stack of the <code class="language-plaintext highlighter-rouge">Executor</code> running <code class="language-plaintext highlighter-rouge">foo</code> is visible. What we would rather see is a complete call stack that makes no distinction between coroutines and ordinary functions:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">-</span> <span class="n">baz</span>
<span class="o">-</span> <span class="n">bar</span>
<span class="o">-</span> <span class="n">foo</span>
<span class="o">-</span> <span class="n">main</span>
</code></pre></div></div>

<p>Let's enumerate the cases <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> has to handle to reconstruct the complete call stack:</p>

<ul class="task-list">
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />An ordinary function calls an ordinary function</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />An ordinary function calls a coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" />A coroutine calls a coroutine</li>
  <li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled="disabled" checked="checked" />A coroutine calls an ordinary function</li>
</ul>

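<p>An ordinary function calling an ordinary function works by pushing and popping <code class="language-plaintext highlighter-rouge">%rbp</code> and the return address into the <code class="language-plaintext highlighter-rouge">caller</code>; the previous post covered the basics, so revisit it if this is unfamiliar. A coroutine calling an ordinary function is essentially no different from an ordinary function calling an ordinary function, so neither case is discussed further.</p>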

<blockquote>
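  <p>In fact, <code class="language-plaintext highlighter-rouge">folly::AsyncStackFrame</code> supports more than coroutines: it can also track callback-style asynchronous call stacks such as <code class="language-plaintext highlighter-rouge">folly::Future</code>. Since this series is about coroutines, the examples all use coroutines. In the rest of this post, "ordinary call stack" and "synchronous stack" are synonyms, as are "asynchronous call stack" and "coroutine call stack".</p>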

</blockquote>

<h2 id="asyncstackframe">AsyncStackFrame</h2>

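<p>Ordinary function calls are chained together through <code class="language-plaintext highlighter-rouge">%rbp</code> and return addresses, forming the ordinary call stack, also called the synchronous stack. Likewise, to reconstruct the call stack between coroutines we need to record where the <code class="language-plaintext highlighter-rouge">caller</code> was suspended, so that the <code class="language-plaintext highlighter-rouge">caller</code> can be resumed after the <code class="language-plaintext highlighter-rouge">callee</code> finishes; this is essentially the same as the synchronous stack. With these return addresses plus the binary's debug information, instruction addresses can be mapped back to function names and ultimately to source files and line numbers.</p>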

<p>Achieving this requires solving two core problems:</p>

<ul>
  <li>How the <code class="language-plaintext highlighter-rouge">callee</code> coroutine learns the <code class="language-plaintext highlighter-rouge">caller</code> coroutine's return address and stores it in an <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code></li>
  <li>How to link the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>s of the <code class="language-plaintext highlighter-rouge">caller</code> and the <code class="language-plaintext highlighter-rouge">callee</code></li>
</ul>

<p>Here is the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> data structure:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">AsyncStackFrame</span> <span class="p">{</span>
    <span class="n">AsyncStackFrame</span><span class="o">*</span> <span class="n">parentFrame</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">instructionPointer</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
    <span class="n">AsyncStackRoot</span><span class="o">*</span> <span class="n">stackRoot</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The three members hold the following information:</p>

<ol>
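  <li><code class="language-plaintext highlighter-rouge">parentFrame</code>: the link of a singly linked list of <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>s. It is updated when a coroutine is suspended and records which coroutine called the current one, forming the async call chain.</li>
  <li><code class="language-plaintext highlighter-rouge">instructionPointer</code>: updated when a coroutine is suspended. It records the code address at which execution continues when the coroutine is next resumed.</li>
  <li><code class="language-plaintext highlighter-rouge">stackRoot</code>: records which <code class="language-plaintext highlighter-rouge">EventLoop</code> (which can also be thought of as an async operation) the current coroutine belongs to. Only the coroutine that is currently running has a non-null <code class="language-plaintext highlighter-rouge">stackRoot</code>. Its main purpose is to connect the normal call stack with the async call stack.</li>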
</ol>
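
<p>As a rough illustration of how these fields are meant to be consumed, the sketch below walks a <code class="language-plaintext highlighter-rouge">parentFrame</code> chain and collects each frame's <code class="language-plaintext highlighter-rouge">instructionPointer</code>. The struct mirrors the simplified definition above; <code class="language-plaintext highlighter-rouge">collectReturnAddresses</code>, <code class="language-plaintext highlighter-rouge">fakeAddress</code>, and the made-up addresses are hypothetical names invented for this example and are not folly APIs.</p>

```cpp
#include <cstdint>
#include <vector>

struct AsyncStackRoot;  // only used through a pointer in this sketch

// Simplified copy of the structure shown above.
struct AsyncStackFrame {
    AsyncStackFrame* parentFrame = nullptr;
    void* instructionPointer = nullptr;
    AsyncStackRoot* stackRoot = nullptr;
};

// Hypothetical helper: follow parentFrame links starting at `top` and
// return the recorded instruction pointers, innermost frame first.
std::vector<void*> collectReturnAddresses(const AsyncStackFrame* top) {
    std::vector<void*> addresses;
    for (const AsyncStackFrame* f = top; f != nullptr; f = f->parentFrame) {
        addresses.push_back(f->instructionPointer);
    }
    return addresses;
}

// Hypothetical helper for demos: turn an integer into a fake code address.
inline void* fakeAddress(std::uintptr_t value) {
    return reinterpret_cast<void*>(value);
}
```

<p>A symbolizer would then map each collected address to a function name and source line, in the same way it does for return addresses from the synchronous stack.</p>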

<p>Next, we look at the mechanics in two cases: a coroutine calling a coroutine, and a normal function calling a coroutine.</p>

<h3 id="obtaining-the-return-address-of-an-async-stack-frame"><strong>Obtaining the return-address of an async-stack frame</strong></h3>

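<p>For the first problem, "how the <code class="language-plaintext highlighter-rouge">callee</code> coroutine learns the <code class="language-plaintext highlighter-rouge">caller</code> coroutine's return address", in theory it is enough to get hold of a coroutine frame: its <code class="language-plaintext highlighter-rouge">suspend_index</code> tells us where the coroutine is suspended, that is, from which address the state-machine function <code class="language-plaintext highlighter-rouge">resume_fn</code> continues when the coroutine is resumed. Earlier posts in this series covered this, so we won't repeat it here.</p>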

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">__coro_frame</span> <span class="p">{</span>
    <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">resume_fn</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
    <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">destroy_fn</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span>
    <span class="n">std</span><span class="o">::</span><span class="n">__coroutine_traits_impl</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">&gt;::</span><span class="n">promise_type</span> <span class="n">__promise</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">__suspend_index</span><span class="p">;</span>
    <span class="kt">bool</span> <span class="n">__initial_await_suspend_called</span><span class="p">;</span>
    <span class="c1">// ...</span>
<span class="p">};</span>
</code></pre></div></div>

<p>However, depending on how many <code class="language-plaintext highlighter-rouge">co_await</code> expressions a coroutine contains, the compiler may lower the <code class="language-plaintext highlighter-rouge">suspend_index</code> dispatch in the state-machine function into <a href="https://godbolt.org/z/994b4M">different forms</a> of assembly:</p>

<ol>
  <li>For small coroutines, it compares <code class="language-plaintext highlighter-rouge">suspend_index</code> directly with <code class="language-plaintext highlighter-rouge">cmp</code> instructions</li>
  <li>For larger coroutines, it dispatches through a jump table (the usual jump-table optimization applied to a <code class="language-plaintext highlighter-rouge">switch</code> statement)</li>
</ol>

<p>Different compilers differ even more from one another, so although this approach is feasible, it is too complicated in practice.</p>
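
<p>To make the idea of dispatching on <code class="language-plaintext highlighter-rouge">suspend_index</code> concrete, here is a hand-written sketch of what a lowered coroutine body might look like. It is not actual compiler output: <code class="language-plaintext highlighter-rouge">ToyFrame</code>, <code class="language-plaintext highlighter-rouge">toyResume</code>, and the three-segment body are invented for illustration, and real compilers lay out frames and dispatch code differently.</p>

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified coroutine frame.
struct ToyFrame {
    int suspend_index = 0;         // which suspension point we stopped at
    bool done = false;
    std::vector<std::string> log;  // records which code segment ran
};

// Hand-written analogue of a resume function: a switch on suspend_index
// jumps to the code that follows the last suspension point.
void toyResume(ToyFrame& frame) {
    switch (frame.suspend_index) {
    case 0:
        frame.log.push_back("before first co_await");
        frame.suspend_index = 1;   // suspend at point 1
        return;
    case 1:
        frame.log.push_back("between co_awaits");
        frame.suspend_index = 2;   // suspend at point 2
        return;
    case 2:
        frame.log.push_back("after last co_await");
        frame.suspend_index = 3;   // no further suspension points
        frame.done = true;
        return;
    default:
        return;                    // already finished; nothing to do
    }
}
```

<p>Recovering a resume address from such a frame would mean decoding whichever dispatch form the compiler happened to pick, which is the fragility described above.</p>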

<p><code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> takes a much simpler approach. As noted when we discussed symmetric transfer, the <code class="language-plaintext highlighter-rouge">await_suspend</code> method of an <code class="language-plaintext highlighter-rouge">Awaitable</code> can establish the call relationship between the <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code> coroutines, and it can pass information about the <code class="language-plaintext highlighter-rouge">caller</code> to the <code class="language-plaintext highlighter-rouge">callee</code>. So as long as we can somehow obtain the <code class="language-plaintext highlighter-rouge">caller</code> coroutine's return address (that is, the address of the instruction that follows the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">co_await callee()</code>), we can pass it to the <code class="language-plaintext highlighter-rouge">callee</code> and thereby build the coroutine call stack.</p>

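<p>The return address itself is obtained with the compiler builtin <code class="language-plaintext highlighter-rouge">__builtin_return_address</code>. It takes an integer <code class="language-plaintext highlighter-rouge">n</code> and returns the return address of the <code class="language-plaintext highlighter-rouge">n</code>-th stack frame, where <code class="language-plaintext highlighter-rouge">0</code> refers to the current function. The <code class="language-plaintext highlighter-rouge">caller</code>'s return address can therefore be passed along like this:</p>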

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">auto</span> <span class="n">Task</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">Awaiter</span><span class="o">::</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">().</span><span class="n">instructionPointer</span> <span class="o">=</span> <span class="n">__builtin_return_address</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
    <span class="c1">// ...</span>
    <span class="k">return</span> <span class="n">coro_</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<blockquote>
  <p>PS: if <code class="language-plaintext highlighter-rouge">await_suspend</code> is inlined, <code class="language-plaintext highlighter-rouge">__builtin_return_address</code> returns the wrong return address, so inlining generally has to be disabled for this function.</p>

</blockquote>
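
<p>The behaviour of <code class="language-plaintext highlighter-rouge">__builtin_return_address(0)</code> can be tried in isolation. The following is a minimal sketch that assumes a GCC-compatible compiler (GCC or Clang); <code class="language-plaintext highlighter-rouge">captureReturnAddress</code> and <code class="language-plaintext highlighter-rouge">g_captureCalls</code> are hypothetical names used only here. The function is marked <code class="language-plaintext highlighter-rouge">noinline</code> for the reason given in the note above.</p>

```cpp
// Assumes GCC or Clang: uses a compiler builtin and a GNU-style attribute.

// Written on every call so the compiler cannot merge separate calls.
volatile int g_captureCalls = 0;

// Returns the address this function will return to, i.e. the instruction
// right after the call in the caller. It must not be inlined; otherwise the
// builtin would report the return address of the function that absorbed it.
__attribute__((noinline)) void* captureReturnAddress() {
    g_captureCalls = g_captureCalls + 1;
    return __builtin_return_address(0);
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">captureReturnAddress()</code> from two different call sites should yield two different non-null addresses, one per call site, which is what makes the value usable as a "where was I called from" marker.</p>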

<h3 id="hooking-up-the-stack-frames-in-a-chain"><strong>Hooking up the stack-frames in a chain</strong></h3>

<p>As for how to "chain the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>s of the <code class="language-plaintext highlighter-rouge">caller</code> and the <code class="language-plaintext highlighter-rouge">callee</code> together", we can use the same trick and also update the <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">parentFrame</code> pointer in <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code>. This produces a linked list of the form <code class="language-plaintext highlighter-rouge">callee -&gt; caller -&gt; ...</code>. When we find, through a breakpoint or similar means, that the CPU is executing some coroutine, we can walk this list to recover the full call relationship among the coroutines.</p>

<p>Concretely:</p>

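<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>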
<span class="k">auto</span> <span class="n">Task</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">Awaiter</span><span class="o">::</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">auto</span><span class="o">&amp;</span> <span class="n">callerFrame</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">();</span>
    <span class="k">auto</span><span class="o">&amp;</span> <span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">();</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">parentFrame</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">callerFrame</span><span class="p">;</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">instructionPointer</span> <span class="o">=</span> <span class="n">__builtin_return_address</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
    <span class="c1">// ...</span>
    <span class="k">return</span> <span class="n">coro_</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="finding-the-top-async-stack-frame"><strong>Finding the top async stack-frame</strong></h3>

<p>In the previous two steps, we updated the <code class="language-plaintext highlighter-rouge">parentFrame</code> and <code class="language-plaintext highlighter-rouge">instructionPointer</code> fields in the <code class="language-plaintext highlighter-rouge">await_suspend</code> of the <code class="language-plaintext highlighter-rouge">Awaitable</code>, which establishes the call relationships among coroutines. One problem remains: how do we obtain the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine that is currently executing?</p>

<blockquote>
  <p>Normal stack frames do not have this problem: reading <code class="language-plaintext highlighter-rouge">%rbp</code> is enough to find the topmost frame.</p>

</blockquote>

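<p>The solution is as follows. In the symmetric-transfer code of <code class="language-plaintext highlighter-rouge">folly::coro::Task</code>, we know which coroutine control is being handed to, so we can also record which coroutine is executing. However, a stack walker does not read the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> inside the promise directly, so this information is not stored there. Instead, the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the currently executing coroutine is recorded through a <code class="language-plaintext highlighter-rouge">thread_local</code> variable that refers to an <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>. Its data structure is:</p>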

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">AsyncStackRoot</span> <span class="p">{</span>
    <span class="c1">// Pointer to the currently-active AsyncStackFrame for this event</span>
    <span class="c1">// loop or callback invocation. May be null if this event loop is</span>
    <span class="c1">// not currently executing any async operations.</span>
    <span class="n">std</span><span class="o">::</span><span class="n">atomic</span><span class="o">&lt;</span><span class="n">AsyncStackFrame</span><span class="o">*&gt;</span> <span class="n">topFrame</span><span class="p">{</span><span class="nb">nullptr</span><span class="p">};</span>
    <span class="c1">// Pointer to the next event loop context lower on the current</span>
    <span class="c1">// thread's stack.</span>
    <span class="n">AsyncStackRoot</span><span class="o">*</span> <span class="n">nextRoot</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
    <span class="c1">// Pointer to the stack-frame and return-address of the function</span>
    <span class="c1">// call that registered this AsyncStackRoot on the current thread.</span>
    <span class="c1">// This is generally the stack-frame responsible for executing async</span>
    <span class="c1">// callbacks (typically an event-loop).</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">stackFramePtr</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">returnAddress</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">topFrame</code>: points to the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine that is currently executing.</li>
  <li><code class="language-plaintext highlighter-rouge">nextRoot</code>: chains multiple <code class="language-plaintext highlighter-rouge">EventLoop</code>s together, forming a linked list of <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>s. If <code class="language-plaintext highlighter-rouge">EventLoop A</code> creates <code class="language-plaintext highlighter-rouge">EventLoop B</code>, then the <code class="language-plaintext highlighter-rouge">nextRoot</code> of <code class="language-plaintext highlighter-rouge">B</code>'s <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> points to <code class="language-plaintext highlighter-rouge">A</code>'s <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">stackFramePtr</code> &amp; <code class="language-plaintext highlighter-rouge">returnAddress</code>: both record information about the normal call stack. <code class="language-plaintext highlighter-rouge">stackFramePtr</code> records the synchronous stack frame at the time this <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> was registered (that is, the value of <code class="language-plaintext highlighter-rouge">%rbp</code> at that moment), and <code class="language-plaintext highlighter-rouge">returnAddress</code> records the corresponding return address.</li>
</ul>

<p>For the coroutine that is currently running, <code class="language-plaintext highlighter-rouge">topFrame</code> in the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> and <code class="language-plaintext highlighter-rouge">stackRoot</code> in the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> point at each other:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">topFrame</code> in the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> points to the coroutine that is currently executing</li>
  <li><code class="language-plaintext highlighter-rouge">stackRoot</code> in the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> points to the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> that the current thread is using</li>
</ul>

<p>For a suspended coroutine, <code class="language-plaintext highlighter-rouge">stackRoot</code> in its <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> is a null pointer.</p>

<p>The following example illustrates the role of <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">compute_something</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">coro1</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">compute_something</span><span class="p">();</span>
    <span class="k">co_return</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">func1</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">blockingWait</span><span class="p">(</span><span class="n">coro1</span><span class="p">());</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">coro2</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">func1</span><span class="p">();</span>
    <span class="k">co_return</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">blockingWait</span><span class="p">(</span><span class="n">coro2</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Unlike the example at the start of this post, this one does not specify which <code class="language-plaintext highlighter-rouge">Executor</code> the coroutines run on. <code class="language-plaintext highlighter-rouge">main</code> starts two nested coroutines, <code class="language-plaintext highlighter-rouge">coro2</code> and <code class="language-plaintext highlighter-rouge">coro1</code>, and both actually run on the same thread. Each call to <code class="language-plaintext highlighter-rouge">folly::coro::blockingWait</code> creates an event loop that keeps driving the async task until it completes.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">FOLLY_NOINLINE</span> <span class="n">T</span> <span class="n">getVia</span><span class="p">(</span><span class="n">folly</span><span class="o">::</span><span class="n">DrivableExecutor</span> <span class="o">*</span><span class="n">executor</span><span class="p">,</span>
                        <span class="n">folly</span><span class="o">::</span><span class="n">AsyncStackFrame</span> <span class="o">&amp;</span><span class="n">parentFrame</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">Try</span><span class="o">&lt;</span><span class="n">detail</span><span class="o">::</span><span class="n">lift_lvalue_reference_t</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span> <span class="n">result</span><span class="p">;</span>
    <span class="k">auto</span> <span class="o">&amp;</span><span class="n">promise</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">();</span>
    <span class="n">promise</span><span class="p">.</span><span class="n">setTry</span><span class="p">(</span><span class="o">&amp;</span><span class="n">result</span><span class="p">);</span>

    <span class="c1">// ...</span>

    <span class="n">executor</span><span class="o">-&gt;</span><span class="n">add</span><span class="p">([</span><span class="n">coro</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">,</span> <span class="n">rctx</span> <span class="o">=</span> <span class="n">RequestContext</span><span class="o">::</span><span class="n">saveContext</span><span class="p">()]()</span> <span class="k">mutable</span> <span class="p">{</span>
        <span class="n">RequestContextScopeGuard</span> <span class="n">guard</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">rctx</span><span class="p">)};</span>
        <span class="n">folly</span><span class="o">::</span><span class="n">resumeCoroutineWithNewAsyncStackRoot</span><span class="p">(</span><span class="n">coro</span><span class="p">);</span>
    <span class="p">});</span>
    <span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="n">promise</span><span class="p">.</span><span class="n">done</span><span class="p">())</span> <span class="p">{</span>  <span class="c1">// &lt;- EventLoop</span>
        <span class="n">executor</span><span class="o">-&gt;</span><span class="n">drive</span><span class="p">();</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">result</span><span class="p">).</span><span class="n">value</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

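<p>A thread may start multiple <code class="language-plaintext highlighter-rouge">EventLoop</code>s, and each <code class="language-plaintext highlighter-rouge">EventLoop</code> drives multiple coroutines. On any one thread, only one coroutine is executing at a given moment, and all the others are suspended. To get the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the running coroutine, we need to record, in some form, which coroutine the current thread is executing and through which <code class="language-plaintext highlighter-rouge">EventLoop</code>.</p>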

<p>The approach is this: whenever an <code class="language-plaintext highlighter-rouge">EventLoop</code> starts, an <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> is created. On every coroutine switch, that is, when the <code class="language-plaintext highlighter-rouge">caller</code> executes <code class="language-plaintext highlighter-rouge">co_await callee</code> and when the <code class="language-plaintext highlighter-rouge">callee</code> executes <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code>, information about the coroutine now running on the current thread is saved into the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>. A sketch of the code:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// caller -&gt; callee</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">auto</span> <span class="n">Task</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">Awaiter</span><span class="o">::</span><span class="n">await_suspend_impl</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">,</span>
                                          <span class="n">AsyncStackFrame</span> <span class="o">&amp;</span><span class="n">callerFrame</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="o">&amp;</span><span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">coro_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">getAsyncFrame</span><span class="p">();</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">parentFrame</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">callerFrame</span><span class="p">;</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">instructionPointer</span> <span class="o">=</span> <span class="n">__builtin_return_address</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>

    <span class="k">auto</span> <span class="o">*</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="n">callerFrame</span><span class="p">.</span><span class="n">stackRoot</span><span class="p">;</span>
    <span class="n">stackRoot</span><span class="o">-&gt;</span><span class="n">topFrame</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">calleeFrame</span><span class="p">;</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="n">stackRoot</span><span class="p">;</span>
    <span class="n">callerFrame</span><span class="p">.</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>

    <span class="c1">// ...</span>
    <span class="k">return</span> <span class="n">coro_</span><span class="p">;</span>
<span class="p">}</span>

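<span class="c1">// callee -&gt; caller</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>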
<span class="k">auto</span> <span class="n">TaskPromise</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">FinalAwaiter</span><span class="o">::</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">Promise</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="o">&amp;</span><span class="n">promise</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">();</span>

    <span class="n">AsyncStackFrame</span> <span class="o">&amp;</span><span class="n">calleeFrame</span> <span class="o">=</span> <span class="n">promise</span><span class="p">.</span><span class="n">getAsyncFrame</span><span class="p">();</span>
    <span class="n">AsyncStackFrame</span> <span class="o">*</span><span class="n">callerFrame</span> <span class="o">=</span> <span class="n">calleeFrame</span><span class="p">.</span><span class="n">parentFrame</span><span class="p">;</span>
    <span class="n">AsyncStackRoot</span> <span class="o">*</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="n">calleeFrame</span><span class="p">.</span><span class="n">stackRoot</span><span class="p">;</span>

    <span class="n">stackRoot</span><span class="o">-&gt;</span><span class="n">topFrame</span> <span class="o">=</span> <span class="n">callerFrame</span><span class="p">;</span>
    <span class="n">callerFrame</span><span class="o">-&gt;</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="n">stackRoot</span><span class="p">;</span>
    <span class="n">calleeFrame</span><span class="p">.</span><span class="n">stackRoot</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>

    <span class="c1">// ...</span>
    <span class="k">return</span> <span class="n">promise</span><span class="p">.</span><span class="n">continuation</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>As you can see, both the <code class="language-plaintext highlighter-rouge">caller -&gt; callee</code> and the <code class="language-plaintext highlighter-rouge">callee -&gt; caller</code> symmetric transfers do the following:</p>

<ul>
  <li>Point <code class="language-plaintext highlighter-rouge">topFrame</code> in the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> at the coroutine that is about to run</li>
  <li>Set <code class="language-plaintext highlighter-rouge">stackRoot</code> to a non-null value in the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine that is about to run, and set <code class="language-plaintext highlighter-rouge">stackRoot</code> to a null pointer in the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine that is about to be suspended.</li>
</ul>
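
<p>The two bookkeeping steps above can be exercised without any coroutine machinery. The sketch below is a self-contained model under simplifying assumptions: <code class="language-plaintext highlighter-rouge">topFrame</code> is a plain pointer rather than a <code class="language-plaintext highlighter-rouge">std::atomic</code>, and <code class="language-plaintext highlighter-rouge">activate</code>, <code class="language-plaintext highlighter-rouge">transferToCallee</code>, <code class="language-plaintext highlighter-rouge">transferToCaller</code>, and <code class="language-plaintext highlighter-rouge">pointsBack</code> are hypothetical helper names, not folly functions.</p>

```cpp
#include <cassert>

struct AsyncStackFrame;

// Simplified root: the structure shown earlier uses std::atomic for topFrame.
struct AsyncStackRoot {
    AsyncStackFrame* topFrame = nullptr;
};

struct AsyncStackFrame {
    AsyncStackFrame* parentFrame = nullptr;
    void* instructionPointer = nullptr;
    AsyncStackRoot* stackRoot = nullptr;
};

// Hypothetical helper: make `frame` the active frame of `root`, as an
// event loop would before resuming its first coroutine.
void activate(AsyncStackRoot& root, AsyncStackFrame& frame) {
    assert(root.topFrame == nullptr && frame.stackRoot == nullptr);
    root.topFrame = &frame;
    frame.stackRoot = &root;
}

// Hypothetical helper mirroring the caller -> callee transfer.
void transferToCallee(AsyncStackFrame& caller, AsyncStackFrame& callee) {
    assert(caller.stackRoot != nullptr);  // the caller must be running
    AsyncStackRoot* root = caller.stackRoot;
    callee.parentFrame = &caller;
    root->topFrame = &callee;
    callee.stackRoot = root;
    caller.stackRoot = nullptr;
}

// Hypothetical helper mirroring the callee -> caller transfer.
void transferToCaller(AsyncStackFrame& callee) {
    assert(callee.stackRoot != nullptr && callee.parentFrame != nullptr);
    AsyncStackRoot* root = callee.stackRoot;
    AsyncStackFrame* caller = callee.parentFrame;
    root->topFrame = caller;
    caller->stackRoot = root;
    callee.stackRoot = nullptr;
}

// Invariant from the text: a root and its running frame point at each other.
bool pointsBack(const AsyncStackRoot& root) {
    return root.topFrame != nullptr && root.topFrame->stackRoot == &root;
}
```

<p>After every transfer exactly one frame has a non-null <code class="language-plaintext highlighter-rouge">stackRoot</code>, and it is the frame that <code class="language-plaintext highlighter-rouge">topFrame</code> points to, which is the property a stack walker relies on.</p>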

<p>One question is still open: a thread may have multiple <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>s, so given a thread, how do we find the one that is currently in use? The answer is to store the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> each thread is currently using in thread-local storage. At a high level, the complete relationship between <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> and <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> looks like this:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//  Stack Register</span>
<span class="c1">//      |</span>
<span class="c1">//      V</span>
<span class="c1">//  Stack Frame       currentStackRoot (TLS)</span>
<span class="c1">//      |                   |</span>
<span class="c1">//      V                   V</span>
<span class="c1">//  Stack Frame &lt;----- AsyncStackRoot -----&gt; AsyncStackFrame -----&gt; AsyncStackFrame -&gt; ...</span>
<span class="c1">//      |   (stackFramePtr) |      (topFrame)            (parentFrame)</span>
<span class="c1">//      V                   |</span>
<span class="c1">//  Stack Frame             |(nextRoot)</span>
<span class="c1">//      :                   |</span>
<span class="c1">//      V                   V</span>
<span class="c1">//  Stack Frame &lt;----- AsyncStackRoot -----&gt; AsyncStackFrame -----&gt; AsyncStackFrame -&gt; ...</span>
<span class="c1">//      |   (stackFramePtr) |      (topFrame)            (parentFrame)</span>
<span class="c1">//      V                   X</span>
<span class="c1">//  Stack Frame</span>
<span class="c1">//      :</span>
<span class="c1">//      V</span>
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>s are linked to one another through <code class="language-plaintext highlighter-rouge">parentFrame</code></li>
  <li>An <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> records the currently executing coroutine through <code class="language-plaintext highlighter-rouge">topFrame</code></li>
  <li><code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>s are linked to one another through <code class="language-plaintext highlighter-rouge">nextRoot</code>: if <code class="language-plaintext highlighter-rouge">EventLoop A</code> creates <code class="language-plaintext highlighter-rouge">EventLoop B</code>, then the <code class="language-plaintext highlighter-rouge">nextRoot</code> of <code class="language-plaintext highlighter-rouge">B</code>'s <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> points to <code class="language-plaintext highlighter-rouge">A</code>'s <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code></li>
  <li>The <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> that the current thread is using is stored in TLS</li>
</ul>
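
<p>The TLS slot and the <code class="language-plaintext highlighter-rouge">nextRoot</code> list can be modelled in a few lines. This is a minimal sketch under stated assumptions, not folly's implementation: <code class="language-plaintext highlighter-rouge">t_currentRoot</code>, <code class="language-plaintext highlighter-rouge">RootGuard</code>, and <code class="language-plaintext highlighter-rouge">countActiveRoots</code> are hypothetical names, and the root is reduced to the two fields needed here.</p>

```cpp
#include <cstddef>

struct AsyncStackFrame;  // not needed in detail for this sketch

// Reduced root: only the fields relevant to the TLS chain.
struct AsyncStackRoot {
    AsyncStackFrame* topFrame = nullptr;
    AsyncStackRoot* nextRoot = nullptr;
};

// Hypothetical per-thread slot holding the root currently in use.
thread_local AsyncStackRoot* t_currentRoot = nullptr;

// Hypothetical RAII guard: registers a new root on construction and restores
// the previous one on destruction, so nested event loops form a linked list.
class RootGuard {
public:
    RootGuard() {
        root_.nextRoot = t_currentRoot;
        t_currentRoot = &root_;
    }
    ~RootGuard() { t_currentRoot = root_.nextRoot; }
    RootGuard(const RootGuard&) = delete;
    RootGuard& operator=(const RootGuard&) = delete;

    AsyncStackRoot& root() { return root_; }

private:
    AsyncStackRoot root_;
};

// Counts the roots reachable from the current thread's TLS slot.
std::size_t countActiveRoots() {
    std::size_t n = 0;
    for (AsyncStackRoot* r = t_currentRoot; r != nullptr; r = r->nextRoot) {
        ++n;
    }
    return n;
}
```

<p>Nesting two guards on one thread corresponds to the two-row picture in the diagram above: the inner root sits in TLS, and its <code class="language-plaintext highlighter-rouge">nextRoot</code> leads to the outer one.</p>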

<blockquote>
  <p>A follow-up question: from outside the process (in gdb, for example), how do we find the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> that belongs to a given thread? Typically, each thread's <code class="language-plaintext highlighter-rouge">control-block</code> holds the thread ID, process ID, priority, and the TLS we care about, and a pointer to the <code class="language-plaintext highlighter-rouge">control-block</code> is kept in the <code class="language-plaintext highlighter-rouge">fs</code> register. In gdb, that register leads to the thread's TLS and therefore to the corresponding <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>. For the relevant code, see <code class="language-plaintext highlighter-rouge">AsyncStackRootHolder</code> in folly.</p>

</blockquote>

<h3 id="finding-the-stack-frame-that-corresponds-to-an-async-frame-activation"><strong>Finding the stack-frame that corresponds to an async-frame activation</strong></h3>

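<p>At this point we fully understand how the call stack between coroutines is organized. The last question is how to handle a normal function calling a coroutine, that is, how to join the normal call stack with the coroutine call stack. The answer is the <code class="language-plaintext highlighter-rouge">stackFramePtr</code> pointer mentioned earlier. Take the code from the start of this post again:</p>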

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">baz</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_return</span> <span class="n">baz</span><span class="p">();</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_await</span> <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">CPUThreadPoolExecutor</span> <span class="n">executor</span><span class="p">{</span><span class="mi">1</span><span class="p">};</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">blockingWait</span><span class="p">(</span><span class="n">foo</span><span class="p">().</span><span class="n">scheduleOn</span><span class="p">(</span><span class="o">&amp;</span><span class="n">executor</span><span class="p">));</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Walking the coroutines' <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> chain gives us the following call stack:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- baz
- bar
- foo
</code></pre></div></div>

<p>What we actually want is the complete call stack:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">-</span> <span class="n">baz</span>
<span class="o">-</span> <span class="n">bar</span>
<span class="o">-</span> <span class="n">foo</span>
<span class="o">-</span> <span class="n">main</span>
</code></pre></div></div>

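<p>In other words, whenever a thread resumes a coroutine through a <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, the normal call stack at that point has to be linked to the coroutine call stack. This is done by saving information about the normal call stack in the <code class="language-plaintext highlighter-rouge">stackFramePtr</code> and <code class="language-plaintext highlighter-rouge">returnAddress</code> fields of an <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>. Every time a thread starts an <code class="language-plaintext highlighter-rouge">EventLoop</code>, i.e. begins running asynchronous work, an <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> records the position of the current frame on the normal call stack, so that a stack walk can later switch from the coroutine call stack back to the normal one. The call chain in the code is:</p>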

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">blockingWait</span>
  <span class="o">-&gt;</span> <span class="n">BlockingWaitTask</span><span class="o">::</span><span class="n">get</span> <span class="n">or</span> <span class="n">BlockingWaitTask</span><span class="o">::</span><span class="n">getVia</span>
    <span class="o">-&gt;</span> <span class="n">resumeCoroutineWithNewAsyncStackRoot</span>
</code></pre></div></div>

<p>The actual work is done by the <code class="language-plaintext highlighter-rouge">ScopedAsyncStackRoot</code> class, which sets up and restores the state in RAII style:</p>

<ol>
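  <li>Create an <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>:
    <ol>
      <li>Its <code class="language-plaintext highlighter-rouge">stackFramePtr</code> field points to the current frame of the normal call stack, obtained via <code class="language-plaintext highlighter-rouge">FOLLY_ASYNC_STACK_FRAME_POINTER</code>, which in practice calls <code class="language-plaintext highlighter-rouge">__builtin_frame_address</code>. When the full call stack is reconstructed later, this pointer is what lets the walk switch from the coroutine call stack back to the normal call stack (see the explanation further below).</li>
      <li>Its <code class="language-plaintext highlighter-rouge">returnAddress</code> field holds the return address of that frame, i.e. the address of the next instruction to execute on the normal call stack.</li>
      <li>The thread's previous <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> (the one currently in TLS) is saved in <code class="language-plaintext highlighter-rouge">nextRoot</code>, and the TLS pointer is updated to the new <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> created on the thread that called <code class="language-plaintext highlighter-rouge">blockingWait</code>.</li>
    </ol>
  </li>
  <li>Set <code class="language-plaintext highlighter-rouge">topFrame</code> of the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> to the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine about to be resumed.</li>
  <li>Resume the coroutine.</li>
  <li>Once the coroutine is done and <code class="language-plaintext highlighter-rouge">resume()</code> returns, restore the TLS <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> to the one saved in <code class="language-plaintext highlighter-rouge">nextRoot</code>, i.e. the root that was active on the thread before <code class="language-plaintext highlighter-rouge">blockingWait</code> installed its own.</li>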
</ol>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">FOLLY_NOINLINE</span> <span class="kt">void</span> <span class="n">resumeCoroutineWithNewAsyncStackRoot</span><span class="p">(</span>
    <span class="n">coro</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">h</span><span class="p">,</span> <span class="n">folly</span><span class="o">::</span><span class="n">AsyncStackFrame</span><span class="o">&amp;</span> <span class="n">frame</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
  <span class="c1">// In ScopedAsyncStackRoot's constructor, it will:</span>
  <span class="c1">// 1. create an AsyncStackRoot with</span>
  <span class="c1">//      stackFramePtr = FOLLY_ASYNC_STACK_FRAME_POINTER()</span>
  <span class="c1">//      returnAddress = FOLLY_ASYNC_STACK_RETURN_ADDRESS()</span>
  <span class="c1">// 2. update TLS AsyncStackRoot as current AsyncStackRoot</span>
  <span class="n">detail</span><span class="o">::</span><span class="n">ScopedAsyncStackRoot</span> <span class="n">root</span><span class="p">;</span>
  <span class="n">root</span><span class="p">.</span><span class="n">activateFrame</span><span class="p">(</span><span class="n">frame</span><span class="p">);</span>
  <span class="n">h</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>

  <span class="c1">// In ScopedAsyncStackRoot's destructor, it will:</span>
  <span class="c1">// 1. restore the TLS AsyncStackRoot to the current AsyncStackRoot's nextRoot</span>
<span class="p">}</span>

<span class="n">ScopedAsyncStackRoot</span><span class="o">::</span><span class="n">ScopedAsyncStackRoot</span><span class="p">(</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">framePointer</span> <span class="o">=</span> <span class="n">FOLLY_ASYNC_STACK_FRAME_POINTER</span><span class="p">(),</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">returnAddress</span> <span class="o">=</span> <span class="n">FOLLY_ASYNC_STACK_RETURN_ADDRESS</span><span class="p">())</span> <span class="k">noexcept</span> <span class="p">{</span>
  <span class="n">root_</span><span class="p">.</span><span class="n">setStackFrameContext</span><span class="p">(</span><span class="n">framePointer</span><span class="p">,</span> <span class="n">returnAddress</span><span class="p">);</span>
  <span class="n">root_</span><span class="p">.</span><span class="n">nextRoot</span> <span class="o">=</span> <span class="n">currentThreadAsyncStackRoot</span><span class="p">.</span><span class="n">get</span><span class="p">();</span>
  <span class="n">currentThreadAsyncStackRoot</span><span class="p">.</span><span class="n">set</span><span class="p">(</span><span class="o">&amp;</span><span class="n">root_</span><span class="p">);</span>  <span class="c1">// update thread local AsyncStackRoot</span>
<span class="p">}</span>

<span class="n">ScopedAsyncStackRoot</span><span class="o">::~</span><span class="n">ScopedAsyncStackRoot</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">assert</span><span class="p">(</span><span class="n">currentThreadAsyncStackRoot</span><span class="p">.</span><span class="n">get</span><span class="p">()</span> <span class="o">==</span> <span class="o">&amp;</span><span class="n">root_</span><span class="p">);</span>
  <span class="n">assert</span><span class="p">(</span><span class="n">root_</span><span class="p">.</span><span class="n">topFrame</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">memory_order_relaxed</span><span class="p">)</span> <span class="o">==</span> <span class="nb">nullptr</span><span class="p">);</span>
  <span class="n">currentThreadAsyncStackRoot</span><span class="p">.</span><span class="n">set_relaxed</span><span class="p">(</span><span class="n">root_</span><span class="p">.</span><span class="n">nextRoot</span><span class="p">);</span>
<span class="p">}</span>

<span class="kr">inline</span> <span class="kt">void</span> <span class="n">AsyncStackRoot</span><span class="o">::</span><span class="n">setStackFrameContext</span><span class="p">(</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">framePtr</span><span class="p">,</span> <span class="kt">void</span><span class="o">*</span> <span class="n">ip</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
  <span class="n">stackFramePtr</span> <span class="o">=</span> <span class="n">framePtr</span><span class="p">;</span>
  <span class="n">returnAddress</span> <span class="o">=</span> <span class="n">ip</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
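
<p>To make the save/restore behaviour concrete, here is a minimal standalone sketch of the same RAII pattern. <code class="language-plaintext highlighter-rouge">MiniAsyncStackRoot</code>, <code class="language-plaintext highlighter-rouge">ScopedMiniRoot</code>, <code class="language-plaintext highlighter-rouge">currentRoot</code> and <code class="language-plaintext highlighter-rouge">rootDepth</code> are hypothetical names made up for this illustration, not folly's types. The sketch assumes GCC or Clang for the <code class="language-plaintext highlighter-rouge">__builtin_*</code> intrinsics, and it leaves out <code class="language-plaintext highlighter-rouge">topFrame</code> and the atomics that folly uses.</p>

```cpp
#include <cassert>

// Assumption: GCC/Clang intrinsics; other compilers fall back to nullptr.
#if defined(__GNUC__) || defined(__clang__)
#define MINI_FRAME_POINTER() __builtin_frame_address(0)
#define MINI_RETURN_ADDRESS() __builtin_return_address(0)
#else
#define MINI_FRAME_POINTER() static_cast<void*>(nullptr)
#define MINI_RETURN_ADDRESS() static_cast<void*>(nullptr)
#endif

// Hypothetical, stripped-down stand-in for folly::AsyncStackRoot.
struct MiniAsyncStackRoot {
    void* stackFramePtr = nullptr;           // frame of the normal function that resumes the coroutine
    void* returnAddress = nullptr;           // return address recorded together with that frame
    MiniAsyncStackRoot* nextRoot = nullptr;  // root that was active on this thread before
};

// One "current root" per thread, playing the role of the TLS slot.
thread_local MiniAsyncStackRoot* currentRoot = nullptr;

class ScopedMiniRoot {
public:
    // Default arguments are evaluated at the call site,
    // so they describe the frame of whoever creates the guard.
    ScopedMiniRoot(void* framePointer = MINI_FRAME_POINTER(),
                   void* returnAddress = MINI_RETURN_ADDRESS()) noexcept {
        root_.stackFramePtr = framePointer;
        root_.returnAddress = returnAddress;
        root_.nextRoot = currentRoot;  // remember the previous root
        currentRoot = &root_;          // install the new one
    }

    ~ScopedMiniRoot() {
        assert(currentRoot == &root_);
        currentRoot = root_.nextRoot;  // put the previous root back
    }

    ScopedMiniRoot(const ScopedMiniRoot&) = delete;
    ScopedMiniRoot& operator=(const ScopedMiniRoot&) = delete;

    const MiniAsyncStackRoot& root() const noexcept { return root_; }

private:
    MiniAsyncStackRoot root_;
};

// Counts how many roots are chained on the calling thread.
int rootDepth() noexcept {
    int depth = 0;
    for (const MiniAsyncStackRoot* r = currentRoot; r != nullptr; r = r->nextRoot) {
        ++depth;
    }
    return depth;
}
```

<p>Nesting two <code class="language-plaintext highlighter-rouge">ScopedMiniRoot</code> objects on one thread makes <code class="language-plaintext highlighter-rouge">rootDepth()</code> go from 0 to 1 to 2 and back down to 0 as the scopes exit, which is the <code class="language-plaintext highlighter-rouge">nextRoot</code> chain at work.</p>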

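<p>With the frame pointer and return address of the normal call stack recorded, the normal call stack and the coroutine call stack can be joined. More precisely, the ways a <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> can be launched from ordinary code all end up resuming the coroutine through a path like this one (<code class="language-plaintext highlighter-rouge">blockingWait</code> in our example). So no matter how <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> is used, the point where an ordinary function calls into a coroutine gets recorded.</p>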

<p>Let's use the linked structure below to walk through how the whole call stack is reconstructed:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Stack Register
    |
    V
Stack Frame       currentStackRoot (TLS)
    |                   |
    V                   V
Stack Frame &lt;----- AsyncStackRoot -----&gt; AsyncStackFrame -----&gt; AsyncStackFrame -&gt; ...
    |   (stackFramePtr) |      (topFrame)            (parentFrame)
    V                   |
Stack Frame             |(nextRoot)
    :                   |
    V                   V
Stack Frame &lt;----- AsyncStackRoot -----&gt; AsyncStackFrame -----&gt; AsyncStackFrame -&gt; ...
    |   (stackFramePtr) |      (topFrame)            (parentFrame)
    V                   X
Stack Frame
    :
    V
</code></pre></div></div>

<ol>
  <li>Read the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> from TLS.</li>
  <li>Handle ordinary function calls: follow <code class="language-plaintext highlighter-rouge">%rbp</code> and the return addresses to recover the synchronous call stack, until the current stack frame's <code class="language-plaintext highlighter-rouge">%rbp</code> and the current <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>'s <code class="language-plaintext highlighter-rouge">stackFramePtr</code> point to the same location (meaning an asynchronous operation was started here). Stop following <code class="language-plaintext highlighter-rouge">%rbp</code> at that point; this is where the normal call stack and the coroutine call stack switch over.</li>
  <li>Through the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>'s <code class="language-plaintext highlighter-rouge">topFrame</code>, find the <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code> of the coroutine that is currently executing, then keep extending the coroutine call stack along <code class="language-plaintext highlighter-rouge">parentFrame</code>.</li>
  <li>When some <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>'s <code class="language-plaintext highlighter-rouge">parentFrame</code> is null, the whole chain belonging to the current <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> has been traversed.</li>
  <li>Switch back to the normal call stack using the <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code>'s <code class="language-plaintext highlighter-rouge">stackFramePtr</code>.</li>
  <li>Repeat from step 2, this time matching against the next <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> reached through <code class="language-plaintext highlighter-rouge">nextRoot</code>.</li>
</ol>

<p>So the reconstructed call stack may alternate: <code class="language-plaintext highlighter-rouge">sync stack -&gt; async stack -&gt; sync stack -&gt; async stack -&gt; ...</code>.</p>

<blockquote>
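  <p>PS: in other words, <code class="language-plaintext highlighter-rouge">topFrame</code> in <code class="language-plaintext highlighter-rouge">AsyncStackRoot</code> handles the switch from the synchronous call stack to the asynchronous one, while <code class="language-plaintext highlighter-rouge">stackFramePtr</code> handles the switch from the asynchronous call stack back to the synchronous one.</p>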

</blockquote>

<p>The numbers in the diagram below are each frame's position in the final reconstructed call stack:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Stack Register
    |
    V
Stack Frame(0)   currentStackRoot (TLS)
    |                  |
    V                  V
Stack Frame(3) &lt;- AsyncStackRoot  -&gt; AsyncStackFrame(1) -&gt; AsyncStackFrame(2) -&gt; X
    |                  |
    V                  |
Stack Frame(4)         |
    :                  |
    V                  V
Stack Frame(7) &lt;- AsyncStackRoot  -&gt; AsyncStackFrame(5) -&gt; AsyncStackFrame(6) -&gt; X
    |                  |
    V                  X
Stack Frame(8)
    :
    V
</code></pre></div></div>
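
<p>The walk can also be sketched in a few lines over mock data. Everything here is an assumption made for illustration: <code class="language-plaintext highlighter-rouge">MockNormalFrame</code>, <code class="language-plaintext highlighter-rouge">MockAsyncFrame</code>, <code class="language-plaintext highlighter-rouge">MockAsyncRoot</code>, <code class="language-plaintext highlighter-rouge">walkMixedStack</code> and <code class="language-plaintext highlighter-rouge">walkDiagramDemo</code> are hypothetical names, the <code class="language-plaintext highlighter-rouge">caller</code> pointer stands in for the saved <code class="language-plaintext highlighter-rouge">%rbp</code>, and the real gdb script reads raw memory rather than C++ objects. The demo rebuilds the topology of the numbered diagram, with the elided frames between 4 and 7 left out.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock of a normal (synchronous) stack frame; `caller` plays the role of the saved %rbp.
struct MockNormalFrame {
    std::string name;
    const MockNormalFrame* caller = nullptr;
};

// Mock of an AsyncStackFrame; `parentFrame` is the awaiting coroutine.
struct MockAsyncFrame {
    std::string name;
    const MockAsyncFrame* parentFrame = nullptr;
};

// Mock of an AsyncStackRoot.
struct MockAsyncRoot {
    const MockNormalFrame* stackFramePtr = nullptr;  // normal frame that resumed the coroutine
    const MockAsyncFrame* topFrame = nullptr;        // innermost active async frame
    const MockAsyncRoot* nextRoot = nullptr;         // previously active root on this thread
};

// Walks normal frames from `frame`; whenever a frame matches the current root's
// stackFramePtr, the async chain of that root is emitted first, then the walk moves
// on to nextRoot and continues along the normal stack.
std::vector<std::string> walkMixedStack(const MockNormalFrame* frame,
                                        const MockAsyncRoot* root) {
    std::vector<std::string> trace;
    for (; frame != nullptr; frame = frame->caller) {
        if (root != nullptr && frame == root->stackFramePtr) {
            for (const MockAsyncFrame* a = root->topFrame; a != nullptr; a = a->parentFrame) {
                trace.push_back(a->name);
            }
            root = root->nextRoot;
        }
        trace.push_back(frame->name);
    }
    return trace;
}

// Rebuilds the numbered diagram: S = normal stack frame, A = async frame.
std::vector<std::string> walkDiagramDemo() {
    MockNormalFrame s8{"S8", nullptr};
    MockNormalFrame s7{"S7", &s8};
    MockNormalFrame s4{"S4", &s7};
    MockNormalFrame s3{"S3", &s4};
    MockNormalFrame s0{"S0", &s3};

    MockAsyncFrame a6{"A6", nullptr};
    MockAsyncFrame a5{"A5", &a6};
    MockAsyncFrame a2{"A2", nullptr};
    MockAsyncFrame a1{"A1", &a2};

    MockAsyncRoot outerRoot{&s7, &a5, nullptr};
    MockAsyncRoot innerRoot{&s3, &a1, &outerRoot};  // innerRoot is what TLS points to

    return walkMixedStack(&s0, &innerRoot);
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">walkDiagramDemo()</code> yields the frames in the order S0, A1, A2, S3, S4, A5, A6, S7, S8, matching the numbering 0 through 8 in the diagram.</p>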

<h2 id="example">Example</h2>

<p>That covers the core logic of <code class="language-plaintext highlighter-rouge">AsyncStackFrame</code>. folly wraps the whole async-stack reconstruction in this <a href="https://github.com/facebook/folly/blob/main/folly/coro/scripts/co_bt.py">gdb script</a>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">baz</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// ...</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_return</span> <span class="n">baz</span><span class="p">();</span>
<span class="p">}</span>

<span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_await</span> <span class="n">bar</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">CPUThreadPoolExecutor</span> <span class="n">executor</span><span class="p">{</span><span class="mi">1</span><span class="p">};</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">blockingWait</span><span class="p">(</span><span class="n">foo</span><span class="p">().</span><span class="n">scheduleOn</span><span class="p">(</span><span class="o">&amp;</span><span class="n">executor</span><span class="p">));</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For the example above, set a breakpoint in <code class="language-plaintext highlighter-rouge">baz</code> and run <code class="language-plaintext highlighter-rouge">co_bt</code> to get the following call stack:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">co_bt</span>
<span class="cp">#0  0x00005555557dfa61 in baz() () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:376
#1  0x00005555557dfcdf in bar(bar()::_Z3barv.Frame*) [clone .actor] () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:379
#2  0x00005555557e0232 in foo(foo()::_Z3foov.Frame*) [clone .actor] () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:382
#3  0x00005555557ec3fb in std::__n4861::coroutine_handle&lt;void&gt;::resume() const () at /usr/include/c++/13/coroutine:135
#4  0x00005555557e00fa in foo(foo()::_Z3foov.Frame*) [clone .actor] () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:383
</span><span class="p">...</span>
<span class="cp">#7  0x00005555557e03dc in main () at /home/doodle.wang/source/folly/folly/experimental/coro/test/BlockingWaitTest.cpp:388
#8  0x00007ffff782a1ca in ??? () at ???:0
#9  0x00007ffff782a28b in __libc_start_main () at ???:0
#10 0x00005555557da2f5 in _start () at ???:0
</span></code></pre></div></div>

<p>That's a wrap!</p>

<h2 id="reference">Reference</h2>

<p><a href="https://developers.facebook.com/blog/post/2021/09/16/async-stack-traces-folly-Introduction/">Async stack traces in folly: Introduction</a></p>

<p><a href="https://www.youtube.com/watch?v=nHy2cA9ZDbw">Async Stacks: Making Senders and Coroutines Debuggable - Ian Petersen &amp; Jessica Wong - CppCon 2024 - YouTube</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><category term="folly" /><summary type="html"><![CDATA[Coroutines make asynchronous code easier to write, but when debugging, the asynchronous call chain between coroutines cannot be reconstructed. This post looks at how folly::AsyncStackFrame records the call relationships between coroutines. Note that unless stated otherwise, "coroutine" in this post refers to folly::coro::Task.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 6</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-6/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 6" /><published>2025-11-19T00:00:00+08:00</published><updated>2025-11-19T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-6</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-6/"><![CDATA[<p>The rabbit hole keeps getting deeper. This post looks at coroutines and the Sender introduced in C++26.</p>

<h2 id="p2300">P2300</h2>

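<p><a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2300r10.html">P2300</a> has now been confirmed for inclusion in C++26. It is fair to say this proposal sets the direction for how asynchronous code will be written in the future. Here is a brief introduction; it mainly introduces four concepts:</p>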

<ul>
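  <li><code class="language-plaintext highlighter-rouge">Scheduler</code>: represents an execution context, i.e. where a function or task runs. It can be a thread pool, a CPU, and so on.</li>
  <li><code class="language-plaintext highlighter-rouge">Sender</code>: an object that produces a result asynchronously. It can be thought of as a lazy function waiting to be executed.</li>
  <li><code class="language-plaintext highlighter-rouge">Receiver</code>: an object that receives the asynchronous result.</li>
  <li><code class="language-plaintext highlighter-rouge">OperationState</code>: used to start the asynchronous task and to manage its lifetime.</li>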
</ul>

<p>Their interfaces are as follows:</p>

<p><img src="/archive/coroutine-18.png" alt="figure" /></p>

<p>The relationships between these concepts are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Scheduler</code> has a <code class="language-plaintext highlighter-rouge">schedule</code> method that returns a <code class="language-plaintext highlighter-rouge">Sender</code>. Note that the returned <code class="language-plaintext highlighter-rouge">Sender</code> carries no value; it only expresses where subsequent work runs.</li>
  <li>The actual work is attached to a <code class="language-plaintext highlighter-rouge">Sender</code> through interfaces such as <code class="language-plaintext highlighter-rouge">starts_on</code> or <code class="language-plaintext highlighter-rouge">then</code>. Note that calling <code class="language-plaintext highlighter-rouge">starts_on</code> or <code class="language-plaintext highlighter-rouge">then</code> does not start the work: as mentioned above, a <code class="language-plaintext highlighter-rouge">Sender</code> is essentially a lazily executed task. The advantage of laziness is that before any work actually starts, multiple <code class="language-plaintext highlighter-rouge">Sender</code>s can be composed into a DAG.</li>
  <li>The <code class="language-plaintext highlighter-rouge">Receiver</code> interfaces receive the outcome of a <code class="language-plaintext highlighter-rouge">Sender</code>:
    <ul>
      <li><code class="language-plaintext highlighter-rouge">set_value</code>: completed normally, delivers the result</li>
      <li><code class="language-plaintext highlighter-rouge">set_error</code>: failed, delivers the error</li>
      <li><code class="language-plaintext highlighter-rouge">set_stopped</code>: the task was cancelled</li>
    </ul>
  </li>
  <li>Given a <code class="language-plaintext highlighter-rouge">Sender</code> and a <code class="language-plaintext highlighter-rouge">Receiver</code>, <code class="language-plaintext highlighter-rouge">connect</code> ties the two together, i.e. it tells the <code class="language-plaintext highlighter-rouge">Sender</code> to deliver its result or error to the given <code class="language-plaintext highlighter-rouge">Receiver</code> once it completes. <code class="language-plaintext highlighter-rouge">connect</code> returns an <code class="language-plaintext highlighter-rouge">OperationState</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">OperationState</code> is responsible for starting the asynchronous task and holding the state of the asynchronous operation.</li>
</ul>

<p><img src="/archive/coroutine-19.png" alt="figure" /></p>
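
<p>The interplay of sender, receiver, <code class="language-plaintext highlighter-rouge">connect</code> and <code class="language-plaintext highlighter-rouge">start</code> can be shown with a toy model in plain C++. This is only a sketch of the idea, not the P2300 or <code class="language-plaintext highlighter-rouge">stdexec</code> API: <code class="language-plaintext highlighter-rouge">JustSender</code>, <code class="language-plaintext highlighter-rouge">ThenSender</code>, <code class="language-plaintext highlighter-rouge">RecordingReceiver</code> and <code class="language-plaintext highlighter-rouge">runSquarePipeline</code> are hypothetical names, values are restricted to <code class="language-plaintext highlighter-rouge">int</code>, and there is no scheduler, so everything runs inline on the calling thread.</p>

```cpp
#include <cassert>
#include <exception>
#include <utility>

// Toy receiver: records which completion channel fired.
struct RecordingReceiver {
    int* value;
    bool* stopped;
    void set_value(int v) { *value = v; }
    void set_error(std::exception_ptr) {}
    void set_stopped() { *stopped = true; }
};

// Operation state of JustSender: nothing happens until start() is called.
template <class R>
struct JustOperation {
    int value;
    R receiver;
    void start() { receiver.set_value(value); }
};

// Toy sender that completes with a fixed int.
struct JustSender {
    int value;
    template <class R>
    JustOperation<R> connect(R receiver) const {
        return {value, std::move(receiver)};
    }
};

// Receiver adaptor used by ThenSender: applies fn, then forwards downstream.
template <class R, class F>
struct ThenReceiver {
    R next;
    F fn;
    void set_value(int v) {
        int result = 0;
        try {
            result = fn(v);
        } catch (...) {
            next.set_error(std::current_exception());
            return;
        }
        next.set_value(result);
    }
    void set_error(std::exception_ptr e) { next.set_error(e); }
    void set_stopped() { next.set_stopped(); }
};

// Toy sender that wraps another sender and a function.
template <class S, class F>
struct ThenSender {
    S upstream;
    F fn;
    template <class R>
    auto connect(R receiver) const {
        return upstream.connect(ThenReceiver<R, F>{std::move(receiver), fn});
    }
};

template <class S, class F>
ThenSender<S, F> then(S sender, F fn) {
    return {std::move(sender), std::move(fn)};
}

// Builds just(input) | then(square), connects it, and only then starts it.
// *ranBeforeStart reports whether the lambda had already run before start().
int runSquarePipeline(int input, bool* ranBeforeStart) {
    bool called = false;
    int result = -1;
    bool stopped = false;
    auto sender = then(JustSender{input}, [&called](int x) {
        called = true;
        return x * x;
    });
    auto op = sender.connect(RecordingReceiver{&result, &stopped});
    *ranBeforeStart = called;  // building and connecting performs no work
    op.start();                // the work happens here
    return result;
}
```

<p>In this toy, <code class="language-plaintext highlighter-rouge">runSquarePipeline(7, &amp;early)</code> returns 49 and leaves <code class="language-plaintext highlighter-rouge">early</code> set to <code class="language-plaintext highlighter-rouge">false</code>: composing and connecting senders only builds a description of the work, and the operation state decides when it actually runs.</p>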

<h3 id="why-we-use-senders">Why do we use senders?</h3>

<p><code class="language-plaintext highlighter-rouge">stdexec</code> has already demonstrated that P2300 is feasible. You can try out <a href="https://godbolt.org/z/3cseorf7M">here</a> what <code class="language-plaintext highlighter-rouge">Sender</code>s can do for asynchronous tasks:</p>

<ul>
  <li>An asynchronous task can specify which <code class="language-plaintext highlighter-rouge">Scheduler</code> it runs on</li>
  <li>Multiple asynchronous tasks can be chained, or even organized into a DAG</li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdexec/execution.hpp&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;exec/static_thread_pool.hpp&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
    <span class="c1">// Declare a pool of 3 worker threads:</span>
    <span class="n">exec</span><span class="o">::</span><span class="n">static_thread_pool</span> <span class="n">pool</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span>

    <span class="c1">// Get a handle to the thread pool:</span>
    <span class="k">auto</span> <span class="n">sched</span> <span class="o">=</span> <span class="n">pool</span><span class="p">.</span><span class="n">get_scheduler</span><span class="p">();</span>

    <span class="c1">// Describe some work:</span>
    <span class="c1">// Creates 3 sender pipelines that are executed concurrently by passing to `when_all`</span>
    <span class="c1">// Each sender is scheduled on `sched` using `starts_on` and starts with `just(n)` that creates a</span>
    <span class="c1">// Sender that just forwards `n` to the next sender.</span>
    <span class="c1">// After `just(n)`, we chain `then(fun)` which invokes `fun` using the value provided from `just()`</span>
    <span class="c1">// Note: No work actually happens here. Everything is lazy and `work` is just an object that statically</span>
    <span class="c1">// represents the work to later be executed</span>
    <span class="k">auto</span> <span class="n">fun</span> <span class="o">=</span> <span class="p">[](</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="n">i</span><span class="o">*</span><span class="n">i</span><span class="p">;</span> <span class="p">};</span>
    <span class="k">auto</span> <span class="n">work</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">when_all</span><span class="p">(</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">sched</span><span class="p">,</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">just</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="o">|</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">then</span><span class="p">(</span><span class="n">fun</span><span class="p">)),</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">sched</span><span class="p">,</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">just</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">|</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">then</span><span class="p">(</span><span class="n">fun</span><span class="p">)),</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">sched</span><span class="p">,</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">just</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="o">|</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">then</span><span class="p">(</span><span class="n">fun</span><span class="p">))</span>
    <span class="p">);</span>

    <span class="c1">// Launch the work and wait for the result</span>
    <span class="k">auto</span> <span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">sync_wait</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">work</span><span class="p">)).</span><span class="n">value</span><span class="p">();</span>

    <span class="c1">// Print the results:</span>
    <span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"%d %d %d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

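<p>But in fact, when I first read introductions to this proposal, my first reaction was: why do we need this? (Reportedly the proposal was also highly controversial when the standards committee voted on it.) I had doubts on several fronts:</p>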

<ol>
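  <li>The standard library already has promise/future, not to mention that folly's Promise/Future can also be composed into a DAG.</li>
  <li>The standard library already has coroutines. Weren't coroutines supposed to be the way to write asynchronous code from C++20 on?</li>
  <li>Why does P2300 introduce so many new concepts? Are they really necessary?</li>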
</ol>

<p>On these questions I recommend reading this <a href="https://ericniebler.com/2024/02/04/what-are-senders-good-for-anyway/">article</a>. Here is my own brief take:</p>

<ol>
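  <li>In P2300, algorithms such as <code class="language-plaintext highlighter-rouge">then</code> and <code class="language-plaintext highlighter-rouge">when_all</code> chain Senders together to build very complex tasks, which is indeed similar to folly's Promise/Future. A key difference is that a Sender is a lazy task: it separates task scheduling from task execution, which makes it more flexible.</li>
  <li>P2300's design follows <a href="https://www.youtube.com/watch?v=1Wy5sq3s2rg">Structured Concurrency</a> and integrates seamlessly with coroutines (we will illustrate this with real code later). It guarantees that a parent coroutine always finishes after its child coroutines, which avoids the resource-management problems that come with shared state.</li>
  <li>P2300 gives all asynchronous tasks a common abstraction. Different libraries may implement asynchronous tasks differently (callbacks, futures, and so on); P2300 lets them all be expressed as <code class="language-plaintext highlighter-rouge">Sender</code>/<code class="language-plaintext highlighter-rouge">Receiver</code>, so that asynchronous tasks from different libraries can be chained together.</li>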
</ol>

<p>This topic is too broad to go into further here; interested readers can follow the linked articles. Our focus remains coroutines and <code class="language-plaintext highlighter-rouge">Sender</code>.</p>

<h2 id="use-coroutine-as-sender">Use coroutine as sender</h2>

<p>In fact, coroutines and Senders do not replace each other. Rather, Senders strengthen coroutines in several ways:</p>

<ul>
  <li>A coroutine can be used as a Sender, and therefore with all the algorithms Senders provide:
    <ul>
      <li>then/let_value/let_error/let_stopped</li>
      <li>when_all</li>
      <li>repeat_effect/retry</li>
      <li>schedule_after</li>
      <li>…</li>
    </ul>
  </li>
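  <li>Compared with other styles of asynchronous code, coroutines already take a big step toward Structured Concurrency, but none of the constraints are enforced: a coroutine can, for example, spawn a task and return without waiting for it to finish, so lifetime management has to be guaranteed manually by the programmer. P2300, by contrast, covers the requirements of Structured Concurrency across the board:
    <ul>
      <li>Nested lifetimes: a child task's lifetime is strictly contained within its parent's</li>
      <li>Error propagation: a child task's exception propagates correctly to the parent task</li>
      <li>Resource cleanup: all resources are cleaned up at the right time</li>
      <li>Cancellation support: cancelling the parent task propagates to all child tasks</li>
    </ul>

    <p>I may cover how this works in a separate post when I get the chance.</p>
  </li>
  <li>On performance, the blogs of Eric Niebler and Lewis Baker both mention that when a coroutine is used as a Sender, the coroutine frame no longer needs to be dynamically allocated. I haven't dug into this.</li>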
</ul>

<p>The simple example below shows how to use a coroutine as a <code class="language-plaintext highlighter-rouge">Sender</code>. The <code class="language-plaintext highlighter-rouge">SimpleTask</code> in the code is the <a href="https://godbolt.org/z/n896xMrdG">coroutine</a> from the earlier post on symmetric transfer, unmodified.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;coroutine&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;utility&gt;</span><span class="cp">
</span>
<span class="cp">#include</span> <span class="cpf">&lt;exec/static_thread_pool.hpp&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;stdexec/execution.hpp&gt;</span><span class="cp">
</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">SimpleTask</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
        <span class="n">SimpleTask</span> <span class="n">get_return_object</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">return</span> <span class="n">SimpleTask</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">)};</span>
        <span class="p">}</span>

        <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="nf">initial_suspend</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">return</span> <span class="p">{};</span>
        <span class="p">}</span>

        <span class="k">struct</span> <span class="nc">FinalAwaiter</span> <span class="p">{</span>
            <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
                <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
            <span class="p">}</span>

            <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span>
                    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
                <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
                <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
                    <span class="k">return</span> <span class="n">continuation</span><span class="p">;</span>
                <span class="p">}</span>
                <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">noop_coroutine</span><span class="p">();</span>
            <span class="p">}</span>

            <span class="kt">void</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{}</span>
        <span class="p">};</span>

        <span class="n">FinalAwaiter</span> <span class="n">final_suspend</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="k">return</span> <span class="p">{};</span>
        <span class="p">}</span>

        <span class="kt">void</span> <span class="nf">return_value</span><span class="p">(</span><span class="n">T</span> <span class="n">v</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">value_</span> <span class="o">=</span> <span class="n">v</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="kt">void</span> <span class="nf">unhandled_exception</span><span class="p">()</span> <span class="p">{</span>
            <span class="n">exception_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">current_exception</span><span class="p">();</span>
        <span class="p">}</span>

        <span class="n">T</span> <span class="n">value_</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span> <span class="n">exception_</span><span class="p">{};</span>
        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation_</span><span class="p">{};</span>
    <span class="p">};</span>

    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle_</span><span class="p">;</span>

    <span class="k">explicit</span> <span class="nf">SimpleTask</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>

    <span class="o">~</span><span class="n">SimpleTask</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">handle_</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">();</span>
        <span class="p">}</span>
    <span class="p">}</span>

    <span class="n">SimpleTask</span><span class="p">(</span><span class="k">const</span> <span class="n">SimpleTask</span><span class="o">&amp;</span><span class="p">)</span> <span class="o">=</span> <span class="k">delete</span><span class="p">;</span>
    <span class="n">SimpleTask</span><span class="p">(</span><span class="n">SimpleTask</span><span class="o">&amp;&amp;</span> <span class="n">other</span><span class="p">)</span> <span class="k">noexcept</span> <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">other</span><span class="p">.</span><span class="n">handle_</span><span class="p">,</span> <span class="p">{}))</span> <span class="p">{}</span>

    <span class="k">struct</span> <span class="nc">Awaiter</span> <span class="p">{</span>
        <span class="k">explicit</span> <span class="n">Awaiter</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>

        <span class="o">~</span><span class="n">Awaiter</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">handle_</span><span class="p">)</span> <span class="p">{</span>
                <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">();</span>
            <span class="p">}</span>
        <span class="p">}</span>

        <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
        <span class="p">}</span>

        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
            <span class="k">return</span> <span class="n">handle_</span><span class="p">;</span>
        <span class="p">}</span>

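        <span class="n">T</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span>
            <span class="c1">// Rethrow an exception captured by unhandled_exception() instead of returning an unset value</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">exception_</span><span class="p">)</span> <span class="p">{</span>
                <span class="n">std</span><span class="o">::</span><span class="n">rethrow_exception</span><span class="p">(</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">exception_</span><span class="p">);</span>
            <span class="p">}</span>
            <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value_</span><span class="p">);</span>
        <span class="p">}</span>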

        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle_</span><span class="p">{};</span>
    <span class="p">};</span>

    <span class="n">Awaiter</span> <span class="k">operator</span> <span class="k">co_await</span><span class="p">()</span> <span class="o">&amp;&amp;</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">Awaiter</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">handle_</span><span class="p">,</span> <span class="p">{})};</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">callee</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_return</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>

<span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="p">)</span> <span class="p">{</span>
    <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">callee</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"caller: "</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">result</span><span class="o">*</span> <span class="n">i</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">struct</span> <span class="nc">DummyReceiver</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">receiver_concept</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">receiver_t</span><span class="p">;</span>

    <span class="kt">void</span> <span class="n">set_value</span><span class="p">(</span><span class="kt">int</span> <span class="n">v</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"DummyReceiver set_value as "</span> <span class="o">&lt;&lt;</span> <span class="n">v</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="kt">void</span> <span class="nf">set_error</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
    <span class="p">}</span>
    <span class="kt">void</span> <span class="nf">set_stopped</span><span class="p">()</span> <span class="o">&amp;&amp;</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">terminate</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Sender/Receiver style"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="c1">// The caller coroutine will be implicitly co_awaited in main thread</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"main: "</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">this_thread</span><span class="o">::</span><span class="n">get_id</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">sender</span> <span class="k">auto</span> <span class="n">sender</span> <span class="o">=</span> <span class="n">caller</span><span class="p">(</span><span class="mi">2</span><span class="p">);</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">receiver</span> <span class="k">auto</span> <span class="n">receiver</span> <span class="o">=</span> <span class="n">DummyReceiver</span><span class="p">{};</span>
        <span class="c1">// Connect the sender and receiver</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">operation_state</span> <span class="k">auto</span> <span class="n">op</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">connect</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">sender</span><span class="p">),</span> <span class="n">receiver</span><span class="p">);</span>
        <span class="c1">// Start the operation; here it runs inline on the calling (main) thread</span>
        <span class="n">stdexec</span><span class="o">::</span><span class="n">start</span><span class="p">(</span><span class="n">op</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Asynchronously executed in thread pool"</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
        <span class="n">exec</span><span class="o">::</span><span class="n">static_thread_pool</span> <span class="n">pool</span><span class="p">{</span><span class="mi">3</span><span class="p">};</span>
        <span class="k">auto</span> <span class="n">scheduler</span> <span class="o">=</span> <span class="n">pool</span><span class="p">.</span><span class="n">get_scheduler</span><span class="p">();</span>
        <span class="c1">// `starts_on` returns a sender, `when_all` returns a sender too</span>
        <span class="k">auto</span> <span class="n">work</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">when_all</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">scheduler</span><span class="p">,</span> <span class="n">caller</span><span class="p">(</span><span class="mi">1</span><span class="p">)),</span>
                                      <span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">scheduler</span><span class="p">,</span> <span class="n">caller</span><span class="p">(</span><span class="mi">2</span><span class="p">)),</span>
                                      <span class="n">stdexec</span><span class="o">::</span><span class="n">starts_on</span><span class="p">(</span><span class="n">scheduler</span><span class="p">,</span> <span class="n">caller</span><span class="p">(</span><span class="mi">3</span><span class="p">)));</span>
        <span class="c1">// `sync_wait` blocks until all senders complete; it supplies its own internal receiver</span>
        <span class="k">auto</span> <span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">j</span><span class="p">,</span> <span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">sync_wait</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">work</span><span class="p">)).</span><span class="n">value</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">}</span>

</code></pre></div></div>

<p>Notice that <code class="language-plaintext highlighter-rouge">SimpleTask</code> does not implement any of the <code class="language-plaintext highlighter-rouge">Sender</code> interfaces, yet it can be used as a <code class="language-plaintext highlighter-rouge">Sender</code>. Why? The reason is that <code class="language-plaintext highlighter-rouge">stdexec</code> checks whether <code class="language-plaintext highlighter-rouge">SimpleTask</code> satisfies <code class="language-plaintext highlighter-rouge">Awaitable</code>, that is, whether it can be <code class="language-plaintext highlighter-rouge">co_await</code>ed. If it can, <code class="language-plaintext highlighter-rouge">__connect_awaitable_t</code> in <code class="language-plaintext highlighter-rouge">stdexec</code> adapts <code class="language-plaintext highlighter-rouge">SimpleTask</code> so that it behaves like a Sender. The flow is as follows:</p>

<ol>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">operator()</code> of <code class="language-plaintext highlighter-rouge">__connect_awaitable_t</code> accepts any matching pair of <code class="language-plaintext highlighter-rouge">Awaitable</code> and <code class="language-plaintext highlighter-rouge">Receiver</code>:</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">_Receiver</span><span class="p">,</span> <span class="n">__awaitable</span><span class="o">&lt;</span><span class="n">__promise_t</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="p">&gt;</span><span class="o">&gt;</span> <span class="n">_Awaitable</span><span class="o">&gt;</span>
 <span class="k">requires</span> <span class="n">receiver_of</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="p">,</span> <span class="n">__completions_t</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="p">,</span> <span class="n">_Awaitable</span><span class="o">&gt;&gt;</span>
 <span class="k">auto</span> <span class="nf">operator</span><span class="p">()(</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;</span> <span class="n">__awaitable</span><span class="p">,</span> <span class="n">_Receiver</span> <span class="n">__rcvr</span><span class="p">)</span> <span class="k">const</span> <span class="o">-&gt;</span> <span class="n">__operation_t</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&gt;</span> <span class="p">{</span>
     <span class="k">return</span> <span class="n">__co_impl</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__awaitable</span><span class="p">),</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">));</span>
 <span class="p">}</span>
</code></pre></div>    </div>

    <p>Here the constraint <code class="language-plaintext highlighter-rouge">requires receiver_of&lt;_Receiver, __completions_t&lt;_Receiver, _Awaitable&gt;&gt;</code> checks whether the <code class="language-plaintext highlighter-rouge">Awaitable</code> and the <code class="language-plaintext highlighter-rouge">Receiver</code> match:</p>

    <p><code class="language-plaintext highlighter-rouge">__completions_t&lt;_Receiver, _Awaitable&gt;</code> declares which methods of the <code class="language-plaintext highlighter-rouge">Receiver</code> the <code class="language-plaintext highlighter-rouge">Awaitable</code> may call when it acts as a <code class="language-plaintext highlighter-rouge">Sender</code>, and <code class="language-plaintext highlighter-rouge">receiver_of</code> checks that the <code class="language-plaintext highlighter-rouge">Receiver</code> can handle those calls. In other words, <code class="language-plaintext highlighter-rouge">completion_signatures</code> is a contract between the <code class="language-plaintext highlighter-rouge">Sender</code> and the <code class="language-plaintext highlighter-rouge">Receiver</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">using</span> <span class="n">__completions_t</span> <span class="o">=</span> <span class="n">completion_signatures</span><span class="o">&lt;</span>
   <span class="n">set_value_t</span><span class="p">()</span> <span class="n">or</span> <span class="n">set_value_t</span><span class="p">(</span><span class="n">T</span><span class="p">),</span> <span class="c1">// depending on the result type of the Awaitable</span>
   <span class="n">set_error_t</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span><span class="p">),</span>
   <span class="n">set_stopped_t</span><span class="p">()</span>
 <span class="o">&gt;</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">__co_impl</code> is implemented as follows. In essence, it simply <code class="language-plaintext highlighter-rouge">co_await</code>s the <code class="language-plaintext highlighter-rouge">Awaitable</code> object passed in.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">static</span> <span class="k">auto</span> <span class="n">__co_impl</span><span class="p">(</span><span class="n">_Awaitable</span> <span class="n">__awaitable</span><span class="p">,</span> <span class="n">_Receiver</span> <span class="n">__rcvr</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">__operation_t</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&gt;</span> <span class="p">{</span>
     <span class="k">using</span> <span class="n">__result_t</span> <span class="o">=</span> <span class="n">__await_result_t</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="p">,</span> <span class="n">__promise_t</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&gt;&gt;</span><span class="p">;</span>
     <span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span> <span class="n">__eptr</span><span class="p">;</span>
     <span class="n">STDEXEC_TRY</span> <span class="p">{</span>
         <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">same_as</span><span class="o">&lt;</span><span class="n">__result_t</span><span class="p">,</span> <span class="kt">void</span><span class="o">&gt;</span><span class="p">)</span>
             <span class="k">co_await</span> <span class="p">(</span><span class="k">co_await</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__awaitable</span><span class="p">),</span>
                       <span class="n">__co_call</span><span class="p">(</span><span class="n">set_value</span><span class="p">,</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">)));</span>
         <span class="k">else</span>
             <span class="k">co_await</span> <span class="n">__co_call</span><span class="p">(</span><span class="n">set_value</span><span class="p">,</span>
                                <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">),</span>
                                <span class="k">co_await</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__awaitable</span><span class="p">));</span>
     <span class="p">}</span>
     <span class="n">STDEXEC_CATCH_ALL</span> <span class="p">{</span>
         <span class="n">__eptr</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">current_exception</span><span class="p">();</span>
     <span class="p">}</span>
     <span class="k">co_await</span> <span class="nf">__co_call</span><span class="p">(</span><span class="n">set_error</span><span class="p">,</span>
                        <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">),</span>
                        <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__eptr</span><span class="p">));</span>
 <span class="p">}</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>When the coroutine finishes, the result of the <code class="language-plaintext highlighter-rouge">Awaitable</code> is obtained and passed to <code class="language-plaintext highlighter-rouge">set_value</code>. If an exception is thrown, <code class="language-plaintext highlighter-rouge">set_error</code> is called instead.</p>
  </li>
</ol>
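<p>The contract from step 1 can be sketched in isolation. The block below is only an illustration and is not the <code class="language-plaintext highlighter-rouge">stdexec</code> implementation: everything in the <code class="language-plaintext highlighter-rouge">sketch</code> namespace (the tag types, <code class="language-plaintext highlighter-rouge">handles</code>, <code class="language-plaintext highlighter-rouge">receiver_of_v</code>, and the two receivers) is a hypothetical stand-in, and it uses C++17 type traits instead of the C++20 concepts that <code class="language-plaintext highlighter-rouge">stdexec</code> relies on.</p>

```cpp
#include <exception>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical stand-ins for the stdexec tag types. Each one is callable only
// when the receiver has a member function of the same name.
struct set_value_t {
    template <class R, class... As>
    auto operator()(R&& r, As&&... as) const noexcept
        -> decltype(std::forward<R>(r).set_value(std::forward<As>(as)...)) {
        return std::forward<R>(r).set_value(std::forward<As>(as)...);
    }
};
struct set_error_t {
    template <class R, class E>
    auto operator()(R&& r, E&& e) const noexcept
        -> decltype(std::forward<R>(r).set_error(std::forward<E>(e))) {
        return std::forward<R>(r).set_error(std::forward<E>(e));
    }
};
struct set_stopped_t {
    template <class R>
    auto operator()(R&& r) const noexcept
        -> decltype(std::forward<R>(r).set_stopped()) {
        return std::forward<R>(r).set_stopped();
    }
};

// A list of signatures of the form Tag(Args...), e.g. set_value_t(int).
template <class... Sigs>
struct completion_signatures {};

// One signature Tag(Args...) is handled when Tag can be called with
// (Receiver, Args...).
template <class R, class Sig>
struct handles : std::false_type {};
template <class R, class Tag, class... Args>
struct handles<R, Tag(Args...)> : std::is_invocable<Tag, R, Args...> {};

// The receiver satisfies the contract when it handles every signature.
template <class R, class Sigs>
struct receiver_of : std::false_type {};
template <class R, class... Sigs>
struct receiver_of<R, completion_signatures<Sigs...>>
    : std::conjunction<handles<R, Sigs>...> {};

template <class R, class Sigs>
inline constexpr bool receiver_of_v = receiver_of<R, Sigs>::value;

// What an awaitable producing an int may complete with.
using int_completions = completion_signatures<set_value_t(int),
                                              set_error_t(std::exception_ptr),
                                              set_stopped_t()>;

struct FullReceiver {
    void set_value(int) && noexcept {}
    void set_error(std::exception_ptr) && noexcept {}
    void set_stopped() && noexcept {}
};

struct ValueOnlyReceiver {
    void set_value(int) && noexcept {}
};

// The mismatch is reported at compile time, before anything runs.
static_assert(receiver_of_v<FullReceiver, int_completions>);
static_assert(!receiver_of_v<ValueOnlyReceiver, int_completions>);

}  // namespace sketch
```

<p>In this sketch, <code class="language-plaintext highlighter-rouge">ValueOnlyReceiver</code> is rejected because it cannot handle the error and stopped signatures. This is the same reason the <code class="language-plaintext highlighter-rouge">DummyReceiver</code> in the example provides all three member functions.</p>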

<p>Taking <code class="language-plaintext highlighter-rouge">set_value</code> as an example, the core code in <code class="language-plaintext highlighter-rouge">__co_impl</code> is:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">co_await</span> <span class="nf">__co_call</span><span class="p">(</span><span class="n">set_value</span><span class="p">,</span>
                   <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">),</span>
                   <span class="k">co_await</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__awaitable</span><span class="p">));</span>
</code></pre></div></div>

<p>This is essentially equivalent to:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">auto</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__awaitable</span><span class="p">);</span>
<span class="n">set_value</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">receiver</span><span class="p">),</span> <span class="n">result</span><span class="p">);</span>
</code></pre></div></div>

<p>That is, it first <code class="language-plaintext highlighter-rouge">co_await</code>s the <code class="language-plaintext highlighter-rouge">Awaitable</code> passed in (here, <code class="language-plaintext highlighter-rouge">SimpleTask</code>) to obtain its result, then calls <code class="language-plaintext highlighter-rouge">receiver.set_value(result)</code> to hand the result to the Receiver.</p>
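<p>The same "value or error" shape can be sketched without coroutines. This is a rough analogy under stated assumptions, not what <code class="language-plaintext highlighter-rouge">stdexec</code> does internally: <code class="language-plaintext highlighter-rouge">run_and_complete</code> and <code class="language-plaintext highlighter-rouge">RecordingReceiver</code> are hypothetical names, and a plain callable plays the role of the awaited <code class="language-plaintext highlighter-rouge">Awaitable</code>.</p>

```cpp
#include <exception>
#include <stdexcept>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical helper: runs `work`, then completes `receiver` exactly once,
// either on the value channel or on the error channel.
template <class Work, class Receiver>
void run_and_complete(Work&& work, Receiver receiver) noexcept {
    using result_t = std::invoke_result_t<Work>;
    static_assert(!std::is_void_v<result_t>,
                  "this sketch only covers work that returns a value");
    std::exception_ptr error;
    try {
        // Plays the role of `auto result = co_await awaitable;`
        result_t result = std::forward<Work>(work)();
        // Value channel: mirrors `set_value(std::move(receiver), result)`.
        std::move(receiver).set_value(std::move(result));
        return;
    } catch (...) {
        // Capture the exception; it is delivered outside the try block.
        error = std::current_exception();
    }
    // Error channel: mirrors the trailing `set_error` call in __co_impl.
    std::move(receiver).set_error(std::move(error));
}

// Hypothetical receiver that records which channel was used.
struct RecordingReceiver {
    int* value;
    bool* failed;

    void set_value(int v) && noexcept { *value = v; }
    void set_error(std::exception_ptr) && noexcept { *failed = true; }
    void set_stopped() && noexcept {}
};

}  // namespace sketch
```

<p>Calling <code class="language-plaintext highlighter-rouge">sketch::run_and_complete</code> with a callable that returns 42 stores 42 through the receiver, while a callable that throws only sets the failure flag. Exactly one completion function is invoked in each case.</p>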

<p>Looking back at the whole example, no extra code was added to <code class="language-plaintext highlighter-rouge">SimpleTask</code>. So how does <code class="language-plaintext highlighter-rouge">stdexec</code> recognize <code class="language-plaintext highlighter-rouge">SimpleTask</code> and use it as a <code class="language-plaintext highlighter-rouge">Sender</code>?</p>

<h2 id="cpo">CPO</h2>

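<p>To answer this question, let's start with CPOs (Customization Point Objects). A CPO is a special kind of function object introduced in C++20. It addresses the problem of letting users supply custom behavior in a safe and extensible way.</p>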

<p>Before C++20, customization mainly relied on:</p>

<ol>
  <li>Template specialization</li>
  <li>ADL (Argument-Dependent Lookup)</li>
</ol>

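<p>Template specialization is straightforward. Extending <code class="language-plaintext highlighter-rouge">std::hash</code> for a user-defined type, for example, is done through template specialization. A common example of ADL is:</p>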

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">foo</span><span class="p">(</span><span class="k">auto</span><span class="o">&amp;</span> <span class="n">a</span><span class="p">,</span> <span class="k">auto</span><span class="o">&amp;</span> <span class="n">b</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">swap</span><span class="p">;</span>
    <span class="n">swap</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">);</span> <span class="c1">// ADL</span>
<span class="p">}</span>
</code></pre></div></div>

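<p>Without <code class="language-plaintext highlighter-rouge">using std::swap</code>, the compiler looks for <code class="language-plaintext highlighter-rouge">swap</code> only in the enclosing scopes and in the namespaces associated with the arguments. With <code class="language-plaintext highlighter-rouge">using std::swap</code>, <code class="language-plaintext highlighter-rouge">std::swap</code> is also brought into the current scope and takes part in overload resolution, so when no custom <code class="language-plaintext highlighter-rouge">swap</code> is found, <code class="language-plaintext highlighter-rouge">std::swap</code> is selected as the fallback. ADL-based customization therefore rests on function overloading, and it is fragile. A caller who omits the <code class="language-plaintext highlighter-rouge">using</code> declaration loses the fallback, and a caller who writes the qualified <code class="language-plaintext highlighter-rouge">std::swap(a, b)</code> bypasses the custom overload. It is also easy to end up with an ambiguous call, or with an unexpected overload being selected.</p>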
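<p>The difference between the two call styles can be made visible with a small sketch. The names <code class="language-plaintext highlighter-rouge">lib::Widget</code>, <code class="language-plaintext highlighter-rouge">custom_swap_calls</code>, <code class="language-plaintext highlighter-rouge">two_step_swap</code> and <code class="language-plaintext highlighter-rouge">qualified_swap</code> are made up for this illustration. The counter exists only to show which overload was chosen.</p>

```cpp
#include <utility>

namespace lib {

struct Widget {
    int id;
};

// Counts how often the custom overload below is selected (demo only).
int custom_swap_calls = 0;

// Custom swap, placed in Widget's namespace so that ADL can find it.
void swap(Widget& a, Widget& b) noexcept {
    ++custom_swap_calls;
    std::swap(a.id, b.id);
}

}  // namespace lib

// The "two-step" idiom: std::swap is the fallback, ADL finds custom overloads.
template <class T>
void two_step_swap(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

// A qualified call disables ADL, so lib::swap is never considered.
template <class T>
void qualified_swap(T& a, T& b) {
    std::swap(a, b);
}
```

<p>With two <code class="language-plaintext highlighter-rouge">lib::Widget</code> objects, <code class="language-plaintext highlighter-rouge">two_step_swap</code> increments the counter while <code class="language-plaintext highlighter-rouge">qualified_swap</code> leaves it untouched, even though both exchange the values. For <code class="language-plaintext highlighter-rouge">int</code> arguments, <code class="language-plaintext highlighter-rouge">two_step_swap</code> falls back to <code class="language-plaintext highlighter-rouge">std::swap</code>.</p>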

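<p>A CPO, by contrast, is a function object whose <code class="language-plaintext highlighter-rouge">operator()</code> behavior is resolved at compile time, which avoids accidental overloads. Concretely, the standard library or any other library can overload <code class="language-plaintext highlighter-rouge">operator()</code>, or use <code class="language-plaintext highlighter-rouge">if constexpr</code>, to spell out how users are expected to customize the behavior.</p>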
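<p>As a minimal sketch of the idea, the block below defines a hypothetical CPO named <code class="language-plaintext highlighter-rouge">sketch::to_text</code>. It does not come from the standard library or from <code class="language-plaintext highlighter-rouge">stdexec</code>. It picks its behavior with <code class="language-plaintext highlighter-rouge">if constexpr</code>, and uses the C++17 detection idiom where C++20 code would more likely use a concept.</p>

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Detection idiom: true when `t.to_text()` is a valid expression.
template <class T, class = void>
struct has_member_to_text : std::false_type {};

template <class T>
struct has_member_to_text<
    T, std::void_t<decltype(std::declval<const T&>().to_text())>>
    : std::true_type {};

// The type of the customization point object.
struct to_text_fn {
    template <class T>
    std::string operator()(const T& value) const {
        if constexpr (has_member_to_text<T>::value) {
            // Customization: the user's type provides a member function.
            return value.to_text();
        } else if constexpr (std::is_arithmetic_v<T>) {
            // Built-in default for arithmetic types.
            return std::to_string(value);
        } else {
            static_assert(has_member_to_text<T>::value,
                          "to_text requires a to_text() member or an arithmetic type");
            return {};
        }
    }
};

// The CPO itself: an object, not a function, so calling it never triggers ADL.
inline constexpr to_text_fn to_text{};

// A user type that opts in by providing the member function.
struct Point {
    int x;
    int y;

    std::string to_text() const {
        return "(" + std::to_string(x) + ", " + std::to_string(y) + ")";
    }
};

}  // namespace sketch
```

<p>Callers always write <code class="language-plaintext highlighter-rouge">sketch::to_text(value)</code>. Because <code class="language-plaintext highlighter-rouge">to_text</code> is an object, it can also be passed to other functions as a value, which is how <code class="language-plaintext highlighter-rouge">set_value</code> is handed to <code class="language-plaintext highlighter-rouge">__co_call</code> in the code shown earlier.</p>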

<p>As mentioned earlier, <code class="language-plaintext highlighter-rouge">stdexec</code> will:</p>

<ol>
  <li>Wrap <code class="language-plaintext highlighter-rouge">SimpleTask</code> as a <code class="language-plaintext highlighter-rouge">Sender</code> through <code class="language-plaintext highlighter-rouge">connect</code></li>
  <li>Pass the result of the <code class="language-plaintext highlighter-rouge">Awaitable</code> to the <code class="language-plaintext highlighter-rouge">Receiver</code> through <code class="language-plaintext highlighter-rouge">set_value</code></li>
</ol>

<p>Both of these behaviors are implemented as CPOs. Let's take <code class="language-plaintext highlighter-rouge">set_value</code> as an example and walk through the <code class="language-plaintext highlighter-rouge">stdexec</code> code to understand CPOs.</p>

<p>First, <code class="language-plaintext highlighter-rouge">set_value</code> is actually a function object whose type is <code class="language-plaintext highlighter-rouge">set_value_t</code>. The other two <code class="language-plaintext highlighter-rouge">Receiver</code> interfaces, <code class="language-plaintext highlighter-rouge">set_error</code> and <code class="language-plaintext highlighter-rouge">set_stopped</code>, work the same way.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">using</span> <span class="n">__rcvrs</span><span class="o">::</span><span class="n">set_value_t</span><span class="p">;</span>
  <span class="k">using</span> <span class="n">__rcvrs</span><span class="o">::</span><span class="n">set_error_t</span><span class="p">;</span>
  <span class="k">using</span> <span class="n">__rcvrs</span><span class="o">::</span><span class="n">set_stopped_t</span><span class="p">;</span>
  <span class="kr">inline</span> <span class="k">constexpr</span> <span class="n">set_value_t</span> <span class="n">set_value</span><span class="p">{};</span>
  <span class="kr">inline</span> <span class="k">constexpr</span> <span class="n">set_error_t</span> <span class="n">set_error</span><span class="p">{};</span>
  <span class="kr">inline</span> <span class="k">constexpr</span> <span class="n">set_stopped_t</span> <span class="n">set_stopped</span><span class="p">{};</span>
</code></pre></div></div>

<p>The code of <code class="language-plaintext highlighter-rouge">set_value_t</code> is as follows:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">_Receiver</span><span class="p">,</span> <span class="k">class</span><span class="o">...</span> <span class="nc">_As</span><span class="p">&gt;</span>
<span class="k">concept</span> <span class="n">__set_value_member</span> <span class="o">=</span> <span class="k">requires</span><span class="p">(</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;</span><span class="n">__rcvr</span><span class="p">,</span> <span class="n">_As</span> <span class="o">&amp;&amp;</span><span class="p">...</span><span class="n">__args</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">).</span><span class="n">set_value</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_As</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__args</span><span class="p">)...);</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">set_value_t</span> <span class="p">{</span>
    <span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">_Fn</span><span class="p">,</span> <span class="k">class</span><span class="o">...</span> <span class="nc">_As</span><span class="p">&gt;</span>
    <span class="k">using</span> <span class="n">__f</span> <span class="o">=</span> <span class="n">__minvoke</span><span class="o">&lt;</span><span class="n">_Fn</span><span class="p">,</span> <span class="n">_As</span><span class="p">...</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="c1">// Receiver has set_value as member function</span>
    <span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">_Receiver</span><span class="p">,</span> <span class="k">class</span><span class="o">...</span> <span class="nc">_As</span><span class="p">&gt;</span>
        <span class="k">requires</span> <span class="n">__set_value_member</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="p">,</span> <span class="n">_As</span><span class="p">...</span><span class="o">&gt;</span>
    <span class="n">STDEXEC_ATTRIBUTE</span><span class="p">(</span><span class="n">host</span><span class="p">,</span> <span class="n">device</span><span class="p">,</span> <span class="n">always_inline</span><span class="p">)</span>
    <span class="kt">void</span> <span class="k">operator</span><span class="p">()(</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;</span><span class="n">__rcvr</span><span class="p">,</span> <span class="n">_As</span> <span class="o">&amp;&amp;</span><span class="p">...</span><span class="n">__as</span><span class="p">)</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="k">static_assert</span><span class="p">(</span><span class="k">noexcept</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">).</span><span class="n">set_value</span><span class="p">(</span>
                              <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_As</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__as</span><span class="p">)...)),</span>
                      <span class="s">"set_value member functions must be noexcept"</span><span class="p">);</span>
        <span class="k">static_assert</span><span class="p">(</span><span class="n">__same_as</span><span class="o">&lt;</span><span class="k">decltype</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">).</span><span class="n">set_value</span><span class="p">(</span>
                                        <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_As</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__as</span><span class="p">)...)),</span>
                                <span class="kt">void</span><span class="o">&gt;</span><span class="p">,</span>
                      <span class="s">"set_value member functions must return void"</span><span class="p">);</span>
        <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">).</span><span class="n">set_value</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_As</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__as</span><span class="p">)...);</span>
    <span class="p">}</span>

    <span class="c1">// Receiver doesn't have set_value as member function</span>
    <span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">_Receiver</span><span class="p">,</span> <span class="k">class</span><span class="o">...</span> <span class="nc">_As</span><span class="p">&gt;</span>
        <span class="k">requires</span><span class="p">(</span><span class="o">!</span><span class="n">__set_value_member</span><span class="o">&lt;</span><span class="n">_Receiver</span><span class="p">,</span> <span class="n">_As</span><span class="p">...</span><span class="o">&gt;</span><span class="p">)</span> <span class="o">&amp;&amp;</span>
                <span class="n">tag_invocable</span><span class="o">&lt;</span><span class="n">set_value_t</span><span class="p">,</span> <span class="n">_Receiver</span><span class="p">,</span> <span class="n">_As</span><span class="p">...</span><span class="o">&gt;</span>
    <span class="n">STDEXEC_ATTRIBUTE</span><span class="p">(</span><span class="n">host</span><span class="p">,</span> <span class="n">device</span><span class="p">,</span> <span class="n">always_inline</span><span class="p">)</span>
    <span class="kt">void</span> <span class="nf">operator</span><span class="p">()(</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;</span><span class="n">__rcvr</span><span class="p">,</span> <span class="n">_As</span> <span class="o">&amp;&amp;</span><span class="p">...</span><span class="n">__as</span><span class="p">)</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="k">static_assert</span><span class="p">(</span><span class="n">nothrow_tag_invocable</span><span class="o">&lt;</span><span class="n">set_value_t</span><span class="p">,</span> <span class="n">_Receiver</span><span class="p">,</span> <span class="n">_As</span><span class="p">...</span><span class="o">&gt;</span><span class="p">);</span>
        <span class="p">(</span><span class="kt">void</span><span class="p">)</span><span class="n">tag_invoke</span><span class="p">(</span>
                <span class="o">*</span><span class="k">this</span><span class="p">,</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Receiver</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__rcvr</span><span class="p">),</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_As</span> <span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">__as</span><span class="p">)...);</span>
    <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Two overloads of <code class="language-plaintext highlighter-rouge">operator()</code> are provided here:</p>

<ol>
  <li>If the receiver has a <code class="language-plaintext highlighter-rouge">set_value</code> member function, the CPO calls <code class="language-plaintext highlighter-rouge">receiver.set_value()</code> directly. In our earlier example, <code class="language-plaintext highlighter-rouge">DummyReceiver</code> provides a <code class="language-plaintext highlighter-rouge">set_value</code> member function; once the <code class="language-plaintext highlighter-rouge">Sender</code> finishes, the result is delivered to the <code class="language-plaintext highlighter-rouge">Receiver</code> through that member function.</li>
  <li>If there is no <code class="language-plaintext highlighter-rouge">set_value</code> member function but the receiver is customized through <code class="language-plaintext highlighter-rouge">tag_invoke</code>, the CPO calls <code class="language-plaintext highlighter-rouge">tag_invoke</code> instead. As the example further down shows, <code class="language-plaintext highlighter-rouge">tag_invoke</code> is just a free function (or function template) found by argument-dependent lookup; as long as the user provides a matching overload, the call is routed to it by tag dispatch.</li>
</ol>

<blockquote>
  <p>Likewise, <code class="language-plaintext highlighter-rouge">connect</code> uses the same mechanism to call the <code class="language-plaintext highlighter-rouge">operator()</code> of <code class="language-plaintext highlighter-rouge">__connect_awaitable_t</code>.</p>

</blockquote>

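<p>That covers how <code class="language-plaintext highlighter-rouge">stdexec</code> uses CPOs to let users customize its interfaces. Next, taking <code class="language-plaintext highlighter-rouge">SimpleTask</code> as the example, let us see how to customize these behaviors in user code. The code of <code class="language-plaintext highlighter-rouge">SimpleTask</code> is omitted because it is the same as before.</p>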

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">namespace</span> <span class="n">stdexec</span> <span class="p">{</span>
<span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">S</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">completion_signatures_of</span><span class="p">;</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">completion_signatures_of</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">type</span> <span class="o">=</span> <span class="n">completion_signatures</span><span class="o">&lt;</span><span class="n">set_value_t</span><span class="p">(</span><span class="n">T</span><span class="p">),</span>
                                       <span class="n">set_error_t</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span><span class="p">),</span>
                                       <span class="n">set_stopped_t</span><span class="p">()</span><span class="o">&gt;</span><span class="p">;</span>
<span class="p">};</span>
<span class="p">}</span>  <span class="c1">// namespace stdexec</span>

<span class="k">struct</span> <span class="nc">TagInvokeReceiver</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">receiver_concept</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">receiver_t</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">value</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">void</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">set_value_t</span><span class="p">,</span> <span class="n">TagInvokeReceiver</span><span class="o">&amp;&amp;</span> <span class="n">r</span><span class="p">,</span> <span class="kt">int</span> <span class="n">v</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="n">r</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">v</span> <span class="o">*</span> <span class="mi">100</span><span class="p">;</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"TagInvokeReceiver set_value as "</span> <span class="o">&lt;&lt;</span> <span class="n">r</span><span class="p">.</span><span class="n">value</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">set_error_t</span><span class="p">,</span> <span class="n">TagInvokeReceiver</span><span class="o">&amp;&amp;</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{}</span>
<span class="kt">void</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">set_stopped_t</span><span class="p">,</span> <span class="n">TagInvokeReceiver</span><span class="o">&amp;&amp;</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{}</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">R</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">TaskOp</span> <span class="p">{</span>
    <span class="n">R</span> <span class="n">receiver_</span><span class="p">;</span>
    <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">task_</span><span class="p">;</span>
    <span class="kt">void</span> <span class="n">start</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="k">try</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">task_</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">done</span><span class="p">())</span> <span class="p">{</span>
                <span class="n">task_</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
            <span class="p">}</span>
            <span class="k">auto</span> <span class="n">v</span> <span class="o">=</span> <span class="n">task_</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value_</span><span class="p">;</span>
            <span class="n">stdexec</span><span class="o">::</span><span class="n">set_value</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">receiver_</span><span class="p">),</span> <span class="n">v</span><span class="p">);</span>
        <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
            <span class="n">stdexec</span><span class="o">::</span><span class="n">set_error</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">receiver_</span><span class="p">),</span> <span class="n">std</span><span class="o">::</span><span class="n">current_exception</span><span class="p">());</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">R</span><span class="p">&gt;</span>
<span class="k">auto</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">connect_t</span><span class="p">,</span> <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;&amp;&amp;</span> <span class="n">task</span><span class="p">,</span> <span class="n">R</span><span class="o">&amp;&amp;</span> <span class="n">r</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TaskOp</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">decay_t</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;&gt;</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">TaskOp</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">decay_t</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;&gt;</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;</span><span class="p">(</span><span class="n">r</span><span class="p">),</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">task</span><span class="p">)};</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">stdexec</span><span class="o">::</span><span class="n">sender</span> <span class="k">auto</span> <span class="n">sender</span> <span class="o">=</span> <span class="n">caller</span><span class="p">(</span><span class="mi">2</span><span class="p">);</span>
    <span class="n">stdexec</span><span class="o">::</span><span class="n">operation_state</span> <span class="k">auto</span> <span class="n">op</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">connect</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">sender</span><span class="p">),</span> <span class="n">TagInvokeReceiver</span><span class="p">{});</span>
    <span class="n">stdexec</span><span class="o">::</span><span class="n">start</span><span class="p">(</span><span class="n">op</span><span class="p">);</span>
<span class="p">}</span>

</code></pre></div></div>

<p>First, through <code class="language-plaintext highlighter-rouge">completion_signatures_of</code>, we declare which completion signals a <code class="language-plaintext highlighter-rouge">Receiver</code> for <code class="language-plaintext highlighter-rouge">SimpleTask</code> has to handle. Then we define a custom <code class="language-plaintext highlighter-rouge">Receiver</code> class, <code class="language-plaintext highlighter-rouge">TagInvokeReceiver</code>, which will hold the result of <code class="language-plaintext highlighter-rouge">SimpleTask</code>. The line <code class="language-plaintext highlighter-rouge">using receiver_concept = stdexec::receiver_t</code> tells <code class="language-plaintext highlighter-rouge">stdexec</code> that this type can be used as a <code class="language-plaintext highlighter-rouge">Receiver</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">TagInvokeReceiver</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">receiver_concept</span> <span class="o">=</span> <span class="n">stdexec</span><span class="o">::</span><span class="n">receiver_t</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">value</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Next we provide the corresponding <code class="language-plaintext highlighter-rouge">tag_invoke</code> function, so that when <code class="language-plaintext highlighter-rouge">stdexec</code> invokes the <code class="language-plaintext highlighter-rouge">set_value_t</code> object, the call reaches our <code class="language-plaintext highlighter-rouge">tag_invoke</code>:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">set_value_t</span><span class="p">,</span> <span class="n">TagInvokeReceiver</span><span class="o">&amp;&amp;</span> <span class="n">r</span><span class="p">,</span> <span class="kt">int</span> <span class="n">v</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="n">r</span><span class="p">.</span><span class="n">value</span> <span class="o">=</span> <span class="n">v</span> <span class="o">*</span> <span class="mi">100</span><span class="p">;</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"TagInvokeReceiver set_value as "</span> <span class="o">&lt;&lt;</span> <span class="n">r</span><span class="p">.</span><span class="n">value</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Likewise, <code class="language-plaintext highlighter-rouge">connect</code> is also a CPO. With another <code class="language-plaintext highlighter-rouge">tag_invoke</code> overload, we make every <code class="language-plaintext highlighter-rouge">connect</code> of a <code class="language-plaintext highlighter-rouge">SimpleTask</code> and a <code class="language-plaintext highlighter-rouge">Receiver</code> return a <code class="language-plaintext highlighter-rouge">TaskOp</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span> <span class="nc">R</span><span class="p">&gt;</span>
<span class="k">auto</span> <span class="n">tag_invoke</span><span class="p">(</span><span class="n">stdexec</span><span class="o">::</span><span class="n">connect_t</span><span class="p">,</span> <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;&amp;&amp;</span> <span class="n">task</span><span class="p">,</span> <span class="n">R</span><span class="o">&amp;&amp;</span> <span class="n">r</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">TaskOp</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">decay_t</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;&gt;</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">TaskOp</span><span class="o">&lt;</span><span class="n">std</span><span class="o">::</span><span class="n">decay_t</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;&gt;</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o">&lt;</span><span class="n">R</span><span class="o">&gt;</span><span class="p">(</span><span class="n">r</span><span class="p">),</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">task</span><span class="p">)};</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Finally, <code class="language-plaintext highlighter-rouge">stdexec::start</code> is itself a CPO. If it is not customized through <code class="language-plaintext highlighter-rouge">tag_invoke</code>, it calls the <code class="language-plaintext highlighter-rouge">start</code> method of the object passed in, which in our example is the <code class="language-plaintext highlighter-rouge">start</code> method of <code class="language-plaintext highlighter-rouge">TaskOp</code>. Running this code shows that the result of <code class="language-plaintext highlighter-rouge">SimpleTask</code> is passed to <code class="language-plaintext highlighter-rouge">TagInvokeReceiver</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TagInvokeReceiver set_value as 8400
</code></pre></div></div>

<h2 id="at-last">At last</h2>

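<p>That is about it for this post. P2300 proposes a unified abstraction for asynchronous work, and its concepts take a while to digest. From the application-code point of view, if you do not need to implement a Sender (usually the job of a foundation library) and only need to use a <code class="language-plaintext highlighter-rouge">Sender</code>, not much has to change: you either <code class="language-plaintext highlighter-rouge">co_await</code> a <code class="language-plaintext highlighter-rouge">Sender</code> or call something like <code class="language-plaintext highlighter-rouge">sync_wait</code>. However, P2300 has been controversial and was adopted only recently, so at present only a few proof-of-concept libraries support it, such as stdexec and libunifex; other foundation libraries such as Boost, Folly and Abseil either have no support or are at a very early stage. From the perspective of an infrastructure developer, a project rarely uses many different styles of asynchronous code, and migrating an asynchronous technology stack may well lag far behind the pace of new C++ standards, which is bound to keep many people away.</p>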

<h2 id="reference">Reference</h2>

<p><a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2300r10.html">P2300R10: <code class="language-plaintext highlighter-rouge">std::execution</code></a></p>

<p><a href="https://github.com/NVIDIA/stdexec">https://github.com/NVIDIA/stdexec</a></p>

<p><a href="https://www.youtube.com/watch?v=xLboNIf7BTg">Working with Asynchrony Generically: A Tour of C++ Executors (part 1/2) - Eric Niebler - CppCon 21</a></p>

<p><a href="https://www.youtube.com/watch?v=1Wy5sq3s2rg">Structured Concurrency: Writing Safer Concurrent Code with Coroutines… - Lewis Baker - CppCon 2019 - YouTube</a></p>

<p><a href="https://ericniebler.com/2024/02/04/what-are-senders-good-for-anyway/">What are Senders Good For, Anyway? – Eric Niebler</a></p>

<p><a href="https://cor3ntin.github.io/posts/executors/">A Universal Async Abstraction for C++ | cor3ntin</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><category term="folly" /><summary type="html"><![CDATA[The rabbit hole keeps getting deeper; this post looks at coroutines and the Sender introduced in C++26.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 1</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-1/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 1" /><published>2025-11-06T00:00:00+08:00</published><updated>2025-11-06T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-1</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-1/"><![CDATA[<p>Every time I read an introduction to coroutines, I got lost in the sheer number of concepts. I tried to sort them out once before, with mediocre results. This time I spent a good amount of time studying the topic systematically, hoping to make it stick. Much of the content comes from this <a href="https://lewissbaker.github.io/">blog</a>, but it lists so many details that the big picture gets lost. Only after watching this <a href="https://www.youtube.com/watch?v=J7fYddslH0Q">talk</a> a while ago could I connect the concepts and form a reasonably clear understanding. I hope it helps.</p>

<h3 id="what-is-a-coroutine">What is a Coroutine?</h3>

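<p>A normal function call consists of two operations: call and return (throwing an exception is loosely counted as a return here). The call operation creates a stack frame, suspends the caller, and transfers execution to the start of the callee. The return operation passes the return value to the caller, destroys the stack frame, and resumes the caller.</p>

<p>On top of what a normal function does, a coroutine can additionally:</p>

<ul>
  <li>suspend its execution and return control to the caller</li>
  <li>resume execution after having been suspended</li>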
</ul>

<h3 id="what-makes-a-function-a-coroutine">What makes a function a coroutine?</h3>

<p>A function is a coroutine if its body contains any of the following:</p>

<ul>
  <li>a <code class="language-plaintext highlighter-rouge">co_return</code> statement</li>
  <li>a <code class="language-plaintext highlighter-rouge">co_await</code> expression</li>
  <li>a <code class="language-plaintext highlighter-rouge">co_yield</code> expression</li>
</ul>

<blockquote>
  <p>Whether a function is a coroutine cannot be told from its signature; it is an implementation detail.</p>

</blockquote>

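<p>Coroutines in C++ are stackless: when a coroutine is suspended, control is handed back to the caller, and the state needed to resume it is stored in a dynamically allocated block of memory, typically on the heap, hence the name stackless. The Fiber introduced earlier, by contrast, is stackful: when a Fiber is suspended, its current stack frames are preserved on the fiber's own stack. Where the coroutine state lives can be controlled by supplying a custom allocator (for example through an <code class="language-plaintext highlighter-rouge">operator new</code> on the promise type), so it is not necessarily the heap, but conceptually it is always dynamically allocated (a compiler may elide the allocation when it can prove that doing so is safe).</p>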

<h2 id="coroutines-ts">Coroutines TS</h2>

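<p>The C++ Coroutines TS (<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4680.pdf">N4680</a>) introduced the basic coroutine mechanism, which developers can use to interact with coroutines and customize their behavior. However, what the Coroutines TS provides is closer to a set of low-level tools that are hard to use directly. Instead, library authors can build simpler, higher-level abstractions on top of these tools, such as <a href="https://github.com/lewissbaker/cppcoro">cppcoro</a> or <a href="https://github.com/facebook/folly/tree/main/folly/experimental/coro">folly::coro</a>.</p>

<p>For example, the Coroutines TS does not actually define the semantics of a coroutine:</p>

<ul>
  <li>It does not define how the value returned to the caller is produced.</li>
  <li>It does not define what to do with the value passed to a <code class="language-plaintext highlighter-rouge">co_return</code> statement, or how to handle an exception that propagates out of the coroutine.</li>
  <li>It does not define on which thread the coroutine should be resumed.</li>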
</ul>

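<p>Instead, it gives libraries a general mechanism: a library customizes coroutine behavior by implementing types that conform to specific interfaces. This makes it possible to build many different kinds of coroutines for different purposes. For example, you can define a coroutine that produces a single value asynchronously, or one that lazily produces a sequence of values.</p>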

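<p>The Coroutines TS defines two kinds of interfaces: <code class="language-plaintext highlighter-rouge">Promise</code> and <code class="language-plaintext highlighter-rouge">Awaitable</code>.</p>

<p>The <code class="language-plaintext highlighter-rouge">Promise</code> interface specifies the methods used to customize the behavior of the coroutine itself. A library author can customize:</p>

<ul>
  <li>what happens when the coroutine is called</li>
  <li>what happens when the coroutine returns (either normally or through an unhandled exception)</li>
  <li>the behavior of any <code class="language-plaintext highlighter-rouge">co_await</code> or <code class="language-plaintext highlighter-rouge">co_yield</code> expression inside the coroutine</li>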
</ul>

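<p>The <code class="language-plaintext highlighter-rouge">Awaitable</code> interface specifies the methods that control the semantics of a <code class="language-plaintext highlighter-rouge">co_await</code> expression. When we <code class="language-plaintext highlighter-rouge">co_await</code> an expression, the code is translated into a series of calls to methods on the <code class="language-plaintext highlighter-rouge">Awaitable</code> object, which let it specify:</p>

<ul>
  <li>whether to suspend the current coroutine</li>
  <li>some logic to run after suspension, to arrange for the coroutine to be resumed later</li>
  <li>some logic to run after the coroutine resumes, to produce the result of the <code class="language-plaintext highlighter-rouge">co_await</code> expression</li>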
</ul>

<p>This post introduces coroutines mainly from the <code class="language-plaintext highlighter-rouge">co_await</code> angle and focuses on <code class="language-plaintext highlighter-rouge">Awaitable</code>. The next post approaches them from the <code class="language-plaintext highlighter-rouge">Promise</code> angle.</p>

<h2 id="concept">Concept</h2>

<h3 id="returntype">ReturnType</h3>

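<p>Coroutines involve a lot of concepts, so to make things easier we start from a very simple example. In the example below, <code class="language-plaintext highlighter-rouge">task</code> is a coroutine whose return type is <code class="language-plaintext highlighter-rouge">folly::coro::Task&lt;void&gt;</code>. For now we do not need to care what exactly that is; it is enough to know that it is the return type of a coroutine, which we will call <code class="language-plaintext highlighter-rouge">ReturnType</code>.</p>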

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">folly</span><span class="o">::</span><span class="n">coro</span><span class="o">::</span><span class="n">Task</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="n">task</span><span class="p">(</span><span class="kt">int</span> <span class="n">arg42</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// ...</span>
  <span class="k">co_return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Calling a coroutine always yields a <code class="language-plaintext highlighter-rouge">ReturnType</code> object. As mentioned above, the Coroutines TS does not specify what happens when a coroutine returns, so library authors use the interface of <code class="language-plaintext highlighter-rouge">ReturnType</code> to define what the caller can do with the coroutine. For example, the <code class="language-plaintext highlighter-rouge">ReturnType</code> <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> provides a <code class="language-plaintext highlighter-rouge">scheduleOn</code> method that selects the executor on which the coroutine runs. When using a coroutine library, the <code class="language-plaintext highlighter-rouge">ReturnType</code> is the first thing to look at, because it is where the library author customizes the coroutine's behavior.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kt">void</span> <span class="nf">caller</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">auto</span> <span class="n">f</span> <span class="o">=</span> <span class="n">task</span><span class="p">(</span><span class="mi">42</span><span class="p">).</span><span class="n">scheduleOn</span><span class="p">(</span><span class="n">folly</span><span class="o">::</span><span class="n">getCPUExecutor</span><span class="p">().</span><span class="n">get</span><span class="p">()).</span><span class="n">start</span><span class="p">();</span>
<span class="p">}</span>

</code></pre></div></div>

<h3 id="promise_type">promise_type</h3>

<p>Every <code class="language-plaintext highlighter-rouge">ReturnType</code> must have an associated <code class="language-plaintext highlighter-rouge">promise_type</code> for the compiler to use. This <code class="language-plaintext highlighter-rouge">promise_type</code> is the <code class="language-plaintext highlighter-rouge">Promise</code> interface described earlier. There are several ways to provide it:</p>

<ol>
  <li>Nested in the type
    <ul>
      <li>
        <p>Through a type alias (<code class="language-plaintext highlighter-rouge">using</code>), as <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> does:</p>

        <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">class</span> <span class="nc">FOLLY_NODISCARD</span> <span class="n">Task</span> <span class="p">{</span>
   <span class="nl">public:</span>
    <span class="k">using</span> <span class="n">promise_type</span> <span class="o">=</span> <span class="n">detail</span><span class="o">::</span><span class="n">TaskPromise</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">;</span>
    <span class="c1">// ...</span>
  <span class="p">};</span>
</code></pre></div>        </div>
      </li>
      <li>
        <p>Through a nested class:</p>

        <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">struct</span> <span class="nc">MyTask</span> <span class="p">{</span>
      <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
          <span class="c1">// ...</span>
      <span class="p">};</span>
  <span class="p">};</span>
</code></pre></div>        </div>
      </li>
    </ul>
  </li>
  <li>
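    <p>If it cannot be provided as a nested type, specialize <code class="language-plaintext highlighter-rouge">std::coroutine_traits</code> and specify the <code class="language-plaintext highlighter-rouge">promise_type</code> there:</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">// Args... are the parameter types of the coroutine</span>
 <span class="k">template</span> <span class="o">&lt;</span><span class="k">class</span><span class="o">...</span> <span class="nc">Args</span><span class="p">&gt;</span>
 <span class="k">struct</span> <span class="nc">std</span><span class="o">::</span><span class="n">coroutine_traits</span><span class="o">&lt;</span><span class="n">MyTask</span><span class="p">,</span> <span class="n">Args</span><span class="p">...</span><span class="o">&gt;</span> <span class="p">{</span>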
     <span class="k">using</span> <span class="n">promise_type</span> <span class="o">=</span> <span class="n">MyPromise</span><span class="p">;</span>
 <span class="p">};</span>
</code></pre></div>    </div>
  </li>
</ol>

<blockquote>
  <p>The naming style of <code class="language-plaintext highlighter-rouge">promise_type</code> is dictated by the C++ standard. Although it differs from the naming of the other components in this article, we follow it.</p>

</blockquote>

<p>Let us first understand why it is called a promise. The C++ standard provides <code class="language-plaintext highlighter-rouge">std::future</code>/<code class="language-plaintext highlighter-rouge">std::promise</code>, and folly provides the more powerful <code class="language-plaintext highlighter-rouge">folly::Future</code>/<code class="language-plaintext highlighter-rouge">folly::Promise</code>; both are essentially an asynchronous producer-consumer model. The promise is the producer, which sets the result through <code class="language-plaintext highlighter-rouge">promise.set_value()</code> or <code class="language-plaintext highlighter-rouge">promise.set_exception()</code>. The future is the consumer, which obtains the result through methods such as <code class="language-plaintext highlighter-rouge">future.get()</code>.</p>
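<p>The producer and consumer roles can be seen in a few lines of standard C++. This is only a minimal sketch; the function names <code class="language-plaintext highlighter-rouge">produce_then_consume</code> and <code class="language-plaintext highlighter-rouge">error_reaches_consumer</code> are hypothetical, and both ends run on one thread to keep the example short, whereas in real code they usually live on different threads.</p>

```cpp
#include <exception>
#include <future>
#include <stdexcept>

// The promise (producer) publishes a value; the future (consumer) reads it.
int produce_then_consume() {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();

    promise.set_value(42);  // producer side
    return future.get();    // consumer side
}

// An exception stored by the producer is rethrown on the consumer side.
bool error_reaches_consumer() {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();

    promise.set_exception(std::make_exception_ptr(std::runtime_error("boom")));
    try {
        future.get();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

<p>The same split between a value channel and an error channel shows up in the coroutine <code class="language-plaintext highlighter-rouge">promise_type</code>, where <code class="language-plaintext highlighter-rouge">return_value()</code> and <code class="language-plaintext highlighter-rouge">unhandled_exception()</code> play the producer role.</p>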

<p>而在协程中，<code class="language-plaintext highlighter-rouge">promise_type</code>也是生产者，每个协程都有一个对应的<code class="language-plaintext highlighter-rouge">promise_type</code>对象，并通过<code class="language-plaintext highlighter-rouge">return_value()</code>或者<code class="language-plaintext highlighter-rouge">return_void()</code>来设置结果，又或者通过<code class="language-plaintext highlighter-rouge">unhandled_exception()</code>来获取异常。和<code class="language-plaintext highlighter-rouge">std::promise</code>不同的是，协程的<code class="language-plaintext highlighter-rouge">promise_type</code>对象不会出现在用户代码中，而是由编译器生成代码来调用它的相关接口。</p>

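<p>The <code class="language-plaintext highlighter-rouge">promise_type</code> interface specifies methods for customizing the behavior of the coroutine itself. Library authors can customize what happens when the coroutine is called and what happens when it returns (either normally or through an unhandled exception).</p>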

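<p>The main interface of <code class="language-plaintext highlighter-rouge">promise_type</code> is shown below. To make the later concepts easier to follow, we introduce it only briefly here; the related code is covered in the next post.</p>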

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
    <span class="c1">// creating coroutine object - mandatory</span>
    <span class="n">ReturnType</span> <span class="n">get_return_object</span><span class="p">();</span>

    <span class="c1">// returns awaitable object - mandatory</span>
    <span class="k">auto</span> <span class="n">initial_suspend</span><span class="p">();</span>
    <span class="k">auto</span> <span class="n">final_suspend</span><span class="p">()</span> <span class="k">noexcept</span><span class="p">;</span>

    <span class="kt">void</span> <span class="n">unhandled_exception</span><span class="p">();</span>  <span class="c1">// mandatory</span>
    <span class="c1">// one of below is mandatory and only one must be present</span>
    <span class="kt">void</span> <span class="n">return_value</span><span class="p">(</span><span class="cm">/*type*/</span><span class="p">);</span>
    <span class="kt">void</span> <span class="n">return_void</span><span class="p">();</span>

    <span class="c1">// ...</span>
<span class="p">};</span>
</code></pre></div></div>

<ul>
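  <li><code class="language-plaintext highlighter-rouge">get_return_object</code>: used to obtain the <code class="language-plaintext highlighter-rouge">ReturnType</code> object that corresponds to the <code class="language-plaintext highlighter-rouge">promise_type</code> object. The compiler-generated code calls <code class="language-plaintext highlighter-rouge">get_return_object</code> when the coroutine starts; when the coroutine reaches its first suspension point and control returns to the caller, the caller receives that <code class="language-plaintext highlighter-rouge">ReturnType</code> object. These customization points can run arbitrary logic.</li>
  <li><code class="language-plaintext highlighter-rouge">return_void</code>/<code class="language-plaintext highlighter-rouge">return_value</code>/<code class="language-plaintext highlighter-rouge">unhandled_exception</code>: customization points that define what happens when the coroutine reaches a <code class="language-plaintext highlighter-rouge">co_return</code> statement and how exceptions are handled.</li>
  <li><code class="language-plaintext highlighter-rouge">initial_suspend</code>: customization point that defines what happens before the coroutine body runs, for example whether it starts immediately or lazily.</li>
  <li><code class="language-plaintext highlighter-rouge">final_suspend</code>: customization point that defines what happens after the coroutine body has run, for example who destroys the coroutine and when.</li>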
</ul>

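<p>In practice, the <code class="language-plaintext highlighter-rouge">Promise</code> is the central meeting point between the coroutine's code and its caller: it manages the coroutine's lifetime and stores the coroutine's result internally. It is fine if these interfaces look confusing right now; coroutines involve too many concepts to grasp the whole picture from a single piece. There are many more pieces of the puzzle to come, and only after understanding each of them does the full picture emerge.</p>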

<h3 id="awaitable--awaiter">Awaitable &amp;&amp; Awaiter</h3>

<p>The next piece of the puzzle is <code class="language-plaintext highlighter-rouge">Awaitable</code> and <code class="language-plaintext highlighter-rouge">Awaiter</code>.</p>

<p><code class="language-plaintext highlighter-rouge">co_await</code> is a new unary operator that can only be used in the context of a coroutine. A type that supports the <code class="language-plaintext highlighter-rouge">co_await</code> operator is called an <code class="language-plaintext highlighter-rouge">Awaitable</code>, which is the <code class="language-plaintext highlighter-rouge">Awaitable</code> interface mentioned at the beginning of the article.</p>

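<p>An <code class="language-plaintext highlighter-rouge">Awaiter</code> type is a type that implements three special methods that are called as part of a <code class="language-plaintext highlighter-rouge">co_await</code> expression: <code class="language-plaintext highlighter-rouge">await_ready</code>, <code class="language-plaintext highlighter-rouge">await_suspend</code> and <code class="language-plaintext highlighter-rouge">await_resume</code>. More precisely, the compiler expands <code class="language-plaintext highlighter-rouge">co_await</code> into a fixed three-step code sequence (described in the sections below). That code calls these three methods of the <code class="language-plaintext highlighter-rouge">Awaiter</code>, which in turn customize whether the coroutine needs to suspend, what happens when it suspends, and what is returned when it resumes. Any type that implements these three methods can therefore be expanded correctly by the compiler in a <code class="language-plaintext highlighter-rouge">co_await</code> expression.</p>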

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">suspend_always</span> <span class="p">{</span>
    <span class="c1">// always suspend</span>
    <span class="k">constexpr</span> <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">constexpr</span> <span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span><span class="p">)</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{}</span>
    <span class="k">constexpr</span> <span class="kt">void</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{}</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">suspend_never</span> <span class="p">{</span>
    <span class="c1">// never suspend</span>
    <span class="k">constexpr</span> <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
        <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">constexpr</span> <span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span><span class="p">)</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{}</span>
    <span class="k">constexpr</span> <span class="kt">void</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{}</span>
<span class="p">};</span>

<span class="k">co_await</span> <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span><span class="p">{};</span>
<span class="k">co_await</span> <span class="n">std</span><span class="o">::</span><span class="n">suspend_never</span><span class="p">{};</span>
</code></pre></div></div>

<p>So how should we read the sentence “a type that supports the <code class="language-plaintext highlighter-rouge">co_await</code> operator is called an <code class="language-plaintext highlighter-rouge">Awaitable</code>”? We can think of the main job of an <code class="language-plaintext highlighter-rouge">Awaitable</code> as telling the compiler how to obtain an <code class="language-plaintext highlighter-rouge">Awaiter</code>, so that when an <code class="language-plaintext highlighter-rouge">Awaitable</code> is the operand of <code class="language-plaintext highlighter-rouge">co_await</code>, the compiler can expand the expression into code that obtains the <code class="language-plaintext highlighter-rouge">Awaiter</code>.</p>

<p>We can understand this by following the complete process through which the compiler obtains the <code class="language-plaintext highlighter-rouge">Awaitable</code> and the <code class="language-plaintext highlighter-rouge">Awaiter</code>:</p>

<p>Assume that the <code class="language-plaintext highlighter-rouge">promise_type</code> object of the awaiting coroutine is <code class="language-plaintext highlighter-rouge">promise</code>. If the <code class="language-plaintext highlighter-rouge">promise_type</code> type has a member named <code class="language-plaintext highlighter-rouge">await_transform</code>, then <code class="language-plaintext highlighter-rouge">&lt;expr&gt;</code> is first passed to a call to <code class="language-plaintext highlighter-rouge">promise.await_transform(&lt;expr&gt;)</code> to obtain the corresponding <code class="language-plaintext highlighter-rouge">Awaitable</code> object. Otherwise, if <code class="language-plaintext highlighter-rouge">promise_type</code> has no <code class="language-plaintext highlighter-rouge">await_transform</code> member, the result of evaluating <code class="language-plaintext highlighter-rouge">&lt;expr&gt;</code> is used directly as the <code class="language-plaintext highlighter-rouge">Awaitable</code> object.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">await_transform</code> is covered in the next post, where <code class="language-plaintext highlighter-rouge">promise_type</code> is described in detail.</p>

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">promise_type</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">decltype</span><span class="p">(</span><span class="k">auto</span><span class="p">)</span> <span class="n">get_awaitable</span><span class="p">(</span><span class="n">promise_type</span><span class="o">&amp;</span> <span class="n">promise</span><span class="p">,</span> <span class="n">T</span><span class="o">&amp;&amp;</span> <span class="n">expr</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">has_any_await_transform_member_v</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">promise</span><span class="p">.</span><span class="n">await_transform</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">));</span>
    <span class="k">else</span>
        <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Next, the <code class="language-plaintext highlighter-rouge">Awaiter</code> is obtained from the <code class="language-plaintext highlighter-rouge">Awaitable</code> object. If an applicable <code class="language-plaintext highlighter-rouge">operator co_await()</code> overload exists for the <code class="language-plaintext highlighter-rouge">Awaitable</code> (as a member or as a non-member function), <code class="language-plaintext highlighter-rouge">operator co_await</code> is called on the object to obtain the <code class="language-plaintext highlighter-rouge">Awaiter</code> object. Otherwise, the <code class="language-plaintext highlighter-rouge">Awaitable</code> object itself is used as the <code class="language-plaintext highlighter-rouge">Awaiter</code> object.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Awaitable</span><span class="p">&gt;</span>
<span class="k">decltype</span><span class="p">(</span><span class="k">auto</span><span class="p">)</span> <span class="n">get_awaiter</span><span class="p">(</span><span class="n">Awaitable</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">has_member_operator_co_await_v</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&gt;</span><span class="p">)</span>
        <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">).</span><span class="k">operator</span> <span class="k">co_await</span><span class="p">();</span>
    <span class="k">else</span> <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">has_non_member_operator_co_await_v</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">)</span>
        <span class="k">return</span> <span class="k">operator</span> <span class="k">co_await</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">));</span>
    <span class="k">else</span>
        <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">Awaitable</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
  <li>
    <p>One example of a type that becomes an <code class="language-plaintext highlighter-rouge">Awaitable</code> by overloading <code class="language-plaintext highlighter-rouge">operator co_await</code> is <code class="language-plaintext highlighter-rouge">folly::Future</code>; the <code class="language-plaintext highlighter-rouge">FutureAwaiter</code> returned by its <code class="language-plaintext highlighter-rouge">operator co_await</code> provides the three Awaiter methods.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
  <span class="kr">inline</span> <span class="n">detail</span><span class="o">::</span><span class="n">FutureAwaiter</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span>
  <span class="cm">/* implicit */</span> <span class="k">operator</span> <span class="nf">co_await</span><span class="p">(</span><span class="n">Future</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&amp;&amp;</span> <span class="n">future</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="k">return</span> <span class="n">detail</span><span class="o">::</span><span class="n">FutureAwaiter</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">future</span><span class="p">));</span>
  <span class="p">}</span>

  <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
  <span class="k">class</span> <span class="nc">FutureAwaiter</span> <span class="p">{</span>
   <span class="nl">public:</span>
    <span class="k">explicit</span> <span class="n">FutureAwaiter</span><span class="p">(</span><span class="n">folly</span><span class="o">::</span><span class="n">Future</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&amp;&amp;</span> <span class="n">future</span><span class="p">)</span> <span class="k">noexcept</span>
        <span class="o">:</span> <span class="n">future_</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">future</span><span class="p">))</span> <span class="p">{}</span>

    <span class="kt">bool</span> <span class="nf">await_ready</span><span class="p">()</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">future_</span><span class="p">.</span><span class="n">isReady</span><span class="p">())</span> <span class="p">{</span>
        <span class="n">result_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">future_</span><span class="p">.</span><span class="n">result</span><span class="p">());</span>
        <span class="k">return</span> <span class="nb">true</span><span class="p">;</span>
      <span class="p">}</span>
      <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">T</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">result_</span><span class="p">).</span><span class="n">value</span><span class="p">();</span> <span class="p">}</span>

    <span class="n">Try</span><span class="o">&lt;</span><span class="n">drop_unit_t</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;</span> <span class="n">await_resume_try</span><span class="p">()</span> <span class="p">{</span>
      <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">Try</span><span class="o">&lt;</span><span class="n">drop_unit_t</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&gt;&gt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">result_</span><span class="p">));</span>
    <span class="p">}</span>

    <span class="n">FOLLY_CORO_AWAIT_SUSPEND_NONTRIVIAL_ATTRIBUTES</span> <span class="kt">void</span> <span class="nf">await_suspend</span><span class="p">(</span>
        <span class="n">coro</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="p">{</span>
      <span class="c1">// FutureAwaiter may get destroyed as soon as the callback is executed.</span>
      <span class="c1">// Make sure the future object doesn't get destroyed until setCallback_</span>
      <span class="c1">// returns.</span>
      <span class="k">auto</span> <span class="n">future</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">future_</span><span class="p">);</span>
      <span class="n">future</span><span class="p">.</span><span class="n">setCallback_</span><span class="p">(</span>
          <span class="p">[</span><span class="k">this</span><span class="p">,</span> <span class="n">h</span><span class="p">](</span><span class="n">Executor</span><span class="o">::</span><span class="n">KeepAlive</span><span class="o">&lt;&gt;&amp;&amp;</span><span class="p">,</span> <span class="n">Try</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;&amp;&amp;</span> <span class="n">result</span><span class="p">)</span> <span class="k">mutable</span> <span class="p">{</span>
            <span class="n">result_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">result</span><span class="p">);</span>
            <span class="n">h</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
          <span class="p">});</span>
    <span class="p">}</span>

   <span class="k">private</span><span class="o">:</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">Future</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">future_</span><span class="p">;</span>
    <span class="n">folly</span><span class="o">::</span><span class="n">Try</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span> <span class="n">result_</span><span class="p">;</span>
  <span class="p">};</span>
</code></pre></div>    </div>
  </li>
</ul>

<p>Many introductions to coroutines never mention <code class="language-plaintext highlighter-rouge">Awaiter</code> and describe everything in terms of <code class="language-plaintext highlighter-rouge">Awaitable</code>, which is indeed a simplification. In the steps above, the compiler in many cases uses the <code class="language-plaintext highlighter-rouge">Awaitable</code> itself as the <code class="language-plaintext highlighter-rouge">Awaiter</code>; if the <code class="language-plaintext highlighter-rouge">Awaitable</code> does not implement the three methods in that case, compilation fails.</p>

<p>In fact, every <code class="language-plaintext highlighter-rouge">Awaiter</code> is an <code class="language-plaintext highlighter-rouge">Awaitable</code>, but an <code class="language-plaintext highlighter-rouge">Awaitable</code> is not necessarily an <code class="language-plaintext highlighter-rouge">Awaiter</code>. For example, <code class="language-plaintext highlighter-rouge">suspend_always</code> and <code class="language-plaintext highlighter-rouge">suspend_never</code> from the standard library provide the three <code class="language-plaintext highlighter-rouge">Awaiter</code> methods and therefore support the <code class="language-plaintext highlighter-rouge">co_await</code> operator, so they are both <code class="language-plaintext highlighter-rouge">Awaiter</code> and <code class="language-plaintext highlighter-rouge">Awaitable</code>. An <code class="language-plaintext highlighter-rouge">Awaitable</code>, on the other hand, need not be an <code class="language-plaintext highlighter-rouge">Awaiter</code>; strictly speaking, an <code class="language-plaintext highlighter-rouge">Awaitable</code> only needs to be able to produce an <code class="language-plaintext highlighter-rouge">Awaiter</code>. For example, <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> is only an <code class="language-plaintext highlighter-rouge">Awaitable</code> and not an <code class="language-plaintext highlighter-rouge">Awaiter</code>, but it can produce an <code class="language-plaintext highlighter-rouge">Awaiter</code>.</p>

<p>The three methods of an <code class="language-plaintext highlighter-rouge">Awaiter</code> are as follows:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">Awaiter</span> <span class="p">{</span>
    <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">();</span>

    <span class="k">auto</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span><span class="p">);</span>
    <span class="c1">// or specialize on the promise type: void await_suspend(coroutine_handle&lt;promise_type&gt;);</span>

    <span class="k">auto</span> <span class="n">await_resume</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>

<ul>
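  <li><code class="language-plaintext highlighter-rouge">await_ready</code> - customization point that reports whether the <code class="language-plaintext highlighter-rouge">Awaiter</code> has already completed and its result can be obtained</li>
  <li><code class="language-plaintext highlighter-rouge">await_suspend</code> - customization point that defines how to wait on the <code class="language-plaintext highlighter-rouge">Awaiter</code> (typically, how the coroutine will later be resumed); it is called once the coroutine has been suspended, before control returns to the caller or resumer</li>
  <li><code class="language-plaintext highlighter-rouge">await_resume</code> - returns the result of the whole <code class="language-plaintext highlighter-rouge">co_await</code> expression; it is called right after the coroutine resumes, or immediately if the coroutine did not suspend</li>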
</ul>

<h2 id="co_await">co_await</h2>

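<p>With <code class="language-plaintext highlighter-rouge">promise_type</code>, <code class="language-plaintext highlighter-rouge">Awaitable</code> and <code class="language-plaintext highlighter-rouge">Awaiter</code> in place, we can look at how the compiler expands <code class="language-plaintext highlighter-rouge">co_await</code>. We can think of <code class="language-plaintext highlighter-rouge">co_await</code> as an opportunity to suspend the coroutine: each <code class="language-plaintext highlighter-rouge">co_await</code> is a suspension point at which the compiler-generated code may suspend the coroutine and hand control back to the caller. The <code class="language-plaintext highlighter-rouge">Awaitable</code> controls what happens at these suspension points. For example, it may choose not to suspend and let the coroutine continue, which is why we described it as an opportunity to suspend.</p>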

<p>The expanded code looks as follows, where <code class="language-plaintext highlighter-rouge">promise</code> is the <code class="language-plaintext highlighter-rouge">promise_type</code> object of the current coroutine:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">value</span> <span class="o">=</span> <span class="o">&lt;</span><span class="n">expr</span><span class="o">&gt;</span><span class="p">;</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">get_awaitable</span><span class="p">(</span><span class="n">promise</span><span class="p">,</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="k">decltype</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="o">&gt;</span><span class="p">(</span><span class="n">value</span><span class="p">));</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaiter</span> <span class="o">=</span> <span class="n">get_awaiter</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="k">decltype</span><span class="p">(</span><span class="n">awaitable</span><span class="p">)</span><span class="o">&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">));</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">handle_t</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="k">using</span> <span class="n">await_suspend_result_t</span> <span class="o">=</span>
      <span class="k">decltype</span><span class="p">(</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">handle_t</span><span class="o">::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">promise</span><span class="p">)));</span>

    <span class="o">&lt;</span><span class="n">suspend</span><span class="o">-</span><span class="n">coroutine</span><span class="o">&gt;</span>

    <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">is_void_v</span><span class="o">&lt;</span><span class="n">await_suspend_result_t</span><span class="o">&gt;</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">handle_t</span><span class="o">::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">promise</span><span class="p">));</span>
      <span class="o">&lt;</span><span class="k">return</span><span class="o">-</span><span class="n">to</span><span class="o">-</span><span class="n">caller</span><span class="o">-</span><span class="n">or</span><span class="o">-</span><span class="n">resumer</span><span class="o">&gt;</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">handle_t</span><span class="o">::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">promise</span><span class="p">)))</span> <span class="p">{</span>
        <span class="o">&lt;</span><span class="k">return</span><span class="o">-</span><span class="n">to</span><span class="o">-</span><span class="n">caller</span><span class="o">-</span><span class="n">or</span><span class="o">-</span><span class="n">resumer</span><span class="o">&gt;</span>
      <span class="p">}</span>
    <span class="p">}</span>

    <span class="o">&lt;</span><span class="n">resume</span><span class="o">-</span><span class="n">point</span><span class="o">&gt;</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="n">awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

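<p>First, the <code class="language-plaintext highlighter-rouge">Awaitable</code> and <code class="language-plaintext highlighter-rouge">Awaiter</code> objects are obtained as described earlier. Then <code class="language-plaintext highlighter-rouge">await_ready</code> is called to check whether the <code class="language-plaintext highlighter-rouge">Awaiter</code> has already completed its asynchronous operation; if it has, there is no need to suspend and resume the coroutine. If it has not, execution enters <code class="language-plaintext highlighter-rouge">&lt;suspend-coroutine&gt;</code>, where the compiler generates code that saves the current state of the coroutine so that it can be resumed later: it stores the location of <code class="language-plaintext highlighter-rouge">&lt;resume-point&gt;</code> and spills the values currently held in registers into the coroutine frame (usually allocated dynamically on the heap), which already holds the copies of the coroutine's parameters made when the coroutine started. Once <code class="language-plaintext highlighter-rouge">&lt;suspend-coroutine&gt;</code> has finished, the coroutine is in the suspended state.</p>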

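<p>Before returning to the caller or resumer, the compiler-generated code also calls <code class="language-plaintext highlighter-rouge">await_suspend</code>. This function is the first place where the coroutine can be observed in the suspended state; note that the argument passed to <code class="language-plaintext highlighter-rouge">await_suspend</code> is a handle to the current coroutine.</p>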

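<p>An <code class="language-plaintext highlighter-rouge">await_suspend()</code> that returns <code class="language-plaintext highlighter-rouge">void</code> unconditionally transfers control back to the coroutine's caller/resumer when the call to <code class="language-plaintext highlighter-rouge">await_suspend()</code> returns, whereas the version that returns <code class="language-plaintext highlighter-rouge">bool</code> lets the awaiter conditionally resume the coroutine immediately without returning to the caller/resumer. An <code class="language-plaintext highlighter-rouge">await_suspend()</code> that returns <code class="language-plaintext highlighter-rouge">bool</code> can return <code class="language-plaintext highlighter-rouge">false</code> to indicate that the coroutine should be resumed immediately and continue executing, which effectively turns the asynchronous operation into a synchronous one.</p>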

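<p>Whichever version is used, if the coroutine really does need to suspend, execution reaches <code class="language-plaintext highlighter-rouge">&lt;return-to-caller-or-resumer&gt;</code>. At this point the coroutine's stack frame is popped and the stack frame of the caller or resumer is restored. If this is the first time the coroutine suspends, the call to the coroutine function returns the <code class="language-plaintext highlighter-rouge">ReturnType</code> object to the caller (not shown directly in the code above); on later suspensions, control returns from the <code class="language-plaintext highlighter-rouge">resume()</code> call that last resumed the coroutine. Either way, the coroutine frame continues to exist.</p>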

<blockquote>
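  <p>The compiler expands a coroutine body into a sequence of <code class="language-plaintext highlighter-rouge">co_await</code> expressions, each of which is a suspension point. The <code class="language-plaintext highlighter-rouge">ReturnType</code> object is handed to the coroutine's caller the first time the coroutine suspends; when the coroutine suspends again after being resumed, control goes back to whoever resumed it.</p>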

</blockquote>

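<p>Once a coroutine is suspended, it can be resumed or destroyed through its <code class="language-plaintext highlighter-rouge">coroutine_handle</code>. When and how it is resumed depends on the implementation of <code class="language-plaintext highlighter-rouge">awaiter.await_suspend()</code>; common approaches include:</p>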

<ul>
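  <li>Resume when some asynchronous operation completes</li>
  <li>Resume when other tasks in a thread pool have finished</li>
  <li>Resume immediately</li>
  <li>Symmetric transfer, which we will cover when introducing <code class="language-plaintext highlighter-rouge">folly::coro::Task</code></li>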
</ul>

<p>In every case, when a suspended coroutine is resumed, its registers, local variables, parameters and other state are restored, and execution continues from <code class="language-plaintext highlighter-rouge">&lt;resume-point&gt;</code>. <code class="language-plaintext highlighter-rouge">await_resume</code> is then called to obtain the result of the <code class="language-plaintext highlighter-rouge">co_await</code>; the return value of <code class="language-plaintext highlighter-rouge">await_resume</code> becomes the result of <code class="language-plaintext highlighter-rouge">co_await &lt;expr&gt;</code>. <code class="language-plaintext highlighter-rouge">await_resume</code> may also throw an exception, in which case the exception propagates out of the <code class="language-plaintext highlighter-rouge">co_await</code> expression. If an exception is thrown from <code class="language-plaintext highlighter-rouge">await_suspend</code>, the coroutine is resumed automatically and the exception also propagates out of the <code class="language-plaintext highlighter-rouge">co_await</code> expression, but <code class="language-plaintext highlighter-rouge">await_resume</code> is not called.</p>

<h2 id="coroutine-handle">Coroutine Handle</h2>

<p>Notice that the expansion of <code class="language-plaintext highlighter-rouge">co_await</code> calls <code class="language-plaintext highlighter-rouge">await_suspend()</code>, which takes a parameter of type <code class="language-plaintext highlighter-rouge">coroutine_handle&lt;promise_type&gt;</code>. This <code class="language-plaintext highlighter-rouge">coroutine_handle</code> is a handle to the coroutine frame; it can be used to resume the coroutine or to destroy the coroutine frame, and it can also be used to access the coroutine's <code class="language-plaintext highlighter-rouge">promise</code> object. Note that <code class="language-plaintext highlighter-rouge">coroutine_handle</code> is not a smart pointer type and does not own the coroutine frame.</p>

<p>The main interface of the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> type is shown below:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">coroutine_handle</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">explicit</span> <span class="k">operator</span> <span class="kt">bool</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span><span class="p">;</span>

    <span class="k">static</span> <span class="n">coroutine_handle</span> <span class="n">from_address</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span> <span class="n">a</span><span class="p">)</span> <span class="k">noexcept</span><span class="p">;</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">address</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span><span class="p">;</span>

    <span class="kt">void</span> <span class="k">operator</span><span class="p">()()</span> <span class="k">const</span><span class="p">;</span>
    <span class="kt">void</span> <span class="n">resume</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>

    <span class="kt">void</span> <span class="n">destroy</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>
    <span class="kt">bool</span> <span class="n">done</span><span class="p">()</span> <span class="k">const</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">Promise</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">coroutine_handle</span> <span class="o">:</span> <span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="n">Promise</span><span class="o">&amp;</span> <span class="n">promise</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span><span class="p">;</span>
    <span class="k">static</span> <span class="n">coroutine_handle</span> <span class="n">from_promise</span><span class="p">(</span><span class="n">Promise</span><span class="o">&amp;</span><span class="p">)</span> <span class="k">noexcept</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">resume()</code> resumes a suspended coroutine: it reactivates the coroutine at its <code class="language-plaintext highlighter-rouge">&lt;resume-point&gt;</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">destroy()</code> destroys the coroutine frame: it calls the destructors of all variables still in scope and frees the memory used by the coroutine frame.</li>
  <li><code class="language-plaintext highlighter-rouge">promise</code> and <code class="language-plaintext highlighter-rouge">from_promise</code> convert between a <code class="language-plaintext highlighter-rouge">coroutine_handle</code> and its <code class="language-plaintext highlighter-rouge">promise_type</code> object.</li>
</ul>

<p>It is worth stressing that <code class="language-plaintext highlighter-rouge">coroutine_handle</code> and <code class="language-plaintext highlighter-rouge">promise_type</code> are generally objects that only coroutine libraries need to care about. In the vast majority of cases user code never touches them directly, so both can be regarded as internal implementation details of the coroutine. For example, in the standard library both <code class="language-plaintext highlighter-rouge">promise</code> and <code class="language-plaintext highlighter-rouge">from_promise</code> simply call a compiler builtin:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="n">coroutine_handle</span> <span class="nf">from_promise</span><span class="p">(</span><span class="n">_Promise</span><span class="o">&amp;</span> <span class="n">__p</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">coroutine_handle</span> <span class="n">__self</span><span class="p">;</span>
    <span class="n">__self</span><span class="p">.</span><span class="n">_M_fr_ptr</span> <span class="o">=</span> <span class="n">__builtin_coro_promise</span><span class="p">((</span><span class="kt">char</span><span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">__p</span><span class="p">,</span> <span class="n">__alignof</span><span class="p">(</span><span class="n">_Promise</span><span class="p">),</span> <span class="nb">true</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">__self</span><span class="p">;</span>
<span class="p">}</span>

<span class="n">_Promise</span><span class="o">&amp;</span> <span class="n">promise</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span>
    <span class="kt">void</span><span class="o">*</span> <span class="n">__t</span> <span class="o">=</span> <span class="n">__builtin_coro_promise</span><span class="p">(</span><span class="n">_M_fr_ptr</span><span class="p">,</span> <span class="n">__alignof</span><span class="p">(</span><span class="n">_Promise</span><span class="p">),</span> <span class="nb">false</span><span class="p">);</span>
    <span class="k">return</span> <span class="o">*</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">_Promise</span><span class="o">*&gt;</span><span class="p">(</span><span class="n">__t</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="coroutine-body">Coroutine body</h2>

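<p>Now that we understand how <code class="language-plaintext highlighter-rouge">co_await</code> works, we can look at how the compiler handles a coroutine. The compiler expands a coroutine into the following three-part code:</p>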

<ol>
  <li><code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend();</code></li>
  <li><code class="language-plaintext highlighter-rouge">coroutine body</code></li>
  <li><code class="language-plaintext highlighter-rouge">co_await promise.final_suspend();</code></li>
</ol>

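<p>The expanded code involves every component discussed so far. The next post covers the whole flow in detail, so here we only outline the key steps.</p>

<blockquote>
  <p>See also the <a href="https://eel.is/c++draft/dcl.fct.def.coroutine#5">description</a> in the draft standard.</p>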

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Pretend there's a compiler-generated structure called 'coroutine_frame'</span>
<span class="c1">// that holds all of the state needed for the coroutine. Its constructor</span>
<span class="c1">// takes a copy of parameters and default-constructs a promise object.</span>
<span class="k">struct</span> <span class="nc">coroutine_frame</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>

<span class="n">ReturnType</span> <span class="nf">some_coroutine</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span><span class="o">*</span> <span class="n">f</span> <span class="o">=</span> <span class="k">new</span> <span class="n">coroutine_frame</span><span class="p">(...);</span>
    <span class="k">auto</span> <span class="n">returnObject</span> <span class="o">=</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">get_return_object</span><span class="p">();</span>
    <span class="n">expanded_coroutine</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">returnObject</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">expanded_coroutine</span><span class="p">(</span><span class="n">coroutine_frame</span><span class="o">*</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
        <span class="k">co_await</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">initial_suspend</span><span class="p">();</span>
        <span class="o">&lt;</span><span class="n">body</span><span class="o">-</span><span class="n">statements</span><span class="o">&gt;</span>
        <span class="c1">// f-&gt;promise.return_void() or f-&gt;promise.return_value(...) will be called</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
        <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">unhandled_exception</span><span class="p">();</span>
    <span class="p">}</span>
<span class="n">final_suspend_label</span><span class="o">:</span>
    <span class="k">co_await</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">final_suspend</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

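<p>First, as soon as the compiler sees one of the three keywords <code class="language-plaintext highlighter-rouge">co_await</code>, <code class="language-plaintext highlighter-rouge">co_yield</code>, or <code class="language-plaintext highlighter-rouge">co_return</code>, it knows the function is a coroutine. It then inspects <code class="language-plaintext highlighter-rouge">ReturnType</code> and uses it to determine the <code class="language-plaintext highlighter-rouge">promise_type</code> (either as a nested type or via <code class="language-plaintext highlighter-rouge">coroutine_traits</code>).</p>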

<p>With <code class="language-plaintext highlighter-rouge">promise_type</code> known, the compiler-generated code constructs the coroutine frame, including the <code class="language-plaintext highlighter-rouge">promise_type</code> object stored inside it. It then calls <code class="language-plaintext highlighter-rouge">get_return_object</code> on the promise to obtain the <code class="language-plaintext highlighter-rouge">ReturnType</code> object, which is returned to the caller when the coroutine first suspends or finishes.</p>

<p>Next comes <code class="language-plaintext highlighter-rouge">co_await initial_suspend</code>. The coroutine body starts executing once the <code class="language-plaintext highlighter-rouge">initial_suspend</code> awaiter is resumed (or right away if its <code class="language-plaintext highlighter-rouge">await_suspend</code> returns <code class="language-plaintext highlighter-rouge">false</code>, which means immediate resumption). In other words, <code class="language-plaintext highlighter-rouge">initial_suspend</code> in <code class="language-plaintext highlighter-rouge">promise_type</code> decides when the coroutine body starts executing.</p>

<p>When the coroutine finishes, <code class="language-plaintext highlighter-rouge">return_void</code> or <code class="language-plaintext highlighter-rouge">return_value</code> is called, depending on the return value. If an exception escapes during execution, <code class="language-plaintext highlighter-rouge">unhandled_exception</code> is called. Most implementations use these methods to store the coroutine's result in the <code class="language-plaintext highlighter-rouge">promise_type</code> object.</p>

<p>Either way, control eventually reaches <code class="language-plaintext highlighter-rouge">final_suspend_label</code>, where <code class="language-plaintext highlighter-rouge">co_await final_suspend</code> runs. It decides who destroys the coroutine frame. Typically <code class="language-plaintext highlighter-rouge">final_suspend</code> always suspends, so that the coroutine library can call <code class="language-plaintext highlighter-rouge">destroy()</code> on the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> from outside the coroutine. In other words, <code class="language-plaintext highlighter-rouge">final_suspend</code> in <code class="language-plaintext highlighter-rouge">promise_type</code> decides who destroys the coroutine and when.</p>

<blockquote>
  <p>Besides suspending, <code class="language-plaintext highlighter-rouge">final_suspend</code> can also be implemented with symmetric transfer. We leave that for the analysis of <code class="language-plaintext highlighter-rouge">folly::coro::Task</code>.</p>

</blockquote>

<h2 id="cheatsheet">Cheatsheet</h2>

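<p>With these concepts in hand, we can finally start assembling the scattered details into a complete picture. Let's first review the puzzle pieces we have:</p>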

<ol>
  <li>The compiler expands <code class="language-plaintext highlighter-rouge">co_await</code> into a fixed code pattern. The <code class="language-plaintext highlighter-rouge">Awaitable</code> controls whether the coroutine suspends, what custom behavior runs on suspension, and how the value of the <code class="language-plaintext highlighter-rouge">co_await</code> expression is obtained.</li>
  <li>The compiler expands a coroutine into a fixed three-part code structure. <code class="language-plaintext highlighter-rouge">promise_type</code> decides how the coroutine behaves at key points such as start and finish:
    <ol>
      <li>The <code class="language-plaintext highlighter-rouge">Awaitable</code> in <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code> determines what happens before the coroutine body runs, for example whether it starts immediately or lazily.</li>
      <li><code class="language-plaintext highlighter-rouge">return_void</code>/<code class="language-plaintext highlighter-rouge">return_value</code>/<code class="language-plaintext highlighter-rouge">unhandled_exception</code> in the <code class="language-plaintext highlighter-rouge">promise</code> determine how the coroutine handles its return value and exceptions.</li>
      <li>The <code class="language-plaintext highlighter-rouge">Awaitable</code> in <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code> determines what happens after the coroutine body has run, for example who destroys the coroutine and when.</li>
    </ol>
  </li>
  <li>The coroutine ultimately returns a <code class="language-plaintext highlighter-rouge">ReturnType</code>. Through its interface, the author of a coroutine library defines how users interact with the coroutine.</li>
</ol>

<p>Now let's try to fit these pieces together into one picture. First, here is how <code class="language-plaintext highlighter-rouge">ReturnType</code> and <code class="language-plaintext highlighter-rouge">promise_type</code> connect:</p>

<p><img src="/archive/coroutine-0.png" alt="figure" /></p>

<p><code class="language-plaintext highlighter-rouge">promise_type</code> and <code class="language-plaintext highlighter-rouge">coroutine_handle</code> can be converted into each other, and <code class="language-plaintext highlighter-rouge">promise_type</code> has a method that returns <code class="language-plaintext highlighter-rouge">ReturnType</code>. How does that work? Most <code class="language-plaintext highlighter-rouge">ReturnType</code> implementations are constructed from a <code class="language-plaintext highlighter-rouge">coroutine_handle</code> and store it as a member, so that <code class="language-plaintext highlighter-rouge">ReturnType</code> can call <code class="language-plaintext highlighter-rouge">resume()</code> at the right moment to continue the coroutine:</p>

<p><img src="/archive/coroutine-1.png" alt="figure" /></p>

<blockquote>
  <p>This is only an example. Not every <code class="language-plaintext highlighter-rouge">ReturnType</code> provides an explicit <code class="language-plaintext highlighter-rouge">resume</code> interface.</p>

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="n">Caller</span>            <span class="n">Coroutine</span> <span class="n">Internals</span>
      <span class="err">│</span>                       <span class="err">│</span>
      <span class="err">▼</span>                       <span class="err">▼</span>
<span class="err">┌────────────┐</span>          <span class="err">┌───────────┐</span>            <span class="err">┌──────────────────┐</span>
<span class="err">│</span> <span class="n">ReturnType</span> <span class="err">│◄─────────│</span>  <span class="n">Promise</span>  <span class="err">│◄──────────►│</span> <span class="n">coroutine</span> <span class="n">handle</span> <span class="err">│</span>
<span class="err">│</span>            <span class="err">│</span>  <span class="n">create</span>  <span class="err">│</span>           <span class="err">│</span>    <span class="n">bind</span>    <span class="err">│</span>                  <span class="err">│</span>
<span class="err">└────────────┘</span>          <span class="err">└───────────┘</span>            <span class="err">└──────────────────┘</span>
      <span class="err">│</span>                       <span class="err">│</span>                            <span class="err">│</span>
      <span class="err">│</span><span class="n">owns</span>                   <span class="err">│</span><span class="n">stores</span>                      <span class="err">│</span><span class="n">points</span> <span class="n">to</span>
      <span class="err">▼</span>                       <span class="err">▼</span>                            <span class="err">▼</span>
<span class="err">┌────────────┐</span>          <span class="err">┌───────────┐</span>            <span class="err">┌──────────────────┐</span>
<span class="err">│</span> <span class="n">coroutine</span>  <span class="err">│</span>          <span class="err">│</span> <span class="n">result</span> <span class="n">or</span> <span class="err">│</span>            <span class="err">│</span> <span class="n">coroutine</span> <span class="n">frame</span>  <span class="err">│</span>
<span class="err">│</span>   <span class="n">handle</span>   <span class="err">│</span>          <span class="err">│</span> <span class="n">exception</span> <span class="err">│</span>            <span class="err">│</span>                  <span class="err">│</span>
<span class="err">│</span><span class="p">(</span><span class="n">not</span> <span class="n">owning</span><span class="p">)</span><span class="err">│</span>          <span class="err">└───────────┘</span>            <span class="err">└──────────────────┘</span>
<span class="err">└────────────┘</span>
</code></pre></div></div>

<p>As the figure below shows, we can now connect the caller, <code class="language-plaintext highlighter-rouge">ReturnType</code>, <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, and <code class="language-plaintext highlighter-rouge">promise_type</code>. What about <code class="language-plaintext highlighter-rouge">Awaitable</code>?</p>

<p><img src="/archive/coroutine-2.png" alt="figure" /></p>

<p>To keep things simple, we treat <code class="language-plaintext highlighter-rouge">Awaitable</code> and <code class="language-plaintext highlighter-rouge">Awaiter</code> as a single <code class="language-plaintext highlighter-rouge">Awaitable</code> here.</p>

<p><img src="/archive/coroutine-3.png" alt="figure" /></p>

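<p>In the expanded coroutine body above, we can see the coroutine calling different methods of the <code class="language-plaintext highlighter-rouge">Awaiter</code> to decide whether to suspend or keep running. For example, <code class="language-plaintext highlighter-rouge">await_suspend</code> receives the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the suspended coroutine and thereby determines what happens on suspension. When <code class="language-plaintext highlighter-rouge">resume</code> is called on the <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, the coroutine is resumed and obtains the result of the <code class="language-plaintext highlighter-rouge">co_await</code> through <code class="language-plaintext highlighter-rouge">await_resume</code>, so that it can continue executing.</p>

<p>The complete relationship between the components therefore looks like this:</p>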

<p><img src="/archive/coroutine-4.png" alt="figure" /></p>

<h2 id="examples">Examples</h2>

<p>Next, we work through a few examples to deepen our understanding of this diagram.</p>

<h3 id="调用方获取coroutine产生的数据">The caller retrieves data produced by the coroutine</h3>

<p>Suppose the coroutine produces the value 42 and wants to pass it to the caller. The value first has to be stored in the <code class="language-plaintext highlighter-rouge">Awaitable</code>, here <code class="language-plaintext highlighter-rouge">TheAnswer</code>. As the diagram above shows, the <code class="language-plaintext highlighter-rouge">Awaitable</code> can interact with the <code class="language-plaintext highlighter-rouge">promise_type</code> in <code class="language-plaintext highlighter-rouge">await_suspend</code>, so the value can then be saved in the <code class="language-plaintext highlighter-rouge">promise</code>.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Coroutine</span> <span class="nf">f1</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">co_await</span> <span class="n">TheAnswer</span><span class="p">{</span><span class="mi">42</span><span class="p">};</span>
<span class="p">}</span>

<span class="k">struct</span> <span class="nc">promise</span> <span class="p">{</span>
    <span class="c1">// ...</span>
    <span class="kt">int</span> <span class="n">value</span><span class="p">;</span>
<span class="p">};</span>

<span class="n">TheAnswer</span><span class="o">::</span><span class="n">TheAnswer</span><span class="p">(</span><span class="kt">int</span> <span class="n">v</span><span class="p">)</span><span class="o">:</span> <span class="n">value_</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="p">{}</span>

<span class="kt">void</span> <span class="n">TheAnswer</span><span class="o">::</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value</span> <span class="o">=</span> <span class="n">value_</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p><img src="/archive/coroutine-5.png" alt="figure" /></p>

<p>Finally, the caller can reach the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> through the <code class="language-plaintext highlighter-rouge">ReturnType</code>, and from it the promise object, to read the data produced by the coroutine.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">Coroutine</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise</span><span class="o">&gt;</span> <span class="n">handle</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">getAnswer</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">handle</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Coroutine</span> <span class="n">c1</span> <span class="o">=</span> <span class="n">f1</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"The answer is "</span> <span class="o">&lt;&lt;</span> <span class="n">c1</span><span class="p">.</span><span class="n">getAnswer</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p><img src="/archive/coroutine-6.png" alt="figure" /></p>

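<h3 id="coroutine获取调用方生成的数据">The coroutine retrieves data produced by the caller</h3>

<p>Suppose a coroutine has been called and suspended at some point, and the caller wants to pass some data into it, so that the data is ready by the time the coroutine is woken up.</p>

<p>A simple example follows. When <code class="language-plaintext highlighter-rouge">co_await OutsideAnswer</code> suspends the coroutine, <code class="language-plaintext highlighter-rouge">main</code> continues with <code class="language-plaintext highlighter-rouge">c1.provide(42)</code>, which stores the data in the <code class="language-plaintext highlighter-rouge">promise</code> and wakes the coroutine.</p>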

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Coroutine</span> <span class="nf">f2</span><span class="p">()</span> <span class="p">{</span>
    <span class="kt">int</span> <span class="n">answer</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">OutsideAnswer</span><span class="p">{};</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="n">Coroutine</span><span class="o">::</span><span class="n">provide</span><span class="p">(</span><span class="kt">int</span> <span class="n">the_answer</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">handle</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value</span> <span class="o">=</span> <span class="n">the_answer</span><span class="p">;</span>
    <span class="n">handle</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Coroutine</span> <span class="n">c1</span> <span class="o">=</span> <span class="n">f2</span><span class="p">();</span>
    <span class="n">c1</span><span class="p">.</span><span class="n">provide</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

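<p>Reading the data inside the coroutine is just as simple. By the time the coroutine is woken up, the data has already been stored in the <code class="language-plaintext highlighter-rouge">promise</code>, so <code class="language-plaintext highlighter-rouge">await_resume</code> can simply return it.</p>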

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">OutsideAnswer</span> <span class="p">{</span>
    <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span> <span class="p">}</span>
    <span class="kt">void</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">handle</span> <span class="o">=</span> <span class="n">h</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="kt">int</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">handle</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise</span><span class="o">&gt;</span> <span class="n">handle</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p><img src="/archive/coroutine-7.png" alt="figure" /></p>

<h2 id="at-last">At last</h2>

<p>This is the first post in the coroutine series. We introduced the core concepts, including <code class="language-plaintext highlighter-rouge">ReturnType</code>, <code class="language-plaintext highlighter-rouge">promise_type</code>, <code class="language-plaintext highlighter-rouge">Awaitable</code>, and <code class="language-plaintext highlighter-rouge">Awaiter</code>. We then looked at how the compiler handles <code class="language-plaintext highlighter-rouge">co_await</code> and, building on that, how it handles the coroutine body. Because the concepts are so intertwined, we also tied them together with a diagram. The next post focuses on how to customize coroutine behavior through <code class="language-plaintext highlighter-rouge">promise_type</code>.</p>

<h2 id="reference">Reference</h2>

<p><a href="https://www.youtube.com/watch?v=J7fYddslH0Q">Deciphering C++ Coroutines - A Diagrammatic Coroutine Cheat Sheet - Andreas Weis - CppCon 2022</a></p>

<p><a href="https://lewissbaker.github.io/">Asymmetric Transfer | Some thoughts on programming, C++ and other things.</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[Every time I read an introduction to coroutines, I get lost in the sheer number of concepts. I tried to sort them out once before, with mediocre results. This time I spent a good amount of time studying the topic systematically, hoping it sticks. Much of the content comes from this blog, but it lists too many details and lacks a global view. Only when I came across this talk a while ago did the concepts finally connect into a reasonably clear understanding. I hope it helps you too.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 2</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-2/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 2" /><published>2025-11-06T00:00:00+08:00</published><updated>2025-11-06T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-2</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-2/"><![CDATA[<p>This post continues from the perspective of <code class="language-plaintext highlighter-rouge">promise_type</code>, giving a more complete account of how the compiler turns a coroutine into a fixed three-part code structure and how <code class="language-plaintext highlighter-rouge">promise_type</code> customizes the coroutine's behavior. With the concepts from the previous post in place, we can finally explore the details through demos.</p>

<h2 id="promise_type">promise_type</h2>

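<p>First, let's recall what <code class="language-plaintext highlighter-rouge">promise_type</code> is for. The previous post described it as follows:</p>

<p>The <code class="language-plaintext highlighter-rouge">promise_type</code> interface specifies methods for customizing the behavior of the coroutine itself. A library author can customize what happens when the coroutine is called and what happens when it returns (either normally or through an unhandled exception). These customization points can execute arbitrary logic.</p>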

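<p>In addition, there is one point the previous post did not cover: <code class="language-plaintext highlighter-rouge">promise_type</code> can also customize what <code class="language-plaintext highlighter-rouge">co_await</code> and <code class="language-plaintext highlighter-rouge">co_yield</code> do inside the coroutine, through <code class="language-plaintext highlighter-rouge">await_transform</code> and <code class="language-plaintext highlighter-rouge">yield_value</code> respectively. Note that this applies to <code class="language-plaintext highlighter-rouge">co_await</code> or <code class="language-plaintext highlighter-rouge">co_yield</code> expressions evaluated within the current coroutine.</p>

<p>The complete interface of <code class="language-plaintext highlighter-rouge">promise_type</code> is as follows:</p>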

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
    <span class="c1">// creating coroutine object - mandatory</span>
    <span class="n">ReturnType</span> <span class="n">get_return_object</span><span class="p">();</span>

    <span class="c1">// returns awaitable object - mandatory</span>
    <span class="k">auto</span> <span class="n">initial_suspend</span><span class="p">();</span>
    <span class="k">auto</span> <span class="n">final_suspend</span><span class="p">();</span>

    <span class="kt">void</span> <span class="n">unhandled_exception</span><span class="p">();</span>  <span class="c1">// mandatory</span>
    <span class="c1">// one of below is mandatory and only one must be present</span>
    <span class="kt">void</span> <span class="n">return_value</span><span class="p">(</span><span class="cm">/*type*/</span><span class="p">);</span>
    <span class="kt">void</span> <span class="n">return_void</span><span class="p">();</span>

    <span class="c1">// support for yielding values - returns awaitable</span>
    <span class="k">auto</span> <span class="n">yield_value</span><span class="p">(</span><span class="cm">/*co_yield operand*/</span><span class="p">);</span>

    <span class="c1">// modification of the awaitable</span>
    <span class="k">auto</span> <span class="n">await_transform</span><span class="p">(</span><span class="cm">/*co_await operand*/</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">get_return_object</code>: obtains the <code class="language-plaintext highlighter-rouge">ReturnType</code> object that corresponds to the <code class="language-plaintext highlighter-rouge">promise_type</code> object. When the coroutine reaches its first suspend point and control returns to the caller, the caller receives the <code class="language-plaintext highlighter-rouge">ReturnType</code> object produced by <code class="language-plaintext highlighter-rouge">get_return_object</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">return_void</code>/<code class="language-plaintext highlighter-rouge">return_value</code>/<code class="language-plaintext highlighter-rouge">unhandled_exception</code>: customization points for what happens when the coroutine reaches a <code class="language-plaintext highlighter-rouge">co_return</code> statement and for how exceptions are handled.</li>
  <li><code class="language-plaintext highlighter-rouge">initial_suspend</code>: customization point for what happens before the coroutine body runs, for example whether it starts immediately or lazily.</li>
  <li><code class="language-plaintext highlighter-rouge">final_suspend</code>: customization point for what happens after the coroutine body has run, for example who destroys the coroutine and when.</li>
  <li><code class="language-plaintext highlighter-rouge">unhandled_exception</code>: handles exceptions thrown while the coroutine is executing.</li>
  <li><code class="language-plaintext highlighter-rouge">return_value</code> and <code class="language-plaintext highlighter-rouge">return_void</code>: store the coroutine's return value.</li>
  <li><code class="language-plaintext highlighter-rouge">yield_value</code>: the compiler essentially translates <code class="language-plaintext highlighter-rouge">co_yield &lt;expr&gt;</code> into <code class="language-plaintext highlighter-rouge">co_await promise.yield_value(&lt;expr&gt;)</code>, so <code class="language-plaintext highlighter-rouge">yield_value</code> customizes the behavior of <code class="language-plaintext highlighter-rouge">co_yield</code>. Unless there is a special case, <code class="language-plaintext highlighter-rouge">co_yield</code> is not discussed separately from here on.</li>
  <li><code class="language-plaintext highlighter-rouge">await_transform</code>: each time the coroutine body executes <code class="language-plaintext highlighter-rouge">co_await</code>, the operand is first passed through <code class="language-plaintext highlighter-rouge">await_transform</code>, which lets the promise intercept and rewrite the <code class="language-plaintext highlighter-rouge">Awaitable</code> object.</li>
</ul>

<h2 id="coroutine-body">Coroutine body</h2>

<p>The compiler expands a coroutine into the following three-part code:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend();</code></li>
  <li><code class="language-plaintext highlighter-rouge">coroutine body</code></li>
  <li><code class="language-plaintext highlighter-rouge">co_await promise.final_suspend();</code></li>
</ol>

<p>The three-part expansion is sketched below (details that do not matter yet are omitted). In this post the <code class="language-plaintext highlighter-rouge">co_await</code> expressions are expanded as well:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Pretend there's a compiler-generated structure called 'coroutine_frame'</span>
<span class="c1">// that holds all of the state needed for the coroutine. Its constructor</span>
<span class="c1">// takes a copy of parameters and default-constructs a promise object.</span>
<span class="k">struct</span> <span class="nc">coroutine_frame</span> <span class="p">{</span> <span class="p">...</span> <span class="p">};</span>

<span class="n">ReturnType</span> <span class="nf">some_coroutine</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span><span class="o">*</span> <span class="n">f</span> <span class="o">=</span> <span class="k">new</span> <span class="n">coroutine_frame</span><span class="p">(...);</span>
    <span class="k">auto</span> <span class="n">returnObject</span> <span class="o">=</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">get_return_object</span><span class="p">();</span>
    <span class="n">expanded_coroutine</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
    <span class="k">return</span> <span class="n">returnObject</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">expanded_coroutine</span><span class="p">(</span><span class="n">coroutine_frame</span><span class="o">*</span> <span class="n">f</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
        <span class="c1">// 1. co_await promise.initial_suspend() is expanded below</span>
        <span class="p">{</span>
            <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">initial_suspend</span><span class="p">();</span>
            <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaiter</span> <span class="o">=</span> <span class="n">awaitable</span><span class="p">;</span>
            <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
                <span class="o">&lt;</span><span class="n">suspend</span><span class="o">-</span><span class="n">coroutine</span><span class="o">&gt;</span>
                <span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">));</span>
                <span class="c1">// return to caller</span>
                <span class="k">return</span><span class="p">;</span>
            <span class="p">}</span>
            <span class="o">&lt;</span><span class="n">resume</span><span class="o">-</span><span class="n">point</span><span class="o">&gt;</span>
            <span class="n">awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
        <span class="p">}</span>

        <span class="c1">// 2. coroutine body</span>
        <span class="o">&lt;</span><span class="n">body</span><span class="o">-</span><span class="n">statements</span><span class="o">&gt;</span>
        <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">return_void</span><span class="p">()</span> <span class="n">or</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">return_value</span><span class="p">(...);</span>
        <span class="c1">// destruct all local variables in reverse order</span>
        <span class="k">goto</span> <span class="n">final_suspend_label</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
        <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">unhandled_exception</span><span class="p">();</span>
        <span class="c1">// destruct all local variables in reverse order</span>
        <span class="k">goto</span> <span class="n">final_suspend_label</span><span class="p">;</span>
    <span class="p">}</span>

<span class="n">final_suspend_label</span><span class="o">:</span>
    <span class="c1">// 3. co_await promise.final_suspend() is expanded below</span>
    <span class="p">{</span>
        <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">final_suspend</span><span class="p">();</span>
        <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaiter</span> <span class="o">=</span> <span class="n">awaitable</span><span class="p">;</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
            <span class="o">&lt;</span><span class="n">suspend</span><span class="o">-</span><span class="n">coroutine</span><span class="o">&gt;</span>
            <span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">f</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">));</span>
            <span class="c1">// return to caller</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>
    <span class="o">&lt;</span><span class="n">destroy</span> <span class="n">coroutine</span> <span class="n">frame</span><span class="o">&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The rough steps are:</p>

<ol>
  <li>Construct the coroutine frame, including the <code class="language-plaintext highlighter-rouge">promise_type</code> object inside it.</li>
  <li>Call <code class="language-plaintext highlighter-rouge">get_return_object</code> on the <code class="language-plaintext highlighter-rouge">promise_type</code> object to obtain the <code class="language-plaintext highlighter-rouge">ReturnType</code> object, which is returned to the caller when the coroutine first suspends or finishes.</li>
  <li>Evaluate <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code>, which lets <code class="language-plaintext highlighter-rouge">promise_type</code> customize what happens before the coroutine body runs. The body starts executing once the coroutine is resumed from <code class="language-plaintext highlighter-rouge">initial_suspend</code>.</li>
  <li>When the body finishes, <code class="language-plaintext highlighter-rouge">return_void</code> or <code class="language-plaintext highlighter-rouge">return_value</code> is called depending on the kind of return, and the result is stored in the <code class="language-plaintext highlighter-rouge">promise_type</code> object. If an exception escapes the body, <code class="language-plaintext highlighter-rouge">unhandled_exception</code> is called instead. Afterwards all local variables in the coroutine body are destroyed.</li>
  <li>Whichever path was taken, control ends up at <code class="language-plaintext highlighter-rouge">final_suspend_label</code> and evaluates <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code>, which lets <code class="language-plaintext highlighter-rouge">promise_type</code> customize what happens after the body has run.</li>
  <li>The <code class="language-plaintext highlighter-rouge">&lt;destroy coroutine frame&gt;</code> step at the end destroys the coroutine frame. When exactly that happens depends on <code class="language-plaintext highlighter-rouge">final_suspend</code> and is covered later.</li>
</ol>

<p>The sections below walk through each step in detail.</p>

<h3 id="allocating-a-coroutine-frame">Allocating a coroutine frame</h3>

<p>First, the compiler generates a call to <code class="language-plaintext highlighter-rouge">operator new</code> to allocate memory for the coroutine frame, which holds:</p>

<ul>
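  <li>the <code class="language-plaintext highlighter-rouge">promise_type</code> object</li>
  <li>copies of all coroutine parameters</li>
  <li>information about the current suspension point and how to resume or destroy the coroutine</li>
  <li>any local variables whose lifetime spans a suspension point</li>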
</ul>

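<p>The coroutine must copy every argument the original caller passed to the coroutine function into the coroutine frame so that they remain valid after the coroutine suspends. A parameter passed by value is copied into the frame by invoking its type's move constructor. For a parameter passed by reference (lvalue or rvalue), only the reference is copied into the frame, so the referenced object must outlive every use of it by the coroutine. Once all parameters have been copied into the frame, the coroutine constructs the promise object.</p>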

<p>Similarly, destroying the coroutine frame involves:</p>

<ol>
  <li>Calling the destructor of the <code class="language-plaintext highlighter-rouge">promise_type</code> object.</li>
  <li>Calling the destructors of the parameter copies stored in the coroutine frame.</li>
  <li>Calling <code class="language-plaintext highlighter-rouge">operator delete</code> to free the memory used by the coroutine frame.</li>
</ol>

<h3 id="executing-coroutine-body">Executing coroutine body</h3>

<ol>
  <li>
    <p>Obtaining the return object</p>

    <p>The first thing the coroutine does with the <code class="language-plaintext highlighter-rouge">promise_type</code> object is call <code class="language-plaintext highlighter-rouge">promise.get_return_object()</code> to obtain the <code class="language-plaintext highlighter-rouge">ReturnType</code> object. That object is handed to the coroutine's caller when the coroutine first suspends, or when it runs to completion and execution returns to the caller.</p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">initial_suspend</code></p>

    <p>Once the coroutine frame is initialized and the <code class="language-plaintext highlighter-rouge">ReturnType</code> object has been obtained, the next thing executed is <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code>. This lets <code class="language-plaintext highlighter-rouge">promise_type</code> control whether the coroutine suspends before its body runs or starts executing the body immediately.</p>

    <p>If the coroutine suspends at the <code class="language-plaintext highlighter-rouge">initial_suspend</code> point, it can later be resumed or destroyed by calling <code class="language-plaintext highlighter-rouge">resume()</code> or <code class="language-plaintext highlighter-rouge">destroy()</code> on its <code class="language-plaintext highlighter-rouge">coroutine_handle</code>. Also note that the compiler-generated code ignores the return value of <code class="language-plaintext highlighter-rouge">await_resume</code> for <code class="language-plaintext highlighter-rouge">initial_suspend</code>, that is, the result of the <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code> expression is discarded, so <code class="language-plaintext highlighter-rouge">await_resume</code> usually returns <code class="language-plaintext highlighter-rouge">void</code> here.</p>

    <p>For many kinds of coroutines, <code class="language-plaintext highlighter-rouge">initial_suspend()</code> returns either <code class="language-plaintext highlighter-rouge">std::suspend_always</code> (the coroutine starts lazily) or <code class="language-plaintext highlighter-rouge">std::suspend_never</code> (the coroutine starts immediately).</p>
  </li>
  <li>
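    <p>Returning to the caller</p>

    <p>The <code class="language-plaintext highlighter-rouge">ReturnType</code> object produced by the <code class="language-plaintext highlighter-rouge">get_return_object()</code> call is returned to the coroutine's caller when the coroutine first suspends or, if it never suspends, when it runs to completion.</p>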
  </li>
  <li>
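    <p>Returning from the coroutine with <code class="language-plaintext highlighter-rouge">co_return</code></p>

    <p>When the coroutine reaches a <code class="language-plaintext highlighter-rouge">co_return</code> statement, the statement is translated into <code class="language-plaintext highlighter-rouge">promise.return_void()</code> or <code class="language-plaintext highlighter-rouge">promise.return_value(&lt;expr&gt;)</code>, followed by <code class="language-plaintext highlighter-rouge">goto final_suspend_label</code>. Note that if execution runs off the end of the coroutine without a <code class="language-plaintext highlighter-rouge">co_return</code> statement, this is equivalent to a <code class="language-plaintext highlighter-rouge">co_return;</code> at the end of the function body (the promise type must then provide <code class="language-plaintext highlighter-rouge">return_void</code>; otherwise the behavior is undefined).</p>

    <p>The exact rules are:</p>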

    <ul>
      <li><code class="language-plaintext highlighter-rouge">co_return;</code> becomes <code class="language-plaintext highlighter-rouge">promise.return_void();</code></li>
      <li><code class="language-plaintext highlighter-rouge">co_return &lt;expr&gt;;</code>
        <ul>
          <li>If the type of <code class="language-plaintext highlighter-rouge">&lt;expr&gt;</code> is <code class="language-plaintext highlighter-rouge">void</code>, it becomes <code class="language-plaintext highlighter-rouge">&lt;expr&gt;; promise.return_void();</code></li>
          <li>If the type of <code class="language-plaintext highlighter-rouge">&lt;expr&gt;</code> is not <code class="language-plaintext highlighter-rouge">void</code>, it becomes <code class="language-plaintext highlighter-rouge">promise.return_value(&lt;expr&gt;);</code></li>
        </ul>
      </li>
    </ul>

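    <p>The subsequent <code class="language-plaintext highlighter-rouge">goto final_suspend_label</code> causes all local variables with automatic storage duration to be destroyed in the reverse order of their construction.</p>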
  </li>
  <li>
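    <p>Handling exceptions that propagate out of the coroutine body</p>

    <p>If an exception is thrown out of the coroutine body, it is caught and <code class="language-plaintext highlighter-rouge">promise.unhandled_exception()</code> is called inside the <code class="language-plaintext highlighter-rouge">catch</code> block. An implementation typically calls <code class="language-plaintext highlighter-rouge">std::current_exception()</code> there to capture the exception and store it, and later exposes through the <code class="language-plaintext highlighter-rouge">promise_type</code> interface whether an exception occurred while the coroutine was running.</p>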
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">final_suspend</code></p>

    <p>Once the coroutine body has finished and the result has been handled by <code class="language-plaintext highlighter-rouge">return_void()</code>, <code class="language-plaintext highlighter-rouge">return_value()</code> or <code class="language-plaintext highlighter-rouge">unhandled_exception()</code>, the coroutine gets a chance to run some extra logic before control returns to the caller or resumer. It does so by evaluating <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code>, which can, for example, publish the result, signal completion, or resume a continuation (covered below; in short, when the current coroutine finishes it wakes up another coroutine to continue). It also allows the coroutine to suspend before the coroutine frame is destroyed.</p>

    <p>If the coroutine suspends at the <code class="language-plaintext highlighter-rouge">final_suspend</code> point, calling <code class="language-plaintext highlighter-rouge">resume()</code> on it is undefined behavior; the only thing that can be done with it is <code class="language-plaintext highlighter-rouge">destroy()</code>. Notice also that the expanded code above has no resume point for <code class="language-plaintext highlighter-rouge">final_suspend</code>, so the <code class="language-plaintext highlighter-rouge">await_resume</code> of its <code class="language-plaintext highlighter-rouge">Awaitable</code> is never called there. In addition, <code class="language-plaintext highlighter-rouge">final_suspend</code> must be <code class="language-plaintext highlighter-rouge">noexcept</code>.</p>
  </li>
  <li>
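    <p>Destroying the coroutine frame</p>

    <p>The last part of the expanded code destroys the whole coroutine frame. At this point the coroutine body has run to completion and no more user code remains to execute, so it is effectively the <code class="language-plaintext highlighter-rouge">await_ready</code> of <code class="language-plaintext highlighter-rouge">final_suspend</code> that decides when the coroutine frame is destroyed:</p>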

    <ul>
      <li><code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">true</code>: the coroutine does not suspend and the frame is destroyed immediately</li>
      <li><code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">false</code>: the coroutine suspends and waits to be destroyed from outside</li>
    </ul>

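    <p>Although the standard allows a coroutine not to suspend at the <code class="language-plaintext highlighter-rouge">final_suspend</code> point, from an engineering-practice standpoint it should suspend. This requires <code class="language-plaintext highlighter-rouge">.destroy()</code> to be called on the coroutine from outside, commonly from the destructor of some RAII object.</p>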
  </li>
</ol>

<h2 id="demo">Demo</h2>

<p>Let's use some concrete examples to see how <code class="language-plaintext highlighter-rouge">promise_type</code> customizes the behavior of a coroutine.</p>

<h3 id="coroutine-state-machine">Coroutine state machine</h3>

<p>First, a Hello World example, which also gives a closer look at how the compiler expands a coroutine.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;coroutine&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="k">struct</span> <span class="nc">ReturnType</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
        <span class="n">ReturnType</span> <span class="n">get_return_object</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">return</span> <span class="n">ReturnType</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">)};</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="nf">initial_suspend</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">return</span> <span class="p">{};</span>
        <span class="p">}</span>
        <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="n">final_suspend</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="k">return</span> <span class="p">{};</span>
        <span class="p">}</span>
        <span class="kt">void</span> <span class="nf">return_void</span><span class="p">()</span> <span class="p">{}</span>
        <span class="kt">void</span> <span class="nf">unhandled_exception</span><span class="p">()</span> <span class="p">{}</span>
    <span class="p">};</span>

    <span class="k">explicit</span> <span class="nf">ReturnType</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">handle</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>

    <span class="o">~</span><span class="n">ReturnType</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">handle</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">handle</span><span class="p">.</span><span class="n">destroy</span><span class="p">();</span>
        <span class="p">}</span>
    <span class="p">}</span>

    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle</span><span class="p">;</span>
<span class="p">};</span>

<span class="n">ReturnType</span> <span class="nf">hello</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">string</span><span class="o">&amp;</span> <span class="n">to</span><span class="p">,</span> <span class="kt">int</span> <span class="n">times</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">times</span><span class="p">;</span> <span class="o">++</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"Hello "</span> <span class="o">&lt;&lt;</span> <span class="n">to</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">co_return</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">coro</span> <span class="o">=</span> <span class="n">hello</span><span class="p">(</span><span class="s">"World"</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Note that running this program does not print Hello World, because <code class="language-plaintext highlighter-rouge">initial_suspend</code> returns <code class="language-plaintext highlighter-rouge">suspend_always</code> and nothing ever resumes the coroutine. Changing it to <code class="language-plaintext highlighter-rouge">suspend_never</code> makes it print. In other words, <code class="language-plaintext highlighter-rouge">promise_type</code> uses <code class="language-plaintext highlighter-rouge">initial_suspend</code> to choose between lazy and immediate execution.</p>

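<p>Paste this code into C++ Insights and tick <code class="language-plaintext highlighter-rouge">Show coroutine transformation</code>.</p>

<p>For each coroutine, a corresponding coroutine frame is generated. It contains the promise, the callbacks used to resume and destroy the coroutine, and the coroutine's parameters, local variables and awaiter objects. There is also a <code class="language-plaintext highlighter-rouge">__suspend_index</code>, which records at which suspension point the coroutine is currently suspended so that the remaining logic runs correctly on resume.</p>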

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">__helloFrame</span>
<span class="p">{</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">resume_fn</span><span class="p">)(</span><span class="n">__helloFrame</span> <span class="o">*</span><span class="p">);</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">destroy_fn</span><span class="p">)(</span><span class="n">__helloFrame</span> <span class="o">*</span><span class="p">);</span>
  <span class="n">std</span><span class="o">::</span><span class="n">__coroutine_traits_impl</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">&gt;::</span><span class="n">promise_type</span> <span class="n">__promise</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">__suspend_index</span><span class="p">;</span>
  <span class="kt">bool</span> <span class="n">__initial_await_suspend_called</span><span class="p">;</span>
  <span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">basic_string</span><span class="o">&lt;</span><span class="kt">char</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">char_traits</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">allocator</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="o">&gt;</span> <span class="o">&amp;</span> <span class="n">to</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">times</span><span class="p">;</span>
  <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="n">__suspend_30_12</span><span class="p">;</span>      <span class="c1">// initial_suspend</span>
  <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="n">__suspend_30_12_1</span><span class="p">;</span>    <span class="c1">// final_suspend</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The coroutine function itself is expanded into the following flow:</p>

<ol>
  <li>Allocate the coroutine frame</li>
  <li>Save the coroutine parameters</li>
  <li>Construct the <code class="language-plaintext highlighter-rouge">promise_type</code> object</li>
  <li>Set the resume and destroy callbacks</li>
  <li>Call the three-part expanded function <code class="language-plaintext highlighter-rouge">__helloResume</code></li>
  <li>Return the <code class="language-plaintext highlighter-rouge">ReturnType</code> object</li>
</ol>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ReturnType</span> <span class="nf">hello</span><span class="p">(</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">basic_string</span><span class="o">&lt;</span><span class="kt">char</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">char_traits</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">allocator</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="o">&gt;</span> <span class="o">&amp;</span> <span class="n">to</span><span class="p">,</span> <span class="kt">int</span> <span class="n">times</span><span class="p">)</span>
<span class="p">{</span>
  <span class="cm">/* Allocate the frame including the promise */</span>
  <span class="cm">/* Note: The actual parameter new is __builtin_coro_size */</span>
  <span class="n">__helloFrame</span> <span class="o">*</span> <span class="n">__f</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o">&lt;</span><span class="n">__helloFrame</span> <span class="o">*&gt;</span><span class="p">(</span><span class="k">operator</span> <span class="k">new</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">__helloFrame</span><span class="p">)));</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_index</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__initial_await_suspend_called</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">to</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o">&lt;</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="n">basic_string</span><span class="o">&lt;</span><span class="kt">char</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">char_traits</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span><span class="p">,</span> <span class="n">std</span><span class="o">::</span><span class="n">allocator</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="o">&gt;</span> <span class="o">&amp;&gt;</span><span class="p">(</span><span class="n">to</span><span class="p">);</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">times</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">forward</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span><span class="p">(</span><span class="n">times</span><span class="p">);</span>

  <span class="cm">/* Construct the promise. */</span>
  <span class="k">new</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">)</span><span class="n">std</span><span class="o">::</span><span class="n">__coroutine_traits_impl</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="p">{};</span>

  <span class="cm">/* Forward declare the resume and destroy function. */</span>
  <span class="kt">void</span> <span class="nf">__helloResume</span><span class="p">(</span><span class="n">__helloFrame</span> <span class="o">*</span> <span class="n">__f</span><span class="p">);</span>
  <span class="kt">void</span> <span class="nf">__helloDestroy</span><span class="p">(</span><span class="n">__helloFrame</span> <span class="o">*</span> <span class="n">__f</span><span class="p">);</span>

  <span class="cm">/* Assign the resume and destroy function pointers. */</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">resume_fn</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">__helloResume</span><span class="p">;</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">destroy_fn</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">__helloDestroy</span><span class="p">;</span>

  <span class="cm">/* Call the made up function with the coroutine body for initial suspend.
     This function will be called subsequently by coroutine_handle&lt;&gt;::resume()
     which calls __builtin_coro_resume(__handle_) */</span>
  <span class="n">__helloResume</span><span class="p">(</span><span class="n">__f</span><span class="p">);</span>


  <span class="k">return</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">get_return_object</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

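<p>The three-part function is expanded into <code class="language-plaintext highlighter-rouge">__helloResume</code>. As you can see, the coroutine body essentially becomes a state machine: when the coroutine suspends (more precisely, after <code class="language-plaintext highlighter-rouge">await_suspend</code> has been called in this generated code), it stores the index of the suspension point, <code class="language-plaintext highlighter-rouge">__suspend_index</code>, in the coroutine frame and then returns to the caller. Every later call to <code class="language-plaintext highlighter-rouge">coroutine_handle::resume()</code> enters this function again, jumps to the matching resume point via <code class="language-plaintext highlighter-rouge">__suspend_index</code>, and continues from there.</p>

<p>The concrete logic is:</p>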

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">__helloResume</span><span class="p">(</span><span class="n">__helloFrame</span> <span class="o">*</span> <span class="n">__f</span><span class="p">)</span>
<span class="p">{</span>
  <span class="k">try</span>
  <span class="p">{</span>
    <span class="cm">/* Create a switch to get to the correct resume point */</span>
    <span class="k">switch</span><span class="p">(</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_index</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">case</span> <span class="mi">0</span><span class="p">:</span> <span class="k">break</span><span class="p">;</span>
      <span class="k">case</span> <span class="mi">1</span><span class="p">:</span> <span class="k">goto</span> <span class="n">__resume_hello_1</span><span class="p">;</span>
      <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="k">goto</span> <span class="n">__resume_hello_2</span><span class="p">;</span>
    <span class="p">}</span>

    <span class="cm">/* co_await insights.cpp:30 */</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12</span> <span class="o">=</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">initial_suspend</span><span class="p">();</span>
    <span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
      <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">::</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_address</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">void</span> <span class="o">*&gt;</span><span class="p">(</span><span class="n">__f</span><span class="p">)).</span><span class="k">operator</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span><span class="p">());</span>
      <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_index</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
      <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__initial_await_suspend_called</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
      <span class="k">return</span><span class="p">;</span>
    <span class="p">}</span>

<span class="n">__resume_hello_1</span><span class="o">:</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
    <span class="k">for</span><span class="p">(</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">i</span> <span class="o">&lt;</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">times</span><span class="p">;</span> <span class="o">++</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">std</span><span class="o">::</span><span class="k">operator</span><span class="o">&lt;&lt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="k">operator</span><span class="o">&lt;&lt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="k">operator</span><span class="o">&lt;&lt;</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="p">,</span> <span class="s">"Hello "</span><span class="p">),</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">to</span><span class="p">),</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="cm">/* co_return insights.cpp:34 */</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">return_void</span><span class="p">();</span>
    <span class="cm">/* co_return insights.cpp:30 */</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">return_void</span><span class="p">()</span><span class="cm">/* implicit */</span><span class="p">;</span>
    <span class="k">goto</span> <span class="n">__final_suspend</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">catch</span><span class="p">(...)</span> <span class="p">{</span>
    <span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">__initial_await_suspend_called</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">throw</span> <span class="p">;</span>
    <span class="p">}</span>

    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">unhandled_exception</span><span class="p">();</span>
  <span class="p">}</span>

<span class="n">__final_suspend</span><span class="o">:</span>

  <span class="cm">/* co_await insights.cpp:30 */</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12_1</span> <span class="o">=</span> <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__promise</span><span class="p">.</span><span class="n">final_suspend</span><span class="p">();</span>
  <span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12_1</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_30_12_1</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">ReturnType</span><span class="o">::</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_address</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="kt">void</span> <span class="o">*&gt;</span><span class="p">(</span><span class="n">__f</span><span class="p">)).</span><span class="k">operator</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="kt">void</span><span class="o">&gt;</span><span class="p">());</span>
    <span class="n">__f</span><span class="o">-&gt;</span><span class="n">__suspend_index</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="k">return</span><span class="p">;</span>
  <span class="p">}</span>

<span class="n">__resume_hello_2</span><span class="o">:</span>
  <span class="n">__f</span><span class="o">-&gt;</span><span class="n">destroy_fn</span><span class="p">(</span><span class="n">__f</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The rest of the code needs no separate explanation; it maps directly onto the flow described earlier.</p>
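<p>As a rough illustration of the dispatch pattern in the generated code above (a saved suspend index, a <code class="language-plaintext highlighter-rouge">switch</code> at the top, and a jump back to a resume label), here is a minimal hand-written sketch. <code class="language-plaintext highlighter-rouge">CounterFrame</code> and <code class="language-plaintext highlighter-rouge">resume_counter</code> are hypothetical names invented for this example; they are not compiler output and only mimic the idea.</p>

```cpp
// Hypothetical hand-written "frame": all state that must survive a
// suspension lives here instead of on the stack.
struct CounterFrame {
    int suspend_index = 0;  // which resume point to jump to next
    int i = 0;              // loop variable, kept in the frame
    int limit = 0;          // number of values to produce
    bool done = false;      // set once the body has run to completion
};

// Each call continues from where the previous call "suspended".
// Returns the next value, or -1 once the body has finished.
int resume_counter(CounterFrame& f) {
    switch (f.suspend_index) {
        case 0: break;            // first entry: start at the top
        case 1: goto resume_1;    // re-entry: jump back into the loop
        default: return -1;       // already finished
    }

    for (f.i = 0; f.i < f.limit; ++f.i) {
        f.suspend_index = 1;      // remember where to continue
        return f.i;               // "suspend": hand control back to the caller
resume_1:;
    }

    f.suspend_index = 2;
    f.done = true;
    return -1;
}
```

<p>With <code class="language-plaintext highlighter-rouge">limit = 3</code>, successive calls on the same frame return 0, 1, 2 and then -1. A real coroutine frame additionally stores the promise and the awaiter objects, as the generated code above shows.</p>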

<h3 id="resume-continuation">Resume continuation</h3>

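<p>With the state machine in mind, and having seen how <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code> customizes what happens before the coroutine body runs, we now turn to <code class="language-plaintext highlighter-rouge">co_await promise.final_suspend()</code>, which customizes what happens after the body has finished. One common use is to resume a continuation: when the current coroutine completes, it wakes up another coroutine so that it can continue, which transfers control flow between coroutines. The following example illustrates this:</p>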

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;coroutine&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;utility&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;exception&gt;</span><span class="cp">
</span>
<span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">SimpleTask</span> <span class="p">{</span>
    <span class="k">struct</span> <span class="nc">promise_type</span> <span class="p">{</span>
        <span class="n">SimpleTask</span> <span class="n">get_return_object</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">return</span> <span class="n">SimpleTask</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_promise</span><span class="p">(</span><span class="o">*</span><span class="k">this</span><span class="p">)};</span>
        <span class="p">}</span>

        <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="nf">initial_suspend</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="p">{};</span> <span class="p">}</span>

        <span class="k">struct</span> <span class="nc">FinalAwaiter</span> <span class="p">{</span>
            <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span> <span class="k">return</span> <span class="nb">false</span><span class="p">;</span> <span class="p">}</span>

            <span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
                <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; FinalAwaiter: coroutine "</span> <span class="o">&lt;&lt;</span> <span class="n">h</span><span class="p">.</span><span class="n">address</span><span class="p">()</span>
                          <span class="o">&lt;&lt;</span> <span class="s">" completed, ready to resume continuation</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
                <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
                <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
                    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; FinalAwaiter: resume continuation -&gt; "</span>
                              <span class="o">&lt;&lt;</span> <span class="n">continuation</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
                    <span class="n">continuation</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
                <span class="p">}</span>
            <span class="p">}</span>

            <span class="kt">void</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{}</span>
        <span class="p">};</span>

        <span class="n">FinalAwaiter</span> <span class="n">final_suspend</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span> <span class="k">return</span> <span class="p">{};</span> <span class="p">}</span>

        <span class="kt">void</span> <span class="nf">return_value</span><span class="p">(</span><span class="n">T</span> <span class="n">v</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; promise: return_value("</span> <span class="o">&lt;&lt;</span> <span class="n">v</span> <span class="o">&lt;&lt;</span> <span class="s">")</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
            <span class="n">value_</span> <span class="o">=</span> <span class="n">v</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="kt">void</span> <span class="nf">unhandled_exception</span><span class="p">()</span> <span class="p">{</span> <span class="n">exception_</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">current_exception</span><span class="p">();</span> <span class="p">}</span>

        <span class="n">T</span> <span class="n">value_</span><span class="p">;</span>
        <span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span> <span class="n">exception_</span><span class="p">{};</span>
        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation_</span><span class="p">{};</span>
    <span class="p">};</span>

    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle_</span><span class="p">;</span>

    <span class="k">explicit</span> <span class="nf">SimpleTask</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>

    <span class="o">~</span><span class="n">SimpleTask</span><span class="p">()</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; ~SimpleTask: destruct</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">handle_</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; ~SimpleTask: handle_.destroy() "</span>
                      <span class="o">&lt;&lt;</span> <span class="n">handle_</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
            <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">();</span>
        <span class="p">}</span>
    <span class="p">}</span>

    <span class="n">SimpleTask</span><span class="p">(</span><span class="k">const</span> <span class="n">SimpleTask</span> <span class="o">&amp;</span><span class="p">)</span> <span class="o">=</span> <span class="k">delete</span><span class="p">;</span>
    <span class="n">SimpleTask</span><span class="p">(</span><span class="n">SimpleTask</span><span class="o">&amp;&amp;</span> <span class="n">other</span><span class="p">)</span> <span class="k">noexcept</span>
        <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">other</span><span class="p">.</span><span class="n">handle_</span><span class="p">,</span> <span class="p">{}))</span> <span class="p">{}</span>

    <span class="k">struct</span> <span class="nc">Awaiter</span> <span class="p">{</span>
        <span class="k">explicit</span> <span class="n">Awaiter</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="o">:</span> <span class="n">handle_</span><span class="p">(</span><span class="n">h</span><span class="p">)</span> <span class="p">{}</span>

        <span class="o">~</span><span class="n">Awaiter</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">handle_</span><span class="p">)</span> <span class="p">{</span>
                <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"    &gt; Awaiter: ~Awaiter() handle_.destroy() "</span>
                          <span class="o">&lt;&lt;</span> <span class="n">handle_</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
                <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">();</span>
            <span class="p">}</span>
        <span class="p">}</span>

        <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">const</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="k">return</span> <span class="nb">false</span><span class="p">;</span>
        <span class="p">}</span>

        <span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"    &gt; Awaiter: await_suspend() - save continuation "</span>
                      <span class="o">&lt;&lt;</span> <span class="n">continuation</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
            <span class="c1">// Store the continuation in the SimpleTask's promise so that the final_suspend()</span>
            <span class="c1">// knows to resume this coroutine when the task completes.</span>
            <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
            <span class="c1">// Then we resume the SimpleTask's coroutine, which is currently suspended</span>
            <span class="c1">// at the initial-suspend-point (ie. at the open curly brace).</span>
            <span class="n">handle_</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
        <span class="p">}</span>

        <span class="n">T</span> <span class="nf">await_resume</span><span class="p">()</span> <span class="p">{</span>
            <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"    &gt; Awaiter: await_resume() - continuation resumed (coroutine "</span>
                      <span class="o">&lt;&lt;</span> <span class="n">handle_</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">" completed)</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
            <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value_</span><span class="p">);</span>
        <span class="p">}</span>

        <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">handle_</span><span class="p">{};</span>
    <span class="p">};</span>

    <span class="n">Awaiter</span> <span class="k">operator</span> <span class="k">co_await</span><span class="p">()</span> <span class="o">&amp;&amp;</span> <span class="p">{</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; SimpleTask: operator co_await()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="k">return</span> <span class="n">Awaiter</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">exchange</span><span class="p">(</span><span class="n">handle_</span><span class="p">,</span> <span class="p">{})};</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">callee</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; callee()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="mi">42</span><span class="p">;</span>
<span class="p">}</span>

<span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">callee</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller: result = "</span> <span class="o">&lt;&lt;</span> <span class="n">result</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">result</span> <span class="o">*</span> <span class="mi">2</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">task</span> <span class="o">=</span> <span class="n">caller</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"&gt; main: start caller coroutine "</span> <span class="o">&lt;&lt;</span> <span class="n">task</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="n">task</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"&gt; main: caller coroutine completed, final result = "</span>
              <span class="o">&lt;&lt;</span> <span class="n">task</span><span class="p">.</span><span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">value_</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Leaving the implementation details aside for a moment, here is the main line of the code: <code class="language-plaintext highlighter-rouge">SimpleTask</code> is a lazily started coroutine type. <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">caller()</code> and starts that coroutine manually. Inside <code class="language-plaintext highlighter-rouge">caller()</code>, another coroutine, <code class="language-plaintext highlighter-rouge">callee()</code>, is awaited with <code class="language-plaintext highlighter-rouge">co_await</code>; <code class="language-plaintext highlighter-rouge">caller()</code> then takes the value produced by <code class="language-plaintext highlighter-rouge">co_await callee()</code> and returns <code class="language-plaintext highlighter-rouge">result * 2</code>.</p>

<p>Next, let's look at how <code class="language-plaintext highlighter-rouge">SimpleTask</code> implements resuming a continuation. <code class="language-plaintext highlighter-rouge">SimpleTask</code> provides the following pieces:</p>

<ul>
  <li>It overloads <code class="language-plaintext highlighter-rouge">operator co_await</code>. Whenever a <code class="language-plaintext highlighter-rouge">SimpleTask</code> is awaited with <code class="language-plaintext highlighter-rouge">co_await</code>, the operator returns the nested <code class="language-plaintext highlighter-rouge">Awaiter</code> class, which customizes what happens when the awaiting coroutine is suspended. Note that an <code class="language-plaintext highlighter-rouge">Awaiter</code> is constructed and its interface called only for <code class="language-plaintext highlighter-rouge">co_await callee()</code> inside <code class="language-plaintext highlighter-rouge">caller()</code>; the <code class="language-plaintext highlighter-rouge">caller</code> coroutine itself is started manually by <code class="language-plaintext highlighter-rouge">main()</code>.</li>
  <li>As a <code class="language-plaintext highlighter-rouge">ReturnType</code>, <code class="language-plaintext highlighter-rouge">SimpleTask</code> nests a <code class="language-plaintext highlighter-rouge">promise_type</code>. Its <code class="language-plaintext highlighter-rouge">initial_suspend</code> returns <code class="language-plaintext highlighter-rouge">suspend_always</code>, while <code class="language-plaintext highlighter-rouge">final_suspend</code> is different: it returns a new awaitable, <code class="language-plaintext highlighter-rouge">FinalAwaiter</code>. It is <code class="language-plaintext highlighter-rouge">FinalAwaiter</code> that customizes what happens after the coroutine body has finished.</li>
</ul>

<p>To make it clearer how <code class="language-plaintext highlighter-rouge">SimpleTask</code> resumes a continuation, here is how <code class="language-plaintext highlighter-rouge">caller</code> expands according to the flow described earlier. Some steps are slightly adjusted and simplified for readability.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">caller_frame</span> <span class="o">=</span> <span class="k">new</span> <span class="n">coroutine_frame</span><span class="p">(...);</span>
    <span class="k">auto</span><span class="o">&amp;</span> <span class="n">caller_promise</span> <span class="o">=</span> <span class="n">caller_frame</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">;</span>
    <span class="k">auto</span> <span class="n">caller_return_object</span> <span class="o">=</span> <span class="n">caller_promise</span><span class="p">.</span><span class="n">get_return_object</span><span class="p">();</span>
    <span class="k">using</span> <span class="n">handle_t</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="k">co_await</span> <span class="n">caller_promise</span><span class="p">.</span><span class="n">initial_suspend</span><span class="p">();</span>

    <span class="k">try</span> <span class="p">{</span>
        <span class="c1">// co_await callee() is expanded below:</span>
        <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">callee</span><span class="p">();</span>
        <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaiter</span> <span class="o">=</span> <span class="n">awaitable</span><span class="p">.</span><span class="k">operator</span> <span class="k">co_await</span><span class="p">();</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
            <span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">handle_t</span><span class="o">::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">caller_promise</span><span class="p">));</span>
            <span class="k">return</span> <span class="n">caller_return_object</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="c1">// reached once caller() has been resumed by callee's FinalAwaiter</span>
        <span class="k">auto</span> <span class="n">result</span> <span class="o">=</span> <span class="n">awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>

        <span class="k">co_return</span> <span class="n">result</span> <span class="o">*</span> <span class="mi">2</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
        <span class="c1">// ...</span>
    <span class="p">}</span>

    <span class="k">co_await</span> <span class="n">caller_promise</span><span class="p">.</span><span class="n">final_suspend</span><span class="p">();</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now let's walk through the actual execution flow:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1. main calls caller()
   ├─ the caller() coroutine is created
   ├─ initial_suspend in promise_type is suspend_always, so the coroutine suspends right away
   ├─ a SimpleTask is returned to main
   └─ main resumes the caller() coroutine manually through the coroutine_handle in SimpleTask

2. caller() executes co_await callee(); callee starts out suspended and caller suspends here
   ├─ the callee() coroutine is created
   ├─ operator co_await is called on callee()'s return object (a SimpleTask);
   │  it returns an Awaiter and hands callee()'s coroutine_handle over to it,
   │  i.e. return Awaiter{std::exchange(handle_, {})};
   ├─ Awaiter::await_ready returns false, so caller must suspend
   └─ Awaiter::await_suspend(caller's coroutine handle)
      ├─ saves caller's coroutine_handle in callee_promise
      └─ then immediately resumes callee
</code></pre></div></div>

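<p>Note that control is currently executing the body of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine, and from <code class="language-plaintext highlighter-rouge">caller</code>'s point of view <code class="language-plaintext highlighter-rouge">callee</code> is just an ordinary function call. The <code class="language-plaintext highlighter-rouge">co_await</code> expression therefore belongs to <code class="language-plaintext highlighter-rouge">caller</code>, so the argument passed to <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> is the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of <code class="language-plaintext highlighter-rouge">caller()</code>.</p>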

<p>This <code class="language-plaintext highlighter-rouge">await_suspend</code> is the key to understanding how the continuation is resumed:</p>

<p>It saves the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">caller()</code> coroutine in the promise of the awaited coroutine, that is, the promise of <code class="language-plaintext highlighter-rouge">callee()</code>, referred to here as <code class="language-plaintext highlighter-rouge">callee_promise</code>. Through that promise, the <code class="language-plaintext highlighter-rouge">caller()</code> coroutine can be resumed in a later step.</p>

<p>It then immediately resumes the <code class="language-plaintext highlighter-rouge">callee()</code> coroutine (the <code class="language-plaintext highlighter-rouge">Awaiter</code> constructor stores the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> it receives, which is the handle of <code class="language-plaintext highlighter-rouge">callee()</code>, as a member). In real code this resumption is usually triggered by a more elaborate mechanism; here it happens immediately only to keep the implementation simple.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="c1">// Store the `continuation` in promise so that the final_suspend()</span>
    <span class="c1">// knows to resume `continuation` coroutine when current task completes.</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"    &gt; Awaiter: await_suspend() - save continuation "</span>
              <span class="o">&lt;&lt;</span> <span class="n">continuation</span><span class="p">.</span><span class="n">address</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
    <span class="c1">// Then we resume current task coroutine, which is currently suspended</span>
    <span class="c1">// at the initial-suspend-point (ie. at the open curly brace).</span>
    <span class="n">handle_</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Next, the coroutine referred to by the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> held in the <code class="language-plaintext highlighter-rouge">Awaiter</code>, namely <code class="language-plaintext highlighter-rouge">callee()</code>, resumes execution.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3. callee() is resumed, runs on, and wakes up caller when it completes
   ├─ co_return 42
   │  └─ calls callee_promise.return_value(42)
   └─ co_await callee_promise.final_suspend(), i.e. co_await FinalAwaiter{}
      ├─ FinalAwaiter::await_ready returns false, so callee is suspended
      └─ FinalAwaiter::await_suspend(callee's coroutine handle)
         └─ resumes caller
</code></pre></div></div>

<p>When the body of <code class="language-plaintext highlighter-rouge">callee</code> finishes, it ends with <code class="language-plaintext highlighter-rouge">co_await callee_promise.final_suspend()</code>, at which point control can be handed back to <code class="language-plaintext highlighter-rouge">caller</code>. Recall that in step 2, <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend(caller's coroutine handle)</code> already saved the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">caller()</code> coroutine in <code class="language-plaintext highlighter-rouge">callee_promise</code>. Now that <code class="language-plaintext highlighter-rouge">callee</code> has finished, <code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code> can read the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of <code class="language-plaintext highlighter-rouge">caller</code> back out and resume it:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="c1">// current coroutine is now suspended at the final-suspend point</span>

    <span class="c1">// In this case, `h` is current coroutine, aka callee's coroutine_handle</span>
    <span class="c1">// `continuation` is caller's coroutine_handle</span>
    <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">continuation</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Control flow returns to the <code class="language-plaintext highlighter-rouge">caller()</code> coroutine, which continues executing:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>4. caller continues
   ├─ callee has completed; its result is read through Awaiter::await_resume
   ├─ result = 42
   ├─ calls caller_promise.return_value(84)
   └─ co_await caller_promise.final_suspend(), i.e. co_await FinalAwaiter{}
      ├─ FinalAwaiter::await_ready returns false, so caller is suspended
      └─ FinalAwaiter::await_suspend(caller's coroutine handle)
</code></pre></div></div>

<p>That is, the <code class="language-plaintext highlighter-rouge">Awaiter::await_resume</code> belonging to <code class="language-plaintext highlighter-rouge">co_await callee()</code> is called, which yields 42 as the value of <code class="language-plaintext highlighter-rouge">co_await callee()</code>. Note that the <code class="language-plaintext highlighter-rouge">Awaiter</code> for <code class="language-plaintext highlighter-rouge">co_await callee()</code> goes out of scope once <code class="language-plaintext highlighter-rouge">await_resume</code> has returned, and its destructor destroys the coroutine frame of <code class="language-plaintext highlighter-rouge">callee()</code>. The <code class="language-plaintext highlighter-rouge">SimpleTask</code> returned by <code class="language-plaintext highlighter-rouge">callee()</code> is likewise destroyed once <code class="language-plaintext highlighter-rouge">co_await callee()</code> has finished; the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> inside that <code class="language-plaintext highlighter-rouge">SimpleTask</code> is already empty by then, because it was handed over to the <code class="language-plaintext highlighter-rouge">Awaiter</code> earlier.</p>

<p>When the body of <code class="language-plaintext highlighter-rouge">caller</code> finishes, it ends with <code class="language-plaintext highlighter-rouge">co_await caller_promise.final_suspend()</code>. The <code class="language-plaintext highlighter-rouge">continuation_</code> in <code class="language-plaintext highlighter-rouge">caller_promise</code> is empty, meaning that no coroutine is waiting for <code class="language-plaintext highlighter-rouge">caller</code> to finish, so the corresponding <code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code> resumes nothing.</p>

<p>At this point the body of <code class="language-plaintext highlighter-rouge">caller</code> has finished and control returns to <code class="language-plaintext highlighter-rouge">main</code>, which obtains the promise through the <code class="language-plaintext highlighter-rouge">SimpleTask</code> and can therefore read the return value of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine. Finally, the <code class="language-plaintext highlighter-rouge">SimpleTask</code> in <code class="language-plaintext highlighter-rouge">main</code> is destroyed, which destroys the coroutine frame of <code class="language-plaintext highlighter-rouge">caller</code>.</p>

<p>The output of the whole example may look like the following (the addresses vary from run to run); comparing it with the steps above helps to reinforce the flow:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">0x57edddf392b0</code> is the address held by the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">caller()</code> coroutine</li>
  <li><code class="language-plaintext highlighter-rouge">0x57edddf39720</code> is the address held by the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of the <code class="language-plaintext highlighter-rouge">callee()</code> coroutine</li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> <span class="n">main</span><span class="o">:</span> <span class="n">start</span> <span class="n">caller</span> <span class="n">coroutine</span> <span class="mh">0x57edddf392b0</span>
  <span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span>
  <span class="o">&gt;</span> <span class="n">SimpleTask</span><span class="o">:</span> <span class="k">operator</span> <span class="k">co_await</span><span class="p">()</span>
    <span class="o">&gt;</span> <span class="n">Awaiter</span><span class="o">:</span> <span class="n">await_suspend</span><span class="p">()</span> <span class="o">-</span> <span class="n">save</span> <span class="n">continuation</span> <span class="mh">0x57edddf392b0</span>
      <span class="o">&gt;</span> <span class="n">callee</span><span class="p">()</span>
      <span class="o">&gt;</span> <span class="n">promise</span><span class="o">:</span> <span class="n">return_value</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
      <span class="o">&gt;</span> <span class="n">FinalAwaiter</span><span class="o">:</span> <span class="n">coroutine</span> <span class="mh">0x57edddf39720</span> <span class="n">completed</span><span class="p">,</span> <span class="n">ready</span> <span class="n">to</span> <span class="n">resume</span> <span class="n">continuation</span>
      <span class="o">&gt;</span> <span class="n">FinalAwaiter</span><span class="o">:</span> <span class="n">continuation</span><span class="p">.</span><span class="n">resume</span><span class="p">()</span> <span class="o">-&gt;</span> <span class="mh">0x57edddf392b0</span>
    <span class="o">&gt;</span> <span class="n">Awaiter</span><span class="o">:</span> <span class="n">await_resume</span><span class="p">()</span> <span class="o">-</span> <span class="n">continuation</span> <span class="n">resumed</span> <span class="p">(</span><span class="n">coroutine</span> <span class="mh">0x57edddf39720</span> <span class="n">completed</span><span class="p">)</span>
    <span class="o">&gt;</span> <span class="n">Awaiter</span><span class="o">:</span> <span class="o">~</span><span class="n">Awaiter</span><span class="p">()</span> <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">()</span> <span class="mh">0x57edddf39720</span>
      <span class="o">&gt;</span> <span class="o">~</span><span class="n">SimpleTask</span><span class="o">:</span> <span class="n">destruct</span>
  <span class="o">&gt;</span> <span class="n">caller</span><span class="o">:</span> <span class="n">result</span> <span class="o">=</span> <span class="mi">42</span>
      <span class="o">&gt;</span> <span class="n">promise</span><span class="o">:</span> <span class="n">return_value</span><span class="p">(</span><span class="mi">84</span><span class="p">)</span>
      <span class="o">&gt;</span> <span class="n">FinalAwaiter</span><span class="o">:</span> <span class="n">coroutine</span> <span class="mh">0x57edddf392b0</span> <span class="n">completed</span><span class="p">,</span> <span class="n">ready</span> <span class="n">to</span> <span class="n">resume</span> <span class="n">continuation</span>
<span class="o">&gt;</span> <span class="n">main</span><span class="o">:</span> <span class="n">caller</span> <span class="n">coroutine</span> <span class="n">completed</span><span class="p">,</span> <span class="k">final</span> <span class="n">result</span> <span class="o">=</span> <span class="mi">84</span>
      <span class="o">&gt;</span> <span class="o">~</span><span class="n">SimpleTask</span><span class="o">:</span> <span class="n">destruct</span>
      <span class="o">&gt;</span> <span class="o">~</span><span class="n">SimpleTask</span><span class="o">:</span> <span class="n">handle_</span><span class="p">.</span><span class="n">destroy</span><span class="p">()</span> <span class="mh">0x57edddf392b0</span>
</code></pre></div></div>

<p>Overall, resume continuation consists of two parts:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Awaiter</code> links the waiter <code class="language-plaintext highlighter-rouge">caller</code> with the awaited <code class="language-plaintext highlighter-rouge">callee</code>: it saves the waiter's <code class="language-plaintext highlighter-rouge">coroutine_handle</code> in the awaited coroutine's promise, registering the waiter as the continuation of the awaited coroutine.</li>
  <li>The <code class="language-plaintext highlighter-rouge">FinalAwaiter</code> in <code class="language-plaintext highlighter-rouge">promise_type</code> customizes what happens when a coroutine finishes, so that when the awaited coroutine <code class="language-plaintext highlighter-rouge">callee</code> completes it automatically resumes the waiter <code class="language-plaintext highlighter-rouge">caller</code>.</li>
</ul>

<p>Resume continuation therefore lets coroutines form a call chain automatically, so code keeps a synchronous style while performing asynchronous operations. No callbacks need to be passed along the chain: whenever the awaited coroutine completes, the waiter is resumed automatically. The drawback is that the stack depth grows quickly. Stringing the steps of this example together gives a call stack roughly like the one below. Resume continuation happens at the <code class="language-plaintext highlighter-rouge">continuation.resume()</code> step, where the stack of <code class="language-plaintext highlighter-rouge">caller</code> grows on top of the stack of <code class="language-plaintext highlighter-rouge">callee</code>, even though in terms of the call relationship it was <code class="language-plaintext highlighter-rouge">caller</code> that called <code class="language-plaintext highlighter-rouge">callee</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main()
└─ caller body // entered by manually calling caller.handle_.resume()
   └─ Awaiter::await_suspend(caller's handle) // the awaiter of co_await callee()
      └─ handle_.resume() // immediately resumes callee
         └─ callee body
            └─ FinalAwaiter::await_suspend(callee's handle) // co_await callee_promise.final_suspend()
               └─ continuation.resume() // resumes caller: the "resume continuation" step
                  └─ caller body
                     └─ Awaiter::await_resume() // the awaiter of co_await callee()
</code></pre></div></div>

<blockquote>
  <p>For more on the resulting stack overflow, see the <a href="https://godbolt.org/z/gy5Q8q">example</a> given in this <a href="https://lewissbaker.github.io/2020/05/11/understanding_symmetric_transfer">blog post</a>.</p>

</blockquote>

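<p>Is there a way to keep the call chain between coroutines without risking a stack overflow? The answer is symmetric transfer, mentioned in the previous post; the details will have to wait until the next post.</p>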

<h3 id="await_transform">await_transform</h3>

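<p>As the last part, let's summarize how <code class="language-plaintext highlighter-rouge">promise_type</code> uses <code class="language-plaintext highlighter-rouge">await_transform</code> to customize what <code class="language-plaintext highlighter-rouge">co_await</code> does inside the coroutine body. Note that this applies to the current coroutine, i.e. the one that owns the promise, and only to <code class="language-plaintext highlighter-rouge">co_await</code> expressions written in its body: <code class="language-plaintext highlighter-rouge">co_yield</code> goes through <code class="language-plaintext highlighter-rouge">yield_value</code> instead.</p>

<p>First, <code class="language-plaintext highlighter-rouge">await_transform</code> is invoked in the step of <code class="language-plaintext highlighter-rouge">co_await</code> that obtains the <code class="language-plaintext highlighter-rouge">Awaitable</code>:</p>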

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">promise_type</span><span class="p">,</span> <span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">decltype</span><span class="p">(</span><span class="k">auto</span><span class="p">)</span> <span class="n">get_awaitable</span><span class="p">(</span><span class="n">promise_type</span><span class="o">&amp;</span> <span class="n">promise</span><span class="p">,</span> <span class="n">T</span><span class="o">&amp;&amp;</span> <span class="n">expr</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="k">constexpr</span> <span class="p">(</span><span class="n">has_any_await_transform_member_v</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">promise</span><span class="p">.</span><span class="n">await_transform</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">));</span>
    <span class="k">else</span>
        <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">expr</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In essence, <code class="language-plaintext highlighter-rouge">await_transform</code> customizes how certain types behave when they are <code class="language-plaintext highlighter-rouge">co_await</code>ed. For example:</p>

<ul>
  <li>
    <p>Making a type that is not <code class="language-plaintext highlighter-rouge">Awaitable</code> become <code class="language-plaintext highlighter-rouge">Awaitable</code></p>

    <p>For example, the <code class="language-plaintext highlighter-rouge">promise_type</code> of a coroutine whose return type is <code class="language-plaintext highlighter-rouge">std::optional&lt;T&gt;</code> can provide an <code class="language-plaintext highlighter-rouge">await_transform()</code> overload that takes a <code class="language-plaintext highlighter-rouge">std::optional&lt;U&gt;</code> and returns an <code class="language-plaintext highlighter-rouge">Awaitable</code> that either yields a value of type <code class="language-plaintext highlighter-rouge">U</code> or suspends the coroutine when the awaited value is <code class="language-plaintext highlighter-rouge">std::nullopt</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
  <span class="k">class</span> <span class="nc">optional_promise</span> <span class="p">{</span>
  <span class="nl">public:</span>
      <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">U</span><span class="p">&gt;</span>
      <span class="k">auto</span> <span class="n">await_transform</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">optional</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&amp;</span> <span class="n">value</span><span class="p">)</span> <span class="p">{</span>
          <span class="k">class</span> <span class="nc">awaiter</span> <span class="p">{</span>
              <span class="n">std</span><span class="o">::</span><span class="n">optional</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&amp;</span> <span class="n">value</span><span class="p">;</span>
          <span class="nl">public:</span>
              <span class="k">explicit</span> <span class="n">awaiter</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">optional</span><span class="o">&lt;</span><span class="n">U</span><span class="o">&gt;&amp;</span> <span class="n">x</span><span class="p">)</span> <span class="k">noexcept</span> <span class="o">:</span> <span class="n">value</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{}</span>
              <span class="kt">bool</span> <span class="nf">await_ready</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
                  <span class="k">return</span> <span class="n">value</span><span class="p">.</span><span class="n">has_value</span><span class="p">();</span>
              <span class="p">}</span>
              <span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{}</span>
              <span class="n">U</span><span class="o">&amp;</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span>
                  <span class="k">return</span> <span class="o">*</span><span class="n">value</span><span class="p">;</span>
              <span class="p">}</span>
          <span class="p">};</span>
          <span class="k">return</span> <span class="n">awaiter</span><span class="p">{</span><span class="n">value</span><span class="p">};</span>
      <span class="p">}</span>
  <span class="p">};</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>Forbidding <code class="language-plaintext highlighter-rouge">co_await</code> on certain types by declaring <code class="language-plaintext highlighter-rouge">await_transform</code> overloads as <code class="language-plaintext highlighter-rouge">deleted</code></p>

    <p>For example, the <code class="language-plaintext highlighter-rouge">promise_type</code> of a coroutine whose return type is <code class="language-plaintext highlighter-rouge">std::generator&lt;T&gt;</code> may declare <code class="language-plaintext highlighter-rouge">await_transform()</code> as <code class="language-plaintext highlighter-rouge">deleted</code>. This forbids <code class="language-plaintext highlighter-rouge">co_await</code> inside the coroutine and leaves only <code class="language-plaintext highlighter-rouge">co_yield</code> (which calls <code class="language-plaintext highlighter-rouge">yield_value</code>), a reasonable restriction for a generator-style coroutine.</p>
  </li>
  <li>
    <p>folly uses <code class="language-plaintext highlighter-rouge">await_transform</code> to provide a magic value: <code class="language-plaintext highlighter-rouge">co_await co_current_executor</code> returns the executor on which the current coroutine is running.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="c1">// Special placeholder object that can be 'co_await'ed from within a Task&lt;T&gt;</span>
  <span class="c1">// or an AsyncGenerator&lt;T&gt; to obtain the current folly::Executor associated</span>
  <span class="c1">// with the current coroutine.</span>
  <span class="c1">//</span>
  <span class="c1">// Note that for a folly::Task the executor will remain the same throughout</span>
  <span class="c1">// the lifetime of the coroutine. For a folly::AsyncGenerator&lt;T&gt; the current</span>
  <span class="c1">// executor may change when resuming from a co_yield suspend-point.</span>
  <span class="c1">//</span>
  <span class="c1">// Example:</span>
  <span class="c1">//   folly::coro::Task&lt;void&gt; example() {</span>
  <span class="c1">//     Executor* e = co_await folly::coro::co_current_executor;</span>
  <span class="c1">//     e-&gt;add([] { do_something(); });</span>
  <span class="c1">//   }</span>

  <span class="k">class</span> <span class="nc">TaskPromiseBase</span> <span class="p">{</span>
    <span class="c1">// ...</span>
    <span class="k">auto</span> <span class="n">await_transform</span><span class="p">(</span><span class="n">co_current_executor_t</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
      <span class="k">return</span> <span class="n">ready_awaitable</span><span class="o">&lt;</span><span class="n">folly</span><span class="o">::</span><span class="n">Executor</span><span class="o">*&gt;</span><span class="p">{</span><span class="n">executor_</span><span class="p">.</span><span class="n">get</span><span class="p">()};</span>
    <span class="p">}</span>
    <span class="c1">// ...</span>
  <span class="p">};</span>

  <span class="k">template</span> <span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span> <span class="o">=</span> <span class="kt">void</span><span class="p">&gt;</span>
  <span class="k">class</span> <span class="nc">ready_awaitable</span> <span class="p">{</span>
    <span class="k">static_assert</span><span class="p">(</span><span class="o">!</span><span class="n">std</span><span class="o">::</span><span class="n">is_void</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;::</span><span class="n">value</span><span class="p">,</span> <span class="s">"base template unsuitable for void"</span><span class="p">);</span>

   <span class="nl">public:</span>
    <span class="k">explicit</span> <span class="n">ready_awaitable</span><span class="p">(</span><span class="n">T</span> <span class="n">value</span><span class="p">)</span> <span class="c1">//</span>
        <span class="k">noexcept</span><span class="p">(</span><span class="k">noexcept</span><span class="p">(</span><span class="n">T</span><span class="p">(</span><span class="n">FOLLY_DECLVAL</span><span class="p">(</span><span class="n">T</span><span class="o">&amp;&amp;</span><span class="p">))))</span>
        <span class="o">:</span> <span class="n">value_</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">value</span><span class="p">))</span> <span class="p">{}</span>

    <span class="kt">bool</span> <span class="n">await_ready</span><span class="p">()</span> <span class="k">noexcept</span> <span class="p">{</span> <span class="k">return</span> <span class="nb">true</span><span class="p">;</span> <span class="p">}</span>
    <span class="kt">void</span> <span class="nf">await_suspend</span><span class="p">(</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{}</span>
    <span class="n">T</span> <span class="n">await_resume</span><span class="p">()</span> <span class="k">noexcept</span><span class="p">(</span><span class="k">noexcept</span><span class="p">(</span><span class="n">T</span><span class="p">(</span><span class="n">FOLLY_DECLVAL</span><span class="p">(</span><span class="n">T</span><span class="o">&amp;&amp;</span><span class="p">))))</span> <span class="p">{</span>
      <span class="k">return</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&amp;&amp;&gt;</span><span class="p">(</span><span class="n">value_</span><span class="p">);</span>
    <span class="p">}</span>

   <span class="k">private</span><span class="o">:</span>
    <span class="n">T</span> <span class="n">value_</span><span class="p">;</span>
  <span class="p">};</span>
</code></pre></div>    </div>
  </li>
</ul>

<h2 id="at-last">At last</h2>

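<p>In this post we saw how the compiler expands a coroutine body into a finite state machine, and how coroutine behavior can be customized from the perspective of <code class="language-plaintext highlighter-rouge">promise_type</code>, including <code class="language-plaintext highlighter-rouge">initial_suspend</code>, <code class="language-plaintext highlighter-rouge">final_suspend</code>, and <code class="language-plaintext highlighter-rouge">await_transform</code>. With code examples we walked through a simple implementation of resume continuation, which is in fact asymmetric transfer between coroutines. If the term is still unclear, don't worry: the difference between the two will become apparent in the next post, which introduces symmetric transfer.</p>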

<h2 id="reference">Reference</h2>

<ul>
  <li><a href="https://lewissbaker.github.io/2018/09/05/understanding-the-promise-type">C++ Coroutines: Understanding the promise type | Asymmetric Transfer</a></li>
  <li><a href="https://lewissbaker.github.io/2020/05/11/understanding_symmetric_transfer">C++ Coroutines: Understanding Symmetric Transfer | Asymmetric Transfer</a></li>
</ul>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[This post continues from the perspective of promise_type and gives a more complete picture of how the compiler transforms a coroutine into a fixed three-part code structure, and how promise_type customizes the coroutine's behavior. With the concepts from the previous post in place, we can finally study the details through a demo.]]></summary></entry><entry><title type="html">Deciphering C++ Coroutines, part 3</title><link href="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-3/" rel="alternate" type="text/html" title="Deciphering C++ Coroutines, part 3" /><published>2025-11-06T00:00:00+08:00</published><updated>2025-11-06T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-3</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/Deciphering-Coroutines-part-3/"><![CDATA[<p>Asymmetric transfer vs Symmetric transfer.</p>

<h2 id="asymmetric-transfer-reviewed">Asymmetric transfer reviewed</h2>

<p>At the end of the previous post we said that resume continuation is asymmetric transfer. Why is it called asymmetric?</p>

<p>Let's first review how control flow moved between <code class="language-plaintext highlighter-rouge">caller</code> and <code class="language-plaintext highlighter-rouge">callee</code> in the previous post's example.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">callee</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller: result = "</span> <span class="o">&lt;&lt;</span> <span class="n">result</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">result</span> <span class="o">*</span> <span class="mi">2</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
  <li>caller → callee</li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
    <span class="n">handle_</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In the <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> of <code class="language-plaintext highlighter-rouge">co_await callee()</code>, the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> of <code class="language-plaintext highlighter-rouge">caller</code> is saved in the promise of <code class="language-plaintext highlighter-rouge">callee</code>, and then <code class="language-plaintext highlighter-rouge">.resume()</code> is called to resume <code class="language-plaintext highlighter-rouge">callee</code>. Note that the stack frames of <code class="language-plaintext highlighter-rouge">callee</code> are pushed on top of the current call stack and are only popped when that <code class="language-plaintext highlighter-rouge">.resume()</code> call returns, which here happens only after <code class="language-plaintext highlighter-rouge">callee</code> has finished.</p>

<ul>
  <li>callee → caller</li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">continuation</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Once the callee's coroutine body has finished, at <code class="language-plaintext highlighter-rouge">co_await callee_promise.final_suspend()</code>, the corresponding <code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code> fetches the caller's <code class="language-plaintext highlighter-rouge">coroutine_handle</code> from the promise and calls <code class="language-plaintext highlighter-rouge">.resume()</code> to resume the <code class="language-plaintext highlighter-rouge">caller</code>. Note that the <code class="language-plaintext highlighter-rouge">caller</code>'s stack frames are pushed on top of the current call stack, and they are released only after the <code class="language-plaintext highlighter-rouge">caller</code> has finished running.</p>

<blockquote>
  <p>This was mentioned in the previous post and is worth repeating: no matter who calls <code class="language-plaintext highlighter-rouge">.resume</code>, the call behaves like an ordinary function call, and the resumed coroutine's stack frames grow on top of the calling side's current stack.</p>

</blockquote>

<p>At first glance, both cases resume another coroutine inside the <code class="language-plaintext highlighter-rouge">await_suspend</code> of some <code class="language-plaintext highlighter-rouge">Awaiter</code>. They differ in two ways, though:</p>

<ol>
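  <li>In the former, the parent coroutine resumes the child; in the latter, the child resumes the parent.</li>
  <li>The former happens at a <code class="language-plaintext highlighter-rouge">co_await</code> expression; the latter happens when the coroutine body has finished and control reaches <code class="language-plaintext highlighter-rouge">final_suspend</code>.</li>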
</ol>

<p>These differences in how control is handed over are exactly why this is called Asymmetric transfer. On top of that, the call stack shows control going from <code class="language-plaintext highlighter-rouge">caller</code> to <code class="language-plaintext highlighter-rouge">callee</code> and then back to <code class="language-plaintext highlighter-rouge">caller</code>, so the stack keeps getting deeper. In other words, every call to <code class="language-plaintext highlighter-rouge">.resume()</code> creates a new stack frame for the coroutine being resumed, which can lead to a <a href="https://godbolt.org/z/gy5Q8q">stack overflow</a> in real-world use.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>main()
└─ caller coroutine body
   └─ Awaiter::await_suspend(caller's handle)
      └─ handle_.resume() // caller -&gt; callee
         └─ callee coroutine body
            └─ FinalAwaiter::await_suspend(callee's handle)
               └─ continuation.resume() // callee -&gt; caller
                  └─ caller coroutine body
                     └─ Awaiter::await_resume()
</code></pre></div></div>

<p>Running the <a href="https://godbolt.org/z/c3ocoPeM6">example</a> from the previous post, gdb shows the call stack below: the nested execution <code class="language-plaintext highlighter-rouge">caller → callee → caller</code>, with a new stack frame created for every <code class="language-plaintext highlighter-rouge">.resume()</code>.</p>

<p><img src="/archive/coroutine-8.png" alt="figure" /></p>

<h2 id="symmetric-transfer">Symmetric transfer</h2>

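<p>Let's see how Symmetric transfer solves the problem of the stack getting deeper on every <code class="language-plaintext highlighter-rouge">.resume()</code>. Symmetric transfer was first put forward in this standards <a href="https://wg21.link/P0913R0">proposal</a> and was eventually adopted into the standard. The core idea: when a coroutine needs to switch to another coroutine, it no longer resumes it with an explicit call to <code class="language-plaintext highlighter-rouge">resume()</code>. Instead it returns a <code class="language-plaintext highlighter-rouge">coroutine_handle</code> that says which coroutine should receive control, and execution then jumps to that coroutine.</p>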

<p>The main change is in <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code>. As mentioned earlier, it can have two kinds of return type:</p>

<ul>
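  <li>An <code class="language-plaintext highlighter-rouge">await_suspend()</code> that returns <code class="language-plaintext highlighter-rouge">void</code> unconditionally transfers control back to the coroutine's caller/resumer.</li>
  <li>An <code class="language-plaintext highlighter-rouge">await_suspend()</code> that returns <code class="language-plaintext highlighter-rouge">bool</code> resumes the coroutine immediately when it returns <code class="language-plaintext highlighter-rouge">false</code>, whereas returning <code class="language-plaintext highlighter-rouge">true</code> transfers control back to the coroutine's caller/resumer.</li>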
</ul>

<p>There is actually a third case, which is what the proposal introduces:</p>

<ul>
  <li>Returning a <code class="language-plaintext highlighter-rouge">std::coroutine_handle&lt;T&gt;</code>, which indicates that control should be transferred symmetrically to the coroutine identified by the returned <code class="language-plaintext highlighter-rouge">coroutine_handle</code>. In addition, <code class="language-plaintext highlighter-rouge">std::noop_coroutine()</code> returns a special handle of type <code class="language-plaintext highlighter-rouge">std::noop_coroutine_handle</code>, which means there is no coroutine to resume, so control returns directly to the current coroutine's caller or resumer.</li>
</ul>

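<p>In other words, with Symmetric transfer we simply suspend one coroutine and resume another. An important difference from Asymmetric transfer is that there is no implicit caller/callee relationship between the two coroutines. When a coroutine suspends, it can hand execution over to any suspended coroutine (including itself), and that coroutine does not have to hand execution back to the previous one when it next suspends or completes.</p>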

<h3 id="co_await-reviewed">co_await reviewed</h3>

<p>According to the <a href="https://eel.is/c++draft/expr.await#5.1.1">standard</a>, Symmetric transfer does not require many changes: when a coroutine needs to suspend, the corresponding <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> returns a <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, indicating that control should be transferred symmetrically to the coroutine identified by that handle. The next question is: how does control actually get handed to that coroutine?</p>

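<p>To answer that, we need to understand how the compiler expands <code class="language-plaintext highlighter-rouge">co_await</code> in the Symmetric transfer case.</p>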

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">value</span> <span class="o">=</span> <span class="o">&lt;</span><span class="n">expr</span><span class="o">&gt;</span><span class="p">;</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaitable</span> <span class="o">=</span> <span class="n">get_awaitable</span><span class="p">(</span><span class="n">promise</span><span class="p">,</span> <span class="k">static_cast</span><span class="o">&lt;</span><span class="k">decltype</span><span class="p">(</span><span class="n">value</span><span class="p">)</span><span class="o">&gt;</span><span class="p">(</span><span class="n">value</span><span class="p">));</span>
  <span class="k">auto</span><span class="o">&amp;&amp;</span> <span class="n">awaiter</span> <span class="o">=</span> <span class="n">get_awaiter</span><span class="p">(</span><span class="k">static_cast</span><span class="o">&lt;</span><span class="k">decltype</span><span class="p">(</span><span class="n">awaitable</span><span class="p">)</span><span class="o">&gt;</span><span class="p">(</span><span class="n">awaitable</span><span class="p">));</span>
  <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
    <span class="k">using</span> <span class="n">handle_t</span> <span class="o">=</span> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span><span class="p">;</span>

    <span class="o">&lt;</span><span class="n">suspend</span><span class="o">-</span><span class="n">coroutine</span><span class="o">&gt;</span>

    <span class="k">auto</span> <span class="n">next</span> <span class="o">=</span> <span class="n">awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span><span class="n">handle_t</span><span class="o">::</span><span class="n">from_promise</span><span class="p">(</span><span class="n">promise</span><span class="p">));</span>
    <span class="n">next</span><span class="p">.</span><span class="n">resume</span><span class="p">();</span>
    <span class="k">return</span><span class="p">;</span> <span class="c1">// &lt;return-to-caller-or-resumer&gt;</span>

<span class="nl">resume_point_label:</span>
    <span class="o">&lt;</span><span class="n">resume</span><span class="o">-</span><span class="n">point</span><span class="o">&gt;</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="n">awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Recall that in the expansion of <code class="language-plaintext highlighter-rouge">co_await</code> shown in the first post, when <code class="language-plaintext highlighter-rouge">await_suspend</code> returns <code class="language-plaintext highlighter-rouge">void</code> or <code class="language-plaintext highlighter-rouge">true</code>, execution goes straight to <code class="language-plaintext highlighter-rouge">&lt;return-to-caller-or-resumer&gt;</code>, that is, control is returned to the caller/resumer.</p>

<p>With Symmetric transfer, following the semantics described above, the coroutine identified by the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> returned from <code class="language-plaintext highlighter-rouge">await_suspend</code> has to be resumed, which is the <code class="language-plaintext highlighter-rouge">next.resume();</code> line. The <code class="language-plaintext highlighter-rouge">&lt;return-to-caller-or-resumer&gt;</code> part is turned by the compiler into a <code class="language-plaintext highlighter-rouge">return;</code> statement.</p>

<p>Put differently: we resume a coroutine through <code class="language-plaintext highlighter-rouge">.resume()</code>, and while its body runs, symmetric transfer calls another <code class="language-plaintext highlighter-rouge">.resume()</code>; the <code class="language-plaintext highlighter-rouge">return;</code> that follows then takes control back to the caller of the first <code class="language-plaintext highlighter-rouge">.resume()</code>. The two <code class="language-plaintext highlighter-rouge">.resume()</code> calls have exactly the same signature, and nothing happens between the second call and the return, so this is essentially a <a href="https://en.wikipedia.org/wiki/Tail_call">tail call</a>.</p>

<h3 id="co_await-reviewed-again">co_await reviewed again</h3>

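<p>Unfortunately, I left out some facts above: the code the compiler actually generates is not quite the same as that example. To be precise, the generated code contains no <code class="language-plaintext highlighter-rouge">next.resume()</code> at all, which is what we turn to next.</p>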

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">caller</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">result</span> <span class="o">=</span> <span class="k">co_await</span> <span class="n">callee</span><span class="p">();</span>
    <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller: result = "</span> <span class="o">&lt;&lt;</span> <span class="n">result</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
    <span class="k">co_return</span> <span class="n">result</span> <span class="o">*</span> <span class="mi">2</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

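<p>Here is a symmetric transfer <a href="https://godbolt.org/z/n896xMrdG">example</a>. The overall flow is the same as for Asymmetric transfer in the previous post; only the <code class="language-plaintext highlighter-rouge">await_suspend</code> code is changed (the exact changes are shown later). If you paste the <code class="language-plaintext highlighter-rouge">caller</code> coroutine into C++ Insights, the resume function expands to the illustrative code below, with some renaming and simplified logic to make it easier to follow:</p>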

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">caller_resume</span><span class="p">(</span><span class="n">__callerFrame</span><span class="o">*</span> <span class="n">frame</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">try</span> <span class="p">{</span>
        <span class="k">switch</span> <span class="p">(</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">suspend_index</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">case</span> <span class="mi">0</span><span class="p">:</span> <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_1</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">4</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_2</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">6</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_3</span><span class="p">;</span>
        <span class="p">}</span>

        <span class="c1">// initial_suspend</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_awaiter</span> <span class="o">=</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">initial_suspend</span><span class="p">();</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
            <span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span>
                <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_address</span><span class="p">(</span><span class="n">frame</span><span class="p">)</span>
            <span class="p">);</span>
            <span class="n">frame</span><span class="o">-&gt;</span><span class="n">suspend_index</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
            <span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_await_called</span> <span class="o">=</span> <span class="nb">true</span><span class="p">;</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>

    <span class="n">resume_point_1</span><span class="o">:</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>

        <span class="c1">// co_await callee()</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">callee_awaiter</span> <span class="o">=</span> <span class="n">callee</span><span class="p">().</span><span class="k">operator</span> <span class="nf">co_await</span><span class="p">();</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">callee_awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
            <span class="k">auto</span> <span class="n">next</span> <span class="o">=</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">callee_awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span>
                <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_address</span><span class="p">(</span><span class="n">frame</span><span class="p">)</span>
            <span class="p">);</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">next</span><span class="p">.</span><span class="n">address</span><span class="p">())</span> <span class="p">{</span>
                <span class="n">frame</span><span class="o">-&gt;</span><span class="n">suspend_index</span> <span class="o">=</span> <span class="mi">4</span><span class="p">;</span>
                <span class="k">return</span><span class="p">;</span>
            <span class="p">}</span>
        <span class="p">}</span>

    <span class="n">resume_point_2</span><span class="o">:</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">result</span> <span class="o">=</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">callee_awaiter</span><span class="p">.</span><span class="n">await_resume</span><span class="p">();</span>
        <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"  &gt; caller: result = "</span> <span class="o">&lt;&lt;</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">result</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">return_value</span><span class="p">(</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">result</span> <span class="o">*</span> <span class="mi">2</span><span class="p">);</span>
        <span class="k">goto</span> <span class="n">final_suspend</span><span class="p">;</span>

    <span class="p">}</span> <span class="k">catch</span> <span class="p">(...)</span> <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">initial_await_called</span><span class="p">)</span> <span class="k">throw</span><span class="p">;</span>
        <span class="n">frame</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">unhandled_exception</span><span class="p">();</span>
    <span class="p">}</span>

<span class="n">final_suspend</span><span class="o">:</span>
    <span class="c1">// final_suspend</span>
    <span class="n">frame</span><span class="o">-&gt;</span><span class="n">final_awaiter</span> <span class="o">=</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">promise</span><span class="p">.</span><span class="n">final_suspend</span><span class="p">();</span>
    <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">final_awaiter</span><span class="p">.</span><span class="n">await_ready</span><span class="p">())</span> <span class="p">{</span>
        <span class="k">auto</span> <span class="n">next</span> <span class="o">=</span> <span class="n">frame</span><span class="o">-&gt;</span><span class="n">final_awaiter</span><span class="p">.</span><span class="n">await_suspend</span><span class="p">(</span>
            <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">&gt;::</span><span class="n">from_address</span><span class="p">(</span><span class="n">frame</span><span class="p">)</span>
        <span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">next</span><span class="p">.</span><span class="n">address</span><span class="p">())</span> <span class="p">{</span>
            <span class="n">frame</span><span class="o">-&gt;</span><span class="n">suspend_index</span> <span class="o">=</span> <span class="mi">6</span><span class="p">;</span>
            <span class="k">return</span><span class="p">;</span>
        <span class="p">}</span>
    <span class="p">}</span>

<span class="n">resume_point_3</span><span class="o">:</span>
    <span class="n">frame</span><span class="o">-&gt;</span><span class="n">destroy_fn</span><span class="p">(</span><span class="n">frame</span><span class="p">);</span>
<span class="p">}</span>

</code></pre></div></div>

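<p>As in the previous post, the body of <code class="language-plaintext highlighter-rouge">caller</code> is expanded into a finite state machine: the coroutine's state is updated when it suspends, so that when it is resumed, execution jumps to the right place. Let's focus on the symmetric transfer part:</p>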

<ol>
  <li><code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> returns a <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, called <code class="language-plaintext highlighter-rouge">next</code> in the code; control should be handed to the coroutine that this <code class="language-plaintext highlighter-rouge">coroutine_handle</code> identifies.</li>
  <li>If <code class="language-plaintext highlighter-rouge">next</code> is non-null, meaning it really does refer to a coroutine, the suspend-point index is set and the function returns immediately. Something <strong>magical</strong> happens here, including resuming the <code class="language-plaintext highlighter-rouge">next</code> coroutine, and control eventually returns to the current coroutine's caller or resumer. (Note that <code class="language-plaintext highlighter-rouge">next.resume()</code> does not appear explicitly in this code.)</li>
  <li>Later, when the <code class="language-plaintext highlighter-rouge">caller</code> coroutine is finally woken up, it continues from <code class="language-plaintext highlighter-rouge">resume_point_2</code>.</li>
</ol>

<p>Why call it magical? Because even the <a href="https://eel.is/c++draft/expr.await#note-1">standard draft</a> is rather vague about it:</p>

<blockquote>
  <p>If the type of <em>await-suspend</em> is <code class="language-plaintext highlighter-rouge">std::coroutine_handle&lt;Z&gt;</code>, <em>await-suspend</em><code class="language-plaintext highlighter-rouge">.resume()</code> is evaluated. This resumes the coroutine referred to by the result of <em>await-suspend</em>. Any number of coroutines can be successively resumed in this fashion, eventually returning control flow to the current coroutine caller or resumer (<a href="https://eel.is/c++draft/dcl.fct.def.coroutine">[dcl.fct.def.coroutine]</a>).</p>

</blockquote>

<p>And this is the wording in the symmetric transfer proposal, <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0913r1.html">P0913R1</a>:</p>

<blockquote>
  <p>If that expression has type <code class="language-plaintext highlighter-rouge">std::experimental::coroutine_handle&lt;Z&gt;</code> and evaluates to a value <em>s</em>, the coroutine referred to by <em>s</em> is resumed as if by a call <em>s</em><code class="language-plaintext highlighter-rouge">.resume()</code>. [<em>Note:</em> Any number of coroutines may be successively resumed in this fashion, eventually returning control flow to the current coroutine caller or resumer (8.4.4) <em>– end note</em>]</p>

</blockquote>

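<p>Both the standard draft and the proposal say that when <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> returns a <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, the coroutine it refers to is resumed. Both also say that any number of coroutines may be resumed along the way, and that control eventually returns to the current coroutine's caller or resumer.</p>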

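<p>As for the so-called <strong>magic</strong>, all we need to know at this point is that the code the compiler generates does not contain a direct call such as <code class="language-plaintext highlighter-rouge">next.resume()</code>. Instead, the coroutine identified by the <code class="language-plaintext highlighter-rouge">coroutine_handle</code> returned for symmetric transfer is resumed with a direct jump. A bit more background is needed first; the concrete mechanism is covered later, together with the code.</p>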
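<p>One way to picture the "any number of coroutines can be successively resumed" wording is a trampoline loop. The sketch below is a hand-written model, not what the compiler emits (as noted, the real code uses a jump). <code class="language-plaintext highlighter-rouge">Frame</code>, <code class="language-plaintext highlighter-rouge">resume</code>, <code class="language-plaintext highlighter-rouge">caller_step</code>, <code class="language-plaintext highlighter-rouge">callee_step</code> and <code class="language-plaintext highlighter-rouge">run_trampoline</code> are hypothetical names. Each step returns "who runs next" instead of calling it, so a single <code class="language-plaintext highlighter-rouge">resume()</code> drives several steps at constant stack depth.</p>

```cpp
#include <string>

// Hypothetical hand-written "coroutine frame". resume_fn runs one step and
// returns the frame that should run next (nullptr = back to the resumer).
struct Frame {
    Frame* (*resume_fn)(Frame*);
    Frame* continuation;   // set by the awaiting side, like continuation_
    Frame* child;          // the frame this one wants to await
    int suspend_index;     // which resume point to continue from
    std::string* log;
};

// Stand-in for handle.resume(): a loop instead of nested calls.
void resume(Frame* f) {
    while (f != nullptr) {
        f = f->resume_fn(f);
    }
}

// Models the callee body followed by FinalAwaiter::await_suspend.
Frame* callee_step(Frame* self) {
    *self->log += "callee ";
    return self->continuation;  // "return continuation;" in the final awaiter
}

// Models the caller body as a two-state machine.
Frame* caller_step(Frame* self) {
    switch (self->suspend_index) {
    case 0:
        *self->log += "caller:start ";
        self->child->continuation = self;  // Awaiter::await_suspend stores the handle
        self->suspend_index = 1;           // remember where to continue
        return self->child;                // ... and returns the child's handle
    default:
        *self->log += "caller:end";
        return nullptr;                    // nothing left to run
    }
}

std::string run_trampoline() {
    std::string log;
    Frame callee{callee_step, nullptr, nullptr, 0, &log};
    Frame caller{caller_step, nullptr, &callee, 0, &log};
    resume(&caller);  // one call drives caller -> callee -> caller
    return log;
}
```

<p>While any step runs, the stack holds only <code class="language-plaintext highlighter-rouge">resume</code> and that one step, no matter how many frames are chained together.</p>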

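<p>Seen from another angle, symmetric transfer is exactly why the compiler lowers every coroutine body into one fixed shape. Inside one <code class="language-plaintext highlighter-rouge">std::coroutine_handle::resume()</code>, another <code class="language-plaintext highlighter-rouge">std::coroutine_handle::resume()</code> has to be invoked in some form, followed by a return. To keep the stack from growing, the idea is to solve this with tail call optimization.</p>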

<p>A tail call can be understood as follows:</p>

<blockquote>
  <p>For compilers generating assembly directly, tail-call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack.</p>

</blockquote>

<p>That is, when the conditions allow it, the compiler can replace a <code class="language-plaintext highlighter-rouge">call</code> followed by a <code class="language-plaintext highlighter-rouge">ret</code> with a single <code class="language-plaintext highlighter-rouge">jmp</code> instruction.</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">; before</span>
<span class="nl">foo:</span>
  <span class="nf">call</span> <span class="nv">B</span>
  <span class="nf">call</span> <span class="nv">A</span>
  <span class="nf">ret</span>

<span class="c1">; after</span>
<span class="nl">foo:</span>
  <span class="nf">call</span> <span class="nv">B</span>
  <span class="nf">jmp</span>  <span class="nv">A</span>
</code></pre></div></div>
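<p>At the C++ level, the difference looks roughly like the sketch below. <code class="language-plaintext highlighter-rouge">sum_to</code> and <code class="language-plaintext highlighter-rouge">sum_to_plain</code> are made-up functions for illustration. Whether the compiler really emits a <code class="language-plaintext highlighter-rouge">jmp</code> for the first one depends on the compiler and optimization level; the C++ standard does not guarantee it.</p>

```cpp
#include <cstdint>

// The recursive call is the last thing the function does and its result is
// returned unchanged, so nothing in this frame is needed afterwards.
std::uint64_t sum_to(std::uint64_t n, std::uint64_t acc = 0) {
    if (n == 0) {
        return acc;
    }
    return sum_to(n - 1, acc + n);  // tail call: eligible to become a jump
}

// Not a tail call: the addition still has to run after the callee returns,
// so this frame must stay alive across the call.
std::uint64_t sum_to_plain(std::uint64_t n) {
    if (n == 0) {
        return 0;
    }
    return n + sum_to_plain(n - 1);
}
```

<p>Both compute the same value; only the first has the call in tail position, which is the shape that <code class="language-plaintext highlighter-rouge">next.resume(); return;</code> has in the <code class="language-plaintext highlighter-rouge">co_await</code> expansion above.</p>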

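<p>The preconditions for tail call optimization are listed below. It is precisely so that coroutines can meet them that the compiler lowers the coroutine body into the shape described above:</p>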

<ul>
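  <li>The calling convention supports tail calls, and the caller and callee use the same calling convention: the compiler splits a coroutine into two parts, a function that constructs and initializes the coroutine (called the <code class="language-plaintext highlighter-rouge">ramp</code>), and a function that contains the coroutine body and its state machine (called the <code class="language-plaintext highlighter-rouge">body</code>, such as <code class="language-plaintext highlighter-rouge">caller_resume</code> above). Handling it this way guarantees that the calling convention requirement is met.</li>
  <li>The return types are the same: both calls are to <code class="language-plaintext highlighter-rouge">std::coroutine_handle::resume()</code>, which returns <code class="language-plaintext highlighter-rouge">void</code>.</li>
  <li>No non-trivial destructors need to run after the call and before returning to the caller: every object whose lifetime may span a suspend point is stored in the coroutine frame, so nothing has to be destroyed on the way out. Objects whose lifetime does not span a suspend point, such as local variables, have already been destroyed before the coroutine suspends.</li>
  <li><del>The call is not inside a try/catch block</del>: as we saw, the coroutine body does contain a try/catch. According to this <a href="https://lewissbaker.github.io/2020/05/11/understanding_symmetric_transfer">blog post</a>, the compiler satisfies this requirement by moving <code class="language-plaintext highlighter-rouge">.resume()</code> out of the <code class="language-plaintext highlighter-rouge">body</code>, as described above.</li>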
</ul>

<h3 id="demo">Demo</h3>

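<p>Now that we understand how symmetric transfer works, let's go through the concrete changes in the <a href="https://godbolt.org/z/n896xMrdG">demo</a>. The overall flow is almost identical to the Asymmetric transfer one in the previous post. Only <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> and <code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code> differ slightly (that is, the places where control is transferred), and they are analyzed below.</p>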

<p>The only two differences are:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code>:
    <ul>
      <li>With Asymmetric transfer, when <code class="language-plaintext highlighter-rouge">caller</code> reaches <code class="language-plaintext highlighter-rouge">co_await callee();</code>, it saves the <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> and then manually resumes <code class="language-plaintext highlighter-rouge">callee</code>.</li>
      <li>With Symmetric transfer, we no longer resume the child coroutine <code class="language-plaintext highlighter-rouge">callee</code> manually; we simply return its <code class="language-plaintext highlighter-rouge">coroutine_handle</code>.</li>
    </ul>
  </li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
// Asymmetric transfer
void await_suspend(std::coroutine_handle&lt;&gt; continuation) noexcept {
    handle_.promise().continuation_ = continuation;
    handle_.resume();
}
*/</span>

<span class="c1">// Symmetric transfer</span>
<span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
    <span class="k">return</span> <span class="n">handle_</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
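  <li><code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code>:
    <ul>
      <li>With Asymmetric transfer, when the <code class="language-plaintext highlighter-rouge">callee</code> body has finished, it checks whether a coroutine to resume has been stored in the promise, and if so resumes it directly.</li>
      <li>With Symmetric transfer, when the <code class="language-plaintext highlighter-rouge">callee</code> body has finished, it checks whether a coroutine to resume has been stored in the promise. If so, it returns that coroutine's <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, meaning control should go to that coroutine (the <code class="language-plaintext highlighter-rouge">caller</code> in this example); otherwise it returns <code class="language-plaintext highlighter-rouge">std::noop_coroutine()</code>.</li>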
    </ul>
  </li>
</ul>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
// Asymmetric transfer
void await_suspend(std::coroutine_handle&lt;promise_type&gt; h) noexcept {
    auto continuation = h.promise().continuation_;
    if (continuation) {
        continuation.resume();
    }
}
*/</span>

<span class="c1">// Symmetric transfer: Return the next coroutine to transfer to, or noop if none.</span>
<span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
    <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">continuation</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">noop_coroutine</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>To summarize, both changes return a <code class="language-plaintext highlighter-rouge">coroutine_handle</code> from <code class="language-plaintext highlighter-rouge">await_suspend</code>, indicating that control should be transferred to that coroutine.</p>

<p>Running the demo under gdb shows that the Symmetric transfer call stack differs from the Asymmetric transfer one: the nested execution <code class="language-plaintext highlighter-rouge">caller → callee → caller</code> still happens, but stack frames for <code class="language-plaintext highlighter-rouge">.resume()</code> no longer appear.</p>

<p><img src="/archive/coroutine-9.png" alt="figure" /></p>

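<p>Next we walk through the assembly of this demo and look at the state of the stack frames and the heap at key steps, to see what symmetric transfer really looks like.</p>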

<h2 id="symmetric-transfer-details">Symmetric transfer details</h2>

<h3 id="backgrounds">Backgrounds</h3>

<p>First, the coroutine frame of <code class="language-plaintext highlighter-rouge">caller</code> has the following layout (some variable names have been adjusted):</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">__callerFrame</span> <span class="p">{</span>
  <span class="c1">// +0x00</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">resume_fn</span><span class="p">)(</span><span class="n">__callerFrame</span><span class="o">*</span><span class="p">);</span>      <span class="c1">// coroutine state machine function, i.e. the resume function pointer</span>
  <span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">destroy_fn</span><span class="p">)(</span><span class="n">__callerFrame</span><span class="o">*</span><span class="p">);</span>     <span class="c1">// destroy function pointer</span>

  <span class="c1">// +0x10</span>
  <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span> <span class="n">promise</span> <span class="p">{</span>
    <span class="kt">int</span> <span class="n">value</span><span class="p">;</span>
    <span class="n">std</span><span class="o">::</span><span class="n">exception_ptr</span> <span class="n">exception</span><span class="p">;</span>
    <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation_</span><span class="p">;</span>
  <span class="p">};</span>

  <span class="c1">// +0x28</span>
  <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">self_handle</span><span class="p">;</span> <span class="c1">// coroutine_handle of this coroutine itself</span>

  <span class="c1">// +0x30</span>
  <span class="kt">int16_t</span> <span class="n">suspend_index</span><span class="p">;</span>                  <span class="c1">// index of the suspend point, identifies the coroutine state</span>
  <span class="kt">bool</span> <span class="n">needs_free</span><span class="p">;</span>                        <span class="c1">// whether the frame needs to be freed</span>
  <span class="kt">char</span> <span class="n">initial_await_called</span><span class="p">;</span>              <span class="c1">// whether initial_suspend has been called</span>

  <span class="c1">// +0x34</span>
  <span class="n">std</span><span class="o">::</span><span class="n">suspend_always</span> <span class="n">initial_awaiter</span><span class="p">;</span>    <span class="c1">// Awaiter of initial_suspend</span>

  <span class="c1">// +0x38</span>
  <span class="kt">int</span> <span class="n">result</span><span class="p">;</span>                             <span class="c1">// local variable whose lifetime spans a suspend point</span>

  <span class="c1">// +0x40</span>
  <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">Awaiter</span> <span class="n">callee_awaiter</span><span class="p">;</span> <span class="c1">// Awaiter of co_await callee()</span>

  <span class="c1">// +0x48</span>
  <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">callee_task</span><span class="p">;</span>            <span class="c1">// return object of the callee coroutine</span>
                                          <span class="c1">// which also provides operator co_await</span>

  <span class="c1">// +0x50+</span>
  <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">promise_type</span><span class="o">::</span><span class="n">FinalAwaiter</span> <span class="n">final_awaiter</span><span class="p">;</span>
                                          <span class="c1">// Awaiter of final_suspend</span>
<span class="p">};</span>
</code></pre></div></div>
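<p>The offsets in the comments above can be sanity-checked with a plain struct. The following is only a sketch under stated assumptions: it targets x86-64 (LP64), and every name ending in <code class="language-plaintext highlighter-rouge">Sketch</code> is a hypothetical stand-in, not a type from the demo. It also assumes that <code class="language-plaintext highlighter-rouge">std::exception_ptr</code>, <code class="language-plaintext highlighter-rouge">std::coroutine_handle</code>, the <code class="language-plaintext highlighter-rouge">Awaiter</code> and <code class="language-plaintext highlighter-rouge">SimpleTask&lt;int&gt;</code> are each one pointer wide, and that the two remaining awaiters have no data members.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins (assumption: each real type is one pointer wide on x86-64).
using HandleSketch = void*;        // models std::coroutine_handle<...>
using ExceptionPtrSketch = void*;  // models std::exception_ptr
struct EmptyAwaiterSketch {};      // models an awaiter without data members

struct PromiseSketch {
    int value;
    ExceptionPtrSketch exception;
    HandleSketch continuation_;
};

struct CallerFrameSketch {
    void (*resume_fn)(CallerFrameSketch*);   // +0x00
    void (*destroy_fn)(CallerFrameSketch*);  // +0x08
    PromiseSketch promise;                   // +0x10
    HandleSketch self_handle;                // +0x28
    std::int16_t suspend_index;              // +0x30
    bool needs_free;                         // +0x32
    char initial_await_called;               // +0x33
    EmptyAwaiterSketch initial_awaiter;      // +0x34
    int result;                              // +0x38
    HandleSketch callee_awaiter;             // +0x40, assumed to hold one handle
    HandleSketch callee_task;                // +0x48, holds one handle
    EmptyAwaiterSketch final_awaiter;        // +0x50
};

// Compile-time checks of the offsets listed in the frame layout.
static_assert(offsetof(CallerFrameSketch, promise) == 0x10, "promise offset");
static_assert(offsetof(CallerFrameSketch, self_handle) == 0x28, "self_handle offset");
static_assert(offsetof(CallerFrameSketch, suspend_index) == 0x30, "suspend_index offset");
static_assert(offsetof(CallerFrameSketch, result) == 0x38, "result offset");
static_assert(offsetof(CallerFrameSketch, callee_awaiter) == 0x40, "callee_awaiter offset");
static_assert(offsetof(CallerFrameSketch, final_awaiter) == 0x50, "final_awaiter offset");
static_assert(sizeof(CallerFrameSketch) == 0x58, "frame size");
```

<p>If the assumptions hold, the struct compiles cleanly, and its size of <code class="language-plaintext highlighter-rouge">0x58</code> matches the 88 bytes that the ramp function passes to <code class="language-plaintext highlighter-rouge">operator new</code> later in this walkthrough.</p>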

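<p>The field that matters most for understanding symmetric transfer is the <code class="language-plaintext highlighter-rouge">resume_fn</code> function pointer. Every coroutine has a state machine function, which is called each time the coroutine starts executing or is resumed. The current state of the coroutine, i.e. the suspend point at which it is currently suspended, is recorded in <code class="language-plaintext highlighter-rouge">suspend_index</code>.</p>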

<p>For example, the state machine function of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine is the <code class="language-plaintext highlighter-rouge">caller_resume</code> shown above; depending on <code class="language-plaintext highlighter-rouge">suspend_index</code>, the coroutine jumps to different locations inside <code class="language-plaintext highlighter-rouge">caller_resume</code>.</p>

<blockquote>
  <p>In the generated assembly, the demangled name of the state machine function of the caller coroutine is <code class="language-plaintext highlighter-rouge">caller(caller()::_Z6callerv.Frame*) [clone .actor]</code>.</p>

</blockquote>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>        <span class="k">switch</span> <span class="p">(</span><span class="n">frame</span><span class="o">-&gt;</span><span class="n">suspend_index</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">case</span> <span class="mi">0</span><span class="p">:</span> <span class="k">break</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_1</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">4</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_2</span><span class="p">;</span>
            <span class="k">case</span> <span class="mi">6</span><span class="p">:</span> <span class="k">goto</span> <span class="n">resume_point_3</span><span class="p">;</span>
        <span class="p">}</span>
</code></pre></div></div>

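<p>Note that the state machine code above contains no odd <code class="language-plaintext highlighter-rouge">suspend_index</code> values: an even value means the coroutine is suspended normally, while an odd value means the coroutine is being destroyed. Because there are several suspend points, the cleanup required differs depending on which state the coroutine is destroyed from. For example, the complete set of states for the <code class="language-plaintext highlighter-rouge">suspend_index</code> of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine is:</p>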

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>suspend_index = 0:  initial state
suspend_index = 1:  destroy, cleaning up from state 0
suspend_index = 2:  suspended at initial_suspend
suspend_index = 3:  destroy, cleaning up from state 2
suspend_index = 4:  suspended at co_await callee()
suspend_index = 5:  destroy, cleaning up from state 4
suspend_index = 6:  suspended at final_suspend
suspend_index = 7:  destroy, cleaning up from state 6
</code></pre></div></div>
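<p>To make the even/odd convention concrete, here is a hand-written model of such a state machine in plain C++, without any real coroutine. It is a minimal sketch: <code class="language-plaintext highlighter-rouge">FrameSketch</code>, <code class="language-plaintext highlighter-rouge">resume_sketch</code> and <code class="language-plaintext highlighter-rouge">destroy_sketch</code> are hypothetical names, and modelling destroy as "set the lowest bit, then enter the same function" is an assumption chosen to match the state table above, not something read from the compiler output.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical frame holding only what the dispatch needs.
struct FrameSketch {
    std::int16_t suspend_index = 0;
    std::string trace;  // records which segment ran on each call
};

// Plays the role of the state machine function: each call runs one segment.
void resume_sketch(FrameSketch* frame) {
    if (frame->suspend_index & 1) {
        // Odd index: destroy path, clean up from the preceding even state.
        frame->trace += "cleanup-from-" + std::to_string(frame->suspend_index - 1) + ";";
        return;
    }
    switch (frame->suspend_index) {
        case 0: break;
        case 2: goto resume_point_1;
        case 4: goto resume_point_2;
        default: return;  // suspended at final_suspend: nothing left to run
    }
    frame->trace += "start;";
    frame->suspend_index = 2;  // suspended at initial_suspend
    return;
resume_point_1:
    frame->trace += "body;";
    frame->suspend_index = 4;  // suspended at co_await callee()
    return;
resume_point_2:
    frame->trace += "tail;";
    frame->suspend_index = 6;  // suspended at final_suspend
    return;
}

// Assumed model of destroy: mark the state as odd, then run the cleanup branch.
void destroy_sketch(FrameSketch* frame) {
    assert((frame->suspend_index & 1) == 0);
    frame->suspend_index = static_cast<std::int16_t>(frame->suspend_index | 1);
    resume_sketch(frame);
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">resume_sketch</code> twice on a fresh frame moves it from state 0 to 2 to 4; calling <code class="language-plaintext highlighter-rouge">destroy_sketch</code> afterwards moves it to state 5 and runs the cleanup branch for state 4.</p>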

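<p>The state machine function may be called many times, each time with the coroutine suspended at a different suspend point. Apart from that, in the assembly it is no different from an ordinary function. For example, it has a prologue: every call pushes <code class="language-plaintext highlighter-rouge">%rbp</code> and sets up a new stack frame from <code class="language-plaintext highlighter-rouge">%rsp</code>:</p>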

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">0000000000001838</span> <span class="o">&lt;</span><span class="nf">_Z6callerPZ6callervE16_Z6callerv.Frame.actor</span><span class="o">&gt;</span><span class="p">:</span>
    <span class="err">1838</span><span class="p">:</span>  <span class="nf">endbr64</span>
    <span class="err">183</span><span class="nl">c:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="err">183</span><span class="nl">d:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
    <span class="err">1840</span><span class="p">:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbx</span>
    <span class="err">1841</span><span class="p">:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
    <span class="err">1845</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span><span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
</code></pre></div></div>

<p>One more point: the state machine function takes exactly one parameter, the pointer to the coroutine frame. On every call, this pointer is saved at <code class="language-plaintext highlighter-rouge">-0x28(%rbp)</code>.</p>

<h3 id="main--caller">main → caller</h3>

<p>Next, we trace the execution of the whole demo. The complete assembly is available <a href="https://gist.github.com/critical27/0a95391b8df0df2de3d60d792f547325">here</a> (compiled with <code class="language-plaintext highlighter-rouge">-O0</code> to keep it easy to follow). The entry address of the state machine function of <code class="language-plaintext highlighter-rouge">caller</code> is <code class="language-plaintext highlighter-rouge">1838</code>, and that of <code class="language-plaintext highlighter-rouge">callee</code> is <code class="language-plaintext highlighter-rouge">13fb</code>.</p>

<ol>
  <li>
    <p><code class="language-plaintext highlighter-rouge">main</code> calls <code class="language-plaintext highlighter-rouge">caller()</code>, which creates the coroutine.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">0000000000001</span><span class="nf">c94</span> <span class="o">&lt;</span><span class="nv">main</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="err">1</span><span class="nl">c94:</span>  <span class="nf">endbr64</span>
 <span class="err">1</span><span class="nl">c98:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
 <span class="err">1</span><span class="nl">c99:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
 <span class="err">1</span><span class="nl">c9c:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbx</span>
 <span class="err">1</span><span class="nl">c9d:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x18</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
 <span class="err">1</span><span class="nl">ca1:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1</span><span class="nl">caa:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">1</span><span class="nl">cae:</span>  <span class="nf">xor</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">1</span><span class="nl">cb0:</span>  <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1</span><span class="nl">cb4:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">cb7:</span>  <span class="nf">call</span>   <span class="mi">16</span><span class="nv">e6</span> <span class="o">&lt;</span><span class="nv">_Z6callerv</span><span class="o">&gt;</span>
 <span class="err">1</span><span class="nl">cbc:</span>  <span class="nf">lea</span>    <span class="mh">0x137d</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>   <span class="c1">; return address</span>
</code></pre></div>    </div>

    <p>Right after the call to <code class="language-plaintext highlighter-rouge">caller()</code>, the state is as follows:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ caller() constructor                    │
 │ - return addr: 0x1cbc (main)            │ ← pushed by call at 0x1cb7
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame) has not been constructed yet
</code></pre></div>    </div>

    <p>The assembly starting at <code class="language-plaintext highlighter-rouge">16e6</code> then does several things:</p>

    <ul>
      <li>Allocates the memory needed for the coroutine frame</li>
      <li>Constructs the <code class="language-plaintext highlighter-rouge">promise</code></li>
      <li>Calls <code class="language-plaintext highlighter-rouge">promise.get_return_object()</code></li>
      <li>Calls the state machine function of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine for the first time</li>
    </ul>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">00000000000016</span><span class="nf">e6</span> <span class="o">&lt;</span><span class="nv">_Z6callerv</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="err">16</span><span class="nl">e6:</span>  <span class="nf">endbr64</span>
 <span class="err">16</span><span class="nl">ea:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
 <span class="err">16</span><span class="nl">eb:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
 <span class="err">16</span><span class="nl">ee:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbx</span>
 <span class="err">16</span><span class="nl">ef:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x38</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
 <span class="err">16</span><span class="nl">f3:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span><span class="o">-</span><span class="mh">0x38</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">16</span><span class="nl">f7:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1700</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">1704</span><span class="p">:</span>  <span class="nf">xor</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">1706</span><span class="p">:</span>  <span class="nf">movq</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">170</span><span class="nl">e:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">-</span><span class="mh">0x21</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">1712</span><span class="p">:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">-</span><span class="mh">0x22</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">1716</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x58</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>          <span class="c1">; 88 bytes for coroutine frame</span>
 <span class="err">171</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">171</span><span class="nl">e:</span>  <span class="nf">call</span>   <span class="mi">1150</span> <span class="o">&lt;</span><span class="nv">_Znwm@plt</span><span class="o">&gt;</span>    <span class="c1">; operator new</span>
 <span class="err">1723</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>    <span class="c1">; -0x20(%rbp) = caller_frame pointer</span>
 <span class="err">1727</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">172</span><span class="nl">b:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="mh">0x32</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>
 <span class="err">172</span><span class="nl">f:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1733</span><span class="p">:</span>  <span class="nf">lea</span>    <span class="mh">0xfe</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>     <span class="c1">; rdx = 1838, address of caller's state machine function (resume_fn)</span>
 <span class="err">173</span><span class="nl">a:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>
 <span class="err">173</span><span class="nl">d:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1741</span><span class="p">:</span>  <span class="nf">lea</span>    <span class="mh">0x519</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>    <span class="c1">; rdx = 1c61, address of caller's destroy function (destroy_fn)</span>
 <span class="err">1748</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>
 <span class="err">174</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1750</span><span class="p">:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;promise)</span>
 <span class="err">1754</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1757</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">204</span><span class="nv">c</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call the promise constructor</span>
 <span class="err">175</span><span class="nl">c:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">-</span><span class="mh">0x21</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">1760</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1764</span><span class="p">:</span>  <span class="nf">lea</span>    <span class="mh">0x10</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>     <span class="c1">; rdx = &amp;(caller_frame-&gt;promise)</span>
 <span class="err">1768</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x38</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = address of the return value</span>
 <span class="err">176</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">176</span><span class="nl">f:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1772</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">231</span><span class="nv">a</span> <span class="o">&lt;&gt;</span>             <span class="c1">; call promise.get_return_object()</span>
 <span class="err">1777</span><span class="p">:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">-</span><span class="mh">0x22</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">177</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">177</span><span class="nl">f:</span>  <span class="nf">movw</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>     <span class="c1">; caller_frame-&gt;suspend_index = 0</span>
 <span class="err">1785</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1789</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">178</span><span class="nl">c:</span>  <span class="nf">call</span>   <span class="mi">1838</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; first call to caller's state machine function</span>
 <span class="err">1791</span><span class="p">:</span>  <span class="nf">jmp</span>    <span class="mi">181</span><span class="nv">a</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>
</code></pre></div>    </div>

    <p>After the state machine function is called, the state is as follows:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ caller() constructor                    │
 │ - return addr: 0x1cbc (main)            │ ← pushed by call at 0x1cb7
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x1791                   │ ← pushed by call at 0x178c
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 0
</code></pre></div>    </div>
  </li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">suspend_index</code> of the <code class="language-plaintext highlighter-rouge">caller</code> coroutine starts at <code class="language-plaintext highlighter-rouge">0</code>. Because <code class="language-plaintext highlighter-rouge">initial_suspend</code> returns <code class="language-plaintext highlighter-rouge">suspend_always</code>, at <code class="language-plaintext highlighter-rouge">co_await promise.initial_suspend()</code> the call to <code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">false</code>, meaning the coroutine will be suspended, and <code class="language-plaintext highlighter-rouge">await_suspend</code> returns <code class="language-plaintext highlighter-rouge">void</code>, meaning control returns unconditionally to whoever called the state machine function. Before returning, <code class="language-plaintext highlighter-rouge">suspend_index</code> is set to <code class="language-plaintext highlighter-rouge">2</code>.</p>

    <p>After the state machine function returns, execution continues from the return address <code class="language-plaintext highlighter-rouge">1791</code> and eventually returns to <code class="language-plaintext highlighter-rouge">main</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1791</span><span class="p">:</span>  <span class="nf">jmp</span>    <span class="mi">181</span><span class="nv">a</span>
 <span class="c1">; ...</span>
 <span class="err">181</span><span class="nl">a:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">181</span><span class="nl">e:</span>  <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1827</span><span class="p">:</span>  <span class="nf">je</span>     <span class="mi">182</span><span class="nv">e</span>
 <span class="err">182</span><span class="nl">e:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x38</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; rax = &amp;task (return value)</span>
 <span class="err">1832</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>
 <span class="err">1836</span><span class="p">:</span>  <span class="nf">leave</span>
 <span class="err">1837</span><span class="p">:</span>  <span class="nf">ret</span>                        <span class="c1">; return to main (0x1cbc)</span>
</code></pre></div>    </div>

    <p>At this point the stack holds only the frame of <code class="language-plaintext highlighter-rouge">main</code>:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ┌─────────────────────┐ ← High Address
 │ main() stack frame  │
 └─────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 2
</code></pre></div>    </div>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">main</code> then resumes <code class="language-plaintext highlighter-rouge">caller</code> manually through its <code class="language-plaintext highlighter-rouge">coroutine_handle</code>; the return address of the <code class="language-plaintext highlighter-rouge">call</code> instruction is <code class="language-plaintext highlighter-rouge">1d10</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">; part of main omitted...</span>
 <span class="err">1</span><span class="nl">d04:</span>  <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = object returned by caller(), i.e. SimpleTask&lt;int&gt;</span>
                                   <span class="c1">; whose only member is the coroutine_handle of the caller coroutine</span>
 <span class="err">1</span><span class="nl">d08:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">d0b:</span>  <span class="nf">call</span>   <span class="mi">26</span><span class="nv">a6</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call resume() on caller's coroutine_handle</span>

 <span class="err">1</span><span class="nl">d10:</span>  <span class="nf">lea</span>    <span class="mh">0x1349</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="c1">; ...</span>
</code></pre></div>    </div>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">00000000000026</span><span class="nf">a6</span> <span class="o">&lt;</span><span class="nv">_ZNKSt7__n486116coroutine_handleIN10SimpleTaskIiE12promise_typeEE6resumeEv</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="err">26</span><span class="nl">a6:</span>  <span class="nf">endbr64</span>
 <span class="err">26</span><span class="nl">aa:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
 <span class="err">26</span><span class="nl">ab:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
 <span class="err">26</span><span class="nl">ae:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
 <span class="err">26</span><span class="nl">b2:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span><span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">26</span><span class="nl">b6:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">26</span><span class="nl">ba:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>        <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">26</span><span class="nl">bd:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>        <span class="c1">; %rdx = *(%rax) = caller-&gt;resume_fn = 1838</span>
 <span class="err">26</span><span class="nl">c0:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">26</span><span class="nl">c3:</span>  <span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>              <span class="c1">; call caller's state machine function</span>
 <span class="err">26</span><span class="nl">c5:</span>  <span class="nf">nop</span>
 <span class="err">26</span><span class="nl">c6:</span>  <span class="nf">leave</span>
 <span class="err">26</span><span class="nl">c7:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>The state after the call is shown below:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ coroutine_handle.resume()               │
 │ - return addr: 0x1d10                   │ ← pushed by call at 0x1d0b
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x26c5                   │ ← pushed by call at 0x26c3
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 2
</code></pre></div>    </div>
  </li>
</ol>
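
<p>As a rough model of what the <code class="language-plaintext highlighter-rouge">resume()</code> disassembly above does, here is a hand-written sketch, not compiler output. It assumes, as the <code class="language-plaintext highlighter-rouge">mov (%rax),%rdx</code> / <code class="language-plaintext highlighter-rouge">call *%rdx</code> pair suggests, that the first word of the coroutine frame is the resume function pointer. The names <code class="language-plaintext highlighter-rouge">FrameModel</code>, <code class="language-plaintext highlighter-rouge">caller_resume_model</code>, <code class="language-plaintext highlighter-rouge">resume_model</code> and <code class="language-plaintext highlighter-rouge">done_model</code> are made up for illustration, and the real frame layout is chosen by the compiler.</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a coroutine frame header (illustration only).
struct FrameModel {
    void (*resume_fn)(FrameModel*);  // first word: state machine function
    std::uint16_t suspend_index;     // which suspend point to continue from
    int steps_run;                   // bookkeeping for this sketch only
};

// Stand-in for a compiler-generated state machine function.
inline void caller_resume_model(FrameModel* frame) {
    frame->steps_run += 1;
    frame->suspend_index = 4;  // pretend we ran up to the next suspend point
}

// What the disassembled resume() boils down to:
//   mov (%rax),%rdx ; mov %rax,%rdi ; call *%rdx
inline void resume_model(FrameModel* frame) {
    auto fn = frame->resume_fn;  // load the first word of the frame
    assert(fn != nullptr);       // a finished coroutine must not be resumed
    fn(frame);                   // indirect call, frame pointer as argument
}

// In this model a frame counts as done once resume_fn has been cleared.
inline bool done_model(const FrameModel* frame) {
    return frame->resume_fn == nullptr;
}
```

<p>Building a <code class="language-plaintext highlighter-rouge">FrameModel</code> with <code class="language-plaintext highlighter-rouge">suspend_index</code> set to 2 and passing it to <code class="language-plaintext highlighter-rouge">resume_model</code> runs the stand-in state machine once through the pointer stored in the frame, which is all the real <code class="language-plaintext highlighter-rouge">resume()</code> does here.</p>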

<h3 id="caller--callee">caller → callee</h3>

<ol>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">caller</code> coroutine resumes with <code class="language-plaintext highlighter-rouge">suspend_index</code> equal to <code class="language-plaintext highlighter-rouge">2</code>, so the state machine jumps to the following assembly:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">195</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller coroutine frame</span>
 <span class="err">195</span><span class="nl">f:</span>  <span class="nf">movb</span>   <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="mh">0x33</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>     <span class="c1">; set initial_await_called = 1</span>
 <span class="err">1963</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1967</span><span class="p">:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x34</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;initial_awaiter)</span>
 <span class="err">196</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">196</span><span class="nl">e:</span>  <span class="nf">call</span>   <span class="mi">1</span><span class="nv">f22</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call initial_awaiter.await_resume()</span>
 <span class="err">1973</span><span class="p">:</span>  <span class="nf">lea</span>    <span class="mh">0x16a0</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>   <span class="c1">; load the string "  &gt; caller()\n"</span>
 <span class="err">197</span><span class="nl">a:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">197</span><span class="nl">d:</span>  <span class="nf">lea</span>    <span class="mh">0x36bc</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>   <span class="c1">; load the address of std::cout</span>
 <span class="err">1984</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1987</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">1140</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; print the string</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">co_await callee()</code> compiles to the assembly below, which creates the <code class="language-plaintext highlighter-rouge">callee</code> coroutine through <code class="language-plaintext highlighter-rouge">call 12a9</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">198</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1990</span><span class="p">:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x48</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_task)</span>
 <span class="err">1994</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1997</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">12</span><span class="nv">a9</span> <span class="o">&lt;</span><span class="nv">_Z6calleev</span><span class="o">&gt;</span>   <span class="c1">; call callee() to create the coroutine</span>
 <span class="err">199</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">a0:</span>  <span class="nf">lea</span>    <span class="mh">0x48</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>     <span class="c1">; %rdx = &amp;(caller_frame-&gt;callee_task)</span>
 <span class="err">19</span><span class="nl">a4:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">a8:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x40</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_awaiter)</span>
 <span class="err">19</span><span class="nl">ac:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">19</span><span class="nl">af:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">19</span><span class="nl">b2:</span>  <span class="nf">call</span>   <span class="mi">244</span><span class="nv">c</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call operator co_await()</span>
 <span class="err">19</span><span class="nl">b7:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">bb:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x40</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_awaiter)</span>
 <span class="err">19</span><span class="nl">bf:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">19</span><span class="nl">c2:</span>  <span class="nf">call</span>   <span class="mi">2550</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call callee_awaiter.await_ready()</span>
 <span class="err">19</span><span class="nl">c7:</span>  <span class="nf">xor</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">19</span><span class="nl">ca:</span>  <span class="nf">test</span>   <span class="o">%</span><span class="nb">al</span><span class="p">,</span><span class="o">%</span><span class="nb">al</span>
 <span class="err">19</span><span class="nl">cc:</span>  <span class="nf">je</span>     <span class="mi">1</span><span class="nv">a0b</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; if await_ready() returned true, jump to state 4</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>Because <code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">false</code>, <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code> is called and performs a symmetric transfer to <code class="language-plaintext highlighter-rouge">callee</code>: it returns <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, which hands control over to <code class="language-plaintext highlighter-rouge">callee</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">continuation</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="n">handle_</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span> <span class="o">=</span> <span class="n">continuation</span><span class="p">;</span>
     <span class="k">return</span> <span class="n">handle_</span><span class="p">;</span>
 <span class="p">}</span>
</code></pre></div>    </div>

    <p>The corresponding assembly follows. Inside <code class="language-plaintext highlighter-rouge">Awaiter::await_suspend</code>, <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> is saved into <code class="language-plaintext highlighter-rouge">callee</code>'s promise.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">19</span><span class="nl">ce:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">d2:</span>  <span class="nf">movw</span>   <span class="kc">$</span><span class="mh">0x4</span><span class="p">,</span><span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>     <span class="c1">; caller_frame-&gt;suspend_index = 4</span>
 <span class="err">19</span><span class="nl">d8:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">dc:</span>  <span class="nf">lea</span>    <span class="mh">0x40</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>     <span class="c1">; %rbx = &amp;(caller_frame-&gt;callee_awaiter)</span>
 <span class="err">19</span><span class="nl">e0:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">19</span><span class="nl">e4:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = address of caller's coroutine_handle (the current coroutine's handle)</span>
 <span class="err">19</span><span class="nl">e8:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">19</span><span class="nl">eb:</span>  <span class="nf">call</span>   <span class="mi">2132</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; convert to coroutine_handle&lt;void&gt;</span>
 <span class="err">19</span><span class="nl">f0:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">19</span><span class="nl">f3:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rbx</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">19</span><span class="nl">f6:</span>  <span class="nf">call</span>   <span class="mi">2564</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call callee_awaiter.await_suspend()</span>
 <span class="err">19</span><span class="nl">fb:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>    <span class="c1">; %rax = handle returned for the symmetric transfer (i.e. callee's coroutine_handle)</span>
 <span class="err">19</span><span class="nl">ff:</span>  <span class="nf">jmp</span>    <span class="mb">1b</span><span class="mi">66</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; get ready to jump to callee</span>
 <span class="err">1</span><span class="nl">a04:</span>  <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a09:</span>  <span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">a27</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>
</code></pre></div>    </div>

    <p>The state just before the jump is shown below. Note that the instruction at <code class="language-plaintext highlighter-rouge">19d2</code> has already set <code class="language-plaintext highlighter-rouge">suspend_index</code> to 4:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Before symmetric transfer (caller -&gt; callee)

 ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ coroutine_handle.resume()               │
 │ - return addr: 0x1d10                   │ ← pushed by call at 0x1d0b
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x26c5                   │ ← pushed by call at 0x26c3
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 4
</code></pre></div>    </div>

    <p>The assembly that actually jumps to <code class="language-plaintext highlighter-rouge">callee</code> is shown below. Note that at <code class="language-plaintext highlighter-rouge">19fb</code> <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> was already saved to <code class="language-plaintext highlighter-rouge">-0x20(%rbp)</code>. <code class="language-plaintext highlighter-rouge">coroutine_handle::address()</code> then returns the address of <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine frame</code>. The first word of that frame holds the entry address of <code class="language-plaintext highlighter-rouge">callee</code>'s state machine function <code class="language-plaintext highlighter-rouge">callee_resume</code>, which is loaded into <code class="language-plaintext highlighter-rouge">%rdx</code>, and <code class="language-plaintext highlighter-rouge">call *%rdx</code> finally jumps to <code class="language-plaintext highlighter-rouge">13fb</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">b66:</span>  <span class="nf">endbr64</span>
 <span class="err">1</span><span class="nl">b6a:</span>  <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; rax = &amp;(callee's coroutine_handle)</span>
 <span class="err">1</span><span class="nl">b6e:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">b71:</span>  <span class="nf">call</span>   <span class="mi">1</span><span class="nv">dd4</span> <span class="o">&lt;</span><span class="nv">...address</span><span class="o">&gt;</span>   <span class="c1">; call coroutine_handle::address</span>
                                   <span class="c1">; %rax = callee_frame pointer</span>
 <span class="err">1</span><span class="nl">b76:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>         <span class="c1">; rdx = *(callee_frame) = address of callee's resume_fn, i.e. its state machine function</span>
 <span class="err">1</span><span class="nl">b79:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>           <span class="c1">; rdi = callee_frame pointer</span>
 <span class="err">1</span><span class="nl">b7c:</span>  <span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>               <span class="c1">; ★ indirect tail call into callee's state machine function ★</span>
 <span class="err">1</span><span class="nl">b7e:</span>  <span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">c46</span> <span class="o">&lt;</span><span class="nb">cl</span><span class="nv">eanup</span><span class="o">&gt;</span>
</code></pre></div>    </div>

    <blockquote>
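      <p>The assembly starting at <code class="language-plaintext highlighter-rouge">1b66</code> is a generic symmetric-transfer sequence that the compiler emits for the <code class="language-plaintext highlighter-rouge">caller</code> coroutine. It will show up once more later; only the final jump target differs, depending on what <code class="language-plaintext highlighter-rouge">await_suspend</code> returns.</p>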

    </blockquote>

    <p>The state after the jump is shown below. Note that <code class="language-plaintext highlighter-rouge">caller</code> is now suspended. <code class="language-plaintext highlighter-rouge">callee</code>'s <code class="language-plaintext highlighter-rouge">coroutine frame</code> already exists (the <code class="language-plaintext highlighter-rouge">call 12a9</code> above allocated it, and the instruction at <code class="language-plaintext highlighter-rouge">1b76</code> reads the resume function out of it), but <code class="language-plaintext highlighter-rouge">callee</code>'s body has not started running yet.</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> After symmetric transfer (caller -&gt; callee)

 ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ coroutine_handle.resume()               │
 │ - return addr: 0x1d10                   │ ← pushed by call at 0x1d0b
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x26c5                   │ ← pushed by call at 0x26c3
 ├─────────────────────────────────────────┤
 │ callee_resume (not started)             │
 │ - symmetric transferred from 0x1b7c     │
 │ - return addr: 0x1b7e                   │
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 4
   - suspended

 - callee frame (__calleeFrame) is allocated, but the callee body has not run yet
</code></pre></div>    </div>

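    <p>Note that the function reached through <code class="language-plaintext highlighter-rouge">call *%rdx</code> is <code class="language-plaintext highlighter-rouge">callee</code>'s state machine function, which will itself use symmetric transfer to switch control to another coroutine. When <code class="language-plaintext highlighter-rouge">call *%rdx</code> executes, the address of the next instruction, <code class="language-plaintext highlighter-rouge">1b7e</code>, really is pushed onto the stack before control jumps to <code class="language-plaintext highlighter-rouge">*%rdx</code>. However, the code at the return address <code class="language-plaintext highlighter-rouge">1b7e</code> does not run right away, as it would after an ordinary function call returns. The call may be optimized into a tail call, and control comes back only after a long detour.</p>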
  </li>
</ol>
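
<p>The caller → callee hand-off above, together with the return trip covered in the next section, can be modelled in plain C++ without real coroutines. This is only a sketch under stated assumptions: <code class="language-plaintext highlighter-rouge">Frame</code>, <code class="language-plaintext highlighter-rouge">Log</code>, <code class="language-plaintext highlighter-rouge">log_model</code>, <code class="language-plaintext highlighter-rouge">transfer_to</code> and <code class="language-plaintext highlighter-rouge">run_model</code> are hypothetical names, the frame layout is invented, and each symmetric transfer is written as an ordinary nested call to mirror the <code class="language-plaintext highlighter-rouge">call *%rdx</code> seen in the disassembly.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical frame model; not the layout a real compiler produces.
struct Frame {
    void (*resume_fn)(Frame*);      // state machine function, first word
    Frame* continuation = nullptr;  // models promise().continuation_
    int suspend_index = 0;
    int result = 0;                 // models the value kept by return_value()
    Frame* awaited = nullptr;       // caller only: the task being co_awaited
};

struct Log {
    std::vector<std::string> events;
    int depth = 0;      // state machine calls currently on the stack
    int max_depth = 0;  // deepest nesting observed
};
inline Log& log_model() { static Log l; return l; }

// Models `call *%rdx`: a plain nested call through the frame's first word.
inline void transfer_to(Frame* target) {
    assert(target->resume_fn != nullptr);
    Log& l = log_model();
    l.depth += 1;
    if (l.depth > l.max_depth) l.max_depth = l.depth;
    target->resume_fn(target);
    l.depth -= 1;
}

// callee: run the body, store 42, mark itself done, then hand control to
// the stored continuation (what FinalAwaiter::await_suspend returns).
inline void callee_resume(Frame* self) {
    log_model().events.push_back("callee body");
    self->result = 42;          // co_return 42 -> promise.return_value(42)
    self->resume_fn = nullptr;  // done: must not be resumed again
    if (self->continuation) transfer_to(self->continuation);
}

// caller: state 2 awaits callee, state 4 continues after the co_await.
inline void caller_resume(Frame* self) {
    switch (self->suspend_index) {
    case 2:
        log_model().events.push_back("caller before co_await");
        self->suspend_index = 4;             // like movw $0x4,0x30(%rax)
        self->awaited->continuation = self;  // Awaiter::await_suspend
        transfer_to(self->awaited);          // symmetric transfer to callee
        break;
    case 4:
        log_model().events.push_back("caller after co_await");
        self->result = self->awaited->result;  // assumed await_resume() result
        break;
    }
}

// Sets up both frames and resumes caller once, as main() does via resume().
inline int run_model() {
    Frame callee{callee_resume};
    Frame caller{caller_resume};
    caller.suspend_index = 2;
    caller.awaited = &callee;
    transfer_to(&caller);
    return caller.result;
}
```

<p>In this model a single <code class="language-plaintext highlighter-rouge">run_model()</code> call leaves <code class="language-plaintext highlighter-rouge">max_depth</code> at 3, because <code class="language-plaintext highlighter-rouge">caller_resume</code>, <code class="language-plaintext highlighter-rouge">callee_resume</code> and <code class="language-plaintext highlighter-rouge">caller_resume</code> again are all on the stack at once, which is the nesting the stack diagrams draw. If every transfer were a true tail call (a jump instead of a call), the nesting would stay at one level.</p>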

<h3 id="callee--caller">callee → caller</h3>

<ol>
  <li>
    <p><code class="language-plaintext highlighter-rouge">callee</code> is created in the same way as before: the coroutine frame is allocated, the <code class="language-plaintext highlighter-rouge">promise</code> is constructed, and <code class="language-plaintext highlighter-rouge">promise.get_return_object()</code> is called.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">SimpleTask</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">callee</span><span class="p">()</span> <span class="p">{</span>
     <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"      &gt; callee()</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
     <span class="k">co_return</span> <span class="mi">42</span><span class="p">;</span>
 <span class="p">}</span>
</code></pre></div>    </div>

    <p>After <code class="language-plaintext highlighter-rouge">co_await initial_suspend</code> and the function body, the return value <code class="language-plaintext highlighter-rouge">42</code> is stored through <code class="language-plaintext highlighter-rouge">promise_type</code>'s <code class="language-plaintext highlighter-rouge">return_value</code> interface. At this point the <code class="language-plaintext highlighter-rouge">callee</code> coroutine has finished executing, so its state machine function pointer is set to <code class="language-plaintext highlighter-rouge">nullptr</code> and it must not be called again. (The coroutine frame has not been freed yet; when that happens is covered below.)</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1543</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">20</span><span class="nv">c4</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; 调用promise.return_value</span>
 <span class="err">1548</span><span class="p">:</span>  <span class="nf">nop</span>
 <span class="err">1549</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; rax = callee_frame pointer</span>
 <span class="err">154</span><span class="nl">d:</span>  <span class="nf">movq</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>         <span class="c1">; callee_frame-&gt;resume_fn = nullptr</span>
                                   <span class="c1">; marks the callee coroutine as done; its state machine function must not be called again</span>
</code></pre></div>    </div>

    <p>Finally, execution reaches <code class="language-plaintext highlighter-rouge">co_await final_suspend</code>. <code class="language-plaintext highlighter-rouge">FinalAwaiter</code>'s <code class="language-plaintext highlighter-rouge">await_ready</code> returns <code class="language-plaintext highlighter-rouge">false</code>, so another symmetric transfer happens in <code class="language-plaintext highlighter-rouge">await_suspend</code>. Recall that <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> was saved in <code class="language-plaintext highlighter-rouge">callee</code>'s promise earlier, so <code class="language-plaintext highlighter-rouge">FinalAwaiter::await_suspend</code> only has to return <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine_handle</code> to hand control back to <code class="language-plaintext highlighter-rouge">caller</code>.</p>

    <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;&gt;</span> <span class="n">await_suspend</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">coroutine_handle</span><span class="o">&lt;</span><span class="n">promise_type</span><span class="o">&gt;</span> <span class="n">h</span><span class="p">)</span> <span class="k">noexcept</span> <span class="p">{</span>
     <span class="k">auto</span> <span class="n">continuation</span> <span class="o">=</span> <span class="n">h</span><span class="p">.</span><span class="n">promise</span><span class="p">().</span><span class="n">continuation_</span><span class="p">;</span>
     <span class="k">if</span> <span class="p">(</span><span class="n">continuation</span><span class="p">)</span> <span class="p">{</span>
         <span class="k">return</span> <span class="n">continuation</span><span class="p">;</span>
     <span class="p">}</span>
     <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">noop_coroutine</span><span class="p">();</span>
 <span class="p">}</span>
</code></pre></div>    </div>

    <p>The preceding assembly is skipped here; we only look closely at the symmetric-transfer part:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">157</span><span class="nl">b:</span>  <span class="nf">endbr64</span>
 <span class="err">157</span><span class="nl">f:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = callee_frame pointer</span>
 <span class="err">1583</span><span class="p">:</span>  <span class="nf">movw</span>   <span class="kc">$</span><span class="mh">0x4</span><span class="p">,</span><span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>     <span class="c1">; callee_frame-&gt;suspend_index = 4</span>
 <span class="err">1589</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">158</span><span class="nl">d:</span>  <span class="nf">lea</span>    <span class="mh">0x35</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>     <span class="c1">; %rdx = &amp;(callee_frame-&gt;final_awaiter)</span>
 <span class="err">1591</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1595</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>     <span class="c1">; %rax = callee's coroutine_handle (the current coroutine's handle)</span>
 <span class="err">1599</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">159</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">159</span><span class="nl">f:</span>  <span class="nf">call</span>   <span class="mi">21</span><span class="nv">e2</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call FinalAwaiter::await_suspend()</span>
 <span class="err">15</span><span class="nl">a4:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>    <span class="c1">; %rax = handle returned for the symmetric transfer (i.e. caller's coroutine_handle)</span>
 <span class="err">15</span><span class="nl">a8:</span>  <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = &amp;(caller's coroutine_handle)</span>
 <span class="err">15</span><span class="nl">ac:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">15</span><span class="nl">af:</span>  <span class="nf">call</span>   <span class="mi">1</span><span class="nv">dd4</span> <span class="o">&lt;</span><span class="nv">...address</span><span class="o">&gt;</span>   <span class="c1">; call coroutine_handle::address</span>
                                   <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">15</span><span class="nl">b4:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>         <span class="c1">; %rdx = *(caller_frame) = address of caller's resume_fn, i.e. its state machine function</span>
 <span class="err">15</span><span class="nl">b7:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>           <span class="c1">; %rdi = caller_frame pointer</span>
 <span class="err">15</span><span class="nl">ba:</span>  <span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>               <span class="c1">; ★ symmetric transfer: indirect call into the caller's state machine function ★</span>
 <span class="err">15</span><span class="nl">bc:</span>  <span class="nf">jmp</span>    <span class="mi">1698</span> <span class="o">&lt;</span><span class="nb">cl</span><span class="nv">eanup</span><span class="o">&gt;</span>
</code></pre></div>    </div>
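    <p>As a rough C++ rendering of the instructions from <code class="language-plaintext highlighter-rouge">15af</code> to <code class="language-plaintext highlighter-rouge">15ba</code>, the sketch below models a coroutine handle as a single frame pointer and a transfer as "load the first word of the frame, then call it". <code class="language-plaintext highlighter-rouge">FrameHeader</code>, <code class="language-plaintext highlighter-rouge">Handle</code>, <code class="language-plaintext highlighter-rouge">transfer_to</code> and <code class="language-plaintext highlighter-rouge">CountingFrame</code> are hypothetical names chosen for illustration; the only layout assumed is what the listings show, namely a resume function pointer at offset 0 followed by a destroy function pointer at offset 8.</p>

```cpp
#include <cassert>

// Hypothetical model of the start of a coroutine frame (names are illustrative).
struct FrameHeader {
    void (*resume_fn)(FrameHeader*);   // offset 0x0: state machine entry
    void (*destroy_fn)(FrameHeader*);  // offset 0x8: destroy entry
};

// Stand-in for a coroutine handle: it wraps one frame pointer.
struct Handle {
    FrameHeader* frame;
    void* address() const { return frame; }
};

// Models 15af..15ba: address(), load the first word, indirect call.
void transfer_to(Handle next) {
    FrameHeader* f = static_cast<FrameHeader*>(next.address());
    assert(f != nullptr && f->resume_fn != nullptr);
    void (*entry)(FrameHeader*) = f->resume_fn;  // mov (%rax),%rdx
    entry(f);                                    // mov %rax,%rdi ; call *%rdx
}

// Demo frame that counts how often it has been resumed.
struct CountingFrame {
    FrameHeader header;  // must stay the first member
    int resumed;
};

void counting_resume(FrameHeader* h) {
    // Valid because header is the first member of a standard-layout struct.
    reinterpret_cast<CountingFrame*>(h)->resumed += 1;
}
```

    <p>Because <code class="language-plaintext highlighter-rouge">transfer_to</code> uses an ordinary call, the function being transferred to runs in a new native stack frame on top of the current one.</p>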

    <p>This works the same way as the previous symmetric transfer. The return value of <code class="language-plaintext highlighter-rouge">await_suspend</code> is passed to <code class="language-plaintext highlighter-rouge">coroutine_handle::address()</code>, which yields the address of <code class="language-plaintext highlighter-rouge">caller</code>'s <code class="language-plaintext highlighter-rouge">coroutine frame</code>. The first field of that frame holds the entry address of <code class="language-plaintext highlighter-rouge">caller</code>'s state machine function <code class="language-plaintext highlighter-rouge">caller_resume</code>; it is loaded into <code class="language-plaintext highlighter-rouge">%rdx</code>, and <code class="language-plaintext highlighter-rouge">call *%rdx</code> finally jumps to <code class="language-plaintext highlighter-rouge">1838</code>. The state before and after the jump is shown below:</p>

    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Before symmetric transfer (callee -&gt; caller)

 ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ coroutine_handle.resume()               │
 │ - return addr: 0x1d10                   │ ← pushed by call at 0x1d0b
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x26c5                   │ ← pushed by call at 0x26c3
 ├─────────────────────────────────────────┤
 │ callee_resume                           │
 │ - symmetric transferred from 0x1b7c     │
 │ - return addr: 0x1b7e                   │
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 4
   - suspended

 - callee frame (__calleeFrame)
   - suspend_index = 4
   - resume_fn = nullptr

 After symmetric transfer (callee -&gt; caller)

 ┌─────────────────────────────────────────┐ ← High Address
 │ main()                                  │
 ├─────────────────────────────────────────┤
 │ coroutine_handle.resume()               │
 │ - return addr: 0x1d10                   │ ← pushed by call at 0x1d0b
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - return addr: 0x26c5                   │ ← pushed by call at 0x26c3
 ├─────────────────────────────────────────┤
 │ callee_resume                           │
 │ - symmetric transferred from 0x1b7c     │
 │ - return addr: 0x1b7e                   │
 ├─────────────────────────────────────────┤
 │ caller_resume                           │
 │ - symmetric transferred from 0x15ba     │
 │ - return addr: 0x15bc                   │
 └─────────────────────────────────────────┘ ← Low Address (rsp)

 Heap State:
 - caller frame (__callerFrame)
   - suspend_index = 4

 - callee frame (__calleeFrame)
   - suspend_index = 4
   - resume_fn = nullptr
   - suspended
</code></pre></div>    </div>

    <p>Note that <code class="language-plaintext highlighter-rouge">caller</code>'s state machine function is called again, so its prologue adjusts <code class="language-plaintext highlighter-rouge">%rbp</code> and <code class="language-plaintext highlighter-rouge">%rsp</code> once more and a new stack frame appears.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">0000000000001838</span> <span class="o">&lt;</span><span class="nf">_Z6callerPZ6callervE16_Z6callerv.Frame.actor</span><span class="o">&gt;</span><span class="p">:</span>
     <span class="err">1838</span><span class="p">:</span>	<span class="nf">endbr64</span>
     <span class="err">183</span><span class="nl">c:</span>	<span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
     <span class="err">183</span><span class="nl">d:</span>	<span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
</code></pre></div>    </div>

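    <p>To be precise, the <code class="language-plaintext highlighter-rouge">callee</code> coroutine is the one that is suspended, and the <code class="language-plaintext highlighter-rouge">caller</code> coroutine is the one that is executing. <code class="language-plaintext highlighter-rouge">caller_resume</code> shows up twice on the stack only because each symmetric transfer, which is conceptually a tail call, is emitted in this listing as an ordinary <code class="language-plaintext highlighter-rouge">call</code> rather than a <code class="language-plaintext highlighter-rouge">jmp</code>, so every transfer pushes a new frame instead of reusing the current one.</p>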
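    <p>The effect on stack depth can be illustrated with a small model. This is only a sketch: <code class="language-plaintext highlighter-rouge">Task</code>, <code class="language-plaintext highlighter-rouge">StackProbe</code>, <code class="language-plaintext highlighter-rouge">nested_resume</code> and <code class="language-plaintext highlighter-rouge">flat_resume</code> are hypothetical names, and the flat variant uses a trampoline loop as a stand-in for a real tail call, which standard C++ cannot request directly.</p>

```cpp
#include <cassert>

// Counts how many modeled resume functions are live on the native stack.
static int g_live = 0;
static int g_peak = 0;

struct StackProbe {
    StackProbe() { if (++g_live > g_peak) g_peak = g_live; }
    ~StackProbe() { --g_live; }
};

void reset_probe() {
    assert(g_live == 0);  // only reset between runs
    g_peak = 0;
}

// Hypothetical stand-in for a coroutine: it only knows whom to transfer to.
struct Task {
    Task* next;
};

// Transfer by plain call: each hop runs inside the previous one.
void nested_resume(Task* t) {
    StackProbe probe;
    if (t->next != nullptr) nested_resume(t->next);
}

// One step that returns the next task instead of calling it.
Task* run_step(Task* t) {
    StackProbe probe;
    return t->next;
}

// Trampoline: each step returns before the next one starts.
void flat_resume(Task* t) {
    while (t != nullptr) t = run_step(t);
}
```

    <p>With a chain of three tasks (standing for caller, callee, and caller again), the nested variant keeps all three activations live at once, while the trampoline never has more than one.</p>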
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">caller</code>'s state machine continues. <code class="language-plaintext highlighter-rouge">suspend_index</code> is 4 at this point, so execution jumps to the code below. Its main job is to wrap up <code class="language-plaintext highlighter-rouge">co_await callee()</code>: <code class="language-plaintext highlighter-rouge">callee</code> has finished, so the <code class="language-plaintext highlighter-rouge">callee_awaiter</code> and <code class="language-plaintext highlighter-rouge">callee_task</code> stored in the caller coroutine frame are destructed.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">a0b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">a0f:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x40</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_awaiter)</span>
 <span class="err">1</span><span class="nl">a13:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">a16:</span>  <span class="nf">call</span>   <span class="mi">2630</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call caller_frame-&gt;callee_awaiter.await_resume()</span>
 <span class="err">1</span><span class="nl">a1b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>
 <span class="err">1</span><span class="nl">a1f:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="mh">0x38</span><span class="p">(</span><span class="o">%</span><span class="nb">rdx</span><span class="p">)</span>     <span class="c1">; store the return value into caller_frame-&gt;result</span>
 <span class="err">1</span><span class="nl">a22:</span>  <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a27:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">a2b:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x40</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_awaiter)</span>
 <span class="err">1</span><span class="nl">a2f:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">a32:</span>  <span class="nf">call</span>   <span class="mi">24</span><span class="nv">d4</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call callee_awaiter's destructor</span>
 <span class="err">1</span><span class="nl">a37:</span>  <span class="nf">cmp</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a3a:</span>  <span class="nf">jne</span>    <span class="mi">1</span><span class="nv">a43</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; %ebx is 1, so the jump is not taken</span>
 <span class="err">1</span><span class="nl">a3c:</span>  <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a41:</span>  <span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">a48</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>
 <span class="err">1</span><span class="nl">a43:</span>  <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a48:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">a4c:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x48</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;callee_task)</span>
 <span class="err">1</span><span class="nl">a50:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">a53:</span>  <span class="nf">call</span>   <span class="mi">2352</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call callee_task's destructor</span>
 <span class="err">1</span><span class="nl">a58:</span>  <span class="nf">cmp</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">ebx</span>
 <span class="err">1</span><span class="nl">a5b:</span>  <span class="nf">jne</span>    <span class="mb">1b</span><span class="mi">32</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; %ebx is 1, so the jump is not taken</span>
 <span class="err">1</span><span class="nl">a61:</span>  <span class="nf">nop</span>
</code></pre></div>    </div>

    <p>Note that destructing the <code class="language-plaintext highlighter-rouge">callee_awaiter</code> in the caller coroutine frame, i.e. the <code class="language-plaintext highlighter-rouge">Awaiter</code> object produced by <code class="language-plaintext highlighter-rouge">co_await callee()</code>, releases <code class="language-plaintext highlighter-rouge">callee</code>'s coroutine frame. The call path is roughly:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">a32:</span>  <span class="nf">call</span> <span class="mi">24</span><span class="nv">d4</span>  <span class="c1">; 调用Awaiter::~Awaiter()</span>
   <span class="err">↓</span>
 <span class="err">2544</span><span class="p">:</span>  <span class="nf">call</span> <span class="mi">2770</span>  <span class="c1">; call coroutine_handle&lt;SimpleTask&lt;int&gt;::promise_type&gt;::destroy()</span>
   <span class="err">↓</span>
 <span class="err">278</span><span class="nl">e:</span>  <span class="nf">call</span> <span class="o">*%</span><span class="nb">rdx</span> <span class="c1">; callee.Frame.destroy (0x16b3)</span>
   <span class="err">↓</span>
 <span class="err">16</span><span class="nl">df:</span>  <span class="nf">call</span> <span class="mi">13</span><span class="nv">fb</span>  <span class="c1">; callee.Frame.actor (final call into the state machine function, for cleanup)</span>
   <span class="err">↓</span>
 <span class="c1">; callee's coroutine frame is released</span>
</code></pre></div>    </div>

    <p>The details follow; skip to the next step if you do not want to dig in. First, <code class="language-plaintext highlighter-rouge">coroutine_handle&lt;SimpleTask&lt;int&gt;::promise_type&gt;::destroy()</code> reads the coroutine frame pointer out of the <code class="language-plaintext highlighter-rouge">coroutine_handle</code>, then loads the destroy function pointer stored in the coroutine frame, and finally calls it.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">0000000000002770</span> <span class="o">&lt;</span><span class="nf">_ZNKSt7__n486116coroutine_handleIN10SimpleTaskIiE12promise_typeEE7destroyEv</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="err">2770</span><span class="p">:</span>  <span class="nf">endbr64</span>
 <span class="err">2774</span><span class="p">:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
 <span class="err">2775</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
 <span class="err">2778</span><span class="p">:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
 <span class="err">277</span><span class="nl">c:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span><span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">2780</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>     <span class="c1">; %rax = pointer to the coroutine_handle (this)</span>
 <span class="err">2784</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>         <span class="c1">; %rax = address of the coroutine frame</span>
                                   <span class="c1">; i.e. read the coroutine frame pointer stored in coroutine_handle</span>
                                   <span class="c1">; equivalent to calling coroutine_handle::address()</span>
 <span class="err">2787</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>      <span class="c1">; %rdx = frame-&gt;destroy_fn</span>
 <span class="err">278</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">278</span><span class="nl">e:</span>  <span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>               <span class="c1">; call destroy_fn, which is 16b3 for callee</span>
 <span class="err">2790</span><span class="p">:</span>  <span class="nf">nop</span>
 <span class="err">2791</span><span class="p">:</span>  <span class="nf">leave</span>
 <span class="err">2792</span><span class="p">:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>Next, the destroy function <code class="language-plaintext highlighter-rouge">callee(callee()::_Z6calleev.Frame*) [clone .destroy]</code> sets the lowest bit of <code class="language-plaintext highlighter-rouge">suspend_index</code> to 1 and calls the state machine function one last time to clean up.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">00000000000016</span><span class="nf">b3</span> <span class="o">&lt;</span><span class="nv">_Z6calleePZ6calleevE16_Z6calleev.Frame.destroy</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="err">16</span><span class="nl">b3:</span>  <span class="nf">endbr64</span>
 <span class="err">16</span><span class="nl">b7:</span>  <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
 <span class="err">16</span><span class="nl">b8:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
 <span class="err">16</span><span class="nl">bb:</span>  <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>
 <span class="err">16</span><span class="nl">bf:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span><span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
 <span class="err">16</span><span class="nl">c3:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">16</span><span class="nl">c7:</span>  <span class="nf">movzwl</span> <span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">eax</span>     <span class="c1">; %eax = callee_frame-&gt;suspend_index</span>
 <span class="err">16</span><span class="nl">cb:</span>  <span class="nf">or</span>     <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>           <span class="c1">; set the lowest bit of suspend_index to 1, marking the frame as being destroyed</span>
 <span class="err">16</span><span class="nl">ce:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">edx</span>
 <span class="err">16</span><span class="nl">d0:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">16</span><span class="nl">d4:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">dx</span><span class="p">,</span><span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>      <span class="c1">; callee_frame-&gt;suspend_index = 5</span>
 <span class="err">16</span><span class="nl">d8:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">16</span><span class="nl">dc:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">16</span><span class="nl">df:</span>  <span class="nf">call</span>   <span class="mi">13</span><span class="nv">fb</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; final call into the state machine function</span>
 <span class="err">16</span><span class="nl">e4:</span>  <span class="nf">leave</span>
 <span class="err">16</span><span class="nl">e5:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>Finally, the state machine function destructs the promise and calls <code class="language-plaintext highlighter-rouge">operator delete</code> to free the memory:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">00000000000013</span><span class="nf">fb</span> <span class="o">&lt;</span><span class="nv">_Z6calleePZ6calleevE16_Z6calleev.Frame.actor</span><span class="o">&gt;</span><span class="p">:</span>
 <span class="c1">; ...</span>
 <span class="err">142</span><span class="nl">b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">142</span><span class="nl">f:</span>  <span class="nf">movzwl</span> <span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">1433</span><span class="p">:</span>  <span class="nf">movzwl</span> <span class="o">%</span><span class="nb">ax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">1436</span><span class="p">:</span>  <span class="nf">cmp</span>    <span class="kc">$</span><span class="mh">0x5</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">1439</span><span class="p">:</span>  <span class="nf">je</span>     <span class="mi">15</span><span class="nv">c1</span> <span class="o">&lt;&gt;</span>             <span class="c1">; jump if suspend_index == 5</span>

 <span class="c1">; ...</span>
 <span class="err">15</span><span class="nl">c1:</span>  <span class="nf">jmp</span>    <span class="mi">15</span><span class="nv">d6</span> <span class="o">&lt;&gt;</span>

 <span class="c1">; ...</span>
 <span class="err">15</span><span class="nl">d6:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">15</span><span class="nl">da:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">15</span><span class="nl">de:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">15</span><span class="nl">e1:</span>  <span class="nf">call</span>   <span class="mi">208</span><span class="nv">a</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call the promise's destructor</span>
 <span class="err">15</span><span class="nl">e6:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">15</span><span class="nl">ea:</span>  <span class="nf">movzbl</span> <span class="mh">0x32</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">15</span><span class="nl">ee:</span>  <span class="nf">movzbl</span> <span class="o">%</span><span class="nb">al</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">15</span><span class="nl">f1:</span>  <span class="nf">test</span>   <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
 <span class="err">15</span><span class="nl">f3:</span>  <span class="nf">je</span>     <span class="mi">1698</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>
 <span class="err">15</span><span class="nl">f9:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">15</span><span class="nl">fd:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1600</span><span class="p">:</span>  <span class="nf">call</span>   <span class="mi">1130</span> <span class="o">&lt;</span><span class="nv">_ZdlPv@plt</span><span class="o">&gt;</span>   <span class="c1">; call operator delete to free the memory</span>
 <span class="err">1605</span><span class="p">:</span>  <span class="nf">jmp</span>    <span class="mi">1698</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>

 <span class="c1">; ...</span>
 <span class="err">1698</span><span class="p">:</span>  <span class="nf">nop</span>
 <span class="err">1699</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">169</span><span class="nl">d:</span>  <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">16</span><span class="nl">a6:</span>  <span class="nf">je</span>     <span class="mi">16</span><span class="nv">ad</span>
 <span class="err">16</span><span class="nl">a8:</span>  <span class="nf">call</span>   <span class="mi">1170</span> <span class="o">&lt;</span><span class="nv">__stack_chk_fail@plt</span><span class="o">&gt;</span>
 <span class="err">16</span><span class="nl">ad:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>
 <span class="err">16</span><span class="nl">b1:</span>  <span class="nf">leave</span>
 <span class="err">16</span><span class="nl">b2:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>
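    <p>The three listings above (handle <code class="language-plaintext highlighter-rouge">destroy()</code>, the <code class="language-plaintext highlighter-rouge">.destroy</code> clone, and the final pass through the state machine function) can be approximated in plain C++. This is a hand-written model, not compiler output: <code class="language-plaintext highlighter-rouge">Frame</code>, <code class="language-plaintext highlighter-rouge">Promise</code>, <code class="language-plaintext highlighter-rouge">actor</code>, <code class="language-plaintext highlighter-rouge">destroy_clone</code>, <code class="language-plaintext highlighter-rouge">handle_destroy</code> and the <code class="language-plaintext highlighter-rouge">frame_needs_free</code> flag are hypothetical names, the struct does not reproduce the real offsets, and the meaning of the byte tested at <code class="language-plaintext highlighter-rouge">15f1</code> is assumed to be "the frame was heap allocated".</p>

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Counters keep the effects observable after the frame memory is gone.
static int g_promise_destroyed = 0;
static int g_frames_freed = 0;

struct Promise {
    int value;
    Promise() : value(0) {}
    ~Promise() { ++g_promise_destroyed; }
};

struct Frame;
typedef void (*FrameFn)(Frame*);

// Hypothetical frame; field order follows the listings, offsets do not.
struct Frame {
    FrameFn resume_fn;            // 0x00 in the listing
    FrameFn destroy_fn;           // 0x08 in the listing
    Promise promise;              // 0x10 in the listing
    std::uint16_t suspend_index;  // 0x30 in the listing
    bool frame_needs_free;        // 0x32 in the listing (assumed meaning)
};

// State machine function: an odd index selects the destroy path.
void actor(Frame* f) {
    if ((f->suspend_index & 1u) != 0) {
        f->promise.~Promise();        // call 208a
        if (f->frame_needs_free) {    // test %eax,%eax at 15f1
            ::operator delete(f);     // call _ZdlPv
            ++g_frames_freed;
        }
        return;
    }
    // Even indices would resume the coroutine body; omitted in this sketch.
}

// The .destroy clone: set the low bit, then run the state machine once more.
void destroy_clone(Frame* f) {
    f->suspend_index |= 1u;  // 4 becomes 5
    actor(f);
}

// coroutine_handle::destroy(): load the destroy pointer and call it.
void handle_destroy(Frame* f) {
    assert(f != nullptr && f->destroy_fn != nullptr);
    f->destroy_fn(f);  // mov 0x8(%rax),%rdx ; call *%rdx
}

// Builds a frame that looks like callee parked at its final suspend point.
Frame* make_finished_frame() {
    void* raw = ::operator new(sizeof(Frame));
    Frame* f = new (raw) Frame();
    f->resume_fn = nullptr;
    f->destroy_fn = destroy_clone;
    f->suspend_index = 4;
    f->frame_needs_free = true;
    return f;
}
```

    <p>After <code class="language-plaintext highlighter-rouge">handle_destroy</code> returns, the frame pointer is dangling, which is why the model records its effects in global counters instead of reading the frame again.</p>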

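    <p>The stack now looks as follows: callee's coroutine frame no longer exists, yet its state machine function is still on the stack. How can that function keep running once the coroutine frame is gone? The part of callee's state machine function that has not executed yet is only cleanup logic that returns quickly and never reads the coroutine frame again.</p>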

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nf">After</span> <span class="nv">callee</span> <span class="nv">coroutine</span> <span class="nv">frame</span> <span class="nv">destructed</span>

 <span class="err">┌─────────────────────────────────────────┐</span> <span class="err">←</span> <span class="nf">High</span> <span class="nv">Address</span>
 <span class="err">│</span> <span class="nf">main</span><span class="p">()</span>                                  <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">coroutine_handle.resume</span><span class="p">()</span>               <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1d10</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x1d0b</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x26c5</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x26c3</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">callee_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">symmetric</span> <span class="nv">transferred</span> <span class="nv">from</span> <span class="mh">0x1b7c</span>     <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1b7e</span>                   <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">symmetric</span> <span class="nv">transferred</span> <span class="nv">from</span> <span class="mh">0x15ba</span>     <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x15bc</span>                   <span class="err">│</span>
 <span class="err">└─────────────────────────────────────────┘</span> <span class="err">←</span> <span class="nf">Low</span> <span class="nv">Address</span> <span class="p">(</span><span class="nb">rsp</span><span class="p">)</span>

 <span class="nf">Heap</span> <span class="nv">State</span><span class="p">:</span>
 <span class="o">-</span> <span class="nf">caller</span> <span class="nv">frame</span> <span class="p">(</span><span class="nv">__callerFrame</span><span class="p">)</span>
   <span class="o">-</span> <span class="nf">suspend_index</span> <span class="err">=</span> <span class="mi">4</span>
   <span class="o">-</span> <span class="nf">callee_awaiter</span> <span class="nv">destructed</span>
   <span class="o">-</span> <span class="nf">callee_task</span> <span class="nv">destructed</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>The <code class="language-plaintext highlighter-rouge">caller</code> coroutine then executes <code class="language-plaintext highlighter-rouge">std::cout &lt;&lt; "  &gt; caller: result = " &lt;&lt; result &lt;&lt; "\n";</code>; the corresponding assembly is omitted. Finally the <code class="language-plaintext highlighter-rouge">caller</code> coroutine returns <code class="language-plaintext highlighter-rouge">result * 2</code>, which corresponds to the following assembly:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">aa4:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">aa8:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;promise.value)</span>
 <span class="err">1</span><span class="nl">aac:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>    <span class="c1">; %rdx = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">ab0:</span>  <span class="nf">mov</span>    <span class="mh">0x38</span><span class="p">(</span><span class="o">%</span><span class="nb">rdx</span><span class="p">),</span><span class="o">%</span><span class="nb">edx</span>     <span class="c1">; %edx = caller_frame-&gt;result</span>
 <span class="err">1</span><span class="nl">ab3:</span>  <span class="nf">add</span>    <span class="o">%</span><span class="nb">edx</span><span class="p">,</span><span class="o">%</span><span class="nb">edx</span>           <span class="c1">; result * 2</span>
 <span class="err">1</span><span class="nl">ab5:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">edx</span><span class="p">,</span><span class="o">%</span><span class="nb">esi</span>
 <span class="err">1</span><span class="nl">ab7:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">aba:</span>  <span class="nf">call</span>   <span class="mi">20</span><span class="nv">c4</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; caller_frame-&gt;promise.return_value()</span>
 <span class="err">1</span><span class="nl">abf:</span>  <span class="nf">nop</span>
</code></pre></div>    </div>
  </li>
</ol>

<h3 id="caller--main">caller → main</h3>

<ol>
  <li>
    <p>Next, the <code class="language-plaintext highlighter-rouge">caller</code> coroutine enters the <code class="language-plaintext highlighter-rouge">co_await final_suspend</code> stage.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">ac0:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">ac4:</span>  <span class="nf">movq</span>   <span class="kc">$</span><span class="mh">0x0</span><span class="p">,(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>         <span class="c1">; caller_frame-&gt;resume_fn = nullptr</span>
                                   <span class="c1">; marks the caller coroutine as done: its state-machine function must not be called again</span>
 <span class="err">1</span><span class="nl">acb:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">acf:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;promise)</span>
 <span class="err">1</span><span class="nl">ad3:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">ad6:</span>  <span class="nf">call</span>   <span class="mi">21</span><span class="nv">be</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call SimpleTask&lt;int&gt;::promise_type::final_suspend()</span>
 <span class="err">1</span><span class="nl">adb:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">adf:</span>  <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x50</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>          <span class="c1">; %rax = &amp;(caller_frame-&gt;final_awaiter)</span>
 <span class="err">1</span><span class="nl">ae3:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">ae6:</span>  <span class="nf">call</span>   <span class="mi">21</span><span class="nv">ce</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call final_awaiter.await_ready(), which returns false</span>
 <span class="err">1</span><span class="nl">aeb:</span>  <span class="nf">xor</span>    <span class="kc">$</span><span class="mh">0x1</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>           <span class="c1">; after negation, %al = 1</span>
 <span class="err">1</span><span class="nl">aee:</span>  <span class="nf">test</span>   <span class="o">%</span><span class="nb">al</span><span class="p">,</span><span class="o">%</span><span class="nb">al</span>
 <span class="err">1</span><span class="nl">af0:</span>  <span class="nf">je</span>     <span class="mb">1b</span><span class="mi">1</span><span class="nv">f</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; await_ready returned false, so the jump is not taken</span>
 <span class="err">1</span><span class="nl">af2:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">af6:</span>  <span class="nf">movw</span>   <span class="kc">$</span><span class="mh">0x6</span><span class="p">,</span><span class="mh">0x30</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>     <span class="c1">; caller_frame-&gt;suspend_index = 6</span>
 <span class="err">1</span><span class="nl">afc:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">b00:</span>  <span class="nf">lea</span>    <span class="mh">0x50</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>     <span class="c1">; %rdx = &amp;(caller_frame-&gt;final_awaiter)</span>
 <span class="err">1</span><span class="nl">b04:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = caller_frame pointer</span>
 <span class="err">1</span><span class="nl">b08:</span>  <span class="nf">mov</span>    <span class="mh">0x28</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>     <span class="c1">; %rax = caller's coroutine handle</span>
 <span class="err">1</span><span class="nl">b0c:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
 <span class="err">1</span><span class="nl">b0f:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">b12:</span>  <span class="nf">call</span>   <span class="mi">21</span><span class="nv">e2</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call final_awaiter.await_suspend()</span>
 <span class="err">1</span><span class="nl">b17:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>    <span class="c1">; %rax = std::noop_coroutine</span>
 <span class="err">1</span><span class="nl">b1b:</span>  <span class="nf">jmp</span>    <span class="mb">1b</span><span class="mi">66</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>

</code></pre></div>    </div>

    <p>Because the <code class="language-plaintext highlighter-rouge">caller</code> coroutine has no <code class="language-plaintext highlighter-rouge">continuation</code> set, <code class="language-plaintext highlighter-rouge">final_awaiter.await_suspend()</code> returns <code class="language-plaintext highlighter-rouge">std::noop_coroutine</code>. Control then jumps once more to <code class="language-plaintext highlighter-rouge">1b66</code>, the common symmetric-transfer sequence described in step 4.</p>
  </li>
  <li>
    <p><code class="language-plaintext highlighter-rouge">std::noop_coroutine</code> is essentially a dummy coroutine frame. As described in the <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0913r0.html">proposal</a>, its <code class="language-plaintext highlighter-rouge">address()</code> function returns a non-null pointer, but its state-machine function does nothing, so control goes straight to the epilogue of the <code class="language-plaintext highlighter-rouge">caller</code> state-machine function.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">b66:</span>  <span class="nf">endbr64</span>
 <span class="err">1</span><span class="nl">b6a:</span>  <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; %rax = &amp;(noop coroutine_handle)</span>
 <span class="err">1</span><span class="nl">b6e:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">b71:</span>  <span class="nf">call</span>   <span class="mi">1</span><span class="nv">dd4</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; call address() on the noop_coroutine handle; returns a pointer to a dummy coroutine frame</span>
 <span class="err">1</span><span class="nl">b76:</span>  <span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>         <span class="c1">; %rdx = dummy coroutine's resume_fn (function address)</span>
 <span class="err">1</span><span class="nl">b79:</span>  <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
 <span class="err">1</span><span class="nl">b7c:</span>  <span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>               <span class="c1">; call the dummy coroutine's state-machine function</span>
                                   <span class="c1">; which effectively does nothing</span>
 <span class="err">1</span><span class="nl">b7e:</span>  <span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">c46</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; jump to 1c46 and return</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>Finally, the <code class="language-plaintext highlighter-rouge">caller</code> state-machine function returns:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="c1">; epilogue</span>
 <span class="err">1</span><span class="nl">c46:</span>  <span class="nf">nop</span>
 <span class="err">1</span><span class="nl">c47:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1</span><span class="nl">c4b:</span>  <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">1</span><span class="nl">c54:</span>  <span class="nf">je</span>     <span class="mi">1</span><span class="nv">c5b</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>
 <span class="err">1</span><span class="nl">c56:</span>  <span class="nf">call</span>   <span class="mi">1170</span> <span class="o">&lt;</span><span class="nv">__stack_chk_fail@plt</span><span class="o">&gt;</span>
 <span class="err">1</span><span class="nl">c5b:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>
 <span class="err">1</span><span class="nl">c5f:</span>  <span class="nf">leave</span>
 <span class="err">1</span><span class="nl">c60:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>The state after <code class="language-plaintext highlighter-rouge">ret</code> is:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">┌─────────────────────────────────────────┐</span> <span class="err">←</span> <span class="nf">High</span> <span class="nv">Address</span>
 <span class="err">│</span> <span class="nf">main</span><span class="p">()</span>                                  <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">coroutine_handle.resume</span><span class="p">()</span>               <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1d10</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x1d0b</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x26c5</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x26c3</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">callee_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">symmetric</span> <span class="nv">transferred</span> <span class="nv">from</span> <span class="mh">0x1b7c</span>     <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1b7e</span>                   <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">symmetric</span> <span class="nv">transferred</span> <span class="nv">from</span> <span class="mh">0x15ba</span>     <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x15bc</span>                   <span class="err">│</span> <span class="o">&lt;-</span> <span class="nv">cpu</span> <span class="nv">is</span> <span class="nv">here</span>
 <span class="err">└─────────────────────────────────────────┘</span> <span class="err">←</span> <span class="nf">Low</span> <span class="nv">Address</span> <span class="p">(</span><span class="nb">rsp</span><span class="p">)</span>

 <span class="nf">Heap</span> <span class="nv">State</span><span class="p">:</span>
 <span class="o">-</span> <span class="nf">caller</span> <span class="nv">frame</span> <span class="p">(</span><span class="nv">__callerFrame</span><span class="p">)</span>
   <span class="o">-</span> <span class="nf">suspend_index</span> <span class="err">=</span> <span class="mi">6</span>
   <span class="o">-</span> <span class="nf">callee_awaiter</span> <span class="nv">destructed</span>
   <span class="o">-</span> <span class="nf">callee_task</span> <span class="nv">destructed</span>
</code></pre></div>    </div>

    <p>The next instruction the CPU executes is at the return address <code class="language-plaintext highlighter-rouge">15bc</code>, and shortly afterwards it reaches another <code class="language-plaintext highlighter-rouge">ret</code> at <code class="language-plaintext highlighter-rouge">16b2</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">15</span><span class="nl">bc:</span>  <span class="nf">jmp</span>    <span class="mi">1698</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>

 <span class="c1">; ...</span>
 <span class="err">1698</span><span class="p">:</span>  <span class="nf">nop</span>
 <span class="err">1699</span><span class="p">:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x18</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">169</span><span class="nl">d:</span>  <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
 <span class="err">16</span><span class="nl">a6:</span>  <span class="nf">je</span>     <span class="mi">16</span><span class="nv">ad</span>
 <span class="err">16</span><span class="nl">a8:</span>  <span class="nf">call</span>   <span class="mi">1170</span> <span class="o">&lt;</span><span class="nv">__stack_chk_fail@plt</span><span class="o">&gt;</span>
 <span class="err">16</span><span class="nl">ad:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>
 <span class="err">16</span><span class="nl">b1:</span>  <span class="nf">leave</span>
 <span class="err">16</span><span class="nl">b2:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>The state after <code class="language-plaintext highlighter-rouge">ret</code> is:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">┌─────────────────────────────────────────┐</span> <span class="err">←</span> <span class="nf">High</span> <span class="nv">Address</span>
 <span class="err">│</span> <span class="nf">main</span><span class="p">()</span>                                  <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">coroutine_handle.resume</span><span class="p">()</span>               <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1d10</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x1d0b</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x26c5</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x26c3</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">callee_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">symmetric</span> <span class="nv">transferred</span> <span class="nv">from</span> <span class="mh">0x1b7c</span>     <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1b7e</span>                   <span class="err">│</span> <span class="o">&lt;-</span> <span class="nv">cpu</span> <span class="nv">is</span> <span class="nv">here</span>
 <span class="err">└─────────────────────────────────────────┘</span> <span class="err">←</span> <span class="nf">Low</span> <span class="nv">Address</span> <span class="p">(</span><span class="nb">rsp</span><span class="p">)</span>

 <span class="nf">Heap</span> <span class="nv">State</span><span class="p">:</span>
 <span class="o">-</span> <span class="nf">caller</span> <span class="nv">frame</span> <span class="p">(</span><span class="nv">__callerFrame</span><span class="p">)</span>
   <span class="o">-</span> <span class="nf">suspend_index</span> <span class="err">=</span> <span class="mi">6</span>
   <span class="o">-</span> <span class="nf">callee_awaiter</span> <span class="nv">destructed</span>
   <span class="o">-</span> <span class="nf">callee_task</span> <span class="nv">destructed</span>
</code></pre></div>    </div>

    <p>The next instruction the CPU executes is at the return address <code class="language-plaintext highlighter-rouge">1b7e</code>, which jumps to 1c46 again (the code listed above) and executes another <code class="language-plaintext highlighter-rouge">ret</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">b7e:</span>  <span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">c46</span> <span class="o">&lt;</span><span class="nv">...</span><span class="o">&gt;</span>          <span class="c1">; jump to 1c46 and return</span>
</code></pre></div>    </div>

    <p>The state after <code class="language-plaintext highlighter-rouge">ret</code> is:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">┌─────────────────────────────────────────┐</span> <span class="err">←</span> <span class="nf">High</span> <span class="nv">Address</span>
 <span class="err">│</span> <span class="nf">main</span><span class="p">()</span>                                  <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">coroutine_handle.resume</span><span class="p">()</span>               <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1d10</span>                   <span class="err">│</span> <span class="err">←</span> <span class="nv">pushed</span> <span class="nv">by</span> <span class="nv">call</span> <span class="nv">at</span> <span class="mh">0x1d0b</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">caller_resume</span>                           <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x26c5</span>                   <span class="err">│</span> <span class="o">&lt;-</span> <span class="nv">cpu</span> <span class="nv">is</span> <span class="nv">here</span>
 <span class="err">└─────────────────────────────────────────┘</span> <span class="err">←</span> <span class="nf">Low</span> <span class="nv">Address</span> <span class="p">(</span><span class="nb">rsp</span><span class="p">)</span>

 <span class="nf">Heap</span> <span class="nv">State</span><span class="p">:</span>
 <span class="o">-</span> <span class="nf">caller</span> <span class="nv">frame</span> <span class="p">(</span><span class="nv">__callerFrame</span><span class="p">)</span>
   <span class="o">-</span> <span class="nf">suspend_index</span> <span class="err">=</span> <span class="mi">6</span>
   <span class="o">-</span> <span class="nf">callee_awaiter</span> <span class="nv">destructed</span>
   <span class="o">-</span> <span class="nf">callee_task</span> <span class="nv">destructed</span>
</code></pre></div>    </div>

    <p>The next instruction the CPU executes is at the return address <code class="language-plaintext highlighter-rouge">26c5</code>, followed by yet another <code class="language-plaintext highlighter-rouge">ret</code>.</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">26</span><span class="nl">c5:</span>  <span class="nf">nop</span>
 <span class="err">26</span><span class="nl">c6:</span>  <span class="nf">leave</span>
 <span class="err">26</span><span class="nl">c7:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>

    <p>The state after <code class="language-plaintext highlighter-rouge">ret</code> is:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">┌─────────────────────────────────────────┐</span> <span class="err">←</span> <span class="nf">High</span> <span class="nv">Address</span>
 <span class="err">│</span> <span class="nf">main</span><span class="p">()</span>                                  <span class="err">│</span>
 <span class="err">├─────────────────────────────────────────┤</span>
 <span class="err">│</span> <span class="nf">coroutine_handle.resume</span><span class="p">()</span>               <span class="err">│</span>
 <span class="err">│</span> <span class="o">-</span> <span class="nf">return</span> <span class="nv">addr</span><span class="p">:</span> <span class="mh">0x1d10</span>                   <span class="err">│</span> <span class="o">&lt;-</span> <span class="nv">cpu</span> <span class="nv">is</span> <span class="nv">here</span>
 <span class="err">└─────────────────────────────────────────┘</span> <span class="err">←</span> <span class="nf">Low</span> <span class="nv">Address</span> <span class="p">(</span><span class="nb">rsp</span><span class="p">)</span>

 <span class="nf">Heap</span> <span class="nv">State</span><span class="p">:</span>
 <span class="o">-</span> <span class="nf">caller</span> <span class="nv">frame</span> <span class="p">(</span><span class="nv">__callerFrame</span><span class="p">)</span>
   <span class="o">-</span> <span class="nf">suspend_index</span> <span class="err">=</span> <span class="mi">6</span>
   <span class="o">-</span> <span class="nf">callee_awaiter</span> <span class="nv">destructed</span>
   <span class="o">-</span> <span class="nf">callee_task</span> <span class="nv">destructed</span>
</code></pre></div>    </div>

    <p>At this point the <code class="language-plaintext highlighter-rouge">caller</code> coroutine has finished as well, and its coroutine frame is released next. The call path is roughly as follows:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">d65:</span> <span class="nf">call</span> <span class="mi">2352</span>  <span class="c1">; destroy the caller coroutine's return object: SimpleTask::~SimpleTask()</span>
   <span class="err">↓</span>
 <span class="err">23</span><span class="nl">db:</span> <span class="nf">call</span> <span class="mi">2770</span>  <span class="c1">; call coroutine_handle&lt;SimpleTask&lt;int&gt;::promise_type&gt;::destroy()</span>
   <span class="err">↓</span>
 <span class="err">278</span><span class="nl">e:</span> <span class="nf">call</span> <span class="o">*%</span><span class="nb">rdx</span> <span class="c1">; caller.Frame.destroy (0x1c61)</span>
   <span class="err">↓</span>
 <span class="err">1</span><span class="nl">c8d:</span> <span class="nf">call</span> <span class="mi">1838</span>  <span class="c1">; caller.Frame.actor (final call into the state-machine function, for cleanup)</span>
   <span class="err">↓</span>
 <span class="nf">free</span> <span class="nv">caller's</span> <span class="nv">coroutine</span> <span class="nv">frame</span>
</code></pre></div>    </div>

    <p>After this series of operations, control finally returns to <code class="language-plaintext highlighter-rouge">main</code>:</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="err">1</span><span class="nl">db4:</span>  <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rbx</span>
 <span class="err">1</span><span class="nl">db8:</span>  <span class="nf">leave</span>
 <span class="err">1</span><span class="nl">db9:</span>  <span class="nf">ret</span>
</code></pre></div>    </div>
  </li>
</ol>

<h2 id="asymmetric-transfer-vs-symmetric-transfer">Asymmetric transfer vs Symmetric transfer</h2>

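<p>I hope the lengthy walkthrough above has not scared you away. It took me almost a week to work through the assembly of a demo that is only about 100 lines long. Still, reading the assembly undeniably deepened my understanding of how a coroutine executes from start to finish. To wrap up, let's summarize the essence of asymmetric and symmetric transfer.</p>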

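<p>With asymmetric transfer, every <code class="language-plaintext highlighter-rouge">coroutine_handle.resume()</code> is an ordinary function call. As analysed at the beginning of this post, if a coroutine in turn calls <code class="language-plaintext highlighter-rouge">coroutine_handle.resume()</code> on another coroutine, the stack depth grows linearly and can eventually lead to a stack overflow.</p>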

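<p>With symmetric transfer, the compiler emits the sequence shown below. Although the transfer is emitted here as an indirect call in tail position, <code class="language-plaintext highlighter-rouge">call *%rdx</code>, which still deepens the stack, its return address, i.e. the instruction right after <code class="language-plaintext highlighter-rouge">call *%rdx</code>, always jumps to a piece of cleanup logic.</p>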

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">endbr64</span>
<span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x20</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>    <span class="c1">; rax = &amp;(coroutine_handle)</span>
<span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
<span class="nf">call</span>   <span class="mi">1</span><span class="nv">dd4</span> <span class="o">&lt;</span><span class="nv">...address</span><span class="o">&gt;</span>   <span class="c1">; call coroutine_handle::address</span>
                           <span class="c1">; %rax = frame pointer</span>
<span class="nf">mov</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>         <span class="c1">; rdx = *(frame pointer) = address of resume_fn, i.e. the state machine function</span>
<span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>           <span class="c1">; rdi = frame pointer</span>
<span class="nf">call</span>   <span class="o">*%</span><span class="nb">rdx</span>               <span class="c1">; indirect call into the target coroutine's state machine function</span>
<span class="nf">jmp</span>    <span class="mi">1</span><span class="nv">c46</span> <span class="o">&lt;</span><span class="nb">cl</span><span class="nv">eanup</span><span class="o">&gt;</span>
</code></pre></div></div>

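<p>However, symmetric transfer as compiled here does not completely solve the problem of linearly growing stack depth. For example, if A co_awaits B, B co_awaits C, C co_awaits D, and so on, the <code class="language-plaintext highlighter-rouge">call *%rdx</code> in the symmetric transfer also grows the stack, so a stack overflow cannot be ruled out entirely.</p>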

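<p>The advantage of symmetric transfer is that, although the stack does grow, once a coroutine finishes it can be cleaned up quickly via <code class="language-plaintext highlighter-rouge">jmp cleanup</code> followed by <code class="language-plaintext highlighter-rouge">ret</code>, instead of returning layer by layer as in asymmetric transfer.</p>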

<p>That wraps up this basic introduction to coroutines. If there is a next post, it will look into the <code class="language-plaintext highlighter-rouge">folly::coro::Task</code> coroutine library.</p>

<h2 id="reference">Reference</h2>

<p><a href="https://lewissbaker.github.io/2020/05/11/understanding_symmetric_transfer">C++ Coroutines: Understanding Symmetric Transfer | Asymmetric Transfer</a></p>

<p><a href="https://eel.is/c++draft/expr.await#5.1.1">[expr.await]</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[Asymmetric transfer vs Symmetric transfer.]]></summary></entry><entry><title type="html">C++ object model from assembly’s prospective</title><link href="/%E5%AD%A6%E4%B9%A0/C++-object-model-from-assembly's-prospective/" rel="alternate" type="text/html" title="C++ object model from assembly’s prospective" /><published>2025-09-15T00:00:00+08:00</published><updated>2025-09-15T00:00:00+08:00</updated><id>/%E5%AD%A6%E4%B9%A0/C++%20object%20model%20from%20assembly&apos;s%20prospective</id><content type="html" xml:base="/%E5%AD%A6%E4%B9%A0/C++-object-model-from-assembly&apos;s-prospective/"><![CDATA[<p>Building on the CppCon 2015 talk Intro to the C++ Object Model, this post takes another look at the C++ object model from the assembly point of view. My environment is Linux with gcc 13.3.</p>

<h3 id="layout-of-a-class">Layout of a class</h3>

<p>First, let's look at the layout of a class that has only data members.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
</span>
<span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
  <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
  <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">Complex</span> <span class="n">c</span><span class="p">;</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"sizeof(Complex): %ld</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">Complex</span><span class="p">));</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"address of c: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">c</span><span class="p">);</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"address of c.real: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">c</span><span class="p">.</span><span class="n">real</span><span class="p">);</span>
  <span class="n">printf</span><span class="p">(</span><span class="s">"address of c.imag: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">c</span><span class="p">.</span><span class="n">imag</span><span class="p">);</span>
<span class="p">}</span>

</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">sizeof</span><span class="p">(</span><span class="n">Complex</span><span class="p">)</span><span class="o">:</span> <span class="mi">8</span>
<span class="n">address</span> <span class="n">of</span> <span class="n">c</span><span class="o">:</span> <span class="mh">0x7ffc1d73d930</span>
<span class="n">address</span> <span class="n">of</span> <span class="n">c</span><span class="p">.</span><span class="n">real</span><span class="o">:</span> <span class="mh">0x7ffc1d73d930</span>
<span class="n">address</span> <span class="n">of</span> <span class="n">c</span><span class="p">.</span><span class="n">imag</span><span class="o">:</span> <span class="mh">0x7ffc1d73d934</span>
</code></pre></div></div>

<p>Regarding the data members of a class, the standard says the following:</p>

<p>From Working Draft N4431, Clause 9.2, Note 13:</p>

<blockquote>
  <p>Nonstatic data members of a (non-union) class with the same access control (Clause <a href="https://timsong-cpp.github.io/cppwp/std11/class.access">[class.access]</a>) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (<a href="https://timsong-cpp.github.io/cppwp/std11/class.access">[class.access]</a>). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (<a href="https://timsong-cpp.github.io/cppwp/std11/class.virtual">[class.virtual]</a>) and virtual base classes (<a href="https://timsong-cpp.github.io/cppwp/std11/class.mi">[class.mi]</a>).</p>
</blockquote>
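
<p>The ordering rule quoted above can also be checked at compile time. The following is a minimal sketch, assuming gcc on x86-64 Linux where two adjacent <code class="language-plaintext highlighter-rouge">float</code> members need no padding; the helper <code class="language-plaintext highlighter-rouge">member_distance</code> is a made-up name for illustration and does not come from the talk or the standard.</p>

```cpp
#include <cstddef>

struct Complex {
  float real;
  float imag;
};

// Complex is a standard-layout type, so offsetof is well-defined here.
static_assert(offsetof(Complex, real) == 0,
              "the first member sits at the start of the object");
static_assert(offsetof(Complex, real) < offsetof(Complex, imag),
              "a later member gets a higher address");

// Assumption: no padding between the two floats (true for gcc on x86-64).
static_assert(sizeof(Complex) == 2 * sizeof(float), "no padding expected");

// Hypothetical helper: distance in bytes between the two members of one object.
inline std::ptrdiff_t member_distance(const Complex& c) {
  return reinterpret_cast<const char*>(&c.imag) -
         reinterpret_cast<const char*>(&c.real);
}
```

<p>Under these assumptions, <code class="language-plaintext highlighter-rouge">member_distance</code> returns <code class="language-plaintext highlighter-rouge">sizeof(float)</code> for any <code class="language-plaintext highlighter-rouge">Complex</code> object, which matches the 4-byte gap between the two printed addresses.</p>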

<p>With gcc 13.3 at -O0, the generated assembly is as follows:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nf">endbr64</span>
    <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x10</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>

    <span class="c1">; save the canary on the stack</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
    <span class="nf">xor</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>

    <span class="c1">; print sizeof(Complex)</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x8</span><span class="p">,</span><span class="o">%</span><span class="nb">esi</span>
    <span class="nf">lea</span>    <span class="mh">0xe74</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of c, which is -0x10(%rbp)</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x10</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe6f</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of c.real, which is -0x10(%rbp)</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x10</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe66</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of c.imag, which is -0x10(%rbp) + 4</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x10</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x4</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe5e</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; check whether the canary has been modified</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>
    <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rdx</span>
    <span class="nf">je</span>     <span class="mh">0x55555555520b</span> <span class="o">&lt;</span><span class="nv">main</span><span class="o">+</span><span class="mi">162</span><span class="o">&gt;</span>
    <span class="nf">call</span>   <span class="mh">0x555555555060</span> <span class="o">&lt;</span><span class="nv">__stack_chk_fail@plt</span><span class="o">&gt;</span>

    <span class="nf">leave</span>
    <span class="nf">ret</span>
</code></pre></div></div>

<p>The corresponding memory layout is shown below:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="nf">return</span> <span class="nv">address</span> <span class="nv">of</span> <span class="nv">main</span><span class="err">'</span><span class="nv">s</span> <span class="nv">caller</span> <span class="o">|</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="nf">main</span><span class="err">'</span><span class="nv">s</span> <span class="nv">caller</span> <span class="o">%</span><span class="nb">rbp</span>              <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="nb">rbp</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="nf">canary</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="nb">rbp</span> <span class="o">-</span> <span class="mi">8</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="nf">c.imag</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="nb">rbp</span> <span class="o">-</span> <span class="mi">12</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="nf">c.real</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="nb">rbp</span> <span class="o">-</span> <span class="mi">16</span> <span class="err">=</span> <span class="o">%</span><span class="nb">rsp</span>
<span class="o">+---------------------------------+</span>
</code></pre></div></div>

<h3 id="x86-64-system-v-abi-calling-convention">x86-64 System V ABI calling convention</h3>

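<p>Here is a quick review of the part of the x86-64 calling convention that concerns <code class="language-plaintext highlighter-rouge">%rsp</code>. In the x86-64 System V ABI, the stack pointer <code class="language-plaintext highlighter-rouge">%rsp</code> must be 16-byte aligned at a function call:</p>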

<ul>
  <li>
    <p>The caller is responsible for ensuring that <code class="language-plaintext highlighter-rouge">%rsp</code> is 16-byte aligned when the <code class="language-plaintext highlighter-rouge">call</code> instruction executes</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nl">caller:</span>
      <span class="c1">; ...</span>
      <span class="nf">call</span> <span class="nv">callee</span>          <span class="c1">; at the call, the caller guarantees %rsp % 16 = 0</span>
</code></pre></div>    </div>
  </li>
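  <li>At the moment of the call, the caller guarantees <code class="language-plaintext highlighter-rouge">%rsp % 16 = 0</code></li>
  <li>Executing the call instruction pushes the address of the caller's next instruction onto the stack, so at this point <code class="language-plaintext highlighter-rouge">%rsp % 16 = 8</code></li>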
  <li>
    <p>A callee usually starts with a prologue whose first instruction is <code class="language-plaintext highlighter-rouge">pushq %rbp</code>, after which <code class="language-plaintext highlighter-rouge">%rsp % 16 = 0</code> holds again</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nl">callee:</span>
      <span class="nf">pushq</span> <span class="o">%</span><span class="nb">rbp</span>
      <span class="nf">movq</span> <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
      <span class="c1">; ...</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p>The callee ends with an epilogue; once it has run and control is back in the caller, <code class="language-plaintext highlighter-rouge">%rsp % 16 = 0</code> holds again</p>

    <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="nf">leave</span>
  <span class="nf">ret</span>
</code></pre></div>    </div>
  </li>
</ul>
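
<p>The arithmetic behind these steps can be written out as a small compile-time model. This is only a sketch: the functions below are hypothetical names that mimic what each instruction does to <code class="language-plaintext highlighter-rouge">%rsp</code>, and <code class="language-plaintext highlighter-rouge">kCallerRsp</code> is an arbitrary 16-byte-aligned example address, not a value taken from a real run.</p>

```cpp
#include <cstdint>

// Size of one stack slot: a return address or a saved %rbp.
constexpr std::uint64_t kWord = 8;

// call pushes the return address, so %rsp drops by 8.
constexpr std::uint64_t after_call(std::uint64_t rsp) { return rsp - kWord; }
// pushq %rbp drops %rsp by another 8.
constexpr std::uint64_t after_push_rbp(std::uint64_t rsp) { return rsp - kWord; }
// A pop raises %rsp by 8; leave pops %rbp, ret pops the return address.
constexpr std::uint64_t after_pop(std::uint64_t rsp) { return rsp + kWord; }

// Hypothetical caller-side stack pointer, chosen to be 16-byte aligned.
constexpr std::uint64_t kCallerRsp = 0x7ffc1d73d940;

static_assert(kCallerRsp % 16 == 0, "aligned at the call instruction");
static_assert(after_call(kCallerRsp) % 16 == 8, "misaligned by 8 on entry to the callee");
static_assert(after_push_rbp(after_call(kCallerRsp)) % 16 == 0,
              "aligned again after pushq %rbp");
static_assert(after_pop(after_pop(after_push_rbp(after_call(kCallerRsp)))) == kCallerRsp,
              "leave and ret restore the caller's %rsp");
```

<p>The model ignores any space the callee reserves for locals, which is safe as long as that space is released again before <code class="language-plaintext highlighter-rouge">ret</code>.</p>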

<p>Even without such a prologue, <code class="language-plaintext highlighter-rouge">%rsp</code> must still be kept 16-byte aligned. For example, gcc 13.3 with -Og on Compiler Explorer generates the following assembly:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">.LC0:</span>
        <span class="nf">.string</span> <span class="err">"</span><span class="nb">si</span><span class="nv">zeof</span><span class="p">(</span><span class="nv">Complex</span><span class="p">):</span> <span class="o">%</span><span class="nv">ld</span><span class="err">\</span><span class="nv">n</span><span class="err">"</span>
<span class="nl">.LC1:</span>
        <span class="nf">.string</span> <span class="err">"</span><span class="nv">address</span> <span class="nv">of</span> <span class="nv">c</span><span class="p">:</span> <span class="o">%</span><span class="nv">p</span><span class="err">\</span><span class="nv">n</span><span class="err">"</span>
<span class="nl">.LC2:</span>
        <span class="nf">.string</span> <span class="err">"</span><span class="nv">address</span> <span class="nv">of</span> <span class="nv">c.real</span><span class="p">:</span> <span class="o">%</span><span class="nv">p</span><span class="err">\</span><span class="nv">n</span><span class="err">"</span>
<span class="nl">.LC3:</span>
        <span class="nf">.string</span> <span class="err">"</span><span class="nv">address</span> <span class="nv">of</span> <span class="nv">c.imag</span><span class="p">:</span> <span class="o">%</span><span class="nv">p</span><span class="err">\</span><span class="nv">n</span><span class="err">"</span>
<span class="nl">main:</span>
        <span class="nf">subq</span>    <span class="kc">$</span><span class="mi">24</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>

        <span class="c1">; print sizeof(Complex)</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">8</span><span class="p">,</span> <span class="o">%</span><span class="nb">esi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">.LC0</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">call</span>    <span class="nv">printf</span>

        <span class="c1">; print address of c</span>
        <span class="nf">leaq</span>    <span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rsp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rsi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">.LC1</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">call</span>    <span class="nv">printf</span>

        <span class="c1">; print address of c.real</span>
        <span class="nf">leaq</span>    <span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rsp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rsi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">.LC2</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">call</span>    <span class="nv">printf</span>

        <span class="c1">; print address of c.imag</span>
        <span class="nf">leaq</span>    <span class="mi">12</span><span class="p">(</span><span class="o">%</span><span class="nb">rsp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rsi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">.LC3</span><span class="p">,</span> <span class="o">%</span><span class="nb">edi</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">call</span>    <span class="nv">printf</span>
        <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
        <span class="nf">addq</span>    <span class="kc">$</span><span class="mi">24</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>
        <span class="nf">ret</span>
</code></pre></div></div>

<p>Note that <code class="language-plaintext highlighter-rouge">main</code> does not push <code class="language-plaintext highlighter-rouge">%rbp</code>; instead it adjusts <code class="language-plaintext highlighter-rouge">%rsp</code> to keep it 16-byte aligned. On entry to <code class="language-plaintext highlighter-rouge">main</code>, <code class="language-plaintext highlighter-rouge">%rsp % 16 = 8</code>. Since Complex needs 8 bytes, in theory <code class="language-plaintext highlighter-rouge">subq  $8, %rsp</code> would be enough to both align <code class="language-plaintext highlighter-rouge">%rsp</code> to 16 bytes and allocate the local variable. gcc 13.3 with -Og allocates 24 bytes instead, which also meets both requirements but leaves 16 bytes unused. The corresponding memory layout is shown below:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+---------------------------------+
| return address of main's caller |  &lt;- %rsp + 24
+---------------------------------+
| unused                          |  &lt;- %rsp + 16
+---------------------------------+
| c.imag                          |  &lt;- %rsp + 12
+---------------------------------+
| c.real                          |  &lt;- %rsp + 8
+---------------------------------+
| unused                          |  &lt;- %rsp
+---------------------------------+
</code></pre></div></div>
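
<p>Which frame sizes keep <code class="language-plaintext highlighter-rouge">%rsp</code> aligned in a function without <code class="language-plaintext highlighter-rouge">pushq %rbp</code> can be worked out with a little arithmetic. The sketch below assumes such a frameless function that makes further calls; <code class="language-plaintext highlighter-rouge">keeps_alignment</code> and <code class="language-plaintext highlighter-rouge">min_frame</code> are hypothetical helpers, not something gcc exposes, and they say nothing about why gcc picked 24 rather than 8.</p>

```cpp
#include <cstddef>

// On entry %rsp % 16 == 8, because call has just pushed the return address.
// Subtracting frame_bytes restores 16-byte alignment only if frame_bytes % 16 == 8.
constexpr bool keeps_alignment(std::size_t frame_bytes) {
  return frame_bytes % 16 == 8;
}

// Hypothetical helper: the smallest aligned frame that still holds `locals` bytes.
constexpr std::size_t min_frame(std::size_t locals) {
  return (locals + 7) / 16 * 16 + 8;
}

static_assert(keeps_alignment(8), "subq $8 would already be enough for Complex");
static_assert(keeps_alignment(24), "subq $24 is what gcc 13.3 -Og emitted");
static_assert(!keeps_alignment(16), "subq $16 would leave %rsp off by 8 at the next call");
static_assert(min_frame(8) == 8, "8 bytes of locals fit in an 8-byte frame");
static_assert(min_frame(9) == 24, "one more byte forces the next aligned size");
```

<p>Both 8 and 24 satisfy the constraint, so the 16 unused bytes in the layout above are a choice made by the compiler, not something the ABI requires.</p>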

<p>To summarize, the x86-64 System V ABI calling convention requires <code class="language-plaintext highlighter-rouge">%rsp</code> to be a multiple of 16 when the <code class="language-plaintext highlighter-rouge">call</code> instruction executes:</p>

<ul>
  <li>When the caller executes <code class="language-plaintext highlighter-rouge">call</code>, the 8-byte return address is pushed onto the stack</li>
  <li>The callee then pushes <code class="language-plaintext highlighter-rouge">%rbp</code> in its prologue</li>
  <li>After these two pushes, <code class="language-plaintext highlighter-rouge">%rsp</code> is again a multiple of 16</li>
  <li>In its epilogue, the callee pops the caller's saved <code class="language-plaintext highlighter-rouge">%rbp</code> and then the return address (into <code class="language-plaintext highlighter-rouge">%rip</code>) off the stack</li>
</ul>

<h3 id="layout-of-a-derived-class">Layout of a derived class</h3>

<p>Next, let's look at the layout of a derived class without virtual functions.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
</span>
<span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
    <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
    <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Derived</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Complex</span> <span class="p">{</span>
    <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Derived</span> <span class="n">d</span><span class="p">;</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"address of d: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">d</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"address of d.real: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">d</span><span class="p">.</span><span class="n">real</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"address of d.imag: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">d</span><span class="p">.</span><span class="n">imag</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"address of d.angle: %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">d</span><span class="p">.</span><span class="n">angle</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>gcc 13.3 with -O0 generates the following assembly:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="nf">endbr64</span>
    <span class="nf">push</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span><span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">sub</span>    <span class="kc">$</span><span class="mh">0x20</span><span class="p">,</span><span class="o">%</span><span class="nb">rsp</span>

    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
    <span class="nf">xor</span>    <span class="o">%</span><span class="nb">eax</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>

    <span class="c1">; print address of d</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x14</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe72</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of d.real</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x14</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe69</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of d.imag</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x14</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x4</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe61</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="c1">; print address of d.angle</span>
    <span class="nf">lea</span>    <span class="o">-</span><span class="mh">0x14</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">add</span>    <span class="kc">$</span><span class="mh">0x8</span><span class="p">,</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rsi</span>
    <span class="nf">lea</span>    <span class="mh">0xe59</span><span class="p">(</span><span class="o">%</span><span class="nv">rip</span><span class="p">),</span><span class="o">%</span><span class="nb">rax</span>
    <span class="nf">mov</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span><span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">call</span>   <span class="mh">0x555555555070</span> <span class="o">&lt;</span><span class="nv">printf@plt</span><span class="o">&gt;</span>

    <span class="nf">mov</span>    <span class="kc">$</span><span class="mh">0x0</span><span class="p">,</span><span class="o">%</span><span class="nb">eax</span>
    <span class="nf">mov</span>    <span class="o">-</span><span class="mh">0x8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span><span class="o">%</span><span class="nb">rdx</span>
    <span class="nf">sub</span>    <span class="o">%</span><span class="nb">fs</span><span class="p">:</span><span class="mh">0x28</span><span class="p">,</span><span class="o">%</span><span class="nb">rdx</span>
    <span class="nf">je</span>     <span class="mh">0x555555555211</span> <span class="o">&lt;</span><span class="nv">main</span><span class="p">()</span><span class="o">+</span><span class="mi">168</span><span class="o">&gt;</span>
    <span class="nf">call</span>   <span class="mh">0x555555555060</span> <span class="o">&lt;</span><span class="nv">__stack_chk_fail@plt</span><span class="o">&gt;</span>

    <span class="nf">leave</span>
    <span class="nf">ret</span>
</code></pre></div></div>

<p>After entering main and executing the prologue, the memory layout is as follows:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="k">return</span> <span class="n">address</span> <span class="n">of</span> <span class="n">main</span><span class="err">'</span><span class="n">s</span> <span class="n">caller</span> <span class="o">|</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">main</span><span class="err">'</span><span class="n">s</span> <span class="n">caller</span> <span class="o">%</span><span class="n">rbp</span>              <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">canary</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span> <span class="o">-</span> <span class="mi">8</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">d</span><span class="p">.</span><span class="n">angle</span>                         <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span> <span class="o">-</span> <span class="mi">12</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">d</span><span class="p">.</span><span class="n">imag</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span> <span class="o">-</span> <span class="mi">16</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">d</span><span class="p">.</span><span class="n">real</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span> <span class="o">-</span> <span class="mi">20</span>
<span class="o">+---------------------------------+</span>
<span class="o">|</span> <span class="n">unused</span>                          <span class="o">|</span>  <span class="o">&lt;-</span> <span class="o">%</span><span class="n">rbp</span> <span class="o">-</span> <span class="mi">32</span> <span class="o">=</span> <span class="o">%</span><span class="n">rsp</span>
<span class="o">+---------------------------------+</span>
</code></pre></div></div>

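<p>Inheritance works by “extending” the object. Think of inheritance as essentially “stacking” subclasses on top of base classes.</p>

<p>In essence, the layout of a derived class is the base class's layout with the derived class's member variables appended. In terms of memory layout, the following two definitions are equivalent:</p>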

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
  <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
  <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Derived</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Complex</span> <span class="p">{</span>
  <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">Derived</span> <span class="p">{</span>
  <span class="k">struct</span> <span class="p">{</span>
    <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
    <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
  <span class="p">};</span>
  <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
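<p>The "stacking" can be checked by measuring where each member sits inside a <code class="language-plaintext highlighter-rouge">Derived</code> object. This is a minimal sketch: <code class="language-plaintext highlighter-rouge">member_offset</code> is a hypothetical helper introduced only for illustration, and the offsets mentioned afterwards assume a typical Itanium ABI target such as x86-64 Linux with a 4-byte <code class="language-plaintext highlighter-rouge">float</code>.</p>

```cpp
#include <cstddef>

struct Complex {
  float real;
  float imag;
};

struct Derived : public Complex {
  float angle;
};

// Hypothetical helper (not from the original post): byte distance from the
// start of `obj` to one of its members, computed with plain pointer arithmetic.
template <typename Object, typename Member>
std::ptrdiff_t member_offset(const Object& obj, const Member& member) {
  return reinterpret_cast<const char*>(&member) -
         reinterpret_cast<const char*>(&obj);
}
```

<p>Under those assumptions, for a <code class="language-plaintext highlighter-rouge">Derived d</code> the helper should report 0 for <code class="language-plaintext highlighter-rouge">d.real</code>, 4 for <code class="language-plaintext highlighter-rouge">d.imag</code> and 8 for <code class="language-plaintext highlighter-rouge">d.angle</code>, which matches the stack diagram above: the base class members come first and the derived member follows them.</p>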

<h3 id="member-function">Member function</h3>

<p>Next, let's look at how non-virtual member functions are implemented:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cmath&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;cstdio&gt;</span><span class="cp">
</span>
<span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
    <span class="kt">float</span> <span class="n">func</span><span class="p">()</span> <span class="k">const</span> <span class="p">{</span> <span class="k">return</span> <span class="n">real</span> <span class="o">+</span> <span class="n">imag</span><span class="p">;</span> <span class="p">}</span>
    <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
    <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">void</span> <span class="nf">print</span><span class="p">(</span><span class="k">const</span> <span class="n">Complex</span> <span class="o">&amp;</span><span class="n">c</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"Abs: %f</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">func</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

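<p>After compiling, inspect the symbol table to find the name-mangled member function. Running the mangled name through <code class="language-plaintext highlighter-rouge">c++filt</code> recovers the function name as written in the source.</p>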

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>gcc <span class="nt">-o</span> test.o <span class="nt">-c</span> test.cpp

<span class="nv">$ </span>nm test.o
                 U <span class="nb">printf
</span>0000000000000000 T _Z5printRK7Complex
0000000000000000 W _ZNK7Complex4funcEv

<span class="nv">$ </span>c++filt _ZNK7Complex4funcEv
Complex::func<span class="o">()</span> const
</code></pre></div></div>
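<p>The same demangling can be done from inside a program. The sketch below assumes a GCC or Clang toolchain, whose C++ runtime provides <code class="language-plaintext highlighter-rouge">abi::__cxa_demangle</code> in <code class="language-plaintext highlighter-rouge">&lt;cxxabi.h&gt;</code>; the <code class="language-plaintext highlighter-rouge">demangle</code> wrapper itself is a hypothetical helper, not something from the original post.</p>

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical wrapper around abi::__cxa_demangle (GCC/Clang runtimes only).
std::string demangle(const char* mangled) {
  int status = 0;
  // Passing null for the buffer and length asks the runtime to malloc the result.
  char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || text == nullptr) {
    return mangled;  // not a valid mangled name: hand the input back unchanged
  }
  std::string result(text);
  std::free(text);  // the caller owns the malloc'ed buffer
  return result;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">demangle("_ZNK7Complex4funcEv")</code> should give the same text that <code class="language-plaintext highlighter-rouge">c++filt</code> printed above.</p>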

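<p>In practice, every non-static member function is compiled into a name-mangled free function; for example, <code class="language-plaintext highlighter-rouge">func</code> becomes <code class="language-plaintext highlighter-rouge">_ZNK7Complex4funcEv</code>. The compiler also makes the <code class="language-plaintext highlighter-rouge">this</code> pointer the first parameter of each of these free functions.</p>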

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">float</span> <span class="nf">_ZNK7Complex4funcEv</span><span class="p">(</span><span class="k">const</span> <span class="n">Complex</span><span class="o">*</span> <span class="k">this</span><span class="p">);</span>
</code></pre></div></div>

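<p>The reason is that by passing the <code class="language-plaintext highlighter-rouge">this</code> pointer, all objects of a class can share a single copy of the member function's code, and member functions do not need to be stored inside the object. <code class="language-plaintext highlighter-rouge">func</code> is essentially equivalent to the following implementation:</p>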

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">float</span> <span class="n">Complex</span><span class="o">::</span><span class="n">func</span><span class="p">(</span><span class="n">Complex</span> <span class="k">const</span> <span class="o">*</span> <span class="k">this</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">real</span> <span class="o">+</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">imag</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

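<p>When a member function is called on different objects of the same class, the compiler simply passes each object's address as the <code class="language-plaintext highlighter-rouge">this</code> pointer to this free function.</p>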

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">float</span> <span class="nf">_ZNK7Complex4funcEv</span><span class="p">(</span><span class="n">Complex</span> <span class="k">const</span> <span class="o">*</span> <span class="k">this</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">return</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">real</span> <span class="o">+</span> <span class="k">this</span><span class="o">-&gt;</span><span class="n">imag</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
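<p>The pseudo-code above uses <code class="language-plaintext highlighter-rouge">this</code> as a parameter name, which real C++ does not allow. A compilable sketch of the same idea is shown below; <code class="language-plaintext highlighter-rouge">Complex_func</code> and its <code class="language-plaintext highlighter-rouge">self</code> parameter are hypothetical names chosen for illustration and are not what the compiler actually emits.</p>

```cpp
struct Complex {
  float func() const { return real + imag; }
  float real;
  float imag;
};

// Hypothetical hand-written counterpart of Complex::func: the object is passed
// explicitly as the first argument, the way the compiler passes `this`.
float Complex_func(const Complex* self) { return self->real + self->imag; }

// Member functions take no space in the object: only the two floats are stored.
static_assert(sizeof(Complex) == 2 * sizeof(float),
              "member functions add no per-object storage");
```

<p>For any <code class="language-plaintext highlighter-rouge">Complex c</code>, <code class="language-plaintext highlighter-rouge">Complex_func(&amp;c)</code> and <code class="language-plaintext highlighter-rouge">c.func()</code> compute the same value, and both run the same code regardless of which object is passed in.</p>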

<p>Let's look at the assembly of the member function <code class="language-plaintext highlighter-rouge">func</code>. <code class="language-plaintext highlighter-rouge">%rdi</code>, which carries the first argument, holds <code class="language-plaintext highlighter-rouge">this</code>. The member variables are then read at fixed offsets from the <code class="language-plaintext highlighter-rouge">this</code> pointer.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>objdump <span class="nt">-d</span> <span class="nt">--no-show-raw-insn</span> test.o

0000000000000000 &lt;_ZNK7Complex4funcEv&gt;:
   0:   endbr64
   4:   push   %rbp
   5:   mov    %rsp,%rbp
   8:   mov    %rdi,-0x8<span class="o">(</span>%rbp<span class="o">)</span>      <span class="p">;</span> save this
   c:   mov    <span class="nt">-0x8</span><span class="o">(</span>%rbp<span class="o">)</span>,%rax
  10:   movss  <span class="o">(</span>%rax<span class="o">)</span>,%xmm1         <span class="p">;</span> load this-&gt;real into xmm1
  14:   mov    <span class="nt">-0x8</span><span class="o">(</span>%rbp<span class="o">)</span>,%rax
  18:   movss  0x4<span class="o">(</span>%rax<span class="o">)</span>,%xmm0      <span class="p">;</span> load this-&gt;imag into xmm0
  1d:   addss  %xmm1,%xmm0          <span class="p">;</span> xmm0 <span class="o">=</span> xmm0 + xmm1
  21:   pop    %rbp
  22:   ret
</code></pre></div></div>

<h3 id="virtual-function">Virtual function</h3>

<p>Next, let's look at how virtual function calls work. Consider what the following code prints:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span><span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">Erdos</span> <span class="p">{</span>
  <span class="kt">void</span> <span class="n">whoAmI</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I am Erdos</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
  <span class="k">virtual</span> <span class="kt">void</span> <span class="nf">whoAmIReally</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I really am Erdos</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Fermat</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Erdos</span> <span class="p">{</span>
  <span class="kt">void</span> <span class="n">whoAmI</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I am Fermat</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
  <span class="k">virtual</span> <span class="kt">void</span> <span class="nf">whoAmIReally</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I really am Fermat</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">Fermat</span> <span class="n">f</span><span class="p">;</span>
  <span class="n">f</span><span class="p">.</span><span class="n">whoAmI</span><span class="p">();</span>
  <span class="n">f</span><span class="p">.</span><span class="n">whoAmIReally</span><span class="p">();</span>

  <span class="n">Erdos</span> <span class="o">&amp;</span><span class="n">e</span> <span class="o">=</span> <span class="n">f</span><span class="p">;</span>
  <span class="n">e</span><span class="p">.</span><span class="n">whoAmI</span><span class="p">();</span>
  <span class="n">e</span><span class="p">.</span><span class="n">whoAmIReally</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The output is:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">I</span> <span class="n">am</span> <span class="n">Fermat</span>
<span class="n">I</span> <span class="n">really</span> <span class="n">am</span> <span class="n">Fermat</span>
<span class="n">I</span> <span class="n">am</span> <span class="n">Erdos</span>
<span class="n">I</span> <span class="n">really</span> <span class="n">am</span> <span class="n">Fermat</span>
</code></pre></div></div>

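<p>There are two points to understand here:</p>

<ul>
  <li>Non-virtual function calls are resolved at compile time</li>
  <li>Virtual function calls are resolved at run time</li>
</ul>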

<blockquote>
  <p>Non-virtual member functions bind statically. Function resolution occurs at compile time. Virtual member functions bind dynamically. Function resolution occurs when the object is created.</p>

</blockquote>

<p>For the non-virtual function <code class="language-plaintext highlighter-rouge">whoAmI</code>, the compiler already knows that <code class="language-plaintext highlighter-rouge">f</code> is a <code class="language-plaintext highlighter-rouge">Fermat</code> object, so <code class="language-plaintext highlighter-rouge">f.whoAmI()</code> binds to <code class="language-plaintext highlighter-rouge">Fermat::whoAmI</code>. A <code class="language-plaintext highlighter-rouge">Fermat</code> contains an <code class="language-plaintext highlighter-rouge">Erdos</code> base subobject, so that base object can be reached through <code class="language-plaintext highlighter-rouge">f</code>. When handling <code class="language-plaintext highlighter-rouge">e.whoAmI()</code>, the compiler therefore sees an <code class="language-plaintext highlighter-rouge">Erdos</code> object and binds the call to <code class="language-plaintext highlighter-rouge">Erdos::whoAmI</code>.</p>

<p>For the virtual function <code class="language-plaintext highlighter-rouge">whoAmIReally</code>, whether the base class version or the derived class version gets called is already decided when the object is created. When <code class="language-plaintext highlighter-rouge">f</code> is constructed as a <code class="language-plaintext highlighter-rouge">Fermat</code> object, it is set up to use <code class="language-plaintext highlighter-rouge">Fermat</code>'s vtable, so calling <code class="language-plaintext highlighter-rouge">f.whoAmIReally</code> prints <code class="language-plaintext highlighter-rouge">I really am Fermat</code>. Likewise, even though <code class="language-plaintext highlighter-rouge">e</code> refers to the base class part of <code class="language-plaintext highlighter-rouge">f</code>, the <code class="language-plaintext highlighter-rouge">vptr</code> of <code class="language-plaintext highlighter-rouge">f</code> was pointed at <code class="language-plaintext highlighter-rouge">Fermat</code>'s vtable when <code class="language-plaintext highlighter-rouge">f</code> was created, so calling <code class="language-plaintext highlighter-rouge">e.whoAmIReally</code> also prints <code class="language-plaintext highlighter-rouge">I really am Fermat</code>.</p>

<p>The assembly shows how this is done. To keep it easy to follow, the code below is the output of gcc 13.3 at -O0 on Compiler Explorer:</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">main:</span>
    <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">subq</span>    <span class="kc">$</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>

    <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">vtable</span> <span class="nv">for</span> <span class="nv">Fermat</span><span class="o">+</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">-</span><span class="mi">16</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>         <span class="c1">; construct the Fermat object on the stack; it holds only the vptr</span>

    <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">16</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>         <span class="c1">; load the address of f</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span>    <span class="nv">Fermat</span><span class="p">::</span><span class="nv">whoAmI</span><span class="p">()</span>        <span class="c1">; call the non-virtual function Fermat::whoAmI</span>

    <span class="c1">; this is a virtual function, but the compiler knows f is a Fermat object, so the call does not go through the vtable</span>
    <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">16</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>         <span class="c1">; load the address of f</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span>    <span class="nv">Fermat</span><span class="p">::</span><span class="nv">whoAmIReally</span><span class="p">()</span>

    <span class="nf">leaq</span>    <span class="o">-</span><span class="mi">16</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>         <span class="c1">; load the address of f</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>          <span class="c1">; store the address of f at -8(%rbp) as the reference e</span>
    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span>    <span class="nv">Erdos</span><span class="p">::</span><span class="nv">whoAmI</span><span class="p">()</span>         <span class="c1">; call the non-virtual function Erdos::whoAmI</span>

    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>          <span class="c1">; rax = address of f, where the vptr is stored</span>
    <span class="nf">movq</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>            <span class="c1">; rax = *(&amp;f) = vptr, which points at the first function slot of the vtable</span>
    <span class="nf">movq</span>    <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">),</span> <span class="o">%</span><span class="nb">rdx</span>            <span class="c1">; rdx = *vptr = address of the first virtual function</span>
    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>          <span class="c1">; load the address of f</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span>    <span class="o">*%</span><span class="nb">rdx</span>                   <span class="c1">; call the virtual function</span>

    <span class="nf">movl</span>    <span class="kc">$</span><span class="mi">0</span><span class="p">,</span> <span class="o">%</span><span class="nb">eax</span>
    <span class="nf">leave</span>
    <span class="nf">ret</span>
</code></pre></div></div>

<p>Note that when the compiler handles <code class="language-plaintext highlighter-rouge">f.whoAmIReally()</code>, even though <code class="language-plaintext highlighter-rouge">whoAmIReally</code> is a virtual function, the type of <code class="language-plaintext highlighter-rouge">f</code> is known to be exactly <code class="language-plaintext highlighter-rouge">Fermat</code>, so the call does not need to go through the vtable.</p>

<p>When handling <code class="language-plaintext highlighter-rouge">e.whoAmIReally()</code>, the code first loads the address of the object <code class="language-plaintext highlighter-rouge">f</code>. The <code class="language-plaintext highlighter-rouge">vptr</code> sits at the start of the object, and since neither <code class="language-plaintext highlighter-rouge">Fermat</code> nor its base class <code class="language-plaintext highlighter-rouge">Erdos</code> has any member variables, the <code class="language-plaintext highlighter-rouge">vptr</code> is all the object contains, so what is stored at that address is the <code class="language-plaintext highlighter-rouge">vptr</code> of <code class="language-plaintext highlighter-rouge">f</code>. <code class="language-plaintext highlighter-rouge">f</code> is a <code class="language-plaintext highlighter-rouge">Fermat</code> object, so its <code class="language-plaintext highlighter-rouge">vptr</code> points into <code class="language-plaintext highlighter-rouge">Fermat</code>'s <code class="language-plaintext highlighter-rouge">vtable</code>, specifically at the first virtual function slot (the <code class="language-plaintext highlighter-rouge">vtable for Fermat+16</code> in the assembly above skips the offset-to-top and RTTI pointer entries that the Itanium ABI places before the function pointers). <code class="language-plaintext highlighter-rouge">Fermat</code> has only one virtual function, so that slot holds the address of <code class="language-plaintext highlighter-rouge">Fermat</code>'s <code class="language-plaintext highlighter-rouge">whoAmIReally</code>. The addresses are related as follows:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">*</span><span class="p">(</span><span class="o">&amp;</span><span class="n">f</span><span class="p">)</span> <span class="o">=</span> <span class="n">vptr</span>
<span class="n">vptr</span> <span class="o">=</span> <span class="n">address</span> <span class="n">of</span> <span class="n">the</span> <span class="n">first</span> <span class="n">function</span> <span class="n">slot</span> <span class="n">in</span> <span class="n">Fermat</span><span class="err">'</span><span class="n">s</span> <span class="n">vtable</span>
<span class="o">*</span><span class="n">vptr</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">Fermat</span><span class="o">::</span><span class="n">whoAmIReally</span>
</code></pre></div></div>
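<p>The pointer chain can also be modelled by hand, which makes the indirection explicit. This is only a sketch of the mechanism, not the compiler's actual data structures: every name in it (<code class="language-plaintext highlighter-rouge">VTable</code>, <code class="language-plaintext highlighter-rouge">ErdosModel</code>, <code class="language-plaintext highlighter-rouge">FermatModel</code>, <code class="language-plaintext highlighter-rouge">call_really</code> and so on) is hypothetical, and the functions return strings instead of printing so that the result is easy to check.</p>

```cpp
#include <string>

struct ErdosModel;

// Hypothetical stand-in for a vtable: one slot per virtual function.
struct VTable {
  std::string (*whoAmIReally)(const ErdosModel*);
};

// The object holds nothing but a pointer to its class's table (the "vptr").
struct ErdosModel {
  const VTable* vptr;
};
struct FermatModel : ErdosModel {};

std::string erdos_really(const ErdosModel*) { return "I really am Erdos"; }
std::string fermat_really(const ErdosModel*) { return "I really am Fermat"; }

// One table per class, shared by all objects of that class.
const VTable erdos_vtable{&erdos_really};
const VTable fermat_vtable{&fermat_really};

// "Constructors": the vptr is chosen once, when the object is created.
ErdosModel make_erdos() { return ErdosModel{&erdos_vtable}; }
FermatModel make_fermat() {
  FermatModel f;
  f.vptr = &fermat_vtable;
  return f;
}

// A "virtual call": load the vptr, load the slot, call through it with `this`.
std::string call_really(const ErdosModel& e) {
  return e.vptr->whoAmIReally(&e);
}
```

<p>Passing a <code class="language-plaintext highlighter-rouge">FermatModel</code> to <code class="language-plaintext highlighter-rouge">call_really</code> through an <code class="language-plaintext highlighter-rouge">ErdosModel</code> reference still reaches <code class="language-plaintext highlighter-rouge">fermat_really</code>, because the table was selected at creation time rather than at the call site.</p>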

<h3 id="layout-of-derived-class-with-virtual-function">Layout of derived class with virtual function</h3>

<p>Since both derived and base classes with virtual functions have vtables, let's continue by looking at their layout.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cmath&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span><span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
  <span class="k">virtual</span> <span class="o">~</span><span class="n">Complex</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
  <span class="k">virtual</span> <span class="kt">float</span> <span class="n">Abs</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">real</span><span class="p">,</span> <span class="n">imag</span><span class="p">);</span> <span class="p">}</span>
  <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
  <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Derived</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Complex</span> <span class="p">{</span>
  <span class="k">virtual</span> <span class="o">~</span><span class="n">Derived</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
  <span class="k">virtual</span> <span class="kt">float</span> <span class="n">Abs</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">real</span><span class="p">,</span> <span class="n">imag</span><span class="p">),</span> <span class="n">angle</span><span class="p">);</span> <span class="p">}</span>
  <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"sizeof(float): "</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
  <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"sizeof(void*): "</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
  <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"sizeof(Complex): "</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">Complex</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
  <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"sizeof(Derived): "</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">Derived</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">sizeof</span><span class="p">(</span><span class="kt">float</span><span class="p">)</span><span class="o">:</span> <span class="mi">4</span>
<span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span><span class="o">*</span><span class="p">)</span><span class="o">:</span> <span class="mi">8</span>
<span class="k">sizeof</span><span class="p">(</span><span class="n">Complex</span><span class="p">)</span><span class="o">:</span> <span class="mi">16</span>
<span class="k">sizeof</span><span class="p">(</span><span class="n">Derived</span><span class="p">)</span><span class="o">:</span> <span class="mi">24</span>
</code></pre></div></div>

<p>The size of <code class="language-plaintext highlighter-rouge">Complex</code> is 16: now that it has virtual functions, it needs an extra 8 bytes to store the <code class="language-plaintext highlighter-rouge">vptr</code>, that is:</p>

<p><code class="language-plaintext highlighter-rouge">sizeof(Complex) = sizeof(vptr) + sizeof(real) + sizeof(imag) = 8 + 4 + 4 = 16</code></p>

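<p>The size of <code class="language-plaintext highlighter-rouge">Derived</code> is 24. On top of <code class="language-plaintext highlighter-rouge">Complex</code>, it needs an extra 4 bytes for <code class="language-plaintext highlighter-rouge">angle</code>. The resulting 20 bytes are not a multiple of the 8-byte alignment, so 4 bytes of padding are added, giving a total of 24. Note that the derived class's <code class="language-plaintext highlighter-rouge">vptr</code> is also stored at the very start of the object.</p>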

<p><code class="language-plaintext highlighter-rouge">sizeof(Derived) = sizeof(Complex) + sizeof(angle) + 4 byte padding = 16 + 4 + 4 = 24</code></p>
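<p>The padding arithmetic can be written out as a small check. This sketch assumes a 64-bit Itanium ABI target (8-byte pointers, 4-byte <code class="language-plaintext highlighter-rouge">float</code>); <code class="language-plaintext highlighter-rouge">round_up</code> is a hypothetical helper, and the classes are trimmed-down versions of the ones above.</p>

```cpp
#include <cstddef>

struct Complex {
  virtual ~Complex() = default;
  float real;
  float imag;
};

struct Derived : public Complex {
  float angle;
};

// Hypothetical helper: round `size` up to the next multiple of `align`.
constexpr std::size_t round_up(std::size_t size, std::size_t align) {
  return (size + align - 1) / align * align;
}

// Size the layout would have before tail padding: base part plus `angle`.
constexpr std::size_t unpadded_derived_size() {
  return sizeof(Complex) + sizeof(float);
}
```

<p>On such a target <code class="language-plaintext highlighter-rouge">unpadded_derived_size()</code> is 20 and <code class="language-plaintext highlighter-rouge">alignof(Derived)</code> is 8 (because of the <code class="language-plaintext highlighter-rouge">vptr</code>), so <code class="language-plaintext highlighter-rouge">round_up(20, 8)</code> yields the 24 bytes reported by <code class="language-plaintext highlighter-rouge">sizeof(Derived)</code>.</p>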

<h3 id="vtable-review">vtable review</h3>

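<p><code class="language-plaintext highlighter-rouge">Complex</code> and <code class="language-plaintext highlighter-rouge">Derived</code> each have their own vtable. When an object is constructed, its <code class="language-plaintext highlighter-rouge">vptr</code> is pointed at the vtable of its class. Later, when a virtual function is called, the vtable is located through the <code class="language-plaintext highlighter-rouge">vptr</code> and the corresponding virtual function is called.</p>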

<p><img src="/archive/vtable-1.png" alt="figure" /></p>

<blockquote>
  <p>The <code class="language-plaintext highlighter-rouge">vptr</code> of the <code class="language-plaintext highlighter-rouge">Complex</code> object <code class="language-plaintext highlighter-rouge">c</code> points to the vtable of class <code class="language-plaintext highlighter-rouge">Complex</code></p>

</blockquote>

<p><img src="/archive/vtable-2.png" alt="figure" /></p>

<blockquote>
  <p>The <code class="language-plaintext highlighter-rouge">vptr</code> of the <code class="language-plaintext highlighter-rouge">Derived</code> object <code class="language-plaintext highlighter-rouge">c</code> points to the vtable of class <code class="language-plaintext highlighter-rouge">Derived</code></p>

</blockquote>

<h3 id="constructor-of-derived-class">Constructor of derived class</h3>

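<p>So when is an object's <code class="language-plaintext highlighter-rouge">vptr</code> assigned? The answer is in the constructor: when an object is being constructed, its concrete class is known, so the constructor can initialize the <code class="language-plaintext highlighter-rouge">vptr</code> to the vtable address of that class.</p>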

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;cmath&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span><span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">Complex</span> <span class="p">{</span>
  <span class="n">Complex</span><span class="p">()</span> <span class="o">:</span> <span class="n">real</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">imag</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="p">{}</span>
  <span class="k">virtual</span> <span class="o">~</span><span class="n">Complex</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
  <span class="k">virtual</span> <span class="kt">float</span> <span class="nf">Abs</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">real</span><span class="p">,</span> <span class="n">imag</span><span class="p">);</span> <span class="p">}</span>
  <span class="kt">float</span> <span class="n">real</span><span class="p">;</span>
  <span class="kt">float</span> <span class="n">imag</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Derived</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Complex</span> <span class="p">{</span>
  <span class="n">Derived</span><span class="p">()</span> <span class="o">:</span> <span class="n">angle</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="p">{}</span>
  <span class="k">virtual</span> <span class="o">~</span><span class="n">Derived</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
  <span class="kt">float</span> <span class="n">angle</span><span class="p">;</span>
  <span class="k">virtual</span> <span class="kt">float</span> <span class="nf">Abs</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">hypot</span><span class="p">(</span><span class="n">real</span><span class="p">,</span> <span class="n">imag</span><span class="p">),</span> <span class="n">angle</span><span class="p">);</span> <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">Derived</span> <span class="n">d</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The constructor of the base class <code class="language-plaintext highlighter-rouge">Complex</code> is shown below. Note that a constructor also receives the <code class="language-plaintext highlighter-rouge">this</code> pointer as its first argument in <code class="language-plaintext highlighter-rouge">%rdi</code>, so it can be thought of as constructing an object at a given address. The logic is simple: it stores the address of <code class="language-plaintext highlighter-rouge">Complex</code>'s vtable into <code class="language-plaintext highlighter-rouge">*%rdi</code>, then writes 0 to <code class="language-plaintext highlighter-rouge">*(%rdi + 8)</code> and <code class="language-plaintext highlighter-rouge">*(%rdi + 12)</code>, which correspond to <code class="language-plaintext highlighter-rouge">real</code> and <code class="language-plaintext highlighter-rouge">imag</code> in <code class="language-plaintext highlighter-rouge">Complex</code>.</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">Complex:</span><span class="p">:</span><span class="nf">Complex</span><span class="p">()</span> <span class="p">[</span><span class="nv">base</span> <span class="nv">object</span> <span class="nv">constructor</span><span class="p">]:</span>
    <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>

    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>                 <span class="c1">; spill this to -8(%rbp)</span>
    <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">vtable</span> <span class="nv">for</span> <span class="nv">Complex</span><span class="o">+</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">edx</span>   <span class="c1">; load Complex's vtable address into %edx (+16 skips offset-to-top and the RTTI pointer)</span>
    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>                   <span class="c1">; store the vtable address into *this, i.e. set vptr</span>

    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">pxor</span>    <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="o">%</span><span class="nv">xmm0</span>
    <span class="nf">movss</span>   <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>                 <span class="c1">; zero *(%rdi + 8), i.e. this-&gt;real = 0</span>

    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">pxor</span>    <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="o">%</span><span class="nv">xmm0</span>
    <span class="nf">movss</span>   <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="mi">12</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>                <span class="c1">; zero *(%rdi + 12), i.e. this-&gt;imag = 0</span>
    <span class="nf">nop</span>

    <span class="nf">popq</span>    <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">ret</span>
</code></pre></div></div>

<p>The constructor of the derived class <code class="language-plaintext highlighter-rouge">Derived</code> is shown below. Note that it first calls the base class constructor and only then updates <code class="language-plaintext highlighter-rouge">vptr</code>.</p>

<div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">Derived:</span><span class="p">:</span><span class="nf">Derived</span><span class="p">()</span> <span class="p">[</span><span class="nv">base</span> <span class="nv">object</span> <span class="nv">constructor</span><span class="p">]:</span>
    <span class="nf">pushq</span>   <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rsp</span><span class="p">,</span> <span class="o">%</span><span class="nb">rbp</span>
    <span class="nf">subq</span>    <span class="kc">$</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">rsp</span>                       <span class="c1">; reserve 16 bytes of stack for locals (the spilled this); the object itself is allocated by the caller</span>

    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rdi</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">)</span>
    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rax</span><span class="p">,</span> <span class="o">%</span><span class="nb">rdi</span>
    <span class="nf">call</span>    <span class="nv">Complex</span><span class="p">::</span><span class="nv">Complex</span><span class="p">()</span> <span class="p">[</span><span class="nv">base</span> <span class="nv">object</span> <span class="nv">constructor</span><span class="p">]</span>

    <span class="nf">movl</span>    <span class="kc">$</span><span class="nv">vtable</span> <span class="nv">for</span> <span class="nv">Derived</span><span class="o">+</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="nb">edx</span>    <span class="c1">; load Derived's vtable address into %edx</span>
    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">movq</span>    <span class="o">%</span><span class="nb">rdx</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>                    <span class="c1">; overwrite vptr at *this with Derived's vtable address</span>

    <span class="nf">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="nb">rbp</span><span class="p">),</span> <span class="o">%</span><span class="nb">rax</span>
    <span class="nf">pxor</span>    <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="o">%</span><span class="nv">xmm0</span>
    <span class="nf">movss</span>   <span class="o">%</span><span class="nv">xmm0</span><span class="p">,</span> <span class="mi">16</span><span class="p">(</span><span class="o">%</span><span class="nb">rax</span><span class="p">)</span>                 <span class="c1">; zero *(%rdi + 16), i.e. this-&gt;angle = 0</span>
    <span class="nf">nop</span>

    <span class="nf">leave</span>
    <span class="nf">ret</span>
</code></pre></div></div>

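<p>As the code above shows, constructing an object of a derived class with virtual functions performs the following steps in order:</p>

<ol>
  <li>Call the base class constructor (steps 2 to 4 happen inside it)</li>
  <li>Store the address of the <strong>base class</strong> vtable at the start of the object (note: not the derived class's)</li>
  <li>Initialize the base class members according to the initializer list</li>
  <li>Run the rest of the base class constructor body</li>
  <li>Overwrite the start of the object with the address of the <strong>derived class</strong> vtable</li>
  <li>Initialize the derived class members according to the initializer list</li>
  <li>Run the rest of the derived class constructor body</li>
</ol>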
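
<p>The order in the list above can also be observed from plain C++, without reading assembly. The sketch below records which override a virtual call reaches at each step. <code class="language-plaintext highlighter-rouge">Base</code>, <code class="language-plaintext highlighter-rouge">Child</code>, <code class="language-plaintext highlighter-rouge">Log</code>, <code class="language-plaintext highlighter-rouge">Step</code> and <code class="language-plaintext highlighter-rouge">ConstructionLog</code> are hypothetical names made up for this illustration; they are not part of the original example.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Global log so that constructors can record what they observe.
std::vector&lt;std::string&gt;&amp; Log() {
  static std::vector&lt;std::string&gt; log;
  return log;
}

// Used inside member initializers: records a step and returns 0.
float Step(const std::string&amp; what) {
  Log().push_back(what);
  return 0;
}

struct Base {
  // Name() is a virtual call; it reaches whatever vtable vptr points at now.
  Base() : x(Step("init Base::x, dynamic type " + Name())) {
    Log().push_back("Base body, dynamic type " + Name());
  }
  virtual ~Base() = default;
  virtual std::string Name() { return "Base"; }
  float x;
};

struct Child : Base {
  // Base() has already finished by the time y is initialized.
  Child() : y(Step("init Child::y, dynamic type " + Name())) {
    Log().push_back("Child body, dynamic type " + Name());
  }
  std::string Name() override { return "Child"; }
  float y;
};

// Constructs one Child and returns the recorded steps.
std::vector&lt;std::string&gt; ConstructionLog() {
  Log().clear();
  Child c;
  return Log();
}

void CheckConstructionOrder() {
  const std::vector&lt;std::string&gt; log = ConstructionLog();
  assert(log.size() == 4);
  assert(log.front() == "init Base::x, dynamic type Base");
  assert(log.back() == "Child body, dynamic type Child");
}
</code></pre></div></div>

<p>The log has four entries: the two <code class="language-plaintext highlighter-rouge">Base</code> steps see <code class="language-plaintext highlighter-rouge">Base</code>, and the two <code class="language-plaintext highlighter-rouge">Child</code> steps see <code class="language-plaintext highlighter-rouge">Child</code>. This matches the point that <code class="language-plaintext highlighter-rouge">vptr</code> is switched to the derived table before the derived members are initialized.</p>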

<p>Now that the execution order of a derived class constructor is clear, consider what the following code prints.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span><span class="k">using</span> <span class="n">std</span><span class="o">::</span><span class="n">cout</span><span class="p">;</span>

<span class="k">struct</span> <span class="nc">Erdos</span> <span class="p">{</span>
  <span class="n">Erdos</span><span class="p">()</span> <span class="p">{</span> <span class="n">whoAmIReally</span><span class="p">();</span> <span class="p">}</span>
  <span class="k">virtual</span> <span class="kt">void</span> <span class="nf">whoAmIReally</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I really am Erdos</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="nc">Fermat</span> <span class="o">:</span> <span class="k">public</span> <span class="n">Erdos</span> <span class="p">{</span>
  <span class="k">virtual</span> <span class="kt">void</span> <span class="n">whoAmIReally</span><span class="p">()</span> <span class="p">{</span> <span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="s">"I really am Fermat</span><span class="se">\n</span><span class="s">"</span><span class="p">;</span> <span class="p">}</span>
<span class="p">};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">Fermat</span> <span class="n">f</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The result is slightly counterintuitive:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">I</span> <span class="n">really</span> <span class="n">am</span> <span class="n">Erdos</span>
</code></pre></div></div>

<p>Following the earlier analysis, while the <code class="language-plaintext highlighter-rouge">Erdos</code> base subobject of <code class="language-plaintext highlighter-rouge">Fermat</code> is being constructed, <code class="language-plaintext highlighter-rouge">vptr</code> is first set to the address of <code class="language-plaintext highlighter-rouge">Erdos</code>'s vtable, so the call to <code class="language-plaintext highlighter-rouge">whoAmIReally</code> that follows naturally lands in <code class="language-plaintext highlighter-rouge">Erdos::whoAmIReally</code>. Only after the base subobject has been constructed is <code class="language-plaintext highlighter-rouge">vptr</code> updated to the derived class's vtable.</p>

<blockquote>
  <p>One of the items in Effective C++ says exactly this: never call virtual functions during construction or destruction.</p>

</blockquote>
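
<p>The rule also mentions destruction, which mirrors construction: by the time the base destructor runs, the derived part is already gone and virtual calls resolve to the base class again. A minimal sketch of the whole lifetime follows. It reuses the class names from this post, but <code class="language-plaintext highlighter-rouge">Trace</code>, <code class="language-plaintext highlighter-rouge">WhoAmI</code> and <code class="language-plaintext highlighter-rouge">TraceLifetime</code> are hypothetical helpers added for illustration.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;cassert&gt;
#include &lt;string&gt;

// Accumulates what each phase of the object's lifetime observes.
std::string&amp; Trace() {
  static std::string trace;
  return trace;
}

struct Erdos {
  Erdos() { Trace() += "ctor:" + WhoAmI() + ";"; }
  virtual ~Erdos() { Trace() += "dtor:" + WhoAmI() + ";"; }
  virtual std::string WhoAmI() { return "Erdos"; }
};

struct Fermat : Erdos {
  std::string WhoAmI() override { return "Fermat"; }
};

// Creates and destroys one Fermat, recording the override reached each time.
std::string TraceLifetime() {
  Trace().clear();
  {
    Fermat f;
    Erdos&amp; ref = f;
    // Fully constructed object: the call dispatches to Fermat::WhoAmI.
    Trace() += "live:" + ref.WhoAmI() + ";";
  }  // f is destroyed here; ~Erdos runs after the Fermat part is gone.
  return Trace();
}

void CheckLifetime() {
  assert(TraceLifetime() == "ctor:Erdos;live:Fermat;dtor:Erdos;");
}
</code></pre></div></div>

<p>Only the call made on the fully constructed object reaches <code class="language-plaintext highlighter-rouge">Fermat</code>; the calls made from the base constructor and the base destructor both reach <code class="language-plaintext highlighter-rouge">Erdos</code>.</p>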

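<p>The assembly of the two constructors is shown below and is worth reading alongside the explanation. Note that gcc calls <code class="language-plaintext highlighter-rouge">Erdos::whoAmIReally()</code> directly instead of going through the vtable, because inside the constructor the dynamic type is already known at compile time.</p>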

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Erdos</span><span class="o">::</span><span class="n">Erdos</span><span class="p">()</span> <span class="p">[</span><span class="n">base</span> <span class="n">object</span> <span class="n">constructor</span><span class="p">]</span><span class="o">:</span>
    <span class="n">pushq</span>   <span class="o">%</span><span class="n">rbp</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rsp</span><span class="p">,</span> <span class="o">%</span><span class="n">rbp</span>
    <span class="n">subq</span>    <span class="err">$</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="n">rsp</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rdi</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">)</span>
    <span class="n">movl</span>    <span class="err">$</span><span class="n">vtable</span> <span class="k">for</span> <span class="n">Erdos</span><span class="o">+</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="n">edx</span>
    <span class="n">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">),</span> <span class="o">%</span><span class="n">rax</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rdx</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="n">rax</span><span class="p">)</span>
    <span class="n">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">),</span> <span class="o">%</span><span class="n">rax</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rax</span><span class="p">,</span> <span class="o">%</span><span class="n">rdi</span>
    <span class="n">call</span>    <span class="n">Erdos</span><span class="o">::</span><span class="n">whoAmIReally</span><span class="p">()</span>
    <span class="n">nop</span>
    <span class="n">leave</span>
    <span class="n">ret</span>

<span class="n">Fermat</span><span class="o">::</span><span class="n">Fermat</span><span class="p">()</span> <span class="p">[</span><span class="n">base</span> <span class="n">object</span> <span class="n">constructor</span><span class="p">]</span><span class="o">:</span>
    <span class="n">pushq</span>   <span class="o">%</span><span class="n">rbp</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rsp</span><span class="p">,</span> <span class="o">%</span><span class="n">rbp</span>
    <span class="n">subq</span>    <span class="err">$</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="n">rsp</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rdi</span><span class="p">,</span> <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">)</span>
    <span class="n">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">),</span> <span class="o">%</span><span class="n">rax</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rax</span><span class="p">,</span> <span class="o">%</span><span class="n">rdi</span>
    <span class="n">call</span>    <span class="n">Erdos</span><span class="o">::</span><span class="n">Erdos</span><span class="p">()</span> <span class="p">[</span><span class="n">base</span> <span class="n">object</span> <span class="n">constructor</span><span class="p">]</span>
    <span class="n">movl</span>    <span class="err">$</span><span class="n">vtable</span> <span class="k">for</span> <span class="n">Fermat</span><span class="o">+</span><span class="mi">16</span><span class="p">,</span> <span class="o">%</span><span class="n">edx</span>
    <span class="n">movq</span>    <span class="o">-</span><span class="mi">8</span><span class="p">(</span><span class="o">%</span><span class="n">rbp</span><span class="p">),</span> <span class="o">%</span><span class="n">rax</span>
    <span class="n">movq</span>    <span class="o">%</span><span class="n">rdx</span><span class="p">,</span> <span class="p">(</span><span class="o">%</span><span class="n">rax</span><span class="p">)</span>
    <span class="n">nop</span>
    <span class="n">leave</span>
    <span class="n">ret</span>
</code></pre></div></div>

<h2 id="refernce">Reference</h2>

<p><a href="https://www.youtube.com/watch?v=iLiDezv_Frk">CppCon 2015: Richard Powell “Intro to the C++ Object Model” - YouTube</a></p>]]></content><author><name>Doodle</name></author><category term="学习" /><category term="C++" /><summary type="html"><![CDATA[Building on the CppCon 2015 talk Intro to the C++ Object Model, this post takes another look at the C++ object model from the assembly level. My environment is Linux with gcc 13.3.]]></summary></entry></feed>