foxido https://foxido.dev/ Recent content on foxido Hugo en Tue, 15 Jul 2025 20:00:00 +0300 Fuzzing https://foxido.dev/posts/fuzzzing/ Tue, 15 Jul 2025 20:00:00 +0300 https://foxido.dev/posts/fuzzzing/ <h2 id="overview">Overview</h2> <p>Fuzzing is a software testing technique that involves feeding a program randomly generated data to surface corner-case errors. There are different types of fuzzers:</p> <ol> <li>&ldquo;Black-box&rdquo; fuzzers, which run your program as-is, without any instrumentation or knowledge of its internals.</li> <li>&ldquo;Grey-box&rdquo; fuzzers, which instrument your program at compile time or run it inside QEMU (efficiently), gaining insight into its control flow.</li> </ol> <p>Grey-box fuzzers, also known as <em>feedback-driven</em> fuzzers, are generally more effective than black-box ones but require full access to the binary or source code. By leveraging control-flow knowledge, they can employ genetic algorithms for input mutation, save newly discovered paths, and explore your program step by step.</p> Understanding PostgreSQL internals, part 2: Executor (with JIT) https://foxido.dev/posts/postgresql-internal-2/ Mon, 09 Jun 2025 20:00:00 +0300 https://foxido.dev/posts/postgresql-internal-2/ <p>In the <a href="https://foxido.dev/posts/postgresql-internal-1/">previous part</a>, we discussed how PostgreSQL transforms declarative queries into imperative steps. 
Therefore, it seems logical to discuss the execution part next.</p> <h2 id="overview">Overview</h2> <p>A basic understanding of how PostgreSQL executes trees can be obtained from the following <a href="https://git.foxido.dev/archive/postgres/-/blob/REL_17_5/src/backend/executor/execProcnode.c#L20">source code</a> comment:</p> <div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#75715e">/* </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * NOTES </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * This used to be three files. It is now all combined into </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * one file so that it is easier to keep the dispatch routines </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * in sync when new nodes are added. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * EXAMPLE </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Suppose we want the age of the manager of the shoe department and </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * the number of employees in that department. 
So we have the query: </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * select DEPT.no_emps, EMP.age </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * from DEPT, EMP </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * where EMP.name = DEPT.mgr and </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * DEPT.name = &#34;shoe&#34; </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Suppose the planner gives us the following plan: </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Nest Loop (DEPT.mgr = EMP.name) </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * / \ </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * / \ </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Seq Scan Seq Scan </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * DEPT EMP </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * (name = &#34;shoe&#34;) </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * ExecutorStart() is called first. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * It calls InitPlan() which calls ExecInitNode() on </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * the root of the plan -- the nest loop node. 
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * * ExecInitNode() notices that it is looking at a nest loop and </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * as the code below demonstrates, it calls ExecInitNestLoop(). </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Eventually this calls ExecInitNode() on the right and left subplans </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * and so forth until the entire plan is initialized. The result </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * of ExecInitNode() is a plan state tree built with the same structure </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * as the underlying plan tree. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * * Then when ExecutorRun() is called, it calls ExecutePlan() which calls </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * ExecProcNode() repeatedly on the top node of the plan state tree. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Each time this happens, ExecProcNode() will end up calling </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * ExecNestLoop(), which calls ExecProcNode() on its subplans. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * Each of these subplans is a sequential scan so ExecSeqScan() is </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * called. 
The slots returned by ExecSeqScan() may contain </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * tuples which contain the attributes ExecNestLoop() uses to </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * form the tuples it returns. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * * Eventually ExecSeqScan() stops returning tuples and the nest </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * loop join ends. Lastly, ExecutorEnd() calls ExecEndNode() which </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * calls ExecEndNestLoop() which in turn calls ExecEndNode() on </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * its subplans which result in ExecEndSeqScan(). </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * This should show how the executor works by having </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * ExecInitNode(), ExecProcNode() and ExecEndNode() dispatch </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * their work to the appropriate node support routines which may </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> * in turn call these routines themselves on their subplans. </span></span></span><span style="display:flex;"><span><span style="color:#75715e"> */</span> </span></span></code></pre></div><p>In PostgreSQL, trees execute with a &lsquo;pull strategy&rsquo;: the plan is executed from top to bottom, and each node <em>pulls</em> content row by row from child nodes. 
As an alternative approach, we could try to execute queries from bottom to top, but among other drawbacks, this approach lacks laziness. For example, <code>select * from very_big_table limit 1</code> would scan the whole table and then discard all but one row instead of pulling just one row. You might argue that this can be optimized by pushing the <code>limit</code> operator into the scan node, but that doesn&rsquo;t work in general: consider a limit above a join. In that case, we can&rsquo;t predict how many rows we&rsquo;ll need from each table because we can&rsquo;t predict how many rows will find a join match. Therefore, the pull strategy is the most natural way to implement laziness, and its overhead can be reduced via batching (pulling multiple rows at once, though this is currently not implemented in PostgreSQL).</p> Understanding PostgreSQL internals, part 1: Planner https://foxido.dev/posts/postgresql-internal-1/ Wed, 04 Jun 2025 20:00:00 +0300 https://foxido.dev/posts/postgresql-internal-1/ <h2 id="introduction">Introduction</h2> <p>This is a short translation of <a href="https://git.foxido.dev/foxido/db-course">my homework</a> for an MIPT course, where I describe how PostgreSQL internals work, with references to the source code.</p> <p>All source code references are based on the <code>REL_17_5</code> PostgreSQL git tag. 
To build PostgreSQL, you can refer to the <a href="https://git.foxido.dev/foxido/db-course/-/blob/master/dockerbuild">dockerbuild directory</a>.</p> <p>Throughout this series, I assume that you have the following tables:</p> <div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t1(a int, b1 int); </span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t2(a int, b2 int); </span></span><span style="display:flex;"><span><span style="color:#66d9ef">CREATE</span> <span style="color:#66d9ef">TABLE</span> t3(a int, b3 int); </span></span><span style="display:flex;"><span> </span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t1 <span style="color:#66d9ef">SELECT</span> i, (i <span style="color:#f92672">*</span> <span style="color:#ae81ff">5</span>) <span style="color:#f92672">%</span> <span style="color:#ae81ff">7</span> <span style="color:#66d9ef">from</span> generate_series(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">10</span>) i; </span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span style="color:#66d9ef">INTO</span> t2 <span style="color:#66d9ef">SELECT</span> i <span style="color:#f92672">*</span> <span style="color:#ae81ff">2</span>, (i <span style="color:#f92672">*</span> <span style="color:#ae81ff">5</span>) <span style="color:#f92672">%</span> <span style="color:#ae81ff">7</span> <span style="color:#66d9ef">from</span> generate_series(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">10</span>) i; </span></span><span style="display:flex;"><span><span style="color:#66d9ef">INSERT</span> <span 
style="color:#66d9ef">INTO</span> t3 <span style="color:#66d9ef">SELECT</span> i <span style="color:#f92672">*</span> <span style="color:#ae81ff">3</span>, (i <span style="color:#f92672">*</span> <span style="color:#ae81ff">5</span>) <span style="color:#f92672">%</span> <span style="color:#ae81ff">7</span> <span style="color:#66d9ef">from</span> generate_series(<span style="color:#ae81ff">1</span>, <span style="color:#ae81ff">10</span>) i; </span></span><span style="display:flex;"><span> </span></span><span style="display:flex;"><span><span style="color:#66d9ef">VACUUM</span> (<span style="color:#66d9ef">ANALYZE</span>, <span style="color:#66d9ef">FULL</span>); </span></span></code></pre></div><h2 id="why-we-need-planner-at-all">Why do we need a planner at all?</h2> <p>Well, SQL is a declarative (not imperative) language: it specifies what we want from the database but not how to get it. Since only imperative steps can be executed, we need to transform these wishful queries into executable plans, and like any good developer, we want this transformation to be optimal.</p> [DRAFT] cutie.team announce https://foxido.dev/posts/cutie.team.announce/ Sun, 10 Nov 2024 21:13:00 +0300 https://foxido.dev/posts/cutie.team.announce/ <p>TODO</p> <p>Right now this page exists only as a redirect from cutie.team.</p> Partial VPN for docker-compose and ... email https://foxido.dev/posts/mail-setup-network/ Fri, 01 Nov 2024 21:13:00 +0300 https://foxido.dev/posts/mail-setup-network/ <p>To be honest, I put off writing this note for so long that some of the context is already lost&hellip; But I decided to write it anyway, as a reminder of my laziness and a lesson to write such things down right away. Well, and to gain a bit more experience, heh.</p> <p>First, the problem statement: set up my own email server, just for fun. 
My homelab consists of two machines connected by a WireGuard tunnel: a VPS acting as a gateway (nginx plus a couple of small services) and the main machine behind a private (NATed) IP address, which is where the mail server was set up.</p> Raspberry Pi + Archlinux ARM + btrfs https://foxido.dev/posts/rpi-archlinux-btrfs/ Sat, 28 May 2022 14:09:13 +0300 https://foxido.dev/posts/rpi-archlinux-btrfs/ Simple instructions for installing Arch Linux with a btrfs root partition instead of the default ext4. $ whoami https://foxido.dev/whoami/ Mon, 01 Jan 0001 00:00:00 +0000 https://foxido.dev/whoami/ <p>This page is currently being rewritten</p> <p>But I love system programming:</p> <ul> <li><a href="https://git.foxido.dev/foxido/nyacc-rs">Developed</a> a C-like compiler in Rust + LLVM.</li> <li><a href="https://git.foxido.dev/mipt-jos">Wrote</a> virtio-net and virtio-gpu drivers, including the necessary PCIe, socket, and file descriptor subsystems, as well as an SDL-like library, for an educational operating system that had zero infrastructure for this. Later, with other students, ported TinySSH and a Pong game to it.</li> <li>Wrote something similar to the Go runtime in C++: stackful coroutines, a thread pool, lock-free synchronization primitives</li> <li>Had fun with raytracing, small Unity games, FPV drones, self-hosting, simple stack CPU emulators and so on</li> </ul> <p>Formerly a database developer on Huawei’s GaussDB team; now part of Huawei’s Linux kernel team, working on memory management and RCU algorithms.</p>