<p><strong>On WALs (Write Ahead Logs) and fsync()</strong> (Sanjay Seenivasan, 2024-01-06, https://sjay05.github.io/2024/01/06/cwal)</p>
<p>This post will discuss <a href="https://github.com/sjay05/cwal">CWal</a>: A Write Ahead Log (WAL) implementation I wrote in C++ (tested on Ubuntu 22.04). Here is a direct link: <a href="https://github.com/sjay05/cwal">https://github.com/sjay05/cwal</a>.</p>
<p>In a database system, a write ahead log records operations sequentially in a log file, and each record is flushed to disk before the corresponding, more expensive database-wide commit is performed.</p>
<p>This preserves the <em>durability</em> of the data: after a power failure or system crash, the operations can be recovered from the log.</p>
<!---CWal is a simple C++ library that provides an append-only WAL interface with `read/write` operations. In order to preserve data-integrity, CWal also computes `CRC32` Checksums.--->
<h3>Log Entries:</h3>
<p>Log entries consist of a <code class="language-plaintext highlighter-rouge">byte_len</code> giving the size of <code class="language-plaintext highlighter-rouge">data</code> in bytes, the <code class="language-plaintext highlighter-rouge">data</code> itself, and a 4-byte CRC checksum value (<code class="language-plaintext highlighter-rouge">CRC_CHECKSUM</code>).</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">LogEntry</span> <span class="p">{</span>
  <span class="kt">uint64_t</span> <span class="n">byte_length</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">data</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>----|-------------------|--------------------|-----------------------|-----
... | uint64_t byte_len | const string* data | uint32_t CRC_CHECKSUM | ...
    | (8 bytes)         | (byte_len bytes)   | (4 bytes)             |
----|-------------------|--------------------|-----------------------|-----
</code></pre></div></div>
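<p>To make the layout concrete, here is a minimal sketch of how one entry could be serialized into that byte sequence. It is not CWal's actual code: the helper names <code class="language-plaintext highlighter-rouge">crc32</code>, <code class="language-plaintext highlighter-rouge">put_le</code> and <code class="language-plaintext highlighter-rouge">encode_entry</code> are hypothetical, and it assumes little-endian fields, the common zlib-style CRC-32 polynomial, and a checksum that covers only <code class="language-plaintext highlighter-rouge">data</code>.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the zlib/PNG variant).
// Assumption: CWal may use a different variant or a table-driven implementation.
std::uint32_t crc32(const std::string& bytes) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char c : bytes) {
    crc ^= c;
    for (int bit = 0; bit < 8; ++bit) {
      // If the low bit is set, shift and XOR with the polynomial; else just shift.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Appends `value` to `out` as `width` little-endian bytes (byte order is an assumption).
void put_le(std::string& out, std::uint64_t value, int width) {
  for (int i = 0; i < width; ++i) {
    out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
  }
}

// Serializes one entry as: byte_len (8 bytes) | data | CRC_CHECKSUM (4 bytes).
std::string encode_entry(const std::string& data) {
  std::string out;
  put_le(out, data.size(), 8);
  out += data;
  put_le(out, crc32(data), 4);
  assert(out.size() == 8 + data.size() + 4);
  return out;
}
```

<p>A reader would reverse the steps: read 8 bytes for the length, read that many bytes of data, then recompute the checksum and compare it with the stored 4 bytes to detect a torn or corrupted entry.</p>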
<!-- more -->
<h3>fsync() system call:</h3>
<p>On Linux, neither the C-style <code class="language-plaintext highlighter-rouge">write()</code> nor the C++ <code class="language-plaintext highlighter-rouge">std::basic_ostream<CharT,Traits>::write()</code> guarantees that data reaches the disk immediately, because writes first land in the page cache (also called the disk cache).</p>
<p>The page cache is held in RAM and exists to speed up frequent system calls that target the disk. Disk updates are therefore deferred: modified pages can stay in memory for a few seconds, absorbing further <code class="language-plaintext highlighter-rouge">read/write</code> calls, before the data is flushed to disk.</p>
<p>Data that exists only in the page cache is volatile, so this will <em>not</em> work for Write Ahead Logs. However, using the <code class="language-plaintext highlighter-rouge">fsync</code> system call, we can force the OS to flush a file's cached data to the disk.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fsync(2) - Linux Manual Page
NAME
fsync - synchronize a file's in-core state with that on disk
SYNOPSIS
#include <unistd.h>
int fsync(int fd);
int fdatasync(int fd);
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">fd</code> (file descriptor) may be obtained by opening the file with <code class="language-plaintext highlighter-rouge">open()</code>. <code class="language-plaintext highlighter-rouge">fsync()</code> will return once all buffers have been flushed to permanent storage, which achieves durable storage for the Write Ahead Log.</p>
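<p>As an illustration, a durable append with the POSIX calls named above might look like the sketch below. The function name <code class="language-plaintext highlighter-rouge">append_durably</code> is hypothetical and this is not CWal's implementation; it also skips details such as retrying on <code class="language-plaintext highlighter-rouge">EINTR</code>.</p>

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Appends `data` to the file at `path` and returns true only after fsync()
// has reported success for the written bytes.
bool append_durably(const char* path, const std::string& data) {
  int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
  if (fd < 0) return false;

  std::size_t written = 0;
  bool ok = true;
  while (ok && written < data.size()) {
    // write() may accept fewer bytes than requested, so loop until done.
    ssize_t n = write(fd, data.data() + written, data.size() - written);
    if (n < 0) {
      ok = false;
    } else {
      written += static_cast<std::size_t>(n);
    }
  }
  // write() only copies into the page cache; fsync() forces it out to storage.
  if (ok && fsync(fd) != 0) ok = false;
  if (close(fd) != 0) ok = false;
  return ok;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">append_durably("wal.log", entry)</code> once per entry corresponds to the most conservative policy: one <code class="language-plaintext highlighter-rouge">fsync()</code> per append.</p>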
<p>This leads to several questions. What is the exact cost of the <code class="language-plaintext highlighter-rouge">fsync()</code> syscall? Does every log append require an <code class="language-plaintext highlighter-rouge">fsync</code>, or can appends be batched up into blocks and committed when ready?</p>
<h3>Benchmarks</h3>
<p>CWal uses two types of <a href="https://github.com/sjay05/cwal/blob/master/tests/bench.cpp">benchmarks</a>. The first is <code class="language-plaintext highlighter-rouge">benchmark_write(const int LOG_LENGTH, const int LOG_SIZE, bool RFLUSH, bool SYNC, const int SYNC_PERIOD)</code>, which appends a total of <code class="language-plaintext highlighter-rouge">LOG_LENGTH</code> logs with data of <code class="language-plaintext highlighter-rouge">LOG_SIZE</code> bytes each.</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">RFLUSH</code> indicates if a routine flush will be performed after each append.</li>
<li><code class="language-plaintext highlighter-rouge">SYNC</code> indicates if a routine fsync will occur every <code class="language-plaintext highlighter-rouge">SYNC_PERIOD</code> operations.</li>
</ul>
<p>In particular, <code class="language-plaintext highlighter-rouge">benchmark_write</code> runs in $\mathcal{O}(\text{LOG_LENGTH} \cdot \text{LOG_SIZE})$ time. We set <code class="language-plaintext highlighter-rouge">LOG_LENGTH * LOG_SIZE ~= 1e6</code>.</p>
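<p>The effect of <code class="language-plaintext highlighter-rouge">SYNC_PERIOD</code> can be sketched as a loop that calls <code class="language-plaintext highlighter-rouge">fsync()</code> once per group of appends. This is an assumption about the shape of the benchmark, not a copy of it: the function name <code class="language-plaintext highlighter-rouge">append_with_sync_period</code> and the filler payload are hypothetical, and timing code is left out.</p>

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Writes `log_length` entries of `log_size` bytes to `path`, calling fsync()
// after every `sync_period` appends (plus once for a trailing partial group).
// Returns the number of fsync() calls made, or -1 on an I/O error.
int append_with_sync_period(const char* path, int log_length, int log_size,
                            int sync_period) {
  assert(log_size >= 0 && sync_period > 0);
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) return -1;

  const std::string payload(static_cast<std::size_t>(log_size), 'x');
  const ssize_t want = static_cast<ssize_t>(payload.size());
  bool ok = true;
  int syncs = 0;
  int pending = 0;  // appends not yet covered by an fsync()
  for (int i = 0; ok && i < log_length; ++i) {
    ok = write(fd, payload.data(), payload.size()) == want;
    if (ok && ++pending == sync_period) {
      ok = fsync(fd) == 0;
      ++syncs;
      pending = 0;
    }
  }
  if (ok && pending > 0) {  // make the final partial group durable too
    ok = fsync(fd) == 0;
    ++syncs;
  }
  if (close(fd) != 0) ok = false;
  return ok ? syncs : -1;
}
```

<p>With 10 appends and a period of 3, the sketch issues 4 syncs (after appends 3, 6 and 9, plus one for the last entry), so a larger period trades fewer syncs for a larger window of entries that a crash could lose.</p>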
<hr />
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Reg. Benchmark: 1000 entries | data_length = 1000 | Flush? No | Sync? No
==> 2931 ms | 2.93 ms/log
Reg. Benchmark: 1000 entries | data_length = 1000 | Flush? Yes | Sync? No
==> 3556 ms | 3.56 ms/log
Reg. Benchmark: 1000 entries | data_length = 1000 | Flush? No | Sync? Yes | SYNC_PERIOD = 1
==> 107601 ms | 107.60 ms/log
Reg. Benchmark: 1000 entries | data_length = 1000 | Flush? No | Sync? Yes | SYNC_PERIOD = 10
==> 63919 ms | 63.92 ms/log
</code></pre></div></div>
<p>It is clear that <code class="language-plaintext highlighter-rouge">fsync()</code> is very expensive: syncing after every append makes the run roughly $36$ times slower than not syncing at all ($107601$ ms versus $2931$ ms). However, it is unreasonable for a database to wait for all <code class="language-plaintext highlighter-rouge">1000</code> log entries to be flushed before its state is modified. So, we can try to batch the logs into segments of a set size.</p>
<p>At the end of each batch, CWal would start overwriting the previous logs, as this is more time-efficient than truncating the file. By then the database would also have committed its changes to the disk, so the previous logs would no longer be required.</p>
<p>The function <code class="language-plaintext highlighter-rouge">batched_sync_benchmarks(const int LOG_LENGTH, const int LOG_SIZE, const int BATCH_SIZE)</code> has one new argument, <code class="language-plaintext highlighter-rouge">BATCH_SIZE</code>. We stick with the same values of <code class="language-plaintext highlighter-rouge">LOG_LENGTH</code> and <code class="language-plaintext highlighter-rouge">LOG_SIZE</code>. Two values of <code class="language-plaintext highlighter-rouge">BATCH_SIZE</code> we can experiment with are $\sqrt{\text{LOG_LENGTH}}$ and $\sqrt[3]{\text{LOG_LENGTH}}$.</p>
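<p>One way to picture "overwrite instead of truncate" is to seek back to the start of the file once a batch has been synced. The sketch below is only a guess at the mechanism, with a hypothetical function name <code class="language-plaintext highlighter-rouge">write_in_batches</code>; it returns the final file size so the effect is observable.</p>

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Appends `log_length` entries of `log_size` bytes. After each full batch it
// calls fsync() and seeks back to offset 0, so the next batch overwrites the
// previous one. Returns the final file size in bytes, or -1 on an I/O error.
long long write_in_batches(const char* path, int log_length, int log_size,
                           int batch_size) {
  assert(log_size >= 0 && batch_size > 0);
  // No O_APPEND here: with O_APPEND every write goes to the end of the file
  // regardless of lseek(), which would defeat the rewind.
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0) return -1;

  const std::string payload(static_cast<std::size_t>(log_size), 'x');
  const ssize_t want = static_cast<ssize_t>(payload.size());
  bool ok = true;
  int in_batch = 0;
  for (int i = 0; ok && i < log_length; ++i) {
    ok = write(fd, payload.data(), payload.size()) == want;
    if (ok && ++in_batch == batch_size) {
      // Batch complete: make it durable, then rewind instead of truncating.
      ok = fsync(fd) == 0 && lseek(fd, 0, SEEK_SET) == 0;
      in_batch = 0;
    }
  }
  if (ok && in_batch > 0) ok = fsync(fd) == 0;

  struct stat st {};
  if (ok && fstat(fd, &st) != 0) ok = false;
  if (close(fd) != 0) ok = false;
  return ok ? static_cast<long long>(st.st_size) : -1;
}
```

<p>Because the file never grows beyond one batch, a reader needs some way to tell fresh entries from stale ones left over from an earlier batch; the per-entry checksum alone does not distinguish them, so a real design would need extra bookkeeping such as a sequence number.</p>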
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Batched Sync Benchmark: 1000 entries | data_length = 1000 | BATCH_SIZE = 31
==> 98265 ms | 98.27 ms/log
Batched Sync Benchmark: 1000 entries | data_length = 1000 | BATCH_SIZE = 10
==> 101750 ms | 101.75 ms/log
</code></pre></div></div>
<p>We can see slight improvements over the third regular benchmark (<code class="language-plaintext highlighter-rouge">SYNC_PERIOD = 1</code>).</p>
<p>Hence, with continued tweaking a user can dictate how often the write ahead log performs <code class="language-plaintext highlighter-rouge">fsync()</code> operations, and batch the logs appropriately.</p>
<h3>Extensions for CWal</h3>
<ul>
<li>Asynchronous write ahead logging</li>
<li>Copies of log files for redundancy</li>
<li>Experiment with memory mapped files for WalReader IO performance</li>
</ul>
<p><strong>Prefix Digits - An Outline</strong> (Sanjay Seenivasan, 2022-10-24, https://sjay05.github.io/2022/10/24/pdigit)</p>
<p>Problem Link: <a href="https://dmoj.ca/problem/pdigit">https://dmoj.ca/problem/pdigit</a></p>
<p>This post will discuss the solution for the problem linked above, which I created
for a mini-contest on DMOJ.</p>
<h3>Statement</h3>
<p>You are given two integers ~n~ and ~k~, and can perform operations to ~n~.</p>
<p>Each operation allows you to <em>prepend</em> a digit ~d~ ~(0 \le d \le 9)~ to ~n~,
and it is your task to determine if there exists a sequence of operations such
that ~n~ will end up being divisible by ~k~.</p>
<p>Note: ~n~ and ~k~ can be fairly large with bounds ~(1 \le n, k \le 10^9)~, and
you are required to answer ~t~ test cases.</p>
<!-- more -->
<h3>Subtask 1</h3>
<p>~1 \le k \le 9~</p>
<p>Since divisibility rules exist from ~1~ to ~9~, we can use logic to solve for
each case.</p>
<p>Note: This subtask doesn’t exist in the linked problem.</p>
<h3>Subtask 2</h3>
<p>~1 \le t \le 10^5~</p>
<p>~1 \le n, k \le 10^9~</p>
<h4>Step 1</h4>
<p>Since ~t~ can be ~10^5~, we are looking for an ~\mathcal{O}(T \cdot \log N)~ solution, that is, ~\mathcal{O}(T)~ with some form of log factor, unless this problem can be solved in constant time.</p>
<p>Next, notice that the integer ~n~, after say ~m~ operations, can also be represented with an equation. Suppose ~d_1, d_2, d_3, \dots, d_m~ are the digits of the final prefix, read from left to right.</p>
<p>All the operations together can be represented as prepending <em>one</em> integer with digits ~d_1, d_2, d_3, \dots, d_m~ to the front of ~n~.</p>
<p>So, let ~y~ be the <em>integer</em> with digits ~d_1, d_2, d_3, \dots, d_m~.</p>
<p><strong>Example</strong>: Prepend The Integer 23 to 45:</p>
<p>Our resultant value will have a length of ~2 + 2 = 4~, and we can picture this operation to be ~2300 + 45 = 2345~.</p>
<p>Notice that we create ~0~s in the positions where the digits of ~45~ will go. The number ~2300~ is created by multiplying ~23 \cdot 10^2~.</p>
<p><strong>Generalization</strong>: Prepend Integer ~y~ to ~n~:</p>
<p>Define the length of an integer to be ~\text{len}(x)~. For example, ~\text{len}(342) = 3~.</p>
<p>The new integer ~n~, with ~y~ prepended is:</p>
\[10^{\text{len}(n)} \cdot y + n\]
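<p>The formula translates directly into code. The sketch below is illustrative only: <code class="language-plaintext highlighter-rouge">len</code> mirrors the definition above, <code class="language-plaintext highlighter-rouge">prepend</code> is a hypothetical helper name, and the result is assumed to fit in 64 bits.</p>

```cpp
#include <cassert>
#include <cstdint>

// Number of decimal digits of x; len(342) == 3, and len(0) is taken to be 1.
int len(std::uint64_t x) {
  int digits = 1;
  while (x >= 10) {
    x /= 10;
    ++digits;
  }
  return digits;
}

// Returns 10^len(n) * y + n, i.e. the digits of y followed by the digits of n.
// Assumes the result fits in an unsigned 64-bit integer.
std::uint64_t prepend(std::uint64_t y, std::uint64_t n) {
  std::uint64_t shift = 1;
  for (int i = 0; i < len(n); ++i) {
    shift *= 10;  // builds 10^len(n)
  }
  assert(y == 0 || shift <= (UINT64_MAX - n) / y);  // guard against overflow
  return shift * y + n;
}
```

<p>For the example above, <code class="language-plaintext highlighter-rouge">prepend(23, 45)</code> computes ~10^2 \cdot 23 + 45 = 2345~.</p>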
<h4>Step 2</h4>
<p>How do we represent that a number ~n~ is divisible by ~k~ with an equation? We can write this as ~n \equiv 0 \pmod k~, where ~n~ is <em>congruent</em> to ~0 \pmod k~.</p>
<p>Since we figured out how to represent the final value of ~n~, our congruence is:</p>
\[10^{\text{len}(n)} \cdot y + n \equiv 0 \pmod k\]
<p>Our linear congruence has the form ~ax \equiv b \pmod m~, where ~a = 10^{\text{len}(n)}~, ~x = y~, ~b = -n~ and ~m = k~.</p>
<p>Since the problem asks for a <code class="language-plaintext highlighter-rouge">YES</code> or <code class="language-plaintext highlighter-rouge">NO</code> answer on whether a sequence of operations exists, this is the same as asking whether the congruence has any solution.</p>
<p>A congruence of the form ~ax \equiv b \pmod m~, has a solution when ~\text{gcd}(a, m)~ is a divisor of ~b~.</p>
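<p>Applied to this problem with ~a = 10^{\text{len}(n)}~, ~b = -n~ and ~m = k~, the criterion can be checked in a few lines. This is a minimal sketch rather than the reference solution; the function names are hypothetical. Because ~\gcd(a, m)~ divides ~k~, testing whether it divides ~-n \bmod k~ is the same as testing whether it divides ~n~.</p>

```cpp
#include <cassert>
#include <cstdint>

// Euclid's algorithm.
std::uint64_t gcd_u64(std::uint64_t a, std::uint64_t b) {
  while (b != 0) {
    std::uint64_t r = a % b;
    a = b;
    b = r;
  }
  return a;
}

// True if some non-negative integer y satisfies 10^len(n) * y + n == 0 (mod k).
// Intended for 1 <= n, k <= 10^9, so 10^len(n) <= 10^10 fits in 64 bits.
bool can_make_divisible(std::uint64_t n, std::uint64_t k) {
  assert(n >= 1 && k >= 1);
  std::uint64_t pow10 = 1;  // becomes 10^len(n)
  for (std::uint64_t t = n; t > 0; t /= 10) {
    pow10 *= 10;
  }
  // a*x == b (mod m) is solvable iff gcd(a, m) divides b; here b = -n.
  return n % gcd_u64(pow10, k) == 0;
}
```

<p>For example, ~n = 2~, ~k = 4~ gives ~\gcd(10, 4) = 2~, which divides ~2~, and indeed ~12~ is divisible by ~4~; for ~n = 25~, ~k = 4~ the gcd is ~4~, which does not divide ~25~, so no prefix helps.</p>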
<p>Therefore we output <code class="language-plaintext highlighter-rouge">YES</code> when ~\gcd(10^{\text{len}(n)}, k)~ is a divisor of ~-n \bmod k~, and <code class="language-plaintext highlighter-rouge">NO</code> otherwise.</p>
<p><strong>AAC1 P5 - Odd Alpacas</strong> (Sanjay Seenivasan, 2021-07-01, https://sjay05.github.io/2021/07/01/aac1p5)</p>
<p>Problem Link: <a href="https://dmoj.ca/problem/aac1p5">dmoj.ca/problem/aac1p5</a></p>
<p>This post will discuss the solution for the problem linked above. I created
this problem along with <a href="https://dmoj.ca/user/samliu12">Sam Liu</a> for <a href="https://dmoj.ca/contest/aac1">Animal
Contest 1</a> on DMOJ.</p>
<p>Statistics:</p>
<ul>
<li>Served as P5 of a 6-problem set.</li>
<li>~9~ correct submissions during contest.</li>
<li>~29.33\%~ AC rate (including subtasks).</li>
</ul>
<h3>Statement</h3>
<p>You are given a tree of ~N~ nodes and ~N - 1~ weighted edges
connecting ~u_i~ and ~v_i~ with weight ~w_i~ for ~1 \le i \le N - 1~.</p>
<!-- more -->
<p>Let the “length” of a path ~(x, y)~ be the sum of weights
on the edges from node ~x~ to node ~y~.</p>
<p>Let ~x~ be the number of even length paths, and ~y~ be the number
of odd length paths.</p>
<p>By changing the weight of one edge, minimize ~|x - y|~.</p>
<p><strong>Note</strong>: You are allowed to modify ~0~ edges.</p>
<h3>Subtask 1</h3>
<p>The constraint ~1 \le N \le 200~ was set on purpose to allow
brute force solutions to pass for ~10\%~ of points.</p>
<p>First, notice how the modification of an edge modifies a path
length.</p>
<p>Suppose a path was defined of the following weight parities:</p>
\[\text{len} = \text{odd} + \text{even} + \text{odd} + \text{even}\]
<p>The parity of ~\text{len}~ would only change if one of the 4 parities
also changed. This is either changing an ~\text{odd}~ to an ~\text{even}~ or the
other way around.</p>
<p>Hence, for this subtask we can simulate changing the parity for each edge.
Once that is done, how can we find ~x~ and ~y~?</p>
<p>For each node ~v~ ~(1 \le v \le N)~, run a dfs on an assumption that ~v~ is an
endpoint on a path. Create a distance array maintaining parity and ~|x - y|~
can be found easily.</p>
<p><strong>Time Complexity</strong>: ~\mathcal{O}(N^3)~</p>
<h4>Code Snippets</h4>
<hr />
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">Dfs</span><span class="p">(</span><span class="kt">int</span> <span class="n">v</span><span class="p">,</span> <span class="kt">int</span> <span class="n">pr</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span><span class="o">&</span> <span class="n">p</span> <span class="o">:</span> <span class="n">g</span><span class="p">[</span><span class="n">v</span><span class="p">])</span> <span class="p">{</span>
    <span class="kt">int</span> <span class="n">to</span> <span class="o">=</span> <span class="n">p</span><span class="p">.</span><span class="n">first</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">w</span> <span class="o">=</span> <span class="n">p</span><span class="p">.</span><span class="n">second</span><span class="p">;</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">to</span> <span class="o">==</span> <span class="n">pr</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">continue</span><span class="p">;</span>  <span class="c1">// do not walk back to the parent</span>
    <span class="p">}</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">min</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span> <span class="o">==</span> <span class="n">mod_x</span> <span class="o">&&</span> <span class="n">max</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span> <span class="o">==</span> <span class="n">mod_y</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">w</span> <span class="o">^=</span> <span class="mi">1</span><span class="p">;</span>  <span class="c1">// this is the modified edge: flip its parity</span>
    <span class="p">}</span>
    <span class="n">dist</span><span class="p">[</span><span class="n">to</span><span class="p">]</span> <span class="o">=</span> <span class="n">dist</span><span class="p">[</span><span class="n">v</span><span class="p">]</span> <span class="o">+</span> <span class="n">w</span><span class="p">;</span>
    <span class="n">Dfs</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="n">v</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If <code class="language-plaintext highlighter-rouge">mod_x</code> and <code class="language-plaintext highlighter-rouge">mod_y</code> are the endpoints of the edge we are modifying, we can
check for that edge during the traversal and do <code class="language-plaintext highlighter-rouge">w ^= 1</code> to switch its parity.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="n">x</span> <span class="o">=</span> <span class="n">get</span><span class="o"><</span><span class="mi">0</span><span class="o">></span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
<span class="kt">int</span> <span class="n">y</span> <span class="o">=</span> <span class="n">get</span><span class="o"><</span><span class="mi">1</span><span class="o">></span><span class="p">(</span><span class="n">e</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">x</span> <span class="o">></span> <span class="n">y</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">swap</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">mod_x</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span>
<span class="n">mod_y</span> <span class="o">=</span> <span class="n">y</span><span class="p">;</span>
<span class="p">{</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">odd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">even</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">dist</span><span class="p">.</span><span class="n">assign</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
    <span class="n">Dfs</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">);</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">j</span> <span class="o"><</span> <span class="n">n</span><span class="p">;</span> <span class="n">j</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">==</span> <span class="n">j</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">continue</span><span class="p">;</span>
      <span class="p">}</span>
      <span class="n">odd</span> <span class="o">+=</span> <span class="p">(</span><span class="n">dist</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span><span class="p">);</span>
      <span class="n">even</span> <span class="o">+=</span> <span class="p">(</span><span class="n">dist</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="n">odd</span> <span class="o">/=</span> <span class="mi">2</span><span class="p">;</span>
  <span class="n">even</span> <span class="o">/=</span> <span class="mi">2</span><span class="p">;</span>
  <span class="n">ans</span> <span class="o">=</span> <span class="n">min</span><span class="p">(</span><span class="n">ans</span><span class="p">,</span> <span class="n">abs</span><span class="p">(</span><span class="n">odd</span> <span class="o">-</span> <span class="n">even</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>
<p>For each edge ~(x, y)~, we can set these as <code class="language-plaintext highlighter-rouge">mod_x</code> and <code class="language-plaintext highlighter-rouge">mod_y</code>
and run a dfs for each node from ~1~ to ~N~.</p>
<p><strong>Note</strong>: This implementation uses 0-based indexing. Hence the
nodes are labeled from ~0~ to ~N - 1~.</p>
<h3>Subtask 2</h3>
<p>Constraints in this subtask (~1 \le N \le 2 \times 10^3~) were set
to allow for a more optimized brute force to pass.</p>
<p>If ~N = 2 \times 10^3~, an ~\mathcal{O}(N^2)~ algorithm with about
~4 \times 10^6~ operations will pass.</p>
<p>We can draw inspiration from a very common ~\text{LCA}~ property
used to find distance between two nodes in a tree:</p>
<p>If ~\text{dist}[x]~ is the distance from the root (~1~):</p>
\[\text{dist}(x, y) = \text{dist}[x] + \text{dist}[y] - 2 \times \text{dist}[\text{lca}(x, y)]\]
<p>Notice that any number (odd or even), when multiplied by ~2~, always
gives an even result. Since ~2 \times \text{dist}[\text{lca}(x, y)]~
is always even, the parity of ~\text{dist}(x, y)~ is determined
by ~\text{dist}[x]~ and ~\text{dist}[y]~.</p>
<p>So:</p>
<ul>
<li>If ~\text{dist}[x] + \text{dist}[y]~ is odd, ~\text{dist}(x, y)~ will be odd.</li>
<li>If ~\text{dist}[x] + \text{dist}[y]~ is even, ~\text{dist}(x, y)~ will be even.</li>
</ul>
<p>Let ~\alpha = \text{dist}[x] + \text{dist}[y]~.</p>
<p>Now, we want to be able to count the even and odd paths (the ~x~ and ~y~ of the statement) in ~\mathcal{O}(N)~ time, since
we are trying each of the ~N - 1~ edges.</p>
<p>To count all paths with ~\alpha \equiv 1 \pmod 2~, we can multiply the number of
nodes ~v~ with odd distance by the number of nodes with even distance. Let this result
be ~\text{odd}~.</p>
<p>For the other case ~\alpha \equiv 0 \pmod 2~, we can subtract the number of odd paths
from the total number of paths: (~\frac{n \cdot (n - 1)}{2} - \text{odd}~).</p>
<p><strong>Time Complexity</strong>: ~\mathcal{O}(N^2)~</p>
<h4>Code Snippets</h4>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="kt">long</span> <span class="n">odd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">even</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">odd</span> <span class="o">+=</span> <span class="p">(</span><span class="n">dist</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span><span class="p">);</span>
  <span class="n">even</span> <span class="o">+=</span> <span class="p">(</span><span class="n">dist</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">o_cnt</span> <span class="o">=</span> <span class="n">odd</span> <span class="o">*</span> <span class="n">even</span><span class="p">;</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">e_cnt</span> <span class="o">=</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">-</span> <span class="n">o_cnt</span><span class="p">;</span>
<span class="n">ans</span> <span class="o">=</span> <span class="n">min</span><span class="p">(</span><span class="n">ans</span><span class="p">,</span> <span class="n">abs</span><span class="p">(</span><span class="n">o_cnt</span> <span class="o">-</span> <span class="n">e_cnt</span><span class="p">));</span>
</code></pre></div></div>
<p>For each edge ~i~ (~1 \le i \le N - 1~), we run a DFS and calculate
~|x - y|~ like so.</p>
<h3>Subtask 3</h3>
<p>For the final subtask with ~(1 \le N \le 2 \times 10^5)~, an ~\mathcal{O}(N)~
algorithm must be derived.</p>
<p>The intended solution still goes on the assumption that all edges must be tried,
but has a constant-time way of finding ~x~ and ~y~ for each edge.</p>
<p>Define an “odd” node to be an arbitrary node ~v~ such that ~\text{dist}[v] \equiv 1 \pmod 2~.</p>
<p>Define an “even” node to be an arbitrary node ~v~ such that ~\text{dist}[v] \equiv 0 \pmod 2~.</p>
<p>Suppose the parity of an edge ~(x, y)~ is changed. What paths are affected by this edge?</p>
<p>We can make the claim that:</p>
<ul>
<li>Any path ~(u, v)~ passing through edge ~(x, y)~ will have exactly one of ~u~ or ~v~ in the subtree
below edge ~(x, y)~.</li>
</ul>
<p align="center">
<img src="/assets/images/aac1p5_fig1.jpg" style="height: 500px; width: 500px; text-align: center;" />
</p>
<p>Take the tree above: if the edge ~(3, 5)~ (highlighted in green) were modified, notice
that the grey, red, and blue paths all have an end vertex in the subtree of ~(3, 5)~.</p>
<p>All even nodes in this subtree swap to odd nodes and vice versa,
because the path from the root to any node in the subtree passes through the modified edge.</p>
<p>Hence, if we keep a counter for the number of odd and even nodes in each subtree, when the
time comes to modify an edge ~(x, y)~ we can swap the odd and even nodes appropriately and
calculate ~x~ and ~y~ with the formula described in Subtask 2.</p>
<p><strong>Time Complexity</strong>: ~\mathcal{O}(N)~</p>
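<p>The per-subtree counters can be gathered in a single traversal. The sketch below puts the whole idea together under a few assumptions of its own: nodes are 0-indexed with node ~0~ as the root, edges are given as <code class="language-plaintext highlighter-rouge">{u, v, w}</code> triples, the traversal is iterative to avoid deep recursion, and the name <code class="language-plaintext highlighter-rouge">min_parity_gap</code> is hypothetical. It is not the reference solution.</p>

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// Returns the minimum |x - y| over "change nothing" and "flip one edge's parity".
long long min_parity_gap(int n, const std::vector<std::array<int, 3>>& edges) {
  assert(n >= 1 && static_cast<int>(edges.size()) == n - 1);
  std::vector<std::vector<std::pair<int, int>>> g(n);
  for (const auto& e : edges) {
    g[e[0]].push_back({e[1], e[2] & 1});  // only the parity of the weight matters
    g[e[1]].push_back({e[0], e[2] & 1});
  }

  // Preorder traversal from the root: parity of dist[v], parent, visit order.
  std::vector<int> parity(n, 0), parent(n, -1), order, stack = {0};
  while (!stack.empty()) {
    int v = stack.back();
    stack.pop_back();
    order.push_back(v);
    for (const auto& p : g[v]) {
      if (p.first == parent[v]) continue;
      parent[p.first] = v;
      parity[p.first] = parity[v] ^ p.second;
      stack.push_back(p.first);
    }
  }

  // Children appear after parents in `order`, so a reverse sweep sums subtrees.
  std::vector<long long> odd(n, 0), even(n, 0);
  for (int i = n - 1; i >= 0; --i) {
    int v = order[i];
    (parity[v] ? odd[v] : even[v]) += 1;
    if (parent[v] >= 0) {
      odd[parent[v]] += odd[v];
      even[parent[v]] += even[v];
    }
  }

  const long long total = 1LL * n * (n - 1) / 2;
  // |odd paths - even paths| when there are `o` odd nodes in the whole tree.
  auto gap = [&](long long o) {
    long long odd_paths = o * (n - o);
    return std::llabs(odd_paths - (total - odd_paths));
  };
  long long best = gap(odd[0]);
  for (int v = 1; v < n; ++v) {
    // Flipping edge (parent[v], v) swaps odd and even nodes inside v's subtree.
    best = std::min(best, gap(odd[0] - odd[v] + even[v]));
  }
  return best;
}
```

<p>On a path of three nodes with two even edges, all three paths start out even (gap ~3~); flipping either edge leaves one even and two odd paths, so the sketch returns ~1~.</p>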
<h4>Code Snippets</h4>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="kt">long</span> <span class="n">ov</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">ev</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">n</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">ov</span> <span class="o">+=</span> <span class="p">(</span><span class="n">d</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">1</span><span class="p">);</span>
  <span class="n">ev</span> <span class="o">+=</span> <span class="p">(</span><span class="n">d</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">o_cnt</span> <span class="o">=</span> <span class="n">ov</span> <span class="o">*</span> <span class="n">ev</span><span class="p">;</span>
<span class="c1">// 1LL forces 64-bit arithmetic in case n is an int (n * (n - 1) can exceed 2^31).</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">e_cnt</span> <span class="o">=</span> <span class="mi">1LL</span> <span class="o">*</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">-</span> <span class="n">o_cnt</span><span class="p">;</span>
<span class="kt">long</span> <span class="kt">long</span> <span class="n">ans</span> <span class="o">=</span> <span class="n">abs</span><span class="p">(</span><span class="n">o_cnt</span> <span class="o">-</span> <span class="n">e_cnt</span><span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="k">auto</span><span class="o">&</span> <span class="n">e</span> <span class="o">:</span> <span class="n">es</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">;</span>
  <span class="n">tie</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="o">=</span> <span class="n">e</span><span class="p">;</span>
  <span class="c1">// Make x the deeper endpoint, i.e. the root of the subtree below the edge.</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">dep</span><span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o"><</span> <span class="n">dep</span><span class="p">[</span><span class="n">y</span><span class="p">])</span> <span class="p">{</span>
    <span class="n">swap</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">o_aux</span> <span class="o">=</span> <span class="n">ov</span> <span class="o">-</span> <span class="n">odd</span><span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">+</span> <span class="n">even</span><span class="p">[</span><span class="n">x</span><span class="p">];</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">e_aux</span> <span class="o">=</span> <span class="n">ev</span> <span class="o">-</span> <span class="n">even</span><span class="p">[</span><span class="n">x</span><span class="p">]</span> <span class="o">+</span> <span class="n">odd</span><span class="p">[</span><span class="n">x</span><span class="p">];</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">new_o_cnt</span> <span class="o">=</span> <span class="n">o_aux</span> <span class="o">*</span> <span class="n">e_aux</span><span class="p">;</span>
  <span class="kt">long</span> <span class="kt">long</span> <span class="n">new_e_cnt</span> <span class="o">=</span> <span class="mi">1LL</span> <span class="o">*</span> <span class="n">n</span> <span class="o">*</span> <span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">-</span> <span class="n">new_o_cnt</span><span class="p">;</span>
  <span class="n">ans</span> <span class="o">=</span> <span class="n">min</span><span class="p">(</span><span class="n">ans</span><span class="p">,</span> <span class="n">abs</span><span class="p">(</span><span class="n">new_o_cnt</span> <span class="o">-</span> <span class="n">new_e_cnt</span><span class="p">));</span>
<span class="p">}</span>
<span class="n">cout</span> <span class="o"><<</span> <span class="n">ans</span> <span class="o"><<</span> <span class="sc">'\n'</span><span class="p">;</span>
</code></pre></div></div>