<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>Pernosco</title>
    <link href="https://pernos.co/blog/atom.xml" rel="self" type="application/atom+xml"/>
    <link href="https://pernos.co/blog/"/>
    <generator uri="https://www.getzola.org/">Zola</generator>
    <updated>2025-01-27T00:00:00+00:00</updated>
    <id>https://pernos.co/blog/atom.xml</id>
    <entry xml:lang="en">
        <title>Tackling the C++ debugging UI nightmare with Pernosco</title>
        <published>2025-01-27T00:00:00+00:00</published>
        <updated>2025-01-27T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/debugging-cpp/" type="text/html"/>
        <id>https://pernos.co/blog/debugging-cpp/</id>
        
        <summary type="html">&lt;p&gt;The full unambiguous names of C++ functions and variables include type names and namespace names, which makes them verbose. This is especially true with heavy use of templates because the type names include template parameter names (recursively). The challenge for a debugger (and other tools) is to display names that convey enough information to the user without overwhelming the UI. Pernosco tackles this by abbreviating names but making the abbreviations &lt;em&gt;interactive&lt;&#x2F;em&gt;.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>New features in Linux 6.10 contributed by Pernosco</title>
        <published>2024-09-19T00:00:00+00:00</published>
        <updated>2024-09-19T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/linux-kernel-additions/" type="text/html"/>
        <id>https://pernos.co/blog/linux-kernel-additions/</id>
        
        <summary type="html">&lt;p&gt;The Linux 6.10 kernel release contains two new features in the perf event subsystem contributed by Pernosco. These features are intended to benefit rr (and thus Pernosco), but they also have broader applications if adopted by other software. In this blog post I will discuss what we added, why it benefits rr, and the broader applications it could have.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>ELF symbol interposition and RTLD_LOCAL</title>
        <published>2022-07-19T00:00:00+00:00</published>
        <updated>2022-07-19T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/interposition-rtld-local/" type="text/html"/>
        <id>https://pernos.co/blog/interposition-rtld-local/</id>
        
        <summary type="html">&lt;p&gt;You may be familiar with &amp;quot;&lt;a href=&quot;http:&#x2F;&#x2F;www.goldsborough.me&#x2F;c&#x2F;low-level&#x2F;kernel&#x2F;2016&#x2F;08&#x2F;29&#x2F;16-48-53-the_-ld_preload-_trick&#x2F;&quot;&gt;the LD_PRELOAD trick&lt;&#x2F;a&gt;&amp;quot;. This &amp;quot;trick&amp;quot; is used to implement things like &lt;a href=&quot;https:&#x2F;&#x2F;manpages.ubuntu.com&#x2F;manpages&#x2F;jammy&#x2F;man1&#x2F;heaptrack.1.html&quot;&gt;heaptrack&lt;&#x2F;a&gt;. By interposing a third library between an application and libc&#x27;s malloc&#x2F;free you can track the state of the heap and recognize errors like double frees and memory leaks. But this doesn&#x27;t work for libraries loaded with RTLD_LOCAL, which is the default behavior of &lt;a href=&quot;https:&#x2F;&#x2F;linux.die.net&#x2F;man&#x2F;3&#x2F;dlopen&quot;&gt;dlopen&lt;&#x2F;a&gt;. Why not? Let&#x27;s look at how this sort of linking works normally first, and then we can figure out why it goes wrong with RTLD_LOCAL.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Shrink Your Compile Times With Split DWARF</title>
        <published>2021-12-21T00:00:00+00:00</published>
        <updated>2021-12-21T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/split-dwarf/" type="text/html"/>
        <id>https://pernos.co/blog/split-dwarf/</id>
        
        <summary type="html">&lt;p&gt;What if you could reduce the time it takes to link your program by 25%, reduce the memory it takes to link your program by 40%, and reduce the size of the binary by 50%, all by changing a compiler flag? That&#x27;s the power of &amp;quot;split DWARF&amp;quot;, a compiler and debugger feature that uses a new format for the DWARF debugging information that&#x27;s specifically designed to reduce the work the linker is required to do. Let&#x27;s dive into how it works and what is required for you to benefit from it.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Automatic Downcasting: How Does It Work?</title>
        <published>2021-12-14T00:00:00+00:00</published>
        <updated>2021-12-14T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/downcasting/" type="text/html"/>
        <id>https://pernos.co/blog/downcasting/</id>
        
        <summary type="html">&lt;p&gt;Many programming languages include mechanisms for &lt;a href=&quot;https:&#x2F;&#x2F;blog.feabhas.com&#x2F;2010&#x2F;05&#x2F;polymorphism-in-c&#x2F;&quot;&gt;dynamic polymorphism&lt;&#x2F;a&gt;. These pose challenges for debuggers, because viewing only fields from the declared type of a variable may not be particularly useful. Automatically deducing the most-&amp;quot;derived&amp;quot; type and downcasting to it presents the entire object to developers and makes debugging code that uses dynamic polymorphism much more pleasant. Our Pernosco Omniscient Debugger automatically downcasts types that use dynamic polymorphism in supported languages (C++, Rust, and Ada). You might also be familiar with this technique in gdb via the &lt;code&gt;set print object on&lt;&#x2F;code&gt; command. But how is it actually implemented?&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Making Debuggers Sad: C++ Identifier Canonicalization</title>
        <published>2021-11-24T00:00:00+00:00</published>
        <updated>2021-11-24T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/canonicalized-identifiers/" type="text/html"/>
        <id>https://pernos.co/blog/canonicalized-identifiers/</id>
        
        <content type="html">&lt;p&gt;Why do debuggers like gdb take so long to start up on large programs? There are many reasons, but one surprising reason is that gdb spends significant amounts of time &lt;a href=&quot;https:&#x2F;&#x2F;gcc.gnu.org&#x2F;bugzilla&#x2F;show_bug.cgi?id=94845#c6&quot;&gt;parsing C++ identifiers and re-emitting them into a canonical form&lt;&#x2F;a&gt;. This is due to deficiencies in clang++ and g++ (and, arguably, DWARF) — but not everyone agrees. The underlying reasons also apply to Pernosco so we have implemented something similar, although we&#x27;re able to hide the startup impact more effectively by folding it into our &amp;quot;build the big database of everything&amp;quot; step.&lt;&#x2F;p&gt;
&lt;p&gt;Suppose a debugger user wants to evaluate the value of the variable &lt;code&gt;Foo&amp;lt;short&amp;gt;::FOO&lt;&#x2F;code&gt;, where &lt;code&gt;Foo&lt;&#x2F;code&gt; is declared with&lt;&#x2F;p&gt;
&lt;pre&gt;&lt;code&gt;template &amp;lt;typename T&amp;gt; struct Foo {
    enum Enum { FOO = 1 };
};
&lt;&#x2F;code&gt;&lt;&#x2F;pre&gt;
&lt;p&gt;The DWARF debuginfo for this program contains a &lt;code&gt;DW_TAG_structure_type&lt;&#x2F;code&gt; for &lt;code&gt;Foo&amp;lt;short&amp;gt;&lt;&#x2F;code&gt;, which contains debuginfo for the &lt;code&gt;Enum&lt;&#x2F;code&gt; enum and its values. We&#x27;ll have to search for this &lt;code&gt;DW_TAG_structure_type&lt;&#x2F;code&gt; by name. Unfortunately, the &lt;code&gt;DW_AT_name&lt;&#x2F;code&gt; in the debuginfo produced by gcc 9 is not &lt;code&gt;Foo&amp;lt;short&amp;gt;&lt;&#x2F;code&gt; — it&#x27;s &lt;code&gt;Foo&amp;lt;short int&amp;gt;&lt;&#x2F;code&gt; — so we may not find the type with a naive search 😞.&lt;&#x2F;p&gt;
&lt;p&gt;The basic problem here is that there are many valid ways of writing the same template parameters, so the user might write them differently from the way the compiler emitted them. This applies not just to template parameters that are types, but also to those that are values: given &lt;code&gt;template &amp;lt;unsigned long V&amp;gt; struct Bar { ... }&lt;&#x2F;code&gt;,
the compiler might emit a type named &lt;code&gt;Bar&amp;lt;1UL&amp;gt;&lt;&#x2F;code&gt; (as clang++ 12 does), while the user enters &lt;code&gt;Bar&amp;lt;1&amp;gt;&lt;&#x2F;code&gt;.&lt;&#x2F;p&gt;
&lt;p&gt;The only general solution here is for the debugger to parse all the C++ type names that contain template parameters and store them in a canonical form. When a user enters a type name, it is canonicalized using the same algorithm, so a match will be found if one exists. E.g. in the above examples the debugger could canonicalize the DWARF names &lt;code&gt;Foo&amp;lt;short int&amp;gt;&lt;&#x2F;code&gt; and &lt;code&gt;Bar&amp;lt;1UL&amp;gt;&lt;&#x2F;code&gt; to &lt;code&gt;Foo&amp;lt;short&amp;gt;&lt;&#x2F;code&gt; and &lt;code&gt;Bar&amp;lt;1&amp;gt;&lt;&#x2F;code&gt; respectively and use the latter for lookup. This requires parsing C++ type syntax, which is nasty, but the debugger already needs to do this to handle various forms of user input, so it&#x27;s not a new problem. Potentially parsing many gigabytes of C++ symbols does subject the parser to increased performance stress, however.&lt;&#x2F;p&gt;
&lt;p&gt;There are situations where it gets very difficult or impossible to correctly parse C++ type syntax outside the context of a compilation unit, but let&#x27;s studiously ignore that.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;interaction-with-demangling&quot;&gt;Interaction with demangling&lt;&#x2F;h2&gt;
&lt;p&gt;C++ entities with &amp;quot;linkage&amp;quot;, i.e. functions and variables, are assigned &lt;a href=&quot;https:&#x2F;&#x2F;en.wikipedia.org&#x2F;wiki&#x2F;Name_mangling&quot;&gt;mangled names&lt;&#x2F;a&gt; in their binaries. Debuggers demangle these into fully qualified human-readable names. We take advantage of this in Pernosco by ensuring that the demangler always produces names in our canonical form (via options passed to &lt;a href=&quot;https:&#x2F;&#x2F;crates.io&#x2F;crates&#x2F;cpp_demangle&quot;&gt;cpp_demangle&lt;&#x2F;a&gt;). This greatly reduces the number of C++ identifiers we would otherwise have to parse and canonicalize.&lt;&#x2F;p&gt;
&lt;p&gt;BTW you would hope that GNU&#x27;s &lt;code&gt;c++filt&lt;&#x2F;code&gt; demangler at least produces names that are consistent with the names gcc emits into debuginfo, but &lt;a href=&quot;https:&#x2F;&#x2F;gcc.gnu.org&#x2F;bugzilla&#x2F;show_bug.cgi?id=94845&quot;&gt;it does not&lt;&#x2F;a&gt;. Likewise &lt;code&gt;llvm-cxxfilt&lt;&#x2F;code&gt; produces names inconsistent with clang++.&lt;&#x2F;p&gt;
&lt;h2 id=&quot;ideal-solution&quot;&gt;Ideal solution&lt;&#x2F;h2&gt;
&lt;p&gt;Ideally the text serialization of C++ names would be standardized, gcc and clang++ would produce standard names in their debuginfo, their demanglers would also emit standard names (at least when the right options are set), and debuggers could detect that this has been done and avoid a lot of work. That isn&#x27;t likely to ever happen; as far as I can tell, compiler maintainers don&#x27;t think that the current situation is a problem.&lt;&#x2F;p&gt;
&lt;p&gt;A slightly less ideal approach that would still be a big improvement is the same thing I suggested for &lt;a href=&quot;https:&#x2F;&#x2F;pernos.co&#x2F;blog&#x2F;structured-identifiers&#x2F;&quot;&gt;structured identifiers&lt;&#x2F;a&gt;: making debuginfo include mangled names for all C++ types. This would let us rely on demangling instead of having to parse C++ type syntax in debuginfo names. But my guess is this won&#x27;t happen either.&lt;&#x2F;p&gt;
&lt;p&gt;So it looks like we&#x27;ll just have to get good at parsing gigabytes of C++ type syntax really fast.&lt;&#x2F;p&gt;
</content>
        
    </entry>
    <entry xml:lang="en">
        <title>Where Should the Debugger Set a Breakpoint?</title>
        <published>2021-11-09T00:00:00+00:00</published>
        <updated>2021-11-09T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/function-prologues/" type="text/html"/>
        <id>https://pernos.co/blog/function-prologues/</id>
        
        <summary type="html">&lt;script src=&quot;&#x2F;demo-data&#x2F;blog-prologues.js&quot; defer&gt;&lt;&#x2F;script&gt;
&lt;p&gt;When you search for executions of a function in Pernosco (or set a breakpoint in a more traditional debugger), how does the debugger decide where to stop? You might think it just stops at the first instruction in a function, but that&#x27;s not true in general.&lt;&#x2F;p&gt;
&lt;div class=&quot;DOMRecMovie&quot; id=&quot;mainNotAtTopMovie&quot; style=&quot;width:1194px; height:458px&quot;&gt;&lt;&#x2F;div&gt;
&lt;p&gt;Debuggers actually stop immediately after the function&#x27;s &amp;quot;prologue&amp;quot;, and for good reason. At &lt;code&gt;-O0&lt;&#x2F;code&gt;, compilers generally do not produce accurate debug information for the function prologue, and stopping any earlier would lead to the debugger seeing incorrect parameter values. That, in turn, would produce the wrong results when evaluating conditional breakpoints. Finding the end of a function&#x27;s prologue can be surprisingly difficult though. In this article I&#x27;ll go through why this is an issue, and some of the heuristics involved in determining where to place a breakpoint.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Implementing Pernosco Support For GitHub Actions</title>
        <published>2021-10-29T00:00:00+00:00</published>
        <updated>2021-10-29T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/github-actions/" type="text/html"/>
        <id>https://pernos.co/blog/github-actions/</id>
        
        <summary type="html">&lt;p&gt;Pernosco supports &lt;a href=&quot;&#x2F;about&#x2F;workflow-ci&quot;&gt;debugging failures in GitHub Actions tests&lt;&#x2F;a&gt;. When a test fails, you get a link that takes you directly to debug that failure:&lt;&#x2F;p&gt;
&lt;img class=&quot;window-screenshot&quot; src=&quot;&#x2F;img&#x2F;PernoscoGithubReproduced.png&quot; width=&quot;1026&quot; height=&quot;402&quot; title=&quot;Pernosco Github Actions integration screenshot&quot;&gt;
&lt;p&gt;(If you want Pernosco debugging for the GitHub Actions of your open-source project, &lt;a href=&quot;mailto:inquiries@pernos.co&quot;&gt;contact us&lt;&#x2F;a&gt;.)&lt;&#x2F;p&gt;
&lt;p&gt;Implementing this required reimplementing (some of) GitHub Actions workflow execution. We have just published our &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;Pernosco&#x2F;gha-runner&quot;&gt;&lt;code&gt;gha-runner&lt;&#x2F;code&gt;&lt;&#x2F;a&gt; implementation of GitHub Actions (as a Rust library). You can use &lt;code&gt;gha-runner&lt;&#x2F;code&gt; to run GitHub Actions workflows locally, and it has extension points to let projects like Pernosco modify how workflows are executed. &lt;code&gt;gha-runner&lt;&#x2F;code&gt; contains an example that lets you run GitHub Actions steps locally under &lt;code&gt;strace&lt;&#x2F;code&gt;, e.g.:&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>rr Trace Portability: TZCNT</title>
        <published>2021-10-21T00:00:00+00:00</published>
        <updated>2021-10-21T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/tzcnt-portability/" type="text/html"/>
        <id>https://pernos.co/blog/tzcnt-portability/</id>
        
        <summary type="html">&lt;p&gt;When we &lt;a href=&quot;https:&#x2F;&#x2F;robert.ocallahan.org&#x2F;2017&#x2F;09&#x2F;rr-trace-portability.html&quot;&gt;introduced trace portability&lt;&#x2F;a&gt; to rr, our primary goal was to enable use cases like Pernosco, where traces would be recorded on one machine and replayed on another for advanced analysis. At the time it was an open question exactly how portable these traces would be, but for the most part the x86 architecture is well behaved enough that things just work. As Pernosco usage has grown, we have ingested traces from a larger collection of systems and discovered a few subtle differences. We have previously written about &lt;a href=&quot;https:&#x2F;&#x2F;robert.ocallahan.org&#x2F;2021&#x2F;09&#x2F;rr-trace-portability-diverging-behavior.html&quot;&gt;emulating RSQRTSS and other &amp;quot;approximate&amp;quot; floating-point instructions&lt;&#x2F;a&gt;, which produce different results on Intel and AMD processors. The most recent issue we&#x27;ve come across is related to the TZCNT instruction.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Debugging Dataflow Through Pipes And Sockets</title>
        <published>2021-10-18T00:00:00+00:00</published>
        <updated>2021-10-18T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/dataflow-sockets/" type="text/html"/>
        <id>https://pernos.co/blog/dataflow-sockets/</id>
        
        <summary type="html">&lt;script src=&quot;&#x2F;demo-data&#x2F;dataflow.js&quot; defer&gt;&lt;&#x2F;script&gt;
&lt;p&gt;One of the most powerful features of the &lt;a href=&quot;&#x2F;&quot;&gt;Pernosco Omniscient Debugger&lt;&#x2F;a&gt; is using &lt;a href=&quot;&#x2F;about&#x2F;dataflow&#x2F;&quot;&gt;dataflow analysis&lt;&#x2F;a&gt; to track a value back to its origin. We recently made this even more powerful, by giving it the ability to track values that flow through pipes and through sockets (when traces are recorded with &lt;a href=&quot;https:&#x2F;&#x2F;github.com&#x2F;rr-debugger&#x2F;rr&#x2F;releases&#x2F;tag&#x2F;5.5.0&quot;&gt;rr 5.5&lt;&#x2F;a&gt;).&lt;&#x2F;p&gt;
</summary>
        
    </entry>
    <entry xml:lang="en">
        <title>Structured Identifiers</title>
        <published>2021-10-04T00:00:00+00:00</published>
        <updated>2021-10-04T00:00:00+00:00</updated>
        <author>
          <name>Unknown</name>
        </author>
        <link rel="alternate" href="https://pernos.co/blog/structured-identifiers/" type="text/html"/>
        <id>https://pernos.co/blog/structured-identifiers/</id>
        
        <summary type="html">&lt;script src=&quot;&#x2F;demo-data&#x2F;structured-identifiers.js&quot; defer&gt;&lt;&#x2F;script&gt;
&lt;p&gt;Source-level debuggers devote much interface real estate to program identifiers. To avoid ambiguity we often want to display &lt;em&gt;fully qualified&lt;&#x2F;em&gt; identifiers (e.g. including namespace&#x2F;module names and the parameters of generic types), but these are often very long and unnecessarily verbose. Pernosco takes advantage of interactivity by eliding selected parts of complex identifiers; clicking on an elided part reveals the elided text (which may itself contain nested elided parts). This requires treating identifiers not as plain strings, but as syntax trees — &lt;em&gt;structured identifiers&lt;&#x2F;em&gt;. The need to obtain structured identifiers has implications for the design and implementation of debuginfo formats and demangling APIs.&lt;&#x2F;p&gt;
</summary>
        
    </entry>
</feed>
