Creating.Software https://creating.software/ Recent content on Creating.Software Hugo -- 0.145.0 en-US Copyright (c) 2024 Matthew Steffen, all rights reserved. Sun, 30 Mar 2025 00:00:00 +0000 The Engineering Ladder https://creating.software/essays/ladder/ Sun, 30 Mar 2025 00:00:00 +0000 https://creating.software/essays/ladder/ How I’ve come to understand the somewhat standard tech company engineering ladder Originally posted 2022-10-01, updated 2025-03-30

Due to market dynamics, I think companies across the tech industry eventually converge on approximately the same shared engineering ladder. In my opinion, this shared ladder reflects a skill trajectory that is essential to software engineering (and probably most creative pursuits).

When I first wrote this post, we’d just updated and extended our own internal engineering ladder at Pachyderm (R.I.P.). This update was a big project, the result of which had to meet the practical needs of a lot of different groups. Finishing it required me to read and think a lot about the engineering ladder, so I wanted to use the freedom of my personal blog to lay out my newly augmented understanding.

My original post (and my contributions to Pachyderm’s engineering ladder) characterized software engineering as a sort of game: make software that lots of people love and use, and you win! The engineering ladder captured the level of meta-game each engineer was playing. Do you stew over tactical issues like variable names and module boundaries? Junior engineer. Do you ponder strategic concerns like user experience and product trends? Staff engineer! Since then, I’ve gained more management experience at more companies, and as I re-read this post and realized how much my thinking has changed, I decided to revise it and discuss that change instead. I think it reflects a larger change in my understanding of tech companies and the tech industry.

I now feel that the software-engineering-as-a-game framework, while amusing, was totally wrong. Tactics vs. strategy is not the difference between Junior and Staff engineers. Here’s how I think of the job ladder now:

Level | Description
L1 | I don’t understand how this organization works or what’s expected of me day-to-day. When working on a project, I need help by default (how else can I learn?). I want to be able to get work done on my own.
L2 | I know how to go through the standard motions of completing a project, but if anything unexpected happens, I’ll need help. I get stuck a lot. I want to get through projects without anyone holding my hand.
L3 (Senior) | I come to work and complete projects. If I need help with something, I know where to get it; I unblock myself. I’m not too worried about how the projects are chosen. Or maybe I feel that the company should be doing something differently, but I can’t reliably implement that sentiment for one reason or another.
L5 (Staff) | I know how teams (like mine) are supposed to work in the context of the larger company, so when the team starts going in the wrong direction, I can come up with effective, realistic solutions using my experience. I also have good communication skills, so I can convincingly explain to stakeholders why a different approach will be better for the company. Finally, I have strong project management skills, so I can coordinate the implementation of that approach. Company leaders know that I have a track record of effectively solving technically, politically, and organizationally complex problems, so they seek my guidance in difficult situations.
L6 (Principal) | I’m highly regarded in our industry, and I have a network of industry contacts. When the company starts going in the wrong direction, I can come up with effective, realistic solutions using my experience and the expertise of others within my network. I can organize the implementation of those solutions with managers and engineers across the company as well as external colleagues and partners. Industry leaders know of me, and I can have significant influence on industry-wide decisions and trends.

Staff engineers

This description was the main thing I changed when revising this post.

To understand why Staff engineers have more prestige and higher salaries than Senior engineers, you must first understand what types of problems tech companies typically have, and then you must understand how Staff engineers make themselves indispensable to their employers.

At most companies where I’ve worked (or on most teams, at larger companies), the typical state of affairs is that you have 10,000 problems that are absolutely guaranteed to kill your company/get your product cancelled at some point. In any given week, you have time to solve maybe two of those problems. If you can pick the two that will kill you first and solve them, you get to not die! One of the reasons this is hard is that every single person around you sees some subset of those 10,000 problems, knows that they’ll kill the team/company, and is panicked about it and demanding something be done. Therefore a default failure mode for companies is lack of focus, leading to inaction, leading to death by whatever the next problem is.

What makes Staff engineers awesome is not (only) that they have the best ideas, but that they can execute, and they can execute even when execution involves herding other people. Many people hit a wall here; they can write code (or whatever their job is) but they can’t communicate convincingly or keep a multi-person project organized.

This isn’t (only) a matter of having a silver tongue, either. You have to have ideas that are good enough to actually work and simple enough to be tractable, and you have to be able to explain those ideas compellingly to colleagues, and you have to be organized enough to make sure all aspects of implementing the idea are addressed. All of that is necessary. And at the end, you have to have actually solved an important problem. The reason Staff engineers are a big deal is that without them, the company tends towards quivering helplessly.

Wait, what are managers for?

There is obvious overlap between the skills and responsibilities of Staff engineers and managers. I must first note, though, that this question is backwards: Staff engineers typically make more money than Engineering Managers at the same level. The reason that Staff engineers and managers are at the same level in many job ladders is that Staff engineers should be able to do manager-sized projects (see also the concept of Completed Staff Work). Finally, while managers do (some of) this stuff too, the emphasis of their job is different. See my manager README; a manager’s job is largely planning, communication, and coordination, and “having ideas” is nice when it happens, but that’s not really the point. Managers are mostly an interface to their teams.

Why do companies have a level in the ladder for engineers who’ve developed a set of management skills, though? Because, as a practical matter, some engineers do learn them, and there’s a lot they’re able to accomplish afterwards.

Consider, in contrast, that there is a type of engineer out there who complains that their ideas don’t go anywhere, and what they seem to expect is for management to pick up their idea in a one-on-one, coordinate the entire implementation of their idea for them (or somehow dictate that the rest of the company must re-orient around implementing this idea), and then turn around and give them a promotion at the end. Those people aren’t Staff engineers.

What do you think?

As with all my posts, I hope this helped someone. I spent many of my early-career years trying to understand the inscrutable-to-me job ladder of my employer (“I wrote a design doc and showed it to another team; am I ‘working across teams’?”). But I also would love feedback from any readers on this post in particular, since the “standard job ladder,” insofar as it exists, is necessarily defined by what each company out there thinks. Please post a comment or send me a note if you’re so inclined!

Deadlines https://creating.software/drafts/deadlines/ Wed, 16 Nov 2022 20:00:00 -0700 https://creating.software/drafts/deadlines/

“You can ship all your features, or you can ship on time, but you can’t do both”

    – God

A Story

Pachyderm Projects (I’ll call them “namespaces” for blog purposes): all the functions in PPS identify repos with a single string, the repo name. Well, with projects, we need two strings: the repo and the project. This is a major refactoring, affecting all of our functions and classes, as well as our database schema.

  • This is not an issue that, practically, you will ever uncover during planning. Nobody has all of your code’s internal function signatures memorized, so there’s no way for the issue to be raised in a planning meeting. You have to be well into the implementation phase of a project before it will come up.
  • Why is this so hard? Because so many decisions have gone into your project that, due to bounded recall and bounded rationality, you can’t know them all or predict how they’ll interact.
  • Software planning is chaotic. Arbitrarily small amounts of additional planning time may have arbitrarily large effects on the expected completion date.
    • This means that there’s always a risk your software project, as planned, will not just slip, but completely explode on you, and there’s nothing you can do about it.
    • The solution is to be flexible about your plan.
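
To make the story concrete, here is a minimal Python sketch of that kind of signature change. Every name in it (`RepoKey`, `inspect_repo_v1`, `inspect_repo_v2`, `migrate`, and the `"default"` project) is hypothetical and chosen for illustration; it is not Pachyderm’s actual code or schema.

```python
from dataclasses import dataclass

# Before: a repo is identified by a single string, its name.
def inspect_repo_v1(repos: dict, repo: str) -> dict:
    return repos[repo]

# After: a repo is identified by two strings, so the key type changes
# and every function that accepted a bare name must change with it.
@dataclass(frozen=True)
class RepoKey:
    project: str
    repo: str

DEFAULT_PROJECT = "default"  # assumption: pre-existing repos land in one default project

def inspect_repo_v2(repos: dict, key: RepoKey) -> dict:
    return repos[key]

def migrate(repos_v1: dict) -> dict:
    """Re-key stored repos (a stand-in for the database schema change)."""
    return {RepoKey(DEFAULT_PROJECT, name): info for name, info in repos_v1.items()}

old_store = {"images": {"size": 3}}
new_store = migrate(old_store)
# The same repo name can now exist in two projects without colliding.
new_store[RepoKey("research", "images")] = {"size": 7}
```

Even in this toy, the change touches the storage layer, the key type, and every caller, which is the kind of ripple that is hard to see from a planning meeting.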

How I do it:

  • Make a plan to get the feature built in X time. Tell stakeholders that it will be done in 2*X time.
  • The 2x factor is not there to improve your odds of finishing on time. Again, there’s nothing you can do to ensure that. It’s simply there to give you time to think, as I’ll explain.
  • I don’t have much advice on the “make a plan” part (no “break it down into tasks that are less than a week” or anything like that). Finger-in-the-air is fine, but be very mindful of uncertainty.

Then, here’s how a project (should) go:

  • Say you have a project with a three-month plan, and then a month in, you realize something is very wrong and the project will actually take a year.
  • Now you are eating into your buffer time. What you do is get everyone together and change the plan.
    • This is an active process! That’s why the 2x buffer wasn’t just sandbagging. You’re using the extra time for planning.
    • You can cut features, change the order you do things in, incur technical debt, whatever. At the end, you want a plan that leaves you a bunch more buffer.
  • Eventually, you cut enough stuff and add enough shortcuts that it’s back to being a three-month plan. You’re a month in, so you’ll finish at four months, still ahead of what you told stakeholders.
  • Say all this happens again. You cut or change more stuff and make a third three-month plan.
  • This one actually works, and now you’re “done” (modulo the stuff you cut) at five months.
  • You’re now in a great position because not only do you still have a month of time, but you also understand the problem and its risks much better. The list of stuff you cut is mostly exhaustive; it might still blow up on you, but your odds are a lot better than they were at the beginning.
  • Use your remaining buffer to implement the most valuable stuff on the list. Whatever doesn’t get done gets cut, but that’s fine.
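
The arithmetic in the walkthrough above can be sketched as a tiny Python function. The function name and parameters are illustrative assumptions, and it assumes every re-plan produces a fresh plan of the original length.

```python
def simulate(plan_months, promise_factor, months_before_each_replan):
    """Return (months elapsed at completion, months of buffer left).

    plan_months: length of each plan (X in the text).
    promise_factor: multiple of X promised to stakeholders (2 in the text).
    months_before_each_replan: for each plan that blew up, how many months
        were spent on it before the team re-planned.
    """
    promised = plan_months * promise_factor
    # Time burned on plans that were abandoned part-way through...
    elapsed = sum(months_before_each_replan)
    # ...plus the final plan, which runs to completion.
    elapsed += plan_months
    return elapsed, promised - elapsed

# The example from the text: a three-month plan, six months promised,
# and two plans abandoned one month in.
done_at, buffer_left = simulate(3, 2, [1, 1])
```

With these inputs the project finishes at five months with one month of buffer left, matching the walkthrough; a third blown plan would use up the remaining buffer.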

I had one TL who would do one round of re-planning preemptively, with the following trick: when starting a project, he’d ask you to write out a plan and estimate. Then, you’d give him a number T. He’d always ask, regardless of T: “Okay, what would it take to get this done in T/2?” You’d cut a bunch of stuff, and that would be your project plan.

Lesson

  • You must actively manage a project during construction to hit a deadline.
  • There’s nothing you can do upfront—planning, sandbagging, whatever—to accomplish this. It’s all about what you do while the clock is ticking and the ball is in play.
  • AFAICT, mid-project re-planning is basically the whole central idea of Agile.
Teamwork https://creating.software/essays/teamwork/ Mon, 24 Oct 2022 08:00:00 -0700 https://creating.software/essays/teamwork/ [ Decisions part 3 of 3 ] In this essay, I use the ideas from “Decisions” and “The Theory of a Program” to lay out what I’ve learned about how software teams work and how you can build your career. This is part three of a three-part series on decisions. Here are part one and part two.


Main Ideas

  • The role of “code owners” in tech companies is to know and understand the decisions that went into the code that they own. When interacting with unfamiliar code, you should be in contact with its owner regarding questions and changes.
  • If you’re trying to grow your career, consider finding code you can own by looking for neglected code or bugs and assuming ownership.
  • If another team is more interested than you are in some code that you own, consider giving them ownership of that code.

As described in part two, the “theory” of a program is stored in the brains of its maintaining engineers. In this essay, I’ll try to lay out how I’ve seen this work in practice, and how it can form the foundation for teamwork in software in general. Looking back, I think I could’ve avoided a lot of embarrassment and frustration in my early career if I’d gotten the right conceptual model for software teamwork early.

Code Owners, The Knowers of Things

When one engineer wants to know how some unfamiliar piece of their company’s codebase works, there is typically another engineer at the company, often called the “code owner” [1], whose job is to know that. The code owner may direct inquisitors to documentation, but that is, from a teamwork perspective, only an efficiency that saves the owner time; the owner still has unique personal familiarity with the code and its history. In particular, they will have knowledge that is unobtainable from the code itself and is inconsistently documented: what else has been tried, or considered, and why the code is written this way. The code owner is often, but not always, the code’s original author, and when a new owner takes over existing code, they often make heavy modifications (not as a rule, but, as described below, code often comes under new ownership as part of larger changes).

This organizational dynamic is the scaffolding on which teamwork in tech is built. Because code owners are responsible for knowing the theory of their code, they must also review and approve any new changes (because, as the owner, they might be called on to explain them). Non-code decisions are typically distributed in a similar way, though typically not among engineers (product decisions are owned by PMs, sales and marketing decisions are distributed among sales and marketing, etc). In my capacity as a manager, I’ve written documentation specific to my team about ownership, built on this framework.

This foundation leads to two pieces of advice for my former self:

  • If you need to understand how some code that you don’t own works, find the owner and ask them.
  • Before changing code, talk to the owner. They’re responsible for it.
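
For illustration, here is a toy owner lookup in Python, loosely modeled on the idea behind GitHub’s CODEOWNERS file, where the last matching rule takes precedence. The paths and handles are made up, and the glob matching via `fnmatchcase` is a simplification rather than GitHub’s actual pattern semantics.

```python
from fnmatch import fnmatchcase

# Hypothetical ownership rules, most general first. As in a CODEOWNERS
# file, a later matching rule overrides an earlier one.
RULES = [
    ("*", ["@platform-team"]),
    ("auth_lib/*", ["@alice"]),
    ("new_auth_service/*", ["@bob", "@carol"]),
]

def find_owners(path, rules=RULES):
    """Return the owners named by the last rule whose pattern matches path."""
    owners = []
    for pattern, rule_owners in rules:
        if fnmatchcase(path, pattern):
            owners = rule_owners
    return owners

# Who should I talk to before changing this file?
reviewers = find_owners("auth_lib/acl.py")
```

The lookup answers “who do I ask?”, which is the part a tool can automate; the owner’s knowledge of why the code is the way it is still lives in their head.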

How Do I Get to Own Stuff?

Very often, the most knowledgeable engineers on a team (the tech leads and staff engineers) write the least code. This is a somewhat unstable arrangement: as soon as you work on some code, you’ll be the expert in how that code works and its de facto owner. When you start someplace new, your manager will hopefully ask you to work on code that is unowned, or code whose owner is overwhelmed. You’ll naturally be the new owner.

That said, at my first software engineering job, the company culture was that managers remained aloof and engineers directed their own work. This often left new engineers, and especially new engineers who didn’t understand ownership yet (like me), feeling lost. Thus a third piece of advice: if you want to build your career, and your manager isn’t giving you stuff to own, or you’re generally not sure what to do, try to find something to own yourself. Look for code in your team’s codebase or a feature in your team’s bug queue that the rest of the team has neglected, and make yourself the owner of it (with the assent of its current owners, if any). Mainly, establish yourself as the expert on some part of the codebase instead of bouncing around haphazardly.

Isn’t This Claiming Turf?

It can feel that way, but the best engineers assiduously avoid doing this. The counterweight to the above advice, “find code to own,” is this: eventually you may find yourself the owner of some code that another team cares about more than you do. When that happens, your best option is to give them ownership.

Early in my career, I accidentally became the owner (the prior owners left for other teams) of an authorization library. The library no longer made sense with our company’s new storage systems, so we were replacing it with a new service on which I also worked. The library was still widely used when I inherited it, but over the next year or so, its user base dwindled to, effectively, two or three other products. My relationship with one user in particular deteriorated rapidly: my TL didn’t want me spending a lot of time on a dying library, but that user still made heavy use of it and, with their own ambitious product goals in mind, wanted to make some sweeping changes. They would send me massive code reviews that I would try to get through at night and in spare moments so I could finish the new service’s projects during the day, which was not the level of responsiveness they wanted. True to part two of this series, the hardest part was remembering all the details of the old library while learning all the details of the new service, so my work on the new service suffered too.

Things really came to a head when my TL cornered me in a conference room and asked me why my projects were going so slowly. I actually lost it: I told him that nobody seemed to care about the old library but that it still had users and was still running in production and still mattered to the company. And it was taking up a ton of time and I was exhausted. In a moment that I’ve taken with me for the rest of my career, he actually cheered right up. He really just wanted to understand the problem, and now he did.

My TL solved the problem by giving the other team ownership of the authorization library. In retrospect, this was a natural and obvious change, but at the time, I didn’t realize it was an option. Immediately, everything improved. The old library’s ambitious user could make the changes they wanted, I could focus on the projects my new team needed, and I was working a more sustainable schedule while I did it. As far as everyone’s careers went: I still owned code and still had natural places to contribute, but it was all in the new service; I was growing by working on promising new software, and the other team was growing by getting new features to market quickly.

Your Old Company Sounds Weird

I would be remiss not to clarify that not all software teams work this way. It is true, though, that only the engineers involved in the design and implementation of a piece of software can explain, fully, why it is the way it is. Some organizations make this division of knowledge more explicit than others. In my case, though, I wish I’d known about code ownership out of the gate. The organizational dynamics are real and unavoidable, and whether or not they’re recognized, they’re central to the success of software projects and teams.


  1. See e.g. GitHub’s CODEOWNERS feature, or (again), “Software Engineering at Google”, Chapter 16 ↩︎

The Theory of a Program https://creating.software/essays/theory_of_a_program/ Sat, 22 Oct 2022 08:00:00 -0700 https://creating.software/essays/theory_of_a_program/ [ Decisions part 2 of 3 ] The “theory” of a program: it is the intricate knowledge of why a particular solution was chosen over its alternatives that allows software to be good. This reality is the root of many challenges in software engineering. This is part two of a three-part series on decisions. Here are part one and part three.


Main Ideas

  • Managing decisions—retaining their rationales and their history, and making future decisions that are consistent with past decisions—is the bottleneck in large software projects.
  • It’s easier to write code than read it because you don’t have access to those rationales.
  • People and time are the dimensions along which a decision may break down; by extension, documentation is useful to distant people, or the same people in a distant time.
  • Most tech companies manage decisions by sharding knowledge of their context across engineers.

One of my favorite essays on Software Engineering is “How to Build Good Software,” by Li Hongyi. I agree with most of the points it makes, but my favorite is that “[t]he main value in software is not the code produced, but the knowledge accumulated by the people who produced it.” The author elaborates:

To make progress, you need to start with a bunch of bad ideas, discard the worst, and evolve the most promising ones. Apple, a paragon of visionary design, goes through dozens of prototypes before landing on a final product. The final product may be deceptively simple; it is the intricate knowledge of why this particular solution was chosen over its alternatives that allows it to be good.

This knowledge continues to be important even after the product is built. If a new team takes over the code for an unfamiliar piece of software, the software will soon start to degrade.

I later discovered that this idea predates Li’s essay. Turing Award winner Peter Naur explored it at greater length in “Programming as Theory Building” in 1985 (it’s a long quote, but all relevant):

A main claim of the Theory Building View of programming is that an essential part of any program, the theory of it, is something that could not conceivably be expressed, but is inextricably bound to human beings. It follows that in describing the state of the program it is important to indicate the extent to which programmers having its theory remain in charge of it….The building of the program is the same as the building of the theory of it by and in the team of programmers. During the program life a programmer team possessing its theory remains in active control of the program, and in particular retains control over all modifications. The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

The extended life of a program according to these notions depends on the taking over by new generations of programmers of the theory of the program. For a new programmer to come to possess an existing theory of a program it is insufficient that he or she has the opportunity to become familiar with the program text and other documentation. What is required is that the new programmer has the opportunity to work in close contact with the programmers who already possess the theory, so as to be able to become familiar with the place of the program in the wider context of the relevant real world situations and so as to acquire the knowledge of how the program works and how unusual program reactions and program modifications are handled within the program theory.

In short, the essence of a computer program is not its source code, but a theory of what the computer ought to be doing for certain users and how.

I think that Li and Naur are describing the same issue I wrestled with in my prior essay on decisions. What their writing reveals is the extent to which managing these decisions (retaining their rationales and their history, and making future decisions that are consistent with past decisions) becomes the bottleneck in a software project.

On reflection, I now believe that many of the difficulties in software engineering reduce to this problem. For example, many others [1] have written about how code is easier to write than read, and I think this framing lays the cause bare: it’s transparently much easier to reason (an almost automatic process) and make your own decisions than to deduce the private reasoning behind another person’s decisions. If the other person’s reasoning was informed by experience you don’t have and can’t easily get, the work might be worth it, but then that’s the hardest reasoning to infer. You don’t know what’s motivating it.

Another example is “Software Engineering is what you get when you take programming and add people and time” [2]. Or, equivalently (as I’ll explain): “write documentation for your future self as much as for others” [3]. People and time (i.e. “others” and “your future self”) are the axes along which a decision might break down. A decision that worked for one person may not work for another, or circumstances may change and a decision may no longer be useful. People and time are, by extension, where a past decision may be questioned and therefore where documentation may usefully provide an explanation. If you subtract people and time from software engineering, you remove the possibility that a decision might fail.

Wait, Can’t We Just Document Our Decisions Then?

Documentation is essential but necessarily has two flaws. First, documentation is always incomplete. Each decision made while developing a piece of software may have an arbitrarily complex rationale, and a piece of software is composed of almost innumerably many decisions. Second, each new line of documentation adds to the amount that must be read by an engineer looking to understand the software. Even if e.g. a new team member doesn’t need to know about a particular decision, extending the documentation by recording that decision will further dilute whatever information they do need.

If not through documentation, how does complex software stay alive and functional? Most tech companies solve this problem by documenting the most recent, most important decisions and storing the rest (per “Programming as Theory Building”) in their employees’ brains—that is, just knowing and understanding the decisions made in the company’s codebase is a big part of software engineers’ jobs at large tech companies (especially senior engineers). As I’ll explain in part 3, this explains some of the organizational dynamics that tech companies tend to have.


Epilogue: Can You Have a Big Project Without Big Headcount?

A single skilled engineer can store an enormous amount of information about a software project in their mind (cf. Dwarf Fortress). The risk in doing so is that if the engineer leaves the project, all of that information is lost. Tech companies try to avoid this by storing that knowledge redundantly across multiple engineers, and an explicit goal of many engineering managers is to distribute knowledge of their team’s codebase among their engineers. But a natural question remains: is there any way to build a complicated piece of software that outlives its designer and isn’t supported by a big tech recruiting team?

Another Turing Award winner, Fred Brooks, discussed a version of this question in his 1986 essay “No Silver Bullet”:

How much of what software engineers now do is still devoted to the accidental, as opposed to the essential? Unless it is more than 9/10 of all effort, shrinking all the accidental activities to zero time will not give an order of magnitude improvement. Therefore it appears that the time has come to address the essential parts of the software task, those concerned with fashioning abstract conceptual structures of great complexity.
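
The 9/10 threshold in that quote follows from simple arithmetic, which can be checked with a short Python sketch. The function name is my own, and the model assumes total effort splits cleanly into an accidental fraction and an essential remainder, which is an idealization.

```python
import math

def speedup_from_removing_accidental(accidental_fraction):
    """Overall improvement if all accidental effort drops to zero time.

    If accidental work is a fraction a of total effort, the essential
    remainder is (1 - a), so the best possible improvement is 1 / (1 - a).
    """
    if not 0 <= accidental_fraction < 1:
        raise ValueError("accidental_fraction must be in [0, 1)")
    return 1 / (1 - accidental_fraction)

half = speedup_from_removing_accidental(0.5)         # accidental work is half of all effort
nine_tenths = speedup_from_removing_accidental(0.9)  # Brooks's threshold
```

If accidental work is half of all effort, eliminating it entirely only doubles productivity; the improvement reaches an order of magnitude only when the accidental fraction reaches nine tenths.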

In that essay, Brooks discusses a variety of ideas for managing software complexity, but finds only a few promising prospects. However, one of those, “Requirements refinement and rapid prototyping”, I find promising myself. As far as I can tell, this is the approach that yet another Turing Award winner, Leslie Lamport, aims to take with TLA+. I have no experience with it myself, but I know it’s already gaining adoption at Amazon.

It makes intuitive sense to me that such an approach would help. While I drew a sharp distinction between “making decisions” and “solving problems” in part one to make a point, the line is honestly somewhat blurry, especially in software. If you choose an inefficient algorithm that becomes a bottleneck for a few users, or if you handle a particular corner case poorly, did you make a creative decision that renders your product a bad fit for those users, or did you fail to solve a problem? I think many software teams make these decisions incidentally; they implement something simple and then deal with the results as they arise. Their projects then become very complicated; decisions ping-pong between different trade-offs and sharp edges (exacerbated by changing user demands, but that’s probably unavoidable). By forcing engineers to enumerate, formally, some properties that they expect their software to have, rather than allowing them to make decisions ad hoc, I can see how TLA+ (or similar tools; I know about Alloy) would greatly restrict a product’s decision space, which may mean more stable products.
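
To give a flavor of what “enumerating a property formally” can look like, here is a toy exhaustive state-space check in Python. It is not TLA+ and does not use any TLA+ tooling; the model (two workers that each read and then write a shared counter), the function names, and the invariant are all assumptions made up for illustration.

```python
from collections import deque

# State: (program counters, per-worker temp values, shared counter).
# Each worker does: pc 0 -> read counter into temp; pc 1 -> write temp + 1; pc 2 -> done.
INITIAL = ((0, 0), (0, 0), 0)

def next_states(state):
    """Yield every state reachable by letting one worker take one step."""
    pcs, tmps, counter = state
    for i in (0, 1):
        if pcs[i] == 0:  # read step
            new_tmps = tmps[:i] + (counter,) + tmps[i + 1:]
            yield (pcs[:i] + (1,) + pcs[i + 1:], new_tmps, counter)
        elif pcs[i] == 1:  # write step
            yield (pcs[:i] + (2,) + pcs[i + 1:], tmps, tmps[i] + 1)

def invariant(state):
    """Property we expect: once both workers finish, the counter is 2."""
    pcs, _, counter = state
    return pcs != (2, 2) or counter == 2

def find_violation(initial):
    """Breadth-first search over all interleavings; return a bad state or None."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

violation = find_violation(INITIAL)
```

The search finds an interleaving where both workers read before either writes, so the counter ends at 1. Stating the property up front surfaces the bad design before anyone ships it, instead of leaving it to be discovered and remembered later.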

Concretely, how much of senior engineers’ minds consists of knowledge like “we did it this way because our other approaches created UX traps/operational problems”? If tools like TLA+ catch on, software projects might require less of this flavor of institutional knowledge, because the TLA+ checker would prevent many bad ideas from getting released in the first place. I’m excited to see whether tools like TLA+ catch on and what effect they have on software team dynamics.


  1. I don’t remember where I first heard this (I want to say Coding Horror, but if so, I couldn’t find it). But I found at least one blog post as well as tweets by @kathytafel and @recipromancer and others, as well as the odd HN comment ↩︎

  2. I first recall hearing this at Google, and sure enough it’s in Chapter 1 of “Software Engineering at Google”, though rewritten slightly:

    This suggests the difference between software engineering and programming is one of both time and people.

    It’s since circulated a bit in casual discussion on the Internet. ↩︎

  3. This is likewise in “Software Engineering at Google,” Chapter 10:

    Most of the documentation an engineer at Google writes comes in the form of code comments…Tricks in code should be avoided, in any case, but good comments help out a great deal when you’re staring at code you wrote two years ago, trying to figure out what’s wrong.

    It also shows up in a few blog posts on design docs, and is often tweeted:

    Hey developers - document your damn code, if not for yourself, then for the people that have to follow. - @lonnieezell

    I found similar tweets by @Crell, @franconchar, @danigrrl and others ↩︎

Decisions https://creating.software/essays/decisions/ Thu, 13 Oct 2022 20:00:00 -0700 https://creating.software/essays/decisions/ [ Decisions part 1 of 3 ] “Decisions are something you make; something is created when you make a decision. Making a decision is an act of will, not an act of thought.” I’ve come to love the analogy between a decision and a tangible, creative work, and now see the need to make innumerable open-ended decisions as the defining characteristic of creative work. This is part one of a three-part series on decisions. Here are part two and part three.


Main Ideas

  • Decisions are a creative product, like a bookshelf or a painting, but instead of being made from wood or paint, they’re made from information.
  • The defining characteristic of creative work is the need to make a huge number of open-ended decisions.
  • Making high-level decisions upfront (“establishing a vision”) makes later small decisions easier.
  • An artistic style is a strategy for managing the immense number of decisions involved in creative work.
  • The advantage of working at a large firm is that you learn the house style.

When I first started out as a software engineer, I hubristically expected to be good at it. I’ve been programming since I was a little kid and majored in CS in college (and did well). But after “going pro” I was actually quite bad. Mainly, I was really slow. Every project took me forever because I was so self-conscious that any time I wrote any code at all, I’d worry that it was somehow wrong, and I’d wring my hands and overthink every part of the code to death.

One of the biggest, most useful ideas that I’ve encountered in my career, which helped me escape this mindset in particular, came from the book The Chairs are Where the People Go:

…the expression “to make a decision” is perfectly accurate: a decision is something you create. There’s an inclination to think that with enough research and thinking and conversation and information, it’s possible to determine what the correct decision is; to think that decision making is an intellectual puzzle. But generally it’s not. You make decisions. Something is created when you make a decision. It’s an act of will, not an act of thought.

Decisions are Like a Bookshelf

I’ve been turning over this idea for years, and I came to love the analogy between a decision and a tangible creative product: something you make, like a painting, or a bookshelf. For one thing, it really illuminated that “correct” is an incoherent goal: a shelf might make certain tradeoffs (like being sturdy vs. attractive vs. cheap), and it might feel more or less useful than another shelf to a specific person in a specific context, but there’s too much variety in people’s values and too much flexibility in how a shelf may be used for “incorrect” to even make sense. Likewise, if I decide to call a variable “X” in my code, that entails certain tradeoffs: the code might be more concise but might be less readable. Perhaps it hews closer to some mathematical convention. Other parts of the codebase may adapt (with the name “X” taken, other variables may get longer names). I may come to feel that a different decision would have been more useful in this specific context, e.g. if the short name causes us to accidentally commit a bug. But it’s not really “incorrect”.

However, there are more, deeper parallels: in the same way that the raw material out of which one makes a shelf is wood, the raw material out of which one makes a decision is information. Sometimes you don’t have much wood. A good carpenter might still be able to make a shelf that holds up, but you’d know not to expect miracles given the constraints. Likewise, if you don’t have much information, as at the beginning of a project, you shouldn’t feel guilty for making a decision that, ultimately, must be replaced.

Then, the sum of these two lessons is a third: sometimes you just have to make a decision, knowing it might be bad, in order to see what happens. You might be struggling to make a decision because you don’t know what you value (or your values are adapting to a new environment). You might be struggling because you don’t have enough information about the consequences of each option. Either way, making the decision will help you, and you can use the information to make better decisions later.

An empowering parallel truth to realize is that most decisions are easy to change. If you call your variable something bad, it’s a simple fix to rename it. So the cost of guessing is actually low, and not only do you save time, you learn more.

When I’m particularly anxious about a decision, I like to create a Google Doc where I write out the problem, the decision I made, and the rationale. It feels like the risk is contained. Then I come back a month or two later and write up how the decision played out.

Actually, Bookshelves are Just Made of Decisions

Since then, I’ve come to take the analogy even further. The reason decisions (in software, where I came from) are like bookshelves is because making a bookshelf itself entails making a lot of decisions. I now feel that this is, in fact, defining: the essential characteristic of all creative work is that it requires you to make a huge number of open-ended decisions. Novelists decide how to phrase things, how a plot should unfold, what characters will be named; carpenters choose types of wood, what joints to use, aesthetic flourishes; painters choose brushes, paints, palette, composition. Software engineers face the same thing in their own field1.

By extension, there’s a lot that people in software can learn from other creative professionals. One is that making high-level decisions upfront makes later small decisions easier. For example, in writing, authors often develop backstories for their characters which then determine characters’ appearance, dialogue, and behavior:

Backstory includes significant events that impact the character’s behavior and motivation during your story. The biggest benefit to backstory for each character is depth in your story. A rich character background allows you to pull details to improve distinctive character actions and dialogue in your story.

Your bad guy has a scar on his temple. Your love interest is reluctant to commit. Your protagonist is afraid of dogs. Characters had relationships with other characters before the beginning of your story.

In the realm of software, as UC Berkeley’s CS162 course concisely puts it:

The design is essentially the most important part of the project. Having a good project design can literally cut your total coding time by a factor of 10…keep’em short and to the point.

A clear vision makes later polish and maintenance decisions easier too. If you decide upfront that your software will be e.g. the fastest text editor available, downstream decisions about when to optimize corner cases (often) and when extra features should be added (rarely) follow naturally.

A second lesson that I think software engineers can learn from creatives is the value of developing a style:

The writing choices an author makes tend to follow patterns. When a writer finds a technique or habit they like, they stick with it, often throughout their entire career. Put all those writing choices together, and the writing takes on a unique “voice” that “sounds” different from other writing.

Or, in Ira Glass’s somewhat more heroic terms:

Nobody tells this to people who are beginners, I wish someone told me. All of us who do creative work, we get into it because we have good taste. But there is this gap. For the first couple years you make stuff, it’s just not that good. It’s trying to be good, it has potential, but it’s not. But your taste, the thing that got you into the game, is still killer. And your taste is why your work disappoints you. A lot of people never get past this phase, they quit. Most people I know who do interesting, creative work went through years of this. We know our work doesn’t have this special thing that we want it to have. We all go through this. And if you are just starting out or you are still in this phase, you gotta know its normal and the most important thing you can do is do a lot of work. Put yourself on a deadline so that every week you will finish one story. It is only by going through a volume of work that you will close that gap, and your work will be as good as your ambitions. And I took longer to figure out how to do this than anyone I’ve ever met. It’s gonna take awhile. It’s normal to take awhile. You’ve just gotta fight your way through.

Finding the techniques and habits you like, and “closing the gap” between your work and your taste are the same process: the process of developing a style. In a forest of decisions, a style is one well-worn path to the meadow of success. That is, reusing decisions that have worked well lets you think less while committing fewer blunders, so if you develop a style, your work will go faster and be better than if you make every decision de novo.

One further lesson: when you work at a big publication, or a big tech company, you learn the house style. I worked in big tech for five years, and by the end, I felt I could build an arbitrarily complex piece of software using the design/implementation/support process I’d learned. Other processes may be better, but at least this one worked.

So, What Now?

Mainly, if you’re a junior engineer wringing your hands about the code you write, as I was: mellow out. Don’t be afraid to send out your first draft for review if you don’t see anything wrong with it. If someone tells you it’s bad, don’t worry; just think about whether you agree and how you’d write the same code next time. Or, if you stumble upon a different approach that you like better in someone else’s code, steal that. In fact, go out and read the source of projects you like, in case there’s something good in it. By learning to see software as one would see writing, art, or carpentry, you can get comfortable making decisions better and faster and get to putting good software out in the world.


  1. I know that this is almost stereotypically common, but early in my career I often wondered why programming is so hard. Why are programs so much faster to think up than to write out? The “innumerably-many open-ended decisions” epiphany answered it for me: though you may see a program’s whole design in your head, there are many decisions that must be made to have a working program: what language and libraries will you use? What data structures will you use? How will the code be laid out? What will your classes, methods, and variables be? Et cetera. Making all of those remaining decisions is what takes time. ↩︎

]]>
Time Management for Software Engineers https://creating.software/drafts/time_management/ Thu, 06 Oct 2022 20:00:00 -0700 https://creating.software/drafts/time_management/ The nature of being a Software Engineer: you wake up every morning, and you have a million things you need to do, all of which will kill your startup/get your project cancelled if not done: a hundred tests to fix, a thousand messes to refactor, ten thousand features to add. This week, you have time to do two of them. Unless you decide to really push it, starting early and staying late, charging boldly into the abyss; then you might be able to start on a third.

The keys to being a successful Software Engineer are 1) accepting this reality, and 2) picking the right two. If you fail, you actually do get shut down, but if you succeed, you get to play again next week!

Here are the tactics, ideas, and mindsets I’ve found most helpful for dealing with this:

Focusing

A lot of the people who care most about time management really just want to avoid distractions and stay focused. I am, to some extent, one of those people, but the main thing I’ve learned is that website blockers and such work less well than getting good at noticing when you’re distracted and returning your focus to work right then.

Returning to work quickly is a skill with multiple components, and it takes practice. For example, if I catch myself reading something interesting at my desk, I note the article in a scratch document somewhere and go back to the task at hand. This habit was born of several insights:

  • It was a while before I stumbled on the idea of a scratch pad of interesting distractions. It really helps me “let go” of distractions emotionally, so I don’t feel like I’ll miss anything by going back to work right now.
  • The flashes of self-awareness you have while distracted are fleeting and you must take advantage of them. Seize the moment, before you get distracted again!

Some other tricks:

  • If some small tasks need to get done (e.g. sending a request, starting a test run), learn to use your desire not to do them as motivation to get them out of the way quickly. Tasks like that can inflate if given too much time, and by waiting until you don’t want to do them, you’re sure not to give them much.
    • I’m a somewhat socially anxious person, and a practical consequence of that is that I spent way too long drafting emails before sending them. This came to a crisis point at my last job, such that I felt like I had no time for productive work due to email. I timed how long it took me to send an email, and it came out to almost an hour. I had enough time each day to send eight emails and do nothing else. I realized I was in big trouble if that persisted, so I gave myself permission to send crappy email. I figured I’d send drafts and see what happened, and suddenly I was a lot more productive. The key is that it’s essential to keep small tasks small.
  • Social media works by exploiting FOMO, and to some extent that fear is grounded in reality: you really might miss something interesting if you stop checking social media completely. I’ve managed to convince myself, though, that anything I miss would come into my life in more than one way if it’s interesting enough. “Of all the most interesting things ever written, very few of them have been written on Facebook” (find citation)

Leave Buffer

The above applies in contexts other than staying focused as well. For example, I used to arrive late to meetings regularly, which I was embarrassed by but couldn’t seem to stop doing. On one occasion, I remembered that I had a meeting, checked the clock, and realized that if I stopped now I’d get to the meeting five to ten minutes too early. I noted that I’d need to leave soon. I was lucky enough, though, to be struck with a flash of insight right then: I was probably going to be late to the meeting because I’d set myself up such that the window of time in which I needed to remember to leave for this meeting was extraordinarily small. If I remembered too early, I’d just keep working. If I remembered too late, I’d arrive late. I started giving myself permission to arrive ten minutes early and sit on my hands, and leaving right when the thought to do so occurred to me, and suddenly I stopped being late nearly as often. (From doing this, I later learned that attempting to arrive early is a great way to arrive on time, in case the meeting moved or something.)

Finally, at a previous company, some other engineers had made a little game where it would track the amount of time you spent working (by watching writes to disk) but turn it into a little game, where many writes close together if two writes occurred more than ~20 minutes apart,

Send Crappy Email

This advice pretty much lives in the title, but two experiences I had:

  1. I’ve always been a nervous communicator, I felt like

Don’t Sneak

Note: Should this maybe go with “Teamwork” instead? All this stuff seems related. Likewise, the whole “what would it take to get this done in T/2” tactic goes with both “Leave Buffer” and “Technical Risk”

Don’t try to sneak fun stuff in front of your project work without telling your manager/colleagues and then catch up later. You won’t actually be able to catch up, and your manager and you will be stressed. Be forthright about what you’re working on, but if you think something’s a good idea, try to convince your manager (or client, or whoever) to let you spend some percentage of your time on it (e.g. one day a week).

It’s Only OK

Sometimes you have a project in mind that’s so rad that you’re sure, if you can just scrape together a prototype, people will fall in love with it as soon as they see it. The result will justify the time you spent building the prototype, you’re certain. I can’t say honestly that this never works, but I can say that every engineer thinks this regularly, and that engineers should do this, on average, 99% less often than they want to.

Limiting Scope

This refers to two closely related tools: working incrementally (which is almost always a good idea) and cutting corners (which is usually a bad idea, but good engineers should know how to do it).

Working Incrementally

There’s a whole essay on this blog that explains why and how to work incrementally, but this paragraph refers specifically to slicing up your work1 and starting with the most critical slices. That incremental work essay offers more advice on how to do so, but here I will simply assert that a) every change can be made as a series of small patches, and b) there are major advantages to doing so.

The benefit of working incrementally with respect to time management is that if you get good at slicing up work, you get much better at dealing with deadlines too, because you can minimize the amount of pre-deadline work you have to do and shift most of the project into post-deadline revision. Stated in these general terms, this sounds like cutting corners (which is a tool in and of itself, as explained in the next section) but here I’m referring to, for example, adding a new API endpoint in time for a major release, and then adding features and options to it in subsequent minor releases.

Cutting Corners

Cutting corners—that is, writing bad code to save time—is a skill that good engineers should have (but should rarely use). There are a few ways to do it.

One big one is to prefer adding new code over changing existing code or factoring code out. Some examples:

  • Rather than change a method, copy it over to a new method so that you don’t have to update the existing callers or check that their use of the updated method is still correct.
  • Rather than adding a new method to an interface, create a new sub-interface with the method, so that you don’t have to update the existing implementations. Then pass around the new sub-interface only in the parts of the code that need the new functionality; unaffected code can keep using the old super-interface.
  • Copy and paste data around rather than designing new abstractions. For example, if many functions receive the same six arguments, they could probably be factored into their own class, but you can save time in the short term by not doing so; the receivers won’t need to be updated, and you won’t have to spend time designing and revising the abstraction.
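
The sub-interface shortcut from the second bullet can be sketched roughly as follows. This is only an illustration under assumed names: `Store`, `ListableStore`, `MemoryStore`, `LegacyStore`, `read_one`, and `dump_all` are all hypothetical and do not come from any real codebase.

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """The existing interface (hypothetical); it is left exactly as it was."""

    @abstractmethod
    def get(self, key: str) -> str: ...


class ListableStore(Store):
    """New sub-interface: adds list_keys() without touching Store."""

    @abstractmethod
    def list_keys(self) -> list[str]: ...


class LegacyStore(Store):
    """An older implementation that never has to learn about list_keys()."""

    def get(self, key: str) -> str:
        return "legacy:" + key


class MemoryStore(ListableStore):
    """The one implementation that needs the new functionality."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = dict(data)

    def get(self, key: str) -> str:
        return self._data[key]

    def list_keys(self) -> list[str]:
        return sorted(self._data)


def read_one(store: Store, key: str) -> str:
    # Unaffected code keeps accepting the old super-interface.
    return store.get(key)


def dump_all(store: ListableStore) -> dict[str, str]:
    # Only the code that needs the new method asks for the sub-interface.
    return {key: store.get(key) for key in store.list_keys()}
```

Here `LegacyStore` compiles and runs unchanged, which is the time saved; the cost is that the codebase now has two overlapping interfaces to keep straight.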

The connection between cutting corners and working incrementally is that you can always construe cutting corners as incremental work by declaring that your project will proceed in two steps: 1) write messy, redundant code, and 2) fix it. You shouldn’t put your codebase into a messy state without a good reason, though. The longer it stays like that, the harder cleanup will be (e.g. new callers of your redundant code will be added and will need to be unified; you will slowly forget which functions were copied and why they were patched to differ from the originals; etc.).

Another way to cut corners that needs to be mentioned is to skip writing automated tests. Be very, very careful about doing this, though. On one particularly embarrassing project, I skipped integration tests to save time and ended up delaying the project by probably a month because of how slow we became at discovering and fixing the final issues in the bug queue (we did end up building automated tests, which was how the project got finished, but only after wasting a lot of time first).

There are certain cases where skipping tests makes sense, though. One example: I once worked on a bug where our product would get slower and flakier over about ten hours, until it stopped working completely and needed to be restarted. After a lot of debugging, we determined that a previous refactor had accidentally moved a library call that started a TCP session to the inside of a retry loop, causing us to leak TCP sessions. Testing the fix (at least naively) would’ve required building a test harness in which we set up both our product and our upstream dependency and let them run for a day, but the fix itself was a two-line change: moving the function call back to the right place. Customers couldn’t use our product as it was, so we merged the fix with no test (until later).
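
For readers who haven’t hit this kind of leak, here is a minimal sketch of its shape. It is not the real code from that project: `FakeServer`, `open_session`, `call_leaky`, and `call_fixed` are hypothetical stand-ins, and the fake only counts sessions instead of opening real TCP connections.

```python
class FakeServer:
    """Hypothetical stand-in for the upstream dependency; counts open sessions."""

    def __init__(self, failures_before_success: int) -> None:
        self.open_sessions = 0
        self._failures_left = failures_before_success

    def open_session(self) -> "FakeServer":
        # Each call represents one new session that is never closed.
        self.open_sessions += 1
        return self

    def request(self) -> str:
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("transient failure")
        return "ok"


def call_leaky(server: FakeServer, attempts: int = 5) -> str:
    # Bug: the session is opened inside the retry loop,
    # so every retry leaves one more session behind.
    for _ in range(attempts):
        session = server.open_session()
        try:
            return session.request()
        except ConnectionError:
            continue
    raise ConnectionError("gave up")


def call_fixed(server: FakeServer, attempts: int = 5) -> str:
    # Fix: open the session once, outside the loop, and reuse it.
    session = server.open_session()
    for _ in range(attempts):
        try:
            return session.request()
        except ConnectionError:
            continue
    raise ConnectionError("gave up")
```

With three transient failures, `call_leaky` ends up holding four sessions while `call_fixed` holds one. A toy like this shows why the fix is small; it does not replace the day-long harness that a faithful test of the real system would have needed.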

]]>
Mastering Programming https://creating.software/reference/kent_beck_mastering_programming/ Sun, 02 Oct 2022 20:00:00 -0700 https://creating.software/reference/kent_beck_mastering_programming/ This post is a copy of a Facebook post by programmer Kent Beck (https://www.facebook.com/notes/655499231823308/). I still refer back to it and have found the advice about slicing up work to be particularly helpful (and I now pass it on to others).


    Kent Beck

From years of watching master programmers, I have observed certain common patterns in their workflows. From years of coaching skilled journeyman programmers, I have observed the absence of those patterns. I have seen what a difference introducing the patterns can make.

Here are ways effective programmers get the most out of their precious 3e9 seconds on the planet.

The theme here is scaling your brain. The journeyman learns to solve bigger problems by solving more problems at once. The master learns to solve even bigger problems than that by solving fewer problems at once. Part of the wisdom is subdividing so that integrating the separate solutions will be a smaller problem than just solving them together.

Time

  • Slicing. Take a big project, cut it into thin slices, and rearrange the slices to suit your context. I can always slice projects finer and I can always find new permutations of the slices that meet different needs.
  • One thing at a time. We’re so focused on efficiency that we reduce the number of feedback cycles in an attempt to reduce overhead. This leads to difficult debugging situations whose expected cost is greater than the cycle overhead we avoided.
  • Make it run, make it right, make it fast. (Example of One Thing at a Time, Slicing, and Easy Changes)
  • Easy changes. When faced with a hard change, first make it easy (warning, this may be hard), then make the easy change. (e.g. slicing, one thing at a time, concentration, isolation). Example of slicing.
  • Concentration. If you need to change several elements, first rearrange the code so the change only needs to happen in one element.
  • Isolation. If you only need to change a part of an element, extract that part so the whole subelement changes.
  • Baseline Measurement. Start projects by measuring the current state of the world. This goes against our engineering instincts to start fixing things, but when you measure the baseline you will actually know whether you are fixing things.

Learning

  • Call your shot. Before you run code, predict out loud exactly what will happen.
  • Concrete hypotheses. When the program is misbehaving, articulate exactly what you think is wrong before making a change. If you have two or more hypotheses, find a differential diagnosis. (I have written about the great value of this habit specifically in my essay on debugging - Matthew)
  • Remove extraneous detail. When reporting a bug, find the shortest repro steps. When isolating a bug, find the shortest test case. When using a new API, start from the most basic example. “All that stuff can’t possibly matter,” is an expensive assumption when it’s wrong.
    • E.g. see a bug on mobile, reproduce it with curl
  • Multiple scales. Move between scales freely. Maybe this is a design problem, not a testing problem. Maybe it is a people problem, not a technology problem [cheating, this is always true].

Transcend Logic

  • Symmetry. Things that are almost the same can be divided into parts that are identical and parts that are clearly different.
  • Aesthetics. Beauty is a powerful gradient to climb. It is also a liberating gradient to flout (e.g. inlining a bunch of functions into one giant mess).
  • Rhythm. Waiting until the right moment preserves energy and avoids clutter. Act with intensity when the time comes to act.
  • Tradeoffs. All decisions are subject to tradeoffs. It’s more important to know what the decision depends on than it is to know which answer to pick today (or which answer you picked yesterday).

Risk

  • Fun list. When tangential ideas come, note them and get back to work quickly. Revisit this list when you’ve reached a stopping spot.
  • Feed Ideas. Ideas are like frightened little birds. If you scare them away they will stop coming around. When you have an idea, feed it a little. Invalidate it as quickly as you can, but from data not from a lack of self-esteem.
  • 80/15/5. Spend 80% of your time on low-risk/reasonable-payoff work. Spend 15% of your time on related high-risk/high-payoff work. Spend 5% of your time on things that tickle you, regardless of payoff. Teach the next generation to do your 80% job. By the time someone is ready to take over, one of your 15% experiments (or, less frequently, one of your 5% experiments) will have paid off and will become your new 80%. Repeat.

Conclusion

The flow in this outline seems to be from reducing risks by managing time and increasing learning to mindfully taking risks by using your whole brain and quickly triaging ideas.

]]>
Feedback https://creating.software/essays/feedback/ Sat, 01 Oct 2022 12:05:16 -0700 https://creating.software/essays/feedback/ What I learned about feedback as a junior engineer. I found this perspective transformative early in my career, and gave it to many of my early-career friends who found it equally helpful. These days I find this post a little sophomoric, but I leave it up in case it helps someone who happens to be where I was, psychologically speaking.

Early in my career, I always found performance reviews and their attendant constructive feedback incredibly, existentially stressful. Constructive feedback feels a lot like criticism, and criticism feels a lot like rejection. When you’re new and are insecure about your place in and value to an organization, even a whiff of rejection can be overwhelming.

The only cure I know about for that sort of general insecurity is time and experience (stick with it, buddy!), but I did eventually arrive at a better perspective on constructive criticism, one that I wish I’d found sooner:

Every career is like a mountain. At the summit stands the canonical ideal software engineer1. No one is here, because no one is perfect. Everybody starts their career somewhere around the base of this mountain and needs to go in some direction to get higher, but the direction you need to go depends on where you start. On a real mountain, if you start on the east side, you have to go west to get to the top, but if you start on the west side, you have to go east. Likewise, at work, if you’re too passive you may need to ask questions and share concerns more freely, but if you’re opinionated and loquacious, you may need to filter yourself. Some people talk the right amount but need to edit and test their code more thoroughly, others need to get more comfortable accepting some risk, merging their code and moving on. Where you start, in turn, depends on your personality and your whole life experience up to this point.

To move up, you can try to figure out the direction in which improvement lies and practice behaving that way. Or you can switch to another job, which will have a different summit in a different direction that might be closer to where you currently are.

This mindset implies a few things, which are what, I think, would’ve helped me. Most importantly, everyone in the organization—including the most senior engineers and executives—should be getting constructive feedback. If they’re not, it just means that those people don’t know what they’re bad at, not that they’re perfect (nobody is). Second, because distance grants perspective, a good manager is a coach who can grow your skills and subsequently your career more quickly and to a greater extent than you could on your own. Therefore, you should be comfortable discussing your work-related weaknesses with your manager (in my opinion, managers shouldn’t be in charge of their reports’ promotions because that disincentivizes their reports from doing this).

As I described in my essay on decisions, I really struggled at work when I started my career. That I didn’t really trust my manager in my early days made my early problems intractable. I was more focused on pleasing than growing, so I’m sharing this, in part, for anyone else in my old position. Welcome constructive feedback and open up to your manager (or start job-shopping). Real career advancement only comes from actually getting good at this, and if you want that, you need constructive feedback.

Finally, Julia Evans has a great, relevant Zine that I recommend.


  1. Or at least your company’s version of it. ↩︎

]]>
About https://creating.software/about/ Fri, 30 Sep 2022 13:06:08 -0700 https://creating.software/about/ Hi! I’m a software engineer living in the San Francisco Bay Area. I write about the ideas I’ve encountered that have helped me the most in my career, or (when I’ve dispensed them as advice) seem to have helped others. I hope some of them help you.

I’m currently an engineer at SentiLink. Previously, I was the fourth (or third, depending on how you count) engineer at Pachyderm. I’ve also been an engineer and engineering manager at HPE (post-acquisition. We made a box) and Google. I love to hear from any readers, so if anything on here piques your interest, please get in touch!

]]>
Debugging https://creating.software/drafts/debugging/ Fri, 30 Sep 2022 12:05:16 -0700 https://creating.software/drafts/debugging/
  • The epistemological structure of a piece of software and a scientific theory is the same: you can’t “prove” a theory is correct in the same way you can’t “prove” a piece of software has no bugs, but you can do lots and lots of testing
  • The logical parallel clarifies (IMO in an interesting way) a fact of scientific research that I didn’t understand for a long time: you need expertise/peer review to evaluate scientific studies. This is for the same reason that programmers need domain-specific experience to know whether a given piece of software is “well-tested” or not. “Well-tested”, as a criterion, is defined relative to other, similar software and to standard practices in that field. This is why “facts guy” is often wrong: they don’t know what works, so they don’t know if a given experiment is rigorous.
  • When you have any kind of non-trivial bug in your software, you have two problems: (1) the software is broken and (2) your mental model is broken (i.e. you don’t know why the bug is happening because you don’t know what the software is doing). It’s sort of impossible to fix (1) without fixing (2), so my debugging technique focuses on fixing (2).
  • Fixing broken mental models is the purpose of the scientific method, and it really works great for debugging. I just keep a Google Doc, and I use my mental model to make guesses about what the software is doing (but not necessarily about the root cause of the bug; just anywhere I think I might be wrong about anything).
  • Once I have a hypothesis about what the software is doing, I literally write out “Hypothesis”, “Experiment”, “Result” in my Google Doc. I think of one or more experiments to test the Hypothesis, write them down, do them, and record their results.
  • “Result” always starts with “disproves hypothesis” or “consistent with hypothesis” (again, you can’t ‘prove’ a hypothesis is true. But you’re testing it, and “consistent with” means the hypothesis passed this test, so it’s not wrong yet). I color-code the “Hypothesis-Experiment-Result” block red or gray if the hypothesis is disproven, and green if it’s not.
  • This technique parallelizes well: if lots of people are working on a bug, each person can do an experiment in parallel. Similarly, if a manager wants to see progress in investigating a bug, this list of hypotheses and experiments is a good way to show that.
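The Hypothesis/Experiment/Result log described above is simple enough to sketch as a data structure. This is only an illustration, not the Google Doc itself: the class names, the `PENDING` state, the "uncolored" label for untested hypotheses, and the example bug are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Result(Enum):
    PENDING = "not run yet"  # assumption: experiments written down but not yet done
    DISPROVES = "disproves hypothesis"
    CONSISTENT = "consistent with hypothesis"


@dataclass
class Experiment:
    description: str
    result: Result = Result.PENDING
    notes: str = ""


@dataclass
class Hypothesis:
    claim: str
    experiments: List[Experiment] = field(default_factory=list)

    def status(self) -> Result:
        results = [e.result for e in self.experiments]
        if Result.DISPROVES in results:
            return Result.DISPROVES  # a single failed test is enough
        if Result.CONSISTENT in results:
            return Result.CONSISTENT  # passed so far, which is not "proven"
        return Result.PENDING

    def color(self) -> str:
        # red = disproven, green = not wrong yet; "uncolored" is an assumption
        colors = {
            Result.DISPROVES: "red",
            Result.CONSISTENT: "green",
            Result.PENDING: "uncolored",
        }
        return colors[self.status()]


# Hypothetical bug investigation, for illustration only.
h = Hypothesis("The cache returns stale entries after a restart")
h.experiments.append(
    Experiment("Restart the server, then read key 'a'", Result.CONSISTENT, "got the old value")
)
h.experiments.append(Experiment("Restart with the cache disabled, then read key 'a'"))
print(h.claim, "->", h.status().value, f"({h.color()})")
```

Because each `Experiment` is an independent record, several people can each take one and fill in its result, which is the parallelism the last bullet describes.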
    Developing Incrementally https://creating.software/drafts/incremental/ Fri, 30 Sep 2022 12:05:16 -0700 https://creating.software/drafts/incremental/
  • Incrementality is a style of development that affects everything in a software company, from “how to structure PRs” at the bottom to “how to release and market products” at the top.
  • I have a collection of practices that I’ve learned, whose general theme is to make software development more incremental. I use them unless I have a good reason not to. I’ve seen more failures from insufficient incrementality than from superfluous incrementality, but I’ve seen a non-zero number of failures of each type.
  • Starting from the top: it’s good to build MVPs. Every product is, at release time, an experiment testing the hypothesis “people will like this”.
  • Engineers are often anxious about releasing MVPs because they have visions of being overwhelmed by operational problems. I’ve learned that, typically, no one uses a piece of software on release, and you usually have several weeks to fix things before your software gets any users at all (even with good marketing), and perhaps months before you get more than a handful of users.
  • At a lower level of abstraction than the software product itself: I usually try to include the ability to release experimental features.
    • I usually implement this with a single “experimental mode” feature flag, client library, or beta release series, containing all experimental features to limit combinatorial complexity.
      • I know some projects, e.g. ember.js (https://emberjs.com/) and Google Chrome (https://www.google.com/chrome), include a set of feature flags, one per experimental feature. If you’re confident you can manage the combinatorial complexity, this is better for users because they can use as little experimental code as they need.
    • This way, you can release features as “experimental” as you develop them, get feedback from 1-2 interested users, iterate, and then release those features as “non-experimental” in the next major release.
    • You can greatly reduce value risk and product risk with this approach, and also provide more value directly as experimental users aren’t stuck waiting for “the next big release that fixes everything”.
  • Finally, at a lower level of abstraction than “features”: I strongly endorse writing code incrementally:
    • Write a design doc before writing any code.
      • Even if you don’t show it to anybody (initially), design docs are much shorter than code, but detailed enough to reveal a lot of design problems. Iterating on the design is much, much faster when writing English.
      • They’re also a useful piece of documentation (make sure to include why the project is needed)
      • They can obviate annoying status meetings; just record your implementation progress in the design doc as you go and send it to partners/managers who want to see progress.
      • On teams with limited product vision, a common problem is that there are too many ideas. Design docs serve as a crude triage mechanism by imposing a “proof of work” burden on new ideas. If someone wants to take the product in yet another new direction, you can delay (or sometimes eliminate) debate by asking them to write a design doc first. This is pretty dysfunctional, but it’s better than actually changing direction every day.
    • Write any new persistent data structures or schemas next. Whenever writing new code you should always write the data structures first.[1][2]
      • When it’s time to write code, I’m a huge, huge fan of breaking up patches as much as possible. Reviewers are sometimes annoyed by the flood of patches, but in my experience, code gets merged much more quickly and safely this way, because each patch is between easy and trivial to review, so they get reviewed immediately.
      • To do this, I often implement each change twice: once as a monolithic patch that contains a whole prototype of the feature (which I eventually discard), and then again as a series of small patches. I use the monolithic change as a guide for what’s left to merge (by continually rebasing on ’trunk’ and refactoring as I merge patches), and try to factor out:
        • any non-functional changes (e.g. updating comments, renaming variables), merged as separate patches. In general, since I’m trying to make reviews fast, I also try to keep diffs small, and factoring out non-functional changes is critical to that goal.
          • For example, if I move a function, rename it, and change the implementation, I’ll make that three separate patches:
            • Moving a function is trivial to review (the diff is the size of the function but the lines are the same)
            • renaming a function is trivial to review (the diff is 1+number of callers, but every diff line is a simple replace)
            • changing the implementation is nontrivial to review, but because the function has already been moved and renamed, the diff is no larger than the function body and shows exactly what’s different.
        • any new classes or internal data structures, with no implementation. This attracts a lot of design feedback that is much easier to apply before the implementation is written
        • any new methods/APIs (again, with an implementation of “error: not implemented”). The implementation and tests are added in a second, now-smaller followup patch.
        • each API call’s implementation, and tests for just that API
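The last few bullets above (data structures first, then API stubs, then one implementation at a time) can be sketched as follows. This is a minimal illustration under stated assumptions: `Widget` and `WidgetStore` are hypothetical names, and the snapshot shows the code after the third small patch, with one call implemented and one still stubbed.

```python
from dataclasses import dataclass
from typing import Dict


# Patch 1: the new data structure only, with no behavior.
# Design feedback is cheapest to apply at this point.
@dataclass(frozen=True)
class Widget:  # hypothetical example type
    widget_id: str
    name: str


class WidgetStore:  # hypothetical example API
    """Patch 2 added this class with every method raising NotImplementedError."""

    def __init__(self) -> None:
        self._widgets: Dict[str, Widget] = {}

    def create(self, widget: Widget) -> None:
        # Patch 3: the implementation of just this call, merged with its own tests.
        if widget.widget_id in self._widgets:
            raise ValueError(f"widget {widget.widget_id!r} already exists")
        self._widgets[widget.widget_id] = widget

    def get(self, widget_id: str) -> Widget:
        # Still the stub from patch 2; a later patch replaces it.
        raise NotImplementedError("error: not implemented")


store = WidgetStore()
store.create(Widget("w1", "first widget"))
```

Each patch in the series is small enough to review at a glance, and the stubbed methods keep the code importable on trunk while the remaining implementations are still in flight.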

  • Other writing (specifically about breaking up patches) that I’ve done on this, which I’d like to incorporate:

    • Small patches are easier to review and will go through review faster, especially if some patches are refactoring-only. If you need to refactor your code in order to make a change, it will be much faster to refactor first, get that change reviewed, and then make a much smaller patch with logical changes and tests.
    • Bugs are more likely to be caught during the review.
    • There’s no epic merge that has to be made once the change is done, which is a major source of bugs when not doing things incrementally.
    • The main branch will not change much while each patch is in progress or (much more frustratingly) in review.
    • You’ll have a much better sense of whether the project is behind schedule and when it’s likely to be done.

    In fact, an important part of working incrementally is separating refactoring (which is not user-visible, does not require tests, and is comparatively fast to review) from logic changes. Typically, you should refactor first (which, again, will be fast to review) such that your subsequent logic-change patch is as small as possible, and then make the logic change, which will be fast to review due to its smallness.

    How to Slice Up Work

    • If you’re making some big, cross-cutting change, notice when some part of that change could be done as a standalone refactoring patch. Make that change and get it reviewed separately. By the end, your big cross-cutting change may be fairly small.

    • If adding a new API or codepath, don’t add the whole thing all at once. First make changes to the data layer (e.g. the schema, the protobuf, the jsonspec, etc.), then add the API implementation, then add all the call sites. Reviewing each of these separately minimizes the cost of any changes proposed. Specifically, if you make all changes in one PR, and then a reviewer recommends a different schema, much of your code will have to be rewritten. If you get the schema merged first, you can confidently add an API implementation knowing that it’s approximately correct, and then add callers knowing the API won’t change.

    • The three-part change: when changing some method and all of its callers, rather than changing everything in one mega-patch, you should copy the method to a new implementation containing the desired changes, then gradually change all callers to use the new method over several patches, then delete the old method. This mostly allows you to minimize the amount of time your change spends in progress and limit the number of new callers of the old method that other engineers add while you’re working.

      If you’ve worked for a big SaaS company (Google, Facebook), you know this type of change is common there, to work around mismatched deployment schedules (add new code to server, deploy updated server. Change client to call new code, deploy updated client. Delete old code from server, deploy updated server. Done.), but if you haven’t, it’s good to know about.
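The three-part change can be sketched in a few lines. The function names and the tax example are hypothetical, chosen only to show the shape of the migration; the snapshot is taken partway through step 2, with one caller migrated and one not.

```python
def total_price(prices):
    """Old method. It stays untouched until every caller has moved (deleted in step 3)."""
    return sum(prices)


def total_price_with_tax(prices, tax_rate=0.0):
    """Step 1: a copy of the old method containing the desired change (tax support)."""
    return sum(prices) * (1 + tax_rate)


# Step 2: callers move to the new method one small patch at a time.
def checkout_receipt(prices):
    # Already migrated in an earlier patch.
    return f"Total: {total_price_with_tax(prices, tax_rate=0.0):.2f}"


def invoice_total(prices):
    # Not migrated yet; a later patch switches this call site.
    return total_price(prices)


# Step 3 (final patch): once nothing calls total_price, delete it.
print(checkout_receipt([10, 5]))
print(invoice_total([10, 5]))
```

At every point in the series both methods work, so each patch can be merged (or deployed) independently, which is the same property the server/client deployment sequence above relies on.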

    Technical Risk https://creating.software/drafts/technical_risk/ Fri, 30 Sep 2022 12:05:16 -0700 https://creating.software/drafts/technical_risk/
  • Software people love to debate the value of software timelines/project deadlines [1][2][3]. I think this hand-wringing arises from an incomplete understanding of how deadlines get missed.
  • Deadlines are really a problem when software projects are harder than you thought they’d be. Emphasis on “than you thought” rather than on “harder”.
  • Technical risk is, by definition, not knowing how hard a technical project is.
  • Technical risk and difficulty are totally orthogonal. Digitizing forms is high-difficulty, low-risk. “I think there’s an API for that” is low-difficulty, high-risk
  • Risk depends on both the nature of the project and the knowledge of the developer. Everything is higher-risk for new teammates who don’t know the tech stack well, and it’s highest risk for new devs who don’t know any analogous tech stacks
  • A challenging corollary of this is that only the implementing developer knows how risky a project is, because only they know what they don’t know.
  • Another corollary is that senior engineers can provide a huge amount of value just by de-risking others’ projects with the knowledge they passively hold (by estimating whether an approach will work and answering questions as the work progresses).
  • There are lots of techniques for managing/mitigating technical risk: spikes (for the risks you know about) and MVPs. My favorite game is “what would it take to get this done in T/2?” (https://news.ycombinator.com/item?id=21479289), which forces you to reflect on what you know will work vs. what you hope will work.

  [1] https://jproco.medium.com/how-to-deliver-software-without-deadlines-872f8eb244b0
    To Do https://creating.software/todo/ Thu, 01 Jan 1970 00:00:01 -0700 https://creating.software/todo/

    In Progress
    • Teamwork
    • Time Management

    To Do

    • Add profile picture to “about” page

    • Post “Decisions” somewhere - HN I guess

    • Add comments to the Kent Beck Facebook post

    • Turn the “references” section into a single-page list of links, with comments explaining what you like about each

      • Singapore essay on the model
        • Went into “teamwork”
      • “No Silver Bullet”, maybe as an aside? It’s too long, but it has some good parts.
        • Went into “teamwork”
      • Kent Beck Facebook post, mirrored with comments
        • Seems like this should go into “Time Management”?
      • Anything else? There must be others.
    • Debugging

      • #software-craftsmanship, #advice-for-my-former-self
    • Incremental Development

      • #software-craftsmanship, #advice-for-my-former-self
    • Technical Risk

      • #management
    • Tag existing posts

    • Management advice?

      • Starting to think of problems in terms of “an engineer could solve this”
      • Know your audience, and avoid negativity unless it’s specifically worth damaging your relationship with someone to make them more anxious
        • Most engineers are too anxious already

    Done

    • Decisions
      • This is the only one I’m really happy with right now
      • #software-craftsmanship
    • Mark the other posts as drafts (or move it to a “drafts” folder or something?)
    • Feedback
      • mostly done, but this should be split up, I think
        • Update: not sure how I’d split this up, actually
    • Ladder
      • I guess this is done, but it doesn’t seem very interesting to me, in retrospect. Also the intro is too long and self-absorbed.
        • Revised this a bit, now I think it’s a little better.
      • Add examples of projects that engineers at each level would do