<p><strong>dev.in.the.shell</strong>: Yet another dev blog (<a href="https://devintheshell.com/">devintheshell.com</a>)</p>
<h1><a href="https://devintheshell.com/blog/pairing-suck/">How to SUCK at pair programming</a></h1>
<p><em>Make your co-workers hate you</em>. Thu, 03 Jul 2025</p>
<p>Pair programming is a wonderful technique where two developers come together to accomplish one task with half the productivity and twice the resentment.</p>
<p>For those looking to derail this process with style and master the art of collaborative sabotage, here's your step-by-step guide to making every session as painful and unproductive as humanly possible.</p>
<h2>Establish Dominance</h2>
<p>The driver role is sacred, and by sacred, I mean yours. Hold on to it for dear life.
When your partner timidly suggests a role swap, laugh softly and say, "No worries, I’ve got it".</p>
<p>You didn't get here by being a team player; you got here by pushing people around until you got your way.</p>
<p>Let 'em know who's boss.</p>
<h2>Ignore Your Partner</h2>
<p>They mention a typo? Nod silently and keep typing.
Suggesting a better approach? Let out a passive-aggressive "mhmm" and proceed to do it <strong>your way</strong> (the best way).</p>
<p>Don't miss the opportunity to make your partner feel like there's no point in even talking.</p>
<h2>Human Rubber Ducky</h2>
<p>Tired of writing code? Time to piss away time looking at your phone.
This is a great moment to mercifully allow that other poor soul to type.</p>
<p>Your primary role here is to be utterly, completely, and silently useless.</p>
<p>Your partner types, you stare.
They debug, you breathe.
They ask for input, you offer a cryptic grunt or perhaps a well-timed yawn.</p>
<p>The goal is to be less helpful than a syntax error, essentially transforming yourself into a warm body occupying a chair.</p>
<h2>The <s>Navigator</s> Distractor</h2>
<p>Since you're not typing, and hence not paying any attention whatsoever, it's a great moment to bring up anything that comes to mind. The less relevant, the better.</p>
<p>That meeting that nobody asked for? Complain about it.
You ate some crap last night and got mild diarrhea? Let 'em know.
Planning your next trip? Give them all the details.</p>
<p>Bonus points if they are debugging prod.</p>
<h2>Nitpicks For Days</h2>
<p>Conversely, if you can't manage complete apathy, go for the opposite extreme. Every keystroke is an opportunity for critique.</p>
<p><em>"You missed a white space!"</em>
<em>"Do you really need a while loop?"</em>
<em>"I don't like that variable name!"</em></p>
<p>Your partner should feel like they're undergoing a highly aggressive driving test, with you as the perpetually disappointed instructor.
The key here is to offer no constructive alternatives, just relentless, nitpicky condemnation.</p>
<h2>Weaponize Questions</h2>
<p>You already know it all, so questions only serve one purpose: to traumatize your co-worker.</p>
<p><em>"You know how this works, <strong>right</strong>?"</em>
<em>"Did you not see this coming?"</em>
<em>"Who would write crap like this?"</em></p>
<p>Passive-aggressive is your middle name.</p>
<h2>Praise, But Not Really</h2>
<p>Sprinkle in some motivational comments like:</p>
<p><em>"I like it...as a first approach."</em>
<em>"That's quite good...for someone your level."</em>
<em>"That's an...interesting way to do it."</em></p>
<p>Make them feel like there's hope...and then crush it.</p>
<h2>Your Own Best Practices</h2>
<p>Use the "best practices" hammer to strike down any and all approaches to programming you personally dislike.</p>
<p>Don't like functional programming? Best practice is to use OOP.
Don't like layers? Best practice is to do everything in one file. No modules, no nothing. Call it something fancy like "locality of behavior" to hide the fact that it's just bullshit.
Doesn't matter if you use the term incorrectly, just shoot some fancy-sounding words at the problem.</p>
<p>"Best practice" is whatever <strong>you</strong> want it to be.</p>
<h2>Code Style Is A Weapon</h2>
<p>Spacing? Braces? Tabs vs spaces? Semicolons?
You are the Judge, Jury, and Linter. Repeat with me: "<strong>I am the law</strong>".</p>
<p>Declare a new law every 15 minutes.
Retroactively shame them for not knowing the law.</p>
<p>If they question the law, come up with some "industry standard" that confirms the law and refuse to cite sources.</p>
<h2>Make It Weirdly Personal</h2>
<p>If they make a mistake, use it as a segue into a dysfunctional psychotherapy session.</p>
<p><em>"You are very thorough with your tests. Do you feel insecure about your ability to write bug-free code?"</em>
<em>"Interesting variable name. What made you come up with it?"</em></p>
<p>Don't evaluate the code, judge the person.</p>
<h2>Conclusion</h2>
<p>Remember, pair programming isn't about collaboration.
It's a psychological endurance sport, and you don't need to be the best: you just need to make your partner collapse faster than you.</p>
<p>It's a great chance to do <em>fuck all</em> while pretending to work, a chance to show someone how inferior their thought process, syntax choices, and entire existence are compared to yours.</p>
<p>Have some fun, do nothing productive, make them suffer.</p>
<h1><a href="https://devintheshell.com/blog/try-catch-repeat/">Try, Catch, Repeat</a></h1>
<p><em>Handling unexpected behavior</em>. Thu, 05 Jun 2025</p>
<p>A quick overview of different ways to handle errors in software development.</p>
<h2>Error Codes</h2>
<p>The simplest and oldest way to handle errors. You would usually return an integer or <code>null</code>.</p>
<p>Unix shells famously follow this convention by returning either <code>0</code> for success or <code>1-255</code> in case of error.
This way, not only can you inform the caller of an error, but also specify the kind or severity of it.</p>
<p>In C, <code>-1</code> is commonly returned to indicate an error. When the function returns a pointer rather than an <code>int</code>, <code>NULL</code> is returned instead.</p>
<p>Of course, there's an issue here: it's all just a matter of convention (which code should I use?).
There is no dedicated error type, nor any compile-time check that forces you to handle the error.
In fact, the caller might very well just ignore it!</p>
<p>Plus, this approach limits the signature and design of a function call.
What if I try to fetch a user from a DB and fail to connect? I can return <code>NULL</code>, but then what should I do when I just don't find the user? Also <code>NULL</code>?</p>
<h2>Exceptions</h2>
<p>A much more popular approach is to <em>"throw an exception"</em> or <em>"raise an error"</em>, depending on the specific language jargon.</p>
<p>This relies on the language providing some construct to create a separate, conditional execution flow.
Sort of like an <code>if/else</code> statement, only in this case, the <code>if</code> branch continues normal execution while the <code>else</code> branch unwinds the call stack until the error is handled (or the program halts).</p>
<p>This takes the form of a <code>try/catch</code> in Java or C#:</p>
<pre><code>try {
    doABarrelRoll();
} catch (Exception e) {
    // handle error
}
</code></pre>
<p>Or a <code>try/except</code> in Python:</p>
<pre><code>try:
    do_a_barrel_roll()
except Exception as e:
    pass  # handle error
</code></pre>
<p>More often than it should, this gets used as a clever way to avoid paying attention to errors: I can now <code>throw</code> them wherever I want, and as long as there's a <code>try/catch</code> somewhere up the stack, I'm golden.</p>
<p>As the call stack grows and the number of <code>throw</code>s grows with it, following this conditional logic gets incredibly complicated, especially when these Exceptions/Errors are not explicitly declared by the type system at compile time (looking at you, C#).</p>
<h2>Callbacks</h2>
<p>In functional-style languages and async-heavy programming, a function is passed as an argument to the main function being called, to be run whenever the main function finishes.
These are, of course, callback functions, as in they will be <em>called back</em> at a later point in time.</p>
<p>Typically found in and popularized by pre-promises JavaScript, it looks something like this:</p>
<pre><code>doABarrelRoll((err, result) => {
    if (err) {
        // handle error
    } else {
        // use result
    }
});
</code></pre>
<p>Of course, nothing stops you from calling other functions from within the callback function. And more functions in the callbacks for those functions...
Welcome to the infamous <em>Callback Hell</em>:</p>
<pre><code>getUser(function(user) {
    getPosts(user.id, function(posts) {
        getComments(posts[0].id, function(comments) {
            sendNotification(user.email, function(response) {
                console.log('Notification sent:', response);
            }, function(error) {
                console.error('Failed to send notification:', error);
            });
        }, function(error) {
            console.error('Failed to get comments:', error);
        });
    }, function(error) {
        console.error('Failed to get posts:', error);
    });
}, function(error) {
    console.error('Failed to get user:', error);
});
</code></pre>
<p>Of course, this is mostly seen in legacy codebases. JavaScript (thankfully) now uses <code>async/await</code> and <code>try/catch</code>, which is far from perfect, but miles better than this.</p>
<h2>Returning the Error</h2>
<p>As an alternative to error codes, some languages (most notably Golang) allow for multiple return values.</p>
<pre><code>res, err := doABarrelRoll()
if err != nil {
    // handle error
}
</code></pre>
<p>This way, the error doesn't interfere with the actual result of the operation, and both can be independently typed and checked accordingly.</p>
<p>Pretty nifty, but also quite verbose: the code gets littered with <code>if err != nil</code> everywhere.
Then again, at least with this approach devs are forced to confront and handle errors, which might be tedious but also necessary.</p>
<p>Also of note, nothing is stopping you from writing a function that returns <code>User, Post</code> instead of <code>User, Error</code>, so there's a whole other way to shoot yourself in the foot here.
In fact, some languages simulate this pattern by wrapping the return values in a list, so that one "value" is returned containing <code>n</code> values.
Complexity here can get out of hand real fast.</p>
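<p>Incidentally, shell scripts have long had a crude cousin of this pattern: stdout carries the "value" while the exit status plays the part of the error. A quick sketch with a made-up <code>find_user</code> function:</p>

```shell
# find_user is hypothetical: it prints the "value" on stdout
# and reports the "error" through its exit status
find_user() {
    if [ "$1" = "alice" ]; then
        echo "alice:1001"
    else
        return 1
    fi
}

# command substitution captures the value; the if checks the error
if user=$(find_user alice); then
    echo "found: $user"   # prints "found: alice:1001"
else
    echo "no such user"
fi
```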
<h2>Returning a Wrapper</h2>
<p>Similar in spirit to that list approach, we can be more explicit about the return values by wrapping them in a dedicated type.</p>
<p>A <code>Result</code> type is often used for this, such as in Rust:</p>
<pre><code>match do_a_barrel_roll() {
    Ok(success) => { /* handle success */ }
    Err(error) => { /* handle error */ }
}
</code></pre>
<p>Where the function signature looks something like this:</p>
<pre><code>fn do_a_barrel_roll() -> Result&lt;i32, String&gt;
</code></pre>
<p>And the definition of the <code>Result</code> type is:</p>
<pre><code>enum Result&lt;T, E&gt; {
    Ok(T),
    Err(E),
}
</code></pre>
<p>Or in Kotlin:</p>
<pre><code>when (val result = doABarrelRoll()) {
    is Ok -> { /* handle success */ }
    is Err -> { /* handle error */ }
}
</code></pre>
<p>Or Haskell:</p>
<pre><code>let result = do_a_barrel_roll
case result of
    Right val -> -- handle success
    Left err  -> -- handle error
</code></pre>
<p>Notice how Haskell is different in that instead of <code>Ok</code> and <code>Err</code> it has <code>Right</code> and <code>Left</code>.</p>
<p>This is because, instead of having a <code>Result</code> type like Rust and Kotlin, it uses <code>Either</code> as its return value.
In fact, <code>Result</code> is little more than a specialized version of <code>Either</code>, where the semantics more clearly indicate the meaning of each possible value.</p>
<p>This paradigm is kinda hard to get accustomed to, as it forces you to think about the error just as much as you do about the happy path. Which is awesome IMO, but takes some getting used to if you come from the "throw an error and hope for the best" school of thought.</p>
<p>Also, this requires either a language with a decently strong type system or very diligent developers.</p>
<h2>Conclusions</h2>
<p>At the end of the day, the language you use often defines the error handling approach for you.</p>
<p>That being said, as long as the rest of the team is on board, I would suggest playing around with different paradigms from the ones your stack defaults to.</p>
<p>Not everyone finds the same error handling system equally intuitive, and some contexts are just better suited for some paradigms than others.</p>
<h1><a href="https://devintheshell.com/blog/user-stories-suck/">Your User Stories Suck</a></h1>
<p><em>And your system is unhealthy because of that</em>. Thu, 22 May 2025</p>
<p>I have mixed feelings about <a href="https://martinfowler.com/bliki/UserStory.html">User Stories</a>. Not the structure, necessarily, but the idea that the <strong>end user is the only perspective that matters</strong>.
I feel this is distilled pretty well in User Stories, so I'll use them as scapegoats here.</p>
<p>At a high level it's pretty reasonable: focus on the user when defining a problem/need.</p>
<p><em>"As a [type of user], I want [some goal] so that [some reason]."</em></p>
<p>However, it's not immediately clear how "we need to update the database" or "we are asking to get hacked" fit into this frame.
To be more specific, if a dev's job is to take care of an iceberg worth of <em>"shit that needs doing"</em>, User Stories seem to only fit well for the exposed tip of the ordeal.
Plus, a user can <em>want</em> an infinite amount of features, including features that don't fit well into the product.</p>
<p>What do we do with those other tasks? How can we stop the feature creep? Are user stories not fit for purpose?</p>
<h2>User Stories</h2>
<blockquote>
<p>"User Stories are chunks of desired behavior of a software system."
<cite><a href="https://martinfowler.com/bliki/UserStory.html">Martin Fowler</a></cite></p>
</blockquote>
<p>Sounds like a great way to ensure the team focuses on adding value to the product instead of tinkering endlessly with not-so-relevant parts of the system.</p>
<p>It seems particularly useful in the context of <a href="https://theleanstartup.com/principles">Lean Startups</a>, or greenfield projects more broadly.
Especially in regard to <a href="https://www.startuplessonslearned.com/2010/09/good-enough-never-is-or-is-it.html">build-measure-learn</a>, where the assumption is that users will guide product direction through <strong>measured</strong> behavior and feedback.</p>
<p>That same approach kinda falls apart when you’re knee-deep in a legacy codebase or a mature product.
These systems are often full of past compromises (made to get to market fast) that now need attention, but <em>The User</em> doesn't know or care about this.</p>
<p>So what's the approach here? How do we make space for that work?</p>
<h2>Dev Stories</h2>
<p>You might try to create a somewhat separate backlog of "Dev Stories" to ensure <em>technical stuff</em> doesn't get drowned out by shiny new features.
But what would that look like?</p>
<p><em>"As a developer, I want to fix known issues so that I don't get emergency calls at 2 AM."</em>
<em>"As a developer, I want a secure, up-to-date tech stack so that my job doesn't suck as much."</em></p>
<p>Doesn't quite have the same ring to it, does it?</p>
<p>Jokes aside, this would mean juggling two separate backlogs, when one is already hard enough to keep at bay.</p>
<p>Also, how is your Product Owner/Manager/Analyst/Thing supposed to prioritize stuff they don't even understand?
How important is keeping your server OS updated? Is it more urgent than the feature marketing wants by Friday?</p>
<p>It's a hard sell without a clear link to user value.</p>
<h2>The Temptation of More</h2>
<p>The average user seems to want <em>one tool to rule them all</em>. Technically minded people (usually) know that to be a bad idea.
A tool that tries to do everything is often great at nothing, and since we (hopefully) want our products to be great, there is a conflict between what the users <s>think they</s> want and what is best for the product.</p>
<p>Add to that the constant marketing <em>"necessity"</em> to add flashy features to more easily sell the product, and you end up with a great recipe for <a href="https://en.wikipedia.org/wiki/Feature_creep">feature creep</a>.</p>
<p>This manifests as a near-infinite pile of User Stories that often begs the question: Who actually asked for this and why?</p>
<p>I think that question is pretty key.</p>
<p>Do we <strong>know</strong> what <em>The User</em> wants, or are we just guessing? Is marketing chasing trends or chasing value? Are we conflating <em>The User</em> with the marketing team?</p>
<p>Anecdotally, it's hard for me to map this feature obsession with actual people using actual software.
100% of my non-tech friends and family have 0 interest in <em>The Cool New Feature</em><sup><small><small>TM</small></small></sup> and would much rather have any given software be better and faster at what it already does, than poorly perform new tricks on every update.</p>
<p>Do we actually know what percentage of our users want/need that new feature, or are we spending time and money chasing hunches?</p>
<h2>Hierarchy of (Software) Needs</h2>
<p>I find it useful to adapt <a href="https://en.wikipedia.org/wiki/Maslow's_hierarchy_of_needs">Maslow's pyramid</a> to software, where each additional level only adds value if the previous one is in place.</p>
<p>In my mind, a software product should be:</p>
<ol>
<li><strong>Functional</strong>: It does what it should.</li>
<li><strong>Reliable</strong>: It can be trusted to do so.</li>
<li><strong>Usable</strong>: It's accessible to non-tech users.</li>
<li><strong>Secure</strong>: It protects user data and privacy.</li>
<li><strong>Performant</strong>: It's <em>reasonably</em> efficient.</li>
<li><strong>Scalable</strong>: It adapts to <em>reasonable</em> increases in workload.</li>
<li><strong>Delightful</strong>: Modern UI, quality-of-life features, polish, etc.</li>
</ol>
<p>The point here isn't to undervalue #7. It's critically important.
But using a beautiful UI to entice a bunch of new users to use a service that can barely cope with the current ones makes no sense.
And showing off shiny new features when user data is simply not safe in the system due to poor security practices is not only disingenuous, but immoral.</p>
<p>The issue here is that the average user only seems to see levels #1, #3 and #7: the exposed tip of the iceberg.</p>
<h2>Blind User Is Blind</h2>
<p>Most users are too technically illiterate to ask for quality software.
But that desire exists, it's just 'hidden' as a combination of implicit assumptions and unconscious expectations.</p>
<p>This is often visible after the fact: Once a service gets hacked and its users start getting spam calls all day, they care about security. If the software starts behaving subjectively "too slow", they will complain. If the team takes too long to adapt to market changes due to horrible developer experience, they will leave.</p>
<p>By the time <em>The User</em> asks for security, reliability or scalability to be "added" to a system, it's usually too late.
These are not things you add later, they must be built and designed into the system from the start and maintained constantly.
They can hardly be just bolted on.</p>
<p>However, I don't think that makes User Stories unfit.
It's just a matter of reading between the lines, surfacing implicit requirements instead of waiting for <em>The User</em> to manifest them.
These implicit needs should bubble up as any other item in the backlog.</p>
<p><em>"As a paying user, I want the system to scale well so that broad adoption doesn't hinder my experience."</em>
<em>"As a privacy-conscious user, I want my data protected, so future attacks don't affect me even if successful."</em></p>
<p>If these User Stories don't fit in the backlog, that's just the team not caring about their users.
At least not in those regards.</p>
<h2>So What Do We Do?</h2>
<p>It's always good to question new features:</p>
<ul>
<li>For whom is it?</li>
<li>How does it fit the system?</li>
<li>What's the cost to stability and maintainability?</li>
</ul>
<p>Learn to say no.
More features is not "more better".</p>
<p>I think we should, on one hand, expose those 'hidden features' as any other User Story; they are just as important, if not more so.
When raising a technical concern about the system, <strong>tie it to real user impact</strong>.</p>
<p>On the other hand, if you can't do this, either try harder or admit there is no actual reason to be concerned. You might just <em>want</em> to improve something and that's valid too.</p>
<h1><a href="https://devintheshell.com/blog/tame-your-terminal/">Make the terminal great again</a></h1>
<p><em>Tips, tricks and tools to make it nice(r)</em>. Thu, 24 Apr 2025</p>
<p>The terminal is a very powerful tool, but it seems to scare developers into using less productive GUIs.
This guide provides tips, tricks, and tools to enhance your terminal experience, making it more efficient and enjoyable.</p>
<h2>Choosing a shell</h2>
<p>The shell is what interprets the commands you type in the terminal.</p>
<p>This is not to be confused with the terminal emulator itself.
When you change the background of your terminal, you are configuring the <strong>emulator</strong>; when you set up an alias or run commands you are interacting with the <strong>shell</strong>.</p>
<p>There are <strong>a lot</strong> of shells out there, but the most frequently used ones are <code>bash</code> and <code>zsh</code>.
While <code>bash</code> is the most ubiquitous, <code>zsh</code> offers some worthwhile advantages that make it IMO the go-to option:</p>
<ul>
<li>Its tab completion is much better than <code>bash</code></li>
<li>It's <code>bash</code> compatible (unlike some other shells)</li>
<li>It's designed to be extensible via plugins</li>
</ul>
<p><a href="https://github.com/ohmyzsh/ohmyzsh">Oh My Zsh</a> (OMZSH) sometimes gets mixed up with <code>zsh</code> itself.
OMZSH is a big framework built on top of the Z shell, with tons of functionality built in.</p>
<p>While using it might make sense as a first step, I would suggest you move past it as soon as you feel comfortable doing so.
Realistically, there are two features you care about here: plugin management and fancy prompts.</p>
<p>As a plugin manager, OMZSH is incredibly overkill. Consider that <code>zsh</code> plugins can be handled simply by cloning and updating them.
There is really not much management needed.
Alternatives like <a href="https://github.com/zap-zsh/zap"><code>zap</code></a> are much, much faster and simpler.</p>
<p>As a prompt, it's clunky as all hell and doesn't really help with customization all that much.
Dedicated solutions like <a href="https://github.com/starship/starship"><code>starship</code></a> are much faster and more customizable.</p>
<p>Just in general, it pollutes your <code>zsh</code> config with a bunch of settings you might not really need (or want), as well as a huge amount of aliases you wouldn't even notice are there, but might alter how commands behave.</p>
<p>So yea, I'd say use it to help you get started but leave it behind as soon as you get comfortable.
More often than not, less is more.</p>
<h3>Plugins</h3>
<p>There are (at least) three <code>zsh</code> plugins you should use to greatly improve your experience with the shell.</p>
<p>Use <a href="https://github.com/zsh-users/zsh-syntax-highlighting"><code>zsh-syntax-highlighting</code></a> to add pretty colors to your commands as you type.
This will help you catch typos and mistakes faster.</p>
<p><a href="https://github.com/zsh-users/zsh-autosuggestions"><code>zsh-autosuggestions</code></a> suggests past commands based on what you type, as you type.</p>
<p><script src="https://asciinema.org/a/37390.js" id="asciicast-37390" async="true"></script></p>
<p>You can use <a href="https://github.com/Aloxaf/fzf-tab"><code>fzf-tab</code></a> to fuzzy find through <code>zsh</code>'s built-in tab completion.</p>
<p><script src="https://asciinema.org/a/293849.js" id="asciicast-293849" async="true"></script></p>
<h2>Tips</h2>
<p>People seem to think that using the terminal requires you to manually type out long commands or have a near-infinite amount of aliases.</p>
<p>Here are some other things you can do to reduce redundant typing.</p>
<h3>Clipboard</h3>
<p>Copy/pasting to/from the terminal seems to cause more issues than one might expect.
Most modern emulators support <code>Ctrl+Shift+C</code> for copying and <code>Ctrl+Shift+V</code> for pasting, but remember that you can always just pipe the results of a command to your clipboard:</p>
<pre><code>some --cool command | xclip -sel clipboard # or wl-copy, pbcopy, ...
</code></pre>
<p>Or pipe the content of your clipboard into a command:</p>
<pre><code>xclip -sel clipboard -o | some --cool command # or wl-copy, pbcopy, ...
</code></pre>
<h3>History expansions</h3>
<p>If you use the command line long enough, you will eventually get a <code>Permission denied</code> message after forgetting to use <code>sudo</code> to run a command.
You don't need to re-write or copy the whole thing: <code>sudo !!</code> will re-run the last command with <code>sudo</code> at the start of it.</p>
<p>Here are a bunch of other useful expansions:</p>
<pre><code>!! # expands to the previous command
!$ # expands to the last argument of the previous command
!^ # expands to the first argument of the previous command
!* # expands to all arguments of the previous command
!echo # expands to the most recent command starting with `echo`
!?echo? # expands to the most recent command containing the string `echo`
</code></pre>
<p>If these look like random symbols, understanding some <a href="../how-to-regex">basic regex syntax</a> might help.</p>
<h3>Simple but useful</h3>
<p>Moving around the file system with the command line can be a bit of a pain.
Always remember that running <code>cd</code> with no arguments sends you to your home directory, while <code>cd -</code> sends you back (and forth) to the previously visited directory.
So you can swap between two distant directories easily.</p>
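<p>For instance:</p>

```shell
cd /tmp    # jump somewhere
cd /etc    # jump somewhere else
cd -       # back to /tmp ("cd -" also prints the old directory)
cd -       # and forth to /etc again
```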
<p>Similarly, you might find yourself running the same set of commands in the same order multiple times, especially while troubleshooting issues.
This hints at a shell script being a better approach, but if that seems like overkill, you can use <code>;</code> and <code>&&</code> to automate this a bit:</p>
<pre><code>echo one && echo two # runs the second command only if the first one succeeds
echo one; echo two # runs the second command even if the first one fails
</code></pre>
<p>You can combine as many commands as you want, and go for a coffee while they run unattended.</p>
<h2>Tools</h2>
<p>As much as those tips help, some things are better handled by some clever command line tools.</p>
<h3>z.lua</h3>
<p><a href="https://github.com/skywind3000/z.lua"><code>z.lua</code></a> <em>"learns"</em> which directories you move to the most, and suggests them by <em>frecency</em> (frequency + recency).</p>
<p>In practice, this means that after a while of <code>cd</code>ing to directories like these:</p>
<pre><code>cd ~/Documents/repos/work/legacy
cd ~/Pictures/trips/colombia
</code></pre>
<p>You'll be able to run <code>z leg</code> or <code>z col</code> to go to those directories no matter where you are. It really does feel like it's reading your mind.</p>
<p>Using <code>zsh</code>, you can install it directly as a plugin by adding something like this to your config:</p>
<pre><code>plug "skywind3000/z.lua" # adapt syntax to match plugin manager
</code></pre>
<h3>fzf</h3>
<p><a href="https://junegunn.github.io/fzf/"><code>fzf</code></a> is a command line fuzzy finder.
What does it fuzzy find? <strong><em>Anything</em></strong>.</p>
<p>Running <code>fzf</code> in your home directory will list every single file in it, typing something will filter the results and pressing enter prints out the selected entry.</p>
<p>This might not seem like much at first, but consider that anything can be piped into <code>fzf</code>, and its output can also be piped into any other command.
We'll see this in action with some useful <a href="#aliases">aliases</a>. For now, adding these env vars to your shell config will make it behave more intuitively:</p>
<pre><code>export FZF_DEFAULT_COMMAND='rg --files -g "!.git" --hidden' # show hidden files but still ignore .git/ dir
# export FZF_DEFAULT_COMMAND='find . -type f -not -path "./.git/*"' # if not using ripgrep
export FZF_DEFAULT_OPTS='--height 70% --layout=reverse --border' # "better" (?) layout
</code></pre>
<h3>bat and eza</h3>
<p><a href="https://github.com/sharkdp/bat"><code>bat</code></a> is a prettier <code>cat</code> with git integration, while <a href="https://eza.rocks/"><code>eza</code></a> is a prettier <code>ls</code> with icons.
Plain and simple.</p>
<p><img src="https://camo.githubusercontent.com/43e40bf9c20d5ceda8fa67f1d95b5c66548b2f6f8dca8403e08129991cc32966/68747470733a2f2f692e696d6775722e636f6d2f326c53573452452e706e67" alt="bat" /></p>
<h3>File management</h3>
<p><a href="https://github.com/itchyny/mmv"><code>mmv</code></a> allows you to do bulk rename of files:</p>
<p><img src="https://user-images.githubusercontent.com/375258/72040421-d4f8cd00-32eb-11ea-828f-d9f14f3261ac.gif" alt="mmv" /></p>
<p>More broadly, consider installing a terminal file manager like <a href="https://yazi-rs.github.io/"><code>yazi</code></a>, <a href="https://ranger.fm/"><code>ranger</code></a>, or even good old <a href="https://vifm.info/"><code>vifm</code></a>, especially if you often have to fiddle around with the file system.</p>
<h3>Utilities</h3>
<p>As far as system monitoring goes, you'd be hard-pressed to find a better solution than <a href="https://github.com/aristocratos/btop"><code>btop</code></a>.</p>
<p><img src="https://github.com/aristocratos/btop/raw/main/Img/normal.png" alt="btop" /></p>
<p>If you struggle to handle git through the command line, <a href="https://github.com/jesseduffield/lazygit"><code>lazygit</code></a> might help.</p>
<p><img src="https://github.com/jesseduffield/lazygit/raw/assets/demo/commit_and_push-compressed.gif" alt="lazygit" /></p>
<p>Same goes for docker and <a href="https://github.com/jesseduffield/lazydocker"><code>lazydocker</code></a>.</p>
<p><img src="https://github.com/jesseduffield/lazydocker/raw/master/docs/resources/demo3.gif" alt="lazydocker" /></p>
<h2>Aliases</h2>
<p>Having gone through those tools, you can see how I cope with my crippling allergy to unnecessary typing:</p>
<pre><code>alias cat="bat"
alias cb="cd .."
alias cc="z"
alias cl="clear"
alias cpd="cp -ir"
alias fm="yazi"
alias l="eza --group --all --icons --long"
alias mkdir="mkdir -p"
alias rmd="rm -rf"
alias sctl="sudo systemctl"
alias tre="eza --all --icons --group-directories-first --tree --git-ignore"
# maybe add `--level=2` to reduce output
</code></pre>
<p>Which especially applies to <code>git</code>:</p>
<pre><code>alias ga="git add -A"
alias gamen="git commit --amend"
alias gc="git commit"
alias gcempty="git commit --allow-empty --allow-empty-message"
alias gcom="git add -A && git commit"
alias gfs="git fetch && git status"
alias glg='git log --graph --abbrev-commit --decorate --format=tformat:"%C(yellow)%h%C(reset)%C(reset)%C(auto)%d%C(reset) %s %C(white)%C(bold green)(%ar)%C(reset) %C(dim blue)<%an>%C(reset)" -15'
alias gmkb="git checkout -b"
alias gmv="git checkout"
alias gp="git pull"
alias gpush='git push'
alias gpushf='git push --force'
alias grmb="git branch -D"
alias gs="git status"
</code></pre>
<p>Aliases work as long as they are self-contained, but they offer no control over where the arguments end up.
For more complex use cases, we can use <strong>functions</strong> instead, which for this particular use case act just like aliases.</p>
<h3>Fancy funcs</h3>
<p>How often do you create a directory only to then have to <code>cd</code> into it?</p>
<pre><code>mkd() { mkdir -p "$1" && cd "$1" }
</code></pre>
<p>Or <code>cd</code> into a directory and instantly run <code>ls</code>?</p>
<pre><code>c() { cd "$1" && ls }
</code></pre>
<p>Adding this to your <code>zsh</code> config will allow you to run these functions as if they were built-in commands.</p>
<p>You can fuzzy find files in the current directory and open them with your favorite text editor in one go:</p>
<pre><code>vo() { file="$(fzf)" && nvim "$file" }
</code></pre>
<p>Or fuzzy find your <code>zsh</code> history to look for that command you barely remember:</p>
<pre><code>hist() {
eval "$(fc -l -1 0 | awk '{$1=""; print substr($0,2)}' | awk '!seen[$0]++' | fzf)"
}
</code></pre>
<p>How about installing packages? Package managers are awesome, but you need to know the exact name of what you're after.
We can use <code>fzf</code> to make our lives much easier:</p>
<pre><code># Arch
install() {
package=$(paru -Slq | fzf --preview 'paru -Si {1}') && paru -S --skipreview "$package"
}
# Debian
install() {
package=$(apt-cache pkgnames | fzf --preview 'apt-cache show {1}') && sudo apt install -y "$package"
}
</code></pre>
<p>This will present all available packages in <code>fzf</code> for you to fuzzy find the one you need. Pressing enter on the result will install the package.</p>
<p>Similarly, we can list all installed packages in <code>fzf</code> for easy removal:</p>
<pre><code># Arch
remove() {
package=$(paru -Qq | fzf --preview 'paru -Qi {1}') && paru -Rns --noconfirm "$package"
}
# Debian
remove() {
package=$(dpkg --get-selections | awk '{print $1}' | fzf --preview 'apt-cache show {1}') && sudo apt remove --purge "$package"
}
</code></pre>
<p>It fills me with no joy to admit that, more often than I'd like, I need to debug tests or commands that only fail sometimes.
This is always a pain in the ass, but something like this can make it easier:</p>
<pre><code>loop() {
local cmd=("${@:2}")
for i in {1.."$1"}; do
eval "$cmd"
if [[ $? != 0 ]]; then
echo "\nCommand failed on run $i"
return 1
fi
done
echo "\nCommand never failed"
}
</code></pre>
<p>Now, I can run <code>loop 5 make flaky_tests</code> once and have <code>make flaky_tests</code> run 5 (or 5000) times, reporting the failed iteration if present.</p>
<p>You can see how this can get out of hand fast.
Beware of complexity!</p>
<h3>Global aliases</h3>
<p>Some pipes and/or redirections are used quite often, but aliasing them won't work as you'd expect.</p>
<p>You can use <code>-g</code> to declare them as <em>global aliases</em> (as in they can be placed anywhere in the command, not just the beginning):</p>
<pre><code>alias -g C="| xclip -sel clipboard" # or wl-copy, pbcopy, ...
alias -g NOER="2> /dev/null"
alias -g NOOUT="> /dev/null 2>&1"
alias -g S="| sort"
alias -g SU="| sort -u"
</code></pre>
<p>Now I don't need to remember how to send the output to my clipboard.</p>
<h2>Shell setup</h2>
<p>There are some other minor tweaks you can make to your shell config to make it nicer.
For starters, choose a text editor and set it as the default, so you don't unexpectedly get thrown into an unfamiliar environment:</p>
<pre><code>export EDITOR=nvim
</code></pre>
<p>Explicitly set your desired terminal emulator so things don't open in some odd default terminal:</p>
<pre><code>export TERMINAL=kitty
export TERM=kitty
</code></pre>
<p>Don't ask why you need to set it twice...</p>
<p>Also, you can ensure the command history behaves sensibly:</p>
<pre><code># Number of commands to store in memory
export HISTSIZE=10000
# Number of commands to store in disk
export SAVEHIST=10000
# Ignore duplicated commands during session
setopt HIST_IGNORE_ALL_DUPS
# Ignore duplicated commands when saving to hist file
setopt HIST_SAVE_NO_DUPS
# Append commands to history file instead of overwriting it
setopt append_history
# Append commands to history file as soon as they are run (instead of when the session ends)
setopt inc_append_history
</code></pre>
<h3>Key bindings</h3>
<p>With <code>zsh</code>, we can assign any function to a key combination by first turning it into a <em>widget</em>:</p>
<pre><code>custom_func() {
echo "Hello from custom function!"
}
zle -N custom_func
bindkey '^H' custom_func
</code></pre>
<p>Here we use <code>zle -N</code> to register a function as a widget, and <code>bindkey</code> to assign it to <code>Ctrl+H</code>.
A widget is, keeping it simple, any ZLE (Zsh Line Editor) compatible command, which is what <code>bindkey</code> expects.</p>
<p>There's much more you can do with widgets (this is a great place to fall down a rabbit hole!) and <code>zsh</code> comes with a bunch of useful ones built-in.</p>
<p>There are two in particular that I find super useful when going up and down the command history: <code>up-line-or-beginning-search</code> and <code>down-line-or-beginning-search</code>.</p>
<p>By default, pressing the up and down arrow keys allows you to scroll up/down the command history.
These two widgets will scroll based on what you <strong>already typed</strong>.
So if I type <code>e</code> and then the up arrow, only commands starting with <code>e</code> will be shown (so <code>echo</code> would appear but <code>ls</code> would be ignored).</p>
<p>These need to be loaded in memory before being registered, which then allows you to bind them:</p>
<pre><code>autoload -U up-line-or-beginning-search down-line-or-beginning-search
zle -N up-line-or-beginning-search
zle -N down-line-or-beginning-search
bindkey $key[Up] up-line-or-beginning-search
bindkey $key[Down] down-line-or-beginning-search
</code></pre>
<p>This binding syntax is slightly different from the previous one.
Special keys (arrows, backspace, tab, delete, etc.) are handled this way.</p>
<p>Hopefully this helps you enjoy your time in the command line a bit more!</p>
How to git without hubhttps://devintheshell.com/blog/git-patch/https://devintheshell.com/blog/git-patch/Git patches and how to apply themThu, 10 Apr 2025 23:00:00 GMT<p>How did people use Git collaboratively before GitHub was a thing?
The two are often coupled, but some projects don't rely on GitHub (or any centralized service, for that matter) for their git hosting needs.
And these aren't just any old projects: the Linux kernel, Debian, Apache, the GNU core utilities and Go are some well-known projects that host their git repositories on their own infrastructure.</p>
<p>Ever wondered how they manage?
The simplest and most rudimentary way of using Git collaboratively is by sending changes to the maintainer in a file via email.</p>
<p>Let's go over what that workflow might look like.</p>
<h2>Where are my changes?</h2>
<p>We need a way to get a set of changes (often called <em>change-set</em>) out of your local repository so that they can be sent elsewhere easily.</p>
<p>A commit is represented by a hash or an alias, which wouldn't be very useful by itself in our case.
We can however get the underlying changes using <code>diff</code>.</p>
<h3>Creating a diff</h3>
<p>We can create a diff between two commits with something like <code>git diff <hash-1> <hash-2></code> and send the results to a file:</p>
<pre><code>git diff <hash-1> <hash-2> > mypatch.diff
</code></pre>
<p>You can also use a range of commits or something like <code>HEAD~2</code> to get the diff for the last 2 commits.
If given only one hash, <code>git diff</code> will create a diff between it and your working directory (so <code>git diff HEAD</code> on a clean working directory prints nothing).</p>
<p>This gives us a neat file with the changes we want to send upstream.</p>
<h3>Applying the diff</h3>
<p>Just run <code>git apply mypatch.diff</code>!</p>
<p>This will <em>apply</em> the changes in the diff to the working directory, but they won't be staged. The maintainer would have to stage them and create a commit to actually add the changes to the source tree.</p>
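<p>Here's the whole round trip sketched in a throwaway repository (file names and messages are made up for the demo):</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
echo "hello" > greeting.txt
git add greeting.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "add greeting"

# Modify a tracked file and capture the change as a diff
echo "hello, world" > greeting.txt
git diff > mypatch.diff

# Wearing the maintainer hat: start from a clean tree, then apply the diff
git checkout -- greeting.txt   # back to "hello"
git apply mypatch.diff         # greeting.txt reads "hello, world" again
```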
<h3>Shortcomings</h3>
<p>So this is great, but there are a couple of glaring issues here:</p>
<ul>
<li>The original author and metadata of the changes got lost</li>
<li>The original commits all got squashed into one (or re-organized however the maintainer decides)</li>
<li>A person that didn't write the changes (maintainer) got to commit them and appear as the author in the log</li>
</ul>
<p>This is fine for a quick POC or draft to share during development, but doesn't really scale well.</p>
<p>We need a way to maintain the original commits and their <strong>metadata</strong>, so the original contributor ends up in the log and their work is merged <em>as provided</em> (assuming no changes are needed from the maintainer).</p>
<h2>Creating a patch</h2>
<p>A patch is similar to a diff but keeps all the relevant metadata.
Since diffs are sometimes colloquially called patches, these can be called formatted patches.</p>
<p>You can create one using <code>git format-patch</code>:</p>
<pre><code>git format-patch -1 <commit-hash> --stdout > my_patch.patch
</code></pre>
<p>Here, <code>-1</code> is the number of commits to include in the patch, counting back from <code><commit-hash></code> (inclusive).
So for a series of commits like:</p>
<pre><code>A -- B -- C -- D
</code></pre>
<p><code>git format-patch -1 C</code> and <code>git format-patch B..C</code> would achieve the same thing: a patch of the changes between <code>B</code> and <code>C</code>.
Conversely, <code>git format-patch C</code> would produce a patch of the changes between <code>C</code> and <code>HEAD</code>, which in this case is <code>D</code>.</p>
<p>The <code>--stdout > my_patch.patch</code> bit is just to send the data to a file.</p>
<p>It would be nice if <code>git diff</code> and <code>git format-patch</code> had consistent interfaces, but consistency is not really a strong point of git's UI...</p>
<p>Anyway, if you inspect this file you'll see that it not only contains the changes but also a bunch of metadata about them.</p>
<p>Let's see how an actual formatted patch can be applied (because of course it's not <code>git apply</code> like before...).</p>
<h2>Applying the patch</h2>
<p>Running <code>git apply</code> on a formatted patch <em>"works"</em> but leaves all that metadata out of the picture, just like before.</p>
<p>So instead, we use <code>git am my_patch.patch</code> (<strong>a</strong>pply <strong>m</strong>ailbox if you're curious).</p>
<p>Now, all the original commits in the patch (with their timestamp and messages) will be added to the tree, with the original contributor/s as the author/s.
This of course might create conflicts (although it shouldn't if the patch was created with care, more on that later). These can be fixed like any other merge conflicts, using <code>git am --continue</code> to resume the process.</p>
<p>In this case, the maintainer applying these changes will not appear anywhere in the commits.
This might be fine, but for later reference it might be useful to use the <code>--signoff</code> flag.
This way, the maintainer will be referenced at the end of the commit/s message/s with something like this:</p>
<pre><code>Signed-off-by: Some One <[email protected]>
</code></pre>
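<p>The full round trip, sketched in a throwaway repository (the names and emails are made up): the patch is created by one identity, applied by another, and the original author survives in the log.</p>

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git -c user.name=Maintainer -c user.email=m@example.com commit -q --allow-empty -m "init"

# A contributor makes a commit...
echo "fix" > fix.txt && git add fix.txt
git -c user.name=Contributor -c user.email=c@example.com commit -q -m "important fix"

# ...and turns it into a formatted patch
git format-patch -1 HEAD --stdout > the_patch.patch

# The maintainer rewinds (simulating a repo without the commit) and applies it
git reset -q --hard HEAD~1
git -c user.name=Maintainer -c user.email=m@example.com am -q --signoff the_patch.patch

git log -1 --format='%an'   # the author is still "Contributor"
git log -1 --format='%B'    # the message ends with a Signed-off-by: Maintainer line
```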
<h2>Where is my fork button?</h2>
<p>There is none, you just clone the repository, work on your local copy and send the patch to the maintainer/s.</p>
<p>Here's the thing: forks are not really a git thing, they are a GitHub ~complication~ abstraction.</p>
<p>When you click that fork button, what essentially happens is that GitHub creates a copy of that repository under your user, with a reference to the original for ease of integration.
You would then clone your copy of the repo, work on that, push to your remote and then handle the merge request to upstream through GitHub's UI.</p>
<p>Here, you just clone the original (upstream) repo, work however you want, and send the patch to the maintainer.</p>
<h2>GitHub-less Workflow</h2>
<p>So here's what the full workflow might look like, from both perspectives:</p>
<h3>As contributor</h3>
<p>Clone the project and create a branch.</p>
<pre><code>git clone [email protected]:Upstream/Repo.git
git checkout -b cool_branch
</code></pre>
<p>Do and commit the work.</p>
<pre><code>git commit -a -m "no idea what i'm doing"
</code></pre>
<p>Pull any new changes and <a href="../git-remote/#rebase">rebase</a> your branch onto <code>master</code>.
This makes sure your changes don't cause conflicts and are up-to-date with the main branch. The maintainer will thank you if you do this and yell at you if you don't.</p>
<pre><code>git checkout master
git pull
git checkout cool_branch
git rebase master
</code></pre>
<p>Create the formatted patch:</p>
<pre><code>git format-patch master --stdout > the_patch.patch
</code></pre>
<p>Like we saw before, git will produce a patch of the diff between the head of the current branch and <code>master</code>.
This is the data you would see in a GitHub Pull Request.</p>
<p>Send the patch to the maintainer, and you're done!</p>
<h3>As maintainer</h3>
<p>Get the patch somehow (email, curl, etc.) and apply it to a newly created branch.</p>
<pre><code>git checkout -b dont_trust_that_guy
git am --signoff the_patch.patch
</code></pre>
<p>Hope the contributor did a rebase to avoid conflicts, yell at him if he didn't.
Review the work and merge it if correct and up to standards.</p>
<pre><code>git branch master
git merge dont_trust_that_guy
git push
</code></pre>
<p>Done! Now the contribution is in the main working tree, yay!</p>
<h2>WTF do I care?</h2>
<p>Why would anybody care to work like this when we have lovely, Microsoft-provided, green "Merge" buttons?</p>
<p>Well, for starters some projects started before GitHub was a thing. Some projects are so big and distributed in nature that the GitHub workflow isn't really fit for purpose (such as the Linux kernel).
Some might argue that having the biggest repository of free and/or open source software hosted in Microsoft's servers might not be the brightest idea...</p>
<p>In any case, as a contributor you might not really have a say in this. If you want/need to contribute changes to these kinds of projects you'll have to adapt to how they work.</p>
<p>Apart from that, I just think it's pretty cool to be able to send a quick diff or a patch to a co-worker or a maintainer/contributor without the usual rigmarole of creating a branch, pushing to remote, fighting with the pipeline, etc.
It's just a file you can send via Matrix or Slack.</p>
<p>Simplicity has a charm all of its own.</p>
Remote branches and how to handle themhttps://devintheshell.com/blog/git-remote/https://devintheshell.com/blog/git-remote/Understanding git's push, pull and mergeFri, 28 Mar 2025 00:00:00 GMT<p>Often enough, confusion with git arises not when working on local repositories, but when collaborating with others and handling remote ones.
In this post, we'll go over some basic concepts and commands to better understand what's going on when we push, pull and merge commits/branches.</p>
<h2>Remote origins</h2>
<p>So what are remotes? And how come there's more than one?</p>
<p>A <code>remote</code> is simply a place where we can pull from and push to.
When you first clone a project, you clone the remote repo to your local machine.</p>
<p>Since git can handle multiple remotes, these are named. By default, the URL from which you cloned a repo is set as the <code>origin</code> remote.
We can change that URL with something like <code>git remote set-url origin [email protected]:User/Repo.git</code>.</p>
<p>How is that useful?
Well, it's mostly useful when working on Open Source Software, since more often than not you'll have your own <code>remote</code> for the project (a fork) as <code>origin</code> and the actual <em>upstream</em> project as a separate <code>upstream</code> remote.
This is done to keep your 'copy' up to date with the 'original'.</p>
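<p>A sketch of that setup, using two local repositories to stand in for GitHub (the paths and names are made up for the demo):</p>

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Two local repos stand in for GitHub: the original project and your fork of it
git init -q original
(cd original && git -c user.name=Demo -c user.email=d@example.com commit -q --allow-empty -m "init")
git clone -q original fork

# Your working copy: the fork is origin, the original project is upstream
git clone -q fork work
cd work
git remote add upstream "$tmp/original"

git remote -v        # shows both origin (the fork) and upstream (the original)
git fetch -q upstream
```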
<p>More broadly, it's useful to understand these concepts firstly because you'll find references to <code>origin</code> and <em>remotes</em> when looking for information online (including this post), and secondly, because the following commands can be told which <code>remote</code> to operate on.</p>
<p>For simplicity, it will be omitted wherever possible, just know that there's nothing special about <code>origin</code> and that a <code>remote</code> is nothing more than a git server somewhere.</p>
<h2>Fetch</h2>
<p>To <em>fetch</em> the 'state' of the remote repo, we can run <code>git fetch</code>.
This allows other commands like <code>git status</code> or <code>git log</code> to show the full picture, since these only work with local information.</p>
<p>This hints at the fact that whatever git fetches has to be stored locally somehow.</p>
<p>Indeed, just like there is a <code>main</code> branch on a local repo, there's also an <code>origin/main</code> <strong>tracking branch</strong>.
A remote tracking branch's only job is to locally store the state of the corresponding remote branch. You cannot, for example, <code>git checkout</code> to one.</p>
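<p>You can see the tracking branches for yourself; a quick sketch with a local repo standing in for the remote:</p>

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q remote-repo
(cd remote-repo && git -c user.name=Demo -c user.email=d@example.com commit -q --allow-empty -m "init")

git clone -q remote-repo local
cd local
git branch -r   # lists remote tracking branches, e.g. origin/master
git fetch -q    # refreshes them without touching your local branches
```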
<p>Of course, there are other ways git uses the <code>fetch</code> command, or rather the underlying plumbing. More on that in a bit.</p>
<h2>Merge</h2>
<p>Pretty explicit: it merges two branches together.
<code>git merge feature</code> integrates the given branch into the one you currently have checked out.</p>
<p>This is the main selling point that made git so popular in the first place and although nowadays, there are plenty of <a href="../tbd-why/#whats-wrong-with-branches">reasons to limit the use of branches</a> (and thus, merges), it's still worth understanding what's happening and how.</p>
<p>Since this is where a bunch of git issues arise, and there are multiple ways git might handle a merge, we'll go over the different ways this might happen (merge strategies) and the implications.</p>
<h3>Fast-Forward Merge (ff)</h3>
<p>Git's default behavior when possible.
Whenever a branch A has newer commits than another branch B, and A needs to get merged into B, these newer commits will be 'copied' or fast-forwarded into B.</p>
<p>For example, given these branches and commits:</p>
<pre><code>main: A -- B
feature: ↘-- C -- D
</code></pre>
<p>With <code>main</code> checked out, running <code>git merge feature</code> would produce the following result:</p>
<pre><code>main: A -- B -- C -- D
feature: ↘-- C -- D
</code></pre>
<p>Since the new commits on <code>feature</code> <em>'come from'</em> the last commit on <code>main</code>, they simply get put on top of it.</p>
<p>This is usually the best approach whenever possible, because it doesn't create new commits, conflicts or other complications.</p>
<h3>Recursive Merge or Merge Commit</h3>
<p>This is the default alternative to <code>ff</code>, and it's also what the green 'Merge' button does on GitHub by default.</p>
<p>In this case, git creates a new commit that points <strong>both</strong> to the last commit of branch A <strong>and</strong> to the last commit of branch B.</p>
<p>So for this setup:</p>
<pre><code>main: A -- B -- C -- D
feature: ↘-- X -- Y
</code></pre>
<p>Doing a fast-forward merge is not possible, so merging <code>feature</code> into <code>main</code> would look like this:</p>
<pre><code>main: A -- B -- C -- D -- M (Merge commit pointing back to both D and Y)
feature: ↘-- X -- Y --↗
</code></pre>
<p>With this approach, the whole history is preserved and the merge point is marked with its own commit.
This is also called <em>three-way merge</em>, because on top of (in this case) commits <code>D</code> and <code>Y</code>, commit <code>B</code> is also involved in the merge as it is the common base for both branches.</p>
<p>Notice how the merge commit has <strong>two parent commits</strong>: it points to two different commits as its 'previous' ones.
This is why 'undoing' or reverting a merge commit is a bit more involved than a usual <code>git revert [HASH]</code>.</p>
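<p>To see why it's more involved: git refuses a plain <code>revert</code> of a merge commit until you pick which parent counts as the mainline, via <code>-m</code>. A sketch in a throwaway repository:</p>

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q repo && cd repo
g() { git -c user.name=Demo -c user.email=d@example.com "$@"; }   # identity helper

echo base > base.txt && git add base.txt && g commit -q -m "A"
base_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b feature
echo x > x.txt && git add x.txt && g commit -q -m "X"

git checkout -q "$base_branch"
echo b > b.txt && git add b.txt && g commit -q -m "B"
g merge -q --no-ff --no-edit feature     # creates the merge commit M

# A plain `git revert HEAD` would error out here ("is a merge but no -m
# option was given"). `-m 1` keeps the first parent (the branch we were on):
g revert -m 1 --no-edit HEAD             # undoes feature's changes (x.txt)
```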
<h3>Squash</h3>
<p>Another strategy we might use is creating a squash commit, or squashing the changes into a single commit.
This is similar to the previous approach, only in this case the commit history will not be preserved like before.</p>
<p>When we do a squash merge, the changes in all the commits of (in this example) <code>feature</code> will be 'compacted' into one new commit in <code>main</code>.</p>
<p>So for the same setup we had before:</p>
<pre><code>main: A -- B -- C -- D
feature: ↘-- X -- Y
</code></pre>
<p>Running <code>git merge --squash feature</code> from <code>main</code> (followed by a <code>git commit</code>, since a squash merge only stages the changes) would produce this result:</p>
<pre><code>main: A -- B -- C -- D -- S (Squash commit containing changes in X and Y)
feature: ↘-- X -- Y
</code></pre>
<p>Be careful when using this: if a big branch with a bunch of commits is squashed this way, and a bug is introduced in one them, it won't be easy to spot which one is the culprit.
Remember, there will only be one commit on the main branch after the merge.</p>
<h3>Rebase</h3>
<p>A rebase is not <em>really</em> a merge strategy, but it's vaguely related and will be relevant further down.</p>
<p>In this case we literally 'change the base' of a (set of) commit/s. That is, we assign them a different parent commit.</p>
<p>So given the same old setup:</p>
<pre><code>main: A -- B -- C -- D
feature: ↘-- X -- Y
</code></pre>
<p>Running <code>git rebase main feature</code> would <code>rebase</code> <code>feature</code> onto <code>main</code>:</p>
<pre><code>main: A -- B -- C -- D
feature: ↘-- X' -- Y'
</code></pre>
<p>So before the <code>rebase</code> we had an <code>X</code> commit with <code>B</code> as its parent, but afterwards we ended up with an <code>X'</code> commit with <code>D</code> as its parent.</p>
<p>Notice how the commits in the <code>feature</code> branch are actually different now.
In git land, commits are <strong>immutable</strong>. Once we change the parent we change the whole commit. New parent, new hash, new commit.</p>
<p>As mentioned before, this is not a merge: commits <code>X</code> and <code>Y</code> simply got substituted for new commits, they didn't get merged anywhere.
This does, however, enable you to do a <code>ff</code> merge onto <code>main</code>, which was not possible before the <code>rebase</code>.</p>
<p>Rebase should be used with caution when working with other people, as these new commits might conflict with other people's work.
Generally speaking, this is usually done to update your local branch to the latest commit in <code>main</code> or in a remote. So to keep a local feature branch up to date.</p>
<p>In fact, as a rule of thumb, never rebase branches where other people might be working and <strong>never, ever rebase <code>main</code></strong>.</p>
<h2>Pull</h2>
<p>When we pull changes from a remote, (simplifying things a bit) we are <em>fetching</em> recent commits and <em>merging</em> them into our local branch.
This is done using the remote tracking branch we mentioned before to store these new commits, and merging that into the branch we are in.</p>
<p>As such, the different approaches to merge (strategies) also apply here:</p>
<ul>
<li><code>--ff</code> fast-forwards local changes onto the new changes coming from the remote</li>
<li><code>--squash</code> squashes the new remote commits into a single commit on the local branch</li>
<li><code>--rebase</code> rebases local commits onto the remote ones.</li>
<li>If no flags are given it will follow the same defaults as <code>merge</code>, or whatever is configured in the <code>.gitconfig</code> file.</li>
</ul>
<p>On top of these, the <code>pull</code> command also accepts <code>--no-</code> variants of these flags (like <code>--no-rebase</code>), plus <code>--ff-only</code>.
These tell a pull to either <strong>never</strong> or <strong>only</strong> follow a given strategy when pulling changes.</p>
<h2>Push</h2>
<p>Of course, we can also push our local changes to the remote repo.</p>
<p>What happens when we push a branch is that the remote tracking branch is updated (<code>fetch</code>) and <strong>if</strong> your local commits can be fast-forwarded onto the tracking branch, these commits (not the whole branch) will be pushed to the remote.</p>
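<p>A sketch of a push, using a local bare repository to stand in for the remote server (names are made up for the demo):</p>

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q --bare hub.git    # a bare repo stands in for the remote server

git clone -q hub.git work     # git warns that the repo is empty, which is fine
cd work
echo "v1" > file.txt && git add file.txt
git -c user.name=Demo -c user.email=d@example.com commit -q -m "first"
git push -q origin HEAD       # fast-forwards the remote branch
```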
<p>It doesn't really make sense to have multiple strategies here, since anything other than <code>ff</code> is bound to mess with other contributors' work.</p>
<p>We can however, in dire circumstances and hopefully not in the <code>main</code> branch, make a forced push using the <code>--force</code> flag.
Be careful with this, since all remote changes will be overridden with whatever is on your local branch.</p>
<p>We also use the <code>push</code> command to delete remote branches once they are no longer useful: <code>git push origin --delete feature-branch</code></p>
Are IDEs worth it?https://devintheshell.com/blog/editor-vs-ide/https://devintheshell.com/blog/editor-vs-ide/An overview on how they compare to code editorsFri, 14 Mar 2025 00:00:00 GMT<p>Much of the eternal online debate about IDEs and code editors is out of date, out of context, or both.</p>
<p>Let's go over some of these misconceptions and some less talked-about points, and evaluate which alternative makes more or less sense and in what context.</p>
<h2>Speed</h2>
<p>IDEs are slow, or so the trope goes.</p>
<p>While this is most noticeable at startup, this slowness is not necessarily limited to startup times.
It makes sense: IDEs are constantly trying to help and make suggestions. This has a compute cost.</p>
<p>Of course, given good enough hardware or simple enough projects, this might not really be an issue.
Still, that doesn't change the fact that, <em>ceteris paribus</em>, a code editor is faster.</p>
<p>Some might say that it doesn't matter, that it's not noticeable or that it's more than worth it.
Those are some <em>"that's like your opinion dude"</em> kinda points more than anything else.</p>
<p>Whether you notice it or not is a matter of how sensitive <strong>you are</strong> to things like input lag or what you consider fast enough.
Whether it's worth it or not depends on a bunch of different things, like the stack <strong>you are working with</strong>, the particular IDE you are using or how close you can get by adding plugins to a code editor.</p>
<p>In other words: it's context dependent and up to you if this is an issue or not.</p>
<h2>LSP</h2>
<p>Traditionally, the only way to get decent code completion and navigation was using an IDE.
In that sense, the developer experience used to be much, much better for IDEs than for code editors.</p>
<p>Since <a href="https://microsoft.github.io/language-server-protocol/">LSP</a> came out however, this is no longer the case.
Long story short, the Language Server Protocol standardizes how an editor or IDE interacts with a language, which enables any LSP-capable editor to perform much like an IDE.</p>
<p>Newer languages have official LSPs available on day one and even older ones often have unofficial alternatives made by the community.
To be clear, not all languages benefit from this. Obvious examples are Java and C#: good luck using them without Visual Studio or IntelliJ.</p>
<p>This used to be a big difference between using an IDE or not, but nowadays it's more a matter of what <em>else</em> an IDE can provide.</p>
<h2>Distractions</h2>
<p>Some of us <strong>need</strong> a distraction-free environment.</p>
<p>First thing I do after installing an app or subscribing to a service is disabling all non-critical notifications.
I don't want to deal with useless popups, sounds or flashing icons.</p>
<p>Clearly, IDEs are not made for people that have trouble focusing.</p>
<p>An IDE is always trying to help the user, which means bringing stuff to your attention. Constantly.
This is great if you need the help and can handle the distraction, but absolutely unbearable if you don't.</p>
<p>You can of course tone all of this down, just keep in mind that you are swimming against the tide: the IDE doesn't really want to shut up.</p>
<h2>Learning</h2>
<p>As the name suggests, IDEs integrate <em>everything you need</em><sup><sup>TM</sup></sup> under a single interface.</p>
<p>It's super convenient, but has the unexpected side effect of allowing you to treat all of those "<em>things you need</em>" as a black box.</p>
<p>You abstract away complexity, but lose the understanding of the underlying systems.</p>
<p>I've personally met and worked with people that only know how to interact with a database through their IDE's UI. Not <strong>an</strong> IDE, <strong>their</strong> IDE.
The same goes for docker, running tests and, worst of all, <strong>running the code itself</strong>.</p>
<p>Let me repeat, some professional developers can only run the code they write through the green "Play" button of their IDE. Talk about vendor lock-in.</p>
<p>Now, this is a human issue, not an IDE issue. People that behave like that tend to have the same exact problem with all the software they use.
Still, humans gravitate towards comfort and more often than not the (apparently) easier path will be chosen. Even if that means settling for ignorance.</p>
<p>People that have only ever driven automatic are unlikely to learn how to drive a manual out of curiosity, even though they might benefit from the knowledge at some point.</p>
<p>This might be a worthwhile trade-off, or it might not. But it is <strong>a trade-off</strong>, and we should be aware of it.</p>
<h2>Death by feature</h2>
<p>Code editors are usually pretty bare-bones out of the box: the user is supposed to <strong>add features as needed</strong>.
This usually results in a (sometimes unreasonably long) list of plugins.
The idea here is to end up with the features needed by the user and nothing else. Keep things relatively simple.</p>
<p>Conversely, IDEs come with <strong>all the features</strong>: the ones you need and the ones you don't.
This of course is far removed from the good old <em>"do one thing and do it well"</em> UNIX principle, and may introduce unexpected complexity or interactions under the hood.
The point is for the user to find everything he might need right from the get-go.</p>
<p>If you use a code editor expecting to not have to configure anything up front, you are going to have a bad time.
If you use an IDE expecting it not to pollute your dev environment with opinionated solutions, you are going to have a bad time.</p>
<h2>IDEs actual use case</h2>
<p>This all seems pretty negative towards IDEs, but there are a couple of very key areas where they shine so much more than code editors it's not even fair.</p>
<h3>Refactoring</h3>
<p>This is not necessarily true for every IDE out there, but once you learn to refactor code using any of JetBrains' IDEs, there is absolutely no turning back.</p>
<p>The amount of manual labor it eliminates, of refactors you wouldn't even dare to do by hand, the time it saves: there's simply no comparison with any code editor.
Yes, there are plugins that somehow try to mimic these features. But it's not even the same league.</p>
<p>There's just no way around it: If you do frequent heavy refactoring, right now there is nothing that even comes close.</p>
<h3>Intelligent analysis and suggestions</h3>
<p>The counterpart of all that annoying slowness: most of the time, it feels like the IDE is reading your mind.
It just knows much more about the project than what is currently possible through the LSP.</p>
<p>Yes you can get good suggestions in any LSP-capable code editor, but a decent IDE will suggest stuff like variable names so cleverly you'd think your computer is actually doing your job.
The suggestions regarding code structure, repetition, standards and general best practices are just not there for code editors. Yes, you can use linters, but it's just another level of awareness.</p>
<p>Mind you, this was the case even before the AI craze.</p>
<h3>IDEs as a crutch</h3>
<p>The less you know about a stack or project, the more useful an IDE is.
Which is not to say that senior devs shouldn't use them or that they are "<em>for noobs</em>", far from it.</p>
<p>This is precisely why we love them so much in consulting: we often get thrown head first into new projects with god knows what weird or ancient stack, using languages we are not super familiar with, while expecting us to refactor, stabilize or improve the codebase.
Using an IDE allows me to focus on <em>the system</em> first, and leave the stack/language details for later. It fills any gaps in my knowledge and allows me to start being helpful faster.</p>
<p>This might hurt some people's ego or pride. You should leave those out of your workplace.</p>
<h2>Use both</h2>
<p>For whatever reason, this issue is often presented as a false dichotomy (as with most other online debates): either use a fully featured IDE for everything or a dementedly minimalist setup with barely any software apart from a terminal emulator.</p>
<p>This is a bit silly: there's no reason why you shouldn't use both. Choose the best tool for the job; don't fall in love with software.</p>
<p>You don't need an IDE to edit a <code>bash</code> script or a <code>yaml</code> file.
You don't want to work on a legacy <code>java</code> codebase with a text editor.</p>
Selfish reasons to engage in Open Sourcehttps://devintheshell.com/blog/oss-makes-you-better/https://devintheshell.com/blog/oss-makes-you-better/How OSS can help you be a better devFri, 07 Feb 2025 00:00:00 GMT<p>There are many reasons to use, make or contribute to OSS.
While the ethical and societal benefits might be enough for some, I'd like to argue there are also purely selfish reasons for a developer to get involved.</p>
<p>Indeed, just like running Linux on your workstation deepens your knowledge of operating systems and broadens your tool-set basically for free, so too can OSS make you a better developer pretty much by osmosis.</p>
<p>Here are a bunch of ways this osmosis happens.</p>
<h2>Other people's code</h2>
<p>Contrary to what I thought when starting out, the bulk of our work consists of <strong>navigating, understanding and modifying</strong> other people's code.
And yes, <em>"other people's code"</em> includes the code you wrote a year ago.</p>
<p>This skill can hardly be trained or practiced as it should be: you can type faster, practice problem-solving or read a bunch of books, but nothing trains you to deal with systems you didn't write.
In OSS, however, you'll be doing this all the time.</p>
<p>People from very different walks of life contribute to OSS; I've seen ways of writing and structuring code that I could have never imagined (both good and bad).
Participating in this space will allow you to improve this skill set.</p>
<p>You might think something like <em>"no, thanks, I get plenty of that at work"</em>, but is that really the case?
Mostly, I see people getting accustomed to their teammates' code, creating a sort of cesspool of the lowest common denominator.
At work, we learn to navigate <strong>a specific codebase</strong>, written by specific people, instead of learning the skill of code navigation more broadly.
This is a shame.</p>
<h2>Documentation</h2>
<p>Maybe I've just been very lucky (or unlucky, depending on how you look at it), but I have never, not even once, seen proprietary documentation half as good as what I've found in OSS projects. Both dev-facing and user-facing.</p>
<p>This not only means that OSS documentation is a joy to work with by comparison, but also that I have a chance to <strong>learn</strong> what makes documentation good and how to write it myself.</p>
<p>There is no need to ask around among co-workers to understand how to build the project, what ritual to follow when modifying the system (<a href="../tbd-why/">if any</a>) or the general structure of the code base.
Any reasonably complex OSS project that actually expects and welcomes contributions will at least have a <code>CONTRIBUTING.md</code> file to document all these issues and many more.
We should normalize this for proprietary projects as well.</p>
<p>The same goes for user-facing documentation: where companies have on-boarding teams and an endless supply of outdated docs that often force users to open support tickets, OSS projects have a compact, to-the-point <code>README.md</code> for basic instructions and very little incentive to offer free support (which means the docs are usually kept up to date and users are expected to actually read them).</p>
<p>Sure, on-boarding teams often serve a more service-oriented purpose. Something like <em>"tell me what to do, I don't want to waste my time reading your docs"</em>.
Still, concise, to-the-point documentation is often just as valuable as having someone to ask directly for help, if not more so.</p>
<h2>Troubleshooting</h2>
<p>There are plenty of ways to help OSS projects; well-written bug reports are one of them, and they don't get the respect they deserve.
<a href="https://github.com/jdx/mise/discussions/3965">A good bug report</a> can literally be <a href="https://github.com/jdx/mise/pull/4010/files#diff-2f4ca92b70774de1683c7265e99c20689bd7bccdf8bde6448276f8e1f85e4a34">used as a test case</a>.
Writing a decent, well-structured bug report is non-trivial and can teach <strong>a lot</strong> about the troubleshooting process in general.</p>
<p>It is common for projects to have some sort of template or required information for the bug report: things like versions of relevant software, configurations, expected behavior and a minimal reproducible setup.</p>
<p>Not only can one learn a lot by paying attention to what information they require, but the act of gathering it and writing the report usually already starts the troubleshooting process for the maintainer.</p>
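<p>As an illustration (section names are made up and every project's template differs), such a template often boils down to something like:</p>
<pre><code>## Expected behavior
## Actual behavior
## Steps to reproduce
## Environment (OS, software versions, relevant configuration)
</code></pre>
<p>Filling in "steps to reproduce" honestly is usually where the reporter discovers half the answer themselves.</p>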
<p>I'm not sure how to describe it, but this process trains a sort of mind-set that can save <strong>a lot of time</strong> when things go wrong and help you get to the crux of the issue instead of fumbling around blindly.</p>
<h2>Soft skills</h2>
<p>Soft skills are often average at best among developers, and understandably so: they are only indirectly related to writing code and are quite difficult to train.
They are however really important and often make the difference between a good dev and a great one.
OSS can be used as a playground to improve them, including but not limited to:</p>
<h3>Communication</h3>
<p>When participating in OSS projects, you will receive <strong>criticism</strong>. This is needed to keep the projects afloat and while it often comes with care and good intentions, this is not always the case.
This is a great chance to learn how to handle and respond to it, with the added benefit of not putting your job at risk.</p>
<p>Giving and receiving constructive <strong>feedback</strong> during code reviews hones your ability to communicate professionally and respectfully.</p>
<p>You will also have to communicate with <strong>clarity</strong>, since there is no <em>"let's quickly hop on a call"</em> when talking to strangers on the internet. At least not usually.
Writing issues, participating in discussions, writing or improving documentation: these all force you to articulate complex ideas <strong>clearly and succinctly</strong>, including all necessary information and absolutely no unneeded padding.</p>
<h3>Interpersonal</h3>
<p>More often than not, you'll have to <strong>adapt</strong> to the preferences of either the maintainer of a project or the wider community.
This might not seem like a great thing, but learning to <em>disagree and commit</em> is rather important especially in a business setting, as the alternative often consists of endless, bitter discussions with teammates and co-workers.</p>
<p>It takes some <strong>empathy</strong> to understand that people you'll never meet might have the same general goals and good intentions you have, while disagreeing with this or that particular point.
It takes some more empathy to not get mad at a user telling you that the default behavior of your software makes no sense, or that a given feature is actually a bug.
Sure it's frustrating, but how often do we have the luxury of talking directly with the end-users of our software?</p>
<p>Another source of interpersonal friction is a lack of <strong>cultural awareness</strong>. It's easy to forget that you might be interpreting as rude something that is totally respectful in a different culture.
It's hard to think of a better environment than OSS to get used to this, short of a language exchange bar.</p>
<p>Sure, communication is mainly in English (which already cuts out non-English speakers), but you'd be surprised how many participants are not using their first language.
Open any issue tracker or online discussion, you might be reading a German dev responding to a Chinese user.
This matters a lot and changes one's approach to a conversation entirely.
Rarely have I been in a work environment with comparable diversity.</p>
Better ssh client setuphttps://devintheshell.com/blog/ssh-config/https://devintheshell.com/blog/ssh-config/And some other ssh tipsFri, 31 Jan 2025 00:00:00 GMT<p>You probably already know how to create and register an ssh key pair.</p>
<pre><code>ssh-keygen -t ed25519 -C "[email protected]"
# Press enter a bunch of times
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
</code></pre>
<p>Simple enough.
You can then copy the pub key into <em>[insert relevant UI here]</em> or just send it to your server with <code>ssh-copy-id -i ~/.ssh/id_ed25519.pub user@server</code>.</p>
<p>However, when using more than one service or handling more than one server, it might be best to use different keys.</p>
<h2>Why use multiple ssh keys</h2>
<p>On one hand, it limits the impact of a compromised key.
If a machine gets compromised, but the ssh keys it used only granted access to a limited set of services, one can rest assured that the other services are probably fine.
This is especially relevant when separating work from personal keys.</p>
<p>It's also useful for revocation: you can cut access to one key/machine without affecting others,
or rotate the key for an untrustworthy machine on a short cycle.</p>
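<p>Creating those extra keys is just a matter of passing <code>-f</code> to <code>ssh-keygen</code>. A sketch (the file names and comments here are made up; <code>-N ""</code> skips the passphrase for brevity, consider setting a real one):</p>
<pre><code># One key per service: -f sets the file name, -C adds a comment to tell keys apart
mkdir -p ~/.ssh
ssh-keygen -q -t ed25519 -N "" -f ~/.ssh/github -C "github"
ssh-keygen -q -t ed25519 -N "" -f ~/.ssh/work_vps -C "work vps"
</code></pre>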
<p>Fine, you create multiple ssh keys, and now find yourself having to run your ssh commands like <code>ssh -i ~/.ssh/key user@server</code> or even worse, having to <code>ssh-add ~/.ssh/key</code> before managing your remote git repo.</p>
<p>This kinda sucks, but there is a better solution.</p>
<h2>Config file</h2>
<p>You might not be aware that ssh will look for a client config under <code>~/.ssh/config</code>.
The file follows this basic structure:</p>
<pre><code>Host [address or alias]
IdentityFile [path to the ssh key]
</code></pre>
<p>You can have as many of these as you want.
The file is read top to bottom and, for each option, the first value found wins, so the least specific sections should be at the bottom.</p>
<p>This config for example, would use the first section for that IP, while using the wildcard at the end for all <strong>other</strong> ssh connections:</p>
<pre><code>Host 192.168.1.69
IdentityFile ~/.ssh/nice_key
Host *
IdentityFile ~/.ssh/other_key
</code></pre>
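<p>When in doubt about which section applies, reasonably recent versions of OpenSSH can print the configuration that would be used for a given destination, without actually connecting:</p>
<pre><code># -G prints the resolved client config for a destination; no connection is made
ssh -G 192.168.1.69 | grep -i identityfile
</code></pre>
<p>Handy for checking that a wildcard section isn't shadowing a more specific one.</p>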
<h2>Handling multiple services and keys</h2>
<p>In practice, you might end up with a config that looks like this:</p>
<pre><code>Host github.com
IdentityFile ~/.ssh/github
Host gitlab.com
IdentityFile ~/.ssh/gitlab
Host vps
HostName 209.85.231.104
User vps_usr
</code></pre>
<p>Note here that, since a <code>HostName</code> is provided, the <code>Host</code> in the last section acts as the alias one would use to refer to that service.
So for that section, you would run <code>ssh vps</code> instead of <code>ssh vps_usr@209.85.231.104</code>, much less verbose.</p>
<p>Since the first two sections will be used by <code>git</code>, adding a <code>HostName</code> and a <code>User</code> doesn't make much sense.
For starters, <code>git</code> will always default to the <code>git</code> user, no need to explicitly set that.
Plus, when working on GitHub/GitLab hosted repos, one would clone something like <code>git@github.com:ORG/REPO.git</code>, so the <code>Host</code> would always be the same as the <code>HostName</code>, making it redundant.</p>
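<p>The alias trick still earns its keep on these services when juggling multiple accounts. Since <code>Host</code> is just a label, nothing stops you from pointing two aliases at the same <code>HostName</code> with different keys (alias and key names here are made up):</p>
<pre><code>Host github-work
HostName github.com
IdentityFile ~/.ssh/work_key
Host github-personal
HostName github.com
IdentityFile ~/.ssh/personal_key
</code></pre>
<p>You would then clone with <code>git clone git@github-work:ORG/REPO.git</code> and git picks up the right key.</p>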
<p>Of course, other options can be set in the config file, like <code>Port</code> or <code>ConnectTimeout</code>.
But there are more clever things that can be done.</p>
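<p>A quick sketch of those simpler options (host name and values are made up): a box listening on a non-standard port, with a short connection timeout so failures surface fast:</p>
<pre><code>Host flaky_box
HostName example.net
Port 2222
ConnectTimeout 5
</code></pre>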
<h2>Advanced options</h2>
<p>There are, of course, many more options than these.
These are just used to show the usefulness of an ssh config file.</p>
<h3>Proxy Jump</h3>
<p>Depending on the security requirements of an organization, a <code>ProxyJump</code> might be needed when connecting to a server.</p>
<p>This simply means that the outgoing connection must go through a dedicated server before being redirected to the final one, which might, for example, not be exposed to the internet.
To do this, one would <code>ssh -J jump_host target_host</code>.</p>
<p>As you might imagine, each host might have different settings, so that command is bound to get messy.
You could create a shell alias, or you could use the ssh config:</p>
<pre><code>Host target_host
HostName target_host.com
User target_user
IdentityFile ~/.ssh/target_key
ProxyJump jump_host
Host jump_host
HostName jump_host.com
User jump_user
IdentityFile ~/.ssh/jump_key
</code></pre>
<p>This allows the ssh command to look like <code>ssh target_host</code>, no need to worry about who jumps where and with which credentials.</p>
<h3>Port Forwarding</h3>
<p>Another security-related config is <em>ssh tunneling</em> or <em>port forwarding</em>.</p>
<p>This is done with something like:</p>
<pre><code>ssh -L localhost:3000:localhost:3306 example.com
</code></pre>
<p>Which means '<em>please take all traffic going to <code>localhost:3000</code> and send it to <code>example.com</code> on port <code>3306</code></em>'
The second <code>localhost</code> is from the perspective of the remote server.</p>
<p>This might seem a bit silly. Why not just point directly to <code>example.com:3306</code>?
The interesting bit here is that the traffic is being sent through the ssh connection (so port <code>22</code> by default).
The server would receive the traffic on <code>:22</code> and re-route it to <code>:3306</code>.</p>
<p>This might be interesting not only to reduce the number of exposed ports in a server, but also to ensure cryptographic security. There's no need to use SSL/TLS here, OpenSSH is plenty secure and comes for free with no work needed on either side of the communication.</p>
<p>Of course, one could forward more than one port, point at hosts other than the remote "<code>localhost</code>", use a designated user, etc.:</p>
<pre><code>ssh -L localhost:8000:[IP_ADDRESS]:8000 -L localhost:8001:localhost:8001 username@example.com
</code></pre>
<p>This is not exactly easy to type, but a config like this might help:</p>
<pre><code>Host what_a_mess
HostName example.com
User username
LocalForward localhost:8000 [IP_ADDRESS]:8000
LocalForward localhost:8001 localhost:8001
</code></pre>
<p>One would only need to run <code>ssh what_a_mess</code>.</p>
<p>As you might guess, <code>LocalForward</code> implies there is also a <code>RemoteForward</code>, which is indeed used to send traffic the other way around (from the server to the client).</p>
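<p>As a sketch (names made up), a <code>RemoteForward</code> entry exposing a dev server running on your local port <code>3000</code> as port <code>8080</code> on the remote machine could look like:</p>
<pre><code>Host expose_demo
HostName example.com
RemoteForward localhost:8080 localhost:3000
</code></pre>
<p>Mirroring <code>LocalForward</code>, the first address is bound on the server and the second is resolved from the client's side.</p>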
<p>Just be careful when committing this config to your dotfiles repo: make sure no sensitive information is public!</p>
A different approach to AI code assistantshttps://devintheshell.com/blog/ai-tdd/https://devintheshell.com/blog/ai-tdd/Trust the tests, not the AISun, 04 Jun 2023 10:57:51 GMT<p>AI code-generation might usher a future where TDD becomes a <strong>strict requirement</strong> for software development. In that future, code quality will only be relevant for tests.</p>
<p>Let's do a thought experiment.</p>
<h2>Hallucinations</h2>
<p>AI generated code kinda sucks.</p>
<p>It's often messy, buggy, over-complicated or just simply incorrect. You need to be super careful with it, double check every line it produces. Troubleshooting bugs is a huge pain with generated code.</p>
<p>This is a shame, because the efficiency is undeniable: I doubt the average dev can produce code at anything close to the speed of your Copilot/GPT thing.</p>
<p>Of course, speed is not everything, but we all want to be more productive.</p>
<p>How can we take advantage of the AI's ability to generate code quickly while ensuring the code actually works, and does so as expected/required?</p>
<p>If only there was an approach to software development that could guarantee the behavior of a piece of code...</p>
<h2>Testing</h2>
<p>What if we add AI to the TDD cycle?</p>
<p>I write a little test, the AI produces code to make it pass.</p>
<p>I write another little test, the AI makes both pass, either adapting the current code or generating new code.</p>
<p>This way, there's at least one thing we can always be certain of: the generated code makes the tests pass.</p>
<p>Given enough <strong>quality</strong> tests, this would allow us to ensure the system's behavior.</p>
<p>This is not far from conventional TDD: you just don't write (or read) production code. You treat it as a black box and only focus on the tests.</p>
<h2>Evolution</h2>
<p>Take two equally capable developers: Peter and John.
Similar experience, similar skills.</p>
<p>Peter uses the previously described TDD+AI approach while John doesn't.</p>
<p>While <strong>only</strong> writing tests, Peter would produce working software that is <strong>guaranteed</strong> to behave as expected (insofar as the tests correctly describe its behavior). This can obviously be done in less time than the alternative, since he is only writing tests.</p>
<p>John faces a difficult choice: either use AI and embrace its quirks or avoid it altogether. While the former might seem faster at first, he could soon find himself spending more time fiddling around with generated code than actually writing it.</p>
<p>Avoiding AI altogether will be significantly slower than Peter's approach, with the possible exception of just not writing tests at all, which wouldn't be a fair comparison and has plenty of other downsides.</p>
<p>Which one seems more productive/employable? Remember, we are considering two equally capable devs.</p>
<h2>Neglect the black box</h2>
<p>As you might imagine, the proposed approach would imply a near-total neglect of production code.</p>
<p>This is a big jump from our current notions of clean code or maintainability.</p>
<p>If you can produce code in a matter of seconds, does it really matter if it's easy to understand and modify? Wouldn't you just tell the AI to <strong>rewrite</strong> the thing if your requirements change?</p>
<p>Remember, you'll have your test suite to ensure no current behavior is lost. Just add more tests for the new behavior and the bugs you need to fix.</p>
<p>We wouldn't be spending much time (if any) with production code at all: that will be the AI's territory.</p>
<p>We would mostly work with the tests, using them to ensure the AI behaves correctly and doesn't make stuff up.</p>
<p>This doesn't make clean code or maintainability concepts obsolete. Rather, they will find their place within the tests. Those notions were meant for us humans anyway, not for the machine. If we focus our efforts on the tests, they'll come along for the ride.</p>
<p>Grim future? Depends on how you feel about TDD I guess <code>¯\_(ツ)_/¯</code>.</p>
Diagnosing Digital Patientshttps://devintheshell.com/blog/diagnosis/https://devintheshell.com/blog/diagnosis/Bugs as illnessesFri, 21 Apr 2023 08:03:27 GMT<p>Unless building a greenfield project, devs spend <strong>a lot</strong> of time troubleshooting buggy systems.</p>
<p>I find it odd that we seem to have no method to this madness, no procedures, no nothing. Just smash your head into the keyboard until something clicks.</p>
<p>Medical professionals have to <em>'troubleshoot people'</em> all the time, maybe they know what they're doing.</p>
<h2>Gather data</h2>
<p>There's not always a sensible bug report to start with.</p>
<h3>Pinpoint the issue</h3>
<p>People often don't realize the full extent of their symptoms. They just know it <em>kinda hurts around here sometimes</em>.</p>
<p>Good, sensible questions need to be asked to get a full picture of the unexpected behavior.</p>
<blockquote>
<p>What exactly is not working? Does it fail all the time? How does it fail exactly? What's the expected behavior?</p>
</blockquote>
<p>Get to know the system, understand the failure.</p>
<h3>Clinical History</h3>
<p>Look at the <strong>context</strong> surrounding the error, problems don't come out of nowhere.</p>
<blockquote>
<p>When does it happen? What makes it fail? What happened before it started? Can you find a pattern?</p>
</blockquote>
<p>If a system hasn't changed recently and a bug was 'introduced yesterday', either the user is the bug, or it's been there for a while.</p>
<h3>Physical Exam</h3>
<p>Well, digital really but you get the point.</p>
<p>Once a general understanding of the behavior and context is reached, try to go deeper.</p>
<h4>'It hurts when I…'</h4>
<p>Get your user to reproduce the bug for you.</p>
<p>Yes, this is not always possible. But patient and therapist should be on the same page. Maybe it's not a bug but a missing feature.</p>
<blockquote>
<p>What input(s) causes the unintended behavior? How do we get it to happen consistently?</p>
</blockquote>
<p>This aims at a low-level, I/O approach to reproducing the issue. Reason about the bug as if you were going to write a test around it (which you might actually want to do).</p>
<p>If you can reproduce it consistently, you'll fix it eventually.</p>
<h4>Does this hurt?</h4>
<p>The <em>'prod it with a stick'</em> part of the process.</p>
<blockquote>
<p>Does <strong>Y</strong> seem to make it any better? Does it also break if you <strong>X</strong>? What makes it worse?</p>
</blockquote>
<p>This might come off as a bit sadistic (and sometimes it is), but it's <strong>analysis by I/O</strong>: Give the system a bunch of different inputs and see how it affects the output.</p>
<blockquote>
<p>What if you press this button/use that plugin instead?</p>
</blockquote>
<p>Fear no consequence, break the thing: Software (unlike people) <strong>can</strong> be rolled back.</p>
<h3>Tests and Data Analysis</h3>
<p>There's no MRI for software, but we do have logs, metrics, user data, observability, etc.</p>
<p>If they are not present in the system, yesterday is a good time to add them. There is never too much information, you can always filter out irrelevant data.</p>
<p>While test results are a very important part of any objective analysis, they should not be the <strong>only</strong> base for a diagnosis. Use this data to complement the information gathered in the previous steps.</p>
<h2>Make a bet</h2>
<p>Gathering data is alright, but how does one actually reach a diagnosis? In any reasonably complex system, it's hard or impossible to actually know what is happening e2e. There are often unknowns, black boxes we don't fully understand.</p>
<p>Even so, we can do better than guessing.</p>
<h3>Pattern recognition</h3>
<p>If you feel tired, your nose is running, and you have a fever, it doesn't take a rocket scientist to bet on you having the flu.</p>
<p>If you updated your Nvidia drivers yesterday, and today you got a black screen on boot, your OS is probably fine, the drivers are likely broken or incompatible.</p>
<p>This doesn't mean there <strong>cannot</strong> be any other issue, it's just <strong>so likely</strong> to be the cause that focusing on any other possibility as a first guess makes no sense.</p>
<p>Of course, this requires some experience: you can probably only recognize these patterns if they are not new to you.</p>
<h3>Differential diagnosis</h3>
<p>This involves finding <strong>all</strong> possible causes and <strong>eliminating</strong> them one by one, leaving only the (most likely) root cause.</p>
<p>A PC may not boot for a bunch of different reasons, but if you can hear the fans spinning and see some lights turn on, you can eliminate the power supply as one of them.</p>
<p>This can be, especially with software, a long and tedious process. But it is accessible with or without previous experience, and it allows you to be methodical in the process.</p>
<h2>Treat the damn issue</h2>
<p>Medical and IT professionals both face a critical choice: Either find the root cause and <strong>treat it</strong>, or simply treat <strong>the symptoms</strong>.</p>
<p>In medical fields, the latter is exclusively reserved for three scenarios:</p>
<ol>
<li>There is <strong>no treatment</strong>, so best we can do is alleviate the symptoms.</li>
<li>The treatment is <strong>unavailable</strong>/unaffordable.</li>
<li>The system is <strong>overstretched</strong>, and we lack the time/resources to diagnose and/or treat properly.</li>
</ol>
<p>Unfortunately, in the IT space, treating the symptom is a <strong>near-ubiquitous</strong> practice and the third scenario seems to be the norm.</p>
<p>We should keep in mind that software serves the needs of <strong>people</strong>, and even if users are not visible, they are still affected by inadequate diagnosis and treatment.</p>
<p>Would we behave the same way if the user was sitting by our side? What if the software was used by medical professionals? Are we considering the impact that software has on the lives of people and the choices they make?</p>
<p>We should make sure there is a valid reason <strong>not</strong> to diagnose and treat the root cause.</p>
<h2>Follow-up</h2>
<p>Once a diagnosis is reached, and a treatment is prescribed, follow-up appointments are scheduled.</p>
<p>This is done to confirm that the diagnosis was <strong>correct</strong>, the treatment is <strong>effective</strong> and that there are no unwanted surprises or further actions needed.</p>
<p>Try to reproduce the bug, press the same buttons as before, stress the system.</p>
<p>The issue is not fixed until proven so.</p>
Myths and clichéshttps://devintheshell.com/blog/myths/https://devintheshell.com/blog/myths/Separating the wheat from the chaffSun, 02 Apr 2023 15:22:31 GMT<p>Conventional wisdom can go a long way and is often a useful guide. However, it is usually best served with a healthy dose of skepticism and scrutiny.</p>
<p>Here, we explore the good, the bad and the ugly behind some common clichés I see floating around.</p>
<h2>TDD</h2>
<p>Probably the most misunderstood of the bunch.</p>
<h3>Requires clairvoyance</h3>
<p>A simple (but unfortunately common) reading of TDD leads some to believe that <strong>all</strong> tests need to be written before <strong>any</strong> of the relevant production code is.</p>
<p>This would require developers to plan ahead on some sort of whiteboard, which is usually a bit silly, since we learn about a problem domain as we build software around it.</p>
<h4>Does it though?</h4>
<p>Let's take a module or a Class for example: what part of TDD dictates that <strong>all</strong> tests for that module should be written as step 0?</p>
<p>Is it not TDD if I write a test for one of the functions, write that function and keep going bit by bit?</p>
<p>TDD is not a way of planning out your code with boxes on a whiteboard, quite the contrary: it incentivizes you to <em>design as you go</em>, to think about the public facing design of your code even before you think about the code itself.</p>
<p>Its usefulness comes in part from the fact that it helps detect and resolve unknowns <strong>before</strong> writing the piece of code where they could cause issues.</p>
<h2>Testing</h2>
<h3>Makes no sense in an MVP</h3>
<p>An MVP needs to be built fast and only has to prototype the behavior of the system in a narrow scope.</p>
<p>Edge cases are often (purposefully) overlooked, so testing would slow down the process without adding much.</p>
<p>Who cares? It's just an MVP anyway, it will get re-written if it works.</p>
<h4>MVPs are forever</h4>
<p>MVPs often turn into the (nearly incomprehensible) core of Legacy projects that have to be maintained 20 years down the line. There is always <em>'not enough time'</em> to re-write them.</p>
<p>What would you rather do: write tests as part of the process or convince the Business team to stop new developments for a while to test what has already been "proven" to work in production?</p>
<p>You don't need 100% test coverage here (or ever). Just make sure your code is testable to begin with. Further testing can come later down the line.</p>
<h2>CI/CD</h2>
<h3>Means having a pipeline</h3>
<p>It's not uncommon to come across teams that think having a pipeline as part of their workflow qualifies as Continuous Integration.</p>
<p>Somehow, the principle and the tool got mushed together.</p>
<h4>You could make do without a pipeline</h4>
<p>Assuming you wanted to make this as inefficient and unreliable as possible, you could argue that Continuous Integration can be achieved without a pipeline.</p>
<p>After all, it's about integrating your code with everybody else's as frequently as reasonably possible: just <a href="../tbd-why/">push to master</a>, see what happens.</p>
<p>You could also pay a poor soul to continuously hit the big red 'Deploy' button all day long as well, no need for a pipeline.</p>
<p>The point of having a pipeline is to act as the gatekeeper for code quality, reliability, performance, stability, etc. when sending it to production.</p>
<p>This is the only sane way I can think of doing CI/CD, but pipelines are a tool, not a set of practices.</p>
<p>You can have the most over-engineered pipeline in the world, but if you deploy and merge branches once a week, well... I have bad news for you.</p>
<h3>Automated Deployment is Continuous Deployment</h3>
<p>Some think of CD as the absence of manual deployments.</p>
<p>Like with pipelines, automated deployments are the only sane way to do CD.</p>
<p>But if you really want to suffer, nobody is stopping you from manually producing and uploading all required artifacts by hand.</p>
<p>Don't miss the forest for the trees.</p>
<h2>Clean Code</h2>
<h3>Horrible performance</h3>
<blockquote>
<p>The "clean" code rules were developed because someone thought they would produce more maintainable codebases.</p>
<p>Even if that were true, you'd have to ask, "At what cost?"</p>
<p><cite>Casey Muratori<a href="https://www.computerenhance.com/p/clean-code-horrible-performance">^1</a></cite></p>
</blockquote>
<p>Let's not dwell on the fact that the generation of developers that basically founded this industry are not <em>someone</em>, and rather focus on the claim regarding the cost.</p>
<p>Nowhere in the book does the author advocate for clean code <em>at all costs</em>.</p>
<p>Quite the contrary, it is mentioned several times that code efficiency must be taken into account:</p>
<blockquote>
<p>I will avoid [the more efficient solution] if the [efficiency] cost is small.</p>
</blockquote>
<p>The claim about performance is either missing the point entirely, based on a woefully misguided reading of the book or simply clickbait.</p>
<p>No one is advocating for a complete disregard for performance.</p>
<p>Rather, it is suggested that you should write your code with other people in mind.</p>
<p>Eventually, someone will have to maintain your code: make sure there is a <strong>good reason</strong> to make it hard to work with.</p>
<p>The reality is that in a lot (if not most) contexts, performance is far down the list of things to worry about.</p>
<p>In most cases, the real world, practical performance differences between clean code and whatever the alternative is (performance-focused code?) are far outweighed by the maintainability of the former.</p>
<p>As with most things: it's a trade-off.</p>
<p>Your code doesn't need to be <em>text-book-clean</em>, but you should aspire to keep it reasonably clean considering the circumstances.</p>
<h3>It's in the eye of the beholder</h3>
<p>One might argue that what's clean to one person might not be clean to another.</p>
<p>Advanced programmers might find easy to read code that seems incomprehensible to beginners.
Each language has different idioms.</p>
<p>This might indicate that clean code is too relative a thing to be of any help.</p>
<h4>It isn't always</h4>
<p>On the one hand yes, what is or isn't clean/readable depends on the context.</p>
<p>Your team might be used to working with 20000+ LOC files. They might find this normal and desirable.</p>
<p>This is not the end of the world: if it works for the team then it's all good.</p>
<p>On the other hand, not everything is relative.</p>
<p>Calling a variable <code>x</code> is objectively less clear than giving it a decently descriptive name.</p>
<p>Writing your code in a way that can be understood by a beginner, someone coming from a different background/language, or your future self, has clear and obvious advantages.</p>
<p>Using your language's latest super-fancy, concise, ultra-functional gimmick is often less about clean code and more about showing off.</p>
<p>There is a fine line between taking advantage of a given language's features, and gate-keeping the codebase to only those <em>'smart/versed enough'</em> to follow it.</p>
<p>Clean code doesn't look the same in all projects/teams/contexts, but that's not an excuse to disregard best practices and write code <strong>like you want to</strong>.</p>
<h2>Linux</h2>
<h3>Is free if you don't value your time</h3>
<p>Some find Linux (Desktop) to require too much time to set up/configure/maintain to be worth the effort.</p>
<p>To them, all possible gains from using Linux are offset by the amount of time and attention it requires.</p>
<p>In fairness, Linux can be as much of a time sink as you want it to be. That being said, nothing prevents the use of a ready-to-use distribution.
These come already set up and configured, and require little to no maintenance.</p>
<p>Things don't actually break for no reason (except when updating Windows).</p>
<p>Since this is quite obvious, a more charitable reading of the claim might be something like: <em>"The amount of stuff you have to learn is not worth the effort."</em></p>
<h4>Is it really not worth it?</h4>
<p>Learning a new <em>anything</em>, a new OS in this case, implies... A learning process, which requires time and might be frustrating.</p>
<p>Linux being FOSS furthers the amount of learning required, since we all come from using proprietary software and things are quite different here.</p>
<p>If you are not interested in learning a new skill set... Don't. Stick to what you already know.</p>
<p>If instead you are, this can be a lot of fun.</p>
<p>And if you are a developer, I cannot stress enough the amount of times I have been able to solve a problem or help a co-worker just by virtue of having a deeper understanding of how an OS actually functions.</p>
<p>If you come with an open mind, this stuff makes you a better dev. For free.</p>
<h3>Breaks all the time</h3>
<p>Some consider Linux Desktop to be unstable.</p>
<p>I'm still not sure what makes it a reliable server, but an unstable client. Maybe someone will point it out eventually.</p>
<h4>Still less of a headache</h4>
<p>ATOW, I've personally been running a rolling release distribution, widely considered unstable and breakage-prone, for more than 4 years.</p>
<p>In this time, it broke twice:</p>
<ul>
<li>The first was a consequence of me running commands I didn't understand as <code>sudo</code>.</li>
<li>The other was due to a combination of my own silliness and upstream changes (a borked update that <em>should have been</em> easy to recover from).</li>
</ul>
<p>In contrast, I've lost count of how many times I've had to <em>'unfuck'</em> my Windows machine after an update.</p>
<p>MacOS has been less of a headache in that regard, but in that land you either do it <em>'The Apple Way'</em> or you don't. I like it my way, thank you very much.</p>
<p>There is a kernel of truth though: Linux <strong>allows the user</strong> to break things, while the alternatives usually limit what the user can do so much that the only reasonable way something breaks is if they (Microsoft, Apple) break it.</p>
<p>This is easier on the user's ego because it makes him inherently <strong>blameless</strong>. The system likely 'breaks less' because <strong>the user is out of the equation</strong>.</p>
<p>Plus, what's so terrible about breaking things? It's fun and the best way to learn!</p>
<p>Just have a backup and you'll be fine.</p>
TBD in actionhttps://devintheshell.com/blog/tbd-how/https://devintheshell.com/blog/tbd-how/Tips, tricks and how to make it workSun, 05 Mar 2023 16:41:12 GMT<p>As mentioned <a href="../tbd-why">previously</a>, you can be as lax as needed when adapting TBD to a team's workflow.</p>
<p>We'll go over do's, don'ts, and how-to's to help adjust this approach to software development for each context.</p>
<h2>Follow the principles, not the rules</h2>
<h3>Branch wisely</h3>
<p>While the point of TBD is obviously to only work on one main branch, this is an ideal that some teams strive for, but might be out of reach when starting out.</p>
<p>Start by ensuring <strong>no branch lives for more than a day</strong>, the shorter-lived the branch, the better.</p>
<p>Crucially, remove all long-lived branches that run in parallel to the main one.</p>
<p>Don't branch out of habit, do it if/when you actually need to. A POC might be a good example of a valid reason to branch.</p>
<h3>Distrust PRs</h3>
<p>While you work your way to a branch-less workflow, PRs are still going to happen.</p>
<p>Some restrictions might be worth considering:</p>
<ul>
<li>The <strong>pipeline</strong> should define the standards for code validity, quality and security. Not the PR.</li>
<li>Reviews are <strong>optional</strong>: Reviewers shouldn't prevent code from reaching production, reviews should be requested by the submitting dev <strong>if needed</strong> (and should be done synchronously if possible).</li>
<li>Use PRs as <strong>tools</strong>, not rituals: You might feel more confident blocking all pushes to the main branch and having the pipeline run on merge. This is little more than an implementation detail, and is fine as long as it doesn't interfere with the workflow.</li>
<li>Keep them <strong>small</strong>: The easier they are to revert, the better. Prefer multiple small PRs over one big PR per user story.</li>
<li>Consider whether <strong><a href="../pair-programming-101">pair programming</a></strong> is a better alternative than a PR.</li>
</ul>
<p>In general, try to see PRs as little more than <em>'a thing that happens'</em> semi-automatically when pushing commits.</p>
<h3>Stay in sync</h3>
<p>The less time you spend away from your main branch, the better.</p>
<p>Constantly ask yourself: <em>Could this be merged?</em></p>
<p>Doesn't matter if the feature is done or if the bug is fully fixed.
If the answer is yes (as in "tests pass, code compiles and doesn't break prod"), <strong>do it</strong>.</p>
<p>Even if using branches: merge with master, open a new branch and keep going.
Make this a normal part of your workflow.</p>
<p>PRs and branches should end up feeling more like a chore than anything else.</p>
<h3>Deploy whenever</h3>
<p>The more, the merrier.</p>
<p>Keep your code deployable while you work, don't break the build, keep your tests green.</p>
<p>Test yourself and your team by deploying at least once a day. See if your code really is <em>"always in a releasable state"</em>.</p>
<p>Automatically deploying every commit might be a bit much to begin with, but the closer you get, the faster you gather user feedback, the faster you can make informed decisions.</p>
<p>For the bold and brave: have a pipeline that automatically deploys every evening/morning. You might be surprised how much that can change how you work.</p>
<h3>Work in small steps</h3>
<p>Commit code frequently, multiple times per hour. Doesn't matter if the code is not perfect or if it's a <em>"Work In Progress"</em>.</p>
<p>Wrote a test? Commit. Made it pass? Commit. Made it compile? Commit. Refactored a module? Commit.</p>
<p>If it compiles and passes the test suite it's good to go.</p>
<p>Reverts are easy when working in small increments, trust your VCS, think of commits as checkpoints.</p>
<h2>Must-haves</h2>
<p>Of course, there are some technical must-haves to make this work. You might not be able to just take your existing codebase, and go for a TBD workflow.</p>
<p>Here are some things to consider.</p>
<h3>Pipeline</h3>
<p>You need a solid, <strong>cared for</strong>, efficient and stable pipeline.</p>
<p>This should be a primary focus of the team: Issues with the setup (build, tests, containers, pipeline, etc.) should be resolved <strong>immediately</strong>.</p>
<p>Ideally it would take care of building, testing, code analysis, security tests, deploying to production and any other task that can possibly be automated.</p>
<p>The pipeline should be fast and efficient. Builds and tests should run as fast as possible, ideally in parallel.</p>
<p>Only changed code should be built, and only relevant tests should run. Of course this requires enough modularity to make this viable: you can't only run the tests for module A if you expect the changes to affect other parts of the system.</p>
<p>Cache your dependencies, optimize anything that comes to mind. Every minute wasted here will add up really fast.</p>
<h3>Fast builds and tests</h3>
<p>You need to have a comprehensive and meaningful suite of <strong>automated tests</strong>, mostly unit tests with a more selective approach to e2e and integration tests.</p>
<p>These need to be fast and reliable and the team should trust them enough to consider the code deployable as soon as it passes them. They should be the judge of what is or isn't production ready.</p>
<p>Ideally, building the project and running the tests shouldn't take more than a few minutes from start to finish. If there are tests or builds that take longer than the rest, isolate them.</p>
<p>Fuzz tests, for example, might run after the code is deployed or in parallel to it; slow builds might be skipped for patches that don't involve that specific part of the system.</p>
<h4>Locally reproducible</h4>
<p>When following this workflow, <em>breaking trunk</em> slows down the rest of the team.</p>
<p>Since mistakes will inevitably happen, ensure the system can be <strong>quickly and fully</strong> built and tested locally. This should be done regularly before pushing changes.</p>
<p>Tests that "only work in Jenkins", flaky tests, or slow/complicated builds incentivize devs to run very few checks locally. This isn't necessarily carelessness: they might trust the pipeline so much that they count on it to spot errors, or they might think it's not a smart use of their time (why bother if the pipeline is going to do the same thing?).</p>
<p>That trust is a good thing, but not good enough to justify slowing everyone else down. Make sure these local checks are not a chore, but a quick step one performs without even thinking about it.</p>
<h3>Fine-grained deploys</h3>
<p>Ideally, especially with monolithic applications, one wouldn't need to re-deploy the whole thing. Rather, deploys should only involve the parts of the system that have been updated.</p>
<p>This is easy enough when working with microservices (if done right), but can be challenging with monolithic systems.</p>
<p>Modularize your code in a way that allows for partial deploys. Ideally, the selection of which part to deploy would be automatic based on the git diff, but human selection might be a good or even better idea depending on your system.</p>
<p>If for example the codebase is fragile and changes in one place are bound to affect other places, automatically selecting which piece to deploy might be a bad idea, while a human might have the context needed to make that decision.</p>
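<p>The diff-based selection described above can be sketched in a few lines. This is a minimal, hypothetical example: the directory-to-service mapping, the service names, and the "shared code forces a full deploy" rule are all assumptions for illustration, not a prescription.</p>

```javascript
// Sketch: choosing deploy targets from a list of changed files.
// The directory-to-service mapping below is entirely hypothetical.
const MODULE_MAP = {
  "billing/": "billing-service",
  "catalog/": "catalog-service",
};

// Directories whose changes are assumed to affect everything.
const SHARED_PREFIXES = ["shared/"];

function deployTargets(changedFiles) {
  // A change to shared code invalidates the partial-deploy optimization.
  if (changedFiles.some((f) => SHARED_PREFIXES.some((p) => f.startsWith(p)))) {
    return new Set(Object.values(MODULE_MAP));
  }
  const targets = new Set();
  for (const file of changedFiles) {
    for (const [prefix, service] of Object.entries(MODULE_MAP)) {
      if (file.startsWith(prefix)) targets.add(service);
    }
  }
  return targets;
}

// In a pipeline, the file list could come from something like:
//   git diff --name-only origin/main...HEAD
```

<p>Note that files matching no prefix (docs, READMEs) trigger no deploy at all, which is often exactly what you want.</p>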
<h2>Tips and tricks</h2>
<p>When coming from a branch based workflow, it is likely unclear how exactly to make changes without breaking things.</p>
<p>There are multiple tricks you can use to protect the system from your code:</p>
<h3>Feature Flags</h3>
<p>A feature flag is a way to hide a functionality or a piece of code unless certain criteria is met.</p>
<p>What these criteria are is up to you and context dependent. Feature flags can be as simple as "only available to user X" and complex enough to require a <a href="https://github.com/Unleash/unleash">purpose-built solution</a> just to manage them.</p>
<pre><code>// old code
if (user.flags.newFeature || user.email === "beta.tester@example.com") {
  // cool new feature!
}
// old code
</code></pre>
<p>This allows you to easily "turn your code off and on" for one or more users, handle Betas or simply to manually test out the code in production.</p>
<p>It also has the added benefit of allowing work in progress code to live in production without affecting the application or the users in the slightest.</p>
<p>On top of that, it can pave the way for <a href="https://en.wikipedia.org/wiki/A/B_testing">A/B testing</a>.
This can be enough of a reason to implement feature flags on its own.</p>
<p>As you can imagine, there is much more to feature flags. You can learn more <a href="https://trunkbaseddevelopment.com/feature-flags/">here</a>.</p>
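<p>Once you have more than a couple of flags, it helps to centralize the criteria instead of scattering <code>if</code>s around. A minimal sketch of such an evaluator; the flag names and criteria here are made up for illustration:</p>

```javascript
// Sketch: a tiny criteria-based feature flag evaluator.
// Flag names and criteria below are made up for illustration.
const flags = {
  newCheckout: (user) => Boolean(user.betaTester) || user.plan === "enterprise",
  darkMode: () => true, // fully rolled out
};

function isEnabled(flagName, user) {
  const criteria = flags[flagName];
  return criteria ? criteria(user) : false; // unknown flags default to off
}
```

<p>Defaulting unknown flags to "off" means deleting a flag fails safe instead of exploding at runtime.</p>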
<h3>Branch by abstraction</h3>
<p>When making changes to a piece of code that other developers or teams depend on, branching from that code by abstracting the API is very helpful.</p>
<p>To use a simple example: If changes need to be done in <code>function_foo()</code> but someone else is using it, extracting N functions from it can allow for easy swapping of the parts that need work or a new implementation, without having to go for more invasive approaches, like changing the usages of the original function to another <code>wip_function_foo()</code>.</p>
<p>This might seem needlessly complex for functions, classes or interfaces/traits, but in complex systems and/or big enough changes we might be talking about whole modules. Even changing a function signature might entail an unmanageable amount of merge conflicts.</p>
<p>You can think of this as a type of Parallel Change, although on top of allowing you to keep the tests passing, it allows other team members to <strong>keep working uninterrupted</strong>.</p>
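<p>A minimal sketch of the idea in JavaScript; the function names and the pricing logic are invented for illustration:</p>

```javascript
// Sketch of branch by abstraction; all names and logic are illustrative.

// The existing implementation, untouched while work continues elsewhere.
function legacyPriceCalculator(order) {
  return order.items.reduce((sum, item) => sum + item.price, 0);
}

// The new implementation, developed in parallel behind the abstraction.
function newPriceCalculator(order) {
  return order.items.reduce((sum, item) => sum + item.price * item.qty, 0);
}

// Callers depend on this abstraction only, so swapping implementations
// (here via a simple toggle) never interrupts anyone else's work.
function calculatePrice(order, useNew = false) {
  return (useNew ? newPriceCalculator : legacyPriceCalculator)(order);
}
```

<p>Once the new implementation is fully validated, the toggle and the legacy function are deleted, and the abstraction can be inlined away if it no longer earns its keep.</p>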
<h3>Dark Launches & Canary Releases</h3>
<p>Dark launches simply refer to releases that are hidden and only made visible/usable for a subset of users.</p>
<p>Similarly, canary releases are only meant to go out to a select group of users.</p>
<p>The former is used when all users necessarily run the same version of the software (a web app, SaaS, etc.) while the second might make more sense in the opposite case (a phone or desktop app). Both have the same purpose and hold the same value.</p>
<p>The point here is to only give access to the feature to a predefined group of trusted users (or a small percentage of the total user base), usually with the help of feature flags.</p>
<p>By doing so, you can see a feature in action (not only in production but in use by actual users), gather feedback, evaluate how it performs and decide if a full-scale release makes sense or more work needs to be done.</p>
Push to prod, do it oftenhttps://devintheshell.com/blog/tbd-why/https://devintheshell.com/blog/tbd-why/Why TBD is a good ideaSat, 04 Mar 2023 12:00:13 GMT<p>Software development might mean different things to different people, but at the end of the day the whole point is to satisfy the needs of its users, whichever they might be.</p>
<p>Think of it in that light, and some gaps will appear in the way those needs are <strong>conveyed</strong>:</p>
<ul>
<li>What the user <strong>actually</strong> wants <code>!=</code> what he thinks he wants.</li>
<li>What the user thinks he wants <code>!=</code> what the business team thinks he wants.</li>
<li>What the business team requests <code>!=</code> what ends up as a Requirement or User Story for the dev team.</li>
</ul>
<p>This is unfortunate, but apart from trying to get in touch directly with the user, there's little we can do as devs to close that gap.</p>
<p>What's maybe more relevant to us is the gap that runs on the other side of the equation, between code being written and the user actually using it (the way the needs are <strong>met</strong>):</p>
<ol>
<li>Write the code</li>
<li>Make a PR</li>
<li>Wait for the reviewer and discuss the code</li>
<li>Deploy to a staging env</li>
<li>Wait for QA to validate it</li>
<li>Mark as ready to release</li>
<li>Wait for the release cycle/window</li>
<li>... Is the user still there?</li>
<li>Fuck I pushed a bug to prod <code>-.-U</code></li>
<li>Hotfix? Rollback and <code>goto</code> step one?</li>
</ol>
<p>All this with the accompanying mess of branches, merges and possible conflicts with other co-workers or teams.</p>
<p>In an ideal world, the user would tell us what he needs directly and clearly, peek over our shoulders while we code, understand what we are writing, and let us know if we are on the right track.</p>
<p>Sadly, this world is not ideal. But there is a lot we can do to shorten the time between writing code and receiving the users feedback on it.</p>
<p>Trunk Based Development is one approach we might use to achieve this.</p>
<h2>What even is TBD?</h2>
<p><a href="https://www.atlassian.com/continuous-delivery/continuous-integration/trunk-based-development"><strong>T</strong>runk <strong>B</strong>ased <strong>D</strong>evelopment</a> is, in a nutshell, the practice of writing software avoiding branches as much as possible, streamlining the development in one main "Trunk" branch, also used for deploys, QA and the likes.</p>
<p>The fundamental importance of this approach is in recognizing that the main "Trunk" branch is <strong>the only</strong> source of truth.</p>
<p>There's no <em>'my version vs your version'</em> here, we all have the same version. No '<em>what commit is prod right now?</em>': the answer is ideally always <em>'the last one'</em>.</p>
<p>While using branches is not necessarily blasphemy (it might be more practical to use them), no <strong>long-lived</strong> branches should exist.</p>
<p>Of course for this to be viable, commits should be small and happen as frequently as possible. Code should be thoroughly tested and <strong>always releasable</strong>.</p>
<p>Simply put, you should aspire to constantly commit correct, tested code straight to production.</p>
<h2>What's wrong with branches?</h2>
<p>Nothing per se, but we often forget the trade-offs they bring. Here's a refresher.</p>
<h3>Merge Hell</h3>
<p>In big projects with multiple dev teams working on shared code, <strong>Merge Hell</strong> is a very real issue.</p>
<p>Nobody wants to spend half a work-day resolving merge conflicts, much less pay for someone's salary to do so.</p>
<p>The feeling one gets after resolving merge conflicts for 2 hours, only to find out more conflicting code was pushed in the meantime is... not nice.</p>
<p>It's just not a productive use of time.</p>
<h3>Partial Truths</h3>
<p>If you are working on a branch created more than a day ago, you know how your code behaves with <strong>yesterday's project</strong>. Your <em>'Truth'</em> got stuck in time.</p>
<p>Today's version of the project might not behave the same, and you might need to re-think what you are doing. Wouldn't you want to know if that's the case as soon as possible?</p>
<p>The other side of the coin might be even worse: While your code is not in the main branch, you are <strong>hiding information</strong> from the rest of the team.</p>
<p>Nobody knows what your code looks like, how it behaves or how to work with it. You are hiding your <em>'Truth'</em> from the rest of the team(s).</p>
<h3>Speed (or lack thereof)</h3>
<p>Do we really need long-lived branches at all? When a critical hotfix is required, we clearly have no issue pushing directly to the main branch.</p>
<p>Even with a protected main branch, we create a short-lived branch on the fly before deploying it directly to production, no fluff involved.</p>
<p>This approach allows us to quickly update production software, delivering immediate value by squashing a bug or addressing a critical need.</p>
<p>So why not aim for the same speed when rolling out new features or UI updates? Why should we go fast only in emergencies?</p>
<h2>TBD? CI? CD?</h2>
<p>Some might argue that these problems are avoided by practicing CI/CD, no need for this TBD business. The thing is, you are probably <strong>not really</strong> doing CI/CD if you aren't also doing TBD.</p>
<p>On the other hand, if you’re doing TBD, you're either practicing CI/CD, incredibly smart, or incredibly dumb.</p>
<p>CI/CD is a somewhat ambiguous term: It depends on what one considers <em>"continuous"</em>.</p>
<p>Teams coming from a monthly release cycle might consider weekly integrations to be <em>"continuous"</em>, while some might argue that daily integrations are the <strong>bare minimum</strong> to be considered <em>"continuous"</em>.</p>
<p>CI/CD is also quite often mistaken with <em>"having a pipeline"</em>, which is indeed a necessary and important part of the deal, but not the whole thing.</p>
<p>One can hide behind these ambiguities when talking about CI/CD, but there's no hiding from TBD: You either push to prod, or you don't.</p>
<p>If we want to deliver software continuously (CD), we need to integrate our work continuously (CI). At some point one has to evaluate if the rituals involved in a branch-based workflow really allow any of this, or if we are better off getting rid of the paperwork and focusing on what matters.</p>
<blockquote>
<p>The fundamental assumption of CI is that there's only one interesting version, the current one.</p>
<p>-<a href="https://wiki.c2.com/?ContinuousIntegration=">Wiki</a></p>
</blockquote>
<p>If the only interesting version is the current one, why waste effort, time and resources in other versions?</p>
<h2>PRs kinda suck</h2>
<p>This does not mean we need to ditch PRs completely: As mentioned before, branches are not the devil, PRs have their place.</p>
<p>TBD is a general way of approaching development, not a strict dogma.</p>
<p><a href="../tbd-how">We'll see</a> that these principles can be followed even in a <a href="../tbd-how/#no-gatekeeping">PR-centric workflow</a>, so long as their use is reasoned and makes sense in context.</p>
<p>Still, it’s worth considering why we use and (supposedly) <em>"need"</em> PRs in the first place, especially given their downsides:</p>
<ul>
<li>
<p><strong>Reviews are a pain:</strong> PR reviews are often treated like chores: rarely does one enjoy doing them and more often than not, they are done as an afterthought in some spare time or hastily before doing the <em>'actual work'</em>.</p>
</li>
<li>
<p><strong>Context change:</strong> They demand frequent context switching, especially if one is expected to prioritize them. This is not productive and should be avoided if possible.</p>
</li>
<li>
<p><strong>Slow pace:</strong> Reviewing PRs, if done with care, is a time-consuming process (more so with long-lived branches), and even with the best intentions and effort, it still significantly slows down the pace at which we deliver value to users. After all, perfectly good code might be sitting in a PR right now just waiting for someone to review it while it could be adding value to the project.</p>
</li>
<li>
<p><strong>Better alternatives:</strong> In most situations, live code reviews and <a href="../pair-programming-101">pair or mob programming</a> sessions are a much faster way of ensuring code quality (or asking for input/help), with much less chance of overlooking mistakes, introducing bugs or creating unnecessary friction between co-workers.</p>
</li>
</ul>
<h3>Gatekeeping</h3>
<p>More often than not, PRs are used as a form of gatekeeping. We assign a keeper for the 'Security' gate, one for the 'Testing' gate, another one for the 'Efficiency' gate and so on.</p>
<p>To be clear, this can make sense in some cases:</p>
<ul>
<li>
<p><strong>Many juniors, few seniors:</strong> PRs are a nice way of managing these team layouts ensuring quality standards are met and bugs avoided. Ideally the senior would pair-program his way out of this situation, but it is a useful temporary crutch.</p>
</li>
<li>
<p><strong>Open Source Software:</strong> In this context there are only a handful of maintainers with a complete picture of the codebase and sometimes hundreds of occasional contributors. It just makes sense for them to inspect the code before merging it and PRs are the best way to do so. The maintainers will be the ones... maintaining the code in the long run after all.</p>
</li>
</ul>
<p>Gatekeeping might be done with good intentions and might be necessary in certain moments and contexts. Usually though, especially in a business setting, there's a deeper underlying cause.</p>
<h2>Trust</h2>
<p>You might feel uncomfortable letting "anyone push to prod".</p>
<p>It might be worth digging deeper here. What would you expect to happen?</p>
<p>Do you expect your teammates to knowingly introduce bugs? Are you worried that they aren't <em>'good enough'</em>? <em>'Professional enough'</em>? <em>'Smart enough'</em>?</p>
<p>If these doubts sound silly: Congrats, you trust your teammates!</p>
<p>If instead they sound reasonable, you might want to ask yourself if you are comfortable working with people you don't trust or respect, well before thinking about TBD.</p>
<p>A team can't work effectively if its members don't <strong>fully trust each other</strong>. Each member should feel like the rest of the team has their back, and that everyone is capable of doing at least as good a job as they do.</p>
<p>If this isn't the case, no, TBD is not for you. I would argue the team itself is not for you either.</p>
Write secure softwarehttps://devintheshell.com/blog/cybersec-devs/https://devintheshell.com/blog/cybersec-devs/Don't leave security as an afterthoughtFri, 17 Feb 2023 23:26:08 GMT<p>Bolting security onto a system as an afterthought is about as effective as retroactively adding tests to meet coverage goals, which is to say, not very.</p>
<p>Testing a system that wasn't designed for testability is a huge pain and often limits both the scope and method of testing.
Similarly, securing a system that wasn't built with security in mind can feel like using a sledgehammer to crack a nut.</p>
<p>What follows is a non-comprehensive list of heuristics to ensure a system is built with security at its core.</p>
<h2>Minimize complexity</h2>
<p>Unnecessary complexity is the enemy of good software.
It is also the enemy of secure software.
The more complex the system, the more surface area, the more nooks and crannies that can be exploited.</p>
<p>Complex UIs are more likely to produce invalid states, keep them simple.
The more user flows your app has, the harder it becomes to keep track of all of them.</p>
<p>Think of each software integration, SaaS or dependency as a possible security risk, each supported platform is a new Pandora's box to be opened.
This doesn't mean these things should be avoided altogether, just keep in mind that they imply <strong>risk</strong>, evaluate if it's worth it or not.</p>
<p>Software minimalism is a bit of a meme, but it is true that having <em>less</em> (features, versions, size, etc.) is a surefire way to minimize possible attacks.</p>
<h2>Clearly define boundaries</h2>
<p>It's surprising how often complex systems don't have clearly defined points of entry.
This often leads to incomplete or inconsistent <strong>input validation</strong>.</p>
<p>Defining what pieces of your system will communicate with external systems (including but not limited to <em>The Users</em>) is a key step to securing it. This determines <strong>what</strong> exactly you are securing, where to focus your efforts.</p>
<p>Another team's microservice sending incorrect input to your part of the system is an issue, but usually benign and solvable with a Slack message.
A public API receiving incorrect input might be user error, but might also be something else.</p>
<p>This doesn't mean internal services can disregard security, but security requirements vary based on what it is you are securing.
Defining where a system ends and another one starts is key to detecting what parts need more or less security.</p>
<h2>What are you defending against?</h2>
<p>This doesn't need to be a full threat model for the whole organization, but it is useful to ask certain questions.</p>
<p>Is the business B2C or B2B? Is the API public or for paid users only? What kind of data is being stored? Is the government involved somewhere? What relationship does the business have with its clients? Are possible competitors also using the software?</p>
<p>You don't need detailed answers to all of these questions, but the more information/context you have, the better you can define security requirements.</p>
<p>You might have none of the answers you need to make an informed decision.
First: Really? How do you have no context of the software you work on?
In any case, you can look at the <a href="https://owasp.org/www-project-top-ten/">most critical security risks</a> for a good baseline.</p>
<p>Defending against everything often ends up defending against nothing. Defining <strong>what you are defending against</strong> is key.</p>
<h2>Restrict access</h2>
<p>Both at the user and at the code level.</p>
<p>Users should not see parts of the system they shouldn't interact with.
Giving a user a big red "DELETE ALL" button and telling them not to use it is like handing a child a crayon and expecting them not to draw on the walls.</p>
<p>Making internal functionality available for others to use (think making a function public instead of private) and expecting them not to, is equally naive.
This goes back to the previous point about defining boundaries: limit the ways a piece of code can be interfaced with.</p>
<p>Grant as little access and as little visibility as needed. Not more, not less.</p>
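<p>At the code level, this can be as simple as being deliberate about what a module exports. A JavaScript sketch, where the auth logic is purely illustrative:</p>

```javascript
// Sketch: only the intended entry point is exported; helpers stay private.
// The auth logic here is purely illustrative.

// Internal helper: not reachable from outside this module.
function validateToken(token) {
  return typeof token === "string" && token.length > 0;
}

// The one deliberate entry point.
function authenticate(token) {
  if (!validateToken(token)) throw new Error("invalid token");
  return { ok: true };
}

module.exports = { authenticate }; // note: validateToken is NOT exported
```

<p>Anything you export is a surface someone will eventually depend on, and attack.</p>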
<h3>Whitelisting > Blacklisting</h3>
<p>Ideally, one would prefer the former over the latter.
This is kind of what is usually done with admin users: only these 'whitelisted' users have access to certain things.</p>
<p>The same goes with IPs: don't wait for a DOS attack to start blacklisting IPs, block everything except the ones registered by the users/clients.</p>
<p>Of course, this isn't always possible, as is the case with public-facing APIs or services.
Blacklists shouldn't be avoided, but whitelists should be preferred when possible.</p>
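<p>As a sketch, here's the whitelist-first idea as an Express-style middleware. The middleware shape is a common Node.js convention; the IP addresses are obviously placeholders:</p>

```javascript
// Sketch: whitelist-first access control as Express-style middleware.
// The allowed IPs used anywhere with this are illustrative placeholders.
function ipWhitelist(allowed) {
  return function (req, res, next) {
    // Deny by default: only explicitly allowed addresses get through.
    if (allowed.has(req.ip)) return next();
    res.statusCode = 403;
    res.end("Forbidden");
  };
}

// Hypothetical usage:
//   app.use(ipWhitelist(new Set(["10.0.0.5", "10.0.0.6"])));
```

<p>The important property is the default: an empty set means nobody gets in, rather than everybody.</p>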
<h2>Create alarms</h2>
<p>You'd be surprised how often inappropriate uses are discovered when investigating logs trying to squash a bug.</p>
<p>Consider creating alarms for unexpected execution flows. Of course, handle the error in the code, but also give it a thought: Could this behavior suggest more than a user error?</p>
<p>An IP address constantly being rate limited should not go unnoticed.
A delete operation performed 1200 times in 1 minute in the main production database is likely more than a bug (and even if it is, you probably still want to be notified ASAP).</p>
<p>Set up a <strong>useful</strong> logging system, have alarms to cover edge cases and/or critical user flows (logins, payments, etc.).
This will help minimize the time it takes for a possible attack to be detected and dealt with.</p>
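<p>A bare-bones version of such an alarm can be sketched as a sliding-window counter. The threshold, window and <code>notify</code> hook here are assumptions for illustration, not any particular monitoring product's API:</p>

```javascript
// Sketch: a minimal sliding-window threshold alarm.
// threshold, windowMs and notify() are assumptions, not a real product's API.
function makeAlarm({ threshold, windowMs, notify }) {
  const events = []; // in production you'd prune this or store it externally
  // `now` is injectable to keep the sketch testable.
  return function record(key, now = Date.now()) {
    events.push({ key, at: now });
    const recent = events.filter((e) => e.key === key && now - e.at <= windowMs);
    if (recent.length >= threshold) notify(key, recent.length);
  };
}

// Hypothetical usage: fire when the same IP is rate limited 100 times a minute.
//   const alarm = makeAlarm({ threshold: 100, windowMs: 60_000, notify: page });
//   alarm(`rate-limited:${req.ip}`);
```

<p>Real systems would hand this off to their logging/metrics stack, but the shape is the same: count a keyed event over a window and page someone past a threshold.</p>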
<h2>Test for security</h2>
<p>Pentesting is great, but often time-consuming and expensive.
While writing security driven tests is no substitute for this, it can alleviate some of the work.</p>
<p>Write your security requirements as tests, like you would write an acceptance test for a use case.
Use <a href="https://en.wikipedia.org/wiki/Fuzzing">fuzz testing</a> to discover unwanted behavior.</p>
<p>If you fix a security issue, write a regression test to ensure it doesn't happen in the future.
Or even better, use TDD to fix it.</p>
<h2>Organizational aspects</h2>
<p>While a lot can be done as a dev, ensuring the whole organization is aligned on security practices is key.
Here are some things worth considering.</p>
<h3>Involve your business team</h3>
<p>Security is <strong>always an implicit requirement</strong>, but often not an explicit one.
Users might not ask for it, PMs might not add it to the board, but they do expect the system to be secure.</p>
<p>Make sure everyone involved is aware of this.
Resources should be allocated to security.</p>
<h3>Have a plan</h3>
<p>Don't assume that nothing bad can happen, nor that you will be able to recover from it.</p>
<p>Have a reliable backup system, some sort of contingency plan: <strong>When</strong> something goes wrong, you should be able to recover from it.</p>
<p>Ensure you can roll back the state of your software: <strong>When</strong> a vulnerability is introduced, you should be able to quickly roll back even before you start fixing it.</p>
<h3>Audit your system</h3>
<p>Reacting quickly is key, but <strong>preventing</strong> security issues can be far more cost-effective.</p>
<p>Consider having a bounty program, call a pentester once in a while.
Even better, learn the basics of pentesting yourself!</p>
<h3>Remove trust from the equation</h3>
<p>People, particularly non-technical ones, often think that proprietary software is more secure by virtue of being opaque.
After all, if I can see how it's made, I can see its weaknesses, right?</p>
<p>This is <em>"security by obscurity"</em> and, while it can <a href="https://www.baeldung.com/cs/security-by-obscurity">make sense</a> in some contexts, it is generally <a href="https://en.wikipedia.org/wiki/Security_through_obscurity#Criticism">not recommended</a> (especially not on its own).</p>
<p>A secure system is not one that is only safe <strong>if the attacker can't see it</strong>, but one that remains safe even when they can.</p>
<p>Either outsource security to a third party based on the <strong>guarantees</strong> it offers (and its SLA), or prefer open standards and software.</p>
<p>The chance of auditable, open and widely used protocols and/or software being insecure is slim: everybody is using them, depending on them and auditing them. Plus, issues with these systems are by nature instantly public, and a vast pool of invested talent is ready to fix vulnerabilities if/when they occur.</p>
Software licenceshttps://devintheshell.com/blog/licences/https://devintheshell.com/blog/licences/A quick run downFri, 10 Feb 2023 16:15:44 GMT<p>An overview of different open source software license types with a quick run down of the most common ones.</p>
<h3>TL;DR</h3>
<ul>
<li>Code with no license attached is under exclusive copyright of the creator.
<ul>
<li>This applies for public GitHub/GitLab repos: users can only see and fork them.</li>
</ul>
</li>
<li>GPL requires all derivative work to be licensed with the same license.</li>
<li>MIT has basically no restrictions and is the most widely used.</li>
<li>Apache is like MIT, <strong>but</strong> patent rights are explicitly granted to the users.</li>
<li>MPL is like MIT, <strong>but</strong> modifications must share the same license.</li>
<li>BSD-3 is like MIT, <strong>but</strong> the project name cannot be used as endorsement.</li>
</ul>
<p>You can read more about the various licenses <a href="https://choosealicense.com/licenses/">here</a> and <a href="https://tldrlegal.com/">here</a>.</p>
<h2>Permissive vs Restrictive</h2>
<p>These words get thrown around a lot, but their meaning might seem counter-intuitive.</p>
<p>In a nutshell, restrictive licenses work a bit like viruses: They require derived work to share the same license. They <strong>restrict</strong> derivative work to that specific license.
In contrast, permissive licenses have a more <em>"do what you want"</em> approach.</p>
<p>So in this context, these concepts are not from the point of view of the user (none of these licenses restricts the user in any way) but from the perspective of other developers/businesses, which may or may not be able to do what they want with the software.</p>
<h2>MIT & similar licenses</h2>
<p><strong>MIT</strong> licensed code can be used by whoever to do whatever.
The only restriction is that the original "owner" cannot be held liable.</p>
<p>A business that stumbles across some MIT licensed code can feel free to use it however it wants.</p>
<p>Clearly, this license is the most business-friendly, in the sense that businesses are not restricted in any way by it.
The story is, however, quite different from the developer's point of view, as this license is perfectly fine with <em>BigTechCorp</em> using your code for profit and giving nothing back.</p>
<p>It's by far the most widely used, not least due to its <a href="https://tldrlegal.com/license/mit-license#fulltext">simplicity</a>.</p>
<p>Code licensed under the <strong>Apache</strong> license is functionally in the same situation, with the added technicality that patent rights are explicitly granted to the users.
This is done to prevent patent-holding contributors from possibly suing users of patented code.</p>
<p>The <strong>BSD-3</strong> license is a lot like MIT, with the added restriction that the original project name cannot be used as endorsement of derivative work.
So if a business decides to redistribute a piece of BSD-3 licensed software, it could not use the name of that software to promote or endorse their product in any way.</p>
<p><strong>MPL</strong> is technically a restrictive license, but only for <strong>modifications</strong> of licensed code. That is to say, all modifications of MPL-licensed code must also be licensed under the MPL.</p>
<h2>GPL & friends</h2>
<p><strong>GPL</strong> is a family of restrictive, copyleft licenses that broadly require all derivative work to be distributed under the same license.
Code under these licenses is generally considered not only <em><a href="https://opensource.org/osd">Open Source</a></em>, but <em><a href="https://www.gnu.org/philosophy/free-sw.en.html">Free and Open Source</a></em>.</p>
<p>Code under these licenses can be used freely only in projects under the same or similarly restrictive licenses.
So if a business wants to use a piece of software under a GPL license, it can do so on the condition that all code <em>derived</em> from it is distributed under the same license.</p>
<p>This is why it is traditionally considered less business-friendly than the previously mentioned licenses: I can't just "steal" your work and profit, I have to "give away" the code as well.
Making money from GPL code requires the business to actually provide a service, add value, instead of merely hiding the code behind a paywall.</p>
<p>Note that internally or privately used codebases don't apply here: These licenses are concerned with code that is <strong>distributed</strong> (freely or not) to end users.
Also, the notion of <strong>derivative work</strong> here generally refers to any code that depends on and is distributed with the original work.
This however may vary from one type of GPL license to another.</p>
<h3>Lesser GPL and Affero GPL</h3>
<p>A more permissive and a more restrictive version of the GPL, respectively.</p>
<p>LGPL is non-restrictive as long as the licensed work is being used <em>"through interfaces provided by the licensed work"</em>.
AGPL is more restrictive in that when the licensed work is being used <em>"to provide a service over a network, the complete source code [...] must be made available"</em>.</p>
<p>Simply put:
<strong>LGPL</strong> draws an exception for libraries.
<strong>AGPL</strong> also considers it <em>"derived work"</em> if it sits behind a network.</p>
<h3>GPLv2 vs GPLv3</h3>
<p>Similarly, while GPLv2 includes no particular consideration regarding hardware, GPLv3 does:</p>
<blockquote>
<p>If the software is part of a consumer device, you must include the installation information necessary to modify and reinstall the software.</p>
</blockquote>
<p>So in this case, distributing software as "part" of hardware does not exclude it from the GPL restrictions.</p>
<p>Why does this matter?
Well some companies thought that only software distributed by itself (think CDs or floppies in the olden days or a software download nowadays) would be subject to the license restrictions.
For the more curious, look up "Tivoization".</p>
<p>The new version of the license explicitly prevents this "misunderstanding".</p>
<h2>Other licenses</h2>
<p>There are of course as many licenses as one is willing to look for.</p>
<p>The <strong>Unlicense</strong> for example <a href="https://choosealicense.com/licenses/unlicense/">states</a> that the work is dedicated to the public domain.
No copyright, no restrictions, no strings attached.</p>
<p>Other honorable mentions are the <a href="https://tldrlegal.com/license/do-wtf-you-want-to-public-license-v2-(wtfpl-2.0)">Do What The Fuck You Want To</a>, the <a href="https://tldrlegal.com/license/idgaf-v1.0">I Don't Give A Fuck</a>, or the <a href="https://github.com/me-shaon/GLWTPL/blob/master/NSFW_LICENSE">Good Luck With That Shit</a> licenses.
Some people just can't be bothered.</p>
<h2>No License?</h2>
<p>So what happens if a piece of code is not licensed at all?
By default, the author holds full exclusive copyright: nobody may do anything with that work without their permission.</p>
<p>This is irrelevant if the work is kept private, but might lead to interesting situations when sharing it.
Say a user stumbles upon an unlicensed piece of code on the internet.
This user has three options:</p>
<ul>
<li>Don't use the software.</li>
<li>Negotiate a private license/bring a lawyer.</li>
<li><em>Yarr me salty seadog!</em></li>
</ul>
<p>This is awkward and can be messy to deal with, so please just license the work!</p>
<h2>GitHub TOS</h2>
<p>This leaves us with a final consideration on GitHub <a href="https://docs.github.com/en/site-policy/github-terms/github-terms-of-service#5-license-grant-to-other-users">TOS</a>:</p>
<blockquote>
<p>By setting your repositories to be viewed publicly, you agree to allow others to view and fork your repositories.</p>
</blockquote>
<p>So in trusting Microsoft, we agree that our <strong>unlicensed, public</strong> repos are free to view and fork, but nothing else.
This leaves us in the same situation described above so again: Just license your work!</p>
Parse JSON with jqhttps://devintheshell.com/blog/jq-yq/https://devintheshell.com/blog/jq-yq/Sed and awk's lost cousinSun, 05 Feb 2023 13:25:18 GMT<p>Our beloved <a href="../series/cli-fu">GNU utils</a>, especially sed and awk, work better with some file types than others. JSON, YML or XML files can be a bit of a pain to work with.</p>
<p><a href="https://stedolan.github.io/jq/">Jq</a> is a parser specifically designed to handle JSON files, and there's a bonus tool at the end for YML and XML files as well!</p>
<h2>The basics</h2>
<p>Let's take a simple JSON as an example: run <code>curl https://til.hashrocket.com/api/developer_posts.json?username=doriankarter</code> on your command line to see the data.</p>
<p>Since this data is presented as a one-liner, we can use <code>jq</code> to format the output:</p>
<pre><code>curl https://til.hashrocket.com/api/developer_posts.json?username=doriankarter | jq
</code></pre>
<p>We can query the interesting bits and remove some noise simply by referring to its node name:</p>
<pre><code>curl https://til.hashrocket.com/api/developer_posts.json?username=doriankarter | jq '.data.posts[]'
</code></pre>
<p>To output the data as an array we can just enclose the query in <code>[]</code>:</p>
<pre><code>curl https://til.hashrocket.com/api/developer_posts.json?username=doriankarter | jq '[.data.posts[]]'
</code></pre>
<p>Or we can do some interesting manipulation to the data and present a parsed version:</p>
<pre><code>curl https://til.hashrocket.com/api/developer_posts.json?username=doriankarter | jq '.data.posts[] | {id: .slug, formatted_title: ("THIS IS A TITLE - " + .title)}'
</code></pre>
<p>Again, we are accessing the data by their node name and doing some string concatenation.</p>
<p>Notice how we use a pipe (<code>|</code>) to pass the data from one command to the next.</p>
<h2>The not so basics</h2>
<p>This tool has a bunch of very useful functions available; we'll go over a few of them.</p>
<p>From now on, there will be no reference to the <code>curl</code> command to keep the code blocks more concise.</p>
<h3>Delete node</h3>
<p>Use it to clear out unwanted noise:</p>
<pre><code>jq 'del(.data.posts[].slug)'
</code></pre>
<h3>Filter data</h3>
<p>Select only the entries that match the given condition:</p>
<pre><code>jq '[.data.posts[] | select(.title | length > 30)]'
</code></pre>
<h3>Add a node</h3>
<p>You can add nodes to the JSON:</p>
<pre><code>jq '.data.posts[] | (. + {hi: "mom"})'
</code></pre>
<h3>Conditional logic</h3>
<p>Following the previous example, we can use <code>if</code> statements to add a node with variable content.</p>
<p>Here we create a new one called <code>IS_VALID</code> with the value <code>"Too short!"</code> or <code>"yes"</code> depending on the length of the <code>.title</code>.</p>
<pre><code>jq '.data.posts[] | (. + {IS_VALID: (if .title | length < 30 then "Too short!" else "yes" end)})'
</code></pre>
<p>Perhaps more useful, we can add the new node or not depending on the condition:</p>
<pre><code>jq '.data.posts[] | (if .title | length > 30 then . + {IS_VALID: true} else . end)'
</code></pre>
<h3>Group by</h3>
<p>Group nodes by values using <code>group_by()</code>:</p>
<pre><code>jq '[.data.posts[] | (. + {IS_VALID: (if .title | length < 30 then "Too short!" else "yes" end)})] | group_by(.IS_VALID)'
</code></pre>
<p>Notice how in this case we create a new array with the data before sending it to <code>group_by()</code>.</p>
<h3>Sort by length</h3>
<p>Sorting is also possible, and the result can be reversed if needed:</p>
<pre><code>jq '[.data.posts[] | (. + {len: (.title | length)})] | sort_by(.len) | reverse'
</code></pre>
<p>Notice that we add a <code>.len</code> node with the result of passing <code>.title</code> to the <code>length</code> built-in function.</p>
<h3>Modify in place</h3>
<p>So far we've always focused on the content of the <code>posts</code> array, losing it and the <code>data</code> node names in the process.</p>
<p>This might be what you want, but in some cases one needs to modify the data <em>'in place'</em>, keeping the original data structure.</p>
<p>This can be done by swapping the pipe operator (<code>|</code>) for the modify-in-place operator (<code>|=</code>). So, taking this simple example from before:</p>
<pre><code>jq '.data.posts[] | (. + {hi: "mom"})'
</code></pre>
<p>If we wanted to modify the original data structure including the <code>data</code> and <code>posts</code> node names, we could instead do:</p>
<pre><code>jq '.data.posts[] |= (. + {hi: "mom"})'
</code></pre>
<h2>Handle other file types with yq</h2>
<p>Since this is so useful, someone took the time to <a href="https://github.com/mikefarah/yq">create</a> <code>yq</code> (as in YAML query). It actually doesn't just handle YAML files, but also XML, CSV and TSV.</p>
<p>Not only that, you can easily use this application to convert one file type into another!
Check the <a href="https://mikefarah.gitbook.io/yq/">docs</a> to find out more.</p>
<p>Keep in mind that apart from what is shown below, all the previous operations can be applied to any of these file types.
Since <code>yq</code> uses a similar syntax to <code>jq</code>, I'll keep it out of the examples to keep things simple.</p>
<p>This is just a quick overview of how you might want to use the tool, it can achieve <strong>much</strong> more than I'm showing here.</p>
<h3>YAML to other types</h3>
<p>For a <code>cool.yaml</code> file of the structure:</p>
<pre><code>pets:
  cat:
    - purrs
    - meows
</code></pre>
<p>The command <code>yq -o xml '.' cool.yaml</code> would output it with XML structure:</p>
<pre><code>&lt;pets&gt;
  &lt;cat&gt;purrs&lt;/cat&gt;
  &lt;cat&gt;meows&lt;/cat&gt;
&lt;/pets&gt;
</code></pre>
<p>Or you can run it like <code>yq -o json '.' cool.yaml</code> to get a JSON instead:</p>
<pre><code>{
  "pets": {
    "cat": ["purrs", "meows"]
  }
}
</code></pre>
<h3>Any Input, Any Output</h3>
<p>Say you have a <code>cool.csv</code> file of the structure:</p>
<pre><code>name,numberOfCats,likesApples,height
Gary,1,true,168.8
Samantha's Rabbit,2,false,-188.8
</code></pre>
<p>Convert it to YAML with <code>yq -o yaml -p csv '.' cool.csv</code>:</p>
<pre><code>- name: Gary
  numberOfCats: 1
  likesApples: true
  height: 168.8
- name: Samantha's Rabbit
  numberOfCats: 2
  likesApples: false
  height: -188.8
</code></pre>
<p>Again, use the <code>-o</code> flag to change the output format: <code>yq -o json -p csv '.' cool.csv</code>:</p>
<pre><code>[
  {
    "name": "Gary",
    "numberOfCats": 1,
    "likesApples": true,
    "height": 168.8
  },
  {
    "name": "Samantha's Rabbit",
    "numberOfCats": 2,
    "likesApples": false,
    "height": -188.8
  }
]
</code></pre>
<p>Notice the use of the <code>-p</code> flag to indicate the input format, since by default it will expect a YAML.</p>
Pair Programming 101https://devintheshell.com/blog/pair-programming-101/https://devintheshell.com/blog/pair-programming-101/Values, Roles, How's and Don'tsSat, 04 Feb 2023 10:10:33 GMT<p>A quick overview of things to keep in mind and best practices when pair programming.</p>
<h2>Values</h2>
<p>Not too far away from the ones described in <a href="https://www.amazon.com/Extreme-Programming-Explained-Embrace-Change/dp/0321278658">XP</a>, the key <a href="https://www.algolia.com/blog/engineering/pair-programming-roles-challenges-guiding-principles-and-tools/">values</a> to foster when working in pairs are as follows:</p>
<h3>Humility</h3>
<p>Appreciate <strong>feedback</strong>. Learn to give and receive it.
Don’t be afraid of being wrong, <strong>ask questions</strong> when needed (especially if you feel blocked).</p>
<h3>Trust</h3>
<p>Believe in yourself and (<strong>especially</strong>) in your partner.
Understand that everyone solves problems differently.
This is a good thing, <strong>trust</strong> that someone else's path will lead to a desirable outcome.</p>
<h3>Grit</h3>
<p>Muster the courage to stick together, <strong>challenge</strong> and <strong>motivate</strong> each other.</p>
<h3>Care</h3>
<p>Foster <strong>teamwork</strong>, care for your partner.
Be there for one another.</p>
<h2>Roles</h2>
<p>The two main roles in pair programming are the <strong>Driver</strong> (the one at the keyboard) and the <strong>Navigator</strong> (the one helping out).
Contrary to popular belief, these roles entail much more than "<em>I type you look</em>".</p>
<h3>Driver</h3>
<p>As a driver, you should do one thing and do it well: <strong>Focus</strong>.
Worry only about the smallest step possible, <em>"forget"</em> about the feature/task as a whole.</p>
<p><strong>Think small</strong>: naming, algorithms, implementation details, etc. should be your main practical concerns.</p>
<p>Two key and often overlooked aspects of being the driver:</p>
<ul>
<li>
<p><strong>Think out loud</strong>
It's hard for the pair to be on the same page if the navigator doesn't know what the driver is thinking.
Narrate your thought process.</p>
</li>
<li>
<p>Be <strong>open</strong> to criticism/improvement/discussion
Part of the point of working in pairs is to question one another and reach the best approach possible.
Don't take suggestions as an insult, your navigator is just trying to help!</p>
</li>
</ul>
<h3>Navigator</h3>
<p>A good navigator should practice <strong>strategic thinking</strong>, reaching compromises when required.
On top of that, think about the <strong>bigger task</strong> at hand, constraints, API, performance, etc.</p>
<p>The navigator should be asking <strong>good questions</strong> to ensure the development follows a desirable path.
In doing so, <strong>rabbit holes</strong> should be avoided.</p>
<p>This role is in an advantageous position to <strong>catch</strong> typos, errors, etc. and should <strong>beware of complexity</strong> (especially if avoidable).</p>
<p><strong>Forcing breaks</strong> when needed, keeping an eye out for <strong>meetings</strong> and urgent Slack messages are tasks generally best suited to this role.</p>
<p>If you are in this position, be sure to <strong>help out</strong> as much as possible and <strong>take notes</strong> for future reference.
You should have the relevant <strong>resources</strong> and <strong>documentation</strong> at hand, and should remember (or have a list of) the things that came up during the session that might need further investigation or analysis.</p>
<h2>Practice</h2>
<p>On a more <a href="https://martinfowler.com/articles/on-pair-programming.html">technical</a> note, here are some things to keep in mind as far as <em>"how to's"</em>.</p>
<h3>A pair of what?</h3>
<p>Are Juniors supposed to pair? What about Seniors?
There are three possible combinations here, each with its strengths and weaknesses:</p>
<h4>Novice–novice</h4>
<p>Significantly better results than two novices working independently.
Still, this makes it hard for novices to develop <strong>good habits</strong>, since they lack a proper role model.</p>
<h4>Expert–novice</h4>
<p>Possibly the best practice for <strong>mentoring</strong> and for getting fresh air into the system, as the novice is likely to question established practices.
This is great, but the expert needs to have the patience to handle the situation and the novice might feel too intimidated to have a productive session.</p>
<h4>Expert–expert</h4>
<p>Best for productivity and, given that both understand pair programming, very likely to produce great results.
Still, two experts are unlikely to question established practices, so novel ways to solve problems will likely not come up in these sessions.</p>
<h3>Alternative styles</h3>
<h4>Ping-Pong</h4>
<p>Best served with a side dish of TDD.
Simply put, switch roles on every TDD step:</p>
<ol>
<li>I write a test (You are the navigator)</li>
<li>You make it pass (You are the driver)</li>
<li>I refactor (You are the navigator)</li>
<li>You write a test (You are the driver)</li>
<li>...</li>
</ol>
<h4>Strong style</h4>
<p>The one with the idea (and/or knowledge) sticks to being the navigator.</p>
<blockquote>
<p>I've got an idea, please take the keyboard!</p>
</blockquote>
<p>Very useful to ensure knowledge transfer.</p>
<blockquote>
<p>For an idea to go from your head into the computer it MUST go through someone else's hands.</p>
</blockquote>
<p>Here, the senior should be the navigator and the junior the driver.
This can also be applied between two experts when handing over a task, as it ensures the objective is achieved.</p>
<h4>Mob-ish</h4>
<p>The line between the two roles gets blurred out.</p>
<p>With some tasks the role separation might not fit very well, but it might still be advantageous to tackle the issue as a pair.
Simulating a mob programming session, the pair as a whole takes care of the responsibilities of both roles.</p>
<h3>Time management</h3>
<p>It is extremely important to consider breaks as a natural part of your work as a dev, much more so when pairing.</p>
<p>This practice can be quite tiresome and is not sustainable if frequent breaks are not taken.
And proper breaks at that: no Slack, no emails, no checking tech-support.
Get up, go for a glass of water, step outside, etc.</p>
<p>There are multiple ways of ingraining breaks into your pairing routine:</p>
<h4>Methodical approach</h4>
<p>The classic <em>"Pomodoro"</em> style:</p>
<ol>
<li>Work for 25 min</li>
<li>Take a 5-min break</li>
<li>Work for 25 min</li>
<li>Take a 5-min break</li>
<li>Work for 25 min</li>
<li>Take a 20-min break</li>
</ol>
<p>The time windows should be adjusted as needed, but some consistency should be kept (you either take 5-min breaks or 10-min breaks, not 5 now, 10 later and 8 afterwards...).</p>
<h4>Organic approach</h4>
<p>If (and only if) the team is well-adjusted to pairing and there is a safe work environment, going by <em>"gut feeling"</em> might work quite nicely.
Take breaks as needed, have a <em>"feel"</em> for when a good moment to take a break arrives.</p>
<h4>Rotation dependent</h4>
<p>Another approach is to sync your breaks with your role rotations.
This can work really well in teams that are used to pairing and tend to swap roles very frequently.</p>
<p>Of course, not all tasks will allow this and not all teams are okay with it.
But especially in a remote environment, it is often quite nice to do so when possible.</p>
<h3>Rotations</h3>
<h4>Role Rotation</h4>
<p>Really helpful when people are new to pair programming, as they can quickly get a feel for how each role works.
This also keeps you on your toes, since it requires you to fully focus on your current role's responsibilities.</p>
<p>As to when to rotate, it can sync with your time management schema or go by user story.
Make a conscious decision about when and how to do this, avoid switching roles for no reason.</p>
<h4>Pair Rotation</h4>
<p>Useful to spread knowledge between people, and facilitate collective code ownership.
It requires that everyone on the team is willing (and capable) to work with each other.</p>
<p>It keeps things fresh, as different ideas and new points of view are likely to come up on a regular basis.</p>
<p>Usually it doesn't make much sense to rotate pairs multiple times a day.
Think of doing it each Sprint, every X days, or maybe with every new user story.</p>
<p>Different teams and workflows will allow for more or less frequency in this regard.</p>
<h2>Don'ts</h2>
<p>Each team does pair programming in its very own unique way.
This is a good thing, practices like this should be adapted to the team's needs and/or abilities.</p>
<p>There are however a few things that are better avoided.</p>
<p>Don't...</p>
<ul>
<li>Drift apart, zone out, lose focus, look at your phone...</li>
<li>Micromanage what your driver should do. This is <strong>sometimes</strong> welcome, especially if the driver is new and/or blocked.</li>
<li>Be impatient. Leave some room for your driver to figure out the error.</li>
<li>Stress out your partner.</li>
<li>Marry the keyboard. Sharing is caring.</li>
<li>Pair all the time. Your job is much more than coding, pairing does not apply when writing emails or researching topics.</li>
</ul>
Tame your dotfiles with Stowhttps://devintheshell.com/blog/stow/https://devintheshell.com/blog/stow/Stow your config files safely!Fri, 03 Feb 2023 17:21:42 GMT<p>As soon as you start putting some time into your *nix system configuration, you'll probably notice your config files getting out of hand.</p>
<p>Dotfiles all over the place, no good way to back them up, versioning is a pain, keeping your configs in sync in multiple OS or multiple computers becomes a chore.</p>
<p>You can use <code>stow</code> with <code>git</code> to tame them once and for all!</p>
<h2>TL;DR</h2>
<ol>
<li>Make a <code>~/dotfiles/</code> directory using this <a href="https://github.com/EricDriussi/dotfiles">repo</a> as a reference for the dir structure.</li>
<li><strong>Move</strong> your config files to the corresponding directory in <code>~/dotfiles/</code>.</li>
<li>Run <code>stow *</code> from <code>~/dotfiles/</code>.</li>
<li>Profit?</li>
</ol>
<h2>Context</h2>
<h3>Symlinks</h3>
<p>Symbolic links are a way to make the OS believe there is a file where there isn't one.
A symlink is a reference to a file that might live wherever we want.</p>
<p>You can create one from the command prompt with <code>ln -s &lt;path_to_file&gt; &lt;path_to_link&gt;</code>.</p>
<p>Naturally, the link will always show the same content the file has.</p>
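<p>A quick demonstration (using <code>/tmp</code> so nothing important is touched):</p>

```shell
# Create a file, link to it, and read it through the link.
echo "hi mom" > /tmp/original.txt
ln -sf /tmp/original.txt /tmp/link.txt
cat /tmp/link.txt

# Edits to the original show up through the link as well.
echo "hi dad" >> /tmp/original.txt
cat /tmp/link.txt
```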
<h3>Stow</h3>
<p>This GNU command line utility is probably already installed on your *nix system.
At a basic level, it's a symlink farm manager.
Give it a directory, and it will create a mirror directory with the same structure and links to the files in the original one.</p>
<blockquote>
<p>Stow is a symlink farm manager which takes distinct sets of software and/or data
located in separate directories on the file system, and makes them all appear to be
installed in a single directory tree.</p>
</blockquote>
<p>This can be used as a really basic package manager, but we can make better use of it.</p>
<blockquote>
<p>However, Stow is still used not only for software package management, but also for
other purposes, such as facilitating a more controlled approach to management of
configuration files in the user’s home directory, especially when coupled with version
control systems.</p>
</blockquote>
<h2>The Plan</h2>
<p>Say our favorite text editor expects a <code>vimrc</code> file in <code>.config/nvim/</code>.</p>
<p>If we run <code>stow</code> from a directory that mirrors the expected one, it will create the corresponding directory structure and symlink.</p>
<p>Let's see it in practice:</p>
<pre><code>~
├── dotfiles [YOU_ARE_HERE]
│   ├── nvim
│   │   ├── .config
│   │   │   ├── nvim
│   │   │   │   ├── vimrc
...
</code></pre>
<p>Running <code>stow nvim</code> from <code>~/dotfiles/</code> will create the following structure, directories and all (assuming no conflicting files or directories were already there):</p>
<pre><code>~
├── .config
│   ├── nvim
│   │   ├── vimrc
...
</code></pre>
<p>This <code>vimrc</code> is just a reference to the one under <code>~/dotfiles/nvim/.config/nvim/vimrc</code>.</p>
<h3>⚠️ Careful</h3>
<p>By default, <code>stow</code> will clone the dir structure <strong>from</strong> <code>cwd</code> (configurable through the <code>--dir</code> flag), <strong>to</strong> its parent directory (configurable through the <code>--target</code> flag).</p>
<p>This is why the examples assume that the dotfiles directory is in <code>~/dotfiles</code>.
I would advise you to follow this structure to avoid unpleasant surprises.</p>
<h2>The Execution</h2>
<p>This means we can have all our config files in one neat repository like <a href="https://github.com/EricDriussi/dotfiles">this</a> and deploy them with <code>stow *</code>.</p>
<p>The deploy command will only be needed once, since editing the files from the dotfiles directory will update the symlinks as well.</p>
<p>Whenever you update your config on one machine, just <code>git commit</code> and push the changes.
You'll be a <code>git pull</code> away from syncing them across other machines!</p>
<p>Some programs expect their configs in <code>~/</code>.
Handle them like <a href="https://github.com/EricDriussi/dotfiles/tree/master/base">this</a>.</p>
<p>Others expect them under <code>~/.config/</code>.
No problem: just as we explained, create a <a href="https://github.com/EricDriussi/dotfiles/tree/master/picom/.config">mirror image</a> and let stow take care of it.</p>
<p>This, apart from being <strong>incredibly</strong> easy to keep track of these files, allows us to easily use <code>git</code> to version (and share!) them.
Moreover, it makes it a lot easier to <a href="https://gitlab.com/ericdriussi/sys-init/-/blob/master/tasks/config-dotfiles.yml#L38">automate</a> the config process of your favorite OS.</p>
<h3>Migration</h3>
<p>More than likely, if you try to use <code>stow</code> on an already well-configured system, it won't be happy.</p>
<p>This is because, if it detects existing files where it wants to establish symlinks, it will let you know and abort.</p>
<p>There are flags available to tell <code>stow</code> to stomp over whatever it finds, but I would advise you <strong>not</strong> to follow this approach.</p>
<p>Take your time when setting this up. Go one by one, moving the file or directory over to your dotfiles directory.
Ensure the structure is correct, remove the old config and run <code>stow DIR_NAME</code> one by one.</p>
<p>You'll be happy you did!</p>
How to Awkhttps://devintheshell.com/blog/how-to-awk/https://devintheshell.com/blog/how-to-awk/Wardly make it workThu, 02 Feb 2023 16:13:59 GMT<p>Not only a command but a full-blown scripting language, awk is a powerful tool for text processing.</p>
<p>It's a great way to quickly search through text files, extract and format data, and even perform basic calculations.</p>
<p>Dive much deeper into awk <a href="https://github.com/adrianlarion/simple-awk">here</a> and <a href="https://blog.jpalardy.com/posts/awk-tutorial-part-1/">here</a>.</p>
<h4>Keep in mind</h4>
<p>Not all awk implementations are created equal. This post references the GNU implementation.</p>
<h2>The basics</h2>
<p>Awk operates on records and fields. By default, a record is a line (uses <code>\n</code> as a separator) and a field is a "word" (uses <code>\s</code> or 'space' as a separator).</p>
<p>It performs an <strong>action</strong> based on a <strong>pattern</strong>, as in <em>"if it matches this, do that"</em>.</p>
<p>Your basic awk command looks something like this:</p>
<pre><code>awk '/himom/ {print $0}' file
        |        |
     pattern   action
</code></pre>
<p>Patterns will always be delimited by <code>/</code> while actions will be within <code>{}</code>. Note also the use of single quotes.</p>
<p>This reads: <em>"On each record (line) that matches the pattern <code>himom</code>, run the action <code>print $0</code> (which prints the whole line)."</em></p>
<p>You can omit the pattern to perform the action on all lines, or omit the action to print each matching line.</p>
<p>So <code>awk '{print $0}' file</code> would print the whole file, while <code>awk '/himom/' file</code> would do the same as <code>awk '/himom/ {print $0}' file</code>.</p>
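<p>For instance (file name and contents made up for illustration):</p>

```shell
# Create a small sample file to play with
printf 'himom\nhello there\nhimom again\n' > greetings.txt

awk '/himom/' greetings.txt      # prints the two matching lines
awk '{print $1}' greetings.txt   # prints the first word of every line
```
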
<h3>Positions</h3>
<p>As you might imagine, changing <code>$0</code> for <code>$n</code> will print the nth field (word) instead of the whole record (line).</p>
<h3>Regex</h3>
<p>The pattern <code>'/himom/'</code> is a shorthand for <code>'$0 ~ /himom/'</code>.
This means that patterns can be applied on a per-column basis.</p>
<p>If you know your <a href="../how-to-regex">regex</a>, you might expect the previous pattern to match lines containing <strong>only</strong> the word <code>himom</code>.</p>
<p>This is not the case: the default behavior is to match <strong>anything containing</strong> the given pattern.</p>
<p>Also, while <code>^</code> and <code>$</code> usually designate the beginning and end of a line, when matching against a field they indicate the beginning and end of that <strong>word</strong> (field).</p>
<p>This means that for <code>awk '$1 ~ /01$/'</code>, the line <code>01 02 03</code> <strong>would match</strong>.</p>
<p>Unlike <a href="../how-to-sed">sed</a> or <a href="../how-to-grep">grep</a>, you don't need the <code>-E</code> to use extended regular expressions since this is the default behavior.</p>
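<p>You can see the field-anchoring in action with some throwaway data (sample file invented):</p>

```shell
printf '01 02 03\n101 02 03\n' > nums.txt

awk '$1 ~ /01$/' nums.txt    # both lines: field 1 *ends* in 01
awk '$1 ~ /^01$/' nums.txt   # only "01 02 03": field 1 is exactly 01
```
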
<h2>Variables</h2>
<h3>Ignorecase</h3>
<p>Awk is case-sensitive by default, but this can be switched off by setting this variable:</p>
<pre><code>awk -v IGNORECASE=1 '/fooBar/ {print $1}' file
</code></pre>
<p>We use the <code>-v</code> flag to set the <code>IGNORECASE</code> <strong>V</strong>ariable to <code>1</code> (true).</p>
<h3>Filename</h3>
<p>When processing multiple files in a script, it might be useful to also print the current file name:</p>
<pre><code>awk '{print FILENAME}' file.txt
</code></pre>
<h3>(Input) Record and Field separator (RS & FS)</h3>
<p>As mentioned before, the default RS is <code>\n</code> while the default FS is <code>\s</code>.
This can be configured to fit different file structures.</p>
<p>A CSV for example might not behave as expected:</p>
<pre><code>Tonia,Ellerey,[email protected],firefighter
Joleen,Viddah,[email protected],police officer
Cherilyn,Kat,[email protected],firefighter
Janenna,Natica,[email protected],worker
</code></pre>
<p>Something like <code>awk '{print $3}' file</code> will not really work, but <code>awk -v FS="," '{print $3}' file</code> will:</p>
<pre><code>[email protected][email protected][email protected][email protected]
</code></pre>
<p>Similarly, we could change the <code>RS</code> variable as well, although that is a less common use case.</p>
<h3>(Output) Record and Field separator (ORS & OFS)</h3>
<p>These are used to format the output of your awk command.</p>
<p>While for simple commands, something like <code>awk '{print $3" - "$4}' file</code> should do the trick, this can get tedious and unreadable fast with more complex ones.</p>
<p>For such cases, use the <code>OFS</code> variable:</p>
<pre><code>awk -v OFS=" - " '{print $3, $4}' file
</code></pre>
<p>Notice the <code>" - "</code> separator in both examples.</p>
<p>There is also <code>printf</code> support in awk, so you can get <a href="https://linux.die.net/man/3/printf">as fancy as you want</a>.</p>
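<p>Both styles side by side, on invented sample data (the CSV rows are illustrative):</p>

```shell
printf 'Tonia,firefighter\nJoleen,police officer\n' > jobs.csv

# OFS keeps the action readable
awk -v FS=',' -v OFS=' - ' '{print $1, $2}' jobs.csv
# Tonia - firefighter
# Joleen - police officer

# printf for column-aligned output
awk -v FS=',' '{printf "%-8s| %s\n", $1, $2}' jobs.csv
# Tonia   | firefighter
# Joleen  | police officer
```
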
<h3>Record and Field number (NR & NF)</h3>
<p>These hold the value of the current line (record) and word (field) numbers. You can print them with something like:</p>
<pre><code>awk '{print "Line num:", NR, "Num of fields:", NF, "Content:", $0}' file
</code></pre>
<p>Or use them to conditionally apply the action:</p>
<pre><code>awk 'NF<10 && NR>2 {print $2}' file
</code></pre>
<p><em>"Print the 2nd field of all records whose NR is greater than 2 (3rd line onwards) and whose NF is less than 10 (9 or fewer fields)"</em>.</p>
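<p>A quick illustration of the same idea with smaller thresholds (data and numbers made up):</p>

```shell
printf 'a b c\nd e\nf g h i\nj\n' > recs.txt

# Lines 3 onwards (NR>2) with fewer than 3 fields (NF<3): only line 4, "j"
awk 'NF<3 && NR>2 {print $1}' recs.txt
```
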
<h2>The not so basics</h2>
<h3>Logical operators</h3>
<p>As hinted above, we can use <code>&&</code> and <code>||</code> as in most other programming languages. Patterns can be mixed and matched using these logical operators.</p>
<pre><code>awk '/bilbo/ && /frodo/ {print "My Precious"}' file
awk '/bilbo/ || /frodo/ {print "Is it you mister Frodo?"}' file
</code></pre>
<p>Or you can negate the match, as in <em>"only perform the action on lines that DON'T match the pattern"</em>.</p>
<pre><code>awk '$0 !~ /frodo/ { print "Pohtatoes" }' file
</code></pre>
<h3>Ternary operations</h3>
<p>Since we can use logical operators, you might imagine that we can also take advantage of ternary operators.</p>
<pre><code>awk '/frodo/ ? /ring/ : /orcs/ { print $0" --> Either frodo with the ring, or the orcs" }' file
</code></pre>
<p>Which we can write in pseudocode as:</p>
<pre><code>if matches(frodo) AND matches(ring)
    print "Either frodo with the ring, or the orcs"
else if NOT matches(frodo) AND matches(orcs)
    print "Either frodo with the ring, or the orcs"
else
    don't print
</code></pre>
<p>So for a file:</p>
<pre><code>frodo
ring
orcs
frodo ring
frodo orcs
ring orcs
frodo ring orcs
</code></pre>
<p>The command above would output:</p>
<pre><code>orcs --> Either frodo with the ring, or the orcs
frodo ring --> Either frodo with the ring, or the orcs
ring orcs --> Either frodo with the ring, or the orcs
frodo ring orcs --> Either frodo with the ring, or the orcs
</code></pre>
<h3>Range</h3>
<p>If the file you are working with has some kind of internal sorting, you might want to operate based on that instead of the NR.</p>
<p>You can use multiple matches to create a range on which to perform the action.
So on a file like:</p>
<pre><code>first line
second line
third line
fourth line
fifth line
</code></pre>
<p>The command <code>awk '/second/ , /fourth/ {print $0}' file</code> outputs:</p>
<pre><code>second line
third line
fourth line
</code></pre>
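<p>A minimal reproduction (file name invented) confirms the range is inclusive on both ends:</p>

```shell
printf 'first line\nsecond line\nthird line\nfourth line\nfifth line\n' > lines.txt

# Prints everything from the first /second/ match to the next /fourth/ match
awk '/second/ , /fourth/ {print $0}' lines.txt
```
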
<h2>Scripting</h2>
<p>Here we only covered how to use awk as a one-liner from the command line, but awk is actually a fully featured scripting language.</p>
<p>The previous point regarding ternary operations skips over the fact that the action <em>per se</em> can include conditional logic.</p>
<p>This for example, is a valid awk script:</p>
<pre><code>#!/usr/bin/awk -f
/hi/ {
    if ($1 > $2) {
        print "mom!"
    }
    else print "there!"
}
</code></pre>
<p>In fact, if your awk commands are getting a bit out of hand, turning them into a script might make things a lot easier.</p>
How to grephttps://devintheshell.com/blog/how-to-grep/https://devintheshell.com/blog/how-to-grep/What you needFri, 22 Apr 2022 11:47:49 GMT<p>Grep helps you locate any given pattern(s) within one or more files.</p>
<p>Very useful when parsing logs!</p>
<h4>Keep in mind</h4>
<p>Not all grep implementations are created equal. This post references the GNU implementation.</p>
<h2>The basics</h2>
<p>Grep commands have the following structure:</p>
<p><code>grep [OPTIONS] 'this_string' that_file</code></p>
<p>This will output the full line(s) where <code>this_string</code> was found, highlighting the match itself.</p>
<h3>Context</h3>
<p>There is a <code>-n</code> flag you can use to get the line <strong>N</strong>umbers of the matches.</p>
<p>You might find it useful to have some more <strong>C</strong>ontext around your grep results.</p>
<p>Use something like <code>-C2</code> to tell grep to also print the two lines before and after each match.</p>
<p>Keep in mind that the number of context lines printed will be limited by other matches as well as by the beginning and end of the file, so you might not always get exactly the number of lines you asked for.</p>
<p>These two flags work well together, since grep will separate line numbers from the line itself using <code>:</code> for matching lines and <code>-</code> for context lines.</p>
<p>So given a <code>grepme</code> file like so:</p>
<pre><code>Lorem ipsum odor amet, consectetuer adipiscing elit.
3183_22_4 -> '3183_22'
Lorem ipsum odor amet, consectetuer adipiscing elit.
3183_22_5 -> '3183_22'
Lorem ipsum odor amet, consectetuer adipiscing elit.
Lorem ipsum odor amet, consectetuer adipiscing elit.
3283_23_1 -> '3183_23'
Lorem ipsum odor amet, consectetuer adipiscing elit.
3183_23_2 -> '3183_23'
</code></pre>
<p><code>grep -nC2 '3183_22' grepme</code> will output</p>
<pre><code>1-Lorem ipsum odor amet, consectetuer adipiscing elit.
2:3183_22_4 -> '3183_22'
3-Lorem ipsum odor amet, consectetuer adipiscing elit.
4:3183_22_5 -> '3183_22'
5-Lorem ipsum odor amet, consectetuer adipiscing elit.
6-Lorem ipsum odor amet, consectetuer adipiscing elit.
</code></pre>
<h3>Multiple files</h3>
<p>You can use <code>dir/*</code> instead of a file name to tell grep to look in all files in <code>dir/</code> (or simply <code>*</code> to look in all files under <code>cwd</code>).</p>
<p>If there are any directories here, it will print errors since it can't do much with them.</p>
<p>To <strong>S</strong>uppress these errors, use the <code>-s</code> flag.</p>
<h2>Quality of life</h2>
<h3>Count</h3>
<p>More often than not you'll need the <strong>number</strong> of matching lines rather than the lines themselves.</p>
<p>You might be tempted to pipe grep into <code>wc -l</code>, but there are better options.</p>
<p><code>grep 'hi there!' file | wc -l</code> and <code>grep -c 'hi there!' file</code> produce the same output: They both <strong>C</strong>ount the number of matching <strong>lines</strong>.</p>
<p>Or, use the pipe with the <code>-o</code> flag to get the number of <strong>O</strong>ccurrences (which will differ from <code>-c</code> if there is more than one match per line).</p>
<p>So following the previous example:</p>
<p><code>grep -c '3183_22' grepme</code> ➡️ <code>2</code></p>
<p><code>grep -o '3183_22' grepme | wc -l</code> ➡️ <code>4</code></p>
<p><code>-o</code> on its own will simply print the matches themselves, which doesn't make much sense right now, but will once you add <a href="#regex">regular expressions</a> to the mix.</p>
<h3>The classics</h3>
<p>There are some combinations that are used so often you might as well create an alias for them.</p>
<pre><code>grep -rinv 'foo' .
grep -rl 'bar' .
</code></pre>
<p>The first command will output all lines plus line <strong>N</strong>umbers (<code>-n</code>) <strong>NOT</strong> matching <code>foo</code> (<code>-v</code>). It will look for the match recursively (<code>-r</code>) with case <strong>I</strong>nsensitivity (<code>-i</code>).</p>
<p>The second one will output all files containing a match (<code>-l</code>, <code>-L</code> would output only files <strong>NOT</strong> containing a match) for <code>bar</code>, recursively (<code>-r</code>).</p>
<h2>The not so basics</h2>
<h3>Multiple searches</h3>
<p>Just like <a href="../how-to-sed">sed</a>, you can use <code>-e</code> to concatenate multiple searches in the same grep command.</p>
<p>Using sed, this flag runs all the commands on each line. Similarly, here it will print out all lines that match <strong>any</strong> of the expressions.</p>
<p>This might be surprising, since when piping grep commands into each other, the result will be the exact opposite: you will get only lines that match <strong>all</strong> the expressions.</p>
<p>So again, using the example file from <a href="#context">before</a>:</p>
<p><code>grep -e '3183' -e '22' grepme | wc -l</code> ➡️ <code>4</code></p>
<p><code>grep '3183' grepme | grep '22' | wc -l</code> ➡️ <code>2</code></p>
<p>Here I use <code>| wc -l</code> instead of <code>-c</code> for clarity/symmetry.</p>
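<p>The difference is easy to verify with a couple of throwaway lines (sample data invented):</p>

```shell
printf 'foo\nbar\nfoo bar\nbaz\n' > demo.txt

grep -e 'foo' -e 'bar' demo.txt | wc -l   # 3: lines matching either pattern
grep 'foo' demo.txt | grep 'bar' | wc -l  # 1: lines matching both patterns
```
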
<h3>Regex</h3>
<p>Again, just like <a href="../how-to-sed">sed</a> and <a href="../how-to-find">find</a>, grep uses basic (reduced) <a href="../how-to-regex">regex</a> by default, and the <code>-E</code> flag allows you to use its full, extended regex engine.</p>
<p>If instead you want to avoid regex altogether and look for a literal string with strange characters, use <code>-F</code>.</p>
<p><code>grep -F '[Hh]ello moto*' file</code> will literally match "[Hh]ello moto*". Not <em>"Hello moto"</em>, not <em>"hello moto"</em>, and not <em>"[Hh]ello moto, something else"</em>.</p>
<h3>Exclude and include</h3>
<p>You can exclude and include files from the search by a given pattern.</p>
<p>Even better, you can use both flags together to fine tune where you are searching exactly.</p>
<p><code>grep -s --exclude=*.py --include=main.py 'something' *</code></p>
<p>Will exclude all Python files from the search, except for <code>main.py</code>.</p>
<h3>Grep based on a file</h3>
<p>Say you have a list of blacklisted words you want to ensure are not present in a project.</p>
<p><code>grep -f blacklist.words projectFile</code> will print out all matches for any of the lines in <code>blacklist.words</code>, while also passing it the <code>-l</code> flag from <a href="#the-classics">before</a> will print only the problematic filenames.</p>
<p>For this to work, <code>blacklist.words</code> has to contain one expression (or word) per line.</p>
<p>Another neat use case:
<code>ls | grep -f blacklist.files</code></p>
<p>This will output all filenames in <code>cwd</code> listed in <code>blacklist.files</code>.</p>
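<p>A self-contained sketch of the blacklist idea (all file names and contents invented):</p>

```shell
printf 'TODO\nFIXME\n' > blacklist.words
printf 'clean implementation\n' > good.txt
printf 'FIXME: temporary hack\n' > bad.txt

grep -f blacklist.words good.txt bad.txt   # the matching line, prefixed with its file name
grep -lf blacklist.words good.txt bad.txt  # just the problematic file: bad.txt
```
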
How to better isolate your testshttps://devintheshell.com/blog/testing-isolation/https://devintheshell.com/blog/testing-isolation/Isolate and conquerSat, 26 Feb 2022 17:14:45 GMT<p><!-- markdownlint-disable MD024 --></p>
<h2>Introduction</h2>
<p>These are tools used to imitate or substitute parts of the production code in our testing environment. Usually Services, Repositories, event buses, etc., although you can apply the same principles in much simpler contexts (like katas).
They are useful to ensure we are testing the different parts of the system <strong>in isolation</strong>.</p>
<p>In practical terms, we can say that when the system under test (<strong>SUT</strong>) depends on a separate piece of our application (separate as in <strong>should be tested separately</strong>), the second one <a href="https://leanpub.com/tdd-en-castellano">should be substituted</a>.
Specifically the parts we are <strong>not</strong> interested in testing but are required for our <strong>SUT</strong> to function.</p>
<p>As was the case for the tools we saw in the previous <a href="../testing-convenience">post</a>, these <strong>Doubles</strong> might take a minute to set up, but skipping them makes our tests significantly more fragile and less reliable: when they break, we won't have a clear picture of who exactly is at fault.</p>
<p>In the end, <strong>Doubles</strong> are just a false implementation of production code.
Say you want to test <code>UserService</code>, but it depends on and requires a <code>UserRepository</code> with a <code>search()</code> function. You would make an <code>InMemoryUserRepository</code> that implements <code>UserRepository</code> (with its <code>search()</code> function) to test the Service independently of the Repository.</p>
<p>That <code>InMemoryUserRepository</code> is a sort of <strong>Double</strong>.
We'll use this common example going forward.</p>
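<p>As a sketch of that running example (the interfaces and field names here are assumptions; the linked repo may differ):</p>

```typescript
// Hypothetical production contracts for the running example
interface User {
  id: string;
  name: string;
}

interface UserRepository {
  search(id: string): User | null;
}

// A Double: real search logic, but over a plain in-memory array
class InMemoryUserRepository implements UserRepository {
  constructor(private readonly users: User[]) {}

  search(id: string): User | null {
    return this.users.find((user) => user.id === id) ?? null;
  }
}
```

<p>A <code>UserService</code> test would receive this class wherever production code receives the real <code>UserRepository</code>.</p>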
<hr />
<h3>Note on definitions</h3>
<p>The terms <em>Mock</em>, <em>Spy</em> and <em>Double</em> are often used in different ways depending on the <a href="https://blog.cleancoder.com/uncle-bob/2014/05/14/TheLittleMocker.html">source material</a>.</p>
<p>When naming things in your code base please make sure there is a consensus within the team regarding what each word refers to.
When looking for help online or debating with a co-worker, ensure you understand what they mean by these concepts.
You might be using the same words but talking about different things.</p>
<p>Some consider <em>Stubs</em> to be very different from <em>Fakes</em>, some don't include <em>Dummy</em> as a testing Double, and some don't differentiate between <em>Mocks</em> and <em>Spies</em>.
Some just consider everything a type of <em>Mock Object</em>.</p>
<p>So here's more fuel to the fire.</p>
<p><img src="./fuel-fire.webp" alt="fuel" /></p>
<p>Good luck 🙃</p>
<hr />
<h2>Dummy</h2>
<p>False implementation of production code with no real behavior.
It's literally there to make your code compile.</p>
<h3>Use Case</h3>
<p>You would use a Dummy to substitute a dependency of your <strong>SUT</strong> when it's only needed at compile time but doesn't really do anything in your testing scenario.</p>
<p>For <a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/b9870a8c405348c10cfba2293a3e0127c0bde746/tests/doubles/DummyInMemoryUserRepository.ts">Example</a>, if our <code>InMemoryUserRepository</code> were a Dummy, it would implement the production <code>UserRepository</code> and have a <code>search()</code> function that does nothing (or the absolute minimum to compile).</p>
<h2>Fake</h2>
<p>False implementation of production code with <strong>very basic</strong>, test specific behavior.
It would receive some starting data to simulate operations.</p>
<h3>Use Case</h3>
<p>A Fake is useful whenever you need some very simplistic behavior.</p>
<p>For <a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/ab15e41b7f982ff97ff17c2e849fb9498f62310f/tests/doubles/FakeInMemoryUserRepository.ts">example</a>, if our <code>InMemoryUserRepository</code> were a Fake, it would implement the production <code>UserRepository</code> and have a <code>search()</code> function that actually implements production logic, but searches in an Array that it got via constructor (starting data).</p>
<h2>Stub</h2>
<p>False implementation of production code with basic, use case specific and <strong>re-usable</strong> behavior.
It would use some hard coded data to simulate operations.</p>
<h3>Use Case</h3>
<p>Use a Stub when you need some basic <strong>test independent</strong> behavior.
Basically as soon as you use the same Fake twice with the same starting data, build a Stub with that starting data.</p>
<p>Following our <a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/b7452cbc706d10ed8969f542df8a598acda31735/tests/doubles/StubInMemoryAdminUserRepository.ts">example</a>, instead of our previous <code>InMemoryUserRepository</code>, you would build an <code>InMemoryAdminUserRepository</code> that implements the production <code>UserRepository</code> and has a <code>search()</code> function with production logic, but searches in a predefined, hard coded Array made up of a bunch of random <em>Admin Users</em>.
This hard coded Array substitutes the starting data from the Fake example.</p>
<p>This way you could use the same Stub in multiple tests without rewriting the Array and ensuring they all work with the same data.</p>
<h2>Spy</h2>
<p>Piece of code (or external library) that allows the tester to check if and how a specific interaction with the spied code has taken place.
It can tell the tester how many times its methods were called, what parameters were passed to each of them, in what order they were called, etc.</p>
<p>This is the first concept that <em>only</em> takes into account <strong>behavior</strong>, disregarding output.</p>
<p>It is also the first tool that allows us to see the inside workings of the systems we are testing.
Useful, but prone to coupling.
Use them sparingly!</p>
<h3>Use Case</h3>
<p>One would use a Spy to further detach the <strong>SUT</strong> from its dependency and/or to only make assertions regarding the interaction between the two, not really caring about the final output.</p>
<p>Common scenarios are the assertions <em>'if the function was called at least once'</em> or <em>'if the function was called with <strong>X</strong> argument'</em>.
You'll tend to find this behavior implemented within other Doubles, since it is often insufficient by itself.</p>
<p>So now, our <code>InMemoryUserRepository</code> would implement <code>UserRepository</code> and a <code>search()</code> function that literally does whatever, as long as it notifies the tester that it was called.</p>
<p>In our <a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/5d5a50f2f92e55f28d62036f2e4a8b1e6938cfbe/tests/doubles/SpyInMemoryUserRepository.ts">very simple example</a>, the tester could ask the <code>InMemoryUserRepository</code> for the state of <code>searchHasBeenCalled</code> and use that to assert the expected behavior, no matter what the <code>search()</code> function actually does.</p>
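<p>A minimal sketch of such a Spy, reusing the field name from the example above (interfaces are assumed, as before):</p>

```typescript
interface User {
  id: string;
  name: string;
}

interface UserRepository {
  search(id: string): User | null;
}

// Spy Double: records the interaction; the actual output is irrelevant
class SpyInMemoryUserRepository implements UserRepository {
  searchHasBeenCalled = false;
  lastSearchedId: string | null = null;

  search(id: string): User | null {
    this.searchHasBeenCalled = true; // remember that we were called...
    this.lastSearchedId = id;        // ...and with what argument
    return null;                     // the return value doesn't matter here
  }
}
```
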
<p>You can hopefully see that this couples our Service test to <strong>how</strong> our service functions (we test if it calls a given function), rather than to <strong>what</strong> it actually achieves (simply testing the output).</p>
<h2>Mock</h2>
<p>Keep in mind that, as noted before, some literature refers to everything we've seen here as <em>'mock objects'</em>.
That being said, the sources I've found that consider them their own specific thing all agree on Mocks being the most sophisticated of the lot.</p>
<p>Speaking of sources, the actual <a href="https://www2.ccs.neu.edu/research/demeter/related-work/extreme-programming/MockObjectsFinal.PDF">source material</a> states:</p>
<blockquote>
<p>[Mock Objects] replace domain code with dummy implementations that both emulate real functionality and enforce assertions about the behaviour of our code.</p>
</blockquote>
<p>Similarly, <a href="https://martinfowler.com/articles/mocksArentStubs.html">Martin Fowler</a> describes them as:</p>
<blockquote>
<p>Objects pre-programmed with expectations which form a specification of the calls they are expected to receive.</p>
</blockquote>
<p>One <a href="https://blog.cleancoder.com/uncle-bob/2014/05/14/TheLittleMocker.html">major benefit</a> of using them is:</p>
<blockquote>
<p>It makes it a lot easier to write a mocking tool.</p>
</blockquote>
<p>Thus, they are often provided by an external library.</p>
<p>Mock objects give the tester full control over the behavior of the code being mocked, which can even be manipulated dynamically.
They usually offer all the benefits of the previous tools as well.</p>
<p>As you might imagine, this can be as complex to implement as one's heart desires (hence the external library), but they are very useful and easy to work with.</p>
<p>A super simple Mock might look a lot like a Spy that, instead of exposing whether a given function was called, has some sort of <code>assert</code> function.
Mocks <em>"know what they are testing"</em>, they make assertions on their own.</p>
<p>That being said, things can (and usually do) get more complicated than that.</p>
<h3>Use Case</h3>
<p>Say you want to simulate some specific complex behavior of our <code>UserRepository</code> to see how the <code>UserService</code> responds.
Given a complex enough behavior, you might have to duplicate (and <strong>maintain</strong>) quite a bit of code or give up completely and test both elements together.</p>
<p>This might put you between a rock and a hard place, having to choose between <strong>flaky tests</strong> or giving up <strong>test isolation</strong>.</p>
<p>You can use a Mock to abstract that complexity away altogether.
In fact, if using an external library, you wouldn't even be implementing a substitute for our Repository (like the <code>InMemoryUserRepository</code> from before), since those usually provide a way to create mocks <strong>on the fly</strong> based on the interface it should implement.</p>
<h3>Example</h3>
<p>Suppose that, when our <code>UserService</code> calls the <code>UserRepository</code> implementation, the Repository needs to go fetch some data from the database, wait for an email to be sent, check for authentication with a third party service and call your mom to say you love her.
Then, based on the results, the Repository returns either an empty array, an array with 4 elements or <code>null</code> (which as it turns out is the behavior you need for your use case).</p>
<p>You could "<em>re-implement</em>" all that code/behavior, or you could mock the whole thing.
With <a href="https://site.mockito.org/">Mockito</a> for example you would annotate the Repository with <code>@Mock</code> and use it like this:</p>
<pre><code>when(mockRepo.doTheThing()).thenReturn(null)
// the rest of your test...
</code></pre>
<p>A lot simpler than the alternative! Although you are adding a dependency to your tests.
Pick your poison!</p>
Abstractions for convenient testinghttps://devintheshell.com/blog/testing-convenience/https://devintheshell.com/blog/testing-convenience/Make your life easier!Sat, 26 Feb 2022 16:43:01 GMT<p><!-- markdownlint-disable MD024 --></p>
<p>Broadly speaking, these tools are used to make things easier when testing by setting up objects/data in a reproducible and reliable manner with minimal effort.
They might take a minute to set up, but as soon as the data is needed in more than two tests you'll be thankful you took the time.</p>
<h2>Fixture</h2>
<p>A <a href="https://en.wikipedia.org/wiki/Test_fixture">test fixture</a> is an environment, a state or a dataset we use to consistently test a piece of software.</p>
<p>This example used by <a href="https://martinfowler.com/bliki/ObjectMother.html">Martin Fowler</a> seems appropriate:</p>
<blockquote>
<p>When you write tests in a reasonably sized system, you find you have to create a lot of example data.
If I want to test a sick pay calculation on an employee, I need an employee.
But this isn't just a simple object - I'll need the employee's marital status, number of dependents, some employment and payroll history.
Potentially this can be a lot of objects to create.</p>
</blockquote>
<p>All of that data is what we call a test Fixture, no matter where it is or what shape it has as long as it's valid.
It could be something as simple as a <code>json</code> file with all the data we need.</p>
<hr />
<p>Often enough, the term <strong>Fixture</strong> is used to also refer to the utilities we build to provide that data.
<a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/a7e78cc623c9f79f9ade5ba4894c82f30bd8c434/tests/helpers/UserFixture.ts">Example</a>.</p>
<p>For example, say you need the data from the example above to be persisted in your testing database to see whether a given function can fetch it correctly.
You might create an <code>EmployeeFixture</code> with a <code>save()</code> function that receives an <code>Employee</code> and persists it.</p>
<p>Here the naming gets kinda muddy: Although we usually call these types of helpers <strong>Fixtures</strong>, what they actually do is <strong>provide the Fixture</strong> itself, they <strong>set up</strong> the testing environment.</p>
<hr />
<h2>Builder</h2>
<p>A <a href="https://refactoring.guru/design-patterns/builder">creational design pattern</a> that lets you easily construct complex objects step by step as needed.
The pattern allows you to produce different types and representations of an object using the same construction code.</p>
<p>They still end up producing an (in-memory) Fixture.
It's just a more useful and flexible way of getting it.</p>
<h3>Example</h3>
<p>Say you want to test how your application behaves when saving a user to the database if it has a faulty email address.
You could go <code>new User(name, age, id, email, maritalStatus, ...)</code>, but you really only care about the email for this test case.
Plus, imagine <code>name</code>, <code>age</code> and <code>maritalStatus</code> all go through validations, so you can't just put whatever in those fields.
It would be nice if you could use a sort of "<em>default valid User</em>" and just set a faulty email to it.</p>
<p>Something like <code>User myUser = new UserBuilder().withEmail('doesntWork').build()</code> with the Builder setting the rest of the properties to some irrelevant (but valid) default for you.
<a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/f485272430862f0a849676f10511a444d91e674e/tests/helpers/UserBuilder.ts">Example</a>.</p>
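<p>A bare-bones version of such a Builder (the defaults and field names here are invented; the linked example may differ):</p>

```typescript
interface User {
  name: string;
  age: number;
  email: string;
}

// Test Builder: starts from a valid default User and lets each test
// override only the field it actually cares about
class UserBuilder {
  private user: User = { name: "Valid Name", age: 30, email: "valid@mail.test" };

  withEmail(email: string): this {
    this.user.email = email;
    return this; // chainable
  }

  build(): User {
    return { ...this.user }; // copy, so each build is independent
  }
}

// const faultyUser = new UserBuilder().withEmail("doesntWork").build();
```
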
<h3>Details</h3>
<p>You'll often want to test behavior affecting semi complex entities.</p>
<p>You can use the Builder pattern to your advantage by having it set some sane defaults to, in our example, the User while also allowing you to customize the Entity at will.</p>
<p>This also gives you a centralized standard <em>'User maker'</em> for your tests.
So as long as this Builder accurately reflects the behavior of the production entity, you can be sure that your tests are relevant.</p>
<p>Plus, if something changes about your User (for example, the age now defaults to 18 if not set) you only need to apply the change in the Builder instead of parsing all the tests that use the User entity.</p>
<h2>Object Mother</h2>
<p>An <a href="http://wiki.c2.com/?ObjectMother">Object Mother</a> is a sort of fancy factory pattern, delivering prefabricated test-ready objects via a simple method call.</p>
<p>Again <a href="https://martinfowler.com/bliki/ObjectMother.html">Mr. Fowler</a>:</p>
<blockquote>
<p>[...] it makes sense to have a factory object that can return standard Entities.
Maybe 'John', an employee who just got hired last week; 'Heather', an employee who's been around for a decade.
Object Mother is just a catchy name for such a factory</p>
</blockquote>
<h3>Example</h3>
<p>You might find yourself using our <code>UserBuilder</code> in a few different tests just to end up creating the same type of User.
What <em>'type of User'</em> means depends on context but think of your typical Admin User, Guest User, New User, etc.</p>
<p>You can remove this duplication by abstracting the User creation into an Object Mother and just write <code>testAdminUser = UserMother.withAdminRole()</code> or <code>testAdminUser = UserMother.admin()</code> and call it a day!
<a href="https://github.com/EricDriussi/testing-toolbox-ts/blob/65eb68775573142c9bff2c5bb68dc721166d7f83/tests/helpers/UserMother.ts">Example</a>.</p>
<h3>Details</h3>
<p>Object Mothers differ from Builders in that Builders usually create dummy versions of domain Entities with no specific scenario in mind while Object Mothers are meant to Build more specific and complex instantiations of your domain Entities with the necessary data.</p>
<p>As soon as you find yourself creating the same <em>kind</em> of user in two different tests, go for an Object Mother.</p>
<p>You'll use it to reduce code duplication, increase test maintainability and encourage other developers to write more tests by making test objects super-easily accessible.</p>
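<p>A trivial sketch of such an Object Mother (the <em>kinds</em> of User and their fields are invented for illustration):</p>

```typescript
interface User {
  name: string;
  role: "admin" | "guest";
}

// Object Mother: named factory methods for the recurring kinds of User
class UserMother {
  static admin(): User {
    return { name: "Ada Admin", role: "admin" };
  }

  static guest(): User {
    return { name: "Gus Guest", role: "guest" };
  }
}

// const testAdminUser = UserMother.admin();
```

<p>In a real code base these methods would typically delegate to the Builder, keeping the defaults in one place.</p>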
DDD Strategieshttps://devintheshell.com/blog/ddd-strategy/https://devintheshell.com/blog/ddd-strategy/The Domain strikes backSun, 13 Feb 2022 18:23:40 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>We went over the Tactical concepts of DDD <a href="../ddd-tactics">here</a>. In this post, we'll cover the Strategic side.</p>
<p>When linking together the two parts for a more comprehensive picture, pay special attention to the concepts of <strong>Ubiquitous Language</strong> and <strong>Bounded Contexts</strong>, since these are the bits that keep the whole thing together.</p>
<h2>Ubiquitous language</h2>
<p>The idea is to use the same language everywhere possible, and let that language be dictated by the Domain and/or the Domain experts.</p>
<p>In an ideal world, we wouldn't have to map developer-speak to business-speak: we would all be using the same terms to describe the same things (God knows that almost never happens).</p>
<blockquote>
<p>Let the code reflect the business language.</p>
</blockquote>
<p>One of the advantages of following this approach is bringing together Domain experts, the technical team, and other stakeholders involved in the project, with as little ambiguity as possible.</p>
<p>This is often not easy to do: in order to develop this Ubiquitous Language you need to understand the business and the Domain.</p>
<p>Developers also need to accept that they will often not be the ones in charge of naming things. Which frankly, is a very good thing IMO.</p>
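<p>As a tiny, made-up illustration of letting the business dictate the names: instead of generic developer-speak, the code reads like a sentence a Domain expert would say.</p>

```typescript
// Developer-speak would be something like:
//   updateRecord(id, { status: 2 })
// Ubiquitous Language uses the Domain's own verb instead:
class Policy {
  private renewed = false;

  // "Renew the policy" — the exact words the business uses.
  renew(): void {
    this.renewed = true;
  }

  get isRenewed(): boolean {
    return this.renewed;
  }
}

const policy = new Policy();
policy.renew();
```

<p><code>Policy</code> and <code>renew</code> are invented here, but the pattern is the point: a stakeholder can read the call site out loud and recognize their own vocabulary.</p>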
<h2>Bounded Context</h2>
<blockquote>
<p>A Bounded Context is a linguistic and/or conceptual delimitation.</p>
</blockquote>
<p>The same concepts might have different implications in different contexts.</p>
<p>These contexts more or less reflect the business structure of the enterprise, or the problem domain.</p>
<blockquote>
<p>Bounded contexts define isolated parts of the model with some degree of independence.</p>
</blockquote>
<p>The isolation can be achieved by decoupling logic, segregating code and databases, and even through team organization.</p>
<p>The degree to which we isolate Bounded Contexts depends on the needs and realities of the business, and will vary from case to case.</p>
<p>You don't need tight, completely independent, future-proof, Bounded Contexts.</p>
<p>But you do need enough flexibility in your system to <strong>easily promote</strong> Modules to Bounded Contexts when needed.</p>
<h3>Modules</h3>
<p>Bounded Contexts are made up of Modules, which you can think of as mini-Bounded Contexts: Smaller semantic units that make sense within a greater common Context.</p>
<p>It's usually a good idea to have only one <a href="../ddd-tactics#aggregates">Aggregate</a> per Module. The need for more than one might indicate a need for a new Module or for the Module to get promoted to Bounded Context.</p>
<p>So to be specific, you could manifest this in a structure like:</p>
<pre><code>src
├── BoundedContext
│ ├── Module
│ │ ├── Application
│ │ │ ├── ApplicationService (Actions, Handlers, Commands, etc.)
│ │ │ ├── Repositories [Might also be under Domain]
│ │ │ ├── ...
│ │ ├── Domain
│ │ │ ├── AggregateRoot
│ │ │ ├── Entity/VO
│ │ │ ├── Domain Service
│ │ │ ├── Repositories [Might also be under Application]
│ │ │ ├── ...
│ │ ├── Infrastructure
│ │ │ ├── MySqlRepository
│ │ │ ├── ...
│ │ ├── ...
│ ├── AnotherModule
│ ├── ...
├── AnotherBoundedContext
├── ...
...
</code></pre>
<h3>Apps</h3>
<p>These should be the entry points to our Bounded Contexts.</p>
<p>They are usually called by API controllers, CLI interfaces, etc. and orchestrate use cases.</p>
<p>They lie outside our Bounded Contexts and call (directly or not) the <a href="../ddd-tactics#services">Application Services</a> to initiate use case execution.</p>
<p>There might be various Applications per Bounded Context, and their directory structure usually reflects this relationship.</p>
<p>So adding this to the previous example:</p>
<pre><code>apps
├── BoundedContext
│ ├── UseCaseOneApp
│ ├── UseCaseTwoApp
│ ├── ...
├── AnotherBoundedContext
│ ├── AnotherUseCaseOneApp
│ ├── AnotherUseCaseTwoApp
│ ├── ...
src
├── BoundedContext
│ ├── Module
│ │ ├── Application
│ │ │ ├── ApplicationService (Actions, Handlers, Commands, etc.)
│ │ │ ├── Repositories [Either under Application or Domain]
│ │ │ ├── ...
│ │ ├── Domain
│ │ │ ├── AggregateRoot
│ │ │ ├── Entity/VO
│ │ │ ├── Domain Service
│ │ │ ├── Repositories [Either under Application or Domain]
│ │ │ ├── ...
│ │ ├── Infrastructure
│ │ │ ├── MySqlRepository
│ │ │ ├── ...
│ │ ├── ...
│ ├── AnotherModule
│ ├── ...
├── AnotherBoundedContext
├── ...
...
</code></pre>
<p>Something like that anyway, these structures are only here to better illustrate the relations between each piece.</p>
<h2>Context Maps</h2>
<p>A visual representation of a system's Bounded Contexts and how they relate to each other.</p>
<p>It helps you understand the project as a whole (high-level design) and shows the communication patterns between contexts.</p>
<p>One of the main benefits of DDD is that it allows multiple teams to simultaneously work on different parts of the same system.</p>
<p>These <em>'parts'</em> usually, though not always, come down to our Bounded Contexts, and as such, building a context map will also show <strong>organizational issues, bottlenecks and team dependencies</strong>.</p>
<p>These are some ways Bounded Contexts might relate to one another:</p>
<h3>Client - Server (Customer - Supplier)</h3>
<p>As you might expect, one Bounded Context is upstream while another one is (or multiple ones are) downstream.</p>
<p>This makes them somewhat independent, but ultimately one of them will dictate the integration contract.</p>
<h3>Anti-corruption layer</h3>
<p>Another upstream/downstream relationship, where the downstream Bounded Context implements a layer responsible for translating upstream objects/structures into its own.</p>
<p>Mostly used to separate the old, legacy part of the system from the new, greenfield one. It allows you to treat the legacy part of the codebase as a <em>'black box'</em>.</p>
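<p>A minimal sketch of what such a translation layer might look like; the legacy DTO shape and field names are invented for illustration:</p>

```typescript
// Shape exposed by a hypothetical legacy/upstream system.
interface LegacyCustomerDto {
  CUST_NM: string;
  CUST_TEL: string;
}

// Our downstream Domain model, free of legacy quirks.
class Customer {
  constructor(
    public readonly name: string,
    public readonly phone: string,
  ) {}
}

// The Anti-corruption Layer: the only place that knows the legacy format.
class CustomerTranslator {
  static fromLegacy(dto: LegacyCustomerDto): Customer {
    return new Customer(dto.CUST_NM.trim(), dto.CUST_TEL.replace(/\s/g, ""));
  }
}

const customer = CustomerTranslator.fromLegacy({
  CUST_NM: " Jane Doe ",
  CUST_TEL: "555 0100",
});
```

<p>Everything downstream of the translator only ever sees clean <code>Customer</code> objects, so the legacy system really can stay a black box.</p>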
<h3>Shared Kernel</h3>
<p>A more thoughtful version of your typical Utils directory.</p>
<p>Here, a common contract is defined and referenced by multiple bounded contexts.</p>
<p>The key to implementing a shared kernel correctly is to keep its scope as small and limited as possible.</p>
<p>Another less thoughtful but common approach is to have a shared kernel that holds only <strong>dumb</strong> components that are needed in multiple Contexts (or Modules), and <strong>only when</strong> they are needed.</p>
<p>So a classic example could be the <a href="../ddd-tactics#entities--value-objects">Value Object</a> for user IDs: Their structure will depend on our Domain, and they will most definitely be used all over the place, while not holding significant logic apart from basic validation. They are also very unlikely to change, so they make for a somewhat safe common dependency.</p>
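<p>A sketch of what that Shared Kernel Value Object might look like; the UUID-ish validation rule here is just an assumption for illustration, your Domain will dictate the real format:</p>

```typescript
// A dumb, stable user ID Value Object: the kind of component
// that can safely live in a Shared Kernel.
class UserId {
  private constructor(public readonly value: string) {}

  // Basic validation is the only logic it holds.
  static of(value: string): UserId {
    if (!/^[0-9a-f-]{36}$/.test(value)) {
      throw new Error(`Invalid user ID: ${value}`);
    }
    return new UserId(value);
  }

  equals(other: UserId): boolean {
    return this.value === other.value;
  }
}

const aliceId = UserId.of("123e4567-e89b-12d3-a456-426614174000");
```

<p>Because it validates itself and never changes shape, every Context can depend on it without the usual Utils-directory rot.</p>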
<h3>Also read</h3>
<ul>
<li><a href="https://thedomaindrivendesign.io/what-is-tactical-design/">https://thedomaindrivendesign.io/what-is-tactical-design/</a></li>
<li><a href="https://thedomaindrivendesign.io/what-is-strategic-design/">https://thedomaindrivendesign.io/what-is-strategic-design/</a></li>
<li><a href="http://gorodinski.com/blog/2013/03/11/the-two-sides-of-domain-driven-design/">http://gorodinski.com/blog/2013/03/11/the-two-sides-of-domain-driven-design/</a></li>
<li><a href="https://herbertograca.com/2017/09/07/domain-driven-design/">https://herbertograca.com/2017/09/07/domain-driven-design/</a></li>
</ul>
DDD Tacticshttps://devintheshell.com/blog/ddd-tactics/https://devintheshell.com/blog/ddd-tactics/A new DomainSun, 13 Feb 2022 17:08:24 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>Introduced in Eric Evans' 2003 <a href="https://www.amazon.com/dp/0321125215">book</a>, this approach to software design aims to couple the design of a system to the business domain it operates in.</p>
<p>This is to say, a system design should reflect the business logic for which it was created.</p>
<p>He broadly separates Tactical from Strategic design. In this post, we'll go over some concepts from the former.</p>
<h2>Layers</h2>
<p>Broadly speaking, four main layers are considered in this architecture:</p>
<ul>
<li><strong>User Interface</strong>: More or less equivalent to <strong>Boundaries</strong> in EBI.</li>
<li><strong>Application</strong>: Partly in charge of the role of the <strong>Interactor</strong>, specifically related to use case orchestration.</li>
<li><strong>Domain</strong>: In line with the <strong>Entities</strong> from EBI.</li>
<li><strong>Infrastructure</strong>: Simply in charge of persistence, messaging, and such.</li>
</ul>
<p>If <em>'<a href="../ebi-arch">EBI architecture</a>'</em> doesn't ring a bell, you might want to start <a href="../arch-for-noobs">here</a>.</p>
<h2>Entities / Value Objects</h2>
<p>Concrete representations of very basic Domain concepts.</p>
<p>They differ on <strong>mutability</strong> and <strong>identity</strong>.</p>
<h3>Entity</h3>
<p>An employee might <strong>change</strong> their role within the company; that doesn't make them a different employee.</p>
<p>Apart from their name, you'll likely identify them by some sort of <strong>ID</strong>, so a change in their attributes doesn't change their identity.</p>
<p>So we would say an employee is an Entity in our system.</p>
<h3>Value Object</h3>
<p>A phone number on the other hand <strong>does not change</strong>. Or rather, if it does, we are talking about a different phone number.</p>
<p>It wouldn't really make sense for a phone number to have an ID: The object is <strong>identifiable by its attributes</strong>.</p>
<p>Thus, things like phone numbers, email addresses, etc. are Value Objects (VO).</p>
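<p>Sticking with the employee and phone number examples, a minimal sketch of the difference might look like this (names and fields invented):</p>

```typescript
// Entity: identified by its ID, attributes may change over time.
class Employee {
  constructor(
    public readonly id: string,
    public role: string,
  ) {}

  equals(other: Employee): boolean {
    return this.id === other.id; // identity, not attributes
  }
}

// Value Object: immutable, identified by its attributes.
class PhoneNumber {
  constructor(public readonly number: string) {}

  equals(other: PhoneNumber): boolean {
    return this.number === other.number; // the attributes *are* the identity
  }
}

const employee = new Employee("emp-1", "developer");
employee.role = "manager"; // same employee, new role
```

<p>Note how the two <code>equals</code> implementations encode the whole distinction: change an Employee's attributes and it's still the same Employee; change a PhoneNumber's digits and you have a different phone number.</p>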
<h3>Rich Domain</h3>
<p>While in general, Entities have IDs and VO don't, not all cases are so clear-cut as these.
The context and Domain will dictate which to use in a given situation.</p>
<p>It's also important to remember that neither should be anemic: they should <strong>encapsulate as much logic</strong> as reasonably possible (usually all logic regarding their individual behavior).</p>
<h2>Aggregates</h2>
<p><strong>Conceptual</strong> elements made up of multiple Entities and/or VO, which only have meaning or make sense together.</p>
<p>The <strong>concrete</strong> representation of this element is the <strong>Aggregate Root</strong>. This serves as a gateway to the rest of the elements enclosed within the Aggregate (Entities and VO).</p>
<p>The Aggregate Root should be the only way of accessing those elements, especially when modifying their state.</p>
<p>It's not hard to imagine such a structure getting out of hand. Prevent this from happening by:</p>
<ul>
<li>Keeping them as small as possible.</li>
<li>Allowing for easy promotion from Entity to Aggregate Root, in case one of them grows significantly.</li>
<li>Relating Aggregates to one another by ID, or through Services or Events, to maintain scalability.</li>
</ul>
<p>All logic pertaining to multiple Aggregates should be delegated to a Domain Service.</p>
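<p>A rough sketch of an Aggregate Root guarding its internals, using an invented <code>Order</code> example:</p>

```typescript
// Entity enclosed within the Aggregate; never touched from outside.
class OrderLine {
  constructor(
    public readonly product: string,
    public readonly quantity: number,
  ) {}
}

// The Aggregate Root: the only gateway to the OrderLines.
class Order {
  private lines: OrderLine[] = [];

  constructor(public readonly id: string) {}

  // All state changes go through the Root, so invariants live in one place.
  addLine(product: string, quantity: number): void {
    if (quantity <= 0) throw new Error("Quantity must be positive");
    this.lines.push(new OrderLine(product, quantity));
  }

  get totalItems(): number {
    return this.lines.reduce((sum, line) => sum + line.quantity, 0);
  }
}

const order = new Order("order-1");
order.addLine("keyboard", 2);
order.addLine("mouse", 1);
```

<p>Because nothing outside <code>Order</code> can construct or mutate an <code>OrderLine</code>, the "quantity must be positive" rule can't be bypassed.</p>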
<h2>Services</h2>
<p>Stateless objects that perform Domain-specific operations that escape the boundary of the Aggregate. Based on their scope, there are two kinds:</p>
<ul>
<li><strong>Domain services</strong>: Execute logic that does not fit nicely within an Aggregate, orchestrating interactions between multiple Aggregate Roots.</li>
<li><strong>Application services</strong>: Orchestrate a Use Case, using Repositories, Domain Services and Aggregate Roots, but always <strong>within their own Module</strong>.</li>
</ul>
<p>To be clear, the scope should grow the further away we go from VO:</p>
<pre><code>VO < Entity < Aggregate Root < Domain Service < Application Service < Module
</code></pre>
<p>If communication is needed between Modules, the Application Services should talk to one another, without accessing Domain objects from other Modules.</p>
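<p>To illustrate that last point with two made-up Modules: the Ordering Module talks to the Billing Module's Application Service, never to Billing's Aggregates or Repositories directly.</p>

```typescript
// Billing Module: its Application Service is the only entry point
// other Modules are allowed to use.
class BillingApplicationService {
  private invoices = new Map<string, number>();

  createInvoice(orderId: string, amount: number): void {
    this.invoices.set(orderId, amount);
  }

  invoiceFor(orderId: string): number | undefined {
    return this.invoices.get(orderId);
  }
}

// Ordering Module's Application Service depends on Billing's *service*,
// keeping Billing's Domain objects out of reach.
class OrderingApplicationService {
  constructor(private readonly billing: BillingApplicationService) {}

  placeOrder(orderId: string, amount: number): void {
    // ...Ordering's own Domain logic would run here...
    this.billing.createInvoice(orderId, amount);
  }
}

const billing = new BillingApplicationService();
new OrderingApplicationService(billing).placeOrder("order-1", 42);
```

<p>If Billing later gets promoted to its own Bounded Context, that single service boundary is the only seam you have to cut.</p>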
<h2>Domain Events</h2>
<p>A decoupled way for different parts of the system to indirectly interact with one another.</p>
<p>These usually materialize into a Pub/Sub structure:</p>
<h3>Publisher</h3>
<p>There are at least two ways of approaching who should be in charge of event publishing:</p>
<ul>
<li>Aggregate Roots <strong>publish</strong> changes in their state directly.</li>
<li>Aggregate Roots <strong>register</strong> these events for the Application Service to publish.</li>
</ul>
<h3>Subscriber</h3>
<p>Event subscribers look a bit like controllers, just limited in scope within our Domain.</p>
<p>They both ingest the primitive types of their respective input and use them to run the relevant use case.</p>
<p>Where controllers receive requests, subscribers receive events, but in essence you can think of their role as equivalent in practice, just with different scopes and implementation.</p>
<p>So for example, a generic Subscriber interface signature might look something like:</p>
<pre><code>public interface DomainEventSubscriber<DomainEvent>
</code></pre>
<p>Where the implementation looks like:</p>
<pre><code>public class DoStuffOnCustomEvent implements DomainEventSubscriber<CustomEvent>
</code></pre>
<p><code>CustomEvent</code> might, for example, implement or extend from <code>DomainEvent</code>.</p>
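<p>Putting the pieces together in a TypeScript-flavored sketch of the <em>register</em> approach described above (all names invented): the Aggregate Root records its events, and something else pulls and publishes them to subscribers.</p>

```typescript
interface DomainEvent {
  readonly name: string;
}

class UserRenamed implements DomainEvent {
  readonly name = "user.renamed";
  constructor(public readonly userId: string) {}
}

interface DomainEventSubscriber<E extends DomainEvent> {
  on(event: E): void;
}

// The Aggregate Root *registers* events instead of publishing them itself.
class UserAggregate {
  private events: DomainEvent[] = [];

  rename(userId: string): void {
    // ...actual renaming logic would live here...
    this.events.push(new UserRenamed(userId));
  }

  // The Application Service pulls pending events and publishes them.
  pullEvents(): DomainEvent[] {
    const pending = this.events;
    this.events = [];
    return pending;
  }
}

class DoStuffOnUserRenamed implements DomainEventSubscriber<UserRenamed> {
  public seen: string[] = [];
  on(event: UserRenamed): void {
    this.seen.push(event.userId);
  }
}

const user = new UserAggregate();
user.rename("user-1");
const subscriber = new DoStuffOnUserRenamed();
user.pullEvents().forEach((e) => subscriber.on(e as UserRenamed));
```

<p>A real implementation would route events to subscribers through an event bus rather than a direct <code>forEach</code>; this just shows the register-then-publish flow.</p>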
<h2>Repositories</h2>
<p>They abstract concerns about data storage and other infrastructure.</p>
<p>Ideally, there will be one Repository per Aggregate Root, and it should only be called by the relevant Application Service/s as part of a use case orchestration process.</p>
<p>They usually take the form of a domain-leaning interface with concrete implementations based on the specific infrastructure at hand.
This is more or less borrowed from <a href="../ports-and-adapters">Ports and Adapters</a>.</p>
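<p>A minimal sketch of that Port-like shape, with an invented in-memory implementation standing in for something like the <code>MySqlRepository</code> from the directory structure above:</p>

```typescript
class Employee {
  constructor(
    public readonly id: string,
    public role: string,
  ) {}
}

// Domain-leaning interface: speaks the Domain's language,
// knows nothing about storage details.
interface EmployeeRepository {
  save(employee: Employee): void;
  byId(id: string): Employee | undefined;
}

// Infrastructure-side implementation; a production one might
// talk to MySQL instead of a Map.
class InMemoryEmployeeRepository implements EmployeeRepository {
  private store = new Map<string, Employee>();

  save(employee: Employee): void {
    this.store.set(employee.id, employee);
  }

  byId(id: string): Employee | undefined {
    return this.store.get(id);
  }
}

const repo: EmployeeRepository = new InMemoryEmployeeRepository();
repo.save(new Employee("emp-1", "developer"));
```

<p>The Application Service only ever sees <code>EmployeeRepository</code>, so swapping MySQL for anything else (or for this in-memory version in tests) touches no Domain code.</p>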
<h2>More to come</h2>
<p>By now this is probably sounding like a big ball of jargon with not much of an architecture behind it, no real intention or plan.</p>
<p>We'll go over the <a href="../ddd-strategy">Strategic</a> side of DDD in a later post.</p>
<h3>Also read</h3>
<ul>
<li><a href="https://thedomaindrivendesign.io/what-is-tactical-design/">https://thedomaindrivendesign.io/what-is-tactical-design/</a></li>
<li><a href="https://thedomaindrivendesign.io/what-is-strategic-design/">https://thedomaindrivendesign.io/what-is-strategic-design/</a></li>
<li><a href="http://gorodinski.com/blog/2013/03/11/the-two-sides-of-domain-driven-design/">http://gorodinski.com/blog/2013/03/11/the-two-sides-of-domain-driven-design/</a></li>
<li><a href="https://herbertograca.com/2017/09/07/domain-driven-design/">https://herbertograca.com/2017/09/07/domain-driven-design/</a></li>
</ul>
Pixar Driven Developmenthttps://devintheshell.com/blog/pixar/https://devintheshell.com/blog/pixar/Not really a thingSun, 13 Feb 2022 11:53:06 GMT<p>Whether you personally like them or not, it's hard to deny the impact that Pixar films have on the vast majority of people and the rest of the industry.</p>
<p>I'm sure you know who Buzz Lightyear is, and any 90s kid can recognize the sound of the Pixar lamp from a mile away.</p>
<p><img src="./pixar-jump.webp" alt="lamp" /></p>
<p>So <a href="https://www.youtube.com/watch?v=rkvuOLuoTfs">how do they achieve this</a>? And why on earth should software developers care at all?</p>
<h2>Technical Similarities</h2>
<p>There are a couple of techniques they follow that sound surprisingly similar to some of the best practices in software development.</p>
<p>See if any of the following sound familiar:</p>
<h3>Not over yet</h3>
<p>When writing the script, they don't just get it done first and then continue with the production.</p>
<p>They keep writing and improving the script until the whole film is <strong>completely</strong> finished (as in actually being released).</p>
<p>So the <em>development</em> of the script is <em>continuous</em>, and the <em>integration</em> with the rest of the production team is constant and changing.</p>
<h3>Refactor that scene</h3>
<p>The opening of Toy Story 3 was re-written 60 times to get it just right.</p>
<p>No matter how good the first <em>iteration</em> is, you can almost certainly make it better.</p>
<h3>Break the script down into Bounded Contexts</h3>
<p>Once the script has the minimum basic structure and general shape, they will <em>break it down</em> into sequences: relatively small story arcs that, although all connected, are somewhat independent of one another.</p>
<p>They look for about 25 to 30 sequences per film and assign them to different teams which can work on them (and give each other feedback) <em>concurrently</em>.</p>
<p>What an <em>Agile</em> approach!</p>
<h3>Fast Feedback</h3>
<p><strong>While</strong> all of this is happening, a storyboard is created. Not before the work begins, not after it's done, but <strong>while</strong> it's taking place.</p>
<p>This allows for a broader, bird's-eye view of the project and gives a clear picture of <em>what works and what doesn't</em>.</p>
<h3>Communication</h3>
<p>Obviously, none of the above are possible without constant communication.</p>
<p>This is not a blanket statement: communication between teams, departments and individuals is absolutely <strong>key</strong> to making this work.</p>
<h3>Digital in Nature</h3>
<p>There are clear differences between live action movies and animated ones, mainly due to the digital nature of the latter.
Carrying over the limitations of the <em>real world</em> to the digital space makes no sense. Why not take advantage of the differences?</p>
<p>In that spirit, they create a rough draft of the entire film. Everything from fake voice acting done by employees to very rough animations.
Doing so allows them to modify animations, adjust the script and the sequences however they like.</p>
<p>After all, it's not like they need to record the whole movie with real actors and only then edit the scenes working with whatever they got.</p>
<p>It's really kind of strange the amount of influence the manufacturing industry has had over software development when you think about it.</p>
<h2>Let them write</h2>
<p>Another thing Pixar does well is hiring the right talent for the job at hand.</p>
<p>They understand that not all animators and/or writers are the same, nor are they always good at everything vaguely related to their profession.</p>
<p>Of course, this is easier said than done.
In most cases companies have to work with what they can find and/or employ.</p>
<p>There is one thing to keep in mind here: Writers work with total creative freedom, no oversight, no deadlines.
You can imagine this sounds like heaven to a writer.</p>
<p>This creates a positive feedback loop where Pixar seeks out the best talent and facilitates a work environment in which pretty much everyone would want to work, which in turn makes finding that talent a lot easier.</p>
<h2>Collaboration</h2>
<p>None of the movies are the result of one brave employee who knows best and has all the skills.</p>
<p>As an example, the creation of <em>Toy Story 3</em> involved all the top-level creative people going <strong>together</strong> to a cabin in the woods.
After just 2 days they had the basic premise and ideas for the movie.</p>
<p>These were 6 creative-minded people coming together and reaching an <strong>agreement</strong> (in very little time as well).
This trickles down to the rest of the production team.</p>
<p>Every moment of every scene is the fruit of the work and attention to detail of hundreds of employees in constant collaboration.</p>
<p>Directors work with writers, writers work with animators, animators with actors, and so on.</p>
<p>It's a network of hard work, trust and harmony that makes the whole deal not only work, but work amazingly well and produces some of the most memorable films of my generation.</p>
<h2>The rules</h2>
<p>Let me leave you with an extract from <a href="https://www.aerogrammestudio.com/2013/03/07/pixars-22-rules-of-storytelling/">Pixar's 22 rules of storytelling</a>.
See if you can spot the similarities:</p>
<ul>
<li>You gotta keep in mind what’s interesting to you as an audience, not what’s fun to do as a writer. They can be very different.</li>
<li>Trying for theme is important, but you won’t see what the story is actually about til you’re at the end of it.</li>
<li>Simplify. Focus. Combine characters. Hop over detours. You’ll feel like you’re losing valuable stuff, but it sets you free.</li>
<li>What is your character good at, comfortable with? Throw the polar opposite at them. Challenge them. How do they deal?</li>
<li>Come up with your ending before you figure out your middle. Seriously. Endings are hard, get yours working up front.</li>
<li>Finish your story, let go even if it’s not perfect. In an ideal world you have both, but move on. Do better next time.</li>
<li>Putting it on paper lets you start fixing it. If it stays in your head, a perfect idea, you’ll never share it with anyone.</li>
<li>No work is ever wasted. If it’s not working, let go and move on – it’ll come back around to be useful later.</li>
</ul>
Docker 102https://devintheshell.com/blog/docker-102/https://devintheshell.com/blog/docker-102/High level explanation of Docker's bits and piecesSun, 30 Jan 2022 16:14:56 GMT<p>We'll go over some everyday commands and files you'll use as part of your development workflow with docker.</p>
<p>Let's start by tying together the concepts from <a href="../docker-101">the previous post</a>, with the ones we are about to see:</p>
<blockquote>
<p>One <strong>builds</strong> an <strong>image</strong> (which might share a <strong>volume</strong> with the host machine) based on the definition found in a <strong>Dockerfile</strong>, <strong>runs</strong> it in a <strong>container</strong> and optionally <strong>composes</strong> multiple images together.</p>
</blockquote>
<p>To make sense of this, let's take a closer look.</p>
<h2>Dockerfile</h2>
<p>A file that defines a docker Image, a blueprint of sorts.
It will look something like this:</p>
<pre><code>FROM alpine
RUN apk update
RUN apk add nginx
RUN echo Image created!
</code></pre>
<p>It contains a series of commands of the format <code>INSTRUCTION arguments</code>.</p>
<p>Keep in mind that every line is a new layer in the Image.
So the order <strong>does</strong> matter.</p>
<h3>Common Instructions</h3>
<h4>FROM</h4>
<p>Sets the Base Image for subsequent instructions.
In its most basic form, you'll see here what OS the Image is based on (Alpine Linux in our example).</p>
<p>A valid Dockerfile <strong>must start</strong> with a <code>FROM</code> instruction.
Most commonly, this will be done by <strong>pulling an already existing image</strong> from the public <a href="https://docs.docker.com/docker-hub/repos/">repos</a>.</p>
<h4>RUN</h4>
<p>Executes a command <strong>within</strong> the Container.</p>
<p>Create a directory? <code>RUN mkdir</code>.
Update your system? <code>RUN apk update</code>.
Install a dependency? <code>RUN apk add dependency</code>.</p>
<p>Plain and simple.</p>
<h4>CMD</h4>
<p>Default command to execute when <a href="#docker-run">running</a> an image, <em>'what the image does'</em>.</p>
<pre><code>CMD ["echo", "This will be printed to the host system!"]
</code></pre>
<p>Only one is allowed per Dockerfile, and whatever command we append to <code>docker run</code> will override this instruction.
We'll take a closer look at the <a href="#docker-run">run command</a> further down.</p>
<h4>ENV</h4>
<p>Sets environment variables, quite like you would in your <code>.bashrc</code> or <code>.zshrc</code>.</p>
<p>Useful if you need information to be set at <a href="#docker-build">build time</a>, for later modification or reference at run time.</p>
<h4>COPY</h4>
<p>Copy files or directories from the host to the Container.</p>
<p>It works pretty much as you would expect:</p>
<pre><code>COPY /source/host/path/afile /destination/container/path/
</code></pre>
<h4>ADD</h4>
<p>Pretty much like <code>COPY</code>, with the remarkable difference that <code>ADD</code> can also unpack tarballs and fetch files from remote URLs.</p>
<p>So you could say</p>
<pre><code>ADD https://cool-github-repo.git /destination/container/path/
</code></pre>
<p>Pretty handy, but if you don't need the added functionality, prefer <code>COPY</code>.</p>
<h4>VOLUME</h4>
<p>Creates a sort of shared directory between host and Container.</p>
<p>So an instruction like:</p>
<pre><code>VOLUME ["/opt"]
</code></pre>
<p>Would make the Container's <code>/opt</code> directory accessible from the host.
In fact, it will actually '<strong>mount the volume</strong>' somewhere under the host's <code>/var/lib/docker/volumes/</code> directory.</p>
<h2>Docker Build</h2>
<p>Used to build an Image; use <code>-f</code> to specify the Dockerfile path (optional if it's located in the <code>cwd</code>) and <code>-t</code> to give the Image a name.</p>
<p>Those are options you can (but don't have to) pass.
It does, however, need a build context as a parameter.</p>
<blockquote>
<p>A build’s context is the set of files located outside the Container (local path or URL) that it will be able to refer to at build time.</p>
</blockquote>
<p>This is more or less like the <code>COPY</code> instruction we saw <a href="#copy">before</a>, the difference being that <code>COPY</code> makes the host's files available at run time, while the context makes them available only at build time.</p>
<h3>Example</h3>
<p>A build command usually looks something like this:</p>
<pre><code>docker build -t my-docker-image -f src/Dockerfile .
</code></pre>
<p>Which is to say: <em>'Build an Image called <code>my-docker-image</code> based on the file <code>src/Dockerfile</code> with <code>.</code> (or <code>cwd</code>) as its build context'</em>.</p>
<h2>Docker Run</h2>
<p>Tells Docker to execute the image as defined in the Dockerfile.
If a command (or script) is appended, it will override the <code>CMD</code> instruction (if set).
It's more or less like spinning up a VM.</p>
<p>It must take an Image as a parameter, although its options make it possible to override nearly all the instructions specified in the Dockerfile.
This allows for a lot of flexibility.</p>
<h3>Common options</h3>
<h4>-it</h4>
<p>It allocates a <em>'pseudo-tty'</em> and keeps STDIN open during execution.
Useful if you want to be able to interact with your Container through command line.</p>
<h4>--rm</h4>
<p>Removes the Container from the host's file system after execution.</p>
<h4>-u</h4>
<p>Changes the user and group (both of which are <code>root</code> by default) for the specific execution.
Useful if your docker Image outputs files to the host's file system (which can get a bit unwieldy if done as <code>root</code>).</p>
<p>One neat thing you can do is make the docker Image run as the current host user (the one executing the command).
You would do so by passing <code>"$(id -u "$USER"):$(id -g "$USER")"</code> as the parameter for <code>-u</code>.</p>
<h4>--volume</h4>
<p>Allows you to bind or mount directories from the host's file system to the Container, or from one Container to another.</p>
<p>Takes an argument of the structure <code>host-source:container-destination</code> (<code>container-destination</code> must be an absolute path).</p>
<p><!-- markdownlint-disable-next-line MD024 --></p>
<h3>Example</h3>
<pre><code>docker run -it --rm --volume "$PWD":/data -u "$(id -u "$USER"):$(id -g "$USER")" my-docker-image useful-script.sh
</code></pre>
<p>Run the Image tagged as <code>my-docker-image</code> in a Container and execute <code>useful-script.sh</code> at startup.
Keep STDIN open with a <em>'pseudo-tty'</em> while running and remove the Container when done.</p>
<p>Also, mount the <code>cwd</code> (<code>$PWD</code>) of the host into the <code>/data</code> directory in the Container, and operate as the current host user (and group) instead of <code>root</code>.</p>
<h2>Docker Compose</h2>
<p>Utility for managing the build and run of one or more Images, and the relations (or dependencies) between them.</p>
<p>You might be able to achieve similar results by just running the Images separately from the command line.
This is however a really easy and convenient way of building complex systems of interconnected and/or interdependent Containers <strong>in a reproducible manner</strong>.</p>
<p>So just like you would build an Image from a Dockerfile, you can compose a bunch of Services from a <code>docker-compose.yml</code> like:</p>
<pre><code>services:
  my-cool-app:
    build:
      context: ${PWD}
      dockerfile: ./Dockerfile
    command: python app.py
    ports:
      - "5000:5000"
  mysql:
    image: mysql
  datadog:
    image: datadog
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
</code></pre>
<p>As you might be able to tell by the general structure of the file, it is quite literally a sequence of <em>'Dockerfile-like'</em> instructions enclosed within a sequence of <em>'Services'</em> (which for simplicity we'll consider equivalent to Containers).</p>
<h3>How it works</h3>
<ol>
<li>Define your app Image with a Dockerfile (and point to it from <code>docker-compose.yml</code>).</li>
<li>Define the Images for the rest of the Containers (Services) you need in <code>docker-compose.yml</code>.</li>
<li>Run <code>docker compose up</code> to start and run the Containers (<code>docker compose down</code> will stop and remove the ongoing processes gracefully).</li>
</ol>
<p>You can pass it the <code>-d</code> flag to detach the process from the terminal and, just like with the <code>build</code> command, you can use <code>-f</code> to tell it where the <code>docker-compose.yml</code> is located.</p>
Docker 101https://devintheshell.com/blog/docker-101/https://devintheshell.com/blog/docker-101/Some very simple conceptsSun, 30 Jan 2022 16:14:52 GMT<p>Basic docker-related vocabulary (namely Image, Container and Volume) with a brief explanation.
There are a lot of nuances I'll be glossing over.</p>
<p>We'll go into a bit more practical detail in the <a href="../docker-102">follow-up post</a>.</p>
<h2>Not a VM</h2>
<p>It is usually described in detail how different Docker (or containerization more broadly) is from a traditional Virtual Machine.</p>
<p>This is most definitely true: there are a number of low level and practical differences between these two technologies.
However, I would argue that pointing out those differences does little to help understand the vocabulary and concepts around Docker.</p>
<p>In fact, understanding VM-related concepts like <em>instance</em> or <em>virtual hard disks</em> and the difference between a VM definition/configuration and a specific <em>run</em> of that VM can go a long way to help understand why docker can be so intricate and useful.</p>
<h2>Image</h2>
<p>An <strong>Image</strong> is what defines the default composition and behavior of a <strong>Container</strong>.
Think of how the idea of a table is an abstract representation of a concrete, palpable table.</p>
<p>If we were talking about a VM, this would be more or less analogous to a VM's definition or configuration.</p>
<p>You define an <strong>Image</strong> in the corresponding <a href="../docker-102#dockerfile">Dockerfile</a>, and build it via the command line (or pull it from the web).</p>
<p>In a VM, configuration and behavior would usually go separately. The best you could do was take snapshots.
An <strong>Image</strong> on the other hand, not only defines configuration (i.e. OS or installed software) but also behavior (i.e. commands to run, dependencies to install) of what we call <strong>Containers</strong>.</p>
<h2>Container</h2>
<p>A <strong>Container</strong> is a concrete instantiation of what's defined by the corresponding <strong>Image</strong>.
Think of the palpable table from before.</p>
<p>This would be like a specific, concrete instance of your good old VM.</p>
<p>It's an isolated environment in which <em>'stuff'</em> (might be your app, might be other things) happens.
You can think of a <strong>Container</strong> as the concrete instantiation of an <strong>Image</strong>; it gets created when you run said <strong>Image</strong>.</p>
<p>The technical differences between VMs and containerization are usually brought up at this point.
Just know that <strong>Containers</strong> are <strong>stupidly efficient</strong> compared to VMs, and a lot more versatile.</p>
<h3>Note</h3>
<p>Technically, we say that <strong>Images run in Containers</strong>.</p>
<p>So a <strong>Container</strong> only holds a single run of an <strong>Image</strong>.
This is because one <strong>Image</strong> can be executed multiple times in parallel, so you might have a bunch of <strong>Containers</strong> running with the same <strong>Image</strong> but possibly with different processes and/or outputs or results.</p>
<p>However, I reckon it's easier to visualize for a newcomer as explained above.</p>
<h2>Volume</h2>
<p>Think of it as 'disk space' for a <strong>Container</strong> (or multiple <strong>Containers</strong>).
It's where Docker will operate, its very own file system.</p>
<p>Plain and simple, it's the equivalent of a virtual hard disk for a VM.
It can be, and usually is, shared between multiple <strong>Containers</strong> and can easily communicate with (as in, it's mounted to) the host's file system.</p>
Heuristics for Devshttps://devintheshell.com/blog/heuristics-for-devs/https://devintheshell.com/blog/heuristics-for-devs/Growing list of code, design and development related heuristicsSat, 29 Jan 2022 16:17:57 GMT<p>Some heuristics I find useful at work. Something like <a href="https://github.com/stanislaw/SoftwareDesignHeuristics">this</a>, but dumbed down and more concise.</p>
<p>These are not meant to be dogmas but general rules of thumb that should help you be a better dev. Ditch them as soon as they don't.</p>
<h2>Fast Feedback</h2>
<p>Both at small (TDD) and large (CD) scale.
Don't just <em>think</em> it's OK, actually see if it is in practice.</p>
<p>Expect fuckups to occur, try to know about them fast.</p>
<h2>Baby steps</h2>
<h3>Refactoring</h3>
<p>Start small, even if small means apparently irrelevant, peripheral changes (variable names, directory structure, etc.).</p>
<h3>New Feature</h3>
<p>Find the smallest meaningful piece of the system and build up from there.</p>
<h2>Complexity</h2>
<p>The root of all evil. It is <strong>sometimes necessary</strong>, but often accidental.
Be skeptical of the former and avoid the latter at all cost.</p>
<h2>Divide and conquer</h2>
<p>Split everything up as much as possible, even if it seems absurd.
As long as it doesn't take more effort to split the task than to actually do it, the smaller, the better.</p>
<h2>Decide as late as possible</h2>
<p>Chances are, the later you decide, the more knowledge and experience you have.
Anything that <strong>can</strong> be decided further down the road without causing a major setback, <strong>should</strong>.</p>
<h2>Respect the Legacy</h2>
<p>We know at least two things about all Legacy Code:</p>
<ul>
<li>It works.</li>
<li>It makes money.</li>
</ul>
<p>Respect Legacy Code and the people who wrote it.</p>
<h2>Go fast, write great code</h2>
<p>The only sustainable way to go fast is writing great code.
The only sustainable way to write great code is going fast.</p>
<h2>Start anew</h2>
<p>Don't build the project of tomorrow with the crap from yesterday.</p>
<h2>Listen to your gut feeling</h2>
<p>Don't dismiss it with a <em>'meh, it works'</em>.
If you feel something could be better, <strong>make it better</strong>.</p>
<h2>Be water my friend</h2>
<p>Flexible with other people's code, strict with yours.</p>
<h2>Commits</h2>
<p>Keep 'em coming and keep 'em small.
Think of them as checkpoints, safe states you can return to. You can always squash them later.</p>
<h2>Boy Scout Rule</h2>
<p>Leave the system (code, docs, etc.) better than you found it.</p>
<h2>Code</h2>
<h3>Readable</h3>
<p>Code for other people, not for the CPU.</p>
<h3>Naming is not relevant</h3>
<p>It's <strong>extremely</strong> important.</p>
<h3>Simplistic naming</h3>
<p>Complex naming schemes might indicate inadequate modelling.</p>
<h3>Expl!c!t language</h3>
<p>When in doubt be explicit.</p>
<h3>Boring, repetitive, predictable</h3>
<p>Boring code is good code.
No surprises, no 'WTF'.</p>
<h3>Write code like a manual</h3>
<p>Show what it does and how to use it.
Hide how it works.</p>
<h3>Syntax</h3>
<p>Nouns for classes.
Verbs for functions.
Adjectives for interfaces.</p>
<h2>Tests</h2>
<h3>Isolation</h3>
<p>Not all layers need to be tested in isolation, or tested at all for that matter.</p>
<h3>Test Behavior</h3>
<p>Not code.</p>
<h3>Coverage</h3>
<p>Low test coverage suggests you might want to write more/better tests.
High test coverage <strong>does not</strong> imply you are testing enough/properly.</p>
<h3>Don't test someone else's code</h3>
<p>Either trust the framework/library or choose another one.
If you can't trust it, don't depend on it.</p>
<h2>Design</h2>
<h3>Unix</h3>
<p>Do 'one thing', do it well.</p>
<h3>Abstract dependencies</h3>
<p>Depend on abstractions, not concretions.</p>
<h3>Extend existing behavior</h3>
<p>Don't modify it.</p>
<h3>Coupling and Cohesion</h3>
<p>They are the same thing. The latter just has some thought put into it.</p>
<h3>Demeter, don't ask</h3>
<p>Units (classes, modules, functions) should talk to one another only if they share the same concern, and in such a way that keeps them ignorant of one another's inner workings.</p>
<h3>Avoid changes in abstraction levels</h3>
<p>They are hard to follow and indicate that something might need to be in a different layer.</p>
<h3>Avoid generalizations</h3>
<p>They are easy to build but a pain to remove.</p>
<h2>Don'ts</h2>
<h3>Don't do a perfect job</h3>
<p>Perfection is hardly relevant. A bad test is better than no test.
Don't waste time and mental space on perfection.</p>
<h3>Don't follow the rules</h3>
<p>Follow the principles.</p>
<h3>Don't be clever</h3>
<p>Don't get fancy. Keep it simple.</p>
<h3>Don't be an 'architect'</h3>
<p>Bug-less code with meh-architecture is better than awesome architecture with buggy code.</p>
<h3>Don't fear duplication</h3>
<p>It's better than poor abstraction.
The <a href="https://en.wikipedia.org/wiki/Don't_repeat_yourself">DRY</a> principle is about avoiding duplicate <strong>logic or knowledge</strong>.</p>
<p>Duplicate lines might imply logic duplication, but they might not.</p>
<h3>Don't get sentimental</h3>
<p>No emotional attachment to code. Not to yours, not to others.</p>
How to findhttps://devintheshell.com/blog/how-to-find/https://devintheshell.com/blog/how-to-find/A needle in a haystackWed, 26 Jan 2022 18:50:36 GMT<p>It not only helps locate files in the file system, it also allows you to manipulate what it finds.</p>
<h4>Keep in mind</h4>
<p>Not all find implementations are created equal. This post is based on the GNU implementation.</p>
<h2>The basics</h2>
<p>The find command has the following structure:</p>
<pre><code>find [DIR] [OPTS] [EXP]
</code></pre>
<p>Where <code>DIR</code> is the directory in which you wish to search, <code>OPTS</code> are search options, and <code>EXP</code> is an expression by which to search.</p>
<p>The most basic practical use might look something like this:</p>
<pre><code>find . -name 'config'
</code></pre>
<p>Which translates to <em>"find anything named exactly <code>config</code> within <code>cwd</code> (<code>.</code>) and its contained directories"</em>.</p>
<p>This would print the paths (relative to where find is launched) for all files <strong>and</strong> directories that match the given pattern.</p>
<p>So for a file named <code>config</code> in a directory named <code>config</code>, it would output:</p>
<pre><code>src/config
src/config/config
</code></pre>
<h2>The options</h2>
<p>For clarity, I've grouped them under three categories: Filters, Operators and Actions.</p>
<p>This separation should make them easier to reason about.</p>
<h3>Filters</h3>
<p>Technically called <code>tests</code>, these will tell find what <em>'sort of things'</em> you are after.</p>
<h4>-type</h4>
<p>Tells find to only consider certain types of files:</p>
<pre><code>-type f -> files
-type d -> directories
-type l -> symlinks
</code></pre>
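<p>To see the difference, here's a small sketch you can run (the <code>/tmp/find-type-demo</code> paths are just scratch names for illustration):</p>
<pre><code>mkdir -p /tmp/find-type-demo/sub
touch /tmp/find-type-demo/a.txt /tmp/find-type-demo/sub/b.txt

find /tmp/find-type-demo -type f   # prints a.txt and sub/b.txt
find /tmp/find-type-demo -type d   # prints find-type-demo and sub
</code></pre>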
<h4>-name / -path</h4>
<p>When asking for the <strong>name</strong>, find will look for a match with the last portion of the path, so after the last <code>/</code>.</p>
<p>When asking for the <strong>path</strong>, it will look for any path that <strong>exactly matches</strong> the given string.</p>
<p>So if you want to find all files within a <code>something</code> directory, but there are many such directories under <code>cwd</code>, you would tell find to look for files with <code>something</code> as a part of their paths:</p>
<pre><code>find . -type f -path '*something*'
</code></pre>
<p>As you can see, the <code>EXP</code> part of the command takes a shell-style glob pattern rather than a full regex (which is why it only matches the exact string by default).</p>
<p>Here, we include the wild-card <code>*</code>, which will match for <code>cwd/path/something/myFile</code> and/or <code>cwd/something/myOtherFile</code>.</p>
<p>Both the <code>-name</code> and the <code>-path</code> filters have case-insensitive versions: <code>-iname</code> and <code>-ipath</code>.</p>
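<p>A quick sketch of the case-sensitivity difference (scratch paths for illustration):</p>
<pre><code>mkdir -p /tmp/find-iname-demo
touch /tmp/find-iname-demo/README.md

find /tmp/find-iname-demo -name 'readme.md'    # no output, case mismatch
find /tmp/find-iname-demo -iname 'readme.md'   # prints README.md
</code></pre>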
<h4>-regex</h4>
<p>Unlock the full potential of <a href="../how-to-regex">regex</a> by using the <code>-regex</code> flag!</p>
<h4>-mindepth / -maxdepth</h4>
<p>Unless told otherwise, find will always search <strong>recursively</strong> throughout the directory structure.
You can limit the scope of the command by setting its <code>-mindepth</code> and <code>-maxdepth</code>.</p>
<p>These filters take a number as a parameter: <code>0</code> is the directory passed to find itself (<code>cwd</code> as <code>.</code> in our examples so far), <code>1</code> is its direct contents, <code>2</code> the contents of its direct subdirectories, and so on.</p>
<p>So <code>find . -maxdepth 1 -type f -name 'whoami'</code> would look for a file named <code>whoami</code> only within the starting directory (ignoring its child directories).</p>
<p>While <code>find . -mindepth 2 -type f -name 'whoami'</code> would look for that same file only in the subdirectories of <code>cwd</code>, skipping the files directly inside <code>cwd</code>.</p>
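<p>To see the depth filters in action (scratch paths for illustration):</p>
<pre><code>mkdir -p /tmp/find-depth-demo/child
touch /tmp/find-depth-demo/top.txt /tmp/find-depth-demo/child/deep.txt

find /tmp/find-depth-demo -maxdepth 1 -type f   # only top.txt
find /tmp/find-depth-demo -mindepth 2 -type f   # only child/deep.txt
</code></pre>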
<h3>Operators</h3>
<p>Mix, match or negate multiple searches:</p>
<pre><code>-not -> negate following pattern
-a -> 'and' following pattern
-o -> 'or' following pattern
</code></pre>
<p>So <code>find . -name 'hi' -o -name 'mom'</code> would look for files named <code>hi</code> or <code>mom</code>.</p>
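<p>A runnable sketch of both operators; note that when mixing <code>-o</code> with other tests you may need to group expressions with escaped parentheses:</p>
<pre><code>mkdir -p /tmp/find-ops-demo
touch /tmp/find-ops-demo/hi /tmp/find-ops-demo/mom /tmp/find-ops-demo/dad

find /tmp/find-ops-demo \( -name 'hi' -o -name 'mom' \)   # hi and mom
find /tmp/find-ops-demo -type f -not -name 'dad'          # hi and mom
</code></pre>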
<h3>Actions</h3>
<p>There are a bunch of actions find can perform. By far the most common and useful one is <code>-exec</code>.</p>
<h4>-exec / -execdir</h4>
<p>You might need to further manipulate the output of a find command. Usually however, you'll find that the tools you want to use don't read from <code>stdin</code> but rather expect the input as params.</p>
<p>You could use <code>xargs</code> for this, but the find command offers a built-in alternative.</p>
<p>You can use <code>-execdir [COMMAND] "{}" \;</code> (or <code>-exec</code>) at the end of your command to achieve <em>'pipe like'</em> functionality.</p>
<pre><code>find . -name 'removeMe' -type f -execdir rm "{}" \;
</code></pre>
<p>Here, the <code>[COMMAND]</code> is <code>rm</code>, the <code>"{}"</code> is whatever find found (quoted to avoid <a href="https://www.gnu.org/software/bash/manual/html_node/Shell-Expansions.html">shell expansions</a>), and <code>\;</code> indicates the end of the <code>-execdir</code> command.</p>
<p>This example means <em>'remove all files named "removeMe" from <code>cwd</code> and its subdirectories'</em>.</p>
<p>There are a couple of things to keep in mind here:</p>
<h5>exec vs execdir</h5>
<p>Although most of the examples you'll see around use <code>-exec</code>, this launches the <code>[COMMAND]</code> from wherever you ran find from.</p>
<p>Instead, use <code>-execdir</code> to run the command from the directory containing each matched file.</p>
<h5>exec vs shell</h5>
<p>When we say that <em>"exec runs a given command"</em>, what we really mean is that find executes the command directly via the <a href="https://linux.die.net/man/3/exec">exec</a> family of functions, bypassing the shell. So shell-specific functions, aliases, piping and output redirection are not available.</p>
<p>This is why you'll commonly see something like <code>-exec bash -c "your_cool_cmd 'params' {}"\;</code>. This way, you can make full use of all of, in this case, <code>bash</code>'s niceties.</p>
<h5><code>\;</code> vs <code>\+</code></h5>
<p>You might find some examples ending with a <code>\+</code> instead of the <code>\;</code> shown above.</p>
<p>Simply put: <code>\;</code> tells <code>-exec</code> to run its command <strong>once per result</strong>, while with <code>\+</code> the command runs <strong>only once</strong>, receiving all of find's results as arguments in a single invocation.</p>
<p>So <code>\+</code> is more efficient but, depending on the use case, not always a good fit.</p>
<p>Read more about it <a href="https://www.everythingcli.org/find-exec-vs-find-xargs/">here</a>.</p>
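<p>You can watch the difference with a harmless <code>echo</code> (scratch paths for illustration):</p>
<pre><code>mkdir -p /tmp/find-batch-demo
touch /tmp/find-batch-demo/a /tmp/find-batch-demo/b

find /tmp/find-batch-demo -type f -exec echo {} \;   # two lines: one echo per file
find /tmp/find-batch-demo -type f -exec echo {} \+   # one line: a single echo with both files
</code></pre>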
<h2>Common use cases</h2>
<h3>Remove empty directories</h3>
<pre><code>find . -empty -type d -execdir rmdir "{}" \+
</code></pre>
<h3>Detailed results</h3>
<pre><code>find . -type f -name '*config*' -ls
</code></pre>
<p>Find all <code>config</code> files and print their properties as such:</p>
<pre><code>6454785 4 -rw-r--r-- 1 user user 147 jan 24 12:56 ./tsconfig.json
6454787 4 -rw-r--r-- 1 user user 41 jan 24 12:55 ./config.yml
6427340 4 -rw-r--r-- 1 user user 41 jan 24 12:56 ./node-config.js
</code></pre>
<h3>Path globs</h3>
<p>Say you want the <code>config</code> files under the <code>dotfiles/</code> directory but you don't know in which subdirectory they are.</p>
<pre><code>find . -type f -path "./dotfiles/*/config"
</code></pre>
<p>Will output the <code>config</code> files somewhere within the <code>dotfiles/</code> directory.</p>
<h3>Exclude specific path</h3>
<pre><code>find src/ -name '*.py' -not -path '*/site-packages/*'
</code></pre>
<p>Find all files ending in <code>.py</code>, while discarding the ones under <code>site-packages/</code>.</p>
<h3>Print only the path to a file</h3>
<pre><code>find . -name 'carmen-sandiego' -printf '%h\n'
</code></pre>
<p>Prints the relative path (from <code>cwd</code>) to the results <strong>excluding</strong> their name.</p>
<h3>Count stuff</h3>
<pre><code>find src/modules/UserLogin/ -type f -execdir wc -l "{}" \+
</code></pre>
<p>Will count how many lines are in each file under the <code>UserLogin</code> module, and print out a total as a bonus!</p>
<h2>Fancy things you can do</h2>
<h3>Clean up</h3>
<p>You are done 'legally' downloading music and want to clean up the left behind crap from your <code>Music/</code> directory:</p>
<pre><code>find Music/ -type f -not -iname "*.mp3" -not -iname "*.ogg" -not -iname "*.wma" -not -iname "*.m4a" -execdir rm -r "{}" \;
# This is just an example, for simple use cases prefer something like rm !(*.mp3|*.ogg|*.wma|*.m4a)
</code></pre>
<p>You could be more concise with a well put-together regex; the point is that you can achieve this sort of thing <strong>without it</strong>.</p>
<h3>More Execdir</h3>
<h4>Result-dependent sed</h4>
<pre><code>find lady/ -type f -name 'gaga' -execdir sed -i 's:dance:Just Dance:g' "{}" \;
</code></pre>
<p>Replace all occurrences of <code>dance</code> with <code>Just Dance</code> in any file named exactly <code>gaga</code> within the <code>lady</code> directory.</p>
<p>Learn more about <a href="../how-to-sed">sed</a>.</p>
<h4>Remove trailing spaces from directories</h4>
<pre><code>find . -name '* ' -execdir bash -c 'mv "$1" "${1%"${1##*[^[:space:]]}"}"' _ "{}" \;
</code></pre>
<p>Yep.</p>
<h4>Redirect output</h4>
<pre><code>find a_place/ -execdir bash -c 'do_something_cool_on "$1" > "${1}_processed"' _ "{}" \;
</code></pre>
<p>Here we create a new file for each match processed by find.</p>
<h4>Pipe output</h4>
<pre><code>find . -mindepth 1 -maxdepth 1 -type d ! -exec sh -c 'ls "$1" | grep -qiE "list|downloaded"' _ "{}" \; -print
</code></pre>
<p>Translates to: <em>'List all directories directly under <code>cwd</code> that do <strong>not</strong> contain a file called <code>list</code> or <code>downloaded</code>'</em>. Note the <code>-E</code> on grep (so <code>|</code> means 'or'), the <code>!</code> negating the <code>-exec</code> test, and the trailing <code>-print</code>, needed because the presence of <code>-exec</code> suppresses find's default print action.</p>
How to sedhttps://devintheshell.com/blog/how-to-sed/https://devintheshell.com/blog/how-to-sed/Basics and not-so-basicsSat, 11 Dec 2021 18:09:19 GMT<p>Dive much deeper into sed <a href="https://github.com/adrianscheff/useful-sed">here</a> and <a href="https://alexharv074.github.io/2019/04/16/a-sed-tutorial-and-reference.html">here</a>.</p>
<h5>Keep in mind</h5>
<p>Not all sed implementations are created equal.</p>
<p>This post is about the GNU version as it has a lot of cool features that OSX, the various BSDs and Busybox variants are missing.</p>
<h2>The basics</h2>
<p>Sed stands for <strong>S</strong>tream <strong>ED</strong>itor, you can edit a stream like this:</p>
<pre><code>echo "searching, seek and destroy" | sed 's/seek/destroy/g'
</code></pre>
<p>Or run the program directly on a file like this:</p>
<pre><code>sed 's/seek/destroy/g' lightning.md
| | |
sed 'do_this' on_this
</code></pre>
<p>Let's break down the <em>'do_this'</em> part:
Sed will <strong>S</strong>ubstitute <code>seek</code> with <code>destroy</code> <strong>G</strong>lobally within <code>lightning.md</code>.</p>
<p>Note that sed operates on a <strong>per-line</strong> basis, so when we determine the scope (<strong>G</strong>lobal in the example), we are referring to the scope within each line.</p>
<p>As is the case with most terminal utilities, sed outputs to <code>stdout</code> by default, so no changes will be made to our <code>lightning.md</code> file.
We can pass it the <code>-i</code> flag to make the changes '<strong>i</strong>n place', i.e. overwrite the original file.</p>
<p>Of course, we can also redirect its output to a different file with <code>></code>.</p>
<p>So given a file like:</p>
<pre><code>This line contains the word line twice
This line also contains the word line twice
</code></pre>
<p>If we run a sed command like <code>sed 's/line/potato/' test-one-line.md</code>, it would print the following to <code>stdout</code>:</p>
<pre><code>This potato contains the word line twice
This potato also contains the word line twice
</code></pre>
<p>Notice how we didn't use the <strong>G</strong>lobal scope, so sed replaced only the first instance of <code>line</code> on <strong>both</strong> lines.</p>
<p>Using the <code>-i</code> flag it will overwrite the file instead of printing to <code>stdout</code>.</p>
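<p>For instance (using a scratch file for illustration):</p>
<pre><code>printf 'seek and destroy\n' > /tmp/lightning.md
sed -i 's/seek/destroy/' /tmp/lightning.md
cat /tmp/lightning.md   # destroy and destroy
</code></pre>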
<h2>Quality of life</h2>
<h3>Always quote</h3>
<p>Notice the <code>'</code> in <code>sed 's/seek/destroy/g'</code>.
This prevents any regex we might use from leaking out to the shell.</p>
<h3>Extended Regex</h3>
<p>By default, only basic regex is enabled, which lets you use some special characters (like <code>.</code> or <code>*</code>) while others
(like <code>+</code> or <code>?</code>) are taken literally.</p>
<p>We can choose to use <strong>E</strong>xtended regex by passing the <code>-E</code> flag to the command. Give this a try if you find your regex to not work as expected.</p>
<p>Learn more about regex <a href="../how-to-regex">here</a>.</p>
<h3>Pick a convenient delimiter</h3>
<p>Usually, sed examples are shown with the <code>/</code> char as a delimiter.</p>
<p>For this to work, all <code>/</code> within the command need to be escaped.</p>
<p>You might find it useful to switch delimiter, especially when using sed on paths:</p>
<p><code>sed 's/\/bin\/bash\//\/bin\/sh\//g'</code> -> <code>sed 's:/bin/bash/:/bin/sh/:g'</code> or <code>sed 's_/bin/bash/_/bin/sh/_g'</code></p>
<p>Sed doesn't really care <strong>what</strong> you use as long as you are consistent with it.</p>
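<p>For example, rewriting a shebang without a single escaped slash:</p>
<pre><code>echo '#!/bin/bash' | sed 's:/bin/bash:/bin/sh:'
# -> #!/bin/sh
</code></pre>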
<h2>Simple but useful</h2>
<h3>Remove all EOL spaces</h3>
<pre><code>sed 's/\s\+$//'
</code></pre>
<p>Remove all spaces at the end of all lines in the given file.</p>
<p>The <code>\s</code> is simply a way of representing white spaces. You can learn more about it <a href="../how-to-regex">here</a>.</p>
<h3>Delete all instances of word</h3>
<pre><code>sed 's/foo//g'
</code></pre>
<p>Delete all instances of <code>foo</code>.</p>
<p>You might be tempted to use something like <code>s/.*foo.*//g</code> to delete any line containing <code>foo</code>.</p>
<p>Don't, it will leave an empty line in its place.
There is a <a href="#delete">delete</a> command for this use case.</p>
<h3>Only in nth instance</h3>
<pre><code>sed 's/lorem/ipsum/2'
</code></pre>
<p>Substitute <code>lorem</code> for <code>ipsum</code> <strong>only</strong> on the <code>2nd</code> instance of <code>lorem</code> of every line.</p>
<h3>Only from nth instance</h3>
<pre><code>sed 's/lorem/ipsum/2g'
</code></pre>
<p>Substitute <code>lorem</code> for <code>ipsum</code> <strong>from</strong> the <code>2nd</code> instance of <code>lorem</code> of every line, until the end of the line.</p>
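<p>Both variants in action:</p>
<pre><code>echo 'la la la la' | sed 's/la/LA/2'    # la LA la la
echo 'la la la la' | sed 's/la/LA/2g'   # la LA LA LA
</code></pre>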
<h2>The not so basics</h2>
<h3>Only on matching lines</h3>
<pre><code>sed '/^foo/ s/hi/mom/' file
</code></pre>
<p>Substitute <code>hi</code> for <code>mom</code> only on lines that start with <code>foo</code>.</p>
<p>For example, to migrate CSS classes from snake_case to camelCase, without compromising their properties, you might use
something like:</p>
<pre><code>sed -E '/\{$/ s_*(\w+?)_\u\1_g' file.css
</code></pre>
<p>Which <strong>only</strong> does the thing in lines that end with <code>{</code>.</p>
<p>If that looks like a bunch of random symbols to you, check out <a href="../how-to-regex">this post</a>.</p>
<h3>Between matching lines</h3>
<p>You can apply a command only <strong>within</strong> a certain (variable) range:</p>
<pre><code>sed '/#region/,/#endregion/s/foo/bar/' file.cs
</code></pre>
<h3>Re-use the match</h3>
<p>You can use <code>&</code> to represent the match:</p>
<pre><code>echo "what a nice example, this is a cool program!" | sed -E 's/nice|cool/VERY&/g'
</code></pre>
<p>Would output:</p>
<pre><code>what a VERYnice example, this is a VERYcool program!
</code></pre>
<h3>Case-insensitive</h3>
<p>You can add an <code>i</code> at the end to make the match case-insensitive:</p>
<pre><code>sed 's/foo/bar/gi'
</code></pre>
<p>Which means:</p>
<pre><code>foo Foo -> bar bar
</code></pre>
<h3>Negate matches</h3>
<p>You can tell sed to do its magic only on lines <strong>not</strong> matching a given pattern:</p>
<pre><code>sed '/^foo bar baz.*/! s/foo bar/hi mom/' afile.txt
</code></pre>
<p>This would substitute <code>foo bar</code> for <code>hi mom</code> except in lines that start with <code>foo bar baz</code>.</p>
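<p>A runnable version of the same idea:</p>
<pre><code>printf 'foo bar baz qux\nfoo bar qux\n' | sed '/^foo bar baz/! s/foo bar/hi mom/'
# foo bar baz qux
# hi mom qux
</code></pre>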
<h3>Output replacements to separate file</h3>
<p>You can write the lines affected by sed to a separate file with <code>w</code>:</p>
<pre><code>sed 's_foo_bar_w replacementsFile' fileToModify
</code></pre>
<h3>Substitute multiple lines</h3>
<p>By default, sed uses <code>\n</code> chars as line delimiters, so multi-line substitutions are non-trivial.</p>
<p>Thankfully, the GNU version supports the <code>-z</code> flag, which tells sed to use <code>NUL</code> as the line delimiter.</p>
<p>This allows you to get a bit fancy and do things like:</p>
<pre><code>sed -z 's_line one\nline two_merged lines one and two_g'
</code></pre>
<blockquote>
<p>Consider however, that this means that <code>^</code> and <code>$</code> now refer to the start and end <strong>of the whole file</strong> (terminated by <code>NUL</code>) instead of each line, which also
affects the <code>g</code> at the end of the command.</p>
</blockquote>
<p>Sadly, non GNU implementations of sed <a href="https://unix.stackexchange.com/a/26290">require a bit more</a> <em>'sed-Fu'</em> to achieve
this.</p>
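<p>Here's a runnable version of the example above (GNU sed):</p>
<pre><code>printf 'line one\nline two\n' | sed -z 's_line one\nline two_merged lines one and two_'
# -> merged lines one and two
</code></pre>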
<h3>Groupings and References</h3>
<p>You can leverage the magic of Groupings and References to, for example, switch words around:</p>
<pre><code>sed -E 's:([a-zA-Z]*) ([a-zA-Z]*):\2 \1:' file
</code></pre>
<p>Which means:</p>
<pre><code>World Hello -> Hello World
</code></pre>
<p>Want a better use case?</p>
<pre><code>sed -E 's_(.+?)\[(.+?)\]\(([^)]+)\)(.+?)_\1\2[^\3]\4\n\n\n[^\3]: \3\n_g' book.md
</code></pre>
<p><img src="./sweat.webp" alt="sweat" /></p>
<p>Let's take it apart:</p>
<h4>Search</h4>
<p>The <em>'search'</em> part looks like this: <code>(.+?)\[(.+?)\]\(([^)]+)\)(.+?)</code>.</p>
<p>The first and last groupings are pretty simple: <em>'whatever goes before/after the mess in between'</em>.</p>
<p>That leaves us with <code>\[(.+?)\]\(([^)]+)\)</code>, which looks like a mess because we <strong>have</strong> to escape a lot of parentheses and square brackets.</p>
<p>There are two distinct zones to this regex: <code>\[(.+?)\]</code> and <code>\(([^)]+)\)</code>.</p>
<p>The first means <em>'everything inside [squared parenthesis]'</em>, while the second could also be written like <code>\((.+?)\)</code> (which is pretty much the same as the other one, except for the different parenthesis).</p>
<p>Want to know why to use one instead of the other? Check out <a href="../how-to-regex#beware-the-greed">this post</a>.</p>
<p>So we have four groups:</p>
<ol>
<li>Everything before</li>
<li>Everything within <code>[]</code></li>
<li>Everything within <code>()</code></li>
<li>Everything after</li>
</ol>
<h4>Replace</h4>
<p>On the other hand, the <em>'replace'</em> part reads <code>\1\2[^\3]\4\n\n\n[^\3]: \3\n</code>.</p>
<p>We can see that there are two parts to this mess: <code>\1\2[^\3]\4</code> and <code>[^\3]: \3</code>, with a bunch of line breaks (<code>\n</code>) here and there.</p>
<p>Notice also how the <em>'[square brackets]'</em> are not escaped here.</p>
<p>The first part simply removes all the parentheses from the match, while enclosing the third grouping in square brackets and prepending it with a <code>^</code>.</p>
<p>So <code>text [looks like](a-link) more text</code> becomes <code>text looks like[^a-link] more text</code>.</p>
<p>The second half repeats the previous behavior regarding the third grouping while adding it again after a <code>:</code> and a white space.</p>
<p>Taking into account the line breaks, <code>text [looks like](a-link) more text</code> becomes:</p>
<pre><code>text looks like[^a-link] more text
[^a-link]: a-link
</code></pre>
<p>So we successfully turned Markdown links into Markdown references, without breaking the rest of the line.</p>
<p>Keep in mind that this command will hammer through Markdown images (<code>![...](...)</code>) as well.
You might want to negate those matches with something like <code>/!.*/!</code>.</p>
<p>Also, this command won't behave nicely on lines with two or more links.</p>
<p>Was it a headache? Yes.</p>
<p>Was it more of a headache than doing it by hand on 400+ pages, heavily referenced book? Hell no!</p>
<h3>Change cases</h3>
<p>Here are some of the GNU specific goodies mentioned earlier:</p>
<pre><code>\l Turn the next character to lowercase.
\L Apply \l until a \U or \E is found.
\u Turn the next character to uppercase.
\U Apply \u until a \L or \E is found.
\E End case conversion started by \L or \U.
</code></pre>
<p>So to give a simple example, you can ensure all headings in a <code>.md</code> file start with upper case letters by running this:</p>
<pre><code>sed -E 's/^(#+) (\w+)/\1 \u\2/' cases.md
</code></pre>
<p>Which means:</p>
<pre><code>## all caps -> ## All caps
</code></pre>
<h3>Concatenate multiple commands</h3>
<p>Sometimes doing everything in one go is a bit of a headache or actually impossible.</p>
<p>You can pipe sed commands using the shell (<code>|</code>) or adding the <code>-e</code> flag before them:</p>
<pre><code>sed -Ee 's/(^#+) (\w+)/\1 \u\2/' -e 's/foo/bar/g' cases.md
</code></pre>
<p>This way, the file is read <strong>once</strong> and the commands are run one after the other on each line.</p>
<h2>More than substitutions</h2>
<p>Sed is a stream <strong>editor</strong>, so you can do much more than substitutions with it.</p>
<h3>Delete</h3>
<p>To delete any line containing the word <code>vim</code> you could do:</p>
<pre><code>sed '/vim/d' file
</code></pre>
<p>For a more useful example, you could delete empty lines with:</p>
<pre><code>sed '/^$/d' file
</code></pre>
<p>Or delete commented lines (starting with <code>#</code>) like so:</p>
<pre><code>sed '/^#/d' file
</code></pre>
<p>Or negate the whole thing and delete everything <strong>but</strong> commented lines:</p>
<pre><code>sed -E '/^#/!d' file
</code></pre>
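<p>A runnable version, chaining two deletes with <code>-e</code>:</p>
<pre><code>printf '# a comment\ncode\n\nmore code\n' | sed -e '/^#/d' -e '/^$/d'
# code
# more code
</code></pre>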
<h3>Print</h3>
<p>You can tell sed to print the lines where replacements are made with <code>p</code>:</p>
<pre><code>sed 's/foo/bar/p' file
</code></pre>
<p>You can also simulate <a href="../how-to-grep">grep-like</a> behavior with something like <code>sed '/re/p' file</code> (<a href="https://en.wikipedia.org/wiki/Grep">familiar?</a>), which would simply print all lines containing <code>re</code>.</p>
<p>Of course, sed also prints every line it reads by default, so you end up with the lines you are interested in printed twice.</p>
<p>Use the <code>-n</code> flag to make it behave as expected (which is to <strong>only</strong> print matching lines).</p>
<p>For a more practical example, you can print the lines between two matches:</p>
<pre><code>sed -nE '/between-this/,/and-this/p' file
</code></pre>
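<p>For instance:</p>
<pre><code>printf 'a\nSTART\nb\nEND\nc\n' | sed -n '/START/,/END/p'
# START
# b
# END
</code></pre>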
<h3>Append, Insert and Change</h3>
<p>Append text on a new line after each line containing the given text:</p>
<pre><code>sed '/foo/a\AFTER FOO' file
</code></pre>
<p>Insert text on a new line before each line containing the given text:</p>
<pre><code>sed '/foo/i\BEFORE FOO' file
</code></pre>
<p>Change line containing the given text:</p>
<pre><code>sed '/bar/c\BAR IS CHANGED' file
</code></pre>
How to Regexhttps://devintheshell.com/blog/how-to-regex/https://devintheshell.com/blog/how-to-regex/And not lose your sanityTue, 07 Dec 2021 09:57:58 GMT<p>There are plenty of super useful <a href="../series/cli-fu/">CLI utilities</a>, many of which you should already have in your system. To get the most out of them, some basic understanding of common regex patterns is needed.</p>
<p>Keep in mind that not all Regex engines are created equal and their implementations and valid patterns may vary a bit. However, the general concepts should be more or less the same.</p>
<h2>The basics</h2>
<h3>Ranges</h3>
<p>Use <code>[]</code> to match whatever falls within the given range.</p>
<p><code>[abc]</code> ➡️ <em>'a'</em> or <em>'b'</em> or <em>'c'</em>.</p>
<p><code>[a-z]</code> ➡️ Any char between <em>'a'</em> and <em>'z'</em>. It may or may not include diacritics.</p>
<p><code>[a-zA-Z0-9]</code> ➡️ Any alphanumeric char either lower or <strong>upper</strong> case.</p>
<p>You can negate them with <code>^</code>:</p>
<p><code>[^a-z]</code> ➡️ Any char <strong>not</strong> between <em>'a'</em> and <em>'z'</em>.</p>
<p>More on how to negate matches <a href="#negations">below</a>.</p>
<h3>The Dot</h3>
<p>Use it to match any char, usually except new lines.</p>
<p><code>.</code> ➡️ Any <strong>one</strong> char.</p>
<p><code>..</code> ➡️ Any <strong>two</strong> chars (not necessarily the same ones).</p>
<h3>Multipliers</h3>
<p>Use them to match <strong>any number</strong> of the previous item.</p>
<p><code>a+</code> ➡️ <strong>1</strong> or <strong>more</strong> instances of <code>a</code></p>
<p><code>ab+</code> ➡️ <code>a</code> followed by <strong>1</strong> or <strong>more</strong> instances of <code>b</code> (so <code>ab</code>, <code>abb</code>, and so on)</p>
<p><code>.+</code> ➡️ Any char <strong>1</strong> or <strong>more</strong> times</p>
<p><code>.*</code> ➡️ Any char <strong>0</strong> or <strong>more</strong> times</p>
<p><code>.?</code> ➡️ Any char <strong>0</strong> or <strong>1</strong> times</p>
<h4>Greedy vs Lazy matches</h4>
<p>What would you expect to happen if you pass a string like <code><body>Banana</body></code> through a regex like <code><.*></code>?</p>
<p>You might be surprised to find that it likely would <strong>not</strong> match <code><body></code> nor <code></body></code>. In fact, it would most likely match the whole <code><body>Banana</body></code> instead.</p>
<p>By default, most regex engines' <code>+</code> and <code>*</code> multipliers are <strong>greedy</strong>, which means that they will try to match <strong><a href="https://stackoverflow.com/a/2301298/14385510">as much as possible</a></strong>.</p>
<p>A <strong>lazy</strong> match is probably what you want in most cases, and you usually get that by adding <code>?</code> to the multiplier: so using <code><.*?></code> instead will match <code><body></code> and/or <code></body></code>.</p>
<p>If you want to get real <em>fancy</em> you could also use <code><[^>]+></code> to achieve this, which should be understandable by this point. It's usually more <a href="https://www.regular-expressions.info/repeat.html">efficient</a>, but be careful, regular expressions get out of hand real fast.</p>
<p>So remember, if you are having trouble with <code>.*</code> (or <code>.+</code>), try using <code>.*?</code> (or <code>.+?</code>) instead.</p>
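<p>You can verify the greedy/lazy difference yourself; a quick sketch with Python's <code>re</code> module:</p>

```python
import re

html = "<body>Banana</body>"

# Greedy: '.*' grabs as much as possible, swallowing the whole string.
print(re.findall(r"<.*>", html))     # ['<body>Banana</body>']

# Lazy: '.*?' stops at the first closing '>'.
print(re.findall(r"<.*?>", html))    # ['<body>', '</body>']

# Negated-class alternative, usually more efficient:
print(re.findall(r"<[^>]+>", html))  # ['<body>', '</body>']
```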
<h3>Numbered Multipliers</h3>
<p>Instead of matching <strong>any number</strong>, these match <strong>a given number or range of numbers</strong> of the previous item.</p>
<p><code>a{5}</code> ➡️ <em>'aaaaa'</em>.</p>
<p><code>a{1,5}</code> ➡️ Between <strong>1</strong> and <strong>5</strong> consecutive <em>'a'</em>.</p>
<p>What's cool about them is that they can behave like more flexible versions of the <code>?</code> and <code>+</code> multipliers:</p>
<p><code>a{3,}</code> ➡️ <strong>3 or more</strong> <em>'a'</em>.</p>
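<p>The same examples, checked with Python's <code>re</code> module:</p>

```python
import re

# {n} is an exact count, {m,n} a range, {n,} an open-ended minimum.
print(re.fullmatch(r"a{5}", "aaaaa") is not None)  # exactly five 'a'
print(re.findall(r"a{1,5}", "aaaaaaa"))            # greedy: ['aaaaa', 'aa']
print(re.fullmatch(r"a{3,}", "aa") is not None)    # False: needs 3 or more
```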
<h2>The not so basics</h2>
<h3>Short-hands</h3>
<p>Regex can get hard to write and read, and there are certain structures we often want to match against.</p>
<p>To make our life easier, we can use short-hands (if your regex engine supports them):</p>
<pre><code>\s ➡️ a whitespace.
\S ➡️ anything but a whitespace (opposite of \s).
\d ➡️ a digit (0-9).
\D ➡️ anything but a digit (opposite of \d).
\w ➡️ a 'word' char (shorthand for [a-zA-Z0-9_]).
\W ➡️ anything but a 'word' char (opposite of \w).
</code></pre>
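<p>A quick Python sketch of the short-hands at work (the sample string is made up):</p>

```python
import re

line = "Order_42 shipped on 2021-10-03"

print(re.findall(r"\d+", line))  # runs of digits: ['42', '2021', '10', '03']
print(re.findall(r"\w+", line))  # 'word' runs: letters, digits and '_'
print(re.split(r"\s+", line))    # split on runs of whitespace
```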
<h3>Anchors</h3>
<p>You might need a regex to only match at the beginning or the end of a line. For this, we use anchors like <code>^</code> and <code>$</code>:</p>
<p><code>^</code> ➡️ Start of the line.</p>
<p><code>$</code> ➡️ End of the line.</p>
<p><code>\b</code> ➡️ Word boundary (beginning or end of word).</p>
<p>So, for a regex like <code>\bFOO$</code>:</p>
<p><code>FOO</code> in <code>What a nice line of text BAR FOO</code> would match.</p>
<p><code>FOO</code> in <code>What a nice line of text BARFOO</code> would not.</p>
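<p>The same anchor example, checked with Python's <code>re</code> module:</p>

```python
import re

# \bFOO$ needs a word boundary before FOO and the end of the string after it.
print(re.search(r"\bFOO$", "What a nice line of text BAR FOO") is not None)  # boundary at the space
print(re.search(r"\bFOO$", "What a nice line of text BARFOO") is not None)   # no boundary inside BARFOO
```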
<h3>Multiple matches</h3>
<p>Just like in an <code>if</code> statement, you can match against more than one expression:</p>
<p><code>foo|bar</code> ➡️ Would match either <code>foo</code> <strong>or</strong> <code>bar</code>.</p>
<h3>Escaping special chars</h3>
<p>What if we want our regex to match some of the special chars we've seen (like <code>$</code>, <code>[</code> or <code>+</code>) <strong>literally</strong>?</p>
<p>We would need to <strong>escape</strong> them by putting a <code>\</code> in front of them.</p>
<p>If we take our previous example and escape the <code>$</code>: <code>\bFOO\$</code>:</p>
<p><code>FOO</code> in <code>What a nice line of text BAR FOO$ something else</code> would match.</p>
<p><code>FOO</code> in <code>What a nice line of text BAR FOO</code> would not.</p>
<p>If you come across a scary-looking, unreadable regex, this is probably the main culprit. Don't let the <code>\</code> scare you!</p>
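<p>Here's the escaped example in Python's <code>re</code> module (note how <code>\$</code> now means a literal dollar sign instead of end-of-line):</p>

```python
import re

print(re.search(r"\bFOO\$", "What a nice line of text BAR FOO$ something else") is not None)
print(re.search(r"\bFOO\$", "What a nice line of text BAR FOO") is not None)  # no literal '$'
```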
<h3>Grouping and References</h3>
<p>One neat trick that most regex engines will allow you to do is <strong>grouping</strong> parts of the match and <strong>referencing</strong> them later in the regex.</p>
<p>One regex can have multiple groups and these get referenced by their number (starting with <strong>1</strong>).</p>
<p>You surround the <strong>group</strong> in <code>()</code> and reference it with <code>\</code> followed by the group's number:</p>
<p><code>(foo)-(bar) \2\1</code> ➡️ Will match <code>foo-bar barfoo</code> (notice the spaces).</p>
<p>If you know how <a href="../how-to-sed">sed</a> works, you can probably imagine this can save <strong>a lot</strong> of headaches.</p>
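<p>A quick check of the group example in Python, plus a substitution, which is where references really pay off:</p>

```python
import re

# \1 and \2 refer back to the first and second captured group.
print(re.search(r"(foo)-(bar) \2\1", "foo-bar barfoo") is not None)

# Groups shine in substitutions: swap the two captured words around.
print(re.sub(r"(foo)-(bar)", r"\2-\1", "foo-bar"))  # 'bar-foo'
```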
<h3>Negations</h3>
<p>You can negate parts of your regex using <a href="https://www.regular-expressions.info/lookaround.html">lookarounds</a>.</p>
<p>Say you want to match all instances of <code>foo</code> <strong>followed</strong> by anything but <code>bar</code>, followed by <code>baz</code>.
So for example, we want <code>foowhateverbaz</code> to match but not <code>foobarbaz</code>.</p>
<p>A <em>lookahead</em> like <code>foo(?!bar).+?baz</code> would do just that: it negates the part of the regex inside the parentheses marked with <code>?!</code>.</p>
<p>It simply means <em>'not followed by <code>(?!this)</code>'</em>.</p>
<p>Similarly, you might want to go about this the other way around.</p>
<p>If you want to match all instances of <code>foo</code> except when it is <strong>preceded</strong> by <code>bar</code>, you could use a <em>lookbehind</em> like <code>(?<!bar)foo</code>.</p>
<p>So <code>This whateverfoo is weird</code> would match.</p>
<p>But <code>This barfoo is weird</code> would not.</p>
<p>It simply means <em>'not preceded by <code>(?<!this)</code>'</em>.</p>
<p>Both <strong>lookaheads</strong> and <strong>lookbehinds</strong> can be used to match a pattern while negating another one.
Which one to use just depends on whether you want to negate something <strong>before</strong> or <strong>after</strong> something else.</p>
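<p>Not every engine supports lookarounds, but Python's <code>re</code> does, so here's a quick sketch of both examples:</p>

```python
import re

# Lookahead: 'foo' NOT followed by 'bar', then anything (lazily), then 'baz'.
print(re.search(r"foo(?!bar).+?baz", "foowhateverbaz") is not None)  # True
print(re.search(r"foo(?!bar).+?baz", "foobarbaz") is not None)       # False

# Lookbehind: 'foo' NOT preceded by 'bar'.
print(re.search(r"(?<!bar)foo", "This whateverfoo is weird") is not None)  # True
print(re.search(r"(?<!bar)foo", "This barfoo is weird") is not None)       # False
```

Note that Python's lookbehinds must be fixed-width (you can't put `+` or `*` inside `(?<!...)`).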
Software Architecture for Noobshttps://devintheshell.com/blog/arch-for-noobs/https://devintheshell.com/blog/arch-for-noobs/A place to start learning about Software ArchitectureSat, 27 Nov 2021 12:14:39 GMT<p>A series of posts overviewing code architecture patterns I've seen used in one way or another, either directly or through the ideas they bring to the table.</p>
<p>Best to read them in order, since newer concepts are often better understood with the previous ones in mind.</p>
<p>These posts are not about low-level architecture or whiteboard-style systems design. Rather, we go over different ways to structure code to make it (hopefully) more maintainable and easier to work with in the long run.</p>
<p>Apart from reading the source material for each architecture, you can go into much more detail with <a href="https://herbertograca.com/2017/07/03/the-software-architecture-chronicles/">this great series of posts</a>.</p>
<p>Seriously, go check that out, it's great. In the meantime:</p>
<ol>
<li><a href="../mvc-arch">MVC-MVVM</a></li>
<li><a href="../ebi-arch">EBI</a></li>
<li><a href="../ddd-tactics">DDD (Tactical)</a></li>
<li><a href="../ddd-strategy">DDD (Strategical)</a></li>
<li><a href="../ports-and-adapters">Ports & Adapters</a></li>
<li><a href="../onion-arch">Onion</a></li>
<li><a href="../clean-arch">Clean</a></li>
</ol>
<p>Have fun!</p>
Clean Architecturehttps://devintheshell.com/blog/clean-arch/https://devintheshell.com/blog/clean-arch/Integrating previous designsTue, 23 Nov 2021 17:20:20 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>This is Uncle Bob's attempt at synthesizing previous architectural patterns and concepts.</p>
<p>Based on a common thread between <a href="../ports-and-adapters">Ports & Adapters</a>, <a href="../onion-arch">Onion</a>, and <a href="../ebi-arch">EBI</a>, he wrote an <a href="https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html">article</a> as <em>'an attempt at integrating all these architectures into a single actionable idea'</em>.[^1]</p>
<p>[^1]: From the <a href="https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html">original article</a></p>
<h2>High level</h2>
<p>The original article starts with a diagram similar to this:</p>
<p><img src="./clean-arch-mine.webp" alt="diagram" /></p>
<h3>Project structure</h3>
<p>At face value, the diagram seems to suggest a project structure as such:</p>
<pre><code>Entities
├── Tenant
├── Landlord
├── Rent
Repositories
├── TenantRepo
├── LandlordRepo
├── RentRepo
UseCases
├── PayRent
├── RequestRent
├── RequestRepair
├── CalculateRent
...
</code></pre>
<p>There are however a couple of shortcomings here:</p>
<ul>
<li><strong>Low cohesion</strong>: Modify one Use Case, and you'll have to change code in three different modules</li>
<li><strong>No clear purpose</strong>: A newcomer would have to dig through the directory structure to know what the application is for</li>
</ul>
<p>Borrowing some key concepts like <a href="../ddd-strategy/#bounded-context">Bounded Context from DDD</a>, the previous system can be represented as such:</p>
<pre><code>Tenant
├── Tenant
├── TenantRepo
├── PayRent
├── RequestRepair
Landlord
├── Landlord
├── LandlordRepo
├── RequestRent
Rent
├── Rent
├── RentRepo
├── CalculateRent
...
</code></pre>
<h2>Low level</h2>
<p>There's a pretty useful diagram in the slides Bob uses in his conferences.</p>
<p><img src="./clean-arch-flow-diagram.webp" alt="bobs-flow" /></p>
<p>Cool, but a bit overwhelming. Let's start with Entity and Interactor[^2] and build the diagram with concepts from other notable architectures.[^3]</p>
<p>[^2]: From the <a href="../ebi-arch">EBI architecture</a>
[^3]: <a href="../arch-for-noobs">Start here</a></p>
<p><img src="./clean-arch-diagram-1.png" alt="diagram-1" /></p>
<p>Think of the Interactor as the implementation of a use case of the application.</p>
<p>Thanks to <a href="../ports-and-adapters">Ports & Adapters</a>, we know that a use case should be defined as an abstraction to ensure inwards dependency.</p>
<p><img src="./clean-arch-diagram-2.png" alt="diagram-2" /></p>
<p>So if a use case is an abstract concept of <em>'what needs to be done'</em>, the Interactor is the concrete implementation of <em>'how exactly it will happen'</em>.</p>
<p>Entities usually require some sort of persistence, we can use a Driven Port/Adapter pair for that.</p>
<p><img src="./clean-arch-diagram-3.png" alt="diagram-3" /></p>
<p>Let's also represent the actor that will use the Driver Port from before, as well as the data structure (DTO) that will be shared between it and the Interactor.</p>
<p><img src="./clean-arch-diagram-4.png" alt="diagram-4" /></p>
<p>The controller could just as well be a CLI or a GUI.</p>
<p>Using a DTO here allows us to avoid exposing the Domain Entities outside the boundaries of the application. Speaking of boundaries.</p>
<p><img src="./clean-arch-diagram-5.png" alt="diagram-5" /></p>
<p>The red line marks the limit of our application logic and separates it from the pieces of the system that are necessarily coupled to the Infrastructure.</p>
<p>Now suppose that, when something happens in the system, we want to notify the user or update a UI element.</p>
<p><img src="./clean-arch-diagram-6.png" alt="diagram-6" /></p>
<p>Or using Uncle Bob's terminology:</p>
<p><img src="./clean-arch-diagram-7.png" alt="diagram-7" /></p>
<p>With this, we are back to his original diagram.</p>
<p>Let's review how the flow of execution would go for an incoming HTTP request:</p>
<ol>
<li>The Request reaches the Controller</li>
<li>The Controller:
<ol>
<li>Dismantles the Request and creates a Request Model with the relevant data</li>
<li>Through the Boundary, triggers the Interactor</li>
</ol>
</li>
<li>The Interactor:
<ol>
<li>Finds the relevant Entities through the Entity Gateway</li>
<li>Orchestrates interactions between entities</li>
<li>Creates a Response Model with the relevant data and sends it to the Presenter through the Boundary</li>
</ol>
</li>
</ol>
<p>The diagram in the lower right corner of the first <a href="https://blog.cleancoder.com/uncle-bob/images/2012-08-13-the-clean-architecture/CleanArchitecture.jpg">image on the original post</a> might help visualize what's going on.</p>
<p><img src="./flow.webp" alt="flow" /></p>
<p>Swap <em>'Use Case Output/Input Port'</em> for <em>'Boundary'</em>.</p>
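<p>If it helps, here's a minimal Python sketch of that request flow. All the names (<code>PayRentInteractor</code>, <code>RequestModel</code>, and so on) are illustrative, not from the original article:</p>

```python
from dataclasses import dataclass
from typing import Protocol

# --- Application core (inside the boundary) ---------------------------

@dataclass
class RequestModel:    # DTO crossing the Boundary inwards
    tenant_id: int
    amount: float

@dataclass
class ResponseModel:   # DTO crossing the Boundary outwards
    receipt: str

class PayRentBoundary(Protocol):  # the 'Boundary' (use case abstraction)
    def execute(self, request: RequestModel) -> ResponseModel: ...

class PayRentInteractor:
    """Implements the use case; knows nothing about HTTP."""
    def execute(self, request: RequestModel) -> ResponseModel:
        # ...find Entities via the Entity Gateway, orchestrate them...
        return ResponseModel(receipt=f"rent-{request.tenant_id}-paid")

# --- Infrastructure (outside the boundary) ----------------------------

class Controller:
    """Dismantles the raw request and triggers the Interactor."""
    def __init__(self, use_case: PayRentBoundary) -> None:
        self.use_case = use_case

    def handle(self, raw: dict) -> str:
        request = RequestModel(tenant_id=raw["tenant_id"], amount=raw["amount"])
        return self.use_case.execute(request).receipt

print(Controller(PayRentInteractor()).handle({"tenant_id": 7, "amount": 500.0}))
# rent-7-paid
```

Notice the Controller only ever sees the <code>PayRentBoundary</code> abstraction, so the dependency points inwards.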
Onion Architecturehttps://devintheshell.com/blog/onion-arch/https://devintheshell.com/blog/onion-arch/Make layers great againSun, 10 Oct 2021 14:02:46 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>It helps to think of this as an update to <a href="../ports-and-adapters">Ports & Adapters</a> that brings more fine-grained control over the <strong>Application Core</strong>.</p>
<h2>Bringing back the layers</h2>
<p>When going over Ports & Adapters we saw that it sort of got rid of the layers. More specifically, it only (implicitly) left two of them:</p>
<ul>
<li>An <strong>external</strong> layer (with Adapters and the relevant infrastructure).</li>
<li>An <strong>internal</strong> one (with pretty much everything else).</li>
</ul>
<p>As you might imagine, especially in bigger applications, having most of the code bundled in a single layer can get... Complicated.</p>
<p>That's where <a href="https://jeffreypalermo.com/2008/07/the-onion-architecture-part-1/">Jeffrey Palermo's</a> Onion Architecture shines.</p>
<p>In a nutshell, it's a more detailed specification regarding how to organize what remains of our code after defining where the <strong>boundaries</strong> (Ports & Adapters) are.</p>
<p><img src="./onion-architecture.webp" alt="onion-architecture" /></p>
<p>Let's have a look at these layers, starting from the core of the Onion.</p>
<h2>Domain Model</h2>
<p>This would be our core business Entities, enriched with their corresponding rules and logic.</p>
<blockquote>
<p>[...] the state and behavior combination that models truth for the organization.
[^1]</p>
</blockquote>
<p>[^1]: From Palermo's <a href="https://jeffreypalermo.com/2008/07/the-onion-architecture-part-1/">original post</a>.</p>
<p>This part of the system should change <strong>only</strong> if the most essential business rules change, which doesn't usually happen (if ever).[^2]</p>
<p>[^2]: When these changes occur, they are usually accompanied by structural, organizational changes.</p>
<h2>Domain Services</h2>
<p>Similar to how <a href="../ebi-arch#interactor">Interactors</a> managed the interactions between Entities in the <a href="../ebi-arch">EBI Architecture</a>, all Domain logic involving multiple Models will go here.</p>
<p>Whatever business logic doesn't fit the scope of a single Entity (or value object for that matter) belongs here.</p>
<h2>Application Services</h2>
<p>Just as Domain Services orchestrate the interactions between multiple Models, Application Services orchestrate the interactions between multiple Domain Services.</p>
<p>Here we'll also find the Ports and use cases definition from <a href="../ports-and-adapters">Ports & Adapters</a>, right at the boundary of our application.</p>
<h2>Inwards dependency</h2>
<p>As you can see, this design enforces the same <em>'direction of dependency'</em> as Ports & Adapters, only this time it is also applied within our own code.</p>
<p>Not only does our infrastructure <strong>depend</strong> on our Domain (and not the other way around), but the layers <strong>within</strong> our domain also depend on whatever layers lie beneath them.</p>
<p>This way, we end up with an independent core that can (and should) be compiled, executed and tested independently of its outer layers.</p>
<p>We couple <strong>towards</strong> the center.</p>
Pair programming doesn't have to suckhttps://devintheshell.com/blog/pair-programming-suck/https://devintheshell.com/blog/pair-programming-suck/And you might want to do it more oftenFri, 08 Oct 2021 18:09:55 GMT<p>A while ago an interesting debate occurred between my coworkers regarding pair programming.
We all follow this practice at <a href="https://leanmind.es">LeanMind</a> as much as possible, but some of us do so pretty much 24/7 while others have a more selective approach.
This led to an interesting conversation that made each of us more aware of the way others view this practice.</p>
<p><img src="./pair-love.webp" alt="pair-love" /></p>
<h2>When is pairing not fit for purpose</h2>
<p>It's only fair for me to point out some valid considerations regarding what can go wrong with pairing, before arguing why you should still do it.</p>
<h3>Productivity</h3>
<p>One might argue that, if the task is simple enough, having two developers tackling it is a bit wasteful.
Sure, complicated or critical problems might require group collaboration, but if we can get more done on our own, maybe we should.</p>
<p>This will probably resonate more with business people than with developers, we'll see why in a bit.</p>
<h3>It ain't easy</h3>
<p>Some people are more social than others. Even the same person can be more social one day than the next, and that's okay.
There's an argument to be made for not feeling <strong>pressured</strong> to pair. It shouldn't be a mandatory thing, but rather a scenario to strive for.</p>
<p>Take a break from time to time, change partner, ease yourself into it.
Computer people are not really known for their social skills, nobody expects an introvert to want to spend 8 hours straight with a co-worker.</p>
<p>It might be healthier to think of pair programming as an objective in some cases, rather than '<em>a thing you just do</em>'.</p>
<h3>Everybody is different</h3>
<p>Just like with any other human interaction or relation, some work and some don't.
This can be harsh and hard to deal with (for some more so than for others).</p>
<p>Some get distracted while being the '<em>navigator</em>' and wander around not really helping a lot.
Some want things done their way and can't seem to listen while being the '<em>driver</em>'.
Some may not be willing to swap out the role at all.</p>
<p>Dealing with these situations is quite simply <strong>not developing software</strong>.
And you can bet that your average developer has zero interest in resolving issues like these (although one could argue that they probably should).</p>
<h3>Juniors might hamper Seniors</h3>
<p>No matter how you define '<em>Juniors</em>' and '<em>Seniors</em>' (or if you just don't care much for the terms at all), people with less knowledge and/or experience in any particular subject will inevitably take longer to understand things and have <strong>a lot</strong> of questions.</p>
<p>It's easy to see yourself pairing with someone more or less at the same level you are, but it can be annoying to have someone stopping you at every step to ask you basic (although very <strong>valid</strong>) questions.</p>
<p>One could consider pairing partners based on experience, although there are very valid reasons to mix and match experience levels.</p>
<p><img src="./pair-strong.webp" alt="pair-strong" /></p>
<h2>Pair as much as you can</h2>
<p>Now for the good stuff.</p>
<hr />
<h3>Beware the bias</h3>
<p>I've been very lucky in this regard: I've written some code on my own, but I have never <strong>worked</strong> alone.</p>
<p>I started a while back in a team that felt more like a bunch of friends than a work group. Not that we weren't getting things done (we were even helping other teams on a regular basis), we just worked in a very friendly and enjoyable environment.</p>
<p>We laughed a lot and were there for one another, it really felt like hanging out with some friends.</p>
<p>Just keep in mind that someone less fortunate might not feel the same as I do about pairing, which is totally fair!</p>
<hr />
<h3>Don't care much about productivity</h3>
<p>It might be productive <strong>in the short term</strong> to have two developers doing two different tasks, but is it also in the long run?</p>
<p>When both pieces of code need to be merged, but they seem to come from two different worlds, is pairing still less productive?
What about when one of them spends two days looking for a bug that could have been easily found in half an hour by just having double the eyes looking at the code?
Don't you think you would benefit from a helping hand when you hit a dead end?
Wouldn't you <strong>always</strong> prefer to know <strong>early</strong> if there is a better way?
Do you really think that having some help or a different opinion makes you less productive?</p>
<p>Sure you could just ask for help when needed but that assumes everybody is mature enough to <strong>explicitly and openly</strong> admit they need help.</p>
<p>This is why I mentioned that business people might focus more on this than developers:
They usually just don't see or understand the immediate and long-term differences between clean code and spaghetti code.
And that's not a criticism, it just isn't really their job to tell one apart from the other. They shouldn't be expected to.
If anything it's on developers for not knowing how to present these issues properly to the business.</p>
<h3>Majority of us are noobs (and that's fine)</h3>
<p>Uncle Bob has repeated this fact time and time again: Developers more or less double every five years or so.
This means there are always <strong>a lot</strong> more inexperienced developers than experienced ones.
There is an obvious need to take Juniors and their learning process into account when thinking about anything related to software development.</p>
<p>It's kind of strange that '<em>shadowing</em>' isn't just standard practice in the industry, but if it were, it would benefit <strong>greatly</strong> from pairing.
I can't think of a faster way for someone to get the hang of a particular language, framework, methodology or project than pairing with an experienced developer.</p>
<p>Sure, you can do test projects on your own (and it actually helps a lot), but the level of insight and '<em>know-how</em>' you get from watching a Senior hands-on with the code, asking questions and receiving corrections and guidance in real time is just in a completely different league.</p>
<p>Of course, that will hamper his speed (quite a lot at the beginning), especially if the more experienced one actually cares about his pairing partner getting better at the job at hand.
To this I would say...</p>
<h3>Developer 'flow' is no good</h3>
<p>It sure feels good: Being on your own, doing your thing at light speed. Just ripping through the code and solving tickets left and right.
You look at your code after a week and have no clue what you wrote, why you wrote it that way or if there was a better way to do it. You overengineered half of it and didn't stop for a moment to see what the rest of the codebase looks like.</p>
<p><img src="./borat-success.webp" alt="success" /></p>
<p>I would argue that being in this state of '<em>Flow</em>' is only useful if you work on your own in a small and/or unimportant project.
It seems much, much better in the long run (and for the rest of the team) to have someone stop and question you every once in a while.
This will make you consider why you are doing what you are doing (and if you are making it unintelligible for the rest of us).
If you can explain it simply you can be sure that you know what you are doing (and you will have taught something new to your peer, making both of you better off).</p>
<p>Nobody wants to be distracted while solving a problem, but when doing things out of inertia we can all benefit from an external voice of reason.</p>
<h3>Good for the mind</h3>
<p>Especially with remote work becoming the norm, it's not really healthy to spend 8 hours a day solving problems under stress on your own.
I know, it sounds strange, but we are supposed to be around each other. We are not built to be alone.</p>
<p>You might think you are better off on your own, but I can guarantee you will feel better with good company.
Pairing can be fun, relaxing and good for morale.
Don't isolate yourself for no reason.</p>
<h3>Enforces soft skills</h3>
<p>I've been pleasantly surprised with how important soft skills are in software development and I can see a couple of clear benefits of pair programming in this regard:</p>
<h4>Leave your ego at the door</h4>
<p>If done well, pairing will force you to listen to others, overcome your ego and accept what's best for the team.
You can never have too much of that.</p>
<h4>Leave your personal problems at the door</h4>
<p>Not only will this help you compartmentalize trouble (which is always good), you will <strong>have</strong> to <em>decouple</em> your mind from your personal issues.
That's just free counseling right there.</p>
<h4>Forces professional behavior</h4>
<p>A doctor doesn't just behave badly or work alone when he is '<em>not feeling it</em>'. He picks himself up and gives his best.
Why should we not do the same?</p>
<h4>Soft skills Hard truth</h4>
<p>The reality of soft skills is that there is no method to it. You can watch what you say and how you say it. You can be more assertive and sensitive.
But at the end of the day, you learn soft skills by throwing yourself into the fire.</p>
<p>You can't '<em>think your way</em>' through human interactions.
You deal with them as they come, in the context you struggle the most, screw up as many times as you need in order to get a feel for it and acquire that new skill (just like you would when learning to ride a bike).
Working alone won't help much with this.</p>
<p>When in doubt, program in pairs. It's good for everybody.</p>
Ports & Adaptershttps://devintheshell.com/blog/ports-and-adapters/https://devintheshell.com/blog/ports-and-adapters/AKA Hexagonal architectureSun, 03 Oct 2021 18:11:44 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>This architecture, created by Alistair Cockburn in 2005, builds on the layer-based designs that came before it, putting a much bigger emphasis on the outer boundaries of the system.</p>
<h2>Everything is I/O</h2>
<p>Focusing on <strong>I/O to and from our application</strong> will help understand this architecture.</p>
<p>Try to see everything but the most core business logic as an I/O device.</p>
<ul>
<li>Database? I/O device.</li>
<li>Messaging service? I/O device.</li>
<li>GUI? I/O device.</li>
<li>The web? I/O device.</li>
</ul>
<p>We ditch the layers as we used them before and swap them for puzzle pieces of sorts, where <strong>our core business logic</strong> communicates with different types of I/O devices.</p>
<p><img src="./ports-and-adapters.webp" alt="ports-and-adapters" /></p>
<p>Keep this image in mind going forward.</p>
<h2>Port</h2>
<p>There is a really nice definition in the <a href="https://herbertograca.com/2017/09/14/ports-adapters-architecture/#what-is-a-port">post I'm leaning</a> on to write this:</p>
<blockquote>
<p>A port is a consumer agnostic entry/exit point to/from the application.</p>
</blockquote>
<p>It's a piece of <strong>our core application</strong> that vaguely defines how we communicate with any given 'I/O device', without caring about the implementation detail.</p>
<p>If we use a persistence port as an example, it should stay unchanged no matter if we use MySQL, MongoDB, or whatever else.</p>
<p>It just knows we need to persist and what parts of our domain need persistence.</p>
<p>In this example, it will most likely take the form of an Interface or a Trait: it defines that we need a <code>persist(user)</code> functionality, but not how to implement it.</p>
<h2>Adapter</h2>
<p>A module that <strong>adapts</strong> to or from a specific port.</p>
<p>So if we are using an SQL database, its adapter will <strong>implement</strong> our persistence <strong>port</strong> and translate our domain to whatever SQL commands needed to read from or write to the database.</p>
<p>This, unlike its <strong>port</strong>, will not stay if we swap SQL for MongoDB. Our domain ends where our adapter starts.</p>
<p>This way, we decouple our domain from any particular technology we might be using.</p>
<p>This thought process should sound familiar, as it is, in practice, how you often implement a <a href="../ddd-tactics#repositories">Repository</a> from DDD.</p>
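<p>A minimal sketch of a driven port/adapter pair, in Python for illustration (the names here are made up, not from Cockburn's article):</p>

```python
from typing import Protocol

class User:                           # a core domain entity
    def __init__(self, name: str) -> None:
        self.name = name

class UserPersistencePort(Protocol):  # driven port: what the core needs
    def persist(self, user: User) -> None: ...

class InMemoryUserAdapter:            # a test double fitting the port
    def __init__(self) -> None:
        self.saved: list[str] = []
    def persist(self, user: User) -> None:
        self.saved.append(user.name)

class MySQLUserAdapter:
    """Production adapter: translates the domain into SQL.
    Swapping it in never touches the port or the domain."""
    def persist(self, user: User) -> None:
        ...  # build and execute the INSERT/UPDATE statements here

def register_user(name: str, repo: UserPersistencePort) -> User:
    user = User(name)  # core logic only sees the port, never SQL
    repo.persist(user)
    return user

repo = InMemoryUserAdapter()
register_user("alice", repo)
print(repo.saved)  # ['alice']
```

This also shows the testing bonus mentioned below: the in-memory adapter slots into the same port as the real one.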
<h2>Far Beyond Driven</h2>
<p>You might have noticed that the example of a persistence port/adapter doesn't quite fit all types of infrastructure.</p>
<p>There is a difference between our previous SQL example and a REST API port/adapter: one <strong>is driven</strong> by our application while the other decides or <strong>drives</strong> what our application does.</p>
<p><img src="./driving-driven.png" alt="driving-driven" />
<sub><a href="https://jmgarridopaz.github.io/content/hexagonalarchitecture.html">source</a></sub></p>
<p>Think about your typical REST API controller function, for example a GET controller that calls a <code>GetUser</code> <em>use case</em>.</p>
<p>If you follow the <strong>D</strong> in <strong>SOLID</strong>, your controller should depend on that <em>use case</em> abstraction (interface), which will be implemented somewhere else inside the hexagon.</p>
<p>So the REST controller (<strong>Adapter</strong>) would use the <em>use case</em> (<strong>Port</strong>) which will be implemented within the boundaries of the application as part of the Business Logic.</p>
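<p>A sketch of that driver-side arrangement in Python (the names are illustrative):</p>

```python
from typing import Protocol

class GetUser(Protocol):   # driver port: the use case abstraction
    def execute(self, user_id: int) -> dict: ...

class GetUserInteractor:   # implemented inside the hexagon
    def execute(self, user_id: int) -> dict:
        return {"id": user_id, "name": "alice"}  # stand-in for real logic

class RestController:      # driver adapter: depends on the port,
    def __init__(self, get_user: GetUser) -> None:  # not the implementation
        self.get_user = get_user
    def on_get(self, user_id: int) -> dict:
        return self.get_user.execute(user_id)

print(RestController(GetUserInteractor()).on_get(1))
# {'id': 1, 'name': 'alice'}
```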
<h2>Inwards dependency</h2>
<p>By following these guidelines, we end up with the infrastructure related code depending on the Domain/Business Logic <strong>and not the other way around</strong>.</p>
<p>The Domain doesn't know a thing about a web UI, API calls, CLI, SQL, Redis, etc.</p>
<p>It knows its own rules and logic, it knows what it <strong>can do</strong> (driver ports) and what it <strong>needs</strong> to function (driven ports).</p>
<p><img src="./ports-and-adapters-dependency.webp" alt="ports-and-adapters-dependency" /></p>
<p>This allows us to focus on what actually matters, disregarding external libraries and other infrastructure as nothing more than <em>'a thing that fits (or uses) the ports of the application'</em>.</p>
<p>It also has the (quite nice) added bonus of making the system really easy to test. Just swap the real adapter for a test one and you are good.</p>
<h2>Considerations</h2>
<p>This enforced isolation between our application and its dependencies is the most important idea Ports & Adapters has to offer.</p>
<p>By structuring an <em>'us vs. them'</em> approach, it materializes the idea of 'core business rules', which while present in previous architectures were not so clearly delineated.</p>
<p>Of course this only considers two concentric <em>'layers'</em>: Our code and the rest of the world. This will be <a href="../onion-arch">expanded upon</a> by Jeffrey Palermo three years later.</p>
<h3>Hexagonal?</h3>
<p>So what's the deal with the hexagon? Why is it called Hexagonal Architecture? What piece is repeated six times?</p>
<blockquote>
<p>The shape should evoke the inside/outside asymmetry rather than top/down or left/right. Then Square is not suitable. Pentagon, Heptagon, Octagon, ... Too hard to draw.</p>
<p>So Hexagon is the winner.</p>
</blockquote>
<p>Well, that's a bit lame. Don't get me wrong, <a href="https://www.youtube.com/watch?v=thOifuHs6eY">hexagon is bestagon</a>, but it might as well have been a circle.</p>
<p>'Ports & Adapters' is just a better name: it's accurate and focuses your attention on what really matters and what defines this architectural design.</p>
EBI Architecturehttps://devintheshell.com/blog/ebi-arch/https://devintheshell.com/blog/ebi-arch/A step towards better designSat, 02 Oct 2021 19:26:16 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>Created by <a href="https://www.amazon.com/Object-Oriented-Software-Engineering-Driven-Approach/dp/0201403471">Ivar Jacobson in 1992</a> as EIC (Entity-Interface-Control) and renamed by Uncle Bob to Entity-Boundary-Interactor (EBI), it's a more back-end-focused version of the <a href="../mvc-arch">MVC</a> architecture that came before.</p>
<p><img src="./ebi.webp" alt="ebi" /></p>
<h2>Entity</h2>
<p>It consists <strong>both</strong> of the <em>'domain'</em> entity and <strong>all behavior strictly related to it</strong>.</p>
<p>So in a very simple example, the <em>'Dog'</em> Entity would hold the data regarding its breed, fur color, health, etc. as well as the logic required for it to behave as expected: a walk function, a bark function, etc.</p>
<p>Already back in '92, Jacobson was warning about <em>anemic entities</em> and <em>god objects</em>.</p>
<h2>Boundary</h2>
<p>The I/O interface of the system.</p>
<p>Think of it as the <em>'fence'</em> of the domain (It's in the name).</p>
<p>All interaction between your code and the user on one side (GUI, CLI, etc.) and the infrastructure on the other (persistence, event queue, messaging, etc.) should be handled by this guy.</p>
<p>You might want to make it an interface and call it a <a href="../ports-and-adapters">Port</a>, but that's still 13 years away.</p>
<h2>Interactor</h2>
<p>The ones in charge of validating I/O between Boundaries and Entities.</p>
<p>More importantly, they manage the <strong>interactions</strong> between Entities.
In practice, this means that all logic not belonging to or fitting in the Entities will end up here.</p>
<p>In our previous dog example, this role would be taken by the owner. Dogs don't play dead on their own, the owner (or trainer) needs to give the order.</p>
<p>Stretching the example a bit, dog to dog interaction is usually mediated by one or more humans (assuming they are pets).</p>
<p>The same applies here.</p>
<p>Of course, one interactor will often not be enough. You should end up with about one interactor per <strong>use case</strong>.</p>
<p>For every abstract operation a user could perform on your system, there should be an interactor ready to handle the use case.</p>
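<p>To make this concrete, here's a minimal TypeScript sketch of the pattern (all names are made up for illustration): a <code>Dog</code> Entity with its own behavior, a Boundary interface fencing off persistence, and one Interactor for the <em>'walk the dog'</em> use case.</p>

```typescript
// Entity: data plus the behavior strictly related to it
class Dog {
  constructor(public name: string, public energy = 100) {}
  walk() {
    this.energy -= 10; // the dog knows how to walk itself
  }
}

// Boundary: the I/O fence around the domain (here, persistence)
interface DogBoundary {
  findByName(name: string): Dog;
  save(dog: Dog): void;
}

// Interactor: one per use case, orchestrating Boundaries and Entities
class WalkDogInteractor {
  constructor(private boundary: DogBoundary) {}
  execute(name: string): Dog {
    const dog = this.boundary.findByName(name); // input through the Boundary
    dog.walk(); // the Entity does its own thing
    this.boundary.save(dog); // output through the Boundary
    return dog;
  }
}

// A throwaway in-memory Boundary to see it run
const store = new Map<string, Dog>([["Rex", new Dog("Rex")]]);
const boundary: DogBoundary = {
  findByName: (name) => store.get(name)!,
  save: (dog) => { store.set(dog.name, dog); },
};
const walked = new WalkDogInteractor(boundary).execute("Rex");
```

<p>Notice that the Interactor never touches the dog's internals directly, it only gives the order.</p>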
MVC & MVVMhttps://devintheshell.com/blog/mvc-arch/https://devintheshell.com/blog/mvc-arch/In the beginning was the Word, and the Word was "Separation of Concerns"Sat, 02 Oct 2021 17:40:28 GMT<p>This is part of a series, <a href="../arch-for-noobs">start here!</a>
<br>
<br></p>
<p>After suffering the consequences of mashing everything together, someone got tired and <a href="https://folk.universitetetioslo.no/trygver/1979/mvc-2/1979-12-MVC.pdf">gave us MVC</a>.</p>
<h2>Model-View-Controller</h2>
<p>In 1979, Trygve Reenskaug came up with this architecture as a way to solve some of the issues related to writing code for the machine and not for the human.</p>
<p>This was our first attempt at <em>'separation of concerns'</em> and was guided by the following logic:</p>
<blockquote>
<p>Separate data, logic and presentation.</p>
</blockquote>
<p>More accurately, it contemplated three basic units (or layers):</p>
<ul>
<li><strong>Model</strong>: The business entities/logic.</li>
<li><strong>View</strong>: The UI.</li>
<li><strong>Controller</strong>: The <em>'procedural'</em> or <em>'application'</em> logic. It would guide all interactions between the previous two.</li>
</ul>
<p>It would look something like this:</p>
<p><img src="./MVC.png" alt="MVC" /></p>
<p>If done right you would end up with multiple View-Controller pairs per screen, since each view should ideally only be responsible for a single piece of the UI (widget, button, text field...) and talk to a Controller if data was needed.</p>
<p>This also means that if multiple Views needed the same data, they would communicate with the same Controller.</p>
<p>Notice here that the Controller <em>reacts</em> to the View, and manipulates the Model as a consequence.</p>
<p>The View would then react directly to the events triggered by the Model, updating the UI accordingly.</p>
<p>This forces a one-directional flow in which the user's interaction with the View determines what a specific Controller should do with the Models, which in turn update the View directly.</p>
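<p>That loop can be sketched in TypeScript with a hypothetical click counter (no particular framework implied):</p>

```typescript
// Model: holds state and notifies listeners when it changes
class CounterModel {
  private listeners: Array<(count: number) => void> = [];
  private count = 0;
  onChange(listener: (count: number) => void) {
    this.listeners.push(listener);
  }
  increment() {
    this.count += 1;
    this.listeners.forEach((listener) => listener(this.count));
  }
}

// Controller: reacts to the View and manipulates the Model
class CounterController {
  constructor(private model: CounterModel) {}
  handleClick() {
    this.model.increment();
  }
}

// View: forwards user input to the Controller, reacts directly to the Model
class CounterView {
  text = "";
  constructor(model: CounterModel, private controller: CounterController) {
    model.onChange((count) => { this.text = `Clicks: ${count}`; });
  }
  click() {
    this.controller.handleClick();
  }
}

const model = new CounterModel();
const view = new CounterView(model, new CounterController(model));
view.click(); // View -> Controller -> Model -> View
```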
<p>This design is often still used in the front end (which is no surprise, since it was created in the context of GUI-reliant desktop applications).</p>
<p>All things considered, this approach leaves us with a couple of issues:</p>
<ol>
<li>The View-Controller relation can get messy fast.</li>
<li>Having multiple Views per Controller can get even messier, since each View ideally corresponds to a piece of the UI.</li>
<li>The View is coupled directly to the Model.</li>
</ol>
<h3>Model-View-ViewModel</h3>
<p>That's what <a href="https://learn.microsoft.com/en-us/archive/blogs/johngossman/introduction-to-modelviewviewmodel-pattern-for-building-wpf-apps">John Gossman tried to solve</a> around 2005.</p>
<p>Basically, he called the old Controller <strong>ViewModel</strong> and made it also responsible for the events fired by the Model.</p>
<p>So now, the flow of execution would stop in the ViewModel both on its way to the Model and on its way back to the View.</p>
<p>This new and improved 'Controller' now had the power to manipulate Models as well as to implement View-specific logic (which made the View much simpler, letting designers focus on... design).</p>
<p>Now View and ViewModel <strong>must</strong> have a 1:1 relation.</p>
<p><img src="./MVVM.png" alt="MVVM" /></p>
<p>As you might imagine, once the Model operations get complicated and/or there's a bunch of different data to manage or present together, it's still easy for things to get tangled up.</p>
<p>The <a href="../ebi-arch">EBI architecture</a> attempts to solve this.</p>
Blockchain with Nodehttps://devintheshell.com/blog/node-chain/https://devintheshell.com/blog/node-chain/Create your very own mini-blockchain!Sun, 26 Sep 2021 14:03:46 GMT<p>We'll build a very simple POC blockchain with four distinct building blocks.
Check the <a href="https://github.com/EricDriussi/node-chain">repo</a> for reference!</p>
<h2>Transaction</h2>
<p>The only things really needed for a transaction to take place are:</p>
<ul>
<li>Who sends the money?</li>
<li>Who receives the money?</li>
<li>How much money?</li>
</ul>
<pre><code>export class Transaction {
constructor(
public amount: number,
public sender: string,
public receiver: string
) {}
}
</code></pre>
<h2>Block</h2>
<p>Think of a block as a group of transactions (just one in our POC) with a timestamp and a reference to the previous block:</p>
<pre><code>import {Transaction} from "./Transaction";
export class Block {
constructor(
public previousHash: string,
public transaction: Transaction,
public timeStamp = Date.now()
) {}
}
</code></pre>
<p>As discussed <a href="../blockchain#hashing">before</a>, we need to hash our blocks.
To do this, we'll use the <code>crypto</code> library:</p>
<pre><code>get hash() {
const hash = crypto.createHash("SHA256");
const block = JSON.stringify(this);
hash.update(block).end();
return hash.digest("hex");
}
</code></pre>
<p>We create a <code>SHA256</code> hash, add our stringified block to it, and output a <code>hexadecimal</code> representation of it.
This is what will link multiple blocks together.</p>
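<p>To see the chaining effect, here's a quick sketch with plain objects instead of our classes: since each block stores the hash of the previous one, tampering with any block breaks every link after it.</p>

```typescript
import crypto from "crypto";

// Hash any object by stringifying it, just like our Block's hash getter
const hashOf = (obj: object) =>
  crypto.createHash("SHA256").update(JSON.stringify(obj)).digest("hex");

const block1 = { previousHash: "", transaction: { amount: 10, from: "God", to: "Satoshi" } };
const block2 = { previousHash: hashOf(block1), transaction: { amount: 1, from: "Satoshi", to: "Hal" } };

// The link holds as long as block1 is untouched...
const linkIntact = block2.previousHash === hashOf(block1);

// ...but any tampering changes block1's hash, breaking the link
block1.transaction.amount = 1_000_000;
const linkBroken = block2.previousHash !== hashOf(block1);
```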
<h2>Chain</h2>
<p>Since there should only ever be <strong>one</strong> chain, we can represent it as a Singleton.
We also know that it will host our blocks and that these need to know about their predecessor, so we can start by:</p>
<pre><code>import {Block} from "./Block";
export class Chain {
public static instance = new Chain();
chain: Block[];
get lastBlock() {
return this.chain[this.chain.length - 1];
}
</code></pre>
<p>Let us also initialize the chain with a first block:</p>
<pre><code>constructor() {
this.chain = [new Block("", new Transaction(10, "God", "Satoshi"))];
}
</code></pre>
<p>We'll also need a way to add blocks to the chain. Something like:</p>
<pre><code>addBlock(transaction: Transaction) {
const newBlock = new Block(this.lastBlock.hash, transaction);
this.chain.push(newBlock);
}
</code></pre>
<p>This will work, but it allows anybody to send anything anywhere.
We need verification, signatures and keys.
We need a wallet.</p>
<h2>Wallet</h2>
<p>At a basic level a wallet is just a wrapper for a key pair, not unlike what you use to secure an <a href="../maintain-vps#ssh">SSH connection</a>.</p>
<pre><code>export class Wallet {
public pubKey: string;
public privKey: string;
get address(){
return this.pubKey;
}
}
</code></pre>
<p>Again, we'll use the <code>crypto</code> library to generate the key pair.
Since we want an asymmetric scheme (sign with one key, verify with the other), we'll use <code>RSA</code>.</p>
<pre><code>import crypto from "crypto";
constructor() {
const keyPair = crypto.generateKeyPairSync("rsa", {
// standard rsa settings
modulusLength: 4096,
publicKeyEncoding: {type: 'spki', format: 'pem'},
privateKeyEncoding: {type: 'pkcs8', format: 'pem'},
})
this.pubKey = keyPair.publicKey;
this.privKey = keyPair.privateKey;
}
</code></pre>
<h2>Payment</h2>
<p>As explained <a href="../blockchain#transacting--signatures">before</a>, we don't want to expose the private key <strong>nor</strong> the encrypted data.
Rather, we use the private key to <strong>sign</strong> the data.</p>
<p>This way, the public key can be used to verify the data's integrity <strong>without</strong> exposing the private key.</p>
<pre><code>pay(amount: number, receiverPubKey: string) {
const transaction = new Transaction(amount, this.pubKey, receiverPubKey);
const sign = crypto.createSign("SHA256");
// Sign the actual transaction data, not "[object Object]"
sign.update(JSON.stringify(transaction)).end();
const signature = sign.sign(this.privKey);
Chain.instance.addBlock(transaction, this.pubKey, signature);
}
</code></pre>
<p>Here we create a transaction and use <em>crypto</em> to sign it with the payer's private key.
In a real-world scenario we would verify the signature by sending the block to various miners (nodes) over the web, but this is good enough for our purposes.</p>
<p>We'll have to update our <code>addBlock</code> function to ensure the block is verified before being added to the chain:</p>
<pre><code>addBlock(transaction: Transaction, senderPubKey: string, signature: Buffer) {
const verifier = crypto.createVerify("SHA256");
verifier.update(JSON.stringify(transaction));
const transactionIsValid = verifier.verify(senderPubKey, signature);
if (transactionIsValid) {
const newBlock = new Block(this.lastBlock.hash, transaction);
// Proof of work!
this.mine(newBlock.proofOfWorkSeed);
this.chain.push(newBlock);
}
}
</code></pre>
<h2>Mining</h2>
<p>As mentioned <a href="../blockchain#proof-of-work">before</a>, we'll use the concept of "<em>Proof of Work</em>" to our advantage.</p>
<p>Let's add a <em>POW</em> seed to our <code>Block</code> class:</p>
<pre><code>proofOfWorkSeed = Math.round(Math.random() * 42069666);
</code></pre>
<p>Now we can add a <code>mine</code> function to our <code>Chain</code> such as:</p>
<pre><code>mine(proofOfWorkSeed: number) {
let solution = 1;
console.log("⛏️ ... ⛏️ ... ⛏️");
while (true) {
const hash = crypto.createHash("MD5");
hash.update((proofOfWorkSeed + solution).toString()).end();
const attempt = hash.digest("hex");
if (attempt.startsWith("9999")) {
console.log(`Done! Solution: ${solution}`)
return solution;
}
solution += 1;
}
}
</code></pre>
<p>Here we look for a number that, when added to the seed, produces a hash starting with four consecutive nines.
The specific implementation doesn't really matter, as long as it is somewhat costly to compute.</p>
<p>After finding the correct answer (which will be different for each block), it returns it for the other nodes to verify (which is much easier than solving), or at least that's what would happen in a real-world application.</p>
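<p>To illustrate that asymmetry, here's a rough sketch of what another node's check could look like: finding the solution takes many hashes, while verifying a claimed solution takes exactly one.</p>

```typescript
import crypto from "crypto";

const md5 = (input: string) =>
  crypto.createHash("MD5").update(input).digest("hex");

// The miner's search: same idea as the mine function above
const mine = (seed: number): number => {
  let solution = 1;
  while (!md5((seed + solution).toString()).startsWith("9999")) {
    solution += 1;
  }
  return solution;
};

// Any node can verify the claim with a single hash
const isValidSolution = (seed: number, solution: number): boolean =>
  md5((seed + solution).toString()).startsWith("9999");

const seed = 12345;
const solution = mine(seed); // thousands of hashes...
const verified = isValidSolution(seed, solution); // ...one hash
```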
<h2>Run it</h2>
<p>Hopefully you've ended up with something like <a href="https://github.com/EricDriussi/node-chain">this</a>.</p>
<p>You can add your own finishing touches and run <code>npm run start</code> to see it in practice.</p>
<p>There you go, your very own blockchain!</p>
Blockchain 101https://devintheshell.com/blog/blockchain-101/https://devintheshell.com/blog/blockchain-101/Blockchain basicsSun, 26 Sep 2021 14:02:46 GMT<h2>General structure</h2>
<p>As the name implies, a blockchain is nothing more than a <em>chain</em> of <em>blocks</em>, where each block contains a collection of transactions and is connected (chained) to the previous block.
Kinda like a Git repo is made up of a bunch of commits linked to one another, each containing a bunch of changes.
Only in this case you could only commit new code to the repo, no rebasing or amending.</p>
<p>The chain can be (and usually is) distributed among multiple machines, hence its decentralized nature.
This begs the question: what if I mess around with my copy of the chain?
Can I give myself a million coins just like that?</p>
<h2>Proof of work</h2>
<p>To ensure the validity of the transactions (and that the blockchain has not been tampered with), all parties involved in the network need to agree on one 'correct' blockchain.</p>
<p>A common way to face this problem, although not the only one, is by using a <strong>proof of work</strong> system.</p>
<p>Simply put, the transaction finds its way into a block, which is then validated by solving a computationally expensive puzzle.</p>
<p>The first to solve it will receive a portion of the transaction as payment, while the others will <strong>verify</strong> that the solution is correct (this is why you can't just create coins as you wish).</p>
<p>The algorithm needs to be quite expensive to compute for this to make sense, but we also want to verify and compare the results easily.
For this, we use hashes.</p>
<h2>Hashing</h2>
<p>A hash is a "<em>one-way cryptographic function</em>".
This means we can compute a digest of some data, but recovering the original data from its digest is practically impossible.</p>
<p>This is useful because any tampering with a computed block changes its hash, making tampering easy to detect and blocks easy to compare.</p>
<p>Plus, hashing has the added benefit of producing fixed-size hashes, no matter the size of the input.</p>
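<p>A quick demo with Node's <code>crypto</code> module shows both properties, determinism and fixed output size (the "block" strings are made up):</p>

```typescript
import crypto from "crypto";

// SHA-256 always produces a 64-character hex digest, regardless of input size
const sha256 = (data: string) =>
  crypto.createHash("sha256").update(data).digest("hex");

// Same input, same digest, every time
const original = sha256("block #1: Alice pays Bob 5 coins");
const again = sha256("block #1: Alice pays Bob 5 coins");

// The tiniest change produces a completely different digest
const tampered = sha256("block #1: Alice pays Bob 50 coins");
```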
<h2>Transacting & Signatures</h2>
<p>The last piece of the puzzle is transactions.</p>
<p>To send and/or receive coins, each user needs a <em>Wallet</em>.
This is little more than a key pair (a public key and a private key).</p>
<p>A transaction in this context is basically composed of the amount to send/receive, plus the sender and receiver public keys.</p>
<p>When a transaction is sent, it is signed with the sender's private key.
This means that anybody can confirm who the sender was by checking if the public key matches the signature of the transaction.</p>
<p>This way, we can trace the transactions as far back as we want, without exposing the private keys of the wallets involved.</p>
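<p>The whole sign-then-verify dance can be sketched with Node's <code>crypto</code> module (the names and transaction shape here are made up for the demo):</p>

```typescript
import crypto from "crypto";

// The wallet: just an RSA key pair
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});

const transaction = JSON.stringify({ amount: 5, from: "Alice", to: "Bob" });

// The sender signs with the private key...
const signer = crypto.createSign("SHA256");
signer.update(transaction).end();
const signature = signer.sign(privateKey);

// ...and anyone can verify with the public key alone
const checkSignature = (data: string): boolean => {
  const verifier = crypto.createVerify("SHA256");
  verifier.update(data).end();
  return verifier.verify(publicKey, signature);
};

const genuine = checkSignature(transaction); // true
const forged = checkSignature(transaction.replace('"amount":5', '"amount":500')); // false
```

<p>The private key never leaves the wallet, yet the signature proves who sent the coins and that the amount wasn't tampered with.</p>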
Operate and maintain your VPShttps://devintheshell.com/blog/maintain-vps/https://devintheshell.com/blog/maintain-vps/Some basic tips and tricksMon, 12 Jul 2021 16:15:12 GMT<h2>SSH</h2>
<p>To ensure a secure SSH connection, it is best to not rely on password authentication.
Instead, we should use a key pair.</p>
<p>The idea is, we generate an ssh key for our machine and make our server trust it.</p>
<p>By doing this, we ensure that only the holder of the key can connect through SSH with the VPS, making access quicker, easily scriptable and brute-force proof.</p>
<h3>Generate the pair</h3>
<pre><code>ssh-keygen -t ed25519 -a 100 -C "[email protected]"
</code></pre>
<p>This command should prompt you for a path in which to store the keys (usually <code>~/.ssh/</code>) as well as a passphrase.</p>
<p>You may want to leave the passphrase blank if you plan on scripting on top of this connection.
Although more convenient, it is less secure.
In any case you can later run <code>ssh-keygen -p</code> to change or remove the passphrase.</p>
<p>You will now see a key pair in the path you selected. Let's get one of them on your server.</p>
<h3>Get the public key on your server</h3>
<pre><code>ssh-copy-id [email protected]
</code></pre>
<p>You'll have to enter your VPS's root user's password, after which your server will authorize access to the machine holding the SSH private key.</p>
<p>Log out and back in. If it bypasses the password prompt it worked!
If it didn't, check the permissions for both the keys and the <code>.ssh</code> directory.</p>
<h3>Disable password logins</h3>
<p>Not a necessary step but if you want a reason to do it just check the output of <code>journalctl -xe</code> on your VPS.
To avoid brute force attacks, let's make sure logins are <strong>only</strong> allowed for the private keyholder.</p>
<p>Open <code>/etc/ssh/sshd_config</code> in your VPS and find/set the following lines as shown.</p>
<pre><code>PasswordAuthentication no
PermitEmptyPasswords no
MaxAuthTries 3
</code></pre>
<p>This will harden your connection quite a bit. You can go the extra mile by setting up a non-root sudo user and only connecting to the VPS with that user.
If you go that route, make sure you set <code>PermitRootLogin no</code> to remove the possibility of a root login completely.</p>
<p>Of course, reload ssh: <code>systemctl reload sshd</code></p>
<hr />
<h4>What if I lose the key?</h4>
<p><strong>Don't. Backups are your friends.</strong></p>
<p>But in case you do, your VPS provider will likely offer some sort of local prompt emulated in a browser window.</p>
<p>Being local to the machine (or at least functioning as such), this will only be accessible after you log in to your VPS provider's online account, and it won't prompt you for SSH authentication.
Then, simply set <code>PasswordAuthentication yes</code> in <code>/etc/ssh/sshd_config</code>.</p>
<hr />
<h2>Rsync</h2>
<p>You'll often find yourself needing to transfer files to and from your server.
Rsync is probably the easiest way to do it: it's fast, reliable and simple.</p>
<p>Make sure it's installed both on your local machine and on the server, then write (or make an alias for):</p>
<pre><code>rsync -rtvzP /path/to/localFile [email protected]:/path/on/the/server
</code></pre>
<p>This command will run <u>r</u>ecursively (including directories), transfer modification <u>t</u>imes (skips non modified files), <u>v</u>isualize the files being uploaded, compress (<u>z</u>?) files for upload and <u>P</u>ick up where it stopped in case of lost connection.</p>
<p>Of course, to download something from the VPS, just reverse the arguments.</p>
<h2>Cronjobs</h2>
<p>There are certain routine tasks that are better left to Cron.
It will take care of running any command with a given frequency or repetition pattern.</p>
<p>Say, for example, that you want to automate updates for your server.
You could run <code>crontab -e</code> and insert something like <code>30 2 * * 0 apt -y update && apt -y upgrade</code> into the file (<code>:wq</code> to save and quit).</p>
<p>Let's break it down:</p>
<pre><code> .------------------ minute (0 - 59)
| .--------------- hour (0 - 23)
| | .------------ day of month (1 - 31)
| | | .--------- month (1 - 12)
| | | | .------ day of week (0 - 6)
| | | | |
* * * * *
30 2 * * 0 apt -y update && apt -y upgrade
</code></pre>
<p>Basically a machine friendly version of <em>'please update the system every Sunday at 2:30AM'</em>.</p>
<p>There's plenty you can do with cronjobs, <a href="https://crontab.guru">this</a> website is a great tool to set specific patterns.</p>
<h2>Finding Things</h2>
<p>You might use <code>whereis</code> or <code>which</code> to look for executables, but that can be insufficient depending on the task at hand.</p>
<p><code>updatedb</code> can quickly index the whole system, which is neat since there's a tool called <code>locate</code> that can easily find files and directories whose paths contain any given string.</p>
<p>So if you run <code>updatedb</code> and then <code>locate cron</code>, you will see a list of files and directories containing <code>cron</code> in their paths.
Pretty cool!</p>
<h2>Ports</h2>
<p>Often you'll want to know which ports are in use.
Assuming <code>ss</code> is installed, you could use something like:</p>
<pre><code>sudo ss -tulpn | grep LISTEN
</code></pre>
<p>In some cases, you'll want to free a given port no matter who is occupying it.
Assuming <code>lsof</code> is installed:</p>
<pre><code>lsof -i tcp:[PORT_TO_FREE] | awk 'NR!=1 {print $2}' | xargs kill
</code></pre>
<p>Keep in mind that this is a bit of a nuclear option, you might want to double-check what you are getting rid of!</p>
<h2>Troubleshooting</h2>
<p>Troubleshooting issues is quite common when working on a VPS, and you'll likely find the root cause in the logs.</p>
<p>System-wide logs can be seen by running <code>journalctl -xe</code> (<code>-xe</code> is to make the results a bit more useful).
Of course, you can make your life easier if you know what you are after:</p>
<pre><code>journalctl -xeu brokenApp
</code></pre>
<p>This will only show entries relevant to your <code>brokenApp</code>.</p>
<p>Of course, not all apps use the system logs. These will usually have their own under <code>/var/log/</code>.
For example, <a href="../nginx">NGINX</a> has its access logs under <code>/var/log/nginx/access.log</code>, and the error logs in <code>/var/log/nginx/error.log</code>.
Have a look around, you'll definitely find what you're after.</p>
<p>Another useful troubleshooting utility is <code>systemctl</code>. It won't give you any logs, but you can use it to stop, restart, reload and start services manually and/or check their status:</p>
<pre><code>systemctl status appNotWorking.service
</code></pre>
<p>This command will give you plenty of information regarding that specific service.</p>
Set up a firewall with UFWhttps://devintheshell.com/blog/ufw/https://devintheshell.com/blog/ufw/Simple and Uncomplicated!Mon, 12 Jul 2021 16:14:37 GMT<h2>What it is</h2>
<p>The Uncomplicated Firewall is just an easy way to interact with <code>iptables</code>, the default way for Linux based systems to control connections to and from the web.</p>
<p>You'll usually find it in web servers, although it can --and arguably should-- be installed on your main machine.</p>
<h2>Basic Setup</h2>
<p>Let's take a look at a basic setup for a web server:</p>
<pre><code>Status: active
Default: deny (incoming), allow (outgoing)
To Action From
-- ------ ----
22 LIMIT IN Anywhere
80 ALLOW IN Anywhere
443 ALLOW IN Anywhere
</code></pre>
<p>The second line tells us the default policies for all unspecified ports. In this case it denies all incoming traffic while allowing all outgoing.
Specific port policies are listed below (SSH, HTTP, HTTPS, etc.).</p>
<h2>Config</h2>
<p>If we run <code>ufw status</code> right after installing it, we'll get an underwhelming <code>Status: inactive</code> as a response.</p>
<p>Makes sense, now let's configure a basic server-ready setup like the one above:</p>
<pre><code>ufw default deny incoming # Block everything from the web
ufw limit in 22 # Limit incoming SSH connections
ufw allow in 80 # Allow incoming HTTP connections
ufw allow in 443 # Allow incoming HTTPS connections
ufw enable
</code></pre>
<hr />
<p><strong>Important:</strong> Make sure not to block SSH traffic! You might lock yourself out of your VPS/server completely!</p>
<hr />
<p>Now if you run <code>ufw status verbose</code> you should see pretty much the same information as we saw in the example above.</p>
<h3>Deleting rules</h3>
<p>For example, if you want to delete the previous HTTPS rule:</p>
<pre><code>ufw delete allow in 443
ufw reload
</code></pre>
<h3>Fine-Tuning</h3>
<p>Of course, you can easily change the default behavior as well as fine-tune the policy on a per-port basis.
You can <code>deny</code>, <code>reject</code>, <code>limit</code> or <code>allow</code> either <code>in</code>coming or <code>out</code>going traffic for whichever port you might need, as well as use the same parameters to define default behaviors.</p>
Serve your site on Torhttps://devintheshell.com/blog/onion-tor/https://devintheshell.com/blog/onion-tor/Just because you canMon, 12 Jul 2021 16:14:24 GMT<h2>Why?</h2>
<p>Isn't Tor used by criminals to do bad stuff?
<a href="https://2019.www.torproject.org/about/torusers.html.en">Kinda</a>. It's also used by people that cannot safely browse the clear web and/or express their opinion due to tough restrictions or straight systematic oppression.
People that don't want certain queries to be public, journalists, whistleblowers and privacy minded people regularly browse the web through Tor.</p>
<p>It's really easy, so why not make your site accessible to Tor users? Consider that even if you are not using Tor today, you might in the future.</p>
<h2>Install & Enable</h2>
<p>For any serious use, you should have a look at <a href="https://community.torproject.org/onion-services/setup/install/">their install instructions</a>.
To keep it simple, you can probably find reasonably up-to-date packages in your preferred distro's repositories.</p>
<p>Once installed, open <code>/etc/tor/torrc</code> in your favorite editor, search for the lines
<code>HiddenServiceDir /var/lib/tor/hidden_service/</code>
and
<code>HiddenServicePort 80 127.0.0.1:80</code>
and uncomment them.</p>
<p>Start the service with <code>systemctl enable --now tor</code>.</p>
<p>Get your onion address with <code>cat /var/lib/tor/hidden_service/hostname</code>.</p>
<h2>Serve</h2>
<p>If you know how to <a href="../nginx">set up <code>nginx</code></a> this won't be anything new to you.</p>
<p>Simply create your <code>nginx</code> config file for the onion site by opening <code>/etc/nginx/sites-available/your-onion-website</code> with your favorite text editor.
Then, paste and adjust these lines:</p>
<pre><code>server {
listen 80 ;
root /var/www/onion-site ;
index index.html ;
server_name onion-address.onion ;
}
</code></pre>
<p>Ensure you point to the right path and onion address, and that's it!</p>
<p>Make sure to symlink the config file to <code>/etc/nginx/sites-enabled</code> and reload <code>nginx</code>.</p>
<p>You should now be able to open your Tor browser, paste your onion address, and visit your site!</p>
Serve your website with Nginxhttps://devintheshell.com/blog/nginx/https://devintheshell.com/blog/nginx/Quick and easyMon, 12 Jul 2021 10:50:30 GMT<p>You are going to need a server or a VPS, as well as a registered domain name.</p>
<h2>Basics</h2>
<p>Assuming you already have a properly configured <a href="../maintain-vps#ssh">SSH connection</a>, connect to your VPS and install NGINX.</p>
<p>Broadly speaking, NGINX will look for instructions on how to serve a given site in the <code>sites-enabled</code> directory.
These are usually symlinks to config files located under <code>sites-available</code>.</p>
<p>Let's learn by building something.</p>
<h3>Site level config</h3>
<p>Say you have your website under <code>/path/to/your/website/</code>, just as an example.
We'll begin by creating a config for your website:</p>
<pre><code>nano /etc/nginx/sites-available/yourwebsite
</code></pre>
<p>Of course, name <code>yourwebsite</code> whatever you like.</p>
<p>Just copy the config below, swapping <code>yourdomain.org</code> for whatever your domain is.</p>
<pre><code>server {
listen 80 ;
listen [::]:80 ;
server_name yourdomain.org ;
root /path/to/your/website ;
index index.html ;
location / {
try_files $uri $uri/ =404 ;
}
}
</code></pre>
<p>It's pretty self-explanatory:
<em>"Listen on port 80 (HTTP) for requests to <code>yourdomain.org</code>, and serve them whatever is under <code>/path/to/your/website</code>, starting with the <code>index.html</code> file. If nothing is there, respond with code 404."</em></p>
<h3>Enable the site</h3>
<p>Use <code>ln -s</code> to symlink the config file as explained above and restart nginx to make it load the new configurations:</p>
<pre><code>ln -s /etc/nginx/sites-available/yourwebsite /etc/nginx/sites-enabled
systemctl restart nginx
</code></pre>
<p>We are basically done at this point!
This will serve your website under your domain name. It's really that easy.</p>
<p>That said, there are some other things you might want to consider.</p>
<h2>NGINX config</h2>
<p>We went over how to configure NGINX at the site level.
There is also a more general config file: <code>/etc/nginx/nginx.conf</code>
Here you can tinker with more generic settings regarding NGINX itself.</p>
<p>For example, if you are hosting a file server you might want to play around with <code>client_max_body_size</code>.
This will change the maximum allowed size of the client request bodies.</p>
<p>You can change where the access or error logs are stored with <code>access_log</code> and <code>error_log</code>.</p>
<p>Or you could please the SEO gods with these Gzip settings:</p>
<pre><code>gzip on;
gzip_vary on;
gzip_min_length 10240;
gzip_proxied expired no-cache no-store private auth;
gzip_types text/plain text/css text/xml text/javascript application/x-javascript application/xml application/json;
gzip_disable "MSIE [1-6]\.";
</code></pre>
<h2>Security</h2>
<p>As a general rule, you want to at least have a <a href="../ufw">properly set up firewall</a>.</p>
<p>Moreover, your browser will try to prevent you from visiting your site.
This is due to the lack of certificates, modern browsers really prefer HTTPS (and for good reasons).</p>
<p>Let's see if we can get that lock icon that browsers like so much.</p>
<h3>Be the S in HTTPS</h3>
<p>There are multiple ways to obtain certificates for your site, but by far the easiest is to use <code>certbot</code>.</p>
<pre><code>apt install python-certbot-nginx
certbot --nginx
</code></pre>
<p>Just follow the instructions, it's dead easy.
It'll ask for:</p>
<ul>
<li>Your email (to notify about expiring certs)</li>
<li>Which domain/s to certify</li>
<li>Whether to redirect traffic from HTTP to HTTPS. This is definitely the way to go, so feel free to select option 2 here.</li>
</ul>
<p>Certificates created with this method need to be renewed every three months.
We can automate this using cron.</p>
<pre><code>crontab -e
</code></pre>
<p>Paste <code>0 0 2 * * certbot --nginx renew</code> into the file to create a cronjob that asks <code>certbot</code> to renew all certs at midnight on the 2nd of every month (<code>certbot</code> will only touch certs that are close to expiring).</p>
<p>Now we have a decent NGINX setup as a starting point for your website or for whatever else you might want to host on your <a href="../maintain-vps">VPS</a>.</p>
Map & Reduce in Javahttps://devintheshell.com/blog/java-streams/https://devintheshell.com/blog/java-streams/A quick overviewFri, 28 May 2021 15:01:39 GMT<h2>Streams rundown</h2>
<p>Since Java 8, you can use the Stream API to manipulate Collections.
They are a fairly semantic way to apply a series of changes to a Collection, producing <strong>another different</strong> Collection (or object, or variable) as a result.
No mutation occurs.</p>
<p>Generally speaking, you turn a Collection into a Stream with <code>.stream()</code> and pipe a series of <em>operators</em> to produce the desired result.</p>
<p>Each <em>intermediate</em> operator (<code>map</code>, <code>filter</code>, <code>sorted</code>, etc.) takes a Stream as input, while outputting another Stream.</p>
<p>You close the Stream by calling one of the <em>terminal</em> operators: <code>collect</code> will return a Collection (no surprises here), <code>forEach</code> will return Void, while <code>reduce</code> and <code>find</code> will be covered below.</p>
<h2>Map</h2>
<p>Runs a given function on each element of the Stream.</p>
<p>Something like <code>inputStream > Map(myFunction(element)) > outputStream</code>.</p>
<p>For example:</p>
<pre><code>List<String> list = Arrays.asList("this", "is", "a", "test");
List<Integer> answer = list.stream()
.map(String::toUpperCase)
.map(str -> str + ".txt")
.map(str -> str.length())
.collect(Collectors.toList());
System.out.println(answer);
</code></pre>
<p><code>Output: [8, 6, 5, 8]</code></p>
<h2>Filter</h2>
<p>Similar to <code>map</code>, but in this case, the function it receives needs to return a <code>boolean</code>.
This is because <code>filter</code> will only output the elements of the incoming Stream that return <code>true</code> after being evaluated with the function it received.</p>
<p>Something like <code>inputStream > Filter(myFilter()) > outputStream</code>, where <code>myFilter()</code> returns a <code>boolean</code>.</p>
<pre><code>List<String> list = Arrays.asList("this", "is", "another", "test");
List<String> answer = list.stream()
.filter(str -> str.length() > 3 && str.startsWith("a"))
.collect(Collectors.toList());
System.out.println(answer);
</code></pre>
<p><code>Output: [another]</code></p>
<h2>Reduce</h2>
<p>Produces a <strong>single</strong> result from a Stream, by applying a given combining operation to the incoming elements.</p>
<p>There are <strong>three</strong> possible components in this operation:</p>
<ul>
<li><strong>Identity</strong> (optional): the initial value for the reduction, and the default result if the Stream is empty.</li>
<li><strong>Accumulator</strong>: function that takes two parameters:
<ul>
<li>Partial result of the operation.</li>
<li>Next element of the Stream.</li>
</ul>
</li>
<li><strong>Combiner</strong> (optional): function used to combine partial results, needed under parallel execution or when the accumulator's two parameter types differ.</li>
</ul>
<p>Something like <code>inputStream > Reduce(myIdentity, myAccumulator, myCombiner) > result</code>.</p>
<h3>Accumulator</h3>
<p>Unless the accumulator has some complexity to it, you'll usually see it as a Lambda:</p>
<pre><code>String[] array = { "Java", "Streams", "Rule" };
Optional<String> combined = Arrays.stream(array).reduce((str1, str2) -> str1 + "-" + str2);
if (combined.isPresent())
System.out.println(combined.get());
</code></pre>
<p><code>Output: Java-Streams-Rule</code></p>
<p>By default, <code>reduce</code> will return an <code>Optional</code> of the type it finds in the incoming Stream, hence the <code>if</code> statement at the end.</p>
<p>You can avoid that part by closing the Stream with an <code>orElse()</code>.</p>
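<p>For example, reusing the previous snippet (the empty-string fallback is an arbitrary choice):</p>

```java
import java.util.Arrays;

String[] array = { "Java", "Streams", "Rule" };
String combined = Arrays.stream(array)
        .reduce((str1, str2) -> str1 + "-" + str2)
        .orElse(""); // returned instead of the Optional's value if the Stream is empty
System.out.println(combined);
```

<p><code>Output: Java-Streams-Rule</code></p>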
<h3>Identity</h3>
<p>Useful for avoiding <code>NullPointerException</code>s, especially when reducing complex objects.
Note that the identity must be the neutral element of the operation: 1 for a product, 0 for a sum.</p>
<pre><code>int product = IntStream.range(2, 8)
.reduce(1, (num1, num2) -> num1 * num2);
System.out.println("The product is: " + product);
</code></pre>
<p><code>Output: The product is: 5040</code></p>
<h3>Combiner</h3>
<p>When a Stream is executed in parallel, the JVM splits it into sub-streams that are reduced independently, so we'll need a way to combine the results of each sub-stream into one.</p>
<p>A simple example with the three <code>reduce</code> components explicitly set might look something like:</p>
<pre><code>int sumAges = Arrays.asList(25, 30, 45, 28, 32)
.parallelStream()
.reduce(0, (a, b) -> (a + b), Integer::sum);
System.out.println(sumAges);
</code></pre>
<p><code>Output: 160</code></p>
<p>The <strong>Combiner</strong> will also be necessary if different types are managed in the Accumulator.
In the example, the Accumulator has an <code>int</code> as partial result, but a <code>User</code> as next element:</p>
<pre><code>List<User> users = Arrays.asList( new User("Dacil", 30), new User("Gabriel", 35));
int result = users.stream()
.reduce(0, (partialAge, user) -> (partialAge + user.getAge()), Integer::sum);
</code></pre>
<h2>Find</h2>
<p>There are <strong>two</strong> variants of the <code>find</code> function in Java:</p>
<ul>
<li><strong>findFirst</strong>: Deterministically find the first element in the Stream.</li>
<li><strong>findAny</strong>: Return any single element of the Stream, disregarding order.</li>
</ul>
<p>One always gets the same element (given the same input Stream), while the other does not guarantee it.
Bear in mind that in simple single-threaded examples like these, both are likely to behave in the same way.</p>
<pre><code>String[] array = { "Stream", "Java", "Rule" };
Optional<String> first = Arrays.stream(array).sorted().findFirst();
if (first.isPresent())
System.out.println(first.get());
</code></pre>
<p><code>Output: Java</code></p>
<p>By default, <code>findFirst</code> will return an <code>Optional</code> of the type it finds in the incoming Stream.
Just like we did with our <code>reduce</code> example, you can avoid handling the Optional by closing the Stream with an <code>orElse()</code>.</p>
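<p>To actually see <code>findAny</code>'s looser guarantee, you'd need a parallel Stream. A minimal sketch (variable names are my own; which element comes back is unspecified):</p>

```java
import java.util.Arrays;
import java.util.List;

List<String> words = Arrays.asList("Stream", "Java", "Rule");
String any = words.parallelStream()
        .findAny()       // any element of the Stream may be returned
        .orElse("none"); // fallback for an empty Stream
System.out.println(words.contains(any)); // always true here, but the chosen element varies
```
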
Back in time with githttps://devintheshell.com/blog/git-undo/https://devintheshell.com/blog/git-undo/Undo your mistakesSat, 22 May 2021 16:46:36 GMT<p>One useful feature of VCS (git or otherwise) is the ability to restore the state of a project to a previous point in time.</p>
<p>Here are some common mistakes and how to fix them.</p>
<h2>Local changes</h2>
<p>Changes you might want to undo <strong>before</strong> being pushed to a remote.</p>
<h3>Commit</h3>
<p>Commits can be undone using the <code>git reset</code> command.
There are multiple ways to undo a commit, depending on what you want to do with the changes in it.</p>
<ul>
<li><code>--hard</code>: Discards all changes from the undone commit(s)</li>
<li><code>--soft</code>: Puts all changes in the staging area</li>
<li><code>--mixed</code>(default): Puts all changes in the working dir (unstaged)</li>
</ul>
<p>We also need to tell git <em>which commit we are resetting to</em>. This is done by passing the hash of the commit <strong>prior to the one to be undone</strong>:</p>
<pre><code>git reset --soft <hash_of_good_commit>
</code></pre>
<p>You can get the last 10 hashes with this command:</p>
<pre><code>git log -10 --abbrev-commit --pretty=oneline
</code></pre>
<p>If only the last commit needs to be reset, <code>HEAD~1</code> can be used instead of the hash to tell git to go to the commit before the current one:</p>
<pre><code>git reset HEAD~1
</code></pre>
<p>Of course this allows for any number of commits to be undone, not just the last one.</p>
<h3>Change</h3>
<p>Let's suppose, to keep things simple, that all changes to a file need to be undone.
If these changes have not yet been added to the staging area, <code>git restore <file></code> will remove those changes.</p>
<p>If instead they are already in the staging area, <code>git restore --staged <file></code> will unstage them, so that they can be either modified and restaged, or removed altogether using the previous command.</p>
<p>Of course if multiple files need to be handled a <code>.</code> can be used instead of a list of file names.
Consider that this will apply to <strong>all files</strong>.</p>
<p>Similarly, if all uncommitted changes need to be fully discarded, running <code>git reset --hard</code> with no commit hash will reset the state of the project to whatever is in the current commit, <strong>removing all other changes</strong>.</p>
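<p>A safe way to try all of the above is a throwaway repo (file names and contents here are made up):</p>

```shell
set -e
cd "$(mktemp -d)"
git init -q
echo "v1" > notes.txt
git add notes.txt
git -c user.email=me@example.com -c user.name=me commit -qm "add notes"

echo "scratch" >> notes.txt    # unwanted, unstaged edit
git restore notes.txt          # working copy matches the last commit again

echo "scratch" >> notes.txt
git add notes.txt              # staged by mistake
git restore --staged notes.txt # unstaged, edit kept in the working dir
git restore notes.txt          # ...and now discarded entirely
cat notes.txt                  # prints "v1"
```
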
<h3>Merge</h3>
<p>Undoing an in-progress merge is as simple as running <code>git merge --abort</code>.</p>
<p>If however the merge has already been committed, the previously mentioned <code>git reset --hard HEAD~1</code> will also work here. Of course, using the hash instead of <code>HEAD~1</code> would work as expected.
Merges are ultimately just fancy commits.</p>
<h3>The Nuclear option</h3>
<p>Sometimes, the local work tree gets mangled by a combination of odd git abstractions and user error.</p>
<p>It might be easier to fully reset the local env to whatever is currently on the remote repo.
To do this, run these commands:</p>
<pre><code>git fetch origin
git reset --hard origin/<branch>
git clean -xdf
</code></pre>
<p>Here, the state of the remote repo is fetched, the state of the local repo is reset to the remote one, and all untracked files are cleaned recursively, leaving the working area with no changes.</p>
<p>Indeed, at this point one might consider <code>rm -rf ./the_whole_project/ && git clone the_thing_again</code>.
This works, but also removes all branches and ignored files. Plus, big repos might take a while to fully clone. The commands described here should be more time-efficient.</p>
<h2>Pushed changes</h2>
<p>If the changes have already been pushed, using <code>reset</code> like before will require a <code>git push --force</code>, which will overwrite the remote repo with your current one (or more specifically, overwrite the conflicting changes).</p>
<p>This might not be an issue in a personal project but when working with other people it's a big no-no.
In fact, force pushes might be disabled altogether.</p>
<p>This makes sense, since changing the state of the remote while another person's work depends on that (now overwritten) state can render their work useless or take a while to merge back together.</p>
<h3>Revert</h3>
<p>Apart from <strong>resetting</strong> a branch to a given commit, we can also <strong>revert</strong> a specific commit (or set of commits).</p>
<p>This way, instead of <strong>removing</strong> commits, we <strong>add new ones</strong> with the changes required to reset the state of the project to how it was before the commit to be reverted.</p>
<p>So given git log like this:</p>
<pre><code>621d866 (HEAD -> master, origin/master) oh fuck
07ef6b4 another good commit
3dbbc2b good commit
</code></pre>
<p>Resetting the last commit would require a force push, but <code>git revert HEAD</code> will simply add a new commit that can be safely pushed:</p>
<pre><code>5c07fa2 (HEAD -> master) Revert "oh fuck"
621d866 (origin/master) oh fuck
07ef6b4 another good commit
3dbbc2b good commit
</code></pre>
<p>Beware however, that if a revert is done on a commit previous to the last one, and the reverted changes are needed for the changes in newer commits to work, those commits might break (as in, the build might break, or the tests might fail).
In those cases you might need to revert multiple commits or introduce further ones to fix the issues.</p>
<p>This has no good solution, so consider reverting a commit as soon as possible and committing small changes at a time.
A commit that changes 200 files is bound to cause issues when reverted, while one that only modifies a function likely will not.</p>
<h4>Merges</h4>
<p>Perhaps surprisingly, using the previous revert command on a merge commit will fail:</p>
<pre><code>error: commit <HASH> is a merge but no -m option was given.
fatal: revert failed
</code></pre>
<p>That <code>-m</code> flag takes a number that corresponds to the <strong>m</strong>ainline parent.</p>
<p>This makes sense, a merge by definition has two <em>parent</em> commits: the one you are on when running <code>git merge</code> (1) and the one you are merging into it (2).</p>
<p>So if merging <code>feature-branch</code> into <code>master</code>, the former would be 2 and the latter would be 1.</p>
<p>The command:</p>
<pre><code>git revert -m 1 <merge-commit-hash>
</code></pre>
<p>Would create a revert commit restoring the state of <code>master</code>.</p>
<p>Since more than two branches can be merged at once, the <code>-m</code> flag accepts any valid parent number.
In most cases, passing 1 to it will achieve the expected behavior.</p>
Task scheduling with Springhttps://devintheshell.com/blog/spring-scheduler/https://devintheshell.com/blog/spring-scheduler/And while we're at it, learn how to set up Cron JobsThu, 29 Apr 2021 16:19:52 GMT<h2>Enable the resource</h2>
<p>There are many ways to manage repeating tasks in Spring, but by far the easiest one is using the built-in Scheduler.</p>
<p>To enable it you just have to annotate the main class with <code>@EnableScheduling</code>.</p>
<p>It's worth pointing out that the default behavior doesn't allow for parallel execution of tasks.
To allow it, you'll also need <code>@EnableAsync</code> on the main class and <code>@Async</code> on the desired method.</p>
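<p>A minimal sketch of such a main class (the class name is arbitrary, and a Spring Boot project is assumed):</p>

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling // process @Scheduled methods
@EnableAsync      // allow @Async methods to run in parallel
public class SchedulerApp {
    public static void main(String[] args) {
        SpringApplication.run(SchedulerApp.class, args);
    }
}
```
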
<h2>Types of Scheduling</h2>
<p>Spring offers three ways of managing recurrent jobs:</p>
<h3>Fixed Rate</h3>
<p>Runs the method every 'X' milliseconds.
Enable it with <code>@Scheduled(fixedRate = timeInMilliseconds)</code>.</p>
<pre><code>@Scheduled(fixedRate = 2000)
public void repeatEveryTwoSeconds() {
System.out.println("I run every two seconds, no matter the previous run!");
}
</code></pre>
<h3>Fixed Delay</h3>
<p>Runs the method 'X' milliseconds after the previous execution is done.
Enable it with <code>@Scheduled(fixedDelay = timeInMilliseconds)</code>.</p>
<pre><code>@Scheduled(fixedDelay = 2000)
public void repeatAfterTwoSeconds() {
System.out.println("I run two seconds after the previous run is over!");
}
</code></pre>
<p>You can also adjust the initial execution delay adding <code>initialDelay</code>, as such: <code>@Scheduled(fixedDelay = 2000, initialDelay = 3000)</code>.</p>
<h3>Cron</h3>
<p>For greater flexibility, Spring allows us to adjust the repetition pattern with Cron.
Enable it with <code>@Scheduled(cron = "* * * * * *")</code>.</p>
<pre><code>@Scheduled(cron = "0 0 0 * * *")
public void repeatEveryMidnight() {
System.out.println("I run every day at midnight");
}
</code></pre>
<h2>Unix cron vs Spring cron</h2>
<p>There are some subtle differences between the cron schedules you'll set up in Spring applications and the ones you'll find in your typical Linux machine.</p>
<h3>Unix Cron</h3>
<pre><code>┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month(1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
</code></pre>
<h3>Spring Cron</h3>
<pre><code> ┌───────────── second (0-59)
│ ┌───────────── minute (0 - 59)
│ │ ┌───────────── hour (0 - 23)
│ │ │ ┌───────────── day of month (1 - 31)
│ │ │ │ ┌───────────── month (1 - 12)
│ │ │ │ │ ┌───────────── day of week (0 - 7) (0 and 7 are both Sunday)
│ │ │ │ │ │
│ │ │ │ │ │
* * * * * *
</code></pre>
<p>As you can see, where Unix-like Cron has only 5 fields (system crontabs add a sixth, but it specifies the user to run the job as), Spring-like Cron has 6, adding the ability to manage tasks by the second.</p>
<p>Moreover, while traditional Cron only supports macros on some systems, Spring's version does so by default:</p>
<table>
<thead>
<tr>
<th>Macro</th>
<th>Description</th>
<th>Cron</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>@yearly</code></td>
<td>Once a year</td>
<td><code>0 0 0 1 1 *</code></td>
</tr>
<tr>
<td><code>@monthly</code></td>
<td>Once a month</td>
<td><code>0 0 0 1 * *</code></td>
</tr>
<tr>
<td><code>@weekly</code></td>
<td>Once a week</td>
<td><code>0 0 0 * * 0</code></td>
</tr>
<tr>
<td><code>@daily</code></td>
<td>Once a day</td>
<td><code>0 0 0 * * *</code></td>
</tr>
<tr>
<td><code>@hourly</code></td>
<td>Once every hour</td>
<td><code>0 0 * * * *</code></td>
</tr>
</tbody>
</table>
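<p>If I recall correctly, since Spring 5.3 these macros can also be used directly in the annotation; a sketch:</p>

```java
// Assumes Spring 5.3+, where @Scheduled accepts cron macros directly
@Scheduled(cron = "@hourly") // same as "0 0 * * * *"
public void repeatEveryHour() {
    System.out.println("I run at the top of every hour");
}
```
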
Life beyond Google Searchhttps://devintheshell.com/blog/web-search/https://devintheshell.com/blog/web-search/Search engines worth usingSun, 25 Apr 2021 10:42:43 GMT<p>Even setting aside all the privacy concerns that come with using any Google product, some find the almighty search engine to be pretty lackluster.</p>
<p>Its results are filled with ads, spam and irrelevant or even auto-generated results.
Plus, it's nearly impossible to sift through the noise when investigating any sort of vaguely controversial topic.</p>
<p>To be fair, there is one thing that it does pretty well: help you out when you don't quite know what you are looking for.
I wouldn't say it does so in your best interest but hey, it's something.</p>
<h2>Search Engines</h2>
<p>There are quite a few search engines available to you.
Some of them (Bing) are widely considered meme-engines.</p>
<p>I would disagree: each of them has its own use case.
We are just used to the (supposed) omnipotence of Google.
Here is a quick overview:</p>
<h3><a href="https://duckduckgo.com">Duckduckgo</a></h3>
<p>Probably the most popular of the bunch.
Privacy minded, kind of bare bones.
No tracking, no profiling and much fewer ads than Google.
Pretty good general purpose alternative, except maybe for image search.</p>
<h3><a href="https://startpage.com">Startpage</a></h3>
<p>A different front-end to Google's back end.
The idea is that you still want Google's results (for some reason) but would rather not have the NSA over for dinner.
Most of the targeted advertisement <a href="https://github.com/prism-break/prism-break/issues/168">should</a> not spam your results.
They are based in Europe which might give you some peace of mind.</p>
<h3><a href="https://swisscows.com">Swisscows</a></h3>
<p>Super family friendly SE.
Built-in blockage of porn, violence and the likes.
It mixes its own indexing with Bing's.</p>
<h3><a href="https://bing.com">Bing</a></h3>
<p>Indeed, trusting Microsoft instead of Google is hardly any better.
However, it's preferable to have 5 different companies partially tracking you than to have one knowing you better than you know yourself.
They do their own indexing, so results should be more or less independent of Google.
Plus, it's actually pretty good for image searches.</p>
<h3><a href="https://yahoo.com">Yahoo</a></h3>
<p>The same reasoning as above more or less applies here as well. Still not a great service privacy wise, but useful in its own right.
If you are into crypto or finance in general, it has some pretty useful tools and is well respected in that regard.</p>
<h3><a href="https://www.qwant.com">Qwant</a></h3>
<p>Based in France, it recently started doing its own indexing.
Easy to use, simple UI, user-friendly design.</p>
<h3><a href="https://www.wolframalpha.com">Wolframalpha</a></h3>
<p>Mainly used in academia.
Rather different from what we usually understand by <em>'Search Engine'</em>, but quite useful for technical searches.</p>
<h3><a href="https://yandex.com">Yandex</a></h3>
<p>For those who hate the NSA but would love to meet the <s>KGB</s> SVR.
Jokes aside, it's very widely used across Russia's sphere of influence.
As such, it's very useful for learning different perspectives on sensitive issues or topics.</p>
<p>If you want to '<strong>legally</strong>' download content, look no further.</p>
<h3><a href="https://you.com/">You.com</a></h3>
<p>Pretty UI, basically no ads, and the ability to customize sources and searches to your heart's content.
What's not to like?</p>
<h3><a href="https://search.brave.com/">Brave search</a></h3>
<p>Another one of the few that do their own indexing.
Pretty fast and reliable. Their whole marketing revolves around privacy on the web, so you can expect a decent level of privacy.
It even offers to add results from other SE in case you aren't satisfied.</p>
<p>It also works on <a href="https://search.brave4u7jddbv7cyviptqjc7jusxh72uik7zt6adtckl5f4nwy2v72qd.onion/">Tor</a>!!</p>
<h2>Search with Searx</h2>
<p><em>So what now? Am I supposed to use twelve search engines instead of one?</em></p>
<p><img src="./yesbutno.webp" alt="yesbutno" /></p>
<p>You can just use a <strong>meta SE</strong>!
Simply put, it queries a bunch of different SE for you and presents all the results in a single page.</p>
<p><img src="./meta-search.webp" alt="meta-search" /></p>
<p>The main one that comes to mind is Searx (or more accurately, its fork SearxNG).</p>
<ul>
<li>It doesn't offer personalized results, because it doesn't generate a profile about you.</li>
<li>It doesn't save or share what you search for.</li>
<li>It's fully open source (code <a href="https://github.com/searxng/searxng">here</a>) so you can actually host your own instance (I actually do so and use it daily) or just choose one you trust from <a href="https://searx.space">this</a> list and use it.</li>
</ul>
<p>If you want you can set exactly how and which SE it queries. If you don't, you can just go ahead and use a public instance as is.
It won't work with every SE available, and it might be a bit fiddly on occasion, but it probably offers you more than you need, and it's just so convenient there's no getting around it.</p>
Please use a password managerhttps://devintheshell.com/blog/password-manager/https://devintheshell.com/blog/password-manager/You need one whether you know it or notWed, 31 Mar 2021 18:49:26 GMT<h2>Why you need it</h2>
<h3>Repeated passwords</h3>
<p>Even if you think it's new and clever, you might very well have used the password you just <em>'invented'</em> for a long forgotten account (which may or may not have leaked).
Repeating passwords is nearly as bad as setting them to '<em>admin</em>' or '<em>password1</em>'.</p>
<h3>Plain text</h3>
<p>If you don't use a password manager and don't repeat passwords, chances are you are storing them in an unencrypted, plain text file.</p>
<p>We all manage a <strong>huge</strong> amount of accounts, no way you can remember all those passwords.</p>
<p>Anybody with (even remote) access to your machine can read an unencrypted file.
Plus you need to be in that specific machine to access your passwords or copy that file around.</p>
<h3>TOTP</h3>
<p>Nowadays, it's often required to have some sort of MFA set up.
<strong>T</strong>ime-based <strong>O</strong>ne <strong>T</strong>ime <strong>P</strong>asswords are by far the most convenient and secure way to achieve this.</p>
<p>Plain and simple, this is not possible without a password manager.</p>
<h3>Work passwords</h3>
<p>You might not care about your personal stuff, but please do care about your work related accounts and credentials.</p>
<p>You put your whole company, coworkers and clients/users at risk when you neglect your online security at work.</p>
<p>One of the main ways attackers get access to user's sensitive information is by taking advantage of bad practices used by the people who are supposed to be trusted with that information.</p>
<h2>Why you want it</h2>
<h3>It's more comfy than your solution</h3>
<p>I can bet that the way you currently manage your passwords is either uncomfortable or insecure.
You either have them written in plain text in a file you have to fetch every time you log in (or even worse, written on a physical piece of paper like a <strong>caveman</strong>), or you let your browser manage them for you (good luck using a different browser or needing any kind of advanced management).</p>
<p>Good password managers, especially if they have a companion browser extension, are literally a one click solution to both creating good passwords and filling them into the login forms.</p>
<h3>Good passwords are hard</h3>
<p>Just look at the requirements for any account password and be honest: Can you really come up with a good one without using personal information like name or DOB?
Yeah, me neither.</p>
<h3>Typing huge passwords sucks</h3>
<p>And you always get something wrong.</p>
<h3>Lost the file? Lost all passwords</h3>
<p>If you store them in a file, your passwords are <strong>gone forever</strong> as soon as that file gets deleted.</p>
<p><img src="./no.webp" alt="no" /></p>
<p>That's just a bummer.</p>
<h2>What to use</h2>
<p>Well... a password manager 😀. Here is what to avoid and a personal suggestion.</p>
<h3>Avoid non-FOSS software. Here is why</h3>
<ol>
<li>Nobody knows what the code actually does or how secure it is. You are 100% just <strong>trusting</strong> the company offering the service.</li>
<li>FOSS is always <strong>more secure</strong>. It can be publicly audited and people <strong>will</strong> pick it apart and patch it.</li>
<li>If the company decides to make you pay for features that were once free you might have no choice, except <strong>maybe</strong> to export a JSON or CSV file and move away.</li>
<li>If the company goes six feet under, you're on a ticking time bomb to find an alternative.</li>
<li>You are in charge. You don't have to, but often <strong>can</strong> go and host the service <a href="https://vault.devintheshell.xyz">yourself</a>.</li>
</ol>
<h3>The dynamic duo</h3>
<h4><a href="https://bitwarden.com">BitWarden</a></h4>
<p>The more user friendly alternative.
They are widely used and known, are repeatedly audited by third parties, have a free and a paid business service, and have pretty much anything you might need:</p>
<ul>
<li>Desktop GUI</li>
<li>Desktop CLI</li>
<li>Mobile GUI</li>
<li>Browser Plugin</li>
<li>Web Vault</li>
</ul>
<p>You have the option to make an account with them and host your passwords in their servers (just like with any other password manager) <strong>or</strong> you can host your own instance on your own server.</p>
<p>If you plan to go that route, check out <a href="https://github.com/dani-garcia/vaultwarden">VaultWarden</a> for a super lightweight alternative!</p>
<h4><a href="https://keepassxc.org">KeePassXC</a></h4>
<p>Minimal solution (although not as minimalist as just using <a href="https://www.passwordstore.org">pass</a>).
It's a cross-platform implementation of the <a href="https://wiki.archlinux.org/index.php/KeePass">KeePass</a> standard with added plugin support.</p>
<p>You have a local encrypted vault which you connect to the plugin and that's it.
<strong>You</strong> are in charge of backups and security and can access the vault only locally, but there is literally no one else involved.
Not even a connection to the web.</p>
<h2>Conclusion</h2>
<p>Personally I have used a lot of different password managers.
Nowadays, I run a <a href="https://github.com/dani-garcia/vaultwarden">VaultWarden</a> instance on a VPS but still have a local copy available from KeePassXC, just in case.</p>
<p>No solution is perfect for everyone and each has valid use cases.</p>
<p>Except for not using one.
That's just silly 🙃.</p>
Clean Your Codehttps://devintheshell.com/blog/clean-code/https://devintheshell.com/blog/clean-code/High level overview of Uncle Bob's Clean Code points and core conceptsWed, 24 Mar 2021 14:52:21 GMT<p>Got the idea from <a href="https://gist.github.com/wojteklu/73c6914cc446146b8b533c0988cf8d29">here</a>, but concepts below come mostly from horizontal reading and Uncle Bob's speeches.</p>
<h2>Overview</h2>
<p>As a general rule of thumb, try to make your code <em>'pretty'</em>.</p>
<p>Look at the indentations, do they make sense? Are you like five indentation levels deep? Is the naming descriptive? Is there a more efficient way to achieve the same thing? Is your code elegant?
Do you actually feel proud about it and have an urge to show it to your peers?</p>
<h3>Made to be read</h3>
<p>You got your code to work? Great!
That means it's machine friendly. Now you need to go back and make it human friendly.</p>
<p>When coming up with new ideas or solutions to existing problems, our minds tend to become a bit messy.
We forget stuff, rush through ideas and generally don't care much for the <em>'proper solution'</em> but focus on <em>'a solution'</em>.</p>
<p>This is good, we <strong>should</strong> focus on getting the job done first.
But we also need to take some time after the fact to clean after ourselves.</p>
<p>You are most likely not the only one working on that piece of code, treat it like a common space.
First make it work, then clean it up.</p>
<blockquote>
<p>You're not done when it works, you're done when it's right.
— <cite>Robert C. Martin</cite></p>
</blockquote>
<p>Code is clean if it can be understood easily by a competent reader.
Another developer should be able to enhance it (or fix it for that matter).</p>
<p><img src="./wtfdoor.webp" alt="wtfdoor" /></p>
<p>Surprises might be nice irl but I'm not looking forward to going <em>"WTF?"</em> when reading your code.
The more predictable the better.</p>
<blockquote>
<p>Clean code does one thing well.
— <cite>Bjarne Stroustrup</cite></p>
</blockquote>
<h2>Some Rules</h2>
<h3>General</h3>
<ol>
<li><strong>Conventions</strong> are there for a reason. It's easier for me to understand you if we both follow a set of common rules.</li>
<li><strong>KISS</strong> (Keep it simple stupid). Reduce complexity as much as possible, keep it <em>minimalistic</em>.</li>
<li>Always find the <strong>root cause</strong>. Don't just fix the problem, actually fully resolve the issue.</li>
<li><strong>Boy Scout Rule</strong>. Leave the code better than you found it.</li>
<li><strong>Unix</strong> philosophy. Do one thing and do it well, don't write code with excessive/diverse responsibilities.</li>
</ol>
<h3>Understandability</h3>
<ol>
<li>Be <strong>consistent</strong>. If class A has a <code>var name</code> and a <code>var surname</code>, class B shouldn't have a <code>var firstName</code> and a <code>var lastName</code> in their place.</li>
<li>Use <strong>explanatory</strong> variables names. They help with semantics.</li>
<li>Prefer <strong>value objects</strong> to primitive types. They also help with semantics.</li>
<li>Avoid <strong>negative conditionals</strong> and be generally careful with conditionals, they get out of hand fast.</li>
</ol>
<h3>Naming</h3>
<ol>
<li>They should be descriptive, unambiguous, meaningful, pronounceable and searchable.</li>
<li>Don't use <code>data</code> or <code>object</code> in the name. We already know.</li>
<li>No Magic Numbers.</li>
</ol>
<h3>Functions</h3>
<ol>
<li>They should be <strong>small</strong> and <strong>single scoped</strong>.</li>
<li>Their names should describe their intent. Default to using <strong>verbs</strong> if possible.</li>
<li>No more than 3 <strong>arguments</strong>, the fewer, the better.</li>
<li><strong>Side effects</strong> are usually bad, hard to track and easy to avoid.</li>
<li>Don't use <strong>flag</strong> arguments. If a function does one thing or another based on the input, you should have two functions, not one.</li>
</ol>
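<p>Rule 5 in practice might look like this (a made-up formatting example; the class and method names are mine):</p>

```java
public class Formatter {
    // A flag argument hides two behaviors in one function...
    static String format(String body, boolean asHtml) {
        return asHtml ? "<p>" + body + "</p>" : body;
    }

    // ...while two intention-revealing functions keep each one single-scoped
    static String formatAsHtml(String body) {
        return "<p>" + body + "</p>";
    }

    static String formatAsPlainText(String body) {
        return body;
    }

    public static void main(String[] args) {
        System.out.println(formatAsHtml("hello")); // <p>hello</p>
    }
}
```
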
<h3>Comments</h3>
<ol>
<li>A <strong>necessary evil</strong>. To be avoided if possible.</li>
<li>Trust your VCS, <strong>remove</strong> the code.</li>
<li>Use them to: explain the <strong>intent</strong>, <strong>clarify</strong> the code or <strong>warn</strong> of consequences.</li>
</ol>
<h3>Structure</h3>
<ol>
<li>Prefer <strong>vertical</strong> coupling/cohesion rather than horizontal.</li>
<li><strong>Group</strong> variables, objects and functions if they relate in usage.</li>
<li>Respect <strong>indentation</strong>.</li>
<li>Keep <strong>lines short</strong>.</li>
<li><strong>Blank lines</strong> can be used to separate weakly related elements. Consider separating them further.</li>
</ol>
<h3>Objects</h3>
<ol>
<li>Expose an <strong>interface (API)</strong>, hide internal structure.</li>
<li>Keep them <strong>small</strong> and <strong>single scoped</strong>.</li>
<li>Few imports and few <strong>instance variables</strong>.</li>
</ol>
<h3>Avoid</h3>
<ol>
<li><strong>Rigid</strong> software is hard to change, and there are usually cascading consequences for each change.</li>
<li><strong>Fragile</strong> code breaks in many places due to a single change.</li>
<li><strong>Complexity</strong> is sometimes <em>necessary</em>, but often <em>accidental</em>. Avoid the latter.</li>
<li>Code <strong>Repetition</strong> is a pain to work with.</li>
<li><strong>Opacity</strong> is not a sign of a clever mind, but of an uncaring personality.</li>
</ol>