Even in a world of Kotlin Multiplatform, there are other options which might cover more specific use cases (more about that later in this post).
This is actually the main reason why I would like to present Rust as a candidate for code reusability across different platforms… in this case for Android development.
Our project consists of an Android Application that will call Rust code in order to encrypt/decrypt a given String:
Our Android App calling Rust code.
Before continuing, it is worth mentioning that the entire codebase sits in a GitHub repository containing extra documentation and code comments to facilitate UNDERSTANDING and LEARNING.
In a nutshell, our project will follow this flow:
Our global project overview.
Rust and Android interaction involves a bunch of parts (my approach is to have 2 separate projects that we can evolve independently):
The generated shared libraries (.so) should be placed in the jniLibs folder inside the Android project. As a next step, let’s run the project, break things down and dive deeper into each part.
After cloning the repo, follow the steps below:
1. Make sure ANDROID_HOME points to your Android SDK, e.g. /home/fernando/Android/Sdk (it is used by the cryptor_jni/build.rs file).
2. Check that $ANDROID_HOME/ndk/25.2.9519653 matches with ANDROID_NDK_VERSION = "25.2.9519653" (otherwise check the Cargo.toml file for the correct one).
3. Inside the rust-library/cryptor_jni folder, run cargo run --bin release to build the library, cargo run --bin publish to copy the artifacts, and cargo test to run the tests.
4. Open the android-sample folder (build.gradle.kts file) and run the app via the IDE.
The Rust project structure (called crypto) looks like this:
Our Rust ‘crypto’ project overview.
The cryptor crate exposes string encryption/decryption. Let’s use the example of text encryption (it is simplified by only base64-encoding a string). So here is our encrypt function in Rust, part of the cryptor crate inside the cryptor/src/lib.rs file:
use base64::{
Engine as _,
engine::general_purpose::STANDARD as base64Engine
};
///
/// Encrypts a String.
///
pub fn encrypt(to: &str) -> String {
base64Engine.encode(String::from(to))
}
And a tiny test for it:
use cryptor;
#[test]
fn test_encrypt_string() {
let to_encrypt = "hello_world_from_rust";
let str_encoded_b64 = "aGVsbG9fd29ybGRfZnJvbV9ydXN0";
let encrypted_result = cryptor::encrypt(&to_encrypt);
assert_eq!(str_encoded_b64, encrypted_result);
}
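Since the simplified encrypt is just base64, we can cross-check the expected value from the test above with plain coreutils (a quick sanity check on the side, not part of the project):

```shell
# The Rust test expects the base64 encoding of the input string;
# the coreutils base64 tool should agree with it.
printf '%s' 'hello_world_from_rust' | base64
# prints: aGVsbG9fd29ybGRfZnJvbV9ydXN0
```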
Now we need our JNI API in place, which makes use of our cryptor crate (as showcased in the crypto project structure picture above). This sits inside the cryptor_jni/src/lib.rs file:
///
/// [cfg(target_os = "android")]: Compiler flag ("cfg") which exposes
/// the JNI interface for targeting Android in this case
///
/// [allow(non_snake_case)]: Tells the compiler not to warn if
/// we are not using snake_case for a variable or function names.
/// For Android Development we want to be consistent with code style.
///
#[cfg(target_os = "android")]
#[allow(non_snake_case)]
pub mod android {
extern crate jni;
// This is the interface to the JVM
// that we'll call the majority of our
// methods on.
// @See https://docs.rs/jni/latest/jni/
use self::jni::JNIEnv;
// These objects are what you should use as arguments to your
// native function. They carry extra lifetime information to
// prevent them escaping this context and getting used after
// being GC'd.
use self::jni::objects::{JClass, JString};
// This is just a pointer. We'll be returning it from our function.
// We can't return one of the objects with lifetime information
// because the lifetime checker won't let us.
use self::jni::sys::jstring;
use cryptor::encrypt;
///
/// Encrypts a String.
///
#[no_mangle] // This keeps Rust from "mangling" the name so it is unique (crate).
pub extern "system" fn Java_com_fernandocejas_rust_Cryptor_encrypt<'local>(
mut env: JNIEnv<'local>,
// This is the class that owns our static method. It's not going to be used,
// but still must be present to match the expected signature of a static
// native method.
_class: JClass<'local>,
input: JString<'local>,
) -> jstring {
// First, we have to get the string out of Java. Check out the `strings`
// module for more info on how this works.
let to_encrypt: String = env.get_string(&input)
.expect("Couldn't get java string!").into();
// We encrypt our str calling the cryptor library
let encrypted_str = encrypt(&to_encrypt);
// Here we have to create a new Java string to return. Again, more info
// in the `strings` module.
let output = env.new_string(&encrypted_str)
.expect("Couldn't create Java String!");
// Finally, extract the raw pointer to return.
output.into_raw()
}
}
Something worth paying a bit of attention to is the function signature, which we will cover in our Android project part. But for now, let’s leave it here and focus on our artifact (crate) generation.
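As a side note, the exported symbol name follows the JNI convention Java_&lt;package&gt;_&lt;Class&gt;_&lt;method&gt;, with the dots of the package replaced by underscores. A tiny shell illustration of how the name is derived (just to show the convention, not part of the repo):

```shell
# Derive the JNI symbol name from package, class and method.
package="com.fernandocejas.rust"
class="Cryptor"
method="encrypt"
symbol="Java_$(echo "$package" | tr '.' '_')_${class}_${method}"
echo "$symbol"
# prints: Java_com_fernandocejas_rust_Cryptor_encrypt
```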
At this point, our Rust code is in place, and we need to generate our .so artifacts via cargo (the Rust package manager).
When building the cryptor_jni crate with the cargo build command (inside our cryptor_jni folder), cargo first searches for a build script file (build.rs) in the root folder of the project in order to execute it.
AND HERE IS WHERE THE MAGIC HAPPENS!!!… so let’s have a look at what is inside our build.rs file:
...
static ANDROID_NDK_VERSION: &str = "25.2.9519653";
...
fn main() {
system::rerun_if_changed("build.rs");
create_android_targets_config_file();
add_android_targets_to_toolchain();
}
Basically, we are creating a cargo config file containing Android target information, needed by cargo to perform cross-compilation.
Run cargo build inside the cryptor_jni folder and, once done, open the generated file at rust-library/cryptor_jni/.cargo/config, which should look similar to this:
[target.armv7-linux-androideabi]
ar = ".../ndk/25.2.9519653/.../linux-x86_64/bin/arm-linux-androideabi-ar"
linker = ".../ndk/25.2.9519653/.../linux-x86_64/bin/armv7a-linux-androideabi21-clang"
[target.i686-linux-android]
ar = ".../ndk/25.2.9519653/.../linux-x86_64/bin/i686-linux-android-ar"
linker = ".../ndk/25.2.9519653/.../linux-x86_64/bin/i686-linux-android21-clang"
[target.aarch64-linux-android]
ar = ".../ndk/25.2.9519653/.../linux-x86_64/bin/aarch64-linux-android-ar"
linker = ".../ndk/25.2.9519653/.../linux-x86_64/bin/aarch64-linux-android21-clang"
[target.x86_64-linux-android]
ar = ".../ndk/25.2.9519653/.../linux-x86_64/bin/x86_64-linux-android-ar"
linker = ".../ndk/25.2.9519653/.../linux-x86_64/bin/x86_64-linux-android21-clang"
Each target in the above config file derives from the official Android documentation on “Using the NDK with other build systems”, which basically states that in order to build for a specific CPU architecture and instruction set (ABI), the pre-compiled toolchains provided by the Android NDK need to be used (e.g. arm-linux-androideabi-ar and armv7a-linux-androideabi21-clang).
Now cargo knows what to build and how. The next step is to add those targets to the Rust toolchain, which is basically what this line of code does:
fn main() {
...
/// ## Examples
/// `rustup target add arm-linux-androideabi`
///
/// Reference:
/// - https://rust-lang.github.io/rustup/cross-compilation.html
add_android_targets_to_toolchain();
}
The above code enables us to run the following commands if we wanted to individually build our targets:
cargo build --target armv7-linux-androideabi
cargo build --target i686-linux-android
cargo build --target aarch64-linux-android
cargo build --target x86_64-linux-android
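For reference, these four invocations can be expressed as a small shell loop (shown here as a dry run that only prints each command; this is an illustration, not part of the repo — drop the echo to actually build):

```shell
# Print the build command for every Android target (dry run).
for target in armv7-linux-androideabi i686-linux-android \
              aarch64-linux-android x86_64-linux-android; do
  echo cargo build --target "$target" --release
done
```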
Although this is perfectly valid, it is tedious… that is why it is good practice to AUTOMATE ALL THE THINGS (as much as possible). And this is done by the cryptor_jni/src/bin/release.rs file, relying on cargo binary targets, which are basically programs that can be executed after compilation:
cargo run --bin release
Last but not least, there is another binary target called publish (publish.rs file) that we can execute:
cargo run --bin publish
This will copy all the generated targets to their corresponding Android directories in our android-sample project.
We have mentioned ABIs along this article, but what is that exactly, and how does an ABI relate to a target?
ABI stands for Application Binary Interface, which is a combination of a CPU type/architecture and instruction set. In Android development, any NDK target must be mapped to a specific directory in the project. According to the documentation, this relationship is as follows:
-------------------------------------------------------------
ANDROID TARGET ABI (folder inside `jniLibs`)
-------------------------------------------------------------
armv7a-linux-androideabi ---> armeabi-v7a
aarch64-linux-android ---> arm64-v8a
i686-linux-android ---> x86
x86_64-linux-android ---> x86_64
-------------------------------------------------------------
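The table above can be captured in a small helper (a hypothetical sketch using the cargo target triples; the function name is made up for illustration):

```shell
# Map a Rust/NDK target triple to the jniLibs ABI folder
# the resulting .so must be copied into.
abi_for_target() {
  case "$1" in
    armv7-linux-androideabi) echo "armeabi-v7a" ;;
    aarch64-linux-android)   echo "arm64-v8a" ;;
    i686-linux-android)      echo "x86" ;;
    x86_64-linux-android)    echo "x86_64" ;;
    *)                       echo "unknown" ;;
  esac
}

abi_for_target aarch64-linux-android
# prints: arm64-v8a
```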
So far, everything is ready for development on the Rust side with some automation… But of course, there are a couple of IMPROVEMENTS that I did not want to skip… and even though these are OUT OF SCOPE of this article, they are definitely worth highlighting:
Generated binaries (the .so files copied into the jniLibs directory): ideally these should be properly versioned (mentioned above) and uploaded to a crates repository or similar.
On the Android side of things, there are a couple of moving parts that we have to take into consideration.
In the build.gradle.kts we need to add NDK configuration:
android {
...
ndk {
// Specifies the ABI configurations of your native
// libraries Gradle should build and package with your APK.
// Here is a list of supported ABIs:
// https://developer.android.com/ndk/guides/abis
abiFilters.addAll(
setOf(
"armeabi-v7a",
"arm64-v8a",
"x86",
"x86_64"
)
)
}
...
}
Loading the JNI library is done at Android Application class level:
class AndroidApplication : Application() {
override fun onCreate() {
super.onCreate()
loadJNILibraries()
}
private fun loadJNILibraries() {
/**
* Loads the Crypto C++/Rust (via JNI) Library.
*
* IMPORTANT:
* The name passed as argument maps to the
* original library name in our Rust project.
*/
System.loadLibrary("cryptor_jni")
}
}
In order to call Rust via JNI, we have to respect the method/function signature. This is essential and a MUST so that classes, functions and methods can be found by the Android runtime.
Remember this piece of code from our cryptor_jni project that encrypts a String:
...
pub extern "system" fn Java_com_fernandocejas_rust_Cryptor_encrypt<'local>(
mut env: JNIEnv<'local>,
_class: JClass<'local>,
input: JString<'local>,
) -> jstring {
...
}
...
Invoking it from Kotlin means creating a Kotlin class RESPECTING package and function naming:
package com.fernandocejas.rust
/**
* Helper that acts as an interface between native
* code (in this case Rust via JNI) and Kotlin.
*
* By convention the function signatures should respect
* the original ones from Rust via JNI Project.
*/
class Cryptor {
/**
* Encrypt a string.
*
* This is an external call to Rust using
* the Java Native Interface (JNI).
*
* @link https://developer.android.com/ndk/samples/sample_hellojni
*/
@Throws(IllegalArgumentException::class)
external fun encrypt(string: String): String
...
}
We are done!!! Now we can inject our Cryptor class where it is needed and encrypt/decrypt Strings:
private val cryptor = Cryptor()
val encryptedString = cryptor.encrypt("something")
You might be wondering what is the real purpose of all this wiring for integrating Rust with Android… At the moment I can think of some real use cases:
As a conclusion, I would say… WE DO NOT WANT to replace Kotlin with Rust. Kotlin is very good at what it is meant for: Android development in THIS CASE. But keep in mind that PICKING THE BEST TOOL FOR THE RIGHT JOB is essential to fulfill project requirements, and it is in this context where we can count on Rust as ONE MORE TOOL IN OUR TOOLBOX.
Ufff… that was a loooong post… but if you made it to this point, you should definitely feel proud of yourself… Till the next time and do not forget to provide FEEDBACK!!!
After a long time of procrastination, I finally finished this blog post where I would like to share my journey on setting up my personal HOME LAB. This includes 2 different approaches: Kubernetes and Docker.
I will also highlight some of the problems I bumped into, the cost of maintenance and some tips and tricks.
SPOILER: I started with Kubernetes and ended up with a pure/plain Docker approach. Both are great tools and should be used for their intended purpose. So in order to understand what happened here, let’s start this journey and jump on this train!!!
I strongly believe that this is the first question we have to ask ourselves whenever we decide to go for such a complex project.
We have to establish our goals, so here are mine:
Whenever we make such a decision, we need to keep in mind the commitment to the project, which includes the following time investment:
Software does not exist without hardware backing it up… and to be honest, I could have done pretty much all of this via some cloud provider, but then it would no longer be a HOME LAB nor a fully PRIVATE HOME SERVER, so I opted for the following bare metal:
Arch Linux with LTS Kernel is my choice as my Home Lab Server.
This was my first approach, based on Kubernetes, so in order to understand what that means, here is what the official website says:
“Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.”
At first I set up the official Kubernetes (k8s), but then I realized that there is a more lightweight version of it, which fulfilled my needs. It is also called Kubernetes (k3s), and it is basically a single binary that only takes up around 50 MB of space and has low resource usage of around 300 MB of RAM. Even though there are tiny differences, they are mostly compatible, so learning one will pretty much cover the other. That is perfect!!!
Before continuing, it is VERY IMPORTANT to understand some of the concepts or fundamental blocks that are part of Kubernetes. Here is a summary in a very simplistic way:
Of course we are just scratching the surface here, but for our purpose, that is enough. There is way more, and for a deeper explanation, refer to the official Kubernetes Documentation.
Now that we understand the Kubernetes main concepts, here is a raw picture of my Home Lab Infrastructure with Kubernetes (k3s):
Home Lab General Architecture with Kubernetes.
WHAT IS GOING ON? In a nutshell, here is the normal flow when accessing any of the hosted services in my Arch Linux servers:
Now that we have the big picture on what is going on, mostly at hardware level (mentioned in the previous section), the next step would be to answer the following question:
What happens when I reach any hosted app contained in the Kubernetes Cluster?
A picture is worth a thousand words:
Kubernetes Application Flow.
As we can see, this is the flow:
The Load Balancer exposes our app to the outside via Service.Type=LoadBalancer.
Now we have to get our hands dirty and start setting up our cluster.
At this point in time, I assume that we have the minimum set of requirements in place:
These are the steps and list of ingredients we need for our recipe:
1. Install k3s MASTER and WORKER nodes.
2. Install kubectl (if not already installed after Step 1) in order to connect remotely to the cluster.
3. Install Helm if necessary, a package manager for Kubernetes, which will facilitate actually installing packages in our cluster.
4. Setup a Load Balancer. k3s already comes with ServiceLB, but I found MetalLB the right option for bare metal, because it makes the setup easier on clusters that are not running on cloud providers (we have to disable ServiceLB though).
5. Install Nginx Web/Reverse Proxy, which is our Ingress, in order to expose HTTP and HTTPS routes from outside the cluster to services within the cluster. k3s recommends another option too: Traefik, so up to you.
6. Install and configure cert-manager. I would label this as OPTIONAL, but I guess we want to have valid SSL/TLS certificates to avoid our browser warning us when accessing our hosted applications.
7. Deploy and configure Kubernetes Dashboard, which is a web-based Kubernetes user interface.
If everything went well so far, we should be able to see information about our cluster by running:
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP
kube-master Ready master 44h v1.25.0+k3s.1 192.168.0.22 <none>
kube-worker Ready <none> 2m47s v1.25.0+k3s.1 192.168.0.23 <none>
Or we can also access our Kubernetes Dashboard (sample picture):
The kubernetes-dashboard provides a great UI to manage our cluster.
We have a variety of tools in this area:
kubectl (already installed).
I would say that it is up to you to choose the one that best fulfills your requirements.
Also, let’s not forget to check the Addons sections in the Kubernetes Official Documentation.
k9s is such a powerful Kubernetes Terminal Client.
This is a simple example where we will be deploying https://draw.io/ to our Kubernetes cluster.
$ kubectl create namespace home-cloud-drawio
Create a drawio-app.yml file with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: home-cloud-drawio
name: drawio
spec:
replicas: 1
selector:
matchLabels:
app: drawio
template:
metadata:
labels:
app: drawio
spec:
containers:
- name: drawio
image: jgraph/drawio
resources:
limits:
memory: "256Mi"
cpu: "800m"
ports:
- containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
namespace: home-cloud-drawio
name: drawio-service
spec:
selector:
app: drawio
ports:
- port: 5001
targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
namespace: home-cloud-drawio
name: drawio-ingress
labels:
name: drawio-ingress
spec:
rules:
- host: home-cloud-drawio
http:
paths:
- pathType: Prefix
path: "/"
backend:
service:
name: drawio-service
port:
number: 5001
ingressClassName: "nginx"
Apply the drawio-app.yml file:
$ kubectl apply -f drawio-app.yml
BOOM!!! We have basically created a Deployment, together with a Service and an Ingress configuration to access our hosted app from outside the cluster.
Now let’s check the running services to corroborate that everything works as expected:
$ kubectl get services -o wide --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP
default kubernetes ClusterIP 10.43.0.1 <none>
kube-system kube-dns ClusterIP 10.43.0.10 <none>
kube-system metrics-server ClusterIP 10.43.33.97 <none>
kube-system nginx-ingress LoadBalancer 10.43.196.229 192.168.0.200
home-cloud-drawio drawio ClusterIP 10.43.35.88 <none>
We can access our application by visiting http://192.168.0.200 in our browser (ignoring the SSL/TLS warning).
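Since the Ingress rule above matches on the host name home-cloud-drawio, an entry in /etc/hosts pointing that name at the LoadBalancer IP also lets us use the host name in the browser (a sketch, assuming the IP from the output above):

```
192.168.0.200   home-cloud-drawio
```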
In this example we have not added any extra complexity (for learning purposes), but if a hosted app requires storage, we will have to create Kubernetes Persistent Volumes too. The same goes for, for example, Let’s Encrypt certificates.
kubectl is a very powerful CLI, it has great documentation and a very useful cheatsheet.
These are some of the most common commands I use:
# Cluster information
$ kubectl cluster-info
$ kubectl get nodes -o wide
# Check running Services
$ kubectl get services -o wide --all-namespaces
# Check running Ingress
$ kubectl get ingresses --all-namespaces
# Display all the running Pods
$ kubectl get pods -A -o wide
# Get logs for a specific Pod
$ kubectl logs -f <your_pod> -n <your_pod_namespace>
# Get information about a specific Pod
$ kubectl describe pod <your_pod> -n <your_pod_namespace>
Ok, so at this point in time…I LEARNED A LOT (and invested a lot of time too)…but I also HAD HEADACHES, and this is where the Rule of Seven applied:
In the end, I had a bunch of moving parts (with Kubernetes) which turned out to be super complicated for what I really needed; plus I had a cluster with a lot of capacity that I was barely using (refer to the Monitoring section for more on this).
That is why I decided to apply what I ALWAYS encourage in my daily work life:
Based on my previous points, then a pure Docker approach (with docker compose) was the way to go:
Home Lab General Architecture with Docker.
Upfront, this infrastructure architecture seems very similar to the one defined with Kubernetes, and indeed it is: the flow is the same as described above and the server configuration is the same too. The biggest changeset has to do with implementation details:
docker is easier.
As an example, we will set up the same application as above (draw.io) with docker compose:
Create a home-lab.yml file:
version: "3.8"
services:
traefik:
image: traefik:latest
container_name: traefik
command:
# Dynamic Configuration: mostly used for TLS certificates
- --providers.file.filename=/etc/traefik/dynamic_conf.yml
# Entrypoints configuration
- --entrypoints.web-secure.address=:443
labels:
- traefik.http.routers.traefik_route.rule=Host(`traefik.home.lab`)
- traefik.http.routers.traefik_route.tls=true
- traefik.http.routers.traefik_route.service=traefik_service
- traefik.http.services.traefik_service.loadbalancer.server.port=8080
ports:
- 80:80
- 443:443
volumes:
- ~/traefik/dynamic_conf.yml:/etc/traefik/dynamic_conf.yml
- ~/traefik/_wildcard.home.lab.pem:/etc/traefik/_wildcard.home.lab.pem
- ~/traefik/_wildcard.home.lab-key.pem:/etc/traefik/_wildcard.home.lab-key.pem
networks:
- home-lab-network
restart: always
drawio:
image: jgraph/drawio:latest
container_name: drawio
labels:
- traefik.http.routers.drawio_route.rule=Host(`drawio.home.lab`)
- traefik.http.routers.drawio_route.tls=true
- traefik.http.routers.drawio_route.service=drawio_service
- traefik.http.services.drawio_service.loadbalancer.server.port=8080
networks:
- home-lab-network
restart: always
Let’s understand first what is going on within this file:
I generated a wildcard certificate with mkcert for my custom domain: home.lab. It is referenced from the dynamic_conf.yml defined in our docker home-lab.yml file (volumes section), which looks like this:
tls:
certificates:
- certFile: /etc/traefik/_wildcard.home.lab.pem
keyFile: /etc/traefik/_wildcard.home.lab-key.pem
stores:
- default
stores:
default:
defaultCertificate:
certFile: /etc/traefik/_wildcard.home.lab.pem
keyFile: /etc/traefik/_wildcard.home.lab-key.pem
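One more detail: the home-lab.yml above attaches both containers to a home-lab-network network, which also needs a top-level definition in the compose file (a minimal sketch, assuming the default bridge driver):

```yaml
networks:
  home-lab-network:
    driver: bridge
```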
$ docker compose -f home-lab.yml up -d
$ docker ps -a
CONTAINER ID IMAGE STATUS PORTS NAMES
de20745cda65 traefik:latest Up 5 hours 0.0.0.0:80->80/tcp traefik
as24545tda76 drawio:latest Up 5 hours 0.0.0.0:8080->8080/tcp drawio
To access our hosted app, let’s just open a browser and go to https://drawio.home.lab.
First, it is mandatory to check the official documentation and the docker CLI cheatsheet.
# Running containers
$ docker ps -a
$ docker container ls -a
# Container management/handling
$ docker container stop <container_name>
$ docker container restart <container_name>
$ docker container rm <container_name>
# Image management/handling
$ docker images
$ docker image rm <image_id>
# Existent Volumes
$ docker volume ls
We can use 4 main services for Alerting and Monitoring:
Useful official setup guides:
Here is a screenshot of my Home Lab Monitoring/Alerting via the mentioned services/tools, where Prometheus scrapes cAdvisor performance data and it is displayed on a Grafana dashboard:
Grafana - Prometheus - cAdvisor combo for Alerting and Monitoring.
Bonus: we can use ctop locally in our Linux server:
ctop provides a concise overview of real-time metrics for multiple containers.
There is ‘NO 100%’ secure system, but we can always reduce risk. Personally:
So far, I have mentioned that probably the safest way to access our Home Lab is to set up a WireGuard VPN, but there are a couple of alternatives to still set up our Home Lab for external access:
Fault Tolerance simply means a system’s ability to continue operating uninterrupted despite the failure of one or more of its components.
A system is resilient if it continues to carry out its mission in the face of adversity.
Revisiting these concepts triggers a couple of questions we need to answer…
No silver bullets here, and I also gotta say that in this space our approach with Kubernetes clearly wins, especially due to the capacity of having multiple worker nodes (high availability by nature): if one of them fails, the others can continue operating and take over the load of the failed one. The downside is that if our Kubernetes control plane fails, then we are in the same situation as with our single-server docker approach (check docker swarm for high availability).
In case of failure with our simpler docker approach, we have an ADVANTAGE too: it is relatively easy to re-run the entire infrastructure, which means only ONE COMMAND. And when this happened to me (so far once, fingers crossed), I just grabbed a backup of my data and configured everything in NO TIME on my local computer until I figured out the issue.
Data redundancy occurs when the same piece of data is stored in two or more separate places.
My approach for DATA REDUNDANCY includes 2 practices:
Assuming that our Server/NAS hard drives are encrypted and need to be decrypted remotely when restarting our Linux Server, we have a couple of options:
My Home Lab Dashboard using Homer.
If you reached this point of the article and are not convinced by one approach or the other, here are a couple more alternatives to explore:
I would not finish this article without mentioning some of the biggest players in IT-Infrastructure:
Well, after many months of hard work, I’m finally writing this conclusion: it has been (and still is) a long journey, which let me dive into this amazing world of infrastructure, full of challenges but also with tons of lessons learned. I can only say that this post aims to be a time saver for you, and a source of shared knowledge and struggles.
As ALWAYS, any feedback is more than welcome! See you soon :).
System maintenance (and software maintenance in general) is an ongoing process that requires attention and responsibility.
So in this blog post I will summarize the key actions we can take in order to keep our Arch Linux installation healthy, optimized and fully working.
BTW, if you are NOT using Arch yet, I have a guide explaining how to install it from scratch and also a tiny wiki with information about daily tasks, processes and guides.
It is very important to have the latest version of the system up and running (including user apps and packages). I gotta say that sometimes things get broken due to the nature of the rolling release model, but since each installation is different, we are responsible for checking the latest Arch Linux news on the Arch Linux website.
Once done, we can proceed to perform a system update/upgrade by running:
$ sudo pacman -Syu
or if you are using any AUR helper (in my case yay):
$ yay -Syu
Sometimes the upgrade fails because pacman flags a package signature as marginal trust:
error: <package>: signature from "Someone <mail.of.someone>" is marginal trust
...
Do you want to delete it? [Y/n]
Then update the keyring as follows and run the full system upgrade command again:
$ sudo pacman -Sy archlinux-keyring
The package manager is our source of truth when it comes to what we use in our system, but its cache keeps growing since it keeps ALL the versions that we install/upgrade. This is of course useful when it comes to system stability and rolling things back (by using pacman -U /var/cache/pacman/pkg/name-version.pkg.tar.gz), but it requires maintenance.
Let’s perform a bunch of checks before:
$ sudo ls /var/cache/pacman/pkg/ | wc -l  # cached packages
$ du -sh /var/cache/pacman/pkg/           # space used
We can use paccache for this purpose, so let’s install it first (if we do not already have it):
$ sudo pacman -Sy pacman-contrib
Now we can easily clean everything up and keep the latest 3 versions (default behavior):
$ sudo paccache -r
Orphans are no more than unneeded dependencies left behind by packages that were uninstalled. They waste storage space, so they require attention too.
Let’s list all the orphans in our system:
$ pacman -Qdtq
To remove all orphans let’s run:
$ pacman -Qtdq | sudo pacman -Rns -
If we get error: argument '-' specified with empty stdin, we do not have to worry: that means there are no orphans in our system. :)
Let’s list all the installed packages first in order to check whether we have software we are no longer using:
$ pacman -Qei | awk '/^Name/{name=$3} /^Installed Size/{print $4$5, name}' | sort -h
We can also list the ones installed from the AUR:
$ pacman -Qim | awk '/^Name/{name=$3} /^Installed Size/{print $4$5, name}' | sort -h
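To see what the awk filter actually extracts, here is the same pipeline run over a fake pacman -Qei snippet (the package names and sizes are made up, purely for illustration):

```shell
# Fake `pacman -Qei` output piped through the same filter:
# awk remembers the last "Name" and prints "<size><unit> <name>",
# then sort -h orders by human-readable size.
printf 'Name            : foo\nInstalled Size  : 12.00 MiB\nName            : bar\nInstalled Size  : 3.00 KiB\n' \
  | awk '/^Name/{name=$3} /^Installed Size/{print $4$5, name}' \
  | sort -h
```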
If we want to uninstall all unneeded packages along with their unused dependencies and configuration files:
$ sudo pacman -Rns $(pacman -Qdtq)
In case we want to individually uninstall packages, we use this command instead:
$ sudo pacman -Rns <package-name>
Our user cache grows as we use our system, so it is a good idea to check it out and clean it up accordingly. With the following command we can check its size:
$ sudo du -sh ~/.cache
32G /home/fernando/.cache
If we want to clear it up, we just remove its content:
$ rm -rf ~/.cache/*
System logs are always important to fix issues and to know what is going on within our Linux distro but again, they need a bit of maintenance.
Let’s first perform a system check to see how much space is being consumed by our logs:
$ journalctl --disk-usage
In order to remove logs, we use the same command, limiting by time (check the man page for size limits and other alternatives):
$ sudo journalctl --vacuum-time=7d
If we want to permanently set this up by size, we can uncomment SystemMaxUse in the /etc/systemd/journald.conf configuration file to limit the disk usage of these files, for example: SystemMaxUse=500M in my case.
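The relevant snippet in /etc/systemd/journald.conf would then look like this (assuming the 500M limit mentioned above):

```
[Journal]
SystemMaxUse=500M
```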
That is it… at least the minimum and basic things… Just know that not all the mentioned steps are mandatory, nor should they be done in one shot one after the other, but we want to make sure we care about the health of our system by giving it a bit of love from time to time.
Reviewing code is not an easy task. Most of the time code reviews come in a format of Pull Requests (PRs from now on). We all have our own style and guidelines when addressing them, so in this post I will share my own tips in order to make them effective and valuable.
As a starting point I would like to bring up a couple of inspirational coding quotes, which I keep in mind when writing code:
“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.” - John F. Woods
“Programs must be written for people to read, and only incidentally for machines to execute.” - Harold Abelson
“I’m not a great programmer; I’m just a good programmer with great habits.” - Kent Beck
IT WORKS ON MY MACHINE…
Let’s get started by exploring a bunch of areas, which will help us to create structure and organization within our code reviews, plus things to pay attention to within the process.
Before jumping into the process of reviewing code, we need to understand the whys behind code reviews and how they contribute to better software development.
Let’s enumerate them:
Effective code reviews ensure code quality.
The first and most important aspect of a code review is the PR size. It is key to keep the size concise in order to facilitate the review: keep it short and straight to the point.
As a rule of thumb and in my experience:
Remember that effective code reviews are small and often, and if you feel you are breaking this rule, try to break the code down into tinier chunks.
In situations like renamings or refactors which involve a bigger changeset (due to coupling, legacy code or tech debt: we have all been there, right?), you can point this out in the PR description or even pair with someone and commit the changes directly.
A good PR description is key in order to rapidly acquire context on what needs to be reviewed. Descriptions and PR themselves should follow the Single Responsibility Principle: do one thing and do it well.
They should basically contain (as short as possible):
Boy Scout Rule: I know it is tempting to refactor something out of the scope of a PR and include it, but… try not to fall into this trap and be gentle with the person reviewing your code. Being strict in this sense is important: you can always create another small PR with this changeset.
Communication in PRs is asynchronous by nature, and because of this we need to be careful not to block people. Reviewing code is a responsibility, not a “touch and go” process, so let’s ensure that we monitor our conversations and/or requested changes.
When it comes to attitude my tips are:
We should always have a positive attitude and provide constructive feedback.
At the beginning of the article we highlighted code quality as one of the strongest benefits of reviewing code, which raises the question: what do we need to pay attention to at this level?
The first thing I do when reviewing code is to scroll down to the bottom of the PR in order to see the existence of tests:
It is very important to include code coverage within PRs.
To detect issues at the code level, it is key to be proficient with the technology we are evaluating, but it is even more important to be familiar with code anti-patterns, code smells, software engineering design principles and best practices: this knowledge provides us with extra tools to easily detect potential issues.
In this aspect, I check:
Effective Code Reviews contribute to increase code quality.
Code reviews done wrong could become a big evil, leading to critical issues, not only at a technical level but also around collaboration and team morale. If we create consistency, structure and organization when reviewing our code, we will be contributing to better processes, discussions and higher software quality.
There’s a lot to gain from conducting effective code reviews and I hope this post has shed some light on the topic. As usual, any feedback or tips are more than welcome! Happy Coding!
The title of this article might be a bit confusing… but in this post we are not going to talk about programming languages or architecture…
I do not want to break your expectations though, in essence, we are going to go a bit technical…but we will mostly focus on a Core Part of Product Development:
How to write First Class Features, always keeping in mind Engineering and its impact on the rest of the organization.
We are going to use the term Functionality, Feature and User Story interchangeably.
You can find more info on these definitions here.
One of the problems that I see arise in organizations is global communication. Something that on paper should be easy to manage is most of the time compromised by the lack of frameworks, tools or a common language/vocabulary, thus leading to misunderstandings and poor coordination, and of course creating stress, pressure and friction.
Dealing with communication is one of the most challenging parts in an organization.
As our product evolves, there is the need to adopt a common vocabulary/language, interpreted by all the moving parts of our organization: business users, analysts, managers, engineers, etc. The idea is to effectively bridge communication gaps between different areas of an organization.
BDD (framework) and Gherkin (language) could help us to achieve this goal by favoring a more consistent communication channel, so let’s define both and see how we can make a good use of them.
Let me quote Wikipedia here, which perfectly describes this concept:
Behavior-driven development (BDD) is a process that encourages collaboration among developers, QA and non-technical or business participants in a software project. It encourages teams to use conversation and concrete examples to formalize a shared understanding of how the application should behave.
Fundamentally, BDD advocates the usage of a common vocabulary to create a domain specific language (DSL) in order to convert structured natural language statements into scenarios with acceptance criteria for a given functionality, and the tests used to validate it.
Here is a representation if you are coming from the technical side of things:
GIVEN-WHEN-THEN are fundamental in BDD.
Gherkin is a Business Readable, Domain Specific Language created especially for behavior descriptions. It gives us the ability to remove logic details from behavior tests, which turns it into a language that could be understood by anyone without getting deep into implementation details (from an Engineering Perspective).
It serves two main purposes:
As we can see, there is a strong relationship between Gherkin and BDD; we can even say that Gherkin is an implementation of BDD, responding very well to the GIVEN-WHEN-THEN approach and THREE AMIGOS collaboration:
Three amigos working together to get the best possible outcome.
Do not worry if you are a bit confused and have not got it yet; an example and a real case scenario will make it clearer. Keep reading :).
The most basic building block consists of a feature description plus a scenario. This scenario consists of a list of steps, each of which must start with one of the keywords Given, When, Then. But and And are also allowed keywords.
Here is a quick example for a login functionality:
FEATURE: User Login
In order to use our mobile client, users should be
able to authenticate.
SCENARIO 1: Login with email
GIVEN there is a login screen
AND I have introduced my email and password
WHEN I press the login button
THEN I should be authenticated
AND taken to the welcome screen
It is very common to have multiple scenarios that satisfy a functionality. Let’s take our example above to a new level by adding 2 more scenarios:
FEATURE: User Login
In order to use our mobile client, users should be
able to authenticate.
SCENARIO 1: Login with email
GIVEN there is a login screen
AND I have introduced my email and password
WHEN I press the login button
THEN I should be authenticated
AND taken to the welcome screen
SCENARIO 2: Login with phone number
GIVEN there is a login screen
AND I have introduced my phone number and pin
WHEN I press the login button
THEN I should be authenticated
AND taken to the welcome screen
When we have similar scenarios with similar information, copying and pasting can become tedious and repetitive. There is a way to avoid this; consider the following example:
FEATURE: Tip Calculator
After calculating the total of the check, users
should be able to optionally provide a tip.
SCENARIO 1: Tip out 5% of the total
GIVEN the total of the bill is 100 euros
WHEN I tip out 5% of the total
THEN I should pay 105 euros
SCENARIO 2: Tip out 10% of the total
GIVEN the total of the bill is 200 euros
WHEN I tip out 10% of the total
THEN I should pay 220 euros
SCENARIO 3: Tip out 15% of the total
GIVEN the total of the bill is 100 euros
WHEN I tip out 15% of the total
THEN I should pay 115 euros
By using scenario outlining, we translate our previous example into:
FEATURE: Tip Calculator
...
SCENARIO OUTLINE: Calculating tips
GIVEN the total of the bill is <TOTAL>
WHEN I tip out <TIP> of the total
THEN I should pay <PAYMENT> euros
EXAMPLES:
| TOTAL | TIP | PAYMENT |
| 100 | 5% | 105 |
| 200 | 10% | 220 |
| 100 | 15% | 115 |
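The EXAMPLES table is, in effect, a parameterized test: each row drives one run of the same steps. As a minimal framework-neutral sketch in Python (the helper name payment_with_tip is hypothetical, not part of any BDD tool):

```python
def payment_with_tip(total: int, tip_percent: int) -> int:
    # THEN payment = GIVEN total plus the WHEN tip percentage of it.
    return total + total * tip_percent // 100

# Each row of the EXAMPLES table becomes one (TOTAL, TIP, PAYMENT) case.
examples = [(100, 5, 105), (200, 10, 220), (100, 15, 115)]
for total, tip, expected in examples:
    assert payment_with_tip(total, tip) == expected
```

This is essentially what BDD frameworks such as Cucumber automate for us: parsing the Gherkin steps and binding each placeholder in the outline to a step implementation.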
Backgrounds allow us to add some context to all scenarios in a single feature:
FEATURE: Conversation Administrator Role
...
BACKGROUND:
GIVEN a global administrator named "Fernando"
AND a conversation group called "Android"
AND a user called "Antje" not belonging to any conversation
SCENARIO 1: Fernando renames the conversation
GIVEN I am logged in as Fernando
WHEN I change the conversation name to "iOS"
THEN I should see the new conversation name "iOS"
AND I should see a message "Conversation name changed"
SCENARIO 2: Fernando adds a member to the conversation
GIVEN I am logged in as Fernando
WHEN I add "Antje" to the "Android" conversation
THEN I should see a message "Antje added to the conversation"
Wire is a secure collaboration platform and, as part of leadership, one of my responsibilities is to contribute to product coordination between stakeholders and mobile engineering. We are currently in the process of improving the platform, and in this case I wanted to share the re-writing of one of our functionalities: Email Verification.
The global description of the functionality is in plain English and gives an overview of it. There is no scenario definition here, and we can add any kind of information that we consider useful for understanding and further development.
Human-readable feature description with extra information.
It is worth mentioning that the level of granularity, in terms of breaking down tasks into smaller ones, will depend on the complexity of the feature. Always keep in mind the Divide and Conquer approach.
Divide and Conquer and Keep It Simple are very important when sub-dividing tasks.
At this point, we can fully apply what we have learned so far: scenario definition and acceptance criteria with Gherkin at sub-task level.
Gherkin is a Business Readable, Domain Specific Language created especially for behavior descriptions.
Writing features is not straightforward: many different profiles are involved and they ALL should understand what the features mean.
As a bonus, here are some useful tips which could help you write better Features/Functionalities/Issues/User Stories, beyond the ones reviewed so far:
Writing Features in a collaborative way is a must.
Communication is very important, and it dictates how well structured and coordinated an organization is.
In this post we have seen how a Cross-Communication Framework like BDD, in combination with a DSL like Gherkin, can help us mitigate communication issues. Gherkin is also adaptable and flexible: we can even establish our own rules to take it to the next level.
Are you using BDD or Gherkin? As usual, I will finish this way: any feedback is more than welcome! Feel free to ping me to share your thoughts and ideas.
$ archinstall --script guided
It is ALSO VERY IMPORTANT to use the Arch Linux Official Installation Guide and follow it to understand what is going on; this way you acquire knowledge and can polish/customize your installation. Trust me, the lessons learned here are immense!!!
I’m a Linux fan, and the main reason is because of its open source nature: I have been using it for years and I gotta say a lot has changed since the early days… If you remember re-compiling the kernel in order to install an application, you know what I’m talking about… Fortunately that does not happen anymore(?), so do not freak out, not yet :).
This article will act as a looooong guide, which is going to help you install (and understand) Arch Linux with full disk encryption. We will review some of the concepts involved along the way, so that we have a better picture of what we are doing.
In the past, I used SuSe, Red Hat, Debian, Ubuntu and Arch, in that order. I gotta say with Arch… was Love at First Sight (also thanks to my friend Oriol).
Here are some of the reasons which motivated me:
An .iso image in order to create a bootable disk for the Operating System.

I have an Intel based system, in this case a Dell XPS 13 (9310), where we will install everything from scratch. I have also used this guide for installing my Intel NUC, so most of the content in this article applies to other hardware too. In case there are specifics, I will mention them.
It is important that you check your hardware in the Official Arch Linux Wiki for tips, tricks, troubleshooting and extra specific steps when setting up your linux environment.
As a first step we need a bootable USB Disk, so in order to create it we need an .iso we can download from here.
Plug your USB Drive Stick and check its location by running lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 1 7,6G 0 disk
├─sda1 8:1 1 621M 0 part /run/media/fernando/ARCH_202012
├─sda2 8:2 1 61M 0 part
└─sda3 8:3 1 300K 0 part
On my computer that was /dev/sda, so let’s burn the .iso with the dd tool:
dd bs=4M if=/<path>/archlinux-2020.12.01-x86_64.iso of=/dev/sda status=progress oflag=sync
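Before writing the image, it is worth verifying its checksum against the sha256sums.txt file published next to the download (the filename below is an assumption; adjust it to the release and mirror you actually fetched from):

```shell
# Verify the downloaded image; sha256sums.txt comes from the same mirror page.
# --ignore-missing skips checksum entries for files we did not download.
sha256sum -c sha256sums.txt --ignore-missing
```

Only proceed with dd once the check reports OK: writing a corrupted image produces a stick that fails to boot in confusing ways.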
That is ALL WE NEED at the moment, so let’s get to the next section to start learning :).
There is a short answer for this: Security. In a time where (almost) all our information is binary and for instance, our lives, are mostly inside of devices, I personally want to ensure that my sensitive information is hard to get even if my laptop lands on the street, due to it being stolen or lost (hopefully not but never say never…).
So it is time to jump deeper in the core of this article:
The result is going to be a Full Arch Linux installation with Disk Encryption(FDE).
Block device encryption encrypts/decrypts the data transparently as it is written/read from block devices, the underlying block device sees only encrypted data. To mount encrypted block devices we must provide a passphrase to activate the decryption key.
Some systems require the encryption key to be the same as for decryption, and other systems require a specific key for encryption and specific second key for enabling decryption.
LUKS (Linux Unified Key Setup) is a specification for block device encryption (nowadays a standard for Linux). It establishes an on-disk format for the data, as well as a passphrase/key management policy.
LUKS uses the kernel device mapper subsystem via the dm-crypt module. This arrangement provides a low-level mapping that handles encryption and decryption of the device’s data. User-level operations, such as creating and accessing encrypted devices, are accomplished through the use of the cryptsetup utility.
Logical Volume Management utilizes the kernel’s device-mapper feature to provide a system of partitions independent of underlying disk layout. With LVM you abstract your storage and have “virtual partitions”, making extending/shrinking easier (subject to potential filesystem limitations).
Virtual partitions allow addition and removal without worry of whether you have enough contiguous space on a particular disk, getting caught up fdisking a disk in use (and wondering whether the kernel is using the old or new partition table), or, having to move other partitions out of the way.
The straightforward method is to set up LVM on top of the encrypted partition. Technically, the LVM is set up inside one big encrypted block device:
+-----------------------------------------------------------------------+ +----------------+
| Logical volume 1 | Logical volume 2 | Logical volume 3 | | Boot partition |
| | | | | |
| [SWAP] | / | /home | | /boot |
| | | | | |
| /dev/MyVolGroup/swap | /dev/MyVolGroup/root | /dev/MyVolGroup/home | | |
|_ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _|_ _ _ _ _ _ _ _ _ _ _ _| | (may be on |
| | | other device) |
| LUKS encrypted partition | | |
| /dev/sda1 | | /dev/sdb1 |
+-----------------------------------------------------------------------+ +----------------+
Example of encrypted disk layout using LVM on LUKS
That is enough theory for now and it is (finally?) time to dip our toe in practical linux water.
As a first step we need to boot our system with our already created Arch Linux Bootable USB Disk.
We will have to disable TPM and SecureBoot, otherwise our USB drive with the Arch Linux .iso image will not be recognized. Do not worry, you can enable it later.
We also have to Disable RAID and enable AHCI/NVMe (or disable all Operating Mode of the integrated storage device controller). Apparently in many DELL Laptops with Windows, this is only for compatibility and some Intel Features which depend on this functionality under Windows. By the way, RAID mode offers no benefit in this case (on an XPS 13 that only supports a single SSD). Check this official thread for more info.
If you want to have a dual-boot with Windows, disabling RAID will make it unusable but you can follow the next steps to avoid this.
On Windows in order to switch RAID to AHCI (AVOID THIS if you do not want dual-boot Linux-Windows):
1. Open the Command Prompt with Run as administrator.
2. Run bcdedit /set safeboot minimal and reboot.
3. In the BIOS, switch the storage controller from RAID On to AHCI mode, ignoring the warnings, then apply and reboot.
4. Windows will boot into Safe Mode; run bcdedit /deletevalue safeboot and reboot again normally.

If we have reached this point, that means we have loaded the Arch Linux Live USB and booted from it. The proof is that we find ourselves at a prompt: root@archiso ~ #. Well done!
At this point we should be in front of a prompt:
root@archiso ~ #
This is our root prompt and I’ll be shortening that to $ in this post.
This is an OPTIONAL step but if the console font is too small or not readable, we can set it up:
$ setfont latarcyrheb-sun32
We need an internet connection, so let’s configure the network. I connected via ethernet, so everything worked out of the box. If you need WiFi, you can set it up by launching iwctl (interactive mode with autocompletion). Here are some useful commands:
`iwctl`
`station list` # Display your wifi stations
`station <INTERFACE> scan` # Start looking for networks with a station
`station <INTERFACE> get-networks` # Display the networks found by a station
`station <INTERFACE> connect <SSID>` # Connect to a network with a station
We also need to update our system clock. Let’s use timedatectl(1) to ensure the system clock is accurate:
$ timedatectl set-ntp true
To check the service status, we can use timedatectl status.
Once that’s done, we can start building up to the installation.
This is my disk layout (run lsblk to get this output):
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:0 0 476,9G 0 disk
├─nvme0n1p1 259:1 0 512M 0 part /boot
└─nvme0n1p2 259:2 0 476,4G 0 part
└─luks 254:0 0 476,4G 0 crypt
├─main-root 254:1 0 50G 0 lvm /
└─main-home 254:2 0 110G 0 lvm /home
This results in a System with Full Disk Encryption (FDE), aside from the boot partition.
For this I used the parted utility for manipulating the partition table:
$ parted /dev/nvme0n1
(parted) mklabel gpt # WARNING: wipes out existing partitioning
(parted) mkpart ESP fat32 1MiB 513MiB # create the UEFI boot partition
(parted) set 1 boot on # mark the first partition as bootable
(parted) mkpart primary # turn the remaining space in one big partition
File system type: ext2 # don't worry about this, we'll format it after anyway
Start: 514MiB
End: 100%
Now you can check the created layout:
(parted) print
Model: Unknown (unknown)
Disk /dev/nvme0n1: 512GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 538MB 537MB fat32 boot, esp
2 539MB 512GB 512GB ext2
(parted) quit
This will encrypt the second partition, which we’ll then hand off to LVM to manage the rest of our partitions. Doing it this way means everything is protected by a single password.
$ cryptsetup luksFormat /dev/nvme0n1p2
WARNING!
========
This will overwrite data on /dev/nvme0n1p2 irrevocably.
Are you sure? (Type uppercase yes): YES
Enter passphrase:
Verify passphrase:
Now we need to open the encrypted disk so LVM can do its thing:
$ cryptsetup open /dev/nvme0n1p2 luks
Enter passphrase for /dev/nvme0n1p2:
In this section (since we already know about LVM) we will need: a physical volume on top of the LUKS device, a volume group (main), and three logical volumes (root, swap and home).
Let’s proceed with the commands then:
$ pvcreate /dev/mapper/luks # create the physical volume
Physical volume "/dev/mapper/luks" successfully created.
$ vgcreate main /dev/mapper/luks # create the volume group
Volume group "main" successfully created
$ lvcreate -L 100G main -n root # create a 100GB root partition
Logical volume "root" created.
$ lvcreate -L 18G main -n swap # create a RAM+2GB swap, bigger than RAM for hibernate
Logical volume "swap" created.
$ lvcreate -l 100%FREE main -n home # assign the rest to home
Logical volume "home" created.
We can check the layout by running lvs:
$ lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home main -wi-a----- 308.43g
root main -wi-a----- 100.00g
swap main -wi-a----- 18.00g
Now we’re going to format all the partitions we’ve created so we can actually use them.
$ mkfs.ext4 /dev/mapper/main-root
...
Allocating group tables: done
Writing inode tables: done
Creating journal (65536 blocks): done
Writing superblocks and filesystem accounting information: done
$ mkfs.ext4 /dev/mapper/main-home
...
Writing superblocks and filesystem accounting information: done
$ mkswap /dev/mapper/main-swap
Setting up swapspace version 1, size = 18 GiB (19327348736 bytes)
...
$ mkfs.fat -F32 /dev/nvme0n1p1
...
It’s time to install the base system, which we can then chroot into in order to further customise our installation.
A chroot is an operation that changes the apparent root directory for the currently running processes and their children. A program that is run in such a modified environment cannot access files and commands outside that directory tree. This modified environment is called a chroot jail.
Before we can install the OS we need to mount all the partitions and then chroot into the mountpoint of the root partition.
mount /dev/mapper/main-root /mnt
mkdir /mnt/home /mnt/boot # the mount points do not exist yet on a fresh filesystem
mount /dev/mapper/main-home /mnt/home
mount /dev/nvme0n1p1 /mnt/boot
swapon /dev/mapper/main-swap
Next step is to edit /etc/pacman.d/mirrorlist and put the mirrors closest to us at the top. This’ll help speed up the installation.
It is highly recommended that we generate a mirrorlist and uncheck the http checkbox so we only use mirrors we can fetch from over https. (Feel free to check IPv6 if your connection supports it.)
In my case I generated it for Germany and used curl to get them. Here the steps:
mv /etc/pacman.d/mirrorlist /etc/pacman.d/mirrorlist.bak #Backup just in case.
curl -s 'https://archlinux.org/mirrorlist/?country=DE&protocol=https&ip_version=4&ip_version=6' >> /etc/pacman.d/mirrorlist # Get the mirror list (quote the URL so the shell does not interpret '&').
rm /etc/pacman.d/mirrorlist.bak # Success: remove the backup file.
vim /etc/pacman.d/mirrorlist
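Note that the list fetched from the generator ships with every mirror commented out. A quick sketch to enable them all (you may prefer to uncomment only a few by hand instead):

```shell
# Turn '#Server = ...' lines into active 'Server = ...' entries.
sed -i 's/^#Server/Server/' /etc/pacman.d/mirrorlist
```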
We can also have a look at the status of the mirrors and even for more info, as usual, we can go to the Arch Linux Wiki.
Now that everything is set up we need to bootstrap the OS:
# In case the below command FAILS, we can first run:
# pacman-key --init
# pacman-key --populate archlinux
pacstrap -i /mnt base linux linux-firmware base-devel lvm2 vim
Let’s break down all these packages we are installing:
It’ll now prompt us to confirm our package selection and then start with the installation of the base system. Picking the defaults should be safe and fine.
Now that the base system is there, we can chroot into it to customise our installation and finish it.
First we generate an fstab file (use -U or -L to define by UUID or labels, respectively):
$ genfstab -U /mnt >> /mnt/etc/fstab
Check the resulting /mnt/etc/fstab file, and edit it in case of errors.
$ cat /mnt/etc/fstab
Here is an example of my /etc/fstab:
$ cat /etc/fstab
# Static information about the filesystems.
# See fstab(5) for details.
# <file system> <dir> <type> <options> <dump> <pass>
# /dev/mapper/main-root
UUID=xxxxxxx-3c01-xxxx-xxxx-ab120fexxxxx / ext4 rw,relatime 0 1
# /dev/nvme0n1p1
UUID=52CE-47A9 /boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,utf8,errors=remount-ro 0 2
# /dev/mapper/main-home
UUID=xxxxxxx-3c01-xxxx-xxxx-ab120xxxxxxx /home ext4 rw,relatime 0 2
Now change root into the new system:
$ arch-chroot /mnt
Your prompt will now change to: [root@archiso /]#.
Let’s edit our Locale Information by opening the /etc/locale.gen file and uncommenting en_US.UTF-8 UTF-8 and any other needed locales. In my case also de_DE.UTF-8 UTF-8 (since I live in Germany).
Once we are done, we need to generate them by running:
$ locale-gen
As a last step in this section, let’s execute the following in order to create the locale.conf(5) file and set the LANG variable accordingly:
$ echo LANG=en_US.UTF-8 > /etc/locale.conf
$ export LANG=en_US.UTF-8
Let’s set our timezone by running:
$ tzselect
Once we have selected our timezone we need to update a few more things. First override the /etc/localtime file and symlink it to your timezone with this format:
$ ln -sf /usr/share/zoneinfo/<continent>/<location> /etc/localtime
In my case (Berlin):
$ ln -sf /usr/share/zoneinfo/Europe/Berlin /etc/localtime
Time to sync the clock settings and set the hardware clock to UTC by running hwclock(8), which generates /etc/adjtime:
$ hwclock --systohc --utc
This part will set the keyboard layout and font to be used by the virtual console as default values.
Let’s create the /etc/vconsole.conf configuration file and add keyboard configuration:
$ vim /etc/vconsole.conf
KEYMAP=us
At this point we could also (OPTIONAL) set another font by adding this to the mentioned file:
FONT=latarcyrheb-sun32
KEYMAP=us
Time to give your system a name by adding it to /etc/hostname. As mentioned earlier, mine is:
android10-xps-arch
Also, add a line for that same hostname to /etc/hosts:
$ vim /etc/hosts
# Static table lookup for hostnames.
# See hosts(5) for details.
127.0.0.1 localhost
::1 localhost
127.0.1.1 android10-xps-arch.localdomain android10-xps-arch
If the system has a permanent IP address, it should be used instead of 127.0.1.1.
For this purpose, we have to create /etc/modprobe.d/i915.conf with the following content:
options i915 enable_guc_loading=-1 enable_guc_submission=-1
We need to modify /etc/mkinitcpio.conf and update the following information:

Set MODULES to: (nvme i915 intel_agp)
Set HOOKS to: (base autodetect systemd block sd-vconsole sd-encrypt sd-lvm2 fsck keyboard filesystems)

For LVM, system encryption or RAID, we must modify mkinitcpio.conf(5) and then recreate the initramfs by executing the following command:
$ mkinitcpio -p linux
If the command fails (it happened to me) and you see something like:
`specified kernel image does not exist /boot/vmlinuz-linux`
You might need to first reinstall the linux kernel and then re-run the above command:
$ pacman -S linux
$ mkinitcpio -p linux
The command output should look like this:
==> Building image from preset: /etc/mkinitcpio.d/linux.preset: 'default'
-> -k /boot/vmlinuz-linux -c /etc/mkinitcpio.conf -g /boot/initramfs-linux.img
==> Starting build: 4.13.9-1-ARCH
-> Running build hook: [base]
-> Running build hook: [systemd]
-> Running build hook: [autodetect]
-> Running build hook: [keyboard]
-> Running build hook: [sd-vconsole]
...
-> Running build hook: [block]
==> WARNING: Possibly missing firmware for module: wd719x
==> WARNING: Possibly missing firmware for module: aic94xx
-> Running build hook: [sd-encrypt]
-> Running build hook: [sd-lvm2]
-> Running build hook: [filesystems]
-> Running build hook: [fsck]
==> Generating module dependencies
==> Creating gzip-compressed initcpio image: /boot/initramfs-linux-fallback.img
==> Image generation successful
Don’t worry about those two warnings, the XPS 13 doesn’t have any hardware on board that needs those drivers.
Sometimes bugs are discovered in processors for which microcode updates are released. These updates provide bug fixes that can be critical to the stability of your system. Without them, you may experience spurious crashes or unexpected system halts that can be difficult to track down.
This module is loaded together with the initramfs when your system boots, let’s install the package for it:
$ pacman -Sy intel-ucode
We will be using systemd-boot as our bootloader.
In order to start, we need to tell bootctl to install the necessary things onto /boot:
$ bootctl install --path=/boot
In the future we won’t need to call install, but update instead. The good thing is that there is a hook that can be installed which will do this automatically every time we perform a full system upgrade. We are going to do it later once we have a full system up and running.
Let’s edit /boot/loader/loader.conf and make it look like this:
timeout 10
default arch
editor 1
By setting editor 1, it is possible for anyone to edit the kernel boot parameters, add init=/bin/bash and become root on your system. However, since the disk is encrypted at this point, they can’t do much with it, and I personally find it very convenient to be able to edit those options when something does go wrong.
We now need to create the boot entry named arch. To that end, create the file /boot/loader/entries/arch.conf with the following content:
title Arch Linux
linux /vmlinuz-linux
initrd /intel-ucode.img
initrd /initramfs-linux.img
options luks.uuid=$UUID luks.name=$UUID=luks
root=/dev/mapper/main-root rw
resume=/dev/mapper/main-swap ro
intel_iommu=igfx_off quiet mem_sleep_default=deep
snd_hda_intel.dmic_detect=0
# NOTE: options should be in the same line separated by
# spaces. Here I formatted this way for better understanding.
Replace $UUID with the value from this command:
$ cryptsetup luksUUID /dev/nvme0n1p2
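If you would rather not copy the UUID by hand, here is a small substitution sketch (assuming you wrote the literal placeholder $UUID into arch.conf exactly as shown above):

```shell
# Grab the LUKS UUID and replace every literal "$UUID" in the boot entry.
UUID=$(cryptsetup luksUUID /dev/nvme0n1p2)
sed -i "s/\$UUID/${UUID}/g" /boot/loader/entries/arch.conf
```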
In case we already have a Windows installation, here is GOOD NEWS from the Arch Linux Wiki:
I performed this step in another installation and everything was recognized automatically and added to the bootloader entries.
For command execution, it is always preferable to use sudo rather than switching to root. To do so, we need to install the sudo package and update its configuration:
$ pacman -Sy sudo
Now let’s go to the configuration file:
$ sudo visudo
Next step is to update the configuration: uncomment the line that reads %wheel ALL=(ALL) ALL and add some extra configuration at this point to save time when creating our first user (here fernando is going to be my username):
...
##
## User privilege specification
##
root ALL=(ALL) ALL
# Options
Defaults editor=/usr/bin/vim, !env_editor
Defaults insults
# Full Access
fernando ALL=(ALL) ALL
# Last rule as a safety guard
fernando ALL=/usr/sbin/visudo
# Uncomment to allow member of group wheel to execute any command
%wheel ALL=(ALL) ALL
...
We now have to create the user account mentioned in the step above and ensure it is added to the wheel group:
$ useradd -m -G wheel,users -s /bin/bash fernando
$ passwd fernando
New password:
Retype new password:
passwd: password updated successfully
We need a Graphical User Interface. There are many options out there and I’m not going to point out which is better or worse, personally I think it is a matter of taste. Here are the most popular ones:
We will take advantage of this step and add a couple of extras, so let’s do it by executing the following commands:
$ pacman -Sy gnome gnome-extra dhclient iw dialog
$ pacman -Sy networkmanager network-manager-applet xf86-input-libinput
Something worth mentioning is that I researched a lot to build up this guide, and part of it was inspired by one created by Daniele Sluijters, who made a good point about the reason behind installing dhclient over dhcpcd:
I explicitly install
dhclient because dhcpcd isn’t very good at dealing with non-spec compliant DHCP implementations. Especially if you have a D-Link router or might encounter one, install this package. It also avoids some issues I’ve had on large networks like at the office, Eduroam, etc.
After we are done with the installation, we have to enable both services GDM (gnome) and Network Manager:
$ systemctl enable gdm
$ systemctl enable NetworkManager
So the time has come… If you are still there, I have to say WOW! Congratulations! You have survived your first (or one more) installation of Arch Linux, which is great. I’m proud of you and I am also sure you have learned a lot so far.
So one of our last steps will be to exit the chroot:
$ exit
Unmount our filesystems:
$ umount -R /mnt
And finally reboot
$ reboot
Arch Linux up and running with GNOME 3
You thought you were done, right? The answer to this question is: Yes and No :).
One last thing: Since you might have used an ISO image, it could be that it is not the latest so let’s do a full system upgrade before continuing:
$ sudo pacman -Syu
By default we have installed the latest stable linux kernel version. LTS stands for Long Term Support, which means that this kernel is not updated as frequently as the most recent one.
Here is a chart that summarizes the differences between them:
The good news is that we can have both installed and choose with which one we want to start our system.
It does not hurt at all to have both installed, actually in my experience, when we want to try the latest state of the art kernel functionalities by using the latest version, it could happen that either something stops working or there is some misbehaviour, so by having the LTS version could also be a life saver to fix things (refer to the troubleshooting section).
$ sudo pacman -S linux-lts
Copy the /boot/loader/entries/arch.conf file and name it arch-lts.conf:
$ sudo cp /boot/loader/entries/arch.conf /boot/loader/entries/arch-lts.conf
Edit the new file (sudo vim /boot/loader/entries/arch-lts.conf) and ONLY modify the first 2 LINES:
title Arch Linux LTS
linux /vmlinuz-linux-lts
...
$ uname -r
Arch Linux has an amazing package manager (pacman), but one of the things which makes Arch AMAZING is its community. There will be cases where pacman is not going to be enough and you will need an AUR helper in order to download and install software created/maintained by the community.
There are several AUR helpers out there, and nowadays people speak very highly of Yay (written in Go), but I will personally stick to the old-school way and use Pacaur, which is written in Bash and pretty much emulates pacman’s behavior.
I will also add the steps to install Yay so you can give it a try. The choice is yours.
Since AUR Helpers are NOT part of the Core Arch Linux Repository we need to install them manually.
$ sudo pacman -S binutils make gcc fakeroot expac yajl jq wget gtest gmock --noconfirm
# We get the auracle `.tar.gz` file
$ wget https://aur.archlinux.org/cgit/aur.git/snapshot/auracle-git.tar.gz
# We need to uncompress the downloaded file
$ tar -xzf auracle-git.tar.gz
# Now we build the package
$ cd auracle-git
$ makepkg PKGBUILD --skippgpcheck --noconfirm
# We use pacman to install the already generated package
$ sudo pacman -U auracle-git-*
$ mkdir -p ~/tmp/pacaur_install
$ cd ~/tmp/pacaur_install
We download the PKGBUILD file and then we install it:
$ curl -o PKGBUILD https://aur.archlinux.org/cgit/aur.git/plain/PKGBUILD?h=pacaur
$ makepkg -i PKGBUILD --noconfirm
$ sudo pacman -U pacaur*.tar.xz --noconfirm
$ rm -r ~/tmp/pacaur_install
$ cd
$ pacaur -S yay
$ sudo pacman -S --needed git base-devel
$ git clone https://aur.archlinux.org/yay.git
$ cd yay
$ makepkg -si
In the section Setting Up The Bootloader, we mentioned that whenever there is a new version of systemd-boot, the boot manager can optionally be reinstalled by the user (we are the users :)).
This can be done manually (REMEMBER: automate all the things!) or the update can be triggered automatically using pacman hooks, which is what we are going to do here by simply installing the systemd-boot-pacman-hook package (from the AUR), which automates this process:
$ pacaur -S systemd-boot-pacman-hook
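For reference, a pacman hook is a small INI-style file that lives under /usr/share/libalpm/hooks/. The one shipped by this package looks roughly like the sketch below (the exact Description and Exec lines may differ between package versions); it simply re-runs the bootloader update whenever the systemd package is upgraded:

```
# Sketch of the hook installed by systemd-boot-pacman-hook
# (contents may vary slightly between versions)

[Trigger]
Type = Package
Operation = Upgrade
Target = systemd

[Action]
Description = Updating systemd-boot
When = PostTransaction
Exec = /usr/bin/bootctl update
```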
Everything from here is entirely OPTIONAL and based on PERSONAL TASTE. I just want to share my full setup with the hope that you can also get something useful out of it :).
Let’s install all this in a batch-processing fashion:
$ pacaur -S firefox google-chrome chromium vlc gimp gnome-tweaks imv graphicsmagick
And my choice here is oh-my-zsh due to its flexibility, customization and plugin system. You can do anything you want.
I also opted for the PowerLevel10k theme… check the final result:
oh-my-zsh customized using the PowerLevel10k theme.
$ pacaur -S zsh oh-my-zsh-git
$ chsh -l
$ chsh -s /usr/bin/zsh
$ yay -S --noconfirm zsh-theme-powerlevel10k-git
$ echo 'source /usr/share/zsh-theme-powerlevel10k/powerlevel10k.zsh-theme' >>~/.zshrc
$ pacaur -S nerd-fonts-hack
We will have to move some content from our .bash_profile and .bashrc files to .zprofile and .zshrc respectively.
# Enable Powerlevel10k instant prompt. Should stay close to the top of ~/.zshrc.
# Initialization code that may require console input (password prompts, [y/n]
# confirmations, etc.) must go above this block; everything else may go below.
if [[ -r "${XDG_CACHE_HOME:-$HOME/.cache}/p10k-instant-prompt-${(%):-%n}.zsh" ]]; then
source "${XDG_CACHE_HOME:-$HOME/.cache}/p10k-instant-prompt-${(%):-%n}.zsh"
fi
# Path to your oh-my-zsh installation.
ZSH=/usr/share/oh-my-zsh/
export DEFAULT_USER="fernando"
export TERM="xterm-256color"
export ZSH=/usr/share/oh-my-zsh
export ZSH_POWER_LEVEL_THEME=/usr/share/zsh-theme-powerlevel10k
source $ZSH_POWER_LEVEL_THEME/powerlevel10k.zsh-theme
plugins=(archlinux
bundler
docker
jsontools
vscode web-search
k
tig
gitfast
colored-man-pages
colorize
command-not-found
cp
dirhistory
autojump
sudo
zsh-syntax-highlighting
zsh-autosuggestions)
# /!\ zsh-syntax-highlighting and then zsh-autosuggestions must be at the end
source $ZSH/oh-my-zsh.sh
# Uncomment the following line to disable bi-weekly auto-update checks.
DISABLE_AUTO_UPDATE="true"
ZSH_CACHE_DIR=$HOME/.cache/oh-my-zsh
if [[ ! -d $ZSH_CACHE_DIR ]]; then
mkdir $ZSH_CACHE_DIR
fi
source $ZSH/oh-my-zsh.sh
We can tweak the theme further via the .p10k.zsh file, which is very well documented:
$ vim ~/.p10k.zsh
My Color Palette in the Terminal Preferences.
Here they are:
Installation:
$ pacaur -S git asdf android-studio docker code zeal intellij slack
In case we face problems, it is important to have written down all the necessary steps to properly start our system from a Rescue Disk (the same USB drive we already set up).
We plug our Bootable USB Drive and boot into the system.
We need to open the encrypted disk so LVM can do its thing:
$ cryptsetup open /dev/nvme0n1p2 luks
Enter passphrase for /dev/nvme0n1p2:
In case we need network access, we connect using iwctl:
`iwctl station list` # Display your wifi stations
`iwctl station <station> scan` # Start looking for networks with a station
`iwctl station <station> get-networks` # Display the networks found by a station
`iwctl station <station> connect <network_name>` # Connect to a network with a station
$ mount /dev/mapper/main-root /mnt
$ mount /dev/mapper/main-home /mnt/home
$ mount /dev/nvme0n1p1 /mnt/boot
$ swapon /dev/mapper/main-swap
$ arch-chroot /mnt
Now you are ready to work and fix Arch Linux in case something unexpected occurred.
$ exit
$ umount -R /mnt
$ reboot
This specific installation uses ‘ext4’ as a file system, but if you use ‘btrfs’, I have added troubleshooting information in my Linux Wiki.
This has been such a ride! A very long but (hopefully) a fulfilling process. I have no more words than saying THANKS for READING!
I hope you found the material useful and of course any feedback is more than welcome, so feel free to drop me a line in any of the social networks that appear in this website.
From here, I will leave it up to you to continue diving deeper into the Linux ecosystem, and I will finish up with an inspirational quote:
“Wisdom is not a product of schooling but of the lifelong attempt to acquire it.”
Some time ago, I wrote about how to build a company’s culture based on human values, and mentioned a bunch of ideas that I would like to bring back:
As we can see, these ideas reflect human behavior and attitudes toward problems. I would therefore like to take the opportunity to build on them and explore how we can leverage failures in order to learn from them. Let’s get started!
Make sure you visit my postmortems section to complement the learnings from this article.
I bumped into this inspirational quote and wrote it down some time ago (please ping me if you know the author):
“Whatever you do, you will make mistakes, errors and blunders. Everyone does. Some less, some more, but no one is exempt. Yet, success is nothing more than mistakes you managed to overcome. That is why a mistake is not important. What counts are the things you do after one.”

In my opinion, it is our responsibility to minimize failure, but company cultures should provide room for failing, and when it happens, we should get something positive and constructive out of it: a lesson learned.
Moreover, let me quote myself:
“Learning something without sharing it, is senseless”
We want to share what we have learned:
“The more we fail, the more resilient we become. Let this knowledge be shared in order to spread and build resilience.”
It is at this point that postmortems come into play, so in the next section we are going to define what a postmortem is and how this concept can help us acquire and transmit knowledge based on failure.
I definitely agree with the definition, but from my perspective, postmortems should not be restricted to technical issues only (outages, technical incidents, failures), because otherwise we miss a lot of valuable information from other areas that is helpful for continuous improvement.
They should also include situational issues we might encounter like:
It is VERY IMPORTANT when writing them to:
How we organize this information is a matter of taste, but we should keep the format consistent to favor readability. Some extra information that could be useful to include:
Here is a screenshot of a postmortem from a situation I experienced recently:

You can also visit my Postmortems Section if you are curious about my experiences in this field.
There are no silver bullets when it comes to tooling and templates for writing and sharing postmortems.
In the example above I used a simple website built with GitHub Pages that I created some time ago, but documents, wikis or anything that can be shared company-wide is more than valid. I will leave it up to you; the only thing I would add here: make sure postmortems are visible and easy to find.
I have also forked a github repository with a bunch of useful templates that I adapted myself according to my needs.
Postmortems are a useful tool/technique that can be used in order to constantly improve and learn.
Personally, I’m a strong believer in building a culture based on the following ideas:
“Make mistakes and embrace them, it is nothing to be ashamed of.”
“Learning something without sharing it is senseless.”
“Foster continuous learning and sharing.”
“Encourage a culture based on postmortems and lessons learned.”
I hope you have enjoyed this article and found it useful. As usual, remember that any feedback is more than welcome, I’m easily found on Twitter and other social networks. Cheers!
This is a long article that I wanted to squeeze into a smaller one, but it was almost mission impossible to get rid of some important/valuable information. I hope you enjoy it and find it helpful.
Feel free to provide feedback, which as usual, is more than welcome.
With that being said, I would like to start with a quote from Robert C. Martin:
“Bad code is always imprudent”
I cannot agree more with this, and no matter what I sell you in this post :), there is NEVER a good reason to write bad code.

We as Engineers, Tech Leads and Managers know that technical debt is one of our worst enemies when it comes to codebases and software projects in general. It can be very frustrating and demotivating, making our life a bit more complicated… But…
Let’s try to answer these questions and also explore in depth different techniques and strategies that will help us effectively deal with it.
In an ideal world, a project would be:
If that is your reality, then you can stop reading this post; luckily you have UNLOCKED the SUPERHERO LEVEL, so please share your thoughts and ideas, I am more than curious to know how you achieved it.
Otherwise, welcome to my world, where we create authentic monsters: giant beasts full of technical debt, legacy code, issues and bugs.

And if you let me add more, that also includes coordination and communication problems across the entire organization. Yes! Our software is terrible and we know it is TRUE, which does not make us special, right?
There are many definitions of legacy code and some of them, in my opinion, contradict themselves, so since you are familiar with the concept, let’s keep it simple:
“Legacy code is code without tests.”
Testing nowadays should be implicit in our engineering process when writing code. So if you are not at least unit testing your codebase, go and do it now; consider it an order :).
I came across this term lately and it looks like we can use it as a synonym of technical debt, but in reality, here is the formal definition:
“Reckless Debt is code that violates design principles and good practices.”
In other words, it is junk code generated by us and our team (not on purpose, of course).
Moreover, Reckless Debt will lead to Technical Debt in the short/mid term, and it is also a signal that your team needs more training or has too many inexperienced or junior developers.
Here I will rely on Martin Fowler:
“Technical Debt is a metaphor developed by Ward Cunningham to help us think about this problem. Like a financial debt, technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design.”
So let’s bring our day-to-day life back into the picture. Say we have decided to add a new functionality to our project; here we have 2 well-defined options:
The “easy” way, built up with messy design and code, which will get us there way faster: REMEMBER WE NEED TO PAY THE INTEREST.
The “hard” way, built up with cleaner code and a meaningful and consistent architecture. Without a doubt this will take more time but it is going to be more EFFICIENT IN TERMS OF INTEREST COST.
“Accept some short term Technical Debt for tactical reasons.”
It is not uncommon that at some point we need to develop something quickly because of time to market (or market experiment), or perhaps there is a new internal component that needs to be shipped in order to be used across the entire organization and we are contributing to it (a module for example), and we code it fast with not the best design until we can come up with a more robust and effective solution.
“No matter the reason, part of the decision to accept technical debt is to also accept the need to pay it down at some point in the future. Having good regression-testing assets in place assures that refactoring accepted technical debt in the future can be done with low risk.”
Let’s move on and see how we can analyze and inspect our codebase in order to detect technical debt.
It is the most basic and fundamental building block when it comes to measuring technical debt at a code level.
Most of us are familiar with this practice since it aims to highlight potential bugs, vulnerabilities and complexity.
But first, in order to interpret the results of static code analysis and quantify technical debt, we need to be familiar with a bunch of code metrics:
Cyclomatic Complexity: measures the complexity of classes and methods by analyzing the number of functional paths in the code (if clauses, for example).
Code coverage: the amount of code covered by unit tests. A lack of unit tests is a source of technical debt (we should take this one responsibly, since testing getters and setters can also inflate code coverage).
SQALE rating: a broad evaluation of software quality. The scale goes from A to E, with A being the highest quality.
Rule violations: the number of rules violated from a given set of coding conventions.
Bug count: as technical debt increases, the quality of the software decreases and the number of bugs will likely grow (we can complement this with information from our bug tracker).
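To make the first metric concrete, here is a toy sketch that approximates cyclomatic complexity by counting branch points with Python’s standard ast module. Real analyzers (like the ones behind SonarQube) are far more thorough, so treat this purely as an illustration of how branching multiplies functional paths:

```python
import ast

# Rough cyclomatic-complexity estimate: 1 + number of branch points.
# This is a deliberate simplification of what real analyzers compute.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

simple = "def f(x):\n    return x + 1\n"
branchy = (
    "def f(x):\n"
    "    if x > 0:\n"
    "        for i in range(x):\n"
    "            if i % 2 == 0:\n"
    "                x += i\n"
    "    return x\n"
)

print(cyclomatic_complexity(simple))   # 1: a single functional path
print(cyclomatic_complexity(branchy))  # 4: each branch adds a path
```

Every nested if/for adds another path a unit test has to cover, which is exactly why high complexity and low coverage tend to show up together.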
There is a variety of tools out there (free for open source projects) which provide the above information out of the box, and most of the time they can be easily integrated either with your CI infrastructure or directly with version control platforms like GitHub/GitLab.
Here is a screenshot of one codebase example using the online open source tool SonarQube:

Lint is also a very flexible and popular one (there are plugins for the most popular IDEs and you can write your own custom rules, in this case on Android):

Static code analysis should be our first mandatory step to tackle and measure technical debt.
So let’s make sure we include it as a regular practice in our engineering process.
A Tech Debt Radar is a very simple tool that has personally given me really good results (while working at @SoundCloud, within the Android team, it was (and AFAIK still is) a regular practice).
“We should know that this is not an automated tool (like the ones mentioned above) and I define it as a Social Technical Debt Detector by Experience”.
The idea is pretty simple: all the feedback about how difficult it is to work with the current codebase comes from the developers actually working with it (by experience).
You can see a Tech Debt Radar in the picture below:

As we can see, there is a board with a few post-its, each representing either a feature or an area of the codebase that is hard to work with.
Then we have 2 axes:
At a process level, this is done in a meeting with the development team and a technical debt captain (someone who will be in charge of analysing technical debt).
Basically each member of the team, will have the chance to place these post-its depending on how much pain (X axis) each is causing, and how much development time (Y axis) is required to fix it.
This would be mostly common sense (with strong arguments and an explanation of the whys) in the beginning, but I can assure you it will get better over time with accumulated experience.
As an example on the board, let’s look at the DI card (Dependency Injection). It looks like it is a very painful area in our project and refactoring it will require a big effort. On the other hand, Login is causing a lot of pain and fixing it will not be very complicated.
With this in mind you can get some conclusions:
By addressing all features that are painful and at the same time require little development time (the ones placed in the upper-left corner), we will be able to provide a lot of value and improvement by fixing them.
The rest of the functionalities will require some workout to be prioritized and refactored. As a rule of thumb, discuss with the team and use a divide and conquer approach (split up big problem into smaller ones).
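The prioritization rule above (high pain, low effort first) can be sketched as a tiny scoring exercise; the cards and numbers below are invented for illustration:

```python
# Tech-debt radar cards: (name, pain 1-10, effort 1-10). Invented data.
cards = [("Login", 8, 2), ("DI", 9, 9), ("Search", 4, 6), ("Cache", 6, 3)]

# High pain and low effort should be tackled first, so rank by pain/effort.
ranked = sorted(cards, key=lambda c: c[1] / c[2], reverse=True)

for name, pain, effort in ranked:
    print(f"{name}: pain={pain}, effort={effort}")
# Login (pain 8, effort 2) comes first; DI, despite maximum pain,
# drops down because it is also maximum effort.
```

A ratio like this is only a starting point for the team discussion, not a replacement for it.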
Once we gather all this information, we need to keep track of all the collected feedback, so feel free to use your favorite tool for that purpose.
Even a document might do the job: this is a matter of taste, as soon as you have a place to store all this data and see the evolution over time.
A Technical Debt Radar will not provide the level of granularity and detail of an automated tool, but it is totally worth a try: a very valuable method that perfectly complements our codebase analysis by surfacing the most painful spots. Most importantly, this information comes from us, from the feedback of the people who work with the code daily.
Remember to hold these meetings regularly (at least once every 2-3 weeks) in order to keep an eye on how much progress (positive or negative) has been made.
It is obvious that technical debt has a 1-to-1 relationship with legacy code, but there is another important factor to take into consideration: the social part of our organization, which basically emphasizes how we as human beings interact with each other (as a team), with customers, with the rest of the organization and with the code itself.
All this comes from the fact that, over the years, there have been changes in the way we work and interact with each other, which have led to modifications in collaboration techniques, tools and, again, the code itself.
References like Adam Tornhill in the area of human psychology and code are helping us understand this social part a bit better.
Before continuing, let’s recap what a traditional static code analysis tool can do for us:
In conclusion, static analysis is a very useful tool and as pointed out above, should be our first step when it comes to code inspection, but there is an important gap to fill in:
“Static analysis will never be able to tell you if that excess code complexity actually matters – just because a piece of code is complex doesn’t mean it is a problem.”
Social aspects of software development like coordination, communication, and motivation issues increase in importance and all these softer aspects of software development are invisible in our code:
“Adam Tornhill: if you pick up a piece of code from your system there’s no way of telling whether it was written by a single developer or whether that code is a coordination bottleneck for five development teams. That is, we miss an important piece of information: the people side of code.”
Behavioral code analysis emphasizes trends in the development of our codebase by mining version-control data.
Since version-control data is also social data, we know exactly which programmer wrote each piece of code. With this in mind, it is possible to build knowledge maps of a codebase, for example like the one in the next figure, which shows the main developers behind each module:

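A minimal sketch of how such a knowledge map can be derived from version-control data; the commit log below is invented, but in a real repository the same (author, file) pairs can be mined with `git log --format='%an' --name-only`:

```python
from collections import Counter, defaultdict

# Toy commit log: (author, file) pairs. Names and paths are invented.
commits = [
    ("alice", "player/engine.py"), ("alice", "player/engine.py"),
    ("alice", "player/ui.py"),     ("bob", "search/index.py"),
    ("bob", "search/index.py"),    ("carol", "search/index.py"),
]

# Knowledge map: for each module, who authored most of the changes.
per_module = defaultdict(Counter)
for author, path in commits:
    module = path.split("/")[0]
    per_module[module][author] += 1

for module, authors in per_module.items():
    main_dev, n = authors.most_common(1)[0]
    print(f"{module}: main developer {main_dev} ({n} changes)")
```

Tools like code-maat and Codescene do essentially this at scale, with much smarter weighting.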
To better understand what we are talking about, we will dive deeper into an online toolset called Codescene.io, which is free for open source projects.
Needless to say, apart from being a great helper with a nice UI, the platform is largely based on an open source project called code-maat by the same author.
Let’s see what Codescene is capable of…
In essence, a hotspot is complicated code that you have to work with often.
Its calculation is pretty simple:

With a Hotspot analysis we can get a hierarchical map that lets us analyze our codebase interactively.
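The calculation boils down to weighting how often a file changes by how complicated it is. Here is a minimal sketch with invented file names and numbers, using lines of code as a crude complexity proxy:

```python
# Hotspot score = change frequency x complexity proxy.
# The data below is invented; in practice the change counts come from
# version-control history (e.g. `git log --name-only`) and the complexity
# proxy could be lines of code or cyclomatic complexity.
changes = {"payment.py": 42, "login.py": 30, "utils.py": 5}
lines_of_code = {"payment.py": 1200, "login.py": 200, "utils.py": 900}

hotspots = sorted(
    ((f, changes[f] * lines_of_code[f]) for f in changes),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in hotspots:
    print(f"{name}: {score}")
# payment.py ends up on top: it is both large AND frequently changed,
# which is exactly the combination a hotspot analysis flags.
```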
By using one of the examples of the platform, we can check the following visualizations where each file is represented as a circle:

As we can see, we can also identify clusters of Hotspots that indicate problematic sub-systems.
By clicking on a Hotspot we can get more details to get deeper information:

The main benefits of a Hotspot analysis include:
Maintenance problem identification: information on where the complicated code we often have to work with sits. This is useful to prioritize re-designs.
Risk management: It could be risky to change/extend functionality in a Hotspot for example. We can identify those areas up-front and schedule additional time or allocate extra testing efforts.
Defect detection: it can identify parts of the codebase that seem unstable, with lots of development activity.
Here is the full documentation with more details.
In medicine, biomarkers stand for measurements that might indicate a particular disease or physiological state of an organism. We can do the same for code to get a high-level summary of the state of our hotspots and the direction our code is moving in.
Code biomarkers act like a virtual code reviewer that looks for patterns that might indicate problems.
They are scored from A to E where A is the best and E indicates code with severe potential problems.
Let’s have a look at a couple of examples listing risky areas of our code base:


In conclusion we can use Code Biomarkers to:
Decide when it’s time to invest in technical improvements instead of adding new features at a high pace.
Get immediate feedback on improvements.
Same as with hotspots, here is also the biomarkers full documentation.
There is way more to cover in this field like:
But from here I will leave it to you, otherwise this article would become too long. The idea, by the way, was to awaken your curiosity (hopefully I have achieved it) and shed some light on what is possible by exploring the social side of the code.
“Behavioral code analysis helps you ask the right questions, and points your attention to the aspects of your system – both social and technical – that are most likely to need it. You use this information to find parts of the code that may have to be split and modularized to facilitate parallel development by separate teams, or, find opportunities to introduce a new team into your organization to take on a shared responsibility.”
I definitely encourage you to give Codescene a try either within an open source repo or within the existent samples, you will be surprised how much curious stuff you find :).
I would like to introduce an open source repository visualization tool called Gource.
Here is how the author describes it:
“Software projects are displayed by Gource as an animated tree with the root directory of the project at its centre. Directories appear as branches with files as leaves. Developers can be seen working on the tree at the times they contributed to the project.”
In essence you can grab your git repository, run gource on it and the result is something like this (This is an example of the Bitcoin repository and its evolution):
The documentation sits at the Gource Github Wiki.
As a trick, we have kept it running on a monitor during sprints to make how we move around our codebase more visible and transparent. Really fun!
“The best way to reduce technical debt in new projects is to include technical debt in the conversation early on.”
As this quote suggests, this is more at a process level: even though we have our refactoring toolbox, without the effort of the team it would be impossible to minimize future technical debt and repair the existing debt.
So let’s see how we can deal with these contexts by pointing out a few tips for the action plan.
As a conclusion, I would like to finish this section with a bunch of quotes from Adam Tornhill (a reference in this field):
“Technical debt can be a frustrating and de-motivating topic for many Development Teams.”
“The keyword is transparency.”
“Explain the cost of low-quality code by using the transparent metaphor of ‘technical debt’.”
“Make technical debt visible in the code using a variety of objective metrics, and frequently re-evaluate these metrics.”
“Finally, make technical debt visible on the Product and/or Sprint Backlog.”
“Don’t hide Technical Debt from the Product Owner and the broader organization.”
Technical debt is a ticking bomb, and as our lovely Batman from 1966 (played by Adam West) would say (you can check the full 2-minute video here, BTW one of my favorite scenes ever):
And based on this inspiring quote let me rephrase it to:
It is a reality that technical debt exists in 99% of codebases; it is also an important challenge we must face to keep our software projects healthy and maintainable.
Hopefully there is light at the end of the tunnel and with the different techniques mentioned above, now you have a couple of new tools in your toolbox to address it effectively.
Have fun and do not let technical debt beat you.
Part of this article came out of a talk I gave about TECHNICAL DEBT recently, you can check the slides:
There is also a sketch that perfectly summarizes the main idea of my talk, courtesy of @lariki and @Miqubel:

And finally a video of my talk at Mobiconf:
This is a long article that I wanted to squeeze in a smaller one but it was almost mission impossible to get rid of some important/valuable information. I hope you enjoy and find it helpful.
Feel free to provide feedback, which as usual, is more than welcome.
With that being said, I would like to start with a quote from Robert C. Martin:
“Bad code is always imprudent”
I cannot agree more with this, and no matter what I sell you in this post :), there is NEVER a good reason to write bad code.

We as Engineers, Tech Leads and Managers know that technical debt is one of our worst enemies when it comes to codebases and software projects in general It can be very frustrating and demotivating thus making our life a bit more complicated…But…
Let’s try to answer these questions and also explore in depth different techniques and strategies that will help us effectively deal with it.
In an ideal world, a project would be:
If that is your reality, then you can stop reading this post, luckily you have UNLOCKED the LEVEL SUPERHEROE, so please share your thoughts and ideas, I am more than curious to know how you have achieved it.
Otherwise, welcome to my world, where we create authentic monsters: giant beasts full of technical debt, legacy code, issues and bugs.

And if you let me add more, that also includes coordination and communication problems across the entire organization. Yes! Our Software is terrible and we know it is TRUE, which does not make it any special, right?
There are many definitions of legacy code and some of them, in my opinion, contradict themselves, so since you are familiar with the concept, let’s keep it simple:
“Legacy code is code without tests.”
Testing nowadays should be implicit in our engineering process when writing code. So if you are not at least unit testing your codebase, run and do it, it is a command :).
I came across this term lately and it looks like we can use it as a synonym of technical debt, but in reality, here is the formal definition:
“Reckless Debt is code that violates design principles and good practices.”
That means it is junk code generated by us and our team (not on purpose, of course).
Moreover, Reckless Debt will lead to Technical Debt in the short/mid term, and it is also a signal that your team needs more training, or that you have too many inexperienced or junior developers.
Here I will rely on Martin Fowler:
“Technical Debt is a metaphor developed by Ward Cunningham to help us think about this problem. Like a financial debt, technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice. We can choose to continue paying the interest, or we can pay down the principal by refactoring the quick and dirty design into the better design.”
So let’s bring our day-to-day work back into the picture. Say we have decided to add new functionality to our project; here we have 2 well-defined options:
The “easy” way, built up with messy design and code, which will get us there way faster: REMEMBER WE NEED TO PAY THE INTEREST.
The “hard” way, built up with cleaner code and a meaningful and consistent architecture. Without a doubt this will take more time but it is going to be more EFFICIENT IN TERMS OF INTEREST COST.
“Accept some short term Technical Debt for tactical reasons.”
It is not uncommon that at some point we need to develop something quickly because of time to market (or a market experiment). Or perhaps there is a new internal component that needs to be shipped in order to be used across the entire organization and we are contributing to it (a module, for example), so we code it fast, with not the best design, until we can come up with a more robust and effective solution.
“No matter what the reason is, part of the decision to accept technical debt is also accepting the need to pay it down at some point in the future. Having good regression testing assets in place assures that refactoring accepted technical debt in the future can be done with low risk.”
Let’s move on and see how we can analyze and inspect our codebase in order to detect technical debt.
Static code analysis is the most basic and fundamental building block when it comes to measuring technical debt at the code level.
Most of us are familiar with this practice since it aims to highlight potential bugs, vulnerabilities and complexity.
But first, in order to interpret the results of static code analysis and quantify technical debt, we need to be familiar with a bunch of code metrics:
Cyclomatic Complexity: measures the complexity of classes and methods by counting the number of independent execution paths through the code (if clauses, for example).
Code coverage: a lack of unit tests is a source of technical debt. This is the amount of code covered by unit tests (we should take this one responsibly, since testing getters and setters can also increase code coverage).
SQALE-rating: Broad evaluation of software quality. The scale goes from A to E, with A being the highest quality.
Rule violations: the number of rules violated from a given set of coding conventions.
Bug count: As technical debt increases, the quality of the software decreases. The number of bugs will likely grow (We can complement this one with information coming from our bug tracker).
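To make the first metric above concrete, here is a minimal sketch of how cyclomatic complexity can be approximated in Python using the standard `ast` module. The set of branch nodes and the `1 + branches` formula are a simplification of the full McCabe definition, for illustration only; real tools count more constructs:

```python
import ast

# Decision points that each add an independent path through the code
# (a simplified subset of what full McCabe complexity counts).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of branch points."""
    tree = ast.parse(source)
    branches = sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))
    return 1 + branches

snippet = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    return "positive"
"""
print(cyclomatic_complexity(snippet))  # if + elif -> complexity 3
```

The same idea scaled up (per method, per class, tracked over time) is exactly what the tools below report out of the box.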
There is a variety of tools out there (free for open source projects) which provide the above information out of the box, and most of the time they can be easily integrated either with your CI infrastructure or directly with your version control platform (GitHub, GitLab, etc.).
Here is a screenshot of an example codebase analyzed with the open source tool SonarQube:

Lint is also a very flexible and popular one (there are plugins for the most popular IDEs and you can write your own custom rules, in this case on Android):

Static code analysis should be our first mandatory step to tackle and measure technical debt.
So let’s make sure we include it as a regular practice in our engineering process.
A Tech Debt Radar is a very simple tool that has personally given me really good results (while working at @SoundCloud, within the Android team, it was (and AFAIK still is) a regular practice).
“We should know that this is not an automated tool (like the ones mentioned above) and I define it as a Social Technical Debt Detector by Experience”.
The idea is pretty simple: all the feedback about how difficult it is to work with the current codebase comes directly from the developers working with it (by experience).
You can see a Tech Debt Radar in the picture below:

As we can see, there is a board with a few post-its, each representing either a feature or an area of the codebase that is hard to work with.
Then we have two axes:
At a process level, this is done in a meeting with the development team and a technical debt captain (someone who will be in charge of analysing technical debt).
Basically, each member of the team will have the chance to place these post-its depending on how much pain (X axis) each item is causing and how much development time (Y axis) is required to fix it.
This will be mostly common sense in the beginning (with strong arguments and an explanation of the whys), but I can assure you that it will get better over time with the accumulated experience.
As an example on the board, let’s look at the DI card (Dependency Injection). It looks like it is a very painful area in our project and refactoring it will require a big effort. On the other hand, Login is causing a lot of pain and fixing it will not be very complicated.
With this in mind you can get some conclusions:
By addressing all features that are painful and at the same time require little development time (the ones placed in the upper-left corner), we will be able to provide a lot of value and improvement by fixing them.
The rest of the functionalities will require more work to be prioritized and refactored. As a rule of thumb, discuss with the team and use a divide and conquer approach (split big problems into smaller ones).
Once we gather all this information, we need to keep track of all the collected feedback, so feel free to use your favorite tool for that purpose.
Even a document might do the job: this is a matter of taste, as long as you have a place to store all this data and see its evolution over time.
A Tech Debt Radar will not provide the level of granularity and detail that automated tools might, but it is totally worth a try. It is a very valuable method that perfectly complements our codebase analysis by pinpointing the most painful spots, and most importantly, this information comes from us, from the feedback of the people who work with the code daily.
Remember to hold these meetings regularly (at minimum once every 2-3 weeks) in order to keep an eye on how much progress (positive or negative) has been made.
It is obvious that technical debt has a 1-to-1 relationship with legacy code, but there is another important factor to take into consideration: the social part of our organization, which basically covers how we as human beings interact with each other (as a team), with customers, with the rest of the organization and with the code itself.
All this comes from the fact that over the years there have been changes in the way we work and interact with each other, which have led to modifications in collaboration techniques, tools and, again, the code itself.
References like Adam Tornhill, in the area of human psychology and code, are helping us understand this social part a bit better.
Before continuing, let’s recap what a traditional static code analysis tool can do for us:
In conclusion, static analysis is a very useful tool and as pointed out above, should be our first step when it comes to code inspection, but there is an important gap to fill in:
“Static analysis will never be able to tell you if that excess code complexity actually matters – just because a piece of code is complex doesn’t mean it is a problem.”
Social aspects of software development like coordination, communication, and motivation issues increase in importance and all these softer aspects of software development are invisible in our code:
“Adam Tornhill: if you pick up a piece of code from your system, there’s no way of telling if it has been written by a single developer or if that code is a coordination bottleneck for five development teams. That is, we miss an important piece of information: the people side of code.”
Behavioral code analysis emphasizes trends in the development of our codebase by mining version-control data.
Since version-control data is also social data, we know exactly which programmer wrote each piece of code, and with this in mind it is possible to build knowledge maps of a codebase, like the one in the next figure, which shows the main developers behind each module:

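As a rough illustration of how such a knowledge map can be mined, here is a small Python sketch that parses the output of `git log --format='--%aN' --numstat` and attributes each file to its main contributor by lines added. The sample log and the weighting scheme are simplified assumptions for illustration; this is not how code-maat works internally:

```python
from collections import Counter, defaultdict

def knowledge_map(git_log: str) -> dict:
    """Build {file: main_author} from `git log --format='--%aN' --numstat`.

    Lines starting with '--' carry the author name; numstat lines are
    '<added>\t<deleted>\t<path>'. Lines added are used as the
    contribution weight (a simplifying assumption).
    """
    contributions = defaultdict(Counter)
    author = None
    for line in git_log.splitlines():
        if line.startswith("--"):
            author = line[2:]
        elif line.strip():
            added, _deleted, path = line.split("\t")
            if added != "-":  # '-' marks binary files in numstat output
                contributions[path][author] += int(added)
    return {path: counts.most_common(1)[0][0]
            for path, counts in contributions.items()}

# Hypothetical, hand-written log excerpt for demonstration:
log = (
    "--alice\n"
    "10\t2\tsrc/player.rs\n"
    "1\t0\tsrc/ui.rs\n"
    "--bob\n"
    "3\t1\tsrc/player.rs\n"
    "20\t5\tsrc/ui.rs\n"
)
print(knowledge_map(log))  # {'src/player.rs': 'alice', 'src/ui.rs': 'bob'}
```

Run against a real repository's log, the same aggregation is what lets a tool color each module by its dominant author.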
To better understand what we are talking about, we will be diving deeper into an online toolset called Codescene.io, which is free for open source projects.
Needless to say, apart from being a great helper with a nice UI, the platform is mostly based on an open source project called code-maat from the same author.
Let’s see what Codescene is capable of…
In essence, a hotspot is complicated code that you have to work with often.
Its calculation is pretty simple:

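As a sketch of that calculation, the following Python snippet ranks files by change frequency multiplied by a complexity proxy (here, lines of code, a cheap stand-in commonly used for this kind of analysis). The file names and numbers are made up for illustration:

```python
def hotspots(change_counts: dict, loc: dict) -> list:
    """Rank files by change frequency x complexity.

    change_counts: revisions per file (from version-control history).
    loc: lines of code per file, used as a cheap complexity proxy.
    Returns (file, score) pairs, highest score first.
    """
    scores = {f: change_counts[f] * loc.get(f, 0) for f in change_counts}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative numbers only:
changes = {"Player.java": 40, "Utils.java": 12, "Config.java": 3}
size = {"Player.java": 900, "Utils.java": 150, "Config.java": 60}
print(hotspots(changes, size))
# Player.java dominates: changed often AND large -> top hotspot
```

The intuition is exactly the one above: code that is both complicated and frequently touched floats to the top, while large-but-stable or small-but-busy files rank lower.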
With a Hotspot analysis we can get a hierarchical map that lets us analyze our codebase interactively.
By using one of the examples of the platform, we can check the following visualizations where each file is represented as a circle:

We can also identify clusters of Hotspots that indicate problematic subsystems.
By clicking on a Hotspot we can drill down for deeper information:

The main benefits of a Hotspot analysis include:
Maintenance problem identification: information on where the complicated code that we have to work with often sits. This is useful for prioritizing redesigns.
Risk management: It could be risky to change/extend functionality in a Hotspot for example. We can identify those areas up-front and schedule additional time or allocate extra testing efforts.
Defect detection: it can identify parts of the codebase that seem unstable, with lots of development activity.
Here is the full documentation with more details.
In medicine, biomarkers stand for measurements that might indicate a particular disease or physiological state of an organism. We can do the same for code to get a high-level summary of the state of our hotspots and the direction our code is moving in.
Code biomarkers act like a virtual code reviewer that looks for patterns that might indicate problems.
They are scored from A to E where A is the best and E indicates code with severe potential problems.
Let’s have a look at a couple of examples listing risky areas of our codebase:


In conclusion we can use Code Biomarkers to:
Decide when it’s time to invest in technical improvements instead of adding new features at a high pace.
Get immediate feedback on improvements.
As with Hotspots, here is also the full Biomarkers documentation.
There is way more to cover in this field like:
But from here I will leave it to you, otherwise this article will be too long. The idea, by the way, was to wake up your curiosity (hopefully I have achieved it) and shed some light on what is possible by exploring the social side of the code.
“Behavioral code analysis helps you ask the right questions, and points your attention to the aspects of your system – both social and technical – that are most likely to need it. You use this information to find parts of the code that may have to be split and modularized to facilitate parallel development by separate teams, or, find opportunities to introduce a new team into your organization to take on a shared responsibility.”
I definitely encourage you to give Codescene a try, either with an open source repo or with the existing samples; you will be surprised how much interesting stuff you find :).
I would like to introduce an open source repository visualization tool called Gource.
Here is how the author describes it:
“Software projects are displayed by Gource as an animated tree with the root directory of the project at its centre. Directories appear as branches with files as leaves. Developers can be seen working on the tree at the times they contributed to the project.”
In essence, you can grab your git repository, run gource on it, and the result is something like this (this is an example of the Bitcoin repository and its evolution):
The documentation sits at the Gource Github Wiki.
As a tip, we have had it running on a monitor during sprints to make how we move around our codebase more visible and transparent. Really fun!
“The best way to reduce technical debt in new projects is to include technical debt in the conversation early on.”
As this quote suggests, this is more at a process level, and even though we have our refactoring toolbox, without the effort of the team it would be impossible to minimize future technical debt and repair the existing one.
So let’s see how we can deal with these contexts by pointing out a few tips for the action plan.
As a conclusion, I would like to finish this section with a bunch of quotes from Adam Tornhill (a reference in this field):
“Technical debt can be a frustrating and de-motivating topic for many Development Teams.”
“The keyword is transparency.”
“Explain the cost of low-quality code by using the transparent metaphor of ‘technical debt’.”
“Make technical debt visible in the code using a variety of objective metrics, and frequently re-evaluate these metrics.”
“Finally, make technical debt visible on the Product and/or Sprint Backlog.”
“Don’t hide Technical Debt from the Product Owner and the broader organization.”
Technical debt is a ticking bomb, and as our lovely Batman from 1966 (played by Adam West) would say (you can check the full 2-minute video here, BTW one of my favorite scenes ever):
And based on this inspiring quote let me rephrase it to:
It is a reality that technical debt exists in 99% of codebases; it is also an important challenge we must face to keep our software projects healthy and maintainable.
Fortunately, there is light at the end of the tunnel, and with the different techniques mentioned above you now have a couple of new tools in your toolbox to address it effectively.
Have fun and do not let technical debt beat you.
Part of this article came out of a talk I gave about TECHNICAL DEBT recently, you can check the slides:
There is also a sketch that perfectly summarizes the main idea of my talk, courtesy of @lariki and @Miqubel:

And finally a video of my talk at Mobiconf: