artemis.sh

Bidirectional I2S with PIO

2026-03-22T00:00:00+00:00

Raspberry provide an example I2S output PIO program. I needed simultaneous input and output, so I adapted their program for that. Code at the end of the post.

The default I2S program has output like this:

Bit clock low, output data
Bit clock high, remote codec chip latches data

We want this:

Bit clock low, we output data, so does remote
Bit clock high, we latch data, so does remote

To do this we need INs as well as OUTs. We can’t squeeze that into existing cycle count, so we run this PIO program 2x as fast as raspberry’s output-only program would run for the same sample rate. That means our final PIO clock should be 4 * sample_rate * 32 for 16-bit stereo.

We do all the extra work during BCLK=1, so all our outputs just get an extra delay cycle. Nice and easy. As with the original, adjust the set commands as necessary for higher bit depths than 16-bit.

You’ll also need to configure the state machine for input, in the same way as output. Output has autopull @ 32 and shift left, so input wants autopush @ 32 and shift left also. And don’t forget to set up the input pin. By the way, don’t forget you need to manually jump to entry_point. I forgot to do that in my rust code and was very confused when my audio bits were out of wack.

If you’ve never done I2S with PIO, I’d recommend referencing the original file for the setup code and notes on usage.

Also, a note on clocks. Raspberry suggest that “Fractional [PIO clock] divider will probably be needed to get correct bit clock period, but for common sysclk freqs this should still give a constant word select period”. If you are targeting a common sample rate, I think this is not true for the default 150MHz sysclk. At least, 44100Hz and 48000Hz give me fractions even across an entire word or frame. You will need to adjust your sysclk a bit if you really need one of the standard sample rates.

Using a fractional divider would be bad in general for my chip. Right now I have it deriving its clock directly from BCLK, so I definitely don’t want that jittering.

But, I don’t like having clock jitter of any sort around any of my audio code even if on paper it seems like it’s fine (I’ve been fighting this with my GBA project…). Just don’t use a fractional divider unless you’re desperate, your life will be easier for it. Adjust your sysclk based on what sample rates you need, or if you can get away with it, just use a nonstandard sample rate. It’s fine.

Anyway, here’s the code:

;
; Copyright (c) 2020 Raspberry Pi (Trading) Ltd.
; Copyright (c) 2026 Artemis Everfree
;
; Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
; following conditions are met:
; 
; 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following
;    disclaimer.
; 
; 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
;    disclaimer in the documentation and/or other materials provided with the distribution.
; 
; 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products
;    derived from this software without specific prior written permission.
; 
; THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
; INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
; DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
; SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
; SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
; THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

.program audio_i2s_bidi
.side_set 2

                    ;        /--- LRCLK
                    ;        |/-- BCLK
bitloop1:           ;        ||
    out pins, 1       side 0b10    [1]
     in pins, 1       side 0b11
    jmp x-- bitloop1  side 0b11
    out pins, 1       side 0b00    [1]
     in pins, 1       side 0b01
    set x, 14         side 0b01

bitloop0:
    out pins, 1       side 0b00    [1]
     in pins, 1       side 0b01
    jmp x-- bitloop0  side 0b01
    out pins, 1       side 0b10    [1]
     in pins, 1       side 0b11
public entry_point:
    set x, 14         side 0b11

And here’s a little bonus for how to initialize the state machine and jump to entry_point with the rust pio crate / hal.

let i2s_prog = pio::pio_file!("./src/i2s.pio", select_program("audio_i2s_bidi"));

// Save entry_point label so we can jump to it later
let i2s_entry = i2s_prog.public_defines.entry_point; 

let i2s_prog = i2s_prog.program;

let (mut pio, sm0, _, _, _) = pac.PIO0.split(&mut pac.RESETS);
let i2s_prog = pio.install(&i2s_prog).unwrap();
let i2s_offset = i2s_prog.offset();
let (mut i2s_pio_sm, mut i2s_rx, mut i2s_tx)
    = PIOBuilder::from_installed_program(i2s_prog)

    // Output config
    .out_pins(11 /* GPIO_11 */, 1)
    .side_set_pin_base(14 /* GPIO_14 */)
    .autopull(true)
    .pull_threshold(32)
    .out_shift_direction(ShiftDirection::Left)

    // Input config
    .in_count(1)
    .in_pin_base(10 /* GPIO_10 */)
    .autopush(true)
    .push_threshold(32)
    .in_shift_direction(ShiftDirection::Left)

    // Gives a 47261.538461538Hz sample rate with 150MHz SYSCLK.
    // Adjust SYSCLK to get something closer to standard if you need.
    .clock_divisor_fixed_point(26, 0) 
    .build(sm0);

// THE PINS FOR IT
let mut pi2s_di = pins
    .gpio10
    .into_function::();
let mut pi2s_do = pins
    .gpio11
    .into_function::();
let mut pi2s_fsync = pins
    .gpio14
    .into_function::();
let mut pi2s_bclk = pins
    .gpio15
    .into_function::();
i2s_pio_sm.set_pindirs([
    (10, PinDir::Input),
    (11, PinDir::Output),
    (14, PinDir::Output),
    (15, PinDir::Output),
]);

// Jump to entry_point label
let entry_addr = i2s_entry as u16 + i2s_offset as u16;
assert!(entry_addr < 0b1_00000);
let instr = pio::Instruction {
    operands: pio::InstructionOperands::JMP {
        condition: pio::JmpCondition::Always,
        address: entry_addr as u8,
    },
    delay: 0,

    // If this is None, instruction encoding panics. just to check
    // if you're awake. keep you on your toes. make sure you know
    // your installed program is side setting whether you like it
    // or not.
    side_set: Some(0), 
};
i2s_pio_sm.exec_instruction(instr);

// Run the thing
let mut i2s_pio_sm = i2s_pio_sm.start();

// Now you can do stuff with the FIFOs.

raspi3 luks - aes-xts vs xchacha,aes-adiantum

2025-10-03T00:00:00+00:00

The raspberry pi 3 doesn’t have the hardware crypto extensions for aarch64, at least going by cpuinfo. That’s bad news for using AES for disk encryption, because software implementations of AES are slow. But there’s a disk encryption algorithm that uses your pick of chacha12 or chacha20 for the bulk data encryption. These algorithms were designed to be better suited to software and simd implementations than AES, so they’re a lot faster on the pi.

no hw crypto

Before I tell you about that, here’s what I’m going off of to see we don’t have hardware instructions for AES. I’ll compare with my Rock64 which does.

First, checking /proc/cpuinfo

# pi3
Features	: fp asimd evtstrm crc32 cpuid

# rock64
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp

Second, cat /proc/crypto | awk '$1 == "driver" && $3 ~ /^aes/'

# pi3
driver       : aes-arm64
driver       : aes-generic

# rock64
driver       : aes-arm64
driver       : aes-ce
driver       : aes-generic

I think aes-arm64 is an arm64 implementation (maybe using simd? or maybe just hand-crafted assembly) optimized as much as it can be, while aes-ce is using the aes instruction (ce = “crypto extensions” perhaps?).

wtf is adiantum?

I don’t feel like I can explain the crypto of it, so I’m not going to try. High level, some folks who were working at google wanted to make a faster disk encryption option for phones that didn’t have hardware AES acceleration. They cooked a composite-algorithm up, using chacha12 or chacha20 for the bulk of the work, and some other existing algorithms for some other parts of the encryption process. And that’s called Adiantum.

AES is sort of a household name in crypto at this point, but if you’re wondering where you might have used chacha20 before, WireGuard notably uses it as its only symmetric encryption cipher.

You should go read the paper if you want the maths of it.

It was merged into linux years ago so you can use it pretty much anywhere, but if you’ve got hardware aes acceleration then the usual aes-xts option beats it anyway.

Note there’s no 512-bit key version of adiantum. I think since the key is being used in a different way its not a direct comparison to aes-xts for what that means for security.

As for chacha20 vs chacha12, the number is how many rounds of the algorithm it’s doing. My understanding is chacha20 was the first variant, and then some folks proposed that a reduced number of rounds would still be secure. I can’t be the authority to tell you if chacha12 is good enough for you or not though.

benchmarks

These will just be running cryptsetup benchmark, which is a synthetic benchmark that tells the kernel to do some encryption work in memory, without an actual IO layer involved.

These are all on the pi3; comparing to rock64 wouldn’t be really interesting here since they’ve got different clocks anyway.

# Pin CPU to max clock speed so dynamic scaling doesnt mess
# with the numbers
echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

cryptsetup benchmark -c aes-xts -s 256
cryptsetup benchmark -c aes-xts -s 512
cryptsetup benchmark -c xchacha12,aes-adiantum
cryptsetup benchmark -c xchacha20,aes-adiantum

Algorithm	Key	Encryption	Decryption
aes-xts	256b	34.5 MiB/s	35.1 MiB/s
aes-xts	512b	26.3 MiB/s	26.8 MiB/s
xchacha12,aes-adiantum	256b	143.9 MiB/s	144.0 MiB/s
xchacha20,aes-adiantum	256b	121.7 MiB/s	121.7 MiB/s

As you can see, it’s quite the difference!

encrypting a LUKS partition with it

Consult your favorite guide for using luks, but instead of --cipher aes-xts-plain64, use --cipher xchacha20,aes-adiantum-plain64.

Defragging my old Dell’s UEFI NVRAM

2025-02-22T00:00:00+00:00

I’m setting up an old Dell box (2011-era tech). I was migrating to some new boot drives and setting up grub on them again. Got to the point of deleting the old boot entries and adding new ones, and got an error like Could not prepare boot variable: No space left on device. Weird, because there weren’t that many EFI variables, but df -h also reported that efivars was full:

efivarfs              64K   60K   0K  100% /sys/firmware/efi/efivars

funky.

It wasn’t due to a glut of boot entries either; I’m pretty keen on keeping the old ones cleared out, so we only had a few on here.

Well, I thought maybe there was a lot of dead space in the nvram that wasn’t in use but also couldn’t be allocated. So I booted an EFI shell, and did this:

fs0:
dmpstore -s efi-vars
dmpstore -d
dmpstore -l efi-vars

Please be careful with these commands, it worked for me but for all I know it might brick your setup. Check help dmpstore first.

What this does is:

switch to fs0 filesystem.
dump all the EFI variables from nvram, copy them to a file on disk.
delete all the variables from nvram.
load all the variables back from the file, and copy them back into nvram.

After doing this, I rebooted, and had plenty of space:

efivarfs              64K   14K   46K  24% /sys/firmware/efi/efivars

And all my UEFI configuration seems to be right as I left it before doing this. So I think this confirms my suspicions.

The Serenity of -j1

2024-07-29T00:00:00+00:00

It’s another post about our thinkpad :). We’ve been using gentoo on our x220 for a few years now. The x220 is a thinkpad from 2011, and ours is rolling with an i7-2620M processor, 2 cores 4 threads. When all the threads are saturated, the fans really start to scream a bit, and it’s not a pleasant sound. Fine for a few minutes, certainly not for longer. What’s there to do about it?

Early on we wrote a custom fan control script to implement different fan curves from stock. The stock curves idle the fan one speed step higher than it really needs to be and we didn’t like that. But under load the fans are screaming either way so the script isn’t really impactful there.

Then we disabled turbo. We still have turbo disabled actually. This keeps it from being forced into the max fan state under load unless ambient temperature is particularly high; particularly nice if we do happen to keep it under load for awhile. For example, OSRS and Minecraft run fine on this thing, but they’ll keep it pegged at turbo clocks. Turning off turbo means less heat, and it also means more power efficiency, which is great for this era of laptop where battery life is not really a strong suit.

Ok but it’s still loud, ultimately. We would stick to running package updates over night, which meant we had to think to do that, and meant we had a cutoff for when we had to stop actually using the laptop, unless we wanted to be wearing headphones to block out the sound. I’ll wear headphones while gaming but I don’t want them on my entire time using a computer. And of course we’d have to interrupt them again in the morning to use our computer. Quite obnoxious really.

The ultimate the solution, it turns out, is quite simple: relax and use -j1.

No, really.

With -j1, this laptop’s fans stay well within the quiet RPMs. But on top of that, it means that software compilation is not eating all the system resources, we have 3 threads and more ram for the rest of our computing needs. Combined, this means that software updates can happen whenever, not just over night, and it’s no problem.

So now, when I want to update our system, I just run an update, whenever, regardless of what we’re doing. I set it aside to run in the background, and then I forget about it. It may take longer in wall-clock time, but it is out of sight, out of hearing, out of mind. So in terms of how much of our mental space it takes up, much much less, and that matters far more. We will interrupt it if we need to be on battery for an extended period, but that’s about it.

I’ve taken this from software updates to other parts of my life. If I need to work on a rust or haskell project, I will start off by installing all the dependencies with -j1 and going away to do something else for awhile. Then I come back later to do the work I actually need to do. And for that I will use -j4 probably, because really I do not want to make my iteration times longer than they already are. But the initial build of the dependency tree takes so long that I need to go do something else while I wait anyway, so it taking 10 or 20 minutes more is really quite fine, especially since our computer is not annoying me in the process.

As for why we are using gentoo on an old laptop like this in the first place… would you believe me if I said it saves us time and mental energy compared to the alternatives? Hah. Such is the curse of the adept rock talker.

So, do you actually want to write?

2024-06-19T00:00:00+00:00

There’s this collection of tales about people who use static site generators. The story goes something like this: someone decides they want to be a blogger. So they start researching static site generators, they go down the list, ask some friends, start comparing generators. They learn, wow, there’s a generator in their favorite programming language, and look at all the cool things they can make it do!

A few weeks later they’ve sunk 10 hours into their website project, hit the button to deploy, and gaze upon their inaugural post:

My new blog!

I just finished deploying my blog with Vanbi, a cool new static site generator that uses meknau technology and ropjar to florp pages fast with extensible plugins that allow for custom sisti functionality.

A year later, they post their second post on their website:

Blog update

I got tired of Vanbi. Now I use temtcu, an experimental platform that fixes a lot of the ergonomics of doing web design with meknau technology. Looking forward to writing some posts soon!

That post was made in 2022, and it’s 2024 now.

This is a true story, it happens quite a bit. Perhaps this is you.

Usually when this story is told, it’s with some quippy message about how people are spending all their energy building their site, and if they had just used some existing blog hosting platform they’d have been writing this whole time instead. I’m not here to repeat that punchline.

Here’s what I want to ask instead: If this is you, are you sure you actually want to write? Really, I mean that question.

i am not a game designer

When I got into programming I was really enamored with the idea of being a game developer. I liked video games, so I thought I would like making a video game. What actually happened is that I made something like 20 different toy game engines and graphics libraries, and then a small handful games. And what I realized is that most of the time, I wasn’t actually interested in making a game.

I liked writing the bones of a game, the technology that could in theory be used to make a game. I liked the process of implementing a game that was already a known quantity. I reimplemented osu!mania in Dart and then later in Typescript, ported Prelude of the Chambered’s graphics engine to WebGL, wrote a Bejeweled clone for the calculator, as well as a 2048 clone and a few other things. I only made one original game.

Writing a game engine did not sap me of the energy needed to design a game. I did it precisely to avoid needing to design the game, as a delaying factor. And because I thought it was a really enjoyable process in and of itself. When I ran out of things I was interested in putting in my engine, that’s when I abandoned the project, because that’s the point when I’d need a game to go with it to give it any direction.

i am a musician

When I got into making music, I was overwhelmed. I knew that I liked making music on some level, but you wouldn’t think it to talk to me about my projects. I hated almost everything I made. None of it sounded like what I wanted. I would spend hours working on a loop, and then quit FL Studio, pissed off and disgruntled, only to be back at it the next day for reasons I could not tell you.

Part of how I tried to cope with this was getting into writing my own synthesizer software. I thought, if I could understand the math behind sound, maybe I could figure out my problems, understand how to make things I liked.

Now I had two problems. It turns out that Digital Signal Processing is one of the hardest fields to break into if you don’t have either an academic background, or someone to point you at the right resources for it. So I gave up on writing my own synthesizers for awhile.

I did not consider myself a “music producer” for most of that time period. Observably, I was. I have hundreds of little ideas and sketches from that time, some of which I fucking love these days. I wanted desperately to make music, and so I tried to do that, and kept trying to do that even when I didn’t live up to my standards.

Now I make music I like, and I write DSP code, and those things are usually not as connected as I trick myself into thinking.

i am a writer

I keep fucking writing. It keeps happening. I was basically blogging on forums in bbcode before someone convinced me that my writing was worth putting on a website of its own. I have so many posts here, on other websites, in my personal notes; it’s something I’ll do regardless of which medium.

But you know what else I did? I wrote my own static site generator.

The deep lore of this website is that it was originally made with the Ghost blogging platform, which I will not link to because however many years on it’s a very different entity than it was at the time, and not one I like. But I thought writing my own blog would be fun, so I wrote bashyll, a weird stab at a static site generator created by someone who had never even used another static site generator in her life.

And that carried me for a bit. Then in 2019 I switched to Jekyll. And I’m still on Jekyll now. Not even an up to date jekyll; I’m using some version that uses ruby 2 and keeps telling me how my entire stack is End of Life or some shit.

It doesn’t matter.

I’ve actually mostly rewritten my website in a new generator I call site, which you will never see the source code for. One of these days I’ll feel like finishing that up and switch over to it, and then I won’t touch it for another 5 years. I’ve realized that I don’t really care about static site generators. I like the consequences of using a static site generator, enough to write my own to fit my particularities, but it’s not my main interest. I like writing.

so what

Do you actually want to write?

Ok. Then write. In a text file on your computer, in a pastebin, on a blogging platform, on neocities, on cohost or mastodon, on a copy-pasted template from github pages, in /var/www, on some forum that nobody’s looked at in 15 years. It really does not matter. Go write.

Are you configuring a static site generator to put off writing? Then it’s time to seek the depths of your soul. I ask again, do you actually want to write?. If you do, but it’s hard- I know that struggle. But I promise you that it won’t get any easier just because you switched blog technologies again. Your problems are somewhere else.

But, perhaps you don’t really want to write. Do you like tinkering with site generator tools? Then fucking tinker to your heart’s content. You don’t need to have an end goal where you actually put some thinkpiece or technical writeup on the website. Building it can be its own reward. Play with the styles, create your minimalist heaven, or your maximalist Y2K masterpiece.

Not that this is a dichotomy or anything, you can write and do web design. But why are we doing these things? You truly do not need to have anything to say to make a website. That is one of the best parts. Are you trying to write because you’ve been convinced that it’s a necessary part of having a website? I’m here to tell you: you are free.

If you love it, let it consume you. If you hate every moment you spend with it, why are you trying to do it? Real winners quit.

Save Scarecrow, a Massive Physical Media Collection

2024-06-18T00:00:00+00:00

https://scarecrowvideo.org/sos. This is the reason for this post. Scarecrow Video is one of the few video rental places still operating. You can go in, or rent by mail, something from a collection of 147,000 titles. Like. I cannot understate how big this is. No streaming service has this many. All of them combined don’t have this many. You probably cannot pirate many of these, because they’re too niche to even be seeded, if there was a digital copy in the first place. It’s so cool. But also predictably they are in financial trouble.

So Scarecrow obviously can’t operate as a usual business, right? Like, it’s fun to go in and rent a thing and have an experience that’s like nostalgic of the video rental stores that were common in the past (even though those usually did not have this big a selection), but that doesn’t keep the lights on. They’ve relied on donations for awhile to keep things going, but it seems like they’ve had some of their existing funding providers dry up.

They’re asking for $1.8 million before then end of the year. They claim in the linked post that they need it to:

“Stay in our current location for as long as possible,

Provide our existing staff with a living wage,

Hire the permanent leadership we will need to break out of this cycle of scrambling just to keep our heads above water, and

Provide the working capital we would need to allow our new team time to stabilize our organization.”

I believe it. I’ve got a personal stake in it because I go there some time, it’s one of my favorite things to do. It’s a good place, I think.

They say:

“Our situation is urgent and the stakes have never been higher. If we are unable to raise this money, our ability to keep our doors open will be jeopardized, and we will be forced to move out of our space and go into “hibernation” while we plot a new future for Scarecrow. Keeping one of the world’s largest publicly-available video collections intact and accessible is our utmost priority, and though there are still some uncertainties on our path forward, we are not going away.”

I believe they’ll try if it comes to that, but I think it’s good for them to be able to persist as they’re own thing. Pure speculation on my part, but I could imagine them trying to find some other archival group to adopt their collection, and maybe it’d work out, but ehhh maybe it wouldn’t.

“Comparing our holdings to institutions like The Library of Congress and the Paley Center as well as the WorldCat database reveals that we hold thousands of rare and out-of-print titles. Of our 100 rarest titles:

44 may be the only publicly accessible copies in existence;

33 are held by 5 or fewer public collections; and

88 aren’t even held by the Library of Congress.”

Seriously, it is so cool being able to go just rent like, any of these.

Anyways, $1.8 million is a pretty big ask. That’s not the kind of thing I can make a dent in personally. I’m sure that every bit helps, so if you’re interested in sending them money personally please do, but that’s not why I’m making this post.

This is an amount that really could stand for an organization or suspiciously wealthy individual to provide. And so, if you happen to know one of those who would be interested in sending them money, please forward the message on. :)

https://scarecrowvideo.org/sos

There are still quite a few 32-bit x86 Linux distributions

2024-04-12T00:00:00+00:00

The options for 32-bit x86 distributions have been declining a little bit. Some distros have dropped 32-bit support, but quite a few still have it. What remains? And which CPUs do they actually support?

x86 Micro-architectures

“32-bit support” can mean a lot of different things to different people. There were a lot of instruction set additions over the span of development of the 32-bit x86 architecture.

i386 was where everything started, but the Linux kernel has not supported i386 for a over a decade. The kernel currently supports i486, but we may see that bump up to i586 soon. i586 starts with the first Pentiums (the “5” in “586”), and it’s the earliest architecture I’m going to consider in this post.

People usually agree on the meaning of i586, but i686 tends to have a bit more variance. Some people say i686 to mean the P6 architecture without SSE/MMX. Some people mean it to include SSE/MMX. Some projects throw SSE2 in there also, like rust’s i686 target.

And then you run into situations where people say i386 but mean i686 or some other combination of instructions. For example, Go’s i386 GOARCH is actually i686+MMX, because they use MMX for their atomics, so you can’t use Go without MMX.

The situation is a bit funky.

With that out of the way, I’ll talk about some linux options. I’m not going to go down the list of distrowatch and find everything that supports x86; you’re free to do that on your own. I’ll mention things that I know about to give a general feel for the landscape as I see it.

Alright then. In no particular order, some distributions:

Alpine - i686 with CMOV and SSE
Debian - i686 + MMX/SSE
openSUSE - i586??? (based on boot medium name)
Void - i686
Adélie - MMX
Gentoo - i486+
Slackware - i686? Maybe supports older than that though; it’s slackware.

Feel free to let me know about corrections/refinements for the micro-architecture support. I’m doing my best, but there’s only so deep I can research it.

After listing all those I got tired of researching distributions. If there are more than I have the energy to write about in one sitting, I think that’s pretty good.

Alpine

Alpine is know for being a pretty tiny distribution out of the box, but it’s got a fair number of packages too. You can use this as your desktop system if you want to, but it particularly shines for server use. Alpine can be installed in a traditional manner, or in a setup where the system is loaded fully into RAM, only committing changes back to disk when manually told to do so.

This second install method is nice on systems with slow IO, or systems that are prone to suddenly losing power (as you probably won’t get the filesystem in a weird state if you aren’t writing to it). And because alpine splits docs & dev headers out from the main package, you can do this without taking away too much memory from your system to use for other things.

Alpine supports x86. My favorite use of this support is in iSH, a 32-bit x86 emulator and linux kernel emulator which out of the box gives you access to an alpine userspace on your phone or iPad.

~~I’m not sure which x86 feature level it supports. If I had to guess, I’d guess i686, but please let me know if you happen to know.~~ EDIT: thanks lucidiot for directing me to the Alpine requirements page, which has a very clear answer to this actually! You need i686, with CMOV and SSE.

Debian & derivatives.

As of bookworm the baseline is i686 without MMX or SSE. Though any Go programs will violate this I believe- and I’m not sure if they use rust’s i586 target or what.

In December of 2023, Debian announced they’re sunsetting it too. No timeframe yet, but at some point “in the near future”, you’ll stop being able to install i386 debian. My bet is that Bookworm will be the last official i386 debian release we see.

Debian is not immediately discontinuing the i386 package set, but we will probably see maintainers begin dropping i386 support for their packages.

The announcement mentions that they may see changes to the architecture baseline, meaning the instructions that packages are allowed to include in them. So debian’s i386 may expand to officially include SSE2 or higher.

3rd-party debian derivatives may continue running. A number of downstream distributions such as antiX linux/MX linux already build their own kernels and provide their own package sets in addition to the core debian packages. Maybe projects like those will continue building on debian, filling in maintainership gaps in the package set as necessary. Or maybe not, we will see.

openSUSE

I really don’t know much about openSUSE. I found some talk from December 2022 that openSUSE was looking for help to keep 32-bit x86 going. But, openSUSE tumbleweed still has i586 boot media available so I guess they’ve still got it!

Void

Void Linux is one of those distributions that starts you with a barebones setup and lets you build up from there. A bit like Arch or Alpine in that way.

Void linux supports “i686”. We can learn what this means from common/build-profiles/i686.sh in the void-packages repo.

XBPS_TARGET_CFLAGS="-mtune=i686"
XBPS_TARGET_CXXFLAGS="$XBPS_TARGET_CFLAGS"
XBPS_TARGET_FFLAGS="$XBPS_TARGET_CFLAGS"
XBPS_TRIPLET="i686-pc-linux-gnu"
XBPS_RUST_TARGET="i686-unknown-linux-gnu"
XBPS_ZIG_TARGET="i386-linux-gnu"
XBPS_ZIG_CPU="_i686+sse2"

So, generally i686, but SSE2 for zig, and I believe that rust target also includes SSE2. I think that by changing these environment variables, you could also roll your own void linux that targets other 32-bit x86 variants, though you would of course have to build all the packages yourself.

And void makes building your own packages pretty easy to do, even from a non-void system, which is nice. While working on this post I tried it out, and got void building i686 packages from my gentoo system, with custom CFLAGS to optimize for my atom chip.

Adélie

Adélie is an independent distro, notable for supporting 32-bit x86, and both 32 and 64-bit PowerPC, the latter of which I think is even more niche than 32-bit x86. Adélie uses the APK package manager, like alpine does. But they aren’t a downstream of alpine’s packages; they do their own packaging.

Per the documentation, they target chips with at least MMX.

I don’t know much about what using this distro is like in practice. They have both desktop and server flavors though, so if you want to get a desktop up and running easily it might be a good option.

Gentoo

Gentoo! Gentoo still has x86 support, and since it’s a source based distribution it’s up to you what CPU architecture you want to build software for. Gentoo has installation media that supports i486 and i686, so you really can run this anywhere the kernel will run.

But, you’ll also have to compile nearly everything yourself, one way or another. On a 32-bit desktop, if you’ve got patience, that might be ok. It’s a bit harder for laptops, unless you’re fine leaving it for maybe days at a time when you need to build a browser.

Gentoo does support building your own binary packages on another machine though, so if you’ve got a another box that can do your compiling, that can help.

Gentoo also has official binaries now, a fairly recent development. For 32-bit x86 it’s just the @system set, meaning everything included in a fresh installation. That’s not much, but it does get you gcc, so if you use the official binaries you won’t need to build your entire toolchain from source on updates.

This nice thing about gentoo is even if they drop official support for x86 on a package, it’s extremely easy to tell your system to try and install it anyway, with a one-line addition of the package name to your /etc/portage/package.accept_keywords. And a lot of times it’ll work! This is one of the nicest things about using gentoo on a niche architecture.

Slackware

One of the oldest distributions still kicking, and it’s still got 32-bit support. The Slackware 15 release notes includes a kernel intended for chips “older than a Pentium III”, so it’s probably built without SSE where possible, though I’m not sure exactly which micro-architecture they’re targeting.

Slackware includes a decent chunk of software, but a more minimal set than something like Debian or Void. So you won’t need to build everything from source on it, but you’ll need to build some things.

There are also a number of slackware derivatives, but I’m not familiar with the slackware world.

Trying out NetBSD on our Vaio

2024-04-10T00:00:00+00:00

NetBSD 10.0 is out!! This is very exciting; they’ve poured a ton of work into making that happen. I figured in celebration I needed to install NetBSD on something around the house, and the vaio vgn-p (prev featured in Life at 800MHz) seemed fun to try it on.

So I put an installation stick in, but bad luck. Got an error where it timed out waiting for EHCI (the USB controller) to reset. I turned on some extra debug statements and added a few of my own, and learned that all the EHCI hardware registers were returning garbage data. Strange. Same problem on NetBSD 9, 8, 7- though 7.2 actually catches the problem earlier, saying “can’t map memory space”. Best I can tell, it’s just not mapping the EHCI registers properly.

Anyways the problem seems to be ACPI related. Disabling ACPI makes the problem go away, and USB works fine- though at the cost of losing all the ACPI-provided features. That’s not too big of a deal on this device, especially since I never use sleep mode or anything.

# in bootloader prompt after pressing "3"
userconf disable acpi
boot

# cat /boot.cfg
menu=Boot normally=rndseed /var/db/entropy-file:userconf disable acpi:boot
menu=Boot single user:rndseed /var/db/entropy-file:boot -s
menu=Drop to boot prompt:prompt

This isn’t super surprising. There is a long and extensive history of weird or buggy ACPI implementations, especially in laptops, and it’s basically whack-a-mole for OS devs to deal with all the quirks. Now I get to learn how to do that sort of debugging too. Some folks in chat have told me to cross reference the PCI descriptors with the ACPI tables, so once I go learn how to check the ACPI tables I guess that’s what I’ll be doing.

Nearly everything else seems to work after working around this problem though. I’ve got ctwm running in X. Firefox runs, though as sluggishly as I’ve come to expect on this hardware (which is why I love netsurf). I’ve got wifi and ethernet too without any problem! And the keyboard and trackpoint work. It’s a computer, wouldn’t you know it, and it runs about as well as you’d expect.

Unfortunately the external VGA output doesn’t seem to be doing anything. That one, I haven’t figured out yet where I’d start to try and fix. I also can’t seem to set custom video modes on the internal display the way I can from linux, and these problems might be related.

I mentioned the last time I wrote about this laptop that hardware-accelerated graphics on this thing requires blobbed drivers that I’ve never been able to get working. But what I didn’t mention is that, despite that, there is still a specific open source driver in the linux kernel (gma500) for this hardware. It doesn’t give you GPU-accelerated OpenGL, but it does handle things like setting up graphics modes, mapping a framebuffer into memory, and probably setting up the external VGA port as well.

I don’t know yet how pixels are even getting onto the display on this thing in NetBSD, so I have some more learning about the graphics stack to do. Maybe I can get this stuff working if I sit down with it, maybe not. Beyond technical, I think there are licensing problems if I just ported the gma500 driver from Linux, because it’s GPL2. Maybe I could ask Intel to dual-license it? They already license i915 as MIT, which is how NetBSD can support modern intel chipsets, so I guess maybe there’d be a chance. Of course nothing stops me from trying to port it anyway, but without dealing with the licensing I couldn’t share it, and that’d be a bit disappointing.

This kernel hacking stuff though, this is to me one of the appeals of NetBSD. I consider myself kind of an aspirational NetBSD user right now, because I don’t use it regularly on anything. But I keep coming back to it now and then because every time I interact with it I have such a great time reading the documentation, looking through the source code, using the compilation tools (with AMAZING cross-compilation support), and so on. It’s all very comprehensible. NetBSD really feels like it’s inviting me to work on it, in a way that a lot of OSes don’t to me. And I’ve been hoping for a problem the right size that I can do that with what energy I actually have for this sort of thing.

I think the average person probably has much better luck installing NetBSD than me though. If you install it on a normal computer, it will usually just work. Indeed that’s been my experience every time I’ve installed it on a desktop computer, or one of my thinkpads. At worst, maybe you have too new a GPU, and it’s not supported by the current version of the graphics stack. Unfortunately, I’m often cursed with weird hardware that makes me feel lucky that even linux works on it (and sometimes only works with vendor kernels, ugh).

Simple Executable Love2D Files, or, You Can Shove Random Data At The Start of a Zip File and it’s Basically Fine

2024-02-29T00:00:00+00:00

LÖVE (which I will write as love because my keyboard doesn’t have an Ö) is a neat program that’s mostly intended for writing games with lua. We’ve been using it to write an image viewer. There’s a lot of ways to package a love project up for distribution, and some of them ship a copy of love with the project and some don’t. Since my distribution provides the version of love I need, I can create a .love file with all my source code and assets in it, and then I can run it with love path/to/myprogram.love. A .love file is just a .zip with a different file extension, so that’s pretty easy to do.

I didn’t want to have to specifically execute my program by typing out love path/to/myprogram.love though. I wanted to be able to throw it into a directory on our PATH so I can just run like myprogram path/to/image.png and have it execute.

I could do this pretty easily by putting my .love file somewhere on disk, and then putting a shell script on my PATH that executes that file with love… but, what if the zip file and the script were actually the same file? I didn’t actually know much about the zip format yet, but I had to try. So I gave it a shot with one of our old Kaleidoscope generator programs:

cd kaleidoscope

zip -r ../kaleidoscope.zip .
  adding: main.lua (deflated 70%)

cd ../

echo '#!/usr/bin/env love' > kaleidoscope.love

cat kaleidoscope.zip >> kaleidoscope.love

chmod +x kaleidoscope.love

./kaleidoscope.love

Huh. There it is. Well alright, let’s go a bit further. We use wayland a fair bit these days, and right now this love program is running under Xwayland. I happen to know that love uses SDL under the hood. The version on my system doesn’t enable their native wayland backend by default yet, but it seems to work fine for me, so I figured I’d set the environment variable to turn it on.

Unfortunately, it seems like you cannot actually set environment variables with a #!/usr/bin/env shebang. The program just hangs forever. This isn’t related to love or the zip file stuff at all, it always happens. But, if we could shove a shebang at the front of the zip, why not a whole shell script?

echo '#!/bin/sh
if [ -n "$WAYLAND_DISPLAY" ]; then
  export SDL_VIDEODRIVER=wayland
fi
exec love "$0" "$@"
' > kaleidoscope.love

cat kaleidoscope.zip >> kaleidoscope.love

./kaleidoscope.love

Cool! I could stop here, it clearly works. But, I decided to learn a bit more about zips, because I wanted to know: is this still a valid zip file? And if not, how can I make it one?

Let’s just try to unzip it somewhere:

mkdir /tmp/whatever
cp kaleidoscope.love /tmp/whatever
cd /tmp/whatever
unzip kaleidoscope.love
Archive:  kaleidoscope.love
warning [kaleidoscope.love]:  102 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  inflating: main.lua    

Interesting, so we are violating the spec, but the unzipper libraries are just able to figure it out anyway.

We looked into it a bit further and it turns out that the main thing making this work at all is that the zip file directory is stored at the end of the file, not the start. So it’s easy for software to see that it is in fact a zip file. The thing is, the directory specifies the locations of files relative to the start of the file. So we’ve shoved 102 bytes at the start of the zip file, and now all the offsets are 102-bytes away from where they should be. This is detectable, clearly, but not ideal. But, this is the only problem, actually. If we rewrote all the offsets, adding 102 to each of them, then our zip file would be completely 100% valid!

Rather than write a program to do that, I instead wrote a script that generates zip files from scratch, writing the offsets correctly as it goes. I didn’t bother actually making it compress anything, since I don’t really care about that right now. But, if you’re curious, here it is! Use at your own risk.

You need zlib installed (I use it for crc32 despite the lack of compression), though you almost certainly already do. You need cffi-lua to load it. You need luaposix. And you need lua5.3 for string.pack().

#!/usr/bin/env lua

-- Change this to whatever you want to put at the front of the zip
local love_file_loader = [[#!/bin/sh
# This is a love2d zip file! You can extract it with any unzipper tool to see
# the source code.
if [ -n "$WAYLAND_DISPLAY" ]; then
    export SDL_VIDEODRIVER=wayland
fi
exec love "$0" "$@"
]]


local input_dir, output_love = ...

if input_dir == nil then
    print([[
Usage: ' .. arg[0] .. '  [output.love]

Basically this zips up the input_dir and creates a file with

    #!/usr/bin/env love

and then the zip file appended. which works, somehow! If you don't say what
output.love to use it will add `.love` to the project input_dir path and use
that.
]])
    os.exit(1)
end

-- default .love extension. trims trailing slashes first
output_love = output_love or (input_dir:match('^(.-)/*$') .. '.love')


--[[
I feel like zip files are kinda frustrating to deal with on linux. Rather than
try to wrange various command line interfaces, let's do it ourselves.

I considered using libzip but I don't really like its interface. You have to
give it a file path and let *it* open the file if you want to write data. Meh.
It won't work for what we're trying to do.

But we probably shouldn't try to do DEFLATE in lua right now, so let's just
write uncompressed for now. After all, we're just putting lua files and pngs in
a box.

Despite this, we still pull in libz for now, because it has a crc32 function
and we need crc32. Maybe later we can add compression with it too.

https://www.zlib.net/manual.html
]]

local cffi = require('cffi')
local libz = cffi.load('z')
cffi.cdef([[
extern unsigned long crc32(
    unsigned long crc,
    const unsigned char *buf,
    unsigned int len
);
]])

-- We'll use luaposix to traverse the directory and pack files in
local posix = require('posix')
local posix_stat = require('posix.sys.stat')


--[[
wrapper around libz crc32, needed by zip creation. A crc is just a 32-bit
number, so we take that number in and return a new one rather than updating an
object.

https://github.com/q66/cffi-lua/blob/master/docs/introduction.md#caching
]]
local libz_crc32 = libz.crc32
local libz_buf = cffi.typeof('const unsigned char*')
local function crc32_new()
    return libz_crc32(0, nil, 0)
end

local function crc32_update(crc, bytes)
    local buf = cffi.cast(libz_buf, bytes)
    return libz_crc32(crc, buf, #bytes)
end

local function crc32_finalize(crc)
    return cffi.tonumber(crc)
end

--[[
zip file creation!

https://en.wikipedia.org/wiki/Zip_(file_format)

We take a base directory path, a list of file paths relative to the base dir, an
output file handle, and a flag for whether to include the directory name in the
output. That is should create_zip('blah', ...) create files 'blah/whatever/' or
just 'whatever/'

The main reason for this is so that all the offsets in the zip file are relative
to the start of the zip with stuff shoved in front of it (such as a shebang).

The structure of a zip file is

- List of files, each with
    - Local header
    - data
- Central directory of file entries
- End Of Central Directory


These data structure descriptions are copy-pasted from the wikipedia article.

=== Local Header ===
0        4     Local file header signature = 0x04034b50 (PK♥♦ or "PK\3\4") (lil end
4        2     Version needed to extract (minimum)
6        2     General purpose bit flag
8        2     Compression method; e.g. none = 0, DEFLATE = 8 (or "\0x08\0x00")
10        2     File last modification time
12        2     File last modification date
14        4     CRC-32 of uncompressed data
18        4     Compressed size (or 0xffffffff for ZIP64)
22        4     Uncompressed size (or 0xffffffff for ZIP64)
26        2     File name length (n)
28        2     Extra field length (m)
30        n     File name
30+n    m     Extra field 

Note that to fill in the Compressed size without pre-compressing a file in RAM,
we can write the header, write the file data, then seek backwards.

=== Central Directory Entry ===
0     4     Central directory file header signature = 0x02014b50 (little endian)
4     2     Version made by
6     2     Version needed to extract (minimum)
8     2     General purpose bit flag
10     2     Compression method
12     2     File last modification time
14     2     File last modification date
16     4     CRC-32 of uncompressed data
20     4     Compressed size (or 0xffffffff for ZIP64)
24     4     Uncompressed size (or 0xffffffff for ZIP64)
28     2     File name length (n)
30     2     Extra field length (m)
32     2     File comment length (k)
34     2     Disk number where file starts (or 0xffff for ZIP64)
36     2     Internal file attributes
38     4     External file attributes
42     4     Relative offset of local file header (or 0xffffffff for ZIP64). This is the number of bytes between the start of the first disk on which the file occurs, and the start of the local file header. This allows software reading the central directory to locate the position of the file inside the ZIP file.
46     n     File name
46+n     m     Extra field
46+n+m     k     File comment



=== End of central directory ===

0     4     End of central directory signature = 0x06054b50
4     2     Number of this disk (or 0xffff for ZIP64)
6     2     Disk where central directory starts (or 0xffff for ZIP64)
8     2     Number of central directory records on this disk (or 0xffff for ZIP64)
10     2     Total number of central directory records (or 0xffff for ZIP64)
12     4     Size of central directory (bytes) (or 0xffffffff for ZIP64)
16     4     Offset of start of central directory, relative to start of archive (or 0xffffffff for ZIP64)
20     2     Comment length (n)
22     n     Comment 
]]
local function create_zip(basedir, files, outf)
    local filemeta = {}

    for _, name in ipairs(files) do
        local path = basedir .. '/' .. name
        local meta = {
            offset = outf:seek()
        }
        filemeta[name] = meta

        -- stat file
        local stat = posix_stat.stat(path)

        -- We can only handle regular files
        assert(posix_stat.S_ISREG(stat.st_mode) ~= 0)

        -- Save metadata
        meta.compression_method = '\0\0' -- none compression left beef

        --[[
        write the local header. To start off with,
        - checksum is 0
        - file sizes are 0

        We will seek back and fill those in later.
        ]]
        local header = string.format(
            'PK\x03\x04\0\0\0\0%s\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0%s\0\0%s',
            meta.compression_method,
            string.pack('



we’ve added light mode
2024-02-27T00:00:00+00:00
This has been an addition we’ve wanted for a long while. For too many reasons to consider listing, some people find dark websites difficult to read. Or they’ve just got preferences. The Reader Mode in many browsers can help do an auto-transformation, but not everyone knows about that, and it doesn’t always do it right. So, now there’s a light theme built into the site now. If you’re a light-mode enjoyer, I hope it helps! The rest of this post will be some details about how we implemented it.

While this was a simple addition, it might be more complex than you’d think.

why we didn’t use prefers-color-scheme

First, let’s discuss the unfortunate case of the CSS prefers-color-scheme media query.

In theory, this CSS selector lets a web developer make a site work in either light or dark mode with CSS alone, no javascript needed. Let’s review the values it can take:


  
    dark - Indicates that user has notified that they prefer an interface that has a dark theme.
    light - Indicates that user has notified that they prefer an interface that has a light theme, or has not expressed an active preference.
  


There is no way to differentiate between whether a user specifically wants light theme or has no particular preference. Light is, implicitly, the default per the way this works right now. This is ok if you want your site to default to light mode. But our site is, first and foremost, for ourselves. We use some browsers that don’t support prefers-color-scheme, we use OS environments where setting it up is difficult; in general we are in a lot of situations where we want to browse our site somewhere without configuration, and we want our site to look the way we prefer when we do that.

So, given that dark is the default we want, and there’s no way to differentiate between “default” and “light”, we just can’t use this feature to do anything useful.

JavaScript + localStorage

Whether or not we could use prefers-color-scheme, we’d still want an interactive override as an option. It just turned out to be the entirety of the feature in this case. We’ve got a bit of javascript at /colorscheme.js which makes it work. It’s based on the theme switcher on iliana.fyi, but tuned to our own sensibilities. I’ll reproduce the block comment from the top of the file here:


  This color theme switcher is based on iliana’s switcher:

  https://github.com/iliana/iliana.fyi/blob/main/src/theme.jsx

  We do things a bit differently. In an ideal world, we would follows
prefers-colorscheme, let users set that to set the theme to light/dark, and then
use javascript as an optional override. But, for Reasons, browsers do not
communicate that a reader explicitly prefers a light theme. There is either
“reader wants a dark theme” or nothing.

  iliana takes the philosophy of presenting a light theme by default as a
consequence of this. But I tend to use browsers that don’t let you specify
prefers-colorscheme, and also don’t support javascript. I want the website to
look the way I want by default, since this is my personal site.

  In our CSS file, we set up some CSS variables and initialize them with a :root{}
block. We also define a :root.light{} block to turn on light mode. Changing
colors then is performed by the presence or absence of the “light” class in the
documentElement class list.

  This javascript creates a button element for choosing the right theme. We
generate the HTML in here so that the reader doesn’t see an option to change
themes if they don’t have javascript- we wouldn’t want to make promises we can’t
keep.

  We load their preference out of localStorage, if it’s there. Then, whenever they
change their setting with the button, we reflect that change and save it.

  To avoid the dreaded “flash of unstyled content”, it’s important that this JS is
run after the document exists, but before it renders. We can do this by
including the script with “defer”, like this:

    
  


And, here’s the JS in its entirety:

let color = 'dark';

try {
    color = window.localStorage.color;
} catch {
    // nothing
}

if (color === 'light') {
    document.documentElement.classList.add('light')
}

/*
theming happens in the css
*/
const btn = document.createElement('button');
btn.innerText = 'Light/Dark';
btn.id = 'themeSwitcher';

/*
We need to listen for when the reader changes color scheme, and we do that here.

Update the actual color, and save it in localStorage
*/
btn.addEventListener("click", () => {
    // Toggle the theme
    if (color === 'light') {
        color = 'dark';
        document.documentElement.classList.remove('light')
    } else {
        color = 'light';
        document.documentElement.classList.add('light')
    }

    try {
        window.localStorage.color = color;
    } catch {
        // nothing
    }
});

document.body.firstElementChild.prepend(btn);


The nice thing about localStorage is it’s stored entirely client side, so I don’t need to track cookies or anything. It’s also persistent across the entire site. It can be cleared behind my back if a reader’s browser decides it needs to free up spaces, but I’ll be low on a browser’s priority for that since I’m only storing a single value.

CSS

This JS pairs with a few chunks of CSS:

:root {
  --foreground:         #fbf5ef;
  --foreground-accent:  #f2d3ab;
  --background:         #272744;
  --background-accent:  #494d7e;
  --background-code:    #494d7e;

  /* derived by palemoon from background-accent */
  --button-border-bright:  #aeb0c6;
  --button-border-dark:    #313354;
  --border-code: none;
}

:root.light {
  --foreground:        #121223;
  --foreground-accent: #15172b;
  --background:        #fbf5ef;
  --background-accent: #f2d3ab;
  --background-code:   #fefbec;

  /* derived by palemoon from background-accent */
  --button-border-bright: #fbf1e5;
  --button-border-dark:   #86755f;
  --border-code: 1px solid var(--foreground-accent);
}


By default, the CSS variables in :root will take affect, but in light mode the variables in :root.light will be set instead. The rest of the CSS is defined in terms of these variables.

The weird button-border colors are necessary because I wanted the button to have the old-style appearance of border-style: outset and border-style: inset. Firefox today gives inset and outset a much more subdued appearance, but palemoon still has the old style I was looking for. So i just took a screenshot of that, color-picked the colors out of it, and set the 4 button border colors manually to make it look right everywhere else.

I also couldn’t really justify using a darker color to differentiate code blocks from prose when part of the point of light mode was also to be a higher contrast reading option, so I went for a slightly-different color that used to be the background of this site back in 2017 or so, as a fun little reference for ourselves. It wasn’t really distinct enough for me though, so I added a border around code blocks too.

Button positioning is a little hacky, but hey it works:

/*
On wide monitors, just put it in the top-right corner. On thin monitors,
we float: right so that it reflows the nav bar text.

Our body max-width is 750px, so we add a bit onto that for margin and then
call it good
*/

@media (min-width: 908px) {
    #themeSwitcher {
        position: absolute;
        top: 1em;
        right: 1em;
    }
}

#themeSwitcher {
    float: right;
    padding: 4px;
    margin-left: 4px;
}


Maintaining support for Netsurf

We use the netsurf browser sometimes, which doesn’t currently support CSS variables. We still wanted our theme to work there though. To do this, we define all color properties twice- first with the default theme, and second with the CSS variable:

hr {
    color: $background-accent;
    color: var(--background-accent);
    background-color: $background-accent;
    background-color: var(--background-accent);
}

a {
    color: $foreground-accent;
    color: var(--foreground-accent);
}


Netsurf will see the first non-variable definition, use that, and ignore the definition with var(). More featureful browsers will overwrite the static definition with the var() definition since it comes second.

Because of this, we’re still using a CSS pre-processor, and those dollar-signs are variables that get replaced with the correct hex codes at site generation. We’re doing a rewrite of our site with a new custom site generator actually, so we might make that automatically do these double-definitions for us.

Our light theme changer won’t work on netsurf, but eh. At that level of tech, opening our website with the graphical version of the links browser, or the terminal version with a light terminal theme, would also do the trick just fine.


If you can’t find it, it needs to be written
2024-01-17T00:00:00+00:00
In the world of the digital, many of us have been tricked into thinking that something only needs to be said once. If someone has stated something, then stating it again is noise. Provides no purpose. Does not benefit anyone. The extension of this follows: I should not write something, on the off-chance that someone else whom I don’t even know about has already written it, perhaps better. The fear of being more incomplete than an imagined other-expert. But an utter void of information is more incomplete than your works will be, and there is value in saying what has been said.

Collecting a lot of pieces of information into a cohesive source takes a lot of labor, and a lot of lived experience. What have I experienced in life? It’s different from others, and that perspective informs what I think is worth writing about. For an instructional piece, it informs details I include, because they were confusing or surprising to me, and which details I omit because they seem so obvious as to not even be worth mentioning. The time in which I write influences these things too. How many tutorials or guides have we seen in the world that link to a number of dead links as suggested sources for materials or further research? How many are subtly wrong about something, in a way the author never noticed? I will write a different guide on the same topic than someone else will, and that is valuable to the reader who now has two sources to compare and cross-reference instead of one.

Additionally, if I cannot find another author collecting the information I want to share all in one place, then that collection of information does not exist in my world. It may exist in someone else’s- someone else may have that collection, may even know of a place where that collection has been published. But if I cannot find it, there’s a good chance others in my social circle can’t either because of the way social bubbles work. And so, in collecting and reproducing that information myself, I’m sharing it with others that wouldn’t have access otherwise.

And counterintuitively, sharing incomplete information is also one of the most effective ways of getting others to share additional tidbits in addendum, as email replies, as comments on a website, and so on. Often, it is far more effective than simply asking a question. It’s an oft-repeated joke that the best way to get the right answer to a question is to provide the wrong answer. I don’t advocate for intentional misinformation, but there’s a nugget of wisdom: people notice small information gaps much more readily than they notice vast information voids, and it’s easier to fill in a small gap when the rest of the puzzle has already been written. By publishing information in a visible place, I entice others to join in, and I can update my document to cite and reflect what I learn through them.

Knowledge itself has to be actively maintained, or it decays, even in the digital world that promises that knowledge will live forever. By repeating what has been said, we perpetuate it forward. By experimenting with what has been said and re-performing research, we validate and verify and innovate to try to make what we’re perpetuating forward more valuable than what came before. By citing what came before, we leave a trail of clues and evidence for others to retread the same ground, and reinforce it.

And don’t forget to archive your sources and your works. If you don’t control and maintain your archives, they aren’t yours, and they will evaporate long before you do.


Installing nixGL on aarch64
2023-12-19T00:00:00+00:00
nixGL is a tool that’s useful for people who use Nix on distributions that aren’t NixOS. It lets you run graphical applications from nixpkgs with hardware acceleration; without it everything falls back to software rendering. Here’s how to install it on aarch64 (arm64).










Update! - 2024-04-10

I was curious and took a peek- seems like this has been fixed upstream! I have not tested it myself, but folks in the replies say it works :D. I’ll try it out when I have a graphical aarch64 system up and running again, but my computer situation is a bit in flux at the moment so it might be a bit.

Even without my own testing, I think it’s highly likely that you can safely ignore the rest of this post and just follow the standard nixGL installation instructions. I’ll leave the rest of the post up in case someone needs it for something. Many thanks to the folks who did the work to fix it.










nixGL provides the nixGLMesa and nixVulkanMesa packages. The second one is only useful if your system can handle vulkan. The recommended way to install these packages is with their channel, and that’s what I’m going to demonstrate. Adapt this to flakes as necessary.

First, add the channel:

nix-channel --add https://github.com/guibou/nixGL/archive/main.tar.gz nixgl && nix-channel --update


what not to do

If you just run nix-env -iA nixgl.nixGLMesa, you will get this error:

nix-env -iA nixgl.nixGLMesa
installing 'nixGLMesa'
error:
       … while calling the 'derivationStrict' builtin

         at /builtin/derivation.nix:9:12: (source not available)

       … while evaluating derivation 'nixGLMesa'
         whose name attribute is located at /nix/store/aar6rj1zv6bkac1fis2kpg3ivl2jkw2r-nixpkgs-23.11/nixpkgs/pkgs/stdenv/generic/make-derivation.nix:348:7

       … while evaluating attribute 'text' of derivation 'nixGLMesa'

         at /nix/store/aar6rj1zv6bkac1fis2kpg3ivl2jkw2r-nixpkgs-23.11/nixpkgs/pkgs/build-support/trivial-builders/default.nix:148:16:

          147|     runCommand name
          148|       { inherit text executable checkPhase allowSubstitutes preferLocalBuild;
             |                ^
          149|         passAsFile = [ "text" ];

       error: i686 Linux package set can only be used with the x86 family.


The important part:


  error: i686 Linux package set can only be used with the x86 family.


This is because by default, nixGL pulls in some i686 libraries for multi-lib support, but it does this even if you are on arm. Fortunately this can be disabled with the enable32bits setting.

what to do

So, here’s what you should do instead:

nix-env -i -E '(_: with import  { enable32bits = false; }; nixGLMesa)'

# if you want vulkan
nix-env -i -E '(_: with import  { enable32bits = false; }; nixVulkanMesa)'


Now you can use the nixGLMesa and nixVulkanMesa commands to run programs. For example,

artemis@reform ~> nix-shell -p mesa-demos
[nix-shell:~]$ which glxinfo
/nix/store/pgkpc86qjnkncyq0h1bc7qdr7q2g0a2r-mesa-demos-9.0.0/bin/glxinfo

[nix-shell:~]$ nixGLMesa glxinfo | grep renderer
    GLX_MESA_copy_sub_buffer, GLX_MESA_query_renderer, GLX_MESA_swap_control, 
    GLX_MESA_query_renderer, GLX_MESA_swap_control, GLX_OML_swap_method, 
Extended renderer info (GLX_MESA_query_renderer):
OpenGL renderer string: Vivante GC7000 rev 6214

[nix-shell:~]$ 


There it is, Vivante GC7000! That means we are hardware accelerated :D

the more correct fix

It would probably be good to fix this upstream, so that it doesn’t pull on i686 libraries on other architectures. I don’t actually know how to do that though, else I’d try to write a fix myself, so I’m writing about how to make it work right now since that’s what I’ve got the energy for.


i like gentoo’s package deprecation process
2023-11-05T00:00:00+00:00
Gentoo’s process for removing packages from the main gentoo package repository is designed to make me aware of it and give me time to react, and I really appreciate that.

So when a package gets dropped from the gentoo repo, this happens in a few steps.

First, the package is masked by a package.mask. This does two things:


  It prevents any new users from installing the package (this can be overridden, but its a conscious configuration change to do so).
  It alerts anyone with the package already installed that they have a masked package installed, along with a comment as to why it’s masked.


Here’s an example. I had dev-util/catalyst-3.0.22-r1 installed, which is masked:

!!! The following installed packages are masked:
- dev-util/catalyst-3.0.22-r1::gentoo (masked by: package.mask)
/var/db/repos/gentoo/profiles/package.mask:
# Andreas K. Hüttel <[email protected]> (2023-07-12)
# The catalyst-3 branch is outdated and not used by Gentoo
# Release Engineering anymore. Please either use git master
# (9999) as all Release Engineering build machines or wait
# for catalyst-4. Questions or bug reports about catalyst-3
# may or may not lead to useful results.


I’ve got info in the comment on why it’s masked and what I might want to do in response. In this case I just removed the package because I don’t actually use catalyst. In other cases I might decide to copy the package into my own personal repo and continue maintaining it there, accepting the maintainership burden for myself. Or I need to find a way to migrate to an alternative.

Catalyst isn’t really an example of what a package removal from the main repo looks like though, since it’s just that version which is masked. Here’s another one which is a full removal:

- media-gfx/gmic-3.2.6::gentoo (masked by: package.mask)
/var/db/repos/gentoo/profiles/package.mask:
# Marek Szuba <[email protected]> (2023-10-26)
# Upstream uses a massive home-made Makefile which has since the beginning
# required massive amounts of patching to make it behave reasonably
# (as well as to fix the problems which ostensibly led upstream to
# abandoning CMake, and which they immediately re-introduced in their NIH
# solution) and which if anything have only got worse since then. One,
# optional, reverse dependency in the tree.
# Removal on 2023-11-26. Bug #916289.


Maintainer doesn’t want to keep patching a difficult to work with build system. Fair enough.

Note Removal on 2023-11-26, one month after the mask date. If I run any package manager operations between the mask date and one month after the mask date, I’ll get this comment telling me the package is masked. After a month though, the package will get removed from tree along with the package mask (no need to mask a package which isn’t there anymore). So I won’t get the helpful message if I happen to go more than a month between system updates. Which I think is a fairly generous window.

Anyways this contrasts to me with my experience with packages getting removed from the repos with Arch Linux.

On Arch, packages just kinda vanish from the repos it seems like. They just stop getting updates and I don’t notice, until I try to install them on another arch machine and realize they’re gone. Or someone re-adds them to the AUR and suddenly I’m updating what used to be an official package from AUR when I run an AUR update (which is how I’ve learned about a large number of package removals).

I think that’s kinda the usual story too across other distributions. I’m not really sure what happens to packages that get dropped during major debian upgrade though to be honest, but I don’t think I’ve ever been notified about a package that used to be in the repos and isn’t anymore.

But yeah, I like that Gentoo has a system for telling me about these things and giving me time to decide what I want to do about it.


LSDJ is pretty cool
2023-11-03T00:00:00+00:00
Little Sound DJ (LSDJ) is a music tracker for the original gameboy / gameboy color. It makes pretty bleep bloops, I think it’s neat.

The core is pretty simple. The gameboy’s sound hardware has 4 channels: 2 pulse waves, a 4-bit 16-sample wavetable, and a noise channel. You’ve got hard-panning, left, right, or center. But you can do so so much with this. I feel like I really underestimated it for a long time before I finally tried it out.

The pulse waves are pulse waves. You’ve got a few pulse widths to pick from, a volume envelope. But you can also do very smooth pitch bends and vibratos, and the first pulse channel lets you do really fast frequency sweeps that turn it properly percussive (people love to use this for kick-drum purposes). In the latest LSDJ you’ve even got a little visual representation of the volume envelopes which is really really nice because the numbers are kind of slow to process.

Something that really stands out to me is how fast arps sound. Thinking about other synths, super-fast arps are usually really hard to do when controlling a synth over MIDI. MIDI is good, but for timing that tight, it’s got limits. So usually you’re stuck with using an arp feature built into whatever synth you’re using, and usually that arp feature also is only as simple as playing a set of notes in one of a few pre-defined orders. With LSDJ you get incredibly high-speed high-resolution arp control, either with a simple 3-note arp using the arp command, or a custom 16-step arp sequence using “tables” (I think you can even chain tables for longer arps?) where you can decide the exact transposition of each note. You can do really high-precision tremolos this way too. It’s like an LFO you can program.

The sound channel is pretty nice. You’ve got noise and a bit of control over the timbre of it, and a volume envelope, that makes it work really well for hats or noise-snares. But you can also tune it to a few notes (C, D, F, G#). It’s not a precise tuning, its slightly detuned a bit, because this is actually a consequence of the undertone series. The undertone series is all the integer-divisions of a frequency, in contrast with the usual overtone series that are integer multiples of a frequency. Anyway, the sound channel can be nearly-tuned to some notes this way, and LSDJ lets you do it, giving a really pleasing sizzly synth sound with just enough detune to add some musical spice. Arpeggiating this sounds super cool too.

The wave channel is truly a star. Nominally, it’s intended to be loaded up with a predefined wave that you can then play at different speeds, and LSDJ does let you do this. You can hand-draw your own waves and switch between them as you like. But there’s more.

For one, LSDJ can automatically compute wavetables for you. You’ve got a wavetable synthesizer built in that lets you specify a start and end state, and will build a wavetable by sliding the parameters inbetween them. This wavetable synth has


  Volume control
  Filter cutoff
  Filter resonance
  Some phasey controls that let you warp or pulse-width modulate or a few other things.


And then you also have a few distortion modes (clip, foldback, modulo wrap) to spice it all up with. It’s seriously cool, I wish I had a program like this on my computer to generate wavetables to use in my more hi-fi wavetable synths

LSDJ can then automatically transition between waves to provide smooth or not-so-smooth modulation of these parameters during song playback.

But LSDJ can also hack the wave channel into being a PCM output by sequentially loading a series of waves from an internal wavebank with precise timing. It uses this to give you real drumkit samples, if you want to use them (with up to two drums playing back simultaneously!). There’s a ton of classic drum machine samples included to play back in gritty 4-bit quality.

And that’s pretty cool as it is, but it also has a speech synthesizer built in. Which I found on accident! Set your wave channel to instrument 40 and you have it.

Rather than writing out words to get auto-synthesized, this speech synth lets you write a sequence of individual sounds (allophones), and you can precisely time exactly how long you want each one to last in song-ticks. This makes is really easy to make it say exactly what I want with the exact cadence I want, which makes it much easier to use musically than most lofi speech synthesizers that sound like this. Usually speech synthesizers that sound as oldschool as this were never really intended for song use, and so you’ve got to sample them and chop them up to get something musical. LSDJ’s speech synth is no vocaloid, but it’s really pleasant to use. You can even get a super distorted variation by putting on a max-speed low-depth vibrato on it.


  little groove with a speech synth saying "maybe"
  


As far as I can tell it’s fixed to synthesizing speech with an A-note carrier frequency, which, fair enough. I imagine rom banks for other notes might be a bit much for the cartridge size. If you’re willing to do a bit of post-processing magic, its nothing a little autotune wouldn’t fix for you.

Anyways, on top of all of this, there’s a ton of commands you can use to tweak all your sounds and it really makes me want to just tweak everything and make variations and make my bleep bloops do all sorts of fun things. And there’s also command-tables that you can use to get two whole commands and a transpose per song tick (or synth invocation, depending on how you tell it to work). Great for doing all sorts of advanced trickery, or getting really picky with your arpeggios.

And then you’ve even got a “live mode” that gives you a pattern-launching style performance mode which is super fun.

Somehow this software just keeps getting better over time, and that’s like one of my favorite things is when that happens. It’s such a joy to play with and I highly recommend giving it a shot.

Speaking of which, here’s some ways I’ve found are a good way to run and some ways that aren’t:

You can of course use a GB/GBC. I know there’s a hardware mod some folks do (https://www.littlesounddj.com/lsd/prosound/) to get a better line-out on the GBC. I don’t have one myself. You can also use it with a GBA. I just don’t really like the way the buttons on the old hardware make my hands feel so I’ve never been compelled to do it this way.

Generally speaking, don’t bother trying to use it on Nintendo DS/DSi. There’s a gameboy emulator called GameYob, but the audio emulation on this emulator is pretty inaccurate and so volume envelopes wont work right, vibratos and pitch bends get super quantized, wave samples get garbled, sound channel acts funky. Although- if you want to explore this land of not-quite-right audio emulation, it does sound kinda glitchy cool, and could be worth composing for in its own right. Just expect anything you write on it to sound different everywhere else.

On the 3DS and New3DS, install retroarch and then use Gambatte. The sound emulation is great! I recommend remapping start/select to the bumper buttons. On the old 3DS you might experience some brief audio glitches while pattern editing. Maybe disabling wireless would help with that? I don’t know a whole lot about it since I have a New3DS which doesn’t ever have audio glitches. I was just briefly playing with my friend’s old 3DS getting it installed for them. But I’m sure you can do some tweaks to minimize it, maybe play with the emulator settings. The headphone out can be a bit noisey if you’re recording from it, so if you want a cleaner recording for a song, send your save file over to a computer emulator to get a digital recording out of that.

On the computer, you’re really living in luxury. The LSDJ site recommends BGB and Sameboy. I’ve run BGB in wine and it worked well for me. Definitely consider using a real controller with these! You can do keyboard too, but making music with controller is kinda nice.

Alright that’s all! I really like this and wanted to talk about how cool it is.


cloud-init - set root password with a config iso
2023-10-18T00:00:00+00:00
This is useful for running cloud-init VMs locally, particularly if the image doesn’t have sudo or some other privilege escalation tool. Usually in an actual cloud environment you wouldn’t do this.

you will need the following commands:


  mkpasswd
  cloud-localds


cloud-localds comes from a package usually called cloud-utils on distributions that have it. Gentoo does not, so I used nix-shell -p cloud-utils to use the nixpkgs build of it.

generate a password hash:

mkpasswd -m sha512crypt


This will prompt for a password and provide a hash. Here’s the hash for ergosphere. Your hash should look like this too:

$6$8Q6mhBP3mpXVaESC$STC9rjLChG54I.Xlj3/mRwInf.YSJnToe8GOKDO5jwDUnXqPmLBWzYxWrc6bCOnfIXqJqNMJBjIabHSVumCe80


create a file config.yaml, with your hash in it:

#cloud-config
users:
- name: root
  lock_passwd: false
  # replace with your password's hash
  hashed_passwd: $6$8Q6mhBP3mpXVaESC$STC9rjLChG54I.Xlj3/mRwInf.YSJnToe8GOKDO5jwDUnXqPmLBWzYxWrc6bCOnfIXqJqNMJBjIabHSVumCe80


generate config.iso:

cloud-localds config.iso config.yaml


attach config.iso to your VM as a cdrom before boot.


Scrollbars are becoming a problem
2023-10-12T00:00:00+00:00
Scrollbars. Ever heard of them? They’re pretty cool. Click and drag on a scrollbar and you can move content around in a scrollable content pane. I love that shit. Every day I am scrolling on my computer, all day long. But the scrollbars are getting smaller and this is increasingly becoming a problem. I would show you screenshots but they’re so small that even screenshotting them is hard to do. And people keep making them even smaller, hiding them away, its like they don’t want you to scroll! “Ah”, they say, “that’s what the scroll wheel is for”. My friend, not everyone can use a scroll wheel or a swipe up touch screen. And me, a happy scroll-wheeler, even I would like to quickly jump around some time.

Why it matters

Ok so a lot of folks have fine motor control problems. Others use somewhat inaccurate pointing devices like eye-trackers (they’re very impressive, but not good enough for your 8 pixel wide scrollbar!!). And these people, they wanna scroll! When my friend is voice/sound controlling their computer (with Talon Voice which is really good btw you should try it), they don’t wanna say “scroll down” over and over again or start auto scrolling and try to land in the right spot. They wanna just look at the scrollbar and click on the spot they wanna be looking at. You click on the spot and the content goes there! It’s THE GOOD STUFF!!!

I am not the first to notice this. See this post too in 2015 discussing this problem. I’d hate to go back in time only to tell them nothing has improved.


  “The simple fact that these skinny scroll bars exist are evidence that designers do not sit with non technical users to conduct usability testing. Because if they did that they would immediately discover the problem.

  People with dexterity and hand control challenges have a difficult time with these skinny scroll bars.

  People with eye sight challenges suffer with these skinny scroll bars.”


The Problems

People keep making the bar smaller! Or trying to get rid of it! I’m naming names of software I use day to day, but it’s not just them, people are doing this even outside the world of Linux.

In some of the cases the problem is actually that the bar is precisely the same size it always has been (in pixels), but monitor resolutions are much higher than when that bar width was first chosen, and bars haven’t started scaling to keep up, but in other cases the bars are actually getting smaller. Gods, remember the needle-bar Ubuntu tried to introduce for a bit? lmao.

And while the bars were shrinking, another feature silently disappeared: buttons to click and hold down to scroll left/right in increments. Arrow-keys largely hold this functionality now, but are dependent on what content is in focus. Buttons were not dependent on this. Actually, some of you might not even know what I’m talking about. These:



I don’t use these! But I added it into this post on request of someone who does, and wants them back.

By and large the trend that persists is, the bars get less usable, and either there are no user-configurable ways to fix them, or the ability to configure the options are buried so deep into the tech stack that no normal user can find them. I’m extremely technical. Many of my friends are not, and it’s their troubles that have driven me to write this post.

GTK

In GTK2, you could modify scrollbar widths directly in your gtkrc, and GUI programs existed to do this. In GTK3, this is CSS. Which, OK I guess, but there hasn’t really been a good user-friendly way to set it unless you understand theming. I found this reddit thread with a good script for it, which I’ll reproduce here in the likely event that reddit dies. Put this in a .sh file and run it if you need.

#!/bin/bash

echo "NEW SCROLLBAR WIDTH(px) OR TYPE 'r' TO RESET"
while true; do
    read uin
    uin=$(echo "$uin" | xargs) # trim
    uin=$(echo "${uin,,}") # lower case

    if [[ "$uin" =~ ^[1-9][0-9]?$ || "$uin" == "r" ]]; then

        # RESET

        # remove previous width in gtk.css (if any)
        if [ -f "$HOME/.config/gtk-3.0/gtk.css" ]; then
            if [[ $(grep -v "slider { min-width" "$HOME/.config/gtk-3.0/gtk.css") ]]; then
                grep -v "slider { min-width" "$HOME/.config/gtk-3.0/gtk.css" > tmpfile && mv tmpfile "$HOME/.config/gtk-3.0/gtk.css"
            else
                rm "$HOME/.config/gtk-3.0/gtk.css"
            fi
        fi
        if [ -f "$HOME/.config/gtk-4.0/gtk.css" ]; then
            if [[ $(grep -v "slider { min-width" "$HOME/.config/gtk-4.0/gtk.css") ]]; then
                grep -v "slider { min-width" "$HOME/.config/gtk-4.0/gtk.css" > tmpfile && mv tmpfile "$HOME/.config/gtk-4.0/gtk.css"
            else
                rm "$HOME/.config/gtk-4.0/gtk.css"
            fi
        fi

        # reset scrollbar visibility
        gsettings reset org.gnome.desktop.interface overlay-scrolling

        # reset flatpak overrides
        if [[ "$(flatpak --version 2>&1)" =~ ^Flatpak ]] && [ -f "$HOME/.local/share/flatpak/overrides/global" ]; then
            sed -i 's|xdg-config/gtk-3.0;||g' "$HOME/.local/share/flatpak/overrides/global"
            sed -i 's|xdg-config/gtk-4.0;||g' "$HOME/.local/share/flatpak/overrides/global"
            # if the those were the only filesystem overrides, remove the filesystems= line
            grep -vx "filesystems=" "$HOME/.local/share/flatpak/overrides/global" > tmpfile && mv tmpfile "$HOME/.local/share/flatpak/overrides/global"
            # if there are no other overrides, remove the flatpak global overrides file
            if [[ ! $(grep -vx "\[Context\]" "$HOME/.local/share/flatpak/overrides/global") ]]; then
                rm "$HOME/.local/share/flatpak/overrides/global"
            fi
        fi

        # APPLY NEW SETTINGS

        if [[ "$uin" =~ ^[1-9][0-9]?$ ]]; then
            # add new width in gtk.css
            echo "slider { min-width: ${uin}px; min-height: ${uin}px; }" >> "$HOME/.config/gtk-3.0/gtk.css"
            echo "slider { min-width: ${uin}px; min-height: ${uin}px; }" >> "$HOME/.config/gtk-4.0/gtk.css"

            # apply to flatpak (if installed)
            if [[ "$(flatpak --version 2>&1)" =~ ^Flatpak ]]; then
                flatpak override --user --filesystem=xdg-config/gtk-3.0 --filesystem=xdg-config/gtk-4.0
            fi

            # make scrollbar always visible
            gsettings set org.gnome.desktop.interface overlay-scrolling false
        fi

        echo "LOGOUT FOR EVERYTHING TO BE APPLIED"
        read hold
        exit 0
    else
        echo "Invalid input, try again."
    fi
done


Naturally the script has to override flatpak separately because heaven forbid my flatpak applications look the way I themed my system. They’ve also started hiding the scrollbar entirely by default until you hover over where it’s supposed to be, what is that???? You can get it back in GTK 3 if you know the right command line command to type in. (the script above does this too btw)

gsettings set org.gnome.desktop.interface overlay-scrolling false


Or you can find it in the GUI if you know how to navigate the Dconf Editor, I guess. lol. Though in GTK 4, you can’t even set this setting globally. See this thread right here, which to the best of my knowledge is still true as of writing.

> > Who ever said they were going to "go"? The API to make an application use
> > them is still there. Programs used frequently on tablets or such can take
> > feature requests to have their own options to use non-overlay scrollbars.
> 
> So I just need to file a feature request against every gtk application that I
> use?
> And when I install GNOME I'll need to go into each application and enable it.




Qt

To quote /u/cfeck_kde on /r/kde:


  “The width of Qt scrollbars is determined by the Qt widget style plugin you are using. As far as I know, only the Skulpture style allows configurable sizes. For other styles like Breeze, you would need to change the C++ source and recompile.”


On the one paw, recompile my GUI theme? - but hey I’m a gentoo user I do that every friday anyway, I could just drop a patch in I guess… RECOMPILE MY GUI THEME??

I repeat,



But on the other paw, because Qt style plugins are real code they do get to have a lot more power here if you find one that lets you change what you want. I use Kvantum personally. I can’t seem to find a setting in its configuration to change scrollbar width, but I can disable scrollbar disappearing (which it calls “Transient scrollbars”) so that’s nice.

Anyways, I tried out Skulpture like the reddit person recommended and it seems kinda cool, give it a try! I can’t figure out a way to configure it with a GUI though without using KDE Plasma though. No shade on Plasma, I just don’t have it installed right now to try it out, but maybe if you’re a plasma user it’ll be good for you.

Still, if our only hope is this one theme engine that may or may not keep working as Qt development continues, I can’t help but feel like the war is being lost.

Firefox

Firefox has also joined the war on scrollbars it seems, with an incredibly tiny scrollbar on the side of my screen. Thankfully, Firefox is at least fixable (for now…), but you have to go into about:config to do it which is never a great sign. HOWEVER, you CAN do it.


  Type in about:config in your address bar
  Search for widget.non-native-theme.scrollbar.size.override
  Edit it to whatever number you want
  You can also edit widget.non-native-theme.scrollbar.style to change the shape of it, set it to 4 for a nice chonk rectangle
  Finally, turn on “Always show scrollbars” in the normal settings window about:preferences if you want them always on.


Here, I set the size to 50. I don’t want it this big myself but by goly am I glad I can make it this big anyway. Behold, scrollbar glory:



EDIT: Go check out this response post from Athena Lilith Martin where she explores some extra settings to improve firefox’s scrollbars further, including disabling web page CSS overrides.

Chrome

Imagine being able to configure anything useful in chrome ever.

Electron

For that matter, imagine configuring an Electron app. Couldn’t be me. Or anyone else, for that matter. Maybe you can inject some custom CSS into the electron app to fix things up? Honestly if you have solutions for Chrome or Electron please tell me because I have no idea.

THERE ARE MORE

Look these are just the ones I encounter regularly, but all of these so-called “modern design principles” seem to be in a war against scrollbars and everyone that uses computers slightly different from the people implementing this are suffering as a consequence. Shit sucks yo!

The Best

OK you know what rules though? You know what I love, what my friend who uses eye trackers loves, what my friends who use tablet pens love? MINIMAPS



YEAH!!!!!!!!!

I can see the content. I can click the content. I can go to the content. And the hit target is massive. What more could a girl want?


Ephemerality in the Land of Federation
2023-09-09T00:00:00+00:00
Sometimes when we’re talking, we want ephemerality. Messages that do not last. Messages that cannot be read after some period of time. There are many reasons for this, and I’m not going to explain them, because I’d rather spend energy on how it can  or can’t be achieved.

TL;DR: you need a threat model

When we talk about this, we also need to ask “ephemeral to whom”? On a service like Discord, when I delete a chat message, that message is probably deleted from Discord’s servers. It might persist in backups somewhere, depending on how they run things. It’s almost certainly not deleted from any government agencies that Discord is sharing streams of messages with, if any (speculation on my part, but I tend to just assume a chat platform as large as discord has the NSA asking for that yumy chat data).

Now even if I’m right about that, the chat servers deleting the data, but governments keeping it, is fine for a lot of use cases. Most of the time when I want a message truly deleted, not just hidden, what I really care about is that if my account gets hacked or the recipients’ accounts are hacked, the hacker can’t go get at that message. Interpersonal feuds are higher on my list of concerns than nation-state actors. Deleting it from the server takes care of that.

I might care that the recipient does not receive the message, but no matter what that is a best effort race of whether I can get the deletion in before they see it.

I might care that the recipient does not share the message, but that is fully reliant on them being trustworthy and is completely unrelated to chat platform we’re using, so I need to determine what I think they’ll do before I even message them.

So then let’s move onto Federation. First I’ll talk about Mastodon, IRC, XMPP, Matrix, as they currently stand.

Mastodon


  you can send deletion requests
  unmodified servers that honor the deletion request will completely get rid of the data
    
      however, the data may still live in backups
    
  
  modified servers may not honor the deletion request
  I think that servers that de-federated with your home server will not see the deletion request, but I also don’t know if they’re holding onto your data


While a centralized service may honor deletions, the potential for random servers to refuse to handle deletion requests prevents you from achieving reliable ephemerality, particularly because those servers might ALSO continue to make your post publicly viewable. The barrier to viewing the post is just having a link to the copy held on their server- you don’t need to be a server admin and log into the database or something.

IRC


  Most IRC servers don’t want to maintain message history
  Most big channels have one or more participants logging the entire history
  Ephemerality exists only to the degree that all participants in a chat have turned off their chat program’s logging features
  No way to request a client redact history
  People using “bouncers” store messages for some period of time to be relayed to their clients, but they can also choose to log in without their bouncer. There’s no reliable way to know if they’re behind a bouncer or not.
  Basically, despite the lack of deletions, it can be ephemeral if all participants want it to be, because ephemerality is the default here and persistence is added on top.


XMPP


  I don’t know a whole lot about this, but
  most servers don’t want to store chat history for long periods
    
      usually they store it like a week at most and it’s just for syncing up devices that don’t connect in very often
    
  
  clients store extended chat history instead of the server
  if you don’t want the server operator seeing your shit, you end to end encrypt it.
  how do message deletions work? Do they exist at all? I forget, it’s been so long since I’ve used it.


Matrix


  Message history is maintained indefinitely
  A deletion generally does not delete data, it just sends a “this message was deleted” event that hides the original message in the timeline
    
      likewise, edits are additive in this way, stacking on top of the original message
    
  
  However, matrix supports end to end encryption
    
      Encrypted data is only visible to anyone with the decryption keys
      Some partial protection against the “hacked account” situation. Does not protect against the “hacker computer” situation.
    
  
  Matrix-the-protocol also supports the ability to negotiate a temporary peer-to-peer connection that is not tracked by the server. This is used for voice/video calls


I want to highlight that when we talk about “true message deletion”, actually deleting the data from the server is largely irrelevant because a server admin usually has no way to ALSO delete that data from their backups (and they SHOULD have backups). What we can actually care about is one or more of the following:


  ensuring no current users can access the data
  ensuring no future users can access the data
  avoiding sending data to the server in the first place


Mastodon is sort of a lost cause here I think. It doesn’t fit in with what mastodon does, mastodon is all about making data public, in a way that makes redactions exceedingly difficult to guarantee in any way. But it might be worth pursuing for the purpose of mastodon DMs, websiteboy permitting.

IRC- again, if all participants configure their shit right, you can have a high degree of ephemerality today.

Matrix: room for improvement, but the tech is basically already there if clients were convinced to use it.

Making Matrix Better at This

So there’s three main things that can be done here.

Two of them are useful for the purpose of premeditated ephemerality. I know I want this data to be temporary. Let’s make sure of it.

Option 1: Expiring conversations

At a high level the way this works is

  I want to talk to you and have the conversation expire after like a few days. Our apps negotiate keys for this (probably: just make a channel under the hood)
  We talk
  After a few degrees, both of our apps delete all the encryption keys for the conversation, making it permanently inaccessible.


This relies on all apps actually dropping the keys (do you trust the person reading your messages? do you trust the app dev to have not messed it up?), and it relies on the encryption used not being cracked at some future date.

Option 2: Peer to peer conversations

Right now as I mentioned, matrix-the-protocol already has a way for apps to start a peer to peer direct connection with each other for video/audio. But like, why not just do that but send text over it? If we did this, the server NEVER sees the data at any point. And only clients actively participating in the conversation can see anything. The main downside is that it would SUCK to use over mobile because you’d need to keep your app open for the whole conversation, since phones are really aggressive at killing background connections. Also, moving between cell towers often breaks connections. But it’s probably the biggest guarantee of ephemerality you can make.

Option 3: The Nuclear Option

If you truly want to redact a message that was already sent…

I don’t know if you can drop the keys retroactively for just part of a conversation (if you can that would rule!). But you can certainly drop the keys for all of one. In theory, we could build a method for me sending you a request to drop all your encryption keys for our existing conversation, you could accept, and then we have essentially buried the entire chat up to that point, never to be read again. Gone. Nobody can decrypt it ever again. Matrix historically has had bugs that cause this scenario to happen anyway (though its been good about it lately for me…), just turn it into a feature.


Gentoo Stability Indicators- what do keywords mean?
2023-09-09T00:00:00+00:00
I’m not a core Gentoo dev, but this is my understanding of what the stability indicators mean, having used Gentoo for about a year now.

There’s three levels of stability, indicated by whats called a “keyword”:


  Unkeyworded
  Unstable-keyworded
  Stable-keyworded


The stability of a package is both per-version, and per-architecture. So a package version with keywords “amd64 ~arm64” is:


  Marked stable on amd64 (x86_64)
  Marked unstable on arm64 (aarch64)
  Unkeyworded on all other CPU architectures.


So what do these three things actually mean? Let’s go from least to most stable.

Unkeyworded

A package is unkeyworded when the package maintainer does not know if it will even build and install correctly, let alone run. All packages start out unkeyworded on new CPU architectures, since no one knows initially what will and won’t work on it.

Some packages have build scripts that always build the latest bleeding-edge version of the source code, usually marked with a 9999 or 99999999 version number. These packages are always unkeyworded, since the package will always download the latest source code when you run the build for it, and the maintainer has no way to guarantee that will actually work.

To install an unkeyworded package, you have to add it to /etc/portage/package.accept_keywords

Usually this looks something like:



This ** wildcard will allow you to install the package even if there are no keywords for your architecture. The < and -9999 excludes any bleeding-edge live build versions. You don’t need to include it if the package has no bleeding-edge build script, but I like to include it regardless in case it gets one later.

Unstable

A package is unstable on a given CPU architecture when it is known to build on that architecture, and maybe known to work in some capacity, but has not been thoroughly tested. Package maintainers might mark a package as unstable themselves. As a user, you can also file a request for a package to be marked as unstable, telling them that you have built the package on the architecture and it seems to work for you. The wiki has instructions (wiki/Knowledge Base: Missing keywords and keyworq requests) for how do to this. You should probably make sure all its dependencies are keyworded first.

Generally speaking, you shouldn’t submit a request to keyword a version unless other archictectures already have at least on unstable keyword for that version. If none of them have a keyword, the maintainer probably does not consider that version to be keyword-worthy in general.

I think that once a package is keyworded on an arch, it stays that way for new versions unless someone comes in to say that it is now thoroughly broken on the arch and needs to be unkeyworded. That means that just because something is unstable, does not mean it will actually build/work correctly. But it does mean it at least used to, so fixing it may not be too bad if there is trouble.

To install an unstable package, you can again add it to /etc/portage/package.accept_keywords, but this time you add it as something like:

dev-lua/luaposix ~arm64


A tilde in front of your CPU architecture. You can also globally accept unstable for all packages, but I do not do this. It can be frustrating to reverse this change if you do this and decide it is not for you.

Stable

A package version is stable on a given CPU architecture when it is known to build, run, and generally work as it is supposed to. It’s not an indication that the software has no bugs- all software has bugs. But it’s an indication that it’s considered about as good as it will get for general use.

Stabilization has a more involved set of requirements, but like unstable, you can request a package be marked stable as a user. See the wiki/Stable request. Stabilizing a package has a much higher cost on a package maintainer than unstable-keywording it, so maintainers will only do it within the bounds of what they have energy to commit towards it, and will prioritize packages they think have the most users.

What to expect out of different CPU architectures

On amd64 (x86_64), the vast majority of stuff is marked stable. It is where the most maintainer time goes. You will still need to accept some unstable software, but unkeyworded software is rare.

On arm64 (aarch64), you will have a pleasant amount of stabilized software available, but you will frequently need to accept unstable versions of software, and you will need to accept unkeyworded software with some regularity (go make keyword requests! This is an instruction for me too, I have been putting it off).

Straying from there into architectures like ppc64 or riscv you will find progressively fewer keywords and stable software. Such is life on these architectures.

GURU is different

Packages in GURU, the Gentoo user repository, will never be stable. It’s not allowed. Additionally, keywording happens not by filing an issue but by just comitting the keyword as a change to the repo. If you want to keyword a package, message the package maintainer or become a contributor yourself and commit the keyword.

As a contributor, I don’t commit keywords unless I’ve actually tested the build myself and am pretty sure it will work.


hackage search is not so good
2023-08-03T00:00:00+00:00
Haskell has a package database called Hackage. It has a search feature, but it is hard to find things with the search feature. What is bad about it? What are work arounds?

Problems

It doesn’t work good

If the search does not work good then that’s a problem. What do I mean?

Look, I want to find bindings for libgit.

I type libgit in search:



Hmm. This does not seem right. One result, very little usage- seems implausible.

I type libgit2 on a whim:



Hmm, better. Four results (one off screen), and hlibgit2 is the one I want. But why does it not show up when I type libgit? For comparison, we reference crates.io libgit search. Bad thing about crates.io search: libgit bindings are not near the top. So much for “sort by relevance”. Good thing: libgit2-sys is on first page of results at all. And at the top if sortby recent downloads!



Let’s try harder. I want to hash data. I search “sha256”:



Some results… cryptohash is deprecated in favor of cryptonite though, and cryptonite and saltine are nowhere on the results!

Ok, I search “crypto”. Still, they do not appear.



I search “cryptography”. Finally.



You see, it is hard to find things with the search. There was an old search and it worked much better than this. Problem was solved! Not anymore.

Javascript

The second problem: it requires javascript. “Yes but so does crates.io”- true, and that is not great either. Why? Personal workflow, I’ll show you.

I like to use links browser in terminal to search packages and package documentation. We have shell aliases that do this. Main one is ddg command:

vi@localhost ~> cat (which ddg)
#!/bin/sh
if [ $# -eq 0 ]; then
    links "duckduckgo.com/lite"
else
    query="$@"
    links "duckduckgo.com/lite?q=$query"
fi


In rust, we have alternative index lib.rs which works without JS.

ddg !librs serde
                              [1]Lib.rs
   › Search [2]#serialization [3]#json [4]#no-std [5]#deserialization [6]#parser
   [7]serde________________ [8]Search
     * Sorted by relevance
     * [9]I'm feeling ducky
    1. [10]serde

       A generic serialization/deserialization framework

       v1.0.180 9.4M #serde #serialization #no-std
    2. [11]serde_yaml

       YAML data format for Serde

       v0.9.25 1.9M #yaml #serde #serialization
    3. [12]serde_with

       Custom de/serialization functions for Rust's serde


Solutions

hoogle

hoogle is not just search for functions, it can search packages too. It also works without javascript! Let’s try the searches again here.

First, libgit:



Good.

Second, sha256:



Not as good, but cryptonite is there at least. Difficult to parse though.

We can limit to just search packages with is:package but that is just a string contains on the package name. Because it is string contains, is:package crypto works for cryptonite but not saltine,



But is:package cryptography does not help at all.



So hoogle can help when hackage fails, and it works in links, but it does not solve everything. And it can only search packages in stackage.

stackage

Let us try stackage search, maybe it can help?





Hmm, I think this is the same search algorithm as hoogle. Oh well, at least it works without javascript.

Google

In desperation, we try Google:



Well at least we can get going in the right direction. Crypto is not in stackage but maybe it is good. And there is the FP Complete post at the bottom there to recommend cryptonite to us.


lua script to generate a nodejs cache from a package-lock.json
2023-07-24T00:00:00+00:00
this is a lua script that generates a nodejs cache from a package-lock.json. why is this useful? well you can generate a cache without this by deleting node_modules, and then running npm install --no-save --cache path/to/some/dir. It will cache all the downloads to the directory you gave it. but, it won’t cache anything it doesn’t download. This means if you install esbuild for example it will only cache the binary executable for your CPU architecture and not other ones.

if that’s fine for you then just do that instead. But i’m trying to generate a node module cache to use when building a node package in a network-isolated sandbox (gentoo package build). I don’t want to generate a bunch of different tar files on different CPU archs, I just want the one, so I want all the different esbuild binaries in there, and it’ll just use the right one. All of them are in package-lock.json:

        "@esbuild/android-arm": "0.17.19",
        "@esbuild/android-arm64": "0.17.19",
        "@esbuild/android-x64": "0.17.19",
        "@esbuild/darwin-arm64": "0.17.19",
        "@esbuild/darwin-x64": "0.17.19",
        "@esbuild/freebsd-arm64": "0.17.19",
        "@esbuild/freebsd-x64": "0.17.19",
        "@esbuild/linux-arm": "0.17.19",
        "@esbuild/linux-arm64": "0.17.19",
        "@esbuild/linux-ia32": "0.17.19",
        "@esbuild/linux-loong64": "0.17.19",
        "@esbuild/linux-mips64el": "0.17.19",
        "@esbuild/linux-ppc64": "0.17.19",
        "@esbuild/linux-riscv64": "0.17.19",
        "@esbuild/linux-s390x": "0.17.19",
        "@esbuild/linux-x64": "0.17.19",
        "@esbuild/netbsd-x64": "0.17.19",
        "@esbuild/openbsd-x64": "0.17.19",
        "@esbuild/sunos-x64": "0.17.19",
        "@esbuild/win32-arm64": "0.17.19",
        "@esbuild/win32-ia32": "0.17.19",
        "@esbuild/win32-x64": "0.17.19"


There are entries for each one of these specifying the source. So anyways this lua script just traverses the package-lock.json and adds each tarfile URL from each resolved field to the custom cache, so I can tar it up. It reads $PWD/package-lock.json and writes $PWD/node-modules-cache/

You need to install subproc, luaposix, and lunajson as lua packages. You need openssl, base64, and npm on your $PATH.

#!/usr/bin/env lua
--[[
	builds a node-modules-cache/ dir from a package-lock.json

	requires lua packages:
		- lunajson
		- subproc
		- luaposix

	requires commandline tools:
		- openssl
		- base64
		- npm
]]

local lunajson = require('lunajson')
local subproc = require('subproc')
local posix = require('posix')
local posix_stdio = require('posix.stdio')

-- config as necessary
local outdir = 'node-modules-cache'

local function dbg(arg)
	io.stderr:write(tostring(arg) .. '\n')
	io.stderr:flush()
end

local base16 = (function()
	local alphabet = '0123456789abcdef'
	local lut = { }
	for i = 1, 16 do
		for j = 1, 16 do
			lut[((i - 1) << 4) | (j - 1)] = alphabet:sub(i, i) .. alphabet:sub(j, j)
		end
	end

	return function(data)
		local out = ''
		for i = 1, #data do
			out = out .. lut[data:byte(i)]
		end
		return out
	end
end)()

local function integrity_check_file(hash, path)
	local algo, expected = assert(hash:match('^([^-]+)-(.+)$'))

	local pfd = posix.popen_pipeline({
		{'openssl', algo, '-binary', path},
		{'base64', '-w0'}
	}, 'r')

	local f = assert(posix_stdio.fdopen(pfd.fd, 'r'))
	local actual = assert(f:read('a'))
	f:close()

	return expected == actual
end

--[[
	files are stored in the cache based on their integrity hash. This function
	takes an integrity hash and generates the path within the cache to where npm
	will put the file
]]
local function cacache_path(integrity)
	local algo, hash = assert(integrity:match('^([^-]+)-(.+)$'))

	-- convert hash to base16...
	local pfd = posix.popen_pipeline({
		function()
			print(hash)
		end,
		{'openssl', 'base64', '-d'}
	}, 'r')
	local f = assert(posix_stdio.fdopen(pfd.fd, 'r'))
	local hash_bin = assert(f:read('a'))
	f:close()

	local hash_b16 = base16(hash_bin)

	-- 2 levels of dirs
	local d1, d2, fname = hash_b16:match('^(..)(..)(.+)$') 

	return '_cacache/content-v2/' .. algo .. '/' .. d1 .. '/' .. d2 .. '/' .. fname
end


local lock_file = io.open('package-lock.json', 'r')
local lock = lunajson.decode(lock_file:read('a'))

subproc('mkdir', '-p', outdir)

for pkgname, pkg in pairs(lock.packages) do
	dbg('evaluating ' .. pkgname)
	if pkg.resolved then
		local outfile = outdir .. '/' .. cacache_path(pkg.integrity)

		local needs_download = false

		local _, _, ecode = subproc('test', '-f', outfile)
		if ecode ~= 0 then
			dbg('outfile ' .. outfile .. ' does not exist.')
			needs_download = true
		elseif not integrity_check_file(pkg.integrity, outfile) then
			dbg('outfile ' .. outfile .. ' has the wrong hash.')
			needs_download = true
		end

		if needs_download then
			dbg('downloading ' .. pkg.resolved)
			print(subproc('npm', 'cache', '--cache', outdir, 'add', pkg.resolved))
			dbg('checking hash of ' .. outfile)
			if integrity_check_file(pkg.integrity, outfile) then
				dbg('hash is correct')
			else
				dbg('hash is wrong')
				error()
			end
		else
			dbg('already have local copy')
		end
	end
	dbg('=====')
end

print(subproc('rm', '-rv', outdir .. '/' .. '_logs'))


hope that helps


Split up a .stp/.step file into STLs
2023-07-22T00:00:00+00:00
Simple thing, writing so we can find it again. So first install FreeCAD, easiest with the appimage or flatpak probably. Then open the step file with File > Open or File > Import. You’re on a screen like this:



Click HERE and change the “workbench” from “Start” to “Part”.





Select your part on the left.



Click Part > Compound > Explode Compound in the menubar.



Now you have the part split up. You can keep splitting as far as you need until you have either individual objects or one compound per STL you want.

Select one of the parts in the menu:



Now click File > Export and save it as an STL Mesh (.stl file).



Repeat for each STL you want.

We’ve never used FreeCAD before; thanks Val Packett for teaching us how to switch workbenches and explode compounds.

– vi[olet]


Nix: Using rustPlatform.buildRustPackage with Git Dependencies (it’s kind of a pain)
2023-07-08T00:00:00+00:00
I’m not sure if something changed with the way Nix’s rust build system works in 23.05, but I’m pretty sure something did because this project didn’t suddenly switch to git dependencies since I last built it with git; it was already using them. Anyways whether it’s new or not, you need to specify a hash for each of the git dependencies which works but is annoying to actually do. Here’s how.

Alright so when you normally build a rust project using rustPlatform.buildRustPackage you specify a cargoSha256. This is the hash of a derivation containing the source code of all the dependencies of the project, separate from the project source itself. If all your dependencies are normal crates then you just set that to empty string, run a build, replace it with whatever the real hash is and you’re done.

When you have git dependencies you don’t use cargoSha256. Instead you need to define cargoLock, like this:

rustPlatform.buildRustPackage {
  # This isn't a complete package definition because I'm only including the
  # parts relevant to this post. See
  # https://github.com/NixOS/nixpkgs/blob/master/doc/languages-frameworks/rust.section.md

  # in this example you have the source locally, relative to the nix file.
  # for example, when writing a nix flake
  src = ./.;
  
  # again, you have the source locally, so you can refer to the Cargo.lock
  # directly. I'm not sure how you do this with builds that pull the source
  # down from a remote honestly, probably referencing the Cargo.lock from the
  # source somehow.
  cargoLock.lockFile = ./Cargo.lock;

  # Here's the annoying bit
  cargoLock.outputHashes = {
    "capstone-0.10.0" = "sha256-x0p005W6u3QsTKRupj9HEg+dZB3xCXlKb9VCKv+LJ0U=";
    "hidapi-1.4.1" = "sha256-2SBQu94ArGGwPU3wJYV0vwwVOXMCCq+jbeBHfKuE+pA=";
    "hif-0.3.1" = "sha256-o3r1akaSARfqIzuP86SJc6/s0b2PIkaZENjYO3DPAUo=";
    "humpty-0.1.3" = "sha256-efeb+RaAjQs9XU3KkfVo8mVK2dGyv+2xFKSVKS0vyTc=";
    "idol-0.3.0" = "sha256-s6ZM/EyBE1eOySPah5GtT0/l7RIQKkeUPybMmqUpmt8=";
    "idt8a3xxxx-0.1.0" = "sha256-S36fS9hYTIn57Tt9msRiM7OFfujJEf8ED+9R9p0zgK4=";
    "libusb1-sys-0.5.0" = "sha256-7Bb1lpZvCb+OrKGYiD6NV+lMJuxFbukkRXsufaro5OQ=";
    "pmbus-0.1.0" = "sha256-20peEHZl6aXcLhw/OWb4RHAXWRNqoMcDXXglwNP+Gpc=";
    "probe-rs-0.12.0" = "sha256-uS+Hh2dKUXDgwqS9MdV6CmONO8i2pOeR5LBenliiEe0=";
    "spd-0.1.0" = "sha256-X6XUx+huQp77XF5EZDYYqRqaHsdDSbDMK8qcuSGob3E=";
    "tlvc-0.2.0" = "sha256-HiqDRqmKOTxz6UQSXNMOZdWdc5W+cFGuKBkNrqFvIIE=";
    "vsc7448-info-0.1.0" = "sha256-otNLdfGIzuyu03wEb7tzhZVVMdS0of2sU/AKSNSsoho=";
  };
}


Ok so how do you figure out what to put in outputHashes? It’s simple, but tedious. At first, just set it to {}, containing nothing, and then run your nix build. The build will fail, and give you a - pair:

error: No hash was found while vendoring the git dependency capstone-0.10.0.
You can add a hash through the `outputHashes` argument of `importCargoLock`:

outputHashes = {
  "capstone-0.10.0" = "";
};

If you use `buildRustPackage`, you can add this attribute to the `cargoLock`
attribute set.


Ok, so whatever pair it gave you, put it in with an empty string or lib.fakeSha256 as the hash.

  cargoLock.outputHashes = {
    "capstone-0.10.0" = lib.fakeSha256;
    # or if you want less typing, "capstone-0.10.0" = "";
  };


Now repeat the process. Each time you re-run your nix build it will give you a new error with a new package. Eventually, you will have all the git dependencies:

  cargoLock.outputHashes = {
    "capstone-0.10.0" = lib.fakeSha256;
    "hidapi-1.4.1" = lib.fakeSha256;
    "hif-0.3.1" = lib.fakeSha256;
    "humpty-0.1.3" = lib.fakeSha256;
    "idol-0.3.0" = lib.fakeSha256;
    "idt8a3xxxx-0.1.0" = lib.fakeSha256;
    "libusb1-sys-0.5.0" = lib.fakeSha256;
    "pmbus-0.1.0" = lib.fakeSha256;
    "probe-rs-0.12.0" = lib.fakeSha256;
    "spd-0.1.0" = lib.fakeSha256;
    "tlvc-0.2.0" = lib.fakeSha256;
    "vsc7448-info-0.1.0" = lib.fakeSha256;
  };


At this point, you will start getting new error messages for incorrect hashes:

error: hash mismatch in fixed-output derivation '/nix/store/vp1w3i1xpsji7lvd2ij49myjbibmddqb-capstone-rs-77296e0.drv':
         specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
            got:    sha256-x0p005W6u3QsTKRupj9HEg+dZB3xCXlKb9VCKv+LJ0U=
error: 1 dependencies of derivation '/nix/store/bvb8wznxhpja68kak39s4a7c8n2q77fk-capstone-0.10.0.drv' failed to build
error: 1 dependencies of derivation '/nix/store/2gmfryjwzcplz3aysxc2nqwypq6839kb-cargo-vendor-dir.drv' failed to build
error: 1 dependencies of derivation '/nix/store/rig877k0ab9xiwfs73caqas2jb52ckv6-humility-20230708.drv' failed to build


Insert the correct hash:

  cargoLock.outputHashes = {
    "capstone-0.10.0" = "sha256-x0p005W6u3QsTKRupj9HEg+dZB3xCXlKb9VCKv+LJ0U=";
    "hidapi-1.4.1" = lib.fakeSha256;
    "hif-0.3.1" = lib.fakeSha256;
    "humpty-0.1.3" = lib.fakeSha256;
    "idol-0.3.0" = lib.fakeSha256;
    "idt8a3xxxx-0.1.0" = lib.fakeSha256;
    "libusb1-sys-0.5.0" = lib.fakeSha256;
    "pmbus-0.1.0" = lib.fakeSha256;
    "probe-rs-0.12.0" = lib.fakeSha256;
    "spd-0.1.0" = lib.fakeSha256;
    "tlvc-0.2.0" = lib.fakeSha256;
    "vsc7448-info-0.1.0" = lib.fakeSha256;
  };


Now repeat the process. Once again, you need to re-run the build one by one until you have the hashes for all of the packages. BE CAREFUL You might not get the hash errors in the same order you got the - pairs. Pay close attention to the error message to make sure you’re adding the right has to the right crate. I accidentally ended up mismatching some of the hashes because I figured I’d get the errors in the same order.

Anyways, after all that is done you’ll have an outputHashes like I showed in the example.

  cargoLock.outputHashes = {
    "capstone-0.10.0" = "sha256-x0p005W6u3QsTKRupj9HEg+dZB3xCXlKb9VCKv+LJ0U=";
    "hidapi-1.4.1" = "sha256-2SBQu94ArGGwPU3wJYV0vwwVOXMCCq+jbeBHfKuE+pA=";
    "hif-0.3.1" = "sha256-o3r1akaSARfqIzuP86SJc6/s0b2PIkaZENjYO3DPAUo=";
    "humpty-0.1.3" = "sha256-efeb+RaAjQs9XU3KkfVo8mVK2dGyv+2xFKSVKS0vyTc=";
    "idol-0.3.0" = "sha256-s6ZM/EyBE1eOySPah5GtT0/l7RIQKkeUPybMmqUpmt8=";
    "idt8a3xxxx-0.1.0" = "sha256-S36fS9hYTIn57Tt9msRiM7OFfujJEf8ED+9R9p0zgK4=";
    "libusb1-sys-0.5.0" = "sha256-7Bb1lpZvCb+OrKGYiD6NV+lMJuxFbukkRXsufaro5OQ=";
    "pmbus-0.1.0" = "sha256-20peEHZl6aXcLhw/OWb4RHAXWRNqoMcDXXglwNP+Gpc=";
    "probe-rs-0.12.0" = "sha256-uS+Hh2dKUXDgwqS9MdV6CmONO8i2pOeR5LBenliiEe0=";
    "spd-0.1.0" = "sha256-X6XUx+huQp77XF5EZDYYqRqaHsdDSbDMK8qcuSGob3E=";
    "tlvc-0.2.0" = "sha256-HiqDRqmKOTxz6UQSXNMOZdWdc5W+cFGuKBkNrqFvIIE=";
    "vsc7448-info-0.1.0" = "sha256-otNLdfGIzuyu03wEb7tzhZVVMdS0of2sU/AKSNSsoho=";
  };


And now you’re done wrangling dependencies. The Cargo.lock file you gave to cargoLock.lockFile specifies the hashes of all the normal crates, and you’ve manually specified the hashes of all your git dependencies.


Thoughts on MNT Reform after a couple months
2023-07-06T00:00:00+00:00
So we got ahold of one of these used from someone who found it didn’t do what they needed it to. New, these things are like 1200 euros. We got one for $420 (lmao). though we had to buy a different keycap set to get key legends (prev owner had blank caps), and we had to replace the battery boards with a newer version since this was like one of the first Reform versions made and the battery boards had a problem. the thing is that if you look purely at the CPU performance and ram, and you compare that to the price, this thing is currently really bad value. maybe that changes when they finish designing the RK3588 module but right now that’s how it is. but it’s increasingly clear that the value is there now, it’s just not in compute, it’s everything else.

So like, build quality is great. But more than that, the hardware layout is great. We’ve had to open this thing up a few times now to get at various parts we were updating to the current revisions of things and nothing was a pain in the ass. Disconnecting the battery boards is painless. Disconnecting the tiny OLED display for the system controller board is painless. Getting the trackball out to clean under it is easy, swapping the compute module is easy. We’ve gone through absolute hell with a lot of laptops during disassembly and reassembly and there’s none of that here. It’s really obvious to us how to do everything too, we didn’t even need to check the manual.

And there’s things like how the keyboard is mechanical switches. it rules. Actually the keyboard has some other neat parts. For one thing, it works as both a laptop keyboard and a standalone keyboard. They literally sell the exact same board in its own little case that you can use as a USB keyboard if you just like the board and want to use it with other computers. It’s got a USB port built into it for that purpose. Or you can buy it without the case and it functions as a replacement keyboard for the laptop. Actually wild that even if the rest of this laptop broke and we didn’t feel like fixing it we could keep using the keyboard as a keyboard.

Trackball also goes hard. I’m still a trackpoint diehard but this is a very close second and isn’t RSI-inducing for us like trackpads are. Fuck trackpads.

Also the bit where the batteries are standard. The batteries were dead from the person I got it from, but I just bought some from an unaffiliated website online selling LiFePO4 18650 cells and they just worked. because it’s a standard. No custom battery pack bullshit!!!!!!!!!! This is such a killer feature in and of itself because dying batteries is what has been the cause of death for almost every single one of our laptops to date (ballooning batteries in many of the cases) and the only reason our thinkpad x220 still has good battery life is third party sellers sell packs with the 18650s replaced (but I can’t do that replacement myself).

And then the other side of things is software and support. I feel like I’m used to the worst of the worst dealing with weird vendor boards. Pine64 is better than those usually but it’s still rough. MNT is out here having first-party images to flash on the SD card and eMMC, but not only that, they have a whole suite of command line scripts that automate the processes of downloading, flashing and updating u-boot and the boot images, setting up fstab, etc. etc. They ship with custom builds of gstreamer and such to get hardware accelerated video playback working. They ship environment variables to actually turn it on, to fix various programs under wayland. The community landed updates into mesa to make the GPU run better, etc. People are out here making the software work on the hardware and it’s great.

And u-boot supports the display! what the FUCK. you never see that!!!!!!! I can get early debug info without even having to connect up a UART. good lord. Oh and they have instructions and a defconfig for building the u-boot that ACTUALLY WORKS. I literally copy pasted from the handbook without thinking and got a working u-boot image this NEVER HAPPENS TO ME. The Kicad schematics are there and I can understand them. The keyboard firmware is heavily commented so I can figure out how to modify it to my liking. They are actually open source in the sense of you can go to the source code and do USEFUL THINGS with it, not “oh we shoved the source code without any context out the door”.

The one real downer on the software side for some folks is you have to use Wayland. It is what it is, that’s basically the case everywhere with ARM. the PinebookPro somehow brute-forces it enough to be somewhat usable under X but even it struggles. I don’t really understand it, but Xorg runs so terribly on these embedded GPUs and there’s not really anything anyone can do about it (unless you’re feeling masochistic and want to do some Xorg dev work, good luck though for real). But that sucks if your workflow hard-depends on X, and its why the person who sold it to us was selling it. Though as we found out a lot of improvements have been made to the Wayland software ecosystem over the past years.

Anyways this thing is awesome. CPU/RAM specs still suck. But everything else is incredible and this feels like a device that will actually continue to work instead of falling apart in a year or two like most laptops do. Looking forward to the Pocket too. Still gonna be maining the x220 until the compute on this catches up but damn!


cross-compile with NixOS and deploy that shit continuously
2023-06-06T00:00:00+00:00
Alright the premise of this one is way simpler than how we got there. We’ve got a raspberry pi 2, and we wanted to set it up to do some system monitoring. Pretty simple stuff ultimately: it’s got an FTDI serial adapter and an ST-Link both plugged in over USB to monitor a long computer that we’re doing work on right now. It’s also got an ethernet connection to that computer for netbooting, and then it’s bridging that connection to the rest of our network over a second ethernet port. We could get most of the way to what we wanted with Alpine Linux or even raspbian, but we’re running humility to do the ST-Link side of things, and there’s no way in hell I’m waiting around for that to compile an a raspi2. So, cross compile right? Yeah, but cross compiling sucks. Unless NixOS can save us? Turns out it can.

Look I’m not really a NixOS girl usually, but I’ve been coming around to it. Some of my roommates are really selling me on it lately, and I’ve been using it for creating x86 live ISOs, so I figured that it had a shot of being good here. It’s uh. Well it’s good once you get there, but very little of what I’m doing is directly documented (main reason I’m writing this, to teach that knowledge forward). And the interaction between cross-compilation and nix flakes still kinda sucks. We’ll be grappling with that a few times in this post. It’s worth it though, it’s way better than dealing with cross toolchains directly.

Anyways due to that lack of documentation and my inexperience with NixOS this would not be at all possible without help from Xe / open skies / ckie. They did most of telling me what to look at and figuring out how to get things to work. I just put the pieces together.

Ok let’s get on with it.

There’s a few discrete things we want to do here:


  Cross-compile humility
  Cross-compile a bootable raspi2 image
  Deploy changes to the raspi in-situ after we set it up, so we don’t need to constantly re-flash the card.


Humility

Let’s start with humility. This is a little unintuitive. Here’s a flake.nix that I dropped into the humility repo at some random commit that happened to be on main at the time:

{
  description = "debugger for Hubris";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-23.05";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    let system = flake-utils.lib.system;
    in flake-utils.lib.eachSystem [
      system.x86_64-linux
      system.aarch64-linux
      system.armv7l-linux
    ] (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        build-humility = (pkgs:
          pkgs.rustPlatform.buildRustPackage {
            pname = "humility";
            version = "20230526";

            src = ./.;

            cargoSha256 = "sha256-+2JAuY6zQkepLrbRKII6rOUJYQw6Psq92fIiE0Gm1Ns=";
            buildInputs = [ pkgs.libudev-zero ];
            nativeBuildInputs = [ pkgs.pkg-config pkgs.cargo-readme ];

            meta = with pkgs.lib; {
              description = "debugger for Hubris";
              homepage = "https://github.com/oxidecomputer/humility";
              license = licenses.mpl20;
              mainProgram = "humility";
            };
          });
      in rec {
        packages = rec {
          humility = build-humility pkgs;
          humility-cross-armv7l-linux =
            build-humility pkgs.pkgsCross.armv7l-hf-multiplatform;
          humility-cross-aarch64-linux =
            build-humility pkgs.pkgsCross.aarch64-multiplatform;
        };

        defaultPackage = packages.humility;

      });

}


Ok so remember, we want to be able to do a native compilation and a cross compilation. Because we’re cross compiling we need to think about the distinction between the host system (the thing running the compiler) and the target system (the thing running the code).

I’m using flake-utils.lib.eachSystem to iterate over possible host systems. Realistically I should just list every -linux combo here so someone using my flake could attempt to compile this from any host. I mean really there’s no reason not to just use anySystem I guess, if it fails it fails. But yeah, the point is, we’re iterating over all possible host systems.

Next up we have built-humility. This is a function which takes in some version of nixpkgs and defines a build of humility for that nixpkgs. That’s confusing until I explain how we’re using it.

So down below we have this in the packages section:

humility = build-humility pkgs;
humility-cross-armv7l-linux =
  build-humility pkgs.pkgsCross.armv7l-hf-multiplatform;
humility-cross-aarch64-linux =
  build-humility pkgs.pkgsCross.aarch64-multiplatform;


humility here defines a build where the target is the same as the host. So
like. You’re on an x86_64 computer, or an aarch64 computer or whatever. you want to just compile and use humility. You use the humility package. It compiles something you can run on the system you’re on right now.

Next we have humility-cross-armv7l-linux which defines a build where the target is armv7l-linux. We use pkgs.pkgsCross.armv7l-hf-multiplatform which gives us an alternate view into nixpkgs where every build is defined as being cross compiled instead of native compiled. The host is whatever system we happen to be running on, its any of the systems we passed in to flake-utils.lib.eachSystem up above. So like in my flake here it could be an x86_64 or an aarch64 system, or I could go add riscv or powerpc to the list of potential build hosts if I was feeling ambitious. Really most things should work.

Then we’ve got humility-cross-aarch64-linux. Same thing as the armv7l-linux one, but now we’re targeting aarch64 from whatever our host is. There’s probably some way to iterate over all possible targets to make this better than just listing them out one by one.

This is pretty cool. You can run

nix build .#humility-cross-armv7l-linux


and it will build the entire dependency chain and then build humility! This will take kind of awhile your first time doing it unless you have a lot of computer because when I say it builds the entire dependency chain I mean the entire dependency chain, no binary cache available. This is the downside to using pkgsCross, and so you’ll have some bootstrapping overhead from this. In the case of armv7l that’s hardly a downside though because armv7l doesn’t have an official binary cache anyway and we didn’t feel like trying to figure out how to use a community one.

Why not qemu-user

You may have have run into an alternative way to do cross-compilations with flakes wherein you build via for example .#defaultPackages.armv7l-linux. This works very differently: instead of actually cross compiling, it instead does a “native compile”, but emulates the armv7l instruction set in userspace using qemu-user. This is really cool, and we’ll actually be using qemu-user emulation later in this post, but it’s also slow as dirt because you’re emulating the entire compiler. rustc is slow enough as it is, it doesn’t need help being slower.

Plus, since there’s no binary cache for armv7l, we’d have to build the entire dependency tree this way. That would take me like days. or weeks. I dunno.

Still, in some complex situations, cross-compilation doesn’t work, and your options will lay between qemu-user emulation, full system emulation in a VM, or trying to debug/fix the cross-comp up your dependency tree. In that case, pick your poison.

Bootable raspi image

Time for another flake. I’ll give you the minimal flake that gets something booting and then we’ll go from there:

{
  description = "Build image";
  # update to whatever version
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-23.05";

  outputs = { self, nixpkgs }: rec {
    nixosConfigurations.vulpix =
      nixpkgs.legacyPackages.x86_64-linux.pkgsCross.armv7l-hf-multiplatform.nixos {

        imports = [
          "${nixpkgs}/nixos/modules/installer/sd-card/sd-image-armv7l-multiplatform.nix"
          nixosModules.vulpix
        ];
      };
    images.vulpix = nixosConfigurations.vulpix.config.system.build.sdImage;

    nixosModules.vulpix = ({ lib, config, pkgs, ... }: {
      environment.systemPackages = with pkgs; [
        neofetch
      ];

      services.openssh.enable = true;

      users.users.root.openssh.authorizedKeys.keys = [
        "ssh-ed255119 AAAAAAAAAAAAAAAAAAsdfgjgkly idk i dont speak bottom"
      ];

      networking.hostName = "vulpix";
    });
  };
}


Ok so you put in a real SSH key in there but this is enough to get a built image. Run

nix build .#images.vulpix


and it’ll cross-compile a shitload of packages and a mainline linux kernel and load it up into result/sd-image/somethingorother.img.zst. Do a

zstdcat  | sudo dd of=/dev/sdWhatever bs=4M status=progress oflag=direct


and you will have a bootable SD card. It even outputs u-boot and kernel spew to the serial console! fuck yeah. The thing to pay attention to here is nixpkgs.legacyPackages.x86_64-linux.pkgsCross.armv7l-hf-multiplatform.nixos. We’re using this to actually cross-compile everything, similar to how we did in the flake.

Here I’ve hardcoded the host system to x86_64-linux because I don’t really care about trying to build this thing on other hosts right now, but we could do the same trick as with humility, using flake-utils to make it generic across multiple host builder architectures. Truthfully I just don’t feel like making that change and re-testing it before finishing this blog post.

dealing with pi bullshit

For the pi, I need you to hold the fuck up and maybe don’t do use the config I just gave you. Mainline kernel might work for you, but for us the st-link would just NOT work on mainline on the pi. I don’t know why. It was causing libusb error spam and breaking shit, so we needed to use the raspi vendor kernel.

BUT! Using the vendor kernel is different in some other exciting ways.

First off, out of the box you’ll get this error somewhere in the steps to building the SD image: modprobe: FATAL: Module ahci not found in directory /nix/store/gl48ccw2i45p80bkr43fpqpqi3xxw93v-linux-armv7l-u
nknown-linux-gnueabihf-6.1.21-1.20230405-modules/lib/modules/6.1.21

exciting right? There’s a workaround that we found on github.

Ok so the other issue is there’s some kernel bug I don’t understand that caused one or both of the ethernet adapters to fail out and not come up. We found this thread about a similar thing that suggested setting coherent_pool=4M in the kernel parameters. We tried that and it worked so. lol. lmao i guess. whatever.

{
  description = "Build image";
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-23.05";

  outputs = { self, nixpkgs }: rec {
    nixosConfigurations.vulpix =
      nixpkgs.legacyPackages.x86_64-linux.pkgsCross.armv7l-hf-multiplatform.nixos {

        imports = [
          "${nixpkgs}/nixos/modules/installer/sd-card/sd-image-armv7l-multiplatform.nix"
          nixosModules.vulpix
        ];
      };
    images.vulpix = nixosConfigurations.vulpix.config.system.build.sdImage;

    nixosModules.vulpix = ({ lib, config, pkgs, ... }: {
      environment.systemPackages = with pkgs; [
        neofetch
      ];

      services.openssh.enable = true;

      # deal with that "module ahci not found" error
      nixpkgs.overlays = [
        (final: super: {
          makeModulesClosure = x:
            super.makeModulesClosure (x // { allowMissing = true; });
        })
      ];

      users.users.root.openssh.authorizedKeys.keys = [
        "ssh-ed255119 AAAAAAAAAAAAAAAAAAsdfgjgkly idk i dont speak bottom"
      ];

      networking.hostName = "vulpix";

      # good luck
      # needed for the stlink to work
      boot.kernelPackages = lib.mkForce pkgs.linuxKernel.packages.linux_rpi2;

      # if you don't have this and you have 2 network devices plugged in
      # with the rpi kernel then networking breaks due to kernel bugs. lol.
      boot.kernelParams = [ "coherent_pool=4M" ];
    });
  };
}


Also don’t get me wrong, as annoying as the pi stuff is, this is still shockingly painless compared to what dealing with this sort of problem often looks like with other ditros/distro builders.

let’s add humility

Adding humility from the flake we defined previously is easy. We just add that flake as an input, and then add humility.packages.x86_64-linux.humility-cross-armv7l-linux to the environment.systemPackages. I don’t want to drop another full copy of the config with that change, and I want to leave these configs copy-pastable for your personal use, so if you need to see a full example config with humility imported, click this link for flake-with-humility.nix. Again, that x86_64-linux could be made generic across multiple build host architectures, but I didn’t bother.

Deploy Changes

I don’t want to pull the SD card out and re-flash it every time I make changes. It’s annoying, it wipes any persistent data I’ve put on there, it wastes write-cycles on the flash. There is a better way. We’re using deploy-rs because Xe recommended it to us, though we think there’s also something called “Morph” which fills a similar niche.

With deploy-rs, all you have to do is import it as an input and add a new deploy output to your flake, and then you can update the system on the fly by running nix run github:serokell/deploy-rs in the repo your flake is in. Or rather, that’s almost all you have to do. Here’s that section, see if you can spot the catch:

deploy.nodes.vulpix = {
  profiles.system = { 
    user = "root";
    path = deploy-rs.lib.x86_64-linux.activate.nixos nixosConfigurations.vulpix;
  };

  # this is how it ssh's into the target system to send packages/configs over.
  sshUser = "root";
  hostname = "host.of.the.system.that.it.should.ssh.into";
};


Yeah you see that x86_64-linux? That’s a binary that’s going to run on the target. Which is notably armv7l-linux for us. So… sigh, ok look, here’s where cross comp fails us. deploy-rs’s flake doesn’t support armv7l-linux, for no real reason other than the list it uses for supported systems doesn’t include it. We could fork the flake and add it, and then we could actually use armv7l-linux. But that will try and do the qemu-user compile which, as previously mentioned, is utter hell. If you’re targeting an aarch64 system from an x86_64 host maybe you don’t care because I think deploy-rs has a binary cache to cover you there. But in this case, we just left it as x86_64-linux, and took a different option, adding this to our raspi system config:

# needed for deploy-rs
boot.binfmt.emulatedSystems = [ "x86_64-linux" ];


Yes, we’re going to emulate the x86_64-linux binary on the 900MHz processor of the pi. This is actually fine because it doesn’t actually have to do much computationally, and it’s a rust binary so we’re at least emulating native code instead of like, the python interpreter. It’s genuinely not a problem to do this, I 100% recommend it.

At this point if you already flashed your SD card while following along, sorry, you’ll need to re-flash the SD card with the emulatedSystems change before you can start using deploy-rs. AFTER you do that, with a flake like the one below, you can start using deploy-rs to build new packages and send changes over.

{
  description = "Build image";
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-23.05";
  inputs.deploy-rs.url = "github:serokell/deploy-rs";

  outputs = { self, nixpkgs, deploy-rs }: rec {
    nixosConfigurations.vulpix =
      nixpkgs.legacyPackages.x86_64-linux.pkgsCross.armv7l-hf-multiplatform.nixos {

        imports = [
          "${nixpkgs}/nixos/modules/installer/sd-card/sd-image-armv7l-multiplatform.nix"
          nixosModules.vulpix
        ];
      };
    images.vulpix = nixosConfigurations.vulpix.config.system.build.sdImage;

    nixosModules.vulpix = ({ lib, config, pkgs, ... }: {
      environment.systemPackages = with pkgs; [
        neofetch
      ];

      nixpkgs.overlays = [
        (final: super: {
          makeModulesClosure = x:
            super.makeModulesClosure (x // { allowMissing = true; });
        })
      ];

      services.openssh.enable = true;

      users.users.root.openssh.authorizedKeys.keys = [
        "ssh-ed255119 AAAAAAAAAAAAAAAAAAsdfgjgkly idk i dont speak bottom"
      ];

      networking.hostName = "vulpix";

      # needed for deploy-rs
      boot.binfmt.emulatedSystems = [ "x86_64-linux" ];

      # good luck
      # needed for the stlink to work
      boot.kernelPackages = lib.mkForce pkgs.linuxKernel.packages.linux_rpi2;

      # if you don't have this and you have 2 network devices plugged in
      # with the rpi kernel then networking breaks due to kernel bugs. lol.
      boot.kernelParams = [ "coherent_pool=4M" ];
    });

    
    deploy.nodes.vulpix = {
      profiles.system = { 
        user = "root";
        path = deploy-rs.lib.x86_64-linux.activate.nixos nixosConfigurations.vulpix;
      };

      # this is how it ssh's into the target system to send packages/configs over.
      sshUser = "root";
      hostname = "host.of.the.system.that.it.should.ssh.into";
    };
  };
}


Any time you change this, you just run nix run github:serokell/deploy-rs and your changes are delivered! Basically the same as if you were editing a configuration.nix on a normal NixOS system and doing nixos-rebuild switch or whatever it is (sorry if that’s wrong I don’t use NixOS on my desktop sorry).

Here’s the most kick-ass part of this, is that it updates the boot configurations properly too. And, since we have a working u-boot console, we can actually choose which boot configuration to use at startup:

switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:2...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
------------------------------------------------------------
1:      NixOS - Default
2:      NixOS - Configuration 4 (2023-06-07 06:31 - 23.05pre-git)
3:      NixOS - Configuration 3 (2023-06-06 01:46 - 23.05pre-git)
4:      NixOS - Configuration 2 (2023-06-06 01:44 - 23.05pre-git)
5:      NixOS - Configuration 1 (1970-01-01 00:00 - 23.05pre-git)
Enter choice: 

It auto-boots into the latest one after a delay, but you can just go ahead and pick something else.

This saved my ass multiple times, both when trying to switch from the mainline kernel to the rpi kernel, and when I accidentally rearranged the network adapter to a different USB port and broke my network bridge configuration, because I was able to just boot back into a previously working config and then re-deploy a new configuration from there. No bullshit having to take the SD card out and chroot into it from my desktop or something to fix stuff.

So there you go. Cross compile shit to your tiny devices. Send new configs to them. Do it all without it being a royal pain and without it feeling like it’ll fall apart at any moment. I swear I’m not a Nix fan but this stuff is undeniably kinda cool.


Enable Crate Features in Helix Editor With rust-analyzer
2023-05-08T00:00:00+00:00
So you’re using helix, you’re working on a rust crate, and you need to turn a feature flag on. Here’s how:


  create a .helix folder in the project root.
  create a file .helix/languages.toml.
  In that file, put this in:


[[language]]
name = "rust"
[language-server.rust-analyzer.config.cargo]
features = [ "some_feature" ]


This will enable the feature! Restart helix (or reload config + restart lsp) and you’re good to go. You can also configure anything else you can find in rust-analyzer’s config docs because the config object is just passed right on in to rust-analyzer as json. Note that anywhere rust-analyzer’s docs write the rust-analyzer. prefix, you instead want language-server.rust-analyzer.config. in your toml.

For example, here’s what i have in my global languages.toml, to let rust-analyzer run longer and to use clippy:

[[language]]
name = "rust"
[language-server.rust-analyzer]
timeout = 120

# rust-analyzer docs specify this as `rust-analyzer.check.command`
[language-server.rust-analyzer.config.check]
command = "clippy"


If you want, you can also use .helix/ to configure other helix settings on a per-project basis like disabling autoformat or setting indent levels or whatever.


Fix Linux Suspend/Sleep in Gigabyte B550i Aorus Pro AX
2023-04-09T00:00:00+00:00
Linux won’t sleep on this motherboard out of the box. I have a rev1.1 motherboard running BIOS version F17b. No idea if this applies to rev1.2. There’s a workaround you can do which is to disable PCIe wakeup on GPP0 (GPP bridge to the m.2 NVMe drives): echo GPP0 | sudo tee /proc/acpi/wakeup. To make this persistent you need to run this command at boot. I do not know why it is like this. systemd/openrc methods to run this at boot:

systemd service:

[Unit]
Description=fix sleep

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=sh -c 'echo GPP0 > /proc/acpi/wakeup'

[Install]
WantedBy=multi-user.target


openrc:

Ensure local is enabled

rc-update add local default


Create /etc/local.d/fix-sleep.start. As root:

cat >/etc/local.d/fix-sleep.start < /proc/acpi/wakeup
EOF

chmod +x /etc/local.d/fix-sleep.start



Why I Switched to Helix Editor
2023-04-06T00:00:00+00:00
It works on illumos. It supports rust-analyzer out of the box. I like the controls. I like the color themes. I don’t need to install or configure plugins.


I got Mozilla’s syncstorage-rs working; free yourself from python2
2023-03-27T00:00:00+00:00
You may not know this, but you can run your own sync server for Firefox Sync, which lets you keep all your sync’d data entirely on your own servers. The github:mozilla-services/syncserver has been what you use to do this for awhile now, but it’s written in Python 2, which is becoming increasingly non-existent on modern Linux distributions since Python 2 is end-of-life. The replacement is github:mozilla-services/syncstorage-rs, a new rust implementation. I had some trouble just following the docs, but I did some code diving to find the missing details. Here’s what I got working on my machine.

At the end of this post you should have a mariadb or mysql database up and running, and you’ll be syncing your firefox instances through your very own syncstorage-rs server. These docs expect you to know how to install, set up, and manage your own SQL database, because I’m not prepared to educate a beginner on how to do that. What you won’t have at the end of this post is your own instance of the mozilla authentication provider. That means that you’ll still log in through a Mozilla Firefox Account, using an existing email and password for that if you have one (though you could also delete and recreate your account if you wanted).

You can theoretically set up your own auth provider too, but I haven’t looked into how, because ultimately all your device data will go through your own server regardless of what auth provider you use. It’d be a benefit for device metadata privacy though, which would be nice. Regardless, setting up your own auth server is out of scope for this post.

I don’t have a good understanding of the stability of this software right now. I’m just some girl on the internet; I don’t work at mozilla. Use this at your own risk, and if it ends up deleting all your data, well hey that’s probably good to know but I can’t do anything about it.

Also, I’m writing this against syncstorage-rs commit hash f416d8a8c44c4c294f9403b40f136bda85bdd709 from March 7, 2023. These instructions may not be applicable to other versions, newer or older.

So let’s get on with it.

1. Dependencies

First decide on a URL and port that you want to run the syncserver on. I’ll be using

http://umbreon.eq:8000


In this post. Anywhere you see that, replace it with your syncserver’s location.

To build syncstorage-rs, you need the mysql or mariadb client, and you also need development headers for it. Some distros split up the client/server, some bundle them all together. You’ll also need the database server if you intend to run the database on the same system as syncstorage-rs. Here’s some example commands for various distros:

# gentoo, with +server (default) on mariadb
emerge dev-db/mariadb

# arch linux, client + server
pacman -S mariadb

# arch linux, just the client
pacman -S mariadb-clients mariadb-libs

# void linux
xbps-install mariadb

# debianish (mariadb-server is the server pkg)
apt install libmariadb-dev

# rhelish
dnf intsall mariadb-devel


You also need Python 3 and virtualenv (or whatever you prefer to use instead of virtualenv, if you have preferences).

Oh and rust of course. Go grab rust from rustup or wherever you get your rust toolchains.

Once you have rust, you’ll also need diesel to run the database migrations to initialize the database.

cargo install diesel_cli --no-default-features --features 'mysql'


2. Clone and Build

Now you need to clone the syncstorage-rs project and install the main server program:

git clone https://github.com/mozilla-services/syncstorage-rs
cd syncstorage-rs
cargo install --path ./syncserver --no-default-features --features=syncstorage-db/mysql --locked


In the same directory, set up a python virtualenv and install these two sets of requirements. You need to run the syncstorage server from the environment because it runs some python itself as part of the authentication process. You also need this environment to run some of the commands we’ll be using to populate the database.

virtualenv venv
source venv/bin/activate
pip3 install -r requirements.txt
pip3 install -r tools/tokenserver/requirements.txt


3. Initialize the Database

Now we can initialize the databases syncstorage / tokenserver (both bundled up in syncserver) need. This assumes you’ve already done basic setup on your mysql/mariadb database and have a root user you can access.

Create the user and databases:

SYNCSTORAGE_PW="$(cat /dev/urandom | base32 | head -c64)"
printf 'Use this for the syncstorage user password: %s\n' "$SYNCSTORAGE_PW"

# login as root sql user using whatever creds you set up for that
# this sets up a user for sync storage and sets up the databases
mysql -u root -p <


Run the migrations to setup the initial database structure. From the syncstorage-rs folder:

# syncstorage db
$HOME/.cargo/bin/diesel --database-url "mysql://syncstorage:${SYNCSTORAGE_PW}@localhost/syncstorage_rs" migration --migration-dir syncstorage-mysql/migrations run

# tokenserver db
$HOME/.cargo/bin/diesel --database-url "mysql://syncstorage:${SYNCSTORAGE_PW}@localhost/tokenserver_rs" migration --migration-dir tokenserver-db/migrations run


Add the sync endpoint to the services table in the tokenserver db:

mysql -u syncstorage -p"$SYNCSTORAGE_PW" <


Now you need to add a “node”. A node is any instance of syncserver to which a client can be allocated. You probably only want one node and that node is the server we’re setting up literally right now. You’ll also specify the user capacity, which indicates how many separate firefox accounts can use your server to sync. If it’s just you using this, you could set this to 1.

# the 10 is the user capacity.
SYNC_TOKENSERVER__DATABASE_URL="mysql://syncstorage:${SYNCSTORAGE_PW}@localhost/tokenserver_rs" \
    python3 tools/tokenserver/add_node.py \
    http://umbreon.eq:8000 10


4. Set up your config file

There’s a sample config file in config/local.example.toml, but we need to change most of the URLs because we want to run against mozilla’s prod environment instead of their staging environment. Rather than tell you how to edit that, just run this command to generate a good file.

MASTER_SECRET="$(cat /dev/urandom | base32 | head -c64)"
METRICS_HASH_SECRET="$(cat /dev/urandom | base32 | head -c64)"
cat > config/local.toml <


5. Securing your connections

I personally don’t feel like leaving this thing running on the open internet for anyone to use. You open yourself up to all sorts of fun possibilities like your hard drive filling up from other peoples’ data, or someone discovering a vulnerability in the service and using is to hack your server. I’m running mine on a local network with my own VPN setup to make that work, but this is also a good application for tailscale if you use that (I don’t). I heard tailscale recently got beta support for custom OIDC providers, neat!

The config file generated by the command I gave you above restricts the server to run on localhost so that you don’t get a server open to the entire world just by copy-pasting commands out of this post. You can set it to a specific IP to listen on a particular instance, but it’ll be serving over unencrypted HTTP, so you only really want to do this if you’re putting it on a VPN interface instead of an internet-accessible IP. If you want TLS encryption, reverse proxy it behind nginx or caddy or something like that.

6. Run the server

It’s finally time! Make sure that whenever you run the syncserver you’re doing it from the python virtualenv so it can run the python needed for the authentication.

~/.cargo/bin/syncserver --config=config/local.toml


6.1 A Note on Time

The authentication process is very sensitive to time. If your server is more than a couple seconds behind what the global NTP network agrees the current time is, authentication will just silently fail, and you won’t know why your browser won’t authenticate correctly, it’s very confusing to debug. The root cause is that when the sync server’s time is behind, the JWT token will be considered valid in the future, but not now. The sync server silently eats this error:

tokenserver-auth/src/verify.py

        except (ClientError, TrustError):
                    return None


7. Configure Firefox

First, log out of Firefox Sync if you’re logged in. I’m not 100% sure if this is necessary, but it’s what I did. Then open about:config.

Set identity.sync.tokenserver.uri to http://umbreon.eq:8000/1.0/sync/1.5, replacing http://umbreon.eq:8000 with your sync server’s location.

Restart Firefox.

Log in to Firefox Sync like normal. Then configure another Firefox the same way and log in there too. If everything goes well, data should transfer over! If not everything goes well, then you might see some things sync and not others, or nothing happen at all.

If you need to debug, about:sync-log is your friend.

8. Make it persistent

Now you should probably set up a system service to start it up automatically at boot. I’ll leave that for you to figure out how to do, because statistically you’re probably on a systemd-based system, but I don’t have any of those handy to try out a systemd service. Here’s some resources that can get you started though.

Remember, whatever script you write to start the service up, it needs to activate the virtualenv first! And you probably want to run it from the syncstorage-rs folder as the working directory, though I’m not 100% sure that’s necessary.


  systemd - arch linux wiki
  openrc - openrc docs
  runit - runit docs


My friend also sent me this sample systemd service from her setup, which you can maybe adapt to your needs.

[Unit]
Description=Mozilla Firefox Sync
Wants=mysql.service
After=network.target mysql.service

[Service]
Environment="VIRTUAL_ENV=/path/to/syncstorage-rs/venv"
ExecStart=/path/to/syncstorage/binary --config=/path/to/config.toml
Restart=on-abort

User=syncstorage
Group=syncstorage
UMask=007

NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true

[Install]
WantedBy=multi-user.target



Firefox: Disable ‘A web page is slowing down your browser’
2023-03-25T00:00:00+00:00
This warning serves no purpose on slow computers where like half of modern web-apps trigger it (thanks modern web-apps). To disable it, go into about:config and set dom.ipc.processHangMonitor and dom.ipc.reportProcessHangs to false.


a practical guide to self-induced {trance?, hypnogogia?, daydreaming?} for fun and vibes
2023-03-15T00:00:00+00:00
A strategy so simple that a friend described it to me in a few sentences, off hand, not intending to provide me instruction. and I went and did it, and it worked anyway.

We were hanging out. She says to me offhand:


  yeah I heard some people induce psychedelic states by just laying still at a bit of an angle and not moving for awhile. your body just slowly forgets it exists because it’s not getting any change in sensory input.


She was closer to correct that maybe she knew. i’m not sure i’d directly compare my experiences to drugs, but it’s nice in its own way. And I found it led me quite easily into mental states I’d only incidentally found myself stumbling into before. At any rate, I think it’s fun, so here’s some writing:

Here’s what you do: lay somewhere comfortable. As to the “angle” thing, I like to do this sitting on a couch, in an armchair, anywhere else I can set myself down for awhile without moving. You can also do this just lying flat, you’ll just be more prone to falling asleep.

Put on some music, if you want. Sometimes I like music that has enough going on that I can use it as a mental focal point that is more in my mind than the rest of my body. But that also has little enough going on that it doesn’t dominate my thoughts, that it can fade into the background and become the skybox of the world. Other times I like to just sit with an album, let it become my world, immerse myself in it.

If you’d like, set some sort of indicator for being “done”. For me it’s usually just the end of the album I’m listening to. It could also be a gentle alarm. Nothing startling, nothing loud- just a signifier to remind you of the outside world.

And then you really do just, lay there. And don’t move. If you feel the urge to move anything, don’t do it. This is perhaps the trickiest part.

If you’re particularly physically uncomfortable though, you can move a little bit to do something about it. It’ll bring you out of things a bit, but it’s not a hard-reset. Going in is gradual, falling out is too.

You can close your eyes, if you want. Or you can leave them open. You’ll find quite the lovely patterns on the world around you if you leave them open. You may find a more vivid experience of the inner if you keep them closed. The experience is different depending. Experiment.

If you do need a focal point in your body, focus on your breath.

As you lay there, you’ll feel proprioception begin to dissipate. Your body doesn’t feel as though it’s going numb, you simply, stop perceiving it at all. It’s gradual for me, beginning at my extremities and working towards my core, until I’m nothing but a bit of an orb, not entirely even in a world anymore.

You may also think this sounds a bit dissociative. As someone(s) intimately familiar with many forms of dissociation, the state we find ourselves in from this does not taste distinctly dissociative, in the same fashion that finding we’ve tuned out the sound of a fan doesn’t feel dissociative. But it can be used as a liminal stepping-stone to certain dissociative states, if one so chooses, among many other things one can do with it.

And what you do with this state is up to you. It can be nice to simply exist in it, let your consciousness float off- a strange resting state that isn’t quite sleep but isn’t quite awakeness, a mist of hypnogogia. Nothing wrong with getting a bit dreamy.

It can also be nice to use this as a way to get more in touch with your imagination. Your inner world, if you have one- or if you want to have one but don’t, this is certainly one of myriad ways to begin.

While writing it occurs to me you could perhaps in this state build for yourself a new proprioception, if you want. Let me know how that goes, if you try it.

If you’ve kept your eyes open- we find our latent apophenia makes itself known, patterns tracing themselves along the walls, holographic colors shifting and swirling this way and that, shapes coming forth. You can practice influencing them, practice quite literally changing how you see the world, bit by bit.

And you can even get up and walk around if you’d like. You may snap out of it, or, with a bit of practice and mental trickery, you may not. It can be fun to travel the world, with the veil semi-pierced, the world of fantasy leaking in a bit.

So many options, or do nothing at all. It doesn’t matter really.

. . . . . . . . .

And, to further contextualize a concern you may have: the feeling of trance can feel a bit like sleep to one unfamiliar with it. but it is different. Personally, afterwards, I don’t feel that I’ve slept. Neither do I ever feel my mental state simply cease, the way I do when I doze off. True though, the most liminal of moments I find myself in feels somewhat connected to sleep, in a way I don’t fully understand. Though if you do end up accidentally taking a nap: perhaps you needed it! But! That is what I use the music and other focii for- I can actively focus in on them to keep myself waking, if I feel the call of sleep pulling at me a bit too hard.

Anyways, that’s all I have for you. Nothing complicated, nothing too odd. Just a fun little thing you can try.




Copying NixOS’s Live CD to RAM: a short look at NixOS’s early boot
2023-03-07T00:00:00+00:00
Due to Circumstances, I want to boot a NixOS live ISO in such a way that the storage medium can be removed after boot-up. This is at a technical level rather simple. NixOS live images store the Nix store in a SquashFS, and everything else is already in RAM via tmpfs anyway. So in theory all we need to do is copy the SquashFS to ram before it gets mounted. But that does raise the question: how is the SquashFS mounted in the first place? And how can we change that? If you were a reasonable person you would read nixos.wiki/wiki/Bootloader, and you would find your answer: pass copytoram in boot.kernelParams. I am not reasonable, and figured it out from the source code, which is what the remainder of this post is about.

Our first clue is in nixos/modules/installer/cd-dvd/iso-image.nix:

  # store them in lib so we can mkImageMediaOverride the
  # entire file system layout in installation media (only)
  config.lib.isoFileSystems = {
    "/" = mkImageMediaOverride
      {
        fsType = "tmpfs";
        options = [ "mode=0755" ];
      };

    # Note that /dev/root is a symlink to the actual root device
    # specified on the kernel command line, created in the stage 1
    # init script.
    "/iso" = mkImageMediaOverride
      { device = "/dev/root";
        neededForBoot = true;
        noCheck = true;
      };

    # In stage 1, mount a tmpfs on top of /nix/store (the squashfs
    # image) to make this a live CD.
    "/nix/.ro-store" = mkImageMediaOverride
      { fsType = "squashfs";
        device = "/iso/nix-store.squashfs";
        options = [ "loop" ];
        neededForBoot = true;
      };

    "/nix/.rw-store" = mkImageMediaOverride
      { fsType = "tmpfs";
        options = [ "mode=0755" ];
        neededForBoot = true;
      };

    "/nix/store" = mkImageMediaOverride
      { fsType = "overlay";
        device = "overlay";
        options = [
          "lowerdir=/nix/.ro-store"
          "upperdir=/nix/.rw-store/store"
          "workdir=/nix/.rw-store/work"
        ];
        depends = [
          "/nix/.ro-store"
          "/nix/.rw-store/store"
          "/nix/.rw-store/work"
        ];
      };
  };


/ will be a tmpfs. /iso will be the USB stick or DVD or whatever we’re booting from. The SquashFS inside of it gets mounted and combined with a second tmpfs to provide the Nix store from the boot media while alloying temporary in-memory additions. But how does this translate into the actual bootup process?

Let’s look at nixos/modules/system/boot/stage-1-init.sh. This is the script that gets installed as /init in the initramfs, so it’s the very first thing that gets executed during the bootup process. Here’s the section related to mounting our set of file systems, though, don’t bother reading this whole snippet. I’ll highlight the important things after.

exec 3< @fsInfo@

while read -u 3 mountPoint; do
    read -u 3 device
    read -u 3 fsType
    read -u 3 options

    # !!! Really quick hack to support bind mounts, i.e., where the
    # "device" should be taken relative to /mnt-root, not /.  Assume
    # that every device that starts with / but doesn't start with /dev
    # is a bind mount.
    pseudoDevice=
    case $device in
        /dev/*)
            ;;
        //*)
            # Don't touch SMB/CIFS paths.
            pseudoDevice=1
            ;;
        /*)
            device=/mnt-root$device
            ;;
        *)
            # Not an absolute path; assume that it's a pseudo-device
            # like an NFS path (e.g. "server:/path").
            pseudoDevice=1
            ;;
    esac

    if test -z "$pseudoDevice" && ! waitDevice "$device"; then
        # If it doesn't appear, try to mount it anyway (and
        # probably fail).  This is a fallback for non-device "devices"
        # that we don't properly recognise.
        echo "Timed out waiting for device $device, trying to mount anyway."
    fi

    # Wait once more for the udev queue to empty, just in case it's
    # doing something with $device right now.
    udevadm settle

    # If copytoram is enabled: skip mounting the ISO and copy its content to a tmpfs.
    if [ -n "$copytoram" ] && [ "$device" = /dev/root ] && [ "$mountPoint" = /iso ]; then
      fsType=$(blkid -o value -s TYPE "$device")
      fsSize=$(blockdev --getsize64 "$device" || stat -Lc '%s' "$device")

      mkdir -p /tmp-iso
      mount -t "$fsType" /dev/root /tmp-iso
      mountFS tmpfs /iso size="$fsSize" tmpfs

      cp -r /tmp-iso/* /mnt-root/iso/

      umount /tmp-iso
      rmdir /tmp-iso
      if [ -n "$isoPath" ] && [ $fsType = "iso9660" ] && mountpoint -q /findiso; then
       umount /findiso
      fi
      continue
    fi

    if [ "$mountPoint" = / ] && [ "$device" = tmpfs ] && [ ! -z "$persistence" ]; then
        echo persistence...
        waitDevice "$persistence"
        echo enabling persistence...
        mountFS "$persistence" "$mountPoint" "$persistence_opt" "auto"
        continue
    fi

    mountFS "$device" "$(escapeFstab "$mountPoint")" "$(escapeFstab "$options")" "$fsType"
done

exec 3>&-


So a few things of note. The actual mount data is pulled from some source called fsInfo. That’s provided by nixos/modules/system/boot/stage-1.nix:

fsInfo =
  let f = fs: [ fs.mountPoint (if fs.device != null then fs.device else "/dev/disk/by-label/${fs.label}") fs.fsType (builtins.concatStringsSep "," fs.options) ];
  in pkgs.writeText "initrd-fsinfo" (concatStringsSep "\n" (concatMap f fileSystems));


And fileSystems is that list of mount points we saw earlier! This is converting that list into a file that’s easy to parse line by line from bash at early boot.

However, before we go any further, maybe we don’t need to do any work at all to get our squashfs in ram. See this?

    # If copytoram is enabled: skip mounting the ISO and copy its content to a tmpfs.
    if [ -n "$copytoram" ] && [ "$device" = /dev/root ] && [ "$mountPoint" = /iso ]; then
      # ... snip ...

      mountFS tmpfs /iso size="$fsSize" tmpfs
      cp -r /tmp-iso/* /mnt-root/iso/

      # ... snip ..
    fi


That’s doing literally exactly what I want, mounting a tmpfs on /mnt-root/iso and copying the SquashFS (and the rest of the ISO contents) into it. So how can we enable copytoram? Earlier in the script, there’s a little loop that parses any arguments that were passed in as kernel boot parameters:

for o in $(cat /proc/cmdline); do
    case $o in
        # ... snip ...
        copytoram)
            copytoram=1
            ;;
        # ... snip ... 
    esac
done


So all we need to do to boot from ram is to pass copytoram as a kernel parameter? Sweet! That’s an incredibly simple change to our system definition:

boot.kernelParams = [ "copytoram" ]


Hah! Easy! Let’s make sure it actually worked though. The easiest way for me to tell is maybe by booting up the ISO as a remote-attached ISO through the BMC of one of my computers. It gives me a little read-out of how much data has been loaded over the network. My ISO is 284MiB large, so we should expect about that much data transferred, give or take 1000-vs-1024 measurements. But enough talk, time to boot:

<<< NixOS Stage 1 >>>

loading module loop...
loading module overlay...
loading module dm_mod...
running udev...
Starting version 251.12
starting device mapper and LVM...
mounting tmpfs on /...
waiting for device /dev/root to appear.......
mounting tmpfs on /iso...


There it is, mounting tmpfs on /iso..., exactly what we want to see. And then my console sat here for awhile as my KVM counted the ISO bytes transmitted up to a nice 284MiB. Just to be sure, I’ll detach the ISO, check the mount point, and then checksum the SquashFS.

# mount
tmpfs on /iso
/iso/nix-store.squashfs on /nix/.ro-store

# sha256sum /iso/nix-store.squashfs
4bb86abad14682f73105943b710e26864a1c6f063f01d9728e489ef98f034039 /iso/nix-store.squashfs


It does in fact work! Thanks, NixOS.


ffmpeg v4l2 requests 4.4.3 patchset
2023-03-06T00:00:00+00:00
I rebased the 4.4 branch of jernejsk/FFmpeg on ffmpeg 4.4.3. You can find that at faithanalog/FFmpeg-v4l2. This lets you use hardware accelerated video decoding on anything that uses v4l2-requests (hantro, rkvdec, etc.). Tested on my Quartz64. You can also download a .patch file that applies cleanly to upstream 4.4.3: v4l2-4.4.3.patch. If you’re on gentoo, drop this in /etc/portage/patches/media-video/ffmpeg-4.4.3/. Tested on my Quartz64 and nowhere else. You also need to build with --enable-v4l2-request --enable-libudev. On gentoo, use EXTRA_FFMPEG_CONF='--enable-v4l2-request --enable-libudev' emerge ffmpeg and consider putting this in /etc/portage/env so you don’t forget later.

I had to drop hevc support because the original code was written against an unstable version of the hevc v4l2-requests API, and the API that ended up in mainline is slightly different. jernejsk/FFmpeg has a branch based on ffmpeg 5.x that has seen more recent work. There might be a working hevc implementation in there that could be backported. I haven’t tried. I probably won’t because I don’t really watch hevc content, and quartz64 only has hantro working right now, which can’t do hevc anyway. You should give that a look if you’re interested though.

You should use this with mpv’s new dmabuf-wayland backend for buttery smooth video playback under wayland. It eliminates a lot of the memory copies that were in the pipeline previously.

mpv --hwdec=drm --vo=dmabuf-wayland 


Unfortunately the on-screen display doesn’t work with this video output yet. No visual controls, no captions. It’s still experimental.

Make sure you’re in the video group so you can access the right devices. Also make sure you’re using an up to date dtb. I used the dtb in linux 6.1.9 and it worked for me.


Blazingly Fast Lua Serialization
2023-02-23T00:00:00+00:00
You’re writing lua, you want to serialize and deserialize data, and you want to pick the best format/library pairing for the job. What’s good? I’ve been doing some testing to find out. Here’s the short version: If you want the fastest option and you can choose the format, use lua-cbor if you need it to be pure lua, or use lua-protobuf if you’re cool with a C library. If you need JSON, use either lunajson for pure lua, or lua-cjson for a faster C implementation. And now, the details.

JSON

Not much to say about JSON. You’re not going to get great speeds out of this no matter what, but it’s everywhere, and you’ll need it eventually. The long-lived lua-cjson library is going to be fastest for you if you’re cool with a C-based library. If you want pure lua, use lunajson.

Unfortunately, JSON being JSON, there’s not a great way to put binary data in here. Both lua-cjson and lunajson will happily encode a binary string directly into the output JSON (and successfully decode it too!) regardless of if it’s valid Unicode, immediately violating the JSON spec and making many decoders very unhappy with you. If you need to put binary data in you’re probably best off base64-encoding it.

MessagePack

MessagePack is a schema-less format like JSON, but it’s a binary format. This makes it much more bandwidth-efficient than json, and for parsers it also means it can encode numbers and strings a lot faster. Unfortunately the lua implementations both leave a bit to be desired, so consider using CBOR instead if you can (see the next section).

There’s two options here. They both have some important jankyness with regards to how they encode strings.

The first option is kieselsteini’s msgpack. It runs lua’s utf8.len function on all strings before encoding them- if that length comes back successfully, lua validated it as utf8, and the library encodes it as a utf8 string. Otherwise it sends it as a binary string. This imposes a pretty big cost if you’re primarily transmitting binary data. It’ll also cause problems if the other end is expecting a value to be always-binary or always-utf8, because it’ll see both depending on the contents. But, it won’t be mis-tagging binary data as utf8, so that’s good.

The second option is fperrad’s lua-MessagePack. This one is janky in a different way: out of the box it tags all strings as utf8, even if they’re binary strings. As a result, you might generate errors in whatever you’re sending the data to, as it tries to decode binary data as utf8. You can change this behavior globally to tag all strings as binary by calling MessagePack.set_string('binary'), which at least is technically correct.

lua-MessagePack also lets you specify a custom encoder for a piece of data, which you can use to switch out the string tagging for a specific piece of data only, but it looks a bit cumbersome to use with nested data structures. That said, if all you need is binary strings, lua-MessagePack is probably going to be about as applicable to the task as the CBOR library I’ll talk about next, but I just find it a bit more of a hassle to use.

If lua version compatibility is a concern for you, kieselteini’s msgpack requires lua5.3. fperrad’s lua-MessagePack provides both a 5.1 compatible version and a 5.3 compatible version, which are available as separate packages on luarocks.

CBOR

Out of JSON, MessagePack, and CBOR, you’re going to have the best time with CBOR. CBOR, like JSON and MessagePack, is a schema-less data format. CBOR is inspired by/derived from MessagePack so at a protocol level it’s very similar. It’s a binary format, it’s bandwidth-efficient, etc. The lua-cbor library is well written, incredibly fast, has good defaults, but is flexible enough to handle mixed string formats in a reasonable fashion.

Its default behavior is to tag all strings as binary strings (a safe default!). It correctly handles the strange array/map duality of lua tables in the most efficient way it can do safely (serializing both variants concurrently and only writing out the correct one at the end). But, it also makes it possible to override these behaviors in your data structures.

Your first option it to pass in an options table as the second argument to cbor.encode. For example, if you want to encode a data structure composed entirely of arrays without any key-value maps, you could run:

local array_mt = { __name = "array" }
local options = {
    [array_mt] = cbor.type_encoders.array
}
cbor.encode(my_array, options)


But the bit I’m happy about is you can also set encoders in the metatables for values. In this case, I’ll return to the string example: if you have a table containing a utf8 string, and you care about tagging it as such, you can set a custom encoder in its metatable.

local table_with_utf8strings_mt = {}
table_with_utf8strings_mt.__tocbor = function(table, opts)
    -- encode table as a map with all strings tagged as utf8 strings
    local old_string_encoder = cbor.type_encoders.string
    cbor.type_encoders.string = cbor.type_encoders.utf8string
    return cbor.type_encoders.map(table, opts)
    cbor.type_encoders.string = old_string_encoder
end

setmetatable(my_table, table_with_utf8strings_mt)


This is a bit grungy honestly and I’m not super thrilled about it, but at least it’s possible if you need it. You could even use these custom encoders if you want a custom wire format for encoding your table (maybe you want to omit some fields?), without having to write all the machinery yourself.

As far as lua compatibility goes, lua-cbor will work with lua5.1. With lua5.2 it’ll go faster with the help of the bitshift operators, and with lua5.3+ it’ll go faster still by using string.pack/string.unpack. All of that is handled transparently for you, it selects whatever’s available when you require() it.

There is another CBOR implementation worth mentioning which is org.conman.cbor. This has the benefit of supporting CBOR extensions, which you probably don’t need, but if you need to interface with something that is using them, it’s an option. But it comes at a cost; despite boasting about how parts of it are implemented in C, I actually saw lua-cbor encode/decode data 15-25x times faster than this C library. Absurd!

Protobufs

Ah, protobufs. Love them or hate them, they exist, and you might need or even want to use them. In that case lua-protobuf has you covered, for both protobufs version 2 and 3.

Unlike the other data formats listed here, Protobufs uses a schema, meaning you define in advance what your data looks like. lua-protobuf makes this pretty ergonomic. You can include your schema directly in your lua code as a multi-line string and parse it at startup, or you can pre-compile your schema to a binary format. In exchange for using a schema and importing some C code, you get some incredible speeds.

For data which is primarily strings or blobs, you’ll see about the same speed as as lua-cbor since for both libraries all the time spent there is pretty much just memcpy()s. On the other paw, for complex data structures and specially anything with large arrays, you can see on the order of 20x faster speeds with protobufs than lua-cbor.

There is a quirk you need to be aware of: the currently loaded schema is global state. Thankfully, there is a way to work with this design. lua-protobuf provides pb.state(), a function you can use to grab a copy of the current state. Then you can reset it, load up some new state, do whatever serialization you need, and put the old state back. This lets you juggle multiple schemas within the same program if you need to (although for most usecases you won’t need to).

It’s a worthwhile price to pay for the performance if you need it. I don’t think protobufs can really be beat here, short of writing your own library in C to hand-roll a protocol.

In Conclusion

JSON gets the data there. CBOR gets it there faster. Protobufs gets it there at ludicrous speed. What you ultimately use probably depends on more than just what’s fastest, particularly if you’re interoperating with some other code. But there’s probably your best options.


Anti-Fandom Action! Hosting BreezeWiki with Caching and WildCard DNS
2023-02-22T00:00:00+00:00
There’s this website online that’s a bit notorious for being awful, and also for being everywhere: fandom dot com. Fandom hosts a lot of wikis, some of which have existed for over a decade now. They used to be known as wikkia and provided the quite-useful service of a hosted MediaWiki instance. That’s still what they do actually, but over time they’ve become more and more malignant. I don’t know the full story, what happened with management, whatever, but these days when you go on a Fandom page you’re bombarded with ads for media you don’t care about, weird trivia quizzes, obnoxious animations, and all of this slows your browser down and gets in the way of the page you were actually trying to read. BreezeWiki is a proxy that fixes that, and you can even run your own! You can point the getindie browser extension at your instance or another person’s instance and it’ll turn that pit of despair into a nice smooth browsing experience, and recommend alternative independently hosted wikis if they exist. If that’s all you want to do, go download that extensions, you’re free. But if you want to run your own, that’s what the rest of this post is for.

So before I go into the details on my setup, you should know that there’s Official BreezeWiki Documentation on how to run your own instance and it’s pretty good. It’ll get you from 0 to running most of the time, and those docs will be up to date after this post stops being up to date. But for completeness I’m going to cover the whole thing.

Installing BreezeWiki

You’ll want to use a system with at least 1GB of ram (that’s what I’m using). You can scrape by with less, but BreezeWiki is going to take a few hundred megabytes on its own.

BreezeWiki is written in Racket. If you’re on x86-64 and you don’t want to set up Racket, you can just download the binary distribution of BreezeWiki as the official docs say. You should be able to unpack and run breezewiki-dist/bin/dst. I instead opted to install Racket and run it from a git clone.

I’m running on Debian Bullseye, which has a version of Racket that’s too out of date to run BreezeWiki, but the version in bullseye-backports is new enough. You need the backports repo in your apt sources, and then you can install it.

echo 'deb http://deb.debian.org/debian bullseye-backports main' | sudo tee /etc/sources.list.d/bullseye-backports.list
apt update
sudo apt install -t bullseye-backports racket


More generally, you need at least Racket version 8.4. If your distribution doesn’t provide that, you can get an up to date version of racket from download.racket-lang.org.

After installing Racket, I created a breezewiki user to run the code under:

useradd -m breezewiki


Then I cloned the git repository into /opt/breezewiki:

cd /opt
sudo git clone https://gitdab.com/cadence/breezewiki.git
chown -R breezewiki:breezewiki breezewiki


We need to install the dependencies.

sudo -iu breezewiki bash -c 'cd /opt/breezewiki && raco pkg install --auto'


We also need to configure breezewiki,

sudo -iu breezewiki nano /opt/breezewiki/config.ini


and here’s what my config looks like:

canonical_origin = https://yourcoolbreezewiki.com
debug = false
feature_search_suggestions = true
log_outgoing = false
port = 10416
strict_proxy = false


I want to highlight strict_proxy here. If you turn that on then your BreezeWiki instance will download images from fandom and then re-serve them to anyone using your instance. As a user this is pretty nice because it means even less interaction with fandom, but right now there’s some edge-cases that mean if you turn this on some pages will break and not look right. If you’re ok with that, you can turn it on, but for now I’ve been told it’s best to leave it off. Hopefully I can turn that on later! However, you may want to keep it off forever if you don’t have the bandwidth to support hosting the images yourself.

The last thing is that because we’re reverse proxying breezewiki with Nginx (we’ll get there soon), it doesn’t make sense to have breezewiki listening on a network interface accessible to the broader internet. You could firewall it off, or you can edit the racket code to make it listen on 127.0.0.1 in release mode by copying this command to patch it:

cd /opt/breezewiki
sudo -u breezewiki git apply <number (config-get 'port))
    (λ (quit)
      (channel-put ch (lambda () (semaphore-post quit)))
diff --git a/dist.rkt b/dist.rkt
index deb08a8..9d4fdf3 100644
--- a/dist.rkt
+++ b/dist.rkt
@@ -20,7 +20,7 @@
 (require (only-in "src/page-file.rkt" page-file))
 
 (serve/launch/wait
- #:listen-ip (if (config-true? 'debug) "127.0.0.1" #f)
+ #:listen-ip "127.0.0.1"
  #:port (string->number (config-get 'port))
  (λ (quit)
    (dispatcher-tree
EOF


This is technically optional, but I like knowing that all the traffic is going through my nginx server.

Finally, we need a service to run BreezeWiki. My installation is using Systemd, so here’s a systemd service for you to use. Adapt this to other systems as necessary. If you’re using systemd, put this in /etc/systemd/system/breezewiki.service:

[Unit]
Description=breezewiki is cool
After=network.target

[Service]
User=breezewiki
Group=breezewiki
WorkingDirectory=/opt/breezewiki
ExecStart=/usr/bin/racket /opt/breezewiki/dist.rkt
Restart=on-failure
# everything after this point is just hardening
InaccessiblePaths=/etc/nginx /etc/letsencrypt /etc/passwd /etc/group
ReadWritePaths=/opt/breezewiki/storage
ReadOnlyPaths=/etc/racket /etc/resolv.conf /etc/hosts /usr/share/racket /usr/lib/racket /usr/include/racket /usr/share/doc/racket
PrivateDevices=true
ProtectControlGroups=true
ProtectHome=read-only
ProtectKernelTunables=true
ProtectSystem=full
PrivateTmp=true
ProtectProc=invisible
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
NoNewPrivileges=true
RestrictNamespaces=true
RestrictAddressFamilies=~AF_UNIX


[Install]
WantedBy=multi-user.target


Then run

sudo systemctl daemon-reload
sudo systemctl enable --now breezewiki


Give it a minute and check that it’s running ok

sudo systemctl status breezewiki


Updating BreezeWiki Later

If you cloned the source from git, then later on when you want to update
BreezeWiki you’ll need to do this:

sudo -iu breezewiki bash -c '
    cd /opt/breezewiki \
    && git pull --rebase --autostash \
    && raco pkg install --auto --skip-installed \
    && raco pkg update --auto
'
sudo systemctl restart breezewiki


The git stash push/pop are only necessary if you applied my patch to make it
listen on 127.0.0.1. Hopefully that’ll just be a config option in the ini later
you so don’t need that patch at all. If in doubt, just delete the breezewiki
folder, re-clone it, and put your config back.

Setting up Nginx

Now we need to set up Nginx.

sudo apt install nginx
sudo rm /etc/nginx/sites-enabled/default
sudo mkdir -p /var/cache/breezewiki/nginx /var/www/breezewiki
sudo chown -R www-data:www-data /var/cache/breezewiki /var/www/breezewiki
sudo nano /etc/nginx/sites-enabled/breezewiki


Here’s the general config file. You should read through this and take note of
the comments that tell you when you need to think about a setting and change it
to match your setup.

# this sets up a response cache at /var/cache/breezewiki/nginx.
# Leave levels=1:2 alone, leave keys_zone alone You should adjust
# max_size= to be however much space you want to use for caching. If you
# aren't caching images, 60gigs is extremely overkill. You won't see a ton
# of benefit beyond a couple gigs. If you are caching images, go ham.
# inactive= specifies how long a file will stay on disk until it gets deleted
# (this is NOT how long nginx will wait before refreshing the cache for that
# file). You can set it to whatever you want as long as it's longer than your
# cache time that you set later down in the file.
proxy_cache_path /var/cache/breezewiki/nginx levels=1:2 keys_zone=breezewiki_cache:50m
                 max_size=60g inactive=7d use_temp_path=off;


server {

    # If you're going to set up wildcard DNS, leave this as an underscore.
    # otherwise you should set this to whatever domain you're hosting
    # breezewiki on. For example
    # server_name https://yourcoolbreezewiki.com
    server_name _;

    root /var/www/breezewiki;

    # Used if you go for HTTPS letsencrypt verification strategy, not
    # necessary if you're doing wildcard DNS but it doesn't hurt anything.
    location /.well-known {
        allow all;
    }

    # see https://www.nginx.com/blog/nginx-caching-guide/
    location / {
        proxy_cache breezewiki_cache;
        proxy_cache_use_stale error timeout updating http_500 http_502
                              http_503 http_504;
        proxy_cache_lock on;
        proxy_cache_background_update on;
        proxy_ignore_headers Cache-Control;

        # 24 hour caching is probably ok for a wiki? idk.
        proxy_cache_valid 404 10m;
        proxy_cache_valid 200 301 302 72h;
        proxy_cache_valid any 1m;

        # use the cookie too so we cache themes correctly
        proxy_cache_key $host$proxy_host$request_uri$cookie_theme;

        proxy_pass http://127.0.0.1:10416;
        proxy_set_header Host $host;

    }

    # certbot will change this to 443 for you if you're using certbot with the
    # HTTPS verification strategy. If you want to do wildcard DNS, I'll give you
    # some changes to make to this file later in the post.
    listen 80;
}


Now, reload nginx

systemctl reload nginx


If you want to host breezewiki on a single domain then you’re almost done. You just need to set up Letsencrypt for the HTTPS certificate. Go check out certbot’s homepage if you need help using certbot and just want to host breezewiki on a single domain, and then you’re done! If you want to do something a bit more advanced, keep reading.

Bonus! Wildcard DNS with Letsencrypt

BreezeWiki can take advantage of Wildcard DNS to make using it a bit nicer for anyone trying to use it manually (instead of with a browser extensions).

Normally, when you’re on a page and want to use breezewiki you need to go up to the URL (say minecraft DOT fandom DOT com), and then edit it to yourcoolbreezewiki.com/minecraft. Kind of annoying because you need to move the minecraft from the start to the END of the URL. If you set up Wildcard DNS, then you can just change it to minecraft.yourcoolbreezewiki.com which is a bit nicer to do. But, the setup is more complicated because now we need a wildcard DNS entry and a wildcard TLS certificate.

How you set up a wildcard DNS entry depends on your DNS provider. With most user-friendly DNS systems you just create an A (or AAAA) record for * (or *.subdomain if you want to host it on a subdomain) and then set the IP to your server’s IP. That’s pretty simple. The complicated part is getting the TLS certificate, because now you need to use the DNS method for proving you own your domain.

certbot has built-in support for doing this DNS automation for a number of DNS providers. I do not like any of the supported DNS providers, for various reasons. If you’re into hosting your DNS on CloudFlare or AWS then by all means I guess go for it but ehhhhhhhhh no thank you. Instead, I did what’s called ACME Delegation. In short, we’re going to set up a DNS server on our own server- but don’t worry, we don’t need to entirely self-host DNS. Instead, we put a special CNAME record in our normal DNS provider called _acme-challenge. That record will tell Letsencrypt “hey go talk to my DNS server I’m running over here on the side, it has permission to verify that I own this domain”. Pretty neat!

The two pieces of software we’re going to use to do this are joohoi/acme-dns and acme-dns/acme-dns-client. These are both written in go so we’ll need to install the go compiler. Once again, we need a new enough version, and bullseye-backports provides:

sudo apt install -t bullseye-backports golang


Then we need to get the code

cd $HOME
git clone https://github.com/acme-dns/acme-dns-client
git clone https://github.com/joohoi/acme-dns


And build/install them

cd $HOME/acme-dns-client
go build
sudo install --mode 755 -D -t /usr/local/bin acme-dns-client

cd $HOME/acme-dns
go build
sudo install --mode 755 -D -t /usr/local/bin acme-dns
sudo install --mode 644 -D -t /etc/systemd/system acme-dns.service
sudo install --mode 644 -D -t /etc/acme-dns/config.cfg config.cfg


Now to do some configuration

nano /etc/acme-dns/config.cfg


Here’s what my config looks like. I think by default it’s configured to let anyone use your acme-dns service but that seems a bit silly. I didn’t set up authentication for it, but I did limit it to 127.0.0.1.

[general]
listen = "0.0.0.0:53"
protocol = "both"
# domain name to serve the requests off of
domain = "auth.yourcoolbreezewiki.com"
# zone name server
nsname = "auth.yourcoolbreezewiki.com"
# admin email address, where @ is substituted with .
nsadmin = "admin.yourcoolbreezewiki.com"
# predefined records served in addition to the TXT
records = [
    # domain pointing to the public IP of your acme-dns server 
    "auth.yourcoolbreezewiki.com. A 69.69.69.69",
    # specify that auth.yourcoolbreezewiki.com will resolve any *.auth.yourcoolbreezewiki.com records
    "auth.yourcoolbreezewiki.com. NS auth.yourcoolbreezewiki.com.",
]
debug = false

[database]
engine = "sqlite3"
connection = "/var/lib/acme-dns/acme-dns.db"

[api]
ip = "127.0.0.1"
disable_registration = false
port = "2043"
tls = "none"
# optional e-mail address to which Let's Encrypt will send expiration notices for the API's cert
notification_email = ""
# CORS AllowOrigins, wildcards can be used
corsorigins = [
    "*"
]
use_header = false
header_name = "X-Forwarded-For"

[logconfig]
# logging level: "error", "warning", "info" or "debug"
loglevel = "info"
logtype = "stdout"
# format, either "json" or "text"
logformat = "text"


You might run into problems with this if your server is running systemd-resolved. I don’t really know how to help you in that case, because I don’t run systemd-resolved. If you for sure know that you don’t need resolved you can force-disable it with sudo systemctl mask systemd-resolved but seriously go look into the implications of doing this before you do it.

Anyhow, turn on your acme-dns server now.

systemctl daemon-reload
systemctl enable --now acme-dns


Now you need to add a couple DNS records to your domain’s DNS, however you usually do that. You need


  an A record pointing whateveryouwant.yourcoolbreezewiki.com to the IP address of your server.
  an NS record pointing auth.yourcoolbreezewiki.com to whateveryouwant.yourcoolbreezewiki.com.


Keep that tab open because we’re going to add the CNAME record I mentioned in a moment.

Now we need to configure acme-dns-client which is the bit that certbot is going to use. We do that with a command like this:

sudo acme-dns-client register -d 'yourcoolbreezewiki.com` -s http://127.0.0.1:2043


As part of this process it will give you something you need to copy-paste into your DNS provider. Create a CNAME record with whatever it gives you, which should look a bit like eabe2453-cafe-9999-bad1-80085ace5ff2.auth.yourcoolbreezewiki.com. You’ll need to wait for the CNAME record to propagate.

Once that’s done, you can get your certificate with certbot

sudo certbot certonly --manual --preferred-challenges dns --manual-auth-hook 'acme-dns-client' \
    -d 'yourcoolbreezewiki.com' -d '*.yourcoolbreezewiki.com'


Make sure you have the domain with and without the wildcard! You need both!

That gets the cert and also sets it up for auto-renewal later. Finally you need to modify the nginx config from earlier. At the very end, you would’ve had something like this

    # HTTPS verification strategy. If you want to do wildcard DNS, I'll give you
    # some changes to make to this file later in the post.
    listen 80;
}


That’s going to change to this

    listen 443 ssl;
    ssl_certificate /etc/letsencrypt/live/yourcoolbreezewiki.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourcoolbreezewiki.com/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;

}

server {
    return 301 https://$host$request_uri;
    listen 80;
    server_name _;
}


(Double check that that is in fact the right path for your certificate).

Now systemctl reload nginx and you should hopefully be done! Yay!!

If you have any questions feel free to contact me or come join the BreezeWiki Matrix where the dev and server operators like me hang out.


So you want to cross compile illumos…
2023-02-21T00:00:00+00:00
I want to cross compile to illumos from linux, for reasons I’ll hopefully get
to write about in the future. This is a pretty tricky endeavor. Let’s talk
about why.

illumos’s toolchain for the most part is not terribly exotic. It’s using GCC, so at least we’re not dealing with a weird bespoke compiler. Here’s the main things we do need to care about:


  libc
  linker + linker support tools
  dmake
  the monolithic project structure


libc is maybe an obvious one- your code needs to link against a libc and it’s gotta be the libc for the system. That’s fine enough. It does mean that we probably want to be able to cross-compile that libc along with everything else, but we could maybe scrape by copying a binary libc onto our Linux system to get started. The linker is maybe less obvious.

See, in illumos land, there’s a lot of tools with history older than many of the tools we use today, and one such tool is its linker. illumos has its own linker implementation, and to my knowledge it’s the only linker you can rely on to successfully generate illumos binaries (but we’ll talk about why we might be wrong about that in a bit). So, we really would like to get the linker to work if we can.

Speaking of illumos-only tools, here’s the thing that’s causing me the most grief right now: illumos’s dmake. Don’t get this confused with OpenOffice’s dmake - this is different. They might share some history? maybe? But if they do, they’re not compatible today. Anyways, why is this such a problem? Well illumos’ dmake is built with dmake! It’s a self-hosting build tool. That’s all well and good except for when you’re trying to compile for another OS, one where perhaps you can’t run the build tool yet.

All of this is a bit complicated by the fact that illumos has a monolithic git repo where all the code for the kernel and userspace software live, which means that there’s some project-wide state that gets threaded into the builds of each individual tool by the monolithic network of Makefiles, and the cross-dependencies between things are a little obscured.

So what are our options?

One thing we could do is try to cross-compile from illumos to Linux. Now we have two cross-compilation problems. But, if we could convince the build system to generate Linux binaries, then maybe we could build just dmake and nothing else, and use that to get started building things from the Linux side of things.

We could also try to replace dmake with GNU make. There’s some illumos forks that do this, and while it is tempting, I would like to be able to build with a source tree as close to upstream as possible, so I’m hesitant. But we could compromise, replacing the build system for just dmake, and then once we have that up and running use that to build everything else. But I think we may need to hack away at the build system a bit anyway to support the cross compilation, so maybe we should just replace the build system wholesale (again, there’s another project that does this.)

Then hopefully we can convince the linker to build under linux, and finally we can (probably?) start cross compiling.

So what’s this other project we keep talking about? There is some degree of precedent for all this, but we’re left with more questions than answers after investigating it a bit. There’s this aarch64/riscv port of illumos over at n-hys/illumos-gate, and it does replace the build system with GNU make, and it supports cross compiling from linux. But, it also seems to install the ld from GNU binutils into the system. It seems that it only builds the ld.so from illumos and we suspect that since it’s not beholden to generating stuff compatible with existing illumos systems, this fork might have gone out of its way to support output from the GNU linker to make life a bit easier. That makes sense for that project, but we need to generate stuff that works on existing illumos boxes, so that’s not a strategy we can pursue.

Or I could be dead wrong with my read on that. There’s a lot of commits of changes to go through if I wanted to be certain.

Anyway, it still might be worth grabbing the makefiles since they’re already kitted out to handle cross compilation.

The other challenge with the monolith is that it’s sort of geared to you building the whole thing at once, which is probably going to be a lot more effort than building a subset of it. Or it won’t be, I don’t know, but I think that we at least are not interested in building the kernel right now. It should be possible to only build subsections of the repo though, it’s just not obvious to me yet how we would go about doing that.

So anyway, if you know the answers to all of my problems do send me an email would you? Otherwise, well, hopefully we figure this out and have some interesting posts for you down the line.


Tfw your kernel makes your linker print to STDERR - Gentoo, Mold, and AArch64
2023-02-10T00:00:00+00:00
We’ve been passively experimenting with the mold linker for the past six months or so. We’ve got a Quartz64 that hasn’t needed to do much of anything, and because it’s only got 4 efficiency cores, the linker runs long enough that we can actually watch it and see how it’s doing. We’re not really sure if it’s been worth using, and haven’t done any scientific comparisons on it. That said, here’s how we have it set up.

AArch64 Weirdness

There’s a strange interaction between the default kernel config for aarch64 and mold- well, more with mimalloc which is the allocator mold uses. When I say default I mean downloaded straight from kernel.org. Every time mold ran, it’d print this to stderr: unable to allocate aligned OS memory directly, fall back to over-allocation.

Now what does this have to do with the kernel? Well, looking at the mimalloc code we can find this comment:


  on 64-bit systems, use the virtual address area after 2TiB for 4MiB aligned allocations


So for 4MiB-aligned allocations, it’s using virtual memory above 2^41 bytes. Looking at the kernel config, I found it set to CONFIG_ARM64_VA_BITS_39. That means that every process only gets 39 bits of virtual memory address space. 39 is famously known for being less than 41 bits. And what do you know, I set it to CONFIG_ARM64_VA_BITS_48 and the problem went away.

This actually caused at least one package to fail the build, and I think a couple, because they were not expecting the linker to start spuriously printing memory allocation problems to stderr.

Gentoo Config

Aside from using mold, we’ve got a few other weird things going on, which is that we’re also using clang with Thin LTO for as many packages as possible. Not everything works with that, so sometimes we have to fall back to gcc, that’s not news to anyone who does this sort of thing.

So first off, the relevant make.conf lines:

# I'm not really clear on whether omit-frame-pointer is default-on in clang yet for O2+
COMMON_FLAGS="-O3 -mcpu=cortex-a55 -fomit-frame-pointer -pipe -fPIC"
CC="clang"
CXX="clang++"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"

# Save the default LDFLAGS so we can restore them for builds that are broken
OLDLDFLAGS="${LDFLAGS}"
# i dont think O2 does anything here?
LDFLAGS="${LDFLAGS} -fuse-ld=mold -rtlib=compiler-rt -unwindlib=libunwind -Wl,-O2 -Wl,--as-needed"

CFLAGS="${COMMON_FLAGS} -flto=thin"
CXXFLAGS="${COMMON_FLAGS} -flto=thin"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"


There’s some other fun flags that might be worth passing in to mold (read the man page) but I’m not doing any of that.

Ok so now we need a couple fallbacks for the things that break.

First, the obligatory “compiler-gcc” env:

CC="gcc"
CXX="g++"
AR="${CHOST}-ar"
NM="${CHOST}-nm"
RANLIB="${CHOST}-ranlib"

COMMON_FLAGS="-O2 -mcpu=cortex-a55 -fomit-frame-pointer -pipe"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"
LDFLAGS="${OLDLDFLAGS} -B/usr/libexec/mold"


Notice this one is still using mold. I haven’t run into any packages yet that need the combo of GCC+Not Mold. Everything I’ve compiled has worked with either


  clang + mold
  clang + lld
  gcc + mold


To that end, here’s clang-without-mold:

LDFLAGS="${OLDLDFLAGS} -fuse-ld=lld -rtlib=compiler-rt -unwindlib=libunwind -Wl,-O2 -Wl,--as-needed"


Now, here’s the exceptions that I have in those two. First, the things I have using gcc:

dev-libs/boost compiler-gcc # everything seems to claim that this is supposed to work so im not sure why its not
sys-devel/gcc compiler-gcc
sci-libs/fftw compiler-gcc # broken for uhhh reasons? this could use clang, its just passing flto=thin to fortran for some reason. also doesnt like mold flags because it uses gnu ld regardless
media-libs/rubberband compiler-gcc # depends on boost, mangling is wrong with clang
dev-util/systemtap compiler-gcc # tapsets.cxx:68:17: error: expected namespace name using namespace __gnu_cxx;
sys-apps/plocate compiler-gcc
dev-java/snappy compiler-gcc # complains about linking libc++ after building with -fPIC, maybe we need to rebuild libc++ with fPIC, but nothing else has complained so idk
sys-devel/binutils compiler-gcc # need to use gcc AR for pgo/lto
games-emulation/mgba compiler-gcc


And here’s what I have using clang with lld:

dev-lang/ruby clang-without-mold    # configure: error: something wrong with LDFLAGS="-Wl,-O1 -Wl,--as-needed -fuse-ld=mold -rtlib=compiler-rt -unwindlib=libunwind -Wl,-O2 -Wl,--as-needed"
x11-libs/cairo clang-without-mold    # cairo can't link with pthread for some reason. i saw imagemagick do it just fine though.
app-emulation/qemu clang-without-mold # sizeof(size_t) doesn't match GLIB_SIZEOF_SIZE_T.
dev-libs/libtomcrypt clang-without-mold # links a file with $CC, then links another file with gcc. ????? might be able to fix with an env that uses -B instead of -fuse-ld for clang too
sys-libs/compiler-rt clang-without-mold


dev-util/cmake clang-without-mold # The C++ compiler does not support C++11 (e.g.  std::unique_ptr).
media-libs/lcms clang-without-mold
net-dns/bind-tools clang-without-mold
dev-util/glslang clang-without-mold
media-libs/mesa clang-without-mold # silent error


A few caveats here. First, some of these packages I compiled a number of mold releases ago and might work now. Secondly, some of the things I’ve switched to GCC might actually be working around LTO-related bugs instead of clang-related bugs. I don’t have a clang-without-thinlto environment, because I haven’t felt like setting one up and adding another variation to try out. But bear that in mind, I highly suspect that turning off thin LTO would have solved a number of them.

Compilation Performance

Honestly, it’s not enough that I’d do it again. I don’t think mold’s to blame
here- it’s more just the nature of LTO as far as I can tell. mold will
initially use all cores, but for almost everything it very quickly falls down
to only using one core for a very long time. This feels very much to me like
the LTO step, and for now mold is at the whims of llvm’s LTO plugin in that
regard. But if you’re into weird toolchains or you’re not using LTO, I’d say
maybe give it a go.


How To Make Java Swing Look Better
2023-01-08T00:00:00+00:00
This is not new information but I’ve been having an increasingly difficult time finding the exact options to use in the library of babel that is the modern search engine. So here we are.

There’s 4 settings at play here. awt.useSystemAAFontSettings and swing.aatext let you force enable (or disable!) text anti-aliasing. swing.defaultlaf and swing.crossplatformlaf let you set the theme. You can install your own themes but I don’t remember how to do that so I’m not writing about that today. The default theme is Metal. That’s the thing that looks distinctly like an oldschool java application. If you don’t like that, that’s probably why you’re on this post.

If you’re running java as a command, and you want to use your system’s GTK theme for java, here’s what you do:

java -Dawt.useSystemAAFontSettings=on \
     -Dswing.aatext=true \
     -Dswing.defaultlaf=com.sun.java.swing.plaf.gtk.GTKLookAndFeel \
     -Dswing.crossplatformlaf=com.sun.java.swing.plaf.gtk.GTKLookAndFeel \
     -jar JamochaMUD.jar


This is usually what I’m trying to do because it’s the easiest way for me to get a dark mode in applications. Be aware if you are going for a dark mode that this doesn’t set the icon theme, so you might get clashing icons. I think there’s a way to override the icon theme? IDK what it is, if you know please contact me.

Notice defaultlaf and crossplatformlaf are seemingly duplicating the theme settings. That’s because some applications have code in theme that queries the crossplatformlaf setting and sets the appearance to that instead of using the default. That’s nice of them, but we’re trying to set our own theme here.

There’s other themes you can use. Here’s the bundled “Look and Feel”s, as they’re called.


  Metal (default)
  Nimbus (my favorite) - javax.swing.plaf.nimbus.NimbusLookAndFeel
  Motif - com.sun.java.swing.plaf.motif.MotifLookAndFeel
  Windows (for Windows users) - com.sun.java.swing.plaf.windows.WindowsLookAndFeel
  GTK+ (follows GTK theme) - com.sun.java.swing.plaf.gtk.GTKLookAndFeel


Just set -Dswing.defaultlaf and -Dswing.crossplatformlaf to whatever you want. Or if you like Metal but you want to change the colors to be even more oldschool, -Dswing.metalTheme=steel.

If you want to use these settings by default, you can export an environment variable called _JAVA_OPTIONS which contains these. For example,

export _JAVA_OPTIONS="-Dawt.useSystemAAFontSettings=on -Dswing.aatext=true -Dswing.defaultlaf=com.sun.java.swing.plaf.gtk.GTKLookAndFeel -Dswing.crossplatformlaf=com.sun.java.swing.plaf.gtk.GTKLookAndFeel"


Note that some programs won’t listen to this setting if they set their own Look and Feel in their code. If you’re really serious about theming those you’ll need to learn how to do java modding, decompiling their code or modifying the bytecode. It’s less scary than it sounds, but out of scope for this post.


An X11 Apologist Tries Wayland
2022-09-18T00:00:00+00:00
I think it’s only fair to call me an X apologist. I get incredibly frustrated when people talk about dropping support for X11. I fight back against the notion that some day X11 will be dead and unmaintained, a curiosity of a time before. I’ve spoken to people in my circles at-length about the accessibility tools that Wayland simply hasn’t been capable of supporting that X11 has. A lot of times, I’ve ended this conversation with “Maybe 5 years from now it’ll be good”. Well it’s 5 years in since I first said those words, and you know what, I’m actually pleasantly surprised.

I’ve been using sway for most of my testing, since I already had an i3 config handy. Basically all I had to do was copy my config to .config/sway/config, change some of the services it was launching to get rid of X11-specific stuff, and add some output configuration. It’s pretty great how painless it was (and the default configs are just fine too, but I’m particular).

Latency and VSync

Once I actually started sway though, I was not happy so see that my mouse felt laggy. It’s not the sort of thing that makes it unusable, but it felt like I had just switched to using a bluetooth mouse. I’ve been around the block long enough to know that sway was almost certainly using a software cursor. The thing is, sway, or more accurately, wlroots, supports hardware cursors. This is dependent on linux’s Direct Rendering Manager (DRM, but not the bad kind) stack providing a cursor layer when queried, which I guess it’s not doing for my GPU? I’m not really sure why that is, as my GPU definitely has a cursor layer (I checked the kernel code), but I haven’t chased this rabbit hole to conclusion yet.

What following this rabbit did teach me though, is something extremely cool about wayland’s rendering model. Wayland wants every frame to be perfect. That means no screen tearing, no half-drawn frames, none of that stuff. If you’ve ever turned vsync on in a video game you know that trying to match to vblank almost always incurs a latency penalty, and it can make input feel sluggish. Out of the box, there’s a bit of that in wayland too, but sway has a way out: max_render_time.

Sway lets you configure exactly when it starts compositing a frame prior to the vblank interval. If you set max_render_time 1, it will wait to the very last millisecond before the vblank interval to composite a frame- probably a bad idea unless you have a god tier GPU, because if it takes longer than 1ms to composite and misses the vblank interval then you’ve actually added an entire extra frame’s worth of latency to the situation. So what you do is, you keep stepping this number up 1 millisecond at a time while playing a smooth animation, until you’ve eliminated any stuttering in the animation. And now you have done something X11 cannot do- eliminated screen tearing with the absolute minimum latency cost possible.

This fantastic. It feels fantastic. It even made my software cursor not feel so softwarey, which I’ve never experienced with a software cursor before. I have a pretty bad GPU, but on a higher end card you’d get a huge benefit to this in games. If your card can render the game many times faster than your monitor refresh rate, you can unlock your FPS in the game, tune your max_render_time to the absolute minimum, and get EXTREMELY low latency while still having absolutely no screen tearing whatsoever.

And like, this is the first time I’ve ever seen the vsync setting in a game actually sync the game up with the vblank interval in a way that matters. It works for games in wine. It’s amazing. I have never experienced gaming on Linux that looked this smooth in my life.

wlroots

This probably isn’t limited to sway, by the way. Most of the compositing logic is actually handled by a library called wlroots, a project that spawned from sway, but is now used in a quite a few other compositors out there. wlroots handles all the business of providing different rendering backends, input handling, screen capture, and so on, so that a developer can do what they actually wanted to do, which is write a window manager. As a result, if one wlroots compositor can do it, there’s a good chance the rest can too.

This is something I’ve noticed over and over while looking into replacements for my X11 tools. Broadly speaking, they all talk about supporting wlroots-based compositors, rather than any compositor in particular. From my perspective, wlroots has sort of become the defacto standard for compositor features, much like how most X software assumes they’re running under Xorg (when really, other X servers exist, and have varying degrees of support for the extensions Xorg supports).

Unifying the Fragments

Unlike the X ecosystem though, there’s no illusion that wlroots is the only game in town. Gnome and KDE are both sort of doing their own thing, and the degree to which software written for them will work on wlroots or vice-versa depends heavily on whether they rely on ecosystem-specific extensions, or use more basic wayland features that all of them support. This is a problem. The solution for higher level pieces of software like browsers, OBS, and anything else that doesn’t want to think about this seems to be something called xdg-desktop-portal.

xdg-desktop-portal was born out of the flatpak project as a way to provide controlled hardware access and desktop integrations to flatpak apps. Flatpak apps are sandboxed in a lot of ways, so this was invented as a broker that can also provide the very nice benefit of asking for user consent before providing functionality to the app requesting it. It even provides more basic features like file-choosers, printing, location services, which aren’t particularly relevant to the wayland experience, but I just want to highlight that this wasn’t originally dreamed up just to address the wayland ecosystem. Critically though, xdg-desktop-portal doesn’t actually provide these features directly, but instead hooks the app up to a backend, like xdg-desktop-portal-gtk, xdg-desktop-portal-gnome, xdg-desktop-portal-kde, and xdg-desktop-portal-wlr. Perhaps you can see where this is going.

You’ve got a unified interface (xdg-desktop-portal) over a common protocol (dbus), with backends for different wayland compositors, and you’re already supporting things that require compositors to implement them separately like screen sharing. It’s only natural that this is basically How This Will Be Done going forward. So for example, as I already mentioned, screen sharing. Both OBS and Google Chrome implement this under wayland through xdg-desktop-portal and pipewire- wait, pipewire?!

Yeah. When pipewire first hit the scene I remember my friends talking about how it could also share video, but they sort of treated it like a novelty thing. Now, its video capabilities have now become an important part of the Wayland ecosystem. I’m not even using pipewire for audio- I’m still using pulseaudio. Regardless, I’ve got pipewire working on my system purely for screencasting. The nitty gritty is a bit mired in callbacks, but if you sort out the logic of screencast-portal.c in OBS you’ll find that OBS is using the xdg-desktop-portal interface to negotiate a screencast session, and at the end of that negotiation it gets a pipewire handle it uses to receive the actual screen data. This actually makes a lot of sense. You need some form of IPC to get the pixels from one place to another, you really don’t want to be doing that over dbus, and pipewire was already right there. It’s a perfect fit.

Fragmentation has long been the elephant in the room for wayland in my opinion. Wayland’s history is well worn with ecosystem fragmentation, and devs have to redo their work multiple times in order to support what are in my view the Primary Three of Gnome, KDE, and wlroots. Still, 3 is better than it could be, and standardization is happening through efforts like xdg-desktop-portal and (more slowly) extensions to the wayland protocol. So I have to admit that even this is in a much better state than it was.

Accessibility

I guess we should talk about my final bugbear that I’ve had with wayland now, which is accessibility. And here, well, things still aren’t perfect, but the ecosystem has started figuring things out. To tackle an extremely simple one- gamma control. Gnome and KDE handle this in their own ecosystems ok, and wlroots actually has support for this now too. The actual tools to use it are a little more sparse, but they’re out there. There’s some tools like gammastep and wlsunset that support this, but I’m used to redshift, so I’ve been using minus7’s redshift fork to do both redshifting and gamma control for my monitor. (EDIT: The maintainer of gammastep let me know that it too is a fork of redshift with fairly minimal changes, and config files will even work with it if you change the [redshift] section to [general], so that may be a better option! I admit I hadn’t looked too closely at it or wlsunset while writing this post.)

Another easy one is screen reading. This was already handled entirely by AT-SPI through the dbus protocol, so this is really in exactly the same state it’s in on X: Not great, but workable. I don’t really use screen reading though for anything other than text-to-speeching long text posts, so it’s fine enough for me.

I also rely on mouse and keyboard automation quite a bit, but thankfully there’s a lot of software now that uses the uinput interface to do input emulation. I use antimicrox for controller mapping and have for years, so I was pleasantly surprised to learn they support it. There’s also ydotool, which provides xdotool functionality in a more generic way, and also has a uinput backend. The effort you need to go through to actually use these depends on how your distribution handles the file permissions of /dev/uinput. Some of them have it as root:input, in which case you just need to usermod -a -G input  and then relog to get it working. Others have it as root:root so you either need to go do some reconfigurations to change its permissions or live with running the software using it as root.

I remember talking at length with the developer of Talon Voice, a voice control/eyetracking tool that works quite well on linux, about the challenges of supporting wayland. The other big thing, aside from how to do input emulation, was whether it was possible to query the list of windows and active focus for context-specific voice commands. I have definitely seen software that does this at this point, since most app panels are their own pieces of software independent of the compositor. And, as you’d expect from them, they can focus windows too. So given that these two problems are solved, it’s now on my list to try and help Talon get Wayland support when I have the energy.

There’s one thing I haven’t been able to solve though, and that’s dwell-click for wlroots. Gnome implements this themselves in the compositor. KDE has KMouseTool, but that’s heavily X-only so I suspect it doesn’t work on wayland. I have my own implementation, rtmouse, which is also X-only. If you’re not aware, this sort of software waits for the user to stop moving their mouse, and then automatically performs a click, typically with audio and/or visual feedback. Naturally, any software doing this needs to know if the user is moving their mouse or performing clicks. There’s not really a good way for me as a software developer to query this data in real time from the compositor. This is a bit more real-time than simple user idle detection, which they do provide. I also need to know when mouse clicks happen so I can cancel my autoclick, to avoid an accidental double-click if the user manually inputs a click. So basically, I’m asking for the mouse equivalent of a keylogger, which is a thing wayland stuff really tries hard not to provide for security reasons.

I think if I wanted to implement this I’d need to try to add some features into wlroots in a way that doesn’t make people die of infosec, and then modify my software to use that, but even then I don’t think I’d ever be able to provide this in a way that’s universal across gnome, KDE, and wlroots.

Wayland Without a GPU

My final interest then, is support for devices without hardware acceleration. I use an 800MHz netbook without a working GPU. Can wayland deal with this? Yes! Gnome and KDE are sort of out of the question by default here since they’re much more GPU-reliant regardless of X or Wayland, so once again I’ll look at wlroots. wlroots as a handful of rendering backends it can use, the primary being GLES2, with a Vulkan backend that’s in there now. But the third backend, pixman, is what we want here. render/pixman/renderer.c has the goods here- this is a software compositing backend for wlroots that uses the pixman library, which has very optimized software routines for pixel manipulation. It’s even got SSE support, which means my little netbook should be in good shape- that thing has up through SSSE3.

I haven’t had the chance to actually test this out yet, but I’m very much looking forward to trying it to see how wlroots compares against X on my netbook. I suspect it’ll do better. There’s a few TODOs in wlroots’ pixman renderer about some places for potential optimizations too, which I’m interested in implementing once I have a setup where I might actually notice a difference.

I expect video playback to be reasonable here too. mpv has a wlshm (WayLand SHared Memory) backend that copies pixels into a shared memory buffer, skipping the formalities of creating an OpenGL context, calling glTexSubImage2D, and hoping llvmpipe will do the right thing and make it fast enough to be usable (it probably won’t). SDL though, I’m a bit worried about SDL. SDL’s wayland backend only supports creating an OpenGL context, and doesn’t have any support for wlshm, so I’m worried that it might be much slower than it needs to be even with applications that are entirely doing software rendering. I might be able to work around this by forcing it though XWayland, and maybe that path will be a bit faster, but we’ll see.

That’s a lot of guessing by me here, but I have a number of other things I need to get done before I can test this on the actual hardware. I’ll be writing a followup whenever I do that.

Closing Thoughts

All in all, I’m very impressed with the work the wayland community has done since I last did a serious look at the state of things. I’m still waiting for a stacking window manager that scratches the same itch for me that icewm does, but I’m following labwc with great interest. At this point though, I’ve established that I can live my life on wayland, and for the time being I am. Not everyone can yet though, and there’s still work to be done. Part of why I’m feeling the urge to transition to wayland is performance benefits, but the other part is so that I’ll be able to help solve the unsolved problems to make it viable for more people.

I don’t think X is ever going to die. Even if it fades away on Linux, there’s a lot of old video hardware that will probably only ever be well supported with real Xorg, on Linux and other OSes such as NetBSD. That stuff is already seeing support dropped in more recent versions of Xorg, and preservationists will need to do digging to find versions that still take advantage of everything the hardware has to offer. But, I understand now why the wayland folks have been talking so highly of it, and how drastically it simplifies the userland stack, and I’m no longer concerned that I’ll wake up to find my netbook has become unusable for modern software.

If you’re into Gnome, Wayland is probably a good experience today out of the box, even if you aren’t a power user. I’m not into Gnome, which is why I haven’t looked at it in this post. If you’re not into Gnome either, but you want to give Wayland a shot, just know what you’re getting into. KDE, I have heard mixed things about, but can’t speak to.

If you want to try a wlroots-based compositor, know that you can probably do everything you want to do, but it’ll be some effort. You’ll need to do some diving into search engines, but the solutions are out there. arewewaylandyet is a good starting point that I’ve been referencing throughout this process, along with the Arch Linux Wayland Pages and Gentoo’s Wayland Desktop Landscape page. You may also need to hop on IRC and ask some questions. It’s early adopter territory, but if that’s what you’re into, a lot of developers have already put a lot of work into paving the way forward, and it’s worth trying out!


Stop Chrome from Stealing Sway’s Hotkeys
2022-09-15T00:00:00+00:00
I read the manual, and I’m putting it here so I don’t forget. As mentioned in some previous posts, I like to run web-apps in chrome with the --app= flag, rather than use the electron version. For stuff like Discord this largely makes it act like the electron version- the website gets its own dedicated window, clicking links opens stuff in a different window, there’s no browser UI taking up space. The main differences are I don’t have to bother keeping an app updated, and I can apply custom CSS. Anyway, recently I’ve been trying out Wayland with Sway, and it seems that when you launch chrome with --app=, it inhibits your compositor’s keyboard shortcuts so it can have them all to itself.

I’m sure this behavior makes sense in certain usecases, but it doesn’t make sense here. Now I can’t switch workspaces, windows, or do anything else with sway. Thankfully, sway has a way to disable this. If you want to disable this for every application, you can throw this line into your config:

seat * shortcuts_inhibitor disable


That’s what I’m doing for now, since I want this to be the default behavior for all apps anyway. You may not want this though. It actually makes a lot of sense for a VNC viewer or a full screen video game or something like that to capture all your inputs, so you can use this config line instead if you only want this behavior in chrome:

for_window [app_id="^chrome-.*"] shortcuts_inhibitor disable


Or, you can do what I plan to do: leave this disabled globally but manually enable it for any apps you want to be able to inhibit shortcuts:

seat * shortcuts_inhibitor disable
for_window [app_id="whatever-app-patern"] shortcuts_inhibitor enable


Note that this won’t force the application to inhibit shortcuts, it just allows the application to request shortcut inhibition if it wants to.


That time I 10x’d a TI-84 emulator’s speed by replacing a switch-case
2022-08-07T00:00:00+00:00
There’s a javascript emulator for the TI83+, TI84+, and TI84+CSE calculators called jsTIfied, which was written by Christopher Mitchell, founder of the calculator fan site cemetech.net. There’s not a whole lot of reasons to use it over something else if you’ve got a native option available, but if you don’t it’s pretty great. I got interested because it was the first emulator to support the TI84+CSE when that calculator was released in the early 2010s. The CSE was exciting because it retrofitted a 320x240 color display onto the hardware platform of the 84+SE, so all the other hardware and OS access was the same except for graphics. I wanted to be one of the first game developers for the CSE, but developing for the calculator without an emulator and debugger is pretty painful, so I tried out jsTIfied with my older calculator ROMs to get a feel for it.

jsTIfied had a problem though: it was too damn slow. These calculators use a z80 processor, which is pretty simple to emulate. But jsTIfied couldn’t even handle emulating the 6MHz calculator models at full speed, and the CSE’s processor was clocked at 15MHz, so it was even worse. jsTIfied is closed source, but I decided that I was going to try and do something about it anyway.

Of course the first thing you want to do when you’re debugging a web app is go to the profiler. There was just one hotspot that dwarfed all the others, and that was the instruction decode and execution switch-block. That’s the sort of thing you’d expect since these calculators don’t have any other complicated hardware to emulate like pixel processing units or audio chips, but it seemed a bit fishy. Yeah javascript is slow, but computers made in the early 2000s could have handled emulating this calculator at full speed with native code. Javascript overhead wasn’t enough to explain it.

So I started digging into the actual code. I had to unminify it, but I was used to dealing with obfuscated code from Minecraft. The instruction decode block had one giant switch block, with additional nested switch blocks for multi-byte instructions. In most languages this is just fine, since your compiler will turn it into jump tables, so why wasn’t I seeing jump table performance here? I had a bit of an obsession with javascript performance at the time due to my WebGL experiments, and I had already learned that at the time JS engines wouldn’t optimize functions above a certain size. Knowing this, I split all the nested switch statements into their own functions, and made the parent switch call them, to see if that would take care of things.

Now I needed a way to actually load my code. I quickly spun up a web server on my computer that would pass through requests to the upstream website, but intercept the request for the emulator engine and return my modified code instead. I switched /etc/hosts to point the upstream domain at 127.0.0.1 and I had my code loaded.

Unfortunately though, I saw basically no speed up. I was very sure at this point that I was within the size limits for functions, so there had to be something else missing. I went digging around looking for low level details on the implementation of switch statements in javascript. Eventually I found a stackoverflow post from someone trying to do exactly the same thing I was: optimize a (different) z80 emulator. That’s when I saw a deeply disturbing comment, with sources cited directly to Chrome’s V8 source code:


  @LGB actually in V8 (JS engine used by google chrome) you need to jump
through a lot of hoops to get switch case optimized: All the cases must be of
same type. All the cases must either be string literals or 31-bit signed
integer literals. And there must be less than 128 cases. And even after all
those hoops, all you get is what you would have gotten with if-elses anyway
(I.E. no jump tables or sth like that). True story.


Check out the post for yourself here https://stackoverflow.com/questions/18830626/should-i-use-big-switch-statements-in-javascript-without-performance-problems#comment27798374_18830724

This is not what you want to hear when you’re looking at an emulator with a heckload of switch blocks, especially switch blocks that all had more than 128 cases. I had let the optimizer run on my functions, but when it got to the switch blocks it said “thanks but no thanks I’m good”. I had only one option left, and that was to wrap every case of every switch in a function, dump them all into an array, and do the lookups myself. So I wrote a script to do just that.

The original code would look something like this:

switch (z8.r2[Regs2_PC]++) {
  case 0x00: // nop
    break;
  case 0x01: // do something?
    break;
  // ...
  case 0xDD // index register prefix
    switch (z8.r2[Regs2_PC]++) {
      case 0x00: // do something
        break;
      case 0x01: // do something
        break;
      // ...
      case 0xFF
        break;
    }
    break;
  // ...
  case 0xFF:
    break;
}


Which I then translated to something like this:

let instr_table = new Array(256);
let instr_subtable_DD = new Array(256);


instr_table[0] = functon() { /* nop */ };
instr_table[1] = function() { /* do something, probably */ };
instr_table[0xDD] = function() {
  return instr_subtable_DD[read_byte(z8.r2[Regs2_PC]++)]();
};


After all that was done, I had success! The emulator went from slow as molasses to being too fast. I told the original dev about this and he was eager to merge those changes in, but a calculator that’s too fast is bad in its own way, because you can hardly control the thing. So I got him to give me access to the source, wrote a speed governor, and he got it all squared away and pushed up to the site. You can see this yourself at https://www.cemetech.net/projects/jstified/jstified_compressed.js?20170706a, just search for z8oT. You can also see a demo I recorded at the time below, first with the old slow version and then with the code that was far too fast.


  
    
    
    jsarray vs switch speed comparison
  


I had one more trick up my sleeve too: notice that those program-counter register increments doesn’t involve a & 0xFFFF to keep the value within the appropriate 16 bits. That’s because I switched to storing our registers in a Uint16Array, which has that wrapping behavior built in (since it’s backed by real honest-to-goodness u16s). I think that overflow behavior is defined in the spec but I don’t actually remember- either way it works just fine everywhere I’ve tried it, but do me a favor and check for yourself. Ultimately this had a marginal performance benefit at best, but it removed a LOT of bitwise ops from the code and made it much harder to mess up the value-range of register operations in general.

Before closing, I must warn you I wouldn’t recommend you take this as modern performance advice for any of your own javascript without checking the V8 and spidermonkey source first. 2013 was a different time, and JS engines have come a long way since then. I really hope they’ve made this better than it was.


[reblog] How to Start an Unencrypted Chat on Matrix (Element)
2022-07-29T00:00:00+00:00
Hey! A friend of mine, Cadence, wrote a great blog post about how to start an unencrypted direct message with someone else on Matrix using Element or a fish script. That post is over at https://cadence.moe/blog/2022-07-29-how-to-start-an-unencrypted-chat-on-matrix-element. I’ve also iframe’d it here! This reblog thing is a new thing I’m trying, and I don’t intend to make it a majority part of the feed, but I think it’s fun. Let me know what you think!

This is mainly useful for cases where all of the matrix homeservers involved are operated by trusted parties. For example, I run a personal matrix server, and I also run a matrix server for a bunch of friends and people close to me. I’m operating the servers on hardware owned by us, and everyone involved is comfy with this setup from a privacy perspective. So that’s why we’re ok with it.

But the reason we want it is ‘cause its less hassle. Us folks on this server wanna be able to use the server side chat history search for one thing. For another, with E2EE its often the case that something gets messed up and you can’t even read chat history anymore, or one chat client gets into a funky state where you have to log out and back in for encryption to start working again. So that’s why we want unencrypted DMs, but matrix clients make it really hard these days.

Anyways, if you’re reading in a browser, the post is here below!


I’m tired of making decisions
2022-07-15T00:00:00+00:00
Decisions are our keys to the ethereal freedom, but they are also our shackles to the earth. I’ve been thinking a lot about what makes a system pleasant for me to use, particularly as a workstation. I’ve become disillusioned with a lot of Linux distributions as of late, not for technical reasons, but just because many of them feel cause me to feel depressed and dejected when I work with them. One of the primary factors I’ve identified is how they reckon with choice. What do they allow me to choose? What do they force me to choose? This balance is why I use Puppy Linux, despite everything, and why I’ve gone from being an Arch Linux zealot to avoiding it at all costs.

But before I start bikeshedding Linux distributions with you, my thesis: making too many decisions is bad for my health, so I don’t want to make decisions that don’t matter.

Decisions do not come free of charge, and they are exhausting. The degree to which that exhaustion strikes varies depending on your emotional state, baseline load, disabilities, and so on. When I was younger, I had very few decisions I had to make. I had to decide what to eat for breakfast, what to eat for lunch, what to eat for dinner, and that was it. School made all my daily decisions for me, and my parents provided shelter from big questions like where to live or how much to spend on groceries or all the other fun things that come with living a life. So then, I was free to direct my decision-making to whatever I wanted.

First it was which music to listen to, which Minecraft mods to install. Then it was which IDE I wanted to use, which vim plugins, which Ubuntu variant. When I found Arch Linux, it was like a game. I had the power to design my system in a manner that I’d never had before, choosing which bootloader to use, which display manager, which window manager, which file browser. I’d scroll through the Arch Linux List of Applications for hours, looking at every category, carefully reading about every application, and trying them out to find what I thought was the best.

These days I have too much other stuff to decide. The realities of living are a long list of decisions that anyone reading this that’s been around long enough will be familiar with. Disabilities add on to this. As I manage my RSI, I’m faced with choosing how long I’m allowed to interact with my phone at any moment, which keyboards I can use and where, how long I can play a game without hurting myself. I have to plan out how I’m going to lift a heavy object, and exactly where I’m going to take it, so I don’t cause a flare-up in my hands from doing so.

As a result, I now find getting bogged down in the minutia of my workstation to be an incredibly exhausting and emotionally draining act. This was particularly surprising to me, and I rejected it for awhile, until I saw that it kept playing out over and over again. Choosing which login manager to use is a decision that feels ultimately meaningless to me, as they all suit my needs just fine, and all I really want is to log into my system. The same is true for choosing my status bar, my battery monitor, my audio server, the tool that handles my screen brightness keys.

With Arch Linux, this problem is relegated solely to which software to use and how to configure it. Alpine is mostly on the same page, but has the added cost of choosing which subpackages of the software to install (-docs, etc.). Gentoo is the absolute worst of this world, where every package comes with USE flags to determine precisely how it’s configured and built, and what other dependencies it should pull in.

Even once I’ve made these decisions, if I ever have to install again, I have to either remember the answers or make them again. I’ve got to go find all my little config files for every single bit. I can speed-run installing Arch in about 15 minutes, but it takes six months before I feel like I’ve finished setting it up.

It’s important to understand that these are not criticisms of the systems, but rather an analysis of why they don’t work for me. I wrote a post just a couple months ago about installing Gentoo on an ARM SBC, and the flexibility of the system was a huge boon to getting the hardware working to the extent that I did. All of these exist the way they do to solve a problem.

So if my problem is decisions, why don’t I just go buy a mac? macOS is all about Apple making the decisions for you right?

It certainly is, and I did go down this road. For awhile, I even went as far as living exclusively from an iPad. But here, the pendulum swings. It’s true that I want an environment that makes decisions for me, but I do not want an environment that forces me to live with those decisions in whole, to take it as an all or nothing package. With macOS, if you like what they make, you’re living well. If you don’t like a decision, in many cases it’s nigh-impossible to do something about it and change it, unless you’re an ex-apple engineer well versed in hacking at private APIs.

What about something like Ubuntu? It does address low level things like auto-mounting flash drives when I plug them in, but it really would like me to use snaps (I would prefer not to), honestly it still pushed a great deal of choice onto me at the application level. These days, very few applications come in the Ubuntu installation, and if you’re expecting something like a CD burner program, audio editor, image editor, or partition editor to come built in you’re not going to find it. That’s true of a lot of distributions today. The rise of broadband has left behind the traditional “everything you need is on the CD” approach, and moved towards maintainers largely focusing on creating a very shiny fresh installation that does almost nothing of actual value out of the box.

So I’ve found myself in love with Puppy Linux.

I think a lot of people don’t realize that Puppy Linux is actually a pretty broad family of distributions, each building on top of different package sets from other distributions, like Debian, Ubuntu, Slackware, and Void Linux. Some of the comments I’ve made with regards to not having enough out of the box still apply to some of the pups, like VanillaDpup, a very minimal pup that sprinkles a bit of Puppy magic on top of an otherwise largely stock debian system.

Others though, they feel complete right from first boot. You’ll find multimedia tools, tools for creating bootable devices, tools for collating all of your low level system information, tools for organizing contacts and managing a calendar. There’s a built-in password manager, an IRC client, a webcam recorder, an email client, a torrent downloader. Hell, there’s a built-in program for writing and hosting a personal blog. There’s a fucking PORT SCANNER. BUILT. IN!!! What the FUCK? There is a GUI for configuring and launching samba, a GUI for making SIP calls I am losing my MIND. That’s not even half of it, and all of this is just in the base image for FossaPup64, which I might add, fits on a CD-R with 300MB to spare.

And of course I can still install packages from ubuntu or debian or whatever.



To complement all this, I feel like I genuinely do have complete control of my system. There’s an actually intuitive UI for changing default applications. There’s a GUI for cron jobs. I can edit the bootup script to add in or take out whatever I want and I don’t have to worry about those changes getting stomped on later or causing issues. Hell, I could swap out the init system manually if I really wanted, though I’m not sure why I would. Most of the time there’s a configuration option to do what I want, and when there isn’t I can just change the system, because it’s designed to be my system, not my maintainer’s system.

The downside is that in comparison with the mainstream, the system is not particularly maintainable in the long term. Years down the line I fully expect to be installing a new system from scratch and migrating my configurations over to that. Sometimes packages that rely heavily on post-install scripts don’t install quite right. I would never deploy a fleet of puppy linux systems, nor would I feel inclined to try and run it in a datacenter. I’m quite happy with other options for that.

But the key behind puppy’s attraction for me is that it makes so many decisions for me, guides me through the decisions I need to make myself, but allows me to question, challenge, and change ANY of its choices however I see fit without putting up a fight. And the folks on the forum are particularly helpful too. I haven’t really gotten all of that together anywhere else.

I’m tired of making choices that don’t matter, but I still want options that make a difference to me. If I can get that in a more maintainable way, honestly I’d take it, but with Puppy, for now, I’ve found a zen.


MNT Pocket Reform Has Problems, But I Still Want It
2022-07-05T00:00:00+00:00
I am a notorious lover of small computers. At the start of the year I wrote about my darling little Vaio VGN-P that I use all the time for communications, writing, and even dev work. But these things are long out of manufacture and difficult to repair. In current year you can get some compelling options from GPD, but their keyboards are pretty bad, so they’re out of the question for me. So, when I heard about the MNT Pocket Reform, I was pretty fucking interested. My first impressions are pretty good, but there’s some problems that could end up being deal breakers.



To understand my critique, you need to know that I have repetitive stress injuries in both my hands/wrists that I have to actively manage. As long as I do everything right, I’m not in any pain and I can do all the work and play I want to. If I fuck up, I get to suffer for a week. That’s the whole reason I got that little vaio to carry around in the first place, so I could use my phone less. On top of that, I have a more general physical disability that makes it infeasible for me to carry around a heavy laptop everywhere I go, so weight is another huge factor here. If it wasn’t, I’d just bring my Thinkpad x220 everywhere and be happy. The difference between 2 and 3 pounds in a laptop may be nothing to you, but it means the world to me.

So from that perspective, what do I like? Well, first of all it’s small, so that’s a good sign for the weight. They haven’t announced weight yet but I’d be willing to bet that if I got the PLA case option, and swapped out the battery cells to bring it down from the stock 8000mAh to 4000mAh, it’d be well under 2 pounds.

It’s also got a trackball. I’d prefer a trackpoint, but trackball is still better for me than a trackpad, so I’ll take it. Given this is MNT we’re talking about, everything is open source, so it’s theoretically possible to design a trackpoint module for this thing. Personally I don’t have the energy to work on that myself.

Also, since it’s completely open, that’s a big win for repairability. This is something that I’d be less concerned about breaking than I am with my vaio, because if it breaks I have the knowhow to do something about that.

But… I have some concerns. My first issue is with the keyboard. It’s ortholinear, which might be fine for me, or might not; honestly I haven’t used an ortho keyboard long enough to find out. But more pressing is that they plan to ship with Kailh Choc White switches. These are clicky switches, and clicky switches are VERY good at causing my RSI to flare up. I’d love the option to buy it with tactile browns instead. If that’s not possible, I’d take it with the switches unpopulated so I can just solder my own in instead. That’d be preferable to having to desolder the stock switches.

My second bug bear is the display. It’s 1920x1080p at 7”, giving a PPI of 300. I find this rather excessive. At native scaling, everything will be too small for me to read, and UI scaling is something I don’t enjoy using. It also seems like it’ll needlessly strain the low-tier hardware it’ll ship with. But honestly, the vaio is in the same boat, so I’m already used to upscaling from a lower resolution to solve this problem.

The last thing here bothering me is they didn’t put in a headphone jack. Seriously?? USB-C to headphone jack dongles are a pain in the ass and prone to breaking, and I’ve gotten an unpleasant blast of loud digital noise out of even an apple one.

The elephant in the room that I’ve alluded to is the low-end computational specs of the thing, but honestly I do not really give a damn. I am considering this for a usecase that is already served by something equally slow, so it’s not a factor in the equation for me.

The weight is really the wildcard here. If it’s light enough, I’m going to buy one. If it’s too heavy, I won’t. All my other problems are things I can work around in software, or with some effort doing some hardware modding. I’m looking forward to getting ahold of one of these, and if I do I will definitely be writing about it here.

Cheers!

– artemis


I Disabled GIF Animations on Cohost
2022-07-03T00:00:00+00:00
If there’s one thing that is guaranteed to piss me off in the current year, it’s a social media tool that doesn’t let me disable GIF animations. Personal websites- yeah, whatever, it’s your website. But when social media lets people blast animations into my feed, I’m just not ok with that. Today’s offender is cohost, some hip new website for Posting that some of my friends are using. The GIFs are fucking everywhere and I want them to stop. Websites shouldn’t take all the blame here. If we weren’t living in a hell disguised as middle-earth, browsers would have a setting to disable GIF animations at the browser level. But we don’t, so websites have to do all sorts of bullshit if they actually want to support disabling GIF animations, especially if they want play-on-hover support. Anyways, today cohost is in my line of fire and I decided to do something about it. Here’s my janky userscript that replaces gifs with canvases, with a snapshot of the gif drawn on.

What this script doesn’t do is play-on-hover. I might get to that later, but I’m not sure how to do it yet. For now, this makes the animations stop, and that’s enough for me. Oh and, if I start seeing shit with CSS animations, I will absolutely also modify this script or spin up a userstyle to turn that stuff off too. I just haven’t seen any yet to test on.

How to use it

I want to make it clear that this script has not seen much testing yet. Nevertheless if you’re into this kinda shit, you can use it by following these steps:


  Install Tampermonkey
  Click on this link to the script: static-cohost-gifs.user.js
  If tampermonkey is installed, it should prompt you to install the script.


If something goes wrong, I’m not really in a position to provide support. Sorry. But if something goes wrong and you fix it, please feel free to email me what the problem was and a copy of your modified script so I can update my code accordingly.

Just read the code here if you want

I’ve copied the script here below if you just want to read it:

// ==UserScript==
// @name         static cohost gifs
// @namespace    https://artemis.sh/
// @version      0.1
// @description  make the gifs stop
// @author       artemis everfree of the violet spark
// @match        https://cohost.org/
// @icon         https://www.google.com/s2/favicons?sz=64&domain=cohost.org
// @grant        none
// ==/UserScript==

(function() {
    'use strict';

    console.log("static gifs is running!");

    function replace_gif(gif) {
        let canvas = document.createElement("canvas");
        canvas.width = gif.naturalWidth;
        canvas.height = gif.naturalHeight;
        canvas.class = gif.class;

        // copied from what cohost applies to img normally. this sucks because we need to manually keep it
        // up to date for now
        let style = "max-width: 100%; display: block; vertical-align: middle;";
        if (gif.style) {
            style = style + gif.style;
        }
        canvas.style = style;

        let ctx = canvas.getContext("2d");
        ctx.drawImage(gif, 0, 0, gif.naturalWidth, gif.naturalHeight);
        gif.parentNode.replaceChild(canvas, gif);
    }

    function replace_gif_on_load(gif) {
        //gif.crossOrigin = "anonymous";
        if (gif.complete && gif.naturalHeight > 0) {
            console.log("gif is complete", gif.src);
            replace_gif(gif);
        } else {
            console.log("Registering gif callback", gif.src);
            //let replaced = false;
            gif.addEventListener("load", () => {
                //if (gif.src.match(/\.gif$/)) {
                    //replaced = true;
                    replace_gif(gif);
                //}
            });
        }
    }

    function replace_all_gifs() {
        let imgs = Array.from(document.querySelectorAll("img"));
        let gifs = imgs.filter((img) => img.src.match(/\.gif$/));
        gifs.forEach(replace_gif_on_load);
    }

    function init() {
        console.log("Registering vi's mutation observer");

        replace_all_gifs();

        let observer = new MutationObserver((mutationList, observer) => {
            console.log("mutation received");
            mutationList.forEach((mutation) => {
                if (mutation.addedNodes) {
                    mutation.addedNodes.forEach((node) => {
                        console.log(node);
                        if (node.tagName === "IMG" && node.src.match(/\.gif$/)) {
                            setTimeout(() => { replace_gif_on_load(node); });
                        }
                    });
                }
            });
        });

        observer.observe(document.querySelector("#root"), { attributes: false, childList: true, subtree: true });
    }

    if (document.readyState === "complete") {
        init();
    } else {
        let initialized = false;
        document.addEventListener('readystatechange', (e) => {
            if (document.readyState === "complete" && !initialized) {
                setTimeout(() => { init(); });
                initialized = true;
            }
        });
    }

})();



How I Use Borg for Ransomware-Resilient Backups
2022-06-22T00:00:00+00:00
If you’re in need of a backup solution for your *nix machines, BorgBackup is a great tool for it. Borg features encryption, deduplication, append-only data access for ransomware resiliency, and data compression. I’ve been using it for five or six years now and I’ve developed a strategy for deploying borg that I’ll share with you.

This isn’t a step-by-step tutorial to using borg. If you want that, you should go check out borg’s Installation Guide and Quick Start Guide which do a good job explaining it.

Do you see a problem with anything I’ve written about here? Please contact me and I’ll update the post appropriately.

Also in case it has to be said, this information is PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, and I’m not LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE BLOG POST. Do your own threat modeling and red-teaming. Don’t take my word for it.

Borg Repositories

Let’s establish what a borg repository is, and what its security properties are.

So, a borg repo stores a collection of backups. When you create the repo you specify an encryption key, encryption file, or both. The repository can be stored locally on disk (or anything that looks like a disk), but borg can also back up over an SSH connection. This creates a natural client-server model, where the data repository is stored on a server, and a client connects over SSH to that server to back itself up. SSH is a convenient means of authentication for the client-server model here.

While multiple clients can back up data to the same borg repository, any client that wants to write to the repository has to acquire a write-lock, so only one client can write data to the repository at a time. Consequently, I find that backing up multiple systems into a single repository is logistically challenging. It’s also a security risk, as multiple systems accessing the same repository would have to share a single security key. If any one system was compromised, an attacker could decrypt the data for all systems using the repository.

Instead, I host multiple repositories on a single backup server. This also provide an opportunity for additional access controls, which I’ll explain later.

Backup data is encrypted by the client before it is transmitted to the server. As a result, if the backup server is compromised, the attacker can delete or ransom the backups but they cannot decrypt them and recover the data within.

So, about the backups in the repos. Each backup is a complete snapshot in time of the file tree which is backed up, but the file data is deduplicated within a repository. If you back up a system on three separate occasions, and your /usr/bin/gcc file was the same in each snapshot, only one real copy of that data is stored. The specifics are actually a bit more complicated, since under the hood this works by breaking the file into chunks and deduplicating those individual chunks. As a result, you can get deduplication of files within the same snapshot, or even partial deduplication of files that were only appended or partially rewritten.

File data is also compressed within the repository, and you have your choice of algorithms like lz4, gzip, lzma, and zstd. I prefer lz4 on my resource-limited systems. Everywhere else I use zstd level 3 for a nice balance of compression speed and ratio.

How I Use Borg

Client Systems

First things first, I generate a long encryption key for the system. Never re-use keys between systems. I store one copy of that in a password vault, and another goes on the client system.

On each of my systems, I create a folder called /backup which is owned by root and unreadable to any other user. Within that folder, I create two files, env.sh and backup.sh. I also generate an SSH keypair.

env.sh stores environment variable definitions for the repository, including the backup destination and encryption key. The borg command looks at variables prefixed with BORG_ for additional configuration beyond the command line flags. Also note the BORG_BASE_DIR variable. This tells borg to use /backup to store working data, and the metadata cache for speeding up future backups.

$ cat env.sh
#!/bin/sh
export BORG_REPO='[email protected]:/path/to/repository'
export BORG_BASE_DIR=/backup
export BORG_PASSPHRASE="a randomly generated password unique for every system"
export BORG_RSH='ssh -i /backup/system-name-ssh-key'


backup.sh stores the actual command to create the backup. Here’s the simplified version:

$ cat backup.sh
#!/bin/sh
source /backup/env.sh

borg create "$@"                            \
    --stats                                 \
    --one-file-system                       \
    --compression auto,zstd,3               \
    --exclude /backup                       \
    --exclude /root/.cache                  \
    --exclude /home/**/.cache               \
    "$BORG_REPO::{hostname}-{now:%Y-%m-%d}" \
    /                                       \
    /mnt/otherfilesystem
    # add more paths as desired

# Note, you can use %Y-%m-%d_%H:%M as the time format
# string to also put the hour and minute in there.


Maybe you don’t want a full system backup and instead just want to capture some specific repos. Not a problem. Any directory you ask borg to backup will be recursively traversed, but it won’t follow symlinks, so don’t worry about that. You may also want to use application-specific backup processes in your backup script. For example, databases may not like being restored with a snapshot that was taken while the database was running, so you might want to exclude the database’s data folder and create/store a database dump instead.

Be mindful of swapfiles too by the way. You probably don’t want to back those up.

I split these two files so that I can interactively source env.sh to more easily work with the repository manually when I need to. Additionally, since backup.sh passes through any additional parameters passed in, I can manually run /backup/backup.sh -p to watch a backup happen in real time.

With these files in place, I then go over to my backup server to complete the configuration there.

Server System

On the other side of the coin is the backup server. I could have one backup server or multiple, it doesn’t really matter because this scales out to however you want to do things.

On the server I create a user called borg which all clients will use to connect in. The user has no password, so each client has to use their own SSH key to connect. This allows us to place restrictions on those connections. Each entry in the authorized_keys file looks something like this:

restrict,command="borg serve --append-only --restrict-to-repository /mnt/backups/borg/torchic-repo" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIIuGYc6VTb21SVzRehdi3Pd+AgVnw3g6JB66LK36IVdI root@torchic


The restrict flag basically tells OpenSSH to prevent the client from doing anything fun like port forwards, X forwarding, opening extra channels, sftp, and so on. I believe that on older systems you have to write out all the restrictions one by one, but restrict was introduced as a forward-compatible way to tell OpenSSH to basically sandbox the client into interacting with whatever command is being executed and nothing else.

The command argument forces a specific command to run when a client connects with that SSH key. borg serve was made for this purpose, and communicates with the connecting borg client over stdin/stdout. The --restrict-to-repository argument restricts the client to a specific repository, and, crucially, --append-only prevents the client from deleting any existing data in the repository.

With all this done, I borg init -e repokey-blake2 from the client system, run a backup manually, and then set up a cronjob or systemd timer to create a new backup daily from then on.

Replication

Replicating backups off-site is broadly out of scope for this post, but I do want to say that the repos are structured very regularly, a bit like this:

.
├── config
├── data
│   ├── 0
│   │   ├── 1
│   │   ├── 101
│   │   ├── 103
│   │   ├── 996
│   │   └── 998
│   └── 1
│       ├── 1000
│       ├── 1002
│       └── 1008
├── hints.1168
├── index.1168
├── integrity.1168
├── nonce
└── README


As a result, it’s very easy to mirror this to any media you’d like, such as a cloud bucket storage, a tape archive, or just some other hard drives.

Restoring data

Restoring the backup is a complicated topic that depends on the nature of the backup. For the most simplistic full system backup, all you’ve got to do is borg extract a backup into a freshly formatted filesystem, set up your bootloader, fix up /etc/fstab with any new filesystem UUIDs, and you’re good to go. Your mileage may vary, and you’ll need to develop your own processes here depending on your situation.

I often just borg mount a backup and copy over only the bits I care about. Nothing wrong with that if it suits your needs.

Threat Modeling

This is not a comprehensive look at the nitty gritty of the security, just the bits I think I can include in this overview. Borg has really great documentation on this topic, so I urge you to read their security FAQs and security internals pages for more information.

The nonce file

Before getting into specific scenarios, I need to tell you about the nonce file.

Every repo has a nonce file in it. Be mindful of the repo’s nonce file. This is used in the encryption and is modified whenever a client writes data to the repository. If a write ever re-uses a nonce, the repo’s encryption can be broken. See this FAQ entry for more info. When writing, a client will use the greater of either the server’s copy of the nonce value or the client’s cached value. If two clients write to the same repo, an attacker with server-access could reset the nonce value after client A writes data, causing client B to re-use the same nonce (it doesn’t know the nonce was incremented). This is another reason to stick to one client per repo.

This raises the question of how to manage the nonce file in an append-only replica of the repository. You could just exclude it from the replica, since it’s not needed for reading data. If your backup client has the latest nonce cached, this is fine. If it doesn’t, then the client will re-use nonce values, and this breaks the encryption.

If you are ever in a situation where a client may re-use a nonce, you should consider the repo’s encryption broken. The simplest solution is to make a new repo and migrate your old data into it.

Now let’s get into how the deployment I described stands up to various attacks.

Scenario 1: An attacker compromises a server

So an attacker compromises the server, and they can access the backup files. They could delete the backups. They could prevent clients from making new backups. They could ransom the backups.

If your server is automatically replicating backups offsite in a push-configuration, the attacker may also be able to delete or ransom your off-site backups. Therefore if you do want automatic push-replication, you ideally want to do it in a way restricts your server to append-only replication, much like the clients are restricted to append-only writes to the server.

Because I use one client per repo, an attacker with server access can’t cause nonce reuse unless the client loses its cached nonce. Still, if an attacker compromises your server, your safest bet is to create a new repository for future backups.

Theoretically, an attacker could also exploit a flaw in the client-server protocol to hack the client when it connects in to the server. To the best of my knowledge, there are no known flaws that could allow this at this time.

Scenario 2: An attacker compromises a client

If the attacker compromises the client, and they get root access, they get the encryption key. This is perhaps not terribly exciting for them, because they can also just read the client’s files right from the disk. Still, this key allows them to decrypt backup history, which may provide data that isn’t currently on the client.

They can also deny service to the other backup clients by uploading lots of data to the server. This can be mitigated with borg serve --storage-quota, and with per-connection bandwidth limits on the server at the network layer.

If the attacker finds a flaw in borg serve --restrict-to-repository, they may be able to break out of this repository restriction and access other clients’ repositories. This is not immediately concerning, because they can’t decrypt the data in those repositories, but if they find further exploits in borg serve they may be able to delete those repositories.

Additionally, if the attacker can find a flaw in borg serve that allows them to get code execution, they could more completely break out of that sandbox and gain a foothold on the server that way.

Both of these can be mitigated by creating separate users for each client, applying AppArmor or SELinux controls to the borg serve process, and other system-level isolation techniques on the borg server. Use similar logic if you’re running something other than Linux on the borg server. For example, you FreeBSD folk are probably reaching for your FreeBSD jails already.

Scenario 3: An attacker compromises server replicas

This is very dependent on how you’ve decided to replicate your data. In all cases, the attacker won’t be able to decrypt data unless they also get ahold of the encryption keys, but they can certainly try to ransom the replica.

If you use a pull-configuration with a replica server connecting to the primary borg server to download data, the replica may be able to delete that server’s data. It’s up to you to put in the access controls to give it read-only access.

Scenario 4: An attacker compromises your password vault

If an attacker compromises the password vault, they get access to all the keys needed to decrypt the data. At this point, your backups should no longer be considered encrypted. However, the attacker still needs to get access to the backup server or a replica to get the actual backup data.

Rotate your encryption keys immediately, do all your other incident response, etc.

I recommend against storing your SSH keys in the vault. If a client dies and you need to rebuild it, just make a new SSH key. If you store your SSH keys in the vault, the attacker can just use those to connect to the borg server and get the data.

Scenario 5: A repo has a stale write-lock

The borg client will not break an existing write-lock held by a different system. If the write-lock was acquired by the client system, but the process that acquired the lock is dead, the client can prove the lock is stale and automatically break the lock (perhaps it was rebooted while backing up).

You may occasionally end up in situations where the write-lock was acquired by the client, but the client can’t prove it. Maybe the system was hard-rebooted and the lock was not sync’d to disk properly. In this case, the client will be safe and avoid breaking the lock, with the assumption that the lock may be held by another connection.

If this happens, your client won’t back up new data. I recommend setting up some alerting system to let you know when your backups fail for any reason, so you can investigate yourself and do whatever you need to do to clear things up (breaking locks, repairing data, etc.)

Learning more

I’ve only scratched the surface of borg. Borg has some of the best documentation of any tool I’ve had the pleasure of using, and you should really go read it if you want to learn more. Head on over to their readthedocs pages at borgbackup.readthedocs.io.


More Oxide at Home: My Pi is a Wireless Crucible
2022-06-14T00:00:00+00:00
Welcome back to me making bad decisions with Oxide software. Today we’ll have a look at Crucible, a networked storage service. Now, “network storage” can mean a lot of things, from low level block devices to high level bucket stores. Crucible sits at the low level end of the spectrum, and is intended for stuff like block devices for virtual machine. I’m going to tell you what I’ve learned from digging through the crucible git repository and talking to Josh Clulow (seriously, thanks for answering all my 2AM questions). Then for dessert I’m going to turn a Raspberry Pi into a WiFi USB drive backed by a Crucible datastore on my OpenIndiana box, complete with block-level encryption and data replication.

How Does Crucible

Before we start using Crucible, I’ll give you a high level overview of the project as I understand it. I’ll take a look at the on-disk data format, and I’ll explain a bit about replication so you know the overhead we’re working with. Now, I didn’t write Crucible, nor have I read every line of code, so please mind the knowledge gap as you tread through this section. I’ll keep this post updated if I get word that something isn’t quite right.

Layer Cake

From a high level you’ve got four layers to think about.

At the very bottom is the real physical backing storage. These are real bytes, used by real people, and they exist in a place (or multiple places), and they actually store data. Whether that backing storage is a disk, a slab of ram, an S3 bucket, or an array of redstone torches in minecraft, it doesn’t particularly matter as long as the next layer up knows how to use it. For the purposes of crucible, it’s just a folder on a machine.

The next layer is Downstairs. Downstairs runs on whatever machine has the real data. It stores it in a format called the Crucible Region format (TM) (tm), and spends its time waiting around listening for requests from Upstairs. Crucible is designed with replication in mind, so you can have multiple Downstairs instances all providing access to the same logical data replicated across multiple physical nodes.

Let’s climb our way up to Upstairs, the third layer. Upstairs is a runtime that applications can use to communicate with Downstairs. It implements the network protocol as well as higher level features like encryption, data verification, replication to multiple Downstairs regions. Apps that want to use Crucible (called Guests) import Upstairs, launch it in a tokio runtime, and then communicate with that task with the API. There’s no Upstairs daemon or anything like that, it all stays in-process.

And finally at the top is the “guest” application. This is the code that wants to store some data. For example, it could be a virtual disk driver for a virtual machine, something Propolis has already implemented. You could also use it to implement a Network Block Device server, which again has already been done. Basically, any program that wants to store some blocks of data, that’s the guest application. It imports Upstairs, which talks to Downstairs, which talks to the raw storage. Now we’ve got bytes flowing around and they’re doing something meaningful.

Crucible Regions

What’s in a Region? Let’s create one and find out!

$ mkdir var
$ cargo run -p crucible-downstairs -- create -u $(uuidgen) -d var/region1
Created new region file "var/region1/region.json"
UUID: 0ea9975b-43dc-e237-d2d3-e693c0b65959
Blocks per extent:100 Total Extents: 1


Ah json, we meet again.

$ cat var/region1/region.json
{
  "block_size": 512,
  "extent_size": {
    "value": 100,
    "shift": 9
  },
  "extent_count": 15,
  "uuid": "0ea9975b-43dc-e237-d2d3-e693c0b65959",
  "encrypted": false
}

$ tree var/region1/
var/region1/
├── 00
│   └── 000
│       ├── 000
│       ├── 000.db
│       ├── 001
│       ├── 001.db
│       ├── 002
│       ├── 002.db
│       ├── 003
│       ├── 003.db
│       ├── 004
│       ├── 004.db
│       ├── 005
│       ├── 005.db
│       ├── 006
│       ├── 006.db
│       ├── 007
│       ├── 007.db
│       ├── 008
│       ├── 008.db
│       ├── 009
│       ├── 009.db
│       ├── 00A
│       ├── 00A.db
│       ├── 00B
│       ├── 00B.db
│       ├── 00C
│       ├── 00C.db
│       ├── 00D
│       ├── 00D.db
│       ├── 00E
│       └── 00E.db
├── region.json
└── seed.db


So what can we deduce from this?

First off, we’ve an extent_count of 15, and we’ve got 15 pairs of files 00/000/. Those are probably the extents themselves. Next, the extent_size is 100 shift 9. Let’s do some maths:

$ wc -c var/region1/00/000/000
   51200 var/region1/00/000/00

$ python -c 'print(100 << 9)'
51200


So yeah those are our extents, they’ve all got a uniform size, and that size is calculated as value * 2^shift. Something else to note is 2^9 is 512, the same value as our block size. Downstairs told us each extent stores 100 blocks, and 100 * 512 = 51200, so the math checks out. Ok, so the files with no file extensions are the data, what about the db files? OpenIndiana’s file command can’t identity them, but the one on my Linux box can:

# file 000.db
000.db: SQLite 3.x database, last written using SQLite version 3038005


Haha just standard SQLite. That means we can look inside pretty easily.

$ sqlite3 region1/00/000/000.db
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> .output extent-db.txt
sqlite> .dump
sqlite> .exit

$ cat extent-db.txt 
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE metadata (
                    name TEXT PRIMARY KEY,
                    value INTEGER NOT NULL
                );
INSERT INTO metadata VALUES('ext_version',1);
INSERT INTO metadata VALUES('gen_number',0);
INSERT INTO metadata VALUES('flush_number',0);
INSERT INTO metadata VALUES('dirty',0);
CREATE TABLE encryption_context (
                    counter INTEGER,
                    block INTEGER,
                    nonce BLOB NOT NULL,
                    tag BLOB NOT NULL,
                    PRIMARY KEY (block, counter)
                );
CREATE TABLE integrity_hashes (
                    counter INTEGER,
                    block INTEGER,
                    hash BLOB NOT NULL,
                    PRIMARY KEY (block, counter)
                );
COMMIT;


I checked out seed.db and the dump of that is exactly the same. Maybe it’s there so they can cheaply copy that file in-place to initialize new extents? Literally yes. While I’m looking at the code, we can get some more information about the extent metadata from the comments, but I’ll let you read that yourself if you’re particularly interested.

So a crucible region is a bunch of extent files that store some data blocks. Each extent has a corresponding SQLite database to keep hold of some basic extent metadata and any integrity hashes. This encryption_context table suggests crucible can handle encrypting data, rather than relying on the underlying data store for that. Then there’s a basic JSON file that specifies the parameters of the region, and that’s about it! This encryption thing has me interested though- is encryption performed Downstairs or Upstairs?

Encryption

With a quick search for “encrypt”, I have my answer: Upstairs. Checking the encrypt_in_place() function, we see that Upstairs encrypts the data before it sends it off to Crucible. But! It also sends the hash of the data AFTER encryption, so that Crucible can do integrity checks without the decryption key. Interesting.

So remember, Upstairs runs with your application, not your datastore. Since Upstairs handles encryption, this means that in one fell swoop you get both encryption at rest as well as some encryption in transit. If an attacker compromises your Downstairs, or intercepts your transmissions, they don’t immediately get access to your data. They’ve got to get ahold of the encryption keys first. I’ve been told that transport-layer encryption and authentication mechanisms for the protocol are also on the roadmap, but I don’t know how much of that is done.

But, there are some attacks that are currently possible. The code itself points out two possibilities:

// XXX because we don't have block generation numbers, an attacker
// downstairs could:
//
// 1) remove encryption context and cause a denial of service, or
// 2) roll back a block by writing an old data and encryption context


So an attacker that compromises Downstairs could hold on to some valid data and its associated encryption context, and then present that later to Upstairs to show it an older data state. That’s a bit abstract- how could someone actually exploit this?

Well, indulge me as I let my infosec side through a bit. Let’s say our hypothetical attacker Alice has the following access:


  Control of Downstairs
  Some unprivileged shell on Upstairs


Upstairs in this scenario is running Debian or Ubuntu, with the rootfs backed by Crucible. Whenever someone installs a package with apt-get, an entry is logged in /var/log/apt/history.log, which is by default world-readable. With this access, Alice can watch as packages are installed and correlate package installations with data writes.

A few months later, someone discovers a serious security flaw in something installed on the system. The system administrators are on their game and rapidly deploy a patch to it, but even if they hadn’t, automatic upgrades would have kicked in soon enough. By the time a working proof of concept is published openly to the world, the system is already immune.

But Alice, now she has a trick up her sleeve. Once she has a working exploit, she can roll back the system’s version of that package and take advantage of it. As long as she has access to Downstairs, she’s also free to sit on that exploit and wait for the perfect time to strike. When an attacker is under pressure, they make mistakes; here that pressure is removed.

So there’s a fun story to get you thinking about how these attacks can play out. But it’s here that I want to remind you that this is an attack they’re already thinking about. They could make changes to crucible to defend against this sort of thing, or they could decide to implement that defense in another layer of the software stack. Something more interesting is to imagine attacks they haven’t thought of, but I’ll leave that as an exercise for the reader. ;)

Data Integrity and Replication

Crucible’s design is incredibly straightforward and integrity and replication are no exceptions. An Upstairs application can replicate data out to multiple Downstairs nodes, typically at least three. Downstairs instances don’t talk to each other, so Upstairs just writes the same data out to each node. This does linearly increase your network traffic from Upstairs with your number of nodes, but it also means there’s no complicated consensus protocol between Downstairs nodes. A fair trade, when you’ve got cutting edge network equipment.

When reading data, Upstairs sends read requests out to all the Downstairs nodes at once. It’ll give the guest application the data from the first Downstairs node that returns something valid, but it’ll also collect the responses from the remaining nodes to make sure they’re returning what they’re supposed to. Data is only valid if its hash is valid, and if it decrypted properly.

There’s also some extra checks when a Downstairs node connects to make sure all the nodes are in a consistent state. If you’re curious give this comment a read for the details.

Let’s Crucible!

Downstairs

That’s enough theory for one day, now let’s do something with it! First we need to create some regions. I’m feeling like I want a 128GiB data store, and I’ll be using 128MiB extents so there’s not so many files to deal with. That means we’ll need 1024 extents, each containing 262144 blocks. Also, I learned you need to tell it in advance when a region will be encrypted. So, here’s three regions.

vi@illumos$ cargo run -q -p crucible-downstairs -- create -d var/region1 -u $(uuidgen) --extent-size 262144 --extent-count 1024 --block-size 512 --encrypted true
Created new region file "var/region1/region.json"
UUID: 8c459730-4000-6772-9368-a3c0083f6e8c
Blocks per extent:262144 Total Extents: 1024

vi@illumos$ cargo run -q -p crucible-downstairs -- create -d var/region2 -u $(uuidgen) --extent-size 262144 --extent-count 1024 --block-size 512 --encrypted true
Created new region file "var/region2/region.json"
UUID: 8600e7f9-f6c0-c7b2-f4d5-c8f4466d829a
Blocks per extent:262144 Total Extents: 1024

vi@illumos$ cargo run -q -p crucible-downstairs -- create -d var/region3 -u $(uuidgen) --extent-size 262144 --extent-count 1024 --block-size 512 --encrypted true
Created new region file "var/region3/region.json"
UUID: fffd19ea-84d2-c765-fc8d-fe815b7f1b62
Blocks per extent:262144 Total Extents: 1024


Let’s check on that data.

vi@illumos$ du -h var/
29.3M   var/region1/00/000
29.3M   var/region1/00
29.3M   var/region1
29.3M   var/region2/00/000
29.3M   var/region2/00
29.3M   var/region2
29.3M   var/region3/00/000
29.3M   var/region3/00
29.3M   var/region3
88.0M   var

vi@illumos$ wc -c < var/region1/00/000/000
 134217728


Hah, nice. So the files are allocated, but since they’re all full of zeroes, we’re not actually paying for that storage space yet. This is just my file system at work. Since everything in ZFS is copy-on-write, ZFS doesn’t have a reason to pre-allocate the data.

Now we can spin up some Downstairs nodes, each on their own port, and each backed by a different region. I’ve listed three commands here, but I ran them each from their own terminal.

vi@illumos$ cargo run --release -q -p crucible-downstairs -- run -p 3810 -d var/region1/
Opened existing region file "var/region1/region.json"
UUID: 8c459730-4000-6772-9368-a3c0083f6e8c
Blocks per extent:262144 Total Extents: 1024
Using address: 0.0.0.0:3810
No SSL acceptor configured
listening on 0.0.0.0:3810
Repair listens on 0.0.0.0:7810

vi@illumos$ cargo run --release -q -p crucible-downstairs -- run -p 3820 -d var/region2/
Opened existing region file "var/region2/region.json"
UUID: 8600e7f9-f6c0-c7b2-f4d5-c8f4466d829a
Blocks per extent:262144 Total Extents: 1024
Using address: 0.0.0.0:3820
No SSL acceptor configured
listening on 0.0.0.0:3820
Repair listens on 0.0.0.0:7820

vi@illumos$ cargo run --release -q -p crucible-downstairs -- run -p 3830 -d var/region3/
Opened existing region file "var/region3/region.json"
UUID: fffd19ea-84d2-c765-fc8d-fe815b7f1b62
Blocks per extent:262144 Total Extents: 1024
Using address: 0.0.0.0:3830
No SSL acceptor configured
listening on 0.0.0.0:3830
Repair listens on 0.0.0.0:7830


Once again I am pleased with how easy this is to do. Now let’s get weird with it.

No Big Deal

I don’t really feel like writing my own code to use Crucible, but luckily I don’t have to. As I mentioned earlier, there’s a reference implementation of the Network Block Device protocol that we can run to get a regular ol block device to use however we’d like. From now on we’ll be working from my Raspberry Pi 4, and the first thing we’ve got to do is build Crucible over there. Then we can run the NBD server!

vi@pi$ cargo build --release
vi@pi$ cd target/release
vi@pi$ ./crucible-nbd-server -k 'buttslol' -t 172.16.254.177:3810 -t 172.16.254.177:3820 -t 172.16.254.177:3830
Crucible runtime is spawned
The guest is requesting activation with gen:0
thread 'crucible-tokio' panicked at 'Key length must be 32 bytes!', upstairs/src/lib.rs:151:17


Ok it looks like we’re giving this thing a raw 256-bit AES key. Let’s try with a 32-byte long string?

vi@pi$ ./crucible-nbd-server -k 'Key length must be 32 bytes! lol' -t 172.16.254.177:3810 -t 172.16.254.177:3820 -t 172.16.254.177:3830
Crucible runtime is spawned
The guest is requesting activation with gen:0
thread 'crucible-tokio' panicked at 'could not base64 decode key!: InvalidByte(3, 32)', upstairs/src/lib.rs:148:37


Nope, it actually wants 32 bytes that have been base64 encoded. Works for me!

vi@pi$ echo -n 'Key length must be 32 bytes! lol' | base64
S2V5IGxlbmd0aCBtdXN0IGJlIDMyIGJ5dGVzISBsb2w=

vi@pi$ ./crucible-nbd-server -k 'S2V5IGxlbmd0aCBtdXN0IGJlIDMyIGJ5dGVzISBsb2w=' -t 172.16.254.177:3810 -t 172.16.254.177:3820 -t 172.16.254.177:3830
172.16.254.177:3810[0] looper connecting to 172.16.254.177:3810
172.16.254.177:3820[1] looper connecting to 172.16.254.177:3820
Wait for all three downstairs to come online
[...]


There we go! But we don’t have a block device yet. For that we’ll use nbd-client.

vi@pi$ sudo apt install nbd-client
vi@pi$ sudo modprobe nbd
vi@pi$ sudo nbd-client -C 1 -b 512 -p localhost /dev/nbd0
vi@pi$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
nbd0         43:0    0  128G  0 disk
mmcblk0     179:0    0 59.5G  0 disk
├─mmcblk0p1 179:1    0  256M  0 part /boot
└─mmcblk0p2 179:2    0 59.2G  0 part /


Now we can do some tests to see what kind of bandwidth we get. Let’s try 16MiB of data.

# write test
vi@pi$ sudo dd if=/dev/zero of=/dev/nbd0 oflag=direct conv=fsync bs=4M count=4
4+0 records in
4+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 31.2571 s, 537 kB/s

# read test
vi@pi$ sudo dd if=/dev/nbd0 of=/dev/null iflag=direct bs=4M count=4
4+0 records in
4+0 records out
16777216 bytes (17 MB, 16 MiB) copied, 9.73241 s, 1.7 MB/s


Wow. That write speed is terrible. Like, 40 times slower than my slowest SD card writer. The read is a bit better but still not great. This is for the most part not crucible’s fault; no, this is just the harsh realities of the Pi 4 WiFi. Even with 802.11ac, I only get about 10MiB/s up and 6-8MiB/s down over the wifi. That’s worse than 100mbit ethernet.

Remembering that each read gets replicated three times, we’re actually pulling down 5.1MiB/s of block data from Downstairs, so we’re getting good use out of our download bandwidth. Writes are a bit rougher, because we have to wait for all three writes to finish before it’s considered done. Granted, we have a much higher round-trip latency on WiFi than we’d get on a wire, and Crucible isn’t designed for this. Still, this could probably be making better use of the bandwidth available.

Regardless, this is still enough to do something, so let’s introduce the last piece of the puzzle.

Go Go Gadget: Mass Storage Device

Linux has this thing called USB Gadget Mode, where it can act like a USB Peripheral on a USB On-The-Go (OTG) port. The Raspberry Pi 4’s USB-C port is actually an OTG port, so the Pi 4 can do this. With the Mass Storage Gadget, Linux can present a USB Mass Storage Device backed by any block device or raw data file on the system. More plainly, we can turn our Pi into a USB Hard Drive. This is extremely well-trodden ground, so I’ve been following thagrol’s USB Mass Storage Gadget guide.

So let’s turn Crucible into a bootable USB Drive shall we? Coming in at only 21MB, TinyCoreLinux is an excellent candidate for this.

vi@pi$ sudo dd if=TinyCore-current.iso of=/dev/nbd0 bs=1M oflag=direct conv=fsync
22+0 records in
22+0 records out
23068672 bytes (23 MB, 22 MiB) copied, 51.6277 s, 447 kB/s


Now we need to make the Crucible NBD server and NBD client run at boot, and set up the USB gadget. I set up a couple runit services to handle this.

vi@pi$ cat /etc/sv/crucible/run 
#!/bin/sh
exec /home/vi/crucible/target/release/crucible-nbd-server -k 'S2V5IGxlbmd0aCBtdXN0IGJlIDMyIGJ5dGVzISBsb2w=' -t 172.16.254.177:3810 -t 172.16.254.177:3820 -t 172.16.254.177:3830

vi@pi$ cat /etc/sv/nbd/run 
#!/bin/bash
sv start crucible
modprobe nbd
while ! nbd-client -C 1 -b 512 -p localhost /dev/nbd0 &>/tmp/wtf.txt; do
        sleep 1
done
sleep 1
modprobe g_mass_storage file=/dev/nbd0
permafrost


Last but not least, I added dtoverlay=dwc2,dr_mode=peripheral to my /boot/config.txt. Believe it or not, that’s everything. I plugged my Pi into my desktop, pressed the power button, and waited in anticipation. A few tens of seconds later, I had a desktop.



I’m not sure why I was surprised that this worked, but I was. I’m streaming data real time from an illumos server in the other room to my raspberry pi, which is presenting it as one of the slowest USB devices in the world, and somehow this all just works.

Faster.

Now that I’ve satisfied my dark desire to make a wireless USB drive, let’s add an ethernet cable and see what this stack can really do. I tested with a lot of extent and block sizes to find the fastest for this setup, and disabled encryption for good measure. After all that, here’s the best I got:


  4096-byte block size
  16384-block extent size
  6.9 MiB/s read-speed
  3.8 MiB/s write-speed


That’s much better than on WiFi, but it could be better still. At peak read we’re still only using about 170 megabits/s raw network bandwidth, less than 1/5 the bandwidth available. Crucible’s NBD server is intended for testing, not production, so I thought maybe it wasn’t making the best use of the Upstairs layer that it could. I was right.

Turns out the Crucible team hasn’t actually implemented the NBD protocol themselves. Instead, they used the nbd crate, which takes any struct with Read + Write + Seek implemented and turns it into an NBD server. This is really great for getting something working fast, but the problem is everything is synchronous. This model just doesn’t allow you to pipeline your IO operations. What are our other options?

I didn’t want to implement the NBD protocol myself, but luckily there’s a lot of nbd crates out there. nbd-async seemed promising at first, but while it does use async IO, it’s structured in such a way that it can only serve one NBD request at a time. I could have modified it to support parallel requests, but there’s a better option: nbdkit.

nbdkit is the go-to standard for writing nbd servers. It’s been around the block, and supports writing servers in a lot of languages, including rust. You provide a dynamic library in the plugin format it understands, and it handles the rest. I naiively assumed that using it with rust would be a frustrating endeavor full of unsafe code and C FFI, but that could not be further from the truth. All you have to do is implement their Server trait, and their plugin!() macro takes care of all the FFI glue to expose it as an nbdkit plugin.

I did have to rework the NBD server to be a dynamic library instead of an executable. That was interesting. There’s no main() function, and the load() function doesn’t have a direct way to give state to the rest of the server, so I hacked a couple of global mutexes in and called it a day. Also, the rust nbdkit library doesn’t build properly on aarch64 because they made some bad assumptions about C types (c_char isn’t u8 everywhere, sorry y’all), so I had to work around that. Anyways, after that rework I was rewarded with… well, some benefit, mostly in read speeds. With 512-byte blocks I got


  7.9 MiB/s read-speed
  5.1 MiB/s write-speed


And with 4096-byte blocks, I got


  18.4 MiB/s read-speed
  4.1 MiB/s write-speed


You can see my code on my github fork, complete with hardcoded downstairs IP-addresses, since I didn’t feel like dealing with nbdkit configuration. It’s a bit hacky but it did the trick.

It’s interesting that the reads and writes are now scaling opposite of each other with block size. With 4096-byte blocks we got up to 500mbit/s raw read bandwidth, accounting for replication, so that’s pretty good! But I have to conclude I’m running into a bottleneck with either Upstairs or Downstairs, because there is definitely still bandwidth left in my gigabit networking that Crucible could be using. The writes in particular are very inconsistent, with the network graph bouncing up and down throughout the write. But hey, that’s what you get when you take something that tells you not to use it yet and use it anyway!


 


I’m tabling this project for now, but if I can squeeze a bit more out of it in the future I have some other silly ideas for Crucible that it was definitely not meant for. Thanks as always for stopping by!


How I Listened To Music In The 2010s
2022-05-30T00:00:00+00:00
The early 2010s was when I first got into music in a big way. Early on, I just had some files in the first music player app I thought looked cool, but as my library grew my methods for listening to music got more and more convoluted, culminating in a DIY cloud streaming setup. I wanna write about it, for my own memory and amusement. Maybe you’ll get a kick out of it too!

Early Days

When I first got into music, I didn’t have a smartphone yet. I did have an iPod Touch, but honestly I didn’t really find it all that compelling to try to use. iTunes was really confusing and frustrating to me, and I kinda just used it to play games and listen to podcasts. So I listened to all my music on my computer in quod libet, a GTK music app that’s still my favorite desktop music player to this day. It takes ages to index a large library from scratch, but at that time that wasn’t even a consideration for me, and once everything is indexed adding new stuff goes fast.

Once I got my first smart phone, the world of music on mobile opened up to me. I could just plug the thing in and drag music to it like a flash drive. But also, there were music apps that looked so cool. This was in the early 2010s and I was rocking Poweramp, an app with gapless playback and an incredible neon aesthetic that felt right at home with the Holo era of android. Gods I miss the Holo era. Just look at this.



At the time I used a file browser called “ES File Explorer”, which I refuse to link to since as far as I remember they eventually got bought out by an adware or spyware company or something like that, but it could act as an SFTP server. Whenever I got new music, I’d fire up the SFTP server and transfer it over from my desktop. I didn’t have too many albums so this worked ok. I had a lot of pony fan music, some furry artists, and a bunch of more mainstream EDM DJ sets.

At first it was mostly 128-192KB/s ish MP3s (for the podcasts), and some high quality MP3s for the other stuff, since I was mostly pirating music at the time. Later, I started getting paid to write game mods, and so I had some disposable income to buy music from artists directly. Most of the stuff I was into was on Bandcamp, so I started amassing a collection of FLAC files. This was great for my archivist tendencies, but terrible for my storage space on disk. What was I to do?

Before this ever truly became a problem, I recognized that my SFTP strategy was just not going to cut it on its own. So I took the obvious course of action and started transcoding. I’d used ffmpeg a couple times in the past, but this is when I really cut my teeth on it. Over a few days I put together a fish script that would incrementally transcode my audio library from my main archive to another folder for holding the phone transcodes. But I also wasn’t content to just transcode everything; why trancode my mp3s and m4as? They were already compressed enough as it was. Eventually I could just run a single command to transcode any new music, SFTP the results to my phone, and have my music library in sync in a matter of minutes.

This again worked well for awhile. Then I discovered Vaporwave.

The Collection Grows

The thing about Vaporwave is that there is a LOT of it. Not all of it is good, but even reddit’s recommendation list at the time was massive. There was an open file server that had a huge collection of nearly all the releases to date at that point, so I figured I’d just download the whole collection so I’d have everything on hand as I went through others’ recommendations. Around the same time, the Business Casual label was offering their entire discography at an absurdly low price, so I picked that up too.

Now I had decidedly Far Too Much Music. I may have been able to afford an upgrade to a 64GB microSD, but even with that I’d be pushing up against it with my mp3 transcode. That’s when I learned about opus. I fucking love the opus codec. I don’t know that many people can say they have a passion for codecs, but I sure do, and opus has gotta be my number one favorite. Opus was designed for low-latency interactive applications like Skype calls, but sort of on accident ended up outperforming AAC-LC, AAC-HE, and Vorbis too, all while being open and royalty free. It also degrades gracefully at lower bitrates; Even once artifacting is noticeable, it ends up just having a lo-fi vibe to it rather than the obnoxious arifacts you hear with something like low bitrate mp3.

All this made opus the perfect fit for my phone, which desperately needed something new to let me continue cramming my entire music library on it. I updated my previously-mentioned script to transcode my entire library to 96kbit/s Opus files. I experimented a bit with other bitrates, but this was my personal sweet spot. I could notice a little loss of detail in the high end in the percussion of a few particular tracks, but I really had to actively listen for that, and by and large the quality degradation wasn’t noticeable to me.

The other problem was that encoding my huge new influx of audio files was going to take ages. I now had 300 gigs of audio, and one core just wasn’t cutting. This was how I learned about GNU Parallel. GNU Parallel is like xargs, but it runs commands in parallel (wow go figure). This way I could let it handle juggling ffmpeg invocations so I could fully saturate my 8-thread i7-2600k.

However, this move to opus necessitated a change in audio app too. Nowadays Poweramp supports opus, but at the time it did not. In fact, most apps at the time didnt, and of the ones that did, only one had both gapless playback and a UI that I found enjoyable: GoneMAD. Nowadays, the UI I used is available in “GoneMAD Classic”, but I’m on iOS now so I don’t know much about how it holds up. At the time, it did everything I needed, most of important of which was browsing the file tree directly. A lot of my music library didn’t have proper metadata tags, so I manually sorted things into folders and relied on being able to browse my own organization structure. Surprisingly, a lot of music players didn’t have this feature.

Time To Make Mistakes

The thing is, when you’re into archival, you only ever get more data. I collected large music dumps as certain artists and labels disappeared from various scenes, and my own tastes expanded. Eventually, even my opus solution was hitting the limits of my on-phone storage. I could’ve upgraded my storage again, but at the time I finally had consistent LTE coverage over my town, so I decided to be a bit more ambitious. I wanted to stream my music from home, and put my storage problems to rest once and for all.

So what did the stack look like for that?


  MPD ran on my computer and transcoded music in real time
  Icecast ran on a server in the cloud, taking in a stream from MPD and providing multi-client support, better buffering, and a web URL.
  ympd ran on the cloud server too and let me control mpd on the go (the MPD control apps didn’t work as well).
  nginx sat in front of icecast and ympd providing TLS and HTTP Basic Auth.


Here’s a screenshot of ympd, which I’ve inverted because otherwise it’s obnoxiously bright.



The link between MPD and Icecast/ympd also ran over a pair of SSH port forwards established with autossh, since otherwise I’d be sending unencrypted credentials over the network.

I didn’t really want to run this on my desktop though, since I sometimes left it booted into Windows for gaming purposes, so I intially tried to set this up on a first generation Raspberry Pi B. Now, I don’t know if you remember just how incredibly slow the first pi was, but it could not handle real-time transcoding to opus. I tried. These days, a RockPro64 can transcode opus at 32x real time speed; that’s how far we’ve come in the low-cost ARM SoC world. If memory serves, the Pi could barely do AAC but it was borderline enough that I didn’t trust it. So, for a bit at least, I was back to transcoding to mp3 or vorbis. But I bumped up the bitrate to compensate since I had the data for it in my data plan, and this worked fine.

Except for when it didn’t work out. Because the problem with real-time streaming to a phone is that sometimes a phone loses and regains connection when switching cell towers. Sometimes its IP address changes for no good reason. Sometimes you go through an area of spotty reception and get a bunch of packet loss. Neither Icecast nor MPD nor the apps I used for streaming music were prepared to deal with any of this properly. As a result, I had buttons on my home screen to launch a script to restart icecast and MPD as needed. I had to kill my streaming app from time to time. It was really REALLY cool when it worked, but it was so frustrating when it didn’t. But, I’m nothing if not comitted to the bit, so I used this setup long enough to get a Raspberry Pi 2 and switch over to opus encoding for a little bit.

I wouldn’t do it again. MPD is a pain to set up and configure, so is icecast, and making them work together is even worse. It took me a good week before I finally had everything working together reliably. I don’t remember the details, and my config files are lost to time; all I can say is, do not try this at home. Unless you’re into that kinda shit. Eventually, I got a phone with enough storage that the whole streaming thing was no longer relevant and went back to my old ways. Then I got an iPhone and tried Apple Music, at which point I was well into my synthwave phase, so all the music I wanted to listen to was streamable on there anyway.

Nowadays I’ve chilled out quite a bit. I bought a rockbox-compatible music player a few years ago, and I copy audio files over to it like it’s a flash drive once again. If I run out of space, I just delete some stuff I haven’t listened to in awhile. It’s not complicated. I even use CDs sometimes, when I’m feeling in the mood. I might bring back my opus transcoding script some day to bring my full collection over, but I’m very much over streaming music. It’s just nicer to press play and know it’ll always work.


Everfree’s ARMFerno - My Unholy Battle With a Rock64
2022-05-16T00:00:00+00:00
I’ve got this rock64, which is an aarch64 board comparable to a Raspberry Pi 3 B+ with 4 gigs of ram. For years I’ve wanted to put a distribution on here that doesn’t have a premade image available, mainly because out of all the options on that page I don’t actually like any of them. Well, except NetBSD, but NetBSD doesn’t have GPU drivers for it. Problem is, everything I do want to use provides rootfs tarballs and tells you to figure it out. To do that I’ve got to get a Linux kernel, track down the device trees so it knows what hardware it has, and then wrangle u-boot into actually booting the whole thing. I figured that would be the hard part; little did I know the depths that Single Board Computer Hell would reach.



For the purposes of the install I decided to go with Gentoo. Yeah yeah, I know; memes aside, Gentoo made sense for this project. They make it really easy to apply custom patches to the kernel and other system packages. There’s a rootfs with all the files of a base install, but they also provide an aarch64 installation ISO. I figured I could find some way to boot up that ISO and go from there (narrator: she did not boot up that ISO). So I flashed the ISO onto an SD card, and then went on to solve the u-boot part of the problem.

dd if=/path/to/gentoo.iso of=/dev/mySDCard bs=4M status=progress


First Circle - u-boot

u-boot is a bootloader that’s commonly used in embedded systems. It’s got a lot of flexibility in the build process that lets devs adapt it for whatever convoluted boot process the system needs to get going. That’s important because the boot process for ARM SoCs is almost entirely non-standard, and any similarities between chips is largely incidental. At the extreme end you’ve got those awful broadcom chips in the raspberry pi that infamously use the GPU to boot the system. Thankfully we don’t have to deal with anything that bad here.

If I was doing this when the rock64 came out I’d expect to go looking for some fork of u-boot to work with, but we live in 2022, so I just went for mainline u-boot. Configuring this is a bit like configuring a Linux kernel. First we generate a default config file.

git clone https://github.com/u-boot/u-boot
cd u-boot
make rock64-rk3328_defconfig


Then we edit it interactively with make menuconfig if we want to change anything. Once that’s done we can build the image- except we can’t yet actually.

Second Circle - ARM Trusted Firmware

To actually boot up, u-boot needs to bundle in the ARM Trusted Firmware for the SoC, so we’ve got to go get that.

git clone https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/
cd trusted-firmware-a
CROSS_COMPILE=aarch64-linux-gnu- make PLAT=rk3328 bl31
# save the path of the output for use with u-boot
export BL31="$(realpath build/rk3328/release/bl31/bl31.elf)"


How did I figure this out? Why, through the power of friendship! No seriously, I just asked my friend and she told me to do this and it worked. I don’t know where else you’d find this information on your own.

Ok, back to u-boot then. From the u-boot folder again, I built my image with the BL31 file in tow.

# this relies on the BL31 environment variable we exported in the last code block.
CROSS_COMPILE=aarch64-linux-gnu- make -j4


Now we’ve got some binaries, and the main ones we care about are idbloader.img and u-boot.itb. idbloader.img is the very first thing that runs when the chip starts booting from the SD card, and that needs to go at sector 64 (using 512 byte sectors). u-boot.itb on the other hand has an address configurable in the menuconfig, and at the time of writing the default in upstream u-boot is sector 16384. idbloader jumps into the main u-boot code after early initialization, so if we change u-boot’s offset we’ve got to reflash idbloader too.

There’s two approaches you can take to flashing this onto the SD card from here if you’re following along at home. The first option is to write a third file, u-boot-rockchip.bin, at sector 64. This is a bundle of both the idbloader.img and u-boot.itb files, with padding in between. The downside is, this also obliterates the partition table, so I flashed them separately instead.

dd if=idbloader.img of=/dev/mySDCard bs=512 seek=64
dd if=u-boot.itb of=/dev/mySDCard bs=512 seek=16384


If all goes well, you’ll get something like this when you power on the board:

U-Boot TPL 2021.07 (Apr 30 2022 - 00:50:36)
LPDDR3, 800MHz
BW=32 Col=11 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=4096MB
Trying to boot from BOOTROM
Returning to boot ROM...

U-Boot SPL 2021.07 (Apr 30 2022 - 00:50:36 -0700)
board_init_sdmmc_pwr_en
Trying to boot from MMC1
NOTICE:  BL31: v2.6(release):v2.6
NOTICE:  BL31: Built : 00:14:00, Apr 28 2022
NOTICE:  BL31:Rockchip release version: v1.2


U-Boot 2021.07 (Apr 30 2022 - 00:50:36 -0700)

Model: Pine64 Rock64
DRAM:  4 GiB
PMIC:  RK8050 (on=0x10, off=0x08)
MMC:   mmc@ff500000: 1, mmc@ff520000: 0
Loading Environment from MMC... *** Warning - bad CRC, using default environment

In:    serial@ff130000
Out:   serial@ff130000
Err:   serial@ff130000
Model: Pine64 Rock64
Net:   eth0: ethernet@ff540000
Hit any key to stop autoboot:  10


Hit a key to interrupt the boot sequence and get manual control over the u-boot shell, or Control-C if it’s already started trying to boot the system.

Third Circle - pxeboot

I realized at this point that while I might be able to boot from the ISO, I wasn’t able to install from it unless I copied it into a tmpfs and remounted the in-ram ISO as /, because I was going to obliterate the ISO partition table on the SD card during the install. In retrospect I probably should have done that, but I didn’t feel like figuring it out, so I took a different road.

I re-imaged the SD card with gentoo’s rootfs tarball, but then I extracted the kernel and initramfs from the ISO and slapped those in there as well. However, when I tried to boot this with u-boot’s booti command, it thought the initramfs was corrupt. It wasn’t decompressing it properly I guess, I’m really not sure. For some reason I decided the logical next step was to try to boot it with PXE instead. You shouldn’t do this. It’s a pain. What I should have done, and what you should do, is to just use an uncompressed initramfs; I’ll tell you how to do that later. But I want to document the PXE process, so here we go.

How does pxe boot work from u-boot? Here’s the rough outline:


  Run a TFTP server somewhere to host our files for u-boot to retrieve.
    
      I used atftp.
    
  
  Set up the configuration structure for pxelinux.
  Get u-boot connected to the network.
  Tell u-boot the tftp server IP, either through magic DHCP settings or manually.
  Run pxe get.
  Run pxe boot.
  Hope it works.


On my desktop I have a file tree structure in my tftp server directory a bit like this:

.
├── gentoo.igz
├── gentoo.img
└── pxelinux.cfg
    └── default-arm


The first two files are the initrd and linux kernel, and then default-arm contains this:

DEFAULT GENTOO
MENU TITLE  Installer
PROMPT 0
TIMEOUT 150

MENU WIDTH 80
MENU MARGIN 16
MENU ROWS 15
MENU TABMSGROW 20
MENU CMDLINEROW 20
MENU TIMEOUTROW 21
MENU HELPMSGROW 22

LABEL GENTOO
 MENU DEFAULT
 MENU LABEL Boot Gentoo
 KERNEL gentoo.img
 INITRD gentoo.igz
 APPEND root=LABEL=root console=ttyS2,1500000


I should tell you that a number of these config lines don’t actually do anything since u-boot only emulates a subset of real pxelinux, but they don’t hurt anything either. In particular, all those menu formatting commands are irrelevant since there’s no menu to format, but I’m leaving this file as-is since it’s what’s on my hard drive. This config also relies on your rootfs partition having the root label, but change the linux command line however you want really.

So with my desktop serving that, I booted my board into u-boot and ran

dhcp
setenv serverip my.desktop.ipv4.address
pxe get
pxe boot


This usually worked. Sometimes my board was able to hit my router, but nothing else on my network, and I have no idea why. Whenever that happened I had to power off the board for 10-15 minutes and then power it back on for it to work again. I also saw some mentions of ARP so, yeah this is low level networking issues that I just did not feel like figuring out at the time.

But with that all done, I had a booted gentoo system, so let’s move on.

Fourth Circle - Kernel

Gentoo proved to be perhaps the best choice I could have made for this project, though I didn’t realize it at the time. Gentoo’s facilities for applying patches and doing whatever you want with the kernel took some of the pain out of using all this hardware’s features, but I’m getting ahead of myself. Before we get to the good parts, we’ve got to address the elliephant in the room: compile times.

The Rock64 has a quad-core processor with Cortex-A53s clocked at 1.2ish GHz. In technical terms, that means compiling things is gonna take awhile. I have the 4GB of ram variant so that helps at least but to put this in perspective, compiling GCC took me about 18 hours straight. That’s the worst case scenario though, and everything else isn’t quite as bad. In some sense, the forced breaks on the project were welcome, as I could have easily been sucked in for even more hours at a time than I already was.

There wasn’t much left to do to finish the installation, but I did want to free myself from pxeboot. So, after installing some creature comforts, I loosely followed gentoo’s amd64 handbook until I got to building the kernel. Actually configuring the kernel took me a few hours as I poked through every menu and turned config options on for my hardware, and I still kept missing things along the way. I was using gentoo’s normal 5.15 source package, but if I had used ayufan’s kernel and defconfig I might have had an easier time. If you want to do that you can clone that repo and use

ARCH=arm64 make defconfig


Using this kernel will at least get you most of the way to a full working set of modules for the hardware. But building the driver modules isn’t enough on it’s own, because we also need to use the right ✨Device Trees✨.

Fifth Circle - Device Trees

On the x86 systems we’re all used to, device trees aren’t ever something we have to think about. The platform is standardized such that the kernel knows how to talk to all the platform hardware, and it can enumerate anything connected over PCIe automatically. On older systems you might have to worry about defining IRQs, but generally speaking if your hardware isn’t showing up on a modern amd64 Linux install, it’s just because you’re missing kernel modules or firmware.

Outside that ivory tower, we have device trees. Device trees are a static descriptor of the hardware available on a device. They describe what hardware exists, what address range that hardware is memory-mapped into, some information the kernel can use to decide what modules are responsible for it, and any additional device-specific configuration needed. This is all defined in a web of dts and dtsi files that all get compiled into a binary representation called a dtb file.

Our u-boot actually already has a device tree baked into it that it’s providing to our kernel when we pxeboot, but that device tree is wrong. The USB2.0 ports don’t provide power, for one thing, and the USB3.0 hardware doesn’t even show up in lsusb. So where’s the right tree? Good question! Here’s some of the places we could find a device tree that claims to be for the rock64 specifically:


  mainline u-boot
  ayufan’s fork of mainline u-boot
  ayufan’s older fork of non-mainline u-boot
  mainline linux kernel source tree
  ayufan’s fork of mainline linux
  ayufan’s fork of rockchip’s linux
  a patch file I got from someone in the rock64 IRC that needs to be applied to HEAD of torvalds/linux


Can you guess which device tree is the right one? That’s right, it’s either the one in ayufan’s fork of mainline linux if you don’t need hardware accelerated video decoding, or the patch file applied to HEAD of torvalds/linux if you do. I’m told that patch is getting upstreamed in Linux 5.19, so once that’s out the easy choice will be to just use the Linux 5.19 source tree and call it a day. If you need that patch now, here’s a link to it on patchwork.kernel.org.

I took the patched upstream. Once I applied the patch, I deleted the arch/arm64/boot/dts/rockchip/ folder from my 5.15 kernel source tree and replaced it with the same folder from my patched upstream kernel. Then I deleted a couple definitions for other boards that were giving me compile errors.

In either case, to build the dtb files we can go into the kernel source tree and run

make dtbs


and then to install them in /boot it’s

make dtbs_install


Sixth Circle - Booting from SD Card

At this point we’ve got the holy trinity of booting a linux system: the kernel, the initramfs, and the device tree binaries. Let’s go! I still hadn’t automated booting at this point, so from the u-boot prompt I did something along the lines of

load mmc 1:2 ${kernel_addr_r} /vmlinuz-5.15.32-gentoo-r1.img
load mmc 1:2 ${fdt_addr_r} /dtbs/5.15.32-gentoo-r1/rockchip/rk3328-rock64.dtb
load mmc 1:2 ${ramdisk_addr_r} /initramfs-5.15.32-gentoo-r1.img
booti ${kernel_addr_r} ${ramdisk_addr_r}:${filesize} ${fdt_addr_r}
Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000] Linux version 5.15.32-gentoo-r1 (root@localhost) (gcc (Gentoo 11.2.1_p20220115 p4) 11.2.1 20220115, GNU ld (Gentoo 2.37_p1 p2) 2.37) #4 SMP PREEMPT Sat Apr 30 05:14:29 PDT 2022
[    0.000000] Machine model: Pine64 Rock64
[    0.000000] efi: UEFI not found.
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty

[... it goes on for awhile ...]


And there we go. Booting!

There’s a number of ways to automate this, simplest of which is probably baking a boot script into the u-boot image. But, I never actually bothered to automate the bootup process so I am unfortunately leaving this one as an exercise for the reader. Sorry!

Seventh Circle - Responsive GUI

With USB working, I could finally run startx from the TTY and get a GUI, and, ooooh boy was it slow. I’m talking, you drag a window and watch it follow behind. You can watch the pixels update over a few frames after minimizing a window. I’m the girl that uses an 800MHz laptop with software rendering on the daily, and I’m saying it’s slow. So what’s the problem?

$ glxinfo | grep llvm
OpenGL renderer string: llvmpipe


Well that might do it. No GPU. You see, I had built my system with support for panfrost, but what I actually needed was lima. Who the hell is Steve Jobs, you ask?

lima balls.



Anyway, now that I’d remembered that my GPU was a Mali-450, and got the correct VIDEO_CARDS setting in my make.conf, I ran startx again. Guess what, it ran EVEN WORSE!! I wish I was joking, but no. It was somehow less responsive. I did a test and our humble glxgears got 25fps full screen with both hardware and software rendering- the only difference was CPU usage. What is going on?

Well quite simply, X sucks on embedded hardware. I’m a noted X-apologist, and even I have to face facts on this one. So, I installed sway, a wayland window manager inspired by i3wm. Starting sway, I was pleased to see that dragging windows around was actually fast, how incredible. Not only that, glxgears bumped its way up from 25fps full screen to 50fps! There’s no escaping the fact that this GPU is extremely an embedded GPU, but at least it gets the job done.

And with all that, we can graphically multitask like I teased at the top of the post. We’ve got some low framerates, but it’s responsive!


  
    
    
    minetest running while a video of never gonna give you up plays.
  


But what about that video in the corner? It’s looking particularly choppy…

Eighth Circle - Hardware Accelerated Video Playback

Surely video playback can’t be that hard right? We’ve been doing hardware accelerated video for literally decades. How could we ever need cutting edge software for that?

Good question! The problem is, up until recently there has been almost no standardization of this stuff on SoCs. In x86 land we’ve ended up with something that feels a bit like two competing standards, VAAPI and VDPAU. VAAPI is pushed by Intel, VDPAU is pushed by Nvidia, AMD has over the years used both, and there’s wrapper libraries that translate between the two for applications’ benefit. Technically, nothing stops SoC vendors from implementing one or both of these standards. In fact, some even have! But it’s not a given, and there’s a lot of vendor-specific stuff going on.

ffmpeg and by extension mpv have support for Rockchip’s “Rockchip Media Process Platform”, so that’s what I chased down for a day or two. As it turns out, this only works with Rockchip’s fork of the Linux kernel. The video decoding hardware has support in mainline Linux, but it’s using a completely different interface called “Video4Linux2 Request”. As far as I can tell, Video4Linux started out as an API for accessing video capture devices, TV tuners, and the like. These days it’s grown beyond that, and one thing it can do is facilitate hardware video decoding. Finally it seems like we might be approaching a standard API to support SoCs’ weird signal chains.

So there’s a driver in the kernel in staging called rkvdec which supports the rock64 and rockpro64’s hardware with v4l2. It’s been in there for awhile, so if you want to stick to an LTS kernel you can get it in 5.15. We’ll also need the v4l2 modules, and that dts patch I mentioned earlier to detect the hardware properly. With that all out of the way, you should see a /dev/video1 file after booting- that means we’re in business! You can confirm using v4l2-ctl:

$ v4l2-ctl -Dl 
Driver Info:
  Driver name      : hantro-vpu
  Card type        : rockchip,rk3328-vpu-dec
  Bus info         : platform: hantro-vpu
  Driver version   : 5.15.32

[...snip...]

Codec Controls

  h264_profile 0x00990a6b (menu)   : min=0 max=4 default=2 value=2 (Main)

Stateless Codec Controls

  h264_decode_mode 0x00a40900 (menu)   : min=1 max=1 default=1 value=1 (Frame
-Based)
  h264_start_code 0x00a40901 (menu)   : min=1 max=1 default=1 value=1 (Annex
 B Start Code)
  h264_sequence_parameter_set 0x00a40902 (h264-sps): value=unsupported payload type flag
s=has-payload
  h264_picture_parameter_set 0x00a40903 (h264-pps): value=unsupported payload type flag
s=has-payload
  h264_scaling_matrix 0x00a40904 (h264-scaling-matrix): value=unsupported payloa
d type flags=has-payload
  h264_decode_parameters 0x00a40907 (h264-decode-params): value=unsupported payload
 type flags=has-payload



Ok, next problem, upstream ffmpeg doesn’t have support for the V4L2-Request API yet. Right now, you can get a fork with support from jernesk/FFMpeg on github. I’ve also created a patch file that applies cleanly to the upstream 4.4.1 source tarball if you want to use that instead. You’ll need to pass --enable-v4l2-request to configure to use it.

Finally, make sure your mpv is actually using the right ffmpeg if you have more than one installed. If it is, you can pass --hwdec=drm-copy to mpv, and you’ll be decoding video with hardware!

By the way, a lot of this is also documented in Mainline Hardware Decoding on the pine64 wiki.

So I did all that, and what was my reward? Well here’s the punchline, video playback was actually CHOPPIER than without hardware decoding. WHY?! I don’t have a perfect answer for you. My gut feeling is this is a memory bandwidth problem. You see, in wayland the only acceleration method we can use is drm-copy. As the name implies, the data path here is something like


  ffmpeg demuxes our file.
  h264 frames are sent to the media engine hardware.
  decoded frames are sent back to system memory.
  mpv then uploads these frames to the GPU in the OpenGL layer.
  (?) There might be some colorspace conversion that has to happen here too.
  sway composites this into the framebuffer that’s finally displayed on screen.


I may be missing some steps there, but the gist is, that’s a lot of memory bandwidth used for just a single frame. I’ll be generous and guess that we’re using 3 bytes for pixel; a 1920x1080 frame is 5.9MiB. If we have to shuffle that frame around even 3 times, we’re already at 533MB/s of bandwidth used for a 30fps video minimum. Add onto that the other things the system has to do and the latency involved with a number of these operations, and this little thing just cannot keep up. With software decoding, yeah the CPU is doing all the decoding work, but the reduced memory bandwidth used pushes it ahead just a little bit.

“But Artemis, kodi can do smooth playback, how does it do it?”

Well, to answer that let’s go out of wayland and back to the TTY. Now run

mpv --hwdec=drm /path/to/file.mp4



  
    
    
    Smooth 30fps 1080p playback of the demo ambience by Quite.
  


Perfectly smooth playback of ambience by Quite at 1080p30fps. Not a frame dropped. When we use --hwdec=drm instead of --hwdec=drm-copy, the data path is much more direct from file to media processing engine to display. None of these intermediary copies involved. Since mpv has exclusive control over the display, it can easily take the fast path. No pesky windowing system or composition stack in the way. Same goes for kodi! But, sadly, we can’t used --hwdec=drm inside Wayland.

Technically, there’s no reason this fast path couldn’t be taken from within Wayland or even X. On raspberry pi, for example, you can use the very janky omxplayer to take the fast path over there, rendering the video as an overlay atop the session. Rockchip devices can do the same sort of thing, and with proper code you can even make it fit in with the windowing manager cleanly by positioning the output overtop a window. It’s just, nobody has bothered to write the code to do it.

Well, almost nobody.

 Apparently nobody has bothered to write something that can take the fast path under Wayland or X on rockchip
 Because I guess everybody just gives up and uses kodi if they want a media center, or otherwise stops using the thing
 And the hardware in the pinebookpro is fast enough it can just brute force the inefficiencies at pbp resolution
 oh uh
 my coworker did
 but he can't release it


Gotta love intellectual property law.

Ninth Circle - Adjusting the Display Resolution

It was here at the ninth circle of ARM hell that my journey came to an end. You see, I cannot get this thing to output a display signal at anything other than 1920x1080. If I set it to 1280x720, I don’t get a signal output. If I set it to any manner of standard VGA resolutions, I don’t get a signal. If I plug it into my 1280x1024 monitor, it sure claims it’s in 1280x1024 mode, but there’s no signal. This is constant across the TTY, X11, Wayland. It simply does not matter. I felt, and still feel, like I’m losing my mind when I talk about this. It’s supposed to be in the right mode, its just, there’s no output. Here, look!

vi@shiny ~ $ swaymsg -t get_outputs
Output HDMI-A-1 'Unknown GH18PS 0323ME0502' (focused)
  Current mode: 1280x1024 @ 60.020 Hz
  Position: 0,0
  Scale factor: 1.000000
  Scale filter: nearest
  Subpixel hinting: unknown
  Transform: normal
  Workspace: 1
  Max render time: off
  Adaptive sync: disabled
  Available modes:
    1280x1024 @ 60.020 Hz
    1024x768 @ 60.004 Hz
    800x600 @ 60.317 Hz
    800x600 @ 56.250 Hz


And at this point I had to give up. I tried to pick this apart for a day or so, but ultimately I decided enough was enough, and I powered my board down. I fought the Rock64, and the Rock64 won.

Roll Credits

Thank you to everyone that helped me along the way writing this post. There is no way I could have figured this all out on my own; it is so hard to find accurate information about these boards online, especially as the kernel and userspace are both constantly changing around these devices. In particular I want to shout out


  Will, for teaching me u-boot and helping me wrangle it into working on this board.
  linear, for answering gentoo and u-boot questions, and helping me get lima working.
  CounterPillow in the Pine64 IRC, for getting my video acceleration working. The dts patch, the ffmpeg fork, all those links came from them.
  The folks over in #gentoo on liberachat, for answering a lot of my beginner gentoo questions.



A System-Witch’s Package Manager Murder Mystery
2022-04-11T00:00:00+00:00
It was a calm day on the puppy-linux thinkpad. The CPU was cool, the RAM was operating at double data rate, and a gentle breeze flowed through the chassis as the fan whirred away. A girl wanted to log onto Discord to chat with her friends, but much to her dismay, Discord needed a package update! No matter, such things happen from time to time, but she’d need to uninstall the old version first. She threw on her systems-witch hat and set to work.

The girl walked over to urxvt and activated her package manager. “Hello package manager, could you pkg u discord”? The package manager began to reply, “Uninstall the package discord:”.

There was a pause. Package managers need to think things over sometimes. Suddenly, she heard a shriek over her terminal:

/usr/sbin/pkg: line 5912: grep: Argument list too long
/usr/sbin/pkg: line 5912: uniq: Argument list too long
/usr/sbin/pkg: line 5913: mv: Argument list too long
ESC[32mUninstalled:ESC[0m discord
/usr/sbin/pkg: line 268: grep: Argument list too long
/usr/sbin/pkg: line 5933: which: Argument list too long
/usr/sbin/pkg: line 1: wc: Argument list too long
ash: -le: argument expected
/usr/sbin/pkg: line 1: wc: Argument list too long
ash: -le: argument expected
/usr/sbin/pkg: line 1: wc: Argument list too long
ash: -le: argument expected
/usr/sbin/pkg: line 1: wc: Argument list too long
ash: -le: argument expected


The last two lines repeated over and over as the script clawed desperately at the air, its mind spinning in circles. The blood drained from our little witch’s face. She hung up the terminal and the package manager slumped down. Her thoughts were swirling, but one question stood out among all the others: Why?

With not a moment to waste, she hurried to the scene of the crime.

$ sed -n '5910,5914 p' < /usr/sbin/pkg

  # clean up user-installed-packages (remove duplicates and empty lines)
  grep -v "^\$" ${REPO_DIR}/user-installed-packages | uniq > ${REPO_DIR}/user-installed-packages_clean
  mv ${REPO_DIR}/user-installed-packages_clean ${REPO_DIR}/user-installed-packages



Odd. Just a typical grep and and uniq command. The uniq doesn’t even have any arguments! How can the argument list be too long when there’s no arguments? Surely, there must be an explanation. Someone or something had killed her package manager, and she was going to figure out what.

The girl retreated into her mental archives. Was this a shell problem? A Linux problem? Argument list too long, argument list too long… Linux certainly has a maximum argument length for programs. Was the something setting the limit to zero somehow? She paged through her memories searching for something, anything, that mentioned argument lists. Finally she found something. A memory, not about arguments, but environment variables.

You see, when the Kernel executes a program, it provides the current set of environment variables directly adjacent to the command line arguments in memory. Could it be that the argument length limit applied to the environment variables too? A query online said yes, but the stack exchange is wily and not to be trusted without verification. Our protagonist dived into the linux source code.

In fs/exec.c she found a function named bprm_stack_limits, which had this to say on the matter:

  limit = max_t(unsigned long, limit, ARG_MAX);
  /*
   * We must account for the size of all the argv and envp pointers to
   * the argv and envp strings, since they will also take up space in
   * the stack. They aren't stored until much later when we can't
   * signal to the parent that the child has run out of stack space.
   * Instead, calculate it here so it's possible to fail gracefully.
   *
   * In the case of argc = 0, make sure there is space for adding a
   * empty string (which will bump argc to 1), to ensure confused
   * userspace programs don't start processing from argv[1], thinking
   * argc can never be 0, to keep them from walking envp by accident.
   * See do_execveat_common().
   */
  ptr_size = (max(bprm->argc, 1) + bprm->envc) * sizeof(void *);
  if (limit <= ptr_size)
    return -E2BIG;


She chuckled, That second half of the block comment was an echo of a recent attack on polkit. But there at the bottom was the answer to the question at hand: max(bprm->argc, 1) + bprm->envc. The kernel source agreed, environment variables take space away from the argument list. If an environment variable was too big, it could stop the shell from running programs at all! But what environment variable could have gotten that large?

The intrepid system administrator returned to the package manager’s corpse, this time peering above the injury.

$ sed -n '5906,5910 p' < /usr/sbin/pkg

  # remove $PKGNAME from user-installed-packages
  NEWUSERPKGS="$(grep -v "^${PKGNAME}" ${REPO_DIR}/user-installed-packages)"
  [ "$NEWUSERPKGS" != "" ] && echo "$NEWUSERPKGS" > ${REPO_DIR}/user-installed-packages



Hmm… so the entirety of user-installed-packages was loaded into NEWUSERPKGS. How big was that file?

$ wc -c /var/packages/user-installed-packages
172474 /var/packages/user-installed-packages


Well, that certainly seemed large enough to overflow a reasonable argument list length limit. It seemed to our witch that she finally had a suspect. But, how could she be sure? There weren’t any commands exporting that variable, and it’s unbecoming to levy such an accusation against a line of code without reasonable proof. Perhaps if there were some way to print the exported environment? She tried adding an env to the script, but it was no use.

/usr/sbin/pkg: line 5910: env: Argument list too long


Of course, env was a separate program, and the shell couldn’t launch those. But maybe it could run a builtin command. If exports were the problem, maybe they could be the solution too!

$ export --help
export: export [-fn] [name[=value] ...] or export -p
    Options:
      -f	refer to shell functions
      -n	remove the export property from each NAME
      -p	display a list of all exported variables and functions


There, -p! That’s what she needed. She sprinkled an export -p into the code.

export EDITOR='vim'
export HOSTNAME='puppypc1400'
export KICAD_PATH='/usr/share/kicad'
export LS_COLORS='bd=33:cd=33'
export NEWUSERPKGS='tmux_3.0a-2|tmux|3.0a-2||Utility;shell|750K|pool/main/t/tmux|tmux_3.0a-2_amd64.deb|+libc6&ge2.27,+libevent-2.1-7&ge2.1.8-stable,+libtinfo6&ge6,+libutempter0&ge1.1.5|terminal multiplexer|ubuntu|focal|
vim-common_8.1.2269|vim-common|8.1.2269|1ubuntu5|Filesystem;filemanager|375K|pool/main/v/vim|vim-common_8.1.2269-1ubuntu5_all.deb|+xxd|Vi IMproved - Common files|ubuntu|focal|
vim-runtime_8.1.2269|vim-runtime|8.1.2269|1ubuntu5|Filesystem;filemanager|30765K|pool/main/v/vim|vim-runtime_8.1.2269-1ubuntu5_all.deb||Vi IMproved - Runtime files|ubuntu|focal|
[... snip ...]


The witch grinned. There it was, NEWUSERPKGS printing out as far as the eye could see. With a culrpit identified, she had what she needed to resurrect her package manager from its untimely death. She added an export -n NEWUSERPKGS above the wound, re-aligned her runes, and sent a jolt of energy into the package manager. Its eyes lit up.

$ pkg u discord
Uninstall the package discord:  
Uninstalled: discord
$ 


She’d done it! Her package manager was back once again, alive and well.

But there was a loose thread dangling. There were no export commands around NEWUSERPKGS, so why was it exported to the child processes? Could a shell be instructed to export all of its variables automatically? She consulted the stack exchange once more.

“What do you mean export all at once? you can use semi colons to define in one line” said a voice in the stack exchange. “Your question is unclear” chimed another. But finally, a moment of clarity: “set -a: When this option is on, the export attribute shall be set for each variable to which an assignment is performed”. She took once more to the package manager, sifting gently through its code with her regular expressions:

$ grep -C2 'set -a' /usr/sbin/pkg
#====================  main functions  ======================#

set -a

# utility funcs


There it was. The trouble maker that had set this all in motion. A bit of a silly choice for a shell script, but so things were. She offered to remove it from the package manager, but it expressed misgivings that it might start malfunctioning. She nodded. The package manager was back on its feet, so better to let it be, for now.

With her Discord client updated, the UNIX witch gave her package manager a gentle pat, and she bid it back to its slumbers. She’d wake it again, when the time arose. For now, it deserved some rest, and so did she.


Could Hubris and WebAssembly Allow High-Level Hardware Emulation?
2022-03-30T00:00:00+00:00
WebAssembly is a specification for a portable binary execution format, which has grown far beyond its original intent of simply providing an alternative runtime for running code in web browsers. Notably, it has a segmented memory model that is unlike the usual flat address space most programs are accustomed to running in. Not that programs don’t usually run within some level of memory segmentation, but it’s usually not something tracked alongside your pointers and is handled a bit lower level than that. WebAssembly also makes complicated memory allocation environments where multiple library may have multiple memory allocators a bit of a bear to get working, and this has been the source of much debate hammering out the standard moving forward. Therefore, WebAssembly tends to be at its utmost smoothest when everything is statically linked, memory regions are known statically, and there’s no dynamic allocation outside these regions at runtime. And here’s where things get interesting, because that’s exactly what you get with Hubris.

I wrote about Hubris just a few days ago so I’m not going to rehash the explanation I gave there. Go read the start of that, or read Cliff Biffle’s post about it for the more comprehensive overview if this is new.

Hubris is a kernel designed to run on embedded systems. These systems typically have flat memory layouts, and low amounts of total system memory (think in the kilobytes to megabytes). Hubris has a task model whereby all tasks are given a fixed amount of memory at a fixed address during compile time. This memory allotment does not change over the runtime of the application. Tasks can not share memory between each-other except by passing around Leases. A task with a lease can ask the kernel to access the lease’s memory region on its behalf.

Now, Hubris itself is intended to be portable across multiple CPUs, with all the low-level ARM stuff that’s ARM or Cortex-M specific self contained enough to allow for a potential Risc-V implementation in the future or something like that. So, what if we compiled Hubris for WebAssembly?

WebAssembly (wasm) outside the browser is interesting because there are existing wasm runtimes that already implement the actual execution of the wasm bytecode, and are flexible enough to be molded into doing whatever hairbrained ideas you want just by forking them and messing around with the implmentation. I believe all the platform-specific stuff in the kernel and userlib for Hubris can be implemented by providing a function body in the rust code that just calls out to the wasm runtime. Then, you can implement the logic to make it actually do what it’s supposed to do in the wasm runtime itself.

Why would you want to do this though?

Well, not just for the sake of putting Hubris somewhere weird, although that is neat. No, what’s interesting about this is it could allow for high-level emulation of target embedded systems for the purpose of automated testing.

So like, think about how you’d test something that talks to a database or a web API. Generally you create something that pretends to be that database or API, but is actually just some simple logic that expects the code you’re testing to send over some specific sequence of data, and provides a plausible response to that data in return. This doesn’t guarantee that your code is actually sending a valid SQL query or json object or whatever, but it does let you know that your code is doing what you think it should be doing, and you can automate it such that you’ll know if it ever stops doing that.

Running a Hubris app inside a web-assembly runtime could make it easier to do these sorts of tests. You could implement a virtual GPIO and serial peripheral, and create a test that makes sure your serial transfer task is indeed writing the configurations you expect to the GPIO and serial control registers. You can pretend to be the device that would sit on the other end of that serial connection, and see if you’re getting the datastream you’re expecting. It should be less complicated and potentially more performant to do this in a wasm runtime, rather than trying to hack it into qemu or something (which apparently doesn’t emulate things well enough for hubris to work anyway). At the same time, it gives you a test environment more similar to the real hardware than, for example, testing the task’s code with everything stubbed at the function level.

This wouldn’t be a substitute for testing on hardware, since the emulation is only as good as your knowledge of the hardware you’re emulating. You also may not be able to compile the full app as it would be deployed on hardware, and may have to omit tasks that rely on inline assembly or anything cortex-m specific. But, I think it could complement it. These sorts of tests could catch things well before you even get to the point of testing on hardware, saving you write-cycles on your dev boards, while being easier to integrate into an automated git pipeline.

Barriers to making this work

The first and foremost problem I’ve already mentioned: making a runtime that could run Hubris and emulate the hardware features it needs at a high level. Because of wasm’s memory model it probably doesn’t make sense to implement this with the assumptions the ARM variant of Hubris makes. The microcontrollers have a flat memory model, whereas wasm is segmented even under the hood unless you provide a way to break out of those semantics. This might require some changes to the hardware-independent sections of Hubris, but I don’t know enough about it to say.

That said it might(?) be possible to actually give Hubris and all its tasks a flat memory model for the purposes of being more true to hardware, but it would require some clever tricks during the compilation and linking steps to get rust to actually do that. Even if it is possible, I don’t know of it makes sense to go through the effort unless Hubris really needs it to work. Take this with some salt though; it’s been awhile since I read into the details of wasm’s memory model, so much of what I’m saying about it comes from talking with my friend who created the innative wasm runtime. He’s had far more experience dealing with the toolchain at this level than I have, and it’s possible I misunderstood some of what he was telling me.

With that caveat, there’s another challenge, which is peripheral access. Interacting with peripherals on ARM chips is as simple as writing or reading predefined memory addresses. That’s sort of a problem with web-assembly though because web-assembly really does not want to represent this sort of access. Memory access requires both a memory region and an offset into that region, and you can’t just cast a constant to a pointer and expect it to compile into something sensible as far as I’m aware.

The way peripheral access crates are made actually provides a potential way out of this though. Peripheral access crates are a bunch of fancy wrappers around the raw pointers that make them nicer to work with, and they’re auto-generated from XML descriptions of the chips they’re created for. The same XML could be used to generate a drop-in crate that replaces the reads and writes with accesses into a dedicated memory segment for IO. The runtime could then pick up where the hardware normally would and emulate the memory mapped IO in the same way. Or, if you wanted, you could forego the memory song and dance entirely and make the pac readers/writers call a special wasm function instead.

Then you’d need to get your build system to override the pac crate with your runtime-specific crate, and hopefully that works out the way you want.

Is it a pipedream?

I’m not sure! And that’s part of why I’m writing this, is to find out. This is all just what I can think of right now, having only poked at this very briefly changing a few target values to wasm32-unknown-unknown in Hubris. It seems plausible to me, but I wonder if I’m missing something that makes it infeasible in practice. Maybe I’m not, and I’m onto something. Either way, now the idea is out of my head and into someone else’s.


Oxide on My Wrist: Hubris on PineTime was the best worst idea
2022-03-28T00:00:00+00:00
In my last Oxide-related post I got Oxide’s Propolis software running and said I might try and get their sled agent up and running next. Anyways that didn’t happen. Instead I ended up reading datasheets, writing rust codegen, spending 16 gigs of ram for an hour to build docs for a crate that’s just a glorified bundle of pointers, dreaming about serial data transfer, and uploading code to my smart watch over the slowest debug link I’ve ever had the displeasure of using. *Record Scratch* You’re probably wondering how I got in this situation. Well, it all started when I learned the nRF52832 microcontroller has a memory protection unit.

So yeah, I ported Oxide’s embedded kernel, Hubris, to my PineTime smart watch, and now I’m going to tell you about it. If you’re not into embedded dev much, stick around for a bit! It’s not all scary, but don’t feel bad if you have to bail as the tail end of this post descends into technical madness. If you are into embedded dev though, well, have I got a treat for you. Before I get into the how, I’m going to talk a bit about what Hubris is, why a smart watch is actually a good place to apply it, and some thoughts on things I like and things I don’t. Then I’ll tell you the tale of how I got it running on my hardware in particular. But first, a demo!


  
    
    
    demoscene twister doing a wiggle
  


(twister math based on this pico8 demo by visy)

Also, if you’re just interested in the code, here’s my fork. The GPIO and SPI code are in a pretty good place, though I’m missing a couple hardware configuration options in both. Have fun!

Hubris: what?

I’m not the authority on the topic here, and if you want an explanation from someone who wrote the dang thing you should watch Cliff Biffle’s talk about it or read the transcript.

Let me give you the basics though, so you have some grounding. Hubris is a kernel for embedded devices that uses a hardware feature a lot of people have forgotten about: the Memory Protection Unit. This piece of hardware in many ARM and RISC-V chips allows the kernel to lock down whether various segments of memory are readable, writable, and executable. Then the kernel can execute a task in that limited context. And by the way, since all IO is memory mapped on these systems, memory protection and IO protection are the same thing. If you can’t access a peripheral’s address space, you can’t access that peripheral!

But hey, what’s the big deal right? We’re all used to this in operating systems like Linux, Windows, macOS, and so on. Well, in the embedded world, it may shock you to learn that most people are just out there shoving a bunch of tasks onto a chip with a kernel that doesn’t bother with this. Those tasks can absolutely stomp on each other’s memory, do whatever IO they want, set your cat on fire, it’s a free for all in there. Some other kernels provide MPU functionality, but pickings are slim.

Hubris says “that’s bad, actually”. The result is an architecture where tasks can only interact through message-passing. Hardware interaction is encapsulated in tasks too, which helps debugging a ton. For example, you can know with certainty that if something is toggling a GPIO pin, it’s the GPIO task. You can add debugging hooks into that task to trace what’s sending messages to it, and now you have a complete high level view of everything doing GPIO. You can enforce mutexes so that two tasks can’t both simultaneously ask the SPI task to do a data transmission at the same time. It’s fantastic.

Since Hubris is written in rust, it can also get the borrow checker in on the fun. Hubris extends the concept of borrow checking with a something called “Leases”. When a task sends a message to another task, it can include a Lease to some range of memory. As the recipient of a lease, you can’t access that memory directly, but you can ask the kernel to read or write that memory on your behalf. The kernel checks to make sure you’ve got a valid lease, and copies memory between your address space and the lease sender’s address space. Since rust’s borrow checker made sure the sender had the rights to hand out the lease in the first place, the whole thing is memory safe.

Oh yeah also they have a debugger called Humility, which knocked my programmer socks off. If you’ve got a debug link to your device you can use Humility to do things like get a list of running tasks, get a backtrace of a failing task, check out ringbuffer logs, mess around with GPIO/SPI/i2c. You can go even more extreme by asking it for your tasks’ memory spaces, and then start mucking around reading or writing bytes directly in memory.

Look at this backtrace I got debugging my demo code:

vi@navi ~/p/hubris (pinetime) [1]> cargo xtask humility app/demo-pinetime/app.toml -- tasks -sl lcd
    Finished dev [optimized + debuginfo] target(s) in 3.34s
     Running `target/debug/xtask humility app/demo-pinetime/app.toml -- tasks -sl lcd`
humility: attached via OpenOCD
system time = 129006
ID TASK                 GEN PRI STATE
 3 lcd                   61   3 FAULT: PANIC (was: ready)
   |
   +--->  0x20002208 0x00008cb2 userlib::sys_panic_stub
                     @ /hubris//sys/userlib/src/lib.rs:989
          0x20002210 0x00008cb8 userlib::sys_panic
                     @ /hubris//sys/userlib/src/lib.rs:981
          0x20002210 0x00008cc0 rust_begin_unwind
                     @ /hubris//sys/userlib/src/lib.rs:1444
          0x20002218 0x000086ce core::panicking::panic_fmt
                     @ /rustc/ac2d9fc509e36d1b32513744adf58c34bcc4f43c//library/core/src/panicking.rs:88
          0x20002220 0x0000898a core::panicking::panic
                     @ /rustc/ac2d9fc509e36d1b32513744adf58c34bcc4f43c//library/core/src/panicking.rs:39
          0x20002380 0x000084f6 main
                     @ /hubris//task/pinetime-lcd/src/main.rs:113



If this seems like a boring ol’ stack trace, yeah, that’s what’s so exciting! Boring ol’ stack traces are typically not this easy to get ahold of in the embedded world, and I’ll admit I’ve stuck to printf debugging in the past rather than deal with the other debugging tools available. This is so easy that even I don’t have an excuse anymore.

Oops, Hubris on a smart watch is actually practical

I fully expected everything I did with Oxide software to be fun, but otherwise impractical for hobbyist projects at home. Hubris is different.

See, on a smart watch, you want to be able to load a bunch of apps onto your watch without worrying if the timer app you just installed is actually counting its way down to nuking your EEPROM, putting you into a bootloop, and texting your ex. At the extreme end, a particularly unlucky piece of code could soft-brick your watch until you unglue the back (breaking the watertight seal), plug a programmer into the debug port, and reprogram it. But even if it never gets that bad, it’s just nice to not have to treat any extra piece of software as a land mine.

Enter, Hubris. Tasks isolated from each other? Done. Tasks isolated from hardware? yup! That’s all the foundation you need to start building a robust watch operating system. Get yourself some dedicated tasks for stuff like input, graphics compositing, bluetooth, and baby you’ve got a stew going!

But should you use it yet?

It depends how adventurous you are, and how much you’re willing to do without support. To quote Hubris’ CONTRIBUTING.md:


  However, Hubris is not done, or even ready. It’s probably not a good fit for your use case, because it’s not yet a good fit for our use case!

  … snip …

  and so, we thought it was important to explain where we’re currently at, and
manage your expectations.
  
    We are a small company.
    Our current goal is to get our first generation products finished and in
customers’ hands.
    We’re writing Hubris in support of that goal, not as its own thing. Hubris has
a total of zero full time engineers – we’re all working on the products, and
tool development is a side effect.
    For expediency, we’re developing our server firmware and Hubris in the same
repo. We will probably split this up later to make it more obvious how to use
Hubris from other applications. But, for now, we’re primarily focused on
getting our firmware ready, because, again, we need to finish our computers.
    These points together mean that we may not have enough bandwidth to review and
integrate outside PRs right now. This will change in the future.
  


So, you shouldn’t expect support, and you shouldn’t expect someone to be available to walk you through things personally.

On the other hand, everything I did in this post, and everything I learned along the way, came almost entirely from reading the existing docs (they’re good!) and the source code (it’s good too! and commented!). I got some helpful hints along the way from Oxide folks on twitter, but I went out of my way to figure out as much as possible on my own to see if it was possible. If you’re comfortable with that, and you’re fine with using an early stage project that’s still being molded into its final form, I’m happy to report there’s nothing stopping you from using Hubris right now!

The downside to hardware isolation.



Leases are great, but they’ve got overhead:


  You’ve got to round trip through the kernel for memory accesses
  The kernel has to file its taxes to make sure you’re allowed to access the lease in question
  The kernel has to copy memory between the two tasks’ memory spaces


The first two points here are somewhat mitigated by the LeaseBufReader/LeaseBufWriter wrappers that buffer read/writes and batch the kernel calls, but this just trades CPU time for RAM, something microcontrollers famously don’t have very much of.

And of course, the message passing itself trips through the kernel and has a cost, and the SPI task has its own taxes it needs to file to work generically.

I ran into this head first when working on my graphics demo. My display is connected over a 8MHz link and uses 16-bit color, so in theory I should be able to update half the screen at 16fps, if my code did nothing else. In reality, I was getting somewhere from 1-4fps with my LCD task talking to the SPI task, sending six write messages per row of pixels. I could reduce this overhead by buffering more pixels before handing them over to the SPI task, but now I’m spending more ram, and the memcopy isn’t free either. None of this even accounts for all the GPIO messages that are sent to the GPIO task during this from both the SPI task and my LCD code.

The easy solution to this is to give the LCD task direct access to the low level hardware peripheral rather than isolating it, but there’s more than just the LCD on that SPI bus; there’s some flash memory on there too. I’m left with a choice:


  Consume more ram, cut down overhead a bit, still have slower LCD access, but keep hardware isolation
  Cut the generic SPI middletask out of the equation, and roll LCD and Flash access into a single monolithic task that Does Both Directly For Some Reason.
  Deal with the low bandwidth and keep things as they are now.


The second option here is probably what I’ll do if I keep working on this project, because LCD speed is more important than a clean separation of concerns when you’re dealing with real time user interactions. Compare these two videos of writing a solid block of color to the screen, first through the SPI task, and second with direct SPI hardware access:

The second video is a bit flickery, so fair warning.


  
  
  Screen updates are slow with SPI task


  
  
  Screen updates are fast without the SPI task


It sucks that this is a compromise I have to make. I have some weird ideas to partially mitigate the issue by creating a DMA-compatibly memory buffer in my LCD task and shoving a pointer to that through the SPI task and into the DMA SPI hardware, but I’m pretty sure this violates memory safety, and the only thing it would mitigate is the Lease overhead. Even if this worked, I’d still be stuck with large pixel buffers I don’t want or need.

I’m certain this is a challenge the Hubris folks are aware of (hey, if you’re reading and I’m missing something obvious, let me know and I’ll update the post). I’m interested to see what their solutions to this look like, or if they’re just using faster chips than me.

Intermission

Good gods I sure am writing a lot of words. I’ve been working on this for the past two weeks and it turns out I’ve got a lot to say! From this point on I’ll be talking about how I got to where I am now, the random bullshit I ran into, and how I solved it. If you’re curious about what porting Hubris to a new chip family looks like, this is for you. I also recommend checking out the commit history to see how I got here framed in code. I’ve left the commit history messy so you can see all the trials and missteps along the way.

So you want to port Hubris to a new chip

I came into this knowing absolutely nothing about Hubris, and I’m going to give this to you from that perspective, so you can see this project with fresh eyes the way I did. The first thing I did was run the first command in the README that looked vaguely useful.

cargo xtask dist app/demo-stm32f4-discovery/app.toml


stm32 is a family of arm microcontrollers that I recognize, so I started there. This command built a bunch of stuff in the drv/ and task/ folder, and generated a binary ready to flash onto a chip. drv/ and task/ have a bunch of drivers and application-level tasks respectively, but what’s in the app.toml? Well, here’s a link to see for your self. Among other things we’ve got


  The chip and board the app is intended to run on.
  The memory layout of the app.
  What tasks the app wants to include.
  Some runtime configuration switches.


A lot of the tasks I can tell I don’t need. Ping and pong look like test heartbeat apps and usart serial isn’t going to do me much good right now so I guess that’s out. Eventually we get down to three tasks that we do actually want running: hiffy, jefe, and idle.

Hiffy is the “HIF Interpreter”. I’ll let task/hiffy/src/main.rs do the talking:

//! HIF is the Hubris/Humility Interchange Format, a simple stack-based
//! machine that allows for some dynamic programmability of Hubris.  In
//! particular, this task provides a HIF interpreter to allow for Humility
//! commands like `humility i2c`, `humility pmbus` and `humility jefe`.  The
//! debugger places HIF in [`HIFFY_TEXT`], and then indicates that text is
//! present by incrementing [`HIFFY_KICK`].  This task executes the specified
//! HIF, with the return stack located in [`HIFFY_RSTACK`].


Then, according to task/jefe/README.mkdn, jefe is “the supervisory task for the demo application, which handles last-ditch error reporting, task restarting, and the like.”.

Finally, idle is scheduled when nothing else needs to run. Its sole purpose is to do nothing. Gods I wish that were me.

Bringing up the kernel

The PineTime uses an nRF52832 microprocessor, a lil baby 64MHz ARM chip with bluetooth. Hubris doesn’t have any support for it in the upstream repo so I added my own support. How did I do that? Well I woke up one morning, put on some lofi beats to write embedded software to, and over the next few hours I


  copied the app/demo-stm32f4-discovery/ folder to app/demo-pinetime/.
  renamed everything inside to pinetime.
  added app/demo-pinetime to the workspace.members of the top level Cargo.toml
  copied the chips/stm32f4.toml file to chips/nRF52832.toml, leaving the values there alone for now.
  adjusted the flash/ram addresses in app.toml according to the nRF52832 datasheet.
  commented out tasks so I only had jefe, hiffy, and idle.
  adjusted the imports in the app/demo-pinetime/ to import the nRF52832 hardware crates instead of the stm stuff.
  messed around with the openocd config file until it worked with my BusPirate.
  flashed target/demo-pinetime/dist/final.bin to the watch.
  it worked???




It’s incredible what you can do when you’re working with code that’s designed to be portable. The most important bit of this was the memory address adjustments. Chip datasheets will tell you the memory layout of your chip and your compiler and linker would really like to know this information. Here’s a screenshot from the nRF docs:



And then, here’s the corresponding bits in the app.toml:

vi@navi ~/p/hubris (pinetime)> cat app/demo-pinetime/app.toml
# bla bla bla
[outputs.flash]
address = 0x0000_0000
size = 0x0008_0000
read = true
execute = true

[outputs.ram]
address = 0x2000_0000
size = 0x0001_0000
read = true
write = true
execute = true
# bla bla bla


Neat right?

By the way, don’t use a BusPirate for flashing chips if you have something better. I love this thing and it’s a great little multi-tool but it took, no exaggeration, 15 minutes to finish uploading the firmware. I actually did it manually instead of using GDB because I was convinced GDB was just bugging out on me but in retrospect I just never gave it enough time to finish the upload. I have since purchased some proper flashing hardware and I’ll be very happy when it gets here.

Anyway, now I had a kernel doing fuck-all on a smart watch and I was incredibly full of myself. I went to twitter to claim victory like I had just cut off the hydra’s head, utterly clueless to the fate I’d just consigned myself to. See, I wasn’t content to just run a kernel. I wanted to drive the display, which means I needed to talk to the display controller. For that I needed to implement an SPI task, and in turn that lead to a GPIO task. At this point my yak stack was looking pretty tall, but the only thing to do was start shearing.

GPIO

Continuing the pattern of copy-pasting code and hammering it into submission, I copied the drv/stm32xx-sys, drv/stm32xx-sys-api, and drv/stm32xx-gpio-common folders, renaming the prefix to nrf52832. I also added these to the root-level Cargo.toml’s workspace just like with the app folder. For the rest of this post I’m going to leave that bit out, but basically, any time you’ve got a new Cargo.toml in a subdirectory you probably need to add its folder to the workspace.

The stm32xx-sys task handles GPIO and RCC configuration. These used to be separate but were merged into a single task to reduce memory usage, since every additional task costs some extra memory overhead. I didn’t know that at the time, but I did know my chip’s spec sheet doesn’t mention a direct corollary to the RCC, so I renamed my -sys folders back to -gpio and deleted all the references to RCC in the code.



The stm32 chips also have more GPIO configuration options than my nRF52832, and multiple GPIO banks. We don’t have to deal with that on the nRF chip so I cleared all that out too and reworked the API a bit to match.

The status quo of bare metal rust. Like, really bare metal

If you’ve worked with something like Arduino before you’re accustomed to having some reasonably efficient abstraction over the hardware that’s stable across different CPUs. These abstractions save you from looking up chip-specific tutorials or spec sheets to do something basic. That’s true in rust too if you use the Hardware Abstraction Layer (hal) crates, but with hubris we don’t have that luxury, because those crates assume they’re working without any sort of memory protection or CPU privilege system in place. Instead, we go a layer deeper and use Peripheral Access (pac) crates. These are auto-generated from individual chip descriptions and give a type-safe way to access chip registers with niceties like enums for multi-choice options. Here’s an example from the GPIO:

use nrf52832_pac as device;

// GPIO port 0 register set
let p0 = unsafe { &*device::P0::ptr() };

// Configure pin 2 as an output with pullup resistor
self.p0.pin_cnf[2].write(|w| {
  w
    .dir().variant(device::p0::pin_cnf::DIR_A::OUTPUT)
    .pull().variant(device::p0::pin_cnf::PULL_A::PULLUP)
});


These writers let you modify multiple fields in the same 32-bit hardware register without having to juggle a bunch of integer constants and bitwise operations. It’s pretty nice actually! The downside is these crates get pretty large, and some of them don’t even have proper docs on docs.rs. See this incredibly broken set of stm32h7 docs for example. It’s not that there’s anything complicated about the build itself, it’s just that it consumes so much resources the docs.rs backend is killing off the docs mid-build. I built the docs for this crate in particular on the big chonker I used for my last post on Propolis, and it took 16 gigs of ram and an hour real-time. The nRF52832 pac crate is fine on docs.rs, but you may have to build docs locally depending on what chip you’re working with.

Anyways, on with the show.

But wait there’s codegen

The final piece to get this all compiling was the .idol file, something I hadn’t noticed up until this point. These files describe the message passing API surface of a task, so any time you make changes to that API you’ve got to update the .idol  file too. Once again I duplicated the stm32’s sys idol file to a gpio idol file for my chip, and here’s a sample of what that looks like:

Interface(
    name: "GPIO",
    ops: {
        "gpio_configure_raw": (
            args: {
                "pin": "u8",
                "config": "u32",
            },
            reply: Result(
                ok: "()",
                err: CLike("GpioError"),
            ),
            idempotent: true,
        ),
        "gpio_configure_gourmet": (
            args: {
                "pin": "u8",
                "mode": (
                    type: "Mode",
                    recv: FromPrimitive("u8"),
                ),
                "output_type": (
                    type: "OutputType",
                    recv: FromPrimitive("u8"),
                ),
                "pull": (
                    type: "Pull",
                    recv: FromPrimitive("u8"),
                ),
            },
            reply: Result(
                ok: "()",
                err: CLike("GpioError"),
            ),
            idempotent: true,
        ),
    }
)


The nrf52832-gpio crate uses this at compile time to generate the server trait for you to implement, and the nrf52832-gpio-api crate generates a corresponding client stub to plumb the inner workings of talking to that server. All a server has to do is implement the appropriate trait and provide a main function that pumps the message queue. Clients just import the api crate and call the api like a function, with the inter-task communication hidden away when you don’t want to think about it.

Once I updated my idol file and pointed my build.rs files at it, I had a working GPIO task! All I had to do was add it to my app.toml and I was good to go. Or, so I thought. I had actually missed something very important, but to figure that out I had to try and use my GPIO for something.

Starting the LCD task

With a GPIO task up and running, I had enough to actually make my watch do something visible. The LCD backlight is just controlled by some GPIO pins, so I whipped up a quick LCD task to make it blink.

To do this, I copied task/pong over to task/pinetime-lcd and stripped out everything from the main loop except for what looked like some sleep code (it was!). I also replaced the USER_LEDS task slot with GPIO, imported the GPIO api, and sprinkled in some GPIO control of the backlight pin.

#![no_std]
#![no_main]
use userlib::*;
use drv_nrf52832_gpio_api as gpio_api;
task_slot!(GPIO, gpio);
#[export_name = "main"]
pub fn main() -> ! {
    const TIMER_NOTIFICATION: u32 = 1;
    const INTERVAL: u64 = 3000;
    const BACKLIGHT_HIGH = 23;

    // Get handle to talk to the gpio task
    let gpio = gpio_api::GPIO::from(GPIO.get_task_id());

    // Configure pin for output
    gpio.gpio_configure_output(BACKLIGHT_HIGH, gpio_api::OutputType::PushPull, gpio_api::Pull::None).unwrap();


    let mut msg = [0; 16];
    let mut deadline = INTERVAL;
    sys_set_timer(Some(deadline), TIMER_NOTIFICATION);
    loop {
        let msginfo = sys_recv_open(&mut msg, TIMER_NOTIFICATION);

        // Toggle backlight
        gpio.gpio_toggle(1 << BACKLIGHT_HIGH).unwrap();

        if msginfo.sender == TaskId::KERNEL {
            deadline += INTERVAL;
            sys_set_timer(Some(deadline), TIMER_NOTIFICATION);
        }
    }
}


Then it was the song and dance of updating my Cargo.toml and my app.toml. Here we get to see task slots for the first time! I’ll give you the abridged version from the app.toml:

[tasks.gpio]
# all the gpio config

[tasks.lcd]
# all the lcd config, but then
task-slots = ["gpio"]


So to recap,


  The rust code declares a task slot with task_slot!(GPIO, gpio);
  In our app.toml, we declare a gpio task
  we fill the LCD task’s gpio slot with that GPIO task.
  At run time, the task slot provides the GPIO task’s ID, and the rust code uses that to build a client struct to talk to GPIO.


Excellent, surely this works right? Well, uh, no. And … this GENeration number in humility tasks seems to keep going up. I think my GPIO task is crashing, and my .unwrap()s are taking the LCD down with it.

vi@navi ~/p/hubris (pinetime)> cargo xtask humility app/demo-pinetime/app.toml tasks
    Finished dev [optimized + debuginfo] target(s) in 3.41s
     Running `target/debug/xtask humility app/demo-pinetime/app.toml tasks`
humility: attached via OpenOCD
system time = 183050
ID TASK                 GEN PRI STATE
 0 jefe                   0   0 recv, notif: bit0
 1 gpio              315251   1 recv
 2 lcd               318758   3 not started
 3 hiffy                  0   3 ready
 4 idle                   0   5 ready


Trust no one, not even yourself

This, my friends, is the memory protection unit in action. There’s one little detail I didn’t mention in the GPIO section earlier, because I had forgotten it myself: we need to give our GPIO task access to the memory space of the GPIO peripheral. If we don’t, the MPU shows up and unalives our little GPIO task with no feelings of remorse.

Finally, we learn that this is what that chips/ folder is for. Every entry in our chips/nRF52832.toml defines the address and size of some memory block, and gives that block a name we can use to let tasks use it. So for GPIO, I added this to my chips file:

[gpio]
address = 0x5000_0000
size = 0x1000


And in my app.toml, I added

[tasks.gpio]
# The name of the memory range doesn't have to be the same as the task name, but in this case it is.
uses = ["gpio"]


With that, we have a glorious blinky screen!


  
  
  blinking screen


SPI’s sappin’ my sanity

The next thing to do is to actually turn the screen on and get some pixel data on there, and for that we need SPI. SPI is a serial protocol whereby one host device (our microcontroller) is connected to several client devices (our LCD, also some SPI flash memory) over three shared lines carrying bidirectional data and a clock signal. Each client device also has a dedicated chip-select signal which is pulled low to tell that device it’s being addressed and pulled high to tell it to ignore whatever’s going on on the line. Our display is connected over a SPI link, and our microcontroller has dedicated SPI hardware to use that link efficiently. We just need to write some code to use the SPI hardware.

Once again, I copied the stm32 SPI driver and started chopping away at the parts I didn’t need, since the nRF has much simpler SPI hardware with less configuration involved. It’s got two ways to use the SPI, Direct Memory Access (DMA) and the simpler register-driven variant. DMA is more efficient because we can point the SPI hardware at a large chunk of memory, tell it to go to town on that memory, and then yield to other tasks for a bit. The downside is, it’s more complicated to use. In the interest of Getting Something Working I used the simpler SPI interface that need us to feed in bytes one at a time as they’re transmitted.

Here’s where things got complicated though, not because of the SPI hardware, but the configuration around it. Our app.toml provides task configuration sections that our tasks can read at build time. The SPI driver I copied converts this configuration to a struct with all the device and mux configuration. This involves walking the toml data, validating that it is indeed a satisfiable configuration, and generating rust code to represent that configuration. I’ve never actually done rust codegen until now, but it’s not too dissimilar from something like Haskell codegen, so that part didn’t scare me off too bad.

What did cause me a headache though was this cursed error report:

error: failed to run custom build command for `drv-nrf52832-spi-server v0.1.0 (/sd/vi/home/p/hubris/drv/nrf52832-spi-server)`

Caused by:
  process didn't exit successfully: `/sd/vi/home/p/hubris/target/release/build/drv-nrf52832-spi-server-b7d1371bb53586d5/build-script-build` (exit status: 1)
  --- stdout
  --- toml for $HUBRIS_TASK_CONFIG ---
  [spi]
  global_config = "spi1"

  cargo:rerun-if-env-changed=HUBRIS_TASK_CONFIG

  --- stderr
  Error: environment variable not found

  Stack backtrace:
     0: anyhow::error:: for anyhow::Error>::from
               at /home/vi/.cargo/registry/src/github.com-1285ae84e5963aae/anyhow-1.0.44/src/error.rs:530:25
     1:  as core::ops::try_trait::FromResidual>>::from_residual
               at /rustc/ac2d9fc509e36d1b32513744adf58c34bcc4f43c/library/core/src/result.rs:1915:27
     2: build_util::toml_from_env
               at /sd/vi/home/p/hubris/build/util/src/lib.rs:60:18
     3: build_util::config
               at /sd/vi/home/p/hubris/build/util/src/lib.rs:51:5
     4: build_script_build::main
               at ./build.rs:17:25


Huh? The failing line is simply let global_config = build_util::config::()?;.

A couple hours later and I finally found the culprit. The original app I copied didn’t have any SPI, and when I was looking at the other ones that did I missed a config section down at the bottom with keys like [config.spi.spi1]. That global_config setting tells the build system what key actually holds the SPI configuration details, and if that key isn’t actually present you get the cryptic error message above about missing environment variables.

Eventually though I did get SPI up and running, and you can see a sample of the config for that below. I’m pretty happy with where the implementation is now after a few more days of refactoring and refining it down, but it could stand for doing a DMA version at some point.

[config]

[config.spi.spi0]
controller = 0

[config.spi.spi0.mux_options.lcd]
miso_pin = 4
mosi_pin = 3
sck_pin = 2

[config.spi.spi0.devices.lcd]
mux = "lcd"
cs = 25
frequency = "M8"
spi_mode = 3


Pixels pixels pixels pixels pixels

With SPI working I could start getting pixels on the screen. This is a simple case of “read the datasheet and do what it says”. The display controller in here is also very similar to the ones they have on the TI-84+CSE, something I have a history of working with, so I was right at home with it. No interlacing on this one though sadly, so I can’t do the half-resolution hack to squeeze more performance out of it. Commands are sent by holding the command pin low and sending the 8-bit command code over the serial bus, and then command data comes after with the command pin held high. I’m using 16-bit color, but it can accept 12-bit color to save bandwidth. The downside is you’ve got to worry about byte alignment, and that’s a pain.

Eventually I got a funky lil guy on my screen surrounded by undefined RAM data:



A bit more effort and a detour into demoscene research and I got that neat twister you saw at the top of the screen!

SPI is dead, long live SPI.

Remember how I mentioned earlier that SPI access from the LCD task is way faster? Well, I wanted to animate my twister and that’s when I ran into troubles, because screen updates were taking agonizingly long. It wasn’t so much an animation as it was a slideshow. As a result I was forced to cut the SPI task I worked so hard on out of the equation and give the SPI hardware address space over to my LCD task instead. This gave me the smooth animation I was looking for, but it was kind of disappointing to have to do. oh well!

There’s a long way to go

Getting this project from where it is now to a fully functional smartwatch OS would be quite the endeavor. We’d need to bring up i2c to talk to the touch screen and other sensors, get the SPI flash working, implement a proper graphics stack. We’d need to write apps for the darn thing, or even have a watch face of any sort. We’d need to optimize everything for battery consumption as much as possible. All of this, and I haven’t even mentioned bluetooth, which would require finding a good bluetooth stack written in rust, or making one.

That’s far more than I care to do myself, though it might be possible to nerdsnipe me into helping if others want to work on it too. No promises.

Still, I hope you learned something, or just found this interesting. I know I sure have!


That time I wrote a FORTH compiler for my TI84
2022-03-22T00:00:00+00:00
I’ve got something special in the works for my next Oxide-related post, but while I work on that I want to revisit an old project of mine, uninspiringly named calccomp. The README claims it’s a C compiler but that’s a lie; I never got very far on the C part of the project. What it does have is a custom Z80 assembler and a compiler for my own dialect of FORTH, with a few neat features like the ability to write an infinitely recursive “word” (the FORTH terminology for a subroutine/function) without stack overflows. I’m leaving the project in its unorganized and disheveled state, but it deserves some proper representation.

FORTH?

Yeah, FORTH. or Forth. or forth. Capitalize it however you want really, but I’m all-capsing it as is tradition. If you don’t know, FORTH is a stack based concatenative programming language. Most programming languages use a stack under the hood to store things like local variables and where to return after a function ends, but FORTH makes manual manipulation of that stack a core part of the language. A typical FORTH program looks something like this:

1 DUP + .


FORTH’s grammar is very simple: just string.split() the source code on spaces and you have your tokens ready for interpretation. This program has 4 steps to it:


  push 1 to the stack.
  DUPlicate the top of the stack. The stack now contains 1, 1
  Add the top two values of the stack and push the result. The stack now contains 2
  Print the top value of the stack to the console. The stack now contains nothing and you see a 2 on screen.


Anything that isn’t a literal value is called a “word”. DUP, +, and . are all words. Think of them like functions or subroutines.

This sort of programming can be pretty confusing, but it’s also kinda fun. It feels a bit like a puzzle game trying to swap around stack entries the most efficient way possible to do whatever you want to get done.

It’s worth stating that FORTH isn’t a singular language, but more of a family of languages, a bit like lisp. FORTH is one of the simplest languages out there from a syntactic standpoint, and is also relatively easy to write an interpreter or a compiler for. The result is that a lot of people write their own FORTHs with little quirks that reflect the desires and personal flair of whoever made them.

My first exposure to FORTH was through the MineCraft mod RedPower 2, written some years ago by eloraam. She implemented an emulator for a modified 6502 CPU instruction set, and then wrote a FORTH interpreter and compiler on top of that. You could then program this thing in FORTH to interact with the world and various machines in all sorts of ways, from simple redstone equations to complicated item sorting algorithms and flying excavator machines. I was in love with this, and I married this love to my obsession with TI83/84 calculators.

I accidentally wrote an Assembler

Before we get to the FORTH compiler, I want to talk about my assembler. It’d be natural to wonder why I even bothered writing an assembler in the first place. After all, I could just make my compiler feed text into someone else’s assembler and call it a day. That was actually my original plan, and the assembler was kind of an accident.

See, the simplest way to write a compiler is to just turn each token into some assembly code, print it all out, and then shove that into an assembler to do something with it, but that makes implementing optimizations more error prone and just plain annoying. It also means that if your assembly representation of your primitives and core library have any invalid assembly in them, you don’t find out until your assembler starts giving you cryptic errors with line numbers that are difficult to correlate with your actual compiler code.

The solution to both of these problems is to build some data types to represent the assembly language, or as a compiler dev will call it, an Abstract Syntax Tree (AST). Granted, assembly rarely gets very tree-like, but it’s a tree nonetheless. This way, you can ensure that you never accidentally typo your way into invalid assembly code without it getting caught by the host compiler (the compiler that’s compiling your compiler).

Once you’re working with an AST though, you eventually need to serialize it to a file. And at that point you might as well generate the machine code for your target CPU to skip the assembly step and OOPS you wrote your own assembler! So that’s what happened to me, and I just decided to roll with it and add labels and a few other features I wanted.

I primarily use Ben Ryves’ Brass Assembler when I’m writing Z80 code and I stole its variable allocation feature to make the compiler implementation simpler. Here, let me explain with some code:

; Tells the assembler to use 768 bytes starting at
; address 4000 (hexadecimal) for static variable allocation.
.varloc 4000h, 768

; Defines snake_x = 4000h as an assembly-time constant and
; removes 4000h/4001h from the allocation pool
.var 2, snake_x

; Defines snake_y = 4002h as an assembly-time constant and
; removes 4002h/4003h from the allocation pool
.var 2, snake_y


Rather than having to manually allocate all of your static variables, you can give Brass a pool of memory and ask it to slice off pieces of that memory for you without having to worry about where exactly they are. In a language like C, we take this sort of thing for granted, but not all assemblers can do this. I love this feature, and it made writing the compiler easier, so I threw it into the assembler.

For my final assembler trick, I added in a Z80 quasi-quoter. My assembler and compiler are both written in Haskell. Haskell’s got a feature called quasi-quoters, which let you write some custom parser logic that takes in a String and dumps out a Haskell syntax tree. Then you can use this to do Compile Time Shenanigans. Here’s an example from the FORTH compiler:

-- wasm stands for "word assembly", Web Assembly wasn't a thing yet :)
wasm "DUP" = rtni [asm|
  pop hl
  push hl
  push hl
|]


rtni is a function that takes in a Z80 assembly syntax tree, but writing out the syntax tree by hand is annoying. Instead I import a function from my assembler called asm. That’s the quasi-quoter! [asm| whatever |] is a “quasi-quotation”, which tells the Haskell compiler to feed all the text in between the two pipe characters to the asm function. Whatever Haskell code comes back out is what gets compiled, type-checked, and all that good stuff. This way I can have my cake of writing bare assembly but I get all the benefits of having my entire standard library syntax checked statically.

My FORTH at the language level

My FORTH was based on a list of standard FORTH words from some old scanned-in PDF, and I don’t remember which one anymore. It uses the traditional two-stack layout with a data stack and a return stack. The data stack is the stack most of the FORTH words use for inputs and outputs, and any time something refers to “the stack” that’s usually what they mean. The return stack stores the return address for word calls, but there’s also some words you can use to shuffle data between this and the data stack. This is pretty common practice, but you’ve got to be very careful when writing a word that the return stack looks the same at the end as it did at the start or you’re going to end up jumping to who knows where and crashing your system.

I changed the syntax around from traditional FORTH word definition. A lot of FORTHs, defining a word looks something like this:

: MY_WORD 2 * ;


: starts the definition, MY_WORD is the name of the word, and then everything up to the ; is the body. I didn’t really like that at the time, and decided what the world needed was more curly braces, so I ended up with this:

WORD MY_WORD {
  2 *
}


I genuinely don’t remember the reason. While I was at it though, I also added inline assembly:

ASMWORD ENABLE_INTERRUPTS {
  ei ; enable interrupts
}


The body of an ASMWORD is passed directly into my assembler without any extra processing. This was my escape hatch to do anything I couldn’t do with my standard library, or write tight assembly loops for graphics. I’ll talk about this a bit more in the implementation details when I explain calling conventions.

Apparently I used THEN as a terminator for if-statements, so they look a bit funky. check this out,

( pops the stack and executes BODY1 if the value is non-zero (TRUE), else BODY2 )
IF 
  ( BODY1 goes here )
ELSE
  ( BODY2 goes here )
THEN


I also mentioned earlier that I invented a way to do infinite recursion; this was actually an excuse to avoid implementing for/while loops for a bit. I added a word called RECURSE which just jumped back to the start of the current word definition. This looks a bit like tail-call optimization if you squint but it’s much simpler. Traditional tail-call optimization has to go through flow analysis to prove that a function is calling itself and then immediately returning, and then the compiler can choose to just jump to the start of the function again. Since FORTH is so loosey goosey with the concept of “calling a word” and “word arguments”, adding a tool to just jump back to the start of a word is totally chill as long as the word leaves the data and return stacks in a sensible state when it finally does return.

So here’s what a loop that counts down to 0 looks like in my FORTH:

WORD MAIN {
  5 PRINT_UNTIL_ZERO
}

WORD PRINT_UNTIL_ZERO {
  DUP . ( print the number )
  1-    ( subtract 1 )
  DUP 0 < IF
    ( value is less than 0 )
    DROP RETURN
  ELSE
    RECURSE
  THEN
}


How the compiler actually works

The parser is pretty boring so I’m going to skip that.

Calling Conventions

Calling conventions- every language has to have them. To keep things fast I use the hardware stack to store data, pointed to by the SP register. Since the return stack isn’t used as much, I use the slower IX index register to keep track of that. This presents a bit of a problem for calling words because I use the real call instruction, and that puts the return address on our data stack. To solve this, the first thing all words do is pop the return address from the data stack and move it over to the return stack for safe keeping. When returning, they load the top value from the return stack into the HL register and do an indirect jump back to the caller.

Inline Words

The calling conventions impose some overhead, so it’s best to avoid calling words unless they’re long enough to warrant it. Most of the core vocabulary gets inlined as long as the implementation isn’t too big. For some of the larger core stuff like multiply I go halfsies on it, inlining the stack pops and pushes before and after calling the main body of the routine. This is important for the optimizations I’ll talk about later.

Everything else gets the full calling overhead. If my compiler was a bit more complex I could avoid this by automatically doing the halfsies approach for words that don’t reach deep into the stack, but that would have required deeper static analysis than I wanted to figure out.

Tree-Shaking

There’s a lot of stuff in the standard library, but most programs don’t use all of it. For inline words this isn’t a problem; the code is inserted inline if you use it and omitted if it’s not. For everything else, I added dependency tracking. If a word calls another word, it declares the called word as a dependency. When it comes time for code-generation, only the stuff the program actually uses gets inserted into the final binary.

Optimizations

Here’s where I take this from “neat” to “fast enough to write game logic”. Throughout the standard library, I stick to the HL and DE registers as much as possible for stack operations and math operands, and this gives me two really obvious optimizations.

First, pushing a register and popping back into it does nothing but waste CPU time so we can remove this pattern entirely.

; Delet this
push hl
pop hl

; This too
push de
pop de


Second, copying a register to another register is faster if you skip the stack:

; Delet this
push hl
pop de

; Use this instead
ld d,h
ld e,l


This makes a lot of our stack operations just cancel each other out. For example, an unoptimized SWAP DROP like this

; SWAP
pop hl
ex (sp),hl
push hl

; DROP
pop hl


can get turned into this optimized code:

pop hl
ex (sp),hl


And if a previous word ended in push hl, the SWAP DROP is suddenly a single ex (sp),hl instruction.

The compiler devs among you are already yelling “peephole optimization” at the screen and yeah, exactly that. The performance gains from this were so big that I didn’t even bother writing other optimizations because I didn’t need them.

Is it practical?

You bet your ass it is! Here’s Snake for the TI84+CSE, implemented in my FORTH. The project is very TI-calculator focused, but could totally be adapted to other Z80 systems. I have a Z80 computer kit lying around so maybe I’ll do that sometime.

Anyhow, thanks for listening to me ramble on about the machinations of my past.


Oxide at Home: Propolis says Hello
2022-03-14T00:00:00+00:00
So Oxide is making some cool stuff huh? Big metal boxes with lots of computer in them. Servers as they should be! Too bad I can’t afford to buy one for myself… but wait, they’re open-sourcing the software they’re writing to do it. Mom said we can have Oxide at Home!

Oxide at Home:



Hardware, with the software haphazardly jammed in

Let’s be clear, I’m not aiming for elegance here. I’m not aiming for enterprise grade either. I want something dirty, something hacky, something that makes you go “what the fuck, why, no???????”.

To that end I’m choosing right at the start to make my life more interesting. Oxide’s software is mostly written for illumos, a direct descendant of OpenSolaris. There’s a handful of illumos distributions out there, but Oxide develops primarily for their distribution called Helios. Their Omicron README (no relation) also mentions OmniOS. Naturally I’m going to use neither of those and make it work on OpenIndiana instead.

You see, I can’t get a copy of Helios right now unless I commit corporate espionage, and OmniOS describes itself as “enterprise”. As I’ve already stated, I am not an enterprise, nor do I plan on becoming one unless Jean-Luc Picard starts taking estrogen and wants to be my captain. Tribblix was also in the running but I couldn’t get the installer to work, so I landed on OpenIndiana.

Anyways, everything I do, I’ll do with the intention of getting it working, not making it good. Expect awful things along the way.

But first, we need to summon Ferris

Oxide, as their name implies, likes to write software in rust. Some of that software wants to use a nightly rust too. Might help to have rustup huh? Well, there’s a couple problems. First, rustup’s install script is not actually as universal as they think it is:

vi@box:~$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sh[455]: local: not found [No such file or directory]
sh[456]: local: not found [No such file or directory]
sh[457]: local: not found [No such file or directory]
sh[458]: local: not found [No such file or directory]
sh[202]: local: not found [No such file or directory]
sh[62]: local: not found [No such file or directory]
sh[65]: local: not found [No such file or directory]
sh: line 72: _ext: parameter not set


Fine, whatever, let’s pipe it to bash then,

vi@box:~$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | bash
ld.so.1: rustup-init: fatal: libgcc_s.so.1: open failed: No such file or directory


Excuse me the fuck? I uh,

vi@box:~$ find / -name 'libgcc_s.so.1'
/usr/sfw/lib/libgcc_s.so.1
/usr/sfw/lib/amd64/libgcc_s.so.1
/usr/pkgsrc/lang/rust/work/rust-1.55.0-x86_64-unknown-illumos/lib/pkgsrc/libgcc_s.so.1
/usr/gcc/7/lib/libgcc_s.so.1
/usr/gcc/7/lib/amd64/libgcc_s.so.1
/usr/gcc/11/lib/amd64/libgcc_s.so.1
/usr/gcc/11/lib/libgcc_s.so.1
/usr/gcc/3.4/lib/libgcc_s.so.1
/usr/gcc/3.4/lib/amd64/libgcc_s.so.1
/usr/gcc/10/lib/libgcc_s.so.1
/usr/gcc/10/lib/amd64/libgcc_s.so.1


What do you want from me, rustup? Well you see, it’s very simple:

vi@box:~$ pkg search file:basename:libgcc_s.so.1
INDEX      ACTION VALUE                                 PACKAGE
basename   file   usr/gcc/8/lib/amd64/libgcc_s.so.1     pkg:/system/library/[email protected]
basename   file   usr/gcc/8/lib/libgcc_s.so.1           pkg:/system/library/[email protected]
[... a bunch of other gcc versions skipped ...]
basename   file   usr/lib/amd64/libgcc_s.so.1           pkg:/system/library/[email protected]
basename   file   usr/lib/libgcc_s.so.1                 pkg:/system/library/[email protected]


We need gcc-4-runtime! Obviously (/s). Oh we also need g++-4-runtime or we get another missing shared library but I’ll spare you the details.

vi@box:~$ sudo pkg install gcc-4-runtime g++-4-runtime
vi@box:~$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | bash
info: downloading installer

Welcome to Rust!


FINALLY. Ok.



What’s a Propolis?

Oxide is making racks of lots of computer. My understanding is that they have a control plane that talks to all the sleds (the blades of computer). Each sled runs a sled agent, and one thing that sled agent can do is start virtual machines. This is where Propolis comes in, as a userspace frontend to the bhyve hypervisor.

This is a great place for us to start because Propolis doesn’t depend on any other services to run. It just sits there and exposes an API to make VMs.

Let’s build it!

vi@box:~/oxide-at-home$ git clone https://github.com/oxidecomputer/propolis
vi@box:~/oxide-at-home$ cd propolis
vi@box:~/oxide-at-home/propolis$ cargo build


This will make target/debug/propolis-cli and target/debug/propolis-server. I copied those over to /usr/local/bin and moved on with my life, just get them on your PATH somehow if you’re following along at home.

Configuring Propolis

Anyway, how do we use this? First we need a config file, and the README provides this helpful example:

bootrom = "/path/to/bootrom/OVMF_CODE.fd"

[block_dev.alpine_iso]
type = "file"
path = "/path/to/alpine-extended-3.12.0-x86_64.iso"

[dev.block0]
driver = "pci-virtio-block"
block_dev = "alpine_iso"
pci-path = "0.4.0"

[dev.net0]
driver = "pci-virtio-viona"
vnic = "vnic_name"
pci-path = "0.5.0"


First question - what the hell is OVMF_CODE.fd? I did a pkg search for it and not a single package has it, but it’s the bootrom used when the VM starts up. Comes from a project called EDK2 I guess? I’m fuzzy on the details, but I followed a trail from the arch linux edk2-ovmf package to this github wiki and eventually this jenkins build artifact folder on the personal website of a qemu dev.

I grabbed the x64 rpm, extracted it a few times with 7zip, and eventually got my hands on OVMF_CODE-pure-efi.fd. This ended up working out so, cool I guess.

EDIT: I have since been informed that the Propolis README has a link to a recommended bootrom. As you’ll soon see, my propensity for not reading READMEs all the way through knows no bounds. I pretty much just copied the example config file out and decided I’d come back to the README if I ran into a problem I couldn’t solve, and unfortunately I’m very good at solving problems. Sorry Oxide folks, thanks for putting up with my bullshit <3.

Next, I downloaded a copy of the alpine-extended iso for 3.15 since that’s the latest right now.

Finally, you see that vnic = line? We need to give it a vnic-type network interface. The README actually explains the correct way to do this but I didn’t bother to read that. I just read the man page and threw stuff at the terminal until it did something useful.

vi@box:~/oxide-at-home$ dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g1     phys      1500   down     --         --
e1000g0     phys      1500   up       --         --
vi@box:~/oxide-at-home/propolis$ sudo dladm create-vnic -l e1000g0 propolis
dladm: invalid link name 'propolis'
vi@box:~/oxide-at-home/propolis$ sudo dladm create-vnic -l e1000g0 e1000g9
vi@box:~/oxide-at-home$ dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g1     phys      1500   down     --         --
e1000g0     phys      1500   up       --         --
e1000g9     vnic      1500   up       --         e1000g0


So this totally breaks naming conventions but I couldn’t figure out what constitutes a “valid link name” from the man page. If I had actually read the README more I would have seen the suggestion of vnic_prop0. You should use that instead! But my config will use my best effort shitpost name instead, since that’s what really happened.

With all that done, my final config file looks a bit like this:

bootrom = "/export/home/vi/oxide-at-home/edk2/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd"

[block_dev.alpine_iso]
type = "file"
path = "/export/home/vi/oxide-at-home/run/alpine-extended-3.15.0-x86_64.iso"

[dev.block0]
driver = "pci-virtio-block"
block_dev = "alpine_iso"
pci-path = "0.4.0"

[dev.net0]
driver = "pci-virtio-viona"
vnic = "e1000g9"
pci-path = "0.5.0"


Using Propolis

With a config file that looked good and all the hubris of a university student on orientation day I started the propolis server.

vi@box:~/oxide-at-home/run$ sudo propolis-server run propolis.toml 127.0.0.1:12400


In another terminal I told propolis to make a VM.

propolis-cli -s 127.0.0.1 new cirno -m 1024 -c 1


Peeking over at the propolis logs, I saw this:

Mar 13 22:36:20.531 INFO Starting server...
Mar 13 22:36:56.204 INFO accepted connection, remote_addr: 127.0.0.1:46363, local_addr: 127.0.0.1:12400
Mar 13 22:36:56.210 INFO request completed, error_message_external: Internal Server Error, error_message_internal: Cannot build instance: No such file or directory (os error 2), response_code: 500, uri: /instances/3915cdd5-3998-4f42-b728-0f8b594afae0, method: PUT, req_id: 0535a501-4467-4f4d-8da5-029e5ed26a20, remote_addr: 127.0.0.1:46363, local_addr: 127.0.0.1:12400


What do you MEAN “No such file or directory”?????



I tried poking at the code but that went nowhere fast. My usual debugging strategy here is to use strace but we’re in illumos land so we need to use dtrace instead, which is like if someone (Bryan Cantrill) decided strace needed awk built in. Now, I’m usually content to just pipe the whole firehose of strace into awk and filter from there but dtrace is actually pretty neat, if a bit confusing at first. And it’s got a pony. Does strace have a pony? I don’t think so.

I wanted to see all openat invocations, so I grabbed the probe id.

vi@box:~/oxide-at-home/run$ sudo dtrace -l | grep openat
22431    fbt    genunix    openat entry
22432    fbt    genunix    openat return


Then I ran propolis-server with dtrace and tried to make another VM.

vi@box:~/oxide-at-home/run$ sudo dtrace -i '22431 { printf("%s", copyinstr(arg1)) }' -c 'propolis-server run /export/home/vi/oxide-at-home/run/propolis.toml 127.0.0.1:12400' 2>&1 | grep -v -e '/proc' -e '/etc/ttysrch' -e /var/adm/utmpx -e '/dev/pts/3'
dtrace: description '22431 ' matched 1 probe
Mar 13 23:15:02.407 INFO Starting server...
Mar 13 23:15:04.980 INFO accepted connection, remote_addr: 127.0.0.1:40237, local_addr: 127.0.0.1:12400
Mar 13 23:15:04.983 INFO request completed, error_message_external: Internal Server Error, error_message_internal: Cannot build instance: No such file or directory (os error 2), response_code: 500, uri: /instances/b745d636-c8b6-46e5-bb08-839af892b702, method: PUT, req_id: 38dc08e3-e2d5-4561-a328-f48984011a8f, remote_addr: 127.0.0.1:40237, local_addr: 127.0.0.1:12400
 13  22431                     openat:entry /var/ld/64/ld.config
 13  22431                     openat:entry /usr/lib/64/libsqlite3.so.0
 [... snip bunch of random dlls ...]
 13  22431                     openat:entry /etc/certs/ca-certificates.crt
  4  22431                     openat:entry /dev/vmmctl


Hmm what’s /dev/vmmctl? Ha. haha. Remember how I said Propolis is a frontend for bhyve? That’s the bhyve control device. Does it exist?

vi@box:~/oxide-at-home/run$ ls /dev/vmmctl
/dev/vmmctl: No such file or directory


No, no of course it doesn’t, because I forgot to install bhyve. Let’s do that shall we?

vi@box:~/oxide-at-home/run$ sudo pkg install system/bhyve bhyve/firmware brand/bhyve system/library/bhyve


Using Propolis, for real this time

I restarted Propolis, and finally, FINALLY, we can create a VM.

$ vi@box:~/oxide-at-home/run$ propolis-cli -s 127.0.0.1 new cirno -m 1024 -c 1


We have to explicitly turn on VMs after they’re created, and then we can interact with them over serial. propolis-cli can give us a serial connection to the VM, but here I ran into a little snag. The alpine image we’re using attaches the console to VGA by default, so I had to attach to serial first in one terminal, start the VM up in the other, then switch back to the serial connection to stop grub from autobooting.

$ vi@box:~/oxide-at-home/run$ propolis-cli -s 127.0.0.1 serial cirno
$ vi@box:~/oxide-at-home/run$ propolis-cli -s 127.0.0.1 state cirno run

[ grub appears on the serial connection ]


Once grub came up I removed quiet from the linux arguments and added console=ttyS0. I hit the button to boot the system, and at long last, I had victory:



This is where I stopped, to prevent my brain from melting.

What’s next?

I’m not sure! I think sled-agent is the logical next step as we work our way from the ground up trying to build out a fully working deployment (for some definitions of “working” and “deployment”) but we’ll see.

Or maybe I’ll build a server rack out of cardboard. You never know.


A brief tale of pkgsrc and illumos
2022-03-07T00:00:00+00:00
Recently we got our hands on some nice DDR3 era hardware, which we’ll use eventually for NAS purposes. It’s got 96 gigs of ECC RAM, two Xeons, the works. For fun we’ve decided to run OpenIndiana on it, an illumos distribution. OpenIndiana has a package manager called the Image Package System (IPS), and the default repositories have basically everything we’d need, but for another layer of excitement we put pkgsrc on here too. pkgsrc is a repository of package build scripts, most of which work on NetBSD, Solaris, illumos, Linux, macOS, and more! Joyent actually provides a binary distribution of pkgsrc for illumos, but on our everlasting journey for increasingly esoteric layers of fun we’re building pkgsrc from source. Don’t worry, it was easier than you think.

Ok so let’s get something out of the way first, if you’re familiar with pkgsrc and just want to know what dependencies to install, here’s the spoilers. Install this stuff:

$ pkg install gcc-11 gnu-binutils c-runtime system/header


Then do your usual pkgsrc bootstrap. For everyone else, here’s the brief tale of how we got to that point.

How we got there

First off we need an actual OpenIndiana installation to work from. We’ll spare you the details since the installer explains itself fairly well, but if you have a lot of RAM you might want to modify the default partition layout. We ended up with a 96GB swap partition which is uhh, excessive, you might say, particularly on a 200GB SSD.

Once we reboot into the system, a system update is in order.

$ pkg update


This updated something upwards of 400 packages for us and took awhile. A coffee break later and we can move on to grabbing a copy of pkgsrc itself.

$ curl https://cdn.netbsd.org/pub/pkgsrc/stable/pkgsrc.tar.xz | xz -d | sudo tar xvof - -C /usr


This downloads a copy of the latest stable snapshot of pkgsrc and extracts it into /usr. Everything in the tar file is in pkgsrc/whatever so we end up with /usr/pkgsrc as our source tree. If you’re wondering why we use xz -d as its own step instead of passing J to tar: for some reason tar on our system gives us this weird error tar: directory checksum error when we try that, so we’re not sure what it’s doing but we don’t think it’s doing it right.

We’ll probably also need a compiler toolchain so let’s install gcc and binutils to get things going.

$ pkg install gcc-11 gnu-binutils


Cool, so do we have everything we need? Let’s find out:

$ cd /usr/pkgsrc/bootstrap
$ ./bootstrap --prefix=/usr/pkg --prefer-pkgsrc yes --make-jobs 16

[... blah blah blah a bunch of build output ...]

ld: fatal: file crt1.o: open failed: No such file or directory


Hmm, that’s no good. We actually have no idea what crt1.o is used for, but whatever the case, we definitely need it. This is a decent opportunity to learn the ropes of IPS’s package searching features. pkg search has a lot of advanced querying functionality, and the man page has some examples of it. Here it demonstrates searching the locally installed packages (-l) for a file named vim (file:basename:vim).

$ pkg search -o path,pkg.name -l file:basename:vim
PATH         PKG.NAME
usr/bin/vim  editor/vim/vim-core


This demonstrates the file:basename: query, and we can use that to search the remote repository for a file named crt1.o

$ pkg search file:basename:crt1.o
INDEX      ACTION VALUE                PACKAGE
basename   file   usr/lib/amd64/crt1.o pkg:/system/library/[email protected]
basename   file   usr/lib/crt1.o       pkg:/system/library/[email protected]
basename   file   usr/lib/amd64/crt1.o pkg:/system/library/[email protected]
basename   file   usr/lib/crt1.o       pkg:/system/library/[email protected]

pkg install c-runtime


Ok let’s give bootstrap another shot to see if that was all we needed.

conftest.c:9:10: fatal error: stdio.h: No such file or directory
    9 | #include 
      |          ^~~~~~~~~


Alright fair enough, so we didn’t have our libc headers. Let’s search for stdio.h this time.

$ pkg search file:basename:stdio.h
INDEX      ACTION VALUE                PACKAGE
basename   file   usr/include/stdio.h  pkg:/system/[email protected]

$ pkg install system/header
$ ./bootstrap --prefix=/usr/pkg --prefer-pkgsrc yes --make-jobs 16 


And what do you know it works! That’s all we need to successfully bootstrap pkgsrc and start building things. After that we succesfully built rust from source, which pulled in a build of cmake, llvm and a few other fan favorites along the way, so it’s safe to say this is a fully functional pkgsrc bootstrap. Thanks for coming along on the journey.

Until next time!


Regret License Version 1.0
2022-02-20T00:00:00+00:00
                         Regret License
                   Version 1.0, February 2022

         THE LICENSE JUST REPEATS: REGRET REGRET REGRET

1. Definitions.

Dear Legal Entity,

"Regret" shall mean
    - We regret using, reproducing, and distributing the Work.
    - We regret all works of authorship.
    - We most definitely regret deploying the Work onto our Kubernetes
      cluster's raggedy ass fleet!

Ooh-rah!

2. Regret Is a Name, Sergeant.



Tailscale on NetBSD - Proof of Concept
2022-02-16T00:00:00+00:00
I’m currently working on porting Tailscale to NetBSD. Actually, I already have the core functionality working (see screenshot below). I don’t have a full idea of what the rest of the port will look like, but there’s plenty of additional features and loose ends that I need to chase down until this moves from proof of concept to something upstreamable. This also relies on adding a NetBSD backend into wireguard-go, which I actually have no idea how to upstream, but I’ll burn that bridge when I get to it. Anyway, I’m gonna talk about what I’ve done so far and what needs to come next.



Things are still rough around the edges so I’m not posting the forks I’m working on just yet, but check back in later because I will absolutely be sharing the code for this once I have something with a more solid foundation.

wireguard-go

So the first part of this puzzle is wireguard-go, which is the official golang implementation of wireguard. Tailscale uses this on operating systems that don’t have a native version of wireguard in the kernel. wireguard-go is written with a modular structure such that most of it is independent of the operating system, and then there’s a single file for each OS that implements the necessary plumbing to get it up and running as a network device. Now, there’s no official NetBSD backend for wireguard-go, but I found this weird fork on the deep web that implements the interface with NetBSD’s tun devices. It hasn’t been updated in a couple years, so I made a couple minor modifications and rebased it on a more recent stable release, and wadya know we’re in business!

I haven’t run this code through an extensive test to make sure it handles any potential edge cases appropriately, but we’ve got a good starting point to work off of. It’d be really cool to get this upstreamed into the main wireguard-go project after a bit more work on it; until that happens, I’m using an override to build against a local fork of the code.

cd tailscale
go mod edit -replace golang.zx2c4.com/wireguard=/path/to/local/fork


At this point I’ve got a local clone of my fork of snow’s fork of wireguard-go and I’ve told go to use it when building tailscaled. So then, let’s do that!

tailscaled

tailscaled is the daemon that holds all the magick to make Tailscale work. It’s got some OS-specific codepaths for network diagnostics and configuring the network stack’s routing tables that we need to address before we can give it a try.

Routing

When an application asks the operating system to send a packet to an IP address, the OS’s network stack checks a list of what IP address ranges are accessible through which network devices to figure out which device the packet should go through. Every OS has a different way to configure this, so tailscale has OS-specific implementations in wgengine/router that make this work.

I must confess to you, dear reader, that I have committed computer crimes, because I didn’t actually write a NetBSD backend for this. Instead I did a bit of the ol’

cp router_openbsd.go router_netbsd.go


And uh, it just worked? It’s not ready to ship, but it was enough to get a connection to another device on the tailnet, so that’s a win! I did also try the router_userspace_bsd backend used by FreeBSD and macOS but that one failed immediately. For now we’re using the OpenBSD one, and I’ll work on changes to it as necessary to iron things out.

And speaking of ironing things out, we’ve already got one candidate problem off the bat from the health check in tailscale status:

# Health check:
#     - router: exit status 1


What’s up with this? Well, if we take a look at tailscaled’s logs, we can find

router: route del failed: [route -q -n del -inet 100.110.144.67/32 -iface 100.113.133.86]: exit status 1
route: botched keyword: del
Usage: route [-dfLnqSsTtv] cmd [[-] args]


So adding routes is working, but cleaning them up afterwards isn’t- why? It’s pretty simple actually. Searching through NetBSD’s man pages, there’s no mention of del as a shorthand for delete:

root@localhost ~/tailscale (main)# man route
   The route utility provides several commands:

   add         Add a route.
   flush       Remove all routes.
   flushall    Remove all routes including the default gateway.
   delete      Delete a specific route.
   change      Change aspects of a route (such as its gateway).
   get         Lookup and display the route for a destination.
   show        Print out the route table similar to "netstat -r" (see
               netstat(1)).
   monitor     Continuously report any changes to the routing information
               base, routing lookup misses, or suspected network
               partitionings.


I double checked the source code and sure enough, delete is a keyword but del isn’t.

The BSDs share a lot of history but there’s often little quirks like this that you’ve got to look out for with the userspace utilities. I’ll need to fix that up to use delete, and check for any other problems while I’m at it.

Port list

The port list is a preview feature of tailscale that you can turn on which shows a view in the admin of all the ports open on your tailscale interface. tailscaled uses the command line tool netstat to collect this information on the other BSDs, and getting this working was just a case of turning it on in the build flags.

diff --git a/portlist/netstat_exec.go b/portlist/netstat_exec.go
index 77972d98..3959d291 100644
--- a/portlist/netstat_exec.go
+++ b/portlist/netstat_exec.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.

-//go:build (windows || freebsd || openbsd || darwin) && !ios
+//go:build (windows || freebsd || openbsd || darwin || netbsd) && !ios
 // +build windows freebsd openbsd darwin
 // +build !ios




What’s next?

With that rough draft of functionality up and running, what else is there to do? I think the first thing is to go through and refine the route control code to fully function on NetBSD. At that point I’ll post my forks of tailscale and snow’s NetBSD wireguard-go backend without worrying that it’s going to do terrible things to peoples’ routing tables.

Longer term, tailscale has a whole host of other features that need to be tested and verified for this to function, but the big wildcard is the tunnel interface implementation. I doubt tailscale wants to merge in a pull request that relies on a third party fork of wireguard-go, but I also have no idea what goes in to acceptance testing and merging in a new backend for wireguard-go, so there’s a lot of unknowns there. If you have any advice, please let me know, I’d love to hear from you.

Either way, even if I don’t get anything upstreamed I’ll still upload my forks with instructions for anyone willing to do a bit of DIY, so stay tuned!


The Following Software License Is Intended For Jim Boonie Only
2022-02-11T00:00:00+00:00
Copyright ©  

The following software license is intended for Jim Boonie ("Jim") only.

This software and associated documentation files ("It") is FREE! SOFTWARE!

    We're giving you code.

It's FREE!

    We're granting you permission to deal in It without restriction.

It's software, free.

    It's some free code for you, Jim!

This is _free_ software.

    Well it's PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, but the code is
    free!

Two branches, no tests. It's free!

    You git clone the repo to your free code we _furnished_ you the software!

It's a two branches repo it's free its got a vuln in the baack

    I'm not being held LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY,
    WHETHER IN AN ACTION OF CONTRACT, TORT, OR OTHERWISE, ARISING FROM, OUT OF
    OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
    SOFTWARE all day its YOUR code!

Free. Software ill pee my pants.

    Jim come get your damn libra-
ITS SOME FREE CODE!!!!!!
    Jim I got software
Jim does it get better than this?
    Jim-
THE CODE IS FREE
    Jim-
THE CODE IS FREE
    It's free $%#@ing code.

...

its free software



GNU::Free Distributions Increase Agency, Actually
2022-02-09T00:00:00+00:00
My good friend Xe wrote a post complaining about fully GNU::Free distributions, and I disagree with some things xe says in it. Xe challenged me to a 1v1 fite irl fox only final destination no items but xe can’t meet up with me in person and nintendo’s netcode is garbage, so I’m writing a response blog post instead. I think that the existence of fully GNU::Free  distributions is important for increasing people’s agency in the broader sense. On the other hand, I agree with Xe that some of the communities around these projects have problems.

Please note, I’ll use “free” and “libre” as shorthand for GNU::Free in this post. I won’t be making a distinction between “free” and “libre”, but I’ll use both to mitigate semantic satiation. Some people consider these words to have different meanings; I ask you to suspend your concerns for this post.

Also, I want to be clear that I think GNU and FSF have a lot of bad takes, especially regarding the actual nuance of the FSF’s device firmware beliefs, and I’m not endorsing them as organizations. We’re using GNU’s definition of “free” here because it’s what Xe was talking about originally, and it’s used as a reference definition around a lot of the free software community.

Ok so what’s all this about? Well if you don’t want to read Xe’s post, xer thesis is that distributions which use de-blobbed kernels and only package libre software limit the user’s agency, because the user can’t use proprietary software they might want to use to get their hardware working or to run software they want to run.

If I was forced into using those distributions, I would agree with that. But here’s the cool part, I’m not. If I want to use software which is proprietary (I do), I can just choose to use a different distribution. I disagree with a lot of GNU’s choices and opinions, and I often avoid projects that follow their doctrines. That’s agency!

On the other hand, consider someone who finds GNU’s stringent requirements for a distribution to align with their own desires. They really care about what licenses the software they run is released under. They will not compromise on this. In this case, a distribution that allows proprietary software actually decreases their agency. Let’s step into that user’s shoes:



Hi, I'm a Libre Enthusiast Strawperson Argument, but you can call me Leina. I'm diametrically opposed to using any sort of proprietary software on my system, and this is invariant. What are my options?

I have a few actually! I could use Debian, with the non-free repos disabled, and this gets me quite far. Xe believes that distributions should have "an escape hatch into a less pure environment" if they want to use non-free software on a distribution that offers a free-software only option. I don't think that's necessary, but, Debian has it anyway.

On the other hand, I could use something like Parabola, an arch-derivative with only libre software in the repos, or Guix, which also has some cool technical advantages with its declarative system configuration. There's a whole list of other options GNU thinks are cool. These don't have escape hatches, is that a problem?

Well no, it's actually an advantage. With no escape hatch, I know that no matter what I do on my system, I'm not going to accidentally turn on the non-free repos. What's more, my distribution maintainers are designing with this in mind, so they'll be more likely to actively look for free alternatives to proprietary software that other distributions might handwave on account of already providing a non-free option. All their documentation will be written with the assumption that using non-free tools is something I never want to do. If I wanted that escape hatch, I could install a different distribution.



Thanks Leina, I’ll take it from here. I think that the existence of a hardline free-software-only distribution is a useful thing and I support them existing. Leina also mentioned that they could use Debian if they wanted free software now with an escape hatch later, but GNU thinks Debian’s approach is not ok. To quote their page, Explaining Why We Don’t Endorse Other Systems:



"Debian's Social Contract states the goal of making Debian entirely free software, and Debian conscientiously keeps nonfree software out of the official Debian system. However, Debian also maintains a repository of nonfree software. According to the project, this software is “not part of the Debian system,” but the repository is hosted on many of the project's main servers, and people can readily find these nonfree packages by browsing Debian's online package database and its wiki.

There is also a “contrib” repository; its packages are free, but some of them exist to load separately distributed proprietary programs. This too is not thoroughly separated from the main Debian distribution.

Debian is the only common non-endorsed distribution to keep nonfree blobs out of its main distribution. However, the problem partly remains. The nonfree firmware files live in Debian's nonfree repository, which is referenced in the documentation on debian.org, and the installer in some cases recommends them for the peripherals on the machine.

In addition, some of the free programs that are officially part of Debian invite the user to install some nonfree programs. Specifically, the Debian versions of Firefox and Chromium suggest nonfree plug-ins to install into them.

Debian's wiki also includes pages about installing nonfree firmware."



End Quote.

Some of this is kind of nit-picky but I get what they’re going for here, and it’s something we discussed with Leina earlier; Debian is designed around the existence of the non-free repos, so it’s going to creep into the documentation, installer, and infrastructure, and so sooner or later there’s a decent chance you’ll need it during normal use of the system for lack of the maintainers exploring other options for you.

The problem begins when the passion for free software leads to viewing users of non-free software as immoral, committing acts of sin. Some free software extremists see using proprietary software as a state of damnation that they’ve been given a holy command to save people from, whether those people want to be “saved” or not. In this mindset, the mere mention that perhaps someone might want to use proprietary software is seen as a moral failure, worthy of excommunication. This builds a very isolationist community that’s hostile to outsiders that might want to learn from and contribute to these libre distribution projects, but don’t care for it becoming core to their identity.

This sort of thing isn’t unique to GNU, or libre distributions really. You see the same dynamic play out with ubuntu users that mock or shun windows users, arch users that mock or shun ubuntu users, and so on.

The question is, is this the mindset of the majority? Is it the mindset of the leadership? And, are leadership willing to break the toxic cycles down when they see them and substitute a more welcoming and conversational tone. I don’t spend time in these communities, so I simply don’t know. There’s a good chance this stuff does come from the top, but I try to avoid exposure to GNU/FSF leadership as much as possible so I’m blissfully unaware either way.

Free as in GNU’s Opinions distributions are not the problem themselves. Their existence increases the agency of people using linux in general, and provide useful ecosystems for developing more software in line with their principles. But that can only be the case when using them is a choice, and the alternatives are not forbidden from discussion simply for being non-free.


Discord Holds the Keys to Your Heart
2022-01-30T00:00:00+00:00
A friend of ours has been struggling with RSI. As part of their plans to address this, they decided to create a secondary Discord account to use on their phone, for direct messaging purposes. That way, they could still talk to people from their phone, but wouldn’t be tempted to participate in active discussions in group channels. As part of the signup process, Discord forced them to provide a phone number for account verification. They provided the same phone number they used for their primary Discord account, and in response, Discord locked both accounts, cutting them off from a huge part of their social circle. Discord’s support was unable to unlock the account or provide any information on when it might be unlocked.

Now, we’ve since heard that this was likely a technical error and not intended to function this way, but that’s hardly a consolation. There’s plenty of reasons beyond this that someone can get locked out or banned from a centralized social platform; reasons including using an unapproved client, joking about your age, making an account before age 13 but telling the platform your age after you’re an adult, tweeting an image from wikipedia, getting report-brigaded, breaching terms of service on a technicality, talking about something which is illegal in the country the company operates out of, connecting in from a VPN or TOR, really the list goes on. Cadence has a great post about all the ways Discord in particular is just an awful platform if you want more of that, but this problem pervades all modern social media.

The problem is that these lockouts tend to happen immediately, with no warning, no good means of recourse, and no way for the affected person to pick up the pieces. For many people, they only have one avenue of talking to someone. Even we’re prone to this, despite our efforts to maintain multiple ways of talking to as many people as possible. When you’re immediately cut off, you don’t even get a chance to tell people where they should go to talk to you. You’ve got to hope that you can find some way to get a message to them through a friend of a friend, or that they take a guess at what happened and go looking themselves.

But this cuts people off from more than just individuals. It cuts them off from entire communities. A lot of scenes have largely migrated away from forums, wikis, and self-hosted or IRC chat rooms, to exclusively using Discord. Losing access to your account means you lose access to ALL of these at once, largely excluding you from being able to participate in them. The increasing number of measures intended to prevent spam and abuse can make creating a new account to evade the ban extremely difficult, unless you’re in a position to easily get ahold of a new phone number that they’ll accept for a new account verification. We are social creatures, and the trauma of that loss cuts deep.

This isn’t exclusive to Discord. The same story has played out over and over on basically every centralized social media platform out there, and it will keep doing so. There is no incentive for this to stop.

A lot of the propaganda around decentralized chat and social media talks about the privacy benefits of using them. Yeah, those benefit are pretty cool. We think though that the resistance to a third party heartlessly excommunicating you is far more important. A platform can be a constellation of federated shards run by real independent people you can actually have a conversation with. In this reality, suspensions can be discussed and errors can be corrected. If someone has to leave they can be given the opportunity to provide a pointer to where to find them. Even when amicable communications fail, you can find another shard that will let you make it your home, and you can pick up where you left off with those you care about. You may be cut off from part of the network, but you will never be mercilessly cut off from the whole thing with the flip of a switch.

To be clear, there are user interface and user experience problems with most decentralized platforms. The work to improve that gap is ongoing, but not the point of discussion for this post.

In a time where so much of someone’s interaction with the world is through the internet, it’s become increasingly clear that companies do not wield their immense power of their users’ lives responsibly. Using decentralized communication tools is a reclamation of the keys to your heart from the corporations that quietly took them without you even noticing.

But I want to be clear about something. Using decentralized systems is a solution that doesn’t rapidly scale. Network effect is HUGE, and those communities entrenched in closed platforms aren’t all going to just up and leave. It would be ideal if we could hold these platforms to some standard of responsibility at a regulatory level with regards to their users’ wellbeing, but we’re not confident in that actually happening. Still, if you’re willing to fight that fight, please do.

As someone who tends to a community, one of your best options is to reduce barriers to people who can’t use the platform you’ve chosen. If you’re on Discord, setting up Discord to IRC, Discord to Telegram, or Discord to Matrix bridges is a great start. Avoiding toxic platforms like Fandom in favor of running your own wiki is also fantastic. Redundancy is the name of the game here, and the more options you can provide to people the better.

These are things you can do from the top of a community to make it more available to everyone, and you can find options like this that apply no matter what platform you’ve settled on. They’re also hard and require technology knowledge you might not have - lean on your friends and ask if they can help. There is no easy way to escape this hole that’s been dug for us, but we can certainly try, and maybe end up somewhere a bit better than where we started.

Addendum: What can you use instead if you’re willing to leave?

Here’s some technologies and tools that foster self-hosted or decentralized communication. Not all of them are federated. We don’t use all of them ourselves. An appearance on this list is not an endorsement of the team behind it, as we simply don’t know enough about them all.

None of these are perfect. Most of them are very far from it. We hope they continue to improve and grow. Taking back our right to communicate with each other is far too important to give up on.

What does ‘federated’ mean?

Basically, a lot of people run different servers, but all the servers talk to each other. Say two of your friends ran two different minecraft servers. Imagine if you could log in to one, chat with people on the other one, and even walk through a portal to play on the second server all while your game is connected to the first server. It’s a little bit like that.

When using something like Mastodon, you can talk to and see posts from anyone across the federated network. But, if any of those servers go down, it doesn’t matter too much to you because your server still works fine. If your server goes down, you can go create an account on a different one. If a server gains a reputation for providing a haven to spammers or abusers, your admins can block it so you don’t have to see them anymore. If you don’t like your admins’ decisions, you can find new admins. It’s neat!

The List

Is your favorite tool missing? Consider emailing us about it!


  matrix - a federated chat network, supports text chat, voice calls, video calls.
    
      on mobile try out the fluffy chat app, its a lot nicer to use than the element app in our experience.
    
  
  mastodon - like twitter but federated.
  forums, just, in general - hey remember forums? you can still use them!
  mediawiki - run your own wiki, because the absolute hellsite known as Fandom is never something you want to interact with.
  mumble - open source voice chat. Works like discord voice channels. there’s server providers you can pay for but you can also run your own for your friends if you’ve got the knowhow. User experience is not very modern by today’s standards.
    
      teamspeak is also reasonable if you want something with a bit more modern feel, or it was when we last used it years ago. its not open source, but unless things have changed you can at least host your own server for free for up to 32 concurrent users.
    
  
  gitea - self hostable git frontend, its like github.
  pixelfed - basically a federated instagram clone.
  peertube - on the surface it looks like federated youtube, but its also got some cool peer to peer file transfer tech to reduce the bandwidth load on servers.
  etherpad - its like google docs! real time collaborative editing. there’s a list of etherpad servers on their github.
  hedgedoc - Another real time collaborative document editor, this one uses markdown. Anecdotally we’re told its faster than etherpad.
  revolt - basically an open source and self hostable discord clone, and a pretty comprehensive one in terms of features. However the self-hosting support is mostly non-existent currently and we’re told the app can’t work with self-hosted instances, so your mileage may vary.



New Twitter - @EverfreeArtemis
2022-01-14T00:00:00+00:00
Hey there! Our old account (@artemiseverfree) got banned because we added a birthdate to it. The time between the account’s creation date and the account’s birth date was less than 13 years, so twitter decided to retroactively ban us. We’re not minors, but sure cool whatever. This post is confirmation that our new account, @everfreeartemis, is in fact controlled by us. Thanks!


Life at 800MHz
2022-01-12T00:00:00+00:00
We’ve been using a Sony Vaio VGN-P588E for the past few months as our primary personal laptop. This thing’s great; it’s got a small but not uncomfortable keyboard. It’s got a trackpoint, which we absolutely need to keep our hands healthy. Crucially, it’s only 1.5 pounds. We’re disabled in a way that means we’ve got to care about every bit of weight we add to our bag when we leave home, so that’s a big deal! One catch: the Intel Atom inside hits a peak speed of 1.33GHz, with a normal speed of 800MHz under most thermal conditions. Oh yeah and there’s only 2GB of DDR2, GPU drivers don’t work in Linux, and it’s a 32 bit processor too did I say one catch I meant four. Let’s talk about life in the slow lane.



So this thing’s main job is to help us stay off our phone, since touch screens are the hardest on the health of our hands. To do that, it’s got to be able to handle chatting with people, email, media, and light web browsing. Does it work? Actually, yes! Sometimes we have to exercise a bit of patient, but overall it handles anything we need it to handle day to day. Communications, logistics, youtube, shopping, social media, it all works! Social media is probably the worst experience out of everything, but that’s not such a bad thing since it keeps us off of it a lot of the time. We’ve even been doing some light editing work in Audacity on an album we’ve got in the works.

The modern web is definitely rather hostile to a computer this slow though, and our experiences online involve a lot of loading time. Informational sites are usually fine as long as javascript is off. (Shoutout to lib.rs btw for offering a rust crate database that actually works without javascript. We have no idea what crates.io thinks it makes sense to require javascript to look up packages but here we are.) Interactive sites require some patience but are usually fine, and nothing ever outright crashes. Social media sites are the only things that dip into the realm of “genuinely unusable” on the regular, and streaming services are basically entirely out of the question.



It’s hard to convey how much of an anti-problem this ends up being. This laptop feel a bit cozier, sort of a reprieve from the mainstream flow of the centralization of socialization. It’d be a bit frustrating if we didn’t have our phone as a fallback option for when we truly need something that doesn’t work, but really while we’re out and about, most of the things that don’t work well on here are things we don’t need to be using anyhow.

We’re most amused by the unusability of streaming services, because this thing can actually handle music and video playback just fine from local files (check out this winamp skin in Audacious!). People have just shoveled so much overhead on top to monetize it more effectively that genuinely we could not even pay to watch movies or TV on here if we wanted to, so piracy is the only option for that stuff.

As for games, obviously this thing isn’t exactly looking at a career in modern gaming, but it can handle GameBoy emulation, OpenTTD runs at a smooth 60 fps, and Sonic Robo Blast 2 runs at mostly full speed! Check this out:




Your browser doesn't support HTML5 video tag, or you have it disabled.


Running on 32-bit x86 is a bit odd these days because it feels like there’s less support for this than 64-bit arm at this point. There’s a fair bit of software that we’d have to compile from source to get working on here, since they don’t provide 32-bit binaries. And there’s plenty more that simply can’t work on this architecture. We’re glad rust is keeping the torch alive though with 32-bit support, so at least we aren’t out of the game on that one.

Overall, this thing fucking rules and does everything we need it to do. We’ve had to get creative, but we have yet to be defeated.

The Details

Surprising probably nobody we’re running Linux on here, specifically we’re using antiX which is basically Debian with a cool live disk, a bunch of custom apps that work well on low end hardware, and a nice batteries-included set of apps and tools preinstalled. Option are a bit limited in 32-bit x86 land, but we could’ve also used Void Linux if we wanted a more arch-ish experience. Technically Gentoo is also in the running, but can you imagine trying to compile all your packages from scratch on a system that benchmarks worse than a raspberry pi 3?

Comms

Anyway, with that as a starting point, let’s get an easy one out of the way: email. We’re using claws, a tried and true mainstay of the GUI email client world on linux. There’s nothing remarkable going on here, it’s email!

Moving on to chat programs, things get spicier. We’re mainly using Discord and Matrix. For Matrix we like nheko-reborn, but they don’t have 32-bit builds of the latest version on their releases page and the older version in the debian repos is missing a lot, so we’re using weechat with the weechat-matrix plugin. It’s not the smoothest user experience especially when it comes to encryption but it does the trick.

Discord on the other hand has a big problem: using a third party client is bannable, and the first party client is a heavy web app. With chromium and some heckery (more on that later), it’s possible to get it running at a usable speed, but we often cheat and run it on our little arm server instead. We connect in with VNC and this basically works fine even if we’re out and about. There’s no good way to get voice working with this though, it’s just too slow, so the phone still handles that one.

Browsing

For general web browsing we use palemoon, primarily because the UI is more customizable than modern Firefox. We’re running at 1000x480 resolution so every bit of vertical screen space counts, and palemoon has themes that go extremely compact on UI size. It also starts up a bit faster, so that’s nice. We use the palemoon fork of uMatrix as a script blocker, which improves security and keeps stuff loading faster, but lets us turn on JS when we really need it.

We also use netsurf quite a bit for information searches since it starts up nearly instantly, but a lot of sites break in that so we can’t use it exclusively.

For web searches, we use the lite version of duckduckgo.

Web Apps (are the bane of my existence)

Now let’s talk about The Chromium Apps. We don’t use chromium for general browsing, but we use it for web apps and anything javascript heavy. It’s got this neat trick where you can run it with --app= to open a web page without any of the browser UI so it looks a bit like running the page as an electron app. This is how we do twitter, mastodon, the youtube mobile interface (which can’t play video well but is VERY snappy for searching, good job whoever works on that), and a few others. We also use chromium without --app for stuff like opening medical portals, bank portals, and online shopping.

We speed things up a lot by keeping chromium’s storage loaded in ram with a tool we wrote called mnestic to handle syncing back to disk. This makes a pretty big difference and is the only thing that makes Discord even close to usable locally, because the storage on this thing, while it’s an SSD, it ain’t exactly high throughput. We also turn smooth scrolling off with a command line flag --disable-smooth-scrolling.

Honorable mention to pinafore though, a mastodon frontend that runs incredibly fast even on this machine. More webapps should be like pinafore. And while we’re at it, a dishonorable mention to twitter for being slower than Discord, we wish we were making that up.

Video (works surprisingly well?)

We were sure this thing wouldn’t be able to do any decent video playback, but we can do 360p 30fps video playback without frame drops and we’re here to tell you our secrets.

In short:

  use mpv
  use hardware upscaling by changing the screen resolution to 360p
  download videos fully before watching them instead of streaming them
  use this special set of mpv flags to make it fast:


mpv --vo=x11 --vd-lavc-fast --video-unscaled=yes --demuxer-thread=no 


We can also get away with mpv handling upscaling if we add a couple more flags into the mix, but then we’re upscaling twice and it looks worse:

--sws-fast --sws-scaler=fast-bilinear


And, since we don’t have vsync, screen tearing can be a problem. To make it less noticeable we play videos slightly faster, and this keeps the screen refresh rate far enough away from being an even multiple of the video framerate that the tearline quickly moves over the screen and isn’t very distracting:

--speed=1.005 --audio-pitch-correction=no


Other shoutouts

Here’s some other software we want to give some quick shoutouts to that we like to use on this thing.


  mtPaint, an image editor that even works well on a 133MHz pentium let alone this thing.
  IceWM, a great window manager with good out of the box defaults, themes, and a simple configuration method.
  XMMS and Audacious, for supporting my winamp themes.
  ROX Filer file manager, its weird and we like it.
  NO$GBA, a weird GBA emulator that goes fast



Git Reflector Script
2021-09-29T00:00:00+00:00
Here’s a small script to reflect a git repo from some source like GitHub to some destination like an internal gitea mirror. That’s my usecase. You could also use this to go the other direction, or whatever you like.

It’s designed to be run periodically with something like cron. If anything goes wrong during pull/push (for example, upstream rewrites history), that repo sync will fail but others will continue. Modify this to watch for errors if you like or run it manually on occasion to check for errors yourself.

Intermediary copies of repos are stored locally in $PWD/repos/. The repos/ folder is created automatically. Run the script from whatever working directory you want to be the parent of repos/, or change the script to put them somewhere else.

Repo list is in format:

local/repo/name    source_url    dest_url


See script repos= line for examples. Change this to set up your own repo list. You must use spaces as the delimiter, or change the script to use a different delimiter by changing IFS=. For example, to use tabs, set IFS=$'\t'.

Here’s the script:

#!/usr/bin/env sh

# FORMAT:
# local/path   source_url   dest_url
#
# repos are cloned from source_url into folder 
#   local/path must not contain spaces because i dont want to deal with that.
#   If the folder doesn't exist, its a clone
#   If the folder exists, its an incremental pull
#
# after that, push the repo to dest_url.
#   If there's an error (like source overwrote history with --force) then this
#   doesn't happen. Resolve errors manually.

repos='linux         https://github.com/torvalds/linux.git       [email protected]:mirrors/linux.git
faithanalog/rtmouse  https://github.com/faithanalog/rtmouse.git  [email protected]:mirrors/faithanalog-rtmouse.git
xf86-video-amdgpu    https://gitlab.freedesktop.org/xorg/driver/xf86-video-amdgpu.git  [email protected]:mirrors/xf86-video-amdgpu.git
'

printf '%s' "$repos" | while IFS=' ' read -r path source_url dest_url; do
    # work in repos subdir so we can easily gitignore it
    path="./repos/$path"

    # make parent dir of repo
    mkdir -p "$(dirname "$path")"

    # clone if it doesnt exist already
    if ! [ -d "$path" ]; then
        echo "cloning $source_url into $path"
        git clone --mirror "$source_url" "$path"
    fi

    (
        # enter repo
        cd "$path"
        
        # update all branches from source
        # docs for `git fetch -t` say it fetches tags in addition to a normal
        # fetch. in our testing this was not fully the case, and `git fetch -t`
        # did not fetch new branches.
        echo "pulling updates from $source_url"
        git fetch "$source_url"
        git fetch -t "$source_url"

        # push branches to dest
        echo "pushing updates to $dest_url"
        git push --all "$dest_url"
        git push --tags "$dest_url"
    )
done



Wowstick Electric Screwdriver Review - It Saves My Wrists
2021-09-25T00:00:00+00:00
About 4 months ago we picked up the “wowstick” electric screwdriver. It’s a lightweight handheld screwdriver with a rechargeable lithium-ion battery. It’s got some limitations worth knowing about, but as someone prone to being injured by manual screwdrivers it has made a huge difference to my ability to work on hardware projects.

What is it?

Before I get started it’s worth mentioning that there’s a lot of variants of this product online. I don’t have a good understanding of the differences between them. So, for what it’s worth, I have the Wowstick 1F+.



As I said, the wowstick is a small handheld electric screwdriver. It’s mainly targeted at working with electronic devices. As a result, it’s got pretty low torque, and you’re definitely going to break it if you try and use it for assembling furniture or something. On the other hand, since it’s got low torque, I don’t have to worry about overtorquing screws in my electronics hardware, and it’s harder to strip a screw with it.

There’s no power settings. You’ve got a button to make it turn clockwise and a button to make it turn anti-clockwise. There’s also some white LEDs at the end that shine parallel with the screw bit while its rotating, which I find kinda gimmicky but it could be useful.

It charges with micro-usb, when you need to charge it at all. I’ve only had to recharge it twice since I got it 4 month ago, and I’ve been using it about one or two times a week on average so that’s not bad!

It comes with its own set of screw bits. They’re the same formfactor as iFixit bits, which I guess is kinda the defacto standard these days for electronics repair kits, so you’re free to use bits you already have instead.

What I use it for

Here’s a few projects this thing has helped me with:


  Building a desktop PC.
  Full disassembly and reassembly of my Thinkpad x220.
  Building a custom mechanical keyboard.
  Screwing hard-drives into drive trays while working on a server rack.
  Taking apart a Switch Joycon to fix the Joycon drift.
  Taking a steam controller apart to sand down the trackpads.


And that’s just what I can remember. This screwdriver has saved my hands from so much wear and tear, and as someone with a body that is chronically prone to repetitive stress injuries, that’s a big deal.

Limitations

So what can’t it do? This thing is great if you work within its limits, but you gotta know what those limits are:


  Not suitable for use on furniture or anything else requiring high torque.
  No power adjustment settings. You have one power level.
  The axis of rotation isn’t perfectly aligned with the center of the bit.
    
      In practice, this hasn’t been a major issue. I do have to do one or two turns manually sometimes to start screws going into holes on a flat surface (like a hard drive tray).
    
  
  If the gearbox breaks, I don’t know how to repair it.
    
      I saw a youtube comment claim that you can find replacement gearboxes on AliExpress but I wouldn’t know what to search for.
    
  
  No internal magnet.
    
      I often use my magnetized iFixit screwdriver to start screws in difficult places, then switch to the wowstick.
    
  
  Depending on your budget it’s pretty expensive, at $50 USD on Amazon at the time of writing.


Where to buy it

There doesn’t seem to be a particular brand selling this to the western market, so really just look up “wowstick” on AliExpress or Amazon, pick your poison. There’s a lot of sellers on both; I bought mine from this Amazon listing in particular, sold by “Ruputas US”, so I guess that might be a safe bet, but your mileage may vary.

You’ll notice that this listing is actually a kit that comes with a set of bits, a mat, and a couple other odds and ends. I haven’t been able to find the wowstick sold individually without a kit, though I did try. That’s great if you actually need any of the stuff in the kit, but kinda wasteful if you don’t; maybe see if any friends would like the stuff in the kit if you don’t want it.

In conclusion,

It does the thing. If you suffer from repetitive stress injuries, or just want something to make working on electronics hardware less frustrating, I’d definitely consider getting it. For what it is, it’s priced pretty well, but it’s definitely expensive enough to be outside a lot of peoples’ budgets. For us personally, it’s been worth it.


A Story of Microsuites, and Atrophy
2021-09-21T00:00:00+00:00
Let me give you a view into the hellworld of “microsuites”. This shit is becoming more and more prevalent and it’s so incredibly cursed. Picture this: You come home to your apartment building. You walk up five flights of stairs to the top-floor of your building with no elevator. On the way you pass the communal kitchen on the second floor. You walk in your front door. You’ve got like 200 square feet of living space: there’s a bathroom with a shower, a sink, a mini-fridge, a microwave. The place came with internet, you didn’t have to pay for that. You’ve got a ladder up to a loft with your pre-furnished full-mattress bed. Before bed it’s time to cook up a nice pot of- wait, aren’t we missing something?

Yeah you remember the communal kitchen? That’s where the stove is. The only stove. Pretty great for the person that lives on floor two, but we hope you like carrying ingredients downstairs to cook if you’re up on floor five. Oh, and your pots and pans too. The kitchen had those pre-furnished once, until the pandemic. At least it’s cleaned weekly by a cleaning crew.

Oh well, at least there’s a microwave, and you bought your own toaster oven too! A mini-pizza it is then. You set the toaster oven to pre-heat and oh good the power’s out. You see, your microsuite only has two 15amp circuits. One of them supplies the microwave, fridge, and sink outlet. The other one powers the entire rest of the unit. And, unfortunately for you, there’s no space on sink-counter for a toaster oven, so guess which one that’s plugged into? Maybe remember to turn off your air-conditioner next time before you make a pizza, fucko.

No big deal though right? Just flip the breaker! Yeah, guess where that is? All the way at the bottom of the five flights of stairs. Flip-flip, power back on, all the way back up. You put on a cooking video and dream of a world where you too could cook something, anything really, on a stove. The induction burner you bought sits in a bin beside you. You gave up on that idea long ago, with no space to prepare ingredients or store cookware. No stove vent either, and the fire alarm can be a bit temperamental.

After dinner you climb up the ladder to your barely-serviceable bed. Your joints ache. You wonder why your past self decided this was a good place to stay- ah yes, budget. It’s certainly cheaper than the normal studios across the street. Location, location, location, right? Surely it’s worth it. You spend most of your time out and about on the town after all! Well, perhaps some version of you did. Not these days.

…

We lived there for three long years, ignoring our body’s complaints, convincing ourselves we could make it work if we were just a bit more creative, just a bit more tenacious. We aren’t sure why. Our body finally gave up on waiting for us to make the right decision and made things quite clear: move or perish. It wouldn’t have even been a choice, if not for having friends to help.

Take care of yourselves.


Puppy Linux: ROX File Manager Basics
2021-09-12T00:00:00+00:00
This is the first of what I hope will be a series of practical guides for using Puppy Linux. I’d also like to contribute these to the wiki later but I don’t want to learn how to do that right now. Anyway, this is a short post explaining some things about the ROX File Manager, the default file manager in Puppy. Everything here is also in the full documentation, but digging into that can be daunting when you’re starting out.

So, clicking on something opens it instead of selecting it.

Yup! In most file managers, you single-click to select a file and double-click to open it. In ROX, you single-click to open a file, and double-clicking just isn’t a thing you do (with a couple exceptions, search for “double-click” in the manual to learn more). Personally I like this better because double-clicking is hard for me, but this confused me a lot when I first installed Puppy.

How do I select a file in ROX?

There’s a few ways. Note that I use click as shorthand for left-click in this guide.


  Click-drag draws a selection box, like many file managers.
  Control-click a file to select it instead of opening it.
    
      Hold control and click additional files to select multiple files.
    
  
  Press the arrow keys to bring up a keyboard-driven cursor. Move the cursor over a file and press spacebar to toggle whether it is selected.
  Right-click and go to the Select menu to select by name or other search criteria.
    
      You can also skip the menu and bring up select-by-name by pressing ., and the conditional selection box by pressing ?.
    
  


By the way, you can also press Ctrl- to save a selection, or just save the current directory if no files are selected. Pressing  will bring you back to that directory and restore the selection. It’s kinda like control-groups in RTS games like Starcraft or Age of Empires, but for your file manager. I never use this, but it’s neat!

Some other neat things


  Shift-click has special actions for various file types.
    
      Shift-click a file to open it as a text file instead of whatever the default open action is.
      Shift-click a symlink to go wherever the symlink goes instead of opening it.
      Shift-click a mounted directory to unmount it.
    
  
  Press / to open up the path-entry box. You can type in a file path to go directly to it.
  Press ! to open up the shell command box. Type a command and hit enter, and ROX will run the command in that directory. Click on a file while this is open to insert the name of the file into the command box.


More information

There’s a lot more to ROX than I covered here, I just wanted to explain the things I found confusing when I first started using it. To get to the full documentation, click the Help Button and open Manual.html in the folder it takes you to.

Good luck!


Nix RFC 0098: Community Team
2021-08-14T00:00:00+00:00
Nix RFC 0098 was submitted to the Nix community recently. In its own words at the time of this post, it aims to “establish an official community team to model social norms, mediate interpersonal problems, and make moderation decisions”.

The full text can be found here: https://github.com/IreneKnapp/rfcs/blob/community/rfcs/0098-community-team.md

We had no part in the creation or publication of this document. We believe it speaks for itself, so this post will not restate its position, and we will keep our editorialization brief:

We support this effort and this document. We believe acceptance of this RFC would be a very important step for the Nix community, and would provide a good precedent for other communities like it.

We believe this document presents an approach to mediation and moderation that is sustainable and empathetic. It is aware of the limitations of people and the inherent messyness of communication and collaboration. It provides guiding principles, rather than prescriptive rules that attempt to foresee all futures and preempt them. It stands for a future we want to see realized, across communities beyond just Nix.

To anyone involved in the Nix community, we urge you to read this document to the extent you are able, join the conversation, and support it if you believe in it too.

To those uninvolved, we believe it is worth reading regardless, for the principles it represents.


Linux Mixtapes
2021-08-11T00:00:00+00:00
I use Puppy Linux on my laptop. Everyone I mention this to is like “Oh yeah Puppy, I used that on a flash drive that one time at school” or something to that extent. That’s kinda where I was when I tried it too; I didn’t want to buy a boot drive for my thinkpad when I had a slim USB stick I could just plug in with 128 gigs of storage, and I had heard puppy was fast for that.  I wasn’t wrong either, this thing runs way better than it has any right to given the speed of this storage device. But what’s even cooler is how easy it is to remix a puppy linux distribution into your own custom puppy that you can share with others.

But first, a moment from our- oh wait no they aren’t sponsoring me.

Before I get into the weird complex stuff Puppy can do, I want to direct your attention somewhere that the mixtape culture is already alive and well in the linux world: Raspberry Pi. Given raspi’s pervasiveness, and their dedication to keeping Raspberry OS backwards compatible with basically all of their hardware, its really common for raspi software to get distributed as a .img disk image. Don’t worry about installation, configuration, all that stuff, unless you really want to. If you want a plug and play experience with kodi, plex, emulators, ad-blocking, or anything else really, you can usually just download a .img file and put it on your SD card.

This only works because it’s all targeting the same hardware, with the same boot process. Surely we couldn’t do anything like that in x86 land, across a wide variety of hardware. Doing something like that would be nearly imposs-

Puppy Linux is Hecking Cool

So puppy linux actually is sort of a meta-distribution. There’s ubuntu-based puppies, debian puppies, slackware puppies. What’s shared across all of them is the way it uses union filesystems. Bear with me because I’m going to get a bit in the weeds for a sec.

A union filesystem (in this case, aufs), merges a bunch of filesystems as layers into a single filesystem. If two filesystems have a file at the same filepath, the top filesystem will provide the file and hide the file in the bottom filesystem. Think of it like stacking pieces of paper:



You can also decide what happens when a new file gets created, which layer it gets written to.

Puppy has a layer stack that looks something like this:


  in-ram write cache, starts empty at boot
  pupsave, with user-created data and user-installed packages
  puppy (read only) - core puppy system
  adrv  (read only) - pre-installed application package set
  fdrv  (read only) - firmware
  zdrv  (read only) - kernel modules and such


The read-only layers here are squashfs files. If you’ve never heard of those, imagine a zip file that’s optimized to be mounted as a read-only disk image, and be really fast

Writes get written to the in-memory write cache. You can flush the write cache to the pupsave, either manually, periodically, or at shutdown. This is pretty great for keeping my system feeling snappy despite the awful usb stick performance, by deferring the writes until I’m not busy doing anything, but ultimately it’s not my focus here.

The pupsave is just a folder on my usb drive with any changes I’ve made to the system in it. There’s nothing special about it really. That’s what’s so cool though.

Let’s say you messed up, really bad. Oops, your system won’t boot anymore. In the boot menu, you can boot without the pupsave loaded and now you’re back to the same state you had with a fresh install. That means your installation is your rescue filesystem! Then you can go into the pupsave folder, delete or fix whatever file broke things, and reboot back into your working system.

Puppy Linux Mixtapes



This layering setup makes it really easy to make your own remix of the core setup. Don’t like the defaults? There’s a solution!


  extract the puppy.sfs or adrv.sfs with unsquashfs (depending on what you want to change)
  delete or add whatever you want
  recompress it with mksquashfs
  rename the old one and put the new one in its place.


From here you can use your new changes, send them to other people, whatever you want to do. There’s even tools to turn this into a new .iso file that you can install from.

We can turn pupsaves into mixtapes too!

Since the pupsave is just a folder with files, if you want to share your system with someone, all you have to do is turn your pupsave folder into a .sfs file with mksquashfs and send it to them. Puppy lets you load arbitrary .sfs files, so they can just boot up without their pupsave, load your .sfs file on top, and they are now effectively using your system. Then they can make changes on that, squash those up, and send them back to you or other people too.

You could even squash up just part of your system, like an application that was particularly tricky to install, or a collection of software for making music, or something that replaces all the coreutils with rust rewrites. Literally whatever you want!

What would make this even cooler?

This kind of thing gives me all kinds of ideas for ways to expand on it.

Consider, for example a system built like this:


  system user data
  system packages
  core


During normal system use packages can get installed to the package layer, and changes made to user data. Overwrite your package configs carefree! If you ever break it, just delete the config file and reboot or something, because you’re not destroying the package defaults by changing files on your system, just shadowing them.

Now let’s add two more layers:


  system user data
  system packages
  common user data
  common packages
  core


Imagine you’ve got three computers all running the same OS. One is arm, one is x86, one is PowerPC. There’s certain packages you want on all of the systems, and certain configurations and files you want everywhere. Those could go in the common layers, and get automatically synced between them all.

Want neovim and fzf on every system you use? Done, with the appropriate versions installed for your CPU. Core dotfiles synced without worrying about accidentally adding system-specific dotfiles to the sync? Yup! Tired of editing /etc/hosts everywhere? Well, maybe it’s time to just set up your own DNS server, but still, it’s an option. And of course you could bundle these up and share them with anyone else you want to.

Then add chroots.

Say you’ve got an x86 program you want to use on arm. No wine nonsense, you just only have an x86 executable because that’s what the devs built for. You could chroot into a union fs where the system packages layer was replaced with an exact duplicate of itself, except with all x86 packages instead of arm. Add qemu-user into the mix, and now inside that chroot your system is exactly the same as outside except you can run x86 programs.

If you’re into containers you could start using those here too, but you containerfolk are likely to be used to thinking with layers anyway.

Bringing it Back to Embedded



Imagine a world where custom raspberry pi images are distributed as .sfs files you drag and drop into your SD card. Edit a config txt to tell it which one to load at boot, it would layer that sfs overtop the core Raspberry OS. Each .sfs could automatically get its own save folder so they aren’t stomping on each others’ data. No more downloading the same base system 5 times over along with the bloated .img file. Less wearing out SD cards from reflashing them as needs change. Maybe we could even build this on top of something that isn’t overly platform specific Raspberry OS.

Imagine that world.

I don’t have a nice conclusion to tie this all together honestly. There’s a lot of cool ideas I’d like to explore to keep making mixtape culture easier on Linux. I feel it’s really important as a more user-focused aspect of open source, in contrast to all the focus on what open source means for developers. I know I’ve mentioned squashfs a TON here but it’s really not necessary for it, as Raspberry Pi proves. I just think the technology is neat and wanted to share it with you too, alongside the other strawberry swirl of ideas in this post.

One way or another, I hope you enjoyed reading. Bye!


why i don’t use analytics
2021-08-09T00:00:00+00:00
Social ranking numbers are poison to my mind. Likes, views, retweets, boosts, all these things are to me a social slot machine. Put words out, see which words get the numbers highest, chase that dopamine one more time.

It’s not about respecting your privacy. I could write code to count HTTP GETs on my server just as easily as I could add a google analytics js payload for your browser to run. I care about privacy, but that is not why.

The numbers are part of what drove me away from being an “active user” of timeline-based social media. I still post to mastodon and twitter, but I don’t live in those spaces the way I once did. On mastodon I can hide those numbers, but I can still click through to get a list of everyone who interacted with a post. And I know how to count.

I enjoy discussion with people whose opinions I care about. I enjoy when people reach out because a post spoke to them somehow. Numbers are not conversation.

I don’t want my blog to fall to that. I don’t want to know how many people read my words, how many people like it or dislike it. I don’t crosspost to vote-style aggregators, not because I don’t want the exposure but because if I’m the one linking posts in those spaces I will be the one looking back to see the votes.

This is my space, for my thoughts, my interests, and my feelings. Analytics would take that away from me. It would be your space. I would write for you, and I’d do it so much that I’d stop writing for me.

I refuse to let that happen.


eGPU with a Thinkpad x220 from 2011
2021-08-04T00:00:00+00:00
Something I’ve been quite enamored with recently is ExpressCard. It’s an older standard that’s conceptually similar to Thunderbolt; basically, it’s a slot in the laptop that can provide a USB 2.0, USB 3.0, or PCI Express interface.

My thought on learning this was “oh hey, can I break that PCIe interface out into a normal PCIe port?”. Luckily other people have had the same idea and there’s some cool stuff on the market that do this. This isn’t meant to be a comprehensive guide for what’s out there, but the folks over at egpu.io have a great buyer’s guide for this stuff if you want to try it for yourself.

Now, don’t expect the world from this. On the x220 it’s only a single PCIe gen2 lane, which comes out to 500MB/s (4 gigabits) of bandwidth available. Still, there’s a lot you can do with that, and it’s a huge step up from the ol’ Intel HD 3000.

The Hardware

I ended up getting the “EXP GDC Beast”. This thing is actually designed to work with a number of different connectors, and you can find it shipping with a cable that has an HDMI plug on one end and ExpressCard, m.2, or miniPCIe on the other end. It’s not actually HDMI, they just used that connector since it’s reasonably durable and has a lot of pins. There’s a few different hardware revisions out there, which are probably stability improvements. I ended up with a v8.5c, but I don’t know how much that matters in practice.



For power, you have two options. The simplest is to plug an ATX power supply into the Beast using the adapter that comes with it, and then use the ATX PSU to power the GPU as well if you need to. The other option is to use a 12v barrel supply of up to 150W. The device has a MOLEX 5557 6-pin output (seen on the right) that you can use for auxiliary power for the GPU when you’re using a barrel supply.

This thing came with some pretty spectacular packaging btw.




Personally I can definitely say my life has been changed. I’m not sure if I’ve broken spacetime enough to create infinite possibilities yet, but I’m working on it. I’m sure I just need to tweak the drivers a bit for that one.

Buying the parts

Outside of China, you can find the EXP GDC Beast on AliExpress, eBay, or Amazon, pick your poison there. All the sellers seems to be reselling the product, because when I got the thing it had a QR code linking back to a Taobao user page for who I assume is the original seller. I can’t actually confirm that because I’d need to log in to a Taobao account to view the page, but if you want to check for yourself it’s expgdc.taobao.com.

Aside from Taobao, if you want to find the cheapest option you’re best off looking at AliExpress, with eBay as a second choice. Both of those ship from China though, so if you want it faster you can find it on Amazon. I was only able to find one seller on Amazon that sold it with the ExpressCard cable. Here’s a link to that page if you’re looking for it. It’s not an affiliate link, I’m not about that life.

Finding the right search terms for the aux power cable was also a pain. I used to think I had a surefire way to find it but it turns out I got some cables where one half was the wrong size entirely. Right now I’m using this 8-pin to dual 6+2pin adapter, with a 6-pin plugged into the Beast and an 8-pin plugged into my GPU’s 6-pin port. You could probably do better if you felt like crimping your own cables but I don’t.

The Full Hardware Setup

Yeah so this looks absolutely incredible with everything plugged in. We are truly living in the cyberpunk future, and I DON’T mean like 2077.



This picture is actually from before I got the software side working, but while basking in the glory of the fully functional setup I forgot to snag another picture. Rest assured though, this is representative of how the hardware side of things looked when I got it working.

We’ve got a Radeon HD7770 plugged into the dock, powered from an ATX supply in the PC case behind it, hooked into my x220. The HD7770 is actually from 2012 so it’s era-appropriate to the laptop, which I particularly enjoy.

Software

Getting this all running in Linux was surprisingly painless. In the BIOS I just had to make sure I actually had the ExpressCard port enabled, and the device showed up in lspci right after that. At this point if you wanted to plug monitors into the GPU you’d just need to set up your Xorg config the same way you would if you had plugged it into a desktop. For this card in particular I also had to add radeon.si_support=0 amdgpu.si_support=1 to my kernel parameters, because this card defaults to radeon but I needed amdgpu for Vulkan and PRIME.

Speaking of PRIME, what is that? You can see the arch wiki page on it for the details, but the gist is it’s a way to render an application on one GPU and display it on another one. In this case we’re going to render on the eGPU and display through the iGPU onto the laptop display.

If your immediate thought is “ok but didn’t you say we only had 500MB/s of memory bandwidth?”, well, yeah. It could be worse though. My laptop has a 1366x768 pixel display, which works out to 1366 * 768 * 24bpp * 60fps bits per second, or about 200MB/s. That’s 2/5 of our memory bandwidth, but that still leaves 300MB/s which is plenty for a lot of older games. It’s better than the iGPU anyway, and I can always plug it into a monitor for better performance.

To get PRIME working I added these lines to my xorg.conf:

Section "Device"
    Identifier "Card1"
    Driver "amdgpu"
    Option "DRI" "3"
EndSection

Section "ServerLayout"
    Identifier     "X.org Configured"
    Screen      0  "Screen0" 0 0
    Inactive       "Card1"       # Device for your second GPU
EndSection


After restarting X11 I ran xrandr --setprovideroffloadsink 1 0 and I was good to go! Then for AMD cards all you have to do is export DRI_PRIME=1 to any application you want to offload to the GPU.

So how well does it run games?

I don’t have hard numbers for you, but I did a before and after test with Portal 1 at max graphics settings. the HD 3000 could run it, but was chugging along at 15fps and was a stuttery hell. With DRI_PRIME=1 this went up to a nice smooth 60fps with no performance problems to speak of, so I’m calling that a success. Also since this card supports Vulkan, I can even play Windows games with DXVK.

Honestly I’m just really impressed with how painless this whole thing was. Props to all the work the driver devs out there do because I was expecting this to be much more of an ordeal. That’s the upside to just having PCIe to do whatever you want with I suppose. I’m just piggybacking on the work that’s been done for laptops with hybrid graphics and desktops, and nobody needed to write specific support for this usecase.

Anyway, that’s been my hardware adventures the past couple days. I have some ideas of more cursed directions to go with this adapter but I’m leaving it there for now. Thank’s for stopping by!

– artemis


The Workgirl Keyboard Layout
2021-07-30T00:00:00+00:00
There’s a keyboard layout I’ve been using for the past 8 or 9 months. It’s called the workgirl layout. It may look familiar to some of you:



By no coincidence, this layout happens to have the exact same key placement as the workman keyboard layout. Rest assured though, it is not the same layout.

Oh, you’re still wondering what the difference is? The name, silly!

Installing the Workgirl Keyboard Layout on Linux

X11

These days, X11 comes with this partially installed already. Here’s what you need to do to finish the installation:

First, run this command as root,

find /usr/share/X11/xkb -type f -print0 | xargs -0 sed -i 's/workman/workgirl/g; s/Workman/Workgirl/g'


Then, save this config to /etc/X11/xorg.conf.d/90-workgirl.conf

Section "InputClass"
	Identifier "workgirl-layout"
	MatchIsKeyboard "yes"
	Option      "XkbVariant" "workgirl"
EndSection


Linux TTY

Unfortunately, this doesn’t come installed out of the box, but you can still install it.

As root, run these commands to install the keymap:

mkdir -p /usr/local/share/keymaps/i386/workgirl

curl -sSf 'https://raw.githubusercontent.com/workman-layout/Workman/master/linux_console/workman.iso15.kmap' | sed 's/workman/workgirl/g; s/Workman/Workgirl/g' | tee '/usr/local/share/keymaps/i386/workgirl/workgirl.map' > /dev/null


Then, if you’re using a systemd system, add this to /etc/vconsole.conf

KEYMAP=/usr/local/share/keymaps/i386/workgirl/workgirl.map


If you’re not using systemd, consult your distribution’s documentation. It’ll probably involve running loadkeys. On puppy linux I just had to install kbd and console-data, and then add loadkeys /usr/local/share/keymaps/i386/workgirl/workgirl.map to my /etc/rc.d/rc.local.

Wayland

I can’t use wayland due to various missing accessibility features that I need, so I’m afraid I don’t have instructions for that. Feel free to contact me if you have some recommendations that work.

“But I’m not a workgirl either”

I may vibe with the word “girl”, but I’m enby, so I get it. I went with “workgirl” for this post as a protest against the name “workman” that’s recognizably derivative without it coming across as plagiarism.

I highly encourage you to come up with your own name for your keyboard layout. Gender is a scam, and so is the idea of a canonical name for a keyboard layout. Name your stuff whatever you want, and don’t let other people force their names on you!


Extended Lua Hashbang - Portable Lua Scripts
2021-04-25T00:00:00+00:00
So you have a lua file, and you want to be able to run it from the command line like ./your_script.lua. In languages like python or ruby you would accomplish this by adding #!/usr/bin/env ruby or #!/usr/bin/env python as the first line of the file. The #! magic pattern, known as a shebang (sheh-bang) or hashbang, tells linux to use a specific command to execute the file. You can do this in Lua too, but it’s not the most portable option.

If you use #!/usr/bin/env lua, it’ll work just fine as long as a program named lua exists. However, some people use an alternative lua implementation called LuaJIT. LuaJIT provides an executable named luajit, so the simple shebang won’t find it unless the person running your code has added lua as an alias to luajit. If your code is compatible with both the reference Lua and LuaJIT, it’s nice to make your script work with both out of the box.

The following solution is modified from code by William Ahern on the lua-l mailing list. This will search $PATH for lua, lua5*, and luajit*, executing the script with first one it finds:

#!/bin/sh
_=[[
IFS=":"
for dir in $PATH; do
    for lua in "$dir"/lua "$dir"/lua5* "$dir"/luajit*; do
        if [ -x "$lua" ]; then
            exec "$lua" "$0" "$@"
        fi
    done
done
printf '%s: no lua found\n' "$0" >&2
exit 1
]]
_=nil

-- Now we're running lua code
print("lua code here!")


So there you have it. Add those 14 lines at the top of your lua script, mark it as executable with chmod +x your_script.lua, and you’re done! You can now run ./your_script.lua as a command on a system with lua or a system with luajit.

We’ll get into this in the next section, but that’s just linux shell script code at the top there. You could extend this solution even further to do fun things like auto-installing dependencies with luarocks.

Wow, how does that even work?

First, notice our hashbang is #!/bin/sh, so linux is actually going to run this file as if it’s a shell script. When running as a shell script, the code searches for an available lua interpreter on the system, and then re-executes the file as a lua script. Our code therefore means two things in two different programming languages!

Let’s break this down.

When running as /bin/sh:


  _=[[ sets the _ shell variable to the string "[["
  the for-loop runs, looking for an available lua interpreter in $PATH
  if no lua is found, sh fails with exit 1
  if a lua is found, exec replaces the sh process with a lua process, providing the current file as the first argument


Then, when running as lua code:


  _=[[ starts defining the lua variable _ as a multi-line string
  the sh code is ignored because it’s in the string
  ]] closes the multi-line string
  we re-assign the _ variable to nil to clean up by undefining it
  the rest of the lua file runs as normal


There’s one other nuance of shell scripts that makes this work. Most scripting languages these days will try to parse an entire file before running any of it. A language like that won’t run your code if there’s syntax errors anywhere in the file. So, given most lua probably isn’t going to be valid shell script code, why does this trick work?

Well, sh and derivatives like bash work differently. They parse the file line by line as they execute, instead of parsing the whole thing at once. That means they don’t care if you have a syntax error half way down the file; if they never get to that line of code then the error doesn’t exist! Tools like makeself even use this behavior to create self-extracting tar files, by adding a small shell script to the beginning of the tar file that extracts the rest of it.

And that’s it! That’s all the magic. Now go write some lua scripts!


Your Anti-bot is Not Accessible
2021-03-21T00:00:00+00:00
TL;DR: Input sequencing and automation tools such as autohotkey scripts, hardware macros, auto-clickers, and turbo buttons are important accessibility tools that allow people with disabilities to play games they’d otherwise be unable to play. These tools are often banned in multiplayer titles, particularly MMORPGs, in the name of fairness and bot prevention. I argue that these tools should be allowed, or even implemented within the game itself. With the recognition that a line has to be drawn somewhere, I suggest that a tool should be classified as a bot only if it automatically makes meaningful decisions in response to stimuli provided directly by the game, creating a feedback loop that does not involve the player. Further, I suggest that in the games that can’t allow external tools fairly, first-party accessibility features can still make the game playable for more people.

Who needs these things?

Well, me, for one. That’s why I’m writing this. I have chronic injuries in both my hands and wrists that severely limit how I can use them and how much I can use them. It sucks, and its forced changes in a lot of my habits, but I can still keep doing a lot of the things I love with the help of input automation and alternative control schemes.

So automation tools help people like me, and they more generally help anyone with limited mobility. But there’s another angle here: they’re preventative too! Having the kinds of injuries I have fucking sucks, but my injuries are caused by the proverbial death by a thousand papercuts. These tools can make games and other tasks less physically taxing by reducing those cuts by orders of magnitude, reducing the chance that someone ends up in my position at all.

Where do the problems start?

Using automation tools in single-player games is purely a technical challenge of fitting the tool to the game. With multiplayer games though, these tools are often a grey-area in the rules or flat-out banned. I’ll use World of Warcraft as an example, since its a game I like to play.

Blizzard’s rule of thumb for input tools is “one user action per game action”. What’s pretty obviously disallowed by this is auto-clickers. I can make a game controller cast a spell when I press a button, but I can’t make it cast a spell and then cast it again in 1.5 seconds, and keep going until I let go. This sucks!! It’s the difference between me pressing the button once per fight or 10 times, which means I have to stop playing a lot sooner than I’d like to to avoid hurting my hands. My options for making this easier are reduced to finding buttons that hurt less to push, which has quickly diminishing returns.

There’s also scenarios like combining dwell-click with eye-tracking, something a tool like talon makes possible. In that setup, the mouse moves to where you look on screen, and then initiates a click after some delay. What even counts as a discrete user or game action? It’s unfortunately not something you can get a clean cut answer on. Blizzard’s support has to give vague answers or risk giving official approval for something someone else believes breaks the rules.

The problem with these rules is they prevent players from using tools that make the game more playable, because any tool that may be in the spirit of the game could still get them banned. So the choice is between not playing the game, or playing with a constant fear that your account could be banned permanently because of how you play the game.

When is something a bot?

The point of these rules is not to prevent disabled people from playing the game. It’s to prevent botting, so that in areas of competition players are competing against other players, not automation scripts. In MMORPGs this is particularly important for economic reasons, as large botnets left unchecked will flood markets with items, crashing their prices, and generate large sums of in-game money from non-player money sources to sell to players for real currency, breaking the economy even more.

So clearly you can’t permit everything. Where do you draw the line for rules?

I think the difference is decision making. In a reductionist view of video games, a player receives a stimulus from the game, and must make a decision. This decision affects the state of the game. This prompts a new stimulus, thus completing the feedback loop.

A bot then, is a tool that removes the player from this loop. The bot receives stimuli from the game and makes meaningful decisions without a player getting involved. Banning tools by this standard allows the vast majority of accessibility tools to be used without fear.

This probably doesn’t break your game

Multiplayer games are inherently a competition of decisions. A turn-based strategy game tests long-term planning and strategy. A shooter tests reflex, coordination, tactics, and on-the-fly decisions. An MMO tests your ability to look up the optimal character build online and then do what the leveling guide/raid leader/fleet commander tells you to do (joking! but only half joking). Performing a pre-determined set of actions isn’t a decision, but choosing to take those actions is, and that’s the part the player needs to do.

However, if the optimal strategy for a game is to perform a precise pre-determined set of actions, then automation just can’t apply. There’s still steps you can take towards accessibility though, to add first-party options that are even better.

Accessibility is a part of game design

The rhythm game osu! takes a reasonable compromise here. For any level, there is a pre-determined sequence of inputs that earns the best possible score. However, it has all sorts of modifiers to make the game easier. Some carry a score penalty while slowing the level down or making hit targets larger. Others automate parts of the game entirely, automatically clicking when appropriate of moving the mouse around to aim at targets. These automation modifiers invalidate the score for global leaderboards, but allow people to have fun playing levels they otherwise couldn’t.

I love this concept of first-party gameplay modifiers and I’d like to see it applied more often. Modifiers that affect what reaction speed is necessary, what types of inputs are needed, how many, all of these are fantastic for making a game playable by more people.

It’s the easy way out for a dev to throw their hands in the air and say “sorry, I guess you just can’t play this game without breaking our rules and getting banned”. Put some compassionate thought into it; you can probably allow more than you realize. And if you really can’t allow automation, remember that you have the power to design a solution yourself. It will be appreciated.


Declawing the Dragon - Voice Coding in 2020
2020-04-08T00:00:00+00:00
In this post I am going to talk about programming with speech recognition software, also known as voice coding. Voice coding as a concept is nothing new, though you may not have heard of it. Here are some talks you can check out if you want to see what this actually looks like in practice:

  PYCON US 2013 - Using Python to Code by Voice
  The Eleventh HOPE (2016): Coding by Voice with Open Source Speech Recognition
  DECONSTRUCT 2019: Emily Shea - Voice Driven Development


Until the past few years, one thing all of the tools for this had in common is that they relied on a piece of software called Dragon, a dictation tool developed by a company called Nuance. It’s quite pricey at $300, but it’s the best you can get for running dictation on your local machine. Unfortunately, it only runs on Windows; previously, there was a macOS version, but that was discontinued in 2018.

This was the big barrier for nearly every voice coding software out there. Sure, there were some tricks that allowed you to run Dragon in a virtual machine or on some other piece of hardware, but Dragon was always somewhere in the equation if you wanted something that was actually usable.

This is no longer the case! Two of the talks that I linked at the start of this post use software which does not require Dragon. The 2016 talk uses a tool called Silvius, however the main website for it appears to be down at the time of writing this post, so I’m unsure of the status of this project. The video from 2019 uses an accessibility tool called Talon, which can use either Dragon or its own voice engine based on a fork of facebook’s wav2letter.

Talon actually works beautifully! Sure, it’s not as good for dictation, but it’s usable (in fact, I’m using it to write this post), and it’s perfectly fine for system control and programming. Critically for me, it’s also cross platform. It only works on x86_64, but it will work on macOS, Windows, and Linux. You can even host the voice processing engine on one machine and have Talon use it from another machine, allowing you to make use of the software on low power hardware.

The barrier to entry for this sort of thing is much lower than it used to be, and continues to fall. The $15/month for the Talon beta with wav2letter support can be waived for those who can’t afford it, and as I mentioned, the actual voice engine itself is open source, so it’s feasible for somebody to build their own software from scratch on top of it. There may be even more solutions that I’m not aware of out there.

Dragon is still the best around, but it’s no longer the only option. If you decided not to try voice coding in the past because of the dependency on Dragon, I highly recommend you look into it again.


Mirroring YouTube Playlists
2020-02-04T00:00:00+00:00
Recently I wanted to set up a periodic job to mirror some of my personal
youtube playlists. There’s plenty of reasons one might want to do this. For me
it’s simple: one copy is none copy, and two copies is one copy, so I want a
second copy of youtube videos I care about stored locally. This protects
against videos getting removed, copyright stuck, youtube shutting down, or
anything else that might make the youtube video otherwise unavailable.

The final straw for me was the recent YouTube Kids change. YouTube has
implemented a system whereby it flags videos as “For Kids”, and a video flagged
as “For Kids” can not be added to playlists, or interacted with beyond viewing,
and may eventually start disappearing from search results. This seems kinda
fine on the surface, but the automated systems flagging videos are very bad
at actually detecting kid-friendly content. For example, Pony Music Videos
(PMVs), which are fan made cuts of My Little Pony video footage over top
unrelated songs, have been getting flagged quite a bit, even when the song has
obviously explicit lyrics.
(EDIT 2020-03-02: this video has since been un-marked as “For Kids”,
so you’ll have to take my word that it was “For Kids” when this post was written.)

The Script

Anyways, I just wanted a simple script adding some extra features around
youtube-dl, so I made one!
You can find the full script at
https://github.com/faithanalog/x/blob/master/youtube-archiver/archive-playlists.
I’m going to go over a few specifics in case you’re interested in making your
own rather than just using my script or someone else’s. But, if you don’t care about that, and you want to use my script, here’s what to do:

First, create a list of playlists in a file called, for example, playlists.txt

Nightcore Songs
https://www.youtube.com/playlist?list=PLckeMyCaCCIN_JU1V4oADW50DlGOREoLj

Vaporwave
https://www.youtube.com/playlist?list=PLgP_WFDJWjxTRPJtV4DV99lGB92rq5we_


Optionally, create a list of SOCKS proxies
you want to use in another file, for example, proxies.txt

127.0.0.1:9050
example.com:1080


Then you can run my script!

# If you don't have any proxies to use
archive-playlists playlists.txt

# If you do have proxies to use
archive-playlists playlists.txt proxies.txt


This will create folders with the names provided by playlists.txt, and
download each playlist to its own folder. You can run it periodically however
you feel most comfortable (crontab, systemd timers, etc.) and it should keep
all the local copies up to date.

Details

The requirements I had for my script were pretty simple:


  Download videos, thumbnails, subtitles, and video metadata
  Write playlist information to its own file
  When re-running the script, download only the newly required data
  Ideally, avoid getting myself IP-banned


Most of this can be taken care of with various youtube-dl flags! I have a
function called dl_playlist() which implements all of this. Arguments passed
to dl_playlist() are transparenltly passed on to any youtube-dl commands,
which allows me to pass it a playlist URL, and optionally a proxy.

First, we download the playlist metadata:

playlist_json="playlist-$(date --rfc-3339=date).json"
youtube-dl \
    --flat-playlist \
    -J \
    "$@" < /dev/null > "$playlist_json"

temp_playlist_json="$(mktemp -p ./)"
cp "$playlist_json" "$temp_playlist_json"
mv "$temp_playlist_json" playlist.json


This writes the current metadata to a dated file, copies that to a temporary
file, and then renames the temporary file to playlist.json. This allows us to
have a history of the playlist over time, which may make it easier to figure
out what videos are missing when they get taken down. playlist.json will
always contain the latest playlist information, to make it easier for me to
write tools for this later.

The process of copying to a temporary file and then renaming the temporary file
may seem overkill, but mv is an atomic operation, so it eases my mind about
any race conditions I might run into if I’m running other tools over this data
to process it.

Next, we can download the playlist videos.

youtube-dl \
    --download-archive ytdl-archive.txt \
    --write-info-json \
    --write-description \
    --write-thumbnail \
    --all-subs \
    -i \
    -f bestvideo+bestaudio \
    -r 500K \
    --sleep-interval $min_sleep \
    --max-sleep-interval $max_sleep \
    "$@" < /dev/null


Here’s where all the fun flags come into play!


  --download-archive ytdl-archive.txt tracks which videos have been fully
downloaded in a file called ytdl-archive.txt. This allows youtube-dl to skip
loading the page for the video entirely on subsequent runs.
  --write-info-json, --write-description, and --write-thumbnail are all
fairly self explanatory. I don’t even really need to write the description,
since the info json contains it, but it might be convenient to have in its
own file.
  --all-subs instructs youtube-dl to download all available subtitles. I
don’t really know which subtitles I need, so might as well just have them all
available.
  -i makes youtube-dl ignore download errors. Without this, if a video is
missing, youtube-dl will stop running after trying to retrieve it, and skip
the rest of the playlist.
  -f bestvideo+bestaudio makes youtube-dl download the best quality video and
audio files separately, and merge then into a single .mkv file.
  -r 500K rate-limits downloads to 500 Kilobytes per second. This is part my
attempts to avoid the ire of automated IP-bans. I don’t have any source on
what a safe range for download rates is, but this seems to work for my
usecase, so I’m keeping it.
  --sleep-interval and --max-sleep-interval together specify the minimum
and maximum sleep times. youtube-dl will pick a random time between these
values between downloads. I was a bit confused about the semantics of what
counts as a download- youtube-dl sleeps after downloading a thumbnail, video
file, or audio file, but doesn’t sleep after downloading the info json.
Anyways, I use a range of 30-90 seconds, and this seems to keep things slow
enough that it shouldn’t be a problem.


The rest of the script is just housekeeping around this. It loads playlists
from a file, creates separate directories for each one, and randomly shuffles
through a list of SOCKS proxies for each
playlist. My VPN provider provides SOCKS proxies for each of their exit nodes,
so this is a really convenient way for me to distribute my downloads across a
broader range of IP addresses.

And that’s it!  The script is pretty well commented I feel, but if you have any
questions about it, feel free to ask!

Thanks to everyone on the fediverse who helped me iterate on this script and
iron out the details. ❤️

– artemis


SSTV Tx/Rx with a Pi and an RTL-SDR
2016-08-29T00:00:00+00:00
It turns out that transmitting and receiving SSTV signals is pretty easy, using
just a raspberry pi as a transmitter and an RTL-SDR
as a receiver. There are a few programs which you’ll need to install before you
begin:


  
    rpitx includes tools for encoding and
transmitting SSTV signals with a raspberry pi. This should be installed on
the pi itself.
  
  
    gqrx is a graphically controlled SDR program, which will
be used for receiving and recording the SSTV signals. Put this on whatever
computer your SDR is hooked up to.
  
  
    qsstv is an open source
cross-platform SSTV decoder which can decode directly from recorded audio.
  
  
    imagemagick will be used for
converting images to a format rpitx can understand.
  


If you’re on linux, the last three may be available in your distribution’s
package repositories; make sure to check those before wasting time downloading
and installing them manually.

Additionally, be aware that this may not work with the Raspberry Pi 2 or 3.
Stick with a Pi A/B/B+ for best results. A Pi Zero may also work, but I haven’t
tested that.

Scroll to the bottom for a demo video of GQRX and QSSTV in action, using the
realtime decoding technique described towards the bottom of the post.

Transmitting SSTV

1. Resize your image to 320x256 pixels. I did this by cropping my image with
    imagemagick. This command will resize, maintaining aspect ratio of the input,
    and then crop the result to get to 320x256 pixels:

convert  -resize '320x256^' -gravity center -extent 320x256 


2. Convert your image to an 8 bit depth RGB file.

convert -depth 8  .rgb


This step can be combined with the previous step.

convert  -resize '320x256^' -gravity center -extent 320x256 -depth 8 .rgb


3. Convert the rgb file to a .st file for rpitx, using the included tool.

pisstv .rgb .ft


4. Broadcast with rpitx

sudo rpitx -m RF -i .ft -f 


Receiving SSTV

Capturing and recording signals will be done with GQRX. SSTV decoding will then
be done afterwards with qsstv.

Recording with GQRX

1. Open GQRX. Tune to the frequency you’re transmitting on

2. Set your decode mode to USB, and drag your filter width to 3 k.

3. Begin transmitting from your Raspberry PI. We won’t decode this initial
    broadcast. Instead we’ll be using it to properly tune our receiver.

4. Center your decode filter area over the signal seen in the waterfall. You
    should have something which looks similar to the image below.



5. Adjust your audio gain in the bottom right to be be audible. Hit “Rec”, and
    then restart your SSTV transmission. Audio will be recorded to a wav file in
    your home folder. Stop the recording after your SSTV transmission completes.

Decoding with QSSTV

1. Open up QSSTV. Navigate to Options -> Configure. Go to the “Sound” tab, and
    set the sound input to “From file”.



2. Go back to the main QSSTV screen. Configure your SSTV settings to match the
    image below.



3. Click the “play button” icon in the top left corner. You’ll be prompted to
    select a wav file to decode. Navigate to your GQRX recording and select it.

4. Wait. Decoding may take some time and appear to freeze , but if all goes
    well you should have a fully decoded SSTV signal.



Realtime Decoding

It’s actually possible to decode SSTV data in real time with a slight
modification to the recording and decoding process! You’ll need to install
netcat and sox for this to work.

1. In GQRX, select the “UDP” option instead of the “Rec” option. This will
    make GQRX send audio data as 1-channel raw pcm s16le data over UDP to some
    address. The default address is localhost on port 7355, but you can
    configure that by clicking the “…” button next to “Play”.

2. Now we need to make a
    FIFO for QSSTV to read
    from. Since QSSTV looks in $HOME/audio for files by default, that’s where
    I put mine, but you can put it anywhere.

mkfifo $HOME/audio/realtime


3. Use the following command to listen for data from GQRX and write it to the
    FIFO as it’s received. The command will stop after you turn QSSTV’s decoder
    off, but you can pause/unpause/restart GQRX as much as you want with no
    problems while this is running.

netcat -l -u localhost -p 7355 \
    | sox -r 48000 -e signed -b 16 -c 1 -t raw - -t wav - \
    > $HOME/audio/realtime


4. In QSSTV, click the “play button” icon as before, and select the “realtime”
    file for playback. The file size will show up as 0 bytes, that’s normal, but
    at this point QSSTV should be receiving data. and displaying it in its
    waterfall on the right hand side of the screen.

Demo

Here’s a demo video of the process. Notice how QSSTV has three red bars. When
the decode initializes, the lowest tone should peak at the left red bar, and the
highest tone should peak at the right red bar. If they’re significantly off from
where they should be, and you aren’t getting an image decode (or the image
looks wrong), that’s how you know you need to adjust your tuning a bit more.




Your browser doesn't support HTML5 video tag, or you have it disabled.



TI-BASIC Bejeweled
2015-04-30T00:00:00+00:00
TI-BASIC is the unofficial name for the programming language included on the
stock operating systems of the TI-83 and TI-84 Plus series of calculators. This
includes a large set of calculators, the most popular these days being the
96x64 monochrome TI-84+SE, the 320x240 16 bit color TI-84+ Color SE (referred
to as the “CSE”), and the newly released TI-84+CE which features a LCD which is
significantly faster to access than the CSE and a processor upgrade from a z80
to an ez80.

The 84+ and 84+CSE use z80 processors clocked at roughly 15mhz. This is an old
CPU architecture, with no modern features like caching, out of order execution,
floating-point units, or even pipelining. As a result of the CPU limitations,
the slow interpreter, and the fact that all numbers in TI-BASIC are 9 byte
floats, programmers may find that even the simplest of games they make run too
slowly to be playable. Many turn to projects like the Axe
Parser to provide a faster
compiled language. Some bite the bullet and learn how to program games in
assembly. However, those that continue with TI-BASIC learn the ins and outs of
the language, and how to make it work for them.

Quirks of TI-BASIC

TI-BASIC has a few things that should be known before reading the following
code. All numeric variables are one letter. Valid variables include A-Z and θ
(theta). There are also one dimensional List variables such as L₁ and two
dimensional Matrix variables such as [A]. List and matrix indices are in the
range of [1,Length]. Matrix values are accessesed with
[Matrix](Row,Column).

Creating a Match-3 Game

One of the first problems anyone who creates a match-3 game (like Bejeweled)
runs across is detecting matches. Matches need to be detected wherever they may
lie on the board, so that chain reactions can occur from tiles falling in. One
of the simplest solutions works just fine for anything written to run on modern
hardware.

for every row
    counter = 0
    last_seen = 0      //Last type of tile scanned
    for every column
        if last_seen == tile_at(column, row)
            counter++
        else
            counter = 1
        if counter >= 3
            set_tile(column - 2, row, 0)
            set_tile(column - 1, row, 0)
            set_tile(column    , row, 0)
        last_seen = tile_at(column, row)


This works for removing tiles for matches greater than or equal to 3 in length.
It removes all horizontal matches; row and column can be switched to find
vertical matches. The problem is that this code is somewhat complex, and as a
result the TI-BASIC version of it runs fairly slowly.

//R = row
//C = column
//I = counter
//L = last_seen
//[A] = matrix storing tiles
//Clears vertical matches
For(C,0,7)
  0→L
  0→I
  For(R,0,7)
    [A](R,C)→X
    If L=X:Then
      I+1→I
      If I≥3:Then
        For(A,R-2,R)
          0→[A](A,C)
        End
      End
    Else
      1→I
      X→L
    End
  End
End




Experienced TI-BASIC developers will notice that this version is written for
clarity over speed or size. Even with all the micro-optimizations that can be
added to the code, it runs too slowly to be useful, as shown in the included
recording. From the time that the cursor disappears to the time it reappears,
the player can’t do anything but stare at the screen. The displayed version
also has the added optimization that it won’t check columns or rows which could
not have changed since the previous check, since no changes implies no matches,
but it’s still not enough to be playable.

So what can be done to make this program run better? Up until this point I had
thought of the problem solely as counting the number of matching tiles, but it
can be thought of another way, one which the calculator has built in functions
to deal with. They key lies with the DeltaList() function.

DeltaList(), or ΔList() as it’s displayed on the calculator, takes a list and
returns a new list with the difference between the elements of the source list.
The result will be one element shorter than the source list, since there is no
element preceding the first to subtract from. Therefore
DeltaList({1,3,3,3,2,4}) results in {-2,0,0,-1,2}. If the difference
between two elements is zero, they are equivelant! Applying DeltaList twice to
a list should then place zeros only at the end of matches of three, indicating
exactly where matches exist with no counting being done in TI-BASIC code.

DeltaList({1,3,3,3,2,4}) = {-2,0,0,-1,2}
DeltaList({-2,0,0,-1,2}) = {-2,0,-1,3}   //0 = Match


Dealing With False Positives

Unfortunately, there’s a flaw with simply using DeltaList(DeltaList()) on our
input. What if we provide the list {6,1,2,3,5}?

DeltaList({6,1,2,3,5}) = {-5,1,1,2}
DeltaList({-5,1,1,2})  = {6,0,1}    //0 = False Positive


We get a false positive. That simply won’t work for the game, but the solution
isn’t difficult: Raise the source list to a power. Raising to the power of two
results in two possible sequences that give false positives, {7,5,1} and
{1,5,7}, but raising to the power of three results in none. Even with this
extra calculation, the speed still beats out calculating matches in TI-BASIC.
The new matching code (minus optimizations) looks something like this.

For(C,1,8)
  //Store column C of matrix [A] into list L₁
  Matr►list([A],C,L₁)
  //not() normalizes the list so that matches are 1, non-matches are 0
  not(DeltaList(DeltaList(L₁^3)))→L₁
  For(Y,1,6max(L₁))
    If L₁(Y):Then
      //Y is equal to the start of the match because the length
      //of L₁ is two less than the size of the board
      For(A,Y,Y+2)
        0→[A](A,C)
      End
    End
  End
End




Not only is the code faster, it’s also much simpler. In TI-BASIC, the less code
running the better, and here all the code does in the inner loop is a simple
If and a mass set to zero. The result is a much more playable match-3 game.
The recording shown is a more complete version with a scoring and level system,
but it displays the speed advantages when making matches quite well. Not
everyone is lucky enough to be able to reframe their problem in a way that fits
the language so well, but these sort of tricks are what really allow people to
continue to create impressive games in this incredibly restrictive language.

Hidden Complexity

To the untrained eye, it may seem that I’ve removed the loop necessary to count
the number of matching elements in a row or column. Not only have I not done
that, I’ve increased the amount of work that needs to be done.
DeltaList(DeltaList()) has to take the difference of the input list, and then
further take the difference of the differences. Additionally, the input list
has to be cubed first! In reality, I’ve simply hidden the loop by moving the
work out of my code and into the interpreter’s code. DeltaList() is implemented
in assembly, while my naïve solution is not. All the operations being performed
on a particular row or column are still O(n), but the assembly is able to work
so much more efficiently that it results in a noticeable speed up.

Game-Over Detection

Every good match-3 game needs to be able to tell the player when the game is
over. Otherwise, they may spend a long time staring at an unsolvable board,
before finally getting frustrated and quitting. Unfortunately, checking if a
game is still playable is a rather long process. My initial plan was to perform
a bunch of pattern matching using more DeltaList(). The code builds a list by
reading different patterns of 3 tiles from the board that, if they all contain
the same tile, indicate a potential future match. This strategy works, but it’s
extremely slow. To solve this problem, the matching can be staggered. On every
repetition of the main game loop, the game processes key input and then
performs one step in the game-over detection. There are 48 total steps. If the
48th step is reached, and returns no solutions, the game is over.

Saving the Game

Saving the game is a fairly simple process. A few variables are saved to a
TI-BASIC List, and loaded at next start-up. This presents two problems: How can
the program handle an archived (unaccessable) list, and how can it verify that
the data is valid? To handle archived lists, it tries to access the list very
early on in the program, ensuring that if the list is archived the program will
fail quickly, without wasting time. To verify data correctness, an error
detection code is calculated from the other 4 values in the list. If the error
detection code stored in the list does not match the code calculated from the
saved values, the game resets to its default state. As an added bonus, this
means that the game can simply create the list if it doesn’t exist and read in
the zeroed out list to load the game. It will fail the error detection, forcing
the game to the proper initial state.

A Note On Graphics

Anyone who has even lightly touched on using TI-BASIC knows that the level of
graphics displayed in the two recordings above simply isn’t possible in pure
TI-BASIC at that speed. To work around this problem, people have developed
shells for launching these TI-BASIC programs that provide routines the programs
can call for graphics, advanced key input, and other various utility
operations. Unfortunately they can’t fully solve the speed problems of the
language, but they’re certainly a leg up on pure TI-BASIC. The library I’m
using for the demos above is called xLIBC, written by tr1p1ea and included in
the DCSE library created by
Christopher Mitchell.

The final game is downloadable
Here.


Prelude of the Chambered in Dart
2014-11-18T00:00:00+00:00
I was watching [REDACTED] write what was essentially a re-implementation of Doom in
Dart recently, and it inspired me to port Prelude of the Chambered (PotC) to
Dart.

Prelude of the Chambered is a game made by [REDACTED] for the Ludum
Dare competition awhile ago. It can run in
the browser, but the problem is it’s a java applet. These days, browsers throw
warning after warning at anyone loading an applet, and that’s assuming they
even have the java plugin installed. I figured porting it to Dart would solve
that problem, and be pretty simple as well, because Dart code is very similar
in syntax to Java.

I’ve posted the code to Github, or
you can try it out Here.  Since a lot of people
seem to get somewhat confused by the controls, know that space bar is used to
select menu options and attack. WASD/Arrow keys are for movement. A and D
strafe, left and right arrows turn.

Game Logic

Most of the game’s logic could be straight up copy-pasted with a few minor
corrections (mostly involving Dart’s requirement of a ‘.0’ after whole numbers
to indicate they’re doubles). I opted not to do one dart file per class,
instead grouping all entities in a ‘entity.dart’, all blocks in a ‘block.dart’,
and so on. The only actual logic that had to be rewritten was the level loading
code.

Prelude of the Chambered stores all its levels as png files. The RGB value of
each pixel is used to determine the tiles of the level, while the alpha value
is used to link triggers. For example, a button with an alpha of 1 would open
the door with an alpha of 1, or a pressure plate with an alpha of 254 would
open the door with the same alpha. This meant that to load levels from the
images, all color channels must be preserved.

The simplest way to load image data involves drawing it to a canvas and getting
the canvas colors. I found, however, that some browsers would return different
colors from the original when reading the canvas. I suspect the canvas might be
using pre-multiplied alpha when drawing, although I’m not entirely sure. I also
tried drawing the image with webgl to a framebuffer, and then reading the
pixels back in, but that also sometimes failed probably for the same reasons.
The solution was to use this dart image
library along with HttpRequests to
decode the images, avoiding the browser’s image decoder entirely.

Rendering

Porting the rendering was a bit more complex. PotC’s original rendering engine
was custom written during the competition, and doesn’t use an API like OpenGL
or DirectX. With WebGL available in the browser, it’d be a bit silly to do the
original per pixel effects instead of rewriting them to use WebGL. Most of the
world rendering was actually fairly simple. PotC has a method for drawing
walls, and a method for drawing sprites, so all that needed to be done was to
re-implement those methods. In WebGL it’s fairly trivial since it’s just
drawing two triangles. Floors were also simple, the code is almost identical to
the wall rendering code. Truthfully, most of the interesting rendering code
involves the shaders.

For the 3d world, the first thing that needs to be done is to pass the depth of
each vertex to the fragment shader. The value needed is the Z coordinate of the
position vector after being multiplied by the projection matrix. This gives us
the distance (in blocks) away from the camera of the vertex. In the 3d fragment
shader, this depth is used to implement fog. The original PotC also has some
weird post-processing effect which uses the pixel position to create a sort-of
noise over the 3d viewport. Fortunately, that can also be implemented in the 3d
fragment shader, so it doesn’t need a separate pass. Finally, PotC’s textures
use magenta (#FFFF00) to represent transparency, so the shaders discard any
fragment which reads magenta from the source texture.

The other interesting shader is the shader used to draw the ‘hurt’ texture. The
hurt texture is displayed any time the player is hurt in the game. In the
original, the effect is created in real time by constructing a Random object
with a preset seed, and comparing random values for each pixel with a function
of time so that less and less of the hurt effect shows up each frame.
Re-implementing that exactly wouldn’t be very efficient in WebGL, so I took a
different approach. The texture is generated once when the game launches,
saving the random value used in the original PotC as the alpha channel of the
texture. Then, time can be passed as a uniform to a fragment shader which has
the comparison code in it.

Audio

Audio was the quickest part of the port. The WebAudio API works well, and all
the sound clips are .wav files so they load pretty much everywhere. It was as
simple as loading the audio clips and playing them.

What I’ve learned

Creating this port was an interesting process. I’ve found that a lot of java
code is actually pretty easy to port to Dart, even without taking the time to
understand what the original code does. I think that a lot of games could also
be ported without too much trouble, as long as they don’t depend too much on
third party libraries. I have a better understand now, too, of the versatility
of GLSL shaders. I think this is yet more proof that the web APIs are ready for
larger and more complex games. I’m interested to work on some of my own games
in Dart, and see where it takes me.


z80 Assembly: Binary-Coded Decimal
2014-11-06T00:00:00+00:00
One method for displaying numbers larger than 16 bits is to convert it to
Binary Coded Decimal (BCD) first, and display the result. BCD works by using
four bits to store each decimal (base 10) digit of a number. The following code
can convert a number to BCD, and display it. It’s currently written to convert
a 24 bit number to a 10 digit BCD number, but can be modified to support
anything really. It is memory ineffecient, because it uses one byte for each
digit rather than storing two digits per byte. This is useful though because it
makes the display routine simpler.

Conversion to BCD

This routine converts a little endian value to a little endian BCD value. It’s written for brass, a z80 assembler created by Ben Ryves.

;Converts a 24 bit unsigned int pointed to by HL to BCD
;Convert to BCD with double dabble method, see http://en.wikipedia.org/wiki/Double_dabble
;Handles up to a 10 digit number
;Input: HL - pointer to number
;Output: bcdScratch - stores BCD number.
NUM_DIGITS = 10
NUM_SRC_BYTES = 3
.var NUM_DIGITS, bcdScratch
.var NUM_SRC_BYTES, bcdSource
ConvertToBCD:
    ld de,bcdSource
    ld bc,NUM_SRC_BYTES
    ldir
    ld ix,bcdSource

    xor a
    ld hl,bcdScratch
    ld b,NUM_DIGITS
_zeroScratch:
    ld (hl),a
    inc hl
    djnz _zeroScratch

    ld b,NUM_SRC_BYTES * 8
    _bcdConvLp:
        ;Do increment
        ld c,NUM_DIGITS
        ld hl,bcdScratch
        ;Iterate through each BCD digit.
        ;If digit > 4, add 3
        _bcdIncLp:
            ld a,(hl)
            cp 5
            jr c,{@}
            add a,3
            @:
            ld (hl),a
            inc hl
            dec c
            jr nz,_bcdIncLp

        ;Shift SRC bits
        sla (ix)
        .for _off, 1, NUM_SRC_BYTES - 1
            rl (ix + _off)
        .loop

        ld c,NUM_DIGITS
        ld hl,bcdScratch
        _bcdShiftLp:
            ld a,(hl)
            rla
            bit 4,a
            jr z,{@}
            and %1111 ;Mask out high bits, since we only want the lower 4 bits for the digit
            scf       ;Set carry if bit 4 set
            @:
            ld (hl),a
            inc hl
            dec c
            jr nz,_bcdShiftLp
        djnz _bcdConvLp
    ret


Displaying BCD

Displaying BCD is quite simple, since each digit is stored within its own byte.
You can simply add the char code for ‘0’ to the value to get the text char you
want. This routine displays an unsigned BCD value, without leading zeroes

;Displays the BCD value at HL
;1 byte per digit
NUM_DIGITS = 10
DispBCD:
    ld de,NUM_DIGITS - 1
    add hl,de ;Go to end

    ;Skip leading zeroes, except if the value IS zero
    ld b,NUM_DIGITS - 1
_skipLeadingZeroes:
    ld a,(hl)
    or a
    jr nz,{@}
    dec hl
    djnz _skipLeadingZeroes
@:
    inc b ;B = num digits to display
_dispBCDDigits:
    ld a,(hl)
    add a,'0'
    push hl
    push bc
    
    ;Replace this call with anything that takes a char code in register A and displays it.
    ;For example, on the z80-based TI calculators, one might use b_call(_VPutMap)
    call YourCharacterDisplayRoutine
    
    pop bc
    pop hl
    dec hl
    djnz _dispBCDDigits
    ret



TI-84+CSE: Max Pixel Fill Speed with OTIR
2014-11-02T00:00:00+00:00
One of the fastest practical ways to write arbitrary pixels (no 1-color
rectangles) to the LCD is with an OTIR loop copying pixel data directly from
memory. That being said, I was wondering how many pixels you could
theoretically update per frame. As mentioned, this is theoretical. This does
not take into account the overhead of adjusting the LCD window, or any of the
other logic you may have.  It also assumes that interrupts are disabled.

The Code

The code I’ll be using to calculate max fill rate is

;Input: HL = data
;       DE = Number of pixels to output
;            Generally this would be width * height
;            This means that DE can never be the size of the ENTIRE screen,
;            just 85.3% of it.

    ld b,e
    ld c,11h

;If B != 0, we need an extra iteration, so D needs to be incremented
    dec de
    inc d

_dispSprtLp:
    otir
    dec d
    jp nz,_dispSprtLp


I use JP in the loop to save a few extra cycles. When calculating the numbers
below, I’m basing it only on the runtime of the loop, and I’m not calculating
the negligible setup time for the loop.

The Numbers

The TI-84+CSE uses a z80 processor, which can be set to run at 15Mhz. This
means that it has about 15,000,000 cycles (or T-States) per second. All
instructions take more than one T-State to run. The ones we care about are:

OTIR: (21 * (B - 1)) + 16 If B == 0, evaluate B - 1 as 255

DEC D: 4

JP nz, ADDR: 10

Now, the hardware adds an extra 4 T-states to ever OUT to the LCD port, so the
TRUE cost of OTIR is

OTIR: (25 * (B - 1)) + 20

Based on those numbers, we can calculate the amount of T-states for any given
input of DE

When B = 0, the loop takes (25 * 255 + 20) + 4 + 10 T-States, or 6409.
Therefore if E == 0

f(D,0) = 6409 * D

Otherwise if E > 0

f(D,E) = ((25 * (E - 1)) + 34) + 6409 * D

To simplify calculations, I’m going to ignore the fact that ‘D’ can not be
greater than 255. This allows me to use the above equation for the dimensions
of the entire screen, and get a reasonably close estimate of how much time it
would take.

The Calculations

The screen is 320x240 pixels large. Each pixel is 2 bytes. 320 * 240 * 2 = 153600.
153600 / 256 = 600, so D = 600 and E = 0. 600 * 6409 = 3,845,400.

To update the entire screen with this method it takes 0.25636 seconds. Not very
promising, but here’s a table of screen percentages vs how much time it takes
to update. For any dimensions that you want to calculate yourself, the simplest
way would be to use this formula.

(WIDTH * HEIGHT) / (320 * 240) * 3845400

Divide 15,000,000 by the result to get framerate.


  
    
      Percent of Screen
      T-states
      Theoretical Frames/Second
      Dimensions
    
  
  
    
      100%
      3,845,400
      3.9 FPS
      320 x 240
    
    
      50%
      1,922,700
      7.8 FPS
      160 x 240 (Half Res)
    
    
      5.3%
      205,088
      73.13 FPS
      64 x 64 (Large sprite)
    
    
      2.7%
      102,544
      146.28 FPS
      32 x 64 (Large sprite Half Res)
    
  



TI-84+CSE: Half Resolution Mode
2014-10-09T00:00:00+00:00
Half res mode is an LCD mode which results in a halved horizontal resolution.
This can also be used for double buffering, because one can write to the left
side of the screen while displaying the right side or vice versa.

Entering half-res mode takes 4 steps:

  Enable interlacing
  Enable partial images 1 and 2
  Position the source areas of both partial images on top of each other, and
make them 160 pixels in size.
  Set the output destinations of the partial images to be 160 pixels apart.


Enabling Interlacing

To enable interlacing, set bit 10 of LCD register 01h

ld a,01h
out (10h),a
out (10h),a
ld a,%00000100  ;bit 10 set
out (11h),a
xor a           ;a = 0 for low bits
out (11h),a


To disable interlacing, set LCD register 01h back to 0000h

ld a,01h
out (10h),a
out (10h),a
xor a
out (11h),a
out (11h),a


Enable partial images

Partial images are enabled by setting bits 12 and 13 of lcd register 07h for
partial images 1 and 2 respectively.

ld a,07h
out (10h),a
out (10h),a
ld a,%00110000  ;Enable both partial images
out (11h),a
ld a,%00110011  ;Default values for low bits
out (11h),a


To disable, reset bits 12 and 13

ld a,07h
out (10h),a
out (10h),a
ld a,%00000001  ;Disable both partial images
out (11h),a
ld a,%00110011  ;Default values for low bits
out (11h),a


Set partial image positions

The partial image display positions are controlled by LCD registers 80h and
83h. Their starting lines are controlled by 81h and 84h, and their ending lines
are controlled by 82h and 85h. When using half-res mode, set the display
position of image 1 to 0, and set the display position of image 2 to 160.

The start and end lines of both images should be identical.  For example, to
display the left half of the screen, set the start lines to 0 and the end lines
to 159. To display the right half, set the start lines to 160 and the end lines
to 319. In other words, to display a section of the screen starting at ‘N’, set
the start lines to N and the end lines to N + 159.

This code will set up the display positions, and then set the start and end
lines to display the left half of the screen.

ld a,80h        ;Img 1 disp position
out (10h),a
out (10h),a
xor a           ;Display position of first img = 0
out (11h),a
out (11h),a

ld a,83h        ;Img 2 disp position
out (10h),a
out (10h),a
xor a           ;MSB should be 0 since 160 < 256
out (11h),a
ld a,160        ;Display position of second img = 160
out (11h),a

ld c,11h        ;Used later to simplify some register setting code

ld a,81h        ;Img 1 start line
out (10h),a
out (10h),a
xor a
out (11h),a
out (11h),a

ld a,84h        ;Img 2 start line
out (10h),a
out (10h),a
xor a
out (11h),a
out (11h),a

ld de, 159      ;End lines for partial images

ld a,82h        ;Img 1 end line
out (10h),a
out (10h),a
out (c),d
out (c),e

ld a, 85h       ;Img 2 end line
out (10h),a
out (10h),a
out (c),d
out (c),e

Percent of Screen	T-states	Theoretical Frames/Second	Dimensions
100%	3,845,400	3.9 FPS	320 x 240
50%	1,922,700	7.8 FPS	160 x 240 (Half Res)
5.3%	205,088	73.13 FPS	64 x 64 (Large sprite)
2.7%	102,544	146.28 FPS	32 x 64 (Large sprite Half Res)