Managing Home Assistant Config From GitHub
2026-02-17 · https://dfederm.com/managing-home-assistant-config-from-github

Recently I wanted to do some refactoring of my collection of Home Assistant automations, which have grown and changed in various ways over the years and needed some TLC. This included modernizing the structure (the newer triggers:, actions:, and other syntax changes) as well as consolidating and simplifying similar triggers across multiple automations, for example by creating a "Home Mode" concept. Historically I have just used the File editor app (apps are the new name for add-ons), but that doesn't scale for larger refactors, and it also doesn't provide the safety of being able to roll back without fully restoring from a backup.

Install the Studio Code Server App

The first thing to do, and what makes a huge difference all on its own, is installing the Studio Code Server app, which greatly improves the editing experience. This app lets you run Visual Studio Code directly in the browser to edit your config files.

Installing the app is straightforward: navigate to Settings -> Apps, click "Install app", and search for "studio code server". It should be under the Home Assistant Community Apps. No additional configuration of the app is required; just install it and start it.

The code server opens to /config, which is your configuration folder, exactly the folder you want to manage.

Studio Code Server in the App Store

Note: I recommend turning the app off when not using it as it can hog quite a bit of memory and is even reported to have memory leaks. Hopefully these memory issues get fixed eventually, but for right now I find it easy enough to just turn it on when I need it and off when I am finished.

Setting up Git

Now to set up the git repo you’ll use to version control your changes.

Warning: Avoid the Git pull app. I tried it and ended up having to restore from a backup. The app appears to completely replace your config directory with the git repo contents. This is not only risky, but you shouldn’t be hosting the entire config directory in git anyway (files like secrets.yaml and Home Assistant internal storage should be avoided), so I’m not entirely sure how this app is intended to be used.

First, create a new git repo on whatever hosting platform you prefer. Technically this step is optional as you can completely manage the git repo on your Home Assistant server locally, but I prefer to have additional tools at my disposal and not always work within a browser. I chose to host my configuration on GitHub.

Next, back in Studio Code Server, you need to generate an SSH key to use to authenticate to GitHub. The reason for using a deploy key instead of some other authentication mechanism such as a Personal Access Token (PAT) is to scope the credentials to this one repo, following the principle of least privilege. To generate the key, run:

ssh-keygen -t ed25519 -C "home-assistant" -f /data/.ssh/id_ed25519

Use an empty passphrase. This will create /data/.ssh/id_ed25519 and /data/.ssh/id_ed25519.pub. Note that these files are generated to /data/.ssh/ instead of the default location of ~/.ssh/. This is because apps run in docker containers, so any changes to the container filesystem would be lost on app restart. However, the Studio Code Server app symlinks /data/.ssh/ (which is persistent) to /root/.ssh/, allowing ssh keys to effectively persist.

Note: Unfortunately, as of writing the Studio Code Server app has a bug where /data/.ssh is mistakenly mapped to /root/.ssh/.ssh instead of the intended location. The workaround is to run cp /root/.ssh/.ssh/* /root/.ssh/ every time the app restarts. I submitted a PR to fix this issue.
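Until that fix ships, the workaround can be applied automatically on each app start. A minimal sketch, where the restore_ssh_keys helper is my own hypothetical naming and not part of the app:

```shell
#!/bin/sh
# Hypothetical workaround helper for the /data/.ssh mapping bug described above.
# Copies keys from the mis-mapped nested directory back to the expected location.
restore_ssh_keys() {
  # $1: the ssh directory (normally /root/.ssh inside the app container)
  if [ -d "$1/.ssh" ]; then
    cp "$1/.ssh/"* "$1/"
  fi
}

restore_ssh_keys /root/.ssh
```

If the nested directory doesn't exist (i.e. the bug is fixed), the function is a no-op.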

Now add the public key to GitHub. Go to your repository’s Settings -> Deploy keys, and click “Add deploy key”. Paste in the public key from running the following command:

cat /data/.ssh/id_ed25519.pub 

Ensure “Allow write access” is checked if you’d like to push changes from Home Assistant to the GitHub repository.

Adding a Deploy Key in GitHub

To ensure you've set up authentication correctly, run the following command:

ssh -T git@github.com

The first time you run it you will be asked to confirm that you want to connect. Once confirmed, you should see a message like "Hi repoName! You've successfully authenticated, but GitHub does not provide shell access."

Now the local git repo needs to be set up. From /config, run:

git init
git branch -m main
git remote add origin [email protected]:yourname/yourrepo.git
git config user.name "Your Name"
git config user.email "[email protected]"

Replace the placeholders as appropriate. Note that the git config commands intentionally do not use --global, so the settings persist in the repo's .git/config rather than in the container's home directory, which doesn't survive container restarts.

I would not recommend committing the entirety of your config directory to GitHub, so you will want to add a .gitignore file:

# Home Assistant
.HA_VERSION
.ha_run.lock
.storage/
.cloud/
tts/

# Databases
*.db
*.db-shm
*.db-wal
pyozw.sqlite

# Logs
*.log
*.log.fault
OZW_Log.txt

# Python
__pycache__/
deps/

# Node
node_modules/

# macOS
.DS_Store
._*

# NFS
.nfs*

# Trash
.Trash-0/

# Secrets and credentials
secrets.yaml

# HACS
custom_components/
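One way to verify the ignore rules is with git status and git check-ignore. A sketch in a scratch repo (the same two commands apply in /config; the touched file names are just examples from the .gitignore above):

```shell
# Demonstration in a scratch repo. "git status --short" previews what git
# would track; "git check-ignore -v" shows which .gitignore rule matches.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'secrets.yaml\n.storage/\n' > .gitignore
touch secrets.yaml configuration.yaml
git status --short               # configuration.yaml shows up; secrets.yaml does not
git check-ignore -v secrets.yaml # prints the matching rule and its line number
```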

Double-check that only the files you want tracked are tracked and adjust the .gitignore as needed. Then commit and push the changes:

git add .
git commit -m "Initial commit"  
git push -u origin main

From there, just commit, push, and pull as you would any other git repo.

Migrating Dashboards to yaml (optional)

If you’d like to get the same benefits of version control for your dashboards, you can migrate them to yaml as well. However, using yaml-based dashboards does impose some limitations. Notably, you can no longer edit dashboards through the UI. Because of this, migrating dashboards to yaml may not be for everyone.

If you would like to migrate your dashboards, add the following to your configuration.yaml:

lovelace:
  # Use resource_mode to load resources from YAML
  resource_mode: yaml
  dashboards:
    dashboard-overview:
      mode: yaml
      filename: dashboards/overview.yaml
      title: Overview
      icon: mdi:view-dashboard
      show_in_sidebar: true

Note that the dashboard keys must contain a - for some reason. I only have one dashboard, but if you have multiple, just add more siblings to dashboard-overview. To get the yaml for individual dashboards which currently don't use yaml, edit them in the UI, and then under the "…" menu there should be an option for the Raw configuration editor. Copy and paste that yaml into a yaml file. I put mine under dashboards/ as in the example above.

Refer to the Home Assistant docs for a full list of configuration values.

How to rip Blu-rays and watch on Jellyfin
2026-01-21 · https://dfederm.com/how-to-rip-blurays-and-watch-on-jellyfin

Previously I wrote about building a NAS and Media Server for under $500, which explains how to set up a NAS and configure a Jellyfin server (although I've since moved away from the TrueNAS "App" in favor of a docker container). This post is a complete guide for populating your media library by ripping your Blu-ray discs onto your NAS so that you can watch them on Jellyfin.

Why?

The first question I usually get is: why? Why rip your Blu-ray discs in the first place? Why not just stream? I have two primary reasons: ownership and quality.

For my point on ownership: if you've heard the phrase "You'll own nothing and be happy", you know what I mean. When you stream, you don't own any of the movies or shows. The streaming platform may choose at its discretion to remove anything it wants, or, more likely, its licensing contracts expire. Owning physical media means that you, well, own it. No one can take it away from you and you can watch it whenever you want. There are some fringe benefits as well, such as avoiding data usage (for those with data caps), avoiding buffering, and being able to watch media when you don't have internet, whether due to an outage or, say, on a road trip, although the latter requires some extra steps. I would also argue that this saves cost over time by not paying for streaming subscriptions, and, at the risk of sounding like a privacy nut, it also shields you from tracking of the media you consume.

My personal library that I’ve built up over the last 2 years has over 400 movies and over 1000 episodes of TV. At that size, there is plenty of variety and I never find myself without something to watch. In fact, I’ve only actually watched around one-third of the movies I currently have. I have a lot to catch up on!

My Jellyfin Library stats

In terms of quality, Blu-ray discs, and especially UHD Blu-rays, are much higher quality. It mostly comes down to bitrate: streaming services try to keep costs down by compressing the video and/or using a lower bitrate to avoid having to send tens of gigabytes over the internet just for one user watching one movie. For reference, most Blu-ray movies are around 40 GB and most UHD Blu-ray movies are around 70 GB. It's totally infeasible to stream that much data, so there clearly must be compromises on quality. Additionally, if you have a surround sound setup for your home theater, the audio quality is significantly better with physical media.
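As a back-of-the-envelope comparison (the ~135 minute runtime is an assumed example, not from a specific disc), a 40 GB Blu-ray works out to an average bitrate of roughly:

```shell
# 40 GB ~= 320,000 megabits; 135 minutes = 8,100 seconds.
# Average bitrate in Mbit/s (integer division):
echo $(( 320000 / 8100 ))
# prints 39
```

Around 39 Mbit/s for a single title, which is noticeably higher than typical streaming bitrates.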

Where to get Blu-rays

I find the best place to get cheap Blu-rays is from thrift stores like Goodwill or Value Village. I’ve had even better luck at local thrift stores. I can typically get Blu-rays for around $3 each, but I do find that pricing varies wildly from store to store so your mileage may vary. Obviously selection is another problem, but in my opinion the hunt is part of the fun. My wife and I seek out nearby thrift stores whenever we’re in a different area of town, and we check back periodically at places we’ve already been to see if there are some new gems.

Anecdotally, the stars aligned one day for me at a local thrift store which had a half-off sale combined with a seemingly huge drop of Blu-rays they got, so I walked out with 30 Blu-rays for around $70. One was even a 4k Blu-ray! I’m still chasing that high to this day.

Thrifting Haul

One thing to watch out for when thrifting though is that you will want to check the condition of the disc before purchasing. Sometimes there will be smudges which can simply be cleaned, but other times the disc may be quite scratched or even missing entirely.

If you’re not into thrifting or you really want a specific piece of media you haven’t been able to find elsewhere, Blu-ray.com is a good source, as well as Amazon of course. Paying retail prices can be justified if you don’t have streaming services. Just put the money you’d typically spend on subscriptions towards buying physical media instead, and over time you can build up quite the library.


Optical Drive

To rip Blu-ray discs you need an optical drive for your computer. For regular Blu-ray discs, or DVDs for that matter, any old optical drive that can read those discs will do. However, for UHD Blu-ray AKA 4k Blu-ray you need special firmware flashed to the drive. While there is a detailed flashing guide, personally I recommend buying a pre-flashed drive for the peace of mind and convenience. I got a Pioneer BDR-212 from Billycar11 and can attest to his legitimacy, professionalism, and promptness.

MakeMKV

The best software to use for ripping is MakeMKV. It extracts the video files from the disc and remuxes them into mkv files. This file format is widely supported and because it’s a remux as opposed to re-encoding, the video files are the exact same quality as on the disc.

Once you have a few rips under your belt, I highly recommend buying MakeMKV. It’s not strictly necessary, but it’s a lifetime license as opposed to a subscription, and I think it’s important to support the software you use and enjoy to ensure that it continues to receive updates and support.

Basics of Ripping

First you need to open the disc in MakeMKV and select the video files you want. For some background, Blu-ray discs contain smaller video segments and then playlists of those segments which represent one semantic video file. This allows for deduplication of content for video files with identical parts. For example, a disc containing both theatrical and extended versions doesn’t need two complete copies of the film, which wouldn’t fit at full quality anyway, because most of the content is shared and only certain scenes differ between versions.

The reason why this matters when ripping is that many of the files you can select from in MakeMKV are irrelevant or not useful, so you should only select what you want. Some people may want just the movie, but personally I also rip the special features. This does make selection slightly more involved, but I’ll go into that more in a later section.

Selecting video files to rip in MakeMKV

In some cases studios try to deter ripping by creating several, or sometimes even hundreds of, playlists with different segment maps. The bad playlists produce an incomplete and/or out-of-order video. Sometimes you can figure out the correct one by comparing the movie's runtime to the playlist duration, but most of the time it's easier to just do an online search, as the community has usually already figured out the correct one. A search for the movie name with "MakeMKV" at the end is usually sufficient to find what you need. Anecdotally, I've also noticed that for discs with only a couple of options (presumably not intentional playlist spam), if one option is 00800.mpls and the other is similar to 00801.mpls, the 800 one is the correct one.

If you only want the movie, then selecting the correct single file is all you need to do and then you can start the rip! A short while later you will have the video file(s) ready to copy to your NAS. A Blu-ray takes around 45 minutes to rip, while a 4k Blu-ray can take up to an hour and a half, but these times can vary based on drive speed.

Progress indicator while MakeMKV is ripping

Extras and Special Features

If you’re like me and want to also rip and organize the special features, then selection becomes a bit more involved, and you’ll need to identify what each video file actually is.

When selecting what to rip, there are many useless video files on the disc that you will want to filter out: blank segments, videos related to the Blu-ray menus, piracy warnings, and production logos. However, because you can't preview the videos during selection, usually I just filter out very short videos (less than 10 seconds) and alternate versions of the movie, and sort the rest out after the rip. MakeMKV has a setting which can help filter out short videos, and you may even want to set it as high as a minute.

Once the rip is complete and you have a superset of the video files you want, you can use a combination of two approaches to identify what the various files are. The first step is to simply play the video on your computer with VLC or another player that supports MKV files. Many of the junk files will be immediately obvious and can be deleted.

Once most of the junk is filtered out, you can use DVDCompare to help identify the special features. You can search for a movie and then it will show the special feature titles and lengths which you can then match with the durations on the video files you have and rename your files to match the name of the special feature. It’s not always perfect and may take some manual scrubbing of the videos to figure out which is which, but with some experience it shouldn’t take more than a few minutes to sort through.

Re-encoding

I prefer to store the full quality videos, however that can take up quite a bit of storage. My current library of around 400 movies takes about 15 TB. One option to save on storage costs would be to re-encode the videos into another format, e.g. HEVC, using a tool like HandBrake.

As I prefer the highest possible quality, I do not re-encode and so I won’t go into more detail for that.

Library Organization

I have a root media share on my NAS with 2 subdirectories, Movies and TV. I have these mapped into the Jellyfin docker container as separate folders, /movies and /shows, and each is added as its own library in Jellyfin with the associated content type.

Adding a library to Jellyfin

Jellyfin does have good documentation for file organization within a library, but here’s the gist.

For movies, I have a folder for each with the name of the movie and the year of its release, e.g. "Willy Wonka and the Chocolate Factory (1971)". This naming is human-readable while still being enough information for Jellyfin to find the metadata for it. Inside the folder I have the video file for the movie itself with the same name as the folder. If I have multiple versions of a movie, for example a 1080p version and a 2160p (4k) version, I'll add a suffix to the name, for example "Willy Wonka and the Chocolate Factory (1971) - 2160p.mkv". I then put the special features in subdirectories which correspond to their type. I find classifying special features to be a judgment call in many cases, and ultimately whatever organization works for you is best.

A complete example of how I organize a title is as follows:

movies/
└── Willy Wonka and the Chocolate Factory (1971)/
    ├── Willy Wonka and the Chocolate Factory (1971) - 1080p.mkv
    ├── Willy Wonka and the Chocolate Factory (1971) - 2160p.mkv
    ├── behind the scenes/
    │   ├── Pure Imagination - The Story of Willy Wonka & the Chocolate Factory.mkv
    │   └── Tasty Vintage.mkv
    ├── extras/
    │   └── 4 Scrumptious sing-along songs.mkv
    └── trailers/
        └── Theatrical Trailer.mkv

For TV Shows, I have a folder for each show with the show name, then season subfolders containing episodes named with the standard ShowName_S##E##_EpisodeName format. Show-level extras go in an extras folder at the show level, while season-specific extras can go in an extras folder within each season folder.

shows/
└── Batman - The Animated Series/
    ├── extras/
    │   ├── Arkham Asylum.mkv
    │   ├── Batman - The Legacy Continues.mkv
    │   ├── Concepting Harley Quinn.mkv
    │   ├── The Heart of Batman.mkv
    │   └── ...
    ├── Season 1/
    │   ├── Batman - The Animated Series_S01E01_On Leather Wings.mkv
    │   ├── Batman - The Animated Series_S01E02_Christmas with the Joker.mkv
    │   ├── Batman - The Animated Series_S01E03_Nothing to Fear.mkv
    │   ├── ...
    │   └── extras/
    │       ├── A Conversation with the Director - On Leather Wings.mkv
    │       ├── A Conversation with the Director - Christmas with the Joker.mkv
    │       └── ...
    ├── Season 2/
    │   ├── Batman - The Animated Series_S02E01_Sideshow.mkv
    │   ├── Batman - The Animated Series_S02E02_A Bullet for Bullock.mkv
    │   └── ...
    └── Season 3/
        ├── Batman - The Animated Series_S03E01_Holiday Knights.mkv
        ├── Batman - The Animated Series_S03E02_Sins of the Father.mkv
        └── ...
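The episode naming pattern above can be scripted when sorting freshly ripped files. A minimal sketch, where the episode_name helper is my own hypothetical naming, not a Jellyfin tool:

```shell
# Hypothetical helper: build the ShowName_S##E##_EpisodeName filename Jellyfin expects.
episode_name() {
  # $1: show name, $2: season number, $3: episode number, $4: episode title
  printf '%s_S%02dE%02d_%s.mkv\n' "$1" "$2" "$3" "$4"
}

episode_name "Batman - The Animated Series" 1 1 "On Leather Wings"
# prints: Batman - The Animated Series_S01E01_On Leather Wings.mkv
```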

Parental Controls

I have small children and not all content in my media library is appropriate for them. No one wants their kids getting nightmares from accidentally stumbling across Saving Private Ryan.

Luckily it’s very easy to set up parental controls in Jellyfin. When editing the settings for their user, there is a Parental Controls tab which allows you to select from different sets of ratings.

Setting parental controls

You can also fine-tune by overriding the ratings for specific content. For example, the PG-13 rating did not exist until the 1980s, so some movies were rated PG at the time but would be rated PG-13 now. To do this, navigate to the movie, click the three dots, and select "edit metadata". You can then set a custom rating.

Editing Media Metadata

You can even take this to the extreme and use the “Approved” custom rating to manually curate every piece of media you want to make available. Personally I find that approach to be overkill though and trust the MPA for the most part, as well as just generally supervise my kids when they’re watching TV.

Ethics

Skip or ignore this section if you don't care about or disagree with my opinion on this subject. It is, after all, just my opinion.

I am not a lawyer, so I cannot comment on the legality of ripping Blu-ray discs where you live. Even in terms of ethics, I do not claim to be the arbiter of what is right or wrong, nor can I judge others for their choices or what they believe to be ethical. I can, however, explain how I personally operate within my own ethics.

I have two primary rules for myself. First, I am fine with making a digital copy of physical media that I own for personal use within my own home. I make sure that I do in fact own a disc for every movie I have on my NAS. I also choose not to make my Jellyfin server accessible to anyone outside my household; only myself and my immediate family members access these copies. These rules meet my personal bar for what is ethical, but again it is a personal choice and I am not one to judge others for having a different opinion.

Conclusion

That’s it! With this setup, you’ll have your own personal streaming service with content you truly own, at quality that rivals or exceeds any streaming platform. Happy ripping!

Migrating pictures from OneDrive to Immich on TrueNAS Scale
2024-07-17 · https://dfederm.com/migrating-pictures-from-onedrive-to-immich-on-truenas-scale

As I continue on my self-hosting journey, I decided to migrate my photos and videos off of OneDrive to ensure they are stored safely, privately, and securely on my own server. After exploring various solutions, I chose Immich for its extensive features and, perhaps more importantly, its active development community.

Installing Immich

TrueNAS Scale supports installing Apps, so the first step is to install the Immich app. Immich has full instructions, but I'll go over the specific configuration I used. In TrueNAS Scale, go to the Apps tab, click "Discover Apps", search for "Immich", and click "Install".

Searching for the Immich app

For the configuration, I mostly used the defaults, including for the "Immich Library Storage" section. Because I'm migrating my existing photos from OneDrive and using the NAS storage as the "source of truth", I didn't plan on storing anything directly in Immich, to avoid it reorganizing the files itself. Instead, Immich has a notion of an "External Library", which I'll discuss more later once Immich is installed. To set that up, a folder on the NAS holding all the photos needs to be shared with Immich. This is done in the "Additional Storage" section, and I mounted the host path /mnt/Default/federshare/David/Pictures (the path on the NAS) to /pictures inside the app.

Configuring Additional Storage for the Immich app

Finish up the app install and once running, you can navigate to the Immich Web Portal and go through the configuration process to create an admin user.

Importing Pictures

To import your pictures, you will create an External Library. Click "Administration" in the top right, then in the "External Libraries" tab, click "Create Library" and select an owner. The owner is the Immich account the pictures will be associated with.

Creating an external library

Then click the "…" and rename the External Library to whatever you desire; I chose "David's Pictures (NAS)". Then click the "…" again, click "Edit Import Paths", then "App path", type "/pictures" (or whatever mount path you used earlier), and save.

Adding an import path to the external library

You can then click "Scan All Libraries" to import the pictures from that folder on demand at any time. To make this happen on a schedule, go to the "Settings" tab, expand the "External Library" section, and configure the "Library Watching" and "Periodic Scanning" values. Personally, I chose to enable Library Watching and scan the library every hour. Anecdotally, I found the Library Watching feature to not work very well, so if you want a picture to appear immediately and don't want to wait for the next scheduled scan, you'll want to manually scan the external library.

Configuring Library watching

NOTE! Deleting a picture in Immich (and emptying the trash) deletes it from the NAS! In other words, the External Library syncs both ways.

At this point you can manually copy all your pictures from OneDrive to your NAS. After scanning the External Library (manually or otherwise), your photos should be viewable in Immich!

Imported pictures

If you back up your NAS to OneDrive, the pictures will be copied back to OneDrive (ironic!), so you can safely use the NAS as the source of truth going forward. There are Android and iOS mobile apps for Immich, which you should install at this point as well.

Syncing Pictures from Android

Speaking of the NAS being the source of truth and mobile apps, the final step to complete the process is to sync your phone’s pictures to the NAS instead of OneDrive. There are many ways to do this, but as an Android user I chose to use FolderSync.

First, add an Account to sync to; in this case you'll find SMB at the bottom of the list. Configure it with your SMB share name, SMB credentials, and the IP of your NAS. TrueNAS Scale supports SMB3, so be sure to select that for the fastest syncing.

Adding an SMB share to FolderSync

Next, add one or more "folder pairs". These are a pair of folders to sync. In this case you'll configure the left to be your phone's camera storage (e.g. /storage/emulated/0/DCIM/Camera) and the right to be the SMB share under the path where you want to place the pictures, in particular the same directory or a subdirectory of what you mounted in the Immich app (/David/Pictures/Camera in my case). I configured the sync to be "to right folder", which means it will sync only one way, from the phone to the NAS.

Adding a folder pair to FolderSync

There are additional configurations to play around with for a folder pair. For instance, I added a regular nightly sync at midnight, and also configured it to monitor the device folder so that changes sync immediately. Note that instant syncing fails when you're not on your local network, which is why I use a scheduled sync in addition.

I was also hitting some file conflicts, very likely due to my own bad file management originally on OneDrive, so I ended up configuring the left file to always "win" in the case of a conflict.

Looking Forward

At this point, the migration was complete as all my primary requirements were met. However, Immich has many features which make the experience after migration even better than it was before. For example, Facial Recognition analyzes the photos to identify faces and associate them with distinct people. This all happens locally, which is a relief, as allowing a remote third party to analyze pictures of my family, including my young children, is not something I'm terribly comfortable with. Relatedly, the Smart Search feature allows for searching your pictures without having to tag them explicitly; for example, I can search for "<wife's name> holding <son's name>" and it does a reasonably good job at finding relevant photos. It's certainly not perfect, but it's impressive for running locally and should improve over time.

Lastly, I really enjoy the ability to create links to share specific photos externally. This does require some extra work though to expose Immich externally, the details of which I won’t go into here, but it makes sharing full-quality pictures with family who I still primarily talk to over SMS way easier.

Backing up TrueNAS Scale to OneDrive
2024-03-20 · https://dfederm.com/backing-up-truenas-scale-to-onedrive

Recently, OneDrive was removed as a CloudSync provider for TrueNAS Scale. As I built my first NAS and use OneDrive for cloud storage, I went looking for an alternate means of backing up my NAS to OneDrive. I found individual pieces of possible solutions on the TrueNAS forums, but nothing approaching an end-to-end solution, so I decided to do a write-up of what I ended up doing in hopes others may find it helpful as well.

TrueNAS Scale allows custom docker containers, which they call "custom apps", so the overall idea is to run rclone in a Docker container. I like this solution because it's decoupled from anything specific to TrueNAS, so it's very generic and easy to support, and there's no "magic" involved. It's very straightforward and understandable.

The first step is to create a new dataset which will contain your rclone configuration file. I named mine “rclone” in my root “Default” dataset. I used the SMB share type, since that’s what I plan on using, but left the rest of the settings as default.

Next you’ll need to configure the SMB share for the dataset so that you can manage the config file from other machines. For mine, I just added the SMB share to the /mnt/Default/rclone path and used the default settings. When creating a new share it’ll ask to restart the SMB service.

Connect to the new SMB share and create a single file inside called rclone.conf. This file should be in INI format and look like this:

[onedrivedavid]
type = onedrive
drive_type = personal
drive_id = <your-drive-id>
token = <your-token>

All configuration options can be found in the rclone docs for OneDrive, but this boilerplate should be enough for most people; you will just need to fill in the two placeholders.

The section header is the name of the remote, so I used “onedrivedavid” since I plan to back up my wife’s data on the NAS to her OneDrive separately and wanted to disambiguate.

For drive_id, I found the easiest way is to use the Microsoft Graph Explorer. There you’ll log in (by default you’ll see mock data), and execute the query https://graph.microsoft.com/v1.0/me/drive. The first time you do this you’ll see an error that says Unauthorized - 401. You can easily grant access to Graph Explorer by clicking the “Modify permissions” tab and consenting to Files.Read.

Consenting to Graph Explorer permissions

Run the query again and you should see the JSON response in the bottom pane. Use the id field of the response as your drive_id. You can also confirm that your drive_type is “personal” from the same response.

Graph Explorer response

For the token, you can follow the rclone instructions, but basically you just download the rclone executable from the website and run rclone authorize onedrive. This will pop up a browser window for you to authenticate in, and once completed it will spit out JSON content which you copy and paste in its entirety into rclone.conf. The value should be of the form: {"access_token":"...}.

Save your rclone.conf file and it’s time to create the docker container or “custom app”. Go to the Apps tab, click “Discover Apps” and then “Custom App”. I named mine “rclone-david” since again I wanted to disambiguate with another user’s rclone backups.

I found robinostlund/docker-rclone-sync on GitHub, which performs an rclone sync command on a schedule, which is exactly the scenario I'm targeting, so for the Image repository use ghcr.io/robinostlund/docker-rclone-sync.

As per the docs for that image, a few environment variables need to be set to configure it. Under the “Container Environment Variables” section, add the following environment variables:

  • SYNC_SRC=/rclone-data - This can be any path, as long as it matches what you use below in the Storage section.
  • SYNC_DEST=onedrivedavid:/nas-backup - The left-hand side of the value needs to match the section header in the ini file, while the right-hand side is a path within OneDrive you'd like to back up to.
  • CRON=0 0 * * * - Schedules the sync daily at midnight.
  • CRON_ABORT=0 6 * * * - Schedules an abort in case the sync is taking too long.
  • FORCE_SYNC=1 - Syncs on container startup, which makes for easier testing.
  • SYNC_OPTS=-v --create-empty-src-dirs --metadata - Additional options to pass to rclone sync. These are the options I prefer, but all options can be found in the rclone docs.

Under the “Networking” section, add an interface so it can reach out to OneDrive properly.

Under the “Storage” section, add:

  1. Config
    • Host path: /mnt/Default/rclone, or whatever yours is configured to be.
    • Mount path: /config, which is what the image expects.
    • Read Only: unchecked. rclone will write to the file, in particular to update the access token as it refreshes it.
  2. Data
    • Host path: Whatever path on your NAS you’d like to back up
    • Mount path: /rclone-data, or whatever you chose for SYNC_SRC above.
    • Read Only: checked. rclone will only need to sync from the NAS, so it only needs read permission to the data.

Leave everything else as the defaults and click Install. Now you’ll need to wait for the container to deploy, which may take a few moments.

Docker Container Deployment

Once the container is deployed, you can click on it and under “Workloads” there should be an icon to click on to show the logs for the container. You can use this to ensure the sync is happening properly.

Docker Container Logs

And that’s all there is to it! You can now have the benefits of storing your data locally in your NAS, while having the peace of mind of a remote backup.

]]>
Building a NAS and Media Server for under $5002024-01-06T00:00:00+00:002024-01-06T00:00:00+00:00https://dfederm.com/building-a-nas-and-media-server-for-under-500Lately I’ve been realizing that purchased digital media isn’t really yours, and a recent event in particular sparked me into doing something I’ve been wanting to do for a while now: build a NAS to contain all my legally purchased digital media, digital backups of physical media, as well as personal documents and photos.

First, why build a NAS rather than buy a prebuilt one like the Synology DS1522+? There are pros and cons to each approach, but the DIY was attractive to me due to the better expansion, cost effectiveness, and subjectively I just have fun building PCs.

As this is a long post, here is a table of contents if you’d like to skip to a specific section:

Parts

The first part of building your own NAS is purchasing the various components needed to build it. This is very similar to building any other PC, and this guide assumes you’re reasonably comfortable with building a PC. If this is your first time building a PC, I highly recommend checking out Linus Tech Tips’ How to build a PC, the last guide you’ll ever need!.

Note that I built my NAS for just over $400, although I did not include the price of tax since that very much depends on your location, nor did I include the cost of storage since that depends on your specific needs. However, that is still leaps and bounds more economical than the prebuilt I mentioned above, which at the time of writing is $700, not to mention much more powerful and flexible.

As prices and availability are always fluctuating, obviously any prices you see here may be different than what I saw, so feel free to change things up.

When selecting parts, I strongly recommend using PCPartPicker as it will help identify compatibility issues as well as help filter out incompatible parts based on what you’ve already chosen which massively helps with the tremendous amount of options out there.

Case

JONSBO N1

At risk of copying every other NAS build guide out there, I recommend the JONSBO N1. It’s specifically designed for NAS machines, so it’s pretty small. It requires a Mini-ITX motherboard, an SFX power supply, and has room for five 3.5” HDDs as well as one 2.5” drive.

CPU

AMD Ryzen 3 3100

A NAS doesn’t need to be that powerful, even one which doubles as a media server, so I opted to go for something a few years older. I went with an AMD Ryzen 3 3100 which I found used on eBay. Because it’s fairly low-end, it’s also pretty power-efficient, which is great for a NAS which is intended to be powered 24/7. Do make sure that if you buy it used, it comes with the stock cooler so that you don’t have to buy one separately.

One thing to note is that this CPU does not have integrated graphics. This causes a bit of pain during the build, at least for me, but it’s something that can be easily worked past. If you want a smoother experience though, go with something with integrated graphics.

Motherboard

ASRock A520M-ITX/AC

With the Mini-ITX form factor and AM4 CPU socket selected, there weren’t really a ton of options for the motherboard. I ended up going with the ASRock A520M-ITX/AC because well, it was cheap and compatible. It also has 4 SATA ports, which at the moment is good enough for me and eventually I can just buy a cheap HBA for future expansion. The AM4 socket should also allow an upgrade to a better CPU in the future, if that becomes necessary.

The only downside is that it “only” has gigabit ethernet and not the faster 2.5 gigabit, but honestly plain old gigabit is good enough for me. If you need 2.5 gigabit since a NAS is a network attached storage and thus needs only the finest of network connectivity, you can spend a bit more for something like a GIGABYTE B550I AORUS PRO AX or an ASRock B550 Phantom Gaming-ITX/ax.

RAM

G.SKILL Ripjaws V 16GB (2x8 GB)

For RAM I went with G.SKILL Ripjaws V 16GB (2x8 GB). I basically just wanted something cheap and DDR4. I was debating going up to 32GB of RAM, but I found a good deal on eBay and so just went with 16 GB for now. In practice that’s more than enough for my usage at least.

Boot Drive

Samsung 860 EVO 500GB

I was lucky enough to have a Samsung 860 EVO 500GB lying around from when I used to use it in my desktop several years ago. If I were to have bought something though, I would have just gone with the cheapest SSD I could have, like this TEAMGROUP 256GB NVMe or, if you want to save the NVMe slot for an HBA, perhaps this Crucial BX500 240GB SATA SSD.

PSU

Silverstone SX500-G

The NAS is pretty low power, so nothing super special is needed here. Personally though I wanted a fully modular power supply for the ease of use, and the 80+ Gold rating for the efficiency, so that combined with the need for the SFX form factor and I was basically just left with the Silverstone SX500-G. It’s a bit overkill as PCPartPicker estimates that the system only needs ~149W, but this should give plenty of headroom for future expansion. Plus, as I mentioned with all the requirements above I didn’t really have much of an option.

Storage

Seagate IronWolf 4 TB

This will very much be dependent on your needs. I don’t have a massive amount of data to store (yet!) but I did want some resiliency, so I put a pair of Seagate IronWolf 4 TB drives in. This should give me plenty of room to drop in a few more drives in the future as my storage needs increase.

Parts Summary

To summarize my parts list and the prices I was able to get, here’s my build on PCPartPicker as well as a cost breakdown.

Part Name Price
Case JONSBO N1 $120 ($130 with $10 promotional gift card w/ purchase)
CPU AMD Ryzen 3 3100 $45.15 (used)
Motherboard ASRock A520M-ITX/AC $104.99
RAM G.SKILL Ripjaws V 16GB (2x8 GB) $33 (used)
Boot Drive Samsung 860 EVO 500GB $0 (already owned)
PSU Silverstone SX500-G $101.99
Total (without storage) $405.13
Storage 2 x Seagate IronWolf 4 TB 2 x $93.99 = $187.98
Total (including storage) $593.11

Build

All parts unassembled

Now it’s time to build! As mentioned earlier, this won’t be super detailed, but I will go through most the steps and point out specific pain points I ran into with the specific parts I used.

The first thing to do is take the motherboard out and place it onto the box it came in. Socketing the CPU is pretty straightforward; you just undo the latch, line up the triangle on the socket and the CPU, gently drop it in, and reengage the latch.

CPU Install

Next, put a pea-sized blob of thermal paste on and install the cooler. I had some Thermal Grizzly Kryonaut Thermal Paste left over from when I built my desktop, so I just used that, but I imagine anything will do.

A tip when installing the cooler, or really anything with 4 or more screws, is to use a star pattern when tightening. Also tighten in two passes such that the first pass is a gentle tightening and the second pass is where you tighten to the final torque. That way any alignment issues can still be corrected before the screws are too tight.

Cooler Install

Install the RAM next, which is as simple as just aligning the sticks the right way since they’re keyed and firmly pressing them in until you hear the click.

Looking back, you should actually wait to install the RAM until you mount the motherboard in the case and connect the power switch and reset switch wires. I found it extremely difficult to do so while the RAM was installed, but perhaps I just don’t have the nimblest fingers.

RAM Install

Next bring out the case and take off the outer shell. I had never worked with the Mini-ITX form factor previously, so I don’t have much reference to go on, but I found that the JONSBO N1 was relatively easy to work in. Don’t get me wrong, Mini-ITX was certainly challenging, but I feel like the case wasn’t the problem.

Case Case opened

Mount the motherboard on the preinstalled standoffs using the same two-pass star pattern described earlier. I forgot the I/O shield as you’ll notice from the picture below and had to correct my mistake later. I noticed with the I/O shield in place though that the motherboard was a bit difficult to get perfectly aligned on the standoffs, so using the screw-tightening strategy described really helped ensure things went smoothly anyway.

Motherboard Install

Next install the power supply. This, and later when you start plugging in cables (which I held off on for a bit), is where the benefits of a fully modular power supply become apparent. I have to imagine a non-modular unit would be extremely challenging to work with in a Mini-ITX form factor.

PSU Install

Installing the HDDs is an interesting part of working in the JONSBO N1. It has a hot-swappable backplane for the SATA drives which is pretty neat. It was certainly a bit nerve-wracking to shove the drives into the bays and hope the connectors aligned properly, but after installing them, it seemed like a really elegant mechanism I ended up liking a lot.

If you’re using a SATA SSD for your boot drive, you’ll want to install that now as well, which is next to the power supply. I found the mounting solution there to be a bit sketchy, but as it’s an SSD and has no moving parts, it doesn’t really matter.

HDD Install

Now to start the messy work of plugging in cables. This is my least favorite part as I can never seem to have the patience (or skill?) to manage the cables properly and so it ends up being just a rat’s nest. Luckily the case ends up hiding the mess in the end, so just don’t let anyone go opening up the machine and discovering your dirty secret.

Sata Cables

An important note is that you will want to route all cables through the middle of the chassis rather than try and go around the outside. I made that mistake with the CPU power cable and the outer shell of the case ended up not sliding on later. Don’t make my mistakes. Route it through the case the long way, the proper way.

Bad CPU Power Cables

Ok, all the cables are shoved in there now I guess. Time to put the shell of the case back on and try things out!

Cable Mess

Configuration

I decided to use TrueNAS Scale as the NAS OS and Jellyfin for the media server. Both are fully open source and well supported, so great options for the build.

Booting

To get started with TrueNAS Scale, first you must download the iso from their website and flash it to a USB drive using something like balenaEtcher.

Because I didn’t choose a CPU with integrated graphics, I decided to put the boot SSD into my desktop computer and use the newly flashed USB install media there. Just be very careful you install TrueNAS to the correct drive or you may accidentally wipe the wrong one.

After installing the OS and putting the boot SSD back into the NAS machine, I tried booting and… nothing happened. Well, the power turned on and the fan was blowing, but without a display it wasn’t clear what was happening.

I had expected this, but as TrueNAS Scale is supposed to connect to the network via DHCP and be configurable via a web portal, I expected it to show up in my router, but it didn’t.

This is where a CPU with integrated graphics, or a cheap spare GPU to temporarily slot in, would come in handy while you do the initial configuration. I had neither, so I ended up doing some mad science…

Mad Science

Yea, wow. So I took the graphics card out of my desktop, leaving its power cables connected since the cabling in the NAS PSU was very tight. But, a 3070 TI obviously won’t fit in the NAS, so I took the motherboard out but left everything I could attached and in the case. Now when turning on both machines (remember, my desktop PSU was powering the graphics card) I was able to see the video output.

Frustratingly, it ended up being one simple keystroke I needed to confirm something about the fTPM, and then it booted properly into TrueNAS Scale. At this point I probably should have configured the BIOS as well, but my desktop machine kept turning itself off after some time, probably due to no display device being plugged into the motherboard, and I was just ready to move on.

So I put everything back together, put the NAS in its home, powered it on, and success! I was able to see a new device on my network and was able to hit the web portal for TrueNAS Scale.

Configuring TrueNAS Scale

I was able to connect to the web portal via http://truenas.local, but depending on your local network you may need to use the IP address instead, which you can get from your router’s web portal.

Personally I prefer my “infrastructure” devices to have static IP addresses, like my Raspberry Pi running Home Assistant, the Pi running my alarm panel and AdGuard Home instance, and yes the NAS. That way if something gets messed up with DNS or DHCP, I should always be able to access those devices.

To do this in TrueNAS Scale you click on the Network tab and in the Interfaces section you can edit the ethernet interface. You just need to uncheck DHCP and under “Aliases” add the IP and subnet you want.

Configuring a static IP

After this you’ll need to “Test Changes”, which is a convenient feature so you don’t misconfigure anything. It will automatically revert the network configuration if you don’t confirm it after some timeout. So after you make the changes, navigate to the new static IP and confirm the changes. At least for me, using the static IP was required as the host name resolution was stale and still pointing to the old DHCP-based IP.

Next I changed the host name since I wanted to use nas.local instead of truenas.local (admittedly, very minor). Since the host name resolution was stale anyway, I figured why not. To do this, you go back to the Network tab and edit the Global Configuration with the desired host name. Because I have AdGuard Home, I also added that as the Nameserver.

Configuring the host name

Now that that’s out of the way, it’s time to actually set up the storage aspect of the NAS.

First a pool needs to be set up. A pool organizes your physical disks into a virtual device, or VDEV. This is where you configure your desired disk layout, for example your desired RAID settings, or in the case of TrueNAS Scale your RAID-Z settings. RAID-Z is a non-standard RAID which uses the ZFS file system. In my case I only have 2 data disks, so I chose to just use a mirror. When I add more disks I’ll end up converting to a RAIDZ1 (one drive can fail without data loss).

Note that Mirroring or RAID/RAID-Z are not proper substitutes for backups! They’re mostly to avoid inconvenience and downtime. You should always still take proper backups of your important data.

If you have an extra NVMe drive to use as a cache, you will also configure that here. Note though that it’s not recommended unless you have over 64GB of RAM, as the RAM cache (called “ARC”) is faster, and supporting the SSD cache (“L2ARC”) itself requires RAM and thus eats into and reduces the size of the ARC. At least for me, the network is the bottleneck anyway, although that’s exacerbated by the fact that I stuck with a 1 gig interface.

Once the pool is configured, you’ll also need to configure a “dataset”, which is a logical folder within the pool, or perhaps it can be thought of as a volume on a drive. Permissions are applied at the dataset level though, so if you intend to partition your data, this is where you would do so.

Next you’ll want to set up a share so you can transfer data to and from the NAS. In my case I use Windows for my primary desktop machine, so I set up an SMB share. Once you create the share it’ll prompt you to start the SMB service on the NAS, which is the server process which actually handles SMB traffic.

Finally, you’ll need to set up a user to access the SMB share. Go to the Credentials -> Local Users tab and add a new user. You’ll want to set up additional users for any family members who you want to access the SMB share directly. Note that later when configuring Jellyfin there will be separate user accounts to access the Jellyfin server, so if for example you only want your kids to consume media from the NAS but not directly access the data, you wouldn’t want to set up a user in TrueNAS Scale for them.

Now you should be able to access the share via \\nas.local\<share-name> from your Windows PC.

I recommend mapping the share as a network drive to avoid needing to re-enter credentials:

Map network drive

This allows you to see it as if it were a drive on your machine, in my case Z:.

Mapped network drive

At this point you can copy all your data!

Configuring Jellyfin

TrueNAS Scale supports installing “Apps”, which are effectively just docker containers. One such supported app is the Jellyfin app, a media server.

First go to the Apps tab and find the Settings drop down to choose a pool to use for the applications. It will create a dataset inside the selected pool called “ix-applications” to store the application data. TrueNAS recommends using an SSD pool if possible, but in my case I only have 1 pool, the HDD pool, so I just used it.

Now that an application pool is selected, you can install the Jellyfin app. Click “Discover Apps” and search for and install Jellyfin.

You’ll mostly just use the default settings, but there is one key piece you need to configure: giving the Jellyfin app access to your data.

Under the Storage Configuration you should see “Additional Storage”. Click “Add”, and use Type: Host Path. For the Mount Path, use whatever path you want to be visible on the Jellyfin side, eg /movies. For the Host Path, select the path to the dataset with your movies, eg /mnt/Default/federshare/Media/Movies. Repeat this process for TV Shows, for example /shows as the mount path and /mnt/Default/federshare/Media/TV as the host path.

Jellyfin storage configuration

It’ll take a minute or two for the Jellyfin app to install and start, but once it’s done you can click the “Web Portal” button which will take you to the Jellyfin web portal where you can configure Jellyfin. Here you’ll need to configure Jellyfin user names and libraries.

The users you configure here are how people log into a Jellyfin client application to watch media, so these are likely the accounts you will need to set up for your family members. I set up separate accounts for each of my family members so that I could apply parental controls.

I did run into a permissions quirk where the Jellyfin app didn’t have permissions to the /movies and /shows mount paths I configured, possibly because of the SMB share, but I’m not certain of the reason. I ended up needing to go to the dataset and editing the permissions and granting the everyone@ group read permissions.

TrueNAS ACL

Another quirk I ran into is that my subtitles were not named in a way which Jellyfin was able to automatically pick up. They were named <Movie Title>-en-us_cc.srt where Jellyfin requires <Movie Title>.en-us_cc.srt. This was fixed easily enough with PowerRename.

PowerRename
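If you’d rather script the rename than use PowerRename, here’s a quick C# sketch under the same assumptions (the root path is hypothetical; adjust it to your own share) that converts the -en-us_cc.srt suffix into the .en-us_cc.srt form Jellyfin expects:

```csharp
using System;
using System.IO;

// Hypothetical path to the mapped media share; change to match your setup.
string root = @"Z:\Media\Movies";

const string oldSuffix = "-en-us_cc.srt";
const string newSuffix = ".en-us_cc.srt";

foreach (string path in Directory.EnumerateFiles(root, "*" + oldSuffix, SearchOption.AllDirectories))
{
    // Swap the trailing "-en-us_cc.srt" for ".en-us_cc.srt".
    string newPath = path.Substring(0, path.Length - oldSuffix.Length) + newSuffix;
    Console.WriteLine($"{path} -> {newPath}");
    File.Move(path, newPath);
}
```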

Now you’re ready to install the Jellyfin client on your various devices and enjoy your own personal local media streaming service!

]]>
Limited Parallelism Work Queue2023-12-03T00:00:00+00:002023-12-03T00:00:00+00:00https://dfederm.com/limited-parallelism-work-queueIn the realm of asynchronous operations within C# applications, maintaining optimal performance often requires a delicate balance between execution and resource allocation. The need to prevent CPU oversubscription while managing numerous concurrent tasks is a common challenge faced by developers.

This blog post delves into a crucial strategy to navigate this challenge: the implementation of a limited parallelism work queue. Rather than allowing unchecked parallelism that might overwhelm system resources, employing a limited parallelism work queue offers a systematic approach to manage asynchronous tasks effectively.

The heart of implementing a work queue is the producer/consumer model. This can be well-represented by Channels. If you’re not familiar with channels, Stephen Toub has a great introduction. Essentially a channel stores data from one or more producers to be consumed by one or more consumers.
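To make the model concrete before building the queue, here’s a minimal standalone sketch of one producer and one consumer sharing a channel (not part of the work queue itself):

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

Channel<int> channel = Channel.CreateUnbounded<int>();

Task producer = Task.Run(async () =>
{
    for (int i = 0; i < 3; i++)
    {
        await channel.Writer.WriteAsync(i);
    }

    // Completing the writer lets the consumer's ReadAllAsync finish once drained.
    channel.Writer.Complete();
});

Task consumer = Task.Run(async () =>
{
    await foreach (int item in channel.Reader.ReadAllAsync())
    {
        Console.WriteLine(item);
    }
});

await Task.WhenAll(producer, consumer);
```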

In our case, the producers will be the components which enqueue work and the consumers will be the workers we spin up to process the work.

If desired, you can skip the explanation and go straight to the Gist.

Let’s start with creating the channel and the worker tasks. For now, we don’t know exactly what kind of data we need to store, so we’re just using object.

public sealed class WorkQueue
{
    private readonly Channel<object> _channel;
    private readonly Task[] _workerTasks;

    public WorkQueue(int parallelism)
    {
        _channel = Channel.CreateUnbounded<object>();

        // Create a bunch of worker tasks to process the work.
        _workerTasks = new Task[parallelism];
        for (int i = 0; i < _workerTasks.Length; i++)
        {
            _workerTasks[i] = Task.Run(
                async () =>
                {
                    await foreach (object context in _channel.Reader.ReadAllAsync())
                    {
                        // TODO: Process work
                    }
                });
        }
    }
}

This creates the Channel along with multiple worker tasks which continuously try reading from the channel. Channel.Reader.ReadAllAsync will yield until there is data to read, so it’s not blocking any threads.

Now we need the producer side of things. This initial implementation works with Task and not Task<T>, so we know the return type of the method needs to be a Task. The caller needs to provide a factory for actually performing the work, so the parameter can be a Func<Task>. This leads us to the following signature:

    public async Task EnqueueWorkAsync(Func<Task> taskFunc);

As we want to manage the parallelism of the work, we cannot call the Func<Task> to get the Task, as that would start execution of the task. The way to return a Task when we don’t have one is to use TaskCompletionSource. This allows us to return a Task which we can later complete with a result, cancellation, or exception, based on what happens with the provided work.
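As a standalone illustration of that pattern, a TaskCompletionSource lets us hand a Task back immediately and complete it later, once the underlying work has actually run:

```csharp
using System;
using System.Threading.Tasks;

TaskCompletionSource<int> tcs = new();

// The caller gets a Task immediately, before any work has started.
Task<int> task = tcs.Task;

// Sometime later, whoever performs the work completes the Task.
_ = Task.Run(async () =>
{
    await Task.Delay(10);
    tcs.TrySetResult(42);
});

Console.WriteLine(await task); // prints 42
```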

We also know we need to write something to the channel, but we still don’t know what yet, so let’s continue to use object.

    public async Task EnqueueWorkAsync(Func<Task> taskFunc)
    {
        TaskCompletionSource taskCompletionSource = new();
        object context = new();
        await _channel.Writer.WriteAsync(context);
        await taskCompletionSource.Task;
    }

Now that we have the channel reader and writer usages, we can figure out what we actually need to store in the channel. The caller provided a Func<Task> to perform the work, and we need to capture the TaskCompletionSource so we can complete the Task we returned to the caller. So let’s define the context as a simple record struct with those two members:

private readonly record struct WorkContext(Func<Task> TaskFunc, TaskCompletionSource TaskCompletionSource);

The Channel<object> should be updated to use WorkContext instead, and the reader and writer call sites should also be adjusted. We now have the following:

public sealed class WorkQueue
{
    private readonly Channel<WorkContext> _channel;
    private readonly Task[] _workerTasks;

    private readonly record struct WorkContext(Func<Task> TaskFunc, TaskCompletionSource TaskCompletionSource);

    public WorkQueue(int parallelism)
    {
        _channel = Channel.CreateUnbounded<WorkContext>();

        // Create a bunch of worker tasks to process the work.
        _workerTasks = new Task[parallelism];
        for (int i = 0; i < _workerTasks.Length; i++)
        {
            _workerTasks[i] = Task.Run(
                async () =>
                {
                    await foreach (WorkContext context in _channel.Reader.ReadAllAsync())
                    {
                        // TODO: Process work
                    }
                });
        }
    }

    public async Task EnqueueWorkAsync(Func<Task> taskFunc)
    {
        TaskCompletionSource taskCompletionSource = new();
        WorkContext context = new(taskFunc, taskCompletionSource);
        await _channel.Writer.WriteAsync(context);
        await taskCompletionSource.Task;
    }
}

Now we need to actually process the work. This involves executing the provided Func<Task> and handling the result appropriately. We will simply invoke the Func and await the resulting Task. Whether that Task completed successfully, threw an exception, or was cancelled, we should pass the outcome through to the Task we returned to the caller who queued up the work.

    private static async Task ProcessWorkAsync(WorkContext context)
    {
        try
        {
            await context.TaskFunc();
            context.TaskCompletionSource.TrySetResult();
        }
        catch (OperationCanceledException ex)
        {
            context.TaskCompletionSource.TrySetCanceled(ex.CancellationToken);
        }
        catch (Exception ex)
        {
            context.TaskCompletionSource.TrySetException(ex);
        }
    }

Finally, we need to handle shutting down the work queue. This is done by completing the channel and waiting for the worker tasks to drain. Calling Channel.Writer.Complete will disallow additional items from being written and, as a side effect, cause the Channel.Reader.ReadAllAsync enumerable to stop awaiting more results and complete. This in turn allows our worker tasks to complete.

For convenience, we will make WorkQueue : IAsyncDisposable so the WorkQueue can simply be disposed to shut it down.

    public async ValueTask DisposeAsync()
    {
        _channel.Writer.Complete();
        await _channel.Reader.Completion;
        await Task.WhenAll(_workerTasks);
    }

One thing we’ve left out is cancellation, both for executing work to be cancelled when the work queue is shut down, and for allowing the caller enqueueing a work item to cancel that work item.

To address this, a CancellationToken should be provided by the caller enqueueing a work item. Additionally, the WorkQueue itself will need to manage a CancellationTokenSource which it cancels on DisposeAsync. Finally, when a work item is enqueued, the two cancellation tokens need to be linked and provided to the work item so it can properly cancel when either the caller who enqueued the work item cancels, or if the work queue is being shut down entirely. Putting all that together:

public sealed class WorkQueue : IAsyncDisposable
{
    private readonly CancellationTokenSource _cancellationTokenSource;
    private readonly Channel<WorkContext> _channel;
    private readonly Task[] _workerTasks;

    private readonly record struct WorkContext(Func<CancellationToken, Task> TaskFunc, TaskCompletionSource TaskCompletionSource, CancellationToken CancellationToken);

    public WorkQueue()
        : this (Environment.ProcessorCount)
    {
    }

    public WorkQueue(int parallelism)
    {
        _cancellationTokenSource = new CancellationTokenSource();
        _channel = Channel.CreateUnbounded<WorkContext>();

        // Create a bunch of worker tasks to process the work.
        _workerTasks = new Task[parallelism];
        for (int i = 0; i < _workerTasks.Length; i++)
        {
            _workerTasks[i] = Task.Run(
                async () =>
                {
                    // Not passing the cancellation token here as we need to drain the entire channel to ensure we don't leave dangling Tasks.
                    await foreach (WorkContext context in _channel.Reader.ReadAllAsync())
                    {
                        await ProcessWorkAsync(context);
                    }
                });
        }
    }

    public async Task EnqueueWorkAsync(Func<CancellationToken, Task> taskFunc, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        TaskCompletionSource taskCompletionSource = new();
        CancellationToken linkedToken = cancellationToken.CanBeCanceled
            ? CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cancellationTokenSource.Token).Token
            : _cancellationTokenSource.Token;
        WorkContext context = new(taskFunc, taskCompletionSource, linkedToken);
        await _channel.Writer.WriteAsync(context, linkedToken);
        await taskCompletionSource.Task;
    }

    public async ValueTask DisposeAsync()
    {
        await _cancellationTokenSource.CancelAsync();
        _channel.Writer.Complete();
        await _channel.Reader.Completion;
        await Task.WhenAll(_workerTasks);
        _cancellationTokenSource.Dispose();
    }

    private static async Task ProcessWorkAsync(WorkContext context)
    {
        if (context.CancellationToken.IsCancellationRequested)
        {
            context.TaskCompletionSource.TrySetCanceled(context.CancellationToken);
            return;
        }

        try
        {
            await context.TaskFunc(context.CancellationToken);
            context.TaskCompletionSource.TrySetResult();
        }
        catch (OperationCanceledException ex)
        {
            context.TaskCompletionSource.TrySetCanceled(ex.CancellationToken);
        }
        catch (Exception ex)
        {
            context.TaskCompletionSource.TrySetException(ex);
        }
    }
}
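Using the final class might look like the following sketch. This is just hypothetical calling code (the item count and delay are made up) showing that at most `parallelism` work items run concurrently:

```csharp
using System;
using System.Threading.Tasks;

// Limit to 4 concurrent work items; DisposeAsync drains the queue on the way out.
await using WorkQueue queue = new(parallelism: 4);

Task[] pending = new Task[20];
for (int i = 0; i < pending.Length; i++)
{
    int id = i;
    pending[i] = queue.EnqueueWorkAsync(
        async ct =>
        {
            // Simulated work; at most 4 of these run at once.
            await Task.Delay(100, ct);
            Console.WriteLine($"Finished item {id}");
        });
}

await Task.WhenAll(pending);
```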

If the result of the work is required, this approach may be awkward since the provided Func<CancellationToken, Task> would need to have side-effects. For example, something like the following:

string input = // ...
int result = -1;
await queue.EnqueueWorkAsync(async ct => result = await ProcessAsync(input, ct), cancellationToken));

// Do something with the result here
// ...

An alternate approach would be to have the WorkQueue take the processing function and then the EnqueueWorkAsync method could return the result directly. This requires the work queue to process inputs of the same type and in the same way, but can make the calling pattern more elegant:

string input = // ...
int result = await queue.EnqueueWorkAsync(input, cancellationToken);

// Do something with the result here
// ...

The change to the implementation is straightforward. WorkQueue becomes the generic WorkQueue<TInput, TResult> and the Func<CancellationToken, Task> becomes a Func<TInput, CancellationToken, Task<TResult>> and can move from EnqueueWorkAsync to the constructor.

public sealed class WorkQueue<TInput, TResult> : IAsyncDisposable
{
    private readonly Func<TInput, CancellationToken, Task<TResult>> _processFunc;
    private readonly CancellationTokenSource _cancellationTokenSource;
    private readonly Channel<WorkContext> _channel;
    private readonly Task[] _workerTasks;

    private readonly record struct WorkContext(TInput Input, TaskCompletionSource<TResult> TaskCompletionSource, CancellationToken CancellationToken);

    public WorkQueue(Func<TInput, CancellationToken, Task<TResult>> processFunc)
        : this(processFunc, Environment.ProcessorCount)
    {
    }

    public WorkQueue(Func<TInput, CancellationToken, Task<TResult>> processFunc, int parallelism)
    {
        _processFunc = processFunc;
        _cancellationTokenSource = new CancellationTokenSource();
        _channel = Channel.CreateUnbounded<WorkContext>();

        // Create a bunch of worker tasks to process the work.
        _workerTasks = new Task[parallelism];
        for (int i = 0; i < _workerTasks.Length; i++)
        {
            _workerTasks[i] = Task.Run(
                async () =>
                {
                    // Not passing the cancellation token here as we need to drain the entire channel to ensure we don't leave dangling Tasks.
                    await foreach (WorkContext context in _channel.Reader.ReadAllAsync())
                    {
                        await ProcessWorkAsync(context, _cancellationTokenSource.Token);
                    }
                });
        }
    }

    public async Task<TResult> EnqueueWorkAsync(TInput input, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();
        TaskCompletionSource<TResult> taskCompletionSource = new();
        CancellationToken linkedToken = cancellationToken.CanBeCanceled
            ? CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, _cancellationTokenSource.Token).Token
            : _cancellationTokenSource.Token;
        WorkContext context = new(input, taskCompletionSource, linkedToken);
        await _channel.Writer.WriteAsync(context, linkedToken);
        return await taskCompletionSource.Task;
    }

    public async ValueTask DisposeAsync()
    {
        await _cancellationTokenSource.CancelAsync();
        _channel.Writer.Complete();
        await _channel.Reader.Completion;
        await Task.WhenAll(_workerTasks);
        _cancellationTokenSource.Dispose();
    }

    private async Task ProcessWorkAsync(WorkContext context, CancellationToken cancellationToken)
    {
        if (cancellationToken.IsCancellationRequested)
        {
            context.TaskCompletionSource.TrySetCanceled(cancellationToken);
            return;
        }

        try
        {
            TResult result = await _processFunc(context.Input, cancellationToken);
            context.TaskCompletionSource.TrySetResult(result);
        }
        catch (OperationCanceledException ex)
        {
            context.TaskCompletionSource.TrySetCanceled(ex.CancellationToken);
        }
        catch (Exception ex)
        {
            context.TaskCompletionSource.TrySetException(ex);
        }
    }
}

This blog post shows two alternate approaches for implementing a limited parallelism work queue to manage many tasks while avoiding overscheduling. These two implementations are best suited to different usage patterns and can be further customized or optimized for your specific use-cases.

]]>
Scripting Machine Setup2023-05-23T00:00:00+00:002023-05-23T00:00:00+00:00https://dfederm.com/scripting-machine-setupLately, I’ve found myself setting up multiple computers, and with Microsoft DevBox on the horizon, I anticipate working with “fresh” machines more frequently. Like many developers, I thrive in a familiar environment with my preferred tools and settings, as muscle memory kicks in and I can efficiently tackle any task. Unfortunately, the process of setting up a new machine can be quite cumbersome. To address this challenge, I took matters into my own hands and developed a script that streamlines the entire setup process for me.

Note that I previously wrote about a roaming developer console, but it was not as robust as I needed, and a lot has changed since then, for example the release of winget.

You can find my completed script which I use for my personal setup on GitHub. I’d recommend forking it and tuning it for your own personal preferences.

Requirements

A key requirement for this project, especially since I expected to iterate on it quite a bit at first, was to ensure the script was idempotent. The goal was to run and re-run the script multiple times while consistently achieving the desired machine state. This flexibility allowed me to make changes and easily apply them. As a result, I could even schedule the script to automatically incorporate any modifications I had made.

To enhance user-friendliness, I aimed for the script to skip unnecessary actions. Instead of blindly setting a registry key, I designed it to first check if the key was already in the desired state. This approach served two purposes: it provided valuable logging information to indicate what the script actually changed and avoided unnecessary elevation prompts.
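
A sketch of that check-then-set pattern might look like the following (the helper name is mine; the example value is the Explorer file-extension tweak mentioned later):

```powershell
# Illustrative helper: only write the registry value if it differs from the desired state.
function Set-RegistryValue([string] $Path, [string] $Name, $Value) {
    $current = (Get-ItemProperty -Path $Path -Name $Name -ErrorAction SilentlyContinue).$Name
    if ($current -eq $Value) {
        Write-Host "Skipping $Path\$Name (already $Value)"
        return
    }

    if (!(Test-Path $Path)) {
        New-Item -Path $Path -Force > $null
    }

    Set-ItemProperty -Path $Path -Name $Name -Value $Value
    Write-Host "Set $Path\$Name to $Value"
}

# Example: show file extensions in Explorer.
Set-RegistryValue 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Explorer\Advanced' 'HideFileExt' 0
```

The logging makes a re-run of the script read as a diff of what actually changed.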

Furthermore, I prioritized security and implemented a strategy to handle elevation. The script was designed to run unelevated by default, and only specific commands would require elevation if necessary. This adherence to the principle of least privilege improved security measures and mitigated potential issues related to file creation as an administrator. Admittedly, this has debatable value since this script is personal and so should be deemed trustworthy before executing it.

Overall, these considerations, including idempotence, skipping unnecessary actions, and managing elevation, played crucial roles in making the script more robust, user-friendly, and secure.

Defining Machine-Specific Paths

To ensure compatibility with different machine setups, the script begins by defining two crucial paths that might vary based on the drive layout of each machine. For example, on my personal machine I have a separate OS drive and data drive, while on my work machine I have a single drive. Specifically, these paths are the CodeDir and BinDir.

CodeDir represents the root directory for all code, where I typically clone git repositories and store project files. BinDir is the designated location for scripts and standalone tools.

The setup script initiates a prompt to determine the locations of CodeDir and BinDir, assuming they haven’t been defined previously. Once the user provides the necessary input, the script proceeds to set these paths as user-wide environment variables. Additionally, BinDir is added to the user-wide PATH, ensuring convenient access to scripts and tools from anywhere within the system.
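
A minimal sketch of that flow (the prompts and variable names are illustrative; the real script differs):

```powershell
# Illustrative: prompt for and persist CodeDir/BinDir as user-wide environment variables.
if (!$env:CodeDir) {
    $codeDir = Read-Host 'Enter the root directory for code (e.g. D:\Code)'
    [Environment]::SetEnvironmentVariable('CodeDir', $codeDir, 'User')
}

if (!$env:BinDir) {
    $binDir = Read-Host 'Enter the directory for scripts and tools (e.g. D:\Bin)'
    [Environment]::SetEnvironmentVariable('BinDir', $binDir, 'User')

    # Append BinDir to the user-wide PATH if it's not already there.
    $path = [Environment]::GetEnvironmentVariable('Path', 'User')
    if ($path -notlike "*$binDir*") {
        [Environment]::SetEnvironmentVariable('Path', "$path;$binDir", 'User')
    }
}
```

Note that user-scoped changes via SetEnvironmentVariable only show up in newly launched processes, which is fine for a setup script that is re-run across sessions.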

Configuring Windows

Configuring Windows revolves around making modifications to the registry. The setup script encompasses several essential registry tweaks and configuration adjustments, including:

  • Configuring cmd Autorun to run %BinDir%\init.cmd (more on that later)
  • Showing file extensions in Explorer
  • Showing hidden files and directories in Explorer
  • Restoring the classic context menu
  • Disabling Edge tabs showing in Alt+Tab
  • Enabling Developer Mode
  • Enabling Remote Desktop
  • Enabling Long Paths
  • Opting out of Windows Telemetry
  • Excluding CodeDir from Defender

I will certainly be adding more to this list as time goes on.
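
As one concrete example, the cmd Autorun entry is a single registry value under the standard Command Processor key; a hedged sketch:

```powershell
# Point cmd's Autorun at init.cmd so it runs for every new cmd session.
$autorunKey = 'HKCU:\Software\Microsoft\Command Processor'
if (!(Test-Path $autorunKey)) {
    New-Item -Path $autorunKey -Force > $null
}

# REG_EXPAND_SZ so %BinDir% is expanded when cmd reads the value.
Set-ItemProperty -Path $autorunKey -Name 'AutoRun' -Value '%BinDir%\init.cmd' -Type ExpandString
```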

Uninstalling Bloatware

When it comes to debloating scripts and tools, it’s important to strike a balance. I find that many available scripts tend to be overly aggressive, removing applications that might actually be useful or causing unintended harm to the system. In my personal experience, I find it unnecessary to uninstall essential applications like the Edge browser or OneDrive. Additionally, it’s worth noting that Microsoft discourages the use of registry cleaners due to potential malware risks, and honestly orphaned registry keys take up virtually no disk space and don’t slow the system down in any way.

Nevertheless, I do believe there is value in uninstalling a few specific applications that come bundled with Windows. These include:

  • Cortana: Personally I don’t find Cortana useful.
  • Bing Weather: It’s not my preferred method for checking the weather.
  • Get Help: I haven’t found this app useful.
  • Get Started: I haven’t found this app useful.
  • Mixed Reality Portal: I don’t use virtual reality experiences on my desktop computer (or at all for that matter).

Beyond that, a clean install of Windows should be relatively free of bloatware applications.
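
Removing those apps can be sketched with Remove-AppxPackage (the package names below are from memory and worth verifying with Get-AppxPackage first):

```powershell
# Uninstall a conservative list of bundled apps for the current user.
$appsToRemove = @(
    'Microsoft.549981C3F5F10'       # Cortana
    'Microsoft.BingWeather'
    'Microsoft.GetHelp'
    'Microsoft.Getstarted'
    'Microsoft.MixedReality.Portal'
)

foreach ($app in $appsToRemove) {
    # SilentlyContinue keeps the script idempotent when the app is already gone.
    Get-AppxPackage -Name $app -ErrorAction SilentlyContinue | Remove-AppxPackage
}
```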

Installing Applications

Many applications these days can be installed and updated via winget. Winget can easily be scripted to install a list of applications, and for me that list includes:

  • 7zip: A versatile file compression tool.
  • DiffMerge: For file or folder comparisons.
  • Git: For version control
  • HWiNFO: For monitoring CPU/GPU temperatures and clock speeds
  • ILSpy: A decompiler for .NET assemblies.
  • Microsoft Teams: For work
  • MSBuild Structured Log Viewer: A tool for debugging MSBuild.
  • .NET 7 SDK: For developing with .NET
  • Node.js: For developing with JavaScript
  • Notepad++: One of my favored text editors
  • NuGet: Package manager for .NET
  • NuGet Package Explorer: A UI for inspecting NuGet packages
  • PowerShell: Better than Windows PowerShell
  • PowerToys: Various useful utilities
  • Remote Desktop Client: modern version of mstsc
  • Regex Hero: Helpful for working with regular expressions
  • SQL Server Management Studio: For working with SQL databases
  • Sysinternals Suite: Various useful utilities
  • Telegram: Favored communication app
  • Visual Studio Code: One of my favored text/code editors
  • Visual Studio 2022 Enterprise: Code editor
  • Visual Studio 2022 Enterprise Preview: Daily driver code editor
  • WinDirStat: For viewing disk usage
  • Windows Terminal: Better than the stock one
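
A list like this can be scripted over winget as in the following sketch (the package IDs are illustrative and worth verifying with winget search; the exit-code check is a simple idempotency heuristic):

```powershell
# Illustrative: install each package via winget if not already installed.
$packages = @(
    '7zip.7zip'
    'Git.Git'
    'Microsoft.PowerShell'
    'Microsoft.WindowsTerminal'
    'Notepad++.Notepad++'
)

foreach ($package in $packages) {
    # winget list returns a nonzero exit code when the package isn't installed.
    winget list --exact --id $package | Out-Null
    if ($LASTEXITCODE -ne 0) {
        winget install --exact --id $package --silent --accept-package-agreements --accept-source-agreements
    }
}
```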

While most applications can be installed via winget, there are a few exceptions, and in those cases the script takes care of installing them separately. Examples include the Azure Artifacts Credential Provider (for Azure Feeds) and WSL. Note that installing WSL involves enabling some Windows components which require a reboot to fully finish installing.

Configuring Applications

Once the applications are installed, the setup script proceeds to configure them. Some applications are configured via the registry, while others use environment variables, and some even use configuration files. The following configurations are performed by the script:

  • Setting git config and aliases
  • Enable WAM integration for Git: Promptless auth for Azure Repos
  • Force NuGet to use auth dialogs: Avoid device code auth for Azure Feeds in favor of a browser popup window
  • Configure the NuGet cache locations: The defaults are under the user profile but I find a path under the CodeDir to be more appropriate.
  • Opting out of VS Telemetry: Prioritizing privacy
  • Opting out of .NET Telemetry: Prioritizing privacy
  • Copying Windows Terminal settings (more on this later)

Bootstrapping

A keen eye may have noticed that the setup script installs Git, but the script lives on GitHub, so there is a bootstrapping problem. How can we download the script and other assets from GitHub?

Luckily it’s fairly easy to download an entire GitHub repository as a zip file. The following PowerShell will download the zip, extract it, and run it:

$TempDir = "$env:TEMP\MachineSetup"
Remove-Item $TempDir -Recurse -Force -ErrorAction SilentlyContinue
New-Item -Path $TempDir -ItemType Directory > $null
$ZipPath = "$TempDir\bundle.zip"
$ProgressPreference = 'SilentlyContinue'
Invoke-WebRequest -Uri https://github.com/dfederm/MachineSetup/archive/refs/heads/main.zip -OutFile $ZipPath
$ProgressPreference = 'Continue'
Expand-Archive -LiteralPath $ZipPath -DestinationPath $TempDir
$SetupScript = (Get-ChildItem -Path $TempDir -Filter setup.ps1 -Recurse).FullName
& $SetupScript @args
Remove-Item $TempDir -Recurse -Force

That’s a bit much to copy and paste though, so I saved that as a bootstrap.ps1 script in the repo, so the full bootstrapping is a one-liner:

iex "& { $(iwr https://raw.githubusercontent.com/dfederm/MachineSetup/main/bootstrap.ps1) }" | Out-Null

It’s a bit roundabout but the one-liner will download and execute bootstrap.ps1, which will in turn download the entire repo as a zip file, extract it, and run the setup script.

BinDir and autorun

With the bootstrap process in place, we can finally complete the picture with the aforementioned init.cmd autorun script and the BinDir. The repo contains a bin directory which is copied to BinDir and contains the init.cmd autorun and other necessary scripts or files.

The init.cmd autorun is described in more detail in my previous blog post, but essentially it’s a script that runs every time cmd is launched. I use it primarily to set up DOSKEY macros like n for launching Notepad++. Note that if you prefer PowerShell, you can set up similar behavior using Profiles (%UserProfile%\Documents\PowerShell\Profile.ps1).
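
For illustration, a minimal init.cmd might look like this (the macros and the Notepad++ path are assumptions):

```cmd
@echo off
:: %BinDir%\init.cmd - runs for every new cmd session via the Autorun registry value.
doskey n="C:\Program Files\Notepad++\notepad++.exe" $*
doskey gs=git status $*
```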

Additionally, the reason why BinDir is on the PATH is because any other helpful scripts can be added there and be invoked anywhere.

Finally, a backup of my Windows Terminal settings.json is in this directory so that the setup script can simply copy it to configure Windows Terminal.

Conclusion

Setting up a new machine doesn’t have to be a cumbersome process. By adopting this setup script and following the outlined steps, you can significantly reduce the time and effort required to configure new machines while ensuring a consistent and optimized working environment. With the power of automation and the flexibility of customization, the setup script presented in this blog post offers a practical solution to streamline the machine setup experience. Embrace the script, tailor it to your preferences, and let it handle the heavy lifting for you, allowing you to focus on what matters most—writing code and building remarkable software.

]]>
Removing unused dependencies with ReferenceTrimmer2023-02-12T00:00:00+00:002023-02-12T00:00:00+00:00https://dfederm.com/removing-unused-dependencies-with-referencetrimmerIt’s been a while since I first introduced ReferenceTrimmer and a lot has changed.

For background, ReferenceTrimmer is a NuGet package which helps identify unused dependencies which can be safely removed from your C# projects. Whether it’s old style <Reference>, other projects in your repository referenced via <ProjectReference>, or NuGet’s <PackageReference>, ReferenceTrimmer will help determine what isn’t required and simplify your dependency graph. This can lead to faster builds, smaller outputs, and better maintainability for your repository.

Most notable among the changes is that it’s now implemented as a combination of an MSBuild task and a Roslyn analyzer which seamlessly hook into your build process. A close second, and very related to the first, is that it uses the GetUsedAssemblyReferences Roslyn API to determine exactly which references the compiler used during compilation.

Getting started

Because of the implementation being in an MSBuild task and Roslyn analyzer, the bulk of the work to use ReferenceTrimmer is to simply add a PackageReference to the ReferenceTrimmer NuGet package. That will automatically enable its logic as part of your build. It’s recommended to add this to your Directory.Build.props or Directory.Build.targets, or if you’re using NuGet’s Central Package Management, which I highly recommend, your Directory.Packages.props file at the root of your repo.

For better results, IDE0005 (Remove unnecessary using directives) should also be enabled, and unfortunately to enable this rule you need to enable XML documentation comments (xmldoc) due to dotnet/roslyn issue 41640. This causes many new analyzers to kick in which you may have many violations for, so those would need to be fixed or suppressed. To enable xmldoc, set the <GenerateDocumentationFile> property to true.
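
Putting that together, a root-level Directory.Build.props might contain something like the following sketch (the version wildcard and PrivateAssets usage are my assumptions for a build-only package):

```xml
<Project>
  <PropertyGroup>
    <!-- xmldoc generation is needed for IDE0005 to be reported at build time -->
    <GenerateDocumentationFile>true</GenerateDocumentationFile>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="ReferenceTrimmer" Version="3.*" PrivateAssets="all" />
  </ItemGroup>
</Project>
```

You may also need to raise IDE0005's severity, e.g. dotnet_diagnostic.IDE0005.severity = warning in your .editorconfig, for unused usings to surface.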

And that’s it, ReferenceTrimmer should now run as part of your build!

How it works

ReferenceTrimmer consists of two parts: an MSBuild task and a Roslyn analyzer.

The task is named CollectDeclaredReferencesTask and, as you can guess, its job is to gather the declared references. It gathers the list of references passed to the compiler and associates each of them with the <Reference>, <ProjectReference>, or <PackageReference> from which they originate. It also filters out references which are unavoidable, such as implicitly defined references from the .NET SDK, as well as packages which contain build logic, since that may be the true purpose of the package as opposed to providing a referenced library.

This information from the task is dumped into a file _ReferenceTrimmer_DeclaredReferences.json under the project’s intermediate output folder (usually obj\Debug or obj\Release) and this path is added as an AdditionalFiles item to pass it to the analyzer.

Next, as part of compilation, the analyzer named ReferenceTrimmerAnalyzer will call the GetUsedAssemblyReferences API as previously mentioned to get the used references and compare them to the declared references provided by the task. Any declared references which are not used will cause a warning to be raised.

The warning code raised depends on the originating reference type: RT0001 for <Reference> items, RT0002 for <ProjectReference> items, or RT0003 for <PackageReference> items. These are treated like any other compilation warning and so can be suppressed on a per-project basis with <NoWarn>. Additionally, ReferenceTrimmer can be disabled entirely for a project by setting $(EnableReferenceTrimmer) to false.
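
For example, in a project file:

```xml
<PropertyGroup>
  <!-- Suppress just the PackageReference rule for this project -->
  <NoWarn>$(NoWarn);RT0003</NoWarn>

  <!-- Or disable ReferenceTrimmer for this project altogether -->
  <EnableReferenceTrimmer>false</EnableReferenceTrimmer>
</PropertyGroup>
```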

Note that because the warnings are raised as part of compilation, projects with other language types like C++ or even NoTargets projects will not cause warnings to be raised, nor do they need to be explicitly excluded from ReferenceTrimmer.

Future work

Ideas for future improvement include:

  • Better identifying <ProjectReference> items which are required at runtime only. Or at least documenting explicit guidance around how to reference those properly such that the compiler doesn’t “see” them but the outputs are still copied.
  • Add the ability to exclude specific references from being warned on by adding some metadata to the reference. This would allow for more granular control rather than having to disable an entire rule for an entire project.
  • Add support for C++ projects.
  • Find and fix more edge-case bugs!

Contributions and bug reports are always welcome on GitHub, and I’m hopeful ReferenceTrimmer can be helpful in detangling your repos!

]]>
Async Mutex2022-11-03T00:00:00+00:002022-11-03T00:00:00+00:00https://dfederm.com/async-mutexThe Mutex class in .NET helps manage exclusive access to a resource. When given a name, this can even be done across processes which can be extremely handy.

Though if you’ve ever used a Mutex you may have found that it cannot be used in conjunction with async/await. More specifically, from the documentation:

Mutexes have thread affinity; that is, the mutex can be released only by the thread that owns it.

This can make the Mutex class hard to use at times and may require use of ugliness like GetAwaiter().GetResult().

For in-process synchronization, SemaphoreSlim can be a good choice as it has a WaitAsync() method. However, semaphores aren’t ideal for managing exclusive access (new SemaphoreSlim(1) works but is less clear) and do not support system-wide synchronization, e.g. new Mutex(initiallyOwned: false, @"Global\MyMutex").
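
For reference, the in-process SemaphoreSlim pattern looks like this sketch (the class and the work inside are illustrative):

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ExclusiveWorker
{
    // A single-count semaphore acts as an in-process async mutex.
    private static readonly SemaphoreSlim Semaphore = new(initialCount: 1, maxCount: 1);

    public static async Task DoExclusiveWorkAsync(CancellationToken cancellationToken)
    {
        await Semaphore.WaitAsync(cancellationToken);
        try
        {
            // Do async work while holding exclusive access...
            await Task.Delay(100, cancellationToken);
        }
        finally
        {
            Semaphore.Release();
        }
    }
}
```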

Below I’ll explain how to implement an async mutex, but the full code can be found at the bottom or in the Gist.

EDIT Based on a bunch of feedback, it’s clear to me that I over-generalized this post. This implementation was specifically for synchronizing across processes, not within a process. The code below is absolutely not thread-safe. So think of this more as an “Async Global Mutex” and stick with SemaphoreSlim for synchronization across threads.

How to use a Mutex

First, some background on how to properly use a Mutex. The simplest example is:

// Create the named system-wide mutex
using Mutex mutex = new(false, @"Global\MyMutex");

// Acquire the Mutex
mutex.WaitOne();

// Do work...

// Release the Mutex
mutex.ReleaseMutex();

As Mutex derives from WaitHandle, WaitOne() is the mechanism to acquire it.

However, if a Mutex is not properly released when the thread holding it exits, the next WaitOne() will throw an AbandonedMutexException. The reason for this is explained as:

An abandoned mutex often indicates a serious error in the code. When a thread exits without releasing the mutex, the data structures protected by the mutex might not be in a consistent state. The next thread to request ownership of the mutex can handle this exception and proceed, if the integrity of the data structures can be verified.

So the next thread to acquire the Mutex is responsible for verifying data integrity, if applicable. Note that a thread can exit without properly releasing the Mutex if the user kills the process, so AbandonedMutexException should always be caught when trying to acquire a Mutex.

With this our new example becomes:

// Create the named system-wide mutex
using Mutex mutex = new(false, @"Global\MyMutex");
try
{
    // Acquire the Mutex
    mutex.WaitOne();
}
catch (AbandonedMutexException)
{
    // Abandoned by another process, we acquired it.
}

// Do work...

// Release the Mutex
mutex.ReleaseMutex();

However, what if the work we want to do while holding the Mutex is async?

AsyncMutex

First let’s define what we want the shape of the class to look like. We want to be able to acquire and release the mutex asynchronously, so the following seems reasonable:

public sealed class AsyncMutex : IAsyncDisposable
{
    public AsyncMutex(string name);

    public Task AcquireAsync(CancellationToken cancellationToken);

    public Task ReleaseAsync();

    public ValueTask DisposeAsync();
}

And so the intended usage would look like:

// Create the named system-wide mutex
await using AsyncMutex mutex = new(@"Global\MyMutex");

// Acquire the Mutex
await mutex.AcquireAsync(cancellationToken);

// Do async work...

// Release the Mutex
await mutex.ReleaseAsync();

Now that we know what we want it to look like, we can start implementing.

Acquiring

Because the Mutex must be acquired and released on a single thread, and because we want to return a Task so the mutex can be acquired asynchronously, we can start a new Task which uses the Mutex and return that.

public Task AcquireAsync(CancellationToken cancellationToken)
{
    TaskCompletionSource taskCompletionSource = new();

    // Putting all mutex manipulation in its own task as it doesn't work in async contexts
    // Note: this task should not throw.
    Task.Factory.StartNew(
        state =>
        {
            try
            {
                using var mutex = new Mutex(false, _name);
                try
                {
                    // Acquire the Mutex
                    mutex.WaitOne();
                }
                catch (AbandonedMutexException)
                {
                    // Abandoned by another process, we acquired it.
                }

                taskCompletionSource.SetResult();

                // TODO: We need to release the mutex at some point
            }
            catch (Exception ex)
            {
                taskCompletionSource.TrySetException(ex);
            }
        },
        state: null,
        cancellationToken,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);

    return taskCompletionSource.Task;
}

So now AcquireAsync returns a Task which doesn’t complete until the Mutex is acquired.

Releasing

At some point the code needs to release the Mutex. Because the mutex must be released on the same thread that acquired it, it must be released in the Task which AcquireAsync started. However, we don’t want to actually release the mutex until ReleaseAsync is called, so we need the Task to wait until that time.

To accomplish this, we need a ManualResetEventSlim which the Task can wait for a signal from, which ReleaseAsync will set.

private Task? _mutexTask;
private ManualResetEventSlim? _releaseEvent;

public Task AcquireAsync(CancellationToken cancellationToken)
{
    TaskCompletionSource taskCompletionSource = new();

    _releaseEvent = new ManualResetEventSlim();

    // Putting all mutex manipulation in its own task as it doesn't work in async contexts
    // Note: this task should not throw.
    _mutexTask = Task.Factory.StartNew(
        state =>
        {
            try
            {
                using var mutex = new Mutex(false, _name);
                try
                {
                    // Acquire the Mutex
                    mutex.WaitOne();
                }
                catch (AbandonedMutexException)
                {
                    // Abandoned by another process, we acquired it.
                }

                taskCompletionSource.SetResult();

                // Wait until the release call
                _releaseEvent.Wait();

                mutex.ReleaseMutex();
            }
            catch (Exception ex)
            {
                taskCompletionSource.TrySetException(ex);
            }
        },
        state: null,
        cancellationToken,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);

    return taskCompletionSource.Task;
}

public async Task ReleaseAsync()
{
    _releaseEvent?.Set();

    if (_mutexTask != null)
    {
        await _mutexTask;
    }
}

Now the Task will acquire the Mutex, then wait for a signal from the ReleaseAsync method to release the mutex.

Additionally, ReleaseAsync awaits the mutex Task to ensure it does not complete until the mutex has actually been released.

Cancellation

The caller may not want to wait forever for the mutex acquisition, so we need cancellation support. This is fairly straightforward since Mutex is a WaitHandle, and CancellationToken has a WaitHandle property, so we can use WaitHandle.WaitAny():

public Task AcquireAsync(CancellationToken cancellationToken)
{
    cancellationToken.ThrowIfCancellationRequested();

    TaskCompletionSource taskCompletionSource = new();

    _releaseEvent = new ManualResetEventSlim();

    // Putting all mutex manipulation in its own task as it doesn't work in async contexts
    // Note: this task should not throw.
    _mutexTask = Task.Factory.StartNew(
        state =>
        {
            try
            {
                using var mutex = new Mutex(false, _name);
                try
                {
                    // Wait for either the mutex to be acquired, or cancellation
                    if (WaitHandle.WaitAny(new[] { mutex, cancellationToken.WaitHandle }) != 0)
                    {
                        taskCompletionSource.SetCanceled(cancellationToken);
                        return;
                    }
                }
                catch (AbandonedMutexException)
                {
                    // Abandoned by another process, we acquired it.
                }

                taskCompletionSource.SetResult();

                // Wait until the release call
                _releaseEvent.Wait();

                mutex.ReleaseMutex();
            }
            catch (OperationCanceledException)
            {
                taskCompletionSource.TrySetCanceled(cancellationToken);
            }
            catch (Exception ex)
            {
                taskCompletionSource.TrySetException(ex);
            }
        },
        state: null,
        cancellationToken,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);

    return taskCompletionSource.Task;
}

Disposal

To ensure the mutex gets released, we should implement disposal. This should release the mutex if held. It should also cancel any acquisition currently waiting, which requires a linked cancellation token.

private CancellationTokenSource? _cancellationTokenSource;

public Task AcquireAsync(CancellationToken cancellationToken)
{
    cancellationToken.ThrowIfCancellationRequested();

    TaskCompletionSource taskCompletionSource = new();

    _releaseEvent = new ManualResetEventSlim();
    _cancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);

    // Putting all mutex manipulation in its own task as it doesn't work in async contexts
    // Note: this task should not throw.
    _mutexTask = Task.Factory.StartNew(
        state =>
        {
            // Named linkedToken rather than cancellationToken: C# doesn't allow
            // shadowing the method's parameter inside the lambda, and the catch
            // block below needs this token in scope too.
            CancellationToken linkedToken = _cancellationTokenSource.Token;
            try
            {
                using var mutex = new Mutex(false, _name);
                try
                {
                    // Wait for either the mutex to be acquired, or cancellation
                    if (WaitHandle.WaitAny(new[] { mutex, linkedToken.WaitHandle }) != 0)
                    {
                        taskCompletionSource.SetCanceled(linkedToken);
                        return;
                    }
                }
                catch (AbandonedMutexException)
                {
                    // Abandoned by another process, we acquired it.
                }

                taskCompletionSource.SetResult();

                // Wait until the release call
                _releaseEvent.Wait();

                mutex.ReleaseMutex();
            }
            catch (OperationCanceledException)
            {
                taskCompletionSource.TrySetCanceled(linkedToken);
            }
            catch (Exception ex)
            {
                taskCompletionSource.TrySetException(ex);
            }
        },
        state: null,
        cancellationToken,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);

    return taskCompletionSource.Task;
}

public async ValueTask DisposeAsync()
{
    // Ensure the mutex task stops waiting for any acquire
    _cancellationTokenSource?.Cancel();

    // Ensure the mutex is released
    await ReleaseAsync();

    _releaseEvent?.Dispose();
    _cancellationTokenSource?.Dispose();
}

Conclusion

AsyncMutex allows usage of Mutex with async/await.

Putting the whole thing together (or view the Gist):

public sealed class AsyncMutex : IAsyncDisposable
{
    private readonly string _name;
    private Task? _mutexTask;
    private ManualResetEventSlim? _releaseEvent;
    private CancellationTokenSource? _cancellationTokenSource;

    public AsyncMutex(string name)
    {
        _name = name;
    }

    public Task AcquireAsync(CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();

        TaskCompletionSource taskCompletionSource = new();

        _releaseEvent = new ManualResetEventSlim();
        _cancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);

        // Putting all mutex manipulation in its own task as it doesn't work in async contexts
        // Note: this task should not throw.
        _mutexTask = Task.Factory.StartNew(
            state =>
            {
                // Named linkedToken rather than cancellationToken: C# doesn't allow
                // shadowing the method's parameter inside the lambda, and the catch
                // block below needs this token in scope too.
                CancellationToken linkedToken = _cancellationTokenSource.Token;
                try
                {
                    using var mutex = new Mutex(false, _name);
                    try
                    {
                        // Wait for either the mutex to be acquired, or cancellation
                        if (WaitHandle.WaitAny(new[] { mutex, linkedToken.WaitHandle }) != 0)
                        {
                            taskCompletionSource.SetCanceled(linkedToken);
                            return;
                        }
                    }
                    catch (AbandonedMutexException)
                    {
                        // Abandoned by another process, we acquired it.
                    }

                    taskCompletionSource.SetResult();

                    // Wait until the release call
                    _releaseEvent.Wait();

                    mutex.ReleaseMutex();
                }
                catch (OperationCanceledException)
                {
                    taskCompletionSource.TrySetCanceled(linkedToken);
                }
                catch (Exception ex)
                {
                    taskCompletionSource.TrySetException(ex);
                }
            },
            state: null,
            cancellationToken,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);

        return taskCompletionSource.Task;
    }

    public async Task ReleaseAsync()
    {
        _releaseEvent?.Set();

        if (_mutexTask != null)
        {
            await _mutexTask;
        }
    }

    public async ValueTask DisposeAsync()
    {
        // Ensure the mutex task stops waiting for any acquire
        _cancellationTokenSource?.Cancel();

        // Ensure the mutex is released
        await ReleaseAsync();

        _releaseEvent?.Dispose();
        _cancellationTokenSource?.Dispose();
    }
}
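As a usage sketch (the mutex name and the critical-section body are illustrative), acquiring and releasing the mutex from async code looks like this:

```csharp
// Hypothetical usage example; the name "Global\MyApp.Mutex" is illustrative.
await using var mutex = new AsyncMutex(@"Global\MyApp.Mutex");

// Acquire without blocking the calling thread.
await mutex.AcquireAsync(CancellationToken.None);
try
{
    // Cross-process critical section here.
}
finally
{
    // Release so other processes waiting on the same name can proceed.
    await mutex.ReleaseAsync();
}
```

Note that `await using` ensures `DisposeAsync` runs even if acquisition is canceled, and calling `ReleaseAsync` before disposal is safe since the release event is simply set again.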
]]>
How does PackageReference work?
2022-06-06T00:00:00+00:00
https://dfederm.com/how-does-packagereference-work

PackageReference has replaced packages.config as the primary mechanism for consuming NuGet packages. For those looking to migrate, there is documentation available to help you. But how does it actually work under the hood?

Historically with packages.config files, NuGet’s role was simply to download the exact packages at the exact versions you specified, and then copy the packages into a repository-relative path configured in your NuGet.Config file, usually /packages. Actually consuming the package contents was ultimately up to the consuming projects, however the Visual Studio Package Manager UI would help update the relevant project with various <Import>, <Reference>, and <Content> elements based on convention.

With PackageReference, these conventions have been effectively codified. These days it is very cumbersome to consume packages which do not conform to the conventions. Additionally, PackageReference adds much-needed quality-of-life features, such as automatically pulling in transitive dependencies and unifying package versions.
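To make the difference concrete, here is a sketch comparing the two styles (package versions and target framework are illustrative, and the two snippets belong to different files):

```xml
<!-- packages.config (legacy): every package, including transitive
     dependencies, must be listed explicitly. -->
<packages>
  <package id="xunit" version="2.4.1" targetFramework="net48" />
  <package id="xunit.assert" version="2.4.1" targetFramework="net48" />
  <package id="xunit.core" version="2.4.1" targetFramework="net48" />
</packages>

<!-- PackageReference equivalent in the project file: only the direct
     dependency is declared; transitive packages are resolved at restore. -->
<ItemGroup>
  <PackageReference Include="xunit" Version="2.4.1" />
</ItemGroup>
```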

Restore

As I hinted earlier, NuGet’s job previously was only to download packages, so a nuget restore of a packages.config file did that and only that. With PackageReference, the restore process still downloads packages, but it also generates per-project files which describe the contents of each consumed package and which are used during the build to dynamically add the equivalents of the <Import>, <Reference>, and <Content> elements that were previously present in projects.

One benefit of these generated files is that the project files are left much cleaner. The project file simply has a PackageReference, rather than importing and referencing a bunch of files which all happen to live at paths inside that package, with lots of duplication.

Another benefit is that the copy of all package contents from the global package cache to the repository-relative /packages directory is no longer necessary as the generated files can point directly into the global package cache. This can save a lot of disk space and a lot of restore time (at least in a clean repository). Note that the global package cache is %UserProfile%\.nuget\packages by default on Windows machines, but can be redirected as desired, for example to the same drive as your code which is ideally an SSD, by setting %NUGET_PACKAGES%.
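Redirecting the cache is just a matter of setting the environment variable before restoring. A minimal sketch (POSIX shell shown; on Windows set the `NUGET_PACKAGES` environment variable through system settings or `setx`, and the target path here is purely illustrative):

```shell
# Redirect the global package cache to a faster drive.
# The path is illustrative; pick one on the same SSD as your code.
export NUGET_PACKAGES=/ssd/nuget-packages

# Subsequent restores download into, and builds resolve from, this folder.
echo "Global package cache: $NUGET_PACKAGES"
```

After this, a `nuget restore` or `dotnet restore` run from the same shell will populate the new location instead of the per-user default.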

These generated files are output to $(RestoreOutputPath), which by default is $(MSBuildProjectExtensionsPath), which by default is $(BaseIntermediateOutputPath), which by default is obj\ (Phew). The notable generated files are project.assets.json, <project-file>.nuget.g.props, and <project-file>.nuget.g.targets.

An interesting and important note is that PackageReference items are only used during the restore. During the build, any package-related information comes from the files generated during the restore.

Let’s start with the generated props and targets files as they’re more straightforward.

Generated props and targets

The generated props file is imported by this line in Microsoft.Common.props (which is imported by Microsoft.NET.Sdk):

<Import Project="$(MSBuildProjectExtensionsPath)$(MSBuildProjectFile).*.props" Condition="'$(ImportProjectExtensionProps)' == 'true' and exists('$(MSBuildProjectExtensionsPath)')" />

Similarly, the targets file is imported by a similar line in Microsoft.Common.targets.

The props file always defines a few properties which NuGet uses at build time, like $(ProjectAssetsFile), but the interesting part to a consumer is that the <Import> elements for packages, which used to live directly in the projects, are now generated into these files: each package's build\<package-name>.props is imported in <project-file>.nuget.g.props, and its build\<package-name>.targets in <project-file>.nuget.g.targets.

As an example, you’ll see a section similar to this in the generated props file for a unit test project using xUnit:

  <ImportGroup Condition="'$(ExcludeRestorePackageImports)' != 'true'">
    <Import Project="$(NuGetPackageRoot)xunit.runner.visualstudio\2.4.3\build\netcoreapp2.1\xunit.runner.visualstudio.props" Condition="Exists('$(NuGetPackageRoot)xunit.runner.visualstudio\2.4.3\build\netcoreapp2.1\xunit.runner.visualstudio.props')" />
    <Import Project="$(NuGetPackageRoot)xunit.core\2.4.1\build\xunit.core.props" Condition="Exists('$(NuGetPackageRoot)xunit.core\2.4.1\build\xunit.core.props')" />
    <Import Project="$(NuGetPackageRoot)microsoft.testplatform.testhost\17.1.0\build\netcoreapp2.1\Microsoft.TestPlatform.TestHost.props" Condition="Exists('$(NuGetPackageRoot)microsoft.testplatform.testhost\17.1.0\build\netcoreapp2.1\Microsoft.TestPlatform.TestHost.props')" />
    <Import Project="$(NuGetPackageRoot)microsoft.codecoverage\17.1.0\build\netstandard1.0\Microsoft.CodeCoverage.props" Condition="Exists('$(NuGetPackageRoot)microsoft.codecoverage\17.1.0\build\netstandard1.0\Microsoft.CodeCoverage.props')" />
    <Import Project="$(NuGetPackageRoot)microsoft.net.test.sdk\17.1.0\build\netcoreapp2.1\Microsoft.NET.Test.Sdk.props" Condition="Exists('$(NuGetPackageRoot)microsoft.net.test.sdk\17.1.0\build\netcoreapp2.1\Microsoft.NET.Test.Sdk.props')" />
  </ImportGroup>

Note that $(NuGetPackageRoot) is the global package cache directory as described earlier and is defined earlier in the same generated props file.

The generated props file also defines properties which point to the package root directories of packages which have the GeneratePathProperty metadata set. These properties look like $(PkgNormalized_Package_Name) and are mostly an escape hatch for packages which don't properly follow the conventions, where custom build logic in the project file is required to reach into the package.
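As a sketch of that escape hatch (the package name Contoso.Tools and the tools\ path inside it are hypothetical), the project file might look like this:

```xml
<ItemGroup>
  <!-- GeneratePathProperty makes NuGet emit $(PkgContoso_Tools)
       (dots in the package ID are replaced by underscores). -->
  <PackageReference Include="Contoso.Tools" Version="1.2.3" GeneratePathProperty="true" />
</ItemGroup>

<Target Name="CopyContosoTool" AfterTargets="Build">
  <!-- $(PkgContoso_Tools) points at the package's root in the global package cache. -->
  <Copy SourceFiles="$(PkgContoso_Tools)\tools\contoso.exe" DestinationFolder="$(OutDir)" />
</Target>
```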

Next we’ll explore the project.assets.json file.

Project Assets File

The project.assets.json file contains a boatload of information. It describes the full package dependency graph for each target framework the project targets, the contents of every package in the graph, the package folders in which those packages reside, the list of project references, and other miscellany.

Here is an example of the basic structure, with many omissions for brevity:

{
  "targets": {
    "net6.0": {
      "xunit/2.4.1": {
        "type": "package",
        "dependencies": {
          "xunit.analyzers": "0.10.0",
          "xunit.assert": "[2.4.1]",
          "xunit.core": "[2.4.1]"
        }
      },
      "xunit.analyzers/0.10.0": {
        "type": "package"
      },
      "ExampleClassLibrary/1.0.0": {
        "type": "project",
        "framework": ".NETCoreApp,Version=v6.0",
        "dependencies": {
          // ... the dependency project's package dependencies ...
        },
        // ... all other transitive package and project dependencies ...
      }
    }
  },
  "libraries": {
    "xunit/2.4.1": {
      "sha512": "XNR3Yz9QTtec16O0aKcO6+baVNpXmOnPUxDkCY97J+8krUYxPvXT1szYYEUdKk4sB8GOI2YbAjRIOm8ZnXRfzQ==",
      "type": "package",
      "path": "xunit/2.4.1",
      "files": [
        ".nupkg.metadata",
        ".signature.p7s",
        "xunit.2.4.1.nupkg.sha512",
        "xunit.nuspec"
      ]
    },
    "ExampleClassLibrary/1.0.0": {
      "type": "project",
      "path": "../src/ExampleClassLibrary.csproj",
      "msbuildProject": "../src/ExampleClassLibrary.csproj"
    }
    // ... all other transitive package dependencies' contents and transitive project dependencies' paths ...
  },
  "projectFileDependencyGroups": {
    "net6.0": [
      "ExampleClassLibrary >= 1.0.0",
      "Microsoft.NET.Test.Sdk >= 17.2.0",
      "xunit >= 2.4.1",
      "xunit.runner.visualstudio >= 2.4.5"
    ]
  },
  "packageFolders": {
    "C:\\Users\\David\\.nuget\\packages\\": {},
    "C:\\Program Files (x86)\\Microsoft Visual Studio\\Shared\\NuGetPackages": {},
    "C:\\Program Files\\dotnet\\sdk\\NuGetFallbackFolder": {}
  },
  "project": {
    "version": "1.0.0",
    // ... various information about the project ...
  }
}

Examples of why one might want to look at this file are to understand where a dependency comes from or why a dependency version resolves the way it does.

The ResolvePackageAssets target reads the project.assets.json file to translate its contents into various items, like ResolvedAnalyzers, _TransitiveProjectReferences, ResolvedCompileFileDefinitions (which end up becoming Analyzer, ProjectReference, and Reference items respectively), and everything else which is used from a package.

Now why the ResolvePackageAssets target exists, as opposed to NuGet simply emitting these items in the generated props and targets files, is anyone's guess. It seems like that would be simpler, more straightforward, and more performant. A complaint I have, which I also see from others, is that there is too much black-box magic, especially in ResolvePackageAssets, but it is what it is.

Conclusion

I hope this helps shed some light on how PackageReference works, explains why it’s better than the legacy packages.config, and provides some of the details which can help with understanding and debugging your build.

]]>