valinet

Reserve RAM properly (and easily) on embedded devices running Linux

2024-07-23T00:00:00+00:00

Nowadays, we are surrounded by a popular class of embedded devices: specialized hardware powered by an FPGA that is tightly coupled with a CPU on the same die, where the CPU mostly just runs software that configures, facilitates and monitors the design. Examples are the ZYNQ devices from AMD (formerly Xilinx).

Sometimes, the application in question has to share a large quantity of data from the “hardware side” to the “software side”. For example, suppose you have a video pipeline configured in hardware that acquires images from a camera sensor and displays them out over HDMI. The system also has some input device attached, so the user can press some key there (just an example) and have a screenshot saved to the storage medium (an SD card, let’s say). A whole frame is quite big (1920 columns * 1080 lines * 2 bytes per pixel, as for an 8-bit 4:2:2 1080p image, is approx. 4.1MB), so spending hardware resources (like LUT/BRAM) on buffering it is not really an option.
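The frame size quoted above is plain arithmetic, and it can be sketched in C as follows. This is only an illustration: the helper name frame_bytes is made up for this example, and it assumes a packed format described by an average number of bits per pixel (16 bits, meaning 2 bytes, for the figure in the text).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed to store one uncompressed frame,
 * given its dimensions and the average number of bits per pixel. */
size_t frame_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    /* Round up to whole bytes in case the bit count is not a multiple of 8. */
    return (width * height * bits_per_pixel + 7) / 8;
}
```

Calling frame_bytes(1920, 1080, 16) returns 4147200, which is far more than the block RAM a typical design can spare for buffering.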

Most of the time, designs of such systems have shared RAM modules that both sides can access at the same time: the FPGA can write to “some” memory area, and applications running on the CPU can read from that “some” memory area. Thus, the video pipeline could write the frames to RAM as well, and the software is free to grab frames from there whenever the user presses a button. Well, technically a bit after the user requests so: frame grabbing would have to take place at specific moments (probably right after the video pipeline has finished writing a frame) in order to avoid a torn image (which is what you get when you save the data while the video pipeline is in the middle of writing a new frame). For the sake of this exercise, let’s not bother with that, nor with the fact that the capture would only appear immediate rather than actually being so (interrupt style, let’s say).

So the hardware part is actually easy: we’ll have some design that we can configure from software at runtime and tell what physical address in RAM to write to. Done.

The software part is tricky though. Most of the time, ideally, one wants to dedicate all “unused” RAM to the CPU, so there is flexibility for the applications running there. In practice, that is mostly just one application, since these are usually not general purpose computing platforms, but run custom made “distributions” which only include the (Linux) kernel, the minimum strictly necessary user land and the custom apps that help the system do whatever it has to do. For this goal, Buildroot is a popular open way to achieve highly customized, very specialized and lightweight system images. Alternatively, such designs forgo Linux entirely and run a single binary bare metal, especially when running real time is a goal. Anyway, the point is, the “coordinating” app is mostly the single process running for the entire lifetime of the system.

Despite that, Linux is not really designed with that expectation in mind - it reasonably expects a couple of processes to run at any given time. We cannot just pick some random address in RAM and write to it - there could be processes that were allocated memory there. So what is there to do?

At the moment, I know three solutions, and I will go over each of them and tell you why I picked the last one.

1. Limit the RAM the OS ‘sees’

This is done using the devicetree. The hardware is then free to write data there without fearing it is going to hurt something in the software/CPU side. The problem here is that the OS cannot ever access that memory, but this technique works when the hardware needs to buffer data for itself and the CPU doesn’t really need access to it.

2. Reserve some CMA

Contiguous Memory Allocation is a feature where the kernel reserves some memory area for hardware devices that need physically contiguous memory. It can be enabled using the device tree or the cma=128MB boot flag (that one, for example, instructs the kernel to reserve 128MB at the end of the physical memory space for CMA).

The workflow then would be to write a kernel module that requests the amount of CMA memory you want from the kernel, then configure the hardware with the physical address of that. And this approach works just fine. The problem is, you have to write a kernel module and implement communication from the user land with it - probably some ioctl where you tell the driver how much memory to request from the kernel and via which it tells you back the physical address it got. Certainly doable on a system we fully control, but do we really have to bother with writing a driver and risk introducing instability in the one area where it hurts the most?

Actually, we can hack away a solution to this: simply pick some address in the CMA region without telling the kernel about it and configure the hardware to use it. From the software side, to access pages in that physical area, you mmap it using the /dev/mem device (an in-tree driver which allows full access to the entire visible RAM from the user space), which gives you a virtual address you can work with, backed by the number of pages you requested in that physical area. Something like this:

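#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
// ...
uintptr_t phys_addr = 0x17000000; // physical address, must be page-aligned
size_t size = 1024;
int fd = open("/dev/mem", O_RDWR | O_SYNC); // returns -1 on failure
// Map `size` bytes of physical memory starting at phys_addr; returns MAP_FAILED on error
char* virt_addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys_addr);
strcpy(virt_addr, "Hello, world");
// ...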

Pretty simple, so what’s wrong with it? 2 main things I’d say. Firstly, maybe you do not want access to the entire memory. What if you screw something up and write out of bounds? Yeah, there’s that size parameter that protects you but still… what if some other process misbehaves, or some unwanted program is loaded…? Access like this to the entire memory is not really desired and not really needed, as you’ll see in a bit. Secondly, and this is the big problem, the kernel knows nothing about what we are doing. We did not tell it we are using memory from there. The CMA is not for us to do what we want with it. It’s a feature the kernel expects to use to service drivers that request memory from it. If you happen to have some driver that requests memory there and you use in your code the starting address of the CMA to do your own stuff as I showed you above, you’re in for some funny times let’s say - things will be screwed up in weird ways.

You can still hack around this: you could query from the user land and find out how much CMA the kernel handed out to requesting parties. It is allocated in number of pages from the beginning. There is no direct syscall to work with the CMA from the user space, as far as I know, since one’s not really supposed to, but there is actually a debugfs interface which can tell you how much is used, but really… there is no guarantee some device won’t request more. Sure, if you control the hardware and know about it as well, just work with some memory a bit further away from where you expect/see the requests finish at and 99.99% of the time you can get away with this. But it’s… hacky. Ugly. We need something better.

3. Properly reserve memory

Specifically for the requirement outlined above, the kernel actually has a dedicated mechanism: reserved memory. It’s configured using the device tree, where you introduce an entry that looks like this:

reserved-memory {
    #address-cells = <1>;
    #size-cells = <1>;
    ranges;

    reserved: buffer@0x2000000 {
        no-map;
        reg = <0x2000000 0x4000000>;
    };
};

What matters in the syntax above is the reg line, which reads like this: reserve 0x4000000 bytes starting from address 0x2000000. This is on 32-bit, where #address-cells and #size-cells are both 1. On 64-bit, where they are typically 2, there are 4 numbers, the first and third being the high 32 bits of the address and of the size respectively (so it looks something like this: reg = <0 0x2000000 0 0x4000000>). Correct values for #address-cells and #size-cells can be determined from the documentation of your SoC (or examples you find online/the provided examples of your platform).
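To make the 4-number layout concrete, here is a minimal C sketch that splits a 64-bit address and size into the high and low 32-bit cells. It assumes #address-cells and #size-cells are both 2; the names reg_cells, make_reg and print_reg are made up for this illustration and are not part of any kernel or device tree API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: the four 32-bit cells of a `reg` entry
 * when #address-cells = <2> and #size-cells = <2>. */
struct reg_cells {
    uint32_t addr_hi, addr_lo;
    uint32_t size_hi, size_lo;
};

/* Split a 64-bit address and size into high/low 32-bit cells. */
struct reg_cells make_reg(uint64_t addr, uint64_t size)
{
    struct reg_cells r;
    assert(size > 0); /* an empty reservation makes no sense */
    r.addr_hi = (uint32_t)(addr >> 32);
    r.addr_lo = (uint32_t)(addr & 0xffffffffu);
    r.size_hi = (uint32_t)(size >> 32);
    r.size_lo = (uint32_t)(size & 0xffffffffu);
    return r;
}

/* Print the entry in device tree source syntax. */
void print_reg(struct reg_cells r)
{
    printf("reg = <0x%x 0x%x 0x%x 0x%x>;\n",
           (unsigned)r.addr_hi, (unsigned)r.addr_lo,
           (unsigned)r.size_hi, (unsigned)r.size_lo);
}
```

For the reservation used in this article, make_reg(0x2000000, 0x4000000) yields zero for both high cells, which is why the 64-bit form only differs from the 32-bit one by the two extra zeros.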

At runtime, that simply divides the system RAM in 2 regions, and the OS can use either part but not the reservation that you made in the middle. You are guaranteed that, that’s great. Even better, it’s also excluded from tools like htop, as opposed to the CMA approach. If you check /proc/iomem, it looks something like this:

00000000-01ffffff : System RAM
  00008000-007fffff : Kernel code
  00900000-00951eef : Kernel data
06000000-1fffffff : System RAM
44a01000-44a0ffff : serial
...
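The gap in the listing above (nothing between 0x01ffffff and 0x06000000) is exactly the reservation. As a rough sketch, a program could confirm that by parsing the address ranges and checking them against the reserved region; the helper names parse_iomem_line and overlaps are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse one /proc/iomem line such as "00000000-01ffffff : System RAM"
 * into its start and end addresses (leading spaces are skipped).
 * Returns 1 on success, 0 if the line does not match. */
int parse_iomem_line(const char *line, uint64_t *start, uint64_t *end)
{
    unsigned long long s, e;
    assert(line && start && end);
    if (sscanf(line, "%llx-%llx", &s, &e) != 2)
        return 0;
    *start = s;
    *end = e;
    return 1;
}

/* Return 1 if [base, base + size) overlaps the inclusive range [start, end]. */
int overlaps(uint64_t base, uint64_t size, uint64_t start, uint64_t end)
{
    uint64_t last = base + size - 1; /* last byte of the reservation */
    return base <= end && start <= last;
}
```

With base 0x2000000 and size 0x4000000, the last reserved byte is 0x05ffffff, so neither of the two System RAM ranges in the listing overlaps the reservation.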

And, how do you access that from the user space? Well, at this point, still with /dev/mem, so we have only solved half of the problem. For the other half, we can employ a safe hack this time: overlay a generic-uio device over it (again, using the device tree). This is a simple driver which just maps a memory region of a certain size to user space. That size and physical address are, you guessed it, specified in the device tree. In the user land, it is accessible at /sys/class/uio/....

Finally, the device tree additions are these:

reserved-memory {
    #address-cells = <1>;
    #size-cells = <1>;
    ranges;

    reserved: buffer@0x2000000 {
        no-map;
        reg = <0x2000000 0x4000000>;
    };
};
reserved_memory1@0x2000000 {
    compatible = "generic-uio";
    reg = <0x2000000 0x4000000>;
};

From the user land, you access that memory like this:

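#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
// ...
size_t size = 0x4000000; // size of the reserved region, as set in the device tree
int fd = open("/dev/uio0", O_RDWR);
// For UIO devices, an offset of N * page size selects mapping N; 0 selects map0
char* virt_addr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
strcpy(virt_addr, "Hello, world");
// ...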

The code does not really need to have knowledge in advance about the physical address of the memory area or its size. These can be queried directly from properties exposed by the UIO device:

# cat /sys/class/uio/uio0/maps/map0/addr
# cat /sys/class/uio/uio0/maps/map0/size

And each uio has a name file associated which contains the name you specified in the device tree. You can read those for each folder in /sys/class/uio to learn which entry in there corresponds to which device you have set up in your device tree:

# cat /sys/class/uio/uio0/name
reserved_memory1
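The same attributes can be read from C instead of the shell. The following is a minimal sketch, assuming the addr and size files contain a single hexadecimal number as text; the helper name read_hex_file is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a single hexadecimal value (with or without a "0x" prefix) from a
 * text file such as /sys/class/uio/uio0/maps/map0/size.
 * Returns 0 on success, -1 on failure. */
int read_hex_file(const char *path, uint64_t *value)
{
    char buf[64];
    char *end;
    FILE *f;

    assert(path && value);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    errno = 0;
    unsigned long long v = strtoull(buf, &end, 16);
    if (errno != 0 || end == buf) /* overflow, or no digits at all */
        return -1;
    *value = v;
    return 0;
}
```

The value read from the size file can then be passed as the length argument of the mmap call shown earlier, so the program does not have to hard-code it.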

Useful links

Linux Reserved Memory

Device tree Reserve Memory “Invalid reg property”

Access valinet.ro over the Tor network

2024-06-16T00:00:00+00:00

It is my pleasure to announce that, for almost 2 weeks now, valinet.ro is available over the Tor network as well, at https://valinet6l6tq6d5yohaa6gdsf2ho4qcqcxyj2nahp5kd4z7nsa6ycdqd.onion/. I have presented this setup in an assignment for one of the classes I take in uni, so here’s the write-up that I have made for it.

Setup consists of an Alpine Linux-based VM which runs the tor server service and an Nginx Proxy Manager docker container which proxies the actual web site which, as you know, is publicly available on GitHub. This way, I do not have to “build” the web site in 2 places (I use a static site generator), it is still hosted on GitHub Pages and relayed through that proxy to the clients accessing it via Tor.

Client => Tor network => Tor server on VM => nginx reverse proxy => github.io

I chose Alpine because it is lightweight, easy to set up and manage and I am familiar with it, and Nginx Proxy Manager because it is just easier to use a GUI to configure everything rather than nginx config files.

The config for /etc/tor/torrc looks like this (based on this):

HiddenServiceDir /var/lib/tor/hidden_service
HiddenServiceVersion 3
HiddenServicePort 80 127.0.0.1:80
HiddenServicePort 443 127.0.0.1:443

The nginx docker container is configured to only listen on localhost. The VM runs in an isolated network anyway (I control the hypervisor and the infrastructure there - it’s running on a Proxmox cluster at my workplace), and the web site is also publicly available via the clearnet, so I do not really care if someone may hypothetically access it bypassing the Tor network; that’s why I decided running it over unix sockets was not worth the hassle, but if you want that, check this out.

For the Nginx Proxy Manager config for the web site, besides what is offered through the GUI (forwarding valinet…onion to https valinet.ro 443), I have also set a custom location for “/”, again that directs to https valinet.ro 443, which allows me to set custom nginx directives. This is what I used:

proxy_set_header Host valinet.ro;
proxy_redirect https://valinet.ro/ https://valinet6l6tq6d5yohaa6gdsf2ho4qcqcxyj2nahp5kd4z7nsa6ycdqd.onion/;

proxy_set_header is necessary so that GitHub knows which of the web sites it hosts we want fetched (a lot of web sites are “stored” at the same IP). Without this, GitHub returns a 404 Not Found error.

proxy_redirect sets the text that is changed in the “Location” and “Refresh” headers. Basically, this changes the address from “valinet.ro” to “valinet…onion”.

To make the address look nice, following this, I have used mkp224o to generate a few addresses and picked the one that I liked the most. On a powerful AMD Ryzen 5900X-based machine it only took around 15-20 minutes to generate 3 options to choose from. It was not initially in the plan, but I saw this on Kushal’s web site and I have decided I must also have that.

The Tor Project has a web page showing the addresses of some well known services, like The New York Times, Facebook etc. Some of these run over https, and I thought it wouldn’t hurt to run my service over https as well for extra swag.

According to this, there are only 2 providers of certificates for .onion addresses: Digicert, which is expensive through the roof, and HARICA (Hellenic Academic and Research Institutions Certificate Authority), which has more normal prices. The setup was very easy: I verified ownership by generating a CSR signed with my service’s key, as described in their instructions, using an open tool they host, and got my certificates. Then, uploading and associating them with the domain was a breeze through Nginx Proxy Manager’s GUI.

Finally, you can also use Nginx Proxy Manager to hide certain headers that you do not want proxied, using directives like:

add_header Server custom;
proxy_hide_header ETag;

This is it, happy onioning!

Convert Alpine Linux containers to VMs

2024-06-01T00:00:00+00:00

Lately I have decided to migrate away from Proxmox containers to full blown virtual machines. One reason is the inability of containers to host Docker inside the root volume when using zfs, which forces one to resort to all kinds of shenanigans, like storing /var/lib/docker in a separate xfs volume, which nukes the built-in high availability, failover and hyperconvergence (using Ceph) features for that container. Another is wanting to make the setup more portable and less dependent on the underlying OS - with a VM, you basically copy the virtual disk and that’s it, and if you write that on physical media it boots and works just as expected, whereas containers only run on top of an underlying OS like Proxmox. Anyway, despite recent progress, running Docker in Proxmox CTs on ZFS is still not really recommended, whereas running it in VMs is fine.

Here are the steps, with brief comments where relevant:

Create a new VM with the desired features in Proxmox. Attach Arch Linux or any Linux live CD to it. Make sure to add a serial port to it, so we can later enable xterm.js on it (allows copy/paste), in addition to the noVNC console.
Enable ssh (and allow root authentication with password) in the Alpine Linux CT, install rsync on it and kill all running Dockers/services inside it.
Boot Arch Linux on the new VM.
Format the disk inside the VM according to your needs. For my setup, I created a 100MB EFI partition and used the rest as ext4 for the file system. The tune2fs command is necessary in order to fix the unsupported filesystem error which would otherwise occur later on when we install grub.

wipefs -a /dev/sda
cfdisk /dev/sda
mkfs.vfat -F32 /dev/sda1
mkfs.ext4 -O "^has_journal,^64bit" /dev/sda2
tune2fs -O "^metadata_csum_seed" /dev/sda2

Mount the new partitions.

mount /dev/sda2 /mnt
mkdir -p /mnt/boot/efi
mount /dev/sda1 /mnt/boot/efi

Optionally, assign an IP address to your Arch Linux live CD, if your network is not running DHCP (in the example, 10.2.2.222 is the IP for the booted Arch Linux, and 10.2.2.1 is the IP of the gateway).

ip address add 10.2.2.222/24 broadcast + dev enp6s18
ip route add default via 10.2.2.1 dev enp6s18

Copy the actual files from the container to the new disk, excluding special stuff like kernel services exposed via the file system etc (in the example, 10.2.2.111 is the IP of the Alpine Linux container).

rsync -ahPHAXx --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} root@10.2.2.111:/ /mnt/

Alternatively, here you could set up the file system with fresh files from minirootfs, a 2MB base system image of Alpine Linux destined for containers - we later on update it with the minimum required to boot on bare metal, a la Arch Linux.

wget -O- https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-minirootfs-3.20.0-x86_64.tar.gz | tar -C /mnt -xzpf -

Bind mount live kernel services to the just copied file system, generate a correct fstab and chroot into the new file system.

for fs in dev dev/pts proc run sys tmp; do mount -o bind /$fs /mnt/$fs; done
genfstab /mnt >> /mnt/etc/fstab
chroot /mnt /bin/sh -l

Install and update base Alpine Linux packages needed. For example, we now need a kernel, since we won’t be using the kernel of the hypervisor anymore.

apk add --update alpine-base linux-lts grub grub-efi efibootmgr

Install grub on the new EFI partition. For some reason, EFI variables and commands did not work under my Proxmox VM, so efibootmgr is unable to set a boot entry. However, simply placing the bootloader in /EFI/Boot/bootx64.efi will have the UEFI boot it by default, so that’s my workaround.

grub-install --target=x86_64-efi --efi-directory=/boot/efi
mkdir -p /boot/efi/EFI/Boot
cp /boot/efi/EFI/alpine/grubx64.efi /boot/efi/EFI/Boot/bootx64.efi

Again, with my setup, grub won’t boot if root= is specified using a PARTUUID. My workaround is to explicitly use /dev/sda2. For this, edit the /boot/grub/grub.cfg file. Also, add modules=ext4 rootfstype=ext4 to that line, otherwise the bootloader doesn’t recognize the filesystem of the root partition. Lastly, also specify console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200 consoleblank=0 in order to enable system messages over the serial port we have added earlier. You basically have to go from something like this:

linux /boot/vmlinuz-lts root=PARTUUID=84035c78-3729-4cc9-b1c5-d34d1189cefd ro

Replace it with:

linux /boot/vmlinuz-lts root=/dev/sda2 ro modules=ext4 rootfstype=ext4 console=tty0 console=ttyS0,115200

Some bootstrapper scripts/templates for these containers set the system type to lxc. This has the effect of skipping various things at startup that are not required in a container environment, like checking the disks for errors, which is needed for remounting / as read write. Without undoing this, you will have problems booting Alpine Linux because the root file system will be read only. I have spent a ton of time figuring this out. You need to edit the /etc/rc.conf file and comment out rc_sys="lxc".
Install some additional tools that only make sense outside a container; for example, file systems need to be checked at boot - for ext4, mkfs.ext4 is in e2fsprogs, while for FAT32 (EFI partition), mkfs.vfat is in dosfstools. You also need qemu-guest-agent in order for the shutdown command to gracefully shut down the virtualized operating system and only then turn off the VM.

apk add chrony qemu-guest-agent e2fsprogs dosfstools

Enable the following services. These are a mirror of what an actual install of regular Alpine Linux has enabled. You may be able to skip some of those, although I haven’t tried.

rc-update add qemu-guest-agent
rc-update add acpid default
rc-update add chronyd default
rc-update add devfs sysinit
rc-update add dmesg sysinit
rc-update add hwclock boot
rc-update add hwdrivers sysinit
rc-update add loadkmap boot
rc-update add mdev sysinit
rc-update add modules boot
rc-update add mount-ro shutdown
rc-update add swap boot
rc-update add sysctl boot
rc-update add urandom boot

If rc-update add urandom boot doesn’t work, please note that it is called seedrng in newer versions of Alpine Linux, so use rc-update add seedrng boot instead.

You can display the status of all services using rc-update show -v | less.

Boot and enjoy. If you did not enable the serial console, you may find that the first tty doesn’t work, or rather it duplicates output. A workaround is to use Ctrl+Alt+F2 to switch to an alternative tty.

Install Intel Unison on Windows 11 22000-based builds

2023-01-03T00:00:00+00:00

From the ashes of Dell Mobile Connect, which was based on Screenovate technologies which were purchased by Intel last year, a similar app with a different name has arrived - this is a quick guide showing you how to install it on 21H2 (22000-based) Windows 11 builds, despite it only officially working on 22H2 (22621-based) builds and up.

The problem: if you attempt to install the app via the Microsoft Store on 22000-based builds, it will tell you that the app is not supported, suggesting you upgrade to 22H2. If you cannot or do not have time to do that, as a workaround, I suggest this:

1. Source the application. For this, I recommend using the adguard store: from the drop down, choose “ProductId” and as product ID, use “9PP9GZM2GN26”. Choose the retail channel, click the checkbox, wait a bit, and then download the AppUp.IntelTechnologyMDE_2.2.2315.0_neutral_~_8j3eq9eme6ctt.msixbundle file.
2. Using 7zip or something similar, extract the file as if it were a regular archive file. In the resulting folder, you’ll get a WebPhonePackaging_2.2.2315.0_x64.msix file. Repeat the process, unzipping that as well.
3. In the resulting folder, identify and delete the AppxMetadata folder, and the [Content_Types].xml, AppxBlockMap.xml, and AppxSignature.p7x files.
4. Edit the AppxManifest.xml file. Identify the line which mentions OS build 22621 as minimum, and change it to specify build 22000 as the minimum instead. So, turn line 11 from:

To:

5. Enable developer mode on your PC. For this, open Windows Settings -> Privacy & Security -> For developers -> Developer Mode -> ON. This will enable sideloading unsigned apps on your PC. The Intel Unison app’s signature is broken since we edited the AppxManifest.xml file, and we removed it altogether by deleting the folder and mentioned files in step 3.
6. Open an administrative PowerShell window and browse to the folder containing the AppxManifest.xml file we have just edited in step 4. When in that directory, issue the following command in order to install the application on your system:

Add-AppxPackage -Register .\AppxManifest.xml

That’s it! You should now be able to find the app in your Start menu, and after launching it, it is business as usual - I was able to use it to sync my iPhone to my PC, make calls and so on, almost as if I were doing the same thing with a Mac. Certainly better than nothing, that’s for sure - it makes owning a PC alongside otherwise Apple devices a less clunky experience.

The way it works is it sets up your PC as if it were a Bluetooth handsfree device - your phone ‘sees’ the PC as if it were your typical car stereo. It’s a fine way to do it, and I am amazed that it takes a multi-million dollar company to attempt this, instead of a hobby, open source project hosted on GitHub. The thing is doable at that scale in my opinion; it could open the door to neat implementations, like having a Windows or Android based tablet as a car stereo, which of course comes with plenty of advantages, like having a software package which you are more in control of, compared to what car manufacturers offer. But yeah, maybe another day… I am sure there are clunky Chinese implementations on Aliexpress already, but those aren’t as polished as what a DIY solution could achieve.

If you get an error on step 6, you might need to install the other prerequisite packages for the app to work. These are offered on the adguard store as well: just manually download the relevant package (like Microsoft.VCLibs.140.00.UWPDesktop_14.0.30704.0_x64__8wekyb3d8bbwe.appx for example) and install it by double clicking it in File Explorer. If you do not have the App Installer app installed on your Windows install, you can alternatively open an administrative PowerShell window where you have downloaded the file and issue a command like this in order to install the package:

Add-AppxPackage .\Microsoft.VCLibs.140.00.UWPDesktop_14.0.30704.0_x64__8wekyb3d8bbwe.appx

Happy New Year, 2023!

Working with shadow copies is kind of broken on Windows 11 22H2

2022-08-29T00:00:00+00:00

As a continuation of my previous article on shadow copies in Windows, I investigate how they behave under 22621+-based builds of Windows 11 (version 22H2, yet to be released).

Under build 22622, the Previous Versions tab displays no snapshots:

Attaching with a debugger and looking at the disassembly of twext.dll, the component that hosts the “Previous Versions” tab, I found out that the following call to NtFsControlFile errors out:

__int64 __fastcall IssueSnapshotControl(__int64 a1, _DWORD *a2, unsigned int a3)
{
  HANDLE EventW; // rsi
  int v7; // ebx
  unsigned __int64 v8; // r8
  signed int v9; // ecx
  signed int v11; // eax
  signed int LastError; // eax
  int v13; // [rsp+50h] [rbp-18h] BYREF
  wil::details::in1diag3 *retaddr; // [rsp+68h] [rbp+0h]

  memset_0(a2, 0, a3);
  EventW = CreateEventW(0i64, 1, 0, 0i64);
  if ( EventW )
  {
    v7 = NtFsControlFile(a1, EventW, 0i64, 0i64, ..., FSCTL_GET_SHADOW_COPY_DATA, NULL, 0, a2, a3);
...

This is called when the window tries to enumerate all snapshots in the system. The call stack looks something like this:

CTimewarpResultsFolder::EnumSnapshots
CTimewarpResultsFolder::_AddSnapshotShellItems
SHEnumSnapshotsForPath
QuerySnapshotNames
IssueSnapshotControl

Of note is that during CTimewarpResultsFolder::EnumSnapshots, the extension also loads in items corresponding to the following, in addition to shadow copy snapshots:

File History (AddFileHistoryShellItems)
Windows 7 Backup (AddSafeDocsShellItems)

At first, I thought that the device call no longer supports being called with insufficient space in the buffer (the first call to this IssueSnapshotControl has the buffer of size 0x10, enough for the call to populate it with information about the space it requires to contain all the data; you can read more about this device IO control call here, here, and here).

This being said, I quickly sketched out a project in Visual Studio where I set a large enough buffer. Unfortunately, this still did not produce the expected result. The call still said there are no snapshots available despite being offered enough space in the buffer to copy the names there.

    HRESULT hr = S_OK;
    DWORD sz = sizeof(L"@GMT-YYYY.MM.DD-HH.MM.SS");

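    // Open the volume through its administrative share; FILE_LIST_DIRECTORY is the access mask 1 and OPEN_EXISTING equals (CREATE_NEW | CREATE_ALWAYS), the values seen in the disassembly
    HANDLE hFile = CreateFileW(L"\\\\localhost\\C$", FILE_LIST_DIRECTORY, (FILE_SHARE_READ | FILE_SHARE_WRITE), NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);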
    if (!hFile || hFile == INVALID_HANDLE_VALUE)
    {
        printf("CreateFileW: 0x%x\n", GetLastError());
        return 0;
    }
    HANDLE hEvent = CreateEventW(NULL, TRUE, FALSE, NULL);
    if (!hEvent)
    {
        printf("CreateEventW: 0x%x\n", GetLastError());
        return 0;
    }
    typedef struct _IO_STATUS_BLOCK {
        union {
            NTSTATUS Status;
            PVOID    Pointer;
        };
        ULONG_PTR Information;
    } IO_STATUS_BLOCK, * PIO_STATUS_BLOCK;
    NTSTATUS(*NtFsControlFile)(
        HANDLE           FileHandle,
        HANDLE           Event,
        PVOID  ApcRoutine,
        PVOID            ApcContext,
        PIO_STATUS_BLOCK IoStatusBlock,
        ULONG            FsControlCode,
        PVOID            InputBuffer,
        ULONG            InputBufferLength,
        PVOID            OutputBuffer,
        ULONG            OutputBufferLength
        ) = GetProcAddress(GetModuleHandleW(L"ntdll"), "NtFsControlFile");
    if (!NtFsControlFile)
    {
        printf("GetProcAddress: 0x%x\n", GetLastError());
        return 0;
    }
    NTSTATUS rv = 0;
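    // Room for the 3 DWORD counters at the start of the output, plus up to 512 snapshot names and the final terminator
    DWORD max_theoretical_size = sizeof(DWORD) + sizeof(DWORD) + sizeof(DWORD) + 512 * sizeof(L"@GMT-YYYY.MM.DD-HH.MM.SS") + sizeof(L"");
    char* buff = calloc(1, max_theoretical_size);
    DWORD* buff2 = (DWORD*)buff; // view of the 3 leading DWORD counters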
    IO_STATUS_BLOCK status;
    ZeroMemory(&status, sizeof(IO_STATUS_BLOCK));
#define FSCTL_GET_SHADOW_COPY_DATA 0x144064
#define STATUS_PENDING 0x103
    rv = NtFsControlFile(hFile, hEvent, NULL, NULL, &status, FSCTL_GET_SHADOW_COPY_DATA, NULL, 0, buff, max_theoretical_size);
    if (rv == STATUS_PENDING)
    {
        WaitForSingleObject(hEvent, INFINITE);
        rv = status.Status;
    }
    if (rv)
    {
        printf("NtFsControlFile (NTSTATUS): 0x%x\n", rv);
        return 0;
    }
    printf("%d %d %d\n", buff2[0], buff2[1], buff2[2]);
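The three numbers printed at the end are the header of the output buffer, and they are what makes the two-call pattern described earlier possible. Here is a small, portable sketch of how that header could be interpreted. The field meanings are an assumption on my part, following the layout of the SRV_SNAPSHOT_ARRAY structure from the SMB2 protocol documentation, and the names snapshot_header, parse_snapshot_header and required_buffer_size are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header of the FSCTL_GET_SHADOW_COPY_DATA output buffer. */
struct snapshot_header {
    uint32_t total;      /* snapshots that exist for the volume */
    uint32_t returned;   /* snapshot names present in this buffer */
    uint32_t array_size; /* bytes needed to hold all the names */
};

/* Copy the three counters out of a raw response buffer.
 * Returns 0 on success, -1 if the buffer is too small. */
int parse_snapshot_header(const void *buf, size_t len, struct snapshot_header *out)
{
    assert(buf && out);
    if (len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}

/* Bytes to allocate for a second call: header plus all the names. */
size_t required_buffer_size(const struct snapshot_header *h)
{
    return sizeof *h + h->array_size;
}
```

Under this assumption, a first call with a small buffer would report how many snapshots exist and how much space their names need, and a second call with a buffer of required_buffer_size bytes would return the names themselves.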

Okay, so maybe the entire call is broken altogether. Indeed, if we craft a replacement for NtFsControlFile, for when FsControlCode is set to FSCTL_GET_SHADOW_COPY_DATA, that uses the Volume Shadow Copy Service APIs instead of this device IO control call and run the program as administrator, we do get the snapshots. It is interesting to note that on my machine running Windows 11 22000.856 this method returned all the snapshots that both vssadmin list shadows and ShadowCopyView listed, while the original NtFsControlFile call returned fewer snapshots, for some reason. I compared the returned snapshots, but couldn’t find anything relevant regarding the missing ones.

#include 
#include 
#include 
#include 

typedef interface IVssBackupComponents IVssBackupComponents;

typedef struct IVssBackupComponentsVtbl
{
    BEGIN_INTERFACE

    HRESULT(STDMETHODCALLTYPE* QueryInterface)(
        __RPC__in IVssBackupComponents* This,
        /* [in] */ __RPC__in REFIID riid,
        /* [annotation][iid_is][out] */
        _COM_Outptr_  void** ppvObject);

    ULONG(STDMETHODCALLTYPE* AddRef)(
        __RPC__in IVssBackupComponents* This);

    ULONG(STDMETHODCALLTYPE* Release)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* GetWriterComponentsCount)(
        __RPC__in IVssBackupComponents* This,
        /* [out] */ _Out_ UINT* pcComponents);

    HRESULT(STDMETHODCALLTYPE* GetWriterComponents)(
        __RPC__in IVssBackupComponents* This,
        /* [in] */ _In_ UINT iWriter,
        /* [out] */ _Out_ void** ppWriter);

    HRESULT(STDMETHODCALLTYPE* InitializeForBackup)(
        __RPC__in IVssBackupComponents* This,
        /* [in_opt] */ _In_opt_ BSTR bstrXML);

    HRESULT(STDMETHODCALLTYPE* g6)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g7)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g8)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g9)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g10)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g11)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g12)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g13)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g14)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g15)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g16)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g17)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g18)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g19)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g20)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g21)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g22)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g23)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g24)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g25)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g26)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g27)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g28)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g29)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g30)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g31)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g32)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g33)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g34)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* SetContext)(
        __RPC__in IVssBackupComponents* This,
        _In_ LONG lContext);

    HRESULT(STDMETHODCALLTYPE* g36)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g37)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g38)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g39)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g40)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g41)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* g42)(
        __RPC__in IVssBackupComponents* This);

    HRESULT(STDMETHODCALLTYPE* Query)(
        __RPC__in IVssBackupComponents* This,
        _In_ REFIID        QueriedObjectId,
        _In_ INT64    eQueriedObjectType,
        _In_ INT64    eReturnedObjectsType,
        _Out_ IVssEnumObject** ppEnum);

    END_INTERFACE
} IVssBackupComponentsVtbl;

interface IVssBackupComponents // : IUnknown
{
    CONST_VTBL struct IVssBackupComponentsVtbl* lpVtbl;
};

NTSTATUS MyNtFsControlFile(
    HANDLE           FileHandle,
    HANDLE           Event,
    PVOID            ApcRoutine,
    PVOID            ApcContext,
    PIO_STATUS_BLOCK IoStatusBlock,
    ULONG            FsControlCode,
    PVOID            InputBuffer,
    ULONG            InputBufferLength,
    PVOID            OutputBuffer,
    ULONG            OutputBufferLength
)
{
    // Resolve the real NtFsControlFile; NTAPI (__stdcall) matters on 32-bit builds.
    NTSTATUS(NTAPI* NtFsControlFile)(
        HANDLE           FileHandle,
        HANDLE           Event,
        PVOID            ApcRoutine,
        PVOID            ApcContext,
        PIO_STATUS_BLOCK IoStatusBlock,
        ULONG            FsControlCode,
        PVOID            InputBuffer,
        ULONG            InputBufferLength,
        PVOID            OutputBuffer,
        ULONG            OutputBufferLength
        ) = (void*)GetProcAddress(GetModuleHandleW(L"ntdll"), "NtFsControlFile");

    if (FsControlCode == FSCTL_GET_SHADOW_COPY_DATA)
    {
        HRESULT hr = S_OK;
        DWORD sz = sizeof(L"@GMT-YYYY.MM.DD-HH.MM.SS");
        hr = CoInitialize(NULL);

        BY_HANDLE_FILE_INFORMATION fi;
        ZeroMemory(&fi, sizeof(BY_HANDLE_FILE_INFORMATION));
        GetFileInformationByHandle(FileHandle, &fi);

        GUID zeroGuid;
        memset(&zeroGuid, 0, sizeof(zeroGuid));
        HMODULE hVssapi = LoadLibraryW(L"VssApi.dll");
        if (!hVssapi) return STATUS_INSUFFICIENT_RESOURCES;

        FARPROC CreateVssBackupComponents = GetProcAddress(hVssapi, "?CreateVssBackupComponents@@YGJPAPAVIVssBackupComponents@@@Z");
        if (!CreateVssBackupComponents) CreateVssBackupComponents = GetProcAddress(hVssapi, "?CreateVssBackupComponents@@YAJPEAPEAVIVssBackupComponents@@@Z");
        if (!CreateVssBackupComponents) CreateVssBackupComponents = GetProcAddress(hVssapi, (LPCSTR)14);
        if (!CreateVssBackupComponents) return STATUS_INSUFFICIENT_RESOURCES;

        FARPROC VssFreeSnapshotProperties = GetProcAddress(hVssapi, "VssFreeSnapshotProperties");
        if (!VssFreeSnapshotProperties) return STATUS_INSUFFICIENT_RESOURCES;

        IVssBackupComponents* pVssBackupComponents = NULL;
        hr = CreateVssBackupComponents(&pVssBackupComponents);
        if (!pVssBackupComponents) return STATUS_INSUFFICIENT_RESOURCES;

        if (SUCCEEDED(hr = pVssBackupComponents->lpVtbl->InitializeForBackup(pVssBackupComponents, NULL)))
        {
            if (SUCCEEDED(hr = pVssBackupComponents->lpVtbl->SetContext(pVssBackupComponents, VSS_CTX_ALL)))
            {
                IVssEnumObject* pEnumObject = NULL;
                if (SUCCEEDED(hr = pVssBackupComponents->lpVtbl->Query(pVssBackupComponents, &zeroGuid, VSS_OBJECT_NONE, VSS_OBJECT_SNAPSHOT, &pEnumObject)))
                {
                    ULONG cnt = 0;
                    VSS_OBJECT_PROP props;
                    DWORD* data = calloc(3, sizeof(DWORD));
                    while (pEnumObject->lpVtbl->Next(pEnumObject, 1, &props, &cnt), cnt)
                    {
                        DWORD sn = 0;
                        GetVolumeInformationW(props.Obj.Snap.m_pwszOriginalVolumeName, NULL, 0, &sn, NULL, NULL, NULL, 0);
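                        if (fi.dwVolumeSerialNumber != sn)
                        {
                            // Snapshot of another volume: release its properties before skipping it.
                            VssFreeSnapshotProperties(&props.Obj.Snap);
                            continue;
                        }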

                        data[0]++;
                        SYSTEMTIME SystemTime;
                        ZeroMemory(&SystemTime, sizeof(SYSTEMTIME));
                        BOOL x = FileTimeToSystemTime(&props.Obj.Snap.m_tsCreationTimestamp, &SystemTime);
                        WCHAR Buffer[MAX_PATH];
                        swprintf_s(
                            Buffer,
                            MAX_PATH,
                            L"@GMT-%4.4d.%2.2d.%2.2d-%2.2d.%2.2d.%2.2d",
                            SystemTime.wYear,
                            SystemTime.wMonth,
                            SystemTime.wDay,
                            SystemTime.wHour,
                            SystemTime.wMinute,
                            SystemTime.wSecond);
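                        void* new_data = realloc(data, 3 * sizeof(DWORD) + data[0] * sz);
                        if (new_data) data = new_data;
                        else
                        {
                            // Out of memory: undo the count bump and release this snapshot's properties.
                            data[0]--;
                            VssFreeSnapshotProperties(&props.Obj.Snap);
                            break;
                        }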
                        memcpy_s((char*)data + 3 * sizeof(DWORD) + (data[0] - 1) * sz, sz, Buffer, sz);
                        VssFreeSnapshotProperties(&props.Obj.Snap);
                    }
                    void* new_data = realloc(data, 3 * sizeof(DWORD) + data[0] * sz + 2 * sizeof(WCHAR));
                    if (new_data)
                    {
                        DWORD* OutBuf = OutputBuffer;
                        data = new_data;
                        *(WCHAR*)((char*)data + 3 * sizeof(DWORD) + data[0] * sz) = 0;
                        *(WCHAR*)((char*)data + 3 * sizeof(DWORD) + data[0] * sz + sizeof(WCHAR)) = 0;
                        data[2] = data[0] * sz + 2;
                        if (OutputBufferLength < data[2] + 3 * sizeof(DWORD))
                        {
                            data[1] = 0;
                            OutBuf[0] = data[0];
                            OutBuf[1] = data[1];
                            OutBuf[2] = data[2];
                        }
                        else
                        {
                            data[1] = data[0];
                            OutBuf[0] = data[0];
                            OutBuf[1] = data[1];
                            OutBuf[2] = data[2];
                            memcpy_s(OutBuf + 3, data[2], data + 3, data[2]);
                        }
                        printf("%lu %lu %lu\n", OutBuf[0], OutBuf[1], OutBuf[2]);
                    }
                    free(data);
                    pEnumObject->lpVtbl->Release(pEnumObject);
                }
            }
        }

        pVssBackupComponents->lpVtbl->Release(pVssBackupComponents);

        return 0;
    }
    return NtFsControlFile(FileHandle, Event, ApcRoutine, ApcContext, IoStatusBlock, FsControlCode, InputBuffer, InputBufferLength, OutputBuffer, OutputBufferLength);
}
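To make the buffer layout easier to follow, here is a minimal Python sketch of what the function above writes into OutputBuffer: three DWORDs (snapshot count, returned count, size in bytes of the name array), followed by the @GMT tokens as NUL-terminated UTF-16 strings and one closing NUL. The names pack_snapshots and unpack_snapshots are hypothetical helpers of mine, and the layout is only what can be read off the C code above, so treat it as an illustration rather than a specification.

```python
import struct

TOKEN_CHARS = len("@GMT-YYYY.MM.DD-HH.MM.SS") + 1  # 24 characters plus the NUL
TOKEN_BYTES = TOKEN_CHARS * 2                       # UTF-16LE: 50 bytes per entry
HEADER = struct.Struct("<3I")                       # total, returned, array size


def pack_snapshots(tokens, out_len):
    """Build the reply for a caller buffer of out_len bytes."""
    array = b"".join((t + "\0").encode("utf-16-le") for t in tokens)
    array += "\0".encode("utf-16-le")               # final NUL closing the list
    total = len(tokens)
    if out_len < HEADER.size + len(array):
        # Too small: report the count and the required size, return no names.
        return HEADER.pack(total, 0, len(array))
    return HEADER.pack(total, total, len(array)) + array


def unpack_snapshots(buf):
    """Parse a reply; returns (total, list of returned tokens)."""
    total, returned, size = HEADER.unpack_from(buf, 0)
    if returned == 0:
        return total, []
    text = buf[HEADER.size:HEADER.size + size].decode("utf-16-le")
    return total, [t for t in text.split("\0") if t]
```

A caller that gets back a returned count of 0 with a non-zero total is expected to retry with a buffer of at least 12 bytes plus the reported array size.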

In fact, if I run File Explorer under the built-in Administrator account (necessary so that calls using the VSS API work, specifically CreateVssBackupComponents) and IAT patch the call to NtFsControlFile so that it goes to the function above (of course, I do this by modifying ExplorerPatcher), I do indeed get the snapshots to show under the “Previous versions” tab, but only if I set ShowAllPreviousVersions (DWORD) to 1 under HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer.

What is ShowAllPreviousVersions? It’s a setting that tells the window whether to display all snapshots or filter only the ones that actually contain the file that we are querying. The check is performed in twext.dll in BuildSnapshots:

    if ( (unsigned int)SHGetRestriction(
                         L"Software\\Microsoft\\Windows\\CurrentVersion\\Explorer",
                         0i64,
                         L"ShowAllPreviousVersions") == 1 )

Okay, so it seems that this solves the problem. Well, not really, since trying to open files that are clearly in the snapshot we have selected produces an error (which also seems to read from some illegal buffer, as signified by the Chinese characters, but that’s a bug for some other time).

Looking a bit at the code, after the call to QuerySnapshotNames, there are calls to BuildSnapshots -> TestSnapshot -> GetFileAttributesExW. The first parameter to GetFileAttributesExW is a UNC-like path that references the snapshot we are looking into for the file, things like: \\localhost\C$\@GMT-2022.08.29-14.29.52\Users\Administrator\Downloads\ep_setup.exe.
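For illustration, this is roughly how such a path is put together from a local path and the snapshot creation time. It is a small Python sketch, not the code in twext.dll; gmt_token and snapshot_unc_path are names I made up, and I am assuming the timestamp in the token is in UTC, as the @GMT prefix suggests.

```python
from datetime import datetime, timezone
from pathlib import PureWindowsPath


def gmt_token(created):
    """Format a snapshot creation time (timezone-aware) as an @GMT token."""
    return created.astimezone(timezone.utc).strftime("@GMT-%Y.%m.%d-%H.%M.%S")


def snapshot_unc_path(local_path, created, host="localhost"):
    """Map a local path such as C:\\Users\\x to its snapshot UNC path."""
    p = PureWindowsPath(local_path)
    drive_letter = p.drive.rstrip(":")            # "C:" -> "C"
    share = f"\\\\{host}\\{drive_letter}$"        # administrative share, e.g. \\localhost\C$
    rest = "\\".join(p.parts[1:])                 # components after the drive root
    return "\\".join(part for part in (share, gmt_token(created), rest) if part)
```

Passing just the drive root (C:\) gives the path of the snapshot itself, which is the kind of path one would paste in “Run” to browse it.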

Traditionally, we can explore the contents of a snapshot using File Explorer by visiting a path like: \\localhost\C$\@GMT-2022.08.29-14.29.52. Also, opening a path like \\localhost\C$\@GMT-2022.08.29-14.29.52\Users\Administrator\Downloads\ep_setup.exe using “Run”, for example, will launch the file in the associated application, or simply execute it, if it’s an executable. But under build 22622, this is also broken. For example, trying to visit the folder \\localhost\C$\@GMT-2022.08.29-14.29.52 (by pasting the line in “Run” and hitting Enter) yields an error.

At this point, I kind of stopped my investigation. It’s clear that something big changed in the backend. It seems like calls that interact with the snapshots using any means other than the VSS API, so things like NtFsControlFile with FSCTL_GET_SHADOW_COPY_DATA, or GetFileAttributesExW on a path like \\localhost\C$\@GMT-2022.08.29-14.29.52\Users\Administrator\Downloads\ep_setup.exe, somehow get denied along the way. This may stem from the wish to close off programmatic access to snapshot information to applications that run under a low privilege context, as the VSS API only works when the app runs as an administrator. If that’s the case, it’s not necessarily bad, but it seems it also broke a lot of stuff which really should be fixed. The thing is, it’s kind of expected for File Explorer to run under a standard user account and have the users use it to view and restore previous versions of their files originating from shadow copies. There’s been a recent vulnerability related to VSS (here), maybe the fix for this had something to do with this change in behavior?

I would really appreciate an official resolution on the matter, as things are pretty broken at the moment, unfortunately. I really like the functionality offered by “Previous versions” powered by shadow copies and would like to see it continue to work in the traditional UI offered with past Windows 10 and 11 releases.

Shadow copies under Windows 10 and Windows 11

2022-08-28T00:00:00+00:00

Ever since I got into the world of managing servers, working with VMs etc, I have kind of grown accustomed to being able to quickly restore the systems I work with to a previous point in time. But while I have this confidence with most of the tools I manage (the company servers, the hypervisors I use), the OS has always been a sore spot for me - I was always scared of some program installing and then messing up my Windows install in some irreversible way.

It was only relatively recently that I discovered that Windows does indeed support snapshots on its default file system (NTFS). And not only on recent versions, but for almost 20 years now, ever since the days of Windows XP. It’s called shadow copies and saw its prime time during the Windows Vista/7 era, only to see it kind of dismissed with the launch of Windows 8 for reasons I do not really understand. This feature is EXTREMELY useful to have - not only can you restore the entire C: drive in case some software messes it up, but you can also restore previous versions of the files you have been working on. This is immensely useful and time saving, so since I have discovered this, I really cannot live without it.

The journey began when I wondered how to have the “Previous versions” tab (hosted by twext.dll; tw stands for “timewarp”) in the Properties sheet of an item from File Explorer actually populated with previous versions. Items displayed in there can come from one of these providers:

Volume Shadow Copy (“snapshot”) - Windows’ disk snapshots functionality
Windows 7 Backup / Restore (“safedocs”) - the legacy backup solution in Windows
File History (“fhs”) - Windows 8-era replacement for safedocs
System Restore points

Despite that, the instructions in the window say that files displayed there only come from File History or restore points; File History creates backups of your files on an external drive at specified intervals. This works, but it’s always been extremely buggy, with Windows sometimes failing to recognize the backups once you reinstalled the OS, plus some random slowdowns when copying files, or even the feature not being able to make an initial snapshot. Plus, this worked for specific folders, not entire drives. I think even Microsoft knew how poor and laughable their implementation was, plus they surely do anything to push you over to storing backups in OneDrive, so, with Windows 11, they completely removed the UI for configuring this via the Settings app (as with other things), putting a final nail in the coffin for this feature that never worked right (there is remaining UI in the legacy Control Panel, but that alone can’t be used to configure all aspects of this feature).

But as I have discovered, snapshots taken using the volume shadow copy service also generate entries in the “Previous versions” tab. Cool! So it means that all we have to do is enable snapshots for our volumes and we’ll start seeing entries there, right?

Well, yeah. Unfortunately, due to another questionable Microsoft move, they have completely removed the UI for configuring shadow copies from client Windows (it’s still available in server SKUs). But the functionality is still there, they just don’t deem it important, for some reason, for regular users.

So, because we have no UI for configuring a schedule for snapshots, I have resorted to setting up a scheduled task that runs the following command every 15 minutes:

wmic shadowcopy call create Volume=C:\

What this does is create a snapshot of drive C: every 15 minutes. By policy, snapshots cannot take more than 10 seconds, and I experience no slowdowns anyway, so I am fine with executing them this often. This means I can lose, at worst, a maximum of 15 minutes of saved work, which sounds really good to me.

This is enough to get you started. Even if you stop reading here and return at a later time, you’ll have the basics covered. Read on to learn more.

Increase the number of snapshots

By default, Windows is pretty conservative in the number of snapshots you can take (64) before older snapshots are deleted. The system allows for a more reasonable maximum of 512, which you can configure by setting MaxShadowCopies (DWORD) under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\VSS\Settings as per these instructions. Reboot after making the changes.
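To put those numbers in perspective, here is a back-of-the-envelope calculation of how far back the history reaches with a snapshot every 15 minutes. It is only a sketch: history_window_hours is a made-up helper, and it assumes the task never misses a run and that no snapshot is deleted early because of the storage quota.

```python
def history_window_hours(max_snapshots, interval_minutes):
    """Hours of history kept when the oldest snapshot is dropped once the limit is hit."""
    return max_snapshots * interval_minutes / 60


default_window = history_window_hours(64, 15)    # 16.0 hours
raised_window = history_window_hours(512, 15)    # 128.0 hours, a bit over 5 days
```

So with the default limit, a 15 minute schedule does not even cover a full day, which is why raising MaxShadowCopies makes sense here.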

Increase the disk quota snapshots can take

By default, snapshot contents are stored on the same drive you are snapshotting. You can increase the maximum space snapshots can take on your hard drive by issuing a command like this in an administrative prompt:

vssadmin Resize ShadowStorage /For=C: /On=C: /MaxSize=UNBOUNDED

The command above removes any quota on the space snapshots can take - be careful, this can quickly fill your disk; you can also specify something like 100GB instead of UNBOUNDED there and so on. Find out more about the command by running: vssadmin Resize ShadowStorage.

View all snapshots

You may notice, depending on your usage, that some snapshots are not displayed in the “Previous versions” tab (only 64 of them or so may be displayed). It’s indeed a display bug that sometimes happens; to consult the entire list, use:

vssadmin List Shadows

Alternatively, you can use a full featured third party GUI like ShadowExplorer or ShadowCopyView.

Do note that under Windows 11 22H2-based builds (22621+), this functionality seems to be broken altogether; read more about my investigation regarding this here.
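If you want to process that list in a script (for example, to pick the snapshot GUID for a given drive), a small parser is enough. The sketch below is in Python and assumes the English output of vssadmin List Shadows, where each snapshot has a “Shadow Copy ID: {GUID}” line followed by an “Original Volume: (C:)...” line; parse_shadows is a hypothetical helper, and on a localized Windows the labels will differ.

```python
import re

SHADOW_ID = re.compile(r"Shadow Copy ID:\s*(\{[0-9a-fA-F-]{36}\})")
VOLUME = re.compile(r"Original Volume:\s*\((\w:)\)")


def parse_shadows(output):
    """Return a list of {"id": ..., "volume": ...} dicts, in listing order."""
    shadows = []
    for line in output.splitlines():
        match = SHADOW_ID.search(line)
        if match:
            # A new snapshot entry starts; its volume line follows below it.
            shadows.append({"id": match.group(1), "volume": None})
            continue
        match = VOLUME.search(line)
        if match and shadows:
            shadows[-1]["volume"] = match.group(1)
    return shadows


def ids_for_volume(output, drive):
    """Snapshot GUIDs whose original volume is the given drive, e.g. "C:"."""
    return [s["id"] for s in parse_shadows(output) if s["volume"] == drive.upper()]
```

The text to feed it would come from running vssadmin List Shadows in an elevated prompt and capturing its output.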

Restore an entire drive

This is where things get a bit more complicated. But, first of all, do not right click a drive and go to Properties - Previous versions and attempt to restore from there. It’s inefficient, takes a ton of time and will likely mess up your system. That feature basically copies files from the snapshot over your current files, which is very unlikely to work properly, so do not go that route. You have been warned.

Instead, remember how I told you that Microsoft ‘intelligently’ removed the GUI portion of shadow copies from client SKUs? Well, not only that; you see, vssadmin under client SKUs offers fewer options than it does under server SKUs.

For example, here’s vssadmin /? under Windows 11:

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

---- Commands Supported ----

Delete Shadows        - Delete volume shadow copies
List Providers        - List registered volume shadow copy providers
List Shadows          - List existing volume shadow copies
List ShadowStorage    - List volume shadow copy storage associations
List Volumes          - List volumes eligible for shadow copies
List Writers          - List subscribed volume shadow copy writers
Resize ShadowStorage  - Resize a volume shadow copy storage association

And here it is under Windows Server 2022:

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2013 Microsoft Corp.

---- Commands Supported ----

Add ShadowStorage     - Add a new volume shadow copy storage association
Create Shadow         - Create a new volume shadow copy
Delete Shadows        - Delete volume shadow copies
Delete ShadowStorage  - Delete volume shadow copy storage associations
List Providers        - List registered volume shadow copy providers
List Shadows          - List existing volume shadow copies
List ShadowStorage    - List volume shadow copy storage associations
List Volumes          - List volumes eligible for shadow copies
List Writers          - List subscribed volume shadow copy writers
Resize ShadowStorage  - Resize a volume shadow copy storage association
Revert Shadow         - Revert a volume to a shadow copy
Query Reverts         - Query the progress of in-progress revert operations.

Again, why they decided to do this is beyond my understanding, probably some business decision. Fortunately, as described in this Reddit post, you can trick vssadmin into thinking it’s running under a server SKU by using WinPrivCmd, a redirector for some API calls programs use to find various information from Windows, including whether the program runs under a client or server SKU. To get the server vssadmin, run this:

winprivcmd /ServerEdition vssadmin /?

If the drive you’re trying to restore is not the system drive, then you can use a command like this one to revert to a snapshot of it:

vssadmin Revert Shadow /Shadow={c5946237-af12-3f23-af80-51aadb3b20d5} /ForceDismount

Again, you can identify the exact snapshot GUID using vssadmin List Shadows. Actually, I don’t really know if this works from client SKUs anyway, even if you trick it into recognizing it like this - I don’t know if the backend supports it. Anyway, read on for a workaround that works for all scenarios, including your system drive.

First of all, you can’t restore a snapshot on the system drive from an online system. To work around this, I recommend booting into a Windows Server-based live image. The easiest way to do this is to grab a pen drive and use Rufus to create a Windows To Go installation on it using a Windows Server ISO. Then, boot that. When making the bootable drive, make sure to uncheck the “Prevent Windows To Go from accessing internal disks” option when clicking “Start” in Rufus.

Then, there are a couple of prerequisites for the revert operation to work. I skipped the documentation and looked directly at the disassembly of the VSSVC.exe executable (the Volume Shadow Copy service):

bool __fastcall CVssCoordinator::IsRevertableVolume(CVssCoordinator *this, const unsigned __int16 *a2)
{
  CVssCoordinator *v3; // rcx
  char v4; // bl
  CVssCoordinator *v5; // rcx
  CVssCoordinator *v6; // rcx

  v4 = 0;
  if ( !CVssCoordinator::IsSystemVolume(this, a2)
    && !CVssCoordinator::IsBootVolume(v3, a2)
    && !CVssCoordinator::IsPagefileVolume(v5, a2)
    && !CVssCoordinator::IsSharedClusterVolume(v6, a2) )
  {
    return 1;
  }
  return v4;
}

So, basically, you cannot restore a system or booted volume, so that’s why you can’t restore the C: drive while the OS is running. Kind of makes sense. You also cannot restore it if it’s a shared cluster volume; I don’t know exactly what that is, but on regular configurations, it’s not. All that remains is not to have a page file on the volume. That’s easy to get around: in Start, search for “advanced system settings”, under “Performance”, click “Settings”, “Advanced”, “Virtual memory” section, “Change”, untick “Automatically manage paging file size for all drives”, then select your “C:” drive, check “No paging file”, click “Set”, then “OK”, “OK”, “OK” and reboot the system.
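Put differently, the check boils down to four conditions that must all be false. Here it is restated as a short Python sketch; the Volume class and its field names are mine, chosen to mirror the function names in the disassembly, and the real service obviously determines these properties itself.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    is_system: bool = False          # holds the boot manager
    is_boot: bool = False            # holds the running Windows installation
    has_pagefile: bool = False       # a page file lives on it
    is_shared_cluster: bool = False  # shared cluster volume


def blocking_reasons(volume):
    """Names of the conditions that prevent a revert; empty when it is allowed."""
    checks = {
        "system volume": volume.is_system,
        "boot volume": volume.is_boot,
        "pagefile volume": volume.has_pagefile,
        "shared cluster volume": volume.is_shared_cluster,
    }
    return [name for name, blocked in checks.items() if blocked]


def is_revertable(volume):
    """Same outcome as IsRevertableVolume: true only if nothing blocks the revert."""
    return not blocking_reasons(volume)
```

This is why the plan is to boot another OS (so the target is no longer the system or boot volume) and to disable the page file beforehand.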

Most of these checks are also performed in VSSUI.dll, which is the DLL that hosts the UI for the shadow copies tab in Windows Server, in the IsValidRevertVolume function. Now, with the page file disabled, you’re ready to boot into the Windows Server environment.

A quick note before that: you can also edit the VSSVC.exe and VSSUI.dll binaries and nullify those checks. Does it work then? Can it work from a live booted OS restoring the system drive itself? I don’t know, I haven’t tried it, since I was doing all of this on my main workstation which I was trying to revert, but hopefully one day I will play more with this under a VM and see what happens when doing this.

Now that you have the Windows Server environment, boot into it. When you get to the desktop, open File Explorer and identify your main OS system drive. Right click on it and choose “Configure Shadow Copies”.

In there, select your drive in the list and identify the snapshot you want to revert to, then click the “Revert” button. A pop-up window will ask you to confirm the operation. Check the box, then click to proceed. After a brief freeze of the interface, the restore operation will commence. It shouldn’t take long - it took around 20 minutes when I restored a snapshot on my NVMe drive filled with around 400GB, although it of course varies based on the hardware that you have.

You can also monitor the progress in the command line using this:

vssadmin Query Reverts

Conclusion

Shadow Copies are a VERY useful feature. I am actually surprised they aren’t much more popular. They indeed may require quite a bit of space, depending on your configured options, but overall, the peace of mind and convenience they bring make them worth using despite the artificial roadblocks imposed by Microsoft. I hope the functionality will continue to live on in future Windows versions, considering how it lived and did its job just fine for the past 20 years, despite alarming signs from Microsoft in their latest OS versions.

Also, keep in mind that these are not a replacement for a backup system! They are, though, a complement to your computing habits, and came in handy to me: I researched this article while trying to revert back to a config of my system before I messed it up trying various programs and hacks to get my WhatsApp messages transferred from my iPhone to Android/Windows Subsystem for Android, in order to archive them and free up the storage on my phone (they take quite a bit of space). Maybe I will tell you about this in a future story. For now, this will do, peace!

Set default apps in Windows 11 (restore the Windows 10 Settings app in Windows 11)

2022-05-24T00:00:00+00:00

I just got fed up with the new idiotic way in which Microsoft wants to force us to set defaults in Windows 11. Instead of the good old Settings - Apps - Default apps page from Windows 10 that easily let you assign an app for an entire category (which is pretty logical, for example, you could set the default music player in a pinch, which immediately associated the app with mp3, wav etc files), you now have to go through each extension and assign it the app that you want, or go to the app and manually assign each extension to it. I mean, why?

Other than that, the Windows 11 Settings app is fine, it looks nice, but this omission is a huge downgrade. People have resorted to all sorts of hacks, and even big players like Firefox have hacked their way into bypassing Microsoft’s protections because the current way for setting defaults is just awful.

Well, here I am again, fixing Microsoft’s OS. This time, I decided enough is enough and looked into a way to bring back the Settings app from Windows 10 which thankfully included the beloved “Default apps” page.

How to?

This time, you’ll have to do things manually; I have yet to decide on a way to automate this, if any - suggestions are welcome in the comments section; should this be integrated in ExplorerPatcher, and if so, how to go around the technical details? Anyway, for the tutorial, read on.

You’ll need the following things from build 22000.1, the single public build of Windows 11 that ever shipped with the legacy Settings app:

the C:\Windows\ImmersiveControlPanel folder; this is the main folder where the UWP Settings app lives
the C:\Windows\SystemResources\Windows.UI.SettingsAppThreshold folder; this is the folder that contains the resources used by the Settings app
the C:\Windows\System32\SettingsEnvironment.Desktop.dll file

If you stop at only the first 2 things above, the legacy Settings app will start, but none of the built-in pages will actually work. If you connect with a debugger to the running SystemSettings.exe process, some first-chance exceptions will be thrown, and a message from the Windows Runtime along the lines of WinRT information: Cannot find a resource with the given key: DefaultEntityItemSettingPageTemplate will be mentioned. Needless to say, the debug info is totally worthless; only by looking at the modules list and testing file by file afterwards was I able to successfully determine what was still required to be brought over, i.e. this file.

So, on a live system:

Take ownership of C:\Windows\ImmersiveControlPanel and rename it to ImmersiveControlPanelO, and paste the C:\Windows\ImmersiveControlPanel folder from 22000.1 in there.
Again, Windows is bugged up and won’t let you replace the C:\Windows\SystemResources\Windows.UI.SettingsAppThreshold folder, even after taking ownership of it. I don’t really understand what is wrong with this OS, but the solution is to manually rename the 3 items inside (pris to prisO, SystemSettings to SystemSettingsO and Windows.UI.SettingsAppThreshold.pri to Windows.UI.SettingsAppThreshold.priO) and copy in there the corresponding 3 items from 22000.1.
Instead of replacing the SettingsEnvironment.Desktop.dll in C:\Windows\System32, I recommend copying it from 22000.1 in C:\Windows\ImmersiveControlPanel. This DLL is loaded in a “classic” manner so to speak, so the loader looks first in the application folder, and then in the system directories; thus, the DLL placed in C:\Windows\ImmersiveControlPanel can override the one in C:\Windows\System32.
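The rename-then-copy dance from the second step can be sketched in a few lines of Python. This is just an illustration of the idea: swap_in is a hypothetical helper, it assumes you already took ownership of the items and run it elevated, and it makes no attempt to recover if something fails halfway.

```python
import shutil
from pathlib import Path


def swap_in(target_dir, source_dir, suffix="O"):
    """For each item in source_dir, rename the same-named item in target_dir
    by appending suffix (pris -> prisO), then copy the replacement in.
    Returns the names of the renamed originals."""
    target_dir, source_dir = Path(target_dir), Path(source_dir)
    renamed = []
    for item in sorted(source_dir.iterdir()):
        current = target_dir / item.name
        if current.exists():
            backup = current.with_name(current.name + suffix)
            current.rename(backup)          # keep the original around as a backup
            renamed.append(backup.name)
        if item.is_dir():
            shutil.copytree(item, target_dir / item.name)
        else:
            shutil.copy2(item, target_dir / item.name)
    return renamed
```

Here target_dir would be C:\Windows\SystemResources\Windows.UI.SettingsAppThreshold and source_dir the same folder taken from 22000.1; undoing the change means deleting the copied items and removing the O suffix again.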

That’s it. If you’ve done everything correctly, open the Settings app using Start, for example - you should be greeted by the legacy Settings app from Windows 10:

To spare you having to spin up a VM and gather the required files, you can download the following archive which contains the 2 folders from 22000.1: C:\Windows\ImmersiveControlPanel (already contains the proper SettingsEnvironment.Desktop.dll file in it as well), and C:\Windows\SystemResources\Windows.UI.SettingsAppThreshold. All you have to do is put these in place of the built-in folders that you have, taking backups of the originals first, of course.

Needless to say, this will probably be reverted by Windows Update, but nevertheless, it’s a solution for at least doing some maintenance tasks that you can’t really do with the new Settings app. As I said, we can discuss in the comments section about a proper way to automate this, if interested.

If you stumble upon any problem and would like to restore a clean slate, you can run the following command and restart the computer when it finishes:

sfc /scannow

Bonus

Yeah, since we restored the full Windows 10 Settings app, you can now use the other UIs that were also forgotten by Microsoft and never replicated in the new Settings app. For example, I could finally properly adjust File History settings (Settings - Update & Security - Backup) which is another omission from the new app.

Conclusion

What can I say, another stupid move by Microsoft. I am VERY glad that this hack works after all, but the fact that we needed it is just mind boggling. Hopefully, things will improve with the upcoming updates, but I honestly doubt it. Anyway, in the meantime, we have this.

Looking forward to hearing your thoughts in the comments below.

Upload desktop.ini when using Nextcloud

2022-04-12T00:00:00+00:00

Again, much time has passed since the last post here. Again, ExplorerPatcher is the “culprit”. Anyway, I have decided to take a small break off it, for now.

Without further ado, let’s begin.

Background

Recently, I have decided to find a solution to keep files synchronized between my home and work PC. I am using my own server for this, as I kind of hate storing personal files in the public clouds offered by various vendors, for obvious reasons. After going through the pitfalls of offline files (XP-era tech) and Work Folders (8.1-era tech), I decided those simply won’t cut it. It didn’t help that these features are largely forgotten and left by Microsoft in a zombie state in the latest iterations of the OS.

Anyway, having picked up Nextcloud next, I turned to their rather good (and open source) desktop client. Things went relatively smoothly, except I had this persisting problem: there is this desktop.ini file that Windows creates in various folders whenever you customize it in one way or another, and for some reason I have later learnt about, Nextcloud refuses to sync that, so it always remains in a pending state when looked at using File Explorer.

If you just see the pending icon on the parent folder, yet all files inside seem to have synced, then you might simply not be able to see the desktop.ini files. To show these, go to “Folder Options - View tab” and untick “Hide protected operating system files (Recommended)”.

Investigation

So, what about this? First, I used Google to search for stuff related to this. I came up with this link, which in turn was a mirror of the official repo and contains information about why they have chosen not to sync desktop.ini files: see, they create such a file to tell File Explorer that the Nextcloud folder should use a custom icon, yet they can’t sync that because the path where that folder is located may differ from one PC to another, depending on where you have Nextcloud installed. I mean, sure, they could just patch the file for each computer instead, or just enforce this restriction in the root folder, yet it seems they exclude files named like this in all the subdirectories, and that is the part I don’t really get.

Anyway, the next step was to look at the source code and see how this is handled at the moment. So:

git clone https://github.com/nextcloud/desktop nextcloud-desktop

Then, use grep to find where desktop.ini is mentioned:

grep -rn "desktop.ini"

It mentions these places:

ChangeLog:307:* Excludes: Hardcode desktop.ini
src/csync/csync_exclude.cpp:201:    /* Do not sync desktop.ini files anywhere in the tree. */
src/csync/csync_exclude.cpp:202:    const auto desktopIniFile = QStringLiteral("desktop.ini");

Of interest is csync_exclude.cpp. The mentioned lines look like this:

    /* Do not sync desktop.ini files anywhere in the tree. */
    const auto desktopIniFile = QStringLiteral("desktop.ini");
    if (blen == static_cast<qsizetype>(desktopIniFile.length()) && bname.compare(desktopIniFile, Qt::CaseInsensitive) == 0) {
        return CSYNC_FILE_SILENTLY_EXCLUDED;
    }

Of course, I could recompile the whole application in order to take that out, but for taking an if out, all the dance (setting up a build environment etc) is not really necessary. We could binary patch the existing files, a strategy I really love when it comes to these types of modifications. It’s much easier and takes way less time and effort, and it also teaches you things along the way.

So it’s time to identify where that piece of code ends up in the binaries that are actually shipped with the Windows client. Go to C:\Program Files\Nextcloud and look there. Of course, nextcloud_csync.dll seems like a good candidate, but had it not been so obvious, you could use the Strings tool from Sysinternals to look for the string in all the files in the folder, like so:

strings64.exe -o "C:\Program Files\Nextcloud\*" | findstr /i "desktop.ini"

The -o flag will have it print the offsets to the string in the file, like so (for nextcloud_csync.dll):

979304:/Desktop.ini
979712:Desktop.ini
979792:Remove Desktop.ini from
1184248:desktop.ini

Eventually, we conclude the “culprit” is indeed nextcloud_csync.dll. Of course, the next step is to load this in IDA and look at it.

After loading, search for desktop.ini (Alt+T) and we get these results:

By exploring those, we can see that the disassembly does not match anything that resembles the source code. So, what’s up, does IDA miss our string, does it not live in this file after all? To clear the mystery, I learnt about the “String” window in IDA (View - Open subviews - Strings). There, right click and choose “Setup…”, and in there you can choose to display “Unicode C-style (16 bits)” strings as well. With Ctrl+F afterwards, we can search for it in the String window and identify it.

How do I know it’s Unicode 16-bits? Well, you can tell by limiting strings64.exe to searching only for Unicode 16-bits strings (with the -u flag), in which case it will identify that. But IDA knows those, how did it not pick it up? Well, it doesn’t recognize it as a string, because it’s nowhere referenced in the disassembly. But how could that be, the source code clearly accesses that string. Well, not really…
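To make the ASCII versus 16-bit distinction concrete, here is a minimal sketch that looks for the same text in both encodings. The buffer is synthetic and merely stands in for the contents of a DLL; the helper name find_string_offsets is made up for illustration and has nothing to do with how strings64.exe is implemented.

```python
def find_string_offsets(data: bytes, text: str) -> dict:
    """Return every offset where `text` occurs as ASCII and as UTF-16LE."""
    results = {}
    for label, needle in (("ascii", text.encode("ascii")),
                          ("utf-16le", text.encode("utf-16-le"))):
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        results[label] = offsets
    return results


# Synthetic buffer (an assumption, not real DLL contents): 8 padding bytes,
# the 16-bit string, a terminator, then an ASCII string with a capital D.
sample = (b"\x00" * 8
          + "desktop.ini".encode("utf-16-le") + b"\x00\x00"
          + b"Desktop.ini\x00")

hits = find_string_offsets(sample, "desktop.ini")
print(hits)
```

A plain byte search for the ASCII form never matches the 16-bit copy, since every character there is followed by a zero byte.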

First, if we go to the string location we identified using the “Strings” window in IDA, it looks like this:

.data:00000001801237F8                 db  64h ; d
.data:00000001801237F9                 db    0
.data:00000001801237FA                 db  65h ; e
.data:00000001801237FB                 db    0
.data:00000001801237FC                 db  73h ; s
.data:00000001801237FD                 db    0
.data:00000001801237FE                 db  6Bh ; k
.data:00000001801237FF                 db    0
.data:0000000180123800                 db  74h ; t
.data:0000000180123801                 db    0
.data:0000000180123802                 db  6Fh ; o
.data:0000000180123803                 db    0
.data:0000000180123804                 db  70h ; p
.data:0000000180123805                 db    0
.data:0000000180123806                 db  2Eh ; .
.data:0000000180123807                 db    0
.data:0000000180123808                 db  69h ; i
.data:0000000180123809                 db    0
.data:000000018012380A                 db  6Eh ; n
.data:000000018012380B                 db    0
.data:000000018012380C                 db  69h ; i
.data:000000018012380D                 db    0
.data:000000018012380E                 db    0
.data:000000018012380F                 db    0

Why is that? The clue is right there in the source code: the string is defined as QStringLiteral("desktop.ini"), which is a specialized container for string data. The disassembly never accesses the string data directly; rather, as we can clearly see from the source code, it’s passed to helper implementations that do stuff for us, like .compare(), aka QStringRef::compare. What these helpers receive is a pointer to this QStringLiteral, so the address of where its data is in memory. And how does it look in memory? Well, it indeed contains the actual string, but that is preceded by a small header which holds, among other things, the length of the entire string. Thus, things like .length can be easily performed without iterating over the characters until reaching a \0, also enabling the possibility of strings that are not null terminated. The disassembly clearly tells us this story, actually:

.data:00000001801237E0 unk_1801237E0   db 0FFh                 ; DATA XREF: sub_18001F690:loc_18001F943↑o
.data:00000001801237E0                                         ; csync_is_windows_reserved_word(QStringRef const &)+276↑o
.data:00000001801237E1                 db 0FFh
.data:00000001801237E2                 db 0FFh
.data:00000001801237E3                 db 0FFh
.data:00000001801237E4 dword_1801237E4 dd 0Bh                  ; DATA XREF: sub_18001F690+2BE↑r
.data:00000001801237E8                 align 10h
.data:00000001801237F0                 db  18h
.data:00000001801237F1                 db    0
.data:00000001801237F2                 db    0
.data:00000001801237F3                 db    0
.data:00000001801237F4                 db    0
.data:00000001801237F5                 db    0
.data:00000001801237F6                 db    0
.data:00000001801237F7                 db 0
.data:00000001801237F8                 db  64h ; d
.data:00000001801237F9                 db    0
.data:00000001801237FA                 db  65h ; e
.data:00000001801237FB                 db    0
.data:00000001801237FC                 db  73h ; s
.data:00000001801237FD                 db    0
.data:00000001801237FE                 db  6Bh ; k
.data:00000001801237FF                 db    0
.data:0000000180123800                 db  74h ; t
.data:0000000180123801                 db    0
.data:0000000180123802                 db  6Fh ; o
.data:0000000180123803                 db    0
.data:0000000180123804                 db  70h ; p
.data:0000000180123805                 db    0
.data:0000000180123806                 db  2Eh ; .
.data:0000000180123807                 db    0
.data:0000000180123808                 db  69h ; i
.data:0000000180123809                 db    0
.data:000000018012380A                 db  6Eh ; n
.data:000000018012380B                 db    0
.data:000000018012380C                 db  69h ; i
.data:000000018012380D                 db    0
.data:000000018012380E                 db    0
.data:000000018012380F                 db    0
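The listing above can be decoded with a short sketch. It assumes the header follows the Qt 5 QArrayData layout on 64-bit (a 32-bit reference count, a 32-bit size, a 32-bit alloc word, 4 bytes of padding and a 64-bit offset to the data); that layout is my reading of the dump, not something confirmed against this exact build. The bytes are copied from the listing, with the span IDA shows as "align 10h" taken to be zeros.

```python
import struct

# Assumed layout: int ref, int size, uint alloc, 4 padding bytes, qptrdiff offset.
HEADER = struct.Struct("<iiI4xq")


def decode_static_qstring(blob: bytes, base: int = 0):
    """Decode a header plus UTF-16LE payload starting at `base`."""
    ref, size, _alloc, offset = HEADER.unpack_from(blob, base)
    start = base + offset                      # offset is relative to the header
    text = blob[start:start + size * 2].decode("utf-16-le")
    return ref, size, offset, text


# Bytes from 0x1801237E0 onwards, as shown in the listing.
blob = (bytes.fromhex("ffffffff" "0b000000" "00000000" "00000000"
                      "1800000000000000")
        + "desktop.ini".encode("utf-16-le") + b"\x00\x00")

ref, size, offset, text = decode_static_qstring(blob)
print(ref, size, hex(offset), text)
```

Under that assumption, the four 0xFF bytes are a reference count of -1, dword_1801237E4 is the length (11), and 0x18 is the distance from the start of the header to the first character.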

So, actually, IDA identified the beginning of the QStringLiteral instance, and called it unk_1801237E0. Where is that referenced in the disassembly? Well, exactly where the QStringLiteral is used, to the portion of the disassembly that corresponds to the source code I presented above. In pseudocode form, it looks like this:

      v25 = &unk_1801237E0;
      if ( v9 != dword_1801237E4 || (unsigned int)QStringRef::compare(&v19, &v25, 0i64) )
      {
        if ( v2 && OCC::Utility::isConflictFile(v3, v15) )
          v5 = 9;
        else
          v5 = 0;
      }

So, v9 != dword_1801237E4 basically says “if the length of the current string (v9) is different from the length of the desktop.ini string”. That length is stored at the memory location dword_1801237E4 from the listing above: right there near the beginning, after four 0xFF bytes, it contains 0xB, which is 11, as desktop.ini has 11 characters. Current string in this context means a string containing the name of the current file that the program is working with.

Then, (unsigned int)QStringRef::compare(&v19, &v25, 0i64) says “if the current string is not equal to the desktop.ini string, represented by that QStringLiteral instance”. If we want IDA to pick that string when searching text using Alt+T, we can define the portion where the string is located as a string by clicking on the beginning of it, then Alt+A and then choosing the “Unicode C-style (16 bits)” option.

The entire condition from the if check in the source code is negated, so to say. The statements are mathematically proven to be equivalent, you can look into De Morgan’s laws for that. Basically:

if (blen == static_cast<qsizetype>(desktopIniFile.length()) && bname.compare(desktopIniFile, Qt::CaseInsensitive) == 0)

is equivalent to NOT the following:

if ( v9 != dword_1801237E4 || (unsigned int)QStringRef::compare(&v19, &v25, 0i64) )

So, in the pseudocode generated from the disassembly, what’s inside the if branch is what would have followed in the original source code if we had not entered its if branch.
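The equivalence can also be checked by brute force. The sketch below models the two conditions with plain booleans and an integer compare result; the function names are made up for illustration.

```python
from itertools import product


def source_condition(same_length: bool, compare_result: int) -> bool:
    # blen == length && bname.compare(...) == 0
    return same_length and compare_result == 0


def decompiled_condition(same_length: bool, compare_result: int) -> bool:
    # v9 != dword_1801237E4 || QStringRef::compare(...)
    return (not same_length) or compare_result != 0


# Try every combination: lengths equal or not, compare result <0, 0 or >0.
all_negated = all(
    source_condition(s, c) == (not decompiled_condition(s, c))
    for s, c in product((True, False), (-1, 0, 1))
)
print(all_negated)
```

For every input, one condition holds exactly when the other does not, which is De Morgan’s law applied to "A and B".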

To patch this, we basically want the if check to disappear and always continue with what it has inside, equivalent to failing the if in the original source code and continuing with what came next there. To devise a patch, a look at the disassembly is required:

.text:000000018001F943 loc_18001F943:                          ; CODE XREF: sub_18001F690+268↑j
.text:000000018001F943                 lea     rax, unk_1801237E0
.text:000000018001F94A                 mov     [rbp+arg_18], rax
.text:000000018001F94E                 movsxd  rax, cs:dword_1801237E4
.text:000000018001F955                 cmp     r15, rax
.text:000000018001F958                 jnz     short loc_18001F96F
.text:000000018001F95A                 xor     r8d, r8d
.text:000000018001F95D                 lea     rdx, [rbp+arg_18]
.text:000000018001F961                 lea     rcx, [rbp+var_20]
.text:000000018001F965                 call    cs:?compare@QStringRef@@QEBAHAEBVQString@@W4CaseSensitivity@Qt@@@Z ; QStringRef::compare(QString const &,Qt::CaseSensitivity)
.text:000000018001F96B                 test    eax, eax
.text:000000018001F96D                 jz      short loc_18001F990
.text:000000018001F96F
.text:000000018001F96F loc_18001F96F:                          ; CODE XREF: sub_18001F690+2C8↑j
.text:000000018001F96F                 test    r12b, r12b
.text:000000018001F972                 jz      short loc_18001F98E
.text:000000018001F974                 mov     rcx, rsi        ; this
.text:000000018001F977                 call    ?isConflictFile@Utility@OCC@@YA_NAEBVQString@@@Z ; OCC::Utility::isConflictFile(QString const &)
.text:000000018001F97C                 test    al, al
.text:000000018001F97E                 jz      short loc_18001F98E
.text:000000018001F980                 mov     edi, 9
.text:000000018001F985                 jmp     short loc_18001F990

So, the first part of the if check is this:

.text:000000018001F958                 jnz     short loc_18001F96F

The jump to the area inside the brackets (loc_18001F96F) is taken only if the two lengths compared are not equal; instead, we always want to jump there.

How do the opcodes for that look? Of course, since the target is so close in the disassembly, a short conditional jump instruction is used: 0x75 0x15, where 0x75 means jnz and 0x15 is the offset relative to the end of the current instruction (that is, the address of the next one). To jump there every time, the patch is easy: turn 0x75 into 0xEB, which is an unconditional short jump. It’s very handy that only a single byte has to be patched after all.
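As a sanity check on the arithmetic, here is a small sketch that computes where the 2-byte jump lands before and after the patch. The address and bytes come from the disassembly above; the helper names are hypothetical.

```python
JNZ_SHORT = 0x75  # jnz rel8
JMP_SHORT = 0xEB  # jmp rel8


def short_jump_target(address: int, encoded: bytes) -> int:
    """Target of a 2-byte short jump located at `address`."""
    displacement = int.from_bytes(encoded[1:2], "little", signed=True)
    # The displacement counts from the end of the instruction.
    return address + len(encoded) + displacement


def make_unconditional(code: bytearray, offset: int) -> None:
    """Turn the short jnz at `offset` into an unconditional short jmp."""
    if code[offset] != JNZ_SHORT:
        raise ValueError("expected a short jnz at this offset")
    code[offset] = JMP_SHORT


code = bytearray([0x75, 0x15])          # jnz short loc_18001F96F
target_before = short_jump_target(0x18001F958, bytes(code))
make_unconditional(code, 0)
target_after = short_jump_target(0x18001F958, bytes(code))
print(hex(target_before), hex(target_after), code.hex())
```

The displacement byte is untouched, so the destination stays the same and only the condition goes away.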

So, I replaced that byte in a hex editor, closed Nextcloud, replaced the original file with the modified one, reopened Nextcloud, waited a bit, and yeah, indeed, the client now simply uploaded the desktop.ini files as well. Nice.

Future

How to find this easily in the future? By examining the source code, we can try to identify the spot based on the call to OCC::Utility::isConflictFile that comes right after the check. That function has 2 overloads; in the case of our source code, the one taking in a QString appears to be used, since path is of type QString, as defined in the function prototype. Of course, overloads only exist in the C++ world: when everything gets compiled, 2 actual methods are generated, with symbol names that reflect the arguments of each method, since the name alone is not enough anymore to distinguish the two, but we know that each overload has different argument types (void foo(char a) and void foo(char b) cannot coexist as two different functions in C++).

So, open the DLL in IDA, search for OCC::Utility::isConflictFile in the Functions window, pick the method that takes in a QString. Ctrl+X will find its references, and at the moment there are only 3 of them:

Out of those, as you can see, only one of them is from an actual method, so probably what we are interested in. If we go there, we can see that was indeed the case, it gets us right to where that check for whether the file is called desktop.ini is performed in the code. From there, identifying the byte to patch shouldn’t be too difficult.

Conclusion

Is it worth binary patching open source programs? Well, it’s always worth binary patching programs in general, if you can. With open source programs, the BIG advantage is that you always have the source code at hand, which helps you understand much more easily what the heck is happening in the final executable. Sometimes, the compiler makes decisions that definitely help in some way, like reducing size or increasing execution speed, but which make the program very tough for a human reader to understand, and of course, having the source code is of big help.

And the truth is, upstreaming this is pretty tough. It’s a very niche use case that I have, maybe it just glitches on my end or maybe the thing really is troublesome in the long run for some other reason. Idk, offering this behavior as an option may be too much work indeed, as well.

The trouble with binary patching is updates to the patched software. Anyway, here, since it’s just a single byte and the program is open source, maintaining the patch wouldn’t be that big of a pain. But still, it would be nice if computers were smart enough to perform the steps described above (identifying isConflictFile etc) themselves on new versions and patch it automatically using the knowledge above: a framework that would help in doing things like this with less struggle than the traditional methods. But yeah, unless I am missing something, that’s still a dream. A big burden with having such a patch apply at runtime, as I have learnt many times, would be the difficulty of having our piece of code loaded in the Nextcloud process, and then determining a reliable strategy for identifying that byte and patching it, most of the times without symbols/help generated by a tool like IDA. Idk, maybe one day though… Although, yeah, I am curious whether anyone knows a tool available today that would facilitate these kinds of binary patches. I personally don’t know of any, at the moment.

Disable the rounded corners in Windows 11

2022-01-21T00:00:00+00:00

This is another article related to ExplorerPatcher and Windows 11. This time, I wanted to take a more in-depth look at the changes in the compositor shipped with Windows 11 (the “Desktop Window Manager”).

Quite a few things have changed regarding it compared to Windows 10, like new window animations when maximizing and restoring down a window, while some things have regressed: Aero Peek is more broken than ever, basically a barely working artifact that is due for removal, apparently. Some things are the same, like the bug where clicks (on the very top of the title bar, or on the corner of the close button) on a maximized System-enhanced scaled foreground window displayed on a monitor with 150% scaling and a 3840x2160 resolution go to the window behind it, potentially accidentally closing it.

There is a rather visible new addition though: the window corners are now rounded. The whole design aesthetic for Windows 11 proposes rounded things everywhere. Personally I like them, but a lot of people do not, and for good reason: it would have been logical for Microsoft to offer at least a hidden option to enable the legacy behavior from Windows 10, where the corners are sharp, 90-degree angles. Unfortunately, they did not, so we are once again left to scramble through their executables and see what we can find.

The Desktop Window Manager is powered by many libraries; the one that apparently is tasked with doing the actual drawing on the screen surface is called uDWM.dll. I am a bit familiar with it from my work on WinCenterTitle, a program which let you center the text displayed in the non-client title bar of windows, as in Windows 8 and 8.1. Take notice, I said non-client: the compositor is only responsible for drawing the title bar of windows that do not elect to do it themselves (and inform it so). More and more applications are moving to custom-drawn title bars (or how the GNU/Linux/GNOME user land calls them, “client-side decorations”); whether that’s a good or a bad thing, that’s a subject of much debate. Thus, the effect of the patch becomes less and less impressive and consistent, as some applications will have their text centered while a whole host of others won’t. The operating system does not provide a mechanism to inform applications how the text should be drawn: it is generally implied that the text is left-aligned, while Windows 8 and 8.1 should be treated as the exception and the title be custom-drawn centered there. Again, very poor design from Microsoft, if you ask me.

Okay, so let’s start with uDWM.dll. If you look through its method list, you can quickly find a very interesting function: CTopLevelWindow::GetEffectiveCornerStyle. Here’s its pseudocode:

__int64 __fastcall CTopLevelWindow::GetEffectiveCornerStyle(__int64 a1)
{
  unsigned int v2; // ecx
  int v3; // ebx
  char Variant; // al

  if ( *((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 27)
    && !*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28)
    || *((int *)CDesktopManager::s_pDesktopManagerInstance + 8) >= 2 )
  {
    return 1;
  }
  else
  {
    v2 = *(_DWORD *)(*(_QWORD *)(a1 + 752) + 184i64);
    if ( !v2 )
    {
      v3 = *(_DWORD *)(a1 + 608);
      if ( (v3 & 2) != 0 )
      {
        return 3;
      }
      else
      {
        if ( (unsigned __int8)IsOpenThemeDataPresent() && (v3 & 6) != 0 )
          return 2;

        Variant = wil::details::FeatureImpl<__WilFeatureTraits_Feature_VTFrame>::__private_GetVariant(&`wil::Feature<__WilFeatureTraits_Feature_VTFrame>::GetImpl'::`2'::impl);
        v2 = 1;
        if ( Variant == 1 )
          return 2;
      }
    }
  }

  return v2;
}

Let’s analyze it a bit. For one, we can determine which branch in that main if is taken at run time on a system where rounded corners work in Windows 11 (more on that a bit later), and also determine the actual return value.

Because we need to attach a debugger to dwm.exe, things are a bit more complicated: you have to debug it on a virtual machine (or on a remote system in general). You can’t debug it on your development box as breaking into dwm.exe will prevent it from drawing updates to the screen, and thus you won’t be able to operate the computer.

As usual, I use WinDbg. Start it on the remote computer. Now, there are 2 methods to go on:

Attach to dwm.exe. When it breaks, the output will freeze, but the command text box will be focused. You can type in there .server tcp:port=5005 followed by g. That will have the debugger listen on port 5005 and then continue executing dwm.exe. The output window will display a connection string that you can use to connect to this instance from the development box.
Attach to winlogon.exe - this is the process that spawns new dwm.exe instances in case it crashes. After attaching, use .childdbg 1 to have it break and also attach to child processes spawned by it. Press g to continue, and then kill dwm.exe. WinDbg will break and from there you can interact with the newly spawned dwm.exe.

Okay, so inspecting this function at runtime, we can see that it returns a 2. If we statically patch it to return a 0 or a 1, window corners will not be drawn rounded. So, that would be it, right? Actually, this patch was already implemented in my previous Win11DisableOrRestoreRoundedCorners utility, so why all the fuss with this article (I mean, it was quite popular, to the point that it got featured in a LinusTechTips video)? Well, I have 2 more goals for this:

Get rid of the dependency on symbol data
Fix the context menus not having proper shadows applied when using this mode

For the first bullet point, let’s look at this part of the if statement: *((_BYTE*)CDesktopManager::s_pDesktopManagerInstance + 27) && !*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28). At runtime, we see that the first expression is false, while the second is true. So, we would need to make *((_BYTE*)CDesktopManager::s_pDesktopManagerInstance + 27) behave as if it were true, for example. Looking at the opcodes, we can see that 80 78 1B 00 (which corresponds to cmp byte ptr [rax+1Bh], 0) is unique throughout the entire program. So we have an easy pattern match, right: just modify the comparison so that the jnz loc_18007774E on the next instruction behaves like a jz instead, or modify the jump itself. Easy, but there’s still bullet point 2. Also, what does this if statement really check? Maybe it hides some registry setting which we could use to enable this easily without patching the executable…? Let’s break down the if statement in parts:

!*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28)

Naturally, I have started looking in the constructor for CDesktopManager (which is an object but it is used as if it is a singleton throughout the entire program, it’s only instantiated once, and its reference kept in the s_pDesktopManagerInstance global variable).

CDesktopManager::CDesktopManager just zeroizes fields and fills in the virtual function tables properly, but let’s see where it is called from: CDesktopManager::Create (which in turn is called by DwmClientStartup, which is exported by ordinal 101 and looks like the entry point of this library). After the constructor, a call to CDesktopManager::Initialize is made.

Right in the beginning of that, some registry calls are performed. What’s interesting is that one of them writes to *((_BYTE *)this + 28) = 1;. Bingo, so DWORD ForceEffectMode in HKLM\Software\Microsoft\Windows\Dwm sets *((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28). So the condition is true when the registry value is NOT set to 2.

*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 27)

This one was a bit more complicated. I now know it’s written in CDesktopManager::CreateMonitorRenderTargetsLegacy by *((_BYTE *)this + 27) = IsWarpAdapterLuid;. IsWarpAdapterLuid is the value returned by CDWMDXGIEnumeration::IsWarpAdapterLuid.

But first, how do I know this was written here? Well, like Cheat Engine, WinDbg also has a very nice feature where it can break when a memory location is accessed: ba w 1 address (w is for write-access, 1 is for 1-byte length). So, just break at some early point where you have access to CDesktopManager::s_pDesktopManagerInstance and then set a breakpoint on access for writes to that + 27 and that’s it.

Okay, so what is CDWMDXGIEnumeration::IsWarpAdapterLuid? From its body (and name), we see that it tries to determine whether the graphics adapter (used) is the WARP (software rendered) adapter. This is plenty obvious as well once we take a look at this if statement specifically: if ( a2 == *(_QWORD *)(v5 + 336) && *(_DWORD *)(v5 + 296) == 5140 && *(_DWORD *)(v5 + 300) == 140 ) - 5140 is 0x1414 in hex, and 140 is 0x8c. According to this, those IDs correspond to the “Microsoft Basic Render Driver”, which is basically the software-based graphics adapter that is used as a fallback when graphics drivers for the real adapter are not installed etc. Why is this important? Well, we can observe that on such a setup, the rounded corners are disabled. So, it has to be that in such scenarios, *((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 27) is set to true, as the adapter used was the software one, and then !*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28) is still 1, so the main branch is taken, the function returns 1 and rounded corners are disabled. Well, that means we are onto something: if we somehow make !*((_BYTE *)CDesktopManager::s_pDesktopManagerInstance + 28) return false, we can enable rounded corners when the software display adapter is used. But that’s pretty easy: all we have to do is to set ForceEffectMode in the registry to 2 and test it out:

So, the opposite of what we want works (including context menus). Great start =))

*((int *)CDesktopManager::s_pDesktopManagerInstance + 8) >= 2

Using a similar break-on-access trick as described above, we can see that this is set in CDesktopManager::UpdateRemotingMode in possibly a couple of places. But that happens only when GetSystemMetrics(4096) is true, otherwise it is set in the end of the function to 0: *((_DWORD *)this + 8) = 0;. So what does GetSystemMetrics(4096) do? Well, that’s sparsely documented on various forums on the Internet: it checks whether the current session is a remote desktop session. From my impression, what this does is disable rounded corners under certain remote desktop scenarios. So yeah, we won’t bother anymore with that, it’s pretty useless for this experiment. I mean, it would actually be useful if we decided to inject into the remote process, as we could IAT patch the call to GetSystemMetrics and return true when we get a 4096 so that we enter the if and from there IAT patch away the rest until we make sure we leave some number at that memory address.

Yeah, so the conclusion is: rounded corners are disabled when the software display adapter is used, or when connected via certain remote sessions. Also, there are a couple of avenues for forcing that path ourselves. At this point, my idea was simple: have the CTopLevelWindow::GetEffectiveCornerStyle function somehow return 1 via the methods described above.
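The three fields analyzed above combine as in the following sketch. It only models the first if of the pseudocode, with parameter names that are my own labels for the bytes at +27 and +28 and the int at index 8; it is not code taken from uDWM.dll.

```python
def early_return_1(is_warp_adapter: bool,
                   force_effect_mode_is_2: bool,
                   remoting_mode: int) -> bool:
    """True when GetEffectiveCornerStyle takes the 'return 1' branch."""
    byte_27 = is_warp_adapter           # written in CreateMonitorRenderTargetsLegacy
    byte_28 = force_effect_mode_is_2    # written in Initialize from ForceEffectMode
    return (byte_27 and not byte_28) or remoting_mode >= 2


# Regular GPU, no registry override, local session: rounded corners stay on.
print(early_return_1(False, False, 0))
# Software (WARP) adapter: square corners.
print(early_return_1(True, False, 0))
# Software adapter with ForceEffectMode set to 2: rounded corners again.
print(early_return_1(True, True, 0))
```

Seen this way, forcing square corners on a normal system means making either the +27 byte or the remoting mode look different from what it really is.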

Well, I did that, but it has at least one nasty effect: context menus are drawn without a shadow. It kind of makes sense, if we look at where the function is called from at runtime: CTopLevelWindow::UpdateWindowVisuals. In there, of interest is a float that is initialized to 0.0 and then assigned some value when the corners are determined to be rounded - probably the radius of the curve. When corners are not touched, the number stays at 0.0. If we look further down, there is an if check: if ( (v10 & 0x20) == 0 && (IsOpenThemeDataPresent() && (v10 & 6) != 0 || v4 > 0.0) ). I tried messing with it in all the ways, in combination with patching CTopLevelWindow::GetEffectiveCornerStyle as well - I always broke some scenario: I had windows without shadows, windows that are drawn square cornered even when rounded corners are enabled (tooltips, for example) displaying visual glitches, the mouse displaying visual glitches (a transparent box behind it when clicked). I was stuck…

Then, after playing with it a bit more, I came up with another idea: how would we go about modifying the least amount of code and logic and still achieve what we want? Well, since the radius seems to be a float, what if instead of 0.0, which would give us 90-degree angles, we’d make it 0.00001 let’s say. That’s so small it would look like a square on the screen. Also, keep in mind that the thing is rasterized in the end and you have only a handful of pixels, so a very small float that’s not zero is basically zero.

Okay, so let’s inspect what happens with the value we get from CTopLevelWindow::GetEffectiveCornerStyle:

 {  
    EffectiveCornerStyle = CTopLevelWindow::GetEffectiveCornerStyle((__int64)this);
    if ( EffectiveCornerStyle == 2 )
    {
LABEL_8:
      wil::details::FeatureImpl<__WilFeatureTraits_Feature_VTFrame>::GetCachedVariantState(
        (volatile signed __int64 *)&`wil::Feature<__WilFeatureTraits_Feature_VTFrame>::GetImpl'::`2'::impl,
        (__int64)&v118);
      v4 = (float)v119;
      goto LABEL_9;
    }

    if ( EffectiveCornerStyle != 3 )
    {
      if ( EffectiveCornerStyle != 4 )
        goto LABEL_9;

      goto LABEL_8;
    }

    wil::details::FeatureImpl<__WilFeatureTraits_Feature_VTFrame>::GetCachedVariantState(
      (volatile signed __int64 *)&`wil::Feature<__WilFeatureTraits_Feature_VTFrame>::GetImpl'::`2'::impl,
      (__int64)&v118);
    v4 = (float)v119 * 0.5;
  }

LABEL_9:

Actually, the pseudocode this time is pretty bad. The raw disassembly is much better:

.text:0000000180029B3D                 call    ?GetEffectiveCornerStyle@CTopLevelWindow@@AEAA?AW4CORNER_STYLE@@XZ ; CTopLevelWindow::GetEffectiveCornerStyle(void)
.text:0000000180029B42                 mov     ecx, eax
.text:0000000180029B44                 sub     ecx, 2
.text:0000000180029B47                 jz      short loc_180029B57
.text:0000000180029B49                 sub     ecx, r15d
.text:0000000180029B4C                 jz      loc_180029C6A
.text:0000000180029B52                 cmp     ecx, r15d
.text:0000000180029B55                 jnz     short loc_180029B74
.text:0000000180029B57
.text:0000000180029B57 loc_180029B57:                          ; CODE XREF: CTopLevelWindow::UpdateWindowVisuals(void)+97↑j
.text:0000000180029B57                 lea     rdx, [rsp+1F0h+var_190]
.text:0000000180029B5C                 lea     rcx, ?impl@?1??GetImpl@?$Feature@U__WilFeatureTraits_Feature_VTFrame@@@wil@@CAAEAV?$FeatureImpl@U__WilFeatureTraits_Feature_VTFrame@@@details@3@XZ@4V453@A ; wil::details::FeatureImpl<__WilFeatureTraits_Feature_VTFrame> `wil::Feature<__WilFeatureTraits_Feature_VTFrame>::GetImpl(void)'::`2'::impl
.text:0000000180029B63                 call    ?GetCachedVariantState@?$FeatureImpl@U__WilFeatureTraits_Feature_VTFrame@@@details@wil@@AEAA?ATwil_details_FeatureStateCache@@XZ ; wil::details::FeatureImpl<__WilFeatureTraits_Feature_VTFrame>::GetCachedVariantState(void)
.text:0000000180029B68                 mov     eax, [rsp+1F0h+var_18C]
.text:0000000180029B6C                 xorps   xmm6, xmm6
.text:0000000180029B6F                 cvtsi2ss xmm6, rax

So, if the corner style is 2, we jump to that place where we make the weird GetCachedVariantState call. Then, we move some value from the stack in rax and from there convert it to a single-precision floating point number.

At runtime, the value we obtain is 0x8. It’s pretty weird… I investigated with different values, it doesn’t seem to actually hold a meaningful IEEE 754 single-precision floating point value, or maybe it just didn’t click for me how to work with it. Anyway, by experimenting, I saw that a 0x0 there is essentially CTopLevelWindow::GetEffectiveCornerStyle returning 1, while if we set it to 0x1… boom, we get the nice 90-degree corners from Windows 10, complete with the context menu working.
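One possible reading of the 0x8, offered as a guess rather than something verified in dwm.exe: cvtsi2ss converts a signed integer into a float, so the value on the stack would be a plain integer (perhaps a radius of 8) and not the bit pattern of a float. The sketch below contrasts the two interpretations; both helper names are made up.

```python
import struct


def cvtsi2ss(value: int) -> float:
    """Model cvtsi2ss: convert a signed integer to single precision."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


def reinterpret_bits_as_float(value: int) -> float:
    """Treat the 32-bit pattern of `value` as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", value))[0]


as_converted = cvtsi2ss(0x8)                  # integer 8 becomes 8.0
as_raw_bits = reinterpret_bits_as_float(0x8)  # a tiny denormal, close to 0
patched = cvtsi2ss(0x1)                       # the patched value becomes 1.0
print(as_converted, as_raw_bits, patched)
```

If that reading holds, writing 0x1 yields 1.0, which is non-zero yet small enough to rasterize as a square corner, while 0x0 yields 0.0 and lands back on the non-rounded code path.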

Okay, so it seems the way to go is to make the radius so small that it is basically square but not mathematically square, so dwm.exe still thinks it works with rounded corners.

How do we patch? Well, if we look at the disassembly, we see that xorps xmm6, xmm6; cvtsi2ss xmm6, rax only appears in constructs specific to a check similar to the one shown here after getting the corner style. So it’s actually safe to patch based on this pattern. But how? Well, again, if we look at all 4 matches, we see they are all preceded by a mov instruction that fetches the value that will ultimately be converted to a float. So we can safely overwrite that mov, and doing it in all places is even better: it is as if it had read the value we want. So, how long is the mov? Well, 4 bytes. We need to write a 1 there, so mov eax, 1, which is b8 01 00 00 00, which is… 5 bytes long :(… CISC baby, what do we do now? Well, we take advantage of the fact that x86 has a billion instructions and opcodes, so we can fit it in 4 bytes like so: xor eax, eax; inc eax - 31 c0 ff c0. Oh yeah, good old inc.

So the patch is simple. Find this pattern 0x0F, 0x57, 0xF6, 0xF3, 0x48, 0x0F and replace the preceding 4 bytes with 0x31, 0xC0, 0xFF, 0xC0.
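The search-and-replace just described can be sketched in portable C as follows. This is only an illustration that works on an in-memory buffer; patch_before_pattern is a hypothetical helper name made up for this sketch, not something taken from ep_dwm.

```c
#include <stddef.h>
#include <string.h>

/* xorps xmm6, xmm6 followed by the first bytes of cvtsi2ss xmm6, rax */
static const unsigned char kPattern[] = { 0x0F, 0x57, 0xF6, 0xF3, 0x48, 0x0F };
/* xor eax, eax; inc eax: loads 1 into eax in exactly 4 bytes */
static const unsigned char kPatch[] = { 0x31, 0xC0, 0xFF, 0xC0 };

/* Hypothetical helper: overwrites the 4 bytes preceding every occurrence of
   kPattern in buf and returns the number of sites that were patched. */
size_t patch_before_pattern(unsigned char *buf, size_t len)
{
    size_t patched = 0;
    if (len < sizeof(kPattern))
    {
        return 0;
    }
    /* Start at sizeof(kPatch) so there is always room for the preceding mov. */
    for (size_t i = sizeof(kPatch); i + sizeof(kPattern) <= len; ++i)
    {
        if (memcmp(buf + i, kPattern, sizeof(kPattern)) == 0)
        {
            memcpy(buf + i - sizeof(kPatch), kPatch, sizeof(kPatch));
            ++patched;
        }
    }
    return patched;
}
```

The return value makes it easy to check that the expected number of sites (4 in the build discussed here) was actually found before trusting the result.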

Lastly, how do we patch this time? The problem with dwm.exe is that it runs either as the SYSTEM account or under some obscure service accounts (DWM-1, DWM-2 etc). We cannot, obviously, inject into it from a process running with standard rights. Not even from an administrator one. We have to inject from a process running as SYSTEM; only from there does it work.

Naturally, the way to go for this is to create a service, as that runs as SYSTEM by default. So, we create a service that enumerates the Desktop Window Manager processes and patches 4 bytes at various locations. Since that is all we do, there is no need to run remote code: we can forego injecting into dwm.exe and instead use ReadProcessMemory and WriteProcessMemory to alter its memory.

An example implementation is here (ep_dwm). It can be compiled as a library and then called from your own application, for example.

That’s it! The functionality has been incorporated in the latest ExplorerPatcher, version 22000.434.41.10. Hopefully it will serve you well.

Update (29/08/2022): I also recommend checking out this project on GitHub. It basically patches CDesktopManager::s_pDesktopManagerInstance + 28 to be a 1, with all the quirks described above (context menus and some windows, like the UAC prompt, do not have shadows). Still, it is an alternative to my strategy here. Hopefully, one day, I will have the time and maybe integrate this into ExplorerPatcher as an option.

Functional Windows 10 flyouts in Windows 11

2021-11-18T00:00:00+00:00

Quite some time has passed since my last post here. And that’s for no ordinary reason: these past months, I have been pretty busy with work, both at my workplace and on ExplorerPatcher. So much has happened regarding that since I last talked about it here that it’s just simpler for you to check it out yourself, and look through the CHANGELOG if you want, than for me to reiterate all the added functionality. Basically, it has become a full-fledged program of its own.

But today’s topic is not really about ExplorerPatcher. Or rather it is, but not directly: it is about how I enabled a certain functionality for it, how that can be achieved even without it, and what the limitations are.

Background

As you are probably aware, Windows 11 has recently launched, and with it, a radical new design change. To keep it short, there are a couple of utilities that let you restore the old Windows 10 UI, one of which is ExplorerPatcher (there is also StartAllBack, for example).

One problem that was still unsolved was how to restore the functionality of the Windows 10 network, battery and language switcher flyouts in the taskbar. While volume and clock work well, and the language switcher is the one from Windows 11 at least, the network and battery ones just do not launch properly. So, what gives?

Investigation

I started my investigation… well, it took me the better part of a whole day to figure my way around how Windows does those things, but to spare you the time, let me tell you directly how it works.

First, most flyouts seem to be invoked by using some “experience manager” interface. Let’s generically call that IExperienceManager. It derives from IInspectable and the next 2 methods are the aptly named ShowFlyout and HideFlyout methods.
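To make the layout concrete, here is a rough sketch of what such a vtable looks like in plain C. The Sketch* names are made up for illustration, the parameter types of ShowFlyout are guesses based on how it is called later in this post, and the Windows calling convention macros are left out so that the snippet compiles anywhere. The only part that is certain is the slot order: 3 methods from IUnknown, 3 from IInspectable, then the 2 flyout methods.

```c
#include <stddef.h>

typedef struct SketchExperienceManager SketchExperienceManager;

typedef struct SketchExperienceManagerVtbl
{
    /* IUnknown */
    long (*QueryInterface)(SketchExperienceManager *self, const void *riid, void **ppv);
    unsigned long (*AddRef)(SketchExperienceManager *self);
    unsigned long (*Release)(SketchExperienceManager *self);
    /* IInspectable */
    long (*GetIids)(SketchExperienceManager *self, unsigned long *count, void **iids);
    long (*GetRuntimeClassName)(SketchExperienceManager *self, void **className);
    long (*GetTrustLevel)(SketchExperienceManager *self, int *trustLevel);
    /* The two methods of interest; the parameter types are assumptions */
    long (*ShowFlyout)(SketchExperienceManager *self, const void *rect, void *hwnd);
    long (*HideFlyout)(SketchExperienceManager *self);
} SketchExperienceManagerVtbl;

struct SketchExperienceManager
{
    const SketchExperienceManagerVtbl *lpVtbl;
};

/* Index of a method in the vtable, counted in pointer-sized slots. */
#define VTBL_SLOT(member) (offsetof(SketchExperienceManagerVtbl, member) / sizeof(void *))
```

With this layout, ShowFlyout sits in slot 6 and HideFlyout in slot 7, which is handy when matching indirect calls in the disassembly against method names.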

These usually live in twinui.dll. At least the ones common to all types of Windows devices (for example, the battery, clock and sound ones). By the way, sound is called “MtcUvc”, whatever that means. The more specific ones live elsewhere, but are instantiated by twinui.dll, so you can usually find the required interface IDs there. Also in there are the required names for these interfaces (dump all strings containing Windows.Internal.ShellExperience from twinui.dll using strings64.exe from Sysinternals, for example).

So, by looking at some disassembly in twinui.dll, I figured out that the way to invoke these flyouts is something like this:

void InvokeFlyout(BOOL bAction, DWORD dwWhich)
{
    HRESULT hr = S_OK;
    IUnknown* pImmersiveShell = NULL;
    hr = CoCreateInstance(
        &CLSID_ImmersiveShell,
        NULL,
        CLSCTX_NO_CODE_DOWNLOAD | CLSCTX_LOCAL_SERVER,
        &IID_IServiceProvider,
        &pImmersiveShell
    );
    if (SUCCEEDED(hr))
    {
        IShellExperienceManagerFactory* pShellExperienceManagerFactory = NULL;
        IUnknown_QueryService(
            pImmersiveShell,
            &CLSID_ShellExperienceManagerFactory,
            &CLSID_ShellExperienceManagerFactory,
            &pShellExperienceManagerFactory
        );
        if (pShellExperienceManagerFactory)
        {
            HSTRING_HEADER hstringHeader;
            HSTRING hstring = NULL;
            WCHAR* pwszStr = NULL;
            switch (dwWhich)
            {
            case INVOKE_FLYOUT_NETWORK:
                pwszStr = L"Windows.Internal.ShellExperience.NetworkFlyout";
                break;
            case INVOKE_FLYOUT_CLOCK:
                pwszStr = L"Windows.Internal.ShellExperience.TrayClockFlyout";
                break;
            case INVOKE_FLYOUT_BATTERY:
                pwszStr = L"Windows.Internal.ShellExperience.TrayBatteryFlyout";
                break;
            case INVOKE_FLYOUT_SOUND:
                pwszStr = L"Windows.Internal.ShellExperience.MtcUvc";
                break;
            }
            hr = WindowsCreateStringReference(
                pwszStr,
                pwszStr ? wcslen(pwszStr) : 0,
                &hstringHeader,
                &hstring
            );
            if (hstring)
            {
                IUnknown* pIntf = NULL;
                pShellExperienceManagerFactory->lpVtbl->GetExperienceManager(
                    pShellExperienceManagerFactory,
                    hstring,
                    &pIntf
                );
                if (pIntf)
                {
                    IExperienceManager* pExperienceManager = NULL;
                    pIntf->lpVtbl->QueryInterface(
                        pIntf,
                        dwWhich == INVOKE_FLYOUT_NETWORK ? &IID_NetworkFlyoutExperienceManager :
                        (dwWhich == INVOKE_FLYOUT_CLOCK ? &IID_TrayClockFlyoutExperienceManager :
                            (dwWhich == INVOKE_FLYOUT_BATTERY ? &IID_TrayBatteryFlyoutExperienceManager :
                                (dwWhich == INVOKE_FLYOUT_SOUND ? &IID_TrayMtcUvcFlyoutExperienceManager : &IID_IUnknown))),
                        &pExperienceManager
                    );
                    if (pExperienceManager)
                    {
                        RECT rc;
                        SetRect(&rc, 0, 0, 0, 0);
                        if (bAction == INVOKE_FLYOUT_SHOW)
                        {
                            pExperienceManager->lpVtbl->ShowFlyout(pExperienceManager, &rc, NULL);
                        }
                        else if (bAction == INVOKE_FLYOUT_HIDE)
                        {
                            pExperienceManager->lpVtbl->HideFlyout(pExperienceManager);
                        }
                        pExperienceManager->lpVtbl->Release(pExperienceManager);
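                    }
                    // Release the interface returned by GetExperienceManager so it does not leak.
                    pIntf->lpVtbl->Release(pIntf);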
                }
                WindowsDeleteString(hstring);
            }
            pShellExperienceManagerFactory->lpVtbl->Release(pShellExperienceManagerFactory);
        }
        pImmersiveShell->lpVtbl->Release(pImmersiveShell);
    }
}

Full source code: ImmersiveFlyouts.c, ImmersiveFlyouts.h

Okay, so with the above one can pretty much manually invoke the flyouts for any of the system icons shown in the notification area. Yet still, network and battery crash.

So, I attached with WinDbg to explorer.exe and clicked the network icon. Then, I saw it spit out some error messages:

(a0a0.6880): Windows Runtime Originate Error - code 40080201 (first chance)
(a0a0.6880): C++ EH exception - code e06d7363 (first chance)
(a0a0.6880): C++ EH exception - code e06d7363 (first chance)
(a0a0.6880): C++ EH exception - code e06d7363 (first chance)
(a0a0.6880): C++ EH exception - code e06d7363 (first chance)
pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(216)\twinui.pcshell.dll!00007FFE213CA625: (caller: 00007FFE212B91D2) ReturnHr(11) tid(6880) 80070490 Element not found.
    Msg:[Platform::Exception^: Element not found.

pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(174)\twinui.pcshell.dll!00007FFE212B94ED: (caller: 00007FFE212B91D2) Exception(7) tid(6880) 80070490 Element not found.
] 
shell\twinui\experiencemanagers\lib\networkexperiencemanager.cpp(107)\twinui.dll!00007FFE16C266DD: (caller: 00007FFE16C45B05) LogHr(2) tid(30d8) 80070578 Invalid window handle.
pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(148)\twinui.pcshell.dll!00007FFE212B966C: (caller: 00007FFE212B91D2) LogHr(11) tid(84b0) 80070490 Element not found.
pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(211)\twinui.pcshell.dll!00007FFE212B96A6: (caller: 00007FFE212B91D2) Exception(8) tid(84b0) 80070490 Element not found.
(a0a0.84b0): Windows Runtime Originate Error - code 40080201 (first chance)
(a0a0.84b0): C++ EH exception - code e06d7363 (first chance)
(a0a0.84b0): C++ EH exception - code e06d7363 (first chance)
(a0a0.84b0): C++ EH exception - code e06d7363 (first chance)
(a0a0.84b0): C++ EH exception - code e06d7363 (first chance)
pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(216)\twinui.pcshell.dll!00007FFE213CA625: (caller: 00007FFE212B91D2) ReturnHr(12) tid(84b0) 80070490 Element not found.
    Msg:[Platform::Exception^: Element not found.

pcshell\twinui\viewmanagerinterop\lib\windowmanagerbridge.cpp(211)\twinui.pcshell.dll!00007FFE212B96A6: (caller: 00007FFE212B91D2) Exception(8) tid(84b0) 80070490 Element not found.
]

Okay… also, not much else can be done by attaching to explorer.exe, as the flyouts run out of process. After some trial and error, I figured out that the process they run in is ShellExperienceHost.exe. This just crashes when the 2 flyouts are invoked. So, let’s attach to it. To have it spawn and be able to attach to it, besides using .childdbg 1 in the parent process, in this instance we can take advantage of the fact that once a flyout is launched, the process remains resident in memory and gets suspended. So, I opened the sound flyout, and then attached with WinDbg. I clicked network and then the error was trapped. The call stack looked like this (up to the first frame that did not look like some error handler):

[0x0]   KERNELBASE!RaiseFailFastException + 0x152   
[0x1]   Windows_UI_QuickActions!wil::details::WilDynamicLoadRaiseFailFastException + 0x53   
[0x2]   Windows_UI_QuickActions!wil::details::WilRaiseFailFastException + 0x22   
[0x3]   Windows_UI_QuickActions!wil::details::WilFailFast + 0xbc   
[0x4]   Windows_UI_QuickActions!wil::details::ReportFailure_NoReturn<3> + 0x29f   
[0x5]   Windows_UI_QuickActions!wil::details::ReportFailure_Base<3,0> + 0x30   
[0x6]   Windows_UI_QuickActions!wil::details::ReportFailure_Msg<3> + 0x67   
[0x7]   Windows_UI_QuickActions!wil::details::ReportFailure_HrMsg<3> + 0x6e   
[0x8]   Windows_UI_QuickActions!wil::details::in1diag3::_FailFast_HrMsg + 0x33   
[0x9]   Windows_UI_QuickActions!wil::details::in1diag3::FailFast_HrIfNullMsg + 0x5c   
[0xa]   Windows_UI_QuickActions!QuickActions::QuickActionTemplateSelector::[Windows::UI::Xaml::Controls::IDataTemplateSelectorOverrides]::SelectTemplateCore + 0x99a   

So, it looks like the error is in that SelectTemplateCore. Also, the Windows Runtime again spat out something in the debugger:

minkernel\mrt\mrm\mrmex\priautomerger.cpp(65)\MrmCoreR.dll!00007FFE3F2561BD: (caller: 00007FFE3F2676AD) LogHr(4) tid(3394) 80070002 The system cannot find the file specified.
ModLoad: 00007ffe`3cba0000 00007ffe`3cbd5000   C:\Windows\System32\Windows.Energy.dll
shellcommon\shell\windows.ui.shell\quickactions\quickactions\QuickActionTemplateSelector.cpp(84)\Windows.UI.QuickActions.dll!00007FFD8848A8DE: (caller: 00007FFD884954B7) FailFast(1) tid(3394) 80070490 Element not found.
    Msg:[QuickActionTemplateSelector missing a valid TemplateDictionary.] 
(8c30.3394): Security check failure or stack buffer overrun - code c0000409 (!!! second chance !!!)
Subcode: 0x7 FAST_FAIL_FATAL_APP_EXIT 
KERNELBASE!RaiseFailFastException+0x152:
00007ffe`57e01de2 0f1f440000      nop     dword ptr [rax+rax]

What’s interesting is that it corroborates what we got from explorer.exe: that 80070490 HRESULT is the same one we got in Explorer. Okay, so it seems some “dictionary” is broken for some quick action toggle, apparently… I don’t know exactly their internal structure, I am just guessing, of course.

Let’s load Windows.UI.QuickActions.dll in IDA and take a look at it. You can find this in the same folder as ShellExperienceHost.exe: C:\Windows\SystemApps\ShellExperienceHost_cw5n1h2txyewy.

We find that function quickly, among the sea of symbols, especially if we search for that QuickActionTemplateSelector missing a valid TemplateDictionary error string.

That code area gets called if either branch of some if check eventually fails. So we’re presumably on one of the branches. Without looking too much beside that, I thought: what if we would be on the other branch? A small note on that:

This thought didn’t really come out of the blue. For the better part of a whole day, I tried to find a way to do the same thing the lock screen (LockApp.exe) does: that can show the Windows 10 network flyout just fine. So I knew it has to be in some other condition, and I initially thought that maybe this if check here is the determinant for the lock screen working versus it failing on the desktop. Again, just a wild guess.

Also, since we are talking about that, as I learned when I studied the language switcher, there is some global flag in the network flyout implementation, let’s call it that generically, that determines what mode to show. The implementation mostly lives in C:\Windows\ShellExperiences\NetworkUX.dll. In there, look for a method called NetworkUX::ViewContext::SetNetworkUXMode. That’s the single thing that sets a global variable, called s_networkUXMode, that is used all around the place to determine the type of UX to show.

The desktop seems to set s_networkUXMode to 0. The lock screen sets that to 7 (also, it cannot be launched in desktop mode, it crashes for some other reason which needs to be investigated as well). There are also other interesting modes: the Windows 10 OOBE screen is 4, which looks quite funny when enabled instead of the regular one:

And no, clicking Next does not advance you to the next in the “desktop OOBE” =)))

The Windows 11 one is 5 if I remember correctly. Find out for yourself. The assembly instructions where that is set look like:

.text:000000018006BC0C                 mov     Ns_networkUXMode, edi ; 
.text:000000018006BC12                 mov     rcx, [rbp+var_8]
.text:000000018006BC16                 xor     rcx, rsp        ; StackCookie
.text:000000018006BC19                 call    __security_check_cookie

So, there’s plenty of space to write something like:

mov     edi, 7
mov     Ns_networkUXMode, edi ; 

Just nop the stack protector check to gain the necessary space and make sure to adjust that relative mov Ns_networkUXMode, edi ; if you shift it a few bytes down due to the mov edi, 7.

Back to the main story, so what if we would be on the other branch? What controls that if check? Scrolling a few lines above, we see this pseudocode:

    {
      Init_thread_header(&dword_18057130C);
      if ( dword_18057130C == -1 )
      {
        byte_180571308 = FlightHelper::CalculateRemodelEnabled();
        Init_thread_footer(&dword_18057130C);
      }
    }

    if ( byte_180571308 )
    {

That byte_180571308 is the if check we are talking about. So it seems that the branch we take ultimately gets determined by that FlightHelper::CalculateRemodelEnabled(); call. Let’s hop into that: it’s a mostly useless call, as it does not seem to have any other return path other than a plain return 1 at the end (maybe the stuff in there can hard fail, but it doesn’t seem to be the case here, we seem to reach that return 1 at the end). Okay, so according to this, byte_180571308 is going to be a 1. So let’s try setting it to 0.

Looking at the disassembly, the ending is something like:

.text:00000001800407C9                 mov     r8b, 3
.text:00000001800407CC                 mov     dl, 1
.text:00000001800407CE                 lea     rcx, ?impl@?1??GetImpl@?$Feature@U__WilFeatureTraits_Feature_TestNM@@@wil@@CAAEAV?$FeatureImpl@U__WilFeatureTraits_Feature_TestNM@@@details@3@XZ@4V453@A ; wil::details::FeatureImpl<__WilFeatureTraits_Feature_TestNM> `wil::Feature<__WilFeatureTraits_Feature_TestNM>::GetImpl(void)'::`2'::impl
.text:00000001800407D5                 call    ?ReportUsage@?$FeatureImpl@U__WilFeatureTraits_Feature_TestNM@@@details@wil@@QEAAX_NW4ReportingKind@3@_K@Z ; wil::details::FeatureImpl<__WilFeatureTraits_Feature_TestNM>::ReportUsage(bool,wil::ReportingKind,unsigned __int64)
.text:00000001800407DA                 mov     al, 1
.text:00000001800407DC                 jmp     short loc_1800407E0
.text:00000001800407DE ; ---------------------------------------------------------------------------
.text:00000001800407DE                 xor     al, al
.text:00000001800407E0
.text:00000001800407E0 loc_1800407E0:                          ; CODE XREF: FlightHelper__CalculateRemodelEnabled+F0↑j
.text:00000001800407E0                 add     rsp, 20h
.text:00000001800407E4                 pop     rdi
.text:00000001800407E5                 pop     rsi
.text:00000001800407E6                 pop     rbx
.text:00000001800407E7                 retn

Corresponding to pseudocode looking like this:

  LOBYTE(v4) = 3;
  LOBYTE(v3) = 1;
  wil::details::FeatureImpl<__WilFeatureTraits_Feature_TestNM>::ReportUsage(
    &`wil::Feature<__WilFeatureTraits_Feature_TestNM>::GetImpl'::`2'::impl,
    v3,
    v4);
  return 1;

But the disassembly is more interesting. Specifically, if you look at address 00000001800407DE, you see that xor al, al that is bypassed altogether by the preceding unconditional jump to the instruction below it. That’s great, we do not even have to insert too much stuff ourselves. That jump is 2 bytes: EB 02. Let’s nop those, so replace with 90 90.
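To convince yourself of what the 2 nops do, here is a toy interpreter, written just for this illustration, that understands only the handful of opcodes at the end of that function. It leaves out the stack cleanup (add rsp, 20h and the pops) and goes straight to ret; run_tail is a made-up name.

```c
#include <stddef.h>

/* Toy interpreter for the few opcodes in the function's tail.
   Returns the final value of al, or -1 on anything unexpected. */
int run_tail(const unsigned char *code, size_t len)
{
    int al = -1;
    size_t ip = 0;
    while (ip < len)
    {
        switch (code[ip])
        {
        case 0xB0: /* mov al, imm8 */
            if (ip + 1 >= len) return -1;
            al = code[ip + 1];
            ip += 2;
            break;
        case 0xEB: /* jmp short rel8 (only forward jumps are handled here) */
            if (ip + 1 >= len) return -1;
            ip += 2 + code[ip + 1];
            break;
        case 0x32: /* xor al, al (32 C0) */
            if (ip + 1 >= len || code[ip + 1] != 0xC0) return -1;
            al = 0;
            ip += 2;
            break;
        case 0x90: /* nop */
            ip += 1;
            break;
        case 0xC3: /* ret */
            return al;
        default:
            return -1;
        }
    }
    return -1;
}
```

Feeding it B0 01 EB 02 32 C0 C3 gives 1, because the jump skips the xor. Feeding it B0 01 90 90 32 C0 C3 gives 0, because execution now falls through into the xor al, al.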

Drop the patched file in place of the original Windows.UI.QuickActions.dll near ShellExperienceHost.exe, make sure to reload it if it’s loaded in memory, click the network icon and…:

YEEEEEY!!! Call it sheer luck or whatever, but it works. The battery fix came for free as well, after this, as it seemed to suffer from the same thing:

Okay, so it’s only 2 bytes Microsoft had to fix to make this happen, but yet again I have to come and clean up the mess…? Well, almost. See, there’s a caveat with this: it enables this behavior generally in ShellExperienceHost.exe. Try launching ms-availablenetworks:, which in Windows 11 should open a list like this:

Instead, it will now open the Windows 11 notification center combined with the calendar thing. So, the way it seems to me is that setting byte_180571308 to 0 globally enables some legacy behavior in ShellExperienceHost.exe, let’s call it. That’s great for network and battery, as it fixes those, but it disables newer stuff, like the Windows 11 WiFi list or the Windows 11 action center. It is like an UndockingDisabled, but for this case. Ideally, we would want to keep the best of both worlds. This is discussed in the next part, where I develop a shippable solution as functionality for ExplorerPatcher.

Implementation

Okay, so how do we wrap this knowledge up to deliver something that’s actually shippable?

As you saw, the static patch is kind of limited in the sense that it enables only either of the 2 worlds. Of course, we could patch in our own assembly, but that’s too much of a hassle, if what we did until now wasn’t.

So let’s consider dynamic patching. How to approach that?

First of all, let’s consider what the common methods of executing our code there would be:

Exploiting the library search order by placing our custom library in the same folder as the target executable, with the name and exports of a well known library. ExplorerPatcher already does that by masquerading as dxgi.dll (DirectX Graphics Infrastructure) for explorer.exe and StartMenuExperienceHost.exe.

Looking at the list of imports from dxgi.dll for ShellExperienceHost.exe, similar to explorer.exe, it only calls DXGIDeclareAdapterRemovalSupport very early in the execution stage, so that is a viable option and candidate for our entry point.

Hooking and patching the executable at run time. For this, there are a plethora of methods: the basic idea is to CreateRemoteThread in the target process which executes shellcode we wrote there using WriteProcessMemory that loads our library in the target process and executes its entry point. I used this in the past in plenty of places, including as recently as hooking StartMenuExperienceHost.exe with that. Unfortunately, for some reason that’s still unknown to me at the moment, calling LoadLibraryW when using this method in the remote process to load our library sometimes fails with ERROR_ACCESS_DENIED. Same config, it only happens on some systems. It’s very weird, seems like a half baked “security” feature by Microsoft, since I can execute the shellcode just fine, it is only LoadLibraryW that fails. If this is indeed considered “security”, it’s just plain stupid because the solution is obviously for one to write their own library loader, which I will certainly do at some point for future projects. Preferably in amd64 assembly, so that it’s just one char* array in C into which I write some offsets at runtime, write it to the remote process, execute it and off to the races.

So, considering the above, I will go with the first option. For ExplorerPatcher, this is advantageous as this is basically the same infrastructure as for the 2 other executables that are hooked (explorer.exe and StartMenuExperienceHost.exe), so it keeps the implementation rather tidy.

Now, what do we do once we get to execute code in the target? We need to patch that jmp from before basically, or patch out the entire function maybe? It depends on the strategy that we choose. Let’s recall our options:

Hook/inject functions exported by system libraries used by the target application. This is used extensively throughout ExplorerPatcher and it is the preferable method, when possible. It involves hooking some known function exported by libraries such as user32.dll, shell32.dll etc. This works when the thing we want to modify is calling some function like this, or depends on it etc. The patch is done by altering the delay import table or the import table for a library from the target. The limitations are that you need to have the target actually call some known function from some library (which is less likely in these new “Windows Runtime” executables Microsoft is flooding more and more of the OS with) and sometimes it’s hard to filter calls from other actors from calls from actors you are interested in.

Hook/inject any function. When I say “any”, I mean the rest of the functions, those actually contained in the target we want to exploit. Conveniently, Microsoft makes symbols available for its system files. These files tell you human-friendly names for various things in the executable code, including function sites. Hooking is usually done by installing a trampoline, basically patching the first few bytes of the function to jump to our custom crafted function where we do what we want and then we call the original we saved a pointer to. There are plenty of libraries to aid with this, like funchook or Microsoft Detours. I have done it manually as well in the past; the big problem is that in order to automate this, you need a disassembler at runtime: recall that x86 is CISC (instruction length varies). Say your trampoline takes 7 bytes, you need to patch the first 7 bytes + whatever it takes to complete the last instruction. To determine what that additional amount is, obviously the program needs to be able to understand x86 instructions at runtime.

The disadvantage is that this is global, affects the entire executable address space, plus the symbol addresses change with each version compiled, so those have to be updated for every new Windows build somehow (ExplorerPatcher uses a combination of downloading the symbols for your current build from Microsoft and extracting the info from there, and hardcoding the addresses for some known file releases). This is used in ExplorerPatcher for the Win+X menu build function from twinui.pcshell.dll, the context menu owner draw functions also from there, and for hooking some function calls in StartMenuExperienceHost.exe; this is usually a last resort method.

Pattern matching. This is the classic of the classics. Basically, you try to determine as generic a strategy as possible to identify some bytes in your target function that are likely to work on future versions and that are unique to your function. If you are smart, you can pull it off; if you are lucky, it also survives more than one build.

I feel lucky today, or rather, I feel that some things in that function are pretty unique and likely not to change, based on my previous experiences disassembling Microsoft’s stuff, so I will go with this. The reason is also that I want to try to minimize the symbols fiasco (Microsoft has recently been delaying publishing the symbols for some builds for some unknown reason, and then people running beta builds start flooding the forums with requests for me to “fix” it); it’s just one thing to patch, I don’t want to introduce the need for other symbols, and if we anyway have 2 methods of hooking stuff already present in ExplorerPatcher, what can a third one do…?

void InjectShellExperienceHost()
{
#ifdef _WIN64
    HMODULE hQA = LoadLibraryW(L"Windows.UI.QuickActions.dll");
    if (hQA)
    {
        PIMAGE_DOS_HEADER dosHeader = (PIMAGE_DOS_HEADER)hQA;
        if (dosHeader->e_magic == IMAGE_DOS_SIGNATURE)
        {
            PIMAGE_NT_HEADERS64 ntHeader = (PIMAGE_NT_HEADERS64)((u_char*)dosHeader + dosHeader->e_lfanew);
            if (ntHeader->Signature == IMAGE_NT_SIGNATURE)
            {
                char* pSEHPatchArea = NULL;
                char seh_pattern1[14] =
                {
                    // mov al, 1
                    0xB0, 0x01,
                    // jmp + 2
                    0xEB, 0x02,
                    // xor al, al
                    0x32, 0xC0,
                    // add rsp, 0x20
                    0x48, 0x83, 0xC4, 0x20,
                    // pop rdi
                    0x5F,
                    // pop rsi
                    0x5E,
                    // pop rbx
                    0x5B,
                    // ret
                    0xC3
                };
                char seh_off = 12;
                char seh_pattern2[5] =
                {
                    // mov r8b, 3
                    0x41, 0xB0, 0x03,
                    // mov dl, 1
                    0xB2, 0x01
                };
                BOOL bTwice = FALSE;
                PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(ntHeader);
                for (unsigned int i = 0; i < ntHeader->FileHeader.NumberOfSections; ++i)
                {
                    if (section->Characteristics & IMAGE_SCN_CNT_CODE)
                    {
                        if (section->SizeOfRawData && !bTwice)
                        {
                            DWORD dwOldProtect;
                            // Cast to char* so the RVA is added in bytes, not in units of the handle's pointee size.
                            VirtualProtect((char*)hQA + section->VirtualAddress, section->SizeOfRawData, PAGE_EXECUTE_READWRITE, &dwOldProtect);
                            char* pCandidate = NULL;
                            while (TRUE)
                            {
                                pCandidate = memmem(
                                    !pCandidate ? (char*)hQA + section->VirtualAddress : pCandidate,
                                    !pCandidate ? section->SizeOfRawData : (uintptr_t)section->SizeOfRawData - (uintptr_t)(pCandidate - ((char*)hQA + section->VirtualAddress)),
                                    seh_pattern1,
                                    sizeof(seh_pattern1)
                                );
                                if (!pCandidate)
                                {
                                    break;
                                }
                                char* pCandidate2 = pCandidate - seh_off - sizeof(seh_pattern2);
                                // Make sure the second pattern would still lie inside this section.
                                if (pCandidate2 >= (char*)hQA + section->VirtualAddress)
                                {
                                    if (memmem(pCandidate2, sizeof(seh_pattern2), seh_pattern2, sizeof(seh_pattern2)))
                                    {
                                        if (!pSEHPatchArea)
                                        {
                                            pSEHPatchArea = pCandidate;
                                        }
                                        else
                                        {
                                            bTwice = TRUE;
                                        }
                                    }
                                }
                                pCandidate += sizeof(seh_pattern1);
                            }
                            VirtualProtect((char*)hQA + section->VirtualAddress, section->SizeOfRawData, dwOldProtect, &dwOldProtect);
                        }
                    }
                    section++;
                }
                if (pSEHPatchArea && !bTwice)
                {
                    DWORD dwOldProtect;
                    VirtualProtect(pSEHPatchArea, sizeof(seh_pattern1), PAGE_EXECUTE_READWRITE, &dwOldProtect);
                    pSEHPatchArea[2] = 0x90;
                    pSEHPatchArea[3] = 0x90;
                    VirtualProtect(pSEHPatchArea, sizeof(seh_pattern1), dwOldProtect, &dwOldProtect);
                }
            }
        }
    }
#endif
}

Firstly, I match by this pattern:

.text:00000001800407DA                 mov     al, 1
.text:00000001800407DC                 jmp     short loc_1800407E0
.text:00000001800407DE ; ---------------------------------------------------------------------------
.text:00000001800407DE                 xor     al, al
.text:00000001800407E0
.text:00000001800407E0 loc_1800407E0:                          ; CODE XREF: FlightHelper__CalculateRemodelEnabled+F0↑j
.text:00000001800407E0                 add     rsp, 20h
.text:00000001800407E4                 pop     rdi
.text:00000001800407E5                 pop     rsi
.text:00000001800407E6                 pop     rbx
.text:00000001800407E7                 retn

Then I skip back over some bytes (loading the address of a function and calling it, some telemetry call) and try to match the following, which is also a pattern I have seen in many of their libraries; I haven’t bothered to understand it, but it seems to stay there:

.text:00000001800407C9                 mov     r8b, 3
.text:00000001800407CC                 mov     dl, 1

On builds 22000.318 and 22000.346, the first pattern alone yields 2 results, while adding the second one leaves only the result we are interested in. Additionally, I patch only if I find a single match, as otherwise it is likely that the file has changed drastically.
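To make the "patch only on a single match" rule concrete, here is a minimal, portable sketch that works on a plain byte buffer rather than on a mapped module. The byte encodings are my reading of the two listings above, the fixed 12-byte gap is taken from the addresses shown (0x7CE to 0x7DA), and all names (`PATTERN_EPILOGUE`, `PATTERN_PROLOGUE`, `find_unique_patch_site`, `patch_if_unique`) are hypothetical; the real code also has to deal with `VirtualProtect`, which is left out here.

```c
#include <stddef.h>
#include <string.h>

/* Assumed encodings of the instructions in the first listing. */
static const unsigned char PATTERN_EPILOGUE[] = {
    0xB0, 0x01,             /* mov al, 1 */
    0xEB, 0x02,             /* jmp short +2 */
    0x32, 0xC0,             /* xor al, al */
    0x48, 0x83, 0xC4, 0x20, /* add rsp, 20h */
    0x5F, 0x5E, 0x5B,       /* pop rdi; pop rsi; pop rbx */
    0xC3                    /* retn */
};

/* Assumed encodings of the instructions in the second listing. */
static const unsigned char PATTERN_PROLOGUE[] = {
    0x41, 0xB0, 0x03,       /* mov r8b, 3 */
    0xB2, 0x01              /* mov dl, 1 */
};

/* Assumption: bytes between the two patterns, 0x7DA - 0x7CE in the listings. */
#define SKIPPED_BYTES 12

/* Naive byte search; returns NULL when the needle is not found. */
static unsigned char *find_bytes(unsigned char *hay, size_t hay_len,
                                 const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0 || hay_len < needle_len) return NULL;
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0) return hay + i;
    return NULL;
}

/* Returns the only location where both patterns line up, or NULL if there
 * are zero or several such locations. */
static unsigned char *find_unique_patch_site(unsigned char *code, size_t len)
{
    const size_t back = SKIPPED_BYTES + sizeof(PATTERN_PROLOGUE);
    unsigned char *site = NULL;
    unsigned char *cur = code;

    while ((cur = find_bytes(cur, len - (size_t)(cur - code),
                             PATTERN_EPILOGUE, sizeof(PATTERN_EPILOGUE))) != NULL) {
        if ((size_t)(cur - code) >= back &&
            memcmp(cur - back, PATTERN_PROLOGUE, sizeof(PATTERN_PROLOGUE)) == 0) {
            if (site) return NULL; /* second hit: the file probably changed */
            site = cur;
        }
        cur += sizeof(PATTERN_EPILOGUE);
    }
    return site;
}

/* NOPs out the 2-byte `jmp short`, so execution falls into `xor al, al`.
 * Returns 1 if the patch was applied, 0 otherwise. */
static int patch_if_unique(unsigned char *code, size_t len)
{
    unsigned char *site = find_unique_patch_site(code, len);
    if (!site) return 0;
    site[2] = 0x90;
    site[3] = 0x90;
    return 1;
}
```

Refusing to patch on anything other than exactly one match trades availability for safety: on an unknown build the feature silently does nothing instead of corrupting unrelated code.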

Okay, so we do this patch at runtime, now what? It’s still going to behave like the static patch, ain’t it?

If we leave it like this, indeed it is. So, we have to enhance it a bit. At first I tried signaling ShellExperienceHost.exe to switch between the 2 modes, by patching and reverting those 2 bytes. Unfortunately, this doesn’t really work: once a mode is set, it stays like that: things already load in that mode and calling stuff only available in the other mode crashes ShellExperienceHost.exe apparently.

So what next? From the way I toggle the Windows 11 WiFi list, I already have code that opens something that is a Windows.UI.Core.CoreWindow and waits for it to close. The principle would be simple here: when the network or battery flyout is invoked, kill ShellExperienceHost.exe if running, somehow signal the instance that will open that we want it to run in legacy mode, wait for the flyout to be dismissed by the user (using the infrastructure from above), and then kill ShellExperienceHost.exe again so that Windows 11 things still open normally.

Last quest: how do we signal ShellExperienceHost.exe that we want it to run in legacy mode? Better said, how do we signal this to our code running in there?

Well, UWP apps are kind of jailed. They have limited access to the file system and registry. I experienced this as well when working in StartMenuExperienceHost.exe. I don’t know exactly how to determine what they have/do not have access to, as that also seems to vary quite a lot based on the actual executable we are talking about, plus I do not really care. I care about making this work. So, after failing to get it to work with named events, I looked into what keys in the registry this program accesses. Indeed, I found this path in HKCU:

Control Panel\Quick Actions\Control Center\QuickActionsStateCapture

So, to finish this off, to signal the legacy mode, when it’s time to invoke the network/battery flyout, I create an ExplorerPatcher key in there. When our library gets injected in ShellExperienceHost.exe, we check whether that key exists and only then patch the executable; add this to the function above, right at the beginning:

    HKEY hKey;
    if (RegOpenKeyW(HKEY_CURRENT_USER, _T(SEH_REGPATH), &hKey))
    {
        return;
    }

After the flyout is dismissed, we make sure to delete the key, and then terminate ShellExperienceHost.exe, so that the new instance that will be launched eventually loads normally, in Windows 11 mode.

That would be it here.

Bonus: the language switcher

I mentioned this in the beginning, so let’s talk about it a bit as well: I recently needed a way to detect when the input language changes for sws (my custom, written from scratch implementation of a Windows 10-like Alt-Tab switcher). The reason I needed to detect this is so that I can change the key mapping for Alt+~, which shows the window switcher but only for windows of the current application. I need to change it because ~ has a different virtual key code depending on the layout that is loaded: it is, largely, either VK_OEM_3 or VK_OEM_5. Or rather, not ~ specifically, but the key above Tab. I always wanted to map that key in combination with Alt. I know it by scan code (0x29 if I remember correctly), but RegisterHotKey only accepts virtual key codes, so the hot key has to be unregistered and registered again every time the user changes the layout.

As it is pretty standard with Windows these days, there are a couple of old APIs that largely do not work anymore or are weird:

WM_INPUTLANGCHANGE is received only by the active window, which in 99.99% of the cases is not the window switcher
this thing (ITfLanguageProfileNotifySink) does not really work, or I could not get it to work; or rather, when I saw how much time I was wasting trying to make it work, with Google failing to provide me with input from users facing the same hurdles, I decided to give up on it

So, what to do? Naturally, disassemble their executables and see how they do it. Isn’t this normal? Isn’t a closed system you are discouraged from tampering and playing with the norm these days?

I looked again through explorer.exe: in one of its CoCreateInstance calls, it requests some IID_InputSwitchControl interface from CLSID_InputSwitchControl (actual GUIDs are in ExplorerPatcher’s source code):

__int64 __fastcall CTrayInputIndicator::_RegisterInputSwitch(CTrayInputIndicator *this)
{
  LPVOID *ppv; // rbx
  HRESULT Instance; // edi

  ppv = (LPVOID *)((char *)this + 328);
  Microsoft::WRL::ComPtr<IVirtualDesktopNotificationService>::InternalRelease((char *)this + 328);
  Instance = CoCreateInstance(&CLSID_InputSwitchControl, 0i64, 1u, &IID_InputSwitchControl, ppv);
  if ( Instance >= 0 )
  {
    Instance = (*(__int64 (__fastcall **)(LPVOID, _QWORD))(*(_QWORD *)*ppv + 24i64))(*ppv, 0i64);
    if ( Instance < 0 )
    {
      Microsoft::WRL::ComPtr<IVirtualDesktopNotificationService>::InternalRelease(ppv);
    }
    else
    {
      (*(void (__fastcall **)(LPVOID, char *))(*(_QWORD *)*ppv + 32i64))(*ppv, (char *)this + 16);
      (*(void (__fastcall **)(LPVOID))(*(_QWORD *)*ppv + 64i64))(*ppv);
    }
  }

  return (unsigned int)Instance;
}

Okay, next step for determining the virtual table of this interface is to consult the registry and see which library actually implements it:

Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID\{B9BC2A50-43C3-41AA-A086-5DB14E184BAE}\InProcServer32

It points us to:

C:\Windows\System32\InputSwitch.dll

How do you know how the interface is named? Well, you don’t, but look around in the DLL. There should be some methods called CreateInstance that correspond to the factories for a COM object. If something has a CreateInstance method, it’s probably the implementation of an interface. In InputSwitch.dll, there aren’t many, and there’s one called CInputSwitchControl. They usually name the implementations after the interfaces, with a “C” prefix instead of the “I”.

Let’s take that as a candidate. The strategy next is to go through all the functions of that “class” and see if they are mentioned in any virtual table, and then try to see if the positions in that virtual table match with whatever code in explorer.exe is calling it. We eventually get to a virtual table that looks like this one:

dq offset ?QueryInterface@?$RuntimeClassImpl@U?$RuntimeClassFlags@$01@WRL@Microsoft@@$00$0A@$0A@UIInputSwitchControl@@@Details@WRL@Microsoft@@UEAAJAEBU_GUID@@PEAPEAX@Z
dq offset ?AddRef@?$RuntimeClassImpl@U?$RuntimeClassFlags@$01@WRL@Microsoft@@$00$0A@$0A@UIInputSwitchControl@@@Details@WRL@Microsoft@@UEAAKXZ ; Microsoft::WRL::Details::RuntimeClassImpl<Microsoft::WRL::RuntimeClassFlags<2>,1,0,0,IInputSwitchControl>::AddRef(void)
dq offset ?Release@?$RuntimeClassImpl@U?$RuntimeClassFlags@$01@WRL@Microsoft@@$00$0A@$0A@UIInputSwitchControl@@@Details@WRL@Microsoft@@UEAAKXZ ; Microsoft::WRL::Details::RuntimeClassImpl<Microsoft::WRL::RuntimeClassFlags<2>,1,0,0,IInputSwitchControl>::Release(void)
dq offset ?Init@CInputSwitchControl@@UEAAJW4__MIDL___MIDL_itf_inputswitchserver_0000_0000_0001@@@Z ; CInputSwitchControl::Init(__MIDL___MIDL_itf_inputswitchserver_0000_0000_0001)
dq offset ?SetCallback@CInputSwitchControl@@UEAAJPEAUIInputSwitchCallback@@@Z ; CInputSwitchControl::SetCallback(IInputSwitchCallback *)
...

Is this the virtual table we’re after?

If we look back on the disassembly from explorer.exe, we see 2 calls are made for functions within this interface:

Instance = (*(__int64 (__fastcall **)(LPVOID, _QWORD))(*(_QWORD *)*ppv + 24i64))(*ppv, 0i64);

And

(*(void (__fastcall **)(LPVOID, char *))(*(_QWORD *)*ppv + 32i64))(*ppv, (char *)this + 16);

That means the first 2 functions after QueryInterface, AddRef and Release, which would mean Init and SetCallback from above. Besides obviously checking at runtime (it’s easy since the called library is an in-process server and handler), we can also be more sure by looking at the second call, specifically how it sends a pointer, (char *)this + 16 to the library. That’s equivalent to (_QWORD*)this + 2 (an INT64 or QWORD is made up of 8 bytes or chars). If we look in the constructor of CTrayInputIndicator from explorer.exe (CTrayInputIndicator::CTrayInputIndicator), we see this on the first lines:

  *(_QWORD *)this = &CTrayInputIndicator::`vftable'{for `CImpWndProc'};
  *((_QWORD *)this + 2) = &CTrayInputIndicator::`vftable'{for `IInputSwitchCallback'};
  *((_QWORD *)this + 1) = 0i64;

So, (char *)this + 16 is the virtual table for an IInputSwitchCallback. That’s what the SetCallback method from above also states it takes and what you would expect from a SetCallback method. So yeah, that pretty much seems to be it.
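The offset arithmetic used in this argument can be checked mechanically. The sketch below is only an illustration: the struct mirrors the first five slots of the virtual table dump, with the slot order as the only thing taken from the post, and the type and helper names (`InputSwitchControlVtbl`, `vtable_slot_from_offset`, `qword_index_to_byte_offset`) are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the first vtable slots, in the order of the dump. */
typedef struct InputSwitchControlVtbl {
    void *QueryInterface; /* slot 0 */
    void *AddRef;         /* slot 1 */
    void *Release;        /* slot 2 */
    void *Init;           /* slot 3 */
    void *SetCallback;    /* slot 4 */
} InputSwitchControlVtbl;

/* Converts a byte offset, as printed by the decompiler (e.g. `+ 24i64`),
 * into a vtable slot index. */
static size_t vtable_slot_from_offset(size_t byte_offset)
{
    return byte_offset / sizeof(void *);
}

/* Byte offset reached by `(uint64_t *)base + qword_index`, which is what
 * `(_QWORD *)this + 2` means in the decompiled constructor. */
static size_t qword_index_to_byte_offset(size_t qword_index)
{
    return qword_index * sizeof(uint64_t);
}
```

On a 64-bit build, offsets 24 and 32 land on slots 3 and 4 (Init and SetCallback), and QWORD index 2 is byte offset 16, which is why `(char *)this + 16` and `(_QWORD *)this + 2` refer to the same place.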

Now, how do you use this?

As shown above. First of all, you initialize the input switch control by calling Init with a parameter of 0. What’s that 0? Let’s see what InputSwitch.dll does with it; it’s a long function (CInputSwitchControl::_Init), but at some point it does this:

v5 = IsUtil::MapClientTypeToString(a2);

Sounds very interesting. And it looks even better:

const wchar_t *__fastcall IsUtil::MapClientTypeToString(int a1)
{
  int v1; // ecx
  int v2; // ecx
  int v4; // ecx
  int v5; // ecx

  if ( !a1 )
    return L"DESKTOP";

  v1 = a1 - 1;
  if ( !v1 )
    return L"TOUCHKEYBOARD";

  v2 = v1 - 1;
  if ( !v2 )
    return L"LOGONUI";

  v4 = v2 - 1;
  if ( !v4 )
    return L"UAC";

  v5 = v4 - 1;
  if ( !v5 )
    return L"SETTINGSPANE";

  if ( v5 == 1 )
    return L"OOBE";

  return L"OTHER";
}

Now I understand. So explorer.exe seems to want the “desktop” user interface (kind of like the SetNetworkUXMode from above). There are other interfaces too. The one that looks like the Windows 10 one I determined, by testing, to be 1, aka TOUCHKEYBOARD (I exposed the rest via ExplorerPatcher’s Properties GUI as well; each works to a certain degree).

Also, in my window switcher, I don’t want any UI, I just want to benefit from using the callback, as you’ll see in a moment. Again, by experimentation, it seems that providing any value not in that list (like 100) makes it not draw a UI, while the callback still works fine. That’s great!

That’s cool. What about the callback? Well, I think the vtable explains it pretty well:
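The chain of subtractions in the decompiled MapClientTypeToString is just a switch over consecutive values. Here is a readable equivalent as a sketch: the numeric values and strings come from the decompilation above, while the enum and function names are invented for this example, and plain `char` strings are used instead of wide strings to keep it portable.

```c
#include <stddef.h>

/* Hypothetical names; only the values 0..5 are taken from the decompilation. */
typedef enum InputSwitchClientType {
    ISCT_DESKTOP       = 0,
    ISCT_TOUCHKEYBOARD = 1,
    ISCT_LOGONUI       = 2,
    ISCT_UAC           = 3,
    ISCT_SETTINGSPANE  = 4,
    ISCT_OOBE          = 5
} InputSwitchClientType;

/* Readable equivalent of the decompiled function: each `vN = vN-1 - 1`
 * followed by `if (!vN)` is simply a test for the next integer. */
static const char *map_client_type_to_string(int client_type)
{
    switch (client_type) {
    case ISCT_DESKTOP:       return "DESKTOP";
    case ISCT_TOUCHKEYBOARD: return "TOUCHKEYBOARD";
    case ISCT_LOGONUI:       return "LOGONUI";
    case ISCT_UAC:           return "UAC";
    case ISCT_SETTINGSPANE:  return "SETTINGSPANE";
    case ISCT_OOBE:          return "OOBE";
    default:                 return "OTHER"; /* e.g. 100 ends up here */
    }
}
```

Any value outside 0 to 5, such as the 100 mentioned above, maps to "OTHER"; that this also suppresses the UI is an observation from experimentation, not something this function shows.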

CTrayInputIndicator::QueryInterface(_GUID const &,void * *)
[thunk]:CTrayInputIndicator::AddRef`adjustor{16}' (void)
[thunk]:CTrayInputIndicator::Release`adjustor{16}' (void)
CTrayInputIndicator::OnUpdateProfile(__MIDL___MIDL_itf_inputswitchserver_0000_0000_0002 const *)
CTrayInputIndicator::OnUpdateTsfFloatingFlags(ulong)
CTrayInputIndicator::OnProfileCountChange(uint,int)
CTrayInputIndicator::OnShowHide(int,int,int)
CTrayInputIndicator::OnImeModeItemUpdate(__MIDL___MIDL_itf_inputswitchserver_0000_0000_0003 const *)
CTrayInputIndicator::OnModalitySelected(__MIDL___MIDL_itf_inputswitchserver_0000_0000_0005)
CTrayInputIndicator::OnContextFlagsChange(ulong)
CTrayInputIndicator::OnTouchKeyboardManualInvoke(void)

The method of interest for me in my window switcher was OnUpdateProfile, so I looked a bit into that. By trial and error and looking around, I determined its signature to actually be something like:

static HRESULT STDMETHODCALLTYPE _IInputSwitchCallback_OnUpdateProfile(IInputSwitchCallback* _this, IInputSwitchCallbackUpdateData *ud)

Where the IInputSwitchCallbackUpdateData looks like this:

typedef struct IInputSwitchCallbackUpdateData
{
    DWORD dwID; // OK
    DWORD dw0; // always 0
    LPCWSTR pwszLangShort; // OK ("ENG")
    LPCWSTR pwszLang; // OK ("English (United States)")
    LPCWSTR pwszKbShort; // OK ("US")
    LPCWSTR pwszKb; // OK ("US keyboard")
    LPCWSTR pwszUnknown5;
    LPCWSTR pwszUnknown6;
    LPCWSTR pwszLocale; // OK ("en-US")
    LPCWSTR pwszUnknown8;
    LPCWSTR pwszUnknown9;
    LPCWSTR pwszUnknown10;
    LPCWSTR pwszUnknown11;
    LPCWSTR pwszUnknown12;
    LPCWSTR pwszUnknown13;
    LPCWSTR pwszUnknown14;
    LPCWSTR pwszUnknown15;
    LPCWSTR pwszUnknown16;
    LPCWSTR pwszUnknown17;
    DWORD dwUnknown18;
    DWORD dwUnknown19;
    DWORD dwNumber; // ???
} IInputSwitchCallbackUpdateData;

The dwID is what contains a HKL combined with a language ID. It’s an entire theory that is kind of well explained here, that you understand in one evening, make use of, and then quickly forget, as it’s too damaging for the human brain to remember such over-engineered complications:

https://referencesource.microsoft.com/#system.windows.forms/winforms/Managed/System/WinForms/InputLanguage.cs

Also, take a look at my window switcher’s implementation, to see how I extract the HKL from that:

https://github.com/valinet/sws/blob/74a906c158a91100377a6e8220b0a3c5a8e98657/SimpleWindowSwitcher/sws_WindowSwitcher.c#L3

So, back to explorer.exe, how do we make it load the Windows 10 switcher instead? That 0 seems to be hardcoded in the code there.

Well, it is, but if we look at the disassembly:

.text:000000014013B21E                 call    cs:__imp_CoCreateInstance
.text:000000014013B225                 nop     dword ptr [rax+rax+00h]
.text:000000014013B22A                 mov     edi, eax
.text:000000014013B22C                 test    eax, eax
.text:000000014013B22E                 js      short loc_14013B294
.text:000000014013B230                 mov     rcx, [rbx]
.text:000000014013B233                 mov     rax, [rcx]
.text:000000014013B236                 mov     r10, 0ABF9B6BC16DC1070h
.text:000000014013B240                 mov     rax, [rax+18h]
.text:000000014013B244                 xor     edx, edx
.text:000000014013B246                 call    cs:__guard_xfg_dispatch_icall_fptr

Maybe patching the virtual table of the COM interface could be a solution, but it is a bit difficult because the executable is protected by control flow guard. How do you tell? Besides confirming with dumpbin, it’s right there in front of you. That mov r10, 0ABF9B6BC16DC1070h is the canary that is written before the target call site. So you have to write that + 1 (0ABF9B6BC16DC1071h, you can confirm by looking above ?Init@CInputSwitchControl in InputSwitch.dll) above your target function. Doable, I think. I haven’t tried, but I probably will when the hack that I did breaks, or at some other opportunity (I have to figure out how to have the compiler place that number there automatically, without me patching the binary afterwards manually). Also, I don’t know if those numbers change between versions of the libraries; I presume not, as that would break compatibility between versions, but I do really have to learn more about CFG.

For this, I instead opted to hack it away, by observing that edx is never changed besides that xor edx, edx. So, I just have to neuter that, and then set edx myself and immediately return from my CoCreateInstance hook. Like this:

//globals:
char mov_edx_val[6] = { 0xBA, 0x00, 0x00, 0x00, 0x00, 0xC3 };
char* ep_pf = NULL;

//...
// in CoCreateInstance hook:
char pattern[2] = { 0x33, 0xD2 };
DWORD dwOldProtect;
char* p_mov_edx_val = mov_edx_val;
if (!ep_pf)
{
	ep_pf = memmem(_ReturnAddress(), 200, pattern, 2);
	if (ep_pf)
	{
		// Cancel out `xor edx, edx`
		VirtualProtect(ep_pf, 2, PAGE_EXECUTE_READWRITE, &dwOldProtect);
		memset(ep_pf, 0x90, 2);
		VirtualProtect(ep_pf, 2, dwOldProtect, &dwOldProtect);
	}
	VirtualProtect(p_mov_edx_val, 6, PAGE_EXECUTE_READWRITE, &dwOldProtect);
}
if (ep_pf)
{
	// Craft a "function" which does `mov edx, whatever; ret` and call it
	// Write the desired value into the imm32 operand of `mov edx, imm32`
	DWORD* pVal = (DWORD*)(mov_edx_val + 1);
	*pVal = dwIMEStyle;
	void(*pf_mov_edx_val)(void) = (void(*)(void))p_mov_edx_val;
	pf_mov_edx_val();
}

https://github.com/valinet/ExplorerPatcher/blob/ff26abe9a39fb90510450356ba2a807fb97cfa69/ExplorerPatcher/dllmain.c#L4247
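For clarity, this is how the 6-byte stub in mov_edx_val is laid out. The sketch only builds and inspects the bytes on a plain buffer; it does not make them executable or call them, which remains the job of the VirtualProtect and function pointer code above. The helper names (`build_mov_edx_stub`, `read_stub_immediate`) are hypothetical.

```c
#include <stdint.h>

#define STUB_SIZE 6

/* Fills `stub` with `mov edx, imm32` (BA followed by a little-endian imm32)
 * and then `ret` (C3). */
static void build_mov_edx_stub(unsigned char stub[STUB_SIZE], uint32_t value)
{
    stub[0] = 0xBA;                                  /* mov edx, imm32 */
    stub[1] = (unsigned char)(value & 0xFF);         /* imm32, lowest byte first */
    stub[2] = (unsigned char)((value >> 8) & 0xFF);
    stub[3] = (unsigned char)((value >> 16) & 0xFF);
    stub[4] = (unsigned char)((value >> 24) & 0xFF);
    stub[5] = 0xC3;                                  /* ret */
}

/* Reads back the immediate operand stored in the stub. */
static uint32_t read_stub_immediate(const unsigned char stub[STUB_SIZE])
{
    return (uint32_t)stub[1]
         | ((uint32_t)stub[2] << 8)
         | ((uint32_t)stub[3] << 16)
         | ((uint32_t)stub[4] << 24);
}
```

Writing the immediate byte by byte avoids the unaligned `DWORD*` store and keeps the layout independent of the host's endianness; on x86 the result is the same as the pointer cast used in the hook.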

The result?

Conclusion

Not the most beautiful patches in the world, but they work out rather nicely. Quite some stuff has been achieved with the limited resources available at our disposal. Of course, Microsoft fixing the interface would still be the preferable option, like, in some cases, it only takes 2 bytes give or take, but they chose to deliver a label-less taskbar instead, that’s simply a productivity nightmare. Oh, well… at least ExplorerPatcher exists.

Yeah, a long post, but hopefully it gave you some ideas. Let’s hack away!

P.S. Before asking, no, the 2 battery icons are not some “side effect” or a bug or anything like that: one is Windows’ icon, and the other is the icon of the excellent and versatile Battery Mode app one can use as a better replacement to what Windows offers in any of its modes.