The year is especially apropos since
apt-key(8) ships for the last time in Debian 11 and Ubuntu 22.04.
https://manpages.debian.org/bullseye/apt/apt-key.8.en.html
I just came across this when I tried to follow Bazel’s apt installation instructions. They reference apt-key, so I knew that wasn’t right. Here is what worked:
$ sudo mkdir -p /etc/apt/keyrings
$ curl https://bazel.build/bazel-release.pub.gpg | \
sudo gpg --no-default-keyring \
--keyring /etc/apt/keyrings/bazel-release.pub.gpg \
--import
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4714  100  4714    0     0  12894      0 --:--:-- --:--:-- --:--:-- 12879
gpg: key 3D5919B448457EE0: "Bazel Developer (Bazel APT repository key) <[email protected]>" not changed
gpg: Total number processed: 1
gpg: unchanged: 1
This downloads the key and immediately puts it in a new keyring under /etc/apt/keyrings. Other places will say to use /etc/apt/trusted.gpg.d, but you don’t want apt to trust this key for any repositories other than the specific one it is meant for.
Next, we need to tell apt that packages the Bazel project signs with their release key should be verified against the keyring we just created. We do this by putting
signed-by=/etc/apt/keyrings/bazel-release.pub.gpg
into the options field of our apt sources entry. In the spirit of Bazel’s apt instructions, you can use this command:
(echo -n "deb [arch=amd64 signed-by=/etc/apt/keyrings/bazel-release.pub.gpg]";
echo " https://storage.googleapis.com/bazel-apt stable jdk1.8" ) |
sudo tee /etc/apt/sources.list.d/bazel.list
Of course, this is just the package I was installing today and you can use this process for any package and key pair you need to add in the future.
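To generalize the commands above, a small helper can build the signed-by sources line for any repo/key pair. This is just a sketch; the function name and argument order are invented for this example:

```shell
# Sketch: build a signed-by sources.list entry for any repo/key pair.
# make_sources_entry is a made-up helper, not a standard tool.
make_sources_entry() {
  local keyring="$1" url="$2" suite="$3" component="$4"
  printf 'deb [arch=amd64 signed-by=%s] %s %s %s\n' \
    "$keyring" "$url" "$suite" "$component"
}

# Reproduces the Bazel entry from above:
make_sources_entry /etc/apt/keyrings/bazel-release.pub.gpg \
  https://storage.googleapis.com/bazel-apt stable jdk1.8
```

Pipe the output through `sudo tee /etc/apt/sources.list.d/NAME.list` as in the Bazel example.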
And Bing’s AI now has an actual working example to refer to when I ask it “How can I add a key so that apt will use it to verify the contents of only one repository?”
Valet uses .test so that you can easily deploy multiple projects in your local development environment. To do this, it uses dnsmasq listening on 127.0.0.1 (lo). Other development tools like libvirt use dnsmasq in a similar way, but coordinating all the instances of dnsmasq is tricky and can result in infinite lookup loops.
Assuming you have libvirt set up to deploy hosts on a virtual bridge (usually virbr0), libvirt deploys dnsmasq to listen on the IP address that is the default route for that network. dnsmasq responds to DHCP requests, so it knows the IPs of all the hosts on that network and acts as the resolver for all the hosts there.
In my case, virbr0 is where the 10.5.5.0/24 subnet lives and the default route for hosts on that subnet is 10.5.5.1. Since I can communicate with 10.5.5.1 and I know that the virtual machine I just spun up is on 10.5.5.185, I’ll ask that dnsmasq to perform a reverse lookup to get the top-level domain for the network:
$ host 10.5.5.185 10.5.5.1
Using domain server:
Name: 10.5.5.1
Address: 10.5.5.1#53
Aliases:

185.5.5.10.in-addr.arpa domain name pointer mw135-profiling.network.
From here, we can see that it is assigning .network as the TLD for the subnet.
Now, recall that I have Valet configured with a separately managed dnsmasq listening on lo. I want to point it at the dnsmasq that manages virbr0 and pass reverse lookups for the 10.5.5.0/24 subnet, and any name resolution for the .network TLD, to that instance of dnsmasq.
Adding the following two lines to the dnsmasq configuration does this:
server=/.network/10.5.5.1
server=/.5.5.10.in-addr.arpa/10.5.5.1
After reloading the dnsmasq configuration, I can ssh to the virtual hosts by name and use their names in my browser.
In the past few days, I finally took the chance to look at it in terms of server resources rather than just as an SMW or website problem. That approach finally led to some results.
Load was pretty high on the machine even though there wasn’t that much traffic. We’re talking a load in the 30s or so. Running top showed that there were a lot of apache processes running, consuming a lot of memory and sending the machine into swap.
A quick investigation into apache settings showed the problem:
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
Changing those settings to 5, 25 and 5, respectively, meant fewer resources were being used out of the gate, so load quickly fell to below 10 and the site became more responsive since there was less thrashing going on inside the box.
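For reference, the adjusted values look like this (the stanza shape assumes Apache’s worker/event MPM configuration file):

```
MinSpareThreads  5
MaxSpareThreads 25
ThreadsPerChild  5
```

Lower spare-thread counts mean Apache keeps fewer idle processes around, which is what stopped the box from swapping.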
We still have work to do on the bots. I’m going to be working with at least two other people on that this year. One of the first steps is using the Cloud Services to set up the crawlers.
Photo credit: Vitalii Bashkatov, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons
| Error count | Error |
|------------:|-------|
| 5 | Could not acquire lock |
| 31 | Error: 1213 Deadlock found when trying to get lock; try restarting transaction |
| 994 | Error: 1969 Query execution was interrupted (max_statement_time exceeded) |
I got the data from WikiApiary’s exception log. During this time, there was only one entry in the fatal log (“Allowed memory size … exhausted”). For anyone who wants to check my work (or, at least, verify that I’m not missing something) here is the one-liner I used to get this:
(grep '^Error: ' exception.log; grep 'Could not acquire lock' exception.log | cut -d ' ' -f16- | sed "s,.LinksUpdate:.*,,; s, *,,;" ) | sort | uniq -c | sort -n
This is around one DB problem per minute. When you consider that the site was almost completely unusable before this, I’m actually pretty pleased with these results.
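Summing the table’s counts backs that rate up (the logging window here is my inference from the one-per-minute rate, not something stated in the log):

```shell
# Total errors from the exception-log table above
echo $((5 + 31 + 994))   # → 1030

# At roughly one per minute, 1030 errors implies a window of
# about 1030 minutes, i.e. roughly 17 hours (inferred)
echo $((1030 / 60))      # → 17
```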
To get to these results, I did a couple of things: I capped query time with max_statement_time, and I throttled the crawlers. I wrote about max_statement_time earlier, so I’ll talk a bit about what I did with the crawlers here.
First, some background: a few years ago, as one of his side projects, Jamie Thingelstad created the set of Python scripts that crawl the wikis and created WikiApiary to collect the information. Jamie knew WikiApiary and the scripts weren’t perfect, but it was just a fun hobby.
He could probably see the hungry looks of wiki-enthusiasts, but he had a lot of things he was working on besides wikis. He wisely decided it was time to let go.
I contacted him to transfer the domain and have mostly kept it puttering along with much more active on-wiki volunteers (shout out to Karsten, Shufflertoxin, Hoof Hearted and all the rest) until things really started getting wonky this year.
As I hinted here just a few days ago, WikiApiary is behind the times when it comes to the transition to Python 3. Python 2 has been EOL for over 2.5 years, but we still haven’t updated WikiApiary’s scripts. That really needs work. (This feels like we’re on the wrong side of the transition from PHP 4 to PHP 5, guys!)
Anyway, I managed to cobble together working bots, but, for a reason I still don’t understand, they take a long time to run and, while running, they pummel WikiApiary with requests, causing the site to constantly produce Error 500s because of database problems.
So, instead of using cron to kick off a bot once a minute, which resulted in 60 bots running at once, I pared it down to a maximum of two bots running simultaneously.
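One way to express that cap is with flock(1) in the crontab. This is only a sketch; the lock and script paths are invented for this example, not WikiApiary’s actual setup:

```
# Two crontab entries, each guarded by its own lock file, so at most
# two bots run at once no matter how long each run takes.
# -n makes flock give up immediately if the lock is already held.
* * * * * flock -n /var/lock/apiary-bot-1.lock /usr/bin/python /opt/wikiapiary/bot.py
* * * * * flock -n /var/lock/apiary-bot-2.lock /usr/bin/python /opt/wikiapiary/bot.py
```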
Note that I might be able to increase this, but right now there are just under 4 million jobs in the job queue because of another problem that was hidden until Shufflertoxin pointed out that pages weren’t being returned from the categorymembers API.
I’m torn between declaring job queue bankruptcy and using 10 job queue runners to eat away at the job queue. If I delete the job queue and run rebuildData.php, I might save time and get everything back to where it should be.
Or, I could just uncover more problems.
Either way, the wiki still needs to be upgraded from the now-ancient version 1.29 that it is currently running under.
I had a lot of time to work on WikiApiary this week, but I probably won’t have as much time for the next couple of months. This work will be relegated back to the back burner for me. If anyone else wants to help out, here are the things that need to be done (I know, I should create tasks for these in phabricator):
When I started writing this, I thought I was going to write about my idea of moving the MediaWiki infrastructure for WikiApiary to Canasta, but, as you can see, I had a lot of other things that I needed to let you know in the meantime.
Photo credit: “Canastas en El Bajío – Ciudad de México 170924 174342 6443 RX100M5 DeppArtEf” by Lucy Nieto is licensed under CC BY-NC-SA 2.0.
I wrote about some of that work last Friday, but I did some more tinkering and the site seems to be working better. A brief summary of what I did since then:
We were getting “Allowed memory size of .... bytes exhausted” messages in PageForms for some pages. Doubling the memory available seems to have solved that, for now.

We’ve had some people volunteer to help with the maintenance of the site, but we could probably use more. Let me know if you want to help. Especially helpful right now would be someone who is good at porting Python 2 code to Python 3.
Photo is a detail of one by Dmitry Grigoriev on Unsplash
I didn’t realize this at first, of course. No, I just put the settings into mysqld.cnf.
That worked for a few minutes, but then the site died again. I ended up with OOM messages in /var/log/syslog:
systemd[1]: mariadb.service: A process of this unit has been killed by the OOM killer.
Once I did more careful checking, I saw that some of the settings would allocate 20G of RAM. I don’t have that much, so, of course, the OOM killer struck.
I did some research to find more reasonable settings and made some changes. So far, so good.
In case it helps someone else, here are the settings I ended up with on a system with about 6GB of RAM:
# https://mariadb.com/docs/reference/mdb/system-variables/innodb_buffer_pool_size/
# Default Value: 134217728 (128M)
innodb_buffer_pool_size = 512M

# https://mariadb.com/docs/reference/mdb/system-variables/max_heap_table_size/
# Default Value: 16777216 (16M)
max_heap_table_size = 128M

# https://mariadb.com/docs/reference/mdb/system-variables/tmp_table_size/
# Default Value: 16777216 (16M)
tmp_table_size = 128M

# https://mariadb.com/docs/reference/mdb/system-variables/read_buffer_size/
# Default Value: 131072 (128k)
read_buffer_size = 10M

# https://mariadb.com/docs/reference/mdb/system-variables/join_buffer_size/
# Default Value: 262144 (256k)
join_buffer_size = 1M

# https://mariadb.com/docs/reference/mdb/system-variables/key_buffer_size/
# Default Value: 134217728 (128M)
key_buffer_size = 256M

# https://mariadb.com/docs/reference/mdb/system-variables/sort_buffer_size/
# Default Value: 2097152 (2M)
sort_buffer_size = 10M
Later update: OK, that worked for a bit, but I still ended up with some long-running DB queries that flooded the server, and the bots aren’t even crawling yet. Fine. I’ve taken a sledgehammer to the time allowed per query by setting MAX_STATEMENT_TIME to 30 for the DB user that PHP is using.
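For the record, MariaDB can enforce this as a per-account resource limit. A sketch, assuming a recent-enough MariaDB (10.1+) and an invented account name:

```sql
-- 'wikiuser'@'localhost' is an invented name; substitute the DB user
-- that PHP connects as. MAX_STATEMENT_TIME here is in seconds; queries
-- running longer are aborted for this account only.
GRANT USAGE ON *.* TO 'wikiuser'@'localhost' WITH MAX_STATEMENT_TIME 30;
```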
Image credit: Retired electrician, CC0, via Wikimedia Commons
In my particular case, I started with a stem that looked like this:
/img_auth.php/c/c9/Logo.png
But MediaWiki provides different ways to configure it to serve images.
The hard-coded URL wouldn’t work on a wiki that had $wgHashedUploadDirectory set to false. And it wouldn’t work on a wiki where img_auth.php wasn’t being used.
I wanted to be able to provide the client-side with a URL that would display the image in the file Logo.png no matter how the wiki was configured.
I thought someone else might have a clue, so I asked in the MediaWiki Extensions room on Matrix. Marijn told me about the little-known Special:Filepath (or Special:Redirect/file) special page that does exactly this.
After a little testing, it worked. A bit of jQuery and MediaWiki’s mw.util, and I have the following, which gives the client the appropriate URL path regardless of the wiki’s configuration:
// Build a configuration-independent URL for the file
var url = mw.util.getUrl("Special:Redirect/file/Logo.png");
$("#image-here").append("<img src='" + url + "' />");
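For illustration, on a wiki with the default article path of "/wiki/$1", the URL that call returns has this shape. The sketch below is a simplification of MediaWiki’s actual encoding, not its implementation, and real wikis may configure $wgArticlePath differently:

```javascript
// Simplified model of mw.util.getUrl for an article path of "/wiki/$1";
// MediaWiki's wikiUrlencode keeps ":" and "/" readable, so we undo
// those two escapes after encodeURIComponent.
function getUrlSketch(title) {
  return "/wiki/" + encodeURIComponent(title)
    .replace(/%3A/g, ":")
    .replace(/%2F/g, "/");
}

console.log(getUrlSketch("Special:Redirect/file/Logo.png"));
// → "/wiki/Special:Redirect/file/Logo.png"
```

Requesting that path makes MediaWiki itself figure out where the file actually lives and redirect the browser there.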
Image credit: Tomas Castelazo, CC BY-SA 3.0, via Wikimedia Commons