Credit for this solution must go to Kannan Suresh, who published a walkthrough for this solution on his website, aneejian.com, and hosts example code on his GitHub Repo, jekyll-blog-archive-workflow.
In the context of Jekyll sites, or other blog sites, the term “archives” means a listing of all blog posts, often grouped by some metadata. In this case we are going to group blog posts by category, tag and year of publication.
It is assumed that all blog posts in the site will have Front Matter YAML which includes both category and tag metadata for the post.
In order to make this work, a few things are needed in your Jekyll site, and in your GitHub Repo:
An archivedata.txt data file will be used by a GitHub Action to loop through each category, tag and year associated with the site’s blog posts.
Three layout templates will define the appearance of each listing of blog pages. In other words, these are templates for the pages that show, for example, a list of blog pages that match a given category, etc.
If you don’t need different formatting for these listings it would be possible to change this solution to have just one template, but other changes would be needed, especially in the Python script used by the GitHub Action.
The GitHub Action includes steps that will generate new pages in your site for each category, tag and publication year of your posts, listing the blog posts relating to that metadata.
Not included in the solution on aneejian.com are pages that hold indexes of each category, tag or year by which the blogs are grouped, with links to each page of grouped content.
It is assumed that you are reading/following the published solution on aneejian.com.
A rough elaboration of the solution, plus a few changes that were needed in my own implementation, follows:
The solution suggests putting the archivedata.txt file in a folder named _archives. The name of this folder seems to be optional, but if you want to change it you’ll need to adjust some of the configuration that follows.
The contents of this folder are published as a Collection, defined in the _config.yml file:
collections:
  archives:
    output: true
    permalink: /archives/:path/
Thus, the archivedata.txt file is published under the
/archives/archivedata/ path in your site, which is important as it is needed
later on when the GitHub Action fires.
NOTE: Despite there (eventually) being other folders and files under this
path, e.g. /_archives/years/2024.md, they aren’t published under the
/archives/ path. This is because each page has its own permalink defined
in its Front Matter, overriding the path that the Collection would have created.
/_archives/years/2024.md:
---
title: 2024
year: "2024"
layout: archive-years
permalink: "year/2024"
---
Eventually, a number of new files will be generated by the GitHub Action under
the _archives folder, e.g. /_archives/years/2024.md. These files will have
Front Matter data that defines the layout (i.e. template) to be used to render
that page. See above for example Front Matter for such a page.
The published solution defines three layouts depending on whether posts are being grouped by category, tag or year of publication. These layouts are effectively hard-coded in the Python script used in the GitHub action so, unless you want to rewrite the script, it’s easiest to create the three layout files as suggested.
WARNING: The example layout files are themselves based on a default layout (template) for the site. In the example this template is named default but in my Jekyll Minima site the default template has been renamed as base, so the Front Matter for the layout files needed to be changed:
---
layout: base
---
Also, I just wanted a simple listing of the blog pages, with links to each, rather than the more advanced formatting used in the solution (which used an include), so I changed the content of my layout files accordingly.
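As an illustration, my simplified layout looked something like the following minimal sketch. It assumes the generated pages carry the category name in a Front Matter field, by analogy with the year Front Matter example above; it is not the published solution's markup:

```liquid
---
layout: base
---
<h1>{{ page.title }}</h1>
<ul>
  {% for post in site.categories[page.category] %}
    <li><a href="{{ post.url | relative_url }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```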
The GitHub Action (workflow) is triggered by changes made under the _posts folder, but it can be triggered manually from the Actions tab of your repo on GitHub. As noted above, it performs the following steps:
The provided solution will run the workflow when changes are made within the _posts path, but this assumes that folder is in the root of the repo. In my case the Jekyll site was within a docs folder, so this needed to be fixed:
on:
  workflow_dispatch:
  push:
    paths:
      - "docs/_posts/**"
I made the required changes to specify the location of the archivedata file and the output path for the archive files. The following snippet of the add_archives.yml file shows these changes, as part of the step where the original repo is used to define the core of the Action:
- name: Generate Jekyll Archives
  uses: kannansuresh/jekyll-blog-archive-workflow@master
  with:
    archive_url: "https://dev.joynt.co.uk/archives/archivedata"
    archive_folder_path: "docs/_archives"
From what I can tell, this looks for the action.yml file in that repo which then creates a docker image using a dockerfile from that same repo. The Docker image includes a copy of the Python script that creates the many output files for each category, tag and year.
When I first tried to run this action I noticed errors saying that it failed to update my repo. This was caused by two problems: the workflow was using master as the branch name (mine is main), and restrictive permissions did not allow the workflow to push. So I made some more changes:
The original code used the master branch:
git push origin master
I needed to change this to main (the newer default name for the first branch of a new GitHub repo). However I needed to make further changes to this line; see below.
I added the following, although I cannot be certain it was needed:
permissions:
  contents: write
My repo doesn’t allow pushes from just anyone, so I created a new Personal
Access Token (PAT) and saved this as a secret in my repo. I changed the final
line of the workflow to use this token in the git push command:
git push https://x-access-token:${{ secrets.ACCESS_TOKEN }}@github.com/${{ github.repository }}.git HEAD:main || echo "No changes to push."
It is worth noting, as they originally confused me, how the trailing echo
statements worked, e.g.
primary_command || echo "some text"
If the primary_command fails, i.e. gives a non-zero exit value, then the
command following the || (OR operator) runs.
Thus, if there were no changes that needed to be made to the repo then the logs for the Action would show this explicitly.
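This short-circuit behaviour is easy to demonstrate in a shell:

```shell
# '||' runs the right-hand command only when the left-hand one fails
false || echo "fallback ran"          # prints: fallback ran
true || echo "this is not printed"    # prints nothing
```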
It may be that some Jekyll themes automatically include templates that will create index pages for categories, but these pages weren’t created in my case. To fix this I created two new pages (I wasn’t so bothered about indexing by year):
/_pages/categories.md and /_pages/tags.md. These needed to list all the categories/tags in the site.posts list, and to
provide links to the “archive” page for each category/tag. This would have been
easy, except that these archive pages had names generated by the Python script.
The script sanitizes the names of the files it creates, which are based on the names of the categories and tags included in the blog posts. Thus, the names of the pages these indexes need to link to cannot be taken immediately from the original category or tag names. The following Liquid code was used to update the names of the categories and tags to match the output of the script:
{% assign value_escaped = category[0] | replace: ' ', '-' | replace: '.', '-' %}
{% assign value_escaped = value_escaped | replace: '#', 'sharp' %}
{% assign value_escaped = value_escaped | downcase %}
{% assign value_escaped = value_escaped | replace_regex: '[^a-z0-9_-]', '-' %}
---
layout: page
title: "Categories"
permalink: /categories/
---
<ul>
{% assign sorted_categories = site.categories | sort %}
{% for category in sorted_categories %}
{% assign value_escaped = category[0] | replace: ' ', '-' | replace: '.', '-' %}
{% assign value_escaped = value_escaped | replace: '#', 'sharp' %}
{% assign value_escaped = value_escaped | downcase %}
{% assign value_escaped = value_escaped | replace_regex: '[^a-z0-9_-]', '-' %}
<li>
<a href="/category/{{ value_escaped }}">{{ category[0] }}</a> ({{ category[1].size }} posts)
</li>
{% endfor %}
</ul>
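For reference, the same sanitisation can be sketched in shell. This is my own approximation of what the script's transformation amounts to, based on the Liquid filters above; the actual Python code may differ:

```shell
# Hypothetical shell equivalent of the name sanitisation:
# spaces and dots to hyphens, '#' to 'sharp', lowercase,
# then any remaining disallowed character to '-'.
sanitize() {
  printf '%s' "$1" \
    | sed -e 's/[ .]/-/g' -e 's/#/sharp/g' \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9_-]/-/g'
}

sanitize "C# Tips"   # prints: csharp-tips
```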
Unrelated to the deployment of Syncthing, but relevant to the overall goal of the exercise, was the use of Samba. This was used to set up a share on the local network so that devices could stream music files from this server. Thus, the location where Syncthing would save these files was important.
There is an official docker container available for Syncthing on DockerHub: syncthing/syncthing:latest.
I was able to get this up and running easily, directly in Docker. However, I was sure that I wanted to orchestrate this container using Kubernetes, so I moved on quickly to Microk8s.
As expected, a NodePort was needed to expose the Syncthing UI to the local network.
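A NodePort Service for this might look like the following sketch. The service name, selector label and nodePort value here are my illustrative assumptions, not taken from my actual manifest; Syncthing's web UI listens on port 8384:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: syncthing-ui        # hypothetical name
spec:
  type: NodePort
  selector:
    app: syncthing          # must match the deployment's pod labels
  ports:
    - port: 8384
      targetPort: 8384
      nodePort: 30084       # must be within 30000-32767
```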
One of the challenges / worries that I had when considering how to run Syncthing
with Microk8s was how it would access the host file system. I didn’t want it to
just create its own folders for mounting storage within the Docker container. I
knew there would be a lot of data copied to/from the server and it needed to be
on a separate physical disk, mounted at /data/ on the host.
In order to get this working, I implemented the following combination:
# Under spec/template/spec/containers:
env:
  - name: PUID
    value: "110"   # local account UID
  - name: PGID
    value: "110"   # local account GID
securityContext:
  runAsUser: 110
  runAsGroup: 110

# Under spec/template/spec/containers:
volumeMounts:
  - mountPath: /var/syncthing
    name: syncthing-config
  - mountPath: /data/syncthing
    name: syncthing-data

# Under spec/template/spec/
volumes:
  - name: syncthing-config
    hostPath:
      path: /data/syncthing/config
      type: Directory
  - name: syncthing-data
    hostPath:
      path: /data/syncthing
      type: Directory
It didn’t seem necessary to create a PersistentVolumeClaim for this one-off instance.
Snaps are app packages for desktop, cloud and IoT that are easy to install, secure, cross-platform and dependency-free. Snaps are discoverable and installable from the Snap Store, the app store for Linux with an audience of millions.
A snap is a bundle of an app and its dependencies that works without modification across Linux distributions. –Canonical Snapcraft
Applications in a Snap run in a “container” [sandbox] with limited access to the host system. The Snap sandbox heavily relies on the AppArmor Linux Security Module from the upstream Linux kernel. –Wikipedia
In theory this means that services running as snaps cannot access certain paths on the host system; however, in my own deployments I was able to mount local paths into Docker containers.
With Docker installed as a snap, it behaved pretty much as expected, at least for the simple testing I did.
For those used to simply running kubectl commands, when using Microk8s it
seems that these have to be “wrapped” by the microk8s command. For example:
microk8s kubectl get all -n kube-system
Microk8s is a version of Kubernetes designed for small systems, test deployments, etc. Like any Kubernetes cluster, it is configured by applying YAML configuration files. To avoid losing these config files, I have saved them in a Microk8s folder, with sub-folders per namespace.
microk8s kubectl apply -f <file_name> -n <namespace>
To help see what is going on within the Kubernetes deployment, the microk8s dashboard was enabled, however it wasn’t immediately visible on the local network. To achieve this I had to:
Note that one of the restrictions was that the NodePort port needed to be in the range of 30000 to 32767.
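As a sketch, exposing the dashboard could be done with a NodePort Service along these lines. The names and ports here are assumptions based on a standard dashboard deployment, not my actual configuration; the dashboard normally lives in the kube-system namespace and serves HTTPS on container port 8443:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: dashboard-nodeport   # hypothetical name
  namespace: kube-system
spec:
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30443        # must be within 30000-32767
```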
VSCode supports additional functionality via extensions that can be added to the IDE via .vsix files downloaded from the VSCode Marketplace. The one I chose to experiment with first is:
Once installed (and VSCode reloads its extensions) there will be a new icon in the Activity Bar. Clicking on this opens the Continue extension in the Primary Side Bar. The recommendation is to move this to the Secondary Side Bar panel on the right hand side.
If it is not already visible, open the Secondary Side Bar panel with Ctrl+Alt+B.
You can then drag the Continue icon from the Activity Bar over to the Secondary
Side Bar.
You should now see a prompt box with the text “Ask anything. ‘@’ to add context” and a few other icons.
Continue.dev is a VSCode (and JetBrains) extension that provides the following functionality, similar to GitHub Copilot:
Out of the box, the extension is configured to use Anthropic’s Claude 3.5 Sonnet, for which you’ll need an API key and paid credit. You can select (add) other models via a drop-down at the bottom of the prompt box, choosing the + Add Chat Model option. A new panel will open, providing a list of LLM providers to choose from. If you have an OpenAI API key, you could enter that here and use GPT-4o, for example.
In my case I want to use a LLM running on my local network, at least to test how well this worked compared to GitHub Copilot. I already have LM Studio running as a server, so I chose LM Studio from the list of providers. I left the other options as default (“Install provider” was the LM Studio website and “Model” was set to auto-detect) and clicked Connect.
Doing this adds the following configuration for Continue in its config.json file:
{
  "apiBase": "http://localhost:1234/v1/",
  "model": "AUTODETECT",
  "title": "Autodetect (1)",
  "provider": "lmstudio"
}
If you’re not running LM Studio on the local machine, change localhost to be
the IP address or hostname of the machine LM Studio is running on. Once you have
done this you should now see a list of “auto-detected” models in the drop-down
list as an alternative to Claude 3.5 Sonnet. The list is provided by LM Studio
so if you want different models you need to download them there.
Once you have a model selected, you can use the prompt box to interrogate the
LLM. You can use data provided by VSCode for context; just type @ and a list
of options appears. For example, you can specify the currently active file and
ask questions about the contents.
If you want to include a chunk of highlighted code in your message to the LLM,
press CTRL+L to copy the highlighted code across.
You may not want everything you are working on to be sent to the LLM for code completion suggestions. Click on Continue in the status bar (right-hand end) and you can enable/disable code completion.
Code completion options are set in the config.json file. In order to set up completion suggestions you need to add/update the following JSON in the config:
"tabAutocompleteModel": {
"apiBase": "http://localhost:1234/v1/",
"title": "lmstudio",
"provider": "lmstudio",
"model": "stable-code-instruct-3b"
},
Again, replace localhost with the server name or IP address of the machine
running LM Studio if it isn’t running locally. You may see some errors in the
chat bot panel if the extension attempts to retrieve completion suggestions from
the LLM but isn’t able to connect.
Note also that the model name needs to be hard-coded in this chunk of JSON. The documentation suggests entering the full “path” of the model in LM Studio, but based on the 404 error message from the server it’s only necessary to use the short name for the model in LM Studio.
Editing code in-place in the active file tab works as described (see link above) but the quality of the suggestions will obviously depend on the model used.
Not all models are suitable for all tasks. For example, when selecting stable-code-instruct-3b in the example above for code completion, the Continue extension shows the following warning:
Warning: stable-code-instruct-3b is not trained for tab-autocomplete, and will result in low-quality suggestions. See the docs to learn more about why:
The FAQs and other documentation on docs.continue.dev explain that models need to be trained for fill-in-the-middle usage to work well for auto-completion, while general purpose “chat” LLMs are not so optimised.
The recommendation from Continue is to use Qwen2.5-Coder 1.5B.
For normal chat and editing of code, more traditional models can be used; for example, the Llama 2 model might be useful. Choosing a model that is small enough to run on your hardware yet capable enough to perform well is important.
While looking for suitable tools, I came across a number of different IDEs that supported AI integration. There’s not much point listing them here as a web search for “IDE with AI integration” provides what is needed.
Some projects that I have wanted to work on require access to APIs that charge for access. While the prices aren’t high, especially for relatively low usage, it seemed unnecessary to pay anything when open-source models are available that can run on relatively modest hardware (NVidia RTX GPUs).
So I started looking into options for serving large language models (LLMs) on my local network, specifically, on a Windows 11 PC with an 8GB RTX 2080 GPU. As it happens, the first app I tried seemed to meet my needs so I didn’t look further!
LM Studio, on version 0.3.6 at the time of writing, is described on its own website as “a desktop app for developing and experimenting with LLMs on your computer”. There are installers for Windows, MacOS and Linux. Once installed you can:
It doesn’t seem to matter what architecture you choose for the models you want to run (e.g. Llama, Phi, etc.). Perhaps obviously, you can only download open-source models; you can’t run versions of ChatGPT or Claude.
You need to choose models that will fit in the available VRAM (8GB in my case). There are options to offload some inference to the CPU and main RAM but I chose not to experiment with that.
Whichever model you choose, the API offered by LM Studio running as a server is OpenAI-compatible.
It is worth mentioning that the other LLM server that I have most frequently come across is Ollama. I may experiment with this in the future but for now, LM Studio is working fine.
Another option is vLLM. Like Ollama this appears to be command-line only, and designed to be run as a server rather than a desktop app with a UI.
It was necessary to use a remote_theme setting to pull
down the latest version of the theme to support all the documented features.
The breakthrough was as simple as reading the _config.yml file used by the GitHub Pages for the jekyll/minima theme itself:
All that was really needed was to comment out the theme setting and replace
it with a remote_theme setting, like so:
# As of November 2023, GitHub Pages still uses Minima 2.5.1
# (https://pages.github.com/versions/).
# If you want to use the latest Minima version on GitHub Pages, use the
# following setting and comment out the "theme: minima" line.
remote_theme: jekyll/minima
# theme: minima
This has ensured that the following features (and probably some others I haven’t found) now work:
Initially the purpose of creating these GPTs was simply to explore the new feature(s) being released by OpenAI. Simple/trivial use-cases such as customised DALL-E image generation seemed to be more fun. Over time these GPTs have been extended slightly to explore how GPT Actions can interact with remote APIs such as the GitHub API and AWS Lambda functions.
For more information on these, please visit my gpt.joynt.co.uk site, which will be more up-to-date than this post here!
It’s probably best to open and read these pages first, then come back to the notes below:
You enable GitHub pages via the Settings for your repository, where there should be a Pages option in the menu at the left hand side.
In brief, GitHub allows users to host pages under the owner.github.io domain, where owner is your GitHub username. If you want to use a custom domain name instead, you need to set up a CNAME in your own domain that points to owner.github.io.
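As a sketch, the CNAME record in your DNS zone might look like the following zone-file entry (hypothetical domain and TTL):

```
; hypothetical zone-file entry for a custom domain on GitHub Pages
www.example.com.    3600    IN    CNAME    owner.github.io.
```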
I have chosen to put my documents in the /docs/ subfolder in my main
branch, rather than use a separate gh-pages branch, at least for now. It seems
more intuitive to use just one branch for a personal GitHub Pages site.
Setting up the site as above has automatically created a GitHub Action that will build the site. The success (or otherwise) of these deployments can be seen in the Actions section of the repo.
Formatting for GitHub pages is provided by Jekyll, which can transform Markdown
(and other markups) into themed websites, with little more than a few lines of
YAML at the top of each page (known as Front Matter) and/or in a
/docs/_config.yml file.
Creating an index.md file in the /docs/ folder seems to have created a simple site, served under the /docs/ path.
The documentation seems to suggest that a _config.yml file will be created
automatically; this didn’t seem to happen, so I have created one manually within
the /docs/ folder.
This file defines the overall look and feel for the site, albeit with a number of options pre-defined by GitHub.
There is a list of supported themes which
can be specified by adding theme: jekyll-theme-NAME in _config.yml:
title: dev.joynt.co.uk
description: Home
theme: minima
However it seems that, for some of these themes at least, the version of the
theme provided by GitHub is older than the main release of the theme. I wanted
to try the latest minima theme, so I had to comment out theme: minima and
specify a remote_theme:
#theme: minima
remote_theme: jekyll/minima
As I created more pages, the root folder of the site started filling up. I created a _pages subfolder and moved various markdown pages into that. Sadly, when the site rebuilt they weren’t present. The fix for this was to specify the folder in the _config.yml file:
include:
- _pages
NOTE:
For the files in the _pages folder to appear in the built site directly
under the root path, e.g. https://dev.joynt.co.uk/blog/, it was necessary to
add permalink: <name> to the Front Matter (see below) on each page.
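For example, a hypothetical page at /_pages/blog.md would need Front Matter along these lines for it to appear at https://dev.joynt.co.uk/blog/ (the title and layout values here are illustrative assumptions):

```yaml
---
layout: page
title: Blog
permalink: /blog/
---
```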
Each page in the Jekyll-themed site should have some lines of YAML at the very top, known as Front Matter.
For example, this “blog post” page has the following:
---
layout: post
title: Learning about GitHub Pages
---