Unnecessarily Complex Deployment Workflow: Blog Deployment
Introduction
Note: This post was written before my data loss, but I still had a copy saved via syncthing on my phone. I have tried to update it with the latest changes, but some things (like the scripts or the build file) may have evolved since. Check the git repositories just in case.
In the previous post of this series, I explained the context, constraints and overall solution for deploying my blog and gemini capsule. This second post covers the initial setup and the specific steps to build and deploy this blog. The third post will cover the capsule deployment. If you haven’t yet, I suggest you read at least the solution overview section of the previous post.
Solution details
As described in the previous post, the main steps are:
- Pushing code to the sourcehut git repository
- An automated build process starts and builds both the web and capsule artifacts
- Once the artifacts are ready, the build sends a message to a ntfy topic
- The web server, subscribed to this ntfy topic, downloads and copies the files when it receives a message
- The web server sends a message to another ntfy topic to alert me that everything is done
That is the global idea. So now let’s configure all this!
Configuration
Sourcehut Git Repositories configuration
Let’s start with SourceHut, where my git repositories are hosted and where the build process runs.
On SourceHut, I have a Writings project containing 4 git repos:
- website: contains the hugo files and markdown content for generating the site.
- capsule: contains the kiln files and gemtext content for generating the capsule and blog links (without their content).
- Deploy (unlisted / hidden from the list): contains the script files
- MinIndie: my hugo theme, managed independently
To start a build automatically after a git push, the build manifest must be at the root of the repository, in a file named .build.yml. I have the same .build.yml in both my website and capsule repos, so pushing to either starts the build.
I’m using a slightly different .build.yml in the deploy repository: it simply runs the pipeline without deploying and alerts me of the result via email. Deployments happen only for changes in the capsule or website repository.
I’m using archlinux as the base image of the CI to leverage AUR packages and thus easily install tools that are usually not in official repos, like kiln, ntfysh or python-aiolinkding (a python library to interact with the linkding API).
The build manifest will load the 4 mentioned repositories and use them to generate 2 artifacts: website.tgz and capsule.tgz, containing respectively the public files of this website and those of the gemini capsule (even though the capsule isn’t complete at this stage and doesn’t use this workflow yet).
To generate bookmarks, the scripts load the linkding-related secrets (API token and instance URL) via the sourcehut secrets management system. set +x and set -x are used to hide the secrets from the build output. The ntfy configuration is loaded from secrets too, as well as the name of the topic to use.
Gemlog files are retrieved from the capsule git repository and then transformed as required. This means changing the extension and keeping only the frontmatter area, not the content.
The complete .build.yml file looks like this:
image: archlinux
packages:
- hugo
- go
- scdoc
- make
- python
- python-aiolinkding
- kiln-git
- ntfysh-bin
- python-pygal
- python-cairosvg
sources:
- https://git.sr.ht/~bacardi55/website
- https://git.sr.ht/~bacardi55/capsule
- https://git.sr.ht/~bacardi55/writting-deploy
- https://git.sr.ht/~bacardi55/MinIndie
secrets:
- addbd283-3e0a-40dd-ad27-093ae07963df # linkding url and api token
- 9e2057e3-8f8f-4e6c-91e2-0236fc4a7e18 # ntfy token
- d552a5b3-3f7d-48c0-a2bb-15669c63a014 # ntfy topic
tasks:
- setup: |
    mkdir ~/website/themes/
    cp -r ~/MinIndie ~/website/themes/minindie
- bookmarks: |
    set +x
    . ~/.linkding_secrets
    set -x
    cd ~/writting-deploy/ && ./build_bookmarks.sh
- content: |
    cd ~/writting-deploy/ && ./import_gemlog_to_blog.sh
    #cd ~/writting-deploy/ && ./import_blog_to_gemlog.sh
- stats: |
    cd ~/writting-deploy/ && ./generate_content_stats.sh
- build: |
    cd ~/writting-deploy/ && ./build_blog.sh
    cd ~/writting-deploy/ && ./build_capsule.sh
- deploy: |
    mkdir ~/artifacts
    cd ~/website/ && tar czf ~/artifacts/website.tgz ./public/
    cd ~/capsule/ && tar czf ~/artifacts/capsule.tgz ./public/
- ntfy: |
    set +x
    . ~/.ntfy_topic
    /usr/bin/ntfy publish -c ~/.ntfy_secrets --title="JOB_ID: ${JOB_ID}" "${NTFY_TOPIC}" "New successful build: ${JOB_ID} -- Job URL: ${JOB_URL}" > /dev/null 2>&1
    set -x
artifacts:
- "artifacts/website.tgz"
- "artifacts/capsule.tgz"
triggers:
- action: email
  condition: failure
  to: <email@example.com>
The different build tasks are:
- setup: puts the MinIndie theme within the themes folder of the hugo directory
- bookmarks: loads the linkding secrets and generates the bookmark markdown files via the build_bookmarks.sh script (see below)
- content: copies (and adapts) gemlog entries to the blog (see below), and blog entries to the gemlog (detailed in the 3rd post of this series)
- stats: generates the content statistics (you can read more about it in a dedicated post)
- build: generates the blog public files via build_blog.sh (see below) and the capsule public files via build_capsule.sh (detailed in the 3rd post)
- deploy: creates both the website.tgz and capsule.tgz archive files
- ntfy: sends a message to the relevant topic with the JOB_ID
Regarding triggers, an email is sent only if a build fails. I don’t want an email when the build is successful, as I should receive a notification via ntfy.
The artifacts section tells sourcehut to keep the generated artifacts (for 90 days) so the web (and gemini) server(s) can download them afterwards.
Build scripts
All the following scripts are available in the Deploy repo.
dirconfig.sh
All the following bash scripts rely on this one, which contains the directory configuration (so I can have a different one locally). Note the ${var:?} expansions used throughout the scripts: they make a command fail when the variable is unset or empty, instead of, say, letting rm -rf run on an unintended path. The content of the dirconfig.sh file is:
#!/bin/bash
# Main directories
temp="/home/build/_tmpdir"
blog="/home/build/website"
capsule="/home/build/capsule"
export_bookmarks_script="/home/build/writting-deploy/shared-bookmarks-to-md.py"
generate_stats_script="/home/build/writting-deploy/generate_content_stats_graph.py"
# Subdirectories
## Blog:
blog_content="content"
blog_public="public"
blog_bookmarks="content/bookmarks"
blog_gemlog="content/gemlog"
## Capsule:
capsule_public="public"
capsule_gemlog="content/gemlog"
# Stats related directory
stats_export_dir="/home/build/website/data"
stats_export_images="/home/build/website/static/images/pages/stats/"
stats_filename="content_stats.json"
stats_content_types=("posts" "gemlog" "bookmarks")
build_bookmarks.sh
Put simply, the goal of this script is to start a python script (shared-bookmarks-to-md.py) that does all the work of importing bookmarks. Content of the build_bookmarks.sh file:
#!/bin/bash
source ./dirconfig.sh
echo "Starting bookmark export to markdown"
echo "Creating temp directory: ${temp:?}"
mkdir "${temp:?}" || exit
echo "Going to bookmarks directory: ${blog:?}/${blog_bookmarks:?}"
cd "${blog:?}/${blog_bookmarks}/" || exit
cp ./_index.md "${temp:?}/bookmark_index.md" || exit
rm -f ./*.md
/usr/bin/python "${export_bookmarks_script:?}" || exit
cp "${temp:?}/bookmark_index.md" ./_index.md || exit
echo "Cleaning temp dir ${temp}"
rm -rf "${temp:?}" || exit
echo "Bookmarks files generated successfully"
shared-bookmarks-to-md.py
This script retrieves all shared bookmarks via the linkding API and generates a markdown file for each in the right folder (content/bookmarks/).
/!\ This script was written quickly and has an obvious problem if the number of bookmarks exceeds 5555: I need to manage pagination in the API responses, but I was lazy for this first version. The script, as of the day of writing, is:
import asyncio
import os
from datetime import datetime

from aiolinkding import async_get_client


async def main() -> None:
    linkding_url = ""
    linkding_api_token = ""
    try:
        linkding_url = os.environ["LINKDING_URL"]
        linkding_api_token = os.environ["LINKDING_API_TOKEN"]
    except KeyError:
        print("Incorrect Linkding Configuration")
        exit()

    client = await async_get_client(linkding_url, linkding_api_token)

    # Get all bookmarks:
    # TODO: Manage "next" links in case there are more results than requested.
    bookmarks = await client.bookmarks.async_get_all(limit=5555)

    for bookmark in bookmarks["results"]:
        if bookmark["is_archived"] or not bookmark["shared"]:
            continue
        fm = "+++\n"
        fm += "title = \"" + bookmark["title"] + "\"\n"
        fm += "author = [\"Bacardi55\"]\n"
        fm += "id = " + str(bookmark["id"]) + "\n"
        t = datetime.strptime(bookmark["date_added"], "%Y-%m-%dT%H:%M:%S.%fZ").strftime("%Y-%m-%d")
        fm += "date = " + str(t) + "\n"
        if bookmark["tag_names"]:
            fm += "tags = [\"" + "\", \"".join(bookmark["tag_names"]) + "\"]\n"
        fm += "bookmark_url = \"" + bookmark["url"] + "\"\n"
        fm += "description = \"" + bookmark["description"].replace('"', "'") + "\"\n"
        fm += "+++\n\n"
        if bookmark["notes"]:
            fm += "Personal notes:\n\n" + bookmark["notes"] + "\n"
        filename = str(bookmark["id"]) + ".md"
        with open(filename, "w") as write_file:
            write_file.write(fm)

asyncio.run(main())
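The pagination TODO above can be addressed by following the "next" links that linkding's paginated API responses carry alongside "results". Here is a minimal sketch of that loop, written as a pure function so the HTTP client is pluggable; the `fetch_page` callable and the exact page shape are assumptions based on the `results` field used in the script above:

```python
from typing import Callable, Optional

def get_all_bookmarks(fetch_page: Callable[[Optional[str]], dict]) -> list:
    """Collect bookmarks across all pages.

    fetch_page(url) must return a dict shaped like a linkding list
    response: {"results": [...], "next": <url or None>}.
    Passing None requests the first page.
    """
    bookmarks = []
    url = None
    while True:
        page = fetch_page(url)
        bookmarks.extend(page["results"])
        url = page.get("next")
        if not url:
            return bookmarks

# Fake two-page API for demonstration (no network involved):
pages = {
    None: {"results": [{"id": 1}, {"id": 2}], "next": "page2"},
    "page2": {"results": [{"id": 3}], "next": None},
}
print([b["id"] for b in get_all_bookmarks(pages.get)])  # → [1, 2, 3]
```

In the real script, `fetch_page` would wrap the API call and the `limit=5555` workaround would disappear.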
import_gemlog_to_blog.sh
This script imports gemlog entries (gemtext, with a .gmi extension) into markdown and removes their content, keeping only the frontmatter header. This is because I don’t want the content of my gemlog to be available anywhere on my blog.
The script is as follows:
#!/bin/bash
source ./dirconfig.sh
echo "Start importing gemlog article in website"
cd "${blog:?}" || exit
echo "Creating temp directory: ${temp:?}"
mkdir "${temp:?}" || exit
echo "Copying ${blog}/${blog_gemlog}/_index.md file in temporary directory: ${temp}"
cp "${blog:?}/${blog_gemlog:?}/_index.md" "${temp:?}/gemlog_index.md" || exit
echo "Removing gemlog folder:"
rm -rf "${blog:?}/${blog_gemlog:?}" || exit
echo "Copying gemlog entries from capsule repository: ${capsule:?}/${capsule_gemlog:?}"
cp -r "${capsule:?}/${capsule_gemlog:?}/" "${blog:?}/${blog_gemlog:?}/" || exit
echo "Going into gemlog directory: ${blog}/${blog_gemlog}"
cd "${blog:?}/${blog_gemlog:?}/" || exit
echo "Removing content after frontmatter in all gemlog .gmi files:"
sed -i '1,/---/!d' ./*.gmi || exit
echo "Renaming all .gmi files into .md in ${blog}/${blog_gemlog}:"
for file in ./*.gmi; do
  mv -- "$file" "${file%.gmi}.md"
done
echo "Copying back the gemlog _index.md file"
cp "${temp:?}/gemlog_index.md" "${blog:?}/${blog_gemlog:?}/_index.md" || exit
echo "Cleaning temp dir ${temp}"
rm -rf "${temp:?}" || exit
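For reference, the `sed -i '1,/---/!d'` trick above keeps line 1 through the first subsequent line containing `---` (the closing frontmatter fence, since the opening one is on line 1) and deletes everything after it. A Python sketch of the same transformation, under that same assumption about where the frontmatter sits:

```python
def keep_frontmatter(text: str) -> str:
    """Keep line 1 through the first later line containing '---',
    dropping the rest (mirrors `sed '1,/---/!d'`)."""
    lines = text.splitlines(keepends=True)
    kept = lines[:1]
    for line in lines[1:]:
        kept.append(line)
        if "---" in line:  # sed's /---/ matches anywhere in the line
            break
    return "".join(kept)

gmi = "---\ntitle: A gemlog post\n---\nActual gemtext content.\n"
print(keep_frontmatter(gmi))  # → only the three frontmatter lines remain
```

Note that, like the sed version, this would truncate early if a `---` appeared inside the frontmatter itself.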
generate_content_stats.sh and generate_content_stats_graph.py
These scripts generate the data and images for the content statistics page.
I wrote at length about it in a dedicated blog post, so read that if you’re interested, as I won’t re-detail everything here.
build_blog.sh
This script basically generates the public files via the hugo command, plus a few other minor things:
#!/bin/bash
source ./dirconfig.sh
echo "Start building blog"
echo "Going to blog directory: ${blog:?}"
cd "${blog:?}" || exit
echo "Creating temp directory: ${temp:?}"
mkdir "${temp:?}" || exit
echo "Removing old public files in ${blog}/${blog_public}/"
rm -fr "${blog:?}/${blog_public:?}/"
# Generate outputs
echo "Generating public directory…"
cd "${blog:?}"
hugo
# To keep old RSS filename.
echo "Duplicating index.xml to rss.xml for backward compatibility."
cp "${blog:?}/${blog_public:?}/index.xml" "${blog:?}/${blog_public:?}/rss.xml"
# Because I don't want the gemlog files to be available at all on the blog (https):
# But before, save the gemlog index and atom feed files:
echo "Saving gemlog atom feed: ${blog}/${blog_public}/gemlog/index.xml"
cp "${blog:?}/${blog_public:?}/gemlog/index.xml" "${temp:?}/_gemlog_atom.xml"
echo "Saving gemlog index.html: ${blog}/${blog_public}/gemlog/index.html"
cp "${blog:?}/${blog_public:?}/gemlog/index.html" "${temp:?}/_gemlog_index.html"
echo "Deleting gemini html files: ${blog}/${blog_public}/gemlog"
rm -rf "${blog:?}/${blog_public:?}/gemlog"
# Copy back the atom feed:
echo "Reinstalling gemlog atom feed:"
mkdir "${blog:?}/${blog_public:?}/gemlog"
cp "${temp:?}/_gemlog_atom.xml" "${blog:?}/${blog_public}/gemlog/index.xml"
cp "${temp:?}/_gemlog_index.html" "${blog:?}/${blog_public}/gemlog/index.html"
echo "Cleaning temp dir ${temp}"
rm -rf "${temp:?}" || exit
echo "Blog built successfully"
Ntfy configuration
For the rest of this post, let’s decide that:
- topic-deployment is the name of the ntfy topic used by sourcehut to indicate a new build was successful
- topic-updated is the name of the ntfy topic used by my webserver to indicate the update has been published
Adapt accordingly :).
Also, I’m not covering the ntfy installation process here, as I already had it working. I’ll just detail the configuration steps I went through.
Create ntfy users
Create the sourcehut ntfy user, only able to write to the topic-deployment topic:
sudo ntfy user add --role=user sourcehut
sudo ntfy access sourcehut topic-deployment wo
sudo ntfy token add sourcehut
Create a zoro (the name of my webserver) ntfy user, only able to read the topic-deployment topic and to write to the topic-updated topic:
sudo ntfy user add --role=user zoro
sudo ntfy access zoro topic-deployment ro
sudo ntfy access zoro topic-updated wo
sudo ntfy token add zoro
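The tokens created above can be sanity-checked without the ntfy CLI: ntfy’s HTTP API accepts a plain POST of the message body to `https://<host>/<topic>`, with the title in a `Title` header and the token as a `Bearer` token. A standard-library sketch (the host, topic and token values are placeholders):

```python
import urllib.request

def build_publish_request(host: str, topic: str, token: str,
                          title: str, message: str) -> urllib.request.Request:
    """Build the POST request ntfy expects: the body is the message,
    the Title header carries the title, Authorization carries the token."""
    return urllib.request.Request(
        f"https://{host}/{topic}",
        data=message.encode("utf-8"),
        headers={
            "Title": title,
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_publish_request("ntfy.example.com", "topic-deployment",
                            "tk_xxx", "JOB_ID: 1234", "New successful build")
# urllib.request.urlopen(req)  # uncomment to actually send
print(req.full_url)  # → https://ntfy.example.com/topic-deployment
```

A 403 response on such a request would mean the user’s access rules (wo/ro above) don’t allow that operation on that topic.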
Install and configure the ntfy clients
On SourceHut

As I’m using an archlinux image, installation is done by adding the package to the .build.yml file. I also used sourcehut secrets to upload a file with my ntfy configuration, which contains only these 2 lines:

default-host: https://example.com
default-token: yourTokenHere

Use here the token generated above for the sourcehut user.

On the web server

Install the ntfy package for your distro and configure /etc/ntfy/client.yml:

default-host: https://domain.tld
default-token: mytoken # ZORO TOKEN
subscribe:
  - topic: topic-deployment
    command: 'export NTFY_UPDATE_TOPIC="topic-updated"; /path/to/update_blog.sh $t'
    token: mytoken # ZORO TOKEN

In my case, the web server runs debian. The package I installed ntfy with also comes with a systemd service for ntfy (server and client). Before running it, I wanted to configure it to use a dedicated user, so just to be sure, I ran systemctl disable ntfy. The export NTFY_UPDATE_TOPIC=… part is used to avoid writing the topic name in the public bash script.

I want to run the service as a specific user: the one that has the permission to deploy the new code. For that, override the systemd service by creating the /etc/systemd/system/ntfy-client.service.d/override.conf file:

[Service]
User=yourUser
Group=yourUserGroup

Then, enable and start the systemd service:

systemctl enable ntfy-client.service
systemctl start ntfy-client.service

The web server now listens to the topic configured in the client.yml file. Each time sourcehut sends a message on that topic, the daemon reads it and launches the update_blog script, with the received title as an argument.
Webserver scripts
Now that ntfy is correctly installed and configured on all sides, let’s look at what the web server does once a message is received. First, it takes the second argument (the job ID) and uses the hut tool (a CLI for the sourcehut APIs) to retrieve the job artifacts. It then downloads the website.tgz file, extracts it, and rsyncs it with the public folder. The full script:
#!/bin/bash
jobID="$2"
echo "JobID received: ${jobID}"
# Just wait in case the artifacts are not published yet.
# Shouldn't be long, as the ntfy notification is sent in the last build step.
sleep 15
download_dir="/home/bacardi55/_deploy/download"
website_destination="/srv/www/bacardi55.io"
mkdir "${download_dir:?}" || exit
artifacts=$(/usr/bin/hut builds artifacts "${jobID}")
website_artifact=$(echo "$artifacts" | grep "website.tgz" | awk -F " " '{print $4}')
wget --directory-prefix "${download_dir:?}" "${website_artifact:?}" || exit
mkdir "${download_dir}/website" && mv "${download_dir}/website.tgz" "${download_dir}/website"
cd "${download_dir}/website" && tar xf website.tgz .
rsync -avzhP --delete "${download_dir}/website/public/" "${website_destination:?}"
# Send ntfy message:
/usr/bin/ntfy publish --title="Blog has been updated" "${NTFY_UPDATE_TOPIC}" "Blog was updated after successful build #${jobID}" > /dev/null 2>&1
# Cleaning:
rm -rf "${download_dir:?}"
And voilà, I should receive an alert when everything is done.
What about deployment when a bookmark is added?
At the moment, bookmarks are not automatically published when added to linkding. And since there isn’t any CI scheduling other than on git push to the website or capsule repositories, bookmarks are refreshed on the site only when I publish other changes.
To improve this, I have the following ideas:
- A regular build (e.g. once per day or once every other day)
- A cron script that checks at 23:59 whether any deployment was made that day and, if not, triggers one (though even that may not be needed if no bookmarks were added that day)
- A cron that checks for the latest bookmarks and, if there are newer ones than last time, runs the CI. Unlikely, as I think it would be too complicated for such a small use case.
- Other? (Open to ideas!)
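The cron-based ideas above boil down to one decision: rebuild only when a bookmark was added since the last deployment. That check itself is trivial; here is a sketch (the timestamp sources, e.g. a deploy marker file and linkding’s date_added field, are assumptions):

```python
from datetime import datetime
from typing import Optional

def needs_rebuild(last_deploy: datetime,
                  last_bookmark: Optional[datetime]) -> bool:
    """A nightly cron would trigger the CI only when this returns True:
    at least one bookmark is newer than the last deployment."""
    return last_bookmark is not None and last_bookmark > last_deploy

deploy = datetime(2023, 5, 1, 12, 0)
print(needs_rebuild(deploy, datetime(2023, 5, 2, 9, 30)))  # → True
print(needs_rebuild(deploy, None))                         # → False
```

The hard part is not this check but wiring it up: fetching the newest bookmark date and triggering a sourcehut build from cron, which is why the idea is still on the list.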
This is a lot less important: I don’t mind publishing bookmarks in bulk after a few days of waiting for new content to deploy. I’m already more or less doing that, as when I publish a new post, all the bookmarks added since the last deployment are published at once… But I’m keeping this problem on my todo list to see what I can come up with in the coming weeks/months.
Conclusion
That’s it for deploying the blog on the webserver in my homelab. Hopefully it was clear enough. The next and final step is to build and deploy the gemini capsule, but that’s for a later post.