Using GitHub to host recurring jobs

Overview

I manage my own server, and while I appreciate the sense of ownership and the learning that goes along with it, it also adds friction to getting small things done.

For example, a few times lately I've had an idea for a simple little program that runs regularly and emails me something. The programs aren't large, but the effort of developing and testing an Ansible playbook to host one (probably including checking out from source, setting up deploy keys, etc.) is probably about the same as writing the program itself… so nothing gets done. Yes, I'm lazy.

Then suddenly I realised that GitHub[1] had free "Actions" (they'll run some code for you), and that these include cron-like scheduled triggers. That was all I needed!

Case-study: Weekly summary of budgets in Harvest

At my day job we use Harvest for time and budget tracking on projects. Try as I might, my instinct to bury myself in a technical problem means that periodically reviewing budgets doesn't happen anywhere near as often as it should. Naturally this sounds like a technical problem to me! I'd like an email sent every week to tell me how my projects are going: it's in my Inbox, and harder to ignore or forget. Zero-friction.

Skipping ahead, the result is on GitHub. Harvest have an API that's fairly easy to work with. I wrote a simple Python program that invokes it, formats the results using Mako, and writes to a file. For everything else, we will use GitHub's infrastructure.
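To give a feel for how small such a program can be, here is a minimal sketch of the fetch-and-render step. It is not the actual code from the repository: it uses only the standard library (plain string formatting stands in for Mako), and the endpoint URL, the header names and the field names (project_name, budget, budget_spent) are my assumptions about the Harvest v2 API, so check them against Harvest's documentation before relying on them.

```python
import html
import json
import urllib.request

# Assumed endpoint; verify against the Harvest API documentation.
REPORT_URL = "https://api.harvestapp.com/v2/reports/project_budget"


def fetch_budgets(account_id, token):
    """Fetch the first page of the project budget report (assumed response shape)."""
    request = urllib.request.Request(
        REPORT_URL,
        headers={
            "Harvest-Account-Id": account_id,
            "Authorization": f"Bearer {token}",
            "User-Agent": "budget-summary-sketch",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]


def render_html(projects):
    """Render a list of project dicts as a small HTML table."""
    header = "<tr><th>Project</th><th>Spent</th><th>Budget</th><th>Used</th></tr>"
    rows = []
    for project in projects:
        budget = project.get("budget") or 0
        spent = project.get("budget_spent") or 0
        # Projects without a budget cannot have a percentage.
        used = f"{100 * spent / budget:.0f}%" if budget else "n/a"
        name = html.escape(str(project.get("project_name", "")))
        rows.append(
            f"<tr><td>{name}</td><td>{spent}</td><td>{budget}</td><td>{used}</td></tr>"
        )
    return "<table>\n" + "\n".join([header] + rows) + "\n</table>\n"


def write_report(projects, path="email.html"):
    """Write the rendered table to the file that the email step will pick up."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_html(projects))
```

Keeping the rendering separate from the network call means the formatting can be tested locally with hand-written sample data, without touching Harvest at all.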

Setting up GitHub Actions is fairly simple and covered elsewhere.

The key point I want to focus on here is the trigger ("on:") that invokes your workflow. Normally this is a repository event such as a push or a pull request, but it turns out you can also define a schedule using cron-style syntax, which GitHub evaluates in UTC. It's also worth including workflow_dispatch: in your workflows, so you can invoke them on demand for testing.

name: Harvest Budget Summary Report
on:
  workflow_dispatch:            # Enable manual invocation for testing
  schedule:
    - cron: "0 22 * * THU"      # 22:00 UTC Thu = 8am Fri, AEST
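Because the schedule is written in UTC, it is easy to get the day or hour wrong. A quick way to sanity-check the conversion is a few lines of Python. This sketch assumes a fixed UTC+10 offset for AEST; in places that observe daylight saving the same cron line would fire an hour later by the local clock in summer.

```python
from datetime import datetime, timedelta, timezone

# AEST as a fixed UTC+10 offset (assumption: no daylight saving applies).
AEST = timezone(timedelta(hours=10), "AEST")


def to_aest(utc_time):
    """Convert a timezone-aware UTC datetime to AEST."""
    return utc_time.astimezone(AEST)


# 2024-01-04 was a Thursday; 22:00 UTC matches the cron expression above.
scheduled = datetime(2024, 1, 4, 22, 0, tzinfo=timezone.utc)
local = to_aest(scheduled)
print(local.strftime("%A %H:%M"), local.tzname())
```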

There is some boilerplate to create my Python environment, then a simple entry to run the program, configured by environment variables[2]:

      - run: python projects_status.py
        env:
          HARVEST_ID: ${{ secrets.HARVEST_ID }}
          HARVEST_TOKEN: ${{ secrets.HARVEST_TOKEN }}
          ACCOUNT_EMAIL: ${{ secrets.ACCOUNT_EMAIL }}
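On the Python side, reading that configuration might look something like the sketch below. The variable names match the workflow step above; the load_config helper itself is hypothetical. Empty values are treated as missing, since a secret that hasn't been defined in the repository reaches the job as an empty string rather than being absent.

```python
import os

REQUIRED = ("HARVEST_ID", "HARVEST_TOKEN", "ACCOUNT_EMAIL")


def load_config(environ=os.environ):
    """Return the required settings, or exit with a clear message if any are unset."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}
```

Passing the environment in as a parameter keeps the function easy to test with a plain dict, and failing early gives a more readable error in the Actions log than an authentication failure further down.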

Finally, we can lean on a third-party contribution to send the email, and we are done!

      - uses: dawidd6/action-send-mail@v3
        with:
          server_address: smtp.gmail.com
          server_port: 465
          username: ${{ secrets.SMTP_USERNAME }}
          password: ${{ secrets.SMTP_PASSWORD }}
          subject: "[Harvest] Project Budget Tracking"
          html_body: file://email.html
          from: ${{ secrets.ACCOUNT_EMAIL }}
          to: ${{ secrets.ACCOUNT_EMAIL }}

Improvements

I really don't love the idea of having a Gmail app password accessed by a third-party action. This could, and probably should, be rolled back into my own Python code using smtplib. At the time it seemed like a nice separation of concerns: all I had to write was a simple and easily-testable program, and the CI infrastructure would handle the rest. It certainly is convenient, but the overhead of doing it myself is minimal and the potential impact of a vulnerability is fairly high, so I think I'll rewrite it shortly.
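For reference, a rough sketch of what that rewrite could look like with the standard library is below. It reuses the server, port, subject and secret names from the workflow above, and assumes the report has already been written to email.html; the function names are hypothetical and I haven't run this against Gmail.

```python
import os
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, html_body):
    """Build a multipart message with a plain-text fallback and an HTML body."""
    message = EmailMessage()
    message["Subject"] = "[Harvest] Project Budget Tracking"
    message["From"] = sender
    message["To"] = recipient
    message.set_content("This report is best viewed in an HTML-capable mail client.")
    message.add_alternative(html_body, subtype="html")
    return message


def send_report(path="email.html"):
    """Send the rendered report to myself over implicit TLS on port 465."""
    with open(path, encoding="utf-8") as handle:
        html_body = handle.read()
    address = os.environ["ACCOUNT_EMAIL"]
    message = build_message(address, address, html_body)
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(os.environ["SMTP_USERNAME"], os.environ["SMTP_PASSWORD"])
        server.send_message(message)
```

The secrets would then be passed to the Python step as environment variables, and the third-party action never sees the password.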


[1] Also GitLab, and probably others, but I'm focussing on GitHub here as that's the one most people are familiar with.

[2] Pro-tip: I use direnv for local development, complete with seamless Emacs integration.

