Atuin and Per-Directory History Import

There's an oft-quoted Unix philosophy that tools should do one thing only, and do it well (searching, filtering, manipulation, etc.), the idea being that the shell can quickly combine these into more sophisticated custom functionality. Perhaps ironically, the shells themselves are accumulating plenty of additional functionality.

I've recently started using Atuin, which is essentially a souped-up tool for managing your shell history. Every shell (bash, zsh, etc.) keeps a record of your previous commands, but each stores them in its own file, in its own format, with its own tweaks for accessing them. Atuin delegates all of that to a separate, relatively shell-agnostic tool. You also get additional metadata, such as the directory the command was executed in, its return status, and its duration; more advanced search capabilities; and you can even sync history across machines (I don't currently use this feature).

If you clicked the link above you'll see that Atuin provides a search UI as well. I've also been using fzf lately, which is an absolute game-changer. I don't even make heavy use of its capabilities (although I'm starting to integrate it more, e.g. as a jq REPL), but the real-time fuzzy filtering is amazing, and mirrors the experience I get in Emacs via helm or, more recently, selectrum et al.

Atuin has a few different options for searching (prefix, fuzzy, exact, excluding, etc.), and it can mostly mimic the experience I was used to with fzf and Emacs, but ultimately it didn't quite reproduce it. [1]

What completed the puzzle for me was a snippet from a Hacker News comment that combines the fzf UI with Atuin for search management. I have changed it slightly: by default it restricts the search to the current directory only, but a second Ctrl-R will search the entire history.

atuin-setup() {
    if ! which atuin &> /dev/null; then return 1; fi
    # bindkey '^E' _atuin_search_widget

    # Stop atuin from binding its own keys; ^R is bound to our widget below
    export ATUIN_NOBIND="true"
    eval "$(atuin init zsh)"
    fzf-atuin-history-widget() {
        local selected num
        setopt localoptions noglobsubst noposixbuiltins pipefail no_aliases 2>/dev/null

        # local atuin_opts="--cmd-only --limit ${ATUIN_LIMIT:-5000}"
        local atuin_opts="--cmd-only"
        # ctrl-d reloads the current directory's history, ctrl-r the entire history
        local fzf_opts=(
            "--bind=ctrl-d:reload(atuin search $atuin_opts -c $PWD),ctrl-r:reload(atuin search $atuin_opts)"
        )

        # Start with the current directory's history only
        selected=$(
            eval "atuin search ${atuin_opts} -c $PWD" |
                fzf "${fzf_opts[@]}"
        )
        local ret=$?
        if [ -n "$selected" ]; then
            # the += lets it insert at current pos instead of replacing
            LBUFFER+="${selected}"
        fi
        zle reset-prompt
        return $ret
    }
    zle -N fzf-atuin-history-widget
    bindkey '^R' fzf-atuin-history-widget
}
atuin-setup

This "toggle between directory and global history" feature is something I've been using in zsh via the per-directory-history plugin. Atuin does have a history-import feature, but not surprisingly it doesn't work with per-directory-history, which uses a different layout. I wrote a little Python snippet (below) to handle this. The gist is that per-directory-history keeps a separate history file for each directory, with the directory tree mirrored under a separate directory in your home: for example, if you run a command in /var/log/httpd, it goes into the file ~/.directory_history/var/log/httpd/history. Most of the rest of the metadata we have to fake or guess (or, in the case of timestamp resolution, adjust). You may need to tweak it for your situation, and I've elided a few non-essential bits, but hopefully it helps someone!

  import os
  import os.path
  import sqlite3
  import sys
  import uuid

  def load_history(fpath, session, hostname, hist_dir):
      history = []
      # The history file's path mirrors the directory the commands were run in
      cwd = fpath.removeprefix(hist_dir).removesuffix('/history')
      print(f'Using cwd {cwd}')
      lastcmd = ''
      with open(fpath, 'r') as f:
          for line in iter(f):
              try:
                  # Join multi-line commands (continuation lines end in a backslash)
                  while line.strip()[-1] == '\\':
                      line += next(f)
                  # Each entry looks like ': <timestamp>:<duration>;<command>'
                  line = line.lstrip(': ')
                  tss, line = line.split(':', 1)
                  _, cmd = line.split(';', 1)
                  cmd = cmd.rstrip('\n')  # drop the line's trailing newline
                  if cmd != lastcmd: # Ignore dups, since the timestamp doesn't always have enough resolution
                      history.append({
                          "id": uuid.uuid4().hex,
                          "timestamp": int(tss) * 1_000_000_000,  # seconds -> nanoseconds, kept as an int
                          "duration": 0,                 # unknown, so faked
                          "exit": 0,                     # unknown, so faked
                          "command": cmd,
                          "cwd": cwd,
                          "session": session,
                          "hostname": hostname,
                      })
                      lastcmd = cmd
              except Exception as e:
                  print(f'Error: {e}')
      return history

  INSERT_SQL = """
  insert into history (id, timestamp, duration, exit, command, cwd, session, hostname)
  values (:id, :timestamp, :duration, :exit, :command, :cwd, :session, :hostname)
  """

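  def main(basedir):
      basedir = os.path.expanduser(basedir)
      basedir = os.path.realpath(basedir)
      print(f'Using basedir: {basedir}')

      # Assumes Atuin's history.db (or a copy) is in the current directory; adjust the path as needed
      db = sqlite3.connect('history.db')
      cur = db.cursor()

      session = uuid.uuid4().hex
      hostname = 'hostname:username' # update as necessary
      for dirpath, _dirnames, filenames in os.walk(basedir):
          if 'history' in filenames:
              history = load_history(os.path.join(dirpath, 'history'),
                                     session, hostname, basedir)
              print(f'Loaded {len(history)} commands')
              cur.executemany(INSERT_SQL, history)

      # Without a commit the inserted rows are discarded when the connection closes
      db.commit()
      db.close()

  if __name__ == '__main__':
      main(sys.argv[1])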
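
As a worked example of the two conversions the importer relies on, here is a small standalone sketch, separate from the script above. The helper names (`parse_history_line`, `cwd_for`, `to_nanoseconds`) and the sample paths are hypothetical. It assumes Python 3.9+ for `removeprefix`/`removesuffix`, and that history entries look like `: <epoch seconds>:<duration>;<command>`, which is what the parsing above implies.

```python
def parse_history_line(line):
    """Split ': <epoch seconds>:<duration>;<command>' into its three parts."""
    # Split on the first ';' only, so semicolons inside the command survive
    header, command = line.rstrip("\n").split(";", 1)
    # header is ': <seconds>:<duration>', so the first field is empty
    _, timestamp, duration = header.split(":", 2)
    return int(timestamp.strip()), int(duration), command

def cwd_for(history_file, hist_dir):
    """Recover the working directory encoded in a history file's path."""
    return history_file.removeprefix(hist_dir).removesuffix("/history")

def to_nanoseconds(seconds):
    """Scale whole seconds to nanoseconds without leaving integer arithmetic."""
    return seconds * 1_000_000_000
```

For instance, `cwd_for("/home/me/.directory_history/var/log/httpd/history", "/home/me/.directory_history")` gives `/var/log/httpd`. Multiplying by an integer rather than `1e9` keeps the timestamp an `int`, so SQLite stores it as an integer rather than a float.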



[1] Full disclosure: Atuin's search may have improved or changed since I tried it, and I probably didn't spend a lot of time testing it. That's not what this post is about anyway!
