A Spaced-Repetition Password Trainer

Sep 9, 2023

Motivation and inspiration

Remembering secure passwords is hard. That's why password managers were invented. But there are still passwords that one must remember without the aid of a password manager: these include those used to access the password manager (local user login credentials, cellphone passwords, full disk encryption passwords), the password manager database password itself, and critical credentials that one must absolutely know even if their password manager database is unavailable (such as email account passwords, cloud storage passwords, and critical financial passwords).

To make matters worse, several of these passwords (such as full-disk encryption passwords and password manager database passwords) are used in situations where offline attacks¹ are likely possible; this necessitates paying special attention to password strength. I thus needed to memorize a strong, many-word passphrase reliably, and I needed to be sure that I had memorized it before I changed the password to my password manager database: it would be very unfortunate to lose access to my passwords by prematurely adopting a password that I could not reliably remember.

To aid in this memorization, I took inspiration from spaced-repetition approaches used to memorize flash cards and the like. These approaches help a user memorize several pieces of information by questioning the user about pieces of information in sequence, with the time interval between successive presentations of any given piece of information being determined by one's success rate in remembering that piece.

In the past, when I had to type long, secure passwords into SSH sessions by hand, I accidentally (but serendipitously) memorized them. This experience is another reason I adopted this approach for my password-memorization aid.

How it works

I've adopted the spaced-repetition approach by writing a program, pwtrainer_sr, to be run at regular time intervals; at each run, it uses the record of recent passes and failures, and the time elapsed since them, to decide whether or not to prompt the user for their password. I put pwtrainer_sr in my .zlogin file so that there is a possible opportunity to test my memory every time I start a new terminal window. If the user is to be prompted, another program is started: pwtrainer. This program queries the user for their password, compares it to a hashed-and-salted copy of the password, and gives feedback as to whether the input was correct or incorrect. Then, pwtrainer_sr records whether the user successfully recalled the password, and prints statistics (which can be suppressed with the -q flag).

I set up pwtrainer to allow the user to type their attempt at recalling their password in plain, visible text; I believe that this is important in ensuring that the user can spot and correct any typos they may have made in entering their password. Once the user submits their guess, the password is obscured. In my opinion, this strikes an acceptable balance between security and usability.
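The overwrite step can be sketched with a few terminal control sequences. This is a minimal illustration rather than the script's exact method: the helper name mask_input is hypothetical, and it prints the asterisks in a loop instead of relying on the terminal's repeat-character sequence.

```shell
# Sketch (hypothetical helper): emit the control sequences that replace a
# just-submitted line of input with asterisks.
#   $1: 1-based column where the typed text began (prompt length + 1)
#   $2: number of characters the user typed
mask_input() {
  col=$1
  len=$2
  # CSI A: cursor up one line; CSI <col> G: jump to column; CSI 0 K: erase to end of line.
  printf '\033[A\033[%dG\033[0K' "$col"
  i=0
  while [ "$i" -lt "$len" ]; do
    printf '*'
    i=$(( i + 1 ))
  done
  printf '\n'
}

# Example use after reading a line interactively:
#   prompt='Enter password: '
#   printf '%s' "$prompt"; IFS= read -r input
#   mask_input $(( ${#prompt} + 1 )) "${#input}"
```

Because the cursor moves back up to the prompt line before erasing, the plaintext guess is overwritten in place rather than scrolled away.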

After using pwtrainer_sr as described above (with it set up to run whenever I opened a new shell window), I began to run into a problem: often, I would open a new window and immediately type out a command while zsh was initializing. If pwtrainer_sr did not prompt me for my password, this command would be run as normal; if it did, however, it would interpret my command as an obviously wrong guess at my password. I thus modified pwtrainer_sr to check if any input is available to read immediately before prompting for the password, and to clear that input first.

When to repeat

On the advice of a friend of mine (Tara Weling) who studies neuroscience, pwtrainer_sr uses the following algorithm to decide whether to prompt the user for their password: let $n$ be the number of successful attempts at recall since the last failed attempt, and let $t$ be the number of seconds since the last successful attempt. The user will be tested if and only if $n\le 1$ or $t\ge 3600\cdot2^{n-2}$. Thus, the interval between queries grows exponentially as the number of consecutive successes increases (one hour at $n=2$, two hours at $n=3$, four hours at $n=4$, and so on), and this exponential lengthening resets when an error is made. When passed the flag -f, pwtrainer_sr queries the user unconditionally.
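As a rough illustration of the rule, here is a small sketch that takes $n$ and $t$ as defined above. The function names interval_for and should_test are hypothetical and do not appear in the scripts; the code sticks to portable shell so it runs in zsh as well as sh.

```shell
# Sketch of the scheduling rule (hypothetical helper names).
#   n: consecutive successes since the last failure
#   t: elapsed seconds, as defined in the text

# Print the required waiting interval in seconds: 3600 * 2^(n-2).
# Only meaningful for n >= 2; for n <= 1 the user is always tested.
interval_for() {
  n=$1
  interval=3600
  i=2
  while [ "$i" -lt "$n" ]; do
    interval=$(( interval * 2 ))
    i=$(( i + 1 ))
  done
  echo "$interval"
}

# Succeed (exit status 0) when the user should be tested.
should_test() {
  n=$1
  t=$2
  [ "$n" -le 1 ] || [ "$t" -ge "$(interval_for "$n")" ]
}

# Show how the interval doubles with each consecutive success.
for n in 2 3 4 5 6; do
  echo "n=$n: wait $(( $(interval_for "$n") / 3600 )) hour(s)"
done
```

The loop prints waits of 1, 2, 4, 8, and 16 hours for $n$ from 2 to 6, so a password that has been recalled six times in a row is next tested after roughly two thirds of a day.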

Security considerations

It is critical that passwords not be stored in plaintext at rest. Ideally, passwords should be hashed and salted such that the password cannot be recovered easily from the hash even though the hash can be used to verify that a string purporting to be the password is, in fact, the password. To provide this security, I opted to use openssl, which is capable of hashing and salting passwords using the crypt-style algorithms used for UNIX login passwords (historically stored in the /etc/passwd file); the -6 option used here selects the SHA-512-based scheme. While not perfect², this ensures that the password will be adequately secured.
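To make the hash-then-compare idea concrete, here is a minimal sketch using openssl passwd. The salt, the passphrase, and the function names hash_password and verify_password are made-up example values, not anything from the scripts.

```shell
# Sketch: salted SHA-512 crypt hashing and verification with openssl passwd.
# The salt and passphrases below are made-up example values.
salt='examplesalt'

# Hash the password given as $1, feeding it to openssl on stdin so it never
# appears on a command line.
hash_password() {
  printf '%s\n' "$1" | openssl passwd -6 -salt "$salt" -stdin
}

# Succeed when the candidate ($1) hashes to the stored value ($2).
verify_password() {
  [ "$(hash_password "$1")" = "$2" ]
}

stored=$(hash_password 'correct horse battery staple')

verify_password 'correct horse battery staple' "$stored" && echo "match"
verify_password 'correct horse battery stapler' "$stored" || echo "no match"
```

Only the hash needs to be kept on disk. Reusing the same salt for every check is what makes the comparison deterministic, and the salt is embedded in the hash string itself.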

By default, the password displays in plaintext while the password is being entered but is replaced by a line of asterisks equal in length to the password once the Enter key is pressed. Thus, someone who can see your screen while you're entering the password could see the password in plaintext. I think that this is the right behavior for most circumstances: entering your password into the password trainer is never essential, so you could simply not enter your password when in the presence of a malicious observer; your password is also cleansed from your terminal's window and scrollback buffer once you press Enter.

Using it yourself

These scripts require zsh and openssl. To use this code yourself, first input your password to openssl passwd -stdin -6 -salt [SALT], where [SALT] is an arbitrary string. Then replace [HASH] and [SALT] in the pwtrainer script below with the string returned by the openssl command and the salt you decided on earlier. Additionally, give pwtrainer_sr somewhere to store a log file, either by creating a logs directory alongside the directory in which you store pwtrainer_sr (that is, in its parent directory) or by altering the definition of the logfile variable on line 3 to point to a more suitable location. Put pwtrainer and pwtrainer_sr in a directory listed in your $PATH environment variable, so that your shell can find them. Then run pwtrainer_sr regularly or set your computer up to run it regularly. Have a secure way of retrieving the password you wish to memorize close at hand: you're going to need it a lot in the beginning.
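One possible layout is sketched below. The function name prepare_layout and the install location are illustrative assumptions; the only detail taken from pwtrainer_sr itself is that its log file is resolved as ../logs/ relative to the script's own directory.

```shell
# Setup sketch. The function name and directory layout are illustrative
# assumptions, not part of the scripts.

# Create a bin directory for the two scripts plus the sibling logs directory
# that pwtrainer_sr resolves as <script dir>/../logs, then print the path of
# the state file pwtrainer_sr would use.
prepare_layout() {
  base=$1
  mkdir -p "$base/bin" "$base/logs"
  echo "$base/logs/pwtrainer_sr"
}

# Example (hypothetical install location):
#   prepare_layout "$HOME/pwtrainer"
#   cp pwtrainer pwtrainer_sr "$HOME/pwtrainer/bin/"
#   chmod +x "$HOME/pwtrainer/bin/pwtrainer" "$HOME/pwtrainer/bin/pwtrainer_sr"
#   export PATH="$HOME/pwtrainer/bin:$PATH"   # e.g. in ~/.zshenv
#   pwtrainer_sr -q                            # e.g. in ~/.zlogin
```

Keeping the state file outside the bin directory means the scripts themselves can stay read-only.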

Source code

For `pwtrainer`, a ZSH script

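#!/usr/bin/env zsh
prompt="Enter password: "
fill="*"

# Hash the guess with the stored salt and compare it to the stored hash.
# print -r avoids the backslash interpretation that zsh's echo performs.
verifypw()
{
  [[ $(print -r -- "$1" | openssl passwd -stdin -6 -salt [SALT]) ==
    '[HASH]' ]]
}

#echo -n "\n\e[A$prompt\e[s'
echo -n "$prompt"
# -r keeps any backslashes in the guess literal.
read -r input
extent=$#input
#echo "\e[u\e[0K*\e[1000b"
# Move back up to the prompt line, erase the typed text, and overwrite it
# with $extent copies of $fill (CSI b repeats the preceding character).
echo "\e[A\e[$(( ${#prompt} + 1 ))G\e[0K$fill\e[$(( ${extent} - 1 ))b"
if verifypw "$input"
then
  print "\e[32mCorrect!\e[0m"
  exit 0
else
  print "\e[31mIncorrect!\e[0m"
  exit 1
fi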

For `pwtrainer_sr`, a ZSH script

#!/usr/bin/env zsh

logfile=$(dirname $(realpath $0))/../logs/$(basename $0)
touch $logfile

# Defaults, overridden by the state saved in $logfile on the previous run.
time_of_fail=0
time_of_pass=0
num_pass=0
num_fail=0
num_pass_since_fail=0
. $logfile

# Print the number of seconds elapsed since the epoch timestamp $1.
timesince()
{
  print $(( $(date +%s) - $1 ))
}

#if [[ $(( $(date +%s) - $time_of_fail )) -gt 3600
#  ||  $(( $(date +%s) - $time_of_pass ))  -gt 86400
#  || $1 == '-f' ]]

# Based on Tara Weling's recommendations regarding spaced repetition.
if [[ $num_pass_since_fail -le 1
  || $(timesince $time_of_pass) -gt $(( 3600*2**($num_pass_since_fail-2) ))
  || $1 == '-f' ]]
then
  should_redo=1
else
  should_redo=0
fi

if [[ $should_redo == 1 ]]
then
  # Drain any typeahead so that a command typed during shell startup is not
  # mistaken for a password guess.
  while read -k 1 -t 0
  do
    read "?Press enter to continue: "
  done
  pwtrainer
  if [[ $? == 0 ]]
  then
    time_of_pass=$(date +%s)
    num_pass=$(( $num_pass + 1 ))
    num_pass_since_fail=$(( $num_pass_since_fail + 1 ))
  else
    time_of_fail=$(date +%s)
    num_fail=$(( $num_fail + 1 ))
    num_pass_since_fail=0
  fi
else
  [[ $1 != "-q" ]] && print No repetition needed
fi

if [[ $should_redo == 1 || $1 != "-q" ]]
then
  # date -d @<seconds> is GNU date syntax.
  printf 'Stats:\n'
  printf '  Last failure: %s (%u seconds ago)\n' \
    "$(date -d @$time_of_fail)" \
    $(( $(date +%s) - $time_of_fail ))
  printf '  Last pass:    %s (%u seconds ago)\n' \
    "$(date -d @$time_of_pass)" \
    $(( $(date +%s) - $time_of_pass ))
  printf '  Fails: %03u    Passes: %03u    Pass rate: %.2f%%\n' \
    $num_fail $num_pass \
    $(( 100.0 * $num_pass / ($num_pass+$num_fail) ))
fi

# Save the state as shell assignments, to be sourced on the next run.
typeset -p time_of_pass time_of_fail num_pass num_fail num_pass_since_fail \
  >$logfile

¹ To be precise, an offline attack is one where the attacker has unrestricted ability to guess the password, check that their guess is correct, attempt to access the secured data, and access the data if their guess is correct. This is in contrast to an online attack, where an attacker must submit their guesses to another service to access the protected data; that service is then able to rate-limit, delay, or block the attacker as it sees fit. Offline attacks are harder to defend against because they give the attacker the same access as the trusted party: the attacker holds the same ciphertext and can do anything they want with it, ad infinitum. The best defenses against offline attacks on encrypted data are to use a strong password and a key derivation function that takes as input a password, transforms it using an algorithm that is difficult to parallelize and optimize, and outputs a key of the length required by the encryption algorithm in use.

² I would have liked openssl's password hashing function to support modern password hashing/key derivation algorithms like argon2id, which are specifically designed to be hard for modern computer hardware to process efficiently.