The LaTeX Input Method
Motivation and inspiration
As someone who has frequent online conversations about math,
I often find myself needing to use mathematical or technical symbols like the
right arrow (→), the set intersection symbol (∩), and various Greek letters (π,
Σ, ϵ, δ, Δ).
In an online discussion, the best way to use these symbols is frequently as
Unicode characters [1].
Inserting these characters from a standard American English keyboard is
nontrivial.
On Linux, some symbols (like arrows, ≠, super- and subscript numerals, °, and
accented characters like ô) can be inserted using the Compose key:
one key is designated as the Compose key,
and pressing it followed by two or three characters that
roughly represent the desired character inserts that character.
Compose oo yields °, Compose o^ yields ô, Compose -> yields →,
Compose >= yields ≥, and so on.
But there is no Compose key support for Greek letters or many math symbols;
the Compose key, in fact, was developed more for inserting accented letters than
specialized symbols.
And many Linux users do not have a Compose key configured.
How the program is used
On my computer, pressing Super+Tab opens a menu into which one can type the
\(\LaTeX\) code for a symbol (and other things, which will be explained later).
The menu gives completions,
which one can accept by pressing Enter.
Pressing Shift+Enter accepts the input as entered.
The program then looks up the code in a dictionary and uses XDoTool
to insert that symbol into whichever dialog is currently focused.
XDoTool can sometimes be flaky [2], so a facility is provided to insert
the previously selected character again quickly by
pressing Super+Tab and then Enter (or entering !last at the prompt).
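The !last behavior described above can be sketched in a few lines of shell. This is a minimal illustration rather than the script's actual code: the function name resolve_entry and the state-file argument are hypothetical, and the sketch only assumes that the previous entry is remembered in a plain text file.

```shell
# Hypothetical helper; the name and the state-file argument are illustrative.
# $1: entry chosen at the prompt, $2: file remembering the previous entry.
resolve_entry() {
    local choice="$1" state_file="$2"
    if [ "$choice" = '!last' ]; then
        # Replay the remembered entry; fail if nothing has been stored yet.
        [ -r "$state_file" ] || return 1
        choice=$(cat "$state_file")
    fi
    # Remember the resolved entry for the next invocation, then report it.
    printf '%s\n' "$choice" > "$state_file"
    printf '%s\n' "$choice"
}
```

Because !last is offered as the first completion, pressing Enter on an empty prompt selects it, which is what makes the Super+Tab, Enter shortcut work.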
This program is written in ZSH, and stores the code-character mapping in a ZSH
associative array.
A Julia program pulls Julia's code-character mappings and outputs them as a
shell script that adds each into that associative array.
User-defined symbols can be added to the array in the first few lines of the
source code file.
I've also defined ways of entering some emoji, box-drawing characters, and
commonly-used blackboard bold characters; these are shown in the tables below.
Entering Uxxxxxxxx or Uxxxx at the prompt inserts the Unicode character with
hex code U+xxxxxxxx or U+xxxx, respectively.
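Since raw U entries are passed along more or less as typed, it may be worth validating them first. The sketch below is one possible approach and is not taken from the script: normalize_codepoint is a hypothetical name, and the function only checks that the entry is "U" followed by one to eight hex digits before padding it to the eight-digit form used in the symbol table. It does not check that the value is an assigned (or even valid) Unicode code point.

```shell
# Hypothetical helper: validate a raw "U..." entry and normalize it to
# "U" followed by exactly eight uppercase hex digits.
normalize_codepoint() {
    local hex
    case "$1" in
        U*) hex="${1#U}" ;;   # strip the leading "U"
        *) return 1 ;;        # not a raw code point entry
    esac
    case "$hex" in
        ''|*[!0-9A-Fa-f]*) return 1 ;;   # empty, or contains a non-hex character
    esac
    [ "${#hex}" -le 8 ] || return 1      # at most eight hex digits
    printf 'U%08X\n' "0x$hex"
}
```

For example, normalize_codepoint U2717 prints U00002717, while an entry containing anything other than hex digits after the U is rejected with a non-zero exit status.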
Pressing Super+Shift+Tab invokes the \(\LaTeX\) input method but copies the
selected character to the clipboard instead of inserting it using XDoTool,
which is handy when XDoTool is misbehaving.
How the program works
The program uses dmenu to get user input and present a list of possible
completions to the user,
XDoTool to insert characters,
and xclip to copy characters.
I have the i3 window manager configured to run dmenu_latexinput.zsh
whenever Super+Tab is pressed and dmenu_latexinput.zsh -c whenever
Super+Shift+Tab is pressed.
The -c command line option copies the selected character to the clipboard
instead of inserting it, -p does the same but to the X11 PRIMARY selection,
and -r outputs the selected character to stdout.
Other character mappings
\(\LaTeX\) additions
Code | Hex | Character |
---|---|---|
\crossoff | U+2717 | ✗ |
\Rls | U+211D | ℝ |
\Int | U+2124 | ℤ |
\Rxi | U+2102 | ℂ |
\Rat | U+211A | ℚ |
\Nat | U+2115 | ℕ |
--- | U+2014 | — |
-- | U+2013 | – |
Emoji shortcodes
In the process of writing and editing this post, I added the capability to automatically add all emoji shortcodes (using the Joypixels dataset from the Emojibase project). These shortcodes are based on the ones Discord uses.
Box drawing characters
The box-drawing character commands use the following schema:
the first character of the command represents the type of box-drawing character
("b" for single light, "d" for double), and then "u", "l", "d", and "r"
(in that order) are included if the character in question has a line going in
that direction.
For instance, blr maps to ─, dldr maps to ╦, and buldr maps to ┼.
This schema doesn't support all box-drawing characters present in Unicode,
but it supports those that are most necessary.
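To make the schema concrete, here is a small sketch that decodes a command name into its line style and directions. It is only an illustration of the naming rule; describe_box_code is a hypothetical function that the program itself does not contain, and it assumes (as in the tables below) that every code names at least two directions.

```shell
# Hypothetical decoder for the box-drawing naming schema described above.
describe_box_code() {
    local code="$1" style rest out
    case "$code" in
        b*) style="light" ;;
        d*) style="double" ;;
        *) return 1 ;;
    esac
    rest="${code#?}"   # drop the style letter, leaving the direction letters
    # Directions must appear in the order u, l, d, r, with at least two present.
    case "$rest" in
        ul|ud|ur|ld|lr|dr|uld|ulr|udr|ldr|uldr) ;;
        *) return 1 ;;
    esac
    out="$style:"
    case "$rest" in *u*) out="$out up" ;; esac
    case "$rest" in *l*) out="$out left" ;; esac
    case "$rest" in *d*) out="$out down" ;; esac
    case "$rest" in *r*) out="$out right" ;; esac
    printf '%s\n' "$out"
}
```

With this, describe_box_code dldr prints "double: left down right", which matches ╦, and a code whose letters are out of order (such as brl) is rejected.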
Code | Hex | Character |
---|---|---|
bur | U+2514 | └ |
blr | U+2500 | ─ |
bldr | U+252C | ┬ |
buldr | U+253C | ┼ |
bud | U+2502 | │ |
bulr | U+2534 | ┴ |
bul | U+2518 | ┘ |
bld | U+2510 | ┐ |
bdr | U+250C | ┌ |
budr | U+251C | ├ |
buld | U+2524 | ┤ |
dlr | U+2550 | ═ |
dud | U+2551 | ║ |
ddr | U+2554 | ╔ |
dld | U+2557 | ╗ |
dur | U+255A | ╚ |
dul | U+255D | ╝ |
dudr | U+2560 | ╠ |
duld | U+2563 | ╣ |
dldr | U+2566 | ╦ |
dulr | U+2569 | ╩ |
duldr | U+256C | ╬ |
Source code
For dmenu_latexinput.zsh
#!/usr/bin/env zsh
typeset -A latex_syms
. $(dirname $(realpath $0))/latex_syms.zsh

# Other symbols
latex_syms[\crossoff]=U00002717
latex_syms[\Rls]=U0000211D
latex_syms[\Int]=U00002124
latex_syms[\Rxi]=U00002102
latex_syms[\Rat]=U0000211A
latex_syms[\Nat]=U00002115
latex_syms[---]=U00002014
latex_syms[--]=U00002013

# Emoji
latex_syms[:rat:]=U0001F400
latex_syms[:mouse:]=U0001F401

# Box-drawing characters
# Explanation: first char is type of character ("b" for single light, "d" for
# double), and then "u", "l", "d", and "r" (in that order) are included if the
# character in question has a line going that direction.
## single light
latex_syms[bur]=U00002514 # "└"
latex_syms[blr]=U00002500 # "─"
latex_syms[bldr]=U0000252C # "┬"
latex_syms[buldr]=U0000253C # "┼"
latex_syms[bud]=U00002502 # "│"
latex_syms[bulr]=U00002534 # "┴"
latex_syms[bul]=U00002518 # "┘"
latex_syms[bld]=U00002510 # "┐"
latex_syms[bdr]=U0000250C # "┌"
latex_syms[budr]=U0000251C # "├"
latex_syms[buld]=U00002524 # "┤"

## double
latex_syms[dlr]=U00002550
latex_syms[dud]=U00002551
latex_syms[ddr]=U00002554
latex_syms[dld]=U00002557
latex_syms[dur]=U0000255A
latex_syms[dul]=U0000255D
latex_syms[dudr]=U00002560
latex_syms[duld]=U00002563
latex_syms[dldr]=U00002566
latex_syms[dulr]=U00002569
latex_syms[duldr]=U0000256C

local rp=$(realpath $0)
local last=$(dirname $rp)/../logs/$(basename $rp)-last

local charname=$(print -rl \!last ${(@k)latex_syms} | dmenu)
local charcode=""

if [[ $charname == "!last" ]]
then
    charname=$(cat $last)
fi

if [[ $charname[1] == "U" ]]
then
    # bad input sanitization
    charcode=$charname
else
    charcode=$latex_syms[$charname]
fi

# bad input sanitization
local char=$(eval "print \"\\$charcode\"")
echo $charname >$last

if [[ $1 == "-c" ]]
then
    echo -n $char | xclip -selection clipboard
elif [[ $1 == "-p" ]]
then
    echo -n $char | xclip
elif [[ $1 == "-r" ]]
then
    echo $char
else
    xdotool key --clearmodifiers \
        $charcode
fi
For gen_syms.jl
#!/usr/bin/env julia
import REPL, Downloads, JSON

const TARGET = "latex_syms"
const EMOJIBASE_URL = "https://github.com/milesj/emojibase/raw/master/packages/data/en/shortcodes/joypixels.raw.json"

# Turn one "hex codepoint => shortcode(s)" entry into ":shortcode:" => codepoint pairs
flip1(k::String, vs::Vector) = map(v->":$v:"=>parse(UInt32, k, base=16), vs)
flip1(k::String, v::String) = flip1(k, [v])

# Backslash-escape parentheses, brackets, and braces for use in a zsh subscript
escape_paren_brkt(x) = replace(x, "("=>raw"\(", ")"=>raw"\)",
                                  "["=>raw"\[", "]"=>raw"\]",
                                  "{"=>raw"\{", "}"=>raw"\}")
to_xdotool(x) = length(x)!=1 ? @error(x) :
    "U"*uppercase(string(UInt(x[1]), base=16, pad=8))

chars2xdotool1((x,y),) =
    "$TARGET[$(escape_paren_brkt(x))]='$(to_xdotool(y))'"
chars2xdotool(ps) = join(map(chars2xdotool1, ps), "\n")

iobuf = IOBuffer()

Downloads.download(EMOJIBASE_URL, iobuf)

shortcode_db = JSON.parse(String(take!(iobuf)))
# Skip multi-codepoint emoji, whose keys contain a hyphen
shortcode_db_proc = vcat(map(kvs->flip1(kvs...),
    filter(((k, v),)->isnothing(match(r"-", k)), collect(shortcode_db)))...)

typeset = "typeset -A $TARGET\n"
latexes = chars2xdotool(collect(REPL.REPLCompletions.latex_symbols))
emoji = chars2xdotool(shortcode_db_proc)

# Separate the two lists with a newline so their boundary lines do not fuse
println(typeset*latexes*"\n"*emoji)
For latex_syms.zsh
Generated by gen_syms.jl.
The relevant parts of i3.conf
bindsym $mod+Tab exec --no-startup-id zsh -c dmenu_latexinput.zsh
bindsym $mod+Shift+Tab exec --no-startup-id zsh -c 'dmenu_latexinput.zsh -p'
Generating the above tables
This code is hacked-together and kind of disgusting, but it's worth including here to prevent it from being lost forever.
lsre=r"^latex_syms\[(.+)\]=U(....)(....)"
srccode=clipboard()
clipboard(join(map(m->let
        code=m[1]; uh=m[2]; ul=m[3];
        uc="U+"*(uh!="0000" ? uh*" "*ul : ul);
        ch=Char(parse(UInt32, uh*ul, base=16));
        "| ~$code~ | $uc | $ch |"; end,
    filter(!isnothing, match.(lsre, split(srccode, "\n")))), "\n"))
[1] Other times require more advanced means of typesetting math, such as \(\LaTeX\). To incorporate full \(\LaTeX\) output into online discussions that support only text and images, I've developed a similar program that takes the contents of the clipboard, compiles it using \(\LaTeX\), and copies the output as an image to the clipboard.
The programming language Julia, designed for math,
scientific computing, and technical computing, allows its users to
use symbols in variable names the same way one might use symbols in published
math formulae.
(For example, sin(θ) is valid and, to an extent, encouraged.)
To accommodate this, the Julia REPL has a feature where typing the \(\LaTeX\)
code for a symbol (like \theta) and pressing Tab replaces that code with
the symbol it represents (θ, here).
While I can't hijack the Tab key across the entire operating system,
I found the idea interesting and decided to adapt it into a method for entering
Unicode symbols across the operating system.
[2] To be fair to the developer of XDoTool, the way my script uses XDoTool may be to blame: the script might call XDoTool while the user is still holding down the Enter key, which would prevent XDoTool from inserting the desired character.