K_TTY Language description

K_TTY understands a language (or file format) closely related to that read by the Tavultesoft keyboard manager (Keyman), version 3.2. The main difference is that K_TTY is a Linux driver, whereas Keyman works on MS Windows (and possibly other operating systems by now). Many of the other differences stem from the fact that the two programs are not related internally, as far as I know. I wrote K_TTY after using Keyman some years ago and thinking that something similar for Linux would be very handy.

There are some differences in the strictness of the grammar between the two programs: some constructs used in Keyman are silently ignored, and some new "instructions" have been added. The language that Keyman uses is itself based on SIL's "Consistent Changes", but with a much reduced instruction set. Some of K_TTY's extra instructions are inspired by a couple of Keyman's omissions.

The flavour of the language that K_TTY understands has zero or more header items, and one or more groups of pattern matching rules.

Glossary

In the text below, certain terms have a specific meaning that may not be exactly what you expect, and other representations are used for convenience.
String
Either: a sequence of literal bytes enclosed in double or single quotes. Single quotes are the plain, non-directional kind, i.e. both the opening and the closing quote must be ASCII code 0x27, as used in shell and C programming;
one or more numerical byte values in base 10, represented as dNNN;
one or more numerical Unicode values in hexadecimal, represented as U+NNNN;
or a mixture of the above.
Keyvalue
A single byte or, in UTF-8 mode, a single UTF-8 value (as a quoted literal string, or in U+NNNN notation).
NL
A line break, i.e. most commands may not be broken across lines without using the continuation marker.
NN
A number.
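
The String forms above can be illustrated with a small Python sketch that turns a string description into the bytes it stands for. This is not K_TTY's compiler: the function name parse_string, the use of whitespace between items, and the choice to encode quoted literals and U+NNNN values as UTF-8 are all assumptions made for the example.

```python
import re

# One item is a quoted literal, a decimal byte (dNNN) or a Unicode value (U+NNNN).
# Assumption: items are separated by optional whitespace.
TOKEN = re.compile(r"""\s*(?:"([^"]*)"|'([^']*)'|d(\d+)|U\+([0-9A-Fa-f]+))""")


def parse_string(text):
    """Return the bytes described by a String in the glossary's notation."""
    out = bytearray()
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unrecognised item at offset {pos}: {text[pos:]!r}")
        double_quoted, single_quoted, decimal, unicode_hex = m.groups()
        if double_quoted is not None:
            out += double_quoted.encode("utf-8")
        elif single_quoted is not None:
            out += single_quoted.encode("utf-8")
        elif decimal is not None:
            value = int(decimal, 10)
            if value > 255:
                raise ValueError(f"d{decimal} does not fit in one byte")
            out.append(value)
        else:
            # Assumption: Unicode values are emitted in UTF-8.
            out += chr(int(unicode_hex, 16)).encode("utf-8")
        pos = m.end()
    return bytes(out)


mixed = parse_string('"ab" d99 U+00E9')
```

Here mixed combines all three forms in one String: a quoted literal, one decimal byte and one Unicode value.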

Header Items

Implemented Header Items

name keyboard name NL
The name of the keyboard, used to reference this keyboard as a unique entity to the kernel. If the name is changed, it is counted as a different keyboard. If an attempt is made to load a keyboard with a name that matches one already loaded, then the kernel will interpret this as an attempt to replace the currently loaded one, which will fail if the current one is in use. Limits: the compiler's binary file format has been defined with a maximum of 64 bytes for this value, and currently the kernel only uses the first 16 bytes of this value.

store (store_name) store_definition NL
A store is a string (or sequence of bytes) referenced by store_name that may be used either for matching / translating input or as a predefined string that you might want to output from more than one rule. The second case may seem more space efficient, and certainly makes for a smaller keyboard definition and easier editing, but if caching is used then each new store adds to the memory needed by the kernel.

begin unicode > use(group name) NL
This file expects to be used on a console/terminal that uses Unicode in UTF-8 format. In addition, processing starts with the named group, rather than with the first one found in the file.

begin > use(group name) NL
Start processing using the group named, rather than the first one found in the file.

Currently Unimplemented Header Items

hotkey [shiftstate key] NL
This command is used in Keyman(TM) 3.2 to set a "hot key" combination for starting/stopping this keyboard. Later versions of Keyman are believed to deprecate this command, as keyboard selection is done by the standard Windows(TM) mechanisms.

At present, there is no "standard" Linux mechanism for switching keyboards (other than running the right command), nor is it possible to obtain the shift state of a dumb terminal or a telnet/ssh session.

Ignored Header Items

bitmaps string string NL
This is used by Keyman 3.2 on Windows to specify a pair of bitmap icons, one displayed when the keyboard is in use and one when it is not.
version string NL

Groups

group (name) 
group (name) using keys
A group has one or more "rules", each of which contains three parts: a pattern; an action; and a "greater than" sign that separates the two. If the optional "using keys" is given, then each match part in the group must finish with a "+" sign followed by a (possibly multi-byte) keystroke. The "+" sign is not optional in a "using keys" group, and is not permitted in a group that does not use keys. Items marked with "*" are not acceptable to Keyman 3.2.

Pattern items

string
A string that should be matched exactly.
any(store_name)
The character under test (UTF-8 included) should match a character in this store.
outs(store_name)
The named store contains a string that should be matched exactly.
deadkey(NN)
deadkey(U+NNNN)*
deadkey("keyvalue")*
A state marker equal to the keyvalue. Keyboard designers may use this to match a "marker" that another rule has previously placed in this position. Its value may be a decimal integer, a literal character (possibly in UTF-8) surrounded by quotes, or a hexadecimal Unicode character. It may not be zero. "deadkey" may be abbreviated to "dk".
isset(NN)*
Matches if flags AND NN=NN, where NN is an int.
isclear(NN)*
Matches if flags AND NN=0, where NN is an int.
match
This item may only occur as the final pattern (or penultimate if followed by a nomatch pattern) and does not co-occur with other pattern items. Its corresponding action is executed if any previous rule in the group has been successfully matched.
nomatch
This item may only occur as the final pattern (or penultimate if followed by a match pattern) and does not co-occur with other pattern items. Its corresponding action is executed if no previous rule has matched. Beware that if a nomatch command is used, it overrides the default of outputting the key pressed. If you want to output the key and do something else too, use the matched_key command. This may differ from Keyman's behaviour, since Keyman does not have the matched_key command.
( ITEM or ITEM )*
A successful match on either the first or second ITEM will cause the expression to be successful. Note that the "(" and ")" are required, and that currently ITEM may not be another (...or...) sequence, i.e. "or" statements do not currently nest, although the syntax has been designed to cope with this, should the interpreter ever be rewritten so that it can. Note that for the purposes of simply matching characters, the any() instruction is more powerful. The statement is mainly intended to be used with the flags commands, and possibly as a way of combining any()s.
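
The two flag tests and the ( ITEM or ITEM ) form can be sketched in Python as follows. This only restates the arithmetic given above; the function names are illustrative and are not taken from the K_TTY source.

```python
def isset(flags, mask):
    """isset(NN): succeeds when every bit of mask is set in flags."""
    return (flags & mask) == mask


def isclear(flags, mask):
    """isclear(NN): succeeds when no bit of mask is set in flags."""
    return (flags & mask) == 0


def either(first, second):
    """( ITEM or ITEM ): succeeds when at least one of the two items matches."""
    return first or second


flags = 0b0101

single_bit_set = isset(flags, 0b0100)      # bit 2 is set
single_bit_clear = isclear(flags, 0b0010)  # bit 1 is clear

# With a multi-bit mask, a value can be neither "set" nor "clear":
# 0b0110 has one bit in common with flags and one bit that is missing.
mixed_set = isset(flags, 0b0110)
mixed_clear = isclear(flags, 0b0110)

combined = either(isset(flags, 0b1000), isclear(flags, 0b0010))
```

Because a mask with several bits can fail both tests, isset() and isclear() are not simple opposites, which is one reason the "or" form is useful with them.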

Action items

string
A string that should be output exactly.
outs(store_name)
The named store contains a string that should be output exactly.
deadkey(keyvalue)
Place a state marker equal to keyvalue (see above for acceptable identifiers) onto the history list, but output nothing. This may be used either to remember a key that has been pressed or to prevent this particular output being matched by other rules that might otherwise do so.
index(store_name, NN)
Output a character from store_name: the one at the same position as the character matched by the Nth any() in the pattern, where N is the number NN.
beep
Emit a beep (send ASCII 0x07 "BELL") to warn the user that they have done something you disapprove of.
use(group_name)
Process the current state using the named group.
return
Return immediately to waiting for input. Do not process any match instructions, and do not act on any instructions after a use() call.
nul
Do nothing.
del(NN)*
Send NN delete characters, where NN is an unsigned short.
set(NN)*
Set flags to flags OR NN, where NN is an int.
clear(NN)*
Set flags to flags AND NOT NN, where NN is an int.
toggle(NN)*
Set flags to flags XOR NN, where NN is an int.
context
Output everything except the key that triggered the match.
matched_key*
Output the key that triggered the match.
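
As a rough Python sketch of two of the action families above: the flag actions as bit operations, and the way index() pairs with a preceding any(). The function names, the 1-based numbering of any() items, and the use of OR for set() (the operation needed for isset() to succeed afterwards) are assumptions made for the example, not details confirmed by the K_TTY source.

```python
def set_flags(flags, mask):
    """set(NN): turn on the bits in mask (assumed to be a bitwise OR)."""
    return flags | mask


def clear_flags(flags, mask):
    """clear(NN): flags AND NOT mask."""
    return flags & ~mask


def toggle_flags(flags, mask):
    """toggle(NN): flags XOR mask."""
    return flags ^ mask


def match_any(store, ch):
    """any(store): return the offset of ch in store, or None if it is absent."""
    offset = store.find(ch)
    return None if offset < 0 else offset


def index(store, n, any_offsets):
    """index(store, N): the character of store at the offset found by the Nth any().

    Assumption: N counts any() items in the pattern starting from 1.
    """
    return store[any_offsets[n - 1]]


# Hypothetical stores for an upper-casing rule.
lower = "aeiou"
upper = "AEIOU"

offsets = [match_any(lower, "i")]  # the pattern contained a single any(lower)
result = index(upper, 1, offsets)  # the character of upper at the same offset

flags = set_flags(0b0101, 0b0010)
flags_after_clear = clear_flags(flags, 0b0100)
flags_after_toggle = toggle_flags(flags_after_clear, 0b0001)
```

Keeping two stores of equal length and pairing any() with index() is what allows a single rule to translate a whole class of characters.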