This file is intended for developers' reference. It probably makes no sense if you don't understand the keyman language, and only little sense to non-programmers. Latter parts are probably best read with a copy of the source code open in another window. I've tried to put the most general stuff at the start, so once it's just too much gibberish, give up and you shouldn't have missed much else that you're likely to understand.
any(A) + any(foo) > context index(bar,2) any(B) + any(foo) > index(bar,2) context any(C) + any(foo) > context index(bar,2)Thus the decision was made to store the fact that an any() lookup had occurred on that history item and store id, and what the result (true/false) was. It was later realised that for certain keyboards this was entirely pointless. E.g. consider a keyboard of form:
any(foo) "X" + "Y" > index(fooOUT,1) any(foo) "X" + "Z" > index(fooOUT,1) "Z" any(bar) "X" + "Y" > index(barOUT,1)In this keyboard there is no posibility that any values that might be stored in a cache might be useful, but the caches are taking up valuable kernel memory and cache lookups and updates take cpu cycles. E.g. a keyboard that has 32 stores (some known keyboards have double this) will require 128*32*2/8 bytes, or 1k of additional kernel storage per tty in use, just for the cache. While this may seem a trivial amount on a modern machine, it all adds up when you have a dozen virtual consoles, a few xterms and some more users logged in over the net. Swapping is by no means a thing of the past and the average user does not want to wait for too many page faults every time they press a key.
Once everything is compiled, the user has 3 sorts of files to deal with. If the user has kept to the (current) default naming scheme, these are:
.kmn
) files:
.supp
) files:
kttyload
) uses to convince itself and the
kernel that the .out
file is a valid keyboard. In loading the
keyboard, the information in this file is loaded into a struct
which also has a pointer to the byte code. When switching keyboards the
pointer is NULL
, but otherwise the struct is filled in the
same way. This means that the keyboard loader needs the .supp
in order to switch keyboards.
.out
) files:
A number of alternative methods of doing things (rather than the "two-output-files-per-input-file" described above) were briefly considered, but eventually the following thoughts came to have dominance:
The single file format (foo.ktty) includes an 8 byte header "K_TTY b\n" (the "b" standing for binary), the keyboard name (up to 32 bytes), a file format number, the sanity-checking information included in the support file (except as binary data). and the keyboard data itself.
In the byte-code, groups of commands in the original input file are preserved, albeit rearranged into the propper search order, and two methods of storing them are permitted by the kernel. It is intended that the compiler should make an inteligent choice regarding how best to store each group.
(max_rule_length - avg_rule_length)*number_of_rulesin "wasted space", and is obviously ideal where all the rules compile to the same length.
sizeof(unsigned short) * number of rulesin wasted space, and also requires a few more CPU cycles on each rule as we need to read the length of the next rule as we go through. It is the obvious choice when we have a some very long rules and the rest are quite short.
The binary data follows the following format (as described in k_tty.h):
MAX_K_STR (currently 255) of them are.
These should ideally be sorted such that caching may be optimal, but are
currently sorted by (decreasing) length order, which we hope will not be
too bad an approximation;
- an extra (char) zero;
- one or more group(s) of (input, output) rule pairs, each group
is terminated by an optional match or nomatch rule and an
END_TBL rule (0);
The starts of consecutive rules are separated by precisely seqlen bytes.
The rules are sorted by decending length and order in the input file.
If seqlen (excluding any flags) for a group is zero then
we have variable seqlens, and the first sizeof(SEQLNTYPE) bytes
in each rule pair hold the length of this rule, excluding
the sizeof(SEQLNTYPE). Such groups should also have the
group flag F_GRP_VARSEQ set, though if they don't then the kernel fixes
this.
- an extra END_TBL rule (i.e a null table).
Limitations: The block must be small enough that kmalloc can grab a block that size, and must not be longer than the max value an unsigned int can hold (since this is 4Gb on a x86, this should be plenty). A single compiled `line' of a match rule cannot add up to more than the length storable in the residual bits of a SEQLNTYPE & F_GRP_SEQLEN. (16777215 with an int and F_GRP_SEQLEN=0x00FFFFFF) There may not be more than 65535 (MAX_SHORT) groups or stores
There will be a performance penalty if you use more stores than MAX_K_STR, but I'm not sure how large. Probably negligable, unless you use late string values a lot. Use of more stores than MAX_CACHE in any() tests will probably cause a big slowdown, since for each any() lookup that needs to be done it'll have to hunt through the (uncached) stores to find the trailing zeros. A large MAX_CACHE will eat RAM. On the otherhand, its probably all probably negligable on a modern machine anyway.