Saturday, December 24, 2016

[nxhebfuc] 34 characters for English

A ... Z
thorn eng ampersand
space
apostrophe
( ) #

Parentheses -- a matched pair of delimiters -- allow hierarchically structured text.  The most obvious units to delimit are sentences and paragraphs.
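To make the hierarchy concrete, here is a minimal sketch that reads parenthesized text into nested lists.  The function name parse, and the convention that an outer group is a paragraph and an inner group a sentence, are assumptions for illustration and not part of the encoding.  It ignores the # escape entirely.

```python
def parse(text):
    """Parse parenthesized text into nested lists of strings."""
    stack = [[]]  # stack[-1] is the group currently being filled
    for ch in text:
        if ch == "(":
            stack.append([])
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unmatched )")
            group = stack.pop()
            stack[-1].append(group)
        else:
            # Extend the current run of plain text, or start a new one.
            top = stack[-1]
            if top and isinstance(top[-1], str):
                top[-1] += ch
            else:
                top.append(ch)
    if len(stack) != 1:
        raise ValueError("unmatched (")
    return stack[0]


# Hypothetical convention: outer group = paragraph, inner groups = sentences.
paragraph = parse("((HELLO WORLD)(SECOND SENTENCE))")
```

Here paragraph holds one paragraph containing two sentences, each a list with a single string.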

The number sign # functions like an escape character.  The letters a ... j after # map to digits (probably written as a run after a single #, like #cegab).  #k and beyond might encode common punctuation and other things that need escaping, e.g., formatting and capitalization.  #( ... ) is intriguing.
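A rough sketch of how the digit escape might be decoded.  Several things here are assumptions: that a is 0 and j is 9 (rather than a being 1), that a digit run ends at the first character outside a ... j, and that case is ignored.  The name decode_digits is made up.  Escapes from #k onward are left untouched because their meaning is undecided.

```python
import re

DIGITS = "abcdefghij"  # assumption: a=0, b=1, ..., j=9


def decode_digits(text):
    """Replace each # followed by a run of a..j with the digits it spells."""
    def repl(match):
        return "".join(str(DIGITS.index(c)) for c in match.group(1).lower())

    # Assumption: the run ends at the first character outside a..j.
    return re.sub(r"#([a-jA-J]+)", repl, text)


print(decode_digits("#cegab"))    # 24601 under the a=0 assumption
print(decode_digits("ROOM #BA"))  # ROOM 10
```

One weakness of this guess is that a number cannot be followed immediately by a word starting with a letter in a ... j, so some terminator (a space, or another escape) would be needed.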

The parentheses suggest that text encoded using these characters should actually be interpreted as a Lisp program that emits Unicode or more structured text.  Lisp commonly uses dashes in identifiers to separate words (note that CamelCase is not available in our encoding because there is no capitalization), but we don't have a dash either.  Maybe identifiers are permitted to have spaces and every atom is surrounded by parentheses.  Or, weirdly, use apostrophe as a separator.  Or distinguish between one and two spaces.  Or surround identifiers containing spaces with a special form.  Or add a dash or underscore, increasing the count to 35.

34 (or 35) is a bit larger than 32, a nice power of two, though changing base is not a big deal.  Fitting within 36=6*6 might be useful.
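The base arithmetic can be sketched as follows.  The ordering of ALPHABET, the choice of the glyphs Þ and Ŋ for thorn and eng, and all function names are assumptions made for illustration.  Packing treats a string as a base-34 number (the length is kept separately so that leading A's, which have index 0, survive), and the 6*6 idea is shown by splitting each character index into two base-6 digits.

```python
import math

# Assumed ordering: A..Z, thorn, eng, ampersand, space, apostrophe, ( ) #
ALPHABET = [chr(c) for c in range(ord("A"), ord("Z") + 1)] + [
    "Þ", "Ŋ", "&", " ", "'", "(", ")", "#",
]
BASE = len(ALPHABET)  # 34


def pack(text):
    """Return (length, integer) with text read as a base-34 number."""
    n = 0
    for ch in text:
        n = n * BASE + ALPHABET.index(ch)
    return len(text), n


def unpack(length, n):
    """Inverse of pack."""
    chars = []
    for _ in range(length):
        n, r = divmod(n, BASE)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))


def to_base6_pair(ch):
    """Split a character index into two base-6 digits (fits since 34 <= 36)."""
    return divmod(ALPHABET.index(ch), 6)


def from_base6_pair(hi, lo):
    return ALPHABET[hi * 6 + lo]


# Information per character: a little over the 5 bits a 32-symbol set would use.
bits_per_char = math.log2(BASE)
```

With 34 symbols, two of the 36 base-6 pairs go unused, which is exactly the room a dash and an underscore would take up.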
