Monday, 1 March 2021

Colour Names 3

Some more colour naming experiments...

After creating the RGB-125 palette (one-word names for colours based on five levels each of red, green and blue), I reduced the number of levels per channel to four, giving 64 colours (4x4x4). This is similar to the EGA source palette.

The EGA palette uses the adjective "bright" to describe colours with any channel at 100% intensity. This idea is mirrored in the "vivid" modifier of the ISCC-NBS system and Color Naming System. I thought that "bright" could be confused with "light", so I chose "vivid" as the prefix for any RGB-64 colour that has maximal chroma.

RGB-64 has 39 base colour names ("amaranth", "amber", "apple", "aquamarine", "azure", "black", "blue", "bluebell", "brown", "celeste", "cerise", "cerulean", "chartreuse", "cyan", "denim", "erin", "green", "grey", "harlequin", "inchworm", "jade", "liberty", "magenta", "mauve", "milan", "mint", "olive", "orange", "orchid", "pink", "plum", "poison", "purple", "red", "spring", "tradewind", "violet", "white" and "yellow") and 3 modifiers ("dark", "light" and "vivid"). The eleven (contentious) basic colour terms ("black", "blue", "brown", "green", "grey", "orange", "pink", "purple", "red", "white" and "yellow") are all among the base names.

RGB-27 is a strict subset of RGB-125 covering 3x3x3 colour points. Unfortunately, we lose "brown" and "pink" in the process.

At this point, I turned my attention to limited, hue-based colour naming schemes.

HSL-79 is a scheme based on the HSL colour space. The hue (an angle between 0 and 360 degrees) is split into 12 equal segments:

  • red
  • orange
  • yellow
  • chartreuse
  • green
  • spring
  • cyan
  • azure
  • blue
  • violet
  • magenta
  • rose
These hues can be optionally modified with prefixes:
  • deep (very low lightness)
  • dark (low lightness)
  • light (high lightness)
  • pale (very high lightness)
  • dull (low saturation)
If the saturation is high, no prefix is used. If the saturation is very low, the colour is achromatic and the lightness is used to pick from a grey scale:
  • black
  • deep grey
  • dark grey
  • grey
  • light grey
  • pale grey
  • white
These combine to form 79 unique colours: "black" and "white" do not take prefixes and "grey" cannot be "dull". I quite liked the names, though "chartreuse" and "spring" feel a little clunky.
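
As a quick check on the arithmetic, here is a minimal Python sketch that enumerates the HSL-79 names from the lists above. The hue segment boundaries are an assumption: the sketch centres each 30-degree segment on its hue (so "red" runs from 345 to 15 degrees), which may not match the actual partitioning, and the function names are made up for illustration.

```python
HUES = ["red", "orange", "yellow", "chartreuse", "green", "spring",
        "cyan", "azure", "blue", "violet", "magenta", "rose"]
HUE_PREFIXES = ["", "deep", "dark", "light", "pale", "dull"]
GREYS = ["black", "deep grey", "dark grey", "grey",
         "light grey", "pale grey", "white"]


def hue_name(hue_degrees):
    """Map a hue angle to one of the 12 equal segments.

    Assumption: segments are centred on each hue, so "red" covers
    345..15 degrees. The real boundaries may differ.
    """
    return HUES[int(((hue_degrees % 360) + 15) // 30) % 12]


def all_names():
    """List every name: 12 hues x 6 prefix options, plus the grey scale."""
    chromatic = [f"{prefix} {hue}".strip()
                 for hue in HUES for prefix in HUE_PREFIXES]
    return chromatic + GREYS


print(len(all_names()))  # 12 * 6 + 7 names
```

The count works out as 12 hues times 6 options (no prefix plus five prefixes), plus the 7 achromatic names.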

HSV-79 is similar but based on the HSV colour space instead of HSL. I tried to parameterize the partitioning so that each segment contains approximately the same number of RGB (256x256x256) colour points.

HWB-91 is based on Alvy Ray Smith's HWB colour space. I can't believe I hadn't come across this before, but there you go! The twelve additional colour points (up from 79 to 91) are due to the addition of the "vivid" modifier for hues. Instead of using explicit conditions to partition the space, the "whiteness" and "blackness" coordinates are used as the basis for a Euclidean "nearest neighbour" algorithm based on the following reference points (in this case, for "red"):

HWB-91 reference points for a given hue (x-axis is blackness, y-axis is whiteness)

This produces quite intuitive results. Take, for example, the web-safe colour "#FFCCFF"; "pale magenta" seems like an appropriate name:

Colour names for #FFCCFF in various schemes

The numbers in the final column are the ΔE*(2000) colour differences between "#FFCCFF" and the nearest colour from the discrete palette.
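
The nearest-neighbour idea behind HWB-91 can be sketched in Python roughly as follows. Note that the reference point coordinates below are placeholders invented for this sketch (the real ones are only given in the figure), as are the function names and the assumption that hue segments are centred on each hue. Only the conversion to whiteness and blackness (the minimum channel, and one minus the maximum channel) is standard HWB.

```python
import colorsys
import math

HUES = ["red", "orange", "yellow", "chartreuse", "green", "spring",
        "cyan", "azure", "blue", "violet", "magenta", "rose"]

# PLACEHOLDER reference points as (whiteness, blackness).
# These values are invented for illustration only.
REFERENCE_POINTS = {
    "vivid {hue}": (0.00, 0.00),
    "{hue}": (0.15, 0.15),
    "light {hue}": (0.45, 0.05),
    "pale {hue}": (0.75, 0.05),
    "dark {hue}": (0.05, 0.45),
    "deep {hue}": (0.05, 0.75),
    "dull {hue}": (0.35, 0.35),
    "white": (1.00, 0.00),
    "pale grey": (0.85, 0.15),
    "light grey": (0.70, 0.30),
    "grey": (0.50, 0.50),
    "dark grey": (0.30, 0.70),
    "deep grey": (0.15, 0.85),
    "black": (0.00, 1.00),
}


def rgb_to_hwb(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, whiteness, blackness)."""
    r, g, b = r / 255, g / 255, b / 255
    hue = colorsys.rgb_to_hsv(r, g, b)[0] * 360
    return hue, min(r, g, b), 1 - max(r, g, b)


def hwb_name(r, g, b):
    """Name a colour by its nearest reference point in the (w, b) plane."""
    hue, whiteness, blackness = rgb_to_hwb(r, g, b)
    template = min(
        REFERENCE_POINTS,
        key=lambda name: math.dist(REFERENCE_POINTS[name],
                                   (whiteness, blackness)),
    )
    # Assumption: 30-degree hue segments centred on each hue.
    hue_word = HUES[int((hue + 15) // 30) % 12]
    return template.format(hue=hue_word)


print(hwb_name(0xFF, 0xCC, 0xFF))
```

With these placeholder points, "#FFCCFF" (whiteness 0.8, blackness 0) lands nearest the "pale" point. Different reference points would of course move the boundaries.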

Next, I hope to look into CNS (from 1982) and/or the Artist's Color Naming System (from 1986).

Tuesday, 23 February 2021

Colour Names 2

The companion colournames.html web page contains experiments with naming colours.

As mentioned previously, HTML5/CSS colour names are highly subjective and lists have developed organically over many decades. As the 140 unique names mentioned in the first post derive from the original X11 list, I'll refer to them as X11 colours.

When plotted in RGB space, it is obvious that named X11 colours aren't evenly distributed:

The 140 unique X11 colours plotted in RGB space

There is a large cluster of very pale (near white) colours. But what about hue distribution?

A simplistic mechanism for dividing a colour space is to slice it up according to HSL parameters. If we use the following pseudocode, we get nine colour groups:

if lightness >= 95% then group = "whites"
else if saturation <= 15% then group = "blacks"
else if hue < 20° or hue >= 320° then group = "reds"
else if hue < 45° then group = "oranges"
else if hue < 75° then group = "yellows"
else if hue < 155° then group = "greens"
else if hue < 185° then group = "aquas"
else if hue < 255° then group = "blues"
else group = "purples"
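
For anyone wanting to try this, the pseudocode translates into Python along these lines. This is only a sketch: the function name is made up, and it relies on the standard library's colorsys.rgb_to_hls, which returns hue, lightness and saturation (in that order) as fractions between 0 and 1.

```python
import colorsys


def hsl_group(r, g, b):
    """Assign an 8-bit RGB colour to one of the nine groups."""
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    hue = h * 360  # degrees
    if l >= 0.95:
        return "whites"
    if s <= 0.15:
        return "blacks"
    if hue < 20 or hue >= 320:
        return "reds"
    if hue < 45:
        return "oranges"
    if hue < 75:
        return "yellows"
    if hue < 155:
        return "greens"
    if hue < 185:
        return "aquas"
    if hue < 255:
        return "blues"
    return "purples"


print(hsl_group(255, 165, 0))  # X11 "orange"
```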

For the X11 colours, we get 12 whites, 9 blacks, 23 reds, 20 oranges, 12 yellows, 17 greens, 14 aquas, 19 blues and 14 purples. This sounds like a fairly good distribution until you plot them as the number of colour names per degree of HSL hue:

X11 colour names per degree of HSL hue

We can see that the red-orange-yellow hue range is much more crowded.

One way to deal with this lack of uniform distribution is to pick colour points that are uniformly distributed and then name those points according to some dictionary. Our first attempt could be to pick RGB colours at regular intervals.

Web-safe colours are limited to six levels for each RGB channel (0%, 20%, 40%, 60%, 80% and 100%) for a total of 216 distinct colours. However, they are mostly unnamed. If we reduce the number of levels to five (0%, 25%, 50%, 75% and 100%), we get 125 distinct colours, similar to the number of X11 colours. Obviously, the X11 colours won't align perfectly, so there must be some fettling.
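
Generating such a regular grid is straightforward. A minimal Python sketch, assuming levels are spaced evenly between 0 and 255 and rounded to the nearest integer (the function name is made up, and the actual palettes may round the in-between levels differently):

```python
from itertools import product


def rgb_grid(levels):
    """Return hex codes for a levels x levels x levels grid of RGB colours."""
    steps = [round(255 * i / (levels - 1)) for i in range(levels)]
    return [f"#{r:02X}{g:02X}{b:02X}"
            for r, g, b in product(steps, repeat=3)]


print(len(rgb_grid(6)))  # web-safe: 216
print(len(rgb_grid(5)))  # 125
```

Six levels reproduce the web-safe steps exactly (0, 51, 102, 153, 204 and 255); with five levels the 25% and 75% steps fall between integers and have to be rounded.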

In the process of performing this experiment, I found that jan Misali has attempted something very similar. But it's an experiment, right? So repeating it cannot hurt.

The names in the so-called RGB-125 dictionary come from a variety of sources:
I also wanted the colour names to be single words, without qualifiers and unambiguous. For instance, "Chocolate" has very different RGB values in the various source dictionaries. "Lavender" is another example.

To compare colours objectively, I used the CIELAB ΔE*(1994) colour difference metric purely because there was a JavaScript function readily available. I should probably have used CIEDE2000, not least because the CIE94 algorithm is frustratingly quasimetric, i.e. CIE94(a, b) ≠ CIE94(b, a).
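
The asymmetry is easy to demonstrate. Here is a small Python sketch of the CIE94 formula (not the JavaScript function mentioned above), taking CIELAB triples as input and assuming the usual "graphic arts" constants (K1 = 0.045, K2 = 0.015, all k weights equal to 1). The chroma and hue weightings depend only on the chroma of the first colour, which is where the asymmetry comes from.

```python
import math


def cie94(lab1, lab2, k1=0.045, k2=0.015):
    """CIE94 colour difference between two (L*, a*, b*) triples.

    lab1 is treated as the reference colour: its chroma alone
    sets the weighting terms, so swapping arguments changes the result.
    """
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    # Squared hue difference; clamp to guard against rounding below zero.
    d_h2 = max(0.0, (a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2)
    s_c = 1 + k1 * c1
    s_h = 1 + k2 * c1
    return math.sqrt(d_l ** 2 + (d_c / s_c) ** 2 + d_h2 / s_h ** 2)


vivid = (50.0, 60.0, 0.0)  # chroma 60
grey = (50.0, 0.0, 0.0)    # chroma 0
print(cie94(vivid, grey), cie94(grey, vivid))
```

For this pair, the difference measured from the vivid colour is 60 / 3.7, but measured from the grey it is 60.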

I had to invent ten names: "majorelle", "leaf", "lagoon", "felicia", "frog", "lettuce", "roxo", "sororia", "limon" and "kovidar".

See the full RGB-125 table here.

Thursday, 18 February 2021

Tetrascii 3

I took the plunge and animated the tetromino character set with each glyph made up of 25 pieces falling in order:

The final code to produce the animations is surprisingly concise, but the process of generating the data tables was a bit more involved.

Firstly, I originally drew each glyph in PowerPoint, so the construction wasn't very data-friendly. I had to screen-grab the slide and process the resultant image via JavaScript and HTML5 canvas elements.

Each screen-grabbed glyph was "pixel walked" to work out which of the 10x10 texels were directly connected to their neighbours. From the 100 texel neighbour data, I was able to reconstruct the shape, position and orientation of the 25 tetrominoes that made up the glyph. The pixel intensity was used to determine if a tetromino was foreground or background.

The 25 tetrominoes for each glyph were ordered so that they stacked correctly, from bottom to top. This involved finding candidates (pieces whose lower boundaries all fit exactly on top of existing pieces) and picking a random candidate. Note that this simple scheme disallows overhangs, so pieces can drop vertically.
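
The ordering step can be sketched in Python like this. It is a reconstruction from the description above rather than the original code: pieces are assumed to be sets of (x, y) cells with y increasing downwards, and the function name is made up. A piece is a candidate when every cell directly beneath it is either the floor, part of the same piece, or already placed.

```python
import random


def stacking_order(pieces, height=10, rng=random):
    """Return piece indices in an order in which they can fall vertically.

    pieces: list of sets of (x, y) cells, with y increasing downwards.
    """
    placed = set()
    remaining = set(range(len(pieces)))
    order = []
    while remaining:
        # A candidate has every lower boundary resting on the floor
        # or on a cell that has already been placed.
        candidates = [
            i for i in sorted(remaining)
            if all(y + 1 == height
                   or (x, y + 1) in pieces[i]
                   or (x, y + 1) in placed
                   for x, y in pieces[i])
        ]
        if not candidates:
            raise ValueError("no piece can be placed without an overhang")
        choice = rng.choice(candidates)
        order.append(choice)
        placed |= pieces[choice]
        remaining.remove(choice)
    return order


# Two square pieces stacked in a 2x4 column: the lower one must fall first.
top = {(0, 0), (1, 0), (0, 1), (1, 1)}
bottom = {(0, 2), (1, 2), (0, 3), (1, 3)}
print(stacking_order([top, bottom], height=4))  # [1, 0]
```

The sketch raises an error if it ever runs out of candidates. Whether that can happen for a fully tiled glyph, and how the original code handles it, is not covered here.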

The glyphs are string-encoded as 25 groups of three characters: "<letter><x><y>".

"<letter>" indicates the shape and orientation of the piece. Lowercase letters are background pieces, uppercase are foreground:

     +---- +---- cyan
AB   |#### |#
     |     |#
     |     |#
     |     |#

     +---- +---- +---- +---- orange
EFGH |###  |##   |  #  |#
     |#    | #   |###  |#
     |     | #   |     |##
     |     |     |     |

     +---- +---- +---- +---- blue
IJKL |###  | #   |#    |##
     |  #  | #   |###  |#
     |     |##   |     |#
     |     |     |     |

     +---- +---- +---- +---- purple
MNOP |###  | #   | #   |#
     | #   |##   |###  |##
     |     | #   |     |#
     |     |     |     |

     +---- +---- red
QR   |##   | #
     | ##  |##
     |     |#
     |     |

     +---- +---- green
UV   | ##  |#
     |##   |##
     |     | #
     |     |

     +---- yellow
Y    |##
     |##
     |
     |

"<x><y>" is the coordinate within the 10x10 glyph grid of the top-left corner of the piece.

The resultant 75-character encoding for each glyph therefore encapsulates:

  • The order that the pieces fall
  • The shape and orientation of each piece
  • The column that each piece falls in
  • The row that each piece comes to rest on
  • The colour of each piece
A few more hoops have to be jumped through to convert this information into the final animated SVG elements, but that's because of the baroque relationships between SVG, HTML and CSS.
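
To make the string encoding concrete, here is a rough Python decoder. It is a sketch, not the original JavaScript: the shape table is transcribed from the diagram above (taking "Y" to be the 2x2 square), and it assumes that "<x>" and "<y>" are single decimal digits, that y counts down from the top, and that the coordinate refers to the top-left corner of the piece's bounding box. The function and table names are made up.

```python
# Rows of each piece, transcribed from the diagram ('#' is a filled cell).
SHAPES = {
    "A": ["####"], "B": ["#", "#", "#", "#"],
    "E": ["###", "#"], "F": ["##", " #", " #"],
    "G": ["  #", "###"], "H": ["#", "#", "##"],
    "I": ["###", "  #"], "J": [" #", " #", "##"],
    "K": ["#", "###"], "L": ["##", "#", "#"],
    "M": ["###", " #"], "N": [" #", "##", " #"],
    "O": [" #", "###"], "P": ["#", "##", "#"],
    "Q": ["##", " ##"], "R": [" #", "##", "#"],
    "U": [" ##", "##"], "V": ["#", "##", " #"],
    "Y": ["##", "##"],
}
COLOURS = {"AB": "cyan", "EFGH": "orange", "IJKL": "blue",
           "MNOP": "purple", "QR": "red", "UV": "green", "Y": "yellow"}


def decode_glyph(encoded):
    """Decode '<letter><x><y>' triples into a list of piece descriptions."""
    pieces = []
    for i in range(0, len(encoded), 3):
        letter = encoded[i]
        x, y = int(encoded[i + 1]), int(encoded[i + 2])
        key = letter.upper()
        cells = {(x + dx, y + dy)
                 for dy, row in enumerate(SHAPES[key])
                 for dx, ch in enumerate(row) if ch == "#"}
        colour = next(c for letters, c in COLOURS.items() if key in letters)
        pieces.append({"foreground": letter.isupper(),
                       "colour": colour, "cells": cells})
    return pieces


# "A06y08" is an invented two-piece fragment, not a real glyph.
for piece in decode_glyph("A06y08"):
    print(piece)
```

The list order gives the falling order, and the case of the letter gives foreground versus background, so everything in the bullet list above can be recovered from the string.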

Wednesday, 10 February 2021

Tetrascii 2

Well, it turns out that the Tetrascii lowercase letters are relatively easy if you open up the smaller counters:


It's not too bad a bitmap font, given the limitations imposed. Some of the glyphs are necessarily quirky, but that just adds character. Cough, cough.

Tetrascii 1

What do you get when you cross two old computer phenomena: ASCII and Tetris?


The "rules" are as follows:
  1. Each glyph must fit within a 10x10 grid.
  2. The foreground pixels must be constructible from standard tetrominoes.
  3. The background pixels (including counters) must be constructible from standard tetrominoes.
It's surprisingly difficult to construct a readable font. Obviously, I've cheated with the lowercase letters; but in my defence, the counters of the uppercase letters are hard enough. The minimum counter size is four pixels and you can see the trouble I had with the dollar sign.

A project for another time would be to animate the construction of text by falling pieces.

Wednesday, 27 January 2021

Whitty T-Shirt


  • Unpack your bags
  • Wash your hands
  • Next slide, please
* A three-word remark uttered by Chris Whitty

Sunday, 29 November 2020

egg Pointer Ambiguity

I've been beavering away on the egg programming language for the last few months. One of the major stumbling blocks has been the type system. As Hisham Muhammad points out, type systems are almost always more complex than they first appear to be. I've had to re-implement the compiler and virtual machine several times because of fundamental mistakes I've made with the egg type system, even though it's supposedly "simple."

Here's an example of one problem I'm still struggling with right now.

I'd like to have "safe pointers" in the egg language to handle concepts such as pass-by-reference:

bool safeDivide(float num, float den, float* out) {
  if (den == 0) {
    return false;
  }
  *out = num / den;
  return true;
}

This all looks hunky-dory, but now consider this:

any v = 123; // line 1
int* p = &v; // line 2
v = "hello"; // line 3

In line 1, we define a variable 'v' that can store most types of value and initialize it with an integer value. In line 2, we define a pointer variable 'p' and point it at 'v'. In line 3, we modify 'v' to be a string. The question is: "What is the value of '*p' after line 3?"

The type declaration of 'p' suggests that '*p' should (always) be an integer, but it's now pointing to a string. There's obviously something "wrong" here, but what exactly is it?

Option A

Line 2 should have produced a compile-time error along the lines of
Cannot initialize a pointer to a value of type 'int' with the address of a value of type 'any'

That is, we only allow pointers to point to values of exactly the appropriate type. This requires that the type of any operand of the address-of '&' operator is known precisely at compile-time.

Option B

Line 3 should have produced a compile-time error because the assignment invalidates the type constraint of 'p'. This requires us to perform some very sophisticated static analysis; I'm not even sure it's possible beyond trivial examples.

Option C

Line 3 produces a runtime error because the assignment invalidates the type constraint of 'p'. This requires us to keep track of all pointers pointing to a value and check for invalidation on every assignment. This "observer" scheme sounds very expensive to me.

Option D

Produce a runtime error if or when 'p' is subsequently dereferenced and it is discovered that it no longer points to an integer. This would mean that the error is raised "at a distance" from the assignment that caused the issue, thereby making debugging more difficult.

Option E

We make egg pointers "typeless" or, to put it another way, all pointers must be of type 'any?*'. This means that if the pointee changes type, we don't really care.

Option F

Nothing is wrong! Just live with the fact that '*p' isn't necessarily an integer, even though it's defined like that.

It all comes down to how the runtime type of the pointee and the compile-time declaration of the pointer interact. I cannot imagine I'm the first to have come across this issue, but a quick search of the literature hasn't come up with anything. But then, I don't know what the problem is called, so I'm stumbling in the dark somewhat.

My current "least-hated solution" is a hybrid of Options A and D: try to detect inconsistencies at compile-time but fall back to checking at runtime whenever the pointer is dereferenced.
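
The runtime half of that hybrid (Option D) can be modelled in a few lines of Python. This is only a toy simulation with made-up class names, not how the egg virtual machine works: Python has no compile-time types, so the Option A half cannot be shown, and the declared pointee type is simply stored on the pointer and checked at each dereference.

```python
class Variable:
    """A mutable slot, standing in for an egg variable of type 'any'."""

    def __init__(self, value):
        self.value = value


class Pointer:
    """A pointer that remembers the type it was declared to point at."""

    def __init__(self, target, declared_type):
        self.target = target
        self.declared_type = declared_type

    def deref(self):
        # Option D: check the pointee's runtime type on every dereference.
        value = self.target.value
        if not isinstance(value, self.declared_type):
            raise TypeError(
                f"pointer expects {self.declared_type.__name__}, "
                f"but pointee is now {type(value).__name__}")
        return value


v = Variable(123)     # any v = 123;
p = Pointer(v, int)   # int* p = &v;
print(p.deref())      # 123
v.value = "hello"     # v = "hello";
try:
    p.deref()
except TypeError as error:
    print("runtime error:", error)
```

As noted under Option D, the error surfaces at the dereference rather than at the assignment that caused it.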