Saturday, 9 December 2017

Anachronicons 1

If you look at icons and symbols in everyday use, you'll notice something strange. A few of them use outdated representations of a concept to distinguish themselves from similar visual elements. I call them "anachronicons". Take, for example, the speed camera UK traffic sign from The Highway Code:
Speed Camera Traffic Sign (UK)
Everyone (in the UK, at least) knows what it means, but isn't it strange that the graphic designer used the image of a late nineteenth-/early twentieth-century camera? It's not as if the sign just hasn't been updated; cameras of this form were already defunct when speed cameras (and presumably their signs) were introduced to the UK.

Tellingly, it only seems to be the UK that uses an old-fashioned camera in this way; other countries use words, radar "waves" or images of more modern cameras.

Saturday, 23 September 2017

Double Negative T-Shirt


There ain't nothing worse than a double negative

Friday, 24 February 2017

Synthetic Phonics Audio Resources

Chilli recently volunteered to help adults learn and improve their English reading and writing skills. This meant having to teach herself about synthetic phonics.

Looking at the various free resources available on the web, it quickly became apparent that there wasn't a single page where you could easily listen to the forty-two phonic sounds using a nice clean interface. The BBC has an excellent interactive phonics tool, but it relies on Adobe Flash, which makes it unsuitable for some devices.

Fortunately, the good folks at Jolly Learning Ltd have made available audio files that I've wrapped in a minimum of HTML5 for use on most devices (including iPads).


The forty-two sounds are divided into their seven groups and can be played with either a British or American accent: here.

Wednesday, 22 February 2017

Three More Letterboxd Lists

After completing my Monsters of the Movies list, I decided to curate three more lists on letterboxd.


"theyshootpictures.com Top 50 Horror Movies (2016)" is fairly self-explanatory, providing that you realise that they're horror movies from the 2016 TSPDT list, not just movies released in 2016! So far, I've seen forty of them.

"99 Years, 99 Movies" has one film from each year 1901 to 1999 inclusive. Mostly, they're critically-acclaimed movies, but for the sparse years at the beginning of the century, I selected films that are popular on letterboxd. I've currently only seen 39 of the 99.

Finally, "Twentieth Century Movies by Duration" is a quirky list of 100 films ordered by duration in minutes from 61 minutes to 160 minutes (as reported by letterboxd). Again, it includes classic movies with some oddballs to fill the gaps. I've seen 68 of them.

Needless to say, I'm going to be ploughing through these lists over the next few months, desperately trying to improve my "scores".

STOP PRESS: TSPDT has released their 2017 list, but I don't believe any of the lists above will have changed as a result.

Computus 4

I noticed a few days ago that Dr J R Stockton's web pages on Computus are no longer available. The last impression of the site in the Wayback Machine was 2015-09-07. In it he lists a number of JavaScript algorithms for the calculation of the date of Easter Sunday in the Gregorian calendar. The result of each of the following functions is "the day of March" such that 1 indicates March 1st, 32 indicates April 1st, etc.

The first version is my own humble attempt (here translated to C/C++):

  int EasterDayOfMarch(int year) {
    // 25 ops = 2 div, 3 rem, 2 mul, 5 shift, 8 add, 5 sub
    int a = year % 19;
    int b = year >> 2;
    int c = (b / 25) + 1;
    int d = (c * 3) >> 2;
    int e = ((c << 3) + 5) / 25;
    int f = (a * 19 + d - e + 15) % 30;
    int g = f + ((29578 - a - (f << 5)) >> 10);
    int h = g - ((year + b - d + g + 2) % 7);
    return h;
  }

This is derived from Gauss's algorithm and subsequent revisions. However, the calculation of 'g' is my own work and chosen such that it can be computed on a 16-bit machine efficiently (though it works for all integer sizes).

There are other minor variations outlined on Dr Stockton's page:

    // Lichtenberg
    int g = 28 + f - ((f + a / 11) / 29);

and

    // Hutchins
    int g = 28 + f - ((a + f * 11) / 319);  

There is also a C implementation which is the equivalent of:

    // Pemberton
    int g = 28 + f - (int)((f + (int)(a > 10)) > 28);
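
Since 'a' is a remainder modulo 19 and 'f' a remainder modulo 30, there are only 570 input pairs, so the variations can be compared exhaustively. The following is a minimal sketch of such a check; the function names are my own for illustration, and the comparison form is written in terms of 'f' and 'a' only.

```c
#include <assert.h>

/* Shift-based correction, as in EasterDayOfMarch. */
int GShift(int a, int f) { return f + ((29578 - a - (f << 5)) >> 10); }

/* Lichtenberg's form. */
int GLichtenberg(int a, int f) { return 28 + f - ((f + a / 11) / 29); }

/* Hutchins's form. */
int GHutchins(int a, int f) { return 28 + f - ((a + f * 11) / 319); }

/* Comparison-based form, using only f and a. */
int GCompare(int a, int f) { return 28 + f - (int)((f + (int)(a > 10)) > 28); }

/* Asserts that all four forms agree for every possible (a, f) pair. */
void CheckGForms(void) {
  for (int a = 0; a < 19; a++) {    /* a = year % 19 */
    for (int f = 0; f < 30; f++) {  /* f = (...) % 30 */
      int g = GShift(a, f);
      assert(g == GLichtenberg(a, f));
      assert(g == GHutchins(a, f));
      assert(g == GCompare(a, f));
    }
  }
}
```

In every form, one is subtracted from 28 + f only when f is 29, or when f is 28 and a is greater than 10. The largest intermediate value in the shift-based form is 29578, which is why it fits comfortably in a signed 16-bit integer.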

As mentioned previously, Al Petrofsky came up with another variation on the theme:

  int EasterDayOfMarchPetrofsky(int year) {
    // 21 ops = 4 div, 3 rem, 4 mul, 1 shift, 6 add, 3 sub
    int a = (year % 19) * 6060;
    int b = year >> 2;
    int c = b / 25;
    int p = (c * 2267) - ((b / 100) * 6775) + 3411;
    int q = ((a + ((p / 25) * 319) - 1) % 9570) / 330;
    int r = 28 + q - (year + b + p + q) % 7;
    return r;
  }

This cleverly uses fewer operations, though it requires integer sizes greater than sixteen bits for larger years (not a great imposition these days!).
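
As a sanity check, the two functions can be run against a handful of well-known Easter dates. This is only a sketch: the table and the checking function are hypothetical names of mine, the five dates were worked through by hand, and it assumes 'int' is wider than sixteen bits. It is a spot check, not a proof that the functions agree for every year.

```c
#include <assert.h>

int EasterDayOfMarch(int year) {
  int a = year % 19;
  int b = year >> 2;
  int c = (b / 25) + 1;
  int d = (c * 3) >> 2;
  int e = ((c << 3) + 5) / 25;
  int f = (a * 19 + d - e + 15) % 30;
  int g = f + ((29578 - a - (f << 5)) >> 10);
  return g - ((year + b - d + g + 2) % 7);
}

int EasterDayOfMarchPetrofsky(int year) {
  int a = (year % 19) * 6060;
  int b = year >> 2;
  int c = b / 25;
  int p = (c * 2267) - ((b / 100) * 6775) + 3411;
  int q = ((a + ((p / 25) * 319) - 1) % 9570) / 330;
  return 28 + q - (year + b + p + q) % 7;
}

/* Known Easter Sundays expressed as a "day of March". */
struct KnownEaster { int year; int day_of_march; };

const struct KnownEaster kKnownEasters[] = {
  {1818, 22},  /* 22 March, the earliest possible date */
  {2000, 54},  /* 23 April */
  {2016, 27},  /* 27 March */
  {2017, 47},  /* 16 April */
  {2038, 56},  /* 25 April, the latest possible date */
};

/* Asserts that both functions reproduce every entry in the table. */
void CheckKnownEasters(void) {
  int count = (int)(sizeof kKnownEasters / sizeof kKnownEasters[0]);
  for (int i = 0; i < count; i++) {
    int year = kKnownEasters[i].year;
    assert(EasterDayOfMarch(year) == kKnownEasters[i].day_of_march);
    assert(EasterDayOfMarchPetrofsky(year) == kKnownEasters[i].day_of_march);
  }
}
```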

To convert a "day of March" 'h' to an actual date, the following code can be used:

  void Easter(int year, int* month, int* day) {
    int a = year % 19;
    int b = year >> 2;
    int c = (b / 25) + 1;
    int d = (c * 3) >> 2;
    int e = ((c << 3) + 5) / 25;
    int f = (a * 19 + d - e + 15) % 30;
    int g = f + ((29578 - a - (f << 5)) >> 10);
    int h = g - ((year + b - d + g + 2) % 7);
    int i = h >> 5; // is it April?
    *month = i + 3;
    *day = (h & 31) + i;
  }
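
The last three lines are the interesting part, so here they are pulled out into a stand-alone helper. This is a sketch under my own naming ("DayOfMarchToDate" is hypothetical), with an assertion documenting the range over which the bit trick holds.

```c
#include <assert.h>

/* Splits a "day of March" into a month (3 or 4) and a day of that month. */
void DayOfMarchToDate(int h, int* month, int* day) {
  assert(h >= 1 && h <= 61);  /* 1 March to 30 April; Easter lies in 22..56 */
  int i = h >> 5;             /* 0 for 1..31 (March), 1 for 32..61 (April) */
  *month = i + 3;
  *day = (h & 31) + i;        /* in April this is (h - 32) + 1 */
}
```

For example, 31 maps to 31 March, 32 to 1 April and 56 to 25 April. The trick works because March has 31 days: shifting right by five divides by 32, and the "+ i" makes up for the one-day difference between 31 and 32.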

Sunday, 27 November 2016

Monsters of the Movies - Achievement Unlocked

After almost forty years, I've finally managed to see all the films listed in Monsters of the Movies. I've been tracking my progress in a spreadsheet and a letterboxd list and over the last few months, I've been slowly filling my gaps by downloading the films from archive.org or watching the films online. The exception was Mad Love (1935), which I had to get through Amazon as a Spanish-subtitled DVD and then watch with the subtitles turned off!

Amongst the recently-watched films, the two stand-out gems for me were The Manster (1959) a.k.a. The Split:


and The Abominable Dr. Phibes (1971):


I kept the latter until the very end ... and I wasn't disappointed.

Lexicon 2

To finish off the discussion of my Lexicon demo, I'll outline the in-memory data structure used to store the dictionary.

In the first part, I described how the lexicon is stored trivially-compressed for transmission "over the wire". The words are then expanded in memory into a form that makes searching simple and efficient.

The main data structure is a four-dimensional array of JavaScript objects named "Lexicon.byRaritySubsetLength":

  1. The first index is an integer rarity between one and five inclusive: one means "not very rare", whereas five means "esoteric".
  2. The second index is an integer bit-field (one to seven inclusive) which encodes which dictionaries the corresponding list is part of: TWL06, SOWPODS and SCOWL.
  3. The third index is the integer length of the word (two to fifteen inclusive).
  4. The fourth index is an integer index into a list sorted alphabetically (starting from zero).

By default, the demo looks at all rarities and word lengths for the SOWPODS lexicon. This means that, without changing the criteria, the client must look through 5 * 4 * 14 = 280 lists of words when matching: five rarities, the four subset values that include SOWPODS, and fourteen word lengths.

Each word entry is a JavaScript object of the following form:

    word : string
    rarity : integer
    subset : integer
    mask : integer

The first three fields contain the uppercase word, rarity index and lexicon subset mask. The last field is used when searching for anagrams. It is a bit-field that encodes which letters are contained within the word: 1 for "A", 2 for "B", 4 for "C", 8 for "D", etc.
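
The demo itself is JavaScript, but the "mask" idea is easy to sketch in C. The function names below are hypothetical and the code assumes uppercase ASCII words, as stored in the lexicon.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-field of the letters present in a word: bit 0 is 'A', bit 25 is 'Z'. */
uint32_t LetterMask(const char* word) {
  assert(word != 0);
  uint32_t mask = 0;
  for (; *word; word++) {
    if (*word >= 'A' && *word <= 'Z') {
      mask |= (uint32_t)1 << (*word - 'A');
    }
  }
  return mask;
}

/* Nonzero if every letter bit set in `wanted` is also set in `mask`. */
int HasAllLetters(uint32_t mask, uint32_t wanted) {
  return (mask & wanted) == wanted;
}
```

Note that the mask records only which letters occur, not how many times, so "SWORD" and "SWORDS" have the same mask. It is a cheap filter rather than a complete anagram test.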

For instance, to find anagrams for "WORDS" in SOWPODS, the client needs to search through only 20 lists of words: five rarities (1 to 5), four lists for SOWPODS and one word length (5). These lists contain 12,478 entries which can be scanned quickly by looking for those that have a "mask" with the bits for "D", "O", "R", "S" and "W" all set. It will typically come up with the solutions ("SWORD" and "DROWS") within a couple of milliseconds.
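
The search can be sketched in C as follows. Several things here are assumptions of mine rather than the demo's actual code: the bit value chosen for SOWPODS (the description above does not say which bit it is, and the count does not depend on it), the function names, and the final letter-count comparison, which I have added because the mask alone cannot tell repeated letters apart.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { kSowpodsBit = 2 };  /* ASSUMPTION: the real bit assignment is not stated */

/* Counts the [rarity][subset][length] lists that a query has to visit. */
int CountLists(int subset_bit, int min_length, int max_length) {
  int count = 0;
  for (int rarity = 1; rarity <= 5; rarity++)
    for (int subset = 1; subset <= 7; subset++)
      for (int length = min_length; length <= max_length; length++)
        if (subset & subset_bit) count++;
  return count;
}

/* Bit-field of the letters in an uppercase word: bit 0 is 'A', bit 25 is 'Z'. */
uint32_t LetterMask(const char* word) {
  uint32_t mask = 0;
  for (; *word; word++) mask |= (uint32_t)1 << (*word - 'A');
  return mask;
}

/* Nonzero if a and b (uppercase A-Z only) contain the same letters the same number of times. */
int SameLetters(const char* a, const char* b) {
  int counts[26] = {0};
  for (; *a; a++) counts[*a - 'A']++;
  for (; *b; b++) counts[*b - 'A']--;
  for (int i = 0; i < 26; i++) {
    if (counts[i] != 0) return 0;
  }
  return 1;
}

/* Scans one list, storing up to max_found anagrams of query (excluding query itself). */
int FindAnagrams(const char* query, const char* const* list, int list_size,
                 const char** found, int max_found) {
  assert(query != 0 && list != 0 && found != 0);
  uint32_t wanted = LetterMask(query);
  int n = 0;
  for (int i = 0; i < list_size && n < max_found; i++) {
    uint32_t mask = LetterMask(list[i]);      /* the demo stores this in each entry */
    if ((mask & wanted) != wanted) continue;  /* cheap reject using the bit-field */
    if (strcmp(list[i], query) != 0 && SameLetters(list[i], query)) {
      found[n++] = list[i];
    }
  }
  return n;
}
```

Calling CountLists(kSowpodsBit, 5, 5) gives the 20 lists mentioned above. Running FindAnagrams for "WORDS" over a toy list of "DROWS", "SWORD", "WORDS", "WORLD" and "SWARD" keeps only "DROWS" and "SWORD": the last two are rejected by the mask test alone, and "WORDS" is skipped as the query itself.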