Thursday, 5 November 2015

sRGB Integer Conversions

A comment on a previous post on this subject got me thinking; 'Anonymous' mentioned using a lookup table (LUT) for converting sRGB to and from linear values. This would be less than ideal on a GPU, but a good compromise on a CPU, particularly if integer values were expected.

But first, I tried to wring some more mileage out of the polynomial approximation of the sRGB gamma curve, which, as you'll remember, is:

  if (srgb <= 0.04045)
      lin = srgb / 12.92;
  else
      lin = pow((srgb + 0.055) / 1.055, 2.4);

It is common that the input 'srgb' values are bytes (0..255 for each of red, green and blue channels) and so are the output 'lin' values. My initial thought was to use a quartic integer polynomial of the form:

  uint32 x = srgb;
  uint32 y = a*x*x*x*x + b*x*x*x + c*x*x + d*x + e;
  lin = (uint8)(y >> 24);


That is, pick the constants 'a', 'b', 'c', 'd' and 'e' such that, after multiplying through as 32-bit unsigned integers, the result is conveniently in the uppermost eight bits of 'y'. I dutifully prepared a table of values and fed them into an on-line tool such as the excellent Polynomial Regression Data Fit page by Paul Lutus. It came up with the following polynomial for 'y' in terms of 'x':

y = -1.296·10-1 x4 + b x3 + c x2 + d x + e

This was a little disappointing: the value 'a' turned out to be about -0.1296. That's definitely not an integer coefficient. So I thought I'd be clever and approximate the first term as

-(x * x * x * x / 8)

Serendipitously, the division can be performed using a right-shift. This approximation is all very well, but the remaining coefficients are no longer correct. So I rearranged the table so that the residuals could be approximated as a cubic polynomial:

y + x*x*x*x/8 = b*x*x*x + c*x*x + d*x + e

The next data fit suggested a polynomial with a value of 'b' of approximately 142. I continued the process of fitting, rounding to an integer coefficient and computing the residuals until I got all the values:

  a = -1/8
  b = 142
  c = 35322
  d = 602000
  e = 11918016

Plugging these numbers back into the formula and testing all 256 input possibilities, I got exact results for 238 inputs and a maximum error of ±1 for the remaining 18. That's not too bad, but hardly stellar, especially as a lookup table of 256 bytes would be more accurate and almost certainly faster!

However, if the input/output integers were 16 bits, instead of 8, a polynomial solution might be worthwhile as the size of a 65536-element array would be cache-unfriendly. This is not as unlikely as it seems: high quality graphics processing is expected to deal with 16-bits-per-channel sRGB inputs.

So, going through the whole process again, but this time using 16- and 64-bit integers, I came up with:

  uint64 x1 = srgb;
  uint64 x2 = x1 * x1;
  uint64 x4 = x2 * x2;
  uint64 y = (x1 + 3441) * (x1 + 72046) * x1 * 32627 - x4 / 10;
  lin = (uint16)(y >> 48);

If you forgive the unfortunate integer division by 10, this boils down to nine arithmetic operations and one shift. The error is never more than about 1 part in a 1000, but just about all 65536 conversions are slightly off. Still, not too bad.

Finally, I bit the bullet and went for the somewhat more prosaic piecewise linear approximation. This is an undervalued technique that allows you to balance the speed of table-based lookups with the immediacy of full computation. Here's the algorithm:

  const uint32 lut[256][2];
  uint8 i = (uint8)(srgb >> 8);
  uint32 y = lut[i][0] * srgb + lut[i][1];
  lin = (uint16)(y >> 16);

One kilobyte of constant lookup table data isn't too cache-unfriendly and the code has a certain beautiful simplicity.

I divided the 65536-element sRGB curve table into 256 equal parts. Each part was fitted to the appropriate portion of the curve using simple linear regression, being careful to handle rounding correctly. The gradient coefficients are in 'lut[i][0]' and the y-intercepts are in 'lut[i][1]'. They are biased so that the required 16-bit result is in the upper half of 'y'. For completeness, the full table is listed at the end of this post.

So, how does it perform? Actually, very well. The error is never more than ±1 in 65535, with 61009 of the 65536 mappings being exact results. As an aside, if you replaced the integer entries in 'lut' with appropriate floating-point values, you could also construct a fairly accurate integer-sRGB-to-float-linear conversion.

I'll leave the reverse transformation (linear to sRGB) as an exercise for the reader.

Here's that table...

  const uint32 lut[256][2] = {
    {   5076, 0x00007CEA },
    {   5068, 0x000086E6 },
    {   5071, 0x00008368 },
    {   5069, 0x00008C66 },
    {   5078, 0x000067EB },
    {   5073, 0x00007B68 },
    {   5071, 0x00008A68 },
    {   5070, 0x000090E7 },
    {   5065, 0x0000C064 },
    {   5078, 0x00004BEB },
    {   5192, 0xFFFBB824 },
    {   5497, 0xFFEEA33C },
    {   5798, 0xFFE08653 },
    {   6103, 0xFFD1136C },
    {   6415, 0xFFC00C08 },
    {   6739, 0xFFAD16AA },
    {   7059, 0xFF99184A },
    {   7391, 0xFF830BF0 },
    {   7719, 0xFF6BF894 },
    {   8054, 0xFF531BBB },
    {   8390, 0xFF38D863 },
    {   8734, 0xFF1C9F0F },
    {   9079, 0xFEFEF63C },
    {   9428, 0xFEDF956A },
    {   9778, 0xFEBEC519 },
    {  10138, 0xFE9B9CCD },
    {  10500, 0xFE76D482 },
    {  10870, 0xFE4FC93B },
    {  11227, 0xFE28B46E },
    {  11600, 0xFDFE71A8 },
    {  11972, 0xFDD2D862 },
    {  12347, 0xFDA5709E },
    {  12726, 0xFD7612DB },
    {  13114, 0xFD440F9D },
    {  13505, 0xFD1021E0 },
    {  13891, 0xFCDB5BA2 },
    {  14287, 0xFCA3AB68 },
    {  14673, 0xFC6BE828 },
    {  15082, 0xFC2F3775 },
    {  15489, 0xFBF134C0 },
    {  15894, 0xFBB1EC0B },
    {  16273, 0xFB753D48 },
    {  16699, 0xFB2F7A1E },
    {  17126, 0xFAE7C973 },
    {  17549, 0xFA9F13C6 },
    {  17963, 0xFA565196 },
    {  18384, 0xFA0AAFE8 },
    {  18823, 0xF9BA1644 },
    {  19246, 0xF96AC597 },
    {  19679, 0xF917E4F0 },
    {  20107, 0xF8C44FC6 },
    {  20545, 0xF86D16A0 },
    {  20989, 0xF812EA7E },
    {  21418, 0xF7BA20D5 },
    {  21845, 0xF7603D2A },
    {  22327, 0xF6F8AE1C },
    {  22784, 0xF694B180 },
    {  23228, 0xF631D25E },
    {  23678, 0xF5CBE33F },
    {  24141, 0xF56135A6 },
    {  24608, 0xF4F3C210 },
    {  25070, 0xF485A8F7 },
    {  25541, 0xF4139362 },
    {  25986, 0xF3A611C1 },
    {  26477, 0xF32B5836 },
    {  26945, 0xF2B48320 },
    {  27423, 0xF2394610 },
    {  27896, 0xF1BD7A7C },
    {  28387, 0xF13B0CF2 },
    {  28864, 0xF0BA7360 },
    {  29347, 0xF0365ED2 },
    {  29830, 0xEFB06A43 },
    {  30318, 0xEF272B37 },
    {  30815, 0xEE9972B0 },
    {  31308, 0xEE0AEF26 },
    {  31800, 0xED7ACB1C },
    {  32285, 0xECEAD38E },
    {  32770, 0xEC592501 },
    {  33305, 0xEBB6168C },
    {  33798, 0xEB1DF803 },
    {  34309, 0xEA7E5182 },
    {  34817, 0xE9DD9A80 },
    {  35333, 0xE9385682 },
    {  35844, 0xE892AB02 },
    {  36361, 0xE7E90B84 },
    {  36886, 0xE73AB80B },
    {  37403, 0xE68D058E },
    {  37919, 0xE5DDA990 },
    {  38447, 0xE5282998 },
    {  38970, 0xE472551D },
    {  39493, 0xE3BA7BA2 },
    {  40021, 0xE2FED4AA },
    {  40556, 0xE23E9636 },
    {  41099, 0xE1794FC6 },
    {  41633, 0xE0B53CD0 },
    {  42167, 0xDFEF0FDC },
    {  42710, 0xDF236F6B },
    {  43255, 0xDE54E7FC },
    {  43812, 0xDD7F9892 },
    {  44339, 0xDCB3CD1A },
    {  44892, 0xDBDBC4AE },
    {  45439, 0xDB03F040 },
    {  45983, 0xDA2B3150 },
    {  46545, 0xD9491468 },
    {  47082, 0xD86EEDF5 },
    {  47654, 0xD7845613 },
    {  48211, 0xD69DB2AA },
    {  48764, 0xD5B6923E },
    {  49313, 0xD4CF0AD0 },
    {  49891, 0xD3D8F7F2 },
    {  50459, 0xD2E4EE0E },
    {  51033, 0xD1EC112C },
    {  51599, 0xD0F47248 },
    {  52156, 0xCFFE9BDE },
    {  52745, 0xCEF85C84 },
    {  53324, 0xCDF44726 },
    {  53896, 0xCCF11344 },
    {  54487, 0xCBE2FBEC },
    {  55055, 0xCADD2508 },
    {  55641, 0xC9CCC12C },
    {  56217, 0xC8BEC54C },
    {  56799, 0xC7ABB970 },
    {  57394, 0xC6903619 },
    {  57990, 0xC571DA43 },
    {  58585, 0xC4519FEC },
    {  59169, 0xC3347710 },
    {  59770, 0xC20CA4BD },
    {  60357, 0xC0E97262 },
    {  60947, 0xBFC2798A },
    {  61554, 0xBE90A439 },
    {  62151, 0xBD6180E4 },
    {  62758, 0xBC2AEA93 },
    {  63359, 0xBAF50C40 },
    {  63976, 0xB9B482F4 },
    {  64565, 0xB880359A },
    {  65313, 0xB6F5B510 },
    {  65594, 0xB6609D1D },
    {  66392, 0xB4B5DDAC },
    {  67048, 0xB35433F4 },
    {  67654, 0xB20B1C23 },
    {  68259, 0xB0C03BD2 },
    {  68881, 0xAF69A808 },
    {  69496, 0xAE1485BC },
    {  70120, 0xACB7FAF4 },
    {  70740, 0xAB5B3F2A },
    {  71367, 0xA9F822E4 },
    {  71987, 0xA896921A },
    {  72614, 0xA72E93D3 },
    {  73252, 0xA5BDC612 },
    {  73882, 0xA44F214D },
    {  74524, 0xA2D6F48E },
    {  75163, 0xA15E054E },
    {  75799, 0x9FE4628C },
    {  76435, 0x9E683ECA },
    {  77063, 0x9CEE7B04 },
    {  77702, 0x9B6B9DC3 },
    {  78338, 0x99E81901 },
    {  79002, 0x9850E64D },
    {  79643, 0x96C5460E },
    {  80292, 0x95322DD2 },
    {  80938, 0x939E6E15 },
    {  81602, 0x91FCD061 },
    {  82225, 0x90728F18 },
    {  82889, 0x8ECBCE64 },
    {  83541, 0x8D2A20AA },
    {  84197, 0x8B8352F2 },
    {  84856, 0x89D806BC },
    {  85512, 0x882C1B04 },
    {  86176, 0x86785C50 },
    {  86832, 0x84C75298 },
    {  87509, 0x8305BC6A },
    {  88164, 0x81503732 },
    {  88831, 0x7F901500 },
    {  89506, 0x7DC7EAD1 },
    {  90164, 0x7C08B21A },
    {  90840, 0x7A3A906C },
    {  91493, 0x7879A832 },
    {  92187, 0x7699D98E },
    {  92859, 0x74C695DE },
    {  93538, 0x72EBD0B1 },
    {  94212, 0x7111E102 },
    {  94887, 0x6F34A1D4 },
    {  95566, 0x6D51ECA7 },
    {  96251, 0x6B683D7E },
    {  96941, 0x69784BD6 },
    {  97615, 0x67913128 },
    {  98304, 0x659CA500 },
    {  98993, 0x63A547D8 },
    {  99670, 0x61B421AB },
    { 100369, 0x5FB01288 },
    { 101055, 0x5DB2ECE0 },
    { 101744, 0x5BB0E1B8 },
    { 102443, 0x59A4A496 },
    { 103138, 0x5798AD71 },
    { 103835, 0x55887B4E },
    { 104527, 0x53795EA8 },
    { 105232, 0x515D9D88 },
    { 105935, 0x4F40A268 },
    { 106634, 0x4D23F945 },
    { 107327, 0x4B094C20 },
    { 108041, 0x48DB7F84 },
    { 108731, 0x46BDC1DE },
    { 109469, 0x4477684E },
    { 110164, 0x4250472A },
    { 110872, 0x401C138C },
    { 111573, 0x3DEABC6A },
    { 112282, 0x3BB03E4D },
    { 113000, 0x396BB7B4 },
    { 113717, 0x3725289A },
    { 114417, 0x34E9B4F8 },
    { 115149, 0x32914B66 },
    { 115865, 0x30432DCC },
    { 116593, 0x2DE85338 },
    { 117317, 0x2B8DEAA2 },
    { 118038, 0x2933358B },
    { 118760, 0x26D4D1F4 },
    { 119487, 0x246F66E0 },
    { 120214, 0x220721CB },
    { 120938, 0x1F9E9935 },
    { 121667, 0x1D2EF422 },
    { 122401, 0x1AB82990 },
    { 123141, 0x18394902 },
    { 123858, 0x15CB7DE9 },
    { 124586, 0x13515B55 },
    { 125319, 0x10D00244 },
    { 126063, 0x0E421EB8 },
    { 126796, 0x0BBB07A6 },
    { 127535, 0x092BC398 },
    { 128289, 0x068C3910 },
    { 129043, 0x03E9AC8A },
    { 129778, 0x01553C79 },
    { 130452, 0xFEF517CA },
    { 131275, 0xFC0B4AE6 },
    { 131988, 0xF9825BCA },
    { 132739, 0xF6D3F2C2 },
    { 133496, 0xF41D06BC },
    { 134231, 0xF17774AC },
    { 134991, 0xEEB7E328 },
    { 135745, 0xEBFAE3A0 },
    { 136491, 0xE9426F16 },
    { 137238, 0xE686280B },
    { 137998, 0xE3BAB987 },
    { 138758, 0xE0EC5103 },
    { 139518, 0xDE1AE87F },
    { 140265, 0xDB52F474 },
    { 141036, 0xD8711D76 },
    { 141788, 0xD59E83EE },
    { 142558, 0xD2B7A06F },
    { 143327, 0xCFCEAA70 },
    { 144092, 0xCCE6966E },
    { 144853, 0xC9FF696A },
    { 145619, 0xC71066EA },
    { 146384, 0xC41F5CE8 },
    { 147190, 0xC102CA7B },
    { 147924, 0xBE2A77EA },
    { 148707, 0xBB1E88F2 }
  };

No comments:

Post a Comment