Uninformed: Informative Information for the Uninformed

Vol 5» 2006.Sep


The next important method to override is the encode_block method. The framework calls this method to have the encoder encode a single block and return the resultant encoded buffer. The size of each block is provided to the framework through the encoder's information hash. For this particular encoder, the block size is four bytes. The implementation of encode_block is simple: it first tries to encode the block using the add instruction and, if that fails, falls back to the sub instruction. Which instruction is used depends on the bytes in the block being encoded. If neither approach succeeds, a BadcharError is raised.

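def encode_block(state, block)
   # Try the add-based transform first
   buf = try_add(state, block)

   # Fall back to the sub-based transform if add could not be used
   if (buf.nil?)
      buf = try_sub(state, block)
   end

   # Neither transform can represent this block
   if (buf.nil?)
      raise BadcharError.new(state.encoded, 0, 0, 0)
   end

   buf
end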


The first thing encode_block tries is add. The try_add method is implemented as shown below:

def try_add(state, block)
   buf  = "\x68"   # opcode for push imm32
   vbuf = ''
   ctx  = ''

   block.each_byte { |b|
      return nil if (b == 0xff or b == 0x01 or b == 0x00)

      # Pick a random xv until both xv and b - xv are valid characters
      begin
         xv = rand(b - 1) + 1
      end while (is_badchar(state, xv) or is_badchar(state, b - xv))

      vbuf += [xv].pack('C')
      ctx  += [b - xv].pack('C')
   }

   # pop edi; add [ecx], edi; add ecx, [esp]
   buf += vbuf + "\x5f\x01\x39\x03\x0c\x24"

   state.context += ctx

   return buf
end

The try_add routine enumerates each byte in the block, trying to split it into two random bytes that, when added together, produce the byte value in the block. To accomplish this, it loops, selecting a random value between 1 and one less than the byte value. A check is then made to ensure that both this value and the remainder (the byte value minus the random value) are within the valid character set. If they are both valid, the random value is stored as one of the bytes of the 32-bit immediate operand to the push instruction that is part of the decode transform for the current block. The remainder is appended to the encoded block context. After all bytes have been considered, the instructions that compose the decode transform are completed and the encoded block context is appended to the string of encoded blocks. Finally, the decode transform is returned to the framework.
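The byte-splitting step can be tried outside the framework. The following is a minimal sketch, not the encoder itself: split_for_add and the badchars array are hypothetical names introduced for illustration, and candidates are enumerated in shuffled order rather than drawn with rand so that the sketch terminates when no valid pair exists.

```ruby
# Hypothetical stand-alone helper; it is not part of the framework.
# Splits each byte of a block into two addends that avoid badchars.
def split_for_add(block, badchars)
  imm = ''.b   # bytes that would become the push immediate
  ctx = ''.b   # bytes that would become the encoded block context

  block.each_byte do |b|
    # Both addends are at least 1, so 0x00 and 0x01 have no valid split;
    # 0xff is rejected to mirror the check in try_add.
    return nil if [0x00, 0x01, 0xff].include?(b)

    xv = (1..b - 1).to_a.shuffle.find do |v|
      !badchars.include?(v) && !badchars.include?(b - v)
    end
    return nil if xv.nil?

    imm << [xv].pack('C')
    ctx << [b - xv].pack('C')
  end

  [imm, ctx]
end

block    = [0x41, 0x84, 0x24, 0x7e].pack('C*')
badchars = (0x61..0x7a).to_a   # assumption: lowercase ASCII is forbidden
imm, ctx = split_for_add(block, badchars)

# Each byte pair sums to less than 0x100, so no carry crosses a byte
# boundary and a single 32-bit add restores the original block.
sum = (imm.unpack('V').first + ctx.unpack('V').first) & 0xffffffff
puts sum == block.unpack('V').first   # prints true
```

Because no byte sum exceeds 0xff, the add case needs no carry tracking, which is what keeps try_add shorter than its sub counterpart.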

In the event that any of the bytes that compose the block being encoded by try_add are 0x00, 0x01, or 0xff, the routine will return nil. When this happens, the encode_block routine will attempt to encode the block using the sub instruction. The implementation of the try_sub routine is shown below:

def try_sub(state, block)
   buf   = "\x68"   # opcode for push imm32
   vbuf  = ''
   ctx   = ''
   carry = 0

   block.each_byte { |b|
      return nil if (b == 0x80 or b == 0x81 or b == 0x7f)

      x          = 0
      y          = 0
      prev_carry = carry

      # Retry until both x and y are valid characters
      begin
         # Restore the incoming carry on every attempt
         carry = prev_carry

         if (b > 0x80)
            diff  = 0x100 - b
            y     = rand(0x80 - diff - 1).to_i + 1
            x     = (0x100 - (b - y + carry))
            carry = 1
         else
            diff  = 0x7f - b
            x     = rand(diff - 1) + 1
            y     = (b + x + carry) & 0xff
            carry = 0
         end

      end while (is_badchar(state, x) or is_badchar(state, y))

      vbuf += [x].pack('C')
      ctx  += [y].pack('C')
   }

   # pop edi; sub [ecx], edi; add ecx, [esp]
   buf += vbuf + "\x5f\x29\x39\x03\x0c\x24"

   state.context += ctx

   return buf
end

The try_sub routine is a little more complicated than try_add, perhaps unnecessarily so. The main reason is that subtracting two 32-bit values has to take into account the borrow that propagates from one byte to the next, which the routine tracks in its carry variable. The basic idea is the same. Each byte in the block is enumerated. If the byte is above 0x80, the routine calculates the difference between 0x100 and the byte. From there, it calculates the y value as a random number between 1 and 0x80 minus the difference minus 1. Using the y value, it generates the x value as 0x100 - (b - y + carry), where b is the byte value and carry is the current carry flag. To better understand this, consider the following scenario.

Say that the byte being encoded is 0x84. The difference between 0x100 and 0x84 is 0x7c. A valid value of y could be 0x3, as derived from rand(0x80 - 0x7c - 1) + 1. Given this value for y, the value of x would be, assuming a zero carry flag, 0x100 - (0x84 - 0x3), or 0x7f. When 0x7f, or x, is subtracted from 0x3, or y, the result, truncated to a single byte, is 0x84.
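The scenario above can be checked by listing every pair that this branch is able to produce for a given byte. The sketch below is only an illustration: high_branch_pairs is a hypothetical helper that replaces the rand call with a walk over the same range, and bad character filtering is left out.

```ruby
# Hypothetical helper: every (x, y) pair the b > 0x80 branch can yield.
def high_branch_pairs(b, carry = 0)
  diff = 0x100 - b
  # rand(n) + 1 covers 1..n, so walk that range directly
  (1..(0x80 - diff - 1)).map do |y|
    [0x100 - (b - y + carry), y]
  end
end

high_branch_pairs(0x84).each do |x, y|
  printf("x=0x%02x y=0x%02x  (y - x) & 0xff = 0x%02x\n", x, y, (y - x) & 0xff)
end
# x=0x7d y=0x01  (y - x) & 0xff = 0x84
# x=0x7e y=0x02  (y - x) & 0xff = 0x84
# x=0x7f y=0x03  (y - x) & 0xff = 0x84
```

Every pair lands on 0x84 only after masking to one byte. The subtraction borrows from the next byte, which is why the routine sets the carry flag to 1 after taking this branch.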

However, if the byte value is less than 0x80, then a different method is used to select the x and y values. In this case, the difference is calculated as 0x7f minus the value of the current byte. The value of x is then assigned a random value between 1 and the difference minus 1. The value of y is then calculated as the current byte plus x plus the carry flag. For example, if the value is 0x24, then the values could be calculated as described in the following scenario.

First, the difference between 0x7f and 0x24 is 0x5b. The value of x could be 0x18, as derived from rand(0x5b - 1) + 1. From there, assuming a zero carry flag, the value of y would be calculated as 0x24 + 0x18, or 0x3c. Therefore, 0x3c - 0x18 is 0x24.
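The same kind of check works for bytes below 0x80. Again, this is a sketch under assumptions: low_branch_pairs is a hypothetical helper that enumerates the range rand would draw from, and it ignores bad characters.

```ruby
# Hypothetical helper: every (x, y) pair the b < 0x80 branch can yield.
def low_branch_pairs(b, carry = 0)
  diff = 0x7f - b
  # rand(diff - 1) + 1 covers 1..(diff - 1)
  (1..(diff - 1)).map do |x|
    [x, (b + x + carry) & 0xff]
  end
end

pairs = low_branch_pairs(0x24)
puts pairs.include?([0x18, 0x3c])                   # prints true
puts pairs.all? { |x, y| (y - x) & 0xff == 0x24 }   # prints true

# With an incoming carry of 1, y is one larger, and the borrow from the
# previous byte removes that extra 1 again during decoding.
with_carry = low_branch_pairs(0x24, 1)
puts with_carry.all? { |x, y| (y - x - 1) & 0xff == 0x24 }   # prints true
```

Since y never exceeds 0xff here, this branch never borrows, which matches the routine resetting the carry flag to 0.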

Given these two methods of calculating the individual byte values, it's possible to encode all byte values with the exception of 0x7f, 0x80, and 0x81. If any one of these three bytes is encountered, the try_sub routine will return nil and, because try_add has already failed by that point, encode_block will raise a BadcharError. Otherwise, the routine will complete in a fashion similar to the try_add routine. However, rather than using an add instruction, it uses the sub instruction.
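To see how the per-byte carry handling plays out across a whole block, the arithmetic of the decode transform can be modelled in a few lines. Read as 32-bit x86, the opcode bytes used by try_add and try_sub appear to be push imm32 (0x68), pop edi (0x5f), add or sub [ecx], edi (0x01 0x39 or 0x29 0x39), and add ecx, [esp] (0x03 0x0c 0x24). The sketch below models only the add or sub step; decode_block is a hypothetical name, and the sample bytes were worked out by hand by always choosing y = 1 in the high branch and x = 1 in the low branch for the block 0x84 0x24 0x90 0x10.

```ruby
# Hypothetical model of the arithmetic in the decode transform.
# imm holds the push immediate, ctx the encoded block context.
def decode_block(op, imm, ctx)
  edi = imm.unpack('V').first   # value popped into edi
  mem = ctx.unpack('V').first   # dword at [ecx]
  res = (op == :add ? mem + edi : mem - edi) & 0xffffffff
  [res].pack('V')
end

# Hand-picked try_sub output for the block 0x84 0x24 0x90 0x10
imm = [0x7d, 0x01, 0x71, 0x01].pack('C*')   # x values
ctx = [0x01, 0x26, 0x01, 0x12].pack('C*')   # y values

p decode_block(:sub, imm, ctx).unpack('C*').map { |b| '0x%02x' % b }
# ["0x84", "0x24", "0x90", "0x10"]
```

The first and third bytes borrow from their neighbours, and the y values of the second and fourth bytes are one larger to compensate, so a single 32-bit sub recovers all four bytes at once.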