**Updated 2014/08/09 (see update)**

NOTE:Throughout this article I try to use lowercase names (i.e.`float`

or`integer`

) for the types and uppercase names (i.e.`Float`

or`Integer`

) for class names.

I’m currently working with some professional vision mixers and one of the tasks is loading & saving image from and to the mixer. And the mixer expects all in 8-bit (4:2:2) encoded YCbCr. Converting between RGB and YCbCr is pretty straight forward using some transformation matrices:

```
floatRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| r g b |
r := (y - 16) * 1.164 + ((cr - 128) * 1.793).
g := (y - 16) * 1.164 + ((cb - 128) * -0.213) + ((cr - 128) * -0.533).
b := (y - 16) * 1.164 + ((cb - 128) * 2.112).
^ {r.g.b}
```

However using this code to convert a 1280x720 image from YCbCr takes around 20 seconds on my machine … not really fast. After some surfing I found a two different ways to convert the data - both of which where a bit and much faster:

```
integerRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| ym cbm crm r g b |
ym := (y - 16) * 149 / 128.
cbm := cb - 128.
crm := cr - 128.
r := ym + (crm * 459 / 256).
g := ym - (cbm * 109 / 512) - (crm * 17 / 32).
b := ym + (cbm * 135 / 64).
^ {r.g.b}
```

```
shiftRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| ym r g b cbm crm |
ym := y - 16.
ym := ym + (ym >> 3) + (ym >> 5) + (ym >> 7).
cbm := cb - 128.
crm := cr - 128.
r := ym + crm + (crm >> 3) + (crm >> 4).
g := ym - ((cbm >> 3) + (cbm >> 4) + (cbm >> 5)) - ((crm >> 1) + (crm >> 5)).
b := ym + ((cbm << 1) + (cbm >> 4) + (cbm >> 5) + (cbm >> 6)).
^ {r.g.b}
```

To be honest I thought I understood the integer approach but had no idea how the shift approach worked. Still the results were the same numerically - just a lot faster. So I my adventure into “Integer witchcraft” started and I decided to keep my findings here in the blog.

I’ll start with a simple example. Lets take the term `integer`

shift(s) expression step by step.

## Eliminate floating point numbers (using powers of 10)

The `float`

The last two terms are mathematically equal – however if you work in a language where `integer`

operations do not coerce to floats “automagically” (e.g. C or Ruby) you my want to multiply first. Why? Let’s assume

So multiplying first doesn’t hurt in Smalltalk (unless you sprinkle in a lot of `asInteger`

calls) or other languages which automatically coerce to `float`

s if needed … but might safe your bu.. in C, Ruby and maybe a few others.

## Eliminate floating point numbers (using powers of 2)

Instead of using powers of 10 we can also use powers of 2. This will allow us to convert the expression

We can use any power of 2 – but let’s just use 32 (

25 ) for now. We can/will/should/must use others as well; but that’s discussed later.

** integer** expression which is “kind of” equal:

As you can see the difference between the `float`

value

When working in languages which to not automatically coerce the result of

`integer`

operations to`float`

s when needed your can stop reading here. Integer arithmetic is on of the most heavily optimized areas in today’s compilers and CPUs.

It’s different in Smalltalk though: A simple expression like`3 * 4 / 5`

(which would be`2`

with integer arithmetic and`2.4`

with floating points) might result in a`Fraction`

(`(12/5)`

)!!! Although great from a “keep the precision” standpoint it’s a speed nightmare. Sprinkling in some`asFloat`

calls might speed up the thing a bit … but for soeme calculations it can even be faster … just read on.

## Replace multiplication of floating point numbers with integer shifts

So how to convert the expression `integer`

shifts? The “trick” is to replace the numerator

QUICK REFRESH

x≪s=x⋅2s : Shifting an integerx bys bits to the left is equal to multiplyingx with2s .

x≫s=x∗2−s=x÷2s : Shifting an integerx bys bits to the left is equal to multiplyingx with2−s or dividing by2s .

So we can replace the expression `float`

values&operations) with `integer`

values&operations).

## Conclusion

Maybe everybody knew this “trick” and I have simply been living under a rock for the last decades. If this is the case this blog still server of a reminder for myself. If you indeed learned something that’s even better!

## Limitations

### IT ONLY WORKS FOR INTEGER VALUES!!

An expression like `integer`

!

### PRECISION

`Float`

values (`float`

s (e.g. `float`

value *achievable* precision depends on the used exponent:

**Denominator 16=24**

- Approximated fraction:
1.164≈1916=1.1875 ; Error≈0.02350 - Shift Expression:
`x + (x >> 3) + (x >> 4)`

**Denominator 32=25**

- Approximated fraction:
1.164≈3732=1.15625 ; Error≈0.00775 - Shift Expression:
`x + (x >> 3) + (x >> 5)`

**Denominator 64=26**

- Approximated fraction:
1.164≈7464=1.15625 ; Error≈0.00775 - Shift Expression:
`x + (x >> 3) + (x >> 5)`

**Denominator 128=27**

- Approximated fraction:
1.164≈149128=1.1640625 ; Error≈0.00006 - Shift Expression:
`x + (x >> 3) + (x >> 5) + (x >> 7)`

So in theory the bigger the denominator the higher the precision – although not every step means more precision (compare 32 and 64). Another limiting factor is the numeric range of the variable. If the variable is an `unsigned byte`

(`0-255`

) it’s pointless to shift it more than 8 bits to the right.

So given the example above (

Finding the best compromise between precision and number of shifts is the actual hard decision you have to make. Check the code below to make an informed decision.

## Code

NOTE:The code below is neither nice nor fast. But it does it’s job :-)

```
Number>>#multiplyAsShiftReportWith
^ self multiplyAsShiftReportWith: 'x'
```

```
Number>>#multiplyAsShiftReportWith: variableName
^ String
streamContents: [ :report |
| denominator numerator aproximatedFloat bitString numberOfSetBits currentShiftOffset shiftPositions |
report
nextPutAll: 'Float: ' , self printString;
cr.
4 to: 9 do: [ :denominatorExponent |
denominator := 2 raisedTo: denominatorExponent.
numerator := (self * denominator) rounded.
aproximatedFloat := (numerator / denominator) asFloat.
bitString := numerator bitString last: 16.
numberOfSetBits := bitString asBag occurrencesOf: $1.
report
cr;
nextPutAll: ('<1p> / <2p> = <3p>' expandMacrosWith: numerator with: denominator with: aproximatedFloat);
cr;
nextPutAll:
('|Error| = <1s>; (log10 |Error| = <2s>)'
expandMacrosWith: ((aproximatedFloat - self) abs printShowingDecimalPlaces: 5)
with: ((aproximatedFloat - self) abs log printShowingDecimalPlaces: 3));
cr;
nextPutAll: ('<1p> = Binary <2s>' expandMacrosWith: numerator with: bitString);
nextPutAll: ('; required shifts: <1p>' expandMacrosWith: numberOfSetBits);
cr;
nextPutAll:
('Precision/Shifts Ratio: <1s>'
expandMacrosWith: ((aproximatedFloat - self) abs log abs / numberOfSetBits printShowingDecimalPlaces: 3));
cr;
nextPutAll: ('Orignal Expression: <1s> * <2p>' expandMacrosWith: variableName with: self);
cr;
nextPutAll:
('Integer Expression: <1s> * <2p> / <3p>' expandMacrosWith: variableName with: numerator with: denominator);
cr.
currentShiftOffset := denominatorExponent - bitString size.
shiftPositions := OrderedCollection new.
bitString asArray
do: [ :bit |
currentShiftOffset := currentShiftOffset + 1.
bit = $1
ifTrue: [ shiftPositions add: currentShiftOffset ] ].
report nextPutAll: 'Shift Expression: '.
shiftPositions
do: [ :position |
position = 0
ifTrue: [ report nextPutAll: variableName ]
ifFalse: [
position negative
ifTrue: [ report nextPutAll: ('(<1s> %<%< <2p>)' expandMacrosWith: variableName with: position negated) ]
ifFalse: [ report nextPutAll: ('(<1s> >> <2p>)' expandMacrosWith: variableName with: position) ] ] ]
separatedBy: [ report nextPutAll: ' + ' ].
report cr ] ]
```

### Code (Usage)

I using the same `float`

(

```
1.164 multiplyAsShiftReportWith: 'x'. "print-it"
'Float: 1.164
19 / 16 = 1.1875
|Error| = 0.02350; (log10 |Error| = -1.629)
19 = Binary 0000000000010011; required shifts: 3
Precision/Shifts Ratio: 0.543
Orignal Expression: x * 1.164
Integer Expression: x * 19 / 16
Shift Expression: x + (x >> 3) + (x >> 4)
37 / 32 = 1.15625
|Error| = 0.00775; (log10 |Error| = -2.111)
37 = Binary 0000000000100101; required shifts: 3
Precision/Shifts Ratio: 0.704
Orignal Expression: x * 1.164
Integer Expression: x * 37 / 32
Shift Expression: x + (x >> 3) + (x >> 5)
74 / 64 = 1.15625
|Error| = 0.00775; (log10 |Error| = -2.111)
74 = Binary 0000000001001010; required shifts: 3
Precision/Shifts Ratio: 0.704
Orignal Expression: x * 1.164
Integer Expression: x * 74 / 64
Shift Expression: x + (x >> 3) + (x >> 5)
149 / 128 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
149 = Binary 0000000010010101; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 149 / 128
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
298 / 256 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
298 = Binary 0000000100101010; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 298 / 256
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
596 / 512 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
596 = Binary 0000001001010100; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 596 / 512
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
'
```

## Update 1

This approach of course works for every kind of `Number`

, `ScaledDecimal`

s included (thanks Tobias!). So I moved the method to class `Number`

and updated the code.

## Update 2

Thanks to a Twitter conversation with Hwa Jong Oh there is now a `FastConstantMultiplier`

. So if you need to multiply an `Integer`

with a constant quite often this might be interesting for you:

### Load package

```
Gofer it
smalltalkhubUser: 'UdoSchneider' project: 'Playground';
package: 'FastConstantMultiplier';
load.
```

### Using the package

```
m := FastConstantMultiplier constant: 0.123456789 significantDigits: 11.
m * 10000. "1232 "
m * 10000000000000." 1234567890054"
```

Written with StackEdit.

## No comments:

Post a Comment