Updated 2014/08/09 (see update)
NOTE: Throughout this article I try to use lowercase names (i.e.
float
orinteger
) for the types and uppercase names (i.e.Float
orInteger
) for class names.
I’m currently working with some professional vision mixers and one of the tasks is loading & saving image from and to the mixer. And the mixer expects all in 8-bit (4:2:2) encoded YCbCr. Converting between RGB and YCbCr is pretty straight forward using some transformation matrices:
floatRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| r g b |
r := (y - 16) * 1.164 + ((cr - 128) * 1.793).
g := (y - 16) * 1.164 + ((cb - 128) * -0.213) + ((cr - 128) * -0.533).
b := (y - 16) * 1.164 + ((cb - 128) * 2.112).
^ {r.g.b}
However using this code to convert a 1280x720 image from YCbCr takes around 20 seconds on my machine … not really fast. After some surfing I found a two different ways to convert the data - both of which where a bit and much faster:
integerRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| ym cbm crm r g b |
ym := (y - 16) * 149 / 128.
cbm := cb - 128.
crm := cr - 128.
r := ym + (crm * 459 / 256).
g := ym - (cbm * 109 / 512) - (crm * 17 / 32).
b := ym + (cbm * 135 / 64).
^ {r.g.b}
shiftRgbFromY: y Cb: cb Cr: cr
"YCrCb to RGB
RGB 0-255
Y 16-235
CbCr 16-240"
| ym r g b cbm crm |
ym := y - 16.
ym := ym + (ym >> 3) + (ym >> 5) + (ym >> 7).
cbm := cb - 128.
crm := cr - 128.
r := ym + crm + (crm >> 3) + (crm >> 4).
g := ym - ((cbm >> 3) + (cbm >> 4) + (cbm >> 5)) - ((crm >> 1) + (crm >> 5)).
b := ym + ((cbm << 1) + (cbm >> 4) + (cbm >> 5) + (cbm >> 6)).
^ {r.g.b}
To be honest I thought I understood the integer approach but had no idea how the shift approach worked. Still the results were the same numerically - just a lot faster. So I my adventure into “Integer witchcraft” started and I decided to keep my findings here in the blog.
I’ll start with a simple example. Lets take the term integer
shift(s) expression step by step.
Eliminate floating point numbers (using powers of 10)
The float
The last two terms are mathematically equal – however if you work in a language where integer
operations do not coerce to floats “automagically” (e.g. C or Ruby) you my want to multiply first. Why? Let’s assume
So multiplying first doesn’t hurt in Smalltalk (unless you sprinkle in a lot of asInteger
calls) or other languages which automatically coerce to float
s if needed … but might safe your bu.. in C, Ruby and maybe a few others.
Eliminate floating point numbers (using powers of 2)
Instead of using powers of 10 we can also use powers of 2. This will allow us to convert the expression
We can use any power of 2 – but let’s just use 32 (
25 ) for now. We can/will/should/must use others as well; but that’s discussed later.
integer
expression which is “kind of” equal:
As you can see the difference between the float
value
When working in languages which to not automatically coerce the result of
integer
operations tofloat
s when needed your can stop reading here. Integer arithmetic is on of the most heavily optimized areas in today’s compilers and CPUs.
It’s different in Smalltalk though: A simple expression like3 * 4 / 5
(which would be2
with integer arithmetic and2.4
with floating points) might result in aFraction
((12/5)
)!!! Although great from a “keep the precision” standpoint it’s a speed nightmare. Sprinkling in someasFloat
calls might speed up the thing a bit … but for soeme calculations it can even be faster … just read on.
Replace multiplication of floating point numbers with integer shifts
So how to convert the expression integer
shifts? The “trick” is to replace the numerator
QUICK REFRESH
x≪s=x⋅2s : Shifting an integerx bys bits to the left is equal to multiplyingx with2s .
x≫s=x∗2−s=x÷2s : Shifting an integerx bys bits to the left is equal to multiplyingx with2−s or dividing by2s .
So we can replace the expression float
values&operations) with integer
values&operations).
Conclusion
Maybe everybody knew this “trick” and I have simply been living under a rock for the last decades. If this is the case this blog still server of a reminder for myself. If you indeed learned something that’s even better!
Limitations
IT ONLY WORKS FOR INTEGER VALUES!!
An expression like integer
!
PRECISION
Float
values (float
s (e.g. float
value
Denominator
- Approximated fraction:
1.164≈1916=1.1875 ; Error≈0.02350 - Shift Expression:
x + (x >> 3) + (x >> 4)
Denominator
- Approximated fraction:
1.164≈3732=1.15625 ; Error≈0.00775 - Shift Expression:
x + (x >> 3) + (x >> 5)
Denominator
- Approximated fraction:
1.164≈7464=1.15625 ; Error≈0.00775 - Shift Expression:
x + (x >> 3) + (x >> 5)
Denominator
- Approximated fraction:
1.164≈149128=1.1640625 ; Error≈0.00006 - Shift Expression:
x + (x >> 3) + (x >> 5) + (x >> 7)
So in theory the bigger the denominator the higher the precision – although not every step means more precision (compare 32 and 64). Another limiting factor is the numeric range of the variable. If the variable is an unsigned byte
(0-255
) it’s pointless to shift it more than 8 bits to the right.
So given the example above (
Finding the best compromise between precision and number of shifts is the actual hard decision you have to make. Check the code below to make an informed decision.
Code
NOTE: The code below is neither nice nor fast. But it does it’s job :-)
Number>>#multiplyAsShiftReportWith
^ self multiplyAsShiftReportWith: 'x'
Number>>#multiplyAsShiftReportWith: variableName
^ String
streamContents: [ :report |
| denominator numerator aproximatedFloat bitString numberOfSetBits currentShiftOffset shiftPositions |
report
nextPutAll: 'Float: ' , self printString;
cr.
4 to: 9 do: [ :denominatorExponent |
denominator := 2 raisedTo: denominatorExponent.
numerator := (self * denominator) rounded.
aproximatedFloat := (numerator / denominator) asFloat.
bitString := numerator bitString last: 16.
numberOfSetBits := bitString asBag occurrencesOf: $1.
report
cr;
nextPutAll: ('<1p> / <2p> = <3p>' expandMacrosWith: numerator with: denominator with: aproximatedFloat);
cr;
nextPutAll:
('|Error| = <1s>; (log10 |Error| = <2s>)'
expandMacrosWith: ((aproximatedFloat - self) abs printShowingDecimalPlaces: 5)
with: ((aproximatedFloat - self) abs log printShowingDecimalPlaces: 3));
cr;
nextPutAll: ('<1p> = Binary <2s>' expandMacrosWith: numerator with: bitString);
nextPutAll: ('; required shifts: <1p>' expandMacrosWith: numberOfSetBits);
cr;
nextPutAll:
('Precision/Shifts Ratio: <1s>'
expandMacrosWith: ((aproximatedFloat - self) abs log abs / numberOfSetBits printShowingDecimalPlaces: 3));
cr;
nextPutAll: ('Orignal Expression: <1s> * <2p>' expandMacrosWith: variableName with: self);
cr;
nextPutAll:
('Integer Expression: <1s> * <2p> / <3p>' expandMacrosWith: variableName with: numerator with: denominator);
cr.
currentShiftOffset := denominatorExponent - bitString size.
shiftPositions := OrderedCollection new.
bitString asArray
do: [ :bit |
currentShiftOffset := currentShiftOffset + 1.
bit = $1
ifTrue: [ shiftPositions add: currentShiftOffset ] ].
report nextPutAll: 'Shift Expression: '.
shiftPositions
do: [ :position |
position = 0
ifTrue: [ report nextPutAll: variableName ]
ifFalse: [
position negative
ifTrue: [ report nextPutAll: ('(<1s> %<%< <2p>)' expandMacrosWith: variableName with: position negated) ]
ifFalse: [ report nextPutAll: ('(<1s> >> <2p>)' expandMacrosWith: variableName with: position) ] ] ]
separatedBy: [ report nextPutAll: ' + ' ].
report cr ] ]
Code (Usage)
I using the same float
(
1.164 multiplyAsShiftReportWith: 'x'. "print-it"
'Float: 1.164
19 / 16 = 1.1875
|Error| = 0.02350; (log10 |Error| = -1.629)
19 = Binary 0000000000010011; required shifts: 3
Precision/Shifts Ratio: 0.543
Orignal Expression: x * 1.164
Integer Expression: x * 19 / 16
Shift Expression: x + (x >> 3) + (x >> 4)
37 / 32 = 1.15625
|Error| = 0.00775; (log10 |Error| = -2.111)
37 = Binary 0000000000100101; required shifts: 3
Precision/Shifts Ratio: 0.704
Orignal Expression: x * 1.164
Integer Expression: x * 37 / 32
Shift Expression: x + (x >> 3) + (x >> 5)
74 / 64 = 1.15625
|Error| = 0.00775; (log10 |Error| = -2.111)
74 = Binary 0000000001001010; required shifts: 3
Precision/Shifts Ratio: 0.704
Orignal Expression: x * 1.164
Integer Expression: x * 74 / 64
Shift Expression: x + (x >> 3) + (x >> 5)
149 / 128 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
149 = Binary 0000000010010101; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 149 / 128
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
298 / 256 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
298 = Binary 0000000100101010; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 298 / 256
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
596 / 512 = 1.1640625
|Error| = 0.00006; (log10 |Error| = -4.204)
596 = Binary 0000001001010100; required shifts: 4
Precision/Shifts Ratio: 1.051
Orignal Expression: x * 1.164
Integer Expression: x * 596 / 512
Shift Expression: x + (x >> 3) + (x >> 5) + (x >> 7)
'
Update 1
This approach of course works for every kind of Number
, ScaledDecimal
s included (thanks Tobias!). So I moved the method to class Number
and updated the code.
Update 2
Thanks to a Twitter conversation with Hwa Jong Oh there is now a FastConstantMultiplier
. So if you need to multiply an Integer
with a constant quite often this might be interesting for you:
Load package
Gofer it
smalltalkhubUser: 'UdoSchneider' project: 'Playground';
package: 'FastConstantMultiplier';
load.
Using the package
m := FastConstantMultiplier constant: 0.123456789 significantDigits: 11.
m * 10000. "1232 "
m * 10000000000000." 1234567890054"
Written with StackEdit.
No comments:
Post a Comment