•Catalase Website Index •James •Techno •Optimisation

Molecular Biologists may want to view 3D models of protein molecules on a computer screen and to rotate them to look at them from different viewpoints

Each atom of the molecule has some 3D coordinates,
(** x,y,z**).
In standard implementations of molecular viewers, to convert to positions on a
screen these (

Z = R

If we were applying perspective we would then scale
**X **and **Y** down by the distance from the viewer, i.e. divide
each of these by **Z** . However, a non perspective
'orthogonal projection' is just fine for this kind of a molecular
model.

The above with a direct implemntation takes nine multiplications and six additions per transformed atom's coordinates. On a 68000 processor multiplications take 77 cycles and additions take 4 cycles, so each atom transformed takes at least 725 cycles.

There are several ways to reduce this work. The
matrix **R **can be decomposed into two 2D rotation
matricese. The calculation then requires eight
multiplications and four additions taking just
632 cycles. However, a much larger saving can be made on
this problem. The approach presented here for rapid rotation
eliminates multiplication entirely from point transformation calculations.

The trick is to notice that most of the time the molecule is being rotated about a fixed axis. We can use this to pre-prepare our coordinates.

A point moving in a unit circle has coordinates (
cos* t*, sin

Similar equations hold for **Y **and
**Z** .
These are the parametric equations for an elliptical path on the
screen - which is what a circle looks like in orthogonal perspective.

For a more general circle we need to be able to adjust
the radius **A **and the position of the centre of the circle,
**B**:

By some trigonometric magic this is equivalent to:

The cosine and sine waves have here been combined to
create a single cosine wave with a new amplitude **M **and
phase * θ*. We need to do this calculation for
each coordinate, so there are three multiplications, six additions
and three look ups of cosines from a prebuilt
cosine table. We only need ten bit accuracy on the
cosines, so a table is perfectly O.K. The 68000
code for this is:

Cycles movem.w (a1)+,d0-d2 ;fetch m, theta and base value (15) muls 0(ao,d1),d0 ;replace d0 with m*cos(t+theta) (77) clr.w d0 ;clear out insignificant bits (04) swap d0 ;bring sig bits into lower word (04) add.w d2,d0 ;add in base value (04) move.w d0,(a2)+ ;ship result out (05) TOTAL: 109

Since the same calculation is also done for **Y **and **Z **the
time taken is 327 cycles per atom.

Can we do better?

With some more trig the **X **equation is converted to:

The trick here is in choosing
* φ*. If

Cycles movem.w (a1)+,d0-d2 ;fetch basevalue, theta+phi, theta-phi (15) add.w 0(ao,d1),d0 ;add 512 times cos(t+theta+phi) to d0 (11) add.w 0(ao,d2),d0 ;add 512 times cos(t+theta-phi) to d0 (11) move.w d0,(a2)+ ;ship result out (05) TOTAL: 42

Notice that there are no multiplications. This method reduces the time taken per atom to 126 cycles, a near six fold reduction over the direct implementation.

It turns out that the calculations needed to get the φ's, θ's and B's are about twice as 'expensive' as the original matrix multiplication scheme. Typically a molecule rotates for at least 30 frames about one axis before a new axis is chosen so the cost of computing these φ's, θ's and B's is paid back many times over.

The method was well suited to processors of the mid 80's with slow (or even software-only) multiply instructions. At that time it made sense to trade multiply instructions for the relatively much faster table look up of cosines. The technique is not well suited to pentium class processors. On pentiums using a look up table for cosines, which will cause many cache misses, is an expensive alternative to multiplication.

However, the technique of pre-processing coordinate sets prior to rendering a sequence of frames from related viewpoints is still useful in current graphics programming. With the right pre processing one can eliminate 'hidden objects' and set levels of object detail for a whole sequence of frames rather than calculate it on a per-frame basis.

For a 68000 processor a progression of optimisations was as follows:

Direct implementation | 725 Cycles |

Factored rotation matrix | 632 Cycles |

M cos(phi) method | 327 Cycles |

cos(phi+theta) + cos(phi-theta) method | 126 Cycles |

The near six fold saving is through recognising that a calculation can be done once on the coordinates that then makes significant savings on a sequence of subsequent frames.

© James Crook, March 2000.

•Catalase Website Index •James •Techno •Optimisation