
Sender: "VR / sci.virtual-worlds" <VIRTU-L%UIUCVMD.BITNET@pucc.Princeton.EDU>
Date: Wed, 4 Nov 1992 19:11:23 -0800
From: "Human Int. Technology" <hlab@u.washington.edu>
Subject: Sci-VW: PAPER: Galilean Antialiasing for VR, Part 03/04
To: Multiple recipients of list VIRTU-L <VIRTU-L%UIUCVMD.BITNET@pucc.Princeton.EDU>

From: John Costella <jpc@tauon.ph.unimelb.edu.au>
Subject: PAPER: Galilean Antialiasing for VR, Part 03/04
Date: Mon, 26 Oct 92 5:18:49 EET

%  File 3 of 4.  NOTE: All four files MUST be concatenated
%                before this document can be LaTeXed.
%
%
%  ... Continuation of "Galilean Antialiasing for Virtual Reality Displays"
%
%  (The following line *must* be left blank.)

As an aside, it is worth noting, at this point, that the prospects
for implementing significant parallelism in the video controller
circuitry \e{in general} are reasonably good,
if a designer so wishes to---or needs to---proceed in that direction.
One approach would be to slice the display up into smaller rectangular
portions, and assign
dedicated controller circuitry to each of these portions.
Of course, such implementations need to correctly treat memory
contention issues, since a galpixel in any part of the display
can, in general, be propagated to any other part of the display
by the next frame.
The extent to which this kneecaps the parallel approach depends
intimately on the nature of the memory hardware structure that is
implemented.
Furthermore, the video controller is subject to a rock-solid
deadline: the new frame buffer \e{must} be completely
updated before the
next tick of the frame clock; this further complicates the design
of a reliable parallel system.
However, this is a line of research that may prove promising.

We have, above, outlined the general plan of attack in
\e{unobscuration} situations.
But what about \e{expansion}?
We should first check to see if using debris solves this problem too.
Imagine an object undergoing (apparent) expansion on the display.
Just say a pixel near the centre becomes a ``hole''.
What will the debris method do?
Well, it will simply fill it in with the \e{old} central pixel---which
looks like a reasonably good match!
But now consider a hole that appears near the boundary of the
expanding object.
The debris method will now fill in whatever was \e{behind} that
position in the previous frame---which could be any other object.
Not so good.
Now consider an even worse situation: the object is apparently
moving towards the viewer \e{as well as moving transversely}:
in the new frame, it occupies a new position, but its own
debris is left behind.
Again, objects behind the one approaching will ``show through''
the holes.
Not very satisfactory at all.

We must, therefore, devise some way in which the video controller
can tell whether a missing pixel is due to unobscuration, or whether
it is due to expansion of an object.
But this test \e{must be lightning-fast}: it must essentially be
hard-wired to be of any use in the high-speed frame buffer
propagation process.
Undaunted, however,
let us consider, from first principles, how this test
might be carried out.
Firstly, even in abstract terms, how would one distinguish between
unobscuration and expansion anyway?
Well, we might look at the \e{surrounding galpixels}, and see what
patterns they form.
How does this help?
Well, in an \e{expansion} situation, the surrounding pixels will
all lie on a three-dimensional, roughly smooth surface.
On the other hand, on the edge of a patch of \e{unobscured}
galpixels, the surrounding pixels will be discontinuous, when
looked at in three-space: the object moving in front of the
one behind must be at least some \e{measurable} $z$-distance in front
of the other
(or else $z$-buffering would be screwed up anyway).
So, in abstract terms, our task is to examine the surrounding pixels,
and see if they all lie on the same surface.

How is this to be done in practice, much less in hardware?
To investigate this question, we first note that we are only interested
in a \e{small area} of the surface; in such a situation, one may
treat the surface as \e{approximately flat}, whether it really
\e{is} flat (\eg\ a polygon) or not (\eg\ a sphere).
(If the surface is not really flat, and only subtends a few pixels'
worth on the display, in which area the curvature of the surface is
significant, then this argument breaks down; the following
algorithm will mistakenly think unobscuration is happening; but, in this
situation, such a small rendering of
the object is pretty well unrecognisable anyway,
and a hole through it is probably not going to bring the world to
an end.)
Now, we can always approximate a sufficiently flat surface by a
\e{plane}, at least over the section of it that we are interested in.
Consider, now, an \e{arbitrary}
surface in three-space, which we assume
can be expressed in the form $z=z(x,y)$, \ie\ the $z$ component at
any point is specified according to some definite
single-valued function of $x$
and $y$.
(Our display methodology automatically ensures only single-valued
functions occur anyway, due to the hidden-surface removal property
of $z$-buffering.)
Now, if that surface is a \e{plane},
$z$ will (by definition) be simply a linear function of $x$ and
$y$:
\beqn{PlaneDef}
z(x,y)=\al x+\be y+\g,
\eeqn
where $\al$, $\be$ and $\g$ are simply constants
specifying the orientation and position of the plane.
Now consider taking the \e{two-dimensional} gradient of this
function $z(x,y)$, namely, for the case of the plane,
\beqn{PlaneGrad}
\del z(x,y)=\del(\al x+\be y+\g)=\al\vi+\be\vj,
\eeqn
where the two-dimensional gradient $\del$ has components
$(\pard/\pard x,\pard/\pard y)$,
and $\vi$ and $\vj$ are unit vectors in the $x$ and
$y$ directions respectively.
Now consider taking the divergence of \eq{PlaneGrad} itself, namely
$\del\vdot\del z(x,y)\id\del^2z(x,y)=\del\vdot\,(\al\vi+\be\vj)=0.$
This tells us that, for any point on a smooth surface (at least to
the extent that it can be locally approximated by a plane),
the function $z=z(x,y)$ satisfies \e{Laplace's equation},
\beqn{Laplace}
\del^2z(x,y)=0.
\eeqn
Consider, on the other hand, what happens if we abut
two \e{different} surfaces side-by-side, such as is done when we
produce a $z$-buffered display showing one object partly obscuring
another.
Assume, for simplicity, that this ``join'' is along the $y$-axis
of the display.
To the right of the $y$-axis, we have one surface, which can be
described by the function $z_R=z_R(x,y)$; to the left of the $y$-axis,
we have a different surface, described by $z_L=z_L(x,y)$.
We can describe the function $z=z(x,y)$ for \e{all} $x$
simultaneously (\ie\ on
\e{both} sides of the $y$-axis) by employing the
\e{Heaviside step function}, $\th(x)$, which is equal to $+1$
for $x>0$, and is equal to $0$ for $x<0$ (\ie\ it ``switches on''
only to the right of the $y$-axis):
\beqn{PieceWise}
z(x,y)=\th(x)z_R(x,y)+\th(-x)z_L(x,y),
\eeqn
where the $\th(-x)$ term similarly ``switches on'' the left-hand surface
to the left of the $y$ axis.
Let us now proceed to take the Laplacian of \eq{PieceWise}, step by
step, as was done in arriving at \eq{Laplace}.
Taking the two-dimensional gradient of \eq{PieceWise}, we have
\beqnarr{SliceGrad}
\del z(x,y)\tb\id\tb\braces{\vi\f{\pard}{\pard x}
+\vj\f{\pard}{\pard y}}z(x,y)       \nline
\tb=\tb\vi\braces{\de(x)\brac{z_R(x,y)-z_L(x,y)}
+\th(x)\f{\pard z_R(x,y)}{\pard x}
+\th(-x)\f{\pard z_L(x,y)}{\pard x}}  \nline
\tb\tb+\vj\braces{\th(x)\f{\pard z_R(x,y)}{\pard y}
+\th(-x)\f{\pard z_L(x,y)}{\pard y}}  \nline
\vspace{0.2cm}
\tb\id\tb\th(x)\del z_R(x,y)+\th(-x)\del z_L(x,y)
+\vi\de(x)\brac{z_R(x,y)-z_L(x,y)},
\eeqnarr
where we have used the product rule for differentiation, and
where $\de(x)\id d\th(x)/dx$---the derivative of the Heaviside step
function---is the \e{Dirac delta function}, which is an infinitely tall,
infinitely thin
spike at the origin, such that the area under the curve is equal to
one.
We now need to take the divergence of \eq{SliceGrad} itself.
(Readers who are by this stage nauseated by the
mathematics should persevere, if only skimmingly; things will
be simplified again shortly.)
Performing this divergence, we have
\beqnarr{PoissonMess}
\del\vdot\del z(x,y)\tb\id\tb\del^2z(x,y)
\id\braces{\vi\f{\pard}{\pard x}+\vj\f{\pard}{\pard y}}
\vdot\del z(x,y)   \nline
\tb=\tb\de'(x)\brac{z_R(x,y)-z_L(x,y)}
+2\de(x)\braces{\f{\pard z_R(x,y)}{\pard x}
-\f{\pard z_L(x,y)}{\pard x}} \nline
\tb\tb+\th(x)\del^2z_R(x,y)+\th(-x)\del^2z_L(x,y),
\eeqnarr
where $\de'(x)\id d\de(x)/dx$ is the derivative of the Dirac
delta function (which looks like an infinite up-then-down double-spike).
Now, if each of the surfaces on the right and left is smooth (or,
in the case of the closer surface, \e{would} be smooth if continued on
in space), then they will,
by \eq{Laplace}, satisfy $\del^2z_R(x,y)=0$
and $\del^2z_L(x,y)=0$; thus, the last line
of equation~\eq{PoissonMess}
will vanish.
Let us, therefore, consider the remaining terms in \eq{PoissonMess}.
Since both $\de(x)$ and $\de'(x)$ vanish everywhere except for the
line $x=0$ (\ie\ the $y$-axis), \eq{PoissonMess}
tells us that the expression $\del^2 z(x,y)$ \e{also} vanishes
everywhere except the $y$-axis; \ie, away from the join, each
surface separately satisfies equation~\eq{Laplace}---this we could
have guessed.
What about \e{on and about} the $y$-axis?
Well, in that vicinity, neither $\de(x)$ nor $\de'(x)$ vanishes;
the only way that the whole expression \eq{PoissonMess}
can vanish is if the factors that these functions are respectively
multiplied by \e{both} separately vanish.
This would require
\beqn{ContAbut}
z_R(0,y)=z_L(0,y)
\eeqn
and
\beqn{ContDerivAbut}
\f{\pard z_R}{\pard x}(0,y)=\f{\pard z_L}{\pard x}(0,y)
\eeqn
to both hold true.
But \eq{ContAbut} says that the two surfaces on either side of the
$y$-axis must be at the \e{same depth} for the particular $y$
value in question (and thus coincident in three-space);
and, furthermore, \eq{ContDerivAbut} requires that their \e{derivatives}
be equal across the $y$-axis.
But if we had two surfaces next to each other, with moreover
a ``smooth matching'' of the rate of change of depth across the
join, then these two surfaces may just as well be called \e{one
single} surface---they are smoothly joined together.
Therefore, \e{for any two non-smoothly-joined surfaces
apparently abutting each
other in the display plane},
we conclude that the depth function $z=z(x,y)$
obeys the equation
\beqn{Poisson}
\del^2z(x,y)=\r(x,y),
\eeqn
with some ``source'' function $\r(x,y)$ \e{that is non-zero at the
abutment, but zero everywhere else}.
Equation \eq{Poisson} is, of course, \e{Poisson's equation}.
It is clear that it is an ideal way to
determine whether we are ``inside'' a smooth surface
(\ie\ expansion is in order) or at the \e{edge} of two surfaces
(which will allow us to detect the edges of unobscuration areas),
simply by computing $\del^2 z(x,y)$ on the
$z$-buffer information, and checking whether or not the answer is
(reasonably close to) zero;
we shall call such a test a \e{Poisson test}.

``OK, then,'' our recently-nauseated readers ask, ``How on earth
do we compute Poisson's equation, \eq{Poisson}, in our
video controllers?
What sort of complicated, slow, expensive,
floating-point chip will we need for
\e{that}?''
The answer is, of course, that computing \eq{Poisson} is
\e{especially} easy if we have $z$-buffer information on a rectangular
grid (which is precisely what we \e{have} got!).
Why is this so?
Well, consider the Laplacian operator $\del^2\id\del\vdot\del$.
Performing this dot-product \e{before} having either of the $\del$
operators act on anything, we have
\beqn{ExpandLap}
\del^2\id\del\vdot\del
\id\braces{\vi\f{\pard}{\pard x}+\vj\f{\pard}{\pard y}}
\vdot\braces{\vi\f{\pard}{\pard x}+\vj\f{\pard}{\pard y}}
\id\f{\pard^2}{\pard x^2}+\f{\pard^2}{\pard y^2}
\eeqn
(hence the suggestive notation $\del^2$);
expressed like this, Poisson's equation \eq{Poisson} becomes
$\f{\pard^2z(x,y)}{\pard x^2}+\f{\pard^2z(x,y)}{\pard y^2}=\r(x,y).$
Thus, if we can compute both $\pard^2z/\pard x^2$ and
$\pard^2z/\pard y^2$, then we can perform our Poisson test!
How, then, do we compute (say) $\pard^2z/\pard x^2$?
Well, using the information in the display buffer,
we cannot do this \e{exactly}, of course; however, we \e{can} get
quite a good estimate of it by using \e{finite differences}.
To see this, let us first worry about simply computing the \e{first}
derivative, $\pard z/\pard x$.
The fundamental definition of this derivative is
\beqn{DerivFund}
\left.\f{\pard z(x,y)}{\pard x}\right|_{x=x_0,y=y_0}
\id\lim_{\eps\rightarrow0}\f{z(x_0+\eps,y_0)-z(x_0,y_0)}{\eps},
\eeqn
where the subscript on the left hand side denotes the fact that
we are evaluating the derivative at the point $(x=x_0,y=y_0)$.
Now, we cannot get arbitrarily close to $\eps\rightarrow0$,
as this definition specifies: we can only go down to $\eps=1$ pixel
(which, if the surfaces are a significant number of pixels in
size, \e{is} really a small distance, despite appearances).
However, if we have a \e{finite} step size $\eps$, as we have,
then where would the approximate derivative, as computed
via \eq{DerivFund}, ``belong to''?
In other words, is it the derivative evaluated at $(x_0,y_0)$;
or is it the derivative evaluated at $(x_0+1,y_0)$;
or is it something else again?
Just as with our earlier considerations of such questions,
a ``democracy'' principle is optimal here: we ``split the
difference'', and say that the computed derivative ``belongs''
to the (admittedly undisplayable) point $(x_0+\half,y_0)$.
In other words,
\beqn{DiscreteDeriv}
\left.\f{\pard z(x,y)}{\pard x}\right|_{x=x_0+\half,y=y_0}
\approx z(x_0+1,y_0)-z(x_0,y_0)
\eeqn
is a good estimate of the quantity $\pard z/\pard x$.
But we want $\pard^2z/\pard x^2$, not $\pard z/\pard x$; how do
we get this?
Well, we simply apply \eq{DerivFund} a \e{second} time, since
\beqn{SecondDeriv}
\f{\pard^2z}{\pard x^2}\id\f{\pard}{\pard x}
\braces{\f{\pard z}{\pard x}}.
\eeqn
But, so far, we only have $\pard z/\pard x$ evaluated at $(x=x_0+\half, y=y_0)$: we need \e{two} points to compute a discrete derivative
using the approximate formula \eq{DiscreteDeriv}.
Well, this is easy to arrange: we just use the points $(x=x_0,y=y_0)$
and $(x=x_0-1,y=y_0)$ to compute $\pard z/\pard x$ as evaluated at the
position half a pixel to the \e{left},
$(x=x_0-\half,y=y_0)$; in other words,
\beqn{DiscreteDerivBack}
\left.\f{\pard z(x,y)}{\pard x}\right|_{x=x_0-\half,y=y_0}
\approx z(x_0,y_0)-z(x_0-1,y_0).
\eeqn
We now use \eq{SecondDeriv}, with \eq{DiscreteDeriv}
and \eq{DiscreteDerivBack}, to get a good estimate of
$\pard^2 z/\pard x^2$:
\beqnarr{SecondDerivDisc}
\left.\f{\pard^2z(x,y)}{\pard x^2}\right|_{x=x_0,y=y_0}
\tb\approx\tb\brac{z(x_0+1,y_0)-z(x_0,y_0)}  \nline
\tb\tb-\brac{z(x_0,y_0)-z(x_0-1,y_0)}       \nline
\tb=\tb z(x_0+1,y_0)+z(x_0-1,y_0)-2z(x_0,y_0).
\eeqnarr
We now note that, not only has our ``split the difference'' philosophy
brought us back bang on to the pixel $(x=x_0,y=y_0)$ that we wanted,
\e{we can perform this computation using only
an adder and a bit-shift}: it can be hard-wired into our
video controller, no problems!
To the result \eq{SecondDerivDisc} we must add, of course, the
corresponding second derivative in the $y$-direction,
$\pard^2z/\pard y^2$; this follows the same procedure as
used to obtain \eq{SecondDerivDisc}, but now taking adjacent pixels
in the $y$ direction.
Adding these quantities together, and collecting together terms,
we thus have our final, hard-wireable form of the Poisson test:
\beqnarr{PoissonFinal}
\left.\del^2 z(x,y)\right|_{x_0,y_0}
\tb\approx\tb z(x_0+1,y_0)+z(x_0-1,y_0) \nline
\tb+\tb z(x_0,y_0+1)+z(x_0,y_0-1)-4z(x_0,y_0).
\eeqnarr
We must therefore simply implement this test in hardware, and check to
see if the result is reasonably close to zero.
(Just \e{how}
close is something that must be checked for typical operating
conditions against real-life objects that are rendered; no general
formula can be given; research, with a modicum of common sense,
should prevail.)
If the Poisson test reveals a non-zero ``source'' at that
pixel, $\r(x_0,y_0)$, then we are probably at the edge of
two separated surfaces;
conversely, if the test shows a ``source'' value compatible with zero,
then we are, most likely, within the interior of a single surface.
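
For concreteness, the whole test can be captured in a few lines of C
(a minimal sketch only: the $z$-buffer is imagined as a plain
two-dimensional array, and the threshold {\tt POISSON\_EPS} is a
hypothetical constant that must be tuned as just described):

\begin{verbatim}
#include <stdlib.h>

#define WIDTH       1000
#define HEIGHT      1000
#define POISSON_EPS 8        /* hypothetical threshold: must be tuned */

/* z-buffer: one depth value per pixel (illustrative layout only) */
static long z[HEIGHT][WIDTH];

/* Poisson test at interior pixel (x0,y0): returns 1 ("yes") if a
   non-zero source term is detected (edge of two abutting surfaces),
   0 ("no") if we appear to be inside a single smooth surface.      */
int poisson_test(int x0, int y0)
{
    long lap = z[y0][x0 + 1] + z[y0][x0 - 1]
             + z[y0 + 1][x0] + z[y0 - 1][x0]
             - (z[y0][x0] << 2);     /* 4z: one shift, one subtract */
    return labs(lap) > POISSON_EPS;
}
\end{verbatim}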

OK, then, we have shown that computing the Poisson test for any
pixel is something that we can definitely do in hardware.
How do we use this information,
in practice, to perform an expansion'' of the image of
an object moving towards us?
Well, this is an algorithmically straightforward task;
it does, however, require some extra memory on the part of the
video controller, as well as sufficient speed to be able to perform
\e{two} sweeps: one through the old frame buffer (as described above),
and one through the new frame buffer.
The general plan of attack is as follows.
The first sweep, through the old frame buffer, follows the
procedure outlined above: both debris and propagated galpixels are
moved to the new frame buffer.
To this first sweep, however, we add yet
\e{another} piece of parallel circuitry (alongside those that
handle debris and propagate galpixels respectively),
which computes the Laplacian of the
$z$-buffer information according to the approximation
\eq{PoissonFinal}, as each galpixel is encountered in turn.
For each galpixel, it determines whether the Poisson test is satisfied
or not, and stores this information in a \e{Poisson flag map}.
This 1-bit-deep matrix simply stores ``yes'' or ``no'' information
for each galpixel: ``yes'' if there is a non-zero Poisson source,
``no'' if the computed source term is compatible with zero.
This Poisson flag map is only a modest increase in memory requirements
(only about 122~kB even for a $1000\times1000$ display) compared to the
memory that we have already allocated for \Galn\ \anti ing.
If \e{any} of the five $z$-buffer values used to test the
Poisson equation are \e{debris} (\ie\ debris already left behind
on a previous frame propagation), the value ``yes'' is automatically
stored, since the pixel in question cannot then be inside a surface
(because debris should never have been left inside
the surface last time
through).
Pixels around the very edge of the display, however,
\e{cannot} be tested;
they may simply (arbitrarily) be also flagged as ``yes''.
(The out-of-view buffering method of section~\ssect{LocalUpdate}
makes this \e{edge effect} much less important.)
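
Continuing the C sketch above (with a hypothetical predicate
{\tt is\_debris()} standing in for the debris marker of the real
galpixel format), the first-sweep bookkeeping might read:

\begin{verbatim}
#include <string.h>

#define WIDTH  1000
#define HEIGHT 1000

extern int poisson_test(int x0, int y0);   /* as sketched above   */
extern int is_debris(int x, int y);        /* hypothetical marker */

/* 1-bit-deep Poisson flag map, packed eight flags to a byte */
static unsigned char flag_map[HEIGHT][WIDTH / 8];

static void set_flag(int x, int y)
{
    flag_map[y][x / 8] |= (unsigned char)(1 << (x % 8));
}

/* First sweep: flag every galpixel of the old frame buffer.  Display
   edges, and pixels whose five-point test touches debris, are
   automatically flagged "yes".                                      */
void build_flag_map(void)
{
    memset(flag_map, 0, sizeof flag_map);
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 0; x < WIDTH; x++)
            if (x == 0 || y == 0 || x == WIDTH-1 || y == HEIGHT-1
                || is_debris(x, y)
                || is_debris(x+1, y) || is_debris(x-1, y)
                || is_debris(x, y+1) || is_debris(x, y-1)
                || poisson_test(x, y))
                set_flag(x, y);
}
\end{verbatim}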

Once the first sweep has been completed (and not before),
a second sweep
is made, this time of the \e{new} frame buffer.
At the beginning of this sweep, the new frame buffer consists solely
of propagated galpixels and, where they are absent, debris;
and the Poisson flag map contains the ``yes--no'' Poisson test result
for every galpixel in the \e{old} frame buffer.
The video controller now scans through the new frame buffer, looking
for any pixels marked as debris.
When it finds one, it then looks at the \e{previous} galpixel in the
new frame buffer (\ie, with a regular left-to-right, top-to-bottom
scan, the pixel immediately to the left of the debris).
Why?
The reasoning is a little roundabout.
First, assume that the unoccupied pixel of the new frame buffer
\e{is}, in fact, within the boundaries of a smooth surface.
If that \e{is} true, then the galpixel to its left must be
inside the surface too.
If we then \e{reverse-propagate} that galpixel (the one
directly to the left) back to where it \e{originally} came from in the
old frame buffer (as, of course, we \e{can} do---simply reversing
the sign of $t$ in the \Galn\ propagation equation, using the same
hardware as before), then we can simply check its Poisson flag map
entry to see if, in fact, it \e{is} within the boundaries of a surface.
If it is, then all our assumptions are in fact true; the debris pixel
in the new frame buffer is within the surface; we should now invoke
our expansion'' algorithm (to be outlined shortly).
On the other hand, if the Poisson flag map entry turns out to show
that the reverse-propagated galpixel is \e{not} within the boundaries
of a surface, then our original assumption is therefore false:
we have (as best as we can determine) a \e{just-unobscured} pixel;
for such a pixel, debris should simply be displayed---but since
there already \e{is} debris there, we need actually do nothing more.
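
In C-like terms, the whole decision reads as follows (a sketch only:
{\tt is\_debris\_new()}, {\tt reverse\_propagate()},
{\tt flag\_is\_set()} and {\tt replicate\_from\_left()} are
hypothetical stand-ins for the hardware operations just described):

\begin{verbatim}
#define WIDTH  1000
#define HEIGHT 1000

/* Hypothetical stand-ins for the hardware described in the text */
extern int  is_debris_new(int x, int y);
extern void reverse_propagate(int x, int y, int *old_x, int *old_y);
extern int  flag_is_set(int x, int y);
extern void replicate_from_left(int x, int y);

/* Second sweep: for each piece of debris in the new frame buffer,
   decide between "expansion" (fill in) and "unobscuration" (leave). */
void second_sweep(void)
{
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 1; x < WIDTH; x++) {
            if (!is_debris_new(x, y))
                continue;                  /* a propagated galpixel */
            int ox, oy;                    /* old-frame coordinates */
            reverse_propagate(x - 1, y, &ox, &oy);
            if (!flag_is_set(ox, oy))      /* left neighbour is in  */
                replicate_from_left(x, y); /* a surface: expand     */
            /* else: just-unobscured -- leave the debris displayed  */
        }
}
\end{verbatim}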

However, we must, in practice, add some ``meat'' to the
above algorithm
to make it a little more reliable.
What happens when the galpixel to the left was \e{itself} on the left
edge of the surface?
In that situation, we \e{do} want to fill in the empty pixel as an
expansion''---but the Poisson test tells us (rightly, of course)
that the pixel to the left is on an edge of something.
In an ideal world, we could always ``count up'' the number of edges
we encounter to figure out if we are entering
or leaving a given expanding
surface.
But, even apart from the need to treat special cases in mathematical
space, such
a technique is not suitable at all in
the approximate, noisy environment of
a pixel-resolution \e{numerical} Poisson test.
One solution to the problem
is to leave such cases as simply not correctly done:
holes may appear near the edges of approaching objects; we have,
at least, covered the vast majority of the interior of the surface.
A better approach, if at least one extra memory fetch cycle from the
old frame buffer is possible (and, perhaps thinking wishfully,
up to three extra fetch cycles), is to examine not only the galpixel
in the new frame buffer to the \e{left} of the piece of debris found,
but also the ones \e{above}, to the \e{right}, and \e{below}.
If \e{any} of these galpixels are in a surface's interior,
then the current pixel probably is too.

On the other hand, what is our procedure if \e{all} of the
surrounding galpixels that we have time to fetch are debris?
Then it means that, most likely, we are in the interior of a
\e{just-unobscured} region of the new frame buffer, and we should leave the
debris in the current pixel there.
This is how the algorithm ``bootstraps'' its way up from
purely edge-detection to actually filling in the \e{interior}
of just-unobscured areas.
In any such case, the video controller
would have no information with which to change the pixel in question
anyway---the correct answer is forced on us whether we like it or not!

It remains to consider
what our ``expansion algorithm'' should be, for pixels that we decide,
by the above method, \e{should} be ``blended'' into the
surrounding surface.
There are many fancy algorithms that one might cook up to
interpolate'' between points that have expanded; however,
we are faced with the dual problems that we have very little time
to perform such feats, and we do not really know for sure (without
some complicated arithmetic) just where the new pixel \e{would} have
been in the old frame buffer.
Therefore, the simplest thing to do is simply to replicate the galpixel
to the left---colour, derivatives, velocity, and all---except that
its \e{fractional position} should be zeroed (to centre it on its
new home).
If we \e{did}, perchance, fetch \e{two} surrounding galpixels---say, the
ones to the left and to the right---then a simple improvement is to
\e{average} each component of RGB (which only requires an addition,
and a division by two---\ie\ a shift right).
Finally, if we have really been luxurious, and have fetched \e{all four}
immediately-surrounding pixels, then again we can perform a simple
averaging, by using a four-way adder, and shifting two bits to the
right.
Considering that it is a fairly subtle trick to ``expand'' surfaces
in the first place, and that a display update will soon be needed
to fill in more detail for this object approaching the viewer in
any case,
these simple approaches should be sufficient in practice---even if,
from a computer graphics point of view, they are fairly
primitive.
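
In C, the arithmetic of these blends really is as cheap as claimed
(a per-channel sketch, for 8-bit colour channels):

\begin{verbatim}
/* Replacement colour for a filled-in pixel, per colour channel.
   Two neighbours: one add and a right-shift.  Four neighbours: a
   four-way add and a shift right by two bits.                    */
unsigned avg2(unsigned a, unsigned b)
{
    return (a + b) >> 1;
}

unsigned avg4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return (a + b + c + d) >> 2;
}
\end{verbatim}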

Now, the complete
explanation above of this second frame sweep'' seems to
involve a fair bit of ``thinking'' on the part of the video
controller: it has to look for debris; \e{if} it finds it, it
then has to fetch the galpixel to the left;
it then reverse-propagates it back to the old frame buffer;
it then fetches its Poisson flag map entry;
it then decides whether to replicate the galpixel to the left
(if in a surface) or simply leave debris (if
representing unobscuration); all of these steps seem
to add up to a pretty complicated procedure.
What if it had to do this for almost every galpixel in the new frame
buffer?
Can it get it done in time?
The mode of description used above, however, was presented from a
``microprocessor'', sequential point of view; on the other hand,
all of the operations described are really very trivial memory
fetches, which can be hard-wired easily.
Thus, \e{all} of the above steps should be done \e{in parallel},
for \e{every} galpixel in the second sweep, \e{regardless} of whether it
is debris or not: the operations are so fast that it will not
slow anything down to have all that information ready in essentially
``one tick of the clock'' anyway.
Thus, the only overhead with this process is that one needs to have
time to perform both the first and second sweeps---which, it should be
noted, \e{cannot} be done in parallel.

There is, however, a more subtle speed problem
that is somewhat hidden in the above
description of the Poisson test.
This speed problem is (perhaps surprisingly)
not in the second sweep at all---but, rather,
in the \e{first}.
The problem arises when we are trying to compute the Poisson test,
according to \eq{PoissonFinal}, for each galpixel: this requires
fetching \e{five} entries of $z$-buffer information at the same
time.
``But,'' the reader asks, ``you have described many times
procedures that require the simultaneous fetching of
several pieces of
memory. Why is it only now a problem?''
The subtle answer is that, in those other cases, the various
pieces of information
that needed to be simultaneously fetched were always \e{different}
in nature.
For example, to compute the propagation equation \eq{PropPos},
we must fetch all three components of motional information
simultaneously.
The point is that \e{these pieces of information can quite naturally
be placed on different physical RAM chips}---which can,
of course, all be
accessed at the same time.
On the other hand, the five pieces of $z$-buffer information that
we require to compute \eq{PoissonFinal} will most efficiently
be stored on the \e{one} set of RAM chips---which requires that
\e{five separate memory-fetch cycles} be initiated
(or, at the very least, something over three fetch cycles, if burst
mode is used to retrieve the three entries lying
on the same row as the pixel being tested).
This problem would, if left unchecked, ultimately slow down the
whole first-sweep process unacceptably.

The answer to this problem is to have
an extra five small memory chips, each with
just enough memory to store a \e{single row} of $z$-buffer information.
(In fact, to store the possible results from equation~\eq{PoissonFinal},
they need three more bits per pixel than the $z$-buffer contains,
because the overall computation of Poisson's equation has a
range $[-4z_\txt{max},+4z_\txt{max}]$, where the $z$-buffer's minimum
and maximum values are assumed to be $0$ and $z_\txt{max}$ respectively.)
These five
long, thin, independent
memory structures will be termed \e{Poisson fingers}.
As the video controller sweeps through the old frame buffer, it adds
the (single-fetch) $z$-buffer information of the galpixel to the
Poisson finger entry directly below, to the right, to the left,
and above the current galpixel's location; and shifts the
$z$-buffer value to the left by 2 bits (multiplying it by four)
and subtracts it from the Poisson finger entry right at the current
pixel position.
Clearly, the Poisson fingers for the rows directly
above and below the currently-scanned row
are only required to be written to once for each pixel-scanning
period.
The remaining three Poisson fingers are arranged such that one
takes column $0,3,6,9,\ldots$ entries, the next takes columns
$1,4,7,10,\ldots$, and the third takes columns $2,5,8,11,\ldots$.
Thus, in any one pixel-scanning period,
all of these Poisson fingers are being
written into at the same time, and only one write to each is
required in each such period.
It is in this way that the five-way Poisson test
bottleneck of equation
\eq{PoissonFinal} is avoided using the five Poisson fingers.
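
As a sketch of this per-pixel traffic in C (full-width arrays stand
in for the five physical chips; in reality each interleaved
centre-row finger need only store its own third of the columns):

\begin{verbatim}
#define WIDTH 1000

/* Five "Poisson fingers": independent single-row accumulators, each
   imagined as its own small RAM chip.  The centre row is split three
   ways (columns 0,3,6,..., 1,4,7,..., 2,5,8,...) so that the three
   centre-row writes per pixel land on three different chips.       */
static long above[WIDTH];        /* row y-1 accumulators             */
static long centre[3][WIDTH];    /* row y, interleaved by column % 3 */
static long below[WIDTH];        /* row y+1 accumulators             */

/* Per-pixel traffic during the first sweep: zv is the single-fetch
   z-buffer value of the galpixel at column x of the current row y. */
void finger_update(long zv, int x)
{
    above[x] += zv;    /* completes the Laplacian of the galpixel
                          directly above; in hardware this entry is
                          read out to the Poisson-test checker       */
    below[x] += zv;    /* "above" contribution to the next row       */
    if (x > 0)
        centre[(x - 1) % 3][x - 1] += zv;         /* left neighbour  */
    if (x < WIDTH - 1)
        centre[(x + 1) % 3][x + 1] += zv;         /* right neighbour */
    centre[x % 3][x] -= zv << 2;  /* minus 4z: a shift and a subtract */
}
\end{verbatim}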

It now simply remains to be specified how the Poisson finger information
is used.
At any given pixel being processed during the first sweep,
the information necessary to finish the computation
of the Poisson test for the galpixel
\e{directly above} the current pixel becomes available.
In our previous description,
we simply said that this Poisson finger is \e{written} to,
like the other three surrounding pixels;
in fact, all that is necessary is that
it be \e{read} (and not rewritten), added to this final piece of
$z$-buffer information, and passed through to the Poisson test
checker (the circuitry that determines whether the source term in the
Poisson equation is close enough to zero to be zero or not).

At the end of scanning each row in the first sweep, the
video controller circuitry must multiplex the contents
of the three centre-row
Poisson fingers back into a single logical finger,
and ``scroll'' them up to the top finger; likewise,
the finger below must be demultiplexed into three sets, to be placed
in the three central-row fingers; and the finger below must
in turn be zeroed, ready to start a new row.
These memory-copy operations can be performed in burst mode,
and only need be done at the end of each scan-row.
Alternatively, for even better performance,
\e{nine} Poisson fingers may be employed, with three each for the
row being scanned and the rows directly above and below;
the fingers can then simply be permuted at the end of each
scan-row, so that no bulk copying need take place at
all.
The only remaining task is then to \e{zero} the set of fingers
corresponding to the row that is becoming the new bottom row; the
chosen memory chip should have a line that clears all cells
simultaneously.
Of course, with the nine-finger option, each chip need only store
the information for a third of a row, since it will always be
allocated to the same position in the $(0,1,2)$ cyclic sequence.
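
The nine-finger variant thus reduces the end-of-row work to a pointer
permutation (a sketch; {\tt zero\_finger()} stands in for the chip's
hypothetical clear-all line):

\begin{verbatim}
/* Nine fingers: three (column-interleaved) accumulator chips each
   for the rows above, at, and below the current scan row.         */
static long *above_set[3], *centre_set[3], *below_set[3];

extern void zero_finger(long *finger);  /* hypothetical clear line */

/* End of scan-row: permute the three sets cyclically, so that no
   bulk copying of finger contents need take place at all.         */
void end_of_row(void)
{
    for (int k = 0; k < 3; k++) {
        long *finished = above_set[k];  /* row just completed      */
        above_set[k]   = centre_set[k];
        centre_set[k]  = below_set[k];
        below_set[k]   = finished;
        zero_finger(below_set[k]);      /* fresh row of zeroes     */
    }
}
\end{verbatim}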

We have now completed our considerations on what hardware additions
and modifications are required for a minimal implementation
of $\Gal{2}$ \anti ing, as might be retrofitted to an existing
\VR\ system.
In summary,
the hardware described, if implemented in full, will propagate
forward in time, from frame to frame,
moving images that are rendered by the display processor.
If an object appears to expand on the display, the display
hardware will, in most situations, detect this fact, and fill in''
the holes so produced.
If objects that were previously obscured become unobscured,
the system will fall back to conventional, $\Gal{0}$ methods,
and will simply leave the previous contents of those areas unchanged,
as debris.

These properties of the $\Gal{2}$ display
system that we have outlined are achievable today, on existing systems.
Designers of brand new \VR\ systems, however, can do even
better than this.
We invite such designers to glance at the further enhancements
suggested in section~\sect{Enhancements}: implementing these
promises to result in a display technology
that will be truly powerful, and, it might be argued,
for the first time in history worthy of a place in the
new, exciting field of commercial \VR.

In the previous section,
the minimal additions to hardware necessary to implement
\Galn\ \anti ing immediately, on an existing \VR\ system, were
described.
In this section, we consider, in a briefer form, the
software changes that are needed to drive such a system.
Detailed formulas will, in the interests of space, not be listed;
but, in any case, they are simple to derive from first
principles, following the guidelines that will be given here.

Our first topic of consideration must be the \e{display processor},
which has been treated in a fairly nebulous way up to now.
The display processor must, at the very least, scan-convert all
of the appropriately clipped, transformed primitives passed to it
from further down the graphical pipeline.
In $\Gal{0}$ display systems, this consists of several logical tasks
(which may, of course, be merged together in practice),
which are as follows:
determine the rasterised boundary of the primitive;
scan through the pixels in the interior of the boundary
(except in wire-frame systems);
compute $z$-buffer information for each interior pixel,
using its value to determine visibility;
and compute appropriate shading information for each
visible interior pixel.
A \Galn\ \anti ed system adds two more tasks to this procedure
(which may, however, be carried out in parallel with the others):
interpolation of velocity and acceleration information across
the filled primitive;
and a computation of colour-information time derivatives.

The first of these tasks, the interpolation of velocity and
acceleration information, is particularly simple to implement if the
logical display device is a planar viewport on the virtual world,
and if all of the primitives to be rendered are planar (polygons,
lines, \etc).
In such systems, the linear nature of the perspective transformation
means that lines in physical space correspond to lines in display
space, and, as a consequence,
planar polygons in physical space correspond to planar polygons in
display space.
But if we take successive time derivatives of these statements,
we find that the velocities, accelerations, \etc, of the various
points on a line, or the surface of a polygon, are simply
\e{linear interpolations} of the velocities of the vertices of
the line or polygon in question.
Thus, the same algorithms that are used to linearly interpolate
$z$-buffer values can be used for the velocity and acceleration
data also; we simply need a few more bits of hardware to carry this out.
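
For instance, a scanline's worth of velocity values might be filled
in as follows (a C sketch; in a real system these would be fixed-point
galpixel fields, interpolated incrementally just as $z$-buffer values
are):

\begin{verbatim}
/* Screen-space velocity, as carried by each galpixel */
typedef struct { double vx, vy; } Vel;

/* Linearly interpolate velocity across a scanline span of a filled
   polygon, from the values at its left and right edge crossings;
   the same incremental scheme used for z-buffer values applies.   */
void interp_velocity(Vel left, Vel right, int x_left, int x_right,
                     Vel span[])
{
    int n = x_right - x_left;
    for (int i = 0; i <= n; i++) {
        double t = (n == 0) ? 0.0 : (double)i / (double)n;
        span[i].vx = left.vx + t * (right.vx - left.vx);
        span[i].vy = left.vy + t * (right.vy - left.vy);
    }
}
\end{verbatim}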

Computing the time derivatives of the colour data, on the other
hand, is a little more complicated.
In general, one must simply write down the complete set
of fundamental equations
for the illumination model and shading model of choice, starting
from the
fundamental physical inputs of the virtual world
(such as the position, angle,
brightness, \etc\ of each light
source;
the direction of surface normals of the polygons and the
positions of their vertices; and so on), and proceeding
right through
to the ultimate equations that yield the red, green and blue
colour components that are to be applied to each single pixel.
One must then
carefully take the time-derivative of this complete set of
equations, to compute the first derivative of the red, green and
blue components in terms of the fundamental physical quantities
(such as light source positions, \etc), and their derivatives.
This is easier said than done; the resulting equations
can look very nasty indeed.
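
To give the flavour of what is involved (an illustrative fragment
only, not any particular system's full shading model), consider a
single Lambertian diffuse term $k_d\,(N\vdot L)$, with diffuse
coefficient $k_d$, unit surface normal $N$ and unit light direction
$L$; the product rule immediately gives
\beqn{DiffuseDerivExample}
\f{d}{dt}\brac{k_d\,(N\vdot L)}=k_d\braces{\dot{N}\vdot L
+N\vdot\dot{L}},
\eeqn
so even this simplest of terms drags in the rates of change of both
the surface normal and the light direction; each term of a realistic
model spawns derivative terms in the same way.
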
So far, however, the described
procedure has been purely mathematical: no brains at all
needed there, just a sufficiently patient computer (whether human
or machine) to compute derivatives.
The real intelligence comes in distilling, from this jungle of
equations, those relations that have the most effect,
\e{psychologically}, on the perceived temporal smoothness of the colour
information presented in a real-life \VR\ session.
There are innumerable questions that need to be investigated
in this regard;
for example, if our shading model interpolates intensities across
a surface, should we take the time derivative of the interpolation,
or should we interpolate the time derivatives?
The former approach will clearly lead to a greater \e{consistency}
between updates, but the latter is almost certainly easier to
compute in real life.
Such questions can only be answered with a sufficient amount of
focused, quality research.
The author will not illustrate his ignorance of this research any
further, but will rather leave the field to the experts.

A more subtle problem arises when we consider \e{optically
sophisticated models}.
By this term, we mean the use of a rendering model that simulates,
to a greater or lesser extent, some of the more subtle aspects of
optics in real life: the use of shadows;
the portrayal of transparent and translucent objects;
the use of ray-tracing.
While these techniques are (for want of gigaflops)
only modelled crudely, if at all,
in current \VR\ systems, there is nevertheless a more or less
unstated feeling that these things will be catered for in the future,
when processing power increases.
For sure, spectacular simulations of the real world are vitally
important proof-of-concept tools in conventional computer graphics;
and there are a large number of potentially powerful applications
that can make excellent use of such capabilities.
These applications---and, indeed, new ones---will also
be, in time, feasible targets for the \VR\ industry: simulating
the real world will be a powerful domestic and commercial tool.
But \VR\ will be a poor field indeed if it is limited to simply
simulating \RR\ with ever-increasing levels of realism.
Rather, it will be the imaginative and efficient programmes of
\e{virtual world design} that will be the true cornerstone of the
\VR\ industry in years to come.
Do not replicate the mistakes of Pen Computing---otherwise
known as ``How You Can
Build a \$5000 Electronic Gadget to Simulate a \$1 Notepad''---\noone\
will be very interested.

To illustrate what a mistake it would be to pursue \RR\ simulation
to the exclusion of everything else, it is sufficient to note
that even the field of ``sexy graphics''
(that can produce images of teddy-bears and office interiors
practically indistinguishable from the real thing)
is usually only equipped to deal with a very small subset of the
physical properties of the real world.
Consider, as an example, what happens if one
shines a green light off a large, flat, reflecting
object that is coming towards you.
What do you see?
Obviously, a green light.
What happens if the object is moving towards you very fast?
So what; it still looks green.
But what if it is moving \e{really} fast, from a universal point of
view? What then?
Ah, well, then the real world diverges from computer graphics.
In the real world, the object looks blue; on the screen, however,
it still looks green.

``But that's cheating,'' a computer graphics connoisseur argues,
``Who cares about the Doppler effect for light in real life?''
Excluding the fact that thousands of cosmologists would suddenly
jump up and down shouting ``Me! Me!'', just imagine
that we \e{do} want to worry about it: what then?
Well, perhaps in the days of punch cards and computational
High Priests, the answer would have been,
``Too bad. Stop mucking around.
We only provide \RR\ simulations here.''
But this attitude will not be tolerated for a minute in today's
``computationally liberated'' world.

Of course, the algorithms for ray-tracing may be modified, quite
trivially, to take the Doppler effect into account.
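
(For the record, the modification really is trivial; the following C
sketch shows the relativistic longitudinal Doppler shift that a
ray-tracer would apply per ray, for a source closing head-on at speed
$\beta=v/c$. Resampling the shifted spectrum into RGB is the only
fiddly part, and is omitted here.)

\begin{verbatim}
#include <math.h>

/* Wavelength observed from a source approaching head-on at speed
   beta = v/c (relativistic longitudinal Doppler effect).  At
   beta = 0.1, a 530 nm green emerges at about 479 nm: blue-ish.  */
double doppler_wavelength(double lambda_source, double beta)
{
    return lambda_source * sqrt((1.0 - beta) / (1.0 + beta));
}
\end{verbatim}
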
But what if we now wanted to look at a lensed image of a distant
quasar (still a real-world situation here, not a virtual one);
what then?
Ah, well, yes, well we'd have to program General Relativity in
as well.
Er\ldots, OK, I think I can program in continuous fields.
The photoelectric effect?
Optical fibres?
Umm\ldots, yes, well\ldots, well what on Earth do you want all
this junk for anyway?!

The point, of course, is that, on Earth, we don't.
But that doesn't mean that people want their brand-new
fancy-pants ray-traced \VR\ system to be the
late-1990s version of a Pen Computer!
If someone wants to view a virtual world in the infra-red, or
ultra-violet, or from the point of view of high-energy gamma
rays for that matter,
why stop them?
What a \VR\ system must do, as best it can, is stimulate
our senses in ways that we can comprehend, \e{but for the sole
purpose of being information inputs to our brains}.
If we want to simulate the ray-traced, sun-lit, ultra-low-gravity,
ultra-low-velocity, ultra-low-temperature
world that is our home on the third planet around
an average-sized star, two-thirds of the way out of a pretty
ordinary-looking galaxy, then that is fine.
Such applications will, in many cases, be enormously beneficial
(and profitable) to the participants.
But it is a very small fragment of what \e{can}---and will---be done.

So where is all of this meaningless banter leading?
The basic point is this: already, with wire-frame graphics,
the early \VR\ systems were able to give a surprisingly good
feeling of ``presence'' in the virtual world.
Flat-shaded polygons make this world a lot ``meatier''.
Interpolative shading, curved surfaces and textures make the virtual
world look
just that little bit nicer.
However, we are rapidly reaching the saturation limit of how much
information can be absorbed, via these particular senses,
by our minds; making the \e{visible} virtual
world even more ``realistic'' will not lead, in itself, to much
more of an appreciation of the information we are trying to understand
with the technology (although history already has places
reserved for the many advances in stimulating our \e{other}
senses that will heighten the experience---aural
presence being excitingly close to being
commercially viable;
force and touch being investigated vigorously;
smell and taste being in the exploratory phases).
When you walk around a (\RR) office, for example,
do you have to stop and stare in awe because
someone turns another light on?
Can you still recognise your car if it is parked in the shade?
Does the world look hallucinogenically
different under fluorescent lighting
than under incandescent lighting?
The simple fact is that \e{our brains have evolved to normalise out
much of this lighting information as extraneous}; it makes sexy-looking
demos, but so do Pen Computers.

It may be argued, by aficionados of the field of optically
sophisticated modelling, that the
author's opinions on this topic are merely sour grapes:
``machines can't do it in real-time right now, so I don't want it
anyway.''
This could not be further from the truth:
ideas on how these sophisticated approaches might be reconciled
with the techniques outlined in this \typeofdoc\ are already
in the germination stage; but they are of a purely speculative
nature, and will remain so
until hardware systems capable of generating these
effects in practical virtual-world-application situations
become available.
Implementing optically sophisticated techniques
in \VR\ will, indeed, be an important commercial application
for the technology in years to come.
But this is all it will be: \e{an} application, not the whole
field.
It will not be part of general-purpose \VR\ hardware; it will
be a special-purpose, but lucrative, niche market.
It will be a subdiscipline.

All that remains is for \VR\ consumers
to decide What on Virtual Earth they
want to do with the rest of their new-found power.

The previous section of this \typeofdoc\ outlined
how a minimal modification
of an existing \VR\ system might be made to
incorporate \Galn\ \anti ing.
In this section, further enhancements,
not required for such a minimal implementation,
are contemplated.
Firstly,
a slight interlude away
from the rigours of \Galn\ \anti ing is taken:
in section~\ssect{WrapAround}, a simple-to-follow
review of the fundamental problems facing the designers of
\e{wrap-around head-mounted display systems} is offered;
the technical material offered, however, is not in any way new.
After this respite,
section~\ssect{LocalUpdate} renews the charge on
\Galn\ \anti ing proper, and suggests further ways in which
\VR\ systems can be optimised to present visual
information to the participant in the most psychologically
convincing way that technological constraints will allow.

In conventional computing environments, the display device is generally
flat and rectangular, or, in light of technological constraints on
CRT devices, as close to this geometry as can be attained.
The display device is considered to be a ``window on the virtual
world'': it is a planar rectangular viewport embedded in the real
world that the viewer can,
in effect, ``look through'', much as one looks out a regular glass
window at the scenery (or lack of) outside.

In a similar way, most of the mass-marketed \e{audio} reproduction equipment
in use around the world aims to simply offer the aural equivalent of
a window: a number of ``speakers'' (usually two, at the present time)
reproduce the sounds that the author of the audio signal has created;
but such systems have, with the exception of quadraphonic systems and
the relatively recent Dolby Surround Sound, not attempted to portray
to the listener any sense of \e{participation}: the listener is (quite literally)
simply a part of the \e{audience}---even if the reproductive qualities
of the system are of such a high standard that the listener may well
believe, with eyes closed, that they are actually sitting in the audience
of a live performer.

``Multimedia'' computing---which has only exploded commercially in the
past twelve months---heightens
the effectiveness of the ``window on a virtual world'' experience greatly,
drawing together our two most informationally-important physical senses
into a cohesive whole.
High-fidelity television and video, in a similar way,
present a read-only'' version of such experiences.

\VR, on the other hand, has as its primary goal \e{the removal of the
window-frame, the walls, the ceiling, the floor}, that separate the
virtual world from the real world.
While the prospect of the
demise of the virtual window might be slightly saddening to Microsoft
Corporation (who will, however, no doubt release Microsoft Worlds 1.0
in due time),
it is an extremely exciting prospect for the
average man in the street.
Ever since Edison's invention of the phonograph just over a century
ago, the general public has been accustomed to being
``listeners''---then, later, ``viewers'' (\e{\`a}~la Paul Hogan's
immortal greeting, ``G'day viewers'')---and finally, in the computer
age, ``operators''; now, for the first time in history, they
have the opportunity of being completely-immersed \e{participants} in
their virtual world, with the ability to mould and shape it to suit
their own tastes, their own preferences, their own idiosyncrasies.

Attaining this freedom, however, requires that the participant is
convinced that they \e{are}, truly, immersed in the virtual world,
and that they have the power to mould it.
There are many, many challenging problems for \VR\ designers
to consider to ensure that this goal is in fact fulfilled---many of
them still not resolved satisfactorily, or in some cases not at all.
In this \typeofdoc, however, we are concerned primarily with the visual
senses, to which we shall restrict our attention.
In sections~\sect{BasicPhilosophy}
and~\sect{MinimalImplementation}, we have considered how best to match
computer-generated images to our visual motion-detection senses.
However, we have, to date, assumed that the ``window''
philosophy underlying traditional computer graphics is still appropriate.
It is this unstated assumption that we shall now investigate more
critically.

In traditional computer graphics, the user is seated
in front of a rectangular graphics display of some kind.
The controlling
software generates images that are either inherently 2-dimensional
in nature; are ``$2\half$-dimensional'' (\ie\ emulating three dimensions
in some ways, but which do not have all of the degrees of freedom of
a true 3-dimensional image);
or are true perspective views of 3-dimensional objects.
Clearly, the former two cases are not relevant to immersive
\VR\ discussions,
and do not need to be considered further.
The last case, however---perspective 3-dimensional viewing---forms the
very basis of current \VR\ display methodology.
By and large, the concepts, methods, algorithms, and vast experience
gained in the field of traditional 3-dimensional computer
graphics development
have been carried across to the field of \VR\ unchanged.
The reasons for such a strong linkage between these two fields
are many---some historical, some simply
practical: Sutherland's pioneering efforts in both; the relative
recency of large-scale \VR\ development following the emergence
of enabling technologies; the many technical
problems common to both;
the common commercial parentage of many of the pivotal research
groups in both; and so on.
However, it is clear that, at some point in time, the field
of \VR\ must eventually be weaned off its overwhelming dependency
on the field of Computer Graphics, for the health of both.
Of course, there will remain strong bonds between the two---we will
rue the day when a Master of \VR\ student
comes up with the ``novel'' idea of collaborating with his counterparts
in the Computer Graphics department---but, nevertheless, the cord must
be cut, and it must be cut soon.

The reader may, by now, be wondering why the author is pushing
this point so strongly---after all, aren't \VR\ displays just like
any other computer-graphics displays anyway?
Well, as has already been shown in sections~\sect{BasicPhilosophy}
and~\sect{MinimalImplementation}, this is
not the case: new problems require new solutions; and, conversely,
old dogs are not always even \e{interested} in new tricks (as
the millions of still-contented DOS users can testify).
But more than this, there is, \e{and always will be},
a fundamental difference
between the field of \e{planar} Computer Graphics, and the
subdiscipline of the field of \VR\ that will deal
with visual displays: namely, \e{the human visual system is
intrinsically curved}: even without moving our heads, each of our eyes can
view almost a complete hemisphere of solid angle.
Now, for the purposes of traditional computer graphics, this observation
is irrelevant: the display device is \e{itself} flat; thus, images
\e{must} be computed as if projected onto a planar viewing plane.
Whether the display itself is large or small, high-resolution or low,
it is always a planar ``window'' for the viewer.
But the aims of \VR\ are completely different: we wish to \e{immerse}
the participant, as completely as technically possible, in the virtual
world.
Now our terminology subtly shifts: since we ideally want to cover each
of the participant's eye's \e{entire} almost-hemispherical field of view,
the \e{size} and \e{physical viewing distance} of the display
are irrelevant: all we care about are \e{solid angles}---not ``diagonal
inches'', not ``pixels per inch'', but rather pixels per unit
\e{solid angle}.
We are working in a new world; we are subject to new constraints;
we have a gaggle of new problems---and,
currently, only a handful of old solutions.

``Surely,'' a sceptic might ask, ``can't one always cover one's field
of view using a planar display, by placing the eye `sufficiently close'
to it, and using optical devices to make the surface focusable?''
The theoretical answer to this question is, of course, in the affirmative:
since the angular range viewable by the human eye is less than
180~degrees in any direction (a violation of which would require
considerable renovations to the human anatomy), it \e{is}, indeed,
always possible to place a plane of finite area in front of the eye such
that it covers the entire field of view.
However, theoretical answers are not worth their salt in the real world;
our ultimate goal in designing a \VR\ system is to \e{maximise the
convinceability} of the virtual-world experience,
given the hardware and software resources that are available to us.

How well, then, does the planar display configuration
(as suggested by our sceptic above) perform in real life?
To answer this question at all, quantitatively,
requires consideration of the \e{human}
side of the equation: What optical
information is registrable by our eyes?
The capabilities of the human visual system have,
in fact, been investigated
in meticulous detail; we may call on this research to give us precise
quantitative answers to almost any question that we might wish to ask.
It will, however, prove sufficient for our purposes to consider only
a terribly simplistic subset of this information---not, it is hoped,
offending too greatly those who have made such topics
of research their life's work---to get a reasonable ``feel'' for
what we must take into account.
In order to do so, however, we shall need to have an accurate way
of portraying \e{solid angles}.
Unfortunately, it is intrinsically impossible to represent solid
angles without distortion on a flat sheet of paper, much less by
describing them in words.
Seeing that we are still far from having ``\VR\ on every desk'',
it is also not currently possible to use the technology itself to
portray this information.
The reader is therefore asked to obtain
the following pieces of hardware so that a mathematical construction
may be carried out: a mathematical compass---preferably one with a
device that locks the arms after being set in position;
one red and one blue ballpoint pen, that both fit in the compass;
a texta (thick-tipped fibre marker), or a whiteboard marker;
a ruler, marked in millimetres (or, in the US, down to at least
sixteenths of an inch);
a simple calculator; a pair of scissors, or a knife;
an orange, or a bald tennis ball;
a piece of string, long enough to circumnavigate the orange or
tennis ball;
and a roll of sticky tape.
Readers whose research budgets have been cut so far that this
hardware is prohibitively expensive
should skip the next few pages.

The first task is to wrap a few turns of sticky tape around the sharp
point of the compass's ``pivot'' arm, slightly extending past the point.
This is to avoid puncturing the orange or tennis ball
(as appropriate); orange-equipped readers that like orange juice,
and do not have any objections to licking their apparatus, may omit
this step.

The next task is to verify that the orange or tennis ball is as close
to spherical as possible for objects of their type, and suitable for
being written on by the ballpoint pens.
If this proves to be the case, pick up the piece of string and
wrap it around the orange or ball; do not worry if the string
does not yet follow a great circle.
Place your right thumb
on top of the string near where it overlaps itself (but not on top of that
point).
With the fingers of your left hand, roll the far side of the string
so that the string becomes more taut; let it slip under your right thumb
only gradually; but make sure that no parts of the string ``grab'' the
surface of the orange or
ball (except where your right thumb is holding it!).
After the rolled string passes through a great circle, it will become
loose again (or may even roll right off).
Without letting go with the right thumb,
mark the point on the string where it crosses its other end with the
texta.
Now put the orange or ball down, cut the string at the mark,
and dispose of the
part that did not participate in the circumnavigation.
Fold the string in half, pulling it taut to
align the ends.
At the half-way fold, mark it with the texta.
Then fold it in half again, and mark the two unmarked folds with the texta.
On unravelling the string, there should be three texta marks,
indicating the quarter, half and three-quarter points along its length.
Now pull the string taut along the ruler and measure its length.
This is the circumference of the orange or ball; store its value in the
calculator's memory (if it has one), or else write it down: we will
use it later.

We now define some geographical names for places on our sphere,
by analogy with the surface of the Earth.
The North Pole of the orange will be defined as the point where the stem
was attached; possessors of tennis balls should choose some small
identifiable mark on the surface of the ball
to denote the North Pole.
Mark this pole with a small `N' with the blue ballpoint pen.
Similarly, the small mark $180\degrees$ from the North Pole on an orange's
surface will be termed the South Pole.
Tennis-ballers, however, will need to use the piece of string to find
this point, as follows: Wind the string around the ball,
placing the two ends at the North Pole, and, as before, roll it
carefully until it is taut; inspect it from the side to ensure that
it appears to ``dissect'' the ball into equal halves.
The half-way texta mark on the string is now over the South Pole; mark
this spot on the orange with the blue ballpoint pen.

From this point, it will be assumed, for definiteness, that the object
is an orange; possessors of tennis balls can project a mental image of
the texture of an orange onto the surface of their ball, if so desired.
Place the string around the orange, making sure the North and South
Poles are aligned.
(If possessors of oranges find, at this point, that the mark on the
orange is not at the half-way point on the string, then either mark
a new South Pole to agree with the string, or get another orange.)
Now find the one-quarter and three-quarter points on the string,
and use the texta to make marks on the orange at these two points.
Put the string down.
Choose one of the two points just marked---the one whose surrounding
area is most amenable to being written on.
This point shall be called \e{Singapore}
(being a recognisable city, near the Equator, close to the centre of
conventional planar maps of the world);
write a small ``S'' next to it on the orange with the pen.
(This mark cannot be confused with the South Pole, since it is not
diametrically opposite the North!)
The marked point diametrically opposite Singapore will be called \e{Quito};
it may be labelled, but will not be used much in the following.
Next, wind the string around the orange, so that it passes through
all of the following points: the two Poles, Singapore and Quito.
Use the blue ballpoint pen to trace around the circumference, completing
a great circle through these points, taking care that the string is
accurately aligned; this circle will be referred to henceforth as the
\e{Central Meridian}.
(If the pen does not write, wipe the orange's surface dry, get the
ink flowing from the pen by scribbling on some paper, and try again.
Two sets of ballpoint pens can make this procedure easier.)
Now wrap the string around the orange, through the poles, but roughly
$90\degrees$ \e{around from} Singapore and Quito; \ie\ if viewing
Singapore head-on, the string should now look like a circle surrounding
the globe.
This alignment need not be exact; the most important thing is that
the string start and end on the North Pole, and pass over the South Pole.
Make a small \e{ballpoint} mark at the one-quarter and three-quarter
positions of the string.
Now wind the string around the orange so that it passes over both of
these two new points, as well as over both Singapore and Quito.
Trace this line onto the orange.
This is the Equator.

We are now in a position to start to relate this orangeography to
our crude mapping of the human visual system.
We shall imagine that the surface of the orange represents the solid
angles seen by the viewer's eye, by imagining that the viewer's eye
is located at the \e{centre} of the orange; the orange would (ideally)
be a transparent sphere, fixed to the viewer's head,
on which we would plot the extent of her view.
Firstly, we shall consider the situation when the viewer is looking
straight ahead at an object at infinity,
with her head facing in the same direction.
We shall, in this situation, define the direction of view (\ie\ the
direction of \e{foveal view}---the most detailed vision, at the centre
of our field of view)
as being in the \e{Singapore} direction (with the North Pole towards the
top of her head).
One could imagine two oranges, side by side, one centred on each of
the viewer's eyes, with both Singapores in the same direction; this
would represent the viewer's direction of foveal view from each eye
in this situation.
Having defined a direction thus, the oranges should now
be considered to be \e{fixed} to the viewer's head for the remainder of
this section.

We now concentrate solely on the \e{left} eye of the viewer, and
the corresponding orange surrounding it.
We shall, in the following,
be using the calculator to compute lengths on the ruler
against which we shall set our compass;
readers who will be using solar-powered credit-card-sized
models, bearing in large fluorescent letters the words
``ACME CORPORATION---FOR ALL YOUR COMPUTER NEEDS'',
should at this point relocate themselves to a suitably sunny
location to avoid catastrophic system shut-downs.
The compass, having been set using the ruler to the number spat out by the
calculator, will be
used both to measure ``distances'' between points
inhabiting the curved surface of the orange,
and to actually draw circles on the thing.

The first task is to compute how long 0.122 circumferences is.
(For example, if the circumference of the orange was 247~mm,
punch ``$247\times0.122=$'' on the calculator, which yields the answer
$30.134$; ignore fractions of millimetres.
Readers using Imperial units will need to convert back and forth
between fractions of an inch.)
Put the \e{red} ballpoint pen into the compass,
and set its arms, with tips against the ruler,
so that the tips are separated by this
length of 0.122 circumferences (whatever the calculator gave for that
value).
Now centre the \e{pivot} of the compass on Singapore, and draw a
circle on the orange with the red pen (which is often easier said than
done, but \e{is} possible).
The solid angle enclosed by this red circle (subtended, as always,
at the centre of the orange---where the viewer's eye is assumed to be
located) indicates, in rough terms,
all of the possible directions that our viewer can ``look directly at'';
in other words,
the muscles of her eye can rotate her eyeball so that any direction
in this solid angle is brought into central foveal view.
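
For readers who would rather let silicon check the citrus, the following
C fragment converts compass settings, expressed as fractions of the
circumference, into the central angles that they span.
It is a purely illustrative sketch (the function names are invented, not
from any library), and it rests on the assumption, implicit in the
construction above, that a compass tip-to-tip distance is a straight-line
chord through the interior of the orange.

\begin{verbatim}
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Central angle (radians) spanned by a compass setting given as a
   fraction of the circumference: chord = (C/pi) * sin(angle/2).    */
double chord_angle(double fraction)
{
    return 2.0 * asin(fraction * M_PI);
}

int main(void)
{
    double circumference = 247.0;   /* mm; the example used above */
    double settings[] = { 0.122, 0.048, 0.2 };
    int i;

    for (i = 0; i < 3; i++)
        printf("%.3f circ. = %2.0f mm = %2.0f degrees\n",
               settings[i], settings[i] * circumference,
               chord_angle(settings[i]) * 180.0 / M_PI);
    return 0;
}
\end{verbatim}

On this assumption, the 0.122-circumference setting of the red circle
corresponds to an angular radius of about $45\degrees$ of eyeball
rotation, and the 0.2-circumference setting used below to about
$78\degrees$; the latter figure will reappear when we discuss the span
of peripheral vision.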

It will be noted that, all in all, this solid angle of foveal view is
``not too badly curved'', when looked at in three dimensions.
Place a \e{flat plane} (such as a book)
against the orange, so that it touches the
orange at Singapore.
One could imagine cutting out the peel of the orange around this
red circle, and ``flattening it'' onto the plane of the book without
too much trouble; the outer areas would be stretched (or, if dry and
brittle, would fracture), but overall the correspondence between the
section of peel and the planar surface is not too bad.
(Readers possessing multiple oranges, who do not mind going through the
above procedure a second time, might actually like to try this
peel-cutting and -flattening exercise.)
The surface of the plane corresponding to the flattened peel
corresponds, roughly, to the maximum (apparent) size that a
traditional, desk-mounted graphics screen can use: any larger
than this and the user would need to \e{move her head}
to be able to focus on all parts of the screen---a somewhat tiring
exercise.
Thus, it can be seen that it is \e{the very structure of our eyes}
that allows flat display devices to be so successful: any device
subtending a small enough solid angle that all points can be ``read''
(\ie\ viewed in fine detail)
without gross movements of one's head cannot have
problems of ``wrap-around'' anyway.

\VR\ systems, of course, have completely different goals:
the display device is not considered to be a ``readable''
object, as it is in traditional computing environments---rather,
it is meant
to be a convincing, \e{completely immersive} stimulation of our
visual senses.
In such an environment, \e{peripheral vision} is of vital importance,
for two reasons.
Firstly, and most obviously, the convinceability of the \VR\ session
will suffer if the participant has ``blinkers on'' (except, of course,
for the
singular case in which one is trying to give normally-sighted people
an idea of what various sight disabilities look like from the
sufferer's point of view).
Secondly, and quite possibly more importantly, although our peripheral
vision is no good for \e{detailed} work,
it is especially good at detecting \e{motion}.
Such feats are not of very much use in traditional computing environments,
but are vital if a \VR\
participant is to (quite literally) ``get her
bearings'' in the spatially-extended immersive environment.
We must therefore get some sort of feeling---using
our orange-mapped model---for the range of human peripheral
vision, so that we might cater for it satisfactorily
in our hardware and software implementations.

The following
construction will be a little more complicated than the earlier ones;
a ``check-list'' will be presented at the end of it so that the reader
can verify that it has been carried out correctly.
Firstly, set the compass tip-separation to a distance (measured,
as always, on the ruler) equal to $0.048$ circumferences (a relatively
small distance).
Place the pivot on Singapore, and mark off the position to the \e{east}
(right) of Singapore where the pen arm intersects the Equator.
We are now somewhere near the Makassar Strait.
Now put the \e{blue} ballpoint pen into the
compass, and set its tip-to-tip distance to $0.2$ circumferences;
this is quite a large span.
Now, \e{with the pivot on the new point marked in the Makassar Strait},
carefully draw a circle on the orange.
The large portion of solid angle enclosed by this circle
represents, in rough terms, the range of our peripheral vision.
To check that the
construction has been carried out
correctly, measure the following distances, by
placing the compass tips on the two points mentioned, and then measuring
the tip-to-tip distance on the ruler:
From Singapore to the point
where this freshly-drawn circle cuts the Central Meridian (either
to the north or south): about 0.18 circumferences;
from Singapore to the point where the circle cuts the Equator to the
east: about 0.24 circumferences;
and from Singapore to the point where the circle cuts the Equator to the
west: about 0.16 circumferences.
If these are roughly correct, one can,
in fact, also check that the \e{earlier}, foveal
construction is correct, by measuring the tip-to-tip distance between
the red and blue circles where they cross the Equator and Central
Meridian.
To the west, this distance should be about 0.04 circumferences; to
the north and south, about 0.08 circumferences.
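
Those who mistrust their string can also check these figures numerically.
The following sketch (illustrative only, and again assuming that compass
tip-to-tip distances are chords) computes the check-list distances from
the construction; its output agrees with the values quoted above to
within the accuracy one can reasonably expect of an orange.

\begin{verbatim}
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* central angle (radians) spanned by a compass setting given as a
   fraction of the circumference (tip-to-tip distances are chords)  */
double chord_angle(double f) { return 2.0 * asin(f * M_PI); }

/* inverse: compass setting (fraction of the circumference) that
   spans a given central angle                                      */
double chord_fraction(double a) { return sin(a / 2.0) / M_PI; }

int main(void)
{
    double delta = chord_angle(0.048);  /* Singapore -> Makassar Strait */
    double rho   = chord_angle(0.2);    /* radius of the blue circle    */
    double red   = chord_angle(0.122);  /* radius of the red circle     */

    /* latitude at which the blue circle cuts the Central Meridian:
       cos(rho) = cos(delta) * cos(latitude)                           */
    double lat = acos(cos(rho) / cos(delta));

    printf("Singapore to blue circle, Meridian:  %.2f circ.\n",
           chord_fraction(lat));
    printf("Singapore to blue circle, east:      %.2f circ.\n",
           chord_fraction(delta + rho));
    printf("Singapore to blue circle, west:      %.2f circ.\n",
           chord_fraction(rho - delta));
    printf("red to blue along the Equator, west: %.2f circ.\n",
           chord_fraction(rho - delta - red));
    printf("red to blue along the Meridian:      %.2f circ.\n",
           chord_fraction(lat - red));
    return 0;
}
\end{verbatim}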

There are several features of the range of our peripheral vision that
we can now note.
Firstly, it can be seen that our eyes can actually look
a little \e{behind} us---the left eye to the left, and the right
eye to the right.
(Wrap the string around the orange, through the poles, $90\degrees$
east of Singapore; the peripheral field of view cuts behind the
forward hemisphere.)
To verify that this is, indeed, correct, it suffices to stand
with one's face against a tall picket fence: with one's nose
between the pickets, and looking straight ahead, one can still see
things going on on \e{one's own} side of the fence!
That our field of vision is
configured in this way
is not too surprising when it is noted that one's \e{nose} blocks
one's field of view in the other direction (a region which is, of
course, covered
by the other eye); hence, our eyes and faces
have been appropriately angled so that our eyes' fields of vision
are put to most use.
But this also means that it is \e{not} possible to project the view
from both eyes onto \e{one single} plane directly in front of our
faces.
Of course, we would intend to use two planar displays anyway, to
provide stereopsis; but this observation is important when it comes
to \e{clipping} objects for display (for which a single planar clip,
to cover \e{both} eyes, is thus impossible).
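
To put this observation in computational terms, one may crudely model the
left eye's field of view as a cone of half-angle roughly $78\degrees$
about an axis offset roughly $17\degrees$ to the left of straight ahead
(figures read off the orange; the sketch below is illustrative only,
and is not a clipping algorithm).
A visible direction can then lie more than $90\degrees$ from straight
ahead, and such a direction has no finite projection onto any single
frontal plane.

\begin{verbatim}
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

typedef struct { double x, y, z; } vec3;

double dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* a unit direction is visible if it lies within the cone:
   axis . dir >= cos(half-angle)                                */
int visible(vec3 axis, double cos_half, vec3 dir)
{
    return dot(axis, dir) >= cos_half;
}

int main(void)
{
    double rho   = 78.0 * M_PI / 180.0;  /* cone half-angle         */
    double delta = 17.0 * M_PI / 180.0;  /* axis offset to the left */

    /* z points straight ahead, x to the viewer's left              */
    vec3 axis = { sin(delta), 0.0, cos(delta) };

    /* a direction 4 degrees behind the frontal (x--y) plane        */
    double a = 94.0 * M_PI / 180.0;
    vec3 dir = { sin(a), 0.0, cos(a) };

    printf("direction behind the frontal plane visible? %s\n",
           visible(axis, cos(rho), dir) ? "yes" : "no");
    return 0;
}
\end{verbatim}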

Secondly, we note that a reasonable approximation is to consider
the point near the Makassar Strait to be the centre'' of a circle
of field of view for the left eye.
(This unproved assertion by the author may be verified as approximately
correct by an examination of ocular data.)
The field of view extends about $75\degrees$--$80\degrees$
in each direction from this point, or about a $150\degrees$--$160\degrees$
span'' along any great circle.
This is, of course, a little shy of the full $180\degrees$ that a
hemispherical view provides, but not by much (as an examination of the
orange reveals).
Now imagine that
one was to cut the peel of the orange out around the large (blue)
circle of peripheral vision, and were then to try and ``flatten it''
onto a plane.
It would be a mess!
An almost-hemisphere is something that does not map well to a flat plane.

Let us reconsider, nevertheless,
our earlier idea of placing a planar display
device in front of each eye (with appropriate optics for focusing).
How would such a device perform?
Let us ignore, for the moment, the actual size of the device, and merely
consider the \e{density of pixels} in any direction of sight.
For this purpose, let us assume that the pixels are laid out on a
regular rectangular grid (as is the case for real-life display devices).
Let the distance between the eye and the plane (or the
effective plane, when employing optics) be $R$ pixel widths (\ie\ we
are using the pixel width as a unit of distance, not only in the
display plane, but also in the orthogonal direction).
Let us measure first along the $x$ axis of the display, which we shall
take to have its origin at the point on the display
pierced by a ray going through the eye and the Makassar Strait point
(which may be visualised by placing a plane, representing the display,
against the orange at the Makassar Strait).
Let us imagine that we have a head-mountable display of resolution (say)
$512\times512$, which we want to cover the entire field of view (with
a little wastage in the corners).
Taking the maximum field of view span from the Makassar Strait to be
about $80\degrees$, we can now compute $R$ from simple trigonometry.
Drawing a triangle connecting the eye, the Makassar Strait and
the extremum pixel on the axis, and insisting that this extremum
pixel be just at the edge of the field of view (\viz\ $80\degrees$ away),
we obtain
$\tan80\degrees=\f{256}{R},$
where we have noted that half of the display will extend to the other
side of the origin, and hence the centre-to-extremum distance in the
plane is 256 pixels.
Computing the solution to this equation, we have
$R=\f{256}{\tan80\degrees}\approx45.1\mbox{ pixels}.$
For a planar device of resolution $512\times512$ to just cover the entire
field of view, this eye--plane distance is fixed absolutely;
\VR\ displays, unlike traditional computing environments, by their
nature specify \e{exactly} how far away the display plane must be;
this is no longer a freedom of choice.
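
In code, the computation runs as follows; this is an illustrative sketch
only, using the example figures above.

\begin{verbatim}
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int main(void)
{
    double half_width = 256.0;               /* half of 512 pixels */
    double half_fov   = 80.0 * M_PI / 180.0; /* edge of view       */

    /* tan(half_fov) = half_width / R                              */
    double R = half_width / tan(half_fov);

    printf("eye--plane distance R = %.1f pixel widths\n", R);
    printf("central pixel subtends %.2f degrees\n",
           (1.0 / R) * 180.0 / M_PI);
    return 0;
}
\end{verbatim}
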
The most important question then arises: how good a resolution does
this represent, if we simply look straight ahead?
Well, in this direction, we have the approximation $dx\approx R\,d\th$,
where $d\th$ is the linear angle (in radians)
subtended by the pixel of width $dx$.
Since the pixel width $dx$ is, by definition, $1$~pixel-width (our unit
of distance), we therefore find that
$d\th\approx\f{dx}{R}\approx\f{1}{45.1}\approx0.0222\mbox{ radians} \approx1.27\degrees.$
So how big is a $1.27\degrees$-wide pixel?
Let us compare this with familiar quantities.
The pixel at the centre of the $640\times480$ VGA display mode, viewed on
a standard IBM 8513 monitor from a distance of half a metre,
subtends an angle of about 2.2~arc-\e{minutes}, which is approaching
the limit of our foveal vision.
Thus, our planar-display wrap-around central pixel looks about the
same as a $35\times35$ square does on a VGA $640\times480$ display---\e{it
is huge}!
But \e{why} is it so huge?
After all, 512 pixels of a VGA display subtend about $18.6\degrees$ of arc
at the same half-metre viewing distance;
stretch this out to cover $160\degrees$ instead and each pixel should
get bigger in each direction by a factor of about $160/18.6=8.6$.
So why are we getting an answer that is four times worse in
linear dimensions (or sixteen times worse in terms of area)
than this argument suggests?
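
For those following along at a keyboard, the arithmetic of the comparison
may be checked with the following sketch (illustrative only; the input
figures are those quoted above).

\begin{verbatim}
#include <stdio.h>

int main(void)
{
    double vga_pixel = 2.2;          /* arc-minutes, at half a metre */
    double central   = 1.27 * 60.0;  /* our central pixel, arc-min   */
    double vga_span  = 18.6;         /* degrees, 512 VGA pixels      */

    printf("linear ratio to a VGA pixel: %.0f\n", central / vga_pixel);
    printf("naive stretch factor:        %.1f\n", 160.0 / vga_span);
    printf("discrepancy (linear):        %.0f\n",
           (central / vga_pixel) / (160.0 / vga_span));
    return 0;
}
\end{verbatim}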

The answer comes from considering, as a simple example, the very
\e{edgemost} pixel in the $x$-direction on the display---the one that is
$80\degrees$ away from the Makassar Strait point, just visible on
the edge of our viewer's peripheral vision.
How much angle does \e{this} pixel subtend?
An out-of-hat answer would be ``the same as the one in the middle''---after
all, the rectangular grid is regularly spaced.
But this reasoning relies, for its approximate truth, on the fact that
\e{conventional} planar displays only subtend a small total angle anyway.
On the contrary, we are now talking about a planar device subtending
a \e{massive} angle; paraxial
approximations are rubbish in this environment.
The exact relationship between display position and viewing angle is
$x=R\tan\th,$
where $R=45.1$ pixels in our example,
and $\th$ is the angle between the pixel's
apparent position and the Makassar Strait.
Inverting this relationship we have
$\th=\artan\paren{\f{x}{R}},$
and, on differentiation,
$d\th=\f{R\,dx}{x^2+R^2}.$
For $dx=1$ at $x=0$ we find $d\th\approx1/45.1\approx0.0222$ radians, or
$1.27\degrees$, as before; this only applies at $x=0$.
For $dx=1$ at $x=256$, on the other hand, we find that
$d\th=\f{45.1}{256^2+45.1^2}\approx0.000667\mbox{ radians} \approx0.038\degrees.$
Thus, the pixels at the outermost edges of the visible display are
subtending a linear angle about 33 times \e{smaller} than at the
centre of the display---or, in other words (counting the transverse
direction as well), it would take
about 190 extremity-pixels to subtend the same solid angle as the
\e{central} pixel!
But this is crazy!
We already know that the peripheral vision, outside the range of
foveal tracking (the red circle on the orange), is sensitive
to significantly \e{less} detail than the fovea!
Why on earth are we putting such a high pixel density out there?
Who ordered that?
Sack him!
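
The fall-off from centre to edge may be tabulated directly from the
formula above; the following sketch (illustrative only) does so at a few
positions across half of the display, exhibiting the factor of about 33
between the central and edgemost pixels.

\begin{verbatim}
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int main(void)
{
    double R = 45.1;   /* eye--plane distance, in pixel widths */
    double x;

    /* theta = arctan(x/R), so a one-pixel step at x subtends
       d(theta) = R dx / (x*x + R*R)                            */
    for (x = 0.0; x <= 256.0; x += 64.0)
        printf("x = %3.0f: pixel subtends %.3f degrees\n",
               x, (R / (x * x + R * R)) * 180.0 / M_PI);
    return 0;
}
\end{verbatim}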

%
%  ...File 4 of 4 should be concatenated here ...


