Indigo, a Portable Implementation of Forth for Unix Workstations

By: R. J. Brown
Elijah Laboratories, Inc.

Introduction

Indigo is a portable implementation of Forth. It is written entirely in C and in itself. It is intended to run on 32-bit machines with two's complement binary integers; more specifically, it is intended to run on 32-bit Unix workstations supporting X-Windows with OSF Motif. Indigo supports an address space of 29 bits, or 512 MW, which is 2 GB. This should be adequate for a while.

Addressing & Tokens

Indigo uses a token-threading scheme to facilitate object-oriented programming using the Dreams metaphor. It also has the capability to bypass the token threading to permit direct binding to an object without respect to its name, or to changes in the definition of that object's name. See the paper "Viva La Spaghetti", which discusses this concept.

Indigo has a branch, or goto, instruction that may be either token threaded or direct threaded. This may be used to implement control constructs such as IF ELSE THEN and DO LOOP, etc. It may also be used to perform tail recursion optimization.

All addressing in Indigo is either primitive, meaning the address of a built-in Indigo word written in C, or threaded, meaning the address of an interpretable definition. Addressing is either indirect, meaning a token that is translated by the token translation table into the absolute address of either a primitive or a threaded definition, or relative, meaning the offset from itself to another cell in the dictionary. It is not possible to address a primitive with relative addressing. Relative addressing permits objects to be copied without corrupting the meaning associated with branch and call destinations to other cells in that same object given as relative addresses. Absolute addressing permits objects to be copied without corrupting the meanings of direct addresses referencing words external to the copied object.

Stacks & Multi-Stacks

Indigo has a number of stacks. In addition to the classic Forth parameter and return stacks, Indigo keeps a set of stacks to control the Dreams object-oriented system. It also keeps a stack of interpreter contexts, called threads. These are used to control the execution of the Forth interpreter.

Indigo uses a structure-sharing multi-stack implementation of stacks. This permits stacks that have a common ancestry to share structure among themselves. These multi-stacks are implemented as trees. The trees are composed of nodes called joints. Each joint has a value, called the fruit, where the data associated with that position of the tree is stored. Each joint also has a pointer, called the trunk, that points to its ancestor, in the direction of the root of the tree. Each joint has a counter, called branches, that keeps a count of the number of branches of the tree that branch off of this joint. This is equivalent to a count of the number of joints whose trunk points to this joint. The branches field is a reference count. It is used to eliminate the need for a garbage collector. Joints are returned to free storage immediately as they are released from use in the tree. This occurs most frequently during pop operations on the stack.
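As a rough sketch, a joint and its push and pop operations might be declared in C along the following lines. The names and details are illustrative, not Indigo's actual source, and the sketch glosses over bookkeeping a real implementation needs, such as accounting for stack pointers that reference a shared joint directly:

    #include <stdlib.h>

    typedef long cell;                    /* a 32-bit Forth cell on the target machines */

    typedef struct joint {
        cell          fruit;              /* the datum stored at this position          */
        struct joint *trunk;              /* pointer toward the root of the tree        */
        long          branches;           /* count of joints whose trunk points here    */
    } joint;

    /* Push: grow a new joint whose trunk is the current top of this stack. */
    static joint *push(joint *top, cell value)
    {
        joint *j = malloc(sizeof *j);
        j->fruit    = value;
        j->trunk    = top;
        j->branches = 0;
        if (top != NULL)
            top->branches++;
        return j;                         /* the new joint becomes the new top          */
    }

    /* Pop: return the ancestor as the new top, and free the old top at once
       if no other branch of the tree still hangs from it.                    */
    static joint *pop(joint *top)
    {
        joint *up = top->trunk;
        if (top->branches == 0) {
            if (up != NULL)
                up->branches--;
            free(top);
        }
        return up;
    }

Because the reference count is maintained as the tree grows and shrinks, a joint is reclaimed the moment nothing depends on it, which is how the need for a garbage collector is eliminated.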
Continuations & Interpreter Context

Indigo keeps the context information for the interpreter in a data structure called a thread. Threads are analogous to the continuations found in Scheme; they are called stack groups on the Symbolics Lisp Machine.

During the execution of a threaded definition, a call may be made to another definition, or subroutine. When this happens, Indigo pushes the current thread onto a stack known as the fabric. The top thread of the current fabric is the current context of the Forth interpreter. In a classical Forth system, all that gets saved is the instruction pointer, IP. In Indigo, the IP is only part of the context that gets saved: all of the current stack pointers get saved as well. With the multi-stack implementation of stacks, this allows Indigo to save the complete status of each stack by saving its stack pointer.

The subroutine linkage mechanism, which is part of the Forth engine, then creates a new thread as a spin-off of the old thread. This new thread inherits all the context information of the parent thread as its initial value, but it is an object unto itself, with an existence of its own. The subroutine executes in the environment of this spin-off thread. When the subroutine returns, the instruction pointer saved in the parent thread's IP slot is copied into the IP slot of the spin-off thread. Then the parent thread is replaced by the spin-off thread. The parent thread is destroyed, returning its resources to the available pool. The pool of available threads is called the warp. The list of active threads is called the weft.

Exception Handling

The Forth mechanism of CATCH and THROW is beautifully implemented using the thread concept. In addition, all of the techniques of continuations, and the theory of prompts, such as encountered in the Scheme dialect of Lisp, are readily available in Indigo.

Failure

In Indigo, the execution of a definition may fail, much as in SNOBOL or Icon. Since all subroutines execute in their own private environment, a subroutine may return failure instead of returning normally. When this happens, its thread is pulled off the fabric and destroyed, allowing its parent thread to survive. This results in all stack effects and Dream bindings that were changed by that subroutine being undone, and the entire, unaltered environment of the caller is restored. Only global side effects produced by the subroutine remain. If the subroutine modifies no non-local variables and performs no I/O, then it will be as though it were never called at all.

This can be quite useful in a debugging situation. If a break occurs giving control to a debugger, then the debugger may typically provide facilities for the programmer to inspect the stacks, and even to modify them. If a routine is determined to be incorrect, then the source code for the routine may be edited and recompiled. Note that only that one routine is recompiled. The new definition is then substituted for the old definition. The programmer may then cause the old invocation of the subroutine to fail, restoring the environment, and all its stacks, etc., to what they were just before the call was made. The IP is then backed up to point to the call to the new version of the subroutine, and execution is resumed.

This technique allows interruption of a program inside the execution of a subroutine, modification of the definition of that subroutine, and then retrying the call to that subroutine. All this is done without disturbing the execution context of the rest of the program. If a large program takes several minutes to compile, and several hours to run up to the point where a bug occurs, then this technique can save enormous amounts of time in debugging. What would have required recompiling the entire program and rerunning it from the beginning up to the point of the bug becomes recompiling a single definition and restarting execution at the point immediately before the call to the buggy subroutine, saving the hours of execution it took to reach the bug.
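A minimal sketch of this call/return/failure discipline, reusing the cell and joint types sketched earlier, might read as follows. The structure layout and function names are assumptions for illustration, and the warp is reduced here to plain malloc and free:

    #include <stdlib.h>

    /* A thread captures the complete interpreter context: the instruction
       pointer plus the current top of every multi-stack.                   */
    typedef struct thread {
        cell          *ip;                /* instruction pointer                     */
        joint         *param;             /* top of the parameter multi-stack        */
        joint         *rtn;               /* top of the return multi-stack           */
        /* ... tops of the Dreams stacks, etc. ...                                   */
        struct thread *caller;            /* next thread down in the fabric          */
    } thread;

    /* Call: keep the parent on the fabric and spin off a copy of its
       context for the callee to execute in.                              */
    static thread *call(thread *parent, cell *entry)
    {
        thread *spin = malloc(sizeof *spin);
        *spin = *parent;                  /* inherit the parent's entire context     */
        spin->caller = parent;
        spin->ip     = entry;             /* parent->ip already points past the call */
        return spin;
    }

    /* Normal return: keep the callee's stack effects, resume at the
       instruction saved in the parent, and retire the parent.            */
    static thread *ret(thread *spin)
    {
        thread *parent = spin->caller;
        spin->ip     = parent->ip;
        spin->caller = parent->caller;
        free(parent);
        return spin;
    }

    /* Failure: discard the callee's context entirely; the parent survives
       with every stack pointer exactly as it was when the call was made.  */
    static thread *fail(thread *spin)
    {
        thread *parent = spin->caller;
        free(spin);
        return parent;
    }

Because a failure simply discards the spin-off thread, every stack pointer reverts to the joint it held when the call was made; the structure sharing of the multi-stacks makes that reversion a pointer copy rather than a data copy.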
Fabrics & Multi-Tasking

Indigo maintains a stack of threads, called a fabric, for each concurrently executing process in the system. Each fabric is sufficient to characterize the entire process, except for global side effects. Since each fabric is an object, and the underlying stack is a multi-stack, the entire multi-stack characterizes the status of one task in the Indigo system. Each active process is represented as such a multi-stack.

This entire set of fabrics is called the garment. It covers the Indigo system, giving it its total appearance, much as a suit of clothing covers a person. As clothes make the man, garments make the application. A garment may be a single thread, like a garment knit from one color, and hence a single process, or it may be composed of as complicated a concurrently executing collection of processes as is desired.

When multiple concurrently executing processes are present, some kind of scheduler is usually desired. To schedule the multiple fabrics, one must manipulate the order in which the leaves of the garment are executed. The literature is full of various scheduling techniques, each better suited than another to a particular problem. For this reason, Indigo does not provide a generic scheduler at this time, but it does provide all the hooks to allow a programmer to roll his own.

Dynamic Binding and the Environment

Indigo's token translation scheme translates a Forth execution token into an execution address by using indirect addressing. The token translation table implements this mapping. The token is used as a subscript into the token translation table array; the value fetched from that slot in the array is the execution address. Since the slots in the token translation table may be written as well as read, a modification of the translation table effects a modification in the binding of the word referenced by the associated token. This modification of bindings is used by the Dreams object-oriented programming package.

Since each concurrently executing fabric is able to modify the token translation table independently, each fabric needs its own copy of the table. This permits reasonably rapid context switching when changing the Forth engine from one fabric to another. Thus, when a thread is pulled from a fabric to start a new fabric, a copy of the token translation table is made at that time.

Modifications to the active token table, or env, need to persist as side effects upon return from subroutine calls, but need to be undone in the event of a failure return or a throw. Rather than have a separate copy of the env for each thread, a single copy is shared among all the threads in a fabric. This eliminates the storage burden of providing a separate token translation table for each thread. It also eliminates the time burden of initializing a new table for each thread. Remember that a new thread is spun each time a subroutine is called, so these burdens would be quite prohibitive indeed.
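The mechanism described above amounts to an indexed table plus a copy operation when a new fabric is spun. A hypothetical sketch in C, with an assumed table size and illustrative names:

    #include <stdlib.h>
    #include <string.h>

    #define ENV_SIZE 65536L               /* assumed table size; one slot per token   */

    typedef void *xt_addr;                /* execution address of a definition        */

    typedef struct env {
        xt_addr slot[ENV_SIZE];           /* the token translation table              */
    } env;

    /* Translate: the token is simply a subscript into the table. */
    static xt_addr translate(env *e, long token)
    {
        return e->slot[token];
    }

    /* Rebind: writing a slot changes what every token-threaded reference
       to this word will execute from now on.                             */
    static void rebind(env *e, long token, xt_addr new_def)
    {
        e->slot[token] = new_def;
    }

    /* Starting a new fabric: it receives its own private copy of the env,
       so its rebindings cannot disturb any other fabric.                  */
    static env *copy_env(const env *parent)
    {
        env *fresh = malloc(sizeof *fresh);
        memcpy(fresh, parent, sizeof *fresh);
        return fresh;
    }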
In order to maintain the historical information needed to restore the binding environment to the state it had when a thread was spun, without incurring too much overhead in the process, a change history is maintained for the token translation table. Each time a token's translation is modified, this history list is searched. If the token is already present on the list, then nothing special need be done; the slot corresponding to that token is modified, and that is it. If the token is not found on the list, then it is added to the list, together with a copy of the original contents of the token's translation table slot. This way, original values are backed up only when they are first modified.

In order to minimize the search time needed to determine whether a particular token's translation has been modified by this thread, or by a thread spun off of this thread, a linked data structure known as the slush fund is employed. It is called the slush fund because it is used to clean up the environment in case of emergencies, such as a throw or a failure. Each slot in the token translation table points to a structure known as a pollutant. The pollutant is a 6-tuple containing the execution address associated with that token, the token associated with that pollutant, a serial number, the thread that was the agent causing the change in the binding, a shallow binding history chain of events relating all of the changes to that token, and a deep binding history chain of events relating all of the changes to that environment, represented as the token translation table associated with that fabric. All of this information is maintained by the ecology routines as bindings are altered by threads, and as threads return, whether normally, by failure, or by throw.

CASE Tools

Indigo is intended to host the initial version of BROOT, the Bilateral Recursive Object Oriented Tasker, and to be the cross-development platform for BROOT-based applications. The design of this tasker is intended to mirror the way hardware is designed. Each task is analogous to a component in a logic or schematic diagram. It has an input connector and an output connector. These connectors have input or output pins, respectively. Each of these pins has a mechanism to determine whether its value has changed since the last time its associated component accessed it. This way, the system is data-driven: no recomputation of a result is performed if the inputs have not changed, and no recomputation is performed if a later component requests the latest value when that value has not changed. This allows both demand-driven execution when an input to the system changes state and polled execution after the manner of "streams", where no recomputation of an unchanged value is performed.

The dispatching strategy is designed around each component running to completion each time it is invoked, rather than looping and waiting. This permits the local state to be maintained in memory allocated to the component instance and its associated I/O pins instead of on the two Forth stacks. The dispatcher will only allocate stack memory to a component when it has been triggered by input conditions that indicate its execution is required. When the component exits, the dispatcher will release the stack memory and distribute the changed signals on the component's output pins according to their associated signal daisy chains. This, in turn, will cause the capture methods of the receiving input pins to save whatever information about the signal is required by the associated component, and optionally to set flags to communicate with the trigger method.
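A hypothetical sketch of such a pin and component, together with the data-driven test a dispatcher might apply, is given below; the names and layout are assumptions for illustration, not BROOT's actual definitions:

    typedef long cell;                    /* a Forth cell, as in the earlier sketches */

    /* A pin remembers whether its value has changed since the owning
       component last consumed it.                                        */
    typedef struct pin {
        cell value;                       /* current signal value                     */
        int  changed;                     /* set when a new value arrives             */
    } pin;

    typedef struct component {
        pin  *in;    int n_in;            /* input connector and its pin count        */
        pin  *out;   int n_out;           /* output connector and its pin count       */
        void (*process)(struct component *self);  /* the component's process method   */
    } component;

    /* Data-driven rule: run the process method only when some input pin
       has changed since the component last looked at it.                 */
    static void dispatch(component *c)
    {
        int i, changed = 0;

        for (i = 0; i < c->n_in; i++)
            if (c->in[i].changed)
                changed = 1;

        if (!changed)
            return;                       /* nothing new: do not recompute            */

        for (i = 0; i < c->n_in; i++)
            c->in[i].changed = 0;         /* mark every input as consumed             */

        c->process(c);                    /* output distribution would follow here    */
    }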
After all changed signals have been distributed, all affected components will have their trigger method run in the dispatcher's environment to determine whether the component needs to compete for the CPU on a priority basis. If so, the memory allocator attempts to find a suitable stack buffer for the highest priority component. If stack space is available, then it is assigned to the component, and the component is placed on the dispatch queue in priority order. If stack space is not available, the dispatcher attempts to find a lower priority component on the dispatch queue that has not yet started execution. Such a component has not yet used its stacks, so the dispatcher may "steal" that stack space and give it to the higher priority component instead. In this case, the component from which the stack space was stolen must be removed from the dispatch queue and placed on the waiting-for-stack-space queue instead. If no such lower priority component can be found, then a check is made to determine whether any lower priority component that has started execution has a suitable stack, and that no higher priority component has such a stack space. If so, then that component's priority is temporarily promoted to the priority of the waiting component. This ensures that a higher priority component is only minimally blocked by a lower priority one when memory is running low.

The notion of components and connections used in BROOT is actually only a metaphor for the "data flow graph" paradigm. BROOT will support a component as being a dataflow sub-graph with a single input and a single output connector, so that a component may actually be a graph at a lower level. If a component is not a graph at a lower level, then it is atomic. Each atomic component has an associated process method that is a Forth word, either primitive or threaded. It is this word that executes when the component is dispatched. The capture, trigger, and output methods all execute in the dispatcher's environment, but the process method executes in its own environment. The other methods should be quite short in practice, and since BROOT is targeted at embedded systems applications, there is no need for protection of one piece of code from another.

Since the real-time system is represented as a graph, network analysis techniques are suitable design tools. A critical path may be determined through any non-cyclic sub-graph in the system, and the time to execute that critical path may be determined by summing the times to execute each component. Recursive application of this rule will eventually result in the need to determine the execution time for each process method, but since these are Forth words, and their threaded compiled definitions are readily available in the development environment, they too may be recursively decomposed into primitives. The primitives, in the case of a FORCE processor, are machine instructions that each execute in a single clock cycle, so the worst-case number of clock cycles for any non-cyclic path through the system may be determined automatically. For cycles, which correspond to loops in a conventional architecture, the designer must establish a worst-case value for the number of iterations before the cycle will generate an output that is sent outside of the cycle itself. This determines the time for the sub-graph containing the cycle to execute. The sub-graph is then wrapped up as a module and treated like any other component: the cyclic nature has been hidden at a higher level, and only the time to execute remains important.

By means of this kind of designer/machine interaction in the development environment, the response time of the system may be determined. What's more, the determination is auditable and repeatable: a very desirable trait for a real-time system. If a subsequent edit changes the definition of any of the components, redetermination of the response time should be fully automatic. Only when the network topology has been changed with respect to cycles does the designer need to take part in the response time determination.
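The timing rule just described might be sketched as follows, under the simplifying assumption that the critical path through each composite component has already been extracted from its sub-graph; the names are illustrative only:

    /* Worst-case timing rule: an atomic component contributes its own cycle
       count; a composite component contributes the sum of the components
       that lie along the critical path through its sub-graph.              */
    typedef struct timed_component timed_component;

    struct timed_component {
        long              cycles;         /* worst-case cycles, if atomic            */
        int               n_path;         /* 0 for an atomic component               */
        timed_component **path;           /* components along the critical path      */
    };

    static long worst_case_cycles(const timed_component *c)
    {
        long total = 0;
        int  i;

        if (c->n_path == 0)               /* atomic: a known machine cost            */
            return c->cycles;

        for (i = 0; i < c->n_path; i++)   /* composite: sum along the critical path  */
            total += worst_case_cycles(c->path[i]);

        return total;
    }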
The development environment spoken of here is intended to be highly graphic and interactive. The system as a whole shall be viewed on a graphics display screen as a dataflow graph, with lines connecting components. The components themselves are displayed as user-definable icons. The lines connecting them may represent more than one signal from one component to another. If the mouse is moved to a line and the button pressed, the names of all the signals that line represents will be displayed. If the mouse is then dragged to a particular signal's name, all the components to which that signal is distributed will be highlighted. If the mouse is released at this position, the highlighting will remain in effect, but if the mouse is dragged away from all signal names before being released, then the highlighting will go away.

If the mouse is pressed on a component, the component will be highlighted. If the mouse is dragged, the component will move, and all the lines connecting to that component will move with it, remaining anchored at their other ends. This permits the designer to arrange and rearrange the appearance of the dataflow graph at will. If the mouse is clicked on a component, that component and all signal lines connected to it will be highlighted. If the mouse is pressed on that component, a menu will pop up with various choices on it, such as: input, output, trigger, process, locals, stack.

Input and output will expand the respective connectors and show which signals are on the changed lists. A similar operation will expand a single input or output pin, and the capture, retrieval, output, and transmission methods may be displayed as Forth source screens. Trigger will likewise pop up the screen editor on the source for the trigger method word. Process will either pop up the screen editor if the method is a Forth word, or else the screen will dive in one level deeper and display the dataflow sub-graph of the module, showing a trace of the nesting by a stack of the icons that have been expanded. Return from this recursive browsing is accomplished by clicking on the icon to return to. The screen may be split and manipulated using all the standard techniques provided by OSF Motif and X-Windows, so an earlier level may be held on the screen at the same time as a later one by dragging its icon to the desired place on the screen.

Since each component is a member of a class, the class as a whole may be manipulated by manipulating the icon for the class. Each instance has an instance name associated with it, so many instances of the same icon are differentiated by unique names. The instance names may be viewed by pointing to the icon and pressing a mouse key.