Indigo, a Portable Implementation of Forth for Unix Workstations

By: R. J. Brown
Elijah Laboratories, Inc.

Introduction

Indigo is a portable implementation of Forth. It is written entirely in C and in itself. It is intended to run on 32-bit machines with two's complement binary integers; more specifically, it is intended to run on 32-bit Unix workstations supporting X-Windows with OSF Motif. Indigo supports an address space of 29 bits, or 512 MW, which is 2 GB. This should be adequate for a while.

Addressing & Tokens

Indigo uses a token-threading scheme to facilitate object-oriented programming using the Dreams metaphor. It also has the capability to bypass the token threading to permit direct binding to an object without respect to its name, or to changes in the definition of that object's name. See the paper "Viva La Spaghetti", which discusses this concept.

Indigo has a branch, or goto, instruction that may be either token threaded or direct threaded. This may be used to implement control constructs such as IF ELSE THEN and DO LOOP, etc. It may also be used to perform tail recursion optimization.

All addressing in Indigo is either primitive, meaning the address of a built-in Indigo word written in C, or threaded, meaning the address of an interpretable definition. Addressing is either indirect, meaning a token that is translated by the token translation table into the absolute address of either a primitive or a threaded definition, or relative, meaning the offset from itself to another cell in the dictionary. It is not possible to address a primitive with relative addressing. Relative addressing permits objects to be copied without corrupting the meaning associated with branch and call destinations to other cells in that same object given as relative addresses. Absolute addressing permits objects to be copied without corrupting the meanings of direct addresses referencing words external to the copied object.

Stacks & Multi-Stacks

Indigo has a number of stacks. In addition to the classic Forth parameter and return stacks, Indigo keeps a set of stacks to control the Dreams object-oriented system. It also keeps a stack of interpreter contexts, called threads. These are used to control the execution of the Forth interpreter.

Indigo uses a structure-sharing multi-stack implementation of stacks. This permits stacks that have a common ancestry to share structure among themselves. These multi-stacks are implemented as trees. The trees are composed of nodes called joints. Each joint has a value, called the fruit, where the data associated with that position of the tree is stored. Each joint also has a pointer, called the trunk, that points to its ancestor, in the direction of the root of the tree. Each joint has a counter, called branches, that keeps a count of the number of branches of the tree that branch off of this joint. This is equivalent to a count of the number of joints whose trunk points to this joint. The branches field is a reference count. It is used to eliminate the need for a garbage collector. Joints are returned to free storage immediately as they are released from use in the tree. This occurs most frequently during pop operations on the stack.
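As a rough sketch, a joint and its push and pop operations might be declared in C along the following lines. The names and details are illustrative, not Indigo's actual source, and the sketch glosses over bookkeeping a real implementation needs, such as accounting for stack pointers that reference a shared joint directly:

    #include <stdlib.h>

    typedef long cell;                    /* a 32-bit Forth cell on the target machines */

    typedef struct joint {
        cell          fruit;              /* the datum stored at this position          */
        struct joint *trunk;              /* pointer toward the root of the tree        */
        long          branches;           /* count of joints whose trunk points here    */
    } joint;

    /* Push: grow a new joint whose trunk is the current top of this stack. */
    static joint *push(joint *top, cell value)
    {
        joint *j = malloc(sizeof *j);
        j->fruit    = value;
        j->trunk    = top;
        j->branches = 0;
        if (top != NULL)
            top->branches++;
        return j;                         /* the new joint becomes the new top          */
    }

    /* Pop: return the ancestor as the new top, and free the old top at once
       if no other branch of the tree still hangs from it.                    */
    static joint *pop(joint *top)
    {
        joint *up = top->trunk;
        if (top->branches == 0) {
            if (up != NULL)
                up->branches--;
            free(top);
        }
        return up;
    }

Because the reference count is maintained as the tree grows and shrinks, a joint is reclaimed the moment nothing depends on it, which is how the need for a garbage collector is eliminated.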
Continuations & Interpreter Context

Indigo keeps the context information for the interpreter in a data structure called a thread. Threads are analogous to the continuations found in Scheme; they are called stack groups on the Symbolics Lisp Machine.

During the execution of a threaded definition, a call may be made to another definition, or subroutine. When this happens, Indigo pushes the current thread onto a stack known as the fabric. The top thread of the current fabric is the current context of the Forth interpreter. In a classical Forth system, all that gets saved is the instruction pointer, IP. In Indigo, the IP is only part of the context that gets saved: all of the current stack pointers get saved as well. With the multi-stack implementation of stacks, this allows Indigo to save the complete status of each stack by saving its stack pointer.

The subroutine linkage mechanism, which is part of the Forth engine, then creates a new thread as a spin-off of the old thread. This new thread inherits all the context information of the parent thread as its initial value, but it is an object unto itself, with an existence of its own. The subroutine executes in the environment of this spin-off thread. When the subroutine returns, the instruction pointer saved in the parent thread's IP slot is copied into the IP slot of the spin-off thread. Then the parent thread is replaced by the spin-off thread. The parent thread is destroyed, returning its resources to the available pool. The pool of available threads is called the warp. The list of active threads is called the weft.

Exception Handling

The Forth mechanism of CATCH and THROW is beautifully implemented using the thread concept. In addition, all of the techniques of continuations, and the theory of prompts, such as encountered in the Scheme dialect of Lisp, are readily available in Indigo.

Failure

In Indigo, the execution of a definition may fail, much as in SNOBOL or Icon. Since all subroutines execute in their own private environment, a subroutine may return failure instead of returning normally. When this happens, its thread is pulled off the fabric and destroyed, allowing its parent thread to survive. This results in all stack effects and Dream bindings that were changed by that subroutine being undone, and the entire, unaltered environment of the caller is restored. Only global side effects produced by the subroutine remain. If the subroutine modifies no non-local variables and performs no I/O, then it will be as though it were never called at all.

This can be quite useful in a debugging situation. If a break occurs giving control to a debugger, then the debugger may typically provide facilities for the programmer to inspect the stacks, and even to modify them. If a routine is determined to be incorrect, then the source code for the routine may be edited and recompiled. Note that only that one routine is recompiled. The new definition is then substituted for the old definition. The programmer may then cause the old invocation of the subroutine to fail, restoring the environment, and all its stacks, etc., to what they were just before the call was made. The IP is then backed up to point to the call to the new version of the subroutine, and execution is resumed.

This technique allows interruption of a program inside the execution of a subroutine, modification of the definition of that subroutine, and then retrying the call to that subroutine. All this is done without disturbing the execution context of the rest of the program. If a large program takes several minutes to compile, and several hours to run up to the point where a bug occurs, then this technique can save enormous amounts of time in debugging. What would have required recompiling the entire program and rerunning it from the beginning up to the point of the bug becomes recompiling a single definition and restarting execution at the point immediately before the call to the buggy subroutine, saving the hours of execution it took to reach the bug.
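A minimal sketch of this call/return/failure discipline, reusing the cell and joint types sketched earlier, might read as follows. The structure layout and function names are assumptions for illustration, and the warp is reduced here to plain malloc and free:

    #include <stdlib.h>

    /* A thread captures the complete interpreter context: the instruction
       pointer plus the current top of every multi-stack.                   */
    typedef struct thread {
        cell          *ip;                /* instruction pointer                     */
        joint         *param;             /* top of the parameter multi-stack        */
        joint         *rtn;               /* top of the return multi-stack           */
        /* ... tops of the Dreams stacks, etc. ...                                   */
        struct thread *caller;            /* next thread down in the fabric          */
    } thread;

    /* Call: keep the parent on the fabric and spin off a copy of its
       context for the callee to execute in.                              */
    static thread *call(thread *parent, cell *entry)
    {
        thread *spin = malloc(sizeof *spin);
        *spin = *parent;                  /* inherit the parent's entire context     */
        spin->caller = parent;
        spin->ip     = entry;             /* parent->ip already points past the call */
        return spin;
    }

    /* Normal return: keep the callee's stack effects, resume at the
       instruction saved in the parent, and retire the parent.            */
    static thread *ret(thread *spin)
    {
        thread *parent = spin->caller;
        spin->ip     = parent->ip;
        spin->caller = parent->caller;
        free(parent);
        return spin;
    }

    /* Failure: discard the callee's context entirely; the parent survives
       with every stack pointer exactly as it was when the call was made.  */
    static thread *fail(thread *spin)
    {
        thread *parent = spin->caller;
        free(spin);
        return parent;
    }

Because a failure simply discards the spin-off thread, every stack pointer reverts to the joint it held when the call was made; the structure sharing of the multi-stacks makes that reversion a pointer copy rather than a data copy.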
Fabrics & Multi-Tasking

Indigo maintains a stack of threads, called a fabric, for each concurrently executing process in the system. Each fabric is sufficient to characterize the entire process, except for global side effects. Since each fabric is an object, and the underlying stack is a multi-stack, the entire multi-stack characterizes the status of one task in the Indigo system. Each active process is represented as such a multi-stack.

This entire set of fabrics is called the garment. It covers the Indigo system, giving it its total appearance, much as a suit of clothing covers a person. As clothes make the man, garments make the application. A garment may be a single thread, like a garment knit from one color, and hence a single process, or it may be composed of as complicated a concurrently executing collection of processes as is desired.

When multiple concurrently executing processes are present, some kind of scheduler is usually desired. To schedule the multiple fabrics, one must manipulate the order in which the leaves of the garment are executed. The literature is full of various scheduling techniques, each better suited than another to a particular problem. For this reason, Indigo does not provide a generic scheduler at this time, but it does provide all the hooks to allow a programmer to roll his own.

Dynamic Binding and the Environment

Indigo's token translation scheme translates a Forth execution token into an execution address by using indirect addressing. The token translation table implements this mapping. The token is used as a subscript into the token translation table array; the value fetched from that slot in the array is the execution address. Since the slots in the token translation table may be written as well as read, a modification of the translation table effects a modification in the binding of the word referenced by the associated token. This modification of bindings is used by the Dreams object-oriented programming package.

Since each concurrently executing fabric is able to modify the token translation table independently, each fabric needs its own copy of the table. This permits reasonably rapid context switching when changing the Forth engine from one fabric to another. Thus, when a thread is pulled from a fabric to start a new fabric, a copy of the token translation table is made at that time.

Modifications to the active token table, or env, need to persist as side effects upon return from subroutine calls, but need to be undone in the event of a failure return or a throw. Rather than have a separate copy of the env for each thread, a single copy is shared among all the threads in a fabric. This eliminates the storage burden of providing a separate token translation table for each thread. It also eliminates the time burden of initializing a new table for each thread. Remember that a new thread is spun each time a subroutine is called, so these burdens would be quite prohibitive indeed.
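The mechanism described above amounts to an indexed table plus a copy operation when a new fabric is spun. A hypothetical sketch in C, with an assumed table size and illustrative names:

    #include <stdlib.h>
    #include <string.h>

    #define ENV_SIZE 65536L               /* assumed table size; one slot per token   */

    typedef void *xt_addr;                /* execution address of a definition        */

    typedef struct env {
        xt_addr slot[ENV_SIZE];           /* the token translation table              */
    } env;

    /* Translate: the token is simply a subscript into the table. */
    static xt_addr translate(env *e, long token)
    {
        return e->slot[token];
    }

    /* Rebind: writing a slot changes what every token-threaded reference
       to this word will execute from now on.                             */
    static void rebind(env *e, long token, xt_addr new_def)
    {
        e->slot[token] = new_def;
    }

    /* Starting a new fabric: it receives its own private copy of the env,
       so its rebindings cannot disturb any other fabric.                  */
    static env *copy_env(const env *parent)
    {
        env *fresh = malloc(sizeof *fresh);
        memcpy(fresh, parent, sizeof *fresh);
        return fresh;
    }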
In order to maintain the historical information needed to restore the binding environment to the state it had when a thread was spun, without incurring too much overhead in the process, a change history is maintained for the token translation table. Each time a token's translation is modified, this history list is searched. If the token is already present on the list, then nothing special need be done; the slot corresponding to that token is modified, and that is it. If the token is not found on the list, then it is added to the list, together with a copy of the original contents of the token's translation table slot. This way, original values are backed up only when they are first modified.

In order to minimize the search time needed to determine whether a particular token's translation has been modified by this thread, or by a thread spun off of this thread, a linked data structure known as the slush fund is employed. It is called the slush fund because it is used to clean up the environment in case of emergencies, such as a throw or a failure. Each slot in the token translation table points to a structure known as a pollutant. The pollutant is a 6-tuple containing the execution address associated with that token, the token associated with that pollutant, a serial number, the thread that was the agent causing the change in the binding, a shallow binding history chain of events relating all of the changes to that token, and a deep binding history chain of events relating all of the changes to that environment, represented as the token translation table associated with that fabric. All of this information is maintained by the ecology routines as bindings are altered by threads, and as threads return, whether normally, by failure, or by throw.

CASE Tools

Indigo is intended to host the initial version of BROOT, the Bilateral Recursive Object Oriented Tasker, and to be the cross-development platform for BROOT-based applications. The design of this tasker is intended to mirror the way hardware is designed. Each task is analogous to a component in a logic or schematic diagram. It has an input connector and an output connector. These connectors have input or output pins, respectively. Each of these pins has a mechanism to determine whether its value has changed since the last time its associated component accessed it. This way, the system is data-driven: no recomputation of a result is performed if the inputs have not changed, and no recomputation is performed if a later component requests the latest value when that value has not changed. This allows both demand-driven execution when an input to the system changes state and polled execution after the manner of "streams", where no recomputation of an unchanged value is performed.

The dispatching strategy is designed around each component running to completion each time it is invoked, rather than looping and waiting. This permits the local state to be maintained in memory allocated to the component instance and its associated I/O pins instead of on the two Forth stacks. The dispatcher will only allocate stack memory to a component when it has been triggered by input conditions that indicate its execution is required. When the component exits, the dispatcher will release the stack memory and distribute the changed signals on the component's output pins according to their associated signal daisy chains. This, in turn, will cause the capture methods of the receiving input pins to save whatever information about the signal is required by the associated component, and optionally to set flags to communicate with the trigger method.
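A hypothetical sketch of such a pin and component, together with the data-driven test a dispatcher might apply, is given below; the names and layout are assumptions for illustration, not BROOT's actual definitions:

    typedef long cell;                    /* a Forth cell, as in the earlier sketches */

    /* A pin remembers whether its value has changed since the owning
       component last consumed it.                                        */
    typedef struct pin {
        cell value;                       /* current signal value                     */
        int  changed;                     /* set when a new value arrives             */
    } pin;

    typedef struct component {
        pin  *in;    int n_in;            /* input connector and its pin count        */
        pin  *out;   int n_out;           /* output connector and its pin count       */
        void (*process)(struct component *self);  /* the component's process method   */
    } component;

    /* Data-driven rule: run the process method only when some input pin
       has changed since the component last looked at it.                 */
    static void dispatch(component *c)
    {
        int i, changed = 0;

        for (i = 0; i < c->n_in; i++)
            if (c->in[i].changed)
                changed = 1;

        if (!changed)
            return;                       /* nothing new: do not recompute            */

        for (i = 0; i < c->n_in; i++)
            c->in[i].changed = 0;         /* mark every input as consumed             */

        c->process(c);                    /* output distribution would follow here    */
    }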
After all changed signals have been distributed, all affected components will have their trigger method run in the dispatcher's environment to determine whether the component needs to compete for the CPU on a priority basis. If so, the memory allocator attempts to find a suitable stack buffer for the highest priority component. If stack space is available, then it is assigned to the component, and the component is placed on the dispatch queue in priority order. If stack space is not available, the dispatcher attempts to find a lower priority component on the dispatch queue that has not yet started execution. Such a component has not yet used its stacks, so the dispatcher may "steal" that stack space and give it to the higher priority component instead. In this case, the component from which the stack space was stolen must be removed from the dispatch queue and placed on the waiting-for-stack-space queue instead. If no such lower priority component can be found, then a check is made to determine whether any lower priority component that has started execution has a suitable stack, and that no higher priority component has such a stack space. If so, then that component's priority is temporarily promoted to the priority of the waiting component. This ensures that a higher priority component is only minimally blocked by a lower priority one when memory is running low.

The notion of components and connections used in BROOT is actually only a metaphor for the "data flow graph" paradigm. BROOT will support a component as being a dataflow sub-graph with a single input and a single output connector, so that a component may actually be a graph at a lower level. If a component is not a graph at a lower level, then it is atomic. Each atomic component has an associated process method that is a Forth word, either primitive or threaded. It is this word that executes when the component is dispatched. The capture, trigger, and output methods all execute in the dispatcher's environment, but the process method executes in its own environment. The other methods should be quite short in practice, and since BROOT is targeted at embedded systems applications, there is no need for protection of one piece of code from another.

Since the real-time system is represented as a graph, network analysis techniques are suitable design tools. A critical path may be determined through any non-cyclic sub-graph in the system, and the time to execute that critical path may be determined by summing the times to execute each component. Recursive application of this rule will eventually result in the need to determine the execution time for each process method, but since these are Forth words, and their threaded compiled definitions are readily available in the development environment, they too may be recursively decomposed into primitives. The primitives, in the case of a FORCE processor, are machine instructions that each execute in a single clock cycle, so the worst-case number of clock cycles for any non-cyclic path through the system may be determined automatically. For cycles, which correspond to loops in a conventional architecture, the designer must establish a worst-case value for the number of iterations before the cycle will generate an output that is sent outside of the cycle itself. This determines the time for the sub-graph containing the cycle to execute. The sub-graph is then wrapped up as a module and treated like any other component: the cyclic nature has been hidden at a higher level, and only the time to execute remains important.

By means of this kind of designer/machine interaction in the development environment, the response time of the system may be determined. What's more, the determination is auditable and repeatable: a very desirable trait for a real-time system. If a subsequent edit changes the definition of any of the components, redetermination of the response time should be fully automatic. Only when the network topology has been changed with respect to cycles does the designer need to take part in the response time determination.
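The timing rule just described might be sketched as follows, under the simplifying assumption that the critical path through each composite component has already been extracted from its sub-graph; the names are illustrative only:

    /* Worst-case timing rule: an atomic component contributes its own cycle
       count; a composite component contributes the sum of the components
       that lie along the critical path through its sub-graph.              */
    typedef struct timed_component timed_component;

    struct timed_component {
        long              cycles;         /* worst-case cycles, if atomic            */
        int               n_path;         /* 0 for an atomic component               */
        timed_component **path;           /* components along the critical path      */
    };

    static long worst_case_cycles(const timed_component *c)
    {
        long total = 0;
        int  i;

        if (c->n_path == 0)               /* atomic: a known machine cost            */
            return c->cycles;

        for (i = 0; i < c->n_path; i++)   /* composite: sum along the critical path  */
            total += worst_case_cycles(c->path[i]);

        return total;
    }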
The development environment spoken of here is intended to be highly graphic and interactive. The system as a whole shall be viewed on a graphics display screen as a dataflow graph, with lines connecting components. The components themselves are displayed as user-definable icons. The lines connecting them may represent more than one signal from one component to another. If the mouse is moved to a line and the button pressed, the names of all the signals that line represents will be displayed. If the mouse is then dragged to a particular signal's name, all the components to which that signal is distributed will be highlighted. If the mouse is released at this position, the highlighting will remain in effect, but if the mouse is dragged away from all signal names before being released, then the highlighting will go away.

If the mouse is pressed on a component, the component will be highlighted. If the mouse is dragged, the component will move, and all the lines connecting to that component will move with it, remaining anchored at their other ends. This permits the designer to arrange and rearrange the appearance of the dataflow graph at will. If the mouse is clicked on a component, that component and all signal lines connected to it will be highlighted. If the mouse is pressed on that component, a menu will pop up with various choices on it, such as: input, output, trigger, process, locals, stack.

Input and output will expand the respective connectors and show which signals are on the changed lists. A similar operation will expand a single input or output pin, and the capture, retrieval, output, and transmission methods may be displayed as Forth source screens. Trigger will likewise pop up the screen editor on the source for the trigger method word. Process will either pop up the screen editor if the method is a Forth word, or else the screen will dive in one level deeper and display the dataflow sub-graph of the module, showing a trace of the nesting by a stack of the icons that have been expanded. Return from this recursive browsing is accomplished by clicking on the icon to return to. The screen may be split and manipulated using all the standard techniques provided by OSF Motif and X-Windows, so an earlier level may be held on the screen at the same time as a later one by dragging its icon to the desired place on the screen.

Since each component is a member of a class, the class as a whole may be manipulated by manipulating the icon for the class. Each instance has an instance name associated with it, so many instances of the same icon are differentiated by unique names. The instance names may be viewed by pointing to the icon and pressing a mouse key.