Re: Ideas about data types

Date: Tue, 22 Jul 1997 15:14:20 -0400
From: Erik Ostrom <eostrom@research.att.com>
cc: moo-cows@the-b.org, eostrom@research.att.com
Content-Type: text/plain; charset=us-ascii
In-reply-to: Your message of "Tue, 22 Jul 1997 10:53:27 PDT." <199707221753.KAA15412@netcom5.netcom.com>

I'm going to respond to David's message by talking about what's going on in my 
current design/implementation.  It's not very close to release yet; the bright 
side of that is that there's a lot of time left for changes to be made.  
Comments welcome.

> One is that most datatypes have constant or literal values.
> These constant or literal values usually have some kind of syntactic
> form, which is distinct from any literals of all the other datatypes.

At the moment I have no plans to allow new data types to define their own 
syntax.  It would probably be _possible_ to provide hooks into the parser, but 
I'd prefer not to.  Rather, a data type T is allowed to provide a function for 
converting an arbitrary value into one of type T, and there will be a uniform 
syntax for performing that conversion.  I don't know yet what the syntax will 
be; here are some of the ideas I've kicked around:

    5 to boolean
    5 -> boolean
    5 as boolean
    boolean from 5
    (boolean) 5
    [[boolean 5]]

All of these have their problems; I won't get into the details here.  Note 
that I'm not considering any options that look like function calls; I want to 
emphasize that this is a simple type conversion.
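
To make the "one conversion function per type, one uniform way to invoke it" 
idea concrete, here is a rough sketch in Python rather than MOO code.  Every 
name in it (CONVERTERS, register_converter, convert) is made up for the 
example and does not exist in any server, and the truth rule for "boolean" is 
only an assumption:

```python
# Hypothetical sketch: none of these names exist in any MOO server.
CONVERTERS = {}  # type name -> function converting an arbitrary value


def register_converter(type_name, func):
    """Record the function that converts an arbitrary value to type_name."""
    CONVERTERS[type_name] = func


def convert(value, type_name):
    """Uniform conversion, standing in for something like `value as type_name`."""
    try:
        func = CONVERTERS[type_name]
    except KeyError:
        raise TypeError("no conversion registered for type %r" % type_name)
    return func(value)


# Assumed rule for illustration: zero, the empty string, the empty list and
# None are false; everything else is true.
register_converter("boolean", lambda v: v not in (0, "", [], None))

print(convert(5, "boolean"))  # True
print(convert(0, "boolean"))  # False
```

Because convert() takes the type name as an ordinary string, the same lookup 
would also serve for converting to a type whose name is only known at run time.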

Many types won't need a conversion function or "literal" value syntax at all; 
their values are generated by other means.  For example, consider a file I/O 
package that provides a persistent handle for an open file.  Its values are 
created not by syntax but by an "open" function:

  f = open("/etc/passwd", "r");

The key here is a feeling that if two "literal" values _look_ the same, they 
should _be_ the same, in all contexts that are visible to the programmer.  So, 
for example, if

  x = {"/etc/passwd", "r"};
  y = {"/etc/passwd", "r"};

then x is equal to y, and they can be used interchangeably.  (This property 
does not hold if the expressions contained in the list are not themselves 
literal values; and, of course, it holds in this example only until x or y is 
assigned to.  But you know what I mean.)  If we used "literal" syntax for 
files, you'd have something like this:

  x = {"/etc/passwd", "r"} as file;
  y = {"/etc/passwd", "r"} as file;

and x and y would be two different file handles.  There are variations on 
this, but I think all of them fail.  So anyway, many types don't have literal 
value syntax.
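
The difference between the two cases can be sketched in Python (not MOO 
code).  FileHandle is a hypothetical stand-in that only hands out a fresh 
handle number; it does not open anything, and it is not how any server 
represents files:

```python
class FileHandle:
    """Hypothetical stand-in for an open-file value; opens nothing."""

    _next_id = 0

    def __init__(self, path, mode):
        self.path, self.mode = path, mode
        FileHandle._next_id += 1
        self.handle_id = FileHandle._next_id  # every "open" yields a new handle

    def __eq__(self, other):
        # Two handles are the same only if they are the same open file.
        return isinstance(other, FileHandle) and self.handle_id == other.handle_id

    def __hash__(self):
        return hash(self.handle_id)


# Two list literals that look the same are the same.
x = ["/etc/passwd", "r"]
y = ["/etc/passwd", "r"]
literals_equal = (x == y)

# Two handles built from identical-looking arguments are not.
fx = FileHandle(*x)
fy = FileHandle(*y)
handles_equal = (fx == fy)

print(literals_equal, handles_equal)  # True False
```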

The down side of this is that now, for the first time, there may be values 
that can't usefully be printed (e.g., with toliteral()).  Aaaaand, I'll just 
leave it at that, for now.

> I agree that the server would have to load the definition of the new
> datatype prior to parsing any programs that have literals specified in
> them.

Note that in my scheme, this may not even be true: a naive implementation 
will simply parse a conversion expression such as the one above as a list and 
an identifier.  However, it IS necessary to load the data type before loading 
_stored values_ of that type--for example, in properties or in suspended tasks.

> so a function that is called to create a new in-memory
> literal structure from a string will be necessary.

Again, not strictly true.

> There will also be a need to create a new string from an in-memory
> literal structure. (for use by toliteral() and when the MOO checkpoints)

Yep, toliteral() needs it, at least for printable data types.  Types are also 
given their own hooks for reading from and writing to the DB file; these use 
the DB module's I/O mechanisms, rather than being directly string-based.  This 
creates a bit of difficulty if you have a database containing values of a type 
that your server doesn't support.  I haven't finalized a plan for that yet.
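
A rough sketch of per-type read/write hooks, and of the unsupported-type 
problem, in Python rather than the server's own code.  The line-based format, 
the DB_HOOKS table and all function names are assumptions invented for the 
example, not the real DB module's interface:

```python
import io

# Hypothetical registry: type name -> (write hook, read hook).
DB_HOOKS = {}


def register_type(name, write, read):
    DB_HOOKS[name] = (write, read)


def db_write_value(out, type_name, value):
    """Write a type tag line, then let the type's own hook write the value."""
    out.write(type_name + "\n")
    DB_HOOKS[type_name][0](out, value)


def db_read_value(inp):
    """Read a type tag line, then let the type's own hook read the value."""
    type_name = inp.readline().rstrip("\n")
    if type_name not in DB_HOOKS:
        # The difficulty described above: without the hook, the reader
        # cannot even tell where this value ends.
        raise LookupError("unsupported type in database: %r" % type_name)
    return DB_HOOKS[type_name][1](inp)


def write_dictionary(out, d):
    # Made-up format: entry count, then key and integer value on separate lines.
    out.write("%d\n" % len(d))
    for key, val in d.items():
        out.write("%s\n%d\n" % (key, val))


def read_dictionary(inp):
    count = int(inp.readline())
    result = {}
    for _ in range(count):
        key = inp.readline().rstrip("\n")
        result[key] = int(inp.readline())
    return result


register_type("dictionary", write_dictionary, read_dictionary)

buf = io.StringIO()
db_write_value(buf, "dictionary", {"a": 1, "b": 2})
buf.seek(0)
restored = db_read_value(buf)
print(restored)  # {'a': 1, 'b': 2}
```

A server started without the "dictionary" registration would hit the 
LookupError branch on the same data.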

> One of the hard parts of expressions is handling coercion from
> one type to another type.  The experience we have with floats has
> resolved this issue for MOOlang. Mixing floats and integers
> was considered to be an error. 

As you suggest, I'm not planning on any implicit coercion.  The conversion 
operator suggested above is sort of an explicit coercion to the new data type, 
which is fine.  However, I haven't yet put in any hooks for converting back to 
built-in data types.  So, you can do

  x = {{"a", 1}, {"b", 2}};
  d = x as dictionary;

but you can't then do

  l = d as list;

This won't be hard to add.  In addition, I haven't done conversion _between_ 
distinct extended data types, and I'm not sure yet how I'm going to do it.
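
As an illustration only, here is the one-way versus two-way situation in 
Python (not MOO code).  ExtendedType, its to_list hook and as_list() are 
hypothetical names; the hook for converting back is exactly the part that 
does not exist yet:

```python
class ExtendedType:
    """Hypothetical type descriptor: one required hook, one optional."""

    def __init__(self, name, from_value, to_list=None):
        self.name = name
        self.from_value = from_value  # arbitrary value -> this type
        self.to_list = to_list        # this type -> built-in list, if provided


def pairs_to_dict(pairs):
    return {key: val for key, val in pairs}


def dict_to_pairs(d):
    return [[key, val] for key, val in d.items()]


def as_list(ext_type, value):
    """Stand-in for `value as list`; fails when the type has no hook for it."""
    if ext_type.to_list is None:
        raise TypeError("%s cannot be converted to list" % ext_type.name)
    return ext_type.to_list(value)


one_way = ExtendedType("dictionary", pairs_to_dict)
two_way = ExtendedType("dictionary", pairs_to_dict, dict_to_pairs)

x = [["a", 1], ["b", 2]]
d = one_way.from_value(x)      # like: d = x as dictionary;
round_trip = as_list(two_way, d)  # like: l = d as list;
print(round_trip == x)  # True
```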

> I would expect that part of loading/registering/checking-in a new datatype
> would include a way to specify what code should be run for each of the 
> operators in MOOlang.  (dyadic +,-,*,/,in etc.)

Possibly.  I'm not sure it's necessary to overload the arithmetic operators, 
but I suppose it'd be handy if one wanted to implement new kinds of numbers, 
and as Nick pointed out, they are already overloaded, so what's a few more 
meanings.

> I suppose that if the new datatype is a container type, that
> it should be able to handle indexing of a variable like lists and strings
> currently allow.

Yep.  This is planned but not yet implemented.

> If the new datatype is a range type, I would expect the
> for var in [literal..literal]  construct to allow you to iterate over
> a sequence of literal values in that type.

Can you give an example of this?

> On a final note, I would also expect that a datatype would be able to 
> register the number that typeof() returns.

At the moment, I'm just using the string-valued name of the data type as a 
return value for typeof().  I really don't want to add a new built-in variable 
for each data type.  (The type name used in the conversion expressions above 
isn't really a variable; it's an identifier valid only in the context of 
conversion.  There'll also be a way to convert to an arbitrary named type, 
analogous to call_function() for functions.)

Data types _do_ have numbers, internally; this is how the server keeps track 
of which values are of which type.  However, the number is assigned by the 
server when the data type is registered (with a function similar to 
register_function()), rather than selected by the data type.  It won't 
necessarily be the same from one invocation of the server to the next, and it 
won't be visible to the MOO programmer at all.
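
A minimal sketch of that arrangement, in Python rather than the server's own 
code.  TypeRegistry, register_type and the (number, payload) value layout are 
all assumptions made up for the example:

```python
class TypeRegistry:
    """Hypothetical registry that hands out internal type numbers."""

    def __init__(self):
        self._numbers = {}  # type name -> internal number
        self._names = []    # internal number -> type name

    def register_type(self, name):
        """Assign the next free number; the data type does not choose it."""
        if name in self._numbers:
            raise ValueError("type %r is already registered" % name)
        number = len(self._names)
        self._numbers[name] = number
        self._names.append(name)
        return number

    def make_value(self, name, payload):
        # Values carry the internal number, never the name.
        return (self._numbers[name], payload)

    def typeof(self, value):
        """What the programmer sees: the string name, not the number."""
        number, _payload = value
        return self._names[number]


# Two server runs that happen to register the types in a different order.
run1 = TypeRegistry()
run1.register_type("dictionary")
run1.register_type("file")

run2 = TypeRegistry()
run2.register_type("file")
run2.register_type("dictionary")

v1 = run1.make_value("dictionary", {"a": 1})
v2 = run2.make_value("dictionary", {"a": 1})
print(v1[0], v2[0])                      # 0 1
print(run1.typeof(v1), run2.typeof(v2))  # dictionary dictionary
```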

--Erik

Follow-Ups:

Re: Ideas about data types (toliteral)

From: Andy Bakun <abakun@scinc.com>

References:

Ideas about data types

From: whitten@netcom.com (David Whitten)
