MOO-cows Mailing List Archive


Re: Binary strings



>>>>> "Andrew" == Andrew Wendt <powerpig@m-net.arbornet.org> writes:

    Andrew> On Thu, 11 Jan 1996, Pavel Curtis wrote:

    >> output to that connection can contain arbitrary bytes.  On
    >> input, any byte that isn't an ASCII printing character, the
    >> space character, or the TAB character will be represented as
    >> the four-character substring "~NNN", where NNN is the octal
    >> representation of the byte; the input character `~' is

    Andrew> Wouldn't hex be more sane, being smaller and exactly
    Andrew> encoding a byte with nothing left over?

    Andrew> TTFN Andy

Well, certainly hex would be more sane, but less traditional.  All C
compilers support octal, as does the shell.  Only modern C compilers
support hex.
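
(For the curious, here's a rough sketch in C of the decoding that "~NNN"
convention implies.  The function name and the error handling are mine,
not the actual server code; it's only meant to show what the escape looks
like in practice.)

/* Sketch of decoding the "~NNN" octal escapes described above.
 * Illustrative only; not taken from the server source. */
#include <stdio.h>

/* Decode "in" into "out"; returns the number of bytes written, or -1
 * on a malformed escape.  "out" must hold at least strlen(in) bytes. */
int decode_binary(const char *in, unsigned char *out)
{
    int n = 0;

    while (*in) {
        if (*in == '~') {
            /* Expect exactly three octal digits after the tilde. */
            if (in[1] < '0' || in[1] > '7' ||
                in[2] < '0' || in[2] > '7' ||
                in[3] < '0' || in[3] > '7')
                return -1;
            out[n++] = (unsigned char)
                ((in[1] - '0') * 64 + (in[2] - '0') * 8 + (in[3] - '0'));
            in += 4;
        } else {
            out[n++] = (unsigned char) *in++;
        }
    }
    return n;
}

int main(void)
{
    unsigned char buf[64];
    int len = decode_binary("~012~015hello~000", buf); /* LF, CR, "hello", NUL */

    printf("decoded %d bytes\n", len);
    return 0;
}

Switching that to hex would only change the arithmetic to two digits of
16, which is Andy's point: two hex digits cover a byte exactly, while
three octal digits can express values up to 511.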

The reason for this stupidity is purely historical.  Unix was
originally written on the PDP-11, a 16-bit machine, but it stubbornly
used octal instead of hex because DEC had previously made the 12-bit
PDP-8, which packed 2 6-bit chars per 12-bit word (no lower case,
limited punctuation, limited control chars), and the DEC-10, which
used a 36-bit word and packed 6 of these glorious "half-ASCII" chars
per word, or 8 decimal digits of 4 bits each, or (yech!) 4 9-bit
EBCDIC chars with parity, or 4 7-bit ASCII chars with 2 wasted bits.
How awful!  This was back when 4 K words of main memory (core then,
which is where we get that other quaint Unix name for a dump file; it
should be "ram" instead of "core" nowadays) cost several thousand
dollars, and a Coke was only 10 cents.  A 40 MB disk drive was the
size of a washing machine.

So with this wonderful heritage of 6-bit characters and word lengths
that were divisible by 3, DEC *NATURALLY* (translation: "without
thinking") used octal for *EVERYTHING*, even when they made a 16-bit
machine with 2 8-bit chars per word; all their own programmers (who
hadn't been poisoned by Big Blue) knew the octal codes by heart.

The marketing department wanted to maintain an image distinct from
IBM, so they even swapped the bytes in a word and gave us the
abominable little-endian format for binary numbers.  All this in a
deliberate attempt to be as incompatible as possible with their
biggest competitor, IBM.

So now the price we in the "modern" world have to pay is network
software that must know the format of the data in order to know
whether to swap bytes when the endianness is wrong, and then only if
the data is a binary number; character strings are fine without
swapping.  So you can't just swap everything automatically in the
hardware; you have to know what kind of data each byte represents.
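
(To make the problem concrete, here's a tiny C sketch with made-up bytes:
treated as an integer, the same four bytes come out scrambled on a
little-endian box unless the software reassembles them explicitly, while
a character string needs no swapping at all.  The helper name and the
values are mine, just for illustration.)

/* Why byte order matters for integers but not for strings.  Sketch only. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Reassemble a 32-bit value from big-endian ("network order") bytes,
 * independent of whatever the local CPU's byte order happens to be. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}

int main(void)
{
    unsigned char wire[4] = { 0x12, 0x34, 0x56, 0x78 };  /* bytes as sent */
    uint32_t naive;

    memcpy(&naive, wire, 4);  /* interpret in the local byte order */
    printf("naive:   0x%08x\n", (unsigned) naive);           /* 0x78563412 on a little-endian machine */
    printf("correct: 0x%08x\n", (unsigned) read_be32(wire)); /* 0x12345678 everywhere */

    /* A character string is already just a sequence of bytes, so it
     * arrives intact with no swapping. */
    printf("text:    %s\n", "MOO");
    return 0;
}

The hardware can't do that for you precisely because nothing in the byte
stream says which spans are integers and which are text; only the
software knows.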

How grotesque!  And this is an "open" system? Bah!  I don't care how
you represent things, as long as it is the same everywhere, but this
nonsense really ought to be banned.  If I were King Of The World, it
would be, along with different keyboard layouts.  Does Rush have this
in his book "The Way Things Ought To Be"?  He OUGHT to!

[ rant mode off, "click" ]

-- 
-----------  "...  And the men went up and viewed Ai."  [Jos 7:2]  -----------
Robert Jay Brown III  rj@eli.wariat.org  http://eli.wariat.org  1 708 705-0370
Elijah Laboratories Inc;  759 Independence Drive;  Suite 5;  Palatine IL 60074
-----  M o d e l i n g   t h e   M e t h o d s   o f   t h e   M i n d  ------

