Standard trajectories

History repeats itself — in programming languages too.

Andreas Rossberg pointed out a hysterical raisin in Java:

The reason Java generics turned out sub-par is mainly because they were added too late! They had to work around all kinds of language design and infrastructure legacy at that point. Compare with C#, where generics were put into the language early on and integrate much nicer. That is, generics should be designed into a language from the very beginning, and not as an afterthought.

John Nagle replied:

The other classic language design mistake is not having a Boolean type. One is usually retrofitted later, and the semantics are usually slightly off for reasons of backwards compatibility. This happened to Python, C, LISP, and FORTRAN.

These are two of the standard trajectories languages take through design space. Some things happen over and over:

  • Languages make more type distinctions over time. Booleans are a common case of this: if you have integers or nil or a defined/undefined distinction, separate Booleans are obviously unnecessary, so they're often left out of early versions of languages. Eventually the designers tire of wondering which ints are really integers and which are Booleans, and add a separate type.
  • Restrictions chafe and get relaxed. In particular, languages with restrictive static type systems usually add extensions to make them less restrictive: witness Haskell's numerous extensions for higher-rank polymorphism, dependent types, deferred type errors, and so on. Java's generics are a case of this: Java got along without them for a while because it was possible to escape from the static type system by downcasting.
  • General cases supplant special cases. In particular, many languages start with a small closed set of datatypes, and add more very slowly. At some point they add user-defined types, and initially treat them as another, very flexible type. Once they get used to user-defined types, they realize all the others are unnecessary special cases which could be expressed as ordinary (i.e. user-defined) types. Python went through this (painfully, because its built-in types had special semantics); Scheme will soon, since R7RS is standardising define-record-type. (I'm not sure if the Schemers realize this; several WG1ers objected to a proposed record? operator on grounds of strong abstraction, but none mentioned that it should return true for all types, including built-in ones.)
  • Interfaces are regularized. It's easier to add a feature than to understand its relation to the rest of the language, so many features have ad-hoc interfaces, which are later replaced with uniform ones. This seems to happen to collection libraries a lot, although this might be simply because they're large and popular.
  • Newly fashionable features are added in haste and repented later. How many languages have poorly-designed bolted-on “object systems”? Sometimes parts of the new features are later extended to the rest of the language; sometimes they're made obsolete by better-designed replacements; sometimes they're abandoned as not useful.
  • Support for libraries and large programs is often added later. Newborn languages have few libraries and no large programs, so there's little need for features like modules or library fetching or separate compilation. When they're later needed (or perceived to be needed), they're added, often as kludges like Common Lisp's packages and C's header files, or even as external tools like C's linker (and makefiles) and Clojure's Leiningen. This may be changing: libraries are now considered important enough that new languages usually have modules, at least.
  • And, of course, languages grow. It's much easier to add a feature than remove one.
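
The "general cases supplant special cases" trajectory is visible in present-day Python, where built-in types behave as ordinary classes. The following is only a minimal sketch, assuming Python 3; `Celsius`, `Point` and `describe` are made-up names for illustration, not anything from the languages' histories.

```python
# Built-in types are ordinary classes: they can be subclassed and
# inspected in the same way as user-defined ones.


class Celsius(float):
    """A user-defined type that extends a built-in one (hypothetical)."""

    def to_fahrenheit(self) -> float:
        return self * 9 / 5 + 32


class Point:
    """A plain user-defined type (hypothetical)."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


def describe(value: object) -> str:
    """Return the class name of any value, built-in or user-defined."""
    return type(value).__name__


temperature = Celsius(100.0)
print(describe(3))                      # int
print(describe(Point(1, 2)))            # Point
print(describe(temperature))            # Celsius
print(isinstance(temperature, float))   # True
print(temperature.to_fahrenheit())      # 212.0

# Both the built-in class and the user-defined class are instances of `type`.
print(isinstance(int, type), isinstance(Point, type))  # True True
```

Nothing here needs a special case for `int` or `float`: the same `type(...)` and `isinstance(...)` calls work on every value.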

Many of these trajectories are easy to trace, because they leave trails of obsolete features kept around for compatibility.
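
One such trail can be seen in Python's retrofitted Boolean type. When `bool` was added (PEP 285), it was made a subclass of `int` so that existing code using 0 and 1 as truth values would keep working. A small sketch, assuming Python 3; the `flags` dictionary and `count_true` helper are hypothetical names used only for this illustration.

```python
# bool arrived after int, and is a subclass of it for compatibility.
print(issubclass(bool, int))   # True
print(True == 1)               # True
print(True + True)             # 2

# True and 1 are equal and hash alike, so they collide as dict keys:
# the original key 1 is kept and only its value is replaced.
flags = {1: "integer one"}
flags[True] = "boolean true"
print(flags)                   # {1: 'boolean true'}


def count_true(values):
    """Count truthy values by summing Booleans, relying on bool being an int."""
    return sum(bool(v) for v in values)


print(count_true([True, False, True, 0, "x"]))  # 3
```

The arithmetic on `True` and the dictionary collision are the "slightly off" semantics: harmless most of the time, but not what a Boolean type designed in from the start would likely do.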

What other pieces of history repeat themselves?

5 comments:

  1. R6RS also has define-record-type, though with a different syntax. However, while pairs can be simulated with a record, vectors cannot be. Even in Smalltalk there is a fundamental distinction between ordinary classes, classes with a variable number of slots, and classes with a variable number of arbitrary bytes, and Scheme provides only the first type.

    What really sets Scheme off from most other dynamically typed languages is its comparative monomorphism. With the exception of exact/inexact number polymorphism, the arguments to every standard Scheme procedure are either monomorphic or universally polymorphic. As long as there is no subtyping other than the numbers, using systematic names more than makes up for this, providing some of the run-time benefits of static typing. Since R7RS-large will reintroduce subtyping with single inheritance of slots (like Common Lisp structs), it'll be interesting to see if the pressure to standardize generic functions begins to go up either in WG2 or in a future standardization effort.

    Replies
    1. Variable-size records are a simple and obvious extension (especially from the implementational viewpoint where every object is a tagged vector); I don't think they're much of an obstacle to considering everything a record.

      Isn't Scheme's monomorphism mostly confined to newer features? Its older parts — numbers, equality, read/write — have always been polymorphic wherever it's convenient. Its collections (and maybe char/string ordering) are the only parts that cry out for polymorphism, especially in Schemes recent enough to have many collection types with the same operations, and I don't understand why new polymorphic operations are seen as so much more complicated than old ones. Is it just because they sound like they ought to be defined with generic functions instead of plain old cond?

    2. I think you're right about numbers: Scheme's numerical procedures are polymorphic because MacLisp's were (and MacLisp's were, ultimately, because Fortran IV's were). Equality and I/O, though, provide a universally polymorphic interface even though the underlying implementation is a type-case. "Write", for example, doesn't just puke if it gets a non-standard object: it outputs something implementation-defined.

      I'm not sure I fully understand why Schemers resist polymorphism. Some possibilities are (a) tradition, (b) being Not Common Lisp (much less any other upstart dynamic language), and (c) the fact that standardizing the monomorphic procedures supplies methods for use by a roll-your-own object system. One of the few things I have declared Off Limits for R7RS-large (of course the WG can override this) is attempting to standardize an object system: there are so many to choose from, and each has its advantages and disadvantages. My own system, JSO (JavaScript Objects), is very lightweight, but certainly doesn't provide all the convenience of TinyCLOS or Meroon.

    3. Does universality make a difference? If < worked on anything (e.g. doing lexicographical order on compound structures), or if there were a write-readably that rejected values that couldn't be read back in, would it matter?

    4. It would matter because you'd have to reject the whole of < if you wanted something that worked differently on strings, such as by being based on ISO 14651 collation order. The alternative is to introduce a new string subtype, IsoCollatableString, which is awkward in any number of ways.
