Unboxed arrays break identity

Common Lisp explicitly allows its implementations to copy numbers whenever they feel like it, so object identity is not reliable for numbers. Previously I said this was a relic of Maclisp, but I overlooked a simple, obvious, and stronger reason: unboxed arrays. Long ago on RRRS-authors, Pavel Curtis gave another example where numbers might be copied:

(let ((v (make-vector 1 3.0)))
  (eq? (vector-ref v 0) (vector-ref v 0)))

This returns true in any ordinary Scheme, because storing a number into a vector does not copy it. However, if v is an unboxed vector of floats, this will probably return false, because the number naturally gets boxed twice. It does in Racket:

> (require racket/flonum)
> (let ((v (make-flvector 1 3.0)))
    (eq? (flvector-ref v 0) (flvector-ref v 0)))
#f

And SBCL:

CL-USER> (make-array '() :element-type 'single-float :initial-element 3.0)
#0A3.0
CL-USER> (eq (aref *) (aref *))
NIL

(That's a zero-dimensional array, with one element.)

Clojure doesn't explicitly allow copying of numbers, but does it anyway, of course:

user> (let [x 1.0 v [x]] (identical? (v 0) (v 0)))
true
user> (let [x 1.0 a (double-array [x])] (identical? (get a 0) (get a 0)))
false
user> (let [x 1.0 a (object-array [x])] (identical? (get a 0) (get a 0)))
true

Copying doesn't even require an array, since Clojure sometimes unboxes ordinary local variables and then reboxes them separately at each use:

user> (let [x 1.0] (identical? x x))
false
user> (let [x (if true 1.0 1)] (identical? x x))
true
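
The same behavior can be reproduced in plain Java, which is presumably what sits underneath the Clojure results above. This is only a sketch: the class and method names are invented for illustration, and the false results assume a JVM such as OpenJDK where autoboxing a double allocates a fresh Double each time, which the language spec neither requires nor forbids.

```java
// Hypothetical demo class; all names here are illustrative, not from any library.
final class BoxingIdentityDemo {
    // A double[] stores raw bits, so every read has to be boxed again
    // before it can be compared by reference.
    static boolean sameBoxFromPrimitiveArray() {
        double[] a = {1.0};
        Object first = a[0];   // autoboxed on this read
        Object second = a[0];  // autoboxed again, normally into a different object
        return first == second;
    }

    // An Object[] stores a reference to a single box, so both reads
    // return that same object.
    static boolean sameBoxFromObjectArray() {
        Object[] a = {1.0};    // boxed once, when stored
        return a[0] == a[0];
    }

    // A primitive local behaves like the unboxed array: each conversion
    // to Object boxes it separately.
    static boolean sameBoxFromPrimitiveLocal() {
        double x = 1.0;
        Object first = x;
        Object second = x;
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameBoxFromPrimitiveArray());
        System.out.println(sameBoxFromObjectArray());
        System.out.println(sameBoxFromPrimitiveLocal());
    }
}
```

On OpenJDK this typically prints false, true, false, matching the double-array, object-array, and local-variable cases in the Clojure transcript.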

Scala hides the issue by making eq unavailable on potentially unboxed types like Float and Double (and therefore on Any, which might be annoying):

scala> 1.0 eq 1.0
<console>:7: error: value eq is not a member of Double
       1.0 eq 1.0
       ^

Any language that boxes floats but wants efficient numerics practically has to support unboxed numeric vectors, and therefore has to allow implicit copying of numbers, since preventing it requires (undecidable) nonlocal analysis. So its spec must grant some permission to copy numbers, or more generally values of any boxed type that has an unboxed container; the issue is not specific to numbers. This permission need not be a blanket license to copy, though; it could be restricted to specialized arrays. Or, in order to permit unboxing variables without forcing the compiler to be paranoid about multiple reboxing, copying could be permitted for a conservative approximation of "potentially unboxed numbers", e.g. those in local variables statically known to be numbers of a specific type, whose values come from unboxable operations (those that compute new numbers: sin, not car).
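
One way to live with such a permission is to compare numbers with a predicate that ignores boxing, in the spirit of eqv? or EQL. Here is a minimal Java sketch, with hypothetical names and a deliberately crude treatment of everything that is not a boxed number:

```java
// Hypothetical helper; an eqv?-like predicate that survives reboxing.
final class NumberEquivalence {
    // True if the two references are identical, or if both are boxed
    // numbers of the same class holding the same value.
    static boolean eqv(Object a, Object b) {
        if (a == b) {
            return true;  // same object, whatever its type
        }
        if (a instanceof Number && b instanceof Number
                && a.getClass() == b.getClass()) {
            return a.equals(b);  // value comparison within one numeric class
        }
        return false;  // non-numbers are compared by identity only
    }

    public static void main(String[] args) {
        double[] v = {3.0};
        Object x = v[0];  // boxed on read
        Object y = v[0];  // boxed again
        System.out.println(eqv(x, y));
        System.out.println(eqv(1, 1.0));
    }
}
```

Two separately boxed copies of 3.0 are equivalent under this predicate, while 1 and 1.0 are not, because they box to different classes. It inherits the behavior of Double.equals, so it distinguishes 0.0 from -0.0 and treats NaN as equivalent to itself, which is one of the choices a spec would have to make.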

Does this make NaN-boxing sound more attractive?
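
For comparison, here is roughly what NaN-boxing buys: if every value is a 64-bit word and floats are stored as their own bits, there is no box to duplicate, so an eq-style comparison is just a word comparison. This is a toy sketch in Java, assuming a made-up tag layout (0x7FF9 in the top 16 bits marks a 32-bit integer) and relying on Double.doubleToLongBits collapsing all NaNs into one canonical bit pattern. A real implementation would also need to encode pointers.

```java
// Toy NaN-boxing sketch; the tag layout and all names are assumptions for illustration.
final class NanBoxSketch {
    private static final long TAG_MASK = 0xFFFF_0000_0000_0000L;
    // Lies in the NaN space but differs from the canonical NaN (0x7FF8...),
    // so no encoded double can collide with it.
    private static final long INT_TAG = 0x7FF9_0000_0000_0000L;

    // A double is stored as its own bits; every NaN becomes the canonical NaN.
    static long fromDouble(double d) {
        return Double.doubleToLongBits(d);
    }

    // A 32-bit integer is stored in the low half of a tagged NaN payload.
    static long fromInt(int i) {
        return INT_TAG | (i & 0xFFFF_FFFFL);
    }

    static boolean isInt(long word) {
        return (word & TAG_MASK) == INT_TAG;
    }

    static double toDouble(long word) {
        return Double.longBitsToDouble(word);
    }

    static int toInt(long word) {
        return (int) word;  // keeps the low 32 bits
    }

    // eq-style identity: plain comparison of the two words.
    static boolean eq(long a, long b) {
        return a == b;
    }

    public static void main(String[] args) {
        double[] v = {3.0};
        // Reading the same unboxed element twice yields the same word.
        System.out.println(eq(fromDouble(v[0]), fromDouble(v[0])));
        System.out.println(toInt(fromInt(-7)));
    }
}
```

With this representation the vector example from the top of the post comes out true under eq, because encoding the same float twice produces identical words rather than two heap objects.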

6 comments:

  1. What it means, as far as I am concerned, is that Scheme's object identity predicate is eqv? (CL's EQL) and not eq?. Indeed, ever since R4RS there has been language in the standard similar to "An object fetched from a location, by a variable reference or by a procedure such as car, vector-ref, or string-ref, is equivalent in the sense of eqv? to the object last stored in the location before the fetch" (R7RS section 3.4, Storage Model). So in principle even pairs can return numbers that are not eqv? to the ones they were set up to contain.

    Things are further complicated by the special cases of NaN and procedures.

    1. Oops. I meant "not eq? to the ones they were set up to contain".

  2. So Scheme also has blanket permission to copy, but by omission rather than explicitly. This is not a great way to specify this sort of thing, both because it's easy to miss (as I did), and because it's hard to tell whether the permission is deliberate.

    eq? matters because it's implementational identity: it shows what's actually happening, not just what's guaranteed. I'm trying to do a post on why this is important, but I keep confusing implementational/operational behavior with customary behavior and with explicit vs. implicit specification (as in the previous paragraph). These are three different issues that happen to coincide here.

    1. It may be implementation identity, but not always. See my fixnum info page for details. In particular, there are several Schemes in which `eq?` and `eqv?` always seem to return the same results.

      You can find out a good deal about implementation differences between Schemes at the implementation contrasts index page.

    2. I should have guessed there are Schemes where eq? doesn't have the usual operational meaning...

      Shoe's eq? is an alias for = (which is not R5RS-compliant, but it doesn't have eqv? at all, so despite having R5RS as its only documentation it's not trying very hard to comply).

      More interestingly, in NexJ and JScheme, eq? isn't pure pointer comparison, because it special-cases booleans since Java doesn't intern them: NexJ's eq? is (obj1 == obj2 || obj1 instanceof Boolean && obj1.equals(obj2)), and JScheme's is equivalent.

      BTW some of the languages with “no fixnums” or “apparently unbounded fixnums” do distinguish fixnums from bignums; they just don't intern them, or do intern bignums (or eq? compares by value, in Shoe).

  3. Tried the example with SBCL and at least with SBCL 2.1.8 it actually gives T.

