- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
Re: Unhashable trait bug
Fri, 2009-07-17, 11:52
Hello Ilya.
> Probably, you're not the person who is supposed to answer to the
> following question, and if it's so, please, redirect me to the right
> person. One more problem after update to 2.8 is strange exception in
> the old place, related to to work with mutable HashTable. I attach
> the stacktrace to this letter. It seems, that the problem is caused
> by animplicit call of BoxedArray. Class ScCompoundType, visible near
> the top of stack is a case classes with sequences of some traits as
> parameters. As I told it worked fine before an update.
I would guess your problem has to do with the following.
In the design of the new collection library, we thought that, as an
example, List(1, 2, 3) should be equal to Array(1, 2, 3). In other
words, the actual implementation of a collection should not be taken
into account for comparing equality, just the content of the
collection and the collection kind (map, set, sequence). This is
structural equality and we think it makes programming with the Scala
collection library quite a bit easier.
Since hashCode and equals are supposed to have the same contract (a ==
b <==> a.hashCode == b.hashCode), hashCode would have to be
implemented as structural hashing. This is a problem for mutable
collections as such a hash code would not be stable (val ahc =
a.hashCode; a += e, ahc != a.hashCode). This prevents using mutable
collections as the key of a hash map. Instead of creating weird bugs
in hash maps, we fail at runtime when a mutable collection is used as
the key for a hash map (or rather, when the hash code for a mutable
collection is requested). Another solution we considered, but rejected
I think, is to break the same-contract assumption for mutable
collections and return an object-identity hash code for mutable
collections.
We had many discussions (and disagreements) about what the best design
for equals/hashCode on collection should be, and what I described
above may in fact not be the final design. I am not even sure whether
we made a final decision in that regard, and whether the current
implementation of the collection library reflects it. I am sorry about
that. Maybe someone else knows better?
Please, feel free to comment on the equals/hashCode design on the
scala-internals mailing list if it is of particular concern to you.
Cheers,
Gilles.
java.lang.UnsupportedOperationException: unsuitable as hash key
at scala.Unhashable$class.hashCode(Unhashable.scala:24)
at scala.runtime.BoxedArray.hashCode(BoxedArray.scala:24)
at scala.runtime.ScalaRunTime$._hashCode(ScalaRunTime.scala:76)
at
org
.jetbrains
.plugins
.scala.lang.psi.types.ScCompoundType.hashCode(ScCompoundType.scala:8)
Fri, 2009-07-17, 14:57
#2
Re: Re: Unhashable trait bug
On Fri, Jul 17, 2009 at 04:24:21PM +0400, Ilya Sergey wrote:
> The problem occurs when we deal with case classes with fields of Seq
> type. Of course, this fields are supposed to be immutable and
> therefore these instances may be used as keys. But very often while
> creating instances of such classes we obtain arguments from other
> libraries (usually, implemented in Java) which return not sequence but
> Java arrays.
I opened this ticket about a month ago:
http://lampsvn.epfl.ch/trac/scala/ticket/2088
"case classes with varargs params get tied up in collections issues"
Sat, 2009-07-18, 11:17
#3
Re: Re: Unhashable trait bug
I think we decided that the trouble was not worth it in the end, and
we'd just implement structural hash codes. The only exception we need
to make is for arrays. Arrays will have identity hash codes and they
won't be equal to other collections. We can't hide the fundamental
differences between Java arrays and collections, unfortunately.
Cheers
Sat, 2009-07-18, 11:47
#4
Re: Re: Unhashable trait bug
2009/7/18 martin odersky :
> I think we decided that the trouble was not worth it in the end, and
> we'd just implement structural hash codes. The only exception we need
> to make is for arrays. Arrays will have identity hash codes and they
> won't be equal to other collections.
Or eachother then, unfortunately.
Some care will have to be taken here because arrays are the default
implementation for varargs. So if we define
case class Foo(ints : Int*)
then the result will be
Foo(1, 2, 3) != Foo(1, 2, 3)
because their underlying arrays in ints are not equal. So either case
class Foo(ints : Int*) and case class Foo(ints : Seq[Int]) have to
have different equality properties or pattern matching suddenly
becomes quite confusing.
> We can't hide the fundamental
> differences between Java arrays and collections, unfortunately.
Is there a plan on how to make this deal with BoxedAnyArray?
For those not familiar with the problem: The underlying array of
BoxedAnyArray changes. It starts out as an Array[AnyRef] and turns
into an Array[T] of the appropriate type of T once the array has been
unboxed somewhere. Unfortunately this means that the hash code of its
underlying array changes, and there's no way to know what the hash
code of the final underlying array is. Further it's important that the
BoxedAnyArray is == to its underlying array, so the hash code has to
be that of the unboxed array. But there's no way to know what that
hash code is until you've been unboxed...
This problem went away when we thought arrays were going to have
structural equality. It could also be made to go away if we required
arrays to take manifests as BoxedAnyArray could (thankfully) be axed.
Is that still on the cards?
Sat, 2009-07-18, 11:57
#5
Re: Re: Unhashable trait bug
On Sat, Jul 18, 2009 at 12:40 PM, David MacIver wrote:
> 2009/7/18 martin odersky :
>> I think we decided that the trouble was not worth it in the end, and
>> we'd just implement structural hash codes. The only exception we need
>> to make is for arrays. Arrays will have identity hash codes and they
>> won't be equal to other collections.
>
> Or eachother then, unfortunately.
>
> Some care will have to be taken here because arrays are the default
> implementation for varargs. So if we define
>
> case class Foo(ints : Int*)
>
> then the result will be
>
> Foo(1, 2, 3) != Foo(1, 2, 3)
>
> because their underlying arrays in ints are not equal. So either case
> class Foo(ints : Int*) and case class Foo(ints : Seq[Int]) have to
> have different equality properties or pattern matching suddenly
> becomes quite confusing.
>
>
>> We can't hide the fundamental
>> differences between Java arrays and collections, unfortunately.
>
> Is there a plan on how to make this deal with BoxedAnyArray?
>
> For those not familiar with the problem: The underlying array of
> BoxedAnyArray changes. It starts out as an Array[AnyRef] and turns
> into an Array[T] of the appropriate type of T once the array has been
> unboxed somewhere. Unfortunately this means that the hash code of its
> underlying array changes, and there's no way to know what the hash
> code of the final underlying array is. Further it's important that the
> BoxedAnyArray is == to its underlying array, so the hash code has to
> be that of the unboxed array. But there's no way to know what that
> hash code is until you've been unboxed...
>
> This problem went away when we thought arrays were going to have
> structural equality. It could also be made to go away if we required
> arrays to take manifests as BoxedAnyArray could (thankfully) be axed.
> Is that still on the cards?
>
I don't think so. The problem is that then every collection
construction operation has to take a manifest, because the underlying
type might be an array.
Sat, 2009-07-18, 12:27
#6
Re: Re: Unhashable trait bug
2009/7/18 martin odersky :
>> This problem went away when we thought arrays were going to have
>> structural equality. It could also be made to go away if we required
>> arrays to take manifests as BoxedAnyArray could (thankfully) be axed.
>> Is that still on the cards?
>>
> I don't think so. The problem is that then every collection
> construction operation has to take a manifest, because the underlying
> type might be an array.
ok. That's not entirely surprising. Any thoughts on what to do about
the problems I mentioned then?
Sat, 2009-07-18, 19:07
#7
Re: Re: Unhashable trait bug
On Sat, Jul 18, 2009 at 1:25 PM, David MacIver wrote:
> 2009/7/18 martin odersky :
>>> This problem went away when we thought arrays were going to have
>>> structural equality. It could also be made to go away if we required
>>> arrays to take manifests as BoxedAnyArray could (thankfully) be axed.
>>> Is that still on the cards?
>>>
>> I don't think so. The problem is that then every collection
>> construction operation has to take a manifest, because the underlying
>> type might be an array.
>
> ok. That's not entirely surprising. Any thoughts on what to do about
> the problems I mentioned then?
>
I'm not worried for varargs in case classes; they can be treated by
generating hashcodes that are adapted to them and that are structural.
Dealing with generic arrays is a can of worms. I think our only choice
right now is to open as few such cans as possible, knowing that we
have to open at least one. The other can of worms that I know of is
equality/hashcode for (boxed) primitive types. Essentially, we only
have two choices for either:
1) Treat hashCode and equals as in Java, and treat == as a separate
overloaded method, or:
2) Try to virtually override both hashCode and equals by having
special run-time type checks generated by the compiler.
I think in balance I prefer (1), because it would be very strange to
have hashCode or equals return different result depending on who calls
them. So that means we do not pull any special tricks for equals and
hashCode. For Java arrays they are what they are, and for Scala boxed
arrays they are equals and hashCode of the underlying Java array.
However, there would be an overloaded == method of type
def ==(other: Sequence[T]): Boolean
and that == would be structural.
So Array(1, 2, 3) == Array(1, 2, 3) is true but Array(1, 2, 3) equals
Array(1, 2, 3) is not. Generally eq implies equals and equals implies
==.
Then the only remaining problem is BoxedAnyArrays which have not yet
"snapped" into their target representation. Here, the equals contract
means that an unsnapped BoxedAnyArray can only be equal to itself
(because when it snaps into a concrete representation that
representation will be a new array, different from all others). But we
can't produce a hashCode yet. My inclination would be to throw an
UnSupportedOperationException in this case. That's safer than
returning a bogus hashcode that can change later.
Cheers
Sat, 2009-07-18, 19:27
#8
Re: Re: Unhashable trait bug
Martin~
Is it not possible to "snap" on hashCode request?
Matt
On Jul 18, 2009 2:04 PM, "martin odersky" <martin.odersky@epfl.ch> wrote:On Sat, Jul 18, 2009 at 1:25 PM, David MacIver<david.maciver@gmail.com> wrote: > 2009/7/18 martin od...
I'm not worried for varargs in case classes; they can be treated by
generating hashcodes that are adapted to them and that are structural.
Dealing with generic arrays is a can of worms. I think our only choice
right now is to open as few such cans as possible, knowing that we
have to open at least one. The other can of worms that I know of is
equality/hashcode for (boxed) primitive types. Essentially, we only
have two choices for either:
1) Treat hashCode and equals as in Java, and treat == as a separate
overloaded method, or:
2) Try to virtually override both hashCode and equals by having
special run-time type checks generated by the compiler.
I think in balance I prefer (1), because it would be very strange to
have hashCode or equals return different result depending on who calls
them. So that means we do not pull any special tricks for equals and
hashCode. For Java arrays they are what they are, and for Scala boxed
arrays they are equals and hashCode of the underlying Java array.
However, there would be an overloaded == method of type
def ==(other: Sequence[T]): Boolean
and that == would be structural.
So Array(1, 2, 3) == Array(1, 2, 3) is true but Array(1, 2, 3) equals
Array(1, 2, 3) is not. Generally eq implies equals and equals implies
==.
Then the only remaining problem is BoxedAnyArrays which have not yet
"snapped" into their target representation. Here, the equals contract
means that an unsnapped BoxedAnyArray can only be equal to itself
(because when it snaps into a concrete representation that
representation will be a new array, different from all others). But we
can't produce a hashCode yet. My inclination would be to throw an
UnSupportedOperationException in this case. That's safer than
returning a bogus hashcode that can change later.
Cheers
Sat, 2009-07-18, 19:37
#9
Re: Re: Unhashable trait bug
2009/7/18 Matt Fowles :
> Martin~
>
> Is it not possible to "snap" on hashCode request?
If it were possible to snap on hashCode request it would be possible
to snap at creation time. You can't fix the type of BoxedAnyArray
until you see its type parameter as something concrete.
Thanks for the explanation, this point is quite reasonable and evident and I fully agree with it.
The problem occurs when we deal with case classes with fields of Seq type. Of course, this fields are supposed to be immutable and therefore these instances may be used as keys. But very often while creating instances of such classes we obtain arguments from other libraries (usually, implemented in Java) which return not sequence but Java arrays.
Since now Seq (Sequence) is just a collection (not immutable), we have to write conversion like Seq(arr : _*) manually in all erroneous calls of constructor or refactor whole code by replacing Seq with List (which is still an alias to the immutable List class). In both cases this is kind of back-compatibility trouble. May be, you can give me ideas, how to avoid it by some convenient way?
Thanks.
With best regards,
Ilya
2009/7/17 Gilles Dubochet <gilles.dubochet@epfl.ch>