This page is no longer maintained — Please continue to the home page at www.scala-lang.org

equals and "Numbers"

17 replies
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.

The equals method logic currently ends with this on Number/Number
comparisons:

return xn.intValue() == yn.intValue();

Unfortunately Number could be many things beyond the primitives, and
.intValue() just means "hack off 32 bits and call it an Int", so:

scala> BoxesRunTime.equals(0, BigInt((Int.MaxValue.toLong + 1) * 2))
res0: Boolean = true

This brings up a question which was outstanding anyway, that being what
to do with the java BigInt and BigDecimal and their scala wrappers. All
four of them are "Numbers". Since they all have arbitrary precision,
and we can now dictate their hashCodes, it's easy enough to have them
all use the same hashCode as, and be equal to, all whole numbers in the
Int range.

Beyond that it potentially becomes quite a bit more complicated, as
essentially we have a whole second tier of hashCodes which need to be
coordinated on top of those in the Int range (the set of Longs -- Ints)
and the (also unresolved) issues with Float arise again with Double. So
there are approximately three available approaches taking "equal objects
must have equal hashcodes" as a given, and ignoring for now the third
hashCode tier which arises between arbitrary precision types.

1) The "nothing" approach:

0 != BigInt(0) != BigDecimal(0)

2) The "limited range consistency" approach:

Int.MaxValue == BigInt(Int.MaxValue) == BigDecimal(Int.MaxValue)
Long.MaxValue != BigInt(Long.MaxValue)

3) The "pull out the stops" approach:

Long.MaxValue == BigInt(Long.MaxValue) == BigDecimal(Long.MaxValue)

// But what to do about Double, which has Float's problem 8x worse:
// Iterator.iterate(Long.MaxValue)(_ - 1) takeWhile (Long.MaxValue.toDouble == _) length
// res0: Int = 512

All of them imply strange behavior somewhere.

Also, anyone can subclass Number, so handling these two types isn't
sufficient if we're potentially routing any Number into this equals.

nilskp
Joined: 2009-01-30,
User offline. Last seen 1 year 27 weeks ago.
Re: equals and "Numbers"
On Thu, Nov 12, 2009 at 8:56 AM, Paul Phillips <paulp@improving.org> wrote:
// But what to do about Double, which has Float's problem 8x worse:

Floating point number comparison has always been an inexact thing, so I would have no problem with comparison simply being: any non-fp number is cast to fp and then compared. End of story.

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 09:49:35AM -0600, Nils Kilden-Pedersen wrote:
> Floating point number comparison has always been an inexact thing, so
> I would have no problem with comparison simply being: any non-fp
> number is cast to fp and then compared. End of story.

End of story except that it doesn't solve anything. What you describe
is already what takes place. The issue is the hash codes.

scala> (Long.MaxValue).toDouble == Long.MaxValue.toDouble
res0: Boolean = true

scala> (Long.MaxValue - 1).toDouble == Long.MaxValue.toDouble
res1: Boolean = true

scala> (Long.MaxValue).hashCode == (Long.MaxValue - 1).hashCode
res2: Boolean = false

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 3:56 PM, Paul Phillips wrote:
> The equals method logic currently ends with this on Number/Number
> comparisons:
>
>  return xn.intValue() == yn.intValue();
>
> Unfortunately Number could be many things beyond the primitives, and
> .intValue() just means "hack off 32 bits and call it an Int", so:
>
Yes, I was thinking of that also. Do we need a case like the old
BoxesRunTime where we sometimes do a b.equals(a)? (namely if b is an
unknown instance of Number).
But then I looked at the classes BigInt and BigDecimal, which are
currently the only two classes falling in that category. Their equals
method returns false for all operands that are not BigInts,
respectively BigDecimals. So it seems all that logic in BoxesRuntime
did nothing.

It's something to think about whether we want to enable == equality
between user-defined
Number types and standard Number types. If we want to go down that
route I think it would be better to generalize, i.e. always call
equals on the right operand if it is of a type EqualsProxy, or
something like that, and the left operand is not. It would complicate
the logic, and consequently slow down ==, but it would make == more
general.

For the moment I say, let's not do it, because it seems we did not
support thgis kind of equality anyway so far. Let's get more data how
much it would cost first.

Cheers

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 05:22:08PM +0100, martin odersky wrote:
> For the moment I say, let's not do it, because it seems we did not
> support thgis kind of equality anyway so far. Let's get more data how
> much it would cost first.

Actually it used to, but sometime in the past I altered it because it
was such a blatant violation of equals/hashCode consistency. I cannot
remember exactly what discussion took place since it was several
equality attempts back. But in 2.7 these things are true:

scala> 0 == BigInt(0)
res0: Boolean = true

scala> BigInt(0) == 0
res1: Boolean = true

As is this:

scala> BigInt(0) == BigInt(0).bigInteger
res4: Boolean = true

Of course it's not transitive:

scala> 0 == BigInt(0).bigInteger
res5: Boolean = false

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

Here's what BigInt's equals used to look like. It had all kinds of
issues, for starters: assuming equal longValues are equal values, being
equal to objects with half a dozen different hashcodes, not being
reflexive because you can sneak by the scalac logic for spotting boxed
primitives.

override def equals (that: Any): Boolean = that match {
case that: BigInt => this equals that
case that: java.lang.Double => this.bigInteger.doubleValue == that.doubleValue
case that: java.lang.Float => this.bigInteger.floatValue == that.floatValue
case that: java.lang.Number => this equals BigInt(that.longValue)
case that: java.lang.Character => this equals BigInt(that.charValue.asInstanceOf[Int])
case _ => false
}

Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Re: equals and "Numbers"

I've been wondering if we could have an "imprecise" number type, call
it ApproximateNumber
We can then create a +- operator on numbers to yield approximate numbers
(hell, why not go all the way and also create ±, we've already gone
beyond raw ascii for some arrow operators)

This allows the formulation:

(1.1 == 1.0 +- 0.2) = true
or
(1.1 == 1.0 ± 0.2) = true

After that it's a simple (cough, cough) matter of extending equality
to support these new beasts :)

I'll leave it to better folk than me to determine how this would
interact with < and > behaviour. Not to mention the fallout of
further breaking the equals/hashcode contract.

On Thu, Nov 12, 2009 at 4:37 PM, Paul Phillips wrote:
> On Thu, Nov 12, 2009 at 05:22:08PM +0100, martin odersky wrote:
>> For the moment I say, let's not do it, because it seems we did not
>> support thgis kind of equality anyway so far. Let's get more data how
>> much it would cost first.
>
> Actually it used to, but sometime in the past I altered it because it
> was such a blatant violation of equals/hashCode consistency.  I cannot
> remember exactly what discussion took place since it was several
> equality attempts back.  But in 2.7 these things are true:
>
> scala> 0 == BigInt(0)
> res0: Boolean = true
>
> scala> BigInt(0) == 0
> res1: Boolean = true
>
> As is this:
>
> scala> BigInt(0) == BigInt(0).bigInteger
> res4: Boolean = true
>
> Of course it's not transitive:
>
> scala> 0 == BigInt(0).bigInteger
> res5: Boolean = false
>
> --
> Paul Phillips      | It's not enough to bash in heads - you've got to
> Apatheist          | bash in minds.
> Empiricist         |     -- Capt Hammer
> all hip pupils!    |----------* http://www.improving.org/paulp/ *----------
>

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 05:22:08PM +0100, martin odersky wrote:
> It's something to think about whether we want to enable == equality
> between user-defined Number types and standard Number types. If we
> want to go down that route I think it would be better to generalize,
> i.e. always call equals on the right operand if it is of a type
> EqualsProxy, or something like that, and the left operand is not.

Ah, we are reliving PriorityEquals or whatever I called it. I wonder if
I still have that code, as I had that fully implemented and performance
measured (although at the time I was more focused on RichString and the
ol' "bob" == "bob".reverse.)

As I recall there was a measurable impact. But it should be smaller if
the instanceOf check is part of inlinedEquals. I can measure.

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 6:13 PM, Paul Phillips wrote:
> On Thu, Nov 12, 2009 at 05:22:08PM +0100, martin odersky wrote:
>> It's something to think about whether we want to enable == equality
>> between user-defined Number types and standard Number types. If we
>> want to go down that route I think it would be better to generalize,
>> i.e. always call equals on the right operand if it is of a type
>> EqualsProxy, or something like that, and the left operand is not.
>
> Ah, we are reliving PriorityEquals or whatever I called it.  I wonder if
> I still have that code, as I had that fully implemented and performance
> measured (although at the time I was more focused on RichString and the
> ol' "bob" == "bob".reverse.)
>
> As I recall there was a measurable impact.  But it should be smaller if > the instanceOf check is part of inlinedEquals.  I can measure.
>
OK, let's be conservative because it's late for 2.8. I think if 2.7 made

0 == BigInt(0)

true then we should not drop this unless we have a very good reason. I
just checked in some hooks to make this possible again. There's now a
class scala.math.ScalaNumber from which BigInt and BigDecimal inherit.
Furthermore, there's a test in BoxesRuntime.equals that makes sure
that if we have a comparison (a: Number) == (b: ScalaNumber) or (a:
Character) == (b: ScalaNumber) we translate that to b.equals(a). So
Scala numbers always get their equals methods called when compared
with Numbers or Characters. Now, all we need to do is give BigInt,
BigDecimal the right implementation of equals and hashCode that's
compatible with normal numeric equality.

I believe the new hooks add very little run-time overhead beause they
are only executed in the slow path of equality. We can think about a
more general PritoryEquals method later but right now I would not do
it, and restrict myself to numbers only. After all the previous use
cases we had for String vs RichString and maybe also for Arrays all
went away.

Do you agree with that, Paul, and do you want to add the cases to
BigInt/BigDecimal?

Cheers

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 06:30:38PM +0100, martin odersky wrote:
> Do you agree with that, Paul, and do you want to add the cases to
> BigInt/BigDecimal?

I do agree and I would like to implement the Big* changes. Also I think
the new equals method in BoxesRunTime doesn't have quite the necessary
logic, which I can fix/explain in the upcoming patch.

Randall R Schulz
Joined: 2008-12-16,
User offline. Last seen 1 year 29 weeks ago.
Re: equals and "Numbers"

On Thursday November 12 2009, Kevin Wright wrote:
> I've been wondering if we could have an "imprecise" number type, call
> it ApproximateNumber
> We can then create a +- operator on numbers to yield approximate
> numbers (hell, why not go all the way and also create ±, we've
> already gone beyond raw ascii for some arrow operators)
>
> This allows the formulation:
>
> (1.1 == 1.0 +- 0.2) = true
> or
> (1.1 == 1.0 ± 0.2) = true
>
> After that it's a simple (cough, cough) matter of extending equality
> to support these new beasts :)

Are you suggesting an interval math library? That would be a good thing.

> I'll leave it to better folk than me to determine how this would
> interact with < and > behaviour. Not to mention the fallout of
> further breaking the equals/hashcode contract.

I believe (based on a trifling amount of actual knowledge) that interval
math defines all the usual arithmetic operations, including
comparisons.

Randall Schulz

Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 5:50 PM, Randall R Schulz wrote:
> On Thursday November 12 2009, Kevin Wright wrote:
>> I've been wondering if we could have an "imprecise" number type, call
>> it ApproximateNumber
>> We can then create a +- operator on numbers to yield approximate
>> numbers (hell, why not go all the way and also create ±, we've
>> already gone beyond raw ascii for some arrow operators)
>>
>> This allows the formulation:
>>
>> (1.1 == 1.0 +- 0.2) = true
>> or
>> (1.1 == 1.0 ± 0.2) = true
>>
>> After that it's a simple (cough, cough) matter of extending equality
>> to support these new beasts :)
>
> Are you suggesting an interval math library? That would be a good thing.
>
>
>> I'll leave it to better folk than me to determine how this would
>> interact with < and > behaviour.  Not to mention the fallout of
>> further breaking the equals/hashcode contract.
>
> I believe (based on a trifling amount of actual knowledge) that interval
> math defines all the usual arithmetic operations, including
> comparisons.
>
>
> Randall Schulz
>

I was hesitant to use "Interval", thinking it was used elsewhere.
Glad to say that a quick search of the current scaladoc has proved me
wrong here, so interval it is!

So something like Interval[@specialized T] should do the trick nicely :)

As I think more about it, I don't see why equality and other
comparisons couldn't also be handled via implicit conversions from
number types using the pimp-my-library pattern.

Randall R Schulz
Joined: 2008-12-16,
User offline. Last seen 1 year 29 weeks ago.
Re: equals and "Numbers"

On Thursday November 12 2009, Kevin Wright wrote:
> ...
>
> I was hesitant to use "Interval", thinking it was used elsewhere.
> Glad to say that a quick search of the current scaladoc has proved me
> wrong here, so interval it is!
>
> So something like Interval[@specialized T] should do the trick nicely
> :)
>
> As I think more about it, I don't see why equality and other
> comparisons couldn't also be handled via implicit conversions from
> number types using the pimp-my-library pattern.

Do understand that I'm talking about an established branch of
mathematics, not just the name for some class. To wit:

Randall Schulz

Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Re: equals and "Numbers"

On Thu, Nov 12, 2009 at 7:12 PM, Randall R Schulz wrote:
> On Thursday November 12 2009, Kevin Wright wrote:
>> ...
>>
>> I was hesitant to use "Interval", thinking it was used elsewhere.
>> Glad to say that a quick search of the current scaladoc has proved me
>> wrong here, so interval it is!
>>
>> So something like Interval[@specialized T] should do the trick nicely
>> :)
>>
>> As I think more about it, I don't see why equality and other
>> comparisons couldn't also be handled via implicit conversions from
>> number types using the pimp-my-library pattern.
>
> Do understand that I'm talking about an established branch of
> mathematics, not just the name for some class. To wit:
>
>
>
>
>
>
> Randall Schulz
>

Most definitely, Interval was my preferred name anyhow!
I'm just musing now on how to implement the thing and what practical
considerations might separate us from algebraic perfection...

Kris Nuttycombe
Joined: 2009-01-16,
User offline. Last seen 42 years 45 weeks ago.
Re: equals and "Numbers"

For whatever it's worth, here's an Interval class I use principally
for date ranges, but should be usable for other types (or at least
could provide a few potentially useful methods on a new Interval
class).

trait Interval[T, U <: Interval[T,U]] extends Ordered[U] {
this: U =>

implicit def ord(t: T): Ordered[T]
def start : Option[T];
def end : Option[T];
def withStart(start: Option[T]) : U;
def withEnd(end: Option[T]) : U;

def startsBefore(other: U): Boolean = {
start.forall(s => other.start.exists(_ > s))
}

def endsAfter(other: U): Boolean = {
end.forall(e => other.end.exists(_ < e))
}

def disjoint(other: U): Boolean = {
end.flatMap(x => other.start.filter(_ > x)).
orElse(start.flatMap(x => other.end.filter(_ < x))).
isDefined
}

def intersect(other: U): Option[U] = {
if (disjoint(other)) {
None
} else if (startsBefore(other) && endsAfter(other)) {
Some(other)
} else if (startsBefore(other) && other.endsAfter(this)) {
Some(withEnd(other.start))
} else if (other.startsBefore(this) && endsAfter(other)) {
Some(withStart(other.end))
} else {
other.intersect(this);
}
}

def union(other: U): Either[U, (U, U)] = {
if (disjoint(other)) {
Right(this, other)
} else if (startsBefore(other) && endsAfter(other)) {
Left(this)
} else if (startsBefore(other) && other.endsAfter(this)) {
Left(withEnd(other.end))
} else if (other.startsBefore(this) && endsAfter(other)) {
Left(withStart(other.start))
} else {
other.union(this)
}
}

def compare(other: U): Int = {
compare(start, other.start, true) | compare(end, other.end, false);
}

private def compare(o1: Option[T], o2: Option[T], left:
Boolean): Int = {
return if (o1.isEmpty && o2.isEmpty) 0
else if (o1.isEmpty) (if (left) -1 else 1)
else if (o2.isEmpty) (if (left) 1 else -1)
else o1.get.compareTo(o2.get)
}
}

On Thu, Nov 12, 2009 at 1:56 PM, Kevin Wright
wrote:
> On Thu, Nov 12, 2009 at 7:12 PM, Randall R Schulz wrote:
>> On Thursday November 12 2009, Kevin Wright wrote:
>>> ...
>>>
>>> I was hesitant to use "Interval", thinking it was used elsewhere.
>>> Glad to say that a quick search of the current scaladoc has proved me
>>> wrong here, so interval it is!
>>>
>>> So something like Interval[@specialized T] should do the trick nicely
>>> :)
>>>
>>> As I think more about it, I don't see why equality and other
>>> comparisons couldn't also be handled via implicit conversions from
>>> number types using the pimp-my-library pattern.
>>
>> Do understand that I'm talking about an established branch of
>> mathematics, not just the name for some class. To wit:
>>
>>
>>
>>
>>
>>
>> Randall Schulz
>>
>
> Most definitely, Interval was my preferred name anyhow!
> I'm just musing now on how to implement the thing and what practical
> considerations might separate us from algebraic perfection...
>

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: equals and "Numbers"

Just looking at you checkin... Seeing the two tests with typeCode <=
INT in BoxesRuntime.equals, I realize that I had not taken
java.math.BigInteger and friends into account. But there must be a
faster way to do this. the way it's done now is not so
great.

Cheers

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: equals and "Numbers"

On Fri, Nov 13, 2009 at 07:09:08PM +0100, martin odersky wrote:
> Just looking at you checkin... Seeing the two tests with typeCode <=
> INT in BoxesRuntime.equals, I realize that I had not taken
> java.math.BigInteger and friends into account. But there must be a
> faster way to do this. the way it's done now is not so great.

Right, where "and friends" includes all the whole universe of unknown
subtypes of Number. I do not think it is even theoretically possible to
avoid those checks and preserve correctness; but it is probably possible
to reorder/tweak the logic to get the most common cases completing with
the fewest checks. I am happy to look into this.

In one of my previous iterations of equality I blended the typecodes of
the lhs and rhs up front like

switch(typeCode(lhs) * MAX + typeCode(rhs)) { ...

and as I remember it was the fastest on average.

I thought I checked in a reordering of the checks in typeCode but maybe
I left that in a branch. It can be made faster by preferencing the most
common types, and I would strongly suspect it's faster overall to
discriminate on Number before going through all the boxed types, because
all the Object/Object comparisons which don't turn out to be primitives
can then get out with two checks instead of seven.

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: equals and "Numbers"

On Fri, Nov 13, 2009 at 8:08 PM, Paul Phillips wrote:
> On Fri, Nov 13, 2009 at 07:09:08PM +0100, martin odersky wrote:
>> Just looking at you checkin... Seeing the two tests with typeCode <=
>> INT in BoxesRuntime.equals, I realize that I had not taken
>> java.math.BigInteger and friends into account. But there must be a
>> faster way to do this. the way it's done now is not so great.
>
> Right, where "and friends" includes all the whole universe of unknown
> subtypes of Number.  I do not think it is even theoretically possible to
> avoid those checks and preserve correctness; but it is probably possible
> to reorder/tweak the logic to get the most common cases completing with
> the fewest checks.  I am happy to look into this.
>
> In one of my previous iterations of equality I blended the typecodes of
> the lhs and rhs up front like
>
>  switch(typeCode(lhs) * MAX + typeCode(rhs)) { ...
>
> and as I remember it was the fastest on average.
>
> I thought I checked in a reordering of the checks in typeCode but maybe
> I left that in a branch.  It can be made faster by preferencing the most
> common types, and I would strongly suspect it's faster overall to
> discriminate on Number before going through all the boxed types, because
> all the Object/Object comparisons which don't turn out to be primitives
> can then get out with two checks instead of seven.
>
I have checked in something following some of these suggestions. I
have also checked in a benchmark under test/files/bench which tries to
get some usable data on equality performance. The mix of operations is
detailed in the doc comment of eqeq.scala. I have put some test
results in eqeq.log. More testing and possibly tuning is very welcome!

Generally, I think it would be good to start adding some benchmarks of
critical operations to the tests. Is test/files/bench the best place,
or are there other suggestions?

Cheers

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland