This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Improving Numeric's division operation

14 replies
Ben Hutchison 3
Joined: 2009-11-02,
User offline. Last seen 42 years 45 weeks ago.

What are the issues and options for improving scala.math.Numeric's
division operator?

1. The complaint that comes up over and over is that Numeric doesn't
define division, ie /. Instead, you have to use subtypes Integral (for
integer floor division via 'quot') or Fractional (for regular division
via 'div').

Could and should division be lifted up to the Numeric level?

+ The same type signature, (T, T) => T fits both styles of division.

- An existing implementation of Numeric, outside the core library,
would break if an abstract division method were added to Numeric. Is
that acceptable in a version jump like 2.9.x => 2.10.x or similar?
Alternately, is it feasible to define a default implementation for the
trait? I can't think of one.

Some lesser issues around division:

2. I think the following operation makes sense (conceptually):
Integral / Integral => Fractional. After all, this is the /definition/
of Rational numbers: the ratio of two integers. Is there a way to
support this kind of operation on an Integral typeclass of some
(sub)type? It seems to imply parametrizing the Integral type with an
associated Fractional type.

3. For integral division, should the fundamental division operation be
(T, T) => (T, T), returning both quotient and remainder in one step?
It seems inefficient to compute the same division twice if you need
both values, although probably only significant for tight loops &/
huge integer values.

-Ben

Simon Ochsenreither
Joined: 2011-07-17,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation
+1, I definitely want division in Numeric.
Ismael Juma 2
Joined: 2011-01-22,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation
On Wed, Oct 26, 2011 at 7:14 AM, Ben Hutchison <brhutchison@gmail.com> wrote:
3. For integral division, should the fundamental division operation be
(T, T) => (T, T), returning both quotient and remainder in one step?
It seems inefficient to compute the same division twice if you need
both values, although probably only significant for tight loops &/
huge integer values.

How do you intend to return both values in one step? If you box the result in a tuple, it's significantly more inefficient than simply doing the division twice while avoiding boxing.
Best,Ismael
d_m
Joined: 2010-11-11,
User offline. Last seen 35 weeks 2 days ago.
Re: Improving Numeric's division operation

On Wed, Oct 26, 2011 at 05:14:33PM +1100, Ben Hutchison wrote:
> Could and should division be lifted up to the Numeric level?

As you know, I feel the answer to this should be 'yes'. :)

> - An existing implementation of Numeric, outside the core library,
> would break if an abstract division method were added to Numeric. Is
> that acceptable in a version jump like 2.9.x => 2.10.x or similar?
> Alternately, is it feasible to define a default implementation for the
> trait? I can't think of one.

I can't think of a god default. I don't think very many people will be
affected by this change and the "fix" is an easy one. But it's not my
call to make.

> 2. I think the following operation makes sense (conceptually):
> Integral / Integral => Fractional. After all, this is the /definition/
> of Rational numbers: the ratio of two integers. Is there a way to
> support this kind of operation on an Integral typeclass of some
> (sub)type? It seems to imply parametrizing the Integral type with an
> associated Fractional type.

I would definitely support adding a Rational[T] type and then giving
Integral a method to create these. I would personally not use "/"
because I think there's a well-established tradition of using / as
integer division. That said, if we changed the operator that
Byte/Short/Int/Long use to something else (e.g. /_ or /~) and gave the
literals something like /(Int, Int) => Double then I would support
doing this for Numeric as well.

I did some testing and it seems possible to associate a fractional type
with each integral one [1]. I like this less than using a custom
rational type because it seems like the rational case allows you to get
a Double out if that's what you want (or a BigDecimal, or a Float, or
whatever). It's also clunky, requiring manual casting to get back the
type you actually want to work with (e.g. Double).

> 3. For integral division, should the fundamental division operation be
> (T, T) => (T, T), returning both quotient and remainder in one step?
> It seems inefficient to compute the same division twice if you need
> both values, although probably only significant for tight loops &/
> huge integer values.

As was alluded to, the JVM can compute either of these on primitives
very quickly. We can easily create a divmod operator (e.g. /%) which
returns a tuple for those who want it, but for Int doing integer
division and modulus in sequence is much faster than boxing the result
of both into a Tuple[Int, Int].

Hope this helps,

d_m
Joined: 2010-11-11,
User offline. Last seen 35 weeks 2 days ago.
Re: Improving Numeric's division operation

Typo!

For the record, the quoted error should be replaced with the actual
error you get if you capitalize Main properly. Ugh, sorry for the
confusion.

On Wed, Oct 26, 2011 at 09:29:40AM -0400, Erik Osheim wrote:
> scala> val x:Double = main.compute(1, 2, 3)
> :7: error: not found: value main
> val x:Double = main.compute(1, 2, 3)
> ^

scala> val x:Double = Main.compute(1, 2, 3)
:9: error: type mismatch;
found : Intish[Int]#Fract
required: Double
val x:Double = Main.compute(1, 2, 3)
^

John Nilsson
Joined: 2008-12-20,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation

On Wed, Oct 26, 2011 at 1:54 PM, Ismael Juma wrote:
> How do you intend to return both values in one step? If you box the result
> in a tuple, it's significantly more inefficient than simply doing the
> division twice while avoiding boxing.
> Best,
> Ismael

First option would probably be to trust the VM to inline the call and
remove the object allocation.

If that won't work, how about supplying a handler for the result?

def /[T](i2: Int)(h: (Int,Int)=>T): T

But this will also result in boxing, because of the curried definition, no?

So
def /[T](i2: Int, h: (Int,Int)=>T): T

but that doesn't really make for a nice looking api.

Are there other options on this theme?

BR,
John

Todd Vierling
Joined: 2011-04-27,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation
On Friday, October 28, 2011 5:33:38 PM UTC-4, John Nilsson wrote:
On Wed, Oct 26, 2011 at 1:54 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> How do you intend to return both values in one step? If you box the result
> in a tuple, it's significantly more inefficient than simply doing the
> division twice while avoiding boxing.

First option would probably be to trust the VM to inline the call and

remove the object allocation.

It won't. The bytecode says to create a Tuple2$mcII$sp to carry the values, so it will do that. Not really removable, AFAIK.
(That's the @specialized version of Tuple2[Int, Int], which holds the Ints themselves as primitives, so no boxing happens at the Int->Integer level. These specializations only exist for Tuple1 and Tuple2, not higher-arity tuples.)

If  that won't work, how about supplying a handler for the result?

    def /[T](i2: Int)(h: (Int,Int)=>T): T

But this will also result in boxing, because of the curried definition, no?

The existence of (Int,Int) anywhere in the definition results in boxing in a Tuple2 as described above.

John Nilsson
Joined: 2008-12-20,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation


Den 29 okt 2011 07:55 skrev "Todd Vierling" <tv@duh.org>:
>
> On Friday, October 28, 2011 5:33:38 PM UTC-4, John Nilsson wrote:
>>
>> On Wed, Oct 26, 2011 at 1:54 PM, Ismael Juma <ism...@juma.me.uk> wrote:
>> > How do you intend to return both values in one step? If you box the result
>> > in a tuple, it's significantly more inefficient than simply doing the
>> > division twice while avoiding boxing.
>
>
>> First option would probably be to trust the VM to inline the call and
>>
>> remove the object allocation.
>
> It won't. The bytecode says to create a Tuple2$mcII$sp to carry the values, so it will do that. Not really removable, AFAIK.

Fortunatley the bytecode is not the final say in what will be executed. Removing the tuple after inlining the call should be a trivial optimization for a reasonably competent VM.

> (That's the @specialized version of Tuple2[Int, Int], which holds the Ints themselves as primitives, so no boxing happens at the Int->Integer level. These specializations only exist for Tuple1 and Tuple2, not higher-arity tuples.)
>
>> If  that won't work, how about supplying a handler for the result?
>>
>>     def /[T](i2: Int)(h: (Int,Int)=>T): T
>>
>> But this will also result in boxing, because of the curried definition, no?
>
> The existence of (Int,Int) anywhere in the definition results in boxing in a Tuple2 as described above.(
>
(Int, Int) above was part of a function signature, not a tuple type. So no it won't result in a tuple.

BR,
John

Todd Vierling
Joined: 2011-04-27,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation
On Saturday, October 29, 2011 1:22:38 PM UTC-4, John Nilsson wrote:

>> First option would probably be to trust the VM to inline the call and
>> remove the object allocation.
>
> It won't. The bytecode says to create a Tuple2$mcII$sp to carry the values, so it will do that. Not really removable, AFAIK.

Fortunatley the bytecode is not the final say in what will be executed. Removing the tuple after inlining the call should be a trivial optimization for a reasonably competent VM.

I know of zero cases where any VM will remove an explicit object instantiation such as Tuple2. Perhaps for the boxed primitives, but not for complex objects.

>>     def /[T](i2: Int)(h: (Int,Int)=>T): T
>> But this will also result in boxing, because of the curried definition, no?
>
> The existence of (Int,Int) anywhere in the definition results in boxing in a Tuple2 as described above.

(Int, Int) above was part of a function signature, not a tuple type. So no it won't result in a tuple.

Right. I shouldn't trust my eyes at that hour. The currying doesn't result in a tuple, either, because it is basically a syntactical difference for what amounts to a 2-arg method in the class file.
Rüdiger Klaehn
Joined: 2009-06-02,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation

On Sat, Oct 29, 2011 at 8:35 PM, Todd Vierling wrote:
> On Saturday, October 29, 2011 1:22:38 PM UTC-4, John Nilsson wrote:
>>
>> >> First option would probably be to trust the VM to inline the call and
>> >> remove the object allocation.
>> >
>> > It won't. The bytecode says to create a Tuple2$mcII$sp to carry the
>> > values, so it will do that. Not really removable, AFAIK.
>>
>> Fortunatley the bytecode is not the final say in what will be executed.
>> Removing the tuple after inlining the call should be a trivial optimization
>> for a reasonably competent VM.
>
> I know of zero cases where any VM will remove an explicit object
> instantiation such as Tuple2. Perhaps for the boxed primitives, but not for
> complex objects.
>
Really? In principle, an extremely short-lived Tuple2 should be
eliminated by escape analysis and stack allocation. This has been in
the JVM for a while. See the section "escape analysis" here:
http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance...

Todd Vierling
Joined: 2011-04-27,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation
On Saturday, October 29, 2011 2:49:37 PM UTC-4, rklaehn wrote:
Really? In principle, an extremely short-lived Tuple2 should be
eliminated by escape analysis and stack allocation. This has been in
the JVM for a while. See the section "escape analysis" here:
http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html

"a while" is relative; apparently this appeared in 6u23, which is well after the last trivial-object-churn speed test I did. Thanks for the pointer, this is new (and quite welcomed!) to me.
Viktor Klang
Joined: 2008-12-17,
User offline. Last seen 1 year 27 weeks ago.
Re: Improving Numeric's division operation


On Oct 29, 2011 8:49 PM, "Rüdiger Klaehn" <rklaehn@googlemail.com> wrote:
>
> On Sat, Oct 29, 2011 at 8:35 PM, Todd Vierling <tv@duh.org> wrote:
> > On Saturday, October 29, 2011 1:22:38 PM UTC-4, John Nilsson wrote:
> >>
> >> >> First option would probably be to trust the VM to inline the call and
> >> >> remove the object allocation.
> >> >
> >> > It won't. The bytecode says to create a Tuple2$mcII$sp to carry the
> >> > values, so it will do that. Not really removable, AFAIK.
> >>
> >> Fortunatley the bytecode is not the final say in what will be executed.
> >> Removing the tuple after inlining the call should be a trivial optimization
> >> for a reasonably competent VM.
> >
> > I know of zero cases where any VM will remove an explicit object
> > instantiation such as Tuple2. Perhaps for the boxed primitives, but not for
> > complex objects.
> >
> Really? In principle, an extremely short-lived Tuple2 should be
> eliminated by escape analysis and stack allocation.

Ah, the mythical stack allocation strikes again.
Last I checked, hotspot doesn't do stack allcation. What it can do is scalar replacement via inlining and escape analysis.

Cheers,
V

This has been in
> the JVM for a while. See the section "escape analysis" here:
> http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html

Rüdiger Klaehn
Joined: 2009-06-02,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation

2011/10/29 √iktor Ҡlang :
>> Really? In principle, an extremely short-lived Tuple2 should be
>> eliminated by escape analysis and stack allocation.
>
> Ah, the mythical stack allocation strikes again.
> Last I checked, hotspot doesn't do stack allcation. What it can do is scalar
> replacement via inlining and escape analysis.
>
Really? I was not aware of the distinction.

If I have an object consisting of two ints, and it ends up as two
int-size things on the stack, I always thought of the object being
"allocated" on the stack, since the picture of the stack is exactly
the same as if you would have a struct containing two ints in C.

Ideally the compiler will go even further and allocate the two ints in
registers.

By the way: I actually saw the effect of scalar replacement when I did
some tests a few years ago using a case class Complex(re:Double,
im:Double) and the mandelbrot set iteration.

But there are a lot of cases where it does not happen. If I remember
correctly variables that are declared outside and used inside a loop
were not scalar replaced, which lead to the absurd situation that a
straight recursive method without @tailrec was faster than a loop or a
@tailrec optimized recursive method.

Even so, the effect of scalar replacement for a language like scala
that creates tiny shortlived objects like no tomorrow should be huge.

Todd Vierling
Joined: 2011-04-27,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation

On Sat, Oct 29, 2011 at 3:36 PM, Rüdiger Klaehn wrote:
> But there are a lot of cases where it does not happen. If I remember
> correctly variables that are declared outside and used inside a loop
> were not scalar replaced, which lead to the absurd situation that a
> straight recursive method without @tailrec was faster than a loop or a
> @tailrec optimized recursive method.

It may depend on the type of loop.

For instance, a for-comprehension on Range involves creating an anon
Function1 and executing it, so _vars_ reassigned from within the loop,
which are declared outside, become explicitly boxed proxy objects by
the Scala compiler.

Rüdiger Klaehn
Joined: 2009-06-02,
User offline. Last seen 42 years 45 weeks ago.
Re: Improving Numeric's division operation

On Sat, Oct 29, 2011 at 9:47 PM, Todd Vierling wrote:
> It may depend on the type of loop.
>
> For instance, a for-comprehension on Range involves creating an anon
> Function1 and executing it, so _vars_ reassigned from within the loop,
> which are declared outside, become explicitly boxed proxy objects by
> the Scala compiler.
>
It was a straight while loop. Anything else I would never expect to be
optimized.

I have to run the benchmark again with java 7. Would be very
reassuring to know that you can finally stop worrying about
short-lived objects.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland