This page is no longer maintained — Please continue to the home page at www.scala-lang.org

scala ranges versus lists performance on large collections

2 replies
fractal
Joined: 2010-05-12,
User offline. Last seen 25 weeks 4 days ago.

I've been running some performance micro-benchmarks on large
collections (10,000,000 elements), and I was quite surprised by the
performance a range operation been much more efficient than other
similar approaches. In particular, the conversion from Range to a List
seems to have a major impact.

Details of the benchmark can be found here:

http://stackoverflow.com/questions/8027801/scala-ranges-versus-lists-per...

The Range implementation uses less memory and is almost as fast as the
Java primitive array implementation, whereas the conversion from a
range to a list with the same code, has a even larger memory footprint
than a similar Java List implementation.

Can anybody explain why this behaviour? I'm really curious.

Simon Ochsenreither
Joined: 2011-07-17,
User offline. Last seen 42 years 45 weeks ago.
Re: scala ranges versus lists performance on large collections

In the first example you create a linked list with 10 elements by computing the steps of the range.

In the second example you create a linked list with 10 millions of elements and filter it down to a new linked list with 10 elements.

In the third example you create an array-backed buffer with 10 millions of elements which you traverse and print, no new array-backed buffer is created.

Conclusion:

Every piece of code does something different, that's why the performance varies greatly.

fractal
Joined: 2010-05-12,
User offline. Last seen 25 weeks 4 days ago.
Re: scala ranges versus lists performance on large collections

The objective of creating those different implementations is to
actually compare how would they behave.

What i was looking for in the first place is why the performance using
just range was so close to an array of primitives, or even less
memory. now is clear to me that the range doesn't actually store the
values, which is great.

On the other hand, the toList is actually larger than an ArrayList.
However, as pointed out, the comparison should be against a linked
list. I added a new example comparing the performance of a linked
list, and is in fact similar to the second scala example.

Thanks a lot for the answers

Fernando

On Nov 6, 5:52 pm, Simon Ochsenreither
wrote:
> In the first example you create a *linked list* with 10 elements by
> computing the steps of the range.
>
> In the second example you create a *linked list* with 10 millions of
> elements and filter it down to a new *linked list* with 10 elements.
>
> In the third example you create an *array-backed buffer* with 10 millions
> of elements which you traverse and print, no new *array-backed buffer* is
> created.
>
> *Conclusion:*
>
> Every piece of code does something different, that's why the performance
> varies greatly.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland