- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
sizeHint
Sun, 2010-05-09, 13:10
Paul suggested some time ago a better Builder size hint to make operations like map faster. The idea was to ask whether
the source collection had a "cheap to compute" size and, if yes, use this in a size hint for the builder of the map method, and methods
like it. I have come up with a variation of that. Let's add a new sizeHint variant to Builder, like this:
/** Gives a hint that one expects the `result` of this builder
* to have the same size as the given collection. This will
* provide a hint only if the collection is known to have a cheap
* `size` method. Currently this is assumed ot be the case if and only if
* the collection is of type `IndexedSeqLike`.
* Some builder classes
* will optimize their representation based on the hint. However,
* builder implementations are still required to work correctly even if the hint is
* wrong, i.e. a different number of elements is added.
*
* @param coll the collection which serves as a hint for the result's size.
*/
def sizeHint(coll: TraversableLike[_, _]) {
if (coll.isInstanceOf[IndexedSeqLike[_,_]]) sizeHint(coll.size)
}
The advantage is that it's very easy to use. Simply do bldr.sizeHint(coll), once you have the builder `bldr` and the
collection `coll` that should serve as template for the sizeHint. Also, there is now very little need for a "hasCheapSize" method for every collection. We can fine tune the code in Builder, but I think inheriting from IndexedSeqLike is a pretty good approximation for having a cheap size.
What do you think?
-- Martin
the source collection had a "cheap to compute" size and, if yes, use this in a size hint for the builder of the map method, and methods
like it. I have come up with a variation of that. Let's add a new sizeHint variant to Builder, like this:
/** Gives a hint that one expects the `result` of this builder
* to have the same size as the given collection. This will
* provide a hint only if the collection is known to have a cheap
* `size` method. Currently this is assumed ot be the case if and only if
* the collection is of type `IndexedSeqLike`.
* Some builder classes
* will optimize their representation based on the hint. However,
* builder implementations are still required to work correctly even if the hint is
* wrong, i.e. a different number of elements is added.
*
* @param coll the collection which serves as a hint for the result's size.
*/
def sizeHint(coll: TraversableLike[_, _]) {
if (coll.isInstanceOf[IndexedSeqLike[_,_]]) sizeHint(coll.size)
}
The advantage is that it's very easy to use. Simply do bldr.sizeHint(coll), once you have the builder `bldr` and the
collection `coll` that should serve as template for the sizeHint. Also, there is now very little need for a "hasCheapSize" method for every collection. We can fine tune the code in Builder, but I think inheriting from IndexedSeqLike is a pretty good approximation for having a cheap size.
What do you think?
-- Martin
Sun, 2010-05-09, 14:37
#2
Re: sizeHint
On Sun, May 9, 2010 at 2:17 PM, Ismael Juma <mlists@juma.me.uk> wrote:
Hi Martin,
On Sun, May 9, 2010 at 1:10 PM, martin odersky <martin.odersky@epfl.ch> wrote:
> Let's add a new sizeHint variant to Builder, like this:
This sounds like a good approach.
> collection. We can fine tune the code in Builder, but I think inheriting
> from IndexedSeqLike is a pretty good approximation for having a cheap size.
I'm not sure if this is enough, although it's a good start. Hash-based
collections like HashMap and HashSet could benefit greatly from
avoiding resizes (more so than Array/ArrayBuffer due to rehashing),
they usually have a cheap size method and they do not inherit from
IndexedSeqLike.
True. But currently HashMaps and HashSets are not set up anyway to profit from this, because they go through the normal
MapBuilder. Also, I am not sure whether map and methods like it are really that common on maps.
Cheers
-- Martin
Sun, 2010-05-09, 15:07
#3
Re: sizeHint
On Sun, May 9, 2010 at 2:32 PM, martin odersky wrote:
> True. But currently HashMaps and HashSets are not set up anyway to profit
> from this, because they go through the normal
> MapBuilder.
I see.
> Also, I am not sure whether map and methods like it are really
> that common on maps.
I found that statement a bit surprising since I tend to use them a
fair amount. Out of curiosity, I asked in #scala and both people who
replied also shared the view. A small sample, of course, but I thought
I'd mention it.
Best,
Ismael
Hi Martin,
On Sun, May 9, 2010 at 1:10 PM, martin odersky wrote:
> Let's add a new sizeHint variant to Builder, like this:
This sounds like a good approach.
> collection. We can fine tune the code in Builder, but I think inheriting
> from IndexedSeqLike is a pretty good approximation for having a cheap size.
I'm not sure if this is enough, although it's a good start. Hash-based
collections like HashMap and HashSet could benefit greatly from
avoiding resizes (more so than Array/ArrayBuffer due to rehashing),
they usually have a cheap size method and they do not inherit from
IndexedSeqLike.
Best,
Ismael