Retaining the retain - what methods should the collections API additionally support
Tue, 2011-05-17, 15:54
Hello all,
this e-mail is related to the ticket #4597 and all the similar tickets received lately (e.g. #4247). Basically, the proposal in these tickets is to add additional methods to the collections API. This concrete example (#4597) proposes that we add `shrink` to `Shrinkable` so that it's possible to remove elements using a predicate. This method would do what `retain` does for mutable sets and maps.
We admit that the collections API may lack certain methods. However, we do not want to treat this and similar tickets in isolation. To decide which additional methods should be added to the collections API, we should analyze the common use cases and the required methods together. We don't want to add mutually redundant methods that may possibly result from multiple enhancement requests.
So the question is:
What methods do you think the collections API is lacking, that cannot be easily expressed in terms of existing methods?
Which methods do you think are particularly relevant, and cannot be expressed in terms of existing methods at all?
Which methods do you feel should be moved to superclasses?
If and once the list of proposals becomes large enough, we might decide to include the missing methods.
Thanks,
Aleksandar
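To make the request in #4597 concrete, here is a minimal sketch of predicate-based removal built only from existing bulk operations. The helper name `retainWhere` and the object `RetainSketch` are hypothetical, introduced purely for illustration; this is not the signature proposed in the ticket.

```scala
import scala.collection.mutable

object RetainSketch {
  // Hypothetical helper, not part of the standard library: keeps only the
  // elements of `buf` that satisfy `p`, mutating `buf` itself.
  def retainWhere[A](buf: mutable.Buffer[A])(p: A => Boolean): Unit = {
    val kept = buf.filter(p).toList // survivors, copied into a new collection
    buf.clear()                     // empty the original buffer in place
    buf ++= kept                    // put the survivors back
  }
}
```

Note that the sketch has to build a temporary collection first, which is part of why a dedicated in-place method is being requested.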
Tue, 2011-05-17, 16:57
#2
Re: Retaining the retain - what methods should the collections
On Tue, May 17, 2011 at 4:33 PM, martin odersky wrote:
> I should also say that every addition of a method to the collection
> libraries comes at a cost, and the question is still unanswered whether the
> cost is worth the gains. We can try to answer that only by looking at
> concrete proposals that cover the whole spectrum of methods to be added.
This is indeed important to keep in mind. I had to set
-XX:ReservedCodeCacheSize=96m (default is 64m) when running IDEA once
I moved to Scala 2.9.0. I believe this is due to the increased number
of classes that are now JIT'd. Once the maximum is reached, no new
methods are JIT'd and this can cause major performance issues.
It's a shame that Paul Phillips's attempt to introduce abstract classes
in the collection hierarchy did not work out.
Best,
Ismael
Tue, 2011-05-17, 18:57
#3
Re: Retaining the retain - what methods should the collections
On Tue, May 17, 2011 at 11:54, Aleksandar Prokopec
wrote:
> Hello all,
>
> So the question is:
> What methods do you think the collections API is lacking, that cannot be
> easily expressed in terms of existing methods?
> Which methods do you think are particularly relevant, and cannot be
> expressed in terms of existing methods at all?
> Which methods do you feel should be moved to superclasses?
My own issue is with the methods that split a collection in
before/after. There's take/drop/splitAt and takeWhile/dropWhile/span.
In the first case, the problem is solely with splitAt, where I'd like
to have something that returned (before, at, after). It doesn't matter
with List, but it can make a difference with other collections.
However, this is just a performance issue.
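A rough sketch of the (before, at, after) variant described above, written with existing methods. The names `SplitSketch` and `splitAround` are hypothetical, and returning an `Option` for an out-of-range index is an assumption of this sketch, not something specified in the thread.

```scala
object SplitSketch {
  // Hypothetical helper: splits `xs` around index `i`, returning the
  // elements before it, the element at it, and the elements after it.
  // Returns None when `i` is out of range (an assumption of this sketch).
  def splitAround[A](xs: IndexedSeq[A], i: Int): Option[(IndexedSeq[A], A, IndexedSeq[A])] =
    if (i < 0 || i >= xs.length) None
    else Some((xs.take(i), xs(i), xs.drop(i + 1)))
}
```

Expressed this way the collection is traversed or sliced more than once, which is the performance concern mentioned above; a built-in version could do the work in a single pass.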
The takeWhile/dropWhile/span family is another matter. Many times I need
the first element that falsifies the predicate to be kept with the
first collection, not the second. The only way to accomplish this is
to span, and then move the first element of the second collection to
the end of the first collection -- an expensive operation, and very
awkward to begin with. I need it mostly with takeWhile, where the
current implementation often yields the equivalent of a foldLeft that
returned the next-to-last accumulator.
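The span-based workaround described above could look roughly like this. `TakeSketch` and `takeThrough` are hypothetical names, and the sketch is restricted to `List` for simplicity.

```scala
object TakeSketch {
  // Hypothetical helper: like takeWhile, but also keeps the first element
  // that falsifies the predicate (if there is one).
  def takeThrough[A](xs: List[A])(p: A => Boolean): List[A] = {
    val (prefix, rest) = xs.span(p) // rest starts at the first failing element
    prefix ++ rest.take(1)          // appending to a List copies the prefix
  }
}
```

The final append is the expensive and awkward step: for a `List` it copies the whole prefix just to attach one element.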
And as others suggested before, the mutable collections ought to have
a decent set of mutator methods for insertion/deletion. It is quite
ridiculous to resort to Java for that.
Wed, 2011-05-18, 11:07
#4
RE: Retaining the retain - what methods should the collections
> From: dcsobral@gmail.com
> In the first case, the problem is solely with splitAt, where I'd like
> to have something that returned (before, at, after). It doesn't matter
> with List, but it can make a difference with other collections.
> However, this is just a performance issue.
It's also a clarity issue. For example, when you want a file name and extension, it would be really nice to write:
val (name, _, ext) = file.getName.splitAt(file.getName lastIndexOf '.')
This kind of thing is incredibly common, and the current "workarounds" are not so nice.
Chris
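For comparison, one of the current workarounds might be sketched as follows, using only the existing two-way `splitAt` on strings. `FileNameSketch` and `nameAndExt` are hypothetical names, and treating a name without a dot as having an empty extension is an assumption of this sketch.

```scala
object FileNameSketch {
  // Splits a file name at its last '.', returning (name, extension).
  // A name without any '.' gets an empty extension (assumption of this sketch).
  def nameAndExt(fileName: String): (String, String) = {
    val dot = fileName.lastIndexOf('.')
    if (dot < 0) (fileName, "")
    else {
      val (name, rest) = fileName.splitAt(dot) // rest still begins with '.'
      (name, rest.drop(1))                     // drop the separator by hand
    }
  }
}
```

The extra `drop(1)` and the explicit index check are the parts a three-way split would make unnecessary.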
Wed, 2011-05-18, 11:27
#5
RE: Retaining the retain - what methods should the collections
I totally agree about the "big picture" - but I think it is reasonable to assume that for every immutable operation, if it "makes sense" for the operation to have a mutable version (which filter *most certainly does*), then there should be a mutable version. Possible conventions:
prefix m: mFilter/mfilter etc.
suffix "inPlace": filterInPlace
I personally prefer the former, as "filterNotInPlace" is confusing, whereas mfilterNot/mFilterNot is not.
Daniel's suggestion of a version of splitAt on Repr <% IndexedSeq[A] which returns (Repr, A, Repr) would be *incredibly* useful. Unfortunately the split and partition names are taken - perhaps "separateAt"?
Chris
From: martin.odersky@epfl.ch
Date: Tue, 17 May 2011 17:33:23 +0200
Subject: Re: [scala-internals] Retaining the retain - what methods should the collections API additionally support
To: scala-internals@googlegroups.com
Before we get into a shoot-for-all where everybody proposes his favorite method, let me propose we focus this a little bit.
In the collection libraries most bulk operations return a new collection, even if the underlying collection is mutable. There are some bulk operations such as ++= and --= which work in place, however. They are analogues of ++ and --, respectively. The question is, should we generalize this, and, if yes, how? That is, which other functional bulk operations should have in-place correspondents? And, how should these be named? I think before deciding whether we want to go down that route it's important to see the whole picture.
I should also say that every addition of a method to the collection libraries comes at a cost, and the question is still unanswered whether the cost is worth the gains. We can try to answer that only by looking at concrete proposals that cover the whole spectrum of methods to be added.
Cheers
-- Martin
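The distinction between the copying and the in-place bulk operations mentioned in this message can be illustrated with a small sketch. The object name `BulkOpsSketch` and the method `demo` are made up for the example; `++`, `++=` and `--=` are the existing operations on mutable buffers.

```scala
import scala.collection.mutable

object BulkOpsSketch {
  // Returns (buffer contents after ++, result of ++, buffer contents after ++= and --=).
  def demo(): (List[Int], List[Int], List[Int]) = {
    val buf = mutable.ArrayBuffer(1, 2, 3)

    val copy = buf ++ List(4, 5) // returns a new collection
    val untouched = buf.toList   // buf itself still holds 1, 2, 3

    buf ++= List(4, 5)           // appends in place
    buf --= List(1, 2)           // removes in place

    (untouched, copy.toList, buf.toList)
  }
}
```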
Wed, 2011-05-18, 11:47
#6
Re: Retaining the retain - what methods should the collections
On Wed, May 18, 2011 at 12:19 PM, Chris Marshall
wrote:
> I totally agree about the "big picture" - but I think it is reasonable to
> assume that for every immutable operation, if the operation "makes sense" to
> have a mutable version (which filter *most certainly does*), then there
> should be a mutable version. Possible conventions:
> prefix m: mFilter/mfilter etc
> suffix "inPlace" : filterInPlace
> I prefer the former, personally as "filterNotInPlace" is confusing, whereas
> mfilterNot/mFilterNot is not.
> Daniel's suggestion of a version of splitAt on Repr <% IndexedSeq[A] which
> returns (Repr, A, Repr) would be *incredibly* useful. Unfortunately the
> split and partition names are taken - perhaps "separateAt"
> Chris
Last time this discussion went around, the idea to use a nested object
to namespace the methods was proposed, along the lines of:
Buffer(1, 2, 3).mutably.filter(_ > 1)
I would prefer this to suffixes/prefixes.
-jason
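One way such a nested namespace could be prototyped outside the library is with an implicit wrapper. Everything named here (`MutablySketch`, `InPlaceOps`, `MutablySyntax`) is hypothetical, only `filter` is covered, and the sketch assumes a Scala version that supports implicit classes; it is not the design from the earlier thread.

```scala
import scala.collection.mutable

object MutablySketch {
  // Hypothetical namespace object holding the in-place variants.
  final class InPlaceOps[A](buf: mutable.Buffer[A]) {
    // Keeps only the elements satisfying `p`, mutating the wrapped buffer,
    // and returns that same buffer so calls can be chained.
    def filter(p: A => Boolean): mutable.Buffer[A] = {
      val kept = buf.filter(p).toList // the ordinary, copying filter
      buf.clear()
      buf ++= kept
      buf
    }
  }

  // Adds `.mutably` to any mutable Buffer once MutablySketch._ is imported.
  implicit class MutablySyntax[A](val buf: mutable.Buffer[A]) {
    def mutably: InPlaceOps[A] = new InPlaceOps(buf)
  }
}
```

With `import MutablySketch._` in scope, `Buffer(1, 2, 3).mutably.filter(_ > 1)` reads as in the message above, and the in-place methods stay clearly separated from the copying ones without prefixes or suffixes.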
Wed, 2011-05-18, 13:17
#7
Re: Retaining the retain - what methods should the collections
>>>>> "Jason" == Jason Zaugg writes:
Jason> Last time this discussion went around, the idea to use a nested
Jason> object to namespace the methods was proposed, along the lines
Jason> of:
Jason> Buffer(1, 2, 3).mutably.filter(_ > 1)
Jason> I would prefer this to suffixes/prefixes.
That thread is here:
http://scala-programming-language.1934581.n4.nabble.com/In-place-mutable...
(Chris Marshall's objections are worrisome.)
Wed, 2011-05-18, 13:27
#8
Re: Retaining the retain - what methods should the collections
I don't really agree with Chris Marshall's objections. Isn't it a
given that the mutable methods need to be distinct from the existing
ones? Given their completely different nature this is a plus rather
than a minus in my eyes. I like the idea of an inPlace/mutable nested
object.
Regards,
Rüdiger
2011/5/18 Seth Tisue :
>>>>>> "Jason" == Jason Zaugg writes:
>
> Jason> Last time this discussion went around, the idea to use a nested
> Jason> object to namespace the methods was proposed, along the lines
> Jason> of:
> Jason> Buffer(1, 2, 3).mutably.filter(_ > 1)
> Jason> I would prefer this to suffixes/prefixes.
>
> That thread is here:
>
> http://scala-programming-language.1934581.n4.nabble.com/In-place-mutable...
>
> (Chris Marshall's objections are worrisome.)
>
> --
> Seth Tisue | Northwestern University | http://tisue.net
> lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/
>
Wed, 2011-05-18, 13:37
#9
RE: Retaining the retain - what methods should the collections
Worrisome in the sense that you fear for my sanity, or worrisome in the sense that they raise valid points?
> To: scala-internals@googlegroups.com
> Subject: Re: [scala-internals] Retaining the retain - what methods should the collections API additionally support
> From: seth@tisue.net
> Date: Wed, 18 May 2011 08:08:20 -0400
>
> >>>>> "Jason" == Jason Zaugg <jzaugg@gmail.com> writes:
>
> Jason> Last time this discussion went around, the idea to use a nested
> Jason> object to namespace the methods was proposed, along the lines
> Jason> of:
> Jason> Buffer(1, 2, 3).mutably.filter(_ > 1)
> Jason> I would prefer this to suffixes/prefixes.
>
> That thread is here:
>
> http://scala-programming-language.1934581.n4.nabble.com/In-place-mutable...
>
> (Chris Marshall's objections are worrisome.)
>
> --
> Seth Tisue | Northwestern University | http://tisue.net
> lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/
Fri, 2011-05-20, 18:27
#10
Re: Retaining the retain - what methods should the collections
On 5/17/11 8:50 AM, Ismael Juma wrote:
> This is indeed important to keep in mind. I had to set
> -XX:ReservedCodeCacheSize=96m (default is 64m) when running IDEA once
> I moved to Scala 2.9.0. I believe this is due to the increased number
> of classes that are now JIT'd. Once the maximum is reached, no new
> methods are JIT'd and this can cause major performance issues.
I've been seeing that message about the code cache being full more often
than not in recent times when building the compiler from scratch
(including running the tests - it turns up near the end of the tests.)
Didn't seem like a good sign.