- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
flattening through space and time
Thu, 2011-04-14, 09:56
So I was looking at the parallel collections refactor and I had this
nagging feeling it was missing something.
I reserve the right for this to turn out to have been a pointless
exercise, but I pulled on the thread long enough to arrive at an
implementation. What I don't have is an existence proof that it is
necessary, and it's late so I will present it without elaboration and if
I'm lucky I will wake up in a few hours and not be tempted to keep
pulling: this can be thanks to your educating me as to why it isn't
necessary, or because it turns out that it is. Either is good, I only
hope to avoid the chasm in between.
https://github.com/scala/scala/tree/abstract-flatMap
This is the first version I got compiling so it's a given it's not 100%
done. Summarized:
trait GenTraversableOnceLike[+A] {
type Flattenable[T] <: GenTraversableOnce[T]
...
}
trait GenTraversableLike[+T, +Repr] extends GenTraversableOnceLike[T] {
def flatMap[S, That](f: T => Flattenable[S])(implicit bf:
CanBuildFrom[Repr, S, That]): That
def ++[U >: T, That](that: Flattenable[U])(implicit bf:
CanBuildFrom[Repr, U, That]): That
...
}
trait TraversableOnceLike[+A] extends GenTraversableOnceLike[A] {
type Flattenable[U] = TraversableOnce[U]
...
}
trait ParIterableLike[+T, ...] {
type Flattenable[T] = GenTraversableOnce[T]
...
}
Thu, 2011-04-14, 13:17
#2
Re: Re: flattening through space and time
On Thu, Apr 14, 2011 at 1:39 PM, Aleksandar Prokopec <aleksandar.prokopec@epfl.ch> wrote:
Hi,I tried this once. The main problem is that the abstract types tend to leak into error messages about collections: it would tend say
So, if I understood correctly, you introduce abstract types into method signatures, thus ensuring that, for instance, a `++` on a sequential Seq can only be applied another sequential collection. Of the top of my head, I can't see why this wouldn't work - so this change looks ok to me.
found: Traversable { type Flattenable <: ... }
instead of just Traversable, even though the refinement would make no difference. That's why I decided not to leave any abstract types in user-facing classes. I believe that would be a problem with Flattenable as well.
Cheers
-- Martin
Thu, 2011-04-14, 16:37
#3
Re: Re: flattening through space and time
On 4/14/11 4:39 AM, Aleksandar Prokopec wrote:
> So, if I understood correctly, you introduce abstract types into
> method signatures, thus ensuring that, for instance, a `++` on a
> sequential Seq can only be applied another sequential collection. Of
> the top of my head, I can't see why this wouldn't work - so this
> change looks ok to me.
The central issue which is making my little voice get involved
is the appearance of GenTraversableOnce in the type signatures of the
sequential/legacy collections. Maybe that's something we can get away
with, but I'm afraid it will harbor unpleasant surprises. If it's worth
trying to shield existing code from our parallel shenanigans I think
it's worth going the distance.
On 4/14/11 5:11 AM, martin odersky wrote:
> That's why I decided not to leave any abstract types in user-facing
> classes. I believe that would be a problem with Flattenable as well.
I have two supportive data points.
1) My "first compiling" implementation came within a couple meaningless
test tweaks of all tests passing. I updated the git branch to an all
passing one.
2) My naive attempts to incur ugly error messages fail, as below. (If
you know what kind of construct is likely to lead to such messages, let
me know; though depending on how exactly they arise, it might be a
better argument for working on the error message than for letting it
influence the design.)
scala> def f(xs: collection.GenTraversableOnce[Int]) = xs
f: (xs:
scala.collection.GenTraversableOnce[Int])scala.collection.GenTraversableOnce[Int]
scala> f(1 to 10 par)
res4: scala.collection.GenTraversableOnce[Int] = ParRange(1, 2, 3, 4, 5,
6, 7, 8, 9, 10)
scala> 1 to 10 flatMap (1 to _ par)
:8: error: type mismatch;
found : scala.collection.parallel.immutable.ParRange
required: scala.collection.TraversableOnce[?]
1 to 10 flatMap (1 to _ par)
^
scala> (1 to 10).par flatMap (1 to _)
res6: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 1,
2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5,
6, 7, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5,
6, 7, 8, 9, 10)
scala> (1 to 10) ++ (1 to 10).par
:8: error: type mismatch;
found : scala.collection.parallel.immutable.ParRange
required: scala.collection.TraversableOnce[?]
(1 to 10) ++ (1 to 10).par
^
scala> (1 to 10) ++ (1 to 10).par.seq
res8: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> (1 to 10).par ++ (1 to 10).par
res9: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Hi,
So, if I understood correctly, you introduce abstract types into method
signatures, thus ensuring that, for instance, a `++` on a sequential Seq
can only be applied another sequential collection. Of the top of my
head, I can't see why this wouldn't work - so this change looks ok to me.
However, it should ensure that only a sequential collection can be
passed as an argument to `++` invoked on a sequential collection.
I am not sure whether we want to do this. Presently, it makes sense to
pass a parallel collection as an argument to `++`. If `this` is a
sequential collection, it will invoke `seq` on the parallel collection
and then add elements sequentially (the `seq` conversion method is
always O(1)).
Another problem that I see is that if you don't know statically exactly
what your collection is, you cannot use it in your methods. A `++` will
not accept any `GenTraversableOnce`, even though a client is sure that
his traversable is compatible.
I cannot say this is not tempting. In general it may be costly or
impossible to traverse the target collection. For instance, distributed
collections may not trivially implement iterators, since the dataset is
in general not locally available. In such cases, it may make sense to
disallow using them. The Flattenable type could then have a marker
`Local` which is mixed in with parallel and sequential collections, but
not with distributed.
I'd have to think about this a while longer.
Cheers,
Alex
On 14/04/11 10:56, Paul Phillips wrote:
> So I was looking at the parallel collections refactor and I had this
> nagging feeling it was missing something.
>
> I reserve the right for this to turn out to have been a pointless
> exercise, but I pulled on the thread long enough to arrive at an
> implementation. What I don't have is an existence proof that it is
> necessary, and it's late so I will present it without elaboration and
> if I'm lucky I will wake up in a few hours and not be tempted to keep
> pulling: this can be thanks to your educating me as to why it isn't
> necessary, or because it turns out that it is. Either is good, I only
> hope to avoid the chasm in between.
>
> https://github.com/scala/scala/tree/abstract-flatMap
>
> This is the first version I got compiling so it's a given it's not
> 100% done. Summarized:
>
> trait GenTraversableOnceLike[+A] {
> type Flattenable[T] <: GenTraversableOnce[T]
> ...
> }
>
> trait GenTraversableLike[+T, +Repr] extends GenTraversableOnceLike[T] {
> def flatMap[S, That](f: T => Flattenable[S])(implicit bf:
> CanBuildFrom[Repr, S, That]): That
> def ++[U >: T, That](that: Flattenable[U])(implicit bf:
> CanBuildFrom[Repr, U, That]): That
> ...
> }
>
> trait TraversableOnceLike[+A] extends GenTraversableOnceLike[A] {
> type Flattenable[U] = TraversableOnce[U]
> ...
> }
>
> trait ParIterableLike[+T, ...] {
> type Flattenable[T] = GenTraversableOnce[T]
> ...
> }
> .
>