- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
Re: fold/reduce vs foldLeft/reduceLeft?
Sat, 2012-02-11, 16:03
Why isn't it, though? It seems to me that having O(n) heap space (for
the reversed collection) would be superior to having O(n) stack space,
which may blow up your stack. And they're both O(n) runtime, so
risking a stack overflow seems much more dangerous than running a
constant factor slower.
-Haoyi
On Sat, Feb 11, 2012 at 7:32 AM, Rex Kerr wrote:
> On Sat, Feb 11, 2012 at 5:59 AM, Alex Repain wrote:
>>
>>
>>
>> 2012/2/10 Rex Kerr
>>>
>>> On Fri, Feb 10, 2012 at 1:36 PM, Haoyi Li wrote:
>>>>
>>>> Hello Everyone,
>>>>
>>>> I've been learning scala recently (already know java/c#/python) and
>>>> been thoroughly enjoying the higher order collections functions.
>>>> filter/map is pretty obvious, and I've more or less figured out
>>>> foldLeft/reduceLeft. I've heard somewhere that you probably shouldn't
>>>> use foldRight/reduceRight because of O(n^2) complexity
>>>
>>>
>>> That would be a strange implementation. The typical implementation
>>> requires O(n) stack space, which is the real problem--for many collections
>>> of practical size, you throw stack overflow exceptions.
>>
>>
>> Does it have to be on the stack though ? Can't the function be tail
>> recursive on the reversed collection ?
>
>
> Of course. xs.foldRight(z)(f) is the same as xs.reverse.foldLeft(z)((a,b)
> => f(b,a)). It's often not implemented that way in the library, however.
>
> --Rex
Sun, 2012-02-12, 12:11
#2
Re: fold/reduce vs foldLeft/reduceLeft?
On Sun, Feb 12, 2012 at 11:06 AM, Alex Repain <alex.repain@gmail.com> wrote:
scala> def foldRight[A, B](fa: Stream[A], z: => B)(f: (A, => B) => B): B = | if (fa.isEmpty) z else f(fa.head, foldRight(fa.tail, z)(f))foldRight: [A, B](fa: Stream[A], z: => B)(f: (A, => B) => B)B
scala> def contains[A](as: Stream[A], a: A) = foldRight(as, false){(aa, z) => | a == aa || z | }contains: [A](as: Stream[A], a: A)Boolean
scala> contains(Stream.continually(2), 2)res6: Boolean = true
-jason
2012/2/12 Tony Morris <tonymorris@gmail.com>-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 02/12/2012 08:24 AM, Rex Kerr wrote:
> For example, if you have something that is lazily computed (e.g. a Stream
> or a view), you have just committed yourself to computing the entire
> collection (potentially). Depending on your fold, you might not need to.
More importantly than algorithmic complexity, for non-strict data
structures, a foldRight written using foldLeft is a different function
altogether, since for some values that would typically terminate, they
no longer do. That is to say, your program will no longer work.
Interesting, could you provide an example for that case, please ?
scala> def foldRight[A, B](fa: Stream[A], z: => B)(f: (A, => B) => B): B = | if (fa.isEmpty) z else f(fa.head, foldRight(fa.tail, z)(f))foldRight: [A, B](fa: Stream[A], z: => B)(f: (A, => B) => B)B
scala> def contains[A](as: Stream[A], a: A) = foldRight(as, false){(aa, z) => | a == aa || z | }contains: [A](as: Stream[A], a: A)Boolean
scala> contains(Stream.continually(2), 2)res6: Boolean = true
-jason
Sun, 2012-02-12, 14:51
#3
Re: fold/reduce vs foldLeft/reduceLeft?
On Sat, Feb 11, 2012 at 4:50 PM, Tony Morris <tmorris@tmorris.net> wrote:
For example, if you have something that is lazily computed (e.g. a Stream or a view), you have just committed yourself to computing the entire collection (potentially). Depending on your fold, you might not need to.
Also, if you have an infinite collection _but_ for some values of input your function doesn't care what would have been computed so far, you can perform a right fold like so:
f(x0, f(x1 , f(x2, f(x3, whoCares) ) ) )
(that is, f(x3,y) is the same for any value of y, so you don't need to fold any deeper even if it's _infinitely_ deep).
There is a symmetric case for foldLeft that also works for infinite collections, namely
f( f( f( f(z, x0), x1), x2), whoCares)
where for some values of the carried term, f(carry, x) is the same for all x. Then you need not fold left any more, even if it's infinitely deep.
So left and right folds are interesting creatures and not theoretically trivially interconvertible. For cases of strictly evaluated finite collections, however, the reverse version is the way to go. But you can do that yourself when you need it, so it doesn't desperately need to be in the library.
--Rex
To expand on Tony's point, while it is true that anything traversable (once) or any iterator can be packed into another data structure on the heap in opposite order (a List is a reasonable choice), after which a tail-recursive foldLeft can be used, there are a number of data structures for which that operation is questionable.
On Feb 12, 2012 1:03 AM, "Haoyi Li" <haoyi.sg@gmail.com> wrote:
>
> Why isn't it, though? It seems to me that having O(n) heap space (for
> the reversed collection) would be superior to having O(n) stack space,
> which may blow up your stack. And they're both O(n) runtime, so
> risking a stack overflow seems much more dangerous than running a
> constant factor slower.It is not true that for all values of xs then foldRight can be written with foldLeft. I should imagine this is why it is not implemented that way. Indeed, it is only true for very few heap-based structures like List.
For example, if you have something that is lazily computed (e.g. a Stream or a view), you have just committed yourself to computing the entire collection (potentially). Depending on your fold, you might not need to.
Also, if you have an infinite collection _but_ for some values of input your function doesn't care what would have been computed so far, you can perform a right fold like so:
f(x0, f(x1 , f(x2, f(x3, whoCares) ) ) )
(that is, f(x3,y) is the same for any value of y, so you don't need to fold any deeper even if it's _infinitely_ deep).
There is a symmetric case for foldLeft that also works for infinite collections, namely
f( f( f( f(z, x0), x1), x2), whoCares)
where for some values of the carried term, f(carry, x) is the same for all x. Then you need not fold left any more, even if it's infinitely deep.
So left and right folds are interesting creatures and not theoretically trivially interconvertible. For cases of strictly evaluated finite collections, however, the reverse version is the way to go. But you can do that yourself when you need it, so it doesn't desperately need to be in the library.
--Rex
Sun, 2012-02-12, 14:51
#4
Re: fold/reduce vs foldLeft/reduceLeft?
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 02/12/2012 08:06 PM, Alex Repain wrote:
> 2012/2/12 Tony Morris
>
> On 02/12/2012 08:24 AM, Rex Kerr wrote:
>>>> For example, if you have something that is lazily computed (e.g. a Stream
>>>> or a view), you have just committed yourself to computing the entire
>>>> collection (potentially). Depending on your fold, you might not need to.
>
> More importantly than algorithmic complexity, for non-strict data
> structures, a foldRight written using foldLeft is a different function
> altogether, since for some values that would typically terminate, they
> no longer do. That is to say, your program will no longer work.
>
>
>> Interesting, could you provide an example for that case, please ?
Yepper. http://paste.pocoo.org/show/549682/
case class EStream[A](x: Option[(() => A, () => EStream[A])]) {
import EStream._
def foldRight[B](b: => B)(f: (=> A, => B) => B): B =
x match {
case Some((h,t)) => f(h(), t().foldRight(b)(f))
case None => b
}
@annotation.tailrec
final def foldLeft[B](b: B)(f: (B, A) => B): B =
x match {
case Some((h,t)) => t().foldLeft(f(b,h()))(f)
case None => b
}
def reverse: EStream[A] =
foldLeft(nil[A])((b, a) => cons(() => a, () => b))
def foldRightBroken[B](b: => B)(f: (=> A, => B) => B): B =
foldLeft(b)((a, b) => f(b, a))
}
object EStream {
def rep[A](a: => A): EStream[A] =
cons(() => a, () => rep(a))
def cons[A](h: () => A, t: () => EStream[A]): EStream[A] =
EStream(Some(h, t))
def nil[A]: EStream[A] =
EStream(None)
}
object M {
import EStream._
def main(args: Array[String]) {
val ones = rep(1)
val twos = ones.foldRight(nil[Int])((a, b) => cons(() => a+1, () => b))
println(twos.x match {
case None => "hi!"
case Some((h, _)) => "lo" + h()
})
val oopsie = ones.foldRightBroken(nil[Int])((a, b) => cons(() =>
a+1, () => b))
println("you ain't never gonna be seeing this")
}
}
Sun, 2012-02-12, 14:51
#5
Re: fold/reduce vs foldLeft/reduceLeft?
On Sat, Feb 11, 2012 at 5:59 AM, Alex Repain <alex.repain@gmail.com> wrote:
Of course. xs.foldRight(z)(f) is the same as xs.reverse.foldLeft(z)((a,b) => f(b,a)). It's often not implemented that way in the library, however.
--Rex
2012/2/10 Rex Kerr <ichoran@gmail.com>On Fri, Feb 10, 2012 at 1:36 PM, Haoyi Li <haoyi.sg@gmail.com> wrote:
Hello Everyone,
I've been learning scala recently (already know java/c#/python) and
been thoroughly enjoying the higher order collections functions.
filter/map is pretty obvious, and I've more or less figured out
foldLeft/reduceLeft. I've heard somewhere that you probably shouldn't
use foldRight/reduceRight because of O(n^2) complexity
That would be a strange implementation. The typical implementation requires O(n) stack space, which is the real problem--for many collections of practical size, you throw stack overflow exceptions.
Does it have to be on the stack though ? Can't the function be tail recursive on the reversed collection ?
Of course. xs.foldRight(z)(f) is the same as xs.reverse.foldLeft(z)((a,b) => f(b,a)). It's often not implemented that way in the library, however.
--Rex
Sun, 2012-02-12, 15:01
#6
Re: fold/reduce vs foldLeft/reduceLeft?
Blowing the heap (OOME) kills your entire application, blowing your
stack only ruins one thread.
Cheers,
√
On Sat, Feb 11, 2012 at 4:03 PM, Haoyi Li wrote:
> Why isn't it, though? It seems to me that having O(n) heap space (for
> the reversed collection) would be superior to having O(n) stack space,
> which may blow up your stack. And they're both O(n) runtime, so
> risking a stack overflow seems much more dangerous than running a
> constant factor slower.
>
> -Haoyi
>
> On Sat, Feb 11, 2012 at 7:32 AM, Rex Kerr wrote:
>> On Sat, Feb 11, 2012 at 5:59 AM, Alex Repain wrote:
>>>
>>>
>>>
>>> 2012/2/10 Rex Kerr
>>>>
>>>> On Fri, Feb 10, 2012 at 1:36 PM, Haoyi Li wrote:
>>>>>
>>>>> Hello Everyone,
>>>>>
>>>>> I've been learning scala recently (already know java/c#/python) and
>>>>> been thoroughly enjoying the higher order collections functions.
>>>>> filter/map is pretty obvious, and I've more or less figured out
>>>>> foldLeft/reduceLeft. I've heard somewhere that you probably shouldn't
>>>>> use foldRight/reduceRight because of O(n^2) complexity
>>>>
>>>>
>>>> That would be a strange implementation. The typical implementation
>>>> requires O(n) stack space, which is the real problem--for many collections
>>>> of practical size, you throw stack overflow exceptions.
>>>
>>>
>>> Does it have to be on the stack though ? Can't the function be tail
>>> recursive on the reversed collection ?
>>
>>
>> Of course. xs.foldRight(z)(f) is the same as xs.reverse.foldLeft(z)((a,b)
>> => f(b,a)). It's often not implemented that way in the library, however.
>>
>> --Rex
Sun, 2012-02-12, 15:11
#7
Re: fold/reduce vs foldLeft/reduceLeft?
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 02/12/2012 08:24 AM, Rex Kerr wrote:
> For example, if you have something that is lazily computed (e.g. a Stream
> or a view), you have just committed yourself to computing the entire
> collection (potentially). Depending on your fold, you might not need to.
More importantly than algorithmic complexity, for non-strict data
structures, a foldRight written using foldLeft is a different function
altogether, since for some values that would typically terminate, they
no longer do. That is to say, your program will no longer work.
Sun, 2012-02-12, 15:21
#8
Re: fold/reduce vs foldLeft/reduceLeft?
On Feb 12, 2012 1:03 AM, "Haoyi Li" <haoyi.sg@gmail.com> wrote:
>
> Why isn't it, though? It seems to me that having O(n) heap space (for
> the reversed collection) would be superior to having O(n) stack space,
> which may blow up your stack. And they're both O(n) runtime, so
> risking a stack overflow seems much more dangerous than running a
> constant factor slower.
It is not true that for all values of xs then foldRight can be written with foldLeft. I should imagine this is why it is not implemented that way. Indeed, it is only true for very few heap-based structures like List.
2012/2/12 Tony Morris <tonymorris@gmail.com>
Interesting, could you provide an example for that case, please ?