This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Re: StackOverflowError in RedBlack

20 replies
Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Have you tried increasing stack size via -Xss when launching the VM?

2009/12/30 Jan Van Besien <janvanbesien@gmail.com>
Hi all,

[scala 2.7.7]

I'm currently writing an immutable data structure on top of an immutable SortedSet (scala.collection.immutable.TreeSet).

If I have a little more than 180000 elements in my data structure, I suffer from a StackOverflowError:

java.lang.StackOverflowError
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)

Looking at the code of RedBlack.scala, I can see why this is happening, but now I am in doubt about how to solve it.

A sorted set is really the best fit for my problem because I have elements with a natural order, and fast lookup is much more important than fast addition/removal.
Immutability is "nice to have", but maybe this problem will force me to use a mutable java.collection.jcl.TreeSet anyway?

Are there other immutable SortedSet implementations out there that don't suffer from this problem?

Any other suggestions?

thanks in advance,
Jan



--
Kevin Wright

mail/google talk: kev.lee.wright@googlemail.com
wave: kev.lee.wright@googlewave.com
skype: kev.lee.wright
twitter: @thecoda

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
StackOverflowError in RedBlack

Hi all,

[scala 2.7.7]

I'm currently writing an immutable data structure on top of an immutable
SortedSet (scala.collection.immutable.TreeSet).

If I have a little more than 180000 elements in my data structure, I
suffer from a StackOverflowError:

java.lang.StackOverflowError
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)

Looking at the code of RedBlack.scala, I can see why this is happening,
but now I am in doubt about how to solve it.

A sorted set is really the best fit for my problem because I have
elements with a natural order, and fast lookup is much more important
than fast addition/removal.
Immutability is "nice to have", but maybe this problem will force me to
use a mutable java.collection.jcl.TreeSet anyway?

Are there other immutable SortedSet implementations out there that don't
suffer from this problem?

Any other suggestions?

thanks in advance,
Jan

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Kevin Wright wrote:
> Have you tried increasing stack size via -Xss when launching the VM?

Yes that will certainly help... but its more a workaround than an actual
solution, isn't it.

Maybe I'm searching for an immutable SortedSet implementation which has
(almost) the performance of a mutable one. Because I assume this deep
recursion will not do performance much good (although I should admit
that I didn't compare it with a mutable TreeSet yet).

Jan
>
> 2009/12/30 Jan Van Besien >
>
> Hi all,
>
> [scala 2.7.7]
>
> I'm currently writing an immutable data structure on top of an
> immutable SortedSet (scala.collection.immutable.TreeSet).
>
> If I have a little more than 180000 elements in my data structure, I
> suffer from a StackOverflowError:
>
> java.lang.StackOverflowError
> at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
> at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
> at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
>
> Looking at the code of RedBlack.scala, I can see why this is
> happening, but now I am in doubt about how to solve it.
>
> A sorted set is really the best fit for my problem because I have
> elements with a natural order, and fast lookup is much more
> important than fast addition/removal.
> Immutability is "nice to have", but maybe this problem will force me
> to use a mutable java.collection.jcl.TreeSet anyway?
>
> Are there other immutable SortedSet implementations out there that
> don't suffer from this problem?
>
> Any other suggestions?
>
> thanks in advance,
> Jan
>
>
>
>

Kevin Wright
Joined: 2009-06-09,
User offline. Last seen 49 weeks 3 days ago.
Re: StackOverflowError in RedBlack
If your problem is that you're using more stack than you have available, then making more available is a valid fix.  It also depends on the size of your structure, is this largely stable or might it grow to some arbritary size in the future?
Just so long as memory footprint or performance aren't suffering, then this is your simplest and most pragmatic way out :)
Neither the scala compiler nor the eclipse plugin will build without extra stack space; by their very nature functional designs tend to lean on the stack a bit harder, so this is a pretty standard thing to be doing nowadays.

Hopefully work going into Java 7 will deliver the goods on proper tail-recursion, which should help out with a lot of similar problems...

2009/12/30 Jan Van Besien <janvanbesien@gmail.com>
Kevin Wright wrote:
Have you tried increasing stack size via -Xss when launching the VM?

Yes that will certainly help... but its more a workaround than an actual solution, isn't it.

Maybe I'm searching for an immutable SortedSet implementation which has (almost) the performance of a mutable one. Because I assume this deep recursion will not do performance much good (although I should admit that I didn't compare it with a mutable TreeSet yet).

Jan

2009/12/30 Jan Van Besien <janvanbesien@gmail.com <mailto:janvanbesien@gmail.com>>

   Hi all,

   [scala 2.7.7]

   I'm currently writing an immutable data structure on top of an
   immutable SortedSet (scala.collection.immutable.TreeSet).

   If I have a little more than 180000 elements in my data structure, I
   suffer from a StackOverflowError:

   java.lang.StackOverflowError
          at
   scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
          at
   scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
          at
   scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)

   Looking at the code of RedBlack.scala, I can see why this is
   happening, but now I am in doubt about how to solve it.

   A sorted set is really the best fit for my problem because I have
   elements with a natural order, and fast lookup is much more
   important than fast addition/removal.
   Immutability is "nice to have", but maybe this problem will force me
   to use a mutable java.collection.jcl.TreeSet anyway?

   Are there other immutable SortedSet implementations out there that
   don't suffer from this problem?

   Any other suggestions?

   thanks in advance,
   Jan




ounos
Joined: 2008-12-29,
User offline. Last seen 3 years 44 weeks ago.
Re: StackOverflowError in RedBlack

2009/12/30 Jan Van Besien :
> Hi all,
>
> [scala 2.7.7]
>
> I'm currently writing an immutable data structure on top of an immutable
> SortedSet (scala.collection.immutable.TreeSet).
>
> If I have a little more than 180000 elements in my data structure, I suffer
> from a StackOverflowError:
>
> java.lang.StackOverflowError
>        at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
>        at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
>        at
> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:113)
>
> Looking at the code of RedBlack.scala, I can see why this is happening, but
> now I am in doubt about how to solve it.
>
> A sorted set is really the best fit for my problem because I have elements
> with a natural order, and fast lookup is much more important than fast
> addition/removal.
> Immutability is "nice to have", but maybe this problem will force me to use
> a mutable java.collection.jcl.TreeSet anyway?
>
> Are there other immutable SortedSet implementations out there that don't
> suffer from this problem?
>
> Any other suggestions?
>
> thanks in advance,
> Jan
>

This is utterly surprizing for me. 180000 elements in a red black tree
can't create a tree with height more than 36. This can hardly overflow
the stack! I'm not aware of the implementation, but is it really a
red-black tree, as the name on the tin says? In fact, I can hardly
imagine a red-black tree causing stack overflows, no matter how many
elements you put in, unless you reduce the default stack size
yourself.

What's the deal here?

Dimitris

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Dimitris Andreou wrote:
> This is utterly surprizing for me. 180000 elements in a red black tree
> can't create a tree with height more than 36. This can hardly overflow
> the stack!

Hmm indeed, I can only agree with you...

I am debugging a bit, and it looks like my code is resulting in an
extremely unbalanced tree... Actually, it looks like the tree is
composed of "BlackTree" instances only, while a normal balanced tree
should have "BlackTree" and "RedTree" instances.

So either there is a bug in the RedBlack implementation, or there is
something wrong with the compareTo implementation of the objects that I
put into it... I'll assume the latter ;-)

Jan

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Jan Van Besien wrote:
> So either there is a bug in the RedBlack implementation, or there is
> something wrong with the compareTo implementation of the objects that I
> put into it... I'll assume the latter ;-)

I managed to reproduce the StackOverflowError with a SortedSet[Int]. I
guess that already eliminates a bug in my compareTo implementation ;-)

import collection.immutable.{TreeSet,SortedSet}

object RedBlackTest {
def main(args: Array[String]) {

var big = 100000

var aSortedSet: SortedSet[Int] = TreeSet(big)

for (i <- 1 until 10000) {
aSortedSet = (aSortedSet - big).asInstanceOf[SortedSet[Int]] ++
(TreeSet(i, big - 1))
big = big - 1
if (i % 1000 == 0) {
println("size = " + aSortedSet.size)
val head = aSortedSet.until(i)
println("head size = " + head.size)
}
}
}
}

It is "aSortedSet.until(i)" which actually triggers the StackOverflowError.

Notice that I don't simply add some numbers to the set. I start with a
set with one element (a very big one) and in each iteration I first
remove the big element and add two other elements: the "i" from the for
loop and a new big element (but not as big as the previous).

The cast to SortedSet is a workaround for a problem described here:
http://stackoverflow.com/questions/1271426/scala-immutable-sortedset-are...

If you run this in a debugger, and set a breakpoint in TreeSet line 100,
you see that the tree looks for example (after 5 iterations) like this:

BlackTree(5,(),BlackTree(4,(),BlackTree(3,(),BlackTree(2,(),BlackTree(1,(),Empty,Empty),Empty),Empty),Empty),BlackTree(99995,(),Empty,Empty))

In other words: unbalanced.

I assume this is not the expected behavior of RedBlack trees? Or am I
doing something embarasingly wrong here?

kind regards,
Jan

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
I believe the problem is simple. When you do (aSortedSet - big), you get a non-sorted tree. You may cast it to SortedSet, but that doesn't make it one.
Then again, that's what I said in answer to that question.

On Wed, Dec 30, 2009 at 12:03 PM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Jan Van Besien wrote:
So either there is a bug in the RedBlack implementation, or there is something wrong with the compareTo implementation of the objects that I put into it... I'll assume the latter ;-)

I managed to reproduce the StackOverflowError with a SortedSet[Int]. I guess that already eliminates a bug in my compareTo implementation ;-)

import collection.immutable.{TreeSet,SortedSet}

object RedBlackTest {
 def main(args: Array[String]) {

   var big = 100000

   var aSortedSet: SortedSet[Int] = TreeSet(big)

   for (i <- 1 until 10000) {
     aSortedSet = (aSortedSet - big).asInstanceOf[SortedSet[Int]] ++ (TreeSet(i, big - 1))
     big = big - 1
     if (i % 1000 == 0) {
       println("size = " + aSortedSet.size)
       val head = aSortedSet.until(i)
       println("head size = " + head.size)
     }
   }
 }
}

It is "aSortedSet.until(i)" which actually triggers the StackOverflowError.

Notice that I don't simply add some numbers to the set. I start with a set with one element (a very big one) and in each iteration I first remove the big element and add two other elements: the "i" from the for loop and a new big element (but not as big as the previous).

The cast to SortedSet is a workaround for a problem described here: http://stackoverflow.com/questions/1271426/scala-immutable-sortedset-are-not-stable-on-deletion

If you run this in a debugger, and set a breakpoint in TreeSet line 100, you see that the tree looks for example (after 5 iterations) like this:

BlackTree(5,(),BlackTree(4,(),BlackTree(3,(),BlackTree(2,(),BlackTree(1,(),Empty,Empty),Empty),Empty),Empty),BlackTree(99995,(),Empty,Empty))

In other words: unbalanced.

I assume this is not the expected behavior of RedBlack trees? Or am I doing something embarasingly wrong here?

kind regards,
Jan



--
Daniel C. Sobral

I travel to the future all the time.
Randall R Schulz
Joined: 2008-12-16,
User offline. Last seen 1 year 29 weeks ago.
Re: StackOverflowError in RedBlack

On Wednesday December 30 2009, Jan Van Besien wrote:
> Hi all,
>
> [scala 2.7.7]
>
> I'm currently writing an immutable data structure on top of an
> immutable SortedSet (scala.collection.immutable.TreeSet).
>
> If I have a little more than 180000 elements in my data structure, I
> suffer from a StackOverflowError:
>
> ...

Might you be encountering [1]?

[1]

> thanks in advance,
> Jan

Randall Schulz

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Randall R Schulz wrote:
> Might you be encountering [1]?
>
> [1]

I doubt it. That's a bug in a quicksort implementation. As far as I
know, SortedSet doesn't use quicksort. It's backed by a binary tree
which keeps all elements sorted (and balanced... or at least it should).

Jan

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Daniel Sobral wrote:
> I believe the problem is simple. When you do (aSortedSet - big), you get
> a non-sorted tree. You may cast it to SortedSet, but that doesn't make
> it one.

Hmm... I have my doubts. If you debug, you'll see that in my example
(aSortedSet - big) always returns a TreeSet. Every TreeSet is a
SortedSet, so it seems very odd that in this particular case it would
return a TreeSet which is not sorted...

I'll give scala 2.8 a try to see if that makes a difference.

kind regards,
Jan

Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Jan Van Besien wrote:
> I'll give scala 2.8 a try to see if that makes a difference.

Same problem with scala 2.8, without the cast to SortedSet[Int].

jvb@lapjvb /tmp> scala -version
Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright
2002-2010, LAMP/EPFL
jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
import collection.immutable.{TreeSet,SortedSet}

object RedBlackTestNoCast {
def main(args: Array[String]) {

var big = 100000

var aSortedSet: SortedSet[Int] = TreeSet(big)

for (i <- 1 until 10000) {
aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
big = big - 1
if (i % 1000 == 0) {
println("size = " + aSortedSet.size)
val head = aSortedSet.until(i)
println("head size = " + head.size)
}
}
}
}

jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
jvb@lapjvb /tmp> scala RedBlackTestNoCast
size = 1001
head size = 999
size = 2001
head size = 1999
size = 3001
head size = 2999
size = 4001
head size = 3999
size = 5001
head size = 4999
size = 6001
head size = 5999
size = 7001
java.lang.StackOverflowError
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)

Kind regards,
Jan

Florian Hars 2
Joined: 2009-11-01,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Jan Van Besien schrieb:
> Same problem with scala 2.8, without the cast to SortedSet[Int].

I made another test in 2.7 by defining a TestSet by copy and pasting
(jay for abstraction and protected fields!) the TreeSet source and
adding the new test methods:

def isBalanced = {
treeIsBalanced(t, true)
true
}

private def treeIsBalanced(t: RedBlack[A]#Tree[Unit], expectBlack:
Boolean): Int = {
if (!t.isBlack && expectBlack) {
throw (new Exception("Children of red nodes must be black: " + t))
}
t match {
case Empty => 1 // why does this never match ..
case RedTree(_,_, left, right) =>
val depthL = treeIsBalanced(left, true)
val depthR = treeIsBalanced(right, true)
if (depthL != depthR) {
throw (new Exception("Tree is not balanced: " + t))
}
depthR
case BlackTree(_, _, left, right) =>
val depthL = treeIsBalanced(left, false)
val depthR = treeIsBalanced(right, false)
if (depthL != depthR) {
throw (new Exception("Tree is not balanced: " + t))
}
depthR + 1
case t => println(t); 1 // .. while this prints empty?
}
}

and found that even the simplest test case fails to preserve the
red-black condition:

scala> (TestSet(1,2,3) - 3).asInstanceOf[TestSet[Int]].isBalanced
Empty
Empty
Empty
java.lang.Exception: Tree is not balanced:
BlackTree(2,(),BlackTree(1,(),Empty,Empty),Empty)
at TestSet.treeIsBalanced(RBTest.scala:108)
at TestSet.isBalanced(RBTest.scala:87)
at .(:5)

I've updated your bug report.

- Florian

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
This stack overflow is cruel. It happens when you call "until". That is defined in SortedSetLike, but defer implementation to rangeImpl. The method rangeImpl, on the other hand, is defined at TreeSet like this:
  override def rangeImpl(from: Option[A], until: Option[A]): TreeSet[A] = {    val tree = this.tree.range(from, until)     newSet(tree.count, tree)  }
And there's your "count" causing trouble -- it is used to initialize "size", which is a val for TreeSet. So, how does "count" make trouble? On RedBlackTree:
def count = 1 + left.count + right.count
Which is clearly not tail-recursive. If the tree were balanced as it should have been, it wouldn't cause much trouble.
On Thu, Dec 31, 2009 at 6:32 AM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Jan Van Besien wrote:
I'll give scala 2.8 a try to see if that makes a difference.

Same problem with scala 2.8, without the cast to SortedSet[Int].

jvb@lapjvb /tmp> scala -version
Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright 2002-2010, LAMP/EPFL
jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
import collection.immutable.{TreeSet,SortedSet}

object RedBlackTestNoCast {
 def main(args: Array[String]) {

   var big = 100000

   var aSortedSet: SortedSet[Int] = TreeSet(big)

   for (i <- 1 until 10000) {
     aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
     big = big - 1
     if (i % 1000 == 0) {
       println("size = " + aSortedSet.size)
       val head = aSortedSet.until(i)
       println("head size = " + head.size)
     }
   }
 }
}

jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
jvb@lapjvb /tmp> scala RedBlackTestNoCast
size = 1001
head size = 999
size = 2001
head size = 1999
size = 3001
head size = 2999
size = 4001
head size = 3999
size = 5001
head size = 4999
size = 6001
head size = 5999
size = 7001
java.lang.StackOverflowError
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)


Kind regards,
Jan



--
Daniel C. Sobral

I travel to the future all the time.
Jan Van Besien
Joined: 2009-12-28,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack

Daniel Sobral wrote:
> Which is clearly not tail-recursive. If the tree were balanced as it
> should have been, it wouldn't cause much trouble.

Yes indeed. The whole point is that a red-black-tree should always be
balanced...

A bug has been filed in the meantime:
https://lampsvn.epfl.ch/trac/scala/ticket/2849

kind regards
Jan

Burkhard Ludwig
Joined: 2009-04-18,
User offline. Last seen 42 years 45 weeks ago.
Re: StackOverflowError in RedBlack
This is just a slightly informed guess from a glimpse at scala/collection/immutable/RedBlack.scala:
The addition of elements to a red black tree goes through the member function 'upd' which takes care for rebalancing the tree, while the removal is done by the member 'del' which ignores balancing entirely. IRRC, the rebalancing during removal is far less contrived than the one during insertions. As a rude rule of thumb: A deletion code which is shorter than the insertion code is probably wrong.
The test case below uses element removal a lot (SortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1)) and therefore it does not come as a surprise the this results in a rather skewed tree. The stack overflow is then inevitable an fiddling with stack sizes won't help.
I did not try to break the high level constructs down to calls to 'upd' and 'del', but if I am not mistaken, the only remedy is a rewrite of the deletion algorithm of the tree implementation. 
On the user side, you could maybe try to replace the tree building discipline for one which relies mostly on additions, or change the set representation to one based on hash set (is there one?).
I would suggest to file a bug report, since the test case is a marvelous example of the necessity to respect the tree invariants for all operations. (I forgot who did breakdown the problem to this example, but he did a good job to produce an ugly corner case)
Burkhard


2009/12/31 Daniel Sobral <dcsobral@gmail.com>
This stack overflow is cruel. It happens when you call "until". That is defined in SortedSetLike, but defer implementation to rangeImpl. The method rangeImpl, on the other hand, is defined at TreeSet like this:
  override def rangeImpl(from: Option[A], until: Option[A]): TreeSet[A] = {    val tree = this.tree.range(from, until)     newSet(tree.count, tree)  }
And there's your "count" causing trouble -- it is used to initialize "size", which is a val for TreeSet. So, how does "count" make trouble? On RedBlackTree:
def count = 1 + left.count + right.count
Which is clearly not tail-recursive. If the tree were balanced as it should have been, it wouldn't cause much trouble.
On Thu, Dec 31, 2009 at 6:32 AM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Jan Van Besien wrote:
I'll give scala 2.8 a try to see if that makes a difference.

Same problem with scala 2.8, without the cast to SortedSet[Int].

jvb@lapjvb /tmp> scala -version
Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright 2002-2010, LAMP/EPFL
jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
import collection.immutable.{TreeSet,SortedSet}

object RedBlackTestNoCast {
 def main(args: Array[String]) {

   var big = 100000

   var aSortedSet: SortedSet[Int] = TreeSet(big)

   for (i <- 1 until 10000) {
     aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
     big = big - 1
     if (i % 1000 == 0) {
       println("size = " + aSortedSet.size)
       val head = aSortedSet.until(i)
       println("head size = " + head.size)
     }
   }
 }
}

jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
jvb@lapjvb /tmp> scala RedBlackTestNoCast
size = 1001
head size = 999
size = 2001
head size = 1999
size = 3001
head size = 2999
size = 4001
head size = 3999
size = 5001
head size = 4999
size = 6001
head size = 5999
size = 7001
java.lang.StackOverflowError
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)


Kind regards,
Jan



--
Daniel C. Sobral

I travel to the future all the time.

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
This is wrong:
 else if (left.isEmpty) right      else if (right.isEmpty) left
According to wikipedia,

 If we are deleting a red node, we can simply replace it with its child, which must be black. All paths through the deleted node will simply pass through one less red node, and both the deleted node's parent and child must be black, so properties 3 (All leaves, including nulls, are black) and 4 (Both children of every red node are black) still hold. Another simple case is when the deleted node is black and its child is red. Simply removing a black node could break Properties 4 (Both children of every red node are black) and 5 (All paths from any given node to its leaf nodes contain the same number of black nodes), but if we repaint its child black, both of these properties are preserved.

The complex case is when both the node to be deleted and its child are black. We begin by replacing the node to be deleted with its child. We will call (or label) this child (in its new position) N, and its sibling (its new parent's other child) S. In the diagrams below, we will also use P for N's new parent, SL for S's left child, and SR for S's right child (it can be shown that S cannot be a leaf).

<etc>


That's just not being done.


On Thu, Dec 31, 2009 at 2:05 PM, Burkhard Ludwig <ludwig.burkhard@googlemail.com> wrote:
This is just a slightly informed guess from a glimpse at scala/collection/immutable/RedBlack.scala:
The addition of elements to a red black tree goes through the member function 'upd' which takes care for rebalancing the tree, while the removal is done by the member 'del' which ignores balancing entirely. IRRC, the rebalancing during removal is far less contrived than the one during insertions. As a rude rule of thumb: A deletion code which is shorter than the insertion code is probably wrong.
The test case below uses element removal a lot (SortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1)) and therefore it does not come as a surprise the this results in a rather skewed tree. The stack overflow is then inevitable an fiddling with stack sizes won't help.
I did not try to break the high level constructs down to calls to 'upd' and 'del', but if I am not mistaken, the only remedy is a rewrite of the deletion algorithm of the tree implementation. 
On the user side, you could maybe try to replace the tree building discipline for one which relies mostly on additions, or change the set representation to one based on hash set (is there one?).
I would suggest to file a bug report, since the test case is a marvelous example of the necessity to respect the tree invariants for all operations. (I forgot who did breakdown the problem to this example, but he did a good job to produce an ugly corner case)
Burkhard


2009/12/31 Daniel Sobral <dcsobral@gmail.com>
This stack overflow is cruel. It happens when you call "until". That is defined in SortedSetLike, but defer implementation to rangeImpl. The method rangeImpl, on the other hand, is defined at TreeSet like this:
  override def rangeImpl(from: Option[A], until: Option[A]): TreeSet[A] = {    val tree = this.tree.range(from, until)     newSet(tree.count, tree)  }
And there's your "count" causing trouble -- it is used to initialize "size", which is a val for TreeSet. So, how does "count" make trouble? On RedBlackTree:
def count = 1 + left.count + right.count
Which is clearly not tail-recursive. If the tree were balanced as it should have been, it wouldn't cause much trouble.
On Thu, Dec 31, 2009 at 6:32 AM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Jan Van Besien wrote:
I'll give scala 2.8 a try to see if that makes a difference.

Same problem with scala 2.8, without the cast to SortedSet[Int].

jvb@lapjvb /tmp> scala -version
Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright 2002-2010, LAMP/EPFL
jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
import collection.immutable.{TreeSet,SortedSet}

object RedBlackTestNoCast {
 def main(args: Array[String]) {

   var big = 100000

   var aSortedSet: SortedSet[Int] = TreeSet(big)

   for (i <- 1 until 10000) {
     aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
     big = big - 1
     if (i % 1000 == 0) {
       println("size = " + aSortedSet.size)
       val head = aSortedSet.until(i)
       println("head size = " + head.size)
     }
   }
 }
}

jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
jvb@lapjvb /tmp> scala RedBlackTestNoCast
size = 1001
head size = 999
size = 2001
head size = 1999
size = 3001
head size = 2999
size = 4001
head size = 3999
size = 5001
head size = 4999
size = 6001
head size = 5999
size = 7001
java.lang.StackOverflowError
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
       at scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)


Kind regards,
Jan



--
Daniel C. Sobral

I travel to the future all the time.




--
Daniel C. Sobral

I travel to the future all the time.
ounos
Joined: 2008-12-29,
User offline. Last seen 3 years 44 weeks ago.
Re: StackOverflowError in RedBlack

If you refer to this code:

def del(k: A): Tree[B] = {
if (isSmaller(k, key)) mkTree(isBlack, key, value, left.del(k), right)
else if (isSmaller(key, k)) mkTree(isBlack, key, value, left,
right.del(k))
else if (left.isEmpty) right
else if (right.isEmpty) left
else {
val s = right.smallest
mkTree(isBlack, s.key, s.value, left, right.del(s.key))
}
}

Ok, so seems to be the the deletion algorithm for plain binary search
trees (which ignores balancing of course). But still, even if it
creates an unbalanced tree, it doesn't *increase* its height (at
worst, the height remains the same), which means that there is at
least a second problem lurking.

But perhaps debugging this would best be left for 2010 :)

Dimitris

2009/12/31 Daniel Sobral :
> This is wrong:
>  else if (left.isEmpty) right
>       else if (right.isEmpty) left
> According to wikipedia,
>
>  If we are deleting a red node, we can simply replace it with its child,
> which must be black. All paths through the deleted node will simply pass
> through one less red node, and both the deleted node's parent and child must
> be black, so properties 3 (All leaves, including nulls, are black) and 4
> (Both children of every red node are black) still hold. Another simple case
> is when the deleted node is black and its child is red. Simply removing a
> black node could break Properties 4 (Both children of every red node are
> black) and 5 (All paths from any given node to its leaf nodes contain the
> same number of black nodes), but if we repaint its child black, both of
> these properties are preserved.
>
> The complex case is when both the node to be deleted and its child are
> black. We begin by replacing the node to be deleted with its child. We will
> call (or label) this child (in its new position) N, and its sibling (its new
> parent's other child) S. In the diagrams below, we will also use P for N's
> new parent, SL for S's left child, and SR for S's right child (it can be
> shown that S cannot be a leaf).
>
>
>
> That's just not being done.
>
> On Thu, Dec 31, 2009 at 2:05 PM, Burkhard Ludwig
> wrote:
>>
>> This is just a slightly informed guess from a glimpse at
>> scala/collection/immutable/RedBlack.scala:
>> The addition of elements to a red black tree goes through the member
>> function 'upd' which takes care for rebalancing the tree, while the removal
>> is done by the member 'del' which ignores balancing entirely. IRRC, the
>> rebalancing during removal is far less contrived than the one during
>> insertions. As a rude rule of thumb: A deletion code which is shorter than
>> the insertion code is probably wrong.
>> The test case below uses element removal a lot (SortedSet = (aSortedSet -
>> big) ++ (TreeSet(i, big - 1)) and therefore it does not come as a surprise
>> the this results in a rather skewed tree. The stack overflow is then
>> inevitable an fiddling with stack sizes won't help.
>> I did not try to break the high level constructs down to calls to 'upd'
>> and 'del', but if I am not mistaken, the only remedy is a rewrite of the
>> deletion algorithm of the tree implementation.
>> On the user side, you could maybe try to replace the tree building
>> discipline for one which relies mostly on additions, or change the set
>> representation to one based on hash set (is there one?).
>> I would suggest to file a bug report, since the test case is a marvelous
>> example of the necessity to respect the tree invariants for all
>> operations. (I forgot who did breakdown the problem to this example, but he
>> did a good job to produce an ugly corner case)
>> Burkhard
>>
>>
>> 2009/12/31 Daniel Sobral
>>>
>>> This stack overflow is cruel. It happens when you call "until". That is
>>> defined in SortedSetLike, but defer implementation to rangeImpl. The method
>>> rangeImpl, on the other hand, is defined at TreeSet like this:
>>>   override def rangeImpl(from: Option[A], until: Option[A]): TreeSet[A] =
>>> {
>>>     val tree = this.tree.range(from, until)
>>>     newSet(tree.count, tree)
>>>   }
>>> And there's your "count" causing trouble -- it is used to initialize
>>> "size", which is a val for TreeSet. So, how does "count" make trouble? On
>>> RedBlackTree:
>>> def count = 1 + left.count + right.count
>>> Which is clearly not tail-recursive. If the tree were balanced as it
>>> should have been, it wouldn't cause much trouble.
>>> On Thu, Dec 31, 2009 at 6:32 AM, Jan Van Besien
>>> wrote:
>>>>
>>>> Jan Van Besien wrote:
>>>>>
>>>>> I'll give scala 2.8 a try to see if that makes a difference.
>>>>
>>>> Same problem with scala 2.8, without the cast to SortedSet[Int].
>>>>
>>>> jvb@lapjvb /tmp> scala -version
>>>> Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright
>>>> 2002-2010, LAMP/EPFL
>>>> jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
>>>> import collection.immutable.{TreeSet,SortedSet}
>>>>
>>>> object RedBlackTestNoCast {
>>>>  def main(args: Array[String]) {
>>>>
>>>>    var big = 100000
>>>>
>>>>    var aSortedSet: SortedSet[Int] = TreeSet(big)
>>>>
>>>>    for (i <- 1 until 10000) {
>>>>      aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
>>>>      big = big - 1
>>>>      if (i % 1000 == 0) {
>>>>        println("size = " + aSortedSet.size)
>>>>        val head = aSortedSet.until(i)
>>>>        println("head size = " + head.size)
>>>>      }
>>>>    }
>>>>  }
>>>> }
>>>>
>>>> jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
>>>> jvb@lapjvb /tmp> scala RedBlackTestNoCast
>>>> size = 1001
>>>> head size = 999
>>>> size = 2001
>>>> head size = 1999
>>>> size = 3001
>>>> head size = 2999
>>>> size = 4001
>>>> head size = 3999
>>>> size = 5001
>>>> head size = 4999
>>>> size = 6001
>>>> head size = 5999
>>>> size = 7001
>>>> java.lang.StackOverflowError
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>
>>>>
>>>> Kind regards,
>>>> Jan
>>>
>>>
>>>
>>> --
>>> Daniel C. Sobral
>>>
>>> I travel to the future all the time.
>>
>
>
>
> --
> Daniel C. Sobral
>
> I travel to the future all the time.
>

dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
Note that the example code removes one element and adds two every time. Since the tree resulting from the deletion does not respect RB invariants, the adding code might be doing the wrong thing because it was misled into it.

On Thu, Dec 31, 2009 at 3:59 PM, Dimitris Andreou <jim.andreou@gmail.com> wrote:
If you refer to this code:

   def del(k: A): Tree[B] = {
     if (isSmaller(k, key)) mkTree(isBlack, key, value, left.del(k), right)
     else if (isSmaller(key, k)) mkTree(isBlack, key, value, left,
right.del(k))
     else if (left.isEmpty) right
     else if (right.isEmpty) left
     else {
       val s = right.smallest
       mkTree(isBlack, s.key, s.value, left, right.del(s.key))
     }
   }

Ok, so seems to be the the deletion algorithm for plain binary search
trees (which ignores balancing of course). But still, even if it
creates an unbalanced tree, it doesn't *increase* its height (at
worst, the height remains the same), which means that there is at
least a second problem lurking.

But perhaps debugging this would best be left for 2010 :)

Dimitris

2009/12/31 Daniel Sobral <dcsobral@gmail.com>:
> This is wrong:
>  else if (left.isEmpty) right
>       else if (right.isEmpty) left
> According to wikipedia,
>
>  If we are deleting a red node, we can simply replace it with its child,
> which must be black. All paths through the deleted node will simply pass
> through one less red node, and both the deleted node's parent and child must
> be black, so properties 3 (All leaves, including nulls, are black) and 4
> (Both children of every red node are black) still hold. Another simple case
> is when the deleted node is black and its child is red. Simply removing a
> black node could break Properties 4 (Both children of every red node are
> black) and 5 (All paths from any given node to its leaf nodes contain the
> same number of black nodes), but if we repaint its child black, both of
> these properties are preserved.
>
> The complex case is when both the node to be deleted and its child are
> black. We begin by replacing the node to be deleted with its child. We will
> call (or label) this child (in its new position) N, and its sibling (its new
> parent's other child) S. In the diagrams below, we will also use P for N's
> new parent, SL for S's left child, and SR for S's right child (it can be
> shown that S cannot be a leaf).
>
> <etc>
>
> That's just not being done.
>
> On Thu, Dec 31, 2009 at 2:05 PM, Burkhard Ludwig
> <ludwig.burkhard@googlemail.com> wrote:
>>
>> This is just a slightly informed guess from a glimpse at
>> scala/collection/immutable/RedBlack.scala:
>> The addition of elements to a red black tree goes through the member
>> function 'upd' which takes care for rebalancing the tree, while the removal
>> is done by the member 'del' which ignores balancing entirely. IRRC, the
>> rebalancing during removal is far less contrived than the one during
>> insertions. As a rude rule of thumb: A deletion code which is shorter than
>> the insertion code is probably wrong.
>> The test case below uses element removal a lot (SortedSet = (aSortedSet -
>> big) ++ (TreeSet(i, big - 1)) and therefore it does not come as a surprise
>> the this results in a rather skewed tree. The stack overflow is then
>> inevitable an fiddling with stack sizes won't help.
>> I did not try to break the high level constructs down to calls to 'upd'
>> and 'del', but if I am not mistaken, the only remedy is a rewrite of the
>> deletion algorithm of the tree implementation.
>> On the user side, you could maybe try to replace the tree building
>> discipline for one which relies mostly on additions, or change the set
>> representation to one based on hash set (is there one?).
>> I would suggest to file a bug report, since the test case is a marvelous
>> example of the necessity to respect the tree invariants for all
>> operations. (I forgot who did breakdown the problem to this example, but he
>> did a good job to produce an ugly corner case)
>> Burkhard
>>
>>
>> 2009/12/31 Daniel Sobral <dcsobral@gmail.com>
>>>
>>> This stack overflow is cruel. It happens when you call "until". That is
>>> defined in SortedSetLike, but defer implementation to rangeImpl. The method
>>> rangeImpl, on the other hand, is defined at TreeSet like this:
>>>   override def rangeImpl(from: Option[A], until: Option[A]): TreeSet[A] =
>>> {
>>>     val tree = this.tree.range(from, until)
>>>     newSet(tree.count, tree)
>>>   }
>>> And there's your "count" causing trouble -- it is used to initialize
>>> "size", which is a val for TreeSet. So, how does "count" make trouble? On
>>> RedBlackTree:
>>> def count = 1 + left.count + right.count
>>> Which is clearly not tail-recursive. If the tree were balanced as it
>>> should have been, it wouldn't cause much trouble.
>>> On Thu, Dec 31, 2009 at 6:32 AM, Jan Van Besien <janvanbesien@gmail.com>
>>> wrote:
>>>>
>>>> Jan Van Besien wrote:
>>>>>
>>>>> I'll give scala 2.8 a try to see if that makes a difference.
>>>>
>>>> Same problem with scala 2.8, without the cast to SortedSet[Int].
>>>>
>>>> jvb@lapjvb /tmp> scala -version
>>>> Scala code runner version 2.8.0.r20327-b20091231020112 -- Copyright
>>>> 2002-2010, LAMP/EPFL
>>>> jvb@lapjvb /tmp> cat RedBlackTestNoCast.scala
>>>> import collection.immutable.{TreeSet,SortedSet}
>>>>
>>>> object RedBlackTestNoCast {
>>>>  def main(args: Array[String]) {
>>>>
>>>>    var big = 100000
>>>>
>>>>    var aSortedSet: SortedSet[Int] = TreeSet(big)
>>>>
>>>>    for (i <- 1 until 10000) {
>>>>      aSortedSet = (aSortedSet - big) ++ (TreeSet(i, big - 1))
>>>>      big = big - 1
>>>>      if (i % 1000 == 0) {
>>>>        println("size = " + aSortedSet.size)
>>>>        val head = aSortedSet.until(i)
>>>>        println("head size = " + head.size)
>>>>      }
>>>>    }
>>>>  }
>>>> }
>>>>
>>>> jvb@lapjvb /tmp> scalac RedBlackTestNoCast.scala
>>>> jvb@lapjvb /tmp> scala RedBlackTestNoCast
>>>> size = 1001
>>>> head size = 999
>>>> size = 2001
>>>> head size = 1999
>>>> size = 3001
>>>> head size = 2999
>>>> size = 4001
>>>> head size = 3999
>>>> size = 5001
>>>> head size = 4999
>>>> size = 6001
>>>> head size = 5999
>>>> size = 7001
>>>> java.lang.StackOverflowError
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>        at
>>>> scala.collection.immutable.RedBlack$NonEmpty.count(RedBlack.scala:129)
>>>>
>>>>
>>>> Kind regards,
>>>> Jan
>>>
>>>
>>>
>>> --
>>> Daniel C. Sobral
>>>
>>> I travel to the future all the time.
>>
>
>
>
> --
> Daniel C. Sobral
>
> I travel to the future all the time.
>



--
Daniel C. Sobral

I travel to the future all the time.
dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
I provided a patch for it. I created and used a suit of ScalaCheck tests to ensure that it was indeed fixed.

On Thu, Dec 31, 2009 at 1:46 PM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Daniel Sobral wrote:
Which is clearly not tail-recursive. If the tree were balanced as it should have been, it wouldn't cause much trouble.

Yes indeed. The whole point is that a red-black-tree should always be balanced...

A bug has been filed in the meantime: https://lampsvn.epfl.ch/trac/scala/ticket/2849

kind regards
Jan



--
Daniel C. Sobral

I travel to the future all the time.
dcsobral
Joined: 2009-04-23,
User offline. Last seen 38 weeks 5 days ago.
Re: StackOverflowError in RedBlack
By the way, here is the output with my patch to RedBlack.scala:
size = 1001head size = 999size = 2001head size = 1999size = 3001head size = 2999 size = 4001head size = 3999size = 5001head size = 4999size = 6001head size = 5999size = 7001head size = 6999size = 8001 head size = 7999size = 9001head size = 8999

On Wed, Dec 30, 2009 at 12:03 PM, Jan Van Besien <janvanbesien@gmail.com> wrote:
Jan Van Besien wrote:
So either there is a bug in the RedBlack implementation, or there is something wrong with the compareTo implementation of the objects that I put into it... I'll assume the latter ;-)

I managed to reproduce the StackOverflowError with a SortedSet[Int]. I guess that already eliminates a bug in my compareTo implementation ;-)

import collection.immutable.{TreeSet,SortedSet}

object RedBlackTest {
 def main(args: Array[String]) {

   var big = 100000

   var aSortedSet: SortedSet[Int] = TreeSet(big)

   for (i <- 1 until 10000) {
     aSortedSet = (aSortedSet - big).asInstanceOf[SortedSet[Int]] ++ (TreeSet(i, big - 1))
     big = big - 1
     if (i % 1000 == 0) {
       println("size = " + aSortedSet.size)
       val head = aSortedSet.until(i)
       println("head size = " + head.size)
     }
   }
 }
}

It is "aSortedSet.until(i)" which actually triggers the StackOverflowError.

Notice that I don't simply add some numbers to the set. I start with a set with one element (a very big one) and in each iteration I first remove the big element and add two other elements: the "i" from the for loop and a new big element (but not as big as the previous).

The cast to SortedSet is a workaround for a problem described here: http://stackoverflow.com/questions/1271426/scala-immutable-sortedset-are-not-stable-on-deletion

If you run this in a debugger, and set a breakpoint in TreeSet line 100, you see that the tree looks for example (after 5 iterations) like this:

BlackTree(5,(),BlackTree(4,(),BlackTree(3,(),BlackTree(2,(),BlackTree(1,(),Empty,Empty),Empty),Empty),Empty),BlackTree(99995,(),Empty,Empty))

In other words: unbalanced.

I assume this is not the expected behavior of RedBlack trees? Or am I doing something embarasingly wrong here?

kind regards,
Jan



--
Daniel C. Sobral

I travel to the future all the time.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland