partest actor deadlocks
Tue, 2009-10-06, 00:19
At some point in the recent past, the test suite started getting hung
up. At first I didn't think much of it since I'm always breaking stuff
locally, but now I realize it must be the actors. I have no data, just
the knowledge that I can find the test suite stopped for good, probably
because an actor lost a message somewhere.
The next time I catch it I'll try to get a trace. In the meantime,
please trust me that this is happening and if you know of any recent
change to the actors that might be causing it, give it another look.
Testing is laborious enough already.
Tue, 2009-10-06, 10:27
#2
Re: Re: partest actor deadlocks
Paul Phillips wrote:
> Here is a thread dump:
>
> http://paste.pocoo.org/show/143193/
>
> Here is what looks like a scary patch which coincides with when I
> started getting deadlocks:
>
> http://lampsvn.epfl.ch/trac/scala/changeset/18781
I'd be surprised if this would be the cause. Although it touches many
files, it mostly renames some vars, adds deprecation annotations, and
changes access modifiers.
Actually, I suspect there might be a problem in
`scala/tools/partest/nest/StreamAppender.scala`, because that's where
the threads in the dump seem to be stuck. (It is used in
Worker.scala @ line 223. There are two StreamAppenders writing to the
same FileWriter.)
What about using a better process abstraction/implementation?
Cheers,
Philipp
Tue, 2009-10-06, 19:17
#3
Re: Re: partest actor deadlocks
Philipp Haller wrote:
> Paul Phillips wrote:
>> Here is a thread dump:
>>
>> http://paste.pocoo.org/show/143193/
>>
>> Here is what looks like a scary patch which coincides with when I
>> started getting deadlocks:
>>
>> http://lampsvn.epfl.ch/trac/scala/changeset/18781
>
> I'd be surprised if this would be the cause. Although it touches many
> files, it mostly renames some vars, adds deprecation annotations, and
> changes access modifiers.
>
> Actually, I suspect there might be a problem in
> `scala/tools/partest/nest/StreamAppender.scala`, because that's where
> the threads in the dump seem to be stuck. (It is used in
> Worker.scala @ line 223. There are two StreamAppenders writing to the
> same FileWriter.)
> What about using a better process abstraction/implementation?
So, I tried to reproduce it. When it seemed to get stuck in a similar
way on my machine, it turned out that the test was timing out after 10
minutes or so, because it did not terminate. Partest then continued
normally.
Other than that I couldn't reproduce it today.
Philipp
Tue, 2009-10-06, 20:27
#4
Re: Re: partest actor deadlocks
On Tue, Oct 06, 2009 at 08:07:50PM +0200, Philipp Haller wrote:
> So, I tried to reproduce it. When it seemed to get stuck in a similar
> way on my machine, it turned out that the test was timing out after 10
> minutes or so, because it did not terminate. Partest then continued
> normally.
That's interesting. Do you know what test that was? I would like to
move the tests with ongoing non-deterministic results somewhere else,
and also I'd like to lower the partest timeout to the lowest value which
won't induce spurious failures on a slowish machine. Given that you
need about a gig of ram to even build scalac, I doubt people are trying
to build on their 4 MHz wristwatch chips. Other than one or two tests
which are needlessly thorough (bridges, I'm looking at you) I would
guess one minute is plenty of time.
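For illustration, a per-test timeout of the kind discussed here can be sketched on top of `java.util.concurrent`. This is not partest's actual mechanism; `TimeoutSketch`, `runWithTimeout`, and the `Outcome` cases are hypothetical names, and the test body is assumed to be a block returning `true` on success.

```scala
import java.util.concurrent.{Callable, ExecutionException, Executors, TimeUnit, TimeoutException}

object TimeoutSketch {
  sealed trait Outcome
  case object Passed extends Outcome
  case object Failed extends Outcome
  case object TimedOut extends Outcome

  // Runs `body` on a separate thread and gives up after `timeoutMillis`.
  def runWithTimeout(timeoutMillis: Long)(body: => Boolean): Outcome = {
    val pool = Executors.newSingleThreadExecutor()
    try {
      val future = pool.submit(new Callable[Boolean] { def call(): Boolean = body })
      try {
        if (future.get(timeoutMillis, TimeUnit.MILLISECONDS)) Passed else Failed
      } catch {
        case _: TimeoutException =>
          // Interrupts the worker thread; a test that ignores interrupts
          // would keep running in the background.
          future.cancel(true)
          TimedOut
        case _: ExecutionException =>
          // The test body threw an exception.
          Failed
      }
    } finally pool.shutdownNow()
  }
}
```

With something like this, the one-minute limit suggested above would be `TimeoutSketch.runWithTimeout(60000L) { ... }`, and a hung test is reported as `TimedOut` instead of stalling the whole run.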
Tue, 2009-10-06, 21:57
#5
Re: Re: partest actor deadlocks
On Tue, Oct 06, 2009 at 08:07:50PM +0200, Philipp Haller wrote:
> So, I tried to reproduce it. When it seemed to get stuck in a similar
> way on my machine, it turned out that the test was timing out after 10
> minutes or so, because it did not terminate. Partest then continued
> normally.
I was able to catch the (or if not "the", "a") hung test in the act, and
it is:
files/jvm/reactor-exceptionOnSend
As a side note it'd sure be nice if partest printed the name of the test
BEFORE it started the test rather than after, because when a test is
hung or taking too long we instead are treated to the name of the last
successfully run test. Which might be a little useful if they were run
in a deterministic order, but no.
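A minimal sketch of the behaviour asked for above, assuming a test can be modelled as a block returning `true` on success. `AnnounceSketch`, `runAnnounced`, and the `[OK]`/`[FAILED]` markers are invented for illustration and are not partest's real API or output format.

```scala
object AnnounceSketch {
  // Prints the test name and flushes BEFORE the test body runs, so that a
  // hang leaves the culprit's name as the last thing on the console.
  def runAnnounced(name: String, out: java.io.PrintStream = System.out)(body: => Boolean): Boolean = {
    out.print(s"testing: $name ... ")
    out.flush()
    // A test that throws counts as failed.
    val ok = try body catch { case _: Exception => false }
    out.println(if (ok) "[OK]" else "[FAILED]")
    ok
  }
}
```

The explicit `flush()` matters: without it a buffered stream could still hold the name back until the test finishes, which is exactly the situation being complained about.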
Tue, 2009-10-06, 22:07
#6
Re: Re: partest actor deadlocks
On Tue, Oct 06, 2009 at 08:07:50PM +0200, Philipp Haller wrote:
> So, I tried to reproduce it. When it seemed to get stuck in a similar
> way on my machine, it turned out that the test was timing out after 10
> minutes or so, because it did not terminate.
I haven't looked for where timeouts are set, but I waited 20 minutes for
it to time out, and no dice. I disabled that test in trunk for now.
Tue, 2009-10-06, 22:17
#7
Re: Re: partest actor deadlocks
And then seconds after I sent that it timed out.
Tue, 2009-10-06, 22:47
#8
Re: Re: partest actor deadlocks
Paul Phillips wrote:
> On Tue, Oct 06, 2009 at 08:07:50PM +0200, Philipp Haller wrote:
>> So, I tried to reproduce it. When it seemed to get stuck in a similar
>> way on my machine, it turned out that the test was timing out after 10
>> minutes or so, because it did not terminate. Partest then continued
>> normally.
>
> I was able to catch the (or if not "the", "a") hung test in the act, and
> it is:
>
> files/jvm/reactor-exceptionOnSend
OK, good to know. I'll take a look.
Actually, the test that timed out in my case had to time out given the
changes I was testing.
> As a side note it'd sure be nice if partest printed the name of the test
> BEFORE it started the test rather than after, because when a test is
> hung or taking too long we instead are treated to the name of the last
> successfully run test. Which might be a little useful if they were run
> in a deterministic order, but no.
Indeed.
Philipp
Thu, 2009-10-08, 00:27
#9
Re: Re: partest actor deadlocks
Still hanging. The last two times I caught it, it was t2359.scala,
which sure looks like a born hanger. I disabled that too. At some
point when I'm less frustrated with squandered time I am going to
partition the test suite into deterministic and non-deterministic
sections.
Here is a thread dump:
http://paste.pocoo.org/show/143193/
Here is what looks like a scary patch which coincides with when I
started getting deadlocks:
http://lampsvn.epfl.ch/trac/scala/changeset/18781
...and here is where I express frustration, because I could parallelize
the test suite with about five lines of code and it wouldn't deadlock.
This is a huge drag: I'm trying to rewrite difficult parts of the
compiler and I have no hope of making steady progress if I can't
regularly confirm that the test suite passes.
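For readers wondering what an actor-free parallel runner might look like, here is one possible sketch using a plain thread pool. It is not the five-line version alluded to above and not partest code. `ParallelSketch` and `runAll` are hypothetical names, each test is assumed to be a `() => Boolean`, and the timeout applies to the whole batch rather than to each test.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

object ParallelSketch {
  // Runs every named test on a fixed-size thread pool and returns name -> passed.
  // A test that throws, or that is still unfinished when the batch timeout
  // expires, counts as failed.
  def runAll(tests: Seq[(String, () => Boolean)], threads: Int, timeoutMillis: Long): Map[String, Boolean] = {
    val pool = Executors.newFixedThreadPool(threads)
    try {
      val tasks = new java.util.ArrayList[Callable[Boolean]]()
      tests.foreach { case (_, body) =>
        tasks.add(new Callable[Boolean] { def call(): Boolean = body() })
      }
      // invokeAll blocks until all tasks finish or the timeout expires;
      // tasks that have not completed by then are cancelled.
      val futures = pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS)
      tests.zipWithIndex.map { case ((name, _), i) =>
        val f = futures.get(i)
        val passed = !f.isCancelled && (try f.get() catch { case _: Exception => false })
        name -> passed
      }.toMap
    } finally pool.shutdownNow()
  }
}
```

Because the only coordination is `invokeAll` waiting on futures, there is no message passing to lose; a hung test shows up as a failed entry in the result map once the timeout expires.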