This page is no longer maintained — Please continue to the home page at www.scala-lang.org

classfile parser cycles

7 replies
extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.

I spent way too long yesterday trying out ways of breaking cycles in
parsing cyclical java bytecode. Here is what I came up with. This
patch causes tickets #2433 (which is an indirect cycle) and #2940 (which
is a direct cycle) to compile correctly without breaking anything else
or imposing any performance cost on non-cyclical scenarios. I mention
both because various approaches I tried before this one caused one or
the other to work but not both.

https://lampsvn.epfl.ch/trac/scala/ticket/2433
https://lampsvn.epfl.ch/trac/scala/ticket/2940

Unfortunately I have a lot of difficulty gaining high confidence about
such patches, and at that stage so many things are sensitive to being
forced if I do so much as glance at them to find out what they hold.
Can you comment on whether this patch might be usable, or if it's in the
right ballpark. Here is the gist of it. At first I was doing a lot
more saving and restoring shuffling, but as far as I can see it didn't
make any difference, and it's enough to simply call parse again after
letting one member of the cycle complete.

- def parse(file: AbstractFile, root: Symbol) = try {
+ def parse(file: AbstractFile, root: Symbol): Unit = {
+ if (busy.isDefined) {
+ reparseRequired = true
+ root setInfo DummyTypeObject
+ }
+ else {
+ val reparse = parseInternal(file, root)
+ // reparse any symbols it returns
+ reparse foreach { case (file, root) =>
+ if (settings.debug.value)
+ println("Reparsing " + root.name)
+
+ parseInternal(file, root)
+ }
+ }
+ }
+

And then at the end of parseInternal, it returns:

+ // run it again if we deferred anything due to a cycle
+ if (reparseRequired) List((file, root)) else Nil

Iulian Dragos 2
Joined: 2009-02-10,
User offline. Last seen 42 years 45 weeks ago.
Re: classfile parser cycles

I don't know if you got any answer about this one, but it seems ok to
me. I presume 'parseInternal' is the current parse method?

You seem to use a sort of a lazy type, which is forced by reparsing at
the end of 'parse'. Maybe something similar would work when the cycle
involves inner classes? The difference is that laziness is needed for
the inner class /symbol/ (not type), since the type of the outer class
is needed to resolve inner classes, /while parsing the outer class/. I
think there's an impressive number of tickets that could be all solved
by this.

iulian

On Tue, Mar 2, 2010 at 4:21 PM, Paul Phillips wrote:
> I spent way too long yesterday trying out ways of breaking cycles in
> parsing cyclical java bytecode.  Here is what I came up with.  This
> patch causes tickets #2433 (which is an indirect cycle) and #2940 (which
> is a direct cycle) to compile correctly without breaking anything else
> or imposing any performance cost on non-cyclical scenarios.  I mention
> both because various approaches I tried before this one caused one or
> the other to work but not both.
>
>  https://lampsvn.epfl.ch/trac/scala/ticket/2433
>  https://lampsvn.epfl.ch/trac/scala/ticket/2940
>
> Unfortunately I have a lot of difficulty gaining high confidence about
> such patches, and at that stage so many things are sensitive to being
> forced if I do so much as glance at them to find out what they hold.
> Can you comment on whether this patch might be usable, or if it's in the
> right ballpark.  Here is the gist of it.  At first I was doing a lot
> more saving and restoring shuffling, but as far as I can see it didn't
> make any difference, and it's enough to simply call parse again after
> letting one member of the cycle complete.
>
> -  def parse(file: AbstractFile, root: Symbol) = try {
> +  def parse(file: AbstractFile, root: Symbol): Unit = {
> +    if (busy.isDefined) {
> +      reparseRequired = true
> +      root setInfo DummyTypeObject
> +    }
> +    else {
> +      val reparse = parseInternal(file, root)
> +      // reparse any symbols it returns
> +      reparse foreach { case (file, root) =>
> +        if (settings.debug.value)
> +          println("Reparsing " + root.name)
> +
> +        parseInternal(file, root)
> +      }
> +    }
> +  }
> +
>
> And then at the end of parseInternal, it returns:
>
> +      // run it again if we deferred anything due to a cycle
> +      if (reparseRequired) List((file, root)) else Nil
>
> --
> Paul Phillips      | A Sunday school is a prison in which children do
> Moral Alien        | penance for the evil conscience of their parents.
> Empiricist         |     -- H. L. Mencken
> i pull his palp!   |----------* http://www.improving.org/paulp/ *----------
>

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: classfile parser cycles

On Mon, Mar 08, 2010 at 02:39:35PM +0100, Iulian Dragos wrote:
> I don't know if you got any answer about this one, but it seems ok to
> me. I presume 'parseInternal' is the current parse method?

I haven't, and yes (plus or minus a little.)

> You seem to use a sort of a lazy type, which is forced by reparsing at
> the end of 'parse'. Maybe something similar would work when the cycle
> involves inner classes? The difference is that laziness is needed for
> the inner class /symbol/ (not type), since the type of the outer class
> is needed to resolve inner classes, /while parsing the outer class/. I
> think there's an impressive number of tickets that could be all solved
> by this.

Yeah, as far as I could see the doComplete/complete infrastructure
cannot solve this problem, because parsing inner classes forces
completion and the cycle will strike there unless at least one of the
cycle members is complete.

So yes, if this works in principle then I should try to adapt it for
inner classes. The reason I asked about it and haven't checked it in
despite it all looking good to me is that I was afraid this approach
might be somehow leaving symbols partially set up in a way which was not
apparent in my tests, thus introducing subtle breakage elsewhere (an
outcome I enjoy avoiding when I can.)

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: classfile parser cycles

I think I have another fix. In a way the single classfile parser
instance reflects a useful architecture invariant. I'd be wary to give
that up for an insignificant issue. I believe in this case we can just
replace existentialAbstraction with ExistentialType in the classfile
parser to break the cycle. What this means is that the classfile
parser will not try to be smarter than the bytecodes. If they specify
an existential type that's what you get, even if it might be
redundant.

Cheers

rytz
Joined: 2008-07-01,
User offline. Last seen 45 weeks 5 days ago.
Re: classfile parser cycles


On Wed, Mar 10, 2010 at 18:31, martin odersky <martin.odersky@epfl.ch> wrote:
I think I have another fix. In a way the single classfile parser
instance reflects a useful architecture invariant. I'd be wary to give
that up for an insignificant issue.

Currently, there is one ClassfileParser instance per classfile. In SymbolLoaders

  class ClassfileLoader(val classfile: AbstractFile) extends SymbolLoader {
    private object classfileParser extends ClassfileParser {
      val global: SymbolLoaders.this.global.type = SymbolLoaders.this.global
    }

This was introduced some time ago by Sean (r5494), I don't know what
the reason was. Changing it back now is not possible because it breaks
parsing of inner classes.

 
I believe in this case we can just
replace existentialAbstraction with ExistentialType in the classfile
parser to break the cycle. What this means is that the classfile
parser will not try to be smarter than the bytecodes. If they specify
an existential type that's what you get, even if it might be
redundant.

Cheers

extempore
Joined: 2008-12-17,
User offline. Last seen 35 weeks 3 days ago.
Re: classfile parser cycles

On Wed, Mar 10, 2010 at 06:31:57PM +0100, martin odersky wrote:
> I think I have another fix. In a way the single classfile parser
> instance reflects a useful architecture invariant. I'd be wary to give
> that up for an insignificant issue.

I don't think the issue is any less significant than is java interop.
If that is indeed insignificant I have a long list of good ideas. We
can try to solve it any way you want, but personally I'm not prepared to
punt on vast swaths of the java landscape. Two of the manifestations of
this that we're aware of are google gdata and thrift, and there is no
reasonable workaround available. We're not in some obscure java
backwater on this one - if we can't even parse common, legal bytecode
we're offering a pretty pale sort of interop.

Jorge Ortiz
Joined: 2008-12-16,
User offline. Last seen 29 weeks 4 days ago.
Re: classfile parser cycles
I agree strongly with Paul. This is not an insignificant issue. I don't know when the "Popular Voted Tickets" section was added to Trac (http://lampsvn.epfl.ch/trac/scala/report/9), but it's very enlightening on this issue. Two of the tickets associated with this bug (#2464 and #2433) are the 2nd and 4th "most popular" Trac tickets, respectively.
People frequently come to me with this issue, and it's deeply unsatisfying to have to say, "Sorry, Scala can't use that library."
--j
On Wed, Mar 10, 2010 at 4:36 PM, Paul Phillips <paulp@improving.org> wrote:
On Wed, Mar 10, 2010 at 06:31:57PM +0100, martin odersky wrote:
> I think I have another fix. In a way the single classfile parser
> instance reflects a useful architecture invariant. I'd be wary to give
> that up for an insignificant issue.

I don't think the issue is any less significant than is java interop.
If that is indeed insignificant I have a long list of good ideas.  We
can try to solve it any way you want, but personally I'm not prepared to
punt on vast swaths of the java landscape.  Two of the manifestations of
this that we're aware of are google gdata and thrift, and there is no
reasonable workaround available.  We're not in some obscure java
backwater on this one - if we can't even parse common, legal bytecode
we're offering a pretty pale sort of interop.

--
Paul Phillips      | It's better to have gloved and tossed than never to
Stickler           | have played baseball.
Empiricist         |
pull his pi pal!   |----------* http://www.improving.org/paulp/ *----------

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: classfile parser cycles

Sorry, of course this is an important issue and we need to solve it.
But architecturally, it's a corner case. We should try to not
compromise the overall architecture for this, for instance by putting
in a second and different way to break cycles. We should try to find
another solution which fits better with the way things are done.

Cheers

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland