This page is no longer maintained — Please continue to the home page at www.scala-lang.org

More Concrete AST?

9 replies
Mirko Stocker
Joined: 2009-09-10,
User offline. Last seen 45 weeks 6 days ago.

Hi all,

For my Scala Refactoring project [1], I'm currently digging into the Scala
compiler, especially the AST to find out how to best realize these
refactorings. To manipulate the source code, I need a syntax tree that isn't
too abstract so that I have a chance to regenerate the code as similar to the
original as possible.

For now, I've mostly explored the AST with the Eclipse debugger and -Ybrowse.
An example that troubles me is that these two expressions result in the same
ASTs (which makes sense, as they are identical):

for(i <- List(1)) yield i

List(1).map(i => i)

Unfortunately, this makes refactoring much harder, and I don't think users
would accept it if the tool just converted from one notation to the other :-)

Now, what I haven't found out yet, is it possible to get a more concrete
representation of the code?

I hope it is ok that I sent this to the internals-list even if I don't
contribute to the Scala code base (yet :-) ).

Thanks in advance

Mirko Stocker

[1] http://misto.ch/scala-refactoring-term-project/

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: More Concrete AST?

On Fri, Sep 25, 2009 at 10:30 AM, Mirko Stocker wrote:
> Hi all,
>
> For my Scala Refactoring project [1], I'm currently digging into the Scala
> compiler, especially the AST to find out how to best realize these
> refactorings. To manipulate the source code, I need a syntax tree that isn't
> too abstract so that I have a chance to regenerate the code as similar to the
> original as possible.
>
> For now, I've mostly explored the AST with the Eclipse debugger and -Ybrowse.
> An example that troubles me is that these two expressions result in the same
> ASTs (which makes sense, as they are identical):
>
>  for(i <- List(1)) yield i
>
>  List(1).map(i => i)
>
> Unfortunately, this makes refactoring much harder, and I don't think users
> would accept it if the tool just converted from one notation to the other :-)
>
> Now, what I haven't found out yet, is it possible to get a more concrete
> representation of the code?

Right now, no. The problem is that the first expression is typed as if
it was the second. There is not a separate type checking rule for
fors. That means, even if you changed the parser to produce a
ForExpression syntax tree, after type checking all you would see is
the map expression.

I am not sure what to do about this. Maybe there's a way to label
map/filter/flatMap expressions in some way to indicate that they came
from a for?

Cheers

Mirko Stocker
Joined: 2009-09-10,
User offline. Last seen 45 weeks 6 days ago.
Re: More Concrete AST?

On Friday 25 September 2009 10:49:13 martin odersky wrote:
> Right now, no. The problem is that the first expression is typed as if
> it was the second. There is not a separate type checking rule for
> fors. That means, even if you changed the parser to produce a
> ForExpression syntax tree, after type checking all you would see is
> the map expression.
>
> I am not sure what to do about this. Maybe there's a way to label
> map/filter/flatMap expressions in some way to indicate that they came
> from a for?

When refactoring, I should always have the source code around, and I know the
for/map's position. With that, I should be able to disambiguate and generate
the correct code.

Are there many other cases like the for<->map/filter expressions in the parser?

I also wonder if it might make sense to transform the AST to my own
"Refactoring Tree" so I could create a suitable representation (with comments,
white spaces, etc.) that I can also present to the outside world.

Another open question I have is where my code will live (assuming my project
is successful), what would you suggest?

Thanks!

Mirko

milessabin
Joined: 2008-08-11,
User offline. Last seen 33 weeks 3 days ago.
Re: More Concrete AST?

On Fri, Sep 25, 2009 at 9:49 AM, martin odersky wrote:
> I am not sure what to do about this. Maybe there's a way to label
> map/filter/flatMap expressions in some way to indicate that they came
> from a for?

I'm pretty sure it can be inferred from the combination of the node
positions and the source text. It's a bit yucky to have to back this
information out in this way, but it's better than nothing.

Cheers,

Miles

milessabin
Joined: 2008-08-11,
User offline. Last seen 33 weeks 3 days ago.
Re: More Concrete AST?

On Fri, Sep 25, 2009 at 10:12 AM, Mirko Stocker wrote:
> I also wonder if it might make sense to transform the AST to my own
> "Refactoring Tree" so I could create a suitable representation (with comments,
> white spaces, etc.) that I can also present to the outside world.

I'd say, no, not unless it's unavoidable ... I'd really like the
scalac AST to be the canonical one right across the Scala toolchain.

BTW, attaching comments to the tree (at the compilation unit level
with rather than interspersed throughout) is on my TODO list.

> Another open question I have is where my code will live (assuming my project
> is successful), what would you suggest?

I'll be wanting to use in Eclipse, so a top-level module in the EPFL
SVN would be wonderful :-)

Cheers,

Miles

Mirko Stocker
Joined: 2009-09-10,
User offline. Last seen 45 weeks 6 days ago.
Re: More Concrete AST?

On Friday 25 September 2009 12:51:07 Miles Sabin wrote:
> I'd say, no, not unless it's unavoidable ... I'd really like the
> scalac AST to be the canonical one right across the Scala toolchain.

I'd prefer it as well, it would save me some effort. But will the AST structure
remain stable over different scala versions?

> BTW, attaching comments to the tree (at the compilation unit level
> with rather than interspersed throughout) is on my TODO list.

That would be great!

> I'll be wanting to use in Eclipse, so a top-level module in the EPFL
> SVN would be wonderful :-)

Just for now and until I have something to show, I'm working with git-svn and
keep the code in my own repository at the university, I was more wondering how
I should organize my code, can I write a plug-in for scalac or put a more
sophisticated pretty printer directly into scala.tools.nsc.ast? What would you
advise?

Regards

Mirko

milessabin
Joined: 2008-08-11,
User offline. Last seen 33 weeks 3 days ago.
Re: More Concrete AST?

On Fri, Sep 25, 2009 at 12:22 PM, Mirko Stocker wrote:
> Just for now and until I have something to show, I'm working with git-svn and
> keep the code in my own repository at the university, I was more wondering how
> I should organize my code, can I write a plug-in for scalac or put a more
> sophisticated pretty printer directly into scala.tools.nsc.ast? What would you
> advise?

That's a question for Martin ...

Cheers,

Miles

Mirko Stocker
Joined: 2009-09-10,
User offline. Last seen 45 weeks 6 days ago.
Re: More Concrete AST?

On Friday 25 September 2009 12:51:07 Miles Sabin wrote:
> > I also wonder if it might make sense to transform the AST to my own
> > "Refactoring Tree" so I could create a suitable representation (with
> > comments, white spaces, etc.) that I can also present to the outside
> > world.
>
> I'd say, no, not unless it's unavoidable ... I'd really like the
> scalac AST to be the canonical one right across the Scala toolchain.

I've just discovered something scary, consider the following code:

case class A

Then the position of the ClassDef node doesn't contain the 'case' modifier, and
not even the top-level PackageDef node encloses 'case' :-(
So why is this a problem? Consider the case when I, for example with a 'move
class' refactoring, change the position of the case class. If I delete the
class at its current position, I also have to remove the 'case', so I would
have to parse the source to find the 'case'.. now, an alternative (and my
preferred way) would be to remove the class from the AST and just re-generate
the whole file (preserving all white space etc.) and replace everything. But
now, if there are comments:

/*case*/ case /*class*/ class Test

then I would still have to parse the file to find out where to insert the
comments into the newly generated code.

Ideally for me, such things as modifiers need to have a richer representation
(including a position!) in the AST rather than just being a flag of the
ClassDef.. but I guess that can't be simply changed?

Could we at least change the position so the case modifier is included in the
ClassDef?

Any other ideas or comments are greatly appreciated!

Regards

Mirko

milessabin
Joined: 2008-08-11,
User offline. Last seen 33 weeks 3 days ago.
Re: More Concrete AST?

On Sun, Sep 27, 2009 at 1:03 PM, Mirko Stocker wrote:
> I've just discovered something scary, consider the following code:
>
> case class A
>
> Then the position of the ClassDef node doesn't contain the 'case' modifier, and
> not even the top-level PackageDef node encloses 'case' :-(

Ouch.

> Could we at least change the position so the case modifier is included in the
> ClassDef?

Yes, that looks like a problem ... I'll look into it.

Cheers,

Miles

Mirko Stocker
Joined: 2009-09-10,
User offline. Last seen 45 weeks 6 days ago.
Re: More Concrete AST?

On Sunday 27 September 2009 14:16:12 Miles Sabin wrote:
> > Could we at least change the position so the case modifier is included in
> > the ClassDef?
>
> Yes, that looks like a problem ... I'll look into it.

Thanks! I see this is a problem with all modifiers ('val' isn't included in the
ValDef position).

Would it be acceptable if we attached to the Modifiers instances a list of
(long, Position) so we know the position of each flag?

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland