Using lookahead in the new parser in trunk
Wed, 2009-05-27, 10:47
Hey All
I am currently updating an old patch to the Parser
(ast/parser/Parsers.scala) but I have a problem using the next token!
I can use in.token to test what token I am looking at now, but
in.next.token is always EMPTY (-3) ...
So what I want to do and what worked before is
if (in.token == SUPER && in.next.token != LBRACKET && in.next.token != DOT ) {
This will find all cases where the super keyword is used and is not part
of a path or anything. How should something like this be written now?
/Anders
Wed, 2009-05-27, 14:57
#2
Re: Using lookahead in the new parser in trunk
Scanners1 was the old Scanners before Sean replaced it with
NewScanners. So Scanners1/Scanners is a reversion to the pre-Sean state.
NewScanners had issues with the new collections and I did not
understand it sufficiently well to fix it with confidence. Besides, a
lot of it was obsolete, dating from some failed experiments with the
Eclipse plugin.
next and prev are for internal Scanner consumption only. The scanner
stores the next token there if it was forced to do a lookahead.
Otherwise it is empty.
(We are currently re-doing the Eclipse plugin. Having a really fast
Scanners & Parsers combo is essential for this, because it will be
invoked on every keystroke. So, I am against nice-to-haves that would
slow down the Scanner.)
Hope this helps
Wed, 2009-05-27, 15:07
#3
Re: Using lookahead in the new parser in trunk
On Wed, May 27, 2009 at 03:53:30PM +0200, martin odersky wrote:
> (We are currently re-doing the Eclipse plugin. Having a really fast
> Scanners & Parsers combo is essential for this, because it will be
> invoked on every keystroke. So, I am against nice-to-haves that would
> slow down the Scanner.)
I can totally understand not keeping the next token "hot", but unless
you object I figured I would encapsulate the logic already used where a
lookahead is necessary so at least it doesn't get duplicated:
if (token == CASE) {
  prev copyFrom this
  val nextLastOffset = charOffset - 1
  fetchToken()
  ...
  } else {
    lastOffset = nextLastOffset
    next copyFrom this
    this copyFrom prev
Wed, 2009-05-27, 15:17
#4
Re: Using lookahead in the new parser in trunk
Hey Martin
Yes, from the Eclipse perspective a fast scanner/parser combo is
important; that I can see and I won't argue ;-)
But the early definitions syntax, as you suggested yourself, should use
the super keyword as a delimiter between the early definition statements
and the normal body statements. This means that I have to identify the
bare super statement and not super[X] or super.something, because they
are also valid, but not as the delimiter between the two parts.
One solution could be to create a new token SUPERKW (Super Key Word)
that is identified in the Scanner in the same way as CASECLASS and the
like, so that a simple match in the Parser can find this special super
delimiter. What is your opinion about this, or
do you have a better idea?
/Anders
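The SUPERKW idea above can be sketched roughly as follows. This is only a toy model over a pre-tokenized list, not scalac code: the object name SuperTokens, the helper markBareSuper and the token set are assumptions made for illustration, and SUPERKW itself is the hypothetical token proposed in this message.

```scala
// Illustrative token names only; SUPERKW is the hypothetical token proposed above.
object SuperTokens {
  sealed trait Token
  case object SUPER extends Token
  case object SUPERKW extends Token   // hypothetical: a bare `super` delimiter
  case object DOT extends Token
  case object LBRACKET extends Token
  case object LBRACE extends Token
  case object IDENT extends Token

  // Rewrite SUPER to SUPERKW unless it starts a path (super.x) or a
  // qualified reference (super[T]); all other tokens pass through unchanged.
  def markBareSuper(tokens: List[Token]): List[Token] = tokens match {
    case SUPER :: (rest @ ((DOT | LBRACKET) :: _)) => SUPER :: markBareSuper(rest)
    case SUPER :: rest                             => SUPERKW :: markBareSuper(rest)
    case t :: rest                                 => t :: markBareSuper(rest)
    case Nil                                       => Nil
  }
}
```

With something like this done in the scanner, the parser would only need to match on a single token instead of inspecting the token that follows super.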
Martin Odersky wrote:
> Scanners1 was the old Scanners before Sean replaced it with
> NewScanners. So Scanners1/Scanners is a reversion to the pre-Sean state.
> NewScanners had issues with the new collections and I did not
> understand it sufficiently well to fix it with confidence. Besides, a
> lot of it was obsolete, dating from some failed experiments with the
> Eclipse plugin.
>
> next and prev are for internal Scanner consumption only. The scanner
> stores the next token there if it was forced to do a lookahead.
> Otherwise it is empty.
>
> (We are currently re-doing the Eclipse plugin. Having a really fast
> Scanners & Parsers combo is essential for this, because it will be
> invoked on every keystroke. So, I am against nice-to-haves that would
> slow down the Scanner.)
>
> Hope this helps
>
Sat, 2009-05-30, 22:17
#5
Re: Using lookahead in the new parser in trunk
Hi Paul, Anders:
Yes, adding a method for getting a lookahead token in Scanners makes sense.
Cheers
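A lookahead method of the kind discussed here might look roughly like the sketch below. It is a toy model, not the actual Scanners code: Tok, TokenData, ToyScanner and lookaheadToken are hypothetical names, the input is a pre-tokenized vector of made-up token codes, and only the save/fetch/restore pattern with prev and next is taken from the snippet quoted earlier in the thread.

```scala
// Toy model only: token codes and class names are illustrative, not scalac's.
object Tok {
  val EOF = 0; val SUPER = 1; val DOT = 2; val LBRACKET = 3; val IDENT = 4
  val EMPTY = -3
}

// Mutable token record, mirroring the copyFrom style quoted in the thread.
class TokenData {
  var token: Int = Tok.EMPTY
  var offset: Int = 0
  def copyFrom(td: TokenData): Unit = {
    token = td.token
    offset = td.offset
  }
}

// Scanner over a pre-tokenized input; `next` is filled only on demand.
class ToyScanner(input: Vector[Int]) extends TokenData {
  private var pos = 0
  val next = new TokenData
  private val prev = new TokenData

  // Read one token from the input into `this`.
  private def fetchToken(): Unit = {
    offset = pos
    token = if (pos < input.length) input(pos) else Tok.EOF
    if (pos < input.length) pos += 1
  }

  // Advance: reuse the buffered lookahead if there is one.
  def nextToken(): Unit =
    if (next.token == Tok.EMPTY) fetchToken()
    else { this.copyFrom(next); next.token = Tok.EMPTY }

  // Hypothetical helper: peek one token ahead without consuming the current one.
  def lookaheadToken: Int = {
    if (next.token == Tok.EMPTY) {
      prev.copyFrom(this)   // save the current token
      fetchToken()          // scan the following token into `this`
      next.copyFrom(this)   // stash it as the lookahead
      this.copyFrom(prev)   // restore the current token
    }
    next.token
  }

  nextToken() // prime the scanner with the first token
}
```

In this sketch the extra work happens only when a caller actually asks for the lookahead, so plain nextToken() calls pay nothing beyond the check on next.token.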
After a little talk with paulp on IRC, we wondered about the origin of
Scanner1 (before it was merged in as Scanner in the current trunk).
Before, there were Scanner and NewScanner, and it was NewScanner that was
used in the compiler. Then Martin created Scanner1 from file ? (yes,
there is no svn log of what the origin of Scanner1 is; if you know
better, please say so).
This new Scanner1 shares some code directly with NewScanner, but some
things are missing.
In Scanner1 (the current Scanner) there are next and prev tokens that
can be used for lookahead (if you trust the comment), but most of the
time next is EMPTY.
My understanding of how the scanner should have been designed is that
in.token holds the current token, in.next.token holds the next token, and
prev holds the previous one. When we do a nextToken(), we move the
content of next to this and load the following token into in.next. This way
you have one-token lookahead. As far as I can see in the source, this is
not what currently happens.
/Anders
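The design described in this message, where next is always filled, can be sketched as below. This is a toy model with string tokens, not the compiler's scanner; EagerScanner and its members are names assumed for illustration.

```scala
// Toy model of the eager design: `next` always holds the following token.
class EagerScanner(input: Vector[String]) {
  private var pos = 0

  // Read one token from the input, or "EOF" once it is exhausted.
  private def read(): String = {
    val t = if (pos < input.length) input(pos) else "EOF"
    pos += 1
    t
  }

  var prev: String = "EMPTY"
  var token: String = read()   // current token
  var next: String = read()    // one-token lookahead, always available

  // Shift the window: current becomes prev, next becomes current.
  def nextToken(): Unit = {
    prev = token
    token = next
    next = read()
  }
}
```

Unlike a scanner that fills next only when it is forced to look ahead, this variant shifts all three slots on every nextToken() call.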
Anders Bach Nielsen wrote:
> Hey All
>
> I am currently updating an old patch to the Parser
> (ast/parser/Parsers.scala) but I have a problem using the next token!
>
> I can use in.token to test what token I am looking at now, but
> in.next.token is always EMPTY (-3) ...
>
> So what I want to do and what worked before is
>
> if (in.token == SUPER && in.next.token != LBRACKET && in.next.token != DOT ) {
>
> This will find all cases where the super keyword is used and is not part
> of a path or anything. How should something like this be written now?
>
> /Anders
>