This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Parser combinator gotcha (sort of...)

5 replies
mtm
Joined: 2009-01-27,
User offline. Last seen 3 years 32 weeks ago.

I've been messing around with a parser to parse input such as:

foo(1, 2)

My first attempt was along the lines of:

class FooParser extends JavaTokenParsers {
def func : Parser[Int] = ident~"("~repsep(wholeNumber, ",")~")"
^^ { case f~"("~a~")" => execFunc(f, a) }
...
}

This works fine, but I wanted to get rid of some of the clutter:

def func : Parser[Int] = ident~"("~>repsep(wholeNumber, ",")<~")"
^^ { case f~a => execFunc(f, a) }

Looks better, but, alas, it won't compile. I was getting a type
error on the case pattern. Why? Well, it took me a while to grok
what was going on. As far as I can tell the first combinator, ~,
combined 'ident' and "(" and then the second combinator, ~>,
basically threw away the left result (not just the "("). Makes sense
once you realize that ~ and ~> have the same precedence.

Of course the following also works:

def func : Parser[Int] = ident~("("~>repsep(wholeNumber,
",")<~")") ^^ { case f~a => execFunc(f, a) }

But I'm not sure that looks any better than my original attempt.

I finally settled on:

def tuple : Parser[List[String]] = "("~>repsep(wholeNumber, ",")<~")"
def func : Parser[Int] = ident~tuple ^^ { case f~a =>
execFunc(f, a) }

since I was going to need tuples elsewhere in the parser.

I'm still a Scala newbie, so please enlighten me if there are more
idiomatic ways of writing this.

-mike

--
Mike T. Miller
Principal Software Engineer
A-Team
LinkedIn, Inc.

Meredith Gregory
Joined: 2008-12-17,
User offline. Last seen 42 years 45 weeks ago.
Re: Parser combinator gotcha (sort of...)
Mike,

i confess, the Scala rendition of the parser combinator stuff is hideous to me. If you'd like something a tad more robust and flexible and maintainable that still interoperates with Java and Scala, might i suggest BNFC. It pretty much just works.

It sits above parser generators like Alex+Happy (Haskell) or JLex+<WhateverTheJavaParserGeneratorIsCalled> or OCamlLex+OCamlYacc or ...

It's targets include
  • Java
  • C#
  • Haskell
  • OCaml
  • F#
  • XML
  • ...
The input looks like BNF with labeled choices. So, for example, to build my JSON parser, i cut the JSON BNF spec into my text editor, labeled the choices and had a working parser in 20 mins. For URIs, i did the same.

Best wishes,

--greg

On Thu, Feb 19, 2009 at 4:42 PM, Mike T. Miller <jinious@gmail.com> wrote:
I've been messing around with a parser to parse input such as:

foo(1, 2)

My first attempt was along the lines of:

class FooParser extends JavaTokenParsers {
 def func : Parser[Int]  = ident~"("~repsep(wholeNumber, ",")~")"  ^^  { case f~"("~a~")" => execFunc(f, a) }
 ...
}

This works fine, but I wanted to get rid of some of the clutter:

 def func : Parser[Int]  = ident~"("~>repsep(wholeNumber, ",")<~")"  ^^  { case f~a => execFunc(f, a) }

Looks better, but, alas, it won't compile.   I was getting a type error on the case pattern.  Why?  Well, it took me a while to grok what was going on.  As far as I can tell the first combinator, ~, combined 'ident' and "(" and then the second combinator, ~>,  basically threw away the left result (not just the "(").  Makes sense once you realize that ~ and ~> have the same precedence.

Of course the following also works:

 def func : Parser[Int]  = ident~("("~>repsep(wholeNumber, ",")<~")")  ^^  { case f~a => execFunc(f, a) }

But I'm not sure that looks any better than my original attempt.

I finally settled on:

 def tuple : Parser[List[String]] = "("~>repsep(wholeNumber, ",")<~")"
 def func : Parser[Int]  = ident~tuple  ^^  { case f~a => execFunc(f, a) }

since I was going to need tuples elsewhere in the parser.

I'm still a Scala newbie, so please enlighten me if there are more idiomatic ways of writing this.

-mike


--
Mike T. Miller
Principal Software Engineer
A-Team
LinkedIn, Inc.







--
L.G. Meredith
Managing Partner
Biosimilarity LLC
806 55th St NE
Seattle, WA 98105

+1 206.650.3740

http://biosimilarity.blogspot.com
David Biesack
Joined: 2008-11-18,
User offline. Last seen 2 years 38 weeks ago.
Re: Parser combinator gotcha (sort of...)

> From: "Mike T. Miller"
> Date: Thu, 19 Feb 2009 16:42:13 -0800
>
> ... I wanted to get rid of some of the clutter:
>
> def func : Parser[Int] = ident~"("~>repsep(wholeNumber, ",")<~")"
> ^^ { case f~a => execFunc(f, a) }
>
> Looks better, but, alas, it won't compile. I was getting a type
> error on the case pattern. Why? Well, it took me a while to grok
> what was going on. As far as I can tell the first combinator, ~,
> combined 'ident' and "(" and then the second combinator, ~>,
> basically threw away the left result (not just the "("). Makes sense
> once you realize that ~ and ~> have the same precedence.

I wrote this up on the Scala wiki page when I discovered this, too!

http://scala.sygneca.com/start
-> Standard library details
-> parsing == http://scala.sygneca.com//libs/parsing

At the time, a Google search for the compiler errors for this case
scala "constructor cannot be instantiated to expected type"
yielded no documentation (just source code links). Now, libs:parsing
is near the top of the list.

mtm
Joined: 2009-01-27,
User offline. Last seen 3 years 32 weeks ago.
Re: Parser combinator gotcha (sort of...)
I appreciate the pointer and may look at BNFC later, closer to production.  Right now I'm using this as an opportunity to get more familiar with Scala.  BNFC is probably more efficient, but that's not a concern right now.
-mike
On Feb 19, 2009, at 5:26 PM, Meredith Gregory wrote:
Mike,

i confess, the Scala rendition of the parser combinator stuff is hideous to me. If you'd like something a tad more robust and flexible and maintainable that still interoperates with Java and Scala, might i suggest BNFC. It pretty much just works.

It sits above parser generators like Alex+Happy (Haskell) or JLex+<WhateverTheJavaParserGeneratorIsCalled> or OCamlLex+OCamlYacc or ...

It's targets include
  • Java
  • C#
  • Haskell
  • OCaml
  • F#
  • XML
  • ...
The input looks like BNF with labeled choices. So, for example, to build my JSON parser, i cut the JSON BNF spec into my text editor, labeled the choices and had a working parser in 20 mins. For URIs, i did the same.

Best wishes,

--greg

On Thu, Feb 19, 2009 at 4:42 PM, Mike T. Miller <jinious@gmail.com> wrote:
I've been messing around with a parser to parse input such as:

foo(1, 2)

My first attempt was along the lines of:

class FooParser extends JavaTokenParsers {
 def func : Parser[Int]  = ident~"("~repsep(wholeNumber, ",")~")"  ^^  { case f~"("~a~")" => execFunc(f, a) }
 ...
}

This works fine, but I wanted to get rid of some of the clutter:

 def func : Parser[Int]  = ident~"("~>repsep(wholeNumber, ",")<~")"  ^^  { case f~a => execFunc(f, a) }

Looks better, but, alas, it won't compile.   I was getting a type error on the case pattern.  Why?  Well, it took me a while to grok what was going on.  As far as I can tell the first combinator, ~, combined 'ident' and "(" and then the second combinator, ~>,  basically threw away the left result (not just the "(").  Makes sense once you realize that ~ and ~> have the same precedence.

Of course the following also works:

 def func : Parser[Int]  = ident~("("~>repsep(wholeNumber, ",")<~")")  ^^  { case f~a => execFunc(f, a) }

But I'm not sure that looks any better than my original attempt.

I finally settled on:

 def tuple : Parser[List[String]] = "("~>repsep(wholeNumber, ",")<~")"
 def func : Parser[Int]  = ident~tuple  ^^  { case f~a => execFunc(f, a) }

since I was going to need tuples elsewhere in the parser.

I'm still a Scala newbie, so please enlighten me if there are more idiomatic ways of writing this.

-mike


--
Mike T. Miller
Principal Software Engineer
A-Team
LinkedIn, Inc.







--
L.G. Meredith
Managing Partner
Biosimilarity LLC
806 55th St NE
Seattle, WA 98105

+1 206.650.3740

http://biosimilarity.blogspot.com

Andreas W
Joined: 2009-01-10,
User offline. Last seen 1 year 14 weeks ago.
Re: Parser combinator gotcha (sort of...)

The parser combinators are great, but this example shows that there is some repetition/boilerplate involved in the
id~"+"~id ^^ { case x~"+"~y => ... } pattern.

Wouldn't it be great if instead you could write something like this:

case x:id~"+"~y:id => x+y

or, following in the earlier example:

case f:ident~"("~a:repsep(wholeNumber, ",")~")" => execFunc(f, a)

I am not sure it is feasible, I just know that I have at times wished for it.

Andreas

odersky
Joined: 2008-07-29,
User offline. Last seen 45 weeks 6 days ago.
Re: Parser combinator gotcha (sort of...)

On Sat, Feb 21, 2009 at 12:46 AM, Windemuth Andreas wrote:
>
> The parser combinators are great, but this example shows that there is some repetition/boilerplate involved in the
> id~"+"~id ^^ { case x~"+"~y => ... } pattern.
>
> Wouldn't it be great if instead you could write something like this:
>
> case x:id~"+"~y:id => x+y
>
> or, following in the earlier example:
>
> case f:ident~"("~a:repsep(wholeNumber, ",")~")" => execFunc(f, a)
>
> I am not sure it is feasible, I just know that I have at times wished for it.
>
Very interesting idea! It might be that something like this is
feasible using extractors:

case (x@id)~"+"~(y@id) => ...

I hve not tried it, but it looks plausible that one can define an
extractor-based parser combinator library that would allow this.

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland