This page is no longer maintained — Please continue to the home page at www.scala-lang.org

Feedback on my implementation of Shunting-yard

6 replies
jcrites
Joined: 2009-01-14,
User offline. Last seen 42 years 45 weeks ago.

Hi all,

I'm experimenting with programming languages in Scala, learning about their
implementation. One of the classic parsing algorithms is the Shunting-yard
algorithm [1] for operator precedence parsing. The algorithm transforms
infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 *
+".

As I am looking to develop expertise with Scala, I thought I would submit my
implementation for feedback.

First, an expression type:

sealed abstract class Expr
case class Num(i:Int) extends Expr
case class Add extends Expr
case class Mul extends Expr

Expressions are represented by Queue[Expr]. The Queue and Stack in this
code are mutable. An execution algorithm:

def execute(instructions: Queue[Expr]) = {
val stack = new Stack[Expr]
while (!instructions.isEmpty) {
def runInstruction(e:Expr) {
e match {
case n:Num => stack.push(n)
case a:Add => {
val op1 = stack.pop match { case Num(i) => i }
val op2 = stack.pop match { case Num(i) => i }
stack.push(Num(op1 + op2))
}
}
}
val i = instructions.dequeue()
runInstruction(i)
}
stack.pop()
}

We can use the execute code like this:

val q = new Queue[Expr]
q.enqueue(Num(1))
q.enqueue(Num(2))
q.enqueue(Add())
println(execute(q))

This will print "Num(3)". Now, the Shunting-yard algorithm. First we need
a function to map operators to their predecence. Higher number represents
higher (tighter) binding precedence.

def precedence(e:Expr) = e match {
case x:Add => 2
case x:Mul => 4
}

def isOperator(e:Expr) = e match {
case n:Num => false
case _ => true
}

Finally, the reordering algorithm:

def reorder(instructions: Queue[Expr]) = {
val output = new Queue[Expr]
val stack = new Stack[Expr]
while (!instructions.isEmpty) {
val tok = instructions.dequeue()
tok match {
case n:Num => {
output.enqueue(n)
}
case _ => {
while (!stack.isEmpty && isOperator(stack.top) &&
precedence(tok) <= precedence(stack.top)) {
output.enqueue(stack.pop())
}
stack.push(tok)
}
}
}
while (!stack.isEmpty) {
output.enqueue(stack.pop())
}
output
}

One can use the code like this:

val q2 = new Queue[Expr]
q2.enqueue(Num(3))
q2.enqueue(Add())
q2.enqueue(Num(4))
q2.enqueue(Mul())
q2.enqueue(Num(5))
println(q2)
println(reorder(q2))

The input expression (infix form) prints as "Queue(Num(3), Add(), Num(4),
Mul(), Num(5))". After Shunting-yard reordering, we get "Queue(Num(3),
Num(4), Num(5), Mul(), Add())", showing that multiplication is tighter than
addition.

If I change the precedence to:

def precedence(e:Expr) = e match {
case x:Add => 2
case x:Mul => 0
}

Then addition binds tighter, yielding the result "Queue(Num(3), Num(4),
Add(), Num(5), Mul())".

Feedback / comments appreciated.

What are the disadvantages of operator precedence parsing with
Shunting-yard? It seems like a very flexible and fast algorithm with many
advantages.

[1] http://en.wikipedia.org/wiki/Shunting_yard_algorithm

--
Justin Crites

Landei
Joined: 2008-12-18,
User offline. Last seen 45 weeks 4 days ago.
Re: Feedback on my implementation of Shunting-yard

jcrites wrote:
>
> Hi all,
>
> I'm experimenting with programming languages in Scala, learning about
> their implementation. One of the classic parsing algorithms is the
> Shunting-yard algorithm [1] for operator precedence parsing. The
> algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> expressions like "1 2 3 * +".
>
> As I am looking to develop expertise with Scala, I thought I would submit
> my implementation for feedback.
>

I'm also still a beginner, but your code looks so terrible... Java-ish :-)
All these confusing whiles and mutable state. I "feel" this code could be
made much clearer, but I guess this would take me while. Menwhile probably
one of the gurus materializes a two-liner inspired by a Zen-Koan (which
includes Haskell Curry, Buddha and three pound flax)
To give you at least a tiny bit of concrete advise: If you have
parameterless case classes (Add, Mul), you can always (?) use case objects
instead.

Take care!
Daniel

Landei
Joined: 2008-12-18,
User offline. Last seen 45 weeks 4 days ago.
Re: Feedback on my implementation of Shunting-yard
Landei wrote:
jcrites wrote: Hi all, I'm experimenting with programming languages in Scala, learning about their implementation. One of the classic parsing algorithms is the Shunting-yard algorithm [1] for operator precedence parsing. The algorithm transforms infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 * +". As I am looking to develop expertise with Scala, I thought I would submit my implementation for feedback.
I'm also still a beginner, but your code looks so terrible... Java-ish :-) All these confusing whiles and mutable state. I "feel" this code could be made much clearer, but I guess this would take me while. Menwhile probably one of the gurus materializes a two-liner inspired by a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax) To give you at least a tiny bit of concrete advise: If you have parameterless case classes (Add, Mul), you can always (?) use case objects instead. Take care! Daniel
That's what I got so far - not exactly beautiful, but a little bit more functional:
    sealed abstract class Expr
    case class Num(i:Int) extends Expr
    abstract case class Op extends Expr
    case object Add extends Op
    case object Mul extends Op

    def op(a:Int, b:Int, o:Op) = o match {
        case Add => a + b
        case Mul => a * b
    }

    def execute(instructions: Seq[Expr]) = {
        instructions.foldLeft(List[Expr]()){
            (list:List[Expr], e:Expr) => e match {
                case n:Num => n :: list
                case o:Op =>
                    list match {
                        case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail
                        case _ => error("Parse error: " + list)
                    }
            }}.head
    }

    val q = List(Num(1), Num(2), Add)
    println(execute(q))


Happy meditating!
Daniel
View this message in context: Re: Feedback on my implementation of Shunting-yard
Sent from the Scala mailing list archive at Nabble.com.
Ricky Clarkson
Joined: 2008-12-19,
User offline. Last seen 3 years 2 weeks ago.
Re: Re: Feedback on my implementation of Shunting-yard
If you have the book Programming in Scala, take a look at the Combinator Parsing chapter.  It has an example of parsing arithmetic expressions, and it does precedence quite simply by nesting parse rules.

E.g., in the expression 3 * 2 + 4 * 6, we have:

term + term

In each term, 3 * 2, we have factor * factor.

So a term is a number of factor * factor [* factor..] together and an expr is a number of term + term [+ term..] together.  I find that quite elegant.

The relevant three lines of code (but you'll need the rest of the chapter or a similar resource to make use of them):

def expr: Parser[Any] = term~rep("+"~term | ""~
term)
def term: Parser[Any] = factor~rep("*"~factor | "/"~factor)
def factor: Parser[Any] = floatingPointNumber | "("~expr~")"

You might want to see my recent thread about reverse pattern matching on Lists, for some ideas on how pattern matching can help.  The code I posted in that thread implements a toy postfix interpreter, and as I recall, with no vars.

2009/1/14 Landei <Daniel.Gronau@gmx.de>
Landei wrote:
jcrites wrote: Hi all, I'm experimenting with programming languages in Scala, learning about their implementation. One of the classic parsing algorithms is the Shunting-yard algorithm [1] for operator precedence parsing. The algorithm transforms infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 * +". As I am looking to develop expertise with Scala, I thought I would submit my implementation for feedback.
I'm also still a beginner, but your code looks so terrible... Java-ish :-) All these confusing whiles and mutable state. I "feel" this code could be made much clearer, but I guess this would take me while. Menwhile probably one of the gurus materializes a two-liner inspired by a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax) To give you at least a tiny bit of concrete advise: If you have parameterless case classes (Add, Mul), you can always (?) use case objects instead. Take care! Daniel
That's what I got so far - not exactly beautiful, but a little bit more functional:
    sealed abstract class Expr
    case class Num(i:Int) extends Expr
    abstract case class Op extends Expr
    case object Add extends Op
    case object Mul extends Op

    def op(a:Int, b:Int, o:Op) = o match {
        case Add => a + b
        case Mul => a * b
    }

    def execute(instructions: Seq[Expr]) = {
        instructions.foldLeft(List[Expr]()){
            (list:List[Expr], e:Expr) => e match {
                case n:Num => n :: list
                case o:Op =>
                    list match {
                        case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail
                        case _ => error("Parse error: " + list)
                    }
            }}.head
    }

    val q = List(Num(1), Num(2), Add)
    println(execute(q))


Happy meditating!
Daniel
View this message in context: Re: Feedback on my implementation of Shunting-yard
Sent from the Scala mailing list archive at Nabble.com.

Jim McBeath
Joined: 2009-01-02,
User offline. Last seen 42 years 45 weeks ago.
Re: Re: Feedback on my implementation of Shunting-yard

It's also possible to factor out the precedence level when using parser
combinators, as I describe in the "Precdence Revisited" section of my
blog entry about Scala Parser Combinators:

Thhat blog entry also includes a bunch of references at the end to
other blog entries and articles abouut parser combinators (mostly Scala).

--
Jim

On Thu, Jan 15, 2009 at 12:36:08AM +0000, Ricky Clarkson wrote:
> Date: Thu, 15 Jan 2009 00:36:08 +0000
> From: Ricky Clarkson
> To: Landei
> Cc: scala@listes.epfl.ch
> Subject: Re: [scala] Re: Feedback on my implementation of Shunting-yard
>
> If you have the book Programming in Scala, take a look at the
> Combinator Parsing chapter. It has an example of parsing arithmetic
> expressions, and it does precedence quite simply by nesting parse
> rules.
> E.g., in the expression 3 * 2 + 4 * 6, we have:
> term + term
> In each term, 3 * 2, we have factor * factor.
> So a term is a number of factor * factor [* factor..] together and an
> expr is a number of term + term [+ term..] together. I find that quite
> elegant.
> The relevant three lines of code (but you'll need the rest of the
> chapter or a similar resource to make use of them):
> def expr: Parser[Any] = term~rep("+"~term | ""~
> term)
> def term: Parser[Any] = factor~rep("*"~factor | "/"~factor)
> def factor: Parser[Any] = floatingPointNumber | "("~expr~")"
> You might want to see my recent thread about reverse pattern matching
> on Lists, for some ideas on how pattern matching can help. The code I
> posted in that thread implements a toy postfix interpreter, and as I
> recall, with no vars.
>
> 2009/1/14 Landei
>
> Landei wrote:
>
> jcrites wrote:
> Hi all, I'm experimenting with programming languages in Scala, learning
> about their implementation. One of the classic parsing algorithms is
> the Shunting-yard algorithm [1] for operator precedence parsing. The
> algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> expressions like "1 2 3 * +". As I am looking to develop expertise with
> Scala, I thought I would submit my implementation for feedback.
>
> I'm also still a beginner, but your code looks so terrible... Java-ish
> :-) All these confusing whiles and mutable state. I "feel" this code
> could be made much clearer, but I guess this would take me while.
> Menwhile probably one of the gurus materializes a two-liner inspired by
> a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax)
> To give you at least a tiny bit of concrete advise: If you have
> parameterless case classes (Add, Mul), you can always (?) use case
> objects instead. Take care! Daniel
>
> That's what I got so far - not exactly beautiful, but a little bit
> more functional:
> sealed abstract class Expr
> case class Num(i:Int) extends Expr
> abstract case class Op extends Expr
> case object Add extends Op
> case object Mul extends Op
>
> def op(a:Int, b:Int, o:Op) = o match {
> case Add => a + b
> case Mul => a * b
> }
>
> def execute(instructions: Seq[Expr]) = {
> instructions.foldLeft(List[Expr]()){
> (list:List[Expr], e:Expr) => e match {
> case n:Num => n :: list
> case o:Op =>
> list match {
> case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail
> case _ => error("Parse error: " + list)
> }
> }}.head
> }
>
> val q = List(Num(1), Num(2), Add)
> println(execute(q))
>
> Happy meditating!
> Daniel
> _______________________________________________________________
>
> View this message in context: Re: Feedback on my implementation of
> Shunting-yard
>
> Sent from the Scala mailing list archive at Nabble.com.

geoff
Joined: 2008-08-20,
User offline. Last seen 1 year 25 weeks ago.
Re: Re: Feedback on my implementation of Shunting-yard

On Wed, Jan 14, 2009 at 01:39:12PM -0800, Landei said
>
>
>
> jcrites wrote:
> >
> > Hi all,
> >
> > I'm experimenting with programming languages in Scala, learning about
> > their implementation. One of the classic parsing algorithms is the
> > Shunting-yard algorithm [1] for operator precedence parsing. The
> > algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> > expressions like "1 2 3 * +".
> >
> > As I am looking to develop expertise with Scala, I thought I would submit
> > my implementation for feedback.
> >
>
> I'm also still a beginner, but your code looks so terrible... Java-ish :-)
> All these confusing whiles and mutable state. I "feel" this code could be
> made much clearer, but I guess this would take me while. Menwhile probably
> one of the gurus materializes a two-liner inspired by a Zen-Koan (which
> includes Haskell Curry, Buddha and three pound flax)
> To give you at least a tiny bit of concrete advise: If you have
> parameterless case classes (Add, Mul), you can always (?) use case objects
> instead.

While it's certainly longer than two lines, here's a tail-recursive
implementation of shunting yard using pattern matching. I've
interspersed the algorithm steps from the wikipedia article to show how
the algorithm is translated.

object ShuntingYard extends Application {
sealed abstract class Associativity
case object Left extends Associativity
case object Right extends Associativity
case object Non extends Associativity

sealed abstract class Tok(sym: String) {
override def toString = sym
}
case class Num(i: Int) extends Tok(i.toString)
case object LParen extends Tok("(")
case object RParen extends Tok(")")

sealed abstract case class
Op(precedence: Int, associativity: Associativity)(sym: String)
extends Tok(sym)
case object Add extends Op(2, Left)("+")
case object Sub extends Op(2, Left)("-")
case object Mul extends Op(4, Left)("*")
case object Div extends Op(4, Left)("/")
case object Exp extends Op(6, Right)("^")

type Tokens = List[Tok]

def testprec(a: Associativity, p1: Int, p2: Int) = {
/* and either */
a match {
/* o1 is right-associative and its precedence is less than
* that of o2 */
case Right => p1 < p2
/* o1 is associative or left-associative and its precedence
* is less than or equal to that of o2 */
case _ => p1 <= p2
}
}

def shunt(tokens: Tokens, output: Tokens, pending: Tokens): Tokens = {
tokens match {
/* While there are tokens to bre read, read a token */
case t::ts => {
t match {
/* If the token is a number, then add it to the output queue */
case n: Num => { shunt(ts, n::output, pending) }
/* If the token is an operator, o1, then*/
case o1@Op(p1, a) => {
pending match {
/* while there is an operator, o2, at the top of
* the stack and [see testprec] */
case (o2@Op(p2, _))::ps if (testprec(a, p1, p2)) => {
/* pop o2 off the stack onto the output queue */
shunt(tokens, o2::output, ps)
}
/* push o1 onto the stack */
case _ => { shunt(ts, output, o1::pending) }
}
}
/* If the token is a left parenthesis, then push it onto
* the stack */
case LParen => shunt(ts, output, LParen::pending)
/* If the token is a right parenthesis */
case RParen => {
pending match {
/* Pop the left parenthesis from the stack, but not
* onto the output queue */
case LParen::ps => shunt(ts, output, ps)
/* Until the token at the top of the stack is a left
* parenthesis, pop operators off the stack onto the
* output queue */
case p::ps => shunt(tokens, p::output, ps)
/* If the stack runs out without finding a left
* parenthesis, then there are mismatched
* parenthesis */
case Nil => error("mismatched parens")
}
}
}
}
/* When there are no more tokens to read */
case Nil => {
pending match {
/* If the operator token on the top of the stack is a
* parenthesis, then there are mismatched parenthesis */
case (LParen|RParen)::_ => error("mismatched parens")
/* Pop the operator onto the output queue */
case o::os => {
shunt(tokens, o::output, os)
}
/* When there are no operator tokens in the stack */
case Nil => {
/* exit */
output.reverse
}
}
}
}
}

def shunt(tokens: Tok*): List[Tok] = {
shunt(tokens.toList, Nil, Nil)
}

println(shunt(Num(1),Add,Num(2),Mul,Num(3)).mkString(" "))
/* => 1 2 3 * + */
println(shunt(Num(3),Add,Num(4),Mul,Num(2),Div,
LParen,Num(1),Sub,Num(5),RParen,Exp,Num(2),Exp,Num(3)).mkString(" "))
/* => 3 4 2 * 1 5 - 2 3 ^ ^ / + */
}

Landei
Joined: 2008-12-18,
User offline. Last seen 45 weeks 4 days ago.
Re: Re: Feedback on my implementation of Shunting-yard

Geoff Reedy wrote:
>
> While it's certainly longer than two lines, here's a tail-recursive
> implementation of shunting yard using pattern matching. I've
> interspersed the algorithm steps from the wikipedia article to show how
> the algorithm is translated.
>
> object ShuntingYard extends Application {
> sealed abstract class Associativity
> case object Left extends Associativity
> case object Right extends Associativity
> case object Non extends Associativity
>
> sealed abstract class Tok(sym: String) {
> override def toString = sym
> }
> case class Num(i: Int) extends Tok(i.toString)
> case object LParen extends Tok("(")
> case object RParen extends Tok(")")
>
> sealed abstract case class
> Op(precedence: Int, associativity: Associativity)(sym: String)
> extends Tok(sym)
> case object Add extends Op(2, Left)("+")
> case object Sub extends Op(2, Left)("-")
> case object Mul extends Op(4, Left)("*")
> case object Div extends Op(4, Left)("/")
> case object Exp extends Op(6, Right)("^")
>
> type Tokens = List[Tok]
>
> def testprec(a: Associativity, p1: Int, p2: Int) = {
> /* and either */
> a match {
> /* o1 is right-associative and its precedence is less than
> * that of o2 */
> case Right => p1 < p2
> /* o1 is associative or left-associative and its precedence
> * is less than or equal to that of o2 */
> case _ => p1 <= p2
> }
> }
>
> def shunt(tokens: Tokens, output: Tokens, pending: Tokens): Tokens = {
> tokens match {
> /* While there are tokens to bre read, read a token */
> case t::ts => {
> t match {
> /* If the token is a number, then add it to the output queue */
> case n: Num => { shunt(ts, n::output, pending) }
> /* If the token is an operator, o1, then*/
> case o1@Op(p1, a) => {
> pending match {
> /* while there is an operator, o2, at the top of
> * the stack and [see testprec] */
> case (o2@Op(p2, _))::ps if (testprec(a, p1, p2)) => {
> /* pop o2 off the stack onto the output queue */
> shunt(tokens, o2::output, ps)
> }
> /* push o1 onto the stack */
> case _ => { shunt(ts, output, o1::pending) }
> }
> }
> /* If the token is a left parenthesis, then push it onto
> * the stack */
> case LParen => shunt(ts, output, LParen::pending)
> /* If the token is a right parenthesis */
> case RParen => {
> pending match {
> /* Pop the left parenthesis from the stack, but not
> * onto the output queue */
> case LParen::ps => shunt(ts, output, ps)
> /* Until the token at the top of the stack is a left
> * parenthesis, pop operators off the stack onto the
> * output queue */
> case p::ps => shunt(tokens, p::output, ps)
> /* If the stack runs out without finding a left
> * parenthesis, then there are mismatched
> * parenthesis */
> case Nil => error("mismatched parens")
> }
> }
> }
> }
> /* When there are no more tokens to read */
> case Nil => {
> pending match {
> /* If the operator token on the top of the stack is a
> * parenthesis, then there are mismatched parenthesis */
> case (LParen|RParen)::_ => error("mismatched parens")
> /* Pop the operator onto the output queue */
> case o::os => {
> shunt(tokens, o::output, os)
> }
> /* When there are no operator tokens in the stack */
> case Nil => {
> /* exit */
> output.reverse
> }
> }
> }
> }
> }
>
> def shunt(tokens: Tok*): List[Tok] = {
> shunt(tokens.toList, Nil, Nil)
> }
>
> println(shunt(Num(1),Add,Num(2),Mul,Num(3)).mkString(" "))
> /* => 1 2 3 * + */
> println(shunt(Num(3),Add,Num(4),Mul,Num(2),Div,
> LParen,Num(1),Sub,Num(5),RParen,Exp,Num(2),Exp,Num(3)).mkString(" "))
> /* => 3 4 2 * 1 5 - 2 3 ^ ^ / + */
> }
>
>
Of course the solution above is better (e.g. has more features), but my
version could be still interesting for Justin, as I kept his structure and
just tried to "translate" from imperative to functional style.

object ShuntingYard {

sealed abstract class Expr
case class Num(i:Int) extends Expr
abstract case class Op extends Expr
case object Add extends Op
case object Mul extends Op

def op(a:Int, b:Int, o:Op) = o match {
case Add => a + b
case Mul => a * b
}

def prec(e:Op) = e match {
case Add => 2
case Mul => 4
}

def execute(instructions: Seq[Expr]) =
(List[Expr]() /: instructions){
(list:List[Expr], e:Expr) => e match {
case n:Num => n :: list
case o:Op =>
list match {
case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) ::
tail
case _ => error("Parse error: " + list)
}
}}.head

def reorder(instructions: Seq[Expr]) = {

def cycle(o: Op, output: List[Expr], stack:List[Expr]):(List[Expr],
List[Expr]) =
if (stack.isEmpty)
(output, o :: stack)
else stack.head match {
case oper:Op if prec(o) <= prec(oper) => cycle(o, oper ::
output, stack.tail)
case _ => (output, o :: stack)
}

val (output, stack) = ((List[Expr](),List[Expr]()) /: instructions){
(pair:(List[Expr],List[Expr]), expr:Expr) =>
expr match {
case n:Num => (n :: pair._1, pair._2)
case o:Op => cycle(o, pair._1, pair._2)
}
}
output.reverse ::: stack
}

def main(args:Array[String]) {
val q = List(Num(1), Num(2), Add)
println(execute(q))
val q2 = List(Num(3),Add,Num(4),Mul,Num(5))
println(q2)
val q3 = reorder(q2)
println(q3)
println(execute(q3))

}
}

Cheers!
Daniel

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland