- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
Feedback on my implementation of Shunting-yard
Wed, 2009-01-14, 21:28
Hi all,
I'm experimenting with programming languages in Scala, learning about their
implementation. One of the classic parsing algorithms is the Shunting-yard
algorithm [1] for operator precedence parsing. The algorithm transforms
infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 *
+".
As I am looking to develop expertise with Scala, I thought I would submit my
implementation for feedback.
First, an expression type:
sealed abstract class Expr
case class Num(i:Int) extends Expr
case class Add extends Expr
case class Mul extends Expr
Expressions are represented by Queue[Expr]. The Queue and Stack in this
code are mutable. An execution algorithm:
def execute(instructions: Queue[Expr]) = {
val stack = new Stack[Expr]
while (!instructions.isEmpty) {
def runInstruction(e:Expr) {
e match {
case n:Num => stack.push(n)
case a:Add => {
val op1 = stack.pop match { case Num(i) => i }
val op2 = stack.pop match { case Num(i) => i }
stack.push(Num(op1 + op2))
}
}
}
val i = instructions.dequeue()
runInstruction(i)
}
stack.pop()
}
We can use the execute code like this:
val q = new Queue[Expr]
q.enqueue(Num(1))
q.enqueue(Num(2))
q.enqueue(Add())
println(execute(q))
This will print "Num(3)". Now, the Shunting-yard algorithm. First we need
a function to map operators to their predecence. Higher number represents
higher (tighter) binding precedence.
def precedence(e:Expr) = e match {
case x:Add => 2
case x:Mul => 4
}
def isOperator(e:Expr) = e match {
case n:Num => false
case _ => true
}
Finally, the reordering algorithm:
def reorder(instructions: Queue[Expr]) = {
val output = new Queue[Expr]
val stack = new Stack[Expr]
while (!instructions.isEmpty) {
val tok = instructions.dequeue()
tok match {
case n:Num => {
output.enqueue(n)
}
case _ => {
while (!stack.isEmpty && isOperator(stack.top) &&
precedence(tok) <= precedence(stack.top)) {
output.enqueue(stack.pop())
}
stack.push(tok)
}
}
}
while (!stack.isEmpty) {
output.enqueue(stack.pop())
}
output
}
One can use the code like this:
val q2 = new Queue[Expr]
q2.enqueue(Num(3))
q2.enqueue(Add())
q2.enqueue(Num(4))
q2.enqueue(Mul())
q2.enqueue(Num(5))
println(q2)
println(reorder(q2))
The input expression (infix form) prints as "Queue(Num(3), Add(), Num(4),
Mul(), Num(5))". After Shunting-yard reordering, we get "Queue(Num(3),
Num(4), Num(5), Mul(), Add())", showing that multiplication is tighter than
addition.
If I change the precedence to:
def precedence(e:Expr) = e match {
case x:Add => 2
case x:Mul => 0
}
Then addition binds tighter, yielding the result "Queue(Num(3), Num(4),
Add(), Num(5), Mul())".
Feedback / comments appreciated.
What are the disadvantages of operator precedence parsing with
Shunting-yard? It seems like a very flexible and fast algorithm with many
advantages.
[1] http://en.wikipedia.org/wiki/Shunting_yard_algorithm
--
Justin Crites
Thu, 2009-01-15, 00:17
#2
Re: Feedback on my implementation of Shunting-yard
Landei wrote:That's what I got so far - not exactly beautiful, but a little bit more functional:jcrites wrote: Hi all, I'm experimenting with programming languages in Scala, learning about their implementation. One of the classic parsing algorithms is the Shunting-yard algorithm [1] for operator precedence parsing. The algorithm transforms infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 * +". As I am looking to develop expertise with Scala, I thought I would submit my implementation for feedback.I'm also still a beginner, but your code looks so terrible... Java-ish :-) All these confusing whiles and mutable state. I "feel" this code could be made much clearer, but I guess this would take me while. Menwhile probably one of the gurus materializes a two-liner inspired by a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax) To give you at least a tiny bit of concrete advise: If you have parameterless case classes (Add, Mul), you can always (?) use case objects instead. Take care! Daniel
sealed abstract class Expr case class Num(i:Int) extends Expr abstract case class Op extends Expr case object Add extends Op case object Mul extends Op def op(a:Int, b:Int, o:Op) = o match { case Add => a + b case Mul => a * b } def execute(instructions: Seq[Expr]) = { instructions.foldLeft(List[Expr]()){ (list:List[Expr], e:Expr) => e match { case n:Num => n :: list case o:Op => list match { case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail case _ => error("Parse error: " + list) } }}.head } val q = List(Num(1), Num(2), Add) println(execute(q))
Happy meditating!
Daniel
View this message in context: Re: Feedback on my implementation of Shunting-yard
Sent from the Scala mailing list archive at Nabble.com.
Thu, 2009-01-15, 01:47
#3
Re: Re: Feedback on my implementation of Shunting-yard
If you have the book Programming in Scala, take a look at the Combinator Parsing chapter. It has an example of parsing arithmetic expressions, and it does precedence quite simply by nesting parse rules.
E.g., in the expression 3 * 2 + 4 * 6, we have:
term + term
In each term, 3 * 2, we have factor * factor.
So a term is a number of factor * factor [* factor..] together and an expr is a number of term + term [+ term..] together. I find that quite elegant.
The relevant three lines of code (but you'll need the rest of the chapter or a similar resource to make use of them):
def expr: Parser[Any] = term~rep("+"~term | ""~
term)
def term: Parser[Any] = factor~rep("*"~factor | "/"~factor)
def factor: Parser[Any] = floatingPointNumber | "("~expr~")"
You might want to see my recent thread about reverse pattern matching on Lists, for some ideas on how pattern matching can help. The code I posted in that thread implements a toy postfix interpreter, and as I recall, with no vars.
2009/1/14 Landei <Daniel.Gronau@gmx.de>
E.g., in the expression 3 * 2 + 4 * 6, we have:
term + term
In each term, 3 * 2, we have factor * factor.
So a term is a number of factor * factor [* factor..] together and an expr is a number of term + term [+ term..] together. I find that quite elegant.
The relevant three lines of code (but you'll need the rest of the chapter or a similar resource to make use of them):
def expr: Parser[Any] = term~rep("+"~term | ""~
term)
def term: Parser[Any] = factor~rep("*"~factor | "/"~factor)
def factor: Parser[Any] = floatingPointNumber | "("~expr~")"
You might want to see my recent thread about reverse pattern matching on Lists, for some ideas on how pattern matching can help. The code I posted in that thread implements a toy postfix interpreter, and as I recall, with no vars.
2009/1/14 Landei <Daniel.Gronau@gmx.de>
Landei wrote:That's what I got so far - not exactly beautiful, but a little bit more functional:jcrites wrote: Hi all, I'm experimenting with programming languages in Scala, learning about their implementation. One of the classic parsing algorithms is the Shunting-yard algorithm [1] for operator precedence parsing. The algorithm transforms infix expressions like "1 + 2 * 3" into postfix expressions like "1 2 3 * +". As I am looking to develop expertise with Scala, I thought I would submit my implementation for feedback.I'm also still a beginner, but your code looks so terrible... Java-ish :-) All these confusing whiles and mutable state. I "feel" this code could be made much clearer, but I guess this would take me while. Menwhile probably one of the gurus materializes a two-liner inspired by a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax) To give you at least a tiny bit of concrete advise: If you have parameterless case classes (Add, Mul), you can always (?) use case objects instead. Take care! Danielsealed abstract class Expr case class Num(i:Int) extends Expr abstract case class Op extends Expr case object Add extends Op case object Mul extends Op def op(a:Int, b:Int, o:Op) = o match { case Add => a + b case Mul => a * b } def execute(instructions: Seq[Expr]) = { instructions.foldLeft(List[Expr]()){ (list:List[Expr], e:Expr) => e match { case n:Num => n :: list case o:Op => list match { case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail case _ => error("Parse error: " + list) } }}.head } val q = List(Num(1), Num(2), Add) println(execute(q))
Happy meditating!
Daniel
View this message in context: Re: Feedback on my implementation of Shunting-yard
Sent from the Scala mailing list archive at Nabble.com.
Thu, 2009-01-15, 02:57
#4
Re: Re: Feedback on my implementation of Shunting-yard
It's also possible to factor out the precedence level when using parser
combinators, as I describe in the "Precdence Revisited" section of my
blog entry about Scala Parser Combinators:
Thhat blog entry also includes a bunch of references at the end to
other blog entries and articles abouut parser combinators (mostly Scala).
--
Jim
On Thu, Jan 15, 2009 at 12:36:08AM +0000, Ricky Clarkson wrote:
> Date: Thu, 15 Jan 2009 00:36:08 +0000
> From: Ricky Clarkson
> To: Landei
> Cc: scala@listes.epfl.ch
> Subject: Re: [scala] Re: Feedback on my implementation of Shunting-yard
>
> If you have the book Programming in Scala, take a look at the
> Combinator Parsing chapter. It has an example of parsing arithmetic
> expressions, and it does precedence quite simply by nesting parse
> rules.
> E.g., in the expression 3 * 2 + 4 * 6, we have:
> term + term
> In each term, 3 * 2, we have factor * factor.
> So a term is a number of factor * factor [* factor..] together and an
> expr is a number of term + term [+ term..] together. I find that quite
> elegant.
> The relevant three lines of code (but you'll need the rest of the
> chapter or a similar resource to make use of them):
> def expr: Parser[Any] = term~rep("+"~term | ""~
> term)
> def term: Parser[Any] = factor~rep("*"~factor | "/"~factor)
> def factor: Parser[Any] = floatingPointNumber | "("~expr~")"
> You might want to see my recent thread about reverse pattern matching
> on Lists, for some ideas on how pattern matching can help. The code I
> posted in that thread implements a toy postfix interpreter, and as I
> recall, with no vars.
>
> 2009/1/14 Landei
>
> Landei wrote:
>
> jcrites wrote:
> Hi all, I'm experimenting with programming languages in Scala, learning
> about their implementation. One of the classic parsing algorithms is
> the Shunting-yard algorithm [1] for operator precedence parsing. The
> algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> expressions like "1 2 3 * +". As I am looking to develop expertise with
> Scala, I thought I would submit my implementation for feedback.
>
> I'm also still a beginner, but your code looks so terrible... Java-ish
> :-) All these confusing whiles and mutable state. I "feel" this code
> could be made much clearer, but I guess this would take me while.
> Menwhile probably one of the gurus materializes a two-liner inspired by
> a Zen-Koan (which includes Haskell Curry, Buddha and three pound flax)
> To give you at least a tiny bit of concrete advise: If you have
> parameterless case classes (Add, Mul), you can always (?) use case
> objects instead. Take care! Daniel
>
> That's what I got so far - not exactly beautiful, but a little bit
> more functional:
> sealed abstract class Expr
> case class Num(i:Int) extends Expr
> abstract case class Op extends Expr
> case object Add extends Op
> case object Mul extends Op
>
> def op(a:Int, b:Int, o:Op) = o match {
> case Add => a + b
> case Mul => a * b
> }
>
> def execute(instructions: Seq[Expr]) = {
> instructions.foldLeft(List[Expr]()){
> (list:List[Expr], e:Expr) => e match {
> case n:Num => n :: list
> case o:Op =>
> list match {
> case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) :: tail
> case _ => error("Parse error: " + list)
> }
> }}.head
> }
>
> val q = List(Num(1), Num(2), Add)
> println(execute(q))
>
> Happy meditating!
> Daniel
> _______________________________________________________________
>
> View this message in context: Re: Feedback on my implementation of
> Shunting-yard
>
> Sent from the Scala mailing list archive at Nabble.com.
Thu, 2009-01-15, 16:47
#5
Re: Re: Feedback on my implementation of Shunting-yard
On Wed, Jan 14, 2009 at 01:39:12PM -0800, Landei said
>
>
>
> jcrites wrote:
> >
> > Hi all,
> >
> > I'm experimenting with programming languages in Scala, learning about
> > their implementation. One of the classic parsing algorithms is the
> > Shunting-yard algorithm [1] for operator precedence parsing. The
> > algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> > expressions like "1 2 3 * +".
> >
> > As I am looking to develop expertise with Scala, I thought I would submit
> > my implementation for feedback.
> >
>
> I'm also still a beginner, but your code looks so terrible... Java-ish :-)
> All these confusing whiles and mutable state. I "feel" this code could be
> made much clearer, but I guess this would take me while. Menwhile probably
> one of the gurus materializes a two-liner inspired by a Zen-Koan (which
> includes Haskell Curry, Buddha and three pound flax)
> To give you at least a tiny bit of concrete advise: If you have
> parameterless case classes (Add, Mul), you can always (?) use case objects
> instead.
While it's certainly longer than two lines, here's a tail-recursive
implementation of shunting yard using pattern matching. I've
interspersed the algorithm steps from the wikipedia article to show how
the algorithm is translated.
object ShuntingYard extends Application {
sealed abstract class Associativity
case object Left extends Associativity
case object Right extends Associativity
case object Non extends Associativity
sealed abstract class Tok(sym: String) {
override def toString = sym
}
case class Num(i: Int) extends Tok(i.toString)
case object LParen extends Tok("(")
case object RParen extends Tok(")")
sealed abstract case class
Op(precedence: Int, associativity: Associativity)(sym: String)
extends Tok(sym)
case object Add extends Op(2, Left)("+")
case object Sub extends Op(2, Left)("-")
case object Mul extends Op(4, Left)("*")
case object Div extends Op(4, Left)("/")
case object Exp extends Op(6, Right)("^")
type Tokens = List[Tok]
def testprec(a: Associativity, p1: Int, p2: Int) = {
/* and either */
a match {
/* o1 is right-associative and its precedence is less than
* that of o2 */
case Right => p1 < p2
/* o1 is associative or left-associative and its precedence
* is less than or equal to that of o2 */
case _ => p1 <= p2
}
}
def shunt(tokens: Tokens, output: Tokens, pending: Tokens): Tokens = {
tokens match {
/* While there are tokens to bre read, read a token */
case t::ts => {
t match {
/* If the token is a number, then add it to the output queue */
case n: Num => { shunt(ts, n::output, pending) }
/* If the token is an operator, o1, then*/
case o1@Op(p1, a) => {
pending match {
/* while there is an operator, o2, at the top of
* the stack and [see testprec] */
case (o2@Op(p2, _))::ps if (testprec(a, p1, p2)) => {
/* pop o2 off the stack onto the output queue */
shunt(tokens, o2::output, ps)
}
/* push o1 onto the stack */
case _ => { shunt(ts, output, o1::pending) }
}
}
/* If the token is a left parenthesis, then push it onto
* the stack */
case LParen => shunt(ts, output, LParen::pending)
/* If the token is a right parenthesis */
case RParen => {
pending match {
/* Pop the left parenthesis from the stack, but not
* onto the output queue */
case LParen::ps => shunt(ts, output, ps)
/* Until the token at the top of the stack is a left
* parenthesis, pop operators off the stack onto the
* output queue */
case p::ps => shunt(tokens, p::output, ps)
/* If the stack runs out without finding a left
* parenthesis, then there are mismatched
* parenthesis */
case Nil => error("mismatched parens")
}
}
}
}
/* When there are no more tokens to read */
case Nil => {
pending match {
/* If the operator token on the top of the stack is a
* parenthesis, then there are mismatched parenthesis */
case (LParen|RParen)::_ => error("mismatched parens")
/* Pop the operator onto the output queue */
case o::os => {
shunt(tokens, o::output, os)
}
/* When there are no operator tokens in the stack */
case Nil => {
/* exit */
output.reverse
}
}
}
}
}
def shunt(tokens: Tok*): List[Tok] = {
shunt(tokens.toList, Nil, Nil)
}
println(shunt(Num(1),Add,Num(2),Mul,Num(3)).mkString(" "))
/* => 1 2 3 * + */
println(shunt(Num(3),Add,Num(4),Mul,Num(2),Div,
LParen,Num(1),Sub,Num(5),RParen,Exp,Num(2),Exp,Num(3)).mkString(" "))
/* => 3 4 2 * 1 5 - 2 3 ^ ^ / + */
}
Thu, 2009-01-15, 23:27
#6
Re: Re: Feedback on my implementation of Shunting-yard
Geoff Reedy wrote:
>
> While it's certainly longer than two lines, here's a tail-recursive
> implementation of shunting yard using pattern matching. I've
> interspersed the algorithm steps from the wikipedia article to show how
> the algorithm is translated.
>
> object ShuntingYard extends Application {
> sealed abstract class Associativity
> case object Left extends Associativity
> case object Right extends Associativity
> case object Non extends Associativity
>
> sealed abstract class Tok(sym: String) {
> override def toString = sym
> }
> case class Num(i: Int) extends Tok(i.toString)
> case object LParen extends Tok("(")
> case object RParen extends Tok(")")
>
> sealed abstract case class
> Op(precedence: Int, associativity: Associativity)(sym: String)
> extends Tok(sym)
> case object Add extends Op(2, Left)("+")
> case object Sub extends Op(2, Left)("-")
> case object Mul extends Op(4, Left)("*")
> case object Div extends Op(4, Left)("/")
> case object Exp extends Op(6, Right)("^")
>
> type Tokens = List[Tok]
>
> def testprec(a: Associativity, p1: Int, p2: Int) = {
> /* and either */
> a match {
> /* o1 is right-associative and its precedence is less than
> * that of o2 */
> case Right => p1 < p2
> /* o1 is associative or left-associative and its precedence
> * is less than or equal to that of o2 */
> case _ => p1 <= p2
> }
> }
>
> def shunt(tokens: Tokens, output: Tokens, pending: Tokens): Tokens = {
> tokens match {
> /* While there are tokens to bre read, read a token */
> case t::ts => {
> t match {
> /* If the token is a number, then add it to the output queue */
> case n: Num => { shunt(ts, n::output, pending) }
> /* If the token is an operator, o1, then*/
> case o1@Op(p1, a) => {
> pending match {
> /* while there is an operator, o2, at the top of
> * the stack and [see testprec] */
> case (o2@Op(p2, _))::ps if (testprec(a, p1, p2)) => {
> /* pop o2 off the stack onto the output queue */
> shunt(tokens, o2::output, ps)
> }
> /* push o1 onto the stack */
> case _ => { shunt(ts, output, o1::pending) }
> }
> }
> /* If the token is a left parenthesis, then push it onto
> * the stack */
> case LParen => shunt(ts, output, LParen::pending)
> /* If the token is a right parenthesis */
> case RParen => {
> pending match {
> /* Pop the left parenthesis from the stack, but not
> * onto the output queue */
> case LParen::ps => shunt(ts, output, ps)
> /* Until the token at the top of the stack is a left
> * parenthesis, pop operators off the stack onto the
> * output queue */
> case p::ps => shunt(tokens, p::output, ps)
> /* If the stack runs out without finding a left
> * parenthesis, then there are mismatched
> * parenthesis */
> case Nil => error("mismatched parens")
> }
> }
> }
> }
> /* When there are no more tokens to read */
> case Nil => {
> pending match {
> /* If the operator token on the top of the stack is a
> * parenthesis, then there are mismatched parenthesis */
> case (LParen|RParen)::_ => error("mismatched parens")
> /* Pop the operator onto the output queue */
> case o::os => {
> shunt(tokens, o::output, os)
> }
> /* When there are no operator tokens in the stack */
> case Nil => {
> /* exit */
> output.reverse
> }
> }
> }
> }
> }
>
> def shunt(tokens: Tok*): List[Tok] = {
> shunt(tokens.toList, Nil, Nil)
> }
>
> println(shunt(Num(1),Add,Num(2),Mul,Num(3)).mkString(" "))
> /* => 1 2 3 * + */
> println(shunt(Num(3),Add,Num(4),Mul,Num(2),Div,
> LParen,Num(1),Sub,Num(5),RParen,Exp,Num(2),Exp,Num(3)).mkString(" "))
> /* => 3 4 2 * 1 5 - 2 3 ^ ^ / + */
> }
>
>
Of course the solution above is better (e.g. has more features), but my
version could be still interesting for Justin, as I kept his structure and
just tried to "translate" from imperative to functional style.
object ShuntingYard {
sealed abstract class Expr
case class Num(i:Int) extends Expr
abstract case class Op extends Expr
case object Add extends Op
case object Mul extends Op
def op(a:Int, b:Int, o:Op) = o match {
case Add => a + b
case Mul => a * b
}
def prec(e:Op) = e match {
case Add => 2
case Mul => 4
}
def execute(instructions: Seq[Expr]) =
(List[Expr]() /: instructions){
(list:List[Expr], e:Expr) => e match {
case n:Num => n :: list
case o:Op =>
list match {
case Num(i) :: Num(j) :: tail => Num(op(i,j,o)) ::
tail
case _ => error("Parse error: " + list)
}
}}.head
def reorder(instructions: Seq[Expr]) = {
def cycle(o: Op, output: List[Expr], stack:List[Expr]):(List[Expr],
List[Expr]) =
if (stack.isEmpty)
(output, o :: stack)
else stack.head match {
case oper:Op if prec(o) <= prec(oper) => cycle(o, oper ::
output, stack.tail)
case _ => (output, o :: stack)
}
val (output, stack) = ((List[Expr](),List[Expr]()) /: instructions){
(pair:(List[Expr],List[Expr]), expr:Expr) =>
expr match {
case n:Num => (n :: pair._1, pair._2)
case o:Op => cycle(o, pair._1, pair._2)
}
}
output.reverse ::: stack
}
def main(args:Array[String]) {
val q = List(Num(1), Num(2), Add)
println(execute(q))
val q2 = List(Num(3),Add,Num(4),Mul,Num(5))
println(q2)
val q3 = reorder(q2)
println(q3)
println(execute(q3))
}
}
Cheers!
Daniel
jcrites wrote:
>
> Hi all,
>
> I'm experimenting with programming languages in Scala, learning about
> their implementation. One of the classic parsing algorithms is the
> Shunting-yard algorithm [1] for operator precedence parsing. The
> algorithm transforms infix expressions like "1 + 2 * 3" into postfix
> expressions like "1 2 3 * +".
>
> As I am looking to develop expertise with Scala, I thought I would submit
> my implementation for feedback.
>
I'm also still a beginner, but your code looks so terrible... Java-ish :-)
All these confusing whiles and mutable state. I "feel" this code could be
made much clearer, but I guess this would take me while. Menwhile probably
one of the gurus materializes a two-liner inspired by a Zen-Koan (which
includes Haskell Curry, Buddha and three pound flax)
To give you at least a tiny bit of concrete advise: If you have
parameterless case classes (Add, Mul), you can always (?) use case objects
instead.
Take care!
Daniel