Benchmarking pattern matching

Tue, 2011-05-31, 02:07

#1

Joshua.Suereth

Joined: 2008-09-02,

Re: Benchmarking pattern matching

did you use the @switch annotation to ensure the optimisation is applied? Are all the values static? There are many ways to misstep the switch.optimisation.

On May 30, 2011 8:20 PM, "Matthew Roberts" <alt.mattr@gmail.com> wrote:
> Scala users,
>
> In pursuit of a fairly esoteric concept (a better understanding of how different compilers translate pattern matching expressions) I have discovered that Scala is unable to optimise pattern matches into switch statements/jump tables/table lookups as many functional languages can. This means pattern matching can be slower than one might expect or hope. I have written up my thoughts and would love to have them torn to shreds by people who know better :)
>
> http://mattr.net.au/articles/pattern_benchmarks/
>
> I am not arguing that Scala's pattern matching is worse than the equivalent in other languages. After all, Scala's pattern matching can do things that no other language can, so we might expect some performance penalty. However, I think it is worth knowing where and how the penalty is paid and even ways we can avoid it.
>
> Matt Roberts

Tue, 2011-05-31, 02:37

#2

Joshua.Suereth

Joined: 2008-09-02,

Re: Benchmarking pattern matching

i think the big mistake here is not marking the abstract num class as sealed. in scala, inheritance is 'open' and addtional "switch ruining" classes could be generated in other source files. unlike haskell's pattern matching.

Scala's switch optimisation should show up with examples after a good google search.

On May 30, 2011 9:00 PM, "Josh Suereth" <joshua.suereth@gmail.com> wrote:
> did you use the @switch annotation to ensure the optimisation is applied?
> Are all the values static? There are many ways to misstep the
> switch.optimisation.
> On May 30, 2011 8:20 PM, "Matthew Roberts" <alt.mattr@gmail.com> wrote:
>> Scala users,
>>
>> In pursuit of a fairly esoteric concept (a better understanding of how
> different compilers translate pattern matching expressions) I have
> discovered that Scala is unable to optimise pattern matches into switch
> statements/jump tables/table lookups as many functional languages can. This
> means pattern matching can be slower than one might expect or hope. I have
> written up my thoughts and would love to have them torn to shreds by people
> who know better :)
>>
>> http://mattr.net.au/articles/pattern_benchmarks/
>>
>> I am not arguing that Scala's pattern matching is worse than the
> equivalent in other languages. After all, Scala's pattern matching can do
> things that no other language can, so we might expect some performance
> penalty. However, I think it is worth knowing where and how the penalty is
> paid and even ways we can avoid it.
>>
>> Matt Roberts

Tue, 2011-05-31, 02:57

#3

Matthew Roberts

Joined: 2011-05-31,

Re: Benchmarking pattern matching

I can't get sealing to make a difference, and this
https://issues.scala-lang.org/browse/SI-3375
would suggest it wont. I didn't know about the @switch annotation, which would have made my testing a bit easier :) I still think it is useful to see the results of not achieving a switch though.
Matt
On 31/05/2011, at 11:35 AM, Josh Suereth wrote:

i think the big mistake here is not marking the abstract num class as sealed. in scala, inheritance is 'open' and addtional "switch ruining" classes could be generated in other source files. unlike haskell's pattern matching.
Scala's switch optimisation should show up with examples after a good google search.
On May 30, 2011 9:00 PM, "Josh Suereth" <joshua.suereth@gmail.com> wrote:
> did you use the @switch annotation to ensure the optimisation is applied?
> Are all the values static? There are many ways to misstep the
> switch.optimisation.
> On May 30, 2011 8:20 PM, "Matthew Roberts" <alt.mattr@gmail.com> wrote:
>> Scala users,
>>
>> In pursuit of a fairly esoteric concept (a better understanding of how
> different compilers translate pattern matching expressions) I have
> discovered that Scala is unable to optimise pattern matches into switch
> statements/jump tables/table lookups as many functional languages can. This
> means pattern matching can be slower than one might expect or hope. I have
> written up my thoughts and would love to have them torn to shreds by people
> who know better :)
>>
>> http://mattr.net.au/articles/pattern_benchmarks/
>>
>> I am not arguing that Scala's pattern matching is worse than the
> equivalent in other languages. After all, Scala's pattern matching can do
> things that no other language can, so we might expect some performance
> penalty. However, I think it is worth knowing where and how the penalty is
> paid and even ways we can avoid it.
>>
>> Matt Roberts

Tue, 2011-05-31, 07:57

#4

extempore

Joined: 2008-12-17,

Benchmarking pattern matching

[Moved to scala-internals.]

On 5/30/11 6:35 PM, Josh Suereth wrote:
> i think the big mistake here is not marking the abstract num class as
> sealed. in scala, inheritance is 'open' and addtional "switch ruining"
> classes could be generated in other source files. unlike haskell's
> pattern matching.

It won't make any difference to seal it. sealed is only used in exhaustiveness checking.

http://mattr.net.au/articles/pattern_benchmarks

"Finally, both languages compile to another compiled language (C in the case of Haskell and Java in the case of Scala) which might then translate-away any differences."

No, scala doesn't compile to java. Much as I'd like to have someone to blame for translating away the good parts. This is kind of important because without examining the bytecode you can't have much idea what you're comparing. The box is a little too black.

For instance here are the last twelve lines I have in B-10-scala.out when I run your code.

int-8: (8000) 724 milli secs
num-8: (8000) 707 milli secs
int-9: (9000) 706 milli secs
num-9: (9000) 755 milli secs
int-10: (10000) 715 milli secs
num-10: (10000) 765 milli secs
alt int-1: (1000) 658 milli secs
alt num-1: (1000) 555 milli secs
alt int-5: (5000) 664 milli secs
alt num-5: (5000) 636 milli secs
alt int-10: (10000) 710 milli secs
alt num-10: (10000) 768 milli secs

Here are the lines in the same file after I alter the code to measure fewer extraneous things, like the cost of moving a local variable to the heap and then manipulating it from unnecessary closures.

int-8: (8000) 317 milli secs
num-8: (8000) 588 milli secs
int-9: (9000) 290 milli secs
num-9: (9000) 628 milli secs
int-10: (10000) 337 milli secs
num-10: (10000) 659 milli secs
alt int-1: (1000) 194 milli secs
alt num-1: (1000) 404 milli secs
alt int-5: (5000) 244 milli secs
alt num-5: (5000) 503 milli secs
alt int-10: (10000) 306 milli secs
alt num-10: (10000) 670 milli secs

Scala pattern matches are indeed too slow, but your analysis cart is way ahead of your horse.

Here's a sample of the bytecode from my version.

67: instanceof #42; //class Four
70: ifeq 80
73: iload_3
74: iconst_4
75: iadd
76: istore_3
77: goto 186
80: aload 5
82: instanceof #44; //class Five
85: ifeq 95
88: iload_3
89: iconst_5
90: iadd
91: istore_3
92: goto 186
95: aload 5

Here's the comparable sample from yours.

86: instanceof #44; //class BenchmarkB$Four
89: ifeq 111
92: aload_0
93: getfield #33; //Field run$2:Lscala/runtime/IntRef;
96: aload_0
97: getfield #33; //Field run$2:Lscala/runtime/IntRef;
100: getfield #38; //Field scala/runtime/IntRef.elem:I
103: iconst_4
104: iadd
105: putfield #38; //Field scala/runtime/IntRef.elem:I
108: goto 307
111: aload_2
112: instanceof #46; //class BenchmarkB$Five
115: ifeq 137
118: aload_0
119: getfield #33; //Field run$2:Lscala/runtime/IntRef;
122: aload_0
123: getfield #33; //Field run$2:Lscala/runtime/IntRef;
126: getfield #38; //Field scala/runtime/IntRef.elem:I
129: iconst_5
130: iadd
131: putfield #38; //Field scala/runtime/IntRef.elem:I
134: goto 307
137: aload_2

Here's a diff for the "less extraneous stuff" source code. This should not be interpreted as a comprehensive assessment.

diff --git a/BenchmarkB.scala b/BenchmarkB.scala
index 3037957..b20c4cd 100644

Tue, 2011-05-31, 09:07

#5

Matthew Roberts

Joined: 2011-05-31,

Re: Benchmarking pattern matching

>
> No, scala doesn't compile to java. Much as I'd like to have someone to blame for translating away the good parts.

You are right, I should have said Java byte code. I believe quite a lot can happen between the byte code and what actually gets run though
>
> Here are the lines in the same file after I alter the code to measure fewer extraneous things, like the cost of moving a local variable to the heap and then manipulating it from unnecessary closures.
>

I left in the extraneous closure because the Haskell version used recursion, it seemed like a closer match.

> int-8: (8000) 317 milli secs
> num-8: (8000) 588 milli secs
> int-9: (9000) 290 milli secs
> num-9: (9000) 628 milli secs
> int-10: (10000) 337 milli secs
> num-10: (10000) 659 milli secs
> alt int-1: (1000) 194 milli secs
> alt num-1: (1000) 404 milli secs
> alt int-5: (5000) 244 milli secs
> alt num-5: (5000) 503 milli secs
> alt int-10: (10000) 306 milli secs
> alt num-10: (10000) 670 milli secs
>
>
> Scala pattern matches are indeed too slow, but your analysis cart is way ahead of your horse.
>
>

Are you suggesting the analysis is wrong? I am keen to hear any improvements, but I am not sure what you mean. Please note that I am not trying to find out if pattern matches are generally slow or fast, but whether the compiled byte code (and the perhaps later optimised thing running in the JVM) still uses drop-down semantics, even for simple algebraic data types.

Matt

Tue, 2011-05-31, 09:37

#6

extempore

Joined: 2008-12-17,

Re: Re: Benchmarking pattern matching

On 5/31/11 12:48 AM, Matthew Roberts wrote:
> Are you suggesting the analysis is wrong?

I am suggesting it lacks meaning.

> I am keen to hear any improvements, but I am not sure what you mean.
> Please note that I am not trying to find out if pattern matches are
> generally slow or fast, but whether the compiled byte code (and the
> perhaps later optimised thing running in the JVM) still uses
> drop-down semantics, even for simple algebraic data types.

If hotspot does something clever, you will not likely witness it with
all the extra indirection warding it off. If you want to know if scalac
does, you don't need to smash particles in the cyclotron to find out.
You can just look. It doesn't.

(But it used to. See previous discussions of $tag.)

Tue, 2011-05-31, 10:37

#7

Ruediger Keller 2

Joined: 2010-04-30,

Re: Benchmarking pattern matching

Hi,

well you made an interesting investigation, although I think there's
room for improvement. For one, it seems you didn't follow the basic
rules for micro benchmarking on the JVM. See the following
Stackoverflow question for more information. Note that this applies
equally to all languages running on the JVM, not only to Java:

http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro...

Also, I think your choice of words is rather poor. Compiled vs.
interpreted pattern matches is quite misleading IMHO. Switch vs.
nested-if semantics would be a more accurate and less misleading
choice, I think.

Also, I think you can determine that Scala is not compiling this kind
of matches to switch statements without doing any benchmarking. The
JVM only supports switching on primitive and boxed integral types and
enum types, so Scala cannot compile the pattern match to a switch if
this isn't the case. It seems in the past case classes had a "tag",
which was used for this, but this is no longer the case with more
recent versions of Scala.

For more information regarding switch statements and control flow in the JVM:

http://www.artima.com/underthehood/flowP.html
http://download.oracle.com/javase/tutorial/java/nutsandbolts/switch.html
http://java.sun.com/docs/books/jvms/second_edition/html/Instructions2.do...
http://java.sun.com/docs/books/jvms/second_edition/html/Instructions2.do...

Regards,
Rüdiger

2011/5/31 Matthew Roberts :
> Scala users,
>
> In pursuit of a fairly esoteric concept (a better understanding of how
> different compilers translate pattern matching expressions) I have
> discovered that Scala is unable to optimise pattern matches into switch
> statements/jump tables/table lookups as many functional languages can. This
> means pattern matching can be slower than one might expect or hope. I have
> written up my thoughts and would love to have them torn to shreds by people
> who know better :)
>
> http://mattr.net.au/articles/pattern_benchmarks/
>
> I am not arguing that Scala's pattern matching is worse than the equivalent
> in other languages. After all, Scala's pattern matching can do things that
> no other language can, so we might expect some performance penalty. However,
> I think it is worth knowing where and how the penalty is paid and even ways
> we can avoid it.
>
> Matt Roberts

Tue, 2011-06-07, 06:17

#8

Matthew Roberts

Joined: 2011-05-31,

Re: Benchmarking pattern matching

Specific comments in response to Ruediger below but I want to thank everyone involved, I know a lot more now than I did at the start of the conversion :)

Matt

On 31/05/2011, at 7:32 PM, Ruediger Keller wrote:

> Hi,
>
> well you made an interesting investigation, although I think there's
> room for improvement. For one, it seems you didn't follow the basic
> rules for micro benchmarking on the JVM.

Thankyou for the link, I will chase that up and re-do the benchmarks.

> Also, I think your choice of words is rather poor. Compiled vs.
> interpreted pattern matches is quite misleading IMHO. Switch vs.
> nested-if semantics would be a more accurate and less misleading
> choice, I think.

In the context of a general forum such as this, you are right. However, I am working hard on some rather less practical pattern compilation work where the distinction is more important and the term "interpreter" is synonymous with nested-ifs. If I stated again I would avoid the term "interpreter" but I don't think it is inaccurate.

>
> Also, I think you can determine that Scala is not compiling this kind
> of matches to switch statements without doing any benchmarking. The
> JVM only supports switching on primitive and boxed integral types and
> enum types, so Scala cannot compile the pattern match to a switch if
> this isn't the case. It seems in the past case classes had a "tag",
> which was used for this, but this is no longer the case with more
> recent versions of Scala.

Again you are right, but Scala/JVM continually confounds me with the extent of its capabilities, and I wanted to be *really sure* :)

Matt Roberts

Tue, 2011-06-07, 07:27

#9

odersky

Joined: 2008-07-29,

Re: Re: Benchmarking pattern matching

Much earlier, Scala used to compile pattern matches to switches, using a compiler-generated $tag method. This was removed because our measurements showed that it was actually slower than a sequence of instanceof tests. See the paper "Matching Objects with Patterns", ECOOP 2007, for more details.

Cheers

-- Martin

On Tue, Jun 7, 2011 at 7:03 AM, Matthew Roberts <alt.mattr@gmail.com> wrote:

Specific comments in response to Ruediger below but I want to thank everyone involved, I know a lot more now than I did at the start of the conversion :)

Matt

On 31/05/2011, at 7:32 PM, Ruediger Keller wrote:

> Hi,
>
> well you made an interesting investigation, although I think there's
> room for improvement. For one, it seems you didn't follow the basic
> rules for micro benchmarking on the JVM.

Thankyou for the link, I will chase that up and re-do the benchmarks.

> Also, I think your choice of words is rather poor. Compiled vs.
> interpreted pattern matches is quite misleading IMHO. Switch vs.
> nested-if semantics would be a more accurate and less misleading
> choice, I think.

In the context of a general forum such as this, you are right. However, I am working hard on some rather less practical pattern compilation work where the distinction is more important and the term "interpreter" is synonymous with nested-ifs. If I stated again I would avoid the term "interpreter" but I don't think it is inaccurate.

>
> Also, I think you can determine that Scala is not compiling this kind
> of matches to switch statements without doing any benchmarking. The
> JVM only supports switching on primitive and boxed integral types and
> enum types, so Scala cannot compile the pattern match to a switch if
> this isn't the case. It seems in the past case classes had a "tag",
> which was used for this, but this is no longer the case with more
> recent versions of Scala.

Again you are right, but Scala/JVM continually confounds me with the extent of its capabilities, and I wanted to be *really sure* :)

Matt Roberts

--
----------------------------------------------
Martin Odersky
Prof., EPFL and CEO, Typesafe
PSED, 1015 Lausanne, Switzerland
Tel. EPFL: +41 21 693 6863
Tel. Typesafe: +41 21 691 4967

Scala Main Menu

Scala Quick Links

Featured News

User login