880 MB
Fri, 2011-09-09, 15:59
A snapshot of the heap after running through the scala sbt build. Post-processing. A count of the number of distinct copies of strings which are found on the heap. It starts reasonably enough like this:
1 matchesPT :
1 failed of type :
1 stack after interpret:
1 stack at the beginning of block
After 93,641 unique strings we get to the 2s:
2 getting typeinfo at the beginning of block
2 %-
2 (which expands to)
Line 133,637 starts the 3s:
3 )
3 of which in failed :
3 of which in implicits :
Jumping forward, line 194,641 starts the 10s:
10 $anonfun$transform$
10 $plus
10 <refinement>
10 /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Classes/classes.jar
10 /scratch/trunk1/lib/scala-compiler.jar
Line 203,799 begins the 50s:
50 '
50 /scratch/trunk1/src/compiler/scala/reflect/runtime/ScalaToJava.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/backend/icode/Printers.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/plugins/Plugin.scala
50 /scratch/trunk1/src/compiler/scala/tools/nsc/settings/StandardScalaSettings.scala
At line 206,373 we enter the 100s:
100 #|
100 <~
100 ChoiceSetting
100 Forward
100 Implementation
Line 210,820 opens the 1000s:
1000 ListSet
1007 PrintStream
1009 Typed
1012 contextError
1012 ctx
1012 firePropertyChange
Line 210,944 opens the 10,000s:
10008 isTerm
10038 isType
10101 Name
10173 symbol
10495 children
10547 setType
And I'll include the six-figure club in its entirety.
100507 p
104843 String
110642 Function2
134476 that
137226 java
142073 collection
143074 java.lang
143345 !=
143345 ==
164973 Object
176175 Function1
214742 wait
214867 (classOf[java.lang.InterruptedException])
416501 scala
782459 x$1
Congratulations to x$1, today's winner! Who will be the first million copy string? Stay tuned!
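For readers who want to reproduce this kind of tally on a small scale, here is a minimal sketch of the counting step. It assumes you already hold references to the String instances of interest (for example, pulled out of a heap dump by some other tool); the object name `StringCopies` and the method `copiesPerValue` are made up for illustration and are not part of any profiler. It groups strings by contents and then counts how many distinct objects back each value.

```scala
import java.util.{Collections, IdentityHashMap}

object StringCopies {
  /** For each string value, the number of distinct String objects holding it. */
  def copiesPerValue(strings: Seq[String]): Map[String, Int] =
    strings.groupBy(identity).map { case (value, instances) =>
      // An identity-based set keeps one entry per object, even when the
      // contents are equal, so its size is the number of separate copies.
      val distinctObjects =
        Collections.newSetFromMap(new IdentityHashMap[String, java.lang.Boolean]())
      instances.foreach(s => distinctObjects.add(s))
      value -> distinctObjects.size()
    }
}
```

Passing two separately allocated strings with the contents `x$1` plus one `scala` literal reports two copies of the first value and one of the second; passing the same reference twice counts it only once.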
Fri, 2011-09-09, 21:37
#2
Re: 880 MB
On Fri, Sep 9, 2011 at 5:46 PM, Josh Suereth <joshua.suereth@gmail.com> wrote:
Seems like a good intern could help reduce the clutter....
Interesting. I also noted that the discipline of using names instead of strings was gradually dissipating. This is the result. (For reference: names are optimized, interned strings. Neal Gafter once tried to replace names with interned strings in javac and got a general 5-10% slowdown, so don't try that.)
Cheers
-- Martin
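To make "names are optimized, interned strings" concrete, here is a minimal sketch of a name table. It is not the compiler's actual code: `NameTable`, `Name`, and `intern` are hypothetical names, and the layout (one shared character array plus hash buckets with chained entries) is only an assumption about the general technique. The point it illustrates is that each spelling is stored once, so two lookups of the same text return the same object and names can be compared by reference.

```scala
/** Hypothetical, simplified name table; a sketch, not the compiler's real code. */
final class NameTable(bucketCount: Int = 1024) {

  /** A name is a slice of the shared character store, chained within its bucket. */
  final class Name private[NameTable] (val start: Int, val length: Int, val next: Name) {
    override def toString: String = new String(chars, start, length)
  }

  private var chars = new Array[Char](4096)    // shared store for all spellings
  private var used = 0                         // number of chars already occupied
  private val buckets = new Array[Name](bucketCount)

  private def bucketOf(cs: Array[Char]): Int =
    (java.util.Arrays.hashCode(cs) & Int.MaxValue) % buckets.length

  private def matches(n: Name, cs: Array[Char]): Boolean =
    n.length == cs.length && cs.indices.forall(i => chars(n.start + i) == cs(i))

  /** Returns the single Name for `text`, creating it on first use. */
  def intern(text: String): Name = {
    val cs = text.toCharArray
    val b = bucketOf(cs)
    var n = buckets(b)
    while (n != null && !matches(n, cs)) n = n.next
    if (n == null) {
      // Grow the store if the new spelling does not fit.
      if (used + cs.length > chars.length)
        chars = java.util.Arrays.copyOf(chars, math.max(chars.length * 2, used + cs.length))
      System.arraycopy(cs, 0, chars, used, cs.length)
      n = new Name(used, cs.length, buckets(b))
      buckets(b) = n
      used += cs.length
    }
    n
  }
}
```

With a table `t`, `t.intern("x$1") eq t.intern(new String("x$1"))` holds, because the second call finds the entry created by the first instead of allocating a new one.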
Mon, 2011-09-12, 15:17
#3
Re: 880 MB
So it sounds like, for a single instance of the compiler, using the interned char store for names is a great idea. This SBT build has a few instances of the compiler open at a time, but that should account for at most 4 or 5 copies of the same string from Namers if the store is being used correctly. So although interning would reduce that duplication to 1, I think reducing the duplication from half a million to 5 would be a pretty big win.
How would one go about 'fixing' this issue? I'm offering to help where I can because the reduction of this duplication will drastically help the SBT build.
- Josh
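The bound described above (one copy per compiler instance rather than one per use) can be sketched as follows. Everything here is hypothetical: `CompilerInstance` is a stand-in that only models a per-instance pool, not the real compiler, and `survivingCopies` is an illustrative helper. Each lookup arrives as a freshly allocated String, the way uninterned code would produce it, and the sketch counts how many distinct objects remain once every instance hands back its own canonical copy.

```scala
import scala.collection.mutable

/** Hypothetical stand-in for one compiler instance that owns its own pool. */
final class CompilerInstance {
  private val pool = mutable.HashMap.empty[String, String]

  /** Returns this instance's canonical copy of `s`. */
  def name(s: String): String = pool.getOrElseUpdate(s, s)
}

object DuplicationBound {
  /** Distinct objects left for one spelling after `requests` lookups
    * spread round-robin over `instances` separate pools. */
  def survivingCopies(spelling: String, instances: Int, requests: Int): Int = {
    require(instances > 0, "need at least one compiler instance")
    val compilers = Vector.fill(instances)(new CompilerInstance)
    val canonical = (0 until requests).map { i =>
      // Every request is a fresh String object with the same contents.
      compilers(i % instances).name(new String(spelling))
    }
    // Count objects by reference, not by contents.
    val byIdentity = java.util.Collections.newSetFromMap(
      new java.util.IdentityHashMap[String, java.lang.Boolean]())
    canonical.foreach(s => byIdentity.add(s))
    byIdentity.size()
  }
}
```

Under these assumptions, `DuplicationBound.survivingCopies("x$1", 5, 1000)` leaves 5 objects: the number of copies tracks the number of instances, not the number of requests.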
Mon, 2011-09-12, 15:37
#4
Re: 880 MB
I burned the whole weekend profiling and tweaking sbt. I'll have a report today.
Mon, 2011-09-12, 15:47
#5
Re: 880 MB
This sounds awesome, guys. Perhaps one of you could put up a blog post on performance profiling Scala code and potential pitfalls. When you say `names` I assume you are referring to symbols [1].
[1]: http://www.scala-lang.org/api/current/scala/Symbol.html
-Doug Tangren
http://lessis.me
On Fri, Sep 9, 2011 at 10:59 AM, Paul Phillips <paulp@improving.org> wrote: