
Class file sizes in the Scala compiler

DRMacIver
In the aftermath of Paul's discoveries yesterday, I did some analysis of the size of class files in scala-compiler.jar. Here's a cumulative breakdown by size and count (each row covers all class files up to that size):
<= 1kB: 1.33% by size, 6.7% by count
<= 2kB: 17.83% by size, 40.26% by count
<= 3kB: 42.19% by size, 75.27% by count
<= 4kB: 52.84% by size, 85.88% by count
<= 5kB: 58.42% by size, 90.21% by count
<= 6kB: 61.36% by size, 92.04% by count
<= 7kB: 63.78% by size, 93.31% by count
<= 8kB: 65.64% by size, 94.16% by count
<= 9kB: 67.32% by size, 94.84% by count
<= 10kB: 68.98% by size, 95.44% by count
<= 11kB: 70.6% by size, 95.97% by count
<= 12kB: 72.28% by size, 96.47% by count
<= 13kB: 73.93% by size, 96.92% by count
<= 14kB: 74.22% by size, 96.99% by count
<= 15kB: 74.97% by size, 97.17% by count
<= 16kB: 75.43% by size, 97.27% by count
<= 17kB: 76.99% by size, 97.59% by count
<= 18kB: 78.53% by size, 97.89% by count
<= 19kB: 79.21% by size, 98.02% by count
<= 20kB: 79.78% by size, 98.12% by count
<= 21kB: 81.29% by size, 98.37% by count
<= 22kB: 81.76% by size, 98.44% by count
<= 23kB: 82.26% by size, 98.52% by count
<= 24kB: 82.78% by size, 98.59% by count
<= 25kB: 83.86% by size, 98.74% by count
<= 26kB: 84.42% by size, 98.82% by count
<= 27kB: 85.2% by size, 98.92% by count
<= 28kB: 86.2% by size, 99.04% by count
<= 29kB: 86.62% by size, 99.09% by count
<= 31kB: 87.06% by size, 99.14% by count
<= 32kB: 88.0% by size, 99.24% by count
<= 33kB: 88.47% by size, 99.29% by count
<= 35kB: 89.74% by size, 99.42% by count
<= 37kB: 90.0% by size, 99.44% by count
<= 39kB: 90.28% by size, 99.47% by count
<= 41kB: 90.88% by size, 99.52% by count
<= 42kB: 91.19% by size, 99.54% by count
<= 43kB: 91.5% by size, 99.57% by count
<= 44kB: 91.82% by size, 99.59% by count
<= 45kB: 92.15% by size, 99.62% by count
<= 46kB: 92.81% by size, 99.67% by count
<= 48kB: 93.16% by size, 99.69% by count
<= 50kB: 93.52% by size, 99.72% by count
<= 53kB: 93.91% by size, 99.74% by count
<= 57kB: 94.33% by size, 99.77% by count
<= 61kB: 94.78% by size, 99.79% by count
<= 68kB: 95.27% by size, 99.82% by count
<= 70kB: 95.78% by size, 99.84% by count
<= 75kB: 96.33% by size, 99.87% by count
<= 78kB: 97.48% by size, 99.92% by count
<= 80kB: 98.06% by size, 99.94% by count
<= 90kB: 98.71% by size, 99.97% by count
<= 175kB: 100.0% by size, 100.0% by count
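
For anyone who wants to run the same kind of analysis on another jar, here is a minimal sketch in Python. It is not the script used to produce the numbers above, which isn't shown in this post. The function names are made up for illustration, and the bucketing rule (uncompressed size rounded up to the next kB, 1kB = 1024 bytes, empty buckets skipped) is an assumption about how such a table would be built.

```python
import math
import zipfile
from collections import Counter


def class_file_sizes(jar):
    """Return the uncompressed size in bytes of every .class entry in a jar.

    `jar` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as archive:
        return [info.file_size for info in archive.infolist()
                if info.filename.endswith(".class")]


def cumulative_breakdown(sizes, bucket_bytes=1024):
    """Return cumulative (bucket_kb, pct_by_size, pct_by_count) rows.

    Assumption: a file of n bytes lands in bucket ceil(n / bucket_bytes).
    Buckets that contain no files are skipped.
    """
    total_size = sum(sizes)
    total_count = len(sizes)
    if total_count == 0 or total_size == 0:
        return []

    bytes_per_bucket = Counter()
    files_per_bucket = Counter()
    for size in sizes:
        bucket = max(1, math.ceil(size / bucket_bytes))
        bytes_per_bucket[bucket] += size
        files_per_bucket[bucket] += 1

    rows = []
    running_size = 0
    running_count = 0
    for bucket in sorted(bytes_per_bucket):
        running_size += bytes_per_bucket[bucket]
        running_count += files_per_bucket[bucket]
        rows.append((bucket,
                     round(100.0 * running_size / total_size, 2),
                     round(100.0 * running_count / total_count, 2)))
    return rows


def print_breakdown(jar):
    """Print the rows in the same layout as the table above."""
    for bucket, by_size, by_count in cumulative_breakdown(class_file_sizes(jar)):
        print(f"<= {bucket}kB: {by_size}% by size, {by_count}% by count")
```

Calling print_breakdown("scala-compiler.jar") should give output in the same shape as the table, though the exact figures depend on the compiler build and on the bucketing assumption.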

In particular, note that just over half of the total size (52.84%) is contributed by files weighing 4kB or less. So this makes Paul's results rather unsurprising. The $tag generation was happening on every class file (I think), and the vast majority of class files are small, so adding even a small amount to each class file is going to increase the total size dramatically.

My experience is that this is pretty typical for Scala applications. This suggests that the issues raised in http://www.nabble.com/-scala---half-a-kb-of-bytecode-overhead-per-anonymous-function-td20826275.html are a lot more pressing than they might initially seem. Taking half a kB off every file of 4kB or less would translate to a significant improvement.
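
To put a rough number on that for a given jar, something like the following Python sketch would do. It is a what-if calculation only: it assumes every file at or under the threshold really does carry the full removable overhead, which is an optimistic upper bound and not something measured here. The function name and default values are illustrative.

```python
def estimated_saving(sizes, saving_bytes=512, max_bytes=4096):
    """Fraction of total bytes removed if every file of at most max_bytes
    shrinks by saving_bytes (a file cannot shrink below zero bytes).

    Assumption: every small file carries the full overhead.
    """
    total = sum(sizes)
    if total == 0:
        return 0.0
    saved = sum(min(size, saving_bytes) for size in sizes if size <= max_bytes)
    return saved / total
```

Pass it a list of per-file sizes in bytes. With three files at or under 4kB and one larger file, for instance, it reports the share of the total that 3 x 512 bytes represents.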

I'm not really suggesting we do anything immediately about this. I just thought other people on the list might be interested in the results. :-)

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland