Strange behaviour of map function depending on first argument. Explanation = ?
Sat, 2009-06-06, 16:36
Dear,
when I execute
val objects = (1 to 2) map (i => new SomeObject(i))
it seems that objects is an object containing functions to create "new
SomeObject(i)" because when printing out these objects twice, I get two
different sets of object names.
When executing
val objects2 = List.range(1, 3) map (i => new SomeObject(i))
and then also printing out objects2 twice, I get the same object names also
twice. So the "new SomeObject(i)" is computed only once.
Below is a small program and printout showing this behaviour. Could someone
please enlighten me on the explanation?
(Both 2.7.4 and 2.7.5 show this behaviour)
Thanks!
Bart
========= Program ==========
package bart.stubTools

object Testing123 {
  def main(args: Array[String]) {
    val objects = (1 to 2) map (i => new SomeObject(i))
    println("=== Using (1 to 2) ---> 2 different sets of objects")
    objects foreach println
    println
    objects foreach println

    println("\n=== Now with List.range ---> the same set is printed twice")
    val objects2 = List.range(1, 3) map (i => new SomeObject(i))
    objects2 foreach println
    println
    objects2 foreach println
  }

  class SomeObject(val i : Int) {
  }
}
============ Output ==========
"C:\Program Files\Java\jdk1.6.0_13\bin\java" -Didea.launcher.port=7532
"-Didea.launcher.bin.path=C:\Program Files\Jetbrains\IntelliJ IDEA 8.x\bin"
-Dfile.encoding=windows-1252 -classpath "C:\Program
Files\Java\jdk1.6.0_13\jre\lib\charsets.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\deploy.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\javaws.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\jce.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\jsse.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\management-agent.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\plugin.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\resources.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\rt.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\ext\dnsns.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\ext\localedata.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\ext\sunjce_provider.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\ext\sunmscapi.jar;C:\Program
Files\Java\jdk1.6.0_13\jre\lib\ext\sunpkcs11.jar;C:\Users\bart\IdeaProjects\wintermute\out\production\wintermute;C:\Program
Files\Scala\lib\scala-dbc.jar;C:\Program
Files\Scala\lib\scala-compiler.jar;C:\Program
Files\Scala\lib\scala-library.jar;C:\Program
Files\Scala\lib\scala-swing.jar;C:\Program Files\Jetbrains\IntelliJ IDEA
8.x\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain
bart.stubTools.Testing123 3
=== Using (1 to 2) ---> 2 different sets of objects
bart.stubTools.Testing123$SomeObject@1cd8669
bart.stubTools.Testing123$SomeObject@337838
bart.stubTools.Testing123$SomeObject@18558d2
bart.stubTools.Testing123$SomeObject@18a47e0
=== Now with List.range ---> the same set is printed twice
bart.stubTools.Testing123$SomeObject@15eb0a9
bart.stubTools.Testing123$SomeObject@1a05308
bart.stubTools.Testing123$SomeObject@15eb0a9
bart.stubTools.Testing123$SomeObject@1a05308
Sat, 2009-06-06, 17:27
#2
Busy evaluation (was: Strange behaviour of map function dependi
Jan Lohre wrote:
> (1 to 2) results in a lazy data structure
No, it doesn't. A lazy data structure guarantees to evaluate its members
*at* *most* once. If you want a catchy name for the evaluation semantics
of the default non-strict constructs in scala it would be "busy", not
"lazy".
- Florian.
Sat, 2009-06-06, 17:47
#3
Re: Busy evaluation (was: Strange behaviour of map function de
2009/6/6 Florian Hars :
> Jan Lohre wrote:
>>
>> (1 to 2) results in a lazy data structure
>
> No, it doesn't. A lazy data structure guarantees to evaluate its members
> *at* *most* once. If you want a catchy name for the evaluation semantics of
> the default non-strict constructs in scala it would be "busy", not
> "lazy".
I don't know where I'd look for an authoritative formal definition of
"lazy evaluation", but in a lot of usages (Wikipedia in particular, for
whatever that's worth), "lazy" does seem to get used as a catch-all term
for non-strict evaluation rather than specifically for call by need. So
structures with call-by-name evaluation, like 1 to 2, could be called lazy.
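The distinction between call by name and call by need can be sketched in a few lines. This is only an illustration, not code from the thread: the names NeedVsName, expensive, twiceByName and twiceByNeed are made up for the example.

```scala
object NeedVsName {
  // Counts how often the "expensive" expression is evaluated.
  var evaluations = 0
  def expensive(): Int = { evaluations += 1; 21 }

  // Call by name: the argument expression is re-evaluated at every use of x.
  def twiceByName(x: => Int): Int = x + x

  // Call by need: the lazy val evaluates x on first use and caches the result.
  def twiceByNeed(x: => Int): Int = {
    lazy val cached = x
    cached + cached
  }

  // Each helper resets the counter, runs one variant, and reports the count.
  def countByName(): Int = { evaluations = 0; twiceByName(expensive()); evaluations }
  def countByNeed(): Int = { evaluations = 0; twiceByNeed(expensive()); evaluations }

  def main(args: Array[String]): Unit = {
    println("by name: " + countByName() + " evaluations") // 2
    println("by need: " + countByNeed() + " evaluations") // 1
  }
}
```

Both variants defer evaluation; they differ only in whether the result is remembered, which is the point under dispute here.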
Sat, 2009-06-06, 20:27
#4
Re: Strange behaviour of map function depending on first argumen
The issue was discussed a few weeks ago in this thread:
Seq repeated execution unexpected behavior
http://www.nabble.com/Seq-repeated-execution-unexpected-behavior-td23577...
Though I understand the motivation and the way it works, in my opinion such
implicit behavior is dangerous. I would lobby to change it :-)
eishay
Sun, 2009-06-07, 06:57
#5
Re: Busy evaluation
On 06.06.2009 18:17, Florian Hars wrote:
> Jan Lohre wrote:
>> (1 to 2) results in a lazy data structure
>
> No, it doesn't. A lazy data structure guarantees to evaluate its members
> *at* *most* once. If you want a catchy name for the evaluation semantics
> of the default non-strict constructs in scala it would be "busy", not
> "lazy".
>
> - Florian.
Scala actually offers all three variants.
- if you want upfront (or strict) evaluation, use List.range()
- if you want deferred evaluation at most once, use Stream.range()
- if you want deferred evaluation every time, use new Range()
Try it with the examples offered by the original poster.
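A rough way to see the three behaviours side by side is to count how often the mapping function runs across two traversals. Note the assumptions: this sketch targets Scala 2.13 or later, where (1 to 2) map f is evaluated strictly, so .view is used to reproduce the 2.7 Range behaviour and LazyList (the successor of Stream) stands in for Stream.range. The names EvaluationDemo, make and count are invented for the example.

```scala
object EvaluationDemo {
  // Counts how many times the mapping function runs.
  var calls = 0
  def make(i: Int): String = { calls += 1; "obj" + i }

  // Builds the collection, traverses it twice, and returns the number of
  // calls to make that this required.
  def count(xs: => Iterable[String]): Int = {
    calls = 0
    val coll = xs
    coll.foreach(_ => ())
    coll.foreach(_ => ())
    calls
  }

  // Strict: both elements are built up front, exactly once.
  val strictCalls: Int = count(List.range(1, 3).map(make))
  // Deferred, at most once: built on first traversal, then cached.
  val onceCalls: Int = count(LazyList.range(1, 3).map(make))
  // Deferred, every time: rebuilt on each traversal.
  val everyTimeCalls: Int = count((1 to 2).view.map(make))

  def main(args: Array[String]): Unit = {
    println("List.range:     " + strictCalls)    // 2
    println("LazyList.range: " + onceCalls)      // 2
    println("Range view:     " + everyTimeCalls) // 4
  }
}
```

The strict and at-most-once variants both call the function twice in total; they differ in when the calls happen, not in how many there are.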
I suppose an argument could be made that it would be more intuitive for
beginning users if the (1 to 2) syntax sugar was mapped to
Stream.range() or even List.range() instead of new Range(), but that
could lead to exorbitant memory use in simple for loops if the number of
iterations were large and we didn't need to collect all results.
Perhaps this should go in the Scala FAQ?
Sun, 2009-06-07, 12:07
#6
Re: Re: Busy evaluation
My preference would go to mapping (1 to 3) to the Stream.range because this
seems the most defensive option:
- it protects against running out of memory due to huge lists
- it executes the code only once which is what would be the intention in
most cases
- people who really want the "busy" behaviour will most probably make a
conscious decision to want this behaviour and so will not mind making that
explicit in the code.
I would also propose keeping the term lazy for calculations that are performed
at most once and using another term for the "busy" form. Maybe busy is not the
best term. "Repeated", "reiterated", ...?
Tnx for the swift replies!
Bart
Sun, 2009-06-07, 15:47
#7
Re: Re: Busy evaluation
On 07.06.2009 12:56, wintermute314 wrote:
> My preference would go to mapping (1 to 3) to the Stream.range because this
> seems the most defensive option:
> - it protects against running out of memory due to huge lists
That's not completely true though. To borrow the theme of your example:
var sum: BigInt = 0
val objects = Stream.range(1, 10000000) map (i => new SomeObject(i))
objects foreach {x => sum += x.i}
println(sum)
The above will throw OOM, whereas the Range version will still compute
the sum.
However, the following works fine:
Stream.range(1, 10000000) map (i => new SomeObject(i)) foreach {x => sum += x.i}
Same goes for the simple for-loop idiom:
for (x <- Stream.range(1, 10000000)) sum += x
The reason the first code example breaks is because the "objects" value
keeps a reference to the beginning of the stream preventing the garbage
collector from collecting any stream cells. In the second and third
examples, the stream is not assigned to a variable so the garbage
collector is free to reclaim the stream cells as they are "spent" by the
iteration.
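For comparison, here is a sketch of a traversal that keeps no evaluated elements around at all. It swaps in a plain Iterator rather than a Stream or Range, so it is not the code discussed above, and the names SumWithoutRetention and sumBelow are hypothetical.

```scala
object SumWithoutRetention {
  final class SomeObject(val i: Int)

  // Sums the i fields of SomeObject(1) .. SomeObject(n - 1).
  def sumBelow(n: Int): BigInt = {
    var sum: BigInt = 0
    // Iterator.range hands out each element once and stores none of them,
    // so every SomeObject becomes garbage as soon as it has been added.
    Iterator.range(1, n).map(i => new SomeObject(i)).foreach(x => sum += x.i)
    sum
  }

  def main(args: Array[String]): Unit =
    println(sumBelow(10000000))
}
```

The trade-off is that an Iterator can be traversed only once, so nothing can be referenced a second time.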
> - it executes the code only once which is what would be the intention in
> most cases
> - people who really want the "busy" behaviour will most probably make a
> conscious decision to want this behaviour and so will not mind making that
> explicit in the code.
I'm inclined to agree with you. If (1 to n) were replaced with
Stream.range(), the most common for-loop idiom would continue to work
even with a large number of iterations, as shown above, yet the behaviour
would be less surprising when (1 to n) is explicitly assigned to a
variable that is referenced multiple times.
And if someone explicitly needs the range semantics, they can just use
the Range class directly, or, as Paul Phillips suggested in the other
thread, new methods (1 rangeTo n) and (1 rangeUntil n) could be created
that produce ranges.
(1 to 2) results in a lazy data structure (Range if I am correct), hence calling map on it results again in something lazy.
List.range(1,3) results in a strict data structure (List), hence calling map on it results again in something strict.
try
(1 to 2).force map (i => new SomeObject(i))
or
((1 to 2) map (i => new SomeObject(i))).force
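On current Scala versions the same idea can be sketched as follows. The assumptions are worth spelling out: from Scala 2.8 onwards (1 to 2) map f is already strict, so this example uses .view to recreate the deferred mapping and toList as the strict copy, in place of force. ForceDemo and sameInstances are made-up names.

```scala
object ForceDemo {
  final class SomeObject(val i: Int)

  // Non-strict: every traversal re-runs the mapping function.
  val deferred = (1 to 2).view.map(i => new SomeObject(i))

  // Strict copy: the elements are built once, right here.
  val forced: List[SomeObject] = deferred.toList

  // True when two traversals of xs yield the very same instances.
  def sameInstances(xs: Iterable[SomeObject]): Boolean =
    xs.toList.zip(xs.toList).forall { case (a, b) => a eq b }

  def main(args: Array[String]): Unit = {
    println(sameInstances(deferred)) // false: new objects on each traversal
    println(sameInstances(forced))   // true: the stored objects are reused
  }
}
```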
I hope that helps.
Kind regards,
Jan
2009/6/6 wintermute314 <bart@uzleuven.be>