Okasaki: Purely Functional data Structures

Mon, 2009-07-20, 22:27

#1

Josh Stratton

Joined: 2009-04-23,

Re: Okasaki: Purely Functional data Structures

> To follow up on an earlier thread, for those interested, this book came
> today from Amazon. Despite the description on the Amazon website it is
> not a second edition - it's a reprint (but I'm sure the contents are
> just as valid!)

Well, after reading through Okasaki's thesis and a few other journals,
I'm still wondering which types of applications would be a "pain" to
implement. It seems like a lot of the techniques he suggests is to
just get functional programming working efficiently. I guess once one
gets around the extra data structure design, one has all the touted
benefits of functional programming like "easier" unit testing.

Some articles have noted simulations to the rise of OO programming.
Is this still better suited to OO today? For example, I have a large
graph of interconnected nodes where every simulation step some of the
values in random nodes change (like position in space), new nodes are
added, links are made, nodes are removed, etc. Would one be fighting
with functional programming to get this running efficiently?

Josh

Mon, 2009-07-20, 22:47

#2

Robert Fischer

Joined: 2009-01-31,

Re: Okasaki: Purely Functional data Structures

Josh Stratton wrote:
> Some articles have noted simulations to the rise of OO programming.
> Is this still better suited to OO today? For example, I have a large
> graph of interconnected nodes where every simulation step some of the
> values in random nodes change (like position in space), new nodes are
> added, links are made, nodes are removed, etc. Would one be fighting
> with functional programming to get this running efficiently?
>
> Josh
>

At JavaOne's "Scripting Language Bowl", Rich Hickey demonstrated a Life simulation[1] and elaborated
on features of Clojure that made it easier to code than a straight OO solution. So obviously
simulations aren't innately a problem.

GUIs seem to be a recurring bit of awkwardness, even in the face of reactive programming. I'm not
sure I'd program the sprites of a video game in a functional programming language, either.

[1] http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life

~~ Robert Fischer, Smokejumper IT Consulting.
Enfranchised Mind Blog http://EnfranchisedMind.com/blog

Check out my book, "Grails Persistence with GORM and GSQL"!
http://www.smokejumperit.com/redirect.html

Mon, 2009-07-20, 23:17

#3

Josh Stratton

Joined: 2009-04-23,

Re: Okasaki: Purely Functional data Structures

> At JavaOne's "Scripting Language Bowl", Rich Hickey demonstrated a Life
> simulation[1] and elaborated on features of Clojure that made it easier to
> code than a straight OO solution. So obviously simulations aren't innately
> a problem.

Well, I'm sure it's possible to do simulations in functional
programming, but in the Life Simulation, one typically uses a
relatively small grid and just pass of the cells in a linear fashion.
I could imagine writing such a program would be simpler in a
functional programming paradigm, but in large simulations one might
just be mutating parts of large data sets over and over. I don't see
functional programming as a barrier to simulation, but it seems like
there were some motivations to use OO at the time. Was it
performance? Ease of coding (at the time)? Looking at Okasaki's
work, some of these data structures seem really elaborate to guarantee
immutability. Plus lazy evaluation seems a bit pointless if the CPU
is being pushed to its limits anyway.

Josh

Tue, 2009-07-21, 00:07

#4

Robert Fischer

Joined: 2009-01-31,

Re: Okasaki: Purely Functional data Structures

Josh Stratton wrote:
> Looking at Okasaki's
> work, some of these data structures seem really elaborate to guarantee
> immutability. Plus lazy evaluation seems a bit pointless if the CPU
> is being pushed to its limits anyway.
>
> Josh
>

Heh: I've come to think of mutable data structures as being inordinately complicated b/c of their
mutable nature. Ever try to rollback a data structure, or accidentally modify the hash code of an
object in a hash map?

~~ Robert Fischer, Smokejumper IT Consulting.
Enfranchised Mind Blog http://EnfranchisedMind.com/blog

Check out my book, "Grails Persistence with GORM and GSQL"!
http://www.smokejumperit.com/redirect.html

Tue, 2009-07-21, 14:47

#5

Gabriel Claramunt 2

Joined: 2009-07-21,

Re: Okasaki: Purely Functional data Structures

On Mon, Jul 20, 2009 at 7:00 PM, Robert
Fischer wrote:
> Josh Stratton wrote:
>> Looking at Okasaki's
>> work, some of these data structures seem really elaborate to guarantee
>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>> is being pushed to its limits anyway.
>>
>> Josh
>>
>
> Heh: I've come to think of mutable data structures as being inordinately
> complicated b/c of their mutable nature. Ever try to rollback a data
> structure, or accidentally modify the hash code of an object in a hash map?
>
> ~~ Robert Fischer, Smokejumper IT Consulting.
> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>
> Check out my book, "Grails Persistence with GORM and GSQL"!
> http://www.smokejumperit.com/redirect.html
>

Tue, 2009-07-21, 15:07

#6

Gabriel Claramunt 2

Joined: 2009-07-21,

Re: Okasaki: Purely Functional data Structures

That's one of the things I like about Scala: it doesn't force you to
choose an approach.
You can write your solution using objects, using pure functions, using
mutable structures, using immutable structures, using strict
evaluation, or using lazy evaluation. ( or a combination of all)
Granted, it doesn't come for free, but I think the synthesis of OO and
FP can be really powerful.

Tue, 2009-07-21, 16:07

#7

Kevin Wright

Joined: 2009-06-09,

Re: Okasaki: Purely Functional data Structures

I think it's the other way around, what you want is a higher level that just data structures, you need the algorithms surrounding the use of those structures as well.

On Tue, Jul 21, 2009 at 4:00 PM, Bastian, Mark <mbastia@sandia.gov> wrote:

This is all very interesting, but it seems like what is really needed are some simple examples (perhaps on the wiki?) of the functional versions of problems commonly done in OOP style. I am guessing those of us who have spent years doing things in C++ then Java find it easy to say that the OO way is easier because it is all we know. What I’d really like to see are some really simple examples of how immutability is maintained in different types of applications when things change.

As a simple example, I am interested in all kinds of simulations, which typically has some sort of look like so:

class StatefulObject
{
  var position
  var velocity
  def sim(dt : Double) = { //Some logic which updates the state variables (position and velocity, in this case) }
}

class PhysicalSystem
{
  val objects = mutable.Set[StatefulObject]
  def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
}

I can’t think of a more stateful problem. How would I implement this sort of program functionally? <- This question can be taken rhetorically or literally if someone actually wants to address it. I am mainly just saying that to move from OO to functional thinking is going to require examples and explanations at a more basic level than data structure descriptions.

-Mark

On 7/21/09 7:36 AM, "Gabriel Claramunt" <gabriel [dot] claramunt [at] gmail [dot] com" target="_blank" rel="nofollow">gabriel.claramunt@gmail.com> wrote:

On Mon, Jul 20, 2009 at 7:00 PM, Robert
Fischer<robert [dot] fischer [at] smokejumperit [dot] com" target="_blank" rel="nofollow">robert.fischer@smokejumperit.com> wrote:
> Josh Stratton wrote:
>> Looking at Okasaki's
>> work, some of these data structures seem really elaborate to guarantee
>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>> is being pushed to its limits anyway.
>>
>> Josh
>>
>
> Heh: I've come to think of mutable data structures as being inordinately
> complicated b/c of their mutable nature. Ever try to rollback a data
> structure, or accidentally modify the hash code of an object in a hash map?
>
> ~~ Robert Fischer, Smokejumper IT Consulting.
> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>
> Check out my book, "Grails Persistence with GORM and GSQL"!
> http://www.smokejumperit.com/redirect.html
>

--
Gabriel Claramunt
Sun Certified Enterprise Architect
http://gabrielsw.blogspot.com

Tue, 2009-07-21, 16:17

#8

Bastian, Mark

Joined: 2009-01-16,

Re: Okasaki: Purely Functional data Structures

Re: [scala-user] Okasaki: Purely Functional data Structures This is all very interesting, but it seems like what is really needed are some simple examples (perhaps on the wiki?) of the functional versions of problems commonly done in OOP style. I am guessing those of us who have spent years doing things in C++ then Java find it easy to say that the OO way is easier because it is all we know. What I’d really like to see are some really simple examples of how immutability is maintained in different types of applications when things change.

As a simple example, I am interested in all kinds of simulations, which typically has some sort of look like so:

class StatefulObject
{
  var position
  var velocity
  def sim(dt : Double) = { //Some logic which updates the state variables (position and velocity, in this case) }
}

class PhysicalSystem
{
  val objects = mutable.Set[StatefulObject]
  def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
}

I can’t think of a more stateful problem. How would I implement this sort of program functionally? <- This question can be taken rhetorically or literally if someone actually wants to address it. I am mainly just saying that to move from OO to functional thinking is going to require examples and explanations at a more basic level than data structure descriptions.

-Mark

On 7/21/09 7:36 AM, "Gabriel Claramunt" <gabriel [dot] claramunt [at] gmail [dot] com" rel="nofollow">gabriel.claramunt@gmail.com> wrote:

On Mon, Jul 20, 2009 at 7:00 PM, Robert
Fischer<robert [dot] fischer [at] smokejumperit [dot] com" rel="nofollow">robert.fischer@smokejumperit.com> wrote:
> Josh Stratton wrote:
>> Looking at Okasaki's
>> work, some of these data structures seem really elaborate to guarantee
>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>> is being pushed to its limits anyway.
>>
>> Josh
>>
>
> Heh: I've come to think of mutable data structures as being inordinately
> complicated b/c of their mutable nature. Ever try to rollback a data
> structure, or accidentally modify the hash code of an object in a hash map?
>
> ~~ Robert Fischer, Smokejumper IT Consulting.
> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>
> Check out my book, "Grails Persistence with GORM and GSQL"!
> http://www.smokejumperit.com/redirect.html
>

--
Gabriel Claramunt
Sun Certified Enterprise Architect
http://gabrielsw.blogspot.com

Tue, 2009-07-21, 16:27

#9

Bastian, Mark

Joined: 2009-01-16,

Re: Okasaki: Purely Functional data Structures

Re: [scala-user] Okasaki: Purely Functional data Structures Yes to all.

On 7/21/09 9:04 AM, "Kevin Wright" <kev [dot] lee [dot] wright [at] googlemail [dot] com" rel="nofollow">kev.lee.wright@googlemail.com> wrote:

I think it's the other way around, what you want is a higher level that just data structures, you need the algorithms surrounding the use of those structures as well.

On Tue, Jul 21, 2009 at 4:00 PM, Bastian, Mark <mbastia [at] sandia [dot] gov" rel="nofollow">mbastia@sandia.gov> wrote:

This is all very interesting, but it seems like what is really needed are some simple examples (perhaps on the wiki?) of the functional versions of problems commonly done in OOP style. I am guessing those of us who have spent years doing things in C++ then Java find it easy to say that the OO way is easier because it is all we know. What I’d really like to see are some really simple examples of how immutability is maintained in different types of applications when things change.

As a simple example, I am interested in all kinds of simulations, which typically has some sort of look like so:

class StatefulObject
{
  var position
  var velocity
  def sim(dt : Double) = { //Some logic which updates the state variables (position and velocity, in this case) }
}

class PhysicalSystem
{
  val objects = mutable.Set[StatefulObject]
  def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
}

I can’t think of a more stateful problem. How would I implement this sort of program functionally? <- This question can be taken rhetorically or literally if someone actually wants to address it. I am mainly just saying that to move from OO to functional thinking is going to require examples and explanations at a more basic level than data structure descriptions.

-Mark

On 7/21/09 7:36 AM, "Gabriel Claramunt" <gabriel [dot] claramunt [at] gmail [dot] com" rel="nofollow">gabriel.claramunt@gmail.com <gabriel [dot] claramunt [at] gmail [dot] com" rel="nofollow">http://gabriel.claramunt@gmail.com> > wrote:

On Mon, Jul 20, 2009 at 7:00 PM, Robert
Fischer<robert [dot] fischer [at] smokejumperit [dot] com" rel="nofollow">robert.fischer@smokejumperit.com <robert [dot] fischer [at] smokejumperit [dot] com" rel="nofollow">http://robert.fischer@smokejumperit.com> > wrote:
> Josh Stratton wrote:
>> Looking at Okasaki's
>> work, some of these data structures seem really elaborate to guarantee
>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>> is being pushed to its limits anyway.
>>
>> Josh
>>
>
> Heh: I've come to think of mutable data structures as being inordinately
> complicated b/c of their mutable nature. Ever try to rollback a data
> structure, or accidentally modify the hash code of an object in a hash map?
>
> ~~ Robert Fischer, Smokejumper IT Consulting.
> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>
> Check out my book, "Grails Persistence with GORM and GSQL"!
> http://www.smokejumperit.com/redirect.html
>

--
Gabriel Claramunt
Sun Certified Enterprise Architect
http://gabrielsw.blogspot.com

Tue, 2009-07-21, 16:27

#10

Ricky Clarkson

Joined: 2008-12-19,

Re: Okasaki: Purely Functional data Structures

case class O(position: Point, velocity: Vector) {
def sim(dt: Double) = O(position + velocity * dt, velocity)
}

case class PhysicalSystem(os: Set[O]) {
def sim(dt: Double) = os map (_ sim dt)
}

2009/7/21 Bastian, Mark :
> This is all very interesting, but it seems like what is really needed are
> some simple examples (perhaps on the wiki?) of the functional versions of
> problems commonly done in OOP style. I am guessing those of us who have
> spent years doing things in C++ then Java find it easy to say that the OO
> way is easier because it is all we know. What I’d really like to see are
> some really simple examples of how immutability is maintained in different
> types of applications when things change.
>
> As a simple example, I am interested in all kinds of simulations, which
> typically has some sort of look like so:
>
> class StatefulObject
> {
>   var position
>   var velocity
>   def sim(dt : Double) = { //Some logic which updates the state variables
> (position and velocity, in this case) }
> }
>
> class PhysicalSystem
> {
>   val objects = mutable.Set[StatefulObject]
>   def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
> }
>
> I can’t think of a more stateful problem. How would I implement this sort of
> program functionally? <- This question can be taken rhetorically or
> literally if someone actually wants to address it. I am mainly just saying
> that to move from OO to functional thinking is going to require examples and
> explanations at a more basic level than data structure descriptions.
>
> -Mark
>
> On 7/21/09 7:36 AM, "Gabriel Claramunt" wrote:
>
> On Mon, Jul 20, 2009 at 7:00 PM, Robert
> Fischer wrote:
>> Josh Stratton wrote:
>>> Looking at Okasaki's
>>> work, some of these data structures seem really elaborate to guarantee
>>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>>> is being pushed to its limits anyway.
>>>
>>> Josh
>>>
>>
>> Heh: I've come to think of mutable data structures as being inordinately
>> complicated b/c of their mutable nature. Ever try to rollback a data
>> structure, or accidentally modify the hash code of an object in a hash
>> map?
>>
>> ~~ Robert Fischer, Smokejumper IT Consulting.
>> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>>
>> Check out my book, "Grails Persistence with GORM and GSQL"!
>> http://www.smokejumperit.com/redirect.html
>>
>
>
>
> --
> Gabriel Claramunt
> Sun Certified Enterprise Architect
> http://gabrielsw.blogspot.com
>
>
>

Tue, 2009-07-21, 17:07

#11

Ricky Clarkson

Joined: 2008-12-19,

Re: Okasaki: Purely Functional data Structures

Josh,

You might want to hit Reply All in future. These mailing lists are
poorly set up.

> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure.

It can be, actually. The JVM is especially good at dealing with large
numbers of small very short-lived objects.

> If you have a large
> class with only a couple values like position and velocity changing,
> you can just have one function that updates those values.

I'm fed up to the back teeth of large classes. I'd rather have small
classes. The case where it's more interesting is when the set becomes
large. At some point copying becomes problematic and we need to find
a way of sharing. This can be mutation, as you would probably go for
at least until now, or it can be basically a chain of diffs from some
original.

> Sure, it
> would then be mutable, but I don't think that makes the software
> impossible to maintain.

Neither do I. But it's generally easier to maintain when the data
structures aren't mutable. You don't have to worry about which bits
of code use your immutables. In contrast, using mutables is a little
like using manual memory management. You need a concept of "who owns
this", so that you can make sure you don't pass it around to something
that frees, er, mutates it in ways your code doesn't expect.

> On that note, I have very little experience with unit testing and was
> wondering how certain algorithms like simulations are unit tested,
> where the answers are floating point numbers that can be within a
> certain tolerance or complex tree hierarchies, where you're unit
> testing against references. You can really do an assert against
> against the reference itself.

assertTrue(O(position=3 by 4, velocity=5 by 10).sim(dt=1).position > (3 by 4))

As a general rule, the easier your code is to unit test well the
easier it is to use, and the better it will be internally.

2009/7/21 Josh Stratton :
>> case class O(position: Point, velocity: Vector) {
>> def sim(dt: Double) = O(position + velocity * dt, velocity)
>> }
>>
>> case class PhysicalSystem(os: Set[O]) {
>> def sim(dt: Double) = os map (_ sim dt)
>> }
>
> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure. If you have a large
> class with only a couple values like position and velocity changing,
> you can just have one function that updates those values. Sure, it
> would then be mutable, but I don't think that makes the software
> impossible to maintain.
>
> On that note, I have very little experience with unit testing and was
> wondering how certain algorithms like simulations are unit tested,
> where the answers are floating point numbers that can be within a
> certain tolerance or complex tree hierarchies, where you're unit
> testing against references. You can really do an assert against
> against the reference itself.
>
> Josh
>
>>
>> 2009/7/21 Bastian, Mark :
>>> This is all very interesting, but it seems like what is really needed are
>>> some simple examples (perhaps on the wiki?) of the functional versions of
>>> problems commonly done in OOP style. I am guessing those of us who have
>>> spent years doing things in C++ then Java find it easy to say that the OO
>>> way is easier because it is all we know. What I’d really like to see are
>>> some really simple examples of how immutability is maintained in different
>>> types of applications when things change.
>>>
>>> As a simple example, I am interested in all kinds of simulations, which
>>> typically has some sort of look like so:
>>>
>>> class StatefulObject
>>> {
>>>   var position
>>>   var velocity
>>>   def sim(dt : Double) = { //Some logic which updates the state variables
>>> (position and velocity, in this case) }
>>> }
>>>
>>> class PhysicalSystem
>>> {
>>>   val objects = mutable.Set[StatefulObject]
>>>   def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
>>> }
>>>
>>> I can’t think of a more stateful problem. How would I implement this sort of
>>> program functionally? <- This question can be taken rhetorically or
>>> literally if someone actually wants to address it. I am mainly just saying
>>> that to move from OO to functional thinking is going to require examples and
>>> explanations at a more basic level than data structure descriptions.
>>>
>>> -Mark
>>>
>>> On 7/21/09 7:36 AM, "Gabriel Claramunt" wrote:
>>>
>>> On Mon, Jul 20, 2009 at 7:00 PM, Robert
>>> Fischer wrote:
>>>> Josh Stratton wrote:
>>>>> Looking at Okasaki's
>>>>> work, some of these data structures seem really elaborate to guarantee
>>>>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>>>>> is being pushed to its limits anyway.
>>>>>
>>>>> Josh
>>>>>
>>>>
>>>> Heh: I've come to think of mutable data structures as being inordinately
>>>> complicated b/c of their mutable nature. Ever try to rollback a data
>>>> structure, or accidentally modify the hash code of an object in a hash
>>>> map?
>>>>
>>>> ~~ Robert Fischer, Smokejumper IT Consulting.
>>>> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>>>>
>>>> Check out my book, "Grails Persistence with GORM and GSQL"!
>>>> http://www.smokejumperit.com/redirect.html
>>>>
>>>
>>>
>>>
>>> --
>>> Gabriel Claramunt
>>> Sun Certified Enterprise Architect
>>> http://gabrielsw.blogspot.com
>>>
>>>
>>>
>>
>>
>>
>> --
>> Ricky Clarkson
>> Java Programmer, AD Holdings
>> +44 1565 770804
>> Skype: ricky_clarkson
>> Google Talk: ricky.clarkson@gmail.com
>>
>

Tue, 2009-07-21, 17:27

#12

Kevin Wright

Joined: 2009-06-09,

Re: Okasaki: Purely Functional data Structures

If you can get your hands on a copy of "programming in scala" (I've found that buying it is a successful technique for this...) then you want to take a look at chapter 19, which is mostly about the type system, but also gives a very good explanation of how queues can be implemented in a fast, immutable, functional style.

A lot of this stuff can get quite deep, especially when people start discussing "amortized" structures, but that chapter is a great place to start.

On Tue, Jul 21, 2009 at 4:56 PM, Ricky Clarkson <ricky.clarkson@gmail.com> wrote:

Josh,

You might want to hit Reply All in future. These mailing lists are
poorly set up.

> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure.

It can be, actually. The JVM is especially good at dealing with large
numbers of small very short-lived objects.

> If you have a large
> class with only a couple values like position and velocity changing,
> you can just have one function that updates those values.

I'm fed up to the back teeth of large classes. I'd rather have small
classes. The case where it's more interesting is when the set becomes
large. At some point copying becomes problematic and we need to find
a way of sharing. This can be mutation, as you would probably go for
at least until now, or it can be basically a chain of diffs from some
original.

> Sure, it
> would then be mutable, but I don't think that makes the software
> impossible to maintain.

Neither do I. But it's generally easier to maintain when the data
structures aren't mutable. You don't have to worry about which bits
of code use your immutables. In contrast, using mutables is a little
like using manual memory management. You need a concept of "who owns
this", so that you can make sure you don't pass it around to something
that frees, er, mutates it in ways your code doesn't expect.

> On that note, I have very little experience with unit testing and was
> wondering how certain algorithms like simulations are unit tested,
> where the answers are floating point numbers that can be within a
> certain tolerance or complex tree hierarchies, where you're unit
> testing against references. You can really do an assert against
> against the reference itself.

assertTrue(O(position=3 by 4, velocity=5 by 10).sim(dt=1).position > (3 by 4))

As a general rule, the easier your code is to unit test well the
easier it is to use, and the better it will be internally.

2009/7/21 Josh Stratton <strattonbrazil@gmail.com>:
>> case class O(position: Point, velocity: Vector) {
>> def sim(dt: Double) = O(position + velocity * dt, velocity)
>> }
>>
>> case class PhysicalSystem(os: Set[O]) {
>> def sim(dt: Double) = os map (_ sim dt)
>> }
>
> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure. If you have a large
> class with only a couple values like position and velocity changing,
> you can just have one function that updates those values. Sure, it
> would then be mutable, but I don't think that makes the software
> impossible to maintain.
>
> On that note, I have very little experience with unit testing and was
> wondering how certain algorithms like simulations are unit tested,
> where the answers are floating point numbers that can be within a
> certain tolerance or complex tree hierarchies, where you're unit
> testing against references. You can really do an assert against
> against the reference itself.
>
> Josh
>
>>
>> 2009/7/21 Bastian, Mark <mbastia@sandia.gov>:
>>> This is all very interesting, but it seems like what is really needed are
>>> some simple examples (perhaps on the wiki?) of the functional versions of
>>> problems commonly done in OOP style. I am guessing those of us who have
>>> spent years doing things in C++ then Java find it easy to say that the OO
>>> way is easier because it is all we know. What I’d really like to see are
>>> some really simple examples of how immutability is maintained in different
>>> types of applications when things change.
>>>
>>> As a simple example, I am interested in all kinds of simulations, which
>>> typically has some sort of look like so:
>>>
>>> class StatefulObject
>>> {
>>>   var position
>>>   var velocity
>>>   def sim(dt : Double) = { //Some logic which updates the state variables
>>> (position and velocity, in this case) }
>>> }
>>>
>>> class PhysicalSystem
>>> {
>>>   val objects = mutable.Set[StatefulObject]
>>>   def sim(dt : Double) = { for(object <- objects) object.sim(dt) }
>>> }
>>>
>>> I can’t think of a more stateful problem. How would I implement this sort of
>>> program functionally? <- This question can be taken rhetorically or
>>> literally if someone actually wants to address it. I am mainly just saying
>>> that to move from OO to functional thinking is going to require examples and
>>> explanations at a more basic level than data structure descriptions.
>>>
>>> -Mark
>>>
>>> On 7/21/09 7:36 AM, "Gabriel Claramunt" <gabriel.claramunt@gmail.com> wrote:
>>>
>>> On Mon, Jul 20, 2009 at 7:00 PM, Robert
>>> Fischer<robert.fischer@smokejumperit.com> wrote:
>>>> Josh Stratton wrote:
>>>>> Looking at Okasaki's
>>>>> work, some of these data structures seem really elaborate to guarantee
>>>>> immutability. Plus lazy evaluation seems a bit pointless if the CPU
>>>>> is being pushed to its limits anyway.
>>>>>
>>>>> Josh
>>>>>
>>>>
>>>> Heh: I've come to think of mutable data structures as being inordinately
>>>> complicated b/c of their mutable nature. Ever try to rollback a data
>>>> structure, or accidentally modify the hash code of an object in a hash
>>>> map?
>>>>
>>>> ~~ Robert Fischer, Smokejumper IT Consulting.
>>>> Enfranchised Mind Blog http://EnfranchisedMind.com/blog
>>>>
>>>> Check out my book, "Grails Persistence with GORM and GSQL"!
>>>> http://www.smokejumperit.com/redirect.html
>>>>
>>>
>>>
>>>
>>> --
>>> Gabriel Claramunt
>>> Sun Certified Enterprise Architect
>>> http://gabrielsw.blogspot.com
>>>
>>>
>>>
>>
>>
>>
>> --
>> Ricky Clarkson
>> Java Programmer, AD Holdings
>> +44 1565 770804
>> Skype: ricky_clarkson
>> Google Talk: ricky.clarkson@gmail.com
>>
>

--
Ricky Clarkson
Java Programmer, AD Holdings
+44 1565 770804
Skype: ricky_clarkson
Google Talk: ricky.clarkson@gmail.com

Tue, 2009-07-21, 17:27

#13

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

2009/7/21 Ricky Clarkson <ricky.clarkson@gmail.com>

> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure.

It can be, actually. The JVM is especially good at dealing with large
numbers of small very short-lived objects.

Oh boy, if creating a large number of short-lived immutable objects is actually faster than creating a single mutable one, this must be Chuck Norris' JVM. :)
(I'm all in for agreeing that such objects can be allocated and reclaimed very quickly, of course, but the above sounded a tad stronger... )
Dimitris

Tue, 2009-07-21, 17:47

#14

Kevin Wright

Joined: 2009-06-09,

Re: Okasaki: Purely Functional data Structures

it's all about the eden space...

On Tue, Jul 21, 2009 at 5:17 PM, Jim Andreou <jim.andreou@gmail.com> wrote:

2009/7/21 Ricky Clarkson <ricky.clarkson@gmail.com>
> This is very intuitive to understand, but I'm not sure it could ever
> be faster than doing it with a mutable structure.

It can be, actually. The JVM is especially good at dealing with large
numbers of small very short-lived objects.

Oh boy, if creating a large number of short-lived immutable objects is actually faster than creating a single mutable one, this must be Chuck Norris' JVM. :)
(I'm all in for agreeing that such objects can be allocated and reclaimed very quickly, of course, but the above sounded a tad stronger... )
Dimitris

Tue, 2009-07-21, 18:17

#15

Gabriel Claramunt 2

Joined: 2009-07-21,

Re: Okasaki: Purely Functional data Structures

On Tue, Jul 21, 2009 at 12:17 PM, Jim Andreou wrote:
> 2009/7/21 Ricky Clarkson
>>
>> > This is very intuitive to understand, but I'm not sure it could ever
>> > be faster than doing it with a mutable structure.
>>
>> It can be, actually. The JVM is especially good at dealing with large
>> numbers of small very short-lived objects.
>
> Oh boy, if creating a large number of short-lived immutable objects is
> actually faster than creating a single mutable one, this must be Chuck
> Norris' JVM. :)
> (I'm all in for agreeing that such objects can be allocated and reclaimed
> very quickly, of course, but the above sounded a tad stronger... )
> Dimitris

I remember Martin Odersky talking about that in a Java Posse interview.
Having long lived mutable objects actually hurts GC performance.
http://www.ibm.com/developerworks/java/library/j-jtp11253/

Tue, 2009-07-21, 18:47

#16

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

2009/7/21 Gabriel Claramunt <gabriel.claramunt@gmail.com>

On Tue, Jul 21, 2009 at 12:17 PM, Jim Andreou<jim.andreou@gmail.com> wrote:
> 2009/7/21 Ricky Clarkson <ricky.clarkson@gmail.com>
>>
>> > This is very intuitive to understand, but I'm not sure it could ever
>> > be faster than doing it with a mutable structure.
>>
>> It can be, actually. The JVM is especially good at dealing with large
>> numbers of small very short-lived objects.
>
> Oh boy, if creating a large number of short-lived immutable objects is
> actually faster than creating a single mutable one, this must be Chuck
> Norris' JVM. :)
> (I'm all in for agreeing that such objects can be allocated and reclaimed
> very quickly, of course, but the above sounded a tad stronger... )
> Dimitris

I remember Martin Odersky talking about that in a Java Posse interview.
Having long lived mutable objects actually hurts GC performance.
http://www.ibm.com/developerworks/java/library/j-jtp11253/

I think it doesn't take a Martin Odersky though to realize that no matter how blazingly fast creating and throwing away an object may be, *not* creating it has to be faster. Nobody talks about object pools which have been beaten to death many years ago. But if you need X amount of updatable memory, as the simulation example given previously, I would be very interested in seeing an actual benchmark that shows that constantly copying this and throwing the old away can still be *faster* than updating in place. Fast enough? Sure. Faster? Hmmm...

Tue, 2009-07-21, 19:07

#17

milessabin

Joined: 2008-08-11,

Re: Okasaki: Purely Functional data Structures

On Tue, Jul 21, 2009 at 6:41 PM, Jim Andreou wrote:
> I think it doesn't take a Martin Odersky though to realize that no matter
> how blazingly fast creating and throwing away an object may be, *not*
> creating it has to be faster.

If the alternative is no allocation at all then, yes, clearly. But if
the alternative is stack allocation (either directly or via inlining),
then it's not as clearcut as you might think,

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.8219

Cheers,

Miles

Tue, 2009-07-21, 19:27

#18

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

2009/7/21 Miles Sabin :
> On Tue, Jul 21, 2009 at 6:41 PM, Jim Andreou wrote:
>> I think it doesn't take a Martin Odersky though to realize that no matter
>> how blazingly fast creating and throwing away an object may be, *not*
>> creating it has to be faster.
>
> If the alternative is no allocation at all then, yes, clearly. But if
> the alternative is stack allocation (either directly or via inlining),
> then it's not as clearcut as you might think,
>
> http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.39.8219
>
> Cheers,
>
>
> Miles
>

Quite nice read from the 80ies, although I don't immediately see its
relevance if we restrict the discussion just on current JVMs.

Lets make it concrete and less hand-wavy. For the simulation scenario,
I would just allocate an array of objects, or several arrays of
primitives (as much state as needed), and would update this memory in
place for each iteration. If someone shows that copying each and every
cell in every iteration will be faster than just updating the values
in place, I'll never use mutables again.

Tue, 2009-07-21, 19:47

#19

Alex Cruise

Joined: 2008-12-17,

Re: Okasaki: Purely Functional data Structures

Jim Andreou wrote:
> Lets make it concrete and less hand-wavy. For the simulation scenario,
> I would just allocate an array of objects, or several arrays of
> primitives (as much state as needed), and would update this memory in
> place for each iteration. If someone shows that copying each and every
> cell in every iteration will be faster than just updating the values
> in place, I'll never use mutables again.
>
One of the most important attributes of persistent data structures is
that they don't need to copy *everything*, only those nodes that
constitute the path from the root(s) to the changed cell(s).

-0xe1a

Tue, 2009-07-21, 19:47

#20

Naftoli Gugenheim

Joined: 2008-12-17,

Re: Okasaki: Purely Functional data Structures

-------------------------------------
Alex Cruise wrote:

Jim Andreou wrote:
> Lets make it concrete and less hand-wavy. For the simulation scenario,
> I would just allocate an array of objects, or several arrays of
> primitives (as much state as needed), and would update this memory in
> place for each iteration. If someone shows that copying each and every
> cell in every iteration will be faster than just updating the values
> in place, I'll never use mutables again.
>
One of the most important attributes of persistent data structures is
that they don't need to copy *everything*, only those nodes that
constitute the path from the root(s) to the changed cell(s).

-0xe1a

Tue, 2009-07-21, 19:57

#21

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

Duh, never mind, I give up, it's pointless. Sorry for any bothering at all.

2009/7/21 Jim Andreou <jim.andreou@gmail.com>

2009/7/21 Alex Cruise <alex@cluonflux.com>
Jim Andreou wrote:

Lets make it concrete and less hand-wavy. For the simulation scenario,
I would just allocate an array of objects, or several arrays of
primitives (as much state as needed), and would update this memory in
place for each iteration. If someone shows that copying each and every
cell in every iteration will be faster than just updating the values
in place, I'll never use mutables again.

One of the most important attributes of persistent data structures is that they don't need to copy *everything*, only those nodes that constitute the path from the root(s) to the changed cell(s).

-0xe1a

Can you implement a game of life with a persistent data structure that, everything else being equal, will be faster than an implementation using a mutable array? Please demonstrate.

Tue, 2009-07-21, 20:07

#22

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

2009/7/21 Alex Cruise <alex@cluonflux.com>

Jim Andreou wrote:

Lets make it concrete and less hand-wavy. For the simulation scenario,
I would just allocate an array of objects, or several arrays of
primitives (as much state as needed), and would update this memory in
place for each iteration. If someone shows that copying each and every
cell in every iteration will be faster than just updating the values
in place, I'll never use mutables again.

One of the most important attributes of persistent data structures is that they don't need to copy *everything*, only those nodes that constitute the path from the root(s) to the changed cell(s).

-0xe1a

Can you implement a game of life with a persistent data structure that, everything else being equal, will be faster than an implementation using a mutable array? Please demonstrate.

Tue, 2009-07-21, 20:07

#23

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

2009/7/21 Alex Cruise <alex@cluonflux.com>

Jim Andreou wrote:

2009/7/21 Alex Cruise <alex@cluonflux.com <mailto:alex@cluonflux.com>>

One of the most important attributes of persistent data structures
is that they don't need to copy *everything*, only those nodes
that constitute the path from the root(s) to the changed cell(s).

Can you implement a game of life with a persistent data structure that, everything else being equal, will be faster than an implementation using a mutable array? Please demonstrate.

I wouldn't claim that the persistent version is *faster*, just that it can bring the many well-known benefits of immutable data while not necessarily being *as inefficient* as one might imagine. :)

-0xe1a

Then why you feel the urge to reiterate basic facts of persistent data structures in a discussion that was about the truth or not of a claim stating it is possible to implement such a thing with immutable objects and still be *faster* than with mutable ones? To make me give up and go away, that's why! :D

Tue, 2009-07-21, 20:17

#24

Alex Cruise

Joined: 2008-12-17,

Re: Okasaki: Purely Functional data Structures

Jim Andreou wrote:
> 2009/7/21 Alex Cruise >
>
> One of the most important attributes of persistent data structures
> is that they don't need to copy *everything*, only those nodes
> that constitute the path from the root(s) to the changed cell(s).
>
> Can you implement a game of life with a persistent data structure
> that, everything else being equal, will be faster than an
> implementation using a mutable array? Please demonstrate.
I wouldn't claim that the persistent version is *faster*, just that it
can bring the many well-known benefits of immutable data while not
necessarily being *as inefficient* as one might imagine. :)

-0xe1a

Tue, 2009-07-21, 20:17

#25

Alex Cruise

Joined: 2008-12-17,

Re: Okasaki: Purely Functional data Structures

Jim Andreou wrote:
> 2009/7/21 Alex Cruise >
>
> I wouldn't claim that the persistent version is *faster*, just
> that it can bring the many well-known benefits of immutable data
> while not necessarily being *as inefficient* as one might imagine. :)
>
> Then why you feel the urge to reiterate basic facts of persistent data
> structures in a discussion that was about the truth or not of a claim
> stating it is possible to implement such a thing with immutable
> objects and still be *faster* than with mutable ones? To make me give
> up and go away, that's why! :D
Whoops, sorry, I'm trying to do too many things at once, and thoroughly
reading this thread is apparently not one of them :/

In a doomed attempt to turn my contribution to this thread into a net
positive, I'll suggest that this would be a great use case for the
Computer Language Benchmarks Game.

We'd need a nice textual encoding for board states, and the benchmark
driver data might consist of a list of pairs of (initial state, #
iterations, expected end state). I've seen a few Haskell
implementations floating around...

-0xe1a

Tue, 2009-07-21, 20:37

#26

Randall R Schulz

Joined: 2008-12-16,

Re: Okasaki: Purely Functional data Structures

On Tuesday July 21 2009, Jim Andreou wrote:
> 2009/7/21 Alex Cruise
>
> > Jim Andreou wrote:
> >> 2009/7/21 Alex Cruise >> >
> >>
> >> One of the most important attributes of persistent data
> >> structures is that they don't need to copy *everything*, only
> >> those nodes that constitute the path from the root(s) to the
> >> changed cell(s).
> >>
> >> Can you implement a game of life with a persistent data structure
> >> that, everything else being equal, will be faster than an
> >> implementation using a mutable array? Please demonstrate.
> >
> > I wouldn't claim that the persistent version is *faster*, just that
> > it can bring the many well-known benefits of immutable data while
> > not necessarily being *as inefficient* as one might imagine. :)
> >
> > -0xe1a
>
> Then why you feel the urge to reiterate basic facts of persistent
> data structures in a discussion that was about the truth or not of a
> claim stating it is possible to implement such a thing with immutable
> objects and still be *faster* than with mutable ones? To make me give
> up and go away, that's why! :D

If you have only one criterion, make it as fast as possible (or make it
as small as possible) then you write in assembly. This is real-world
softare engineering we're talking about and there are many competing
criteria. Execution speed isn't always even the prime criterion.

Furthermore, I think few people are claiming that with today's language
and data structure technologies that every possible application is
going to be best programmed using a purely functional language or with
purely functional techniques in a "conventional" language.

On the other hand, the performance situation, even given the limitations
of today's languages, data structure techniques and libraries, is
usually not as bad as it seems to a programmer who is adept at
object-oriented programming when they first encounter functional
programming.

So keep and open mind and give it a chance. And remember, with Scala
it's eminently feasible to choose either immutable or mutable
representations at a fairly fine granularity. I've been using it less
than a year, now, and I think it represents quite a good "sweet spot"
in terms of the multitude of criteria facing language designers as well
as programmers choosing a language for a project.

Randall Schulz

Tue, 2009-07-21, 20:57

#27

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

As I can't claim that performance is always a critical factor, one
can't claim that it never is. Scanning through the scala libraries
source for "var" definitions, and mailing the authors the suggestion
that they should be doing assembly instead, just because they cared
about performance, would be a fun thing to do :) (duh, I just spoiled
an excellent April fool's day joke!)

2009/7/21 Randall R Schulz :
> If you have only one criterion, make it as fast as possible (or make it
> as small as possible) then you write in assembly. This is real-world
> softare engineering we're talking about and there are many competing
> criteria. Execution speed isn't always even the prime criterion.
>
> Furthermore, I think few people are claiming that with today's language
> and data structure technologies that every possible application is
> going to be best programmed using a purely functional language or with
> purely functional techniques in a "conventional" language.
>
> On the other hand, the performance situation, even given the limitations
> of today's languages, data structure techniques and libraries, is
> usually not as bad as it seems to a programmer who is adept at
> object-oriented programming when they first encounter functional
> programming.
>
> So keep and open mind and give it a chance. And remember, with Scala
> it's eminently feasible to choose either immutable or mutable
> representations at a fairly fine granularity. I've been using it less
> than a year, now, and I think it represents quite a good "sweet spot"
> in terms of the multitude of criteria facing language designers as well
> as programmers choosing a language for a project.
>
>
> Randall Schulz
>

Tue, 2009-07-21, 21:37

#28

Ricky Clarkson

Joined: 2008-12-19,

Re: Okasaki: Purely Functional data Structures

I did some extensive benchmarking of a few approaches to decoding
JPEGs a few months back. And yes, I know how to benchmark on the JVM.
A result that surprised my boss was that allocating a single
ByteBuffer and reusing it for many decodes was noticeably slower than
allocating it as late as possible (and hence repeatedly). This
doesn't directly relate to mutable/immutable but I'm sure you can
extrapolate. I never dug deeper but I expect cache locality to be a
large factor in that.

If I have a point in this thread, it's that sacrificing readability
for performance might gain you neither.

2009/7/21 Jim Andreou :
> 2009/7/21 Ricky Clarkson
>>
>> > This is very intuitive to understand, but I'm not sure it could ever
>> > be faster than doing it with a mutable structure.
>>
>> It can be, actually. The JVM is especially good at dealing with large
>> numbers of small very short-lived objects.
>
> Oh boy, if creating a large number of short-lived immutable objects is
> actually faster than creating a single mutable one, this must be Chuck
> Norris' JVM. :)
> (I'm all in for agreeing that such objects can be allocated and reclaimed
> very quickly, of course, but the above sounded a tad stronger... )
> Dimitris

Tue, 2009-07-21, 21:47

#29

Robert Fischer

Joined: 2009-01-31,

Re: Okasaki: Purely Functional data Structures

These kinds of results are often accounted for by the JVM's generational garbage collector strongly
preferring minor GCs vs. major GCs, so the more you can keep object roots in the younger
generation/eden, the better off you are.

Note escape analysis/G1 garbage collector (standard in Java7, accessible in recent 1.6 builds) makes
this difference even more huge.

~~ Robert.

Ricky Clarkson wrote:
> I did some extensive benchmarking of a few approaches to decoding
> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
> A result that surprised my boss was that allocating a single
> ByteBuffer and reusing it for many decodes was noticeably slower than
> allocating it as late as possible (and hence repeatedly). This
> doesn't directly relate to mutable/immutable but I'm sure you can
> extrapolate. I never dug deeper but I expect cache locality to be a
> large factor in that.
>
> If I have a point in this thread, it's that sacrificing readability
> for performance might gain you neither.
>
> 2009/7/21 Jim Andreou :
>> 2009/7/21 Ricky Clarkson
>>>> This is very intuitive to understand, but I'm not sure it could ever
>>>> be faster than doing it with a mutable structure.
>>> It can be, actually. The JVM is especially good at dealing with large
>>> numbers of small very short-lived objects.
>> Oh boy, if creating a large number of short-lived immutable objects is
>> actually faster than creating a single mutable one, this must be Chuck
>> Norris' JVM. :)
>> (I'm all in for agreeing that such objects can be allocated and reclaimed
>> very quickly, of course, but the above sounded a tad stronger... )
>> Dimitris
>
>
>

Tue, 2009-07-21, 22:57

#30

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

If you depended on the existing data (which you would in a simulation,
which was the example), then the option to create a new version of the
data to stay immutable couldn't win: copying would make the old
version cache-resident too, paying in the process the cost of any
cache-miss of the old instance, which is what you would pay anyway to
use it and modify it directly (while the latter would avoid copying
the dataset every time).

In your case, the old ByteBuffer had irrelevant data in it w.r.t. the
current operation (or else you wouldn't have the luxury to throw away
the old one without copying it first), so the only thing that could be
gained is allocation and reclamation, both very fast, i.e. that would
"optimize" the entirely wrong (the very fast) thing.

The latter is similar to the known bad practice, somewhat common about
a decade ago, to reuse StringBuffer instances for various (irrelevant)
string operations, which ended up being slower than simply new'ing a
StringBuffer instead.

So my point in this thread would be: don't dogmatize in any direction,
train yourself and your wits to make better decisions on what to
choose when - not merely making choices because something looks like
the new trend or hype.

2009/7/21 Ricky Clarkson :
> I did some extensive benchmarking of a few approaches to decoding
> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
> A result that surprised my boss was that allocating a single
> ByteBuffer and reusing it for many decodes was noticeably slower than
> allocating it as late as possible (and hence repeatedly). This
> doesn't directly relate to mutable/immutable but I'm sure you can
> extrapolate. I never dug deeper but I expect cache locality to be a
> large factor in that.
>
> If I have a point in this thread, it's that sacrificing readability
> for performance might gain you neither.
>
> 2009/7/21 Jim Andreou :
>> 2009/7/21 Ricky Clarkson
>>>
>>> > This is very intuitive to understand, but I'm not sure it could ever
>>> > be faster than doing it with a mutable structure.
>>>
>>> It can be, actually. The JVM is especially good at dealing with large
>>> numbers of small very short-lived objects.
>>
>> Oh boy, if creating a large number of short-lived immutable objects is
>> actually faster than creating a single mutable one, this must be Chuck
>> Norris' JVM. :)
>> (I'm all in for agreeing that such objects can be allocated and reclaimed
>> very quickly, of course, but the above sounded a tad stronger... )
>> Dimitris
>
>
>
> --
> Ricky Clarkson
> Java Programmer, AD Holdings
> +44 1565 770804
> Skype: ricky_clarkson
> Google Talk: ricky.clarkson@gmail.com
>

Tue, 2009-07-21, 23:17

#31

Ricky Clarkson

Joined: 2008-12-19,

Re: Okasaki: Purely Functional data Structures

I suggest you have a look at zippers (google: zipper functional
programming). I can't suggest much more than that as my own
experience of performance problems is negligible.

My observation that copying is not always problematic is not dogma.

2009/7/21 Jim Andreou :
> If you depended on the existing data (which you would in a simulation,
> which was the example), then the option to create a new version of the
> data to stay immutable couldn't win: copying would make the old
> version cache-resident too, paying in the process the cost of any
> cache-miss of the old instance, which is what you would pay anyway to
> use it and modify it directly (while the latter would avoid copying
> the dataset every time).
>
> In your case, the old ByteBuffer had irrelevant data in it w.r.t. the
> current operation (or else you wouldn't have the luxury to throw away
> the old one without copying it first), so the only thing that could be
> gained is allocation and reclamation, both very fast, i.e. that would
> "optimize" the entirely wrong (the very fast) thing.
>
> The latter is similar to the known bad practice, somewhat common about
> a decade ago, to reuse StringBuffer instances for various (irrelevant)
> string operations, which ended up being slower than simply new'ing a
> StringBuffer instead.
>
> So my point in this thread would be: don't dogmatize in any direction,
> train yourself and your wits to make better decisions on what to
> choose when - not merely making choices because something looks like
> the new trend or hype.
>
> 2009/7/21 Ricky Clarkson :
>> I did some extensive benchmarking of a few approaches to decoding
>> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
>> A result that surprised my boss was that allocating a single
>> ByteBuffer and reusing it for many decodes was noticeably slower than
>> allocating it as late as possible (and hence repeatedly). This
>> doesn't directly relate to mutable/immutable but I'm sure you can
>> extrapolate. I never dug deeper but I expect cache locality to be a
>> large factor in that.
>>
>> If I have a point in this thread, it's that sacrificing readability
>> for performance might gain you neither.
>>
>> 2009/7/21 Jim Andreou :
>>> 2009/7/21 Ricky Clarkson
>>>>
>>>> > This is very intuitive to understand, but I'm not sure it could ever
>>>> > be faster than doing it with a mutable structure.
>>>>
>>>> It can be, actually. The JVM is especially good at dealing with large
>>>> numbers of small very short-lived objects.
>>>
>>> Oh boy, if creating a large number of short-lived immutable objects is
>>> actually faster than creating a single mutable one, this must be Chuck
>>> Norris' JVM. :)
>>> (I'm all in for agreeing that such objects can be allocated and reclaimed
>>> very quickly, of course, but the above sounded a tad stronger... )
>>> Dimitris
>>
>>
>>
>> --
>> Ricky Clarkson
>> Java Programmer, AD Holdings
>> +44 1565 770804
>> Skype: ricky_clarkson
>> Google Talk: ricky.clarkson@gmail.com
>>
>

Tue, 2009-07-21, 23:27

#32

Randall R Schulz

Joined: 2008-12-16,

Re: Okasaki: Purely Functional data Structures

On Tuesday July 21 2009, Jim Andreou wrote:
> ...
>
> So my point in this thread would be: don't dogmatize in any
> direction, train yourself and your wits to make better decisions on
> what to choose when - not merely making choices because something
> looks like the new trend or hype.

Really? Because it sounds like you've made several absolute assumptions
(about performance characteristics and programming techniques) and then
argued that functional practices cannot possibly compete with the
alternatives. You also seem to assume that Scala is to be used or is
being used only in a strictly functional style (or that such use is the
only one advocated or sanctioned), when the clear fact is that Scala
expressly incorporates constructs from both OO and FP programming
models.

And while there are those (in the Scala world) who are so well versed in
functional programming idioms and the algorithms and data structures
that support them that they have functional solutions to virtually
every problem at their fingertips, they are a pretty small minority, I
think. But apart from having trouble deciphering some of their
suggested solutions, being as they often are heavily laden with
parametric, higher-order and / or higher-kinded constructions as well
as being described using an argot I scarcely recognize, I don't have
any reason to believe they're those recommendations are inappropriate
or, for the most part, purely pedantic. ... Well, maybe there's a
little pedanticism about...

To exactly what hype or trend do you see Scala users succumbing? From
what I can tell, most everyone has a suitably skeptical eye and open
mind about how we might best exploit a new language that has opened up
options that were either flatly foreclosed or exceedingly tedious and
difficult to explore with languages that resided entirely in either the
object-oriented or the functional realms.

I have to wonder: What is your interest in Scala? Are you here to save
us from ourselves?

> ...

Randall Schulz

Wed, 2009-07-22, 00:07

#33

Bastian, Mark

Joined: 2009-01-16,

Re: Okasaki: Purely Functional data Structures

Re: [scala-user] Okasaki: Purely Functional data Structures Well, my original post was simply to state that many of us trained in OOP would like to see more simple examples of functional programming and the reply I got to my sample was quite appropriate (Thanks, Ricky!). I can appreciate the succinctness of the resulting code.

Regarding the flame war going on, Bloch in “Effective Java” spends an entire section on favoring small, immutable objects (Item #13) and I’ve always tried to stick to that principle where possible (even in an OO world). I’ve never seen it bite me when it comes to performance.

I am guessing that whichever technique is faster is generally only going to be incrementally faster. That being the case, I’ve always liked the other advantages of small, immutable objects (Thread safety and clarity being two big ones). I think clear, easy to read code will be maintainable years in the future and Moore’s law will take care of any performance gaps.

Thanks,
Mark

Wed, 2009-07-22, 00:27

#34

ounos

Joined: 2008-12-29,

Re: Okasaki: Purely Functional data Structures

Randall, this is distorting what I said, unfortunately. My only claim
is that sometimes mutable is the way to go, such in the simulations
example provided. This: "then argued that functional practices cannot
possibly compete with the alternatives" is clearly out of proportion.
Of course there are plenty of cases where this is not the case, and
where persistent data structures offer the best performance over
alternatives. I only concentrated on a more-or-less specific and
restricted case. For favoring mutability in this case, you practically
instructed to do assembly instead.

On the hype issue, this is what I observe, various cases where
newcomers are instructed to try functional approaches to problems
clearly not suited well to it (even for swing programming, few months
ago, and this thread is another example as well). Or using persistent
structures in most of the cases, even though in most of the cases only
one version of them is needed, and similar cases. You can argue that
this is just me, which I could accept.

2009/7/22 Randall R Schulz :
> On Tuesday July 21 2009, Jim Andreou wrote:
>> ...
>>
>> So my point in this thread would be: don't dogmatize in any
>> direction, train yourself and your wits to make better decisions on
>> what to choose when - not merely making choices because something
>> looks like the new trend or hype.
>
> Really? Because it sounds like you've made several absolute assumptions
> (about performance characteristics and programming techniques) and then
> argued that functional practices cannot possibly compete with the
> alternatives. You also seem to assume that Scala is to be used or is
> being used only in a strictly functional style (or that such use is the
> only one advocated or sanctioned), when the clear fact is that Scala
> expressly incorporates constructs from both OO and FP programming
> models.
>
> And while there are those (in the Scala world) who are so well versed in
> functional programming idioms and the algorithms and data structures
> that support them that they have functional solutions to virtually
> every problem at their fingertips, they are a pretty small minority, I
> think. But apart from having trouble deciphering some of their
> suggested solutions, being as they often are heavily laden with
> parametric, higher-order and / or higher-kinded constructions as well
> as being described using an argot I scarcely recognize, I don't have
> any reason to believe they're those recommendations are inappropriate
> or, for the most part, purely pedantic. ... Well, maybe there's a
> little pedanticism about...
>
>
> To exactly what hype or trend do you see Scala users succumbing? From
> what I can tell, most everyone has a suitably skeptical eye and open
> mind about how we might best exploit a new language that has opened up
> options that were either flatly foreclosed or exceedingly tedious and
> difficult to explore with languages that resided entirely in either the
> object-oriented or the functional realms.
>
> I have to wonder: What is your interest in Scala? Are you here to save
> us from ourselves?
>
>
>> ...
>
>
> Randall Schulz
>

Wed, 2009-07-22, 08:57

#35

ebowman

Joined: 2009-04-13,

Re: Okasaki: Purely Functional data Structures

Ricky Clarkson wrote:
> I did some extensive benchmarking of a few approaches to decoding
> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
> A result that surprised my boss was that allocating a single
> ByteBuffer and reusing it for many decodes was noticeably slower than
> allocating it as late as possible (and hence repeatedly). This
> doesn't directly relate to mutable/immutable but I'm sure you can
> extrapolate. I never dug deeper but I expect cache locality to be a
> large factor in that.
>
> If I have a point in this thread, it's that sacrificing readability
> for performance might gain you neither.
>

I've always believed and coded to this notion, but recently it burned
me. Perhaps this is interesting (or, perhaps not... :).

The application in question does offline processing of application
logs. It is massively parallelized, and somewhat performance critical
... if it takes more than n hours to process n hours worth of logs, all
is lost.

I recently added a new feature, where I was taking a string parsed from
a log line, and wrapping a very thin domain class around it. All the
domain class does, really, is keep a reference to the string. There are
only a couple dozen possible values for this string.

This simple change absolutely killed performance ... it practically
doubled the execution time. I changed it to look up the domain object
from a hashmap, and the problem was solved.

In fairness, I didn't take the time to try to tune the GC; I suppose I
might have been able to tune my way around it (this is on JDK 1.6.0.14
by the way). But when I wrote the code, part of me felt it was very
wasteful creating this very short-lived objects, and that part was ruled
out by the "lots of short-lived objects are cheap" claims, but in this
case it simply wasn't true.

So anyhow, it's difficult to find truly universal rules. And the fact
that things tend to change pretty dramatically from JVM version to JVM
version further complicates things.

cheers,
Eric

Wed, 2009-07-22, 10:37

#36

Ricky Clarkson

Joined: 2008-12-19,

Re: Okasaki: Purely Functional data Structures

Sure, hence the word 'might'. I'd guess in anything of that scale
there'd be a bigger change you could make to gain performance (e.g.,
using memory-mapped files, splitting the logs by those 12 separate
strings first, etc.).

A positive from this thread might be a feature request for Scala:
Value types. The JVM doesn't have them, other than the primitive
types, but that wouldn't have to stop a clever compiler from providing
them, and hence type-checking between them.

value class Point(x: Int, y: Int)
def abs(p: Point) = sqrt(p.x * p.x + p.y * p.y)

The def would desugar to:
def abs(p_x: Int, p_y: Int) = sqrt(p_x * p_x + p_y * p_y)

and abs(Point(3, 4)) would desugar to abs(3, 4)

2009/7/22 Eric Bowman :
> Ricky Clarkson wrote:
>>
>> I did some extensive benchmarking of a few approaches to decoding
>> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
>> A result that surprised my boss was that allocating a single
>> ByteBuffer and reusing it for many decodes was noticeably slower than
>> allocating it as late as possible (and hence repeatedly). This
>> doesn't directly relate to mutable/immutable but I'm sure you can
>> extrapolate. I never dug deeper but I expect cache locality to be a
>> large factor in that.
>>
>> If I have a point in this thread, it's that sacrificing readability
>> for performance might gain you neither.
>>
>
> I've always believed and coded to this notion, but recently it burned me.
> Perhaps this is interesting (or, perhaps not... :).
>
> The application in question does offline processing of application logs. It
> is massively parallelized, and somewhat performance critical ... if it takes
> more than n hours to process n hours worth of logs, all is lost.
>
> I recently added a new feature, where I was taking a string parsed from a
> log line, and wrapping a very thin domain class around it. All the domain
> class does, really, is keep a reference to the string. There are only a
> couple dozen possible values for this string.
>
> This simple change absolutely killed performance ... it practically doubled
> the execution time. I changed it to look up the domain object from a
> hashmap, and the problem was solved.
>
> In fairness, I didn't take the time to try to tune the GC; I suppose I might
> have been able to tune my way around it (this is on JDK 1.6.0.14 by the
> way). But when I wrote the code, part of me felt it was very wasteful
> creating this very short-lived objects, and that part was ruled out by the
> "lots of short-lived objects are cheap" claims, but in this case it simply
> wasn't true.
>
> So anyhow, it's difficult to find truly universal rules. And the fact that
> things tend to change pretty dramatically from JVM version to JVM version
> further complicates things.
>
> cheers,
> Eric
>

Wed, 2009-07-22, 10:57

#37

Kevin Wright

Joined: 2009-06-09,

Re: Okasaki: Purely Functional data Structures

There's an interesting approach to this considered here:
http://blogs.sun.com/jrose/entry/tuples_in_the_vm

(which is part of the da-vinci machine: http://openjdk.java.net/projects/mlvm/subprojects.html)

But I'm concerned that this sort of thing could premature and should properly be the responsibility of the JVM, with classes being defined as normal but getting inlined during runtime optimization.

On Wed, Jul 22, 2009 at 10:18 AM, Ricky Clarkson <ricky.clarkson@gmail.com> wrote:

Sure, hence the word 'might'. I'd guess in anything of that scale
there'd be a bigger change you could make to gain performance (e.g.,
using memory-mapped files, splitting the logs by those 12 separate
strings first, etc.).

A positive from this thread might be a feature request for Scala:
Value types. The JVM doesn't have them, other than the primitive
types, but that wouldn't have to stop a clever compiler from providing
them, and hence type-checking between them.

value class Point(x: Int, y: Int)
def abs(p: Point) = sqrt(p.x * p.x + p.y * p.y)

The def would desugar to:
def abs(p_x: Int, p_y: Int) = sqrt(p_x * p_x + p_y * p_y)

and abs(Point(3, 4)) would desugar to abs(3, 4)

2009/7/22 Eric Bowman <ebowman@boboco.ie>:
> Ricky Clarkson wrote:
>>
>> I did some extensive benchmarking of a few approaches to decoding
>> JPEGs a few months back. And yes, I know how to benchmark on the JVM.
>> A result that surprised my boss was that allocating a single
>> ByteBuffer and reusing it for many decodes was noticeably slower than
>> allocating it as late as possible (and hence repeatedly). This
>> doesn't directly relate to mutable/immutable but I'm sure you can
>> extrapolate. I never dug deeper but I expect cache locality to be a
>> large factor in that.
>>
>> If I have a point in this thread, it's that sacrificing readability
>> for performance might gain you neither.
>>
>
> I've always believed and coded to this notion, but recently it burned me.
> Perhaps this is interesting (or, perhaps not... :).
>
> The application in question does offline processing of application logs. It
> is massively parallelized, and somewhat performance critical ... if it takes
> more than n hours to process n hours worth of logs, all is lost.
>
> I recently added a new feature, where I was taking a string parsed from a
> log line, and wrapping a very thin domain class around it. All the domain
> class does, really, is keep a reference to the string. There are only a
> couple dozen possible values for this string.
>
> This simple change absolutely killed performance ... it practically doubled
> the execution time. I changed it to look up the domain object from a
> hashmap, and the problem was solved.
>
> In fairness, I didn't take the time to try to tune the GC; I suppose I might
> have been able to tune my way around it (this is on JDK 1.6.0.14 by the
> way). But when I wrote the code, part of me felt it was very wasteful
> creating this very short-lived objects, and that part was ruled out by the
> "lots of short-lived objects are cheap" claims, but in this case it simply
> wasn't true.
>
> So anyhow, it's difficult to find truly universal rules. And the fact that
> things tend to change pretty dramatically from JVM version to JVM version
> further complicates things.
>
> cheers,
> Eric
>

--
Ricky Clarkson
Java Programmer, AD Holdings
+44 1565 770804
Skype: ricky_clarkson
Google Talk: ricky.clarkson@gmail.com

Wed, 2009-07-22, 11:07

#38

ebowman

Joined: 2009-04-13,

Re: Okasaki: Purely Functional data Structures

Ricky Clarkson wrote:
> Sure, hence the word 'might'. I'd guess in anything of that scale
> there'd be a bigger change you could make to gain performance (e.g.,
> using memory-mapped files, splitting the logs by those 12 separate
> strings first, etc.).
>

This application has already had massive optimizations applied, and it's
blindingly fast. It has a nice mechanism to add new filters to the
incoming stream of log messages that are parsed, and we regularly add
new features. It would be unfortunate to need to tune it for one tiny
feature (and that tuning could have adverse affects on other parts of
the system). I could have done a better job of explaining the context,
I suppose.

Anyhow the core message here was that creating these incredibly
lightweight, incredibly short-lived objects absolutely killed
performance. Making them long-lasting brought back the performance we
had worked hard to achieve. Eyebrows were raised. "We are not idiots." ;)

> A positive from this thread might be a feature request for Scala:
> Value types. The JVM doesn't have them, other than the primitive
> types, but that wouldn't have to stop a clever compiler from providing
> them, and hence type-checking between them.
>
> value class Point(x: Int, y: Int)
> def abs(p: Point) = sqrt(p.x * p.x + p.y * p.y)
>
> The def would desugar to:
> def abs(p_x: Int, p_y: Int) = sqrt(p_x * p_x + p_y * p_y)
>
> and abs(Point(3, 4)) would desugar to abs(3, 4)
>

That would indeed be useful thing.

Thu, 2009-09-24, 16:57

#39

Chris Lewis

Joined: 2009-04-08,

Re: Okasaki: Purely Functional data Structures

A likely factor in this case, and cases like it, is the rate at which
new objects are created, as opposed to simply the number of new objects
created.

Eric Bowman wrote:
> Ricky Clarkson wrote:
>> Sure, hence the word 'might'. I'd guess in anything of that scale
>> there'd be a bigger change you could make to gain performance (e.g.,
>> using memory-mapped files, splitting the logs by those 12 separate
>> strings first, etc.).
>>
>
> This application has already had massive optimizations applied, and
> it's blindingly fast. It has a nice mechanism to add new filters to
> the incoming stream of log messages that are parsed, and we regularly
> add new features. It would be unfortunate to need to tune it for one
> tiny feature (and that tuning could have adverse affects on other
> parts of the system). I could have done a better job of explaining
> the context, I suppose.
>
> Anyhow the core message here was that creating these incredibly
> lightweight, incredibly short-lived objects absolutely killed
> performance. Making them long-lasting brought back the performance we
> had worked hard to achieve. Eyebrows were raised. "We are not
> idiots." ;)
>
>> A positive from this thread might be a feature request for Scala:
>> Value types. The JVM doesn't have them, other than the primitive
>> types, but that wouldn't have to stop a clever compiler from providing
>> them, and hence type-checking between them.
>>
>> value class Point(x: Int, y: Int)
>> def abs(p: Point) = sqrt(p.x * p.x + p.y * p.y)
>>
>> The def would desugar to:
>> def abs(p_x: Int, p_y: Int) = sqrt(p_x * p_x + p_y * p_y)
>>
>> and abs(Point(3, 4)) would desugar to abs(3, 4)
>>
>
> That would indeed be useful thing.
>

Fri, 2009-09-25, 00:17

#40

Robert Fischer

Joined: 2009-01-31,

Re: Okasaki: Purely Functional data Structures

I know that object caches are oh-so-passe, but there are times when a small one can pull load off of
the garbage collector. Back in Javaland, I've seen significant performance improvements on the 1.5
Sun JVM by using a NodeCachingLinkedList (from Commons-Collections) instead of a LinkedList for a
FIFO queue. The cache doesn't even have to be particularly large: just a handful of extra nodes can
pull some pressure off the GC.

~~ Robert Fischer, Smokejumper IT Consulting.
Enfranchised Mind Blog http://EnfranchisedMind.com/blog

Check out my book, "Grails Persistence with GORM and GSQL"!
http://www.smokejumperit.com/gormbook

Chris Lewis wrote:
> A likely factor in this case, and cases like it, is the rate at which
> new objects are created, as opposed to simply the number of new objects
> created.
>
> Eric Bowman wrote:
>> Ricky Clarkson wrote:
>>> Sure, hence the word 'might'. I'd guess in anything of that scale
>>> there'd be a bigger change you could make to gain performance (e.g.,
>>> using memory-mapped files, splitting the logs by those 12 separate
>>> strings first, etc.).
>>>
>>
>> This application has already had massive optimizations applied, and
>> it's blindingly fast. It has a nice mechanism to add new filters to
>> the incoming stream of log messages that are parsed, and we regularly
>> add new features. It would be unfortunate to need to tune it for one
>> tiny feature (and that tuning could have adverse affects on other
>> parts of the system). I could have done a better job of explaining
>> the context, I suppose.
>>
>> Anyhow the core message here was that creating these incredibly
>> lightweight, incredibly short-lived objects absolutely killed
>> performance. Making them long-lasting brought back the performance we
>> had worked hard to achieve. Eyebrows were raised. "We are not
>> idiots." ;)
>>
>>> A positive from this thread might be a feature request for Scala:
>>> Value types. The JVM doesn't have them, other than the primitive
>>> types, but that wouldn't have to stop a clever compiler from providing
>>> them, and hence type-checking between them.
>>>
>>> value class Point(x: Int, y: Int)
>>> def abs(p: Point) = sqrt(p.x * p.x + p.y * p.y)
>>>
>>> The def would desugar to:
>>> def abs(p_x: Int, p_y: Int) = sqrt(p_x * p_x + p_y * p_y)
>>>
>>> and abs(Point(3, 4)) would desugar to abs(3, 4)
>>>
>>
>> That would indeed be useful thing.
>>
>
>

Scala Main Menu

Scala Quick Links

Featured News

User login