questioning FP
Wed, 2011-10-12, 14:37
#888
Re: Re: questioning FP
Magnificent!
On Wed, Oct 12, 2011 at 1:52 AM, Runar Bjarnason <runarorama@gmail.com> wrote:
On Tuesday, October 11, 2011 6:18:55 PM UTC-4, martin odersky wrote:
>> Maybe "map" is not the abstraction you're looking for, but "traverse":
> I definitively want a one-parameter function like map. Not sure what
> it would take to make traverse into that. What I see so far looks
> complicated, but I can see behind that if one presents the use cases
> well.
Not complicated at all. The pattern is presented in some detail in the paper "The Essence of the Iterator Pattern" by Gibbons and Oliveira:
http://www.comlab.ox.ac.uk/jeremy.gibbons/publications/iterator.pdf
The basic idea is that data of some type F[A] can be traversed with a function of type A => M[B] to produce M[F[B]], where M represents some effect, and supports certain operations.
We have implemented this in Scalaz. For example, here is a use case where the "effect" is none at all. This is just "map":
scala> List(1,2,3).traverse[Id, Int](_ * 2)
res0: List[Int] = List(2, 4, 6)
The "Id" type here is just "type Id[A] = A".
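For readers without Scalaz at hand, the "traverse with Id is just map" claim can be sketched with a hand-rolled type class. The names Applicative (here with pure and map2), idApplicative and traverseList are assumptions made up for this illustration, not the Scalaz definitions, and the instance is passed explicitly rather than implicitly to keep type inference simple.

```scala
object TraverseSketch {
  // Hypothetical minimal type class; NOT the Scalaz definition.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
  }

  type Id[A] = A

  // The "no effect" applicative: a value is just itself.
  val idApplicative: Applicative[Id] = new Applicative[Id] {
    def pure[A](a: A): Id[A] = a
    def map2[A, B, C](ma: Id[A], mb: Id[B])(f: (A, B) => C): Id[C] = f(ma, mb)
  }

  // Walk the list from the right, combining the effect of each element
  // with the effect of the already traversed tail.
  def traverseList[M[_], A, B](as: List[A])(f: A => M[B])(ap: Applicative[M]): M[List[B]] =
    as.foldRight(ap.pure(List.empty[B])) { (a, acc) =>
      ap.map2(f(a), acc)(_ :: _)
    }

  // With Id as the effect, the result type M[List[B]] is simply List[B].
  val doubled: List[Int] = traverseList[Id, Int, Int](List(1, 2, 3))(_ * 2)(idApplicative)
}
```

Because Id[List[Int]] is the same type as List[Int], doubled needs no unwrapping.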
Here is one where the effect is printing to the console:
scala> Stream("a", "b", "c").traverse(putStrLn)
res1: scalaz.effects.IO[scala.collection.immutable.Stream[Unit]] = scalaz.effects.IO$$anon$2@32064883
scala> res1.unsafePerformIO
a
b
c
res2: scala.collection.immutable.Stream[Unit] = Stream((), ?)
There is also a version where the result is discarded, if we only care about the effect:
scala> Stream("a", "b", "c").traverse_(putStrLn)
res3: scalaz.effects.IO[Unit] = scalaz.effects.IO$$anon$2@13641904
scala> res3.unsafePerformIO
a
b
c
Here is an example where the "effect" is that the computation is executed concurrently by a background thread:
scala> List(1,2,3).traverse(x => promise { x * 2 })
res4: scalaz.concurrent.Promise[List[Int]] = <promise>
scala> res4.get
res5: List[Int] = List(2, 4, 6)
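The Scala standard library has a close analogue of this for its own Future type, which may be easier to try than the Scalaz Promise shown above. This is a sketch of the same idea using scala.concurrent.Future.traverse, not the Scalaz API; the object name and the five-second timeout are arbitrary choices for the illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureTraverseSketch {
  // Each element's computation may run on a background thread of the
  // global execution context; the results are gathered into one Future.
  def all: Future[List[Int]] =
    Future.traverse(List(1, 2, 3))(x => Future(x * 2))

  // Blocking here is for demonstration only.
  def result: List[Int] = Await.result(all, 5.seconds)
}
```

The members are defs rather than vals so that no Future runs while the object itself is still being initialized.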
A much more sophisticated example is where the "effect" is that computation succeeds or fails in an Either-like data structure, with failures accumulated on the left in a non-empty list:
scala> type MyValidation[A] = ValidationNEL[NumberFormatException, A]
defined type alias MyValidation
If everything succeeds, we get a success with a list in it:
scala> List("1", "2").traverse[MyValidation, Int](_.parseInt.liftFailNel)
res150: MyValidation[List[Int]] = Success(List(1, 2))
If anything fails, we get a failure with a nonempty list of errors in it:
scala> List("1", "a", "b").traverse[MyValidation, Int](_.parseInt.liftFailNel)
res151: MyValidation[List[Int]] = Failure(NonEmptyList(java.lang.NumberFormatException: For input string: "a", java.lang.NumberFormatException: For input string: "b"))
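The error-accumulating behaviour can be approximated without Scalaz. The sketch below uses Either[List[String], A] as a stand-in for ValidationNEL; the names Validated, parseInt, map2 and traverse are hypothetical helpers written for this illustration only.

```scala
object ValidationSketch {
  // Stand-in for ValidationNEL: Left carries every error message seen.
  type Validated[A] = Either[List[String], A]

  def parseInt(s: String): Validated[Int] =
    try Right(s.toInt)
    catch { case e: NumberFormatException => Left(List(e.getMessage)) }

  // Combine two results; on failure keep the errors from both sides.
  def map2[A, B, C](va: Validated[A], vb: Validated[B])(f: (A, B) => C): Validated[C] =
    (va, vb) match {
      case (Right(a), Right(b)) => Right(f(a, b))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2)
      case (Left(e1), _)        => Left(e1)
      case (_, Left(e2))        => Left(e2)
    }

  def traverse[A, B](as: List[A])(f: A => Validated[B]): Validated[List[B]] =
    as.foldRight(Right(List.empty[B]): Validated[List[B]]) { (a, acc) =>
      map2(f(a), acc)(_ :: _)
    }
}
```

Note that map2 here cannot be derived from a monadic flatMap on Either, since flatMap would stop at the first Left; accumulating both sides is what makes this an applicative rather than a monadic combination.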
The "traverse" method is defined on any type F[_] for which there exists an implicit Traverse[F]. The method accepts a function of type A => M[B], and results in M[F[B]]. An implicit Applicative[M] instance is required. So M must be some applicative functor. These are exceedingly common. Any monad is an applicative functor. Every monoid yields an applicative functor (so traverse works as you would expect if M[B] happens to be a monoid, therefore it also generalizes foldLeft and foldRight), and of course if F[_] is applicative and G[_] is applicative, then the composite functor F[G[_]] is applicative as well.
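The remark that a monoid yields an applicative can be made concrete with a small sketch. Total is a hypothetical phantom wrapper around the Int-addition monoid, and Applicative (with pure and map2) and traverseList are simplified illustration-only definitions, not the Scalaz ones. Traversing with it ignores the structure being rebuilt and only sums, which is a fold.

```scala
object MonoidTraverseSketch {
  // Hypothetical minimal type class; NOT the Scalaz definition.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
  }

  // Phantom wrapper: carries a running Int total and no value of type B.
  final case class Total[B](value: Int)

  // Int addition viewed as an applicative:
  // pure is the identity element, map2 is the monoid operation.
  val totalApplicative: Applicative[Total] = new Applicative[Total] {
    def pure[A](a: A): Total[A] = Total(0)
    def map2[A, B, C](ma: Total[A], mb: Total[B])(f: (A, B) => C): Total[C] =
      Total(ma.value + mb.value)
  }

  def traverseList[M[_], A, B](as: List[A])(f: A => M[B])(ap: Applicative[M]): M[List[B]] =
    as.foldRight(ap.pure(List.empty[B]))((a, acc) => ap.map2(f(a), acc)(_ :: _))

  // Sums the string lengths; no list is ever built.
  val totalLength: Int =
    traverseList[Total, String, Unit](List("a", "bb", "ccc"))(s => Total[Unit](s.length))(totalApplicative).value
}
```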
Here is what the Applicative interface might look like. A minimal definition requires overriding at least either "apply" or "zipWith" in addition to implementing "pure".
trait Applicative[Z[_]] {
  def map[A, B](fa: Z[A], f: A => B): Z[B] = apply(pure(f), fa)
  def apply[A, B](f: Z[A => B], a: Z[A]): Z[B] = zipWith(f, a, (_: A => B)(_: A))
  def zipWith[A, B, C](a: Z[A], b: Z[B], f: (A, B) => C): Z[C] = apply(map(a, f.curried), b)
  def pure[A](a: => A): Z[A]
}
Traverse with Applicative is extremely versatile. Here it is generalizing "unzip":
scala> type Pair[A] = (A, A)
defined type alias Pair
scala> implicit val pairApplicative = new Applicative[Pair] {
| override def apply[A, B](f: (A => B, A => B), a: (A, A)) = (f._1(a._1), f._2(a._2))
| def pure[A](a: => A) = (a, a)
| }
pairApplicative: java.lang.Object with scalaz.Applicative[Pair] = $anon$1@63721cce
We now have a kind of "unzip" operation on anything that can be traversed.
scala> List((1, 10),(2, 20),(3, 30)).traverse[Pair, Int](x => x)
res191: (List[Int], List[Int]) = (List(1, 2, 3),List(10, 20, 30))
Traversing with the identity function has a shorthand called "sequence"
scala> some((1, 2)).sequence[Pair, Int]
res193: (Option[Int], Option[Int]) = (Some(1),Some(2))
--
Jim Powers
Wed, 2011-10-12, 15:07
#aaa
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 2:36 PM, Runar Bjarnason <runarorama@gmail.com> wrote:
> On Wednesday, October 12, 2011 4:55:49 AM UTC-4, Adriaan Moors wrote:
>> (you can't abstract over class contexts, so that you can't write one map to traverse them all -- i.e., this signature is not expressible in Haskell: map :: M m => (a -> m b) -> [a] -> m [b] -- where M is abstract, but this is exactly what traverse achieves)
> Traverse is very specific that it requires m to have class Applicative.
> In Haskell:
> traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
> In Scala (as a method on T[A]):
> def traverse[M[_]:Applicative, B](f: A => M[B]): M[T[B]]
okay, so to get effect polymorphism, all you need to do is abstract over the Applicative bound...
def traverse[Effectful[E[x]], M[_]: Effectful, B](f: A => M[B]): M[T[B]]
it looks a bit complicated though ^,^
Wed, 2011-10-12, 15:17
#ccc
Re: questioning FP
On Oct 11, 2011, at 10:17 AM, Rex Kerr wrote:
> Indeed. But if one does things this way, it is a huge hassle (of the major-rewrite-to-add-a-println variety). For specialized cases, encoding a finite state machine into your types (for instance) can be exactly what you need to keep things straight. But from what I've seen, this is a niche case when it comes to IO.
I feel like I should point out that adding a println to something that isn't already an IO action should be done for debugging only (and that's likely all people want when they bring this point up). It /is/ absurd to change signatures just to do this. In Haskell there is Debug.Trace[1] for these purposes. There's no reason not to create a similar "escape hatch" for use in Scala.
1: http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Debug-Trace.html
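A Scala counterpart to Debug.Trace could be as small as the following sketch. The helper name trace and its curried shape are assumptions for this illustration; no such function is claimed to exist in the standard library.

```scala
object DebugSketch {
  // Hypothetical helper modeled on Haskell's Debug.Trace.trace:
  // prints a message to stderr as a side effect, then returns the value unchanged.
  def trace[A](message: String)(value: A): A = {
    Console.err.println(message)
    value
  }

  // The signature of square stays Int => Int even with tracing added.
  def square(x: Int): Int = trace("square called with " + x)(x * x)
}
```

The point of the design is that callers of square do not change when the trace is added or removed.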
Wed, 2011-10-12, 15:37
#eee
Re: questioning FP
Runar,
Many thanks for your time and detailed explanation!
Vladimir
On Wednesday, October 12, 2011 9:52:38 AM UTC+4, Runar Bjarnason wrote:
Wed, 2011-10-12, 16:27
#101010
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 9:58 AM, Adriaan Moors <adriaan.moors@epfl.ch> wrote:
You can always go back and ignore all this. It'll just keep coming back up on the list.
OK, having gone through the slides and begun reading the papers you pointed out w.r.t. DDC, it's not as if the type signatures on the effectual functions in DDC are a piece of cake either.
From the slides:
updateInt :: forall (r1, r2 : region) . Int r1 -> Int r2 -(e1)> () :- e1 = Read r2 \/ Write r1
All to do the logical equivalent of assignment. Not only that but DDC gives you dependent types and a proof system.
I've been recently doing my best to learn ATS (http://www.ats-lang.org/) which has a lot in common with DDC including dependent types and an integrated proof system. It's not as if the type signatures in ATS are any better.
If dependent types and integrated theorem proving are part of Scala's future, super cool, but I don't think that that's what you were going for. In the meantime traverse really doesn't look all that bad[1].
--
Jim Powers
[1] In fact, it looks downright awesome.
Wed, 2011-10-12, 16:37
#121212
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 5:16 PM, Jim Powers <jim@casapowers.com> wrote:
> You can always go back and ignore all this. It'll just keep coming back up on the list.
> OK, having gone through the slides and begun reading the papers you pointed out w.r.t. DDC, it's not as if the type signatures on the effectual functions in DDC are a piece of cake either.
I didn't mean to imply they were (the signatures are expected to be complex, as they convey a lot of information; using these functions need not be complex, as the hope is these types can be inferred, even if the signatures can't)
I don't think the debate should be about which approach can technically enforce what we're looking for in the first place (as they both can).
The real question is which one can be made lightweight enough to be digestible (through automation, by integrating with/leveraging existing language features, ...) (It's somewhat reminiscent of the eternal dynamic type vs static typing debate: it's much more interesting to try to figure out how we can have our cake and eat it too -- using a gradual type system, for example. Can we have gradual effects?)
Then again, these are not questions you can just debate from your armchair. You (I mean the general kind of "you", which includes "us") have to try and see which works, so I'm thrilled to see both approaches getting more attention!
> From the slides:
> updateInt :: forall (r1, r2 : region) . Int r1 -> Int r2 -(e1)> () :- e1 = Read r2 \/ Write r1
> All to do the logical equivalent of assignment. Not only that but DDC gives you dependent types and a proof system.
> I've been recently doing my best to learn ATS (http://www.ats-lang.org/) which has a lot in common with DDC including dependent types and an integrated proof system. It's not as if the type signatures in ATS are any better.
> If dependent types and integrated theorem proving are part of Scala's future super cool, but I don't think that that's what you were going for.
no, I don't think we'll ever have full-blown dependent types, but we're definitely working both on simplifying the type system and making it more powerful (no, I don't think these goals are contradictory)
> In the mean-time traverse really doesn't look all that bad[1].
> --
> Jim Powers
> [1] In fact, it looks downright awesome.
Wed, 2011-10-12, 17:37
#141414
Re: questioning FP
Jim Powers skrev 2011-10-12 17:16:
> If dependent types and integrated theorem proving are part of Scala's
> future super cool, but I don't think that that's what you were going
> for. In the mean-time traverse really doesn't look all that bad[1].
I have to point out that in Scala the decision has already been made to
allow uncontrolled side effects by default. Monads (nor applicatives)
will never be a solution for controlling side effects in Scala, unless,
in the unlikely event, the language is completely re-designed.
So, that only leaves the option of adding explicit notation for
specifying that functions/methods lack side effects, which is quite the
opposite from Haskell. The first step is obviously a way to specify that
a function/method is pure...
/Jesper Nordenberg
Wed, 2011-10-12, 17:47
#151515
Re: Re: questioning FP
On Wednesday, October 12, 2011 9:58:40 AM UTC-4, Adriaan Moors wrote:
> okay, so to get effect polymorphism, all you need to do is abstract over the Applicative bound...
> def traverse[Effectful[E[x]], M[_]: Effectful, B](f: A => M[B]): M[T[B]]
Traverse is already polymorphic in the effect. It needs to be Applicative specifically because it describes how effects are combined. That is, Applicative provides a distributive law, allowing us to distribute M over T. When we traverse a data structure, we are lifting its constructors into the M idiom so we can combine the effects on the elements into a larger effect on the whole structure.
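The phrase "lifting its constructors into the M idiom" can be illustrated on a small binary tree. Everything named here (Tree, Leaf, Node, the simplified Applicative with pure, map2 and map, parse, parseTree) is a hypothetical definition for this sketch, not the Scalaz API. Each case of traverse applies the matching constructor inside M.

```scala
object TreeTraverseSketch {
  // Hypothetical minimal type class; NOT the Scalaz definition.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
    def map[A, B](ma: M[A])(f: A => B): M[B] = map2(ma, pure(()))((a, _) => f(a))
  }

  sealed trait Tree[A]
  final case class Leaf[A](value: A) extends Tree[A]
  final case class Node[A](left: Tree[A], right: Tree[A]) extends Tree[A]

  val optionApplicative: Applicative[Option] = new Applicative[Option] {
    def pure[A](a: A): Option[A] = Some(a)
    def map2[A, B, C](ma: Option[A], mb: Option[B])(f: (A, B) => C): Option[C] =
      for (a <- ma; b <- mb) yield f(a, b)
  }

  def traverse[M[_], A, B](t: Tree[A])(f: A => M[B])(ap: Applicative[M]): M[Tree[B]] =
    t match {
      // Leaf lifted into M: apply the constructor under the effect.
      case Leaf(a) => ap.map(f(a))(b => Leaf(b): Tree[B])
      // Node lifted into M: combine the effects of both subtrees.
      case Node(l, r) =>
        ap.map2(traverse(l)(f)(ap), traverse(r)(f)(ap))((lb, rb) => Node(lb, rb): Tree[B])
    }

  def parse(s: String): Option[Int] =
    try Some(s.toInt) catch { case _: NumberFormatException => None }

  // Succeeds only if every leaf parses.
  def parseTree(t: Tree[String]): Option[Tree[Int]] =
    traverse[Option, String, Int](t)(parse)(optionApplicative)
}
```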
But it's fun to think about whether there's an even more abstract way to describe such a distributive law than the pair Applicative[M] and Traverse[T].
Wed, 2011-10-12, 18:07
#fff
Re: questioning FP
On Wednesday, October 12, 2011 12:27:32 PM UTC-4, Jesper Nordenberg wrote:
> I have to point out that in Scala the decision has already been made to
> allow uncontrolled side effects by default. Monads (nor applicatives)
> will never be a solution for controlling side effects in Scala, unless,
> in the unlikely event, the language is completely re-designed.
Let's reword that: "The decision has already been made to make Scala Turing complete. Types will never be a solution for determining correctness in Scala, unless, in the unlikely event, the language is completely re-designed."
It doesn't require a redesign of the language. We could get a very long way by being explicit about effects in our libraries. Imagine if there were a complete and useful subset of the standard library that declared effects in its types. I think that would be sufficient, and all you need is higher kinds.
Now, it would be even better if we also had kind polymorphism and tail call elimination, but that's another story.
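What "declaring effects in types" could look like at the library level is sketched below. This IO class is a deliberately tiny, hypothetical stand-in and not scalaz.effects.IO; only the names putStrLn and unsafePerformIO are borrowed from the examples earlier in the thread, and greet is made up for the illustration.

```scala
object IOSketch {
  // Hypothetical minimal IO: a description of an effect, run only on demand.
  final class IO[A](thunk: () => A) {
    def unsafePerformIO(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO[B](() => f(thunk()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO[B](() => f(thunk()).unsafePerformIO())
  }

  object IO {
    // By-name parameter: the body is captured, not evaluated.
    def apply[A](a: => A): IO[A] = new IO[A](() => a)
  }

  // The return type tells callers that printing is involved.
  def putStrLn(s: String): IO[Unit] = IO(println(s))

  // Building this value prints nothing; running it prints and returns a length.
  def greet(name: String): IO[Int] =
    putStrLn("hello, " + name).map(_ => name.length)
}
```

Nothing is printed until unsafePerformIO() is called on the value returned by greet.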
Wed, 2011-10-12, 19:07
#111111
Re: questioning FP
Runar Bjarnason skrev 2011-10-12 18:58:
> It doesn't require a redesign of the language. We could get a very long
> way by being explicit about effects in our libraries. Imagine if there
> were a complete and useful subset of the standard library that declared
> effects in its types. I think that would be sufficient, and all you need
> is higher kinds.
That's merely for documentation of side effects not control of them. You
will get no compiler error (or warning) if a function that's supposed to
be pure performs a side effect.
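A minimal sketch of that objection, with a hypothetical object and counter invented for the illustration: the method below has an ordinary String => Int shape, touches hidden mutable state, and compiles without complaint.

```scala
object UncheckedPuritySketch {
  // Hidden mutable state, invisible in the method signature below.
  var calls: Int = 0

  // Looks like a pure String => Int function to every caller,
  // yet the compiler accepts the side effect without any warning.
  def length(s: String): Int = {
    calls += 1
    s.length
  }
}
```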
/Jesper Nordenberg
Wed, 2011-10-12, 21:37
#121212
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 15:05, Jesper Nordenberg wrote:
> Runar Bjarnason skrev 2011-10-12 18:58:
>>
>> It doesn't require a redesign of the language. We could get a very long
>> way by being explicit about effects in our libraries. Imagine if there
>> were a complete and useful subset of the standard library that declared
>> effects in its types. I think that would be sufficient, and all you need
>> is higher kinds.
>
> That's merely for documentation of side effects not control of them. You
> will get no compiler error (or warning) if a function that's supposed to be
> pure performs a side effect.
That means it will not detect bugs.
It's like using Option in Scala. Two, three years ago, I heard many
complaints that using Option would not prevent null errors in the
programs. Well, since then I learned that if I kept to null-free
libraries, and "sanitized" nulls whenever I interfaced with a Java
library that used them, it truly reduced null problems a great deal.
The same thing applies here, except, perhaps, that doing I/O will not
cause instant exceptions like trying to use a null value. Still, the
problem will be reduced -- isolated at certain APIs.
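One common way to do this kind of sanitizing at a Java boundary is to wrap the call in Option(...), which yields None for null. The sketch below is only an illustration of that idea; the map contents and the helper name ageOf are made up.

```scala
object NullSanitizingSketch {
  // A Java collection whose get() returns null for missing keys.
  val ages = new java.util.HashMap[String, Integer]()
  ages.put("alice", Integer.valueOf(30))

  // Option(x) is None when x is null and Some(x) otherwise,
  // so the null never escapes past this method.
  def ageOf(name: String): Option[Int] =
    Option(ages.get(name)).map(_.intValue)
}
```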
Wed, 2011-10-12, 22:27
#141414
Re: Re: questioning FP
On 10/13/2011 06:27 AM, Daniel Sobral wrote:
> On Wed, Oct 12, 2011 at 15:05, Jesper Nordenberg wrote:
>> Runar Bjarnason skrev 2011-10-12 18:58:
>>> It doesn't require a redesign of the language. We could get a very long
>>> way by being explicit about effects in our libraries. Imagine if there
>>> were a complete and useful subset of the standard library that declared
>>> effects in its types. I think that would be sufficient, and all you need
>>> is higher kinds.
>> That's merely for documentation of side effects not control of them. You
>> will get no compiler error (or warning) if a function that's supposed to be
>> pure performs a side effect.
> That means it will not detect bugs.
>
> It's like using Option in Scala. Two, three years ago, I heard many
> complains that using Option would not prevent null errors in the
> programs. Well, since then I learned that if I kept to null-free
> libraries, and "sanitized" nulls whenever I interfaced with a Java
> library that used them, it truly reduced null problems a great deal.
>
> The same thing applies here, except, perhaps, that doing I/O will not
> cause instant exceptions like trying to use a null value. Still, the
> problem will be reduced -- isolated at certain APIs.
>
>
You got it!
Wed, 2011-10-12, 22:47
#161616
Re: Re: questioning FP
Hi Daniel,
Can you elaborate a bit about your approach to null-sanitizing Java libraries? Do you Option-alize all return values from Java libs?
Thanks,
Jon
On Wed, Oct 12, 2011 at 4:27 PM, Daniel Sobral <dcsobral@gmail.com> wrote:
On Wed, Oct 12, 2011 at 15:05, Jesper Nordenberg <megagurka@yahoo.com> wrote:
> Runar Bjarnason skrev 2011-10-12 18:58:
>>
>> It doesn't require a redesign of the language. We could get a very long
>> way by being explicit about effects in our libraries. Imagine if there
>> were a complete and useful subset of the standard library that declared
>> effects in its types. I think that would be sufficient, and all you need
>> is higher kinds.
>
> That's merely for documentation of side effects not control of them. You
> will get no compiler error (or warning) if a function that's supposed to be
> pure performs a side effect.
That means it will not detect bugs.
It's like using Option in Scala. Two, three years ago, I heard many
complains that using Option would not prevent null errors in the
programs. Well, since then I learned that if I kept to null-free
libraries, and "sanitized" nulls whenever I interfaced with a Java
library that used them, it truly reduced null problems a great deal.
The same thing applies here, except, perhaps, that doing I/O will not
cause instant exceptions like trying to use a null value. Still, the
problem will be reduced -- isolated at certain APIs.
--
Daniel C. Sobral
I travel to the future all the time.
Wed, 2011-10-12, 22:57
#181818
Re: questioning FP
Daniel Sobral skrev 2011-10-12 22:27:
> It's like using Option in Scala. Two, three years ago, I heard many
> complains that using Option would not prevent null errors in the
> programs. Well, since then I learned that if I kept to null-free
> libraries, and "sanitized" nulls whenever I interfaced with a Java
> library that used them, it truly reduced null problems a great deal.
>
> The same thing applies here, except, perhaps, that doing I/O will not
> cause instant exceptions like trying to use a null value. Still, the
> problem will be reduced -- isolated at certain APIs.
Sure, I agree that the Scalaz abstractions can help reduce the number of
bugs in your program the same way Option can. Still, I would like to see
both null and effects checking in the compiler at some point, and that
checking will not be based on Options or Monad.
/Jesper Nordenberg
Wed, 2011-10-12, 23:27
#1a1a1a
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 4:27 PM, Daniel Sobral <dcsobral@gmail.com> wrote:
In many ways, though, it's _not_ like using Option. The main problem with nulls in Java--at least in my experience--is that some APIs would return null only when something disastrous happened, while others would return null in the normal course of events; if you didn't remember which was which, your program would die.
The reason to use null in the normal course of events is when a result may not be available, which is exactly what Option[X] captures explicitly. And Option not only is a marker that something might be null (that is slightly helpful but not _that_ helpful) but it gives you tools for dealing with data that might not be there (e.g. map). So you get the same concept you were using before except clearly instead of as a hack, and more powerful tools to deal with it. What's not to like?*
The situation with IO, however, is only analogous if people write APIs that opaquely perform IO into existing streams _and_ if IO gives you a way to avoid doing this. Personally, I have had hundreds, possibly thousands of problems caused by null-used-as-option, and I'm struggling to remember a single time where I had an unexpected-IO problem that was not introduced as part of a debugging process. Furthermore, I have yet to see an example where IO actually does something for me that is relevant to IO (like data type conversion or input validation or auto-closing streams or whatever) and which does often cause problems. There are certainly some wonderful capabilities out there with traverse (weird name, though!), but the wonderfulness isn't IO-dependent; it just makes working with an IO marker a bit less annoying.
"Here is a very awkward solution to a problem you don't really have--and look, with these nifty tricks we can make it only slightly awkward!" isn't a very good selling point for a capability. There are all sorts of things that help out with IO: building a FSM to make sure things happen in order, writing robust possibly even monadic converters, packaging output up first and then sending it out all in one go in something that manages resources, and so on.
Some of us are still having difficulty seeing the problem, so it's feeling like Hungarian notation in dynamic languages: yes, you can use it to annotate information that the language itself does not, but the burden is sufficiently high for the benefit one receives that it is not widely regarded as a good practice (indeed, it's mostly regarded as a painful waste of time). Maybe we should annotate mutation also, and what is in units of meters, and so on; but I really doubt that IO[Mutated[Meter[Array[Double]]]] is going to illuminate more than it confuses. (In particular, if I want to change the units to Centimeters, getting at it is going to be a pain.)
An Array[Double] with IO with Mutable with Meter, where the traits are all virtual and only kept track of by the compiler, might be doable as a marker system, but there are a lot of corner cases to work through.
--Rex
* What's not to like is the performance overhead. Pity, but in critical cases one can just pay more attention and use null after all.
Wed, 2011-10-12, 23:57
Re: Re: questioning FP
Daniel Sobral wrote:
On Wed, Oct 12, 2011 at 15:05, Jesper Nordenberg wrote:
Runar Bjarnason wrote on 2011-10-12 18:58:
It doesn't require a redesign of the language. We could get a very long way by being explicit about effects in our libraries. Imagine if there were a complete and useful subset of the standard library that declared effects in its types. I think that would be sufficient, and all you need is higher kinds.
That's merely for documentation of side effects, not control of them. You will get no compiler error (or warning) if a function that's supposed to be pure performs a side effect.
That means it will not detect bugs.
Can you please share a snippet where there are bugs and how IO reveals them? This is the sort of thing I was looking for.
It's like using Option in Scala. Two, three years ago, I heard many complaints that using Option would not prevent null errors in programs. Well, since then I learned that if I kept to null-free libraries, and "sanitized" nulls whenever I interfaced with a Java library that used them, it truly reduced null problems a great deal. The same thing applies here, except, perhaps, that doing I/O will not cause instant exceptions like trying to use a null value. Still, the problem will be reduced -- isolated at certain APIs.
It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy. And if I do it at the top of the program, is there real benefit in it? I'm about to call performUnsafeIO several lines below anyway, so why invest in wrapping just to unwrap?
Ittay
Thu, 2011-10-13, 02:47
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 18:43, Jon Steelman wrote:
> Hi Daniel,
> Can you elaborate a bit about your approach to null-sanitizing Java
> libraries? Do you Option-alize all return values from Java libs?
Not all return values, just the ones that may return null.
See, you *have* to know this anyway. This is not something you can get
away with, no matter if you use Scala or Java, or whether your code is
OO, functional or plain structured disguised as object oriented.
If you call a method M that may return null, you *have* to know it. If
you don't, you'll get a null pointer exception. Oh, ok, on a project I
was on there was a pervasive attitude by others that throwing null
pointer exceptions was normal, as long as the program did not stop
working. Only, of course, it did stop working, and we couldn't find
which of the exceptions was causing trouble...
Anyway, you got a value from an API you called, you have to know if it
is null or not. You might try to "defensively" check everything for
null, except I never saw anyone truly doing that. And even if you did,
if you don't know WHY the API is returning null, you cannot really act
properly to receiving it.
So, you have to know if a method returns null, and standard Java APIs
are pretty good at documenting this. And, once you know that, it is
just a matter of choosing which one to write:
val parent = file.getParent()
val parent = Option(file.getParent())
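To make the second spelling concrete, here is a minimal sketch of sanitizing a null at the Java boundary. The object and method names (NullSanitizing, parentOf, parentOrDot) are made up for illustration; only java.io.File.getParent and Option.apply come from the standard libraries.

```scala
import java.io.File

object NullSanitizing {
  // File.getParent is documented to return null when the path has no parent,
  // so the result is wrapped in Option right where Java is called.
  def parentOf(file: File): Option[String] = Option(file.getParent)

  // Callers now have to say what a missing parent means for them.
  def parentOrDot(file: File): String = parentOf(file).getOrElse(".")
}
```

Past this one wrapper, the rest of the code never sees the null, which is the "sanitizing" described above.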
Thu, 2011-10-13, 11:07
Re: Re: questioning FP
On Wed, Oct 12, 2011 at 6:40 PM, Runar Bjarnason <runarorama@gmail.com> wrote:
you're right -- I blame my over-eager abstraction trigger finger
On Wednesday, October 12, 2011 9:58:40 AM UTC-4, Adriaan Moors wrote:
okay, so to get effect polymorphism, all you need to do is abstract over the Applicative bound...
def traverse[Effectful[E[x]], M[_]: Effectful, B](f: A => M[B]): M[T[B]]
Traverse is already polymorphic in the effect.
It needs to be Applicative specifically because it describes how effects are combined. That is, Applicative provides a distributive law, allowing us to distribute M over T. When we traverse a data structure, we are lifting its constructors into the M idiom so we can combine the effects on the elements into a larger effect on the whole structure.
But it's fun to think about whether there's an even more abstract way to describe such a distributive law than the pair Applicative[M] and Traverse[T].
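For readers who want to see the "lifting constructors into the M idiom" idea in code, here is a rough sketch of traverse specialised to List. The Applicative trait below is a cut-down, hypothetical one written for this note (Scalaz's real type class is richer), and parsePositive is an invented example function.

```scala
object TraverseSketch {
  // Hypothetical minimal Applicative: just enough to combine two effects.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
  }

  object Applicative {
    // Instance for Option: the "effect" is possible absence of a value.
    implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
      def pure[A](a: A): Option[A] = Some(a)
      def map2[A, B, C](ma: Option[A], mb: Option[B])(f: (A, B) => C): Option[C] =
        for { a <- ma; b <- mb } yield f(a, b)
    }
  }

  // The list constructors (Nil and ::) are lifted into M, so the effects of
  // the individual elements are combined into one effect on the whole list.
  def traverse[M[_], A, B](as: List[A])(f: A => M[B])(implicit M: Applicative[M]): M[List[B]] =
    as.foldRight(M.pure(List.empty[B]))((a, acc) => M.map2(f(a), acc)(_ :: _))

  // Invented example of an effectful function A => M[B].
  def parsePositive(n: Int): Option[Int] = if (n > 0) Some(n) else None
}
```

With these definitions, traverse(List(1, 2, 3))(parsePositive) gives Some(List(1, 2, 3)), and a single non-positive element turns the whole result into None. Only map2 and pure are needed, which is why the bound is Applicative rather than Monad.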
Thu, 2011-10-13, 15:47
Re: Re: questioning FP
On Wednesday, October 12, 2011 6:51:34 PM UTC-4, Ittay Dror wrote:
It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy.
I think this needs to be repeated: The idea of "call hierarchy" is applicable only to programming that is first-order and single-threaded. That said, if you call a function that depends on IO, then you too depend on IO whether you want to pretend otherwise or not.
I'm about to call performUnsafeIO several lines below anyway
Don't do that. Use map and flatMap instead. If you can say "f(x.unsafePerformIO)", then you can just as well say "x map f".
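A small sketch of that last point, assuming a toy IO type written only for this note (it is not Scalaz's implementation, and readAnswer, f and the reads counter are invented names):

```scala
object IoMapSketch {
  // A deliberately tiny stand-in for an IO type: a wrapped, unevaluated thunk.
  final class IO[A](thunk: () => A) {
    def unsafePerformIO(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO(() => f(thunk()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO(() => f(thunk()).unsafePerformIO())
  }
  object IO {
    def apply[A](a: => A): IO[A] = new IO(() => a)
  }

  var reads = 0 // counts how often the effect has actually run

  val readAnswer: IO[Int] = IO { reads += 1; 21 }
  def f(n: Int): Int = n * 2

  // Instead of f(readAnswer.unsafePerformIO()), stay inside IO:
  val doubled: IO[Int] = readAnswer.map(f)
}
```

Building doubled runs nothing; the counter only moves when unsafePerformIO() is finally called on it, and the result is the same value that f(readAnswer.unsafePerformIO()) would have produced.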
Thu, 2011-10-13, 16:37
Re: Re: questioning FP
Runar Bjarnason wrote:
> On Wednesday, October 12, 2011 6:51:34 PM UTC-4, Ittay Dror wrote:
>> It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy.
> I think this needs to be repeated: The idea of "call hierarchy" is applicable only to programming that is first-order and single-threaded.
Can you explain what is first-order? And why multi threaded doesn't have a call hierarchy? (a function can have several clients)
> That said, if you call a function that depends on IO, then you too depend on IO whether you want to pretend otherwise or not.
Of course. So? As an analogy, if my function does division, I depend on math, so do I wrap its result in a Math[_] monad?
>> I'm about to call performUnsafeIO several lines below anyway
> Don't do that. Use map and flatMap instead. If you can say "f(x.unsafePerformIO)", then you can just as well say "x map f".
At the "end of the world", I call performUnsafeIO, right? You also said that normally I'd call my effectual functions up front and then yield a function that is pure, where most of my logic is. Am I correct so far?
If so, then what is the point of wrapping the effectual functions in IO, if I'm going to call unsafePerformIO anyway?
* if most of the program logic is in those functions, then it means the IO values propagated through several layers (I assume)
* if those functions are trivial / fundamental, then I just wrapped a simple thing in IO to unwrap it immediately after.
Thu, 2011-10-13, 16:47
Re: Re: questioning FP
Runar Bjarnason wrote:
> On Wednesday, October 12, 2011 6:51:34 PM UTC-4, Ittay Dror wrote:
>> It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy.
> I think this needs to be repeated: The idea of "call hierarchy" is applicable only to programming that is first-order and single-threaded. That said, if you call a function that depends on IO, then you too depend on IO whether you want to pretend otherwise or not.
>> I'm about to call performUnsafeIO several lines below anyway
> Don't do that. Use map and flatMap instead. If you can say "f(x.unsafePerformIO)", then you can just as well say "x map f".
BTW, thanks for replying. I have a feeling that we're talking about different things. Maybe the next message will contain the "joining" argument.
Thu, 2011-10-13, 17:17
Re: Re: questioning FP
On Thu, Oct 13, 2011 at 11:27 AM, Ittay Dror <ittay.dror@gmail.com> wrote:
> Runar Bjarnason wrote:
>> On Wednesday, October 12, 2011 6:51:34 PM UTC-4, Ittay Dror wrote:
>>> It is isolated only if those APIs are used at the top of the program. Otherwise, the IO monad needs to propagate up the call hierarchy.
>> I think this needs to be repeated: The idea of "call hierarchy" is applicable only to programming that is first-order and single-threaded.
> Can you explain what is first-order? And why multi threaded doesn't have a call hierarchy? (a function can have several clients)
Video worth a thousand words? http://vimeo.com/25786102
--
Jim Powers
Thu, 2011-10-13, 19:47
Re: Re: questioning FP
On Thursday, October 13, 2011 11:27:49 AM UTC-4, Ittay Dror wrote:
> Can you explain what is first-order? And why multi threaded doesn't have a call hierarchy? (a function can have several clients)
A higher-order function is a function that takes a function as its argument. First-order just means programming without those. In scala, we have higher-order functions like map and flatMap. When working with monads, this is how we lift non-monadic code into the monad.
This is closely related to continuation-passing. Consider these two methods:
def timesTwo(x: Int): Int = x * 2
def timesTwoK(x: Int, k: Int => Nothing): Nothing = k(x * 2)
The former is called like this: f(timesTwo(n))
The latter is called like this: timesTwoK(n, f)
The former is a first-order function that simply returns the result of the computation. The latter is a higher-order function that doesn't return, but instead accepts a function as its argument. That function is called the continuation. Instead of accepting a result from the function, we tell it how to continue without us.
The "map" and "flatMap" methods are exactly the same way. Instead of getting the result out of a monad (like IO), we pass the continuation to map or flatMap. That continuation will receive the result if and when it's available.
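Here is a runnable variation on the two methods above, as a sketch. One deliberate change, flagged as an assumption: the continuation returns a generic result R instead of Nothing, so that the example can hand a value back and be inspected. The names describe, direct, viaContinuation and viaMap are invented for illustration.

```scala
object CpsSketch {
  // Direct style: the caller receives the result and decides what to do next.
  def timesTwo(x: Int): Int = x * 2

  // Continuation-passing style: the caller says up front how to continue.
  // R is an assumption of this sketch; the original uses Nothing to stress
  // that control never comes back to the caller.
  def timesTwoK[R](x: Int)(k: Int => R): R = k(x * 2)

  def describe(n: Int): String = s"result: $n"

  val direct: String = describe(timesTwo(21))
  val viaContinuation: String = timesTwoK(21)(describe)

  // map has the same shape: the continuation is passed in, the value is
  // never taken out of the Option by the caller.
  val viaMap: Option[String] = Option(21).map(timesTwo).map(describe)
}
```

All three spellings compute "result: 42"; the difference is only in who holds the control flow.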
> of course. so? as an analogy, if my function does division, I depend on math, so do I wrap its result in a Math[_] monad?
Ordinarily you would close over some division function and carry on, but it's not completely insane to use a data type instead, for example if you want to be polymorphic in the division function, or if you want to defer optimization of mathematical expressions to your math engine.
Take a simpler use case: multiplication. The expression 3 * 4 * 5 could be written like this:
List(3, 4, 5)
This abstracts over the multiplication operator. We can inject it later like this:
list.foldLeft(1)(_ * _)
> if so, then what is the point of wrapping the effectual functions in IO, if I'm going to call unsafePerformIO anyway?
What's the point of constructing a list, if I'm just going to fold it later anyway?
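A short sketch of the list analogy, with invented names (FoldSketch, factors): the list is only a description of "some numbers to be combined", and each fold is one interpretation of it.

```scala
object FoldSketch {
  // Data only: no operator has been chosen yet.
  val factors: List[Int] = List(3, 4, 5)

  // Different "interpreters" for the same description.
  val product: Int = factors.foldLeft(1)(_ * _)   // 3 * 4 * 5
  val sum: Int = factors.foldLeft(0)(_ + _)       // 3 + 4 + 5
  val rendered: String = factors.mkString(" * ")  // printed rather than evaluated
}
```

The same list can be multiplied, summed, or merely printed, which is the sense in which building the structure first is not wasted work.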
Thu, 2011-10-13, 20:57
Re: Re: questioning FP
Runar Bjarnason wrote:
>> if so, then what is the point of wrapping the effectual functions in IO, if I'm going to call unsafePerformIO anyway?
> What's the point of constructing a list, if I'm just going to fold it later anyway?
When folding the list, information is added. When calling unsafePerformIO, no information is created. It just takes out a delayed action.
Let me state again my issues:
1. I don't code with random functions named 'foo' that I know nothing about.
2. If side effects are dangerous, IO[_] does not protect me from taking them. Calling a method 'doDangerousAction: Unit' is no different than 'doDangerousAction: IO[Unit]'. If this is the logic I need to call, then this is the logic I need to call. And wrapping it in IO[_] doesn't make the action safer. It will eventually be called anyway.
3. IO[_] makes code trickier to work with (need to work with for comprehensions, use traverse/sequence etc.)
4. IO[_] does not prevent bugs:
Imagine 'close(resource: Resource): Unit'
Now this imperative code:
close(resource)
close(resource) // throws an exception that the resource is already closed.
Or with a higher order function:
def doWithResource(f: Resource => Unit) {
  f(resource)
  f(resource)
}
doWithResource(close)
Now this is of course a simple case, we need to imagine some sort of complex code in between and around the calls that made the developer get confused about the flow.
Now with IO:
val io = close(resource)
val io2 = io.flatMap(_ => close(resource))
Or:
(close(resource), close(resource))
will result in the same exception.
I grant that it is hard to imagine doing the above by mistake since IO makes the code more intentional. We've made things difficult for the developer. To that, I have two things to say:
1. The same has been said about checked exceptions. But we know these are a failed experiment. Mainly due to the abstraction consideration.
2. We've traded one source of error for another. Because when using IO[_] the developer can forget to return the result of close and then the resource will not be closed, leading to a leak (which is harder to debug).
(About doWithResource: whether it returns IO[Unit] or List[IO[Unit]] makes no difference, since I'll need to sequence it and return it. Maybe for the second case I can check that the list size is 1, but this is a runtime check.)
So I'm still baffled as to where the usefulness is. Basically I think it is like checked exceptions: it looks nice to tag methods that can fail, but pretty soon you find out that you must use these methods, you can't do anything with the exception, and it makes your code more difficult to use.
To contrast with other monads: Take Option for example: when a function returns Option it not only documents the fact it may be returning None, I'm also able to do something about it. If I see a 'get(key: K): Option[V]' I can use it with 'get(key) getOrElse defaultValue'. So: 1) I've avoided the dangers of NPE, 2) I've reduced it to a simple value, so my code, and clients' code remains simple. In other words, Option is not an opaque wrapper like IO[_].
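The Option pattern described here, as a minimal sketch. The ports map, the portFor method and the 8080 fallback are invented for illustration; Map#get and Option#getOrElse are the standard library calls.

```scala
object OptionLookup {
  val ports: Map[String, Int] = Map("http" -> 80, "https" -> 443)

  // Map#get returns Option[Int]; getOrElse collapses it back to a plain Int,
  // so callers of portFor never see the Option at all.
  def portFor(scheme: String): Int = ports.get(scheme).getOrElse(8080)
}
```

This is the property being contrasted with IO[_]: the wrapper can be eliminated locally, so it does not have to appear in the signatures of the calling code.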
Some really smart people are saying I'm missing something. But the arguments are generally philosophical and with a lot of jargon. Maybe you guys are so used to the benefits of IO[_] that you give higher-level arguments. So please, go back to the basics. Show me an example of imperative code and how introducing IO[_] removes the bugs without introducing the possibility for new ones (such as forgetting to return the IO instance).
Thanks,
ittay
Thu, 2011-10-13, 21:07
Re: Re: questioning FP
On Tuesday, October 11, 2011 4:11:01 PM UTC+2, Runar Bjarnason wrote:
> On Tue, Oct 11, 2011 at 9:55 AM, Ittay Dror <ittay...@gmail.com> wrote:
>> Interesting. Maybe I missed something. I thought that when you use IO you need to propagate it up to 'main' (or 'run'). So one function that does IO (a println) pollutes your whole program. Where am I wrong?
> Right there.
> In practise, you compose effectful things in two ways:
> for {
>   x <- effectsHere
>   y <- effectsHereToo(x)
> } yield noEffectsHere(y)
> or...
> (effectsHere |@| effectsHereToo)((a, b) => noEffectsHere(a, b))
> You'll find that most of your code is in "noEffectsHere". It doesn't need to be aware that it can be lifted into IO. This is what separation of concerns is all about.
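A runnable sketch of the for-comprehension form, assuming a toy IO type defined only for this note (not Scalaz's). The bodies of effectsHere, effectsHereToo and noEffectsHere, and the log buffer, are invented so that there is something to observe.

```scala
object ComposeSketch {
  // Toy IO: a wrapped thunk with map and flatMap, enough for for-comprehensions.
  final class IO[A](thunk: () => A) {
    def unsafePerformIO(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO(() => f(thunk()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO(() => f(thunk()).unsafePerformIO())
  }
  object IO {
    def apply[A](a: => A): IO[A] = new IO(() => a)
  }

  // Records which effects have run, in order.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Hypothetical effectful steps.
  def effectsHere: IO[Int] = IO { log += "read"; 20 }
  def effectsHereToo(x: Int): IO[Int] = IO { log += "lookup"; x + 1 }

  // Plain function: it knows nothing about IO and can be tested on its own.
  def noEffectsHere(y: Int): Int = y * 2

  val program: IO[Int] = for {
    x <- effectsHere
    y <- effectsHereToo(x)
  } yield noEffectsHere(y)
}
```

Only the two small steps mention IO. noEffectsHere is called with an ordinary Int, and nothing is logged until program.unsafePerformIO() is invoked.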
I don't know about that. Maybe in trivial cases it can be arranged that the effects are calculated upfront, but in a complex system?
Let's say I want to use an event bus. So it has a method 'EventBus#publish(event: Event): IO[Unit]', right?
Now in every piece of my code where I want to publish an event, I need to return that IO instance.
So whenever I have a new event I want to publish (say after taking some step in the business flow), all methods leading to that point need to change to return IO[_], and all their client code.
Suddenly, my business code has changed because I wanted to publish an event: abstraction and modularity went down the drain.
Also, I have a new source of bugs to deal with. Namely that one of these methods will forget to return the IO instance (if I use IO[_] a lot, then they already need to deal with some instances, so it is easy to forget)
Perhaps if I'm already skilled with IO[_] I can arrange my code cleverly so that introducing event publishing does not change it significantly (can it? how?). But then wouldn't this code be more complex to begin with? Creating yet another source for bugs?
Ittay
Thu, 2011-10-13, 22:17
Re: Re: questioning FP
Runar Bjarnason wrote:
> A higher-order function is a function that takes a function as its argument. First-order just means programming without those. In scala, we have higher-order functions like map and flatMap. When working with monads, this is how we lift non-monadic code into the monad.
If you have a higher-order function and it doesn't call its argument, but passes it somewhere else, then a monad makes sense (in fact I wrote about such a monad in www.tikalk.com/incubator/blog/functional-programming-scala-rest-us). But it is a specific use case and should be treated as such. The monad is future/promise, not IO[_].
Thu, 2011-10-13, 22:47
Re: Re: questioning FP
On Thu, Oct 13, 2011 at 01:01:22PM -0700, Ittay Dror said
> I don't know about that. Maybe in trivial cases it can be arranged that the
> effects are calculated upfront, but in a complex system?
>
> Let's say I want to use an event bus. So it has a method
> 'EventBus#publish(event: Event): IO[Unit]', right?
>
> Now in every piece of my code where I want to publish an event, I need to
> return that IO instance.
>
> So whenever I have a new event I want to publish (say after taking some step
> in the business flow), all methods leading to that point need to change to
> return IO[_], and all their client code.
>
> Suddenly, my business code has changed because I wanted to publish an event:
> abstraction and modularity went down the drain.
The business logic should be viewed as producing a stream of events to
be published instead of code that publishes events and produces a
unitary value. This is a good thing from the perspective of testable
code. You don't have to stub or mock the event bus to observe the
produced events, you can just inspect the result of calling the business
logic and check that the events are those you expect. Abstraction and
modularity have now been enhanced not eliminated.
> Also, I have a new source of bugs to deal with. Namely that one of these
> methods will forget to return the IO instance (if I use IO[_] a lot, then
> they already need to deal with some instances, so it is easy to forget)
>
> Perhaps if I'm already skilled with IO[_] I can arrange my code cleverly so
> that introducing event publishing does not change it significantly (can it?
> how?). But then wouldn't this code be more complex to begin with? Creating
> yet another source for bugs?
I hope it is clear that if the code is refactored as I suggested above
then the business logic code does not get any more complex and in fact
may become simpler. You might consider this to be a form of inversion of
control. Dependency injection inverts control of arguments, the pattern
I'm proposing inverts the control of results.
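One way to read this suggestion, as a sketch only: the business function returns the events it wants published alongside its normal result, and a thin outer layer does the publishing. The event types, placeOrder, the stock threshold of 5 and publishAll are all invented for illustration and are not taken from any real event bus API.

```scala
object EventsAsResults {
  sealed trait Event
  final case class OrderPlaced(orderId: Int) extends Event
  final case class StockLow(item: String) extends Event

  // Pure business logic: returns the remaining stock together with the
  // events that should be published. It never touches a bus.
  def placeOrder(orderId: Int, item: String, stock: Int): (Int, List[Event]) = {
    val remaining = stock - 1
    val events: List[Event] =
      OrderPlaced(orderId) :: (if (remaining < 5) List(StockLow(item)) else Nil)
    (remaining, events)
  }

  // Only the outer layer talks to the bus, modelled here as a plain function.
  def publishAll(events: List[Event], publish: Event => Unit): Unit =
    events.foreach(publish)
}
```

A test can then compare the returned list with the expected events directly, without stubbing or mocking anything.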
Thu, 2011-10-13, 22:57
Re: Re: questioning FP
On Thursday, October 13, 2011 3:51:02 PM UTC-4, Ittay Dror wrote:
> 1. I don't code with random functions named 'foo' that I know nothing about.
Right, so your code is mostly first-order. It probably has a lot of repetition. You can improve on that by abstracting more.
> 2. If side effects are dangerous, IO[_] does not protect me from taking them. Calling a method 'doDangerousAction: Unit' is no different than 'doDangerousAction: IO[Unit]'.
There is a huge difference. The latter call is referentially transparent. The former call will mingle the performing of the dangerous action with program logic. I want a clean separation between describing what is done and actually doing it. If you don't want that, that's fine. Your code, your rules.
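A sketch of what "referentially transparent" buys here, using a toy IO type written for this note (not Scalaz's). The launches counter and the names doDangerousActionNow, action, twiceShared and twiceInline are invented.

```scala
object RefTransparency {
  // Toy IO: a wrapped thunk that only runs when unsafePerformIO() is called.
  final class IO[A](thunk: () => A) {
    def unsafePerformIO(): A = thunk()
    def map[B](f: A => B): IO[B] = new IO(() => f(thunk()))
    def flatMap[B](f: A => IO[B]): IO[B] =
      new IO(() => f(thunk()).unsafePerformIO())
  }
  object IO {
    def apply[A](a: => A): IO[A] = new IO(() => a)
  }

  var launches = 0

  // Side-effecting version: calling it performs the action immediately.
  def doDangerousActionNow(): Unit = launches += 1

  // IO version: calling it only builds a description of the action.
  def doDangerousAction: IO[Unit] = IO { launches += 1 }

  // Naming the IO value and reusing it means the same thing as writing the
  // expression out twice. With doDangerousActionNow() that substitution
  // would change how many times the action runs.
  val action: IO[Unit] = doDangerousAction
  val twiceShared: IO[Unit] = action.flatMap(_ => action)
  val twiceInline: IO[Unit] = doDangerousAction.flatMap(_ => doDangerousAction)
}
```

Both twiceShared and twiceInline perform the action exactly twice when run, and not at all before that, so the call can be moved, named or inlined without changing what the program does.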
> 3. IO[_] makes code trickier to work with (need to work with for comprehensions, use traverse/sequence etc.)
No it doesn't. It makes code easier to work with. In addition to having no side-effects, referential transparency, equational reasoning, compositionality, clean separation of concerns, I can also make use of functions like traverse and sequence which remove a lot of tedious repetition from my code.
> 4. IO[_] does not prevent bugs:
Sure it does. Take for example any kind of race condition where you accidentally interleave two side-effects (like writing to a file while getting some content to write to that same file). That kind of interleaving is prevented if the type system forces you to sequence effects monadically.
> 1. The same has been said about checked exceptions.
The IO monad is not at all like checked exceptions, precisely because it is a monad.
> 2. We've traded one source of error for another. Because when using IO[_] the developer can forget to return the result of close and then the resource will not be closed, leading to a leak (which is harder to debug)
This is what types are for. Unit is not the same as IO[Unit], and the compiler will complain if you don't return anything.
Thu, 2011-10-13, 23:27
Re: Re: questioning FP
Geoff Reedy wrote:
20111013214319 [dot] GE18007 [at] programmer-monk [dot] net" type="cite">On Thu, Oct 13, 2011 at 01:01:22PM -0700, Ittay Dror saidI don't know about that. Maybe in trivial cases it can be arranged that the effects are calculated upfront, but in a complex system? Let's say I want to use an event bus. So it has a method 'EventBus#pubish(event: Event): IO[Unit]', right? Now in every piece of my code where I want to publish an event, I need to return that IO instance. So whenever I have a new event I want to publish (say after taking some step in the business flow), all methods leading to that point need to change to return IO[_], and all their client code. Suddenly, my business code has changed because I wanted to publish an event: abstraction and modularity went down the drain.The business logic should be viewed as producing a stream of events to be published instead of code that publishes events and produces a unitary value. This is a good thing from the perspective of testable
It will also be producing values to be cached, log statements to be written, etc. In short, a complex result that should now be handled in the root entry of the program, even though some things may be implementation details. why should the top of the application care that I'm creating events as an architectural decision? Why should my bootstrap module be modified because some other module decided to do things like caching or events or write logs?
20111013214319 [dot] GE18007 [at] programmer-monk [dot] net" type="cite">code. You don't have to stub or mock the event bus to observe the produced events, you can just inspect the result of calling the business logic and check that the events are those you expect. Abstraction and modularity have now been enhanced not eliminated.
Yes, when my code is not longer doing complex interactions it is easier to test. But now the caller of this code must do these and be harder to test.
With DI and proper interfaces, I don't find mocking is hard. It is harder than matching the result values, but considering the drawbacks of this approach, maybe it is a price to pay.
20111013214319 [dot] GE18007 [at] programmer-monk [dot] net" type="cite">Also, I have a new source of bugs to deal with. Namely that one of these methods will forget to return the IO instance (if I use IO[_] a lot, then they already need to deal with some instances, so it is easy to forget) Perhaps if I'm already skilled with IO[_] I can arrange my code cleverly so that introducing event publishing does not change it significantly (can it? how?). But then wouldn't this code be more complex to begin with? Creating yet another source for bugs?I hope it is clear that if the code is refactored as I suggested above then the business logic code does not get any more complex and in fact may become simpler. You might consider this to be a form of inversion of control. Dependency injection inverts control of arguments, the pattern I'm proposing inverts the control of results.
Well, it is a good suggestion (with pros and cons), but note that when I asked on the scalaz mailing list a while ago how to implement such an event bus, the reply was to use IO[_]. I like your suggestion much better.
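To make the "inverts the control of results" idea concrete, here is a minimal sketch of business logic that returns its events instead of publishing them. The names (`Event`, `OrderPlaced`, `StockLow`, `placeOrder`, `publishAll`) and the low-stock threshold are hypothetical, invented for illustration; nothing here comes from Scalaz or from the code discussed in the thread.

```scala
object EventsAsResults {
  // Hypothetical event types, for illustration only.
  sealed trait Event
  final case class OrderPlaced(orderId: Int) extends Event
  final case class StockLow(item: String) extends Event

  // Pure business logic: returns the remaining stock together with
  // the events it would like to have published. Nothing is published here.
  def placeOrder(orderId: Int, item: String, stock: Int): (Int, List[Event]) = {
    val remaining = stock - 1
    val events =
      if (remaining < 5) List(OrderPlaced(orderId), StockLow(item)) // assumed threshold
      else List(OrderPlaced(orderId))
    (remaining, events)
  }

  // The only effectful piece: hands each event to whatever publisher the caller supplies.
  def publishAll(events: List[Event])(publish: Event => Unit): Unit =
    events.foreach(publish)
}
```

A test can then compare the returned list directly, with no stub or mock of an event bus. The cost raised in the reply still applies: some caller has to take the list and pass it to `publishAll`.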
ittay
Thu, 2011-10-13, 23:47
#252525
Re: Re: questioning FP
Runar Bjarnason wrote:
On Thursday, October 13, 2011 3:51:02 PM UTC-4, Ittay Dror wrote:
1. I don't code with random functions named 'foo' that I know nothing about.
Right, so your code is mostly first-order. It probably has a lot of repetition. You can improve on that by abstracting more.
Yes, mostly first order, not a lot of repetition.
2. If side effects are dangerous, IO[_] does not protect me from taking them. Calling a method 'doDangerousAction: Unit' is no different than 'doDangerousAction: IO[Unit]'.
There is a huge difference. The latter call is referentially transparent. The former call will mingle the performing of the dangerous action with program logic. I want a clean separation between describing what is done and actually doing it. If you don't want that, that's fine. Your code, your rules.
Sure, in HOF, I might call a function several times just for the sake of not keeping a val of the result. In this case, the function I pass to the HOF should be idempotent.
I also agree that in such cases, IO[_] is a good approach, but why force me to use it everywhere. Give me a println: Unit and make flatMap accept f: Any => IO[Unit]. Then I need to wrap println for this case only.
3. IO[_] makes code trickier to work with (need to work with for comprehensions, use traverse/sequence etc.)
No it doesn't. It makes code easier to work with. In addition to having no side-effects, referential transparency, equational reasoning, compositionality, clean separation of concerns, I can also make use of functions like traverse and sequence which remove a lot of tedious repetition from my code.
4. IO[_] does not prevent bugs:
Sure it does. Take for example any kind of race condition where you accidentally interleave two side-effects (like writing to a file while getting some content to write to that same file). That kind of interleaving is prevented if the type system forces you to sequence effects monadically.
OK, I'll bite. I have two threads. One writes to a file, the other gets some content. Two questions:
1. How does using IO[_] prevent the race? At the 'run' method of each thread I call unsafePerformIO, which will now run the actions in parallel. Unless of course they are synchronized, which can be done with normal methods.
2. How do I avoid this sequencing when the IO actions can actually be done in parallel (two different files, inserting a cache value into a concurrent hash map)?
1. The same has been said about checked exceptions.
The IO monad is not at all like checked exceptions, precisely because it is a monad.
2. We've traded one source of error with another. Because when using IO[_] the developer can forget to return the result of close and then the resource will not be closed, leading to a leak (which is harder to debug)
This is what types are for. Unit is not the same as IO[Unit], and the compiler will complain if you don't return anything.
Why will the compiler complain? I call close() which returns IO[_] and I forgot to return it. My calling method used to be pure, so it returned Int. It still returns Int. Compiler is happy. Another scenario is that my method was required to return IO and was already returning it from another IO action.
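The scenario described here can be sketched as follows. The `IO` class below is a minimal stand-in written for this example, not Scalaz's implementation, and `Resource`, `leaky` and `correct` are hypothetical names. The sketch assumes default compiler settings.

```scala
object ForgottenEffect {
  // Minimal stand-in for an IO type: a description of an action that runs
  // only when unsafePerformIO() is called. Not the Scalaz implementation.
  final class IO[A](thunk: () => A) {
    def unsafePerformIO(): A = thunk()
    def flatMap[B](f: A => IO[B]): IO[B] = new IO(() => f(thunk()).unsafePerformIO())
    def map[B](f: A => B): IO[B] = new IO(() => f(thunk()))
  }
  object IO { def apply[A](a: => A): IO[A] = new IO(() => a) }

  // Hypothetical resource whose close() only describes the closing.
  final class Resource {
    var closed = false
    def close(): IO[Unit] = IO { closed = true }
  }

  // Compiles, but the IO[Unit] from close() is built and then thrown away,
  // so running the returned action never closes the resource.
  def leaky(r: Resource): IO[Int] = {
    r.close() // value discarded
    IO(42)
  }

  // The close action is sequenced into the result, so it runs.
  def correct(r: Resource): IO[Int] =
    r.close().flatMap(_ => IO(42))
}
```

With default settings `leaky` compiles, which is the point being made above. Recent Scala compilers have lint options for discarded non-Unit values (for example `-Wnonunit-statement`); whether they are enabled is a project decision, and they did not exist when this thread was written.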
Fri, 2011-10-14, 00:37
#262626
Re: Re: questioning FP
On Thu, Oct 13, 2011 at 6:35 PM, Ittay Dror <ittay.dror@gmail.com> wrote:
OK, I'll bite. I have two threads. One writes to a file, the other gets some content. Two questions:
1. How does using IO[_] prevent the race? At the 'run' method of each thread I call unsafePerformIO, which will now run the actions in parallel. Unless of course they are synchronized, which can be done with normal methods.
2. How do I avoid this sequencing when the IO actions can actually be done in parallel (two different files, inserting a cache value into a concurrent hash map)?
No need to go to concurrent threads. Just think of nesting side-effects accidentally. You're writing to a log, and getting the message (claiming to be a String) has the side-effect of switching to the next log file if the current one is full. You end up writing your message to the wrong file. Anything like this:
f(x)
Where f has some sequence of side-effects, and a method on x, that f calls somewhere in this sequence, has a side-effect that interferes with it. If you've written enough "enterprise" software, there's no way you wouldn't have come across this kind of bug.
But, again, this is not the purpose of the IO monad. Its main purpose is separating effects from pure code. Doing this just also happens to prevent interleaving of effects.
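A small sketch of the `f(x)` bug described above, under invented names: `Log` is a hypothetical two-file log, `Message.text` plays the role of the method on x with a hidden side-effect, and `write` plays the role of f. This only reproduces the interleaving; it does not use an IO type.

```scala
object InterleavedEffects {
  import scala.collection.mutable.ListBuffer

  // Hypothetical log with two files; `current` is the index of the active one.
  final class Log {
    val files = Vector(ListBuffer.empty[String], ListBuffer.empty[String])
    var current = 0
    def rotate(): Unit = { current = 1 }
  }

  // Claims to be a plain String getter, but rotates the log as a side-effect.
  final class Message(log: Log) {
    def text: String = { log.rotate(); "disk full" }
  }

  // f: picks the target file first, then asks x for its text.
  // The hidden rotation happens in between, so the text lands in the old file.
  def write(log: Log, x: Message): Unit = {
    val target = log.files(log.current)
    target += x.text
  }

  // Same steps with the order made explicit: run the message's effect first,
  // then pick the target file.
  def writeSequenced(log: Log, x: Message): Unit = {
    val text = x.text
    log.files(log.current) += text
  }
}
```

After `write`, the log points at file 1 while the message sits in file 0. `writeSequenced` avoids that only because its author knew about the hidden effect; the argument for an IO type is that the signature of `text` would have exposed it.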
Fri, 2011-10-14, 06:47
#282828
Re: Re: questioning FP
Runar Bjarnason wrote:
On Thu, Oct 13, 2011 at 6:35 PM, Ittay Dror <ittay.dror@gmail.com> wrote:
OK, I'll bite. I have two threads. One writes to a file, the other gets some content. Two questions:
1. How does using IO[_] prevent the race? At the 'run' method of each thread I call unsafePerformIO, which will now run the actions in parallel. Unless of course they are synchronized, which can be done with normal methods.
2. How do I avoid this sequencing when the IO actions can actually be done in parallel (two different files, inserting a cache value into a concurrent hash map)?
No need to go to concurrent threads. Just think of nesting side-effects accidentally. You're writing to a log, and getting the message (claiming to be a String) has the side-effect of switching to the next log file if the current one is full. You end up writing your message to the wrong file. Anything like this:
f(x)
Where f has some sequence of side-effects, and a method on x, that f calls somewhere in this sequence, has a side-effect that interferes with it. If you've written enough "enterprise" software, there's no way you wouldn't have come across this kind of bug.
As Martin has demonstrated, you can take any imperative program and imagine every expression as returning an IO[_], with ';' or '\n' as bind. So I don't think IO[_] solves such bugs. But it creates the potential for other bugs, where an effectful computation fails to include one of its effects in the result.
But, again, this is not the purpose of the IO monad. Its main purpose is separating effects from pure code. Doing this just also happens to prevent interleaving of effects.
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
In a language with a lot of use of HOFs, that are open to everyone to use, I see how the use of IO is necessary to the point of making all low level functions return IO[_].
Maybe the difference in Scala is that we have classes with open recursion and protected methods. So I can extend functionality while keeping the contract closed. And then the drawbacks of using IO[_] outweigh the benefits.
Fri, 2011-10-14, 12:57
#2a2a2a
Re: Re: questioning FP
On Oct 14, 2011, at 1:36, Ittay Dror <ittay.dror@gmail.com> wrote:
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
It does, but I only know that because I tried it both ways. There's an expression in the southern USA that "you can't tell a young soldier anything", meaning I have to try things for myself and not be told by you. Because I don't yet have the benefit of experience.
Fri, 2011-10-14, 21:37
#2c2c2c
Re: Re: questioning FP
I have tried to try it both ways, but so far
(1) My trials have only illustrated to me that an IO monad is either a waste of time or I'm doing it wrong, and
(2) Nobody has ever provided an example of where it does anything useful, so if I'm doing it wrong I don't know how I could find this out.
When this thread began, I was willing to concede that there may be uses for it, but although I appreciate the efforts of people arguing for it, the reasoning has been unconvincing at best. So I'm now coming to the conclusion that the uses are only to appeal to personal style, not to solve practical programming problems.
So that this is not all just disagreeable rhetoric, let me give three examples where I have tried in the past to use something like an IO monad, why I rejected it, and what the superior solution was.
(1) TIFF image writer.
TIFF images have a number of required components in their header, and many more optional components. I had an application where the header information needed to be assembled in a not-entirely-straightforward way, and I was running into problems with forgetting to initialize parts of the header. I attempted using an IO marker, but that only told me what I already knew: there was a bunch of code dealing with output. The logic was still wrong. So I switched to using a finite state machine in state space (of the Header[False,False,False] style); the IO monad itself didn't assist with this at all. The equivalent of unsafePerformIO would now produce a compile-time error if the header was not properly constructed, and everything was sorted out. But IO[Whatever] only got in the way, so I removed it. I needed a specific finite state machine, not an abstraction for IO, to help keep things straight.
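For readers unfamiliar with the Header[False,False,False] style, here is a minimal sketch of the technique using phantom type parameters. It is an illustration, not the code from the application described: it tracks only two fields rather than three, and `Header`, `width`, `height` and `write` are hypothetical names (ImageWidth and ImageLength are real TIFF tag names, used here only as map keys).

```scala
object TypedHeader {
  // Type-level booleans; they have no values, they only label states.
  sealed trait Bool
  sealed trait True extends Bool
  sealed trait False extends Bool

  // W and H record in the type whether width and height have been set.
  final class Header[W <: Bool, H <: Bool] private (val fields: Map[String, Int]) {
    def width(w: Int): Header[True, H] =
      new Header[True, H](fields + ("ImageWidth" -> w))
    def height(h: Int): Header[W, True] =
      new Header[W, True](fields + ("ImageLength" -> h))
  }

  object Header {
    val empty: Header[False, False] = new Header[False, False](Map.empty)
  }

  // Accepts only a header with both fields set; anything else is a type error.
  // Returns a String instead of writing bytes so the sketch stays side-effect free.
  def write(h: Header[True, True]): String =
    h.fields.toList.sorted.map { case (k, v) => s"$k=$v" }.mkString(",")
}
```

`TypedHeader.write(TypedHeader.Header.empty.width(640))` is rejected by the compiler because its type is `Header[True, False]`, which is the "compile-time error if the header was not properly constructed" mentioned above. Note that nothing in the sketch depends on an IO type.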
(2) Data computation engine.
I have a daemon that sits around looking for data to appear in a directory; when it does, it grabs it, performs some conversions and computations, then spits it out again in another directory while removing the originals. Since files can appear (and disappear!) asynchronously, it's a little tricky to keep things straight. I speculated when I started this project that using an IO monad might help keep things straight. But no: in order to parallelize the computations, I had actors responsible for reading, moving, computing, etc., and there was no sensible way to push IO above the actor level. Within each actor, the concerns were well-separated; the input actor was, as far as the next actor could tell, just a source for data--I had already abstracted over whether or not the data involved IO. Likewise for the output actor. The input actor itself had scarcely a line of code that didn't involve IO, so any marker trait was redundant, and since the input or output statements were (nearly) consecutive, a state machine was also redundant. IO[Whatever] only got in the way--the solution was to have dedicated actors.
(3) Image processing tool.
I analyze scientific images that are sometimes too large for memory; thus, there's lots of IO required in order to keep the relevant portion of the data set (and partially completed computations) available. I had initially thought that using an IO monad would help me keep track of what was going on: where did I need to be careful, and where not? Where were the expensive operations and where were they inexpensive? This worked reasonably well initially; although I wasn't in any danger of forgetting where IO was when I first wrote the program, it seemed as though when I returned to the code later, I'd have a better understanding of the flow of data and caching and so forth. However, when I returned to the code to modify it, I found that the opposite was true: I wanted to change the sites of caching based on profiling. I already knew exactly where the expensive computations were--no thanks to IO markers--and now I had to change a whole bunch of type signatures because I wanted to change the location of IO. This was actually an anti-pattern: the point of caching is to abstract over whether or not there is IO and/or side-effects, and handle them transparently locally. IO[Whatever] got in the way with a vengeance because it was carried upstream too far.
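The point about caching hiding IO locally can be sketched like this. `TileCache`, `tile`, `loads` and `mean` are hypothetical names, and the loader is passed in as a plain function so the example runs without touching a disk; in the tool described it would presumably read from storage.

```scala
object LocalCaching {
  import scala.collection.mutable

  // Callers see only tile(i): Vector[Double]. Whether a call reaches the
  // loader (which may do IO) or is served from memory is decided in here.
  final class TileCache(load: Int => Vector[Double]) {
    private val cache = mutable.Map.empty[Int, Vector[Double]]
    var loads = 0 // counts loader calls, kept only so the behaviour is observable

    def tile(i: Int): Vector[Double] =
      cache.getOrElseUpdate(i, { loads += 1; load(i) })
  }

  // Downstream code has an ordinary signature and does not change
  // when the caching site is moved after profiling.
  def mean(c: TileCache, i: Int): Double = {
    val t = c.tile(i)
    t.sum / t.size
  }
}
```

If `tile` returned an IO-wrapped value instead, `mean` and everything above it would carry that type too, which is the "carried upstream too far" problem described. The trade-off is that `tile` is no longer referentially transparent in the strict sense, which is what the other side of the thread objects to.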
Now, I fully admit that I was not using any established IO monad library; the concepts seemed straightforward enough to me to use them on my own, so maybe I missed some key insights and properties. But I don't think so. I think IO monads are mostly a red herring. You might want to collect your data for IO, but you don't want an IO monad for that, you want a collection (possibly a lazy one). You might want to transform your data, but you want e.g. an appropriate applicative functor for that, not an IO monad. You might want to separate IO concerns from others, but if so, just do it: write self-sufficient IO methods or actors or classes, and put the IO there.
My conclusion is that dividing things into IO and not-IO is an unhelpful abstraction in Scala since different IO tasks have very little in common. Abstraction is useful when there are shared properties; with IO, the only thing that is really shared is that we happened to label these things with "IO". (Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not? Do you really want to label keyboard input, writing to a file, and socket communication with the same marker? Isn't it a lot more useful to mark each one separately, if you need markers?) IO often does not commute, but the IO monad won't prevent errors of commutation (a state machine can). Side-effects are fundamentally different from not-side-effects, but IO is not fundamentally different from any other side effect unless you have no other side effects around, and even then, that-there-are-side-effects is unlikely to be as interesting as what-the-side-effects-are.
Thus, I think Ittay had the right conclusion: it is useful only for purity. Purity aside, you almost always want something else.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness (not e.g. the use of monad transformers). So I hold out hope that I'm still missing something valuable. But it's a little frustrating that it's boiling down to "try it and you'll see" when I've tried to try it and found the opposite.
--Rex
On Fri, Oct 14, 2011 at 7:55 AM, Runar Oli <runarorama@gmail.com> wrote:
On Oct 14, 2011, at 1:36, Ittay Dror <ittay.dror@gmail.com> wrote:
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
It does, but I only know that because I tried it both ways. There's an expression in the southern USA that "you can't tell a young soldier anything", meaning I have to try things for myself and not be told by you. Because I don't yet have the benefit of experience.
Fri, 2011-10-14, 23:47
#2e2e2e
Re: questioning FP
Hi Rex,
well spoken, thanks for writing this up! Especially this part I wholeheartedly agree with:
You might want to collect your data for IO, but you don't want an IO monad for that, you want a collection (possibly a lazy one). You might want to transform your data, but you want e.g. an appropriate applicative functor for that, not an IO monad. You might want to separate IO concerns from others, but if so, just do it: write self-sufficient IO methods or actors or classes, and put the IO there.
Functional programming is very useful and provides tools of great utility, the IO monad is simply not part of this set. And since we are on scala-debate, I may add that from the fact that the IO monad is actually necessary in purely functional languages I conclude that those languages—while interesting in an academic sense—are fundamentally broken in practice. I’m not saying that it is impossible to write practical software with them, it’s just not true that purely functional programming should be portrayed as the goal everyone should aspire to reach. And there is mounting evidence for the fact that Scala is sitting much closer to the sweet spot than most other established programming languages out there; this is carefully formulated to indicate that improvement is still possible ;-)
This mail has sparked a question in my mind: I do think that actors are a perfect way of encapsulating IO, it gives you all the goodies without spoiling the rest of the code base (which would naturally also be written using actors ;-) ). Can anybody explain what could be wrong with this intuition? Actually it is more than intuition, since I have implemented quite a few systems which use this approach. CS theoretical issues are also welcome.
Regards,
Roland
On Oct 14, 2011, at 22:28 , Rex Kerr wrote:
I have tried to try it both ways, but so far
(1) My trials have only illustrated to me that an IO monad is either a waste of time or I'm doing it wrong, and
(2) Nobody has ever provided an example of where it does anything useful, so if I'm doing it wrong I don't know how I could find this out.
When this thread began, I was willing to concede that there may be uses for it, but although I appreciate the efforts of people arguing for it, the reasoning has been unconvincing at best. So I'm now coming to the conclusion that the uses are only to appeal to personal style, not to solve practical programming problems.
So that this is not all just disagreeable rhetoric, let me give three examples where I have tried in the past to use something like an IO monad, why I rejected it, and what the superior solution was.
(1) TIFF image writer.
Tiff images have a number of required components in their header, and many more optional components. I had an application where the header information needed to be assembled in a not-entirely-straightforward way, and I was running into problems with forgetting to initialize parts of the header. I attempted using an IO marker, but that only told me what I already knew: there was a bunch of code dealing with output. The logic was still wrong. So I switched to using a finite state machine in state space (of the Header[False,False,False] style); the IO monad itself didn't assist with this at all. The equivalent of performUnsafeIO now would throw a compile time error if the header was not properly constructed, and everything was sorted out. But IO[Whatever] only got in the way, so I removed it. I needed a specific finite state machine, not an abstraction for IO, to help keep things straight.
(2) Data computation engine.
I have a daemon that sits around looking for data to appear in a directory; when it does, it grabs it, performs some conversions and computations, then spits it out again in another directory while removing the originals. Since files can appear (and disappear!) asynchronously, it's a little tricky to keep things straight. I speculated when I started this project that using an IO monad might help keep things straight. But no: in order to parallelize the computations, I had actors responsible for reading, moving, computing, etc., and there was no sensible way to push IO above the actor level. Within each actor, the concerns were well-separated; the input actor was, as far as the next actor could tell, just a source for data--I had already abstracted over whether or not the data involved IO. Likewise for the output actor. The input actor itself had scarcely a line of code that didn't involve IO, so any marker trait was redundant, and since the input or output statements were (nearly) consecutive, a state machine was also redundant. IO[Whatever] only got in the way--the solution was to have dedicated actors.
(3) Image processing tool.
I analyze scientific images that are sometimes too large for memory; thus, there's lots of IO required in order to keep the relevant portion of the data set (and partially completed computations) available. I had initially thought that using an IO monad would help me keep track of what was going on: where did I need to be careful, and where not? Where were the expensive operations and where were they inexpensive? This worked reasonably well initially; although I wasn't in any danger of forgetting where IO was when I first wrote the program, it seemed as though when I returned to the code later, I'd have a better understanding of the flow of data and caching and so forth. However, when I returned to the code to modify it, I found that the opposite was true: I wanted to change the sites of caching based on profiling. I already knew exactly where the expensive computations were--no thanks to IO markers--and now I had to change a whole bunch of type signatures because I wanted to change the location of IO. This was actually an anti-pattern: the point of caching is to abstract over whether or not there is IO and/or side-effects, and handle them transparently locally. IO[Whatever] got in the way with a vengeance because it was carried upstream too far.
Now, I fully admit that I was not using any established IO monad library; the concepts seemed straightforward enough to me to use them on my own, so maybe I missed some key insights and properties. But I don't think so. I think IO monads are mostly a red herring. You might want to collect your data for IO, but you don't want an IO monad for that, you want a collection (possibly a lazy one). You might want to transform your data, but you want e.g. an appropriate applicative functor for that, not an IO monad. You might want to separate IO concerns from others, but if so, just do it: write self-sufficient IO methods or actors or classes, and put the IO there.
My conclusion is that dividing things into IO and not-IO is an unhelpful abstraction in Scala since different IO tasks have very little in common. Abstraction is useful when there are shared properties; with IO, the only thing that is really shared is that we happened to label these things with "IO". (Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not? Do you really want to label keyboard input, writing to a file, and socket communication with the same marker? Isn't a lot more useful to mark each one separately, if you need markers?) IO often does not commute, but the IO monad won't prevent errors of commution (a state machine can). Side-effects are fundamentally different from not-side-effects, but IO is not fundamentally different from any other side effect unless you have no other side effects around, and even then, that-there-are-side-effects are unlikely to be as interesting as what-the-side-effects-are.
Thus, I think Ittay had the right conclusion: it is useful only for purity. Purity aside, you almost always want something else.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness (not e.g. the use of monad transformers). So I hold out hope that I'm still missing something valuable. But it's a little frustrating that it's boiling down to "try it and you'll see" when I've tried to try it and found the opposite.
--Rex
On Fri, Oct 14, 2011 at 7:55 AM, Runar Oli <runarorama@gmail.com> wrote:
On Oct 14, 2011, at 1:36, Ittay Dror <ittay.dror@gmail.com> wrote:
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
It does, but I only know that because I tried it both ways. There's an expression in the southern USA that "you can't tell a young soldier anything", meaning I have to try things for myself and not be told by you. Because I don't yet have the benefit of experience.
Roland Kuhn
Typesafe – Enterprise-Grade Scala from the Experts
twitter: @rolandkuhn
Sat, 2011-10-15, 06:47
#303030
Re: Re: questioning FP
bravo!
Rex Kerr wrote:
I have tried to try it both ways, but so far
(1) My trials have only illustrated to me that an IO monad is either a waste of time or I'm doing it wrong, and
(2) Nobody has ever provided an example of where it does anything useful, so if I'm doing it wrong I don't know how I could find this out.
When this thread began, I was willing to concede that there may be uses for it, but although I appreciate the efforts of people arguing for it, the reasoning has been unconvincing at best. So I'm now coming to the conclusion that the uses are only to appeal to personal style, not to solve practical programming problems.
So that this is not all just disagreeable rhetoric, let me give three examples where I have tried in the past to use something like an IO monad, why I rejected it, and what the superior solution was.
(1) TIFF image writer.
TIFF images have a number of required components in their header, and many more optional components. I had an application where the header information needed to be assembled in a not-entirely-straightforward way, and I was running into problems with forgetting to initialize parts of the header. I attempted using an IO marker, but that only told me what I already knew: there was a bunch of code dealing with output. The logic was still wrong. So I switched to using a finite state machine in state space (of the Header[False,False,False] style); the IO monad itself didn't assist with this at all. The equivalent of unsafePerformIO would now produce a compile-time error if the header was not properly constructed, and everything was sorted out. But IO[Whatever] only got in the way, so I removed it. I needed a specific finite state machine, not an abstraction for IO, to help keep things straight.
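For readers unfamiliar with the Header[False,False,False] style, a minimal sketch of the idea follows. The fields (width, height, description) and the names Flag, Yes, No and write are hypothetical; the actual header in the post is not shown, and write returns a String here only so the sketch needs no file access.

```scala
object TiffHeaderSketch {
  // Phantom types recording whether a required field has been set.
  sealed trait Flag
  sealed trait Yes extends Flag
  sealed trait No extends Flag

  // Hypothetical header with two required fields (width, height) and one
  // optional field. W and H exist only at compile time.
  final case class Header[W <: Flag, H <: Flag](
      width: Int = 0,
      height: Int = 0,
      description: Option[String] = None) {
    def withWidth(w: Int): Header[Yes, H] = Header[Yes, H](w, height, description)
    def withHeight(h: Int): Header[W, Yes] = Header[W, Yes](width, h, description)
    def withDescription(d: String): Header[W, H] = Header[W, H](width, height, Some(d))
  }

  val empty: Header[No, No] = Header[No, No]()

  // Only a fully initialised header is accepted here.
  def write(h: Header[Yes, Yes]): String =
    s"TIFF ${h.width}x${h.height}" + h.description.fold("")(d => s" ($d)")
}
```

With these definitions, write(empty.withWidth(640)) is rejected by the compiler because its argument has type Header[Yes, No], while the fields can still be set in any order.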
(2) Data computation engine.
I have a daemon that sits around looking for data to appear in a directory; when it does, it grabs it, performs some conversions and computations, then spits it out again in another directory while removing the originals. Since files can appear (and disappear!) asynchronously, it's a little tricky to keep things straight. I speculated when I started this project that using an IO monad might help keep things straight. But no: in order to parallelize the computations, I had actors responsible for reading, moving, computing, etc., and there was no sensible way to push IO above the actor level. Within each actor, the concerns were well-separated; the input actor was, as far as the next actor could tell, just a source for data--I had already abstracted over whether or not the data involved IO. Likewise for the output actor. The input actor itself had scarcely a line of code that didn't involve IO, so any marker trait was redundant, and since the input or output statements were (nearly) consecutive, a state machine was also redundant. IO[Whatever] only got in the way--the solution was to have dedicated actors.
(3) Image processing tool.
I analyze scientific images that are sometimes too large for memory; thus, there's lots of IO required in order to keep the relevant portion of the data set (and partially completed computations) available. I had initially thought that using an IO monad would help me keep track of what was going on: where did I need to be careful, and where not? Where were the expensive operations and where were they inexpensive? This worked reasonably well initially; although I wasn't in any danger of forgetting where IO was when I first wrote the program, it seemed as though when I returned to the code later, I'd have a better understanding of the flow of data and caching and so forth. However, when I returned to the code to modify it, I found that the opposite was true: I wanted to change the sites of caching based on profiling. I already knew exactly where the expensive computations were--no thanks to IO markers--and now I had to change a whole bunch of type signatures because I wanted to change the location of IO. This was actually an anti-pattern: the point of caching is to abstract over whether or not there is IO and/or side-effects, and handle them transparently locally. IO[Whatever] got in the way with a vengeance because it was carried upstream too far.
Now, I fully admit that I was not using any established IO monad library; the concepts seemed straightforward enough to me to use them on my own, so maybe I missed some key insights and properties. But I don't think so. I think IO monads are mostly a red herring. You might want to collect your data for IO, but you don't want an IO monad for that, you want a collection (possibly a lazy one). You might want to transform your data, but you want e.g. an appropriate applicative functor for that, not an IO monad. You might want to separate IO concerns from others, but if so, just do it: write self-sufficient IO methods or actors or classes, and put the IO there.
My conclusion is that dividing things into IO and not-IO is an unhelpful abstraction in Scala since different IO tasks have very little in common. Abstraction is useful when there are shared properties; with IO, the only thing that is really shared is that we happened to label these things with "IO". (Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not? Do you really want to label keyboard input, writing to a file, and socket communication with the same marker? Isn't it a lot more useful to mark each one separately, if you need markers?) IO often does not commute, but the IO monad won't prevent errors of commutation (a state machine can). Side-effects are fundamentally different from not-side-effects, but IO is not fundamentally different from any other side effect unless you have no other side effects around, and even then, that-there-are-side-effects is unlikely to be as interesting as what-the-side-effects-are.
Thus, I think Ittay had the right conclusion: it is useful only for purity. Purity aside, you almost always want something else.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness (not e.g. the use of monad transformers). So I hold out hope that I'm still missing something valuable. But it's a little frustrating that it's boiling down to "try it and you'll see" when I've tried to try it and found the opposite.
--Rex
On Fri, Oct 14, 2011 at 7:55 AM, Runar Oli <runarorama@gmail.com> wrote:
On Oct 14, 2011, at 1:36, Ittay Dror <ittay.dror@gmail.com> wrote:
I was questioning whether this separation benefits anything or is only good from a purist point of view where everything is black & white.
It does, but I only know that because I tried it both ways. There's an expression in the southern USA that "you can't tell a young soldier anything", meaning I have to try things for myself and not be told by you. Because I don't yet have the benefit of experience.
Sat, 2011-10-15, 07:17
#323232
Re: Re: questioning FP
On Friday, October 14, 2011 4:28:50 PM UTC-4, Rex Kerr wrote:
(Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not?
It is. See scalaz.IORef and scalaz.STRef.
Do you really want to label keyboard input, writing to a file, and socket communication with the same marker?
No, I don't want any marker. I want to describe these things declaratively with a DSL that I can pass to an interpreter at a convenient time.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness
That's the thing though. Much of the usefulness of an IO datatype is that it is a monad. So wherever you see M[_]:Monad or M[_]:Applicative, you can pass IO for M. The usefulness is precisely that it has no IOness. It's just another DSL, and manipulating expressions written in that DSL has no side-effects. This is why it's useful. I'm sorry you were not able to make use of it. Maybe the situation would improve if we built a more complete library of IO widgets.
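To make "wherever you see M[_]:Monad, you can pass IO for M" concrete, here is a small sketch. The Monad trait and the toy IO class are hand-rolled for illustration and are not the Scalaz definitions; sumAll is a hypothetical function written once against the abstraction.

```scala
object MonadSketch {
  // A minimal hand-rolled Monad type class (an assumption, not Scalaz's).
  trait Monad[M[_]] {
    def pure[A](a: A): M[A]
    def flatMap[A, B](ma: M[A])(f: A => M[B]): M[B]
  }

  // A toy IO: a description of a computation that runs only on unsafeRun().
  final class IO[A](val unsafeRun: () => A)

  implicit val ioMonad: Monad[IO] = new Monad[IO] {
    def pure[A](a: A): IO[A] = new IO(() => a)
    def flatMap[A, B](ma: IO[A])(f: A => IO[B]): IO[B] =
      new IO(() => f(ma.unsafeRun()).unsafeRun())
  }

  implicit val optionMonad: Monad[Option] = new Monad[Option] {
    def pure[A](a: A): Option[A] = Some(a)
    def flatMap[A, B](ma: Option[A])(f: A => Option[B]): Option[B] = ma.flatMap(f)
  }

  // Written once against any monad M; it knows nothing about IO.
  def sumAll[M[_], A](items: List[A])(f: A => M[Int])(implicit M: Monad[M]): M[Int] =
    items.foldLeft(M.pure(0)) { (acc, a) =>
      M.flatMap(acc)(total => M.flatMap(f(a))(n => M.pure(total + n)))
    }
}
```

Calling sumAll[IO, String](...) only builds a description; nothing happens until unsafeRun() is invoked, while sumAll[Option, String](...) reuses the same code for a computation that may fail.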
I can relate a story where having an IO monad solved a real problem at work. This was not very long ago, for a web-based application talking to a database via JDBC. This was a million-LOC enterprise wossname with all the trimmings. The part that talked to the database was designed with a kind of strategy pattern, where you would inherit from a class named Command and override a method named "body". Commands were executed using execute, the implementation of which went something like this:
1. Read the inputs and validate them.
2. Open a database connection.
3. If all is well, execute the body, passing the inputs.
4. Close the database connections.
Plus some complicated error-handling etc, but you get the idea.
At some point we started noticing that the database was accumulating row locks while the application was running with about 5000 concurrent users. If left alone, the app would become unresponsive, so we resorted to manually killing database sessions.
The reason this was occurring is that some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
Any red-blooded functional programmer would at this point be screaming "use a monad!" Wherever you have an inner thing that depends on an outer thing, but needs shared context, you have a monad. The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
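A sketch of what such chaining could look like. The connection type is left generic and FakeConn is a hypothetical stand-in so the example runs without a database; DbAction, andThenK, findUser and deleteUser are illustrative names, not the code from the application described. In this shape the connection is the value threaded through, and A is each command's result.

```scala
object DbActionSketch {
  // Same shape as the type in the post: take a connection, return a result
  // together with the connection to hand to the next command.
  final case class DbAction[C, A](run: C => (A, C)) {
    def flatMap[B](f: A => DbAction[C, B]): DbAction[C, B] =
      DbAction { (c0: C) =>
        val (a, c1) = run(c0)
        f(a).run(c1)
      }
    def map[B](f: A => B): DbAction[C, B] =
      flatMap(a => DbAction.pure[C, B](f(a)))
  }

  object DbAction {
    def pure[C, A](a: A): DbAction[C, A] = DbAction((c: C) => (a, c))
  }

  // Kleisli composition: feed the result of `first` into `second`,
  // both running on the same connection.
  def andThenK[C, A, B, D](
      first: A => DbAction[C, B],
      second: B => DbAction[C, D]): A => DbAction[C, D] =
    a => first(a).flatMap(second)

  // Hypothetical stand-in for a connection: records the statements sent to it.
  final case class FakeConn(log: Vector[String])

  def findUser(name: String): DbAction[FakeConn, Int] =
    DbAction((c: FakeConn) =>
      (name.length, c.copy(log = c.log :+ s"select id from users where name = '$name'")))

  def deleteUser(id: Int): DbAction[FakeConn, Boolean] =
    DbAction((c: FakeConn) =>
      (true, c.copy(log = c.log :+ s"delete from users where id = $id")))

  // Two primitive commands chained into "one command".
  val removeByName: String => DbAction[FakeConn, Boolean] =
    andThenK[FakeConn, String, Int, Boolean](findUser, deleteUser)
}
```

Neither primitive command can open or close a connection on its own; only whoever finally calls run supplies one.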
Of course, this was in Java, which as everybody knows, is just a functional academic language, and we should all pat ourselves on the back for not being that academic.
Sat, 2011-10-15, 08:37
#343434
Re: Re: questioning FP
On Sat, Oct 15, 2011 at 2:09 AM, Runar Bjarnason <runarorama@gmail.com> wrote:
That's the thing though. Much of the usefulness of an IO datatype is that it is a monad. So wherever you see M[_]:Monad or M[_]:Applicative, you can pass IO for M. The usefulness is precisely that it has no IOness. It's just another DSL, and manipulating expressions written in that DSL has no side-effects. This is why it's useful. I'm sorry you were not able to make use of it. Maybe the situation would improve if we built a more complete library of IO widgets.
Maybe; it'd depend on what the widgets were. (And on how well they were documented.) As I said, I was rolling my own.
Commands were executed using execute, the implementation of which went something like this:
1. Read the inputs and validate them.
2. Open a database connection.
3. If all is well, execute the body, passing the inputs.
4. Close the database connections.
[...] some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
Ouch. It seems like there are two serious problems here: first, redundant and possibly conflicting open/closing; and second, poor definition of what state of the database should be used when nesting commands--do you use the partial manipulations or not?
The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
That sort of works (although the type you've specified works only for input? Or you're using A to store state in the type system as well as wrap input?). What's not clear, however, is how this is superior to:
(1) adding another method to the parent class that separates out the DB-open-close stuff from the read/write of an existing database (and using the appropriate one when calling externally vs. to each other), or
(2) Wrapping the database interface to cache open/close requests so you don't have to worry about it, or
(3) refactoring as a bunch of primitive commands that you chain by sequential appearance in a method (or in a list over which you fold or map the connection) rather than Kleisli composition. (I.e. do the same thing, but write methods that implement only java.sql.Connection => A; if you're not enforcing via types only certain combinations, there isn't much point in returning the connection.)
Still, it looks _somewhat_ promising, so I appreciate the example.
--Rex
Sat, 2011-10-15, 10:17
#2d2d2d
Re: Re: questioning FP
Runar Bjarnason wrote:
On Friday, October 14, 2011 4:28:50 PM UTC-4, Rex Kerr wrote:(Why is writing something to a locked file and reading it back "IO" when writing a value in shared memory and reading it back is not?
It is. See scalaz.IORef and scalaz.STRef.
Isn't STRef a solution to allow using mutable data structures for performance gains? Personally, I'd just use the mutable data structure directly (making it hidden/private in some scope if I'm afraid it would accidentally be shared)
Do you really want to label keyboard input, writing to a file, and socket communication with the same marker?
No, I don't want any marker. I want to describe these things declaratively with a DSL that I can pass to an interpreter at a convenient time.
Indeed, in this use case, as in HOF, the use of IO is valid. However, not all development uses these use cases. So not all functions that do side-effects should return IO, just those involved in these use cases.
In other words, IO[_] is useful to solve a use-case, not as a means to get referential transparency.
I'm quite open to counterexamples; I am still rather mystified how so many intelligent people could find something so valuable without being able to illustrate a clear use-case that depends exactly on the IOness
That's the thing though. Much of the usefulness of an IO datatype is that it is a monad. So wherever you see M[_]:Monad or M[_]:Applicative, you can pass IO for M. The usefulness is precisely that it has no IOness. It's just another DSL, and manipulating expressions written in that DSL has no side-effects. This is why it's useful. I'm sorry you were not able to make use of it. Maybe the situation would improve if we built a more complete library of IO widgets.
I can relate a story where having an IO monad solved a real problem at work. This was not very long ago, for a web-based application talking to a database via JDBC. This was a million-LOC enterprise wossname with all the trimmings. The part that talked to the database was designed with a kind of strategy pattern, where you would inherit from a class named Command and override a method named "body". Commands were executed using execute, the implementation of which went something like this:
1. Read the inputs and validate them.
2. Open a database connection.
3. If all is well, execute the body, passing the inputs.
4. Close the database connections.
Sounds like you had a problem with separation of concerns
Plus some complicated error-handling etc, but you get the idea.
At some point we started noticing that the database was accumulating row locks while the application was running with about 5000 concurrent users. If left alone, the app would become unresponsive, so we resorted to manually killing database sessions.
The reason this was occurring is that some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
Any red-blooded functional programmer would at this point be screaming "use a monad!" Wherever you have an inner thing that depends on an outer thing, but needs shared context, you have a monad. The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
My approach would have been:
* create an Executor class that accepts a Command, opens a connection, calls command.body and closes the database connection.
* This can even be a method withConnection{f: Connection => Unit}
* commands can call other commands' body method
* There's no problem composing f: Connection => Unit and g: Connection => Unit with an operator: `f andAgain g`, where toAgain(f).andAgain(g) = {a => f(a); g(a)}. Of course, the class-based approach also allows creating a sort of HOF that invokes g in the middle of f.
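The bullet points above might be sketched as follows. Conn is a hypothetical stand-in for a database connection, and the implicit class plays the role of the toAgain conversion mentioned in the last bullet; none of this is taken from the application under discussion.

```scala
import scala.collection.mutable.ArrayBuffer

object WithConnectionSketch {
  // Hypothetical stand-in for a database connection.
  final class Conn {
    var open = true
    val statements = ArrayBuffer.empty[String]
    def execute(sql: String): Unit = {
      require(open, "connection is closed")
      statements += sql
    }
    def close(): Unit = { open = false }
  }

  // Loan pattern: open once, run the body, always close.
  def withConnection(connect: () => Conn)(f: Conn => Unit): Unit = {
    val c = connect()
    try f(c) finally c.close()
  }

  // Enrichment providing andAgain: run f, then g, on the same connection.
  implicit class AndAgain(f: Conn => Unit) {
    def andAgain(g: Conn => Unit): Conn => Unit = c => { f(c); g(c) }
  }
}
```

A caller would write something like withConnection(connect)(insert.andAgain(update)), so both commands share one connection that is opened and closed exactly once.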
Of course, this was in Java, which as everybody knows, is just a functional academic language, and we should all pat ourselves on the back for not being that academic.
Sat, 2011-10-15, 10:27
#2f2f2f
Re: questioning FP
> I can relate a story where having an IO monad solved a real problem at work. This was not very long ago, for a web-based application talking to a database via JDBC. This was a million-LOC enterprise wossname with all the trimmings. The part that talked to the database was designed with a kind of strategy pattern, where you would inherit from a class named Command and override a method named "body". Commands were executed using execute, the implementation of which went something like this:
>
> 1. Read the inputs and validate them.
> 2. Open a database connection.
> 3. If all is well, execute the body, passing the inputs.
> 4. Close the database connections.
>
> Plus some complicated error-handling etc, but you get the idea.
>
> At some point we started noticing that the database was accumulating row locks while the application was running with about 5000 concurrent users. If left alone, the app would become unresponsive, so we resorted to manually killing database sessions.
>
> The reason this was occurring is that some commands actually depended on other commands, so occasionally a programmer would implement a command so that it would call another Command's execute method directly. The nested session would of course depend on rows that were already being manipulated (and not committed yet) by the outer session.
>
> Any red-blooded functional programmer would at this point be screaming "use a monad!" Wherever you have an inner thing that depends on an outer thing, but needs shared context, you have a monad. The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
>
> type IO[A] = java.sql.Connection => (A, java.sql.Connection)
>
> Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
Thanks for this illuminating example!
> Of course, this was in Java, which as everybody knows, is just a functional academic language, and we should all pat ourselves on the back for not being that academic.
;-)
Heiko
Sat, 2011-10-15, 16:47
#303030
Re: Re: questioning FP
On Sat, Oct 15, 2011 at 3:27 AM, Rex Kerr <ichoran@gmail.com> wrote:
The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
(3) refactoring as a bunch of primitive commands that you chain by sequential appearance in a method (or in a list over which you fold or map the connection) rather than Kleisli composition. (I.e. do the same thing, but write methods that implement only java.sql.Connection => A; if you're not enforcing via types only certain combinations, there isn't much point in returning the connection.)
Runar's example is using a state monad where the A is the type of the state. All of the functions composed using Kleisli composition in this case can participate in composing a new state, or not! (leaving the state the same).
From the Scalaz source, if you look at the definition of map and flatMap for State:
def map[B](f: A => B): State[S, B] = state(apply(_) match { case (s, a) => (s, f(a)) })
def flatMap[B](f: A => State[S, B]): State[S, B] = state(apply(_) match { case (s, a) => f(a)(s) })
Meaning each composed function can emit an output (in the above example: java.sql.Connection) and a potential update to the state. Depending on how the state monad is used you can either get the state or the final result (or both) at the end of the computation.
Now, if you think purely functionally then functions using the above signature have no way to get another connection to, well, anything. Pure functions produce outputs exclusively based on their inputs. In the case of functions running in a State monad those inputs are the function inputs and the accumulated state so far. Your fold example is less flexible than what Runar is talking about because it is required to produce an A, but Runar can do things like lift the identity function into the State and simply do a no-op. If you hack around your fold example and try to achieve the same level of flexibility while retaining a functional-programming idiom you will eventually stumble upon the State monad.
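For illustration, a self-contained sketch of a state monad in the same S => (S, A) orientation as the Scalaz definitions quoted above, including the identity no-op. This is a hand-written approximation, not the Scalaz source; record, noop and program are hypothetical names, and the state here is simply a list of recorded statements.

```scala
object StateSketch {
  // Minimal state monad: a function from a state to a new state and a result.
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State((s0: S) => { val (s1, a) = run(s0); (s1, f(a)) })
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State((s0: S) => { val (s1, a) = run(s0); f(a).run(s1) })
  }

  object State {
    def pure[S, A](a: A): State[S, A] = State((s: S) => (s, a))
    def get[S]: State[S, S] = State((s: S) => (s, s))
    def modify[S](f: S => S): State[S, Unit] = State((s: S) => (f(s), ()))
    // The no-op: lifting identity leaves the state untouched.
    def noop[S]: State[S, Unit] = modify[S](s => s)
  }

  // A step that updates the state by recording a statement.
  def record(sql: String): State[List[String], Unit] =
    State.modify[List[String]](log => sql :: log)

  // Steps may update the state or leave it alone; the result is read at the end.
  val program: State[List[String], Int] = for {
    _   <- record("begin")
    _   <- State.noop[List[String]]
    _   <- record("commit")
    log <- State.get[List[String]]
  } yield log.size
}
```

Running program.run(Nil) returns both the final state and the result, so the caller can keep either or both.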
--
Jim Powers
Sat, 2011-10-15, 16:57
#323232
Re: questioning FP
Hi all,
Thanks for this very interesting thread. After enjoying listening to the various arguments I would like to add my view: Except for Roland's "idea" of using Actors for something (?), only the FP/monadic party (Runar) has shown a well thought out and field-tested approach for properly dealing with side effects. Therefore, as long as nobody comes up with a real alternative (except for ignoring), I think I will be an FP fanboy.
By the way: While I am, despite my age, still a young soldier and only a real-world, enterprise-system OO developer, I find the FP approach neither hard to understand nor hard to apply.
Just my two cents,
Heiko
Sat, 2011-10-15, 17:07
#343434
Re: Re: questioning FP
On Oct 15, 2011, at 5:06, Ittay Dror <ittay.dror@gmail.com> wrote:
Isn't STRef a solution to allow using mutable data structures for performance gains?
It's a solution for allowing working with mutable data structures in a way that the type system guarantees is referentially transparent.
My approach would have been:
* create an Executor class that accepts a Command, opens a connection, calls command.body and closes the database connection.
You've just described our unsafePerformIO. And we had an "andAgain" function except it was called "bind" and its type signature wasn't naff.
Sat, 2011-10-15, 17:17
#363636
Re: Re: questioning FP
On Sat, Oct 15, 2011 at 11:41 AM, Jim Powers <jim@casapowers.com> wrote:
On Sat, Oct 15, 2011 at 3:27 AM, Rex Kerr <ichoran@gmail.com> wrote:
The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
(3) refactoring as a bunch of primitive commands that you chain by sequential appearance in a method (or in a list over which you fold or map the connection) rather than Kleisli composition. (I.e. do the same thing, but write methods that implement only java.sql.Connection => A; if you're not enforcing via types only certain combinations, there isn't much point in returning the connection.)
Runar's example is using a state monad where the A is the type of the state.
Ah, okay. Nothing wrong with using a state monad; it's a perfectly sensible way to build a finite state machine. I'm not that familiar with it, so I didn't recognize it from the signature.
Except again, it has nothing to do with IO. This is equally useful if you want to build an entirely immutable data structure where you have constraints on particular combinations of values in that structure; if said structure is from an external library, it's easier to build a state monad than to try other means of statefully wrapping the library.
I do not argue that state monads are unhelpful (overkill sometimes, but still helpful, and more flexible than a fold, I agree). I just argue that creating a state monad and labeling it "IO" is a strange thing to do; it has to do with linking values (state) and types. Whether or not the state is internal or external (i.e. IO) is immaterial, and in a particular application you want to know _which_ IO process it's helping you with, not that it has something to do with some IO process.
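For readers following along, here is a minimal sketch of the connection-threading type discussed above. It is an illustration under assumptions, not the code from the project being described: Conn is a hypothetical stand-in for java.sql.Connection, and exec, pure and bind are names invented for this sketch.

```scala
object ConnSketch {
  // Hypothetical stand-in for java.sql.Connection; it only records the statements it was given.
  final case class Conn(log: Vector[String])

  // Same shape as the alias quoted above, with Conn in place of java.sql.Connection.
  type IO[A] = Conn => (A, Conn)

  // A primitive command: "runs" a statement and passes the connection on.
  def exec(sql: String): IO[Int] =
    c => (sql.length, c.copy(log = c.log :+ sql))

  // Returns a value without touching the connection.
  def pure[A](a: A): IO[A] = c => (a, c)

  // Runs the first command, feeds its result to f, and threads the same connection through.
  def bind[A, B](io: IO[A])(f: A => IO[B]): IO[B] =
    c0 => { val (a, c1) = io(c0); f(a)(c1) }

  // Two primitive commands chained into one; neither opens its own connection.
  val both: IO[Int] =
    bind[Int, Int](exec("insert a"))(n => bind[Int, Int](exec("insert b"))(m => pure(n + m)))
}
```

Nothing happens until the composed command is applied to a connection, e.g. ConnSketch.both(ConnSketch.Conn(Vector.empty)), which is the role unsafePerformIO plays elsewhere in this thread.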
--Rex
Sat, 2011-10-15, 17:27
#383838
Re: Re: questioning FP
Runar Oli wrote:
I'm not arguing there's no place for lazy IO or the IO monad in programming. Of course there are use cases where people "redesign the wheel" without considering they're actually implementing the IO[_] "design pattern" (or State[_] for that matter).
On Oct 15, 2011, at 5:06, Ittay Dror <ittay.dror@gmail.com> wrote:
Isn't STRef a solution to allow using mutable data structures for performance gains?
It's a solution for allowing working with mutable data structures in a way that the type system guarantees is referentially transparent.
My approach would have been:
* create an Executor class that accepts a Command, opens a connection, calls command.body and closes the database connection.
You've just described our unsafePerformIO. And we had an "andAgain" function except it was called "bind" and its type signature wasn't naff.
My point was that IO[_] is not a silver bullet. In many cases (I think the majority), side-effecting functions are easier to use (without incurring bugs, of course) than trying to make them referentially transparent with IO[_]. In other words, referential transparency is not a holy grail.
Sat, 2011-10-15, 17:37
#3a3a3a
Re: Re: questioning FP
Jim Powers wrote:
On Sat, Oct 15, 2011 at 3:27 AM, Rex Kerr <ichoran@gmail.com> wrote:
The solution was of course to refactor so that opening a database connection was disallowed. Instead, a command would assume an open database connection, do something with it, and pass it on. The type we used was something like this:
type IO[A] = java.sql.Connection => (A, java.sql.Connection)
Now, instead of having outer commands call inner commands, we simply chain them with Kleisli composition: outer.compose(inner), and they share the same connection and act as "one command". This resulted in us refactoring to a bunch of primitive commands that we could chain, instead of monolithic ones.
(3) refactoring as a bunch of primitive commands that you chain by sequential appearance in a method (or in a list over which you fold or map the connection) rather than Kleisli composition. (I.e. do the same thing, but write methods that implement only java.sql.Connection => A; if you're not enforcing via types only certain combinations, there isn't much point in returning the connection.)
Runar's example is using a state monad where the A is the type of the state. All of the functions composed using Kleisli composition in this case can participate in composing a new state, or not! (leaving the state the same).
The State monad is S => (A, S), so the state here is Connection, which doesn't change (or at least not in a way you can distinguish, I think), so it looks like A, the result, is produced without referential transparency (Runar, correct me please).
From the Scalaz source, if you look at the definition of map and flatMap for State:
def map[B](f: A => B): State[S, B] = state(apply(_) match { case (s, a) => (s, f(a)) })
def flatMap[B](f: A => State[S, B]): State[S, B] = state(apply(_) match { case (s, a) => f(a)(s) })
Meaning each composed function can emit an output (in the above example: java.sql.Connection) and a potential update to the state. Depending on how the state monad is used you can either get the state or the final result (or both) at the end of the computation.
Now, if you think purely functionally then functions using the above signature have no way to get another connection to, well, anything. Pure functions produce outputs exclusively based on their inputs. In the case of functions running in a State monad those inputs are the function inputs and the accumulated state so far. Your fold example is less flexible than what Runar is talking about because it is required to produce an A, but Runar can do things like lift the identity function into the State and simply do a no-op. If you hack around your fold example and try to achieve the same level of flexibility while retaining a functional-programming idiom you will eventually stumble upon the State monad.
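To make the no-op point concrete, here is a small self-contained sketch. It is not the Scalaz State class; it only mirrors the (state, value) ordering of the map and flatMap quoted above, and the helper names unit, modify and get are assumptions made for this illustration.

```scala
object StateSketch {
  // Minimal State in the same (state, value) order as the quoted map/flatMap.
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State((s: S) => { val (s1, a) = run(s); (s1, f(a)) })
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State((s: S) => { val (s1, a) = run(s); f(a).run(s1) })
  }

  // Returns a value and leaves the state untouched: the "no-op" step.
  def unit[S, A](a: A): State[S, A] = State((s: S) => (s, a))
  // Replaces the state using f and returns nothing of interest.
  def modify[S](f: S => S): State[S, Unit] = State((s: S) => (f(s), ()))
  // Reads the current state as the result.
  def get[S]: State[S, S] = State((s: S) => (s, s))

  val counter: State[Int, String] =
    for {
      _ <- modify[Int](_ + 1)
      _ <- unit[Int, Unit](())   // no-op: the state passes through unchanged
      _ <- modify[Int](_ * 10)
      n <- get[Int]
    } yield s"final state $n"
}
```

Running StateSketch.counter.run(0) yields both the final state and the result, which is the "either the state or the final result (or both)" choice mentioned above.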
--
Jim Powers
Sat, 2011-10-15, 17:47
#3c3c3c
RE: Re: questioning FP
Personally I think this whole thread has been hugely interesting and I love this particular email from Runar. Unfortunately, for those keen to use it:
scala> (1 to 10000 toList).traverse[Option, Int](x => Some(x * 2))
java.lang.StackOverflowError
 at scalaz.Traverse$$anon$3$$anonfun$2$$anonfun$apply$3.apply(Traverse.scala:28)
 at scalaz.Applys$$anon$2$$anonfun$apply$1$$anonfun$apply$2.apply(Apply.scala:12)
 at scala.Option.map(Option.scala:133)
 ... etc
I'm sure you know that I by no means intend to denigrate the efforts being put into scalaz and I have the highest regard to all involved, but fantastic abstractions that are unusable in practice are, um, unusable in practice.
Chris
Date: Tue, 11 Oct 2011 22:52:38 -0700
From: runarorama@gmail.com
To: scala-debate@googlegroups.com
CC: runarorama@gmail.com; megagurka@yahoo.com; pub@razie.com; ittay.dror@gmail.com
Subject: [scala-debate] Re: questioning FP
Not complicated at all. The pattern is presented in some detail in the paper "The Essence of the Iterator Pattern" by Gibbons and Oliveira:
http://www.comlab.ox.ac.uk/jeremy.gibbons/publications/iterator.pdf
The basic idea is that data of some type F[A] can be traversed with a function of type A => M[B] to produce M[F[B]], where M represents some effect, and supports certain operations.
We have implemented this in Scalaz. For example, here is a use case where the "effect" is none at all. This is just "map":
scala> List(1,2,3).traverse[Id, Int](_ * 2)
res0: List[Int] = List(2, 4, 6)
Sat, 2011-10-15, 17:57
#3e3e3e
Re: Re: questioning FP
2011/10/15 Chris Marshall <oxbow_lakes@hotmail.com>
Personally I think this whole thread has been hugely interesting and I love this particular email from Runar. Unfortunately, for those keen to use it:
scala> (1 to 10000 toList).traverse[Option, Int](x => Some(x * 2))
java.lang.StackOverflowError
 at scalaz.Traverse$$anon$3$$anonfun$2$$anonfun$apply$3.apply(Traverse.scala:28)
 at scalaz.Applys$$anon$2$$anonfun$apply$1$$anonfun$apply$2.apply(Apply.scala:12)
 at scala.Option.map(Option.scala:133)
 ... etc
I'm sure you know that I by no means intend to denigrate the efforts being put into scalaz and I have the highest regard to all involved, but fantastic abstractions that are unusable in practice are, um, unusable in practice.
This problem doesn't come from the abstractions but from the runtime. It is a good argument but for another discussion.
--
Sébastien
Sat, 2011-10-15, 18:07
#373737
RE: Re: questioning FP
I know where the problem comes from. But unless it is solved, the abstraction (or at least the specific implementation in #scalaz) has limited value.
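As a rough illustration of the distinction between the abstraction and one implementation of it, a traversal specialised to List and Option can be written as a loop, so its stack depth does not grow with the input. This is only a sketch under that narrow assumption; traverseOption is a hypothetical name, it is not how Scalaz implements traverse, and it gives up the generality over arbitrary applicatives that the thread is about.

```scala
object SafeTraverse {
  // Specialised to List and Option only; iterates instead of nesting calls.
  def traverseOption[A, B](as: List[A])(f: A => Option[B]): Option[List[B]] = {
    val out = List.newBuilder[B]
    val it = as.iterator
    while (it.hasNext) {
      f(it.next()) match {
        case Some(b) => out += b
        case None    => return None // first failure short-circuits the whole traversal
      }
    }
    Some(out.result())
  }
}
```

With this, SafeTraverse.traverseOption((1 to 10000).toList)(x => Some(x * 2)) completes without deep recursion.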
Chris
Date: Sat, 15 Oct 2011 18:47:53 +0200
Subject: Re: [scala-debate] Re: questioning FP
From: sebastien.bocq@gmail.com
To: oxbow_lakes@hotmail.com
CC: scala-debate@googlegroups.com
2011/10/15 Chris Marshall <oxbow_lakes@hotmail.com>
Personally I think this whole thread has been hugely interesting and I love this particular email from Runar. Unfortunately, for those keen to use it:
scala> (1 to 10000 toList).traverse[Option, Int](x => Some(x * 2))
java.lang.StackOverflowError
 at scalaz.Traverse$$anon$3$$anonfun$2$$anonfun$apply$3.apply(Traverse.scala:28)
 at scalaz.Applys$$anon$2$$anonfun$apply$1$$anonfun$apply$2.apply(Apply.scala:12)
 at scala.Option.map(Option.scala:133)
 ... etc
I'm sure you know that I by no means intend to denigrate the efforts being put into scalaz and I have the highest regard to all involved, but fantastic abstractions that are unusable in practice are, um, unusable in practice.
This problem doesn't come from the abstractions but from the runtime. It is a good argument but for another discussion.
--
Sébastien
On Wednesday, October 12, 2011 4:55:49 AM UTC-4, Adriaan Moors wrote:
Traverse is very specific that it requires m to have class Applicative.
In Haskell:
traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
In Scala (as a method on T[A]):
def traverse[M[_]:Applicative, B](f: A => M[B]): M[T[B]]
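A small sketch of how that signature can be satisfied for T = List may help. The Applicative trait below is a hypothetical minimal type class written for this example (Scalaz's own has more members), traverse is written as a free function rather than a method on T[A], and Scala 2.13 or later is assumed.

```scala
object TraverseSketch {
  // Hypothetical minimal type class, defined here only to keep the sketch self-contained.
  trait Applicative[M[_]] {
    def pure[A](a: A): M[A]
    def map2[A, B, C](ma: M[A], mb: M[B])(f: (A, B) => C): M[C]
  }

  // Instance for Option: the "effect" is possible failure.
  implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
    def pure[A](a: A): Option[A] = Some(a)
    def map2[A, B, C](ma: Option[A], mb: Option[B])(f: (A, B) => C): Option[C] =
      for { a <- ma; b <- mb } yield f(a, b)
  }

  // traverse for List: apply f to each element and combine the effects from the right.
  def traverse[M[_], A, B](as: List[A])(f: A => M[B])(implicit M: Applicative[M]): M[List[B]] =
    as.foldRight(M.pure(List.empty[B]))((a, acc) => M.map2(f(a), acc)(_ :: _))
}
```

Only pure and map2 are used, which is why Applicative is enough and a full monad is not required.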