- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
new motivation for applyIfDefined
Wed, 2011-01-26, 20:15
[Resending with new internals address.]
Here is a failure from the scala test suite I witnessed this morning.
[partest] Log file '/scratch/trunk1/test/files/pos/t3866-pos.log':
[partest] scala.MatchError: /scratch/trunk1/test/files/pos/t1987-pos.obj (of class scala.tools.nsc.io.PlainFile)
[partest] at scala.tools.nsc.util.DirectoryClassPath$$anonfun$packages$2.apply(ClassPath.scala:318)
[partest] at scala.tools.nsc.util.DirectoryClassPath$$anonfun$packages$2.apply(ClassPath.scala:318)
[partest] at scala.collection.TraversableLike$$anonfun$collect$1.apply(TraversableLike.scala:306)
[partest] at scala.collection.Iterator$class.foreach(Iterator.scala:646)
The fact that it's a MatchError is both telling and anxiety producing,
because the line throwing it is this:
lazy val packages: List[DirectoryClassPath] = dir collect {
case f if f.isDirectory && validPackage(f.name) => new DirectoryClassPath(f, context)
} toList
At first I thought this was punishment for having a non-atomic
compare-and-then-act setup. But that line cannot produce a match error
if the PF logic is:
if (isDefinedAt(x)) f(x)
And indeed, disassembling the bytecode reveals the logic actually used
for anonymous partial functions. Witness it this way:
scala> def gd(x: Int) = { println("gd(" + x + ")") ; x < 3 }
gd: (x: Int)Boolean
scala> 0 to 5 collect { case x if gd(x) => x }
gd(0)
gd(0)
gd(1)
gd(1)
gd(2)
gd(2)
gd(3)
gd(4)
gd(5)
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2)
Doesn't happen if I make my own partial function.
scala> val pf = new PartialFunction[Int, Int] { def isDefinedAt(x: Int) = { println("gd(" + x + ")") ; x < 3 } ; def apply(x: Int) = x }
pf: java.lang.Object with PartialFunction[Int,Int] =
scala> 0 to 5 collect pf
gd(0)
gd(1)
gd(2)
gd(3)
gd(4)
gd(5)
res0: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 2)
Examining the specification shows why this is the case. The partial
function is created with the given cases taken as-is for the apply
method, and isDefinedAt has all the same cases, but the right hand sides
are changed to "true" and a default case is false.
So that means guards will always be double-evaluated on any match. I
would not mind the double evaluation if it were necessary -- the guard
itself should not have side effects. But this approach creates
matcherrors which should be impossible given the intended logic. And
they would be impossible if I hand rolled all my partial functions,
which obviously is not real appealing.
Given only isDefinedAt and apply it is not trivial to fix this. The
guards have to be run to find out if it isDefinedAt, but then on the
call to apply you don't know which case should match if you ignore the
guards. We've been through all this during the recent-ish discussion of
a faster alternative to partialfunction. But I don't remember this
issue coming up: that there is a critical loss of fidelity between the
logic being expressed and the logic being implemented in an anonymous
partial function. My interest perks way up when we're not just talking
about efficiency but correctness.
I can't see a way out that doesn't involve state in the partial function
or the addition of an applyIfDefined style method. I think we should
seriously consider adding such a method to partialfunction immediately,
with a default obvious implementation returning Option[T]. Then we
could have anonymous ones generate an override which only applies the
guard logic once before application.
And then we could at least fix collect. I am filled with dread at the
prospect of having to worry about collect throwing exceptions anytime
it's not working with eternally immutable data.
Wed, 2011-01-26, 21:07
#2
Re: new motivation for applyIfDefined
Resending to googlegroup.
Hi Paul, yes, it's a good argument. But the spec also says that guards are supposed to be side-effect free, and that the compiler is free to apply optimizations assuming that. So we have striictly speaking an invalid test and not a compiler failure.
That said, I also think it would be much better to go the applyIfDefined route. I'd have to page the mailoing list discussion back into my memory to get a feeling what the tendency was then. I remember there was no lack of proposals!
Cheers
-- Martin
Hi Paul, yes, it's a good argument. But the spec also says that guards are supposed to be side-effect free, and that the compiler is free to apply optimizations assuming that. So we have striictly speaking an invalid test and not a compiler failure.
That said, I also think it would be much better to go the applyIfDefined route. I'd have to page the mailoing list discussion back into my memory to get a feeling what the tendency was then. I remember there was no lack of proposals!
Cheers
-- Martin
Wed, 2011-01-26, 22:37
#3
Re: new motivation for applyIfDefined
There doesn't seem to be an exploitable boundary between testing for a
match and running the body.
object Extract {
def unapply(x: Directory) = {
val files = x.files.toList filter (_ hasExtension "scala")
if (files.size >= 2) Some((files.head, files.tail.head))
else None
}
}
def f(xs: List[Directory]) = xs collect {
case Extract(f1, f2) => f1.path + "$" f2.path
}
To avoid a possible match error this has to use the return value of the
unapply when running the body. But where is that going to come from? My
previous characterization of using Ints to select the case body does not
offer any way to recover the pattern bindings.
In other words it's not just the guard, it's the pattern too. I know
these things aren't supposed to be side effecting but even in a side
effecting world if you think you're writing this
if (p(x)) f(x)
and you're getting this
if (p(x)) { if (/* still? */p(x)) f(x) else throw new BREAKEVERYTHING }
you might say "hey..."
The only ways I can see to implement a sequence of cases for a function
like collect without risking match errors by virtue of unapply or guard
re-evaluation involve:
- memoizing pattern+guard results, or some other form of state
- try f(x) catch { case _: UniqueToThisMatchError => () }
Wed, 2011-01-26, 22:47
#4
Re: Re: new motivation for applyIfDefined
There was some discussion some months ago about alternatives to the present form of partial functions. Wouldn't one of them present a better solution here?
On Wed, Jan 26, 2011 at 19:33, Paul Phillips <paulp@improving.org> wrote:
--
Daniel C. Sobral
I travel to the future all the time.
On Wed, Jan 26, 2011 at 19:33, Paul Phillips <paulp@improving.org> wrote:
There doesn't seem to be an exploitable boundary between testing for a
match and running the body.
object Extract {
def unapply(x: Directory) = {
val files = x.files.toList filter (_ hasExtension "scala")
if (files.size >= 2) Some((files.head, files.tail.head))
else None
}
}
def f(xs: List[Directory]) = xs collect {
case Extract(f1, f2) => f1.path + "$" f2.path
}
To avoid a possible match error this has to use the return value of the
unapply when running the body. But where is that going to come from? My
previous characterization of using Ints to select the case body does not
offer any way to recover the pattern bindings.
In other words it's not just the guard, it's the pattern too. I know
these things aren't supposed to be side effecting but even in a side
effecting world if you think you're writing this
if (p(x)) f(x)
and you're getting this
if (p(x)) { if (/* still? */p(x)) f(x) else throw new BREAKEVERYTHING }
you might say "hey..."
The only ways I can see to implement a sequence of cases for a function
like collect without risking match errors by virtue of unapply or guard
re-evaluation involve:
- memoizing pattern+guard results, or some other form of state
- try f(x) catch { case _: UniqueToThisMatchError => () }
--
Paul Phillips | Giving every man a vote has no more made men wise
Caged Spirit | and free than Christianity has made them good.
Empiricist | -- H. L. Mencken
all hip pupils! |----------* http://www.improving.org/paulp/ *----------
--
Daniel C. Sobral
I travel to the future all the time.
Wed, 2011-01-26, 22:57
#5
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 07:43:36PM -0200, Daniel Sobral wrote:
> There was some discussion some months ago about alternatives to the
> present form of partial functions. Wouldn't one of them present a
> better solution here?
They present a better solution to a different problem. We have to deal
with what exists.
Or if you have a CONCRETE suggestion, I'm all ears.
Wed, 2011-01-26, 23:27
#6
Re: Re: new motivation for applyIfDefined
Well, if the predicate is not idempotent, then all bets are off. But the situation you describe seems even worse. Consider:
def x() = scala.util.Random.nextInt(5)def p(x: Int) => x != 0 def g(x: Int) => x / 2x match { case y : (() => Int) if p(y()) => g(y())
If even p is true on both times it is called, there's still the possibility of an exception being thrown at g. So, if "f.isDirectory && validPackage(f.name)" failed the second time around, would "new DirectoryClassPath(f, context)" have been meaningful at all? After all, the condition no longer holds true.
If we used an applyOrElse, applyOption, or any of the similar alternatives, it would still be the case that you could get an exception or a meaningless result. You only eliminated the MatchError, which might not have been worse than silently producing an invalid result.
On Wed, Jan 26, 2011 at 19:54, Paul Phillips <paulp@improving.org> wrote:
--
Daniel C. Sobral
I travel to the future all the time.
def x() = scala.util.Random.nextInt(5)def p(x: Int) => x != 0 def g(x: Int) => x / 2x match { case y : (() => Int) if p(y()) => g(y())
If even p is true on both times it is called, there's still the possibility of an exception being thrown at g. So, if "f.isDirectory && validPackage(f.name)" failed the second time around, would "new DirectoryClassPath(f, context)" have been meaningful at all? After all, the condition no longer holds true.
If we used an applyOrElse, applyOption, or any of the similar alternatives, it would still be the case that you could get an exception or a meaningless result. You only eliminated the MatchError, which might not have been worse than silently producing an invalid result.
On Wed, Jan 26, 2011 at 19:54, Paul Phillips <paulp@improving.org> wrote:
On Wed, Jan 26, 2011 at 07:43:36PM -0200, Daniel Sobral wrote:
> There was some discussion some months ago about alternatives to the
> present form of partial functions. Wouldn't one of them present a
> better solution here?
They present a better solution to a different problem. We have to deal
with what exists.
Or if you have a CONCRETE suggestion, I'm all ears.
--
Paul Phillips | Adultery is the application of democracy to love.
Protagonist | -- H. L. Mencken
Empiricist |
pp: i haul pills |----------* http://www.improving.org/paulp/ *----------
--
Daniel C. Sobral
I travel to the future all the time.
Wed, 2011-01-26, 23:37
#7
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 10:20 PM, Daniel Sobral wrote:
> So, if "f.isDirectory &&
> validPackage(f.name)" failed the second time around, would "new
> DirectoryClassPath(f, context)" have been meaningful at all? After all, the
> condition no longer holds true.
> If we used an applyOrElse, applyOption, or any of the similar alternatives,
> it would still be the case that you could get an exception or a meaningless
> result. You only eliminated the MatchError, which might not have been worse
> than silently producing an invalid result.
Not necessarily. The check could have been an optimisation and
DirectoryClassPath would still be able to handle changes in the
filesystem. If a MatchError is thrown though, it messes everything up.
Not saying that this is the case here, but it's plausible.
Best,
Ismael
Wed, 2011-01-26, 23:47
#8
Re: Re: new motivation for applyIfDefined
My point is that it is possible for it to go the other way too. What's the point of being overly concerned about a MatchError if you just eliminate a (small?) subset of the problem?
On Wed, Jan 26, 2011 at 20:29, Ismael Juma <ismael@juma.me.uk> wrote:
--
Daniel C. Sobral
I travel to the future all the time.
On Wed, Jan 26, 2011 at 20:29, Ismael Juma <ismael@juma.me.uk> wrote:
On Wed, Jan 26, 2011 at 10:20 PM, Daniel Sobral <dcsobral@gmail.com> wrote:
> So, if "f.isDirectory &&
> validPackage(f.name)" failed the second time around, would "new
> DirectoryClassPath(f, context)" have been meaningful at all? After all, the
> condition no longer holds true.
> If we used an applyOrElse, applyOption, or any of the similar alternatives,
> it would still be the case that you could get an exception or a meaningless
> result. You only eliminated the MatchError, which might not have been worse
> than silently producing an invalid result.
Not necessarily. The check could have been an optimisation and
DirectoryClassPath would still be able to handle changes in the
filesystem. If a MatchError is thrown though, it messes everything up.
Not saying that this is the case here, but it's plausible.
Best,
Ismael
--
Daniel C. Sobral
I travel to the future all the time.
Wed, 2011-01-26, 23:47
#9
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 08:36:40PM -0200, Daniel Sobral wrote:
> My point is that it is possible for it to go the other way too. What's
> the point of being overly concerned about a MatchError if you just
> eliminate a (small?) subset of the problem?
You are confusing exceptions which occur in user code with exceptions
which are caused by the implementation. The problem is not "exceptions
exist." The problem is "the implementation throws exceptions which need
not be thrown."
Wed, 2011-01-26, 23:57
#10
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 08:20:19PM -0200, Daniel Sobral wrote:
> If even p is true on both times it is called, there's still the
> possibility of an exception being thrown at g. So, if "f.isDirectory
> && validPackage( f.name)" failed the second time around, would "new
> DirectoryClassPath(f, context)" have been meaningful at all? After
> all, the condition no longer holds true.
This is all irrelevant. What you describe is user error. Nothing can
guarantee that the body of the match will or won't make sense. What CAN
be guaranteed, and what is presently not guaranteed and we know not even
met, is that exceptions will not be thrown which exist solely because of
inadequacies in the implementation.
Even if every call to p initiates a new nuclear war,
if (p(x)) f(x)
should either happen or not happen. There is no room in the land of
sane code for a third choice of "or we throw an exception if p changes
its mind when we double check."
Thu, 2011-01-27, 00:47
#11
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 20:44, Paul Phillips <paulp@improving.org> wrote:
Well, they won't be thrown if the code is not buggy (ie, if the predicate is idempotent). If a predicate returns two different results for the same value in two consecutive calls, then it is a bug in the program, and a MatchError is a valid result. It means: this code is buggy, fix me.
--
Daniel C. Sobral
I travel to the future all the time.
On Wed, Jan 26, 2011 at 08:36:40PM -0200, Daniel Sobral wrote:
> My point is that it is possible for it to go the other way too. What's
> the point of being overly concerned about a MatchError if you just
> eliminate a (small?) subset of the problem?
You are confusing exceptions which occur in user code with exceptions
which are caused by the implementation. The problem is not "exceptions
exist." The problem is "the implementation throws exceptions which need
not be thrown."
Well, they won't be thrown if the code is not buggy (ie, if the predicate is idempotent). If a predicate returns two different results for the same value in two consecutive calls, then it is a bug in the program, and a MatchError is a valid result. It means: this code is buggy, fix me.
--
Paul Phillips | It's better to have gloved and tossed than never to
Moral Alien | have played baseball.
Empiricist |
i'll ship a pulp |----------* http://www.improving.org/paulp/ *----------
--
Daniel C. Sobral
I travel to the future all the time.
Thu, 2011-01-27, 03:17
#12
Re: Re: new motivation for applyIfDefined
Doesn't solve the problem with collect, but at least in this case
flatMap is almost as concise:
lazy val packages = dir.toList.flatMap { f =>
if (f.isDirectory && validPackage(f.name)) Some(new
DirectoryClassPath(f, context)) else None
}
On Wed, 2011-01-26 at 14:38 -0800, Paul Phillips wrote:
> On Wed, Jan 26, 2011 at 08:20:19PM -0200, Daniel Sobral wrote:
> > If even p is true on both times it is called, there's still the
> > possibility of an exception being thrown at g. So, if "f.isDirectory
> > && validPackage( f.name)" failed the second time around, would "new
> > DirectoryClassPath(f, context)" have been meaningful at all? After
> > all, the condition no longer holds true.
>
> This is all irrelevant. What you describe is user error. Nothing can
> guarantee that the body of the match will or won't make sense. What CAN
> be guaranteed, and what is presently not guaranteed and we know not even
> met, is that exceptions will not be thrown which exist solely because of
> inadequacies in the implementation.
>
> Even if every call to p initiates a new nuclear war,
>
> if (p(x)) f(x)
>
> should either happen or not happen. There is no room in the land of
> sane code for a third choice of "or we throw an exception if p changes
> its mind when we double check."
>
Thu, 2011-01-27, 23:27
#13
Re: Re: new motivation for applyIfDefined
Hey all,
So the following program produces a PartialFunction...
object test { def foo(x: List[String]) = x collect { case x if x.startsWith("foo") => x }}
why not simply make the isDefinedAt complete by adding "case _ => false"
The only problem I can see if the match already has a "case _", but in that case, it will never throw a MatchError, right?
best wishes
-- Burak
package <empty> { final class test extends java.lang.Object with ScalaObject { def this(): object test = { test.super.this(); () }; def foo(x: List[java.lang.String]): List[java.lang.String] = x.collect[java.lang.String, List[java.lang.String]]({ final <synthetic> class $anonfun extends java.lang.Object with PartialFunction[java.lang.String,java.lang.String] { def this(): anonymous class $anonfun = { $anonfun.super.this(); () }; final def apply(x0$1: java.lang.String): java.lang.String = (x0$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => x }; final def isDefinedAt(x$1: java.lang.String): Boolean = (x$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => true case _ => false } }; (new anonymous class $anonfun(): PartialFunction[java.lang.String,java.lang.String]) }, immutable.this.List.canBuildFrom[java.lang.String]()) }}
On Thu, Jan 27, 2011 at 12:28 AM, Nathan Bronson <ngbronson.sub@gmail.com> wrote:
--
Burak Emir
--
http://burak.emir.googlepages.com
So the following program produces a PartialFunction...
object test { def foo(x: List[String]) = x collect { case x if x.startsWith("foo") => x }}
why not simply make the isDefinedAt complete by adding "case _ => false"
The only problem I can see if the match already has a "case _", but in that case, it will never throw a MatchError, right?
best wishes
-- Burak
package <empty> { final class test extends java.lang.Object with ScalaObject { def this(): object test = { test.super.this(); () }; def foo(x: List[java.lang.String]): List[java.lang.String] = x.collect[java.lang.String, List[java.lang.String]]({ final <synthetic> class $anonfun extends java.lang.Object with PartialFunction[java.lang.String,java.lang.String] { def this(): anonymous class $anonfun = { $anonfun.super.this(); () }; final def apply(x0$1: java.lang.String): java.lang.String = (x0$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => x }; final def isDefinedAt(x$1: java.lang.String): Boolean = (x$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => true case _ => false } }; (new anonymous class $anonfun(): PartialFunction[java.lang.String,java.lang.String]) }, immutable.this.List.canBuildFrom[java.lang.String]()) }}
On Thu, Jan 27, 2011 at 12:28 AM, Nathan Bronson <ngbronson.sub@gmail.com> wrote:
Doesn't solve the problem with collect, but at least in this case
flatMap is almost as concise:
lazy val packages = dir.toList.flatMap { f =>
if (f.isDirectory && validPackage(f.name)) Some(new
DirectoryClassPath(f, context)) else None
}
On Wed, 2011-01-26 at 14:38 -0800, Paul Phillips wrote:
> On Wed, Jan 26, 2011 at 08:20:19PM -0200, Daniel Sobral wrote:
> > If even p is true on both times it is called, there's still the
> > possibility of an exception being thrown at g. So, if "f.isDirectory
> > && validPackage( f.name)" failed the second time around, would "new
> > DirectoryClassPath(f, context)" have been meaningful at all? After
> > all, the condition no longer holds true.
>
> This is all irrelevant. What you describe is user error. Nothing can
> guarantee that the body of the match will or won't make sense. What CAN
> be guaranteed, and what is presently not guaranteed and we know not even
> met, is that exceptions will not be thrown which exist solely because of
> inadequacies in the implementation.
>
> Even if every call to p initiates a new nuclear war,
>
> if (p(x)) f(x)
>
> should either happen or not happen. There is no room in the land of
> sane code for a third choice of "or we throw an exception if p changes
> its mind when we double check."
>
--
Burak Emir
--
http://burak.emir.googlepages.com
Thu, 2011-01-27, 23:37
#14
Re: Re: new motivation for applyIfDefined
Ha funny, so actually, there is 'case _ => false' ... silly me.
but why do you get a MatchError then?
-- Burak
On Thu, Jan 27, 2011 at 11:23 PM, Burak Emir <burak.emir@gmail.com> wrote:
--
Burak Emir
--
http://burak.emir.googlepages.com
but why do you get a MatchError then?
-- Burak
On Thu, Jan 27, 2011 at 11:23 PM, Burak Emir <burak.emir@gmail.com> wrote:
Hey all,
So the following program produces a PartialFunction...
object test { def foo(x: List[String]) = x collect { case x if x.startsWith("foo") => x }}
why not simply make the isDefinedAt complete by adding "case _ => false"
The only problem I can see if the match already has a "case _", but in that case, it will never throw a MatchError, right?
best wishes
-- Burak
package <empty> { final class test extends java.lang.Object with ScalaObject { def this(): object test = { test.super.this(); () }; def foo(x: List[java.lang.String]): List[java.lang.String] = x.collect[java.lang.String, List[java.lang.String]]({ final <synthetic> class $anonfun extends java.lang.Object with PartialFunction[java.lang.String,java.lang.String] { def this(): anonymous class $anonfun = { $anonfun.super.this(); () }; final def apply(x0$1: java.lang.String): java.lang.String = (x0$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => x }; final def isDefinedAt(x$1: java.lang.String): Boolean = (x$1: java.lang.String @unchecked) match { case (x @ _) if x.startsWith("foo") => true case _ => false } }; (new anonymous class $anonfun(): PartialFunction[java.lang.String,java.lang.String]) }, immutable.this.List.canBuildFrom[java.lang.String]()) }}
On Thu, Jan 27, 2011 at 12:28 AM, Nathan Bronson <ngbronson.sub@gmail.com> wrote:
Doesn't solve the problem with collect, but at least in this case
flatMap is almost as concise:
lazy val packages = dir.toList.flatMap { f =>
if (f.isDirectory && validPackage(f.name)) Some(new
DirectoryClassPath(f, context)) else None
}
On Wed, 2011-01-26 at 14:38 -0800, Paul Phillips wrote:
> On Wed, Jan 26, 2011 at 08:20:19PM -0200, Daniel Sobral wrote:
> > If even p is true on both times it is called, there's still the
> > possibility of an exception being thrown at g. So, if "f.isDirectory
> > && validPackage( f.name)" failed the second time around, would "new
> > DirectoryClassPath(f, context)" have been meaningful at all? After
> > all, the condition no longer holds true.
>
> This is all irrelevant. What you describe is user error. Nothing can
> guarantee that the body of the match will or won't make sense. What CAN
> be guaranteed, and what is presently not guaranteed and we know not even
> met, is that exceptions will not be thrown which exist solely because of
> inadequacies in the implementation.
>
> Even if every call to p initiates a new nuclear war,
>
> if (p(x)) f(x)
>
> should either happen or not happen. There is no room in the land of
> sane code for a third choice of "or we throw an exception if p changes
> its mind when we double check."
>
--
Burak Emir
--
http://burak.emir.googlepages.com
--
Burak Emir
--
http://burak.emir.googlepages.com
Thu, 2011-01-27, 23:47
#15
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 10:30 PM, Burak Emir wrote:
> Ha funny, so actually, there is 'case _ => false' ... silly me.
> but why do you get a MatchError then?
The MatchError happens in the apply call.
Best,
Ismael
Thu, 2011-01-27, 23:57
#16
Re: Re: new motivation for applyIfDefined
Yeah, I figured that much out meanwhile, but why do we even get to the apply?
Why would doubly evaluating guards cause match errors?
Is this a discussion about allowing guards that will evaluate once to true and once to false?
On Thu, Jan 27, 2011 at 11:41 PM, Ismael Juma <ismael@juma.me.uk> wrote:
--
Burak Emir
--
http://burak.emir.googlepages.com
Why would doubly evaluating guards cause match errors?
Is this a discussion about allowing guards that will evaluate once to true and once to false?
On Thu, Jan 27, 2011 at 11:41 PM, Ismael Juma <ismael@juma.me.uk> wrote:
On Thu, Jan 27, 2011 at 10:30 PM, Burak Emir <burak.emir@gmail.com> wrote:
> Ha funny, so actually, there is 'case _ => false' ... silly me.
> but why do you get a MatchError then?
The MatchError happens in the apply call.
Best,
Ismael
--
Burak Emir
--
http://burak.emir.googlepages.com
Fri, 2011-01-28, 00:07
#17
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 10:48 PM, Burak Emir wrote:
> Yeah, I figured that much out meanwhile, but why do we even get to the
> apply?
> Why would doubly evaluating guards cause match errors?
> Is this a discussion about allowing guards that will evaluate once to true
> and once to false?
Yes. The example given was:
case f if f.isDirectory && validPackage(f.name)
This can change between isDefinedAt and apply if, for example, the
file is renamed or deleted.
Best,
Ismael
Fri, 2011-01-28, 00:27
#18
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 11:48:26PM +0100, Burak Emir wrote:
> Why would doubly evaluating guards cause match errors?
>
> Is this a discussion about allowing guards that will evaluate once to
> true and once to false?
As I summarized it earlier, one attempts to write this:
if (p(x)) f(x)
but gets this:
if (p(x)) { if (/* still? */p(x)) f(x) else throw new BREAKEVERYTHING }
Importantly, the code is fine whether p(x) is true or false, so the fact
that the guard changes answers is an academic matter. Patterns are at
liberty to match or not match when guards are being flibbertigibbety,
but not to start hurling runtime errors.
Fri, 2011-01-28, 00:47
#19
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 21:21, Paul Phillips <paulp@improving.org> wrote:
--
Daniel C. Sobral
I travel to the future all the time.
As I summarized it earlier, one attempts to write this:This I disagree with. If the code if fine whether p(x) is true or false, then make it a function, not a match.
if (p(x)) f(x)
but gets this:
if (p(x)) { if (/* still? */p(x)) f(x) else throw new BREAKEVERYTHING }
Importantly, the code is fine whether p(x) is true or false, so the fact
that the guard changes answers is an academic matter. Patterns are at
liberty to match or not match when guards are being flibbertigibbety,
but not to start hurling runtime errors.
--
Daniel C. Sobral
I travel to the future all the time.
Fri, 2011-01-28, 00:57
#20
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 09:38:49PM -0200, Daniel Sobral wrote:
> This I disagree with. If the code if fine whether p(x) is true or
> false, then make it a function, not a match.
Please stop injecting irrelevancies. I know there is a breed of
programmer who thinks everyone who doesn't write code as they say it
ought to be written deserves every bug they get and we should be looking
for ways to hurt them. For those of us who don't take that point of
view, it makes absolutely no difference what someone "should" have done.
The only question that matters is whether this is acceptable behavior.
I say it isn't. If you say it is I'm glad you're only an observer.
Fri, 2011-01-28, 01:07
#21
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 12:47 AM, Paul Phillips <paulp@improving.org> wrote:
On Thu, Jan 27, 2011 at 09:38:49PM -0200, Daniel Sobral wrote:
> This I disagree with. If the code if fine whether p(x) is true or
> false, then make it a function, not a match.
Please stop injecting irrelevancies. I know there is a breed of
programmer who thinks everyone who doesn't write code as they say it
ought to be written deserves every bug they get and we should be looking
for ways to hurt them. For those of us who don't take that point of
view, it makes absolutely no difference what someone "should" have done.
The only question that matters is whether this is acceptable behavior.
I say it isn't. If you say it is I'm glad you're only an observer.
I agree with Daniel.
Like him, I don't think supporting flip-flopping guards is going to pay off very much. Apparently, we are talking about a non-synchronized concurrent program here, and code like that would be considered a bug. Consider any program with a single match statement
{ case _ if p(x) => throw new BreakEverythingException() case _ => Thread.sleep(1000); ariane(5).launch()}
Now, is the implementation "incorrect" if p(x) evaluates to true while we are sleeping?
Re: concurrency, I am somewhat reminded of the "while(condition) { wait() }" idiom [1], which is by no means academic, and it does not quite matter whether we think there should be spurious wakeups or not.
If you want to make an efficiency argument against evaluating guards twice, please go ahead. applyIfDefined could work out nicely and makes total sense for partial functions, IMHO.
The whole discussion about correctness in the presence of concurrent modifications to the world looks like a red herring to me. Please don't get me wrong, but the spec on how to translate pattern matching anonymous functions spells it out crystal clear (pp 122-123), so by your own argument, you, Paul, are the observer throwing irrelevant normative statements and the whole world has code deployed that expects that MatchError to be thrown.
best wishes,-- Burak
[1] https://www.securecoding.cert.org/confluence/display/java/THI03-J.+Always+invoke+wait()+and+await()+methods+inside+a+loop
--
Paul Phillips | Giving every man a vote has no more made men wise
Imperfectionist | and free than Christianity has made them good.
Empiricist | -- H. L. Mencken
slap pi uphill! |----------* http://www.improving.org/paulp/ *----------
--
Burak Emir
--
http://burak.emir.googlepages.com
Fri, 2011-01-28, 01:17
#22
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 12:57:40AM +0100, Burak Emir wrote:
> Apparently, we are talking about a non-synchronized concurrent program
> here
No, we aren't. I wish you would read the existing thread which outlines
the issue in great detail.
Fri, 2011-01-28, 01:27
#23
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 12:57:40AM +0100, Burak Emir wrote:
> {
> case _ if p(x) => throw new BreakEverythingException()
> case _ => Thread.sleep(1000); ariane(5).launch()
> }
>
> Now, is the implementation "incorrect" if p(x) evaluates to true while
> we are sleeping?
Let me show you the actual situation under consideration. Not
concurrent. Sequential access.
{
case _ if p(x) => 1
case _ => 2
}
Is the implementation "incorrect" if this throws a
BreakEverythingException? You are arguing, I hope out of ignorance of
what we are talking about and not actual conviction, that it is not
incorrect. I say it is very incorrect.
Fri, 2011-01-28, 01:37
#24
Re: Re: new motivation for applyIfDefined
The default case confuses the issue because it will be caught in apply.
Let us look more simply at this.
List(1, 2, 3) collect { case x if p(x) => x }
I will never come around to thinking this should throw exceptions.
Fri, 2011-01-28, 02:07
#25
Re: Re: new motivation for applyIfDefined
I don't say that getting a MatchError there is intuitive, even if I don't think it is outrageous. Patterns are still full of surprises : ) but here it is more the translation to PartialFunction that seems to be the problem.
One way to avoid the spec change and still have guards evaluated only once in order to improve efficiency is to introduce a default implementation of applyIfDefined (lift.apply) in PartialFunction, have the compiler generate an override def applyIfDefined in pattern matching anonymous functions, and use that in collect.
I would call this an optimization. It increases code size somewhat, and whether it is generally worth doing, that is another question.
A high-level optimization of "expr collect { case ... => ... }" would be much much nicer.
Doing "override def applyIfDefined" or the high-level optimization are both approaches that would not protect other uses of if (f.isDefinedAt(x)) f(x) though, but maybe they are good enough?
I would vote for the high-level optimization. That would have a nice impact on a large class of programs, in addition to preventing obscure errors in programs that deal with obscure situations.
cheers-- Burak
On Fri, Jan 28, 2011 at 1:20 AM, Paul Phillips <paulp@improving.org> wrote:
--
Burak Emir
--
http://burak.emir.googlepages.com
One way to avoid the spec change and still have guards evaluated only once in order to improve efficiency is to introduce a default implementation of applyIfDefined (lift.apply) in PartialFunction, have the compiler generate an override def applyIfDefined in pattern matching anonymous functions, and use that in collect.
I would call this an optimization. It increases code size somewhat, and whether it is generally worth doing, that is another question.
A high-level optimization of "expr collect { case ... => ... }" would be much much nicer.
Doing "override def applyIfDefined" or the high-level optimization are both approaches that would not protect other uses of if (f.isDefinedAt(x)) f(x) though, but maybe they are good enough?
I would vote for the high-level optimization. That would have a nice impact on a large class of programs, in addition to preventing obscure errors in programs that deal with obscure situations.
cheers-- Burak
On Fri, Jan 28, 2011 at 1:20 AM, Paul Phillips <paulp@improving.org> wrote:
The default case confuses the issue because it will be caught in apply.
Let us look more simply at this.
List(1, 2, 3) collect { case x if p(x) => x }
I will never come around to thinking this should throw exceptions.
--
Paul Phillips | The important thing here is that the music is not in
In Theory | the piano. And knowledge and edification is not in the
Empiricist | computer. The computer is simply an instrument whose
all hip pupils! | music is ideas. -- Alan Kay
--
Burak Emir
--
http://burak.emir.googlepages.com
Fri, 2011-01-28, 02:17
#26
Re: Re: new motivation for applyIfDefined
I agree that collect should not throw strange exceptions, although I may see an argument for a ConcurrentModificationException or a "ZOMGWORLDEXPLODE" exception wrapper when this situation happens :)
In any case, I think the "paulp is going to hell variant" is a decent stop gap if we come to no other better alternative. I'll run a performance analysis and see how expensive *not* making a another solution will be.
I will state that the file 'external mutability' is a somewhat known problem (among I/O library designers perhaps) that can wreak havoc on any method/library.
However, the two-match nature of partial functions is really asking to be optimized. I don't know of any reason to prefer partial function besides wanting the 'isDefined'/'apply' operation(s).
On Thu, Jan 27, 2011 at 7:20 PM, Paul Phillips <paulp@improving.org> wrote:
In any case, I think the "paulp is going to hell variant" is a decent stop gap if we come to no other better alternative. I'll run a performance analysis and see how expensive *not* making a another solution will be.
I will state that the file 'external mutability' is a somewhat known problem (among I/O library designers perhaps) that can wreak havoc on any method/library.
However, the two-match nature of partial functions is really asking to be optimized. I don't know of any reason to prefer partial function besides wanting the 'isDefined'/'apply' operation(s).
On Thu, Jan 27, 2011 at 7:20 PM, Paul Phillips <paulp@improving.org> wrote:
The default case confuses the issue because it will be caught in apply.
Let us look more simply at this.
List(1, 2, 3) collect { case x if p(x) => x }
I will never come around to thinking this should throw exceptions.
--
Paul Phillips | The important thing here is that the music is not in
In Theory | the piano. And knowledge and edification is not in the
Empiricist | computer. The computer is simply an instrument whose
all hip pupils! | music is ideas. -- Alan Kay
Fri, 2011-01-28, 02:37
#27
Re: Re: new motivation for applyIfDefined
Perhaps it is useful to separate problems that occur to the left and to
the right of the "if"?
The idea of returning the isDefined result as an int is just trying to
capture a continuation of a partially completed apply in a form that is
efficient on the JVM. That continuation is not correct if I write a
non-pure unapply. For example, I could write a non-pure P.unapply and
then
collect { case P(x) if x >= 0 => x }
might return a negative element.
To me, an error in this case seems much less odious than Paul's
original, since custom unapply is more rare than match guards. I don't
think that we should hesitate to fix Paul's case even if the solution
doesn't address this one.
- Nathan
On Thu, 2011-01-27 at 16:20 -0800, Paul Phillips wrote:
> The default case confuses the issue because it will be caught in apply.
> Let us look more simply at this.
>
> List(1, 2, 3) collect { case x if p(x) => x }
>
> I will never come around to thinking this should throw exceptions.
>
Fri, 2011-01-28, 02:47
#28
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 02:06:51AM +0100, Burak Emir wrote:
> I don't say that getting a MatchError there is intuitive, even if I
> don't think it is outrageous. Patterns are still full of surprises : )
> but here it is more the translation to PartialFunction that seems to
> be the problem.
It is without question the translation of anonymous partial functions
that is the problem. I'm sorry to suggest it one last time but I
explained exactly what the problem is in my first couple messages in
this thread.
> I would vote for the high-level optimization. That would have a nice
> impact on a large class of programs, in addition to preventing obscure
> errors in programs that deal with obscure situations.
I do not know if that is meant to imply that the situation under
discussion is obscure. It sounds like it. I don't know what to say
about that other than that you are mistaken. Predicates whose answers
might change: size > x, System.currentTimeMillis < y, etc. are not the
least bit obscure. The method "collect" is not the least bit obscure.
They will not become obscure in combination.
Fri, 2011-01-28, 02:57
#29
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 21:47, Paul Phillips <paulp@improving.org> wrote:
However, you are putting forward an optimization for this case based solely on a case where p(x) need not stay true for f(x) to be called, *and* you want a partial function instead of a function. I can't see that as anything but a very small set of cases, which could be easily done in other ways. Again, back to your original example:
lazy val packages: List[DirectoryClassPath] = dir collect {
case f if f.isDirectory && validPackage(f.name) => new DirectoryClassPath(f, context)
} toList
1. If it doesn't matter that f is no longer a directory and f.name is no longer a valid package, then it isn't necessary to use collect -- you can just use map. Collect is an optimization, but not necessary. 2. It can still be done as dir flatMap { if (...) Some(new DirectoryClassPath(f, context)) else None }. 3. It can still be done as filter/map.
So, as an argument for applyIfDefined, this qualifies as "minor convenience": it is not necessary, and there are simple workarounds.
And, on the subject of laughable claims, this:
> who thinks everyone who doesn't write code as they say it
> ought to be written deserves every bug they get
I'm not saying that at all. I'm saying THE CODE WAS BUGGY TO BEGIN WITH. If the collect is necessary, then the code is buggy. If the code is not buggy, the collect is not necessary. Partial function's behavior doesn't enter this equation.
--
Daniel C. Sobral
I travel to the future all the time.
On Thu, Jan 27, 2011 at 09:38:49PM -0200, Daniel Sobral wrote:This is not an irrelevancy, it is pretty much the crux of the matter. If I make a guard p(x) protecting an f(x), it is a problem that x changes after checking p(x). Whether p(x) is called again, *that* is irrelevant. It only changes the error.
> This I disagree with. If the code if fine whether p(x) is true or
> false, then make it a function, not a match.
Please stop injecting irrelevancies. I know there is a breed of
programmer who thinks everyone who doesn't write code as they say it
ought to be written deserves every bug they get and we should be looking
for ways to hurt them. For those of us who don't take that point of
view, it makes absolutely no difference what someone "should" have done.
The only question that matters is whether this is acceptable behavior.
I say it isn't. If you say it is I'm glad you're only an observer.
However, you are putting forward an optimization for this case based solely on a case where p(x) need not stay true for f(x) to be called, *and* you want a partial function instead of a function. I can't see that as anything but a very small set of cases, which could be easily done in other ways. Again, back to your original example:
lazy val packages: List[DirectoryClassPath] = dir collect {
case f if f.isDirectory && validPackage(f.name) => new DirectoryClassPath(f, context)
} toList
1. If it doesn't matter that f is no longer a directory and f.name is no longer a valid package, then it isn't necessary to use collect -- you can just use map. Collect is an optimization, but not necessary. 2. It can still be done as dir flatMap { if (...) Some(new DirectoryClassPath(f, context)) else None }. 3. It can still be done as filter/map.
So, as an argument for applyIfDefined, this qualifies as "minor convenience": it is not necessary, and there are simple workarounds.
And, on the subject of laughable claims, this:
> who thinks everyone who doesn't write code as they say it
> ought to be written deserves every bug they get
I'm not saying that at all. I'm saying THE CODE WAS BUGGY TO BEGIN WITH. If the collect is necessary, then the code is buggy. If the code is not buggy, the collect is not necessary. Partial function's behavior doesn't enter this equation.
--
Daniel C. Sobral
I travel to the future all the time.
Fri, 2011-01-28, 02:57
#30
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 05:27:24PM -0800, Nathan Bronson wrote:
> Perhaps it is useful to separate problems that occur to the left and
> to the right of the "if"?
You're right that we should, although I'm not sure if I buy your
reasoning. Your example and similar ones are in a qualitatively
different category no matter how you slice it. I know it was just an
example, but an intentionally side effecting, result producing unapply
is way out of bounds in terms of what can be done.
The problem I was talking about yesterday is in something like:
case Foo(x, Bar(y)) if p(x, y) => x+y
But I agree with you, if we took the Int approach the Int can take me to
the matching case. I re-run the pattern match to recover the variables
and walk right past the sleeping guard. Problem solved(*).
(*) Check local regulations.
Fri, 2011-01-28, 03:07
#31
Re: Re: new motivation for applyIfDefined
But the contract of collect sort-of-implies that you won't get a MatchError, so it is unexpected. Even if my p(x) changes, I would not expect a match error unless I also knew that the collect function goes through the match statements twice. This is the crux of paulp's argument.
The collect function states: "Builds a new collection by applying a partial function to all elements of this iterable collection on which the function is defined."
This could be interpreted several ways, but I agree with paul in that a match error is unexpected. Dropping the element seems a more intuitive result.
On Thu, Jan 27, 2011 at 8:47 PM, Daniel Sobral <dcsobral@gmail.com> wrote:
The collect function states: "Builds a new collection by applying a partial function to all elements of this iterable collection on which the function is defined."
This could be interpreted several ways, but I agree with paul in that a match error is unexpected. Dropping the element seems a more intuitive result.
On Thu, Jan 27, 2011 at 8:47 PM, Daniel Sobral <dcsobral@gmail.com> wrote:
On Thu, Jan 27, 2011 at 21:47, Paul Phillips <paulp@improving.org> wrote:On Thu, Jan 27, 2011 at 09:38:49PM -0200, Daniel Sobral wrote:This is not an irrelevancy, it is pretty much the crux of the matter. If I make a guard p(x) protecting an f(x), it is a problem that x changes after checking p(x). Whether p(x) is called again, *that* is irrelevant. It only changes the error.
> This I disagree with. If the code if fine whether p(x) is true or
> false, then make it a function, not a match.
Please stop injecting irrelevancies. I know there is a breed of
programmer who thinks everyone who doesn't write code as they say it
ought to be written deserves every bug they get and we should be looking
for ways to hurt them. For those of us who don't take that point of
view, it makes absolutely no difference what someone "should" have done.
The only question that matters is whether this is acceptable behavior.
I say it isn't. If you say it is I'm glad you're only an observer.
However, you are putting forward an optimization for this case based solely on a case where p(x) need not stay true for f(x) to be called, *and* you want a partial function instead of a function. I can't see that as anything but a very small set of cases, which could be easily done in other ways. Again, back to your original example:
lazy val packages: List[DirectoryClassPath] = dir collect {
case f if f.isDirectory && validPackage(f.name) => new DirectoryClassPath(f, context)
} toList
1. If it doesn't matter that f is no longer a directory and f.name is no longer a valid package, then it isn't necessary to use collect -- you can just use map. Collect is an optimization, but not necessary. 2. It can still be done as dir flatMap { if (...) Some(new DirectoryClassPath(f, context)) else None }. 3. It can still be done as filter/map.
So, as an argument for applyIfDefined, this qualifies as "minor convenience": it is not necessary, and there are simple workarounds.
And, on the subject of laughable claims, this:
> who thinks everyone who doesn't write code as they say it
> ought to be written deserves every bug they get
I'm not saying that at all. I'm saying THE CODE WAS BUGGY TO BEGIN WITH. If the collect is necessary, then the code is buggy. If the code is not buggy, the collect is not necessary. Partial function's behavior doesn't enter this equation.
--
Daniel C. Sobral
I travel to the future all the time.
Fri, 2011-01-28, 03:37
#32
Re: Re: new motivation for applyIfDefined
On Thu, Jan 27, 2011 at 23:54, Josh Suereth <joshua.suereth@gmail.com> wrote:
I get that. I just think that's the equivalent of doing nothing if I call a method on a null reference. To me, MatchError on collect is as much unexpected as a NullPointerException on a method call, and it is very annoying to get them on code that was supposed to be correct, but they both mean the code isn't correct after all. But having already spent 1 billion dollars on NullPointerException, we got used to it.
I'm all for applyIfDefined for performance reasons alone, but collect should just document it may throw MatchError is the predicate is not idempotent and get on with life.
--
Daniel C. Sobral
I travel to the future all the time.
But the contract of collect sort-of-implies that you won't get a MatchError, so it is unexpected. Even if my p(x) changes, I would not expect a match error unless I also knew that the collect function goes through the match statements twice. This is the crux of paulp's argument.
The collect function states: "Builds a new collection by applying a partial function to all elements of this iterable collection on which the function is defined."
This could be interpreted several ways, but I agree with paul in that a match error is unexpected. Dropping the element seems a more intuitive result.
I get that. I just think that's the equivalent of doing nothing if I call a method on a null reference. To me, MatchError on collect is as much unexpected as a NullPointerException on a method call, and it is very annoying to get them on code that was supposed to be correct, but they both mean the code isn't correct after all. But having already spent 1 billion dollars on NullPointerException, we got used to it.
I'm all for applyIfDefined for performance reasons alone, but collect should just document it may throw MatchError is the predicate is not idempotent and get on with life.
--
Daniel C. Sobral
I travel to the future all the time.
Fri, 2011-01-28, 07:47
#33
Re: Re: new motivation for applyIfDefined
On Wed, Jan 26, 2011 at 11:59 AM, martin odersky wrote:
> That said, I also think it would be much better to go the applyIfDefined
> route. I'd have to page the mailoing list discussion back into my memory to
> get a feeling what the tendency was then. I remember there was no lack of
> proposals!
The last word on that thread was a reiteration of a suggestion from
Martin, which involved enhancing Function1 thusly:
class Function1[-A, +B] {
def apply(x: A): B
def missingCase(x: A): B = throw new MatchError
...
}
with the suggestion that "This would let you get the effective
functionality of applyOrElse by subclassing Function1."
But I don't see how that's going to help in the case of functions
generated by the compiler from case expressions (i.e. { case x => ...
}). The compiler could synthesize a default case that called
missingCase, but it would also have to generate the implementation of
missingCase, so I don't see where someone can come in with a subclass
and make something useful happen. Probably I'm missing something.
Anyhow, what seems perhaps more fruitful from a "let's not evaluate
guards twice" perspective (both for correctness and performance
reasons), is a small "enhancement" to PartialFunction:
trait PartialFunction[-A, +B] extends (A => B) {
def isDefinedAt(x: A): Boolean
def apply(x :A) = applyOrElse(x, throw new MatchError)
def applyOrElse[A1 <: A, B1 >: B](x: A1, default: A1 => B1): B1 =
mapOrElse(x, identity[B1], default)
def mapOrElse[A1 <: A, B1 >: B, C](x :A1, f: B1 => C, default: A1 => C): C
}
The compiler would generate an implementation of isDefined as well as
mapOrElse. This would not avoid the code duplication, but it would
avoid double execution. Crucially, it would provide a way for nearly
all of the users of PartialFunction to get their job done without
risking mysterious MatchExceptions. For example:
def collect[B, That](pf: PartialFunction[A, B])(implicit bf:
CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
for (x <- this) pf.mapOrElse(x, (v: B) => b += v, (_: A) => ())
b.result
}
This being a common use case, it could be tidied up with something
like the following (which is screaming out for a better name than
'mapIf'):
trait PartialFunction[-A, +B] extends (A => B) {
def mapIf[A1 <: A, B1 >: B] (x :A1, f: B1 => Unit) :Unit =
mapOrElse(x, f, (_: A1) => ())
}
which gives us:
def collect[B, That](pf: PartialFunction[A, B])(implicit bf:
CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
for (x <- this) pf.mapOrElse(x, b.+=)
b.result
}
There is the unfortunate additional expense of apply calling
applyOrElse calling mapOrElse with identity, but I think the vast
majority of users of PartialFunction will find themselves going
directly to mapOrElse. The various PartialFunction combinators also
avoid this overhead because only the outermost PartialFunction will go
through two extra hops, and then all of the actual evaluation will
take place in a chain of mapOrElse calls. Thus this change might let
Martin have the performance cake he was looking for with
FunctionWithDefault, and eat it too.
See here for the new implementations of orElse, andThen, etc.:
https://gist.github.com/799916
One could then put a nice warning on isDefinedAt saying "Beware the
madness of `if(pf.isDefinedAt(x)) pf(x)`. Use `mapOrElse` or `mapIf`."
Only the rare circumstances like Lift's delayed initialization would
ever need to call isDefinedAt.
This does introduce additional complexity into the compiler, as it
will have to thread the success continuation through the
implementation when generating an implementation of mapOrElse (as
Martin mentioned offhandedly when considering other possibilities for
FunctionWithDefault). But trading a bit of compiler complexity for
increased performance and a decrease in scary edge cases with arguable
correctness properties sounds like a win.
Fri, 2011-01-28, 10:17
#34
Re: Re: new motivation for applyIfDefined
My 2c is that a naive user (one who isn't steeped in several years of scala arkainery) would find a match error on collect with a guard very strange, and may well be surprised at this double-evaluation. It seems like this MatchError is exposing implementation details about the underlying magic used to compile cases with guards.This is bad because a) code-bases may come to rely upon foibles of the implementation rather than on the API contract and, b) it adds unnecessarily to the magic that a scala programmer may need to know to be effective in diagnosing failures.
That this implementation also triggers a class of faults where the predicate isn't pure is, to me, no argument that the status quo is desirable. In the longer term, it would be better to have purity checking and to flash up a compile-time warning if the code can not be proven to be pure. For now, any solution that avoids the double-evaluation, so hides the implementation better, is acceptable.
Matthew
That this implementation also triggers a class of faults where the predicate isn't pure is, to me, no argument that the status quo is desirable. In the longer term, it would be better to have purity checking and to flash up a compile-time warning if the code can not be proven to be pure. For now, any solution that avoids the double-evaluation, so hides the implementation better, is acceptable.
Matthew
Fri, 2011-01-28, 13:27
#35
Re: Re: new motivation for applyIfDefined
I'm surprised that there is so much discussion on this topic, as I had
the impression that it was already decided by Martin what the way to
go is. I'm referring to his mail from 2010-11-02 where he suggests the
following:
************
Define Function1 like this:
class Function1[-A, +B] {
def apply(x: A): B
def missingCase(x: A): B = throw new MatchError
...
}
2. Change the rules so that _any_ subclass of Function can be
implemented by a closure { x => ... }, as long as the class contains a
single abstract method named `apply'.
3. Change the rules so that in a pattern matching expression
x match { case P1 => ... case Pn => ... }
that was the body of a function, a selector that was not matched by
any of the patterns invoked missingCase(x) instead of immediately
throwing a MatchError(x).
************
Although, regarding point 2, I like the idea that not only any
subclass of function, but any SAM-type can be implemented with a
closure. I think this was brought up by Mark Harrah.
Regards,
Ruediger
2011/1/28 Michael Bayne :
> On Wed, Jan 26, 2011 at 11:59 AM, martin odersky wrote:
>> That said, I also think it would be much better to go the applyIfDefined
>> route. I'd have to page the mailoing list discussion back into my memory to
>> get a feeling what the tendency was then. I remember there was no lack of
>> proposals!
>
> The last word on that thread was a reiteration of a suggestion from
> Martin, which involved enhancing Function1 thusly:
>
> class Function1[-A, +B] {
> def apply(x: A): B
> def missingCase(x: A): B = throw new MatchError
> ...
> }
>
> with the suggestion that "This would let you get the effective
> functionality of applyOrElse by subclassing Function1."
>
> But I don't see how that's going to help in the case of functions
> generated by the compiler from case expressions (i.e. { case x => ...
> }). The compiler could synthesize a default case that called
> missingCase, but it would also have to generate the implementation of
> missingCase, so I don't see where someone can come in with a subclass
> and make something useful happen. Probably I'm missing something.
>
> Anyhow, what seems perhaps more fruitful from a "let's not evaluate
> guards twice" perspective (both for correctness and performance
> reasons), is a small "enhancement" to PartialFunction:
>
> trait PartialFunction[-A, +B] extends (A => B) {
> def isDefinedAt(x: A): Boolean
>
> def apply(x :A) = applyOrElse(x, throw new MatchError)
>
> def applyOrElse[A1 <: A, B1 >: B](x: A1, default: A1 => B1): B1 =
> mapOrElse(x, identity[B1], default)
>
> def mapOrElse[A1 <: A, B1 >: B, C](x :A1, f: B1 => C, default: A1 => C): C
> }
>
> The compiler would generate an implementation of isDefined as well as
> mapOrElse. This would not avoid the code duplication, but it would
> avoid double execution. Crucially, it would provide a way for nearly
> all of the users of PartialFunction to get their job done without
> risking mysterious MatchExceptions. For example:
>
> def collect[B, That](pf: PartialFunction[A, B])(implicit bf:
> CanBuildFrom[Repr, B, That]): That = {
> val b = bf(repr)
> for (x <- this) pf.mapOrElse(x, (v: B) => b += v, (_: A) => ())
> b.result
> }
>
> This being a common use case, it could be tidied up with something
> like the following (which is screaming out for a better name than
> 'mapIf'):
>
> trait PartialFunction[-A, +B] extends (A => B) {
> def mapIf[A1 <: A, B1 >: B] (x :A1, f: B1 => Unit) :Unit =
> mapOrElse(x, f, (_: A1) => ())
> }
>
> which gives us:
>
> def collect[B, That](pf: PartialFunction[A, B])(implicit bf:
> CanBuildFrom[Repr, B, That]): That = {
> val b = bf(repr)
> for (x <- this) pf.mapOrElse(x, b.+=)
> b.result
> }
>
> There is the unfortunate additional expense of apply calling
> applyOrElse calling mapOrElse with identity, but I think the vast
> majority of users of PartialFunction will find themselves going
> directly to mapOrElse. The various PartialFunction combinators also
> avoid this overhead because only the outermost PartialFunction will go
> through two extra hops, and then all of the actual evaluation will
> take place in a chain of mapOrElse calls. Thus this change might let
> Martin have the performance cake he was looking for with
> FunctionWithDefault, and eat it too.
>
> See here for the new implementations of orElse, andThen, etc.:
> https://gist.github.com/799916
>
> One could then put a nice warning on isDefinedAt saying "Beware the
> madness of `if(pf.isDefinedAt(x)) pf(x)`. Use `mapOrElse` or `mapIf`."
> Only the rare circumstances like Lift's delayed initialization would
> ever need to call isDefinedAt.
>
> This does introduce additional complexity into the compiler, as it
> will have to thread the success continuation through the
> implementation when generating an implementation of mapOrElse (as
> Martin mentioned offhandedly when considering other possibilities for
> FunctionWithDefault). But trading a bit of compiler complexity for
> increased performance and a decrease in scary edge cases with arguable
> correctness properties sounds like a win.
>
Fri, 2011-01-28, 13:47
#36
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 1:16 PM, Ruediger Keller <ruediger.keller@rk42.de> wrote:
I'm surprised that there is so much discussion on this topic, as I hadYes, I still think that's the way to go. Thanks for digging it up!
the impression that it was already decided by Martin what the way to
go is. I'm referring to his mail from 2010-11-02 where he suggests the
following:
************
Define Function1 like this:
class Function1[-A, +B] {
def apply(x: A): B
def missingCase(x: A): B = throw new MatchError
...
}
2. Change the rules so that _any_ subclass of Function can be
implemented by a closure { x => ... }, as long as the class contains a
single abstract method named `apply'.
3. Change the rules so that in a pattern matching expression
x match { case P1 => ... case Pn => ... }
that was the body of a function, a selector that was not matched by
any of the patterns invoked missingCase(x) instead of immediately
throwing a MatchError(x).
************
-- Martin
Fri, 2011-01-28, 14:07
#37
Re: Re: new motivation for applyIfDefined
I don't get how this fixes the collect case, as you would have to dynamically sub-class the partial function to write your own fixingCase that did something different. I'm not seeing how this would be translated into a use-case like collect.
On Fri, Jan 28, 2011 at 7:46 AM, martin odersky <martin.odersky@epfl.ch> wrote:
On Fri, Jan 28, 2011 at 7:46 AM, martin odersky <martin.odersky@epfl.ch> wrote:
On Fri, Jan 28, 2011 at 1:16 PM, Ruediger Keller <ruediger.keller@rk42.de> wrote:
I'm surprised that there is so much discussion on this topic, as I hadYes, I still think that's the way to go. Thanks for digging it up!
the impression that it was already decided by Martin what the way to
go is. I'm referring to his mail from 2010-11-02 where he suggests the
following:
************
Define Function1 like this:
class Function1[-A, +B] {
def apply(x: A): B
def missingCase(x: A): B = throw new MatchError
...
}
2. Change the rules so that _any_ subclass of Function can be
implemented by a closure { x => ... }, as long as the class contains a
single abstract method named `apply'.
3. Change the rules so that in a pattern matching expression
x match { case P1 => ... case Pn => ... }
that was the body of a function, a selector that was not matched by
any of the patterns invoked missingCase(x) instead of immediately
throwing a MatchError(x).
************
-- Martin
Fri, 2011-01-28, 14:17
#38
Re: Re: new motivation for applyIfDefined
>>>>> "Matthew" == Matthew Pocock writes:
Matthew> For now, any solution that avoids the double-evaluation, so
Matthew> hides the implementation better, is acceptable.
Agree.
Also, I don't think anyone has mentioned this yet: what if the predicate
is expensive? Then the double evaluation is a performance bug -- even
if the predicate is pure. This isn't merely about predicates that
change their minds.
I find it rather bizarre that some are actually arguing that the double
evaluation here is just fine. Evaluating user code twice for no
defensible reason at all is a bug, plain and simple.
Fri, 2011-01-28, 14:27
#39
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 1:57 PM, Josh Suereth <joshua.suereth@gmail.com> wrote:
I don't get how this fixes the collect case, as you would have to dynamically sub-class the partial function to write your own fixingCase that did something different. I'm not seeing how this would be translated into a use-case like collect.The collect method could take a subclass of Function1 that defines missingCase.
Cheers
-- Martin
Fri, 2011-01-28, 14:37
#40
Re: Re: new motivation for applyIfDefined
Yes, but you're pushing the issue onto users of partial functions. I'm not sure that's intuitive, and we'd still get back to the same issue for those who do not define missing case (say, an anonymous partial function).
To allow library writers to write code that allows users to write anonymous partial functions: I think we need some way to partially evaluate the partial function and continue if something is defined.
I think a simple applyOrNone(x : A) : Option[B] would be alright for this case, but again would lead to more code bloat for all anonymous partial functions.
On Fri, Jan 28, 2011 at 8:11 AM, martin odersky <martin.odersky@epfl.ch> wrote:
To allow library writers to write code that allows users to write anonymous partial functions: I think we need some way to partially evaluate the partial function and continue if something is defined.
I think a simple applyOrNone(x : A) : Option[B] would be alright for this case, but again would lead to more code bloat for all anonymous partial functions.
On Fri, Jan 28, 2011 at 8:11 AM, martin odersky <martin.odersky@epfl.ch> wrote:
On Fri, Jan 28, 2011 at 1:57 PM, Josh Suereth <joshua.suereth@gmail.com> wrote:
I don't get how this fixes the collect case, as you would have to dynamically sub-class the partial function to write your own fixingCase that did something different. I'm not seeing how this would be translated into a use-case like collect.The collect method could take a subclass of Function1 that defines missingCase.
Cheers
-- Martin
Fri, 2011-01-28, 14:47
#41
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 2:17 PM, Josh Suereth <joshua.suereth@gmail.com> wrote:
Yes, but you're pushing the issue onto users of partial functions. I'm not sure that's intuitive, and we'd still get back to the same issue for those who do not define missing case (say, an anonymous partial function).
Not at all. The collect method needs to take as argument a subclass of Function1 which defines missingCase. The user just writes a closure, as always.
Cheers
-- Martin
Mon, 2011-01-31, 18:17
#42
Re: Re: new motivation for applyIfDefined
On Fri, Jan 28, 2011 at 4:46 AM, martin odersky wrote:
> On Fri, Jan 28, 2011 at 1:16 PM, Ruediger Keller
> wrote:
>>
>> I'm referring to his mail from 2010-11-02 where he suggests the
>> following:
>>
>> ************
>> Define Function1 like this:
>>
>> class Function1[-A, +B] {
>> def apply(x: A): B
>> def missingCase(x: A): B = throw new MatchError
>> ...
>> }
>>
>> 2. Change the rules so that _any_ subclass of Function can be
>> implemented by a closure { x => ... }, as long as the class contains a
>> single abstract method named `apply'.
>>
>> 3. Change the rules so that in a pattern matching expression
>>
>> x match { case P1 => ... case Pn => ... }
>>
>> that was the body of a function, a selector that was not matched by
>> any of the patterns invoked missingCase(x) instead of immediately
>> throwing a MatchError(x).
>>
> Yes, I still think that's the way to go. Thanks for digging it up!
I did some profiling of using a missingCase function to throw a
(preallocated) exception to implement PartialFunction-like behavior
with collect() as a client. I compared it against the existing
PartialFunction implementation, as well as an implementation which
uses an extra level of indirection but avoids the exception.
Unfortunately, the exception-based approach was rather unpredictable
with regard to whether Hotspot optimized it. When it failed to do so,
there was a 5x-10x slowdown. The explicit indirection approach was
faster than the existing PartialFunction approach (because it avoids
double evaluation of the guards), but only if one was very careful to
avoid creating a new closure on every function application (or if the
guards are very expensive).
The full analysis is here:
https://github.com/samskivert/scala-pf-rethink
Feel free to fork it and play around with it, and if there are glaring
errors in my method, I'd love to hear about them.
Fri, 2011-02-04, 01:37
#43
Re: Re: new motivation for applyIfDefined
I just noticed that this proposal will make implicit implicits possible at last! I wonder what impact it would have on the STM library/API?
--
Daniel C. Sobral
I travel to the future all the time.
--
Daniel C. Sobral
I travel to the future all the time.
It occurs to me a hackier and repellent way to fix collect, but which we
should definitely do if we can do nothing else, is
// current implementation
def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
for (x <- this) if (pf.isDefinedAt(x)) b += pf(x)
b.result
}
// paulp's "we're going to hell" implementation
def collect[B, That](pf: PartialFunction[A, B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
val b = bf(repr)
for (x <- this) if (pf.isDefinedAt(x)) {
try b += pf(x)
catch { case _: MatchError => () }
}
b.result
}