2.8 migration guide needed
Thu, 2010-02-04, 18:47
I just hit a pretty subtle issue while porting some old code to 2.8. After thinking about it, first it made sense, then it didn't make sense anymore, and at any rate it's annoying because it's a runtime error.
So here it goes:
val line: String = "name=value"
line.split('=') match {
case Seq("name", v) => v
}
This worked fine in 2.7, but in 2.8 it throws a MatchError. Arrays are not sequences anymore, and 'split' returns an array. That's the 'makes sense' part.
According to the spec though, it is unclear why the pattern match type-checks in the first place: each pattern is typed with Array[String] as its expected type, and Seq is not a subtype of Array. I guess extractor patterns are very much untyped, because even this compiles:
1 match {
case Seq(1, 2, 3) =>
}
I tried to add an unapplySeq[T](x: Array[T]) in object Seq, but for some reason it was never called. I suppose it's because overloading of unapplySeq is not supported (again, the spec is not very clear on how an alternative is selected if more than one unapplySeq method exists in the same object). It would be nice if this could be solved so that code continues to work as expected.
iulian
--
« Je déteste la montagne, ça cache le paysage »
Alphonse Allais
Thu, 2010-02-04, 19:37
#2
Re: 2.8 migration guide needed
On Thu, Feb 4, 2010 at 6:46 PM, Iulian Dragos wrote:
> I just hit a pretty subtle issue while porting some old code to 2.8. After
> thinking about it, first it made sense, then it didn't make sense anymore,
> and at any rate it's annoying because it's a runtime error.
>
> So here it goes:
>
> val line: String = "name=value"
> line.split('=') match {
> case Seq("name", v) => v
> }
>
> This worked fine in 2.7, but in 2.8 throws a MatchError. Arrays are not
> sequences anymore, and 'split' returns an array. That's the 'makes sense
> part'.
>
> According to the spec though, it is unclear why the patternmatch type-checks
> in the first place: each pattern is typed having Array[String] as expected
> type, and Seq is not a subtype of arrays. I guess extractor patterns are
> very much untyped, because even this compiles:
>
> 1 match {
> case Seq(1, 2, 3) =>
> }
>
Indeed. Extractors match anything. Paul lays out why in his previous mails.
> I tried to add an unapplySeq[T](x: Array[T]) in object Seq, but for some
> reason it wouldn't ever be called. I suppose it's because overloading of
> unapplySeq is not supported (again, the spec is not very clear how an
> alternative is selected, if more than one unapplySeq methods exist in the
> same object). It would be nice if this could be solved so that code
> continues to work as expected.
It would be nice, but it will take work to get there. Currently the
system handles only one unapplySeq method per extractor. (There
probably should be a compile-time check enforcing this!) If we handle
multiple unapplySeq's I would imagine they would be treated with an
if-then-else. To be concrete, assume:
object X {
def unapplySeq[T](x: Seq[T]): Option[Seq[T]]
def unapplySeq[T](x: Array[T]): Option[Array[T]]
}
But then, how do you choose between the two? Maybe like this?
if (scrutinee.isInstanceOf[Seq]) and first unapplySeq matches ...
else if (scrutinee.isInstanceOf[Array]) and second unapplySeq matches.
But why do it in this order and not the other way around? How do you
specify this, if the Scala spec has no notion of the order in which
methods are defined? (For instance, both unapplySeq's might
be inherited from different traits by X.)
Another tricky problem is: How do you type the pattern afterwards? In
the example above we do not know what gets returned from the
unapplySeq. Do we insist that they all return the same, or do we
form a lub? There's a fishing supply store full of cans of worms here!
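A minimal sketch of how the if-then-else dispatch could be written by hand in user code today, without overloading: one unapplySeq that takes Any and tests for Seq first, then Array. The names SeqOrArray and ExtractorDemo are hypothetical and not part of the standard library. Note that the sketch sidesteps the typing question by giving up on element types: every bound element is typed Any, so the pattern has to re-test it (v: String).

```scala
// Hypothetical extractor, not part of the standard library.
object SeqOrArray {
  // A single unapplySeq, so no overload resolution is involved.
  // The order of the cases below is the hand-written "if-then-else".
  def unapplySeq(x: Any): Option[Seq[Any]] = x match {
    case s: Seq[_]   => Some(s)        // sequences are passed through
    case a: Array[_] => Some(a.toSeq)  // arrays are viewed as a Seq
    case _           => None           // anything else does not match
  }
}

object ExtractorDemo {
  // Elements come back as Any, hence the typed pattern on v.
  def valueOf(line: String): Option[String] =
    line.split('=') match {
      case SeqOrArray("name", v: String) => Some(v)
      case _                             => None
    }
}
```

The same pattern also accepts a Seq scrutinee, which is the behaviour the overloaded version was meant to provide.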
Cheers
Fri, 2010-02-05, 12:57
#3
Re: 2.8 migration guide needed
On Thu, Feb 4, 2010 at 7:32 PM, martin odersky <martin.odersky@epfl.ch> wrote:
> > According to the spec though, it is unclear why the patternmatch type-checks
> > in the first place: each pattern is typed having Array[String] as expected
> > type, and Seq is not a subtype of arrays. I guess extractor patterns are
> > very much untyped, because even this compiles:
> >
> > 1 match {
> > case Seq(1, 2, 3) =>
> > }
> >
> Indeed. Extractors match anything. Paul lays out in his previous mails why.
I understood why, and basically retraced Paul's steps. My message had two intentions. The first was to document an issue when porting to 2.8, so that it can be included in a future migration guide (I didn't know about the previous discussion, as it didn't show up in a search for '2.8 migration'; I guess that's a good reason to create that document). The second was to point out some underspecified sections of Scala (I know you're aware of them from a previous discussion, but it's a good idea to have it written down so others can find it).
> > I tried to add an unapplySeq[T](x: Array[T]) in object Seq, but for some
> > reason it wouldn't ever be called. I suppose it's because overloading of
> > unapplySeq is not supported (again, the spec is not very clear how an
> > alternative is selected, if more than one unapplySeq methods exist in the
> > same object). It would be nice if this could be solved so that code
> > continues to work as expected.
> It would be nice, but it will take work to get there. Currently the
> system handles only one unapplySeq method per extractor. (There
> probably should be a compile-time check enforcing this!) If we handle
> multiple unapplySeq's I would imagine they would be treated with an
> if-then-else. To be concrete, assume:
> object X {
> def unapplySeq[T](x: Seq[T]): Option[Seq[T]]
> def unapplySeq[T](x: Array[T]): Option[Array[T]]
> }
> But then, how do you choose between the two. Maybe like this?
Well, there must be a way to choose, since the compiler already does it. It's just not specified in any way. I assume it's the first member that comes up in a lookup.
> if (scrutinee.isInstanceOf[Seq]) and first unapplySeq matches ...
> else if (scrutinee.isInstanceOf[Array]) and second unapply matches.
I remember a discussion long ago with Burak, in which he said that an instanceOf/cast is anyway performed for unapplies, from the scrutinee type to the parameter type of the unapply (which is not needed in many cases). So this solution seems in line with the current design. The order in which they are tried is trickier. I would go for something similar to implicit resolution, giving higher priority to methods defined lower in the type hierarchy.
> Another tricky problem is: How do you type the pattern afterwards? In
> the example above we do not know what gets returned from the
> unapplySeq. Do we insist that they all return the same, or do we
> form a lub? There's a fishing supply store full of cans of worms here!
Agreed. Either would be fine. For now, even an error when more than one unapply exists in the same object would be good.
I think that independent of these language design issues, something should be done to help in migrating existing code to 2.8. I think a simple warning whenever an Array is matched on a sequence pattern would catch most problems (as I don't see a way to replicate the old functionality with new Arrays without significant changes to the pattern matcher).
iulian
Cheers
Fri, 2010-02-05, 16:07
#4
Re: 2.8 migration guide needed
On Fri, Feb 5, 2010 at 12:53 PM, Iulian Dragos wrote:
>
>
> On Thu, Feb 4, 2010 at 7:32 PM, martin odersky
> wrote:
>>
>> > According to the spec though, it is unclear why the patternmatch
>> > type-checks
>> > in the first place: each pattern is typed having Array[String] as
>> > expected
>> > type, and Seq is not a subtype of arrays. I guess extractor patterns are
>> > very much untyped, because even this compiles:
>> >
>> > 1 match {
>> > case Seq(1, 2, 3) =>
>> > }
>> >
>> Indeed. Extractors match anything. Paul lays out in his previous mails why.
>
> I understood why, and basically retraced Paul's steps. My message had two
> intentions: document an issue when porting to 2.8, and be included in a
> future migration guide (didn't know about the previous discussion, as it
> didn't show up on a search for '2.8 migration' -- I guess that's a good
> reason to create that document). The second goal was to point out some
> underspecified sections of Scala (I know you're aware of them from a previous
> discussion, but it's a good idea to have it written so others can find it).
>
>>
>> > I tried to add an unapplySeq[T](x: Array[T]) in object Seq, but for some
>> > reason it wouldn't ever be called. I suppose it's because overloading of
>> > unapplySeq is not supported (again, the spec is not very clear how an
>> > alternative is selected, if more than one unapplySeq methods exist in
>> > the
>> > same object). It would be nice if this could be solved so that code
>> > continues to work as expected.
>>
>> It would be nice, but it will take work to get there. Currently the
>> system handles only one unapplySeq method per extractor. (There
>> probably should be a compile-time check enforcing this!) If we handle
>> multiple unapplySeq's I would imagine they would be treated with an
>> if-then-else. To be concrete, assume:
>>
>> object X {
>> def unapplySeq[T](x: Seq[T]): Option[Seq[T]]
>> def unapplySeq[T](x: Array[T]): Option[Array[T]]
>> }
>>
>> But then, how do you choose between the two. Maybe like this?
>
> Well, there must be a way to choose, since the compiler already does it.
> It's just not specified in any way. I assume it's the first member that
> comes up in a lookup.
>
yes, but that's not a very good way to specify things!
Thinking about it, this is such a huge can of worms that we should not
go there. We should just enforce that unapply and unapplySeq are not
overloaded.
For the migration, the safest thing to do is to duplicate every
case Seq(...) => ...
to
case Seq(...) =>
case Array(...) =>
Then, after a more careful analysis one can most times drop one of the
two cases.
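As a sketch of where that analysis usually ends up for the original split example: since 'split' always returns an Array, only the Array case is needed. If the Seq pattern has to stay as written, converting the scrutinee with toSeq is another option. SplitMatch and its method names are hypothetical, and the catch-all cases are added here only so that the functions return None instead of throwing a MatchError.

```scala
object SplitMatch {
  // Option 1: match the Array returned by split with an Array pattern.
  def valueOf(line: String): Option[String] =
    line.split('=') match {
      case Array("name", v) => Some(v)
      case _                => None
    }

  // Option 2: convert to a Seq first, so the old Seq pattern still applies.
  def valueOfViaSeq(line: String): Option[String] =
    line.split('=').toSeq match {
      case Seq("name", v) => Some(v)
      case _              => None
    }
}
```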
Cheers
Fri, 2010-02-05, 16:47
#5
Re: 2.8 migration guide needed
On Fri, Feb 05, 2010 at 04:00:25PM +0100, martin odersky wrote:
> For the migration, the safest thing to do is to duplicate every
>
> case Seq(...) => ...
>
> to
>
> case Seq(...) =>
> case Array(...) =>
It's worse than that, e.g. my original example
("abcd" drop 2) match { ...
which used to return a Seq and now returns a String.
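A small sketch of that case, assuming the goal is to keep a Seq pattern working: 'drop' on a String now yields a String, so the result is converted explicitly with toSeq before matching. DropMatch and its method names are hypothetical.

```scala
object DropMatch {
  // drop on a String returns a String in 2.8 and later.
  def dropped(s: String): String = s drop 2

  // Convert explicitly to a Seq[Char] so that a Seq pattern can match.
  def endsInCd(s: String): Boolean =
    dropped(s).toSeq match {
      case Seq('c', 'd') => true
      case _             => false
    }
}
```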
I am still loosely planning on some kind of -Xmigration command line
switch to warn about recognizable constructs with altered semantics, and
this one would be a real contender. I agree that trying to fully tackle
the problem involves enough worms to fill several cans, but warning when
people ask about it doesn't.
On Thu, Feb 04, 2010 at 06:46:31PM +0100, Iulian Dragos wrote:
> val line: String = "name=value"
> line.split('=') match {
> case Seq("name", v) => v
> }
>
> This worked fine in 2.7, but in 2.8 throws a MatchError.
I first brought this up in October:
http://www.scala-lang.org/node/3670
Please read all my messages in that thread and tell me what you think,
because that's as far as I could take it without some input.