
pattern matching + string changes = oops

extempore
Joined: 2008-12-17,

Three guesses what this function always returns, now that calling drop
on a String returns a String.

def getName(s: String, index: Int): String = {
  if (index >= s.length) null
  else (s drop index) match {
    case Seq(x, xs @ _*) if isNameStart(x) => x.toString + (xs takeWhile isNameChar).mkString
    case _ => ""
  }
}

How can we avoid this kind of silent breakage? More tests at a minimum,
but the worst silent bugs always seem to pop out of pattern matching. I
would think this example should either stop compiling ("pattern Seq does
not match type String") or the String should match like a sequence.

In fact:

scala> Seq('a', 'b', 'c') match { case "abc" => true }
<console>:5: error: type mismatch;
 found   : java.lang.String("abc")
 required: Seq[Char]
       Seq('a', 'b', 'c') match { case "abc" => true }
                                       ^

That's what I expect, but the other way it's:

scala> "abc" match { case Seq('a', 'b', 'c') => true }
scala.MatchError: abc
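
One way to sidestep the silent failure, sketched here on the assumption that the intent was to match the string's characters as a sequence, is to convert explicitly before matching so that the scrutinee really is a collection. The isNameStart and isNameChar definitions below are hypothetical stand-ins, since the originals are not shown in this thread.

```scala
object GetNameSketch {
  // Hypothetical stand-ins for the predicates used in the post.
  def isNameStart(c: Char): Boolean = c.isLetter || c == '_'
  def isNameChar(c: Char): Boolean = c.isLetterOrDigit || c == '_'

  def getName(s: String, index: Int): String =
    if (index >= s.length) null
    else s.drop(index).toList match {
      // The scrutinee is a List[Char] here, so the sequence pattern can succeed.
      case x :: xs if isNameStart(x) => x.toString + xs.takeWhile(isNameChar).mkString
      case _ => ""
    }
}
```

With these stand-in predicates, GetNameSketch.getName("foo bar", 4) picks out the name starting at index 4. The explicit toList costs an allocation but makes the intended scrutinee type visible at the match site.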

extempore
Re: pattern matching + string changes = oops

On Tue, Oct 13, 2009 at 05:30:48AM -0700, Paul Phillips wrote:
> That's what I expect, but the other way it's:
>
> scala> "abc" match { case Seq('a', 'b', 'c') => true }
> scala.MatchError: abc

It occurs to me that this way isn't checked because anything might have
an unapplySeq method, so it's not a simple early static check. I don't
know if that rules it out or not.
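
To illustrate the "anything might have an unapplySeq method" point, here is a minimal sketch of a user-defined extractor whose unapplySeq accepts a String directly. StringChars is a hypothetical name, not something from the standard library.

```scala
object StringChars {
  // Hypothetical extractor: unapplySeq takes a String and exposes its characters.
  def unapplySeq(s: String): Option[Seq[Char]] = Some(s.toList)
}

object UnapplySeqSketch {
  def isAbc(s: String): Boolean = s match {
    // Calls StringChars.unapplySeq(s), then matches the three elements.
    case StringChars('a', 'b', 'c') => true
    case _                          => false
  }
}
```

Because the parameter type of the extractor decides whether a given scrutinee can match, the compiler has to look at each unapplySeq signature rather than at the pattern's name.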

Jorge Ortiz
Joined: 2008-12-16,
Re: Re: pattern matching + string changes = oops
But Seq's unapplySeq in SeqFactory.scala is defined as

  def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)

where CC[X] <: Seq[X].

So it seems like the compiler has enough information to figure out that "abc" will never match, if it wanted to.

--j


odersky
Joined: 2008-07-29,
Re: Re: pattern matching + string changes = oops

On Tue, Oct 13, 2009 at 6:32 PM, Jorge Ortiz wrote:
> But Seq's unapplySeq in SeqFactory.scala is defined as
>
>   def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)
>
> where CC[X] <: Seq[X].
>
> So it seems like the compiler has enough information to figure out that
> "abc" will never match, if it wanted to.
>
Yes, I agree. -- Martin

extempore
Re: Re: pattern matching + string changes = oops

On Tue, Oct 13, 2009 at 09:32:00AM -0700, Jorge Ortiz wrote:
> But Seq's unapplySeq in SeqFactory.scala is defined as
>
> def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)
>
> where CC[X] <: Seq[X].
>
> So it seems like the compiler has enough information to figure out
> that "abc" will never match, if it wanted to.

It turns out the reason it's not a type error is that the typer
considers implicits and sees it as Seq.unapplySeq(wrapString("abc")),
but the pattern matcher doesn't do implicit resolution like that. When
the matcher sees an unapply taking an argument of whatever type, it does
the equivalent of an isInstanceOf check (if necessary, not if the static
type conforms) and then either calls the method or considers that match
a failure.
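
The behaviour described above can be approximated by hand. This is only a rough sketch of the idea, not the code the compiler actually generates, and SeqOfChars is a hypothetical extractor with the same shape as Seq.unapplySeq.

```scala
object SeqOfChars {
  // Hypothetical extractor: like Seq.unapplySeq, it requires a Seq argument.
  def unapplySeq(x: Seq[Char]): Option[Seq[Char]] = Some(x)
}

object MatcherSketch {
  // Rough hand-written equivalent of:
  //   scrutinee match { case SeqOfChars('a', 'b', 'c') => true; case _ => false }
  def matchByHand(scrutinee: Any): Boolean =
    if (!scrutinee.isInstanceOf[Seq[_]]) false // type test fails: no implicit conversion is tried
    else SeqOfChars.unapplySeq(scrutinee.asInstanceOf[Seq[Char]]) match {
      case Some(elems) => elems == Seq('a', 'b', 'c')
      case None        => false
    }
}
```

A String fails the isInstanceOf test, so the extractor is never called and the case is simply treated as not matching.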

So the open question is whether it makes sense for implicit conversions
to operate at that point. Right now if you had two extractors which
took different parameter types, you could never usefully use them both
in the same match, but if the scrutinee were converted you could.

My first instinct is that not having implicits work here is a failure of
uniformity, but it's kind of surprising this way too. Whatever we
decide, it'll be nice when I get the matcher to the point that you can
count on impossible patterns being compile time errors.

extempore
Re: Re: pattern matching + string changes = oops

On Tue, Oct 13, 2009 at 06:46:28PM +0200, martin odersky wrote:
> > But Seq's unapplySeq in SeqFactory.scala is defined as
> >
> >   def unapplySeq[A](x: CC[A]): Some[CC[A]] = Some(x)
> >
> > where CC[X] <: Seq[X].
> >
> > So it seems like the compiler has enough information to figure out that
> > "abc" will never match, if it wanted to.
> >
> Yes, I agree. -- Martin

OK, so I spent a while seeing about this. The problem is that even if I
turn off implicits, the compiler has this information:

  scrutinee of type String
  unapplySeq method to be called, takes parameter of type Seq[X]

So Scala says "OK fine, the type we need is String with Seq[X]". Now we
humans know that type doesn't exist, but only because String is final.
If the Scala type system takes finality into account somewhere when
considering such questions, I don't see where. And if it doesn't, then
this becomes a runtime check on every scrutinee no matter what static
information is available, because you can only rule the type IN
statically, never rule it out.

I can definitely see sometimes wanting semantics where it won't compile
unless the static type guarantees the extractor will be called, but I
can definitely not see (and definitely would not want) those as the
default semantics. Maybe we need unapplyStrict.
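
The finality argument can be seen with a small sketch. Label is a hypothetical non-final class used in place of String: because it can be subclassed, a value of static type Label may also be a Seq at runtime, so the match cannot be ruled out statically.

```scala
object FinalitySketch {
  // Hypothetical non-final class standing in for the scrutinee's static type.
  class Label(val text: String)

  val plain: Label = new Label("abc")

  // Label is not final, so a value of type `Label with Seq[Char]` can exist.
  val both: Label = new Label("abc") with Seq[Char] {
    def apply(i: Int): Char = text.charAt(i)
    def length: Int = text.length
    def iterator: Iterator[Char] = text.iterator
  }

  // Both values have static type Label; only a runtime test tells them apart.
  def isAlsoSeq(l: Label): Boolean = l.isInstanceOf[Seq[_]]
}
```

For String no such subclass can be written, which is the extra fact a static check would need to use.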

Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland