Performance and complexity

21 replies

Tue, 2011-07-19, 22:08

Cédric Beust ♔

Joined: 2011-06-17,

Hi everyone,
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:

def eliminate[@specialized(Byte) T : ClassManifest](xs: Array[T],
slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
if(i == -1) xs else xs.take(i) ++ eliminate(xs.drop(i + slice.length), slice)

}
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; color: #7e7e7e} p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; min-height: 14.0px} span.s1 {color: #4a4a4a} span.s2 {color: #333333}

private static List<Integer> eliminate(List<Integer> value) {

List<Integer> result = Lists.newArrayList();

for (int i = 0; i < value.size(); i++) {

if (value.get(i) == SLICE.get(0) && value.get(i + 1) == SLICE.get(1)

&& value.get(i + 2) == SLICE.get(2)) {

i += 2;

} else {

result.add(value.get(i));

}

return result;

}

I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric

Tue, 2011-07-19, 22:27

Joshua.Suereth

Joined: 2008-09-02,

Re: Performance and complexity

Your Java version doesn't abstract over the size of the slice.
I'm working on a better scala version, but part of the pain is avoiding the scala.Stream class right now. I should have something shortly.
- Josh

2011/7/19 Cédric Beust ♔ <cedric@beust.com>

Hi everyone,
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:
def eliminate[@specialized(Byte) T : ClassManifest](xs: Array[T],
slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
if(i == -1) xs else xs.take(i) ++ eliminate(xs.drop(i + slice.length), slice)
}
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):

private static List<Integer> eliminate(List<Integer> value) {

List<Integer> result = Lists.newArrayList();

for (int i = 0; i < value.size(); i++) {

if (value.get(i) == SLICE.get(0) && value.get(i + 1) == SLICE.get(1)
&& value.get(i + 2) == SLICE.get(2)) {

i += 2;

} else {

result.add(value.get(i));

}

}

return result;

}

I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric

Tue, 2011-07-19, 22:47

ichoran

Joined: 2009-08-14,

Re: Performance and complexity

The Scala version uses concat on arrays, which, since concat is O(n), is a O(n^2) operation. Oh, and it's not tail-recursive, which will clobber your stack. (I mentioned this already in that thread.)

It's not hard to avoid those problems:

def eliminated[@specialized(Byte) T: ClassManifest](
xs: Array[T], slice: Seq[T], found: List[Array[T]] = Nil
): Array[T] = {
val i = xs indexOfSlice slice
if (i < 0) (xs :: found).toArray.reverse.flatten
else eliminated(xs.drop(i+slice.length), slice, xs.take(i) :: found)
}

--Rex

2011/7/19 Cédric Beust ♔ <cedric@beust.com>

Hi everyone,
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:
def eliminate[@specialized(Byte) T : ClassManifest](xs: Array[T],
slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
if(i == -1) xs else xs.take(i) ++ eliminate(xs.drop(i + slice.length), slice)
}
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):

private static List<Integer> eliminate(List<Integer> value) {

List<Integer> result = Lists.newArrayList();

for (int i = 0; i < value.size(); i++) {

if (value.get(i) == SLICE.get(0) && value.get(i + 1) == SLICE.get(1)
&& value.get(i + 2) == SLICE.get(2)) {

i += 2;

} else {

result.add(value.get(i));

}

}

return result;

}

I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric

Tue, 2011-07-19, 22:58

DaveScala

Joined: 2011-03-18,

Re: Performance and complexity

Maybe you could optimize the fucntion it for tail call recursion

package eliminate

import scala.annotation.tailrec

object Main extends App {

@tailrec
def eliminate[@specialized(Byte) T : ClassManifest](acc: Array[T], xs:
Array[T], slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
i match {
case -1 => acc ++ xs
case _ => eliminate(acc ++ xs.take(i), xs.drop(i + slice.length),
slice)
}
}

val acc : Array[Byte] = Array.empty
val ar = List[Byte](1, 2, 26, 20, 64, 3, 26, 20, 64, 4).toArray
val sl = Seq[Byte](26, 20, 64)
val el = eliminate[Byte](acc, ar, sl)
println(el.toList)

}

Wed, 2011-07-20, 00:07

Derek Williams

Joined: 2009-06-13,

Re: Performance and complexity

This seems plenty fast (written quickly so variable names suffer):

def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
case x :: xs =>
val newResult = x :: result
iter(xs, sliceList, newResult, newResult)
}
iter(seq.toList, sliceList, Nil, Nil)
}

Only tested with an Array[Byte] up to 1M, and it was just the original
10 digits repeated. Didn't time it, only tested within the REPL, but
it returns fast even if I convert the List back to an Array.

--
Derek Williams

Wed, 2011-07-20, 00:27

Derek Williams

Joined: 2009-06-13,

Re: Performance and complexity

Woops, little bug there. newResult should be built on fallback, not result.

Derek Williams

On Jul 19, 2011 4:59 PM, "Derek Williams" <derek@nebvin.ca> wrote:
> This seems plenty fast (written quickly so variable names suffer):
>
> def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
> val sliceList = slice.toList
> def iter(xs: List[A], ys: List[A], result: List[A], fallback:
> List[A]): List[A] = xs match {
> case xs if ys.isEmpty => iter(xs, sliceList, result, result)
> case Nil => fallback.reverse
> case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
> case x :: xs =>
> val newResult = x :: result
> iter(xs, sliceList, newResult, newResult)
> }
> iter(seq.toList, sliceList, Nil, Nil)
> }
>
> Only tested with an Array[Byte] up to 1M, and it was just the original
> 10 digits repeated. Didn't time it, only tested within the REPL, but
> it returns fast even if I convert the List back to an Array.
>
> --
> Derek Williams

Wed, 2011-07-20, 03:47

Stefan Wagner

Joined: 2011-04-08,

Re: Performance and complexity

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Am 20.07.2011 01:20, schrieb Derek Williams:
> Woops, little bug there. newResult should be built on fallback, not result.

and a second one:

eliminate (Seq(1, 2, 3, 4, 3, 4, 2, 5), Seq(3, 4, 2))
res168: List[Int] = List(1, 2, 3, 4, 3, 4, 2, 5)

easy to find, since I shortly thought a fast solution ... looking for a
3-element pattern from the back, and jumping length(pattern) forward, if
it doesn't match, but: The end of the pattern might match in the
beginning again.

def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
case x :: xs =>
val newResult = x :: fallback
iter(xs, sliceList, newResult, newResult)
}
iter(seq.toList, sliceList, Nil, Nil)
}

I understood ^.

Wed, 2011-07-20, 04:07

Derek Williams

Joined: 2009-06-13,

Re: Performance and complexity

On Tue, Jul 19, 2011 at 8:41 PM, Stefan Wagner wrote:
> easy to find, since I shortly thought a fast solution ... looking for a
> 3-element pattern from the back, and jumping length(pattern) forward, if
> it doesn't match, but: The end of the pattern might match in the
> beginning again.

Good catch. I knew I was missing something. I actually had it checking
for that before and I must have "optimized" it away. This one works:

def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x ::
fallback)
case x :: xs if ys eq sliceList =>
val newResult = x :: result
iter(xs, sliceList, newResult, newResult)
case xs =>
iter(xs, sliceList, fallback, fallback)
}
iter(seq.toList, sliceList, Nil, Nil)
}

--
Derek Williams

Wed, 2011-07-20, 08:27

odersky

Joined: 2008-07-29,

Re: Performance and complexity

I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.

Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!

For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.

Cheers

-- Martin

Wed, 2011-07-20, 08:57

Alex Cruise

Joined: 2008-12-17,

Re: Performance and complexity

On Wed, Jul 20, 2011 at 12:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:

... a local mutable variable is no big deal, but think twice before introducing a global one!

I wish I could get a compiler warning when a local mutable variable escapes into a closure... Oh, effect system, will you ever be mine? ;)
-0xe1a

Wed, 2011-07-20, 09:17

#10

Sadek Drobi

Joined: 2010-09-22,

Re: Performance and complexity

My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:

I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.

Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!

For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.

Cheers

-- Martin

--
www.sadekdrobi.com
ʎdoɹʇuǝ

Wed, 2011-07-20, 10:37

#11

Sadek Drobi

Joined: 2010-09-22,

Re: Performance and complexity

I forgot a call :p
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimStart(slice,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }

On Wed, Jul 20, 2011 at 10:06 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:

My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:

I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.

Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!

For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.

Cheers

-- Martin

--
www.sadekdrobi.com
ʎdoɹʇuǝ

--
www.sadekdrobi.com
ʎdoɹʇuǝ

Wed, 2011-07-20, 10:57

#12

Sadek Drobi

Joined: 2010-09-22,

Re: Performance and complexity

I will stop spamming the list, but here is a version with all recursive calls optimized
def trimStart[A](slice:Seq[A],list:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A],original:Seq[A]=list):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimSt(slice,l,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,list,list)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 11:20 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:

I forgot a call :p
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimStart(slice,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }

On Wed, Jul 20, 2011 at 10:06 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:

My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:

I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.

Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!

For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.

Cheers

-- Martin

--
www.sadekdrobi.com
ʎdoɹʇuǝ

--
www.sadekdrobi.com
ʎdoɹʇuǝ

--
www.sadekdrobi.com
ʎdoɹʇuǝ

Wed, 2011-07-20, 21:37

#13

odersky

Joined: 2008-07-29,

Re: Performance and complexity

Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:

def elim(xs: List[Int]): List[Int] = xs match {
    case 26 :: 20 :: 64 :: rest => elim(rest)
    case x :: xs1 => x :: elim(xs1)
    case Nil => Nil
}

Cheers

-- Martin

Wed, 2011-07-20, 22:27

#14

Cédric Beust ♔

Joined: 2011-06-17,

Re: Performance and complexity

Hi Martin,
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
-- Cédric

2011/7/20 martin odersky <martin.odersky@epfl.ch>

Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:

def elim(xs: List[Int]): List[Int] = xs match {
    case 26 :: 20 :: 64 :: rest => elim(rest)
    case x :: xs1 => x :: elim(xs1)
    case Nil => Nil
}

Cheers

-- Martin

Thu, 2011-07-21, 04:17

#15

gerferra

Joined: 2009-08-26,

Re: Performance and complexity

Hi,
What about this version?
def elim(xs: List[Int], slice: List[Int]): Vector[Int] = { @tailrec def elimAcc(xs: List[Int], acc: Vector[Int]): Vector[Int] = xs match { case xs if(xs.take(3) == slice) => elimAcc(xs drop 3, acc) case x :: xs1 => elimAcc(xs1, acc :+ x) case Nil => acc } elimAcc(xs, Vector.empty[Int]) }
Regards,Germán
2011/7/20 Cédric Beust ♔ <cedric@beust.com>

Hi Martin,
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
-- Cédric

2011/7/20 martin odersky <martin.odersky@epfl.ch>
Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:

def elim(xs: List[Int]): List[Int] = xs match {
    case 26 :: 20 :: 64 :: rest => elim(rest)
    case x :: xs1 => x :: elim(xs1)
    case Nil => Nil
}

Cheers

-- Martin

Thu, 2011-07-21, 05:37

#16

Tony Morris 2

Joined: 2009-03-20,

Re: Performance and complexity

On 21/07/11 06:35, martin odersky wrote:
> Like I said, if performance matters and the array is big, I would use
> the imperative solution. If simplicity and elegance matters, I think
> the most straightforward solution is to use pattern matching over lists:
>
> def elim(xs: List[Int]): List[Int] = xs match {
> case 26 :: 20 :: 64 :: rest => elim(rest)
> case x :: xs1 => x :: elim(xs1)
> case Nil => Nil
> }
>
> Cheers
>

Thu, 2011-07-21, 06:17

#17

Derek Williams

Joined: 2009-06-13,

Re: Performance and complexity

Do you suggest working with Arrays? Or is there a better alternative?

At the defense of List, I have found prepending to a List, followed by reverse, to be faster than appending to most others, especially other immutable collections.

Derek Williams

On Jul 20, 2011 10:30 PM, "Tony Morris" <tonymorris@gmail.com> wrote:
> On 21/07/11 06:35, martin odersky wrote:
>> Like I said, if performance matters and the array is big, I would use
>> the imperative solution. If simplicity and elegance matters, I think
>> the most straightforward solution is to use pattern matching over lists:
>>
>> def elim(xs: List[Int]): List[Int] = xs match {
>> case 26 :: 20 :: 64 :: rest => elim(rest)
>> case x :: xs1 => x :: elim(xs1)
>> case Nil => Nil
>> }
>>
>> Cheers
>>
>> -- Martin
>>
>
> I mostly agree. If performance matters, and you are doing prefix
> matching on a cons list, you might want to consider a more appropriate
> data structure. This inappropriate use of data structures is completely
> aside from the functional/imperative false dilemma.
>
> http://paste.pocoo.org/show/443309/
>
> --
> Tony Morris
> http://tmorris.net/
>
>

Thu, 2011-07-21, 06:27

#18

Tony Morris 2

Joined: 2009-03-20,

Re: Performance and complexity

On 21/07/11 15:14, Derek Williams wrote:
>
> Do you suggest working with Arrays? Or is there a better alternative?
>
This depends ultimately on all the operations required, including how
the "List" came to be in the first place. Certainly, prefix matching on
a cons list (or even a mutable list) for anything but a trivial data set
is not particularly noble and then measuring the performance is... a bit
ridonkulous.

A prefix tree (trie) may be more appropriate (of which there are many
variations), or perhaps even a Burkhard-Keller Tree (a bit of a stretch
there -- need to know all operations).

> At the defense of List, I have found prepending to a List, followed by
> reverse, to be faster than appending to most others, especially other
> immutable collections.
>
Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
other operations). It took some gymnastics to get there though.

Thu, 2011-07-21, 06:37

#19

Derek Williams

Joined: 2009-06-13,

Re: Performance and complexity

On Jul 20, 2011 11:19 PM, "Tony Morris" <tonymorris@gmail.com> wrote:
> Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
> other operations). It took some gymnastics to get there though.

I'll definitely have to check that out, thanks Tony.

>
> --
> Tony Morris
> http://tmorris.net/
>
>

Thu, 2011-07-21, 06:47

#20

Tony Morris 2

Joined: 2009-03-20,

Re: Performance and complexity

On 21/07/11 15:25, Derek Williams wrote:

pE9PEw [at] mail [dot] gmail [dot] com" type="cite">
On Jul 20, 2011 11:19 PM, "Tony Morris" <tonymorris [at] gmail [dot] com" rel="nofollow">tonymorris@gmail.com> wrote:
> Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
> other operations). It took some gymnastics to get there though.

I'll definitely have to check that out, thanks Tony.

>
> --
> Tony Morris
> http://tmorris.net/
>
>

https://github.com/scalaz/scalaz/blob/scalaz7/core/src/main/scala/scalaz/Dequeue.scala

It is similar to DList[1], which would unfortunately blow the stack for any large input, so is instead modelled with a pair of lists[2]. Mark Hibberd wrote most of it.

[1]
case class Endo[A](e: A => A)
case class DList[A](k: Endo[List[A]])
http://hackage.haskell.org/package/dlist

[2]
case class Endo[A](e: A => A)
case class Dequeue[A](pre: List[Endo[List[A]]], post: List[Endo[List[A]]])

-- 
Tony Morris
http://tmorris.net/

Fri, 2011-07-22, 12:07

#21

odersky

Joined: 2008-07-29,

Re: Performance and complexity

2011/7/20 Cédric Beust ♔ <cedric@beust.com>

Hi Martin,
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?

Not at all. It's not tail recursive, and I would not use on it on very long lists. The first problem (generalize to arbitrary subsequences) is solved by straightforward generalization. The following works for any slice:

def elim(xs: List[Int], slice: Seq[Int]): List[Int] = xs match {
    case _ if xs startsWith slice => xs drop slice.length
    case x :: xs1 => x :: elim(xs1, slice)
    case Nil => Nil
}

To come back to the original problem (long array, performance matters), I would simply fall back on an imperative solution, which is actually also pretty nice:

def elim(xs: Array[Byte], slice: Seq[Byte]): Array[Byte] = {
    val buf = new ArrayBuffer[Byte]
    var i = 0
    while (i < xs.length)
      if (xs startsWith (slice, i)) i += slice.length
      else { buf += xs(i); i += 1 }
    buf.toArray
}

Cheers

-- Martin

Scala Main Menu

Performance and complexity

Scala Quick Links

Featured News

User login