- About Scala
- Documentation
- Code Examples
- Software
- Scala Developers
Performance and complexity
Tue, 2011-07-19, 22:08
Hi everyone,
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; color: #7e7e7e} p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; min-height: 14.0px} span.s1 {color: #4a4a4a} span.s2 {color: #333333}
I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:
def eliminate[@specialized(Byte) T : ClassManifest](xs: Array[T],
slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
if(i == -1) xs else xs.take(i) ++ eliminate(xs.drop(i + slice.length), slice)
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; color: #7e7e7e} p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 13.0px Inconsolata; min-height: 14.0px} span.s1 {color: #4a4a4a} span.s2 {color: #333333}
private static List<Integer> eliminate(List<Integer> value) {
List<Integer> result = Lists.newArrayList();
for (int i = 0; i < value.size(); i++) {
if (value.get(i) == SLICE.get(0) && value.get(i + 1) == SLICE.get(1)
&& value.get(i + 2) == SLICE.get(2)) {
i += 2;
} else {
result.add(value.get(i));
}
}
return result;
}
I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric
Tue, 2011-07-19, 22:47
#2
Re: Performance and complexity
The Scala version uses concat on arrays, which, since concat is O(n), is a O(n^2) operation. Oh, and it's not tail-recursive, which will clobber your stack. (I mentioned this already in that thread.)
It's not hard to avoid those problems:
def eliminated[@specialized(Byte) T: ClassManifest](
xs: Array[T], slice: Seq[T], found: List[Array[T]] = Nil
): Array[T] = {
val i = xs indexOfSlice slice
if (i < 0) (xs :: found).toArray.reverse.flatten
else eliminated(xs.drop(i+slice.length), slice, xs.take(i) :: found)
}
--Rex
2011/7/19 Cédric Beust ♔ <cedric@beust.com>
It's not hard to avoid those problems:
def eliminated[@specialized(Byte) T: ClassManifest](
xs: Array[T], slice: Seq[T], found: List[Array[T]] = Nil
): Array[T] = {
val i = xs indexOfSlice slice
if (i < 0) (xs :: found).toArray.reverse.flatten
else eliminated(xs.drop(i+slice.length), slice, xs.take(i) :: found)
}
--Rex
2011/7/19 Cédric Beust ♔ <cedric@beust.com>
Hi everyone,
I became intrigued with a problem that Mark Nelson posted to the list last week. Here is a link to the discussion.
The problem is simple: eliminate all the occurrences of the sequence (26, 20, 64) from a list. For example, given [1, 2, 26, 20, 64, 3, 26, 20, 64, 4], the result should be [1, 2, 3, 4].
Of all the solutions proposed, only the one proposed by Ruediger Keller worked:def eliminate[@specialized(Byte) T : ClassManifest](xs: Array[T],
}
slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
if(i == -1) xs else xs.take(i) ++ eliminate(xs.drop(i + slice.length), slice)
Daniel Sobral also offered a correct solution but it only works on Byte, so not really applicable here.
I became curious to see what kind of complexity the functional solution above is and how it compares to the trivial and linear imperative version (no need for Boyer-Moore, really):
private static List<Integer> eliminate(List<Integer> value) {
List<Integer> result = Lists.newArrayList();
for (int i = 0; i < value.size(); i++) {
if (value.get(i) == SLICE.get(0) && value.get(i + 1) == SLICE.get(1)
&& value.get(i + 2) == SLICE.get(2)) {
i += 2;
} else {
result.add(value.get(i));
}
}
return result;
}
I decided to run a few benchmarks and the results were surprising. I tried with various sizes of elements (10k, 100k, 1M, 10M) and with a fixed number of sequences to remove from them (3).
The imperative version solves the problem for lists up to 10M of elements in a few milliseconds.
However, the Scala version above takes 2 seconds for 1000 elements and doesn't finish after a few minutes for 10k elements.
Can someone suggest additional functional solutions to this problem that can at least approach the linear performance of the imperative version?
Also, what are the complexities of these functional solutions?
-- Cédric
Tue, 2011-07-19, 22:58
#3
Re: Performance and complexity
Maybe you could optimize the fucntion it for tail call recursion
package eliminate
import scala.annotation.tailrec
object Main extends App {
@tailrec
def eliminate[@specialized(Byte) T : ClassManifest](acc: Array[T], xs:
Array[T], slice: Seq[T]): Array[T] = {
val i = xs indexOfSlice slice
i match {
case -1 => acc ++ xs
case _ => eliminate(acc ++ xs.take(i), xs.drop(i + slice.length),
slice)
}
}
val acc : Array[Byte] = Array.empty
val ar = List[Byte](1, 2, 26, 20, 64, 3, 26, 20, 64, 4).toArray
val sl = Seq[Byte](26, 20, 64)
val el = eliminate[Byte](acc, ar, sl)
println(el.toList)
}
Wed, 2011-07-20, 00:07
#4
Re: Performance and complexity
This seems plenty fast (written quickly so variable names suffer):
def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
case x :: xs =>
val newResult = x :: result
iter(xs, sliceList, newResult, newResult)
}
iter(seq.toList, sliceList, Nil, Nil)
}
Only tested with an Array[Byte] up to 1M, and it was just the original
10 digits repeated. Didn't time it, only tested within the REPL, but
it returns fast even if I convert the List back to an Array.
--
Derek Williams
Wed, 2011-07-20, 00:27
#5
Re: Performance and complexity
Woops, little bug there. newResult should be built on fallback, not result.
Derek Williams
On Jul 19, 2011 4:59 PM, "Derek Williams" <derek@nebvin.ca> wrote:> This seems plenty fast (written quickly so variable names suffer):
>
> def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
> val sliceList = slice.toList
> def iter(xs: List[A], ys: List[A], result: List[A], fallback:
> List[A]): List[A] = xs match {
> case xs if ys.isEmpty => iter(xs, sliceList, result, result)
> case Nil => fallback.reverse
> case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
> case x :: xs =>
> val newResult = x :: result
> iter(xs, sliceList, newResult, newResult)
> }
> iter(seq.toList, sliceList, Nil, Nil)
> }
>
> Only tested with an Array[Byte] up to 1M, and it was just the original
> 10 digits repeated. Didn't time it, only tested within the REPL, but
> it returns fast even if I convert the List back to an Array.
>
> --
> Derek Williams
Wed, 2011-07-20, 03:47
#6
Re: Performance and complexity
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Am 20.07.2011 01:20, schrieb Derek Williams:
> Woops, little bug there. newResult should be built on fallback, not result.
and a second one:
eliminate (Seq(1, 2, 3, 4, 3, 4, 2, 5), Seq(3, 4, 2))
res168: List[Int] = List(1, 2, 3, 4, 3, 4, 2, 5)
easy to find, since I shortly thought a fast solution ... looking for a
3-element pattern from the back, and jumping length(pattern) forward, if
it doesn't match, but: The end of the pattern might match in the
beginning again.
def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x :: fallback)
case x :: xs =>
val newResult = x :: fallback
iter(xs, sliceList, newResult, newResult)
}
iter(seq.toList, sliceList, Nil, Nil)
}
I understood ^.
Wed, 2011-07-20, 04:07
#7
Re: Performance and complexity
On Tue, Jul 19, 2011 at 8:41 PM, Stefan Wagner wrote:
> easy to find, since I shortly thought a fast solution ... looking for a
> 3-element pattern from the back, and jumping length(pattern) forward, if
> it doesn't match, but: The end of the pattern might match in the
> beginning again.
Good catch. I knew I was missing something. I actually had it checking
for that before and I must have "optimized" it away. This one works:
def eliminate[A](seq: Seq[A], slice: Seq[A]): List[A] = {
val sliceList = slice.toList
def iter(xs: List[A], ys: List[A], result: List[A], fallback:
List[A]): List[A] = xs match {
case xs if ys.isEmpty => iter(xs, sliceList, result, result)
case Nil => fallback.reverse
case x :: xs if x == ys.head => iter(xs, ys.tail, result, x ::
fallback)
case x :: xs if ys eq sliceList =>
val newResult = x :: result
iter(xs, sliceList, newResult, newResult)
case xs =>
iter(xs, sliceList, fallback, fallback)
}
iter(seq.toList, sliceList, Nil, Nil)
}
--
Derek Williams
Wed, 2011-07-20, 08:27
#8
Re: Performance and complexity
I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.
Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!
For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.
Cheers
-- Martin
Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!
For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.
Cheers
-- Martin
Wed, 2011-07-20, 08:57
#9
Re: Performance and complexity
On Wed, Jul 20, 2011 at 12:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:
I wish I could get a compiler warning when a local mutable variable escapes into a closure... Oh, effect system, will you ever be mine? ;)
-0xe1a
... a local mutable variable is no big deal, but think twice before introducing a global one!
I wish I could get a compiler warning when a local mutable variable escapes into a closure... Oh, effect system, will you ever be mine? ;)
-0xe1a
Wed, 2011-07-20, 09:17
#10
Re: Performance and complexity
My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:
--
www.sadekdrobi.com
ʎdoɹʇuǝ
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:
I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.
Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!
For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.
Cheers
-- Martin
--
www.sadekdrobi.com
ʎdoɹʇuǝ
Wed, 2011-07-20, 10:37
#11
Re: Performance and complexity
I forgot a call :p
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimStart(slice,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 10:06 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:
--
www.sadekdrobi.com
ʎdoɹʇuǝ
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimStart(slice,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 10:06 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:
My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:
I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.
Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!
For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.
Cheers
-- Martin
--
www.sadekdrobi.com
ʎdoɹʇuǝ
--
www.sadekdrobi.com
ʎdoɹʇuǝ
Wed, 2011-07-20, 10:57
#12
Re: Performance and complexity
I will stop spamming the list, but here is a version with all recursive calls optimized
def trimStart[A](slice:Seq[A],list:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A],original:Seq[A]=list):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimSt(slice,l,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,list,list)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 11:20 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:
--
www.sadekdrobi.com
ʎdoɹʇuǝ
def trimStart[A](slice:Seq[A],list:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A],original:Seq[A]=list):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimSt(slice,l,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,list,list)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 11:20 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:
I forgot a call :p
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,l) => trimStart(slice,l) case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original)}
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
On Wed, Jul 20, 2011 at 10:06 AM, Sadache Aldrobi <sadek.drobi@gmail.com> wrote:
My 2 cents.
def trimStart[A](slice:Seq[A],original:Seq[A]):Seq[A] = { @scala.annotation.tailrec def trimSt(sliceLeft:Seq[A],l:Seq[A]):Seq[A] = (sliceLeft,l) match { case (Nil,_) => l case (_,Nil) => original case (h1::t1,h2::t2) => if(h1 == h2) trimSt(t1,t2) else original } trimSt(slice,original) }
def removeOccurence[A](slice:Seq[A],original:Seq[A])= { @scala.annotation.tailrec def remove(leftOriginal:Seq[A],result:Seq[A]):Seq[A]= trimStart(slice,leftOriginal) match{ case(h::tail) => remove(tail,h +: result) case(Nil) => result.reverse } remove(original,Nil) }
}
worth noting that you can use a buffer for the result in the second method instead of a Seq (Or even an immutable structure that allows adding elements to the end) so that you don't require the last result.reverse. Also I am sure it can be written with fewer lines, I just happen to be late for work :)
Sadek
On Wed, Jul 20, 2011 at 9:12 AM, martin odersky <martin.odersky@epfl.ch> wrote:
I should also say, there's really nothing wrong in just coding the imperative algorithm in Scala. It's not a sacrilege to use imperative code, in particular if it's inside a function. Sometimes it's the clearest way to express things. Many other times it is not.
Generally, it is much more important for a system that the interfaces are functional than that the implementations are. Or, put otherwise, a local mutable variable is no big deal, but think twice before introducing a global one!
For instance, if you look inside the scala.collection.immutable.List class you find that most methods use a local mutable ListBuffer and a while loop to fill it.
It's done this way to make these list operations not blow the stack for long lists.
So we have an imperative implementation that ensures good functional behavior on the outside. Compare with OCaml, where lists are purely functional but blow the stack for lists longer than some 1000's of elements.
Cheers
-- Martin
--
www.sadekdrobi.com
ʎdoɹʇuǝ
--
www.sadekdrobi.com
ʎdoɹʇuǝ
--
www.sadekdrobi.com
ʎdoɹʇuǝ
Wed, 2011-07-20, 21:37
#13
Re: Performance and complexity
Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:
def elim(xs: List[Int]): List[Int] = xs match {
case 26 :: 20 :: 64 :: rest => elim(rest)
case x :: xs1 => x :: elim(xs1)
case Nil => Nil
}
Cheers
-- Martin
def elim(xs: List[Int]): List[Int] = xs match {
case 26 :: 20 :: 64 :: rest => elim(rest)
case x :: xs1 => x :: elim(xs1)
case Nil => Nil
}
Cheers
-- Martin
Wed, 2011-07-20, 22:27
#14
Re: Performance and complexity
Hi Martin,
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
-- Cédric
2011/7/20 martin odersky <martin.odersky@epfl.ch>
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
-- Cédric
2011/7/20 martin odersky <martin.odersky@epfl.ch>
Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:
def elim(xs: List[Int]): List[Int] = xs match {
case 26 :: 20 :: 64 :: rest => elim(rest)
case x :: xs1 => x :: elim(xs1)
case Nil => Nil
}
Cheers
-- Martin
Thu, 2011-07-21, 04:17
#15
Re: Performance and complexity
Hi,
What about this version?
def elim(xs: List[Int], slice: List[Int]): Vector[Int] = { @tailrec def elimAcc(xs: List[Int], acc: Vector[Int]): Vector[Int] = xs match { case xs if(xs.take(3) == slice) => elimAcc(xs drop 3, acc) case x :: xs1 => elimAcc(xs1, acc :+ x) case Nil => acc } elimAcc(xs, Vector.empty[Int]) }
Regards,Germán
2011/7/20 Cédric Beust ♔ <cedric@beust.com>
What about this version?
def elim(xs: List[Int], slice: List[Int]): Vector[Int] = { @tailrec def elimAcc(xs: List[Int], acc: Vector[Int]): Vector[Int] = xs match { case xs if(xs.take(3) == slice) => elimAcc(xs drop 3, acc) case x :: xs1 => elimAcc(xs1, acc :+ x) case Nil => acc } elimAcc(xs, Vector.empty[Int]) }
Regards,Germán
2011/7/20 Cédric Beust ♔ <cedric@beust.com>
Hi Martin,
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
-- Cédric
2011/7/20 martin odersky <martin.odersky@epfl.ch>Like I said, if performance matters and the array is big, I would use the imperative solution. If simplicity and elegance matters, I think the most straightforward solution is to use pattern matching over lists:
def elim(xs: List[Int]): List[Int] = xs match {
case 26 :: 20 :: 64 :: rest => elim(rest)
case x :: xs1 => x :: elim(xs1)
case Nil => Nil
}
Cheers
-- Martin
Thu, 2011-07-21, 05:37
#16
Re: Performance and complexity
On 21/07/11 06:35, martin odersky wrote:
> Like I said, if performance matters and the array is big, I would use
> the imperative solution. If simplicity and elegance matters, I think
> the most straightforward solution is to use pattern matching over lists:
>
> def elim(xs: List[Int]): List[Int] = xs match {
> case 26 :: 20 :: 64 :: rest => elim(rest)
> case x :: xs1 => x :: elim(xs1)
> case Nil => Nil
> }
>
> Cheers
>
Thu, 2011-07-21, 06:17
#17
Re: Performance and complexity
Do you suggest working with Arrays? Or is there a better alternative?
At the defense of List, I have found prepending to a List, followed by reverse, to be faster than appending to most others, especially other immutable collections.
Derek Williams
On Jul 20, 2011 10:30 PM, "Tony Morris" <tonymorris@gmail.com> wrote:> On 21/07/11 06:35, martin odersky wrote:
>> Like I said, if performance matters and the array is big, I would use
>> the imperative solution. If simplicity and elegance matters, I think
>> the most straightforward solution is to use pattern matching over lists:
>>
>> def elim(xs: List[Int]): List[Int] = xs match {
>> case 26 :: 20 :: 64 :: rest => elim(rest)
>> case x :: xs1 => x :: elim(xs1)
>> case Nil => Nil
>> }
>>
>> Cheers
>>
>> -- Martin
>>
>
> I mostly agree. If performance matters, and you are doing prefix
> matching on a cons list, you might want to consider a more appropriate
> data structure. This inappropriate use of data structures is completely
> aside from the functional/imperative false dilemma.
>
> http://paste.pocoo.org/show/443309/
>
> --
> Tony Morris
> http://tmorris.net/
>
>
Thu, 2011-07-21, 06:27
#18
Re: Performance and complexity
On 21/07/11 15:14, Derek Williams wrote:
>
> Do you suggest working with Arrays? Or is there a better alternative?
>
This depends ultimately on all the operations required, including how
the "List" came to be in the first place. Certainly, prefix matching on
a cons list (or even a mutable list) for anything but a trivial data set
is not particularly noble and then measuring the performance is... a bit
ridonkulous.
A prefix tree (trie) may be more appropriate (of which there are many
variations), or perhaps even a Burkhard-Keller Tree (a bit of a stretch
there -- need to know all operations).
> At the defense of List, I have found prepending to a List, followed by
> reverse, to be faster than appending to most others, especially other
> immutable collections.
>
Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
other operations). It took some gymnastics to get there though.
Thu, 2011-07-21, 06:37
#19
Re: Performance and complexity
On Jul 20, 2011 11:19 PM, "Tony Morris" <tonymorris@gmail.com> wrote:
> Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
> other operations). It took some gymnastics to get there though.
I'll definitely have to check that out, thanks Tony.
>
> --
> Tony Morris
> http://tmorris.net/
>
>
Thu, 2011-07-21, 06:47
#20
Re: Performance and complexity
On 21/07/11 15:25, Derek Williams wrote:
It is similar to DList[1], which would unfortunately blow the stack for any large input, so is instead modelled with a pair of lists[2]. Mark Hibberd wrote most of it.
[1]
case class Endo[A](e: A => A)
case class DList[A](k: Endo[List[A]])
http://hackage.haskell.org/package/dlist
[2]
case class Endo[A](e: A => A)
case class Dequeue[A](pre: List[Endo[List[A]]], post: List[Endo[List[A]]])
pE9PEw [at] mail [dot] gmail [dot] com" type="cite">https://github.com/scalaz/scalaz/blob/scalaz7/core/src/main/scala/scalaz/Dequeue.scalaOn Jul 20, 2011 11:19 PM, "Tony Morris" <tonymorris [at] gmail [dot] com" rel="nofollow">tonymorris@gmail.com> wrote:
> Scalaz 7 will include a list with O(1) cons and snoc (at the expense of
> other operations). It took some gymnastics to get there though.I'll definitely have to check that out, thanks Tony.
>
> --
> Tony Morris
> http://tmorris.net/
>
>
It is similar to DList[1], which would unfortunately blow the stack for any large input, so is instead modelled with a pair of lists[2]. Mark Hibberd wrote most of it.
[1]
case class Endo[A](e: A => A)
case class DList[A](k: Endo[List[A]])
http://hackage.haskell.org/package/dlist
[2]
case class Endo[A](e: A => A)
case class Dequeue[A](pre: List[Endo[List[A]]], post: List[Endo[List[A]]])
-- Tony Morris http://tmorris.net/
Fri, 2011-07-22, 12:07
#21
Re: Performance and complexity
2011/7/20 Cédric Beust ♔ <cedric@beust.com>
Hi Martin,Not at all. It's not tail recursive, and I would not use on it on very long lists. The first problem (generalize to arbitrary subsequences) is solved by straightforward generalization. The following works for any slice:
Thanks, that's the most intuitive solution so far, although I see two slight problems with it: 1) it will only work if the slice is known at compile time and 2) it's not tail recursive, so it will most likely blow up the stack on big lists.
Unless I'm missing something?
def elim(xs: List[Int], slice: Seq[Int]): List[Int] = xs match {
case _ if xs startsWith slice => xs drop slice.length
case x :: xs1 => x :: elim(xs1, slice)
case Nil => Nil
}
To come back to the original problem (long array, performance matters), I would simply fall back on an imperative solution, which is actually also pretty nice:
def elim(xs: Array[Byte], slice: Seq[Byte]): Array[Byte] = {
val buf = new ArrayBuffer[Byte]
var i = 0
while (i < xs.length)
if (xs startsWith (slice, i)) i += slice.length
else { buf += xs(i); i += 1 }
buf.toArray
}
Cheers
-- Martin
I'm working on a better scala version, but part of the pain is avoiding the scala.Stream class right now. I should have something shortly.
- Josh
2011/7/19 Cédric Beust ♔ <cedric@beust.com>