map() implementations time complexity

26 replies
ounos

Hi,

Can one summarize how to reason about (preferably without looking at the
source :) ) the time complexity of map() implementations?

Is it right for me to assume that when the target object is an
Itera[ble|tor], or a projection, map will be O(1), but for all other
containers it will do a full traversal? Is this the best criterion? Is
there another way to discriminate between O(1) and O(n) map functions? I
certainly want to avoid O(n), and I want to build trust in the map()s
I'm invoking without always peeking at the source :)

Thanks,
Dimitris Andreou

Viktor Klang
Re: map() implementations time complexity
Could you elaborate on where you'd expect map() to be non-O(1)?

--
Viktor Klang
Senior Systems Analyst
Tony Morris
Re: map() implementations time complexity


There are many map implementations for which taking such a measurement
doesn't make any sense:

scala.Option
scala.Either.LeftProjection
scala.Either.RightProjection
scala.Function1 (aka compose)
scala.Responder
parsers

There are also many others that I write myself all the time.
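
For instance (a quick sketch, assuming the 2.7-era Option and Either API;
these maps transform a single value or computation, so there is no n to
measure):

val a: Option[Int] = Some(21).map(_ * 2)                   // Some(42)
val b: Either[Nothing, Int] = Right(21).right.map(_ * 2)   // Right(42), via RightProjection
val f: Int => Int = _ + 1
val g: Int => Int = ((_: Int) * 2) compose f               // "map" on functions is composition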


--
Tony Morris
http://tmorris.net/

S, K and I ought to be enough for anybody.


ounos
Re: map() implementations time complexity

In any case where it does the traversal eagerly (vs. lazily). In
mutable containers, I guess? I don't know; that's why I'm asking.

Can't you think of any O(n) map?

Viktor Klang wrote:
> Could you elaborate on where you'd expect map() to be non-O(1)?

ounos
Re: map() implementations time complexity

Also, can someone clarify what the Projections are for? My guess was
exactly that: they guarantee lazy evaluation semantics (so their map,
filter and friends are all O(1)). Am I misunderstanding something?


ounos
Re: map() implementations time complexity

Think of my question as a partial function, defined only on those map
implementations for which it does make sense :)

(scala.Function1?? Now that doesn't make sense either. How is that
relevant in this context?)

Tony Morris wrote:
>
> There are many map implementations for which taking such a measurement
> doesn't make any sense:
>
> scala.Option
> scala.Either.LeftProjection
> scala.Either.RightProjection
> scala.Function1 (aka compose)
> scala.Responder
> parsers

Tony Morris
Re: map() implementations time complexity


Dimitris Andreou wrote:
> Think of my question as a partial function, defined only on those map
> implementations for which it does make sense :)

Once you are clear what it means for a map implementation to make
sense for your question, I suspect that you will have the answer to
your question. Are you referring to the evaluation semantics of
various Scala collection types?

>
> (scala.Function1?? Now that doesn't make sense either. How is that
> relevant in this context?)

If we suppose that we could curry type constructors, then forall T,
scala.Function1[T] gives rise to a covariant functor, and the map
implementation is already implemented as compose. There is also a
potentially *very useful* flatMap implementation (since it is also a
monad) that would be great to have implemented, alongside a map as
an alias for compose :)

Consider this example of a hypothetical wrapper around
scala.Function1[String]:

case class StringFunctor[+A](f: String => A) {
  def map[B](g: A => B) = StringFunctor[B](g compose f)
}
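
Extending that wrapper with the flatMap would look something like this (a
hedged sketch of the function, or "Reader", monad; the name and wiring are
my assumption, not library code):

case class StringReader[+A](f: String => A) {
  def map[B](g: A => B) = StringReader[B](g compose f)
  // flatMap threads the same String input through both stages;
  // this is what makes the wrapper a monad.
  def flatMap[B](g: A => StringReader[B]) = StringReader[B](s => g(f(s)).f(s))
}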

Scalaz has more gory details if you're interested.

--
Tony Morris
http://tmorris.net/

S, K and I ought to be enough for anybody.


DRMacIver
Re: map() implementations time complexity

It sort of depends what you mean by O(1). Things are a little complicated in terms of costs on lazy collections.

Certainly if you do something like

(1 until reallyBigNumber).map(_+1)

this will return almost immediately, so the map clearly isn't calculating everything up front. But it still needs to do the work as you evaluate things. If I do

(1 until reallyBigNumber).map(_+1).foreach(println)

the map is still doing O(n) work, it's just spreading the time that work is done out, doing it as you iterate.

So, to summarize, maps are always O(n), it's just sometimes the work is deferred.
James Iry
Re: map() implementations time complexity
First, I'll assume you mean mapping over collections. Map applies to many non-collection things.

For eager collections expect O(n) average/best/worst. For lazy collections it's more complicated. It's generally going to be O(1) immediate average/best/worst. But there's a deferred worst case of O(n) that you may or may not have to pay, depending on how much of the mapped collection you actually "force". So for quickly analyzing a program, it's best to think of even a map over a finite lazy collection as the worst case O(n), unless you want to take the time to do the analysis that proves it's not for your program. Map over an infinite lazy collection is a bit more complicated. You can't just assume O(n), because that would mean your program never terminates. So you have to somehow give an upper bound on the number of elements you'll ever actually force. Laziness is definitely a double-edged sword.
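
For illustration (a rough sketch against the 2.7-era collections, where
projection yields the lazy view; note that a Stream still evaluates its
head eagerly):

val eager = List(1, 2, 3).map { x => println("eager " + x); x + 1 }
// prints all three lines now: the O(n) cost is paid up front

val deferred = List(1, 2, 3).projection.map { x => println("lazy " + x); x + 1 }
// a Stream: only the head is computed here; the rest is paid as it is forced
val forced = deferred.force   // the remaining elements are computed at this point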


ounos
Re: map() implementations time complexity

David MacIver wrote:
> It sort of depends what you mean by O(1). Things are a little
> complicated in terms of costs on lazy collections.
>
> Certainly if you do something like
>
> (1 until reallyBigNumber).map(_+1)
>
> this will return almost immediately, so the map clearly isn't
> calculating everything up front.
For this implementation, no. But how sure are you about other
implementations? I'm trying to find (and really hoped someone on the
list would give me something like this) an easy rule that captures what
I seek. For example, is the following assertion right: "All maps are
O(n) unless the receiver is a Projection implementation"? If it is true,
my search is over. If not, what are the exceptions? Or what is a better rule?
> But it still needs to do the work as you evaluate things. If I do
>
> (1 until reallyBigNumber).map(_+1).foreach(println)
>
> the map is still doing O(n) work, it's just spreading the time that
> work is done out, doing it as you iterate.
I'm not talking about foreach. That obviously has complexity
proportional to the product of the iteration and the action, which is
irrelevant to the performance of map. Even multiple needless iterations
over big containers are very wasteful. Or worse, think about a slow
iteration, say iterating a hash table: then you want even more badly to
make sure that you traverse only *once*.
>
> So, to summarize, maps are always O(n), it's just sometimes the work
> is deferred.
As said above, this is wrong, unless by "map" you mean "map and
foreach". But let's keep the word "map" to mean just map :)

ounos
Re: map() implementations time complexity

Thanks James!

This is very clear. I really needed such a confirmation.


ounos
Re: map() implementations time complexity

Tony Morris wrote:
>
> Once you are clear what it means for a map implementation to make
> sense for your question, I suspect that you will have the answer to
> your question. Are you referring to the evaluation semantics of
> various Scala collection types?
>
>
Yes. What I'm interested in is developing whatever skill it takes to
read scala/collections code and understand its performance
characteristics. And usually the docs remain silent about time
complexity.

I know for sure that the Projection classes have lazy semantics and only
traverse when they are forced to. On the other hand, List's map (on ::
specifically) traverses its elements eagerly. This is it:

final override def map[B](f: A => B): List[B] = {
  val b = new ListBuffer[B]
  var these = this
  while (!these.isEmpty) {
    b += f(these.head)
    these = these.tail
  }
  b.toList
}

Going by James's response, I will assume that Projections are always
lazy and everything else is always eager, and check the source when in doubt.

Btw, documenting the time complexities would be great too.


DRMacIver
Re: map() implementations time complexity
On Wed, Jan 21, 2009 at 2:38 PM, Dimitris Andreou <jim.andreou@gmail.com> wrote:
>> So, to summarize, maps are always O(n), it's just sometimes the work is deferred.
> As said above, this is wrong, unless by "map" you mean "map and foreach". But let's keep the word "map" to mean just map :)

No. It's not wrong. You're just thinking about the problem the wrong way. "The amount of time it takes to return when I first call the function" is the wrong way to reason about the time complexity of non-strict operations.

Suppose I did

val x = (1 to 10).map{x => {Thread.sleep(1000); x}}
val y = (1 to 10)

Both return essentially immediately.

Now suppose I do

x.foreach(println);
y.foreach(println);

The first takes rather longer than the second, because it's executing the map as well as the foreach, even though the two collections are "the same". The cost of the map is not just what happens when you first invoke it but the amount of additional work it adds to the overall execution of your program. And in all cases that is O(n) (well, in some cases it might be worse than that).
ounos
Re: map() implementations time complexity

David MacIver wrote:
> No. It's not wrong. You're just thinking about the problem the wrong
> way. "The amount of time it takes to return when I first call the
> function" is the wrong way to reason about the time complexity of
> non-strict operations.
Yes, I know that you can put an astronomical amount of computation and
wrap it in an O(1) lazy wrapper, thank you. But that was the very point:
what is the price of stacking such layers consecutively? I don't know
whether that's the wrong or the right way to reason about time
complexity, but I do like to avoid unnecessary traversals and stay on
the fast side of computation.

I already analyzed the time complexity of foreach in a previous mail,
and that explains what you say about the code below too.
>
> Suppose I did
>
> val x = (1 to 10).map{x => {Thread.sleep(1000); x}}
> val y = (1 to 10)
>
> Both return essentially immediately
>
> Now suppose I do
>
> x.foreach(println);
> y.foreach(println);
>
> The first takes rather longer than the second, because it's executing
> the map as well as the foreach, even though the two collections are
> "the same". The cost of the map is not just what happens when you
> first invoke it but the amount of additional work it adds to the
> overall execution of your program. And in all cases that is O(n)
> (well, in some cases it might be worse than that).

James Iry
Re: map() implementations time complexity
If you take it personally every time somebody answers one of your emails then you may quickly find you don't get answers.  If somebody tells you something you already know then it's not an insult, it's a fact of life regarding the limitations of what people can know about you.

And besides, map is O(n) worst case on lazy operations.  O(1) is a best case.  David has pretty clearly explained why.  What's the problem?

On 1/21/09, Dimitris Andreou <jim.andreou@gmail.com> wrote:
> Yes, I know that you can put an astronomical amount of computation and wrap it in an O(1) lazy wrapper, thank you.

ounos
Re: map() implementations time complexity

I'm sorry. I felt like going into defensive mode, since it seemed that
he was rendering my whole question stupid, and me personally, and
probably my whole family too. :-) I'll be more polite next time, promise.

I just disagree (and continue to) with the notion that all map()
functions are "eventually O(n)"; I just don't like this kind of
imprecise language. map() can be O(1) or O(n). foreach() is O(n). So if
I do a foreach(), I'll get O(n) anyhow, whether I did a lazy map, an
eager map, or no map at all. Isn't it more clear-cut this way? And apart
from the time complexities, I merely wanted to be able to know how many
iterations a piece of code performs. (Iteration can be slow in itself.)

On a slightly related side note: once in our lab we wanted to compare,
for research, the performance of some main-memory kind-of databases (we
were basically interested in their data structures). And they insisted
on measuring how much time each database spent to return *an iterator*
over the results, without actually consuming the iterators. Oh boy, you
know I screamed :) (They had their way in the end, though I also pointed
out the obvious: with this kind of measurement, anyone could return an
iterator really super fast (although a bit ...lazy :)) and unfairly come
out "best" on the benchmark.)

Dimitris


Ricky Clarkson
Re: map() implementations time complexity
I think you'll find that all operations iterate once at the most.  If they didn't, then David MacIver would have killed them already.


Sean McDirmid
Re: map() implementations time complexity
For operations like slice, take and drop, laziness is much more beneficial on random-access collections. These operations really are O(1) always (not eventually O(n)), whereas the default eager alternatives are pointlessly slow.
Anyway, slicing is very necessary in the IDE (take a text buffer, progressively slice it down), and life would suck a lot if all we had was eager.
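
For example (a sketch assuming the 2.7 RandomAccessSeq.Projection
behaviour described above):

val big = Array.range(0, 1000000)
// On the projection, drop and take just do index arithmetic: O(1), no copying.
val slice = big.projection.drop(10).take(100)
// The eager equivalents copy elements into a new sequence: O(n) work,
// even if the result is never read.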

mtopol
Re: map() implementations time complexity

But isn't the immediate answer to your original question quite obvious? Eager
collections' map is O(n); non-strict collections' map is O(1). So your real
question was how to easily distinguish eager from non-strict collections.
That would have avoided all this confusion right at the start. The way I see
it, some collection traits give you specific guarantees, others don't give
you any. RandomAccessSeq, for example, guarantees O(1) random access;
Projection guarantees non-strictness. I don't see any traits guaranteeing
eagerness.
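
For instance (a sketch of leaning on the trait-level guarantee rather
than on the concrete implementation):

val xs = List(1, 2, 3, 4, 5)
val view = xs.projection       // guaranteed non-strict (in 2.7, a Stream)
val mapped = view.map(_ * 2)   // traversal deferred, whereas xs.map(_ * 2) pays O(n) now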

>

Viktor Klang
Re: map() implementations time complexity


On Thu, Jan 22, 2009 at 10:05 AM, mtopol <mt4web@gmail.com> wrote:

> But isn't the immediate answer to your original question quite obvious? Eager
> collections' map is O(n); non-strict collections' map is O(1). [...]

Actually no.
It's dead simple to make map something like the following sketch
(adjusted here to the 2.7 Iterable API, which defines elements rather
than iterator):

def map[T, R](it: Iterable[T], f: T => R): Iterable[R] = new Iterable[R] {
  def elements = new Iterator[R] {
    val i = it.elements          // pull from the underlying collection lazily
    def hasNext = i.hasNext
    def next = f(i.next)         // f is applied only as elements are consumed
  }
}

This is of course not the full solution, but it implements an O(1) map.
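
Hypothetical usage of that sketch; nothing is computed until the result
is iterated:

val lazily = map(List(1, 2, 3), (x: Int) => x * 2)
lazily.elements.foreach(println)   // f is applied here, one element at a time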

What could be done is to have it accept a structural type containing the signatures of map, filter, etc., and then just create a wrapper type with the same signature and return it, providing a generic O(1) solution.

I might be way out of my league, so I'm just brainfarting here.

Cheers,
Viktor





--
Viktor Klang
Senior Systems Analyst
mtopol
Re: map() implementations time complexity

What you wrote is pretty much exactly what a Projection does. The point is
not about what's possible, but about what's GUARANTEED without looking at
the source code. This is why there is a projection method on Iterable: its
result is guaranteed to be a non-strict view of the Iterable upon which it
was called.


Johannes Rudolph
Re: map() implementations time complexity

See the differences between the List and the Array implementations of projection:

First for List:

Welcome to Scala version 2.7.3.final (OpenJDK Server VM, Java 1.6.0_0).
Type in expressions to have them evaluated.
Type :help for more information.

scala> List(1,2,3,4,5)
res0: List[Int] = List(1, 2, 3, 4, 5)

scala> res0.map{f => System.out.println(f);f}
1
2
3
4
5
res1: List[Int] = List(1, 2, 3, 4, 5)

scala> res0.projection.map{f => System.out.println(f);f}
1
res2: Stream[Int] = Stream(1, ?)

scala> res2.force
2
3
4
5
res3: List[Int] = List(1, 2, 3, 4, 5)

scala> res2.force
res4: List[Int] = List(1, 2, 3, 4, 5)

scala> res2.foldLeft(0)(_+_)
res6: Int = 15

And here for array:

scala> Array(1,2,3,4,5)
res12: Array[Int] = Array(1, 2, 3, 4, 5)

scala> res12.projection.map{f => System.out.println(f);f}
1
2
3
4
5
res13: RandomAccessSeq.Projection[Int] = ArrayPM(1, 2, 3, 4, 5)

scala> res13.foldLeft(0)(_+_)
1
2
3
4
5
res14: Int = 15

scala> res13.foldLeft(0)(_+_)
1
2
3
4
5
res15: Int = 15

List implements projection via toStream, as indicated by the type of
List.projection, which returns a Stream. Streams have the (sometimes
questioned) behaviour of evaluating each element only once and
remembering it afterwards.

In the presence of such behaviour it is IMO worthwhile to document
when side-effects will be executed and how often. And yes, it probably
makes little sense to put side-effects into a function which you pass
to map (and map's documentation could warn you that the actual
execution semantics may be undefined). But even an apparently
side-effect-free function might contain implicit side-effects, just by
accessing some data or by calculating intermediate values, which
consumes time. So if you want to reason about the runtime of your
code, it is necessary to know what to expect from these functions.

Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

mtopol
Re: map() implementations time complexity

Johannes Rudolph wrote:
>
> And here for array:
>
> scala> Array(1,2,3,4,5)
> res12: Array[Int] = Array(1, 2, 3, 4, 5)
>
> scala> res12.projection.map{f => System.out.println(f);f}
> 1
> 2
> 3
> 4
> 5
> res13: RandomAccessSeq.Projection[Int] = ArrayPM(1, 2, 3, 4, 5)
>

But this means that projection doesn't do a thing for Array; the result is
again strict? That would break the contract of Projection. How come?

Johannes Rudolph wrote:
>
> List implements projection by toStream which is indicated by the type
> of List.projection which returns a Stream. Streams have the (sometimes
> questioned) behaviour of evaluating each element only once and
> remember it afterwards.
>

This was exactly my frustration. While solving some Project Euler problems,
more than once what I needed was a non-caching Iterable. For example, if all
you need from some big collection is to immediately reduce it to a single
value, you definitely don't want all the heap wasted on keeping those
elements around, even risking OutOfMemory. I implemented my own such
Iterable, but it isn't all that great. Is there any support in the Scala
library that would make it easy to implement a non-strict, non-caching
collection? The challenge I encountered is that the client of such an
Iterable must provide a stateful algorithm that can enumerate its elements,
and the state must be reset to the initial value for each fresh Iterator.
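
One way to get that (a minimal sketch against the 2.7 API, using nothing
beyond Iterable's abstract elements; the factory and the names are mine):

// Each call to elements builds a fresh Iterator, so no element is cached
// and the enumeration state is reset for every traversal.
def nonCaching[A](mkIterator: () => Iterator[A]): Iterable[A] =
  new Iterable[A] {
    def elements = mkIterator()
  }

// Usage: reduce a big generated sequence without retaining its elements.
val big = nonCaching(() => Iterator.range(0, 1000000))
val sum = big.foldLeft(0L)(_ + _)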

Johannes Rudolph wrote:
>
> In the presence of such behaviour it is IMO worthwhile to document
> when side-effects will be executed and how often. [...]
>

One obvious side-effect is time and memory usage. If you expect something to
be non-strict, you automatically expect it not to waste any memory or
computing cycles, at least not until the result is requested, and especially
if you have a 1e6-element collection from which all you'll ever read is the
first hundred elements. But, generally, this IS documented in Scaladoc, just
perhaps not to the fullest extent.

Florian Hars
Re: map() implementations time complexity

Johannes Rudolph wrote:
> See the differences between the List and the Array implementations of projection:

You are (not the first one in this thread) falling into the "the
interpreter prints its results" trap:

scala> class Foo { val a = Array(1,2,3,4,5); val p = a.projection.map{f=>println(f);f} }
defined class Foo

scala> val s = new Foo
s: Foo = Foo@fd12614

scala> s.p(0)
1
res0: Int = 1

scala> s.p(3)
4
res1: Int = 4

Now what confuses me is that the scaladoc tells me that
scala.collection.immutable.TreeSet.rangeImpl and related methods in related
classes "create a ranged projection of this collection", when what they
return is clearly no Projection:

scala> import scala.collection.immutable.TreeSet
import scala.collection.immutable.TreeSet

scala> class Foo {val s=TreeSet(1,2,3,4,5); val p = s.rangeImpl(None, Some(3)).map{f=>println(f);f}}
defined class Foo

scala> val v = new Foo
1
2
v: Foo = Foo@74e51bda

And of course "Any mutations in the ranged projection will update this
collection and vice versa." is vacuously true in s.c.*im*mutable._, but
this is probably just inherited from scala.collection.SortedSet.

-Florian

Sean McDirmid
Re: map() implementations time complexity


On Thu, Jan 22, 2009 at 9:51 PM, mtopol <mt4web@gmail.com> wrote:
>
> But this means that projection doesn't do a thing for Array; the result is
> again strict? That would break the contract of Projection. How come?

No, projections are working: the map function is being executed because of the toString method that the interpreter is calling :)
Never trust the interpreter output if you are going to allow your toString methods to be impure.
> This was exactly my frustration. While solving some Project Euler problems,
> more than once what I needed was a non-caching Iterable.

I believe Martin said there would be a new ListProjection to correspond to List's projection, rather than reusing Stream. It was my mistake to map List's projection to Stream; it made sense at the time.
> One obvious side-effect is time and memory usage. [...] But, generally, this
> IS documented in Scaladoc, just perhaps not to the fullest extent.

I think it's not realistic to document everything. One problem: what does length mean on a projection? It will be slower than on a strict version, because it is computed each time...
A doc wiki would be nice; then people could add annotations about methods as they uncover them, and everyone else could view them. All the unrealized IDE potential :(

Johannes Rudolph
Re: map() implementations time complexity

On Thu, Jan 22, 2009 at 2:58 PM, Florian Hars wrote:
> Johannes Rudolph wrote:
>> See the differences between the List and the Array implementations of projection:
>
> You are (not as the first one in this thread) falling into the "the
> interpreter prints its results" trap:
Ah, yes, I didn't comment on the toString problem in the interpreter,
but I wanted to show the differences when calling foldLeft twice.

Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Florian Hars
Re: map() implementations time complexity

Johannes Rudolph wrote:
> Ah, yes, I didn't comment on the toString problem in the interpreter,
> but I wanted to show the differences when calling foldLeft twice.

Oh, I see. That is indeed evil.

- Florian
