Question about Threads
Mon, 2011-05-02, 17:06
Hi all,
I'm currently investigating the threading models underlying the Scala runtime. I've read about the receive vs. react patterns, continuations, fork/join thread pools, etc. All these mechanisms (except the receive pattern, which is based on one thread per actor) aim to minimize the number of threads running in the JVM. Here is my question: what's the problem with a high number of threads? Is there any recent document/paper/benchmark showing the problems with threads?
Are we talking about synchronization among threads? Concurrent access to the shared heap? Memory consumption (if so, virtual memory issues like a high TLB miss rate, or physical memory issues like fragmentation)?
Thanks in advance for your help.
Michele
Mon, 2011-05-02, 18:57
#2
RE: Question about Threads
http://stackoverflow.com/questions/763579/how-many-threads-can-a-java-vm... - the accepted answer suggests that a JVM will struggle once it hits ~6500 threads.
I should say that the Java "thread pool" approach *is* horizontally scalable, assuming it is done properly. It's just much more difficult to ensure that it is done properly. I now cannot imagine a world without the actor paradigm.
From: oxbow_lakes@hotmail.com
To: scala-internals@googlegroups.com
Subject: RE: [scala-internals] Question about Threads
Date: Mon, 2 May 2011 17:41:37 +0000
One obvious thing: threads simply do not scale. You cannot create thousands, let alone millions, of threads in a JVM.
The Scala actor library allows you to write code from the perspective of a "virtual thread". That is, your actor behaves a lot like a thread, sequentially processing its mailbox in a "while" loop. This conceptualisation (or paradigm) is often better suited to writing clear, expressive code. The actor library (and Akka, of course!) enables that clear, concise code to scale horizontally to tens, hundreds, even hundreds of thousands of individual elements.
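To make the "virtual thread" idea concrete, here is a minimal Java sketch of a mailbox-driven actor that borrows a thread from a small shared pool only while it has messages to process. It is an illustration under stated assumptions, not the scala.actors or Akka implementation: the names `MailboxActorSketch`, `SummingActor`, `send` and `runActors` are hypothetical, and the scheduling scheme (an atomic "scheduled" flag per actor) is just one simple way to get sequential mailbox processing without one thread per actor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical names throughout; this is NOT the scala.actors or Akka API.
final class MailboxActorSketch {

    /** A tiny "actor": private state plus a mailbox, but no thread of its own. */
    static final class SummingActor implements Runnable {
        private final Queue<Integer> mailbox = new ConcurrentLinkedQueue<>();
        private final AtomicBoolean scheduled = new AtomicBoolean(false);
        private final ExecutorService pool;
        private final CountDownLatch processed;
        private long sum = 0; // only touched by whoever holds the 'scheduled' flag

        SummingActor(ExecutorService pool, CountDownLatch processed) {
            this.pool = pool;
            this.processed = processed;
        }

        /** Enqueue a message and make sure the actor is scheduled on the pool. */
        void send(int message) {
            mailbox.add(message);
            if (scheduled.compareAndSet(false, true)) {
                pool.execute(this);
            }
        }

        /** Drain the mailbox one message at a time, like a thread's "while" loop. */
        @Override
        public void run() {
            do {
                Integer message;
                while ((message = mailbox.poll()) != null) {
                    sum += message;
                    processed.countDown();
                }
                scheduled.set(false);
                // A sender may have enqueued just before the flag was cleared: re-check.
            } while (!mailbox.isEmpty() && scheduled.compareAndSet(false, true));
        }

        long sum() {
            return sum;
        }
    }

    /** Runs many actors on a few pool threads and returns each actor's final sum. */
    static long[] runActors(int actorCount, int messagesPerActor, int poolThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(poolThreads);
        CountDownLatch processed = new CountDownLatch(actorCount * messagesPerActor);
        List<SummingActor> actors = new ArrayList<>();
        for (int i = 0; i < actorCount; i++) {
            actors.add(new SummingActor(pool, processed));
        }
        try {
            for (int m = 1; m <= messagesPerActor; m++) {
                for (SummingActor actor : actors) {
                    actor.send(m);
                }
            }
            processed.await(); // every message has been handled once this returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        long[] sums = new long[actorCount];
        for (int i = 0; i < actorCount; i++) {
            sums[i] = actors.get(i).sum();
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] sums = runActors(10_000, 10, 4);
        System.out.println(sums.length + " actors on 4 threads, sum per actor = " + sums[0]);
    }
}
```

In this sketch, 10,000 actors each receive the messages 1 to 10 and share four pool threads; each actor's state is only ever touched by one thread at a time, so the actor code itself needs no locks.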
Compare this to the Java "thread pool"; here, work is conceptually composed of blocks of functionality (either Runnables or Callables). These most likely involve access and modification of state in disparate program elements. Managing the locking required to make this all safe is hard or, as some might argue, impossible.
Chris
Mon, 2011-05-02, 19:27
#3
RE: Question about Threads
There are a few issues with a large number of threads.
First is resource contention / consumption. If you allow hundreds of threads to open files, talk to databases, create sockets and whatnot, you'll rapidly deplete (often expensive) system resources. To give you an example of one of the meanings of "expensive": Oracle Enterprise is often licensed based on the maximum number of concurrent connections. That's a direct cost.
Resource contention rises for no good reason with the number of actual threads. Each contended access (such as writing to a Log4J file) will cause an extra context switch.
Then there are other costs. OS threads are quite expensive to switch, compared to just some variables inside a program.
This is why the basic/classic way to deal with concurrency is to use thread pools and units of work. The Work(er) is somehow assigned to and queued at a thread pool and it contains the status of the work-in-progress, but no (or as little as possible) resources.
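As a rough illustration of the thread-pool-plus-units-of-work approach, the following Java sketch queues many small `Callable` units on a fixed pool and records how many distinct threads actually ran them. The class and method names (`WorkUnitPoolSketch`, `runAll`, `Result`) are hypothetical, and the "work" is a trivial summation chosen only so that each unit carries its own state and holds no external resources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical names; a sketch of "units of work queued at a thread pool".
final class WorkUnitPoolSketch {

    /** Outcome of a run: the combined result and how many threads did the work. */
    static final class Result {
        final long total;
        final int threadsUsed;

        Result(long total, int threadsUsed) {
            this.total = total;
            this.threadsUsed = threadsUsed;
        }
    }

    /** Queues 'units' independent work items on a pool of 'poolThreads' threads. */
    static Result runAll(int units, int poolThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(poolThreads);
        Set<String> threadsSeen = ConcurrentHashMap.newKeySet();
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int n = 1; n <= units; n++) {
                final int upTo = n;
                // The unit of work holds only its own input and progress:
                // no files, sockets or database connections.
                Callable<Long> unit = () -> {
                    threadsSeen.add(Thread.currentThread().getName());
                    long sum = 0;
                    for (int i = 1; i <= upTo; i++) {
                        sum += i;
                    }
                    return sum;
                };
                futures.add(pool.submit(unit));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that unit has finished
            }
            return new Result(total, threadsSeen.size());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Result result = runAll(100, 4);
        System.out.println("total = " + result.total
                + ", threads used = " + result.threadsUsed);
    }
}
```

However many units are queued, the number of OS threads stays bounded by the pool size; the cost is that shared resources must be handed to a unit explicitly rather than owned by a long-lived thread.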
The workers can have many names and there are many patterns, from straight-up shared-state concurrency with fork-join, barriers, semaphores and other contraptions, to message-passing in actors and processes (Erlang, I think) to JMS MDB to servlets/sessions to whatever...
The "dispatchers" are those things in charge of assigning the units of work to the thread pools and they can pull all kinds of tricks like dynamic priorities, queues (simple or priority based) etc.
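One simple form of a priority-based dispatcher can be sketched in Java with a `ThreadPoolExecutor` backed by a `PriorityBlockingQueue`. This is only an assumption about what such a dispatcher might look like, not a description of any particular framework: `PriorityDispatcherSketch`, `PrioritizedTask` and `dispatch` are hypothetical names, and a single worker thread is held back by a latch purely so that the queue ordering becomes visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical names; a sketch of a dispatcher that orders queued work by priority.
final class PriorityDispatcherSketch {

    /** A unit of work that the queue can order: higher priority runs first. */
    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        private final int priority;
        private final String name;
        private final List<String> log;

        PrioritizedTask(int priority, String name, List<String> log) {
            this.priority = priority;
            this.name = name;
            this.log = log;
        }

        @Override
        public void run() {
            log.add(name);
        }

        @Override
        public int compareTo(PrioritizedTask other) {
            return Integer.compare(other.priority, this.priority);
        }
    }

    /** Queues three tasks while the only worker is busy, then returns the run order. */
    static List<String> dispatch() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());
        CountDownLatch gate = new CountDownLatch(1);
        try {
            // The first task is handed straight to the new worker thread (it never
            // enters the queue) and keeps that worker busy until the gate opens.
            pool.execute(() -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // execute(), not submit(): submit() wraps tasks in a FutureTask,
            // which is not Comparable and would fail in the priority queue.
            pool.execute(new PrioritizedTask(1, "low", log));
            pool.execute(new PrioritizedTask(10, "high", log));
            pool.execute(new PrioritizedTask(5, "medium", log));
            gate.countDown();
            pool.shutdown(); // already queued tasks still run
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // no-op after a normal shutdown
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(dispatch()); // prints [high, medium, low]
    }
}
```

Although the tasks are queued in the order low, high, medium, the single worker picks them up by priority once it becomes free.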
Cheers,
Razie