The Ruby VM: Episode III
For details about this ongoing interview, please see my introductory post.
Let's talk a little about threading, since that's a significant change in the new VM. First, can you please explain the old threading model used in Ruby 1.8 and also the new threading model now used in Ruby 1.9?
- Matz:
-
The old threading model is green threads, chosen to provide universal
threading on every platform that Ruby runs on. I think it was a
reasonable decision 14 years ago, when I started developing Ruby. Time
has gone by and the situation has changed. pthread or similar threading
libraries are now available on almost every platform. Even on old
platforms, the pth library (a thread library which implements the
pthread API using setjmp etc.) can provide a green thread implementation.
Koichi decided to use native threads for YARV. I honor his decision.
The only regret I have is that we couldn't keep continuation support,
which used our green thread internal structure. Koichi once told me it's
not impossible to implement continuations on YARV (with some
restrictions), so I expect to have them again in the future, although it
certainly has lower priority in the 1.9 implementation.
-
- ko1:
-
Matz explained the old one, so I'll describe YARV's thread model.
As you know, YARV supports native threads. It means that each Ruby
thread runs on its own native thread, concurrently.
It doesn't mean that every Ruby thread runs in parallel. YARV has a
global VM lock (global interpreter lock), which only one running Ruby
thread holds at a time. This decision probably makes us happy, because
we can run most of the extensions written in C without any modifications.
-
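The difference between "concurrently" and "in parallel" can be observed with a small experiment. The following is a minimal sketch, not code from the interview: `timed_threads` is a hypothetical helper, and the behavior described assumes an interpreter with a global lock, such as Ruby 1.9 and later versions of the C implementation. Threads that block (here, in `sleep`) release the lock and overlap, while CPU-bound threads are expected to take roughly as long as running the same work serially.

```ruby
require "benchmark"

# Run the same block on `count` Ruby threads and return the elapsed
# wall-clock time in seconds. `timed_threads` is a hypothetical helper.
def timed_threads(count, &work)
  Benchmark.realtime do
    Array.new(count) { Thread.new(&work) }.each(&:join)
  end
end

# Blocking work: a sleeping thread gives up the lock, so four sleeps overlap.
blocking_elapsed = timed_threads(4) { sleep 0.2 }

# CPU-bound work: only the lock holder executes Ruby code at any moment.
cpu_work = -> { 300_000.times { |i| i * i } }
serial_elapsed = Benchmark.realtime { 4.times { cpu_work.call } }
threaded_elapsed = timed_threads(4, &cpu_work)

puts format("4 sleeping threads: %.2fs", blocking_elapsed)
puts format("CPU work, serial:   %.2fs", serial_elapsed)
puts format("CPU work, threaded: %.2fs", threaded_elapsed)
```

The exact timings depend on the machine and the interpreter. On an implementation without a global lock (JRuby, for example) the threaded CPU-bound run may be noticeably faster than the serial one.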
Why was this change made? What's wrong with green threads?
- Matz:
-
Because green threads do not work well with libraries that use native
threads. For example, Ruby/Tk has made a huge effort to live alongside
pthread.
-
- ko1:
-
Ruby's green (userlevel) thread implementation was too naive to run
fast. Machine stacks are copied whenever the thread context switches.
And a more important point is that it's not easy to re-implement green
threads on YARV :)
-
What are the downsides to the native threads approach?
- Matz:
-
It is pretty difficult to implement continuations. Besides that, even
with the native thread approach, no real concurrency can be achieved due
to the global interpreter lock. Koichi is going to address this issue
with a Multi-VM approach in the (near) future.
-
- ko1:
-
Yes, it has several problems. The first is a performance problem (as
you know, I love to discuss performance). Creating a native thread is
too expensive, so you may want to use a thread pool or the like. Also,
the current trunk (YARV) is not tuned for native threads, so I believe
there are some unknown problems around threads.
The second problem is portability. Even if your environment has a
pthread library, it may differ in detail from other pthread systems.
The third problem is the absence of callcc (which was implemented with
the green thread scheme) ... for some people :)
Programming with native threads has its own difficulties. For example,
on Mac OS X, exec() doesn't work (it causes an exception) if other
threads are running (one of the portability problems). If we find
critical problems with native threads, I will make a green thread
version on trunk (YARV).
-
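The thread pool suggestion above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from YARV or from the interview: `TinyPool`, `submit`, and `shutdown` are hypothetical names, and the design (a fixed set of worker threads pulling jobs from a `Queue`) is just one common way to avoid creating a native thread per task.

```ruby
require "thread" # Queue lives here on older Rubies; harmless on newer ones

# A fixed-size pool: threads are created once and reused for every job.
# `TinyPool` is a hypothetical name used only for this sketch.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Block until a job arrives; the :stop marker ends the worker.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  # Queue a block to be run by whichever worker is free next.
  def submit(&job)
    @jobs << job
  end

  # Send one :stop marker per worker, then wait for all of them to finish.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

results = Queue.new            # thread-safe collection for the outputs
pool = TinyPool.new(3)
10.times { |i| pool.submit { results << i * i } }
pool.shutdown

squares = Array.new(results.size) { results.pop }.sort
```

Ten jobs are handled by only three threads, so the cost of thread creation is paid three times rather than ten. Jobs may finish in any order, which is why the results are sorted before use.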
Are there plans to support other threading models in the future?
- Matz:
-
Other threading models, no. Win32 threads and pthreads are enough of a
burden for us to support. There might be other features to support
parallelism in the future, for example light-weight processes a la
Erlang.
Koichi may have other idea(s) about supporting concurrency, such as
Multi-VM, since he is the expert on it.
-
- ko1:
-
Parallel computing with Ruby is one of my main concerns. There are
several ways to do it, but running Ruby threads in parallel (without the
Giant VM Lock) in a process is too difficult while supporting current C
extension libraries, because of their synchronization problems.
As matz says, if we have multiple VM instances in a process, these VMs
can be run in parallel. I'll work on that theme in the near future (as
my research topic).
BTW, as I wrote in the last question, if there are many, many problems
with native threads, I'll implement green threads. As you know, they
have some benefits over native threads (lightweight thread creation,
etc). It will be a lovely hack (FYI, my graduation thesis was to
implement a userlevel thread library on our specific SMT CPU).
... Is anyone interested in implementing it?
-
I would like to see another group spin off immediately to address High Performance Ruby. YARV's 3-fold gain is an incredible feat of solitary hacking, but the price is too high. Ruby is at risk of becoming irrelevant as 2, 4, and soon 8 core solutions become commonplace. YARV's value delivered over time is being rapidly outpaced by the change in semiconductor computational scaling. Each core doubling halves any speedup gain YARV offers. Many language competitors (fortress, erlang, haskell, scala, ...) all offer solutions to this problem now.
Industry is not very forgiving, and rarely gives out second chances. Ruby has a chance to become a mainstream solution thanks to Rails showing a wider audience how wonderfully productive the language is. Either Ruby moves forward to match the competition, or it becomes a really cool has-been.
Does anyone else see this as the defining moment of Ruby's future?
It was so sad to read this article. So sad to see the official ruby development purposefully repeating the same mistakes made by python.
It's not too late to switch back. Green threads can be made to work in an efficient manner; one needs to look no further than erlang and scala for examples.
I too predict many people will leave ruby once YARV comes out and these limitations become set in stone.
jherber,
I really disagree with your statement that "Ruby's future is at risk of becoming irrelevant as [multi] core solutions become common place". I deal with both Erlang and Ruby regularly and really it comes down to the best tool for the job. For everyday hacking uses, Ruby as it stands (with the ability to mixin any C code) works exceptionally well.
For the extenuating circumstances that require concurrency, simply change Ruby interpreters to something like JRuby (though I've never used java.util.concurrent personally) or build a system that plays on the benefits of both Erlang and Ruby, using inter-process communication to hand off tasks in an RPC-esque model.
I think Matz and ko1 are heading in the right direction and the "concurrent Ruby" issue ignores the more important pragmatic approach to solving most problems.
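The idea of handing tasks off to another process, mentioned in the comment above, can be sketched in plain Ruby. This is only an illustration under stated assumptions: it swaps the suggested Erlang peer for forked Ruby child processes, `spawn_task` and `collect` are hypothetical helper names, and `fork` is only available on Unix-like platforms. Because each child is a separate process with its own interpreter, the global lock of one does not block the others.

```ruby
# Start a child process that runs the block and writes the marshalled
# result to a pipe. Returns the child's pid and the read end of the pipe.
# `spawn_task` and `collect` are hypothetical names for this sketch.
def spawn_task
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # leave the child at once, skipping at_exit handlers
  end
  writer.close # the parent only reads
  [pid, reader]
end

# Read the child's result until end of file, then reap the child.
def collect(pid, reader)
  result = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  result
end

# Fork all children first so they run side by side, then gather results.
tasks = [1..1_000, 1_001..2_000].map do |range|
  spawn_task { range.reduce(:+) }
end
partials = tasks.map { |pid, reader| collect(pid, reader) }
total = partials.reduce(:+)
```

Results must be serializable with `Marshal`, and forking has its own cost, so this approach suits coarse-grained tasks rather than many tiny ones.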