Summary
In a series of short interviews, Ruby language creator Yukihiro Matsumoto and Koichi Sasada, implementor of the new Ruby VM, YARV, discuss the directions for the next-generation Ruby runtime. Among the topics discussed is the move to native threads.
Although Ruby has become a very popular language, some developers are still reluctant to use it on enterprise projects due to the allegedly slow performance of the main Ruby interpreter. While alternative Ruby implementations address the performance problem by running Ruby code on the JVM or on the .NET CLR, the core Ruby team has focused its efforts on a new Ruby VM, YARV.
In a series of brief interviews with James Edward Gray, Ruby creator Yukihiro Matsumoto ("Matz") and Koichi Sasada ("Koichi") discuss new features of YARV, expected to be released at the end of this year. Among the topics discussed is threading and, specifically, a move to native threads from the green threads used in the current Ruby interpreter.
First, YARV is [a] simple stack machine which runs pseudo-sequential instructions. [The] old interpreter... traverses the abstract syntax tree (AST) naively. Obviously it's slow. YARV compile[s] that AST to YARV bytecode and run[s] it...
Secondly, YARV uses native thread[s] (that's supported by OS) to implement Ruby threads. It means that you can run blocking task[s] in extension libraries... Supporting native threads does not mean that you can run Ruby scripts in parallel on [a] parallel machine such as [one with a] Multi-Core CPU... [because the] current implementation uses [a] Giant VM Lock to avoid synchronization problems...
YARV doesn't change parser/syntax/specs, GC (memory/object management), and extension libraries like String/Array/Hash/Regexp/etc. Therefore your script doesn't run fast[er] on YARV if [the] bottleneck is string processing.
YARV is already publicly available via our Subversion repository. You can fetch and play with it now. But the first public "release" from us will be Christmas 2007...
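To make the "compile the AST to bytecode, then run it on a stack machine" idea concrete, here is a minimal toy sketch in Ruby. It is not YARV's code: the instruction names (`:push`, `:add`, `:mul`), the nested-array AST format, and the `compile` and `run` helpers are all invented for illustration.

```ruby
# Toy illustration only. The instruction names and AST format are invented
# here and do not correspond to YARV's real instruction set.

# Compile a tiny AST (integers and [operator, left, right] arrays)
# into a flat, sequential list of instructions.
def compile(node)
  case node
  when Integer
    [[:push, node]]
  when Array
    op, left, right = node
    compile(left) + compile(right) + [[op]]
  else
    raise ArgumentError, "unknown node: #{node.inspect}"
  end
end

# Execute the instruction list using a simple operand stack.
def run(instructions)
  stack = []
  instructions.each do |name, operand|
    case name
    when :push
      stack.push(operand)
    when :add, :mul
      right = stack.pop
      left = stack.pop
      stack.push(name == :add ? left + right : left * right)
    else
      raise ArgumentError, "unknown instruction: #{name}"
    end
  end
  stack.pop
end

ast = [:add, 1, [:mul, 2, 3]] # represents 1 + 2 * 3
bytecode = compile(ast)
puts bytecode.inspect
puts run(bytecode)
```

The tree is walked once at compile time. After that, execution is a plain loop over a flat list, which is the general reason a bytecode VM tends to beat repeated AST traversal.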
On the move to native threading, Koichi and Matz note that:
[The] old threading model [uses] green threads, to provide universal threading on every platform that Ruby runs. I think it was [a] reasonable decision 14 years ago, when I started developing Ruby. Time goes by, [the] situation has changed. pthread or similar threading libraries are now available on almost every platform...
Green threads [do] not work well with libraries using native threads. For example, Ruby/Tk has made [a] huge effort to live along with pthread...
Koichi decided to use native thread[s] for YARV. I honor his decision. [The] only regret I have is we couldn't have continuation support that used our green thread internal structure...
It doesn't mean that every Ruby thread runs in parallel. YARV has [a] global VM lock (global interpreter lock) which only one running Ruby thread has. This decision ... makes us happy because we can run most of the extensions written in C without any modifications... [But] even with [the] native thread approach, no real concurrency can be [achieved] due to the global interpreter lock. Koichi is going to address this issue by [a] Multi-VM approach in the (near) future...
Parallel computing with Ruby is one of my main concern[s]. There are some way[s] to do it, but running Ruby threads in parallel (without Giant VM Lock) on a process, [it] is too difficult to support current C extension libraries because of their synchronization problems... If we have multiple VM instance[s] on a process, these VMs can be run in parallel...
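The practical effect of the lock described above can be sketched with a small timing experiment. This is only an illustration, assuming an interpreter where a blocking call such as `sleep` lets other Ruby threads proceed; `time_blocking_threads` is a hypothetical helper name, and exact timings will vary by machine.

```ruby
require "benchmark"

# Hypothetical helper: start `count` threads that each block for `seconds`,
# wait for all of them, and return the elapsed wall-clock time.
def time_blocking_threads(count, seconds)
  Benchmark.realtime do
    threads = Array.new(count) { Thread.new { sleep(seconds) } }
    threads.each(&:join)
  end
end

elapsed = time_blocking_threads(4, 0.2)
puts format("4 threads x 0.2 s of blocking took %.2f s", elapsed)
```

If the blocking waits overlap, the total should be close to 0.2 s rather than 0.8 s. The same experiment with CPU-bound Ruby code in each thread would not be expected to speed up under a global VM lock, which is the limitation the quotes above describe.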
What do you think of the decision to move to native threads in the next-generation Ruby interpreter?
With the new approach, things are not likely to get significantly better (the C++ benchmark uses pthreads; Ruby was removed to clearly show the Oz, Erlang / C++ (pthreads) / C (coroutines) relation).
> And not every language is going to, nor should it be,
> great at everything.
I completely agree. But being "not great" is different from being not acceptable. IMO, the Ruby concurrency benchmarks are unacceptable. Especially in the light of the dawning concurrency paradigm shift we are currently facing. That will make them even more unacceptable.
> Erlang kicks ass at concurrency. Ruby clearly does not.
> Does it NEED to? I don't know; maybe. But let them
> solve the bigger problems first.
The big problem is how you launch your threads. Pthreads need a certain stack size up front. Erlang clearly has a better, leaner model for spawning lightweight threads, which makes it scale extremely well in concurrent scenarios. Nowadays, I don't think any language can ignore concurrency performance and be around for any significant amount of time.
The concurrency benchmark is nonsense. There is nothing at all concurrent about it. Run 2 independent tasks in threads in Java or C++ and you will get a 2x speed up. With Ruby you get nothing. That is what concurrency is about.
In Ruby 1.8, they just need to start an OS thread (optionally) for every green thread, call the external libraries from those threads, and provide a Python-like GIL interface.
> The concurrency benchmark is nonsense. There is nothing at
> all concurrent about it. Run 2 independent tasks in
> threads in Java or C++ and you will get a 2x speed up.
> With Ruby you get nothing. That is what concurrency is
> about.