Archive for category ludicrous

Why libjit?

One question I’ve been asked quite a bit over the last two years is why I decided to use libjit for Ludcirous. Evan Phoenix had considered it for Rubinius a long time ago but rejected it partly because of the license. When I started Ludicrous, I decided to use libjit in spite of the GPL license (the license has since been changed to LGPL).

First of all, why is the GPL (v2) license problematic for a JIT compiler? Any JIT compiler using libjit must link with it, and any program run under this JIT compiler therefore ends up dynamically linking with libjit in the end. This puts the user in a precarious grey area in terms of the GPL (particularly since at the time the code had to be modified to run under Ludicrous). As I said, libjit is now under the LGPL (GPL plus an exception would have been sufficient as well), so this is no longer an issue.

Yet I picked libjit in spite of this because I felt it was the best tool for the job at the time. I evaluated a number of other alternatives:

  • GNU Lightning – much of the library is implemented with macros, making it difficult to use programmatically. It would be virtually impossible to wrap this library as a ruby module without some heavy hacking. It also requires the user to do his own register allocation, which isn’t something I was interested in figuring out.

  • DynASM – this is the “dynamic assembler” used by LuaJIT. It’s supposed to be very fast and lightweight. It’s basically a lua script that preprocesses your C source code, looking for platform-independant asm and replacing it with C calls to generate that asm at runtime. I didn’t look too closely at this because of its dependency on lua.

  • LLVM – somewhat heavyweight, much more than a JIT, but interesting because of its power. However, at the time I could not make heads or tails of their documentation, much less build a simple prototype that adds two numbers. I get the impression this has been addressed now (judging from the number of compilers using LLVM these days), but I’ve never looked back at LLVM.

  • NekoVM – had a working JIT for about two years before I started Ludicrous. Looks lightweight. I wasn’t interested, though, in compiling to a different VM; I wanted Ludicrous-compiled code to run under MRI (giving me full access to all the C extensions that are available).

  • Nanojit – I believe this was available at the time, but was too new to interest me.

  • Libjit – small, lightweight, but most importantly, well documented. It’s a register-based VM that emits code as functions are called, making it easy to wrap as a ruby extension. It does physical register allocation for me; I just pretend I have an infinite number of registers and let libjit do the work. I read through the tutorials in less than half an hour and had a prototype of ruby-libjit running the same afternoon. After that, as they say, it’s all history.

A faster stack

I’m back in the country now and starting to work on Ludicrous again. A number of YARV instructions have now been implemented and very simple methods can be executed.

Stack manipulation is tricky. As mentioned before, the first pass of JIT-compiling YARV bytecode actually allocated a Ruby array for every method invocation, calling rb_ary_push() and rb_ary_pop() to manipulate the stack. As you can imagine, this is quite slow.

The next pass tried to actually use the real YARV stack. This meant directly accessing Ruby’s thread structure to get a pointer to the stack. It was faster than using an array, but there were too many edge cases where the stack pointer wasn’t being manipulated properly.

The current solution uses a static stack instead of a dynamic stack, pushing and popping the stack at compile time rather than at run time. This means that stack operations can actually be optimized into register operations. It obviously doesn’t work with all possible YARV bytecode, but I don’t know of any cases where Ruby generates bytecode that needs a run-time stack (e.g. pushing a value onto the stack in a loop). In any case there are some rudimentary checks for such code, e.g. that the stack size is the same at the beginning and end of a detected loop.

The next challenge to tackle is to implement rescue and ensure. I’ve already got PUSH_TAG, POP_TAG, and EXEC_TAG implemented in ruby-jit (impossible on 1.8, since the current tag is private in eval.c), and the rest should come relatively easily.

Ludicrous on YARV

Last night I was tinkering with Ludicrous and I decided to see how hard it would be to get it to run on YARV. After modifying the extension to compile against ruby 1.9, it turned out to be only a handful of lines of code to compile a simple method in YARV:

class Foo
  def foo
    return 42
  end

 go_plaid
end

puts Foo.new.foo #=> 42

The primary difference between methods on 1.8 and methods on YARV is that methods in 1.8 are represented by SCOPE nodes that hold a reference to that method’s AST, while YARV methods are represented by a METHOD node that holds a reference to that method’s instruction sequence. The code for compiling METHOD nodes and SCOPE nodes is remarkably similar, the primary difference lying in the code that compiles the instruction sequence:

class VM
  class Instruction
    class PUTOBJECT
      def ludicrous_compile(function, env)
        value = function.const(JIT::Type::OBJECT, self.operands[0])
        env.push(value)
      end
    end

    class LEAVE
      def ludicrous_compile(function, env)
        retval = env.pop
        function.insn_return(retval)
      end
    end
  end
end

Currently the stack is a simple Ruby array, which is probably going to run slower than YARV bytecode, but it shouldn’t be difficult to use a more highly optimized stack (perhaps the system stack).

Ludicrous release is on the way

A number of months ago I began working on a just-in-time compiler for Ruby 1.8. What started as an experiment is actually now a reality – Ludicrous is now an executable that can run ruby scripts just like MRI!

Based on some ideas Charles Nutter mentioned on his blog regarding JRuby, Ludicrous works by installing stubs for every method in a given class that jit-compile the methods when they are called. If a method can’t be jit-compiled, the stub is removed and Ludicrous reverts back to the interpreter version of the method.

Ludicrous-compiled methods and MRI-interpreted methods can thus live safely together inside the same interpreter. This has allowed Ludicrous to grow incrementally, since it was able to run all of Test::Unit from day one.

Benchmarks show it to be on par or better than YARV:

cout@bean:~/ludicrous/sample$ ruby simple_benchmark.rb
                           user     system      total        real
Ackermann function     0.600000   0.020000   0.620000 (  0.645012)
Array access           3.590000   0.010000   3.600000 (  3.648332)
Fibonacci numbers      1.030000   0.020000   1.050000 (  1.089943)
GCD (iterative)        2.440000   0.110000   2.550000 (  2.593122)
GCD (recursive)        0.820000   0.040000   0.860000 (  0.889390)
Hash access I          1.400000   0.050000   1.450000 (  1.452047)
Hash access II         2.990000   0.050000   3.040000 (  3.152132)
Lists                  0.880000   0.000000   0.880000 (  0.912412)
Nested loop            4.700000   0.010000   4.710000 (  5.104556)
Sieve of Eratosthenes  5.660000   0.040000   5.700000 (  6.328717)
Word Frequency         0.690000   0.020000   0.710000 (  0.779878)
cout@bean:~/ludicrous/sample$ ruby1.9 simple_benchmark.rb
                           user     system      total        real
Ackermann function     0.640000   0.000000   0.640000 (  0.691521)
Array access           3.280000   0.000000   3.280000 (  3.394318)
Fibonacci numbers      0.290000   0.000000   0.290000 (  0.528554)
GCD (iterative)        0.620000   0.010000   0.630000 (  0.660089)
GCD (recursive)        0.210000   0.000000   0.210000 (  0.238961)
Hash access I          2.320000   0.040000   2.360000 (  2.416732)
Hash access II         4.180000   0.010000   4.190000 (  4.312941)
Lists                  0.560000   0.000000   0.560000 (  0.558828)
Nested loop            2.580000   0.000000   2.580000 (  2.660648)
Sieve of Eratosthenes  5.590000   0.050000   5.640000 (  5.751475)
Word Frequency         1.170000   0.000000   1.170000 (  1.200573)
cout@bean:~/ludicrous/sample$ ruby -I../lib -I../ext simple_benchmark.rb --jit --skip=sieve
Compiling nested_loop
Compiling ack
Compiling gcd_recur
Compiling hash_access_II
Compiling array_access
Compiling lists
Compiling word_frequency
Compiling fib
Compiling hash_access_I
Compiling gcd_iter
                        user     system      total        real
Ackermann function  0.200000   0.000000   0.200000 (  0.210097)
Array access        1.860000   0.030000   1.890000 (  1.919088)
Fibonacci numbers   0.210000   0.000000   0.210000 (  0.212081)
GCD (iterative)     0.060000   0.000000   0.060000 (  0.059266)
GCD (recursive)     0.140000   0.000000   0.140000 (  0.142387)
Hash access I       1.260000   0.000000   1.260000 (  1.321544)
Hash access II      1.700000   0.010000   1.710000 (  1.765920)
Lists               0.870000   0.000000   0.870000 (  0.868467)
Nested loop         2.900000   0.010000   2.910000 (  2.995473)
Word Frequency      0.650000   0.000000   0.650000 (  0.646185)

Ludicrous performs clearly better in most of the benchmarks here. Most impressive is the iterative GCD benchmark; Ludicrous executes the function 11 times faster than YARV and almost 44 times faster than 1.8! The tests where it doesn’t do as well are ones that make heavy use of blocks, because Ludicrous can’t do block inlining like YARV does.

To be fair, Ludicrous doesn’t implement a lot of features that would slow it down, such as event callbacks (used for profiling), bignums, and checking for redefinition of primitive arithmetic methods (such as mathn.rb does). It does try to avoid certain optimizations that YARV does not do, to keep the performance comparisons sensible.

Ludicrous’s performance is not at the expense of correctness, either; it passes all of the bfts tests and most of the tests that come with Ruby 1.8 (there are some pesky tests in particular that it is impossible for Ludicrous to pass without patching the interpreter).

Expect a gem to be released within the next few weeks as I iron out a few more kinks.