To balance this, according to my first read of Ruby’s threading capability, it’s my impression that not only can at most one thread execute Ruby code at a time (a limitation shared by OCaml, due to the non-distributed nature of mark-and-sweep garbage collectors), but a thread blocked on a syscall will also block all other threads from running.
Dumb Ruby threads (but I still hope I’m wrong …)
This is not totally true:
- Yes, Ruby threads are user-level: you won’t get any speedup from dual-core or dual-processor systems. In Python, this presentation claims (slide 39) that threads are mapped to native threads, but that a giant lock always prevents more than one thread from executing at a time (so library writers don’t have to write thread-safe code).
- The Ruby interpreter tries very hard to map blocking syscalls to their non-blocking counterparts (for example, all I/O is passed to a select() call). This gets really dirty when you mix I/O and other syscalls. For example, see the strace output of the following Ruby code:
    require 'thread'

    th1 = Thread::new do
      pid = fork { sleep 2 }
      Process::waitpid(pid)
      puts "Finished."
    end

    th2 = Thread::new do
      f = STDIN.read
      puts "Finished2."
    end

    th1.join
    select(1, [0], [], [], {0, 349}) = 0 (Timeout)
    gettimeofday({1168847486, 478832}, NULL) = 0
    select(1, [0], [], [], {0, 0}) = 0 (Timeout)
    waitpid(25249, 0xbf926810, WNOHANG) = 0
    select(1, [0], [], [], {0, 0}) = 0 (Timeout)
    gettimeofday({1168847486, 479013}, NULL) = 0
    gettimeofday({1168847486, 479053}, NULL) = 0
    select(1, [0], [], [], {0, 59959}) = 0 (Timeout)
    gettimeofday({1168847486, 538971}, NULL) = 0
    select(1, [0], [], [], {0, 41}) = 0 (Timeout)
    gettimeofday({1168847486, 543017}, NULL) = 0
    select(1, [0], [], [], {0, 0}) = 0 (Timeout)
    waitpid(25249, 0xbf926810, WNOHANG) = 0
    select(1, [0], [], [], {0, 0}) = 0 (Timeout)
    gettimeofday({1168847486, 543200}, NULL) = 0
    gettimeofday({1168847486, 543240}, NULL) = 0
    select(1, [0], [], [], {0, 59959}) = 0 (Timeout)
    gettimeofday({1168847486, 602842}, NULL) = 0
    select(1, [0], [], [], {0, 357}) = 0 (Timeout)
    gettimeofday({1168847486, 606842}, NULL) = 0
    select(1, [0], [], [], {0, 0}) = 0 (Timeout)
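The trace shows the interpreter alternating between select() with a short timeout and waitpid(..., WNOHANG), rather than ever blocking in either call. To make that pattern concrete, here is a rough sketch of the same loop written by hand in Ruby. It is only an illustration, not the interpreter's actual scheduler: poll_child_and_input is a hypothetical helper, the 0.06 s interval is an assumption chosen to resemble the roughly 60 ms timeouts in the trace, and a pipe stands in for STDIN so the example runs without a terminal.

```ruby
# Hypothetical helper (not part of Ruby's API): poll a child process and an
# IO object in turn, never blocking on either one for more than `interval`.
def poll_child_and_input(pid, io, interval = 0.06)
  polls = 0
  loop do
    polls += 1
    # Non-blocking counterpart of waitpid: returns nil while the child is
    # still running, and the pid once it has exited.
    return [:child_exited, polls] if Process.waitpid(pid, Process::WNOHANG)

    # Instead of a blocking read, ask select() whether io is readable,
    # waiting at most `interval` seconds. Returns nil on timeout.
    return [:input_ready, polls] if IO.select([io], nil, nil, interval)
  end
end

# A pipe stands in for STDIN; nothing is ever written to it, so only the
# child's exit can end the loop.
reader, writer = IO.pipe
pid = fork { sleep 0.3 }
event, polls = poll_child_and_input(pid, reader)
puts "#{event} after #{polls} polls"
```

Every pass through the loop costs a couple of syscalls even when nothing has happened, which is the busy polling the strace output makes visible.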