Concurrency in Ruby – José Valim
(Video) http://www.ustream.tv/recorded/33567537 (talk starts at around 5:20)
Nice talk about concurrency.
Concurrency is not a new topic, but it is becoming more important given the recent dominance of web-based applications and more consolidated infrastructure (servers with many cores and scalable cloud platforms). However, concurrency has always been a tough topic. Concurrency problems produce timing-dependent bugs, which are difficult to identify and fix.
One topic in the session was the difference between MRI (YARV) and JRuby in how core Ruby classes (e.g. Hash or Array) behave. When Rails initially introduced its thread-safe feature, it didn’t work well on JRuby, because these classes are not safe for concurrent access there. It’s a little tricky.
The differing semantics across implementations make concurrency in Ruby harder. We are also not used to thinking about concurrency. We need more education on how to “think concurrently”.
As noted in the conclusion of the talk, this is a difficult problem. However, some programming languages, such as Go, have simple syntax for concurrent programming. It would be good to have a concise way of expressing thread-safe logic in Ruby, too.
With the appearance of many-core server systems (e.g. Xeon Phi has over 50 cores), we need to use the computing power of all those cores from a single application. To achieve that, the application needs to handle concurrency well through a mechanism such as threads.
One issue is global mutable state. Instead of sharing global data across threads, we should communicate between threads.
- It’s described as “Do not communicate by sharing memory; instead, share memory by communicating”.
- Go provides first-class channels for communication between goroutines. In Ruby, one option is to use “SizedQueue” as a communication channel.
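The SizedQueue idea can be sketched roughly as follows. This is only a minimal illustration, not code from the talk: the capacity of 2, the `:done` sentinel and the variable names are my own assumptions. The point is that the two threads never touch the same data directly; everything passes through the queue.

```ruby
require "thread" # harmless on modern Rubies, needed on older ones

# Bounded queue: push blocks when 2 items are already waiting.
queue   = SizedQueue.new(2)
results = []

consumer = Thread.new do
  # pop blocks until an item is available.
  # :done is a sentinel chosen for this sketch, not a Ruby convention.
  while (item = queue.pop) != :done
    results << item * 2 # only the consumer thread writes to results
  end
end

producer = Thread.new do
  1.upto(5) { |n| queue.push(n) }
  queue.push(:done) # tell the consumer to stop
end

producer.join
consumer.join
p results
```

Because the queue is bounded, a fast producer is slowed down to the pace of the consumer instead of filling memory.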
Ruby has many different implementations. Rails 2.2 initially claimed to be thread-safe, but that was not true on JRuby. MRI (YARV) holds a global virtual machine lock while accessing core classes such as Hash, but JRuby and Rubinius don’t, so a Hash shared between threads can be corrupted there. Simply adding a lock doesn’t solve the problem, since it introduces a performance cost.
Java has java.util.concurrent. However, that approach leads to too many special classes for concurrent access.
Erlang and Go have concurrency primitives, but Ruby doesn’t have one. Therefore, Ruby needs some way to declare concurrency requirements in the code. Simply replacing Hash/Array with a concurrent version is not good, because it hurts performance on MRI, which doesn’t have the issue. So an additional concept is needed: an abstraction layer that absorbs the differences among VM implementations (e.g. adding a hash.concurrent_read! method).
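For reference, the "just add a lock" approach looks something like the sketch below. The `LockedCounter` class and its method names are hypothetical, written only for this note. Every read and write goes through one Mutex, which keeps the Hash consistent on any implementation but also serializes all access, and that is the performance cost mentioned above.

```ruby
require "thread"

# Hypothetical wrapper for illustration; not part of Ruby or Rails.
class LockedCounter
  def initialize
    @counts = Hash.new(0)
    @lock   = Mutex.new
  end

  # Read-modify-write happens while holding the lock.
  def increment(key)
    @lock.synchronize { @counts[key] += 1 }
  end

  def [](key)
    @lock.synchronize { @counts[key] }
  end
end

counter = LockedCounter.new
threads = 4.times.map do
  Thread.new { 1000.times { counter.increment(:hits) } }
end
threads.each(&:join)

total = counter[:hits]
puts total
```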
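To make the abstraction-layer idea a bit more concrete, here is a rough sketch of my own. Note that `hash.concurrent_read!` was only an example from the session and is not an existing Ruby method; `LockedHash` and `concurrent_hash` below are likewise hypothetical names. The sketch assumes, as the talk describes, that a plain Hash is acceptable on MRI and that other implementations need a lock.

```ruby
require "thread"

# Hypothetical lock-guarded Hash used on VMs without a global lock.
class LockedHash
  def initialize(hash)
    @hash = hash
    @lock = Mutex.new
  end

  def [](key)
    @lock.synchronize { @hash[key] }
  end

  def []=(key, value)
    @lock.synchronize { @hash[key] = value }
  end
end

# Hypothetical factory: the caller states "I need concurrent access"
# and the VM-specific choice is hidden here.
def concurrent_hash(hash = {})
  # RUBY_ENGINE is "ruby" on MRI and e.g. "jruby" on JRuby.
  RUBY_ENGINE == "ruby" ? hash : LockedHash.new(hash)
end

cache = concurrent_hash
cache[:answer] = 42
value = cache[:answer]
puts value
```

The calling code stays the same on every VM, and MRI does not pay for a lock it does not need under that assumption.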
Some references based on my research on related topics.
- It describes concurrency in JRuby.
- Also, some notes on “ThreadSafe::Array” and “ThreadSafe::Hash” from the thread_safe gem. Maybe it’s a similar concept to the hash.concurrent_read! mentioned in the session.
- The “Avoid shared, mutable state” section talks a little about the differences between MRI and JRuby.