Vancouver Ruby & Rails Central

One Stop Source For Ruby & Rails News, Events & Jobs In & Around Vancouver

Vancouver.rb Q&A with Sunny Hirai (MeZine) on Rails, Merb vs. Caffeine and Processor, Database and Storage Scaling, and More

Posted by Gerald on August 27, 2008

Welcome back to the Vancouver.rb Q&A series. Today let’s welcome Vancouverite Sunny Hirai – founder and CEO of MeZine Inc.

Q: Can you tell us a little bit about yourself and your startup (company) MeZine?

Sunny Hirai: I’m the founder and CEO of MeZine Inc., the internet company that developed CityMax, a website builder for small businesses. Technically, we’re not a startup anymore as we’ve been in business since 2000 and we’ve been profitable for a long time but we have a new project that we’re treating like a startup.

Our main property, CityMax, is an everything-for-one-price business website builder that is $20 a month and includes a shopping cart, online form builder, discussion boards, photo albums, eBay integration, PayPal processing, credit card processing, calendar, and dozens of other business features.

I believe we launched one of the first, if not the first, online website builder on the internet.

Currently, we are privately held and employ about thirty people.

Q: How did you get started with Ruby (and Ruby on Rails)? What makes Ruby a great choice for developing web applications?

Sunny Hirai: We needed an application platform that would support our “next big idea” which includes a suite of web services including a site builder, wiki, blog, and social networking.

The platform we chose had to allow us to build a complex application by combining simpler applications in many different ways. It also had to scale easily and almost infinitely.

I was attracted to Ruby on Rails initially but ultimately, Rails did not have the type of scalability nor the ability to combine apps the way we wanted; however, we really liked Ruby and made that the language for our project. I ended up building a new Ruby framework that focused on scalability and application re-use.

I think Ruby’s greatest asset is that it is simple to start with but powerful when mastered. This is great for developing in a shared environment because one can build powerful frameworks with meta-programming and other advanced concepts, but the developers who use the framework do not need to know all the advanced features and can still be productive right away. Rails, of course, is the primary example of this.

Python was a strong contender, but ultimately Ruby’s pure object-oriented model won me over despite Python’s better performance.

Q: Can you tell us some challenges you faced developing using Ruby on Rails and why you have decided to build your own web framework in Ruby?

Sunny Hirai: First of all, let me say that I like Rails. For most web applications, it is a very good or near perfect solution and it’s something I recommend to those who are considering startups.

Unfortunately, it didn’t have ticks in a few important boxes for our project:

  • Multi-threading
  • Application reuse
  • Database clustering
  • Scalable storage

Many of these issues can be patched, fudged or hacked onto Rails but when you do that, it tends to add complexity in the wrong places. Because of this, we decided to start from scratch and build a framework that could handle what we needed elegantly.

The Caffeine Framework improves on more than these four issues but they were the impetus for starting.

I just wanted to add that I haven’t been following Rails intimately so I may have some Rails facts wrong and mean no disrespect.

Q: Can you tell us more about the challenge of scaling and how your Caffeine web framework outshines Ruby on Rails or Merb or takes a different route? Any commentary on Merb and how your web framework differs?

Sunny Hirai: With libraries, deployment design and/or thoughtful application design, Rails scales, but the framework doesn’t help you. Merb defers scaling by focusing on performance, which is great, but it also doesn’t help you when you eventually need to scale. Caffeine’s approach is to focus on scaling as a problem to solve in the framework and leave the application designer mostly out of it. In other words, build an application in Caffeine and it will scale.

Caffeine focuses on three areas:

  1. Processor Scaling
  2. Database Scaling
  3. Storage Scaling

Processor Scaling

Any of these frameworks, Rails, Merb and of course Caffeine, can scale across servers if you put a load balancer in front of a group of them; however, a single Rails instance cannot handle more than one request at a time because it is not multi-threaded.

The workaround is to run more than one instance of Rails per server.

The problem is that if you send two requests to an instance and the first request is a long running process like a file upload or credit card charge, the second request has to wait and so does the user. To solve this you use Merb or another framework to handle the slow running processes in a multi-threaded fashion and make sure not to use single-threaded libraries like ActiveRecord in it.

The big issue here is that scaling to multiple processors in Rails is not trivial. You need to know what you’re doing and you need to refactor the slow bits of your application to make it responsive.

Caffeine is multi-threaded so requests do not block each other and you get to focus more on building your application and less time on getting your application to work smoothly. Merb is multi-threaded as long as you don’t use ActiveRecord or other single-threaded libraries.

For many applications that deal with just a database and with fast queries, this isn’t a big deal. It’s when you have many slow running processes that handling them differently becomes a chore.
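
The blocking problem described above can be illustrated with plain Ruby threads. This is only a minimal sketch, not Caffeine or Merb code: the `handle` method, the request names and the sleep durations are all assumptions made for the example, with `sleep` standing in for a slow upload or payment call.

```ruby
require "thread"

# Hypothetical request handler: sleep stands in for blocking work
# such as a file upload or a credit card charge.
def handle(name, seconds, finished)
  sleep(seconds)
  finished << name # Queue#<< is safe to call from several threads
end

finished = Queue.new

# Each request runs in its own thread, so the fast request
# does not have to wait behind the slow one.
threads = [
  Thread.new { handle(:slow_upload, 0.3, finished) },
  Thread.new { handle(:fast_page, 0.01, finished) }
]
threads.each(&:join)

# Drain the queue to see the order in which requests completed.
order = []
order << finished.pop until finished.empty?
puts order.inspect
```

In a single-threaded instance the two calls would run back to back, and the fast page would only be served after the slow upload had finished.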

Database Scaling

Worrying about scaling the database is not something the application developer should have to do. He or she should be working on the application itself. This is both more productive and more fun. It takes a lot of training and knowledge to understand how to manage and scale a database and it shouldn’t be a problem that has to be solved over and over again. To get around this Caffeine takes care of database scaling at the framework level.

Caffeine includes database clustering (also known as partitioning) which means that you are not restricted to storing a table in one database. Instead, you split it up across multiple databases. You control how the database is split and you can add and remove clusters from the pool at will.

This is great for scalability.
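
As a rough illustration of splitting one table across several databases, the sketch below routes each record key to a database by hashing it. The `ClusterRouter` class and the database names are hypothetical and are not taken from Caffeine, whose clustering scheme is not described in detail here.

```ruby
require "zlib"

# Hypothetical router: maps a record key to one of several databases.
class ClusterRouter
  def initialize(databases)
    @databases = databases
  end

  # CRC32 gives the same answer in every process, unlike Object#hash,
  # so every application server agrees on where a record lives.
  def database_for(key)
    @databases[Zlib.crc32(key.to_s) % @databases.size]
  end
end

router = ClusterRouter.new(%w[db_a db_b db_c])
puts router.database_for(42)
```

Plain modulo hashing moves most keys when a database is added or removed, so a system that lets you change the pool at will would need something more careful, such as a lookup directory or consistent hashing.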

As a bonus it also opens up more options. When the database can scale you can solve more problems with the database because it is not a scarce resource. For example, we are considering using PostgreSQL’s full text search in our application, an option we would never consider without clustering.

Caffeine’s clustering is designed to “just work.” You build your controllers like you do normally but you can use them with or without clustering.

Ultimately, it turns clustering into a choice rather than a project. Should I flip the clustering switch or shouldn’t I?

Storage Scaling

You shouldn’t have to decide how you are going to store your files for your application and be locked into the decision. I want to be able to store files locally, or in MogileFS, or in a network share or on Amazon S3 and I don’t want to think about storage problems while developing my application. Handling storage should not be a development issue but an implementation issue.

Caffeine takes care of this by abstracting storage. Presently it has plugins for local/network storage and for Amazon S3 but there is no reason it can’t support MogileFS or any other storage solutions.

Like clustering, it is designed to just work.
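
One way to picture abstracted storage is a small interface that every backend implements, so application code never names a specific backend. The sketch below is an assumption about the general shape of such a design, not Caffeine's plugin API: `MemoryStorage`, `LocalStorage` and `save_avatar` are invented for the example.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical interface: every backend answers put, get and delete.
class MemoryStorage
  def initialize
    @files = {}
  end

  def put(key, data)
    @files[key] = data
    key
  end

  def get(key)
    @files.fetch(key)
  end

  def delete(key)
    @files.delete(key)
  end
end

class LocalStorage
  def initialize(root)
    @root = root
  end

  def put(key, data)
    path = File.join(@root, key)
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, data)
    key
  end

  def get(key)
    File.binread(File.join(@root, key))
  end

  def delete(key)
    FileUtils.rm_f(File.join(@root, key))
  end
end

# Application code depends only on the interface, not on the backend.
def save_avatar(storage, user_id, data)
  storage.put("users/#{user_id}/avatar.png", data)
end

Dir.mktmpdir do |dir|
  [MemoryStorage.new, LocalStorage.new(dir)].each do |storage|
    key = save_avatar(storage, 7, "fake image bytes")
    puts storage.get(key)
  end
end
```

An S3 or MogileFS backend would be one more class with the same three methods, and choosing between them becomes a deployment setting rather than a code change.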

Caffeine’s storage framework also integrates with the database. You can put an uploaded file right into a record in the database and Caffeine will automatically move the file into a directory related to the record and store the serialized location of the file into the record. When the record is deleted, the file will automatically be deleted from storage.

Q: Can you highlight (or tell us more about) the architecture and building blocks of your web framework?

Sunny Hirai: Developing in Caffeine is agile and fun like Rails but adds two things:

  1. Scalability
  2. Feature Packaging

We already touched on scalability so let’s talk about feature packaging.

The future in application development is to make the application modular so that you don’t have to keep reinventing common features like message boards, wikis and calendars. I call this feature packaging because you are taking a wholly contained application (a feature) and making it reusable. Unlike a library, a packaged feature includes everything needed to implement a complete feature in an application including the models, views and controllers.

Take Redmine for example. It’s a project manager written in Rails that includes several features like wiki, news manager, document manager and message boards. Those four features have been written hundreds of times over but they were rewritten once again for use in Redmine.

Since the focus of Redmine is project management, why are developers spending time building features that are not specific to project management? They would get farther if they focused on building project management features only and reusing other people’s code for the common bits. But they made the logical choice for now. Feature re-use is too hard in its current state to make it worthwhile.

With Caffeine, you can take any application you’ve built and, with no code changes, drop it into your new project. You could take somebody else’s forum application, for example, and use it in your project. Caffeine handles the differences between user models, database storage, file storage, templating, etc.

To make it work, we had to rethink everything from routing, to the database, to the user model and in many cases the abstractions are in different places than Rails, Merb or other popular frameworks.
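
How Caffeine bridges different user models is not spelled out in the interview, so the following is only a guess at the general idea: a packaged feature that receives adapters from the host application instead of hard-coding a user model or a store. `ForumFeature`, `ShopUser` and `WikiUser` are hypothetical names.

```ruby
# Hypothetical packaged feature: a tiny forum that knows nothing
# about the host application's user model or storage.
class ForumFeature
  def initialize(user_name:, store:)
    @user_name = user_name # callable: host user object -> display name
    @store = store         # anything that responds to << and to_a
  end

  def post(user, body)
    @store << { author: @user_name.call(user), body: body }
  end

  def posts
    @store.to_a
  end
end

# Two host applications with different user models reuse the feature unchanged.
ShopUser = Struct.new(:email)
WikiUser = Struct.new(:nickname)

shop_forum = ForumFeature.new(user_name: ->(u) { u.email }, store: [])
wiki_forum = ForumFeature.new(user_name: ->(u) { u.nickname }, store: [])

shop_forum.post(ShopUser.new("ann@example.com"), "Hello")
wiki_forum.post(WikiUser.new("bob"), "Hi")

puts shop_forum.posts.inspect
puts wiki_forum.posts.inspect
```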

Q: Do you have any plans to open source your Caffeine web framework?

Sunny Hirai: Yes, but before we do that, we will put the framework through its paces. This will be sometime after we have launched our product and put it into production.

I’m excited about the potential for open source. It has some great benefits like a larger testing base, contributions, good will and a larger pool of people to hire from since they already know how to use the framework.

Q: Any commentary on Sequel and why it’s a cool data access library?

Sunny Hirai: Sequel is the best ORM out there. I loved it so much that I made it part of Caffeine before Sequel was even out of beta. Amongst other things, it fit a key requirement: multi-threading.

Sequel is counter-intuitive if you’ve grown up writing SQL statements, but as soon as you get the approach, you’ll realize how smart it is. I believe we are going to see a rise in the popularity of Sequel as the preferred data access layer for non-Rails apps.

What makes Sequel cool is that it uses chaining to return subsets of a table. Say you had a table “stories.”

my_stories = stories.filter( :user_id => 7 )

my_stories now refers to all records in the “stories” table where “user_id = 7”.

You can filter the filtered dataset further:

my_featured_stories = my_stories.filter( :active => true, :featured => true )

This improves security and reduces code. You don’t have to write the same where clause over and over again. For example, “WHERE active = true AND user_id = 7” might be the subset you always want to be working with. With Sequel, you can filter the dataset once at the start then use the filtered dataset without worrying that you’ll forget the filters when you do a SELECT or DELETE.

In this way, Sequel is more DRY than ActiveRecord in common usage.

In Caffeine, I wrap Sequel to handle clustering, revision control, etc. I also feel that if you have a filtered dataset where :user_id = 7, then when you insert a new record, it should automatically set and restrict the user_id to 7, which it does.
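
The chaining idea and the insert behaviour described above can be sketched in plain Ruby without a database. This is neither Sequel's implementation nor Caffeine's wrapper: `ScopedDataset` is a hypothetical stand-in that keeps rows in an array, and it only shows how a filter can be accumulated once and then applied to every later read, insert and delete.

```ruby
# Hypothetical stand-in for a dataset, backed by an array of hashes.
class ScopedDataset
  def initialize(rows, conditions = {})
    @rows = rows             # shared backing "table"
    @conditions = conditions # filter accumulated so far
  end

  # Returns a new, narrower dataset; the original is left untouched.
  def filter(extra)
    ScopedDataset.new(@rows, @conditions.merge(extra))
  end

  def all
    @rows.select { |row| matches?(row) }
  end

  # Inserted rows are forced to satisfy the dataset's own filter.
  def insert(values)
    row = values.merge(@conditions)
    @rows << row
    row
  end

  # Deletes only the rows inside the filter and returns how many were removed.
  def delete
    before = @rows.size
    @rows.reject! { |row| matches?(row) }
    before - @rows.size
  end

  private

  def matches?(row)
    @conditions.all? { |column, value| row[column] == value }
  end
end

stories = ScopedDataset.new([
  { id: 1, user_id: 7, featured: true },
  { id: 2, user_id: 8, featured: true }
])

my_stories = stories.filter(user_id: 7)
my_stories.insert(id: 3, user_id: 99, featured: false) # user_id is reset to 7
puts my_stories.all.inspect
```

Because every operation goes through the same accumulated conditions, a forgotten WHERE clause on a delete cannot reach another user's rows.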

I also add a modeling scheme that allows custom data types. For example, you can store WebImage and WebFile types and they are automatically serialized/deserialized and the files automatically stored in the storage engine.

Q: Can you tell us a little bit about GoMarkup – your library for creating web markup (HTML) from text? Any commentary on how it compares to BlueCloth and RedCloth (Markdown and Textile)?

Sunny Hirai: GoMarkup differs from BlueCloth and RedCloth in two important ways. It is:

  • Configurable
  • Portable

GoMarkup is not limited to parsing one markup language. Instead, it takes a configuration file that lets it parse any markup language. Currently I have a configuration for WikiCreole markup but it can parse Markdown and Textile with different configuration files. This also makes adding custom markup code very easy. In a few seconds, I can add a definition for superscript, for example, if there wasn’t one already.

GoMarkup is designed to be portable across languages and, as part of that goal, it is very small (about 12 KB of Ruby without the tests). The goal is to have a version of GoMarkup in all the core web programming languages including Java, PHP, Python, .Net, etc.

As you might have surmised, with configuration and portability, it is possible to eliminate that mess of slightly incompatible markup libraries. I say slightly because it is hard to code a markup parser in different programming languages and have the edge cases handled identically. With GoMarkup, we create one configuration file that works identically in the GoMarkup parsers for each language.

In other words, with GoMarkup, a markup language’s official implementation is automatically documented by its configuration file.
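
To make the configuration-driven idea concrete, here is a toy renderer whose behaviour comes entirely from a rule table. It is not GoMarkup and does not use its configuration format: the rule tables, the `render` method and the `^^...^^` superscript syntax are assumptions for illustration, and a real markup parser needs far more than regular expression substitution.

```ruby
require "cgi"

# Hypothetical rule tables: each entry is [pattern, replacement].
CREOLE_LIKE = [
  [/\*\*(.+?)\*\*/, '<strong>\1</strong>'],
  [%r{//(.+?)//},   '<em>\1</em>']
]

TEXTILE_LIKE = [
  [/\*(.+?)\*/, '<strong>\1</strong>'],
  [/_(.+?)_/,   '<em>\1</em>']
]

# One engine for every markup language: escape the text,
# then apply each rule from the supplied table in order.
def render(text, rules)
  rules.reduce(CGI.escapeHTML(text)) do |html, (pattern, replacement)|
    html.gsub(pattern, replacement)
  end
end

puts render("**bold** and //italic//", CREOLE_LIKE)
puts render("*bold* and _italic_", TEXTILE_LIKE)

# Adding superscript is one more rule, not a change to the engine.
with_sup = CREOLE_LIKE + [[/\^\^(.+?)\^\^/, '<sup>\1</sup>']]
puts render("E = mc^^2^^", with_sup)
```

Porting such an engine to another language means rewriting only `render`; the rule tables could be shared as data.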

Q: Any tips, tricks or advice for developers getting started with Ruby?

Sunny Hirai: Programming Ruby by Dave Thomas (The Pickaxe Book) is a great start for beginners, but for those who want to break out and become Ruby experts, there are a few excellent resources, most of which have come out only this year:

  • “The Ruby Programming Language” book by David Flanagan and Yukihiro Matsumoto, the creator of Ruby. This is the de facto reference on the language itself.
  • “Ruby Metaprogramming” screencasts by Dave Thomas at the Pragmatic Programmer websites. This is the first time that I’ve seen metaprogramming explained in a way that was understandable. This is well worth the $5 per episode.
  • “Ruby for Rails” book by David A. Black. Even if you aren’t into Rails, this is one of the best books on advanced Ruby programming concepts.

For daily Ruby reading, I recommend in order from most filtered to least:

Also, it’s useful to know that you can be productive in Ruby from the start without necessarily knowing the in-depth metaprogramming material, which is hard.

Thanks Sunny Hirai for your time and insight. Questions? Comments? Send them along to the Vancouver.rb forum/mailing list. Thanks!