Friday, May 25, 2007

Ruby-esque JMX

The topic of JMX on JRuby came up recently and I decided to play around. I found a great starter on Jeff Mesnil's blog, but I decided I hated the syntax.

Ruby has spoiled me. ActiveRecord has spoiled me.

So I cooked up this little (fully working) example:

#Find all the MBeans matching some object name
mbeans = JMX::MBean.find_all_by_name("cacheStatistics:*")

mbeans.each do |bean|
  puts "#{bean.name} "

  # Either use methods on the bean object
  puts " - CacheHits: #{bean.CacheHits}"

  # Or access the attributes hash.
  puts " - CacheMisses: #{bean.attributes["CacheMisses"]}"
end


The code is ~50 lines, which I'll post at some point.
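The dynamic attribute access used in the example can be sketched in plain Ruby. This is not the code from this post: SketchBean is a hypothetical stand-in that reads from an ordinary Hash, where a real wrapper would presumably ask an MBean server for the values. It only shows how method_missing can make bean.CacheHits and bean.attributes["CacheHits"] read the same data.

```ruby
# Hypothetical stand-in for an MBean wrapper. A real one would fetch
# attribute values over JMX instead of holding them in a plain Hash.
class SketchBean
  attr_reader :name, :attributes

  def initialize(name, attributes)
    @name = name
    @attributes = attributes
  end

  # Route unknown zero-argument calls such as bean.CacheHits to the hash.
  def method_missing(method_name, *args)
    key = method_name.to_s
    return @attributes[key] if args.empty? && @attributes.key?(key)
    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(method_name, include_private = false)
    @attributes.key?(method_name.to_s) || super
  end
end

bean = SketchBean.new("cacheStatistics:name=demo", "CacheHits" => 42, "CacheMisses" => 7)
puts bean.CacheHits
puts bean.attributes["CacheMisses"]
```

Unknown names still raise NoMethodError, so a typo in an attribute name fails loudly instead of returning nil.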

I never thought working with Java objects could be made to "feel" nice.

I was also chatting with "headius" on #jruby, and he mentioned that Rob Harrop, of Spring fame, gave a talk at JavaOne about something similar called MScript. I'd love to get my hands on those slides.

Acts As Fast But Very Inaccurate Counter

Introduction

If you have chosen the InnoDB MySQL engine over MyISAM for its support of transactions, foreign keys and other niceties, you might be aware of its limitations, such as a much slower count(*). Our DBAs are constantly on the lookout for slow queries in production and for ways to keep the DBs happy, so they recommended that we try to fix count(). They suggested checking SHOW TABLE STATUS for an approximate count of rows in a table. This morning I wrote acts_as_fast_counter, which proved that the speed does improve but the accuracy might not be acceptable. The rest of the post just records the details of the exercise.

The approach

I created a model per engine and seeded each with 100K records. Then I ran count on each model a thousand times and measured the results.

The code:

module ActiveRecord; module Acts; end; end

module ActiveRecord::Acts::ActsAsFastCounter

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods

    def acts_as_fast_counter
      self.extend(FastCounterOverrides)
    end

    module FastCounterOverrides

      def count(*args)
        if args.empty?
          connection.select_one("SHOW TABLE STATUS LIKE '#{ table_name }'")['Rows'].to_i
        else
          super(*args)
        end
      end

    end

  end

end

ActiveRecord::Base.send(:include, ActiveRecord::Acts::ActsAsFastCounter)

# create_table :myisams, :options => 'engine=MyISAM'  do |t|
# t.column :name, :string
# end
# 100_000.times { Myisam.create(:name => Time.now.to_s) }
#
# create_table :innodbs, :options => 'engine=InnoDB' do |t|
# t.column :name, :string
# end
# 100_000.times { Innodb.create(:name => Time.now.to_s) }

class Bench

  require 'benchmark'
  require 'acts_as_fast_counter'

  def self.run
    measure
    show_count
    convert_to_fast_counter
    show_count
    add_records
    show_count
    destroy_records
    show_count
    measure
  end

  def self.measure
    puts "* Benchhmarks:"
    n = 1_000
    Benchmark.bm(12) do |x|
      x.report('MyISAM') { n.times { Myisam.count } }
      x.report('InnoDB') { n.times { Innodb.count } }
    end
  end

  def self.convert_to_fast_counter
    Innodb.send(:acts_as_fast_counter)
    puts "* Converted Innodb to fast counter"
  end

  def self.add_records
    @myisam = Myisam.create(:name => 'One more')
    @innodb = Innodb.create(:name => 'One more')
    puts "* Added records"
  end

  def self.destroy_records
    @myisam.destroy
    @innodb.destroy
    puts "* Destroyed records"
  end

  def self.show_count
    puts "* Record count:"
    puts " MyISAM: #{ Myisam.count }"
    puts " InnoDB: #{ Innodb.count }"
  end

end
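The override works because extend places the module ahead of the inherited class-level count in method lookup, so super still reaches the original. Here is a minimal sketch of that mechanism without a database. SketchModel, SketchInnodb and ApproximateCount are hypothetical names, and the returned numbers are made up for illustration.

```ruby
# Stand-in for ActiveRecord::Base; the numbers are invented for the demo.
class SketchModel
  def self.count(*args)
    args.empty? ? 100_000 : 10 # 10 plays the role of a conditional count
  end
end

module ApproximateCount
  def count(*args)
    if args.empty?
      100_345 # stand-in for the estimate read from SHOW TABLE STATUS
    else
      super   # any arguments: fall back to the exact, inherited count
    end
  end
end

class SketchInnodb < SketchModel; end

puts SketchInnodb.count        # exact path, inherited from SketchModel
SketchInnodb.extend(ApproximateCount)
puts SketchInnodb.count        # estimate supplied by the module
puts SketchInnodb.count(:all)  # reaches SketchModel.count through super
```

Note that this only works because count is inherited. A count defined directly on SketchInnodb would take precedence over the extended module.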


The results:
* Benchhmarks:
                  user     system      total        real
MyISAM        0.180000   0.040000   0.220000 (  0.289983)
InnoDB        0.430000   0.070000   0.500000 ( 35.102496)
* Record count:
 MyISAM: 100000
 InnoDB: 100000
* Converted Innodb to fast counter
* Record count:
 MyISAM: 100000
 InnoDB: 100345
* Added records
* Record count:
 MyISAM: 100001
 InnoDB: 100345
* Destroyed records
* Record count:
 MyISAM: 100000
 InnoDB: 100345
* Benchhmarks:
                  user     system      total        real
MyISAM        0.250000   0.030000   0.280000 (  0.350673)
InnoDB        0.250000   0.040000   0.290000 (  0.977711)


Final thoughts

The MySQL manual has a clear warning about the inaccuracy of the row count in the SHOW TABLE STATUS results:

Rows - The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40 to 50%. In such cases, use SELECT COUNT(*) to obtain an accurate count.


The test confirms it by showing 345 more records than expected, which makes the approach of little use outside a few edge cases. If you know a way to speed up count() on InnoDB other than a counter table, please share.
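To put the gap in proportion, here is the arithmetic on the two numbers from the run above. It describes this single run only and says nothing about how far the estimate can drift on other tables or workloads.

```ruby
actual   = 100_000 # exact row count, as reported before the conversion
reported = 100_345 # estimate returned for InnoDB after the conversion

# Relative error of the estimate, as a percentage of the exact count.
error_pct = (reported - actual) * 100.0 / actual
puts format("off by %d rows (%.3f%%)", reported - actual, error_pct)
```

The estimate also stayed at 100345 after a record was added and again after it was destroyed, so it does not track individual writes.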

Thursday, May 24, 2007

Javascript: Event.onElementReady

We deal with a lot of Javascript at RHG. Rather than create functional one-offs that become hard to maintain and highly repetitive, we prefer to have collections of behavior that can be assigned to parts of the document. Our CSS designers especially like this because it makes our markup clean and easy to read. It also means we can have tests that exercise a series of interactions.

However, there is a problem we run into when loading the page over a slow connection: the Javascript is not run until window.onload, so our users can see part of the page rendered but cannot use it until it has fully loaded. For those who don't know, window.onload will not fire until all images, CSS, and Javascript files have loaded. Our first solution to this problem was to use onDOMReady. This has worked fairly well and kept our site running reasonably quickly. We've run into problems with onDOMReady in IE6, however, and as a result disabled it so that our pages always render. In the IE family of browsers, if a script tries to modify a part of the DOM before it has been completely processed, the browser will raise an "Operation aborted" error. After this alert message pops up, the web page becomes unusable and the browser could even crash.

After some careful thinking we decided to attach our behavior objects to smaller chunks of the DOM. Rather than wait until the window.onload or onDOMReady events fire, we can use a few inline script tags that call a function that figures out how to attach itself to its most immediate parentNode.

For example, we do this:


<div class="parent">
  <div class="bvr-blah">
  </div>
  <script type="text/javascript">
    RHG.Behavior.attach();
  </script>
</div>


There could be multiple behaviors in the above block of markup that are all contained within the hypothetical div.parent element.

Now, to allow this to work the div.parent must be ready. Otherwise, IE will cancel the whole page rendering with the "Operation Aborted" alert.

Event.onElementReady checks for features of the DOM element to determine whether or not the element is really ready to be manipulated. Thankfully, IE doesn't mind if you read from an element before it's ready; it only objects if you try to modify it. This method polls the DOM element until either the nextSibling or the textContent is non-null.


Object.extend(Event,{
  // check whether or not the DOM element is ready
  onElementReady: function(element,callback)
  {
    if( element && (element.nextSibling || element.textContent) ){
      callback();
    }
    else{
      setTimeout( this.onElementReady.bind(this,element,callback), 1 );
    }
  }
});


Now we can attach our behaviors as the page is rendering and avoid "Operation Aborted" alerts from IE.

Tuesday, May 22, 2007

DRYing Up Polymorphic Controllers

Polymorphic routes allow drying up the controller implementation when the functionality is identical regardless of entry point. A good example is comments for articles and blogs. The challenge is to keep the comments controller simple while it serves multiple incoming routes. Let's look at how it could be written.

Routing is straightforward, with the blog and article models acting as commentable and both the comment model and the comments controller being polymorphic:

ActionController::Routing::Routes.draw do |map|
  map.resources :articles, :has_many => [ :comments ]
  map.resources :blogs, :has_many => [ :comments ]
end


This means that the new-comment form is served at either /articles/1/comments/new or /blogs/1/comments/new, and a comment is created via POST to /articles/1/comments or /blogs/1/comments. The comments controller can be implemented to handle both:

class CommentsController < ApplicationController

  def new
    @parent = parent_object
    @comment = Comment.new
  end

  def create

    @parent = parent_object
    @comment = @parent.comments.build(params[:comment])

    if @comment.valid? and @comment.save
      redirect_to parent_url(@parent)
    else
      render :action => 'new'
    end

  end

  private

  def parent_object
    case
    when params[:article_id] then Article.find_by_id(params[:article_id])
    when params[:blog_id] then Blog.find_by_id(params[:blog_id])
    end
  end

  def parent_url(parent)
    case
    when params[:article_id] then article_url(parent)
    when params[:blog_id] then blog_url(parent)
    end
  end

end


This method works fine and there is not much drive to start refactoring it right away. This changes, though, if there is a need to add another commentable or allow some other polymorphic route. Instead of adding more 'when' clauses the whole functionality can be extracted and abstracted based on the idea of having fixed naming conventions for resources that allow movement from a controller name to a model. The refactored example has the parent functionality extracted to the application controller to share it as-is with other polymorphic routes:

class ApplicationController < ActionController::Base

  protected

  class << self

    attr_reader :parents

    def parent_resources(*parents)
      @parents = parents
    end

  end

  def parent_id(parent)
    request.path_parameters["#{ parent }_id"]
  end

  def parent_type
    self.class.parents.detect { |parent| parent_id(parent) }
  end

  def parent_class
    parent_type && parent_type.to_s.classify.constantize
  end

  def parent_object
    parent_class && parent_class.find_by_id(parent_id(parent_type))
  end

end

class CommentsController < ApplicationController

  parent_resources :article, :blog

  def new
    @parent = parent_object
    @comment = Comment.new
  end

  def create

    @parent = parent_object
    @comment = @parent.comments.build(params[:comment])

    if @comment.valid? and @comment.save
      redirect_to send("#{ parent_type }_url", @parent)
    else
      render :action => 'new'
    end

  end

end

The parent_resources call declares the resources that are parents of the current controller. An alternative approach is to guess such parent resources from the request URI and routes. Aaron is currently working on a patch on Edge implementing it. We'll update this post later.
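The lookup itself can be tried outside Rails. The sketch below is a plain-Ruby approximation under a few assumptions: Article and Blog are empty stand-in classes, PARENTS plays the role of what parent_resources records, a Hash stands in for request.path_parameters, and capitalize is a rough substitute for ActiveSupport's classify.constantize that only handles single-word resource names.

```ruby
# Hypothetical stand-ins for the models; only the class names matter here.
class Article; end
class Blog; end

PARENTS = [:article, :blog] # what parent_resources would have recorded

# Return the first declared parent whose "<name>_id" key is present.
def parent_type(path_parameters)
  PARENTS.detect { |parent| path_parameters["#{parent}_id"] }
end

# Map the parent type to its class, or nil when no parent key is present.
def parent_class(path_parameters)
  type = parent_type(path_parameters)
  type && Object.const_get(type.to_s.capitalize)
end

params = { "blog_id" => "1", "controller" => "comments", "action" => "new" }
puts parent_type(params).inspect
puts parent_class(params).inspect
puts parent_class("action" => "index").inspect
```

The declared names must be singular so that "#{parent}_id" matches the key the nested route produces, such as blog_id.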

If you currently use multiple polymorphic resources and have if clauses in the controller code, you might want to rethink how it could be DRYed up using this approach. In some cases views are very specific to the parent type; then it might be better to have different templates and partials rendered via render :template => "/controller/#{ parent_type }_action".