In responding to a Jekyll pull request, I went digging around the way Ruby
handles sorting. The problem was that we were trying to sort a list of
objects which don’t all have a given property. The contributor was using
sort_by which throws an ArgumentError if the block returns a nil value
at all. We had a sparse property we wanted to sort by.
Our typical solution to this is something like:
def sort_sparse_property(objects, property)
objects.sort do |apple, orange|
apple_value = apple.public_send(property)
orange_value = orange.public_send(property)
if !apple_value.nil? && orange_value.nil?
-1
elsif apple_value.nil? && !orange_value.nil?
1
else
apple_value <=> orange_value
end
end
end
Indeed, we did that in Jekyll::Filters#sort_input. It works fine, but
it’s pretty slow. The speed difference between Enumerable#sort and
Enumerable#sort_by is well-documented: sort_by wins every time. But
why?
Well, I read the documentation for Enumerable#sort_by
all the way through to the end. Lo, and behold! It walks right through
an optimization problems using sort! The key problem is that the sort
block is allocating objects during every call. That’s a huge waste of
allocations, and if you have ever worked on production systems written in
Ruby, you know object allocations are one of the key reasons for slowness.
To reduce object allocations, the Ruby documentation suggests something called a “Schwartzian transform”:
A more efficient technique is to cache the sort keys (modification times in this case) before the sort. Perl users often call this approach a Schwartzian transform, after Randal Schwartz. We construct a temporary array, where each element is an array containing our sort key along with the filename. We sort this array, and then extract the filename from the result.
This is exactly what
sort_bydoes internally.
Holy smokes! So going back to our example above, it’s suggesting we first extract the property for each object, then we sort, then we pull the object out again. Let’s give that a try:
def sort_sparse_property(objects, property)
objects.map { |object| [object.public_send(property), object] }
.sort do |apple, orange|
# apple & orange are now an arrays of [value, object]
# Use the value cached in this array to compare!
if !apple.first.nil? && orange.first.nil?
-1
elsif apple.first.nil? && !orange.first.nil?
1
else
apple.first <=> orange.first
end
end
.map { |object_info| object_info.last }
end
The first call to #map creates an array for each object with the value we
want to compare paired with the object itself. The call to #sort performs
the sort as usual, but this time it’s using the cached value. The last
#map extracts the object from the sorted array of [value, object]
pairs.
I wrote a benchmark for Jekyll using Jekyll::Document objects
and ran it for 2 properties: one which is sparsely available (i.e. only on
a few objects), and one which is more fully available (i.e. on all but a
small number of objects) and the results were shocking.
For a sparsely available property, the Schwartzian transform was 7-8x faster. And for the more fully available property, the Schwarzian transform was 1.75x faster. Those are monumental savings!
It seems like a huge waste to create a new array and a new object for
every item in the original Enumerable, but as we have just proven, it is
far more efficient due to the number of times the sort block is called.
Next time you implement a custom sort block, remember the Schwartzian
transform!