In responding to a Jekyll pull request, I went digging around the way Ruby
handles sorting. The problem was that we were trying to sort a list of
objects which don’t all have a given property. The contributor was using
sort_by
which throws an ArgumentError if the block returns a nil
value
at all. We had a sparse property we wanted to sort by.
Our typical solution to this is something like:
def sort_sparse_property(objects, property)
objects.sort do |apple, orange|
apple_value = apple.public_send(property)
orange_value = orange.public_send(property)
if !apple_value.nil? && orange_value.nil?
-1
elsif apple_value.nil? && !orange_value.nil?
1
else
apple_value <=> orange_value
end
end
end
Indeed, we did that in Jekyll::Filters#sort_input
. It works fine, but
it’s pretty slow. The speed difference between Enumerable#sort
and
Enumerable#sort_by
is well-documented: sort_by
wins every time. But
why?
Well, I read the documentation for Enumerable#sort_by
all the way through to the end. Lo, and behold! It walks right through
an optimization problems using sort
! The key problem is that the sort
block is allocating objects during every call. That’s a huge waste of
allocations, and if you have ever worked on production systems written in
Ruby, you know object allocations are one of the key reasons for slowness.
To reduce object allocations, the Ruby documentation suggests something called a “Schwartzian transform”:
A more efficient technique is to cache the sort keys (modification times in this case) before the sort. Perl users often call this approach a Schwartzian transform, after Randal Schwartz. We construct a temporary array, where each element is an array containing our sort key along with the filename. We sort this array, and then extract the filename from the result.
This is exactly what
sort_by
does internally.
Holy smokes! So going back to our example above, it’s suggesting we first extract the property for each object, then we sort, then we pull the object out again. Let’s give that a try:
def sort_sparse_property(objects, property)
objects.map { |object| [object.public_send(property), object] }
.sort do |apple, orange|
# apple & orange are now an arrays of [value, object]
# Use the value cached in this array to compare!
if !apple.first.nil? && orange.first.nil?
-1
elsif apple.first.nil? && !orange.first.nil?
1
else
apple.first <=> orange.first
end
end
.map { |object_info| object_info.last }
end
The first call to #map
creates an array for each object with the value we
want to compare paired with the object itself. The call to #sort
performs
the sort as usual, but this time it’s using the cached value. The last
#map
extracts the object from the sorted array of [value, object]
pairs.
I wrote a benchmark for Jekyll using Jekyll::Document
objects
and ran it for 2 properties: one which is sparsely available (i.e. only on
a few objects), and one which is more fully available (i.e. on all but a
small number of objects) and the results were shocking.
For a sparsely available property, the Schwartzian transform was 7-8x faster. And for the more fully available property, the Schwarzian transform was 1.75x faster. Those are monumental savings!
It seems like a huge waste to create a new array and a new object for
every item in the original Enumerable, but as we have just proven, it is
far more efficient due to the number of times the sort
block is called.
Next time you implement a custom sort
block, remember the Schwartzian
transform!