Schwartzian transform & faster sorting

In responding to a Jekyll pull request, I went digging around the way Ruby handles sorting. The problem was that we were trying to sort a list of objects which don’t all have a given property. The contributor was using sort_by which throws an ArgumentError if the block returns a nil value at all. We had a sparse property we wanted to sort by.

Our typical solution to this is something like:

def sort_sparse_property(objects, property)
  objects.sort do |apple, orange|
    apple_value = apple.public_send(property)
    orange_value = orange.public_send(property)

    if !apple_value.nil? && orange_value.nil?
      -1
    elsif apple_value.nil? && !orange_value.nil?
      1
    else
      apple_value <=> orange_value
    end
  end
end

Indeed, we did that in Jekyll::Filters#sort_input. It works fine, but it’s pretty slow. The speed difference between Enumerable#sort and Enumerable#sort_by is well-documented: sort_by wins every time. But why?

Well, I read the documentation for Enumerable#sort_by all the way through to the end. Lo, and behold! It walks right through an optimization problems using sort! The key problem is that the sort block is allocating objects during every call. That’s a huge waste of allocations, and if you have ever worked on production systems written in Ruby, you know object allocations are one of the key reasons for slowness.

To reduce object allocations, the Ruby documentation suggests something called a “Schwartzian transform”:

A more efficient technique is to cache the sort keys (modification times in this case) before the sort. Perl users often call this approach a Schwartzian transform, after Randal Schwartz. We construct a temporary array, where each element is an array containing our sort key along with the filename. We sort this array, and then extract the filename from the result.

This is exactly what sort_by does internally.

Holy smokes! So going back to our example above, it’s suggesting we first extract the property for each object, then we sort, then we pull the object out again. Let’s give that a try:

def sort_sparse_property(objects, property)
  objects.map { |object| [object.public_send(property), object] }
    .sort do |apple, orange|
      # apple & orange are now an arrays of [value, object]
      # Use the value cached in this array to compare!
      if !apple.first.nil? && orange.first.nil?
        -1
      elsif apple.first.nil? && !orange.first.nil?
        1
      else
        apple.first <=> orange.first
      end
    end
    .map { |object_info| object_info.last }
end

The first call to #map creates an array for each object with the value we want to compare paired with the object itself. The call to #sort performs the sort as usual, but this time it’s using the cached value. The last #map extracts the object from the sorted array of [value, object] pairs.

I wrote a benchmark for Jekyll using Jekyll::Document objects and ran it for 2 properties: one which is sparsely available (i.e. only on a few objects), and one which is more fully available (i.e. on all but a small number of objects) and the results were shocking.

For a sparsely available property, the Schwartzian transform was 7-8x faster. And for the more fully available property, the Schwarzian transform was 1.75x faster. Those are monumental savings!

It seems like a huge waste to create a new array and a new object for every item in the original Enumerable, but as we have just proven, it is far more efficient due to the number of times the sort block is called.

Next time you implement a custom sort block, remember the Schwartzian transform!

Published on September 1, 2017.