TIL: "warning: toplevel constant X referenced by Y"

We ran into an interesting issue recently. After a seemingly unrelated change, a large number of tests failed, and the test output contained messages of the form "warning: toplevel constant X referenced by Y".

The Situation

For a logistics platform we needed multiple JSON APIs (one for clients, one for couriers). The two use different data formats and therefore require separate JSON serializers (we use ActiveModel::Serializers).

The serializers are namespaced and have common structures extracted into a superclass. It looks something like this:

class ShipmentSerializer
  # contains common data structures
end

class Client::ShipmentSerializer < ShipmentSerializer
  # contains data structures specific for the client API
end

class Courier::ShipmentSerializer < ShipmentSerializer
  # contains data structures specific for the courier API
end

This worked fine until one day several tests failed unexpectedly. The data returned by the serializers was wrong and we found the following warnings in the test output:

warning: toplevel constant ShipmentSerializer referenced by <some class>
warning: toplevel constant ShipmentSerializer referenced by <some other class>
...

After some digging we found the cause: code that referenced either Client::ShipmentSerializer or Courier::ShipmentSerializer was suddenly handed the superclass ShipmentSerializer by Ruby instead. This did not cause any runtime errors, but it did result in incorrectly serialized data.

How could that be? The referencing code clearly names the namespaced class Courier::ShipmentSerializer, yet Ruby gave it a different class: similar in name, but clearly different. After all, that’s what namespacing is for, isn’t it?

Namespacing doesn’t work as you believe

Turns out… namespacing in Ruby works differently than most of us would expect. When you write Foo::Bar in Ruby, it does not mean Bar within namespace Foo, but rather Bar as “seen” from Foo ¹. And it turns out that modules and classes differ in what they can “see”. As long as Foo is defined as a module, this works as expected. If Foo is a class, however, it inherits from Object, which is where toplevel constants are defined, so Foo can “see” every toplevel constant as well.
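The ancestry behind this can be inspected directly. Toplevel constants are stored on Object, every class has Object among its ancestors, and a bare module does not. The following is a minimal sketch in plain Ruby; NamespaceModule and NamespaceClass are made-up names used only for illustration.

```ruby
module NamespaceModule; end
class NamespaceClass; end

# Toplevel constants such as String are stored on Object itself.
puts Object.const_defined?(:String, false)

# A bare module has no ancestors besides itself...
puts NamespaceModule.ancestors.inspect

# ...while every class has Object somewhere in its ancestor chain.
puts NamespaceClass.ancestors.include?(Object)
```

The second argument `false` tells const_defined? to look only at the receiver itself. Without it, const_defined? and const_get consult Object even for modules, so they are not a faithful stand-in for the `Foo::Bar` syntax.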

You can easily try out the difference by comparing the output of the following two commands:

$ ruby -e 'module Foo; end; Foo::String'
-e:1:in `<main>': uninitialized constant Foo::String (NameError)

As expected, we get an error because there is no constant String within the module Foo.

$ ruby -e 'class Foo; end; Foo::String'
-e:1: warning: toplevel constant String referenced by Foo::String

As you might not have expected, Ruby has no problem resolving Foo::String if Foo is a class, even though Foo has no constant String within it. Foo can “see” the toplevel constant ::String, and Ruby uses that instead.
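One caveat if you try this yourself: the outcome depends on the Ruby version. As far as I know, Ruby 2.5 removed this fallback, so newer interpreters raise a NameError for the class case too. The sketch below records whichever outcome your interpreter produces; FooModule, FooClass and lookup_outcome are hypothetical names introduced for this example.

```ruby
module FooModule; end
class FooClass; end

# Returns the constant the block resolves to, or the error class if the
# lookup raises NameError.
def lookup_outcome
  yield
rescue NameError => e
  e.class
end

# The module case fails on every Ruby version.
module_outcome = lookup_outcome { FooModule::String }

# The class case yields String (plus the warning) on Rubies that still
# have the toplevel fallback, and NameError where it has been removed.
class_outcome = lookup_outcome { FooClass::String }

puts "module: #{module_outcome}, class: #{class_outcome}"
```

If the class case prints NameError, the interpreter no longer has the behaviour described here, and the Rails problem below should not occur in this form.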

You can even string together a whole list of class constants and Ruby will always fall back to the toplevel classes (albeit with warnings):

irb(main):001:0> Hash::Array::String::File
(irb):1: warning: toplevel constant Array referenced by Hash::Array
(irb):1: warning: toplevel constant String referenced by Array::String
(irb):1: warning: toplevel constant File referenced by String::File
=> File

The good news is that as long as the correctly namespaced class is defined, the correct class is used:

class Foo; end
class Bar; end

Foo::Bar   # => warning: toplevel constant Bar referenced by Foo::Bar

class Foo::Bar; end

Foo::Bar   # => returns correct class Foo::Bar
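To check whether a namespace really owns a constant, rather than merely “seeing” one through its ancestors, you can ask it directly. This is a small sketch with made-up classes Depot and Parcel; it relies only on Module#const_defined? with its second argument set to false, which restricts the check to the receiver itself.

```ruby
class Depot; end
class Parcel; end

# Before the nested class exists, Depot owns no constant named Parcel.
puts Depot.const_defined?(:Parcel, false)   # prints false

class Depot::Parcel; end

# Now Depot owns its own Parcel, which is distinct from the toplevel one.
puts Depot.const_defined?(:Parcel, false)   # prints true
puts Depot::Parcel.equal?(::Parcel)         # prints false
puts Depot::Parcel.name                     # prints Depot::Parcel
```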

How is this a problem in our original case then? Enter Rails…

Rails’ lazy auto-loading

It all works fine, as the previous example shows, when all classes are loaded. However, by default this is not the case in the Rails development and test environments: for performance reasons, Rails loads classes there only when they are first needed.

To achieve this, Rails overrides Module#const_missing to find out when a class might need to be loaded. It then guesses the filename from the constant name, loads the file and continues execution as if nothing had happened. Rails autoloading relies on #const_missing being triggered, and in our case this never happens because Ruby has already found the toplevel constant. The correct class is never loaded and execution continues with the wrong one. Even worse, whether or not it fails can depend on the order in which the classes are auto-loaded, which adds a non-deterministic component and makes debugging extra fun.
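The mechanism can be illustrated with a toy version. This is not Rails’ actual implementation, only a sketch of the idea: TinyAutoload, its LOADERS registry and the Api namespace are all hypothetical, and a lambda stands in for loading a file from disk.

```ruby
# Toy autoloader: "files" are lambdas keyed by the full constant name.
module TinyAutoload
  LOADERS = {}
  LOADED  = []

  # Ruby calls this hook only when a constant lookup has already failed.
  def const_missing(const_name)
    full_name = "#{self.name}::#{const_name}"
    loader = LOADERS[full_name]
    return super unless loader   # unknown constant: regular NameError

    LOADED << full_name
    loader.call                  # stands in for loading the file
    const_get(const_name, false)
  end
end

module Api
  extend TinyAutoload
end

TinyAutoload::LOADERS["Api::Report"] = -> { Api.const_set(:Report, Class.new) }

Api::Report   # lookup fails, the hook fires and defines the class
Api::Report   # lookup succeeds, the hook is not involved

puts TinyAutoload::LOADED.inspect
```

The hook runs exactly once here. Whenever Ruby resolves the name on its own, whether to the right constant or to a toplevel one it found through Object, the loader never gets a chance to run.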

But let’s get back to our original example with the different JSON serializers. Our issue here was that we had namespaced serializers, but also an ActiveRecord model with the same name as one of the namespaces:

class ShipmentSerializer
  # contains common data structures
end

class Client::ShipmentSerializer < ShipmentSerializer
  # contains data structures specific for the client API
end

class Courier::ShipmentSerializer < ShipmentSerializer
  # contains data structures specific for the courier API
end

class Courier < ActiveRecord::Base
  # ActiveRecord model
end

Now when some code tries to access Courier::ShipmentSerializer, it “asks” Courier for the ShipmentSerializer constant. If Courier::ShipmentSerializer has already been auto-loaded, everything works fine. If it has not, Courier, being a class, also looks in the toplevel namespace for ::ShipmentSerializer and finds the superclass, provided that one has already been loaded. The code continues, but uses the wrong serializer and produces incorrect data.

The Solution

While there is no real solution to this, there are a couple of workarounds:

1. Rename the class Courier
2. Make the class Courier a module (might not be an option)
3. Rename the namespace Courier
4. Introduce an additional root namespace, e.g. Serializer::Courier::ShipmentSerializer
5. Explicitly require 'courier/shipment_serializer' where it is used
6. Disable lazy loading

We chose option 4 (the additional root namespace) in our case, but the right choice very much depends on the situation. In the future I will try to avoid toplevel namespaces whose names match existing classes or are likely candidates for class names later on.
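A rough sketch of what the additional root namespace looks like follows. The plain classes are stand-ins so that the snippet runs without Rails; in the real application Courier is an ActiveRecord model and the serializers build on ActiveModel::Serializers.

```ruby
# Stand-ins for the ActiveRecord model and the shared serializer superclass.
class Courier; end
class ShipmentSerializer; end

module Serializer
  # A module, not the Courier model: it does not have Object as an ancestor.
  module Courier
    class ShipmentSerializer < ::ShipmentSerializer
      # data structures specific to the courier API
    end
  end
end

puts Serializer::Courier::ShipmentSerializer.superclass   # prints ShipmentSerializer
puts Serializer::Courier.ancestors.include?(Object)       # prints false
puts Serializer::Courier.equal?(::Courier)                # prints false
```

Because every step of Serializer::Courier::ShipmentSerializer is resolved from a module, a missing constant cannot be quietly replaced by a toplevel one.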

References

¹ https://makandracards.com/makandra/20633-ruby-constant-lookup-the-good-the-bad-and-the-ugly