TIL: "warning: toplevel constant X referenced by Y"
We ran into an interesting issue recently. After a seemingly unrelated
change, a large number of tests failed with messages saying "warning: toplevel
constant X referenced by Y".
The Situation
For a logistics platform we needed multiple JSON-APIs (one for clients, one for couriers). Both are using different data formats and therefore require separate JSON serializers (we are using ActiveModel::Serializers).
The serializers are namespaced and have common structures extracted into a superclass. It looks something like this:
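A minimal sketch of such a layout, so the rest of the post has something concrete to refer to. The attribute names, the sample shipment hash, and BaseSerializer (a plain-Ruby stand-in for ActiveModel::Serializer, so the script runs without the gem) are all assumptions. Both namespaces are written as plain modules here, which is also an assumption of the sketch.

```ruby
# Stand-in for ActiveModel::Serializer (assumption: the real app
# inherits from the gem's class instead).
class BaseSerializer
  def initialize(object)
    @object = object
  end
end

# Common structure shared by both APIs lives in the toplevel superclass.
class ShipmentSerializer < BaseSerializer
  def as_json
    { id: @object.fetch(:id), status: @object.fetch(:status) }
  end
end

module Client
  class ShipmentSerializer < ::ShipmentSerializer
    def as_json
      super.merge(tracking_url: @object.fetch(:tracking_url)) # hypothetical field
    end
  end
end

module Courier
  class ShipmentSerializer < ::ShipmentSerializer
    def as_json
      super.merge(pickup_address: @object.fetch(:pickup_address)) # hypothetical field
    end
  end
end

# Hypothetical sample data, only here to exercise the sketch.
shipment = {
  id: 1,
  status: "in_transit",
  tracking_url: "https://example.test/t/1",
  pickup_address: "1 Example Street"
}

client_json  = Client::ShipmentSerializer.new(shipment).as_json
courier_json = Courier::ShipmentSerializer.new(shipment).as_json

puts client_json.inspect
puts courier_json.inspect
```

Each API gets the shared fields plus its own, which is why being handed the superclass by mistake yields well-formed but incomplete data rather than an error.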
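This worked fine until one day several tests failed unexpectedly. The data returned by the serializers was wrong, and the test output contained warnings of the kind quoted in the title, with ShipmentSerializer as X and a namespaced serializer such as Courier::ShipmentSerializer as Y.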
After some digging we identified the problem was that the code that referenced
either Client::ShipmentSerializer or Courier::ShipmentSerializer suddenly got
handed the superclass ShipmentSerializer by Ruby instead. It did not cause any
runtime errors but did result in incorrectly serialized data.
How could that be? The referencing code clearly references the namespaced class
Courier::ShipmentSerializer, but Ruby gave it a different class. Similar in
name, but clearly different. After all, that’s what namespacing is for, isn’t
it?
Namespacing doesn’t work as you believe
Turns out… namespacing in Ruby works differently than most of us would
expect. When you write Foo::Bar in Ruby, it does not mean Bar within namespace
Foo, but rather Bar as “seen” from Foo. And it turns out that modules and
classes differ in what they can “see”. As long as Foo is defined as a module,
this works well. However, if Foo is a class, it inherits from Object, which is
where toplevel constants are defined, so that class will see the toplevel
constants as well.
You can easily try out the difference by comparing the output of the following 2 commands:
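For the first case, a minimal sketch with Foo defined as a module. The variable names are illustrative. A module's ancestor chain does not contain Object, so there is nowhere for Foo::String to fall through to.

```ruby
module Foo; end

# Toplevel constants such as String are defined on Object,
# and Object is not among a module's ancestors.
module_sees_object = Foo.ancestors.include?(Object)

lookup_error =
  begin
    Foo::String
    nil
  rescue NameError => e
    e
  end

puts module_sees_object
puts lookup_error.message
```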
As expected, we get an error because there is no constant String within the
module Foo.
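For the second case, the same sketch with Foo defined as a class. One caveat: the fallback described in this post applies to Ruby versions before 2.5. Ruby 2.5 removed toplevel constant lookup through Foo::Bar, so on newer versions the lookup raises NameError just like the module case. The rescue keeps the script runnable on either.

```ruby
class Foo; end

# Every class inherits from Object, where toplevel constants live.
class_sees_object = Foo.ancestors.include?(Object)

# Ruby < 2.5: returns ::String and prints the "toplevel constant" warning.
# Ruby >= 2.5: raises NameError, which is captured here instead.
result =
  begin
    Foo::String
  rescue NameError => e
    e
  end

puts class_sees_object
puts result.inspect
```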
As you might not have expected, Ruby has no problem referencing Foo::String if
Foo is a class, even if it doesn’t have a constant String within it. It can
“see” the toplevel constant ::String and uses it instead.
You can even string together a whole list of class constants and Ruby will always fall back to the toplevel classes (albeit with warnings):
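A sketch of such a chain, using three core classes picked purely for illustration. As before, the fallback only happens on Ruby versions before 2.5, where each step emits its own warning. On newer versions the first missing link raises NameError, which the rescue captures.

```ruby
# Ruby < 2.5: resolves step by step to ::Hash, warning at each step.
# Ruby >= 2.5: String::Integer does not exist, so NameError is raised.
chained =
  begin
    String::Integer::Hash
  rescue NameError => e
    e
  end

puts chained.inspect
```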
The good news is that as long as the correctly namespaced class is defined, the correct class is used:
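A minimal sketch of that case. The nested class is hypothetical and exists only to shadow the toplevel String. Since the constant is defined directly on Foo, the lookup never needs to go further.

```ruby
class Foo
  # Hypothetical nested class with the same name as the toplevel ::String.
  class String; end
end

same_as_toplevel = Foo::String.equal?(::String)

puts Foo::String.name
puts same_as_toplevel
```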
How is this a problem in our original case then? Enter Rails…
Rails’ lazy auto-loading
It all works fine, as one can see in the previous example, when all classes are loaded. However, in the Rails development and test environments this is not the case by default: for performance reasons, Rails loads classes only when they are first needed.
To achieve this, Rails overrides Module#const_missing to know when a new class
might need to be loaded. It then guesses the filename from the constant name,
loads the class and passes on execution as if nothing ever happened. Rails
autoloading relies on #const_missing being triggered, and in our case this
does not happen because Ruby has found the toplevel constant instead. The
correct class is never loaded and execution continues with the wrong class.
Even worse, whether or not it fails can depend on the order in which the
classes are auto-loaded, adding a non-deterministic component to make debugging
extra fun.
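The mechanism can be sketched in plain Ruby. This is a toy, not Rails' actual implementation: LOADERS and LOADED are hypothetical names, a lambda stands in for "guess the filename and load it", and the hook is defined on a single module rather than on Module itself. The point it illustrates is that the hook only runs when Ruby's own lookup comes up empty.

```ruby
# Hypothetical registry: constant name => code that defines it.
# In Rails this step would be "derive a filename and load the file".
LOADERS = {
  "Courier::ShipmentSerializer" => -> { Courier.const_set(:ShipmentSerializer, Class.new) }
}.freeze

LOADED = [] # records which constants the hook had to load

module Courier
  # Ruby calls this only when its normal lookup of Courier::<name> fails.
  def self.const_missing(name)
    full_name = "#{self.name}::#{name}"
    loader = LOADERS[full_name]
    return super unless loader

    LOADED << full_name
    loader.call
  end
end

first  = Courier::ShipmentSerializer # lookup fails, so the hook loads the class
second = Courier::ShipmentSerializer # constant exists now, so the hook is skipped

puts LOADED.inspect
puts first.equal?(second)
```

If Ruby resolves the constant some other way, as with the toplevel fallback described above, the hook is skipped in exactly the same manner as on the second reference here, and nothing gets loaded.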
But let’s get back to our original example with the different JSON serializers. Our issue here was that we had namespaced serializers, but also an ActiveRecord model, Courier, with the same name as one of the namespaces.
Now when some code tries to access Courier::ShipmentSerializer, it would “ask”
Courier for the ShipmentSerializer constant. If Courier::ShipmentSerializer
is already auto-loaded, everything works fine. If it is not, Courier would
also look in the toplevel namespace for ::ShipmentSerializer and find the
superclass if it was already loaded. The code continues, but uses the wrong
serializer and produces incorrect data.
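The load-order dependence can be made visible without relying on a particular Ruby version. In the sketch below, constant_owner is a hypothetical helper that walks the ancestor chain, which is the path a Courier::ShipmentSerializer lookup took on Ruby versions before 2.5. The empty Courier class is a stand-in for the ActiveRecord model.

```ruby
class ShipmentSerializer; end # shared superclass, already loaded
class Courier; end            # stand-in for the ActiveRecord model

# Hypothetical helper: which ancestor of `scope` directly defines `name`?
def constant_owner(scope, name)
  scope.ancestors.find { |ancestor| ancestor.const_defined?(name, false) }
end

# The namespaced serializer is not loaded yet, so the only match is the
# toplevel constant, which lives on Object.
owner_before = constant_owner(Courier, :ShipmentSerializer)

# Simulate the namespaced serializer being loaded later on.
class Courier
  class ShipmentSerializer < ::ShipmentSerializer; end
end

owner_after = constant_owner(Courier, :ShipmentSerializer)

puts owner_before.inspect # Object
puts owner_after.inspect  # Courier
```

The same reference resolves to two different classes depending on whether the namespaced file happened to be loaded first.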
The Solution
While there is no real solution to this, there are a couple of workarounds:
1. Rename the class Courier
2. Make the class Courier a module (might not be an option)
3. Rename the namespace Courier
4. Introduce an additional root namespace, e.g. Serializer::Courier::ShipmentSerializer
5. Explicitly require 'courier/shipment_serializer' where it is used
6. Disable lazy loading
We decided on option 4 in our case, but this is very much case-dependent. In the future I will try to avoid using toplevel namespaces that are the same as existing classes or that are likely candidates for class names.
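A sketch of what the additional root namespace from workaround 4 buys, again with empty stand-in classes for the model and the shared superclass. Because Serializer::Courier is a module and not the model class, Object is no longer among its ancestors, so a lookup through it cannot fall through to a toplevel constant.

```ruby
class Courier; end            # stand-in for the ActiveRecord model
class ShipmentSerializer; end # shared superclass

module Serializer
  module Courier
    class ShipmentSerializer < ::ShipmentSerializer; end
  end
end

namespace_is_model   = Serializer::Courier.equal?(::Courier)
namespace_sees_object = Serializer::Courier.ancestors.include?(Object)

puts namespace_is_model
puts namespace_sees_object
puts Serializer::Courier::ShipmentSerializer.superclass.inspect
```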
References
- https://makandracards.com/makandra/20633-ruby-constant-lookup-the-good-the-bad-and-the-ugly
- http://urbanautomaton.com/blog/2013/08/27/rails-autoloading-hell/
- https://github.com/rails/rails/issues/6931#issuecomment-6703968