The Dark Side of Duck Typing
On the Origin of Species
What walks like duck and quacks like a duck is a duck. That what duck typing is about. It makes development easier. We don’t need interfaces and abstract classes. Many object oriented patterns look much simpler because of such dynamic typing. They come more naturally in ruby code and often we don’t even notice them until we see explicit names like Visitor, Decorator, etc.
After I’ve got used with such a powerful tool, it was too easy to step into dangerous territory without even noticing. Sometimes things do walk and quack like ducks all the time… well, almost all the time. And sometimes in a pack of ducks there are some species that do quack (and that makes you feel safe), but that doesn’t make them ducks.
When all ducks are quacking well
Making real world examples is hard, and the following one is somewhat synthetic. But anyway. Let’s suppose that we have a system with Project model and our system should generate reports by aggregating project instances. We have different types of parametrized reports, and these types share some common features. Let’s see how it fits in code.
class Project < ActiveRecord::Base has_and_belongs_to_many :categories has_many :items end class ReportFilter < ActiveRecord::Base def filter_project_ids(project_ids) # implemented in descendants end end class DueDateFilter < ReportFilter def filter_project_ids(project_ids) irrelevant_projects = Project.find(project_ids, :conditions => ['due_date > ?', self.cut_off_date]) project_ids - irrelevant_projects.collect(&:id) end end class ComplexFilter < ReportFilter def filter_project_ids(project_ids) connection.select_values <<-SQL SELECT p.id FROM projects p JOIN project_estimates e ON (e.project_id = p.id) JOIN metrics m ON (p.id = m.project_id) WHERE m.coverage_level < average_coverage_in_group(p.group_id) AND e.author_id IN (SELECT id FROM team_members WHERE statisfaction_level > #{desired_level}) AND p.id IN (#{project_ids.join ','}) SQL end end class CategoryReport < ActiveRecord::Base belongs_to :category has_many :report_filters, :order => 'complexity_level' def projects project_ids = category.project_ids report_items.each do |i| project_ids = i.filter_project_ids(project_ids) end Project.find(project_ids) end end
Note how CategoryReport generates it’s project set. It iterates through it’s project filters to trim category project list down to some relevant report specific subset. All the report filters implement common interface. Each filter accepts project_ids array and reduces it according to its parameters and semantics. Some filters are fairly easy to calculate, other can be computationally complex. We run simple filters first to reduce set and then refer to complex ones.
Most of our filter implementations should look like DueDateFilter. But we live in real world, and several filters look like ComplexFilter. Don’t look deep into SQL query in it, it was added for dramatic effect only. My point is that due to some (often premature) optimization, complex filters can implement some real black magic under the hood.
And here comes the time for a closer look
Let’s suppose that we always run ComplexFilter as the last filter in our chain. But one day a very good reason for changing that order arises. We place DueDateFilter after ComplexFilter. System works, all the filter rspecs pass, and rspec for Report passes as well. It checks Report behavior with great success by mocking everything except Report (a model’s rspec should be focused on the model it’s testing, right?)
Some time passes and we discover a great surprise. It appears that reports have been working incorrectly for some time now. Some filters that come after ComplexFilter don’t filter projects any more. No exceptions are raised. They’ve just silently became transparent.
Please note how DueDateFilter operates on array of ids that are Fixnums. Then note how ComplexFilter returns an array of strings. They’re still ids, but that doesn’t help when it comes to…
project_ids - irrelevant_projects.collect(&:id)
… in other filters.
And a few reflections
A type of bug that is easy to fix, easy to debug, but hard to predict. Who would notice such a thing on code review when all these filters did work previously? And another big question: what can we do to make such bugs harder to sneak into our code?
Well, I don’t know. There’s probably no silver bullet for it. But there’s something we can do however. It’s not new, in fact it’s plain old OOD.
Arrays of fixnums (and arrays of strings) are not domain models for our system. Yes, we don’t want to instantiate full projects for the sake of performance, but passing integers around instead of project models is not the only option that we have.
class ProjectSet def initialize(ids) @project_ids = [] add(ids) end def add(ids) @project_ids |= ids.collect(&:to_i) self end def exclude(ids) @project_ids - ids.collect(&:to_i) self end def project_ids @project_ids.dup end def projects Project.find(@project_ids) end def to_sql_list return 'NULL' if @project_ids.empty? @project_ids.join ',' end end
Looks like we’ve avoided the temptation to implement domain specific operations through overriding operators (especially arithmetic). Cool books say that’s not something that grownups do very often. So our methods are named add and exclude. Now we can change our report code to pass an instance of ProjectSet and have a bit more control of operations that are essential to our problem domain.
ProjectSet adds the following benefits:
- we can encapsulate type checks and explicit conversions in it, and keep filter code clean of such details
- we can raise an exception if supplied ids are of incorrect type and be informed about problems earlier by QA (500-page is easier to notice than tricky changes in reports behavior)
- we can add other specific code that otherwise would be scattered across different filters (note how
to_sql_listhandles empty lists) - we can open paths to other caching/performance optimizations that can be hidden under the hood of
ProjectSet
Not all .rb files that live under app/models directory have to be AR models. Many applications can benefit from having additional models. And those can be no less domain specific than conventional ActiveRecord::Base descendants.

Comments are closed for this entry.