Wednesday, 25 July 2012

Constructors that do too much


A flaw I've notice frequently in a legacy code base is constructors which are doing too much work. A common anti-pattern is that a configuration is passed into a constructor. This configuration is then perused to find information so that further objects can be constructed in the constructor. I will discuss why I think this is bad practice.

As an example I have a message dispatcher which delegates to a sender object
to send a message to a destination. A dispatcher usually makes a decision on the appropriate destination based on the message. For this example there is only one endpoint. This destination is to be read from the configuration.

So I have an interface for the sender:


and a dispatcher which uses the configuration to build a sender inside its constructor:


The expected usage for the constructor then becomes:

This is bad on several counts:

  • Abstract data type is passed in. There's no indication how this will be used or what it's used for. It's just a bucket of parameters. If possible a specific type should be used in preference to an ADT. It's all about the domain and context. Check out Domain Driven Design  or GOOS touches on this point.
  • Harder to unit test. I have to make sure configuration is correct so that internal objects can be created properly. In my example I've only one object to consider but they may be several. 
  • I can't mock the behaviour of any objects created by the constructor. In essence that behaviour is hard-wired which may be deleterious to testing if that behaviour is using a real resource i.e. socket or database. It turns the unit test into more of an integration test

Use a factory approach


So instead of passing in a configuration, pass in the already constructed objects. A DI framework such as Spring or Guice will assist you in this regard as it's fundamentally what DI is all about. You inject pre-configured objects into other objects rather than building those objects internally.

If you're not using a DI framework (I'm in this situation as it's legacy code) then it's better to pass a factory which builds the object on the fly as a constructor parameter.

To illustrate this:

I can have a factory that produces Sender objects. I've taken the liberty of introducing a generic factory for this purpose. It builds objects of the specified type for a given configuration.



Now I provide an implementation for the Sender:

And to illustrate the point I rewrite the Sender class:


Finally I rewrite the message dispatcher. The constructor now takes objects rather than configuration


And the usage becomes:



This is a lot better:


  • Now the dispatcher only knows about objects. It does not even know a configuration is used.
  • The transformation of a configuration into object is localized in the factory and not spread out across the code base
  • The class becomes trivial to unit test. Instead of worrying about configurations I can simply create a mock Sender and pass that to the dispatcher. In this way I can focus solely on the behaviour of the Dispatcher itself.


Conclusion

Resist the urge to pass around configuration objects and allowing constructors to build new objects. A constructor should ideally be used for assembly only. It's better to just pass pre-configured objects into a constructor. This makes the intention of that constructor clear and makes it easier to unit test the class in question as the colloborators can be mocked easily.

Sunday, 8 July 2012

Don't forget to refactor

Recently I've been a mentor of Test Driven Development to my colleagues. I find it interesting introducing TDD to developers who've never followed the approach and finding what I take for granted having practised TDD for many years.

TDD can be summarized succinctly: 
  • Write a failing test to demonstrate the required behaviour of the code
  • Write code to make test pass

(Thanks to Guido Maliandi for the diagram.)

I find that the last step is often ignored or is not taken as an equally important step by TDD practitioners. TDD detractors will often point to a lack of up front design and this stage is probably the culprit. I concede that if you pay no heed to design in TDD, you'll end up with a poor codebase. i.e. I've got a green bar, now I'll continue with my next test. That's why the refactoring stage is important if not the most important stage.

When you've got a green bar, you have made a step in the right direction but it's not the last step you should take. The presence of a working unit test gives confidence that the code can be refactored. When I perform the refactoring stage, I look for any code smells  I may have introduced. This is an important consideration as I find TDD newbies dither a lot on whether the code they have written is 'perfect'. The initial goal should be to get that green bar. Once you're there, then you can consider other issues during the refactoring stage. 

The refactoring stage should also be a point where you ask questions of your code at a wider scope:

  • Does the code fit into the overall architecture?
  • Does the code achieve non-functional requirements i.e. performance ?
The thing about writing tests before production code is that you end up without complex classes. The test you write forces you not to write a complex test with lots of stubs. As a result you will create small classes with only one responsibility. Your code will be decoupled, flexible and configurable. You don’t have to worry about S.O.L.I.D. principles. They will naturally emerge.

The refactoring stage is an often overlooked stage. Lack of focus on this stage leads to build of code entropy resulting in technical debt that will eventually need to be repaid. It should be kept in mind TDD is not mainly a testing strategy, but a design strategy. The tests written first results in a better decoupled design. A better decoupled design in turn is easier to refactor. A case of positive reinforcement.  The code will also be easier to maintain and add new features should they be needed.