Summary
I've been troubled by some recent discussions about C# and Java language features. Most people in the discussions seem to lean towards making methods non-virtual by default. Let's reconsider that.
Over the past few weeks there have been many discussions about final, virtual, and other keywords that enable or disable method overriding in OO languages. There have been discussions on the pragmatic programmer mailing list and here on Artima, mostly in response to interviews with Anders Hejlsberg and Joshua Bloch. Many people seemed to lean towards making methods non-virtual by default, and with the best of intentions. After all, virtual methods don't seem quite as safe. When a method is non-virtual, you can quickly narrow down what code is called.
No one wants to do something that is error-prone, and everyone wants to prevent errors, but the funny thing is that attempting to prevent some errors can be counter-productive. I think there is an aspect to safety that we often don't see when we are designing: a poor default can lead to trouble down the road.
Let's think about a typical development situation. You and I sit down to
add a feature to a system. We notice that we need to modify an existing class and
design two or three more. We'll also have to modify two of the methods in
the class so that it can delegate to the new classes and merge some
results back into the state of the object. The code looks pretty complicated
and we know that we could make a mistake so we decide to put
the class into a test harness and characterize its behavior. It would be very nice if it already had some test cases. We could probably
get the whole change done in about half the time, but whoever wrote the code
didn't write tests so we have some real work ahead of us.
We sit back and look at the class we need to put in the harness. It's pretty
obvious why there isn't a set of tests for it already: it's untestable. It uses a couple of APIs directly, and it would be problematic to interface with them in a test harness. Luckily, we notice that those uses are well-encapsulated. For instance, the class needs to use the mail API
but it uses it only in one place: a method named sendNotificationMessage().
Immediately, we know what we can do. We create a subclass for testing. All we do within it is override the sendNotificationMessage()
method, giving it an empty body. We don't want to send email while our tests run, and
that little piece of work, writing the subclass, will let us write some tests without
modifying the production code. Instead of testing the production class, we test
our new subclass. It inherits all of the code from our original class, but it nulls out behavior that isn't important to the tests. After we've characterized the class, we write tests for the changes we need to make, make the changes, run the tests, and confidently check in our code. In a language like Java, Smalltalk, Ruby, or Python, that's all there is to it.
Now let's imagine a similar situation in a language which uses non-virtual methods
by default. We can do the same thing. We can override that particular method, but
we have to make it virtual first. In C++ this can be problematic because it
can silently change the behavior of the code. All methods with the same signature
in subclasses will automatically become virtual. If your program relies on the
non-virtualness of that method (which isn't a good idea, but has been seen in the field), you could be in trouble. In C#, the situation is much better, but you are left with a new virtual function in your class. People will look all over your production code for the reason why, never realizing that it was made virtual for testing. It is kind of a shame that our code has to have this little misleading
wart just because we wanted to do something good: write some tests to be confident
in our changes. In Java, that method would look like any other and there would
be less head-scratching involved.
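To make that concrete, here is a minimal C# sketch of the subclass-for-testing technique. MessageProcessor and its methods are hypothetical stand-ins for the class described above, not code from any real system:

    // Production class. The method is virtual only so that tests can
    // stub it out -- the misleading "wart" discussed above.
    public class MessageProcessor
    {
        public void Process()
        {
            // ... complicated logic we want to characterize with tests ...
            SendNotificationMessage();
        }

        protected virtual void SendNotificationMessage()
        {
            // talks to the mail API directly in production
        }
    }

    // Lives only in the test harness; it inherits everything else unchanged.
    public class TestingMessageProcessor : MessageProcessor
    {
        protected override void SendNotificationMessage()
        {
            // empty body: no email is sent while the tests run
        }
    }

The tests exercise TestingMessageProcessor, which behaves exactly like the production class except that notification is nulled out.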
Interestingly, if the original developers had written test cases for the class as they developed it, their encapsulation would have allowed testing. Chances are, they would have had virtual methods in the class because they would have run into the same problem we did, only sooner. Yet again, readers of the class would be scratching their heads because they wouldn't know why a particular method is virtual. Head-scratching is pretty easy to discount, but over time it leads to trouble.
I once looked at some slides a co-worker got from a testing tutorial. In one they
showed a tester bewildered by a black-box object that checked the current time
and date, the phase of the moon and the temperature outside, and there wasn't any way to instrument those conditions for testing. The slide said, "Sometimes Objects Encapsulate Too Much." That is a pretty exaggerated case, but the message does ring true for a lot of code.
The protection that people design into their classes to prevent misuse seems to
make them safer, but does it really? Is the code really safer if you can't check
its behavior while you change it? I don't think so. Code is an interesting form
of material. Unlike metal or concrete, it doesn't break when it is used too much; it only breaks when people change it. Anything you can do to make it easier to understand and easier to test helps keep the bugs out of your code when you make changes. Often that means that your classes have to be flexible enough to live in two environments: your application and your test harness. And often that means that you end up using virtual methods because they are more flexible.
Needless flexibility is a bad thing, but virtual-ness is pretty much the foundation of evolution in OO systems. Often I've helped teams meet new requirements by taking classes full of static methods and converting them to instance methods, making it possible to plug in new behavior. Sometimes that idea just doesn't occur to people. I suspect that having more concepts related to "overriding" will just make evolutionary paths harder for many people to see.
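As a sketch of that kind of conversion, from static methods to overridable instance methods, with names and rates invented purely for illustration:

    // Before: a bag of static methods -- no way to substitute behavior.
    public static class TaxUtil
    {
        public static decimal CalculateTax(decimal amount)
        {
            return amount * 0.07m;
        }
    }

    // After: instance methods open an evolutionary path. Callers hold a
    // TaxCalculator reference, so new behavior can be plugged in later.
    public class TaxCalculator
    {
        public virtual decimal CalculateTax(decimal amount)
        {
            return amount * 0.07m;
        }
    }

    public class RegionalTaxCalculator : TaxCalculator
    {
        public override decimal CalculateTax(decimal amount)
        {
            return amount * 0.05m;  // region-specific rate
        }
    }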
Can virtual methods cause trouble in the face of versioning, incremental deployment, and published APIs? Absolutely, but thank goodness those are separable problems. Yes, some teams create and deploy libraries and frameworks for use by other teams. They have to be very careful that they don't break code that depends on their interfaces.
But, not every team is in that situation and even the teams who are have core code, code that no external clients directly depend upon. I'm pretty sure that there is more core code out there than API code. To me, languages which make methods virtual by default are more in line with what projects need. They certainly aid quality.
I can only think of one thing that would tip me towards default non-virtuals. If we had languages or tools which allowed us to have different access to classes in test and production environments, a default of non-virtual might make sense. With aspects, classloader tricks, and reflective tools we're closer to that than we've been before.
Wouldn't it be better to encapsulate sendNotificationMessage() in a class or interface that was external to the Class Under Test? In that case, it would not be surprising to see this method be virtual, and you could test it using Mock objects. You would get an extra class in your design, but isn't this already a concept that you want to make more explicit anyway?
Overriding a method whose writer did not intend it to be overridden seems to me to always be a bad idea.
Yes, I agree. The sendNotificationMessage() method seems more like a cohesion problem with the original class. The tests should depend only on the mechanism that triggers sendNotificationMessage(), so perhaps the class could be changed to have a callback?
> Wouldn't it be better to encapsulate sendNotificationMessage() in a
> class or interface that was external to the Class Under Test? In that
> case, it would not be surprising to see this method be virtual, and
> you could test it using Mock objects. You would get an extra class in
> your design, but isn't this already a concept that you want to make
> more explicit anyway?
>
> Overriding a method whose writer did not intend it to be overridden
> seems to me to always be a bad idea.
You can put another class in there (call it a Mailer), but you are still left with the same problem: in production it has to work, and under test it should be disabled. You can parameterize the class to accept a mailer when you construct it, but that shifts the dependency to the class that constructs your class.
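Here is a rough sketch of that parameterization; IMailer, SmtpMailer, and NullMailer are invented names for illustration:

    public interface IMailer
    {
        void Send(string message);
    }

    public class SmtpMailer : IMailer
    {
        public void Send(string message) { /* real mail API call */ }
    }

    public class NullMailer : IMailer
    {
        public void Send(string message) { /* deliberately does nothing */ }
    }

    public class MessageProcessor
    {
        private readonly IMailer mailer;

        // The dependency has moved: whoever constructs a MessageProcessor
        // must now choose a mailer.
        public MessageProcessor(IMailer mailer)
        {
            this.mailer = mailer;
        }

        public void Process()
        {
            // ... do the work ...
            mailer.Send("done");
        }
    }

    // Production: new MessageProcessor(new SmtpMailer());
    // Tests:      new MessageProcessor(new NullMailer());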
There are a lot of tradeoffs, but the key thing is that overriding is very practical when you want to get some coverage in code that wasn't developed test first. If it is code you don't understand very well, minimal modification is a good rule.
Yep! In a previous "life" developing in C++ (many years ago) we had a strict rule that all methods had to be declared virtual unless there was a damn good reason not to.
Johannes: I never understood how someone can intend a method to be overridable or not. That implies that the person has psychic powers. How else could he know about any future need or use? If I want to abuse or break your code by overriding it, it is my responsibility. I am the one to blame, not you. You have done your job; I failed to do mine.
There is another default in C# which is the cause of a more subtle bug: the 'new' keyword. If I decide to define a method with a name that matches a superclass's non-virtual method, then my class in reality has two methods: my new method and the inherited one.
The problem appears when you access a subclass through a superclass reference, like this:
    foreach (MySuper s in myList) {
        s.DoSomething();
    }
If MySuper.DoSomething() is non-virtual, then s.DoSomething() will never call MySub.DoSomething(), even when s references an instance of that class.
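A small, self-contained sketch of the pitfall; the class names follow the example above, and Program/Main are just scaffolding to make it runnable:

    using System;

    class MySuper
    {
        public void DoSomething() { Console.WriteLine("MySuper.DoSomething"); }
    }

    class MySub : MySuper
    {
        // 'new' hides the base method rather than overriding it.
        public new void DoSomething() { Console.WriteLine("MySub.DoSomething"); }
    }

    class Program
    {
        static void Main()
        {
            MySuper s = new MySub();
            s.DoSomething();  // prints "MySuper.DoSomething": the call is
                              // bound statically to the declared type
        }
    }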
It is my understanding that having virtual methods in .NET incurs a 'penalty' in performance as virtual tables are created and maintained during execution. I am not sure if this is scary propaganda issued by the non-virtual default believers, or if it really has merit.
I agree that a better solution is in the treatment by the language itself. Until then, compiler support would allow the dual personality of code during testing and production. For C# in VS.NET, one could write:
    public class Logger {
        public
    #if TEST
        virtual
    #endif
        void WriteToDisk(string message) { ... }
    }
It is ugly and complicates the code, therefore making it an unfit solution. If the IDE supported the idea of making classes virtual by default in TEST mode, and non-virtual in production, perhaps we wouldn't have to wait for a new language. Could this be done in VS.NET with an Add-In that "alters" the code prior to compilation?