It is not a overstatement to say that I am an enthusiast when it comes to object orientation. A well crafted object oriented system is a piece of art. However, following all the principles of good object-oriented design can be challenging. This article should show why we can never let our guards down when programming object oriented systems. It turned out to be a lot longer than I initially intended, but please bear with me on this one - it will be worth your time.
The Code
In one of my projects, I needed to do some mathematical coding, in particular I needed to work with pairs. Even though org.apache.commons.lang3
has excellent classes for tuples (and I would strongly recommend to use them!) I needed some extra bells and whistles and I ended up writing them on my own.
Here's what my pair class roughly looked like (without the additional stuff that I needed):
public class Pair<A,B> {
private final A first;
private final B second;
public Pair(A first, B second){
Objects.requireNonNull(first);
Objects.requireNonNull(second);
this.first = first;
this.second = second;
}
// getters for first and second
public int hashCode() {
return this.first.hashCode()*31 + this.second.hashCode();
}
public boolean equals(Obj other){
if(other == null) { return false; }
if(other == this) { return true; }
if(other instanceof Pair == false) { return false; }
Pair<?,?> p = (Pair<?,?>)other;
return this.first.equals(p.getFirst())
&& this.second.equals(p.getSecond());
}
}
The above class works fine in its own right. Later down the road in my project I also needed to work with triples. Following the "Don't Repeat Yourself" (DRY) principle, I implemented it as follows:
public class Triple<A,B,C> extends Pair<A,B> {
private final C third;
public Triple(A first, B second, C third) {
super(first, second);
Objects.requireNonNull(third);
this.third = third;
}
// getter for third
public int hashCode() {
return super.hashCode()*47 + this.third.hashCode();
}
public boolean equals(Obj other){
if(other == null) { return false; }
if(other == this) { return true; }
if(other instanceof Triple == false) { return false; }
Triple<?,?,?> t = (Triple<?,?,?>)other;
return this.first.equals(t.getFirst())
&& this.second.equals(t.getSecond())
&& this.third.equals(t.getThird());
}
}
I managed to reuse two fields and a bunch of other stuff, nice! ... or so I thought. Spoiler alert: the code above has a massive flaw. Hats off to you if you can spot it outright. I did not.
Here's the deal: The code above breaks an essential contract of Java programming.
The Issue
A long time after writing these classes, I was searching for an error in a code base of approximately 100.000 lines of code, some of which was quite complex. The code was behaving erratically in certain cases, my JUnit tests were complaining. But I couldn't find the issue.
Eventually, after hours of debugging, I arrived at the following statement:
// complex logic here ...
Set<Pair<String, Long>> pairs = calculate();
Pair<String, Long>> pairToSearchFor = ...;
if(pairs.contains(pairToSearchFor)){
// ... path A
} else {
// ... path B
}
Intense debugging revealed that in this method, path A
was never invoked. And I was puzzled why that was the case, clearly the pair I was looking for in my debug scenario should have been in the set. I checked the entries of the set in the debugger, and then it suddenly dawned upon me.
The set contained only Triple
s.
A Triple
can be used in place of a Pair
in our implementation, the Java compiler will not complain. But here's the issue. Let's assume we have two tuples:
Pair<String, Long> pair = new Pair<>("Hello", 123L);
Triple<String, Long, Long> triple = new Triple<>("Hello", 123L, 144L);
pair.equals(triple); // returns true
triple.equals(pair); // returns false
We have broken the contract of Object#equals(...)
. This is why the entry was not found in the hash set. Object#equals(...)
(like any other equality relation) has to be symmetric, reflexive and transitive. In our example, it's not symmetric. We syntactically fulfilled the contract (i.e. we implemented the method in a way that made javac
happy), but we semantically violated it. It might seem entirely obvious in the isolation of this blog post, but good luck detecting this error in a large code base.
Where did I go wrong?
My first urge was to run for a fix and adjust the equals
method of Pair
. Clearly a triple could never be equal to a pair. But Triple
is a subclass of Pair
, so instanceof Pair
would not help to distinguish these two. other.getClass().equals(Pair.class)
did not feel right for me either. I realized that I had no chance to make this equals(...)
method work without making it aware of its subclasses. And according to OOP design rules, a class should never be aware of its subclasses (unless it is abstract
).
So what the heck? Did we not follow each and every best practice there is? Did we find a flaw in the very principles of Object Orientation? Did we break the matrix?
The solution
When I applied the DRY principle to share the first
and second
fields, I focused exclusively on syntax. I followed this principle because it was easy. Not repeating those fields seemed like a good idea. However, I forgot the one principle of object orientation that overrules all else when it comes to inheritance. Only use inheritance if the sub-class is in fact a semantic specialization of the base class. Inheritance can be a tool to achieve a DRY design. But it's not the only one, and it is not always applicable.
Do Triple
and Pair
share two fields? Well yes, they do! But that's just syntax!
Is every Triple
also a Pair
in each and every case? Certainly not!
The second conclusion needs to be true
in order to properly employ inheritance. However, it requires far more mental effort to check than the first one. It is about the semantics of your class. javac
can't help you here. The essential flaw in my thinking was that I focused on writing as few lines of code as possible, but I ended up violating not only the contract of Object#equals(...)
but also one of the core rules of object orientation - without even noticing at first. The rule which is violated here is known as the Liskov Substitution Principle.
Message in a bottle
The take-away messages from this article are:
- Syntax is a means to an end in OOP. Focus on the semantics in your decisions.
- Rules, principles and best practices can be contradicting each other at times. Make sure that you are aware of which one is more important than the other.
- Never use inheritance simply for the sake of syntactically sharing common fields or methods.
- Never just implement an interface blindly. Read the f***ing manual.
- Never take the implementation of
Object#hasCode()
orObject#equals(...)
lightly. It will come back to haunt you.
Thanks for reading, I hope this post will one day save somebody a lot of hours wasted in the debugger.