How I handle equals() and hashCode() in my project


Posted by Steven

Some of the problems in everyday software development are luxury problems, like working on a greenfield project where you get to choose how to implement equals() and hashCode(). However, this decision will have a major effect on the project down the road. So here are the options I found and my decision-making process.

In the excellent article How to implement Equals and HashCode for JPA entities, Vlad Mihalcea sums up the options before deciding that business keys are the best way to go. My first instinct, using database IDs, is apparently a bad idea. A freshly created object that doesn't have an ID yet would be different from an object that is loaded from the database, even if they represent the same thing. Hence, Vlad suggests using a business key, which is a "combination of fields that's unique among Entities". This "business key must be set from the very moment we are creating the Entity and then never change it." As an example, he shows a Company class in which the name of the company is a never-changing property. To be honest, I think this is a bad example, because we've all seen things change although the customer swore "This will never ever change!" I think the name of a company is quite likely to change at some point, even for big companies like Krupp, which is now ThyssenKrupp. However, I do understand the need for such a never-changing attribute.
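To make the problem with database IDs concrete, here is a minimal plain-Java sketch. The class `IdBasedCompany`, the demo class and the ID value 42 are made up for illustration, and assigning `id` by hand only simulates what the persistence provider would do when the entity is saved.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based only on the database ID.
class IdBasedCompany {
    Long id;            // stays null until the entity is persisted
    final String name;

    IdBasedCompany(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdBasedCompany)) return false;
        return Objects.equals(id, ((IdBasedCompany) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

class IdBasedEqualityDemo {
    // Returns true if the set still finds the entity after its ID was assigned.
    static boolean foundAfterIdAssigned() {
        Set<IdBasedCompany> companies = new HashSet<>();
        IdBasedCompany fresh = new IdBasedCompany("Krupp");
        companies.add(fresh);   // stored under the hash of a null ID
        fresh.id = 42L;         // simulates the ID assigned on persist
        return companies.contains(fresh);
    }

    // Returns true if a fresh object equals its already persisted counterpart.
    static boolean freshEqualsLoaded() {
        IdBasedCompany fresh = new IdBasedCompany("Krupp");
        IdBasedCompany loaded = new IdBasedCompany("Krupp");
        loaded.id = 42L;        // simulates an entity loaded from the database
        return fresh.equals(loaded);
    }
}
```

Both methods return false: the set loses track of the object once its hash code changes, and the fresh object is not equal to the loaded one although both stand for the same company.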

Vlad suggests the following implementation:

    @Column(unique = true, updatable = false)
    private String name;

If this field will never change, I think it would be wise to have a constructor that takes all those non-updatable fields, so it's not possible to construct an object without them. Theoretically, there shouldn't be setters for those fields. However, frameworks like Spring Data JPA need a default constructor and getters/setters to handle the bean correctly.
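A rough sketch of what I have in mind, written as plain Java so it runs without a persistence provider. It assumes field-based access, so that the framework can populate `name` without a setter; in a real entity the class would carry @Entity and the field would carry the @Column mapping from above.

```java
import java.util.Objects;

// Plain-Java sketch of an entity with a mandatory, non-updatable business key.
// The JPA annotations are left out on purpose to keep the example self-contained.
class Company {
    private String name;   // business key: set once, never changed

    // No-arg constructor kept only for the persistence framework;
    // protected so that application code cannot call it by accident.
    protected Company() {
    }

    public Company(String name) {
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    public String getName() {
        return name;
    }

    // Deliberately no setName(): the business key cannot change after construction.
}
```

Application code has to go through `new Company("Krupp")`, while the framework can still instantiate the class through the protected constructor.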

This article provides a nice decision matrix which supports using business keys for equals() and hashCode().

To avoid the famous LazyInitializationException, references that are annotated with FetchType.LAZY should not be used in equals() and hashCode().

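Also, with the EqualsBuilder and HashCodeBuilder from Apache Commons Lang, the resulting methods stay short:

    // Note: the reflectionEquals/reflectionHashCode overloads that take a
    // collection treat it as a list of field names to EXCLUDE, so the
    // business key is appended explicitly here instead.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Company)) return false;
        Company that = (Company) o;
        return new EqualsBuilder().append(name, that.name).isEquals();
    }

    @Override
    public int hashCode() {
        return new HashCodeBuilder().append(name).toHashCode();
    }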

In my situation it’s very easy to change my small existing codebase to follow Vlad's suggestion. However, I will have to write tests to really believe that this change doesn't break anything.
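The kind of checks I mean could look roughly like this. This is only a sketch: `KeyedCompany` and `BusinessKeyChecks` are made-up names, the equality uses java.util.Objects instead of the Commons Lang builders to stay dependency-free, and setting `id` by hand stands in for a real persist call.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality is based on the business key only.
class KeyedCompany {
    Long id;                    // assigned later, ignored by equals/hashCode
    private final String name;  // business key

    KeyedCompany(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof KeyedCompany)) return false;
        return Objects.equals(name, ((KeyedCompany) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class BusinessKeyChecks {
    // The entity must still be found in a set after its ID has been assigned.
    static boolean survivesPersist() {
        Set<KeyedCompany> companies = new HashSet<>();
        KeyedCompany company = new KeyedCompany("Krupp");
        companies.add(company);
        company.id = 42L;       // simulates the ID assigned on persist
        return companies.contains(company);
    }

    // A fresh object and a "loaded" one with the same key must be equal
    // and must produce the same hash code.
    static boolean freshEqualsLoaded() {
        KeyedCompany fresh = new KeyedCompany("Krupp");
        KeyedCompany loaded = new KeyedCompany("Krupp");
        loaded.id = 42L;        // simulates an entity loaded from the database
        return fresh.equals(loaded) && fresh.hashCode() == loaded.hashCode();
    }
}
```

Both checks should return true. In the real project they would be unit tests that run against the actual entities and a test database.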

TL;DR

Use a combination of business keys for equals() and hashCode().

(Photo: https://commons.wikimedia.org/wiki/File:Two_different_shoes_on.jpg)


Comments

This is seriously backward.

1. The original article claims you can't use the id / primary key because it only gets set when the entity gets persisted. Yes, this is true on the technical level, BUT
a) If this matters to you, just use GUIDs and be done with it, or set the id before you use hashCode and equals.
b) The whole point of separating business key and primary key is: the technical id does not change, while the business key is free to change. If you think you can find a non-changing business key for all your entities, I kindly ask you to give me the winning lottery numbers for the upcoming month. Much better ROI for this kind of prophetic skill.

2. The article claims that equals and hashCode have to work across different states of an entity, like entities from different sessions. If you are working with entities from different sessions, you had better know JPA extremely well, because otherwise you are going to learn a lot of stuff about it you never wanted to know in the first place.

As always, there are exceptions to the rule, but in all applications I have seen so far the following worked well, and most importantly, the limitations were easy to understand:

1. Equality is based on the id.
2. If the id is null, two entities are different.
3. You must not use an entity in a set or similar before its id is set.

If you ever have two entities in scope that are connected to two different sessions, this is considered a bug, unless it comes with a lengthy explanation of why it has to be this way.


Because this article has been retweeted by @Hibernate: As Jens mentioned in the first comment, the method mentioned in this article is not necessarily the best choice. We discussed the options together and compared pros and cons. However, I didn't have time to mention this here.