There are two methods hashCode() and equals() present in Object class which are important and should be implemented in class. A Java developer should know the basic concept behind these methods. And sometimes even in technical interview questions are asked related to these. This get more important when object is used in hash based Collection classes (Hashtable and HashMap). If these methods are not implement then Collection classes may not work properly because default hashCode() method implementation typically returns internal address after converting to 32 bit integer and equals() method just compares the hashCode and completely ignores the member variables.

hashCode() contracts

hashCode() implementation should conform with the following contracts.

  • Whenever this method is invoked on the same object during execution of Java application, hashCode must consistently return the same integer value, provided that no member variable/field used in euqals comparisons on the object is modified. This integer not required to be same during multiple execution of the same application.
  • For immutable class, hashCode should be cached and lazy initialized. This saves execution time.
  • If a class override euqals() it must override hashCode().
  • hashCode() and equals() should use same set of member variables/fields
  • If two objects are equal according to their equals() method, they must have same hashCode
Note:
hashCode doesn’t provide unique object identifier. Two different objects can have the same hashCode. hashCode shouldn’t be misused as a key.

hashCode algorithm

Main focus of the algorithm is to minimize the hashCode collision. Means ideally two different object should have different hashCode. This algorithm can be followed by most of the Java classes.

  1. Initialize hash with initial prime number
  2. Take another prime number for multiplier, This should be different than initial prime number
  3. Compute the hashCode for each member variable/field and add in intermediate hash. This should be repeated for all member variable which are part of equals() method.
  4. If class is immutable then store the hash, which servers the purpose of lazy initialization
  5. Return the final hash

An example of this algorithm for Employee class is as following.

public class Employee {
	private int employeeId;
	private String name;
	private String title;
	private String dept;


  /** {@inheritDoc} */
  @Override
  public int hashCode() {
		int result = 17;
		final int prime = 31;
		result = prime * result + ((dept == null) ? 0 : dept.hashCode());
		result = prime * result + employeeId;
		result = prime * result + ((name == null) ? 0 : name.hashCode());
		result = prime * result + ((title == null) ? 0 : title.hashCode());
		return result;
	}
}

In this method extra care has been taken care, i.e member variable is checked for null to make sure when hashCode method is called it shouldn’t throw NullPointerException. In case member variable is null, we can assume its hash value is 0. Let’s see hashCode implementation of String class.

 public int hashCode() {
        int h = hash;
        if (h == 0 && count > 0) {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

This implementation doesn’t follow the exactly same algorithm mentioned above but similar algorithm. This method takes only prime multiplier and multiplies the intermediate hash with character present at i’th index. This implementation also does lazy initialization, it calculates the hash only when hash is not calculated before (hash == 0) otherwise it returns cached hash value. It saves the execution time. This computation follows the following formula.

h(s) = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
One interesting thing to observe here is that strings “AaBB” and “BBAa” fetch the same hashCode 2031744.

'A' = 65
'B' = 66
'a' = 97
h('AaBB') = 65*31^3 + 97*31^2 + 66* 31 + 66 = 2031744
h('BBAa') = 66*31^3 + 66*31^2 + 65* 31 + 97 = 2031744

This is perfectly fine as we know one of the rules says that the hashCode of different object can be same. Let’s see one more example, this time we are going to check simplest hashCode() implementation of Integer class.

public int hashCode() {
    return value;
}

You may be confused but this implementation does follow all the rules.

hasCode() implementation and performance

A well written hashCode() method can drastically improve the performance by uniformly distributing the hashCode and minimizing the hashCode collision. All Hash based collection classes store the hashCode in an array to get the initial lookup position. These lookup positions are called bucket these buckets are linked with list which contains the object which has same hashCode. Once the bucket is identified then list is linearly scanned useing equals() to find the right key. This check is linear if there are lots of different objects have the same hashCode (hashCode collision) then this list will grow and search time complexity for searching the key in this list would be 'n'. Which is kind of bad. If hashCode implementation provides distinct hash for different object then this time complexity will reduce because there would be only one item in the list. On the other hand, if hashCode algorithm is poor and it returns same hashCode for all object then HashMap based class degrades into a list of list who’s time complexity would be O(m*n) where ‘m’ for calling equals() method on the object whereas ‘n’ is the number of object present in the bucket. This time complexity would be as good as O(n^2) which is definitely bad. Ideally, hashCode based Collection classes should have O(1) time complexity.

Source code generation

After reading till here, you may be thinking that it is not an easy job to write hashCode() method. But I should tell you that this is really easy just few clicks and implementation is done, if you are using Eclipse IDE. Click Source > Generate hashCode() and equals().. and voila it’s done.

Eclipse hashCode() and equals() implementation
Eclipse does take care about the requirements, even check null before calling hashCode() method on member variable. you can completely rely on Eclipse. I suggest developer to use this handy tool.

Hope this blog helped you in some way. If you like this blog then please share it. You can also leave your comment below. You can find Facebook page here.

Related topics

  1. Difference between “==” and equals() method in Java
, , ,
Trackback

3 comments untill now

  1. Hi Rakesh, Nice explanation for hashCode(). I have one doubt here in this explanation. In top paragraph—
    you have mentioned that if hashCode() is not properly implemented then it will convert 32 bit integer address. And equal() method will ignore that….

    Here what do you mean by 32 bit integer. I think it should be system dependent. correct me if I am wrong.

  2. Thanks Manoj for your comment.
    hasCode() returns ‘int’ and according to java standard this int has to be 32 bits. It is machine independent. you can checkout this link http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.2 for more clarification.

  3. yes true…. I missed this one… thanks for ur correction.

Add your comment now