Asked  7 Months ago    Answers:  5   Viewed   52 times

I need to hash passwords for storage in a database. How can I do this in Java?

I was hoping to take the plain text password, add a random salt, then store the salt and the hashed password in the database.

Then when a user wanted to log in, I could take their submitted password, add the random salt from their account information, hash it and see if it equates to the stored hash password with their account information.

 Answers

68

You can actually use a facility built in to the Java runtime to do this. The SunJCE in Java 6 supports PBKDF2, which is a good algorithm to use for password hashing.

byte[] salt = new byte[16];
random.nextBytes(salt);
KeySpec spec = new PBEKeySpec("password".toCharArray(), salt, 65536, 128);
SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
byte[] hash = f.generateSecret(spec).getEncoded();
Base64.Encoder enc = Base64.getEncoder();
System.out.printf("salt: %s%n", enc.encodeToString(salt));
System.out.printf("hash: %s%n", enc.encodeToString(hash));

Here's a utility class that you can use for PBKDF2 password authentication:

import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.KeySpec;
import java.util.Arrays;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/**
 * Hash passwords for storage, and test passwords against password tokens.
 * 
 * Instances of this class can be used concurrently by multiple threads.
 *  
 * @author erickson
 * @see <a href="http://stackoverflow.com/a/2861125/3474">StackOverflow</a>
 */
public final class PasswordAuthentication
{

  /**
   * Each token produced by this class uses this identifier as a prefix.
   */
  public static final String ID = "$31$";

  /**
   * The minimum recommended cost, used by default
   */
  public static final int DEFAULT_COST = 16;

  private static final String ALGORITHM = "PBKDF2WithHmacSHA1";

  private static final int SIZE = 128;

  private static final Pattern layout = Pattern.compile("\$31\$(\d\d?)\$(.{43})");

  private final SecureRandom random;

  private final int cost;

  public PasswordAuthentication()
  {
    this(DEFAULT_COST);
  }

  /**
   * Create a password manager with a specified cost
   * 
   * @param cost the exponential computational cost of hashing a password, 0 to 30
   */
  public PasswordAuthentication(int cost)
  {
    iterations(cost); /* Validate cost */
    this.cost = cost;
    this.random = new SecureRandom();
  }

  private static int iterations(int cost)
  {
    if ((cost < 0) || (cost > 30))
      throw new IllegalArgumentException("cost: " + cost);
    return 1 << cost;
  }

  /**
   * Hash a password for storage.
   * 
   * @return a secure authentication token to be stored for later authentication 
   */
  public String hash(char[] password)
  {
    byte[] salt = new byte[SIZE / 8];
    random.nextBytes(salt);
    byte[] dk = pbkdf2(password, salt, 1 << cost);
    byte[] hash = new byte[salt.length + dk.length];
    System.arraycopy(salt, 0, hash, 0, salt.length);
    System.arraycopy(dk, 0, hash, salt.length, dk.length);
    Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
    return ID + cost + '$' + enc.encodeToString(hash);
  }

  /**
   * Authenticate with a password and a stored password token.
   * 
   * @return true if the password and token match
   */
  public boolean authenticate(char[] password, String token)
  {
    Matcher m = layout.matcher(token);
    if (!m.matches())
      throw new IllegalArgumentException("Invalid token format");
    int iterations = iterations(Integer.parseInt(m.group(1)));
    byte[] hash = Base64.getUrlDecoder().decode(m.group(2));
    byte[] salt = Arrays.copyOfRange(hash, 0, SIZE / 8);
    byte[] check = pbkdf2(password, salt, iterations);
    int zero = 0;
    for (int idx = 0; idx < check.length; ++idx)
      zero |= hash[salt.length + idx] ^ check[idx];
    return zero == 0;
  }

  private static byte[] pbkdf2(char[] password, byte[] salt, int iterations)
  {
    KeySpec spec = new PBEKeySpec(password, salt, iterations, SIZE);
    try {
      SecretKeyFactory f = SecretKeyFactory.getInstance(ALGORITHM);
      return f.generateSecret(spec).getEncoded();
    }
    catch (NoSuchAlgorithmException ex) {
      throw new IllegalStateException("Missing algorithm: " + ALGORITHM, ex);
    }
    catch (InvalidKeySpecException ex) {
      throw new IllegalStateException("Invalid SecretKeyFactory", ex);
    }
  }

  /**
   * Hash a password in an immutable {@code String}. 
   * 
   * <p>Passwords should be stored in a {@code char[]} so that it can be filled 
   * with zeros after use instead of lingering on the heap and elsewhere.
   * 
   * @deprecated Use {@link #hash(char[])} instead
   */
  @Deprecated
  public String hash(String password)
  {
    return hash(password.toCharArray());
  }

  /**
   * Authenticate with a password in an immutable {@code String} and a stored 
   * password token. 
   * 
   * @deprecated Use {@link #authenticate(char[],String)} instead.
   * @see #hash(String)
   */
  @Deprecated
  public boolean authenticate(String password, String token)
  {
    return authenticate(password.toCharArray(), token);
  }

}
Tuesday, June 1, 2021
 
Pwner
answered 7 Months ago
93

In cryptography, hash functions provide three separate functions.

  1. Collision resistance: How hard is it for someone to find two messages (any two messages) that hash the same.
  2. Preimage Resistance: Given a hash, how hard is it to find another message that hashes the same? Also known as a one way hash function.
  3. Second preimage resistance: Given a message, find another message that hashes the same.

These properties are related but independent. For example, collision resistance implies second preimage resistance, but not the other way around. For any given application, you will have different requirements, needing one or more of these properties. A hash function for securing passwords on a server will usually only require preimage resistance, while message digests require all three.

It has been shown that MD5 is not collision resistant, however, that does not preclude its use in applications that do not require collision resistance. Indeed, MD5 is often still used in applications where the smaller key size and speed are beneficial. That said, due to its flaws, researchers recommend the use of other hash functions in new scenarios.

SHA1 has a flaw that allows collisions to be found in theoretically far less than the 2^80 steps a secure hash function of its length would require. The attack is continually being revised and currently can be done in ~2^63 steps - just barely within the current realm of computability. For this reason NIST is phasing out the use of SHA1, stating that the SHA2 family should be used after 2010.

SHA2 is a new family of hash functions created following SHA1. Currently there are no known attacks against SHA2 functions. SHA256, 384 and 512 are all part of the SHA2 family, just using different key lengths.

RIPEMD I can't comment too much on, except to note that it isn't as commonly used as the SHA families, and so has not been scrutinized as closely by cryptographic researchers. For that reason alone I would recommend the use of SHA functions over it. In the implementation you are using it seems quite slow as well, which makes it less useful.

In conclusion, there is no one best function - it all depends on what you need it for. Be mindful of the flaws with each and you will be best able to choose the right hash function for your scenario.

Wednesday, June 2, 2021
 
max_
answered 7 Months ago
16

Using regex for string replacement is significantly slower than using a string replace.
As demonstrated on JSPerf, you can have different levels of efficiency for creating a regex, but all of them are significantly slower than a simple string replace. The regex is slower because:

Fixed-string matches don't have backtracking, compilation steps, ranges, character classes, or a host of other features that slow down the regular expression engine. There are certainly ways to optimize regex matches, but I think it's unlikely to beat indexing into a string in the common case.

For a simple test run on the JS perf page, I've documented some of the results:

<script>
// Setup
  var startString = "xxxxxxxxxabcxxxxxxabcxx";
  var endStringRegEx = undefined;
  var endStringString = undefined;
  var endStringRegExNewStr = undefined;
  var endStringRegExNew = undefined;
  var endStringStoredRegEx = undefined;      
  var re = new RegExp("abc", "g");
</script>

<script>
// Tests
  endStringRegEx = startString.replace(/abc/g, "def") // Regex
  endStringString = startString.replace("abc", "def", "g") // String
  endStringRegExNewStr = startString.replace(new RegExp("abc", "g"), "def"); // New Regex String
  endStringRegExNew = startString.replace(new RegExp(/abc/g), "def"); // New Regexp
  endStringStoredRegEx = startString.replace(re, "def") // saved regex
</script>

The results for Chrome 68 are as follows:

String replace:    9,936,093 operations/sec
Saved regex:       5,725,506 operations/sec
Regex:             5,529,504 operations/sec
New Regex String:  3,571,180 operations/sec
New Regex:         3,224,919 operations/sec

From the sake of completeness of this answer (borrowing from the comments), it's worth mentioning that .replace only replaces the first instance of the matched character. Its only possible to replace all instances with //g. The performance trade off and code elegance could be argued to be worse if replacing multiple instances name.replace(' ', '_').replace(' ', '_').replace(' ', '_'); or worse while (name.includes(' ')) { name = name.replace(' ', '_') }

Tuesday, June 22, 2021
 
michele
answered 6 Months ago
44

In particular, is there a difference between the hashing algorithm they use? What is the formula used to hash in these two classes?

The primary hash function used when you use an object as a hash table key is the object's hashCode() method. It is up the to the key class to implement a decent hash function.

The Hashtable and HashMap classes take the key's hashcode value and convert it to an index in the primary hashtable array-of-chains. However, there are differences in how this happens between Hashtable and HashMap.

  • For Hashtable (Java 8) the code is this:

     hash = key.hashCode();
     index = (hash & 0x7FFFFFFF) % tab.length;
    
  • For HashMap (Java 8) the code is (effectively) this:

     // (I have restructured the code for ease of comparison.)
     int h;
     hash = (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
     index = (tab.length - 1) & hash;
    

As you can see, HashMap is scrambling the hashcode value returned by the key's hashcode function. This is explained in the source code as follows:

[This method] computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don't benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.

Notes:

  1. The & versus % difference is because in Hashtable the hash array size is a prime number, but in HashMap (Java 8) the size is a power of 2.

  2. In Java 8 HashMap, the implementation will turn a long hash chain into a binary tree if the key class implements Comparable.

  3. HashMap handles null keys, but Hashtable doesn't.


However, all of this extra complexity in HashMap only comes into play if your key class has a poorly designed / implemented hashCode() method ... or if someone is deliberately trying to engineer hash collisions.

In other words, if your key class is well designed, the differences should not matter.

Monday, October 4, 2021
 
AWeim
answered 2 Months ago
62

you need MultiMap, take a look at Google Guava Multimap

Wednesday, October 27, 2021
 
user7282
answered 1 Month ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :  
Share