Choosing a hash function

A good hash function and implementation algorithm are essential for good hash table performance, but may be difficult to achieve.[citation needed]

A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., Pearson's chi-squared test for discrete uniform distributions.

The distribution needs to be uniform only for the table sizes that occur in the application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size, then the hash function needs to be uniform only when the size is a power of two. In that case the index can be computed as some range of bits of the hash value. On the other hand, some hashing algorithms prefer the size to be a prime number. The modulus operation may provide some additional mixing; this is especially useful with a poor hash function.

For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to rise sharply, even if the load factor is low and collisions are infrequent. The popular multiplicative hash[3] is claimed to have particularly poor clustering behavior.

Cryptographic hash functions are believed to provide good hash functions for any table size, whether the index is obtained by modulo reduction or by bit masking.[citation needed] They may also be appropriate if there is a risk of malicious users trying to sabotage a network service by submitting requests designed to generate a large number of collisions in the server's hash tables. However, the risk of sabotage can also be avoided by cheaper methods (such as applying a secret salt to the data, or using a universal hash function).
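To make the universal hashing option above more concrete, here is a minimal sketch in Python. It assumes non-negative integer keys smaller than the prime P and uses the classic ((a*x + b) mod P) mod m family; the names P and make_universal_hash are hypothetical and chosen only for this illustration. The coefficients a and b play the role of the secret: they are drawn at random when the table is created, so an attacker who does not know them cannot precompute a set of colliding keys.

```python
import secrets

# Assumption for this sketch: keys are non-negative integers smaller than P.
# 2**61 - 1 is a Mersenne prime.
P = (1 << 61) - 1

def make_universal_hash(table_size):
    """Return a randomly chosen member of the ((a*x + b) mod P) mod m family."""
    a = secrets.randbelow(P - 1) + 1  # secret multiplier in 1..P-1
    b = secrets.randbelow(P)          # secret offset in 0..P-1

    def h(key):
        return ((a * key + b) % P) % table_size

    return h

# Each table gets its own secret coefficients.
h = make_universal_hash(1024)
slots = [h(k) for k in range(10)]
```

Calling make_universal_hash again produces a different function, which is the point: the mapping from keys to slots is not predictable from the outside.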
A drawback of cryptographic hash functions is that they are often slower to compute, which means that in cases where uniformity for every table size is not necessary, a non-cryptographic hash function might be preferable.[citation needed]
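The empirical uniformity check and the bit-masking versus prime-modulus remarks above can be sketched together in Python. The helper chi_squared_statistic and the deliberately weak poor_hash are hypothetical names introduced for this illustration; the statistic is computed by hand rather than with a statistics library. For hash values that behave like uniform random draws, the statistic is expected to be on the order of the table size minus one, so a value far above that suggests a non-uniform distribution.

```python
from collections import Counter

def chi_squared_statistic(indices, table_size):
    """Pearson's chi-squared statistic against a discrete uniform distribution."""
    indices = list(indices)
    expected = len(indices) / table_size  # expected count per slot
    counts = Counter(indices)
    return sum((counts.get(slot, 0) - expected) ** 2 / expected
               for slot in range(table_size))

def poor_hash(key):
    # Deliberately weak hypothetical hash: the low three bits are always zero.
    return key * 8

keys = range(10_000)

# Power-of-two table: the index is the low six bits of the hash value.
mask_stat = chi_squared_statistic((poor_hash(k) & 63 for k in keys), 64)

# Prime-sized table: the index is the hash value modulo 61.
prime_stat = chi_squared_statistic((poor_hash(k) % 61 for k in keys), 61)

print(f"bit mask, size 64: {mask_stat:.1f}")
print(f"modulo, size 61:   {prime_stat:.1f}")
```

With this weak hash, the masked index only ever reaches 8 of the 64 slots, while the prime modulus spreads the same keys over all 61 slots, which is the extra mixing the text refers to.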
Social learning Network
17 Mar 2019 09:54 AM, study24x7
