blah blah blah is here! blah blah » Close

348 views

hashing

up1down
link

Hello all. I have a c++ question. I have this project in my mind. its like mpq file format. In other words i want to make a files packager where i can store several files in a single file. The problem i'm facing is making a hash key from the file name that is within 0-50000. for example i have a hashtable of 50000 key and i have for example the file name Pic1.jpg.i found some hash functions on the net but i have a doubt that a collision might happen when i'm using key mod 50000. i'm using something like this code

unsigned int GetHash(char *str)
{
unsigned int hash = 5381;
int c;

while (c = *str++)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

return hash%50000;
}

can anyone help in this like am i missing something here?
tnx in advance

last answered one year ago

1 answers

link

Collisions in a hash table will be inevitable the larger the data set is. It can't be helped.

Just make sure that the hash functions you use have a nice uniform distribution and you have a good collision policy.

Jeff1203
538

Here's a site that contains some implementations of various hash algorithms and some info about them in general. (http://www.partow.net/programming/hashfunctions/)

muster
1556

tnx for the answer, there is something i didn't set clear. i meant by collision as if two files with the same name are inserted not two files of different names collide in the same cell after hashing, would it happen for example using the hash function i wrote above?

muster
1556

mmm nvm, i figured it out, just need to check the file name from the value after the key is hashed. tnx for the link as well its very helpful.

Feedback