Applications:
- hash maps
- cryptography
http://en.wikipedia.org/wiki/checksum_algorithm
Used in cryptography as well as other applications, e.g. Git SHA to identify objects uniquely.
Desired properties:
-
it is easy to compute the hash value for any given message
-
it is infeasible to generate a message that has a given hash
-
it is infeasible to modify a message without changing the hash
-
it is infeasible to find two different messages with the same hash.
This is in general much easier than finding an input with a given hash because of the birthday problem: http://en.wikipedia.org/wiki/Birthday_problem
The collision strength of a function is determined by the smallest attack that generates a collision with certainty or high probability.
For example, SHA-1 has 80 bits, so a naive brute force attack costs
http://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_construction
Algorithms on that family can also hash an empty string: http://superuser.com/questions/557925/how-can-zero-byte-files-generate-a-hash-value
160 bits.
SHA-1 is the most popular in 2014. Used in Git.
Collision attacks were found in 2005, but they are were too expensive.
Some parts of the US government moved to SHA-2 in 2010 because of the weaknesses.
SHA-1 will be practical in 2018 for organized crime: https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
Google, Microsoft and Mozilla will remove SHA-1 for security in 2017 and use SHA-2 instead.
The following is a for fun prefix finder: https://github.com/bradfitz/gitbrute
It seems unproven that the all zero SHA-1 0^160 has a preimage: http://stackoverflow.com/questions/1902340/can-a-sha-1-hash-be-all-zeroes
Family of 6 functions and output lengths.
http://en.wikipedia.org/wiki/Rolling_hash
Major application: Rabin-Krap string search.