Data integrity is important in distributed systems. The same characteristics that make these systems robust (e.g., fault tolerance) make maintaining data integrity challenging. For this reason, hash functions play a central role in the algorithms and technologies that power Usenet, BitTorrent, and Bitcoin and its blockchain. A hash function is a function that maps arbitrarily sized data to fixed-size data that is ideally smaller, unique, and non-invertible (the importance of these attributes will be explained). The MD5 hash of the title of this presentation is a98230b1c23b0120a6094fadd0adc1a5; if the Oxford comma were removed, the hash would change to 7d77231f9044fb6aeeba479b6ab09aa6. If you were given both the title and its hash, you could compute the hash of the title you received and compare it to the hash you received. If they differed, you would know that there was an error in transmission or that an intermediate editor rejects clarity and civility. This presentation will introduce hashes and their variants, these distributed and sometimes dubious systems, and what can be learned and practically applied in today’s preservation tools and services for purposes of auditing, identifying, recovering, and sharing data.
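The check described above can be sketched in a few lines of Python using the standard library's `hashlib`. This is a minimal illustration, not the presenter's own tooling: the helper names `md5_hex` and `verify` are hypothetical, and the title string is a placeholder, since the actual presentation title is not reproduced here. The example therefore does not attempt to reproduce the two hashes quoted above.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` (UTF-8 encoded) as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def verify(received_text: str, received_hash: str) -> bool:
    """Recompute the hash of what arrived and compare it to the hash sent with it."""
    return md5_hex(received_text) == received_hash.lower()


# Placeholder title (an assumption for illustration, not the real title).
sent_title = "Usenet, BitTorrent, and Bitcoin"
sent_hash = md5_hex(sent_title)

# The same text with the Oxford comma removed in transit.
altered_title = "Usenet, BitTorrent and Bitcoin"

print(verify(sent_title, sent_hash))     # text arrived unchanged
print(verify(altered_title, sent_hash))  # a one-character edit changes the hash
```

MD5 is used here only because the text above uses it. It is no longer considered collision-resistant, so a check meant to resist deliberate tampering would more likely use something like `hashlib.sha256`, which can be swapped in without changing the rest of the sketch.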