# Hashing Algorithms

### Overview of Hashing

Hashing is a process that converts an input (or "message") of any length into a fixed-size string of bytes called a digest. Because the number of possible digests is finite, digests cannot be unique to every input, but a good hash function makes it computationally infeasible to find two inputs that share one. A cryptographic hash is also designed to be one-way: it should be computationally infeasible to recover the original input from the hash value alone. Finally, hash functions are deterministic, so the same input always produces the same output.
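The two properties that are easiest to observe directly are determinism and the fixed output size. The following is a minimal sketch using SHA-256 from Python's `hashlib`; the helper name `sha256_hex` is an illustrative assumption, not part of the library.

```python
import hashlib

def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

short_digest = sha256_hex("Hello")
long_digest = sha256_hex("Hello" * 10_000)

# Deterministic: hashing the same input again gives the same digest
print(short_digest == sha256_hex("Hello"))  # True

# Fixed size: 64 hex characters (32 bytes) regardless of input length
print(len(short_digest), len(long_digest))  # 64 64
```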

Hash functions are widely used in various security applications, such as:

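* **Password Storage**: Storing password hashes instead of plaintext means that a database compromise does not directly reveal the passwords. In practice, passwords should be hashed with a per-user salt and a deliberately slow function (for example PBKDF2, bcrypt, scrypt, or Argon2), because fast hashes can be brute-forced quickly.
* **Data Integrity**: Comparing the hash of received data against a trusted reference value reveals whether the data has been altered. If even one bit of the original data is changed, the hash value will change drastically.
* **Digital Signatures**: Signature schemes typically sign a hash of a message or document rather than the full content, so the hash is part of verifying its authenticity.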
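The claim that a single changed bit alters the hash drastically (often called the avalanche effect) can be observed directly. The sketch below flips one bit of the input and counts how many digest bits change; the helper `bit_difference` is a hypothetical name introduced for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Hello"
tampered = b"Helln"  # 'o' (0x6F) and 'n' (0x6E) differ in exactly one bit

digest_a = hashlib.sha256(original).digest()
digest_b = hashlib.sha256(tampered).digest()

changed = bit_difference(digest_a, digest_b)
print(f"Input bits changed:  {bit_difference(original, tampered)}")
print(f"Digest bits changed: {changed} of {len(digest_a) * 8}")
```

For a well-behaved hash function, roughly half of the digest bits are expected to change, although the exact count varies from input to input.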

***

### Popular Hashing Algorithms

Several hashing algorithms have been developed over the years, each with different levels of security and performance. Some have been found to be insecure for certain applications, but they still serve as important learning tools for understanding how hashing works.

#### MD5 (Message Digest Algorithm 5)

MD5 is one of the most well-known hash functions, originally designed for cryptographic security. It produces a 128-bit (16-byte) hash value. However, due to vulnerabilities that allow for **hash collisions** (where two different inputs produce the same hash), MD5 is no longer considered secure for cryptographic purposes.

Example MD5 hash:

* Input: `"Hello"`
* Hash: `8b1a9953c4611296a827abf8c47804d7`

#### SHA-1 (Secure Hash Algorithm 1)

SHA-1 is another widely used hash function that produces a 160-bit (20-byte) hash value. Like MD5, SHA-1 has been found to be vulnerable to hash collision attacks, and its use has been deprecated in most modern cryptographic applications.

Example SHA-1 hash:

* Input: `"Hello"`
* Hash: `f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0`

#### SHA-256 (Secure Hash Algorithm 256-bit)

SHA-256 is part of the SHA-2 family of hash functions and is considered secure for most cryptographic purposes. It produces a 256-bit (32-byte) hash value. SHA-256 is widely used in security applications, including SSL/TLS certificates, blockchain, and digital signatures.

Example SHA-256 hash:

* Input: `"Hello"`
* Hash: `185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969`
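The digest sizes quoted above can be checked programmatically. This is a small sketch that uses `hashlib.new()` and the `digest_size` attribute; it assumes all three algorithms are enabled in the local Python build (some restricted or FIPS-mode builds may refuse MD5).

```python
import hashlib

# Map each algorithm name to its digest size in bits
sizes = {}
for name in ("md5", "sha1", "sha256"):
    hasher = hashlib.new(name, b"Hello")
    sizes[name] = hasher.digest_size * 8
    # hexdigest() uses two hex characters per byte
    print(f"{name:>6}: {sizes[name]} bits, {len(hasher.hexdigest())} hex characters")
```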

***

### Interactive Code Examples: Generating Hashes with `hashlib` in Python

The Python Standard Library provides the `hashlib` module, which includes implementations of popular hash functions like MD5, SHA-1, and SHA-256.

#### Example: Generating MD5 Hashes

```python
import hashlib

# Define the input data
data = "Hello"

# Generate the MD5 hash
md5_hash = hashlib.md5(data.encode()).hexdigest()

print(f"MD5 Hash: {md5_hash}")
```

#### Example: Generating SHA-1 Hashes

```python
import hashlib

# Define the input data
data = "Hello"

# Generate the SHA-1 hash
sha1_hash = hashlib.sha1(data.encode()).hexdigest()

print(f"SHA-1 Hash: {sha1_hash}")
```

#### Example: Generating SHA-256 Hashes

```python
import hashlib

# Define the input data
data = "Hello"

# Generate the SHA-256 hash
sha256_hash = hashlib.sha256(data.encode()).hexdigest()

print(f"SHA-256 Hash: {sha256_hash}")
```

These examples show how easy it is to generate hash values using Python's `hashlib` module. The `.encode()` method converts the string into bytes (UTF-8 by default), because the `hashlib` functions accept only bytes-like input and raise a `TypeError` when given a `str`.
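Two details of the bytes requirement are worth seeing in code: a hash object can be fed in pieces with `update()`, which gives the same result as hashing all the bytes at once, and passing a `str` directly is rejected. The following is a minimal sketch; the variable names are illustrative.

```python
import hashlib

# One-shot hashing of a bytes value
one_shot = hashlib.sha256(b"Hello, world").hexdigest()

# Incremental hashing: feed the same bytes in pieces with update()
hasher = hashlib.sha256()
for chunk in (b"Hello", b", ", b"world"):
    hasher.update(chunk)
incremental = hasher.hexdigest()

print(one_shot == incremental)  # True

# Passing a str instead of bytes raises TypeError
try:
    hashlib.sha256("Hello")
    str_rejected = False
except TypeError:
    str_rejected = True

print(str_rejected)  # True
```

Incremental hashing is the usual way to hash large files, since the data can be read and fed to `update()` one chunk at a time instead of being loaded into memory whole.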

### Understanding Hash Collisions and Practical Attacks

#### What is a Hash Collision?

A hash collision occurs when two different inputs produce the same hash value. This is a significant weakness in any hash function, as it undermines the integrity of the hash. Ideally, a hash function should be collision-resistant, meaning that it should be computationally infeasible to find two different inputs that produce the same hash.
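Finding a collision in a full-strength hash is out of reach for a short script, but the idea can be illustrated on a deliberately weakened one. The sketch below keeps only the first two bytes (16 bits) of a SHA-256 digest, which is an assumption made purely for demonstration, and searches for two inputs that share that truncated value. The names `truncated_sha256` and `find_collision` are hypothetical helpers.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, num_bytes: int = 2) -> bytes:
    """Return only the first num_bytes of the SHA-256 digest (a toy, weakened hash)."""
    return hashlib.sha256(data).digest()[:num_bytes]

def find_collision(num_bytes: int = 2) -> tuple:
    """Try counter-based inputs until two of them share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    for i in count():
        candidate = str(i).encode()
        tag = truncated_sha256(candidate, num_bytes)
        if tag in seen:
            return seen[tag], candidate
        seen[tag] = candidate

first, second = find_collision()
print(f"{first!r} and {second!r} share the truncated digest "
      f"{truncated_sha256(first).hex()}")
```

With only 65,536 possible 16-bit values, a collision is guaranteed within 65,537 attempts and typically shows up after a few hundred. Each extra digest byte makes the search far more expensive, which is why full-length digests matter.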

#### MD5 Collision Example

MD5 is particularly vulnerable to hash collisions: researchers have demonstrated practical methods for constructing pairs of different inputs with the same MD5 digest, which is why MD5 is unsuitable for modern cryptographic applications. Real collisions are not found between short, readable strings. They are pairs of specially crafted binary messages that differ in only a few bytes. For example, the following two inputs do **not** collide:

* Input 1: `"abc"`, MD5 hash: `900150983cd24fb0d6963f7d28e17f72`
* Input 2: `"xyz"`, which produces a different MD5 hash

Deliberately constructing two different inputs with the same hash is known as a collision attack, and it highlights the importance of using more secure hash functions like SHA-256.
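Any claimed collision is easy to check directly with `hashlib`. The sketch below defines a hypothetical helper, `is_md5_collision`, and applies it to the short strings `"abc"` and `"xyz"`; it reports that their digests differ, so this pair is not a collision.

```python
import hashlib

def is_md5_collision(a: bytes, b: bytes) -> bool:
    """True only if the inputs differ but their MD5 digests are identical."""
    return a != b and hashlib.md5(a).digest() == hashlib.md5(b).digest()

print(hashlib.md5(b"abc").hexdigest())   # 900150983cd24fb0d6963f7d28e17f72
print(is_md5_collision(b"abc", b"xyz"))  # False
```

The same check can be applied to published MD5 collision pairs, which are binary files rather than short text strings.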

#### Practical Attacks on Hash Functions

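1\. **Birthday Attack**: The birthday attack exploits the mathematics of the birthday paradox to find collisions in hash functions. For a hash with an `n`-bit output, a collision is expected after roughly `2^(n/2)` random attempts, far fewer than the `2^n` possible outputs. Shorter digests are therefore more exposed: about `2^64` attempts for MD5 and `2^80` for SHA-1, compared with `2^128` for SHA-256. The known cryptanalytic attacks on MD5 and SHA-1 are faster still than this generic bound.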

2\. **Preimage Attack**: In a preimage attack, the goal is to find an input that hashes to a specific target hash value. Modern hash functions like SHA-256 are designed to resist these attacks.
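The birthday effect behind the first attack can be estimated numerically. The sketch below uses the standard approximation `p ≈ 1 - exp(-k(k-1) / 2N)` for the chance of at least one collision among `k` uniformly random values drawn from `N` possibilities; the function name `collision_probability` is an illustrative assumption.

```python
import math

def collision_probability(num_samples: int, space_size: int) -> float:
    """Approximate chance of at least one collision among uniformly random samples."""
    exponent = -num_samples * (num_samples - 1) / (2 * space_size)
    return -math.expm1(exponent)  # 1 - e**exponent, accurate for tiny values

# Classic birthday paradox: 23 people, 365 possible birthdays
print(round(collision_probability(23, 365), 2))  # 0.5

# About 2**64 random inputs against a 128-bit digest (the size of MD5)
print(round(collision_probability(2**64, 2**128), 2))  # 0.39
```

This is why a generic collision search needs on the order of `2^(n/2)` hashes for an `n`-bit digest, while a generic preimage search, which must hit one specific target value, needs on the order of `2^n`.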

Hashing is an essential tool in cryptography, but the choice of the hash function is critical. Weak hash functions like MD5 and SHA-1 are vulnerable to practical attacks, making them unsuitable for secure applications. Understanding these vulnerabilities is key to selecting the right cryptographic tools for your needs.

