Bad usage of the PHP hash_equals function

Issue created by @epieddy
Comment 8 months ago →
cilefen
First commit to issue fork.
Merge request !12369Second parameter should be user-provided value → (Open) created by prudloff
Pipeline finished with Success
about 1 month ago
Total: 901s
#520998
Comment about 1 month ago →
🇫🇷France prudloff Lille
I looked at every hash_equals() call in core and found only one place where the parameter order was wrong.
Comment about 1 month ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
Comment 13 days ago →
🇯🇵Japan neptune-dc
>I looked at every hash_equals() call in core and found only one place where the parameter order was wrong.

I came to the same conclusion. +1 RBTC
Comment 12 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
Comment 12 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
I take that in the following code the $known_hash parameter should be $computed_hash, and the $user_string parameter should be $stored_hash.

return $computed_hash && hash_equals($stored_hash, $computed_hash);

(In other words, the correct code should be the following one.

return $computed_hash && hash_equals($computed_hash, $stored_hash);

I did not check all the calls to hash_equals(). There could be other cases where the arguments must be inverted.

I am not sure the code should be changed as for the first point. Checking the string lengths in PHP could make the code more susceptible of timing attacks. If that is not a reason for changing code, the first point should be removed from the issue summary.

Is there any evidence that passing the arguments in the wrong order would increase the possibilities of timing attacks?
Given that base64_encode() and base64_decode() are not constant-time, and both the functions are used by Drupal core, would just fixing the arguments-order for hash_equals() make Drupal core code less prone of timing attacks?

🇮🇹Italy apaderno Brescia, 🇮🇹

Looking at the C code used by php_safe_bcmp(), I would say that inverting the arguments does not increase the possibilities of timing attacks.

	const volatile unsigned char *ua = (const volatile unsigned char *)ZSTR_VAL(a);
	const volatile unsigned char *ub = (const volatile unsigned char *)ZSTR_VAL(b);
	size_t i = 0;
	int r = 0;

	if (ZSTR_LEN(a) != ZSTR_LEN(b)) {
		return -1;
	}

	/* This is security sensitive code. Do not optimize this for speed. */
	while (i < ZSTR_LEN(a)) {
		r |= ua[i] ^ ub[i];
		++i;
	}

	return r;

r |= ua[i] ^ ub[i]; and r |= ub[i] ^ ua[i]; would take the same amount of time to run.

Comment 12 days ago →
🇫🇷France prudloff Lille
According to this discussion the order was important in a previous implementation but not in the current one.

It that's really the case, should we try to get the PHP doc fixed?
Comment 12 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
It that's really the case, should we try to get the PHP doc fixed?

The PHP documentation should not show the It is important to provide the user-supplied string as the second parameter, rather than the first. warning, which seems to be a left-over from the previous implementation.
IMO, it does not generally make sense. For example, in Drupal, hashes are not always values users provide.
Comment 12 days ago →
🇫🇷France prudloff Lille
Should we close this as wontfix then?

I opened an issue to clarify the PHP doc: https://github.com/php/doc-en/issues/4767

IMO, it does not generally make sense. For example, in Drupal, hashes are not always values users provide.

Well, if we use hash_equals() it's usually because the hash is provided by the user or calculated based on user-provided values.
If that's not the case, timing attacks are not possible and there is no reason to use hash_equals().
Comment 12 days ago →
🇫🇷France prudloff Lille
A PHP contributor confirmed that the order does not matter with the current implementation but it could change in the future: https://github.com/php/doc-en/issues/4767#issuecomment-3054071094

So I think we should try to use the correct order to be future-proof, so setting this back to needs review.
I also updated the summary to remove the point about checking the string length. I don't think leaking the hash length is really a problem and it would hard to prevent.

(I also added back the empty sections of the issue summary, it helps seeing what still needs to be done.)

I take that in the following code the $known_hash parameter should be $computed_hash, and the $user_string parameter should be $stored_hash.

The user string is the hash calculated from values supplied in the HTTP request.
In your example, $computed_hash is calculated from the password provided to be checked, so having it as the second parameter is correct.
Comment 11 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
Lines like the following ones needs to be changed, then.

return $computed_hash && hash_equals($stored_hash, $computed_hash);

$computed_hash is the expected hash, not the user-provided value.
In that case, there is no user-provided value, but a value stored in the database, which could be altered. If I follow that warning, it is $stored_hash that needs to be passed as second argument.

I did not check all the hash_equals() calls, but there are more lines that should be changed, if we follow to the letter the warning given in the PHP documentation.
Comment 11 days ago →
🇫🇷France prudloff Lille
Please see my comment in #14, $computed_hash is calculated from a user-provided password. $stored_hash comes from the database so is not user-provided.
User-provided in the context of timing attacks means provided in the HTTP request.
Comment 10 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
$known_string is the calculated hash, which must be kept secret. It cannot be a value stored in the database.

Comment 10 days ago →

🇮🇹Italy apaderno Brescia, 🇮🇹

In the example used by the PHP documentation, the hash calculated from a value provided by the user is passed as first parameter, not second.

// Value and signature are provided by the user, e.g. within the URL
// and retrieved using $_GET.
$value = 'username=rasmuslerdorf';
$signature = '8c35009d3b50caf7f5d2c1e031842e6b7823a1bb781d33c5237cd27b57b5f327';

if (hash_equals(hash_hmac('sha256', $value, $secretKey), $signature)) {
    echo "The value is correctly signed.", PHP_EOL;
} else {
    echo "The value was tampered with.", PHP_EOL;
}

Comment 10 days ago →
🇫🇷France prudloff Lille
The order does not depend on which variable is pre-calculated or not, it depends on which variable the attacker is trying to brute force.

In the PHP doc example, the attacker is trying to brute force a valid signature, so the signature submitted by the user has to be second (and it is compared with a signature calculated from the secret key).
In PhpassHashedPasswordBase::check(), the attacker is trying to guess the password so the hash computed from the sent password has to be second.

It cannot be a value stored in the database.

It can totally be a value stored in database. The known string is the correct value that does not vary per request, it is either stored pre-calculated or calculated from known values like a secret key.
When the PHP doc says the known string "must be kept secret" it does not mean it can't be stored, it means it is the sensitive string that an attacker would be trying to get with a timing attack.
The user string is the hash that varies with each brute force request.

Timing attacks work by sending a lot of different values and checking if the time it takes the server to respond vary.
For example when trying to brute force a password, the website will usually have a hash of the password stored in the database and will hash the user-supplied password and compare it to the hash stored in database.
In that scenario, the known string is the hash stored in the database and the user string is the hash calculated from the password sent by the user.
Comment 9 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
It is not a matter of pre-calculated values, which would eventually be values I calculate before I need them (and store in some places), not right before calling hash_equals().

In PhpassHashedPasswordBase::check(), the attacker is trying to guess the password so the hash computed from the sent password has to be second.

$computed_hash is not user-input. The user can just change the entered password, but finding a password that gives a specific hash is only possible by brute force, by definition of cryptographic hash, which is invertible and collision resistant.
With passwords, the operation is more complex, and it is more difficult to find a password that gives an exact $computed_hash.

When the PHP doc says the known string "must be kept secret" it does not mean it can't be stored, it means it is the sensitive string that an attacker would be trying to get with a timing attack.

You missed the point. Neither $computed_hash nor $stored_hash is user-input (provided in the HTTP request), so there is no difference in choosing which value to pass as second argument; my preference is for $stored_hash, which could not be the preference somebody else has. It is no longer a matter of user input. (This does not mean the arguments order should not be changed.)

Timing attacks work by sending a lot of different values and checking if the time it takes the server to respond vary.

The only way to avoid timing attacks is using constant-time code, which does not mean just using hash_equals().
Comment 9 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
The issue summary says Drupal core uses hash_equals() in multiple places. The merge request just fixes the given example, but it should fix all the hash_equals() calls, when the user input is passed as first argument.
Comment 9 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
Truly, the arguments order does not matter with the previous implementation either.

mod_len = MAX(known_len, 1); /* This is security sensitive code. Do not optimize this for speed. */ result = known_len - user_len; for (j = 0; j < user_len; j++) { result |= known_str[j % mod_len] ^ user_str[j]; } RETURN_BOOL(0 == result);
C code to compare in constant time two strings cannot be much different from the previous implementation and the actual implementation. I cannot see in which way they could rewrite the code to be dependent from the arguments order and still be constant-time code.
Comment 9 days ago →
🇫🇷France prudloff Lille
$computed_hash is not user-input.

Well I think this is were we disagree on the definition of user-input. IMHO a hash calculated from user input should be treated as user input here but I could be wrong.
To be fair, the PHP doc does not give a lot of examples and the only example is not very similar to our use cases.
I opened an issue about improving the examples: https://github.com/php/doc-en/issues/4771

I cannot see in which way they could rewrite the code to be dependent from the arguments order and still be constant-time code.

I'm not fluent in C but my understanding is that it matters in algorithms that can handle hashes with a different length (which hash_equals() does not support currently but could add in the future) because it could stop the comparison early if the user string and known string are inverted.
I have not checked if there a places in core where we compare hashes that can have a different length.
Comment 9 days ago →
🇮🇹Italy apaderno Brescia, 🇮🇹
IMHO a hash calculated from user input should be treated as user input here but I could be wrong.

The example given in the PHP documentation is clear about what they call "user input": It is a value completely provided by the user. They also make clear that a value given in the HTTP request is considered "user input." $computed_hash is not provided in the HTTP request, so it is not "user input."
$known_string is not a value stored in a database, though; it is the value I get from the function to calculate the "hash." To make a different example, if I store in a database the largest integer supported by PHP, what I should I consider the known value when checking the value stored in the database? PHP_INT_MAX, not the value stored in the database.

I'm not fluent in C but my understanding is that it matters in algorithms that can handle hashes with a different length (which hash_equals() does not support currently but could add in the future) because it could stop the comparison early if the user string and known string are inverted.

Two strings are equal if they have the same length and all their bytes are equal. With a four-byte string and a five-byte string, only four bytes can be compared, but the strings are still different because they have different lengths.

Since the code cannot be optimized, C code is not much different from PHP code. In both the cases, two bytes can be compared in constant-time by XORing their value. (A XOR B is 0 when A and B have the same value, and it is different from 0 when A and B have different values.) Eventually, since constant-time code avoids if statements, the code could use $result = $known_string_length - $user_string_length; instead of if ($known_string_length !== $user_string_length) { return; }, but the code woud still be XORing the bytes.
Eventually, the C code could be different if it is rewritten to in Assembly and all the CPUs would have instructions to compare two strings in constant-time. The alghoritm would not change, though.

Bad usage of the PHP hash_equals function

Merge Requests

!12369Bad usage of the PHP hash_equals function
Open

Comments & Activities

Bad usage of the PHP hash_equals function

Merge Requests

!12369Bad usage of the PHP hash_equals functionOpen

Comments & Activities

!12369Bad usage of the PHP hash_equals function
Open