- Issue created by @epieddy
- First commit to issue fork.
- 🇫🇷France prudloff Lille
I looked at every hash_equals() call in core and found only one place where the parameter order was wrong.
- 🇯🇵Japan neptune-dc
>I looked at every hash_equals() call in core and found only one place where the parameter order was wrong.
I came to the same conclusion. +1 RBTC
- 🇮🇹Italy apaderno Brescia, 🇮🇹
I take that in the following code the
$known_hash
parameter should be$computed_hash
, and the$user_string
parameter should be$stored_hash
.return $computed_hash && hash_equals($stored_hash, $computed_hash);
(In other words, the correct code should be the following one.
return $computed_hash && hash_equals($computed_hash, $stored_hash);
I did not check all the calls to
hash_equals()
. There could be other cases where the arguments must be inverted.I am not sure the code should be changed as for the first point. Checking the string lengths in PHP could make the code more susceptible of timing attacks. If that is not a reason for changing code, the first point should be removed from the issue summary.
Is there any evidence that passing the arguments in the wrong order would increase the possibilities of timing attacks?
Given thatbase64_encode()
andbase64_decode()
are not constant-time, and both the functions are used by Drupal core, would just fixing the arguments-order forhash_equals()
make Drupal core code less prone of timing attacks? - 🇮🇹Italy apaderno Brescia, 🇮🇹
Looking at the C code used by
php_safe_bcmp()
, I would say that inverting the arguments does not increase the possibilities of timing attacks.const volatile unsigned char *ua = (const volatile unsigned char *)ZSTR_VAL(a); const volatile unsigned char *ub = (const volatile unsigned char *)ZSTR_VAL(b); size_t i = 0; int r = 0; if (ZSTR_LEN(a) != ZSTR_LEN(b)) { return -1; } /* This is security sensitive code. Do not optimize this for speed. */ while (i < ZSTR_LEN(a)) { r |= ua[i] ^ ub[i]; ++i; } return r;
r |= ua[i] ^ ub[i];
andr |= ub[i] ^ ua[i];
would take the same amount of time to run. - 🇫🇷France prudloff Lille
According to this discussion the order was important in a previous implementation but not in the current one.
It that's really the case, should we try to get the PHP doc fixed?
- 🇮🇹Italy apaderno Brescia, 🇮🇹
It that's really the case, should we try to get the PHP doc fixed?
The PHP documentation should not show the It is important to provide the user-supplied string as the second parameter, rather than the first. warning, which seems to be a left-over from the previous implementation.
IMO, it does not generally make sense. For example, in Drupal, hashes are not always values users provide. - 🇫🇷France prudloff Lille
Should we close this as wontfix then?
I opened an issue to clarify the PHP doc: https://github.com/php/doc-en/issues/4767
IMO, it does not generally make sense. For example, in Drupal, hashes are not always values users provide.
Well, if we use hash_equals() it's usually because the hash is provided by the user or calculated based on user-provided values.
If that's not the case, timing attacks are not possible and there is no reason to use hash_equals(). - 🇫🇷France prudloff Lille
A PHP contributor confirmed that the order does not matter with the current implementation but it could change in the future: https://github.com/php/doc-en/issues/4767#issuecomment-3054071094
So I think we should try to use the correct order to be future-proof, so setting this back to needs review.
I also updated the summary to remove the point about checking the string length. I don't think leaking the hash length is really a problem and it would hard to prevent.(I also added back the empty sections of the issue summary, it helps seeing what still needs to be done.)
I take that in the following code the $known_hash parameter should be $computed_hash, and the $user_string parameter should be $stored_hash.
The user string is the hash calculated from values supplied in the HTTP request.
In your example, $computed_hash is calculated from the password provided to be checked, so having it as the second parameter is correct. - 🇮🇹Italy apaderno Brescia, 🇮🇹
Lines like the following ones needs to be changed, then.
return $computed_hash && hash_equals($stored_hash, $computed_hash);
$computed_hash
is the expected hash, not the user-provided value.
In that case, there is no user-provided value, but a value stored in the database, which could be altered. If I follow that warning, it is$stored_hash
that needs to be passed as second argument.I did not check all the
hash_equals()
calls, but there are more lines that should be changed, if we follow to the letter the warning given in the PHP documentation. - 🇫🇷France prudloff Lille
Please see my comment in #14, $computed_hash is calculated from a user-provided password. $stored_hash comes from the database so is not user-provided.
User-provided in the context of timing attacks means provided in the HTTP request. - 🇮🇹Italy apaderno Brescia, 🇮🇹
$known_string
is the calculated hash, which must be kept secret. It cannot be a value stored in the database. - 🇮🇹Italy apaderno Brescia, 🇮🇹
In the example used by the PHP documentation, the hash calculated from a value provided by the user is passed as first parameter, not second.
// Value and signature are provided by the user, e.g. within the URL // and retrieved using $_GET. $value = 'username=rasmuslerdorf'; $signature = '8c35009d3b50caf7f5d2c1e031842e6b7823a1bb781d33c5237cd27b57b5f327'; if (hash_equals(hash_hmac('sha256', $value, $secretKey), $signature)) { echo "The value is correctly signed.", PHP_EOL; } else { echo "The value was tampered with.", PHP_EOL; }
- 🇫🇷France prudloff Lille
The order does not depend on which variable is pre-calculated or not, it depends on which variable the attacker is trying to brute force.
In the PHP doc example, the attacker is trying to brute force a valid signature, so the signature submitted by the user has to be second (and it is compared with a signature calculated from the secret key).
In PhpassHashedPasswordBase::check(), the attacker is trying to guess the password so the hash computed from the sent password has to be second.It cannot be a value stored in the database.
It can totally be a value stored in database. The known string is the correct value that does not vary per request, it is either stored pre-calculated or calculated from known values like a secret key.
When the PHP doc says the known string "must be kept secret" it does not mean it can't be stored, it means it is the sensitive string that an attacker would be trying to get with a timing attack.
The user string is the hash that varies with each brute force request.Timing attacks work by sending a lot of different values and checking if the time it takes the server to respond vary.
For example when trying to brute force a password, the website will usually have a hash of the password stored in the database and will hash the user-supplied password and compare it to the hash stored in database.
In that scenario, the known string is the hash stored in the database and the user string is the hash calculated from the password sent by the user. - 🇮🇹Italy apaderno Brescia, 🇮🇹
It is not a matter of pre-calculated values, which would eventually be values I calculate before I need them (and store in some places), not right before calling
hash_equals()
.In PhpassHashedPasswordBase::check(), the attacker is trying to guess the password so the hash computed from the sent password has to be second.
$computed_hash
is not user-input. The user can just change the entered password, but finding a password that gives a specific hash is only possible by brute force, by definition of cryptographic hash, which is invertible and collision resistant.
With passwords, the operation is more complex, and it is more difficult to find a password that gives an exact$computed_hash
.When the PHP doc says the known string "must be kept secret" it does not mean it can't be stored, it means it is the sensitive string that an attacker would be trying to get with a timing attack.
You missed the point. Neither
$computed_hash
nor$stored_hash
is user-input (provided in the HTTP request), so there is no difference in choosing which value to pass as second argument; my preference is for$stored_hash
, which could not be the preference somebody else has. It is no longer a matter of user input. (This does not mean the arguments order should not be changed.)Timing attacks work by sending a lot of different values and checking if the time it takes the server to respond vary.
The only way to avoid timing attacks is using constant-time code, which does not mean just using
hash_equals()
. - 🇮🇹Italy apaderno Brescia, 🇮🇹
The issue summary says Drupal core uses
hash_equals()
in multiple places. The merge request just fixes the given example, but it should fix all thehash_equals()
calls, when the user input is passed as first argument. - 🇮🇹Italy apaderno Brescia, 🇮🇹
Truly, the arguments order does not matter with the previous implementation either.
mod_len = MAX(known_len, 1); /* This is security sensitive code. Do not optimize this for speed. */ result = known_len - user_len; for (j = 0; j < user_len; j++) { result |= known_str[j % mod_len] ^ user_str[j]; } RETURN_BOOL(0 == result);
C code to compare in constant time two strings cannot be much different from the previous implementation and the actual implementation. I cannot see in which way they could rewrite the code to be dependent from the arguments order and still be constant-time code.
- 🇫🇷France prudloff Lille
$computed_hash is not user-input.
Well I think this is were we disagree on the definition of user-input. IMHO a hash calculated from user input should be treated as user input here but I could be wrong.
To be fair, the PHP doc does not give a lot of examples and the only example is not very similar to our use cases.
I opened an issue about improving the examples: https://github.com/php/doc-en/issues/4771I cannot see in which way they could rewrite the code to be dependent from the arguments order and still be constant-time code.
I'm not fluent in C but my understanding is that it matters in algorithms that can handle hashes with a different length (which hash_equals() does not support currently but could add in the future) because it could stop the comparison early if the user string and known string are inverted.
I have not checked if there a places in core where we compare hashes that can have a different length. - 🇮🇹Italy apaderno Brescia, 🇮🇹
IMHO a hash calculated from user input should be treated as user input here but I could be wrong.
The example given in the PHP documentation is clear about what they call "user input": It is a value completely provided by the user. They also make clear that a value given in the HTTP request is considered "user input."
$computed_hash
is not provided in the HTTP request, so it is not "user input."
$known_string
is not a value stored in a database, though; it is the value I get from the function to calculate the "hash." To make a different example, if I store in a database the largest integer supported by PHP, what I should I consider the known value when checking the value stored in the database?PHP_INT_MAX
, not the value stored in the database.I'm not fluent in C but my understanding is that it matters in algorithms that can handle hashes with a different length (which hash_equals() does not support currently but could add in the future) because it could stop the comparison early if the user string and known string are inverted.
Two strings are equal if they have the same length and all their bytes are equal. With a four-byte string and a five-byte string, only four bytes can be compared, but the strings are still different because they have different lengths.
Since the code cannot be optimized, C code is not much different from PHP code. In both the cases, two bytes can be compared in constant-time by XORing their value. (
A XOR B
is 0 whenA
andB
have the same value, and it is different from 0 whenA
andB
have different values.) Eventually, since constant-time code avoidsif
statements, the code could use$result = $known_string_length - $user_string_length;
instead ofif ($known_string_length !== $user_string_length) { return; }
, but the code woud still be XORing the bytes.
Eventually, the C code could be different if it is rewritten to in Assembly and all the CPUs would have instructions to compare two strings in constant-time. The alghoritm would not change, though.