A 10-character password is to be selected from an alphabet consisting of all lowercase and uppercase English letters and all digits. Compute the entropy of such a password. Why would such an entropy computation not be representative of passwords in practical settings?
Ans :
Why do we use entropy to measure password strength?
Suppose an attacker tries to guess a password by brute force. Passwords are character strings rather than bit strings, so entropy gives a common measure, in bits, of how many guesses a brute-force search would need: a password with H bits of entropy takes on the order of 2^H attempts (a proportionality, not an exact count).
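Let's derive H (entropy):
N - Number of characters to choose from
L - Length of password
H - Entropy value (in bits)
There are N^L possible passwords, and H bits can distinguish 2^H possibilities, so
N^L = 2^H
Taking the log of both sides:
L log N = H log 2
H = L (log N / log 2) = L log2(N)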
Let's calculate the entropy for a 10-character, case-sensitive password drawn from a-z, A-Z, and 0-9, which gives N = 26 + 26 + 10 = 62:
H = 10 (log 62 / log 2) ≈ 59.54 bits
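As a quick check of the arithmetic above, here is a minimal Python sketch of H = L log2(N). The function name uniform_entropy_bits is just an illustrative choice, and the calculation assumes every character is picked uniformly and independently.

```python
import math


def uniform_entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a password whose characters are drawn
    uniformly and independently from an alphabet of the given size."""
    return length * math.log2(alphabet_size)


# 26 lowercase + 26 uppercase + 10 digits
N = 26 + 26 + 10
H = uniform_entropy_bits(N, 10)
print(f"N = {N}, H = {H:.2f} bits")  # N = 62, H = 59.54 bits
```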
This derivation assumes that every character is chosen uniformly at random and independently of the others. Humans do not choose passwords that way, so the real entropy of a human-chosen password is lower than the calculated value.
Humans are poor at selecting good passwords: they favor dictionary words, names, dates, and predictable patterns. So although 59.54 bits looks more than sufficient, a typical user-chosen 10-character password is nowhere near that strong.
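In practice, NIST (in older editions of Special Publication 800-63) gives a rule of thumb for estimating the entropy of user-chosen passwords, with fewer bits assigned to each later character (see the table below):
For a 10-character password: 4*1 + 2*7 + 1.5*2 = 21 bits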
Character position | Entropy per character |
1st character | 4 bits |
Characters 2 to 8 | 2 bits each |
Characters 9 to 20 | 1.5 bits each |
Characters 21 and above | 1 bit each |
So the estimated real-world entropy (about 21 bits) is far lower than the calculated 59.54 bits, which is why the uniform-random computation is not representative of passwords in practical settings.
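The per-character rule in the table can be sketched in Python as follows. This is only an illustration of the table as given in this answer: the function name nist_entropy_estimate is hypothetical, and it covers only the base per-character values listed above.

```python
def nist_entropy_estimate(length: int) -> float:
    """Estimated entropy in bits of a user-chosen password of the given
    length, using only the per-character values from the table above."""
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4.0   # first character
        elif position <= 8:
            bits += 2.0   # characters 2 to 8
        elif position <= 20:
            bits += 1.5   # characters 9 to 20
        else:
            bits += 1.0   # characters 21 and above
    return bits


print(nist_entropy_estimate(10))  # 21.0
```

Comparing this with the 59.54 bits from the uniform formula shows how much the assumption of random, independent characters overstates the strength of a human-chosen password.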