Thanks to PCI-DSS requirements and other security standards that specify a minimum length and strength of password, most sysadmins now have the awareness and patience necessary to set up a basic password policy. However, many if not most systems still give hackers a foot in the door by accepting passwords that are compliant but still easy to guess.
Analysis of 5,000 PCI-DSS-compliant passwords
Through a (now addressed) logging bug at a commercial site, I recently had the chance to analyze about 5,000 production passwords set by end users over the course of a year. All of these passwords could have passed cursory PCI-DSS muster (see section 8.5 of version 2.0), since they were more than seven characters long and contained both numeric and alphabetic characters. In fact, all of these passwords were stronger than PCI-DSS minimums because they were all at least eight characters long, and all contained at least one upper-case letter, one lower-case letter, and one number. Many also contained special characters.
Password length
Most of the passwords (61%) were at or just above the minimum length, either 8 or 9 characters long. The average length was 9.6 characters, and the average password consisted of 1.1 upper-case letters, 6.1 lower-case letters, 2.2 numbers and 0.2 special characters.
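As a rough illustration of how averages like these can be tallied, here is a minimal Python sketch. It is not the script used for this analysis; the function names are hypothetical, and it assumes that any character that is not an ASCII letter or digit counts as “special.”

```python
import string


def count_character_classes(password):
    """Count upper-case, lower-case, digit, and special characters."""
    counts = {"upper": 0, "lower": 0, "digit": 0, "special": 0}
    for ch in password:
        if ch in string.ascii_uppercase:
            counts["upper"] += 1
        elif ch in string.ascii_lowercase:
            counts["lower"] += 1
        elif ch in string.digits:
            counts["digit"] += 1
        else:
            # Assumption: everything else (punctuation, space, non-ASCII)
            # is treated as a special character.
            counts["special"] += 1
    return counts


def average_composition(passwords):
    """Average the per-class counts and the length over a list of passwords."""
    totals = {"upper": 0, "lower": 0, "digit": 0, "special": 0, "length": 0}
    for pw in passwords:
        for key, value in count_character_classes(pw).items():
            totals[key] += value
        totals["length"] += len(pw)
    n = len(passwords)
    # An empty input has no meaningful average, so return an empty dict.
    return {key: total / n for key, total in totals.items()} if n else {}
```

Calling average_composition() on the full password list would yield figures comparable to the ones quoted above.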
Password complexity
When an upper-case letter was used, it was almost always (86%) the only upper-case letter in the password, and it was usually at the start of the password. When lower-case letters were used, there were five to seven of them most (59%) of the time. When numbers were used, numbers between zero and 99 were used most (63%) of the time and single numbers were very common (41%).
Years were also very common, as evidenced by the high incidence (20%) of four-digit numbers, and most of these were in the range from 1900 to 2015. The current year (2013) was an especially popular (5%) password inclusion.
Although special characters (e.g., “!” or “#”) were not required, 17% of all end users included them. (This was good news.) In almost all cases (90%), only a single special character was used. The most popular special character sequences were all single characters: exclamation point (“!”, 29%), period (“.”, 19%), “at” symbol (“@”, 15%) and hash (“#”, 14%). These were followed by the single dash (“-”), dollar sign (“$”), space (“ ”), asterisk (“*”), and plus sign (“+”), each making up between 3% and 6% of the single-character special character population. Passwords containing multiple special characters mainly (68%) just repeated the same special character, such as “##” or “???”.
Password predictability: Similar to default password
This was a system that sent a common fixed password to all end users, so I also had the chance to see if that was a factor in password selection. For example, if an initial password was “RedBlue1,” I looked to see which end users just changed their password to something like “RedBlue2” or “GreenBlue1.” Unfortunately “similarity to original password” was a factor, with many (13%) end users opting for this pattern.
Password predictability: Similar to username
Since I was also able to compare usernames to their passwords, I could check for username/password similarities. For example, if a username was “john.smith@corp.com,” I looked for passwords like “John2013,” “JSmith13,” or “!corp123.” Unfortunately, many end users (10%) did select a password that was strikingly similar to their username.
Password predictability: Containing dictionary words
After the analysis of similarity between the initial password and username, I looked for passwords that were similar to about 168,000 English language dictionary words (similar to a “Scrabble® dictionary”) of four characters or more. The vast majority (75%) of all passwords matched one or more of these words. Some users probably used dictionary words safely (using multiple words to spell a memorable phrase) since the maximum password length was a full 24 characters, but most did not. Furthermore, some users (2%) actually used the word “password” or “pass” in their password, suggesting that unsafe use of words is still common.
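For readers who want to run a comparable check, the following Python sketch shows one way to find dictionary words inside a password. It is only an approximation of the analysis described above: it assumes “similar to” means a case-insensitive exact substring match, and the function names and the word list passed in are hypothetical.

```python
def load_words(lines, min_length=4):
    """Build a lower-cased word set, skipping words shorter than min_length."""
    return {w.strip().lower() for w in lines if len(w.strip()) >= min_length}


def dictionary_words_in(password, words, min_length=4):
    """Return every dictionary word that appears as a substring of the password."""
    lowered = password.lower()
    found = set()
    # Try every substring of at least min_length characters. Passwords are
    # short, so this costs far less than scanning the whole dictionary.
    for start in range(len(lowered)):
        for end in range(start + min_length, len(lowered) + 1):
            candidate = lowered[start:end]
            if candidate in words:
                found.add(candidate)
    return found
```

Because the lookup walks the substrings of the password rather than the lines of the dictionary, the cost depends on the password length and not on the size of the word list.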
Password predictability: Containing keyboard patterns
Finally, I looked for (US standard) keyboard patterns in the passwords. These are common groupings of keys such as “123,” “qwer” and “poiu.” Even though I only tested a few dozen sequences, they were popular (7%) inclusions in people’s passwords.
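A keyboard-pattern check along these lines can be sketched in Python as follows. Note the assumptions: I tested a hand-picked list of a few dozen sequences, whereas this sketch generates every run of adjacent keys on the four main rows of a US keyboard, in both directions. The names KEYBOARD_ROWS, keyboard_runs, and find_keyboard_patterns are hypothetical.

```python
# Main rows of a US standard keyboard (unshifted characters only).
KEYBOARD_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]


def keyboard_runs(min_length=3):
    """Generate every run of min_length adjacent keys, in both directions."""
    runs = set()
    for row in KEYBOARD_ROWS:
        for direction in (row, row[::-1]):
            for start in range(len(direction) - min_length + 1):
                runs.add(direction[start:start + min_length])
    return runs


def find_keyboard_patterns(password, min_length=3):
    """Return the keyboard runs found in the password, ignoring case."""
    lowered = password.lower()
    return sorted(run for run in keyboard_runs(min_length) if run in lowered)
```

With min_length=3 this flags “123” in a password such as “Summer123!”, and with min_length=4 it flags “qwer” and “poiu”. Shorter runs catch more passwords but also produce more false positives, so the threshold is a tuning decision.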
Password analysis conclusions
The conclusion from my password analysis was startling, even though the system was set to comply with, if not exceed, PCI-DSS password complexity regulations. It suggested that about one-quarter of the passwords were similar to the system’s default password, the user’s username, or the word “password,” and thus were vulnerable to an intelligent brute force attack, phishing, or social engineering*. Many of the rest of the passwords could be derived from common dictionary words (usually with the first letter capitalized), short number sequences or years, and one of four special characters (“!.@#”).
Defending against weak but compliant-on-paper passwords
If you have read this far, I think it is safe to assume that you agree that passwords are a necessary evil, and that not every system can or should be converted to strong authentication by using tokens, certificates, keys, or biometrics. With that in mind, I will describe two general approaches to defending against weak but compliant-on-paper passwords: one each from the perspective of a developer and a sysadmin.
Advanced password protection for sysadmins
Sysadmins are often at the mercy of the technology they purchase, configure, and deploy, so this guide simply asks you to purchase, configure, and deploy technology that checks for and rejects passwords matching the following patterns:
Disallow passwords SIMILAR to usernames. (Not merely “contains usernames.”) Make sure this is case-insensitive (e.g., “Smith” matches “smith”) and, ideally, that it can handle a slight offset (e.g., “john.smith” matches “jsmith”).
Disallow dictionary words (e.g., “duck”), unless multiple dictionary words are used to construct a phrase (“DuckJasperNinePaddy”). This comparison should also be case-insensitive.
Passwords containing the phrase “pass” (or “password”) should ALWAYS be disallowed. The use of custom dictionaries may be preferred, as described below.
Disallow keyboard sequences (e.g., “qwer”). This may be implemented as custom entries in a password dictionary.
Disallow common date sequences such as years from 1900-2050, month names, and quarter designations such as “Q1.” (All of these are commonly used to defeat password rotation policies.)
Optional: Disallow passwords similar to initial or default passwords if a single initial password (e.g., “RedBlue1”) or password pattern (e.g., last four digits of SSN) is used as each end user’s initial password. This may often be implemented as custom entries in a password dictionary.
and, of course:
Require a minimum length and complexity (mix of upper-case, lower-case, numeric, and special characters), and regular password changes (e.g., every 90 days) with no repeats for some period (e.g., three years).
Admittedly, products containing ALL of these protections are rare today, but the more you ask for them, the higher the chance that smart and security-conscious product managers will record and run with these ideas.
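If your product accepts custom rules or scripts, the date-sequence rule from the list above is one of the easier ones to express. The Python sketch below is an illustration under stated assumptions, not a feature of any particular product: it treats only standalone four-digit numbers from 1900 to 2050 as years, matches full English month names, and matches “Q1” through “Q4.” All names in it are hypothetical.

```python
import re

MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
]

# A four-digit year from 1900 to 2050 that is not part of a longer number.
YEAR_RE = re.compile(r"(?<!\d)(19\d{2}|20[0-4]\d|2050)(?!\d)")

# Quarter designations Q1 through Q4 (the password is lower-cased first).
QUARTER_RE = re.compile(r"q[1-4]")


def find_date_sequences(password):
    """Return the years, month names, and quarter tags found in a password."""
    lowered = password.lower()
    hits = YEAR_RE.findall(lowered)
    # Assumption: plain substring matching, so short names such as "may"
    # will also match inside longer words.
    hits += [month for month in MONTH_NAMES if month in lowered]
    hits += QUARTER_RE.findall(lowered)
    return hits
```

A password would be rejected whenever find_date_sequences() returns a non-empty list. For example, “March2014Q2” yields a year, a month name, and a quarter designation.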
Advanced password protection for developers
If you develop technology that allows end users or sysadmins to change passwords, you have a special responsibility to ensure that your technology enforces good passwords. The attributes that you need to enforce are listed above in the “Advanced password protection for sysadmins” section, but I have also provided a few hints below to help you implement these rules.
Algorithm to detect username/password (or initial password) similarity
The following function returns True if any run of “window” consecutive characters appears in both phrases, ignoring case. It returns False if “window” is longer than either of the two phrases or if no match is found.
[python]
def phrase_similar_to_phrase(phrase1, phrase2, window):
    """Return True if any run of `window` characters appears in both phrases."""
    # A phrase shorter than the window cannot contain a full-length match.
    if len(phrase1) < window or len(phrase2) < window:
        return False
    # Compare case-insensitively (e.g., "Smith" matches "smith").
    lowered1, lowered2 = phrase1.lower(), phrase2.lower()
    # Slide the window across the first phrase, including the final position.
    for i in range(len(lowered1) - window + 1):
        if lowered1[i:i + window] in lowered2:
            return True
    return False
[/python]
Optimizing dictionary password checks for performance
Dictionary checks are most frequently done by comparing an in-memory password with each of several thousand lines from a text file full of dictionary words. If you check passwords against a dictionary frequently (e.g., many times a minute), or if response time or file I/O is a concern, it may be worth building a long-running thread or service that loads the dictionary file and refreshes its in-memory cache every few minutes. Then, you can replace your inline code that iterates through your password dictionary (case-insensitively, remember) with a quick interprocess call to the password quality thread or service. Alternatively, you could import your password dictionary file into a database and use the database (service) to perform string comparisons against a table for you. However, relational database indexes may not be of much use because you need to perform partial string searches and case-insensitive string searches.
Algorithms not covered here
As a developer, you probably already know how to count lower-case, upper-case, special characters, and numeric characters in a string. There are also various ways to find a year in a string, especially if you just want to find the current year.
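To make the in-memory caching idea from the performance discussion more concrete, here is a minimal Python sketch of a dictionary cache that reloads its word file after a configurable age. It is one possible shape for such a component, not a reference implementation: the class name CachedDictionary, its methods, and the five-minute default are all assumptions, and a real service would add the interprocess interface around it.

```python
import threading
import time


class CachedDictionary:
    """Keep a lower-cased word set in memory and reload it from disk periodically."""

    def __init__(self, path, max_age_seconds=300):
        self._path = path
        self._max_age = max_age_seconds
        self._lock = threading.Lock()
        self._words = frozenset()
        self._loaded_at = None  # None means "never loaded"

    def _reload(self):
        # Read one word per line, dropping blank lines and normalizing case.
        with open(self._path, encoding="utf-8") as handle:
            self._words = frozenset(
                line.strip().lower() for line in handle if line.strip()
            )
        self._loaded_at = time.monotonic()

    def words(self):
        """Return the cached word set, reloading it first if it is stale."""
        with self._lock:
            stale = (
                self._loaded_at is None
                or time.monotonic() - self._loaded_at > self._max_age
            )
            if stale:
                self._reload()
            return self._words

    def contains_word(self, password):
        """Case-insensitive check: does any cached word occur in the password?"""
        lowered = password.lower()
        return any(word in lowered for word in self.words())
```

The lock lets several request threads share one instance, and the file is only touched when the cache has aged out, which keeps file I/O off the password-change path most of the time.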
Notes
* Regarding password strength and phishing or social engineering: Imagine an email message or inbound phone call purporting to come from “your security team,” telling end users that they are putting the company at risk by choosing a weak password that is too similar to their username. The 10% of your end users who know this is true might click an email link or pull up a site to “change their password to something safer.” Of course, regardless of how safe their “new” password is, as soon as they enter their “old” (current) password into the hacker’s capture site, their accounts are compromised.