As one of the creators of bcrypt back in 1997, I find it somewhat surprising that, 25 years later, we still rely heavily on passwords. My initial encounter with password security took place in 1993 at the University of Hamburg. Having just discovered Unix, I was fascinated by a service called Yellow Pages that allowed anyone to dump encrypted password databases. When I reached out to my friends at other universities, requesting their password dumps to run crack on them, some of the system administrators were quite displeased with me. They found themselves victims of weak system security and an insecure password hashing algorithm. While I can't claim that this directly inspired my future work, it's clear that bcrypt ultimately helped to rectify some of the wrongs committed during my innocent youth.
Password Guessing
Password guessing tools are like toddlers: they'll try anything, and they won't give up easily. - Google’s Bard AI.
Algorithm | Security | Adaptable Work Factor | Memory-hardness | Year Introduced |
---|---|---|---|---|
CRYPT | Not recommended | No | No | 1970s |
MD5crypt | Not recommended | No | No | 1994 |
BCRYPT | High | Yes | No | 1999 |
PBKDF2WithHmacSHA1 (RFC 2898, 2000) | Moderate to high | Yes | No | 2000 |
Scrypt (Percival, 2013) | High | Yes | Yes | 2009 |
Argon2 (Biryukov & Dinu, 2016) | High | Yes | Yes | 2015 |
Back in 1997, I had little idea that bcrypt would be a significant innovation in password security. With the Internet's rising popularity, data breaches involving hashed passwords were becoming commonplace, revealing shortcomings in the password hashing algorithms of the time. This underscored our motivation to create bcrypt, an algorithm designed to resist the rapid advancements in computational power.
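The idea of an adaptable work factor can be illustrated with a minimal sketch. Because bcrypt is not part of the Python standard library, the sketch below swaps in PBKDF2-HMAC-SHA1 from `hashlib`, the other adaptable algorithm in the table above. The function names and the iteration counts are hypothetical and chosen only for illustration; they are not recommendations.

```python
import hashlib
import hmac
import os
import time


def hash_password(password, iterations):
    """Return (salt, iterations, key); all three are stored with the account."""
    salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha1", password.encode(), salt, iterations)
    return salt, iterations, key


def verify_password(password, salt, iterations, key):
    """Recompute the hash with the stored salt and cost, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha1", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, key)  # constant-time comparison


def time_one_hash(iterations):
    """Measure how long a single hash takes at a given cost (in seconds)."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha1", b"benchmark", b"0123456789abcdef", iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical costs: each step multiplies the work per guess by ten.
    for iterations in (1_000, 10_000, 100_000):
        print(iterations, f"{time_one_hash(iterations):.4f}s")
```

Because the cost is stored next to each hash, it can be raised for new passwords as hardware gets faster without invalidating existing entries.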
Both memory-hardness and adjustable parallelism were meant to counter password guessing on specialized hardware such as GPUs. Memory-hardness increases the password guessing cost by requiring a substantial amount of memory for each guess. This is especially effective in deterring attackers who use parallel computing resources to speed up password cracking. Memory-hardness was an idea first introduced by Abadi et al. in 2003 and applied to passwords by Percival in 2009. Adjustable parallelism allows the algorithms to make use of multiple processing cores, further increasing the computation cost and reducing efficiency gains through guessing passwords in parallel.
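As a rough sketch of how memory-hardness shows up in practice, the example below uses `hashlib.scrypt` from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The helper names and the parameter values are assumptions made for illustration. The memory estimate uses the common approximation that scrypt's main working array occupies about 128 * n * r bytes, and p is the parallelism parameter.

```python
import hashlib


def scrypt_memory_bytes(n, r):
    """Approximate size of scrypt's main working array: 128 * n * r bytes."""
    return 128 * n * r


def scrypt_hash(password, salt, n=2**12, r=8, p=1):
    """Derive a 32-byte key; n sets CPU/memory cost, r block size, p parallelism."""
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)


if __name__ == "__main__":
    # Hypothetical parameter choices; every guess must pay this memory cost.
    for n in (2**10, 2**12, 2**14):
        mib = scrypt_memory_bytes(n, 8) / (1024 * 1024)
        print(f"n={n}, r=8 needs about {mib:.0f} MiB per guess")
    print(scrypt_hash(b"correct horse", b"0123456789abcdef").hex())
```

An attacker running many guesses in parallel needs this much memory for each concurrent guess, which is what makes wide parallel hardware less attractive.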
There are only two types of companies: Those that have been hacked and those that will be hacked. – Robert S. Mueller III, former Director of the FBI.
I don't need a password manager, I just use the same password for everything. - Unknown
Initially, I believed that generating secure password hashes might be considered a solved problem. Sadly, industry adoption hasn't been as swift as I had hoped. Even after the introduction of bcrypt and other more secure password hashing algorithms, numerous high-profile data breaches have occurred involving passwords hashed with weaker algorithms. These incidents not only highlight the importance of strong password hashing algorithms but also expose the sluggish pace at which the industry embraces new security technologies. Here are a few notable examples:
- LinkedIn (2012): In 2012, LinkedIn was hacked, and approximately 117 million password hashes were stolen. The passwords were stored using the SHA-1 hashing algorithm, which is no longer considered secure.
- Yahoo (2013 and 2014): In 2013 and 2014, Yahoo was hacked, and a total of 3 billion user accounts were compromised. The accounts included names, email addresses, passwords, and birth dates. The passwords were stored using a variety of hashing algorithms, including MD5, SHA-1, and bcrypt.
- Adobe (2013): In 2013, Adobe was hacked, and 153 million user accounts were compromised. The accounts included names, email addresses, passwords, and credit card numbers. The passwords were stored using a custom encryption method rather than a proper hashing algorithm, making it easier for attackers to crack them. The password hints were stored using the SHA-1 hashing algorithm.
- MySpace (2016): The MySpace breach was reported in 2016, but it is believed to have occurred around 2013. In this incident, 360 million password hashes were stolen. The passwords were stored unsalted using the SHA-1 hashing algorithm.
These breaches underscore the importance of using strong password hashing algorithms like bcrypt, scrypt, and Argon2. While bcrypt had already been introduced and was known to offer stronger protection, many organizations continued to use weaker hashing algorithms, leaving user data vulnerable to attackers. These breaches have motivated the increased adoption of more secure password storage methods and the development of more advanced hashing algorithms to safeguard user information. Alas, even now, we should expect more breaches where weak password hashing algorithms were used. As I will argue later, security is no longer a technical problem — rather it is a people and incentives problem.
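To make the "unsalted" detail in the MySpace example concrete, here is a minimal sketch of what a missing salt means. The function names are hypothetical. SHA-1 appears only because the breaches above involved it; adding a salt to a fast hash does not make it a suitable password hash, since each guess remains cheap.

```python
import hashlib
import os


def unsalted_sha1(password):
    """Same password always yields the same digest, for every user."""
    return hashlib.sha1(password.encode()).hexdigest()


def salted_sha1(password, salt):
    """A per-user salt makes equal passwords hash to different values."""
    return hashlib.sha1(salt + password.encode()).hexdigest()


if __name__ == "__main__":
    # Two users who picked the same password.
    print(unsalted_sha1("123456") == unsalted_sha1("123456"))  # identical digests
    salt_a, salt_b = os.urandom(16), os.urandom(16)
    print(salted_sha1("123456", salt_a) == salted_sha1("123456", salt_b))
```

Without a salt, one guess can be checked against every stolen hash at once, and precomputed tables apply directly. With a salt, the attacker has to repeat the work per account.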
Password Hashing Performance Over Time
Over the last 30 years, we have seen password cracking techniques and hardware evolve significantly, resulting in substantial improvements in password guessing performance. Given the performance improvements in CPUs and GPUs, it has become obvious how important it is to employ work-cost adaptation for password hashing algorithms. To demonstrate this progression, I've compiled a list of password guessing scenarios over time, highlighting the number of password guesses per second for various algorithms and tools. Please keep in mind that these figures serve as an overview of general trends and might not be directly comparable due to differences in hardware, software, and configurations used.
- 1978: PDP-11/70 (M-209 simulated crypt) → 800 passwords/second
- 1988: VAX 8600 (Morris-worm optimized des-crypt) → 45 passwords/second
- 1994: 60MHz Pentium (MD5-based crypt) → 29.41 passwords/second
- 1999: John the Ripper (bitsliced DES-crypt) → 214,000 passwords/second
- 1999: John the Ripper (bcrypt with work factor 5) → 62.5 passwords/second
- 2018: Hashcat (des-crypt on a GPU-based rig) → 1.7 billion passwords/second
- 2018: Hashcat (MD5 hashes on a GPU-based rig) → 45.4 billion passwords/second
- 2018: Hashcat (SHA-1 hashes on a GPU-based rig) → 14.6 billion passwords/second
- 2018: Hashcat (bcrypt with work factor 5 on a GPU-based rig) → 47.2 thousand passwords/second
- 2018: Hashcat (scrypt on a GPU-based rig) → 1.4 million passwords/second
- 2022: Hashcat (des-crypt on an RTX 4090 GPU) → 6.3 billion passwords/second
- 2022: Hashcat (bcrypt with work factor 5 on an RTX 4090 GPU) → 184 thousand passwords/second
Over the course of 34 years, we've witnessed an extraordinary leap in the password guessing speed for DES-crypt; it's accelerated from a mere 45 passwords per second to an astounding 6.3 billion passwords per second. This advance doesn't just surpass Moore's law — it pulverizes it, tripling the password cracking speed every two years.
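The growth rate can be checked with a short calculation over the figures from the list above. This is only a sketch: as noted earlier, the measurements come from different hardware and tools, so the result describes a rough trend rather than a precise law. The function name is hypothetical.

```python
def growth_per_period(start_rate, end_rate, years, period_years=2.0):
    """Average multiplicative speedup per period, assuming steady exponential growth."""
    return (end_rate / start_rate) ** (period_years / years)


# DES-crypt: 45 guesses/s (1988) to 6.3 billion guesses/s (2022).
des_growth = growth_per_period(45, 6.3e9, 2022 - 1988)

# bcrypt at work factor 5: 62.5 guesses/s (1999) to 184 thousand guesses/s (2022).
bcrypt_growth = growth_per_period(62.5, 184_000, 2022 - 1999)

if __name__ == "__main__":
    print(f"DES-crypt: x{des_growth:.2f} every two years")
    print(f"bcrypt:    x{bcrypt_growth:.2f} every two years")
```

With these inputs, DES-crypt comes out at roughly a tripling every two years, while bcrypt at a fixed work factor of 5 comes out at roughly a doubling over the same interval.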
Bcrypt's Enduring Performance Over the Past 25 Years
Personally, I think that bcrypt and SHA-crypt are quite close to becoming obsolete in favor of something new, yet right now I would still recommend moving to bcrypt as the most suitable pre-existing password hashing method. - Solar Designer, June 2012 on Bugtraq mailing list
Throughout my 25-year journey with bcrypt, I've seen it withstand challenges posed by rapid advancements in computing power and the development of new password cracking techniques. Surprisingly, it still remains an effective and reliable choice for password hashing.
We designed the adaptable work factor in bcrypt primarily to keep pace with increasing CPU performance, but we also included factors to make it more robust against custom ASICs or GPU acceleration, e.g., by relying heavily on instructions that general-purpose CPUs execute efficiently. A crucial aspect of bcrypt's resistance to GPU optimization lies in its memory access pattern spanning 4KB during the key setup phase, which is inefficient to parallelize on GPUs with small L1 caches (Malvoni et al., 2014). That said, NVIDIA increased the L1 cache significantly to 16MB for the RTX 4090, and bcrypt’s 4KB is a tiny amount of memory these days.
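A back-of-the-envelope sketch shows how the work factor and the 4KB state interact with the numbers quoted in this article. It assumes that each increment of the bcrypt work factor doubles the time per guess, which is an approximation, and the target work factor of 12 is a hypothetical value chosen for illustration. The function names are also hypothetical.

```python
def rate_at_cost(rate_at_base, base_cost, target_cost):
    """Estimate guesses/s at target_cost, assuming work doubles per cost step."""
    return rate_at_base / 2 ** (target_cost - base_cost)


def states_in_cache(cache_bytes, state_bytes=4 * 1024):
    """How many 4KB bcrypt states fit into a cache of the given size."""
    return cache_bytes // state_bytes


if __name__ == "__main__":
    # 184 thousand guesses/s at work factor 5 on an RTX 4090 (figure from the list above).
    print(rate_at_cost(184_000, base_cost=5, target_cost=12))
    # 16MB of L1 cache versus bcrypt's 4KB working state.
    print(states_in_cache(16 * 1024 * 1024))
```

Under these assumptions, raising the work factor from 5 to 12 divides the attacker's guessing rate by 128, while a 16MB cache has room for thousands of bcrypt states at once.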
Bcrypt's endurance can be attributed to several other factors beyond our intentional algorithm design. Its wide availability in open-source implementations has facilitated widespread adoption and integration into various systems. According to Wikipedia, there are implementations of bcrypt in C, C++, C#, Embarcadero Delphi, Elixir, Go, Java, JavaScript, Perl, PHP, Python, and Ruby. Moreover, bcrypt's focus on computational cost scaling makes it an attractive choice for large Internet services compared to newer algorithms like Argon2, which also scale in memory consumption. As large-scale Internet services handle most password checking and need to optimize resource consumption for economic reasons, bcrypt's robust security, without excessive memory demands, proves beneficial. Modern approaches like Argon2, with their memory-hardness, can be less compelling in some cases, as their increased memory consumption may not align with resource optimization goals, contributing to bcrypt's continued relevance in password security.
That said, Facebook has successfully deployed scrypt at 16MiB memory consumption and Solar Designer has worked on an extension of scrypt called yescrypt that can leverage large amounts of site-shared read only memory to overcome the throughput concerns faced by large-scale Internet services (Solar Designer, 2017).
The Future of Passwords and Security
Like the pulsing rhythm in an EDM track, bcrypt keeps the beat of password security, forever adapting to the increasing tempo of computational power. - Activ8te, 2023.
Over the years, I've observed modern password hashing algorithms significantly reduce the effectiveness of brute-force password guessing. However, credential stuffing attacks (also called password stuffing), where attackers use previously leaked credentials to gain unauthorized access, continue to be a persistent threat. At the same time, the advent of multi-factor authentication (MFA) has shifted the focus to protecting user accounts through additional layers of verification, making passwords less critical to security.
In today's digital landscape, the adoption of cloud services has increased, with most critical data being stored and processed remotely. Brute-force password guessing against online services often encounters rate-limiting protections and captchas that require human interaction. As a result, attackers targeting cloud-based services tend to exploit vulnerabilities or use social engineering to compromise insiders to breach sensitive customer or user data.
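The kind of rate limiting mentioned above can be sketched as a per-account limit on failed logins within a sliding time window. This is a simplified illustration, not a description of any particular service; the class name, thresholds, and in-memory storage are all assumptions.

```python
import time
from collections import defaultdict, deque


class LoginThrottle:
    """Block further attempts after too many recent failures for one account."""

    def __init__(self, max_failures=5, window_seconds=300.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._failures = defaultdict(deque)  # account -> failure timestamps

    def _prune(self, account):
        """Drop failures that have aged out of the window."""
        cutoff = self.clock() - self.window_seconds
        recent = self._failures[account]
        while recent and recent[0] <= cutoff:
            recent.popleft()
        return recent

    def allowed(self, account):
        """True if this account may attempt another login right now."""
        return len(self._prune(account)) < self.max_failures

    def record_failure(self, account):
        """Call after a failed password check."""
        self._prune(account).append(self.clock())
```

With a limit of five failures per five minutes, an online attacker gets at most a few guesses per minute per account, compared with the offline rates of thousands to billions per second listed earlier.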
Despite these challenges, passwords remain an integral part of our digital lives. Their ease of deployment and creation, and the fact that they can be revoked, make them a convenient choice even today. While single sign-on is now nearly ubiquitous, there is a significant risk in trusting a single identity provider (IdP) with one's entire online presence. For instance, an IdP might disable your account due to poorly designed abuse detection rules, or because your account has been erroneously flagged. Utilizing a password eliminates the risky dependency on third-party services, which, sad to say, is a pertinent concern. Until alternatives such as self-sovereign identity (Allen, 2016), where users retain control over their identities, gain widespread acceptance, the reign of passwords is likely to endure.
In my experience, most security professionals consider password security a solved problem from a technical perspective. I would go even further and claim that security as a whole is no longer primarily a technical problem. By and large, most security challenges have technological solutions that sufficiently address them. Instead, the main obstacle I see to companies improving their security and reducing the frequency of data breaches boils down to human factors and the cost of adopting existing security technologies.
Let’s talk about adopting security technologies first. Currently, there are no off-the-shelf solutions that lead to good security outcomes. Truly robust security is often seen only in Internet-based software engineering companies, where the executive team places a high priority on security and is prepared to invest in it. Most other companies seek an appropriate trade-off between investing in business growth and investing in security. The incentives are such that it is rational to invest in business growth and treat a data breach as an eventuality to be dealt with when it happens. Tighter regulations, more stringent enforcement, and increased liability might change that calculus.
For the few companies that achieve a mature security posture, human factors often dominate their security concerns, and insider risk becomes the primary focus. Many of these threats stem from the vulnerability of humans to social engineering, leading them to act on behalf of others, or from disgruntled employees who, in a moment of weakness, abuse their privileges to steal and leak sensitive information. Addressing insider risk is a more complex challenge, as it involves a human component and typically lacks a straightforward technological solution.
That said, it's much easier to address human behavior when there's a strong technological security foundation in place. A robust approach to securing infrastructure from both external attacks and rogue insiders necessitates the development of strong security invariants integrated into the infrastructure to mitigate risks. This often calls for engineering bespoke solutions and employing skilled software and security engineers. Taking such an approach brings us right back to the previous point, i.e. the high cost of adopting security technologies.
It turns out that even companies that are willing to invest heavily in improving their security face significant challenges in finding and hiring skilled engineers with security domain expertise. As such, one of the most significant challenges in addressing security issues may be the scarcity of skilled security professionals. I believe that there is an urgent need to generate more interest in the field and create a strong talent pipeline to ensure the continued development of effective security measures.
To address this scarcity of talent, I've pursued a very unconventional approach. I've embarked on a new venture as an EDM (Electronic Dance Music) producer under the artist name Activ8te, creating cybersecurity-themed EDM tracks. My goal is to captivate a younger audience and ignite their interest in security topics. Some of my recent tracks, like "Teardrop Falling" and "I Am Tracking You," explore challenging security themes such as denial of service, censorship, and the risks to our privacy posed by sophisticated spyware (Activ8te, 2022). By raising awareness and enthusiasm for the field, I hope to contribute to the expansion of a skilled security professional pipeline.
Abadi, M., Burrows, M., Manasse, M., & Wobber, T. (2003). Moderately hard, memory-bound functions. NDSS Symposium.
Activ8te (2022). Cybersecurity meets EDM. https://activ8te.io/
Allen, C. (2016). The Path to Self-Sovereign Identity. https://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-id...
Biryukov, A., & Dinu, D. (2016). The Argon2 memory-hard function for password hashing and other applications. In International Conference on Fast Software Encryption (pp. 170-189). Springer, Berlin, Heidelberg.
Bonneau, J. (2012). The science of guessing: Analyzing an anonymized corpus of 70 million passwords. 2012 IEEE Symposium on Security and Privacy.
Chiasson, S., Biddle, R., & Van Oorschot, P. C. (2006). A second look at the usability of click-based graphical passwords. In Proceedings of the 3rd Symposium on Usable Privacy and Security (pp. 1-12). ACM.
Florencio, D., & Herley, C. (2007). A large-scale study of web password habits. In Proceedings of the 16th international conference on World Wide Web (pp. 657-666). ACM.
Levy, S. (1984). Hackers: Heroes of the Computer Revolution.
Percival, C. (2013). Stronger key derivation via sequential memory-hard functions. Self-published. https://sites.cs.ucsb.edu/~rich/class/old.cs290/papers/scrypt.pdf
Provos, N., & Mazieres, D. (1999). Bcrypt algorithm. USENIX ATC, FREENIX Track.
RFC 2898. (2000). PKCS #5: Password-Based Cryptography Specification Version 2.0. IETF.
Seeley, D. (1989). A Tour of the Worm. https://www.cs.unc.edu/~jeffay/courses/nidsS05/attacks/seely-RTMworm-89....
Solar Designer. (2012). Password security: past, present, future. Openwall.
Solar Designer. (2015). bcrypt: Cost, parallelism, and attacks on GPUs. Openwall.
Solar Designer (2017). yescrypt: large-scale password hashing. https://www.openwall.com/presentations/BSidesLjubljana2017-Yescrypt-Larg...
Ur, B., Bees, J., Segreti, S. M., Bauer, L., Christin, N., & Cranor, L. F. (2015). Measuring Real-World Accuracy and Impact of Password Strength Meters. In Proceedings of the 24th International Conference on World Wide Web (pp. 283-293).