SHA-1 Was Shattered

Boot.dev Team

Programming course authors and video producers

Last published June 16, 2026

A couple of weeks ago I downloaded a copy of OBS, and my operating system yelled at me. Told me I shouldn't trust it. And it was right, I shouldn't have been trying to download from fastandrealobsfree.ru.

But what if one of the most important algorithms that keeps you safe from stuff like that just stopped working?

Well, that happened. Worldwide.

On February 23rd, 2017, Google published two PDFs.

One has a blue background, the other red. Open them side by side, and it's pretty easy to see that they're not the same document.

But, if you feed both files into SHA-1, the hashing algorithm that, at the time, still protected software installs, certificate authorities, and a good chunk of the internet's identity infrastructure [3], the algorithm claims the two files are identical.

This is impossible.

At least, it should be impossible. Back when SHA-1 was designed, everyone thought that even if you used code to generate as many random input files as you possibly could, creating two files with identical hashes, called a "collision", would take so long that it wasn't even worth worrying about.

See, every Git commit has its own SHA-1 hash associated with it, and GitHub put the odds of an accidental SHA-1 collision this way: even if five million programmers each generated one commit per second, you'd only have about a 50% chance of seeing one before the sun swallowed the Earth [5].

Turns out, it took a team at Google about nine months to do it [1].

The team, a collaboration between researchers at CWI Amsterdam and Google's security group, called the project "SHAttered" [3].

Ars Technica ran an obituary-like headline: "At death's door for years, widely used SHA1 function is now dead" [9]. The Hacker News thread for the announcement racked up nearly five hundred comments, with developers ping-ponging between dissecting the math, arguing about which systems were actually exposed, and warning each other to stop trusting SHA-1 in production. Like, immediately [10].

Bruce Schneier, who's been writing about cryptography since the 90s, posted to his blog that, actually, the result was important, expected, and overdue [8]. He'd been calling SHA-1 broken since 2005 [12], and in 2012 he published a back-of-the-envelope cost projection on his blog that turned out to be pretty accurate [19][8]. It only took about $110,000 worth of compute, split across 6,500 years of CPU time and 110 years of GPU time [1][9].

Now, that sounds like a lot, but they compressed it into nine months by using Google-hosted hardware clusters spread across eight physical locations [9]. That 6,500 years of CPU time could be parallelized across thousands of machines, bringing the real-world time down to less than a year [1].

So people suspected SHA-1 could be on its last legs before 2017, and a collision had been theorized about, but it had never been practical to try to create one.

And it's important to understand, SHA-1 isn't just an academic algorithm - something like bubble sort that you study in school but never use in your own software. It was load-bearing. It was holding up installation signatures, version control, and deduplication systems - mostly boring technical plumbing to be fair, but important technical plumbing. And five researchers had just proven they could fool it for the price of a luxury car.

So in this post, we'll answer three important questions:

What even is a hash function?

How did SHA-1 actually die?

And why are people still using it?

Now, before we answer those 3 questions, let's finish the story.

Marc Stevens, Pierre Karpman, Elie Bursztein, Ange Albertini and Yarik Markov put out a blog post detailing that now-famous attack on SHA-1 [1] which produced what no good hash function should allow: two materially different PDFs with identical hashes.

And this part caught me by surprise: to comply with Google's vulnerability disclosure policy, they had to wait 90 days before publicly releasing the code they used to create the collision. But in the meantime, they did at least provide a free detection system to the public to help protect them from the same kind of attack [1].

The public detection tool was a file tester on the SHAttered website: upload a file, and it checked whether the file looked like part of the collision attack they'd discovered [3]. The detector didn't even need both colliding files. It could look at one suspicious file and catch the patterns they had associated with the problem [3]. We'll get to how that all works in a second.

But the main takeaway of the project was very simple: stop using SHA-1! They stressed the flaws in the algorithm, hypothesizing that well-funded attackers could follow in their footsteps and craft even more collisions.

Their exact words in the blog post were: "It's more urgent than ever for security practitioners to migrate to safer cryptographic hashes" [1].

What even is a hash function?

Okay, so at this point you're probably wondering: what even is this "hash function" they keep talking about?

Well, a hash function takes any data input, a word, a paragraph, a PDF, or even a movie file, and spits out a fixed-length string of characters [1][6]. Type "hello" into a SHA-1 hasher and you get back this:

aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d

Forty characters. Always forty characters, no matter what you put in.

Type "Hello" with a capital H, and the output is completely different.

f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0

Same for "jello".

2ced3ee86f82bf91c15cc30605df6d3ddf0769ff

Same for "cello".

b7cf79ff9671395a93c7dbcd99485ecf5e4d98c3

Same for adding a single space at the end of "hello ".

c4d871ad13ad00fde9a7bb7ff7ed2543aec54241

Notice that even these tiny changes in the input scramble the output beyond recognition. But it's NOT random. "hello" always hashes to this fingerprint.

aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d

And "jello" always hashes to that one.

2ced3ee86f82bf91c15cc30605df6d3ddf0769ff
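If you want to try this yourself, here's a minimal sketch using Python's standard `hashlib` module. The helper names `sha1_hex` and `differing_bits` are made up for this example. It prints the fingerprints for the words above and counts how many of the 160 output bits change when a single letter changes.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of a UTF-8 string as 40 hex characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many output bits differ between two hex digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# Deterministic: hashing the same input twice gives the same fingerprint.
assert sha1_hex("hello") == sha1_hex("hello")

# Tiny input changes produce unrelated-looking fingerprints.
for word in ["hello", "Hello", "jello", "cello", "hello "]:
    print(f"{word!r:10} {sha1_hex(word)}")

# Number of bits (out of 160) that differ between two one-letter-apart inputs.
print(differing_bits(sha1_hex("hello"), sha1_hex("jello")))
```

For a well-behaved hash you'd expect roughly half of the bits to flip, which is why the outputs look unrelated.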

That's why hashes are so useful for managing file downloads. Say you download a game. Your operating system or anti-virus can hash the game file you downloaded and compare that hash against the hash the developer published as the "official version". If they match, we can be confident that this is the official game file. If they don't match, there's no guarantee.

Maybe the file got corrupted. Or maybe there's just a new version the anti-virus doesn't know about. Or maybe someone swapped in malware that asks for access to your files, your camera, your passkeys, or whatever else your machine is supposed to protect.
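Here's a rough sketch of that download check in Python. It's an illustration, not how any particular operating system or anti-virus does it: the function names are hypothetical, an in-memory `io.BytesIO` stands in for the downloaded file, and it uses SHA-256 rather than SHA-1.

```python
import hashlib
import hmac
import io

def sha256_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()

def matches_published(stream, published_hex: str) -> bool:
    """True only if the computed digest equals the published one."""
    actual = sha256_of_stream(stream)
    return hmac.compare_digest(actual, published_hex.strip().lower())

# Stand-in for a downloaded file; in practice you'd pass open(path, "rb").
download = io.BytesIO(b"pretend this is the game installer")
published = hashlib.sha256(b"pretend this is the game installer").hexdigest()
print(matches_published(download, published))
```

Note that a mismatch only tells you the bytes differ from the published version. It can't tell you which of the reasons above caused it.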

When Windows updates itself, yes, even Windows, Microsoft uses code-signing signatures to authenticate that updates came from Microsoft and weren't tampered with. Wrong signature, no update [14].

Hash functions have four big properties that make them useful (and secure) - and breaking just one of them is enough to cause serious problems.

First, hash functions need to be deterministic, meaning you put in the same input, you get the same output, every time. You can put "hello" into SHA-1 tomorrow, next week, or a month from now - and still get the same hash back.

Second, SHA-based hash functions are fast to compute. Even on something like a Raspberry Pi, you can compute millions of hashes per second. Which is important, because some input files aren't just simple strings of a few kilobytes - they can be gigabytes or even terabytes in length.

Third, the hash function only goes one-way, so you can't reverse-engineer the input from just the hash. So if you have a really embarrassing bit of software you wrote a few years ago, like this:

def print_grade(average):
    # Deliberately over-nested: each level narrows the grade band.
    if average >= 60:
        if average >= 70:
            if average >= 80:
                if average >= 90:
                    print("You got an A")
                    return
                print("You got a B")
                return
            print("You got a C")
            return
        print("You got a D")

even if you published the release's hash to your GitHub, no one can get back to the source from just the digest.

And last (this is the one the SHAttered team broke), a hash function MUST BE collision resistant. A collision is, of course, when two different input files produce the same output fingerprint. It should be practically impossible to find two different inputs that produce the same output [6].

Now you might think, well, those fingerprints are pretty short strings of text. Just 160 bits [6]. Well, 2^160 is actually about 1.46 quindecillion:

1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976

so... seems like we're safe. Right?

Well, it's THAT promise of collision resistance that the SHAttered team broke. Two visibly different files, with the same 40-character SHA-1 fingerprint [3].
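In code, "a collision" is a very simple condition to check. The sketch below is only an illustration: `is_sha1_collision` is a made-up helper, and the file names in the comments are placeholders for wherever you save the two PDFs.

```python
import hashlib

def is_sha1_collision(a: bytes, b: bytes) -> bool:
    """True when the contents differ but the SHA-1 digests are identical."""
    return a != b and hashlib.sha1(a).digest() == hashlib.sha1(b).digest()

# Ordinary inputs do not collide.
print(is_sha1_collision(b"blue background", b"red background"))

# Assumed usage with the two published PDFs saved locally (placeholder names):
# first = open("shattered-1.pdf", "rb").read()
# second = open("shattered-2.pdf", "rb").read()
# is_sha1_collision(first, second)
```

If you also hash both PDFs with `hashlib.sha256`, the digests should differ, because the collision was engineered specifically against SHA-1.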

How did SHAttered work?

So let's talk about how the SHAttered attack actually worked, because no, they didn't just brute force one and a half quindecillion different inputs.

The interesting thing about collisions is that while they're incredibly improbable, they are not designed to be impossible. 1.46 quindecillion is still a finite number. And the number of possible inputs you could shove into the function is effectively infinite. There is technically an input limit, but it's just an implementation detail, so it might as well be infinite.

Every possible file, every possible string of text, and every possible video of your grandma's cat that she sends you every Wednesday at 6:51 AM... they can all be hashed.

More inputs than outputs. So, some of those inputs have to collide. It's called the pigeonhole principle: if you have more items than containers, at least one container has to hold more than one item. If thirteen people walk into a room, at least two of them must share a birth month.
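You can watch the pigeonhole principle happen if you shrink the hash. The sketch below is a toy, not an attack on SHA-1: it keeps only the first 4 hex characters of each SHA-1 digest (16 bits, so 65,536 possible outputs) and hashes distinct messages until two of them share a fingerprint. The names `tiny_hash` and `find_collision` are made up for the example.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, hex_chars: int = 4) -> str:
    """Toy hash: keep only the first few hex characters of SHA-1."""
    return hashlib.sha1(data).hexdigest()[:hex_chars]

def find_collision(hex_chars: int = 4):
    """Hash distinct inputs until two of them land in the same pigeonhole."""
    seen = {}  # tiny hash -> first input that produced it
    for i in count():
        message = f"message-{i}".encode()
        tag = tiny_hash(message, hex_chars)
        if tag in seen:
            return seen[tag], message, tag
        seen[tag] = message

first, second, shared = find_collision()
print(first, second, shared)
```

With only 65,536 pigeonholes, 65,537 distinct inputs are guaranteed to collide, and in practice the search usually ends after a few hundred tries. The full 160-bit output is what makes the same search hopeless against real SHA-1.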

So when cryptographers said SHA-1 was secure, they didn't mean collisions don't exist. They meant finding one should be so absurdly expensive that the sun burns out before you're able to finish your search.

To be fair, for a 160-bit hash, a brute-force collision search would just need to try roughly two-to-the-eightieth inputs (because 2^80 is about the square root of the hash space) to find a single one [12]. But even at a million hashes per second, that's about 38 billion years, longer than the universe has been around. So in practice, "you'll never find a collision" was a fair approximation.
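That estimate is easy to check with a few lines of arithmetic. The only assumptions here are the ones from the paragraph above: about 2^80 attempts and a rate of one million hashes per second.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

output_space = 2 ** 160        # number of possible SHA-1 digests
birthday_bound = 2 ** 80       # square root of the output space
hashes_per_second = 1_000_000  # the rate assumed in the text

# The brute-force bound really is the square root of the hash space.
assert birthday_bound ** 2 == output_space

years = birthday_bound / hashes_per_second / SECONDS_PER_YEAR
print(f"{years:.2e} years")
```

The result lands in the tens of billions of years, in line with the 38 billion figure above.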

But the last time I checked, the sun hasn't exploded yet, and SHAttered already happened. So somehow, they found a shortcut. Here's how they did it.

SHA-1 processes data in blocks. It breaks up the input data, and feeds each block into its series of mathematical functions, and the output of one block becomes part of the input to the next. The internal state evolves as more data flows through [2]. These "mathematical" functions actually aren't all that complicated, so don't let it scare you. It's things like:

Truncating some bits off the end
Rotating bits left, so the front bits wrap around to the back
And mixing values together with bitwise OR and XOR operations

The important thing is that it's just transforming the data deterministically (not randomly) and it's doing it in a way that can't be reversed. Kinda like one of those silly math-based magic tricks.
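To make those operations concrete, here's a small sketch of the kinds of building blocks described above, working on 32-bit words like SHA-1 does. This is not the SHA-1 algorithm itself, just the individual moves, and the function names are made up for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits of a value

def rotl32(word: int, shift: int) -> int:
    """Rotate a 32-bit word left; bits pushed off the front wrap to the back."""
    return ((word << shift) | (word >> (32 - shift))) & MASK32

def add32(a: int, b: int) -> int:
    """Add two words and truncate the result back to 32 bits."""
    return (a + b) & MASK32

def mix(a: int, b: int, c: int) -> int:
    """Combine three words with bitwise XOR."""
    return a ^ b ^ c

print(hex(rotl32(0x80000001, 1)))        # top bit wraps around to the bottom
print(hex(add32(0xFFFFFFFF, 2)))         # overflow is silently discarded
print(bin(mix(0b1100, 0b1010, 0b0001)))  # bits combined position by position
```

The discarded overflow in `add32` is one of the places information gets thrown away, which is part of why you can't simply run the steps backwards.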

The SHAttered team used a technique called differential cryptanalysis [2].

Instead of randomly trying inputs and hoping for a collision, they studied something very specific: how would tiny changes to the input ripple throughout all those math functions? Can we find two small changes that cancel each other out?

If you could find enough of the right tiny changes, you'd have it. Two different files that produce the same final state: a collision [2].

Here's a simple example: say you and your friend are standing in the same spot in New York, and you both want to meet at the same coffee house. I give you one route, and I give your friend a different route. If I choose those routes carefully, you can both end up at the same place.

+-----+-----+-----+-----+-----+
|     |     |     |     |     |
|     |     |     |     |     |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+
|     |     |     |     |     |
|     |     |     |     |     |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+
|     |     |     |     |     |
|     |     |  X  |     |     |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+
|     |     |     |     |     |
|  O  |     |     |     |     |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+
|     |     |     |     |     |
|     |     |     |  O  |     |
|     |     |     |     |     |
+-----+-----+-----+-----+-----+

The team constructed two files that started with the same prefix, then diverged into two different "collision blocks" of carefully calculated junk data. After those blocks, SHA-1's internal state was identical again. So the two files would go on to share the same suffix and land on the same final hash [3].
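The "same suffix, same final hash" part follows from the block-by-block design, and you can see it with a toy model. Big caveat: the sketch below uses a made-up hash with a 16-bit internal state and finds its colliding blocks by brute force, not by differential cryptanalysis. Everything in it (`compress`, `toy_hash`, the 4-byte block size) is an assumption for illustration only.

```python
import hashlib
from itertools import count

BLOCK = 4          # toy block size in bytes (SHA-1 uses 64-byte blocks)
IV = b"\x00\x00"   # toy 16-bit starting state (SHA-1's state is 160 bits)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression step: mix the current state with one block."""
    return hashlib.sha1(state + block).digest()[:2]

def toy_hash(message: bytes) -> bytes:
    """Feed the message through block by block, carrying the state forward."""
    if len(message) % BLOCK != 0:
        raise ValueError("toy hash only accepts whole blocks")
    state = IV
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state

def find_colliding_blocks(prefix: bytes):
    """Brute-force two different blocks that lead to the same state after prefix."""
    state = toy_hash(prefix)
    seen = {}  # resulting state -> first block that produced it
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block

prefix = b"%PDF"
block_a, block_b = find_colliding_blocks(prefix)
suffix = b"same ending!"

# Different middles, identical state afterwards, so any shared suffix still collides.
print(toy_hash(prefix + block_a + suffix) == toy_hash(prefix + block_b + suffix))
```

Once the internal state matches again, the function has no memory of how it got there, so everything appended afterwards is processed identically for both files.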

The SHAttered team estimated their attack was roughly 100,000 times faster than a brute-force birthday search [1]. That's about 2^63 SHA-1 computations instead of 2^80, a factor of 2^17, or about 130,000 times faster.
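The speedup is just a ratio of two powers of two, using the approximate work figures quoted above:

```python
brute_force = 2 ** 80  # birthday-bound work for a 160-bit hash
shattered = 2 ** 63    # approximate work for the SHAttered attack

speedup = brute_force // shattered
print(speedup)  # 131072, which is 2**17
```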

Even so, it wasn't cheap.

Reports vary on the dollar cost of creating the collision, but the repeated figure seems to be in the ballpark of $110,000 in cloud-compute terms. It sounds like a lot until you remember that intelligence agencies, well-funded criminal groups, and nation states have deeper pockets than your average hacker.

So effectively, anyone with serious money and patience can plausibly engineer a SHA-1 attack if the target still trusts SHA-1.

Why are people still using it?

So in 2026, nine years after SHAttered, has the world finished migrating off SHA-1? You can probably guess where I'm going with this.

See, most people, including a lot of developers unfortunately, treat security as binary. Either your password is long enough or it's insecure. But it's not so simple.

Cryptographic algorithms age. They often move through a slow lifecycle of being considered safe, then questionable, then dead. Sure, some have stayed extremely safe, and we may never find a practical attack, but many haven't. Computers get faster, and mathematicians find new attacks. The cost of breaking an algorithm drops every year, even when the algorithm itself doesn't change a line of code. What was secure in 2005 might be "questionable" in 2015 and "actively dangerous" in 2025.

The perfect example of this: SHA-1's death started years before SHAttered actually landed.

In 2005, a Chinese cryptographer named Xiaoyun Wang, with colleagues Yiqun Lisa Yin and Hongbo Yu, published a theoretical attack that reduced SHA-1's collision strength from two-to-the-eightieth down to roughly two-to-the-sixty-ninth [11]. Within months, Wang and collaborators announced a further refinement to two-to-the-sixty-third [16]. Still not practical. But Schneier wrote at the time that it pretty much put a bullet into SHA-1 for digital signatures, and that this was a "major, major cryptanalytic result" [12]. The math was now public, and so refinements would only make it faster.

In 2011, NIST formally deprecated SHA-1 for digital signature generation [7]. The U.S. government's standards body officially said, in writing, please stop using this to trust file downloads and things of that sort. They didn't ban it, they almost never ban anything, but they made it very clear that we should move to other solutions.

In 2014, Google Chrome announced it would start gradually distrusting HTTPS certificates signed with SHA-1, certificates browsers use to prove that a server really belongs to the domain in your address bar [4]. The post was titled, "Gradually sunsetting SHA-1" [4]. The plan was tiered: certificates expiring after 2017 would trigger a security warning, then eventually a hard error.

By 2015, Marc Stevens and his collaborators got closer. They cracked a piece of SHA-1's internals in ten days using rented GPUs, and put a price tag on the full attack: somewhere between $75,000 and $120,000 on Amazon EC2 [13].

So, the degradation of SHA-1 wasn't really news to cryptographers. By the time SHAttered actually landed in 2017, the only thing surprising about it was that anybody was still surprised.

So, our final question: if not SHA-1, what do we use now? And will THAT algorithm break soon?

The simple answer is, start using SHA-256. It's nowhere near being cracked. Easy 1-line code change, right?

- sha1_digest = hashlib.sha1(data).hexdigest()
+ sha256_digest = hashlib.sha256(data).hexdigest()

Kinda? The problem is you're updating every certificate authority and every certificate you've ever issued. So for some projects, yeah, it's pretty simple. But for the big projects with millions of hashes in production filesystems... it's not.

Every piece of software that validates signatures. Every embedded device with a hash function baked into the firmware, industrial controllers, medical equipment, point-of-sale terminals, ATM networks, weird custom IoT widgets in oil refineries that nobody at the company can find the source code for anymore... now needs an update.

So it's not just the code, it's all those stored hashes.
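One common way to deal with stored hashes is to label new digests with the algorithm that produced them and upgrade old ones as you touch them. The sketch below is a hypothetical scheme, not any specific product's format: the `sha256:` prefix, the function names, and the rule that untagged 40-character values are legacy SHA-1 are all assumptions.

```python
import hashlib
import hmac

def make_record(data: bytes) -> str:
    """New records are tagged with the algorithm that produced them."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

def verify(data: bytes, stored: str) -> tuple:
    """Return (matches, needs_rehash) for a stored digest.

    Untagged 40-character values are assumed to be legacy SHA-1 digests.
    """
    if stored.startswith("sha256:"):
        expected = stored.split(":", 1)[1]
        actual = hashlib.sha256(data).hexdigest()
        return hmac.compare_digest(actual, expected), False
    if len(stored) == 40:
        actual = hashlib.sha1(data).hexdigest()
        return hmac.compare_digest(actual, stored), True
    raise ValueError("unrecognized digest format")

legacy = hashlib.sha1(b"invoice.pdf contents").hexdigest()
print(verify(b"invoice.pdf contents", legacy))
print(verify(b"invoice.pdf contents", make_record(b"invoice.pdf contents")))
```

The catch is that every legacy record stays exactly as weak as SHA-1 until it's rehashed, and you can only rehash if you still have the original data. That's a big part of why these migrations drag on.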

Anyways, take Microsoft. They didn't finish moving Windows Update signing fully to SHA-2 until 2019, over two years AFTER SHAttered, and didn't retire the last SHA-1-signed Windows content from their Download Center until August 2020 [20][14]. Windows 7 users who never installed Microsoft's SHA-2 patch literally stopped receiving security updates in 2019, because their machines couldn't verify the newer signatures [20]. Microsoft. One of the biggest software companies on Earth, with effectively infinite engineers. Still took years to migrate.

And if Microsoft moves that slow, imagine the rest of the internet.

On the public-facing internet, SHA-1 WAS banished years ago [4][18]. That part of the cleanup actually worked because browser makers could force the issue: if a publicly trusted site kept using SHA-1 certificates, Chrome and Firefox could put a scary warning, or eventually a hard error, between that site and its users [4][18].

But the internet runs much deeper than what's in your personal browser. There's the entire issue of corporate intranets: Banks. Government agencies. Insurance companies. Hospitals. These places run software written in the 1990s on hardware nobody wants to touch because they can't afford for it to stop working, and the people maintaining it are almost never cryptographers. They're sysadmins keeping a sixty-million-dollar mainframe from setting itself on fire, and the SHA-1 dependency is buried in a config file eight directories deep that hasn't been touched since the guy who wrote it took early retirement.

For those teams, the plan is always to migrate "next quarter." And next quarter has been next quarter for the last 7 years.

In January 2020, three years after SHAttered, two cryptographers proved the attack had gotten cheaper. Gaetan Leurent and Thomas Peyrin published a paper called "SHA-1 is a Shambles." For an estimated $45,000 in rented GPU time, they crafted a pair of PGP keys with different identities but colliding SHA-1 certificates [15].

Why does that matter? GPG, an open-source implementation of OpenPGP, used for years by journalists, activists, and Linux maintainers, still defaulted to SHA-1 for identity certifications [15]. And PGP runs on a "Web of Trust": instead of one central certificate authority proving who owns a key, users sign each other's keys. Forge the right certificate, and you can impersonate someone inside that trust graph.

And then there's Git.

Git identifies every object in a repository by its SHA-1 hash. Every commit, every file blob, every tag. A collision in Git theoretically means you could craft two different commits with the same identifier, swap malicious code into a repository, and not break the cryptographic chain that's supposed to prevent exactly that.
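For a sense of how directly Git leans on SHA-1, here's a short sketch that reproduces the ID Git gives a file's contents in a repository using the default SHA-1 object format. Git hashes a small header ("blob", the content length, and a null byte) followed by the content. The function name is made up, and the output should match what `git hash-object` prints for the same bytes.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a file's contents (a blob)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))               # ID of an empty file
print(git_blob_id(b"hello world\n"))  # ID of a one-line file
```

Commits and trees are hashed the same way with a different header, and each commit's content includes the IDs of its parents and its tree. That's the "cryptographic chain" a collision would undermine.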

When SHAttered dropped, GitHub responded fast. Within a few weeks, they deployed SHA-1 collision detection, a modified version of SHA-1 that watches for the specific bit patterns used in collision attacks and rejects just them [5].

Git has been preparing for a transition for years [17], but that migration still isn't finished. To be fair, Git SHAs are everywhere. There may not be a messier SHA migration in software.

MD5, the algorithm that came before SHA-1, started falling apart in the early 2000s [16]. It took decades to fully scrub MD5 from production systems. There are probably still corners of the internet where MD5 quietly hashes things it shouldn't.

SHA-2, which is what we mostly use now, especially SHA-256, is the safer family everyone moved to after SHA-1. Cryptographers do NOT think a practical SHA-256 break is close. But "close" in cryptography is measured in decades, and we don't know what we don't know [1].

So should you panic the next time you push a Git commit, or click the little padlock in your browser? No. The padlock is signed with SHA-2, and the vast majority of commit hashes aren't direct attack vectors.

But you should understand the tools you're trusting.

Bibliography

[1] Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, and Yarik Markov, "Announcing the first SHA1 collision." Google Online Security Blog, February 23, 2017. https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
[2] Marc Stevens, Elie Bursztein, Pierre Karpman, Ange Albertini, and Yarik Markov, "The first collision for full SHA-1." IACR Cryptology ePrint Archive, 2017. https://eprint.iacr.org/2017/190.pdf
[3] Internet Archive capture of the SHAttered project page, February 23, 2017. https://web.archive.org/web/20170223151231/https://shattered.io/
[4] Chrome Security Team, "Gradually sunsetting SHA-1." Google Online Security Blog, September 5, 2014. https://security.googleblog.com/2014/09/gradually-sunsetting-sha-1.html
[5] GitHub Engineering, "SHA-1 collision detection on GitHub.com." The GitHub Blog, March 20, 2017. https://github.blog/engineering/platform-security/sha-1-collision-detection-on-github-com/
[6] NIST, "FIPS PUB 180-4: Secure Hash Standard." https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf
[7] NIST, "SP 800-131A Rev. 2: Transitioning the Use of Cryptographic Algorithms and Key Lengths." https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-131Ar2.pdf
[8] Bruce Schneier, "SHA-1 Collision Found." Schneier on Security, February 23, 2017. https://www.schneier.com/blog/archives/2017/02/sha-1_collision.html
[9] Dan Goodin, "At death's door for years, widely used SHA1 function is now dead." Ars Technica, February 23, 2017. https://arstechnica.com/information-technology/2017/02/at-deaths-door-for-years-widely-used-sha1-function-is-now-dead/
[10] Hacker News discussion, "Announcing the first SHA1 collision." February 2017. https://news.ycombinator.com/item?id=13713480
[11] Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu, "Finding Collisions in the Full SHA-1." Advances in Cryptology - CRYPTO 2005, LNCS vol. 3621, Springer, 2005. https://link.springer.com/chapter/10.1007/11535218_2
[12] Bruce Schneier, "SHA-1 Broken." Schneier on Security, February 15, 2005. https://www.schneier.com/blog/archives/2005/02/sha1_broken.html
[13] Marc Stevens, Pierre Karpman, and Thomas Peyrin, "Freestart Collision for Full SHA-1." IACR Cryptology ePrint Archive, 2015; EUROCRYPT 2016. https://eprint.iacr.org/2015/967
[14] Microsoft, "SHA-1 Windows content to be retired August 3, 2020." Microsoft Tech Community / Windows IT Pro Blog, July 28, 2020. https://techcommunity.microsoft.com/blog/windows-itpro-blog/sha-1-windows-content-to-be-retired-august-3-2020/1544373
[15] Gaetan Leurent and Thomas Peyrin, "SHA-1 is a Shambles: First Chosen-Prefix Collision on SHA-1 and Application to the PGP Web of Trust." 29th USENIX Security Symposium, 2020. https://www.usenix.org/conference/usenixsecurity20/presentation/leurent
[16] Wikipedia, "SHA-1." https://en.wikipedia.org/wiki/SHA-1
[17] brian m. carlson, "Git hash function transition." Git Documentation. https://git-scm.com/docs/hash-function-transition/
[18] Mozilla Security Blog, "The end of SHA-1 on the Public Web." February 23, 2017. https://blog.mozilla.org/security/2017/02/23/the-end-of-sha-1-on-the-public-web/
[19] Bruce Schneier, "When Will We See Collisions for SHA-1?" Schneier on Security, October 5, 2012. https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
[20] Microsoft, "2019 SHA-2 Code Signing Support requirement for Windows and WSUS." Microsoft Support. https://support.microsoft.com/help/4472027/2019-sha-2-code-signing-support-requirement-for-windows-and-wsus
