Practical collision attack against SHA-1

Published: 2017-02-23
Last Updated: 2017-02-23 16:56:14 UTC
by Rick Wanner (Version: 1)

Google has announced a technique that makes it practical to craft two different PDF files with the same SHA-1 hash.

Of course, like all new vulnerabilities and attacks in this decade, it needs a web page and a cool logo. Not to disappoint, they can be found at https://shattered.io.

What does this mean to you? The fact is that nothing has changed since yesterday. This is still a difficult attack, and for most applications SHA-1 still provides an adequate level of protection. It does, however, highlight a significant risk to high-trust applications such as banking, legal contracts, and digital signatures. Theoretical attacks against SHA-1 have been published since 2005, and NIST deprecated SHA-1 in 2011, so most high-trust uses of SHA-1 should have long since been upgraded to more secure methods.

SHA-1 is still commonly used for file integrity hashes, and is used for that purpose in Git and in most vendor signatures, so there will be some work to do.
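
To make the Git dependency concrete, here is a minimal Python sketch of how Git derives the ID of a file ("blob") object: it takes the SHA-1 of a short header followed by the file content. The function name git_blob_id is made up for this illustration and is not part of Git or any library.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git assigns to a blob with this content.

    Git hashes the header "blob <size>\\0" followed by the raw bytes.
    git_blob_id is a hypothetical helper name used only in this sketch.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The ID of an empty file, as also reported by `git hash-object`.
    print(git_blob_id(b""))
```

Because the object ID is a plain SHA-1 digest, two different contents with colliding digests would be indistinguishable to Git by ID alone.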

Google is following its disclosure guidelines, so the details of the attack will not be released for 90 days, leaving time for applications that still use SHA-1 to move to more secure hash functions such as SHA-256 or SHA-3.

Further reading below: 

Google -> https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Ars Technica -> https://arstechnica.com/security/2017/02/at-deaths-door-for-years-widely-used-sha1-function-is-now-dead/

-- Rick Wanner MSISE - rwanner at isc dot sans dot edu - http://namedeplume.blogspot.com/ - Twitter:namedeplume (Protected)


Comments

Of course, this method won't produce both a SHA-1 AND an MD5 hash that match. It is either one or the other. So the older methods, if combined, are still safe.

Absolutely correct. That's why we enabled both SHA1 and MD5 hashing in our file integrity monitoring system many years ago.

Additionally, while both SHA1 and MD5 are "broken" they are both still quite useful for things such as file download integrity verification. Yes, using the MD5 hash to validate the file is still far better than not checking anything at all.

When MD5 collisions were proven many years ago they also used a PDF file. The authors explained that there is a large amount of error correction in a PDF so that even a corrupted PDF can be made to display correctly. They took advantage of that error correction by padding the binary PDF with garbage that changed the file internally to change the hash value but that did not affect the Reader's ability to render the PDF.

So no, modifying a binary program file to have a hash collision and replacing it on someone's Downloads web page is not a real risk. If you are able to upload a malicious file and they do display the hash values on the page, it's probably just as easy to modify the web page to display the correct hash of your file.

Our biggest problem with moving away from MD5 and SHA1 on VPNs is the other side. Ancient Cisco firewalls, ancient routers, you name it. "We can't get the budget to replace them." is the frequent excuse. The last one we experienced that with was an FBI-approved vendor for submitting fingerprints for background checks. "Upgrading is on our roadmap." Another great excuse is "We don't work for this company. We're their outsourced network administrator."

You are absolutely correct. This attack, when demonstrated against both MD5 and SHA-1, was done with PDF files because PDF files were the easiest to manipulate. The PDF format permits padding of the file without impacting the visual view of the file and as you said, even if the PDF were to be corrupted it would still most likely display. This attack would be far harder to execute against a file that is not as tolerant of changes, such as a binary executable, or even a source code file. But the likelihood of a successful attack is not necessarily the issue, it is the uncertainty of whether or not a successful attack is possible that is the issue. The trust has been eroded.

Creating multiple hashes using disparate hashing algorithms makes a successful attack far less likely. Finding a collision that holds across multiple hashing algorithms at once, while still theoretically possible, is so mathematically unlikely as to not be considered reasonable. I checked most of the common software repositories that I get code from and they already publish multiple hashes. For example, WinSCP:

WinSCP-5.9.4-Setup.exe
- MD5: dabad66ce7ab5d3a1e60bf10a64912a4
- SHA-1: 7a2b9ff4d3e9a58286556c9718e86c27ce47529f
- SHA-256: af062b32c907ee1d51de82cadb570171750a51e7dd3d953bb8f24282c3db642d
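
A check against several published hashes like the ones above could be sketched in Python roughly as follows. This is only an illustration using the standard hashlib module; the helper names file_digests and verify are made up for this sketch, and the expected values would come from the vendor's download page.

```python
import hashlib


def file_digests(path, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Compute several digests of one file in a single pass over its bytes."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # Read in chunks so large installers do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected):
    """Return True only if every published digest matches the file.

    `expected` maps hashlib algorithm names to hex digests, for example
    {"md5": "...", "sha1": "...", "sha256": "..."}.
    """
    actual = file_digests(path, tuple(expected))
    return all(actual[name] == value.lower() for name, value in expected.items())
```

Requiring all digests to match means an attacker would need one file that collides under every listed algorithm at the same time.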
The PDFs produced for the SHAttered attack are actually identical, except for the JPEG image they both contain.
The PDFs don't contain garbage.

The JPEG images are mostly identical, except for 128 bytes close to the start of the JPEG image.
Those bytes are stored as comments (FFFE) in the JPEG image, and they create the collision.
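
Anyone who wants to check this can locate the differing bytes with a few lines of Python. The sketch below compares two equal-length byte strings and reports the ranges where they differ; the function name differing_ranges and the file names in the usage note are placeholders, not part of any published tool.

```python
def differing_ranges(a: bytes, b: bytes):
    """Return [start, end) offset pairs where two equal-length inputs differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    ranges = []
    start = None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = offset  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, offset))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(a)))  # the run extends to the end of the data
    return ranges
```

To use it, read both files in binary mode, for example open("first.pdf", "rb").read() and open("second.pdf", "rb").read(), and pass the two results to differing_ranges.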

Doing this with an exe file would have been as simple as for a PDF (Ange even calls it trivial, and says PDF was more challenging: https://twitter.com/angealbertini/status/835044024033632257).
The bytes for the collision can be put in the IMAGE_DOS_STUB, after the IMAGE_DOS_HEADER.
And you could also do this for the digital signature of the exe file: https://blog.didierstevens.com/2009/01/17/playing-with-authenticode-and-md5-collisions/
