This is a huge achievement for Debian and the free software world.
It took a while though until this was understood. In 2007 when pointing out on debian-devel that this is needed, I was still told what huge waste of time this would be. And indeed it took a huge amount of work by many people to get there, but it is well worth it.
https://wiki.debian.org/ReproducibleBuilds has some more infos; some is outdated, but it also has a chart showing how many packages are built in the CI, and how many of those are reproducible builds.
(Orange = FTBR = "failed to build reproducibly")
I'm not good at reading numbers from charts, but I'd guess it's a few percent (4-5ish?).
While we are bragging, stagex was the first to hit 100% full source bootstrapped deterministic and hermetic builds last year and the first to make multiple signed reproductions by different maintainers on their own hardware mandatory for every release.
Debian has come along way, but when Debian says reproducible they mean they grab third party binaries to build theirs. When we say reproducible we mean 100% bootstrapped from source code all the way through the entire software supply chain.
As pointed in your link, NetBSD achieved this with some help from Debian. If I understand correctly, it's not that NetBSD tried harder, it's that their problem was easier: fewer packages which change less (they still use CVS, "stability" is an understatement!).
BTW, most Debian packages have reproducible builds. Those which have not (I'd say 5%) are shown in orange in the graph there: https://wiki.debian.org/ReproducibleBuilds
Has anyone fought Microsoft Visual Studio successfully to produce reproducible builds of C++ programs? From what I have heard, it is one of the worst contexts to do it.
As someone who recently spent a lot of time on making a large C++ program entirely reproducible on 4 different OS’es, one cannot understate just how many tiny details matter here.
Debian, like any other legacy distro, mush became declarative, because the '80s model of manual deploy and the absurd pain of D/I and Preseed must end.
Maybe not by itself, but it does allow for the ecosystem to be audited, in a way that ultimately benefits the end-user. It really is an important part of a healthy supply chain.
This is some of the best news I've heard recently when it comes to figuring out how to produce high quality Software Bills of Materials for the upcoming EU Cyber Resilience Act, for what it's worth. Reproducible packages are actually worth a great deal when you are selling products with digital elements. Much easier to scan through, audit, etc. with confidence.
Debian has had a better "software supply chain" posture than any other player in the ecosystem since before the turn of the century. While we all face the risk of malware from upstream, Debian is the least at risk of being affected by it. See for example the stream of issues from npm et al. None of it has affected Debian.
It's npm that's affected, therefore it's not even considered when choosing language/ecosystem for writing distro tools. You'll find no sane distro writing package manager in javascript precisely to avoid this joke of a supply chain.
That's not what reproducible builds aim to prevent, and no one claims that. When upstream pushes bad code, that's on upstream.
The thing reproducible builds aim to prevent is Debian or individual developers and system administrators with access rights to binary uploads and signing keys to get forced to sign and upload binary packages by attackers - be these governments (with or without court orders) or criminal organizations.
As of now, say if I were an administrator of Debian's CI infrastructure, technically there would be nothing preventing me from running an "extra" job on the CI infrastructure building a package for openssh with a knock-knock backdoor, properly signing it and uploading it to the repository. For someone to spot the attack and differentiate it, they'd have to notice that there is a package in the repository that has no corresponding build logs or has issues otherwise.
But with reproducible builds, anyone can set up infrastructure to rebuild Debian packages from source automatically and if there is a mismatch with what is on Debian's repository, raise alarm bells.
Reproducible builds shows that, within a specific configuration, the code produced the binary, regardless of who signed or published it.
Indeed, this could mitigate an attacker replacing the binary with something that's not produced from the code, but it does not mitigate the tool chain or code itself containing the exploit, creating a malicious binary.
If you find yourself holding opinions of the kind: "If it can't be made perfect, it shouldn't be changed at all?" you may want to consider that most things that work well today were incrementally improved.
Reproducable builds are not solving all issues as you rightly observed, but they can be a stepping stone (or even a pre-condition) for further measures.
Well - reproducible also means code guarantee. It may not improve an end-user experience directly, but you get an extra quality control step, as guarantee, here. I think reproducibility is great. If we can achieve that, it should be achieved. See also NixOS; it can guarantee that snapshot xyz works, not just for one user, but ALL users. I see it as hopping from guarantee to guarantee. That's actually a good thing in the long run. Just think differently here.
This is a huge achievement for Debian and the free software world.
It took a while though until this was understood. In 2007 when pointing out on debian-devel that this is needed, I was still told what huge waste of time this would be. And indeed it took a huge amount of work by many people to get there, but it is well worth it.
https://wiki.debian.org/ReproducibleBuilds has some more infos; some is outdated, but it also has a chart showing how many packages are built in the CI, and how many of those are reproducible builds.
(Orange = FTBR = "failed to build reproducibly")
I'm not good at reading numbers from charts, but I'd guess it's a few percent (4-5ish?).
A great milestone, congrats Debian on taking a stance and holding high standards for yourself, especially in the current era.
Good thing. NetBSD has fully reproductible build since 2017. https://blog.netbsd.org/tnf/entry/netbsd_fully_reproducible_...
While we are bragging, stagex was the first to hit 100% full source bootstrapped deterministic and hermetic builds last year and the first to make multiple signed reproductions by different maintainers on their own hardware mandatory for every release.
Debian has come along way, but when Debian says reproducible they mean they grab third party binaries to build theirs. When we say reproducible we mean 100% bootstrapped from source code all the way through the entire software supply chain.
We think that distinction matters.
https://stagex.tools
As pointed in your link, NetBSD achieved this with some help from Debian. If I understand correctly, it's not that NetBSD tried harder, it's that their problem was easier: fewer packages which change less (they still use CVS, "stability" is an understatement!).
BTW, most Debian packages have reproducible builds. Those which have not (I'd say 5%) are shown in orange in the graph there: https://wiki.debian.org/ReproducibleBuilds
Forbidden
You don't have permission to access this resource. Apache Server at lists.debian.org Port 443
:/
I can see it just fine; maybe an overzealous firewall thinks you're a bot? At any rate, the Wayback Machine has it: https://web.archive.org/web/20260510074120/https://lists.deb...
Has anyone fought Microsoft Visual Studio successfully to produce reproducible builds of C++ programs? From what I have heard, it is one of the worst contexts to do it.
A small step for debian,
giant leap for mankind.
As someone who recently spent a lot of time on making a large C++ program entirely reproducible on 4 different OS’es, one cannot understate just how many tiny details matter here.
"overstate"
Debian, like any other legacy distro, mush became declarative, because the '80s model of manual deploy and the absurd pain of D/I and Preseed must end.
zero improvement on end-user experience. does not solve supply chain issues, debian package will reproducabily contain the malware from upstream.
> zero improvement on end-user experience.
Maybe not by itself, but it does allow for the ecosystem to be audited, in a way that ultimately benefits the end-user. It really is an important part of a healthy supply chain.
I would call less North Korean hacks a massive benefit for end users
NK isn't the hostile superpower I'm most concerned about.
While taking no stance on your statement, I think “fewer” works better in this context than “less”.
"fewer" when the measure is discrete, "less" when it is continuous
This is some of the best news I've heard recently when it comes to figuring out how to produce high quality Software Bills of Materials for the upcoming EU Cyber Resilience Act, for what it's worth. Reproducible packages are actually worth a great deal when you are selling products with digital elements. Much easier to scan through, audit, etc. with confidence.
Debian has had a better "software supply chain" posture than any other player in the ecosystem since before the turn of the century. While we all face the risk of malware from upstream, Debian is the least at risk of being affected by it. See for example the stream of issues from npm et al. None of it has affected Debian.
> for example the stream of issues from npm et al.
Curious, what distros where affected by npm supply chain attacks?
It's npm that's affected, therefore it's not even considered when choosing language/ecosystem for writing distro tools. You'll find no sane distro writing package manager in javascript precisely to avoid this joke of a supply chain.
It does not solve all supply chain issues, it do solve some supply chain issues.
Not being able to see if the source code shipped is the same as been used for creating the binary is scary
Has there been a single publicly known attack that would have been prevented by this?
Several actually. Pypi is regularly targeted in this way.
That's not what reproducible builds aim to prevent, and no one claims that. When upstream pushes bad code, that's on upstream.
The thing reproducible builds aim to prevent is Debian or individual developers and system administrators with access rights to binary uploads and signing keys to get forced to sign and upload binary packages by attackers - be these governments (with or without court orders) or criminal organizations.
As of now, say if I were an administrator of Debian's CI infrastructure, technically there would be nothing preventing me from running an "extra" job on the CI infrastructure building a package for openssh with a knock-knock backdoor, properly signing it and uploading it to the repository. For someone to spot the attack and differentiate it, they'd have to notice that there is a package in the repository that has no corresponding build logs or has issues otherwise.
But with reproducible builds, anyone can set up infrastructure to rebuild Debian packages from source automatically and if there is a mismatch with what is on Debian's repository, raise alarm bells.
Reproducible builds shows that, within a specific configuration, the code produced the binary, regardless of who signed or published it.
Indeed, this could mitigate an attacker replacing the binary with something that's not produced from the code, but it does not mitigate the tool chain or code itself containing the exploit, creating a malicious binary.
If you find yourself holding opinions of the kind: "If it can't be made perfect, it shouldn't be changed at all?" you may want to consider that most things that work well today were incrementally improved.
Reproducable builds are not solving all issues as you rightly observed, but they can be a stepping stone (or even a pre-condition) for further measures.
Well - reproducible also means code guarantee. It may not improve an end-user experience directly, but you get an extra quality control step, as guarantee, here. I think reproducibility is great. If we can achieve that, it should be achieved. See also NixOS; it can guarantee that snapshot xyz works, not just for one user, but ALL users. I see it as hopping from guarantee to guarantee. That's actually a good thing in the long run. Just think differently here.