Automated localization for unreproducible builds
Automated localization for unreproducible builds Ren et al., ICSE’18
Reproducible builds are an important component of integrity in the software supply chain. Attacks against package repositories and build environments may compromise binaries and produce packages with backdoors (see this report for a recent prominent example of compromised packages on DockerHub). If the same source files always lead to the same binary packages, then an infected binary can be much more easily detected. Unfortunately, reproducible builds have not traditionally been the norm. Non-determinism creeping into build processes means that rebuilding an application from the exact same source, even within a secure build environment, can often lead to a different binary.
Due to the significant benefits, many open-source software repositories have initiated their validation processes. These repositories include GNU/Linux distributions such as Debian and Guix, as well as software systems like Bitcoin.
If you have a non-reproducible build, finding out why can be non-trivial. It takes time and a lot of effort to hunt down and eradicate the causes. For example, Debian unstable for AMD64 still had 2,342 packages with non-reproducible builds as of August 2017. (The number today as I’m writing this is 2,826). You can see a stubbornly persistent Continue reading