How bad can it git? Characterizing secret leakage in public GitHub repositories
How bad can it git? Characterizing secret leakage in public GitHub repositories Meli et al., NDSS’19
On the one hand you might say there’s no new news here. We know that developers shouldn’t commit secrets, and we know that secrets leaked to GitHub can be discovered and exploited very quickly. On the other hand, this study goes much deeper, and also provides us with some very actionable information.
…we go far beyond noting that leakage occurs, providing a conservative longitudinal analysis of leakage, as well as analyses of root causes and the limitations of current mitigations.
In my opinion, the best time to catch secrets is before they are ever committed in the first place. A git pre-commit hook using the regular expressions from this paper’s appendix looks like a pretty good investment to me. The pre-commit hook approach is taken by TruffleHog, though as of the time this paper was written, TruffleHog’s secret detection mechanisms were notably inferior (detecting only 25-29%) to those developed in this work (§ VII.D). You might also want to look at git-secrets which does this for AWS keys, and is extensible with additional patterns. For a belt and braces approach, also Continue reading