I was recently reminded of a post I wrote on Google's internal Google+
(RIP) a decade(!) ago.
The topic of the post was software archaeology. The prompt was how Blink
(Chromium’s Web engine) internally uses reference counting that traces back to
early KHTML.
I’m reposting this decade-old observation, which is even more relevant today:
Every Computer Science curriculum should include a course named “Source Code
Archaeology”.
Subjects would include:
- How to search as if you were the original developer.
- Date-based searching: connecting the dots between mailing list discussions and
  commits.
- Finding code plagiarism across repositories.
- Getting through a code refactor to find the original author.
- Linearizing commits across multiple SCM systems to follow through a
  project’s history when it switched SCMs.
- Finding the needle in the haystack: how to ask the SCM for relevant
  information and filter out the rest.
- Design, not code: how to find the author’s inspiration for the design.