Digging through Git History for Removed Text
Close up of a modern excavator bucket digging a hole in the ground by Ivan Radic is licensed under CC BY
Git keeps a history of practically everything for your repository. I cannot state how powerful a tool it is, and has helped me immensely as I try to understand the why of some code.
There are times when I’ve tried to find something that doesn’t exist in the current revision but was in a previous version. I’ve always struggled with this, and I wanted a simpler and (hopefully) quicker way to conduct this search.
I decided to script my way to a solution that I feel happy about for now.
This script ends up doing the following to search through Git history for matching text:
grep
the git
patch and print the lines that match the search query (with the match highlighted).A lot of the git
commands are leveraging the --pickaxe-regex -S<query>
flags to search changes with regex powers:
-S<string>
Look for differences that change the number of occurrences of the specified string (i.e. addition/deletion) in a file.
--pickaxe-regex
Treat the <string> given to -S as an extended POSIX regular expression to match.
This would allow for the search query to leverage regular expressions. The following is an example of looking through my dotfiles repo for zgen
(which is something I don’t use anymore).
❯ git dig "\szgen"
Searching for any commits with any 'additions/removals'
Matching regex: "\szgen"
commit bd5dc58e9f94b7b62013a1378fbe7d101ccc70ec
Author: Kevin Jalbert <[email protected]>
Date: Fri Jan 6 17:42:44 2017 -0500
Update zsh.rake to install zplug instead of zgen
FILE -> tasks/zsh.rake
- desc "Install zgen"
- section "Installing zgen"
# Plus a few more commits
This GitHub Gist will be the canonical reference of the git-dig
script (and will see updates as needed).
#!/usr/bin/env ruby
require 'optparse'
require 'open3'
def files_with_search_query(sha, search_query)
`git show --format="" --name-only #{sha} -i --pickaxe-regex -S #{search_query}`.split("\n")
end
def grep_over_file(sha, file, search_query)
git_show_cmd = "git show --format='' -p #{sha} -- '#{file}'"
grep_cmd = "grep -i --color=always -E #{search_query}"
`#{git_show_cmd} | #{grep_cmd}`
end
def git_pickaxe_search(search_query, options, &block)
git_log_cmd = "git log --no-merges --format=%h"
git_log_cmd += " --after=\"#{options[:after]}\"" if options[:after]
git_log_cmd += " --before=\"#{options[:before]}\"" if options[:before]
git_log_cmd += " -i --pickaxe-regex -S #{search_query}"
Open3.popen3(git_log_cmd) do |stdin, stdout, stderr, thread|
yield(stdout)
end
end
def git_commit_info(sha)
`git show --no-patch #{sha}`
end
def find_root_dir
current_dir = Dir.pwd
until Dir.exist?([current_dir, ".git"].join(File::SEPARATOR)) do
parent_dir = File.dirname(current_dir)
if parent_dir == current_dir
puts "No .git repo found in path (or parent paths)."
exit
end
current_dir = parent_dir
end
current_dir
end
# Handle options
options = {}
OptionParser.new do |opts|
opts.banner = "Usage: git-dig [options] [regex query]"
opts.on("--after=DATE", "Only searches commits more recent than the specific date.") do |v|
options[:after] = v
end
opts.on("--before=DATE", "Only searches commits older recent than the specific date.") do |v|
options[:before] = v
end
opts.on("-h", "--help", "Prints this help") do
puts opts
exit
end
end.parse!
# Get search query
search_query = "\"#{ARGV.first}\""
if search_query == '""' # empty search as it would just be a pair of quotes
puts "You need to pass a search term as input."
exit
end
puts "Searching for any commits with any 'additions/removals'"
puts " Matching regex: #{search_query}"
if options[:after] || options[:before]
puts " With search options: #{options}"
end
puts ""
# Get SHAs of all commits which had some 'addition/removal' of the search term
git_pickaxe_search(search_query, options) do |stdout|
while line = stdout.gets
exit if line.nil?
sha = line.strip
puts git_commit_info(sha)
puts ""
# Print diff of files which have search term
files_with_search_query(sha, search_query).each do |file|
puts "FILE -> #{file}"
puts grep_over_file(sha, file, search_query)
puts ""
end
puts "-"*80
puts ""
end
end
To note, most Git GUIs have some search functionality in place which can satisfy the majority of this. Using the Git CLI is still a nice thing to be able to leverage though, which is why I went this route.