
2017.01.16

How to avoid redoing manual corrections

Say you have an automated process to create a report, which you then have to polish by hand, because there are adjustments that require human judgment. After three hours of polishing, you realize that the report is full of errors due to a bug in the initial reporting process. Is there a way to salvage the three hours of work you put into it?

Here's how I used Git to avoid redoing 90% of the manual work I had put into the report. I initially created the report with a process like the following.

run-report raw-data >report
hand-edit report >polished-report

When I had almost finished hand-editing the report (in my case about 700 BibTeX entries), I realized that all the DOI (Digital Object Identifier) links were wrong. I soon found out that Scopus had started using the M3 field in its RIS export to mark the entry's type. In the past SpringerLink had used the same field for the DOI, and this clash broke the DOI import. I quickly fixed the run-report program to output the correct DOI.
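As an illustration only (the actual run-report fix is not shown in this post), the following sketch shows one way to read a DOI defensively from RIS input. It assumes that the DOI normally sits in the DO field and that an M3 value counts as a DOI only when it starts with "10."; the two sample records and their DOIs are made up.

```shell
# Hypothetical RIS input: the first record uses M3 for the entry type,
# the second (older style) uses M3 for the DOI.
dois=$(printf '%s\n' \
    'TY  - JOUR' 'M3  - Article' 'DO  - 10.1000/demo.1' 'ER  - ' \
    'TY  - JOUR' 'M3  - 10.1000/demo.2' 'ER  - ' |
awk -F'  - ' '
    # Prefer the dedicated DO field
    $1 == "DO" { doi = $2 }
    # Fall back to M3 only when its value looks like a DOI
    $1 == "M3" && $2 ~ /^10\./ && doi == "" { doi = $2 }
    # End of record: print what was found and reset
    $1 == "ER" { print doi; doi = "" }
')
echo "$dois"
# prints:
# 10.1000/demo.1
# 10.1000/demo.2
```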

The problem now was merging my manual changes with the roughly 600 corrected DOIs. I initially considered writing a custom script to do that, but then realized I could ask Git to do the merging for me. The steps I followed were roughly as follows.

# Initialize a Git repository
git init

# Add the initial report
git add report
git commit -m 'Report output with errors'

# Create a branch with the fixed errors
git checkout -b fix-errors

# Run the reporting again to get the file with the errors fixed
run-report raw-data >report

# Commit the error fixes
git commit -am 'Report output without errors'

# Go back to the branch with the initial report
# (master here; your Git setup may name it differently, e.g. main)
git checkout master

# Create a revision with the hand-polishing
mv polished-report report
git commit -am 'Polished report with errors'

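# Merge into this revision the fixed errors
git merge fix-errors

# If the merge stopped on conflicts, resolve them by editing report
# and then commit the result (a clean merge commits by itself)
git commit -am 'Polished report without errors'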

When merging the corrected output with my hand polishing, I had about 14 merge conflicts. I addressed those in a few minutes. In all, Git saved me hours of mind-numbing work.
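To try the technique without a real report, the steps above can be sketched as a self-contained script. Everything in it is a stand-in: emit_report plays the role of run-report, a sed substitution plays the role of hand-editing, and the single BibTeX entry and its DOI are made up. The hand-edited line and the DOI line are deliberately kept a few lines apart, because Git reports changes to the same or adjacent lines as conflicts.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory
work=$(mktemp -d)
cd "$work"

# Stand-in for run-report: emit one entry with the DOI given as $1
emit_report() {
    printf '%s\n' \
        '@article{demo2017,' \
        '  title = {a study of merging},' \
        '  author = {A. Author},' \
        '  journal = {Demo Journal},' \
        '  year = {2017},' \
        "  doi = {$1}" \
        '}'
}

git init -q
# Identity for this demo repository only
git config user.name 'Demo User'
git config user.email 'demo@example.com'

# Initial report: the entry type ended up in the DOI field
emit_report 'Article' >report
git add report
git commit -q -m 'Report output with errors'
start=$(git symbolic-ref --short HEAD)   # master or main, depending on setup

# Branch with the regenerated, corrected report
git checkout -q -b fix-errors
emit_report '10.1000/demo.1' >report
git commit -q -am 'Report output without errors'

# Back on the starting branch: commit the "hand-polished" version
git checkout -q "$start"
emit_report 'Article' | sed 's/a study of merging/A Study of Merging/' >report
git commit -q -am 'Polished report with errors'

# Three-way merge: keeps the polished title and picks up the fixed DOI
git merge -q -m 'Polished report without errors' fix-errors
cat report
```

With real data some hand edits will touch the same lines as the regenerated fields; those show up as conflict markers in report and have to be resolved by hand before committing.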

Last modified: Monday, January 16, 2017 2:10 pm
Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.