Software Sleuthing – The Act of Being a Detective!
by Prabhat Mishra on June 17, 2020
Software sleuthing, or debugging, is the process of investigating a problem within a computer program. Like any investigation, it starts when someone notices or reports the problem.
The problem could be an error, a performance issue, or a functional anomaly.
For example:
Problem #1 – A user gets an HTTP 403 Forbidden error even after the required access permissions were provided.
Problem #2 – A search operation sometimes takes more than a minute to respond.
Problem #3 – The account balance displayed differs from the expected balance.
Note: If you are reporting a problem, make sure to include the following details in the problem report:
i). What were you doing?
ii). What happened?
iii). What were you expecting?
iv). Other evidence, such as URLs, snapshots, and error messages.
This information gives the investigating team a good head start.
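As a rough illustration, the four details above could be captured in a small structure so that an incomplete report is easy to spot. This is only a sketch: the `ProblemReport` class, its field names, and the sample values are hypothetical and not part of any particular bug tracker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProblemReport:
    """Hypothetical container for the four details of a problem report."""
    action: str      # i) what the reporter was doing
    observed: str    # ii) what happened
    expected: str    # iii) what the reporter was expecting
    evidence: List[str] = field(default_factory=list)  # iv) URLs, snapshots, error messages

    def missing_fields(self) -> List[str]:
        """Return the names of the required details that were left blank."""
        required = {
            "action": self.action,
            "observed": self.observed,
            "expected": self.expected,
        }
        return [name for name, value in required.items() if not value.strip()]


# Sample values invented for illustration, loosely based on Problem #1.
report = ProblemReport(
    action="Opened a page after access permissions were provided",
    observed="HTTP 403 Forbidden",
    expected="The page loads",
    evidence=["snapshot of the error page"],
)
print(report.missing_fields())
```

A report with a blank detail would list that field name, which tells the reporter what to add before the investigation starts.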
Let’s Start the Investigation:
- Once you receive the problem report, don’t rush to fix the problem. It’s worthwhile to find as many defects as possible and fix them together.
- Analyze the initial facts provided with the problem report. I would say read them multiple times.
- Recreate the issue using the same steps reported.
- Build a basic test case that does nothing more than recreate the issue.
- Run the test case and document the outcome as investigation facts.
You might be wondering why you should document the investigation facts. It has multiple benefits: i) it helps us draw better and more accurate conclusions; ii) if you leave in the middle of the investigation, or someone joins it, they can go through the facts and get a clear understanding; iii) the facts can be produced as evidence or reference when needed.
- Go through the investigation facts and draw conclusions.
- Expand the test cases based on your conclusions, i.e., to cover new suspects.
- Rerun test cases and document the outcomes as facts found during investigation.
- Repeat the previous three steps (draw conclusions, expand the test cases, rerun them) until we find the root cause or narrow it down to a specific area.
- Once we find the root cause, it’s time to think of possible quick and permanent solutions.
- Apply the best possible solution. Depending on the urgency, we may work in two rounds: apply the quickest solution first, then follow up with the permanent one.
- Rerun test cases to ensure the solution resolved all the problems.
- Once your fix moves to the dev/staging environment, test it there using the same steps that were reported.
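The loop of running test cases, documenting the outcomes as facts, and expanding the cases to cover new suspects can be sketched roughly as follows. Everything here is an assumption made for illustration: `displayed_balance` is a hypothetical, deliberately buggy stand-in for the code behind Problem #3, and `Fact` and `run_cases` are invented helper names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Tuple


def displayed_balance(cents: List[int]) -> int:
    """Hypothetical code under investigation (deliberately buggy for the demo):
    it ignores withdrawals, i.e. negative amounts."""
    return sum(c for c in cents if c > 0)


@dataclass
class Fact:
    """One documented outcome of one test case."""
    recorded_at: str
    case: str
    expected: int
    actual: int

    @property
    def passed(self) -> bool:
        return self.expected == self.actual


def run_cases(cases: List[Tuple[str, List[int], int]], log: List[Fact]) -> None:
    """Run every (name, amounts, expected) case and append the outcome as a fact."""
    for name, amounts, expected in cases:
        actual = displayed_balance(amounts)
        timestamp = datetime.now(timezone.utc).isoformat()
        log.append(Fact(timestamp, name, expected, actual))


facts: List[Fact] = []

# Round 1: the basic test case that only recreates the reported steps.
run_cases([("reported steps", [10000, -2500], 7500)], facts)

# Round 2: expanded cases that cover new suspects (deposits vs. withdrawals).
run_cases(
    [
        ("deposits only", [10000, 5000], 15000),
        ("withdrawals only", [-2500], -2500),
    ],
    facts,
)

for fact in facts:
    status = "PASS" if fact.passed else "FAIL"
    print(f"{fact.recorded_at} {status} {fact.case}: "
          f"expected={fact.expected} actual={fact.actual}")
```

With this particular stand-in, the reported steps and the withdrawals-only case fail while the deposits-only case passes, so the documented facts point at withdrawal handling as the area to narrow down next.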
There are several debugging approaches and tools you can use during this process, such as:
- Print Debugging (or tracing)
- Backtracking
- Wolf Fencing
- Incremental development
- Problem simplification
- Debuggers
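To make one of these approaches concrete, here is a minimal sketch of wolf fencing: split the suspect region in half, check which half contains the problem, and repeat. It assumes the program can be modeled as a sequence of steps, that the starting state is valid, and that once the state becomes invalid it stays invalid. The function `first_bad_step` and the sample steps are hypothetical names invented for this example.

```python
from typing import Callable, Sequence

Step = Callable[[int], int]


def first_bad_step(steps: Sequence[Step], initial: int,
                   is_valid: Callable[[int], bool]) -> int:
    """Return the index of the first step that makes the state invalid,
    or -1 if the state is still valid after all steps.

    Assumes the initial state is valid and that an invalid state
    never becomes valid again in later steps."""

    def state_after(count: int) -> int:
        state = initial
        for step in steps[:count]:
            state = step(state)
        return state

    if is_valid(state_after(len(steps))):
        return -1

    good, bad = 0, len(steps)  # valid after `good` steps, invalid after `bad` steps
    while bad - good > 1:
        fence = (good + bad) // 2  # put the fence in the middle
        if is_valid(state_after(fence)):
            good = fence  # the wolf is on the far side of the fence
        else:
            bad = fence   # the wolf is on the near side of the fence
    return bad - 1


# Hypothetical balance pipeline in whole currency units; the third step is wrong on purpose.
def deposit(balance: int) -> int:
    return balance + 100


def fee(balance: int) -> int:
    return balance - 10


def refund(balance: int) -> int:
    return balance - 500  # deliberate bug: a refund should add, not subtract


def interest(balance: int) -> int:
    return balance + 5


def non_negative(balance: int) -> bool:
    return balance >= 0


pipeline = [deposit, fee, refund, interest]
culprit = first_bad_step(pipeline, 0, non_negative)
print(pipeline[culprit].__name__ if culprit >= 0 else "no bad step found")
```

The search replays the steps from the start at each fence position, which keeps the sketch simple; for expensive steps you would likely cache intermediate states instead.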
Further Reading:
- http://www.cs.cornell.edu/courses/cs312/2006fa/lectures/lec26.html
- https://opendsa-server.cs.vt.edu/ODSA/Books/Everything/html/debugmethods.html
- http://www.itu.dk/people/slauesen/Papers/DebuggingTechniques.pdf
- https://www.pembinatrails.ca/WhatWeOffer/SafetyHealthandEnvironment/Safety/Accident-Reporting/Pages/Accident-Investigations.aspx