Drive other programs from Python. Debug your own programs with pdb. The lab wraps the Unix du utility via subprocess and includes a debugging exercise that plants a subtle bug for you to find.
Theme
Two skills this week. They look unrelated but share a register: this is the week Python becomes a connector to the rest of your system.
The first skill is subprocess: running another program from Python and reading its output. Most Python automation is glue between existing tools. Want disk usage? Wrap du. Want git status? Wrap git status. Want to convert a video? Wrap ffmpeg. Reaching for subprocess is the right move whenever a Unix utility (or any command-line program) already does the work better than you could write it. The discipline is in handling the result correctly: exit codes, stdout, stderr, errors.
The second skill is pdb, the standard Python debugger. Until this week you have used print-debugging exclusively, and it has been fine for the problems you have faced. This week's lab plants a bug subtle enough that print-debugging is genuinely worse than pdb: the bug involves a wrong value computed somewhere deep in a function chain, and finding it with prints requires sprinkling the whole call tree with print(name, val) lines. With pdb you set a breakpoint, run, inspect, step.
You also pick up try/except properly this week, completing what week 5 introduced as a forward-pointer. By the end of week 9 you can: launch a subprocess and read its output safely; recognize the shell-injection risk and avoid shell=True; set a pdb breakpoint and use the four core commands (n, s, c, p); read a Python traceback root-up; catch specific exceptions and let unexpected ones propagate.
Reading list (~1 hour)
- Matthes, Python Crash Course 2nd ed., Ch 10.4 ("Exceptions"). Matthes covers try/except/else/finally and the common exception classes. FND-102 week 9 is where you finally get this in depth.
- Sweigart, Automate the Boring Stuff with Python 2nd ed., Ch 11 ("Debugging") at https://automatetheboringstuff.com/2e/chapter11/. Free online. Covers tracebacks, assertions, and logging-as-debugging. Sweigart skips pdb in favor of IDE debuggers; FND-102 teaches pdb because every Linux server you SSH into has it.
- Python subprocess module docs at https://docs.python.org/3/library/subprocess.html. Read at least the "Using the subprocess Module" section and the subprocess.run reference. ~25 min.
- Python pdb module docs at https://docs.python.org/3/library/pdb.html. Skim the command reference. ~15 min. The full four-command vocabulary you need is in the lecture below.
- Real Python: "Python Debugging With Pdb" at https://realpython.com/python-debugging-pdb/. ~25 min read. Worked examples; the only Real Python article on pdb worth a careful read.
Lecture outline (~1.5 hours, 2 sessions of ~50 min)
Session 1: subprocess and try/except
Section 1.1: subprocess.run, the safe default
- The minimal pattern:

    import subprocess

    result = subprocess.run(['ls', '-la'], capture_output=True, text=True)
    print(result.stdout)
    print('exit code:', result.returncode)

- Arguments:
  - First argument is a LIST of strings (the command and its arguments). NOT a single string.
  - capture_output=True captures stdout and stderr instead of letting them inherit the parent's terminal.
  - text=True decodes stdout/stderr as text instead of bytes, using the locale's preferred encoding (UTF-8 on most Linux and macOS systems). Without it, result.stdout is bytes and you must .decode() manually.
- Return value is a CompletedProcess object with attributes:
  - result.returncode: the exit status (0 = success)
  - result.stdout: captured stdout as a string
  - result.stderr: captured stderr as a string
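To make the text=True point concrete, here is a minimal sketch that runs the same child process twice and compares what comes back. It uses sys.executable (the running Python interpreter) as the child program instead of ls, purely so the example behaves the same on any operating system; that substitution is an assumption of this sketch, not part of the lab.

```python
import subprocess
import sys

# sys.executable is the running Python interpreter, so this works on any OS.
command = [sys.executable, "-c", "print('hello from the child')"]

# Same command, with and without text=True.
as_text = subprocess.run(command, capture_output=True, text=True)
as_bytes = subprocess.run(command, capture_output=True)

print(type(as_text.stdout).__name__, repr(as_text.stdout))    # a str
print(type(as_bytes.stdout).__name__, repr(as_bytes.stdout))  # a bytes object
print("exit code:", as_text.returncode)
```

Both calls return a CompletedProcess; only the type of stdout and stderr changes.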
Section 1.2: shell=True is dangerous
- The convenient form:

    result = subprocess.run('ls -la ' + user_input, shell=True, capture_output=True, text=True)

- The danger: if user_input is '; rm -rf ~', the shell happily runs ls -la ; rm -rf ~. This is a shell injection vulnerability. The standard example, but real software gets owned by it regularly.
- The safe form: pass arguments as a list. The OS does NOT pass them through a shell; the shell metacharacters in user_input are just data.

    result = subprocess.run(['ls', '-la', user_input], capture_output=True, text=True)

- Rule: do not use shell=True with any string that includes user input. Better rule: do not use shell=True. Almost every use case has a list-form equivalent.
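One way to see "metacharacters are just data" without running anything destructive is to have the child process print the arguments it actually received. The sketch below is illustrative only: the hostile-looking string and the tiny child program are made up for this example, and sys.executable stands in for a real command-line tool.

```python
import subprocess
import sys

# Hypothetical hostile input: harmless here because no shell ever parses it.
user_input = "notes.txt; echo INJECTED"

# The child prints its own argument list so we can see what it received.
child_code = "import sys; print(sys.argv[1:])"

result = subprocess.run(
    [sys.executable, "-c", child_code, user_input],  # list form, no shell=True
    capture_output=True,
    text=True,
)
print(result.stdout)  # ['notes.txt; echo INJECTED']
```

The semicolon arrives inside a single argument. No second command runs, because nothing ever interpreted the string as shell syntax.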
Section 1.3: Exit codes
- Convention: 0 means success, nonzero means failure.
- Check explicitly:

    result = subprocess.run(['ls', '/nope'], capture_output=True, text=True)
    if result.returncode != 0:
        print('ls failed:', result.stderr)

- Or use check=True to raise CalledProcessError on nonzero exit:

    try:
        result = subprocess.run(['ls', '/nope'], capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as e:
        print('ls failed:', e.stderr)

- The check=True pattern matches the Pythonic "raise on error" style; the explicit-check pattern matches Unix script style. Pick the one that fits your tool.
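The two styles can be compared side by side on a command whose exit code is known in advance. This is a sketch under one assumption: the "failing tool" is a one-line Python child that exits with status 3, chosen so the result does not depend on which files exist on your machine. The last part shows a related case neither style covers: if the program itself cannot be found, subprocess.run raises FileNotFoundError before any exit code exists.

```python
import subprocess
import sys

# Stand-in for a failing tool: writes to stderr and exits with status 3.
failing = [sys.executable, "-c",
           "import sys; print('boom', file=sys.stderr); sys.exit(3)"]

# Style 1: inspect returncode yourself.
result = subprocess.run(failing, capture_output=True, text=True)
if result.returncode != 0:
    print(f"explicit check: exit {result.returncode}, stderr={result.stderr.strip()!r}")

# Style 2: let check=True raise, then read the same details off the exception.
caught = None
try:
    subprocess.run(failing, capture_output=True, text=True, check=True)
except subprocess.CalledProcessError as e:
    caught = e
    print(f"check=True: exit {e.returncode}, stderr={e.stderr.strip()!r}")

# A missing program is a different failure: there is no process, so no exit code.
missing = False
try:
    subprocess.run(["definitely-not-a-real-command-xyz"], capture_output=True, text=True)
except FileNotFoundError:
    missing = True
    print("command not found")
```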
Section 1.4: try / except in depth
- The basic pattern:

    try:
        value = int(user_input)
    except ValueError as e:
        print(f'not a number: {e}')
        value = 0

- Catch the SPECIFIC exception class. except: (bare) catches everything including KeyboardInterrupt and SystemExit, and hides real bugs. Always name the class.
- Multiple exception types:

    try:
        ...
    except (FileNotFoundError, PermissionError) as e:
        print(f'cannot open file: {e}')

- try/except/else/finally:
  - else runs only if try succeeded (no exception)
  - finally runs always (success or exception); useful for cleanup
- The "EAFP" idiom: Easier to Ask Forgiveness than Permission. Pythonic style prefers try: x[k]; except KeyError: default over if k in x: x[k]; else: default. Both work; EAFP wins when the missing-key case is genuinely exceptional.
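The order in which else and finally run is easiest to remember after watching it once. The sketch below records which clauses execute on a good input and on a bad one; parse_port is a hypothetical helper invented for this illustration.

```python
def parse_port(raw):
    """Return (port, steps); steps records which clauses ran, in order."""
    steps = []
    try:
        steps.append("try")
        port = int(raw)          # raises ValueError on non-numeric input
    except ValueError:
        steps.append("except")   # only on failure
        port = 0
    else:
        steps.append("else")     # only when the try body raised nothing
    finally:
        steps.append("finally")  # always, success or failure
    return port, steps

print(parse_port("8080"))    # (8080, ['try', 'else', 'finally'])
print(parse_port("eighty"))  # (0, ['try', 'except', 'finally'])
```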
Session 2: pdb
Section 2.1: The four commands you need
- Set a breakpoint by inserting one line in your code:

    import pdb; pdb.set_trace()
    # OR, in Python 3.7+:
    breakpoint()

- When execution hits this line, you get an interactive prompt:

    (Pdb)

- Four commands handle 95% of debugging:
  - n (next): execute the current line; if it is a function call, do NOT step into it
  - s (step): execute the current line; if it is a function call, DO step into it
  - c (continue): resume normal execution until the next breakpoint or program end
  - p variable (print): print the current value of variable
Section 2.2: Inspecting state
- p variable prints the value.
- pp variable pretty-prints (for nested structures).
- l (list) shows the source around the current line.
- w (where) prints the call stack so you can see how you got here.
- Type any Python expression at the (Pdb) prompt to evaluate it. p len(my_list) prints the length. p sum(my_dict.values()) prints the sum.
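pdb is interactive, so it is hard to show on paper. As a rough preview of what a session looks like, the sketch below drives pdb from a script by handing it a canned list of commands instead of a keyboard. This is not how you will use pdb in the lab (you will type at the (Pdb) prompt); feeding commands through the stdin and stdout arguments of pdb.Pdb is only a device for this illustration, and the average function is a made-up example.

```python
import io
import pdb

def average(nums):
    total = sum(nums)
    count = len(nums)
    return total / count

def scripted_session(commands):
    """Run average([2, 4, 6]) under pdb, feeding it the given commands."""
    fake_keyboard = io.StringIO("\n".join(commands) + "\n")
    captured = io.StringIO()
    # readrc=False ignores any personal .pdbrc; nosigint=True leaves Ctrl-C handling alone.
    debugger = pdb.Pdb(stdin=fake_keyboard, stdout=captured,
                       nosigint=True, readrc=False)
    # runcall stops at the first line of the function, like a breakpoint there.
    result = debugger.runcall(average, [2, 4, 6])
    return result, captured.getvalue()

# p nums: inspect the argument; n: run one line; p total: inspect; c: finish.
result, transcript = scripted_session(["p nums", "n", "p total", "c"])
print(transcript)
print("result:", result)
```

The transcript contains the location pdb stopped at, followed by the values printed by each p command. The exact location lines vary with Python version and file name, so read them for shape rather than detail.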
Section 2.3: Setting breakpoints without editing the source
- Run your script under pdb:
python3 -m pdb my_script.py
- This drops into pdb at the first line. Use b filename.py:42 to set a breakpoint at line 42, then c to continue to it.
- Useful when you cannot modify the source (a vendored module, for example).
Section 2.4: Reading a traceback
- A Python traceback prints top-to-bottom in CALL order: the outermost frame first, the innermost (where the exception was raised) last.
- Read root-up: start at the LAST line (the exception message), then work upward to see HOW you got there.
- Example:

    Traceback (most recent call last):
      File "scan.py", line 23, in <module>
        main()
      File "scan.py", line 18, in main
        result = process(data)
      File "scan.py", line 10, in process
        return data[0] / data[1]
    ZeroDivisionError: division by zero

- The exception is ZeroDivisionError at line 10 of scan.py in function process. process was called from main at line 18. main was called from the top of the file at line 23. To fix: either prevent data[1] from being zero, or wrap the division in try/except ZeroDivisionError.
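The call-order layout can also be checked from code. The sketch below rebuilds a small version of the scan.py example (the function bodies are guesses made for illustration, since the real file is not shown) and uses the standard traceback module to list the frames in the order a traceback prints them.

```python
import traceback

def process(data):
    return data[0] / data[1]   # raises ZeroDivisionError when data[1] == 0

def main():
    return process([1, 0])

def frames_in_call_order():
    """Trigger the crash and return (frame names, final traceback line)."""
    try:
        main()
    except ZeroDivisionError as e:
        # extract_tb lists frames outermost first, innermost (raise site) last.
        names = [frame.name for frame in traceback.extract_tb(e.__traceback__)]
        # The last line of a traceback is the exception type and message.
        last_line = traceback.format_exception_only(type(e), e)[-1].strip()
        return names, last_line

names, last_line = frames_in_call_order()
print(names)      # ['frames_in_call_order', 'main', 'process']
print(last_line)  # ZeroDivisionError: division by zero
```

Reading root-up means starting from last_line, then walking names from right to left.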
Section 2.5: When pdb beats print, when print beats pdb
- pdb wins when the bug is "wrong value somewhere in a deep call chain" or "rare condition I can't reliably trigger." Set a breakpoint at the suspicious line; inspect interactively.
- print wins when the bug is "this loop is doing something weird" (sprinkle prints in the loop body) or when you cannot use stdin (the program is running unattended). Print scales to "every iteration"; pdb requires you to step.
- A middle option: logging.debug with --debug flag (Lab 6 pattern). Production code uses logging, not pdb or print.
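For reference, a --debug switch wired to the logging module might look roughly like the sketch below. Lab 6 is not reproduced here, so treat the details (the configure_logging helper, the logger name, the message format) as assumptions rather than the lab's actual code. The force=True argument requires Python 3.8 or later.

```python
import argparse
import logging

def configure_logging(argv):
    """Parse argv for --debug and set the root log level accordingly."""
    parser = argparse.ArgumentParser(description="demo of a --debug switch")
    parser.add_argument("--debug", action="store_true", help="show debug messages")
    args = parser.parse_args(argv)
    level = logging.DEBUG if args.debug else logging.WARNING
    # force=True replaces any earlier configuration so the demo can run twice.
    logging.basicConfig(level=level, format="%(levelname)s %(message)s", force=True)
    return level

log = logging.getLogger("demo")

configure_logging(["--debug"])
log.debug("value inside the loop: %s", 42)  # emitted, because --debug was given
debug_on = log.isEnabledFor(logging.DEBUG)

configure_logging([])
log.debug("this message is suppressed")     # dropped, default level is WARNING
debug_off = log.isEnabledFor(logging.DEBUG)
```

The debug calls stay in the code permanently; the flag decides whether they produce output, which is what makes this usable in unattended programs.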
Labs (~90 minutes)
Lab 9: Disk-Usage Reporter + Debugging Exercise (labs/lab-9-disk-usage.md)
- Goal: build a CLI tool that wraps du via subprocess and emits a human-readable directory-size summary; then debug a planted bug using pdb
- Time: ~90 minutes (60 min for the tool, 30 min for the debug exercise)
- Artifact: lab-9-du.py + lab-9-bug.py (with your fix committed) in ~/fnd-102/lab-9/
Independent practice (~4 hours)
- subprocess drills (45 min). Wrap five common shell commands with subprocess.run. For each, print the output and the exit code:
  - date (no args)
  - ls -la /tmp
  - df -h (Unix) or wmic logicaldisk get size,freespace,caption (Windows; or use shutil.disk_usage instead)
  - python3 --version
  - git status from the FND-102 directory

  For each, decide: list form (safe) or shell form (dangerous). Always pick list form for these.
- Shell-injection demo (30 min). Write a small Python script that calls subprocess.run('echo ' + name, shell=True). Call it with name = 'hello'; it prints "hello". Call it with name = 'hello; touch INJECTED'; observe the file INJECTED appears. Now rewrite with the list form (subprocess.run(['echo', name])); confirm that the list form treats the metacharacters as data.
- try/except practice (45 min). Take your Lab 5 scanner and add proper exception handling for:
  - Missing input file (FileNotFoundError)
  - Permission denied (PermissionError)
  - Decoding error on a binary file (UnicodeDecodeError)

  For each, the program should print a clear error message and exit nonzero. Test by deliberately triggering each.
- pdb exploration (45 min). Take a known-buggy function:

    def average(nums):
        return sum(nums) / len(nums) + 1  # bug: spurious +1

    def main():
        result = average([1, 2, 3, 4, 5])
        print(result)

    if __name__ == '__main__':
        main()

  Set a breakpoint() inside average; run it; step through with n and p; identify the +1. Then fix it. The point: even on a trivial bug, the muscle memory matters.
- Traceback reading drill (30 min). Write three programs that crash in three different ways:
  - IndexError: [1, 2, 3][5]
  - KeyError: {'a': 1}['b']
  - TypeError: 'hello' + 5

  Read each traceback root-up. Describe in one sentence what went wrong and how to fix it.
- EAFP vs LBYL (30 min). Take this LBYL ("Look Before You Leap") code:

    if 'name' in user:
        greet(user['name'])
    else:
        print('no name')

  Rewrite as EAFP ("Easier to Ask Forgiveness than Permission"):

    try:
        greet(user['name'])
    except KeyError:
        print('no name')

  When is each more readable? When does each have a performance edge? (Hint: think about the success case being common vs rare.)
Reflection prompts (~30 minutes)
- The shell-injection demo (practice 2) made the danger of shell=True concrete. Did the demo change how you'd write similar code in the future?
- The pdb exploration (practice 4) showed pdb on a trivial bug. How would print-debugging have compared? On what kind of bug would pdb be clearly worth the setup cost?
- Tracebacks read top-to-bottom in call order but are most informative root-up. Did you read the three crash tracebacks (practice 5) top-down first? Which way is faster for you now?
- try/except lets a program continue past errors. Your week-5 scanner crashed on a missing file; your week-9 version handles it. What new failure modes did you NOT handle, intentionally? (Hint: every program has unhandled failure modes; the discipline is to be intentional about which.)
- One thing from this week you want to know more about?
Tool journal (week 9)
- subprocess.run: run another program from Python
- capture_output=True, text=True: the safe defaults
- List form vs shell=True: shell injection avoidance
- subprocess.CalledProcessError: nonzero exit handling
- try/except/else/finally: exception handling shapes
- pdb and breakpoint(): interactive debugger
- pdb commands: n, s, c, p, l, w
- Traceback reading discipline: root-up
What comes next
Week 10 picks up git intermediate skills: branches, remotes, pull requests. Lab 10 submits your Lab 9 disk-usage reporter as a git PR for instructor review. This is the workflow most working programmers use every day, and Lab 10 is your first experience with code review.