The shape of a real program: code split across files, plus the argparse and logging modules. The lab is a refactor: take Lab 5's scanner, give it a real argparse CLI and a logging debug log.
Theme
Single-file Python programs do not scale. Past ~500 lines, a script becomes hard to navigate. The fix is the same fix as for functions in week 3: split the code into named pieces. The unit one level up from a function is a module: a .py file that other code can import.
This week also introduces Python's standard library, the collection of modules that ship with the interpreter. The stdlib is large and growing; you do not need to memorize it. The discipline is to know which problems have stdlib solutions, then look up the module when you need it. Two modules are this week's focus: argparse (CLI argument parsing, used in every CLI tool you write from now on) and logging (structured debug and audit output, replacing print-debugging for non-trivial programs).
The lab is a refactor. Lab 5 was a working scanner; this week you rewrite it with a real CLI (--input PATH, --output PATH, --top N, --verbose) and a proper logging setup (DEBUG to a file, INFO to stderr). The scanner's behavior does not change; the interface becomes professional. This is the difference between a script you wrote and a tool you would hand to a coworker.
By the end of week 6 you can: organize a multi-file Python project; use import correctly (module imports, from-imports, aliases); write an argparse.ArgumentParser with positional and optional arguments; configure the logging module to send output to multiple destinations; recognize when to reach for a stdlib module instead of writing your own.
Reading list (~1 hour)
- Matthes, Python Crash Course 2nd ed., Ch 8.5 ("Storing Your Functions in Modules"). Matthes covers import, from ... import, aliases, and module organization.
- Sweigart, Automate the Boring Stuff with Python 2nd ed., Appendix B ("Running Programs") and chapter excerpts on argparse (Ch 14: "Working with PDF and Word Documents" uses argparse incidentally) at https://automatetheboringstuff.com/2e/chapter14/. Sweigart's book does not have a dedicated argparse chapter; the official Python tutorial fills that gap (next reading).
- Python official argparse tutorial at https://docs.python.org/3/howto/argparse.html. ~20 min read. The canonical reference; the patterns it shows are the ones you should use.
- Python official logging tutorial at https://docs.python.org/3/howto/logging.html. ~20 min read. The tutorial is split into a basic and an advanced part; for FND-102 you need the "basic logging tutorial" section only.
- Real Python: "Logging in Python" at https://realpython.com/python-logging/. ~20 min read. Worked examples beyond the official tutorial.
Lecture outline (~1.5 hours, 2 sessions of ~50 min)
Session 1: Modules and imports
Section 1.1: What a module is
- A module is a .py file. Period. mymath.py:

      def square(n):
          return n * n

      def cube(n):
          return n ** 3

- From another file in the same directory:

      import mymath

      print(mymath.square(5))  # 25
      print(mymath.cube(3))    # 27

- The first time mymath is imported, Python runs the file top-to-bottom (executing function definitions and any top-level code). The functions become attributes of the module object.
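The run-once behavior described above can be observed directly. The following is a minimal sketch, not part of the course materials: it writes a throwaway module (demo_mod is a made-up name) into a temporary directory, adds that directory to sys.path, and imports the module twice.

```python
import sys
import tempfile
from pathlib import Path

# Write a throwaway module to a temporary directory.
# "demo_mod" is a hypothetical name used only for this demonstration.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_mod.py").write_text(
    "print('demo_mod: top-level code running')\n"
    "def square(n):\n"
    "    return n * n\n"
)
sys.path.insert(0, tmp_dir)  # make the temporary directory importable

import demo_mod  # first import: the file runs, so the print fires
import demo_mod  # second import: served from the sys.modules cache, nothing runs

print(demo_mod.square(5))           # 25
print("demo_mod" in sys.modules)    # True: the module object is cached
print(hasattr(demo_mod, "square"))  # True: the function is a module attribute
```

The top-level print appears once even though the module is imported twice, which is why expensive or side-effecting code at module top level is usually a mistake.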
Section 1.2: Import variations
- import mymath: bind the module to the name mymath. Use mymath.square(5).
- import mymath as mm: alias. Use mm.square(5). Common for long module names (import numpy as np).
- from mymath import square: bind square directly. Use square(5). No mymath. prefix.
- from mymath import square, cube: multiple names at once.
- from mymath import *: import everything. Almost always wrong; pollutes your namespace. Avoid.
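The variations above, shown side by side. This sketch uses the stdlib math module as a stand-in for mymath so that it runs without any extra files; the forms are the same for your own modules.

```python
import math                   # bind the module; call as math.sqrt(...)
import math as m              # alias: the same module object under a shorter name
from math import sqrt         # bind one name directly; no prefix needed
from math import floor, ceil  # several names at once

print(math.sqrt(16))          # 4.0
print(m.sqrt(16))             # 4.0
print(sqrt(16))               # 4.0
print(floor(2.5), ceil(2.5))  # 2 3
print(m is math)              # True: an alias does not copy the module
```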
Section 1.3: The standard library
- Python ships with ~200 modules in the standard library. Some you will use frequently in FND-102:
  - os and pathlib: filesystem
  - sys: interpreter and command-line interface
  - json, csv: structured data
  - argparse: CLI argument parsing
  - logging: structured output
  - re: regular expressions
  - subprocess: running other programs
  - hashlib: cryptographic hashing
  - datetime: dates and times
  - collections: Counter, defaultdict, OrderedDict, etc.
  - random: pseudorandom numbers
  - urllib, http: basic HTTP (week 12 uses requests, a third-party package, but urllib is the stdlib alternative)
- A complete stdlib reference is at https://docs.python.org/3/library/. Bookmark it.
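As one illustration of reaching for the stdlib instead of writing your own, the sketch below tallies log levels twice: once by hand and once with collections.Counter. The sample lines are invented for the example.

```python
from collections import Counter

# Invented sample data: each line starts with its log level.
lines = [
    "ERROR disk full",
    "INFO started",
    "ERROR disk full",
    "WARNING slow response",
    "ERROR timeout",
]

# Hand-written tally: works, but it is code you have to write and test.
levels_by_hand = {}
for line in lines:
    level = line.split()[0]
    levels_by_hand[level] = levels_by_hand.get(level, 0) + 1

# Stdlib tally: one line, plus helpers such as most_common().
levels = Counter(line.split()[0] for line in lines)

print(levels.most_common(1))           # [('ERROR', 3)]
print(dict(levels) == levels_by_hand)  # True: same counts either way
```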
Section 1.4: Third-party packages
- The stdlib has limits. requests (HTTP), numpy (arrays), pandas (tables), pytest (testing) are third-party.
- Install with pip:

      python3 -m pip install requests

- Use the same way as stdlib:

      import requests

      resp = requests.get('https://example.com')
      print(resp.status_code)

- For FND-102: you install requests in week 12 and pytest in week 13. Everything else is stdlib.
Section 1.5: Project structure
- A small project organized as multiple files:

      my-scanner/
      ├── README.md
      ├── scan.py          # main entry point
      ├── log_reader.py    # reads + iterates log files
      ├── parser.py        # parses log lines
      └── tests/
          └── test_parser.py

- scan.py is the entry point: it holds the if __name__ == '__main__': guard and imports from the helper modules.
- For very small tools (your Lab 6 included), a single file is fine. The multi-file pattern matters when modules exceed ~200 lines.
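Why the entry point uses the if __name__ == '__main__': guard can be shown in a few lines. The sketch below is an illustration under assumptions: guard_demo is a made-up file written to a temporary directory, and runpy.run_path with run_name='__main__' stands in for running python3 guard_demo.py from a shell.

```python
import runpy
import sys
import tempfile
from pathlib import Path

# Source of a tiny script with an import-safe entry point.
# "guard_demo" is a hypothetical name used only for this demonstration.
source = (
    "def main():\n"
    "    print('main() ran')\n"
    "\n"
    "print('__name__ is', __name__)\n"
    "if __name__ == '__main__':\n"
    "    main()\n"
)
tmp_dir = tempfile.mkdtemp()
script = Path(tmp_dir, "guard_demo.py")
script.write_text(source)
sys.path.insert(0, tmp_dir)

# Imported: __name__ is the module name, so main() does not run.
import guard_demo

# Executed as a script: __name__ is '__main__', so main() runs.
globals_after_run = runpy.run_path(str(script), run_name="__main__")
```

The import prints "__name__ is guard_demo" only; the script run prints "__name__ is __main__" followed by "main() ran". This is what lets tests/test_parser.py import a module without triggering its command-line behavior.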
Session 2: argparse and logging
Section 2.1: argparse basics
- The standard pattern:

      import argparse

      def build_parser():
          parser = argparse.ArgumentParser(
              description='Scan a log file for ERROR lines.'
          )
          parser.add_argument('input', help='path to the log file')
          parser.add_argument('--top', type=int, default=10,
                              help='show top N matches (default: 10)')
          parser.add_argument('--verbose', '-v', action='store_true',
                              help='enable debug logging')
          return parser

      def main():
          args = build_parser().parse_args()
          print(f'input: {args.input}')
          print(f'top: {args.top}')
          print(f'verbose: {args.verbose}')

      if __name__ == '__main__':
          main()

- Positional arguments ('input') are required; the value is in args.input.
- Optional arguments ('--top') start with --; the value is in args.top (Python attribute-name conversion: --my-arg becomes args.my_arg).
- type=int parses the string into an int (or errors out).
- default=10 is the fallback if --top is not passed.
- action='store_true' is the "flag" pattern: passing --verbose sets args.verbose to True; not passing it leaves it False.
- argparse auto-generates --help from your descriptions. Run python3 myscript.py --help to see it. A --help that reads like documentation is the goal.
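The rules in the list above can be checked without a shell by handing parse_args an explicit argument list. This is a minimal sketch: the --dry-run flag is hypothetical, added only to show the dash-to-underscore conversion, and app.log is a placeholder path that is never opened.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Scan a log file for ERROR lines.")
    parser.add_argument("input", help="path to the log file")
    parser.add_argument("--top", type=int, default=10,
                        help="show top N matches (default: 10)")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="enable debug logging")
    # Hypothetical flag, present only to show that --dry-run becomes args.dry_run.
    parser.add_argument("--dry-run", action="store_true",
                        help="parse arguments but do not scan")
    return parser


parser = build_parser()

# Only the positional given: defaults fill in the rest.
defaults = parser.parse_args(["app.log"])
print(defaults.input, defaults.top, defaults.verbose)  # app.log 10 False

# Options given: type=int converts "3", -v sets the flag.
custom = parser.parse_args(["app.log", "--top", "3", "-v", "--dry-run"])
print(custom.top, custom.verbose, custom.dry_run)      # 3 True True

# A value that fails type=int makes argparse print a usage error and exit.
try:
    parser.parse_args(["app.log", "--top", "ten"])
except SystemExit as exc:
    exit_code = exc.code
print(exit_code)                                       # 2
```

Passing a list to parse_args is also a convenient way to unit-test a parser; with no argument it reads sys.argv, as in the pattern above.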
Section 2.2: argparse for real
- Common patterns beyond the basics:
  - Multiple required positionals: parser.add_argument('input'); parser.add_argument('output')
  - Optional with a value: parser.add_argument('--threshold', type=float, default=0.5)
  - Choices: parser.add_argument('--format', choices=['json', 'csv', 'text'], default='text')
  - Lists: parser.add_argument('--paths', nargs='+') accepts one or more paths
- The --help output is your interface documentation; rewrite the help strings until they read like prose. Example: not "--threshold THRESHOLD: a number" but "--threshold N: skip records with score below N (default: 0.5)". The placeholder shown after the option name comes from metavar (metavar='N'); the sentence comes from help.
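The patterns above, combined in one parser. This is a sketch under assumptions: report.py and the file names are placeholders, and metavar is used to get the short N and PATH placeholders in the help text.

```python
import argparse

parser = argparse.ArgumentParser(
    prog="report.py",  # placeholder program name for the usage line
    description="Filter scored records and write a report.",
)
parser.add_argument("--threshold", type=float, default=0.5, metavar="N",
                    help="skip records with score below N (default: 0.5)")
parser.add_argument("--format", choices=["json", "csv", "text"], default="text",
                    help="output format (default: text)")
parser.add_argument("--paths", nargs="+", metavar="PATH",
                    help="one or more input files")

args = parser.parse_args(
    ["--threshold", "0.8", "--format", "csv", "--paths", "a.log", "b.log"]
)
print(args.threshold)  # 0.8
print(args.format)     # csv
print(args.paths)      # ['a.log', 'b.log']

# format_help() returns the same text that --help prints.
help_text = parser.format_help()
print(help_text)
```

A value outside choices (for example --format xml) is rejected with a usage error, so the rest of the program never has to validate it.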
Section 2.3: The logging module
- print-debugging works for week-3 scripts. Past ~100 lines, you want logging:

      import logging

      logging.basicConfig(
          level=logging.INFO,
          format='%(asctime)s %(levelname)s %(message)s'
      )
      log = logging.getLogger(__name__)

      path, n, i = 'app.log', 3, 42  # placeholder values so the calls below run

      log.debug('reading file %s', path)        # not shown at INFO level
      log.info('scan complete: %d matches', n)  # shown
      log.warning('parse failed on line %d', i)
      log.error('file not found: %s', path)
      log.critical('database connection lost; shutting down')

- Five levels: DEBUG < INFO < WARNING < ERROR < CRITICAL. basicConfig(level=...) sets the minimum level to display; anything lower is silently dropped.
- The %s, %d arguments are formatted lazily: the logger builds the final string only if the message's level is enabled. Do NOT pre-format with f-strings (log.info(f'count: {n}')); the f-string is evaluated before the logger is even called, which defeats the lazy evaluation and is slower in the dropped case.
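The lazy-formatting claim can be verified by counting how often an argument is converted to a string. In this sketch, Expensive is a made-up class whose __str__ increments a counter, and the logger writes to an in-memory stream so the example is self-contained.

```python
import io
import logging


class Expensive:
    """Hypothetical object that counts how often it is turned into a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive-value"


stream = io.StringIO()
log = logging.getLogger("lazy_demo")
log.setLevel(logging.INFO)  # DEBUG messages are dropped
log.addHandler(logging.StreamHandler(stream))
log.propagate = False       # keep the demo independent of the root logger

lazy = Expensive()
log.debug("value: %s", lazy)  # dropped before any formatting happens
print(lazy.calls)             # 0

eager = Expensive()
log.debug(f"value: {eager}")  # f-string is built first, then the message is dropped
print(eager.calls)            # 1

shown = Expensive()
log.info("value: %s", shown)  # enabled level: formatted when the handler emits it
print(shown.calls)            # 1
print(stream.getvalue())      # value: expensive-value
```

For a plain integer the cost difference is small; it matters when the argument is expensive to render or the call sits in a tight loop.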
Section 2.4: Logging configuration
- Send DEBUG to a file, INFO+ to stderr:

      import logging

      file_handler = logging.FileHandler('scan.debug.log')  # no level set: receives everything
      stderr_handler = logging.StreamHandler()              # stderr by default
      stderr_handler.setLevel(logging.INFO)                 # drop DEBUG on stderr

      logging.basicConfig(
          level=logging.DEBUG,
          format='%(asctime)s %(name)s %(levelname)s %(message)s',
          handlers=[file_handler, stderr_handler]
      )
      # then in main(): adjust stderr_handler's level based on args.verbose

- A common pattern in CLI tools: --verbose flips the stderr handler from WARNING (default) to INFO or DEBUG. The file handler always logs DEBUG.
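One way to wire the --verbose pattern is a small setup function that takes the flag as a parameter. This is a sketch, not the lab solution: configure_logging is a hypothetical helper, the choice of INFO for the verbose case is an assumption, and force=True (Python 3.8+) is used so that the call replaces any earlier root configuration.

```python
import logging
import tempfile
from pathlib import Path


def configure_logging(log_path, verbose=False):
    """Hypothetical helper: DEBUG to a file, WARNING (or INFO if verbose) to stderr."""
    file_handler = logging.FileHandler(log_path)
    file_handler.setLevel(logging.DEBUG)

    stderr_handler = logging.StreamHandler()  # stderr by default
    stderr_handler.setLevel(logging.INFO if verbose else logging.WARNING)

    logging.basicConfig(
        level=logging.DEBUG,  # the root logger passes everything; handlers filter
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        handlers=[file_handler, stderr_handler],
        force=True,           # Python 3.8+: replace handlers configured earlier
    )
    return file_handler, stderr_handler


# A temporary directory keeps the demo from writing into the working directory.
log_path = Path(tempfile.mkdtemp()) / "scan.debug.log"
file_handler, stderr_handler = configure_logging(log_path, verbose=False)

log = logging.getLogger("scanner")
log.debug("reading file %s", "app.log")     # file only
log.info("scan complete: %d matches", 3)    # file only (stderr is at WARNING)
log.warning("parse failed on line %d", 42)  # file and stderr

file_text = log_path.read_text()
print(file_text)
```

In a real tool, main() would call configure_logging(path, verbose=args.verbose) once, before any other module logs.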
Section 2.5: print vs logging: when to use which
- print is for "this output is what the user asked for." The scanner's result list.
- logging is for "this output is operational." Progress messages, debug traces, warnings about input quality.
- Conventionally: print writes to stdout, which is the result; logging writes to stderr, which is the commentary. A user who runs myscript > result.csv 2> debug.log separates them cleanly.
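The stdout/stderr split can be seen without a shell by capturing both streams in memory. This is an illustrative sketch: run_scan and the sample lines are invented, and contextlib's redirect helpers stand in for the shell's > and 2> redirections.

```python
import contextlib
import io
import logging


def run_scan(lines, log):
    """Print matching lines (the result) and log progress (the commentary)."""
    log.info("scanned %d lines", len(lines))
    for line in lines:
        if "ERROR" in line:
            print(line)


lines = ["INFO started", "ERROR disk full", "INFO stopped"]  # invented sample data
captured_out, captured_err = io.StringIO(), io.StringIO()

with contextlib.redirect_stdout(captured_out), contextlib.redirect_stderr(captured_err):
    log = logging.getLogger("stream_demo")
    log.setLevel(logging.INFO)
    log.propagate = False              # keep the demo independent of the root logger
    handler = logging.StreamHandler()  # binds to the current (redirected) sys.stderr
    log.addHandler(handler)
    run_scan(lines, log)
    log.removeHandler(handler)

print("stdout got:", repr(captured_out.getvalue()))  # stdout got: 'ERROR disk full\n'
print("stderr got:", repr(captured_err.getvalue()))  # stderr got: 'scanned 3 lines\n'
```

Because the result and the commentary travel on different streams, piping the tool's output into another program never mixes progress messages into the data.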
Labs (~90 minutes)
Lab 6: Argparse + Logging Refactor (labs/lab-6-argparse-logging.md)
- Goal: take Lab 5's scanner and rewrite it with a real argparse CLI and logging-based debug output
- Time: ~90 minutes
- Artifact: lab-6-scanner.py in ~/fnd-102/lab-6/, committed to Git
Independent practice (~4 hours)
- argparse drills (45 min). Build five small CLI tools that each demonstrate one argparse pattern:
  - greet.py NAME (one positional)
  - add.py N M (two positionals, both ints)
  - pick.py --choice red,blue,green (choices)
  - flag.py --verbose (flag)
  - paths.py FILE1 FILE2 [FILE3 ...] (nargs +)
  Run each with --help; verify the output is readable.
- logging exploration (30 min). Add logging to your Lab 5 scanner at four levels: DEBUG ("reading line 1234"), INFO ("found 500 errors"), WARNING ("parse failed on line N"), ERROR ("file not found"). Run with and without --verbose; verify the level filter works.
- Multi-module project (60 min). Refactor Lab 5 into a package of three files:
  - scanner/__init__.py (empty; this makes the directory a package)
  - scanner/reader.py (contains the scan generator)
  - scanner/main.py (the argparse + main loop)
  Run with python3 -m scanner.main. This is the conventional shape for a Python package.
- Read a stdlib module (30 min). Pick one stdlib module you have not used and read its docs (https://docs.python.org/3/library/). Suggestions: datetime, collections, itertools, functools. Write 3 things the module does and 1 thing you might use it for.
- --help readability audit (30 min). Take your Lab 6 --help output. Read it as if you had never seen the tool. Is it clear what --top N does? What --verbose controls? Rewrite any help string that is field-name shaped ("verbose") into sentence-shaped ("--verbose: print extra debug output to stderr").
- Optional stretch (45 min). Add a --config FILE argument that reads default values from a JSON config file. CLI arguments override the config; config overrides hard-coded defaults. This is the layered-config pattern that many real CLI tools use.
Reflection prompts (~30 minutes)
- Your week-5 scanner is a script; your week-6 scanner is a tool. Articulate the difference in two sentences. What did the CLI interface and the logging add?
- The stdlib has ~200 modules. You used 1 (csv) in week 5, 4 in week 6 (csv, json, argparse, logging). At what point would you start writing your own module instead of looking for a stdlib one?
- logging.info('count: %d', n) uses lazy formatting; logging.info(f'count: {n}') does not. The first is preferred in tight loops. Did you write any logging this week in the f-string form? Refactor.
- print vs logging: in your Lab 6 scanner, what is printed to stdout and what is logged to stderr? Why?
- One thing from this week you want to know more about?
Tool journal (week 6)
- import module, from module import name, import module as alias: import variations
- argparse.ArgumentParser: build a CLI
- add_argument with positional, optional, flag, choices, nargs patterns
- logging.getLogger, logging.basicConfig: structured output
- Log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- FileHandler, StreamHandler: multi-destination logging
- if __name__ == '__main__':: import-safe entry point
What comes next
Week 7 introduces regular expressions. Your Lab 5 + Lab 6 scanner uses 'ERROR' in line for a simple substring match; week 7's lab extracts specific patterns (IPv4 and IPv6 addresses) from log lines using re.findall. Regex is the standard tool for "find structured data inside unstructured text."