Week 12: HTTP with `requests`

This is the first week the course touches the network. You use the requests library to read a public weather API and build a CLI tool that prints today's forecast.


Theme

Up to this week, everything you have built ran entirely on your own machine: files, processes, hashes. Week 12 introduces the network as a data source. The requests library is Python's de facto HTTP client. It is a third-party package (pip install requests), yet it is so widely used that most code reaches for it instead of the standard library's urllib.request, the older and more awkward alternative.

The week's lab is a weather-report CLI: pass a city or coordinates, the tool calls a free weather API, parses the JSON response, and prints today's forecast. The pattern is the foundation for everything else you will do over a network: any REST API, any HTTP-based scraping, any cloud-service automation.

Two cross-cutting disciplines come up. First: handling failure. A network call can fail in dozens of ways (no internet, DNS error, server down, rate-limited, malformed response). A tool that crashes on every transient failure is unusable; a tool that silently swallows every failure is dangerous. The discipline is in the middle: catch what you expect; let unexpected failures propagate with context. Second: protecting secrets. Many APIs require keys; embedding a key in your source code and committing it to GitHub is a top-10 way to get an account drained. The course lab uses a key-free API for that reason, but you will learn the .env + .gitignore discipline that real tools use.

By the end of week 12 you can: use requests.get and requests.post; check response.status_code; parse response.json(); handle the common failure modes (timeout, connection error, non-200, malformed JSON); store API keys outside your code; respect rate limits.

Reading list (~1 hour)

  1. Matthes, Python Crash Course 2nd ed., Ch 17 ("Working with APIs"). Matthes' weather-API-style example matches FND-102's lab almost exactly. Read it carefully.
  2. requests documentation: "Quickstart" at https://requests.readthedocs.io/en/latest/user/quickstart/. ~20 min read. The canonical first-pass reference. Read at least through "Response Content" and "JSON Response Content."
  3. Real Python: "Python's Requests Library (Guide)" at https://realpython.com/python-requests/. ~30 min read. The most thorough single source; covers timeouts, retries, sessions.
  4. Mozilla Developer Network: "An overview of HTTP" at https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview. ~15 min read. The HTTP basics every developer should know.

Lecture outline (~1.5 hours, 2 sessions of ~50 min)

Session 1: HTTP basics and requests

Section 1.1: HTTP in 5 minutes

  • An HTTP request is a client asking a server for a resource at a URL.
  • The request has a method: GET (read), POST (create), PUT (update), DELETE (delete), PATCH (partial update), HEAD (metadata only). 95% of FND-102 work uses GET; week 12's lab uses GET exclusively.
  • The response has a status code: 200 (OK), 201 (Created), 204 (No Content), 301/302 (Redirect), 400 (Bad Request), 401 (Unauthorized), 403 (Forbidden), 404 (Not Found), 429 (Too Many Requests), 500 (Internal Server Error), 502/503/504 (server errors).
  • Status codes 2xx mean success; 3xx mean "look elsewhere"; 4xx mean "client's fault"; 5xx mean "server's fault." The exact code tells you what to do.
  • The response has a body (the data). For APIs, the body is usually JSON.
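
The status-code ranges above can be sketched as a small lookup. This is an illustration only; the function name status_class and its labels are made up for this example and are not part of requests.

```python
def status_class(code):
    """Map an HTTP status code to the broad class described above."""
    if 200 <= code < 300:
        return 'success'
    if 300 <= code < 400:
        return 'redirect'
    if 400 <= code < 500:
        return 'client error'
    if 500 <= code < 600:
        return 'server error'
    return 'unknown'  # 1xx informational codes and anything out of range


for code in (200, 204, 301, 404, 429, 503):
    print(code, status_class(code))
```

The class tells you who has to act; the exact code (404 versus 429, for example) tells you what to do.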

Section 1.2: requests.get

  • The basic pattern:
    import requests
    response = requests.get('https://api.example.com/data', timeout=10)
    if response.status_code == 200:
        data = response.json()
        print(data)
    else:
        print(f'request failed: {response.status_code}')
    
  • timeout=10 is critical: without it, a hanging server hangs your program forever. ALWAYS pass a timeout.
  • response.json() parses the body as JSON; it raises requests.exceptions.JSONDecodeError (a subclass of ValueError) if the body is not valid JSON.
  • response.text gives the body as a string; response.content gives bytes.

Section 1.3: Query parameters

  • Many APIs take parameters in the URL:
    https://api.example.com/weather?city=Madison&units=metric
    
  • Two ways to construct this in Python:
    • Manual: requests.get(f'https://api.example.com/weather?city={city}&units=metric')
    • Better: requests.get('https://api.example.com/weather', params={'city': city, 'units': 'metric'})
  • The second is safer: requests handles URL-encoding (spaces become + or %20, and characters such as & and = inside a value are escaped). Manual concatenation is a source of bugs and security issues.
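
To see why manual concatenation breaks, here is a minimal sketch that uses the standard library's urlencode to illustrate the kind of escaping the params= form performs. The city value is made up, api.example.com is the placeholder host from these notes, and the exact string requests produces may differ in small details.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

BASE_URL = 'https://api.example.com/weather'  # placeholder host from the notes
city = 'Fond du Lac & area'                    # made-up value with spaces and '&'

# Manual concatenation: the raw '&' ends the city value early.
manual_url = f'{BASE_URL}?city={city}&units=metric'

# Encoded form: reserved characters inside the value are escaped.
safe_url = f"{BASE_URL}?{urlencode({'city': city, 'units': 'metric'})}"

# Parse both query strings the way a server would.
manual_query = parse_qs(urlsplit(manual_url).query)
safe_query = parse_qs(urlsplit(safe_url).query)

print(safe_url)
print('manual city:', manual_query['city'])
print('safe city:  ', safe_query['city'])
```

Only the encoded form round-trips the original city value; the manual form silently sends a different city.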

Section 1.4: Headers and auth

  • Some APIs require headers (User-Agent, Authorization):
    headers = {
        'User-Agent': 'fnd-102-weather-cli/0.1',
        'Authorization': f'Bearer {api_key}'
    }
    response = requests.get(url, headers=headers, timeout=10)
    
  • User-Agent identifies your tool to the server. Some APIs reject empty or default User-Agents. Always set one.
  • Authorization: Bearer <token> is the standard OAuth 2.0 pattern; many APIs use it. Other APIs take the key as a query parameter (?api_key=...); this is older and less secure because URLs tend to end up in logs and shell history.

Session 2: Failure handling and secrets

Section 2.1: The failure modes

  • Network-layer failures:
    • Connection error: DNS failed, network unreachable, server refused connection. requests.exceptions.ConnectionError.
    • Timeout: server is slow or stalled. requests.exceptions.Timeout. (Requires you to set timeout=....)
  • HTTP-level failures:
    • 4xx response: client error. Check status code; act accordingly.
    • 5xx response: server error. Retry MAY help (transient); often does not.
  • Application-level failures:
    • Malformed JSON: server returned 200 but the body is not valid JSON. JSONDecodeError.
    • Missing field: JSON parsed, but the key you expected is absent. KeyError.
  • These six failure modes, spread across three layers, need different responses. A robust client distinguishes them.
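
The missing-field case listed above can be sketched as follows. The payload shape ('current' containing 'temperature') is hypothetical and chosen only for illustration; a real API will have its own field names.

```python
def extract_temperature(payload):
    """Return the temperature from a parsed response, or None if the shape is wrong."""
    try:
        return payload['current']['temperature']
    except (KeyError, TypeError) as exc:
        # KeyError: an expected key is absent.
        # TypeError: an intermediate value is not a dict (for example None or a list).
        print(f'Error: unexpected response shape: {exc!r}')
        return None


good = {'current': {'temperature': 3.5}}  # hypothetical well-formed payload
bad = {'error': 'unknown city'}           # hypothetical error payload

print(extract_temperature(good))
print(extract_temperature(bad))
```

Catching the error at the point of access keeps the message specific: the request succeeded and the JSON parsed, but the data was not what the tool expected.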

Section 2.2: The try/except pattern

import requests
from requests.exceptions import ConnectionError, Timeout, RequestException

def fetch_weather(city):
    try:
        response = requests.get(
            'https://api.example.com/weather',
            params={'city': city},
            timeout=10
        )
    except Timeout:
        print('Error: weather API timed out (10s). Try again later.')
        return None
    except ConnectionError:
        print('Error: cannot reach weather API. Check your internet.')
        return None
    except RequestException as e:
        print(f'Error: unexpected request error: {e}')
        return None

    if response.status_code == 404:
        print(f'Error: city {city!r} not found.')
        return None
    if response.status_code == 429:
        print('Error: weather API rate-limited. Try again later.')
        return None
    if response.status_code != 200:
        print(f'Error: weather API returned {response.status_code}: {response.text[:200]}')
        return None

    try:
        return response.json()
    except ValueError:
        print(f'Error: weather API returned non-JSON: {response.text[:200]}')
        return None

This is verbose for a reason: each error becomes a user-readable message instead of a stack trace. For a CLI tool, that is the right trade-off.
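
One way a CLI entry point might build on a function that returns None on failure is to turn that result into a process exit code, so shell scripts can tell success from failure. This is a sketch under assumptions: main and fake_fetch are illustrative names, and fake_fetch stands in for fetch_weather so the example runs without a network.

```python
import argparse


def main(argv, fetch):
    """Parse arguments, call `fetch`, and return a process exit code."""
    parser = argparse.ArgumentParser(description='Print the weather for a city.')
    parser.add_argument('city')
    args = parser.parse_args(argv)

    data = fetch(args.city)
    if data is None:
        # fetch already printed a readable message; signal failure to the shell.
        return 1
    print(data)
    return 0


def fake_fetch(city):
    """Stand-in for fetch_weather: knows one city, fails for all others."""
    return {'city': city, 'temp_c': 3.5} if city == 'Madison' else None


print('exit code:', main(['Madison'], fake_fetch))
print('exit code:', main(['Atlantis'], fake_fetch))
```

In a real script you would pass the result to sys.exit, for example sys.exit(main(sys.argv[1:], fetch_weather)).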

Section 2.3: Retries and rate limits

  • Some failures are transient: a single 503 might succeed on retry.
  • Naive retry:
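    import time
    import requests
    from requests.exceptions import ConnectionError, Timeout

    def fetch_with_retry(url, attempts=3):
        for attempt in range(attempts):
            try:
                response = requests.get(url, timeout=10)
                if response.status_code == 200:
                    return response.json()
            except (ConnectionError, Timeout):
                pass  # transient network failure: fall through and retry
            if attempt < attempts - 1:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, then 2s
        return None  # every attempt failed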
    
  • A library handles this better: urllib3's Retry class mounted on a requests Session through HTTPAdapter, or the higher-level tenacity package.
  • Rate limits (429) deserve a longer wait. Some APIs include a Retry-After header telling you how long to wait.
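
A minimal sketch of choosing the wait time, assuming the server sends Retry-After as a whole number of seconds. The header may also carry an HTTP date, which this sketch ignores and treats like a missing header. The function name retry_delay and the 60-second cap are illustrative choices.

```python
def retry_delay(status_code, headers, attempt, cap=60):
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    retry_after = headers.get('Retry-After', '')
    if status_code in (429, 503) and retry_after.isdigit():
        # The server stated a wait in whole seconds; honour it, up to `cap`.
        return min(int(retry_after), cap)
    # Otherwise fall back to exponential backoff: 1s, 2s, 4s, ...
    return min(2 ** attempt, cap)


print(retry_delay(429, {'Retry-After': '30'}, attempt=0))
print(retry_delay(503, {}, attempt=2))
```

With requests you would pass response.status_code and response.headers. The plain dicts used here are case-sensitive, whereas response.headers matches header names case-insensitively.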

Section 2.4: Secrets management

  • API keys are credentials. If you commit them to GitHub, scrapers find them within minutes; your account may be drained or banned.
  • The standard pattern:
    import os
    api_key = os.environ['WEATHER_API_KEY']  # raises if not set
    
  • Store keys in environment variables, not in source code. Tools like direnv, .env files, or a secrets manager handle the loading.
  • The python-dotenv package (third-party) reads a .env file:
    # .env
    WEATHER_API_KEY=your-key-here
    
    from dotenv import load_dotenv
    load_dotenv()  # reads .env and populates os.environ
    api_key = os.environ['WEATHER_API_KEY']
    
  • Always add .env to your .gitignore so the key never gets committed.
  • For Lab 12, the chosen API is key-free (Open-Meteo, https://open-meteo.com/) so you do not have to set this up. The discipline still applies for your capstone if it uses any keyed API.
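
os.environ['WEATHER_API_KEY'] raises a bare KeyError when the variable is unset, which is not a helpful message for someone running your tool. One possible wrapper is sketched below; the function name load_api_key and the wording of the message are illustrative.

```python
import os


def load_api_key(name='WEATHER_API_KEY'):
    """Return the key stored in environment variable `name`, or exit with a hint."""
    key = os.environ.get(name, '').strip()
    if not key:
        # SystemExit with a string prints the message and exits with status 1.
        raise SystemExit(f'Error: set {name} in your environment or .env file.')
    return key
```

Call load_api_key() once at startup, after load_dotenv() if you use a .env file, so a missing key is reported before any request is made.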

Section 2.5: Choosing the right API for the lab

  • Free, no-key weather APIs (good for FND-102):
    • wttr.in at https://wttr.in/: text-based forecast service. Try curl 'wttr.in/Madison?format=j1' to see the JSON (the quotes stop shells such as zsh from treating ? as a wildcard).
    • Open-Meteo at https://open-meteo.com/: JSON forecast, no key, generous free tier. Takes lat/lon (not city name; use a geocoder first if your tool wants city name).
  • Lab 12 uses Open-Meteo because the JSON shape is cleaner and the API is more stable.
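
As a starting point, here is a sketch of the request parameters and response handling for Open-Meteo. The endpoint URL, parameter names, and response shape are assumptions based on the public Open-Meteo forecast API; check them against the current documentation before relying on them. The sample payload is hand-written, not real data, and the helper names are illustrative.

```python
OPEN_METEO_URL = 'https://api.open-meteo.com/v1/forecast'  # assumed endpoint


def build_params(lat, lon):
    """Query parameters for a daily forecast (names assumed from the docs)."""
    return {
        'latitude': lat,
        'longitude': lon,
        'daily': 'temperature_2m_max,temperature_2m_min',
        'timezone': 'auto',
    }


def format_today(payload):
    """Turn the assumed 'daily' block of the response into one line of text."""
    daily = payload['daily']
    return (f"{daily['time'][0]}: "
            f"high {daily['temperature_2m_max'][0]}, "
            f"low {daily['temperature_2m_min'][0]}")


# Hand-written sample in the assumed response shape, not real data.
sample = {
    'daily': {
        'time': ['2024-01-15'],
        'temperature_2m_max': [-2.1],
        'temperature_2m_min': [-9.4],
    }
}
print(format_today(sample))
```

With requests, the call would look like requests.get(OPEN_METEO_URL, params=build_params(43.07, -89.40), timeout=10), followed by the status and JSON checks from Section 2.2.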

Labs (~90 minutes)

Lab 12: Weather-Report CLI (labs/lab-12-weather-cli.md)

  • Goal: build a CLI tool that takes a city (or lat/lon); calls Open-Meteo; prints today's forecast
  • Time: ~90 minutes
  • Artifact: lab-12-weather.py in ~/fnd-102/lab-12/, committed to Git

Independent practice (~4 hours)

  1. requests warm-up (30 min). In the REPL:

    import requests
    r = requests.get('https://httpbin.org/get', params={'a': 1, 'b': 'hello world'}, timeout=10)
    print(r.status_code)
    print(r.json())
    

    httpbin.org is a free testing service that echoes your request. Notice how params got URL-encoded (spaces became %20 or +).

  2. Failure-mode exploration (45 min). Trigger each failure mode deliberately:

    • Wrong URL host: requests.get('https://this-domain-does-not-exist-12345.com', timeout=5) → ConnectionError
    • Timeout: requests.get('https://httpbin.org/delay/15', timeout=2) → Timeout
    • 404: requests.get('https://httpbin.org/status/404', timeout=5) → no exception; response.status_code is 404. requests only raises for 4xx/5xx if you call response.raise_for_status().
    • 500: requests.get('https://httpbin.org/status/500', timeout=5) → no exception; response.status_code is 500.
    • Malformed JSON: requests.get('https://example.com', timeout=5).json() → JSONDecodeError (the page is HTML)

    Practice the try/except pattern for each.

  3. Read an API doc (30 min). Pick any API at https://github.com/public-apis/public-apis (no-auth section). Read the docs. Construct a requests.get call. Print the response.

  4. Build a multi-call tool (60 min). Write a script that fetches 5 GitHub user profiles via https://api.github.com/users/USERNAME and reports their public-repo counts. Use a list of 5 known usernames (octocat, torvalds, dhh, etc.). GitHub rate-limits unauthenticated requests; check the X-RateLimit-Remaining response header and stay within the limit.

  5. Use a session (30 min). Use requests.Session() for multiple requests to the same host; it reuses the underlying TCP connection and persists cookies across requests. This is useful when you call several endpoints of one API. Lab 12 does not require it, but it is useful for the capstone.

  6. .env setup (15 min). Install python-dotenv (pip install python-dotenv). Create a .env file with MY_KEY=hello. Read it with load_dotenv() and os.environ['MY_KEY']. Add .env to .gitignore. This is the muscle memory for any keyed API.

Reflection prompts (~30 minutes)

  1. Your Lab 12 catches ConnectionError, Timeout, 404, and JSON-parse errors. Is there a failure mode you deliberately chose NOT to catch? What is your reasoning?
  2. The lab uses a key-free API. If you had to add a keyed API to your capstone, what discipline would you use to keep the key out of Git?
  3. requests.get(url, params=...) vs string concatenation: the params form handles URL-encoding correctly. Did you ever write the concatenated form? What edge case would it break on?
  4. Retries: when does a retry help? When does it hurt? (Hint: 503 might be transient; 400 won't be; 429 needs a longer wait.)
  5. One thing from this week you want to know more about?

Tool journal (week 12)

  • requests library: requests.get, requests.post, etc.
  • response.status_code, response.json(), response.text: response inspection
  • timeout= argument: always pass one; requests has no default timeout
  • params= and headers= arguments: query string and HTTP headers
  • Exception types: Timeout, ConnectionError, RequestException
  • requests.Session(): connection reuse and cookies
  • os.environ: read environment variables
  • python-dotenv: load .env files
  • HTTP status codes: 2xx success / 3xx redirect / 4xx client error / 5xx server error
  • httpbin.org: free testing service for HTTP edge cases

What comes next

Week 13 introduces pytest and writing READMEs. You take one of your prior labs (Lab 6, 9, 11, or 12 are good candidates) and add three or more tests plus a README that lets a stranger run your tool. This is the discipline that distinguishes a shippable tool from a one-off script.