Lab 7: SIGINT Discipline -- Unknown Signal Classification Workflow

Week: 9 -- SIGINT Techniques
Points: 25
Time estimate: 90 min lab + 3 hr independent
Deliverable: lab-7-report.md + SIGINT hypothesis document


Objectives

  1. Execute the full five-stage SIGINT classification pipeline against an instructor-supplied unknown IQ capture.
  2. Produce a confidence-assessed SIGINT hypothesis document with evidence at each stage.
  3. Implement a simple modulation classifier using the feature extraction functions from the Week 7 lecture.
  4. Write one Suricata-equivalent signal detection signature for the classified signal.

The Unknown Target

The instructor provides unknown_signal.cf32: an IQ file captured at 2.4 MSPS. You do not know the modulation, symbol rate, exact signal frequency within the capture, or protocol. Your task is to find out.

Metadata provided: Center frequency 433.5 MHz, sample rate 2.4 MSPS, capture duration 30 seconds, captured in an urban ISM-band environment. This is all the information you have.
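
Before starting Stage 1, it can help to confirm that the file on disk is consistent with this metadata. The following is a minimal sketch, assuming the capture is stored as raw interleaved float32 I/Q (the usual meaning of the .cf32 extension, 8 bytes per complex sample). The helper name `expected_cf32_size` is hypothetical.

```python
import os

BYTES_PER_SAMPLE = 8  # complex64 = float32 I + float32 Q (assumed .cf32 layout)

def expected_cf32_size(sample_rate_hz, duration_s):
    """Return (sample count, byte count) implied by the capture metadata."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return n_samples, n_samples * BYTES_PER_SAMPLE

n_expected, bytes_expected = expected_cf32_size(2.4e6, 30)
print(f"Expected samples: {n_expected:,}")
print(f"Expected size:    {bytes_expected / 1e6:.0f} MB")

# Compare against the real file if it is present in the working directory
path = "unknown_signal.cf32"
if os.path.exists(path):
    actual = os.path.getsize(path)
    print(f"Actual size:      {actual / 1e6:.0f} MB "
          f"({actual / BYTES_PER_SAMPLE / 2.4e6:.2f} s at 2.4 MSPS)")
```

A large mismatch would suggest a different sample format (for example 16-bit integers) or a truncated capture, either of which is worth resolving before interpreting any plot.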


Stage 1: Spectrum Survey (15 min)

#!/usr/bin/env python3
"""Lab 7 Stage 1: Spectrum survey."""
import os
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import spectrogram

os.makedirs('lab7', exist_ok=True)  # figures below are saved into lab7/

# Load the unknown signal
iq = np.fromfile('unknown_signal.cf32', dtype=np.complex64)
fs = 2.4e6  # sample rate
fc = 433.5e6  # center frequency

print(f"Signal duration: {len(iq)/fs:.1f} seconds")
print(f"Total samples: {len(iq):,}")

# 1. Power spectral density
NFFT = 8192
S = 20*np.log10(np.abs(np.fft.fftshift(np.fft.fft(iq[:NFFT], NFFT)))/NFFT + 1e-10)
f_axis = np.fft.fftshift(np.fft.fftfreq(NFFT, 1/fs)) / 1e3  # kHz

plt.figure(figsize=(12, 10))
plt.subplot(2, 2, 1)
plt.plot(f_axis, S)
plt.xlabel('Frequency (kHz from center)')
plt.ylabel('Power (dBFS)')
plt.title('PSD of Unknown Signal')
plt.grid(True)

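# 2. Wideband spectrogram (time-frequency map)
# The full 30 s capture would produce over a million time slices, so survey the
# first 2 s here; move this window to inspect other parts of the capture.
# scaling='spectrum' gives power per bin, so the color scale reads as dBFS.
n_sg = min(int(2 * fs), len(iq))
f_sg, t_sg, Sxx = spectrogram(iq[:n_sg], fs=fs, nperseg=512, noverlap=256,
                              scaling='spectrum', return_onesided=False)
f_sg_shift = np.fft.fftshift(f_sg) / 1e3  # kHz
Sxx_shift = np.fft.fftshift(Sxx, axes=0)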

plt.subplot(2, 2, 2)
plt.pcolormesh(t_sg, f_sg_shift, 10*np.log10(Sxx_shift + 1e-10),
               cmap='viridis', vmin=-60, vmax=0, shading='auto')
plt.colorbar(label='dBFS')
plt.xlabel('Time (s)')
plt.ylabel('Frequency (kHz)')
plt.title('Spectrogram (time-frequency map)')

# 3. Amplitude time series
plt.subplot(2, 2, 3)
n_amp = min(int(10e-3 * fs), len(iq))  # 10 ms is 24,000 samples at 2.4 MSPS
plt.plot(np.arange(n_amp)/fs*1e3, np.abs(iq[:n_amp]))
plt.xlabel('Time (ms)')
plt.ylabel('Amplitude')
plt.title('Amplitude (first 10 ms)')
plt.grid(True)

# 4. Instantaneous frequency
phase = np.unwrap(np.angle(iq[:10000]))
inst_freq = np.diff(phase) * fs / (2 * np.pi) / 1e3  # kHz
plt.subplot(2, 2, 4)
plt.plot(np.arange(len(inst_freq))/fs*1e3, inst_freq)
plt.xlabel('Time (ms)')
plt.ylabel('Instantaneous frequency (kHz)')
plt.title('Instantaneous Frequency')
plt.grid(True)

plt.tight_layout()
plt.savefig('lab7/stage1_survey.png', dpi=150)

# Record observations
print("\nStage 1 Observations (fill in your findings):")
print("Signal bandwidth estimate: ___ kHz")
print("Duty cycle estimate: ___% (continuous / burst)")
print("Approximate SNR: ___ dB (visual estimate from PSD)")
print("Frequency stability: stable / drifting / hopping")

Document for Stage 1:

  • Signal center frequency (kHz offset from capture center): ___
  • Signal bandwidth (kHz): ___
  • Duty cycle (%; continuous / burst): ___
  • Estimated SNR (dB): ___
  • Frequency behavior: stable / drifting / hopping
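
The Stage 1 fields above are easier to defend when backed by numbers rather than by eye. The following is a rough sketch of one way to estimate them. It assumes the signal occupies less than half of the captured band (so the median PSD bin approximates the noise floor), that bursts sit well above 3 dB SNR, and that the signal is present for more than 1% of the analyzed span. The function name `survey_metrics` and its parameters are hypothetical, not part of the lab code.

```python
import numpy as np
from scipy.signal import welch

def survey_metrics(iq, fs, nperseg=4096, power_fraction=0.99, smooth=256):
    """Rough Stage 1 numbers: offset, occupied bandwidth, SNR, duty cycle."""
    # Averaged two-sided PSD; far less noisy than a single FFT
    f, pxx = welch(iq, fs=fs, nperseg=nperseg, return_onesided=False)
    f, pxx = np.fft.fftshift(f), np.fft.fftshift(pxx)

    # Noise floor = median bin (assumes the signal fills under half the band)
    noise = np.median(pxx)
    excess = np.clip(pxx - noise, 0, None)

    # Occupied bandwidth: span containing `power_fraction` of above-noise power
    cdf = np.cumsum(excess) / (np.sum(excess) + 1e-30)
    tail = (1 - power_fraction) / 2
    i_lo = min(np.searchsorted(cdf, tail), f.size - 1)
    i_hi = min(np.searchsorted(cdf, 1 - tail), f.size - 1)

    # Duty cycle: fraction of time the smoothed power is within 3 dB of its peak
    power = np.convolve(np.abs(iq) ** 2, np.ones(smooth) / smooth, mode="same")
    on = power > 0.5 * np.percentile(power, 99)

    return {
        "offset_khz": (f[i_lo] + f[i_hi]) / 2 / 1e3,
        "bandwidth_khz": (f[i_hi] - f[i_lo]) / 1e3,
        "snr_db": 10 * np.log10(pxx.max() / noise),
        "duty_cycle_pct": 100 * np.mean(on),
    }

# Self-check on a synthetic capture: a 50 kHz tone that is on 25% of the time
rng = np.random.default_rng(0)
fs_demo, n = 1e6, 200_000
t = np.arange(n) / fs_demo
x = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
x[: n // 4] += np.exp(2j * np.pi * 50e3 * t[: n // 4])
metrics = survey_metrics(x, fs_demo)
print({k: round(float(v), 1) for k, v in metrics.items()})
```

On the lab capture, passing a slice of a few seconds (for example `iq[:int(2 * fs)]`) keeps the run time reasonable. The SNR figure here is peak PSD over noise floor, so report it as such in your observations.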

Stage 2: Modulation Classification (20 min)

#!/usr/bin/env python3
"""Lab 7 Stage 2: Modulation classification."""
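import os
import numpy as np
import matplotlib.pyplot as plt

# Reload the capture so this script also runs on its own
iq = np.fromfile('unknown_signal.cf32', dtype=np.complex64)
fs = 2.4e6  # sample rate
os.makedirs('lab7', exist_ok=True)  # figures are saved into lab7/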

def analyze_signal(iq, fs):
    """Extract modulation classification features."""
    amplitude = np.abs(iq)
    phase = np.unwrap(np.angle(iq))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)
    
    amp_variance_norm = np.var(amplitude) / (np.mean(amplitude)**2 + 1e-10)
    freq_variance = np.var(inst_freq)
    # Use the unwrapped phase so ±2π wrap jumps do not inflate the spread
    phase_std = np.std(np.diff(phase))
    
    print(f"Modulation features:")
    print(f"  Normalized amplitude variance: {amp_variance_norm:.4f}")
    print(f"  Instantaneous freq variance: {freq_variance/1e6:.2f} (×10⁶ Hz²)")
    print(f"  Phase step std: {phase_std:.3f} rad")
    
    if amp_variance_norm < 0.02 and freq_variance > 1e8:
        modulation_guess = "FSK (constant amplitude, large freq variation)"
    elif amp_variance_norm < 0.02 and freq_variance < 1e6:
        modulation_guess = "PSK (constant amplitude, low freq variation)"
    elif amp_variance_norm > 0.2:
        modulation_guess = "AM/ASK or QAM (amplitude variation)"
    elif freq_variance > 1e7 and amp_variance_norm > 0.05:
        modulation_guess = "CSS/LoRa (chirp - freq varies monotonically)"
    else:
        modulation_guess = "Unknown - further analysis needed"
    
    print(f"\n→ Modulation hypothesis: {modulation_guess}")
    return amp_variance_norm, freq_variance, phase_std

a_var, f_var, p_std = analyze_signal(iq, fs)

# Raw IQ scatter. This only resembles a constellation for PSK/QAM after
# resampling to symbol timing, which requires the Stage 3 symbol rate estimate.
plt.figure(figsize=(12, 4))

plt.subplot(1, 3, 1)
n_plot = min(5000, len(iq))
plt.scatter(iq[:n_plot].real, iq[:n_plot].imag, s=1, alpha=0.3)
plt.xlabel('I')
plt.ylabel('Q')
plt.title('Constellation Diagram')
plt.axis('equal')

plt.subplot(1, 3, 2)
inst_freq_plot = np.diff(np.unwrap(np.angle(iq[:min(50000, len(iq))]))) * fs / (2*np.pi) / 1e3
plt.plot(inst_freq_plot[:2000])
plt.xlabel('Sample')
plt.ylabel('Instantaneous freq (kHz)')
plt.title('Inst. Frequency (2000 samples)')
plt.grid(True)

plt.subplot(1, 3, 3)
plt.hist(np.abs(iq[:50000]), bins=100, density=True)
plt.xlabel('Amplitude')
plt.ylabel('Density')
plt.title('Amplitude Distribution')
plt.grid(True)

plt.tight_layout()
plt.savefig('lab7/stage2_modulation.png', dpi=150)
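
Before trusting the thresholds in analyze_signal on the unknown capture, it can be useful to see what the same features look like on a signal whose modulation you control. The following is a minimal sketch that synthesizes noise-free continuous-phase 2-FSK and computes the two features the heuristic branches on. The generator `make_fsk`, the 25 kHz deviation, and the 100 samples per symbol are illustrative assumptions, not properties of the lab signal.

```python
import numpy as np

def make_fsk(bits, sps, deviation_hz, fs):
    """Continuous-phase 2-FSK at complex baseband (noise-free)."""
    freq = np.repeat(np.where(np.asarray(bits) == 1, deviation_hz, -deviation_hz), sps)
    phase = 2 * np.pi * np.cumsum(freq) / fs
    return np.exp(1j * phase).astype(np.complex64)

def features(iq, fs):
    """The two features the Stage 2 heuristic thresholds on."""
    amplitude = np.abs(iq)
    # Phase step between adjacent samples, already wrapped to (-pi, pi]
    inst_freq = np.angle(iq[1:] * np.conj(iq[:-1])) * fs / (2 * np.pi)
    amp_var_norm = np.var(amplitude) / (np.mean(amplitude) ** 2 + 1e-10)
    return amp_var_norm, np.var(inst_freq)

fs = 2.4e6
rng = np.random.default_rng(7)
bits = rng.integers(0, 2, 2000)
clean = make_fsk(bits, sps=100, deviation_hz=25e3, fs=fs)

a_var, f_var = features(clean, fs)
print(f"Normalized amplitude variance: {a_var:.6f}")
print(f"Instantaneous freq variance:   {f_var:.3e} Hz^2")
print("Lands in FSK branch:", a_var < 0.02 and f_var > 1e8)
```

For balanced bits the frequency variance sits close to the square of the deviation, which gives a feel for what the 1e8 threshold implies. Adding noise to `clean`, or gating it into bursts, and re-running is a quick way to see how far the thresholds can be trusted.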

Stage 3: Symbol Rate and Frame Structure (20 min)

#!/usr/bin/env python3
"""Lab 7 Stage 3: Symbol rate and frame structure."""
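import os
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import welch

# Reload the capture so this script also runs on its own
iq = np.fromfile('unknown_signal.cf32', dtype=np.complex64)
fs = 2.4e6  # sample rate
os.makedirs('lab7', exist_ok=True)  # figures are saved into lab7/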

# Symbol rate via envelope power spectrum
power = np.abs(iq)**2
f_welch, Pxx = welch(power, fs=fs, nperseg=65536, noverlap=32768)

# Find spectral peaks (symbol rate candidates)
sym_rate_est = None
noise_floor = np.median(Pxx)
# Ignore DC and slow burst-envelope components below 500 Hz
candidate_mask = (Pxx > 5 * noise_floor) & (f_welch > 500)
if np.any(candidate_mask):
    # Take the strongest candidate rather than the lowest-frequency one
    best = np.argmax(np.where(candidate_mask, Pxx, 0))
    sym_rate_est = f_welch[best]
    print(f"Symbol rate estimate: {sym_rate_est/1e3:.2f} kBaud")
    sps = fs / sym_rate_est  # samples per symbol
    print(f"Samples per symbol: {sps:.1f}")
else:
    print("No symbol rate candidate found -- inspect the envelope PSD manually")

plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.semilogy(f_welch/1e3, Pxx)
plt.xlabel('Frequency (kHz)')
plt.ylabel('Power spectral density of |x|²')
plt.title('Symbol Rate Detection (PSD of envelope)')
plt.grid(True)
# Mark the candidate only if one was actually found
rate_marker = globals().get('sym_rate_est')
if rate_marker:
    plt.axvline(rate_marker/1e3, color='r', linestyle='--',
                label='Symbol rate candidate')
    plt.legend()

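# Sliding correlation of the first 1000 envelope samples against the first
# ~42 ms of the capture; repeated peaks indicate the frame period.
# Frame periods longer than ~41 ms need a larger n_auto.
plt.subplot(1, 2, 2)
n_auto = min(100000, len(iq))
env = np.abs(iq[:n_auto])
env = env - env.mean()  # remove the DC offset so repeated peaks stand out
acorr = np.correlate(env, env[:1000], mode='valid')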
plt.plot(np.arange(len(acorr))/fs*1e3, np.abs(acorr))
plt.xlabel('Lag (ms)')
plt.ylabel('Autocorrelation amplitude')
plt.title('Autocorrelation (frame period detection)')
plt.grid(True)

plt.tight_layout()
plt.savefig('lab7/stage3_symbol_rate.png', dpi=150)
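
The envelope spectrum relies on amplitude structure, which a constant-envelope signal may not have. If your Stage 2 hypothesis is FSK, one possible cross-check is to slice the instantaneous frequency into high/low and measure run lengths: the shortest stable runs should last about one symbol. The following is a sketch under that assumption, shown on a synthetic signal. The function names are hypothetical, and a noisy capture would likely need low-pass filtering of the instantaneous frequency first.

```python
import numpy as np

def run_lengths(binary):
    """Lengths of consecutive runs of identical values in a 0/1 array."""
    binary = np.asarray(binary, dtype=int)
    change = np.flatnonzero(np.diff(binary)) + 1
    edges = np.concatenate(([0], change, [binary.size]))
    return np.diff(edges)

def estimate_sps_from_runs(iq, min_run=4):
    """Estimate samples per symbol from the shortest stable frequency runs."""
    inst = np.angle(iq[1:] * np.conj(iq[:-1]))  # phase step per sample (rad)
    # Slice midway between the low and high frequency clusters
    threshold = 0.5 * (np.percentile(inst, 10) + np.percentile(inst, 90))
    runs = run_lengths(inst > threshold)[1:-1]  # drop partial runs at both ends
    runs = runs[runs >= min_run]                # ignore very short noise glitches
    if runs.size == 0:
        return None
    shortest = np.percentile(runs, 10)
    single = runs[runs < 1.5 * shortest]        # runs that span one symbol
    return float(np.mean(single))

# Synthetic 2-FSK with a fixed bit pattern at 50 samples per symbol
fs = 2.4e6
pattern = np.tile([1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0], 20)
step = np.repeat(np.where(pattern == 1, 0.1, -0.1), 50)  # rad/sample
demo = np.exp(1j * np.cumsum(step))

sps_est = estimate_sps_from_runs(demo)
print(f"sps = {sps_est:.1f}, symbol rate = {fs / sps_est / 1e3:.1f} kBaud")
```

If this estimate and the envelope-spectrum estimate disagree, record both in the hypothesis document and lower the confidence level for the symbol rate accordingly.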

Stage 4: Preamble and Sync Word Identification (15 min)

#!/usr/bin/env python3
"""Lab 7 Stage 4: Preamble and sync word search."""
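import numpy as np

# Reload the capture so this script also runs on its own
iq = np.fromfile('unknown_signal.cf32', dtype=np.complex64)
fs = 2.4e6  # sample rate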

# After estimating the symbol rate, demodulate to bits
# This example assumes FSK-like signal; adapt to your hypothesis
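def demod_to_bits(iq, fs, sym_rate, n_symbols=1000):
    """Simple FM demodulation + decision to bits.

    Assumes the first symbol starts at sample 0; there is no timing recovery.
    """
    sps = fs / sym_rate  # keep fractional so sampling points do not drift
    n_samples = int(np.ceil(n_symbols * sps)) + 1
    phase = np.unwrap(np.angle(iq[:n_samples]))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)

    # Downsample: take one sample per symbol at the center
    centers = (np.arange(n_symbols) * sps + sps / 2).astype(int)
    sym_freqs = inst_freq[centers[centers < len(inst_freq)]]

    # Threshold midway between the two frequency clusters; the median would
    # fall inside one cluster whenever 0s and 1s are not evenly balanced
    threshold = 0.5 * (np.percentile(sym_freqs, 10) + np.percentile(sym_freqs, 90))
    bits = (sym_freqs > threshold).astype(int)
    return bits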

# Attempt to find common ISM-band preambles
def search_preamble(bits, patterns):
    """Search for known preamble patterns."""
    found = []
    for name, pattern in patterns.items():
        p = np.array(pattern, dtype=int)
        # +1 so a pattern ending on the final bit is still checked
        for start in range(len(bits) - len(p) + 1):
            if np.all(bits[start:start+len(p)] == p):
                found.append((start, name))
    return found

common_preambles = {
    '0xAA_run':   [1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0],  # alternating (common)
    '0x55_run':   [0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1],
    '0xD3_0x91': [1,1,0,1,0,0,1,1, 1,0,0,1,0,0,0,1],   # common ISM sync
    'all_ones':   [1,1,1,1,1,1,1,1],
}

# Use sym_rate_est from Stage 3 if it exists in this session;
# otherwise fall back to a 4800 Baud placeholder
sym_rate = globals().get('sym_rate_est') or 4800
try:
    bits = demod_to_bits(iq, fs, sym_rate)
    print(f"First 64 recovered bits: {''.join(map(str, bits[:64]))}")
    
    found = search_preamble(bits, common_preambles)
    if found:
        for loc, name in found[:5]:
            print(f"  Preamble candidate '{name}' at bit {loc}")
    else:
        print("No common preamble found -- review modulation hypothesis")
except Exception as e:
    print(f"Demodulation error: {e}")
    print("Check symbol rate estimate and modulation hypothesis first")
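
An FSK demodulator cannot tell which tone is the 1 and which is the 0, so a recovered stream may be the bitwise inverse of what the transmitter sent. It can therefore be worth searching both polarities and viewing the bits as hex bytes, where sync words are easier to recognize. The following is a small sketch. The helper names and the example stream (preamble 0xAA 0xAA, sync 0xD3 0x91, one payload byte 0x42) are made up for illustration and say nothing about the lab signal.

```python
import numpy as np

def bits_to_hex(bits, offset=0):
    """Pack bits (MSB first) into bytes starting at `offset`; drop any tail."""
    bits = np.asarray(bits, dtype=np.uint8)[offset:]
    usable = bits[: (bits.size // 8) * 8]
    return np.packbits(usable).tobytes().hex()

def find_pattern(bits, pattern):
    """Return every start index where `pattern` occurs in `bits`."""
    bits = np.asarray(bits, dtype=np.uint8)
    pattern = np.asarray(pattern, dtype=np.uint8)
    if bits.size < pattern.size:
        return []
    windows = np.lib.stride_tricks.sliding_window_view(bits, pattern.size)
    return np.flatnonzero((windows == pattern).all(axis=1)).tolist()

sync = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1]  # 0xD3 0x91

# Hypothetical transmitted stream, and what a polarity-swapped demod would output
stream = np.array([1, 0] * 8 + sync + [0, 1, 0, 0, 0, 0, 1, 0], dtype=np.uint8)
inverted = 1 - stream

for label, candidate in (("as demodulated", inverted), ("inverted", 1 - inverted)):
    hits = find_pattern(candidate, sync)
    print(f"{label:15s} hex={bits_to_hex(candidate)} sync at bits {hits}")
```

Varying `offset` from 0 to 7 in `bits_to_hex` covers the case where the byte boundary does not fall on the first recovered bit.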

Stage 5: SIGINT Hypothesis Document (20 min)

Complete the following hypothesis template and paste it into your lab report:

# SIGINT Hypothesis Document
## Target: unknown_signal.cf32

**Classification date:** YYYY-MM-DD  
**Analyst:** [student name]  
**Capture metadata:** 433.5 MHz center, 2.4 MSPS, 30 sec, urban ISM band

---

### Stage 1: Spectrum Survey
- Signal bandwidth: ___ kHz [CONFIDENCE: CONFIRMED / INFERRED / HYPOTHESIZED]
- Duty cycle: ___% [CONFIDENCE: ...]
- SNR estimate: ___ dB
- Frequency behavior: [stable / drifting / hopping]

### Stage 2: Modulation Classification
- Modulation: ___ [CONFIDENCE: ...]
- Evidence: [list 2-3 specific observations from constellation, inst. frequency, amplitude distribution]

### Stage 3: Symbol Rate
- Symbol rate: ___ Baud [CONFIDENCE: ...]
- Evidence: PSD peak at ___ kHz; sps = ___

### Stage 4: Frame Structure
- Preamble identified: [yes/no; if yes, pattern and location]
- Sync word: [if found]
- Frame period: ___ ms [CONFIDENCE: ...]

### Stage 5: Protocol Hypothesis
- Most likely protocol family: [ISM sub-GHz device (OOK/ASK), LoRa, FHSS 433, custom] 
- Evidence for hypothesis: [3+ specific observations]
- Evidence against hypothesis: [what doesn't fit]
- Residual unknowns: [what you couldn't determine]

### Limit of Confidence
- CONFIRMED claims: [list items you can verify bit-for-bit]
- INFERRED claims: [list items supported by evidence but not confirmed]
- HYPOTHESIZED claims: [list items that are educated guesses]

Suricata-Style Detection Rule

Write a detection specification for this signal (analogous to a Suricata rule, but for RF):

# RF Detection Rule (conceptual; adapt to SigMF format for production)
alert rf 433.5MHz ±50kHz (
    msg: "SIGINT Lab7 - Unknown ISM Device";
    modulation: [your modulation];
    symbol_rate: [symbol rate] Baud ± 5%;
    bandwidth: [BW] kHz;
    duty_cycle: [DC]% ± 10%;
    threshold: type both, count 3, seconds 60;
    sid: 7000001;
    rev: 1;
    confidence: INFERRED;
)
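
To check that the tolerances you choose are neither too loose nor too tight, it can help to evaluate the rule against a few measured observations. The following is a minimal sketch of how the parameter fields might be represented and tested in Python. The class `RfRule`, its field names, and every numeric value are placeholders rather than results for the lab signal. The "± 10%" on duty cycle is read here as percentage points, which is an interpretation, and the `threshold` clause is not modeled.

```python
from dataclasses import dataclass

@dataclass
class RfRule:
    """Hypothetical in-memory form of the conceptual rule above."""
    center_hz: float
    center_tol_hz: float
    modulation: str
    symbol_rate_baud: float
    symbol_rate_tol: float      # fractional, 0.05 means ±5%
    duty_cycle_pct: float
    duty_cycle_tol_pct: float   # percentage points (an interpretation of "± 10%")

    def check(self, obs):
        """Return a per-field pass/fail dict for one observation."""
        rate_err = abs(obs["symbol_rate_baud"] - self.symbol_rate_baud)
        return {
            "frequency": abs(obs["center_hz"] - self.center_hz) <= self.center_tol_hz,
            "modulation": obs["modulation"] == self.modulation,
            "symbol_rate": rate_err <= self.symbol_rate_tol * self.symbol_rate_baud,
            "duty_cycle": abs(obs["duty_cycle_pct"] - self.duty_cycle_pct)
                          <= self.duty_cycle_tol_pct,
        }

    def matches(self, obs):
        """True only if every field is within tolerance."""
        return all(self.check(obs).values())

# Placeholder values only; substitute your own measurements
rule = RfRule(433.5e6, 50e3, "2-FSK", 4800.0, 0.05, 12.0, 10.0)
obs = {"center_hz": 433.52e6, "modulation": "2-FSK",
       "symbol_rate_baud": 4790.0, "duty_cycle_pct": 15.0}

print(rule.check(obs))
print("Match:", rule.matches(obs))
```

Returning per-field results rather than a single boolean makes it easier to see which parameter causes a miss when you tune the rule.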

Lab Report

Create lab-7-report.md with:

  1. Stage 1-4 outputs (spectrum survey, modulation, and symbol rate figures, plus the Stage 4 preamble search output or a plot of the recovered bits)
  2. Completed SIGINT hypothesis document (from Stage 5 template)
  3. RF detection rule
  4. Analysis (150 words): "Your hypothesis at Stage 2 was based on the amplitude variance and instantaneous frequency variance. Describe one scenario where this heuristic would give a wrong classification, and explain what additional measurement you would take to resolve the ambiguity."

Grading

Component                                                                     Points
Stage 1-4 plots: all four produced with quantified observations               8
SIGINT hypothesis document: all fields complete; confidence levels present    10
RF detection rule: parameterized from measured values                         4
Analysis: failure case identified with remediation                            3
Total                                                                         25