Lab 2: OSINT Dossier · PEN-101 · Virtus Cyber Academy Classroom

Week 2 graded lab. Passive reconnaissance only. No active probing of the target.

Learning objectives

Collect publicly available information about a target domain without touching the target's infrastructure
Produce a structured OSINT dossier that informs the Week 3 active recon phase
Distinguish passive recon from active recon and apply the constraint correctly throughout

Target

The instructor-designated lab domain (provided at lab start). This is a domain registered and operated by the academy for this exercise. Do not collect OSINT on any other organization.

Authorization boundary

Authorized: Reading public information from WHOIS databases, CT logs (crt.sh), DNS resolvers that are not the target's authoritative server (8.8.8.8, 1.1.1.1), Shodan, Censys, GitHub, LinkedIn, job postings, public web pages of the target as captured by search engine caches or web archives, search engine results.

Not authorized:

Directly querying the target's authoritative DNS server (this lands in the target's logs)
Running port scans of any kind (that is Week 3)
Submitting forms on the target's website
Creating accounts on the target's services
Using any tool that sends requests directly to the target's IP address
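
The resolver rule can be enforced mechanically before any query goes out. The sketch below is one possible guard, not part of the lab tooling: the function names `is_allowed_resolver` and `safe_dig` are hypothetical, and the allow-list is assumed to contain only the two public resolvers named above.

```shell
# Hypothetical helper (not part of the lab tooling): succeed only for the
# public resolvers named in the authorization boundary.
is_allowed_resolver() {
    case "$1" in
        8.8.8.8|1.1.1.1) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper around dig.
# Usage: safe_dig <resolver-ip> <dig arguments...>
safe_dig() {
    resolver="$1"
    shift
    if is_allowed_resolver "$resolver"; then
        dig "@$resolver" "$@"
    else
        # Never reaches dig, so nothing is sent to the refused server.
        echo "refused: $resolver is not an allowed public resolver" >&2
        return 1
    fi
}
```

For example, `safe_dig 8.8.8.8 <target-domain> MX` runs the query, while passing the target's authoritative name server as the first argument is refused before any packet is sent.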

Required dossier sections

Produce a Markdown document with the following sections. Each finding must include the source (URL or tool name) and the date/time you retrieved it.

Section 1: Domain and registration

WHOIS registrant information (name, organization, email if exposed; note privacy proxy if in place)
Registration date and expiration date
Registrar
Authoritative name servers (from WHOIS)
Registrar abuse contact

whois <target-domain>

Section 2: DNS records

Collect all available record types using a public resolver (not the target's authoritative server):

dig @8.8.8.8 <target-domain> A
dig @8.8.8.8 <target-domain> AAAA
dig @8.8.8.8 <target-domain> MX
dig @8.8.8.8 <target-domain> TXT
dig @8.8.8.8 <target-domain> NS
dig @8.8.8.8 <target-domain> CNAME

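Do not attempt a zone transfer in this lab. An AXFR request goes directly to the target's authoritative name server, which the authorization boundary prohibits. Record the authoritative name servers from WHOIS and the NS query, and list the zone transfer below as a check to run in Week 3:

dig @<authoritative-ns-from-whois> <target-domain> AXFR  # Week 3 only; do not run in this lab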

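What do the MX records tell you about the email provider? What does the apex TXT record tell you about SPF? DMARC policy is published in a TXT record at _dmarc.<target-domain>, and DKIM keys at <selector>._domainkey.<target-domain>, so query those names through the public resolver as well.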
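
For the SPF question, the `include:` mechanisms usually name the services allowed to send mail for the domain. A minimal sketch, assuming a hypothetical `spf_includes` helper that reads TXT record strings on standard input, one record per line:

```shell
# Hypothetical helper: list the domains named in SPF include: mechanisms.
# Reads TXT record strings on stdin, one record per line.
spf_includes() {
    grep -i 'v=spf1' |          # keep only SPF records
        tr -d '"' |             # drop the quotes dig prints around TXT data
        tr ' ' '\n' |           # one mechanism per line
        grep -i '^include:' |   # keep include: mechanisms
        cut -d: -f2 |           # strip the "include:" prefix
        sort -u
}
```

It could be fed from the TXT query above, for example `dig @8.8.8.8 <target-domain> TXT +short | spf_includes`. Quote the raw TXT record in the dossier as well; the helper output is only a reading aid.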

Section 3: Subdomains

Run CT log enumeration and passive subdomain discovery:

# CT log search (no requests to target):
curl -s "https://crt.sh/?q=%25.<target-domain>&output=json" | jq -r '.[].name_value' | sort -u | tee crt-results.txt

# Passive subdomain enumeration (do not add -b or -p: brute forcing and port scanning are active):
sublist3r -d <target-domain> -o sublist3r-results.txt

# theHarvester:
theHarvester -d <target-domain> -l 200 -b bing,google,linkedin,twitter -f theharvester-results

For each subdomain found: note the source, note the IP it resolves to (using dig @8.8.8.8), and note whether the IP is in the same block as the main domain or hosted elsewhere.
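
Two small helpers can take some of the manual work out of this step. Both are sketches with hypothetical names. `clean_names` assumes crt-results.txt may still contain JSON quotes, wildcard prefixes, and literal `\n` separators. `same_slash24` assumes that a /24 is an acceptable stand-in for "the same block"; the real allocation size comes from the whois output for the IP.

```shell
# Hypothetical helper: normalise certificate names read from stdin.
clean_names() {
    tr -d '"' |                                 # drop JSON quotes, if any
        awk '{ gsub(/\\n/, "\n"); print }' |    # split literal \n separators
        sed 's/^\*\.//' |                       # drop wildcard prefixes
        tr '[:upper:]' '[:lower:]' |
        sort -u
}

# Hypothetical helper: succeed if two IPv4 addresses share their first
# three octets (assumption: /24 approximates "the same block").
same_slash24() {
    [ "${1%.*}" = "${2%.*}" ]
}
```

A possible use is `clean_names < crt-results.txt > subdomains.txt`, followed by `same_slash24 <subdomain-ip> <main-domain-ip>` for each resolved address. The output file name here is illustrative only.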

Section 4: IP ranges and hosting

What IP address(es) does the main domain resolve to?
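Which ASN and hosting provider own the address? (Use whois <IP> or ipinfo.io)
Is it cloud-hosted (AWS, GCP, Azure, Cloudflare)? What evidence supports this?
Are there other domains hosted at the same IP? (A reverse DNS lookup returns only the PTR name for the address; use a passive DNS or reverse-IP search service to list co-hosted domains.)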

dig @8.8.8.8 <target-domain> +short  # get the IP
whois <IP>                             # get ASN / org
dig @8.8.8.8 -x <IP>                  # reverse lookup

Section 5: Technology stack inference

Without making requests to the target's servers, what can you infer about the technology stack from:

Job postings (search for jobs at the fictional company on LinkedIn, Indeed, or similar)
Any public GitHub repositories
Any metadata in public documents (PDFs, Word docs) linked from public pages that you retrieve via search engine cache

Section 6: GitHub and code exposure

Search GitHub for the domain name: https://github.com/search?q=<target-domain>
Search for any public repositories associated with the organization
If repositories exist: scan them with trufflehog or gitleaks for secrets

trufflehog git https://github.com/<org>/<repo>

Document any secrets found (API keys, passwords, internal URLs in configuration files). Content deleted in a later commit still exists in the repository's commit history, so review the history and not only the current files.

Section 7: Email and personnel

Collect publicly available email addresses and employee names:

theHarvester -d <target-domain> -b linkedin,google,bing -l 200

From any public sources (LinkedIn, website About pages, GitHub profiles): what roles exist? Who is the technical contact? Who is the executive decision-maker?

Section 8: Shodan and Censys

Search Shodan and Censys for the target's IP range or domain:

# Shodan CLI (requires free API key registration at shodan.io):
shodan search "hostname:<target-domain>"
shodan host <target-ip>

What services has Shodan indexed on the target's IP? Compare to your DNS enumeration. Are there open ports that were not obvious from DNS records?
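
The comparison with your DNS enumeration amounts to a set difference between two lists. A minimal sketch, assuming you have saved each list to a file with one hostname or IP per line; the helper name `only_in_second` and any file names you pass to it are hypothetical:

```shell
# Hypothetical helper: print lines that appear in the second file but not
# in the first. Both files hold one hostname or IP per line.
only_in_second() {
    first_sorted="$(mktemp)"
    second_sorted="$(mktemp)"
    sort -u "$1" > "$first_sorted"
    sort -u "$2" > "$second_sorted"
    # comm -13 suppresses lines unique to file 1 and lines common to both,
    # leaving only the lines unique to file 2.
    comm -13 "$first_sorted" "$second_sorted"
    rm -f "$first_sorted" "$second_sorted"
}
```

Called with the DNS list first and the Shodan list second, it prints the entries that only Shodan knows about, which are natural candidates for the findings summary.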

Section 9: Findings summary

Based on your dossier, answer:

What is the likely technology stack (web server, backend language, hosting provider)?
Which subdomains or services look most interesting for the Week 3 active recon?
Are there any credentials, API keys, or sensitive configuration data in public sources?
What attack vectors does this recon suggest?

Evidence discipline

Every finding entry must have:

The source (a URL, or the tool name and the exact command)
The timestamp of retrieval (your local time with timezone)
The raw output or a quoted excerpt (do not paraphrase; quote)

An OSINT finding without a source is not a finding -- it is a guess.
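
One way to keep entries consistent is to generate them with a small shell function. The sketch below assumes a hypothetical `record_finding` helper and an entry layout chosen for illustration; the lab only requires that the three fields are present.

```shell
# Hypothetical helper: print one dossier entry in Markdown.
# Usage: <command> | record_finding "<source>" "<command as typed>"
record_finding() {
    source_name="$1"
    command_line="$2"
    # Local time with numeric UTC offset.
    retrieved="$(date '+%Y-%m-%d %H:%M:%S %z')"
    printf '%s\n' "- Source: $source_name"
    printf '%s\n' "- Command: $command_line"
    printf '%s\n' "- Retrieved: $retrieved"
    printf '%s\n\n' "- Raw output:"
    # Indent stdin by four spaces so Markdown renders the output verbatim.
    sed 's/^/    /'
    printf '\n'
}
```

For example, `whois <target-domain> | record_finding "whois" "whois <target-domain>" >> dossier.md` appends a complete entry; the file name dossier.md is an assumption.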

Submission

A single Markdown document (or PDF) covering all nine sections. Appendix: attach raw tool output files (crt-results.txt, sublist3r-results.txt, theharvester-results.xml, Shodan output) as a ZIP alongside the main document.
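
Before building the ZIP, it may help to confirm that every appendix file exists and is not empty. A minimal sketch, assuming a hypothetical `check_appendix` helper that takes the file names as arguments:

```shell
# Hypothetical helper: report appendix files that are missing or empty.
# Usage: check_appendix crt-results.txt sublist3r-results.txt ...
check_appendix() {
    missing=0
    for file in "$@"; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ ! -s "$file" ]; then
            echo "missing or empty: $file" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

The function returns a non-zero status if any listed file is absent or empty, so it can gate the packaging step.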