Beyond nmap and dirbuster: how modern attackers map a target's web attack surface using JavaScript analysis, subdomain enumeration, and API discovery.
The first phase of any web app attack is reconnaissance. Modern web apps expose far more attack surface than the visible UI: API endpoints, internal admin paths, embedded JavaScript with secrets, forgotten subdomains. Attackers and defenders both need tools to enumerate this surface comprehensively.
This tutorial is written from the offensive side: what attackers do, with which tools, and where you would be wise to examine your own systems first.
Two principles guide modern web app recon:
Spend most time on coverage. Sophistication comes later when you've found something interesting.
Most web apps have many more subdomains than their main marketing site reveals. Each is a separate attack surface.
Certificate Transparency logs — every TLS cert issued is logged. Cheapest, most comprehensive single source.
curl -s 'https://crt.sh/?q=%25.djangozen.com&output=json' | \
jq -r '.[].name_value' | sort -u | head -20
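Certificate Transparency results tend to be noisy: wildcard entries such as *.djangozen.com, mixed case, and blank lines. A minimal cleanup sketch follows, assuming one hostname per line on stdin (as the jq output above produces). The function name normalize_ct_names is a hypothetical helper, not part of any tool.

```shell
# normalize_ct_names (hypothetical helper): read hostnames on stdin, one per line,
# and print a lowercase, de-duplicated list with wildcard prefixes removed.
normalize_ct_names() {
  # 1. Hostnames are case-insensitive, so fold to lowercase.
  # 2. Turn "*.example.com" into "example.com".
  # 3. Drop blank lines, then sort and de-duplicate.
  tr '[:upper:]' '[:lower:]' |
    sed -e 's/^\*\.//' |
    grep -v '^$' |
    sort -u
}

# Possible usage with the crt.sh query above:
#   curl -s 'https://crt.sh/?q=%25.djangozen.com&output=json' |
#     jq -r '.[].name_value' | normalize_ct_names > ct-subs.txt
```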
Subfinder — aggregates multiple passive sources:
subfinder -d djangozen.com -all -silent
Amass passive mode:
amass enum -passive -d djangozen.com -src
SecurityTrails, VirusTotal, Shodan — APIs that have indexed subdomains. Some have free tiers.
DNS brute force — try common subdomain names:
gobuster dns -d djangozen.com -w subdomains-top1million-5000.txt
HTTP-level discovery — once you have a list of subdomains, see which are alive:
printf '%s\n' www.djangozen.com staging.djangozen.com api.djangozen.com | \
httpx -silent -title -status-code
Pattern: a subdomain points (via CNAME) to a service that no longer exists. An attacker claims that service name and then serves content under your subdomain.
Common targets: GitHub Pages, AWS S3, Azure Storage, Heroku, Surge, Netlify, Cloudflare Workers — anywhere you can claim a name.
Detection (do this against yourself):
# Check every subdomain's CNAME and verify the target is owned
subjack -w domains.txt -ssl -t 100
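If you want to see the logic without a dedicated tool, the first half of the check can be sketched by hand: resolve each CNAME and flag targets hosted on services where names can be claimed. The suffix list below is illustrative and far from exhaustive, and is_takeover_prone is a hypothetical helper name. A match only means the record deserves a manual look, not that the name is actually claimable.

```shell
# is_takeover_prone (hypothetical helper): succeed if the CNAME target in $1
# ends in a suffix used by a hosting service where names can be claimed.
# The suffix list is an illustrative assumption, not a complete fingerprint set.
is_takeover_prone() {
  # Lowercase the target and strip the trailing dot that dig prints.
  target=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
  case "$target" in
    *.github.io|*.s3.amazonaws.com|*.s3-website*.amazonaws.com|\
    *.blob.core.windows.net|*.herokuapp.com|*.surge.sh|*.netlify.app)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Possible usage against your own list of subdomains:
#   while read -r host; do
#     cname=$(dig +short CNAME "$host")
#     is_takeover_prone "$cname" && echo "review: $host -> $cname"
#   done < domains.txt
```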
Defense:
Sites change. Old endpoints that still exist but aren't linked anymore are gold for attackers.
# Get every URL ever archived for your domain
waybackurls djangozen.com | sort -u > urls.txt
# Filter for interesting patterns
grep -E '\.php|\.bak|\.sql|\.zip|\.git|admin|api/v1|debug' urls.txt
Surprisingly often this reveals working admin paths, debug endpoints, or old API versions that never got removed.
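Archive dumps contain the same path many times with different query strings, which buries the interesting entries. One way to shrink the list before reviewing it is to reduce each URL to its path. This is a sketch under the assumption that urls.txt holds one absolute URL per line; unique_paths is a hypothetical helper name.

```shell
# unique_paths (hypothetical helper): read URLs on stdin and print each
# distinct path once, with scheme, host, query string and fragment removed.
unique_paths() {
  # -e 1: cut everything from the first "?" or "#"
  # -e 2: strip "scheme://host[:port]"
  # -e 3: a bare host with no path becomes "/"
  sed -e 's/[?#].*$//' \
      -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]*||' \
      -e 's|^$|/|' |
    sort -u
}

# Possible usage:
#   unique_paths < urls.txt > paths.txt
```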
Modern web apps push significant logic to the browser. The JavaScript bundles often contain:
# Fetch every JS file linked from a page
katana -u https://djangozen.com -d 5 -jc -silent | grep -E '\.js$' > js_urls.txt
# Or crawl with gospider
gospider -s https://djangozen.com -d 3 --js
# Download all JS files
mkdir -p js && cd js
xargs -n1 -P10 curl -s -O < ../js_urls.txt
# Look for endpoints
grep -hoE '"(/api/[^"]*)"' *.js | sort -u
# Look for likely secrets
grep -hiE '(api[_-]?key|secret|token|password)' *.js | head -20
# AWS access key pattern
grep -hE 'AKIA[0-9A-Z]{16}' *.js
Specialized tools: TruffleHog, gitleaks, SecretFinder (Python tool for JS).
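The endpoint grep above only catches double-quoted strings under /api/. Bundles also use single quotes and non-API paths, so a slightly wider net can help. The following is a rough sketch, not a parser: it matches any quoted string that starts with a slash, and it will produce some false positives. extract_js_paths is a hypothetical helper name.

```shell
# extract_js_paths (hypothetical helper): print every distinct single- or
# double-quoted string that looks like a path ("/something"). Reads the files
# given as arguments, or stdin when called without arguments.
extract_js_paths() {
  grep -hoE "[\"']/[A-Za-z0-9_./-]+[\"']" "$@" |
    tr -d "\"'" |
    sort -u
}

# Possible usage inside the js/ directory:
#   extract_js_paths *.js > js-paths.txt
```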
Modern apps are increasingly API-driven. The API often has more attack surface than the UI.
# Common API paths
ffuf -u 'https://djangozen.com/api/FUZZ' -w api-wordlist.txt -fc 404
# Versioned APIs
for v in v1 v2 v3 v4; do
echo "Checking /api/$v/"
curl -s -o /dev/null -w "%{http_code}\n" "https://djangozen.com/api/$v/"
done
# Common API endpoints
ffuf -u 'https://djangozen.com/api/v1/FUZZ' -w api-endpoints.txt
Many APIs accidentally expose their own documentation:
/swagger/
/swagger.json
/swagger-ui/
/openapi.json
/redoc/
/api/docs/
/api/schema/
If found, that's the API contract — every endpoint, every parameter, every authentication requirement, gift-wrapped.
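Checking these paths against your own hosts takes a few lines. The sketch below builds the candidate URLs from the list above and reports anything that does not answer 404. The function names doc_urls and probe_docs are hypothetical, and the "anything but 404" rule is a simplifying assumption: sites that return 200 for every path will need a stricter check.

```shell
# doc_urls (hypothetical helper): print the candidate documentation URLs
# for the base URL given in $1.
doc_urls() {
  base=${1%/}   # drop one trailing slash so the paths join cleanly
  for path in /swagger/ /swagger.json /swagger-ui/ /openapi.json \
              /redoc/ /api/docs/ /api/schema/; do
    printf '%s%s\n' "$base" "$path"
  done
}

# probe_docs (hypothetical helper): request each candidate and print
# "status url" for every response other than 404. A status of 000 means
# curl could not connect, so it is skipped as well.
probe_docs() {
  doc_urls "$1" | while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    case "$code" in
      000|404) ;;
      *) printf '%s %s\n' "$code" "$url" ;;
    esac
  done
}

# Possible usage:
#   probe_docs https://djangozen.com
```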
GraphQL has its own discovery patterns:
# Common locations
curl https://djangozen.com/graphql
curl https://djangozen.com/api/graphql
# Introspection query — gets the full schema
curl -X POST -H "Content-Type: application/json" \
-d '{"query":"{__schema{types{name,fields{name}}}}"}' \
https://djangozen.com/graphql
Introspection should be disabled in production. If it's not, the entire schema is exposed.
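To turn the introspection query into a yes/no check, inspect the response body. The sketch below assumes an enabled server answers with a "__schema" object, while a disabled one answers with an error message or a null value. Servers differ in how they refuse, so treat this as a heuristic. introspection_enabled is a hypothetical helper name.

```shell
# introspection_enabled (hypothetical helper): read a GraphQL response body
# on stdin and succeed only if it contains a "__schema" object.
# An error message that merely mentions __schema does not match.
introspection_enabled() {
  grep -q '"__schema"[[:space:]]*:[[:space:]]*{'
}

# Possible usage:
#   curl -s -X POST -H "Content-Type: application/json" \
#     -d '{"query":"{__schema{types{name}}}"}' \
#     https://djangozen.com/graphql |
#     introspection_enabled && echo "introspection is enabled"
```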
Once you find an API, brute force parameter names:
# Try known parameter names on a parameter-less endpoint
ffuf -u 'https://djangozen.com/api/users/?FUZZ=test' -w params.txt -fs 0
Tools that do this systematically: Param Miner (Burp extension), Arjun.
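The idea underneath these tools is a baseline comparison: request the endpoint with no parameters, then flag any parameter name that changes the response. A simplified sketch follows, comparing body sizes only. Dedicated tools compare more signals than size, and params_changing_response is a hypothetical helper name.

```shell
# params_changing_response (hypothetical helper): stdin holds "name size"
# pairs, $1 is the baseline body size. Print the names whose response size
# differs from the baseline.
params_changing_response() {
  awk -v base="$1" '$2 != base { print $1 }'
}

# Possible usage (curl's %{size_download} reports the body size in bytes):
#   url='https://djangozen.com/api/users/'
#   baseline=$(curl -s -o /dev/null -w '%{size_download}' "$url")
#   while read -r p; do
#     size=$(curl -s -o /dev/null -w '%{size_download}' "$url?$p=test")
#     printf '%s %s\n' "$p" "$size"
#   done < params.txt | params_changing_response "$baseline"
```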
Old-school dirbuster: brute-force common paths. Modern alternatives are smarter.
katana (ProjectDiscovery):
katana -u https://djangozen.com -d 5 -jc -aff -o crawled.txt
Crawls recursively, parses JavaScript for additional URLs, follows forms, handles single-page apps.
gospider:
gospider -s https://djangozen.com -d 3 --js --robots --sitemap
Assetnote wordlists — curated by experience: https://wordlists.assetnote.io/
SecLists — the everything-and-the-kitchen-sink collection.
Use technology-specific wordlists. If the target runs Django (visible in headers/cookies), use Django-specific paths. If WordPress, WP paths. Etc.
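Picking the wordlist can be partly automated from response headers. The sketch below relies on two default behaviours: Django names its CSRF cookie csrftoken, and WordPress advertises its REST API in a Link header pointing at /wp-json/. Both can be renamed or disabled, so "unknown" does not rule either stack out. guess_stack is a hypothetical helper name.

```shell
# guess_stack (hypothetical helper): read raw HTTP response headers on stdin
# and print a coarse guess: django, wordpress, or unknown.
guess_stack() {
  headers=$(tr '[:upper:]' '[:lower:]')   # header names are case-insensitive
  case "$headers" in
    *set-cookie:*csrftoken=*) echo django ;;
    *wp-json*|*api.w.org*)    echo wordpress ;;
    *)                        echo unknown ;;
  esac
}

# Possible usage (-D - prints the response headers, the body is discarded):
#   curl -s -D - -o /dev/null https://djangozen.com | guess_stack
```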
200 — exists, content
301/302 — redirect (interesting if the redirect target is suspicious)
401 — exists, requires auth (often valuable)
403 — exists, forbidden (sometimes bypassable; often valuable)
404 — doesn't exist (or is hidden behind 404)
500 — server error — bug surface
A 403 on /admin/ is often more interesting than a 200 on the home page: it tells you the admin interface exists and that you only need to authenticate or find a bypass.
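When reviewing thousands of results, it helps to label each status code the same way every time. A small sketch of the table above as a lookup function; the label strings and the name classify_status are hypothetical, and treating every 5xx code like 500 is an assumption beyond the list above.

```shell
# classify_status (hypothetical helper): map an HTTP status code to a short
# label that mirrors the interpretation table above.
classify_status() {
  case "$1" in
    200)     echo "exists" ;;
    301|302) echo "redirect" ;;
    401)     echo "auth-required" ;;
    403)     echo "forbidden" ;;
    404)     echo "not-found" ;;
    5??)     echo "server-error" ;;   # assumption: all 5xx handled like 500
    *)       echo "other" ;;
  esac
}

# Possible usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' https://djangozen.com/admin/)
#   echo "/admin/ -> $(classify_status "$code")"
```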
Knowing the stack tells you what bugs to try.
# Wappalyzer browser extension — easiest manual check
# whatweb — command line
whatweb https://djangozen.com
# httpx with tech detection
httpx -u https://djangozen.com -tech-detect -title
What you'll learn:
A bug bounty hunter's morning workflow:
# 1. Subdomains
subfinder -d djangozen.com -all -silent > subs.txt
amass enum -passive -d djangozen.com >> subs.txt
sort -u subs.txt > subs-unique.txt
# 2. Alive hosts
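httpx -l subs-unique.txt -silent -no-color -title -tech-detect -status-code > alive-details.txt
awk '{print $1}' alive-details.txt > alive.txt   # URLs only, for the screenshot step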
# 3. Historical URLs
waybackurls djangozen.com > wayback.txt
gau djangozen.com >> wayback.txt
sort -u wayback.txt > urls.txt
# 4. Active crawling
katana -u https://djangozen.com -d 5 -jc -aff -silent | tee -a urls.txt
# 5. Interesting patterns
grep -E '/api/|/admin|debug|swagger|\.git|\.env|backup' urls.txt > interesting.txt
# 6. Take screenshots of every alive URL
gowitness file -f alive.txt -P screenshots/
# 7. Manual review of interesting URLs and screenshots
By the time you've done this for an app, you've seen everything publicly visible — admin panels, API endpoints, old debug pages, dev/staging environments forgotten in DNS, JS bundles with hidden secrets.
The defensive checklist is identical: run this same recon against your own infrastructure.
If you find things you didn't know you had, that's the point of the exercise. Better to find them yourself than read about them in a breach disclosure.
Recon is patient, methodical, comprehensive. Good attackers spend a disproportionate amount of time here: the bugs are usually findable, and the trick is finding what others miss.
Defenders win by knowing their own attack surface better than attackers do. The tools above are open source and free. Run them. Schedule them. Treat "new asset detected" as an actionable alert. An unknown asset usually has only months before it becomes a problem.
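A "new asset detected" alert needs nothing more than a diff between yesterday's host list and today's. A minimal sketch, assuming each scheduled run writes one hostname per line to a file; the file names and the function name new_assets are hypothetical.

```shell
# new_assets (hypothetical helper): print the hosts listed in today's file ($2)
# that are absent from the baseline file ($1), sorted and de-duplicated.
new_assets() {
  awk 'FILENAME == ARGV[1] { known[$0] = 1; next }
       !($0 in known)' "$1" "$2" | sort -u
}

# Possible usage in a scheduled job:
#   subfinder -d djangozen.com -all -silent | sort -u > subs-today.txt
#   new_assets subs-baseline.txt subs-today.txt > subs-new.txt
#   [ -s subs-new.txt ] && echo "new assets detected:" && cat subs-new.txt
```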
Tutorial 8 covers what happens when an attacker has finished reconnaissance and starts going after the auth layer.