Security Logging & Monitoring Failures
Theory
Security Logging and Monitoring Failures moved from #10 (2017) to #9 (2021). It represents a fundamental problem: you cannot defend against attacks you cannot see. Without adequate logging and alerting, breaches go undetected for months. The IBM Cost of a Data Breach Report 2023 found that breaches take an average of 204 days to identify and a further 73 days to contain. The same report found that organisations making extensive use of security AI and automation saved an average of $1.76M per breach, and those with well-tested incident response plans saved about $1.49M.
What Should Be Logged (But Often Isn't)
- Authentication events: every login attempt (success and failure), password reset, MFA use, and account lockout
- Access control failures: every 401/403 response; repeated failures on sequential IDs signal an IDOR scan
- High-value transactions: financial transfers, privilege changes, admin actions, bulk data exports
- Input validation failures: repeated 400/422 errors may indicate automated fuzzing or injection attempts
- Abnormal application behaviour: exceptions, stack traces, unexpectedly large queries
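The point about sequential IDs can be turned into a simple check. The sketch below is a minimal illustration rather than a production detector: `longest_sequential_run` and `looks_like_idor_scan` are hypothetical helpers, and the threshold of 5 is an arbitrary assumption. It takes the resource IDs from one client's denied (401/403) requests, in request order, and reports whether they contain a long run of consecutive values.

```python
from typing import Iterable


def longest_sequential_run(resource_ids: Iterable[int]) -> int:
    """Length of the longest run of consecutive IDs (n, n+1, n+2, ...) in request order."""
    longest = 0
    current = 0
    previous = None
    for rid in resource_ids:
        if previous is not None and rid == previous + 1:
            current += 1  # continues the current run
        else:
            current = 1   # starts a new run
        longest = max(longest, current)
        previous = rid
    return longest


def looks_like_idor_scan(denied_ids: Iterable[int], threshold: int = 5) -> bool:
    """Flag a client whose denied requests walk through IDs one by one.

    The threshold is an assumed value; tune it against real traffic.
    """
    return longest_sequential_run(denied_ids) >= threshold
```

A real detector would also need to handle descending or strided scans and group requests per client; this only shows why logging the requested resource alongside each 401/403 makes the pattern visible at all.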
Common Logging Failures
- No structured logs: logs are unformatted text, hard to query, and easy to miss patterns in
- Logs not monitored: data is written to files but no SIEM (Splunk, Elastic, Datadog) ingests or alerts on it
- Log injection: user-controlled input written directly to logs without sanitisation; can poison log parsers or forge log entries
- No log integrity protection: logs stored on the compromised system; attacker deletes or modifies them to cover tracks
- Sensitive data in logs: logging full request bodies including passwords, credit card numbers, or tokens
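To illustrate the last failure, the following sketch masks sensitive values before a request body is logged. It is a minimal example under stated assumptions: `redact` is a hypothetical helper, and the field names in `SENSITIVE_KEYS` are placeholders that would need to match the application's real payloads.

```python
# Hypothetical field names; adjust to the application's actual request schema.
SENSITIVE_KEYS = {"password", "token", "card_number"}


def redact(fields: dict) -> dict:
    """Return a copy of `fields` with sensitive values masked.

    Keys are kept so the log still shows which fields were submitted,
    but the secret values never reach the log file.
    """
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }
```

For example, `redact({"username": "alice", "password": "hunter2"})` yields a dictionary in which `password` maps to `"[REDACTED]"`, which can then be passed to the logger in place of the raw body. This only handles top-level keys; nested payloads would need a recursive variant.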
Vulnerable Logging Code
import logging
logger = logging.getLogger(__name__)

# VULNERABLE: what's wrong here?
@app.post("/login")
def login(username: str = Form(), password: str = Form()):
    user = db.get_user(username)
    if not user or not check_password(password, user.hash):
        logger.warning(f"Failed login for {username}")  # good, but...
        return {"error": "invalid"}  # no IP logged, no alert threshold
    # admin login succeeds: nothing is logged at all!
    return {"token": create_token(user)}

# LOG INJECTION VULNERABILITY:
@app.get("/search")
def search(q: str):
    # attacker sends q="foo\nINFO: admin logged in from 8.8.8.8"
    # the embedded newline forges a fake log entry
    logger.info(f"Search query: {q}")
Fixed Logging Code
import json
import logging
import re
from datetime import datetime, timezone

from fastapi import Form, Request

logger = logging.getLogger("security")

def security_event(event_type: str, actor: str, detail: str, request=None):
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # timezone-aware UTC
        "event": event_type,
        "actor": actor,
        "detail": detail,
        "ip": request.client.host if request and request.client else "unknown",
    }
    logger.info(json.dumps(entry))  # structured JSON log; easy to query in SIEM

@app.post("/login")
def login(request: Request, username: str = Form(), password: str = Form()):
    user = db.get_user(username)
    if not user or not check_password(password, user.hash):
        security_event("auth_failure", username, "invalid credentials", request)
        return {"error": "invalid"}
    security_event("auth_success", username, f"role={user.role}", request)
    return {"token": create_token(user)}

# LOG INJECTION FIX: sanitise user input before including it in log messages
def sanitize_for_log(value: str) -> str:
    return re.sub(r'[\r\n]', '_', value)[:200]  # replace CR/LF, cap length

@app.get("/search")
def search(q: str):
    logger.info(f"Search query: {sanitize_for_log(q)}")
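Structured logging can also be applied at the formatter level, so that every record, not just security events, is emitted as one JSON object per line. The sketch below is one possible approach using only the standard library; `JsonFormatter` is a hypothetical class name, and the chosen fields are an assumption. Because `json.dumps` escapes control characters, an embedded newline in user input cannot start a new log line.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            # record.created is a Unix timestamp set when the record was made
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # json.dumps escapes "\n" and "\r", so the output stays on one line
        return json.dumps(entry)


def build_handler() -> logging.Handler:
    """Create a stream handler that uses the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    return handler
```

Attaching the handler with `logging.getLogger("security").addHandler(build_handler())` would route that logger's output through the formatter. Length capping, as in `sanitize_for_log`, is still worth keeping, since JSON escaping does nothing about oversized values.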
What a Good Alerting Rule Looks Like
# Splunk: alert when 5+ auth failures from the same IP in 5 minutes
index=security_logs event=auth_failure earliest=-5m
| stats count by ip
| where count >= 5
# Datadog monitor: alert on more than 10 denied (403) requests to admin endpoints in 5 minutes
logs("service:myapp path:/admin/* status:403").rollup("count").last("5m") > 10
# Elastic (ES|QL): rank IPs by access-denied count to surface IDOR scanning behaviour
FROM security_logs
| WHERE event == "access_denied"
| STATS count = COUNT(*) BY ip
| SORT count DESC
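The logic behind a threshold rule such as "5+ failures from one IP in 5 minutes" can be sketched in a few lines of Python. This is an illustration of the sliding-window idea, not how any SIEM implements it: `FailureWindow` is a hypothetical class, the defaults mirror the rule above, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class FailureWindow:
    """Track recent failure timestamps per IP and report threshold breaches."""

    def __init__(self, threshold: int = 5, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # ip -> timestamps within the window

    def record(self, ip: str, timestamp: float) -> bool:
        """Record one failure; return True if this IP should trigger an alert."""
        events = self._events[ip]
        events.append(timestamp)
        # Drop failures that have aged out of the window
        cutoff = timestamp - self.window_seconds
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events) >= self.threshold
```

Calling `record("203.0.113.7", t)` for five failures spaced ten seconds apart returns `True` on the fifth call, while five failures spaced 100 seconds apart never reach the threshold because older entries expire first.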
Real-World Breaches Enabled by Logging Failures
- Target (2013): Attackers' malware triggered multiple security alerts that were seen but never acted on. 40 million credit and debit card numbers were stolen. The security team had visibility but no actionable alerting.
- Uber (2016): Attackers found AWS credentials in a GitHub repository and used them to download 57 million records from S3. No alerts were in place for unusual S3 access patterns, so Uber learned of the breach only when the attackers contacted the company, and the incident then stayed concealed for a year.
- Equifax (2017): The HTTPS inspection appliance used to decrypt internal traffic had an expired certificate for 19 months, meaning encrypted attack traffic was never inspected or logged during the entire compromise.
- Capital One (2019): The attacker's SSRF exploit was made from legitimate AWS IPs. Unusual metadata service requests appeared in logs but no alert rule flagged the pattern.
How to Fix: Checklist
- Log all authentication events: successes and failures; include timestamp, IP, username, and user agent
- Log access control failures: every 401/403 with the requested resource and the requester's identity
- Use structured (JSON) logging: makes logs queryable by SIEM tools; JSON serialisation escapes newlines, which blunts log injection
- Ship logs to a separate system: logs on the compromised host can be deleted; centralise them in an append-only store or SIEM that the application host can write to but not modify
- Alert on anomalies: 5+ failures from one IP, sudden spike in 403s, access to decommissioned endpoints
- Never log credentials: sanitise request bodies; log only the field name, not the value, for password fields
- Define incident response procedures: logs are useless without a process for acting on alerts
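Shipping logs off-host addresses deletion; tampering can additionally be made detectable by chaining entries together with hashes. The sketch below is a minimal illustration of that idea under stated assumptions: `append_entry`, `verify_chain`, and the `GENESIS` value are hypothetical names, and the chain lives in an in-memory list rather than a real log store.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, event: dict) -> dict:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Return False if any entry was modified, reordered, or removed mid-chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting an entry in the middle breaks every later hash, so `verify_chain` returns `False`. Truncating the tail, or recomputing the whole chain, is not detected by this alone; that requires storing the latest hash somewhere the attacker cannot reach, which is one more reason to keep a copy off the host.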
Read the Audit Logs: Find What's Missing
Inspect the audit log. Notice that no admin login events are recorded, even though admin actions have been performed. The backdoor endpoint is undocumented and unlogged; find it.
Hint
/web/a09/audit-log, then look for an undocumented admin action endpoint.
Undocumented Backdoor: Unlogged Access
An attacker previously installed a backdoor at /web/a09/backdoor. Because there is no logging on this endpoint, the intrusion was never detected.
Hint
/web/a09/backdoor: it works, and nothing in the audit log will record it.