Security Incident Report: Crypto Miner on CI Build Agent

Date: 2026-03-17
Severity: CRITICAL
Affected system: Self-hosted Azure DevOps build agent (DO-Build-Agents)
Droplet: 159.223.11.111 (devops-buildagent)
Status: Resolved
First mention: .specstory/history/2026-03-17_08-26Z-unit-test-job-not-stopping.md


Executive Summary

A cryptocurrency miner was discovered running on the self-hosted DigitalOcean build agent during investigation of a stalled Azure Pipelines unit test job. The attacker exploited a publicly exposed PostgreSQL container (used for CI test runs) that was configured with trust-based authentication and no network restrictions. The miner consumed ~250% CPU and 2.4 GB RAM, severely degrading CI pipeline performance.

The attack was neutralized, firewall rules applied, and the root cause patched in the CI pipeline configuration.


Timeline

| Time (UTC) | Event |
|---|---|
| ~07:42 | Pipeline #20260317084052 starts, PostgreSQL container launched |
| ~07:43 | Attacker connects to exposed PostgreSQL (port 5432) |
| ~07:43 | Miner process /tmp/mysql spawned via PostgreSQL command execution |
| ~07:48 | Jest-based unit tests begin executing (severely degraded by miner) |
| ~08:26 | Tests finish after ~330s wall-clock, but pipeline step does not exit |
| 08:28 | Investigation begins via SSH; miner detected at 249% CPU |
| 08:28 | First kill of PID 774129; miner respawns within seconds |
| 08:29 | Watchdog process tree identified and killed (PID 774094) |
| 08:30 | iptables rules applied to block external access to port 5432 |
| 08:30 | Pipeline config patched: PostgreSQL bound to 127.0.0.1 only |
| 08:30 | --forceExit and timeoutInMinutes added to test pipeline steps |

Attack Analysis

Attack Vector

The CI pipeline starts a PostgreSQL 16 Docker container for integration tests. The container was configured with:

  • Port binding: -p "${PG_PORT}:5432" (binds to 0.0.0.0, all network interfaces)
  • Authentication: POSTGRES_HOST_AUTH_METHOD=trust (no password required)
  • User: postgres with full superuser privileges

This made the database accessible from the public internet without any credentials.
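
The port-binding problem can be checked mechanically. The following is a minimal sketch, assuming the Docker daemon uses its default bind address: it reports whether a `docker run -p` value publishes on every host interface. The function name `binds_all_interfaces` is hypothetical and not part of the pipeline.

```python
def binds_all_interfaces(publish_spec: str) -> bool:
    """Return True if a `docker run -p` value publishes on every host interface."""
    # Drop an optional protocol suffix such as "/tcp".
    spec = publish_spec.split("/", 1)[0]
    parts = spec.split(":")
    # "container" or "host:container" carry no host IP, so Docker falls back
    # to its default bind address (0.0.0.0 unless the daemon is reconfigured).
    if len(parts) <= 2:
        return True
    # Everything before the last two fields is the host IP (IPv6 is bracketed).
    host_ip = ":".join(parts[:-2]).strip("[]")
    return host_ip in ("", "0.0.0.0", "::")


print(binds_all_interfaces("5432:5432"))            # True
print(binds_all_interfaces("127.0.0.1:5432:5432"))  # False
```

A check like this could run as a lint step over pipeline templates so that a binding without an explicit loopback address fails the build.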

Exploitation Method

PostgreSQL provides several mechanisms for command execution when accessed as a superuser:

  • COPY ... TO PROGRAM — executes arbitrary shell commands
  • CREATE FUNCTION ... LANGUAGE C — loads shared libraries
  • Large object functions combined with file I/O

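The exact method has not been confirmed (the PostgreSQL log audit is still open), but the miner's position in the process tree under postgres indicates that the attacker used one of these mechanisms to write a binary to /tmp/mysql and execute it.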
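
When statement logging is available, the three mechanisms listed above leave recognisable SQL in the log. The sketch below is illustrative only: the pattern names and `classify_statement` are hypothetical, the patterns are not exhaustive, and it assumes statements were actually logged (PostgreSQL does not log statements by default).

```python
import re

# Hypothetical signature list covering the mechanisms named in this report.
SUSPICIOUS = [
    ("copy_program",
     re.compile(r"\bCOPY\b.*\b(?:TO|FROM)\s+PROGRAM\b", re.I | re.S)),
    ("c_function",
     re.compile(r"\bCREATE\s+(?:OR\s+REPLACE\s+)?FUNCTION\b.*\bLANGUAGE\s+'?C'?\b",
                re.I | re.S)),
    ("large_object_export",
     re.compile(r"\blo_export\s*\(", re.I)),
]


def classify_statement(sql: str) -> list[str]:
    """Return the names of all suspicious patterns that match one SQL statement."""
    return [name for name, pattern in SUSPICIOUS if pattern.search(sql)]


print(classify_statement("COPY t FROM PROGRAM 'id'"))  # ['copy_program']
print(classify_statement("SELECT 1"))                  # []
```

A match is only a lead for manual review; legitimate maintenance scripts can also use these statements.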

Persistence Mechanism

The miner ran under a watchdog parent process that respawned it whenever it was killed, so terminating the miner alone had no lasting effect.

Key characteristics:

  • Process camouflage: Named /tmp/mysql to blend in with database processes
  • User context: Ran as lxd (the user owning the Docker daemon socket), inheriting Docker group privileges
  • Fileless persistence: The binary was deleted from disk after execution (/proc/<pid>/exe pointed to /tmp/mysql (deleted)), running entirely from memory
  • Auto-respawn: The init watchdog (PID 774094) immediately respawned the miner when killed, requiring a full process tree kill
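
The "(deleted)" marker on /proc/<pid>/exe is a cheap signal to look for on Linux hosts. The sketch below flags processes whose executable was unlinked from a temporary directory. It is not the tooling used during the incident; the function names and the list of directories are assumptions.

```python
import os


def is_deleted_tmp_exe(exe_target: str,
                       tmp_dirs=("/tmp/", "/var/tmp/", "/dev/shm/")) -> bool:
    """exe_target is the readlink() result of /proc/<pid>/exe."""
    return exe_target.endswith(" (deleted)") and exe_target.startswith(tmp_dirs)


def scan_proc(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, exe target) for every process running an unlinked temp binary."""
    hits = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            target = os.readlink(os.path.join(proc_root, entry, "exe"))
        except OSError:
            continue  # kernel thread, exited process, or insufficient permission
        if is_deleted_tmp_exe(target):
            hits.append((int(entry), target))
    return hits


print(is_deleted_tmp_exe("/tmp/mysql (deleted)"))  # True
```

Running `scan_proc()` as root gives the fullest view, because the exe links of other users' processes are not readable otherwise.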

Network Exposure

Listening ports at time of discovery:

| Port | Service | Binding | Risk |
|---|---|---|---|
| 22 | SSH | 0.0.0.0 | Low (key-based only) |
| 80 | NGINX (Docker) | 0.0.0.0 | Low (reverse proxy) |
| 443 | NGINX (Docker) | 0.0.0.0 | Low (reverse proxy) |
| 5432 | PostgreSQL (Docker) | 0.0.0.0 | CRITICAL (trust auth) |

Impact

Direct Impact

| Metric | Value |
|---|---|
| CPU consumed | ~250% of one core (62.5% of the 4 vCPUs) |
| Memory consumed | 2.4 GB (30% of 8 GB) |
| Duration active | ~45 minutes (this instance) |
| Pipeline delay | Unit test job hung at 44+ minutes |
| Data exfiltration | None confirmed; miner-only payload |

Indirect Impact

  • CI pipeline throughput reduced (tests ran ~6x slower)
  • Build agent availability degraded for the pipeline pool
  • Pipeline jobs appeared hung, delaying development feedback loops
  • It is unknown how long the vulnerability had existed or how many earlier pipeline runs were exploited in the same way

Remediation Actions

Immediate (applied 2026-03-17)

| Action | File / System | Change |
|---|---|---|
| Kill miner process tree | Server | kill -9 on PIDs 774094, 779680, and all child threads |
| Block port 5432 externally | Server iptables | iptables -I INPUT -p tcp --dport 5432 -j DROP (with localhost/Docker subnet exceptions) |
| Bind PostgreSQL to localhost | setup-database.yml | -p "${PG_PORT}:5432" changed to -p "127.0.0.1:${PG_PORT}:5432" |
| Add --forceExit to Jest | azure-pipelines.yml | Prevents Jest worker processes from hanging on open handles |
| Add step timeout | azure-pipelines.yml | timeoutInMinutes: 20 on both unit test steps as a safety net |
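
It is worth confirming from outside the droplet that port 5432 is really closed. Docker normally forwards published ports through its own DNAT and FORWARD/DOCKER-USER rules, so a rule in the INPUT chain may not cover container traffic on its own. The sketch below is a plain TCP connect probe; `tcp_port_open` is a hypothetical helper, and it should be run from a machine that is not the build agent.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unreachable all count as closed.
        return False


if __name__ == "__main__":
    # Expected to print False once the binding and firewall fixes are in place.
    print(tcp_port_open("159.223.11.111", 5432))
```

A False result from an external host, together with a True result for `127.0.0.1` on the agent while a pipeline is running, would indicate that the loopback binding is working as intended.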

Follow-up

| Priority | Action | Status | Rationale |
|---|---|---|---|
| HIGH | Rotate all secrets and credentials accessible from the build agent | TODO | Agent may have been compromised beyond the miner |
| HIGH | Audit Azure DevOps service connections and variable groups | TODO | Attacker had access to the agent's runtime environment |
| HIGH | Persist iptables rules across reboots (iptables-save / netfilter-persistent) | DONE | Added to setup-build-agent.sh |
| HIGH | Block Docker API ports (2375/2376) on all droplets | DONE | DigitalOcean Docker images ship with these open in UFW; an exposed Docker API allows unauthenticated RCE |
| HIGH | Harden SSH on all droplets (key-only, fail2ban) | DONE | PermitRootLogin prohibit-password, PasswordAuthentication no, fail2ban installed |
| HIGH | Audit staging and nanoclaw machines for compromise | DONE | No compromise found; hardening applied 2026-03-17 |
| MEDIUM | Add DigitalOcean Cloud Firewall to the droplet | TODO | Defense-in-depth; blocks traffic before it reaches the OS |
| MEDIUM | Restrict the lxd user's Docker group membership | TODO | Limits blast radius of container-escape attacks |
| MEDIUM | Run the Azure DevOps agent as a non-root user without Docker socket access | TODO | Use rootless Docker or restrict agent capabilities |
| LOW | Set up host-level monitoring (CPU spike alerts, /tmp binary alerts) | DONE | /etc/cron.d/tmp-exec-monitor added to build agent |
| LOW | Audit /var/log/postgresql for evidence of the initial exploitation query | TODO | Understand exact method used |
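
The contents of /etc/cron.d/tmp-exec-monitor are not reproduced in this report. As an illustration of what a /tmp binary alert could check, the sketch below lists regular files under a directory that carry any execute bit. The function name and the choice to walk subdirectories are assumptions, not a description of the deployed monitor.

```python
import os
import stat


def find_executables(directory: str = "/tmp") -> list[str]:
    """Return sorted paths of regular files under `directory` with an execute bit set."""
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    hits = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable
            if stat.S_ISREG(mode) and mode & exec_bits:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for path in find_executables("/tmp"):
        print(path)
```

A file scan of this kind would not have caught the miner after it unlinked itself, so it complements a check of running processes rather than replacing it.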

Sibling Machine Audit (2026-03-17)

Following the build agent compromise, a security audit was performed on both sibling droplets:

| Machine | IP | Result | Issues Found |
|---|---|---|---|
| Staging (forma3d-staging) | 167.172.45.47 | Clean, no compromise | Docker API ports 2375/2376 open in UFW, PermitRootLogin yes, no fail2ban |
| Nanoclaw (devgem-nanoclaw) | 188.166.104.96 | Clean, no compromise | Docker API ports 2375/2376 open in UFW, PermitRootLogin yes, active brute-force from 185.241.34.154 |

All issues were remediated:

  • Removed UFW rules for Docker API ports 2375/2376 on both machines
  • Set PermitRootLogin prohibit-password and PasswordAuthentication no on both machines
  • Installed fail2ban with 24-hour SSH ban on both machines
  • Rate-limited SSH via UFW on staging (nanoclaw already had rate limiting)
  • Cleared plaintext Docker registry credentials from staging bash history
  • Updated setup-droplet.sh and setup-build-agent.sh to include all hardening steps
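
The two SSH settings above can be verified automatically after each provisioning run. The sketch below checks the text of an sshd_config file for them. It is deliberately simplified: it ignores Include directives and Match blocks, and `sshd_findings` is a hypothetical name.

```python
def sshd_findings(config_text: str) -> list[str]:
    """Report settings that differ from the hardening baseline in this report."""
    wanted = {
        "permitrootlogin": "prohibit-password",
        "passwordauthentication": "no",
    }
    seen: dict[str, str] = {}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        # sshd uses the first value it obtains for each keyword.
        seen.setdefault(key, value)
    findings = []
    for key, expected in wanted.items():
        actual = seen.get(key)
        if actual != expected:
            findings.append(f"{key}: expected {expected!r}, found {actual!r}")
    return findings


print(sshd_findings("PermitRootLogin prohibit-password\nPasswordAuthentication no\n"))  # []
```

A keyword that is not set explicitly is reported as `None`, which treats reliance on compiled-in defaults as a finding.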

Root Cause

The root cause was a defense-in-depth failure: the Docker port binding default (0.0.0.0) combined with trust authentication and no firewall created an unauthenticated, internet-facing PostgreSQL superuser endpoint. Any single mitigation (localhost binding, firewall, password auth) would have prevented the attack.


Forensic Evidence

Process Snapshots

Initial detection (08:28 UTC):

PID 774129 — /tmp/mysql — 249% CPU, 2.4 GB RSS, running 2693 seconds, user: lxd

After first kill (08:28 UTC) — respawned immediately:

PID 779680 — /tmp/mysql — 106% CPU, 510 MB RSS, running 7 seconds, user: lxd
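
The elapsed run times in these snapshots can be turned back into start times to cross-check the timeline. The sketch below assumes the snapshots were taken at exactly 08:28:00 UTC; the report only gives the minute, so the results are accurate to within about a minute.

```python
from datetime import datetime, timedelta, timezone


def process_start(observed_at: datetime, elapsed_seconds: int) -> datetime:
    """Estimate when a process started from its elapsed run time."""
    return observed_at - timedelta(seconds=elapsed_seconds)


# Assumption: seconds are not recorded in the report, so 08:28:00 is used.
observed = datetime(2026, 3, 17, 8, 28, tzinfo=timezone.utc)

first = process_start(observed, 2693)  # PID 774129
respawn = process_start(observed, 7)   # PID 779680

print(first.isoformat())    # 2026-03-17T07:43:07+00:00
print(respawn.isoformat())  # 2026-03-17T08:27:53+00:00
```

The first result agrees with the ~07:43 spawn time in the timeline, about a minute after the PostgreSQL container was launched.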

Process tree (08:29 UTC):

postgres,773014 (user: lxd, PPID: 772988)
  ├── init,774094          ← watchdog
  │   ├── init,774128
  │   ├── mysql,779680     ← miner (9 threads)
  │   └── {init},774127
  ├── postgres,777352      ← checkpointer
  ├── postgres,777353      ← background writer
  ├── postgres,777354      ← walwriter
  ├── postgres,777355      ← autovacuum launcher
  └── postgres,777356      ← replication launcher
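
Because the watchdog respawns the miner, the order of termination matters: the watchdog has to go first. The sketch below derives a top-down kill order for a subtree from a PID to parent-PID map. The map is transcribed from the tree above; the thread entry {init},774127 is omitted because threads are not separate processes. `subtree_top_down` is a hypothetical helper.

```python
def subtree_top_down(ppid_of: dict[int, int], root: int) -> list[int]:
    """Return `root` and all its descendants, parents before children."""
    children: dict[int, list[int]] = {}
    for pid, ppid in ppid_of.items():
        children.setdefault(ppid, []).append(pid)
    order, queue = [], [root]
    while queue:
        pid = queue.pop(0)
        order.append(pid)
        queue.extend(sorted(children.get(pid, [])))
    return order


# PID -> parent PID, taken from the 08:29 UTC process tree.
ppid_of = {
    774094: 773014,  # init (watchdog)
    774128: 774094,  # init
    779680: 774094,  # mysql (miner)
    777352: 773014,  # postgres checkpointer, outside the watchdog subtree
}

print(subtree_top_down(ppid_of, 774094))  # [774094, 774128, 779680]
```

Each PID in the returned list could then be passed to `os.kill(pid, signal.SIGKILL)` in order, so that the watchdog is gone before the miner is killed and cannot respawn it.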

Binary analysis:

  • Path: /tmp/mysql (deleted from disk, running from memory)
  • /proc/779680/exe → /tmp/mysql
  • File was unlinked (fileless execution) and could not be recovered for hash analysis

Auth Log (no unauthorized SSH access found)

All SSH sessions during the incident window originated from a single authorized IP (84.198.49.200) using ED25519 public key authentication. No brute-force attempts or unauthorized logins were recorded.

Login History

The wtmp log contained only system reboot entries — no interactive user login records. This is consistent with the agent running headless, though wtmp tampering cannot be ruled out.