Code execution inside the data stack: dbt Cloud service-token abuse
A leaked dbt Cloud service token lets an attacker run their own models, which execute arbitrary SQL in the warehouse under dbt’s own trust. dbt sits on two credentials, and you must cut both.
dbt Cloud is a trusted transformation layer that runs SQL against your warehouse under its own connection. A leaked service token turns that trust into code execution inside the data stack.
How the attack works
The token leaks from a misconfigured CI variable. The attacker uses it to trigger an off-hours job that executes models and macros that were never in the reviewed project, issuing ad hoc SQL against source schemas. The compiled SQL selects from customer and finance source tables far beyond the model’s normal scope, running under dbt’s warehouse role, and a macro then copies the results to an external object-storage destination not owned by the company. In ATT&CK terms this is T1552, Unsecured Credentials, and T1648, Serverless Execution, with the data theft mapping to T1530, Data from Cloud Storage.
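The two signals in this run, nodes that were never reviewed and an off-hours trigger, can be checked mechanically. The following is a minimal sketch, not a definitive detector. It assumes you have the reviewed project's manifest and the suspicious run's results loaded as dictionaries shaped like dbt's `manifest.json` (a `nodes` mapping keyed by unique ID) and `run_results.json` (a `results` list with a `unique_id` per entry). The function names, the business-hours window, and the sample IDs are hypothetical.

```python
from datetime import datetime, time


def unreviewed_nodes(reviewed_manifest: dict, run_results: dict) -> set:
    """Return unique IDs that the run executed but the reviewed manifest lacks."""
    reviewed = set(reviewed_manifest.get("nodes", {}))
    executed = {r["unique_id"] for r in run_results.get("results", [])}
    return executed - reviewed


def is_off_hours(started_at: datetime,
                 day_start: time = time(7, 0),
                 day_end: time = time(19, 0)) -> bool:
    """True when the run started outside the assumed business-hours window."""
    return not (day_start <= started_at.time() < day_end)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    reviewed = {"nodes": {"model.shop.orders": {}, "model.shop.customers": {}}}
    results = {"results": [{"unique_id": "model.shop.orders"},
                           {"unique_id": "model.shop.tmp_dump"}]}
    print(sorted(unreviewed_nodes(reviewed, results)))
    print(is_off_hours(datetime(2024, 1, 6, 2, 30)))
```

Either signal alone is noisy. A run that is both off-hours and contains unreviewed nodes is the combination worth alerting on.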
Why it works
dbt runs arbitrary SQL with its own warehouse trust, and in this scenario its role was broad enough to reach all source data. The leaked token was over-scoped, and unreviewed models executed automatically, with no review gate between a job trigger and the warehouse.
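An over-broad role is easy to spot once the grants are compared against what the project actually needs. The sketch below assumes you have already exported the role's grants as `(schema, privilege)` pairs and written down the needed privileges per schema. Both inputs, the function name, and the schema names are hypothetical, and how you export grants depends on your warehouse.

```python
def excess_grants(grants, needed):
    """Return (schema, privilege) pairs the role holds but the project does not need.

    grants: iterable of (schema, privilege) tuples exported from the warehouse.
    needed: dict mapping lowercase schema name to a set of uppercase privileges.
    """
    excess = []
    for schema, privilege in grants:
        allowed = needed.get(schema.lower(), set())
        if privilege.upper() not in allowed:
            excess.append((schema, privilege))
    return excess


if __name__ == "__main__":
    # Hypothetical grants: the role can read a finance schema it never transforms.
    grants = [("analytics", "SELECT"), ("analytics", "INSERT"),
              ("raw_orders", "SELECT"), ("finance_raw", "SELECT")]
    needed = {"analytics": {"SELECT", "INSERT"}, "raw_orders": {"SELECT"}}
    print(excess_grants(grants, needed))
```

Anything the function returns is reach that a malicious model inherits for free.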
How to fix it
The non-obvious move is that dbt sits on two credentials, not one: the service token that triggers jobs and the warehouse connection credentials that actually touch data. Revoking only the token still leaves an attacker who scraped the connection credentials with direct warehouse access, so containment takes three steps: revoke the token, cancel the active run, and rotate the warehouse connection credentials. For the class fix, store tokens in a managed secret store with least-privilege scopes, restrict dbt’s warehouse role to only the schemas it transforms, and gate model and macro changes that reach new sources. The warehouse query history, joined to the dbt run logs and compiled SQL, is authoritative for what was read and exfiltrated.
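Both halves of this paragraph, the three-step containment and the query-history join, can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the step names, record fields (`role`, `start_time`, `query_text`, `run_id`, `started_at`, `finished_at`), and function names are hypothetical stand-ins for whatever your warehouse query history and dbt run logs export. Matching is by time window only, so a production version would also compare query text against the compiled SQL.

```python
from datetime import datetime

# All three steps are required; revoking the token alone is not containment.
CONTAINMENT_STEPS = (
    "revoke_service_token",
    "cancel_active_run",
    "rotate_warehouse_credentials",
)


def remaining_steps(done):
    """Return the containment steps that have not been completed yet, in order."""
    return [step for step in CONTAINMENT_STEPS if step not in done]


def attribute_queries(queries, runs, dbt_role):
    """Split the dbt role's queries into those inside a run window and those outside.

    Queries under the dbt role that fall outside every run window suggest the
    connection credentials were used directly, without going through dbt.
    """
    by_run, orphaned = {}, []
    for q in queries:
        if q["role"] != dbt_role:
            continue  # only the dbt role's activity is in scope here
        run = next((r for r in runs
                    if r["started_at"] <= q["start_time"] <= r["finished_at"]), None)
        if run is None:
            orphaned.append(q)
        else:
            by_run.setdefault(run["run_id"], []).append(q)
    return by_run, orphaned
```

The orphaned list is the practical reason for the third containment step: if it is non-empty after the token is revoked, someone is still using the warehouse connection credentials.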
Practice it
We built this as a GraphLattice Range scenario so teams can rehearse the malicious run, the revoke-token-and-rotate-connection containment, and the obligations when regulated data leaves the warehouse.