One token can empty the lakehouse
A stolen Databricks token bypasses MFA and reads whatever its identity can. No exploit, no password, and gigabytes of data leave through a normal API. Here is how, and the controls that hold.
No password, no MFA prompt, no exploit, and the data still leaves. A long-lived token is a key that skips every door.
What it is
Databricks personal access tokens and service-principal tokens authenticate to the workspace API. A long-lived token that is leaked in code or CI, phished, or taken from a compromised user lets an attacker run notebooks and jobs, query the tables the token’s identity can read, reach cloud storage through the workspace, and export data, all without a password and without an MFA prompt. In MITRE ATT&CK terms, this is T1528 (Steal Application Access Token) leading to T1213 (Data from Information Repositories) and T1567 (Exfiltration Over Web Service).
Why it works
It is a valid token hitting normal APIs, so it blends in with legitimate analytics. A long-lived token is, in effect, an MFA-bypassing data key.
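To make the point concrete, here is a minimal sketch of what an authenticated workspace API request carries. The workspace host and token value are placeholders, and the endpoint path is only illustrative; the request is built but never sent. The only credential on it is the bearer token, so nothing in the request distinguishes the token's owner from someone who copied it.

```python
import urllib.request

# Placeholder values for illustration only; not a real workspace or token.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "dapi-EXAMPLE-NOT-A-REAL-TOKEN"


def build_api_request(path: str) -> urllib.request.Request:
    """Build (but do not send) a workspace API request.

    The bearer token in the Authorization header is the only credential.
    There is no password field and no step where MFA could be prompted.
    """
    return urllib.request.Request(
        WORKSPACE_URL + path,
        headers={"Authorization": f"Bearer {TOKEN}"},
    )


request = build_api_request("/api/2.0/clusters/list")
print(request.full_url)
print(request.get_header("Authorization"))
```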
How to detect it
In the Databricks audit logs, look for a token used from a new IP or location, followed by broad cross-catalog queries and bulk exports the identity does not normally run, and by new scheduled jobs created for persistence.
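As a rough sketch of that detection logic, the function below scans time-ordered audit events and reports risky actions performed from an IP the identity has not used before. The record fields (`identity`, `source_ip`, `action`), the baseline structure, and the action names in the example are assumptions for illustration, not the actual audit log schema; map them to whatever your log pipeline emits.

```python
from collections import defaultdict


def find_suspicious_activity(events, known_ips, risky_actions):
    """Return {identity: [(ip, action), ...]} for risky actions from new IPs.

    events: iterable of dicts with 'identity', 'source_ip', and 'action'
            keys (assumed field names, not the real audit schema).
    known_ips: dict mapping identity -> set of IPs seen in a baseline window.
    risky_actions: set of action names to treat as export or persistence.
    """
    hits = defaultdict(list)
    for event in events:
        identity = event["identity"]
        if event["source_ip"] in known_ips.get(identity, set()):
            continue  # familiar IP for this identity
        if event["action"] in risky_actions:
            hits[identity].append((event["source_ip"], event["action"]))
    return dict(hits)


# Hypothetical sample data: a service identity shows up on an unfamiliar IP.
sample_events = [
    {"identity": "svc-etl", "source_ip": "10.0.0.5", "action": "runCommand"},
    {"identity": "svc-etl", "source_ip": "203.0.113.9", "action": "downloadQueryResult"},
    {"identity": "svc-etl", "source_ip": "203.0.113.9", "action": "create"},
    {"identity": "analyst", "source_ip": "10.0.0.7", "action": "downloadQueryResult"},
]
baseline = {"svc-etl": {"10.0.0.5"}, "analyst": {"10.0.0.7"}}
risky = {"downloadQueryResult", "create"}  # illustrative action names

print(find_suspicious_activity(sample_events, baseline, risky))
```

The analyst's export is not flagged because it comes from a known IP. In practice the baseline window and the list of risky actions would need tuning to keep false positives manageable.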
The fix that holds
Prefer short-lived OAuth over long-lived personal access tokens. Enforce IP access lists and least-privilege catalog and table grants, cap token lifetimes, and alert on token use from new IPs and on bulk exports. Treat a stolen token as a data-exposure incident: revoke it, assess what was read, and meet your notification obligations.
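One way to check the lifetime cap is to audit a token inventory for entries that never expire or that outlive your policy. The sketch below assumes each record has `token_id`, `creation_time`, and `expiry_time` fields in epoch milliseconds, with a negative expiry meaning "never expires". Those field names and the 30-day default are assumptions for illustration, not a statement of the workspace's actual API or of a recommended policy.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def tokens_over_cap(tokens, max_days=30):
    """Return [(token_id, reason), ...] for tokens that break the lifetime cap.

    tokens: iterable of dicts with 'token_id', 'creation_time', and
            'expiry_time' in epoch milliseconds (assumed field names).
            A negative expiry_time is treated as "never expires".
    """
    flagged = []
    for token in tokens:
        expiry = token["expiry_time"]
        if expiry < 0:
            flagged.append((token["token_id"], "no expiry"))
            continue
        lifetime_days = (expiry - token["creation_time"]) / MS_PER_DAY
        if lifetime_days > max_days:
            flagged.append((token["token_id"], f"{lifetime_days:.0f} days"))
    return flagged


# Hypothetical inventory: one non-expiring, one 90-day, one 7-day token.
inventory = [
    {"token_id": "a", "creation_time": 0, "expiry_time": -1},
    {"token_id": "b", "creation_time": 0, "expiry_time": 90 * MS_PER_DAY},
    {"token_id": "c", "creation_time": 0, "expiry_time": 7 * MS_PER_DAY},
]

for token_id, reason in tokens_over_cap(inventory):
    print(token_id, reason)
```

Flagged tokens are candidates for revocation or for replacement with short-lived OAuth credentials.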
Practice it
We built a Databricks token-theft scenario in GraphLattice Range so teams work the detection, the revoke, and the data-exposure call end to end.