Run arbitrary PySpark or Python code on Fabric Spark compute via the Livy API. No notebook artifact is created or persisted; sessions are ephemeral. Sessions have full read/write access to lakehouse Delta tables via Spark SQL.
Requires `az login`. The Livy API requires a token from `az account get-access-token --resource https://api.fabric.microsoft.com`. Tokens from `fab auth` do not work for OneLake storage access inside the Spark session.
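import json
import subprocess

# Fetch a Fabric API token from the az CLI.
# check=True raises CalledProcessError if az fails (for example, when not logged in).
result = subprocess.run(
    ["az", "account", "get-access-token", "--resource", "https://api.fabric.microsoft.com"],
    capture_output=True, text=True, check=True,
)
token = json.loads(result.stdout)["accessToken"]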
Do not output or log the token. Pass it directly to the API call.
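A minimal sketch of passing the token straight into request headers without printing it. The helper name `auth_headers` is hypothetical, and sending `Content-Type: application/json` is an assumption based on the JSON request bodies used by the endpoints.

```python
def auth_headers(token: str) -> dict:
    """Build request headers from an az CLI access token (hypothetical helper)."""
    if not token:
        # Fail early with a message that never contains the token itself.
        raise ValueError("empty access token; run `az login` first")
    return {
        "Authorization": f"Bearer {token}",
        # Assumption: request bodies are JSON, so declare the content type.
        "Content-Type": "application/json",
    }
```

Build the headers once and hand the dict to the HTTP client; avoid printing the dict, since it contains the token.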
1. Create session POST .../sessions {"kind": "pyspark"}
2. Wait for idle GET .../sessions/{id} poll until state: "idle" (~30-90s)
3. Submit code POST .../sessions/{id}/statements {"code": "...", "kind": "pyspark"}
4. Get result GET .../sessions/{id}/statements/{n} poll until state: "available"
5. Delete session DELETE .../sessions/{id} ALWAYS do this
Base URL: https://api.fabric.microsoft.com/v1/workspaces/{wsId}/lakehouses/{lhId}/livyapi/versions/2023-12-01
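The base URL and the endpoint paths can be assembled with a few small helpers. This is a sketch; the function names are hypothetical, and only the URL template itself comes from the text above.

```python
LIVY_API_VERSION = "2023-12-01"


def livy_base_url(ws_id: str, lh_id: str) -> str:
    """Return the Livy base URL for a workspace ID and lakehouse ID."""
    return (
        "https://api.fabric.microsoft.com/v1"
        f"/workspaces/{ws_id}/lakehouses/{lh_id}"
        f"/livyapi/versions/{LIVY_API_VERSION}"
    )


def session_url(base: str, session_id=None) -> str:
    """URL for the session collection, or for one session when an ID is given."""
    url = f"{base}/sessions"
    return url if session_id is None else f"{url}/{session_id}"


def statement_url(base: str, session_id, statement_id=None) -> str:
    """URL for a session's statements, or for one statement when an ID is given."""
    url = f"{session_url(base, session_id)}/statements"
    return url if statement_id is None else f"{url}/{statement_id}"
```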
CRITICAL: Always delete sessions when done. Idle sessions consume Fabric capacity units (CUs). A forgotten session burns compute until it times out (default: 20 minutes). In automation, wrap cleanup in a finally block.
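The five-step lifecycle with cleanup in a `finally` block might look like the sketch below. Everything named here (`http_json`, `run_statement`, the `call` parameter, the polling defaults) is hypothetical. It also assumes that the create-session and submit-statement responses carry an `id` field and that failed sessions or statements report one of the states Apache Livy documents (`dead`, `error`, `killed`, `cancelled`); verify both against the API reference.

```python
import json
import time
import urllib.request

# Assumption: failure states as documented by Apache Livy.
FAILED_STATES = {"dead", "error", "killed", "cancelled"}


def http_json(method, url, token, body=None):
    """Send one JSON request and return the decoded response ({} if empty)."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def run_statement(base_url, token, code, call=http_json,
                  poll_seconds=5, timeout_seconds=600):
    """Create a session, run one statement, and always delete the session."""

    def wait_for(url, wanted):
        # Poll until the resource reaches the wanted state, fails, or times out.
        deadline = time.monotonic() + timeout_seconds
        while True:
            body = call("GET", url, token)
            state = body.get("state")
            if state == wanted:
                return body
            if state in FAILED_STATES:
                raise RuntimeError(f"{url} entered state {state!r}")
            if time.monotonic() > deadline:
                raise TimeoutError(f"{url} did not reach {wanted!r}")
            time.sleep(poll_seconds)

    sessions = f"{base_url}/sessions"
    session_id = call("POST", sessions, token, {"kind": "pyspark"})["id"]
    try:
        wait_for(f"{sessions}/{session_id}", "idle")
        statements = f"{sessions}/{session_id}/statements"
        stmt = call("POST", statements, token,
                    {"code": code, "kind": "pyspark"})
        return wait_for(f"{statements}/{stmt['id']}", "available")
    finally:
        # Runs on success, failure, and timeout alike, so no session is left behind.
        call("DELETE", f"{sessions}/{session_id}", token)
```

The `call` parameter exists so the HTTP layer can be swapped out, for example with a fake in tests or with a client that adds retries.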
WS_ID=$(fab get "Workspace.Workspace" -q "id" | tr -d '"')
LH_ID=$(fab get "Workspace.Workspace/Lakehouse.Lakehouse" -q "id" | tr -d '"')
Submit PySpark or pure Python as statements. The spark object is available automatically.
# Statement payload
{"code": "df = spark.sql('SELECT * FROM products LIMIT 10')\ndf.show()", "kind": "pyspark"}
Results are in output.data["text/plain"] when state: "available" and output.status: "ok".
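A small sketch of building the statement payload and reading the result back. The function names are hypothetical. Serializing with `json.dumps` avoids hand-escaping newlines and quotes in multi-line code.

```python
import json


def statement_payload(code: str) -> str:
    """Serialize PySpark source into the JSON body for POST .../statements."""
    return json.dumps({"code": code, "kind": "pyspark"})


def statement_text(statement: dict) -> str:
    """Return output.data["text/plain"] from a finished, successful statement."""
    state = statement.get("state")
    if state != "available":
        raise RuntimeError(f"statement not finished: state={state!r}")
    output = statement.get("output") or {}
    status = output.get("status")
    if status != "ok":
        raise RuntimeError(f"statement failed: status={status!r}")
    return output["data"]["text/plain"]
```

For example, `statement_payload("df = spark.sql('SELECT * FROM products LIMIT 10')\ndf.show()")` produces a body equivalent to the payload shown above.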
What works:

- `spark.sql("SELECT ...")`: full Spark SQL against lakehouse tables
- `spark.sql("SHOW TABLES")`: metastore access
- `df.write.mode("overwrite").saveAsTable(...)`: write Delta tables

Limitations:

- `deltalake` (delta-rs) is not pre-installed; use Spark SQL instead
- `notebookutils` has limited functionality (no FUSE mount at `/lakehouse/default/`)
- `fab auth` tokens do not work; use the az CLI token

| Scenario | Approach |
|---|---|
| Quick read-only exploration | DuckDB locally (fastest; see using-duckdb skill) |
| Write data back to lakehouse | Livy session or notebook |
| Ephemeral transform; no artifact | Livy session (this skill) |
| Complex multi-cell workflow | Notebook (nb exec or portal) |
| Scheduled ETL | Notebook via fab job run |
| Agent-driven compute (Dagster, orchestrators) | Livy session |
- references/livy-api.md -- Full API reference with endpoints, request/response formats, and error handling
- references/example-script.md -- Complete working script that creates a session, queries data, writes results, and cleans up