技能 编程开发 Databricks环境配置与认证指南

Databricks环境配置与认证指南

v20260423
databricks-install-auth
本文档提供了全面的指南,用于安装和配置Databricks CLI v2和Python SDK。内容涵盖了连接不同Databricks工作区的所有关键步骤,包括使用个人访问令牌(PAT)、OAuth U2M交互式认证以及用于CI/CD流程的安全OAuth M2M服务主体配置。是开发人员将Databricks集成到项目中的必备知识。
获取技能
427 次下载
概览

Databricks Install & Auth

Overview

Set up Databricks CLI v2, Python SDK, and authentication. Covers Personal Access Tokens (legacy), OAuth U2M (interactive), and OAuth M2M (service principal for CI/CD). Databricks strongly recommends OAuth over PATs for production.

Prerequisites

  • Python 3.8+ with pip
  • Databricks workspace URL (e.g., https://adb-1234567890123456.7.azuredatabricks.net)
  • For PAT: User Settings > Developer > Access Tokens in workspace UI
  • For OAuth M2M: Service principal with client ID and secret

Instructions

Step 1: Install Databricks CLI and Python SDK

set -euo pipefail

# Install CLI v2 (standalone binary — recommended)
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh

# Verify CLI
databricks --version

# Install Python SDK
pip install databricks-sdk

# Install Databricks Connect for local Spark development
pip install databricks-connect==14.3.*

Step 2: Configure Authentication

Option A: Personal Access Token (Quick Start)

Generate a PAT in workspace UI: User Settings > Developer > Access Tokens.

# Interactive setup — prompts for host and token
databricks configure --token

# Or set environment variables directly
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi_your_token_here"

Option B: OAuth U2M (User-to-Machine — Interactive)

Opens browser for OAuth consent. Token auto-refreshes (1-hour lifetime).

# Interactive OAuth login
databricks auth login --host https://adb-1234567890123456.7.azuredatabricks.net

# Verify — prints current user
databricks current-user me

Option C: OAuth M2M (Service Principal — CI/CD)

Uses client credentials flow. No browser required. Create a service principal in Account Console > Service Principals, then generate an OAuth secret.

export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"
export DATABRICKS_CLIENT_ID="00000000-0000-0000-0000-000000000000"
export DATABRICKS_CLIENT_SECRET="dose00000000000000000000000000000000"

# Verify
databricks current-user me

Step 3: Configure Profiles for Multi-Workspace

# ~/.databrickscfg — one section per workspace
[DEFAULT]
host  = https://adb-dev-workspace.7.azuredatabricks.net
token = dapi_dev_token_here

[staging]
host  = https://adb-staging-workspace.7.azuredatabricks.net
token = dapi_staging_token_here

[production]
host       = https://adb-prod-workspace.7.azuredatabricks.net
client_id  = 00000000-0000-0000-0000-000000000000
client_secret = dose_prod_secret_here
# Use a specific profile
databricks workspace list / --profile staging

Step 4: Verify SDK Connection

from databricks.sdk import WorkspaceClient

# Auto-detects from env vars or ~/.databrickscfg
w = WorkspaceClient()

me = w.current_user.me()
print(f"Authenticated as: {me.user_name}")
print(f"Workspace: {w.config.host}")
print(f"Auth type: {w.config.auth_type}")

# Quick smoke test — list clusters
clusters = list(w.clusters.list())
print(f"Clusters found: {len(clusters)}")

Step 5: Service Principal Authentication (Python SDK)

from databricks.sdk import WorkspaceClient
from databricks.sdk.config import Config

# Explicit M2M config for CI/CD scripts
config = Config(
    host="https://adb-1234567890123456.7.azuredatabricks.net",
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret="dose00000000000000000000000000000000",
)
w = WorkspaceClient(config=config)

# Or use a named profile
w = WorkspaceClient(profile="production")

Output

  • Databricks CLI v2 installed and on PATH
  • Python SDK (databricks-sdk) installed
  • Authentication credentials stored in env vars or ~/.databrickscfg
  • Connection verified with databricks current-user me

Error Handling

Error Cause Solution
INVALID_TOKEN Token expired or revoked Generate a new PAT or re-run databricks auth login
Could not resolve host Wrong workspace URL Verify URL format: https://adb-<id>.<region>.azuredatabricks.net
PERMISSION_DENIED Token lacks required entitlements Ensure user/SP has workspace access in Account Console
SSL: CERTIFICATE_VERIFY_FAILED Corporate proxy intercepts TLS Set REQUESTS_CA_BUNDLE=/path/to/cert.pem
Connection refused VPN or firewall blocking Check corporate firewall rules for workspace domain
No matching profile Profile name typo in ~/.databrickscfg Run databricks auth profiles to list available profiles

Examples

Account-Level Client (Multi-Workspace Management)

from databricks.sdk import AccountClient

# Account-level operations (manage workspaces, users, billing)
a = AccountClient(
    host="https://accounts.cloud.databricks.com",
    account_id="00000000-0000-0000-0000-000000000000",
    client_id="sp-client-id",
    client_secret="sp-secret",
)

for ws in a.workspaces.list():
    print(f"{ws.workspace_name}: {ws.deployment_name}")

Azure AD Managed Identity

from databricks.sdk import WorkspaceClient

# Uses Azure Default Credential chain (works in Azure VMs, AKS, Functions)
w = WorkspaceClient(
    host="https://adb-1234567890123456.7.azuredatabricks.net",
    azure_workspace_resource_id="/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Databricks/workspaces/<ws>",
)

Resources

Next Steps

After successful auth, proceed to databricks-hello-world for your first cluster and notebook.

信息
Category 编程开发
Name databricks-install-auth
版本 v20260423
大小 6.39KB
更新时间 2026-04-28
语言