Palantir Foundry Best-Practice Architecture

v20260423

palantir-reference-architecture

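This guide provides a comprehensive Palantir Foundry reference architecture for building production-grade enterprise data applications. It covers the full data flow from raw data ingestion through cleaning and modeling to the final Ontology, and gives best practices for project layout, external API integration, and layered security. It is suited to planning and optimizing complex data infrastructure.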

Tags: Palantir Foundry, architecture, data pipeline, Ontology, best practices, data modeling, enterprise

Palantir Reference Architecture

Overview

Production-ready architecture for Foundry-integrated applications. Covers the standard data pipeline pattern (ingest > clean > model > serve), Ontology design, external API integration, and multi-repo project layout.

Prerequisites

Foundry enrollment with project access
Understanding of Ontology concepts (object types, link types, actions)
Familiarity with palantir-core-workflow-a (transforms) and palantir-core-workflow-b (Ontology)

Instructions

Step 1: Data Pipeline Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐     ┌───────────┐
│  Raw Layer  │────>│  Clean Layer │────>│ Model Layer │────>│ Ontology  │
│ (ingested)  │     │  (validated) │     │ (enriched)  │     │ (objects) │
└─────────────┘     └──────────────┘     └─────────────┘     └───────────┘
  ↑ Connectors        @transform_df       @transform_df       Object types
  ↑ REST sync         null checks         joins, aggs         Link types
  ↑ File upload       type casting        ML features         Actions
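
The clean-layer step in the diagram (null checks, type casting) can be sketched in plain Python so it runs anywhere. In Foundry this logic would typically live in a PySpark transform decorated with @transform_df. The column names (order_id, order_date, amount) and the date format are assumptions made for illustration, not part of the original pipeline.

```python
from datetime import date, datetime
from typing import Optional


def clean_order(row: dict) -> Optional[dict]:
    """Return a validated, typed copy of a raw order row, or None to drop it."""
    # Null check: a row without its key fields cannot be modeled downstream.
    if row.get("order_id") is None or row.get("order_date") is None:
        return None
    try:
        return {
            "order_id": str(row["order_id"]),
            # Type casting: raw values are assumed to arrive as strings.
            "order_date": datetime.strptime(row["order_date"], "%Y-%m-%d").date(),
            "amount": float(row["amount"]),
        }
    except (KeyError, TypeError, ValueError):
        # Missing or unparseable values are dropped, not passed to the model layer.
        return None


def clean_orders(rows: list[dict]) -> list[dict]:
    """Apply clean_order to every row and keep only the rows that pass."""
    cleaned = (clean_order(r) for r in rows)
    return [r for r in cleaned if r is not None]


# Hypothetical raw rows: one valid, one with a null key, one with a bad date.
raw_orders = [
    {"order_id": 1, "order_date": "2024-03-01", "amount": "19.90"},
    {"order_id": None, "order_date": "2024-03-02", "amount": "5.00"},
    {"order_id": 3, "order_date": "not-a-date", "amount": "7.50"},
]
clean = clean_orders(raw_orders)
```

Dropping bad rows is only one possible policy; a pipeline that must not lose records could route them to a quarantine dataset instead.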

Step 2: Project Layout (Foundry)

Foundry Project: "Customer Analytics"
├── Datasets/
│   ├── raw/                    # Ingested from sources
│   │   ├── raw_orders          # REST connector → CRM
│   │   ├── raw_customers       # JDBC connector → DB
│   │   └── raw_products        # File upload (CSV/Parquet)
│   ├── clean/                  # Validated, typed
│   │   ├── clean_orders        # Nulls removed, dates parsed
│   │   ├── clean_customers     # Deduped, normalized
│   │   └── clean_products      # Schema enforced
│   └── model/                  # Enriched, analytics-ready
│       ├── order_enriched      # Joined with customer + product
│       ├── customer_360        # Aggregated customer view
│       └── daily_summary       # Time-series aggregation
├── Code Repositories/
│   ├── pipeline-ingestion/     # Connectors and raw → clean
│   ├── pipeline-analytics/     # Clean → model transforms
│   └── ontology-actions/       # Action implementations
└── Ontology/
    ├── Object Types: Customer, Order, Product
    ├── Link Types: Customer→Orders, Order→Products
    └── Actions: createOrder, updateCustomerSegment
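
The raw/clean/model folder convention above can be checked mechanically. The following is a minimal sketch, assuming dataset paths start with their layer folder and that dependencies are available as a plain dictionary; the function names and the rule (an input may sit in the same layer or the one directly before it) are illustrative choices, not a Foundry feature.

```python
LAYER_ORDER = {"raw": 0, "clean": 1, "model": 2}


def layer_of(dataset_path: str) -> str:
    """Extract the layer folder from a path such as 'clean/clean_orders'."""
    return dataset_path.split("/", 1)[0]


def layering_violations(dependencies: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (output, input) pairs that break the raw -> clean -> model flow.

    An input is accepted when it is in the same layer as the output or in
    the layer directly before it. Reading from a later layer, or skipping
    the clean layer entirely, is reported.
    """
    violations = []
    for output, inputs in dependencies.items():
        for source in inputs:
            gap = LAYER_ORDER[layer_of(output)] - LAYER_ORDER[layer_of(source)]
            if gap < 0 or gap > 1:
                violations.append((output, source))
    return violations


# Hypothetical dependency map using the dataset names from the layout above.
deps = {
    "clean/clean_orders": ["raw/raw_orders"],
    "model/order_enriched": ["clean/clean_orders", "clean/clean_customers"],
    "model/customer_360": ["model/order_enriched"],
    "model/daily_summary": ["raw/raw_orders"],  # skips the clean layer
}
violations = layering_violations(deps)
```

Here only `model/daily_summary` is reported, because it reads raw data without passing through a clean dataset.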

Step 3: External API Integration Pattern

# External app consuming Foundry Ontology via Platform SDK
my-external-app/
├── src/
│   ├── foundry/
│   │   ├── client.py           # Singleton FoundryClient
│   │   ├── objects.py          # Object query helpers
│   │   ├── actions.py          # Action wrappers
│   │   └── cache.py            # TTL cache layer
│   ├── api/
│   │   ├── routes.py           # REST endpoints
│   │   └── webhooks.py         # Foundry event handlers
│   └── main.py
├── tests/
│   ├── conftest.py             # Mocked FoundryClient
│   ├── test_objects.py
│   └── test_actions.py
├── .env                        # FOUNDRY_HOSTNAME, credentials
└── requirements.txt
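
As an illustration of what cache.py in this layout might contain, here is a minimal TTL cache sketch. The class name, the get_or_load method, and the fetch_customer stand-in are all hypothetical; a real application would call its SDK client inside the loader.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling loader only on a miss or expiry."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = loader()
        self._entries[key] = (now, value)
        return value


# Hypothetical stand-in for an Ontology lookup; it records each call it receives.
calls = []


def fetch_customer(customer_id: str) -> dict:
    calls.append(customer_id)
    return {"id": customer_id, "segment": "gold"}


fake_now = [0.0]  # fake clock so expiry can be shown without sleeping
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get_or_load("c-1", lambda: fetch_customer("c-1"))   # miss: calls loader
second = cache.get_or_load("c-1", lambda: fetch_customer("c-1"))  # hit: served from cache
fake_now[0] = 61.0
third = cache.get_or_load("c-1", lambda: fetch_customer("c-1"))   # expired: reloads
```

The injectable clock keeps the cache testable; this sketch is not thread-safe and never evicts entries, so a long-running service would need a lock and a size bound.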

Step 4: Ontology Design Patterns

| Pattern | When to Use | Example |
| --- | --- | --- |
| Hub-and-spoke | Central entity with many relationships | Customer → Orders, Tickets, Payments |
| Event sourcing | Audit trail needed | OrderEvent (created, shipped, delivered) |
| Computed properties | Derived values | `totalRevenue` on Customer (sum of orders) |
| Composite actions | Multi-step mutations | `processReturn`: update order + create credit + notify |
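
Two of these patterns can be illustrated outside Foundry with plain dataclasses. This is a conceptual sketch only: the classes below are hypothetical stand-ins for Ontology object types, and the event list is assumed to be in chronological order.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Order:
    order_id: str
    amount: float


@dataclass
class Customer:
    customer_id: str
    orders: list[Order] = field(default_factory=list)  # the Customer -> Orders link

    @property
    def total_revenue(self) -> float:
        """Computed property: derived from linked orders, never stored."""
        return sum(o.amount for o in self.orders)


@dataclass(frozen=True)
class OrderEvent:
    order_id: str
    kind: str  # "created", "shipped", or "delivered"


def current_status(events: list[OrderEvent], order_id: str) -> Optional[str]:
    """Event sourcing: the status is the kind of the latest event for the order."""
    status = None
    for event in events:  # assumed chronological
        if event.order_id == order_id:
            status = event.kind
    return status


customer = Customer("c-1", [Order("o-1", 40.0), Order("o-2", 60.0)])
events = [
    OrderEvent("o-1", "created"),
    OrderEvent("o-2", "created"),
    OrderEvent("o-1", "shipped"),
]
revenue = customer.total_revenue
status = current_status(events, "o-1")
```

Because events are append-only, the full history stays available for audit while the current status is recomputed on demand.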

Step 5: Security Layers

┌──────────────────────────────────────────┐
│ Layer 1: Network (VPN/private link)      │
├──────────────────────────────────────────┤
│ Layer 2: OAuth2 (service user per app)   │
├──────────────────────────────────────────┤
│ Layer 3: Scopes (minimum per app)        │
├──────────────────────────────────────────┤
│ Layer 4: Project roles (Viewer/Editor)   │
├──────────────────────────────────────────┤
│ Layer 5: Marking (data classification)   │
└──────────────────────────────────────────┘
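
Layer 3 (minimum scopes per app) lends itself to a simple audit. The sketch below compares the scopes an app holds with the scopes it needs; the function name is hypothetical and the scope strings are illustrative, so check them against the scopes your enrollment actually defines.

```python
def scope_report(granted: set[str], required: set[str]) -> dict[str, set[str]]:
    """Compare the scopes an app holds with the scopes it actually needs."""
    return {
        "excess": granted - required,   # candidates for removal (least privilege)
        "missing": required - granted,  # calls that need these would be rejected
    }


# Illustrative scope strings for a read-only reporting app (assumed, not verified).
granted = {"api:ontologies-read", "api:ontologies-write", "api:datasets-read"}
required = {"api:ontologies-read"}
report = scope_report(granted, required)
```

A non-empty `excess` set suggests the service user can be narrowed without affecting the app.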

Output

Standard three-layer dataset pipeline (raw > clean > model) that feeds the Ontology
Ontology design with typed objects, links, and actions
External app architecture with caching and webhooks
Security model with 5 defense layers

Error Handling

| Architecture Issue | Symptom | Fix |
| --- | --- | --- |
| Circular dependencies | Builds fail | Restructure pipeline DAG |
| Missing clean layer | Bad data in model | Always validate between raw and model |
| Monolithic transforms | Slow builds | Split into focused transforms |
| No caching | API rate limits | Add TTL cache layer |
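
Circular dependencies can be caught before a build is attempted. The following is a minimal sketch using Python's standard graphlib module, assuming the pipeline's dependencies are available as a mapping from each dataset to the datasets it reads; the build_order function and the dataset names are illustrative.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return a valid build order, or raise ValueError naming the cycle."""
    try:
        # Each key maps to the set of datasets that must be built before it.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # exc.args[1] lists the nodes of the cycle, first node repeated at the end.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"circular dependency: {cycle}") from exc


# A healthy linear pipeline: inputs are listed before the datasets that read them.
order = build_order({
    "clean_orders": {"raw_orders"},
    "order_enriched": {"clean_orders"},
})

# A broken pipeline where two datasets read from each other.
try:
    build_order({"customer_360": {"order_enriched"}, "order_enriched": {"customer_360"}})
    cycle_message = None
except ValueError as err:
    cycle_message = str(err)
```

The error message names the datasets involved, which points directly at the edge to remove when restructuring the DAG.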

Next Steps

For data handling and compliance, see palantir-data-handling.

Info

Category: Data Science

Name: palantir-reference-architecture

Version: v20260423

Size: 6.53KB

Source: jeremylongshore/claude-code-plugins-plus-skills

Updated: 2026-04-28