Using Together AI for Model Inference

together-hello-world · v20260423

This skill is a tutorial on performing common AI inference tasks with Together AI's OpenAI-compatible API. It covers the core features in detail, including text chat completions, streaming output, image generation, and vector embeddings. It is aimed at developers who need to test open-source models, compare different large language models (such as Llama and Mixtral), or integrate generative-AI features into their applications.

Together AI Hello World

Overview

Run chat completions with open-source models via Together AI's OpenAI-compatible API. Supports Llama, Mixtral, Qwen, and 100+ models. Key endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/images/generations.

Instructions

Step 1: Chat Completions

from together import Together

client = Together()  # reads the TOGETHER_API_KEY environment variable

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"},
    ],
    max_tokens=500,
    temperature=0.7,
    top_p=0.9,
)

print(response.choices[0].message.content)
print(f"Tokens: {response.usage.prompt_tokens} in, {response.usage.completion_tokens} out")
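For multi-turn chat, the `messages` list carries the whole conversation: append each reply and follow-up before the next call so the model has context. A minimal sketch (the `append_turn` helper is illustrative, not part of the SDK):

```python
# Conversation history is a plain list of role/content dicts; each API call
# receives the full history, so the model sees every previous turn.
def append_turn(history, role, content):
    history.append({"role": role, "content": content})
    return history

history = [{"role": "system", "content": "You are a helpful coding assistant."}]
append_turn(history, "user", "Write a fibonacci function")
# ... pass messages=history to client.chat.completions.create() here ...
append_turn(history, "assistant", "def fibonacci(n): ...")
append_turn(history, "user", "Now make it iterative")

assert [m["role"] for m in history] == ["system", "user", "assistant", "user"]
```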

Step 2: Streaming

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
    max_tokens=200,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
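If you also need the complete text after streaming it, accumulate the deltas as they arrive. A sketch, shown with stand-in chunk objects since any object exposing `.choices[0].delta.content` works the same way:

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Accumulate streamed delta chunks into the complete response text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content may be None
            parts.append(delta)
    return "".join(parts)

# Stand-in for real stream chunks, used here only for illustration:
def fake_chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

assert collect_stream([fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)]) == "Hello"
```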

Step 3: Image Generation

response = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell-Free",
    prompt="A sunset over mountains, digital art style",
    width=1024, height=768,
    n=1,
)
print(f"Image URL: {response.data[0].url}")

Step 4: Embeddings

response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["Hello world", "Together AI is great"],
)
print(f"Embedding dim: {len(response.data[0].embedding)}")
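Retrieval embeddings are typically compared with cosine similarity. A self-contained sketch in plain Python (no SDK call; the vectors are dummy values standing in for real embedding output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank candidate documents against a query embedding:
query = [0.1, 0.3, 0.5]
docs = {"doc1": [0.1, 0.3, 0.5], "doc2": [0.5, 0.1, 0.0]}
best = max(docs, key=lambda k: cosine_similarity(query, docs[k]))
# best == "doc1" (identical vector, similarity 1.0)
```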

Step 5: Node.js with OpenAI Client

import OpenAI from 'openai';

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,
  baseURL: 'https://api.together.xyz/v1',
});

const chat = await together.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(chat.choices[0].message.content);

Output

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Tokens: 28 in, 45 out

Error Handling

Error           | Cause              | Solution
Model not found | Wrong model ID     | Check docs.together.ai/docs/inference-models
Empty response  | max_tokens too low | Increase max_tokens
429 rate limit  | Too many requests  | Implement backoff
Slow response   | Large model        | Try Turbo variant or smaller model
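For 429s, a generic exponential-backoff wrapper is enough; a sketch (the retry parameters are illustrative, and you should catch the SDK's actual rate-limit exception type rather than a broad one):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, retriable=(Exception,)):
    """Retry `call` with exponential backoff plus jitter on retriable errors."""
    for attempt in range(max_retries):
        try:
            return call()
        except retriable:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```

Usage: wrap the API call in a lambda, e.g. `with_backoff(lambda: client.chat.completions.create(...))`.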

Next Steps

Proceed to together-local-dev-loop for development workflow.

Info

Category: AI
Name: together-hello-world
Version: v20260423
Size: 3.41 KB
Updated: 2026-04-28