Using Together AI for Model Inference

together-hello-world · v20260423

This skill is a tutorial on performing common AI inference tasks with Together AI's OpenAI-compatible API. It covers the core features in detail, including text chat completions, streaming output, image generation, and vector embeddings. It is aimed at developers who need to test open-source models, compare different large language models (such as Llama and Mixtral), or integrate generative-AI features into their applications.

Together AI Hello World

Overview

Run chat completions with open-source models via Together AI's OpenAI-compatible API. Supports Llama, Mixtral, Qwen, and 100+ models. Key endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/images/generations.

Instructions

Step 1: Chat Completions

from together import Together

client = Together()  # reads the TOGETHER_API_KEY environment variable

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"},
    ],
    max_tokens=500,
    temperature=0.7,
    top_p=0.9,
)

print(response.choices[0].message.content)
print(f"Tokens: {response.usage.prompt_tokens} in, {response.usage.completion_tokens} out")
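For multi-turn chat, the `messages` list carries the whole conversation: append each reply and follow-up before the next call so the model has context. A minimal sketch (the `append_turn` helper is illustrative, not part of the SDK):

```python
# Conversation history is a plain list of role/content dicts; each API call
# receives the full history, so the model sees every previous turn.
def append_turn(history, role, content):
    history.append({"role": role, "content": content})
    return history

history = [{"role": "system", "content": "You are a helpful coding assistant."}]
append_turn(history, "user", "Write a fibonacci function")
# ... pass messages=history to client.chat.completions.create() here ...
append_turn(history, "assistant", "def fibonacci(n): ...")
append_turn(history, "user", "Now make it iterative")

assert [m["role"] for m in history] == ["system", "user", "assistant", "user"]
```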

Step 2: Streaming

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
    max_tokens=200,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
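If you also need the complete text after streaming it, accumulate the deltas as they arrive. A sketch, shown with stand-in chunk objects since any object exposing `.choices[0].delta.content` works the same way:

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Accumulate streamed delta chunks into the complete response text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta content may be None
            parts.append(delta)
    return "".join(parts)

# Stand-in for real stream chunks, used here only for illustration:
def fake_chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

assert collect_stream([fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)]) == "Hello"
```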

Step 3: Image Generation

response = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell-Free",
    prompt="A sunset over mountains, digital art style",
    width=1024, height=768,
    n=1,
)
print(f"Image URL: {response.data[0].url}")

Step 4: Embeddings

response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["Hello world", "Together AI is great"],
)
print(f"Embedding dim: {len(response.data[0].embedding)}")
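Retrieval embeddings are typically compared with cosine similarity. A self-contained sketch in plain Python (no SDK call; the vectors are dummy values standing in for real embedding output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank candidate documents against a query embedding:
query = [0.1, 0.3, 0.5]
docs = {"doc1": [0.1, 0.3, 0.5], "doc2": [0.5, 0.1, 0.0]}
best = max(docs, key=lambda k: cosine_similarity(query, docs[k]))
# best == "doc1" (identical vector, similarity 1.0)
```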

Step 5: Node.js with OpenAI Client

import OpenAI from 'openai';

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,
  baseURL: 'https://api.together.xyz/v1',
});

const chat = await together.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(chat.choices[0].message.content);

Output

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Tokens: 28 in, 45 out

Error Handling

Error           | Cause              | Solution
Model not found | Wrong model ID     | Check docs.together.ai/docs/inference-models
Empty response  | max_tokens too low | Increase max_tokens
429 rate limit  | Too many requests  | Implement backoff
Slow response   | Large model        | Try Turbo variant or smaller model
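For 429s, a generic exponential-backoff wrapper is enough; a sketch (the retry parameters are illustrative, and you should catch the SDK's actual rate-limit exception type rather than a broad one):

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, retriable=(Exception,)):
    """Retry `call` with exponential backoff plus jitter on retriable errors."""
    for attempt in range(max_retries):
        try:
            return call()
        except retriable:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```

Usage: wrap the API call in a lambda, e.g. `with_backoff(lambda: client.chat.completions.create(...))`.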

Next Steps

Proceed to together-local-dev-loop for development workflow.

Info

Category: AI
Name: together-hello-world
Version: v20260423
Size: 3.41 KB
Updated: 2026-04-28