Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.
"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."
Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned label, score, and status: "completed". If the detection is still processing, wait. If it failed, say so — do not substitute your own judgment.
Use this skill whenever the user's request involves any of the scenarios below.

Do NOT use it for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.
| User wants to... | Use this | API endpoint |
|---|---|---|
| Check if media is AI-generated / deepfake | Deepfake Detection | POST /detect |
| Know which AI platform made fake audio | Audio Source Tracing | POST /detect with flag |
| Get speaker info, emotion, transcription from media | Intelligence | POST /intelligence |
| Ask questions about a completed detection | Detect Intelligence | POST /detects/{uuid}/intelligence |
| Apply an invisible watermark to media | Watermark Apply | POST /watermark/apply |
| Check if media contains a watermark | Watermark Detect | POST /watermark/detect |
| Verify a speaker's identity against known profiles | Identity Search | POST /identity/search |
| Check if text is AI-generated | Text Detection | POST /text_detect |
| Create a voice identity profile for future matching | Identity Create | POST /identity |
When multiple capabilities apply (e.g., user wants deepfake detection AND intelligence), combine them in a single POST /detect call using the intelligence: true flag rather than making separate requests.
Base URL: `https://app.resemble.ai/api/v2`

All requests require the header: `Authorization: Bearer <RESEMBLE_API_KEY>`
If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API.
When the Resemble MCP server is connected, use these tools instead of raw API calls:
| Tool | Purpose |
|---|---|
| `resemble_docs_lookup` | Get comprehensive docs for any detect sub-topic |
| `resemble_search` | Search across all documentation |
| `resemble_api_endpoint` | Get exact OpenAPI spec for any endpoint |
| `resemble_api_search` | Find endpoints by keyword |
| `resemble_get_page` | Read specific documentation pages |
| `resemble_list_topics` | List all available topics |
Tool usage pattern: Use resemble_docs_lookup with topic "detect" to get the full picture, then resemble_api_endpoint for exact request/response schemas before making API calls.
The core capability. Submit any audio, image, or video for AI-generated content analysis.
```http
POST /detect
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/media.mp4",
  "visualize": true,
  "intelligence": true,
  "audio_source_tracing": true
}
```
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | HTTPS URL to audio, image, or video file |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `visualize` | boolean | No | Generate heatmap/visualization artifacts |
| `intelligence` | boolean | No | Run multimodal intelligence analysis alongside detection |
| `audio_source_tracing` | boolean | No | Identify which AI platform synthesized fake audio |
| `frame_length` | integer | No | Audio/video analysis window size in seconds (1–4, default 2) |
| `start_region` | number | No | Start of segment to analyze (seconds) |
| `end_region` | number | No | End of segment to analyze (seconds) |
| `model_types` | string | No | "image" or "talking_head" (for face-swap detection) |
| `use_reverse_search` | boolean | No | Enable reverse image search (image only) |
| `use_ood_detector` | boolean | No | Enable out-of-distribution detection |
| `zero_retention_mode` | boolean | No | Auto-delete media after detection completes |
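As a sketch, the optional flags above can be assembled into a request body like this. This uses only the Python standard library; `build_detect_payload` is a hypothetical helper for illustration, not part of any Resemble SDK:

```python
import json

def build_detect_payload(url, **options):
    """Assemble a POST /detect body: url is required; every other
    parameter from the table above is optional and passed through."""
    allowed = {"callback_url", "visualize", "intelligence",
               "audio_source_tracing", "frame_length", "start_region",
               "end_region", "model_types", "use_reverse_search",
               "use_ood_detector", "zero_retention_mode"}
    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"unknown detect parameters: {sorted(unknown)}")
    return {"url": url, **options}

body = build_detect_payload("https://example.com/media.mp4",
                            visualize=True, intelligence=True,
                            audio_source_tracing=True)
print(json.dumps(body, indent=2))
```

Rejecting unknown keys up front catches typos like `visualise` before they silently become no-op fields in the request.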
Supported formats:
Detection is asynchronous. Poll GET /detect/{uuid} until status is "completed" or "failed".
```http
GET /detect/{uuid}
Authorization: Bearer <API_KEY>
```
Polling best practice: Start at 2s intervals, back off to 5s, then 10s. Most detections complete within 10–60 seconds depending on media length.
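The suggested schedule can be sketched as a generator. The exact number of polls spent at each step is an assumption for illustration; only the 2s/5s/10s tiers come from the guidance above:

```python
def poll_intervals(max_attempts=20):
    """Yield wait times in seconds: start at 2s, back off to 5s, then 10s."""
    schedule = [2, 2, 2, 5, 5, 5]  # assumed three polls per tier before backing off
    for i in range(max_attempts):
        yield schedule[i] if i < len(schedule) else 10

intervals = list(poll_intervals(10))
# usage sketch: for delay in poll_intervals(): job = fetch_detect(uuid); ...
```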
Audio results — in metrics:
```json
{
  "label": "fake",
  "score": ["0.92", "0.88", "0.95"],
  "consistency": "0.91",
  "aggregated_score": "0.92",
  "image": "https://..."
}
```
- `label`: "fake" or "real" — the verdict
- `score`: Per-chunk prediction scores (array)
- `aggregated_score`: Overall confidence (0.0–1.0, higher = more likely synthetic)
- `consistency`: How consistent the prediction is across chunks
- `image`: Visualization heatmap URL (if `visualize: true`)

Image results — in `image_metrics`:
```json
{
  "type": "ImageAnalysis",
  "label": "fake",
  "score": 0.87,
  "image": "https://...",
  "ifl": { "score": 0.82, "heatmap": "https://..." },
  "reverse_image_search_sources": [
    { "url": "...", "title": "...", "verdict": "known_fake", "similarity": 0.95 }
  ]
}
```
- `label` / `score`: Verdict and confidence
- `ifl`: Invisible Frequency Layer analysis with heatmap
- `reverse_image_search_sources`: Known sources found online (if `use_reverse_search: true`)

Video results — in `video_metrics`:
```json
{
  "label": "fake",
  "score": 0.89,
  "certainty": 0.91,
  "children": [
    {
      "type": "VideoResult",
      "conclusion": "Fake",
      "score": 0.89,
      "timestamp": 2.5,
      "children": [...]
    }
  ]
}
```
Each entry in `children` carries a `timestamp`, `score`, and `certainty`, and may have nested `children` of its own.

Score interpretation — applies to `metrics` (audio) and `video_metrics` (visual):

| Score Range | Interpretation |
|---|---|
| 0.0 – 0.3 | Strong indication of authentic/real media |
| 0.3 – 0.5 | Inconclusive — recommend additional analysis |
| 0.5 – 0.7 | Likely synthetic — flag for review |
| 0.7 – 1.0 | High confidence synthetic/AI-generated |
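A minimal mapping of the bands above. Since the table's ranges overlap at their edges, this sketch assigns boundary values to the higher band, which is an assumption:

```python
def interpret_score(score: float) -> str:
    """Map an aggregated detection score to the interpretation bands
    from the table above (boundaries assigned to the higher band)."""
    if score < 0.3:
        return "strong indication of authentic/real media"
    if score < 0.5:
        return "inconclusive; recommend additional analysis"
    if score < 0.7:
        return "likely synthetic; flag for review"
    return "high confidence synthetic/AI-generated"
```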
Always present scores with context. Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."
Analyze media for rich structured insights independent of or alongside detection.
```http
POST /intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/audio.mp3",
  "json": true
}
```
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | One of | HTTPS URL to media file |
| `media_token` | string | One of | Token from secure upload (alternative to URL) |
| `detect_id` | string | No | UUID of existing detect to associate |
| `media_type` | string | No | "audio", "video", or "image" (auto-detected) |
| `json` | boolean | No | Return structured fields (default: false for audio/video, true for image) |
| `callback_url` | string | No | Webhook for async mode |
Audio/Video structured response (json: true):
- `speaker_info` — speaker description (age, gender)
- `language` / `dialect` — detected language
- `emotion` — detected emotional state
- `speaking_style` — conversational, formal, etc.
- `context` — inferred context of the speech
- `message` — content summary
- `abnormalities` — anomalies detected in the media
- `transcription` — full transcript
- `translation` — translation if non-English
- `misinformation` — misinformation analysis

Image structured response:
- `scene_description` — what the image shows
- `subjects` — people/objects identified
- `authenticity_analysis` — visual authenticity assessment
- `context_and_setting` — environment description
- `abnormalities` — visual anomalies
- `misinformation` — misinformation analysis

After a detection completes, ask natural-language questions about it:
```http
POST /detects/{detect_uuid}/intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "query": "How confident is the model that this audio is fake?"
}
```
This returns a question UUID. Poll GET /detects/{detect_uuid}/intelligence/{question_uuid} until status is "completed" to get the answer.
Good questions to suggest:
Status flow: pending → processing → completed (or failed)
Prerequisite: The detection must have status: "completed". Submitting a question against a processing or failed detection returns a 422 error.
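A small guard that enforces this prerequisite client-side, before the API returns a 422. This is a hypothetical helper; only the `status` values come from the flow above:

```python
def ensure_completed(detection: dict) -> None:
    """Raise before submitting an intelligence question against a
    detection that has not finished (the API would return 422)."""
    status = detection.get("status")
    if status != "completed":
        raise RuntimeError(
            f"detection status is {status!r}; wait for 'completed' "
            "before calling POST /detects/{uuid}/intelligence")
```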
When audio is detected as synthetic (label: "fake"), identify which AI platform generated it.
Enable it by setting audio_source_tracing: true in the POST /detect request.
Result appears in the detection response under audio_source_tracing:
```json
{
  "label": "elevenlabs",
  "error_message": null
}
```
Known source labels include: resemble_ai, elevenlabs, real, and others as the model expands.
Important: Source tracing only runs when audio is labeled as "fake". If the audio is "real", no source tracing result will appear.
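This conditional can be captured in a small helper. It is a sketch: the nesting of `metrics` and `audio_source_tracing` in the polled response follows the result examples above, and is an assumption beyond what those snippets show:

```python
def source_platform(detection: dict):
    """Return the traced source platform, or None when tracing did not
    run (audio labeled 'real', or the flag was not set)."""
    if detection.get("metrics", {}).get("label") != "fake":
        return None  # source tracing only runs on fake audio
    return (detection.get("audio_source_tracing") or {}).get("label")
```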
Standalone query:
- `GET /audio_source_tracings` — list all source tracing reports
- `GET /audio_source_tracings/{uuid}` — get specific report

Apply invisible watermarks to media for provenance tracking, or detect existing watermarks.
```http
POST /watermark/apply
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/image.png",
  "strength": 0.3,
  "custom_message": "my-organization"
}
```
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | HTTPS URL to media file |
| `strength` | number | No | Watermark strength 0.0–1.0 (image/video only, default 0.2) |
| `custom_message` | string | No | Custom message to embed (image/video only, default "resembleai") |
- Use the `Prefer: wait` header for a synchronous response; otherwise poll `GET /watermark/apply/{uuid}/result`
- The result includes a `watermarked_media` URL to download the watermarked file

```http
POST /watermark/detect
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/suspect-image.png"
}
```
Audio detection result:

```json
{ "has_watermark": true, "confidence": 0.95 }
```

Image/Video detection result:

```json
{ "has_watermark": true }
```
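Because the two result shapes differ, a normalizing helper can be handy. A sketch: `confidence` is simply absent for image/video results, so it comes back as None:

```python
def watermark_result(result: dict):
    """Normalize a watermark-detect response: audio includes a
    confidence score, image/video only the boolean flag."""
    found = bool(result.get("has_watermark"))
    return found, result.get("confidence")  # None when no confidence field
```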
Create voice identity profiles and match incoming audio against them.
Beta feature — requires joining the preview program. Inform the user if they encounter access errors.
```http
POST /identity
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/known-speaker.wav",
  "name": "Jane Doe"
}
```

```http
POST /identity/search
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/unknown-speaker.wav",
  "top_k": 5
}
```
Response:
```json
{
  "success": true,
  "item": [
    { "uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08 }
  ]
}
```
Lower distance = closer match. Higher confidence = stronger match.
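A sketch of picking the closest match from the response above. The `max_distance` cutoff of 0.25 is an illustrative assumption, not an API value:

```python
def best_match(response: dict, max_distance: float = 0.25):
    """Return the closest identity match (lowest distance), or None
    when there are no matches or none is close enough."""
    items = response.get("item", [])
    if not items:
        return None
    closest = min(items, key=lambda m: m["distance"])
    return closest if closest["distance"] <= max_distance else None
```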
Detect whether text content is AI-generated or human-written.
Beta feature — requires the `detect_beta_user` role or a billing plan that includes the `dfd_text` product.
```http
POST /text_detect
Content-Type: application/json
Authorization: Bearer <API_KEY>
```
Add the Prefer: wait header for a synchronous (blocking) response. Without it, the job runs asynchronously — poll or use a callback.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `text` | string | Yes | Text to analyze (max 100,000 characters) |
| `thinking` | string | No | Always use "low" (default) |
| `threshold` | float | No | Decision threshold 0.0–1.0 (default: 0.5) |
| `callback_url` | string | No | Webhook URL for async completion notification |
| `privacy_mode` | boolean | No | If true, text content is not stored after analysis |
Response:
```json
{
  "success": true,
  "item": {
    "uuid": "abc-123",
    "status": "completed",
    "prediction": "ai",
    "confidence": 0.91,
    "text_content": "This is some text to analyze.",
    "privacy_mode": false,
    "created_at": "...",
    "updated_at": "..."
  }
}
```
- `prediction`: "ai" or "human" — the verdict
- `confidence`: 0.0–1.0, higher = more confident in the prediction
- `status`: "processing", "completed", or "failed"
If you did not use Prefer: wait, poll until status is "completed" or "failed":
```http
GET /text_detect/{uuid}
Authorization: Bearer <API_KEY>
```

```http
GET /text_detect
Authorization: Bearer <API_KEY>
```
Returns paginated text detections for the team.
If `callback_url` was provided, a POST is sent on completion:

```json
{ "success": true, "item": { ... } }
```

On failure:

```json
{ "success": false, "item": { ... }, "error": "Error message here" }
```
For a comprehensive analysis, combine all capabilities:
```json
{
  "url": "https://example.com/suspect.mp4",
  "visualize": true,
  "intelligence": true,
  "audio_source_tracing": true,
  "use_reverse_search": true
}
```
Then, once the job reaches `status: "completed"`:

- Read `metrics` / `image_metrics` / `video_metrics` for the verdict
- Read `intelligence.description` for structured media analysis
- If the label is `"fake"`, check `audio_source_tracing.label` for the source platform
- Run `POST /watermark/detect` if provenance is relevant

For a fast pass/fail:
```json
{ "url": "..." }
```
Then report just the `label` and `aggregated_score` (audio) or `label` and `score` (image/video).

For creators who want to prove their content is authentic:
- Apply a watermark with `POST /watermark/apply`
- Later, run `POST /watermark/detect` against any copy to verify provenance

Remember that a `"fake"` label with score 0.51 means something very different from one with score 0.95.
- Use `zero_retention_mode` for sensitive media — always suggest this flag when the user indicates the media is sensitive or private
- Combine `intelligence: true` and `audio_source_tracing: true` on the detection call instead of making separate requests

When presenting results to users:
| Error | Cause | Resolution |
|---|---|---|
| 400 | Invalid request body or missing `url` | Check required parameters |
| 401 | Invalid or missing API key | Verify RESEMBLE_API_KEY |
| 404 | Detection UUID not found | Verify the UUID from the creation response |
| 422 | Detection not completed (for Intelligence) | Wait for detection to reach completed status |
| 429 | Rate limited | Back off and retry with exponential delay |
| 500 | Server error | Retry once, then report to user |
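The 429 row above can be handled with a simple exponential-backoff wrapper. This is a sketch; `call` is any hypothetical function returning a status code and parsed body, and `sleep` is injectable so the wrapper can be tested without waiting:

```python
import time

def with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry a callable on rate limiting (429) with exponential delay;
    other statuses are returned immediately to the caller."""
    for attempt in range(max_attempts):
        status, body = call()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return status, body  # still rate limited after all attempts
```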
- Set `zero_retention_mode: true` to auto-delete media after analysis. The URL is redacted and `media_deleted` is set to true post-completion.
- Set `privacy_mode: true` on text detection to prevent text content from being stored after analysis.
- If you provide a `callback_url`, ensure the endpoint is HTTPS and authenticated on the receiving end.