
Async Speech Transcription and Intelligence Analysis

v20260423
assemblyai-core-workflow-a
Execute comprehensive, asynchronous transcription workflows using AssemblyAI. This tool handles file upload, queues jobs, and provides advanced audio intelligence features. Capabilities include speaker diarization, sentiment analysis, named entity recognition (NER), topic categorization (IAB), content safety moderation, and sensitive PII redaction. Ideal for processing meeting recordings, interviews, and large volumes of spoken content.

AssemblyAI Core Workflow A — Async Transcription

Overview

Core workflow: submit audio for async transcription with audio intelligence features. The SDK handles file upload (for local files), queues the transcription job, and polls until completion.

Prerequisites

  • assemblyai package installed
  • API key configured in ASSEMBLYAI_API_KEY

Instructions

Step 1: Basic Async Transcription

import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// Remote URL — SDK queues and polls automatically
const transcript = await client.transcripts.transcribe({
  audio: 'https://example.com/meeting-recording.mp3',
});

console.log(transcript.text);
console.log(`Duration: ${transcript.audio_duration}s`);
console.log(`Words: ${transcript.words?.length}`);
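Before reading any fields, it's worth guarding on the job status: failed jobs come back with `status: 'error'` and an `error` message on the transcript object rather than throwing. A minimal sketch — `TranscriptLike` and `assertCompleted` are our own simplified stand-ins, not SDK types:

```typescript
// Simplified stand-in for the SDK's transcript shape (status + error fields).
interface TranscriptLike {
  id: string;
  status: 'queued' | 'processing' | 'completed' | 'error';
  error?: string | null;
}

// Throw if the job failed or hasn't finished, so downstream code
// never reads fields from an incomplete transcript.
function assertCompleted(t: TranscriptLike): void {
  if (t.status === 'error') {
    throw new Error(`Transcription ${t.id} failed: ${t.error ?? 'unknown error'}`);
  }
  if (t.status !== 'completed') {
    throw new Error(`Transcription ${t.id} is still ${t.status}`);
  }
}
```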

Step 2: Local File Upload

// The SDK uploads the file and transcribes in one call
const transcript = await client.transcripts.transcribe({
  audio: './recordings/interview.wav',
});

// Or from a buffer/stream
import fs from 'fs';
const buffer = fs.readFileSync('./recordings/interview.wav');
const transcript2 = await client.transcripts.transcribe({
  audio: buffer,
});

Step 3: Speaker Diarization

const transcript = await client.transcripts.transcribe({
  audio: audioUrl,
  speaker_labels: true,
  speakers_expected: 3,  // Optional: hint for expected speaker count
});

// Utterances are grouped by speaker
for (const utterance of transcript.utterances ?? []) {
  console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
  // Speaker A: Good morning, thanks for joining.
  // Speaker B: Happy to be here.
}
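Because each utterance carries `start`/`end` timestamps in milliseconds, the diarized output rolls up naturally into per-speaker talk time. A sketch over a simplified utterance shape (`UtteranceLike` is our own reduced type, not the SDK's):

```typescript
// Reduced utterance shape: the fields used here exist on SDK utterances.
interface UtteranceLike {
  speaker: string;
  start: number; // ms
  end: number;   // ms
}

// Sum speaking time (ms) per speaker across all utterances.
function talkTimeBySpeaker(utterances: UtteranceLike[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const u of utterances) {
    totals[u.speaker] = (totals[u.speaker] ?? 0) + (u.end - u.start);
  }
  return totals;
}

// Usage: talkTimeBySpeaker(transcript.utterances ?? [])
```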

Step 4: Full Audio Intelligence Stack

const transcript = await client.transcripts.transcribe({
  audio: audioUrl,

  // Speaker identification
  speaker_labels: true,

  // Content analysis
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  iab_categories: true,       // Topic detection (IAB taxonomy)
  content_safety: true,        // Flag sensitive content
  summarization: true,
  summary_model: 'informative',
  summary_type: 'bullets',

  // Formatting
  punctuate: true,
  format_text: true,
  language_code: 'en',

  // Word boost for domain terms
  word_boost: ['AssemblyAI', 'LeMUR', 'transcription'],
  boost_param: 'high',
});

// --- Access results ---

// Sentiment per sentence
for (const s of transcript.sentiment_analysis_results ?? []) {
  console.log(`[${s.sentiment}] ${s.text}`);
  // [POSITIVE] I really enjoyed working on this project.
}

// Named entities
for (const e of transcript.entities ?? []) {
  console.log(`${e.entity_type}: ${e.text}`);
  // person_name: John Smith
  // location: San Francisco
}

// Auto-highlighted key phrases
for (const h of transcript.auto_highlights_result?.results ?? []) {
  console.log(`"${h.text}" (count: ${h.count}, rank: ${h.rank})`);
}

// IAB content categories
const categories = transcript.iab_categories_result?.summary ?? {};
for (const [category, relevance] of Object.entries(categories)) {
  if ((relevance as number) > 0.5) {
    console.log(`Topic: ${category} (${((relevance as number) * 100).toFixed(0)}%)`);
  }
}

// Content safety labels
for (const result of transcript.content_safety_labels?.results ?? []) {
  for (const label of result.labels) {
    console.log(`Safety: ${label.label} (${(label.confidence * 100).toFixed(0)}%)`);
  }
}

// Summary
console.log('Summary:', transcript.summary);
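The per-sentence sentiment results are easy to roll up into an overall distribution, e.g. to flag negative-heavy recordings. A sketch over a simplified result shape — `sentimentCounts` is our helper, though the POSITIVE/NEUTRAL/NEGATIVE labels match what the API returns:

```typescript
type Sentiment = 'POSITIVE' | 'NEUTRAL' | 'NEGATIVE';

// Reduced shape of one sentiment_analysis_results entry.
interface SentimentResultLike {
  sentiment: Sentiment;
  text: string;
}

// Count sentences per sentiment label.
function sentimentCounts(results: SentimentResultLike[]): Record<Sentiment, number> {
  const counts: Record<Sentiment, number> = { POSITIVE: 0, NEUTRAL: 0, NEGATIVE: 0 };
  for (const r of results) {
    counts[r.sentiment] += 1;
  }
  return counts;
}

// Usage: sentimentCounts(transcript.sentiment_analysis_results ?? [])
```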

Step 5: PII Redaction

const transcript = await client.transcripts.transcribe({
  audio: audioUrl,
  redact_pii: true,
  redact_pii_policies: [
    'email_address',
    'phone_number',
    'person_name',
    'credit_card_number',
    'social_security_number',
    'date_of_birth',
  ],
  redact_pii_sub: 'hash',  // Replace PII with hash. Options: 'hash' | 'entity_name'
  redact_pii_audio: true,  // Also generate redacted audio file
});

// Text has PII replaced: "My name is ####" or "My name is [PERSON_NAME]"
console.log(transcript.text);

// Fetch the redacted audio URL (the redacted file takes extra processing time)
if (transcript.redact_pii_audio) {
  const redactedAudio = await client.transcripts.redactedAudio(transcript.id);
  console.log('Redacted audio URL:', redactedAudio.redacted_audio_url);
}

Step 6: Manage Transcripts

// List recent transcripts
const page = await client.transcripts.list({ limit: 20 });
for (const t of page.transcripts) {
  console.log(`${t.id} | ${t.status} | ${t.audio_duration}s`);
}

// Get a specific transcript
const existing = await client.transcripts.get('transcript-id');

// Delete a transcript (GDPR compliance)
await client.transcripts.delete('transcript-id');
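For retention policies, the `created` timestamps on listed transcripts can drive bulk deletion. A hedged sketch: `idsOlderThan` is a hypothetical helper of ours (the `created` ISO field does appear in the list response), and its output would be fed to `client.transcripts.delete`:

```typescript
// Reduced shape of one item from the transcript list response.
interface TranscriptListItemLike {
  id: string;
  created: string; // ISO 8601 timestamp
}

// Select transcript IDs created more than `days` days before `now`.
function idsOlderThan(
  items: TranscriptListItemLike[],
  days: number,
  now: Date = new Date(),
): string[] {
  const cutoffMs = now.getTime() - days * 24 * 60 * 60 * 1000;
  return items
    .filter((t) => new Date(t.created).getTime() < cutoffMs)
    .map((t) => t.id);
}

// Usage sketch:
// for (const id of idsOlderThan(page.transcripts, 30)) {
//   await client.transcripts.delete(id);
// }
```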

Supported Audio Formats

MP3, WAV, FLAC, M4A, OGG, WebM, MP4, AAC. Max file size: 5 GB. Max duration: 10 hours (async). The SDK auto-detects format.
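A cheap client-side pre-check against the format and size limits above can reject bad inputs before spending time on an upload. A sketch — `precheckAudioFile` is our own helper, not part of the SDK:

```typescript
// Formats and limits as documented above.
const SUPPORTED_EXTENSIONS = new Set([
  'mp3', 'wav', 'flac', 'm4a', 'ogg', 'webm', 'mp4', 'aac',
]);
const MAX_FILE_BYTES = 5 * 1024 ** 3; // 5 GB

// Returns an error message, or null if the file looks acceptable.
function precheckAudioFile(path: string, sizeBytes: number): string | null {
  const ext = path.split('.').pop()?.toLowerCase() ?? '';
  if (!SUPPORTED_EXTENSIONS.has(ext)) return `Unsupported format: .${ext}`;
  if (sizeBytes > MAX_FILE_BYTES) return 'File exceeds the 5 GB limit';
  return null;
}
```

Duration (the 10-hour async cap) can't be checked from the filename alone, so that is left to the API's own validation.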

Output

  • Complete transcript with word-level timestamps and confidence scores
  • Speaker-labeled utterances (with speaker_labels: true)
  • Sentiment analysis, entity detection, key phrases, topic categories
  • PII-redacted text and audio
  • Content safety labels for moderation

Error Handling

| Error | Cause | Solution |
| --- | --- | --- |
| `transcript.status === 'error'` | Corrupted audio or unsupported format | Verify the audio file plays locally |
| `download_url must be accessible` | Private or expired URL | Use a publicly accessible URL or upload the local file |
| `Could not process audio` | File too short (<200 ms) or silent | Ensure the audio contains speech |
| `word_boost` has no effect | Misspelled terms or wrong model tier | Check spelling; word boost works with the Best model tier |

Next Steps

For real-time streaming transcription, see assemblyai-core-workflow-b. For LLM-powered analysis of transcripts, see assemblyai-sdk-patterns (LeMUR examples).

Info
Category Data Science
Name assemblyai-core-workflow-a
Version v20260423
Size 6.8KB
Updated At 2026-04-28