Phoneme-Level Pronunciation Training Workflow

v20260423
speak-core-workflow-b
This advanced workflow provides comprehensive pronunciation training by analyzing speech down to the phoneme level. It assesses users' speaking skills, identifies specific weak phonemes (e.g., 'th' vs. 'd'), and runs adaptive drill loops. The system generates detailed weakness reports and recommends targeted practice phrases for continuous accent reduction and fluency improvement.

Speak Core Workflow B: Pronunciation Training

Overview

This is the secondary workflow for Speak: detailed pronunciation training with phoneme-level analysis and adaptive practice. It combines OpenAI's speech recognition with Speak's proprietary proficiency graph to identify weak phonemes and drill them.

Prerequisites

  • Completed speak-core-workflow-a
  • Audio recording capability (WAV, 16 kHz, mono)
  • ffmpeg installed for audio preprocessing
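
The prerequisites call for 16 kHz mono WAV input, with ffmpeg handling the preprocessing. A minimal sketch of the argument list such a conversion might use is shown here. `buildFfmpegArgs` is a hypothetical helper name, the file paths are placeholders, and the 16-bit PCM sample format (`pcm_s16le`) is an assumption, since only the sample rate and channel count are specified.

```typescript
// Hypothetical helper: builds the ffmpeg argument list for converting a
// recording to 16 kHz mono WAV. The 16-bit PCM format is an assumption.
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    '-y',                 // overwrite the output file if it already exists
    '-i', inputPath,      // source recording (any format ffmpeg can decode)
    '-ar', '16000',       // resample to 16 kHz
    '-ac', '1',           // downmix to mono
    '-c:a', 'pcm_s16le',  // 16-bit little-endian PCM (assumed sample format)
    outputPath,
  ];
}

// Placeholder paths for illustration.
const ffmpegArgs = buildFfmpegArgs('./raw/hola.m4a', './recordings/hola-como-estas.wav');
// Equivalent shell command:
//   ffmpeg -y -i ./raw/hola.m4a -ar 16000 -ac 1 -c:a pcm_s16le ./recordings/hola-como-estas.wav
```

In Node.js the array can be passed directly to `spawn('ffmpeg', ffmpegArgs)` from `child_process`, which avoids shell quoting problems with file names that contain spaces.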

Instructions

Step 1: Pronunciation Assessment

import { SpeakClient } from '@speak/language-sdk';

const client = new SpeakClient({
  apiKey: process.env.SPEAK_API_KEY!,
  appId: process.env.SPEAK_APP_ID!,
  language: 'es',
});

// Assess pronunciation of a specific phrase
const result = await client.assessPronunciation({
  audioPath: './recordings/hola-como-estas.wav',
  targetText: 'Hola, ¿cómo estás?',
  language: 'es',
  detailLevel: 'phoneme',
});

console.log(`Overall score: ${result.score}/100`);
for (const word of result.words) {
  const flag = word.score < 70 ? 'WEAK' : 'OK';
  console.log(`  [${flag}] "${word.text}": ${word.score}/100`);
  if (word.phonemes) {
    for (const p of word.phonemes.filter(p => p.score < 70)) {
      console.log(`    Phoneme "${p.symbol}": ${p.score} — ${p.suggestion}`);
    }
  }
}
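
The loop above reads `score`, `words`, `phonemes`, `symbol`, and `suggestion` from the response. The sketch below spells out that shape as TypeScript interfaces inferred only from how Step 1 uses the result. The SDK's real type names and any additional fields are not known here, and `collectWeakPhonemes` is a hypothetical helper that applies the same threshold of 70.

```typescript
// Shapes inferred from the fields Step 1 reads; the SDK's actual types may differ.
interface PhonemeScore { symbol: string; score: number; suggestion?: string; }
interface WordScore { text: string; score: number; phonemes?: PhonemeScore[]; }
interface AssessmentResult { score: number; words: WordScore[]; }

// Hypothetical helper: flattens every phoneme scoring below the threshold.
function collectWeakPhonemes(result: AssessmentResult, threshold: number = 70): PhonemeScore[] {
  const weak: PhonemeScore[] = [];
  for (const word of result.words) {
    for (const p of word.phonemes || []) {
      if (p.score < threshold) weak.push(p);
    }
  }
  return weak;
}

// Illustrative data only, not real API output.
const sampleResult: AssessmentResult = {
  score: 74,
  words: [
    { text: 'Hola', score: 91, phonemes: [{ symbol: 'o', score: 93 }, { symbol: 'l', score: 88 }] },
    { text: 'estás', score: 62, phonemes: [{ symbol: 's', score: 80 }, { symbol: 't', score: 55 }] },
    { text: 'cómo', score: 70 }, // no phoneme detail returned for this word
  ],
};
const weakSymbols = collectWeakPhonemes(sampleResult).map(p => p.symbol);
```

Treating `phonemes` as optional mirrors the `if (word.phonemes)` guard in Step 1, so words without phoneme detail are skipped rather than causing an error.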

Step 2: Adaptive Drill Loop

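// Result of drilling one phrase: the best score reached and the attempts used.
interface DrillResult {
  phrase: string;
  bestScore: number;
  attempts: number;
}

// recordStudentAudio is assumed to be supplied by the host app: it prompts the
// student, records the attempt, and resolves to the path of the saved WAV file.
declare function recordStudentAudio(phrase: string): Promise<string>;

async function pronunciationDrill(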
  client: SpeakClient,
  phrases: string[],
  language: string,
  targetScore: number = 80,
  maxAttempts: number = 3,
) {
  const weakPoints: Map<string, number[]> = new Map();
  const results: DrillResult[] = [];

  for (const phrase of phrases) {
    let bestScore = 0;
    let attempts = 0;

    while (bestScore < targetScore && attempts < maxAttempts) {
      const audioPath = await recordStudentAudio(phrase);
      const result = await client.assessPronunciation({
        audioPath, targetText: phrase, language, detailLevel: 'phoneme',
      });

      bestScore = Math.max(bestScore, result.score);
      attempts++;

      // Track weak phonemes
      for (const word of result.words) {
        for (const p of (word.phonemes || []).filter(p => p.score < 70)) {
          const scores = weakPoints.get(p.symbol) || [];
          scores.push(p.score);
          weakPoints.set(p.symbol, scores);
        }
      }

      if (result.score >= targetScore) {
        console.log(`"${phrase}": PASSED (${result.score}/100, ${attempts} attempts)`);
      } else if (attempts < maxAttempts) {
        console.log(`"${phrase}": ${result.score}/100 — try again`);
      }
    }

    results.push({ phrase, bestScore, attempts });
  }

  return { results, weakPoints };
}
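
The stopping rule in this loop (stop once the best score reaches `targetScore`, or after `maxAttempts`) can be checked without a microphone. The sketch below replays a fixed list of scores through the same condition. `simulateDrill` is a hypothetical test helper, not part of the SDK, and the scores are made up for illustration.

```typescript
// Hypothetical helper: replays pre-set scores through the drill loop's stopping rule.
function simulateDrill(
  attemptScores: number[],
  targetScore: number = 80,
  maxAttempts: number = 3,
): { bestScore: number; attempts: number; passed: boolean } {
  let bestScore = 0;
  let attempts = 0;
  // Same condition as the drill loop, plus a guard for running out of sample scores.
  while (bestScore < targetScore && attempts < maxAttempts && attempts < attemptScores.length) {
    bestScore = Math.max(bestScore, attemptScores[attempts]);
    attempts++;
  }
  return { bestScore, attempts, passed: bestScore >= targetScore };
}

const improving = simulateDrill([62, 85, 90]);  // stops after the second attempt
const stuck = simulateDrill([55, 60, 58, 95]);  // stops after three attempts; the 95 is never reached
```

Because the loop keys on the best score rather than the latest one, a student who passes once is not asked to repeat the phrase, even if a later attempt would have scored lower.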

Step 3: Weakness Report

function generateWeaknessReport(weakPoints: Map<string, number[]>) {
  const report = [...weakPoints.entries()]
    .map(([phoneme, scores]) => ({
      phoneme,
      avgScore: Math.round(scores.reduce((a, b) => a + b, 0) / scores.length),
      occurrences: scores.length,
    }))
    .sort((a, b) => a.avgScore - b.avgScore);

  console.log('\n=== Pronunciation Weakness Report ===');
  for (const entry of report.slice(0, 10)) {
    const bar = '█'.repeat(Math.round(entry.avgScore / 10));
    console.log(`  ${entry.phoneme.padEnd(5)} ${bar} ${entry.avgScore}/100 (${entry.occurrences}x)`);
  }
  return report;
}
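
Step 4 takes a plain list of phoneme symbols, so the report has to be reduced to a few targets first. A minimal sketch, assuming report entries carry the `phoneme`, `avgScore`, and `occurrences` fields built above. `pickPracticeTargets` and its `minOccurrences` filter are illustrative additions rather than part of the original workflow, and the sample data is invented.

```typescript
interface ReportEntry { phoneme: string; avgScore: number; occurrences: number; }

// Hypothetical helper: keeps phonemes missed at least `minOccurrences` times
// and returns the `limit` lowest-scoring symbols, weakest first.
function pickPracticeTargets(
  report: ReportEntry[],
  limit: number = 3,
  minOccurrences: number = 2,
): string[] {
  return report
    .filter(entry => entry.occurrences >= minOccurrences) // filter returns a copy, so sorting is safe
    .sort((a, b) => a.avgScore - b.avgScore)
    .slice(0, limit)
    .map(entry => entry.phoneme);
}

// Illustrative data only.
const sampleReport: ReportEntry[] = [
  { phoneme: 'ð', avgScore: 66, occurrences: 2 },
  { phoneme: 'r', avgScore: 48, occurrences: 4 },
  { phoneme: 'x', avgScore: 52, occurrences: 1 },
  { phoneme: 'β', avgScore: 68, occurrences: 2 },
  { phoneme: 'ɾ', avgScore: 61, occurrences: 3 },
];
const practiceTargets = pickPracticeTargets(sampleReport);
```

The occurrence filter is a design choice: a phoneme that scored low only once may reflect a noisy recording rather than a consistent weakness, so it is left out of the practice set here.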

Step 4: Targeted Practice Generator

async function generateTargetedPractice(
  client: SpeakClient,
  weakPhonemes: string[],
  language: string,
) {
  // Request phrases that emphasize specific phonemes
  const practice = await client.getPracticePhrasesForPhonemes({
    phonemes: weakPhonemes,
    language,
    difficulty: 'progressive', // Start easy, increase complexity
    count: 10,
  });

  console.log('Targeted practice phrases:');
  for (const phrase of practice.phrases) {
    console.log(`  "${phrase.text}" — targets: ${phrase.targetPhonemes.join(', ')}`);
  }
  return practice;
}

Workflow Comparison

| Aspect | Workflow A (Conversation) | Workflow B (Pronunciation) |
|---|---|---|
| Focus | Natural dialogue | Phoneme accuracy |
| Feedback | Grammar + vocabulary | Phoneme scores + mouth position |
| Sessions | 5-15 min conversations | 2-5 min drills |
| Scoring | Overall fluency | Per-phoneme breakdown |
| Use case | Communication practice | Accent reduction |

Output

  • Phoneme-level pronunciation scores
  • Adaptive drill loop with retry on weak phrases
  • Weakness report showing problematic phonemes
  • Targeted practice phrase generation
  • Progress tracking over multiple sessions

Error Handling

| Error | Cause | Solution |
|---|---|---|
| Audio too short | Recording shorter than 0.5 s | Re-record; at least 0.5 s of audio is required |
| Background noise | Poor recording environment | Prompt the user to move to a quieter location |
| Phoneme not detected | Unclear speech | Ask the user to slow down and articulate |
| Score always low | Microphone quality | Test with known-good audio first |
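
The 0.5 s minimum can be checked locally before a recording is submitted. The sketch below estimates duration from the size of the PCM payload, assuming uncompressed audio in the 16 kHz mono format named in the prerequisites. `pcmDurationSeconds` and `isLongEnough` are hypothetical helpers, and the 16-bit sample size is an assumption the source does not state.

```typescript
// duration = payload bytes / (sample rate * channels * bytes per sample)
function pcmDurationSeconds(
  dataBytes: number,
  sampleRate: number = 16000,
  channels: number = 1,
  bytesPerSample: number = 2, // 16-bit PCM (assumed)
): number {
  return dataBytes / (sampleRate * channels * bytesPerSample);
}

// Mirrors the "Audio too short" row: anything under 0.5 s is rejected.
function isLongEnough(dataBytes: number, minSeconds: number = 0.5): boolean {
  return pcmDurationSeconds(dataBytes) >= minSeconds;
}

const oneSecond = pcmDurationSeconds(32000); // 32000 / (16000 * 1 * 2)
const tooShort = isLongEnough(8000);         // 0.25 s of audio
const justEnough = isLongEnough(16000);      // exactly 0.5 s of audio
```

Here `dataBytes` means the size of the WAV file's `data` chunk, not the whole file. Header length varies between encoders, so reading the chunk size is safer than subtracting a fixed offset from the file size.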

Next Steps

For common errors, see speak-common-errors.

Examples

Basic drill: Assess pronunciation of 5 common Spanish phrases, identify weak phonemes, and generate a targeted practice set.

Progress tracking: Run daily pronunciation drills, track phoneme scores over time, and visualize improvement trends.
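
This workflow does not specify a storage format or a trend metric, so the following is only one possible sketch of the progress-tracking example. It assumes each session is saved as a date plus a map of average phoneme scores (for instance, derived from the Step 3 report). `SessionRecord` and `phonemeTrend` are hypothetical names, and the sample sessions are invented.

```typescript
interface SessionRecord {
  date: string;                              // ISO date, e.g. '2026-04-21'
  avgScores: { [phoneme: string]: number };  // average score per phoneme in that session
}

// Change in a phoneme's average score between the first and last session in
// which it appears. Returns null when it appears in fewer than two sessions.
function phonemeTrend(sessions: SessionRecord[], phoneme: string): number | null {
  const ordered = sessions
    .filter(s => phoneme in s.avgScores)             // filter copies, so the input is not reordered
    .sort((a, b) => a.date.localeCompare(b.date));   // ISO dates sort correctly as strings
  if (ordered.length < 2) return null;
  return ordered[ordered.length - 1].avgScores[phoneme] - ordered[0].avgScores[phoneme];
}

// Illustrative data only, deliberately out of date order.
const sessions: SessionRecord[] = [
  { date: '2026-04-23', avgScores: { r: 58, 'ð': 70 } },
  { date: '2026-04-21', avgScores: { r: 45, 'ð': 64 } },
  { date: '2026-04-22', avgScores: { r: 51 } },
];
const trendR = phonemeTrend(sessions, 'r');
const trendEth = phonemeTrend(sessions, 'ð');
const trendX = phonemeTrend(sessions, 'x');
```

Note that the drill loop in Step 2 records only phoneme scores below 70. If sessions are built from that data, a phoneme that stops appearing has most likely improved past the threshold rather than gone unpracticed.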

Info
Name: speak-core-workflow-b
Version: v20260423
Size: 5.6 KB
Updated At: 2026-04-28