Investigate AgentCore runtime sessions by querying CloudWatch Logs Insights, filtering OpenTelemetry noise, and producing structured investigation output.
Key capabilities:
Load these files as needed for detailed guidance:
- When: ALWAYS load before starting an investigation, to ensure the CloudWatch and Application Signals MCP servers are configured. Contains: MCP server configuration for CloudWatch Logs and Application Signals, with setup instructions for Claude Code, Gemini, Codex, and Kiro CLI.
- When: Load when setting up MCP servers for the first time. Contains: Sample MCP configuration with both CloudWatch and Application Signals servers.
- When: ALWAYS load before querying or filtering OTEL spans. Contains: Field extraction priorities, known instrumentation scopes, noise filtering heuristics (DROP/KEEP patterns).
When the user provides a sessionId, resolve it to one or more traceIds first. If the user provides a traceId directly, skip this phase.
Structured:
fields traceId, @timestamp
| filter attributes.session.id = "SESSION_ID"
| stats count(*) as spanCount, min(@timestamp) as firstSeen, max(@timestamp) as lastSeen by traceId
| sort firstSeen asc
| limit 50
Combined log group:
fields @timestamp, @message
| parse @message '"traceId":"*"' as traceId
| parse @message '"session.id":"*"' as sessionId
| filter sessionId = "SESSION_ID" or @message like "SESSION_ID"
| stats earliest(@timestamp) as firstSeen, latest(@timestamp) as lastSeen, count(*) as spanCount by traceId
| sort firstSeen asc
| limit 50
Most recent trace only:
fields traceId
| filter attributes.session.id = "SESSION_ID"
| sort @timestamp desc
| limit 1
Store discovered traceId(s) and use them in ALL subsequent queries.
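As a rough illustration of this phase, the sketch below fills the SESSION_ID placeholder in the structured query and collects distinct traceIds from the result rows. It assumes rows shaped like the `results` list of the CloudWatch Logs GetQueryResults API (each row a list of `{"field", "value"}` cells). The helper names and the allowed-character rule for session IDs are hypothetical, not part of any AgentCore tooling.

```python
import re

SESSION_QUERY = """fields traceId, @timestamp
| filter attributes.session.id = "{session_id}"
| stats count(*) as spanCount, min(@timestamp) as firstSeen, max(@timestamp) as lastSeen by traceId
| sort firstSeen asc
| limit 50"""

# Assumption: session IDs only contain these characters; adjust for your IDs.
_SAFE_ID = re.compile(r"^[A-Za-z0-9_.:\-]+$")


def build_session_query(session_id: str) -> str:
    """Substitute the session ID, rejecting values that could escape the quoted string."""
    if not _SAFE_ID.match(session_id):
        raise ValueError(f"unexpected characters in session id: {session_id!r}")
    return SESSION_QUERY.format(session_id=session_id)


def extract_trace_ids(results: list) -> list:
    """Return distinct traceIds in the order the rows arrived (firstSeen asc)."""
    seen = []
    for row in results:
        values = {cell["field"]: cell["value"] for cell in row}
        trace_id = values.get("traceId")
        if trace_id and trace_id not in seen:
            seen.append(trace_id)
    return seen
```

The returned list can then be substituted for TRACE_ID in later queries, one trace at a time.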
Use describe_log_groups with logGroupNamePrefix /aws/bedrock-agentcore/runtimes to find all runtime log groups.
Log group naming patterns (in priority order):
- /aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>/otel-rt-logs (structured OTEL spans)
- /aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>/[runtime-logs] (stdout/stderr)
- /aws/bedrock-agentcore/runtimes/<agent_id>-<endpoint_name>-DEFAULT (single combined group)
AgentCore runtimes always emit OTEL spans. Some deployments split logs into a dedicated otel-rt-logs sub-group; others write everything into a single combined log group. Both are normal.
| Log Group Layout | Query Strategy |
|---|---|
| Dedicated otel-rt-logs exists | Use structured field queries (traceId, attributes.session.id, etc.) |
| Single combined log group | Try structured fields first — if they return 0 results, use glob-style parse @message |
If a dedicated otel-rt-logs group exists, prefer it for structured queries.
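The selection rule above could be sketched as follows. This is a minimal illustration that assumes the log group names all belong to one agent endpoint and have already been listed (for example via describe_log_groups). The function name and the strategy labels are hypothetical.

```python
from typing import Optional

PREFIX = "/aws/bedrock-agentcore/runtimes/"


def _rank(name: str) -> int:
    """Lower is better, following the priority order of the naming patterns."""
    if name.endswith("/otel-rt-logs"):
        return 0  # structured OTEL spans
    if "/[runtime-logs]" in name:
        return 1  # stdout/stderr
    return 2      # single combined group (e.g. the -DEFAULT suffix)


def choose_strategy(log_group_names: list) -> Optional[tuple]:
    """Return (log_group, strategy), or None if no runtime log group is present."""
    candidates = [n for n in log_group_names if n.startswith(PREFIX)]
    if not candidates:
        return None
    best = min(candidates, key=_rank)
    # Only the dedicated OTEL group is queried with structured fields alone;
    # anything else tries structured fields first, then glob-style parse.
    strategy = "structured" if _rank(best) == 0 else "structured_then_parse"
    return best, strategy
```

A `None` result corresponds to the "No log groups found" case, where the user should be asked for a log group name or region.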
When using parse @message on combined log groups, prefer glob-style parse — it is simpler and avoids escaping issues:
| parse @message '"name":"*"' as spanName
| parse @message '"traceId":"*"' as traceId
| parse @message '"startTimeUnixNano":"*"' as startNano
Regex parse (/pattern/) is valid CloudWatch Logs Insights syntax but requires careful escaping of quotes and special characters inside JSON. If glob-style parse extracts the field you need, use it.
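To make the behaviour of glob-style parse easier to reason about, here is a small Python approximation: each `*` becomes a non-greedy capture group and everything else is matched literally. This is only a local model for testing patterns against sample messages. It is an assumption that the real Logs Insights engine behaves the same way in every edge case.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern":
    """Turn a glob-style parse pattern into a regex with one group per '*'."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("(.*?)".join(literal_parts))


def glob_parse(message: str, pattern: str):
    """Return the captured values as a list, or None when the pattern does not match."""
    match = glob_to_regex(pattern).search(message)
    return list(match.groups()) if match else None


sample = '{"name":"execute_tool search","traceId":"abc123","startTimeUnixNano":"1700000000000000000"}'
print(glob_parse(sample, '"name":"*"'))
print(glob_parse(sample, '"traceId":"*"'))
```

One thing this model makes visible: the pattern captures the first `"name":"` it finds in the message, so if a nested object carries its own name key ahead of the span name, that nested value is what gets extracted.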
Run all 6 query types for a complete investigation. Each query has a structured version (for dedicated otel-rt-logs) and a glob-style parse version (for combined log groups).
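Every query MUST include | limit to prevent context window overflow. Limits by query type, in the order the queries appear below:
| Query | Limit |
|---|---|
| 1. Span overview | limit 50 |
| 2. Span timeline with durations | limit 100 |
| 3. Errors and exceptions | limit 50 |
| 4. Tool calls | limit 100 |
| 5. Token usage | limit 50 |
| 6. Slowest spans | limit 20 |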
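One way to enforce the limit rule before a query is submitted is a small guard like the one below. It is a sketch with a hypothetical helper name: it only inspects a limit clause at the very end of the query, appending one if missing and clamping it if it exceeds the allowed maximum.

```python
import re

# Matches a trailing "| limit N" clause (case-insensitive).
_TRAILING_LIMIT = re.compile(r"\|\s*limit\s+(\d+)\s*$", re.IGNORECASE)


def ensure_limit(query: str, max_rows: int) -> str:
    """Append '| limit max_rows' if absent; clamp a larger trailing limit."""
    stripped = query.rstrip()
    match = _TRAILING_LIMIT.search(stripped)
    if match is None:
        return f"{stripped}\n| limit {max_rows}"
    if int(match.group(1)) <= max_rows:
        return stripped
    return f"{stripped[:match.start()]}| limit {max_rows}"
```

For example, `ensure_limit(query, 100)` leaves a query ending in `| limit 50` unchanged and rewrites one ending in `| limit 500` to `| limit 100`.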
Structured:
fields @timestamp, traceId, spanId, parentSpanId, name, scope.name,
attributes.session.id, attributes.gen_ai.operation.name, attributes.gen_ai.agent.name,
startTimeUnixNano, endTimeUnixNano
| filter traceId = "TRACE_ID"
| sort startTimeUnixNano asc
| limit 50
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| parse @message '"name":"*"' as spanName
| parse @message '"traceId":"*"' as traceId
| parse @message '"spanId":"*"' as spanId
| parse @message '"startTimeUnixNano":"*"' as startNano
| parse @message '"endTimeUnixNano":"*"' as endNano
| sort @timestamp asc
| limit 50
Structured:
fields @timestamp, traceId, spanId, parentSpanId, name, scope.name,
startTimeUnixNano, endTimeUnixNano,
(endTimeUnixNano - startTimeUnixNano) / 1000000 as durationMs,
status.code, attributes.gen_ai.operation.name
| filter traceId = "TRACE_ID"
| filter ispresent(startTimeUnixNano)
| sort startTimeUnixNano asc
| limit 100
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| parse @message '"name":"*"' as spanName
| parse @message '"spanId":"*"' as spanId
| parse @message '"parentSpanId":"*"' as parentSpanId
| parse @message '"startTimeUnixNano":"*"' as startNano
| parse @message '"endTimeUnixNano":"*"' as endNano
| parse @message '"statusCode":"*"' as statusCode
| sort @timestamp asc
| limit 100
Structured:
fields @timestamp, traceId, spanId, name, status.code, status.message,
attributes.error.message, attributes.exception.message, attributes.exception.type
| filter traceId = "TRACE_ID"
| filter status.code = 2 OR ispresent(attributes.error.message) OR ispresent(attributes.exception.message)
| sort @timestamp asc
| limit 50
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| filter @message like /ERROR|exception|Exception|fault|STATUS_CODE_ERROR/
| parse @message '"name":"*"' as spanName
| parse @message '"statusCode":"*"' as statusCode
| parse @message '"startTimeUnixNano":"*"' as startNano
| sort @timestamp asc
| limit 50
Structured:
fields @timestamp, traceId, spanId, name, scope.name,
attributes.gen_ai.operation.name, attributes.tool.name,
startTimeUnixNano, endTimeUnixNano,
(endTimeUnixNano - startTimeUnixNano) / 1000000 as durationMs
| filter traceId = "TRACE_ID"
| filter attributes.gen_ai.operation.name = "execute_tool" OR ispresent(attributes.tool.name) OR name like /tool/
| sort startTimeUnixNano asc
| limit 100
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| filter @message like /tool|execute_tool|function_call/
| parse @message '"name":"*"' as spanName
| parse @message '"startTimeUnixNano":"*"' as startNano
| parse @message '"endTimeUnixNano":"*"' as endNano
| parse @message '"statusCode":"*"' as statusCode
| sort @timestamp asc
| limit 100
Structured:
fields @timestamp, traceId, spanId, name,
attributes.gen_ai.usage.input_tokens, attributes.gen_ai.usage.output_tokens,
attributes.gen_ai.usage.total_tokens, attributes.gen_ai.agent.name
| filter traceId = "TRACE_ID"
| filter ispresent(attributes.gen_ai.usage.total_tokens)
| sort @timestamp asc
| limit 50
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| filter @message like /input_tokens|output_tokens|usage/
| parse @message '"name":"*"' as spanName
| parse @message /"gen_ai\.usage\.input_tokens":\s*"?(?<inputTokens>[0-9]+)/
| sort @timestamp asc
| limit 50
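Once the structured token-usage query returns, the per-span counts usually need to be totalled. The sketch below assumes each result row has already been flattened into a dict keyed by the field names used in the query, with values arriving as strings. Both the row shape and the helper name are assumptions.

```python
def sum_token_usage(rows: list) -> dict:
    """Total input, output, and total tokens across spans, skipping blank values."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for row in rows:
        for key in totals:
            raw = row.get(f"attributes.gen_ai.usage.{key}")
            if raw in (None, ""):
                continue  # span did not report this counter
            totals[key] += int(float(raw))
    return totals
```

Comparing the summed `total_tokens` with `input_tokens + output_tokens` is a quick sanity check that no span reported only part of its usage.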
Structured:
fields @timestamp, traceId, spanId, name,
(endTimeUnixNano - startTimeUnixNano) / 1000000 as durationMs
| filter traceId = "TRACE_ID"
| filter ispresent(endTimeUnixNano)
| sort durationMs desc
| limit 20
Combined log group:
fields @timestamp, @message
| filter @message like "TRACE_ID"
| parse @message '"name":"*"' as spanName
| parse @message '"startTimeUnixNano":"*"' as startNano
| parse @message '"endTimeUnixNano":"*"' as endNano
| sort @timestamp asc
| limit 50
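The combined-log-group version of the latency query returns startNano and endNano as parsed strings and sorts by timestamp, so ranking spans by duration can be done on the client side. A minimal sketch, assuming rows are dicts keyed by the aliases used in that query (the helper name is hypothetical):

```python
def slowest_spans(rows: list, top_n: int = 20) -> list:
    """Return (spanName, durationMs) pairs, slowest first."""
    timed = []
    for row in rows:
        try:
            start = int(row["startNano"])
            end = int(row["endNano"])
        except (KeyError, TypeError, ValueError):
            continue  # parse did not extract both timestamps for this line
        timed.append((row.get("spanName", "<unnamed>"), (end - start) / 1_000_000))
    timed.sort(key=lambda item: item[1], reverse=True)
    return timed[:top_n]
```

Rows that are plain stdout/stderr lines, and therefore lack the timestamp fields, are skipped rather than treated as errors.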
Queries are async — use get_logs_insight_query_results to poll until status is Complete.
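The polling step might look like the loop below. It is a sketch under stated assumptions: `get_results` is a hypothetical callable that wraps get_logs_insight_query_results and returns a dict with `status` and `results` keys, mirroring the CloudWatch Logs GetQueryResults response. The interval and timeout defaults are arbitrary.

```python
import time

# Terminal statuses other than Complete, per the CloudWatch Logs API.
FAILED_STATUSES = {"Failed", "Cancelled", "Timeout"}


def poll_query(get_results, query_id, interval_s=1.0, max_wait_s=60.0, sleep=time.sleep):
    """Poll until the query is Complete and return its result rows."""
    waited = 0.0
    while True:
        response = get_results(query_id)
        status = response.get("status", "Unknown")
        if status == "Complete":
            return response.get("results", [])
        if status in FAILED_STATUSES:
            raise RuntimeError(f"query {query_id} ended with status {status}")
        if waited >= max_wait_s:
            raise TimeoutError(f"query {query_id} still {status} after {max_wait_s}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. On `TimeoutError`, the caller can cancel the query and retry with a narrower time range.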
See otel-span-schema.md for extraction rules, known scopes, and DROP/KEEP heuristics.
After retrieving query results, compute relative offsets from the earliest span's startTimeUnixNano:
[T+0ms] Session started — traceId: abc123
[T+45ms] LLM inference — model: anthropic.claude-v3 — 1,200ms
[T+1,250ms] Tool call: search_documents — 340ms
[T+1,600ms] Tool result: 3 documents found
[T+1,650ms] LLM inference — model: anthropic.claude-v3 — 890ms
[T+2,550ms] Response generated — 200 OK
[T+2,600ms] Session ended — total: 2,600ms
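The offset computation behind a timeline like this can be sketched as follows. The input shape (dicts with name, startTimeUnixNano, endTimeUnixNano) and the function name are assumptions, and the line format is a simplified variant of the example above. Integer arithmetic is used throughout because nanosecond epoch values are too large to represent exactly as floats.

```python
def format_timeline(spans: list) -> list:
    """Return one '[T+offset] name (duration)' line per span, ordered by start time."""
    parsed = sorted(
        (
            (int(s["startTimeUnixNano"]), int(s["endTimeUnixNano"]), s["name"])
            for s in spans
        ),
        key=lambda item: item[0],
    )
    if not parsed:
        return []
    t0 = parsed[0][0]  # earliest span start is T+0
    lines = []
    for start, end, name in parsed:
        offset_ms = (start - t0) // 1_000_000
        duration_ms = (end - start) // 1_000_000
        lines.append(f"[T+{offset_ms:,}ms] {name} ({duration_ms:,}ms)")
    return lines
```

Spans can be passed in any order; the function sorts them by start time before computing offsets.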
| Situation | Action |
|---|---|
| No log groups found | Ask user for log group name or AWS region |
| Query returns 0 results | Widen time range to ±24h, retry. If still empty, try alternate ID fields |
| Session ID not found | Try filtering by requestId, invocationId, traceId variants |
| Query timeout | Use cancel_logs_insight_query, reduce time range, retry |
| Partial results | Note in output, suggest narrower time window |
| Structured field queries return 0 results | Switch to glob-style parse @message queries (see Parse Syntax Guidance) |
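The zero-result rows of the table can be chained into a simple fallback sequence. The sketch below is one possible ordering, not a prescribed one: `run_query` is a hypothetical callable that takes a query string and a time window in hours and returns the result rows, and the attempt labels are made up for illustration.

```python
def first_non_empty(run_query, structured_query, parse_query, narrow_hours=1, wide_hours=24):
    """Try each fallback in turn and return (label, rows) for the first non-empty result."""
    attempts = [
        ("structured", structured_query, narrow_hours),
        ("structured-wide", structured_query, wide_hours),  # widen to +/-24h
        ("parse-wide", parse_query, wide_hours),            # glob-style parse @message
    ]
    for label, query, hours in attempts:
        rows = run_query(query, hours)
        if rows:
            return label, rows
    return None, []
```

Returning the label alongside the rows lets the investigation output record which fallback actually produced data. A `(None, [])` result is the point at which alternate ID fields are worth trying.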