Troubleshooting
Common Issues
This page covers common issues you might encounter when using Agenta observability with either the Python SDK or OpenTelemetry.
Invalid Content Format
You may receive a 500 error with the message "Failed to parse OTLP stream." This happens when you send trace data in JSON format instead of Protobuf.
Agenta's OTLP endpoints accept only the Protobuf format (binary encoding). The server cannot parse JSON payloads. When you configure your OpenTelemetry exporter to use JSON encoding, the request will fail.
Solution
Configure your OpenTelemetry exporter to use Protobuf encoding. Most OpenTelemetry exporters use Protobuf by default, so you typically don't need to specify it explicitly.
For the Python SDK, import OTLPSpanExporter from the proto package (not the json package). The class name is the same in both packages; it is the import path that must contain proto.
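For example, a minimal exporter setup with the Protobuf-over-HTTP package might look like the following sketch. The endpoint path and header values are placeholders: take the exact values from your own Agenta setup.

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Protobuf-over-HTTP exporter: note the "proto" segment in the import path.
exporter = OTLPSpanExporter(
    endpoint="https://cloud.agenta.ai/otlp/v1/traces",  # assumption: your Agenta host plus the /otlp/v1/traces path
    headers={"authorization": "ApiKey YOUR_KEY_HERE"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
```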
For the OpenTelemetry Collector, verify that the encoding field is set to proto or omit it entirely (since proto is the default).
For JavaScript and Node.js, use @opentelemetry/exporter-trace-otlp-proto instead of the JSON variant. Avoid any exporter package with "json" in its name.
Do not set encoding: json in your configuration files. Agenta does not support JSON-encoded OTLP payloads.
Payload Too Large
Your collector may receive a 413 response when posting to /otlp/v1/traces. This means the batch size exceeds the limit. Agenta accepts batches up to 5 MB by default.
Reduce the batch size in your collector configuration. You can also enable compression (such as gzip) to keep requests under the limit.
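If you export directly from the Python SDK rather than through a collector, the same ideas apply there. A sketch with illustrative values (the SDK's default batch size is 512 spans):

```python
from opentelemetry.exporter.otlp.proto.http import Compression
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# gzip-compress each request and cap the spans per batch so individual
# payloads stay well under the 5 MB limit.
exporter = OTLPSpanExporter(
    compression=Compression.Gzip,
    # endpoint and authorization headers configured as in the earlier example
)

processor = BatchSpanProcessor(
    exporter,
    max_export_batch_size=128,  # illustrative; the SDK default is 512 spans per batch
)
```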
Missing Traces in Serverless Functions
Some traces may not appear in the Agenta dashboard when you run observability in serverless environments. This includes AWS Lambda, Vercel Functions, Cloudflare Workers, and Google Cloud Functions.
OpenTelemetry batches spans and exports them in the background, which improves efficiency. Serverless functions, however, can terminate as soon as the handler returns, often before the background export finishes sending trace data to Agenta. The spans get buffered but never exported.
Solution
Call the force_flush() method before your function terminates. This ensures all spans export before the function exits. Import get_tracer_provider from opentelemetry.trace and call force_flush() on it. Place this call in a finally block so it runs even if errors occur.
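A minimal sketch for a Lambda-style handler; the handler signature, function names, and event shape are illustrative:

```python
import agenta as ag
from opentelemetry.trace import get_tracer_provider

ag.init()

@ag.instrument()
def generate_answer(question: str) -> str:
    return f"Answer to: {question}"

def handler(event, context):  # illustrative Lambda-style entry point
    try:
        return generate_answer(event["question"])
    finally:
        # Flush buffered spans before the runtime freezes or terminates the function.
        get_tracer_provider().force_flush()
```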
Traces Not Appearing in UI
First, verify that your AGENTA_API_KEY is set correctly and has the necessary permissions.
Next, check your endpoint configuration. Point to the correct Agenta host. For cloud deployments, use https://cloud.agenta.ai. For self-hosted instances, use your instance URL (such as http://localhost).
Finally, confirm that you call ag.init() before any instrumented functions execute.
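A minimal sketch of this ordering; the AGENTA_HOST variable name and the example function are assumptions, so check the setup instructions for the exact configuration your SDK version expects:

```python
import os

import agenta as ag

# Assumed environment-variable names; set them before the app starts.
os.environ["AGENTA_API_KEY"] = "YOUR_KEY_HERE"
os.environ["AGENTA_HOST"] = "https://cloud.agenta.ai"  # or your self-hosted URL, e.g. http://localhost

ag.init()  # must run before any @ag.instrument()-decorated function executes

@ag.instrument()
def summarize(text: str) -> str:
    return text[:100]
```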
Authentication Errors
You may receive 401 Unauthorized errors when sending traces. Check three things.
First, verify that your API key is correct. Check for typos or missing characters.
Second, confirm that the key has not expired. Some API keys have expiration dates.
Third, ensure you use the correct format. The authorization header should follow this pattern: ApiKey YOUR_KEY_HERE.
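For example, if you build the headers for an OTLP exporter yourself, the value might be assembled like this (the environment-variable name is an assumption):

```python
import os

# The value must follow the "ApiKey <key>" pattern exactly.
otlp_headers = {
    "authorization": f"ApiKey {os.environ['AGENTA_API_KEY']}",  # env var name is an assumption
}
# Pass otlp_headers to your OTLP exporter's headers parameter.
```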
Performance Issues
High memory usage
You can reduce memory pressure in three ways. Enable gzip compression for OTLP exports to shrink payloads. Lower the number of spans sent per batch. Implement sampling to avoid sending 100% of traces in high-volume scenarios.
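A sketch of head-based sampling with the OpenTelemetry SDK; the 10% ratio is an illustrative value to tune for your own traffic:

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Keep roughly 10% of traces while respecting the sampling decision of an
# incoming parent span, so distributed traces stay consistent.
provider = TracerProvider(sampler=ParentBased(TraceIdRatioBased(0.1)))
```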
High latency
Instrumentation should not add significant latency. If it does, check three things. Ensure spans export in the background using async export. Tune your batch size and export intervals to find the right balance. Review custom instrumentation to verify that custom spans are not performing expensive operations.
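A sketch of background export with tuned batch settings; the exporter configuration and the numeric values are illustrative:

```python
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider()
exporter = OTLPSpanExporter()  # endpoint and headers as configured earlier

# BatchSpanProcessor exports in a background thread; avoid SimpleSpanProcessor,
# which exports synchronously on every span end and adds latency to each request.
provider.add_span_processor(
    BatchSpanProcessor(
        exporter,
        schedule_delay_millis=5000,   # illustrative: how often the batch is flushed
        max_export_batch_size=512,    # illustrative: spans per export request
    )
)
```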
OpenTelemetry-Specific Issues
Context propagation not working
Distributed tracing may fail to work across services. Check three things to fix this. Verify that propagators are configured correctly (set OTEL_PROPAGATORS=tracecontext,baggage). Confirm that headers pass between services. Ensure all services use compatible OpenTelemetry versions.
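A sketch of manual propagation with the OpenTelemetry Python API; the span and variable names are illustrative:

```python
from opentelemetry import propagate, trace

tracer = trace.get_tracer(__name__)

# Service A: inject the current trace context into outgoing request headers.
outgoing_headers: dict = {}
propagate.inject(outgoing_headers)  # adds traceparent (and baggage) entries

# Service B: extract the context from the incoming headers and continue the trace.
incoming_context = propagate.extract(outgoing_headers)
with tracer.start_as_current_span("handle-request", context=incoming_context):
    pass  # spans created here nest under the caller's trace
```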
Spans not nesting correctly
Spans may appear flat instead of nested in the trace view. This indicates a context problem. Verify that context passes correctly between functions. Check that you use start_as_current_span with proper context. Make parent-child relationships explicit in your code.
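A sketch of explicit nesting with the OpenTelemetry Python API; the function and span names are illustrative:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def retrieve(query: str) -> list:
    # Started while "generate-answer" is still current, so it becomes its child.
    with tracer.start_as_current_span("retrieve-documents"):
        return ["doc-1", "doc-2"]

def answer(query: str) -> str:
    with tracer.start_as_current_span("generate-answer"):
        docs = retrieve(query)
        return f"Answer based on {len(docs)} documents"
```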
Python SDK-Specific Issues
Decorator not capturing data
The @ag.instrument() decorator may fail to capture inputs and outputs. Check three requirements. The decorator must be the top-most decorator on your function. You must call ag.init() before the function runs. And the function must return a value (printing alone is not enough).
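A minimal sketch of a correctly ordered setup; the function body and the commented-out extra decorator are illustrative:

```python
import agenta as ag

ag.init()  # must run before the instrumented function is called

@ag.instrument()          # top-most decorator
# @some_other_decorator   # hypothetical: any other decorators go below @ag.instrument()
def generate(topic: str) -> str:
    story = f"A short story about {topic}"
    return story          # return the value; print() alone is not captured
```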
Metadata not appearing
Data from ag.tracing.store_meta() may not show in the UI. Call this method only within an instrumented function. Check that the span context is active when you call it. Verify that your data format is JSON-serializable.
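A sketch of the intended usage, assuming store_meta() accepts a dict of JSON-serializable values (check the SDK reference for the exact signature):

```python
import agenta as ag

ag.init()

@ag.instrument()
def rank_results(query: str) -> list:
    results = ["result-a", "result-b", "result-c"]
    # Called inside the instrumented function while its span is active,
    # with JSON-serializable values only. The dict argument is an assumed signature.
    ag.tracing.store_meta({"query_length": len(query), "result_count": len(results)})
    return results
```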
Need More Help?
Check the Agenta documentation for more details about observability concepts. Review our integration guides for framework-specific help. Visit our GitHub issues to report bugs. Join our community for support.
Next steps
Review the setup instructions to ensure correct configuration. Explore distributed tracing to understand how traces work across services. Check the integrations page for your specific framework.