Advise on how to debug Lambda metric errors that don't show up in Logs #364

Open
andreas-venturini opened this issue Jan 29, 2024 · 8 comments

Comments

@andreas-venturini

andreas-venturini commented Jan 29, 2024

This is following a discussion over at awslabs/aws-lambda-rust-runtime#786 where we were initially advised to enable response streaming for our Lambda function url to work around the bug from that issue.

After changing the function invoke mode to response streaming (and setting AWS_LWA_INVOKE_MODE to response_stream) our Lambda function continued to work normally and there were no errors reported in either CloudWatch or X-Ray.
However, Lambda metric reports suddenly started showing an error count (on the chart one can clearly see when buffered mode was changed to response streaming and back).

[image: Lambda metrics chart showing the error count while response streaming was enabled]

Nothing but the invoke mode was changed. These errors are also unrelated to the problematic source file(s) that triggered the bug reported in the linked issue.

We were advised to open an issue about this here.

Any pointers on how we might gain visibility into these errors would be appreciated. We searched our CloudWatch logs using multiple regex patterns, such as filter @message LIKE /ERROR/, to no avail.

@DarthSim

Some info from my side:

I'm digging into the same issue. I've done some load testing using a single source file. I've configured JSON logs for my function and set the most verbose log levels for both application and system logs. I didn't find any errors in the logs, yet the function's monitoring reports errors.

The only error-like log records I see are readiness check failures during the function's initialization. Yet these records aren't treated as errors by monitoring and, in fact, this is normal behavior.

@andreas-venturini andreas-venturini changed the title Advise on how to debug Lambda metric errors that don't show up in CloudWatch Advise on how to debug Lambda metric errors that don't show up in Logs Jan 30, 2024
@DarthSim

I made a couple more tests.

  1. I removed the Lambda adapter from the Docker image and added experimental native Lambda support to the software. The errors didn't disappear.

  2. I built a Docker image with a sample program that just answers OK to every request. The errors didn't disappear. The whole test program code is:

package main

import (
	"net/http"
	"time"

	"github.com/aws/aws-lambda-go/lambdaurl"
)

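func main() {
	// Serve the handler behind the Lambda function URL, without the Lambda adapter.
	lambdaurl.Start(http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		// Simulate a small amount of work before responding.
		time.Sleep(100 * time.Millisecond)
		rw.Header().Set("Content-Type", "text/plain")
		rw.WriteHeader(http.StatusOK)
		rw.Write([]byte("OK"))
	}))
}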

Hence, neither the Lambda adapter nor our software is causing these errors.

@bnusunny
Contributor

Thanks for this information. I will do some tests to verify.

In the meantime, from your test results, it seems like a Lambda service issue. Could you please open a ticket with AWS support?

@bnusunny
Contributor

bnusunny commented Feb 1, 2024

@andreas-venturini @DarthSim It almost recovered. I got 1 or 2 errors out of thousands of invokes. Could you please check if you see the same?

image

@DarthSim

DarthSim commented Feb 1, 2024

Unfortunately, nothing changed in my case. I noticed that the bigger the response the larger the error rate. A function with the code I posted above indeed causes only a couple of errors for thousands of requests. Yet the software that responds with images of a few kilobytes causes tons of errors.

@bnusunny
Contributor

bnusunny commented Feb 2, 2024

Indeed. I see the same. I'm following up with the Lambda team.

@andreas-venturini
Author

@bnusunny has there been any feedback from the Lambda team so far? Thanks

@bnusunny
Contributor

The Lambda team has identified the cause. This should be fixed soon. I will update here when the fixes are rolled out.


3 participants