Improve output documentation regarding retries and brokers #414

inliquid · 2020-03-25T12:13:40Z

We use http_server as the input and http_client as one of the outputs (for part of the message batch). When http_client returns an error, Benthos retries the failed message indefinitely (#415). More significantly, it stops accepting other, valid messages.

Here is the log from when I first send a message that causes http_client to get a 500 error, and then send a message that should not cause a 500 error from the endpoint http_client connects to:

>benthos.exe -c config-addons.yaml
{"@timestamp":"2020-03-25T14:56:29+03:00","@service":"benthos","level":"INFO","component":"benthos.input","message":"Receiving HTTP messages at: http://:8000/api/v1/image-upload/"}
{"@timestamp":"2020-03-25T14:56:29+03:00","@service":"benthos","level":"INFO","component":"benthos","message":"Launching a benthos instance, use CTRL+C to close."} 
{"@timestamp":"2020-03-25T14:56:29+03:00","@service":"benthos","level":"INFO","component":"benthos.output.broker.outputs.0","message":"Sending messages via HTTP requests to: http://127.0.0.1:9000/internal/v1/save-data/"}
{"@timestamp":"2020-03-25T14:56:29+03:00","@service":"benthos","level":"INFO","component":"benthos.output.broker.outputs.1","message":"Writing message parts as files."}
{"@timestamp":"2020-03-25T14:56:29+03:00","@service":"benthos","level":"INFO","component":"benthos","message":"Listening for HTTP requests at: http://0.0.0.0:4195"}
{"@timestamp":"2020-03-25T15:06:50+03:00","@service":"benthos","level":"TRACE","component":"benthos.input","message":"Consumed 2 messages from POST to '/api/v1/image-upload/'."}
{"@timestamp":"2020-03-25T15:06:50+03:00","@service":"benthos","level":"TRACE","component":"benthos.output.broker.outputs.1","message":"Attempting to write 1 messages to 'files'."}
{"@timestamp":"2020-03-25T15:06:50+03:00","@service":"benthos","level":"TRACE","component":"benthos.output.broker.outputs.0","message":"Attempting to write 1 messages to 'http_client'."}
{"@timestamp":"2020-03-25T15:06:50+03:00","@service":"benthos","level":"TRACE","component":"benthos.output.broker.outputs.1","message":"Successfully wrote 1 messages to 'files'."}
{"@timestamp":"2020-03-25T15:06:57+03:00","@service":"benthos","level":"ERROR","component":"benthos.output.broker.outputs.0","message":"Failed to send message to http_client: HTTP request returned unexpected response code (500): 500 Internal Server Error"}
{"@timestamp":"2020-03-25T15:06:57+03:00","@service":"benthos","level":"ERROR","component":"benthos.output","message":"Failed to dispatch fan out message: HTTP request returned unexpected response code (500): 500 Internal Server Error"}
{"@timestamp":"2020-03-25T15:06:57+03:00","@service":"benthos","level":"TRACE","component":"benthos.output.broker.outputs.0","message":"Attempting to write 1 messages to 'http_client'."}
{"@timestamp":"2020-03-25T15:07:31+03:00","@service":"benthos","level":"TRACE","component":"benthos.input","message":"Consumed 2 messages from POST to '/api/v1/image-upload/'."}
{"@timestamp":"2020-03-25T15:07:51+03:00","@service":"benthos","level":"ERROR","component":"benthos.output.broker.outputs.0","message":"Failed to send message to http_client: Post \"http://127.0.0.1:9000/internal/v1/save-data/\": EOF"}
{"@timestamp":"2020-03-25T15:07:51+03:00","@service":"benthos","level":"ERROR","component":"benthos.output","message":"Failed to dispatch fan out message: Post \"http://127.0.0.1:9000/internal/v1/save-data/\": EOF"}
{"@timestamp":"2020-03-25T15:07:51+03:00","@service":"benthos","level":"TRACE","component":"benthos.output.broker.outputs.0","message":"Attempting to write 1 messages to 'http_client'."}
{"@timestamp":"2020-03-25T15:08:23+03:00","@service":"benthos","level":"INFO","component":"benthos","message":"Received SIGTERM, the service is closing."}

According to the logs, the second message batch is received but not processed, and the client (which sends to the input) sees a timeout.

http_server:

  http_server:
    address: :8000
    path: /api/v1/image-upload/
    timeout: 10s
    rate_limit: ""

http_client:

      - http_client:
          # retry_period: 10s
          max_retry_backoff: 30s
          # retries: 3
          backoff_on:
            - 429
            - 500
          url: http://${MONGOD_ADDR:127.0.0.1:9000}/internal/v1/save-data/
          verb: POST
          headers:
            Content-Type: application/json
            X-Image-ID: ${!metadata:image_id}
          rate_limit: ""
          timeout: 5s
          max_in_flight: 1
          batching:
            count: 1
            byte_size: 0
            period: ""
        processors:
          - select_parts:
              parts: [0]

Jeffail · 2020-03-25T19:40:21Z

Hey @inliquid, Benthos never drops messages unless explicitly told to. The http_client output has a finite number of retries; once those are exhausted the attempt is considered failed, but Benthos has no other option for the message, so it tries again.

Eventually new messages will stop being consumed whilst the failed message is blocking. This is intended: the back pressure prevents Benthos from consuming unlimited memory and claiming messages it cannot process.

In your particular case it sounds like you simply want to drop messages if they fail, which you can do with a try broker that lists a drop output after the http_client:

output:
  try:
    - http_client:
        max_retry_backoff: 30s
        retries: 3
        backoff_on:
          - 429
          - 500
        url: http://${MONGOD_ADDR:127.0.0.1:9000}/internal/v1/save-data/
        ... etc ...
    - type: drop

inliquid · 2020-03-26T08:34:02Z

Hi @Jeffail, thanks. Will this work with the processors section attached to the http_client output, or should I add it to the try somehow?

I think this is not very well documented. From the docs alone I had no way to understand that, for these parameters to work, you need an additional, fairly complicated try setup: https://www.benthos.dev/docs/components/outputs/http_client/#retries

Jeffail · 2020-03-26T08:47:49Z

Yeah, I think the outputs about page needs a bit of an update. Ideally it should cover all of these patterns with config examples.

Add your processors to the http_client output like before:

output:
  try:
    - http_client:
        max_retry_backoff: 30s
        retries: 3
        backoff_on:
          - 429
          - 500
        url: http://${MONGOD_ADDR:127.0.0.1:9000}/internal/v1/save-data/
        ... etc ...
      processors:
        - select_parts:
            parts: [0]
    - type: drop
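
For a setup like the one in the original report, where the http_client sits inside a fan-out broker next to a file output, the try can likely be nested as one of the broker's outputs. This is only a sketch under assumptions: the broker pattern is inferred from the "fan out" wording in the log, and the files output settings (its path and its select_parts processor) are hypothetical since they were not posted.

```yaml
output:
  broker:
    pattern: fan_out  # assumed from "Failed to dispatch fan out message" in the log
    outputs:
      # Output 0: attempt the HTTP request; if it still fails after its
      # retries, fall through to the next entry in the try list and drop.
      - try:
          - http_client:
              url: http://${MONGOD_ADDR:127.0.0.1:9000}/internal/v1/save-data/
              verb: POST
              retries: 3
              max_retry_backoff: 30s
              backoff_on:
                - 429
                - 500
            processors:
              - select_parts:
                  parts: [0]
          - type: drop
      # Output 1: hypothetical file output, its real config was not shown.
      - files:
          path: ./out/${!metadata:image_id}  # hypothetical path
        processors:
          - select_parts:
              parts: [1]  # assumed: the second part of the batch
```

With this layout only the http_client branch is allowed to discard a message; a failure in the files output would still be retried as before.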

inliquid · 2020-03-26T13:24:23Z

Thank you. What about the max_in_flight parameter? Does it in any way affect the fact that the whole pipeline is blocked until the retries either succeed or fail?

Jeffail · 2020-03-27T10:54:22Z

If you're consistently attempting payloads that fail without dropping them, then eventually the pipeline will be throttled by the constant retry attempts. If you expect certain payloads to always cause a 500 and want to quickly drop them rather than block the pipeline, then add the status code 500 to the field drop_on.

Edit: If this response is caused by a bad or unexpected payload and you're in control of the server, then ideally you should update it to respond to those payloads with something like a status 400.
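
As a rough illustration of the drop_on suggestion, the http_client from the earlier try example might look like the sketch below. It assumes the same endpoint as before and moves 500 out of backoff_on so the two lists do not overlap. The try/drop wrapper is kept on the assumption that a request failed via drop_on should still be discarded rather than retried by the pipeline; that interaction is not confirmed in this thread.

```yaml
output:
  try:
    - http_client:
        url: http://${MONGOD_ADDR:127.0.0.1:9000}/internal/v1/save-data/
        verb: POST
        retries: 3
        max_retry_backoff: 30s
        backoff_on:
          - 429
        # Treat a 500 as a failure straight away, without retry attempts.
        drop_on:
          - 500
      processors:
        - select_parts:
            parts: [0]
    # Discard whatever the http_client could not deliver.
    - type: drop
```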

inliquid mentioned this issue Mar 25, 2020

Can't prevent http_client from resending error messages indefinitely #415

Closed

Jeffail added the question label Mar 25, 2020

Jeffail added documentation and removed question labels Mar 26, 2020

Jeffail changed the title ~~http_client output blocks everything in case of error~~ Improve output documentation regarding retries and brokers Mar 26, 2020
