Skip to content

Slow first call while using multiple SqsAsyncClient because of underlining io.netty.handler.timeout.ReadTimeoutException #2146

Closed
@amrynsky

Description

@amrynsky

Slow first call while using multiple SqsAsyncClient because of underlining io.netty.handler.timeout.ReadTimeoutException

Details

We are using SqsAsyncClient and observed strange behavior while using multiple clients. In the following code, the first call to client2 is always slow taking ~30sec

SqsAsyncClient client1 = createSqsClient();
createQueue(client1, "test");
createQueue(client1, "test");

SqsAsyncClient client2 = createSqsClient();
createQueue(client2, "test");
createQueue(client2, "test");

Here is a log from our test:

08:38:59.302  [Test worker] INFO [c.p.p.s.a.s.AwsSqsAsyncClientTest] - client1 finished in 404ms
08:38:59.336  [Test worker] INFO [c.p.p.s.a.s.AwsSqsAsyncClientTest] - client1 finished in 34ms
08:39:29.474  [Test worker] INFO [c.p.p.s.a.s.AwsSqsAsyncClientTest] - client2 finished in **30137ms**
08:39:29.499  [Test worker] INFO [c.p.p.s.a.s.AwsSqsAsyncClientTest] - client2 finished in 24ms

in DEBUG logs we see the first HTTP call of the new client always results in io.netty.handler.timeout.ReadTimeoutException . It works after retry and all subsequent calls are also good.

09:03:42.911  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 - R:/127.0.0.1:33447] FLUSH
09:03:42.912  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 - R:/127.0.0.1:33447] WRITE, EmptyLastHttpContent, 0B
09:03:42.912  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 - R:/127.0.0.1:33447] FLUSH
09:03:47.882  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-2] DEBUG [s.a.a.h.n.n.i.IdleConnectionReaperHandler] - Closing unused connection (f8ed65be) because it has been idle for longer than 5000 milliseconds.
09:03:47.885  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-2] DEBUG [i.n.h.l.LoggingHandler] - [id: 0xf8ed65be, L:/127.0.0.1:60688 ! R:/127.0.0.1:33447] INACTIVE
09:03:47.886  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-2] DEBUG [i.n.h.l.LoggingHandler] - [id: 0xf8ed65be, L:/127.0.0.1:60688 ! R:/127.0.0.1:33447] UNREGISTERED
09:04:12.915  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 - R:/127.0.0.1:33447] EXCEPTION: io.netty.handler.timeout.ReadTimeoutException
io.netty.handler.timeout.ReadTimeoutException: null
09:04:12.915  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [s.a.a.h.n.n.i.ResponseHandler] - Exception processing request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=http, host=127.0.0.1, port=33447, encodedPath=, headers=[amz-sdk-invocation-id, amz-sdk-retry, Authorization, Content-Length, Content-Type, Host, User-Agent, X-Amz-Date], queryParameters=[])
io.netty.handler.timeout.ReadTimeoutException: null
09:04:12.923  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [s.a.a.request] - Retryable error detected. Will retry in 55ms. Request attempt number 2
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:205) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:201) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:143) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:125) ~[sdk-core-2.15.26.jar:?]
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
	at software.amazon.awssdk.utils.CompletableFutureUtils.lambda$forwardExceptionTo$0(CompletableFutureUtils.java:74) ~[utils-2.15.26.jar:?]
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$null$0(MakeAsyncHttpRequestStage.java:104) ~[sdk-core-2.15.26.jar:?]
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859) ~[?:?]
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837) ~[?:?]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2152) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$WrappedErrorForwardingResponseHandler.onError(MakeAsyncHttpRequestStage.java:158) ~[sdk-core-2.15.26.jar:?]
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.lambda$exceptionCaught$0(ResponseHandler.java:146) ~[netty-nio-client-2.15.26.jar:?]
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:180) [netty-nio-client-2.15.26.jar:?]
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.exceptionCaught(ResponseHandler.java:146) [netty-nio-client-2.15.26.jar:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.handler.logging.LoggingHandler.exceptionCaught(LoggingHandler.java:205) [netty-handler-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at software.amazon.awssdk.http.nio.netty.internal.UnusedChannelExceptionHandler.exceptionCaught(UnusedChannelExceptionHandler.java:52) [netty-nio-client-2.15.26.jar:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at software.amazon.awssdk.http.nio.netty.internal.FutureCancelHandler.exceptionCaught(FutureCancelHandler.java:47) [netty-nio-client-2.15.26.jar:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireExceptionCaught(CombinedChannelDuplexHandler.java:424) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:92) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.CombinedChannelDuplexHandler$1.fireExceptionCaught(CombinedChannelDuplexHandler.java:145) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:143) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.CombinedChannelDuplexHandler.exceptionCaught(CombinedChannelDuplexHandler.java:231) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.handler.timeout.ReadTimeoutHandler.readTimedOut(ReadTimeoutHandler.java:98) [netty-handler-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.handler.timeout.ReadTimeoutHandler.channelIdle(ReadTimeoutHandler.java:90) [netty-handler-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.handler.timeout.IdleStateHandler$ReaderIdleTimeoutTask.run(IdleStateHandler.java:504) [netty-handler-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:476) [netty-handler-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) [netty-transport-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.51.Final.jar:4.1.51.Final]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: java.io.IOException: Read timed out
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.wrapException(ResponseHandler.java:385) ~[netty-nio-client-2.15.26.jar:?]
	at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.exceptionCaught(ResponseHandler.java:144) ~[netty-nio-client-2.15.26.jar:?]
	... 35 more
Caused by: io.netty.handler.timeout.ReadTimeoutException
09:04:12.929  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 ! R:/127.0.0.1:33447] CLOSE
09:04:12.933  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 ! R:/127.0.0.1:33447] CLOSE
09:04:12.934  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 ! R:/127.0.0.1:33447] INACTIVE
09:04:12.934  trace:[] span:[] [aws-java-sdk-NettyEventLoop-1-5] DEBUG [i.n.h.l.LoggingHandler] - [id: 0x32315beb, L:/127.0.0.1:60690 ! R:/127.0.0.1:33447] UNREGISTERED
09:04:12.982  trace:[] span:[] [sdk-ScheduledExecutor-4-0] DEBUG [s.a.a.request] - Retrying Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=http, host=127.0.0.1, port=33447, encodedPath=, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent], queryParameters=[])

Steps to Reproduce

Here is a code snippet to reproduce the issue.

        @Test
	void createQueueTest() {
		SqsAsyncClient client1 = createSqsClient();

		record(() -> createQueue(client1, "test"));
		record(() -> createQueue(client1, "test"));

		SqsAsyncClient client2 = createSqsClient();

		record(() -> createQueue(client2, "test"));
		record(() -> createQueue(client2, "test"));
	}

	private void createQueue(SqsAsyncClient client, String name) {
		CreateQueueRequest createQueueRequest = CreateQueueRequest.builder()
				.queueName(name)
				.build();

		client.createQueue(createQueueRequest).join();
	}

	private SqsAsyncClient createSqsClient() {
		return SqsAsyncClient.builder()
				.credentialsProvider(provider)
				.endpointOverride(endpoint)
				.region(Region.US_WEST_2)
				.build();
	}

	private void record(Runnable f) {
		Instant start = Instant.now();
		try {
			f.run();
		} finally {
			log.info("finished in " + Duration.between(start, Instant.now()).toMillis() + "ms");
		}
	}

Possible Solution

Need to look into code to understand the root cause

Context

This is not real use case because in our service we are sharing the same client across different components. But it could be relevant for some use cases in the future - for example, creating a client per region. We came across this while writing unit tests where we create a new client in @BeforeEach.

Environment

  • AWS Java SDK version used: 2.15.26, Netty - 4.1.51.Final
  • JDK version used: OpenJDK Runtime Environment (build 14.0.1+7)
  • Operating System and version: Mac_OS_X/10.15.7

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugThis issue is a bug.needs-triageThis issue or PR still needs to be triaged.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions