Timeouts are important. 

But what are timeouts?

Timeouts are settings that a developer configures to limit how long one service waits when it calls another service.

Broadly, there are two types of timeouts.

Connection Timeout 

Say service A is calling service B. Service B has a total of 10 connections; assume all of them are in use. Now, if service A tries to open a connection to service B, service B has no connection to offer. A practical example is calling someone on the phone: it rings and rings, and if no one picks up within the ring time of, say, 30 seconds, the call disconnects. All you can do is try again after some time, hoping someone picks up.

In such a situation, the connection timeout is the maximum time service A will wait to establish a connection to service B.
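As a minimal sketch of what this looks like in code, the Python snippet below uses the standard library's `socket.create_connection`, which accepts a timeout for the connection attempt. The function name `connect_with_timeout` and the 0.5 second default are assumptions made for illustration, not something prescribed by any particular service.

```python
import socket


def connect_with_timeout(host, port, connect_timeout_s=0.5):
    """Try to open a TCP connection, waiting at most connect_timeout_s seconds.

    Returns the connected socket, or None if the connection could not be made.
    The helper name and default value are illustrative assumptions.
    """
    try:
        # The timeout bounds the connection attempt. It also stays set on the
        # returned socket, so later reads use the same limit unless changed.
        return socket.create_connection((host, port), timeout=connect_timeout_s)
    except OSError:
        # Covers both "no answer in time" (socket.timeout) and "refused".
        return None
```

A caller in service A would treat a `None` result like the unanswered phone call: give up on this attempt and retry after some time rather than waiting indefinitely.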

What will happen if you set a high value for the connection timeout, or don’t set one at all?

Service A will keep piling up requests to service B, each one waiting for a connection. This makes it harder for service B to recover from an outage. It might even lead to a cascading failure in service A, because the waiting requests tie up service A's own resources.

What should you do if you see a connection timeout in logs? 

If you see a lot of connection timeouts in logs, you should be worried: service B is not accepting new connections as fast as service A is trying to open them.

As the owner of service A, you should check whether you are making more requests to service B than you told the owner of service B to expect.

As the owner of service B, you may increase the number of pods as an immediate fix.

What is the right value for the connection timeout?

Anything in the three-digit millisecond range should be OK.

Read Timeout 

Say service A is calling service B. The read timeout is the maximum time service A will wait for service B to return data after the connection is made. A practical example is an old-style pay phone, where we dropped in a coin to get 60 seconds of talk time. We call someone by dropping a coin into the phone box, someone picks up, and the call automatically disconnects after 60 seconds, even if the conversation is not over. That’s one example of a read timeout.

Picture taken from Wikipedia.
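To make the read timeout concrete, here is a small Python sketch using only the standard library. The helper `start_slow_server` is a hypothetical stand-in for service B that replies after a delay, and `read_with_timeout` plays the role of service A; both names and all the timing values are assumptions chosen for illustration.

```python
import socket
import threading
import time


def read_with_timeout(host, port, read_timeout_s):
    """Connect, then wait at most read_timeout_s for the first bytes of a reply.

    Returns the bytes received, or None if the read timed out.
    """
    with socket.create_connection((host, port), timeout=1.0) as sock:
        sock.settimeout(read_timeout_s)  # from here on, the limit applies to reads
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None


def start_slow_server(delay_s):
    """Hypothetical service B: accepts one connection and replies after delay_s."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        with conn:
            time.sleep(delay_s)
            try:
                conn.sendall(b"ok")
            except OSError:
                pass  # the caller already gave up and closed the connection
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return port
```

With a server that takes 0.5 seconds to answer, a read timeout of 0.1 seconds makes `read_with_timeout` return `None`, while a read timeout longer than the delay lets the reply through.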

What will happen if you set a high value for the read timeout, or don’t set one at all?

Service A will keep piling up requests to service B, each one waiting for a response. This makes it harder for service B to recover from an outage. It might even lead to a cascading failure in service A, because the waiting requests tie up service A's own resources.

What will happen if you set a low value for the read timeout?

Service A will keep calling service B, but it will give up on each call before service B has had time to return a response.

What should you do if you see a read timeout in logs? 

If you see a lot of read timeouts in logs, it means that the response time of service B has increased.

As the owner of service A, you should find out whether you need to increase the read timeout.

As the owner of service B, you should check the 99th percentile of the API's response time and whether a recent code change has affected it. You should also check whether that API makes external calls and whether there is any issue with those external services.

What is the right value for the read timeout?

It depends on the API. The owner of service B should load test the API and report the 99th-percentile response time to the owner of service A. Service A should set its read timeout to that 99th percentile plus some buffer.
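The calculation above can be sketched in a few lines of Python. This assumes the load test produced a list of response times in milliseconds and uses the nearest-rank definition of a percentile; the function names and the 50 ms default buffer are illustrative assumptions, and the right buffer depends on the API.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def suggested_read_timeout_ms(latencies_ms, buffer_ms=50):
    """99th-percentile latency plus a fixed buffer (the buffer size is an assumption)."""
    return percentile(latencies_ms, 99) + buffer_ms
```

For example, if the load test recorded latencies of 1 ms through 100 ms, one sample each, the 99th percentile is 99 ms, so with a 50 ms buffer the suggested read timeout would be 149 ms.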

