Copyright	(C)	2016	DeNA	Co.,Ltd.	All	Rights	Reserved.	
Programming TCP
for responsiveness
DeNA Co., Ltd.
Kazuho Oku
This deck explains the TCP latency optimization implemented in version 2.1 of the H2O HTTP/2 server.
Background
TCP slow start
■ Initial Congestion Window (IW) = 10
  ⁃ only 10 packets can be sent in the first RTT
  ⁃ used to be IW=3
■ window increase: about 1.5x per RTT (with delayed ACKs)
[Figure: TCP slow start (IW10, MSS1460). x-axis: RTT (1 to 8); y-axis: bytes transmitted (0 to 800,000)]
Why 1.5x?
During slow start, a TCP increments cwnd by at most SMSS bytes
for each ACK received that cumulatively acknowledges new data.
(snip)
The delayed ACK algorithm specified in [RFC1122] SHOULD be
used by a TCP receiver. When using delayed ACKs, a TCP
receiver MUST NOT excessively delay acknowledgments.
Specifically, an ACK SHOULD be generated for at least every
second full-sized segment, and MUST be generated within 500 ms
of the arrival of the first unacknowledged packet.
TCP Congestion Control (RFC 5681)
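The quoted rules can be turned into a rough model of the 1.5x growth. The sketch below is a simplification, not the kernel's algorithm: it assumes the receiver sends exactly one ACK per two full-sized segments, that each ACK grows cwnd by one segment, and that nothing is lost. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Cumulative number of segments sent after `rtts` round trips of slow start,
 * assuming one ACK per two full-sized segments and +1 segment of cwnd per ACK
 * (a simplified reading of RFC 5681, ignoring loss and the delayed-ACK timer). */
unsigned long slow_start_segments(unsigned cwnd, unsigned rtts)
{
    unsigned long total = 0;
    for (unsigned i = 0; i < rtts; ++i) {
        total += cwnd;    /* a full window is sent during this round trip */
        cwnd += cwnd / 2; /* cwnd/2 ACKs arrive, each adding one segment */
    }
    return total;
}

/* Print the cumulative bytes transmitted at the end of each RTT. */
void print_slow_start(unsigned iw, unsigned mss, unsigned rtts)
{
    assert(iw > 0 && mss > 0);
    for (unsigned rtt = 1; rtt <= rtts; ++rtt)
        printf("RTT %u: %lu bytes\n", rtt, slow_start_segments(iw, rtt) * mss);
}
```

Calling `print_slow_start(10, 1460, 8)` ends at 474 segments, i.e. 692,040 bytes after 8 RTTs under these assumptions, which is in the range of the slow start plot for IW10 and MSS1460.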
Flow of the ideal HTTP
■ fastest within the limits of TCP/IP
■ receive a request in 0-RTT, and:
  ⁃ first send CSS/JS*
  ⁃ then send the HTML
  ⁃ then send the images*
*: but only the ones not cached by the browser
[Diagram: client and server; the request and the response complete within 1 RTT]
The reality in HTTP/2
■ TCP establishment: +1 RTT*
■ TLS handshake: +2 RTT**
■ HTML fetch: +1 RTT
■ JS, CSS fetch: +2 RTT***
■ Total: 6 RTT
*: 0 RTT on reconnection
**: 1 RTT on reconnection
***: servers often cannot switch to sending JS, CSS instantly, because of the output already buffered in the TCP send buffer
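As a worked check of the totals above, the per-phase costs can be summed for a first connection and for a reconnection. This is only an illustrative sketch; the struct and function names are hypothetical, and the numbers are the ones listed on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Round trips spent in each phase before JS/CSS arrive. */
struct rtt_budget {
    unsigned tcp_establishment;
    unsigned tls_handshake;
    unsigned html_fetch;
    unsigned js_css_fetch;
};

/* Sum of all phases, in RTTs. */
unsigned total_rtts(const struct rtt_budget *b)
{
    assert(b != NULL);
    return b->tcp_establishment + b->tls_handshake +
           b->html_fetch + b->js_css_fetch;
}
```

With `{1, 2, 1, 2}` the total is 6 RTT; with the reconnection values `{0, 1, 1, 2}` it is 4 RTT. The JS, CSS term is the only one that the HTTP/2 server itself controls.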
[Diagram: client/server sequence with a 1 RTT scale: TCP SYN, TCP SYNACK, TLS Handshake (x4), GET /, HTML, GET css,js, css, js]
Ongoing optimizations
■ TCP Fast Open
  ⁃ initial establishment in 1 RTT
  ⁃ re-establishment in 0 RTT
■ TLS 1.3
  ⁃ initial handshake completes in 1 RTT
  ⁃ resumption in 0 RTT
■ what can be done in the HTTP/2 layer?
Programming TCP for responsiveness
Answer: TCP Urgent Indications (i.e. MSG_OOB)
TCP Urgent Indications
■ out-of-band messaging for TCP
  ⁃ used by telnet!
■ can only send 1 octet
  ⁃ conflicting specs on how to handle multi-octet messages
■ cannot be used for HTTP/2
■ RFC 6093 “recommends against the use of urgent mechanism” (RFC 7414)
Typical sequence of HTTP/2
[Diagram: client/server sequence. The client sends GET /; the server replies with HTTP/2 200 OK and starts streaming the HTML (<!DOCTYPE HTML> … <SCRIPT SRC="jquery.js"> …); the client then sends GET /jquery.js. The diagram marks a 1 RTT interval and a span of response data labeled *.]
When GET /jquery.js arrives, the server needs to switch from sending HTML to sending JS at that very moment, which means that the amount of data sent in * must be smaller than IW.
Buffering in TCP and TLS layer
// ordinary code (non-blocking): keep writing until the socket stops accepting data
while (SSL_get_error(ssl, SSL_write(…)) != SSL_ERROR_WANT_WRITE)
    ;
[Diagram: HTTP/2 frames are wrapped in TLS records, which are written into the TCP send buffer and, beyond it, the BIO buffer. The TCP send buffer is marked with unacked, CWND, and poll threshold; data within CWND is sent immediately, data beyond it is not immediately sent.]
Why do we have buffers?
■ TCP send buffer:
  ⁃ reduces ping-pong between the kernel and the application
■ BIO buffer:
  ⁃ holds data that couldn't be stored in the TCP send buffer
[Diagram: HTTP/2 frames, TLS records, TCP send buffer (markers: unacked, CWND, poll threshold), BIO buffer; sent immediately vs. not immediately sent]
Improvement: poll-then-write
// only call SSL_write when poll notifies the app.
while (poll_for_write(fd) == SOCKET_IS_READY)
    SSL_write(…);
[Diagram: HTTP/2 frames, TLS records, TCP send buffer (markers: unacked, CWND, poll threshold), no BIO buffer; sent immediately vs. not immediately sent]
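A minimal sketch of the poll-then-write idea using plain POSIX calls follows. It is an assumption-laden illustration, not H2O's code: `write_when_ready` is a hypothetical helper, and `send()` stands in for `SSL_write` so that the example needs no TLS library.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Wait until the kernel reports the socket writable, then issue one send.
 * Returns the number of bytes accepted, 0 if the socket did not become
 * writable within timeout_ms, or -1 on error.
 * In a TLS server, SSL_write would take the place of send(). */
ssize_t write_when_ready(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1; /* poll itself failed */
    if (ready == 0 || !(pfd.revents & POLLOUT))
        return 0;  /* not writable: do not generate data yet */
    return send(fd, buf, len, 0);
}
```

The point of the pattern is that the application generates data only after the kernel signals room, so nothing queues up in a user-space buffer behind the TCP send buffer.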
Adjust poll threshold
■ set poll threshold to the end of CWND?
  ⁃ setsockopt(TCP_NOTSENT_LOWAT)
  ⁃ in Linux, the minimum is CWND + 1 octet
    • becomes unstable when set to CWND + 0
[Diagram: HTTP/2 frames, TLS records, TCP send buffer (markers: unacked, CWND, poll threshold); sent immediately vs. not immediately sent]
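Setting the threshold could look like the sketch below. `set_notsent_lowat` is a hypothetical wrapper; the `#ifdef` fallback is an assumption for platforms whose headers do not define `TCP_NOTSENT_LOWAT`.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Ask the kernel to report the socket writable only when the amount of
 * unsent data in the send buffer drops below `octets`.
 * Returns 0 on success, -1 on failure (errno is set). */
int set_notsent_lowat(int fd, unsigned octets)
{
#ifdef TCP_NOTSENT_LOWAT
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                      &octets, sizeof(octets));
#else
    (void)fd;
    (void)octets;
    errno = ENOPROTOOPT; /* option not available on this platform */
    return -1;
#endif
}
```

Under the slide's description, `set_notsent_lowat(fd, 1)` would correspond to the "CWND + 1 octet" setting: poll reports the socket writable once no unsent data remains beyond what the congestion window allows.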
Adjust poll threshold
// only call SSL_write when poll notifies the app.
while (poll_for_write(fd) == SOCKET_IS_READY)
    SSL_write(…);
[Diagram: TCP send buffer with the poll threshold adjusted (markers: unacked, CWND, poll threshold); HTTP/2 frames, TLS records; sent immediately vs. not immediately sent]
Further improvement: read TCP states
// calc size of data to send by calling getsockopt(TCP_INFO)
if (poll_for_write(fd) == SOCKET_IS_READY) {
    capacity = CWND - unacked + TWO_MSS - TLS_overhead;
    SSL_write(prepare_http2_frames(capacity));
}
[Diagram: TCP send buffer (markers: unacked, CWND, poll threshold); HTTP/2 frames, TLS records; sent immediately vs. not immediately sent]
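The capacity formula can be checked with concrete numbers. The sketch below assumes Linux-style values (CWND and unacked counted in packets) and one TLS record per packet; `write_capacity` is a hypothetical name, and the 29-octet overhead used in the example is the AES-128-GCM figure (5 + 8 + 16) from the deck's pseudo-code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Octets of application payload that can go out immediately:
 * (CWND - unacked + 2 MSS), less the TLS overhead of each packet.
 * cwnd_pkts and unacked_pkts are in packets; mss and tls_overhead in octets. */
size_t write_capacity(uint32_t cwnd_pkts, uint32_t unacked_pkts,
                      uint32_t mss, uint32_t tls_overhead)
{
    assert(mss > tls_overhead);
    /* clamp at zero: unacked can momentarily exceed CWND */
    uint32_t sendable = cwnd_pkts > unacked_pkts ? cwnd_pkts - unacked_pkts : 0;
    /* assumption: one TLS record per packet, so each packet loses tls_overhead */
    return (size_t)(sendable + 2) * (mss - tls_overhead);
}
```

For example, with CWND = 10 packets, 4 packets unacked, MSS = 1460 and 29 octets of TLS overhead, the result is (6 + 2) × 1431 = 11,448 octets.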
Negative impact of additional delay
■ increased delay between ACK receipt and data send, since:
  ⁃ traditional approach: completes within the kernel
  ⁃ this approach: the application needs to be notified to generate new data
■ outcome:
  ⁃ increase of CWND becomes slower
  ⁃ leads to slower peak speed?
    • depends on how CWND at peak is calculated
  ⁃ does the kernel use TCP timestamps for the matter?
Countermeasures
■ optimize for responsiveness only when necessary
  ⁃ i.e. when RTT is big and CWND is small
  ⁃ impact of optimization is proportional to unsent_bytes / CWND
■ disable optimization if additional delay is significant
  ⁃ when epoll returns immediately, the estimated additional delay is equal to the time spent by the loop
Configuration Directives
■ http2-latopt-min-rtt
  ⁃ minimum TCP RTT to enable the optimization
  ⁃ default: UINT_MAX (disabled)
■ http2-latopt-max-cwnd
  ⁃ maximum CWND to enable (in octets)
  ⁃ default: 65535
■ http2-max-additional-delay
  ⁃ max. additional delay (as the ratio to TCP RTT)
  ⁃ latopt disabled if the delay is greater
  ⁃ default: 0.1
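How the three thresholds might combine into a single on/off decision is sketched below. This is an illustration, not H2O's implementation: the struct, the function name, and the choice of microseconds as the RTT unit are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the three thresholds described above. */
struct latopt_conf {
    uint32_t min_rtt_us;         /* enable only if RTT is at least this */
    uint32_t max_cwnd_octets;    /* enable only if CWND is at most this */
    double max_additional_delay; /* allowed delay, as a ratio to RTT */
};

/* Decide whether to optimize for responsiveness on this connection. */
bool latopt_should_enable(const struct latopt_conf *conf, uint32_t rtt_us,
                          uint32_t cwnd_octets, uint32_t additional_delay_us)
{
    if (rtt_us < conf->min_rtt_us || cwnd_octets > conf->max_cwnd_octets)
        return false; /* RTT too small or CWND too large to benefit */
    /* disable when the notification delay is significant relative to RTT */
    return (double)additional_delay_us <= conf->max_additional_delay * rtt_us;
}
```

With a minimum RTT of 50 ms, a 65535-octet CWND limit and a 0.1 ratio, a connection at 250 ms RTT stays optimized as long as the measured additional delay is at most 25 ms. A minimum RTT of `UINT32_MAX` disables the optimization for every connection.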
Pseudo-code
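size_t get_suggested_write_size() {
    // read the kernel's view of the connection
    socklen_t tcp_info_len = sizeof(tcp_info);
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp_info, &tcp_info_len) != 0)
        return UNKNOWN;
    // optimize only when RTT is big and CWND is small
    if (tcp_info.tcpi_rtt < min_rtt || tcp_info.tcpi_snd_cwnd > max_cwnd)
        return UNKNOWN;
    // the per-record TLS overhead is known only for some ciphers
    switch (SSL_CIPHER_get_id(SSL_get_current_cipher(ssl))) {
    case TLS1_CK_RSA_WITH_AES_128_GCM_SHA256:
    case …:
        tls_overhead = 5 + 8 + 16; // record header + explicit nonce + GCM tag
        break;
    default:
        return UNKNOWN;
    }
    // packets that still fit in CWND (Linux counts both values in packets)
    packets_sendable = tcp_info.tcpi_snd_cwnd > tcp_info.tcpi_unacked ?
        tcp_info.tcpi_snd_cwnd - tcp_info.tcpi_unacked : 0;
    // add the two-MSS allowance, then subtract the TLS overhead of each packet
    return (packets_sendable + 2) * (tcp_info.tcpi_snd_mss - tls_overhead);
}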
Benchmark (1)
■ conditions:
  ⁃ server in Ireland, client in Tokyo (RTT 250ms)
  ⁃ load tiny JS at the top of a large HTML
■ result: delay decreased from 511ms to 250ms
  ⁃ i.e. JS fetch latency was 2 RTT, became 1 RTT
    • similar results in other environments
Benchmark (2)
■ using the same data as the previous benchmark
■ server: Sakura VPS (Ishikari DC)
[Figure: bar chart "downloading HTML (and JS within)", RTT ~25ms. y-axis: milliseconds (0 to 300); bars for HTML and JS, comparing master and latopt]
Conclusion
■ near-optimal result can be achieved
  ⁃ by adjusting poll threshold and reading TCP states
  ⁃ 1-packet overhead due to restriction in Linux kernel
■ 1-RTT improvement in H2O
  ⁃ estimated 1-RTT improvement per the depth of the load graph
Under the hood
TCP_NOTSENT_LOWAT
■ supported by Linux, OS X
■ on Linux:
  ⁃ sysctl:
    • set to -1: use kernel default
    • set to 0: sshd hangs
    • set to positive int: override kernel default
  ⁃ setsockopt:
    • set to 0: use default (sysctl or kernel)
    • set to int: override default
Unit of CWND
■ Linux: # of packets
  ⁃ if INITCWND is 10, you can send at most 10 packets at once, regardless of their size
■ BSD (incl. OS X): octets
  ⁃ you can send CWND*MSS octets, regardless of the number of packets
    • if CWND=10 and MSS=1460, it is possible to send 14,600 packets containing 1-octet payload
Determining amount of data that can be sent immediately
OS       MSS           CWND            inflight       send buffer (inflight + unsent)
Linux    tcpi_snd_mss  tcpi_snd_cwnd*  tcpi_unacked*  ioctl(SIOCOUTQ)
OS X**   tcpi_maxseg   tcpi_snd_cwnd   -              tcpi_snd_sbbytes
FreeBSD  tcpi_snd_mss  tcpi_snd_cwnd   -              ioctl(FIONWRITE)
NetBSD   tcpi_snd_mss  tcpi_snd_cwnd*  -              ioctl(FIONWRITE)
■ calculate either of:
  ⁃ CWND - inflight
  ⁃ max(CWND - (inflight + unsent), 0)
■ units used in the calculation must be the same
  ⁃ NetBSD: fails, since CWND is in packets while the send buffer size is in octets
*: units of values marked are packets, unmarked are octets
**: sometimes the values of tcpi_* are returned as zeros
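The two calculations differ mainly in units. The sketch below shows both with the clamp at zero; the function names are hypothetical, and the inputs are assumed to have been read already through the per-OS fields and ioctls listed in the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Linux: CWND and in-flight data are both counted in packets,
 * so the difference is converted to octets with the MSS. */
size_t sendable_from_packets(uint32_t cwnd_pkts, uint32_t inflight_pkts,
                             uint32_t mss)
{
    if (cwnd_pkts <= inflight_pkts)
        return 0; /* window already full */
    return (size_t)(cwnd_pkts - inflight_pkts) * mss;
}

/* OS X / FreeBSD: CWND and the send buffer size (inflight + unsent)
 * are both in octets, so they can be subtracted directly. */
size_t sendable_from_octets(uint32_t cwnd_octets, uint32_t sndbuf_octets)
{
    return cwnd_octets > sndbuf_octets ? cwnd_octets - sndbuf_octets : 0;
}
```

For instance, `sendable_from_packets(10, 4, 1460)` yields 8,760 octets, while `sendable_from_octets(14600, 5000)` yields 9,600 octets. Mixing a packet count with an octet count in one subtraction would give a meaningless result, which is the problem noted for NetBSD.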
