1
How the TCP/IP Protocol Works
Les Cottrell – SLAC
Lecture # 1 presented at the 26th International Nathiagali Summer College on Physics
and Contemporary Needs, 25th June – 14th July, Nathiagali, Pakistan
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end
Performance Monitoring (IEPM), also supported by IUPAP
2
Overview
• This is not a lecture on how to program TCP/IP,
rather an introduction to how its major portions work
• IP
• Addressing: IP addresses, ARP, routing
• ICMP
• UDP
• TCP: flow control, error recovery, establishment,
disconnect
• References:
– “Internetworking with TCP/IP, volume I, principles, protocols & Architecture”,
by Douglas Comer
– “TCP/IP Illustrated: the protocols”, by W. Richard Stevens
– Most information also available free via Web searches
3
Internet Protocol (IP RFC-791)
TCP/IP Internet provides 3 layers of service (top to bottom):
Application services
Transport services
Connectionless packet delivery service
•Layering allows one to replace one service without affecting
others
•IP layer (basic unit of transfer in TCP/IP) provides:
•Best-effort (does not discard capriciously), unreliable (no
guarantees)
•Packet may be lost, duplicated, out-of-order with no
notification
•Connectionless (each packet treated independently)
•IP software provides routing
4
Internet datagram
• Basic transfer unit
• Format of Internet datagram
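Datagram header | Datagram data area

Bit positions: 0, 4, 8, 16, 19, 24, 31
Word 1: Vers (bits 0-3) | Hlen (4-7) | Type of serv. (8-15) | Total length (16-31)
Word 2: Identification (0-15) | Flags (16-18) | Fragment offset (19-31)
Word 3: TTL (0-7) | Protocol (8-15) | Header Checksum (16-31)
Word 4: Source IP address
Word 5: Destination IP address
IP Options (if any) | Padding
Data
…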
5
IP datagram format (cont.)
• Vers (4 bits): version of IP protocol (IPv4=4)
• Hlen (4 bits): Header length in 32 bit words; without
options (usual case) Hlen = 5 (20 bytes)
• Type of Service – TOS (8 bits): little used in past, now
being used for QoS
• Total length (16 bits): length of datagram in bytes, includes
header and data
• Time to live – TTL (8 bits): specifies how long datagram is
allowed to remain in internet
– Routers decrement by 1
– When TTL = 0 router discards datagram
– Prevents infinite loops
• Protocol (8 bits): specifies the format of the data area
– Protocol numbers administered by central authority to guarantee
agreement, e.g. TCP=6, UDP=17 …
6
IP Datagram format (cont.)
• Source & destination IP address (32 bits each):
contain IP address of sender and intended recipient
• Options (variable length): Mainly used to record a
route, or timestamps, or specify routing
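The header fields listed above can be decoded with a few lines of Python. This is a minimal sketch, not a full IP implementation: the function name `parse_ipv4_header` and the sample header values are made up for illustration, and options are ignored.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_hlen, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_hlen >> 4,            # Vers: top 4 bits of the first byte
        "hlen_words": ver_hlen & 0x0F,       # Hlen: header length in 32-bit words
        "tos": tos,
        "total_length": total_len,           # header + data, in bytes
        "identification": ident,
        "flags": flags_frag >> 13,           # 3 flag bits
        "fragment_offset": flags_frag & 0x1FFF,  # 13 bits, in units of 8 bytes
        "ttl": ttl,
        "protocol": proto,                   # e.g. TCP=6, UDP=17
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hypothetical sample: IPv4, Hlen=5, DF set, TTL 64, protocol TCP, checksum left at 0
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0x4000, 64, 6, 0,
                     socket.inet_aton("134.79.16.1"),
                     socket.inet_aton("127.0.0.1"))
header = parse_ipv4_header(sample)
```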
7
IP Fragmentation
• How do we send a datagram with, say, 1400 bytes of data through a
link that has a Maximum Transfer Unit (MTU) of, say, 620
bytes?
• Answer: the datagram is broken into fragments
– Router fragments the 1400 byte datagram
• Into fragments carrying 600, 600 and 200 bytes of data (each fragment also needs 20 bytes for its IP header)
• Routers do NOT reassemble; that is left to the end host
[Figure: path through Net 1 (MTU=1500), Net 2 (MTU=620) and Net 3 (MTU=1500)]
8
Fragmentation Control
• Identification: copied into fragment, allows destination to
know which fragments belong to which datagram
• Fragment Offset (13 bits): specifies the offset in the
original datagram of the data being carried in the fragment
– Measured in units of 8 bytes starting at 0
• Flags (3 bits): control fragmentation
– Reserved (0-th bit)
– Don’t Fragment – DF (1st bit):
• useful for simple (computer bootstrap) applications that can’t handle fragments
• also used for MTU discovery (see later)
• if a router needs to fragment and can’t, it discards the datagram & sends an error to the source
– More Fragments – MF (least sig bit): set on every fragment except the last, so it tells the receiver whether it has got the last
fragment
• TCP traffic is hardly ever fragmented (due to use of MTU
discovery). About 0.1% - 0.5% of TCP packets are
fragmented.
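The interplay of MTU, the 8-byte offset unit and the More Fragments flag can be sketched as follows. The function `fragment` is a hypothetical helper written for this illustration, and it assumes a 20-byte IP header with no options on every fragment.

```python
def fragment(data_len: int, mtu: int, header_len: int = 20):
    """Return (offset_in_8_byte_units, data_bytes, more_fragments) per fragment."""
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the Fragment Offset field counts in units of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < data_len:
        size = min(max_data, data_len - offset)
        more = offset + size < data_len      # MF is clear only on the last fragment
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# 1400 bytes of data crossing a link with MTU 620
frags = fragment(1400, 620)
```

Here each tuple gives the value that would go in the Fragment Offset field, the number of data bytes carried, and whether the More Fragments bit would be set.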
9
Fragment series composition
NB. If the data segment contains its own header (e.g. TCP or UDP), that header is not
replicated in each fragment
Fragment 1: Offset=0, More frags
Fragment 2: Offset=1480, More frags
Fragment 3: Offset=2960, More frags
Fragment 4: Offset=3440, Last frag
10
Internet Addressing
• IP address is a 32 bit integer
– Refers to interface rather than host
– Consists of network and host portions
• Enables routers to keep 1 entry/network instead of 1/host
– Class A, B, C for unicast
– Class D for multicast
– Class E reserved
– Classless addresses
• Written as 4 octets/bytes in decimal format
– E.g. 134.79.16.1, 127.0.0.1
11
Internet Class-based addresses
• Class A: large number of hosts, few networks
– 0nnnnnnn hhhhhhhh hhhhhhhh hhhhhhhh
• 7 network bits (0 and 127 reserved, so 126 networks), 24 host bits (> 16M
hosts/net)
• Initial byte 1-127 (decimal)
• Class B: medium number of hosts and networks
– 10nnnnnn nnnnnnnn hhhhhhhh hhhhhhhh
• 16,384 class B networks, 65,534 hosts/network
• Initial byte 128-191 (decimal)
• Class C: large number of small networks
– 110nnnnn nnnnnnnn nnnnnnnn hhhhhhhh
• 2,097,152 networks, 254 hosts/network
• Initial byte 192-223 (decimal)
• Class D: 224-239 (decimal) Multicast [RFC1112]
• Class E: 240-255 (decimal) Reserved
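The class of an address follows directly from its initial byte, as a short sketch shows. The helper name `address_class` is an assumption made for this example; the ranges are the ones listed above.

```python
def address_class(address: str) -> str:
    """Classify a dotted-decimal IPv4 address by its initial byte."""
    first = int(address.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError(f"bad initial byte in {address!r}")
    if first < 128:      # leading bit 0
        return "A"
    if first < 192:      # leading bits 10
        return "B"
    if first < 224:      # leading bits 110
        return "C"
    if first < 240:      # 224-239: multicast
        return "D"
    return "E"           # 240-255: reserved


slac_class = address_class("134.79.16.1")
```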
12
Subnets
• A subnet mask is applied to the address to
determine how the network is subnetted, e.g. if the
host is 137.138.28.228 and the subnet mask is
255.255.255.0, then the right-hand 8 bits are for the
host (255 is decimal for all bits set in an octet)
• A host address with all bits set indicates a
broadcast, i.e. the packet is sent to all hosts on the subnet; a host address with no bits set identifies the subnet itself (some older implementations also treated it as a broadcast).
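The example above can be reproduced with Python's standard `ipaddress` module. This is only a sketch of the masking arithmetic; the variable names are chosen for this illustration.

```python
import ipaddress

# Host 137.138.28.228 with subnet mask 255.255.255.0
iface = ipaddress.ip_interface("137.138.28.228/255.255.255.0")

network = iface.network                                   # address AND mask
host_part = int(iface.ip) & int(iface.network.hostmask)   # right-hand 8 bits
broadcast = network.broadcast_address                     # all host bits set
```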
13
Subnet Mask Conversions
Prefix Length   Subnet Mask
/1    128.0.0.0
/2    192.0.0.0
/3    224.0.0.0
/4    240.0.0.0
/5    248.0.0.0
/6    252.0.0.0
/7    254.0.0.0
/8    255.0.0.0
/9    255.128.0.0
/10   255.192.0.0
/11   255.224.0.0
/12   255.240.0.0
/13   255.248.0.0
/14   255.252.0.0
/15   255.254.0.0
/16   255.255.0.0
/17   255.255.128.0
/18   255.255.192.0
/19   255.255.224.0
/20   255.255.240.0
/21   255.255.248.0
/22   255.255.252.0
/23   255.255.254.0
/24   255.255.255.0
/25   255.255.255.128
/26   255.255.255.192
/27   255.255.255.224
/28   255.255.255.240
/29   255.255.255.248
/30   255.255.255.252
/31   255.255.255.254
/32   255.255.255.255

Decimal Octet   Binary Number
128   1000 0000
192   1100 0000
224   1110 0000
240   1111 0000
248   1111 1000
252   1111 1100
254   1111 1110
255   1111 1111
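Any row of the conversion table can be derived by setting the leftmost bits of a 32 bit word. A minimal sketch, with `prefix_to_mask` as a hypothetical helper name:

```python
def prefix_to_mask(prefix_len: int) -> str:
    """Convert a prefix length (0-32) to a dotted-decimal subnet mask."""
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Set the top prefix_len bits of a 32-bit word
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


mask_21 = prefix_to_mask(21)
```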
14
Address depletion
• In 1991 IAB identified 3 dangers
– Running out of class B addresses
– Increase in nets has resulted in routing table explosion
– Increase in net/hosts exhausting 32 bit address space
• Four strategies to address
– Creative address space allocation {RFC 2050}
– Private addresses {RFC 1918}, Network Address
Translation (NAT) {RFC 1631}
– Classless InterDomain Routing (CIDR) {RFC 1519}
– IP version 6 (IPv6) {RFC 1883}
15
Creative IP address allocation
• Class A addresses 64 – 127 reserved
– Handle on individual basis
• Class B only assigned given a demonstrated need
• Class C
– divided up into 8 blocks allocated to regional authorities
– 208-223 remains unassigned and unallocated
• Three main registries handle assignments
– APNIC – Asia & Pacific www.apnic.net
– ARIN – N. & S. America, Caribbean & sub-Saharan
Africa www.arin.net
– RIPE – Europe and surrounding areas www.ripe.net
16
Private IP Addresses
• IP addresses that are not globally unique, but used
exclusively in an organization
• Three ranges:
– 10.0.0.0 - 10.255.255.255 a single class A net
– 172.16.0.0 - 172.31.255.255 16 contiguous class Bs
– 192.168.0.0 – 192.168.255.255 256 contiguous class Cs
• Connectivity provided by Network Address
Translator (NAT)
– translates outgoing private IP address to Internet IP
address, and a return Internet IP address to a private
address
– Only for TCP/UDP packets
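Whether an address falls in one of the three private ranges can be checked with the standard `ipaddress` module. The names `PRIVATE_NETS` and `is_rfc1918` are assumptions made for this sketch; the ranges are written in prefix notation.

```python
import ipaddress

# The three private ranges listed above, in prefix notation
PRIVATE_NETS = [ipaddress.ip_network(net)
                for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_NETS)


inside = is_rfc1918("172.31.255.255")
outside = is_rfc1918("134.79.16.1")
```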
17
Classless InterDomain Routing (CIDR)
• Many organizations have > 256 computers but few
have more than several thousand
• Instead of giving a class B (there are only 16,384 class B nets), give sufficient
contiguous class C addresses to satisfy needs
– < 256 addresses assign 1 class C
– …
– < 8192 addresses assign 32 contiguous Class C nets
18
CIDR & Supernetting
• Since they are assigned contiguously, the class C nets in a CIDR block share the same most
significant bits & so only need one routing table entry
• CIDR block represented by a prefix and prefix length
– Prefix = single address representing block of nets, e.g.
• 192.32.136.0 = 11000000 00100000 10001000 00000000 while
• 192.32.143.0 = 11000000 00100000 10001111 00000000
– Prefix length indicates number of routing bits, e.g.
192.32.136.0/21 means 21 bits used for routing (21 bit prefix, leaving 11 host bits = 2048 host addresses)
• CIDR collects all nets in range 192.32.136.0 through 192.32.143.0 into a single
router entry – reduces router table entries
• Removes address classes A, B & C boundaries
• For more details see RFC 1519
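The aggregation of the eight contiguous class C nets into one /21 entry can be checked with the standard `ipaddress` module. A small sketch; the variable names are chosen for this example.

```python
import ipaddress

# The eight contiguous class C nets 192.32.136.0 through 192.32.143.0
class_c_nets = [ipaddress.ip_network(f"192.32.{third}.0/24")
                for third in range(136, 144)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes
aggregated = list(ipaddress.collapse_addresses(class_c_nets))
```

The result is a single network object, which is why a router needs only one table entry for the whole block.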
19
Address Resolution Protocol (ARP)
• IP address is at network layer, need to map it to the
MAC (Ethernet address) link layer address
• Use ARP to map a 32 bit IP address to a 48 bit Ethernet address
– IP requests MAC address for IP address from local ARP
table
– If not there, then an ARP request packet for IP address is
sent using physical broadcast address (all Fs, i.e. FF:FF:FF:FF:FF:FF)
– Host with requested IP address responds with its MAC
address as a unicast packet
– On return, host updates ARP table and returns MAC
address
– ARP cache times out
– ARP packets are on top of Ethernet
20
ARP cont.
• ARP requests are local only, do not cross routers
• Compare local IP and subnet mask => local subnet
• Compare local subnet to destination IP
– if local, ARP for MAC address
– else remote so
• if ROUTE entry, ARP for router to subnet
• if default route, ARP for default gateway
• otherwise, drop packet & return error
[Figure: User A (134.79.10.17) on Subnet 1 and User B (134.79.15.3) on Subnet 2, connected by a router with interfaces 134.79.10.1 and 134.79.15.1]
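The decision of which address to ARP for can be sketched as below. The function `arp_target`, its parameters and the /24 mask on User A's interface are assumptions made for illustration; real hosts consult a full routing table with longest-prefix matching.

```python
import ipaddress


def arp_target(local_if: str, dest: str, routes=None, default_gw=None):
    """Return the IP address to ARP for, or None if the packet must be dropped.

    local_if   -- local address with mask, e.g. "134.79.10.17/24"
    routes     -- optional mapping of remote subnet -> router address
    default_gw -- optional default gateway address
    """
    iface = ipaddress.ip_interface(local_if)
    dest_ip = ipaddress.ip_address(dest)
    if dest_ip in iface.network:                 # destination is on the local subnet
        return dest_ip
    for net, router in (routes or {}).items():   # explicit ROUTE entry
        if dest_ip in ipaddress.ip_network(net):
            return ipaddress.ip_address(router)
    if default_gw is not None:                   # fall back to the default gateway
        return ipaddress.ip_address(default_gw)
    return None                                  # no route: drop & return error


# User A sending to User B on the other subnet (mask /24 is assumed)
next_hop = arp_target("134.79.10.17/24", "134.79.15.3", default_gw="134.79.10.1")
```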
21
Routing
• Routers must select next hop for packet
• Get route information from other routers via a
routing protocol (RIP, OSPF, EIGRP etc.)
• Note the following are non-routable:
– private networks: 10.0.0.0/8, 172.16.0.0/12,
192.168.0.0/16
– Loopback 127.0.0.0/8
22
ICMP Purpose (RFC 792)
• Communicates control & error information
– Between routers and hosts
– Only reports to original source, suggests corrections
– Error messages about error messages are not generated
– Never generated due to multicasts
• Packet format
Bit positions: 0, 8, 16, 24, 31
Type (bits 0-7) | Code (8-15) | Checksum (16-31)
ICMP data (depends on type/code)
23
Main ICMP message types
Type  Meaning
0     Echo reply (ping)
3     Destination unreachable (code 1 host, code 3 port,
      code 4 DF set and must fragment)
4     Source quench
5     Redirect (change a route)
8     Echo request
11    Time exceeded (code 0 ttl=0, code 1 reassembly)
12    Parameter problems
24
ICMP Echo/Ping
• Very commonly used diagnostic tool
• Implementations vary between OSes
• Build echo request
– Identifier used to match request to replies (e.g. pid)
– Sequence number, starts at 0 increments by 1 for each ping packet
• Used to detect loss, reorder, duplicates
– Optional data, sent by requester, returned by replier
• Usually contains a timestamp when the request was sent plus pad data
Bit positions: 0, 8, 16, 24, 31
Type=8 (bits 0-7) | Code=0 (8-15) | Checksum (16-31)
Identifier (0-15) | Sequence number (16-31)
Optional data
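Building an echo request from these fields can be sketched as follows. The helper names are hypothetical; the checksum is the standard 16 bit one's-complement Internet checksum, and the packet is only constructed here, not sent (sending raw ICMP normally needs elevated privileges).

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Type=8, Code=0 ICMP message with a valid checksum."""
    # Checksum is computed with the checksum field itself set to zero
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=0, payload=b"ping")
```

A receiver can verify the message by running the same checksum over the whole packet; a correct packet yields zero.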
25
What do we learn from Ping
• Host reachable
– Host may respond to ping but not be running services
• Round trip timing
• Lost packets
• Packet reordering, duplicate packets
• Example:
13cottrell@noric05:~>ping -c 4 lhr.comsats.net.pk
PING lhr.comsats.net.pk (210.56.16.10) from 134.79.125.205 : 56(84) bytes of data.
64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=0 ttl=242 time=716.962 msec
64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=1 ttl=242 time=720.375 msec
64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=2 ttl=242 time=725.907 msec
64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=3 ttl=242 time=710.734 msec
--- lhr.comsats.net.pk ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max/mdev = 710.734/718.494/725.907/5.566 ms
26
Unreachable
76cottrell@flora06:~>ping islamabad-server2.comsats.net.pk
ICMP 13 Unreachable from gateway 207.45.205.18
for icmp from FLORA06.SLAC.Stanford.EDU (134.79.16.101)
to islamabad-server2.comsats.net.pk (210.56.8.8)
What does this mean? See exercise.
27
Time Exceeded
• Time-to-live has expired at a router (code=0)
– ttl sets bound on number of routers datagram can transit
• Prevents infinite routing loops
• Initialized by sender, decremented by 1 each time passes router
• When ttl = 0 datagram thrown away & sender notified by ICMP
message
• Fragment reassembly timer (code=1)
Bit positions: 0, 8, 16, 24, 31
Type=11 (bits 0-7) | Code (8-15) | Checksum (16-31)
Unused (0-31)
Internet header & 8 bytes of data
28
MTU Discovery
• Path MTUs vary
• Fragmentation is bad
• Small transmission units are bad
• So we need to discover the optimum MTU (largest without
fragmentation)
• Host sends a packet with the Don’t Fragment bit set
– Length is lesser of local MTU and MSS announced by
remote system
– If MTU between hosts requires fragmentation (e.g. at an
intermediate router), then
• if the DF bit is set & the router must fragment, then an ICMP message
is sent back to source, saying “I can’t fragment”
• try again with smaller size.
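The retry loop above can be simulated without touching the network. This is a sketch under a stated assumption: the ICMP error is taken to report the MTU of the link that refused the packet, so the sender can jump straight to that size. The function name and the list-of-link-MTUs model are invented for this example.

```python
def discover_path_mtu(link_mtus, first_try):
    """Simulate DF-bit probing along a path; return (path_mtu, sizes_tried)."""
    size = first_try
    attempts = []
    while True:
        attempts.append(size)
        # The first link too small for the packet discards it and sends an ICMP error
        blocking = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocking is None:
            return size, attempts            # packet got through unfragmented
        # Assumption: the error message carries the MTU of the blocking link
        size = blocking


path_mtu, tried = discover_path_mtu([1500, 620, 1500], first_try=1500)
```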
29
User Datagram Protocol - UDP
• RFC 768, Protocol 17
• Provides unreliable, connectionless delivery on top of IP
• Minimal overhead, high performance
– No setup/teardown, 1 datagram at a time
• Application responsible for reliability
– Includes datagram loss, duplication, delay, out-of-sequence
delivery, multiplexing, loss of connectivity
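[Figure: protocol layering. Network layer: IP, demultiplexed on IP protocol number to TCP or UDP. Transport layer: TCP and UDP, each demultiplexed on port number to Port 1, Port 2, ... at the application layer]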
30
UDP Datagram format
• Source/destination port: port numbers identify sending & receiving
processes
– Port number & IP address allow any application in any computer on Internet to
be uniquely identified
– Used to demultiplex datagrams to processes
– Ports can be static or dynamic
• Static (< 1024) assigned centrally, known as well known ports
• Dynamic
• Message length in bytes includes the UDP header and data
Bit positions: 0, 8, 16, 24, 31
Source port (bits 0-15) | Destination port (16-31)
UDP message len (0-15) | Checksum (opt.) (16-31)
Data
…
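The four 16 bit header fields can be packed and unpacked with `struct`. A minimal sketch with hypothetical helper names; the checksum is left at 0, which over IPv4 means "not computed", and the port numbers in the example are arbitrary.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header (checksum not computed)."""
    length = 8 + len(payload)                # message length covers header and data
    return struct.pack("!HHHH", src_port, dst_port, length, 0) + payload


def parse_udp_header(datagram: bytes) -> dict:
    """Decode the 8-byte UDP header."""
    src, dst, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}


# Hypothetical query from a dynamic port to the well known DNS port 53
datagram = build_udp_datagram(40000, 53, b"hello")
fields = parse_udp_header(datagram)
```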
31
UDP applications
• Message oriented, e.g. SNMP, DNS, time
• File system, e.g. NFS, AFS
• Lightweight file transfer, e.g. tftp, bootp
32
Transmission Control Protocol -TCP
• RFC 793 & host requirements RFC 1122
– Reliable stream transport
• Connection oriented (full duplex virtual circuit)
– Conceptually place call, two ends communicate to agree on details
– After agreeing application notified of connection
– During transfer, ends communicate continuously to verify data received
correctly
– When done, ends tear down the connection
– If UDP is like regular mail, TCP is like phone call
• Provides buffering and flow control
• Takes care of lost packets, out of order, duplicates, long delays
• Isolates application program from network details
• Jargon
– Segment = TCP packet
– Socket= source (address + port) + destination (address + port)
33
TCP layering
• To ID connection need:
– Source: (address, port) AND Destination: (address, port)
– Only need one port on host to allow multiple connections, since
each connection will have different (host, port) at other end
• E.g. single host can serve multiple telnet connections
• Passive open: application contacts OS & indicates will
accept incoming connection, OS assigns port and listens
• Active open: application requests OS to connect to an (host,
port)
[Figure: protocol layering. IP (network layer) demultiplexes on IP protocol number to TCP (IP protocol 6) or UDP; the transport layer demultiplexes on port number to Port 1, Port 2, ... at the application layer]
34
TCP – providing reliability
• Positive acknowledgement (ACK) with
retransmission
– Sender keeps record of each packet sent
– Sender awaits an ACK
– Sender starts timer when sends packet
Sender site            Receiver site        (time runs downward; messages cross the network)
Send pkt 1      -->
                       Rcv pkt 1
                <--    Send ACK 1
Rcv ACK 1
Send pkt 2      -->
                       Rcv pkt 2
                <--    Send ACK 2
Rcv ACK 2
35
TCP – simple lost packet recovery
Sender site            Receiver site        (time runs downward; messages cross the network)
Send pkt 1      -->    Loss
Start timer
                       Pkt should arrive (Rcv pkt 1)
                       ACK should be sent (Send ACK 1)
ACK normally
arrives (Rcv ACK 1)
Timer expires
Retransmit pkt 1 -->
Start timer
36
TCP – improving performance
• BUT simple ACK protocol wastes bandwidth since it must
delay sending next packet until it gets ACK
• Use sliding window
• Sender can send 4 packets of data without ACK
– When sender gets ACK then can send another packet
– Window = unacknowledged packets/bytes
– Keeps timer for each packet
[Figure: packets 1 2 3 4 5 6 7 8 … with an initial window of 4 packets; as ACKs arrive the window slides. Legend: packets successfully sent, packets sent and awaiting ACK, packets to be sent]
37
Tuning to fill pipe
• Optimal window size depends on:
– Bandwidth end to end, i.e. min(BWlinks) AKA bottleneck
bandwidth
– Round Trip Time (RTT)
– For TCP keep pipe full
• Window (sometimes called pipe) ~ RTT*BW
– Can increase throughput by
orders of magnitude
• Windows also used for flow control
[Figure: timeline between Src and Rcv showing the RTT; each packet takes t = bits in packet/link speed to put on the link]
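The window ~ RTT*BW rule is simple arithmetic, sketched here with two hypothetical helpers. The 10 Mbit/s bottleneck and the 0.5 s RTT in the example are illustrative assumptions; the 0.718 s RTT is roughly the average ping time shown earlier.

```python
def window_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Window needed to keep the pipe full: bottleneck bandwidth * RTT, in bytes."""
    return bandwidth_bps * rtt_s / 8


def max_throughput_bps(window: float, rtt_s: float) -> float:
    """Best throughput a fixed window allows: one window per round trip."""
    return window * 8 / rtt_s


# Assumed 10 Mbit/s bottleneck with an RTT of about 0.718 s
needed = window_bytes(10_000_000, 0.718)

# An 8192-byte window over an assumed 0.5 s round trip
limit = max_throughput_bps(8192, 0.5)
```

Comparing `needed` with a typical default window shows why tuning the window can raise throughput by orders of magnitude on long paths.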
38
Implementation
• Sliding window operates at byte level, NOT packet
• Receiver keeps similar window to put stream back
together
• Since full duplex, altogether 4 windows & pointer
sets
[Figure: byte stream 1 2 3 4 5 6 7 8 … with the current window marked by 3 pointers: bytes sent and acknowledged, highest byte sent, highest byte that can be sent]
39
TCP flow control
• Windows vary over time
– Receiver advertises (in ACKs) how many it can receive
• Based on buffers etc. available
– Sender adjusts its window to match advertisement
– If receiver buffers fill, it sends smaller adverts
• Used to match buffer requirements of receiver
• Also used to address congestion control (e.g. in
intermediate routers)
40
TCP Segment format
• Source/Dest port: TCP port numbers to ID applications at
both ends of connection
• Sequence number: ID position in sender’s byte stream
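Bit positions: 0, 4, 10, 16, 24, 31
Source port (bits 0-15) | Destination port (16-31)
Sequence number (0-31)
Acknowledgement number (0-31)
Hlen (0-3) | Resv (4-9) | Code (10-15) | Window (16-31)
Checksum (0-15) | Urgent ptr (16-31)
Options (if any) | Padding
Data if any
…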
41
TCP segment format – cont.
• Acknowledgement: identifies the number of the
byte the sender of this segment expects to receive
next
• Hlen: specifies the length of the segment header in
32 bit multiples. If there are no options, the Hlen = 5
(20 bytes)
• Reserved for future use, set to 0
• Code: used to determine segment purpose, e.g.
SYN, ACK, FIN, URG
42
TCP Segment format- cont
• Window: Advertises how much data this station is
willing to accept. Can depend on buffer space
remaining.
• Checksum: Verifies the integrity of the TCP header
and data. It is mandatory.
• Urgent pointer: used with the URG flag to indicate
where the urgent data starts in the data stream.
Typically used with a file transfer abort during FTP
or when pressing an interrupt key in telnet.
• Options: used for window scaling, SACK,
timestamps, maximum segment size etc.
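The segment fields described on the last three slides can be decoded as below. This is a sketch with hypothetical names; it handles only the fixed 20 byte header and the six classic code bits, and the sample values are loosely modelled on a SYN segment.

```python
import struct

# Code bits, from the least significant bit upward
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP segment header."""
    (src, dst, seq, ack, hlen_code,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "hlen_words": hlen_code >> 12,       # header length in 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES)
                  if hlen_code & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urgent,
    }


# Hypothetical SYN segment: Hlen=6 (one option word follows), SYN bit set
sample = struct.pack("!HHIIHHHH", 1174, 23, 180839, 0, (6 << 12) | 0x02, 8192, 0, 0)
segment = parse_tcp_header(sample)
```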
43
TCP timeout
• Need a timeout estimate that will work for LANs
(RTT < msec.) to satellite WANs (hundreds of msec.
to secs). RTT can vary a lot with time of day, day of
week, or one second to next.
– TCP records time segment sent
– and time ACK received
– Then calculates RTT sample
– Smooth & use to estimate timeout, e.g.
• Timeout=beta * RTTs
• Timeout= RTTs + eta{=4}*f(dev(RTTs))
– Needs to take account of losses, e.g.
• New_timeout=gamma{=2} * timeout
[Figure: plot against time of day, May 12th]
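One common way to realise "smooth the samples, add a multiple of the deviation, back off on loss" is sketched below. The class name and the gains alpha=1/8 and beta=1/4 are assumptions (they are the values widely used in TCP implementations, not something stated on the slide); the factor 4 and the doubling on loss come from the slide.

```python
class RttEstimator:
    """Smoothed RTT and deviation; timeout = RTTs + k * deviation."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25, k: float = 4.0):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = None       # smoothed RTT (RTTs)
        self.rttvar = None     # smoothed deviation of the RTT samples

    def update(self, sample: float) -> float:
        """Fold in one RTT sample (seconds) and return the new timeout."""
        if self.srtt is None:
            self.srtt, self.rttvar = sample, sample / 2
        else:
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - sample))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.srtt + self.k * self.rttvar


def back_off(timeout: float, gamma: float = 2.0) -> float:
    """After a loss, lengthen the timeout: New_timeout = gamma * timeout."""
    return gamma * timeout


estimator = RttEstimator()
first = estimator.update(0.100)     # first sample of 100 msec
second = estimator.update(0.100)    # a steady RTT shrinks the deviation term
after_loss = back_off(second)
```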
44
TCP connection establishment
• 3 way handshake
• Initial sequence numbers (x, y) are chosen randomly
• Guarantees both sides ready & know it, and sets
initial sequence numbers, also sets window & mss
• Once connection established, data can flow in both
directions, equally well, there is no master or slave
Site 1                      Site 2          (time runs downward)
Send SYN seq=x       -->
                            Rcv SYN segment
                     <--    Send SYN seq=y, ACK x+1
Rcv SYN/ACK
Send ACK y+1         -->
                            Rcv ACK segment
45
TCP close connection
• Modified 3 way handshake (or 4 way termination)
• App tells TCP to close, TCP sends remaining data & waits
for ACK, then sends FIN
• Site 2 TCP ACKs FIN, tells its application “end of data”
• Site 2 sends FIN when its app closes connection (may be a
long delay, e.g. if it requires human interaction).
Site 1                      Site 2          (time runs downward)
(App closes)
Send FIN seq=x       -->
                            Rcv FIN segment
                     <--    Send ACK x+1
                            (inform app)
Rcv ACK segment
                            (app closes connection)
                     <--    Send FIN seq=y, ACK x+1
Rcv FIN + ACK seg
Send ACK y+1         -->
                            Receive ACK segment
46
More Information
• Lectures, tutorials etc:
– www.nv.cc.va.us/home/joney/tcp_ip.htm
– www.cs.pdx.edu/~jrb/tcpip.lectures.html
– www.raleigh.ibm.com/cgi-bin/bookmgr/BOOKS/EZ306200/CCONTENTS
– www.cisco.com/univercd/cc/td/doc/product/iaabu/centri4/user/scf4ap1.htm
– www.cis.ohio-state.edu/htbin/rfc/rfc1180.html
– www.jbmelectronics.com/tcp.htm
• Encyclopaedia
– http://www.freesoft.org/CIE/index.htm
• TCP/IP Resources
– www.private.org.il/tcpip_rl.html
• Understanding IP addresses
– http://www.3com.com/solutions/en_US/ncs/501302.html
• Configuring TCP (RFC 1122)
– ftp://nic.merit.edu/internet/documents/rfc/rfc1122.txt
• Assigned protocols, ports etc (RFC 1010)
– http://www.es.net/pub/rfcs/rfc1010.txt & /etc/protocols
47
Example: 3 way handshake
• atlas> telnet sunstats.cern.ch
– atlas is a WNT PC, sunstats is a Sun Solaris 5.6 host
– MSS is set in TCP option in a SYN segment,
communicates the MSS the sender wants to receive
– len=ip_hlen/tcp_hlen:ip_total_len
– Initial Sequence Numbers are randomly selected
– Telnet = port 23
– W=Receive window size advertises how much data this
host will accept
48
Example: 3 way handshake - cont.
• TCP from atlas:1174 to sunstats:23 seq=180839,
A=0, W=8192, SYN [len=5/6:44, opt=020405B4
<opt=2, len=4, mss=0x5B4=1460>]
• TCP from sunstats:23 to atlas:1174
seq=1383568304, A=180840, W=64240, SYN/ACK
[len=5/6:44, opt=020405B4]
• TCP from atlas:1174 to sunstats:23 seq =180840,
A=1383568305, W=8760 [len=5/5:40, opt=nul]
– Notice window size can vary from segment to segment depending
on buffer space available
– Notice smaller PC window advertisement
– Notice ephemeral port selected by telnet client
– Notice acknowledge next expected byte (=seq+1)
– 0x020405B4: 02 = option type, 04=len, 0x5B4=1460
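The option bytes and the acknowledgement numbers in this trace can be checked with a few lines. The helper `decode_option` is a hypothetical name; it assumes a single option laid out as type, length, value, which is how the MSS option in the trace is structured.

```python
def decode_option(option_hex: str):
    """Split one TCP option into (type, length, value)."""
    raw = bytes.fromhex(option_hex)
    kind, length = raw[0], raw[1]            # length counts the type and length bytes too
    value = int.from_bytes(raw[2:length], "big")
    return kind, length, value


kind, length, mss = decode_option("020405B4")

# Each side acknowledges the other's initial sequence number + 1
atlas_isn, sunstats_isn = 180839, 1383568304
ack_from_sunstats = atlas_isn + 1
ack_from_atlas = sunstats_isn + 1
```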
49
Session start
SLAC>CERN: 256kbyte window,1 stream,
full speed > 30msec, 13MBytes in 20s, 5.1MBytes/s
[Figure: time-sequence plot showing the receiver's advertised window, ACKs returned by the receiver, segments sent and the congestion window]

More Related Content

PDF
ADDRESSING PADA TCP IP
PPTX
Networking essentials lect2
PPTX
10 coms 525 tcpip - internet protocol - ip
PPT
CCNA Exam by [email protected] - for CCNA test
PPTX
11 coms 525 tcpip - internet protocol - forward
PPTX
IP Routing.pptx
ADDRESSING PADA TCP IP
Networking essentials lect2
10 coms 525 tcpip - internet protocol - ip
CCNA Exam by [email protected] - for CCNA test
11 coms 525 tcpip - internet protocol - forward
IP Routing.pptx

Similar to tcpip.ppt (20)

PDF
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
PPT
lecture08.ppt
PPTX
Computer network coe351- part3-final
PPTX
Internet Protocol Version 4
PPT
Networking and data communication IP.ppt
PPT
computerNetworkSecurity.ppt
PPT
210202021018701 suratNetworkSecurity.ppt
PDF
IPForwarding-Lab3 in routing and switching
PPTX
Internetworking
PPTX
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
PDF
Ch 2: TCP/IP Concepts Review
PPTX
TCP/IP and UDP protocols
PPTX
Week 2 - Computer networks lab - ACU.pptx
PPTX
Lecture 3 network layer
PPT
chsadsadasdasdasdasdsadsadsadsadsadasda10.ppt
PDF
ENC_254_PPT_ch04.pdf
PPT
TCPIP in brief and working operation.ppt
PPTX
computer network notes in network layer.
PPT
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
lecture08.ppt
Computer network coe351- part3-final
Internet Protocol Version 4
Networking and data communication IP.ppt
computerNetworkSecurity.ppt
210202021018701 suratNetworkSecurity.ppt
IPForwarding-Lab3 in routing and switching
Internetworking
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
Ch 2: TCP/IP Concepts Review
TCP/IP and UDP protocols
Week 2 - Computer networks lab - ACU.pptx
Lecture 3 network layer
chsadsadasdasdasdasdsadsadsadsadsadasda10.ppt
ENC_254_PPT_ch04.pdf
TCPIP in brief and working operation.ppt
computer network notes in network layer.

Recently uploaded (20)

PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
Trump Administration's workforce development strategy
PPTX
Tissue processing ( HISTOPATHOLOGICAL TECHNIQUE
PDF
Anesthesia in Laparoscopic Surgery in India
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PDF
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PDF
Yogi Goddess Pres Conference Studio Updates
PDF
Microbial disease of the cardiovascular and lymphatic systems
PPTX
UNIT III MENTAL HEALTH NURSING ASSESSMENT
PDF
Classroom Observation Tools for Teachers
PDF
RTP_AR_KS1_Tutor's Guide_English [FOR REPRODUCTION].pdf
PDF
LDMMIA Reiki Yoga Finals Review Spring Summer
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
Paper A Mock Exam 9_ Attempt review.pdf.
PDF
Updated Idioms and Phrasal Verbs in English subject
PPTX
UV-Visible spectroscopy..pptx UV-Visible Spectroscopy – Electronic Transition...
PDF
RMMM.pdf make it easy to upload and study
PDF
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
PDF
What if we spent less time fighting change, and more time building what’s rig...
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
Trump Administration's workforce development strategy
Tissue processing ( HISTOPATHOLOGICAL TECHNIQUE
Anesthesia in Laparoscopic Surgery in India
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
Supply Chain Operations Speaking Notes -ICLT Program
Yogi Goddess Pres Conference Studio Updates
Microbial disease of the cardiovascular and lymphatic systems
UNIT III MENTAL HEALTH NURSING ASSESSMENT
Classroom Observation Tools for Teachers
RTP_AR_KS1_Tutor's Guide_English [FOR REPRODUCTION].pdf
LDMMIA Reiki Yoga Finals Review Spring Summer
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Paper A Mock Exam 9_ Attempt review.pdf.
Updated Idioms and Phrasal Verbs in English subject
UV-Visible spectroscopy..pptx UV-Visible Spectroscopy – Electronic Transition...
RMMM.pdf make it easy to upload and study
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
What if we spent less time fighting change, and more time building what’s rig...

tcpip.ppt

  • 1. 1 How the TCP/IP Protocol Works Les Cottrell – SLAC Lecture # 1 presented at the 26th International Nathiagali Summer College on Physics and Contemporary Needs, 25th June – 14th July, Nathiagali, Pakistan Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP
  • 2. 2 Overview • This is not a lecture on how to program TCP/IP, rather an introduction to how major portions works • IP • Addressing: IP addresses, ARP, routing • ICMP • UDP • TCP: flow control, error recovery, establishment, diconnect • References: – “Internetworking with TCP/IP, volume I, principles, protocols & Architecture”, by Douglas Comer – “TCP/IP Illustrated: the protocols”, by W. Richard Stevens – Most information also available free via Web searches
  • 3. 3 Internet Protocol (IP RFC-791) Transport Services Connectionless packet delivery service Application services TCP/IP Internet provides 3 layers of service •Layering allows one to replace one service without affecting others •IP layer (basic unit of transfer in TCP/IP) provides: •Best-effort (does not discard capriciously), unreliable (no guarantees) •Packet may be lost, duplicated, out-of-order with no notification •Connectionless (each packet treated independently) •IP software provides routing
  • 4. 4 Internet datagram • Basic transfer unit • Format of Internet datagram Datagram header Datagram data area Vers Type of serv. Total length 0 8 16 31 Identification Flags 24 Hlen 4 Fragment offset 19 TTL Protocol Header Checksum Source IP address Destination IP address IP Options (if any) Padding Data …
  • 5. 5 IP datagram format (cont.) • Vers (4 bits): version of IP protocol (IPv4=4) • Hlen (4 bits): Header length in 32 bit words, without options (usual case) = 20 • Type of Service – TOS (8 bits): little used in past, now being used for QoS • Total length (16 bits): length of datagram in bytes, includes header and data • Time to live – TTL (8bits): specifies how long datagram is allowed to remain in internet – Routers decrement by 1 – When TTL = 0 router discards datagram – Prevents infinite loops • Protocol (8 bits): specifies the format of the data area – Protocol numbers administered by central authority to guarantee agreement, e.g. TCP=6, UDP=17 …
  • 6. 6 IP Datagram format (cont.) • Source & destination IP address (32 bits each): contain IP address of sender and intended recipient • Options (variable length): Mainly used to record a route, or timestamps, or specify routing
  • 7. 7 IP Fragmentation • How do we send a datagram of say 1400 bytes through a link that has a Maximum Transfer Unit (MTU) of say 620 bytes? • Answer the datagram is broken into fragments – Router fragments 1400 byte datagrams • Into 600 bytes, 600 bytes, 200bytes (note 20 bytes for IP header) • Routers do NOT reassemble, up to end host Net 1 MTU=1500 Net 2 MTU=620 Net 3 MTU=1500
  • 8. 8 Fragmentation Control • Identification: copied into fragment, allows destination to know which fragments belong to which datagram • Fragment Offset (12 bits): specifies the offset in the original datagram of the data being carried in the fragment – Measured in units of 8 bytes starting at 0 • Flags (3 bits): control fragmentation – Reserved (0-th bit) – Don’t Fragment – DF (1st bit): • useful for simple (computer bootstrap) application that can’t handle • also used for MTU discovery (see later) • if need to fragment and can’t router discards & sends error to source – More Fragments (least sig bit): tells receiver it has got last fragment • TCP traffic is hardly ever fragmented (due to use of MTU discovery). About 0.5% - 0.1% of TCP packets are fragmented .
  • 9. 9 Fragment series composition NB. If data segment contains its own header that is not replicated Offset=0 More frags Offset=1480 More frags Offset=2960 More frags Offset=3440 Last frag
  • 10. 10 Internet Addressing • IP address is a 32 bit integer – Refers to interface rather than host – Consists of network and host portions • Enables routers to keep 1 entry/network instead of 1/host – Class A, B, C for unicast – Class D for multicast – Class E reserved – Classless addresses • Written as 4 octets/bytes in decimal format – E.g. 134.79.16.1, 127.0.0.1
  • 11. 11 Internet Class-based addresses • Class A: large number of hosts, few networks – 0nnnnnnn hhhhhhhh hhhhhhhh hhhhhhhh • 7 network bits (0 and 127 reserved, so 126 networks), 24 host bits (> 16M hosts/net) • Initial byte 1-127 (decimal) • Class B: medium number of hosts and networks – 10nnnnnn nnnnnnnn hhhhhhhh hhhhhhhh • 16,384 class B networks, 65,534 hosts/network • Initial byte 128-191 (decimal) • Class C: large number of small networks – 110nnnnn nnnnnnnn nnnnnnnn hhhhhhhh • 2,097,152 networks, 254 hosts/network • Initial byte 192-223 (decimal) • Class D: 224-239 (decimal) Multicast [RFC1112] • Class E: 240-255 (decimal) Reserved
  • 12. 12 Subnets • A subnet mask is applied to the host bits to determine how the network is subnetted, e.g. if the host is: 137.138.28.228, and the subnet mask is 255.255.255.0 then the right hand 8 bits are for the host (255 is decimal for all bits set in an octet) • Host addresses of all bits set or no bits set, indicate a broadcast, i.e. the packet is sent to all hosts.
  • 13. 13 Subnet Mask Conversions /1 128.0.0.0 /2 192.0.0.0 /3 224.0.0.0 /4 240.0.0.0 /5 248.0.0.0 /6 252.0.0.0 /7 254.0.0.0 /8 255.0.0.0 /9 255.128.0.0 /10 255.192.0.0 /11 255.224.0.0 /12 255.240.0.0 /13 255.248.0.0 /14 255.252.0.0 /15 255.254.0.0 /16 255.255.0.0 /17 255.255.128.0 /18 255.255.192.0 /19 255.255.224.0 /20 255.255.240.0 /21 255.255.248.0 /22 255.255.252.0 /23 255.255.254.0 /24 255.255.255.0 /25 255.255.255.128 /26 255.255.255.192 /27 255.255.255.224 /28 255.255.255.240 /29 255.255.255.248 /30 255.255.255.252 /31 255.255.255.254 /32 255.255.255.255 Prefix Length Subnet Mask Prefix Length Subnet Mask 128 1000 0000 192 1100 0000 224 1110 0000 240 1111 0000 248 1111 1000 252 1111 1100 254 1111 1110 255 1111 1111 Decimal Octet Binary Number
  • 14. 14 Address depletion • In 1991 IAB identified 3 dangers – Running out of class B addresses – Increase in nets has resulted in routing table explosion – Increase in net/hosts exhausting 32 bit address space • Four strategies to address – Creative address space allocation {RFC 2050} – Private addresses {RFC 1918}, Network Address Translation (NAT) {RFC 1631} – Classless InterDomain Routing (CIDR) {RFC 1519} – IP version 6 (IPv6) {RFC 1883}
  • 15. 15 Creative IP address allocation • Class A addresses 64 – 127 reserved – Handle on individual basis • Class B only assigned given a demonstrated need • Class C – divided up into 8 blocks allocated to regional authorities – 208-223 remains unassigned and unallocated • Three main registries handle assignments – APNIC – Asia & Pacific www.apnic.net – ARIN – N. & S. America, Caribbean & sub-Saharan Africa www.arin.net – RIPE – Europe and surrounding areas www.ripe.net
  • 16. 16 Private IP Addresses • IP addresses that are not globally unique, but used exclusively in an organization • Three ranges: – 10.0.0.0 - 10.255.255.255 a single class A net – 172.16.0.0 - 172.31.255.255 16 contiguous class Bs – 192.168.0.0 – 192.168.255.255 256 contiguous class Cs • Connectivity provided by Network Address Translator (NAT) – translates outgoing private IP address to Internet IP address, and a return Internet IP address to a private address – Only for TCP/UDP packets
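A NAT box or firewall has to decide whether an address falls in one of the three private ranges. The sketch below tests membership against exactly the ranges listed on the slide, written in prefix notation; `is_rfc1918` is a hypothetical helper name:

```python
import ipaddress

# The three private ranges from the slide, written as CIDR prefixes.
PRIVATE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),      # a single class A net
    ipaddress.ip_network("172.16.0.0/12"),   # 16 contiguous class Bs
    ipaddress.ip_network("192.168.0.0/16"),  # 256 contiguous class Cs
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in one of the private ranges."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in PRIVATE_NETS)


print(is_rfc1918("172.31.255.255"))  # True
print(is_rfc1918("172.32.0.0"))      # False
```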
  • 17. 17 Classless InterDomain Routing (CIDR) • Many organizations have > 256 computers but few have more than several thousand • Instead of giving class B (only 16384 such nets exist) give sufficient contiguous class C addresses to satisfy needs – < 256 addresses assign 1 class C – … – < 8192 addresses assign 32 contiguous Class C nets
  • 18. 18 • Since assigned contiguously, class C CIDR has same most significant bits & so only needs one routing table entry • CIDR block represented by a prefix and prefix length – Prefix = single address representing block of nets, e.g. • 192.32.136.0 = 11000000 00100000 10001000 00000000 while • 192.32.143.0 = 11000000 00100000 10001111 00000000 – Prefix length indicates number of routing bits, e.g. 192.32.136.0/21 means 21 bits used for routing (a 21 bit prefix leaves 11 host bits, i.e. 2048 host addresses) • CIDR collects all nets in range 192.32.136.0 through 192.32.143.0 into a single router entry – reduces router table entries • Removes address classes A, B & C boundaries • For more details see RFC 1519, CIDR & Supernetting
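The aggregation of 192.32.136.0 through 192.32.143.0 into one /21 entry can be reproduced with the standard `ipaddress` module. This is a sketch of the arithmetic only, not of how a router stores its table:

```python
import ipaddress

# The eight contiguous class C nets 192.32.136.0 ... 192.32.143.0.
class_c_nets = [
    ipaddress.ip_network(f"192.32.{third}.0/24") for third in range(136, 144)
]

# collapse_addresses merges adjacent networks into the fewest prefixes.
aggregated = list(ipaddress.collapse_addresses(class_c_nets))

print(aggregated)                   # [IPv4Network('192.32.136.0/21')]
print(aggregated[0].num_addresses)  # 2048
```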
  • 19. 19 Address Resolution Protocol (ARP) • IP address is at network layer, need to map it to the MAC (Ethernet address) link layer address • Use ARP to map 32 bit IP address to 48 bit Ethernet address – IP requests MAC address for IP address from local ARP table – If not there, then an ARP request packet for IP address is sent using physical broadcast address (all FFFs) – Host with requested IP address responds with its MAC address as a unicast packet – On return, host updates ARP table and returns MAC address – ARP cache times out – ARP packets are on top of Ethernet
  • 20. 20 ARP cont. • ARP requests are local only, do not cross routers • Compare local IP and subnet mask => local subnet • Compare local subnet to destination IP – if local, ARP for MAC address – else remote so • if ROUTE entry, ARP for router to subnet • if default route, ARP for default gateway • otherwise, drop packet & return error 134.79.10.17 134.79.15.3 134.79.15.1 134.79.10.1 User A User B Subnet 1 Subnet 2
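The decision procedure above (local subnet, specific route, default gateway, otherwise error) can be sketched as a small function that returns the IP address a host would ARP for. The function name, the `routes` dictionary and the 255.255.255.0 mask used in the example are assumptions for illustration; the slide gives only the addresses:

```python
import ipaddress


def arp_target(local_ip, netmask, dest_ip, routes=None, default_gateway=None):
    """Return the IP address whose MAC address the sender must resolve."""
    # Local IP combined with the subnet mask gives the local subnet.
    local_net = ipaddress.ip_interface(f"{local_ip}/{netmask}").network
    dest = ipaddress.ip_address(dest_ip)
    if dest in local_net:
        return dest_ip                      # local: ARP for the host itself
    for net, router in (routes or {}).items():
        if dest in ipaddress.ip_network(net):
            return router                   # remote with a ROUTE entry
    if default_gateway is not None:
        return default_gateway              # remote via the default route
    raise OSError("no route to host")       # drop packet & return error


# User A (134.79.10.17) sending to User B (134.79.15.3) on another subnet.
print(arp_target("134.79.10.17", "255.255.255.0", "134.79.15.3",
                 default_gateway="134.79.10.1"))  # 134.79.10.1
```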
  • 21. 21 Routing • Routers must select next hop for packet • Get route information from other routers via a routing protocol (RIP, OSPF, EIGRP etc.) • Note the following are non-routable: – private networks: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 – Loopback 127.0.0.0/8
  • 22. 22 ICMP Purpose (RFC 792) • Communicates control & error information – Between routers and hosts – Only reports to original source, suggests corrections – Error messages about error messages are not generated – Never generated due to multicasts • Packet format Type Code Checksum 0 8 16 31 ICMP data (depends on type/code) 24
  • 23. 23 Main ICMP message types
    Type  ICMP message
    0     Echo reply (ping)
    3     Destination unreachable (code 1 host, code 3 port, code 4 DF set and must fragment)
    4     Source quench
    5     Redirect (change a route)
    8     Echo request
    11    Time exceeded (code 0 ttl=0, code 1 reassembly)
    12    Parameter problems
  • 24. 24 ICMP Echo/Ping • Very commonly used diagnostic tool • Implementations vary between OS’ • Build echo request – Identifier used to match request to replies (e.g. pid) – Sequence number, starts at 0 increments by 1 for each ping packet • Used to detect loss, reorder, duplicates – Optional data, sent by requester, returned by replier • Usually contains a timestamp when the request was sent plus pad data Type=8 Code=0 Checksum 0 8 16 31 Identifier Sequence number Optional data 24
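The echo request layout above maps directly onto a fixed 8 byte header followed by optional data. The sketch below builds such a packet in memory with the standard Internet checksum (one's complement sum of 16 bit words); it does not open a socket, and the function names are illustrative:

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16 bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: Type=8, Code=0, checksum, id, seq, data."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=0, payload=b"abcd")
print(len(packet))                # 12
print(internet_checksum(packet))  # 0, a valid packet checksums to zero
```

Sending the packet would need a raw socket and usually administrator privileges, which is why the sketch stops at building the bytes.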
  • 25. 25 What do we learn from Ping • Host reachable – Host may respond to ping but not be running services • Round trip timing • Lost packets • Packet reordering, duplicate packets • Example:
    13cottrell@noric05:~>ping -c 4 lhr.comsats.net.pk
    PING lhr.comsats.net.pk (210.56.16.10) from 134.79.125.205 : 56(84) bytes of data.
    64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=0 ttl=242 time=716.962 msec
    64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=1 ttl=242 time=720.375 msec
    64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=2 ttl=242 time=725.907 msec
    64 bytes from lhr.comsats.net.pk (210.56.16.10): icmp_seq=3 ttl=242 time=710.734 msec
    --- lhr.comsats.net.pk ping statistics ---
    4 packets transmitted, 4 packets received, 0% packet loss
    round-trip min/avg/max/mdev = 710.734/718.494/725.907/5.566 ms
  • 26. 26 Unreachable 76cottrell@flora06:~>ping islamabad-server2.comsats.net.pk ICMP 13 Unreachable from gateway 207.45.205.18 for icmp from FLORA06.SLAC.Stanford.EDU (134.79.16.101) to islamabad-server2.comsats.net.pk (210.56.8.8) What does this mean, see exercise?
  • 27. 27 Time Exceeded • Time-to-live has expired at a router (code=0) – ttl sets bound on number of routers datagram can transit • Prevents infinite routing loops • Initialized by sender, decremented by 1 each time passes router • When ttl = 0 datagram thrown away & sender notified by ICMP message • Fragment reassembly timer (code=1) Type 11 Code Checksum 0 8 16 31 Unused Internet header & 8 bytes of data 24
  • 28. 28 MTU Discovery • Path MTUs vary • Fragmentation is bad • Small transmission units are bad • So need to discover optimum MTU (largest without fragmentation) • Host sends a packet with the Don’t Fragment bit set – Length is lesser of local MTU and MSS announced by remote system – If MTU between hosts requires fragmentation (e.g. at an intermediate router), then • if the DF bit is set & the packet must be fragmented then an ICMP message is sent back to source, saying “I can’t fragment” • try again with smaller size.
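The probe-and-retry loop can be illustrated with a toy simulation. It assumes the path is a known list of link MTUs and that each "can't fragment" message reports the MTU of the link that refused the packet (as RFC 1191 specifies); the function name and the MTU values are made up for the example:

```python
def discover_path_mtu(local_mtu: int, link_mtus: list[int]) -> tuple[int, int]:
    """Simulate path MTU discovery; return (path MTU, number of probes sent)."""
    size = local_mtu
    probes = 0
    while True:
        probes += 1
        # The first link too small for the packet drops it (DF bit is set)
        # and triggers the ICMP "must fragment" message.
        blocking = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocking is None:
            return size, probes      # packet reached the far end unfragmented
        size = blocking              # try again with the reported smaller size


print(discover_path_mtu(1500, [1500, 1006, 576, 1500]))  # (576, 3)
```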
  • 29. 29 User Datagram Protocol - UDP • RFC 768, Protocol 17 • Provides unreliable, connectionless delivery on top of IP • Minimal overhead, high performance – No setup/teardown, 1 datagram at a time • Application responsible for reliability – Includes datagram loss, duplication, delay, out-of-sequence, multiplexing, loss of connectivity [Figure: layering diagram. Network: IP, demux on IP protocol. Transport: TCP and UDP, demux on port number. App.: Port 1, Port 2 above each transport]
  • 30. 30 UDP Datagram format • Source/destination port: port numbers identify sending & receiving processes – Port number & IP address allow any application in any computer on Internet to be uniquely identified – Used to demultiplex datagrams to processes – Ports can be static or dynamic • Static (< 1024) assigned centrally, known as well known ports • Dynamic • Message length in bytes includes the UDP header and data Source port Destination port UDP message len Checksum (opt.) 0 8 16 31 24 Data …
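The four 16 bit header fields can be packed and unpacked with Python's `struct` module. In this sketch the checksum is left at zero, which in IPv4 means "not computed" (matching the "opt." on the slide); the function names and the port numbers in the example are illustrative:

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # source port, dest port, length, checksum


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8 byte UDP header (checksum left at 0)."""
    length = UDP_HEADER.size + len(payload)   # length covers header and data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into its header fields and data."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {"src_port": src, "dst_port": dst, "length": length,
            "checksum": checksum, "data": datagram[UDP_HEADER.size:length]}


datagram = build_udp_datagram(40000, 53, b"query")
print(parse_udp_datagram(datagram)["length"])  # 13
```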
  • 31. 31 UDP applications • Message oriented, e.g. SNMP, DNS, time • File system, e.g. NFS, AFS • Lightweight file transfer, e.g. tftp, bootp
  • 32. 32 Transmission Control Protocol -TCP • RFC 793 & host requirements RFC 1122 – Reliable stream transport • Connection oriented (full duplex virtual circuit) – Conceptually place call, two ends communicate to agree on details – After agreeing application notified of connection – During transfer, ends communicate continuously to verify data received correctly – When done, ends tear down the connection – If UDP is like regular mail, TCP is like phone call • Provides buffering and flow control • Takes care of lost packets, out of order, duplicates, long delays • Isolates application program from network details • Jargon – Segment = TCP packet – Socket= source (address + port) + destination (address + port)
  • 33. 33 TCP layering • To ID connection need: – Source: (address, port) AND Destination: (address, port) – Only need one port on host to allow multiple connections, since each connection will have different (host, port) at other end • E.g. single host can serve multiple telnet connections • Passive open: application contacts OS & indicates will accept incoming connection, OS assigns port and listens • Active open: application requests OS to connect to a (host, port) [Figure: layering diagram. Network: IP, demux on IP protocol (TCP is IP protocol 6). Transport: TCP and UDP, demux on port number. App.: Port 1, Port 2 above each transport]
  • 34. 34 TCP – providing reliability • Positive acknowledgement (ACK) with retransmission – Sender keeps record of each packet sent – Sender awaits an ACK – Sender starts timer when sends packet Send pkt 1 Rcv ACK 1 Send pkt 2 Rcv ACK 2 Network messages Rcv pkt 1 Rcv pkt 2 Send ACK 2 Send ACK 1 Sender site Receiver site Time
  • 35. 35 TCP – simple lost packet recovery Send pkt 1 Start timer ACK normally arrives Rcv ACK 1 Network messages Pkt should arrive Rcv pkt 1 Send ACK 1 ACK should be sent Sender site Receiver site Loss Timer expires Retransmit pkt 1 start timer
  • 36. 36 TCP – improving performance • BUT simple ACK protocol wastes bandwidth since it must delay sending next packet until it gets ACK • Use sliding window • Sender can send 4 packets of data without ACK – When sender gets ACK then can send another packet – Window = unacknowledged packets/bytes – Keeps timer for each packet 1 2 3 4 5 6 7 8 … Initial window of 4 packets 1 2 3 4 5 6 7 8 … Window slides Packets successfully sent Packets sent, awaiting ACK Packets to be sent
  • 37. 37 Tuning to fill pipe • Optimal window size depends on: – Bandwidth end to end, i.e. min(BWlinks) AKA bottleneck bandwidth – Round Trip Time (RTT) – For TCP keep pipe full • Window (sometimes called pipe) ~ RTT*BW – Can increase bandwidth by orders of magnitude • Windows also used for flow control [Figure: Src to Rcv timing diagram; t = bits in packet/link speed, RTT]
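The rule Window ~ RTT*BW is easy to turn into numbers. The sketch below computes the window needed to fill a path and, conversely, the throughput ceiling imposed by a given window. The 100 Mbit/s and 150 ms figures are assumed values for illustration, not measurements from the lecture:

```python
def window_for_full_pipe(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bytes that must be in flight to keep the pipe full (RTT * BW)."""
    return bandwidth_bps * rtt_ms / 8000      # bits -> bytes, ms -> s


def max_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput: one window delivered per round trip."""
    return window_bytes * 8000 / rtt_ms


# Assumed path: 100 Mbit/s bottleneck, 150 ms round trip time.
print(window_for_full_pipe(100_000_000, 150))  # 1875000.0 bytes
# A 65535 byte window on the same path caps throughput well below 100 Mbit/s.
print(max_throughput_bps(65535, 150))          # 3495200.0 bits/s
```

The second number shows why tuning the window can increase throughput by orders of magnitude on long paths.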
  • 38. 38 Implementation • Sliding window operates at byte level, NOT packet • Receiver keeps similar window to put stream back together • Since full duplex, altogether 4 windows & pointer sets 1 2 3 4 5 6 7 8 … Current window Highest byte that can be sent Bytes sent and acknowledged 3 pointers Highest byte sent
  • 39. 39 TCP flow control • Windows vary over time – Receiver advertises (in ACKs) how many it can receive • Based on buffers etc. available – Sender adjusts its window to match advertisement – If receiver buffers fill, it sends smaller adverts • Used to match buffer requirements of receiver • Also used to address congestion control (e.g. in intermediate routers)
  • 40. 40 TCP Segment format • Source/Dest port: TCP port numbers to ID applications at both ends of connection • Sequence number: ID position in sender’s byte stream Source port Destination port Sequence number 0 8 16 31 24 Acknowledgement number 4 Hlen 10 Resv Code Window Urgent ptr Checksum Options (if any) Padding Data if any …
  • 41. 41 TCP segment format – cont. • Acknowledgement: identifies the number of the byte the sender of this segment expects to receive next • Hlen: specifies the length of the segment header in 32 bit multiples. If there are no options, the Hlen = 5 (20 bytes) • Reserved for future use, set to 0 • Code: used to determine segment purpose, e.g. SYN, ACK, FIN, URG
  • 42. 42 TCP Segment format- cont • Window: Advertises how much data this station is willing to accept. Can depend on buffer space remaining. • Checksum: Verifies the integrity of the TCP header and data. It is mandatory. • Urgent pointer: used with the URG flag to indicate where the urgent data starts in the data stream. Typically used with a file transfer abort during FTP or when pressing an interrupt key in telnet. • Options: used for window scaling, SACK, timestamps, maximum segment size etc.
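The fixed 20 byte part of the segment format can be decoded with one `struct` call. The following is a minimal sketch: the function name and the sample values are illustrative, and only the six classic code bits are decoded:

```python
import struct

# Code bits in the low 6 bits of the 16 bit Hlen/Resv/Code word.
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")  # bit 0 ... bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed TCP header plus any options that follow it."""
    (src, dst, seq, ack, hlen_code, window,
     checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    hlen_words = hlen_code >> 12              # header length in 32 bit words
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if hlen_code & (1 << bit)]
    return {"src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
            "hlen": hlen_words, "flags": flags, "window": window,
            "checksum": checksum, "urgent": urgent,
            "options": segment[20:hlen_words * 4]}


# Illustrative SYN segment: Hlen=6 (one 4 byte option), SYN bit set.
sample = struct.pack("!HHIIHHHH", 1174, 23, 180839, 0,
                     (6 << 12) | 0x02, 8192, 0, 0) + bytes.fromhex("020405B4")
header = parse_tcp_header(sample)
print(header["hlen"], header["flags"], header["window"])  # 6 ['SYN'] 8192
```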
  • 43. 43 TCP timeout • Need a timeout estimate that will work for LANs (RTT < msec.) to satellite WANs (hundreds of msec. to secs). RTT can vary a lot with time of day, day of week, or one second to next. – TCP records time segment sent – and time ACK received – Then calculates RTT sample – Smooth & use to estimate timeout, e.g. • Timeout = beta * RTTs • Timeout = RTTs + eta{=4}*f(dev(RTTs)) – Needs to take account of losses, e.g. • New_timeout = gamma{=2} * timeout [Figure: plot against time of day, May 12th]
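One common concrete form of "smooth the RTT samples and add four times the deviation" is the estimator standardised in RFC 6298. The sketch below follows that RFC's update rules and its gains (1/8 for the smoothed RTT, 1/4 for the deviation); those constants come from the RFC, not from the slide, and the class name is illustrative:

```python
class RttEstimator:
    """Smoothed RTT and retransmission timeout, following RFC 6298."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25, k: float = 4.0):
        self.alpha, self.beta, self.k = alpha, beta, k
        self.srtt = None      # smoothed RTT (RTTs on the slide)
        self.rttvar = None    # smoothed mean deviation of the RTT
        self.timeout = None

    def sample(self, rtt: float) -> float:
        """Feed one RTT measurement (seconds); return the new timeout."""
        if self.srtt is None:                 # first measurement
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:                                 # deviation first, then the mean
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - rtt))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt
        self.timeout = self.srtt + self.k * self.rttvar
        return self.timeout

    def on_loss(self, gamma: float = 2.0) -> float:
        """Back off after a retransmission: new timeout = gamma * timeout."""
        if self.timeout is None:
            raise RuntimeError("no RTT sample recorded yet")
        self.timeout *= gamma
        return self.timeout


estimator = RttEstimator()
for rtt in (0.717, 0.720, 0.726, 0.711):      # RTTs in seconds
    timeout = estimator.sample(rtt)
print(round(estimator.srtt, 3), round(timeout, 3))
```

Real implementations also clamp the timeout to a minimum and maximum and ignore samples from retransmitted segments; both are omitted here for brevity.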
  • 44. 44 TCP connection establishment • 3 way handshake • Initial sequence numbers (x, y) are chosen randomly • Guarantees both sides ready & know it, and sets initial sequence numbers, also sets window & mss • Once connection established, data can flow in both directions, equally well, there is no master or slave Send SYN seq x Rcv SYN/ACK Send ACK y+1 Rcv SYN segment Rcv ACK segment Send SYN seq=y, ACK x+1 Site 1 Site 2
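The sequence and acknowledgement numbers exchanged in the handshake follow a fixed pattern once x and y are chosen. A small sketch that lists the three segments, assuming an illustrative function name and arbitrary initial sequence numbers:

```python
def three_way_handshake(x: int, y: int) -> list[dict]:
    """Return the three handshake segments for initial sequence numbers x, y."""
    mod = 2 ** 32                              # sequence numbers are 32 bits
    return [
        {"from": "site 1", "flags": "SYN", "seq": x, "ack": None},
        {"from": "site 2", "flags": "SYN/ACK", "seq": y, "ack": (x + 1) % mod},
        {"from": "site 1", "flags": "ACK", "seq": (x + 1) % mod,
         "ack": (y + 1) % mod},
    ]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

The SYN consumes one sequence number, which is why each side acknowledges the other's initial value plus one even though no data has been sent.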
  • 45. 45 TCP close connection • Modified 3 way handshake (or 4 way termination) • App tells TCP to close, TCP sends remaining data & waits for ACK, then sends FIN • Site 2 TCP ACKs FIN, tells its application “end of data” • Site 2 sends FIN when its app closes connection (may be long delay, e.g. require human interaction). (App closes) Send FIN seq=x Rcv ACK segment Rcv FIN segment Receive ACK segment Send ACK x+1 (inform app) Site 1 Site 2 Rcv FIN + ACK seg Send ACK y+1 (app closes connection) Send FIN seq=y, ACK x+1
  • 46. 46 More Information • Lectures, tutorials etc: – www.nv.cc.va.us/home/joney/tcp_ip.htm – www.cs.pdx.edu/~jrb/tcpip.lectures.html – www.raleigh.ibm.com/cgi-bin/bookmgr/BOOKS/EZ306200/CCONTENTS – www.cisco.com/univercd/cc/td/doc/product/iaabu/centri4/user/scf4ap1.htm – www.cis.ohio-state.edu/htbin/rfc/rfc1180.html – www.jbmelectronics.com/tcp.htm • Encyclopaedia – http://www.freesoft.org/CIE/index.htm • TCP/IP Resources – www.private.org.il/tcpip_rl.html • Understanding IP addresses – http://www.3com.com/solutions/en_US/ncs/501302.html • Configuring TCP (RFC 1122) – ftp://nic.merit.edu/internet/documents/rfc/rfc1122.txt • Assigned protocols, ports etc (RFC 1010) – http://www.es.net/pub/rfcs/rfc1010.txt & /etc/protocols
  • 47. 47 Example: 3 way handshake • atlas> telnet sunstats.cern.ch – atlas is a WNT PC, sunstats is a Sun Solaris 5.6 host – MSS is set in TCP option in a SYN segment, communicates the MSS the sender wants to receive – len=ip_hlen/tcp_hlen:ip_total_len – Initial Sequence Numbers are randomly selected – Telnet = port 23 – W=Receive window size advertises how much data this host will accept
  • 48. 48 Example: 3 way handshake - cont. • TCP from atlas:1174 to sunstats:23 seq=180839, A=0, W=8192, SYN [len=5/6:44, opt=020405B4 <opt=2, len=4, mss=0x5B4=1460>] • TCP from sunstats:23 to atlas:1174 seq=1383568304, A=180840, W=64240, SYN/ACK [len=5/6:44, opt=020405B4] • TCP from atlas:1174 to sunstats:23 seq=180840, A=1383568305, W=8760 [len=5/5:40, opt=nul] – Notice window size can vary from segment to segment depending on buffer space available – Notice smaller PC window advertisement – Notice ephemeral port selected by telnet client – Notice each side acknowledges the next expected byte (= seq+1) – 0x020405B4: 02 = option type, 04 = len, 0x5B4 = 1460
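The option bytes 0x020405B4 in the trace can be decoded by walking the option list as type/length/value records. A minimal sketch that extracts only the maximum segment size and skips other options; the function name is illustrative:

```python
def parse_mss_option(options: bytes):
    """Return the MSS from a TCP option list, or None if it is absent."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                 # end of option list
            break
        if kind == 1:                 # no-operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):     # truncated option, stop parsing
            break
        length = options[i + 1]       # length covers type and length bytes
        if length < 2:                # malformed option, stop parsing
            break
        if kind == 2 and length == 4: # MSS: type 2, len 4, 16 bit value
            return int.from_bytes(options[i + 2:i + 4], "big")
        i += length
    return None


print(parse_mss_option(bytes.fromhex("020405B4")))  # 1460
```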
  • 49. 49 Session start SLAC>CERN: 256kbyte window,1 stream, full speed > 30msec, 13MBytes in 20s, 5.1MBytes/s Rcvr Advertised window Acks returned by Rcvr Segments sent Congestion window