How LinkedIn used TCP Anycast to make the site faster
Ritesh Maheshwari, Shawn Zandi
  
Anycast
•  Anycast provides a distributed service via routing.
•  It is not really different from unicast.
•  NLRI object with multiple next-hops.
•  It simply works for both TCP and UDP applications (use with caution!).
  
	
  	
  
	
  
[Diagram: Bob and three sites, SF, CHI, and NYC, each labeled www.linkedin.com 2001:db8::1/56]
  
Anycast with ECMP
•  Not a real issue in today’s internet
•  Consistent flow routing is required (per-packet load balancing breaks Anycast) – Pretty Much Standard
•  Most BGP implementations do not load balance across different AS-PATHs, even when the paths have the same length.
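To make the "consistent flow routing" requirement concrete, here is a minimal sketch of per-flow versus per-packet next-hop selection. It does not describe any particular router: the function names, the next-hop labels, the client address, and the use of SHA-256 as the hash are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical next hops toward the same anycast prefix (labels are illustrative).
NEXT_HOPS = ["pop-a", "pop-b", "pop-c"]


def pick_next_hop(src_ip, src_port, dst_ip, dst_port, proto, next_hops=NEXT_HOPS):
    """Per-flow selection: hash the 5-tuple so every packet of a flow
    maps to the same next hop. Real routers use their own hash functions;
    SHA-256 is an assumption, picked because it is deterministic and in the stdlib."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


def per_packet_round_robin(packet_count, next_hops=NEXT_HOPS):
    """Per-packet balancing: consecutive packets of one flow rotate across
    next hops, so a TCP session can reach servers that hold no state for it."""
    return [next_hops[i % len(next_hops)] for i in range(packet_count)]


# One TCP flow from an example client to the anycast address used later in the deck.
flow = ("198.51.100.7", 51515, "108.174.13.10", 443, "tcp")
hops_for_flow = {pick_next_hop(*flow) for _ in range(10)}
print(hops_for_flow)                     # a single next hop for all 10 packets
print(set(per_packet_round_robin(10)))   # all three next hops
```

With per-flow hashing, the ten packets of the flow all take one path, which is what a TCP connection to an anycast address needs.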
  
	
  
Anycast Complications
•  Broken MTU Challenges
•  ICMP message may not reach the intended receiver to report MTU problem. Adjusting MSS can help.
•  RPF Checks
•  Multiple covering prefixes - Only one Service Address should be covered by each advertised prefix /24 or /56
•  Monitoring!
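As a worked example of the "adjusting MSS" point, the sketch below computes the TCP MSS that fits a given MTU, using the standard minimum header sizes (20 bytes for IPv4, 40 for IPv6, 20 for TCP, all without options). The function name and the 1400-byte path MTU are assumptions for illustration, not values from the talk.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
TCP_HEADER = 20   # bytes, TCP header without options


def mss_for_mtu(mtu, ipv6=False):
    """Largest TCP payload that fits in one packet of size `mtu`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(mss_for_mtu(1500))             # 1460
print(mss_for_mtu(1500, ipv6=True))  # 1440
print(mss_for_mtu(1400))             # 1360 (hypothetical reduced path MTU)
```

Advertising a smaller MSS keeps segments under the path MTU, so delivery does not depend on an ICMP "packet too big" message reaching the right anycast instance.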
  	
  
	
  	
  
	
  
 
But!
How to measure Anycast effectiveness?
  
What is RUM?
RUM (Real User Monitoring): JavaScript (client-side code) to measure performance
•  DNS Time
•  Connection Time
•  First Byte Time
•  Download Time
•  Page Load Time
  
What are PoPs?
Point of Presence / PoP
•  Small-scale data centers
•  Proxy servers at LinkedIn (ATS, Apache Traffic Server)
  
Without PoPs
[Diagram: Browser and Data Center]
•  Connection time: 250ms
•  First byte time + page download time: 3-5 round trips; 5 RTTs = 5 x 250ms = 1250ms
•  Server compute time: 500ms
Total = 2000ms
  
With PoPs
[Diagram: Browser, PoP, and Data Center; labeled latencies 100ms and 250ms]
•  Connection time: 100ms (one round trip)
•  PoP to Data Center: Old TCP Connection
•  First byte time + page download time: 5 RTTs = 5 x 100ms = 500ms
•  Server compute time: 500ms
Total = 1100ms, a 900 ms gain!
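The arithmetic behind the two totals can be reproduced with a small sketch. The function name and parameter names are assumptions; the numbers are the ones on the slides (5 round trips, 500ms of server compute time, and a 250ms versus 100ms round trip).

```python
def page_time_ms(connect_ms, rtt_ms, round_trips, server_ms):
    """Connection setup + round trips for first byte and download + server compute."""
    return connect_ms + round_trips * rtt_ms + server_ms


# Without PoPs: the browser talks to the data center at 250ms per round trip.
without_pops = page_time_ms(connect_ms=250, rtt_ms=250, round_trips=5, server_ms=500)

# With PoPs: the browser talks to a nearby PoP at 100ms per round trip.
with_pops = page_time_ms(connect_ms=100, rtt_ms=100, round_trips=5, server_ms=500)

print(without_pops, with_pops, without_pops - with_pops)  # 2000 1100 900
```

Server compute time is unchanged in both cases; the whole gain comes from shortening the round trips the browser has to wait for.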
  
How are users assigned to PoPs?
Through DNS:
 IP handed out based on the country of the user’s resolver

# Spain
$ dig @109.69.8.51 +short www.linkedin.com
91.225.248.80

# California
$ dig +short www.linkedin.com
216.52.242.80
  
	
  
Should India connect to Singapore or Dublin?
How to assure optimal PoP assignment?
  
	
  
	
  
RUM beacons
Fetch a tiny object from each candidate PoP

For each pop_name,
1.  Start timer
2.  Fetch {pop_name}.perf.linkedin.com/pop/admin
3.  Stop timer
Send data back to our servers
•  Millions of agents!
•  Analyze data to find “optimal” PoP per country
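The beacon loop above runs as JavaScript in the browser. As a rough Python analogue, the sketch below times one fetch per candidate PoP. The function names, the `https://` scheme, the 5-second timeout, and the PoP names in the demo are assumptions; only the `{pop_name}.perf.linkedin.com/pop/admin` pattern comes from the slide. The fetcher is injectable so the demo runs without touching the network.

```python
import time
from urllib.request import urlopen


def http_fetch(url):
    """Default fetcher: download the tiny object and discard the body."""
    with urlopen(url, timeout=5) as response:
        response.read()


def time_beacons(pop_names, fetch=http_fetch, clock=time.perf_counter):
    """Return {pop_name: elapsed milliseconds} for each PoP that answered."""
    results = {}
    for pop_name in pop_names:
        url = f"https://{pop_name}.perf.linkedin.com/pop/admin"  # scheme is an assumption
        start = clock()                 # 1. start timer
        try:
            fetch(url)                  # 2. fetch the beacon object
        except OSError:
            continue                    # unreachable PoP: record nothing for it
        results[pop_name] = (clock() - start) * 1000.0  # 3. stop timer
    return results


def fake_fetch(url):
    """Stand-in for the network so the example is self-contained."""
    time.sleep(0.01)


timings = time_beacons(["pop-a", "pop-b"], fetch=fake_fetch)  # hypothetical PoP names
print(sorted(timings))  # ['pop-a', 'pop-b']
```

In the real setup the resulting timings would be sent back to the servers along with the rest of the RUM data.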
  
We can assign countries to new PoPs!
Country   PoP         Median Beacon Time (ms)
China     Hong Kong   434
China     Dublin      1216
China     Singapore   515
India     Hong Kong   1368
India     Dublin      1042
India     Singapore   898
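Picking the "optimal" PoP from this table amounts to taking, per country, the PoP with the lowest median beacon time. A minimal sketch using the six rows above follows; the function and variable names are assumptions.

```python
# (country, PoP, median beacon time in ms), copied from the table above.
MEDIAN_BEACON_MS = [
    ("China", "Hong Kong", 434),
    ("China", "Dublin", 1216),
    ("China", "Singapore", 515),
    ("India", "Hong Kong", 1368),
    ("India", "Dublin", 1042),
    ("India", "Singapore", 898),
]


def optimal_pop_per_country(rows):
    """For each country, keep the PoP with the smallest median beacon time."""
    best = {}
    for country, pop, median_ms in rows:
        if country not in best or median_ms < best[country][1]:
            best[country] = (pop, median_ms)
    return {country: pop for country, (pop, _) in best.items()}


print(optimal_pop_per_country(MEDIAN_BEACON_MS))
# {'China': 'Hong Kong', 'India': 'Singapore'}
```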
  
We can audit current assignment!
Country       Is PoP optimal?   Current PoP     Optimal PoP
India         TRUE              Singapore       Singapore
Pakistan      FALSE             Singapore       Dublin
Spain         TRUE              Dublin          Dublin
Brazil        FALSE             US West Coast   US East Coast
Netherlands   TRUE              Dublin          Dublin
UAE           FALSE             US West Coast   Dublin
Italy         TRUE              Dublin          Dublin
Mexico        TRUE              US West Coast   US West Coast
Russia        FALSE             US West Coast   Dublin
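The audit is a straight comparison of the current assignment against the beacon-derived optimum. The sketch below reproduces the "Is PoP optimal?" column from the other two; the data is copied from the table, while the function and variable names are assumptions.

```python
# country: (current PoP, optimal PoP), copied from the audit table.
ASSIGNMENTS = {
    "India": ("Singapore", "Singapore"),
    "Pakistan": ("Singapore", "Dublin"),
    "Spain": ("Dublin", "Dublin"),
    "Brazil": ("US West Coast", "US East Coast"),
    "Netherlands": ("Dublin", "Dublin"),
    "UAE": ("US West Coast", "Dublin"),
    "Italy": ("Dublin", "Dublin"),
    "Mexico": ("US West Coast", "US West Coast"),
    "Russia": ("US West Coast", "Dublin"),
}


def audit(assignments):
    """Return {country: True if the current PoP is the optimal one}."""
    return {country: current == optimal
            for country, (current, optimal) in assignments.items()}


to_reassign = [country for country, ok in audit(ASSIGNMENTS).items() if not ok]
print(to_reassign)  # ['Pakistan', 'Brazil', 'UAE', 'Russia']
```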
  
[Chart: LinkedIn Homepage Download Time Improvement. Percentage improvement (axis 0% to 30%), median and 90th percentile, for India, Pakistan, Singapore, Russia, and Brazil.]
Plot Twist:
Assignment far from optimal
•  About 31% of US traffic gets assigned to a suboptimal PoP.
– 45% of East Coast
•  About 10% of traffic globally gets assigned to a suboptimal PoP.
  
DNS PoP assignment is suboptimal
•  Assignment based on Resolver IP, not Client IP
•  Bad IP to Geo databases
– Resolver really in NY, but database says CA
[Diagram: DNS Resolver, PoP US East, and PoP US West; New York and California]
  
Story so far
1.  We built PoPs
2.  …used RUM to assign users to Optimal PoPs
3.  …found DNS based assignment is suboptimal
  
Accurate PoP assignment Problem
•  Bug our DNS providers (31% -> 27%)
•  Run our own DNS
	
  
How about Anycast?
  
Anycast – One IP, Multiple Servers
[Diagram: Bob and PoP A, PoP B, and PoP C, each labeled 1.1.1.1]
✓ Client IP, not Resolver IP used!
✓ No Geo-IP Databases
  
	
  
How does Anycast compare to DNS?
Will anycast send more users to optimal PoP?
➢ Let’s test it!
  
RUM to rescue
For each PoP:
1.  Announce same anycast IP (108.174.13.10)
2.  Configure a domain ac.perf.linkedin.com to point to 108.174.13.10
For each page view:
1.  RUM downloads a tiny object: ac.perf.linkedin.com/pop/admin
2.  Read X-Li-Pop response header to record which PoP served the object
3.  Send this back to LinkedIn with RUM data
Data:
1.  For each user, the anycast PoP
2.  For each user, the optimal PoP (from pop beacons)
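Given those two data points per user, the share of users that anycast sends to their optimal PoP can be computed per region. The sketch below shows one way to do that. The function name, the record layout, and the sample records are assumptions made up for illustration; they are not measurements from the experiment.

```python
from collections import defaultdict


def percent_optimal(records):
    """records: iterable of (region, serving_pop, optimal_pop).
    Returns {region: percentage of records where the serving PoP was optimal}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for region, serving_pop, optimal_pop in records:
        totals[region] += 1
        if serving_pop == optimal_pop:
            hits[region] += 1
    return {region: 100.0 * hits[region] / totals[region] for region in totals}


# Made-up sample: serving PoP as read from the X-Li-Pop header,
# optimal PoP as derived from the per-PoP beacons.
sample = [
    ("Illinois", "US East", "US East"),
    ("Illinois", "US West", "US East"),
    ("Florida", "US East", "US East"),
    ("Florida", "US East", "US East"),
]
print(percent_optimal(sample))  # {'Illinois': 50.0, 'Florida': 100.0}
```

Running the same computation on the DNS-assigned PoP instead of the anycast PoP gives the two percentages that can then be compared side by side.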
  
Results :)
Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Illinois            70                         90
Florida             73                         95
Georgia             75                         93
Pennsylvania        85                         95
  
Results :(
Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Arizona             60                         39
Brazil              88                         33
New York            77                         74
  
Fewer hops != Lower Latency
•  Carriers prefer to haul packets within their own network
•  Peering can create inter-continental short cuts
[Diagram: Alice and nodes X, Y, and Z; 1.1.1.1 shown three times; an inter-continental link]
  
Maybe DNS wasn’t so bad
•  Continent-level assignments
•  City / State level assignments
  
“Regional” Anycast
•  DNS-based
•  1 anycast IP per continent
•  Ran a RUM experiment, all was fine
[Diagram: Alice and nodes X, Y, and Z; addresses 2.2.2.2, 1.1.1.1, 1.1.1.1; an inter-continental link]
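One way to picture "regional" anycast is as a two-step choice: DNS picks the continent's anycast IP, and routing then picks the PoP within that continent. The sketch below covers only the DNS step. Every mapping in it is an assumption for illustration: 1.1.1.1 and 2.2.2.2 are the placeholder addresses from the diagram, and which continent gets which address, the country codes, and the fallback are invented here.

```python
# Hypothetical country-to-continent table (a real one would cover every country).
COUNTRY_TO_CONTINENT = {
    "US": "North America",
    "MX": "North America",
    "ES": "Europe",
    "NL": "Europe",
}

# One anycast IP per continent; addresses are the diagram's placeholders.
CONTINENT_ANYCAST_IP = {
    "North America": "1.1.1.1",
    "Europe": "2.2.2.2",
}

DEFAULT_ANYCAST_IP = "1.1.1.1"  # assumed fallback for unmapped countries


def resolve_regional_ip(resolver_country):
    """DNS step: return the anycast IP for the continent of the resolver's country."""
    continent = COUNTRY_TO_CONTINENT.get(resolver_country)
    return CONTINENT_ANYCAST_IP.get(continent, DEFAULT_ANYCAST_IP)


print(resolve_regional_ip("US"))  # 1.1.1.1
print(resolve_regional_ip("ES"))  # 2.2.2.2
print(resolve_regional_ip("ZZ"))  # 1.1.1.1 (fallback)
```

Because each address is only announced from PoPs on one continent, a route that crosses an inter-continental link can no longer pull a user to a PoP on the wrong continent.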
  
USA Ramp Results
[Chart: % Traffic going to Optimal PoP (axis 50 to 100) by Date, 20141207 to 20141217, for Illinois, Florida, North Carolina, Indiana, NY, NJ, VA, WV, and LA.]
  
Ramp outside USA
In progress
  
Story so far
1.  We built PoPs
2.  …used RUM to assign users to Optimal PoPs
3.  …found DNS based assignment is suboptimal
4.  …evaluated Anycast as a solution using RUM
5.  …now using Anycast to assign users to PoPs
Next play:
•  Build more PoPs!

Story: The End
Learnings
•  Clients are your measurement agents
•  Trust, but verify
•  You can have a bigger impact if you collaborate
Next Play
•  Keep evaluating Anycast
•  Keep building new PoPs
  
©2014 LinkedIn Corporation. All Rights Reserved.
How LinkedIn used TCP Anycast to make the site faster

  • 2. How  LinkedIn  used  TCP  Anycast  to  make   the  site  faster   Ritesh  Maheshwari                              Shawn  Zandi  
  • 3. Anycast   •  Anycast  provides  a  distributed  service  via  rou8ng.   •  It  is  not  really  different  than  unicast.   •  NLRI  object  with  mul8ple  next-­‐hops.   •  It  simply  works  for  both  TCP  and  UDP  applica8ons.  (use   with  cau8ons!)        
  • 4. SF   CHI   NYC   Bob   www.linkedin.com   2001:db8::1/56   www.linkedin.com   2001:db8::1/56   www.linkedin.com   2001:db8::1/56  
  • 5. Anycast  with  ECMP   •  Not  a  real  issue  in  today’s  internet   •  Consistent  flow  rou8ng  is  required  (per  packet  load   balancing  breaks  Anycast)  –  Pre_y  Much  Standard   •  Most  BGP  implementa8ons  do  not  load  balance  across   different  AS-­‐PATHs  even  with  same  size.    
  • 6. Anycast  Complica8ons   •  Broken  MTU  Challenges   •  ICMP  message  may  not  reach  the  intended  receiver   to  report  MTU  problem.  Adjus8ng  MSS  can  help.   •  RPF  Checks   •  Mul8ple  covering  prefixes  -­‐  Only  one  Service  Address   should  be  covered  by  each  adver8sed  prefix  /24  or  /56   •  Monitoring!          
  • 7.   But!   How  to  measure  Anycast  effec8veness?  
  • 8. What  is  RUM?     JavaScript  (Client-­‐code)  to  measure   performance     •  DNS  Time   •  Connec8on  8me   •  First  Byte  Time   •  Download  Time   •  Page  Load  Time  
  • 9. What  are  PoPs?     Point  of  Presence  /  PoP   •  Small-­‐scale  data  centers   •  Proxy  servers  at  LinkedIn  (ATS)  
  • 10. Without  PoPs   Browser   Data  Center   connec8on  8me   250ms  
  • 11. Without  PoPs   Browser   Data  Center   connec8on  8me   server   compute   8me   250ms   500ms  
  • 12. Without  PoPs   Browser   Data  Center   connec8on  8me   3-­‐5  round  trips   first     byte    8me     +   page   download   8me   5  RTTs  =  5x250ms  =  1250ms   server   compute   8me   250ms   Total  =  2000ms   500ms  
  • 13. With  PoPs   Browser   Data  Center  PoP   100ms   250ms  
  • 14. With  PoPs   Browser   Data  Center  PoP   100ms  connec8on  8me   Old  TCP  Connec8on  
  • 15. With  PoPs   Browser   Data  Center  PoP   100ms  connec8on  8me   one  round  trip   first     byte    8me     +   page   download   8me   Old  TCP  Connec8on   server   compute   8me   500ms  
  • 16. With  PoPs   Browser   Data  Center  PoP   100ms  connec8on  8me   one  round  trip   5  RTTs  =  5x100ms  =  500ms   Total  =  1100ms  900  ms  gain!   first     byte    8me     +   page   download   8me   Old  TCP  Connec8on   500ms   server   compute   8me  
  • 17. How are users assigned to PoPs?
      Through DNS: the IP is handed out based on the country of the user’s resolver
      # Spain
      $ dig @109.69.8.51 +short www.linkedin.com
      91.225.248.80
      # California
      $ dig +short www.linkedin.com
      216.52.242.80
  • 18. Should India connect to Singapore or Dublin?
      How to assure optimal PoP assignment?
  • 19. RUM beacons
      Fetch a tiny object from each candidate PoP
      For each pop_name:
        1. Start timer
        2. Fetch {pop_name}.perf.linkedin.com/pop/admin
        3. Stop timer
      Send data back to our servers
      • Millions of agents!
      • Analyze data to find the “optimal” PoP per country
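The beacon loop can be sketched as follows. The real beacon is JavaScript running in the browser; this Python version only illustrates the timing logic. The `https://` scheme, the PoP names `pop-a` and `pop-b`, and the `fake_fetch` stand-in are assumptions made so the sketch runs offline.

```python
import time
from typing import Callable, Dict, Iterable


def measure_pops(
    pop_names: Iterable[str],
    fetch: Callable[[str], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Time one fetch of the tiny beacon object per PoP; returns milliseconds."""
    timings_ms = {}
    for pop_name in pop_names:
        url = f"https://{pop_name}.perf.linkedin.com/pop/admin"  # scheme is assumed
        start = clock()                                    # 1. start timer
        fetch(url)                                         # 2. fetch the object
        timings_ms[pop_name] = (clock() - start) * 1000.0  # 3. stop timer
    return timings_ms


def fake_fetch(url: str) -> None:
    """Stand-in for a real HTTP GET so the sketch runs offline."""
    time.sleep(0.001)


beacon = measure_pops(["pop-a", "pop-b"], fake_fetch)
# `beacon` would then be sent back to the collection servers with the RUM data.
```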
  • 20. We can assign countries to new PoPs!
      Country   PoP         Median Beacon Time (ms)
      China     Hong Kong   434
      China     Dublin      1216
      China     Singapore   515
      India     Hong Kong   1368
      India     Dublin      1042
      India     Singapore   898
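Picking the "optimal" PoP from this data amounts to taking, per country, the PoP with the lowest median beacon time. A short sketch using the numbers from slide 20; the function name and data layout are illustrative assumptions.

```python
from typing import Dict, Iterable, Tuple

# (country, PoP, median beacon time in ms), copied from slide 20.
MEDIANS = [
    ("China", "Hong Kong", 434),
    ("China", "Dublin", 1216),
    ("China", "Singapore", 515),
    ("India", "Hong Kong", 1368),
    ("India", "Dublin", 1042),
    ("India", "Singapore", 898),
]


def optimal_pops(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """Return the PoP with the lowest median beacon time for each country."""
    best: Dict[str, Tuple[str, int]] = {}
    for country, pop, median_ms in rows:
        if country not in best or median_ms < best[country][1]:
            best[country] = (pop, median_ms)
    return {country: pop for country, (pop, _) in best.items()}


print(optimal_pops(MEDIANS))  # {'China': 'Hong Kong', 'India': 'Singapore'}
```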
  • 21. We can audit current assignment!
      Country       Is PoP optimal?   Current PoP     Optimal PoP
      India         TRUE              Singapore       Singapore
      Pakistan      FALSE             Singapore       Dublin
      Spain         TRUE              Dublin          Dublin
      Brazil        FALSE             US West Coast   US East Coast
      Netherlands   TRUE              Dublin          Dublin
      UAE           FALSE             US West Coast   Dublin
      Italy         TRUE              Dublin          Dublin
      Mexico        TRUE              US West Coast   US West Coast
      Russia        FALSE             US West Coast   Dublin
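The audit is a per-country comparison of the PoP currently handed out by DNS against the PoP the beacons found optimal. A minimal sketch with the rows from slide 21; the dictionary layout and the `audit` function are illustrative assumptions.

```python
from typing import Dict

# Current and optimal PoP per country, copied from slide 21.
CURRENT_POP = {
    "India": "Singapore", "Pakistan": "Singapore", "Spain": "Dublin",
    "Brazil": "US West Coast", "Netherlands": "Dublin", "UAE": "US West Coast",
    "Italy": "Dublin", "Mexico": "US West Coast", "Russia": "US West Coast",
}
OPTIMAL_POP = {
    "India": "Singapore", "Pakistan": "Dublin", "Spain": "Dublin",
    "Brazil": "US East Coast", "Netherlands": "Dublin", "UAE": "Dublin",
    "Italy": "Dublin", "Mexico": "US West Coast", "Russia": "Dublin",
}


def audit(current: Dict[str, str], optimal: Dict[str, str]) -> Dict[str, bool]:
    """Map each country to True when its current PoP is the optimal one."""
    return {country: current[country] == optimal[country] for country in current}


misassigned = sorted(c for c, ok in audit(CURRENT_POP, OPTIMAL_POP).items() if not ok)
print(misassigned)  # ['Brazil', 'Pakistan', 'Russia', 'UAE']
```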
  • 22. LinkedIn Homepage Download Time Improvement
      [Chart: Percentage Improvement (y-axis, 0% to 30%) for India, Pakistan, Singapore, Russia, and Brazil. Series: Median Improvement, 90th Percentile Improvement]
  • 25. Plot Twist: Assignment far from optimal
      • About 31% of US traffic gets assigned to a suboptimal PoP.
          - 45% of East Coast
      • About 10% of traffic globally gets assigned to a suboptimal PoP.
  • 26. DNS PoP assignment is suboptimal
      • Assignment based on Resolver IP, not Client IP
      [Diagram: DNS Resolver, PoP US East, PoP US West, New York, California]
  • 27. DNS PoP assignment is suboptimal
      • Assignment based on Resolver IP, not Client IP
      • Bad IP-to-Geo databases
          - Resolver really in NY, but database says CA
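Both failure modes can be shown with a toy model of geo-based DNS assignment. Everything here is hypothetical: the resolver addresses come from documentation ranges, and `GEO_DB`, `POP_FOR_REGION`, and `assign_pop` are made-up names, not LinkedIn's or any DNS provider's actual logic.

```python
from typing import Dict

POP_FOR_REGION: Dict[str, str] = {"NY": "US East", "CA": "US West"}

# Hypothetical geo-IP database keyed by resolver IP (documentation addresses).
GEO_DB: Dict[str, str] = {
    "192.0.2.53": "CA",     # resolver really located in California
    "198.51.100.53": "CA",  # resolver really in New York, but the entry is wrong
}


def assign_pop(resolver_ip: str) -> str:
    """The authoritative DNS server only sees the resolver's address."""
    return POP_FOR_REGION[GEO_DB[resolver_ip]]


# Failure 1: a New York client configured to use a California resolver.
ny_client_remote_resolver = assign_pop("192.0.2.53")
# Failure 2: a New York client with a local resolver that the database misplaces.
ny_client_bad_geo = assign_pop("198.51.100.53")

print(ny_client_remote_resolver, ny_client_bad_geo)
```

In both cases the New York client is sent to US West, because the client's own address never enters the decision.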
  • 28. Story so far
      1. We built PoPs
      2. …used RUM to assign users to Optimal PoPs
      3. …found DNS-based assignment is suboptimal
  • 29. Accurate PoP assignment Problem
      • Bug our DNS providers (31% -> 27%)
      • Run our own DNS
      How about Anycast?
  • 30. Anycast: One IP, Multiple Servers
      [Diagram: Bob, with PoP A, PoP B, and PoP C each announcing 1.1.1.1]
      ✓ Client IP, not Resolver IP, used!
      ✓ No Geo-IP databases
  • 31. How does Anycast compare to DNS?
      Will anycast send more users to the optimal PoP?
      → Let’s test it!
  • 32. RUM to the rescue
      For each PoP:
        1. Announce the same anycast IP (108.174.13.10)
        2. Configure a domain, ac.perf.linkedin.com, to point to 108.174.13.10
  • 33. RUM to the rescue
      For each page view:
        1. RUM downloads a tiny object: ac.perf.linkedin.com/pop/admin
        2. Read the X-Li-Pop response header to record which PoP served the object
        3. Send this back to LinkedIn with RUM data
      Data:
        1. For each user, the anycast PoP
        2. For each user, the optimal PoP (from PoP beacons)
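With those two data points per user, the comparison metric is the share of users whose anycast PoP matches their optimal PoP. A sketch of that analysis step; the function names are assumptions, and the sample records are synthetic placeholders, not measured data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Optional, Tuple


def served_by(headers: Mapping[str, str]) -> Optional[str]:
    """Return the PoP named in the X-Li-Pop response header, if present."""
    for name, value in headers.items():
        if name.lower() == "x-li-pop":  # header names are case-insensitive
            return value
    return None


def percent_optimal(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """records: (region, anycast PoP, optimal PoP) per user; returns % optimal."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for region, anycast_pop, optimal_pop in records:
        totals[region] += 1
        if anycast_pop == optimal_pop:
            hits[region] += 1
    return {region: 100.0 * hits[region] / totals[region] for region in totals}


# Synthetic records for illustration only.
SAMPLE = [
    ("region-1", "pop-a", "pop-a"),
    ("region-1", "pop-b", "pop-a"),
    ("region-2", "pop-a", "pop-a"),
    ("region-2", "pop-a", "pop-a"),
]
result = percent_optimal(SAMPLE)
print(result)  # {'region-1': 50.0, 'region-2': 100.0}
```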
  • 34. Results :)
      Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
      Illinois            70                         90
      Florida             73                         95
      Georgia             75                         93
      Pennsylvania        85                         95
  • 35. Results :(
      Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
      Arizona             60                         39
      Brazil              88                         33
      New York            77                         74
  • 37. Fewer hops != Lower Latency
      • Carriers prefer to haul packets within their own network
      • Peering can create inter-continental shortcuts
      [Diagram: Alice and sites X, Y, and Z, each announcing 1.1.1.1, with an inter-continental link]
  • 38. Maybe DNS wasn’t so bad
      Continent-level assignments
      City / State level assignments
  • 39. “Regional” Anycast
      DNS-based
      1 anycast IP per continent
      Ran a RUM experiment, all was fine
      [Diagram: Alice and sites X, Y, and Z; anycast IPs shown: 2.2.2.2, 1.1.1.1, 1.1.1.1; inter-continental link]
  • 40. USA Ramp Results
      [Chart: % Traffic going to Optimal PoP (y-axis, 50 to 100) by Date (x-axis, 20141207 to 20141217). Series: Illinois, Florida, North Carolina, Indiana, NY, NJ, VA, WV, LA]
      Ramp outside USA: in progress
  • 41. Story so far
      1. We built PoPs
      2. …used RUM to assign users to Optimal PoPs
      3. …found DNS-based assignment is suboptimal
      4. …evaluated Anycast as a solution using RUM
      5. …now using Anycast to assign users to PoPs
      Next play:
      • Build more PoPs!
  • 42. Story: The End
      Learnings
      • Clients are your measurement agents
      • Trust, but verify
      • You can have a bigger impact if you collaborate
      Next Play
      • Keep evaluating Anycast
      • Keep building new PoPs
  • 43. ©2014 LinkedIn Corporation. All Rights Reserved.