SlideShare a Scribd company logo
OpenSample 
A Low-latency, Sampling-based Measurement 
Platform for Software Defined Data Center 
Junho Suh†, Ted “Taekyoung” Kwon†, 
Colin Dixon‡, Wes Felter‡, and John Carter‡ 
†Seoul National University 
‡IBM Research Austin 
ICDCS'14@Madrid 1
Software Defined Networking (SDN) 
OS 
routing 
VPN 
… 
monitoring 
Control / 
management 
functions 
Embedded OS 
Switching ASIC 
Open Interface 
Network OS 
Open Interface 
routing 
VPN 
… 
monitoring 
CISCO Juniper 
2 
Legacy SDN 
SDN 
Measurement 
Control 
Decision 
ICDCS'14@Madrid
X86 64bits High-performance 
mainframe 
3 
Control Loop in 
Software Defined Networking 
• Control loop 
SDN 
Decision 
(100us) 
Measurement 
Control 
(100ms ~ 1sec+) 
(10ms) 
• Measurement is a bottleneck 
– High latency of control loopDC 
PerformanceApp performance 
ICDCS'14@Madrid 
Open Interface 
Network OS 
Open Interface 
routing 
VPN 
… 
monitoring 
IBM RackSwitch G8264
Control Loop in 
Software Defined Networking 
• Control loop 
(100us) 
• For high-speed networks 
– E.g., 10/40Gbps 
X86 64bits High-performance 
mainframe 
4 
SDN 
Measurement 
Control 
Decision 
ICDCS'14@Madrid 
Open Interface 
Network OS 
Open Interface 
routing 
VPN 
… 
monitoring 
IBM RackSwitch G8264 
(100ms ~ 1sec+) 
(10ms)
How Fast should Measurement do 
for SDDC? 
5 
CDF of Flow Duration 
Univ. DC 
Flow Duration@1Gbps (ms) 
*source: T. Benson, et al., “Network traffic characteristics of data 
centers in the wild,” IMC`10 
Production DC 
Flow Duration@1Gbps (ms) 
*source: Alizadeh et al., Background TCP flows, Microsoft data center 
DCTCP, Sigcomm`10 
• The situation is getting worse toward high-speed data center 
networks (e.g., 1Gbps  10/40Gbps) 
ICDCS'14@Madrid
Why are Measurements so Slow? 
• Traditionally, this didn’t need to be fast 
• Control plane of switch’s CPUs are wimpy 
– Over taxing on switch’s CPU  increasing as flow table increases 
– Ex) Polling flow counter and sampling 
• Although faster CPU could help, but still a big gap between CPUs and ASICs 
Kernel Driver 
CPU 
CPU 
ASIC 
PCI-E 
ICDCS'14@Madrid 6 
ASIC 
Kernel Driver 
CPU 
PCI-E 
XAUI, Aurora …
Is Packet Sampling a viable Solution? 
• Estimating flow statistics from a packet sampling 
1,000 pkts classified as A 
Hmm…roughly 
400,000 pkts of 
class A are there~ 
– Maximum likelihood estimation (MLE) 
• #packets ≈ #packets sampled X sampling ratio 
2,500 pkts sampled 
… … 
… … … 
1,000,000 pkts transiting 
in a given measurement time 
Sampling probability = 0.25% 
Flow A 
Flow B 
Flow C 
ICDCS'14@Madrid 7
Is Packet Sampling a Viable Solution? 
• Theory behind this inference… 
– Large number theory 
– Estimation accuracy ∝ sqrt of # samples 
2,500 pkts sampled 
… … 
… … … 
1,000,000 pkts transiting 
in a given measurement time 
Sampling probability = 0.25% 
1,000 pkts classified as A 
Hmm…roughly 
40% of pkts of 
class A are there~ 
The number of pkts of 
class A are in 
[381,000, 419,000] 
Flow A 
Flow B 
Flow C 
ICDCS'14@Madrid 8
How many Samples really we can get? 
1-in-N Peak at 350 pkts/sec 
9 
collector 
client server 
ICDCS'14@Madrid 
• Micro benchmark 
– IBM RackSwitch G8264 
– Single TCP connection @10Gbps w/ TCP perf
How many Samples really we can get? 
In 100ms, only 3,000 pkts 
in avg. are arrived, 
as well as 60 flows 
*Dataset source: Benson, T., Network traffic characteristics of data centers in the wild, IMC 2010 
• With a limit of 350 samples/sec, the situation is worse 
ICDCS'14@Madrid 10
Can we gather Samples more? 
• Two approaches to increase an accuracy 
(= increase # samples) 
– ↑ sampling probability 
• Over taxing on switches’ CPU 
– ↑ an measurement interval 
• Violating OpenSample’s goal of low-latency 
measurements 
ICDCS'14@Madrid 11
Our Solution: Protocol-aware 
Flow Statistics Detection Algorithm 
• Fact: 99% of total traffic in data centers are 
TCP flows 
– Captures two distinct packet headers 
• a timestamp and a TCP sequence number are exploited 
– Ex) TCP packet A with seq# SA at time tA and B with seq# SB at 
time tB such that tA < tB 
ICDCS'14@Madrid 12
Our Solution: Protocol-aware 
Flow Statistics Detection Algorithm 
• Ex) Estimating flow statistics 
13 
… … … 
S1 Throughput of flowS = (S2-S1)/(t3-t1) 
Throughput of flowT = (T2-T1)/(t3-t1) 
ICDCS'14@Madrid 
S2 
T2 T1 
t4 t3 t2 t1 
Streaming algorithm
Our Solution: Protocol-aware 
Flow Statistics Detection Algorithm 
• Ex) Estimating port statistics 
– Exploiting MLE 
• regards packets passing through a specific port as super 
flow 
14 
measurement 
interval 
t2 t1 
… … … 
UtilportA = #samples * sampling rate 
ICDCS'14@Madrid
Our Solution: Protocol-aware 
Flow Statistics Detection Algorithm 
• Benefits 
– Streaming algorithm 
• Near-real-time analysis 
– High accuracy even with low sampling probability 
– Independent with the sampling theory 
• don’t need to know a sampling probability 
• can capture samples at multiple points in a given network 
• measurement delay depends only on latency between two 
different samples 
ICDCS'14@Madrid 15
How much a number of flows can be 
detected? 
• Probability of flow statistics detection in single switch 
model 
Disjoint events 
– Pr{2+ samples} = 1 – Pr{get 0 or 1 sample} 
– = 1 – Pr{0 sample} – Pr{1 sample} 
– =1 - (1-p)n - np(1-p)n-1 
Bernoulli trial 
n: # of packets in a given flow 
p: probability of packet sampled (0 ≤ p ≤ 1) 
ICDCS'14@Madrid 16
How much a number of flows can be 
detected? 
17 
• Probability of flow statisitcs detection w/ multiple 
switches 
– Pr{2+ samples} = 1 – (1-kp)n – nkp(1-kp)n-1 
n: # of packets in the flow 
p: probability of packet sampled (0 ≤ p ≤ 1) 
k: # of switches 
ICDCS'14@Madrid
How much a number of flows can be 
detected? 
18 
• Probability of flow detection w/ multiple switches 
n: # of packets in the flow 
p: probability of packet sampled (0 ≤ p ≤ 1) 
k: # of switches 
ICDCS'14@Madrid
Protocol-aware Flow Statistics 
Detection Algorithm (4/4) 
• Flow detection delay 
– E[D] = E[X1 + X2] 
– = E[X1] + E[X2] = 2/λkp 
D: delaying to acquire two samples from a given flow 
X1, X2: the arrival time of the first and second sampled 
packets 
λ: inter-arrival rate 
ICDCS'14@Madrid 19
Protocol-aware Flow Statistics 
Detection Algorithm (4/4) 
• Flow detection delay 
ICDCS'14@Madrid 20
Implementation 
• OpenSample collector 
– Java-based collector with Netty NIO 
framework 
– sFlow v5.0 standard 
– Reconstruct flow/port statistics 
• Protocol-aware flow statistics detection 
• Maximum Likelihood Estimation 
• Floodlight SDN controller 
– Traffic engineering application 
ICDCS'14@Madrid 21
Benchmark Tests (Emulation) 
• Mininet v2.0 
– SDN emulator running in a single host 
– Real traffic characteristics 
– Results can be reproducible 
ICDCS'14@Madrid 22
Benchmark Tests (Configuration) 
• Topology 
– FatTree (k=4) vs. non-blocking @ 10 Mbps 
– 16 hosts and 3-levels 
• Workloads 
– Spatial locality of traffic patterns 
• random (benign), staggered, stride (adversarial) 
– Flow size following an exponential distribution 
• with avg. 1MB and 1GB 
• Benchmarks 
– Polling-based flow scheduler 
• Polling interval = 1 second 
– MLE vs. Protocol-aware flow statistics detection algorithm 
• Sampling rate: N=50 (High), N=200 (Low) 
ICDCS'14@Madrid 23
Results: 1GB Long Flow 
• Normalized aggregate throughput 
ICDCS'14@Madrid 24
Results: 1MB Short Flow 
• Normalized aggregate throughput 
ICDCS'14@Madrid 25
Results: 1MB Short Flow 
• CDF of bytes left at the time 
of detection 
• CDF of bytes left at the time 
of routing 
ICDCS'14@Madrid 26
Results: 1MB Short Flow 
• The total bytes sent in 30s and the percent of those bytes 
scheduled by traffic engineering for the STRIDE8 workload 
ICDCS'14@Madrid 27
Conclusion 
• OpenSample 
– A working prototype of a low-latency, sampling-based measurement platform 
– Reducing control loop latency from 1-5 seconds to 100 milliseconds 
• Further reducing control loop as fast as 100us with hardware supports 
– See Planck: Millisecond-scale Monitoring and Control for Commodity 
Networks, Sigcomm`14 
ICDCS'14@Madrid 28
Q&A 
Email: jhsuh@mmlab.snu.ac.kr 
ICDCS'14@Madrid 29
Ad

More Related Content

What's hot (20)

LISP and NSH in Open vSwitch
LISP and NSH in Open vSwitchLISP and NSH in Open vSwitch
LISP and NSH in Open vSwitch
mestery
 
Packet Framework - Cristian Dumitrescu
Packet Framework - Cristian DumitrescuPacket Framework - Cristian Dumitrescu
Packet Framework - Cristian Dumitrescu
harryvanhaaren
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
Thomas Graf
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
Thomas Graf
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
Stephen Hemminger
 
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/NeutronOverview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
vivekkonnect
 
Dynamic user trace
Dynamic user traceDynamic user trace
Dynamic user trace
Vipin Varghese
 
IPv6 Segment Routing : an end-to-end solution ?
IPv6 Segment Routing : an end-to-end solution ?IPv6 Segment Routing : an end-to-end solution ?
IPv6 Segment Routing : an end-to-end solution ?
Olivier Bonaventure
 
Implementing IPv6 Segment Routing in the Linux kernel
Implementing IPv6 Segment Routing in the Linux kernelImplementing IPv6 Segment Routing in the Linux kernel
Implementing IPv6 Segment Routing in the Linux kernel
Olivier Bonaventure
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
Thomas Graf
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
Lagopus SDN/OpenFlow switch
 
OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. GrayOVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
harryvanhaaren
 
DevOops - Lessons Learned from an OpenStack Network Architect
DevOops - Lessons Learned from an OpenStack Network ArchitectDevOops - Lessons Learned from an OpenStack Network Architect
DevOops - Lessons Learned from an OpenStack Network Architect
James Denton
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
PLUMgrid
 
Future Internet protocols
Future Internet protocolsFuture Internet protocols
Future Internet protocols
Olivier Bonaventure
 
Pyretic - A new programmer friendly language for SDN
Pyretic - A new programmer friendly language for SDNPyretic - A new programmer friendly language for SDN
Pyretic - A new programmer friendly language for SDN
nvirters
 
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
Junho Suh
 
Network Test Automation 2015-04-23 #npstudy
Network Test Automation 2015-04-23 #npstudyNetwork Test Automation 2015-04-23 #npstudy
Network Test Automation 2015-04-23 #npstudy
Hiroshi Ota
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
Stephen Hemminger
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf
 
LISP and NSH in Open vSwitch
LISP and NSH in Open vSwitchLISP and NSH in Open vSwitch
LISP and NSH in Open vSwitch
mestery
 
Packet Framework - Cristian Dumitrescu
Packet Framework - Cristian DumitrescuPacket Framework - Cristian Dumitrescu
Packet Framework - Cristian Dumitrescu
harryvanhaaren
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
Thomas Graf
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
Thomas Graf
 
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/NeutronOverview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
vivekkonnect
 
IPv6 Segment Routing : an end-to-end solution ?
IPv6 Segment Routing : an end-to-end solution ?IPv6 Segment Routing : an end-to-end solution ?
IPv6 Segment Routing : an end-to-end solution ?
Olivier Bonaventure
 
Implementing IPv6 Segment Routing in the Linux kernel
Implementing IPv6 Segment Routing in the Linux kernelImplementing IPv6 Segment Routing in the Linux kernel
Implementing IPv6 Segment Routing in the Linux kernel
Olivier Bonaventure
 
Cilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPFCilium - API-aware Networking and Security for Containers based on BPF
Cilium - API-aware Networking and Security for Containers based on BPF
Thomas Graf
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
Lagopus SDN/OpenFlow switch
 
OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. GrayOVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
OVS and DPDK - T.F. Herbert, K. Traynor, M. Gray
harryvanhaaren
 
DevOops - Lessons Learned from an OpenStack Network Architect
DevOops - Lessons Learned from an OpenStack Network ArchitectDevOops - Lessons Learned from an OpenStack Network Architect
DevOops - Lessons Learned from an OpenStack Network Architect
James Denton
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
PLUMgrid
 
Pyretic - A new programmer friendly language for SDN
Pyretic - A new programmer friendly language for SDNPyretic - A new programmer friendly language for SDN
Pyretic - A new programmer friendly language for SDN
nvirters
 
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
OCP Summit 2016 - Transforming Networks to All-IT Network with OCP and Open N...
Junho Suh
 
Network Test Automation 2015-04-23 #npstudy
Network Test Automation 2015-04-23 #npstudyNetwork Test Automation 2015-04-23 #npstudy
Network Test Automation 2015-04-23 #npstudy
Hiroshi Ota
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf
 

Viewers also liked (8)

How Software-Defined Data Center Technology Is Changing Cloud Computing
How Software-Defined Data Center Technology Is Changing Cloud ComputingHow Software-Defined Data Center Technology Is Changing Cloud Computing
How Software-Defined Data Center Technology Is Changing Cloud Computing
NIMBOXX
 
Micro Server Design - Open Compute Project
Micro Server Design - Open Compute ProjectMicro Server Design - Open Compute Project
Micro Server Design - Open Compute Project
Hitesh Jani
 
Interference-Aware Multipath Routing In Wireless Sensor NetworksMinor projr...
Interference-Aware Multipath Routing In  Wireless Sensor NetworksMinor  projr...Interference-Aware Multipath Routing In  Wireless Sensor NetworksMinor  projr...
Interference-Aware Multipath Routing In Wireless Sensor NetworksMinor projr...
Rakesh Behera
 
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld
 
OCP 2015 summit_honey badger_and_panther+_update
OCP 2015 summit_honey badger_and_panther+_updateOCP 2015 summit_honey badger_and_panther+_update
OCP 2015 summit_honey badger_and_panther+_update
Mike Yan
 
Software Defined presentation
Software Defined presentationSoftware Defined presentation
Software Defined presentation
John Rhodes
 
Open Compute Project : Benefit and Challange
Open Compute Project : Benefit and ChallangeOpen Compute Project : Benefit and Challange
Open Compute Project : Benefit and Challange
ITOCHU Techno-Solutions America
 
Software Defined Data Center: The Intersection of Networking and Storage
Software Defined Data Center: The Intersection of Networking and StorageSoftware Defined Data Center: The Intersection of Networking and Storage
Software Defined Data Center: The Intersection of Networking and Storage
EMC
 
How Software-Defined Data Center Technology Is Changing Cloud Computing
How Software-Defined Data Center Technology Is Changing Cloud ComputingHow Software-Defined Data Center Technology Is Changing Cloud Computing
How Software-Defined Data Center Technology Is Changing Cloud Computing
NIMBOXX
 
Micro Server Design - Open Compute Project
Micro Server Design - Open Compute ProjectMicro Server Design - Open Compute Project
Micro Server Design - Open Compute Project
Hitesh Jani
 
Interference-Aware Multipath Routing In Wireless Sensor NetworksMinor projr...
Interference-Aware Multipath Routing In  Wireless Sensor NetworksMinor  projr...Interference-Aware Multipath Routing In  Wireless Sensor NetworksMinor  projr...
Interference-Aware Multipath Routing In Wireless Sensor NetworksMinor projr...
Rakesh Behera
 
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld 2014: Software-Defined Data Center through Hyper-Converged Infrastruc...
VMworld
 
OCP 2015 summit_honey badger_and_panther+_update
OCP 2015 summit_honey badger_and_panther+_updateOCP 2015 summit_honey badger_and_panther+_update
OCP 2015 summit_honey badger_and_panther+_update
Mike Yan
 
Software Defined presentation
Software Defined presentationSoftware Defined presentation
Software Defined presentation
John Rhodes
 
Software Defined Data Center: The Intersection of Networking and Storage
Software Defined Data Center: The Intersection of Networking and StorageSoftware Defined Data Center: The Intersection of Networking and Storage
Software Defined Data Center: The Intersection of Networking and Storage
EMC
 
Ad

Similar to Opensample: A Low-latency, Sampling-based Measurement Platform for Software Defined Data Center (20)

LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecasesLF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OpenvSwitch
 
Presentations on basic understanding of networm management
Presentations on basic understanding of networm managementPresentations on basic understanding of networm management
Presentations on basic understanding of networm management
RasithaAbayakoon2
 
Polyraptor
PolyraptorPolyraptor
Polyraptor
MohammedAlasmar2
 
Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depth
IdcIdk1
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
Jonas Traub
 
Network protocols and vulnerabilities
Network protocols and vulnerabilitiesNetwork protocols and vulnerabilities
Network protocols and vulnerabilities
G Prachi
 
Polyraptor
PolyraptorPolyraptor
Polyraptor
MohammedAlasmar2
 
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
Kevin Mathew
 
Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)
Vincenzo Gulisano
 
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Prasanna Shanmugasundaram
 
Query optimization for_sensor_networks
Query optimization for_sensor_networksQuery optimization for_sensor_networks
Query optimization for_sensor_networks
Harshavardhan Achrekar
 
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
1.1.2 - Concept of Network and TCP_IP Model (2).pptx1.1.2 - Concept of Network and TCP_IP Model (2).pptx
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
VINAYTANWAR18
 
Data Stream Management
Data Stream ManagementData Stream Management
Data Stream Management
k_tauhid
 
TINET_FRnOG_2008_public
TINET_FRnOG_2008_publicTINET_FRnOG_2008_public
TINET_FRnOG_2008_public
Davide Cherubini
 
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Carlos Reaño González
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data Streaming
Paris Carbone
 
RT15 Berkeley | ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
RT15 Berkeley |  ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...RT15 Berkeley |  ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
RT15 Berkeley | ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
OPAL-RT TECHNOLOGIES
 
Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6
Olivier Bonaventure
 
Lambda Data Grid
Lambda Data GridLambda Data Grid
Lambda Data Grid
Tal Lavian Ph.D.
 
Part5-tcp-improvements.pptx
Part5-tcp-improvements.pptxPart5-tcp-improvements.pptx
Part5-tcp-improvements.pptx
Olivier Bonaventure
 
LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecasesLF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OVS_17_OVS/OVS-DPDK connection tracking for Mobile usecases
LF_OpenvSwitch
 
Presentations on basic understanding of networm management
Presentations on basic understanding of networm managementPresentations on basic understanding of networm management
Presentations on basic understanding of networm management
RasithaAbayakoon2
 
Queuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depthQueuing theory and traffic analysis in depth
Queuing theory and traffic analysis in depth
IdcIdk1
 
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
JT@UCSB - On-Demand Data Streaming from Sensor Nodes and A quick overview of ...
Jonas Traub
 
Network protocols and vulnerabilities
Network protocols and vulnerabilitiesNetwork protocols and vulnerabilities
Network protocols and vulnerabilities
G Prachi
 
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
A Robust UART Architecture Based on Recursive Running Sum Filter for Better N...
Kevin Mathew
 
Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)Crash course on data streaming (with examples using Apache Flink)
Crash course on data streaming (with examples using Apache Flink)
Vincenzo Gulisano
 
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Performance Analysis of Creido Enhanced Chord Overlay Protocol for Wireless S...
Prasanna Shanmugasundaram
 
Query optimization for_sensor_networks
Query optimization for_sensor_networksQuery optimization for_sensor_networks
Query optimization for_sensor_networks
Harshavardhan Achrekar
 
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
1.1.2 - Concept of Network and TCP_IP Model (2).pptx1.1.2 - Concept of Network and TCP_IP Model (2).pptx
1.1.2 - Concept of Network and TCP_IP Model (2).pptx
VINAYTANWAR18
 
Data Stream Management
Data Stream ManagementData Stream Management
Data Stream Management
k_tauhid
 
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Pipelined Compression in Remote GPU Virtualization Systems using rCUDA: Early...
Carlos Reaño González
 
An Introduction to Distributed Data Streaming
An Introduction to Distributed Data StreamingAn Introduction to Distributed Data Streaming
An Introduction to Distributed Data Streaming
Paris Carbone
 
RT15 Berkeley | ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
RT15 Berkeley |  ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...RT15 Berkeley |  ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
RT15 Berkeley | ARTEMiS-SSN Features for Micro-grid / Renewable Energy Sourc...
OPAL-RT TECHNOLOGIES
 
Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6Part 9 : Congestion control and IPv6
Part 9 : Congestion control and IPv6
Olivier Bonaventure
 
Ad

Recently uploaded (20)

Water Industry Process Automation & Control Monthly May 2025
Water Industry Process Automation & Control Monthly May 2025Water Industry Process Automation & Control Monthly May 2025
Water Industry Process Automation & Control Monthly May 2025
Water Industry Process Automation & Control
 
AI Chatbots & Software Development Teams
AI Chatbots & Software Development TeamsAI Chatbots & Software Development Teams
AI Chatbots & Software Development Teams
Joe Krall
 
Lecture - 7 Canals of the topic of the civil engineering
Lecture - 7  Canals of the topic of the civil engineeringLecture - 7  Canals of the topic of the civil engineering
Lecture - 7 Canals of the topic of the civil engineering
MJawadkhan1
 
Slide share PPT of NOx control technologies.pptx
Slide share PPT of  NOx control technologies.pptxSlide share PPT of  NOx control technologies.pptx
Slide share PPT of NOx control technologies.pptx
vvsasane
 
Optimizing Reinforced Concrete Cantilever Retaining Walls Using Gases Brownia...
Optimizing Reinforced Concrete Cantilever Retaining Walls Using Gases Brownia...Optimizing Reinforced Concrete Cantilever Retaining Walls Using Gases Brownia...
Optimizing Reinforced Concrete Cantilever Retaining Walls Using Gases Brownia...
Journal of Soft Computing in Civil Engineering
 
Design Optimization of Reinforced Concrete Waffle Slab Using Genetic Algorithm
Design Optimization of Reinforced Concrete Waffle Slab Using Genetic AlgorithmDesign Optimization of Reinforced Concrete Waffle Slab Using Genetic Algorithm
Design Optimization of Reinforced Concrete Waffle Slab Using Genetic Algorithm
Journal of Soft Computing in Civil Engineering
 
Personal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.pptPersonal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.ppt
ganjangbegu579
 
Control Methods of Noise Pollutions.pptx
Control Methods of Noise Pollutions.pptxControl Methods of Noise Pollutions.pptx
Control Methods of Noise Pollutions.pptx
vvsasane
 
Working with USDOT UTCs: From Conception to Implementation
Working with USDOT UTCs: From Conception to ImplementationWorking with USDOT UTCs: From Conception to Implementation
Working with USDOT UTCs: From Conception to Implementation
Alabama Transportation Assistance Program
 
Agents chapter of Artificial intelligence
Agents chapter of Artificial intelligenceAgents chapter of Artificial intelligence
Agents chapter of Artificial intelligence
DebdeepMukherjee9
 
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptxUnleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
SanjeetMishra29
 
UNIT 3 Software Engineering (BCS601) EIOV.pdf
UNIT 3 Software Engineering (BCS601) EIOV.pdfUNIT 3 Software Engineering (BCS601) EIOV.pdf
UNIT 3 Software Engineering (BCS601) EIOV.pdf
sikarwaramit089
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdfDahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
PawachMetharattanara
 
David Boutry - Specializes In AWS, Microservices And Python
David Boutry - Specializes In AWS, Microservices And PythonDavid Boutry - Specializes In AWS, Microservices And Python
David Boutry - Specializes In AWS, Microservices And Python
David Boutry
 
22PCOAM16 ML Unit 3 Full notes PDF & QB.pdf
22PCOAM16 ML Unit 3 Full notes PDF & QB.pdf22PCOAM16 ML Unit 3 Full notes PDF & QB.pdf
22PCOAM16 ML Unit 3 Full notes PDF & QB.pdf
Guru Nanak Technical Institutions
 
AI-Powered Data Management and Governance in Retail
AI-Powered Data Management and Governance in RetailAI-Powered Data Management and Governance in Retail
AI-Powered Data Management and Governance in Retail
IJDKP
 
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
speedcomcyber25
 
Machine foundation notes for civil engineering students
Machine foundation notes for civil engineering studentsMachine foundation notes for civil engineering students
Machine foundation notes for civil engineering students
DYPCET
 
Introduction to Additive Manufacturing(3D printing)
Introduction to Additive Manufacturing(3D printing)Introduction to Additive Manufacturing(3D printing)
Introduction to Additive Manufacturing(3D printing)
vijimech408
 
AI Chatbots & Software Development Teams
AI Chatbots & Software Development TeamsAI Chatbots & Software Development Teams
AI Chatbots & Software Development Teams
Joe Krall
 
Lecture - 7 Canals of the topic of the civil engineering
Lecture - 7  Canals of the topic of the civil engineeringLecture - 7  Canals of the topic of the civil engineering
Lecture - 7 Canals of the topic of the civil engineering
MJawadkhan1
 
Slide share PPT of NOx control technologies.pptx
Slide share PPT of  NOx control technologies.pptxSlide share PPT of  NOx control technologies.pptx
Slide share PPT of NOx control technologies.pptx
vvsasane
 
Personal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.pptPersonal Protective Efsgfgsffquipment.ppt
Personal Protective Efsgfgsffquipment.ppt
ganjangbegu579
 
Control Methods of Noise Pollutions.pptx
Control Methods of Noise Pollutions.pptxControl Methods of Noise Pollutions.pptx
Control Methods of Noise Pollutions.pptx
vvsasane
 
Agents chapter of Artificial intelligence
Agents chapter of Artificial intelligenceAgents chapter of Artificial intelligence
Agents chapter of Artificial intelligence
DebdeepMukherjee9
 
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptxUnleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
Unleashing the Power of Salesforce Flows &amp_ Slack Integration!.pptx
SanjeetMishra29
 
UNIT 3 Software Engineering (BCS601) EIOV.pdf
UNIT 3 Software Engineering (BCS601) EIOV.pdfUNIT 3 Software Engineering (BCS601) EIOV.pdf
UNIT 3 Software Engineering (BCS601) EIOV.pdf
sikarwaramit089
 
Environment .................................
Environment .................................Environment .................................
Environment .................................
shadyozq9
 
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdfDahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
Dahua Smart Cityyyyyyyyyyyyyyyyyy2025.pdf
PawachMetharattanara
 
David Boutry - Specializes In AWS, Microservices And Python
David Boutry - Specializes In AWS, Microservices And PythonDavid Boutry - Specializes In AWS, Microservices And Python
David Boutry - Specializes In AWS, Microservices And Python
David Boutry
 
AI-Powered Data Management and Governance in Retail
AI-Powered Data Management and Governance in RetailAI-Powered Data Management and Governance in Retail
AI-Powered Data Management and Governance in Retail
IJDKP
 
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx698642933-DdocfordownloadEEP-FAKE-PPT.pptx
698642933-DdocfordownloadEEP-FAKE-PPT.pptx
speedcomcyber25
 
Machine foundation notes for civil engineering students
Machine foundation notes for civil engineering studentsMachine foundation notes for civil engineering students
Machine foundation notes for civil engineering students
DYPCET
 
Introduction to Additive Manufacturing(3D printing)
Introduction to Additive Manufacturing(3D printing)Introduction to Additive Manufacturing(3D printing)
Introduction to Additive Manufacturing(3D printing)
vijimech408
 

Opensample: A Low-latency, Sampling-based Measurement Platform for Software Defined Data Center

  • 1. OpenSample A Low-latency, Sampling-based Measurement Platform for Software Defined Data Center Junho Suh†, Ted “Taekyoung” Kwon†, Colin Dixon‡, Wes Felter‡, and John Carter‡ †Seoul National University ‡IBM Research Austin ICDCS'14@Madrid 1
  • 2. Software Defined Networking (SDN) OS routing VPN … monitoring Control / management functions Embedded OS Switching ASIC Open Interface Network OS Open Interface routing VPN … monitoring CISCO Juniper 2 Legacy SDN SDN Measurement Control Decision ICDCS'14@Madrid
  • 3. X86 64bits High-performance mainframe 3 Control Loop in Software Defined Networking • Control loop SDN Decision (100us) Measurement Control (100ms ~ 1sec+) (10ms) • Measurement is a bottleneck – High latency of control loopDC PerformanceApp performance ICDCS'14@Madrid Open Interface Network OS Open Interface routing VPN … monitoring IBM RackSwitch G8264
  • 4. Control Loop in Software Defined Networking • Control loop (100us) • For high-speed networks – E.g., 10/40Gbps X86 64bits High-performance mainframe 4 SDN Measurement Control Decision ICDCS'14@Madrid Open Interface Network OS Open Interface routing VPN … monitoring IBM RackSwitch G8264 (100ms ~ 1sec+) (10ms)
  • 5. How Fast should Measurement do for SDDC? 5 CDF of Flow Duration Univ. DC Flow Duration@1Gbps (ms) *source: T. Benson, et al., “Network traffic characteristics of data centers in the wild,” IMC`10 Production DC Flow Duration@1Gbps (ms) *source: Alizadeh et al., Background TCP flows, Microsoft data center DCTCP, Sigcomm`10 • The situation is getting worse toward high-speed data center networks (e.g., 1Gbps  10/40Gbps) ICDCS'14@Madrid
  • 6. Why are Measurements so Slow? • Traditionally, this didn’t need to be fast • Control plane of switch’s CPUs are wimpy – Over taxing on switch’s CPU  increasing as flow table increases – Ex) Polling flow counter and sampling • Although faster CPU could help, but still a big gap between CPUs and ASICs Kernel Driver CPU CPU ASIC PCI-E ICDCS'14@Madrid 6 ASIC Kernel Driver CPU PCI-E XAUI, Aurora …
  • 7. Is Packet Sampling a viable Solution? • Estimating flow statistics from a packet sampling 1,000 pkts classified as A Hmm…roughly 400,000 pkts of class A are there~ – Maximum likelihood estimation (MLE) • #packets ≈ #packets sampled X sampling ratio 2,500 pkts sampled … … … … … 1,000,000 pkts transiting in a given measurement time Sampling probability = 0.25% Flow A Flow B Flow C ICDCS'14@Madrid 7
  • 8. Is Packet Sampling a Viable Solution? • Theory behind this inference… – Large number theory – Estimation accuracy ∝ sqrt of # samples 2,500 pkts sampled … … … … … 1,000,000 pkts transiting in a given measurement time Sampling probability = 0.25% 1,000 pkts classified as A Hmm…roughly 40% of pkts of class A are there~ The number of pkts of class A are in [381,000, 419,000] Flow A Flow B Flow C ICDCS'14@Madrid 8
  • 9. How many Samples really we can get? 1-in-N Peak at 350 pkts/sec 9 collector client server ICDCS'14@Madrid • Micro benchmark – IBM RackSwitch G8264 – Single TCP connection @10Gbps w/ TCP perf
  • 10. How many Samples really we can get? In 100ms, only 3,000 pkts in avg. are arrived, as well as 60 flows *Dataset source: Benson, T., Network traffic characteristics of data centers in the wild, IMC 2010 • With a limit of 350 samples/sec, the situation is worse ICDCS'14@Madrid 10
  • 11. Can we gather Samples more? • Two approaches to increase an accuracy (= increase # samples) – ↑ sampling probability • Over taxing on switches’ CPU – ↑ an measurement interval • Violating OpenSample’s goal of low-latency measurements ICDCS'14@Madrid 11
  • 12. Our Solution: Protocol-aware Flow Statistics Detection Algorithm • Fact: 99% of total traffic in data centers are TCP flows – Captures two distinct packet headers • a timestamp and a TCP sequence number are exploited – Ex) TCP packet A with seq# SA at time tA and B with seq# SB at time tB such that tA < tB ICDCS'14@Madrid 12
  • 13. Our Solution: Protocol-aware Flow Statistics Detection Algorithm • Ex) Estimating flow statistics 13 … … … S1 Throughput of flowS = (S2-S1)/(t3-t1) Throughput of flowT = (T2-T1)/(t3-t1) ICDCS'14@Madrid S2 T2 T1 t4 t3 t2 t1 Streaming algorithm
  • 14. Our Solution: Protocol-aware Flow Statistics Detection Algorithm • Ex) Estimating port statistics – Exploiting MLE • regards packets passing through a specific port as super flow 14 measurement interval t2 t1 … … … UtilportA = #samples * sampling rate ICDCS'14@Madrid
  • 15. Our Solution: Protocol-aware Flow Statistics Detection Algorithm • Benefits – Streaming algorithm • Near-real-time analysis – High accuracy even with low sampling probability – Independent with the sampling theory • don’t need to know a sampling probability • can capture samples at multiple points in a given network • measurement delay depends only on latency between two different samples ICDCS'14@Madrid 15
  • 16. How much a number of flows can be detected? • Probability of flow statistics detection in single switch model Disjoint events – Pr{2+ samples} = 1 – Pr{get 0 or 1 sample} – = 1 – Pr{0 sample} – Pr{1 sample} – =1 - (1-p)n - np(1-p)n-1 Bernoulli trial n: # of packets in a given flow p: probability of packet sampled (0 ≤ p ≤ 1) ICDCS'14@Madrid 16
  • 17. How much a number of flows can be detected? 17 • Probability of flow statisitcs detection w/ multiple switches – Pr{2+ samples} = 1 – (1-kp)n – nkp(1-kp)n-1 n: # of packets in the flow p: probability of packet sampled (0 ≤ p ≤ 1) k: # of switches ICDCS'14@Madrid
  • 18. How much a number of flows can be detected? 18 • Probability of flow detection w/ multiple switches n: # of packets in the flow p: probability of packet sampled (0 ≤ p ≤ 1) k: # of switches ICDCS'14@Madrid
  • 19. Protocol-aware Flow Statistics Detection Algorithm (4/4) • Flow detection delay – E[D] = E[X1 + X2] – = E[X1] + E[X2] = 2/λkp D: delaying to acquire two samples from a given flow X1, X2: the arrival time of the first and second sampled packets λ: inter-arrival rate ICDCS'14@Madrid 19
  • 20. Protocol-aware Flow Statistics Detection Algorithm (4/4) • Flow detection delay ICDCS'14@Madrid 20
  • 21. Implementation • OpenSample collector – Java-based collector with Netty NIO framework – sFlow v5.0 standard – Reconstruct flow/port statistics • Protocol-aware flow statistics detection • Maximum Likelihood Estimation • Floodlight SDN controller – Traffic engineering application ICDCS'14@Madrid 21
  • 22. Benchmark Tests (Emulation) • Mininet v2.0 – SDN emulator running in a single host – Real traffic characteristics – Results can be reproducible ICDCS'14@Madrid 22
  • 23. Benchmark Tests (Configuration) • Topology – FatTree (k=4) vs. non-blocking @ 10 Mbps – 16 hosts and 3-levels • Workloads – Spatial locality of traffic patterns • random (benign), staggered, stride (adversarial) – Flow size following an exponential distribution • with avg. 1MB and 1GB • Benchmarks – Polling-based flow scheduler • Polling interval = 1 second – MLE vs. Protocol-aware flow statistics detection algorithm • Sampling rate: N=50 (High), N=200 (Low) ICDCS'14@Madrid 23
  • 24. Results: 1GB Long Flow • Normalized aggregate throughput ICDCS'14@Madrid 24
  • 25. Results: 1MB Short Flow • Normalized aggregate throughput ICDCS'14@Madrid 25
  • 26. Results: 1MB Short Flow • CDF of bytes left at the time of detection • CDF of bytes left at the time of routing ICDCS'14@Madrid 26
  • 27. Results: 1MB Short Flow • The total bytes sent in 30s and the percent of those bytes scheduled by traffic engineering for the STRIDE8 workload ICDCS'14@Madrid 27
  • 28. Conclusion • OpenSample – A working prototype of a low-latency, sampling-based measurement platform – Reducing control loop latency from 1-5 seconds to 100 milliseconds • Further reducing control loop as fast as 100us with hardware supports – See Planck: Millisecond-scale Monitoring and Control for Commodity Networks, Sigcomm`14 ICDCS'14@Madrid 28
  • 29. Q&A Email: jhsuh@mmlab.snu.ac.kr ICDCS'14@Madrid 29

Editor's Notes

  • #3: Let me start with a new concept of software defined networking (SDN) in network research area. The SDN introduces the possibility of building self-tuning networks by replacing the distributed, per-switch control planes of traditional network with a (logically) centralized control plane. And all functionalities of network control plane is moving to a centralized controller running on commodity server, and it constantly monitor network conditions and react rapidly to important events such as congestions and network failures. We will call this a control-loop consisting of measurement, decision, and control. i) gathering traffic and other measurements from the network ii) using the gathered information to compute iii) installing forwarding behaviors in the switches
  • #4: So due to this decoupling between control- and data-plane in the SDN, it naturally introduces a new problem that increasing latency of a control loop in six or seven orders of magnitude greater than legacy architecture. The red values show latency of each component contributed to the SDN’s control loop, and these values are measured from our testbed. With our x86 64bits high-performance mainframe, it takes about 100 us to calculate a new routing path for a new flow in large scale topology, and takes about 10ms to install a new path configuration to a number of switches. However, in the case of measurement, it takes about from 100ms to 1second, depending on flow table sizes. Therefore, the measurement is a bottleneck point of a control loop, which is two or three orders of magnitude greater than the others.
  • #5: Moreover, this control loop problem is getting worse in the case of the high speed Software Defined Data Center, running at 10/40 Gbps link speed. We believe the value of a tens of seconds for measurement is too slow to react to any, but the largest network events such as link failures, VM migrations, and bulk data movement. In other words, the problem induced by transient conditions such as conflicting small-to-medium flows cannot be identified fast enough to respond before they disappear, resulting in frequent bursts of congestion. Therefore, we need a much lower latency network monitoring mechanism.
  • #6: So how fast should measurement do for practical Software-Defined Data Center? These graphs show the opposite traffic characteristics of data centers. The left hand side in this slide shows a CDF of flow duration and represents a traffic characteristic in university’s data center running web servers and file servers and so on…The right hand side in this slide shows a same result, but represents production data center that usually runs Big Data applications such as MAPREDUCE. Both are reproduced by using the datasets stated below, respectively. As you can see 80% of flows are shorter than 9~10 seconds. So a second of control loop latency is effective for traffic engineering. However, in production DC, 90% of flows are shorter than 100ms. Therefore, traffic engineering running every a second is now no longer effective at all. Further, we believe that this situation is getting worse toward high-speed data center networks such as 10/40Gbps. Therefore, ideally a measurement system for SDDC would be near-real-time, this is a latency on the order of milliseconds.
  • #7: So why are state-of-the-art measurements so slow? The answer is this didn’t need to be fast traditionally. As shown in the figure, switch’s CPUs are usually wimpy, and bandwidth between ASIC and CPU are shared medium such as PCI-E meaning that it is too slow. So nowadays a new architecture of switch is introduced by adding a fast and multiple cores and directly connecting with CPU and ASIC using XAUI or AURORA interfaces. Although these new technologies could help to remedy the over taxing on switch’s CPU, but a big gap between CPUs and ASICs is still existed.
  • #8: Now let’s take a look at packet sampling is a viable solution. Before answering this question, we need to know of how sampling works. Packet sampling is based on the estimation theory called maximum likelihood estimation. Roughly say that the number of packets passing through a given switch is the number of packets captured multiplied by sampling ratio. For example, suppose 1 million packets transiting in a given measurement time and sampling probability is .25%, corresponding with 1-in-400. So 2,500 packets are sampled, meanwhile, let’s consider 1000 packets are classified as class A. Then, we can roughly estimate this flow statistics as 400,000 packets of class A are in a given network.
  • #9: But since the theory behind this inference is based on large number theory, the estimation has some variance regarding with the number of samples. We call this estimation accuracy which is proportional to square root of the number of samples. Therefore, it is desirable that the number of packet of class A are in avg. 400,000 with some variance. For example, in the range between 381,000 and 419,000.
  • #10: So how many samples really we can get from switch. To figure it out, we was carrying out a micro benchmark with IBM RackSwitch G8264. We generate a single TCP connection at 10Gbps with TCP perf to saturate a switch’s port, then measures how many samples are arrived every second at a collector. Here is a result as we increase a sampling ratio from 1-in-10,000 to 1-in-250, which is maximum sampling ratio we can manipulate. As you can confirm that only avg. 350 packets are arrived every second at around 1-in-1,000. This is due to the same reason we previously explained, wimpy CPU and low bandwidth between CPUs and ASICs.
  • #11: Do you think this value is really useful to estimate flow statistics? No. The situation is getting worse when we consider a real traffic patterns. In average, only 60 flows and in that 3000 packets are arrived at each TOR switch in any 100ms time window. This means the avg. flow has 50 packets in a 100ms window. Even if all 50 packets from a given flow are sampled, we can only estimate the flow’s actual rate with approximately 30% error.
  • #12: Well, there are two approaches to gather more samples that are… Increasing sampling probability or increasing an measurement interval, but they all violate an OpenSample’s design goal of low-latency and low-cost.
  • #13: Therefore, we think differently and finally we propose a protocol-aware flow statistics detection algorithm in OpenSample. OpenSample exploits the fact that 99% of total traffic in data centers are TCP flows, so if we can get two distinct packet headers of a given flow having a timestamp and a TCP sequence number, we can easily and exactly calculate a flow statistics.
  • #14: Let me give you a simple example of extracting flow statistics. Consider a packet of flow S is arrived at timestamp1, and at timestamp2 a T1’s packet is arrived, and at timestamp3 a second packet of flow S is arrived. At this point we can extract a throughput of flow S from this equation. A throughput of Flow T is analogous. Further, this algorithm can be thought of being as streaming algorithm instead of batch algorithm like maximum likelihood estimation.
  • #15: Now let’s look at how we estimate port utilization. For this, we just use maximum likelihood estimation that regards all packets passing through a specific port as super flow in measurement interval. So port utilization can be calculated as # of samples multiplied by sampling rate.
  • #17: Now we analyze our algorithm in terms of probability of flow statistics detection. To determine this, we develop an analytical model of the probability of getting at least two different samples from a single switch. Moreover, we also use a simple simulator to validate this model. For analytical simplicity, we assume packet arrival process follows Poisson process. Since there is no risk of sampling the same packet twice at a single switch, the probability of getting two samples from the same flow at the same switch is the probability of getting two more samples from the same flow. More formally…
  • #18: When considering the case of more than one switch, the analysis becomes more complex because it must account for the possibility of sampling the same packet twice at two different switches. However, for realistic numbers of switches k, and probability of packet sampled, p, the probability of sampling the same packet more than once at different switches is too low enough that it effectively acts as the one switch model with a probability of packet sampled of kp, called effective sampling probability.
  • #19: This figure shows the results of both an analytical model (line) and a simulation (dot) as varying number of switches and sampling ratio. As can be seen, the intuitive approximation of multiple switches model to the one switch model is well matched. Further, you can see that 80% of small flows can be detected with practical parameters of sampling ratio and the number of switches. Also this shows that even for low sampling ratio, increasing the number of switches drastically improves the probability that we will get two distinct samples with low-cost. Meanwhile, the lines with half the sampling ratio, but twice the switches closely follow each other. For example, one switch with 1 in 5000 sampling produces the same result as two switches with 1 in 10000 sampling.
  • #20: Now let’s see the analysis of the algorithm in terms of detection delay getting two samples in a given flow. Since we assume that packet arrival process follows Poisson process, the random variable D, delaying to acquire two samples from a given flow follows the sum of two random variables X1 and X2, which represent the arrival time of the first and second sampled packets, respectively. And the two random variables follow exponential distribution. Therefore, the result is the sum of inter-arrival time of first and second sampled packets.
  • #21: Here is the graph showing the results of flow detection delay with the parameter k=3 which is a typical 3-hop path in data center and for packet inter-arrival times of 1000us and 10 us representing slow (12Mbps) and fast (1.2Gbps) flows, assuming 1K packets. As can be seen, we easily detect all fast flows in less than 100ms even with sampling ratio of 1 in 1000 in high speed network (the yellow line).
  • #22: Now, in order to evaluate our OpenSample measurement platform, we implement it which is java-based collector with Netty NIO framework and implement sFlow v5.0 standard. The OpenSample collector reconstruct flow and port statistics using XXX and XXX, repectively. Further, we use Floodlight SDN controller and implement a centralized flow scheduler using Greedy algorithm and Linear programming. But mostly we use greedy algorithm because it is much practical for moderate network sizes.
  • #23: And we use Mininet V2.0 to carry out some benchmarks because it has an advantage that our implementation can be used for real physical testbed also. Further, since Mininet is a SDN emulator running in a single host and makes virtual network that real traffic flows, the results produced in the emulator can be easily reproducible in real-testbed.
  • #24: We use a three-level k=4 FatTree as a network topology to emulate a data center network having 10 Mbps link speed, but a real network would use a much larger switch radix such as k=64 having 10 Gbps. Well, although there is 1000 times discrepancy between the emulated environment and physical one due to the limitation of the resources in a single host, because Mininet uses Linux traffic shaping to emulate fixed-speed links, giving the emulated network realistic congestion and queuing delays, the same results can be seen in real physical testbed by scaling up the configuration. We run the workloads on an emulated single large non-blocking switch to determine the maximum throughput when constrained only by host NIC speeds that shows optimum performance on the FatTree, because a non-blocking is a graphically identical to FatTree. In order to carry out the benchmarks between them, we implement three flow schedulers such as…
  • #25: This figure shows the throughput of a variety of workloads on our emulated configuration. Although a FatTtree is topologically a rearrangeable non-blocking topology, there is a significant gap between naïve ECMP forwarding and the hypothetical single non-blocking switch (orange bar) due to collisions where multiple flows are hashed onto the same link. This gap makes the case for traffic engineering with perfect flow scheduling it should be possible to approach the throughput of a non-blocking switch. Further, except ECMP, all other measurement mechanism shows good performance for a flow scheduler because elephant flows are prevailed.
  • #26: However, the situation is changed when short flows are prevailed, which are less than a second. The most of flows are almost always finished before the controller can detect them, then reschedule them. In this scenario, OpenSample-TCP performs significantly better than either polling or MLE, because it detects and schedules elephant flows earlier. In most cases it achieves performance close to a non-blocking switch. Often it outperforms the alternatives by 25-50%.
  • #27: This figure provides deeper insight into the behavior of the measurement systems in the context of traffic engineering. The left hand side in this slide shows the fraction of bytes left in a flow at the time it is detected by each measurement system. And the right hand side in this slide shows the fraction of bytes left in a flow at the time it is actually rerouted, this is after the new forwarding rules have been installed. The results show that MLE and OpenSample-TCP dramatically outperform Polling when it comes to detecting short flows. When accounting for the scheduling interval, the time needed to compute routes and install the new routes, the advantage of MLE had vanished, but OpenSample-TCP is still able to significantly outperform the alternatives. In conclusion, OpenSample-TCP can detect elephant flows far earlier than the alternatives and, when used to drive traffic engineering, it enables the traffic engineering mechanism to schedule up to 60% of the bytes that hosts send and to a 150% improvement in aggregate throughput when most flow sizes are small.
  • #28: This table gives an intuition of the source of the performance gains. It shows both the total bytes transferred in the 30s duration of the experiment and the percentage of those bytes that the traffic engineeering manages to schedule. The fraction of bytes scheduled can be considered a figure of merit for traffic engineering, since any bytes that are no scheduled are more likely to be subject to congestions. By this metric, OpenSample-TCP schedules over twice the fraction of bytes as the polling system. Moreover, we can see that the reduced congestion allowed the workload to send twice as much data in the same time, resulting in doubling throughput.
  • #31: In today’s talk, we present OpenSample that is a low-latency, sampling-based network measurement platform for SDDC. Targeted at building faster control loop as fast as 100 ms, instead of a 1-5 seconds control loop of state-of-the-art monitoring mechanisms. From this improvement, performance of a flow scheduler can be improved up to 150% over one running on both ECMP and a polling-based solution even with short flows. Further, it is deployable right away without a new hardware or modifying end-host’s kernel stack, because the algorithm we used does not incur any modification in hardware.
  翻译: