https://www.openstack.org/summit/tokyo-2015/vote-for-speakers/Presentation/6494
Title: Toward 40Gbps NFV/SDN in Commodity Hardware
Network Function Virtualization (NFV) has emerged as an operator-driven approach to offering network services with software-based network functions. NFV is gaining popularity especially for network nodes, data centers, and virtual machines serving cloud-based services. For NFV, the Data Plane Development Kit (DPDK) and Single Root I/O Virtualization (SR-IOV) play critical roles in maximizing the network throughput of network nodes. DPDK is a programming framework that enables the development of high-speed packet processing applications on Intel x86 based commodity hardware, and SR-IOV is a technology that makes high-speed packet processing possible in virtual machine environments. NFV combined with DPDK and SR-IOV is now common on 10Gbps links and has been shown in many publications to reach its maximum performance there.
Recently, some backbone links have begun to move from 10Gbps to 40Gbps line rates to handle the heavier network traffic driven by the popularity of cloud-based services.
On a 40Gbps link, 40Gbps network functions have been shown to be achievable with DPDK alone on a commodity bare-metal server [1], but performance degradation has been observed with SR-IOV in a VM environment [2][3]. To resolve this performance issue of SR-IOV at 40Gbps, we propose IOMMU Pass-through (IOMMU-PT), which bypasses per-packet DMA address translation. Based on our analysis that the performance deterioration comes from the DMA address translation performed by the IOMMU on behalf of virtual machines, we modify the hypervisor, the guest OS, and the DPDK library to skip this translation step. During initialization, the DPDK library populates an address translation table covering the guest physical addresses and host physical addresses it needs, so that DMA for packet processing works on host physical addresses rather than guest physical addresses and the per-packet translation overhead is avoided. Our experimental results show that throughput and RTT on a 40Gbps link are very close to those of the native (bare-metal) environment, which verifies the effectiveness of IOMMU-PT.
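The key point is that all address translation work happens once, at initialization, and the per-packet path only does a table lookup. The fragment below is a minimal schematic in C of such a guest-physical to host-physical (GPA-to-HPA) table; the names (gpa_hpa_entry, iommu_pt_lookup) and the fixed table size are illustrative assumptions, not the actual patch.

    #include <stdint.h>
    #include <stddef.h>

    struct gpa_hpa_entry {
        uint64_t gpa;   /* start of one hugepage, guest-physical */
        uint64_t hpa;   /* same hugepage, host-physical (obtained from the hypervisor) */
        uint64_t len;   /* hugepage size, e.g. 2 MB or 1 GB */
    };

    /* Populated once while the DPDK EAL maps its hugepages; never modified
     * in the packet path, so no per-packet IOMMU translation is needed. */
    static struct gpa_hpa_entry pt_table[1024];
    static size_t pt_entries;

    /* Translate the guest-physical address of an mbuf data buffer into the
     * host-physical address that is written into the NIC descriptor. */
    static uint64_t iommu_pt_lookup(uint64_t gpa)
    {
        for (size_t i = 0; i < pt_entries; i++) {
            if (gpa >= pt_table[i].gpa &&
                gpa <  pt_table[i].gpa + pt_table[i].len)
                return pt_table[i].hpa + (gpa - pt_table[i].gpa);
        }
        return UINT64_MAX;  /* not part of a registered hugepage */
    }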
Implementation Features
- KVM Hypervisor: hypercall for DMA address remapping, para-virtual IOMMU device
- Guest OS: DMA address remapping device driver
- Guest DPDK: hugepage allocation (see the sketch after this list)
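On the guest DPDK side, the EAL already discovers the (guest-)physical address of every hugepage it maps by reading /proc/self/pagemap; in the proposed design, that guest physical address would then be passed to the hypervisor through the para-virtual IOMMU device/hypercall to obtain the corresponding host physical address. The standalone userspace C sketch below (run as root inside the guest) shows only the pagemap step, as an illustration of the mechanism rather than the actual implementation.

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    /* Resolve a virtual address to its (guest-)physical address via the
     * Linux pagemap interface, the same mechanism DPDK uses internally. */
    static uint64_t virt2phys(const void *va)
    {
        long page_size = sysconf(_SC_PAGESIZE);
        uint64_t entry;
        int fd = open("/proc/self/pagemap", O_RDONLY);

        if (fd < 0)
            return UINT64_MAX;

        /* one 64-bit pagemap entry per virtual page */
        off_t offset = (off_t)((uintptr_t)va / page_size) * sizeof(entry);
        if (pread(fd, &entry, sizeof(entry), offset) != sizeof(entry)) {
            close(fd);
            return UINT64_MAX;
        }
        close(fd);

        if (!(entry & (1ULL << 63)))                /* page not present */
            return UINT64_MAX;

        uint64_t pfn = entry & ((1ULL << 55) - 1);  /* bits 0-54 hold the PFN */
        return pfn * page_size + (uintptr_t)va % page_size;
    }

    int main(void)
    {
        void *buf = malloc(4096);
        *(volatile char *)buf = 1;   /* touch the page so it is mapped */
        printf("physical address: 0x%llx\n",
               (unsigned long long)virt2phys(buf));
        free(buf);
        return 0;
    }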
Experimental Results
- Sender : DPDK-pktgen (CPU E5-1620, 3.60 GHz)
- Receiver : DPDK-pktgen (CPU E5-1620, 3.60 GHz)
- Network Card: Intel Ethernet Controller XL710 for 40GbE QSFP+
- Description: The sender and receiver are connected directly by a 40Gbps link. Three receiver configurations are tested: a DPDK application on bare metal, a DPDK application in a VM with SR-IOV and IOMMU enabled, and a DPDK application in a VM with SR-IOV and IOMMU-PT enabled (ours).
Packet Size   Baremetal (Native)   VM SR-IOV (IOMMU)   VM SR-IOV (IOMMU-PT)
----------------------------------------------------------------------------
64 byte       15.935 Gbps          4.941 Gbps          15.938 Gbps
128 byte      23.742 Gbps          8.282 Gbps          23.744 Gbps
256 byte      26.624 Gbps          13.229 Gbps         26.628 Gbps
512 byte      27.146 Gbps          16.966 Gbps         27.144 Gbps
768 byte      27.337 Gbps          17.391 Gbps         27.327 Gbps
1024 byte     27.396 Gbps          18.539 Gbps         27.392 Gbps
1280 byte     27.460 Gbps          17.420 Gbps         27.459 Gbps
1518 byte     27.441 Gbps          18.020 Gbps         27.469 Gbps
References
[1] Radisys, "Radisys Delivers Industry's First 40G Solution For Intel DPDK," http://www.radisys.com/2012/radisys-delivers-industrys-first-40g-solution-for-intel-data-plane-development-kit/
[2] Performance degradation at 64-byte and 128-byte packets due to IOTLB eviction, p. 28, http://www.intel.co.kr/content/dam/www/public/us/en/documents/presentation/dpdk-packet-processing-ia-overview-presentation.pdf
[3] VMware and Intel, "Using DPDK in a Virtual World," 08 Sept 2014.
1. Register
2. After logging in, vote for the presentation
Thanks,