Automatic NF acceleration in ACES

Network Functions (NFs) are pervasive in today’s networks. They implement core network functionality: fundamental features like bridging and network address translation; acceleration with WAN optimizers and load balancers; and security guarantees with firewalls, port-scan detectors, and intrusion detection systems.

The ACES network will incorporate accelerated software NFs across the infrastructure to meet the requirements of its use cases. A crucial challenge is guaranteeing the required performance while keeping the flexibility offered by software.

Context. NFs were originally implemented as fixed-function, closed-source appliances, but recently there has been a transition to implementing them in software on commodity off-the-shelf servers. These software NFs gain flexibility and ease of deployment, but at the cost of a harder performance challenge. Specifically, processing packets at current line-rate speeds (100+ Gbps) requires multiple CPU cores, and the difficulty lies in using them without breaking the NF's core functionality.

At high line rates (e.g., 100 Gbps), each packet must be processed within a very short time budget, making inter-core coordination complex and costly. Avoiding this synchronization is both difficult and error-prone: it requires a deep understanding of the NF, a meticulous implementation, and careful avoidance of common parallelization pitfalls.

Automatic NF parallelization. In ACES, we advocate a paradigm shift in NF parallelization: the burden of parallelization should not fall on the developer but instead be handled automatically by compilers. This approach empowers developers to reason about their NFs with a sequential mindset while reaping the full benefits of parallelization. In this context, we developed Maestro, a system that automatically parallelizes software network functions.

Maestro uses static-analysis tools to analyze the sequential implementation of the NF and automatically generates an enhanced parallel version that carefully configures the NIC to distribute traffic across cores while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic.
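To make the shared-nothing idea concrete, here is a hypothetical sketch (not Maestro's actual code): each core owns a private state table, and a flow-based hash, standing in for the NIC's RSS dispatch, ensures every packet of a given flow lands on the core that owns that flow's state, so no locks are needed. The `ShardedNF` class, the packet dictionary layout, and the per-flow counter are illustrative assumptions.

```python
NUM_CORES = 4

def flow_hash(pkt):
    # Toy stand-in for the NIC's RSS hash over the 5-tuple.
    return hash((pkt["src_ip"], pkt["dst_ip"],
                 pkt["src_port"], pkt["dst_port"], pkt["proto"]))

class ShardedNF:
    """Toy per-flow-counter NF with one private state table per core."""

    def __init__(self, num_cores=NUM_CORES):
        # One table per core: state is partitioned, never shared.
        self.tables = [{} for _ in range(num_cores)]
        self.num_cores = num_cores

    def steer(self, pkt):
        # Emulates the NIC dispatching packets to cores via RSS.
        return flow_hash(pkt) % self.num_cores

    def process(self, pkt):
        core = self.steer(pkt)
        table = self.tables[core]           # owned exclusively by `core`
        key = (pkt["src_ip"], pkt["src_port"])
        table[key] = table.get(key, 0) + 1  # single writer: no lock needed
        return core
```

Because all packets of a flow hash to the same core, each table has a single reader and writer, which is what makes the shared-nothing configuration lock-free.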

To find a shared-nothing solution, Maestro analyzes the NF and infers how it should partition state and packets across cores to avoid synchronization altogether (i.e., a sharding solution). To concretize this sharding solution, Maestro formulates it as an SMT problem and uses a solver (e.g., Z3) to find a NIC configuration that enforces it. Finally, it applies this configuration to the NIC and automatically generates performance-oriented parallel code that handles the pitfalls of parallel programming.
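As a loose illustration of the solver step (not Maestro's real SMT encoding, which operates over actual RSS keys), the toy search below plays the role of the solver: it looks for hash parameters that make a simplified flow hash symmetric, so that both directions of a connection map to the same core. The hash function and the parameter space are invented for the example.

```python
def toy_hash(a, b, k1, k2):
    # Stand-in for an RSS-style hash over two header fields (e.g., src, dst).
    return (a * k1 + b * k2) % 256

def find_symmetric_params(samples):
    # Exhaustive search playing the solver's role: find (k1, k2), k1 != 0,
    # such that toy_hash(a, b) == toy_hash(b, a) on all sample field pairs.
    for k1 in range(1, 256):
        for k2 in range(256):
            if all(toy_hash(a, b, k1, k2) == toy_hash(b, a, k1, k2)
                   for a, b in samples):
                return k1, k2
    return None  # no symmetric configuration exists in this space
```

In Maestro the analogous constraints are expressed over the NIC's real hash function and solved with Z3, but the goal is the same: a traffic-steering configuration under which each flow's state stays on a single core.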

Evaluation. We parallelized 8 common software NFs. The figure shows how their performance scales with the number of cores. They generally scale linearly until bottlenecked by PCIe when using small packets (an optimal outcome) or by the 100 Gbps line rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even for challenging, parallel-unfriendly workloads.

Maestro was presented at NSDI '24, and its source code is available on GitHub.
