Automatic NF acceleration in ACES

Network Functions (NFs) are pervasive in today’s networks. They implement core network functionality, from fundamental features like bridging and network address translation; to accelerating the network with WAN optimizers and load balancers; to guaranteeing security with firewalls, port scan detectors, and intrusion detection systems.

The ACES network will incorporate accelerated software NFs across the infrastructure to meet the requirements of its use cases. A crucial challenge is guaranteeing the required performance while keeping the flexibility offered by software.

Context. NFs were originally implemented as fixed-function, closed-source appliances, but recently there has been a transition to implementing them in software on commodity off-the-shelf servers. These software NFs gain flexibility and ease of deployment at the cost of a harder performance challenge. Specifically, to allow NFs to process packets at current line-rate speeds (100+ Gbps), one must resort to multiple CPU cores. The difficulty lies in doing so without breaking the NF's core functionality.

At high line rates (e.g., 100 Gbps), each packet must be processed within a very short time budget, making inter-core coordination complex and costly. Avoiding this synchronization is difficult and error-prone: it requires a deep understanding of the NF, meticulous implementation, and careful avoidance of common parallelization pitfalls.

Automatic NF parallelization. In ACES, we advocate a paradigm shift in NF parallelization: the burden of parallelization should not be placed on the developer but instead be carried automatically by compilers. This approach empowers developers to reason about their NFs with a sequential mindset while reaping the full benefits of parallelization. In this context, we developed Maestro, a system that automatically parallelizes software network functions.

Maestro uses static-analysis tools to analyze the sequential implementation of the NF and automatically generates an enhanced parallel version that carefully configures the NIC to distribute traffic across cores while preserving semantics. When possible, Maestro orchestrates a shared-nothing architecture, with each core operating independently without shared memory coordination, maximizing performance. Otherwise, Maestro choreographs a fine-grained read-write locking mechanism that optimizes operation for typical Internet traffic.
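To make the two execution models concrete, here is a minimal sketch (our own simplification, not Maestro-generated code) contrasting shared-nothing per-core state with lock-protected shared state. Python's standard library has no read-write lock, so a plain mutex stands in for the fine-grained read-write locking described above:

```python
import threading
from collections import defaultdict

NUM_CORES = 4

# Shared-nothing: the NIC steers each flow to exactly one core, so each
# core keeps a private table and never synchronizes with its peers.
per_core_tables = [defaultdict(int) for _ in range(NUM_CORES)]

def process_shared_nothing(core_id, flow):
    # No lock needed: only core `core_id` ever touches this table.
    per_core_tables[core_id][flow] += 1

# Fallback: one table shared by all cores, guarded by a lock. In the
# real system a read-write lock lets the common read-mostly case of
# typical Internet traffic proceed in parallel.
shared_table = defaultdict(int)
table_lock = threading.Lock()

def process_shared(flow):
    with table_lock:
        shared_table[flow] += 1
```

The shared-nothing path is faster because each core's table stays in its own cache and no core ever waits on another; the locked path is the semantics-preserving fallback when state cannot be partitioned.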

To find a shared-nothing solution, Maestro analyzes the NF and infers how state and packets should be partitioned across cores to avoid synchronization altogether (i.e., a sharding solution). To concretize this sharding solution, Maestro formulates it as an SMT problem and uses a solver (e.g., Z3) to find a NIC configuration that enforces it. Finally, it configures the NIC accordingly and automatically generates performance-oriented parallel code, dealing with the pitfalls of parallel programming on the developer's behalf.
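The flavor of this constraint-solving step can be illustrated with a toy stand-in (again our own simplification: a weighted-sum hash and brute-force search replace the NIC's actual RSS hash and the SMT solver). The constraint shown is a common one for stateful NFs: both directions of a flow must land on the same core.

```python
from itertools import product

NUM_CORES = 4

def toy_hash(key, flow):
    # Toy stand-in for the NIC's hash: a key-weighted sum of the
    # packet fields, reduced to a core index.
    return sum(k * f for k, f in zip(key, flow)) % NUM_CORES

def satisfies_sharding(key, flows):
    # Constraint: a flow (src, dst) and its reverse (dst, src) must
    # hash to the same core, so per-flow state never crosses cores.
    return all(toy_hash(key, (s, d)) == toy_hash(key, (d, s))
               for s, d in flows)

sample_flows = [(10, 20), (3, 77), (5, 5), (101, 42)]

# Brute-force search over a tiny key space, standing in for the solver.
key = next(k for k in product(range(1, 8), repeat=2)
           if satisfies_sharding(k, sample_flows))
```

A solver like Z3 does the same job symbolically, proving the constraint holds for all packets rather than checking a sample, and over the NIC's real hash function rather than a toy one.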

Evaluation. We parallelized 8 common software NFs. The Figure shows how their performance scales with the number of cores. They generally scale linearly until bottlenecked by PCIe when using small packets (an optimal outcome), or by the 100 Gbps line rate with typical Internet traffic. Maestro further outperforms modern hardware-based transactional memory mechanisms, even on challenging, parallel-unfriendly workloads.

Maestro was presented at NSDI '24, and its source code is available on GitHub.

