Shift-left OCP System Evaluation with Virtual Client library (VC) for Azure

  • Thread starter: MSFT_RahulShah



Terminology



Term | Definition
VC | Virtual Client library for Azure, an open-source repository of workloads and industry benchmarks by Microsoft.
Shift-Left | Enabling quality evaluations of the hardware systems during early stages of their development lifecycle.
EV | Engineering version, used in the context of the hardware system maturity.
DV | Development version, used in the context of the hardware system maturity.
PV | Production version, used in the context of the hardware system maturity.
Workloads | Customer representative executables or applications


Overview



Virtual Client library (VC) for Azure is an open-source, standardized automation library of industry benchmarks and cloud customer workloads from Microsoft [1]. It is a key platform used by 7+ engineering groups at Microsoft to evaluate the impact of software, firmware, and hardware changes on the Azure customer experience and on Azure infrastructure, informing deployment decisions that follow Safe Deployment Practices (SDP) [2].

This document outlines opportunities to use the VC platform to shift-left the quality evaluation of OCP systems by assessing their “cloud readiness” against the OCP specifications during the early phases of the development lifecycle. It also explores opportunities to use the VC platform for the OCP Test and Validation initiatives [3].



Virtual Client



Virtual Client (VC) is an open-source platform from Microsoft that supports 40+ workloads and benchmarks (and growing) spanning x64 and ARM64 architectures, Windows and Linux operating systems, and host and virtualized runtimes. VC offers run-time dependency management, extensible monitoring, multiple configurations for workloads and benchmarks, a common schema for analysis, and an off-the-shelf, robust data engineering pipeline built on Azure Platform-as-a-Service (PaaS) offerings. With these capabilities, VC can evaluate various OCP systems (e.g., server, networking, firmware, storage) for performance, reliability, and security benchmarking.

The platform scales from a single benchtop system to thousands of systems in datacenters. Its capabilities keep growing because the codebase is open source, curated by Microsoft engineers, and open to engineering contributions from the industry.

[Figure 1: VC engineering stack]



VC uses the concept of profiles to author and support multiple configurations of workloads and benchmarks. VC profiles are JSON configuration files that specify the compile-time and run-time behavior of the underlying benchmarks and workloads, for example, the compiler version, architecture, and compiler flags. Once deployed, the VC executable uses these profiles to create the run-time environment, fetch and load dependencies, trigger monitoring tools, and establish local and remote connections.
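
As a rough illustration of the profile concept described above, the Python sketch below parses a profile-like JSON document and lists the components in each section. The section names (Description, Parameters, Actions, Dependencies, Monitors) and every component name and value are assumptions chosen for illustration, not a verified VC profile; the repository documentation is the authority on the real schema.

```python
import json

# Hypothetical profile text. Section names, component types, and values are
# illustrative assumptions, not copied from a real VC profile.
PROFILE_TEXT = """
{
  "Description": "Example CPU benchmark profile (illustrative only)",
  "Parameters": {"CompilerVersion": "10", "CompilerFlags": "-O2"},
  "Dependencies": [
    {"Type": "ExampleCompilerInstallation",
     "Parameters": {"CompilerVersion": "10"}}
  ],
  "Actions": [
    {"Type": "ExampleBenchmarkExecutor",
     "Parameters": {"Scenario": "cpu_throughput", "Duration": "00:05:00"}}
  ],
  "Monitors": [
    {"Type": "ExamplePerfCounterMonitor",
     "Parameters": {"IntervalSeconds": 1}}
  ]
}
"""


def summarize_profile(text):
    """Parse a profile and return the component types found in each section."""
    profile = json.loads(text)
    summary = {}
    for section in ("Dependencies", "Actions", "Monitors"):
        # A missing section is treated as an empty list of components.
        components = profile.get(section, [])
        summary[section] = [component["Type"] for component in components]
    return summary


if __name__ == "__main__":
    for section, types in summarize_profile(PROFILE_TEXT).items():
        print(f"{section}: {', '.join(types) or '(none)'}")
```

Keeping dependencies, actions, and monitors as separate lists mirrors the three responsibilities the paragraph attributes to a profile: preparing the environment, running the workload, and observing the system.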



Applications



The VC platform can be used to shift-left performance, reliability, and security benchmarking evaluations for OCP systems (e.g., server, networking, storage) following OCP specifications.

Another important reason to use the VC platform is its robust data engineering pipeline. The VC platform abstracts emitted metrics, monitors, performance counters, telemetry events, and unstructured logs using a common schema with a documented data dictionary. This allows OCP projects to track and compare performance and reliability benchmarking across comparable systems and system versions. The VC data engineering infrastructure streamlines A/B comparisons and generates portable, reproducible metrics. It can also help expedite OCP specification validations that use the OCP Test-and-Validation project schema.

By default, this data is persisted to the local filesystem. VC can also use Azure Platform-as-a-Service (PaaS) offerings (e.g., Azure Storage, Azure Event Hubs, Azure Data Explorer) for off-the-shelf data analysis. This part of the pipeline focuses on reporting, analysis, and insights rather than on execution.
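
To make the A/B comparison idea concrete, here is a minimal Python sketch that groups metric records sharing one schema and reports the percent change between two systems. The record fields (`system`, `tool`, `metric`, `value`) and all numbers are hypothetical; they are not the VC data dictionary.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records in a shared schema; field names and values are made up.
RECORDS = [
    {"system": "A", "tool": "fio", "metric": "read_iops", "value": 100000.0},
    {"system": "A", "tool": "fio", "metric": "read_iops", "value": 110000.0},
    {"system": "B", "tool": "fio", "metric": "read_iops", "value": 126000.0},
    {"system": "A", "tool": "fio", "metric": "read_latency_ms", "value": 0.50},
    {"system": "B", "tool": "fio", "metric": "read_latency_ms", "value": 0.40},
]


def compare(records, baseline, candidate):
    """Return {(tool, metric): percent change of candidate mean vs. baseline mean}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in records:
        key = (record["tool"], record["metric"])
        grouped[key][record["system"]].append(record["value"])

    changes = {}
    for key, by_system in grouped.items():
        # Only metrics reported by both systems can be compared.
        if baseline in by_system and candidate in by_system:
            base = mean(by_system[baseline])
            cand = mean(by_system[candidate])
            changes[key] = (cand - base) / base * 100.0
    return changes


if __name__ == "__main__":
    for (tool, metric), change in sorted(compare(RECORDS, "A", "B").items()):
        print(f"{tool}/{metric}: {change:+.1f}%")
```

Because every record carries the same fields regardless of which tool produced it, the comparison logic does not need tool-specific parsing, which is the practical benefit of a common schema.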
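
The "local by default, cloud when configured" behavior can be sketched in Python as a sink that always appends to a local file and optionally forwards each record. This is an assumed design for illustration only: `LocalFirstSink`, `read_all`, and the `forward` hook are hypothetical names, and no real Azure SDK call is shown.

```python
import json
import os
import tempfile
from pathlib import Path


class LocalFirstSink:
    """Append records to a local JSON Lines file; optionally forward each one."""

    def __init__(self, path, forward=None):
        self.path = Path(path)
        # Hypothetical hook: a callable that would upload to a cloud service.
        self.forward = forward

    def write(self, record):
        # The local write always happens first, so data survives upload failures.
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(record) + "\n")
        if self.forward is not None:
            self.forward(record)


def read_all(path):
    """Load every record back from a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "metrics.jsonl")
    uploaded = []  # stands in for a remote store
    sink = LocalFirstSink(target, forward=uploaded.append)
    sink.write({"metric": "read_iops", "value": 105000.0})
    print(len(read_all(target)), "local record(s),", len(uploaded), "forwarded")
```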



[Figure 2: High-level data engineering pipeline]



[Figure 3: Example of unified data collection schema]



Analysis



Based on early analysis, VC can support the following OCP specifications with minor changes to its profiles.


OCP Spec | OCP Project | VC Value-Add | Capability Readiness | OCP Contributor
NVMe Cloud SSD Specification | Storage | Automated Testing | Yes (FIO, DiskSpd) | Microsoft, Meta
NVMe Cloud HDD Specification | Storage | Automated Testing | Yes (FIO, DiskSpd) | Microsoft, Seagate, Western Digital
Hyperscale NVMe Boot SSD Specification | Storage | Automated Testing | Yes (FIO, DiskSpd) | Meta, Google
Datacenter NVMe® SSD Specification | Storage | Automated Testing | Yes (FIO, DiskSpd) | Microsoft, Meta, HPE, Dell
Base Specification for Immersion Fluids | Cooling | Power Monitoring, Automated Workload | Yes (ipmiutil, SPECpower) | Intel
Shasta HW System Specifications | Rack | Power/Fan/Temperature Monitoring | Yes (ipmiutil, SPECpower) | Microsoft
OCP Accelerator Module Design Specification | Accelerator | Power/Fan/Temperature Monitoring | Yes (ipmiutil, FPGAstress) | Meta, Microsoft, Baidu
Test and Validation Enablement Initiative | Validation | Automated Testing | Yes | OCP
High Performance Computing - Incubation | HPC | Automated Testing | Yes (HPCG, HPLinpack, LAPACK etc.) | OCP
NIC 3.0 | Networking | Automated Testing | Yes (NTttcp, sockperf etc.) | OCP
Composable Memory | Memory | Automated Testing | Yes (SPECjbb, LMbench etc.) | OCP
Regional Project Community - China Mainland | AI | Automated Testing | Yes (SuperBench, MLPerf etc.) | OCP-China


Opportunities



The following OCP projects could leverage the VC platform capabilities for performance, reliability, and security benchmarking.


OCP Project | VC Capabilities
Cooling | ipmiutil data collected in parallel with SPECpower or other simulated workloads
Hardware Fault Management | Standard fault injection or stress workloads
Networking | Wide range of networking benchmarks
Power | ipmiutil data collected in parallel with SPECpower or other simulated workloads
Firmware | Firmware automation and qualification
Security | Security qualification, similar to the tests currently run for ACC
Server – HPC incubation | HPC benchmarks like HPCG and LAPACK
Storage | IO workloads like FIO and DiskSpd, and database workloads
AI projects in OCP China | GPU benchmarks like SuperBench and MLPerf
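
The Cooling and Power rows above rely on collecting sensor data while a workload runs. The Python sketch below shows that pattern in its simplest form, with a background thread sampling a sensor until the workload finishes. The workload and sensor here are stand-ins (`fake_workload`, `fake_power_sensor`); the sketch does not invoke ipmiutil or SPECpower and is not how VC implements its monitors.

```python
import threading
import time


def run_with_monitor(workload, read_sensor, interval_s=0.01):
    """Run workload() while sampling read_sensor() on a background thread."""
    samples = []
    stop = threading.Event()

    def sample_loop():
        while True:
            # Record a timestamped reading, then sleep until the next
            # interval or until the workload signals completion.
            samples.append((time.monotonic(), read_sensor()))
            if stop.wait(interval_s):
                break

    monitor = threading.Thread(target=sample_loop, daemon=True)
    monitor.start()
    try:
        result = workload()
    finally:
        # Always stop the monitor, even if the workload raises.
        stop.set()
        monitor.join()
    return result, samples


def fake_workload():
    # Stand-in for a real benchmark run.
    time.sleep(0.05)
    return "done"


def fake_power_sensor():
    # Stand-in for a real power reading; returns a constant value in watts.
    return 250.0


if __name__ == "__main__":
    result, samples = run_with_monitor(fake_workload, fake_power_sensor)
    print(f"workload: {result}, samples collected: {len(samples)}")
```

Timestamping every sample is what later allows power or temperature readings to be aligned with the phases of the workload during analysis.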


FAQ



What is Virtual Client library for Azure?

  • It is a standardized, collaborative, open-source platform of workloads and industry benchmarks. It is the outcome of the work, ideas, and expertise of engineers across Azure coming together to standardize and solve cloud-scale system readiness problems.
  • The Virtual Client platform supports 40+ workloads and cloud customer representative benchmarks focusing on CPU, GPU, disk I/O, network/web performance, memory, SQL/database, compression, encryption, Java, resiliency, and machine learning model training and inference.
  • At a high level, the Virtual Client platform orchestrates workload execution, system monitoring, and dependencies, and provides metrics, telemetry, and monitoring data in a standardized common schema.
  • Virtual Client can run as a stand-alone executable on a single machine, or as an end-to-end solution across hundreds of machines.

What are the platforms supported by Virtual Client?

The Virtual Client platform abstracts platform and runtime requirements using “profiles.” It supports Linux and Windows operating systems, x64 and ARM64 architectures, and both host and guest (VM) platforms.
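
As a small illustration of what abstracting the platform can involve, the Python sketch below maps the local operating system and CPU architecture onto one of the four supported combinations. The key format (for example `linux-x64`) and the alias tables are assumptions made for this example, not a documented VC interface.

```python
import platform

# Assumed aliases: raw values reported by Python, mapped to short names.
OS_ALIASES = {"linux": "linux", "windows": "win"}
ARCH_ALIASES = {
    "x86_64": "x64",
    "amd64": "x64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def platform_key(system=None, machine=None):
    """Return a key such as 'linux-x64' for a supported OS/architecture pair."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    try:
        return f"{OS_ALIASES[system]}-{ARCH_ALIASES[machine]}"
    except KeyError:
        raise ValueError(f"unsupported platform: {system}/{machine}") from None


if __name__ == "__main__":
    try:
        print("Detected platform:", platform_key())
    except ValueError as error:
        print(error)
```

A key like this could be used to pick the right pre-built binaries or profile variant for a machine, so the same profile name works across all four combinations.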



How can I use / contribute to the Virtual Client library for Azure?

Virtual Client is open source, and the repository is actively maintained and curated by the Azure engineering team. The repository is well documented, with step-by-step instructions for onboarding Virtual Client using Azure PaaS offerings (Azure Event Hubs, Azure Data Explorer etc.).

Getting Started: https://microsoft.github.io/VirtualClient

GitHub Repo: https://github.com/microsoft/VirtualClient



Where can I find the list of workloads supported by Virtual Client?

References


[1] What is Virtual Client? (Virtual Client Platform)

[2] Azure Safe Deployment Practices (SDP): Advancing safe deployment practices (Azure Blog, Microsoft Azure)

[3] OCP Test and Validation Enablement Initiative (OpenCompute)
