DAOS Storage Revolutionizes High-Performance Storage
Enabled by Intel® Optane™ persistent memory, Distributed Asynchronous Object Storage (DAOS) offers dramatic improvements to storage I/O to accelerate HPC, AI, analytics, and cloud projects.
Achieve High-Performance Storage with DAOS
DAOS is an open source, distributed storage solution based on several innovative principles. This technology uses the fast I/O and data persistence of Intel® Optane™ persistent memory in combination with any Non-Volatile Memory Express (NVMe) SSD, such as Intel® NVMe SSDs, to alleviate bottlenecks and drive storage performance in distributed environments.
Latency in Traditional Storage
In a traditional storage solution, metadata tells the operating system (OS) where data is located within a storage cluster. Any time the system reads or writes data, it must also create or modify the corresponding metadata within an I/O block on the underlying storage media. In compute clusters, multiple nodes may need to access the same block, so traditional storage temporarily locks the block to prevent write conflicts. Repeated across millions of read/write operations, this locking adds significant storage latency that limits application I/O.
Microsecond Write Latencies with DAOS, Intel® Optane™ Persistent Memory, and Intel® NVMe SSDs
In a DAOS configuration, Intel® Optane™ persistent memory modules store metadata for the entire cluster at byte granularity rather than block granularity, so there’s no need to lock a whole block as with traditional storage. The use of NVMe SSDs further allows storage I/O to saturate the PCIe bus, which provides a wider data path than SATA SSDs.1 As a result, DAOS can reduce storage I/O latency by orders of magnitude compared to traditional storage, from milliseconds (ms) to tens of microseconds (μs).2 Persistent memory also preserves metadata through system shutdowns or reboots and can absorb small write operations to help ensure system uptime and availability for stringent SLAs. In DAOS deployments with 3D QLC NAND storage, persistent memory can also help mitigate the performance impact of write pressure on the storage cluster.
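To make the block-versus-byte distinction concrete, the following is a minimal illustrative sketch, not DAOS code. It assumes a hypothetical 4,096-byte block size and a handful of made-up 64-byte metadata updates, and counts how many pairs of concurrent updates would contend under block-granular locking compared with byte-range granularity. All function names and values are assumptions chosen for illustration.

```python
from itertools import combinations

BLOCK_SIZE = 4096  # assumed block size in bytes (illustrative only)


def blocks_touched(offset, length):
    """Return the set of block indices covered by a byte range."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return set(range(first, last + 1))


def conflicts_block_granular(updates):
    """Count update pairs that share at least one block (and so one lock)."""
    return sum(
        1
        for a, b in combinations(updates, 2)
        if blocks_touched(*a) & blocks_touched(*b)
    )


def byte_ranges_overlap(a, b):
    """True if two (offset, length) byte ranges actually overlap."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


def conflicts_byte_granular(updates):
    """Count update pairs whose byte ranges genuinely overlap."""
    return sum(
        1 for a, b in combinations(updates, 2) if byte_ranges_overlap(a, b)
    )


# Hypothetical concurrent metadata updates as (offset, length) in bytes.
# Three fall in block 0 and one in block 1; none overlap byte-for-byte.
updates = [(0, 64), (64, 64), (128, 64), (4096, 64)]

print(conflicts_block_granular(updates))  # 3 contending pairs
print(conflicts_byte_granular(updates))   # 0 contending pairs
```

In this toy case, the three updates that land in the same block would serialize behind one block lock even though they touch disjoint bytes, while at byte granularity none of them conflict.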
Open Source Software and Validated DAOS Releases
In addition to the hardware layer, solution providers will need open source DAOS software to complete the stack. Developers can download and compile the code directly from GitHub. For a simpler deployment path, tested and validated binary releases are available through the community daos.io website. Intel actively works with partners and solution providers to enable their DAOS product offerings with L3 technical support.
HPC and Big Data: DAOS Storage Changing the Future
Within big data and HPC clusters, compute nodes are tightly connected to storage tiers, and data scientists commonly deal with a variety of cold, warm, and hot data types. The future of storage configurations will depend on a hybrid approach where DAOS is attached to file systems that also use cost-optimized SATA drives. Academic and government labs are already seeing high compute utilization in their HPC clusters, which is accelerating discovery.
For example, Washington University’s radiology research center deployed a software-defined storage system enabled by DAOS to accommodate up to 13 petabytes of storage while reducing cost by USD 1,500 per storage node.3 Commercial HPC deployments show considerable potential as well, especially in the energy and healthcare sectors, which depend on HPC, AI, analytics, and simulation workloads. In the IO-500 benchmark, a ranking of the world’s fastest storage systems, half of the top 10 positions are currently held by DAOS configurations.4
DAOS Storage for Exascale Performance
DAOS is the file system of choice for the Argonne National Lab (ANL) Aurora supercomputer, the first planned HPC system targeting exascale compute performance, with up to 230 petabytes of DAOS-enabled storage and more than 25 TB/s of read/write bandwidth. ANL and the Texas Advanced Computing Center (TACC), another Intel partner with DAOS-enabled HPC, were both ranked in the top five on the IO-500 list as of September 2020. These successes have also spurred interest from CSPs like Google Cloud Platform, which is now looking to integrate DAOS into its cloud storage services.
Low Read Latencies, Even in Presence of Write Pressure
Even for cost-optimized media that’s qualified for read-intensive workloads, DAOS can have a positive impact. When tested with Intel® QLC 3D NAND SSDs in place of NVMe SSDs, a DAOS configuration achieved five-nines (P99.999) read tail latencies between 200 and 300 μs, meaning that 99.999 percent of all read requests completed in under 300 μs.2 Under write pressure as high as 2,500 MB/s, the same test showed that DAOS could still meet file system SLAs, keeping five-nines read latency below 5 ms.2
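For readers less familiar with tail-latency notation, here is a minimal sketch of how a P99.999 figure can be computed from raw latency samples using the nearest-rank method. The function name and the sample data are hypothetical and are not taken from the test cited above; they only illustrate what "five nines" means in practice.

```python
import math
from fractions import Fraction


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample such that at least pct percent
    of all samples are less than or equal to it (nearest-rank method)."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Use exact rational arithmetic so 99.999 does not pick up
    # floating-point rounding error before the ceiling is taken.
    rank = math.ceil(Fraction(str(pct)) / 100 * len(ordered))  # 1-based
    return ordered[rank - 1]


# Hypothetical read latencies in microseconds: 100,000 requests,
# almost all served in 250 us, with a few slow outliers at 4,000 us.
one_outlier = [250] * 99_999 + [4000]
two_outliers = [250] * 99_998 + [4000] * 2

print(percentile_nearest_rank(one_outlier, 99.999))   # 250
print(percentile_nearest_rank(two_outliers, 99.999))  # 4000
```

With 100,000 requests, P99.999 tolerates only a single slow request: one outlier leaves the figure at 250 μs, while a second outlier pushes it to 4,000 μs. This is why a five-nines latency target is a much stricter commitment than an average or P99 target.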
Intel® Xeon® Scalable Processors and DAOS Performance
Higher processor performance in a storage node improves DAOS performance. Generational improvements in the number of memory channels, bandwidth per channel, and PCIe speed (PCIe 4 vs. PCIe 3) offer a significant boost to DAOS. In an IOR benchmark test, a configuration with 3rd Gen Intel® Xeon® Scalable processors and Intel® Optane™ persistent memory 200 series achieved a 58 percent increase in write performance compared to previous-generation CPUs and persistent memory.5 PCIe Gen 5 in future processor generations is expected to bring even higher performance levels to DAOS.
DAOS Conclusion: A New Path for Fast Storage I/O
DAOS offers a new path to storage I/O that keeps pace with growing compute performance and powers the most demanding use cases while providing storage I/O headroom for everything else. Applications for AI, analytics, HPC, and even cloud computing can benefit from DAOS based on Intel® Optane™ persistent memory.