StorageReview - DapuStor Xlenstor X2900P SCM SSD Review
The DapuStor X2900P is the company’s newest storage-class memory (SCM) data center SSD. Powered by an internally developed DPU600 controller and paired with KIOXIA XL-Flash, this second-generation SSD is available in both AIC and U.2 form factors and uses the PCIe Gen4 interface. Because it’s SCM, the DapuStor X2900P is designed for applications that require the best latency profile available from an SSD.
Recently, DapuStor made a minor portfolio change, creating two sub-series of the Xlenstor Gen2 family: the X2900 and X2900P. The “Pro” version (the model we will be looking at, designated by the “P” at the end) features better endurance and improved performance.
While both models use the PCIe Gen4 interface (NVMe 1.4a, dual-port) and draw the same power at 18W active and 5W idle, the X2900P offers much better random 4K write speeds. The Pro model is quoted at 1.8 million IOPS read and 1.2 million IOPS write, while the non-Pro (X2900) is expected to reach the same read performance and 640K IOPS write. Both models are specced with identical sequential performance at 7.5GB/s read and 7GB/s write for the highest capacity model.
For reliability, the X2900P and X2900 have the same MTBF at 2.5 million hours; however, the Pro model has an improved endurance rating of 100 DWPD compared to the non-Pro's 60 DWPD.
The DapuStor X2900P is backed by a 5-year warranty. We will be looking at the 800GB U.2 model for this review.
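To put the endurance rating in perspective, the short sketch below converts DWPD into total data written over the warranty period for the 800GB review unit. It assumes decimal units, a 365-day year, and that the DWPD rating applies across the full 5-year warranty; the function name is ours and purely illustrative.

```python
def lifetime_writes_pb(capacity_gb, dwpd, warranty_years):
    """Petabytes written if the drive is filled `dwpd` times a day for the whole warranty.

    Assumptions (not from the spec sheet): decimal units, 365-day years.
    """
    days = warranty_years * 365
    total_gb = capacity_gb * dwpd * days
    return total_gb / 1_000_000  # GB -> PB


# 800GB review unit at the Pro model's 100 DWPD rating
daily_tb = 800 * 100 / 1000
print(f"Daily write budget: {daily_tb:.0f} TB/day")
print(f"Warranty-period writes: {lifetime_writes_pb(800, 100, 5):.1f} PB")

# Same capacity at 60 DWPD, for comparison only (hypothetical configuration)
print(f"At 60 DWPD: {lifetime_writes_pb(800, 60, 5):.1f} PB")
```

Under those assumptions, 100 DWPD on an 800GB drive works out to 80TB of writes per day.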
DapuStor X2900P Specifications
▼DapuStor X2900P vs. Intel Optane P5800X vs. Kioxia FL6
Storage-class memory isn't new. In recent history, Intel's series of SCM drives has been the dominant force in the market. Intel's P4800X really opened the doors here back in 2018 in terms of reshaping storage tiers. While the drives were relatively small in capacity compared to standard SSDs, they offered tremendous latency advantages. Despite the small capacities, the drives have done really well as a write tier. In that role, they're able to absorb all of the incoming writes into a system, letting slower and less-expensive SSDs handle most of the read activity.
Of course, other NAND producers needed to respond, to give packaged SSD vendors something to fight Optane with. Kioxia responded in 2019 with XL-Flash, its SCM product. Toward the end of 2020, DapuStor released the H3900 SCM SSD, the first commercially available drive on Kioxia's new media. It was a worthy Optane competitor, winning out in a number of benchmarks.
You'll notice Samsung isn't mentioned. Samsung launched its Z-NAND, but the ensuing Z-SSD didn't do very well and, it can be argued, wasn't even SCM in the first place. The company has since placed its bets on other technologies like computational storage and doesn't have a modern SCM entry in the market.
Now the market has iterated again, on to second-generation products. Intel has the P5800X, which came out alongside the Ice Lake refresh earlier this year. Intel has had a long time to itself in the SCM pool as a result. But the market is heating up now with the DapuStor X2900P 2nd Gen SCM drive coming out. And now Kioxia has entered the fray with their own SSD based on XL-Flash, the FL6.
While we haven't reviewed the FL6 yet, its big potential hook is capacity. While the X2900P tops out at 800GB (1.6TB for the X2900) and the P5800X stops at 1.6TB, the FL6 will go to 3.2TB. The demand for those large capacities may be light, as these drives are very expensive, but for use cases where there is a premium on rack space, the FL6 should be compelling for that reason alone.
The FL6 doesn't have any published performance numbers, so it's a little difficult to evaluate. The FL6 is just now sampling and Kioxia has only released a 60DWPD endurance spec. That's going to come in equal with the X2900 but less than the 100DWPD spec the P5800X and X2900P offer. This comes down in large part to over-provisioning of the drives. DapuStor offers twin endurance/capacity specs while Kioxia and Intel are just going with one.
With the FL6 abstaining from the performance conversation for now, the P5800X and X2900P square up. Peak performance for the 800GB capacity is 7,500MB/s read and 7,000MB/s write for the DapuStor. Intel's numbers are 7,200MB/s and 6,100MB/s respectively. The X2900P claims a little better peak IOPS as well, 1.8 million vs 1.5 million in the P5800X. Both drives offer a 5-year warranty, and while peak power consumption is the same at 18W, the P5800X does sip a little less power at idle: 4.2W vs 5W. That said, it's unlikely these drives will ever see much idle time.
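For a quick sense of scale, the spec-sheet gaps quoted above can be expressed as percentages. This is only a sketch over the vendor-quoted figures in this review; the dictionary keys and helper name are our own.

```python
# Vendor-quoted figures for the 800GB models, as cited in this review
specs = {
    "DapuStor X2900P": {"seq_read_mbps": 7500, "seq_write_mbps": 7000,
                        "peak_iops": 1_800_000, "idle_w": 5.0},
    "Intel P5800X":    {"seq_read_mbps": 7200, "seq_write_mbps": 6100,
                        "peak_iops": 1_500_000, "idle_w": 4.2},
}


def pct_diff(a, b):
    """Percentage by which `a` exceeds `b` (negative if `a` is lower)."""
    return (a - b) / b * 100


dapu, intel = specs["DapuStor X2900P"], specs["Intel P5800X"]
for metric in dapu:
    diff = pct_diff(dapu[metric], intel[metric])
    print(f"{metric}: X2900P vs P5800X {diff:+.1f}%")
```

Note that a higher idle wattage is a disadvantage, so the sign on that row reads the opposite way from the performance rows.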
But this is all spec sheet data taken during the best of times. We put the drives to work in our extensive benchmarking process to see which SCM SSD takes the crown as the fastest drive on the market.
DapuStor X2900P Performance
Our PCIe Gen4 Enterprise SSD reviews leverage a Lenovo ThinkSystem SR635 for application tests and synthetic benchmarks. The ThinkSystem SR635 is a well-equipped single-CPU AMD platform, offering CPU power well in excess of what’s needed to stress high-performance local storage. It is also the only platform in our lab (and one of the few on the market currently) with PCIe Gen4 U.2 bays. Synthetic tests don’t require a lot of CPU resources but still leverage the same Lenovo platform. In both cases, the intent is to showcase local storage in the best light possible that aligns with storage vendor maximum drive specs.
▼PCIe Gen4 Synthetic and Application Platform (Lenovo ThinkSystem SR635)
◇ 1 x AMD 7742 (2.25GHz x 64 cores)
◇ 8 x 64GB DDR4-3200MHz ECC DRAM (1 x 64GB for Houdini)
◇ CentOS 7.7 1908
◇ Ubuntu 20.10-desktop
◇ ESXi 6.7u3
▼Testing Background and Comparables
The StorageReview Enterprise Test Lab provides a flexible architecture for conducting benchmarks of enterprise storage devices in an environment comparable to what administrators encounter in real deployments. The Enterprise Test Lab incorporates a variety of servers, networking, power conditioning, and other network infrastructure that allows our staff to establish real-world conditions to accurately gauge performance during our reviews.
We incorporate these details about the lab environment and protocols into reviews so that IT professionals and those responsible for storage acquisition can understand the conditions under which we have achieved the following results. None of our reviews are paid for or overseen by the manufacturer of equipment we are testing. Additional details about the StorageReview Enterprise Test Lab and an overview of its networking capabilities are available on those respective pages.
▼VDBench Workload Analysis
When it comes to benchmarking storage devices, application testing is best, and synthetic testing comes in second place. While not a perfect representation of actual workloads, synthetic tests do help to baseline storage devices with a repeatability factor that makes it easy to do apples-to-apples comparisons between competing solutions. These workloads cover a range of testing profiles, from “four corners” tests and common database transfer size tests to trace captures from different VDI environments.
All of these tests leverage the common vdBench workload generator, with a scripting engine to automate and capture results over a large compute testing cluster. This allows us to repeat the same workloads across a wide range of storage devices, including flash arrays and individual storage devices. Our testing process for these benchmarks fills the entire drive surface with data, then partitions a drive section equal to 25% of the drive capacity to simulate how the drive might respond to application workloads. This is different from full-entropy tests, which use 100% of the drive and take it into a steady state. As a result, these figures will reflect higher sustained write speeds.
◇ 4K Random Read: 100% Read, 128 threads, 0-120% iorate
◇ 4K Random Write: 100% Write, 128 threads, 0-120% iorate
◇ 64K Sequential Read: 100% Read, 32 threads, 0-120% iorate
◇ 64K Sequential Write: 100% Write, 16 threads, 0-120% iorate
◇ Synthetic Database: SQL and Oracle
◇ VDI Full Clone and Linked Clone Traces
◇ Comparable drive: Intel P5800X 800GB
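As a rough illustration, the four-corners profiles listed above can be captured as plain data, along with the 25% test footprint described earlier. This is not our actual vdBench configuration: the class and function names are hypothetical, and the 10% step size for the 0-120% load sweep is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    block_kib: int   # transfer size in KiB
    read_pct: int    # 100 = all reads, 0 = all writes
    threads: int
    random: bool     # True for random access, False for sequential


FOUR_CORNERS = [
    Profile("4K Random Read", 4, 100, 128, True),
    Profile("4K Random Write", 4, 0, 128, True),
    Profile("64K Sequential Read", 64, 100, 32, False),
    Profile("64K Sequential Write", 64, 0, 16, False),
]


def load_points(max_pct=120, step_pct=10):
    """Load levels (percent of target rate) for one sweep; step size is assumed."""
    return list(range(step_pct, max_pct + 1, step_pct))


def footprint_gb(capacity_gb, fraction=0.25):
    """Size of the tested partition after the whole drive has been filled."""
    return capacity_gb * fraction


for p in FOUR_CORNERS:
    print(f"{p.name}: {p.threads} threads, {len(load_points())} load points")
print(f"Test footprint on an 800GB drive: {footprint_gb(800):.0f}GB")
```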
In our first VDBench Workload Analysis, random 4K read, the DapuStor X2900P posted a peak of 1,449,098 IOPS at 86.2µs in latency. While the P5800X showed better latency throughout the test, both drives had virtually identical peaks.
In 4K random writes, the X2900P showed impressive results again. Similar to reads, it posted a peak of 1,412,734 IOPS at 85.1µs in latency, edging out the Intel drive.
In sequential 64K read, we measured 6.38GB/s (101,872 IOPS) at 311.1µs from the X2900P. The Intel drive topped out at 7.06GB/s at 281µs.
In sequential 64K write, the X2900P peaked at 102,557 IOPS (6.41GB/s) with a latency of 148.3µs, once again pulling ahead of the Intel drive.
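One useful sanity check on peak numbers like these is Little's Law: average I/Os in flight equals throughput multiplied by latency. The sketch below applies it to the X2900P peaks reported above and compares the result with the thread count of each profile. It assumes each thread keeps roughly one I/O outstanding, which the review does not state; the helper name is ours.

```python
# (peak IOPS, latency in microseconds, profile thread count) for the X2900P
results = {
    "4K read":   (1_449_098, 86.2, 128),
    "4K write":  (1_412_734, 85.1, 128),
    "64K read":  (101_872, 311.1, 32),
    "64K write": (102_557, 148.3, 16),
}


def outstanding_ios(iops, latency_us):
    """Little's Law: average I/Os in flight = arrival rate x time in system."""
    return iops * latency_us / 1_000_000


for name, (iops, latency_us, threads) in results.items():
    in_flight = outstanding_ios(iops, latency_us)
    print(f"{name}: ~{in_flight:.1f} I/Os in flight vs {threads} threads")
```

In each case the estimate lands just under the configured thread count, which suggests the reported IOPS and latency figures are at least mutually consistent.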
Our next set of tests is our SQL workloads: SQL, SQL 90-10, and SQL 80-20. Starting with SQL, the X2900P drive had a peak performance of 570,612 IOPS at a latency of 54.4µs.
In SQL 90-10, the X2900P saw a peak performance of 565,287 IOPS at a latency of 55.3µs.
Looking at SQL 80-20, the X2900P had a peak performance of 570,612 IOPS with 54.4µs in latency.
Next up are our Oracle workloads: Oracle, Oracle 90-10, and Oracle 80-20. Starting with Oracle, the X2900P showed a peak performance of 578,663 IOPS at a latency of 59.6µs.
For Oracle 90-10, the X2900P posted a peak score of 440,322 IOPS at a latency of 48.5µs.
Looking at Oracle 80-20, the X2900P posted a peak performance of 448,923 IOPS at 47µs in latency.
Next, we switched over to our VDI clone test, Full and Linked. For VDI Full Clone (FC) Boot, the X2900P showed a peak of 397,004 IOPS at a latency of 84.6µs.
In VDI FC Initial Login, the X2900P peaked at 309,326 IOPS with a latency of 92.9µs.
With VDI FC Monday Login, the X2900P had a peak of 220,109 IOPS with a latency of 69.5µs.
For VDI Linked Clone (LC) Boot, the X2900P showed a peak of 194,040 IOPS at a latency of 81µs.
VDI LC Initial Login saw the X2900P with a spike in performance at the beginning; however, it quickly leveled out with a peak of 119,971 IOPS at 62.7µs in latency.
Our last test is the VDI LC Monday Login. Here, the X2900P showed a performance spike at the beginning of the test once again, though it leveled out with a peak performance of 170,598 IOPS and a latency of 89.7µs.
The DapuStor X2900P is an impressive release by the company. Equipped with a DPU600 controller, Kioxia XL-Flash, and the speedy PCIe Gen4 interface, this second-generation SCM drive is available in both the add-in card and U.2 form factors. DapuStor designed this drive specifically for organizations that need ultra-low latency combined with excellent performance and endurance. In that, it certainly excels.
To gauge its performance, we tested the DapuStor X2900P alongside the Intel P5800X and looked at VDBench synthetic workloads. Compared with the DapuStor Haishen3-XL drive (H3900 PCIe Gen3) we reviewed last year, performance has greatly improved.
In our first series of tests, highlights include: 1.45 million IOPS in 4K read, 1.41M IOPS in 4K write, 6.38GB/s in 64K read, and 6.41GB/s in 64K write. In both 4K random write and 64K sequential write, the DapuStor X2900P was able to hold its own against Intel's P5800X. On read performance, the Optane SSD still had the edge for both peak speed and lower latency.
In our SQL testing, the new DapuStor X2900P saw peaks of 571K IOPS in SQL, 565K IOPS in SQL 90-10, and 570K IOPS in SQL 80-20. With Oracle, we saw 579K IOPS in Oracle, 430K IOPS in Oracle 90-10, and 449K IOPS in Oracle 80-20.
Next up were our VDI Clone tests, Full and Linked. In Full Clone, we saw 397K IOPS in boot, 309K IOPS in Initial Login, and 220K IOPS in Monday Login. In Linked Clone we saw 194K IOPS in boot, 120K IOPS in Initial Login, and 171K IOPS in Monday Login.
As you can see from the results above, both the Intel and DapuStor drives are incredibly fast. Intel has been unbeatable so far in the SCM drive category. However, the X2900P pretty much rides on the Intel P5800X's tail throughout our review. It even notches a few wins in 4K and 64K write performance, which is very impressive. While the P5800X has been the dominant player all year, the DapuStor X2900P is a worthy challenger.
So where does that leave us? A lot of this will come down to pricing. If DapuStor is less expensive than Intel, they’ll rack up some wins. But either way, having two very strong options in the SCM storage category may help apply some pricing pressure to the entire segment. In the end, the X2900P is a very good drive and we look forward to seeing what DapuStor can do next.