Tests show Datera provides significantly higher IOPS and consistently lower latency.

In Part One of this blog series, we compared the designs of Datera and Ceph storage. In this installment, we will compare the performance of each.

Let’s start by acknowledging that every storage system has its strengths and weaknesses, and that synthetic testing is often crafted to highlight the strengths of one system over another while hiding its weaknesses. With this in mind, our goal was to keep as many variables constant as possible to achieve an apples-to-apples comparison.

In our tests, we used the same physical storage node hardware, the same switches, the same network, the same load generation clients, the same testing tools, and the same number of volumes. However, the storage node hardware used in testing was designed for Datera software, not Ceph.

Perhaps the biggest difference here is in the choice of CPU. Datera is designed around the Intel Xeon E5-2630 v4 processor, a $661 part, and has optimized its performance around these relatively low-cost CPUs, allowing for significant savings in base system cost. For Ceph, publicly available reference material recommends a much larger CPU, often a part that costs over $2,000.

For our synthetic testing, we used the Flexible I/O Tester (FIO), reading and writing to the standard Linux block device interfaces. The tests focused on random workloads across a few different block sizes. We also kept the number of volumes, threads and I/O depth constant to see how each system would handle them. Our goal was to understand how the two systems perform under the same conditions, not to show the limits of either system or to produce the best possible results.

We used a 4K I/O size to represent small blocks and a 32K I/O size to represent larger blocks. Anything above a 32K I/O size moves from testing the system data path to testing the networking environment. We tested 100% reads, 100% writes, and a storage industry-standard 70/30 read/write mix. For each test, we used 35 volumes spread across three load generators. Both the Datera and Ceph systems were configured with the same set of hybrid and all-flash nodes.
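To make the workload definitions concrete, a single data point could be reproduced with an fio invocation along the following lines. This is a hedged sketch, not our exact command line: the device path, queue depth, and runtime are illustrative placeholders.

```bash
# Illustrative fio run: 4K random writes against one client-mapped volume,
# with results reported from the client side. /dev/sdX is a placeholder
# for the block device presented by the storage system.
fio --name=randwrite-4k \
    --filename=/dev/sdX \
    --rw=randwrite --bs=4k \
    --direct=1 --ioengine=libaio \
    --iodepth=16 --numjobs=1 \
    --time_based --runtime=300 \
    --group_reporting

# The other data points change only the pattern and block size, e.g.:
#   --rw=randread --bs=32k               # 100% reads, larger blocks
#   --rw=randrw --rwmixread=70 --bs=4k   # the 70/30 read/write mix
```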

All numbers presented are from the client side as reported by FIO. Client-side metrics capture not only what the storage system can do but also any performance effects of the environment and the clients themselves.

Ceph was deployed and configured using best practices from an existing production hybrid configuration.

For these tests, the Ceph read performance was about half that of Datera.

Both systems leverage the same NVMe devices in a “flash first” strategy and de-stage data to disk for longer-term storage.

As discussed in Part One, a fundamental difference between Datera and Ceph is that Datera uses a custom Block Store designed to provide high performance at low latency. Ceph was configured, following the production-environment best practices, to use XFS filesystems for the OSDs. The use of a general-purpose filesystem means the storage system has less control over the response time and overall performance of those devices.
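For context, a Jewel-era FileStore OSD on XFS is typically provisioned along these lines. This is a hedged sketch of the general pattern, not our exact deployment tooling, and the device names are placeholders.

```bash
# ceph.conf excerpt: FileStore OSDs formatted and mounted as XFS.
#   [osd]
#   osd mkfs type = xfs
#   osd mount options xfs = noatime,inode64

# Hybrid layout: XFS data partition on a hard disk, journal on an NVMe partition.
ceph-disk prepare --fs-type xfs /dev/sdb /dev/nvme0n1p1
ceph-disk activate /dev/sdb1
```

Every OSD read and write then passes through the XFS layer, which is where the general-purpose filesystem overhead described above comes in.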

For write workloads, there is no question that the Datera system is significantly faster.

In the testing, Datera beat Ceph by more than 10X in both latency and raw IOPS. It is important to note that both systems maintained three replicas for data protection, so every write from the client was placed synchronously onto three different nodes. With Datera, the number of replicas is configured on a per-volume basis, so each application’s storage can be tailored to the needs of that specific application. With Ceph, the replication factor is a property of the pool and is fixed for all volumes in that pool.
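As a point of reference, in Ceph the replica count is set on the pool, and every RBD volume carved from that pool inherits it. A hedged sketch follows; the pool name, placement-group count, and volume size are illustrative, not values from our deployment.

```bash
# Replication in Ceph is configured per pool, not per volume.
ceph osd pool create rbd-bench 1024 1024 replicated
ceph osd pool set rbd-bench size 3        # three synchronous replicas
ceph osd pool set rbd-bench min_size 2

# Every volume created in the pool shares that replication factor.
rbd create rbd-bench/vol01 --size 102400  # 100 GB (size in MB)
```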

The biggest reason for Datera’s significant write acceleration compared to Ceph is its use of Non-Volatile Dual In-line Memory Modules (NVDIMMs). NVDIMM provides DRAM-like performance with data persistence.

Memory persistence is critical in the event of a system failure, and it allows Datera to perform at the speed of a Tier 0 or Tier 1 enterprise array with enterprise-class reliability while using commodity hardware. Further, Datera uses NVDIMM directly in the data path, so applications achieve the best possible acceleration from the device. Ceph, by contrast, is designed to run on any hardware with any media, which means its design centers on the lowest common denominator rather than on maximum efficiency or performance.

For this project, we also explored the use of NVDIMM with Ceph (Jewel Release) but were unable to find a method by which Ceph could leverage it. Clearly, the NVDIMM technology coupled with the Datera Block Store provides an incredible increase in performance and allows Datera to write at more than 10X the performance of a comparable Ceph configuration.

Now, let’s look at the all-flash configurations.

With the workload served by all-flash storage servers, Ceph sees a healthy increase in performance but remains well below Datera.

In the all-flash deployment, Ceph was configured with the journal and the OSD sharing the same device.
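In practice that colocation looks like the provisioning step sketched earlier, except ceph-disk is handed only the flash device and carves both the data and journal partitions from it. Again, this is a hedged illustration with a placeholder device name.

```bash
# All-flash layout: data (XFS) and journal partitions on the same NVMe device.
ceph-disk prepare --fs-type xfs /dev/nvme0n1
ceph-disk activate /dev/nvme0n1p1
```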

With Datera, the all-flash node is treated as a single tier of storage and does not need any kind of caching method. As with hybrid, Datera continues to offer a significant increase in write performance compared to Ceph.

When we run an industry-standard 70/30 read/write workload, we can see that Datera is as much as 8X faster than Ceph. The primary reason for this is that Ceph is held back due to its write performance, which forces the benchmark to throttle read requests to maintain the 70/30 mix.

Summary

Ceph is a very popular storage solution for OpenStack deployments, given its free nature and deep integration. One of the major challenges is that Ceph has an enormous number of options and configuration choices. These can be very powerful for those who invest the time and energy to understand how they relate to each other and to tune the system for their workloads. In our testing, Ceph was tuned using the best practices from a production environment.

All Datera deployments receive the same built-in engineering to manage the hardware efficiently and react to different workloads accordingly. Datera expects users to set up the environment and simply start using it while the system adapts to the workloads. The tests clearly show that, given a consistent workload, Datera provides major performance and latency benefits. For example, if your workload is write-intensive, Ceph would require roughly 10X more hardware to deliver the same performance as Datera.

Can Ceph achieve better performance than what is presented here? Certainly – bigger CPUs and extensive tuning may provide dramatic benefits for specific workloads and applications.

The agility and flexibility of Datera allow it to handle diverse or changing workloads with very little administrative intervention or tuning.