MongoDB Counters Couchbase Performance Claims - InformationWeek

shane.k.j,
User Rank: Strategist
4/20/2015 | 9:22:46 AM
Re: Couchbase Response

I appreciate you taking the time to respond to our concerns. Couchbase Server continues to demonstrate great performance and scalability in clustered benchmarks. Now, we see MongoDB demonstrating solid performance in single-node benchmarks. It's not a surprise.

I think MongoDB is well-suited to single node deployments, but you can't estimate the performance of a cluster by benchmarking the performance of a single node. After all, it doesn't account for replication or the performance trade-off between availability and consistency caused by the master/slave architecture of MongoDB.

If you were having difficulties with the latest client library, we would have been happy to help. Thumbtack benchmarked Couchbase Server 2.5 before the 2.x libraries were available. They were released with Couchbase Server 3.0, the release you benchmarked. It's not appropriate to benchmark a current release with a library that was intended for the previous release. At the very least, you should have performed the benchmark with the current 1.x release (1.4.9) rather than an outdated one (1.1.8).

We'll have to agree to disagree on CAS.

However, durability is not defined by "writing data to disk". It's a property: data will not be lost if a node fails. A distributed database is durable when replication is synchronous. For example, persistence on AWS is not enough for instances with ephemeral SSDs. If the instance is stopped, the data is lost. However, if the data is replicated... it's not.
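
The distinction drawn above, durability through synchronous replication rather than through disk writes, can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `replicated_write`, and `read` are hypothetical names, the nodes hold data in memory only, and nothing here reflects how Couchbase Server or MongoDB actually implement replication.

```python
class Node:
    """Toy node that keeps data in memory only (no disk persistence)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}

    def store(self, key, value):
        if not self.alive:
            return False  # a failed node cannot acknowledge the write
        self.data[key] = value
        return True

    def fail(self):
        # Simulate losing an instance with ephemeral storage.
        self.alive = False
        self.data = {}


def replicated_write(nodes, key, value, required_acks):
    """Write to every node and report success only if enough nodes acknowledged."""
    acks = sum(1 for node in nodes if node.store(key, value))
    return acks >= required_acks


def read(nodes, key):
    """Read from the first live node that still holds the key."""
    for node in nodes:
        if node.alive and key in node.data:
            return node.data[key]
    return None


nodes = [Node("a"), Node("b"), Node("c")]
acknowledged = replicated_write(nodes, "user:1", "v1", required_acks=2)
nodes[0].fail()                   # one node is lost after the write
survived = read(nodes, "user:1")  # still readable from a replica
```

In this model the write survives the loss of one node even though no node wrote to disk. It is lost only if every node holding a copy fails, which is the scenario the disk-based definition of durability is meant to cover.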

usain,
User Rank: Apprentice
4/16/2015 | 5:19:18 AM
Re: Couchbase Response
Thanks for sharing your thoughts, Shane. We at USA have some differing opinions and outlined our responses below to your comments:

1) Well, they chose to benchmark single node deployments.
RESPONSE: Single server test results are absolutely relevant as they represent the building blocks of any system. As noted by the author of YCSB, it is important to first test performance based on a single node, and then to test scalability. The better the performance of a single node, the fewer nodes will be required to meet the demands of a specific application. YCSB was designed by the team at Yahoo! to test vertical as well as horizontal scaling. We plan to test horizontal scaling in a future report.

2) They chose to have Couchbase Server perform two operations per write (read+update) instead of one, but not MongoDB.
RESPONSE: As indicated in your blog post and comments, Couchbase requires two operations to perform an update, whereas MongoDB requires one. In Couchbase, as you know, updates must be performed by the client, and the client must also manage conflict resolution. This adds to the work and ongoing maintenance for application development teams using Couchbase, and it impacts performance and scalability of the system. We implemented the client correctly, following Couchbase's documentation. We catch and retry any CASMismatchException errors that the server throws when an update would overwrite another update to the same document since the document was initially read. As noted in your blog comments, Couchbase plans to add an equivalent feature in the next release. We will include this feature in future tests when it becomes available.

3) They used a two year old client library for Couchbase Server, but not MongoDB. It waits at least 100ms before checking if writes are durable. The latest, as little as 10μs.
RESPONSE: We used the same client library as Couchbase's benchmark conducted less than a year ago by Thumbtack (we did update it to the latest patch release). We tried to use 2.1.1 but it performed worse and was not able to complete the full test without Timeout exceptions. Since then, someone on the Couchbase forums reported the issue, possibly the same one we encountered.

4) While single node deployments require sync disk writes for durability, distributed databases do not. They can leverage sync replication to perform writes on multiple nodes.
RESPONSE: Durability is defined in terms of writing data to disk, not memory. This is true whether using a single server or multiple servers. If your servers lose power or crash, data in RAM that has not been persisted to disk will be lost. However, if deployed correctly across racks and data centers, potential data loss can be minimized by replicating to multiple servers. All three products provide similar capabilities in this area, and we plan to evaluate this in our next round of tests.
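
For readers unfamiliar with the read-then-update pattern discussed under point 2, a minimal sketch follows. It uses an in-memory stand-in rather than any real client library: `ToyStore`, `CasMismatch`, and `update_with_retry` are hypothetical names chosen for illustration, and the version counter only mimics the idea of a CAS value.

```python
class CasMismatch(Exception):
    """Raised when the document changed since it was read."""


class ToyStore:
    """In-memory stand-in for a key-value store with compare-and-swap."""

    def __init__(self):
        self._docs = {}  # key -> (cas, value)

    def get(self, key):
        return self._docs.get(key, (0, None))

    def replace(self, key, value, cas):
        current_cas, _ = self._docs.get(key, (0, None))
        if current_cas != cas:
            raise CasMismatch(key)
        self._docs[key] = (current_cas + 1, value)


def update_with_retry(store, key, mutate, max_attempts=5):
    """Read, modify, and write back; retry when another writer got in first."""
    for attempt in range(1, max_attempts + 1):
        cas, value = store.get(key)                 # operation 1: read
        try:
            store.replace(key, mutate(value), cas)  # operation 2: conditional write
            return attempt
        except CasMismatch:
            continue
    raise RuntimeError(f"gave up on {key!r} after {max_attempts} attempts")


store = ToyStore()
store.replace("counter", 0, cas=0)  # create the document

calls = {"n": 0}


def increment_with_interference(value):
    calls["n"] += 1
    if calls["n"] == 1:
        # Simulate a concurrent writer sneaking in between read and write.
        cas, current = store.get("counter")
        store.replace("counter", current + 10, cas)
    return value + 1


attempts = update_with_retry(store, "counter", increment_with_interference)
```

Here the first attempt is rejected because the simulated concurrent writer changed the document, and the second attempt succeeds against the fresh value. Each attempt costs one read plus one write, which is the two-operations-per-update cost under debate.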

We attempted to provide links to evidence but this site does not support URLs in comments.

United Software Associates
usain,
User Rank: Apprentice
4/16/2015 | 3:14:47 AM
Re: Couchbase Response
Thanks for posting, Robin. Here are some of my thoughts based on your response.
We consulted Jonathan Ellis's blog post that you mention to ensure we were following best practices for benchmarking Cassandra. We did use the same version of YCSB across all tests, and we incorporated the YCSB client code for each vendor, including the fork that Jonathan Ellis referenced in his post. If you feel there are some configurations that did not follow your best practices, we would love to hear from you.

All three products provide capabilities in terms of the ability to scale out across many servers and data centers. We plan to test these configurations in our next round of tests.

United Software Associates
RSCHUMACHER400,
User Rank: Apprentice
4/3/2015 | 2:43:00 PM
Re: Couchbase Response
As Jonathan Ellis wrote in "How not to benchmark Cassandra," there are a few inflection points that can tell you a lot about how a system will scale. One is when the dataset no longer fits in memory. Another is when the indexes no longer fit in memory. Smaller datasets may not be representative of what you will see as you push past those thresholds.

Here we have a tiny 20 million row dataset, where both data and indexes trivially fit in memory. This is the best possible scenario for MongoDB, since the wiredTiger engine slows dramatically for larger datasets. Cassandra's log-structured engine continues to deliver consistent performance for larger-than-memory workloads.
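
A rough way to see where those two inflection points fall is a back-of-the-envelope size estimate. The sketch below is illustrative only: the 20 million row count comes from the comment above, while the record size, index size per row, and RAM figure are hypothetical values and are not taken from the benchmark report.

```python
def fits_in_memory(rows, bytes_per_row, index_bytes_per_row, ram_bytes):
    """Return (data_fits, indexes_fit) for a simple size estimate."""
    data = rows * bytes_per_row
    indexes = rows * index_bytes_per_row
    return data <= ram_bytes, indexes <= ram_bytes


GIB = 1024 ** 3

# 20 million rows is from the comment above; every other number is hypothetical.
small = fits_in_memory(20_000_000, 1_000, 50, 64 * GIB)
large = fits_in_memory(400_000_000, 1_000, 50, 64 * GIB)
```

With these assumed numbers, the small dataset sits below both thresholds, while the larger one has already passed the first (data no longer fits) but not the second (indexes still fit).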

Jonathan also explained that "it's important to take care that the same thing is being measured across the board."  Here, United Software Associates isn't even using the same version of the test suite across the different databases.  This can easily make a meaningful difference in the observed results.

Finally, it's telling that MongoDB isn't trying to compete in a clustered scenario. If your dataset fits in memory on a single machine, then as this benchmark implies it may well be that MongoDB is your best choice.  But for modern applications requiring performance and availability at scale across multiple machines and datacenters, look to Cassandra.
shane.k.j,
User Rank: Strategist
4/1/2015 | 10:50:32 AM
Couchbase Response
MongoDB performs well when it 1) is limited to a single node, 2) doesn't store a lot of data, and 3) doesn't support a lot of users. This is a sweet spot for MongoDB. However, a single node deployment can't meet the rigorous demands of production deployments. Couchbase Server, on the other hand, shines when deployed as a distributed database.

Benchmark Issues

1) Well, they chose to benchmark single node deployments.

2) They chose to have Couchbase Server perform two operations per write (read+update) instead of one, but not MongoDB.

3) They used a two year old client library for Couchbase Server, but not MongoDB. It waits at least 100ms before checking if writes are durable. The latest, as little as 10µs.

4) While single node deployments require sync disk writes for durability, distributed databases do not. They can leverage sync replication to perform writes on multiple nodes.
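
To show why the polling delay in point 3 matters for measured latency, here is a small arithmetic sketch. It assumes a simplified model in which the client waits a fixed time before its first durability check and then re-checks at the same interval; `observed_latency_us` is a hypothetical helper, and the 500 µs durability time is an invented example value, not a measurement.

```python
def observed_latency_us(durable_at_us, first_check_us, interval_us):
    """Time at which a polling client first notices that a write is durable.

    All arguments are microseconds since the write was issued. The client
    checks at first_check_us, then every interval_us after that.
    """
    check_at = first_check_us
    while check_at < durable_at_us:
        check_at += interval_us
    return check_at


# Hypothetical write that becomes durable 500 µs after it is issued.
slow_client = observed_latency_us(500, first_check_us=100_000, interval_us=100_000)
fast_client = observed_latency_us(500, first_check_us=10, interval_us=10)
```

Under this model the client that waits 100 ms reports 100,000 µs for a write that was durable after 500 µs, while the client polling every 10 µs reports 500 µs. The server did the same work in both cases; only the client's polling changed.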

blog.couchbase.com/mongodb-rules-single-node-deployments

Shane K Johnson
Product Marketing
Couchbase

