Cutting through the FUD on MongoDB

Introductory aside... There are a lot of tech and non-tech folks who are not up on the whys and wherefores of NoSQL databases. (Even a former DBA was very confused recently.) If you are unsure about what MongoDB offers with its document-based NoSQL engine, or, relatedly, Elastic with its search-based NoSQL engine, I highly recommend this writeup (coincidentally from Amazon!) explaining the various flavors of NoSQL database, the pros/cons each has against SQL relational databases, and the preferred use cases for each.

Relational databases have historically been great for transactional data. [A transaction is a set of data changes that must ALL complete as a whole, or else be rolled back entirely. A simple transaction: you moving money from your checking to savings. You wouldn't be terribly happy if $1000 left your checking and never appeared in your savings, and the bank wouldn't be happy with the inverse.] Document-based databases are great for data collections, like diverse datasets needing categorization or a hierarchy (music or movie listings, articles, news, blogs) or e-commerce (product catalogs, shopping carts). Elastic is great for search (think of the db as one big index), so it is well suited to time-series analysis (metrics, IoT, logging) and full-text search (Twitter search).
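To make the checking-to-savings example concrete, here is a minimal sketch of an all-or-nothing transfer using Python's built-in sqlite3 module. The `accounts` table, the account names, and the `transfer` helper are all made up for illustration; a real bank would obviously do far more than this.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort mid-transaction: the debit above gets undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {account name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical starting data in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("checking", 1500), ("savings", 200)],
)
conn.commit()
```

Calling `transfer(conn, "checking", "savings", 1000)` moves the money; asking for more than the checking balance fails and leaves both accounts exactly as they were, which is the whole point of a transaction.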

...Okay, back to the FUD...


MDB investors are freaking out about Amazon's new cloud-native service that competes with Atlas. However, didn't we know something was coming down the pipe? MDB signaled this heavily with the licensing change, as they made clear that they don't want cloud providers offering managed hosting of MongoDB -- managed hosting is their own new angle, alongside enterprise support for their open source software. This forces cloud providers into creating their own home-grown solution instead of using MongoDB itself.

One confusion that I have seen repeated endlessly in articles on the topic: Amazon is NOT using MongoDB's code base. They built a new document-based NoSQL product that MIMICS MongoDB, in particular the v3.6 API standard. [The API is the way the developer interfaces with MongoDB for inserts, updates, deletes and querying.] I've seen hints that DocumentDB is backed by their relational database Aurora, so it may not have been built entirely out of thin air. The net effect of mimicking the API is that DocumentDB is compatible with your application's code base, so you can more easily swap out the data store. In addition, AWS built a migration tool that can clone your existing MongoDB v3.6 (or earlier) database into DocumentDB. These features combine so that new users with existing applications can get started immediately on the new product.
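Why does API compatibility matter so much? A toy sketch, with loud caveats: the two classes below are hypothetical stand-ins I made up, not real MongoDB or DocumentDB drivers. They only borrow two method names (`insert_one`, `find_one`) from the MongoDB driver API to show that application code written against an API does not care what engine sits behind it.

```python
import json

class ListBackedStore:
    """Hypothetical stand-in for one vendor's engine: keeps dicts in a list."""
    def __init__(self):
        self._docs = []

    def insert_one(self, doc):
        self._docs.append(dict(doc))

    def find_one(self, query):
        # Return the first document whose fields match every key in the query.
        for doc in self._docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return dict(doc)
        return None

class JsonBackedStore:
    """Hypothetical second engine: different internals, same method names."""
    def __init__(self):
        self._rows = []

    def insert_one(self, doc):
        self._rows.append(json.dumps(doc))  # stored as text, not dicts

    def find_one(self, query):
        for row in self._rows:
            doc = json.loads(row)
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None

def save_and_load_song(collection):
    """Application code: knows the API, not the engine behind it."""
    collection.insert_one({"title": "So What", "artist": "Miles Davis"})
    return collection.find_one({"artist": "Miles Davis"})
```

`save_and_load_song` returns the same document from either store. That is the pitch in miniature: if the API looks the same, swapping the data store does not require rewriting the application.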

So, a point we can all agree on is that this product and its "MongoDB" features are obviously meant to lure new and existing MongoDB users over to DocumentDB, whether they are MDB customers already or not.

Not mentioned enough is that AWS's new product DocumentDB directly competes with their prior product DynamoDB. [Later I get to Lyft, who, to be clear, is NOT on this new AWS product.]

So -- AWS is bringing some competition. HOWEVER... let's take a brief walk through history.

MongoDB has been around since 2009.

Amazon, Google and Microsoft were already competing with MongoDB via their own cloud-native NoSQL services. Amazon has DynamoDB (since 2012). Microsoft has Cosmos DB (in development since 2010), with a MongoDB-compatible API. Google has Cloud Firestore, part of Firebase, which it acquired in 2014. Not to mention other open source NoSQL stores like Cassandra (since 2008, strictly a wide-column store rather than a document store) and Couchbase (since 2010).

REGARDLESS of these competing products, MongoDB is the clear king of document databases.

REGARDLESS of these competing products that are more "native" to their respective clouds, MongoDB has thrived as a cloud-neutral provider.

REGARDLESS of many cloud services already hosting many many MongoDB containers (self managed by users themselves) and others using their "cloud-native" products, MongoDB has thrived.

Why? In order of importance, IMHO:

MongoDB has been the #1 product for document-based NoSQL for a decade. There is already a lot of institutional knowledge in most enterprises, and a ton of resources built around it. Developers have either already used it or can easily pick it up and learn it from a multitude of online sources with step-by-step instructions. It natively represents data as JSON-style documents (stored internally in a binary form called BSON) for maximum compatibility. [JSON, a lightweight format for data transmission, is the most-used method today for API communication. Much simpler and more readable than (the thankfully nearly deceased) XML.]

They provide a core, widely-used database that isn't cloud-specific. MongoDB can be deployed anywhere: within a customer's data center OR in the cloud OR now directly on mobile devices (new feature). No cloud-native service can be hosted anywhere outside its own cloud. Any applications built using those services must be internet-connected at all times in order to use them.

They provide enterprise support. MongoDB [the company] sprang up to support MongoDB [the open source software]. Enterprises want to use open source solutions in their software stack as they are battle tested and not proprietary (so not locked into a vendor), but the downside is that they don't want to wait days for an answer from communities like Stack Overflow or Google Groups to solve their immediate needs. Companies sprang up around this need as a common business model. (Also see: Elastic, Cloudera, Hortonworks, Datastax, MySQL [long ago acquired by Sun, then Oracle].)

They provide a managed service (Atlas!) so you don't have to administer and deploy your database architecture yourself. These tasks become even more complicated with sharded/replicated databases in today's global economy, combined with the necessary heavy focus on security. It is simply easier to pay someone to manage your cluster instead of you needing to add multiple personnel and domain experts to manage it for you (DBA team to do installation, setup, networking, security, sharding, replication, backup plan, etc...).

They are cloud-vendor neutral. This has meaning to risk-averse enterprises, but I have the feeling it is the least important reason for someone to pick MongoDB Atlas over "cloud native" solutions. Many enterprises are already "locked in" to a cloud vendor of choice.
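Going back to the first reason above (data represented as JSON), here is a minimal sketch of why that format fits catalog-style data so well. The product, its field names, and its values are invented for illustration; nothing here is MongoDB-specific, it is just Python's built-in json module.

```python
import json

# A hypothetical product-catalog entry. The nested variants live inside
# the document itself instead of being split across several joined tables.
product = {
    "sku": "TSHIRT-001",
    "name": "Logo T-Shirt",
    "tags": ["apparel", "cotton"],
    "variants": [
        {"size": "M", "color": "blue", "stock": 12},
        {"size": "L", "color": "blue", "stock": 0},
    ],
}

# Serialize to JSON text (what would travel over an API) and read it back.
as_json = json.dumps(product, indent=2)
restored = json.loads(as_json)

# The application works with the same shape it stored and sent.
in_stock = [v["size"] for v in restored["variants"] if v["stock"] > 0]
```

The document survives the round trip unchanged, and the nested list of variants can be filtered directly. That match between how the data is stored, how it is sent, and how the code uses it is a big part of why developers find document databases easy to pick up.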

Some final notes on recent news:

The Lyft news is entirely a big nothing burger. Lyft switched entirely back to AWS DynamoDB. [NOTE: the older DynamoDB, NOT the brand new DocumentDB]. It was an analyst who tied this news to MongoDB as a big loss, calling Lyft a "marquee customer". However, we haven't seen any such indication. Lyft was already locked in with AWS (and had been using DynamoDB for years already). I am guessing they are taking their [FREE open-source] MongoDB instances that were running on EC2 VMs and converting them to DynamoDB. They are only a MongoDB "customer" through their prior use of mLab. MDB has since come out saying that Lyft wasn't an Atlas customer, and was converting its own self-managed instances that were running a legacy version of MongoDB. Bottom line: This news is just a sales loss of what could have been a future Atlas customer.

Red Hat is the only bit of Mongo news this month that has interested me, but it ultimately seems like another nothing burger. MongoDB changed their license for a good business reason [I can only imagine Elastic following suit]. However, Red Hat's decision simply means MongoDB is no longer packaged in their operating system releases by default. So now the user has to ... wait for it... INSTALL IT. This has zero impact on MDB other than the optics within the open-source community. Red Hat fancies themselves OPEN SOURCE STEWARDS and doesn't like the licensing changes. This is a bit like what happened when Oracle bought Java.

In the end, there is a lot of huffing and puffing, but I, as a software developer and investor, see very little reason to panic. I'd rather watch for a change in the stellar growth numbers than freak out when Amazon is mentioned as a competitor (even though.. uh.. they've been a competitor with DynamoDB already...).

Apply all of the above to Elastic as well, as they are an extremely similar company with extremely similar business lines (enterprise support of their open-source software, and management of cloud instances). Their software is even more complex in setup and ongoing management. Less TAM than MDB ultimately, but there is lots of excitement as they continue to find new use cases from their customers, and APM and network security monitoring are huge waves to ride right now.

-muji