It's time for another technical deep dive... let's switch to MongoDB. A database engine - a subject near and dear to my heart! If you read my ElasticON writeup, you know software development is my job, and I enjoy it immensely. I personally work more with Elasticsearch than MongoDB, but I am starting to work with MongoDB more, as it solves some use cases I have. For the sake of this post, I'll call the company MDB and the product MongoDB.
MDB just had an acquisition, their third. Their prior one, mLab in Oct 2018, cost $68M, and allowed them to convert customers from an Atlas competitor (managed cloud-hosted MongoDB instances) into Atlas customers. While that last acquisition was a customer and team acquire, they just bought Realm for $39M to acquire product lines that help them jump start their new mobile initiatives.
I want to riff a bit on MDB's moves into mobile.
Overview
What has allowed companies like MDB and Elastic (and non-database dev tools like Twilio) to thrive is how embedded they become as tools a software company uses to solve problems. As a developer has a need for data storage in their application, they pick and choose amongst the available SQL and NoSQL databases and select one that serves their use case best -- and it better be one they can learn to use fast and get productive with as quickly as possible. Once that database gets embedded into use within their application and it goes into production, the software company heavily relies on the stability of it. From there, it needs to be able to scale -- because as a software company gets more and more successful, their usage of (and dependence on) those databases goes up immensely. And as those companies have success with that new tool, they use it again in other projects or products. Once the tool is embedded in the software architecture, it is usually not going to be removed unless it starts being unreliable, or is not scaling up enough to keep up with the company's needs.
MongoDB is a document database, which allows you to store and query a collection of data objects. Being a NoSQL store, it excels at being flexible (data properties can differ from object to object) and can easily scale. MongoDB couples easily with modern web & mobile application code, as each piece of data sent or queried is basically a ready-to-use object, whereas in SQL, you would use a data translation layer in between (such as an ORM) to help translate SQL rows to a usable object, and vice versa. In NoSQL, instead of relying on a "common query language" like SQL, a developer accesses the data via simple API methods for querying, inserting, updating and deleting. Document databases have a huge number of use-cases, as they can be used anywhere people have a collection of objects... such as movies in a streaming service, properties on a rental site, books in a library, user reviews or comments, store inventory, etc.
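To make the "ready-to-use object" idea concrete, here is a minimal sketch in plain Python. It is not MongoDB's driver API: the `Collection` class and its `insert`/`find` methods are hypothetical stand-ins I made up for illustration. The only point is that documents with different properties can sit in the same collection and come back as objects the app can use right away.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


class Collection:
    """Hypothetical in-memory stand-in for a document collection."""

    def __init__(self) -> None:
        self._docs: List[Document] = []

    def insert(self, doc: Document) -> None:
        # No schema: each document may carry its own set of properties.
        self._docs.append(dict(doc))

    def find(self, query: Document) -> List[Document]:
        # Query by example: keep documents that match every field in the query.
        return [
            d for d in self._docs
            if all(k in d and d[k] == v for k, v in query.items())
        ]


movies = Collection()
movies.insert({"title": "Alien", "year": 1979, "genres": ["horror", "sci-fi"]})
movies.insert({"title": "Heat", "year": 1995, "runtime_min": 170})  # different properties

results = movies.find({"year": 1995})
print(results[0]["title"])
```

Note there is no row-to-object translation step anywhere; what goes in is what comes out.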
Mobile Databases
The primary methodology used to access SQL and NoSQL databases is the client-server model, where the web or mobile app is a client that makes requests to a database that is hosted on a remote server (either on-prem or cloud-hosted).
But this doesn't solve every need. Yes, the world is increasingly more connected and cloud-driven, but there are still bandwidth concerns and connectivity issues. Applications are, of course, going to be MUCH faster and MUCH more performant if a subset (or all) of the data an application needs resides directly on the device. Users could perform tasks while off-line, that could be synced back to the cloud when they next come back online.
A different type of access method being used in mobile apps is having a synchronized copy of the database as a mobile database, in order to store the data locally on the phone or device itself, instead of being hosted elsewhere. A mobile database can basically be any of the SQL or NoSQL types, but in order to be usable and maintainable, it has to be light-weight so it doesn't overpower the device, as devices tend to have less compute and memory, and have battery considerations.
In addition, the database must be geared for synchronizing data between a centralized master server and the local database on each mobile device. In a mobile database, synchronization ultimately needs to support updates in BOTH directions:
- Master to Mobile = syncing updates from the master database to the embedded database, updating the local data on the device with any inserts or changes they should get a copy of.
- Mobile to Master = syncing updates from the local mobile database back to the master database.
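The two directions above can be sketched in a few lines of Python. This is a toy under loud assumptions: the `sync` function, the integer "last modified" clock, and the last-write-wins conflict policy are all my own simplifications, not how any particular product does it. Real sync engines also have to deal with deletes, clock skew, and smarter conflict resolution.

```python
from typing import Dict, Tuple

# Each store maps a record id to (value, last_modified).
# The timestamp is a simple integer clock, purely for illustration.
Store = Dict[str, Tuple[str, int]]


def sync(master: Store, mobile: Store) -> None:
    """Two-way sync using last-write-wins (an assumed, simplistic policy)."""
    for key in set(master) | set(mobile):
        m = master.get(key)
        d = mobile.get(key)
        if d is None or (m is not None and m[1] > d[1]):
            mobile[key] = m  # Master to Mobile: device is missing it or is stale
        elif m is None or d[1] > m[1]:
            master[key] = d  # Mobile to Master: device has a new or newer record


master = {"cust-1": ("Ada", 1), "cust-2": ("Bob", 5)}
mobile = {"cust-1": ("Ada L.", 3), "cust-3": ("Cy", 2)}  # edited and added while off-line

sync(master, mobile)
print(master == mobile)
```

After the call, both sides hold the newer edit to `cust-1`, the device has picked up `cust-2`, and the master has picked up `cust-3`.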
SQLite is the long-time standard for on-device databases. It's an open-source, light-weight relational "database in a file" that has been around forever, having been long used in embedded devices. It has the same SQL interface as server-hosted databases. It's slow, but is easy enough to use for on-device database needs. One huge downside for modern architectures, however -- it doesn't have any out-of-the-box synchronization capabilities. Custom add-on solutions have cropped up, like AMPLI-SYNC (sync from SQLite to/from any major SQL database) or LiteSync (sync between SQLite dbs). And proprietary competitors have long been around, like InterBase, and a bunch of long-in-the-tooth bloated crap from Oracle, Microsoft and IBM that no one is using in their modern architectures. Despite all this, SQLite reigns supreme.
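If you have never touched it, "database in a file" is meant literally. Here is a small sketch using Python's built-in `sqlite3` module; the `books` table and the temp-directory path are just illustrative choices of mine.

```python
import os
import sqlite3
import tempfile

# The whole database is one ordinary file on local storage.
db_path = os.path.join(tempfile.mkdtemp(), "app.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.execute(
    "INSERT INTO books (title, author) VALUES (?, ?)",
    ("Dune", "Frank Herbert"),
)
conn.commit()
conn.close()

# Reopen the same file later (say, on the next app launch) and query with plain SQL.
conn = sqlite3.connect(db_path)
rows = conn.execute("SELECT title, author FROM books").fetchall()
conn.close()
print(rows)
```

No server process, no network hop. And, as noted, nothing in there knows how to sync that file with anything else.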
So the SQL side is covered, though mostly from a long-used light-weight database that has been piece-mealed to work with mobile. But what about NoSQL mobile solutions? The only major player that has focused on mobile thus far has been Couchbase, who is one of the few document-database competitors to MongoDB that isn't a cloud provider. Beyond them, a few NoSQL competitors have started to spring up, like the open-source databases Realm and UnQLite.
Speaking of cloud providers, they want you to AVOID using an on-device database, and instead have everything talking to a hosted solution... because having a local database on the device is counter to their existence of providing cloud compute and storage resources. So they are not (yet) much of a factor on this front, though AWS AppSync aims to provide some of the sync features.
Recently, MDB has been moving to fill in the gaps that Couchbase is covering, and now has their own mobile product. But first, let's talk about the many steps MDB has taken towards the Cloud and Mobile...
Cloud Atlas
In 2016, MDB got the bright idea that customers don't want to do the boring stuff in setting up their own database servers -- they just want the features and they want it immediately. No more wasting time with patching, maintaining security and backups, and monitoring and auditing. The amount of time and IT resources this saves companies has to be massive.
So users can now go directly to the source, and have the database provider be the one supporting their database instances. Who knows the database engine better than the ones who wrote it? Atlas supports all the major cloud providers, and, like with web hosting, a customer can get their database hosted on a shared cluster or a private dedicated one, depending on their budget and needs.
We all know about the licensing changes that subsequently occurred. Needless to say, MDB woke up to the fact that AWS and Google were the real competition (to Atlas, and, as we'll see, to Stitch), not open-source alternatives like Couchbase and Cassandra.
All Eyes on the Cloud
MongoDB v4 was released less than a year ago in June 2018, during their MongoDB World 2018 conference. This was their first major release under the new licensing. It shows they are making major moves towards Cloud as the future platform.
Beside the much ballyhooed multi-document ACID transaction capabilities (...as well as the new SQL connectors, the new Charts viz tool, and the new pipeline builder...), they made HUGE progress in making MongoDB more applicable for cloud hosted and mobile solutions.
v4 included:
- MongoDB Stitch serverless platform released.
- MongoDB Mobile released in beta for iOS/Android devices.
First let's talk Stitch, as it's the important linchpin that makes a Mobile version even viable. MongoDB Stitch is a serverless platform for interacting with the database. At its core, it allows direct querying from front-end code (web or mobile app), skipping the requirement of needing a back-end API, plus provides the ability to run code inline, within the database.
Database developers typically control access to the database from a server-side API or service. The traditional software stack is a web or mobile front-end application that talks to an API. That API in turn controls all access to the back-end database engines (which could be Mongo, a SQL DB, Elasticsearch, Redis, etc) and file systems... as well as any corresponding security, logging, monitoring & audit concerns around that.
Serverless platforms allow the front-end application (say, a Javascript web app, or an iOS mobile app) direct integration with the database, so no intermediary API is needed. This has the potential to greatly simplify the amount of infrastructure a SaaS or software company would normally need, as it could completely remove a layer of the architecture stack.
Your app still talks to a database server hosted SOMEWHERE (whether on-prem, or, most likely, the self-managed or managed cloud). But now the app does not need an intermediary API or backend service to access that database. And if you don't need an intermediary server, why even host the database? If your database is also hosted in the cloud, it makes it so the company doesn't need ANY servers -- it can be a mobile or web app, hosted in the cloud, that just talks directly to the database in the cloud. Truly serverless.
The end result of using a serverless platform like Stitch is a much simpler stack. Benefit 1: The front-end application can talk directly to the database. Benefit 2: Compared to the typical stack above, you may be able to completely eliminate the need for an API layer, especially by using the other features of Stitch (functions and triggers). Benefit 3: Once you no longer need a server to host the API, why not go all the way and stop self-managing and self-hosting your database instance? Use Stitch to talk directly to the MongoDB Atlas service.
Stitching Together a Platform
But beyond simplifying the development stack, Stitch also greatly extends the programmability of the MongoDB database, by allowing you to code scripts that can run in any copy of your database - even on Atlas.
- Stitch Functions = allows you to embed custom code within the database, which include calling external APIs like Twilio and Slack.
- Stitch Triggers = allows you to have data changes trigger events, such as running a function to send a Twilio text or email notice when a new customer record is added.
These combine into a potent new interface into the database, and one that greatly simplifies how the data is accessed - which in turn can all combine to completely eliminate the need for an API layer. You no longer need to maintain any of your own infrastructure under this new paradigm. The database itself can be triggering back-end processes, like notifications and emails.
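The function-plus-trigger pattern is easy to picture with a toy. To be clear, this is not Stitch's API: `TriggeredCollection`, `on_insert`, and `notify_new_customer` are hypothetical names of mine, and the `sent` list stands in for a call out to something like Twilio. It just shows a data change firing a piece of custom code.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


class TriggeredCollection:
    """Hypothetical collection that runs registered functions on every insert."""

    def __init__(self) -> None:
        self.docs: List[Document] = []
        self._on_insert: List[Callable[[Document], None]] = []

    def on_insert(self, fn: Callable[[Document], None]) -> None:
        # Register a function to run whenever a document is added.
        self._on_insert.append(fn)

    def insert(self, doc: Document) -> None:
        self.docs.append(doc)
        for fn in self._on_insert:
            fn(doc)  # the "trigger" fires the "function"


sent: List[str] = []  # stands in for an outbound SMS/email service


def notify_new_customer(doc: Document) -> None:
    sent.append(f"Welcome, {doc['name']}!")


customers = TriggeredCollection()
customers.on_insert(notify_new_customer)
customers.insert({"name": "Ada", "plan": "pro"})
print(sent)
```

The app code only ever does the insert; the notification logic lives next to the data rather than in a separate API service.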
Serverless platforms like this provide a huge amount of developer lock-in. There aren't standards across platforms, so you don't go switching from one to another without a lot of pain. Do you think Stitch allows you to talk to AWS DocumentDB? Nope. It completely perpetuates lock-in to MDB - either to cloud-hosted Atlas (managed instance) or a self-hosted MongoDB v4 (self-managed instance).
However, cloud providers aren't standing still. They have AWS Amplify and Google Firebase platforms, that are similar serverless platforms that tie to their own respective cloud services. (And again, they aren't interested in providing an embedded on-device database - they want you to consume cloud services.)
These serverless ecosystems are the platforms that the next generation of SaaS tooling is going to be built upon. MDB had some catching up to do with the big cloud providers, but with Stitch and Atlas, MDB is skating to where the puck is going to be. This sets them up to directly compete against AWS and Google and Microsoft serverless platforms. How can MDB differentiate? What they do best -- helping you solve all your data needs, while being flexible and scalable. Part of that is solving niches that cloud providers can't or won't solve.
Stitch is the Thread Binding Cloud to Mobile
MongoDB self-managed and MongoDB Atlas are great HOSTED database options, ones that must run on a server somewhere. Mobile is altogether different. A mobile database is one that lives on the device itself. It allows for a local copy of a dataset which, as we discussed above, needs synchronization with a master database.
MongoDB v4 included MongoDB Mobile, which aimed at filling this niche and finally catching up to Couchbase's mobile offerings. It went beta when first released last summer, then went GA back in November. It allows mobile devices to install and use a local instance of MongoDB - the same database a company may already use on the server side. It works through MongoDB Stitch, which can serve as the interface for storing data either to a hosted database (Atlas or self-hosted) or to the on-device database. As of the November GA release, it also has a beta feature called Mobile Sync, which allows for synchronizing changes between a back-end MongoDB and the MongoDB Mobile database.
I found an interesting tidbit on MDB's Mobile features page: "MongoDB Mobile uses SQLite as a simple key-value store behind the scenes due to its stability and prevalence on devices."
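What might "SQLite as a simple key-value store" look like? I don't know how MongoDB Mobile actually lays out its data, so treat this purely as a guess at the general idea: a single two-column table, with each document serialized (here as JSON) into the value column. The `kv` table and the `put`/`get` helpers are names I made up.

```python
import json
import sqlite3
from typing import Any, Dict, Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")


def put(doc_id: str, doc: Dict[str, Any]) -> None:
    # Store the whole document as one opaque value under its id.
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
        (doc_id, json.dumps(doc)),
    )


def get(doc_id: str) -> Optional[Dict[str, Any]]:
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (doc_id,)).fetchone()
    return json.loads(row[0]) if row else None


put("review-1", {"stars": 5, "text": "Great", "tags": ["fast"]})
put("review-1", {"stars": 4, "text": "Good"})  # replaces the earlier version
doc = get("review-1")
print(doc)
```

In a layout like this SQLite never sees the document's fields, so none of its relational features are in play; it is only acting as durable, battle-tested storage underneath a document API.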
MDB has clearly been betting big on their new Stitch and mobile capabilities over the past year, but the mobile side still seems pretty rudimentary (mobile db just went GA, and sync still in beta). They need to inject some maturity in their mobile products, so... this new acquisition really comes as no surprise. You'll never guess what Realm does!
There Can Be Only One
Realm is a leader in NoSQL mobile databases, with 2 main products: a platform for syncing data, and a mobile database that they bill as an open-source NoSQL version of SQLite. It is a NoSQL document store that is very similar to MongoDB, not to mention also being open-source and free. They label their focus as "off-line first", which means they are ideal for having the local on-device database be the main database for the mobile app, and necessary updates are sent from and to the master database once Internet connectivity is next re-established.
In addition, they have a managed cloud-hosted version called Realm Cloud that has been available since Jan 2018. Realm marketing says you can now use Realm as a “RESTless” middleware layer, aka a serverless platform. So this company is a mini MongoDB for mobile -- they have their own MongoDB Mobile & Stitch and can manage and host it for you like Atlas.
Realm has 2B+ installed on-device databases, from 100k active developers across 350 customers. Published customers include Amazon, Google, Netflix, Starbucks, Ebay. Not too shabby.
I think MDB was so impressed by their sync features that they bought themselves bolt-on capabilities to integrate into their own platform. And yes, it's for the team too, as they are the dev team best suited for integrating Realm's db & sync features directly into MongoDB Mobile and Stitch platforms.
The PR makes it sound like Realm will continue to exist, but I doubt their platform will continue to be a thing.... The clear path now for MDB is to use Realm Database on the device (getting rid of SQLite), which much more closely matches MongoDB's format. Then, utilize their sync platform to drive how it syncs up with the master MongoDB. MDB is sure to integrate it directly into MongoDB Mobile & the Stitch platform in order to sync to MongoDB self-hosted and Atlas databases. The final step would then be to migrate any Realm customers to MongoDB Mobile.
Bottom line - this was an excellent acquire. Stitch is cementing the cloud-friendly future for MDB and was THE big news in the v4 release. Mobile is a major part of it, and is something that cloud providers can't provide. They took out a successful competitor in mobile and are using that to overlay their existing product lines.
Mobile databases are starting to enable a new type of mobile app. Apps can be developed that are able to download a snapshot of the user's data to the device, so the user can always carry it around and access it, regardless of being on wifi/LTE or off-line. This would make things FAST as well, as you aren't having to download large datasets off the web. Any changes the user makes while off-line can be synced to the 'home base' database as needed. Clearly some major app providers are taking advantage of this, after looking at Realm's customer list.
PR for the acquisition said they will be releasing more details at MongoDB World 2019, coming up soon (mid June in NYC). Hopefully we will start seeing signs of where and how Realm is getting integrated into the MongoDB ecosystem that v4 got started on (Stitch, Charts, etc).
It's an exciting time for app development, as cloud platforms have really enabled nimbleness and flexibility while enabling massive scale. I see MDB as making the right moves at the right time. They have been busy stitching together a cloud- and mobile-focused platform, and then enhanced it greatly with this acquisition. They join with cloud providers in having a serverless platform for their customers to better tie application development into their services. That means an ecosystem.... and that means lock-in. And when you are talking development lock-in, you generally mean a deep integration that isn't going to be replaced easily.
-muji