28 blog posts in 1 category
MongoDB is great at storing clickstream data, but using it to analyze millions of documents can be challenging. Hadoop provides a way of processing and analyzing data at large scale. Since it is a parallel system, workloads can be split on multiple nodes and computations on large datasets can be done in relatively short timeframes. MongoDB data can be moved into Hadoop using ETL tools like Talend or Pentaho Data Integration (Kettle).
So, your development project has been humming along nicely on MongoDB, until it was time to deploy the application. That's when you called your operations person and things got uncomfortable. NoSQL, document database, collections, replica sets, sharding, config servers, query servers,... What the hell's going on here?
According to Wikipedia, a ceilometer is a device that uses a laser or other light source to determine the height of a cloud base. And it is also the name of the framework for monitoring and metering OpenStack. It collects measurements within OpenStack so that no two agents would need to be written to collect the same data.
In this post we will compare performance of MongoDB and TokuMX, a MongoDB performance engine from Tokutek. We will conduct three simple experiments that (almost) anyone without any programming skills can try and reproduce. In this way, we’ll be able to see how both products behave.
Replica Sets in MongoDB are very useful. They provide multiple copies of data, automated failover and read scalability. A Replica Set can consist of up to 12 nodes, with only one primary node (or master node) able to accept writes. In case of primary node failure, a new primary is auto-elected.Replica Sets in MongoDB are very useful. They provide multiple copies of data, automated failover and read scalability. A Replica Set can consist of up to 12 nodes, with only one primary node (or master node) able to accept writes. In case of primary node failure, a new primary is auto-elected.
Replica Sets or Sharded Clusters?
** Diagrams updated on May 22nd. Thanks to Leif Walsh from Tokutek for his feedback.
Replica Sets are a great way to replicate MongoDB data across multiple servers and have the database automatically failover in case of server failure. Read workloads can be scaled by having clients directly connect to secondary instances. Note that master/slave MongoDB replication is not the same thing as a Replica Set, and does not have automatic failover.
**Attention: The instructions in this blog post are outdated. Please refer to ClusterControl Quick Start Guide for updated instructions.
In this post, we are going to show you on how to install and integrate ClusterControl on top of an existing MongoDB Sharded Cluster with a replica set of 3 nodes
Last year, we did a survey asking our users about any other databases they were using alongside MySQL. A clear majority were interested in using other databases alongside MySQL, these included (in order of popularity) MongoDB, PostgreSQL, Cassandra, Hadoop, Redis.