MongoDB – (No-SQL)

MongoDB Scaling and Sharding

As your application grows, scaling your MongoDB deployment becomes essential to handle larger datasets, higher throughput, and increased user load. MongoDB provides two primary methods to scale: vertical scaling and horizontal scaling. The latter, sharding, is MongoDB’s approach to horizontal scaling and is designed for large-scale applications with distributed data requirements.

In this guide, we will explore scaling concepts and sharding in MongoDB, how they work, and how to implement them in production environments.


1. Vertical Scaling

Vertical scaling refers to increasing the resources (CPU, RAM, storage) on a single MongoDB server to handle more data or load.

  • When to use: This method works well for small-to-medium-sized datasets or if your workload is relatively light.
  • Limitations: Vertical scaling has physical hardware limits. Once you hit the resource limits of a single machine, it becomes difficult to scale further.

How to scale vertically:

  • Increase RAM: More memory allows MongoDB to cache more data in RAM, speeding up access to frequently queried data.
  • Increase CPU: If your queries are CPU-bound, increasing CPU cores can help speed up processing.
  • Increase disk space: MongoDB stores data on disk, so adding storage will allow you to store more data.
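
As a rough illustration of the RAM point, the sketch below estimates how much memory the WiredTiger storage engine reserves for its internal cache by default, which MongoDB documents as the larger of 50% of (RAM minus 1 GB) and 256 MB. The function name is made up for this example, and the estimate assumes no explicit cache size has been configured.

```javascript
// Default WiredTiger internal cache size, in GB, for a host with `ramGB` of RAM:
// the larger of 50% of (RAM - 1 GB) and 256 MB (0.25 GB).
// Hypothetical helper for illustration; not part of any MongoDB API.
function defaultWiredTigerCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

for (const ram of [4, 16, 64]) {
  console.log(`${ram} GB RAM -> about ${defaultWiredTigerCacheGB(ram)} GB of cache`);
}
```

If the data your queries touch most often is much larger than this figure, adding RAM is likely to help; if it already fits, more memory probably will not.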

While vertical scaling can be effective for handling increasing load on a single node, it eventually becomes unsustainable for very large datasets or high traffic. At this point, horizontal scaling (via sharding) becomes necessary.


2. Horizontal Scaling and Sharding

Sharding is the process of distributing data across multiple servers (called shards) to achieve horizontal scaling. MongoDB supports sharding natively, enabling you to scale out your database as your data grows.

What is Sharding?

Sharding splits the data in a sharded collection into smaller pieces called chunks, each covering a range of shard key values, and distributes them across multiple servers (shards). Each shard contains a subset of the data. Sharding spreads the load and keeps large datasets manageable; high availability comes from running each shard as a replica set.

In a sharded cluster, MongoDB manages the distribution of data, query routing, and balancing across multiple shards. A sharded cluster consists of the following components:

  • Shards: The actual MongoDB servers that hold the data. Each shard stores a portion of the dataset.
  • mongos: A routing service that directs client queries to the appropriate shard(s).
  • Config Servers: These servers store metadata about the sharded cluster, including the mapping of data to shards and the configuration of the cluster itself.

When to Use Sharding?

  • Large Datasets: When the size of the data grows beyond the limits of a single machine.
  • High Throughput: When you need to distribute read/write traffic across multiple servers to handle high request volumes.
  • Geographical Distribution: If you want to distribute your database across multiple regions for fault tolerance and improved latency.

3. Sharding Architecture

Shards

Each shard stores a subset of the data. In a production environment, each shard is deployed as a replica set with multiple members, so the shard stays available if one member fails. This provides data redundancy and ensures that there’s no single point of failure in the cluster.

Config Servers

MongoDB uses config servers to store metadata for the sharded cluster. This includes information about the distribution of data and chunk locations. Config servers are deployed as a replica set, and three members are recommended for redundancy and fault tolerance.

  • Use case: Config servers are critical because they allow MongoDB to keep track of the data and ensure that queries are routed correctly to the appropriate shard.

Mongos Routers

The mongos process is the query router. Applications connect to the mongos router, which then forwards queries to the appropriate shard(s). It handles the routing logic of distributing queries to the correct shard based on the shard key.

  • Use case: The mongos router abstracts away the complexity of managing multiple shards for the application. The application doesn’t need to know where the data is stored.
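
The routing decision can be pictured with a small simulation. This is only a sketch of the idea, not how mongos is implemented: the chunk table, the shard names, and the targetShards function are all hypothetical, and the collection is assumed to be sharded on user_id.

```javascript
// Hypothetical chunk metadata, similar in spirit to what the config servers record.
// Each chunk covers [min, max): the lower bound is inclusive, the upper bound exclusive.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard1" },
  { min: 1000, max: 2000, shard: "shard2" },
  { min: 2000, max: Infinity, shard: "shard3" },
];

// Return the shards a query has to visit.
// A filter on the shard key can be sent to one shard (targeted query);
// a filter without it must be sent to every shard (broadcast query).
function targetShards(filter, shardKey = "user_id") {
  if (!(shardKey in filter)) {
    return [...new Set(chunks.map((c) => c.shard))];
  }
  const value = filter[shardKey];
  const chunk = chunks.find((c) => value >= c.min && value < c.max);
  return [chunk.shard];
}

console.log(targetShards({ user_id: 1500 }));          // one shard
console.log(targetShards({ email: "a@example.com" })); // all shards
```

Queries that include the shard key touch a single shard, which is why the shard key should match the filters your application uses most.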

4. Shard Key

The shard key is the field that MongoDB uses to distribute documents across the shards. This key determines how data is partitioned. Choosing the right shard key is one of the most important decisions when setting up sharding because it has a significant impact on the performance and efficiency of the sharded cluster.

Types of Shard Keys:

  1. Hashed Shard Key:
    • MongoDB uses a hash function to distribute data evenly across the shards.
    • Use case: When you need to evenly distribute data but don’t care about the range of the data.
    • Example: {"user_id": "hashed"}
  2. Ranged Shard Key:
    • MongoDB divides data based on ranges of values, such as a range of timestamps, numeric values, or alphabetic values.
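    • Use case: When you query based on a range, such as date ranges or numeric ranges. Be aware that a key that only ever increases, such as a timestamp, sends all new inserts to the shard holding the highest range.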
    • Example: {"timestamp": 1}
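
The difference between the two key types shows up most clearly with keys that only ever increase. The sketch below routes 3,000 new, increasing key values under each strategy and counts how many land on each of three shards. It is a simplified model: the chunk boundaries are invented, chunk splitting and the balancer are ignored, and FNV-1a is used as a stand-in hash because MongoDB's own hash function is different.

```javascript
// Stand-in 32-bit FNV-1a hash (assumption: used only for illustration;
// MongoDB's hashed indexes use a different hash function).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep the value as an unsigned 32-bit integer
  }
  return h;
}

const SHARD_COUNT = 3;

// Ranged key: hypothetical fixed boundaries, with the last range open-ended.
function rangedShard(key) {
  if (key < 1000) return 0;
  if (key < 2000) return 1;
  return 2;
}

// Hashed key: the shard is chosen from the hash of the value, not the value itself.
function hashedShard(key) {
  return fnv1a(String(key)) % SHARD_COUNT;
}

function countByShard(keys, route) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[route(key)]++;
  return counts;
}

// New inserts whose key keeps growing, like a timestamp or a counter.
const newKeys = Array.from({ length: 3000 }, (_, i) => 5000 + i);
const rangedCounts = countByShard(newKeys, rangedShard);
const hashedCounts = countByShard(newKeys, hashedShard);

console.log("ranged:", rangedCounts);
console.log("hashed:", hashedCounts);
```

With the ranged key every new insert goes to the last shard, while the hashed key spreads the same inserts over all three. The trade-off is that a hashed key cannot serve range queries from a single shard.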

Choosing a Shard Key

  • High Cardinality: A good shard key should have many distinct values (high cardinality) to ensure an even distribution of data.
  • Even Distribution: Ideally, the shard key should distribute the data evenly across all shards. Poor choice of shard key can lead to hot spots, where one shard receives the majority of the traffic or data.
  • Query Patterns: Choose a shard key that reflects your most common query patterns. For example, if your queries often filter by user_id, you might want to shard by user_id.
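
One way to sanity-check a candidate shard key before committing to it is to measure cardinality and skew on a sample of documents. The sketch below does this in plain JavaScript; the shardKeyStats function and the sample documents are hypothetical, and a real check would run against a representative sample of your own data.

```javascript
// For a candidate field, report how many distinct values it has and what
// share of the documents the most common value accounts for.
function shardKeyStats(docs, field) {
  const freq = new Map();
  for (const doc of docs) {
    freq.set(doc[field], (freq.get(doc[field]) || 0) + 1);
  }
  const maxCount = Math.max(...freq.values());
  return { distinct: freq.size, topShare: maxCount / docs.length };
}

// Made-up sample data for illustration.
const sample = [
  { user_id: 1, country: "IN" },
  { user_id: 2, country: "IN" },
  { user_id: 3, country: "IN" },
  { user_id: 4, country: "US" },
];

console.log(shardKeyStats(sample, "user_id")); // many distinct values, no dominant one
console.log(shardKeyStats(sample, "country")); // few distinct values, one dominant
```

A field with few distinct values or one dominant value, like country here, tends to produce hot spots; a field like user_id is a better starting point.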

5. Setting Up Sharding in MongoDB

To set up sharding in MongoDB, follow these general steps:

Step 1: Start Config Servers

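You need at least three config servers for redundancy.

  1. Start a config server (repeat for the other two, giving each its own port and data directory if they share a host):

```bash
mongod --configsvr --replSet configReplSet --port 27019 --dbpath /path/to/configdb --bind_ip 127.0.0.1
```

  2. Initialize the config server replica set from one of its members, using mongosh (or the legacy mongo shell on older installations). Add the remaining members afterwards with rs.add():

```bash
mongosh --port 27019 --eval "rs.initiate()"
```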

Step 2: Start Shards

Start each shard as a replica set. For example, for three shards:

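  1. Start the first shard:

```bash
mongod --shardsvr --replSet shard1 --port 27018 --dbpath /path/to/shard1
```

  2. Initialize its replica set:

```bash
mongosh --port 27018 --eval "rs.initiate()"
```

  3. Repeat for the second and third shards, using a distinct replica set name, port, and data directory for each.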

Step 3: Start Mongos Routers

The mongos routers are the interface between client applications and the sharded cluster. Start a mongos process for each router:

```bash
mongos --configdb configReplSet/localhost:27019 --bind_ip 127.0.0.1 --port 27017
```

Step 4: Enable Sharding on the Database

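To enable sharding on a database, connect to a mongos router with mongosh, make sure each shard has been added to the cluster, and then enable sharding:

```bash
mongosh --port 27017
```

```javascript
sh.addShard("shard1/localhost:27018")  // register each shard replica set once
sh.enableSharding("mydb")
```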

Step 5: Shard a Collection

After enabling sharding on the database, choose a shard key and shard a collection. For example, to shard a collection based on user_id:

```javascript
sh.shardCollection("mydb.mycollection", { "user_id": 1 })
```

Step 6: Monitor and Balance the Cluster

MongoDB automatically balances the data across the shards. You can check the state of the cluster, including the chunk distribution and the balancer status, from a mongos connection:

```javascript
db.printShardingStatus()
```

When the data held by the shards becomes uneven, the balancer migrates chunks between shards in the background until the distribution evens out.


6. Handling Shard Balancing and Chunk Migration

MongoDB uses chunk migration to ensure that data is evenly distributed across shards.

  • Chunk: A chunk is a contiguous range of shard key values, with an inclusive lower bound and an exclusive upper bound. Each chunk lives on exactly one shard.
  • Balancer: The balancer runs in the background to move chunks between shards to maintain an even distribution of data. It tries to keep the cluster balanced, with each shard holding approximately the same amount of data.
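
The balancer's behaviour can be sketched as a loop that keeps moving one chunk from the fullest shard to the emptiest one. This is a deliberately simplified model that balances on chunk counts with a made-up threshold; the real balancer's policy differs between MongoDB versions (newer versions balance on data size), and the balance function and shard names here are hypothetical.

```javascript
// Move chunks one at a time from the most loaded shard to the least loaded
// shard until their chunk counts differ by no more than `threshold`.
function balance(chunkCounts, threshold = 1) {
  if (threshold < 1) throw new RangeError("threshold must be at least 1");
  const counts = { ...chunkCounts };
  const migrations = [];
  while (true) {
    const names = Object.keys(counts);
    const donor = names.reduce((a, b) => (counts[a] >= counts[b] ? a : b));
    const recipient = names.reduce((a, b) => (counts[a] <= counts[b] ? a : b));
    if (counts[donor] - counts[recipient] <= threshold) break;
    counts[donor]--;
    counts[recipient]++;
    migrations.push({ from: donor, to: recipient });
  }
  return { counts, migrations };
}

const result = balance({ shard1: 10, shard2: 2, shard3: 0 });
console.log(result.counts);            // chunk counts after balancing
console.log(result.migrations.length); // number of chunk moves performed
```

Each migration copies data over the network, which is why a poorly chosen shard key that keeps unbalancing the cluster costs more than just uneven storage.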

To view the chunk distribution:

```javascript
sh.status()
```

Balancing Options:

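You can disable the balancer temporarily if you need to control when chunks are moved, such as during a maintenance window, and re-enable it afterwards:

```javascript
sh.stopBalancer()   // pause chunk migrations
sh.startBalancer()  // resume chunk migrations
```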


7. Scaling and Performance Considerations

While sharding allows you to scale horizontally, there are important considerations to keep in mind to ensure optimal performance.

a. Proper Shard Key Selection

Choosing the wrong shard key can lead to inefficient data distribution, resulting in hot spots and unbalanced workloads. It’s crucial to consider your most common query patterns and choose a shard key that balances the data evenly.

b. Write-Heavy Workloads

If you have write-heavy workloads, MongoDB’s sharded architecture can help by distributing the writes across multiple shards. However, you should still ensure that the shard key is well-chosen to avoid bottlenecks on a single shard.

c. Monitoring the Cluster

To ensure that your sharded cluster is running optimally, monitor key metrics such as disk usage, query performance, and shard distribution. Use tools like MongoDB Atlas, Cloud Manager, or Monitoring APIs to track performance.

d. Network Latency

When scaling horizontally, network latency between shards and mongos routers can affect performance. Ensure that your network infrastructure is robust and low-latency to minimize performance bottlenecks.


Conclusion

Sharding is a powerful technique that enables MongoDB to scale horizontally across many servers, making it an ideal choice for large, high-throughput applications. However, to fully leverage the power of sharding, it’s essential to choose the right shard key and maintain a well-configured cluster with sufficient monitoring and balancing. By carefully planning your sharding strategy and monitoring your cluster’s health, you can ensure your MongoDB deployment scales smoothly as your data grows.

Copyright © 2025 FullStackDost. All Rights Reserved.
