MongoDB sharded cluster deployment in Ubuntu

1st November, 2022

MongoDB is an open source document-oriented NoSQL database. Unlike in the traditional relational databases, MongoDB makes use of JSON like collections and documents where each document consists of key-value pairs. It is typically used for high volume data storage.

However, as the size of the database increases, the query execution time might tend to increase as with any other database. To resolve such issues, MongoDB provides an option of sharding.

The concept of sharding can be read here. In MongoDB sharding requires three sets of different servers.

  • Shards (at least 2 sets of 3 servers replica set): Each shard contains a subset of data. The shards should be deployed as replica sets. Within each shard the data will get replicated across its servers.
  • Mongos: It acts as a query router, handling both read and write operations. The query requests get dispatched to the relevant shards and subsequently the result from each shard gets aggregated before the final response is delivered.
  • Config (at least a set of 3 servers replica set): It maintains metadata and configuration settings.

 

Each server requires installation of mongodb. The installation instructions can be found here.

Following steps will help to deploy a sharded MongoDB cluster. The documentation is based on Ubuntu 20.04.5 LTS

  • Set up of config servers (repeat the following steps in all servers)
    • Create/edit the config file /etc/mongod.conf
      net:
        port: 27019        #same on all servers
        bindIp: 127.0.0.1,<hostname(s)|ip address(es)>
      replication:
        replSetName: <replica set name>
      sharding:
        clusterRole: configsvr​
    • Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
      [Unit]
      Description=MongoDB Config Server
      Documentation=https://docs.mongodb.org/manual
      After=network-online.target
      Wants=network-online.target
      [Service]
      User=mongodb
      Group=mongodb
      EnvironmentFile=-/etc/default/mongod
      ExecStart=/usr/bin/mongod --config /etc/mongod.conf
      PIDFile=/var/run/mongodb/mongod.pid
      # file size
      LimitFSIZE=infinity
      # cpu time
      LimitCPU=infinity
      # virtual memory size
      LimitAS=infinity
      # open files
      LimitNOFILE=64000
      # processes/threads
      LimitNPROC=64000
      # locked memory
      LimitMEMLOCK=infinity
      # total threads (user+kernel)
      TasksMax=infinity
      TasksAccounting=false
      # Recommended limits for mongod as specified in
      # https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
      [Install]
      WantedBy=multi-user.target​
    • Reload the systemd: sudo systemctl daemon-reload
    • Restart the server: sudo systemctl restart mongoserver
    • Enable the service so that it gets restarted on reboot: sudo systemctl enable mongoserver
  • Initiate the config servers replica set. This can be achieved by connecting to any one of the config servers. This is to be done only once for the complete replica set.
    • SSH to any one of the server belonging to the config servers replica set and execute mongosh

      mongosh --host localhost  --port 27019

      Execute the following: (add the replica set name used before and add the server details in the members section)
      rs.initiate(
        {
          _id: "<replica set name>",
          configsvr: true,
          members: [
            { _id : 0, host : "<host/ip>:27019" },
            { _id : 1, host : "<host/ip>:27019" },
            { _id : 2, host : "<host/ip>:27019" }
          ]
        }
      )
      ​


  • Set up of Shard servers
    • Each shard should be a replica set of at least 3 servers.
    • Set up of replica set of single shard (repeat the following steps in all servers)
      • Create/edit the config file /etc/mongod.conf
        net:
          port: 27018        #same on all servers
          bindIp: 127.0.0.1,<hostname(s)|ip address(es)>
        replication:
          replSetName: <replica set name>
        sharding:
          clusterRole: shardsvr​

         

      • Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
        [Unit]
        Description=MongoDB Shard Server
        Documentation=https://docs.mongodb.org/manual
        After=network-online.target
        Wants=network-online.target
        [Service]
        User=mongodb
        Group=mongodb
        EnvironmentFile=-/etc/default/mongod
        ExecStart=/usr/bin/mongod --config /etc/mongod.conf
        PIDFile=/var/run/mongodb/mongod.pid
        # file size
        LimitFSIZE=infinity
        # cpu time
        LimitCPU=infinity
        # virtual memory size
        LimitAS=infinity
        # open files
        LimitNOFILE=64000
        # processes/threads
        LimitNPROC=64000
        # locked memory
        LimitMEMLOCK=infinity
        # total threads (user+kernel)
        TasksMax=infinity
        TasksAccounting=false
        # Recommended limits for mongod as specified in
        # https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
        [Install]
        WantedBy=multi-user.target​

         

      • Reload the systemd: sudo systemctl daemon-reload
      • Restart the server: sudo systemctl restart mongoserver
      • Enable the service so that it gets restarted on reboot: sudo systemctl enable mongoserver
    • Initiate the shard servers replica set. This can be achieved by connecting to any one of the servers of shard replica set. This is to be done only once for the complete replica set.
      • SSH to any one of the server belonging to the config servers replica set and execute mongosh

        mongosh --host localhost  --port 27018

        Execute the following: (add the replica set name used before and add the server details in the members section)
        rs.initiate(
          {
            _id: "<replica set name>",
            members: [
              { _id : 0, host : "<host/ip>:27018" },
              { _id : 1, host : "<host/ip>:27018" },
              { _id : 2, host : "<host/ip>:27018" }
            ]
          }
        )
        ​
    • Repeat the same steps for each shard replica set but with different port.
  • Set up of mongos query router server.
    • Create/edit the config file /etc/mongod.conf

      Comment the storage section and edit the following.
      net:
        port: 27017
        bindIp: 127.0.0.1,<hostname(s)|ip address(es)> # Include public IP if required
      
      sharding:
        configDB: <config replica set name>/<config server ip>:27019,<config server ip>:27019 # at least one member of the replica set in <replSetName>/<host:port> format
    • Create a systemd service by creating a file /etc/systemd/system/mongoserver.service
      [Unit]
      Description=MongoDB Query Server
      Documentation=https://docs.mongodb.org/manual
      After=network-online.target
      Wants=network-online.target
      [Service]
      User=mongodb
      Group=mongodb
      EnvironmentFile=-/etc/default/mongod
      # NOTE: command must be mongos
      ExecStart=/usr/bin/mongos --config /etc/mongod.conf
      PIDFile=/var/run/mongodb/mongod.pid
      # file size
      LimitFSIZE=infinity
      # cpu time
      LimitCPU=infinity
      # virtual memory size
      LimitAS=infinity
      # open files
      LimitNOFILE=64000
      # processes/threads
      LimitNPROC=64000
      # locked memory
      LimitMEMLOCK=infinity
      # total threads (user+kernel)
      TasksMax=infinity
      TasksAccounting=false
      # Recommended limits for mongod as specified in
      # https://docs.mongodb.com/manual/reference/ulimit/#recommended-ulimit-settings
      [Install]
      WantedBy=multi-user.target​
    • Reload the systemd: sudo systemctl daemon-reload
    • Restart the server: sudo systemctl restart mongoserver
    • Enable the service so that it gets restarted on reboot: sudo systemctl enable mongoserver
  • Add shards to the cluster
    • Go to the mongos server

      mongosh --host localhost  --port 27018

      Execute the following: (add the replica set name used before and add the server details in the members section)
      sh.addShard( "<shard repl set name>/<dns/ip>:27018,<dns/ip>:27018,<dns/ip>:27018")
      
      # Repeat the above comamnd for all the shards.​

 

That's it. The MongoDB cluster should be ready now.

 

Helpful references: