MongoDB: A technology behind the leading software applications.

Technology changes every day, and we at GIBOTS strive to keep pace with the changing landscape: from embracing the Angular framework since its beta days through the latest release, Angular 5.0, to moving from MySQL to one of the best NoSQL databases around.

Behind every successful business there is efficient use of automation. The GIBots Robotic Business Process Automation (RBPA) software suite not only makes processes and tasks simpler but also makes them more efficient, eliminating redundancy and improving accuracy in daily operations. Letting GIBots handle a process streamlines your organization's workflow.

Our venture into MongoDB was for the sole purpose of handling huge volumes of consumer data for analytics and reporting.

That brings us to the title of this blog. As it's normal to have initial hiccups with every new thing you try, we had our adventures as well. In this blog we try to cover the basics that will get anyone started with MongoDB and give them something to build on.

What is the purpose of this blog?

As part of our mission to help developers build better apps faster by providing a database platform that doesn't hold them back, we are always looking for new ways to better understand the challenges. This article covers two things: first, the basic operations on a MongoDB database; second, how to use the aggregation pipeline and perform write operations in bulk.

What is MongoDB?

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
Example:

{
   name: "abc",
   age: 29,
   status: "A",
   position: "B"
}

The advantages of using documents are:

  1. Documents (i.e. objects) correspond to native data types in many programming languages.
  2. Embedded documents and arrays reduce the need for expensive joins.
  3. Dynamic schema supports fluent polymorphism.

MongoDB CRUD Operations:

  1. Create Operations.
  2. Read Operations.
  3. Update Operations.
  4. Delete Operations.
  5. Bulk Write.

Create Operations:

Create or insert operations add new documents to a collection. If the collection does not currently exist, insert operations will create the collection.

Examples: MongoDB provides the following methods to insert documents into a collection.

db.collection.insertOne() inserts a single document:

db.products.insertOne( { item: "card", qty: 15 } );

db.collection.insertMany() inserts several documents at once:

db.products.insertMany( [
   { item: "card", qty: 15 },
   { item: "envelope", qty: 20 },
   { item: "stamps", qty: 30 }
] );

Read Operations:

Read operations retrieve documents from a collection; i.e. they query a collection for documents.

Example: Given a collection students that contains the following documents:

{ "_id" : 1, "score" : [ -1, 3 ] }
{ "_id" : 2, "score" : [ 1, 5 ] }
{ "_id" : 3, "score" : [ 5, 5 ] }

The following query:

db.students.find( { score: { $gt: 0, $lt: 2 } } );

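This matches the following documents:

{ "_id" : 1, "score" : [ -1, 3 ] }
{ "_id" : 2, "score" : [ 1, 5 ] }

Results can also be sorted and limited. The following two statements are equivalent, because MongoDB applies the sort before the limit regardless of the order in which the methods are chained:

db.students.find().sort( { score: 1 } ).limit( 5 )
db.students.find().limit( 5 ).sort( { score: 1 } )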
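
Document 1 in the range query above may look surprising, since neither -1 nor 3 lies between 0 and 2. For an array field, each condition may be satisfied by a different element: 3 is greater than 0 and -1 is less than 2. The sketch below models that rule in plain JavaScript so it can be run without a server. The helpers matchesRange and matchesElemRange are hypothetical names for illustration, not MongoDB APIs; the second one models the $elemMatch operator, which requires a single element to satisfy every condition.

```javascript
// Plain JavaScript model of how a range query treats an array field.
// matchesRange and matchesElemRange are hypothetical helpers, not MongoDB APIs.
const students = [
  { _id: 1, score: [-1, 3] },
  { _id: 2, score: [1, 5] },
  { _id: 3, score: [5, 5] }
];

// Models { score: { $gt: low, $lt: high } }:
// each condition may be met by a different array element.
function matchesRange(doc, low, high) {
  return doc.score.some(s => s > low) && doc.score.some(s => s < high);
}

// Models { score: { $elemMatch: { $gt: low, $lt: high } } }:
// one single element must meet both conditions.
function matchesElemRange(doc, low, high) {
  return doc.score.some(s => s > low && s < high);
}

const anyElement = students.filter(d => matchesRange(d, 0, 2)).map(d => d._id);
const sameElement = students.filter(d => matchesElemRange(d, 0, 2)).map(d => d._id);

console.log(anyElement);  // [ 1, 2 ]
console.log(sameElement); // [ 2 ]
```

If only document 2 is wanted, wrapping the conditions in $elemMatch is the usual way to ask for one element that satisfies both bounds.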

Update Operations:

Update operations modify existing documents in a collection. MongoDB provides several methods to update documents of a collection, including db.collection.updateOne() and db.collection.updateMany().

updateOne:

updateOne() updates a single document: the first document in the collection that matches the filter.

Example: The students collection contains the following documents:

{ "_id" : 1, "name" : "A", "age" : 1 },
{ "_id" : 2, "name" : "B", "age" : 1 },
{ "_id" : 3, "name" : "C", "age" : 1 }

The following operation sets the age of the first student named "A" to 3:

try {
   db.students.updateOne(
      { "name" : "A" },
      { $set: { "age" : 3 } }
   );
} catch (e) {
   print(e);
}

Response: { "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

updateMany:

updateMany() updates all documents that match the filter. With upsert: true, it inserts a new document if no match is found.

Example:

try {
   db.students.updateMany(
      { "name" : "A" },
      { $set: { "age" : 3 } },
      { upsert: true }
   );
} catch (e) {
   print(e);
}

Delete Operations:

Delete operations remove documents from a collection. MongoDB provides the following methods to delete documents of a collection:

db.collection.deleteOne() deletes the first document that matches the filter.

db.collection.deleteMany() deletes all documents that match the filter.

Bulk Write:

MongoDB provides the ability to perform write operations in bulk. The db.collection.bulkWrite() method performs bulk insert, update, and remove operations.

Ordered vs Unordered Operations:

With an ordered list of operations, MongoDB executes the operations serially. If an error occurs during the processing of one of the write operations, MongoDB will return without processing any remaining write operations in the list.

With an unordered list of operations, MongoDB can execute the operations in parallel, but this behaviour is not guaranteed. If an error occurs during the processing of one of the write operations, MongoDB will continue to process the remaining write operations in the list.

By default, bulkWrite() performs ordered operations. To specify unordered write operations, set ordered : false in the options document.
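
The difference in error handling can be sketched in plain JavaScript. This is only a model of the behaviour described above, not how MongoDB is implemented: runBulk is a hypothetical helper, the operations are stand-in functions, and the model always runs them one after another, so the possible parallelism of unordered writes is not represented.

```javascript
// Plain JavaScript model of ordered vs unordered bulk execution.
// runBulk is a hypothetical helper; it only models the error handling.
function runBulk(operations, { ordered = true } = {}) {
  const result = { succeeded: [], failed: [] };
  for (const [index, op] of operations.entries()) {
    try {
      op();
      result.succeeded.push(index);
    } catch (e) {
      result.failed.push(index);
      if (ordered) break; // ordered: stop at the first error
      // unordered: record the error and keep going
    }
  }
  return result;
}

// Three stand-in write operations; the second one fails.
const ops = [
  () => "insert ok",
  () => { throw new Error("duplicate key"); },
  () => "update ok"
];

const orderedResult = runBulk(ops); // ordered is the default
const unorderedResult = runBulk(ops, { ordered: false });

console.log(orderedResult);   // { succeeded: [ 0 ], failed: [ 1 ] }
console.log(unorderedResult); // { succeeded: [ 0, 2 ], failed: [ 1 ] }
```

In the ordered run the third operation is never attempted, while the unordered run still applies it after the failure.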

Example:

The following bulkWrite() performs multiple operations on the collection:

try {
   db.students.bulkWrite(
      [
         { insertOne :
            {
               "document" : { "_id" : 4, "name" : "A", "age" : 22 }
            }
         },
         { updateOne :
            {
               "filter" : { "name" : "A" },
               "update" : { $set : { "age" : 33} }
            }
         },
         { deleteOne :
            { "filter" : { "age" : 22} }
         }
       ]
   );
}
catch (e) {
   print(e);
}

Aggregation:

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result.

MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.

The following are some of the stages available in the aggregation framework:

  • $project − Selects specific fields from the documents.
  • $match − Filters documents, which can reduce the number of documents passed to the next stage.
  • $group − Performs the actual aggregation, as discussed above.
  • $sort − Sorts the documents.
  • $skip − Skips a given number of documents.
  • $limit − Limits the number of documents passed to the next stage to the given number, starting from the current position.
  • $unwind − Deconstructs an array field. An array keeps its data "pre-joined" inside one document; $unwind undoes this by emitting one document per array element, so this stage increases the number of documents passed to the next stage.

Example:

Suppose the collection mycol contains the following data:

{
   _id: ObjectId("7df78ad8902c"),
   title: 'MongoDB Overview',
   description: 'MongoDB is no sql database',
   by_user: 'tutorials point',
   url: 'http://www.tutorialspoint.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId("7df78ad8902d"),
   title: 'NoSQL Overview',
   description: 'No sql database is very fast',
   by_user: 'tutorials point',
   url: 'http://www.tutorialspoint.com',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId("7df78ad8902e"),
   title: 'Neo4j Overview',
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
},

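The following pipeline groups the documents by user and counts how many tutorials each user has written:

db.mycol.aggregate([
   { $group : { _id : "$by_user", num_tutorial : { $sum : 1 } } }
]);

The following pipeline keeps only the documents written by Neo4j and then sums the likes for each title. The $match stage comes first because the by_user field is no longer available after $group:

db.mycol.aggregate([
   { $match: { by_user: 'Neo4j' } },
   { $group: { _id: "$title", likes: { $sum: "$likes" } } }
]);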
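
To make the grouping by by_user concrete, here is a plain JavaScript model of what a $group stage with { $sum : 1 } computes over the sample data. It is a sketch only: groupCount is a hypothetical helper, just the fields needed for the illustration are reproduced, and the output order of a real $group stage is not guaranteed.

```javascript
// Plain JavaScript model of { $group: { _id: "$by_user", num_tutorial: { $sum: 1 } } }.
// groupCount is a hypothetical helper; only the relevant fields are reproduced.
const posts = [
  { title: "MongoDB Overview", by_user: "tutorials point", likes: 100 },
  { title: "NoSQL Overview", by_user: "tutorials point", likes: 10 },
  { title: "Neo4j Overview", by_user: "Neo4j", likes: 750 }
];

function groupCount(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    // Add 1 to the running count for this group key.
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([_id, num_tutorial]) => ({ _id, num_tutorial }));
}

const perUser = groupCount(posts, "by_user");
console.log(perUser);
// [ { _id: 'tutorials point', num_tutorial: 2 }, { _id: 'Neo4j', num_tutorial: 1 } ]
```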

Aggregation Pipeline Optimization:

Projection Optimization:

The aggregation pipeline can determine if it requires only a subset of the fields in the documents to obtain the results. If so, the pipeline will only use those required fields, reducing the amount of data passing through the pipeline.

$project or $addFields + $match Sequence Optimization:

For an aggregation pipeline that contains a projection stage ($project or $addFields) followed by a $match stage, MongoDB moves any filters in the $match stage that do not require values computed in the projection stage to a new $match stage before the projection.

If an aggregation pipeline contains multiple projection and/or $match stages, MongoDB performs this optimization for each $match stage, moving each $match filter before all projection stages that the filter does not depend on.

Consider a pipeline of the following stages:

{ $addFields: {
    maxTime: { $max: "$times" },
    minTime: { $min: "$times" }
} },
{ $project: {
    _id: 1, name: 1, times: 1, maxTime: 1, minTime: 1,
    avgTime: { $avg: ["$maxTime", "$minTime"] }
} },
{ $match: {
    name: "Joe Schmoe",
    maxTime: { $lt: 20 },
    minTime: { $gt: 5 },
    avgTime: { $gt: 7 }
} }
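
Applying the rule above to this pipeline: the name filter depends on no computed field, the maxTime and minTime filters depend only on $addFields, and the avgTime filter depends on $project. A sketch of the reordered stages follows, written as a JavaScript array of the kind that is passed to aggregate(). The variable names optimizedPipeline and stageNames are assumptions for illustration.

```javascript
// Sketch of the pipeline after the $match filters have been moved
// in front of the projection stages they do not depend on.
const optimizedPipeline = [
  { $match: { name: "Joe Schmoe" } },
  { $addFields: {
      maxTime: { $max: "$times" },
      minTime: { $min: "$times" }
  } },
  { $match: { maxTime: { $lt: 20 }, minTime: { $gt: 5 } } },
  { $project: {
      _id: 1, name: 1, times: 1, maxTime: 1, minTime: 1,
      avgTime: { $avg: ["$maxTime", "$minTime"] }
  } },
  { $match: { avgTime: { $gt: 7 } } }
];

// List the stage order to make the reordering easy to see.
const stageNames = optimizedPipeline.map(stage => Object.keys(stage)[0]);
console.log(stageNames);
// [ '$match', '$addFields', '$match', '$project', '$match' ]
```

The plan that the server actually chooses can be inspected by running the aggregation with the explain option.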

$sort + $match Sequence Optimization:

When you have a sequence with $sort followed by a $match, the $match moves before the $sort to minimize the number of objects to sort. For example, if the pipeline consists of the following stages:

{ $sort: { age : -1 } },
{ $match: { status: 'A' } }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $match: { status: 'A' } },
{ $sort: { age : -1 } }
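
That the two orders give the same result can be checked with a small plain JavaScript model, where Array filter and sort stand in for $match and $sort. The people array is made-up sample data and the helper names are assumptions for illustration.

```javascript
// Plain JavaScript check that filtering before sorting gives the same result
// as sorting before filtering, while sorting fewer documents.
// The people array is made-up sample data.
const people = [
  { name: "a", age: 30, status: "A" },
  { name: "b", age: 25, status: "B" },
  { name: "c", age: 41, status: "A" },
  { name: "d", age: 19, status: "B" }
];

const byAgeDesc = (x, y) => y.age - x.age;  // models { $sort: { age: -1 } }
const isActive = doc => doc.status === "A"; // models { $match: { status: 'A' } }

const sortThenMatch = [...people].sort(byAgeDesc).filter(isActive); // sorts 4 documents
const matchThenSort = people.filter(isActive).sort(byAgeDesc);      // sorts 2 documents

console.log(sortThenMatch.map(d => d.name)); // [ 'c', 'a' ]
console.log(matchThenSort.map(d => d.name)); // [ 'c', 'a' ]
```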

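$redact + $match Sequence Optimization:

The $redact stage restricts the contents of the documents based on information stored in the documents themselves. When a $redact stage is immediately followed by a $match stage, the optimizer can sometimes add a portion of the $match stage before the $redact stage, so that fewer documents enter $redact. For example, if the pipeline consists of the following stages:

{ $redact: { $cond: { if: { $eq: [ "$level", 5 ] }, then: "$$PRUNE", else: "$$DESCEND" } } },
{ $match: { year: 2014, category: { $ne: "Z" } } }

Here the optimizer can add { $match: { year: 2014 } } in front of the $redact stage while keeping the original $match after it.

$project + $skip or $limit Sequence Optimization: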

When you have a sequence with $project followed by either $skip or $limit, the $skip or $limit moves before $project.

For example, if the pipeline consists of the following stages:

{ $sort: { age : -1 } },
{ $project: { status: 1, name: 1 } },
{ $limit: 5 }

During the optimization phase, the optimizer transforms the sequence to the following:

{ $sort: { age : -1 } },
{ $limit: 5 },
{ $project: { status: 1, name: 1 } }

$lookup:

Performs a left outer join to another collection in the same database, bringing in documents from the "joined" collection for processing. The $lookup stage does an equality match between a field from the input documents and a field from the documents of the "joined" collection.

To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage.

The $lookup stage has the following syntax:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

Example:

{
  $lookup: {
    from: "otherCollection",
    as: "resultingArray",
    localField: "x",
    foreignField: "y"
  }
},
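
The join itself can be modelled in a few lines of plain JavaScript. This is a sketch of the equality match and the left outer behaviour only: lookup is a hypothetical helper, orders and inventory are made-up sample data, and edge cases such as missing or array-valued fields are ignored.

```javascript
// Plain JavaScript model of a $lookup equality join (left outer join).
// lookup, orders and inventory are hypothetical names and sample data.
function lookup(inputDocs, fromDocs, localField, foreignField, as) {
  return inputDocs.map(doc => ({
    ...doc,
    // Every input document is kept; unmatched ones get an empty array.
    [as]: fromDocs.filter(other => other[foreignField] === doc[localField])
  }));
}

const orders = [
  { _id: 1, x: "almonds" },
  { _id: 2, x: "pecans" }
];
const inventory = [
  { _id: 10, y: "almonds", instock: 120 },
  { _id: 11, y: "almonds", instock: 80 }
];

const joined = lookup(orders, inventory, "x", "y", "resultingArray");
console.log(JSON.stringify(joined, null, 2));
```

The first order ends up with two matching inventory documents in resultingArray, while the second keeps an empty array, which is what makes the join a left outer join.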

$unwind:

The $unwind stage deconstructs an array field from the input documents to output a document for each element. Every output document is the input document with the value of the array field replaced by the element.

Syntax:

{ $unwind: <field path> }

Example:

The following pipeline unwinds the specs array, joins each element against the size field of the inventory collection, and keeps only the documents that found a match:

db.mycol.aggregate([
   {
      $unwind: "$specs"
   },
   {
      $lookup:
         {
            from: "inventory",
            localField: "specs",
            foreignField: "size",
            as: "inventory_docs"
        }
   },
   {
      $match: { "inventory_docs": { $ne: [] } }
   }
])
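
What the $unwind stage does to a single document can be modelled in plain JavaScript. The sketch assumes the default options: unwind is a hypothetical helper, the product document is made-up sample data, and documents whose field is missing, null, or an empty array produce no output.

```javascript
// Plain JavaScript model of { $unwind: "$specs" } with default options.
// unwind is a hypothetical helper and the product document is sample data.
function unwind(docs, field) {
  return docs.flatMap(doc => {
    const value = doc[field];
    if (value === undefined || value === null) return []; // dropped by default
    if (!Array.isArray(value)) return [doc];              // treated as a single element
    // One output document per element; an empty array yields no documents.
    return value.map(element => ({ ...doc, [field]: element }));
  });
}

const products = [{ _id: 1, item: "shirt", specs: ["S", "M", "L"] }];
const unwound = unwind(products, "specs");

console.log(unwound);
// [ { _id: 1, item: 'shirt', specs: 'S' },
//   { _id: 1, item: 'shirt', specs: 'M' },
//   { _id: 1, item: 'shirt', specs: 'L' } ]
```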

Single Purpose Aggregation Operations:

Aggregation refers to a broad class of data manipulation operations that compute a result based on an input and a specific procedure. MongoDB provides a number of single purpose methods that each perform one specific aggregation on a set of data, such as count(), which returns the number of matching documents, and distinct(), which returns the distinct values of a field.

Example:

db.mycol.count( { url: 'http://www.neo4j.com' } );
db.mycol.distinct( "url" );

 
