♠ MongoDB Aggregation Framework and Map Reduce ♠

4 min read · May 30, 2021

✨ What is a Database ?

A database is an organized collection of data, generally stored and accessed electronically from a computer system. More complex databases are often developed using formal design and modeling techniques.

✨ What is NoSQL ?

NoSQL is an approach to database design that accommodates a wide variety of data, such as key-value, document, columnar, and graph formats, as well as multimedia and external files. NoSQL databases are purpose-built for specific data models and use flexible schemas, which suits modern applications. Some well-known examples are MongoDB, Neo4j, and HyperGraphDB.

✨ What is MongoDB ?

MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas.

✨ What is the MongoDB Aggregation Framework ?

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods.

✨ What is the Aggregation Pipeline ?

MongoDB’s aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.
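
To make the pipeline idea concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements pipelines; the helper names (`matchStage`, `projectStage`, `runPipeline`) and the sample documents are hypothetical, chosen only to show documents flowing through stages in order.

```javascript
// Hypothetical in-memory documents; the field names are illustrative only.
const docs = [
  { name: "a", qty: 5 },
  { name: "b", qty: 12 },
  { name: "c", qty: 20 },
];

// Each stage takes an array of documents and returns a new array.
const matchStage = (predicate) => (input) => input.filter(predicate);
const projectStage = (shape) => (input) => input.map(shape);

// Feed the output of each stage into the next one, in order.
function runPipeline(input, stages) {
  return stages.reduce((current, stage) => stage(current), input);
}

const result = runPipeline(docs, [
  matchStage((d) => d.qty > 10),           // keep only documents with qty > 10
  projectStage((d) => ({ name: d.name })), // keep only the name field
]);

console.log(result);
```

Each stage sees only what the previous stage produced, which is why the order of stages matters.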

✨ What is the Map-Reduce Function ?

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results.
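
The paradigm can be sketched in a few lines of plain JavaScript: a map step emits key-value pairs, the pairs are grouped by key, and a reduce step condenses each group to one value. This is an in-memory stand-in under assumed names (`mapReduce`, `orders`, `totals` are hypothetical), not MongoDB's implementation. Here `emit` is passed in as an argument, whereas in MongoDB it is a global and the document is `this`.

```javascript
// Minimal in-memory stand-in for the map-reduce flow (not MongoDB's implementation).
function mapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();

  // emit() collects every value under its key, like the grouping step.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) mapFn(doc, emit);

  // Reduce each key's values to one result; keys with a single value skip reduce.
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

// Hypothetical order documents, for illustration only.
const orders = [
  { customer: "ann", amount: 10 },
  { customer: "bob", amount: 5 },
  { customer: "ann", amount: 7 },
];

const totals = mapReduce(
  orders,
  (doc, emit) => emit(doc.customer, doc.amount),          // map: key = customer
  (key, values) => values.reduce((sum, v) => sum + v, 0)  // reduce: sum amounts
);

console.log(totals);
```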

Let’s begin with the task: 🤩

✨ Task Description📄

🔅 Use the Aggregation Framework of MongoDB and create a Mapper and Reducer program.

Before we begin, we need to import the data into a MongoDB database.

mongoimport sample.json -d <database_name> -c <collection_name> --jsonArray

👉🏻 Let’s connect to the MongoDB shell :

// list the databases
show dbs
// switch to the database, then list its collections
use <database_name>
show collections

👉🏻 First, let’s understand what our data contains and what our goal is :

The data lists countries and the languages spoken in them. Our goal is to find out how many countries speak each language.

👉🏻 We will do this using two MongoDB aggregation methods :

  1. Aggregation Pipeline
  2. Map-Reduce Function

✨ Now let’s begin with the task …

👉🏻 Method 1: Aggregation Pipeline

db.countries.aggregate([
  { $group: { _id: { Language: "$Language" }, totalCountry: { $sum: 1 } } },
  { $sort: { totalCountry: 1 } }
])
// $group with _id: { Language: "$Language" } --> group by Language
// totalCountry: { $sum: 1 } --> count the countries associated with that language
// $sort: { totalCountry: 1 } --> sort by that count in ascending order
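
To see what the two stages do to the documents, the following plain-JavaScript trace mimics them on a tiny in-memory array. The field names `Language` and `CountryName` come from the commands in this post; the country and language values are made up for illustration, and the loop is only a stand-in for what the server does.

```javascript
// Hypothetical sample documents; only the field names match the real collection.
const countries = [
  { CountryName: "Country A", Language: "English" },
  { CountryName: "Country B", Language: "Spanish" },
  { CountryName: "Country C", Language: "English" },
  { CountryName: "Country D", Language: "English" },
  { CountryName: "Country E", Language: "Spanish" },
  { CountryName: "Country F", Language: "Hindi" },
];

// Stand-in for $group: one counter per distinct Language, incremented per document.
const counts = new Map();
for (const doc of countries) {
  counts.set(doc.Language, (counts.get(doc.Language) || 0) + 1);
}
const grouped = [...counts].map(([language, total]) => ({
  _id: { Language: language },
  totalCountry: total,
}));

// Stand-in for $sort: { totalCountry: 1 }, ascending by count.
const sorted = [...grouped].sort((a, b) => a.totalCountry - b.totalCountry);

console.log(sorted);
```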

👉🏻 Method 2: Map Reduce Function

// Map-reduce functions
// Syntax :
var mapFunction = function() { … };
var reduceFunction = function(key, values) { … };
db.runCommand(
{
mapReduce: <input-collection>,
map: mapFunction,
reduce: reduceFunction,
out: { merge: <output-collection> },
query: <query>
}
)

👉🏻 Declaring the map function :

var mapFunc1 = function() {
  // emit one (key, value) pair per document: key = Language, value = 1
  emit(this.Language, 1);
};
// the mapper emits a 1 for every country document, keyed by its Language
// ($split is an aggregation pipeline operator and has no effect inside a JavaScript function, so it is dropped here)

👉🏻 Declaring the reduce function :

var ReduceFunc1 = function(keyLang, valuesCount) {
  // add up the 1s emitted for this language
  return Array.sum(valuesCount);
};
// after the mapper output is grouped by language, the reducer sums the values to get the number of countries
// (summing stays correct even if MongoDB calls reduce more than once for the same key)
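
One detail worth checking in any reduce function: MongoDB can call it more than once for the same key, passing an earlier result back in as one of the values. The plain-JavaScript sketch below simulates that two-step call by hand to compare a reducer that returns `values.length` with one that sums emitted 1s. The split into "first three, then the rest" is an assumption made for illustration; the server decides its own batching.

```javascript
// Two candidate reducers for counting documents per key.
const lengthReduce = (key, values) => values.length;
const sumReduce = (key, values) => values.reduce((a, b) => a + b, 0);

// Suppose the mapper emitted a 1 for each of five countries sharing one language.
const emitted = [1, 1, 1, 1, 1];

// Simulated re-reduce: reduce the first three values, then feed that partial
// result back in together with the remaining two values.
const partialLen = lengthReduce("English", emitted.slice(0, 3));
const finalLen = lengthReduce("English", [partialLen, ...emitted.slice(3)]);

const partialSum = sumReduce("English", emitted.slice(0, 3));
const finalSum = sumReduce("English", [partialSum, ...emitted.slice(3)]);

console.log({ finalLen, finalSum }); // the length-based count loses documents
```

The sum-based reducer gives the same answer however the values are batched, which is the property a reduce function needs.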

👉🏻 Using Map Reduce Function :

db.countries.mapReduce( 
mapFunc1,
ReduceFunc1,
{out: "map_reduced"}
)
// run map-reduce and save the output in the map_reduced collection

👉🏻 Now let’s query the result :

db.map_reduced.find().sort( { value: 1 } )
// each output document has _id (the language) and value (the count), so we sort ascending by value

So finally, the data is sorted as we wanted !!! 🎉🎉

Thanks for Reading !! 🙌🏻😁📃

🔰 Keep Learning !! Keep Sharing !! 🔰
