Consulting Engineer, MongoDB
Bryan Reinero
#ConferenceHashTag
Time Series Data: Part 2
Aggregations in Action
Real Time Traffic Data Project
Our network of 16,000 speed sensors reports data every minute.
What we want from our data
Charting and Trending
What we want from our data
Historical & Predictive Analysis
What we want from our data
Real Time Traffic Dashboard
Document Structure
{ _id: ObjectId("5382ccdd58db8b81730344e2"),
linkId: 900006,
date: ISODate("2014-03-12T17:00:00Z"),
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Compound, unique index identifies the individual document
Sample Document Structure
Saves an extra index
{ _id: "900006:14031217",
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Range queries: /^900006:1403/
Regex must be left-anchored and case-sensitive
{ _id: "900006:140312",
data: [
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
{ speed: NaN, time: NaN },
...
],
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
}
Sample Document Structure
Pre-allocated, 60-element array of per-minute data
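The pre-allocated document and the prefix query can be sketched in plain JavaScript. This is a minimal illustration, not the project's code: `hourKey`, `makeHourDocument` and `prefixRegex` are hypothetical helper names, and the `linkId:YYMMDDHH` key layout in UTC is an assumption read off the sample `_id` "900006:14031217". The `conditions` sub-document is left out for brevity.

```javascript
// Zero-pad a number to two digits.
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Build a "linkId:YYMMDDHH" key in UTC (assumed layout, see lead-in).
function hourKey(linkId, date) {
  return linkId + ":" +
    pad2(date.getUTCFullYear() % 100) +
    pad2(date.getUTCMonth() + 1) +
    pad2(date.getUTCDate()) +
    pad2(date.getUTCHours());
}

// Pre-allocate one slot per minute so later updates fill values in place.
function makeHourDocument(linkId, date) {
  const data = [];
  for (let minute = 0; minute < 60; minute++) {
    data.push({ speed: NaN, time: NaN });
  }
  return { _id: hourKey(linkId, date), data: data };
}

// Left-anchored, case-sensitive prefix regex for range queries on _id.
function prefixRegex(linkId, datePrefix) {
  return new RegExp("^" + linkId + ":" + datePrefix);
}

const hourDoc = makeHourDocument(900006, new Date("2014-03-12T17:00:00Z"));
const marchRegex = prefixRegex(900006, "1403");
console.log(hourDoc._id, hourDoc.data.length, marchRegex.test(hourDoc._id));
```

Because the regex is anchored at the start of the string and has no case-insensitive flag, it describes a simple prefix, which is the form the slide says the range query needs.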
Charts
[Chart: per-minute readings (y-axis 0 to 70) from Mon Mar 10 2014 04:57 to Wed Mar 12 2014 10:46]
db.linkData.find( { _id : /^20484097:2014031/ } )
Rollups
{ _id: "20484097:20140204",
hours: [
{ speed: { sum: 1889, count: 60 },
time: { sum: 20562, count: 60 },
conditions: {
status: "Snow / Ice Conditions",
pavement: "Icy Spots",
weather: "Light Snow"
}
},
{ speed: { sum: 1892, count: 60 },
time: {sum: 20442, count: 60 },
conditions: {
status: "Snow / Ice Conditions",
pavement: "Slush",
weather: "Light Snow"
}
}
]}
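One way an hourly entry of the rollup could be derived from a per-minute document is sketched below in plain JavaScript. `sumAndCount` and `rollupHour` are hypothetical names, and skipping NaN placeholders (so that `count` reflects only minutes that actually reported) is an assumption, not something the slides state.

```javascript
// Sum the numeric readings and count them, ignoring NaN placeholders (assumption).
function sumAndCount(values) {
  let sum = 0;
  let count = 0;
  values.forEach(function (v) {
    if (!Number.isNaN(v)) {
      sum += v;
      count += 1;
    }
  });
  return { sum: sum, count: count };
}

// Collapse one per-minute document into a single entry of the hours array.
function rollupHour(hourDoc) {
  return {
    speed: sumAndCount(hourDoc.data.map(function (p) { return p.speed; })),
    time: sumAndCount(hourDoc.data.map(function (p) { return p.time; })),
    conditions: hourDoc.conditions
  };
}

const hourEntry = rollupHour({
  data: [
    { speed: 30, time: 340 },
    { speed: NaN, time: NaN },
    { speed: 34, time: 300 }
  ],
  conditions: { status: "Snow / Ice Conditions", pavement: "Icy Spots", weather: "Light Snow" }
});
console.log(hourEntry.speed, hourEntry.time);
```

Storing sum and count rather than an average keeps the rollup composable: daily and monthly figures can be built by adding the hourly sums and counts.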
Document retention
• Doc per hour: 2 days
• Doc per day: 2 months
• Doc per month: 1 year
Analysis with the Aggregation Framework
Pipelining operations
grep | sort | uniq
Piping command line operations
Pipelining operations
$match | $group | $sort
Piping aggregation operations
Stream of documents Result documents
What is the average speed for a given road segment?
> db.linkData.aggregate( [
// Select documents on the target segment
{ $match: { "_id" : /^20484097:/ } },
// Keep only the fields we really need
{ $project: { "data.speed": 1, linkId: 1 } },
// Loop over the array of data points
{ $unwind: "$data" },
// Use the handy $avg operator
{ $group: { _id: "$linkId", ave: { $avg: "$data.speed" } } }
] );
{ "_id" : 20484097, "ave" : 47.067650676506766 }
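To make the stages concrete, here is a plain-JavaScript sketch that mirrors the same steps on an in-memory array. It does not call MongoDB; `averageSpeed` and the sample documents are hypothetical and only illustrate what $match, $unwind and $group with $avg each contribute.

```javascript
// Mirror of the pipeline stages over an in-memory array of documents.
function averageSpeed(docs, linkId) {
  const onSegment = new RegExp("^" + linkId + ":"); // $match: left-anchored prefix on _id
  const speeds = [];
  docs.forEach(function (doc) {
    if (!onSegment.test(doc._id)) return;
    // $project + $unwind: keep only the speed of each data point
    doc.data.forEach(function (point) { speeds.push(point.speed); });
  });
  if (speeds.length === 0) return null;
  // $group with $avg: mean over every unwound data point
  const total = speeds.reduce(function (a, b) { return a + b; }, 0);
  return total / speeds.length;
}

const sampleDocs = [
  { _id: "20484097:2014031217", data: [ { speed: 40 }, { speed: 50 } ] },
  { _id: "20484097:2014031218", data: [ { speed: 60 } ] },
  { _id: "900006:2014031217", data: [ { speed: 10 } ] }
];
const segmentAverage = averageSpeed(sampleDocs, 20484097);
console.log(segmentAverage); // 50
```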
More Sophisticated Pipelines:
average speed with variance
{ "$project" : {
mean: "$meanSpd",
spdDiffSqrd : {
"$map" : {
"input": {
"$map" : {
"input" : "$speeds",
"as" : "samp",
"in" : { "$subtract" : [ "$$samp", "$meanSpd" ] }
}
},
as: "df", in: { $multiply: [ "$$df", "$$df" ] }
} } } },
{ $unwind: "$spdDiffSqrd" },
{ $group: { _id: { mean: "$mean" }, variance: { $avg: "$spdDiffSqrd" } } }
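The arithmetic behind this pipeline is the population variance: subtract the mean from each sample, square the differences, and average them. A minimal plain-JavaScript sketch of the same three steps follows; `meanAndVariance` is a hypothetical name and the input values are made up for illustration.

```javascript
// Population variance: the mean of squared differences from the mean.
function meanAndVariance(speeds) {
  const mean = speeds.reduce((a, b) => a + b, 0) / speeds.length;
  const diffs = speeds.map((s) => s - mean);   // inner $map with $subtract
  const squared = diffs.map((d) => d * d);     // outer $map with $multiply
  const variance = squared.reduce((a, b) => a + b, 0) / squared.length; // $avg
  return { mean: mean, variance: variance };
}

const stats = meanAndVariance([2, 4, 4, 4, 5, 5, 7, 9]);
console.log(stats); // { mean: 5, variance: 4 }
```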
Historic Analysis
How do weather and road conditions affect traffic?
The ask: what are the average speeds per weather, status, and pavement condition?
MapReduce
function map() {
for( var i = 0; i < this.data.length; i++ ) {
emit (
this.conditions.weather,
{ speed : this.data[i].speed }
);
emit (
this.conditions.status,
{ speed : this.data[i].speed }
);
emit (
this.conditions.pavement,
{ speed : this.data[i].speed }
);
} }
Each emit produces a key/value pair for one data point, for example:
"Snow", 34
"Icy spots", 34
"Delays", 34
MapReduce
Weather: "Rain", speed: 44
Weather: "Rain", speed: 39
Weather: "Rain", speed: 46
MapReduce
function reduce ( key, values ) {
// Start the count at 0 so only the values actually seen are counted
var result = { count : 0, speedSum : 0 };
values.forEach( function( v ){
// reduce may be re-run on its own output, so accept both
// mapped values { speed } and partial results { count, speedSum }
result.speedSum += ( v.speedSum !== undefined ) ? v.speedSum : v.speed;
result.count += ( v.count !== undefined ) ? v.count : 1;
});
return result;
}
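The map and reduce steps can be tried outside the database with a small in-memory harness. The sketch below is plain JavaScript, not the MongoDB mapReduce command: `runMapReduce` and the module-level `emit` are hypothetical stand-ins, and the map here is a weather-only variant (an assumption) that emits values already in the { count, speedSum } shape, so reduce can be applied to its own output.

```javascript
// Hypothetical stand-in for the server side: collects emitted values per key.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

// Variant of the map step (assumption): one emit per data point, keyed by weather.
function map() {
  for (let i = 0; i < this.data.length; i++) {
    emit(this.conditions.weather, { count: 1, speedSum: this.data[i].speed });
  }
}

// Input and output share one shape, so partial results can be reduced again.
function reduce(key, values) {
  const result = { count: 0, speedSum: 0 };
  values.forEach(function (v) {
    result.count += v.count;
    result.speedSum += v.speedSum;
  });
  return result;
}

// Run map over every document, then reduce each key's list of values.
function runMapReduce(docs) {
  emitted.clear();
  docs.forEach(function (doc) { map.call(doc); });
  const results = {};
  emitted.forEach(function (values, key) { results[key] = reduce(key, values); });
  return results;
}

const rainDocs = [
  { conditions: { weather: "Rain" }, data: [ { speed: 44 }, { speed: 39 } ] },
  { conditions: { weather: "Rain" }, data: [ { speed: 46 } ] }
];
const mrResults = runMapReduce(rainDocs);
console.log(mrResults); // { Rain: { count: 3, speedSum: 129 } }
```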
Results
results: [
{
"_id" : "Generally Clear and Dry Conditions",
"value" : {
"count" : 902,
"speedSum" : 45100
}
},
{
"_id" : "Icy Spots",
"value" : {
"count" : 242,
"speedSum" : 9438
}
},
{
"_id" : "Light Snow",
"value" : {
"count" : 122,
"speedSum" : 7686
}
},
{
"_id" : "No Report",
"value" : {
"count" : 782,
"speedSum" : NaN
}
}
]
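The results carry only counts and sums; the average speed per condition is speedSum divided by count. MongoDB's mapReduce accepts an optional finalize function for this kind of post-processing. The sketch below is a plain-JavaScript version that can be run on its own; the `average` field name is an assumption.

```javascript
// finalize receives the key and the fully reduced value for that key.
function finalize(key, value) {
  // Guard against an empty group; a NaN speedSum still yields a NaN average.
  value.average = value.count > 0 ? value.speedSum / value.count : null;
  return value;
}

const icySpots = finalize("Icy Spots", { count: 242, speedSum: 9438 });
const noReport = finalize("No Report", { count: 782, speedSum: NaN });
console.log(icySpots.average); // 39
console.log(noReport.average); // NaN
```

The "No Report" group shows why NaN placeholders matter: a single NaN in the sum makes the whole average NaN unless such readings are filtered out before they are emitted.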
Processing Large Data Sets
• Need to break data into smaller pieces
• Process data across multiple nodes
[Diagram: data split into pieces and processed across multiple Hadoop nodes]
Benefits of the Hadoop Connector
• Increased parallelism
• Access to analytics libraries
• Separation of concerns
• Integrates with existing tool chains
Real-time Dashboard
• Drivers will be accessing the data via web, mobile devices, and navigation systems
• We need to provide current average speed, travel time and weather per road segment
Current Real-Time Conditions
{ _id : "I-87:10656",
description : "NYS Thruway Harriman Section Exits 14A - 16",
update : ISODate("2013-10-10T23:06:37.000Z"),
// Last ten minutes of speeds and times
speeds : [ 52, 49, 45, 51, ... ],
times : [ 237, 224, 246, 233, ... ],
pavement : "Wet Spots",
status : "Wet Conditions",
weather : "Light Rain",
// Pre-aggregated metrics
averageSpeed : 50.23,
averageTime : 234,
maxSafeSpeed : 53.1,
// Geo-spatially indexed road segment
location : {
"type" : "LineString",
"coordinates" : [
[ -74.056, 41.098 ],
[ -74.077, 41.104 ] ]
}
}
Maintaining the current conditions
db.linksAvg.update(
{ "_id" : linkId },
{ "$set" : { "update" : date },
"$push" : {
"times" : { "$each" : [ time ], "$slice" : -10 },
"speeds" : { "$each" : [ speed ], "$slice" : -10 }
}
})
Each update pushes the new value onto the end of the array, and $slice: -10 keeps only the ten most recent values, dropping the oldest
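The effect of $push with $each and a negative $slice can be sketched in plain JavaScript as a fixed-size sliding window. `pushAndSlice` is a hypothetical helper that only imitates the array result; it does not talk to MongoDB.

```javascript
// Append the new value, then keep only the last `size` elements,
// which is what $push with $each and $slice: -size leaves in the array.
function pushAndSlice(window, value, size) {
  return window.concat([value]).slice(-size);
}

const speedsBefore = [52, 49, 45, 51, 50, 48, 47, 53, 52, 50]; // ten samples
const speedsAfter = pushAndSlice(speedsBefore, 46, 10);
console.log(speedsAfter); // the oldest value (52) is gone, 46 is at the end
```

A window that is not yet full simply grows: `pushAndSlice([52, 49], 45, 10)` returns all three values.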
Putting it all together
Patterns common to time series
data:
• You need to store and manage an incoming
stream of data samples
• You need to compute derivative data sets based
on these samples
• You need low latency access to up-to-date data
Introducing the High Volume Data Feed
HVDF: Reference Implementation
Screech -- High Volume Data Feed engine
[Architecture diagram]
• REST Service API
• Processor Plugins: Inline, Batch, Stream
• Channel Data Storage: Raw Channel Data, Aggregated Rollup T1, Aggregated Rollup T2
• Query Processor
• Streaming spout
• Custom Stream Processing Logic
• Incoming Sample Stream: POST /feed/channel/data
• Real-time Queries: GET /feed/channeldata?time=XXX&range=YYY
HVDF:
https://github.com/10gen-labs/hvdf
Hadoop Connector:
https://github.com/mongodb/mongo-hadoop
Consulting Engineer, MongoDB Inc.
Bryan Reinero
#MongoDBWorld
Thank You

MongoDB for Time Series Data Part 2: Analyzing Time Series Data Using the Aggregation Framework and Hadoop

Editor's Notes

  • #3, #44 to #46: Reports (grouping, summing, averaging). Analytics (incremental reporting, rollups). Analysis (trends, segmentation, anomalies). Analytics (regression, forecasting, filtering). Warehousing (long-term storage and simplified querying).
  • #7 to #11, #13: Compound unique index on linkId and interval. The update field is used to identify new documents for aggregation.
  • #12, #18 to #23, #39 to #42: Priority: floating point number between 0 and 1000. Highest member that is up to date wins. Up to date == within 10 seconds of primary. If a higher priority member catches up, it will force an election and win. Slave delay: lags behind master by a configurable time delay. Automatically hidden from clients. Protects against operator errors: fat fingering, application corrupts data.