@gamussa @hazelcast #oraclecode
IN-MEMORY ANALYTICS
with APACHE SPARK and
HAZELCAST
Who am I?
Solutions Architect
Developer Advocate
@gamussa in internetz
Please, follow me on Twitter
I’m very interesting ©
What’s Apache Spark?
Lightning-Fast Cluster Computing
Run programs up to 100x
faster than Hadoop
MapReduce in memory,
or 10x faster on disk.
When to use Spark?
Data Science Tasks
when questions are unknown
Data Processing Tasks
when you have too much data
You’re tired of Hadoop
Spark Architecture
RDD
Resilient Distributed Datasets (RDD)
are the primary abstraction in Spark –
a fault-tolerant collection of elements that can be
operated on in parallel
RDD Operations
operations on RDDs:
transformations and actions
transformations are lazy
(not computed immediately)
the transformed RDD gets recomputed
when an action is run on it (default)
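A minimal plain-Java sketch of the lazy behaviour described above. It does not use Spark: LazyDataset, LazyDemo and every method name here are hypothetical stand-ins made up for illustration. A transformation only records a recipe, an action runs the recipe, and the recipe runs again on every action unless the dataset is cached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/** Toy stand-in for an RDD (hypothetical, not Spark): holds a recipe, not data. */
final class LazyDataset<T> {
    private final Supplier<List<T>> lineage; // how to rebuild the elements
    private boolean cacheEnabled;
    private List<T> cached;                  // filled by the first action after cache()

    private LazyDataset(Supplier<List<T>> lineage) {
        this.lineage = lineage;
    }

    static <T> LazyDataset<T> of(List<T> source) {
        return new LazyDataset<>(() -> new ArrayList<>(source));
    }

    /** Transformation: records the function, computes nothing yet. */
    <R> LazyDataset<R> map(Function<T, R> fn) {
        return new LazyDataset<>(() -> {
            List<R> out = new ArrayList<>();
            for (T item : compute()) {
                out.add(fn.apply(item));
            }
            return out;
        });
    }

    /** Asks the dataset to keep its elements after the first action. */
    LazyDataset<T> cache() {
        cacheEnabled = true;
        return this;
    }

    /** Action: triggers the computation and returns the elements. */
    List<T> collect() {
        return compute();
    }

    private List<T> compute() {
        if (!cacheEnabled) {
            return lineage.get(); // default: recompute on every action
        }
        if (cached == null) {
            cached = lineage.get();
        }
        return cached;
    }
}

final class LazyDemo {
    static final AtomicInteger calls = new AtomicInteger();

    /** Counts invocations so the laziness is observable. */
    static Integer square(Integer x) {
        calls.incrementAndGet();
        return x * x;
    }

    public static void main(String[] args) {
        LazyDataset<Integer> squares =
                LazyDataset.of(Arrays.asList(1, 2, 3)).map(LazyDemo::square);
        System.out.println("calls after map: " + calls.get());
        System.out.println(squares.collect());
        System.out.println(squares.collect());
        System.out.println("calls after two actions: " + calls.get());
    }
}
```

In this sketch the call counter stays at zero after map and grows by three on each collect, which mirrors the "recomputed when an action is run" default. Calling cache() before the first action makes later actions reuse the stored elements.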
RDD
Transformations
RDD
Actions
RDD
Fault Tolerance
RDD
Construction
parallelized collections
take an existing Scala collection
and run functions on it in parallel
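A rough single-JVM analogy in plain Java, assuming java.util.stream parallel streams as a stand-in for a cluster. In Spark's Java API the entry point for this is JavaSparkContext.parallelize(list); the class and method names below are hypothetical and exist only for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch: run functions over an existing collection in parallel. */
final class ParallelCollectionSketch {

    /** Applies a function to every element; the work may be split across threads. */
    static List<Integer> doubled(List<Integer> input) {
        return input.parallelStream()
                .map(x -> x * 2)
                .collect(Collectors.toList()); // encounter order of the list is kept
    }

    /** Reduces the collection to a single value in parallel. */
    static int sum(List<Integer> input) {
        return input.parallelStream()
                .mapToInt(Integer::intValue)
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4);
        System.out.println(doubled(data));
        System.out.println(sum(data));
    }
}
```

The difference that matters for the talk: a parallel stream splits work across threads in one process, while a parallelized collection in Spark is split into partitions that are processed on the nodes of a cluster.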
Hadoop datasets
run functions on each record of a file in Hadoop distributed
file system or any other storage system supported by
Hadoop
What’s Hazelcast IMDG?
The Fastest In-memory Data Grid
Hazelcast IMDG
is an operational,
in-memory,
distributed computing platform
that manages data using
in-memory storage, and
performs parallel execution for
breakthrough application speed
and scale
High-Density
Caching
In-Memory
Data Grid
Web Session
Clustering
Microservices
Infrastructure
What’s Hazelcast IMDG?
In-memory Data Grid
Apache v2 Licensed
Distributed
Caches (IMap, JCache)
Java Collections (IList, ISet, IQueue)
Messaging (Topic, RingBuffer)
Computation (ExecutorService, M-R)
[Diagram legend: Primary, Backup, Shard]
// Connector settings are passed to Spark as ordinary configuration properties
final SparkConf sparkConf = new SparkConf()
        // Hazelcast cluster to connect to, with its group credentials
        .set("hazelcast.server.addresses", "localhost")
        .set("hazelcast.server.groupName", "dev")
        .set("hazelcast.server.groupPass", "dev-pass")
        // Batch sizes for reading from and writing to Hazelcast
        .set("hazelcast.spark.readBatchSize", "5000")
        .set("hazelcast.spark.writeBatchSize", "5000")
        .set("hazelcast.spark.valueBatchingEnabled", "true");

// Spark master URL, application name, configuration
final JavaSparkContext jsc = new JavaSparkContext("spark://localhost:7077", "app", sparkConf);

// Wrap the Spark context to expose Hazelcast maps and caches as RDDs
final HazelcastSparkContext hsc = new HazelcastSparkContext(jsc);
final HazelcastJavaRDD<Object, Object> mapRdd = hsc.fromHazelcastMap("movie");
final HazelcastJavaRDD<Object, Object> cacheRdd = hsc.fromHazelcastCache("my-cache");
Demo
LIMITATIONS
DATA SHOULD NOT BE
UPDATED WHILE READING
FROM SPARK
WHY ?
MAP EXPANSION
SHUFFLES THE DATA
INSIDE THE BUCKET
CURSOR DOESN’T POINT TO
CORRECT ENTRY ANYMORE,
DUPLICATE OR MISSING
ENTRIES COULD OCCUR
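A toy plain-Java simulation of this failure mode. It is not Hazelcast's internal data structure: ToyHashTable, its positional cursor and the growth threshold are all assumptions made up for illustration. A reader fetches keys in batches and remembers only how far it got; a write between two batches expands the table and reshuffles the buckets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bucketed hash set of distinct int keys, read in batches by position. */
final class ToyHashTable {
    private List<List<Integer>> buckets;
    private int size;

    ToyHashTable(int capacity) {
        buckets = newBuckets(capacity);
    }

    private static List<List<Integer>> newBuckets(int capacity) {
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < capacity; i++) {
            result.add(new ArrayList<>());
        }
        return result;
    }

    /** Inserts a key (assumed not present yet), expanding the table when it gets crowded. */
    void put(int key) {
        if (size + 1 > buckets.size() * 3 / 2) {
            expand();
        }
        buckets.get(Math.floorMod(key, buckets.size())).add(key);
        size++;
    }

    /** Doubles the capacity and rehashes, which moves keys to other buckets. */
    private void expand() {
        List<List<Integer>> old = buckets;
        buckets = newBuckets(old.size() * 2);
        for (List<Integer> bucket : old) {
            for (int key : bucket) {
                buckets.get(Math.floorMod(key, buckets.size())).add(key);
            }
        }
    }

    /** Returns up to batchSize keys, starting at position cursor in bucket order. */
    List<Integer> fetch(int cursor, int batchSize) {
        List<Integer> flat = new ArrayList<>();
        for (List<Integer> bucket : buckets) {
            flat.addAll(bucket);
        }
        int from = Math.min(cursor, flat.size());
        int to = Math.min(from + batchSize, flat.size());
        return new ArrayList<>(flat.subList(from, to));
    }
}

final class CursorDemo {
    /** Reads all keys in batches of 3 while a write lands after the first batch. */
    static List<Integer> readWithConcurrentWrite() {
        ToyHashTable table = new ToyHashTable(4);
        for (int key = 1; key <= 6; key++) {
            table.put(key);
        }
        List<Integer> seen = new ArrayList<>();
        int cursor = 0;

        List<Integer> batch = table.fetch(cursor, 3);
        seen.addAll(batch);
        cursor += batch.size();

        table.put(7); // update during the read: triggers expansion and rehash

        while (!(batch = table.fetch(cursor, 3)).isEmpty()) {
            seen.addAll(batch);
            cursor += batch.size();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(readWithConcurrentWrite()); // prints [4, 1, 5, 4, 5, 6, 7]
    }
}
```

In this run keys 4 and 5 are returned twice and keys 2 and 3 are never returned, although none of them was modified. The only change was an insert that expanded the table between two batches.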
github.com/hazelcast/hazelcast-spark
THANKS!
Any questions?
You can find me at
@gamussa
viktor@hazelcast.com
[OracleCode SF] In memory analytics with apache spark and hazelcast