Next CERN Accelerator Logging Service Architecture
Jakub Wozniak, CERN
Agenda
• What is (Next) CALS?
• NXCALS Architecture
• Meta-data Service & Ingestion API
• Spark Extraction API
Controls Data Logging
• Provide access to current & historical device state
– Monitoring & controls of the machines
– Improve machine/beam performance
– Various studies (new beam types, experiments, machines)
• Required to deliver quality beam to experiments
• Not physics data from experiments!
CERN Accelerator Logging Service
• Old system (CALS) based on Oracle (2 DBs)
– ~20,000 devices (from ~120,000 devices)
– 1,500,000 signals
– 5,000,000 extractions per day
– 71,000,000,000 records per day
• 2 TB / day (unfiltered data, 2 DBs)
– 1 PB of total data storage (data heavily filtered, up to 95%)
Current Controls Data Storage
[Chart: logged data volume across Run 1, LS 1, and Run 2, growing to ~900 GB/day]
Current Issues With CALS
• Performance / scalability problems
– Difficult to scale horizontally
– “… to extract 24h of data takes 12h”
• Other issues
– Problems with big payloads (payloads vary from KB to GB)
– Limited & rigid table structure & limited types (no nested types)
– Limited integration with heterogeneous analytics tools (Python, Matlab, R, Java, …)
• CALS & tools not ready for Big Data!
– Have to extract data to do analysis!
Big Data For Controls?
[Diagram: from CALS on Oracle, via Impala/Kudu(?), to Next CALS (NXCALS)]
Next CERN Accelerator Logging Service (Kafka, Hadoop, Spark)
Controls Data
Readings from devices / properties (with fields inside)
Timeseries of records
Device X / Property Y (time & values): t0: { f1, f2, f3 } (schema 1)
t1: { f1, f2, f3 }
t2: { f1, f2, f3 }
…
t3: { f1, f2, f3, f4 } (schema 2)
t4: { f1, f2, f3, f4 }
…
tN: { f1, f2, f3, f5, …, fN } (schema N)
Devices get updated, so the schema changes over time!
Generic Storage System
• Different Controls Systems for different domains
• Not only Device/Property model
Let's generalize and define an abstraction
Call it Entity…
…and just arbitrary Records
Record: Key -> Values (with timestamp & partition); see the sketch below
Not limited to Controls nor CERN!
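A minimal sketch of this Entity/Record abstraction, assuming Java 16+ record syntax; the type and field names are illustrative, not the actual NXCALS API:

import java.util.Map;

// A generic record: entity keys identify the Entity, partition keys drive
// storage grouping, and the payload fields are arbitrary key -> value pairs.
record GenericRecord(Map<String, Object> entityKeys,
                     Map<String, Object> partitionKeys,
                     long timestamp,
                     Map<String, Object> fields) {}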
Some Requirements
• Discover entities from records
– Avoids static / offline registration in advance
• Allow searching for entity meta-data
– What are the known entities?
– How are they partitioned?
– With what schemas?
• Store & extract data
• Data access
– Online monitoring (simple extraction but must have low latency data access)
– Offline analysis (provide visualization tools for more complex analysis)
NXCALS Architecture
[Architecture diagram: data sources publish via the NXCALS API and ETL into Kafka; log processing moves data into Hadoop (HDFS with Avro & Parquet, and HBase); a Meta-data service DB tracks entities; Spark serves extraction for clients such as Jupyter, the old API, applications, scientists, and programmers.]
Design Choices
• Why Hadoop?
– Provided as a service at CERN (IT/DB group)
• Why Kafka?
– Redundancy & data safety (if Hadoop is not available)
– Low latency streaming API for extraction
• Why HBase?
– Fast, low latency for online monitoring queries
– Gives time for data deduplication & compaction into Parquet files
• Why Parquet as final storage?
– Open, columnar, storage-efficient format with good compression
– Good performance for extraction (see the sketch after this list)
• predicate push-down
• column projection
– Easy to understand, access (even outside of the system), backup, etc.
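A minimal sketch of those two optimizations as seen from Spark; the path, column names, and session setup are illustrative:

import static org.apache.spark.sql.functions.col;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder().getOrCreate();

// Only two columns are read from disk (column projection) and the equality
// filter can be evaluated inside the Parquet reader (predicate push-down).
Dataset<Row> rows = spark.read()
    .parquet("hdfs:///project/nxcals/example/data.parquet")
    .select("device", "field1")
    .filter(col("device").equalTo("NXCALS_MONITORING_DEV1"));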
Data Flow
• Ingestion API to send data to Kafka (as Avro)
• ETL extracts it from Kafka towards
– HDFS (as Avro, into staging folders)
– HBase (as Avro, for low latency)
• Avro files are deduplicated & compacted into larger Parquet files (with Spark); see the sketch after this list
– Hadoop-friendly process, avoids many small files
• Spark Extraction API for data access
• Meta-data service knows the location of objects in files
– Avoids scanning many files
– “Replacement” for missing indexes
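A minimal sketch of such a compaction job, assuming Spark 2.4+ with the built-in Avro data source available; the paths are illustrative:

// Read the small staged Avro files, drop duplicate records, and rewrite
// them as fewer, larger Parquet files (reusing the SparkSession from above).
Dataset<Row> staged = spark.read().format("avro")
    .load("hdfs:///project/nxcals/staging/2017-10-10/");
staged.dropDuplicates()
    .coalesce(8)                      // fewer, larger output files
    .write().mode("append")
    .parquet("hdfs:///project/nxcals/data/2017-10-10/");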
Devops?
• Microservice architecture
• Monitoring is crucial, done using
– Prometheus
– Alertmanager
– Grafana
– Logs sent to Elastic (external)
• Fully automated CI/CD with
– Jenkins pipelines
– Ansible deployment
Meta-data Service
Data Types
• Data (records):
– Kafka -> Hadoop (HBase, HDFS)
• Meta-data (info about data)
– RDBMS (Oracle)
Domain Description
• The system stores state changes of abstract entities in the form of records
– Data identified by entity keys and timestamp
– “Extended” timeseries data
• Record = { f1=v1 ,…, fn=vn } (at t1)
– Any fields
– Some fields are special (entity keys, partition keys, timestamp)
– Set of fields => Schema
• Records are split (grouped in different files on disk) by:
– Time, partition (classifier), schema
• Fields can change over time {f1…fm} (at tx)
– History of record structure changes (schema changes)
Meta Data Objects
• ENTITY – abstract object we store data for
– Identified by known record fields (primary key)
• PARTITION – classifier used to group data into files on disk
– Identified by known record fields (primary key)
• SCHEMA – a given set of all of a record's fields
Meta Data Objects
• SYSTEM – defines record type (special fields)
– Field names identifying ENTITY
– Field names identifying PARTITION
– Field names identifying TIMESTAMP
• ENTITY-HISTORY – history of SCHEMA & PARTITION changes of an ENTITY over time
• VARIABLE – alias for an ENTITY
– the whole record
– or a single field in the record
• VARIABLE-HISTORY – VARIABLE configuration over time
– Pointer (alias) to an entity and field, with time information
Java Ingestion API Example
// Create data publisher
Publisher<ImmutableData> publisher =
    PublisherFactory.newInstance().createPublisher("MOCK-SYSTEM", (d) -> d);
// Create data (ImmutableData == Map<String, Object>)
ImmutableData data = ImmutableData.builder()
    .add("device", "NXCALS_MONITORING_DEV1") // entity key
    .add("property", "Setting")              // entity key (and partition key)
    .add("class", "MONITORING")              // partition key
    .add("timestamp", Instant.now())         // timestamp key
    .add("byteField1", (byte) 2)
    .add("shortField1", (short) 1)
    .build();
// Publish data
CompletableFuture<Void> future = publisher.publish(data);
// Handle Future completion or error
future.whenComplete((v, e) -> { if (e != null) { /* handle errors */ } });
Data Partitioning
System [sid], { entity_keys, partition_keys, timestamp, field1…fieldN } = record
hdfs:///project/nxcals/sid/partition_id/schema_id/date/data.parquet (see the path sketch below)
A simple example for the device domain (CMW):
• System CMW defines:
– Entity keys as device, property
– Partition keys as class, property
– Timestamp keys (acq or cycle stamp)
So one data.parquet file will contain data for devices from the same class/property.
A file always contains records of the same schema!
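A minimal sketch of how the layout rule above could resolve to a concrete file path; the identifiers and their values are hypothetical stand-ins for values assigned by the meta-data service:

// Hypothetical identifiers resolved via the meta-data service.
long systemId = 1;      // sid of the CMW system
long partitionId = 42;  // a given class/property combination
long schemaId = 7;      // current schema of the records
String date = "2017-10-10";
String path = String.format(
    "hdfs:///project/nxcals/%d/%d/%d/%s/data.parquet",
    systemId, partitionId, schemaId, date);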
Meta Store Efficiency
• Meta-data is cached
• Ingestion API calls the meta-store only on:
– Entity creation
– Entity change (schema change / rename / …)
– Cache misses
• So calls are rare compared to the data rate (see the caching sketch below)
– Calls to the meta-store are expensive (10-50 ms)
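A minimal sketch of that client-side caching pattern, assuming the Caffeine library; EntityInfo and metaDataService are hypothetical names, not the actual NXCALS types:

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

// The remote meta-store is contacted only on a cache miss (new or changed
// entity); hot entities are served from local memory.
LoadingCache<String, EntityInfo> entityCache = Caffeine.newBuilder()
    .maximumSize(100_000)
    .build(entityKey -> metaDataService.findEntity(entityKey)); // 10-50 ms call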
Meta-Store Features
• Entities are created dynamically from records
• Schemas are discovered and saved with history
• Records (entities) can change schemas over time
• Schema changes handled at extraction
– using history from meta-data service
Spark Extraction API
API for Spark Extraction
• Extension to Spark sources package
– Extends BaseRelation, implements PrunedFilteredScan
– sparkSession.read().format("cern.accsoft.nxcals.data.access.api").load()
• Hides data source & implementation details
– HBase for most recent data (<36 hours)
– HDFS for older data (>36 hours, due to compaction)
• Merges schemas using schema history
• Greatly simplifies data access
Spark Extraction Example
SparkSession sparkSession = … // create session
Dataset<Row> dataset = DataAccessQueryBuilder
    .system("MOCK-SYSTEM")
    .keyValue("device", "NXCALS_MONITORING_DEV1") // entity key
    .keyValue("property", "Setting")              // entity key
    .startTime("2017-10-10 00:00:00.0")           // time window start
    .duration(Duration.ofDays(2))                 // time window length
    .fields("device", "intField1", "doubleField")
    .buildDataset(sparkSession);
Record Schema, Spark Default
Entity A evolves over time:
Record 1: {acqStamp, field1 (double), field2 (integer)}
…
Record 2: {acqStamp, field1 (float), field21 (long)} // rename: field2 = field21
…
Record 3: {acqStamp, field3 (double)} // only field3
Can you quickly extract & union datasets containing those records?
org.apache.spark.sql.AnalysisException:
Union can only be performed on tables with the same number of columns
It can be done, but it is troublesome for scientists! (see the repro sketch below)
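A minimal repro of that failure with plain Spark SQL, using illustrative literal rows:

// Two snapshots of entity A with different column sets.
Dataset<Row> r1 = sparkSession.sql(
    "SELECT 1L AS acqStamp, 1.0D AS field1, 2 AS field2");
Dataset<Row> r2 = sparkSession.sql(
    "SELECT 2L AS acqStamp, CAST(1.5 AS FLOAT) AS field1, 7L AS field21");
// Positional union requires the same number of columns:
r1.union(r2); // throws org.apache.spark.sql.AnalysisException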
Schema Merging
For the same three records, NXCALS merges all fields into one schema:
Schema: {acqStamp (long), field1 (double), field2 (integer), field21 (long), field3 (double)}
Each record is mapped into this merged schema; fields a record does not have are left null (see the sketch below).
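A minimal sketch of the same merge expressed with plain Spark, assuming Spark 3.1+ where unionByName can pad missing columns with nulls (reusing r1 and r2 from the repro above):

Dataset<Row> r3 = sparkSession.sql("SELECT 3L AS acqStamp, 2.5D AS field3");
// Name-based union fills missing columns with nulls; compatible numeric
// types are widened (field1: float -> double), yielding the merged schema.
Dataset<Row> merged = r1.unionByName(r2, true)   // true = allowMissingColumns
                        .unionByName(r3, true);
merged.printSchema(); // acqStamp, field1, field2, field21, field3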
… With Field Aliases
With new_field defined as an alias of both field2 and field21, the same three records merge into a more compact schema:
Schema: {acqStamp (long), field1 (double), new_field (long), field3 (double)}
Variables
• A pointer to a field of an entity record within a time window
• Can point to different entities over time
• No need for real entity
• Useful for abstractions (“LHC_Beam_Intensity”)
Variable Extraction API
SparkSession sparkSession = … // create session
Dataset<Row> dataset = VariableQueryBuilder
    .variable("NXCALS_MONITORING_VARIABLE")
    .startTime("2017-10-10 00:00:00.0")
    .duration(Duration.ofDays(2))
    .buildDataset(sparkSession);
Variables Configuration
Schema: {variable (String), acqStamp (long), value (double)}
Entity 1: {acqStamp, field1 (float), field21 (long)}
Entity 2: {acqStamp, field2 (double)}
Entity 3: {acqStamp, field1 (array2D), field3 (float)}
Variable configuration changes over time (see the projection sketch below)
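A minimal sketch of how a variable could project one entity's field into the uniform {variable, acqStamp, value} shape above; entityData and the chosen field are illustrative:

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;

// Project the configured field of the underlying entity (here Entity 2's
// field2) into the uniform variable schema.
Dataset<Row> variableRows = entityData.select(
    lit("LHC_Beam_Intensity").as("variable"),
    col("acqStamp"),
    col("field2").cast("double").as("value"));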
Why Simplified Extraction?
• Data producers ≠ data consumers
• At CERN, different groups handle
– Equipment & Device/Property design (low level)
– Physics & beam-oriented analysis (high level)
Summary
• NXCALS is a generic Big Data storage system
• Timeseries-like records of changing structure
– Arbitrary entity & partition keys
• Java Ingestion API
• Spark Extraction API (Java, Python, Scala)
Questions?
• NXCALS code:
– https://gitlab.cern.ch/acc-logging-team/nxcals
• Contact us:
– jakub.wozniak@cern.ch
– acc-logging-team@cern.ch