Development of Methods for Deep Neural
Network Architectures Optimization based on
Tensor Factorization Algorithms
Supervisor: Revin Ilya Evgenyevich, research associate,
Laboratory for Composite AI, Research Center “Strong AI in Industry”
Presented by: Zakharov Denis, J4232c
Pain: Modern deep networks have an excessive number of parameters, which leads to
long training times and high costs for storing weights in production.
Hypothesis: By applying an optimization method we can reduce the size of the
stored model and increase throughput.
Work: Development of Methods for Deep Neural Network Architectures
Optimization based on Tensor Factorization Algorithms
Problem
2
Purpose and objectives of the study
Goal
Development of Methods for Deep Neural Network Architectures Optimization based on Tensor
Factorization Algorithms
Objectives
• Conduct a literature review of the related field to provide background
• Study tensor algorithms (optimization, operations)
• Study the LoRA approach
• Perform experiments with time series (TS) models
• Based on these findings, propose a product that can be integrated into an AutoML solution
• Develop the optimization method
• Contribute it to the Fedot.Industrial framework
3
Modern Networks
4
Model trend: number of parameters by year (2018–2024)
• BERT 340M
• T5 11B
• GPT-3 175B
• Megatron-Turing 530B
• GPT-4 1.76T
• Gemini Pro ≈30T
• Gemini Ultra ≈60T
Tensor Decomposition
5
[Diagrams of popular tensor decompositions]
• t-SVD (Singular Value Decomposition): X ≈ U ∗ S ∗ Vᵀ, where ∗ is the t-product
• CANDECOMP/PARAFAC (CP): X ≈ Σ_{r=1..R} a_r ∘ b_r ∘ c_r
• Tucker: X ≈ G ×₁ A ×₂ B ×₃ C
• Block Term: X ≈ Σ_{r=1..R} G_r ×₁ A_r ×₂ B_r ×₃ C_r
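A minimal sketch of computing CP and Tucker decompositions with the TensorLy library (assumed available); the tensor shape and ranks are toy values, not those used in this work:

```python
# Minimal sketch: CP and Tucker decompositions via TensorLy (assumed installed).
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac, tucker

X = tl.tensor(np.random.rand(8, 6, 4))   # toy 3-way tensor

# CP: X ≈ sum_r a_r ∘ b_r ∘ c_r
cp_factors = parafac(X, rank=3)
X_cp = tl.cp_to_tensor(cp_factors)

# Tucker: X ≈ G ×1 A ×2 B ×3 C
core, factors = tucker(X, rank=[3, 3, 2])
X_tucker = tl.tucker_to_tensor((core, factors))

print("CP relative error:    ", float(tl.norm(X - X_cp) / tl.norm(X)))
print("Tucker relative error:", float(tl.norm(X - X_tucker) / tl.norm(X)))
```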
Low-rank decomposition
6
M ≈ L_k × R_k, where M is m×n, L_k is m×k, and R_k is k×n
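Such a rank-k factorization can be obtained, for example, with a truncated SVD; a minimal NumPy sketch with illustrative sizes and rank:

```python
# Minimal sketch: rank-k approximation M ≈ L_k @ R_k via truncated SVD.
import numpy as np

m, n, k = 512, 256, 16
M = np.random.randn(m, n)

U, S, Vt = np.linalg.svd(M, full_matrices=False)
L_k = U[:, :k] * S[:k]   # m×k factor (singular values absorbed)
R_k = Vt[:k, :]          # k×n factor

M_k = L_k @ R_k
print("stored parameters:", L_k.size + R_k.size, "vs original:", M.size)
print("relative error:", np.linalg.norm(M - M_k) / np.linalg.norm(M))
```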
Statement of Experiment
7
Evaluate the performance of the time series models NBEATS, Transformer, and ARIMA
on the M4 dataset to check how well they forecast time series from historical data.
Models
8
Transformer — a state-of-the-art deep learning model introduced in 2017.
It is an encoder-decoder architecture whose core feature is the
multi-head attention mechanism.
[Diagrams: Scaled Dot-Product Attention (MatMul → Scale → Mask → SoftMax → MatMul), Multi-Head Attention (Linear ×3 → Scaled Dot-Product Attention → Concat → Linear), and the full encoder-decoder Transformer architecture with Add & Norm and Feed Forward layers]
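A minimal PyTorch sketch of the scaled dot-product attention at the core of multi-head attention; the shapes and masking below are illustrative, not the exact configuration used in the experiments:

```python
# Minimal sketch: scaled dot-product attention (the core of multi-head attention).
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))   # MatMul + Scale
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))  # Mask
    weights = torch.softmax(scores, dim=-1)                    # SoftMax
    return weights @ v                                         # MatMul

q = k = v = torch.randn(2, 4, 10, 16)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 4, 10, 16])
```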
Models
9
NBEATS — a deep learning-based approach to time series forecasting.
ARIMA — a popular statistical model used to forecast future values of a time
series from its past values.
[Diagram: NBEATS architecture; each block is an FC stack producing a backcast and a forecast; Blocks 1…K form a stack, and Stacks 1…M are combined into the global forecast]
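A minimal sketch of a single NBEATS-style block with a generic basis: a fully connected stack that emits a backcast and a forecast; the layer sizes are illustrative assumptions:

```python
# Minimal sketch: one NBEATS-style block (generic basis) producing backcast + forecast.
import torch
import torch.nn as nn

class NBeatsBlock(nn.Module):
    def __init__(self, backcast_len=64, forecast_len=8, hidden=128, n_layers=4):
        super().__init__()
        layers, width = [], backcast_len
        for _ in range(n_layers):            # the FC stack
            layers += [nn.Linear(width, hidden), nn.ReLU()]
            width = hidden
        self.fc_stack = nn.Sequential(*layers)
        self.backcast_head = nn.Linear(hidden, backcast_len)
        self.forecast_head = nn.Linear(hidden, forecast_len)

    def forward(self, x):
        h = self.fc_stack(x)
        return self.backcast_head(h), self.forecast_head(h)

backcast, forecast = NBeatsBlock()(torch.randn(32, 64))
print(backcast.shape, forecast.shape)  # (32, 64) (32, 8)
```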
M4 Dataset
10
The M4 dataset is a collection of 100,000 time series used for the fourth
edition of the Makridakis Forecasting Competition.
It consists of series of different frequencies (average training length in observations):
• Yearly — 63
• Quarterly — 125
• Monthly — 302
• Weekly — 2035
• Daily — 475
• Hourly — 682
Experiment - Monthly
11
[Forecast plots on monthly M4 series: ARIMA, NBEATS, Transformer]
Interpretation of results
12
On short series, all three models performed well and are suitable for forecasting.
However, on longer ranges:
• NBEATS performs better than the other models
• Some models show a critical difference in their predictions
• At even longer ranges, all models struggle
Training for just 50 epochs took:
• Almost an hour for NBEATS
• 20 minutes for the Transformer
• 1 minute for ARIMA
LoRA + rSVD
LoRA: a frozen pretrained weight W ∈ ℝ^{d×k} is adapted with a low-rank update of rank r, h = Wx + BAx, where A is initialized from N(0, σ²) and B = 0.
rSVD: the weight matrix is additionally factorized with a truncated (randomized) SVD, W ≈ U Σ Vᵀ, and the rank-r factors are used to recover the low-rank pair.
[Diagram: LoRA adapter around the frozen weight; SVD factorization and recovery of the low-rank matrices]
13
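A minimal sketch of a LoRA-style linear layer whose low-rank factors can optionally be initialized from a truncated SVD of the frozen weight. The exact rSVD initialization and scaling used in this work are not reproduced here; treat the choices below as assumptions for illustration:

```python
# Minimal sketch: LoRA-style linear layer with optional SVD-based initialization.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 4, init_from_svd: bool = False):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weight
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # A ~ N(0, sigma^2)
        self.B = nn.Parameter(torch.zeros(d_out, r))        # B = 0
        if init_from_svd:
            # Recover the top-r directions of the frozen weight: W ≈ U Σ Vᵀ.
            # (Some schemes also subtract the recovered part from W; omitted here.)
            U, S, Vh = torch.linalg.svd(base.weight, full_matrices=False)
            self.B.data = U[:, :r] * S[:r]
            self.A.data = Vh[:r, :]

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T       # h = Wx + B A x

layer = LoRALinear(nn.Linear(64, 64), r=8, init_from_svd=True)
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```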
Results NBEATS
[Diagram: base model with LoRA layers]
Model latency throughput
No LoRA 0.00108 4364808.0
LoRA Layer 0.00166 4159606.0
MS Default 0.00106 4748564.0
MS All 0.00108 4482670.0
MS LoRA 0.00105 5365210.0
14
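For context, latency and throughput numbers like those in the table can be measured roughly as below; the actual benchmarking protocol (batch size, warm-up, hardware) is not given on the slide, so this setup is an assumption:

```python
# Minimal sketch: measuring forward-pass latency and throughput of a model.
import time
import torch

def benchmark(model, sample, n_iters=100):
    model.eval()
    with torch.no_grad():
        for _ in range(10):                       # warm-up runs
            model(sample)
        start = time.perf_counter()
        for _ in range(n_iters):
            model(sample)
        elapsed = time.perf_counter() - start
    latency = elapsed / n_iters                        # seconds per forward pass
    throughput = sample.shape[0] * n_iters / elapsed   # samples per second
    return latency, throughput

lat, thr = benchmark(torch.nn.Linear(64, 64), torch.randn(4096, 64))
print(f"latency={lat:.5f} s, throughput={thr:.0f} samples/s")
```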
Results Transformer
Model latency throughput
No LoRA 0.00085 9306328.0
LoRA Layer 0.00084 9782205.0
MS Default 0.00085 9572622.0
MS All 0.00085 9463282.0
MS LoRA 0.00084 8586088.0
[Chart: model training with early stopping; metric per epoch (epochs 1–8), y-axis 0–0.07, Base vs LoRA]
15
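A minimal sketch of a training loop with early stopping on the validation loss, in the spirit of the chart above; the patience, optimizer, and stopping criterion are assumptions:

```python
# Minimal sketch: training with early stopping on the validation loss.
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_with_early_stopping(model, train_dl, val_dl, loss_fn,
                              max_epochs=50, patience=3):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    best_val, bad_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_dl:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_dl)
        if val < best_val:
            best_val, bad_epochs = val, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:   # stop once validation stops improving
                break
    return model

xs, ys = torch.randn(256, 16), torch.randn(256, 1)
dl = DataLoader(TensorDataset(xs, ys), batch_size=32)
train_with_early_stopping(torch.nn.Linear(16, 1), dl, dl, torch.nn.MSELoss())
```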
LoRA Implementation
14
Summary
In this work:
• Conducted an experiment to evaluate the performance of:
  • NBEATS
  • ARIMA
  • Transformer
• Reasoned that layers in these models can be replaced using the LoRA approach
• Implemented this logic as part of the master thesis
15
TN diagrams of some popular decompositions
THANK YOU
FOR YOUR TIME!
@misterzurg
