USING PARALLEL PROGRAMMING TO
IMPROVE PERFORMANCE OF IMAGE
PROCESSING
Chan Le – KAIST ’13
INTRODUCTION
   • Me:
       - Chan Le, 3rd-year undergraduate student
       - Double major in Computer Science & Management Science
       - Vietnamese, KAIST ’13

   • Professor:
       - Won-Ki Jeong
       - GPU-accelerated large-scale biomedical image processing

   • Project:
       - Apply parallel programming to improve the performance of image processing
MOTIVATION
   • Biomedical researchers work with images
       - Very large images
           - They take a long time to process
       - Raw images are hard to analyze and use for research
       - Sometimes very noisy
           - They need preprocessing before use
   • Image preprocessing with serial algorithms is slow
   • Parallel computing is now developing quickly
       - Thanks to the popularity of multi-core CPUs and GPUs
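RELATED WORKS: USING PDE IN NOISE REDUCTION
   • Five-point stencil around the pixel at (x,y), whose value at time t is $I_t$
       - Neighbors: $I_N$ at (x,y+1), $I_S$ at (x,y-1), $I_W$ at (x-1,y), $I_E$ at (x+1,y)
       - Differences: $\Delta_W = I_W - I_t$, and likewise $\Delta_N$, $\Delta_E$, $\Delta_S$
   • Heat equation
       - At every pixel (x,y) of the image at time t:
           $I_{t+1} = I_t + \Delta I$
           $\Delta I = (\Delta_W + \Delta_N + \Delta_E + \Delta_S) / 4$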
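The update rule above can be sketched serially in plain Python. This is only an illustration, not the project's code: the function name `heat_step` is hypothetical, and the border rule (a missing neighbor contributes a zero difference) is an assumption, since the slides do not say how borders are handled.

```python
def heat_step(img):
    """One heat-equation step on a 2D grid stored as a list of rows.

    Assumption: a neighbor outside the image contributes a zero difference.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total = 0.0
            # West, east, south, north neighbors of (x, y).
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                if 0 <= nx < w and 0 <= ny < h:
                    total += img[ny][nx] - center
            # I_{t+1} = I_t + (dW + dN + dE + dS) / 4
            out[y][x] = center + total / 4
    return out


# A single bright (noisy) pixel spreads into its four neighbors.
noisy = [[0.0, 0.0, 0.0],
         [0.0, 8.0, 0.0],
         [0.0, 0.0, 0.0]]
smoothed = heat_step(noisy)
```

Each output pixel depends only on the previous image, so every pixel can be computed independently. That independence is what makes the scheme a natural fit for one thread per pixel.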
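RELATED WORKS: ANISOTROPIC DIFFUSION
   • Paper: "Scale-space and edge detection using anisotropic diffusion" (Pietro Perona & Jitendra Malik, 1990)
       - Basic idea: add a coefficient c to each of $\Delta_W$, $\Delta_N$, $\Delta_S$, $\Delta_E$
           $\Delta I = (c_W \Delta_W + c_N \Delta_N + c_E \Delta_E + c_S \Delta_S) / 4$
   • How to calculate each c? The paper proposes two functions of the difference $\Delta$, where K is a contrast constant:
           $c = e^{-(|\Delta| / K)^2}$
           $c = \frac{1}{1 + (|\Delta| / K)^2}$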
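A minimal serial sketch of one anisotropic diffusion step follows. It assumes the two conduction functions from the Perona-Malik paper, exp(-(|Δ|/K)^2) and 1/(1 + (|Δ|/K)^2). The names `c_exp`, `c_inv` and `anisotropic_step`, the value K = 10, and the border rule (missing neighbors are skipped) are illustrative assumptions, not taken from the slides.

```python
import math


def c_exp(delta, k):
    """Conduction coefficient exp(-(|delta| / K)^2)."""
    return math.exp(-(abs(delta) / k) ** 2)


def c_inv(delta, k):
    """Conduction coefficient 1 / (1 + (|delta| / K)^2)."""
    return 1.0 / (1.0 + (abs(delta) / k) ** 2)


def anisotropic_step(img, k, conduct=c_exp):
    """One diffusion step: each neighbor difference is weighted by c."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            total = 0.0
            for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
                if 0 <= nx < w and 0 <= ny < h:  # assumed border rule
                    delta = img[ny][nx] - center
                    total += conduct(delta, k) * delta
            out[y][x] = center + total / 4
    return out


# A strong edge (difference 100, far above K) is left almost untouched,
# while a small bump (difference 2, below K) is smoothed.
step_edge = anisotropic_step([[0.0, 0.0, 100.0, 100.0]], k=10.0)
small_bump = anisotropic_step([[10.0, 12.0, 10.0]], k=10.0)
```

The coefficient is close to 1 for small differences and close to 0 for large ones, so noise is averaged away while edges are kept.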
RELATED WORKS
NVIDIA CUDA
   • Serial vs. parallel programs
       - Thread: the unit of processing
       - In the past: a CPU had only 1 core -> 1 thread at a time
       - Nowadays: multiple cores -> multiple threads at a time

   • CUDA™ is a parallel computing platform and programming model invented by NVIDIA.
       - http://www.nvidia.com/object/cuda_home_new.html

   • How could it help?
       - CPU: 1-6 cores
       - GPU: hundreds of cores
           - Can improve performance by a factor of 10 to 100, depending on the algorithm
MY IMPLEMENTATION
   • Implement anisotropic diffusion on the CUDA platform
   • 1 thread handles 1 pixel
   • Divide the image into multiple sub-regions and process them in parallel to exploit multiple cores
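The one-thread-per-pixel mapping can be illustrated by emulating CUDA's index arithmetic in plain Python. The formula x = blockIdx.x * blockDim.x + threadIdx.x is the usual CUDA convention. The 16x16 block size and the helper names below are assumptions for illustration, since the slides do not give the actual launch configuration.

```python
def grid_dim(width, height, block=(16, 16)):
    """Number of blocks (sub-regions) needed to cover the image."""
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


def pixel_of(block_idx, thread_idx, block=(16, 16)):
    """Pixel owned by one thread: blockIdx * blockDim + threadIdx per axis."""
    return (block_idx[0] * block[0] + thread_idx[0],
            block_idx[1] * block[1] + thread_idx[1])


def covered_pixels(width, height, block=(16, 16)):
    """Visit every (block, thread) pair and keep the in-bounds pixels,
    as a kernel's bounds check would."""
    gx, gy = grid_dim(width, height, block)
    pixels = []
    for b_y in range(gy):
        for b_x in range(gx):
            for t_y in range(block[1]):
                for t_x in range(block[0]):
                    x, y = pixel_of((b_x, b_y), (t_x, t_y), block)
                    if x < width and y < height:
                        pixels.append((x, y))
    return pixels


# A 100x60 image needs a 7x4 grid of 16x16 blocks.
blocks = grid_dim(100, 60)
pixels = covered_pixels(100, 60)
```

The bounds check matters because the image size is rarely a multiple of the block size. Threads in the last row and column of blocks that fall outside the image must do nothing.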
SOME RESULTS – SMALL
   [result images not preserved]
SOME RESULTS – MEDIUM
   [result images not preserved]
BENCHMARK
   • 100 iterations
CONCLUSION
   • The results of this project could be used to improve the quality of images before use.
   • Using GPU computing power could improve the performance of a program by 100-200 times
   • Partial differential equations are good choices when designing parallel algorithms
   • However, the performance is limited by the GPU’s memory size
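A rough back-of-the-envelope check of the memory limit is sketched below. It assumes 4-byte (float32) pixels and two device buffers, one read and one written in each iteration. The helper names and the 1 GiB device size are hypothetical and do not describe the hardware used in the project.

```python
GIB = 1024 ** 3


def diffusion_buffers_bytes(width, height, bytes_per_pixel=4, buffers=2):
    """Bytes needed to hold the input and output images on the device."""
    return width * height * bytes_per_pixel * buffers


def fits_on_device(width, height, device_bytes):
    """True if both buffers fit into the given amount of device memory."""
    return diffusion_buffers_bytes(width, height) <= device_bytes


# Under these assumptions a 4096x4096 image needs 128 MiB,
# while a 32768x32768 image needs 8 GiB.
medium_ok = fits_on_device(4096, 4096, 1 * GIB)
huge_ok = fits_on_device(32768, 32768, 1 * GIB)
```

An image that does not fit has to be split into tiles and moved to the GPU in pieces, and the extra transfers eat into the speedup.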

