Streaming Multigrid for Gradient-Domain Operations on Large Images. Michael Kazhdan, Johns Hopkins University; Hugues Hoppe, Microsoft Research. SIGGRAPH 08
Abstract: Develop a streaming multigrid solver that makes 2 passes over out-of-core data. It solves the Poisson equation with unconstrained boundary conditions. The relaxation, restriction, and prolongation operators of the multigrid method are constructed from a B-spline basis. A framework pipelines multigrid over a window of image rows.
Abstract (continued): The streaming multigrid solver makes 2 passes over out-of-core data by combining 2nd-order finite elements that are accurate in a single V-cycle, temporally blocked relaxation, and multi-level streaming that pipelines restriction and prolongation into single streaming passes. Key contribution: forward-difference gradient → B-spline finite element.
Outline: Introduction; Related work; Review of finite-difference multigrid; Our finite-element multigrid approach; Streaming multigrid solver; Efficient convergence of 2nd-order elements; Implementation; Application results; Conclusions & future work
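Gradient-domain problem as a Poisson solution: given a target gradient field G, find the image u that minimizes $\int \lVert \nabla u - G \rVert^2$. The minimizer satisfies the Poisson equation $\Delta u = \nabla \cdot G$.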
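One way to see why the minimization leads to the Poisson equation is a short variational argument. This is a sketch in the continuous setting; the symbols u (unknown image), G (target gradient field), v (an arbitrary perturbation), and n (outward boundary normal) are assumptions chosen for illustration, and the unconstrained boundary is assumed.

```latex
\begin{aligned}
E(u) &= \int_\Omega \lVert \nabla u - G \rVert^2 \, dx \\
\left.\frac{d}{d\epsilon} E(u + \epsilon v)\right|_{\epsilon = 0}
  &= 2 \int_\Omega (\nabla u - G) \cdot \nabla v \, dx \\
  &= -2 \int_\Omega \bigl(\Delta u - \nabla \cdot G\bigr)\, v \, dx
     + 2 \int_{\partial\Omega} v \,(\nabla u - G) \cdot n \, ds \\
\text{vanishing for all } v \;\Rightarrow\quad
\Delta u &= \nabla \cdot G \equiv f \ \text{in } \Omega,
\qquad (\nabla u - G) \cdot n = 0 \ \text{on } \partial\Omega .
\end{aligned}
```

The boundary term is what makes the problem "unconstrained": no values of u are prescribed on the boundary, and the natural condition falls out of the minimization.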
Image processing in the gradient domain: lighting removal by zeroing small gradients [Horn 74]; HDR images tone-mapped by adaptively attenuating luminance gradients [Weiss 01]; overlapping images stitched seamlessly by merging gradients [Pérez et al. 03; Agarwala et al. 04; Levin et al. 04]; shadow removal by zeroing large luminance gradients in regions of constant chromaticity [Finlayson et al. 02]; undesirable reflections removed in flash and ambient image pairs [Agrawal et al. 05]; photographic tone management improved using gradient constraints [Bae et al. 06]
Painting in the gradient domain: painting with interactive gradient-domain modeling [McCann & Pollard 08]; diffusion curves [Orzan et al. 08]
Large images in the gradient domain: processing of large images [Kopf et al. 07]; gigapixel images are too large to fit in main memory; direct solvers (e.g., Cholesky factorization) are impractical; iterative techniques (e.g., conjugate gradients and the multigrid method) are inefficient because they require many iterations over out-of-core data
Standard multigrid V-cycle
Contributions: a general, accurate, and efficient solver for the Poisson equation over out-of-core images; sufficient accuracy in 2 V-cycles on gigapixel images; 2nd-order B-spline finite elements give more accuracy than the traditional finite elements used in multigrid; efficiency from temporal locality, with small moving windows of data in memory; pipelined V-cycle: 3 Gauss-Seidel relaxations, restriction, and prolongation
Contributions: on a single CPU core, solves a 3-channel, 16-megapixel gradient-domain problem with an rms error on the order of $10^{-5}$ in 15 seconds; high ratio of local computation to memory bandwidth; temporal locality in the L1 cache
Solvers of the Poisson equation: iterative solvers such as Gauss-Seidel and conjugate gradient are memory-efficient but require many iterations; the number of iterations can be reduced with multi-resolution preconditioners [Gortler and Cohen 95; Szeliski 06] or with multigrid solvers [Brandt 77; Briggs et al. 00]; GPU + multigrid [Bolz et al. 03; Goodnight et al. 03; Göddeke et al. 08]
Out-of-core: multigrid methods are difficult to schedule out-of-core [Toledo 99]. One approach solves the out-of-core problem on a coarser-resolution grid and then upsamples the resulting approximation [Kopf et al. 07a], but it cannot robustly maintain sharp features. Another uses an adaptive partition to solve the Poisson equation: along the image seams for image stitching [Agarwala 07], in fluid flow simulation [Losasso et al. 04], and in surface reconstruction [Kazhdan et al. 06]. Our method addresses the general case where the Poisson equation is solved accurately everywhere, the problem size cannot be reduced, and no initial solution guess is available
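Gradient-domain image processing → Poisson equation
[Figure: the gradient $\nabla u$ on a 1D grid $x_0, x_1, \dots, x_i, \dots, x_N$ with samples $u_0, u_1, \dots, u_N$ and spacing h, computed from $u_i$ and $u_{i+1}$]
[Figure: the Laplacian $\Delta u$ on the same 1D grid with spacing h, computed from $u_{i-1}$, $u_i$, and $u_{i+1}$]
$\Delta u = f$
$Lu = f$, where u collects the grid values $u_{1,1}, u_{1,2}, u_{1,3}, u_{2,1}, u_{2,2}, u_{2,3}, u_{3,1}, u_{3,2}, u_{3,3}$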
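$Lu = f$: Gauss-Seidel relaxation on u. Split the system matrix into its strictly lower, diagonal, and strictly upper parts, $L = L_{-} + D + L_{+}$, so that $D u = f - (L_{-} + L_{+})\, u$. The update $u^{(k)} = D^{-1}\bigl(f - (L_{-} + L_{+})\, u^{(k-1)}\bigr) = D^{-1} f + R_J u^{(k-1)}$, with $R_J = -D^{-1}(L_{-} + L_{+})$, is the Jacobi form. Gauss-Seidel instead uses the entries already updated in the current sweep: $(D + L_{-})\, u^{(k)} = f - L_{+} u^{(k-1)}$.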
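A minimal sketch of one Gauss-Seidel sweep may make the in-place update concrete. It assumes a 1D model problem with the three-point Laplacian and fixed end values, which is not the deck's 2D B-spline system; the function name `gauss_seidel_sweep` and the test problem are hypothetical.

```python
def gauss_seidel_sweep(u, f, h):
    """One in-place Gauss-Seidel sweep for (u[i-1] - 2*u[i] + u[i+1]) / h**2 = f[i].

    The end values u[0] and u[-1] are held fixed (assumed boundary values).
    """
    for i in range(1, len(u) - 1):
        # u[i-1] was already updated in this sweep; u[i+1] is still the old value.
        u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])
    return u


n = 8
h = 1.0 / n
f = [2.0] * (n + 1)          # u'' = 2 has the solution u(x) = x**2 - x on [0, 1]
u = [0.0] * (n + 1)          # initial guess u = 0, which also sets u(0) = u(1) = 0
for _ in range(500):
    gauss_seidel_sweep(u, f, h)

exact = [(i * h) ** 2 - i * h for i in range(n + 1)]
max_err = max(abs(a - b) for a, b in zip(u, exact))
```

Even on this tiny grid the sketch uses hundreds of sweeps to converge, which is the cost that multigrid is meant to avoid.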
Error and residual: consider $Lu = f$. Error: $e_k = u - u^{(k)}$. Residual: $r_k = f - L u^{(k)} = L e_k$. The exact solution is a fixed point, $u = D^{-1} f + R_J u$, so subtracting $u^{(k)} = D^{-1} f + R_J u^{(k-1)}$ gives $e_k = R_J (u - u^{(k-1)}) = R_J e_{k-1} = R_J^k e_0$. Hence $e_k$ depends only on $R_J$ and $e_0$.
Limits of Gauss-Seidel: $e_k = R_J^k e_0$. Gauss-Seidel relaxation smooths the error quickly but converges slowly on its low-frequency components. Why use coarse grids? [Briggs et al. 00] Coarse grids can be used to compute an improved initial guess for the fine-grid relaxation, and relaxation on the coarse grid is much cheaper.
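The difference between high- and low-frequency error can be checked numerically. The sketch below assumes a 1D model problem with zero right-hand side, so the iterate itself is the error; the grid size, the two sine modes, and the helper names are hypothetical choices for illustration.

```python
import math


def gauss_seidel_sweep(u, f, h):
    """One in-place Gauss-Seidel sweep for the 1D three-point Laplacian."""
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])


def error_after(mode, sweeps, n=64):
    """Max-norm error after `sweeps` sweeps, starting from a single sine mode.

    The right-hand side is zero, so the exact solution is zero and u is the error.
    """
    h = 1.0 / n
    u = [math.sin(mode * math.pi * i / n) for i in range(n + 1)]
    u[0] = u[n] = 0.0
    f = [0.0] * (n + 1)
    for _ in range(sweeps):
        gauss_seidel_sweep(u, f, h)
    return max(abs(x) for x in u)


low = error_after(mode=1, sweeps=10)    # smooth error: barely reduced
high = error_after(mode=48, sweeps=10)  # oscillatory error: strongly damped
```

On a coarser grid the same smooth error spans fewer samples and so looks more oscillatory, which is why relaxing there is effective.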
Coarse-grid correction: (1) Relax k times on $L_h u_h = f_h$ on the fine grid, with arbitrary initial $u^{(0)}$: $u^{(k)} = D^{-1} f + R_J u^{(k-1)}$. (2) Compute the residual on the fine grid: $r_h = f_h - L_h u_h^{(k)}$. (3) Restrict the residual to the coarse grid: $r_H = R\, r_h$, where $H = 2h$ and R is the restriction. (4) Compute the error on the coarse grid: $L_H e_H = r_H$, where $L_H = R L_h P$. (5) Prolongate (interpolate) the error to the fine grid: $e_h = P e_H$, where P is the prolongation. (6) Correct the fine-grid solution: $u_h^{(k+1)} = u_h^{(k)} + e_h$. [Figure: steps 1 to 6 drawn between the fine grid and the coarse grid]
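The six steps can be sketched for a 1D model problem as follows. The assumptions are loud ones: a three-point Laplacian with fixed end values, full-weighting restriction, linear interpolation, a coarse operator that is simply the same Laplacian with spacing 2h, and a coarse solve approximated by many relaxation sweeps. All function names are hypothetical.

```python
def relax(u, f, h, sweeps):
    """Gauss-Seidel sweeps for (u[i-1] - 2*u[i] + u[i+1]) / h**2 = f[i]."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])


def residual(u, f, h):
    """r = f - L u at interior points (zero at the two end points)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
    return r


def restrict(r):
    """Full weighting: coarse value j averages fine values 2j-1, 2j, 2j+1."""
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for j in range(1, nc):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc


def prolong(ec):
    """Linear interpolation from the coarse grid to the fine grid."""
    e = [0.0] * (2 * (len(ec) - 1) + 1)
    for j in range(len(ec) - 1):
        e[2 * j] = ec[j]
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    e[-1] = ec[-1]
    return e


def two_grid(u, f, h, k=3):
    relax(u, f, h, k)                    # (1) relax k times on the fine grid
    r_h = residual(u, f, h)              # (2) fine-grid residual
    r_H = restrict(r_h)                  # (3) restrict to the coarse grid
    e_H = [0.0] * len(r_H)
    relax(e_H, r_H, 2.0 * h, 500)        # (4) solve L_H e_H = r_H (approximately)
    e_h = prolong(e_H)                   # (5) prolongate the error
    for i in range(len(u)):              # (6) correct the fine-grid solution
        u[i] += e_h[i]
    return u


n = 32
h = 1.0 / n
f = [2.0] * (n + 1)                      # exact solution u(x) = x**2 - x
exact = [(i * h) ** 2 - i * h for i in range(n + 1)]

u_tg = [0.0] * (n + 1)
for _ in range(8):
    two_grid(u_tg, f, h)
err_two_grid = max(abs(a - b) for a, b in zip(u_tg, exact))

u_gs = [0.0] * (n + 1)
relax(u_gs, f, h, 24)                    # same number of fine-grid sweeps, no correction
err_plain = max(abs(a - b) for a, b in zip(u_gs, exact))
```

With the same 24 fine-grid sweeps, the run with coarse-grid correction should end far closer to the exact solution than plain relaxation.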
Restriction & prolongation: the restriction operator maps the fine grid → coarse grid, typically using local weighted averaging; the prolongation operator maps the coarse grid → fine grid, typically using bilinear interpolation. [Figure: restriction (1D) and prolongation (1D)] $R = P^T$
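The relation $R = P^T$ can be illustrated with explicit 1D matrices. This sketch assumes linear interpolation between interior nodes, with coarse node j sitting at fine node 2j+1; the helper names are hypothetical.

```python
def prolongation_matrix(nc):
    """Linear-interpolation matrix from nc coarse nodes to 2*nc + 1 fine nodes."""
    nf = 2 * nc + 1
    P = [[0.0] * nc for _ in range(nf)]
    for j in range(nc):
        P[2 * j][j] = 0.5        # fine node to the left of coarse node j
        P[2 * j + 1][j] = 1.0    # fine node that coincides with coarse node j
        P[2 * j + 2][j] = 0.5    # fine node to the right of coarse node j
    return P


def transpose(M):
    return [list(col) for col in zip(*M)]


P = prolongation_matrix(3)   # 7 x 3
R = transpose(P)             # 3 x 7, the restriction R = P^T
middle_row = R[1]            # weights applied to the fine nodes around coarse node 1
```

Each row of R carries the stencil [1/2, 1, 1/2]. Finite-difference texts often normalize this to the full-weighting stencil [1/4, 1/2, 1/4], which is one half of $P^T$ in 1D.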
2D stencils of the multigrid operators [Figure panels: 2D restriction, 2D prolongation, 2D relaxation (Laplacian)]
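V-cycle $v_h \leftarrow MV_h(v_h, f_h)$: (1) Relax k times on $L_h u_h = f_h$ with initial guess $v_h$ (arbitrary on the first call). (2) If $\Omega_h$ is the coarsest grid, go to step 4; else $f_{2h} \leftarrow R(f_h - L_h v_h) = R\, r_h$ (restriction), $v_{2h} \leftarrow 0$, $v_{2h} \leftarrow MV_{2h}(v_{2h}, f_{2h})$. (3) Correct $v_h \leftarrow v_h + P v_{2h}$ (prolongation). (4) Relax k times on $L_h u_h = f_h$ with initial guess $v_h$. [Figure: the descending half of the V, alternating relaxation on $L_h u_h = f_h$, restriction $f_{2h} = R\, r_h$, relaxation on $L_{2h} u_{2h} = f_{2h}$, restriction to the next level, and relaxation on $L_{4h} u_{4h} = f_{4h}$]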
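The recursion above can be sketched in a few lines for a 1D model problem. As assumptions, the sketch uses a three-point Laplacian with fixed end values, full-weighting restriction, linear interpolation, and the same Laplacian with doubled spacing on each coarser grid; the function names are hypothetical and the grid must have 2^m + 1 points.

```python
def relax(u, f, h, sweeps):
    """Gauss-Seidel sweeps for (u[i-1] - 2*u[i] + u[i+1]) / h**2 = f[i]."""
    for _ in range(sweeps):
        for i in range(1, len(u) - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])


def residual(u, f, h):
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
    return r


def restrict(r):
    nc = (len(r) - 1) // 2
    rc = [0.0] * (nc + 1)
    for j in range(1, nc):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc


def prolong(ec):
    e = [0.0] * (2 * (len(ec) - 1) + 1)
    for j in range(len(ec) - 1):
        e[2 * j] = ec[j]
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    e[-1] = ec[-1]
    return e


def v_cycle(v, f, h, k=3):
    relax(v, f, h, k)                         # step 1: pre-relaxation
    if len(v) > 3:                            # step 2: not yet the coarsest grid
        f2 = restrict(residual(v, f, h))      #   restrict the residual
        v2 = v_cycle([0.0] * len(f2), f2, 2.0 * h, k)
        e = prolong(v2)                       # step 3: prolongate and correct
        for i in range(len(v)):
            v[i] += e[i]
    relax(v, f, h, k)                         # step 4: post-relaxation
    return v


n = 64
h = 1.0 / n
f = [2.0] * (n + 1)                           # exact solution u(x) = x**2 - x
v = [0.0] * (n + 1)
for _ in range(10):
    v_cycle(v, f, h)
max_err = max(abs(v[i] - ((i * h) ** 2 - i * h)) for i in range(n + 1))
```

The coarsest grid here has a single interior point, so relaxation solves it exactly and no separate base solver is needed in this sketch.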
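Standard multigrid V-cycle [Figure: restriction $f^{l-1} = R_l^{l-1} r^l$ on the way down; base solution $L^{l_{min}} u^{l_{min}} = f^{l_{min}}$ at the coarsest level; prolongation and correction $u^l = P_{l-1}^l u_P^{l-1} + u_R^l$ on the way up]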
Representing the Poisson equation with a 1D B-spline basis: solving the Poisson equation reduces to solving $Lu = f$, where L is the N × N matrix with $L_{i,j} = \langle \Delta B_i(x), B_j(x) \rangle = \langle \nabla B_i(x), -\nabla B_j(x) \rangle$ and f is the vector with $f_j = \langle F(x), B_j(x) \rangle$
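As a worked instance of the inner-product formula, the entries of L can be computed by hand for first-order (hat-function) B-splines on a uniform 1D grid. This basis is chosen here only because the integrals are short; the deck's solver uses 2nd-order B-splines, whose stencil is wider. The node positions $x_i = ih$ are an assumption of the example.

```latex
\begin{aligned}
B_i(x) &= \max\bigl(0,\; 1 - \lvert x/h - i \rvert\bigr), \qquad
\nabla B_i(x) =
\begin{cases}
 \phantom{-}1/h, & x_{i-1} < x < x_i \\
 -1/h, & x_i < x < x_{i+1}
\end{cases} \\
\langle \nabla B_i, \nabla B_i \rangle
  &= \int_{x_{i-1}}^{x_{i+1}} \frac{1}{h^2}\, dx = \frac{2}{h}, \qquad
\langle \nabla B_i, \nabla B_{i \pm 1} \rangle
  = \int_{x_i}^{x_{i+1}} \Bigl(-\frac{1}{h}\Bigr)\Bigl(\frac{1}{h}\Bigr)\, dx = -\frac{1}{h} \\
L_{i,i} &= -\frac{2}{h}, \qquad L_{i,i \pm 1} = \frac{1}{h}, \qquad
\text{i.e. the stencil } \tfrac{1}{h}\,[\,1 \;\; {-2} \;\; 1\,].
\end{aligned}
```

Since $f_j = \langle F, B_j \rangle \approx h\,F(x_j)$ for smooth F, dividing each row by h recovers the familiar finite-difference stencil $\tfrac{1}{h^2}[1, -2, 1]$.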
Fitting forward-difference gradient constraints: the gradient of the unknown image v is expressed as a difference of B-splines
2nd-order B-spline: 1D case
2nd-order B-spline: 2D case
2D stencils of the multigrid operators using B-splines [Figure panels: prolongation, restriction, Laplacian]
How to pipeline? Each row passes through the phases $A_l \to R \to A_{l-1} \to R \to A_{l-2} \to P \to A_{l-1} \to P \to A_l$, where A is Gauss-Seidel relaxation, R is restriction, and P is prolongation. A phase can run on a row only once the neighborhood required by the Laplacian stencil exists. The 3 relaxations (A) are performed as a single streaming operation. [Figure: time vs. row diagrams for rows $R_{n-1}$, $R_n$, $R_{n+1}$, each advancing through the phase sequence one step behind the previous row]
Temporally blocked relaxation: maintain a window of rows [i−1, i+2k+1] and perform a skipping, counter-current relaxation sweep that updates pixels in rows {i+2k−1, i+2k−3, …, i+1}, so that k = 3 relaxations (A) are performed as a single streaming operation. [Figure: rows 1 to 6 of a memory-resident window, with each pixel row labeled by the global order in which it is relaxed; some rows have been relaxed once and others twice]
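The idea of fusing k sweeps into one streaming pass can be checked on a 1D analogue, where each array entry plays the role of an image row. This is a sketch under assumptions: a three-point Gauss-Seidel update, and a lag of two entries between consecutive sweeps to mirror the every-other-row set above. The function names and test data are hypothetical.

```python
def sweep(u, f, h):
    """One ordinary front-to-back Gauss-Seidel sweep."""
    for j in range(1, len(u) - 1):
        u[j] = 0.5 * (u[j - 1] + u[j + 1] - h * h * f[j])


def blocked_sweeps(u, f, h, k):
    """k Gauss-Seidel sweeps fused into a single front-to-back streaming pass."""
    n = len(u) - 1
    for t in range(1, n + 2 * (k - 1)):   # t: entry reached by the leading sweep
        for s in range(k):                # sweep s trails the leading sweep by 2*s
            j = t - 2 * s
            if 1 <= j <= n - 1:
                # u[j-1] has already had sweep s applied; u[j+1] has not yet.
                u[j] = 0.5 * (u[j - 1] + u[j + 1] - h * h * f[j])
    return u


n, k, h = 12, 3, 1.0
f = [0.1 * ((3 * i) % 7) for i in range(n + 1)]
start = [0.0] + [float((5 * i) % 4) for i in range(1, n)] + [0.0]

sequential = list(start)
for _ in range(k):
    sweep(sequential, f, h)

blocked = blocked_sweeps(list(start), f, h, k)
```

At step t the fused pass only touches entries between t − 2(k−1) − 1 and t + 1, so in this 1D sketch a window whose size grows linearly with k is enough to hold everything that is live.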
Full data pipeline for gradient-domain processing
Memory analysis: the active windows on $u_R^l$ and $f^l$ are implemented as circular memory buffers of image rows. Window size (w, h): w is the image width at the coarser level; h is 2k+3 (restriction) or 2k+5 (prolongation). Memory usage is $O(N_x)$ for an $N_x \times N_y$ image.
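A circular buffer of rows can be sketched as below. The class `RowWindow` and its methods are hypothetical names for illustration, not the authors' data structure; the only detail taken from the slide is the window height 2k+3.

```python
class RowWindow:
    """Fixed-height circular buffer holding the most recent image rows."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.rows = [[0.0] * width for _ in range(height)]
        self.next_row = 0                  # index of the next image row to load

    def push(self, row):
        """Load the next image row, overwriting the oldest resident row."""
        if len(row) != self.width:
            raise ValueError("row has the wrong width")
        self.rows[self.next_row % self.height] = list(row)
        self.next_row += 1

    def get(self, i):
        """Return image row i, which must still be inside the active window."""
        oldest = max(0, self.next_row - self.height)
        if not (oldest <= i < self.next_row):
            raise IndexError("row %d is not in the active window" % i)
        return self.rows[i % self.height]


k = 5
window = RowWindow(width=4, height=2 * k + 3)   # 13 resident rows
for i in range(20):
    window.push([float(i)] * 4)                 # stream 20 rows through the window
oldest_resident = window.next_row - window.height
```

Storage is height × width regardless of how many rows have streamed through, which is the $O(N_x)$ behavior stated above.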
Parameters (n, k, v): n is the order of the finite element (n-th-order B-spline); k is the number of Gauss-Seidel relaxations; v is the number of V-cycles. Example: 2nd-order B-spline basis → (2, 3, 2)
(n, v): plot of the rms and max errors vs. the number of multigrid V-cycles. 2nd-order elements give the fastest convergence.
(n, k=2, v=1)
Parameter selection: find a sufficiently accurate solution with a minimum number of V-cycles. For 8-bit-per-channel image processing, a max error < 1/256 suffices → (2, 5, 1): 2nd-order B-spline basis and 5 Gauss-Seidel updates with a single V-cycle in all our applications
Implementation: Maximizing disk throughput: larger block (4 MB) transfers to minimize disk latency (2× speed improvement); the intermediate floating-point u and f values are stored on disk at half precision. Relaxation optimization: leverage the vertical and horizontal symmetries of the stencil; use the CPU's SSE2 4-vector instructions. Non-power-of-two images: pad the input image to $2^{l_{max}-l_{min}}$ times the size of the coarsest level. Multi-channel images: stitching requires the full-color gradient; interleave the per-channel solutions to reduce the total number of passes
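The padding rule for non-power-of-two images amounts to rounding each dimension up to a multiple of $2^{l_{max}-l_{min}}$. A minimal sketch, where the function name `padded_size` and the choice of 5 levels are hypothetical (the image dimensions are borrowed from the stitching example):

```python
def padded_size(n, levels):
    """Round n up to a multiple of 2**levels.

    Returns (padded size, size at the coarsest level), where
    levels = lmax - lmin is the number of times the grid is halved.
    """
    block = 1 << levels
    coarsest = -(-n // block)        # ceiling division
    return coarsest * block, coarsest


width, coarse_width = padded_size(19588, 5)
height, coarse_height = padded_size(4457, 5)
```

Every level between the finest and the coarsest then has an integer size, so restriction and prolongation never meet a fractional row or column.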
Environment: notebook with a 2.2 GHz Core 2 Duo processor and 4 GB RAM. Timings include the I/O to read the gradient field from disk and to write JPEG-compressed output to disk. Initial solution u = 0 in all experiments. Parameters (2, 5, 1): 2nd-order B-spline basis and 5 Gauss-Seidel updates with a single V-cycle
Image stitching: 19,588 × 4,457 (87-megapixel) panorama from 9 photos; copy the image gradients and solve the Poisson equation
Image stitching: SM is the streaming multigrid solver; QT is the quadtree (AT) solver of Agarwala [07]
Tone-mapping (HDR → normal tone): before and after
Tone-mapping (HDR → normal tone): streaming multigrid is the first solver to solve the Poisson equation in time that is linear in the number of pixels
Gigapixel stitching and tone-mapping: may not capture the true scene contrast
Gigapixel stitching and tone-mapping
Conclusion: streaming multigrid is an out-of-core technique for solving large global linear systems using local access and a few passes of sequential I/O; the 2nd-order B-spline finite-element formulation is compatible with traditional multigrid and gives an efficient, accurate solution in a single V-cycle
Future work: Dirichlet boundary conditions (modify the 2D stencils, construct f); soft constraints to match some original image $u_0$; weighted minimization, where w(x, y) is a spatially varying 2 × 2 diagonal matrix that weights differences in x and y independently; bilaplacian
Future work: reduce disk bandwidth by compressing and decompressing the streamed temporary data; parallelization on many-core CPUs or GPUs, for instance by partitioning image rows


Editor's Notes

  • #2: Johns Hopkins University: a US university renowned for medicine
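  • #4: This paper implements the multigrid algorithm with multi-level streaming. It targets numerical methods that process large out-of-core images in the gradient domain, i.e. images whose computation needs the disk as well as main memory. The multigrid method is one of the numerical techniques for solving partial differential equations; it was developed as early as 19?? by ? a Russian mathematician ??, on top of finite differences ( f(x+b)-f(x+a) ). The streaming multigrid solver designed here makes 2 passes. (1) Each V-cycle of the multigrid method uses 2nd-order finite elements to reach sufficient accuracy. Is this phrasing odd? The gradient domain means ∇U = G, and the second order comes from taking the divergence of G: ∇·∇U = ∇·G = f, which is of course the Poisson equation U_xx + U_yy = f, a second-order partial differential equation. For a second-order equation one naturally wants 2nd-order finite elements inside the V-cycle (the multigrid computation). (2) Temporally blocked relaxation: recall from the computer architecture (CA) course that a cache exploits temporal and spatial locality. Processing a few rows at a time, rather than the whole image, keeps those rows in L1 and takes advantage of the cache's temporal locality. (3) Multi-level streaming pipelines the two essential phases of multigrid, restriction and prolongation. Restriction goes from high resolution to low resolution, and prolongation goes from low resolution back to high resolution. Pipelining them requires partitioning restriction and prolongation separately. In the implementation, the paper achieves this with multiple concurrent streams (slide 59 lists GPUs only as future work). The key component that makes this possible: earlier methods use forward-difference gradients (I take this to mean Gauss-Seidel or multigrid), and the transfers between resolutions in multigrid are usually nothing more than interpolating the low resolution to get the high resolution, or averaging the high resolution to get the low resolution. Here, however, B-spline finite elements are used: ΔU(x) = sum( u_i * B_i(x) ) (where Δ is ∇·∇). In other words, ΔU(x) at a point x is a combination of B-spline basis functions, so the transfer between resolutions in multigrid becomes a change of B-spline basis between resolutions.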
  • #7: Restate the problem for the image as a partial differential equation: minimize |∇U − G|. Setting the derivative to zero, ∂(∇U − G) = 0, gives ∇·∇U − ∇·G = 0
  • #8: Lighting removal. HDR tone mapping by attenuating the luminance gradient. Image stitching. Shadow removal. Reflection removal. HDR tone management.
  • #9: Paint: presented last time at the large group meeting and the small group meeting. Large image:
  • #10: Out-of-core image processing is inefficient because of disk I/O
  • #12: 1. Possible in
  • #13: Rms: root mean square error
  • #15: That is, using numerical methods to approximate partial differential equations such as the Poisson equation
  • #16: 1. Toledo [1999] presents an excellent survey of out-of-core algorithms for large linear systems
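  • #18: The subject of this paper is gradient-domain image processing. U is the 2D image, and ∇U = G is a vector function. What we actually compute with is a scalar, though: ∇·∇U = ∇·G = f. In image processing problems one usually sets f and solves for U. Reminder to self: 1. Laplace equation: ∇²u = 0, i.e. Δu = 0. 2. Poisson equation: ∇²u = f, i.e. Δu = f.
  • #19: Afterword: this is one way to understand how numerical analysis handles u_x: h * u_x ≈ u(x+h) − u(x). Textbooks usually get there by expanding in Taylor series and subtracting.
  • #20: Afterword: this is one way to understand how numerical analysis handles u_x. Textbooks usually get there by expanding in Taylor series and subtracting.
  • #21: Side note. Forward difference: h * f'(x) ≈ f(x+h) − f(x). Backward difference: h * f'(x) ≈ f(x) − f(x−h).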
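  • #22: See the reference for details. For this special case we therefore obtain the following system Lu = f. Each row of the matrix L cor
  • #23: Solving for u iteratively like this is called relaxation. As they put it, "the pressure of constraint is relaxed". If the constraint in this problem is e → 0, each iteration moves closer to e = 0, so the pressure on e decreases. All relaxation in this paper uses the Gauss-Seidel method!
  • #24: 1. Definitions of error and residual. 2. Goal: e → 0. This is a relaxation problem: making the error smaller is described as loosening the constraint. (A curious idea from the founders; it would be more intuitive to say that the closer we get to the answer, the tighter the fit.) Two things to watch: is e_k a smooth function, and is there a good initial e_0?
  • #25: The full statement is: Gauss-Seidel relaxation converges slowly on the low-frequency components of the solution. Taking the Fourier transform (or series?) of R_J shows that the low-frequency components of R_J do indeed need many iterations to converge.
  • #26: Are L_H and L_h the same? No, they are not!
  • #30: Combine the previous pieces into a V-cycle, explained directly with the paper's figure and text [figure 1]. Restriction: obtain the coarse-grid f from the residual (odd, isn't it? One would expect to get only the coarse-grid residual). Base solution: once the level is coarse enough, solve Lu = f for u directly. Prolongation: obtain the fine-grid u from the coarse-grid u (hard to accept; it should still be u = u + P e). The data flow follows this paper's multigrid description.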
  • #32: B_i(x) is the B-spline basis function B(x) with index i
  • #33: forward difference : h * f’(x) = f(x+h) – f(x)
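  • #36: From the B-spline: 1D P = ¼ (3 1), 1D R = ¼ (1 3)
  • #38: Explain separately why relaxation cannot be pipelined and is also hard to parallelize. Take the lower-resolution level l-1: relaxation needs the neighborhood inside the Laplacian stencil (at the same resolution). Row (n-1) has already been processed, but row (n+1) has not yet produced its level l-1 data. This is why, in the past, one whole level was computed before moving on to the next. The lower diagram changes this: 2k rows are read in at a time, and one run means the window has moved across the whole image, which completes one V-pass (ignoring R and P for now). This lets the out-of-core image be processed as if it were in memory, entirely within main memory. Execution speed is not affected by
  • #41: k is the number of relaxations (k relaxation passes)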
  • #43: N=2, k=3: 3 Gauss-Seidel relaxation updates, 2 V-cycles
  • #44: Max error: dashed line. RMS (root-mean-square error): solid line. V-cycles: the number of V-cycles; more cycles are of course more accurate. Surprisingly, the 2nd-order B-spline converges fastest, while the 3rd-order B-spline actually converges more slowly.
  • #48: Mostly details: for example, storing u and f in half precision can speed things up by half
  • #51: Panorama: a panoramic (wide-view) image
  • #53: HDR → normal tone: compute the log-luminance gradient and apply nonlinear attenuation. Tone mapping is harder than stitching! The adaptive method cannot be applied; the Poisson equation has to be solved at full resolution over the entire grid!
  • #55: Collect photos taken at different apertures
  • #59: Dirichlet boundary condition: values are prescribed on the boundary, e.g. d²y/dx² + 3y = 1 on x ∈ [0,1] with y(0) = a, y(1) = b
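Note #59 gives d²y/dx² + 3y = 1 on [0,1] as an example of a Dirichlet problem. A minimal sketch of how such a problem could be discretized follows, assuming the two boundary values sit at x = 0 and x = 1, a 3-point finite-difference stencil, and a tridiagonal (Thomas) solve. The function names and the sample values a = 0, b = 1 are hypothetical and do not come from the slides.

```python
import math

def solve_dirichlet_bvp(a, b, n):
    """Solve y'' + 3y = 1 on [0, 1] with y(0) = a, y(1) = b on n interior points."""
    h = 1.0 / (n + 1)
    off = 1.0 / (h * h)                 # sub- and super-diagonal entries
    diag = [-2.0 / (h * h) + 3.0] * n   # main diagonal
    rhs = [1.0] * n
    rhs[0] -= off * a                   # known boundary values move to the right-hand side
    rhs[-1] -= off * b
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    y = [0.0] * n
    y[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        y[i] = (rhs[i] - off * y[i + 1]) / diag[i]
    return y

def exact(x, a, b):
    """Closed form: y = 1/3 + A cos(sqrt(3) x) + B sin(sqrt(3) x)."""
    w = math.sqrt(3.0)
    A = a - 1.0 / 3.0
    B = (b - 1.0 / 3.0 - A * math.cos(w)) / math.sin(w)
    return 1.0 / 3.0 + A * math.cos(w * x) + B * math.sin(w * x)

a, b, n = 0.0, 1.0, 99                  # sample boundary values and grid size
y = solve_dirichlet_bvp(a, b, n)
h = 1.0 / (n + 1)
bvp_error = max(abs(y[i] - exact((i + 1) * h, a, b)) for i in range(n))
print("max error against the closed form:", bvp_error)
```

The boundary values enter only through the first and last right-hand-side entries, which is the 1D analogue of the stencil and f modifications that slide 58 lists under future work.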
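Note #25 states that Gauss-Seidel relaxation converges slowly on low-frequency components. This can be checked numerically with a small sketch, assuming a 1D model problem with zero right-hand side and zero end values, so that the iterate is the error itself. The grid size, the mode numbers 1 and 48, and the function names are illustrative choices.

```python
import math

def gauss_seidel_sweep(e):
    """One in-place Gauss-Seidel sweep for the 1D Laplace problem (f = 0, zero ends)."""
    n = len(e)
    for i in range(n):
        left = e[i - 1] if i > 0 else 0.0
        right = e[i + 1] if i < n - 1 else 0.0
        e[i] = 0.5 * (left + right)

def damping_after(mode, sweeps, n=63):
    """Ratio of the max error after `sweeps` sweeps to the initial max error."""
    e = [math.sin(mode * math.pi * (i + 1) / (n + 1)) for i in range(n)]
    start = max(abs(v) for v in e)
    for _ in range(sweeps):
        gauss_seidel_sweep(e)
    return max(abs(v) for v in e) / start

low = damping_after(mode=1, sweeps=5)    # smooth, low-frequency error
high = damping_after(mode=48, sweeps=5)  # oscillatory, high-frequency error
print("remaining low-frequency error: ", low)
print("remaining high-frequency error:", high)
```

The oscillatory mode is damped far more strongly than the smooth one, which is the reason multigrid hands the remaining smooth error to a coarser grid, where it looks oscillatory again.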
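Note #21 writes the forward and backward differences as h * f'(x) = f(x+h) − f(x) and h * f'(x) = f(x) − f(x−h). A short sketch of both formulas, together with the second difference obtained by subtracting one from the other, is given below. The test function sin, the point x = 1 and the step h = 0.001 are assumptions made only for illustration.

```python
import math

def forward_difference(f, x, h):
    """f'(x) approximated by (f(x+h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def backward_difference(f, x, h):
    """f'(x) approximated by (f(x) - f(x-h)) / h."""
    return (f(x) - f(x - h)) / h

def second_difference(f, x, h):
    """f''(x) approximated by (forward - backward) / h, the 3-point stencil."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

x, h = 1.0, 1e-3
fwd = forward_difference(math.sin, x, h)
bwd = backward_difference(math.sin, x, h)
sec = second_difference(math.sin, x, h)
print("forward :", fwd, " backward:", bwd, " exact f':", math.cos(x))
print("second  :", sec, " exact f'':", -math.sin(x))
```

The one-sided differences are first-order accurate in h, while the symmetric second difference is second-order accurate, which is why the 3-point stencil is the usual building block of the finite-difference Laplacian.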