Semantic-Aware Sky Replacement (SIGGRAPH 2016)

Sky is Not the Limit: Semantic-Aware Sky Replacement
Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Ming-Hsuan Yang, Kalyan Sunkavalli
ACM Transactions on Graphics (SIGGRAPH), 2016

Motivation
Goal: automatically segment the sky and replace it with skies of different styles

Challenges
• Manually edit sky using Photoshop
5 mins 30 mins
We need a good segmentation algorithm!
Input Image
Reference

Challenges
• Manually edit sky using Photoshop
Input Image
Reference
We need image harmonization!
Professional editing vs. Colors are not matched

System
Input Image → Sky Segmentation → Sky Search (Reference Images) → Sky Replacement → Results

Sky Segmentation
Input Image → Sky Segmentation
Related work
• Sky/non-sky classifier [Tao et al. SIGGRAPH’09]
• Scene parsing [Long et al. CVPR’15]
• Online refinement [Rother et al. SIGGRAPH’04]
Challenges
• Sky appearance varies widely
• skylines/landscapes, clouds, lighting conditions
• Need accurate sky boundaries

Sky Search
Input Image → Sky Search → Reference Images
Related work
• GIST [Hays and Efros SIGGRAPH’07, Liu et al. CGF’14]
• Only consider global scene layout
• Need a large database
Challenges
• Search for compatible images
• Account for image content
Reference Image 1 Reference Image 2 Reference Image 3

Sky Replacement
Input Image
Related work
• Global transfer [Reinhard et al. 2001, Tao et al. SIGGRAPH’09]
• Image contents are not considered
• Less realistic results
• Local transfer [Wu et al. CGF’13, Laffont et al. SIGGRAPH’14]
• Boundary artifacts
• Rely on filters for smoothing
Challenges
• Transfer foreground appearance
• Account for image content
Sky Replacement

Semantic-Aware System
Input Image → Sky Segmentation → Sky Search (Reference Images) → Sky Replacement → Results
Fully Convolutional Networks

Fully Convolutional Networks
Scene Parsing
[Figure: scene parsing output (Fg, Road, Building, Sky, Tree) and semantic response maps (Sky, Building, Road, ...)]
Fully Convolutional Networks (FCN)
• End-to-end model
• Pixel-wise segmentation
• Fine-tune with 11 scene labels
• Semantic response map
[Long et al. CVPR’15]

Sky Segmentation
Input Image → Scene Parsing (Fully Convolutional Networks) → Online Refinement
Conditional Random Field optimization
• Online models: color, texture
• Semantic response (sky/non-sky)
• Pairwise term: magnitude of gradient
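A minimal sketch of the kind of energy such a CRF might minimize, offered as an illustration rather than the formulation used in the talk. It assumes a single sky probability map stands in for the combined unary models (color, texture, semantic response), and that the pairwise weight falls off exponentially with the squared gradient magnitude. The function name `crf_energy` and the parameters `lam` and `beta` are hypothetical.

```python
import numpy as np

def crf_energy(labels, sky_prob, gray, lam=1.0, beta=10.0, eps=1e-6):
    """Energy of a binary sky labeling (1 = sky, 0 = non-sky).

    labels:   (H, W) integer array of 0/1
    sky_prob: (H, W) sky probability in [0, 1] (stand-in for the unary models)
    gray:     (H, W) grayscale image in [0, 1]
    """
    # Unary term: negative log-likelihood of the label chosen at each pixel.
    p = np.clip(sky_prob, eps, 1.0 - eps)
    unary = np.where(labels == 1, -np.log(p), -np.log(1.0 - p)).sum()

    # Pairwise term: penalize label changes between 4-connected neighbors,
    # with a smaller penalty where the image gradient is strong
    # (assumed exponential falloff).
    pairwise = 0.0
    for axis in (0, 1):
        grad = np.abs(np.diff(gray, axis=axis))
        differs = np.diff(labels, axis=axis) != 0
        pairwise += (np.exp(-beta * grad ** 2) * differs).sum()

    return unary + lam * pairwise
```

A lower energy means the labeling agrees with the sky probabilities and places its boundary along strong image edges. In practice the minimization itself would be handled by a graph-cut or similar solver, which this sketch does not include.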

Input Image FCN Results Our Results

Results
DeepLab [Chen et al. ICLR’15]

Sky Search
Input Image → Sky Search → Sky Image Database (415 Images)

Sky Search
Input Image → Sky Search → Reference Images
Semantic Layout Descriptor
• Account for local layouts
• Utilize semantic responses
Check Sky Properties
• Prevent large distortions
• Aspect ratio
• Resolution
• Ensure sky diversity
• Color similarity
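The distortion checks above (aspect ratio and resolution) could be sketched as a simple filter on candidate reference skies. This is only an illustration: the function name `sky_is_compatible` and both thresholds are hypothetical, and the color-similarity check for sky diversity is not covered.

```python
def sky_is_compatible(input_sky_size, ref_sky_size,
                      max_aspect_change=1.5, max_upscale=2.0):
    """Return True if a reference sky can fill the input sky region
    without large distortion.

    Sizes are (width, height) of the sky bounding boxes in pixels.
    Both thresholds are illustrative assumptions.
    """
    in_w, in_h = input_sky_size
    ref_w, ref_h = ref_sky_size

    # Aspect ratio: how much the reference must be stretched in one direction.
    in_aspect = in_w / in_h
    ref_aspect = ref_w / ref_h
    aspect_change = max(in_aspect, ref_aspect) / min(in_aspect, ref_aspect)

    # Resolution: how much the reference must be enlarged to cover the region.
    upscale = max(in_w / ref_w, in_h / ref_h)

    return aspect_change <= max_aspect_change and upscale <= max_upscale
```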

[Figure: input image and its semantic response maps (Sky, Building, Road, ...)]
Semantic Responses
• Pixel-wise responses
• Range from 0 to 1
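One common way to obtain per-pixel responses in the range 0 to 1 is a softmax over the class scores of the network. The sketch below assumes that is how the responses are produced, which the slides do not state; `semantic_responses` is a hypothetical name.

```python
import numpy as np

def semantic_responses(scores):
    """Convert (C, H, W) class scores into per-pixel responses in [0, 1].

    Assumption: responses are a softmax over the C classes, so the
    responses at each pixel sum to 1.
    """
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)
```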

Semantic Responses
Average pooling on spatial pyramids
• Global pooling
• Local contents (3x3 grids)

[Figure: pooled responses for Sky, Building, Road, ... from each region are assembled into the descriptor]
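The descriptor construction (average pooling of the semantic responses globally and over a 3x3 grid) can be sketched as follows. This is a minimal illustration, assuming the pooled values are simply concatenated; the function name `layout_descriptor` and the ordering of the entries are assumptions.

```python
import numpy as np

def layout_descriptor(responses, grid=3):
    """Build a layout descriptor from (C, H, W) semantic responses.

    Returns a 1-D array of length C * (1 + grid * grid): the global
    average per class, followed by the per-class average of each grid cell.
    Requires H >= grid and W >= grid so that no cell is empty.
    """
    _, h, w = responses.shape
    parts = [responses.mean(axis=(1, 2))]  # global pooling

    # Cell boundaries for the grid x grid spatial level.
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    for i in range(grid):
        for j in range(grid):
            cell = responses[:, rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
            parts.append(cell.mean(axis=(1, 2)))  # local pooling

    return np.concatenate(parts)
```

Two images could then be compared by a distance between their descriptors, for example the Euclidean distance, so that references with a similar semantic layout rank first. The choice of distance is also an assumption here.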

Sky Replacement
Input Image
Sky
Alignment
Semantic-aware
Transfer
Sky Alignment
• Extract complete sky regions from reference images
• Re-scale and paste on the input image
Semantic-aware Transfer
• Adjust foreground appearance
• Account for semantic regions
Reference Images
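The Sky Alignment step (extract, re-scale, paste) might look like the sketch below. It assumes the reference sky has already been cropped to a rectangular array, uses nearest-neighbor resizing to stay dependency-free, and pastes with a hard mask. The helper names `resize_nearest` and `paste_sky` are hypothetical.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, 3) image."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols[None, :]]

def paste_sky(image, sky_mask, ref_sky):
    """Replace the sky pixels of image with a rescaled reference sky.

    image:    (H, W, 3) input image
    sky_mask: (H, W) boolean mask, True on sky pixels (must not be empty)
    ref_sky:  (h, w, 3) sky region cropped from a reference image
    """
    # Bounding box of the input sky region.
    ys, xs = np.nonzero(sky_mask)
    top, bottom = ys.min(), ys.max() + 1
    left, right = xs.min(), xs.max() + 1

    # Re-scale the reference sky to the bounding box, then paste it
    # only where the mask marks sky.
    patch = resize_nearest(ref_sky, bottom - top, right - left)
    out = image.copy()
    region = sky_mask[top:bottom, left:right]
    out[top:bottom, left:right][region] = patch[region]
    return out
```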

Direct local transfer [Laffont et al. SIGGRAPH’14]
• Match corresponding semantic regions
Input image Scene parsing

Propose a soft mapping method
• Utilize semantic responses as weights Wn(x) for each category n
• Direct local transfer is the special case Wn(x) = 1 or 0
[Figure: input image, scene parsing, direct local transfer, soft mapping, with transfer functions T1(x) and T2(x)]
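A minimal sketch of soft mapping, assuming the output at each pixel is the response-weighted sum of the per-category transfer results, with the weights normalized to sum to 1. The exact normalization is an assumption, and `soft_mapping` is a hypothetical name.

```python
import numpy as np

def soft_mapping(image, responses, transfers):
    """Blend per-category transfer results using semantic responses.

    image:     (H, W) float array (one channel, e.g. luminance)
    responses: (N, H, W) semantic responses, used as weights W_n(x)
    transfers: list of N callables, T_n, each mapping an image to an image
    """
    # Normalize the weights per pixel (assumed; avoids division by zero).
    total = np.maximum(responses.sum(axis=0, keepdims=True), 1e-8)
    weights = responses / total

    out = np.zeros(image.shape, dtype=float)
    for w, transfer in zip(weights, transfers):
        out += w * transfer(image)
    return out
```

With weights of exactly 1 or 0 this reduces to direct local transfer, where each pixel receives only the transfer function of its own category. Fractional weights blend neighboring categories and soften region boundaries.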

Transfer Functions
Transfer Functions Tn (x) for each category n
• Transfer luminance and color
T1 (x)
T2 (x)
Luminance
• Shift mean
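The luminance transfer by mean shift can be sketched as below, assuming the shift is the difference between the mean luminance of the matched region in the reference and in the input. The function name `shift_mean_luminance` is hypothetical.

```python
import numpy as np

def shift_mean_luminance(lum, region, ref_lum, ref_region):
    """Shift luminance so the input region's mean matches the reference's.

    lum, ref_lum:       (H, W) luminance arrays (sizes may differ)
    region, ref_region: boolean masks of the matched semantic region
    Returns the shifted luminance for the whole input image, so the
    result can later be weighted per pixel.
    """
    offset = ref_lum[ref_region].mean() - lum[region].mean()
    return lum + offset
```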

Transfer Functions
Transfer Functions Tn (x) for each category n
• Transfer luminance and color
Color
• Matched regions: chrominance
• Histogram matching [Lee et al. CVPR’16]
• Non-matched regions: color temperature
• Consider entire foreground
• More conservative
Not all the semantic regions are matched!
T1 (x)
T2 (x)
?
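For matched regions, a basic form of histogram matching maps each source value to the reference value at the same cumulative frequency. The sketch below is plain CDF matching on one channel, offered as an illustration only. It is not the method of [Lee et al. CVPR’16], and `match_histogram` is a hypothetical name.

```python
import numpy as np

def match_histogram(source, reference):
    """Map source values so their distribution matches the reference.

    source, reference: arrays of values from one channel (e.g. one
    chrominance channel inside a matched region). Returns a 1-D array
    with one matched value per source element, in flattened order.
    """
    source = np.asarray(source).ravel()
    reference = np.asarray(reference).ravel()

    # Empirical CDFs of both value sets.
    _, src_idx, src_counts = np.unique(
        source, return_inverse=True, return_counts=True)
    ref_values, ref_counts = np.unique(reference, return_counts=True)
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source quantile, look up the reference value at that quantile.
    matched = np.interp(src_cdf, ref_cdf, ref_values)
    return matched[src_idx]
```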

Input Image Sky Replacement Results

Sky Replacement with User Preference

Sky Replacement Results Input Image
Preferred Sky

Comparisons of different search methods

Comparisons of different transfer methods

Conclusions
• Automatic sky replacement results can be realistic
• New sky image database
• Semantics helps a lot
• Sky segmentation
• Sky image search
• Appearance transfer
• Apply semantics to other tasks
• Scene completion
• Photo and video re-coloring

Summary of my Other Projects:
Visual Object Recognition

Joint Object Classification and Segmentation [BMVC’13]
• How do segmentation and classification help each other?
Class-specific Object Segmentation Hypotheses [ICCV’13]
• How to utilize exemplars to gain more information during learning and inference?
Image Retrieval [ICIP’14]
• Compute label similarities to bridge semantic gaps
Exemplar-based Object Detection [CVPR’15]
• Discover representative exemplars to build models
• Region-based feature extraction and model learning
Image (Object) Recognition
• Classification
• Segmentation
• Retrieval
• Detection

Video Object Recognition
• Object (Co-)segmentation
• Scene (Co-)parsing
Video Segmentation via Object Flow [CVPR’16]
• How do segmentation and optical flow help each other?
• Segmentation: multi-scale, spatio-temporal graphical model
• Optical flow: use segmentation to refine boundaries
• Iteratively solve the joint model
Semantic Co-segmentation in Videos (submitted to ECCV’16)
• Temporal-consistent object tracklets
• Relations between objects from a collection of videos
Ongoing and future work
• Scene Parsing via Deep CNNs
• Attention to small objects
• Label co-occurrence
• Video Scene Co-parsing
• Weakly-supervised: video tags
• Use image-based classifier

Object Segmentation
96.4 MCL, 74.4
93.3 PMCut, 59.1
94.4 MCL, 53.0
83.6 PMCut, 47.3

Object Segmentation
93.5 PMCut, 26.6
89.2 MCL, 65.3
73.8 PMCut, 58.0
86.9 PMCut, 68.0

Video Object Segmentation
Segmentation Updated Optical Flow Initial Optical Flow

Joint Object Classification and Segmentation [BMVC’13]
Object Segmentation [ICCV’13]
Image Retrieval [ICIP’14]
Object Detection [CVPR’15]
Video Object Segmentation [CVPR’16]
Sky Replacement [SIGGRAPH’16]
Semantic Co-segmentation in Videos (submitted to ECCV’16)
Video Scene Co-parsing (ongoing)
Image (Object) Recognition via Exemplars
• Classification
• Segmentation
• Retrieval
• Detection
Video Object Recognition: Temporal + CNN
• Object (Co-)segmentation
• Scene (Co-)parsing
Image/Video Editing
• Background/Object Replacement
• Scene Completion
• Re-coloring
Semantic Information
My homepage:
https://sites.google.com/site/yihsuantsai/
Thank you!
