CVPR Buzz - 2021
Each entry lists, in order: score, citations, replies, retweets, likes, poster session (the citation count is absent for some entries).

Built by Matt Deitke

CVPR Buzz displays the most discussed papers at CVPR 2021 using Twitter for indexing discussions and Semantic Scholar for collecting citation data.
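
The first number under each paper appears to be a weighted sum of its citation, reply, retweet, and like counts. The sketch below reproduces that score for the top three entries. The weights (1, 0.25, 0.5, 0.25) are an assumption inferred from the listed rows, not something the page states, and the `Paper` and `buzz_score` names are hypothetical.

```python
from dataclasses import dataclass

# Assumed weights, inferred from the rows in this listing.
# The page does not state them, so treat them as a guess.
WEIGHTS = {"citations": 1.0, "replies": 0.25, "retweets": 0.5, "likes": 0.25}


@dataclass
class Paper:
    title: str
    citations: int
    replies: int
    retweets: int
    likes: int


def buzz_score(paper: Paper) -> float:
    """Weighted sum of the four engagement counts."""
    return (
        WEIGHTS["citations"] * paper.citations
        + WEIGHTS["replies"] * paper.replies
        + WEIGHTS["retweets"] * paper.retweets
        + WEIGHTS["likes"] * paper.likes
    )


# Counts copied from the first three entries of the listing.
papers = [
    Paper("Taming Transformers for High-Resolution Image Synthesis", 30, 30, 444, 2415),
    Paper("Meta Pseudo Labels", 36, 44, 723, 3081),
    Paper("Animating Pictures With Eulerian Motion Fields", 2, 130, 686, 2948),
]

# Sort from highest to lowest score, as the page does.
ranked = sorted(papers, key=buzz_score, reverse=True)

for paper in ranked:
    print(f"{buzz_score(paper):.2f}  {paper.title}")
```

Under these assumed weights the three scores come out to 1178.75, 1114.50, and 863.25, matching the values listed for those papers.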

To add data or see how it was collected, check out the GitHub repo:

1660 results
[1]

Meta Pseudo Labels

Hieu Pham, Zihang Dai, Qizhe Xie, Quoc V. Le

We present Meta Pseudo Labels, a semi-supervised learning method that achieves a new state-of-the-art top-1 accuracy of 90.2% on ImageNet, which is 1.6% better than the existing state-of-the-art.

1178.75
36
44
723
3081
Thursday Poster Session
[2]

Animating Pictures With Eulerian Motion Fields

Aleksander Holynski, Brian L. Curless, Steven M. Seitz, Richard Szeliski

In this paper, we demonstrate a fully automatic method for converting a still image into a realistic animated looping video.

1114.50
2
130
686
2948
Tuesday Poster Session
[3]

Taming Transformers for High-Resolution Image Synthesis

Patrick Esser, Robin Rombach, Bjorn Ommer

Designed to learn long-range interactions on sequential data, transformers continue to show state-of-the-art results on a wide variety of tasks.

863.25
30
30
444
2415
Thursday Poster Session
[4]

Real-Time High-Resolution Background Matting

Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian L. Curless, Steven M. Seitz, Ira Kemelmacher-Shlizerman

We introduce a real-time, high-resolution background replacement technique which operates at 30fps in 4K resolution, and 60fps for HD on a modern GPU.

589.00
4
49
264
1763
Wednesday Poster Session
[5]

RepVGG: Making VGG-Style ConvNets Great Again

Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.

561.75
9
15
319
1558
Thursday Poster Session
[6]

Natural Adversarial Examples

Dan Hendrycks, Kevin Zhao, Steven Basart, Jacob Steinhardt, Dawn Song

We introduce two challenging datasets that reliably cause machine learning model performance to substantially degrade.

506.75
122
30
337
835
Thursday Poster Session
[7]

VirTex: Learning Visual Representations From Textual Annotations

Karan Desai, Justin Johnson

The de-facto approach to many vision tasks is to start from pretrained visual representations, typically learned via supervised training on ImageNet.

461.50
36
29
245
1183
Wednesday Poster Session
[8]

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

Ting-Chun Wang, Arun Mallya, Ming-Yu Liu

We propose a neural talking-head video synthesis model and demonstrate its application to video conferencing.

419.50
8
21
246
1133
Wednesday Poster Session
[9]

Learning Continuous Image Representation With Local Implicit Image Function

Yinbo Chen, Sifei Liu, Xiaolong Wang

How to represent an image? While the visual world is presented in a continuous manner, machines store and see the images in a discrete way with 2D arrays of pixels.

373.75
10
14
189
1063
Wednesday Poster Session
[10]

Im2Vec: Synthesizing Vector Graphics Without Vector Supervision

Pradyumna Reddy, Michael Gharbi, Michal Lukac, Niloy J. Mitra

Vector graphics are widely used to represent fonts, logos, digital artworks, and graphic designs.

358.75
3
10
188
1037
Wednesday Poster Session
[11]

Exploring Simple Siamese Representation Learning

Xinlei Chen, Kaiming He

Siamese networks have become a common structure in various recent models for unsupervised visual representation learning.

345.75
112
14
120
681
Friday Poster Session
[12]

Bottleneck Transformers for Visual Recognition

Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation.

334.00
46
8
164
816
Friday Poster Session
[13]

Involution: Inverting the Inherence of Convolution for Visual Recognition

Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision.

300.75
6
16
192
779
Thursday Poster Session
[14]

Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation

Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph

Building instance segmentation models that are data-efficient and can handle rare object categories is an important challenge in computer vision.

289.50
22
11
147
765
Tuesday Poster Session
[15]

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth

We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs.

268.50
75
8
131
504
Wednesday Poster Session
[16]

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yagiz Aksoy

Neural networks have shown great abilities in estimating depth from a single image.

247.75
5
148
690
Wednesday Poster Session
[17]

Robust Consistent Video Depth Estimation

Johannes Kopf, Xuejian Rong, Jia-Bin Huang

We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.

246.50
1
10
109
754
Monday Poster Session
[18]

NeX: Real-Time View Synthesis With Neural Basis Expansion

Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn

We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce next-level view-dependent effects--in real time.

246.25
6
23
183
572
Wednesday Poster Session
[19]

Motion Representations for Articulated Animation

Aliaksandr Siarohin, Oliver J. Woodford, Jian Ren, Menglei Chai, Sergey Tulyakov

We propose novel motion representations for animating articulated objects consisting of distinct parts.

238.75
9
139
668
Thursday Poster Session
[20]

Omnimatte: Associating Objects and Their Effects in Video

Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T. Freeman, Michael Rubinstein

Computer vision has become increasingly better at segmenting objects in images and videos; however, scene effects related to the objects -- shadows, reflections, generated smoke, etc. ...

211.50
5
123
595
Tuesday Poster Session
[21]

Closed-Form Factorization of Latent Semantics in GANs

Yujun Shen, Bolei Zhou

A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.

211.00
39
4
143
398
Monday Poster Session
[22]

Scene Essence

Jiayan Qiu, Yiding Yang, Xinchao Wang, Dacheng Tao

What scene elements, if any, are indispensable for recognizing a scene? We strive to answer this question through the lens of an end-to-end learning scheme.

209.25
32
137
531
Wednesday Poster Session
[23]

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes

Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes.

207.25
13
6
129
513
Thursday Poster Session
[24]

Back to the Feature: Learning Robust Camera Localization From Pixels To Pose

Paul-Edouard Sarlin, Ajaykumar Unagar, Mans Larsson, Hugo Germain, Carl Toft, Viktor Larsson, Marc Pollefeys, Vincent Lepetit, Lars Hammarstrand, Fredrik Kahl, Torsten Sattler

Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms.

198.25
1
18
103
565
Tuesday Poster Session
[25]

Holistic 3D Scene Understanding From a Single Image With Implicit Representation

Cheng Zhang, Zhaopeng Cui, Yinda Zhang, Bing Zeng, Marc Pollefeys, Shuaicheng Liu

We present a new pipeline for holistic 3D scene understanding from a single image, which could predict object shape, object pose and scene layout.

191.25
4
118
525
Wednesday Poster Session
[26]

Pre-Trained Image Processing Transformer

Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

As the computing power of modern hardware is increasing strongly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods.

186.75
52
2
73
391
Thursday Poster Session
[27]

Stylized Neural Painting

Zhengxia Zou, Tianyang Shi, Shuang Qiu, Yi Yuan, Zhenwei Shi

This paper proposes an image-to-painting translation method that generates vivid and realistic painting artworks with controllable styles.

183.75
1
8
89
545
Friday Poster Session
[28]

ArtEmis: Affective Language for Visual Art

Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas J. Guibas

We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language.

182.50
4
8
110
486
Thursday Poster Session
[29]

DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort

Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

We introduce DatasetGAN: an automatic procedure to generate massive datasets of high-quality semantically segmented images requiring minimal human effort.

174.00
2
5
114
455
Wednesday Poster Session
[30]

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, Tomas Pfister

We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data.

159.25
2
1
93
442
Wednesday Poster Session
[31]

DriveGAN: Towards a Controllable High-Quality Neural Simulation

Seung Wook Kim, Jonah Philion, Antonio Torralba, Sanja Fidler

Realistic simulators are critical for training and verifying robotics systems.

157.75
8
102
419
Tuesday Poster Session
[32]

Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

Zhengqi Li, Simon Niklaus, Noah Snavely, Oliver Wang

We present a method to perform novel view and time synthesis of dynamic scenes, requiring only a monocular video with known camera poses as input.

151.00
30
12
70
332
Tuesday Poster Session
[33]

GAN Prior Embedded Network for Blind Face Restoration in the Wild

Tao Yang, Peiran Ren, Xuansong Xie, Lei Zhang

Blind face restoration (BFR) from severely degraded face images in the wild is a very challenging problem.

149.75
7
90
412
Monday Poster Session
[34]

Image Generators With Conditionally-Independent Pixel Synthesis

Ivan Anokhin, Kirill Demochkin, Taras Khakhulin, Gleb Sterkin, Victor Lempitsky, Denis Korzhenkov

Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner.

143.00
8
14
56
414
Thursday Poster Session
[35]

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

Daiqing Li, Junlin Yang, Karsten Kreis, Antonio Torralba, Sanja Fidler

Training deep networks with limited labeled data while achieving a strong generalization ability is key in the quest to reduce human annotation efforts.

137.00
4
7
85
355
Wednesday Poster Session
[36]

Scaling Local Self-Attention for Parameter Efficient Visual Backbones

Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

Self-attention has the promise of improving computer vision systems due to parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to parameter-dependent scaling and content-independent interactions of convolutions.

136.50
16
2
67
346
Thursday Poster Session
[37]

Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation

Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or

We present a generic image-to-image translation framework, pixel2style2pixel (pSp).

136.25
44
7
60
242
Monday Poster Session
[38]

Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers

Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H.S. Torr, Li Zhang

Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture.

133.75
57
0
53
201
Tuesday Poster Session
[39]

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments From a Single Moving Camera

Felix Wimbauer, Nan Yang, Lukas von Stumberg, Niclas Zeller, Daniel Cremers

In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments.

130.25
1
0
59
399
Tuesday Poster Session
[40]

Information-Theoretic Segmentation by Inpainting Error Maximization

Pedro Savarese, Sunnie S. Y. Kim, Michael Maire, Greg Shakhnarovich, David McAllester

We study image segmentation from an information-theoretic perspective, proposing a novel adversarial method that performs unsupervised segmentation by partitioning images into maximally independent sets.

130.00
1
4
61
390
Tuesday Poster Session
[41]

IBRNet: Learning Multi-View Image-Based Rendering

Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P. Srinivasan, Howard Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas Funkhouser

We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.

128.00
11
3
61
343
Tuesday Poster Session
[42]

On Robustness and Transferability of Convolutional Neural Networks

Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic

Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.

126.25
18
5
70
288
Friday Poster Session
[43]

LoFTR: Detector-Free Local Feature Matching With Transformers

Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou

We present a novel method for local image feature matching.

125.50
3
3
73
341
Wednesday Poster Session
[44]

Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings

Brett D. Roads, Bradley C. Love

Advances in supervised learning approaches to object recognition flourished in part because of the availability of high-quality datasets and associated benchmarks.

119.00
1
7
92
281
Tuesday Poster Session
[45]

Shape and Material Capture at Home

Daniel Lichy, Jiaye Wu, Soumyadip Sengupta, David W. Jacobs

In this paper, we present a technique for estimating the geometry and reflectance of objects using only a camera, flashlight, and optionally a tripod.

112.75
1
9
73
292
Tuesday Poster Session
[46]

Re-Labeling ImageNet: From Single to Multi-Labels, From Global to Localized Labels

Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun

ImageNet has been the most popular image classification benchmark, but it is also the one with a significant level of label noise.

111.75
11
8
58
279
Monday Poster Session
[47]

NeuralRecon: Real-Time Coherent 3D Reconstruction From Monocular Video

Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao

We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video.

110.75
8
65
305
Friday Poster Session
[48]

Deep Animation Video Interpolation in the Wild

Li Siyao, Shiyu Zhao, Weijiang Yu, Wenxiu Sun, Dimitris Metaxas, Chen Change Loy, Ziwei Liu

In the animation industry, cartoon videos are usually produced at low frame rate since hand drawing of such frames is costly and time-consuming.

110.50
6
56
324
Tuesday Poster Session
[49]

pixelNeRF: Neural Radiance Fields From One or Few Images

Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa

We propose pixelNeRF, a learning framework that predicts a continuous neural scene representation conditioned on one or few input images.

109.25
5
59
314
Tuesday Poster Session
[50]

UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers

Zhigang Dai, Bolun Cai, Yugeng Lin, Junying Chen

Object detection with transformers (DETR) reaches competitive performance with Faster R-CNN via a transformer encoder-decoder architecture.

106.75
24
2
48
233
Monday Poster Session
[51]

Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu

The canonical approach to video-and-language learning (e.g., video question answering) dictates a neural model to learn from offline-extracted dense video features from vision models and text features from language models.

102.50
15
6
43
258
Wednesday Poster Session
[52]

Playable Video Generation

Willi Menapace, Stephane Lathuiliere, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci

This paper introduces the unsupervised learning problem of playable video generation (PVG).

102.50
2
6
68
260
Wednesday Poster Session
[53]

AutoInt: Automatic Integration for Fast Neural Volume Rendering

David B. Lindell, Julien N. P. Martel, Gordon Wetzstein

Numerical integration is a foundational technique in scientific computing and is at the core of many computer vision applications.

100.75
14
5
40
262
Thursday Poster Session
[54]

Stable View Synthesis

Gernot Riegler, Vladlen Koltun

We present Stable View Synthesis (SVS).

100.75
12
7
40
268
Thursday Poster Session
[55]

Shelf-Supervised Mesh Prediction in the Wild

Yufei Ye, Shubham Tulsiani, Abhinav Gupta

We aim to infer 3D shape and pose of objects from a single image and propose a learning-based approach that can train from unstructured image collections, using only segmentation outputs from off-the-shelf recognition systems as supervisory signal (i.e. ...

95.50
1
6
54
264
Wednesday Poster Session
[56]

Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans

Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou

This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views.

95.00
9
3
47
247
Wednesday Poster Session
[57]

Navigating the GAN Parameter Space for Semantic Image Editing

Anton Cherepkov, Andrey Voynov, Artem Babenko

Generative Adversarial Networks (GANs) are currently an indispensable tool for visual editing, being a standard component of image-to-image translation and image restoration pipelines.

93.50
1
2
54
260
Tuesday Poster Session
[58]

Skip-Convolutions for Efficient Video Processing

Amirhossein Habibian, Davide Abati, Taco S. Cohen, Babak Ehteshami Bejnordi

We propose Skip-Convolutions to leverage the large amount of redundancies in video streams and save computations.

92.25
1
38
292
Monday Poster Session
[59]

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth

Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel Brostow, Michael Firman

Self-supervised monocular depth estimation networks are trained to predict scene depth using nearby frames as a supervision signal during training.

88.75
1
5
51
244
Monday Poster Session
[60]

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

We present Modular interactive VOS (MiVOS) framework which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance.

87.75
1
2
45
255
Tuesday Poster Session
[61]

SiamMOT: Siamese Multi-Object Tracking

Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe

In this work, we focus on improving online multi-object tracking (MOT).

87.25
0
52
245
Thursday Poster Session
[62]

Self-Supervised Geometric Perception

Heng Yang, Wei Dong, Luca Carlone, Vladlen Koltun

We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations).

86.50
1
7
41
253
Thursday Poster Session
[63]

Stochastic Image-to-Video Synthesis Using cINNs

Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Bjorn Ommer

Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics: Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.

85.50
6
56
224
Tuesday Poster Session
[64]

Spatially-Adaptive Pixelwise Networks for Fast Image Translation

Tamar Rott Shaham, Michael Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation.

84.50
1
6
42
244
Thursday Poster Session
[65]

Space-Time Neural Irradiance Fields for Free-Viewpoint Video

Wenqi Xian, Jia-Bin Huang, Johannes Kopf, Changil Kim

We present a method that learns a spatiotemporal neural irradiance field for dynamic scenes from a single video.

84.25
27
5
27
170
Wednesday Poster Session
[66]

SMPLicit: Topology-Aware Generative Model for Clothed People

Enric Corona, Albert Pumarola, Guillem Alenya, Gerard Pons-Moll, Francesc Moreno-Noguer

In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry.

83.50
4
4
38
238
Thursday Poster Session
[67]

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

Yasamin Jafarian, Hyun Soo Park

A key challenge of learning the geometry of dressed humans lies in the limited availability of the ground truth data (e.g., 3D scanned models), which results in the performance degradation of 3D human reconstruction when applying to real world imagery.

82.75
5
38
250
Thursday Poster Session
[68]

Multimodal Motion Prediction With Stacked Transformers

Yicheng Liu, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, Bolei Zhou

Predicting multiple plausible future trajectories of the nearby vehicles is crucial for the safety of autonomous driving.

82.25
3
2
35
245
Wednesday Poster Session
[69]

Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction

Aljaz Bozic, Pablo Palafox, Michael Zollhofer, Justus Thies, Angela Dai, Matthias Niessner

We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

78.00
2
1
49
205
Monday Poster Session
[70]

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Aditya Prakash, Kashyap Chitta, Andreas Geiger

How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting.

77.75
1
4
54
195
Tuesday Poster Session
[71]

Line Segment Detection Using Transformers Without Edges

Yifan Xu, Weijian Xu, David Cheung, Zhuowen Tu

In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that is post-processing and heuristics-guided intermediate processing (edge/junction/region detection) free.

77.25
4
0
39
215
Tuesday Poster Session
[72]

Training Generative Adversarial Networks in One Stage

Chengchao Shen, Youtan Yin, Xinchao Wang, Xubin Li, Jie Song, Mingli Song

Generative Adversarial Networks (GANs) have demonstrated unprecedented success in various image generation tasks.

76.75
3
35
234
Tuesday Poster Session
[73]

GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields

Michael Niemeyer, Andreas Geiger

Deep generative models allow for photorealistic image synthesis at high resolutions.

76.25
20
2
43
137
Thursday Poster Session
[74]

Spatiotemporal Contrastive Video Representation Learning

Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui

We present a self-supervised Contrastive Video Representation Learning (CVRL) method to learn spatiotemporal visual representations from unlabeled videos.

76.25
32
4
31
111
Tuesday Poster Session
[75]

Transformer Interpretability Beyond Attention Visualization

Hila Chefer, Shir Gur, Lior Wolf

Self-attention techniques, and specifically Transformers, are dominating the field of text processing and are becoming increasingly popular in computer vision classification tasks.

75.75
12
3
37
178
Monday Poster Session
[76]

SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation

Brendan Duke, Abdalla Ahmed, Christian Wolf, Parham Aarabi, Graham W. Taylor

In this paper we introduce a Transformer-based approach to video object segmentation (VOS).

75.75
6
3
42
192
Tuesday Poster Session
[77]

Positional Encoding As Spatial Inductive Bias in GANs

Rui Xu, Xintao Wang, Kai Chen, Bolei Zhou, Chen Change Loy

SinGAN shows impressive capability in learning internal patch distribution despite its limited effective receptive field.

74.75
6
1
27
220
Thursday Poster Session
[78]

Probabilistic Embeddings for Cross-Modal Retrieval

Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus

Cross-modal retrieval methods build a common representation space for samples from multiple modalities, typically from the vision and the language domains.

74.25
2
5
30
224
Wednesday Poster Session
[79]

Dual Contradistinctive Generative Autoencoder

Gaurav Parmar, Dacheng Li, Kwonjoon Lee, Zhuowen Tu

We present a new generative autoencoder model with dual contradistinctive losses to improve generative autoencoder that performs simultaneous inference (reconstruction) and synthesis (sampling).

74.25
5
4
31
211
Monday Poster Session
[80]

D-NeRF: Neural Radiance Fields for Dynamic Scenes

Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer

Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images.

74.25
34
2
19
121
Wednesday Poster Session
[81]

On Feature Normalization and Data Augmentation

Boyi Li, Felix Wu, Ser-Nam Lim, Serge Belongie, Kilian Q. Weinberger

The moments (a.k.a., mean and standard deviation) of latent features are often removed as noise when training image recognition models, to increase stability and reduce training time.

73.50
22
3
36
131
Thursday Poster Session
[82]

Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction

Guy Gafni, Justus Thies, Michael Zollhofer, Matthias Niessner

We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face.

73.00
14
1
39
157
Wednesday Poster Session
[83]

You Only Look One-Level Feature

Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun

This paper revisits feature pyramids networks (FPN) for one-stage detectors and points out that the success of FPN is due to its divide-and-conquer solution to the optimization problem in object detection rather than multi-scale feature fusion.

72.25
2
40
207
Thursday Poster Session
[84]

Metadata Normalization

Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli

Batch Normalization (BN) and its variants have delivered tremendous success in combating the covariate shift induced by the training step of deep learning methods.

72.00
15
54
165
Wednesday Poster Session
[85]

End-to-End Video Instance Segmentation With Transformers

Yuqing Wang, Zhaoliang Xu, Xinlong Wang, Chunhua Shen, Baoshan Cheng, Hao Shen, Huaxia Xia

Video instance segmentation (VIS) is the task that requires simultaneously classifying, segmenting and tracking object instances of interest in video.

72.00
44
0
13
86
Wednesday Poster Session
[86]

Repurposing GANs for One-Shot Semantic Part Segmentation

Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn

While GANs have shown success in realistic image generation, the idea of using GANs for other tasks unrelated to synthesis is underexplored.

70.50
3
2
33
202
Tuesday Poster Session
[87]

Neural Lumigraph Rendering

Petr Kellnhofer, Lars C. Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, Gordon Wetzstein

Novel view synthesis is a challenging and ill-posed inverse rendering problem.

69.75
2
2
35
199
Tuesday Poster Session
[88]

Exploiting Spatial Dimensions of Latent in GAN for Real-Time Image Editing

Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh

Generative adversarial networks (GANs) synthesize realistic images from random latent vectors.

69.50
4
44
186
Monday Poster Session
[89]

Passive Inter-Photon Imaging

Atul Ingle, Trevor Seets, Mauro Buttafava, Shantanu Gupta, Alberto Tosi, Mohit Gupta, Andreas Velten

Digital camera pixels measure image intensities by converting incident light energy into an analog electrical current, and then digitizing it into a fixed-width binary representation.

69.00
1
52
171
Wednesday Poster Session
[90]

Plan2Scene: Converting Floorplans to 3D Scenes

Madhawa Vidanapathirana, Qirui Wu, Yasutaka Furukawa, Angel X. Chang, Manolis Savva

We address the task of converting a floorplan and a set of associated photos of a residence into a textured 3D mesh model, a task which we call Plan2Scene.

68.00
6
40
186
Wednesday Poster Session
[91]

Task Programming: Learning Data Efficient Behavior Representations

Jennifer J. Sun, Ann Kennedy, Eric Zhan, David J. Anderson, Yisong Yue, Pietro Perona

Specialized domain knowledge is often necessary to accurately annotate training sets for in-depth analysis, but can be burdensome and time-consuming to acquire from domain experts.

67.75
2
6
49
159
Tuesday Poster Session
[92]

Deep Occlusion-Aware Instance Segmentation With Overlapping BiLayers

Lei Ke, Yu-Wing Tai, Chi-Keung Tang

Segmenting highly-overlapping objects is challenging, because typically no distinction is made between real object contours and occlusion boundaries.

66.25
1
4
26
205
Tuesday Poster Session
[93]

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks

Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler

Impressive progress in 3D shape extraction led to representations that can capture object geometries with high fidelity. [Expand]

65.75
1
2
44
169
Tuesday Poster Session
[94]

Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging

Alvaro Parra, Shin-Fang Chng, Tat-Jun Chin, Anders Eriksson, Ian Reid

Under mild conditions on the noise level of the measurements, rotation averaging satisfies strong duality, which enables global solutions to be obtained via semidefinite programming (SDP) relaxation. [Expand]

65.00
2
25
208
Tuesday Poster Session
[95]

MOS: Towards Scaling Out-of-Distribution Detection for Large Semantic Space

Rui Huang, Yixuan Li

Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying machine learning models in the real world. [Expand]

64.50
3
28
199
Wednesday Poster Session
[96]

Anycost GANs for Interactive Image Synthesis and Editing

Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

Generative adversarial networks (GANs) have enabled photorealistic image synthesis and editing. [Expand]

64.00
2
6
41
160
Thursday Poster Session
[97]

Generative Hierarchical Features From Synthesizing Images

Yinghao Xu, Yujun Shen, Jiapeng Zhu, Ceyuan Yang, Bolei Zhou

Generative Adversarial Networks (GANs) have recently advanced image synthesis by learning the underlying distribution of the observed data. [Expand]

63.25
7
2
37
149
Tuesday Poster Session
[98]

NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces

Miguel Jaques, Michael Burke, Timothy M. Hospedales

Learning low-dimensional latent state space dynamics models has proven powerful for enabling vision-based planning and learning for control. [Expand]

63.00
1
6
42
158
Tuesday Poster Session
[99]

DetectoRS: Detecting Objects With Recursive Feature Pyramid and Switchable Atrous Convolution

Siyuan Qiao, Liang-Chieh Chen, Alan Yuille

Many modern object detectors demonstrate outstanding performances by using the mechanism of looking and thinking twice. [Expand]

62.00
50
2
8
30
Wednesday Poster Session
[100]

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

Zongze Wu, Dani Lischinski, Eli Shechtman

We explore and analyze the latent style space of StyleGAN2, a state-of-the-art architecture for image generation, using models pretrained on several different datasets. [Expand]

61.75
15
3
28
128
Thursday Poster Session
[101]

MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers

Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

We present MaX-DeepLab, the first end-to-end model for panoptic segmentation. [Expand]

61.25
28
1
20
92
Tuesday Poster Session
[102]

Student-Teacher Learning From Clean Inputs to Noisy Inputs

Guanzhe Hong, Zhiyuan Mao, Xiaojun Lin, Stanley H. Chan

Feature-based student-teacher learning, a training method that encourages the student's hidden features to mimic those of the teacher network, is empirically successful in transferring the knowledge from a pre-trained teacher network to the student network. [Expand]

60.00
1
29
181
Thursday Poster Session
[103]

Scaled-YOLOv4: Scaling Cross Stage Partial Network

Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao

We show that the YOLOv4 object detection neural network based on the CSP approach scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy. [Expand]

60.00
24
4
27
86
Thursday Poster Session
[104]

NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis

Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, Jonathan T. Barron

We present a method that takes as input a set of images of a scene illuminated by unconstrained known lighting, and produces as output a 3D representation that can be rendered from novel viewpoints under arbitrary lighting conditions. [Expand]

59.75
18
2
18
129
Wednesday Poster Session
[105]

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

Yuan-Hong Liao, Amlan Kar, Sanja Fidler

Data is the engine of modern computer vision, which necessitates collecting large-scale datasets. [Expand]

58.75
4
34
163
Tuesday Poster Session
[106]

Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder

Tal Daniel, Aviv Tamar

The recently introduced introspective variational autoencoder (IntroVAE) exhibits outstanding image generations, and allows for amortized inference using an image encoder. [Expand]

58.50
5
33
163
Tuesday Poster Session
[107]

Regularizing Generative Adversarial Networks Under Limited Data

Hung-Yu Tseng, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang

Recent years have witnessed the rapid progress of generative adversarial networks (GANs). [Expand]

56.00
3
39
143
Wednesday Poster Session
[108]

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes

Dmytro Kotovenko, Matthias Wright, Arthur Heimbrecht, Bjorn Ommer

There have been many successful implementations of neural style transfer in recent years. [Expand]

54.25
1
4
20
169
Thursday Poster Session
[109]

Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images

Florian Kluger, Hanno Ackermann, Eric Brachmann, Michael Ying Yang, Bodo Rosenhahn

Humans perceive and construct the surrounding world as an arrangement of simple parametric models. [Expand]

54.00
3
28
157
Thursday Poster Session
[110]

SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation

Sanghyun Son, Kyoung Mu Lee

Deep CNNs have achieved significant successes in image processing and its applications, including single image super-resolution (SR). [Expand]

52.00
2
30
146
Wednesday Poster Session
[111]

Learning To Recover 3D Scene Shape From a Single Image

Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape due to an unknown depth shift induced by shift-invariant reconstruction losses used in mixed-data depth prediction training, and possible unknown camera focal length. [Expand]

51.75
2
3
30
136
Monday Poster Session
[112]

On Self-Contact and Human Pose

Lea Muller, Ahmed A. A. Osman, Siyu Tang, Chun-Hao P. Huang, Michael J. Black

People touch their face 23 times an hour, they cross their arms and legs, put their hands on their hips, etc. [Expand]

51.25
2
5
16
160
Wednesday Poster Session
[113]

Cross-Modal Contrastive Learning for Text-to-Image Generation

Han Zhang, Jing Yu Koh, Jason Baldridge, Honglak Lee, Yinfei Yang

The output of text-to-image synthesis systems should be coherent, clear, photo-realistic scenes with high semantic fidelity to their conditioned text descriptions. [Expand]

51.25
5
3
37
108
Monday Poster Session
[114]

HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms

Mahmoud Afifi, Marcus A. Brubaker, Michael S. Brown

While generative adversarial networks (GANs) can successfully produce high-quality images, they can be challenging to control. [Expand]

51.00
1
3
28
141
Wednesday Poster Session
[115]

Single Image Depth Prediction With Wavelet Decomposition

Michael Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit, Daniyar Turmukhambetov

We present a novel method for predicting accurate depths from monocular images with high efficiency. [Expand]

50.75
2
27
147
Wednesday Poster Session
[116]

Birds of a Feather: Capturing Avian Shape Models From Images

Yufu Wang, Nikos Kolotouros, Kostas Daniilidis, Marc Badger

Animals are diverse in shape, but building a deformable shape model for a new species is not always possible due to the lack of 3D data. [Expand]

50.00
7
29
135
Thursday Poster Session
[117]

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

Bin Yan, Houwen Peng, Kan Wu, Dong Wang, Jianlong Fu, Huchuan Lu

Object tracking has achieved significant progress over the past few years. [Expand]

49.00
0
33
130
Thursday Poster Session
[118]

Body2Hands: Learning To Infer 3D Hands From Conversational Gesture Body Dynamics

Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo

We propose a novel learned deep prior of body motion for 3D hand shape synthesis and estimation in the domain of conversational gestures. [Expand]

48.50
2
2
28
128
Thursday Poster Session
[119]

OSTeC: One-Shot Texture Completion

Baris Gecer, Jiankang Deng, Stefanos Zafeiriou

The last few years have witnessed the great success of non-linear generative models in synthesizing high-quality photorealistic face images. [Expand]

48.00
2
0
25
134
Wednesday Poster Session
[120]

Towards Open World Object Detection

K J Joseph, Salman Khan, Fahad Shahbaz Khan, Vineeth N Balasubramanian

Humans have a natural instinct to identify unknown object instances in their environments. [Expand]

48.00
2
8
28
120
Tuesday Poster Session
[121]

MoViNets: Mobile Video Networks for Efficient Video Recognition

Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference. [Expand]

47.50
1
4
26
130
Friday Poster Session
[122]

Pulsar: Efficient Sphere-Based Neural Rendering

Christoph Lassner, Michael Zollhofer

We propose Pulsar, an efficient sphere-based differentiable rendering module that is orders of magnitude faster than competing techniques, modular, and easy-to-use due to its tight integration with PyTorch. [Expand]

47.50
2
31
126
Monday Poster Session
[123]

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

Ning Wang, Wengang Zhou, Jie Wang, Houqiang Li

In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers. [Expand]

47.50
2
1
24
133
Monday Poster Session
[124]

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation

Kehong Gong, Jianfeng Zhang, Jiashi Feng

Existing 3D human pose estimators suffer from poor generalization to new datasets, largely due to the limited diversity of 2D-3D pose pairs in the training data. [Expand]

47.25
1
2
22
139
Wednesday Poster Session
[125]

Large-Scale Localization Datasets in Crowded Indoor Spaces

Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guerin, Gabriela Csurka, Martin Humenberger

Estimating the precise location of a camera using visual localization enables interesting applications such as augmented reality or robot navigation. [Expand]

47.00
4
31
122
Tuesday Poster Session
[126]

Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors

Tao Yu, Zerong Zheng, Kaiwen Guo, Pengpeng Liu, Qionghai Dai, Yebin Liu

Human volumetric capture is a long-standing topic in computer vision and computer graphics. [Expand]

46.75
1
1
30
122
Tuesday Poster Session
[127]

See Through Gradients: Image Batch Recovery via GradInversion

Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

Training deep neural networks requires gradient estimation from data batches to update parameters. [Expand]

46.50
2
2
29
118
Friday Poster Session
[128]

Unsupervised Learning of 3D Object Categories From Videos in the Wild

Philipp Henzler, Jeremy Reizenstein, Patrick Labatut, Roman Shapovalov, Tobias Ritschel, Andrea Vedaldi, David Novotny

Recently, numerous works have attempted to learn 3D reconstructors of textured 3D models of visual categories given a training set of annotated static images of objects. [Expand]

46.25
1
1
25
130
Tuesday Poster Session
[129]

Reconstructing 3D Human Pose by Watching Humans in the Mirror

Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror. [Expand]

46.00
3
28
125
Thursday Poster Session
[130]

LipSync3D: Data-Efficient Learning of Personalized 3D Talking Faces From Video Using Pose and Lighting Normalization

Avisek Lahiri, Vivek Kwatra, Christian Frueh, John Lewis, Chris Bregler

In this paper, we present a video-based learning framework for animating personalized 3D talking faces from audio. [Expand]

46.00
7
22
133
Monday Poster Session
[131]

Permute, Quantize, and Fine-Tune: Efficient Compression of Neural Networks

Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Barsan, Wenyuan Zeng, Raquel Urtasun

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. [Expand]

45.25
1
34
112
Friday Poster Session
[132]

Blur, Noise, and Compression Robust Generative Adversarial Networks

Takuhiro Kaneko, Tatsuya Harada

Generative adversarial networks (GANs) have gained considerable attention owing to their ability to reproduce images. [Expand]

45.00
1
6
30
110
Thursday Poster Session
[133]

Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation

Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu

While accurate lip synchronization has been achieved for arbitrary-subject audio-driven talking face generation, the problem of how to efficiently drive the head pose remains. [Expand]

44.75
4
2
26
109
Tuesday Poster Session
[134]

NeRD: Neural 3D Reflection Symmetry Detector

Yichao Zhou, Shichen Liu, Yi Ma

Recent advances have shown that symmetry, a structural prior that most objects exhibit, can support a variety of single-view 3D understanding tasks. [Expand]

44.50
3
27
121
Friday Poster Session
[135]

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations

Adel Ahmadyan, Liangkai Zhang, Artsiom Ablavatski, Jianing Wei, Matthias Grundmann

3D object detection has recently become popular due to many applications in robotics, augmented reality, autonomy, and image retrieval. [Expand]

44.00
5
0
20
116
Wednesday Poster Session
[136]

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He

We present a large-scale study on unsupervised spatiotemporal representation learning from videos. [Expand]

44.00
2
3
21
123
Tuesday Poster Session
[137]

Localizing Visual Sounds the Hard Way

Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman

The objective of this work is to localize sound sources that are visible in a video without using manual annotations. [Expand]

43.75
1
2
23
123
Friday Poster Session
[138]

Learning Optical Flow From Still Images

Filippo Aleotti, Matteo Poggi, Stefano Mattoccia

This paper deals with the scarcity of data for training optical flow networks, highlighting the limitations of existing sources such as labeled synthetic datasets or unlabeled real videos. [Expand]

43.50
1
26
121
Thursday Poster Session
[139]

Three Ways To Improve Semantic Segmentation With Self-Supervised Depth Estimation

Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Koring, Suman Saha, Luc Van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process. [Expand]

43.50
2
2
19
126
Wednesday Poster Session
[140]

VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency

Ruohan Gao, Kristen Grauman

We introduce a new approach for audio-visual speech separation. [Expand]

42.75
7
1
24
94
Thursday Poster Session
[141]

Counterfactual VQA: A Cause-Effect Look at Language Bias

Yulei Niu, Kaihua Tang, Hanwang Zhang, Zhiwu Lu, Xian-Sheng Hua, Ji-Rong Wen

Recent VQA models may tend to rely on language bias as a shortcut and thus fail to sufficiently learn the multi-modal knowledge from both vision and language. [Expand]

42.50
24
2
11
50
Thursday Poster Session
[142]

VDSM: Unsupervised Video Disentanglement With State-Space Modeling and Deep Mixtures of Experts

Matthew J. Vowels, Necati Cihan Camgoz, Richard Bowden

Disentangled representations support a range of downstream tasks including causal reasoning, generative modeling, and fair machine learning. [Expand]

42.00
0
18
132
Wednesday Poster Session
[143]

BoxInst: High-Performance Instance Segmentation With Box Annotations

Zhi Tian, Chunhua Shen, Xinlong Wang, Hao Chen

We present a high-performance method that can achieve mask-level instance segmentation with only bounding-box annotations for training. [Expand]

41.50
4
0
24
102
Tuesday Poster Session
[144]

On Semantic Similarity in Video Retrieval

Michael Wray, Hazel Doughty, Dima Damen

Current video retrieval efforts all found their evaluation on an instance-based assumption, that only a single caption is relevant to a query video and vice versa. [Expand]

41.25
3
21
120
Tuesday Poster Session
[145]

Pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein

We have witnessed rapid progress on 3D-aware image synthesis, leveraging recent advances in generative visual models and neural rendering. [Expand]

41.00
14
1
13
81
Tuesday Poster Session
[146]

Fast and Accurate Model Scaling

Piotr Dollar, Mannat Singh, Ross Girshick

In this work we analyze strategies for convolutional neural network scaling; that is, the process of scaling a base convolutional network to endow it with greater computational complexity and consequently representational power. [Expand]

41.00
3
3
19
111
Monday Poster Session
[147]

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

Objects moving at high speed appear significantly blurred when captured with cameras. [Expand]

41.00
3
4
12
124
Tuesday Poster Session
[148]

Few-Shot Transformation of Common Actions Into Time and Space

Pengwan Yang, Pascal Mettes, Cees G. M. Snoek

This paper introduces the task of few-shot common action localization in time and space. [Expand]

41.00
1
23
117
Friday Poster Session
[149]

Understanding Failures of Deep Networks via Robust Feature Extraction

Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz

Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances. [Expand]

40.25
2
25
109
Thursday Poster Session
[150]

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

Vitor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner

We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. [Expand]

40.00
2
1
21
109
Wednesday Poster Session
[151]

High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network

Jie Liang, Hui Zeng, Lei Zhang

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps. [Expand]

40.00
4
24
108
Wednesday Poster Session
[152]

User-Guided Line Art Flat Filling With Split Filling Mechanism

Lvmin Zhang, Chengze Li, Edgar Simo-Serra, Yi Ji, Tien-Tsin Wong, Chunping Liu

Flat filling is a critical step in digital artistic content creation with the objective of filling line arts with flat colors. [Expand]

39.75
0
26
107
Wednesday Poster Session
[153]

Neural Scene Graphs for Dynamic Scenes

Julian Ost, Fahim Mannan, Nils Thuerey, Julian Knodt, Felix Heide

Recent implicit neural rendering methods have demonstrated that it is possible to learn accurate view synthesis for complex scenes by predicting their volumetric density and color supervised solely by a set of RGB images. [Expand]

39.50
8
2
20
84
Tuesday Poster Session
[154]

LASR: Learning Articulated Shape Reconstruction From a Monocular Video

Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. [Expand]

39.50
0
17
124
Friday Poster Session
[155]

Representation Learning via Global Temporal Alignment and Cycle-Consistency

Isma Hadji, Konstantinos G. Derpanis, Allan D. Jepson

We introduce a weakly supervised method for representation learning based on aligning temporal sequences (e.g., videos) of the same process (e.g., human action). [Expand]

39.25
2
25
105
Wednesday Poster Session
[156]

A Sliced Wasserstein Loss for Neural Texture Synthesis

Eric Heitz, Kenneth Vanhoey, Thomas Chambon, Laurent Belcour

We address the problem of computing a textural loss based on the statistics extracted from the feature activations of a convolutional neural network optimized for object recognition (e.g. [Expand]

39.25
6
34
83
Wednesday Poster Session
[157]

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution

Kelvin C.K. Chan, Xintao Wang, Xiangyu Xu, Jinwei Gu, Chen Change Loy

We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR). [Expand]

38.75
4
2
14
109
Thursday Poster Session
[158]

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

Tomas Jakab, Richard Tucker, Ameesh Makadia, Jiajun Wu, Noah Snavely, Angjoo Kanazawa

We introduce KeypointDeformer, a novel unsupervised method for shape control through automatically discovered 3D keypoints. [Expand]

38.50
3
27
97
Thursday Poster Session
[159]

High-Fidelity Face Tracking for AR/VR via Deep Lighting Adaptation

Lele Chen, Chen Cao, Fernando De la Torre, Jason Saragih, Chenliang Xu, Yaser Sheikh

3D video avatars can empower virtual communications by providing compression, privacy, entertainment, and a sense of presence in AR/VR. [Expand]

37.50
1
21
107
Thursday Poster Session
[160]

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

Stephen Hausler, Sourav Garg, Ming Xu, Michael Milford, Tobias Fischer

Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an always changing world. [Expand]

37.50
3
3
21
93
Thursday Poster Session
[161]

Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts

Ji Hou, Benjamin Graham, Matthias Niessner, Saining Xie

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. [Expand]

37.50
4
0
26
82
Friday Poster Session
[162]

Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning

Zhicheng Huang, Zhaoyang Zeng, Yupan Huang, Bei Liu, Dongmei Fu, Jianlong Fu

We study joint learning of Convolutional Neural Network (CNN) and Transformer for vision-language pre-training (VLPT), which aims to learn cross-modal alignments from millions of image-text pairs. [Expand]

37.25
0
24
101
Thursday Poster Session
[163]

CASTing Your Model: Learning To Localize Improves Self-Supervised Representations

Ramprasaath R. Selvaraju, Karan Desai, Justin Johnson, Nikhil Naik

Recent advances in self-supervised learning (SSL) have largely closed the gap with supervised ImageNet pretraining. [Expand]

37.25
2
2
18
103
Wednesday Poster Session
[164]

MobileDets: Searching for Object Detection Architectures for Mobile Accelerators

Yunyang Xiong, Hanxiao Liu, Suyog Gupta, Berkin Akin, Gabriel Bender, Yongzhe Wang, Pieter-Jan Kindermans, Mingxing Tan, Vikas Singh, Bo Chen

Inverted bottleneck layers, which are built upon depthwise convolutions, have been the predominant building blocks in state-of-the-art object detection models on mobile devices. [Expand]

37.25
12
4
23
51
Tuesday Poster Session
[165]

Rethinking Channel Dimensions for Efficient Model Design

Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo

Designing an efficient model within the limited computational cost is challenging. [Expand]

37.00
4
25
94
Monday Poster Session
[166]

ManipulaTHOR: A Framework for Visual Object Manipulation

Kiana Ehsani, Winson Han, Alvaro Herrasti, Eli VanderBilt, Luca Weihs, Eric Kolve, Aniruddha Kembhavi, Roozbeh Mottaghi

The domain of Embodied AI has recently witnessed substantial progress, particularly in navigating agents within their environments. [Expand]

36.75
2
22
101
Tuesday Poster Session
[167]

Efficient Initial Pose-Graph Generation for Global SfM

Daniel Barath, Dmytro Mishkin, Ivan Eichhardt, Ilia Shipachev, Jiri Matas

We propose ways to speed up the initial pose-graph generation for global Structure-from-Motion algorithms. [Expand]

36.50
1
4
20
98
Thursday Poster Session
[168]

Efficient Conditional GAN Transfer With Knowledge Propagation Across Classes

Mohamad Shahbazi, Zhiwu Huang, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool

Generative adversarial networks (GANs) have shown impressive results in both unconditional and conditional image generation. [Expand]

36.50
6
18
104
Thursday Poster Session
[169]

End-to-End Object Detection With Fully Convolutional Network

Jianfeng Wang, Lin Song, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng

Mainstream object detectors based on the fully convolutional network have achieved impressive performance. [Expand]

36.50
9
2
7
94
Friday Poster Session
[170]

Co-Attention for Conditioned Image Matching

Olivia Wiles, Sebastien Ehrhardt, Andrew Zisserman

We propose a new approach to determine correspondences between image pairs in the wild under large changes in illumination, viewpoint, context, and material. [Expand]

36.25
4
28
85
Friday Poster Session
[171]

Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans

Zhenyu Chen, Ali Gholami, Matthias Niessner, Angel X. Chang

We introduce the new task of dense captioning in RGB-D scans. [Expand]

36.00
2
3
24
85
Tuesday Poster Session
[172]

Learning Monocular 3D Reconstruction of Articulated Categories From Motion

Filippos Kokkinos, Iasonas Kokkinos

Monocular 3D reconstruction of articulated object categories is challenging due to the lack of training data and the inherent ill-posedness of the problem. [Expand]

36.00
1
1
18
103
Monday Poster Session
[173]

Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation

Nikita Araslanov, Stefan Roth

We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate. [Expand]

35.75
0
18
107
Thursday Poster Session
[174]

Learned Initializations for Optimizing Coordinate-Based Neural Representations

Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng

Coordinate-based neural representations have shown significant promise as an alternative to discrete, array-based representations for complex low dimensional signals. [Expand]

35.75
16
1
3
72
Tuesday Poster Session
[175]

Audio-Visual Instance Discrimination with Cross-Modal Agreement

Pedro Morgado, Nuno Vasconcelos, Ishan Misra

We present a self-supervised learning approach to learn audio-visual representations from video and audio. [Expand]

35.50
35
2
0
0
Thursday Poster Session
[176]

We Are More Than Our Joints: Predicting How 3D Bodies Move

Yan Zhang, Michael J. Black, Siyu Tang

A key step towards understanding human behavior is the prediction of 3D human motion. [Expand]

35.50
4
0
22
82
Tuesday Poster Session
[177]

Rethinking and Improving the Robustness of Image Style Transfer

Pei Wang, Yijun Li, Nuno Vasconcelos

Extensive research in neural style transfer methods has shown that the correlation between features extracted by a pre-trained VGG network has remarkable ability to capture the visual style of an image. [Expand]

35.00
1
2
12
110
Monday Poster Session
[178]

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun

Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving. [Expand]

34.50
0
18
102
Wednesday Poster Session
[179]

Robust and Accurate Object Detection via Adversarial Learning

Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong

Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection. [Expand]

34.50
3
1
18
89
Friday Poster Session
[180]

Self-Supervised Multi-Frame Monocular Scene Flow

Junhwa Hur, Stefan Roth

Estimating 3D scene flow from a sequence of monocular images has been gaining increased attention due to the simple, economical capture setup. [Expand]

34.50
1
15
107
Monday Poster Session
[181]

PPR10K: A Large-Scale Portrait Photo Retouching Dataset With Human-Region Mask and Group-Level Consistency

Jie Liang, Hui Zeng, Miaomiao Cui, Xuansong Xie, Lei Zhang

Different from general photo retouching tasks, portrait photo retouching (PPR), which aims to enhance the visual quality of a collection of flat-looking portrait photos, has its special and practical requirements such as human-region priority (HRP) and group-level consistency (GLC). [Expand]

34.50
4
22
90
Monday Poster Session
[182]

Causal Attention for Vision-Language Tasks

Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai

We present a novel attention mechanism: Causal Attention (CATT), to remove the ever-elusive confounding effect in existing attention-based vision-language models. [Expand]

34.50
1
0
23
88
Wednesday Poster Session
[183]

VinVL: Revisiting Visual Representations in Vision-Language Models

Pengchuan Zhang, Xiujun Li, Xiaowei Hu, Jianwei Yang, Lei Zhang, Lijuan Wang, Yejin Choi, Jianfeng Gao

This paper presents a detailed study of improving vision features and develops an improved object detection model for vision language (VL) tasks. [Expand]

33.75
2
19
95
Tuesday Poster Session
[184]

Fast End-to-End Learning on Protein Surfaces

Freyr Sverrisson, Jean Feydy, Bruno E. Correia, Michael M. Bronstein

Proteins' biological functions are defined by the geometric and chemical structure of their 3D molecular surfaces. [Expand]

33.50
2
3
21
81
Thursday Poster Session
[185]

Multimodal Contrastive Training for Visual Representation Learning

Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. [Expand]

33.00
3
19
91
Tuesday Poster Session
[186]

DeRF: Decomposed Radiance Fields

Daniel Rebain, Wei Jiang, Soroosh Yazdani, Ke Li, Kwang Moo Yi, Andrea Tagliasacchi

With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. [Expand]

32.75
15
1
13
44
Thursday Poster Session
[187]

Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes

Jiashun Wang, Huazhe Xu, Jingwei Xu, Sifei Liu, Xiaolong Wang

Synthesizing 3D human motion plays an important role in many graphics applications as well as understanding human activity. [Expand]

32.75
1
2
15
95
Wednesday Poster Session
[188]

TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text

Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner

A crucial component for the scene text based reasoning required for TextVQA and TextCaps datasets involves detecting and recognizing text present in the images using an optical character recognition (OCR) system. [Expand]

32.50
2
1
19
83
Wednesday Poster Session
[189]

Knowledge Evolution in Neural Networks

Ahmed Taha, Abhinav Shrivastava, Larry S. Davis

Deep learning relies on the availability of a large corpus of data (labeled or unlabeled). [Expand]

32.50
1
43
43
Thursday Poster Session
[190]

Rotation-Only Bundle Adjustment

Seong Hun Lee, Javier Civera

We propose a novel method for estimating the global rotations of the cameras independently of their positions and the scene structure. [Expand]

32.00
1
6
23
72
Monday Poster Session
[191]

Learning To Count Everything

Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai

Existing works on visual counting primarily focus on one specific category at a time, such as people, animals, and cells. [Expand]

32.00
3
13
99
Tuesday Poster Session
[192]

Exploiting Aliasing for Manga Restoration

Minshan Xie, Menghan Xia, Tien-Tsin Wong

As a popular entertainment art form, manga enriches the line drawings details with bitonal screentones. [Expand]

32.00
3
18
89
Thursday Poster Session
[193]

Quantum Permutation Synchronization

Tolga Birdal, Vladislav Golyanik, Christian Theobalt, Leonidas J. Guibas

We present QuantumSync, the first quantum algorithm for solving a synchronization problem in the context of computer vision. [Expand]

31.75
3
4
11
89
Thursday Poster Session
[194]

Sparse R-CNN: End-to-End Object Detection With Learnable Proposals

Peize Sun, Rufeng Zhang, Yi Jiang, Tao Kong, Chenfeng Xu, Wei Zhan, Masayoshi Tomizuka, Lei Li, Zehuan Yuan, Changhu Wang, Ping Luo

We present Sparse R-CNN, a purely sparse method for object detection in images. [Expand]

31.50
18
1
8
37
Thursday Poster Session
[195]

Temporal-Relational CrossTransformers for Few-Shot Action Recognition

Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. [Expand]

31.25
2
16
91
Monday Poster Session
[196]

VS-Net: Voting With Segmentation for Visual Localization

Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li

Visual localization is of great importance in robotics and computer vision. [Expand]

31.00
2
17
88
Tuesday Poster Session
[197]

STaR: Self-Supervised Tracking and Reconstruction of Rigid Objects in Motion With Neural Rendering

Wentao Yuan, Zhaoyang Lv, Tanner Schmidt, Steven Lovegrove

We present STaR, a novel method that performs Self-supervised Tracking and Reconstruction of dynamic scenes with rigid motion from multi-view RGB videos without any manual annotation. [Expand]

31.00
5
2
16
70
Thursday Poster Session
[198]

Learning Multi-Scale Photo Exposure Correction

Mahmoud Afifi, Konstantinos G. Derpanis, Bjorn Ommer, Michael S. Brown

Capturing photographs with wrong exposures remains a major source of errors in camera-based imaging. [Expand]

30.25
3
22
74
Wednesday Poster Session
[199]

Weakly Supervised Learning of Rigid 3D Scene Flow

Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal

We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. [Expand]

30.25
1
1
18
80
Tuesday Poster Session
[200]

Monte Carlo Scene Search for 3D Scene Understanding

Shreyas Hampali, Sinisa Stekovic, Sayan Deb Sarkar, Chetan S. Kumar, Friedrich Fraundorfer, Vincent Lepetit

We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data. [Expand]

30.25
1
2
16
83
Thursday Poster Session
[201]

Robust Reference-Based Super-Resolution via C2-Matching

Yuming Jiang, Kelvin C.K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image by introducing an additional high-resolution (HR) reference image. [Expand]

30.00
1
20
79
Monday Poster Session
[202]

SwiftNet: Real-Time Video Object Segmentation

Haochen Wang, Xiaolong Jiang, Haibing Ren, Yao Hu, Song Bai

In this work we present SwiftNet for real-time semi-supervised video object segmentation (one-shot VOS), which reports 77.8% J&F and 70 FPS on DAVIS 2017 validation dataset, leading all present solutions in overall accuracy and speed performance. [Expand]

30.00
1
1
24
67
Monday Poster Session
[203]

Surrogate Gradient Field for Latent Space Manipulation

Minjun Li, Yanghua Jin, Huachun Zhu

Generative adversarial networks (GANs) can generate high-quality images from sampled latent codes. [Expand]

29.75
3
18
80
Tuesday Poster Session
[204]

Deep Burst Super-Resolution

Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte

While single-image super-resolution (SISR) has attracted substantial interest in recent years, the proposed approaches are limited to learning image priors in order to add high frequency details. [Expand]

29.50
2
3
17
73
Wednesday Poster Session
[205]

CDFI: Compression-Driven Network Design for Frame Interpolation

Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov

DNN-based frame interpolation--that generates the intermediate frames given two consecutive frames--typically relies on heavy model architectures with a huge number of features, preventing them from being deployed on systems with limited resources, e.g., mobile devices. [Expand]

29.50
1
20
77
Wednesday Poster Session
[206]

Intentonomy: A Dataset and Study Towards Human Intent Understanding

Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie, Ser-Nam Lim

An image is worth a thousand words, conveying information that goes beyond the physical visual content therein. [Expand]

29.50
1
2
18
76
Thursday Poster Session
[207]

Pixel-Aligned Volumetric Avatars

Amit Raj, Michael Zollhofer, Tomas Simon, Jason Saragih, Shunsuke Saito, James Hays, Stephen Lombardi

Acquisition and rendering of photo-realistic human heads is a highly challenging research problem of particular importance for virtual telepresence. [Expand]

29.25
0
18
81
Thursday Poster Session
[208]

Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction

Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei, Chelsea Finn

A video prediction model that generalizes to diverse scenes would enable intelligent agents such as robots to perform a variety of tasks via planning with the model. [Expand]

29.25
2
2
14
79
Monday Poster Session
[209]

Differentiable Patch Selection for Image Recognition

Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand. [Expand]

29.00
1
0
13
86
Monday Poster Session
[210]

PLOP: Learning Without Forgetting for Continual Semantic Segmentation

Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord

Deep learning approaches are nowadays ubiquitously used to tackle computer vision tasks such as semantic segmentation, requiring large datasets and substantial computational power. [Expand]

28.75
3
3
19
62
Tuesday Poster Session
[211]

Activate or Not: Learning Customized Activation

Ningning Ma, Xiangyu Zhang, Ming Liu, Jian Sun

We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not. [Expand]

28.75
1
2
15
79
Wednesday Poster Session
[212]

Augmentation Strategies for Learning With Noisy Labels

Kento Nishi, Yi Ding, Alex Rich, Tobias Hollerer

Imperfect labels are ubiquitous in real-world datasets. [Expand]

28.75
2
0
13
81
Wednesday Poster Session
[213]

Home Action Genome: Cooperative Compositional Action Understanding

Nishant Rai, Haofeng Chen, Jingwei Ji, Rishi Desai, Kazuki Kozuka, Shun Ishizaka, Ehsan Adeli, Juan Carlos Niebles

Existing research on action recognition treats activities as monolithic events occurring in videos. [Expand]

28.75
0
21
73
Wednesday Poster Session
[214]

NeuTex: Neural Texture Mapping for Volumetric Neural Rendering

Fanbo Xiang, Zexiang Xu, Milos Hasan, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Hao Su

Recent work has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. [Expand]

28.75
2
2
16
73
Wednesday Poster Session
[215]

Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris

Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces. [Expand]

28.50
8
3
16
47
Wednesday Poster Session
[216]

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements

Qianli Ma, Shunsuke Saito, Jinlong Yang, Siyu Tang, Michael J. Black

Learning to model and reconstruct humans in clothing is challenging due to articulation, non-rigid deformation, and varying clothing types and topologies. [Expand]

28.25
4
0
17
63
Friday Poster Session
[217]

MIST: Multiple Instance Spatial Transformer

Baptiste Angles, Yuhe Jin, Simon Kornblith, Andrea Tagliasacchi, Kwang Moo Yi

We propose a deep network that can be trained to tackle image reconstruction and classification problems that involve detection of multiple object instances, without any supervision regarding their whereabouts. [Expand]

28.00
4
17
74
Monday Poster Session
[218]

Learning Compositional Radiance Fields of Dynamic Human Heads

Ziyan Wang, Timur Bagautdinov, Stephen Lombardi, Tomas Simon, Jason Saragih, Jessica Hodgins, Michael Zollhofer

Photorealistic rendering of dynamic humans is an important ability for telepresence systems, virtual shopping, synthetic data generation, and more. [Expand]

27.75
5
0
14
63
Tuesday Poster Session
[219]

Pixel Codec Avatars

Shugao Ma, Tomas Simon, Jason Saragih, Dawei Wang, Yuecheng Li, Fernando De la Torre, Yaser Sheikh

Telecommunication with photorealistic avatars in virtual or augmented reality is a promising path for achieving authentic face-to-face communication in 3D over remote physical distances. [Expand]

27.50
1
15
79
Monday Poster Session
[220]

Visual Semantic Role Labeling for Video Understanding

Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi

We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling. [Expand]

27.50
1
21
67
Tuesday Poster Session
[221]

Benchmarking Representation Learning for Natural World Image Collections

Grant Van Horn, Elijah Cole, Sara Beery, Kimberly Wilber, Serge Belongie, Oisin Mac Aodha

Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. [Expand]

27.50
1
2
17
70
Thursday Poster Session
[222]

Adversarial Generation of Continuous Images

Ivan Skorokhodov, Savva Ignatyev, Mohamed Elhoseiny

In most existing learning systems, images are typically viewed as 2D pixel arrays. [Expand]

27.25
8
5
6
60
Wednesday Poster Session
[223]

The Spatially-Correlative Loss for Various Image Translation Tasks

Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai

We propose a novel spatially-correlative loss that is simple, efficient, and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation. [Expand]

27.25
1
17
74
Friday Poster Session
[224]

Ensembling With Deep Generative Views

Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhang

Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. [Expand]

27.00
3
18
69
Thursday Poster Session
[225]

VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization

Seunghwan Choi, Sunghyun Park, Minsoo Lee, Jaegul Choo

The task of image-based virtual try-on aims to transfer a target clothing item onto the corresponding region of a person, which is commonly tackled by fitting the item to the desired body part and fusing the warped item with the person. [Expand]

27.00
0
13
82
Thursday Poster Session
[226]

SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans

Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Niessner

We present SPSG, a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion. [Expand]

26.75
2
0
12
75
Monday Poster Session
[227]

How Transferable Are Reasoning Patterns in VQA?

Corentin Kervadec, Theo Jaunet, Grigory Antipov, Moez Baccouche, Romain Vuillemot, Christian Wolf

Since its inception, Visual Question Answering (VQA) has been notoriously known as a task where models are prone to exploit biases in datasets to find shortcuts instead of performing high-level reasoning. [Expand]

26.75
4
0
14
63
Tuesday Poster Session
[228]

SOLD2: Self-Supervised Occlusion-Aware Line Description and Detection

Remi Pautrat, Juan-Ting Lin, Viktor Larsson, Martin R. Oswald, Marc Pollefeys

Compared to feature point detection and description, detecting and matching line segments offer additional challenges. [Expand]

26.75
0
20
67
Thursday Poster Session
[229]

Variational Transformer Networks for Layout Generation

Diego Martin Arroyo, Janis Postels, Federico Tombari

Generative models able to synthesize layouts of different kinds (e.g. [Expand]

26.50
1
2
11
78
Thursday Poster Session
[230]

End-to-End Human Pose and Mesh Reconstruction with Transformers

Kevin Lin, Lijuan Wang, Zicheng Liu

We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image. [Expand]

26.50
15
0
6
34
Monday Poster Session
[231]

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks

Shunsuke Saito, Jinlong Yang, Qianli Ma, Michael J. Black

We present SCANimate, an end-to-end trainable framework that takes raw 3D scans of a clothed human and turns them into an animatable avatar. [Expand]

26.50
4
2
11
66
Tuesday Poster Session
[232]

Parser-Free Virtual Try-On via Distilling Appearance Flows

Yuying Ge, Yibing Song, Ruimao Zhang, Chongjian Ge, Wei Liu, Ping Luo

Image virtual try-on aims to fit a garment image (target clothes) to a person image. [Expand]

26.25
2
19
65
Wednesday Poster Session
[233]

Predator: Registration of 3D Point Clouds With Low Overlap

Shengyu Huang, Zan Gojcic, Mikhail Usvyatsov, Andreas Wieser, Konrad Schindler

We introduce PREDATOR, a model for pairwise pointcloud registration with deep attention to the overlap region. [Expand]

26.25
2
4
13
67
Tuesday Poster Session
[234]

Few-Shot Image Generation via Cross-Domain Correspondence

Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zhang

Training generative models, such as GANs, on a target domain containing limited examples (e.g., 10) can easily result in overfitting. [Expand]

26.25
2
12
79
Wednesday Poster Session
[235]

Learning Decision Trees Recurrently Through Communication

Stephan Alaniz, Diego Marcos, Bernt Schiele, Zeynep Akata

Integrated interpretability without sacrificing the prediction accuracy of decision making algorithms has the potential of greatly improving their value to the user. [Expand]

26.00
3
13
75
Thursday Poster Session
[236]

Teachers Do More Than Teach: Compressing Image-to-Image Models

Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

Generative Adversarial Networks (GANs) have achieved huge success in generating high-fidelity images, however, they suffer from low efficiency due to tremendous computational cost and bulky memory usage. [Expand]

26.00
1
3
13
71
Thursday Poster Session
[237]

Image-to-Image Translation via Hierarchical Style Disentanglement

Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji

Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks. [Expand]

26.00
3
16
69
Wednesday Poster Session
[238]

3D CNNs With Adaptive Temporal Feature Resolutions

Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Jurgen Gall

While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. [Expand]

25.75
3
6
88
Tuesday Poster Session
[239]

De-Rendering the World's Revolutionary Artefacts

Shangzhe Wu, Ameesh Makadia, Jiajun Wu, Noah Snavely, Richard Tucker, Angjoo Kanazawa

Recent works have shown exciting results in unsupervised image de-rendering--learning to decompose 3D shape, appearance, and lighting from single-image collections without explicit supervision. [Expand]

25.75
0
19
65
Tuesday Poster Session
[240]

VarifocalNet: An IoU-Aware Dense Object Detector

Haoyang Zhang, Ying Wang, Feras Dayoub, Niko Sunderhauf

Accurately ranking the vast number of candidate detections is crucial for dense object detectors to achieve high performance. [Expand]

25.75
5
1
14
54
Wednesday Poster Session
[241]

Multi-Objective Interpolation Training for Robustness To Label Noise

Diego Ortego, Eric Arazo, Paul Albert, Noel E. O'Connor, Kevin McGuinness

Deep neural networks trained with standard cross-entropy loss memorize noisy labels, which degrades their performance. [Expand]

25.50
1
1
17
63
Tuesday Poster Session
[242]

Center-Based 3D Object Detection and Tracking

Tianwei Yin, Xingyi Zhou, Philipp Krahenbuhl

Three-dimensional objects are commonly represented as 3D boxes in a point-cloud. [Expand]

25.50
19
0
2
22
Thursday Poster Session
[243]

A 3D GAN for Improved Large-Pose Facial Recognition

Richard T. Marriott, Sami Romdhani, Liming Chen

Facial recognition using deep convolutional neural networks relies on the availability of large datasets of face images. [Expand]

25.25
1
9
10
68
Thursday Poster Session
[244]

Pixel-Wise Anomaly Detection in Complex Driving Scenes

Giancarlo Di Biase, Hermann Blum, Roland Siegwart, Cesar Cadena

The inability of state-of-the-art semantic segmentation methods to detect anomaly instances hinders them from being deployed in safety-critical and complex applications, such as autonomous driving. [Expand]

25.00
1
1
17
61
Friday Poster Session
[245]

High-Fidelity and Arbitrary Face Editing

Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

Cycle consistency is widely used for face editing. [Expand]

25.00
0
13
74
Friday Poster Session
[246]

StylePeople: A Generative Model of Fullbody Human Avatars

Artur Grigorev, Karim Iskakov, Anastasia Ianina, Renat Bashirov, Ilya Zakharkin, Alexander Vakhitov, Victor Lempitsky

We propose a new type of full-body human avatars, which combines parametric mesh-based body model with a neural texture. [Expand]

25.00
1
2
13
68
Tuesday Poster Session
[247]

How Privacy-Preserving Are Line Clouds? Recovering Scene Details From 3D Lines

Kunal Chelani, Fredrik Kahl, Torsten Sattler

Visual localization is the problem of estimating the camera pose of a given image with respect to a known scene. [Expand]

24.75
2
11
75
Friday Poster Session
[248]

Style-Aware Normalized Loss for Improving Arbitrary Style Transfer

Jiaxin Cheng, Ayush Jaiswal, Yue Wu, Pradeep Natarajan, Prem Natarajan

Neural Style Transfer (NST) has quickly evolved from single-style to infinite-style models, also known as Arbitrary Style Transfer (AST). [Expand]

24.75
1
10
78
Monday Poster Session
[249]

Multi-Stage Progressive Image Restoration

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

Image restoration tasks demand a complex balance between spatial details and high-level contextualized information while recovering images. [Expand]

24.75
10
0
8
43
Thursday Poster Session
[250]

SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning

Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, Kurt Keutzer

A common practice in unsupervised representation learning is to use labeled data to evaluate the quality of the learned representations. [Expand]

24.50
3
4
12
58
Monday Poster Session
[251]

Learning the Superpixel in a Non-Iterative and Lifelong Manner

Lei Zhu, Qi She, Bin Zhang, Yanye Lu, Zhilin Lu, Duo Li, Jie Hu

Superpixel is generated by automatically clustering pixels in an image into hundreds of compact partitions, which is widely used to perceive the object contours for its excellent contour adherence. [Expand]

24.50
2
10
76
Monday Poster Session
[252]

Rainbow Memory: Continual Learning With a Memory of Diverse Samples

Jihwan Bang, Heesu Kim, YoungJoon Yoo, Jung-Woo Ha, Jonghyun Choi

Continual learning is a realistic learning scenario for AI models. [Expand]

24.25
3
17
60
Wednesday Poster Session
[253]

Open World Compositional Zero-Shot Learning

Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata

Compositional Zero-Shot Learning (CZSL) requires recognizing state-object compositions unseen during training. [Expand]

24.25
2
2
9
69
Tuesday Poster Session
[254]

Skeleton Merger: An Unsupervised Aligned Keypoint Detector

Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu

Detecting aligned 3D keypoints is essential under many scenarios such as object tracking, shape retrieval and robotics. [Expand]

24.25
2
17
61
Monday Poster Session
[255]

Connecting What To Say With Where To Look by Modeling Human Attention Traces

Zihang Meng, Licheng Yu, Ning Zhang, Tamara L. Berg, Babak Damavandi, Vikas Singh, Amy Bearman

We introduce a unified framework to jointly model images, text, and human attention traces. [Expand]

24.00
1
18
59
Thursday Poster Session
[256]

Progressive Semantic-Aware Style Transformation for Blind Face Restoration

Chaofeng Chen, Xiaoming Li, Lingbo Yang, Xianhui Lin, Lei Zhang, Kwan-Yee K. Wong

Face restoration is important in face image processing, and has been widely studied in recent years. [Expand]

23.75
2
3
14
56
Thursday Poster Session
[257]

Point Cloud Upsampling via Disentangled Refinement

Ruihui Li, Xianzhi Li, Pheng-Ann Heng, Chi-Wing Fu

Point clouds produced by 3D scanning are often sparse, non-uniform, and noisy. [Expand]

23.75
1
14
66
Monday Poster Session
[258]

Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

Malik Boudiaf, Hoel Kervadec, Ziko Imtiaz Masud, Pablo Piantanida, Ismail Ben Ayed, Jose Dolz

We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performances--an aspect often overlooked in the literature in favor of the meta-learning paradigm. [Expand]

23.50
3
0
13
56
Thursday Poster Session
[259]

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas J. Guibas

We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. [Expand]

23.50
1
2
10
68
Wednesday Poster Session
[260]

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

Qingyong Hu, Bo Yang, Sheikh Khalid, Wen Xiao, Niki Trigoni, Andrew Markham

An essential prerequisite for unleashing the potential of supervised deep learning algorithms in the area of 3D scene understanding is the availability of large-scale and richly annotated datasets. [Expand]

23.50
7
1
13
39
Tuesday Poster Session
[261]

CodedStereo: Learned Phase Masks for Large Depth-of-Field Stereo

Shiyu Tan, Yicheng Wu, Shoou-I Yu, Ashok Veeraraghavan

Conventional stereo suffers from a fundamental trade-off between imaging volume and signal-to-noise ratio (SNR) -- due to the conflicting impact of aperture size on both these variables. [Expand]

23.25
0
19
55
Wednesday Poster Session
[262]

CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models

Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang

Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data. [Expand]

23.25
8
2
9
41
Wednesday Poster Session
[263]

Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata

Having access to multi-modal cues (e.g. [Expand]

23.00
1
1
16
55
Tuesday Poster Session
[264]

Populating 3D Scenes by Learning Human-Scene Interaction

Mohamed Hassan, Partha Ghosh, Joachim Tesch, Dimitrios Tzionas, Michael J. Black

Humans live within a 3D space and constantly interact with it to perform tasks. [Expand]

23.00
2
2
10
62
Thursday Poster Session
[265]

CoCoNets: Continuous Contrastive 3D Scene Representations

Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki

This paper explores self-supervised learning of amodal 3D feature representations from RGB and RGB-D posed images and videos, agnostic to object and scene semantic content, and evaluates the resulting scene representations in the downstream tasks of visual correspondence, object tracking, and object detection. [Expand]

23.00
1
14
63
Thursday Poster Session
[266]

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Abulikemu Abuduweili, Xingjian Li, Humphrey Shi, Cheng-Zhong Xu, Dejing Dou

While recent studies on semi-supervised learning have shown remarkable progress in leveraging both labeled and unlabeled data, most of them presume a basic setting of the model is randomly initialized. [Expand]

22.75
1
4
14
55
Tuesday Poster Session
[267]

PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting

Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, Noah Snavely

We present an end-to-end inverse rendering pipeline that includes a fully differentiable renderer, and can reconstruct geometry, materials, and illumination from scratch from a set of images. [Expand]

22.75
1
0
10
67
Tuesday Poster Session
[268]

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers

Antoine Miech, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic, Andrew Zisserman

Our objective is language-based search of large-scale image and video datasets. [Expand]

22.50
1
1
14
57
Wednesday Poster Session
[269]

Energy-Based Learning for Scene Graph Generation

Mohammed Suhail, Abhay Mittal, Behjat Siddiquie, Chris Broaddus, Jayan Eledath, Gerard Medioni, Leonid Sigal

Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. [Expand]

22.50
1
1
14
57
Thursday Poster Session
[270]

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape From a Video

Hongsuk Choi, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee

Despite the recent success of single image-based 3D human pose and shape estimation methods, recovering temporally consistent and smooth 3D human motion from a video is still challenging. [Expand]

22.25
2
0
8
65
Monday Poster Session
[271]

Generative Classifiers as a Basis for Trustworthy Image Classification

Radek Mackowiak, Lynton Ardizzone, Ullrich Kothe, Carsten Rother

With the maturing of deep learning systems, trustworthiness is becoming increasingly important for model assessment. [Expand]

22.25
2
1
24
32
Tuesday Poster Session
[272]

Unsupervised Visual Representation Learning by Tracking Patches in Video

Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

Inspired by the fact that human eyes continue to develop tracking ability in early and middle childhood, we propose to use tracking as a proxy task for a computer vision system to learn the visual representations. [Expand]

22.00
2
10
66
Monday Poster Session
[273]

An Alternative Probabilistic Interpretation of the Huber Loss

Gregory P. Meyer

The Huber loss is a robust loss function used for a wide range of regression tasks. [Expand]

21.75
14
3
4
20
Tuesday Poster Session
[274]

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

Xingran Zhou, Bo Zhang, Ting Zhang, Pan Zhang, Jianmin Bao, Dong Chen, Zhongfei Zhang, Fang Wen

We present the full-resolution correspondence learning for cross-domain images, which aids image translation. [Expand]

21.75
1
15
56
Thursday Poster Session
[275]

Extreme Rotation Estimation Using Dense Correlation Volumes

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. [Expand]

21.50
1
0
13
56
Thursday Poster Session
[276]

Depth From Camera Motion and Object Detection

Brent A. Griffin, Jason J. Corso

This paper addresses the problem of learning to estimate the depth of detected objects given some measurement of camera motion (e.g., from robot kinematics or vehicle odometry). [Expand]

21.50
1
16
53
Monday Poster Session
[277]

Coordinate Attention for Efficient Mobile Network Design

Qibin Hou, Daquan Zhou, Jiashi Feng

Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention) for lifting model performance, but they generally neglect the positional information, which is important for generating spatially selective attention maps. [Expand]

21.50
4
1
9
51
Thursday Poster Session
[278]

Universal Spectral Adversarial Attacks for Deformable Shapes

Arianna Rampini, Franco Pestarini, Luca Cosmo, Simone Melzi, Emanuele Rodola

Machine learning models are known to be vulnerable to adversarial attacks, namely perturbations of the data that lead to wrong predictions despite being imperceptible. [Expand]

21.50
0
12
62
Tuesday Poster Session
[279]

Unsupervised Real-World Image Super Resolution via Domain-Distance Aware Training

Yunxuan Wei, Shuhang Gu, Yawei Li, Radu Timofte, Longcun Jin, Hengjie Song

These days, unsupervised super-resolution (SR) is soaring due to its practical and promising potential in real scenarios. [Expand]

21.50
7
3
9
37
Thursday Poster Session
[280]

From Points to Multi-Object 3D Reconstruction

Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari

We propose a method to detect and reconstruct multiple 3D objects from a single RGB image. [Expand]

21.25
1
0
8
65
Tuesday Poster Session
[281]

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Hugo Germain, Vincent Lepetit, Guillaume Bourmaud

Absolute camera pose estimation is usually addressed by sequentially solving two distinct subproblems: First a feature matching problem that seeks to establish putative 2D-3D correspondences, and then a Perspective-n-Point problem that minimizes, w.r.t. [Expand]

21.25
1
2
17
45
Monday Poster Session
[282]

Repetitive Activity Counting by Sight and Sound

Yunhua Zhang, Ling Shao, Cees G. M. Snoek

This paper strives for repetitive activity counting in videos. [Expand]

21.25
1
11
62
Thursday Poster Session
[283]

Contrastive Learning for Compact Single Image Dehazing

Haiyan Wu, Yanyun Qu, Shaohui Lin, Jian Zhou, Ruizhi Qiao, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

Single image dehazing is a challenging ill-posed problem due to the severe information degeneration. [Expand]

21.00
1
2
14
50
Wednesday Poster Session
[284]

Learning To Segment Rigid Motions From Two Frames

Gengshan Yang, Deva Ramanan

Appearance-based detectors achieve remarkable performance on common scenes, benefiting from high-capacity models and massive annotated data, but tend to fail for scenarios that lack training data. [Expand]

21.00
0
10
64
Monday Poster Session
[285]

Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

Mingi Ji, Seungjae Shin, Seunghyun Hwang, Gibeom Park, Il-Chul Moon

Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage. [Expand]

20.75
2
12
57
Wednesday Poster Session
[286]

VideoMoCo: Contrastive Video Representation Learning With Temporally Adversarial Examples

Tian Pan, Yibing Song, Tianyu Yang, Wenhao Jiang, Wei Liu

MoCo is effective for unsupervised image representation learning. [Expand]

20.75
2
1
11
52
Wednesday Poster Session
[287]

Seesaw Loss for Long-Tailed Instance Segmentation

Jiaqi Wang, Wenwei Zhang, Yuhang Zang, Yuhang Cao, Jiangmiao Pang, Tao Gong, Kai Chen, Ziwei Liu, Chen Change Loy, Dahua Lin

Instance segmentation has witnessed remarkable progress on class-balanced benchmarks. [Expand]

20.75
2
1
10
54
Wednesday Poster Session
[288]

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun

Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation. [Expand]

20.75
2
0
9
57
Thursday Poster Session
[289]

The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in Computer Vision Models

Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The computer vision world has been re-gaining enthusiasm in various pre-trained models, including both classical ImageNet supervised pre-training and recently emerged self-supervised pre-training such as SimCLR and MoCo. [Expand]

20.50
15
0
5
12
Friday Poster Session
[290]

Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning

Riccardo Volpi, Diane Larlus, Gregory Rogez

Most standard learning approaches lead to fragile models which are prone to drift when sequentially trained on samples of a different nature -- the well-known "catastrophic forgetting" issue. [Expand]

20.50
3
6
67
Tuesday Poster Session
[291]

No Shadow Left Behind: Removing Objects and Their Shadows Using Approximate Lighting and Geometry

Edward Zhang, Ricardo Martin-Brualla, Janne Kontkanen, Brian L. Curless

Removing objects from images is a challenging technical problem that is important for many applications, including mixed reality. [Expand]

20.50
0
14
54
Friday Poster Session
[292]

4D Panoptic LiDAR Segmentation

Mehmet Aygun, Aljosa Osep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixe

Temporal semantic scene understanding is critical for self-driving cars or robots operating in dynamic environments. [Expand]

20.25
2
1
13
46
Tuesday Poster Session
[293]

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. [Expand]

20.25
2
1
12
48
Tuesday Poster Session
[294]

Semi-Supervised Synthesis of High-Resolution Editable Textures for 3D Humans

Bindita Chaudhuri, Nikolaos Sarafianos, Linda Shapiro, Tony Tung

We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup. [Expand]

20.00
3
15
47
Wednesday Poster Session
[295]

Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?

Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to "reason", leading them to perform "educated guesses" instead. [Expand]

19.75
7
3
10
28
Monday Poster Session
[296]

Sketch2Model: View-Aware 3D Modeling From Single Free-Hand Sketches

Song-Hai Zhang, Yuan-Chen Guo, Qing-Wen Gu

We investigate the problem of generating 3D meshes from single free-hand sketches, aiming at fast 3D modeling for novice users. [Expand]

19.50
1
8
61
Tuesday Poster Session
[297]

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

Yinan He, Bei Gan, Siyu Chen, Yichun Zhou, Guojun Yin, Luchuan Song, Lu Sheng, Jing Shao, Ziwei Liu

The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur. [Expand]

19.25
3
9
56
Tuesday Poster Session
[298]

Ranking Neural Checkpoints

Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong

This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task. [Expand]

19.25
1
1
9
54
Monday Poster Session
[299]

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu

In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. [Expand]

19.25
1
12
52
Monday Poster Session
[300]

Image Restoration for Under-Display Camera

Yuqian Zhou, David Ren, Neil Emerton, Sehoon Lim, Timothy Large

The new trend of full-screen devices encourages us to position a camera behind a screen. [Expand]

19.25
5
3
12
30
Wednesday Poster Session
[301]

Monocular Real-Time Full Body Capture With Inter-Part Correlations

Yuxiao Zhou, Marc Habermann, Ikhsanul Habibie, Ayush Tewari, Christian Theobalt, Feng Xu

We present the first method for real-time full body capture that estimates shape and motion of body and hands together with a dynamic 3D face model from a single color image. [Expand]

19.25
2
1
4
60
Tuesday Poster Session
[302]

Patch2Pix: Epipolar-Guided Pixel-Level Correspondences

Qunjie Zhou, Torsten Sattler, Laura Leal-Taixe

The classical matching pipeline used for visual localization typically involves three steps: (i) local feature detection and description, (ii) feature matching, and (iii) outlier rejection. [Expand]

19.25
1
1
13
46
Tuesday Poster Session
[303]

Exemplar-Based Open-Set Panoptic Segmentation Network

Jaedong Hwang, Seoung Wug Oh, Joon-Young Lee, Bohyung Han

We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation (OPS) task. [Expand]

19.00
1
2
12
46
Monday Poster Session
[304]

High-Fidelity Neural Human Motion Transfer From Monocular Video

Moritz Kappel, Vladislav Golyanik, Mohamed Elgharib, Jann-Ole Henningson, Hans-Peter Seidel, Susana Castillo, Christian Theobalt, Marcus Magnor

Video-based human motion transfer creates video animations of humans following a source motion. [Expand]

19.00
1
1
8
55
Monday Poster Session
[305]

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang

Localization Quality Estimation (LQE) is crucial and popular in the recent advancement of dense object detectors since it can provide accurate ranking scores that benefit the Non-Maximum Suppression processing and improve detection performance. [Expand]

19.00
5
0
7
42
Thursday Poster Session
[306]

3D Spatial Recognition Without Spatially Labeled 3D

Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar

We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision. [Expand]

19.00
1
11
53
Thursday Poster Session
[307]

Unpaired Image-to-Image Translation via Latent Energy Transport

Yang Zhao, Changyou Chen

Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains. [Expand]

19.00
1
1
7
57
Friday Poster Session
[308]

DeepVideoMVS: Multi-View Stereo on Video With Recurrent Spatio-Temporal Fusion

Arda Duzceker, Silvano Galliani, Christoph Vogel, Pablo Speciale, Mihai Dusmanu, Marc Pollefeys

We propose an online multi-view depth prediction approach on posed video streams, where the scene geometry information computed in the previous time steps is propagated to the current time step in an efficient and geometrically plausible way. [Expand]

18.50
1
1
7
55
Thursday Poster Session
[309]

i3DMM: Deep Implicit 3D Morphable Model of Human Heads

Tarun Yenamandra, Ayush Tewari, Florian Bernard, Hans-Peter Seidel, Mohamed Elgharib, Daniel Cremers, Christian Theobalt

We present the first deep implicit 3D morphable model (i3DMM) of full heads. [Expand]

18.50
1
2
5
58
Thursday Poster Session
[310]

Hierarchical Motion Understanding via Motion Programs

Sumith Kulal, Jiayuan Mao, Alex Aiken, Jiajun Wu

Current approaches to video analysis of human motion focus on raw pixels or keypoints as the basic units of reasoning. [Expand]

18.25
0
7
59
Tuesday Poster Session
[311]

MagFace: A Universal Representation for Face Recognition and Quality Assessment

Qiang Meng, Shichao Zhao, Zhida Huang, Feng Zhou

The performance of face recognition systems degrades when the variability of the acquired faces increases. [Expand]

18.00
3
1
6
47
Thursday Poster Session
[312]

Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion

Chulin Xie, Chuxin Wang, Bo Zhang, Hao Yang, Dong Chen, Fang Wen

In this paper, we propose a novel Style-based Point Generator with Adversarial Rendering (SpareNet) for point cloud completion. [Expand]

18.00
2
4
10
40
Tuesday Poster Session
[313]

UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering

Mohamed El Banani, Luya Gao, Justin Johnson

Aligning partial views of a scene into a single whole is essential to understanding one's environment and is a key component of numerous robotics tasks such as SLAM and SfM. [Expand]

17.75
2
1
11
40
Wednesday Poster Session
[314]

ANR: Articulated Neural Rendering for Virtual Avatars

Amit Raj, Julian Tanke, James Hays, Minh Vo, Carsten Stoll, Christoph Lassner

Deferred Neural Rendering (DNR) uses a three-step pipeline to translate a mesh representation into an RGB image. [Expand]

17.75
4
0
8
39
Tuesday Poster Session
[315]

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li

To date, most existing self-supervised learning methods are designed and optimized for image classification. [Expand]

17.75
15
1
3
4
Tuesday Poster Session
[316]

Binary TTC: A Temporal Geofence for Autonomous Navigation

Abhishek Badki, Orazio Gallo, Jan Kautz, Pradeep Sen

Time-to-contact (TTC), the time for an object to collide with the observer's plane, is a powerful tool for path planning: it is potentially more informative than the depth, velocity, and acceleration of objects in the scene, even for humans. [Expand]

17.50
2
11
46
Thursday Poster Session
[317]

Deep Active Surface Models

Udaranga Wickramasinghe, Pascal Fua, Graham Knott

Active Surface Models have a long history of being useful to model complex 3D surfaces. [Expand]

17.50
1
0
15
36
Thursday Poster Session
[318]

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani

We present Vx2Text, a framework for text generation from multimodal inputs consisting of video plus text, speech, or audio. [Expand]

17.25
1
1
8
48
Tuesday Poster Session
[319]

Mask Guided Matting via Progressive Refinement Network

Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, Alan Yuille

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance. [Expand]

17.25
0
8
53
Monday Poster Session
[320]

Masksembles for Uncertainty Estimation

Nikita Durasov, Timur Bagautdinov, Pierre Baque, Pascal Fua

Deep neural networks have amply demonstrated their prowess but estimating the reliability of their predictions remains challenging. [Expand]

17.00
4
1
6
39
Thursday Poster Session
[321]

Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces

Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc Van Gool

This paper presents an uncalibrated deep neural network framework for the photometric stereo problem. [Expand]

17.00
2
7
52
Tuesday Poster Session
[322]

Learning Graph Embeddings for Compositional Zero-Shot Learning

Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata

In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. [Expand]

17.00
2
2
7
44
Monday Poster Session
[323]

Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles

Jevgenij Gamper, Nasir Rajpoot

We present ARCH, a computational pathology (CP) multiple instance captioning dataset to facilitate dense supervision of CP tasks. [Expand]

16.75
7
8
44
Friday Poster Session
[324]

Quantifying Explainers of Graph Neural Networks in Computational Pathology

Guillaume Jaume, Pushpak Pati, Behzad Bozorgtabar, Antonio Foncubierta, Anna Maria Anniciello, Florinda Feroce, Tilman Rau, Jean-Philippe Thiran, Maria Gabrani, Orcun Goksel

Explainability of deep learning methods is imperative to facilitate their clinical adoption in digital pathology. [Expand]

16.75
4
2
10
29
Wednesday Poster Session
[325]

NPAS: A Compiler-Aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration

Zhengang Li, Geng Yuan, Wei Niu, Pu Zhao, Yanyu Li, Yuxuan Cai, Xuan Shen, Zheng Zhan, Zhenglun Kong, Qing Jin, Zhiyu Chen, Sijia Liu, Kaiyuan Yang, Bin Ren, Yanzhi Wang, Xue Lin

With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase the execution speed. [Expand]

16.75
1
10
46
Thursday Poster Session
[326]

Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

Jialun Peng, Dong Liu, Songcen Xu, Houqiang Li

Given an incomplete image without additional constraint, image inpainting natively allows for multiple solutions as long as they appear plausible. [Expand]

16.75
1
7
52
Wednesday Poster Session
[327]

Temporal Query Networks for Fine-Grained Video Understanding

Chuhan Zhang, Ankush Gupta, Andrew Zisserman

Our objective in this work is fine-grained classification of actions in untrimmed videos, where the actions may be temporally extended or may span only a few frames of the video. [Expand]

16.75
0
7
53
Tuesday Poster Session
[328]

Points As Queries: Weakly Semi-Supervised Object Detection by Points

Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei Zhang, Jian Sun

We propose a novel point annotated setting for the weakly semi-supervised object detection task, in which the dataset comprises small fully annotated images and large weakly annotated images by points. [Expand]

16.50
0
8
50
Wednesday Poster Session
[329]

How Well Do Self-Supervised Models Transfer?

Linus Ericsson, Henry Gouk, Timothy M. Hospedales

Self-supervised visual representation learning has seen huge progress recently, but no large scale evaluation has compared the many models now available. [Expand]

16.50
7
1
4
29
Tuesday Poster Session
[330]

Learning To Relate Depth and Semantics for Unsupervised Domain Adaptation

Suman Saha, Anton Obukhov, Danda Pani Paudel, Menelaos Kanakis, Yuhua Chen, Stamatios Georgoulis, Luc Van Gool

We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting. [Expand]

16.50
3
5
53
Wednesday Poster Session
[331]

LOHO: Latent Optimization of Hairstyles via Orthogonalization

Rohit Saha, Brendan Duke, Florian Shkurti, Graham W. Taylor, Parham Aarabi

Hairstyle transfer is challenging due to hair structure differences in the source and target hair. [Expand]

16.50
1
1
11
39
Monday Poster Session
[332]

Visual Room Rearrangement

Luca Weihs, Matt Deitke, Aniruddha Kembhavi, Roozbeh Mottaghi

There has been significant recent progress in the field of Embodied AI with researchers developing models and algorithms enabling embodied agents to navigate and interact within completely unseen environments. [Expand]

16.50
2
6
52
Tuesday Poster Session
[333]

AdaBins: Depth Estimation Using Adaptive Bins

Shariq Farooq Bhat, Ibraheem Alhashim, Peter Wonka

We address the problem of estimating a high quality dense depth map from a single RGB input image. [Expand]

16.25
9
1
6
16
Tuesday Poster Session
[334]

AutoDO: Robust AutoAugment for Biased Data With Label Noise via Scalable Probabilistic Implicit Differentiation

Denis Gudovskiy, Luca Rigazio, Shun Ishizaka, Kazuki Kozuka, Sotaro Tsukizawa

AutoAugment has sparked an interest in automated augmentation methods for deep learning models. [Expand]

16.25
0
11
43
Friday Poster Session
[335]

SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification

Zijian Hu, Zhengyu Yang, Xuefeng Hu, Ram Nevatia

A common classification task situation is where one has a large amount of data available for training, but only a small portion is annotated with class labels. [Expand]

16.00
1
1
10
39
Thursday Poster Session
[336]

SMURF: Self-Teaching Multi-Frame Unsupervised RAFT With Full-Image Warping

Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski

We present SMURF, a method for unsupervised learning of optical flow that improves state of the art on all benchmarks by 36% to 40% and even outperforms several supervised approaches such as PWC-Net and FlowNet2. [Expand]

16.00
0
10
44
Tuesday Poster Session
[337]

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

Contrastive learning methods for unsupervised visual representation learning have reached remarkable levels of transfer performance. [Expand]

16.00
16
0
0
0
Friday Poster Session
[338]

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, Junjie Huang, Xinze Chen, Jiagang Zhu, Tian Yang, Jiwen Lu, Dalong Du, Jie Zhou

In this paper, we contribute a new million-scale face benchmark containing noisy 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol. [Expand]

16.00
3
2
8
34
Wednesday Poster Session
[339]

Sequential Graph Convolutional Network for Active Learning

Razvan Caramalau, Binod Bhattarai, Tae-Kyun Kim

We propose a novel pool-based Active Learning framework constructed on a sequential Graph Convolution Network (GCN). [Expand]

15.75
1
3
10
36
Wednesday Poster Session
[340]

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll

Recent neural view synthesis methods have achieved impressive quality and realism, surpassing classical pipelines which rely on multi-view reconstruction. [Expand]

15.75
0
8
47
Wednesday Poster Session
[341]

Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling

Zhichao Huang, Xintong Han, Jia Xu, Tong Zhang

We present a new method for few-shot human motion transfer that achieves realistic human image generation with only a small number of appearance inputs. [Expand]

15.75
0
9
45
Monday Poster Session
[342]

HOTR: End-to-End Human-Object Interaction Detection With Transformers

Bumsoo Kim, Junhyun Lee, Jaewoo Kang, Eun-Sol Kim, Hyunwoo J. Kim

Human-Object Interaction (HOI) detection is a task of identifying "a set of interactions" in an image, which involves the i) localization of the subject (i.e., humans) and target (i.e., objects) of interaction, and ii) the classification of the interaction labels. [Expand]

15.75
2
1
7
40
Monday Poster Session
[343]

Spatially Consistent Representation Learning

Byungseok Roh, Wuhyun Shin, Ildoo Kim, Sungwoong Kim

Self-supervised learning has been widely used to obtain transferrable representations from unlabeled images. [Expand]

15.75
3
2
6
37
Monday Poster Session
[344]

HDR Environment Map Estimation for Real-Time Augmented Reality

Gowri Somanath, Daniel Kurz

We present a method to estimate an HDR environment map from a narrow field-of-view LDR camera image in real-time. [Expand]

15.75
0
11
41
Wednesday Poster Session
[345]

A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification

Jong-Chyi Su, Zezhou Cheng, Subhransu Maji

We evaluate the effectiveness of semi-supervised learning (SSL) on a realistic benchmark where data exhibits considerable class imbalance and contains images from novel classes. [Expand]

15.75
3
1
10
30
Thursday Poster Session
[346]

Mesoscopic Photogrammetry With an Unstabilized Phone Camera

Kevin C. Zhou, Colin Cooke, Jaehee Park, Ruobing Qian, Roarke Horstmeyer, Joseph A. Izatt, Sina Farsiu

We present a feature-free photogrammetric technique that enables quantitative 3D mesoscopic (mm-scale height variation) imaging with tens-of-micron accuracy from sequences of images acquired by a smartphone at close range (several cm) under freehand motion without additional hardware. [Expand]

15.75
1
6
50
Wednesday Poster Session
[347]

Global Transport for Fluid Reconstruction With Learned Self-Supervision

Erik Franz, Barbara Solenthaler, Nils Thuerey

We propose a novel method to reconstruct volumetric flows from sparse views via a global transport formulation. [Expand]

15.50
1
4
53
Monday Poster Session
[348]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Niessner

Semantic scene understanding from point clouds is particularly challenging as the points reflect only a sparse set of the underlying 3D geometry. [Expand]

15.50
2
0
10
34
Tuesday Poster Session
[349]

Repopulating Street Scenes

Yifan Wang, Andrew Liu, Richard Tucker, Jiajun Wu, Brian L. Curless, Steven M. Seitz, Noah Snavely

We present a framework for automatically reconfiguring images of street scenes by populating, depopulating, or repopulating them with objects such as pedestrians or vehicles. [Expand]

15.50
1
10
41
Tuesday Poster Session
[350]

Towards High Fidelity Face Relighting With Realistic Shadows

Andrew Hou, Ze Zhang, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu

Existing face relighting methods often struggle with two problems: maintaining the local facial details of the subject and accurately removing and synthesizing shadows in the relit image, especially hard shadows. [Expand]

15.25
0
6
49
Thursday Poster Session
[351]

KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA

Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach

One of the most challenging question types in VQA is when answering the question requires outside knowledge not present in the image. [Expand]

15.25
2
0
8
37
Thursday Poster Session
[352]

Generalized Domain Adaptation

Yu Mitsuzumi, Go Irie, Daiki Ikami, Takashi Shibata

Many variants of unsupervised domain adaptation (UDA) problems have been proposed and solved individually. [Expand]

15.25
2
9
41
Monday Poster Session
[353]

DECOR-GAN: 3D Shape Detailization by Conditional Refinement

Zhiqin Chen, Vladimir G. Kim, Matthew Fisher, Noam Aigerman, Hao Zhang, Siddhartha Chaudhuri

We introduce a deep generative network for 3D shape detailization, akin to stylization with the style being geometric details. [Expand]

15.00
1
1
4
47
Friday Poster Session
[354]

Continual Learning via Bit-Level Information Preserving

Yujun Shi, Li Yuan, Yunpeng Chen, Jiashi Feng

Continual learning tackles the setting of learning different tasks sequentially. [Expand]

15.00
0
9
42
Friday Poster Session
[355]

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

Li Yi, Boqing Gong, Thomas Funkhouser

We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors. [Expand]

15.00
8
3
3
19
Thursday Poster Session
[356]

IIRC: Incremental Implicitly-Refined Classification

Mohamed Abdelsalam, Mojtaba Faramarzi, Shagun Sodhani, Sarath Chandar

We introduce the 'Incremental Implicitly-Refined Classification (IIRC)' setup, an extension to the class incremental learning setup where the incoming batches of classes have two granularity levels. [Expand]

14.75
2
3
5
38
Wednesday Poster Session
[357]

Depth Completion Using Plane-Residual Representation

Byeong-Uk Lee, Kyunghyun Lee, In So Kweon

The basic framework of depth completion is to predict a pixel-wise dense depth map using very sparse input data. [Expand]

14.75
0
12
35
Thursday Poster Session
[358]

Orthogonal Over-Parameterized Training

Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

The inductive bias of a neural network is largely determined by the architecture and the training algorithm. [Expand]

14.75
7
3
6
16
Wednesday Poster Session
[359]

Towards Real-World Blind Face Restoration With Generative Facial Prior

Xintao Wang, Yu Li, Honglun Zhang, Ying Shan

Blind face restoration usually relies on facial priors, such as facial geometry prior or reference prior, to restore realistic and faithful details. [Expand]

14.75
1
1
11
32
Wednesday Poster Session
[360]

Instance Localization for Self-Supervised Detection Pretraining

Ceyuan Yang, Zhirong Wu, Bolei Zhou, Stephen Lin

Prior research on self-supervised learning has led to considerable progress on image classification, but often with degraded transfer performance on object detection. [Expand]

14.75
3
2
5
35
Tuesday Poster Session
[361]

Multiresolution Knowledge Distillation for Anomaly Detection

Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad H. Rohban, Hamid R. Rabiee

Unsupervised representation learning has proved to be a critical component of anomaly detection/localization in images. [Expand]

14.50
3
1
2
41
Thursday Poster Session
[362]

SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

Sheng Ao, Qingyong Hu, Bo Yang, Andrew Markham, Yulan Guo

Extracting robust and general 3D local features is key to downstream tasks such as point cloud registration and reconstruction. [Expand]

14.25
4
3
5
28
Thursday Poster Session
[363]

Unsupervised 3D Shape Completion Through GAN Inversion

Junzhe Zhang, Xinyi Chen, Zhongang Cai, Liang Pan, Haiyu Zhao, Shuai Yi, Chai Kiat Yeo, Bo Dai, Chen Change Loy

Most 3D shape completion approaches rely heavily on partial-complete shape pairs and learn in a fully supervised manner. [Expand]

14.00
2
8
38
Monday Poster Session
[364]

End-to-End Human Object Interaction Detection With HOI Transformer

Cheng Zou, Bohan Wang, Yue Hu, Junqi Liu, Qian Wu, Yu Zhao, Boxun Li, Chenguang Zhang, Chi Zhang, Yichen Wei, Jian Sun

We propose HOI Transformer to tackle human object interaction (HOI) detection in an end-to-end manner. [Expand]

14.00
2
0
9
30
Thursday Poster Session
[365]

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

Self-supervised learning has gained prominence due to its efficacy at learning powerful representations from unlabelled data that achieve excellent performance on many challenging downstream tasks. [Expand]

13.75
3
1
4
34
Tuesday Poster Session
[366]

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, Dieter Fox

We introduce DexYCB, a new dataset for capturing hand grasping of objects. [Expand]

13.75
2
7
39
Wednesday Poster Session
[367]

Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

Shaofei Wang, Andreas Geiger, Siyu Tang

Registering point clouds of dressed humans to parametric human models is a challenging task in computer vision. [Expand]

13.75
3
1
6
30
Wednesday Poster Session
[368]

Wide-Baseline Multi-Camera Calibration Using Person Re-Identification

Yan Xu, Yu-Jhe Li, Xinshuo Weng, Kris Kitani

We address the problem of estimating the 3D pose of a network of cameras for large-environment wide-baseline scenarios, e.g., cameras for construction sites, sports stadiums, and public spaces. [Expand]

13.75
1
7
40
Thursday Poster Session
[369]

How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language

Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giro-i-Nieto

One of the factors that have hindered progress in the areas of sign language recognition, translation, and production is the absence of large annotated datasets. [Expand]

13.50
4
1
8
21
Monday Poster Session
[370]

AGORA: Avatars in Geography Optimized for Regression Analysis

Priyanka Patel, Chun-Hao P. Huang, Joachim Tesch, David T. Hoffmann, Shashank Tripathi, Michael J. Black

While the accuracy of 3D human pose estimation from images has steadily improved on benchmark datasets, the best methods still fail in many real-world scenarios. [Expand]

13.50
4
0
5
28
Thursday Poster Session
[371]

ViP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation

Siyuan Qiao, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen

In this paper, we present ViP-DeepLab, a unified model attempting to tackle the long-standing and challenging inverse projection problem in vision, which we model as restoring the point clouds from perspective image sequences while providing each point with instance-level semantic interpretations. [Expand]

13.50
3
1
8
25
Tuesday Poster Session
[372]

FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

Bo Sun, Banghuai Li, Shengcai Cai, Ye Yuan, Chi Zhang

Emerging interests have been brought to recognize previously unseen objects given very few training examples, known as few-shot object detection (FSOD). [Expand]

13.50
1
0
6
38
Wednesday Poster Session
[373]

Lifting 2D StyleGAN for 3D-Aware Face Generation

Yichun Shi, Divyansh Aggarwal, Anil K. Jain

We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation. [Expand]

13.25
0
7
39
Tuesday Poster Session
[374]

Self-Supervised Learning of Depth Inference for Multi-View Stereo

Jiayu Yang, Jose M. Alvarez, Miaomiao Liu

Recent supervised multi-view depth estimation networks have achieved promising results. [Expand]

13.25
0
6
41
Wednesday Poster Session
[375]

Unsupervised Human Pose Estimation Through Transforming Shape Templates

Luca Schmidtke, Athanasios Vlontzos, Simon Ellershaw, Anna Lukens, Tomoki Arichi, Bernhard Kainz

Human pose estimation is a major computer vision problem with applications ranging from augmented reality and video capture to surveillance and movement tracking. [Expand]

13.00
2
7
36
Monday Poster Session
[376]

LiDAR-Based Panoptic Segmentation via Dynamic Shifting Network

Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu

With the rapid advances of autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. [Expand]

12.75
1
0
3
41
Thursday Poster Session
[377]

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer

Tianwei Lin, Zhuoqi Ma, Fu Li, Dongliang He, Xin Li, Errui Ding, Nannan Wang, Jie Li, Xinbo Gao

Artistic style transfer aims at migrating the style from an example image to a content image. [Expand]

12.75
1
1
6
34
Tuesday Poster Session
[378]

Neural Surface Maps

Luca Morreale, Noam Aigerman, Vladimir G. Kim, Niloy J. Mitra

Maps are arguably one of the most fundamental concepts used to define and operate on manifold surfaces in differentiable geometry. [Expand]

12.75
1
6
38
Tuesday Poster Session
[379]

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting With Their Explanations

Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

Most explanation methods in deep learning map importance estimates for a model's prediction back to the original input space. [Expand]

12.75
3
3
7
22
Tuesday Poster Session
[380]

A Deep Emulator for Secondary Motion of 3D Characters

Mianlun Zheng, Yi Zhou, Duygu Ceylan, Jernej Barbic

Fast and light-weight methods for animating 3D characters are desirable in various applications such as computer games. [Expand]

12.75
0
10
31
Tuesday Poster Session
[381]

Content-Aware GAN Compression

Yuchen Liu, Zhixin Shu, Yijun Li, Zhe Lin, Federico Perazzi, Sun-Yuan Kung

Generative adversarial networks (GANs), e.g., StyleGAN2, play a vital role in various image generation and synthesis tasks, yet their notoriously high computational cost hinders their efficient deployment on edge devices. [Expand]

12.50
1
6
37
Thursday Poster Session
[382]

Faster Meta Update Strategy for Noise-Robust Deep Learning

Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang

It has been shown that deep neural networks are prone to overfitting on biased training data. [Expand]

12.50
2
2
10
20
Monday Poster Session
[383]

StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision

Yang Hong, Juyong Zhang, Boyi Jiang, Yudong Guo, Ligang Liu, Hujun Bao

In this paper, we propose StereoPIFu, which integrates the geometric constraints of stereo vision with implicit function representation of PIFu, to recover the 3D shape of the clothed human from a pair of low-cost rectified images. [Expand]

12.25
1
1
7
30
Monday Poster Session
[384]

CoMoGAN: Continuous Model-Guided Image-to-Image Translation

Fabio Pizzati, Pietro Cerri, Raoul de Charette

CoMoGAN is a continuous GAN relying on the unsupervised reorganization of the target data on a functional manifold. [Expand]

12.25
2
5
37
Thursday Poster Session
[385]

Self-Supervised Motion Learning From Static Images

Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo H. Ang

Motions are reflected in videos as the movement of pixels, and actions are essentially patterns of inconsistent motions between the foreground and the background. [Expand]

12.00
3
1
2
31
Monday Poster Session
[386]

KOALAnet: Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment

Soo Ye Kim, Hyeonjun Sim, Munchurl Kim

Blind super-resolution (SR) methods aim to generate a high quality high resolution image from a low resolution image containing unknown degradations. [Expand]

12.00
0
5
38
Wednesday Poster Session
[387]

3DCaricShop: A Dataset and a Baseline Method for Single-View 3D Caricature Face Reconstruction

Yuda Qiu, Xiaojie Xu, Lingteng Qiu, Yan Pan, Yushuang Wu, Weikai Chen, Xiaoguang Han

Caricature is an artistic representation that deliberately exaggerates the distinctive features of a human face to convey humor or sarcasm. [Expand]

12.00
2
4
38
Wednesday Poster Session
[388]

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc Van Gool, Rainer Stiefelhagen

Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks. [Expand]

12.00
5
6
31
Wednesday Poster Session
[389]

Self-Supervised Visibility Learning for Novel View Synthesis

Yujiao Shi, Hongdong Li, Xin Yu

We address the problem of novel view synthesis (NVS) from a few sparse source view images. [Expand]

12.00
0
7
34
Wednesday Poster Session
[390]

AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching

Xiao Song, Guorun Yang, Xinge Zhu, Hui Zhou, Zhe Wang, Jianping Shi

Recently, records on stereo matching benchmarks are constantly broken by end-to-end disparity networks. [Expand]

12.00
6
1
5
13
Wednesday Poster Session
[391]

UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training

Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu, Jingjing Liu

Vision-and-language pre-training has achieved impressive success in learning multimodal representations between vision and language. [Expand]

12.00
0
4
40
Tuesday Poster Session
[392]

HyperSeg: Patch-Wise Hypernetwork for Real-Time Semantic Segmentation

Yuval Nirkin, Lior Wolf, Tal Hassner

We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. [Expand]

11.75
1
0
6
31
Tuesday Poster Session
[393]

Inverting Generative Adversarial Renderer for Face Reconstruction

Jingtan Piao, Keqiang Sun, Quan Wang, Kwan-Yee Lin, Hongsheng Li

Given a monocular face image as input, 3D face geometry reconstruction aims to recover a corresponding 3D face mesh. [Expand]

11.75
1
8
30
Friday Poster Session
[394]

Categorical Depth Distribution Network for Monocular 3D Object Detection

Cody Reading, Ali Harakeh, Julia Chae, Steven L. Waslander

Monocular 3D object detection is a key problem for autonomous vehicles, as it provides a solution with simple configuration compared to typical multi-sensor systems. [Expand]

11.75
1
2
9
23
Wednesday Poster Session
[395]

Monocular Reconstruction of Neural Face Reflectance Fields

Mallikarjun B R, Ayush Tewari, Tae-Hyun Oh, Tim Weyrich, Bernd Bickel, Hans-Peter Seidel, Hanspeter Pfister, Wojciech Matusik, Mohamed Elgharib, Christian Theobalt

The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing. [Expand]

11.75
2
2
3
31
Tuesday Poster Session
[396]

IMAGINE: Image Synthesis by Image-Guided Model Inversion

Pei Wang, Yijun Li, Krishna Kumar Singh, Jingwan Lu, Nuno Vasconcelos

Synthesizing variations of a specific reference image with semantically valid content is an important task in terms of personalized generation as well as for data augmentation. [Expand]

11.75
1
0
2
39
Tuesday Poster Session
[397]

Kaleido-BERT: Vision-Language Pre-Training on Fashion Domain

Mingchen Zhuge, Dehong Gao, Deng-Ping Fan, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao

We present a new vision-language (VL) pre-training model dubbed Kaleido-BERT, which introduces a novel kaleido strategy for fashion cross-modality representations from transformers. [Expand]

11.75
1
2
5
31
Thursday Poster Session
[398]

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training

Minheng Ni, Haoyang Huang, Lin Su, Edward Cui, Taroon Bharti, Lijuan Wang, Dongdong Zhang, Nan Duan

We present M3P, a Multitask Multilingual Multimodal Pre-trained model that combines multilingual pre-training and multimodal pre-training into a unified framework via multitask pre-training. [Expand]

11.50
5
2
2
20
Tuesday Poster Session
[399]

(AF)2-S3Net: Attentive Feature Fusion With Adaptive Feature Selection for Sparse Semantic Segmentation Network

Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu

Autonomous robotic systems and self driving cars rely on accurate perception of their surroundings as the safety of the passengers and pedestrians is the top priority. [Expand]

11.25
3
0
8
17
Thursday Poster Session
[400]

Neighbor2Neighbor: Self-Supervised Denoising From Single Noisy Images

Tao Huang, Songjiang Li, Xu Jia, Huchuan Lu, Jianzhuang Liu

In the last few years, image denoising has benefited a lot from the fast development of neural networks. [Expand]

11.25
1
3
6
26
Thursday Poster Session
[401]

Task-Aware Variational Adversarial Active Learning

Kwanyoung Kim, Dongwon Park, Kwang In Kim, Se Young Chun

Often, labeling large amounts of data is challenging due to high labeling cost, limiting the application domain of deep learning techniques. [Expand]

11.25
8
1
4
4
Wednesday Poster Session
[402]

Roof-GAN: Learning To Generate Roof Geometry and Relations for Residential Houses

Yiming Qian, Hao Zhang, Yasutaka Furukawa

This paper presents Roof-GAN, a novel generative adversarial network that generates structured geometry of residential roof structures as a set of roof primitives and their relationships. [Expand]

11.25
1
3
38
Monday Poster Session
[403]

ReMix: Towards Image-to-Image Translation With Limited Data

Jie Cao, Luanxuan Hou, Ming-Hsuan Yang, Ran He, Zhenan Sun

Image-to-image (I2I) translation methods based on generative adversarial networks (GANs) typically suffer from overfitting when limited training data is available. [Expand]

11.00
0
8
28
Thursday Poster Session
[404]

VaB-AL: Incorporating Class Imbalance and Difficulty With Variational Bayes for Active Learning

Jongwon Choi, Kwang Moo Yi, Jihoon Kim, Jinho Choo, Byoungjip Kim, Jinyeop Chang, Youngjune Gwon, Hyung Jin Chang

Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. [Expand]

11.00
1
3
5
27
Tuesday Poster Session
[405]

DeFLOCNet: Deep Image Editing via Flexible Low-Level Controls

Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao, Bin Jiang, Wei Liu

User-intended visual content fills the hole regions of an input image in the image editing scenario. [Expand]

11.00
2
6
30
Wednesday Poster Session
[406]

Learnable Motion Coherence for Correspondence Pruning

Yuan Liu, Lingjie Liu, Cheng Lin, Zhen Dong, Wenping Wang

Motion coherence is an important clue for distinguishing true correspondences from false ones. [Expand]

11.00
7
7
23
Tuesday Poster Session
[407]

Pose Recognition With Cascade Transformers

Ke Li, Shijie Wang, Xiang Zhang, Yifan Xu, Weijian Xu, Zhuowen Tu

In this paper, we present a regression-based pose recognition method using cascade Transformers. [Expand]

11.00
0
6
32
Monday Poster Session
[408]

LEAP: Learning Articulated Occupancy of People

Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang

Substantial progress has been made on modeling rigid 3D objects using deep implicit representations. [Expand]

11.00
3
2
1
28
Wednesday Poster Session
[409]

Learning Delaunay Surface Elements for Mesh Reconstruction

Marie-Julie Rakotosaona, Paul Guerrero, Noam Aigerman, Niloy J. Mitra, Maks Ovsjanikov

We present a method for reconstructing triangle meshes from point clouds. [Expand]

11.00
0
2
40
Monday Poster Session
[410]

3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

He Wang, Yezhen Cong, Or Litany, Yue Gao, Leonidas J. Guibas

3D object detection is an important yet demanding task that heavily relies on difficult to obtain 3D annotations. [Expand]

11.00
1
0
7
26
Thursday Poster Session
[411]

Adversarial Robustness Under Long-Tailed Distribution

Tong Wu, Ziwei Liu, Qingqiu Huang, Yu Wang, Dahua Lin

Adversarial robustness has attracted extensive studies recently by revealing the vulnerability and intrinsic characteristics of deep networks. [Expand]

11.00
1
4
35
Wednesday Poster Session
[412]

ReNAS: Relativistic Evaluation of Neural Architecture Search

Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS). [Expand]

11.00
11
Tuesday Poster Session
[413]

Camouflaged Object Segmentation With Distraction Mining

Haiyang Mei, Ge-Peng Ji, Ziqi Wei, Xin Yang, Xiaopeng Wei, Deng-Ping Fan

Camouflaged object segmentation (COS) aims to identify objects that are "perfectly" assimilated into their surroundings, which has a wide range of valuable applications. [Expand]

10.75
3
2
5
19
Wednesday Poster Session
[414]

Learning Complete 3D Morphable Face Models From Images and Videos

Mallikarjun B R, Ayush Tewari, Hans-Peter Seidel, Mohamed Elgharib, Christian Theobalt

Most 3D face reconstruction methods rely on 3D morphable models, which disentangle the space of facial deformations into identity and expression geometry, and skin reflectance. [Expand]

10.75
1
0
5
29
Tuesday Poster Session
[415]

QPIC: Query-Based Pairwise Human-Object Interaction Detection With Image-Wide Contextual Information

Masato Tamura, Hiroki Ohashi, Tomoaki Yoshinaga

We propose a simple, intuitive yet powerful method for human-object interaction (HOI) detection. [Expand]

10.75
1
0
6
27
Wednesday Poster Session
[416]

Lite-HRNet: A Lightweight High-Resolution Network

Changqian Yu, Bin Xiao, Changxin Gao, Lu Yuan, Lei Zhang, Nong Sang, Jingdong Wang

We present an efficient high-resolution network, Lite-HRNet, for human pose estimation. [Expand]

10.75
3
0
6
19
Wednesday Poster Session
[417]

Topological Planning With Transformers for Vision-and-Language Navigation

Kevin Chen, Junshen K. Chen, Jo Chuang, Marynel Vazquez, Silvio Savarese

Conventional approaches to vision-and-language navigation (VLN) are trained end-to-end but struggle to perform well in freely traversable environments. [Expand]

10.50
2
0
7
20
Wednesday Poster Session
[418]

KeepAugment: A Simple Information-Preserving Data Augmentation Approach

Chengyue Gong, Dilin Wang, Meng Li, Vikas Chandra, Qiang Liu

Data augmentation (DA) is an essential technique for training state-of-the-art deep learning systems. [Expand]

10.50
2
2
5
22
Monday Poster Session
[419]

Black-Box Explanation of Object Detectors via Saliency Maps

Vitali Petsiuk, Rajiv Jain, Varun Manjunatha, Vlad I. Morariu, Ashutosh Mehra, Vicente Ordonez, Kate Saenko

We propose D-RISE, a method for generating visual explanations for the predictions of object detectors. [Expand]

10.50
3
3
6
15
Thursday Poster Session
[420]

CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

Bastian Wandt, Marco Rudolph, Petrissa Zell, Helge Rhodin, Bodo Rosenhahn

Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. [Expand]

10.50
3
2
4
20
Thursday Poster Session
[421]

Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation

Bin Yan, Xinyu Zhang, Dong Wang, Huchuan Lu, Xiaoyun Yang

Visual object tracking aims to precisely estimate the bounding box for the given target, which is a challenging problem due to factors such as deformation and occlusion. [Expand]

10.50
8
1
2
5
Tuesday Poster Session
[422]

Point Cloud Instance Segmentation Using Probabilistic Embeddings

Biao Zhang, Peter Wonka

In this paper, we propose a new framework for point cloud instance segmentation. [Expand]

10.50
6
1
4
9
Wednesday Poster Session
[423]

Adversarially Adaptive Normalization for Single Domain Generalization

Xinjie Fan, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, Mingyuan Zhou

Single domain generalization aims to learn a model that performs well on many unseen domains with only one domain data for training. [Expand]

10.25
1
5
30
Wednesday Poster Session
[424]

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

Xiangtao Kong, Hengyuan Zhao, Yu Qiao, Chao Dong

We aim at accelerating super-resolution (SR) networks on large images (2K-8K). [Expand]

10.25
2
9
21
Thursday Poster Session
[425]

Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency

Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang

Weakly supervised instance segmentation reduces the cost of annotations required to train models. [Expand]

10.25
0
7
27
Thursday Poster Session
[426]

Learning the Predictability of the Future

Didac Suris, Ruoshi Liu, Carl Vondrick

We introduce a framework for learning from unlabeled video what is predictable in the future. [Expand]

10.25
1
9
22
Thursday Poster Session
[427]

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji

6D pose estimation from a single RGB image is a fundamental task in computer vision. [Expand]

10.25
1
5
30
Friday Poster Session
[428]

Visually Informed Binaural Audio Generation without Binaural Audios

Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Stereophonic audio, especially binaural audio, plays an essential role in immersive viewing environments. [Expand]

10.25
4
1
3
18
Thursday Poster Session
[429]

DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation

Xiong Zhang, Hongmin Xu, Hong Mo, Jianchao Tan, Cheng Yang, Lei Wang, Wenqi Ren

Existing NAS methods for dense image prediction tasks usually compromise on restricted search space or search on proxy task to meet the achievable computational demands. [Expand]

10.25
7
3
3
4
Thursday Poster Session
[430]

BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

Kelvin C.K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy

Video super-resolution (VSR) approaches tend to have more components than the image counterparts as they need to exploit the additional temporal dimension. [Expand]

10.00
10
Tuesday Poster Session
[431]

Generalizable Pedestrian Detection: The Elephant in the Room

Irtiza Hasan, Shengcai Liao, Jinpeng Li, Saad Ullah Akram, Ling Shao

Pedestrian detection is used in many vision based applications ranging from video surveillance to autonomous driving. [Expand]

10.00
3
2
6
14
Wednesday Poster Session
[432]

Progressive Semantic Segmentation

Chuong Huynh, Anh Tuan Tran, Khoa Luu, Minh Hoai

The objective of this work is to segment high-resolution images without overloading GPU memory usage or losing the fine details in the output segmentation map. [Expand]

10.00
1
4
31
Friday Poster Session
[433]

AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations From Self-Trained Negative Adversaries

Qianjiang Hu, Xiao Wang, Wei Hu, Guo-Jun Qi

Contrastive learning relies on constructing a collection of negative examples that are sufficiently hard to discriminate against positive queries when their representations are self-trained. [Expand]

10.00
5
1
3
13
Monday Poster Session
[434]

Weakly-Supervised Physically Unconstrained Gaze Estimation

Rakshit Kothari, Shalini De Mello, Umar Iqbal, Wonmin Byeon, Seonwook Park, Jan Kautz

A major challenge for physically unconstrained gaze estimation is acquiring training data with 3D gaze annotations for in-the-wild and outdoor scenarios. [Expand]

10.00
1
6
27
Wednesday Poster Session
[435]

Self-Point-Flow: Self-Supervised Scene Flow Estimation From Point Clouds With Optimal Transport and Random Walk

Ruibo Li, Guosheng Lin, Lihua Xie

Due to the scarcity of annotated scene flow data, self-supervised scene flow learning in point clouds has attracted increasing attention. [Expand]

10.00
1
2
35
Friday Poster Session
[436]

ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks

Xinshao Wang, Yang Hua, Elyor Kodirov, David A. Clifton, Neil M. Robertson

To train robust deep neural networks (DNNs), we systematically study several target modification approaches, which include output regularisation, self and non-self label correction (LC). [Expand]

10.00
5
3
3
11
Monday Poster Session
[437]

Correlated Input-Dependent Label Noise in Large-Scale Image Classification

Mark Collier, Basil Mustafa, Efi Kokiopoulou, Rodolphe Jenatton, Jesse Berent

Large scale image classification datasets often contain noisy labels. [Expand]

9.75
1
0
4
27
Monday Poster Session
[438]

Learning to Track Instances without Video Annotations

Yang Fu, Sifei Liu, Umar Iqbal, Shalini De Mello, Humphrey Shi, Jan Kautz

Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches. [Expand]

9.75
1
5
28
Wednesday Poster Session
[439]

HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching

Vladimir Tankovich, Christian Hane, Yinda Zhang, Adarsh Kowdle, Sean Fanello, Sofien Bouaziz

This paper presents HITNet, a novel neural network architecture for real-time stereo matching. [Expand]

9.75
4
1
3
16
Thursday Poster Session
[440]

PatchmatchNet: Learned Multi-View Patchmatch Stereo

Fangjinhua Wang, Silvano Galliani, Christoph Vogel, Pablo Speciale, Marc Pollefeys

We present PatchmatchNet, a novel and learnable cascade formulation of Patchmatch for high-resolution multi-view stereo. [Expand]

9.75
3
2
32
Thursday Poster Session
[441]

Unsupervised Feature Learning by Cross-Level Instance-Group Discrimination

Xudong Wang, Ziwei Liu, Stella X. Yu

Unsupervised feature learning has made great strides with contrastive learning based on instance discrimination and invariant mapping, as benchmarked on curated class-balanced datasets. [Expand]

9.75
1
5
28
Thursday Poster Session
[442]

Pose-Guided Human Animation From a Single Image in the Wild

Jae Shin Yoon, Lingjie Liu, Vladislav Golyanik, Kripasindhu Sarkar, Hyun Soo Park, Christian Theobalt

We present a new pose transfer method for synthesizing a human animation from a single image of a person controlled by a sequence of body poses. [Expand]

9.75
2
1
5
20
Thursday Poster Session
[443]

Adversarial Imaging Pipelines

Buu Phan, Fahim Mannan, Felix Heide

Adversarial attacks play a critical role in understanding deep neural network predictions and improving their robustness. [Expand]

9.50
2
13
10
Friday Poster Session
[444]

Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

Fatemeh Saleh, Sadegh Aliakbarian, Hamid Rezatofighi, Mathieu Salzmann, Stephen Gould

Despite the recent advances in multiple object tracking (MOT), achieved by joint detection and tracking, dealing with long occlusions remains a challenge. [Expand]

9.50
1
0
1
32
Thursday Poster Session
[445]

Pareidolia Face Reenactment

Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He

We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video. [Expand]

9.50
0
3
32
Monday Poster Session
[446]

Counterfactual Zero-Shot and Open-Set Visual Recognition

Zhongqi Yue, Tan Wang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang

We present a novel counterfactual framework for both Zero-Shot Learning (ZSL) and Open-Set Recognition (OSR), whose common challenge is generalizing to the unseen-classes by only training on the seen-classes. [Expand]

9.50
4
0
3
16
Thursday Poster Session
[447]

Digital Gimbal: End-to-End Deep Image Stabilization With Learnable Exposure Times

Omer Dahary, Matan Jacoby, Alex M. Bronstein

Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. [Expand]

9.25
0
7
23
Thursday Poster Session
[448]

FBNetV3: Joint Architecture-Recipe Search Using Predictor Pretraining

Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu, Peter Vajda, Joseph E. Gonzalez

Neural Architecture Search (NAS) yields state-of-the-art neural networks that outperform their best manually-designed counterparts. [Expand]

9.25
4
9
15
Friday Poster Session
[449]

Checkerboard Context Model for Efficient Learned Image Compression

Dailan He, Yaoyan Zheng, Baocheng Sun, Yan Wang, Hongwei Qin

For learned image compression, the autoregressive context model is proved effective in improving the rate-distortion (RD) performance. [Expand]

9.25
0
6
25
Thursday Poster Session
[450]

Audio-Driven Emotional Video Portraits

Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu

Despite previous success in generating audio-driven talking heads, most of the previous studies focus on the correlation between speech content and the mouth shape. [Expand]

9.25
1
1
7
18
Thursday Poster Session
[451]

MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation

Sanjay Kariyappa, Atul Prakash, Moinuddin K Qureshi

High quality Machine Learning (ML) models are often considered valuable intellectual property by companies. [Expand]

9.25
7
0
0
9
Thursday Poster Session
[452]

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra

Neural architecture search (NAS) has shown great promise in designing state-of-the-art (SOTA) models that are both accurate and efficient. [Expand]

9.25
5
2
2
11
Tuesday Poster Session
[453]

Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification

Jianwen Xie, Yifei Xu, Zilong Zheng, Song-Chun Zhu, Ying Nian Wu

We propose a generative model of unordered point sets, such as point clouds, in the forms of an energy-based model, where the energy function is parameterized by an input-permutation-invariant bottom-up neural network. [Expand]

9.25
1
7
22
Thursday Poster Session
[454]

HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens

Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei Zhang, Chao Xu, Chunjing Xu, Dacheng Tao, Chang Xu

Neural Architecture Search (NAS) aims to automatically discover optimal architectures. [Expand]

9.25
4
2
4
11
Wednesday Poster Session
[455]

SimPoE: Simulated Character Control for 3D Human Pose Estimation

Ye Yuan, Shih-En Wei, Tomas Simon, Kris Kitani, Jason Saragih

Accurate estimation of 3D human motion from monocular video requires modeling both kinematics (body motion without physical forces) and dynamics (motion with physical forces). [Expand]

9.25
1
1
2
28
Wednesday Poster Session
[456]

Pushing It Out of the Way: Interactive Visual Navigation

Kuo-Hao Zeng, Luca Weihs, Ali Farhadi, Roozbeh Mottaghi

We have observed significant progress in visual navigation for embodied agents. [Expand]

9.25
2
5
25
Wednesday Poster Session
[457]

Asymmetric Metric Learning for Knowledge Transfer

Mateusz Budnik, Yannis Avrithis

Knowledge transfer from large teacher models to smaller student models has recently been studied for metric learning, focusing on fine-grained classification. [Expand]

9.00
2
1
4
19
Wednesday Poster Session
[458]

Distilling Knowledge via Knowledge Review

Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia

Knowledge distillation transfers knowledge from the teacher network to the student one, with the goal of greatly improving the performance of the student network. [Expand]

9.00
3
3
27
Tuesday Poster Session
[459]

Deep Polarization Imaging for 3D Shape and SVBRDF Acquisition

Valentin Deschaintre, Yiming Lin, Abhijeet Ghosh

We present a novel method for efficient acquisition of shape and spatially varying reflectance of 3D objects using polarization cues. [Expand]

9.00
1
7
21
Friday Poster Session
[460]

Searching by Generating: Flexible and Efficient One-Shot NAS With Architecture Generator

Sian-Yao Huang, Wei-Ta Chu

In one-shot NAS, sub-networks need to be searched from the supernet to meet different hardware constraints. [Expand]

9.00
2
9
16
Monday Poster Session
[461]

Revamping Cross-Modal Recipe Retrieval With Hierarchical Transformers and Self-Supervised Learning

Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser

Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models. [Expand]

9.00
0
3
30
Thursday Poster Session
[462]

AutoFlow: Learning a Better Training Set for Optical Flow

Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu

Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications. [Expand]

9.00
1
5
25
Wednesday Poster Session
[463]

PISE: Person Image Synthesis and Editing With Decoupled GAN

Jinsong Zhang, Kun Li, Yu-Kun Lai, Jingyu Yang

Person image synthesis, e.g., pose transfer, is a challenging problem due to large variation and occlusion. [Expand]

9.00
1
2
3
24
Wednesday Poster Session
[464]

Improving Calibration for Long-Tailed Recognition

Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia

Deep neural networks may perform poorly when training datasets are heavily class-imbalanced. [Expand]

9.00
0
6
24
Friday Poster Session
[465]

MP3: A Unified Model To Map, Perceive, Predict and Plan

Sergio Casas, Abbas Sadat, Raquel Urtasun

High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. [Expand]

8.75
1
5
24
Thursday Poster Session
[466]

Domain-Independent Dominance of Adaptive Methods

Pedro Savarese, David McAllester, Sudarshan Babu, Michael Maire

From a simplified analysis of adaptive methods, we derive AvaGrad, a new optimizer which outperforms SGD on vision tasks when its adaptability is properly tuned. [Expand]

8.75
5
1
3
8
Friday Poster Session
[467]

Rotation Equivariant Siamese Networks for Tracking

Deepak K. Gupta, Devanshu Arya, Efstratios Gavves

Rotation is among the long prevailing, yet still unresolved, hard challenges encountered in visual object tracking. [Expand]

8.50
2
5
22
Thursday Poster Session
[468]

Improving Unsupervised Image Clustering With Robust Learning

Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, Meeyoung Cha

Unsupervised image clustering methods often introduce alternative objectives to indirectly train the model and are subject to faulty predictions and overconfident results. [Expand]

8.50
1
2
4
20
Thursday Poster Session
[469]

Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation

Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci

In this paper we address multi-target domain adaptation (MTDA), where given one labeled source dataset and multiple unlabeled target datasets that differ in data distributions, the task is to learn a robust predictor for all the target domains. [Expand]

8.50
0
8
18
Tuesday Poster Session
[470]

Adaptive Class Suppression Loss for Long-Tail Object Detection

Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang

To address the problem of long-tail distribution for the large vocabulary object detection task, existing methods usually divide the whole categories into several groups and treat each group with different strategies. [Expand]

8.50
0
10
14
Tuesday Poster Session
[471]

Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation

Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, Nenghai Yu

Image matting is a fundamental and challenging problem in computer vision and graphics. [Expand]

8.50
0
5
24
Thursday Poster Session
[472]

Mitigating Face Recognition Bias via Group Adaptive Classifier

Sixue Gong, Xiaoming Liu, Anil K. Jain

Face recognition is known to exhibit bias -- subjects in a certain demographic group can be better recognized than other groups. [Expand]

8.25
5
2
3
5
Tuesday Poster Session
[473]

Interpolation-Based Semi-Supervised Learning for Object Detection

Jisoo Jeong, Vikas Verma, Minsung Hyun, Juho Kannala, Nojun Kwak

Despite the data labeling cost for the object detection tasks being substantially more than that of the classification tasks, semi-supervised learning methods for object detection have not been studied much. [Expand]

8.25
5
2
2
7
Thursday Poster Session
[474]

Learning Invariant Representations and Risks for Semi-Supervised Domain Adaptation

Bo Li, Yezhen Wang, Shanghang Zhang, Dongsheng Li, Kurt Keutzer, Trevor Darrell, Han Zhao

The success of supervised learning crucially hinges on the assumption that training data matches test data, which rarely holds in practice due to potential distribution shift. [Expand]

8.25
4
2
2
11
Monday Poster Session
[475]

DeepSurfels: Learning Online Appearance Fusion

Marko Mihajlovic, Silvan Weder, Marc Pollefeys, Martin R. Oswald

We present DeepSurfels, a novel hybrid scene representation for geometry and appearance information. [Expand]

8.25
0
3
27
Thursday Poster Session
[476]

Neural Camera Simulators

Hao Ouyang, Zifan Shi, Chenyang Lei, Ka Lung Law, Qifeng Chen

We present a controllable camera simulator based on deep neural networks to synthesize raw image data under different camera settings, including exposure time, ISO, and aperture. [Expand]

8.25
1
3
26
Wednesday Poster Session
[477]

The Neural Tangent Link Between CNN Denoisers and Non-Local Filters

Julian Tachella, Junqi Tang, Mike Davies

Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems. [Expand]

8.25
3
2
5
9
Wednesday Poster Session
[478]

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

Quin Thames, Arjun Karpur, Wade Norris, Fangting Xia, Liviu Panait, Tobias Weyand, Jack Sim

Understanding the nutritional content of food from visual data is a challenging computer vision problem, with the potential to have a positive and widespread impact on public health. [Expand]

8.25
2
1
0
24
Wednesday Poster Session
[479]

Data-Free Model Extraction

Jean-Baptiste Truong, Pratyush Maini, Robert J. Walls, Nicolas Papernot

Current model extraction attacks assume that the adversary has access to a surrogate dataset with characteristics similar to the proprietary data used to train the victim model. [Expand]

8.25
2
1
4
16
Tuesday Poster Session
[480]

The Multi-Temporal Urban Development SpaceNet Dataset

Adam Van Etten, Daniel Hogan, Jesus Martinez Manso, Jacob Shermeyer, Nicholas Weir, Ryan Lewis

Satellite imagery analytics have numerous human development and disaster response applications, particularly when time series methods are involved. [Expand]

8.25
1
0
11
7
Tuesday Poster Session
[481]

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

Panoptic segmentation unifies semantic segmentation and instance segmentation which has been attracting increasing attention in recent years. [Expand]

8.00
8
0
0
0
Wednesday Poster Session
[482]

Fully Convolutional Networks for Panoptic Segmentation

Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia

In this paper, we present a conceptually simple, strong, and efficient framework for panoptic segmentation, called Panoptic FCN. [Expand]

8.00
2
0
4
16
Monday Poster Session
[483]

Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li

Localizing persons and recognizing their actions from videos is a challenging task towards high-level video understanding. [Expand]

8.00
4
1
2
11
Monday Poster Session
[484]

Using Shape To Categorize: Low-Shot Learning With an Explicit Shape Bias

Stefan Stojanov, Anh Thai, James M. Rehg

It is widely accepted that reasoning about object shape is important for object recognition. [Expand]

8.00
1
9
13
Monday Poster Session
[485]

Structured Scene Memory for Vision-Language Navigation

Hanqing Wang, Wenguan Wang, Wei Liang, Caiming Xiong, Jianbing Shen

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., entailing an agent to navigate 3D environments through following linguistic instructions. [Expand]

8.00
1
1
3
21
Wednesday Poster Session
[486]

Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

Xue Yang, Liping Hou, Yue Zhou, Wentao Wang, Junchi Yan

Rotation detection serves as a fundamental building block in many visual applications involving aerial image, scene text, and face etc. [Expand]

8.00
7
0
0
4
Friday Poster Session
[487]

Rethinking BiSeNet for Real-Time Semantic Segmentation

Mingyuan Fan, Shenqi Lai, Junshi Huang, Xiaoming Wei, Zhenhua Chai, Junfeng Luo, Xiaolin Wei

BiSeNet has been proved to be a popular two-stream network for real-time segmentation. [Expand]

7.75
1
1
4
18
Wednesday Poster Session
[488]

Image Inpainting Guided by Coherence Priors of Semantics and Textures

Liang Liao, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin'ichi Satoh

Existing inpainting methods have achieved promising performance in recovering defected images of specific scenes. [Expand]

7.75
0
3
25
Tuesday Poster Session
[489]

Fair Attribute Classification Through Latent Space De-Biasing

Vikram V. Ramaswamy, Sunnie S. Y. Kim, Olga Russakovsky

Fairness in visual recognition is becoming a prominent and critical topic of discussion as recognition systems are deployed at scale in the real world. [Expand]

7.75
3
1
2
14
Wednesday Poster Session
[490]

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

Xinyi Wu, Zhenyao Wu, Hao Guo, Lili Ju, Song Wang

Semantic segmentation of nighttime images plays an equally important role as that of daytime images in autonomous driving, but the former is much more challenging due to poor illuminations and arduous human annotations. [Expand]

7.75
1
1
5
16
Friday Poster Session
[491]

Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach

Xingqian Xu, Zhifei Zhang, Zhaowen Wang, Brian Price, Zhonghao Wang, Humphrey Shi

Text segmentation is a prerequisite in many real-world text-related tasks, e.g., text style transfer, and scene text removal. [Expand]

7.75
2
0
2
19
Thursday Poster Session
[492]

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation

Pan Zhang, Bo Zhang, Ting Zhang, Dong Chen, Yong Wang, Fang Wen

Self-training is a competitive approach in domain adaptive segmentation, which trains the network with the pseudo labels on the target domain. [Expand]

7.75
5
0
2
7
Thursday Poster Session
[493]

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

Zhizhong Huang, Junping Zhang, Hongming Shan

To minimize the effects of age variation in face recognition, previous work either extracts identity-related discriminative features by minimizing the correlation between identity- and age-related features, called age-invariant face recognition (AIFR), or removes age variation by transforming the faces of different age groups into the same age group, called face age synthesis (FAS); however, the former lacks visual results for model interpretation while the latter suffers from artifacts compromising downstream recognition. [Expand]

7.50
2
2
2
16
Wednesday Poster Session
[494]

Coarse-Fine Networks for Temporal Activity Detection in Videos

Kumara Kahatapitiya, Michael S. Ryoo

In this paper, we introduce 'Coarse-Fine Networks', a two-stream architecture which benefits from different abstractions of temporal resolution to learn better video representations for long-term motion. [Expand]

7.50
1
2
2
20
Wednesday Poster Session
[495]

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang

Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries. [Expand]

7.50
1
5
19
Tuesday Poster Session
[496]

SMD-Nets: Stereo Mixture Density Networks

Fabio Tosi, Yiyi Liao, Carolin Schmitt, Andreas Geiger

Although stereo matching accuracy has greatly improved with deep learning in the last few years, recovering sharp boundaries and high-resolution outputs efficiently remains challenging. [Expand]

7.50
1
0
29
Wednesday Poster Session
[497]

Learning Accurate Dense Correspondences and When To Trust Them

Prune Truong, Martin Danelljan, Luc Van Gool, Radu Timofte

Establishing dense correspondences between a pair of images is an important and general problem. [Expand]

7.50
4
0
3
8
Tuesday Poster Session
[498]

Hierarchical and Partially Observable Goal-Driven Policy Learning With Goals Relational Graph

Xin Ye, Yezhou Yang

We present a novel two-layer hierarchical reinforcement learning approach equipped with a Goals Relational Graph (GRG) for tackling the partially observable goal-driven task, such as goal-driven visual navigation. [Expand]

7.50
1
1
3
19
Thursday Poster Session
[499]

RefineMask: Towards High-Quality Instance Segmentation With Fine-Grained Features

Gang Zhang, Xin Lu, Jingru Tan, Jianmin Li, Zhaoxiang Zhang, Quanquan Li, Xiaolin Hu

The two-stage methods for instance segmentation, e.g. [Expand]

7.50
0
6
18
Tuesday Poster Session
[500]

FSDR: Frequency Space Domain Randomization for Domain Generalization

Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

Domain generalization aims to learn a generalizable model from a 'known' source domain for various 'unknown' target domains. [Expand]

7.25
7
1
0
0
Tuesday Poster Session
[501]

Differentiable SLAM-Net: Learning Particle SLAM for Visual Navigation

Peter Karkus, Shaojun Cai, David Hsu

Simultaneous localization and mapping (SLAM) remains challenging for a number of downstream applications, such as visual robot navigation, because of rapid turns, featureless walls, and poor camera quality. [Expand]

7.25
0
5
19
Monday Poster Session
[502]

Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation

Zhengxiong Luo, Zhicheng Wang, Yan Huang, Liang Wang, Tieniu Tan, Erjin Zhou

Heatmap regression has become the most prevalent choice for today's human pose estimation methods. [Expand]

7.25
1
0
7
11
Thursday Poster Session
[503]

How Robust Are Randomized Smoothing Based Defenses to Data Poisoning?

Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm

Predictions of certifiably robust classifiers remain constant in a neighborhood of a point, making them resilient to test-time attacks with a guarantee. [Expand]

7.25
0
4
21
Thursday Poster Session
[504]

Improving Panoptic Segmentation at All Scales

Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder

Crop-based training strategies decouple training resolution from GPU memory consumption, allowing the use of large-capacity panoptic segmentation networks on multi-megapixel images. [Expand]

7.25
0
3
23
Wednesday Poster Session
[505]

AdderSR: Towards Energy Efficient Image Super-Resolution

Dehua Song, Yunhe Wang, Hanting Chen, Chang Xu, Chunjing Xu, Dacheng Tao

This paper studies the single image super-resolution problem using adder neural networks (AdderNets). [Expand]

7.25
4
2
4
3
Friday Poster Session
[506]

Truly Shift-Invariant Convolutional Neural Networks

Anadi Chaman, Ivan Dokmanic

Thanks to the use of convolution and pooling layers, convolutional neural networks were for a long time thought to be shift-invariant. [Expand]

7.00
3
1
4
7
Tuesday Poster Session
[507]

MetricOpt: Learning To Optimize Black-Box Evaluation Metrics

Chen Huang, Shuangfei Zhai, Pengsheng Guo, Josh Susskind

We study the problem of directly optimizing arbitrary non-differentiable task evaluation metrics such as misclassification rate and recall. [Expand]

7.00
0
5
18
Monday Poster Session
[508]

Bi-GCN: Binary Graph Convolutional Network

Junfu Wang, Yunhong Wang, Zhen Yang, Liang Yang, Yuanfang Guo

Graph Neural Networks (GNNs) have achieved tremendous success in graph representation learning. [Expand]

7.00
1
1
4
15
Monday Poster Session
[509]

DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

Meng Ye, Mikael Kanski, Dong Yang, Qi Chang, Zhennan Yan, Qiaoying Huang, Leon Axel, Dimitris Metaxas

Cardiac tagging magnetic resonance imaging (t-MRI) is the gold standard for regional myocardium deformation and cardiac strain estimation. [Expand]

7.00
0
8
12
Wednesday Poster Session
[510]

Dynamic Region-Aware Convolution

Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun

We propose a new convolution called Dynamic Region-Aware Convolution (DRConv), which can automatically assign multiple filters to corresponding spatial regions where features have similar representation. [Expand]

6.75
6
2
0
1
Wednesday Poster Session
[511]

Shared Cross-Modal Trajectory Prediction for Autonomous Driving

Chiho Choi, Joon Hee Choi, Jiachen Li, Srikanth Malla

Predicting future trajectories of traffic agents in highly interactive environments is an essential and challenging problem for the safe operation of autonomous driving systems. [Expand]

6.75
6
3
0
0
Monday Poster Session
[512]

SuperMix: Supervising the Mixing Data Augmentation

Ali Dabouei, Sobhan Soleymani, Fariborz Taherkhani, Nasser M. Nasrabadi

This paper presents a supervised mixing augmentation method termed SuperMix, which exploits the salient regions within input images to construct mixed training samples. [Expand]

6.75
6
1
1
0
Thursday Poster Session
[513]

Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy

Federico Paredes-Valles, Guido C. H. E. de Croon

Event cameras are novel vision sensors that sample, in an asynchronous fashion, brightness increments with low latency and high temporal resolution. [Expand]

6.75
2
1
5
8
Tuesday Poster Session
[514]

PU-GCN: Point Cloud Upsampling Using Graph Convolutional Networks

Guocheng Qian, Abdulellah Abualshour, Guohao Li, Ali Thabet, Bernard Ghanem

The effectiveness of learning-based point cloud upsampling pipelines heavily relies on the upsampling modules and feature extractors used therein. [Expand]

6.75
3
0
4
7
Thursday Poster Session
[515]

TrafficSim: Learning To Simulate Realistic Multi-Agent Behaviors

Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

Simulation has the potential to massively scale evaluation of self-driving systems, enabling rapid development as well as safe deployment. [Expand]

6.75
3
1
2
10
Wednesday Poster Session
[516]

Neural Descent for Visual 3D Human Pose and Shape

Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We present deep neural network methodology to reconstruct the 3D pose and shape of people, including hand gestures and facial expression, given an input RGB image. [Expand]

6.75
4
1
2
6
Thursday Poster Session
[517]

Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty

Bingbing Zhuang, Manmohan Chandraker

Learning methods for relative camera pose estimation have been developed largely in isolation from classical geometric approaches. [Expand]

6.75
0
4
19
Monday Poster Session
[518]

The Lottery Ticket Hypothesis for Object Recognition

Sharath Girish, Shishira R Maiya, Kamal Gupta, Hao Chen, Larry S. Davis, Abhinav Shrivastava

Recognition tasks, such as object recognition and keypoint estimation, have seen widespread adoption in recent years. [Expand]

6.50
6
2
0
0
Monday Poster Session
[519]

FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

Quande Liu, Cheng Chen, Jing Qin, Qi Dou, Pheng-Ann Heng

Federated learning allows distributed medical institutions to collaboratively learn a shared prediction model with privacy protection. [Expand]

6.50
4
1
3
3
Monday Poster Session
[520]

On Learning the Geodesic Path for Incremental Learning

Christian Simon, Piotr Koniusz, Mehrtash Harandi

Neural networks notoriously suffer from the problem of catastrophic forgetting, the phenomenon of forgetting the past knowledge when acquiring new knowledge. [Expand]

6.50
2
2
3
10
Monday Poster Session
[521]

TransFill: Reference-Guided Image Inpainting by Merging Multiple Color and Spatial Transformations

Yuqian Zhou, Connelly Barnes, Eli Shechtman, Sohrab Amirghodsi

Image inpainting is the task of plausibly restoring missing pixels within a hole region that is to be removed from a target image. [Expand]

6.50
0
4
18
Monday Poster Session
[522]

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

Umberto Michieli, Pietro Zanuttigh

Deep neural networks suffer from the major limitation of catastrophic forgetting old tasks when learning new ones. [Expand]

6.25
3
1
3
6
Monday Poster Session
[523]

Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions

Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva

When people observe events, they are able to abstract key information and build concise summaries of what is happening. [Expand]

6.25
1
2
3
13
Thursday Poster Session
[524]

TearingNet: Point Cloud Autoencoder To Learn Topology-Friendly Representations

Jiahao Pang, Duanshun Li, Dong Tian

Topology matters. [Expand]

6.25
2
0
2
13
Wednesday Poster Session
[525]

Uncertainty-Guided Model Generalization to Unseen Domains

Fengchun Qiao, Xi Peng

We study a worst-case scenario in generalization: Out-of-domain generalization from a single source. [Expand]

6.25
3
0
1
11
Tuesday Poster Session
[526]

Fingerspelling Detection in American Sign Language

Bowen Shi, Diane Brentari, Greg Shakhnarovich, Karen Livescu

Fingerspelling, in which words are signed letter by letter, is an important component of American Sign Language. [Expand]

6.25
3
2
18
Tuesday Poster Session
[527]

Semi-Supervised Action Recognition With Temporal Contrastive Learning

Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

Learning to recognize actions from only a handful of labeled videos is a challenging problem due to the scarcity of tediously collected activity labels. [Expand]

6.25
1
0
5
11
Wednesday Poster Session
[528]

Rectification-Based Knowledge Retention for Continual Learning

Pravendra Singh, Pratik Mazumder, Piyush Rai, Vinay P. Namboodiri

Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting. [Expand]

6.25
1
0
3
15
Thursday Poster Session
[529]

Multiple Instance Active Learning for Object Detection

Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, there still lacks an instance-level active learning method specified for object detection. [Expand]

6.25
1
0
4
13
Tuesday Poster Session
[530]

AQD: Towards Accurate Quantized Object Detection

Peng Chen, Jing Liu, Bohan Zhuang, Mingkui Tan, Chunhua Shen

Network quantization allows inference to be conducted using low-precision arithmetic for improved inference efficiency of deep neural networks on edge devices. [Expand]

6.00
2
2
4
6
Monday Poster Session
[531]

Polarimetric Normal Stereo

Yoshiki Fukao, Ryo Kawahara, Shohei Nobuhara, Ko Nishino

We introduce a novel method for recovering per-pixel surface normals from a pair of polarization cameras. [Expand]

6.00
0
7
10
Monday Poster Session
[532]

Point2Skeleton: Learning Skeletal Representations from Point Clouds

Cheng Lin, Changjian Li, Yuan Liu, Nenglun Chen, Yi-King Choi, Wenping Wang

We introduce Point2Skeleton, an unsupervised method to learn skeletal representations from point clouds. [Expand]

6.00
1
1
0
19
Tuesday Poster Session
[533]

Source-Free Domain Adaptation for Semantic Segmentation

Yuang Liu, Wei Zhang, Jun Wang

Unsupervised Domain Adaptation (UDA) can tackle the challenge that convolutional neural network (CNN)-based approaches for semantic segmentation heavily rely on the pixel-level annotated data, which is labor-intensive. [Expand]

6.00
1
1
2
15
Monday Poster Session
[534]

Delving Into Localization Errors for Monocular 3D Object Detection

Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang

Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving, while accurate 3D object detection from this kind of data is very challenging. [Expand]

6.00
1
0
5
10
Tuesday Poster Session
[535]

Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification

Oren Nuriel, Sagie Benaim, Lior Wolf

Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. [Expand]

6.00
2
0
3
10
Wednesday Poster Session
[536]

Multi-Attentional Deepfake Detection

Hanqing Zhao, Wenbo Zhou, Dongdong Chen, Tianyi Wei, Weiming Zhang, Nenghai Yu

Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns. [Expand]

6.00
2
2
5
4
Monday Poster Session
[537]

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides

Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data. [Expand]

6.00
3
0
2
8
Wednesday Poster Session
[538]

Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies

Jinzheng Cai, Youbao Tang, Ke Yan, Adam P. Harrison, Jing Xiao, Gigin Lin, Le Lu

Monitoring treatment response in longitudinal studies plays an important role in clinical practice. [Expand]

5.75
3
0
3
5
Thursday Poster Session
[539]

Shot Contrastive Self-Supervised Learning for Scene Boundary Detection

Shixing Chen, Xiaohan Nie, David Fan, Dongqing Zhang, Vimal Bhat, Raffay Hamid

Scenes play a crucial role in breaking the storyline of movies and TV episodes into semantically cohesive parts. [Expand]

5.75
1
3
16
Wednesday Poster Session
[540]

Semantic Palette: Guiding Scene Generation With Class Proportions

Guillaume Le Moing, Tuan-Hung Vu, Himalaya Jain, Patrick Perez, Matthieu Cord

Despite the recent progress of generative adversarial networks (GANs) at synthesizing photo-realistic images, producing complex urban scenes remains a challenging problem. [Expand]

5.75
2
3
15
Wednesday Poster Session
[541]

Generative Interventions for Causal Learning

Chengzhi Mao, Augustine Cha, Amogh Gupta, Hao Wang, Junfeng Yang, Carl Vondrick

We introduce a framework for learning robust visual representations that generalize to new viewpoints, backgrounds, and scene contexts. [Expand]

5.75
3
0
3
5
Tuesday Poster Session
[542]

Visual Navigation With Spatial Attention

Bar Mayo, Tamir Hazan, Ayellet Tal

This work focuses on object goal visual navigation, aiming at finding the location of an object from a given class, where in each step the agent is provided with an egocentric RGB image of the scene. [Expand]

5.75
2
3
15
Friday Poster Session
[543]

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin

State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution. [Expand]

5.75
5
1
0
2
Wednesday Poster Session
[544]

RGB-D Local Implicit Function for Depth Completion of Transparent Objects

Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox

Majority of the perception methods in robotics require depth information provided by RGB-D cameras. [Expand]

5.75
0
2
19
Tuesday Poster Session
[545]

Person30K: A Dual-Meta Generalization Network for Person Re-Identification

Yan Bai, Jile Jiao, Wang Ce, Jun Liu, Yihang Lou, Xuetao Feng, Ling-Yu Duan

Recently, person re-identification (ReID) has vastly benefited from the surging waves of data-driven methods. [Expand]

5.50
0
3
16
Monday Poster Session
[546]

Convolutional Dynamic Alignment Networks for Interpretable Classifications

Moritz Bohle, Mario Fritz, Bernt Schiele

We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets), which are performant classifiers with a high degree of inherent interpretability. [Expand]

5.50
0
4
14
Wednesday Poster Session
[547]

Semantic Audio-Visual Navigation

Changan Chen, Ziad Al-Halah, Kristen Grauman

Recent work on audio-visual navigation assumes a constantly-sounding target and restricts the role of audio to signaling the target's position. [Expand]

5.50
2
1
1
11
Thursday Poster Session
[548]

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

Siyan Dong, Qingnan Fan, He Wang, Ji Shi, Li Yi, Thomas Funkhouser, Baoquan Chen, Leonidas J. Guibas

Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, etc. [Expand]

5.50
1
1
2
13
Wednesday Poster Session
[549]

Optimal Gradient Checkpoint Search for Arbitrary Computation Graphs

Jianwei Feng, Dong Huang

Deep Neural Networks (DNNs) require huge GPU memory when training on modern image/video databases. [Expand]

5.50
1
1
1
15
Thursday Poster Session
[550]

Rank-One Prior: Toward Real-Time Scene Recovery

Jun Liu, Wen Liu, Jianing Sun, Tieyong Zeng

Scene recovery is a fundamental imaging task for several practical applications, e.g., video surveillance and autonomous vehicles, etc. [Expand]

5.50
0
5
12
Thursday Poster Session
[551]

2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition

Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis

3D convolutional networks are prevalent for video recognition. [Expand]

5.50
2
0
2
10
Tuesday Poster Session
[552]

Keep Your Eyes on the Lane: Real-Time Attention-Guided Lane Detection

Lucas Tabelini, Rodrigo Berriel, Thiago M. Paixao, Claudine Badue, Alberto F. De Souza, Thiago Oliveira-Santos

Modern lane detection methods have achieved remarkable performances in complex real-world scenarios, but many have issues maintaining real-time efficiency, which is important for autonomous vehicles. [Expand]

5.50
1
0
1
16
Monday Poster Session
[553]

Semi-Supervised Video Deraining With Dynamical Rain Generator

Zongsheng Yue, Jianwen Xie, Qian Zhao, Deyu Meng

While deep learning (DL)-based video deraining methods have achieved significant success recently, they still have two major drawbacks. [Expand]

5.50
1
1
19
Monday Poster Session
[554]

PCLs: Geometry-Aware Neural Reconstruction of 3D Pose With Perspective Crop Layers

Frank Yu, Mathieu Salzmann, Pascal Fua, Helge Rhodin

Local processing is an essential feature of CNNs and other neural network architectures -- it is one of the reasons why they work so well on images where relevant information is, to a large extent, local. [Expand]

5.50
1
1
2
13
Wednesday Poster Session
[555]

Deep Implicit Templates for 3D Shape Representation

Zerong Zheng, Tao Yu, Qionghai Dai, Yebin Liu

Deep implicit functions (DIFs), as a kind of 3D shape representation, are becoming more and more popular in the 3D vision community due to their compactness and strong representation power. [Expand]

5.50
2
0
2
10
Monday Poster Session
[556]

DOTS: Decoupling Operation and Topology in Differentiable Architecture Search

Yu-Chao Gu, Li-Juan Wang, Yun Liu, Yi Yang, Yu-Huan Wu, Shao-Ping Lu, Ming-Ming Cheng

Differentiable Architecture Search (DARTS) has attracted extensive attention due to its efficiency in searching for cell structures. [Expand]

5.25
4
1
1
2
Thursday Poster Session
[557]

Distilling Causal Effect of Data in Class-Incremental Learning

Xinting Hu, Kaihua Tang, Chunyan Miao, Xian-Sheng Hua, Hanwang Zhang

We propose a causal framework to explain the catastrophic forgetting in Class-Incremental Learning (CIL) and then derive a novel distillation method that is orthogonal to the existing anti-forgetting techniques, such as data replay and feature/label distillation. [Expand]

5.25
5
0
0
1
Tuesday Poster Session
[558]

Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation

Rongjie Li, Songyang Zhang, Bo Wan, Xuming He

Scene graph generation is an important visual understanding task with a broad range of vision applications. [Expand]

5.25
0
4
13
Wednesday Poster Session
[559]

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li

Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. [Expand]

5.25
5
1
0
0
Wednesday Poster Session
[560]

Simultaneously Localize, Segment and Rank the Camouflaged Objects

Yunqiu Lv, Jing Zhang, Yuchao Dai, Aixuan Li, Bowen Liu, Nick Barnes, Deng-Ping Fan

Camouflage is a key defence mechanism across species that is critical to survival. [Expand]

5.25
5
1
0
0
Thursday Poster Session
[561]

Learning Camera Localization via Dense Scene Matching

Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu, Ping Tan

Camera localization aims to estimate 6 DoF camera poses from RGB images. [Expand]

5.25
0
2
17
Monday Poster Session
[562]

Diverse Semantic Image Synthesis via Probability Distribution Modeling

Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

Semantic image synthesis, translating semantic layouts to photo-realistic images, is a one-to-many mapping problem. [Expand]

5.25
0
2
17
Wednesday Poster Session
[563]

Divergence Optimization for Noisy Universal Domain Adaptation

Qing Yu, Atsushi Hashimoto, Yoshitaka Ushiku

Universal domain adaptation (UniDA) has been proposed to transfer knowledge learned from a label-rich source domain to a label-scarce target domain without any constraints on the label sets. [Expand]

5.25
0
8
5
Monday Poster Session
[564]

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Shancheng Fang, Hongtao Xie, Yuxin Wang, Zhendong Mao, Yongdong Zhang

Linguistic knowledge is of great benefit to scene text recognition. [Expand]

5.00
4
2
12
Wednesday Poster Session
[565]

Representative Batch Normalization With Feature Calibration

Shang-Hua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng

Batch Normalization (BatchNorm) has become the default component in modern neural networks to stabilize training. [Expand]

5.00
5
Wednesday Poster Session
[566]

FrameExit: Conditional Early Exiting for Efficient Video Recognition

Amir Ghodrati, Babak Ehteshami Bejnordi, Amirhossein Habibian

In this paper, we propose a conditional early exiting framework for efficient video recognition. [Expand]

5.00
1
0
1
14
Friday Poster Session
[567]

Achieving Robustness in Classification Using Optimal Transport With Hinge Regularization

Mathieu Serrurier, Franck Mamalet, Alberto Gonzalez-Sanz, Thibaut Boissin, Jean-Michel Loubes, Eustasio del Barrio

Adversarial examples have pointed out Deep Neural Network's vulnerability to small local noise. [Expand]

5.00
3
2
0
6
Monday Poster Session
[568]

Understanding the Behaviour of Contrastive Loss

Feng Wang, Huaping Liu

Unsupervised contrastive learning has achieved outstanding success, while the mechanism of contrastive loss has been less studied. [Expand]

5.00
5
Monday Poster Session
[569]

TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption

Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Florencio, Lijuan Wang, Cha Zhang, Lei Zhang, Jiebo Luo

In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and Text-Caption tasks. [Expand]

5.00
5
Wednesday Poster Session
[570]

Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation

Yazhou Yao, Tao Chen, Guo-Sen Xie, Chuanyi Zhang, Fumin Shen, Qi Wu, Zhenmin Tang, Jian Zhang

Semantic segmentation aims to classify every pixel of an input image. [Expand]

5.00
2
0
3
6
Monday Poster Session
[571]

Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification

Yuyang Zhao, Zhun Zhong, Fengxiang Yang, Zhiming Luo, Yaojin Lin, Shaozi Li, Nicu Sebe

Recent advances in person re-identification (ReID) obtain impressive accuracy in the supervised and unsupervised learning settings. [Expand]

5.00
5
Tuesday Poster Session
[572]

Learning the Best Pooling Strategy for Visual Semantic Embedding

Jiacheng Chen, Hexiang Hu, Hao Wu, Yuning Jiang, Changhu Wang

Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions. [Expand]

4.75
3
0
0
7
Friday Poster Session
[573]

Context-Aware Layout to Image Generation With Enhanced Object Appearance

Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. [Expand]

4.75
1
1
1
12
Thursday Poster Session
[574]

Neural Response Interpretation Through the Lens of Critical Pathways

Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab

Is critical input information encoded in specific sparse pathways within the neural network? In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input. [Expand]

4.75
2
1
2
6
Thursday Poster Session
[575]

3D-to-2D Distillation for Indoor Scene Parsing

Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

Indoor scene semantic parsing from RGB images is very challenging due to occlusions, object distortion, and viewpoint variations. [Expand]

4.75
2
1
1
8
Tuesday Poster Session
[576]

Combining Semantic Guidance and Deep Reinforcement Learning for Generating Human Level Paintings

Jaskirat Singh, Liang Zheng

Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community. [Expand]

4.75
1
1
4
6
Friday Poster Session
[577]

Joint Learning of 3D Shape Retrieval and Deformation

Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri, Leonidas J. Guibas

We propose a novel technique for producing high-quality 3D models that match a given target object image or scan. [Expand]

4.75
2
0
1
9
Thursday Poster Session
[578]

Deep Gradient Projection Networks for Pan-sharpening

Shuang Xu, Jiangshe Zhang, Zixiang Zhao, Kai Sun, Junmin Liu, Chunxia Zhang

Pan-sharpening is an important technique for remote sensing imaging systems to obtain high resolution multispectral images. [Expand]

4.75
0
2
15
Monday Poster Session
[579]

Learning Semantic-Aware Dynamics for Video Prediction

Xinzhu Bei, Yanchao Yang, Stefano Soatto

We propose an architecture and training scheme to predict video frames by explicitly modeling dis-occlusions and capturing the evolution of semantically consistent regions in the video. [Expand]

4.50
0
3
12
Monday Poster Session
[580]

Mixed-Privacy Forgetting in Deep Networks

Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting. [Expand]

4.50
1
1
3
7
Monday Poster Session
[581]

Track, Check, Repeat: An EM Approach to Unsupervised Tracking

Adam W. Harley, Yiming Zuo, Jing Wen, Ayush Mangal, Shubhankar Potdar, Ritwick Chaudhry, Katerina Fragkiadaki

We propose an unsupervised method for detecting and tracking moving objects in 3D, in unlabelled RGB-D videos. [Expand]

4.50
0
0
18
Friday Poster Session
[582]

Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging

Tao Huang, Weisheng Dong, Xin Yuan, Jinjian Wu, Guangming Shi

In a coded aperture snapshot spectral imaging (CASSI) system, the real-world hyperspectral image (HSI) can be reconstructed from the captured compressive image in a snapshot. [Expand]

4.50
3
1
1
3
Friday Poster Session
[583]

SetVAE: Learning Hierarchical Composition for Generative Modeling of Set-Structured Data

Jinwoo Kim, Jaehoon Yoo, Juho Lee, Seunghoon Hong

Generative modeling of set-structured data, such as point clouds, requires reasoning over local and global structures at various scales. [Expand]

4.50
1
0
17
Thursday Poster Session
[584]

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

Modern 3D object detectors have immensely benefited from the end-to-end learning idea. [Expand]

4.50
2
2
12
Wednesday Poster Session
[585]

BRepNet: A Topological Message Passing System for Solid Models

Joseph G. Lambourne, Karl D.D. Willis, Pradeep Kumar Jayaraman, Aditya Sanghi, Peter Meltzer, Hooman Shayani

Boundary representation (B-rep) models are the standard way 3D shapes are described in Computer-Aided Design (CAD) applications. [Expand]

4.50
3
0
1
4
Thursday Poster Session
[586]

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

Shi Qiu, Saeed Anwar, Nick Barnes

Given the prominence of current 3D sensors, a fine-grained analysis on the basic point cloud data is worthy of further investigation. [Expand]

4.50
1
3
11
Monday Poster Session
[587]

TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking

N Dinesh Reddy, Laurent Guigues, Leonid Pishchulin, Jayan Eledath, Srinivasa G. Narasimhan

We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. [Expand]

4.50
0
0
18
Thursday Poster Session
[588]

DISCO: Dynamic and Invariant Sensitive Channel Obfuscation for Deep Neural Networks

Abhishek Singh, Ayush Chopra, Ethan Garza, Emily Zhang, Praneeth Vepakomma, Vivek Sharma, Ramesh Raskar

Recent deep learning models have shown remarkable performance in image classification. [Expand]

4.50
1
4
9
Thursday Poster Session
[589]

Practical Wide-Angle Portraits Correction With Deep Structured Models

Jing Tan, Shan Zhao, Pengfei Xiong, Jiangyu Liu, Haoqiang Fan, Shuaicheng Liu

Wide-angle portraits often enjoy expanded views. [Expand]

4.50
1
2
13
Tuesday Poster Session
[590]

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

Longyin Wen, Dawei Du, Pengfei Zhu, Qinghua Hu, Qilong Wang, Liefeng Bo, Siwei Lyu

To promote the development of object detection, tracking, and counting algorithms in drone-captured videos, we construct a benchmark with a new large-scale drone-captured dataset, named DroneCrowd, formed by 112 video clips with 33,600 HD frames in various scenarios. [Expand]

4.50
1
1
15
Wednesday Poster Session
[591]

Regularizing Neural Networks via Adversarial Model Perturbation

Yaowei Zheng, Richong Zhang, Yongyi Mao

Effective regularization techniques are highly desired in deep learning for alleviating overfitting and improving generalization. [Expand]

4.50
2
2
2
4
Wednesday Poster Session
[592]

Removing Diffraction Image Artifacts in Under-Display Camera via Dynamic Skip Connection Network

Ruicheng Feng, Chongyi Li, Huaijin Chen, Shuai Li, Chen Change Loy, Jinwei Gu

Recent development of Under-Display Camera (UDC) systems provides a true bezel-less and notch-free viewing experience on smartphones (and TV, laptops, tablets), while allowing images to be captured from the selfie camera embedded underneath. [Expand]

4.25
0
2
13
Monday Poster Session
[593]

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

Shuang Li, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Feng Qiao, Xinjing Cheng

Real-world training data usually exhibits long-tailed distribution, where several majority classes have a significantly larger number of samples than the remaining minority classes. [Expand]

4.25
1
3
10
Tuesday Poster Session
[594]

Simulating Unknown Target Models for Query-Efficient Black-Box Attacks

Chen Ma, Li Chen, Jun-Hai Yong

Many adversarial attacks have been proposed to investigate the security issues of deep neural networks. [Expand]

4.25
1
6
4
Thursday Poster Session
[595]

Beyond Image to Depth: Improving Depth Prediction Using Echoes

Kranti Kumar Parida, Siddharth Srivastava, Gaurav Sharma

We address the problem of estimating depth with multi modal audio visual data. [Expand]

4.25
1
2
0
11
Wednesday Poster Session
[596]

On the Difficulty of Membership Inference Attacks

Shahbaz Rezaei, Xin Liu

Recent studies propose membership inference (MI) attacks on deep models, where the goal is to infer if a sample has been used in the training process. [Expand]

4.25
2
5
5
Wednesday Poster Session
[597]

Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-Constrained Optimization

Fakai Wang, Kang Zheng, Le Lu, Jing Xiao, Min Wu, Shun Miao

Accurate vertebra localization and identification are required in many clinical applications of spine disorder diagnosis and surgery planning. [Expand]

4.25
0
4
9
Tuesday Poster Session
[598]

Single-View 3D Object Reconstruction From Shape Priors in Memory

Shuo Yang, Min Xu, Haozhe Xie, Stuart Perry, Jiahao Xia

Existing methods for single-view 3D object reconstruction directly learn to transform image features into 3D representations. [Expand]

4.25
1
3
10
Tuesday Poster Session
[599]

Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search

Yibo Yang, Shan You, Hongyang Li, Fei Wang, Chen Qian, Zhouchen Lin

Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation. [Expand]

4.25
4
0
0
1
Tuesday Poster Session
[600]

Open-Vocabulary Object Detection Using Captions

Alireza Zareian, Kevin Dela Rosa, Derek Hao Hu, Shih-Fu Chang

Despite the remarkable accuracy of deep neural networks in object detection, they are costly to train and scale due to supervision requirements. [Expand]

4.25
1
1
1
10
Thursday Poster Session
[601]

CoLA: Weakly-Supervised Temporal Action Localization With Snippet Contrastive Learning

Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou

Weakly-supervised temporal action localization (WS-TAL) aims to localize actions in untrimmed videos with only video-level labels. [Expand]

4.25
2
0
2
5
Friday Poster Session
[602]

Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework

Qiang Zhou, Chaohui Yu, Zhibin Wang, Qi Qian, Hao Li

Supervised learning based object detection frameworks demand plenty of laborious manual annotations, which may not be practical in real applications. [Expand]

4.25
1
1
3
6
Tuesday Poster Session
[603]

Face Forgery Detection by 3D Decomposition

Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li

Detecting digital face manipulation has attracted extensive attention due to the potential harms of fake media to the public. [Expand]

4.25
1
0
4
5
Tuesday Poster Session
[604]

How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks With DenseNet-Type Skip Connections?

Kartikeya Bhardwaj, Guihong Li, Radu Marculescu

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. [Expand]

4.00
3
2
9
Thursday Poster Session
[605]

Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning

Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. [Expand]

4.00
2
0
1
6
Monday Poster Session
[606]

SLADE: A Self-Training Framework for Distance Metric Learning

Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo

Most existing distance metric learning approaches use fully labeled data to learn the sample similarities in an embedding space. [Expand]

4.00
0
3
10
Wednesday Poster Session
[607]

Generalized Few-Shot Object Detection Without Forgetting

Zhibo Fan, Yuchen Ma, Zeming Li, Jian Sun

Learning object detection from few examples recently emerged to deal with data-limited situations. [Expand]

4.00
1
2
11
Tuesday Poster Session
[608]

Regressive Domain Adaptation for Unsupervised Keypoint Detection

Junguang Jiang, Yifei Ji, Ximei Wang, Yufeng Liu, Jianmin Wang, Mingsheng Long

Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain. [Expand]

4.00
3
2
9
Tuesday Poster Session
[609]

Taskology: Utilizing Task Relations at Scale

Yao Lu, Soren Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon

Many computer vision tasks address the problem of scene understanding and are naturally interrelated e.g. [Expand]

4.00
3
1
0
3
Wednesday Poster Session
[610]

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Zhiwu Qing, Haisheng Su, Weihao Gan, Dongliang Wang, Wei Wu, Xiang Wang, Yu Qiao, Junjie Yan, Changxin Gao, Nong Sang

Temporal action proposal generation aims to estimate temporal intervals of actions in untrimmed videos, which is a challenging yet important task in the video understanding field. [Expand]

4.00
4
Monday Poster Session
[611]

Affective Processes: Stochastic Modelling of Temporal Context for Emotion and Facial Expression Recognition

Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos

Temporal context is key to the recognition of expressions of emotion. [Expand]

4.00
2
2
10
Wednesday Poster Session
[612]

Look Before You Speak: Visually Contextualized Utterances

Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid

While most conversational AI systems focus on textual dialogue only, conditioning utterances on visual context (when it's available) can lead to more realistic conversations. [Expand]

4.00
1
0
1
10
Friday Poster Session
[613]

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

Yapeng Tian, Di Hu, Chenliang Xu

There are rich synchronized audio and visual events in our daily life. [Expand]

4.00
4
Monday Poster Session
[614]

CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied. [Expand]

4.00
3
1
0
3
Wednesday Poster Session
[615]

Deep Optimized Priors for 3D Shape Modeling and Reconstruction

Mingyue Yang, Yuxin Wen, Weikai Chen, Yongwei Chen, Kui Jia

Many learning-based approaches have difficulty scaling to unseen data, as the generality of their learned priors is limited to the scale and variations of the training samples. [Expand]

4.00
3
0
1
2
Tuesday Poster Session
[616]

Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition

Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun

Despite the success of the deep neural networks, it remains challenging to effectively build a system for long-tail visual recognition tasks. [Expand]

4.00
0
4
8
Monday Poster Session
[617]

Few-Shot 3D Point Cloud Semantic Segmentation

Na Zhao, Tat-Seng Chua, Gim Hee Lee

Many existing approaches for 3D point cloud semantic segmentation are fully supervised. [Expand]

4.00
2
2
0
6
Wednesday Poster Session
[618]

A Second-Order Approach to Learning With Instance-Dependent Label Noise

Zhaowei Zhu, Tongliang Liu, Yang Liu

The presence of label noise often misleads the training of deep neural networks. [Expand]

4.00
4
0
0
0
Wednesday Poster Session
[619]

Meta Batch-Instance Normalization for Generalizable Person Re-Identification

Seokeon Choi, Taekyung Kim, Minki Jeong, Hyoungseob Park, Changick Kim

Although supervised person re-identification (Re-ID) methods have shown impressive performance, they suffer from a poor generalization capability on unseen domains. [Expand]

3.75
3
0
0
3
Tuesday Poster Session
[620]

A Peek Into the Reasoning of Neural Networks: Interpreting With Structural Visual Concepts

Yunhao Ge, Yao Xiao, Zhi Xu, Meng Zheng, Srikrishna Karanam, Terrence Chen, Laurent Itti, Ziyan Wu

Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. [Expand]

3.75
1
1
12
Monday Poster Session
[621]

StyleMix: Separating Content and Style for Enhanced Data Augmentation

Minui Hong, Jinwoo Choi, Gunhee Kim

In spite of the great success of deep neural networks for many challenging classification tasks, the learned networks are vulnerable to overfitting and adversarial attacks. [Expand]

3.75
0
1
13
Thursday Poster Session
[622]

General Multi-Label Image Classification With Transformers

Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. [Expand]

3.75
1
1
2
6
Friday Poster Session
[623]

Model-Contrastive Federated Learning

Qinbin Li, Bingsheng He, Dawn Song

Federated learning enables multiple parties to collaboratively train a machine learning model without communicating their local data. [Expand]

3.75
1
2
1
7
Wednesday Poster Session
[624]

UAV-Human: A Large Benchmark for Human Behavior Understanding With Unmanned Aerial Vehicles

Tianjiao Li, Jun Liu, Wei Zhang, Yun Ni, Wenqian Wang, Zhiheng Li

Human behavior understanding with unmanned aerial vehicles (UAVs) is of great significance for a wide range of applications, which simultaneously brings an urgent demand for large, challenging, and comprehensive benchmarks for the development and evaluation of UAV-based models. [Expand]

3.75
1
3
8
Friday Poster Session
[625]

Learning Asynchronous and Sparse Human-Object Interaction in Videos

Romero Morais, Vuong Le, Svetha Venkatesh, Truyen Tran

Human activities can be learned from video. [Expand]

3.75
1
3
8
Friday Poster Session
[626]

Neural Prototype Trees for Interpretable Fine-Grained Image Recognition

Meike Nauta, Ron van Bree, Christin Seifert

Prototype-based methods use interpretable representations to address the black-box nature of deep learning models, in contrast to post-hoc explanation methods that only approximate such models. [Expand]

3.75
1
1
2
6
Thursday Poster Session
[627]

PGT: A Progressive Method for Training Models on Long Videos

Bo Pang, Gao Peng, Yizhuo Li, Cewu Lu

Convolutional video models have an order of magnitude larger computational complexity than their counterpart image-level models. [Expand]

3.75
0
2
11
Thursday Poster Session
[628]

3D Object Detection With Pointformer

Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang

Feature learning for 3D object detection from point clouds is very challenging due to the irregularity of 3D point cloud data. [Expand]

3.75
3
0
0
3
Wednesday Poster Session
[629]

DeFlow: Learning Complex Image Degradations From Unpaired Data With Conditional Flows

Valentin Wolf, Andreas Lugmayr, Martin Danelljan, Luc Van Gool, Radu Timofte

The difficulty of obtaining paired data remains a major bottleneck for learning image restoration and enhancement models for real-world applications. [Expand]

3.75
2
2
1
3
Monday Poster Session
[630]

A Decomposition Model for Stereo Matching

Chengtang Yao, Yunde Jia, Huijun Di, Pengxiang Li, Yuwei Wu

In this paper, we present a decomposition model for stereo matching to solve the problem of excessive growth in computational cost (time and memory cost) as the resolution increases. [Expand]

3.75
0
2
11
Tuesday Poster Session
[631]

Transitional Adaptation of Pretrained Models for Visual Storytelling

Youngjae Yu, Jiwan Chung, Heeseung Yun, Jongseok Kim, Gunhee Kim

Previous models for vision-to-language generation tasks usually pretrain a visual encoder and a language generator in the respective domains and jointly finetune them with the target task. [Expand]

3.75
0
1
13
Thursday Poster Session
[632]

Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

Elmira Amirloo, Mohsen Rohani, Ershad Banijamali, Jun Luo, Pascal Poupart

In this paper we propose a system consisting of a modular network and a trajectory planner. [Expand]

3.50
0
3
8
Wednesday Poster Session
[633]

A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images Detection

Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Ngai-Man Cheung

CNN-based generative modelling has evolved to produce synthetic images indistinguishable from real images in the RGB pixel space. [Expand]

3.50
2
2
8
Wednesday Poster Session
[634]

Transformer Tracking

Xin Chen, Bin Yan, Jiawen Zhu, Dong Wang, Xiaoyun Yang, Huchuan Lu

Correlation plays a critical role in the tracking field, especially in recent popular Siamese-based trackers. [Expand]

3.50
3
1
0
1
Wednesday Poster Session
[635]

Recurrent Multi-View Alignment Network for Unsupervised Surface Registration

Wanquan Feng, Juyong Zhang, Hongrui Cai, Haofei Xu, Junhui Hou, Hujun Bao

Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data. [Expand]

3.50
0
1
12
Wednesday Poster Session
[636]

Domain Adaptation With Auxiliary Target Domain-Oriented Classifier

Jian Liang, Dapeng Hu, Jiashi Feng

Domain adaptation (DA) aims to transfer knowledge from a label-rich but heterogeneous domain to a label-scarce domain, which alleviates the labeling efforts and attracts considerable attention. [Expand]

3.50
2
1
10
Friday Poster Session
[637]

Discovering Hidden Physics Behind Transport Dynamics

Peirong Liu, Lin Tian, Yubo Zhang, Stephen Aylward, Yueh Lee, Marc Niethammer

Transport processes are ubiquitous. [Expand]

3.50
1
1
11
Wednesday Poster Session
[638]

Multi-Person Implicit Reconstruction From a Single Image

Armin Mustafa, Akin Caliskan, Lourdes Agapito, Adrian Hilton

We present a new end-to-end learning framework to obtain detailed and spatially coherent reconstructions of multiple people from a single image. [Expand]

3.50
2
2
8
Thursday Poster Session
[639]

Offboard 3D Object Detection From Point Cloud Sequences

Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, Dragomir Anguelov

While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels. [Expand]

3.50
1
0
2
6
Tuesday Poster Session
[640]

Backdoor Attacks Against Deep Learning Systems in the Physical World

Emily Wenger, Josephine Passananti, Arjun Nitin Bhagoji, Yuanshun Yao, Haitao Zheng, Ben Y. Zhao

Backdoor attacks embed hidden malicious behaviors into deep learning models, which only activate and cause misclassifications on model inputs containing a specific "trigger." Existing works on backdoor attacks and defenses, however, mostly focus on digital attacks that apply digitally generated patterns as triggers. [Expand]

3.50
3
4
3
Tuesday Poster Session
[641]

Neural Splines: Fitting 3D Surfaces With Infinitely-Wide Neural Networks

Francis Williams, Matthew Trager, Joan Bruna, Denis Zorin

We present Neural Splines, a technique for 3D surface reconstruction that is based on random feature kernels arising from infinitely-wide shallow ReLU networks. [Expand]

3.50
1
2
0
8
Wednesday Poster Session
[642]

Track To Detect and Segment: An Online Multi-Object Tracker

Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan

Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking. [Expand]

3.50
1
1
1
7
Thursday Poster Session
[643]

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

Tianfei Zhou, Wenguan Wang, Si Liu, Yi Yang, Luc Van Gool

To address the challenging task of instance-aware human part parsing, a new bottom-up regime is proposed to learn category-level human semantic segmentation as well as multi-person pose estimation in a joint and end-to-end manner. [Expand]

3.50
1
0
1
8
Monday Poster Session
[644]

One Shot Face Swapping on Megapixels

Yuhao Zhu, Qi Li, Jian Wang, Cheng-Zhong Xu, Zhenan Sun

Face swapping has both positive applications such as entertainment, human-computer interaction, etc., and negative applications such as DeepFake threats to politics, economics, etc. [Expand]

3.50
1
0
13
Tuesday Poster Session
[645]

PointDSC: Robust Point Cloud Registration Using Deep Spatial Consistency

Xuyang Bai, Zixin Luo, Lei Zhou, Hongkai Chen, Lei Li, Zeyu Hu, Hongbo Fu, Chiew-Lan Tai

Removing outlier correspondences is one of the critical steps for successful feature-based point cloud registration. [Expand]

3.25
1
3
6
Friday Poster Session
[646]

RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening

Sungha Choi, Sanghun Jung, Huiwon Yun, Joanne T. Kim, Seungryong Kim, Jaegul Choo

Enhancing the generalization capability of deep neural networks to unseen domains is crucial for safety-critical applications in the real world such as autonomous driving. [Expand]

3.25
1
0
1
7
Thursday Poster Session
[647]

Anomaly Detection in Video via Self-Supervised and Multi-Task Learning

Mariana-Iuliana Georgescu, Antonio Barbalau, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah

Anomaly detection in video is a challenging computer vision problem. [Expand]

3.25
3
1
0
0
Thursday Poster Session
[648]

Graph Attention Tracking

Dongyan Guo, Yanyan Shao, Ying Cui, Zhenhua Wang, Liyan Zhang, Chunhua Shen

Siamese network based trackers formulate the visual tracking task as a similarity matching problem. [Expand]

3.25
3
0
0
1
Wednesday Poster Session
[649]

Group Whitening: Balancing Learning Efficiency and Representational Capacity

Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao

Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. [Expand]

3.25
2
1
9
Wednesday Poster Session
[650]

Monocular Depth Estimation via Listwise Ranking Using the Plackett-Luce Model

Julian Lienen, Eyke Hullermeier, Ralph Ewerth, Nils Nommensen

In many real-world applications, the relative depth of objects in an image is crucial for scene understanding. [Expand]

3.25
1
0
1
7
Thursday Poster Session
[651]

Adaptive Aggregation Networks for Class-Incremental Learning

Yaoyao Liu, Bernt Schiele, Qianru Sun

Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase. [Expand]

3.25
2
1
9
Monday Poster Session
[652]

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

Rui Liu, Yixiao Ge, Ching Lam Choi, Xiaogang Wang, Hongsheng Li

Conditional generative adversarial networks (cGANs) aim to synthesize diverse images given the input conditions and latent codes, but unfortunately, they usually suffer from the issue of mode collapse. [Expand]

3.25
2
1
0
4
Friday Poster Session
[653]

Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

Mamshad Nayeem Rizve, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

In many real-world problems, collecting a large number of labeled samples is infeasible. [Expand]

3.25
1
1
1
6
Wednesday Poster Session
[654]

StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song

Sketch-based image retrieval (SBIR) is a cross-modal matching problem which is typically solved by learning a joint embedding space where the semantic content shared between photo and sketch modalities is preserved. [Expand]

3.25
3
1
0
0
Wednesday Poster Session
[655]

Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing

Wen Shen, Zhihua Wei, Shikun Huang, Binbin Zhang, Panyue Chen, Ping Zhao, Quanshi Zhang

In this paper, we diagnose deep neural networks for 3D point cloud processing to explore utilities of different network architectures. [Expand]

3.25
2
2
7
Wednesday Poster Session
[656]

Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction

Jiapeng Tang, Dan Xu, Kui Jia, Lei Zhang

This paper focuses on the task of 4D shape reconstruction from a sequence of point clouds. [Expand]

3.25
1
0
2
5
Tuesday Poster Session
[657]

SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud Based Place Recognition

Yan Xia, Yusheng Xu, Shuang Li, Rui Wang, Juan Du, Daniel Cremers, Uwe Stilla

We tackle the problem of place recognition from point cloud data and introduce a self-attention and orientation encoding network (SOE-Net) that fully explores the relationship between points and incorporates long-range context into point-wise local descriptors. [Expand]

3.25
3
0
0
1
Thursday Poster Session
[658]

Cross-Iteration Batch Normalization

Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin

A well-known issue of Batch Normalization is its significantly reduced effectiveness in the case of small mini-batch sizes. [Expand]

3.25
1
2
8
Thursday Poster Session
[659]

Prototype Completion With Primitive Knowledge for Few-Shot Learning

Baoquan Zhang, Xutao Li, Yunming Ye, Zhichao Huang, Lisai Zhang

Few-shot learning is a challenging task, which aims to learn a classifier for novel classes with few examples. [Expand]

3.25
1
2
2
3
Tuesday Poster Session
[660]

OpenMix: Reviving Known Knowledge for Discovering Novel Visual Categories in an Open World

Zhun Zhong, Linchao Zhu, Zhiming Luo, Shaozi Li, Yi Yang, Nicu Sebe

In this paper, we tackle the problem of discovering new classes in unlabeled visual data given labeled data from disjoint classes. [Expand]

3.25
3
1
0
0
Wednesday Poster Session
[661]

Binary Graph Neural Networks

Mehdi Bahri, Gaetan Bahl, Stefanos Zafeiriou

Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data. [Expand]

3.00
1
1
9
Wednesday Poster Session
[662]

Memory-Efficient Network for Large-Scale Video Compressive Sensing

Ziheng Cheng, Bo Chen, Guanliang Liu, Hao Zhang, Ruiying Lu, Zhengjue Wang, Xin Yuan

Video snapshot compressive imaging (SCI) captures a sequence of video frames in a single shot using a 2D detector. [Expand]

3.00
3
Friday Poster Session
[663]

NBNet: Noise Basis Learning for Image Denoising With Subspace Projection

Shen Cheng, Yuzhi Wang, Haibin Huang, Donghao Liu, Haoqiang Fan, Shuaicheng Liu

In this paper, we introduce NBNet, a novel framework for image denoising. [Expand]

3.00
1
0
1
6
Tuesday Poster Session
[664]

FS-Net: Fast Shape-Based Network for Category-Level 6D Object Pose Estimation With Decoupled Rotation Mechanism

Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Linlin Shen, Ales Leonardis

In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. [Expand]

3.00
1
1
1
5
Monday Poster Session
[665]

Model-Based 3D Hand Reconstruction via Self-Supervised Learning

Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan

Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity. [Expand]

3.00
1
1
1
5
Wednesday Poster Session
[666]

Learning a Proposal Classifier for Multiple Object Tracking

Peng Dai, Renliang Weng, Wongun Choi, Changshui Zhang, Zhangping He, Wei Ding

The recent trend in multiple object tracking (MOT) is heading towards leveraging deep learning to boost the tracking performance. [Expand]

3.00
3
0
0
0
Monday Poster Session
[667]

Global2Local: Efficient Structure Search for Video Action Segmentation

Shang-Hua Gao, Qi Han, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng

Temporal receptive fields of models play an important role in action segmentation. [Expand]

3.00
3
Friday Poster Session
[668]

ContactOpt: Optimizing Contact To Improve Grasps

Patrick Grady, Chengcheng Tang, Christopher D. Twigg, Minh Vo, Samarth Brahmbhatt, Charles C. Kemp

Physical contact between hands and objects plays a critical role in human grasps. [Expand]

3.00
3
Monday Poster Session
[669]

Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On

Chongjian Ge, Yibing Song, Yuying Ge, Han Yang, Wei Liu, Ping Luo

Image virtual try-on replaces the clothes on a person image with a desired in-shop clothes image. [Expand]

3.00
3
0
9
Friday Poster Session
[670]

Anti-Aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation

Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples. [Expand]

3.00
3
0
0
0
Wednesday Poster Session
[671]

Spatiotemporal Registration for Event-Based Visual Odometry

Daqi Liu, Alvaro Parra, Tat-Jun Chin

A useful application of event sensing is visual odometry, especially in settings that require high-temporal resolution. [Expand]

3.00
0
2
8
Tuesday Poster Session
[672]

Dual-Stream Multiple Instance Learning Network for Whole Slide Image Classification With Self-Supervised Contrastive Learning

Bin Li, Yin Li, Kevin W. Eliceiri

We address the challenging problem of whole slide image (WSI) classification. [Expand]

3.00
3
Thursday Poster Session
[673]

Searching for Fast Model Families on Datacenter Accelerators

Sheng Li, Mingxing Tan, Ruoming Pang, Andrew Li, Liqun Cheng, Quoc V. Le, Norman P. Jouppi

Neural Architecture Search (NAS), together with model scaling, has shown remarkable progress in designing high accuracy and fast convolutional architecture families. [Expand]

3.00
3
0
0
0
Wednesday Poster Session
[674]

Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection

Aixuan Li, Jing Zhang, Yunqiu Lv, Bowen Liu, Tong Zhang, Yuchao Dai

Visual salient object detection (SOD) aims at finding the salient object(s) that attract human attention, while camouflaged object detection (COD), on the contrary, intends to discover the camouflaged object(s) that are hidden in the surroundings. [Expand]

3.00
3
0
0
0
Wednesday Poster Session
[675]

Diffusion Probabilistic Models for 3D Point Cloud Generation

Shitong Luo, Wei Hu

We present a probabilistic model for point cloud generation, which is fundamental for various 3D vision tasks such as shape completion, upsampling, synthesis and data augmentation. [Expand]

3.00
3
0
0
0
Tuesday Poster Session
[676]

Read and Attend: Temporal Localisation in Sign Language Videos

Gul Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman

The objective of this work is to annotate sign instances across a broad vocabulary in continuous sign language. [Expand]

3.00
2
1
1
1
Friday Poster Session
[677]

ACTION-Net: Multipath Excitation for Action Recognition

Zhengwei Wang, Qi She, Aljosa Smolic

Spatial-temporal, channel-wise, and motion patterns are three complementary and crucial types of information for video action recognition. [Expand]

3.00
0
3
6
Thursday Poster Session
[678]

MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

Zhengjue Wang, Hao Zhang, Ziheng Cheng, Bo Chen, Xin Yuan

To capture high-speed videos using a two-dimensional detector, video snapshot compressive imaging (SCI) is a promising system, where the video frames are coded by different masks and then compressed to a snapshot measurement. [Expand]

3.00
3
Monday Poster Session
[679]

Prototype-Supervised Adversarial Network for Targeted Attack of Deep Hashing

Xunguang Wang, Zheng Zhang, Baoyuan Wu, Fumin Shen, Guangming Lu

Due to its powerful capability of representation learning and high-efficiency computation, deep hashing has made significant progress in large-scale image retrieval. [Expand]

3.00
0
2
8
Friday Poster Session
[680]

Understanding the Robustness of Skeleton-Based Action Recognition Under Adversarial Attack

He Wang, Feixiang He, Zhexi Peng, Tianjia Shao, Yong-Liang Yang, Kun Zhou, David Hogg

Action recognition has been heavily employed in many applications such as autonomous vehicles, surveillance, etc., where its robustness is a primary concern. [Expand]

3.00
2
2
0
2
Thursday Poster Session
[681]

Positive-Congruent Training: Towards Regression-Free Model Updates

Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto

Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error. [Expand]

3.00
1
0
2
4
Thursday Poster Session
[682]

Robust Instance Segmentation Through Reasoning About Multi-Object Occlusion

Xiaoding Yuan, Adam Kortylewski, Yihong Sun, Alan Yuille

Analyzing complex scenes with Deep Neural Networks is a challenging task, particularly when images contain multiple objects that partially occlude each other. [Expand]

3.00
0
2
8
Wednesday Poster Session
[683]

Deep Stable Learning for Out-of-Distribution Generalization

Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen

Approaches based on deep neural networks have achieved striking performance when testing data and training data share similar distribution, but can significantly fail otherwise. [Expand]

3.00
0
2
8
Tuesday Poster Session
[684]

DoDNet: Learning To Segment Multi-Organ and Tumors From Multiple Partially Labeled Datasets

Jianpeng Zhang, Yutong Xie, Yong Xia, Chunhua Shen

Due to the intensive cost of labor and expertise in annotating 3D medical images at a voxel level, most benchmark datasets are equipped with the annotations of only one type of organs and/or tumors, resulting in the so-called partially labeling issue. [Expand]

3.00
3
0
0
0
Monday Poster Session
[685]

Improving Sign Language Translation With Monolingual Data by Sign Back-Translation

Hao Zhou, Wengang Zhou, Weizhen Qi, Junfu Pu, Houqiang Li

Despite existing pioneering works on sign language translation (SLT), there is a non-trivial obstacle, i.e., the limited quantity of parallel sign-text data. [Expand]

3.00
1
1
9
Monday Poster Session
[686]

Spatially-Varying Outdoor Lighting Estimation From Intrinsics

Yongjie Zhu, Yinda Zhang, Si Li, Boxin Shi

We present SOLID-Net, a neural network for spatially-varying outdoor lighting estimation from a single outdoor image for any 2D pixel location. [Expand]

3.00
0
3
6
Thursday Poster Session
[687]

ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

Jie An, Siyu Huang, Yibing Song, Dejing Dou, Wei Liu, Jiebo Luo

Universal style transfer retains styles from reference images in content images. [Expand]

2.75
1
1
1
4
Monday Poster Session
[688]

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

Bowen Cheng, Ross Girshick, Piotr Dollar, Alexander C. Berg, Alexander Kirillov

We present Boundary IoU (Intersection-over-Union), a new segmentation evaluation measure focused on boundary quality. [Expand]

2.75
1
1
1
4
Thursday Poster Session
[689]

Equivariant Point Network for 3D Point Cloud Analysis

Haiwei Chen, Shichen Liu, Weikai Chen, Hao Li, Randall Hill

Features that are equivariant to a larger group of symmetries have been shown to be more discriminative and powerful in recent studies. [Expand]

2.75
1
2
6
Thursday Poster Session
[690]

Compatibility-Aware Heterogeneous Visual Search

Rahul Duggal, Hao Zhou, Shuo Yang, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We tackle the problem of visual search under resource constraints. [Expand]

2.75
1
2
6
Wednesday Poster Session
[691]

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

Joakim Bruslund Haurum, Thomas B. Moeslund

Perhaps surprisingly, sewerage infrastructure is one of the most costly infrastructures in modern society. [Expand]

2.75
1
0
2
3
Thursday Poster Session
[692]

Reciprocal Landmark Detection and Tracking With Extremely Few Annotations

Jianzhe Lin, Ghazal Sahebzamani, Christina Luong, Fatemeh Taheri Dezaki, Mohammad Jafari, Purang Abolmaesumi, Teresa Tsang

Localization of anatomical landmarks to perform two-dimensional measurements in echocardiography is part of routine clinical workflow in cardiac disease diagnosis. [Expand]

2.75
0
2
7
Thursday Poster Session
[693]

Noise-Resistant Deep Metric Learning With Ranking-Based Instance Selection

Chang Liu, Han Yu, Boyang Li, Zhiqi Shen, Zhanning Gao, Peiran Ren, Xuansong Xie, Lizhen Cui, Chunyan Miao

The existence of noisy labels in real-world data negatively impacts the performance of deep learning models. [Expand]

2.75
2
1
7
Tuesday Poster Session
[694]

Zero-Shot Adversarial Quantization

Yuang Liu, Wei Zhang, Jun Wang

Model quantization is a promising approach to compress deep neural networks and accelerate inference, making it possible to be deployed on mobile and edge devices. [Expand]

2.75
0
2
7
Monday Poster Session
[695]

Exploring Adversarial Fake Images on Face Manifold

Dongze Li, Wei Wang, Hongxing Fan, Jing Dong

Images synthesized by powerful generative adversarial network (GAN) based methods have drawn moral and privacy concerns. [Expand]

2.75
0
3
5
Tuesday Poster Session
[696]

StEP: Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis

Moustafa Meshry, Yixuan Ren, Larry S. Davis, Abhinav Shrivastava

We propose a novel approach for multi-modal Image-to-image (I2I) translation. [Expand]

2.75
3
2
4
Tuesday Poster Session
[697]

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences

Feitong Tan, Danhang Tang, Mingsong Dou, Kaiwen Guo, Rohit Pandey, Cem Keskin, Ruofei Du, Deqing Sun, Sofien Bouaziz, Sean Fanello, Ping Tan, Yinda Zhang

In this paper, we address the problem of building pixel-wise dense correspondences between human images under arbitrary camera viewpoints and body poses. [Expand]

2.75
0
2
7
Monday Poster Session
[698]

Depth-Conditioned Dynamic Message Propagation for Monocular 3D Object Detection

Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, Li Zhang

The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection. [Expand]

2.75
0
3
5
Monday Poster Session
[699]

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

Junbin Xiao, Xindi Shang, Angela Yao, Tat-Seng Chua

We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark to advance video understanding from describing to explaining the temporal actions. [Expand]

2.75
1
3
4
Wednesday Poster Session
[700]

PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds

Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi

We introduce Position Adaptive Convolution (PAConv), a generic convolution operation for 3D point cloud processing. [Expand]

2.75
1
1
1
4
Tuesday Poster Session
[701]

Patch-VQ: 'Patching Up' the Video Quality Problem

Zhenqiang Ying, Maniratnam Mandal, Deepti Ghadiyaram, Alan Bovik

No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. [Expand]

2.75
2
0
0
3
Thursday Poster Session
[702]

Are Labels Always Necessary for Classifier Accuracy Evaluation?

Weijian Deng, Liang Zheng

To calculate the model accuracy on a computer vision task, e.g., object recognition, we usually require a test set composed of test samples and their ground truth labels. [Expand]

2.50
2
2
0
0
Thursday Poster Session
[703]

XProtoNet: Diagnosis in Chest Radiography With Global and Local Explanations

Eunji Kim, Siwon Kim, Minji Seo, Sungroh Yoon

Automated diagnosis using deep neural networks in chest radiography can help radiologists detect life-threatening diseases. [Expand]

2.50
1
0
1
4
Friday Poster Session
[704]

MongeNet: Efficient Sampler for Geometric Deep Learning

Leo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Olivier Salvado

Recent advances in geometric deep-learning introduce complex computational challenges for evaluating the distance between meshes. [Expand]

2.50
2
2
4
Friday Poster Session
[705]

One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation

Zhengzhe Liu, Xiaojuan Qi, Chi-Wing Fu

Point cloud semantic segmentation often requires large-scale annotated training data, but clearly, point-wise labels are too tedious to prepare. [Expand]

2.50
1
0
9
Monday Poster Session
[706]

PointGuard: Provably Robust 3D Point Cloud Classification

Hongbin Liu, Jinyuan Jia, Neil Zhenqiang Gong

3D point cloud classification has many safety-critical applications such as autonomous driving and robotic grasping. [Expand]

2.50
2
2
0
0
Tuesday Poster Session
[707]

UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning

Kunming Luo, Chuan Wang, Shuaicheng Liu, Haoqiang Fan, Jue Wang, Jian Sun

We present an unsupervised learning approach for optical flow estimation by improving the upsampling and learning of pyramid network. [Expand]

2.50
2
0
0
2
Monday Poster Session
[708]

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

Trisha Mittal, Puneet Mathur, Aniket Bera, Dinesh Manocha

We present Affect2MM, a learning method for time-series emotion prediction for multimedia content. [Expand]

2.50
1
0
1
4
Tuesday Poster Session
[709]

Dive Into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition

Jiahui She, Yibo Hu, Hailin Shi, Jun Wang, Qiu Shen, Tao Mei

Due to the subjective annotation and the inherent inter-class similarity of facial expressions, one of the key challenges in Facial Expression Recognition (FER) is the annotation ambiguity. [Expand]

2.50
1
1
7
Tuesday Poster Session
[710]

SceneGen: Learning To Generate Realistic Traffic Scenes

Shuhan Tan, Kelvin Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

We consider the problem of generating realistic traffic scenes automatically. [Expand]

2.50
1
1
1
3
Monday Poster Session
[711]

Learning Better Visual Dialog Agents With Pretrained Visual-Linguistic Representation

Tao Tu, Qing Ping, Govindarajan Thattai, Gokhan Tur, Prem Natarajan

GuessWhat?! is a visual dialog guessing game which incorporates a Questioner agent that generates a sequence of questions, while an Oracle agent answers the respective questions about a target object in an image. [Expand]

2.50
2
1
6
Tuesday Poster Session
[712]

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu

Weakly supervised phrase grounding aims at learning region-phrase correspondences using only image-sentence pairs. [Expand]

2.50
1
1
1
3
Thursday Poster Session
[713]

When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks

Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo

Human pose estimation is a fundamental yet challenging task in computer vision, which aims at localizing human anatomical keypoints. [Expand]

2.50
1
1
7
Thursday Poster Session
[714]

Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision

Zhenzhen Weng, Mehmet Giray Ogut, Shai Limonchik, Serena Yeung

Instance segmentation is an active topic in computer vision that is usually solved by using supervised learning approaches over very large datasets composed of object level masks. [Expand]

2.50
2
2
4
Monday Poster Session
[715]

Capturing Omni-Range Context for Omnidirectional Segmentation

Kailun Yang, Jiaming Zhang, Simon Reiss, Xinxin Hu, Rainer Stiefelhagen

Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving. [Expand]

2.50
1
0
1
4
Monday Poster Session
[716]

TPCN: Temporal Point Cloud Networks for Motion Forecasting

Maosheng Ye, Tongyi Cao, Qifeng Chen

We propose the Temporal Point Cloud Networks (TPCN), a novel and flexible framework with joint spatial and temporal learning for trajectory prediction. [Expand]

2.50
2
2
0
0
Wednesday Poster Session
[717]

Learning To Recommend Frame for Interactive Video Object Segmentation in the Wild

Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao

This paper proposes a framework for the interactive video object segmentation (VOS) in the wild where users can choose some frames for annotations iteratively. [Expand]

2.50
1
1
7
Thursday Poster Session
[718]

Few-Shot Incremental Learning With Continually Evolved Classifiers

Chi Zhang, Nan Song, Guosheng Lin, Yun Zheng, Pan Pan, Yinghui Xu

Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points, without forgetting knowledge of old classes. [Expand]

2.50
2
0
0
2
Thursday Poster Session
[719]

Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis

Yaxuan Zhu, Ruiqi Gao, Siyuan Huang, Song-Chun Zhu, Ying Nian Wu

How to efficiently represent camera pose is an essential problem in 3D computer vision, especially in tasks like camera pose regression and novel view synthesis. [Expand]

2.50
0
2
6
Wednesday Poster Session
[720]

LQF: Linear Quadratic Fine-Tuning

Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization. [Expand]

2.25
1
0
2
1
Friday Poster Session
[721]

More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang, Tao Xiang, Yi-Zhe Song

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is the data scarcity -- model performances are largely bottlenecked by the lack of sketch-photo pairs. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[722]

InverseForm: A Loss Function for Structured Boundary-Aware Segmentation

Shubhankar Borse, Ying Wang, Yizhe Zhang, Fatih Porikli

We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network, which efficiently learns the degree of parametric transformations between estimated and target boundaries. [Expand]

2.25
0
1
7
Tuesday Poster Session
[723]

Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds

Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

3D object detection in point clouds is a challenging vision task that benefits various applications for understanding the 3D visual world. [Expand]

2.25
0
0
9
Wednesday Poster Session
[724]

Deep Analysis of CNN-Based Spatio-Temporal Representations for Action Recognition

Chun-Fu Richard Chen, Rameswar Panda, Kandan Ramakrishnan, Rogerio Feris, John Cohn, Aude Oliva, Quanfu Fan

In recent years, a number of approaches based on 2D or 3D convolutional neural networks (CNN) have emerged for video action recognition, achieving state-of-the-art results on several large-scale benchmark datasets. [Expand]

2.25
1
1
6
Tuesday Poster Session
[725]

Reformulating HOI Detection As Adaptive Set Prediction

Mingfei Chen, Yue Liao, Si Liu, Zhiyuan Chen, Fei Wang, Chen Qian

Determining which image regions to concentrate on is critical for Human-Object Interaction (HOI) detection. [Expand]

2.25
2
1
0
0
Wednesday Poster Session
[726]

Wasserstein Contrastive Representation Distillation

Liqun Chen, Dong Wang, Zhe Gan, Jingjing Liu, Ricardo Henao, Lawrence Carin

The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former. [Expand]

2.25
2
0
0
1
Friday Poster Session
[727]

Generalizable Person Re-Identification With Relevance-Aware Mixture of Experts

Yongxing Dai, Xiaotong Li, Jun Liu, Zekun Tong, Ling-Yu Duan

Domain generalizable (DG) person re-identification (ReID) is a challenging problem because we cannot access any unseen target domain data during training. [Expand]

2.25
1
0
1
3
Friday Poster Session
[728]

General Instance Distillation for Object Detection

Xing Dai, Zeren Jiang, Zhao Wu, Yiping Bao, Zhicheng Wang, Si Liu, Erjin Zhou

In recent years, knowledge distillation has been proved to be an effective solution for model compression. [Expand]

2.25
2
1
0
0
Wednesday Poster Session
[729]

Deformed Implicit Field: Modeling 3D Shapes With Learned Dense Correspondence

Yu Deng, Jiaolong Yang, Xin Tong

We propose a novel Deformed Implicit Field (DIF) representation for modeling 3D shapes of a category and generating dense correspondences among shapes. [Expand]

2.25
2
0
0
1
Wednesday Poster Session
[730]

AlphaMatch: Improving Consistency for Semi-Supervised Learning With Alpha-Divergence

Chengyue Gong, Dilin Wang, Qiang Liu

Semi-supervised learning (SSL) is a key approach toward more data-efficient machine learning by jointly leveraging both labeled and unlabeled data. [Expand]

2.25
0
3
3
Thursday Poster Session
[731]

ReDet: A Rotation-Equivariant Detector for Aerial Object Detection

Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia

Recently, object detection in aerial images has gained much attention in computer vision. [Expand]

2.25
2
1
0
0
Monday Poster Session
[732]

Reinforced Attention for Few-Shot Learning and Beyond

Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson

Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images. [Expand]

2.25
2
1
0
0
Monday Poster Session
[733]

A Multiplexed Network for End-to-End, Multilingual OCR

Jing Huang, Guan Pang, Rama Kovvuri, Mandy Toh, Kevin J Liang, Praveen Krishnan, Xi Yin, Tal Hassner

Recent advances in OCR have shown that an end-to-end (E2E) training pipeline that includes both detection and recognition leads to the best results. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[734]

FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation

Yair Kittenplon, Yonina C. Eldar, Dan Raviv

Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[735]

Refer-It-in-RGBD: A Bottom-Up Approach for 3D Visual Grounding in RGBD Images

Haolin Liu, Anran Lin, Xiaoguang Han, Lei Yang, Yizhou Yu, Shuguang Cui

Grounding referring expressions in RGBD images has been an emerging field. [Expand]

2.25
1
1
6
Tuesday Poster Session
[736]

Semi-Supervised 3D Hand-Object Poses Estimation With Interactions in Time

Shaowei Liu, Hanwen Jiang, Jiarui Xu, Sifei Liu, Xiaolong Wang

Estimating 3D hand and object pose from a single image is an extremely challenging problem: hands and objects are often self-occluded during interactions, and the 3D annotations are scarce as even humans cannot directly label the ground-truths from a single image perfectly. [Expand]

2.25
2
1
0
0
Thursday Poster Session
[737]

Unsupervised Part Segmentation Through Disentangling Appearance and Shape

Shilong Liu, Lei Zhang, Xiao Yang, Hang Su, Jun Zhu

We study the problem of unsupervised discovery and segmentation of object parts, which, as an intermediate local representation, are capable of finding intrinsic object structure and providing more explainable recognition results. [Expand]

2.25
0
2
5
Wednesday Poster Session
[738]

PointNetLK Revisited

Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey

We address the generalization ability of recent learning-based point cloud registration methods. [Expand]

2.25
1
4
0
Thursday Poster Session
[739]

QAIR: Practical Query-Efficient Black-Box Attacks for Image Retrieval

Xiaodan Li, Jinfeng Li, Yuefeng Chen, Shaokai Ye, Yuan He, Shuhui Wang, Hang Su, Hui Xue

We study the query-based attack against image retrieval to evaluate its robustness against adversarial examples under the black-box setting, where the adversary only has query access to the top-k ranked unlabeled images from the database. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[740]

Quasi-Dense Similarity Learning for Multiple Object Tracking

Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu

Similarity learning has been recognized as a crucial step for object tracking. [Expand]

2.25
1
3
0
2
Monday Poster Session
[741]

Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration

Liyuan Pan, Shah Chowdhury, Richard Hartley, Miaomiao Liu, Hongguang Zhang, Hongdong Li

The dual-pixel (DP) hardware works by splitting each pixel in half and creating an image pair in a single snapshot. [Expand]

2.25
1
0
0
5
Tuesday Poster Session
[742]

S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-Bit Neural Networks via Guided Distribution Calibration

Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides

Previous studies predominantly target self-supervised learning on real-valued networks and have achieved many promising results. [Expand]

2.25
2
1
0
0
Monday Poster Session
[743]

NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering Using RGB Cameras

Xin Suo, Yuheng Jiang, Pei Lin, Yingliang Zhang, Minye Wu, Kaiwen Guo, Lan Xu

4D reconstruction and rendering of human activities is critical for immersive VR/AR experience. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[744]

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Praveen Tirupattur, Kevin Duarte, Yogesh S Rawat, Mubarak Shah

Real world videos contain many complex actions with inherent relationships between action classes. [Expand]

2.25
4
0
5
Monday Poster Session
[745]

ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-Supervised Continual Learning

Liyuan Wang, Kuo Yang, Chongxuan Li, Lanqing Hong, Zhenguo Li, Jun Zhu

Continual learning usually assumes the incoming data are fully labeled, which might not be applicable in real applications. [Expand]

2.25
0
2
5
Tuesday Poster Session
[746]

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning

Jinpeng Wang, Yuting Gao, Ke Li, Yiqi Lin, Andy J. Ma, Hao Cheng, Pai Peng, Feiyue Huang, Rongrong Ji, Xing Sun

Self-supervised learning has shown great potential in improving the video representation ability of deep neural networks by getting supervision from the data itself. [Expand]

2.25
2
1
0
0
Thursday Poster Session
[747]

T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval

Xiaohan Wang, Linchao Zhu, Yi Yang

Text-video retrieval is a challenging task that aims to search relevant video contents based on natural language descriptions. [Expand]

2.25
2
1
0
0
Tuesday Poster Session
[748]

Few-Shot Classification With Feature Map Reconstruction Networks

Davis Wertheimer, Luming Tang, Bharath Hariharan

In this paper we reformulate few-shot classification as a reconstruction problem in latent space. [Expand]

2.25
0
2
5
Wednesday Poster Session
[749]

Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes

Zhihang Zhong, Yinqiang Zheng, Imari Sato

Joint rolling shutter correction and deblurring (RSCD) techniques are critical for the prevalent CMOS cameras. [Expand]

2.25
0
1
7
Wednesday Poster Session
[750]

Sequence-to-Sequence Contrastive Learning for Text Recognition

Aviad Aberdam, Ron Litman, Shahar Tsiper, Oron Anschel, Ron Slossberg, Shai Mazor, R. Manmatha, Pietro Perona

We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. [Expand]

2.00
2
0
0
0
Thursday Poster Session
[751]

Unsupervised Multi-Source Domain Adaptation Without Access to Source Data

Sk Miraj Ahmed, Dripta S. Raychaudhuri, Sujoy Paul, Samet Oymak, Amit K. Roy-Chowdhury

Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for an unlabeled dataset by transferring knowledge from a labeled source data, which has been trained on similar tasks. [Expand]

2.00
2
Wednesday Poster Session
[752]

Object Classification From Randomized EEG Trials

Hamad Ahmed, Ronnie B. Wilbur, Hari M. Bharadwaj, Jeffrey Mark Siskind

New results suggest strong limits to the feasibility of object classification from human brain activity evoked by image stimuli, as measured through EEG. [Expand]

2.00
1
1
1
1
Tuesday Poster Session
[753]

Polka Lines: Learning Structured Illumination and Reconstruction for Active Stereo

Seung-Hwan Baek, Felix Heide

Active stereo cameras that recover depth from structured light captures have become a cornerstone sensor modality for 3D scene reconstruction and understanding tasks across application domains. [Expand]

2.00
2
Tuesday Poster Session
[754]

Architectural Adversarial Robustness: The Case for Deep Pursuit

George Cazenavette, Calvin Murdock, Simon Lucey

Despite their unmatched performance, deep neural networks remain susceptible to targeted attacks by nearly imperceptible levels of adversarial noise. [Expand]

2.00
2
0
0
0
Wednesday Poster Session
[755]

Learning Feature Aggregation for Deep 3D Morphable Models

Zhixiang Chen, Tae-Kyun Kim

3D morphable models are widely used for the shape representation of an object class in computer vision and graphics applications. [Expand]

2.00
0
1
6
Thursday Poster Session
[756]

I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

Chaoqi Chen, Zebiao Zheng, Yue Huang, Xinghao Ding, Yizhou Yu

Recent works on two-stage cross-domain detection have widely explored the local feature patterns to achieve more accurate adaptation results. [Expand]

2.00
2
Thursday Poster Session
[757]

Semantic-Aware Knowledge Distillation for Few-Shot Class-Incremental Learning

Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi

Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner. [Expand]

2.00
2
0
0
0
Monday Poster Session
[758]

LiBRe: A Practical Bayesian Approach to Adversarial Detection

Zhijie Deng, Xiao Yang, Shizhen Xu, Hang Su, Jun Zhu

Despite their appealing flexibility, deep neural networks (DNNs) are vulnerable against adversarial examples. [Expand]

2.00
1
1
5
Monday Poster Session
[759]

Multi-Institutional Collaborations for Improving Deep Learning-Based Magnetic Resonance Image Reconstruction Using Federated Learning

Pengfei Guo, Puyang Wang, Jinyuan Zhou, Shanshan Jiang, Vishal M. Patel

Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. [Expand]

2.00
2
Monday Poster Session
[760]

MetaCorrection: Domain-Aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation

Xiaoqing Guo, Chen Yang, Baopu Li, Yixuan Yuan

Unsupervised domain adaptation (UDA) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain. [Expand]

2.00
0
1
6
Tuesday Poster Session
[761]

Lips Don't Lie: A Generalisable and Robust Approach To Face Forgery Detection

Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Although current deep learning-based face forgery detectors achieve impressive performance in constrained scenarios, they are vulnerable to samples created by unseen manipulation methods. [Expand]

2.00
1
0
0
4
Tuesday Poster Session
[762]

Neural Cellular Automata Manifold

Alejandro Hernandez, Armand Vilalta, Francesc Moreno-Noguer

Very recently, the Neural Cellular Automata (NCA) has been proposed to simulate the morphogenesis process with deep networks. [Expand]

2.00
1
4
0
0
Wednesday Poster Session
[763]

Visualizing Adapted Knowledge in Domain Transfer

Yunzhong Hou, Liang Zheng

A source model trained on source data and a target model learned through unsupervised domain adaptation (UDA) usually encode different knowledge. [Expand]

2.00
1
1
1
1
Thursday Poster Session
[764]

Multi-Target Domain Adaptation With Collaborative Consistency Learning

Takashi Isobe, Xu Jia, Shuaijun Chen, Jianzhong He, Yongjie Shi, Jianzhuang Liu, Huchuan Lu, Shengjin Wang

Recently unsupervised domain adaptation for the semantic segmentation task has become more and more popular due to the high-cost of pixel-level annotation on real-world images. [Expand]

2.00
2
Wednesday Poster Session
[765]

In the Light of Feature Distributions: Moment Matching for Neural Style Transfer

Nikolai Kalischek, Jan D. Wegner, Konrad Schindler

Style transfer aims to render the content of a given image in the graphical/artistic style of another image. [Expand]

2.00
1
2
3
Wednesday Poster Session
[766]

UniT: Unified Knowledge Transfer for Any-Shot Object Detection and Segmentation

Siddhesh Khandelwal, Raghav Goyal, Leonid Sigal

Methods for object detection and segmentation rely on large scale instance-level annotations for training, which are difficult and time-consuming to collect. [Expand]

2.00
1
1
1
1
Tuesday Poster Session
[767]

Robust Reflection Removal With Reflection-Free Flash-Only Cues

Chenyang Lei, Qifeng Chen

We propose a simple yet effective reflection-free cue for robust reflection removal from a pair of flash and ambient (no-flash) images. [Expand]

2.00
2
Thursday Poster Session
[768]

SG-Net: Spatial Granularity Network for One-Stage Video Instance Segmentation

Dongfang Liu, Yiming Cui, Wenbo Tan, Yingjie Chen

Video instance segmentation (VIS) is a new and critical task in computer vision. [Expand]

2.00
1
0
0
4
Wednesday Poster Session
[769]

Watching You: Global-Guided Reciprocal Learning for Video-Based Person Re-Identification

Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang

Video-based person re-identification (Re-ID) aims to automatically retrieve video sequences of the same person under non-overlapping cameras. [Expand]

2.00
2
Thursday Poster Session
[770]

Beyond Max-Margin: Class Margin Equilibrium for Few-Shot Object Detection

Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye

Few-shot object detection has made encouraging progress by reconstructing novel class objects using the feature representation learned upon a set of base classes. [Expand]

2.00
2
Wednesday Poster Session
[771]

D2IM-Net: Learning Detail Disentangled Implicit Fields From Single Images

Manyi Li, Hao Zhang

We present the first single-view 3D reconstruction network aimed at recovering geometric details from an input image which encompass both topological shape structures and surface features. [Expand]

2.00
2
Wednesday Poster Session
[772]

Dynamic Slimmable Network

Changlin Li, Guangrun Wang, Bing Wang, Xiaodan Liang, Zhihui Li, Xiaojun Chang

Current dynamic networks and dynamic pruning methods have shown their promising capability in reducing theoretical computation complexity. [Expand]

2.00
2
Wednesday Poster Session
[773]

PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training

Luke Melas-Kyriazi, Arjun K. Manrai

Unsupervised domain adaptation is a promising technique for semantic segmentation and other computer vision tasks for which large-scale data annotation is costly and time-consuming. [Expand]

2.00
0
1
6
Thursday Poster Session
[774]

Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks

Roi Pony, Itay Naeh, Shie Mannor

Deep neural networks for video classification, just like image classification networks, may be subjected to adversarial manipulation. [Expand]

2.00
1
2
0
2
Monday Poster Session
[775]

Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect

Athena Sayles, Ashish Hooda, Mohit Gupta, Rahul Chatterjee, Earlence Fernandes

Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts -- a sticker on a Stop sign, colorful borders around eyeglasses or a 3D printed object with a colorful texture. [Expand]

2.00
1
0
1
2
Thursday Poster Session
[776]

SSN: Soft Shadow Network for Image Compositing

Yichen Sheng, Jianming Zhang, Bedrich Benes

We introduce an interactive Soft Shadow Network (SSN) to generate controllable soft shadows for image compositing. [Expand]

2.00
1
1
0
3
Tuesday Poster Session
[777]

ZeroScatter: Domain Transfer for Long Distance Imaging and Vision Through Scattering Media

Zheng Shi, Ethan Tseng, Mario Bijelic, Werner Ritter, Felix Heide

Adverse weather conditions, including snow, rain, and fog, pose a major challenge for both human and computer vision. [Expand]

2.00
0
1
6
Tuesday Poster Session
[778]

Open Domain Generalization with Domain-Augmented Meta-Learning

Yang Shu, Zhangjie Cao, Chenyu Wang, Jianmin Wang, Mingsheng Long

Leveraging datasets available to learn a model with high generalization ability to unseen domains is important for computer vision, especially when the unseen domain's annotated data are unavailable. [Expand]

2.00
2
Wednesday Poster Session
[779]

Equalization Loss v2: A New Gradient Balance Approach for Long-Tailed Object Detection

Jingru Tan, Xin Lu, Gang Zhang, Changqing Yin, Quanquan Li

Recently proposed decoupled training methods emerge as a dominant paradigm for long-tailed object detection. [Expand]

2.00
0
1
6
Monday Poster Session
[780]

RAFT-3D: Scene Flow Using Rigid-Motion Embeddings

Zachary Teed, Jia Deng

We address the problem of scene flow: given a pair of stereo or RGB-D video frames, estimate pixelwise 3D motion. [Expand]

2.00
2
Wednesday Poster Session
[781]

Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach

Giang Truong, Huu Le, David Suter, Erchuan Zhang, Syed Zulqarnain Gilani

Robust model fitting is a core algorithm in a large number of computer vision applications. [Expand]

2.00
1
2
3
Wednesday Poster Session
[782]

Incremental Learning via Rate Reduction

Ziyang Wu, Christina Baek, Chong You, Yi Ma

Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes. [Expand]

2.00
2
0
0
0
Monday Poster Session
[783]

Efficient Regional Memory Network for Video Object Segmentation

Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Wenxiu Sun

Recently, several Space-Time Memory based networks have shown that the object cues (e.g. [Expand]

2.00
2
Monday Poster Session
[784]

Learnable Companding Quantization for Accurate Low-Bit Neural Networks

Kohei Yamamoto

Quantizing deep neural networks is an effective method for reducing memory consumption and improving inference speed, and is thus useful for implementation in resource-constrained devices. [Expand]

2.00
1
2
3
Tuesday Poster Session
[785]

Interactive Self-Training With Mean Teachers for Semi-Supervised Object Detection

Qize Yang, Xihan Wei, Biao Wang, Xian-Sheng Hua, Lei Zhang

The goal of semi-supervised object detection is to learn a detection model using only a few labeled data and large amounts of unlabeled data, thereby reducing the cost of data labeling. [Expand]

2.00
2
Tuesday Poster Session
[786]

Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation

Yuntong Ye, Yi Chang, Hanyu Zhou, Luxin Yan

Existing deep learning-based image deraining methods have achieved promising performance for synthetic rainy images, but typically rely on pairs of sharp images and simulated rainy counterparts. [Expand]

2.00
0
1
6
Monday Poster Session
[787]

Mutual Graph Learning for Camouflaged Object Detection

Qiang Zhai, Xin Li, Fan Yang, Chenglizhao Chen, Hong Cheng, Deng-Ping Fan

Automatically detecting/segmenting object(s) that blend in with their surroundings is difficult for current models. [Expand]

2.00
2
Thursday Poster Session
[788]

Group-aware Label Transfer for Domain Adaptive Person Re-identification

Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha

Unsupervised Domain Adaptive (UDA) person re-identification (ReID) aims at adapting the model trained on a labeled source-domain dataset to a target-domain dataset without any further annotations. [Expand]

2.00
2
Tuesday Poster Session
[789]

Partition-Guided GANs

Mohammadreza Armandpour, Ali Sadeghian, Chunyuan Li, Mingyuan Zhou

Despite the success of Generative Adversarial Networks (GANs), their training suffers from several well-known problems, including mode collapse and difficulties learning a disconnected set of manifolds. [Expand]

1.75
1
2
2
Tuesday Poster Session
[790]

ReAgent: Point Cloud Registration Using Imitation and Reinforcement Learning

Dominik Bauer, Timothy Patten, Markus Vincze

Point cloud registration is a common step in many 3D computer vision tasks such as object pose estimation, where a 3D model is aligned to an observation. [Expand]

1.75
1
2
2
Thursday Poster Session
[791]

FBI-Denoiser: Fast Blind Image Denoiser for Poisson-Gaussian Noise

Jaeseok Byun, Sungmin Cha, Taesup Moon

We consider the challenging blind denoising problem for Poisson-Gaussian noise, in which no additional information about clean images or noise level parameters is available. [Expand]

1.75
0
0
7
Tuesday Poster Session
[792]

Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles

Long Chen, Zhihong Jiang, Jun Xiao, Wei Liu

Controllable Image Captioning (CIC) -- generating image descriptions following designated control signals -- has received unprecedented attention over the last few years. [Expand]

1.75
1
1
1
0
Friday Poster Session
[793]

3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding

Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia

The ability to understand the ways to interact with objects from visual cues, a.k.a. [Expand]

1.75
1
1
4
Monday Poster Session
[794]

Unbiased Mean Teacher for Cross-Domain Object Detection

Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan

Cross-domain object detection is challenging because object detection models are often vulnerable to data variance, especially to the considerable domain shift between two distinctive domains. [Expand]

1.75
1
1
1
0
Tuesday Poster Session
[795]

Adaptive Methods for Real-World Domain Generalization

Abhimanyu Dubey, Vignesh Ramanathan, Alex Pentland, Dhruv Mahajan

Invariant approaches have been remarkably successful in tackling the problem of domain generalization, where the objective is to perform inference on data distributions different from those used in training. [Expand]

1.75
1
1
0
2
Thursday Poster Session
[796]

Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings

Mihai Dusmanu, Johannes L. Schonberger, Sudipta N. Sinha, Marc Pollefeys

Many computer vision systems require users to upload image features to the cloud for processing and storage. [Expand]

1.75
1
2
2
Thursday Poster Session
[797]

Single-Shot Freestyle Dance Reenactment

Oran Gafni, Oron Ashual, Lior Wolf

The task of motion transfer between a source dancer and a target person is a special case of the pose transfer problem, in which the target person changes their pose in accordance with the motions of the dancer. [Expand]

1.75
1
0
1
1
Monday Poster Session
[798]

Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion

Vitor Guizilini, Rares Ambrus, Wolfram Burgard, Adrien Gaidon

Estimating scene geometry from cost-effective sensors is key for robots. [Expand]

1.75
0
0
7
Wednesday Poster Session
[799]

Interpreting Super-Resolution Networks With Local Attribution Maps

Jinjin Gu, Chao Dong

Image super-resolution (SR) techniques have been developing rapidly, benefiting from the invention of deep networks and its successive breakthroughs. [Expand]

1.75
1
0
0
3
Wednesday Poster Session
[800]

Learning Optical Flow From a Few Matches

Shihao Jiang, Yao Lu, Hongdong Li, Richard Hartley

State-of-the-art neural network models for optical flow estimation require a dense correlation volume at high resolutions for representing per-pixel displacement. [Expand]

1.75
0
1
5
Friday Poster Session
[801]

Multi-Shot Temporal Event Localization: A Benchmark

Xiaolong Liu, Yao Hu, Song Bai, Fei Ding, Xiang Bai, Philip H. S. Torr

Current developments in temporal event or action localization usually target actions captured by a single camera. [Expand]

1.75
0
0
7
Thursday Poster Session
[802]

Retinex-Inspired Unrolling With Cooperative Prior Architecture Search for Low-Light Image Enhancement

Risheng Liu, Long Ma, Jiaao Zhang, Xin Fan, Zhongxuan Luo

Low-light image enhancement plays very important roles in low-level vision areas. [Expand]

1.75
1
0
0
3
Wednesday Poster Session
[803]

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

Jichang Li, Guanbin Li, Yemin Shi, Yizhou Yu

In semi-supervised domain adaptation, a few labeled samples per class in the target domain guide features of the remaining target samples to aggregate around them. [Expand]

1.75
1
2
2
Monday Poster Session
[804]

Temporal Action Segmentation From Timestamp Supervision

Zhe Li, Yazan Abu Farha, Jurgen Gall

Temporal action segmentation approaches have been very successful recently. [Expand]

1.75
0
1
5
Wednesday Poster Session
[805]

Variational Relational Point Completion Network

Liang Pan, Xinyi Chen, Zhongang Cai, Junzhe Zhang, Haiyu Zhao, Shuai Yi, Ziwei Liu

Real-scanned point clouds are often incomplete due to viewpoint, occlusion, and noise. [Expand]

1.75
1
0
0
3
Wednesday Poster Session
[806]

The Affective Growth of Computer Vision

Norman Makoto Su, David J. Crandall

The success of deep learning has led to intense growth and interest in computer vision, along with concerns about its potential impact on society. [Expand]

1.75
1
1
4
Wednesday Poster Session
[807]

Look Closer To Segment Better: Boundary Patch Refinement for Instance Segmentation

Chufeng Tang, Hang Chen, Xiao Li, Jianmin Li, Zhaoxiang Zhang, Xiaolin Hu

Tremendous efforts have been made on instance segmentation but the mask quality is still not satisfactory. [Expand]

1.75
1
0
6
Thursday Poster Session
[808]

A Fourier-Based Framework for Domain Generalization

Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang, Qi Tian

Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data. [Expand]

1.75
1
2
2
Thursday Poster Session
[809]

SOON: Scenario Oriented Object Navigation With Graph-Based Exploration

Fengda Zhu, Xiwen Liang, Yi Zhu, Qizhi Yu, Xiaojun Chang, Xiaodan Liang

The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots. [Expand]

1.75
0
1
5
Thursday Poster Session
[810]

Deeply Shape-Guided Cascade for Instance Segmentation

Hao Ding, Siyuan Qiao, Alan Yuille, Wei Shen

The key to a successful cascade architecture for precise instance segmentation is to fully leverage the relationship between bounding box detection and mask segmentation across multiple stages. [Expand]

1.50
0
2
2
Wednesday Poster Session
[811]

Encoder Fusion Network With Co-Attention Embedding for Referring Image Segmentation

Guang Feng, Zhiwei Hu, Lihe Zhang, Huchuan Lu

Recently, referring image segmentation has aroused widespread interest. [Expand]

1.50
0
0
6
Thursday Poster Session
[812]

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

Visual events are a composition of temporal actions involving actors spatially interacting with objects. [Expand]

1.50
1
0
5
Wednesday Poster Session
[813]

Distilling Object Detectors via Decoupled Features

Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu

Knowledge distillation is a widely used paradigm for inheriting information from a complicated teacher network to a compact student network and maintaining the strong performance. [Expand]

1.50
1
0
1
0
Monday Poster Session
[814]

Learning by Aligning Videos in Time

Sanjay Haresh, Sateesh Kumar, Huseyin Coskun, Shahram N. Syed, Andrey Konin, Zeeshan Zia, Quoc-Huy Tran

We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information. [Expand]

1.50
1
1
0
1
Tuesday Poster Session
[815]

DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

Yufan He, Dong Yang, Holger Roth, Can Zhao, Daguang Xu

Recently, neural architecture search (NAS) has been applied to automatically search high-performance networks for medical image segmentation. [Expand]

1.50
1
1
0
1
Tuesday Poster Session
[816]

Deep Dual Consecutive Network for Human Pose Estimation

Zhenguang Liu, Haoming Chen, Runyang Feng, Shuang Wu, Shouling Ji, Bailin Yang, Xun Wang

Multi-frame human pose estimation in complicated situations is challenging. [Expand]

1.50
1
1
3
Monday Poster Session
[817]

Invertible Denoising Network: A Light Solution for Real Noise Removal

Yang Liu, Zhenyue Qin, Saeed Anwar, Pan Ji, Dongwoo Kim, Sabrina Caldwell, Tom Gedeon

Invertible networks have various benefits for image denoising since they are lightweight, information-lossless, and memory-saving during back-propagation. [Expand]

1.50
2
1
2
Thursday Poster Session
[818]

The Blessings of Unlabeled Background in Untrimmed Videos

Yuan Liu, Jingyuan Chen, Zhenfang Chen, Bing Deng, Jianqiang Huang, Hanwang Zhang

Weakly-supervised Temporal Action Localization (WTAL) aims to detect the action segments with only video-level action labels in training. [Expand]

1.50
1
0
0
2
Tuesday Poster Session
[819]

SurFree: A Fast Surrogate-Free Black-Box Attack

Thibault Maho, Teddy Furon, Erwan Le Merrer

Machine learning classifiers are critically prone to evasion attacks. [Expand]

1.50
1
2
0
0
Wednesday Poster Session
[820]

Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization

Haoyu Ma, Xiangru Lin, Zifeng Wu, Yizhou Yu

Unsupervised domain adaptation (UDA) in semantic segmentation is a fundamental yet promising task relieving the need for laborious annotation works. [Expand]

1.50
1
0
0
2
Tuesday Poster Session
[821]

Convolutional Hough Matching Networks

Juhong Min, Minsu Cho

Despite advances in feature representation, leveraging geometric relations is crucial for establishing reliable visual correspondences under large variations of images. [Expand]

1.50
1
0
0
2
Tuesday Poster Session
[822]

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, Weiming Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu

Weakly supervised object localization (WSOL) remains an open problem due to the deficiency of finding object extent information using a classification network. [Expand]

1.50
0
1
4
Thursday Poster Session
[823]

Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation

Hyojin Park, Jayeon Yoo, Seohyeong Jeong, Ganesh Venkatesh, Nojun Kwak

Current state-of-the-art approaches for Semi-supervised Video Object Segmentation (Semi-VOS) propagate information from previous frames to generate a segmentation mask for the current frame. [Expand]

1.50
0
0
6
Wednesday Poster Session
[824]

HoHoNet: 360 Indoor Holistic Understanding With Latent Horizontal Features

Cheng Sun, Min Sun, Hwann-Tzong Chen

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). [Expand]

1.50
1
0
0
2
Monday Poster Session
[825]

Layerwise Optimization by Gradient Decomposition for Continual Learning

Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang

Deep neural networks achieve state-of-the-art and sometimes super-human performance across a variety of domains. [Expand]

1.50
1
0
0
2
Wednesday Poster Session
[826]

Consensus Maximisation Using Influences of Monotone Boolean Functions

Ruwan Tennakoon, David Suter, Erchuan Zhang, Tat-Jun Chin, Alireza Bab-Hadiashar

Consensus maximisation (MaxCon), widely used for robust fitting in computer vision, aims to find the largest subset of data that fits the model within some tolerance level. [Expand]

1.50
0
1
4
Tuesday Poster Session
[827]

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

Aisha Urooj, Hilde Kuehne, Kevin Duarte, Chuang Gan, Niels Lobo, Mubarak Shah

The problem of grounding VQA tasks has seen an increased attention in the research community recently, with most attempts usually focusing on solving this task by using pretrained object detectors. [Expand]

1.50
1
0
5
Wednesday Poster Session
[828]

Efficient Feature Transformations for Discriminative and Generative Continual Learning

Vinay Kumar Verma, Kevin J Liang, Nikhil Mehta, Piyush Rai, Lawrence Carin

As neural networks are increasingly being applied to real-world applications, mechanisms to address distributional shift and sequential task learning without forgetting are critical. [Expand]

1.50
0
2
2
Thursday Poster Session
[829]

PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds

Yi Wei, Ziyi Wang, Yongming Rao, Jiwen Lu, Jie Zhou

In this paper, we propose a Point-Voxel Recurrent All-Pairs Field Transforms (PV-RAFT) method to estimate scene flow from point clouds. [Expand]

1.50
1
1
3
Tuesday Poster Session
[830]

Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association

Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang

Nowadays, we have witnessed the early progress on learning the association between voice and face automatically, which brings a new wave of studies to the computer vision community. [Expand]

1.50
1
2
0
0
Friday Poster Session
[831]

Rethinking Class Relations: Absolute-Relative Supervised and Unsupervised Few-Shot Learning

Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr

The majority of existing few-shot learning methods describe image relations with binary labels. [Expand]

1.50
1
0
5
Wednesday Poster Session
[832]

Variational Pedestrian Detection

Yuang Zhang, Huanyu He, Jianguo Li, Yuxi Li, John See, Weiyao Lin

Pedestrian detection in a crowd is a challenging task due to a high number of mutually-occluding human instances, which brings ambiguity and optimization difficulties to the current IoU-based ground truth assignment procedure in classical object detection methods. [Expand]

1.50
2
0
4
Thursday Poster Session
[833]

Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias

Yunhan Zhao, Shu Kong, Charless Fowlkes

Monocular depth predictors are typically trained on large-scale training sets which are naturally biased w.r.t. the distribution of camera poses. [Expand]

1.50
2
1
2
Friday Poster Session
[834]

Learning View-Disentangled Human Pose Representation by Contrastive Cross-View Mutual Information Maximization

Long Zhao, Yuxiao Wang, Jiaping Zhao, Liangzhe Yuan, Jennifer J. Sun, Florian Schroff, Hartwig Adam, Xi Peng, Dimitris Metaxas, Ting Liu

We introduce a novel representation learning method to disentangle pose-dependent as well as view-dependent factors from 2D human poses. [Expand]

1.50
0
1
4
Thursday Poster Session
[835]

Learning Statistical Texture for Semantic Segmentation

Lanyun Zhu, Deyi Ji, Shiping Zhu, Weihao Gan, Wei Wu, Junjie Yan

Existing semantic segmentation works mainly focus on learning the contextual information in high-level semantic features with CNNs. [Expand]

1.50
1
0
1
0
Thursday Poster Session
[836]

The Translucent Patch: A Physical and Universal Attack on Object Detectors

Alon Zolfi, Moshe Kravchik, Yuval Elovici, Asaf Shabtai

Physical adversarial attacks against object detectors have seen increasing success in recent years. [Expand]

1.50
1
2
0
0
Thursday Poster Session
[837]

Riggable 3D Face Reconstruction via In-Network Optimization

Ziqian Bai, Zhaopeng Cui, Xiaoming Liu, Ping Tan

This paper presents a method for riggable 3D face reconstruction from monocular images, which jointly estimates a personalized face rig and per-image parameters including expressions, poses, and illuminations. [Expand]

1.25
0
1
3
Tuesday Poster Session
[838]

View Generalization for Single Image Textured 3D Models

Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro

Humans can easily infer the underlying 3D geometry and texture of an object only from a single 2D image. [Expand]

1.25
1
0
4
Tuesday Poster Session
[839]

Scale-Localized Abstract Reasoning

Yaniv Benny, Niv Pekar, Lior Wolf

We consider the abstract relational reasoning task, which is commonly used as an intelligence test. [Expand]

1.25
1
0
0
1
Thursday Poster Session
[840]

Limitations of Post-Hoc Feature Alignment for Robustness

Collin Burns, Jacob Steinhardt

Feature alignment is an approach to improving robustness to distribution shift that matches the distribution of feature activations between the training distribution and test distribution. [Expand]

1.25
1
1
2
Monday Poster Session
[841]

Semi-Supervised Domain Adaptation Based on Dual-Level Domain Mixing for Semantic Segmentation

Shuaijun Chen, Xu Jia, Jianzhong He, Yongjie Shi, Jianzhuang Liu

Data-driven based approaches, in spite of great success in many tasks, have poor generalization when applied to unseen image domains, and require expensive cost of annotation especially for dense pixel prediction tasks such as semantic segmentation. [Expand]

1.25
1
1
0
0
Wednesday Poster Session
[842]

Triple-Cooperative Video Shadow Detection

Zhihao Chen, Liang Wan, Lei Zhu, Jia Shen, Huazhu Fu, Wennan Liu, Jing Qin

Shadow detection in single image has received significant research interests in recent years. [Expand]

1.25
0
1
3
Monday Poster Session
[843]

Cloud2Curve: Generation and Vectorization of Parametric Sketches

Ayan Das, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. [Expand]

1.25
1
1
0
0
Tuesday Poster Session
[844]

BASAR: Black-Box Attack on Skeletal Action Recognition

Yunfeng Diao, Tianjia Shao, Yong-Liang Yang, Kun Zhou, He Wang

Skeletal motion plays a vital role in human activity recognition as either an independent data source or a complement. [Expand]

1.25
1
0
0
1
Wednesday Poster Session
[845]

Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink

Ranjie Duan, Xiaofeng Mao, A. K. Qin, Yuefeng Chen, Shaokai Ye, Yuan He, Yun Yang

Though it is well known that the performance of deep neural networks (DNNs) degrades under certain light conditions, there exists no study on the threats of light beams emitted from some physical source as adversarial attacker on DNNs in a real-world scenario. [Expand]

1.25
1
1
0
0
Friday Poster Session
[846]

MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection

Jia-Chang Feng, Fa-Ting Hong, Wei-Shi Zheng

Weakly supervised video anomaly detection (WS-VAD) is to distinguish anomalies from normal events based on discriminative representations. [Expand]

1.25
1
1
0
0
Thursday Poster Session
[847]

Incremental Few-Shot Instance Segmentation

Dan Andrei Ganea, Bas Boom, Ronald Poppe

Few-shot instance segmentation methods are promising when labeled training data for novel classes is scarce. [Expand]

1.25
2
0
3
Monday Poster Session
[848]

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos

Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong

Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications. [Expand]

1.25
1
1
0
0
Monday Poster Session
[849]

Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression

Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, Jingdong Wang

In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. [Expand]

1.25
1
0
4
Thursday Poster Session
[850]

Cross Modal Focal Loss for RGBD Face Anti-Spoofing

Anjith George, Sebastien Marcel

Automatic methods for detecting presentation attacks are essential to ensure the reliable use of facial recognition technology. [Expand]

1.25
1
1
0
0
Wednesday Poster Session
[851]

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors

Vladimir Guzov, Aymen Mir, Torsten Sattler, Gerard Pons-Moll

We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. [Expand]

1.25
1
1
0
0
Tuesday Poster Session
[852]

Heterogeneous Grid Convolution for Adaptive, Efficient, and Controllable Computation

Ryuhei Hamaguchi, Yasutaka Furukawa, Masaki Onishi, Ken Sakurada

This paper proposes a novel heterogeneous grid convolution that builds a graph-based image representation by exploiting heterogeneity in the image content, enabling adaptive, efficient, and controllable computations in a convolutional architecture. [Expand]

1.25
0
1
3
Thursday Poster Session
[853]

ChallenCap: Monocular 3D Capture of Challenging Human Performances Using Multi-Modal References

Yannan He, Anqi Pang, Xin Chen, Han Liang, Minye Wu, Yuexin Ma, Lan Xu

Capturing challenging human motions is critical for numerous applications, but it suffers from complex motion patterns and severe self-occlusion under the monocular setting. [Expand]

1.25
1
1
0
0
Thursday Poster Session
[854]

Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries

Saif Imran, Xiaoming Liu, Daniel Morris

Depth completion starts from a sparse set of known depth values and estimates the unknown depths for the remaining image pixels. [Expand]

1.25
1
0
0
1
Monday Poster Session
[855]

Memory-Guided Unsupervised Image-to-Image Translation

Somi Jeong, Youngjung Kim, Eungbean Lee, Kwanghoon Sohn

We present a novel unsupervised framework for instance-level image-to-image translation. [Expand]

1.25
0
1
3
Tuesday Poster Session
[856]

Locate Then Segment: A Strong Pipeline for Referring Image Segmentation

Ya Jing, Tao Kong, Wei Wang, Liang Wang, Lei Li, Tieniu Tan

Referring image segmentation aims to segment the objects referred by a natural language expression. [Expand]

1.25
1
1
0
0
Wednesday Poster Session
[857]

Hierarchical Lovasz Embeddings for Proposal-Free Panoptic Segmentation

Tommi Kerola, Jie Li, Atsushi Kanehira, Yasunori Kudo, Alexis Vallet, Adrien Gaidon

Panoptic segmentation brings together two separate tasks: instance and semantic segmentation. [Expand]

1.25
0
1
3
Thursday Poster Session
[858]

IronMask: Modular Architecture for Protecting Deep Face Template

Sunpill Kim, Yunseong Jeong, Jinsu Kim, Jungkon Kim, Hyung Tae Lee, Jae Hong Seo

Convolutional neural networks have made remarkable progress in the face recognition field. [Expand]

1.25
1
1
0
0
Friday Poster Session
[859]

Interpretable Social Anchors for Human Trajectory Forecasting in Crowds

Parth Kothari, Brian Sifringer, Alexandre Alahi

Human trajectory forecasting in crowds, at its core, is a sequence prediction problem with specific challenges of capturing inter-sequence dependencies (social interactions) and consequently predicting socially-compliant multimodal distributions. [Expand]

1.25
0
0
5
Thursday Poster Session
[860]

BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

Jungbeom Lee, Jihun Yi, Chaehun Shin, Sungroh Yoon

Weakly supervised segmentation methods using bounding box annotations focus on obtaining a pixel-level mask from each box containing an object. [Expand]

1.25
1
1
0
0
Monday Poster Session
[861]

Looking Into Your Speech: Learning Cross-Modal Affinity for Audio-Visual Speech Separation

Jiyoung Lee, Soo-Whan Chung, Sunok Kim, Hong-Goo Kang, Kwanghoon Sohn

In this paper, we address the problem of separating individual speech signals from videos using audio-visual neural processing. [Expand]

1.25
1
1
0
0
Monday Poster Session
[862]

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

Honggu Liu, Xiaodan Li, Wenbo Zhou, Yuefeng Chen, Yuan He, Hui Xue, Weiming Zhang, Nenghai Yu

The remarkable success in face forgery techniques has received considerable attention in computer vision due to security concerns. [Expand]

1.25
1
1
0
0
Monday Poster Session
[863]

From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation

Chen Li, Gim Hee Lee

Animal pose estimation is an important field that has received increasing attention in recent years. [Expand]

1.25
1
0
4
Monday Poster Session
[864]

Progressive Domain Expansion Network for Single Domain Generalization

Lei Li, Ke Gao, Juan Cao, Ziyao Huang, Yepeng Weng, Xiaoyue Mi, Zhengze Yu, Xiaoya Li, Boyang Xia

Single domain generalization is a challenging case of model generalization, where the models are trained on a single domain and tested on other unseen domains. [Expand]

1.25
1
1
0
0
Monday Poster Session
[865]

PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

Xiangtai Li, Hao He, Xia Li, Duo Li, Guangliang Cheng, Jianping Shi, Lubin Weng, Yunhai Tong, Zhouchen Lin

Aerial Image Segmentation is a particular semantic segmentation problem and has several challenging characteristics that general semantic segmentation does not have. [Expand]

1.25
1
1
0
0
Tuesday Poster Session
[866]

MUST-GAN: Multi-Level Statistics Transfer for Self-Driven Person Image Generation

Tianxiang Ma, Bo Peng, Wei Wang, Jing Dong

Pose-guided person image generation usually involves using paired source-target images to supervise the training, which significantly increases the data preparation effort and limits the application of the models. [Expand]

1.25
0
0
5
Thursday Poster Session
[867]

Robust Audio-Visual Instance Discrimination

Pedro Morgado, Ishan Misra, Nuno Vasconcelos

We present a self-supervised learning method to learn audio and video representations. [Expand]

1.25
1
1
0
0
Thursday Poster Session
[868]

Focus on Local: Detecting Lane Marker From Bottom Up via Key Point

Zhan Qu, Huan Jin, Yang Zhou, Zhen Yang, Wei Zhang

Mainstream lane marker detection methods are implemented by predicting the overall structure and deriving parametric curves through post-processing. [Expand]

1.25
1
0
4
Thursday Poster Session
[869]

Probabilistic 3D Human Shape and Pose Estimation From Multiple Unconstrained Images in the Wild

Akash Sengupta, Ignas Budvytis, Roberto Cipolla

This paper addresses the problem of 3D human body shape and pose estimation from RGB images. [Expand]

1.25
1
1
2
Friday Poster Session
[870]

Manifold Regularized Dynamic Network Pruning

Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, Dacheng Tao, Chang Xu

Neural network pruning is an essential approach for reducing the computational complexity of deep models so that they can be well deployed on resource-limited devices. [Expand]

1.25
1
1
0
0
Tuesday Poster Session
[871]

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection

Wenjing Wang, Wenhan Yang, Jiaying Liu

Face detection in low light scenarios is challenging but vital to many practical applications, e.g., surveillance video, autonomous driving at night. [Expand]

1.25
1
0
4
Friday Poster Session
[872]

Scene Text Retrieval via Joint Text Detection and Similarity Learning

Hao Wang, Xiang Bai, Mingkun Yang, Shenggao Zhu, Jing Wang, Wenyu Liu

Scene text retrieval aims to localize and search all text instances from an image gallery that are the same as or similar to a given query text. [Expand]

1.25
0
0
5
Tuesday Poster Session
[873]

Towards More Flexible and Accurate Object Tracking With Natural Language: Algorithms and Benchmark

Xiao Wang, Xiujun Shu, Zhipeng Zhang, Bo Jiang, Yaowei Wang, Yonghong Tian, Feng Wu

Tracking by natural language specification is a new rising research topic that aims at locating the target object in the video sequence based on its language description. [Expand]

1.25
1
1
0
0
Thursday Poster Session
[874]

Troubleshooting Blind Image Quality Models in the Wild

Zhihua Wang, Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma

Recently, the group maximum differentiation competition (gMAD) has been used to improve blind image quality assessment (BIQA) models, with the help of full-reference metrics. [Expand]

1.25
1
1
0
0
Friday Poster Session
[875]

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

Human pose estimation has achieved significant progress in recent years. [Expand]

1.25
2
0
3
Friday Poster Session
[876]

CondenseNet V2: Sparse Feature Reactivation for Deep Networks

Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian

Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency. [Expand]

1.25
1
1
0
0
Tuesday Poster Session
[877]

FP-NAS: Fast Probabilistic Neural Architecture Search

Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli

Differential Neural Architecture Search (NAS) requires all layer choices to be held in memory simultaneously; this limits the size of both search space and final architecture. [Expand]

1.25
1
0
4
Thursday Poster Session
[878]

DER: Dynamically Expandable Representation for Class Incremental Learning

Shipeng Yan, Jiangwei Xie, Xuming He

We address the problem of class incremental learning, which is a core step towards achieving adaptive vision intelligence. [Expand]

1.25
1
0
0
1
Tuesday Poster Session
[879]

Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations

Yanyi Zhang, Xinyu Li, Ivan Marsic

Multi-label activity recognition is designed for recognizing multiple activities that are performed simultaneously or sequentially in each video. [Expand]

1.25
1
0
4
Thursday Poster Session
[880]

Weakly Supervised Video Salient Object Detection

Wangbo Zhao, Jing Zhang, Long Li, Nick Barnes, Nian Liu, Junwei Han

Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain. [Expand]

1.25
0
0
5
Friday Poster Session
[881]

Simpler Certified Radius Maximization by Propagating Covariances

Xingjian Zhen, Rudrasis Chakraborty, Vikas Singh

One strategy for adversarially training a robust model is to maximize its certified radius -- the neighborhood around a given training sample for which the model's prediction remains unchanged. [Expand]

1.25
0
2
1
Wednesday Poster Session
[882]

Progressive Temporal Feature Alignment Network for Video Inpainting

Xueyan Zou, Linjie Yang, Ding Liu, Yong Jae Lee

Video inpainting aims to fill spatio-temporal "corrupted" regions with plausible content. [Expand]

1.25
1
1
2
Friday Poster Session
[883]

What's in the Image? Explorable Decoding of Compressed Images

Yuval Bahat, Tomer Michaeli

The ever-growing amounts of visual contents captured on a daily basis necessitate the use of lossy compression methods in order to save storage space and transmission bandwidth. [Expand]

1.00
0
1
2
Tuesday Poster Session
[884]

Behavior-Driven Synthesis of Human Dynamics

Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer

Generating and representing human behavior are of major importance for various computer vision applications. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[885]

On Focal Loss for Class-Posterior Probability Estimation: A Theoretical Perspective

Nontawat Charoenphakdee, Jayakorn Vongkulbhisal, Nuttapong Chairatanakul, Masashi Sugiyama

The focal loss has demonstrated its effectiveness in many real-world applications such as object detection and image classification, but its theoretical understanding has been limited so far. [Expand]

1.00
1
Tuesday Poster Session
[886]

Wide-Baseline Relative Camera Pose Estimation With Directional Learning

Kefan Chen, Noah Snavely, Ameesh Makadia

Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. [Expand]

1.00
1
Tuesday Poster Session
[887]

A Hyperbolic-to-Hyperbolic Graph Convolutional Network

Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia

Hyperbolic graph convolutional networks (GCNs) demonstrate powerful representation ability to model graphs with hierarchical structure. [Expand]

1.00
1
Monday Poster Session
[888]

Square Root Bundle Adjustment for Large-Scale Reconstruction

Nikolaus Demmel, Christiane Sommer, Daniel Cremers, Vladyslav Usenko

We propose a new formulation for the bundle adjustment problem which relies on nullspace marginalization of landmark variables by QR decomposition. [Expand]

1.00
0
0
4
Thursday Poster Session
[889]

StickyPillars: Robust and Efficient Feature Matching on Point Clouds Using Graph Neural Networks

Kai Fischer, Martin Simon, Florian Olsner, Stefan Milz, Horst-Michael Gross, Patrick Mader

Robust point cloud registration in real-time is an important prerequisite for many mapping and localization algorithms. [Expand]

1.00
1
0
0
0
Monday Poster Session
[890]

Unsupervised Pre-Training for Person Re-Identification

Dengpan Fu, Dongdong Chen, Jianmin Bao, Hao Yang, Lu Yuan, Lei Zhang, Houqiang Li, Dong Chen

In this paper, we present a large scale unlabeled person re-identification (Re-ID) dataset "LUPerson" and make the first attempt of performing unsupervised pre-training for improving the generalization ability of the learned person Re-ID feature representation. [Expand]

1.00
0
1
2
Thursday Poster Session
[891]

Privacy-Preserving Collaborative Learning With Automatic Transformation Search

Wei Gao, Shangwei Guo, Tianwei Zhang, Han Qiu, Yonggang Wen, Yang Liu

Collaborative learning has gained great popularity due to its benefit of data privacy protection: participants can jointly train a Deep Learning model without sharing their training sets. [Expand]

1.00
0
0
4
Monday Poster Session
[892]

Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation

Rui Gong, Yuhua Chen, Danda Pani Paudel, Yawei Li, Ajad Chhatkuli, Wen Li, Dengxin Dai, Luc Van Gool

Open compound domain adaptation (OCDA) is a domain adaptation setting, where the target domain is modeled as a compound of multiple unknown homogeneous domains, which brings the advantage of improved generalization to unseen domains. [Expand]

1.00
1
Wednesday Poster Session
[893]

Panoptic Segmentation Forecasting

Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander G. Schwing

Our goal is to forecast the near future given a set of recent observations. [Expand]

1.00
1
1
1
Thursday Poster Session
[894]

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

Yisheng He, Haibin Huang, Haoqiang Fan, Qifeng Chen, Jian Sun

In this work, we present FFB6D, a full flow bidirectional fusion network designed for 6D pose estimation from a single RGBD image. [Expand]

1.00
1
Tuesday Poster Session
[895]

Learnable Graph Matching: Incorporating Graph Partitioning With Deep Feature Learning for Multiple Object Tracking

Jiawei He, Zehao Huang, Naiyan Wang, Zhaoxiang Zhang

Data association across frames is at the core of the Multiple Object Tracking (MOT) task. [Expand]

1.00
1
Tuesday Poster Session
[896]

Multi-Source Domain Adaptation With Collaborative Learning for Semantic Segmentation

Jianzhong He, Xu Jia, Shuaijun Chen, Jianzhuang Liu

Multi-source unsupervised domain adaptation (MSDA) aims at adapting models trained on multiple labeled source domains to an unlabeled target domain. [Expand]

1.00
1
Wednesday Poster Session
[897]

DSRNA: Differentiable Search of Robust Neural Architectures

Ramtin Hosseini, Xingyi Yang, Pengtao Xie

In deep learning applications, the architectures of deep neural networks are crucial in achieving high accuracy. [Expand]

1.00
1
Tuesday Poster Session
[898]

Detecting Human-Object Interaction via Fabricated Compositional Learning

Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao

Human-Object Interaction (HOI) detection, inferring the relationships between humans and objects from images/videos, is a fundamental task for high-level scene understanding. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[899]

DI-Fusion: Online Implicit 3D Reconstruction With Deep Priors

Jiahui Huang, Shi-Sheng Huang, Haoxuan Song, Shi-Min Hu

Previous online 3D dense reconstruction methods struggle to achieve the balance between memory storage and surface quality, largely due to the usage of stagnant underlying geometry representation, such as TSDF (truncated signed distance functions) or surfels, without any knowledge of the scene priors. [Expand]

1.00
1
Wednesday Poster Session
[900]

Self-Supervised Video Representation Learning by Context and Motion Decoupling

Lianghua Huang, Yu Liu, Bin Wang, Pan Pan, Yinghui Xu, Rong Jin

A key challenge in self-supervised video representation learning is how to effectively capture motion information besides context bias. [Expand]

1.00
1
Thursday Poster Session
[901]

Learning Position and Target Consistency for Memory-Based Video Object Segmentation

Li Hu, Peng Zhang, Bang Zhang, Pan Pan, Yinghui Xu, Rong Jin

This paper studies the problem of semi-supervised video object segmentation (VOS). [Expand]

1.00
1
Tuesday Poster Session
[902]

EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation

Yang Jiao, Trac D. Tran, Guangming Shi

This paper addresses the challenging unsupervised scene flow estimation problem by jointly learning four low-level vision sub-tasks: optical flow F, stereo-depth D, camera pose P and motion segmentation S. [Expand]

1.00
0
1
2
Tuesday Poster Session
[903]

Embedding Transfer With Label Relaxation for Improved Metric Learning

Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak

This paper presents a novel method for embedding transfer, a task of transferring knowledge of a learned embedding model to another. [Expand]

1.00
1
0
3
Tuesday Poster Session
[904]

Improving Accuracy of Binary Neural Networks Using Unbalanced Activation Distribution

Hyungjun Kim, Jihoon Park, Changhun Lee, Jae-Joon Kim

Binarization of neural network models is considered as one of the promising methods to deploy deep neural network models on resource-constrained environments such as mobile devices. [Expand]

1.00
1
0
0
0
Wednesday Poster Session
[905]

Single-View Robot Pose and Joint Angle Estimation via Render & Compare

Yann Labbe, Justin Carpentier, Mathieu Aubry, Josef Sivic

We introduce RoboPose, a method to estimate the joint angles and the 6D camera-to-robot pose of a known articulated robot from a single RGB image. [Expand]

1.00
1
0
3
Monday Poster Session
[906]

Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency

Xin Lai, Zhuotao Tian, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia

Semantic segmentation has made tremendous progress in recent years. [Expand]

1.00
1
Monday Poster Session
[907]

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation

Jungbeom Lee, Eunji Kim, Sungroh Yoon

Weakly supervised semantic segmentation produces a pixel-level localization from class labels; but a classifier trained on such labels is likely to restrict its focus to a small discriminative region of the target object. [Expand]

1.00
1
Tuesday Poster Session
[908]

Regularization Strategy for Point Cloud via Rigidly Mixed Sample

Dogyoon Lee, Jaeha Lee, Junhyeop Lee, Hyeongmin Lee, Minhyeok Lee, Sungmin Woo, Sangyoun Lee

Data augmentation is an effective regularization strategy to alleviate overfitting, which is an inherent drawback of deep neural networks. [Expand]

1.00
1
0
0
0
Friday Poster Session
[909]

MOOD: Multi-Level Out-of-Distribution Detection

Ziqian Lin, Sreya Dutta Roy, Yixuan Li

Out-of-distribution (OOD) detection is essential to prevent anomalous inputs from causing a model to fail during deployment. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[910]

Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting

Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin

Crowd counting is a fundamental yet challenging task, which desires rich information to generate pixel-wise crowd density maps. [Expand]

1.00
1
0
0
0
Tuesday Poster Session
[911]

Goal-Oriented Gaze Estimation for Zero-Shot Learning

Yang Liu, Lei Zhou, Xiao Bai, Yifei Huang, Lin Gu, Jun Zhou, Tatsuya Harada

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. [Expand]

1.00
1
Tuesday Poster Session
[912]

Inception Convolution With Efficient Dilation Search

Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu

As a variant of standard convolution, a dilated convolution can control effective receptive fields and handle large scale variance of objects without introducing additional computational costs. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[913]

Action Shuffle Alternating Learning for Unsupervised Action Segmentation

Jun Li, Sinisa Todorovic

This paper addresses unsupervised action segmentation. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[914]

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

Jiefeng Li, Chao Xu, Zhicun Chen, Siyuan Bian, Lixin Yang, Cewu Lu

Model-based 3D pose and shape estimation methods reconstruct a full 3D mesh for the human body by estimating several parameters. [Expand]

1.00
1
0
0
0
Tuesday Poster Session
[915]

OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets

Zhengqin Li, Ting-Wei Yu, Shen Sang, Sarah Wang, Meng Song, Yuhan Liu, Yu-Ying Yeh, Rui Zhu, Nitesh Gundavarapu, Jia Shi, Sai Bi, Hong-Xing Yu, Zexiang Xu, Kalyan Sunkavalli, Milos Hasan, Ravi Ramamoorthi, Manmohan Chandraker

We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics. [Expand]

1.00
1
Wednesday Poster Session
[916]

POSEFusion: Pose-Guided Selective Fusion for Single-View Human Volumetric Capture

Zhe Li, Tao Yu, Zerong Zheng, Kaiwen Guo, Yebin Liu

We propose POse-guided SElective Fusion (POSEFusion), a single-view human volumetric capture method that leverages tracking-based methods and tracking-free inference to achieve high-fidelity and dynamic 3D reconstruction. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[917]

Virtual Fully-Connected Layer: Training a Large-Scale Face Recognition Dataset With Limited Computational Resources

Pengyu Li, Biao Wang, Lei Zhang

Recently, deep face recognition has achieved significant progress because of Convolutional Neural Networks (CNNs) and large-scale datasets. [Expand]

1.00
1
Thursday Poster Session
[918]

M3DSSD: Monocular 3D Single Stage Object Detector

Shujie Luo, Hang Dai, Ling Shao, Yong Ding

In this paper, we propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention. [Expand]

1.00
1
Tuesday Poster Session
[919]

Bridging the Visual Gap: Wide-Range Image Blending

Chia-Ni Lu, Ya-Chu Chang, Wei-Chen Chiu

In this paper we propose a new problem scenario in image processing, wide-range image blending, which aims to smoothly merge two different input photos into a panorama by generating novel image content for the intermediate region between them. [Expand]

1.00
1
0
0
0
Monday Poster Session
[920]

IQDet: Instance-Wise Quality Distribution Sampling for Object Detection

Yuchen Ma, Songtao Liu, Zeming Li, Jian Sun

We propose a dense object detector with an instance-wise sampling strategy, named IQDet. [Expand]

1.00
1
Monday Poster Session
[921]

Depth-Aware Mirror Segmentation

Haiyang Mei, Bo Dong, Wen Dong, Pieter Peers, Xin Yang, Qiang Zhang, Xiaopeng Wei

We present a novel mirror segmentation method that leverages depth estimates from ToF-based cameras as an additional cue to disambiguate challenging cases where the contrast or relation in RGB colors between the mirror reflection and the surrounding scene is subtle. [Expand]

1.00
1
Tuesday Poster Session
[922]

GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction

Cheol-Hui Min, Jinseok Bae, Junho Lee, Young Min Kim

We present GATSBI, a generative model that can transform a sequence of raw observations into a structured latent representation that fully captures the spatio-temporal context of the agent's actions. [Expand]

1.00
0
1
2
Tuesday Poster Session
[923]

Background Splitting: Finding Rare Classes in a Sea of Background

Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian

We focus on the problem of training deep image classification models for a small number of extremely rare categories. [Expand]

1.00
0
1
2
Wednesday Poster Session
[924]

LayoutGMN: Neural Graph Matching for Structural Layout Similarity

Akshay Gadi Patil, Manyi Li, Matthew Fisher, Manolis Savva, Hao Zhang

We present a deep neural network to predict structural similarity between 2D layouts by leveraging Graph Matching Networks (GMN). [Expand]

1.00
1
Wednesday Poster Session
[925]

Robust Multimodal Vehicle Detection in Foggy Weather Using Complementary Lidar and Radar Signals

Kun Qian, Shilin Zhu, Xinyu Zhang, Li Erran Li

Vehicle detection with visual sensors like lidar and camera is one of the critical functions enabling autonomous driving. [Expand]

1.00
1
Monday Poster Session
[926]

Every Annotation Counts: Multi-Label Deep Supervision for Medical Image Segmentation

Simon Reiss, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen

Pixel-wise segmentation is one of the most data and annotation hungry tasks in our field. [Expand]

1.00
1
Wednesday Poster Session
[927]

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xian-Sheng Hua, Xiaoliang Cheng, Kewei Liang

Binary grid mask representation is broadly used in instance segmentation. [Expand]

1.00
1
Wednesday Poster Session
[928]

StablePose: Learning 6D Object Poses From Geometrically Stable Patches

Yifei Shi, Junwen Huang, Xin Xu, Yifan Zhang, Kai Xu

We introduce the concept of geometric stability to the problem of 6D object pose estimation and propose to learn pose inference based on geometrically stable patches extracted from observed 3D point clouds. [Expand]

1.00
1
1
1
Thursday Poster Session
[929]

BCNet: Searching for Network Width With Bilaterally Coupled Network

Xiu Su, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu

Searching for a more compact network width recently serves as an effective way of channel pruning for the deployment of convolutional neural networks (CNNs) under hardware constraints. [Expand]

1.00
1
Monday Poster Session
[930]

Prioritized Architecture Sampling With Monto-Carlo Tree Search

Xiu Su, Tao Huang, Yanxi Li, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu

One-shot neural architecture search (NAS) methods significantly reduce the search cost by considering the whole search space as one network, which only needs to be trained once. [Expand]

1.00
1
Wednesday Poster Session
[931]

Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks

Naoya Takahashi, Yuki Mitsufuji

Tasks that involve high-resolution dense prediction require a modeling of both local and global patterns in a large input field. [Expand]

1.00
1
Monday Poster Session
[932]

EnD: Entangling and Disentangling Deep Representations for Bias Correction

Enzo Tartaglione, Carlo Alberto Barbano, Marco Grangetto

Artificial neural networks perform state-of-the-art in an ever-growing number of tasks, and nowadays they are used to solve an incredibly large variety of tasks. [Expand]

1.00
1
Thursday Poster Session
[933]

MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

Vibashan VS, Vikram Gupta, Poojan Oza, Vishwanath A. Sindagi, Vishal M. Patel

Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. [Expand]

1.00
1
Tuesday Poster Session
[934]

Combinatorial Learning of Graph Edit Distance via Dynamic Embedding

Runzhong Wang, Tianqi Zhang, Tianshu Yu, Junchi Yan, Xiaokang Yang

Graph Edit Distance (GED) is a popular similarity measurement for pairwise graphs and it also refers to the recovery of the edit path from the source graph to the target graph. [Expand]

1.00
1
Tuesday Poster Session
[935]

Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection

Zhenyu Wang, Yali Li, Ye Guo, Lu Fang, Shengjin Wang

In this paper, we delve into semi-supervised object detection where unlabeled images are leveraged to break through the upper bound of fully-supervised object detection models. [Expand]

1.00
1
Tuesday Poster Session
[936]

Convolutional Neural Network Pruning With Structural Redundancy Reduction

Zi Wang, Chengcheng Li, Xiangyang Wang

Convolutional neural network (CNN) pruning has become one of the most successful network compression approaches in recent years. [Expand]

1.00
1
Thursday Poster Session
[937]

Enhancing the Transferability of Adversarial Attacks Through Variance Tuning

Xiaosen Wang, Kun He

Deep neural networks are vulnerable to adversarial examples that mislead the models with imperceptible perturbations. [Expand]

1.00
1
Monday Poster Session
[938]

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

Hui-Po Wang, Ning Yu, Mario Fritz

While Generative Adversarial Networks (GANs) show increasing performance and the level of realism is becoming indistinguishable from natural images, this also comes with high demands on data and computation. [Expand]

1.00
1
Wednesday Poster Session
[939]

Self-Supervised Learning for Semi-Supervised Temporal Action Proposal

Xiang Wang, Shiwei Zhang, Zhiwu Qing, Yuanjie Shao, Changxin Gao, Nong Sang

Self-supervised learning presents a remarkable performance to utilize unlabeled data for various video tasks. [Expand]

1.00
1
Monday Poster Session
[940]

NeuralFusion: Online Depth Fusion in Latent Space

Silvan Weder, Johannes L. Schonberger, Marc Pollefeys, Martin R. Oswald

We present a novel online depth map fusion approach that learns depth map aggregation in a latent feature space. [Expand]

1.00
1
Tuesday Poster Session
[941]

Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing

Yu Wu, Yi Yang

We investigate the weakly-supervised audio-visual video parsing task, which aims to parse a video into temporal event segments and predict the audible or visible event categories. [Expand]

1.00
1
Monday Poster Session
[942]

MotionRNN: A Flexible Model for Video Prediction With Spacetime-Varying Motions

Haixu Wu, Zhiyu Yao, Jianmin Wang, Mingsheng Long

This paper tackles video prediction from a new dimension of predicting spacetime-varying motions that are incessantly changing across both space and time. [Expand]

1.00
1
Thursday Poster Session
[943]

Intra-Inter Camera Similarity for Unsupervised Person Re-Identification

Shiyu Xuan, Shiliang Zhang

Most unsupervised person Re-Identification (Re-ID) works produce pseudo-labels by measuring the feature similarity without considering the distribution discrepancy among cameras, leading to degraded accuracy in label computation across cameras. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[944]

Inferring CAD Modeling Sequences Using Zone Graphs

Xianghao Xu, Wenzhe Peng, Chin-Yi Cheng, Karl D.D. Willis, Daniel Ritchie

In computer-aided design (CAD), the ability to "reverse engineer" the modeling steps used to create 3D shapes is a long-sought-after goal. [Expand]

1.00
1
0
0
0
Tuesday Poster Session
[945]

DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-Scale Consistency

Zongxin Yang, Xin Yu, Yi Yang

Compared to 2D object bounding-box labeling, it is very difficult for humans to annotate 3D object poses, especially when depth images of scenes are unavailable. [Expand]

1.00
1
Tuesday Poster Session
[946]

Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification

Fengxiang Yang, Zhun Zhong, Zhiming Luo, Yuanzheng Cai, Yaojin Lin, Shaozi Li, Nicu Sebe

This paper considers the problem of unsupervised person re-identification (re-ID), which aims to learn discriminative models with unlabeled data. [Expand]

1.00
1
Tuesday Poster Session
[947]

ST3D: Self-Training for Unsupervised Domain Adaptation on 3D Object Detection

Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi

We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds. [Expand]

1.00
1
0
0
0
Wednesday Poster Session
[948]

Slimmable Compressive Autoencoders for Practical Neural Image Compression

Fei Yang, Luis Herranz, Yongmei Cheng, Mikhail G. Mozerov

Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. [Expand]

1.00
1
Tuesday Poster Session
[949]

Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation

Xiangyu Yue, Zangwei Zheng, Shanghang Zhang, Yang Gao, Trevor Darrell, Kurt Keutzer, Alberto Sangiovanni Vincentelli

Unsupervised Domain Adaptation (UDA) transfers predictive models from a fully-labeled source domain to an unlabeled target domain. [Expand]

1.00
1
Thursday Poster Session
[950]

LAFEAT: Piercing Through Adversarial Defenses With Latent Features

Yunrui Yu, Xitong Gao, Cheng-Zhong Xu

Deep convolutional neural networks are susceptible to adversarial attacks. [Expand]

1.00
1
Tuesday Poster Session
[951]

CorrNet3D: Unsupervised End-to-End Learning of Dense Correspondence for 3D Point Clouds

Yiming Zeng, Yue Qian, Zhiyu Zhu, Junhui Hou, Hui Yuan, Ying He

Motivated by the intuition that one can transform two aligned point clouds to each other more easily and meaningfully than a misaligned pair, we propose CorrNet3D, the first unsupervised and end-to-end deep learning-based framework, to drive the learning of dense correspondence between 3D shapes by means of deformation-like reconstruction to overcome the need for annotated data. [Expand]

1.00
1
1
1
Tuesday Poster Session
[952]

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

Chi Zhang, Baoxiong Jia, Song-Chun Zhu, Yixin Zhu

Spatial-temporal reasoning is a challenging task in Artificial Intelligence (AI) due to its demanding but unique nature: a theoretic requirement on representing and reasoning based on spatial-temporal knowledge in mind, and an applied requirement on a high-level cognitive system capable of navigating and acting in space and time. [Expand]

1.00
1
Wednesday Poster Session
[953]

ACRE: Abstract Causal REasoning Beyond Covariation

Chi Zhang, Baoxiong Jia, Mark Edmonds, Song-Chun Zhu, Yixin Zhu

Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal role in modern scientific discovery, especially in scenarios with only sparse and limited data. [Expand]

1.00
1
Wednesday Poster Session
[954]

Body Meshes as Points

Jianfeng Zhang, Dongdong Yu, Jun Hao Liew, Xuecheng Nie, Jiashi Feng

We consider the challenging multi-person 3D body mesh estimation task in this work. [Expand]

1.00
1
Monday Poster Session
[955]

EDNet: Efficient Disparity Estimation With Cost Volume Combination and Attention-Based Spatial Residual

Songyan Zhang, Zhicheng Wang, Qiang Wang, Jinshuo Zhang, Gang Wei, Xiaowen Chu

Existing state-of-the-art disparity estimation works mostly leverage the 4D concatenation volume and construct a very deep 3D convolution neural network (CNN) for disparity regression, which is inefficient due to the high memory consumption and slow inference speed. [Expand]

1.00
1
1
1
Tuesday Poster Session
[956]

Exploiting Edge-Oriented Reasoning for 3D Point-Based Scene Graph Analysis

Chaoyi Zhang, Jianhui Yu, Yang Song, Weidong Cai

Scene understanding is a critical problem in computer vision. [Expand]

1.00
1
Wednesday Poster Session
[957]

Neural Architecture Search With Random Labels

Xuanyang Zhang, Pengfei Hou, Xiangyu Zhang, Jian Sun

In this paper, we investigate a new variant of neural architecture search (NAS) paradigm -- searching with random labels (RLNAS). [Expand]

1.00
1
0
0
0
Wednesday Poster Session
[958]

Stochastic Whitening Batch Normalization

Shengdong Zhang, Ehsan Nezhadarya, Homa Fashandi, Jiayi Liu, Darin Graham, Mohak Shah

Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). [Expand]

1.00
1
0
3
Wednesday Poster Session
[959]

UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification

Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains. [Expand]

1.00
1
Thursday Poster Session
[960]

Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction From Raw Point Clouds

Wenbin Zhao, Jiabao Lei, Yuxin Wen, Jianguo Zhang, Kui Jia

Shape modeling and reconstruction from raw point clouds of objects stand as a fundamental challenge in vision and graphics research. [Expand]

1.00
1
Wednesday Poster Session
[961]

SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud

Wu Zheng, Weiliang Tang, Li Jiang, Chi-Wing Fu

We present Self-Ensembling Single-Stage object Detector (SE-SSD) for accurate and efficient 3D object detection in outdoor point clouds. [Expand]

1.00
0
0
4
Thursday Poster Session
[962]

Cross-MPI: Cross-Scale Stereo for Image Super-Resolution Using Multiplane Images

Yuemei Zhou, Gaochang Wu, Ying Fu, Kun Li, Yebin Liu

Various combinations of cameras enrich computational photography, among which reference-based superresolution (RefSR) plays a critical role in multiscale imaging systems. [Expand]

1.00
1
0
0
0
Thursday Poster Session
[963]

Panoptic-PolarNet: Proposal-Free LiDAR Point Cloud Panoptic Segmentation

Zixiang Zhou, Yang Zhang, Hassan Foroosh

Panoptic segmentation presents a new challenge in exploiting the merits of both detection and segmentation, with the aim of unifying instance segmentation and semantic segmentation in a single framework. [Expand]

1.00
1
Thursday Poster Session
[964]

Fourier Contour Embedding for Arbitrary-Shaped Text Detection

Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang

One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances. [Expand]

1.00
1
0
0
0
Tuesday Poster Session
[965]

Complementary Relation Contrastive Distillation

Jinguo Zhu, Shixiang Tang, Dapeng Chen, Shijie Yu, Yakun Liu, Mingzhe Rong, Aijun Yang, Xiaohua Wang

Knowledge distillation aims to transfer representation ability from a teacher model to a student model. [Expand]

1.00
1
0
0
0
Wednesday Poster Session
[966]

Where and What? Examining Interpretable Disentangled Representations

Xinqi Zhu, Chang Xu, Dacheng Tao

Capturing interpretable variations has long been one of the goals in disentanglement learning. [Expand]

1.00
1
Tuesday Poster Session
[967]

Denoise and Contrast for Category Agnostic Shape Completion

Antonio Alliegro, Diego Valsesia, Giulia Fracastoro, Enrico Magli, Tatiana Tommasi

In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it. [Expand]

0.75
0
0
3
Tuesday Poster Session
[968]

Dogfight: Detecting Drones From Drones Videos

Muhammad Waseem Ashraf, Waqas Sultani, Mubarak Shah

As airborne vehicles are becoming more autonomous and ubiquitous, it has become vital to develop the capability to detect the objects in their surroundings. [Expand]

0.75
2
0
1
Tuesday Poster Session
[969]

What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. [Expand]

0.75
2
0
1
Tuesday Poster Session
[970]

Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown Generic Reflectance

Ziang Cheng, Hongdong Li, Yuta Asano, Yinqiang Zheng, Imari Sato

Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g. [Expand]

0.75
1
0
2
Friday Poster Session
[971]

Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

Xingyu Chen, Yufeng Liu, Chongyang Ma, Jianlong Chang, Huayan Wang, Tian Chen, Xiaoyan Guo, Pengfei Wan, Wen Zheng

Recent years have witnessed significant progress in 3D hand mesh recovery. [Expand]

0.75
0
0
3
Thursday Poster Session
[972]

Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision

Xiaokang Chen, Yuhui Yuan, Gang Zeng, Jingdong Wang

In this paper, we study the semi-supervised semantic segmentation problem via exploring both labeled data and extra unlabeled data. [Expand]

0.75
0
0
3
Monday Poster Session
[973]

PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering

Jang Hyun Cho, Utkarsh Mall, Kavita Bala, Bharath Hariharan

We present a new framework for semantic segmentation without annotations via clustering. [Expand]

0.75
0
1
1
Friday Poster Session
[974]

Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation

Zhekai Du, Jingjing Li, Hongzu Su, Lei Zhu, Ke Lu

Unsupervised Domain Adaptation (UDA) aims to generalize the knowledge learned from a well-labeled source domain to an unlabeled target domain. [Expand]

0.75
1
0
2
Tuesday Poster Session
[975]

Siamese Natural Language Tracker: Tracking by Natural Language Descriptions With Siamese Trackers

Qi Feng, Vitaly Ablavsky, Qinxun Bai, Stan Sclaroff

We propose a novel Siamese Natural Language Tracker (SNLT), which brings the advancements in visual tracking to the tracking by natural language (NL) specification task. [Expand]

0.75
1
1
0
Tuesday Poster Session
[976]

OTA: Optimal Transport Assignment for Object Detection

Zheng Ge, Songtao Liu, Zeming Li, Osamu Yoshie, Jian Sun

Recent advances in label assignment in object detection mainly seek to independently define positive/negative training samples for each ground-truth (gt) object. [Expand]

0.75
1
1
0
Monday Poster Session
[977]

Bidirectional Projection Network for Cross Dimension Scene Understanding

Wenbo Hu, Hengshuang Zhao, Li Jiang, Jiaya Jia, Tien-Tsin Wong

2D image representations are in regular grids and can be processed efficiently, whereas 3D point clouds are unordered and scattered in 3D space. [Expand]

0.75
0
1
1
Thursday Poster Session
[978]

Few-Shot Open-Set Recognition by Transformation Consistency

Minki Jeong, Seokeon Choi, Changick Kim

In this paper, we attack a few-shot open-set recognition (FSOSR) problem, which is a combination of few-shot learning (FSL) and open-set recognition (OSR). [Expand]

0.75
0
0
3
Thursday Poster Session
[979]

Scalability vs. Utility: Do We Have To Sacrifice One for the Other in Data Importance Quantification?

Ruoxi Jia, Fan Wu, Xuehui Sun, Jiacen Xu, David Dao, Bhavya Kailkhura, Ce Zhang, Bo Li, Dawn Song

Quantifying the importance of each training point to a learning task is a fundamental problem in machine learning and the estimated importance scores have been leveraged to guide a range of data workflows such as data summarization and domain adaption. [Expand]

0.75
0
1
1
Wednesday Poster Session
[980]

Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network

Aupendu Kar, Prabir Kumar Biswas

Convolutional neural network (CNN) has achieved unprecedented success in image super-resolution tasks in recent years. [Expand]

0.75
0
0
3
Tuesday Poster Session
[981]

Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

Shi-Lin Liu, Hao-Xiang Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu

Point set is a flexible and lightweight representation widely used for 3D deep learning. [Expand]

0.75
1
0
2
Monday Poster Session
[982]

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

Hongyu Liu, Ziyu Wan, Wei Huang, Yibing Song, Xintong Han, Jing Liao

We propose PD-GAN, a probabilistic diverse GAN for image inpainting. [Expand]

0.75
1
0
2
Wednesday Poster Session
[983]

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

Yongfei Liu, Bo Wan, Lin Ma, Xuming He

Visual grounding, which aims to build a correspondence between visual objects and their language entities, plays a key role in cross-modal scene understanding. [Expand]

0.75
0
0
3
Tuesday Poster Session
[984]

Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation

Astuti Sharma, Tarun Kalluri, Manmohan Chandraker

Domain adaptation deals with training models using large scale labeled data from a specific source domain and then adapting the knowledge to certain target domains that have few or no labels. [Expand]

0.75
0
0
3
Tuesday Poster Session
[985]

Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning

Mingjie Sun, Jimin Xiao, Eng Gee Lim

In this paper, we are tackling the proposal-free referring expression grounding task, aiming at localizing the target object according to a query sentence, without relying on off-the-shelf object proposals. [Expand]

0.75
0
1
1
Thursday Poster Session
[986]

Delving into Data: Effectively Substitute Training for Black-box Attack

Wenxuan Wang, Bangjie Yin, Taiping Yao, Li Zhang, Yanwei Fu, Shouhong Ding, Jilin Li, Feiyue Huang, Xiangyang Xue

Deep models have shown their vulnerability when processing adversarial samples. [Expand]

0.75
0
0
3
Tuesday Poster Session
[987]

Exploring Sparsity in Image Super-Resolution for Efficient Inference

Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, Yulan Guo

Current CNN-based super-resolution (SR) methods process all locations equally with computational resources being uniformly assigned in space. [Expand]

0.75
1
1
0
Tuesday Poster Session
[988]

From Rain Generation to Rain Removal

Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, Deyu Meng

For the single image rain removal (SIRR) task, the performance of deep learning (DL)-based methods is mainly affected by the designed deraining models and training datasets. [Expand]

0.75
2
0
1
Thursday Poster Session
[989]

Cycle4Completion: Unpaired Point Cloud Completion Using Cycle Transformation With Missing Region Coding

Xin Wen, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu

In this paper, we present a novel unpaired point cloud completion network, named Cycle4Completion, to infer the complete geometries from a partial 3D object. [Expand]

0.75
1
0
2
Thursday Poster Session
[990]

Bilateral Grid Learning for Stereo Matching Networks

Bin Xu, Yuhua Xu, Xiaoli Yang, Wei Jia, Yulan Guo

Real-time performance of stereo matching networks is important for many applications, such as automatic driving, robot navigation and augmented reality (AR). [Expand]

0.75
0
1
1
Thursday Poster Session
[991]

Diversifying Sample Generation for Accurate Data-Free Quantization

Xiangguo Zhang, Haotong Qin, Yifu Ding, Ruihao Gong, Qinghua Yan, Renshuai Tao, Yuhang Li, Fengwei Yu, Xianglong Liu

Quantization has emerged as one of the most prevalent approaches to compress and accelerate neural networks. [Expand]

0.75
1
0
2
Friday Poster Session
[992]

Fostering Generalization in Single-View 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors

Jan Bechtold, Maxim Tatarchenko, Volker Fischer, Thomas Brox

Single-view 3D object reconstruction has seen much progress, yet methods still struggle generalizing to novel shapes unseen during training. [Expand]

0.50
0
0
2
Friday Poster Session
[993]

Towards Part-Based Understanding of RGB-D Scans

Alexey Bokhovkin, Vladislav Ishimtsev, Emil Bogomolov, Denis Zorin, Alexey Artemov, Evgeny Burnaev, Angela Dai

Recent advances in 3D semantic scene understanding have shown impressive progress in 3D instance segmentation, enabling object-level reasoning about 3D scenes; however, a finer-grained understanding is required to enable interactions with objects and their functional understanding. [Expand]

0.50
0
1
0
Wednesday Poster Session
[994]

Fine-Grained Angular Contrastive Learning With Coarse Labels

Guy Bukchin, Eli Schwartz, Kate Saenko, Ori Shahar, Rogerio Feris, Raja Giryes, Leonid Karlinsky

Few-shot learning methods offer pre-training techniques optimized for easier later adaptation of the model to new classes (unseen during training) using one or a few examples. [Expand]

0.50
0
0
2
Wednesday Poster Session
[995]

Semantic Scene Completion via Integrating Instances and Scene In-the-Loop

Yingjie Cai, Xuesong Chen, Chao Zhang, Kwan-Yee Lin, Xiaogang Wang, Hongsheng Li

Semantic Scene Completion aims at reconstructing a complete 3D scene with precise voxel-wise semantics from a single-view depth or RGBD image. [Expand]

0.50
0
0
2
Monday Poster Session
[996]

Globally Optimal Relative Pose Estimation With Gravity Prior

Yaqing Ding, Daniel Barath, Jian Yang, Hui Kong, Zuzana Kukelova

Smartphones, tablets and camera systems used, e.g., in cars and UAVs, are typically equipped with IMUs (inertial measurement units) that can measure the gravity vector accurately. [Expand]

0.50
0
0
2
Monday Poster Session
[997]

Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball

Andrew Elliott, Stephen Law, Chris Russell

We present a simple regularization of adversarial perturbations based upon the perceptual loss. [Expand]

0.50
2
0
0
Wednesday Poster Session
[998]

Learning Goals From Failure

Dave Epstein, Carl Vondrick

We introduce a framework that predicts the goals behind observable human action in video. [Expand]

0.50
1
0
1
Wednesday Poster Session
[999]

Fair Feature Distillation for Visual Recognition

Sangwon Jung, Donggyu Lee, Taeeon Park, Taesup Moon

Fairness is becoming an increasingly crucial issue for computer vision, especially in the human-related decision systems. [Expand]

0.50
0
1
0
Thursday Poster Session
[1000]

How To Exploit the Transferability of Learned Image Compression to Conventional Codecs

Jan P. Klopp, Keng-Chi Liu, Liang-Gee Chen, Shao-Yi Chien

Lossy image compression is often limited by the simplicity of the chosen loss measure. [Expand]

0.50
0
0
2
Friday Poster Session
[1001]

Restore From Restored: Video Restoration With Pseudo Clean Video

Seunghwan Lee, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim

In this study, we propose a self-supervised video denoising method called "restore-from-restored." This method fine-tunes a pre-trained network by using a pseudo clean video during the test phase. [Expand]

0.50
1
0
1
Tuesday Poster Session
[1002]

Railroad Is Not a Train: Saliency As Pseudo-Pixel Supervision for Weakly Supervised Semantic Segmentation

Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim

Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects. [Expand]

0.50
2
0
0
Tuesday Poster Session
[1003]

DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes With Biharmonic Coordinates

Minghua Liu, Minhyuk Sung, Radomir Mech, Hao Su

We propose DeepMetaHandles, a 3D conditional generative model based on mesh deformation. [Expand]

0.50
0
0
2
Monday Poster Session
[1004]

Anchor-Constrained Viterbi for Set-Supervised Action Segmentation

Jun Li, Sinisa Todorovic

This paper is about action segmentation under weak supervision in training, where the ground truth provides only a set of actions present, but neither their temporal ordering nor when they occur in a training video. [Expand]

0.50
0
0
2
Wednesday Poster Session
[1005]

Continuous Face Aging via Self-Estimated Residual Age Embedding

Zeqi Li, Ruowei Jiang, Parham Aarabi

Face synthesis, including face aging, in particular, has been one of the major topics that witnessed a substantial improvement in image fidelity by using generative adversarial networks (GANs). [Expand]

0.50
1
0
1
Thursday Poster Session
[1006]

HCRF-Flow: Scene Flow From Point Clouds With Continuous High-Order CRFs and Position-Aware Flow Embedding

Ruibo Li, Guosheng Lin, Tong He, Fayao Liu, Chunhua Shen

Scene flow in 3D point clouds plays an important role in understanding dynamic environments. [Expand]

0.50
2
0
0
Monday Poster Session
[1007]

Context Modeling in 3D Human Pose Estimation: A Unified Perspective

Xiaoxuan Ma, Jiajun Su, Chunyu Wang, Hai Ci, Yizhou Wang

Estimating 3D human pose from a single image suffers from severe ambiguity since multiple 3D joint configurations may have the same 2D projection. [Expand]

0.50
1
0
1
Tuesday Poster Session
[1008]

Lipstick Ain't Enough: Beyond Color Matching for In-the-Wild Makeup Transfer

Thao Nguyen, Anh Tuan Tran, Minh Hoai

Makeup transfer is the task of applying on a source face the makeup style from a reference image. [Expand]

0.50
2
0
0
Thursday Poster Session
[1009]

Lifelong Person Re-Identification via Adaptive Knowledge Accumulation

Nan Pu, Wei Chen, Yu Liu, Erwin M. Bakker, Michael S. Lew

Person ReID methods always learn through a stationary domain that is fixed by the choice of a given dataset. [Expand]

0.50
0
1
0
Wednesday Poster Session
[1010]

PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation

Tal Reiss, Niv Cohen, Liron Bergman, Yedid Hoshen

Anomaly detection methods require high-quality features. [Expand]

0.50
2
0
0
Monday Poster Session
[1011]

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Delian Ruan, Yan Yan, Shenqi Lai, Zhenhua Chai, Chunhua Shen, Hanzi Wang

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. [Expand]

0.50
2
0
0
Wednesday Poster Session
[1012]

Improved Handling of Motion Blur in Online Object Detection

Mohamed Sayed, Gabriel Brostow

We wish to detect specific categories of objects, for online vision systems that will run in the real world. [Expand]

0.50
0
0
2
Monday Poster Session
[1013]

Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution

Baoli Sun, Xinchen Ye, Baopu Li, Haojie Li, Zhihui Wang, Rui Xu

Existing color-guided depth super-resolution (DSR) approaches require paired RGB-D data as training examples where the RGB image is used as structural guidance to recover the degraded depth map due to their geometrical similarity. [Expand]

0.50
0
1
0
Wednesday Poster Session
[1014]

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun

As self-driving systems become better, simulating scenarios where the autonomy stack may fail becomes more important. [Expand]

0.50
2
0
0
Wednesday Poster Session
[1015]

Image Inpainting With External-Internal Learning and Monochromic Bottleneck

Tengfei Wang, Hao Ouyang, Qifeng Chen

Although recent inpainting approaches have demonstrated significant improvement with deep neural networks, they still suffer from artifacts such as blunt structures and abrupt colors when filling in the missing regions. [Expand]

0.50
2
0
0
Tuesday Poster Session
[1016]

Multiple Object Tracking With Correlation Learning

Qiang Wang, Yun Zheng, Pan Pan, Yinghui Xu

Recent works have shown that convolutional networks have substantially improved the performance of multiple object tracking by simultaneously learning detection and appearance features. [Expand]

0.50
0
0
2
Tuesday Poster Session
[1017]

Invertible Image Signal Processing

Yazhou Xing, Zian Qian, Qifeng Chen

Unprocessed RAW data is a highly valuable image format for image editing and computer vision. [Expand]

0.50
0
0
2
Tuesday Poster Session
[1018]

Open-Book Video Captioning With Retrieve-Copy-Generate Network

Ziqi Zhang, Zhongang Qi, Chunfeng Yuan, Ying Shan, Bing Li, Ying Deng, Weiming Hu

In this paper, we convert traditional video captioning task into a new paradigm, i.e., Open-book Video Captioning, which generates natural language under the prompts of video-content-relevant sentences, not limited to the video itself. [Expand]

0.50
1
0
1
Wednesday Poster Session
[1019]

MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

Ayan Kumar Bhunia, Shuvozit Ghose, Amandeep Kumar, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Handwritten Text Recognition (HTR) remains a challenging problem to date, largely due to the varying writing styles that exist amongst us. [Expand]

0.25
1
0
0
Friday Poster Session
[1020]

Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

Yu Cheng, Bo Wang, Bo Yang, Robby T. Tan

In monocular video 3D multi-person pose estimation, inter-person occlusion and close interactions can cause human detection to be erroneous and human-joints grouping to be unreliable. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1021]

Contrastive Neural Architecture Search With Neural Architecture Comparators

Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, Yaowei Wang, Mingkui Tan

One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1022]

Efficient Object Embedding for Spliced Image Retrieval

Bor-Chun Chen, Zuxuan Wu, Larry S. Davis, Ser-Nam Lim

Detecting spliced images is one of the emerging challenges in computer vision. [Expand]

0.25
1
0
0
Thursday Poster Session
[1023]

One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

Minghao Chen, Jianlong Fu, Haibin Ling

Despite remarkable progress achieved, most neural architecture search (NAS) methods focus on searching for one single accurate and robust architecture. [Expand]

0.25
1
0
0
Friday Poster Session
[1024]

Robust Representation Learning With Feedback for Single Image Deraining

Chenghao Chen, Hao Li

A deraining network can be interpreted as a conditional generator that aims at removing rain streaks from image. [Expand]

0.25
0
0
1
Wednesday Poster Session
[1025]

Scale-Aware Automatic Augmentation for Object Detection

Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

We propose Scale-aware AutoAug to learn data augmentation policies for object detection. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1026]

Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging

Ilya Chugunov, Seung-Hwan Baek, Qiang Fu, Wolfgang Heidrich, Felix Heide

We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1027]

Diverse Branch Block: Building a Convolution as an Inception-Like Unit

Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding

We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1028]

Deep Graph Matching Under Quadratic Constraint

Quankai Gao, Fudong Wang, Nan Xue, Jin-Gang Yu, Gui-Song Xia

Recently, deep learning based methods have demonstrated promising results on the graph matching problem, by relying on the descriptive capability of deep features extracted on graph nodes. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1029]

SSAN: Separable Self-Attention Network for Video Representation Learning

Xudong Guo, Xun Guo, Yan Lu

Self-attention has been successfully applied to video representation learning due to the effectiveness of modeling long range dependencies. [Expand]

0.25
1
0
0
Thursday Poster Session
[1030]

Capsule Network Is Not More Robust Than Convolutional Network

Jindong Gu, Volker Tresp, Han Hu

The Capsule Network is widely believed to be more robust than Convolutional Networks. [Expand]

0.25
1
0
0
Thursday Poster Session
[1031]

Towards Fast and Accurate Real-World Depth Super-Resolution: Benchmark Dataset and Baseline

Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, Yao Zhao

Depth maps obtained by commercial depth sensors are always in low-resolution, making it difficult to be used in various computer vision tasks. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1032]

Transformation Driven Visual Reasoning

Xin Hong, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng

This paper defines a new visual reasoning paradigm by introducing an important factor, i.e. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1033]

Affordance Transfer Learning for Human-Object Interaction Detection

Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, Dacheng Tao

Reasoning the human-object interactions (HOI) is essential for deeper scene understanding, while object affordances (or functionalities) are of great importance for human to discover unseen HOIs with novel objects. [Expand]

0.25
0
0
1
Monday Poster Session
[1034]

DARCNN: Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

Joy Hsu, Wah Chiu, Serena Yeung

In the biomedical domain, there is an abundance of dense, complex data where objects of interest may be challenging to detect or constrained by limits of human knowledge. [Expand]

0.25
1
0
0
Monday Poster Session
[1035]

FVC: A New Framework Towards Deep Video Compression in Feature Space

Zhihao Hu, Guo Lu, Dong Xu

Learning based video compression attracts increasing attention in the past few years. [Expand]

0.25
1
0
0
Monday Poster Session
[1036]

SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction From Video Data

Yuan-Ting Hu, Jiahong Wang, Raymond A. Yeh, Alexander G. Schwing

Extracting detailed 3D information of objects from video data is an important goal for holistic scene understanding. [Expand]

0.25
1
0
0
Monday Poster Session
[1037]

MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking

Jennifer Jang, Heinrich Jiang

MeanShift is a popular mode-seeking clustering algorithm used in a wide range of applications in machine learning. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1038]

LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents

ByeoungDo Kim, Seong Hyeon Park, Seokhwan Lee, Elbek Khoshimjonov, Dongsuk Kum, Junsoo Kim, Jeong Soo Kim, Jun Won Choi

In this paper, we address the problem of predicting the future motion of a dynamic agent (called a target agent) given its current and past states as well as the information on its environment. [Expand]

0.25
1
0
0
Thursday Poster Session
[1039]

SIPSA-Net: Shift-Invariant Pan Sharpening With Moving Object Alignment for Satellite Imagery

Jaehyup Lee, Soomin Seo, Munchurl Kim

Pan-sharpening is a process of merging a high-resolution (HR) panchromatic (PAN) image and its corresponding low-resolution (LR) multi-spectral (MS) image to create an HR-MS and pan-sharpened image. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1040]

Flow-Based Kernel Prior With Application to Blind Super-Resolution

Jingyun Liang, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte

Kernel estimation is generally one of the key problems for blind image super-resolution (SR). [Expand]

0.25
1
0
0
Wednesday Poster Session
[1041]

OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

Tingting Liang, Yongtao Wang, Zhi Tang, Guosheng Hu, Haibin Ling

Recently, neural architecture search (NAS) has been exploited to design feature pyramid networks (FPNs) and achieved promising results for visual object detection. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1042]

Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation

Dohun Lim, Hyeonseok Lee, Sungchan Kim

We present a novel method for reliably explaining the predictions of neural networks. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1043]

Region-Aware Adaptive Instance Normalization for Image Harmonization

Jun Ling, Han Xue, Li Song, Rong Xie, Xiao Gu

Image composition plays a common but important role in photo editing. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1044]

Scene-Intuitive Agent for Remote Embodied Visual Grounding

Xiangru Lin, Guanbin Li, Yizhou Yu

Humans learn from life events to form intuitions towards the understanding of visual environments and languages. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1045]

From Shadow Generation To Shadow Removal

Zhihao Liu, Hui Yin, Xinyi Wu, Zhenyao Wu, Yang Mi, Song Wang

Shadow removal is a computer-vision task that aims to restore the image content in shadow regions. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1046]

Fully Convolutional Scene Graph Generation

Hengyue Liu, Ning Yan, Masood Mortazavi, Bir Bhanu

This paper presents a fully convolutional scene graph generation (FCSGG) model that detects objects and relations simultaneously. [Expand]

0.25
1
0
0
Thursday Poster Session
[1047]

No Frame Left Behind: Full Video Action Recognition

Xin Liu, Silvia L. Pintea, Fatemeh Karimi Nejadasl, Olaf Booij, Jan C. van Gemert

Not all video frames are equally informative for recognizing an action. [Expand]

0.25
1
0
0
Thursday Poster Session
[1048]

Towards Unified Surgical Skill Assessment

Daochang Liu, Qiyue Li, Tingting Jiang, Yizhou Wang, Rulin Miao, Fei Shan, Ziyu Li

Surgical skills have a great influence on surgical safety and patients' well-being. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1049]

Causal Hidden Markov Model for Time Series Disease Forecasting

Jing Li, Botong Wu, Xinwei Sun, Yizhou Wang

We propose a causal hidden Markov model to achieve robust prediction of irreversible disease at an early stage, which is safety-critical and vital for medical treatment in early stages. [Expand]

0.25
0
0
1
Thursday Poster Session
[1050]

Exploring intermediate representation for monocular vehicle pose estimation

Shichao Li, Zengqiang Yan, Hongyang Li, Kwang-Ting Cheng

We present a new learning-based framework to recover vehicle pose in SO(3) from a single RGB image. [Expand]

0.25
0
0
1
Monday Poster Session
[1051]

DeepI2P: Image-to-Point Cloud Registration via Deep Classification

Jiaxin Li, Gim Hee Lee

This paper presents DeepI2P: a novel approach for cross-modality registration between an image and a point cloud. [Expand]

0.25
1
0
0
Friday Poster Session
[1052]

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

Zhichao Li, Feng Wang, Naiyan Wang

LiDAR-based 3D detection in point cloud is essential in the perception system of autonomous driving. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1053]

Generalizing Face Forgery Detection With High-Frequency Features

Yuchen Luo, Yong Zhang, Junchi Yan, Wei Liu

Current face forgery detection methods achieve high accuracy under the within-database scenario where training and testing forgeries are synthesized by the same algorithm. [Expand]

0.25
1
0
0
Friday Poster Session
[1054]

Self-Supervised Pillar Motion Learning for Autonomous Driving

Chenxu Luo, Xiaodong Yang, Alan Yuille

Autonomous driving can benefit from motion behavior comprehension when interacting with diverse traffic participants in highly dynamic environments. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1055]

Learning Semantic Person Image Generation by Region-Adaptive Normalization

Zhengyao Lv, Xiaoming Li, Xin Li, Fu Li, Tianwei Lin, Dongliang He, Wangmeng Zuo

Human pose transfer has received great attention due to its wide applications, yet is still a challenging task that is not well solved. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1056]

FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions

Weian Mao, Zhi Tian, Xinlong Wang, Chunhua Shen

We propose a fully convolutional multi-person pose estimation framework using dynamic instance-aware convolutions, termed FCPose. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1057]

Polygonal Point Set Tracking

Gunhee Nam, Miran Heo, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim

In this paper, we propose a novel learning-based polygonal point set tracking method. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1058]

Reducing Domain Gap by Reducing Style Bias

Hyeonseob Nam, HyunJae Lee, Jongchan Park, Wonjun Yoon, Donggeun Yoo

Convolutional Neural Networks (CNNs) often fail to maintain their performance when they confront new test domains, which is known as the problem of domain shift. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1059]

House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects

Nelson Nauata, Sepidehsadat Hosseini, Kai-Hung Chang, Hang Chu, Chin-Yi Cheng, Yasutaka Furukawa

This paper proposes a generative adversarial layout refinement network for automated floorplan generation. [Expand]

0.25
0
0
1
Thursday Poster Session
[1060]

Hyperdimensional Computing as a Framework for Systematic Aggregation of Image Descriptors

Peer Neubert, Stefan Schubert

Image and video descriptors are an omnipresent tool in computer vision and its application fields like mobile robotics. [Expand]

0.25
1
0
0
Friday Poster Session
[1061]

Bridge To Answer: Structure-Aware Graph Interaction Network for Video Question Answering

Jungin Park, Jiyoung Lee, Kwanghoon Sohn

This paper presents a novel method, termed Bridge to Answer, to infer correct answers for questions about a given video by leveraging adequate graph interactions of heterogeneous crossmodal graphs. [Expand]

0.25
1
0
0
Thursday Poster Session
[1062]

VoxelContext-Net: An Octree Based Framework for Point Cloud Compression

Zizheng Que, Guo Lu, Dong Xu

In this paper, we propose a two-stage deep learning framework called VoxelContext-Net for both static and dynamic point cloud compression. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1063]

Self-Supervised Collision Handling via Generative 3D Garment Models for Virtual Try-On

Igor Santesteban, Nils Thuerey, Miguel A. Otaduy, Dan Casas

We propose a new generative model for 3D garment deformations that enables us to learn, for the first time, a data-driven method for virtual try-on that effectively addresses garment-body collisions. [Expand]

0.25
1
0
0
Thursday Poster Session
[1064]

Single Pair Cross-Modality Super Resolution

Guy Shacht, Dov Danon, Sharon Fogel, Daniel Cohen-Or

Non-visual imaging sensors are widely used in the industry for different purposes. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1065]

Learning To Segment Actions From Visual and Language Instructions via Differentiable Weak Sequence Alignment

Yuhan Shen, Lu Wang, Ehsan Elhamifar

We address the problem of unsupervised localization of key-steps and feature learning in instructional videos using both visual and language instructions. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1066]

SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction

Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua

Pedestrian trajectory prediction is a key technology in autopilot, which remains to be very challenging due to complex interactions between pedestrians. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1067]

Towards Diverse Paragraph Captioning for Untrimmed Videos

Yuqing Song, Shizhe Chen, Qin Jin

Video paragraph captioning aims to describe multiple events in untrimmed videos with descriptive paragraphs. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1068]

Tracking Pedestrian Heads in Dense Crowd

Ramana Sundararaman, Cedric De Almeida Braga, Eric Marchand, Julien Pettre

Tracking humans in crowded video sequences is an important constituent of visual scene understanding. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1069]

Dynamic Metric Learning: Towards a Scalable Metric Space To Accommodate Multiple Semantic Scales

Yifan Sun, Yuke Zhu, Yuhan Zhang, Pengkun Zheng, Xi Qiu, Chi Zhang, Yichen Wei

This paper introduces a new fundamental characteristic, i.e., the dynamic range, from real-world metric tools to deep visual recognition. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1070]

Improving the Efficiency and Robustness of Deepfakes Detection Through Precise Geometric Features

Zekun Sun, Yujie Han, Zeyu Hua, Na Ruan, Weijia Jia

Deepfakes is a branch of malicious techniques that transplant a target face to the original one in videos, resulting in serious problems such as infringement of copyright, confusion of information, or even public panic. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1071]

Tangent Space Backpropagation for 3D Transformation Groups

Zachary Teed, Jia Deng

We address the problem of performing backpropagation for computation graphs involving 3D transformation groups SO(3), SE(3), and Sim(3). [Expand]

0.25
0
0
1
Wednesday Poster Session
[1072]

Unsupervised Object Detection With LIDAR Clues

Hao Tian, Yuntao Chen, Jifeng Dai, Zhaoxiang Zhang, Xizhou Zhu

Despite the importance of unsupervised object detection, to the best of our knowledge, there is no previous work addressing this problem. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1073]

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization

Aysim Toker, Qunjie Zhou, Maxim Maximov, Laura Leal-Taixe

The goal of cross-view image based geo-localization is to determine the location of a given street view image by matching it against a collection of geo-tagged satellite images. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1074]

There Is More Than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking With Sound by Distilling Multimodal Knowledge

Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada

Attributes of sound inherent to objects can provide valuable cues to learn rich representations for object detection and tracking. [Expand]

0.25
1
0
0
Thursday Poster Session
[1075]

CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

Noranart Vesdapunt, Baoyuan Wang

Face detection is a fundamental problem for many downstream face applications, and there is a rising demand for faster, more accurate face detectors that also support higher resolutions. [Expand]

0.25
1
0
0
Monday Poster Session
[1076]

Implicit Feature Alignment: Learn To Convert Text Recognizer to Text Spotter

Tianwei Wang, Yuanzhi Zhu, Lianwen Jin, Dezhi Peng, Zhe Li, Mengchao He, Yongpan Wang, Canjie Luo

Text recognition is a popular research subject with many associated challenges. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1077]

Learning Fine-Grained Segmentation of 3D Shapes Without Part Labels

Xiaogang Wang, Xun Sun, Xinyu Cao, Kai Xu, Bin Zhou

Existing learning-based approaches to 3D shape segmentation usually formulate it as a semantic labeling problem, assuming that all parts of training shapes are annotated with a given set of labels. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1078]

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization

Guangming Wang, Xinrui Wu, Zhe Liu, Hesheng Wang

A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, using hierarchical embedding mask optimization is proposed in this paper. [Expand]

0.25
1
0
0
Friday Poster Session
[1079]

Scene-Aware Generative Network for Human Motion Synthesis

Jingbo Wang, Sijie Yan, Bo Dai, Dahua Lin

We revisit human motion synthesis, a task useful in various real-world applications, in this paper. [Expand]

0.25
1
0
0
Thursday Poster Session
[1080]

TDN: Temporal Difference Networks for Efficient Action Recognition

Limin Wang, Zhan Tong, Bin Ji, Gangshan Wu

Temporal modeling still remains challenging for action recognition in videos. [Expand]

0.25
0
0
1
Monday Poster Session
[1081]

Training Networks in Null Space of Feature Covariance for Continual Learning

Shipeng Wang, Xiaorong Li, Jian Sun, Zongben Xu

In the setting of continual learning, a network is trained on a sequence of tasks, and suffers from catastrophic forgetting. [Expand]

0.25
0
0
1
Monday Poster Session
[1082]

Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images

Xinggang Wang, Jiapei Feng, Bin Hu, Qi Ding, Longjin Ran, Xiaoxin Chen, Wenyu Liu

Humans have a strong class-agnostic object segmentation ability and can outline boundaries of unknown objects precisely, which motivates us to propose a box-supervised class-agnostic object segmentation (BoxCaseg) based solution for weakly-supervised instance segmentation. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1083]

Unsupervised Degradation Representation Learning for Blind Super-Resolution

Longguang Wang, Yingqian Wang, Xiaoyu Dong, Qingyu Xu, Jungang Yang, Wei An, Yulan Guo

Most existing CNN-based super-resolution (SR) methods are developed based on an assumption that the degradation is fixed and known (e.g., bicubic downsampling). [Expand]

0.25
1
0
0
Wednesday Poster Session
[1084]

Forecasting Irreversible Disease via Progression Learning

Botong Wu, Sijie Ren, Jing Li, Xinwei Sun, Shi-Ming Li, Yizhou Wang

Forecasting Parapapillary atrophy (PPA), i.e., a symptom related to most irreversible eye diseases, provides an alarm for implementing an intervention to slow down the disease progression at early stage. [Expand]

0.25
0
0
1
Wednesday Poster Session
[1085]

SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences

Shun-Cheng Wu, Johanna Wald, Keisuke Tateno, Nassir Navab, Federico Tombari

Scene graphs are a compact and explicit representation successfully used in a variety of 2D scene understanding tasks. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1086]

A Dual Iterative Refinement Method for Non-Rigid Shape Matching

Rui Xiang, Rongjie Lai, Hongkai Zhao

In this work, a robust and efficient dual iterative refinement (DIR) method is proposed for dense correspondence between two nearly isometric shapes. [Expand]

0.25
0
0
1
Friday Poster Session
[1087]

Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

Zhihao Xia, Michael Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti

We introduce a neural network-based method to denoise pairs of images taken in quick succession in low-light environments, with and without a flash. [Expand]

0.25
0
0
1
Monday Poster Session
[1088]

DG-Font: Deformable Generative Networks for Unsupervised Font Generation

Yangchen Xie, Xinyuan Chen, Li Sun, Yue Lu

Font generation is a challenging problem especially for some writing systems that consist of a large number of characters and has attracted a lot of attention in recent years. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1089]

Graph Stacked Hourglass Networks for 3D Human Pose Estimation

Tianhan Xu, Wataru Takano

In this paper, we propose a novel graph convolutional network architecture, Graph Stacked Hourglass Networks, for 2D-to-3D human pose estimation tasks. [Expand]

0.25
1
0
0
Friday Poster Session
[1090]

Layout-Guided Novel View Synthesis From a Single Indoor Panorama

Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao

Existing view synthesis methods mainly focus on the perspective images and have shown promising results. [Expand]

0.25
1
0
0
Friday Poster Session
[1091]

Learning Dynamic Alignment via Meta-Filter for Few-Shot Learning

Chengming Xu, Yanwei Fu, Chen Liu, Chengjie Wang, Jilin Li, Feiyue Huang, Li Zhang, Xiangyang Xue

Few-shot learning (FSL), which aims to recognise new classes by adapting the learned knowledge with extremely limited few-shot (support) examples, remains an important open problem in computer vision. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1092]

Linear Semantics in Generative Adversarial Networks

Jianjin Xu, Changxi Zheng

Generative Adversarial Networks (GANs) are able to generate high-quality images, but it remains difficult to explicitly specify the semantics of synthesized images. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1093]

Temporal Modulation Network for Controllable Space-Time Video Super-Resolution

Gang Xu, Jun Xu, Zhen Li, Liang Wang, Xing Sun, Ming-Ming Cheng

Space-time video super-resolution (STVSR) aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos. [Expand]

0.25
0
0
1
Tuesday Poster Session
[1094]

3D-MAN: 3D Multi-Frame Attention Network for Object Detection

Zetong Yang, Yin Zhou, Zhifeng Chen, Jiquan Ngiam

3D object detection is an important module in autonomous driving and robotics. [Expand]

0.25
1
0
0
Monday Poster Session
[1095]

KSM: Fast Multiple Task Adaption via Kernel-Wise Soft Mask Learning

Li Yang, Zhezhi He, Junshan Zhang, Deliang Fan

Deep Neural Networks (DNN) could forget the knowledge about earlier tasks when learning new tasks, and this is known as catastrophic forgetting. [Expand]

0.25
1
0
0
Thursday Poster Session
[1096]

NetAdaptV2: Efficient Neural Architecture Search With Fast Super-Network Training and Architecture Optimization

Tien-Ju Yang, Yi-Lun Liao, Vivienne Sze

Neural architecture search (NAS) typically consists of three main steps: training a super-network, training and evaluating sampled deep neural networks (DNNs), and training the discovered DNN. [Expand]

0.25
1
0
0
Monday Poster Session
[1097]

Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

Gengcong Yang, Jingyi Zhang, Yong Zhang, Baoyuan Wu, Yujiu Yang

To generate "accurate" scene graphs, almost all existing methods predict pairwise relationships in a deterministic manner. [Expand]

0.25
0
0
1
Thursday Poster Session
[1098]

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

Mingyu Yin, Li Sun, Qingli Li

View synthesis is usually done by an autoencoder, in which the encoder maps a source view image into a latent content code, and the decoder transforms it into a target view image according to the condition. [Expand]

0.25
1
0
0
Wednesday Poster Session
[1099]

Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed Hierarchical Tucker Structure

Miao Yin, Siyu Liao, Xiao-Yang Liu, Xiaodong Wang, Bo Yuan

Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. [Expand]

0.25
1
0
0
Thursday Poster Session
[1100]

Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search

Kaicheng Yu, Rene Ranftl, Mathieu Salzmann

Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware. [Expand]

0.25
1
0
0
Thursday Poster Session
[1101]

Real-Time Selfie Video Stabilization

Jiyang Yu, Ravi Ramamoorthi, Keli Cheng, Michel Sarkis, Ning Bi

We propose a novel real-time selfie video stabilization method. [Expand]

0.25
1
0
0
Thursday Poster Session
[1102]

Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy

Zikai Zhang, Bineng Zhong, Shengping Zhang, Zhenjun Tang, Xin Liu, Zhaoxiang Zhang

A practical long-term tracker typically contains three key properties, i.e., an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism. [Expand]

0.25
0
0
1
Monday Poster Session
[1103]

Domain-Robust VQA With Diverse Datasets and Methods but No Target Labels

Mingda Zhang, Tristan Maidment, Ahmad Diab, Adriana Kovashka, Rebecca Hwa

The observation that computer vision methods overfit to dataset specifics has inspired diverse attempts to make object recognition models robust to domain shifts. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1104]

Event-Based Synthetic Aperture Imaging With a Hybrid Network

Xiang Zhang, Wei Liao, Lei Yu, Wen Yang, Gui-Song Xia

Synthetic aperture imaging (SAI) is able to achieve the see through effect by blurring out the off-focus foreground occlusions and reconstructing the in-focus occluded targets from multi-view images. [Expand]

0.25
0
0
1
Thursday Poster Session
[1105]

View-Guided Point Cloud Completion

Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao

This paper presents a view-guided solution for the task of point cloud completion. [Expand]

0.25
1
0
0
Friday Poster Session
[1106]

Zero-Shot Instance Segmentation

Ye Zheng, Jiahong Wu, Yongqiang Qin, Faen Zhang, Li Cui

Deep learning has significantly improved the precision of instance segmentation with abundant labeled data. [Expand]

0.25
1
0
0
Monday Poster Session
[1107]

VIGOR: Cross-View Image Geo-Localization Beyond One-to-One Retrieval

Sijie Zhu, Taojiannan Yang, Chen Chen

Cross-view image geo-localization aims to determine the locations of street-view query images by matching with GPS-tagged reference images from aerial view. [Expand]

0.25
1
0
0
Tuesday Poster Session
[1108]

Leveraging the Availability of Two Cameras for Illuminant Estimation

Abdelrahman Abdelhamed, Abhijith Punnappurath, Michael S. Brown

Most modern smartphones are now equipped with two rear-facing cameras -- a main camera for standard imaging and an additional camera to provide wide-angle or telephoto zoom capabilities. [Expand]

0.00
Tuesday Poster Session
[1109]

RPSRNet: End-to-End Trainable Rigid Point Set Registration Network Using Barnes-Hut 2D-Tree Representation

Sk Aziz Ali, Kerem Kahraman, Gerd Reis, Didier Stricker

We propose RPSRNet - a novel end-to-end trainable deep neural network for rigid point set registration. [Expand]

0.00
Thursday Poster Session
[1110]

Understanding and Simplifying Perceptual Distances

Dan Amir, Yair Weiss

Perceptual metrics based on features of deep Convolutional Neural Networks (CNNs) have shown remarkable success when used as loss functions in a range of computer vision problems and significantly outperform classical losses such as L1 or L2 in pixel space. [Expand]

0.00
Thursday Poster Session
[1111]

Learning Deep Latent Variable Models by Short-Run MCMC Inference With Optimal Transport Correction

Dongsheng An, Jianwen Xie, Ping Li

Learning latent variable models with deep top-down architectures typically requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. [Expand]

0.00
Thursday Poster Session
[1112]

Adversarial Robustness Across Representation Spaces

Pranjal Awasthi, George Yu, Chun-Sung Ferng, Andrew Tomkins, Da-Cheng Juan

Adversarial robustness corresponds to the susceptibility of deep neural networks to imperceptible perturbations made at test time. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1113]

GMOT-40: A Benchmark for Generic Multiple Object Tracking

Hexin Bai, Wensheng Cheng, Peng Chu, Juehuan Liu, Kai Zhang, Haibin Ling

Multiple Object Tracking (MOT) has witnessed remarkable advances in recent years. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1114]

Learning Scalable ℓ∞-Constrained Near-Lossless Image Compression via Joint Lossy Image and Residual Compression

Yuanchao Bai, Xianming Liu, Wangmeng Zuo, Yaowei Wang, Xiangyang Ji

We propose a novel joint lossy image and residual compression framework for learning ℓ∞-constrained near-lossless image compression. [Expand]

0.00
Thursday Poster Session
[1115]

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

Zechen Bai, Zhigang Wang, Jian Wang, Di Hu, Errui Ding

Unsupervised domain adaptation (UDA) methods for person re-identification (re-ID) aim at transferring re-ID knowledge from labeled source data to unlabeled target data. [Expand]

0.00
Thursday Poster Session
[1116]

Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers

Apratim Bhattacharyya, Daniel Olmeda Reino, Mario Fritz, Bernt Schiele

Accurate prediction of pedestrian and bicyclist paths is integral to the development of reliable autonomous vehicles in dense urban environments. [Expand]

0.00
Tuesday Poster Session
[1117]

Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions

Navaneeth Bodla, Gaurav Shrivastava, Rama Chellappa, Abhinav Shrivastava

Learning to model and predict how humans interact with objects while performing an action is challenging, and most of the existing video prediction models are ineffective in modeling complicated human-object interactions. [Expand]

0.00
Thursday Poster Session
[1118]

Understanding Object Dynamics for Interactive Image-to-Video Synthesis

Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Bjorn Ommer

What would be the effect of locally poking a static scene? We present an approach that learns naturally-looking global articulations caused by a local manipulation at a pixel level. [Expand]

0.00
Tuesday Poster Session
[1119]

OCONet: Image Extrapolation by Object Completion

Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, Ramin Zabih

Image extrapolation extends an input image beyond the originally-captured field of view. [Expand]

0.00
Monday Poster Session
[1120]

Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning

Liu Bo, Qiulei Dong, Zhanyi Hu

Transductive zero-shot learning (T-ZSL) which could alleviate the domain shift problem in existing ZSL works, has received much attention recently. [Expand]

0.00
Friday Poster Session
[1121]

GAIA: A Transfer Learning System of Object Detection That Fits Your Needs

Xingyuan Bu, Junran Peng, Junjie Yan, Tieniu Tan, Zhaoxiang Zhang

Transfer learning with pre-training on large-scale datasets has played an increasingly significant role in computer vision and natural language processing recently. [Expand]

0.00
Monday Poster Session
[1122]

Rethinking Graph Neural Architecture Search From Message-Passing

Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang

Graph neural networks (GNNs) emerged recently as a standard toolkit for learning from data on graphs. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1123]

Revisiting Superpixels for Active Learning in Semantic Segmentation With Realistic Annotation Costs

Lile Cai, Xun Xu, Jun Hao Liew, Chuan Sheng Foo

State-of-the-art methods for semantic segmentation are based on deep neural networks that are known to be data-hungry. [Expand]

0.00
Wednesday Poster Session
[1124]

Debiased Subjective Assessment of Real-World Image Enhancement

Peibei Cao, Zhangyang Wang, Kede Ma

In real-world image enhancement, it is often challenging (if not impossible) to acquire ground-truth data, preventing the adoption of distance metrics for objective quality assessment. [Expand]

0.00
Monday Poster Session
[1125]

Normal Integration via Inverse Plane Fitting With Minimum Point-to-Plane Distance

Xu Cao, Boxin Shi, Fumio Okura, Yasuyuki Matsushita

This paper presents a surface normal integration method that solves an inverse problem of local plane fitting. [Expand]

0.00
Monday Poster Session
[1126]

To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels

Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, Vijay Vasudevan, Xiao Zhang, Dragomir Anguelov

3D object detection is vital for many robotics applications. [Expand]

0.00
Friday Poster Session
[1127]

Adaptive Convolutions for Structure-Aware Style Transfer

Prashanth Chandran, Gaspard Zoss, Paulo Gotardo, Markus Gross, Derek Bradley

Style transfer between images is an artistic application of CNNs, where the 'style' of one image is transferred onto another image while preserving the latter's content. [Expand]

0.00
Wednesday Poster Session
[1128]

Deep Perceptual Preprocessing for Video Coding

Aaron Chadha, Yiannis Andreopoulos

We introduce the concept of rate-aware deep perceptual preprocessing (DPP) for video encoding. [Expand]

0.00
Thursday Poster Session
[1129]

Your "Flamingo" is My "Bird": Fine-Grained, or Not

Dongliang Chang, Kaiyue Pang, Yixiao Zheng, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Whether what you see in Figure 1 is a "flamingo" or a "bird", is the question we ask in this paper. [Expand]

0.00
Thursday Poster Session
[1130]

Learning Discriminative Prototypes With Dynamic Time Warping

Xiaobin Chang, Frederick Tung, Greg Mori

Dynamic Time Warping (DTW) is widely used for temporal data processing. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1131]

Towards Robust Classification Model by Counterfactual and Invariant Data Generation

Chun-Hao Chang, George Alexandru Adam, Anna Goldenberg

Despite the success of machine learning applications in science, industry, and society in general, many approaches are known to be non-robust, often relying on spurious correlations to make predictions. [Expand]

0.00
Thursday Poster Session
[1132]

Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection

Jiacheng Cheng, Nuno Vasconcelos

The problem of novelty detection in fine-grained visual classification (FGVC) is considered. [Expand]

0.00
Monday Poster Session
[1133]

Learning To Filter: Siamese Relation Network for Robust Tracking

Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang

Despite the great success of Siamese-based trackers, their performance under complicated scenarios is still not satisfying, especially when there are distractors. [Expand]

0.00
Tuesday Poster Session
[1134]

Light Field Super-Resolution With Zero-Shot Learning

Zhen Cheng, Zhiwei Xiong, Chang Chen, Dong Liu, Zheng-Jun Zha

Deep learning provides a new avenue for light field super-resolution (SR). [Expand]

0.00
Wednesday Poster Session
[1135]

Adaptive Image Transformer for One-Shot Object Detection

Ding-Jie Chen, He-Yen Hsieh, Tyng-Luh Liu

One-shot object detection tackles a challenging task that aims at identifying within a target image all object instances of the same class, implied by a query image patch. [Expand]

0.00
Thursday Poster Session
[1136]

Class-Aware Robust Adversarial Training for Object Detection

Pin-Chun Chen, Bo-Han Kung, Jun-Cheng Chen

Object detection is an important computer vision task with plenty of real-world applications; therefore, how to enhance its robustness against adversarial attacks has emerged as a crucial issue. [Expand]

0.00
Wednesday Poster Session
[1137]

Blind Deblurring for Saturated Images

Liang Chen, Jiawei Zhang, Songnan Lin, Faming Fang, Jimmy S. Ren

Blind deblurring has received considerable attention in recent years. [Expand]

0.00
Tuesday Poster Session
[1138]

Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation

Haoxin Chen, Hanjie Wu, Nanxuan Zhao, Sucheng Ren, Shengfeng He

This paper tackles the task of Few-Shot Video Object Segmentation (FSVOS), i.e., segmenting objects in the query videos with certain class specified in a few labeled support images. [Expand]

0.00
Thursday Poster Session
[1139]

Deep Texture Recognition via Exploiting Cross-Layer Statistical Self-Similarity

Zhile Chen, Feng Li, Yuhui Quan, Yong Xu, Hui Ji

In recent years, convolutional neural networks (CNNs) have become a prominent tool for texture recognition.

0.00
Tuesday Poster Session
[1140]

DualAST: Dual Style-Learning Networks for Artistic Style Transfer

Haibo Chen, Lei Zhao, Zhizhong Wang, Huiming Zhang, Zhiwen Zuo, Ailin Li, Wei Xing, Dongming Lu

Artistic style transfer is an image editing task that aims at repainting everyday photographs with learned artistic styles.

0.00
Monday Poster Session
[1141]

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-Shot Learning

Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma

Recently, the transductive graph-based methods have achieved great success in the few-shot classification task.

0.00
Tuesday Poster Session
[1142]

Hybrid Rotation Averaging: A Fast and Robust Rotation Averaging Approach

Yu Chen, Ji Zhao, Laurent Kneip

We address rotation averaging (RA) and its application to real-world 3D reconstruction.

0.00
0
0
0
Wednesday Poster Session
[1143]

Indoor Lighting Estimation Using an Event Camera

Zehao Chen, Qian Zheng, Peisong Niu, Huajin Tang, Gang Pan

Image-based methods for indoor lighting estimation suffer from the problem of intensity-distance ambiguity.

0.00
Thursday Poster Session
[1144]

Jigsaw Clustering for Unsupervised Visual Representation Learning

Pengguang Chen, Shu Liu, Jiaya Jia

Unsupervised representation learning with contrastive learning achieves great success recently.

0.00
Thursday Poster Session
[1145]

Learning a Non-Blind Deblurring Network for Night Blurry Images

Liang Chen, Jiawei Zhang, Jinshan Pan, Songnan Lin, Faming Fang, Jimmy S. Ren

Deblurring night blurry images is difficult, because the commonly used blur model based on the linear convolution operation does not hold in this situation due to the influence of saturated pixels.

0.00
Wednesday Poster Session
[1146]

Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification

Hao Chen, Yaohui Wang, Benoit Lagadec, Antitza Dantcheva, Francois Bremond

Recent self-supervised contrastive learning provides an effective approach for unsupervised person re-identification (ReID) by learning invariance from different views (transformed versions) of an input.

0.00
Monday Poster Session
[1147]

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

It is well acknowledged that person re-identification (person ReID) highly relies on visual texture information like clothing.

0.00
Wednesday Poster Session
[1148]

Learning Student Networks in the Wild

Hanting Chen, Tianyu Guo, Chang Xu, Wenshuo Li, Chunjing Xu, Chao Xu, Yunhe Wang

Data-free learning for student networks is a new paradigm for solving users' anxiety caused by the privacy problem of using original training data.

0.00
Tuesday Poster Session
[1149]

MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes

Zhikai Chen, Lingxi Xie, Shanmin Pang, Yong He, Bo Zhang

Deepfakes raised serious concerns on the authenticity of visual contents.

0.00
Wednesday Poster Session
[1150]

MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

Hansheng Chen, Yuyao Huang, Wei Tian, Zhong Gao, Lu Xiong

Object localization in 3D space is a challenging aspect in monocular 3D object detection.

0.00
Wednesday Poster Session
[1151]

Neural Feature Search for RGB-Infrared Person Re-Identification

Yehansen Chen, Lin Wan, Zhihang Li, Qianyan Jing, Zongyuan Sun

RGB-Infrared person re-identification (RGB-IR ReID) is a challenging cross-modality retrieval problem, which aims at matching the person-of-interest over visible and infrared camera views.

0.00
Monday Poster Session
[1152]

Perceptual Indistinguishability-Net (PI-Net): Facial Image Obfuscation With Manipulable Semantics

Jia-Wei Chen, Li-Ju Chen, Chia-Mu Yu, Chun-Shien Lu

With the growing use of camera devices, the industry has many image datasets that provide more opportunities for collaboration between the machine learning community and industry.

0.00
Tuesday Poster Session
[1153]

Pareto Self-Supervised Training for Few-Shot Learning

Zhengyu Chen, Jixie Ge, Heshen Zhan, Siteng Huang, Donglin Wang

While few-shot learning (FSL) aims for rapid generalization to new concepts with little supervision, self-supervised learning (SSL) constructs supervisory signals directly computed from unlabeled data.

0.00
Thursday Poster Session
[1154]

PSD: Principled Synthetic-to-Real Dehazing Guided by Physical Priors

Zeyuan Chen, Yangchao Wang, Yang Yang, Dong Liu

Deep learning-based methods have achieved remarkable performance for image dehazing.

0.00
Wednesday Poster Session
[1155]

S2R-DepthNet: Learning a Generalizable Depth-Specific Structural Representation

Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng

Humans can infer the 3D geometry of a scene from a sketch instead of a realistic image, which indicates that the spatial structure plays a fundamental role in understanding the depth of scenes.

0.00
Tuesday Poster Session
[1156]

Scene Text Telescope: Text-Focused Scene Image Super-Resolution

Jingye Chen, Bin Li, Xiangyang Xue

Image super-resolution, which is often regarded as a preprocessing procedure of scene text recognition, aims to recover the realistic features from a low-resolution text image.

0.00
Thursday Poster Session
[1157]

Predicting Human Scanpaths in Visual Question Answering

Xianyu Chen, Ming Jiang, Qi Zhao

Attention has been an important mechanism for both humans and computer vision systems.

0.00
Wednesday Poster Session
[1158]

Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning

Shaoxiang Chen, Yu-Gang Jiang

Dense Event Captioning (DEC) aims to jointly localize and describe multiple events of interest in untrimmed videos, which is an advancement of the conventional video captioning task (generating a single sentence description for a trimmed video).

0.00
Wednesday Poster Session
[1159]

Test-Time Fast Adaptation for Dynamic Scene Deblurring via Meta-Auxiliary Learning

Zhixiang Chi, Yang Wang, Yuanhao Yu, Jin Tang

In this paper, we tackle the problem of dynamic scene deblurring.

0.00
Wednesday Poster Session
[1160]

Feature-Level Collaboration: Joint Unsupervised Learning of Optical Flow, Stereo Depth and Camera Motion

Cheng Chi, Qingjie Wang, Tianyu Hao, Peng Guo, Xin Yang

Precise estimation of optical flow, stereo depth and camera motion are important for the real-world 3D scene understanding and visual perception.

0.00
Monday Poster Session
[1161]

Multi-Label Learning From Single Positive Labels

Elijah Cole, Oisin Mac Aodha, Titouan Lorieul, Pietro Perona, Dan Morris, Nebojsa Jojic

Predicting all applicable labels for a given image is known as multi-label classification.

0.00
Monday Poster Session
[1162]

Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation

Ze Cui, Jing Wang, Shangyin Gao, Tiansheng Guo, Yihui Feng, Bo Bai

With the development of deep learning techniques, the combination of deep learning with image compression has drawn lots of attention.

0.00
Wednesday Poster Session
[1163]

Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression

Yufei Cui, Ziquan Liu, Qiao Li, Antoni B. Chan, Chun Jason Xue

Nested networks or slimmable networks are neural networks whose architectures can be adjusted instantly during testing time, e.g., based on computational constraints.

0.00
Monday Poster Session
[1164]

Towards Accurate 3D Human Motion Prediction From Incomplete Observations

Qiongjie Cui, Huaijiang Sun

Predicting accurate and realistic future human poses from historically observed sequences is a fundamental task in the intersection of computer vision, graphics, and artificial intelligence.

0.00
Tuesday Poster Session
[1165]

Dynamic Head: Unifying Object Detection Heads With Attentions

Xiyang Dai, Yinpeng Chen, Bin Xiao, Dongdong Chen, Mengchen Liu, Lu Yuan, Lei Zhang

The complex nature of combining localization and classification in object detection has resulted in the flourished development of methods.

0.00
Wednesday Poster Session
[1166]

Learning Affinity-Aware Upsampling for Deep Image Matting

Yutong Dai, Hao Lu, Chunhua Shen

We show that learning affinity in upsampling provides an effective and efficient approach to exploit pairwise interactions in deep networks.

0.00
Tuesday Poster Session
[1167]

Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts

Steve Cruz, Will Hutchcroft, Yuguang Li, Naji Khosravan, Ivaylo Boyadzhiev, Sing Bing Kang

We present Zillow Indoor Dataset (ZInD): A large indoor dataset with 71,474 panoramas from 1,524 real unfurnished homes.

0.00
Monday Poster Session
[1168]

Progressive Contour Regression for Arbitrary-Shape Scene Text Detection

Pengwen Dai, Sanyi Zhang, Hua Zhang, Xiaochun Cao

State-of-the-art scene text detection methods usually model the text instance with local pixels or components from the bottom-up perspective and, therefore, are sensitive to noises and dependent on the complicated heuristic post-processing especially for arbitrary-shape texts.

0.00
Wednesday Poster Session
[1169]

Nearest Neighbor Matching for Deep Clustering

Zhiyuan Dang, Cheng Deng, Xu Yang, Kun Wei, Heng Huang

Deep clustering gradually becomes an important branch in unsupervised learning methods.

0.00
Thursday Poster Session
[1170]

GANmut: Learning Interpretable Conditional Space for Gamut of Emotions

Stefano d'Apolito, Danda Pani Paudel, Zhiwu Huang, Andres Romero, Luc Van Gool

Humans can communicate emotions through a plethora of facial expressions, each with its own intensity, nuances and ambiguities.

0.00
Monday Poster Session
[1171]

Deep Homography for Efficient Stereo Image Compression

Xin Deng, Wenzhe Yang, Ren Yang, Mai Xu, Enpeng Liu, Qianhan Feng, Radu Timofte

In this paper, we propose HESIC, an end-to-end trainable deep network for stereo image compression (SIC).

0.00
Monday Poster Session
[1172]

LAU-Net: Latitude Adaptive Upscaling Network for Omnidirectional Image Super-Resolution

Xin Deng, Hao Wang, Mai Xu, Yichen Guo, Yuhang Song, Li Yang

The omnidirectional images (ODIs) are usually at low-resolution, due to the constraints of collection, storage and transmission.

0.00
Wednesday Poster Session
[1173]

PML: Progressive Margin Loss for Long-Tailed Age Classification

Zongyong Deng, Hao Liu, Yaoxing Wang, Chenyang Wang, Zekuan Yu, Xuehong Sun

In this paper, we propose a progressive margin loss (PML) approach for unconstrained facial age classification.

0.00
Wednesday Poster Session
[1174]

Variational Prototype Learning for Deep Face Recognition

Jiankang Deng, Jia Guo, Jing Yang, Alexandros Lattas, Stefanos Zafeiriou

Deep face recognition has achieved remarkable improvements due to the introduction of margin-based softmax loss, in which the prototype stored in the last linear layer represents the center of each class.

0.00
Thursday Poster Session
[1175]

Sketch, Ground, and Refine: Top-Down Dense Video Captioning

Chaorui Deng, Shizhe Chen, Da Chen, Yuan He, Qi Wu

The dense video captioning task aims to detect and describe a sequence of events in a video for detailed and coherent storytelling.

0.00
Monday Poster Session
[1176]

Spatially-Invariant Style-Codes Controlled Makeup Transfer

Han Deng, Chu Han, Hongmin Cai, Guoqiang Han, Shengfeng He

Transferring makeup from the misaligned reference image is challenging.

0.00
Tuesday Poster Session
[1177]

Part-Aware Panoptic Segmentation

Daan de Geus, Panagiotis Meletis, Chenyang Lu, Xiaoxiao Wen, Gijs Dubbelman

In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing.

0.00
0
0
0
Tuesday Poster Session
[1178]

HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers

Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation.

0.00
Tuesday Poster Session
[1179]

Learning Spatially-Variant MAP Models for Non-Blind Image Deblurring

Jiangxin Dong, Stefan Roth, Bernt Schiele

The classical maximum a-posteriori (MAP) framework for non-blind image deblurring requires defining suitable data and regularization terms, whose interplay yields the desired clear image through optimization.

0.00
Tuesday Poster Session
[1180]

EventZoom: Learning To Denoise and Super Resolve Neuromorphic Events

Peiqi Duan, Zihao W. Wang, Xinyu Zhou, Yi Ma, Boxin Shi

We address the problem of jointly denoising and super resolving neuromorphic events, a novel visual signal that represents thresholded temporal gradients in a space-time window.

0.00
Thursday Poster Session
[1181]

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search

Yawen Duan, Xin Chen, Hang Xu, Zewei Chen, Xiaodan Liang, Tong Zhang, Zhenguo Li

Recent breakthroughs of Neural Architecture Search (NAS) extend the field's research scope towards a broader range of vision tasks and more diversified search spaces.

0.00
Tuesday Poster Session
[1182]

NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go

Marvin Eisenberger, David Novotny, Gael Kerchenbaum, Patrick Labatut, Natalia Neverova, Daniel Cremers, Andrea Vedaldi

We present NeuroMorph, a new neural network architecture that takes as input two 3D shapes and produces in one go, i.e. [Expand]

0.00
Wednesday Poster Session
[1183]

Self-Supervised Learning on 3D Point Clouds by Learning Discrete Generative Models

Benjamin Eckart, Wentao Yuan, Chao Liu, Jan Kautz

While recent pre-training tasks on 2D images have proven very successful for transfer learning, pre-training for 3D data remains challenging.

0.00
Wednesday Poster Session
[1184]

Dual Attention Guided Gaze Target Detection in the Wild

Yi Fang, Jiapeng Tang, Wang Shen, Wei Shen, Xiao Gu, Li Song, Guangtao Zhai

Gaze target detection aims to infer where each person in a scene is looking.

0.00
Thursday Poster Session
[1185]

Group Collaborative Learning for Co-Salient Object Detection

Qi Fan, Deng-Ping Fan, Huazhu Fu, Chi-Keung Tang, Ling Shao, Yu-Wing Tai

We present a novel group collaborative learning framework (GCNet) capable of detecting co-salient objects in real time (16ms), by simultaneously mining consensus representations at group level based on the two necessary criteria: 1) intra-group compactness to better formulate the consistency among co-salient objects by capturing their inherent shared attributes using our novel group affinity module; 2) inter-group separability to effectively suppress the influence of noisy objects on the output by introducing our new group collaborating module conditioning the inconsistent consensus.

0.00
Thursday Poster Session
[1186]

Learning Triadic Belief Dynamics in Nonverbal Communication From Videos

Lifeng Fan, Shuwen Qiu, Zilong Zheng, Tao Gao, Song-Chun Zhu, Yixin Zhu

Humans possess a unique social cognition capability; nonverbal communication can convey rich social information among agents.

0.00
0
0
0
Wednesday Poster Session
[1187]

Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos

Hehe Fan, Yi Yang, Mohan Kankanhalli

Point cloud videos exhibit irregularities and lack of order along the spatial dimension where points emerge inconsistently across different frames.

0.00
Thursday Poster Session
[1188]

Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

Masoud Faraki, Xiang Yu, Yi-Hsuan Tsai, Yumin Suh, Manmohan Chandraker

Face recognition models trained under the assumption of identical training and test distributions often suffer from poor generalization when faced with unknown variations, such as a novel ethnicity or unpredictable individual make-ups during test time.

0.00
Thursday Poster Session
[1189]

SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation

Siqi Fan, Qiulei Dong, Fenghua Zhu, Yisheng Lv, Peijun Ye, Fei-Yue Wang

How to learn effective features from large-scale point clouds for semantic segmentation has attracted increasing attention in recent years.

0.00
Thursday Poster Session
[1190]

LiDAR-Aug: A General Rendering-Based Augmentation Framework for 3D Object Detection

Jin Fang, Xinxin Zuo, Dingfu Zhou, Shengze Jin, Sen Wang, Liangjun Zhang

Annotating the LiDAR point cloud is crucial for deep learning-based 3D object detection tasks.

0.00
Tuesday Poster Session
[1191]

Semantic-Aware Video Text Detection

Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

Most existing video text detection methods track texts with appearance features, which are easily influenced by the change of perspective and illumination.

0.00
Monday Poster Session
[1192]

AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training

Mihai Fieraru, Mihai Zanfir, Silviu Cristian Pirlea, Vlad Olaru, Cristian Sminchisescu

I went to the gym today, but how well did I do? And where should I improve? Ah, my back hurts slightly...

0.00
Wednesday Poster Session
[1193]

Anticipating Human Actions by Correlating Past With the Future With Jaccard Similarity Measures

Basura Fernando, Samitha Herath

We propose a framework for early action recognition and anticipation by correlating past features with the future using three novel similarity measures called Jaccard vector similarity, Jaccard cross-correlation and Jaccard Frobenius inner product over covariances.

0.00
Thursday Poster Session
[1194]

A Multi-Task Network for Joint Specular Highlight Detection and Removal

Gang Fu, Qing Zhang, Lei Zhu, Ping Li, Chunxia Xiao

Specular highlight detection and removal are fundamental and challenging tasks.

0.00
Wednesday Poster Session
[1195]

Double Low-Rank Representation With Projection Distance Penalty for Clustering

Zhiqiang Fu, Yao Zhao, Dongxia Chang, Xingxing Zhang, Yiming Wang

This paper presents a novel, simple yet robust self-representation method, i.e., Double Low-Rank Representation with Projection Distance penalty (DLRRPD) for clustering.

0.00
Tuesday Poster Session
[1196]

Auto-Exposure Fusion for Single-Image Shadow Removal

Lan Fu, Changqing Zhou, Qing Guo, Felix Juefei-Xu, Hongkai Yu, Wei Feng, Yang Liu, Song Wang

Shadow removal is still a challenging task due to its inherent background-dependent and spatial-variant properties, leading to unknown and diverse shadow patterns.

0.00
Wednesday Poster Session
[1197]

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu

Multi-Source Domain Adaptation (MSDA), which dedicates to transfer the knowledge learned from multiple source domains to an unlabeled target domain, has drawn increasing attention in the research community.

0.00
Friday Poster Session
[1198]

STMTrack: Template-Free Visual Tracking With Space-Time Memory Networks

Zhihong Fu, Qingjie Liu, Zehua Fu, Yunhong Wang

Boosting performance of the offline trained siamese trackers is getting harder nowadays since the fixed information of the template cropped from the first frame has been almost thoroughly mined, but they are poorly capable of resisting target appearance changes.

0.00
Thursday Poster Session
[1199]

Robust Point Cloud Registration Framework Based on Deep Graph Matching

Kexue Fu, Shaolei Liu, Xiaoyuan Luo, Manning Wang

3D point cloud registration is a fundamental problem in computer vision and robotics.

0.00
Wednesday Poster Session
[1200]

Transferable Query Selection for Active Domain Adaptation

Bo Fu, Zhangjie Cao, Jianmin Wang, Mingsheng Long

Unsupervised domain adaptation (UDA) enables transferring knowledge from a related source domain to a fully unlabeled target domain.

0.00
Wednesday Poster Session
[1201]

Isometric Multi-Shape Matching

Maolin Gao, Zorah Lahner, Johan Thunberg, Daniel Cremers, Florian Bernard

Finding correspondences between shapes is a fundamental problem in computer vision and graphics, which is relevant for many applications, including 3D reconstruction, object tracking, and style transfer.

0.00
0
0
0
Thursday Poster Session
[1202]

Information Bottleneck Disentanglement for Identity Swapping

Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He

Improving the performance of face forgery detectors often requires more identity-swapped images of higher-quality.

0.00
Tuesday Poster Session
[1203]

Network Pruning via Performance Maximization

Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang

Channel pruning is a class of powerful methods for model compression.

0.00
Wednesday Poster Session
[1204]

Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression

Chen Gao, Jinyu Chen, Si Liu, Luting Wang, Qiong Zhang, Qi Wu

The Remote Embodied Referring Expression (REVERIE) is a recently raised task that requires an agent to navigate to and localise a referred remote object according to a high-level language instruction.

0.00
Tuesday Poster Session
[1205]

Privacy Preserving Localization and Mapping From Uncalibrated Cameras

Marcel Geppert, Viktor Larsson, Pablo Speciale, Johannes L. Schonberger, Marc Pollefeys

Recent works on localization and mapping from privacy preserving line features have made significant progress towards addressing the privacy concerns arising from cloud-based solutions in mixed reality and robotics.

0.00
Monday Poster Session
[1206]

Video Object Segmentation Using Global and Instance Embedding Learning

Wenbin Ge, Xiankai Lu, Jianbing Shen

In this paper, we propose a feature embedding based video object segmentation (VOS) method which is simple, fast and effective.

0.00
Friday Poster Session
[1207]

Learning Graphs for Knowledge Transfer With Limited Labels

Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava

Fixed input graphs are a mainstay in approaches that utilize Graph Convolution Networks (GCNs) for knowledge transfer.

0.00
Wednesday Poster Session
[1208]

Polygonal Building Extraction by Frame Field Learning

Nicolas Girard, Dmitriy Smirnov, Justin Solomon, Yuliya Tarabalka

While state of the art image segmentation models typically output segmentations in raster format, applications in geographic information systems often require vector polygons.

0.00
Tuesday Poster Session
[1209]

OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning

Spyros Gidaris, Andrei Bursuc, Gilles Puy, Nikos Komodakis, Matthieu Cord, Patrick Perez

Learning image representations without human supervision is an important and active research field.

0.00
Tuesday Poster Session
[1210]

MaxUp: Lightweight Adversarial Training With Data Augmentation Improves Neural Network Training

Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu

We propose MaxUp, an embarrassingly simple, highly effective technique for improving the generalization performance of machine learning models, especially deep neural networks.

0.00
Monday Poster Session
[1211]

Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning

Jingyu Gong, Jiachen Xu, Xin Tan, Haichuan Song, Yanyun Qu, Yuan Xie, Lizhuang Ma

Hidden features in neural network usually fail to learn informative representation for 3D segmentation as supervisions are only given on output prediction, while this can be solved by omni-scale supervision on intermediate layers.

0.00
Thursday Poster Session
[1212]

PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation With Neural Positional Encoding and Distilled Matting Loss

Juan Luis Gonzalez, Munchurl Kim

In this paper, we propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.

0.00
Tuesday Poster Session
[1213]

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

Shanyan Guan, Jingwei Xu, Yunbo Wang, Bingbing Ni, Xiaokang Yang

This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos.

0.00
0
0
0
Wednesday Poster Session
[1214]

Inverse Simulation: Reconstructing Dynamic Geometry of Clothed Humans via Optimal Control

Jingfan Guo, Jie Li, Rahul Narain, Hyun Soo Park

This paper studies the problem of inverse cloth simulation---to estimate shape and time-varying poses of the underlying body that generates physically plausible cloth motion, which matches to the point cloud measurements on the clothed humans.

0.00
Thursday Poster Session
[1215]

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging for spatial feature aliasing caused by the intersection of reception fields between objects.

0.00
Wednesday Poster Session
[1216]

Intrinsic Image Harmonization

Zonghui Guo, Haiyong Zheng, Yufeng Jiang, Zhaorui Gu, Bing Zheng

Compositing an image usually inevitably suffers from inharmony problem that is mainly caused by incompatibility of foreground and background from two different images with distinct surfaces and lights, corresponding to material-dependent and light-dependent characteristics, namely, reflectance and illumination intrinsic images, respectively.

0.00
Friday Poster Session
[1217]

Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-Balanced Samplings

Hao Guo, Song Wang

Long-tailed data distribution is common in many multi-label visual recognition tasks and the direct use of these data for training usually leads to relatively low performance on tail classes.

0.00
Thursday Poster Session
[1218]

Multispectral Photometric Stereo for Spatially-Varying Spectral Reflectances: A Well Posed Problem?

Heng Guo, Fumio Okura, Boxin Shi, Takuya Funatomi, Yasuhiro Mukaigawa, Yasuyuki Matsushita

Multispectral photometric stereo (MPS) aims at recovering the surface normal of a scene from a single-shot multispectral image, which is known as an ill-posed problem.

0.00
Monday Poster Session
[1219]

Online Multiple Object Tracking With Cross-Task Synergy

Song Guo, Jingya Wang, Xinchao Wang, Dacheng Tao

Modern online multiple object tracking (MOT) methods usually focus on two directions to improve tracking performance.

0.00
Wednesday Poster Session
[1220]

Positive-Unlabeled Data Purification in the Wild for Object Detection

Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang

Deep learning based object detection approaches have achieved great progress with the benefit from large amount of labeled images.

0.00
Monday Poster Session
[1221]

Strengthen Learning Tolerance for Weakly Supervised Object Localization

Guangyu Guo, Junwei Han, Fang Wan, Dingwen Zhang

Weakly supervised object localization (WSOL) aims at learning to localize objects of interest by only using the image-level labels as the supervision.

0.00
Wednesday Poster Session
[1222]

Contrastive Embedding for Generalized Zero-Shot Learning

Zongyan Han, Zhenyong Fu, Shuo Chen, Jian Yang

Generalized zero-shot learning (GZSL) aims to recognize objects from both seen and unseen classes, when only the labeled examples from seen classes are provided.

0.00
Monday Poster Session
[1223]

Learning To Fuse Asymmetric Feature Maps in Siamese Trackers

Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen

Recently, Siamese-based trackers have achieved promising performance in visual tracking.

0.00
0
0
0
Friday Poster Session
[1224]

Crossing Cuts Polygonal Puzzles: Models and Solvers

Peleg Harel, Ohad Ben-Shahar

Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares.

0.00
Tuesday Poster Session
[1225]

NormalFusion: Real-Time Acquisition of Surface Normals for High-Resolution RGB-D Scanning

Hyunho Ha, Joo Ho Lee, Andreas Meuleman, Min H. Kim

Multiview shape-from-shading (SfS) has achieved high-detail geometry, but its computation is expensive for solving a multiview registration and an ill-posed inverse rendering problem.

0.00
Friday Poster Session
[1226]

Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps

Yuk Heo, Yeong Jun Koh, Chang-Su Kim

We propose a novel guided interactive segmentation (GIS) algorithm for video objects to improve the segmentation accuracy and reduce the interaction time.

0.00
Wednesday Poster Session
[1227]

DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution

Tong He, Chunhua Shen, Anton van den Hengel

Previous top-performing approaches for point cloud instance segmentation involve a bottom-up strategy, which often includes inefficient operations or complex pipelines, such as grouping over-segmented components, introducing additional steps for refining, or designing complicated loss functions.

0.00
0
0
0
Monday Poster Session
[1228]

MOST: A Multi-Oriented Scene Text Detector With Localization Refinement

Minghang He, Minghui Liao, Zhibo Yang, Humen Zhong, Jun Tang, Wenqing Cheng, Cong Yao, Yongpan Wang, Xiang Bai

Over the past few years, the field of scene text detection has progressed rapidly that modern text detectors are able to hunt text in various challenging scenarios.

0.00
Wednesday Poster Session
[1229]

Composing Photos Like a Photographer

Chaoyi Hong, Shuaiyuan Du, Ke Xian, Hao Lu, Zhiguo Cao, Weicai Zhong

We show that explicit modeling of composition rules benefits image cropping.

0.00
Tuesday Poster Session
[1230]

Disentangling Label Distribution for Long-Tailed Visual Recognition

Youngkyu Hong, Seungju Han, Kwanghee Choi, Seokjun Seo, Beomsu Kim, Buru Chang

The current evaluation protocol of long-tailed visual recognition trains the classification model on the long-tailed source label distribution and evaluates its performance on the uniform target label distribution.

0.00
Tuesday Poster Session
[1231]

Fine-Grained Shape-Appearance Mutual Learning for Cloth-Changing Person Re-Identification

Peixian Hong, Tao Wu, Ancong Wu, Xintong Han, Wei-Shi Zheng

Recently, person re-identification (Re-ID) has achieved great progress.

0.00
Wednesday Poster Session
[1232]

Partial Person Re-Identification With Part-Part Correspondence Learning

Tianyu He, Xu Shen, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua

Driven by the success of deep learning, the last decade has seen rapid advances in person re-identification (re-ID).

0.00
Wednesday Poster Session
[1233]

LPSNet: A Lightweight Solution for Fast Panoptic Segmentation

Weixiang Hong, Qingpei Guo, Wei Zhang, Jingdong Chen, Wei Chu

Panoptic segmentation is a challenging task aiming to simultaneously segment objects (things) at instance level and background contents (stuff) at semantic level.

0.00
Friday Poster Session
[1234]

Panoramic Image Reflection Removal

Yuchen Hong, Qian Zheng, Lingran Zhao, Xudong Jiang, Alex C. Kot, Boxin Shi

This paper studies the problem of panoramic image reflection removal, aiming at relieving the content ambiguity between reflection and transmission scenes.

0.00
Wednesday Poster Session
[1235]

Image Change Captioning by Learning From an Auxiliary Task

Mehrdad Hosseinzadeh, Yang Wang

We tackle the challenging task of image change captioning.

0.00
Monday Poster Session
[1236]

VLN BERT: A Recurrent Vision-and-Language BERT for Navigation

Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould

Accuracy of many visiolinguistic tasks has benefited significantly from the application of vision-and-language (V&L) BERT.

0.00
Monday Poster Session
[1237]

BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification

Ruibing Hou, Hong Chang, Bingpeng Ma, Rui Huang, Shiguang Shan

In this paper, we present an efficient spatial-temporal representation for video person re-identification (reID).

0.00
Monday Poster Session
[1238]

Informative and Consistent Correspondence Mining for Cross-Domain Weakly Supervised Object Detection

Luwei Hou, Yu Zhang, Kui Fu, Jia Li

Cross-domain weakly supervised object detection aims to adapt object-level knowledge from a fully labeled source domain dataset (i.e. [Expand]

0.00
Wednesday Poster Session
[1239]

Brain Image Synthesis With Unsupervised Multivariate Canonical CSCℓ₄Net

Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

0.00
Tuesday Poster Session
[1240]

DeepLM: Large-Scale Nonlinear Least Squares on Deep Learning Frameworks Using Stochastic Domain Decomposition

Jingwei Huang, Shan Huang, Mingwei Sun

We propose a novel approach for large-scale nonlinear least squares problems based on deep learning frameworks.

0.00
Wednesday Poster Session
[1241]

Geo-FARM: Geodesic Factor Regression Model for Misaligned Pre-Shape Responses in Statistical Shape Analysis

Chao Huang, Anuj Srivastava, Rongjie Liu

The problem of using covariates to predict shapes of objects in a regression setting is important in many fields.

0.00
Thursday Poster Session
[1242]

Memory Oriented Transfer Learning for Semi-Supervised Image Deraining

Huaibo Huang, Aijing Yu, Ran He

Deep learning based methods have shown dramatic improvements in image rain removal by using large-scale paired data of synthetic datasets.

0.00
Wednesday Poster Session
[1243]

Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding

Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao

An LBYL ('Look Before You Leap') Network is proposed for end-to-end trainable one-stage visual grounding.

0.00
Friday Poster Session
[1244]

MetaSets: Meta-Learning on Point Sets for Generalizable Representations

Chao Huang, Zhangjie Cao, Yunbo Wang, Jianmin Wang, Mingsheng Long

Deep learning techniques for point clouds have achieved strong performance on a range of 3D vision tasks.

0.00
Wednesday Poster Session
[1245]

Revisiting Knowledge Distillation: An Inheritance and Exploration Framework

Zhen Huang, Xu Shen, Jun Xing, Tongliang Liu, Xinmei Tian, Houqiang Li, Bing Deng, Jianqiang Huang, Xian-Sheng Hua

Knowledge Distillation (KD) is a popular technique to transfer knowledge from a teacher model or ensemble to a student model.

0.00
Tuesday Poster Session
[1246]

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Yu-Cheng Chang, Tsung-Lin Tsou, Yu-An Wang, Winston H. Hsu

Dense depth estimation plays a key role in multiple applications such as robotics, 3D reconstruction, and augmented reality.

0.00
Friday Poster Session
[1247]

Video Rescaling Networks With Joint Optimization Strategies for Downscaling and Upscaling

Yan-Cheng Huang, Yi-Hsin Chen, Cheng-You Lu, Hui-Po Wang, Wen-Hsiao Peng, Ching-Chun Huang

This paper addresses the video rescaling task, which arises from the need to adapt the video spatial resolution to suit individual viewing devices.

0.00
Tuesday Poster Session
[1248]

Learning the Non-Differentiable Optimization for Blind Super-Resolution

Zheng Hui, Jie Li, Xiumei Wang, Xinbo Gao

Previous convolutional neural network (CNN) based blind super-resolution (SR) methods usually adopt an iterative optimization approach to approximate the ground-truth (GT) step by step.

0.00
Monday Poster Session
[1249]

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation

Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.

0.00
Monday Poster Session
[1250]

A2-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation

Miao Hu, Yali Li, Lu Fang, Shengjin Wang

Learning pyramidal feature representations is crucial for recognizing object instances at different scales.

0.00
Thursday Poster Session
[1251]

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

Tianrui Hui, Shaofei Huang, Si Liu, Zihan Ding, Guanbin Li, Wenguan Wang, Jizhong Han, Fei Wang

Language-queried video actor segmentation aims to predict the pixel-level mask of the actor which performs the actions described by a natural language query in the target frames.

0.00
Tuesday Poster Session
[1252]

Efficient Deformable Shape Correspondence via Multiscale Spectral Manifold Wavelets Preservation

Ling Hu, Qinsong Li, Shengjun Liu, Xinru Liu

The functional map framework has proven to be extremely effective for representing dense correspondences between deformable shapes.

0.00
Thursday Poster Session
[1253]

Learning Cross-Modal Retrieval With Noisy Labels

Peng Hu, Xi Peng, Hongyuan Zhu, Liangli Zhen, Jie Lin

Recently, cross-modal retrieval has been emerging with the help of deep multimodal learning.

0.00
Tuesday Poster Session
[1254]

Dense Relation Distillation With Context-Aware Aggregation for Few-Shot Object Detection

Hanzhe Hu, Shuai Bai, Aoxue Li, Jinshi Cui, Liwei Wang

Conventional deep learning based methods for object detection require a large amount of bounding box annotations for training, and such high-quality annotated data is expensive to obtain.

0.00
Wednesday Poster Session
[1255]

Pseudo 3D Auto-Correlation Network for Real Image Denoising

Xiaowan Hu, Ruijun Ma, Zhihong Liu, Yuanhao Cai, Xiaole Zhao, Yulun Zhang, Haoqian Wang

The extraction of auto-correlation in images has shown great potential in deep learning networks, such as the self-attention mechanism in the channel domain and the self-similarity mechanism in the spatial domain.

0.00
Friday Poster Session
[1256]

Model-Aware Gesture-to-Gesture Translation

Hezhen Hu, Weilun Wang, Wengang Zhou, Weichao Zhao, Houqiang Li

Hand gesture-to-gesture translation is a significant and interesting problem, which plays a key role in many applications, such as sign language production.

0.00
Friday Poster Session
[1257]

Safe Local Motion Planning With Self-Supervised Freespace Forecasting

Peiyun Hu, Aaron Huang, John Dolan, David Held, Deva Ramanan

Safe local motion planning for autonomous driving in dynamic environments requires forecasting how the scene evolves.

0.00
Thursday Poster Session
[1258]

Wide-Depth-Range 6D Object Pose Estimation in Space

Yinlin Hu, Sebastien Speierer, Wenzel Jakob, Pascal Fua, Mathieu Salzmann

6D pose estimation in space poses unique challenges that are not commonly encountered in the terrestrial setting.

0.00
Friday Poster Session
[1259]

Self-Supervised 3D Mesh Reconstruction From Single Images

Tao Hu, Liwei Wang, Xiaogang Xu, Shu Liu, Jiaya Jia

Recent single-view 3D reconstruction methods reconstruct an object's shape and texture from a single image with only 2D image-level annotation.

0.00
Tuesday Poster Session
[1260]

Self-Supervised Video GANs: Learning for Appearance Consistency and Motion Coherency

Sangeek Hyun, Jihwan Kim, Jae-Pil Heo

A video can be represented by the composition of appearance and motion.

0.00
Wednesday Poster Session
[1261]

Shape From Sky: Polarimetric Normal Recovery Under the Sky

Tomoki Ichikawa, Matthew Purri, Ryo Kawahara, Shohei Nobuhara, Kristin Dana, Ko Nishino

The sky exhibits a unique spatial polarization pattern by scattering the unpolarized sunlight.

0.00
Thursday Poster Session
[1262]

Optimal Quantization Using Scaled Codebook

Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez

We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled.

0.00
Thursday Poster Session
[1263]

3D Shape Generation With Grid-Based Implicit Functions

Moritz Ibing, Isaak Lim, Leif Kobbelt

Previous approaches to generate shapes in a 3D setting train a GAN on the latent space of an autoencoder (AE).

0.00
Thursday Poster Session
[1264]

Facial Action Unit Detection With Transformers

Geethu Miriam Jacob, Bjorn Stenger

The Facial Action Coding System is a taxonomy for fine-grained facial expression analysis.

0.00
Wednesday Poster Session
[1265]

CAMERAS: Enhanced Resolution and Sanity Preserving Class Activation Mapping for Image Saliency

Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian

Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.

0.00
Friday Poster Session
[1266]

Learning Compositional Representation for 4D Captures With Neural ODE

Boyan Jiang, Yinda Zhang, Xingkui Wei, Xiangyang Xue, Yanwei Fu

Learning based representation has become the key to the success of many computer vision systems.

0.00
Tuesday Poster Session
[1267]

UV-Net: Learning From Boundary Representations

Pradeep Kumar Jayaraman, Aditya Sanghi, Joseph G. Lambourne, Karl D.D. Willis, Thomas Davies, Hooman Shayani, Nigel Morris

We introduce UV-Net, a novel neural network architecture and representation designed to operate directly on Boundary representation (B-rep) data from 3D CAD models.

0.00
Thursday Poster Session
[1268]

Mining Better Samples for Contrastive Learning of Temporal Correspondence

Sangryul Jeon, Dongbo Min, Seungryong Kim, Kwanghoon Sohn

We present a novel framework for contrastive learning of pixel-level representation using only unlabeled video.

0.00
Monday Poster Session
[1269]

Saliency-Guided Image Translation

Lai Jiang, Mai Xu, Xiaofei Wang, Leonid Sigal

In this paper, we propose a novel task for saliency-guided image translation, with the goal of image-to-image translation conditioned on the user specified saliency map.

0.00
Friday Poster Session
[1270]

IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

Shuai Jia, Yibing Song, Chao Ma, Xiaokang Yang

Adversarial attacks arise due to the vulnerability of deep neural networks when perceiving input samples injected with imperceptible perturbations.

0.00
Tuesday Poster Session
[1271]

Leveraging Line-Point Consistence To Preserve Structures for Wide Parallax Image Stitching

Qi Jia, ZhengJun Li, Xin Fan, Haotian Zhao, Shiyu Teng, Xinchen Ye, Longin Jan Latecki

Generating high-quality stitched images with natural structures is a challenging task in computer vision.

0.00
Thursday Poster Session
[1272]

Amalgamating Knowledge From Heterogeneous Graph Neural Networks

Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, Dacheng Tao

In this paper, we study a novel knowledge transfer task in the domain of graph neural networks (GNNs).

0.00
Friday Poster Session
[1273]

Harmonious Semantic Line Detection via Maximal Weight Clique Selection

Dongkwon Jin, Wonhui Park, Seong-Gyun Jeong, Chang-Su Kim

A novel algorithm to detect an optimal set of semantic lines is proposed in this work.

0.00
Friday Poster Session
[1274]

Turning Frequency to Resolution: Video Super-Resolution via Event Cameras

Yongcheng Jing, Yiding Yang, Xinchao Wang, Mingli Song, Dacheng Tao

State-of-the-art video super-resolution (VSR) methods focus on exploiting inter- and intra-frame correlations to estimate high-resolution (HR) video frames from low-resolution (LR) ones.

0.00
Wednesday Poster Session
[1275]

Cross-Modal Center Loss for 3D Cross-Modal Retrieval

Longlong Jing, Elahe Vahdani, Jiaxing Tan, Yingli Tian

Cross-modal retrieval aims to learn discriminative and modal-invariant features for data from different modalities.

0.00
Tuesday Poster Session
[1276]

Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling

Wei Ji, Shuang Yu, Junde Wu, Kai Ma, Cheng Bian, Qi Bi, Jingjing Li, Hanruo Liu, Li Cheng, Yefeng Zheng

In medical image analysis, it is typical to collect multiple annotations, each from a different clinical expert or rater, in the expectation that possible diagnostic errors could be mitigated.

0.00
Thursday Poster Session
[1277]

Calibrated RGB-D Salient Object Detection

Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng

Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).

0.00
Wednesday Poster Session
[1278]

Practical Single-Image Super-Resolution Using Look-Up Table

Younghyun Jo, Seon Joo Kim

A number of super-resolution (SR) algorithms from interpolation to deep neural networks (DNN) have emerged to restore or create missing details of the input low-resolution image.

0.00
Monday Poster Session
[1279]

Joint Deep Model-Based MR Image and Coil Sensitivity Reconstruction Network (Joint-ICNet) for Fast MRI

Yohan Jun, Hyungseob Shin, Taejoon Eo, Dosik Hwang

Magnetic resonance imaging (MRI) can provide diagnostic information with high-resolution and high-contrast images.

0.00
Tuesday Poster Session
[1280]

Time Adaptive Recurrent Neural Network

Anil Kag, Venkatesh Saligrama

We propose a learning method that dynamically modifies the time-constants of the continuous-time counterpart of a vanilla RNN.

0.00
Thursday Poster Session
[1281]

Tackling the Ill-Posedness of Super-Resolution Through Adaptive Target Generation

Younghyun Jo, Seoung Wug Oh, Peter Vajda, Seon Joo Kim

By the one-to-many nature of the super-resolution (SR) problem, a single low-resolution (LR) image can be mapped to many high-resolution (HR) images.

0.00
Friday Poster Session
[1282]

Unsupervised Learning of Depth and Depth-of-Field Effect From Natural Images With Aperture Rendering Generative Adversarial Networks

Takuhiro Kaneko

Understanding the 3D world from 2D projected natural images is a fundamental challenge in computer vision and graphics.

0.00
Friday Poster Session
[1283]

Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning

Shichao Kan, Yigang Cen, Yang Li, Vladimir Mladenovic, Zhihai He

In unsupervised learning of image features without labels, especially on datasets with fine-grained object classes, it is often very difficult to tell if a given image belongs to one specific object class or another, even for human eyes.

0.00
Thursday Poster Session
[1284]

Zero-Shot Single Image Restoration Through Controlled Perturbation of Koschmieder's Model

Aupendu Kar, Sobhan Kanti Dhara, Debashis Sen, Prabir Kumar Biswas

Real-world image degradation due to light scattering can be described based on Koschmieder's model.

0.00
Friday Poster Session
[1285]

Differentiable Diffusion for Dense Depth Estimation From Multi-View Images

Numair Khan, Min H. Kim, James Tompkin

We present a method to estimate dense depth by optimizing a sparse set of points such that their diffusion into a depth map minimizes a multi-view reprojection error from RGB supervision.

0.00
Wednesday Poster Session
[1286]

Guided Integrated Gradients: An Adaptive Path Method for Removing Noise

Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, Tolga Bolukbasi

Integrated Gradients (IG) is a commonly used feature attribution method for deep neural networks.

0.00
Tuesday Poster Session
[1287]

Neural Side-by-Side: Predicting Human Preferences for No-Reference Super-Resolution Evaluation

Valentin Khrulkov, Artem Babenko

Super-resolution based on deep convolutional networks is currently gaining much attention from both academia and industry.

0.00
Tuesday Poster Session
[1288]

Discriminative Appearance Modeling With Multi-Track Pooling for Real-Time Multi-Object Tracking

Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg

In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene.

0.00
Wednesday Poster Session
[1289]

Joint Negative and Positive Learning for Noisy Labels

Youngdong Kim, Juseung Yun, Hyounguk Shon, Junmo Kim

Training of Convolutional Neural Networks (CNNs) with data with noisy labels is known to be a challenge.

0.00
Wednesday Poster Session
[1290]

High-Quality Stereo Image Restoration From Double Refraction

Hakyeong Kim, Andreas Meuleman, Daniel S. Jeon, Min H. Kim

Single-shot monocular birefractive stereo methods have been used for estimating sparse depth from double refraction over edges.

0.00
Thursday Poster Session
[1291]

Not Just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction

Daejin Kim, Mohammad Azam Khan, Jaegul Choo

Facial attribute editing aims to manipulate the image with the desired attribute while preserving the other details.

0.00
Tuesday Poster Session
[1292]

Quality-Agnostic Image Recognition via Invertible Decoder

Insoo Kim, Seungju Han, Ji-won Baek, Seong-Jin Park, Jae-Joon Han, Jinwoo Shin

Despite the remarkable performance of deep models on image recognition tasks, they are known to be susceptible to common corruptions such as blur, noise, and low-resolution.

0.00
Thursday Poster Session
[1293]

Prototype-Guided Saliency Feature Learning for Person Search

Hanjae Kim, Sunghun Joung, Ig-Jae Kim, Kwanghoon Sohn

Existing person search methods integrate person detection and re-identification (re-ID) modules into a unified system.

0.00
Tuesday Poster Session
[1294]

QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks

Vladimir Kryzhanovskiy, Gleb Balitskiy, Nikolay Kozyrskiy, Aleksandr Zuruev

Modern deep neural networks (DNNs) cannot be effectively used in mobile and embedded devices due to strict requirements for computational complexity, memory, and power consumption.

0.00
Wednesday Poster Session
[1295]

T-vMF Similarity for Regularizing Intra-Class Feature Distribution

Takumi Kobayashi

Deep convolutional neural networks (CNNs) leverage large-scale training datasets to produce remarkable performance on various image classification tasks.

0.00
Tuesday Poster Session
[1296]

Controllable Image Restoration for Under-Display Camera in Smartphones

Kinam Kwon, Eunhee Kang, Sangwon Lee, Su-Jin Lee, Hyong-Euk Lee, ByungIn Yoo, Jae-Joon Han

Under-display camera (UDC) technology is essential for full-screen display in smartphones and is achieved by removing the concept of drilling holes in the display.

0.00
Monday Poster Session
[1297]

IMODAL: Creating Learnable User-Defined Deformation Models

Leander Lacroix, Benjamin Charlier, Alain Trouve, Barbara Gris

A natural way to model the evolution of an object (growth of a leaf for instance) is to estimate a plausible deforming path between two observations.

0.00
Thursday Poster Session
[1298]

3D Video Stabilization With Depth Estimation by CNN-Based Optimization

Yao-Chih Lee, Kuan-Wei Tseng, Yu-Ta Chen, Chien-Cheng Chen, Chu-Song Chen, Yi-Ping Hung

Video stabilization is an essential component of visual quality enhancement.

0.00
Wednesday Poster Session
[1299]

Restoring Extremely Dark Images in Real Time

Mohit Lamba, Kaushik Mitra

A practical low-light enhancement solution must be computationally fast, memory-efficient, and achieve a visually appealing restoration.

0.00
Tuesday Poster Session
[1300]

Blocks-World Cameras

Jongho Lee, Mohit Gupta

For several vision and robotics applications, 3D geometry of man-made environments such as indoor scenes can be represented with a small number of dominant planes.

0.00
Thursday Poster Session
[1301]

CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback

Seungmin Lee, Dongwan Kim, Bohyung Han

We tackle the task of image retrieval with text feedback, where a reference image and modifier text are combined to identify the desired target image.

0.00
Monday Poster Session
[1302]

Iterative Filter Adaptive Network for Single Image Defocus Deblurring

Junyong Lee, Hyeongseok Son, Jaesung Rim, Sunghyun Cho, Seungyong Lee

We propose a novel end-to-end learning-based approach for single image defocus deblurring.

0.00
Monday Poster Session
[1303]

DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation

Seunghun Lee, Sunghyun Cho, Sunghoon Im

In this paper, we present DRANet, a network architecture that disentangles image representations and transfers the visual attributes in a latent space for unsupervised cross-domain adaptation.

0.00
Thursday Poster Session
[1304]

PatchMatch-Based Neighborhood Consensus for Semantic Correspondence

Jae Yong Lee, Joseph DeGol, Victor Fragoso, Sudipta N. Sinha

We address estimating dense correspondences between two images depicting different but semantically related scenes.

0.00
Thursday Poster Session
[1305]

Network Quantization With Element-Wise Gradient Scaling

Junghyup Lee, Dohyung Kim, Bumsub Ham

Network quantization aims at reducing bit-widths of weights and/or activations, particularly important for implementing deep neural networks with limited hardware resources.

0.00
Tuesday Poster Session
[1306]

Relevance-CAM: Your Model Already Knows Where To Look

Jeong Ryong Lee, Sewon Kim, Inyong Park, Taejoon Eo, Dosik Hwang

As neural networks develop and their fields of application grow, the ability to explain deep learning models is also becoming increasingly important.

0.00
Thursday Poster Session
[1307]

Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning

Sangmin Lee, Hak Gu Kim, Dae Hwi Choi, Hyung-Il Kim, Yong Man Ro

Our work addresses long-term motion context issues for predicting future frames.

0.00
Tuesday Poster Session
[1308]

Picasso: A CUDA-Based Library for Deep Learning Over 3D Meshes

Huan Lei, Naveed Akhtar, Ajmal Mian

We present Picasso, a CUDA-based library comprising novel modules for deep learning over complex real-world 3D meshes.

0.00
Thursday Poster Session
[1309]

RangeIoUDet: Range Image Based Real-Time 3D Object Detector Optimized by Intersection Over Union

Zhidong Liang, Zehan Zhang, Ming Zhang, Xian Zhao, Shiliang Pu

Real-time and high-performance 3D object detection is an attractive research direction in autonomous driving.

0.00
Wednesday Poster Session
[1310]

4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis

Weihang Liao, Art Subpa-asa, Yinqiang Zheng, Imari Sato

Hyperspectral photoacoustic (HSPA) spectroscopy is an emerging bi-modal imaging technology that is able to show the wavelength-dependent absorption distribution of the interior of a 3D volume.

0.00
Tuesday Poster Session
[1311]

COMPLETER: Incomplete Multi-View Clustering via Contrastive Prediction

Yijie Lin, Yuanbiao Gou, Zitao Liu, Boyun Li, Jiancheng Lv, Xi Peng

In this paper, we study two challenging problems in incomplete multi-view clustering analysis, namely, i) how to learn an informative and consistent representation among different views without the help of labels and ii) how to recover the missing views from data.

0.00
Wednesday Poster Session
[1312]

Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

Chuming Lin, Chengming Xu, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu

Temporal action localization is an important yet challenging task in video understanding.

0.00
Tuesday Poster Session
[1313]

Multi-View Multi-Person 3D Pose Estimation With Plane Sweep Stereo

Jiahao Lin, Gim Hee Lee

Existing approaches for multi-view multi-person 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views and solve for the 3D pose estimation for each person.

0.00
Thursday Poster Session
[1314]

Rich Context Aggregation With Reflection Prior for Glass Surface Detection

Jiaying Lin, Zebang He, Rynson W.H. Lau

Glass surfaces appear everywhere.

0.00
Thursday Poster Session
[1315]

Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval

Yang Liu, Qingchao Chen, Samuel Albanie

In this paper, we study the task of visual-text retrieval in the highly practical setting in which labelled visual data with paired text descriptions are available in one domain (the "source"), but only unlabelled visual data (without text descriptions) are available in the domain of interest (the "target").

0.00
Thursday Poster Session
[1316]

What Can Style Transfer and Paintings Do for Model Robustness?

Hubert Lin, Mitchell van Zuijlen, Sylvia C. Pont, Maarten W.A. Wijntjes, Kavita Bala

A common strategy for improving model robustness is through data augmentations.

0.00
0
0
0
Wednesday Poster Session
[1317]

Cluster-Wise Hierarchical Generative Model for Deep Amortized Clustering

Huafeng Liu, Jiaqi Wang, Liping Jing

In this paper, we propose Cluster-wise Hierarchical Generative Model for deep amortized clustering (CHiGac).

0.00
Thursday Poster Session
[1318]

Context-Aware Biaffine Localizing Network for Temporal Sentence Grounding

Daizong Liu, Xiaoye Qu, Jianfeng Dong, Pan Zhou, Yu Cheng, Wei Wei, Zichuan Xu, Yulai Xie

This paper addresses the problem of temporal sentence grounding (TSG), which aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query.

0.00
0
0
0
Wednesday Poster Session
[1319]

Deep Learning in Latent Space for Video Prediction and Compression

Bowen Liu, Yu Chen, Shiyu Liu, Hun-Seok Kim

Learning-based video compression has achieved substantial progress in recent years.

0.00
Monday Poster Session
[1320]

Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation

Fenglin Liu, Xian Wu, Shen Ge, Wei Fan, Yuexian Zou

Automatically generating radiology reports can improve current clinical practice in diagnostic radiology.

0.00
Thursday Poster Session
[1321]

Exploit Visual Dependency Relations for Semantic Segmentation

Mingyuan Liu, Dan Schonfeld, Wei Tang

Dependency relations among visual entities are ubiquitous because both objects and scenes are highly structured.

0.00
Wednesday Poster Session
[1322]

Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction

Feng Liu, Luan Tran, Xiaoming Liu

Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.

0.00
0
0
0
Wednesday Poster Session
[1323]

Generic Perceptual Loss for Modeling Structured Output Dependencies

Yifan Liu, Hao Chen, Yu Chen, Wei Yin, Chunhua Shen

The perceptual loss has been widely used as an effective loss term in image synthesis tasks including image super-resolution [16] and style transfer [14].

0.00
0
0
0
Tuesday Poster Session
[1324]

iMiGUE: An Identity-Free Video Dataset for Micro-Gesture Understanding and Emotion Analysis

Xin Liu, Henglin Shi, Haoyu Chen, Zitong Yu, Xiaobai Li, Guoying Zhao

We introduce a new dataset for emotional artificial intelligence research: identity-free video dataset for micro-gesture understanding and emotion analysis (iMiGUE).

0.00
Wednesday Poster Session
[1325]

Learning To Warp for Style Transfer

Xiao-Chang Liu, Yong-Liang Yang, Peter Hall

Since its inception in 2015, Style Transfer has focused on texturing a content image using an art exemplar.

0.00
Tuesday Poster Session
[1326]

Mask-Embedded Discriminator With Region-Based Semantic Regularization for Semi-Supervised Class-Conditional Image Synthesis

Yi Liu, Xiaoyang Huo, Tianyi Chen, Xiangping Zeng, Si Wu, Zhiwen Yu, Hau-San Wong

Semi-supervised generative learning (SSGL) makes use of unlabeled data to achieve a trade-off between the data collection/annotation effort and generation performance, when adequate labeled data are not available.

0.00
Tuesday Poster Session
[1327]

Neighborhood Normalization for Robust Geometric Feature Learning

Xingtong Liu, Benjamin D. Killeen, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

Extracting geometric features from 3D models is a common first step in applications such as 3D registration, tracking, and scene flow estimation. [Expand]

0.00
Thursday Poster Session
[1328]

RankDetNet: Delving Into Ranking Constraints for Object Detection

Ji Liu, Dong Li, Rongzhang Zheng, Lu Tian, Yi Shan

Modern object detection approaches cast detecting objects as optimizing two subtasks of classification and localization simultaneously. [Expand]

0.00
Monday Poster Session
[1329]

PluckerNet: Learn To Register 3D Line Reconstructions

Liu Liu, Hongdong Li, Haodong Yao, Ruyi Zha

Aligning two partially-overlapped 3D line reconstructions in Euclidean space is challenging, as we need to simultaneously solve line correspondences and relative pose between reconstructions. [Expand]

0.00
Monday Poster Session
[1330]

Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation

Yahui Liu, Enver Sangineto, Yajing Chen, Linchao Bao, Haoxian Zhang, Nicu Sebe, Bruno Lepri, Wei Wang, Marco De Nadai

Image-to-Image (I2I) multi-domain translation models are usually also evaluated using the quality of their semantic interpolation results. [Expand]

0.00
Wednesday Poster Session
[1331]

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos

Jiawei Liu, Zheng-Jun Zha, Wei Wu, Kecheng Zheng, Qibin Sun

Video-based person re-identification aims to match pedestrians from video sequences across non-overlapping camera views. [Expand]

0.00
Tuesday Poster Session
[1332]

3D Human Action Representation Learning via Cross-View Consistency Pursuit

Linguo Li, Minsi Wang, Bingbing Ni, Hang Wang, Jiancheng Yang, Wenjun Zhang

In this work, we propose a Cross-view Contrastive Learning framework for unsupervised 3D skeleton-based action representation (CrosSCLR), by leveraging multi-view complementary supervision signal. [Expand]

0.00
Tuesday Poster Session
[1333]

Adaptive Prototype Learning and Allocation for Few-Shot Segmentation

Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, Joongkyu Kim

Prototype learning is extensively used for few-shot segmentation. [Expand]

0.00
Wednesday Poster Session
[1334]

Combined Depth Space Based Architecture Search for Person Re-Identification

Hanjun Li, Gaojie Wu, Wei-Shi Zheng

Most works on person re-identification (ReID) take advantage of large backbone networks such as ResNet, which are designed for image classification instead of ReID, for feature extraction. [Expand]

0.00
Tuesday Poster Session
[1335]

Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer

Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu

Occluded person re-identification (Re-ID) is a challenging task as persons are frequently occluded by various obstacles or other persons, especially in the crowd scenario. [Expand]

0.00
Tuesday Poster Session
[1336]

Domain Consensus Clustering for Universal Domain Adaptation

Guangrui Li, Guoliang Kang, Yi Zhu, Yunchao Wei, Yi Yang

In this paper, we investigate the Universal Domain Adaptation (UniDA) problem, which aims to transfer the knowledge from source to target under unaligned label space. [Expand]

0.00
Wednesday Poster Session
[1337]

Dynamic Class Queue for Large Scale Face Recognition in the Wild

Bi Li, Teng Xi, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Wenyu Liu

Learning discriminative representation using large-scale face datasets in the wild is crucial for real-world applications, yet it remains challenging. [Expand]

0.00
Tuesday Poster Session
[1338]

Dynamic Domain Adaptation for Efficient Inference

Shuang Li, JinMing Zhang, Wenxuan Ma, Chi Harold Liu, Wei Li

Domain adaptation (DA) enables knowledge transfer from a labeled source domain to an unlabeled target domain by reducing the cross-domain distribution discrepancy. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1339]

Dynamic Transfer for Multi-Source Domain Adaptation

Yunsheng Li, Lu Yuan, Yinpeng Chen, Pei Wang, Nuno Vasconcelos

Recent works of multi-source domain adaptation focus on learning a domain-agnostic model, of which the parameters are static. [Expand]

0.00
Wednesday Poster Session
[1340]

Ego-Exo: Transferring Visual Representations From Third-Person to First-Person Videos

Yanghao Li, Tushar Nagarajan, Bo Xiong, Kristen Grauman

We introduce an approach for pre-training egocentric video models using large-scale third-person video datasets. [Expand]

0.00
Tuesday Poster Session
[1341]

FaceInpainter: High Fidelity Face Adaptation to Heterogeneous Domains

Jia Li, Zhaoyang Li, Jie Cao, Xingguang Song, Ran He

In this work, we propose a novel two-stage framework named FaceInpainter to implement controllable Identity-Guided Face Inpainting (IGFI) under heterogeneous domains. [Expand]

0.00
Tuesday Poster Session
[1342]

Few-Shot Object Detection via Classification Refinement and Distractor Retreatment

Yiting Li, Haiyue Zhu, Yu Cheng, Wenxin Wang, Chek Sing Teo, Cheng Xiang, Prahlad Vadakkepat, Tong Heng Lee

We aim to tackle the challenging Few-Shot Object Detection (FSOD) where data-scarce categories are presented during the model learning. [Expand]

0.00
Thursday Poster Session
[1343]

Hilbert Sinkhorn Divergence for Optimal Transport

Qian Li, Zhichao Wang, Gang Li, Jun Pang, Guandong Xu

Sinkhorn divergence has become a very popular metric to compare probability distributions in optimal transport. [Expand]

0.00
Tuesday Poster Session
[1344]

Learning To Identify Correct 2D-2D Line Correspondences on Sphere

Haoang Li, Kai Chen, Ji Zhao, Jiangliu Wang, Pyojin Kim, Zhe Liu, Yun-Hui Liu

Given a set of putative 2D-2D line correspondences, we aim to identify correct matches. [Expand]

0.00
Thursday Poster Session
[1345]

Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression

Wanhua Li, Xiaoke Huang, Jiwen Lu, Jianjiang Feng, Jie Zhou

Uncertainty is the only certainty there is. [Expand]

0.00
Thursday Poster Session
[1346]

Lighting, Reflectance and Geometry Estimation From 360° Panoramic Stereo

Junxuan Li, Hongdong Li, Yasuyuki Matsushita

We propose a method for estimating high-definition spatially-varying lighting, reflectance, and geometry of a scene from 360° stereo images. [Expand]

0.00
Wednesday Poster Session
[1347]

Generalizing to the Open World: Deep Visual Odometry With Online Adaptation

Shunkai Li, Xin Wu, Yingdian Cao, Hongbin Zha

Although learning-based visual odometry (VO) has shown impressive results in recent years, the pretrained networks may easily collapse in unseen environments. [Expand]

0.00
Thursday Poster Session
[1348]

Meta-Mining Discriminative Samples for Kinship Verification

Wanhua Li, Shiwei Wang, Jiwen Lu, Jianjiang Feng, Jie Zhou

Kinship verification aims to find out whether there is a kin relation for a given pair of facial images. [Expand]

0.00
Friday Poster Session
[1349]

Probabilistic Model Distillation for Semantic Correspondence

Xin Li, Deng-Ping Fan, Fan Yang, Ao Luo, Hong Cheng, Zicheng Liu

Semantic correspondence is a fundamental problem in computer vision, which aims at establishing dense correspondences across images depicting different instances under the same category. [Expand]

0.00
Wednesday Poster Session
[1350]

Progressive Stage-Wise Learning for Unsupervised Feature Representation Enhancement

Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

Unsupervised learning methods have recently shown their competitiveness against supervised training. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1351]

Representing Videos As Discriminative Sub-Graphs for Action Recognition

Dong Li, Zhaofan Qiu, Yingwei Pan, Ting Yao, Houqiang Li, Tao Mei

Human actions are typically of combinatorial structures or patterns, i.e., subjects, objects, plus spatio-temporal interactions in between. [Expand]

0.00
Tuesday Poster Session
[1352]

Spatial Assembly Networks for Image Representation Learning

Yang Li, Shichao Kan, Jianhe Yuan, Wenming Cao, Zhihai He

It has been long recognized that deep neural networks are sensitive to changes in spatial configurations or scene structures. [Expand]

0.00
Thursday Poster Session
[1353]

SelfDoc: Self-Supervised Document Representation Learning

Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu

We propose SelfDoc, a task-agnostic pre-training framework for document image understanding. [Expand]

0.00
Tuesday Poster Session
[1354]

Self-Supervised Video Hashing via Bidirectional Transformers

Shuyan Li, Xiu Li, Jiwen Lu, Jie Zhou

Most existing unsupervised video hashing methods are built on unidirectional models with less reliable training objectives, which underuse the correlations among frames and the similarity structure between videos. [Expand]

0.00
Thursday Poster Session
[1355]

Spatial Feature Calibration and Temporal Fusion for Effective One-Stage Video Instance Segmentation

Minghan Li, Shuai Li, Lida Li, Lei Zhang

Modern one-stage video instance segmentation networks suffer from two limitations. [Expand]

0.00
Wednesday Poster Session
[1356]

Spherical Confidence Learning for Face Recognition

Shen Li, Jianqing Xu, Xiaqing Xu, Pengcheng Shen, Shaoxin Li, Bryan Hooi

An emerging line of research has found that spherical spaces better match the underlying geometry of facial images, as evidenced by the state-of-the-art facial recognition methods which benefit empirically from spherical representations. [Expand]

0.00
Friday Poster Session
[1357]

The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures

Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, Radu Timofte

In this paper, we tackle the problem of convolutional neural network design. [Expand]

0.00
0
0
0
Monday Poster Session
[1358]

Towards Compact CNNs via Collaborative Compression

Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. [Expand]

0.00
Tuesday Poster Session
[1359]

Toward Accurate and Realistic Outfits Visualization With Attention to Details

Kedan Li, Min Jin Chong, Jeffrey Zhang, Jingen Liu

Virtual try-on methods aim to generate images of fashion models wearing arbitrary combinations of garments. [Expand]

0.00
Thursday Poster Session
[1360]

Transferable Semantic Augmentation for Domain Adaptation

Shuang Li, Mixue Xie, Kaixiong Gong, Chi Harold Liu, Yulin Wang, Wei Li

Domain adaptation has been widely explored by transferring the knowledge from a label-rich source domain to a related but unlabeled target domain. [Expand]

0.00
0
0
0
Thursday Poster Session
[1361]

Transformation Invariant Few-Shot Object Detection

Aoxue Li, Zhenguo Li

Few-shot object detection (FSOD) aims to learn detectors that can be generalized to novel classes with only a few instances. [Expand]

0.00
Tuesday Poster Session
[1362]

Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations

Zhihui Li, Lina Yao

Temporal action detection on unconstrained videos has seen significant research progress in recent years. [Expand]

0.00
Tuesday Poster Session
[1363]

VirFace: Enhancing Face Recognition via Unlabeled Shallow Data

Wenyu Li, Tianchu Guo, Pengyu Li, Binghui Chen, Biao Wang, Wangmeng Zuo, Lei Zhang

Recently, exploiting the effect of unlabeled data for face recognition has attracted increasing attention. [Expand]

0.00
Thursday Poster Session
[1364]

CLCC: Contrastive Learning for Color Constancy

Yi-Chen Lo, Chia-Che Chang, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang, Kevin Jou

In this paper, we present CLCC, a novel contrastive learning framework for color constancy. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1365]

Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks

Xiaoxiao Long, Lingjie Liu, Wei Li, Christian Theobalt, Wenping Wang

We present a novel method for multi-view depth estimation from a single video, which is a critical task in various applications, such as perception, reconstruction and robot navigation. [Expand]

0.00
Wednesday Poster Session
[1366]

Radar-Camera Pixel Depth Association for Depth Completion

Yunfei Long, Daniel Morris, Xiaoming Liu, Marcos Castro, Punarjay Chakravarty, Praveen Narayanan

While radar and video data can be readily fused at the detection level, fusing them at the pixel level is potentially more beneficial. [Expand]

0.00
Thursday Poster Session
[1367]

Conditional Bures Metric for Domain Adaptation

You-Wei Luo, Chuan-Xian Ren

As a vital problem in classification-oriented transfer, unsupervised domain adaptation (UDA) has attracted widespread attention in recent years. [Expand]

0.00
Thursday Poster Session
[1368]

Action Unit Memory Network for Weakly Supervised Temporal Action Localization

Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang

Weakly supervised temporal action localization aims to detect and localize actions in untrimmed videos with only video-level labels during training. [Expand]

0.00
Wednesday Poster Session
[1369]

Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

Huiwen Luo, Koki Nagano, Han-Wei Kung, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li

We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. [Expand]

0.00
Thursday Poster Session
[1370]

Scalable Differential Privacy With Sparse Network Finetuning

Zelun Luo, Daniel J. Wu, Ehsan Adeli, Li Fei-Fei

We propose a novel method for privacy-preserving training of deep neural networks leveraging public, out-domain data. [Expand]

0.00
Tuesday Poster Session
[1371]

Intelligent Carpet: Inferring 3D Human Pose From Tactile Signals

Yiyue Luo, Yunzhu Li, Michael Foshey, Wan Shou, Pratyusha Sharma, Tomas Palacios, Antonio Torralba, Wojciech Matusik

Daily human activities, e.g., locomotion, exercises, and resting, are heavily guided by the tactile interactions between the human and the ground. [Expand]

0.00
Wednesday Poster Session
[1372]

Stay Positive: Non-Negative Image Synthesis for Augmented Reality

Katie Luo, Guandao Yang, Wenqi Xian, Harald Haraldsson, Bharath Hariharan, Serge Belongie

In applications such as optical see-through and projector augmented reality, producing images amounts to solving non-negative image generation, where one can only add light to an existing image. [Expand]

0.00
Wednesday Poster Session
[1373]

Large-Capacity Image Steganography Based on Invertible Neural Networks

Shao-Ping Lu, Rong Wang, Tao Zhong, Paul L. Rosin

Many attempts have been made to hide information in images, where the main challenge is how to increase the payload capacity without the container image being detected as containing a message. [Expand]

0.00
Wednesday Poster Session
[1374]

CGA-Net: Category Guided Aggregation for Point Cloud Semantic Segmentation

Tao Lu, Limin Wang, Gangshan Wu

Previous point cloud semantic segmentation networks use the same process to aggregate features from neighbors of the same category and different categories. [Expand]

0.00
Thursday Poster Session
[1375]

Dual-GAN: Joint BVP and Noise Modeling for Remote Physiological Measurement

Hao Lu, Hu Han, S. Kevin Zhou

Remote photoplethysmography (rPPG) based physiological measurement has great application values in health monitoring, emotion analysis, etc. [Expand]

0.00
Thursday Poster Session
[1376]

MASA-SR: Matching Acceleration and Spatial Adaptation for Reference-Based Image Super-Resolution

Liying Lu, Wenbo Li, Xin Tao, Jiangbo Lu, Jiaya Jia

Reference-based image super-resolution (RefSR) has shown promising success in recovering high-frequency details by utilizing an external reference image (Ref). [Expand]

0.00
0
0
0
Tuesday Poster Session
[1377]

Personalized Outfit Recommendation With Learnable Anchors

Zhi Lu, Yang Hu, Yan Chen, Bing Zeng

The multimedia community has recently seen a tremendous surge of interest in the fashion recommendation problem. [Expand]

0.00
Thursday Poster Session
[1378]

Learning Normal Dynamics in Videos With Meta Prototype Network

Hui Lv, Chen Chen, Zhen Cui, Chunyan Xu, Yong Li, Jian Yang

Frame reconstruction (current or future frames) based on Auto-Encoder (AE) is a popular method for video anomaly detection. [Expand]

0.00
Thursday Poster Session
[1379]

Progressive Modality Reinforcement for Human Multimodal Emotion Recognition From Unaligned Multimodal Sequences

Fengmao Lv, Xiang Chen, Yanyong Huang, Lixin Duan, Guosheng Lin

Human multimodal emotion recognition involves time-series data of different modalities, such as natural language, visual motions, and acoustic behaviors. [Expand]

0.00
Monday Poster Session
[1380]

Residential Floor Plan Recognition and Reconstruction

Xiaolei Lv, Shengchu Zhao, Xinyang Yu, Binqiang Zhao

Recognition and reconstruction of residential floor plan drawings are important and challenging in design, decoration, and architectural remodeling fields. [Expand]

0.00
Friday Poster Session
[1381]

Towards Evaluating and Training Verifiably Robust Neural Networks

Zhaoyang Lyu, Minghao Guo, Tong Wu, Guodong Xu, Kehuan Zhang, Dahua Lin

Recent works have shown that interval bound propagation (IBP) can be used to train verifiably robust neural networks. [Expand]

0.00
Tuesday Poster Session
[1382]

Efficient Multi-Stage Video Denoising With Recurrent Spatio-Temporal Fusion

Matteo Maggioni, Yibin Huang, Cheng Li, Shuai Xiao, Zhongqian Fu, Fenglong Song

In recent years, denoising methods based on deep learning have achieved unparalleled performance at the cost of large computational complexity. [Expand]

0.00
Tuesday Poster Session
[1383]

MultiLink: Multi-Class Structure Recovery via Agglomerative Clustering and Model Selection

Luca Magri, Filippo Leveni, Giacomo Boracchi

We address the problem of recovering multiple structures of different classes in a dataset contaminated by noise and outliers. [Expand]

0.00
Monday Poster Session
[1384]

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira

How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. [Expand]

0.00
Wednesday Poster Session
[1385]

Magic Layouts: Structural Prior for Component Detection in User Interface Designs

Dipu Manandhar, Hailin Jin, John Collomosse

We present Magic Layouts, a method for parsing screenshots or hand-drawn sketches of user interface (UI) layouts. [Expand]

0.00
Friday Poster Session
[1386]

CapsuleRRT: Relationships-Aware Regression Tracking via Capsules

Ding Ma, Xiangqian Wu

Regression tracking has gained more and more attention thanks to its easy-to-implement characteristics, while existing regression trackers rarely consider the relationships between the object parts and the complete object. [Expand]

0.00
Wednesday Poster Session
[1387]

Weakly Supervised Action Selection Learning in Video

Junwei Ma, Satya Krishna Gorti, Maksims Volkovs, Guangwei Yu

Localizing actions in video is a core task in computer vision. [Expand]

0.00
Wednesday Poster Session
[1388]

Image Super-Resolution With Non-Local Sparse Attention

Yiqun Mei, Yuchen Fan, Yuqian Zhou

Both non-local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). [Expand]

0.00
Tuesday Poster Session
[1389]

Real-Time Sphere Sweeping Stereo From Multiview Fisheye Images

Andreas Meuleman, Hyeonjoong Jang, Daniel S. Jeon, Min H. Kim

A set of cameras with fisheye lenses have been used to capture a wide field of view. [Expand]

0.00
Thursday Poster Session
[1390]

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild

Jiaxu Miao, Yunchao Wei, Yu Wu, Chen Liang, Guangrui Li, Yi Yang

In this paper, we present a new dataset with the target of advancing the scene parsing task from images to videos. [Expand]

0.00
Tuesday Poster Session
[1391]

PVGNet: A Bottom-Up One-Stage 3D Object Detector With Integrated Multi-Level Features

Zhenwei Miao, Jikai Chen, Hongyu Pan, Ruiwen Zhang, Kaixuan Liu, Peihan Hao, Jun Zhu, Yang Wang, Xin Zhan

Quantization-based methods are widely used in LiDAR points 3D object detection for their efficiency in extracting context information. [Expand]

0.00
Tuesday Poster Session
[1392]

Physically-Aware Generative Network for 3D Shape Modeling

Mariem Mezghanni, Malika Boulkenafed, Andre Lieutier, Maks Ovsjanikov

Shapes are often designed to satisfy structural properties and serve a particular functionality in the physical world. [Expand]

0.00
Wednesday Poster Session
[1393]

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps

Lu Mi, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, Chen Sun, Cordelia Schmid, Nir Shavit, Yuning Chai, Dragomir Anguelov

High Definition (HD) maps are maps with precise definitions of road lanes with rich semantics of the traffic rules. [Expand]

0.00
Tuesday Poster Session
[1394]

Wasserstein Barycenter for Multi-Source Domain Adaptation

Eduardo Fernandes Montesuma, Fred Maurice Ngole Mboula

Multi-source domain adaptation is a key technique that allows a model to be trained on data coming from various probability distributions. [Expand]

0.00
Friday Poster Session
[1395]

Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences

Norman Muller, Yu-Shiang Wong, Niloy J. Mitra, Angela Dai, Matthias Niessner

Multi-object tracking from RGB-D video sequences is a challenging problem due to the combination of changing viewpoints, motion, and occlusions over time. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1396]

Extreme Low-Light Environment-Driven Image Denoising Over Permanently Shadowed Lunar Regions With a Physical Noise Model

Ben Moseley, Valentin Bickel, Ignacio G. Lopez-Francos, Loveneesh Rana

Recently, learning-based approaches have achieved impressive results in the field of low-light image denoising. [Expand]

0.00
Tuesday Poster Session
[1397]

Interventional Video Grounding With Dual Contrastive Learning

Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu

Video grounding aims to localize a moment from an untrimmed video for a given textual query. [Expand]

0.00
Monday Poster Session
[1398]

All Labels Are Not Created Equal: Enhancing Semi-Supervision via Label Grouping and Co-Training

Islam Nassar, Samitha Herath, Ehsan Abbasnejad, Wray Buntine, Gholamreza Haffari

Pseudo-labeling is a key component in semi-supervised learning (SSL). [Expand]

0.00
0
0
0
Wednesday Poster Session
[1399]

Divide-and-Conquer for Lane-Aware Diverse Trajectory Prediction

Sriram Narayanan, Ramin Moslemi, Francesco Pittaluga, Buyu Liu, Manmohan Chandraker

Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions. [Expand]

0.00
Friday Poster Session
[1400]

FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation

Jaemin Na, Heechul Jung, Hyung Jin Chang, Wonjun Hwang

Unsupervised domain adaptation (UDA) methods for learning domain invariant representations have achieved remarkable progress. [Expand]

0.00
0
0
0
Monday Poster Session
[1401]

Pedestrian and Ego-Vehicle Trajectory Prediction From Monocular Camera

Lukas Neumann, Andrea Vedaldi

Predicting future pedestrian trajectory is a crucial component of autonomous driving systems, as recognizing critical situations based only on current pedestrian position may come too late for any meaningful corrective action (e.g. [Expand]

0.00
Wednesday Poster Session
[1402]

Dictionary-Guided Scene Text Recognition

Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh-Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai

Language prior plays an important role in the way humans perceive and recognize text in the wild. [Expand]

0.00
Wednesday Poster Session
[1403]

Discovering Relationships Between Object Categories via Universal Canonical Maps

Natalia Neverova, Artsiom Sanakoyeu, Patrick Labatut, David Novotny, Andrea Vedaldi

We tackle the problem of learning the geometry of multiple categories of deformable objects jointly. [Expand]

0.00
Monday Poster Session
[1404]

Clusformer: A Transformer Based Clustering Approach to Unsupervised Large-Scale Face and Visual Landmark Recognition

Xuan-Bac Nguyen, Duc Toan Bui, Chi Nhan Duong, Tien D. Bui, Khoa Luu

Research in automatic unsupervised visual clustering has received considerable attention over the last couple of years. [Expand]

0.00
Wednesday Poster Session
[1405]

FAPIS: A Few-Shot Anchor-Free Part-Based Instance Segmenter

Khoi Nguyen, Sinisa Todorovic

This paper is about few-shot instance segmentation, where training and test image sets do not share the same object classes. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1406]

Controlling the Rain: From Removal to Rendering

Siqi Ni, Xueyun Cao, Tao Yue, Xuemei Hu

Existing rain image editing methods focus on either removing rain from rain images or rendering rain on rain-free images. [Expand]

0.00
Tuesday Poster Session
[1407]

HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection

Jongyoun Noh, Sanghoon Lee, Bumsub Ham

We address the problem of 3D object detection, that is, estimating 3D object bounding boxes from point clouds. [Expand]

0.00
Thursday Poster Session
[1408]

Automated Log-Scale Quantization for Low-Cost Deep Neural Networks

Sangyun Oh, Hyeonuk Sim, Sugil Lee, Jongeun Lee

Quantization plays an important role in deep neural network (DNN) hardware. [Expand]

0.00
Monday Poster Session
[1409]

Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation

Youngmin Oh, Beomjun Kim, Bumsub Ham

We address the problem of weakly-supervised semantic segmentation (WSSS) using bounding box annotations. [Expand]

0.00
Tuesday Poster Session
[1410]

Protecting Intellectual Property of Generative Adversarial Networks From Ambiguity Attacks

Ding Sheng Ong, Chee Seng Chan, Kam Woh Ng, Lixin Fan, Qiang Yang

Ever since Machine Learning as a Service emerged as a viable business that utilizes deep learning models to generate lucrative revenue, Intellectual Property Right (IPR) has become a major concern because these deep learning models can easily be replicated, shared, and re-distributed by any unauthorized third parties. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1411]

A Quasiconvex Formulation for Radial Cameras

Carl Olsson, Viktor Larsson, Fredrik Kahl

In this paper we study structure from motion problems for 1D radial cameras. [Expand]

0.00
Thursday Poster Session
[1412]

Bilinear Parameterization for Non-Separable Singular Value Penalties

Marcus Valtonen Ornhag, Jose Pedro Iglesias, Carl Olsson

Low rank inducing penalties have been proven to successfully uncover fundamental structures considered in computer vision and machine learning; however, such methods generally lead to non-convex optimization problems. [Expand]

0.00
Tuesday Poster Session
[1413]

Neural Auto-Exposure for High-Dynamic Range Object Detection

Emmanuel Onzon, Fahim Mannan, Felix Heide

Real-world scenes have a dynamic range of up to 280 dB that today's imaging sensors cannot directly capture. [Expand]

0.00
Wednesday Poster Session
[1414]

SDD-FIQA: Unsupervised Face Image Quality Assessment With Similarity Distribution Distance

Fu-Zhao Ou, Xingyu Chen, Ruixin Zhang, Yuge Huang, Shaoxin Li, Jilin Li, Yong Li, Liujuan Cao, Yuan-Gen Wang

In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in an unconstrained scenario. [Expand]

0.00
Wednesday Poster Session
[1415]

Fast Sinkhorn Filters: Using Matrix Scaling for Non-Rigid Shape Correspondence With Functional Maps

Gautam Pai, Jing Ren, Simone Melzi, Peter Wonka, Maks Ovsjanikov

In this paper, we provide a theoretical foundation for pointwise map recovery from functional maps and highlight its relation to a range of shape correspondence methods based on spectral alignment. [Expand]

0.00
Monday Poster Session
[1416]

Synthesize-It-Classifier: Learning a Generative Classifier Through Recurrent Self-Analysis

Arghya Pal, Raphael C.-W. Phan, KokSheik Wong

In this work, we show the generative capability of an image classifier network by synthesizing high-resolution, photo-realistic, and diverse images at scale. [Expand]

0.00
Tuesday Poster Session
[1417]

Generalization on Unseen Domains via Inference-Time Label-Preserving Target Projections

Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

Generalization of machine learning models trained on a set of source domains on unseen target domains with different statistics, is a challenging problem. [Expand]

0.00
Thursday Poster Session
[1418]

Trajectory Prediction With Latent Belief Energy-Based Model

Bo Pang, Tianyang Zhao, Xu Xie, Ying Nian Wu

Human trajectory prediction is critical for autonomous platforms like self-driving cars or social robots. [Expand]

0.00
Thursday Poster Session
[1419]

Recorrupted-to-Recorrupted: Unsupervised Deep Learning for Image Denoising

Tongyao Pang, Huan Zheng, Yuhui Quan, Hui Ji

Deep denoiser, the deep network for denoising, has been the focus of the recent development on image denoising. [Expand]

0.00
Monday Poster Session
[1420]

Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders

Jiwoong Park, Junho Cho, Hyung Jin Chang, Jin Young Choi

Most of the existing literature regarding hyperbolic embedding concentrates on supervised learning, whereas the use of unsupervised hyperbolic embedding is less well explored. [Expand]

0.00
Tuesday Poster Session
[1421]

Learning To Predict Visual Attributes in the Wild

Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

Visual attributes constitute a large portion of information contained in a scene. [Expand]

0.00
Thursday Poster Session
[1422]

SliceNet: Deep Dense Depth Estimation From a Single Indoor Panorama Using a Slice-Based Representation

Giovanni Pintore, Marco Agus, Eva Almansa, Jens Schneider, Enrico Gobbetti

We introduce a novel deep neural network to estimate a depth map from a single monocular indoor panorama. [Expand]

0.00
Thursday Poster Session
[1423]

Recognizing Actions in Videos From Unseen Viewpoints

AJ Piergiovanni, Michael S. Ryoo

Standard methods for video recognition use large CNNs designed to capture spatio-temporal data. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1424]

CompositeTasking: Understanding Images by Spatial Composition of Tasks

Nikola Popovic, Danda Pani Paudel, Thomas Probst, Guolei Sun, Luc Van Gool

We define the concept of CompositeTasking as the fusion of multiple, spatially distributed tasks, for various aspects of image understanding. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1425]

A Functional Approach to Rotation Equivariant Non-Linearities for Tensor Field Networks

Adrien Poulenard, Leonidas J. Guibas

Learning pose invariant representation is a fundamental problem in shape analysis. [Expand]

0.00
Thursday Poster Session
[1426]

Labeled From Unlabeled: Exploiting Unlabeled Data for Few-Shot Deep HDR Deghosting

K. Ram Prabhakar, Gowtham Senthil, Susmit Agrawal, R. Venkatesh Babu, Rama Krishna Sai S Gorthi

High Dynamic Range (HDR) deghosting is an indispensable tool in capturing wide dynamic range scenes without ghosting artifacts. [Expand]

0.00
Tuesday Poster Session
[1427]

Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

John Phillips, Julieta Martinez, Ioan Andrei Barsan, Sergio Casas, Abbas Sadat, Raquel Urtasun

Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving including perception, motion forecasting, and motion planning. [Expand]

0.00
Tuesday Poster Session
[1428]

BABEL: Bodies, Action and Behavior With English Labels

Abhinanda R. Punnakkal, Arjun Chandrasekaran, Nikos Athanasiou, Alejandra Quiros-Ramirez, Michael J. Black

Understanding the semantics of human movement -- the what, how and why of the movement -- is an important problem that requires datasets of human actions with semantic labels. [Expand]

0.00
Monday Poster Session
[1429]

Boosting Video Representation Learning With Multi-Faceted Integration

Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Xiao-Ping Zhang, Dong Wu, Tao Mei

Video content is multifaceted, consisting of objects, scenes, interactions or actions. [Expand]

0.00
Thursday Poster Session
[1430]

Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation Priors

Haiquan Qiu, Yao Wang, Deyu Meng

Snapshot compressive imaging (SCI) is a new type of compressive imaging system that compresses multiple frames of images into a single snapshot measurement, which enjoys low cost, low bandwidth, and high-speed sensing rate. [Expand]

0.00
Wednesday Poster Session
[1431]

PQA: Perceptual Question Answering

Yonggang Qi, Kai Zhang, Aneeshan Sain, Yi-Zhe Song

Perceptual organization remains one of the very few established theories on the human visual system. [Expand]

0.00
Thursday Poster Session
[1432]

Multi-Scale Aligned Distillation for Low-Resolution Detection

Lu Qi, Jason Kuen, Jiuxiang Gu, Zhe Lin, Yi Wang, Yukang Chen, Yanwei Li, Jiaya Jia

In instance-level detection tasks (e.g., object detection), reducing input resolution is an easy option to improve runtime efficiency. [Expand]

0.00
Thursday Poster Session
[1433]

Removing Raindrops and Rain Streaks in One Go

Ruijie Quan, Xin Yu, Yuanzhi Liang, Yi Yang

Existing rain-removal algorithms often tackle either rain streak removal or raindrop removal, and thus may fail to handle real-world rainy scenes. [Expand]

0.00
Wednesday Poster Session
[1434]

DyGLIP: A Dynamic Graph Model With Link Prediction for Accurate Multi-Camera Multiple Object Tracking

Kha Gia Quach, Pha Nguyen, Huu Le, Thanh-Dat Truong, Chi Nhan Duong, Minh-Triet Tran, Khoa Luu

Multi-Camera Multiple Object Tracking (MC-MOT) is a significant computer vision problem due to its emerging applicability in several real-world applications. [Expand]

0.00
Thursday Poster Session
[1435]

Exploiting & Refining Depth Distributions With Triangulation Light Curtains

Yaadhav Raaj, Siddharth Ancha, Robert Tamburo, David Held, Srinivasa G. Narasimhan

Active sensing through the use of Adaptive Depth Sensors is a nascent field, with potential in areas such as Advanced driver-assistance systems (ADAS). [Expand]

0.00
Wednesday Poster Session
[1436]

DAT: Training Deep Networks Robust To Label-Noise by Matching the Feature Distributions

Yuntao Qu, Shasha Mo, Jianwei Niu

In real application scenarios, the performance of deep networks may be degraded when the dataset contains noisy labels. [Expand]

0.00
Tuesday Poster Session
[1437]

Flow Guided Transformable Bottleneck Networks for Motion Retargeting

Jian Ren, Menglei Chai, Oliver J. Woodford, Kyle Olszewski, Sergey Tulyakov

Human motion retargeting aims to transfer the motion of one person in a driving video or set of images to another person. [Expand]

0.00
Wednesday Poster Session
[1438]

Adaptive Consistency Prior Based Deep Network for Image Denoising

Chao Ren, Xiaohai He, Chuncheng Wang, Zhibo Zhao

Recent studies have shown that deep networks can achieve promising results for image denoising. [Expand]

0.00
Wednesday Poster Session
[1439]

Reciprocal Transformations for Unsupervised Video Object Segmentation

Sucheng Ren, Wenxi Liu, Yongtuo Liu, Haoxin Chen, Guoqiang Han, Shengfeng He

Unsupervised video object segmentation (UVOS) aims at segmenting the primary objects in videos without any human intervention. [Expand]

0.00
Thursday Poster Session
[1440]

Learning From the Master: Distilling Cross-Modal Advanced Knowledge for Lip Reading

Sucheng Ren, Yong Du, Jianming Lv, Guoqiang Han, Shengfeng He

Lip reading aims to predict the spoken sentences from silent lip videos. [Expand]

0.00
Thursday Poster Session
[1441]

End-to-End High Dynamic Range Camera Pipeline Optimization

Nicolas Robidoux, Luis E. Garcia Capel, Dong-eun Seo, Avinash Sharma, Federico Ariza, Felix Heide

With a 280 dB dynamic range, the real world is a High Dynamic Range (HDR) world. [Expand]

0.00
Tuesday Poster Session
[1442]

Gaussian Context Transformer

Dongsheng Ruan, Daiyin Wang, Yuan Zheng, Nenggan Zheng, Min Zheng

Recently, a large number of channel attention blocks are proposed to boost the representational power of deep convolutional neural networks (CNNs). [Expand]

0.00
Thursday Poster Session
[1443]

Learning-Based Image Registration With Meta-Regularization

Ebrahim Al Safadi, Xubo Song

We introduce a meta-regularization framework for learning-based image registration. [Expand]

0.00
Wednesday Poster Session
[1444]

Learning an Explicit Weighting Scheme for Adapting Complex HSI Noise

Xiangyu Rui, Xiangyong Cao, Qi Xie, Zongsheng Yue, Qian Zhao, Deyu Meng

A general approach for handling hyperspectral image (HSI) denoising issue is to impose weights on different HSI pixels to suppress negative influence brought by noisy elements. [Expand]

0.00
Tuesday Poster Session
[1445]

Multi-Perspective LSTM for Joint Visual Representation Learning

Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia, Ali Etemad

We present a novel LSTM cell architecture capable of learning both intra- and inter-perspective relationships available in visual sequences captured from multiple perspectives. [Expand]

0.00
Friday Poster Session
[1446]

Introvert: Human Trajectory Prediction via Conditional 3D Attention

Nasim Shafiee, Taskin Padir, Ehsan Elhamifar

Predicting human trajectories is an important component of autonomous moving platforms, such as social robots and self-driving cars. [Expand]

0.00
Friday Poster Session
[1447]

Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects

Aashish Sharma, Robby T. Tan

Most existing nighttime visibility enhancement methods focus on low light. [Expand]

0.00
Thursday Poster Session
[1448]

CFNet: Cascade and Fused Cost Volume for Robust Stereo Matching

Zhelun Shen, Yuchao Dai, Zhibo Rao

Recently, the ever-increasing capacity of large-scale annotated datasets has led to profound progress in stereo matching. [Expand]

0.00
Thursday Poster Session
[1449]

Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes

Shuai Shen, Wanhua Li, Zheng Zhu, Guan Huang, Dalong Du, Jiwen Lu, Jie Zhou

Face clustering is a promising method for annotating unlabeled face images. [Expand]

0.00
Wednesday Poster Session
[1450]

Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation

Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji

Panoptic segmentation aims to partition an image to object instances and semantic content for thing and stuff categories, respectively. [Expand]

0.00
Friday Poster Session
[1451]

clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

Suprosanna Shit, Johannes C. Paetzold, Anjany Sekuboyina, Ivan Ezhov, Alexander Unger, Andrey Zhylka, Josien P. W. Pluim, Ulrich Bauer, Bjoern H. Menze

Accurate segmentation of tubular, network-like structures, such as vessels, neurons, or roads, is relevant to many fields of research. [Expand]

0.00
Friday Poster Session
[1452]

Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment

Dongyu She, Yu-Kun Lai, Gaoxiong Yi, Kun Xu

Learning computational models of image aesthetics can have a substantial impact on visual art and graphic design. [Expand]

0.00
Wednesday Poster Session
[1453]

Learning by Planning: Language-Guided Global Image Editing

Jing Shi, Ning Xu, Yihang Xu, Trung Bui, Franck Dernoncourt, Chenliang Xu

Recently, language-guided global image editing draws increasing attention with growing application potentials. [Expand]

0.00
Thursday Poster Session
[1454]

GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition

Fengmin Shi, Jie Guo, Haonan Zhang, Shan Yang, Xiying Wang, Yanwen Guo

In this paper, we aim to recognize materials with combined use of auditory and visual perception. [Expand]

0.00
Thursday Poster Session
[1455]

Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data

Ying Shu, Yan Yan, Si Chen, Jing-Hao Xue, Chunhua Shen, Hanzi Wang

Recent advances in deep learning have demonstrated excellent results for Facial Attribute Recognition (FAR), typically trained with large-scale labeled data. [Expand]

0.00
Thursday Poster Session
[1456]

Communication Efficient SGD via Gradient Sampling With Bayes Prior

Liuyihan Song, Kang Zhao, Pan Pan, Yu Liu, Yingya Zhang, Yinghui Xu, Rong Jin

Gradient compression has been widely adopted in data-parallel distributed training of deep neural networks to reduce communication overhead. [Expand]

0.00
Thursday Poster Session
[1457]

Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos

Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang

In this paper, we address the problem of referring expression comprehension in videos, which is challenging due to complex expression and scene dynamics. [Expand]

0.00
Monday Poster Session
[1458]

Hybrid Message Passing With Performance-Driven Structures for Facial Action Unit Detection

Tengfei Song, Zijun Cui, Wenming Zheng, Qiang Ji

Message passing neural network has been an effective method to represent dependencies among nodes by propagating messages. [Expand]

0.00
Tuesday Poster Session
[1459]

Mesh Saliency: An Independent Perceptual Measure or a Derivative of Image Saliency?

Ran Song, Wei Zhang, Yitian Zhao, Yonghuai Liu, Paul L. Rosin

While mesh saliency aims to predict regional importance of 3D surfaces in agreement with human visual perception and is well researched in computer vision and graphics, latest work with eye-tracking experiments shows that state-of-the-art mesh saliency methods remain poor at predicting human fixations. [Expand]

0.00
Wednesday Poster Session
[1460]

Tree-Like Decision Distillation

Jie Song, Haofei Zhang, Xinchao Wang, Mengqi Xue, Ying Chen, Li Sun, Dacheng Tao, Mingli Song

Knowledge distillation pursues a diminutive yet well-behaved student network by harnessing the knowledge learned by a cumbersome teacher model. [Expand]

0.00
Thursday Poster Session
[1461]

Spatio-temporal Contrastive Domain Adaptation for Action Recognition

Xiaolin Song, Sicheng Zhao, Jingyu Yang, Huanjing Yue, Pengfei Xu, Runbo Hu, Hua Chai

Unsupervised domain adaptation (UDA) for human action recognition is a practical and challenging problem. [Expand]

0.00
Wednesday Poster Session
[1462]

Dynamic Probabilistic Graph Convolution for Facial Action Unit Intensity Estimation

Tengfei Song, Zijun Cui, Yuru Wang, Wenming Zheng, Qiang Ji

Deep learning methods have been widely applied to automatic facial action unit (AU) intensity estimation and achieved state-of-the-art performance. [Expand]

0.00
Tuesday Poster Session
[1463]

Improving Multiple Pedestrian Tracking by Track Management and Occlusion Handling

Daniel Stadler, Jurgen Beyerer

Multi-pedestrian trackers perform well when targets are clearly visible making the association task quite easy. [Expand]

0.00
Wednesday Poster Session
[1464]

Gated Spatio-Temporal Attention-Guided Video Deblurring

Maitreya Suin, A. N. Rajagopalan

Video deblurring remains a challenging task due to the complexity of spatially and temporally varying blur. [Expand]

0.00
Wednesday Poster Session
[1465]

Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion

Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, Xi Li

RGB-D salient object detection (SOD) is usually formulated as a problem of classification or regression over two modalities, i.e., RGB and depth. [Expand]

0.00
Monday Poster Session
[1466]

Indoor Panorama Planar 3D Reconstruction via Divide and Conquer

Cheng Sun, Chi-Wei Hsiao, Ning-Hsu Wang, Min Sun, Hwann-Tzong Chen

Indoor panorama typically consists of human-made structures parallel or perpendicular to gravity. [Expand]

0.00
Thursday Poster Session
[1467]

Learning View Selection for 3D Scenes

Yifan Sun, Qixing Huang, Dun-Yu Hsiao, Li Guan, Gang Hua

Efficient 3D space sampling to represent an underlying 3D object/scene is essential for 3D vision, robotics, and beyond. [Expand]

0.00
Thursday Poster Session
[1468]

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai

Despite the significant progress made by deep learning in natural image matting, there has been so far no representative work on deep learning for video matting due to the inherent technical challenges in reasoning temporal domain and lack of large-scale video matting datasets. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1469]

Lesion-Aware Transformers for Diabetic Retinopathy Grading

Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang

Diabetic retinopathy (DR) is the leading cause of permanent blindness in the working-age population. [Expand]

0.00
Wednesday Poster Session
[1470]

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin Elsayed, Alex Bewley, Xiao Zhang, Cristian Sminchisescu, Dragomir Anguelov

The detection of 3D objects from LiDAR data is a critical component in most autonomous driving systems. [Expand]

0.00
Tuesday Poster Session
[1471]

Soteria: Provable Defense Against Privacy Leakage in Federated Learning From Representation Perspective

Jingwei Sun, Ang Li, Binghui Wang, Huanrui Yang, Hai Li, Yiran Chen

Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data. [Expand]

0.00
Wednesday Poster Session
[1472]

Semantic Image Matting

Yanan Sun, Chi-Keung Tang, Yu-Wing Tai

Natural image matting separates the foreground from background in fractional occupancy which can be caused by highly transparent objects, complex foreground (e.g., net or tree), and/or objects containing very fine details (e.g., hairs). [Expand]

0.00
0
0
0
Wednesday Poster Session
[1473]

Tuning IR-Cut Filter for Illumination-Aware Spectral Reconstruction From RGB

Bo Sun, Junchi Yan, Xiao Zhou, Yinqiang Zheng

To reconstruct spectral signals from multi-channel observations, in particular trichromatic RGBs, has recently emerged as a promising alternative to traditional scanning-based spectral imager. [Expand]

0.00
Monday Poster Session
[1474]

Uncertainty Reduction for Model Adaptation in Semantic Segmentation

Prabhu Teja S, Francois Fleuret

Traditional methods for Unsupervised Domain Adaptation (UDA) targeting semantic segmentation exploit information common to the source and target domains, using both labeled source data and unlabeled target data. [Expand]

0.00
Wednesday Poster Session
[1475]

ArtCoder: An End-to-End Method for Generating Scanning-Robust Stylized QR Codes

Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, Mingliang Xu, Tao Ren

Quick Response (QR) code is one of the most worldwide used two-dimensional codes. [Expand]

0.00
Monday Poster Session
[1476]

Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification

Fariborz Taherkhani, Ali Dabouei, Sobhan Soleymani, Jeremy Dawson, Nasser M. Nasrabadi

The goal is to use Wasserstein metric to provide pseudo labels for the unlabeled images to train a Convolutional Neural Networks (CNN) in a Semi-Supervised Learning (SSL) manner for the classification task. [Expand]

0.00
Thursday Poster Session
[1477]

Event-Based Bispectral Photometry Using Temporally Modulated Illumination

Tsuyoshi Takatani, Yuzuha Ito, Ayaka Ebisu, Yinqiang Zheng, Takahito Aoto

Analysis of bispectral difference plays a critical role in various applications that involve rays propagating in a light absorbing medium. [Expand]

0.00
Friday Poster Session
[1478]

Humble Teachers Teach Better Students for Semi-Supervised Object Detection

Yihe Tang, Weifeng Chen, Yijun Luo, Yuting Zhang

We propose a semi-supervised approach for contemporary object detectors following the teacher-student dual model framework. [Expand]

0.00
Tuesday Poster Session
[1479]

Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms

Yuxing Tang, Zhenjie Cao, Yanbo Zhang, Zhicheng Yang, Zongcheng Ji, Yiwei Wang, Mei Han, Jie Ma, Jing Xiao, Peng Chang

Mammographic mass detection is an integral part of a computer-aided diagnosis system. [Expand]

0.00
Tuesday Poster Session
[1480]

Mutual CRF-GNN for Few-Shot Learning

Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang

Graph-neural-networks (GNN) is a rising trend for few-shot learning. [Expand]

0.00
Monday Poster Session
[1481]

SKFAC: Training Neural Networks With Faster Kronecker-Factored Approximate Curvature

Zedong Tang, Fenlong Jiang, Maoguo Gong, Hao Li, Yue Wu, Fan Yu, Zidong Wang, Min Wang

The bottleneck of computation burden limits the widespread use of the 2nd order optimization algorithms for training deep neural networks. [Expand]

0.00
Thursday Poster Session
[1482]

OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations

Yang Tan, Yang Li, Shao-Lun Huang

Transfer learning across heterogeneous data distributions (a.k.a. [Expand]

0.00
Friday Poster Session
[1483]

Mirror3D: Depth Refinement for Mirror Surfaces

Jiaqi Tan, Weijie Lin, Angel X. Chang, Manolis Savva

Despite recent progress in depth sensing and 3D reconstruction, mirror surfaces are a significant source of errors. [Expand]

0.00
Friday Poster Session
[1484]

Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks?

Yapeng Tian, Chenliang Xu

In this paper, we propose to make a systematic study on machines' multisensory perception under attacks. [Expand]

0.00
Tuesday Poster Session
[1485]

Farewell to Mutual Information: Variational Distillation for Cross-Modal Person Re-Identification

Xudong Tian, Zhizhong Zhang, Shaohui Lin, Yanyun Qu, Yuan Xie, Lizhuang Ma

The Information Bottleneck (IB) provides an information theoretic principle for representation learning, by retaining all information relevant for predicting label while minimizing the redundancy. [Expand]

0.00
Monday Poster Session
[1486]

Probabilistic Selective Encryption of Convolutional Neural Networks for Hierarchical Services

Jinyu Tian, Jiantao Zhou, Jia Duan

Model protection is vital when deploying Convolutional Neural Networks (CNNs) for commercial services, due to the massive costs of training them. [Expand]

0.00
0
0
0
Monday Poster Session
[1487]

Post-Hoc Uncertainty Calibration for Domain Drift Scenarios

Christian Tomani, Sebastian Gruber, Muhammed Ebrar Erdem, Daniel Cremers, Florian Buettner

We address the problem of uncertainty calibration. [Expand]

0.00
Wednesday Poster Session
[1488]

FaceSec: A Fine-Grained Robustness Evaluation Framework for Face Recognition Systems

Liang Tong, Zhengzhang Chen, Jingchao Ni, Wei Cheng, Dongjin Song, Haifeng Chen, Yevgeniy Vorobeychik

We present FACESEC, a framework for fine-grained robustness evaluation of face recognition systems. [Expand]

0.00
Thursday Poster Session
[1489]

Automatic Correction of Internal Units in Generative Neural Networks

Ali Tousi, Haedong Jeong, Jiyeon Han, Hwanil Choi, Jaesik Choi

Generative Adversarial Networks (GANs) have shown satisfactory performance in synthetic image generation by devising complex network structure and adversarial training scheme. [Expand]

0.00
0
0
0
Wednesday Poster Session
[1490]

Explore Image Deblurring via Encoded Blur Kernel Space

Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai

This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space. [Expand]

0.00
Thursday Poster Session
[1491]

Reconsidering Representation Alignment for Multi-View Clustering

Daniel J. Trosten, Sigurd Lokse, Robert Jenssen, Michael Kampffmeyer

Aligning distributions of view representations is a core component of today's state of the art models for deep multi-view clustering. [Expand]

0.00
Monday Poster Session
[1492]

SSLayout360: Semi-Supervised Indoor Layout Estimation From 360° Panorama

Phi Vu Tran

Recent years have seen flourishing research on both semi-supervised learning and 3D room layout reconstruction. [Expand]

0.00
Thursday Poster Session
[1493]

ColorRL: Reinforced Coloring for End-to-End Instance Segmentation

Tran Anh Tuan, Nguyen Tuan Khoa, Tran Minh Quan, Won-Ki Jeong

Instance segmentation, the task of identifying and separating each individual object of interest in the image, is one of the actively studied research topics in computer vision. [Expand]

0.00
Friday Poster Session
[1494]

Time Lens: Event-Based Video Frame Interpolation

Stepan Tulyakov, Daniel Gehrig, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, Davide Scaramuzza

State-of-the-art frame interpolation methods generate intermediate frames by inferring object motions in the image from consecutive key-frames. [Expand]

0.00
Friday Poster Session
[1495]

Uncertainty-Aware Camera Pose Estimation From Points and Lines

Alexander Vakhitov, Luis Ferraz, Antonio Agudo, Francesc Moreno-Noguer

Perspective-n-Point-and-Line (PnPL) algorithms aim at fast, accurate, and robust camera localization with respect to a 3D model from 2D-3D feature correspondences, being a major part of modern robotic and AR/VR systems. [Expand]

0.00
Tuesday Poster Session
[1496]

Can We Characterize Tasks Without Labels or Features?

Bram Wallace, Ziyang Wu, Bharath Hariharan

The problem of expert model selection deals with choosing the appropriate pretrained network ("expert") to transfer to a target task. [Expand]

0.00
Monday Poster Session
[1497]

A Self-Boosting Framework for Automated Radiographic Report Generation

Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li

Automated radiographic report generation is a challenging task since it requires to generate paragraphs describing fine-grained visual differences of cases, especially for those between the diseased and the healthy. [Expand]

0.00
Monday Poster Session
[1498]

Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification

Peng Wang, Kai Han, Xiu-Shen Wei, Lei Zhang, Lei Wang

Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases. [Expand]

0.00
Monday Poster Session
[1499]

Deep Two-View Structure-From-Motion Revisited

Jianyuan Wang, Yiran Zhong, Yuchao Dai, Stan Birchfield, Kaihao Zhang, Nikolai Smolyanskiy, Hongdong Li

Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM. [Expand]

0.00
Wednesday Poster Session
[1500]

Domain-Specific Suppression for Adaptive Object Detection

Yu Wang, Rui Zhang, Shuo Zhang, Miao Li, Yangyang Xia, Xishan Zhang, Shaoli Liu

Domain adaptation methods face performance degradation in object detection, as the complexity of tasks require more about the transferability of the model. [Expand]

0.00
Wednesday Poster Session
[1501]

Dual Attention Suppression Attack: Generate Adversarial Camouflage in Physical World

Jiakai Wang, Aishan Liu, Zixin Yin, Shunchang Liu, Shiyu Tang, Xianglong Liu

Deep learning models are vulnerable to adversarial examples. [Expand]

0.00
Wednesday Poster Session
[1502]

EvDistill: Asynchronous Events To End-Task Learning via Bidirectional Reconstruction-Guided Cross-Modal Knowledge Distillation

Lin Wang, Yujeong Chae, Sung-Hoon Yoon, Tae-Kyun Kim, Kuk-Jin Yoon

Event cameras sense per-pixel intensity changes and produce asynchronous event streams with high dynamic range and less motion blur, showing advantages over the conventional cameras. [Expand]

0.00
Monday Poster Session
[1503]

FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation

Sijin Wang, Ziwei Yao, Ruiping Wang, Zhongqin Wu, Xilin Chen

Image caption evaluation is a crucial task, which involves the semantic perception and matching of image and text. [Expand]

0.00
Thursday Poster Session
[1504]

FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

Haiyan Wang, Jiahao Pang, Muhammad A. Lodhi, Yingli Tian, Dong Tian

Scene flow depicts the dynamics of a 3D scene, which is critical for various applications such as autonomous driving, robot navigation, AR/VR, etc. [Expand]

0.00
Thursday Poster Session
[1505]

From Semantic Categories to Fixations: A Novel Weakly-Supervised Visual-Auditory Saliency Detection Approach

Guotao Wang, Chenglizhao Chen, Deng-Ping Fan, Aimin Hao, Hong Qin

Thanks to the rapid advances in the deep learning techniques and the wide availability of large-scale training sets, the performances of video saliency detection models have been improving steadily and significantly. [Expand]

0.00
Thursday Poster Session
[1506]

Gradient-Based Algorithms for Machine Teaching

Pei Wang, Kabir Nagrecha, Nuno Vasconcelos

The problem of machine teaching is considered. [Expand]

0.00
Monday Poster Session
[1507]

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

Jing Wang, Jinhui Tang, Mingkun Yang, Xiang Bai, Jiebo Luo

OCR-based image captioning aims to automatically describe images based on all the visual entities (both visual objects and scene text) in images. [Expand]

0.00
Monday Poster Session
[1508]

Glancing at the Patch: Anomaly Localization With Global and Local Feature Comparison

Shenzhi Wang, Liwei Wu, Lei Cui, Yujun Shen

Anomaly localization, with the purpose to segment the anomalous regions within images, is challenging due to the large variety of anomaly types. [Expand]

0.00
Monday Poster Session
[1509]

LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering

Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai

Although significant progress has been made in room layout estimation, most methods aim to reduce the loss in the 2D pixel coordinate rather than exploiting the room structure in the 3D space. [Expand]

0.00
Thursday Poster Session
[1510]

Multi-Decoding Deraining Network and Quasi-Sparsity Based Training

Yinglong Wang, Chao Ma, Bing Zeng

Existing deep deraining models are mainly learned via directly minimizing the statistical differences between rainy images and rain-free ground truths. [Expand]

0.00
Thursday Poster Session
[1511]

PAUL: Procrustean Autoencoder for Unsupervised Lifting

Chaoyang Wang, Simon Lucey

Recent success in casting Non-rigid Structure from Motion (NRSfM) as an unsupervised deep learning problem has raised fundamental questions about what novelty in NRSfM prior could the deep learning offer. [Expand]

0.00
Monday Poster Session
[1512]

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang

Camera and LiDAR are two complementary sensors for 3D object detection in the autonomous driving context. [Expand]

0.00
Thursday Poster Session
[1513]

Pseudo Facial Generation With Extreme Poses for Face Recognition

Guoli Wang, Jiaqi Ma, Qian Zhang, Jiwen Lu, Jie Zhou

Face recognition has achieved a great success in recent years, it is still challenging to recognize those facial images with extreme poses. [Expand]

0.00
Monday Poster Session
[1514]

Representative Forgery Mining for Fake Face Detection

Chengrui Wang, Weihong Deng

Although vanilla Convolutional Neural Network (CNN) based detectors can achieve satisfactory performance on fake face detection, we observe that the detectors tend to seek forgeries on a limited region of face, which reveals that the detectors is short of understanding of forgery. [Expand]

0.00
Thursday Poster Session
[1515]

Rich Features for Perceptual Quality Assessment of UGC Videos

Yilin Wang, Junjie Ke, Hossein Talebi, Joong Gon Yim, Neil Birkbeck, Balu Adsumilli, Peyman Milanfar, Feng Yang

Video quality assessment for User Generated Content (UGC) is an important topic in both industry and academia. [Expand]

0.00
Thursday Poster Session
[1516]

RSG: A Simple but Effective Module for Learning Imbalanced Datasets

Jianfeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, Zhenghua Xu

Imbalanced datasets widely exist in practice and are a great challenge for training deep neural models with a good generalization on infrequent classes. [Expand]

0.00
Tuesday Poster Session
[1517]

Single-Stage Instance Shadow Detection With Bidirectional Relation Learning

Tianyu Wang, Xiaowei Hu, Chi-Wing Fu, Pheng-Ann Heng

Instance shadow detection aims to find shadow instances paired with the objects that cast the shadows. [Expand]

0.00
Monday Poster Session
[1518]

Structured Multi-Level Interaction Network for Video Moment Localization via Language Query

Hao Wang, Zheng-Jun Zha, Liang Li, Dong Liu, Jiebo Luo

We address the problem of localizing a specific moment described by a natural language query. [Expand]

0.00
Tuesday Poster Session
[1519]

Unsupervised Visual Attention and Invariance for Reinforcement Learning

Xudong Wang, Long Lian, Stella X. Yu

The vision-based reinforcement learning (RL) has achieved tremendous success. [Expand]

0.00
Tuesday Poster Session
[1520]

A Generalized Loss Function for Crowd Counting and Localization

Jia Wan, Ziquan Liu, Antoni B. Chan

Previous work shows that a better density map representation can improve the performance of crowd counting.

0.00
Monday Poster Session
[1521]

Self-Attention Based Text Knowledge Mining for Text Detection

Qi Wan, Haoqin Ji, Linlin Shen

Pre-trained models play an important role in deep learning based text detectors.

0.00
Tuesday Poster Session
[1522]

MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen

For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics.

0.00
Friday Poster Session
[1523]

Shallow Feature Matters for Weakly Supervised Object Localization

Jun Wei, Qin Wang, Zhen Li, Sheng Wang, S. Kevin Zhou, Shuguang Cui

Weakly supervised object localization (WSOL) aims to localize objects by only utilizing image-level labels.

0.00
Tuesday Poster Session
[1524]

Autoregressive Stylized Motion Synthesis With Generative Flow

Yu-Hui Wen, Zhipeng Yang, Hongbo Fu, Lin Gao, Yanan Sun, Yong-Jin Liu

Motion style transfer is an important problem in many computer graphics and computer vision applications, including human animation, games, and robotics.

0.00
Thursday Poster Session
[1525]

Holistic 3D Human and Scene Mesh Estimation From Single View Images

Zhenzhen Weng, Serena Yeung

The 3D world limits the human body pose and the human body pose conveys information about the surrounding objects.

0.00
Monday Poster Session
[1526]

Learning Progressive Point Embeddings for 3D Point Cloud Generation

Cheng Wen, Baosheng Yu, Dacheng Tao

Generative models for 3D point clouds are extremely important for scene/object reconstruction applications in autonomous driving and robotics.

0.00
Wednesday Poster Session
[1527]

PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths

Xin Wen, Peng Xiang, Zhizhong Han, Yan-Pei Cao, Pengfei Wan, Wen Zheng, Yu-Shen Liu

The task of point cloud completion aims to predict the missing part for an incomplete 3D shape.

0.00
Wednesday Poster Session
[1528]

Separating Skills and Concepts for Novel Visual Question Answering

Spencer Whitehead, Hui Wu, Heng Ji, Rogerio Feris, Kate Saenko

Generalization to out-of-distribution data has been a problem for Visual Question Answering (VQA) models.

0.00
Tuesday Poster Session
[1529]

Learning To Associate Every Segment for Video Panoptic Segmentation

Sanghyun Woo, Dahun Kim, Joon-Young Lee, In So Kweon

Temporal correspondence -- linking pixels or objects across frames -- is a fundamental supervisory signal for the video models.

0.00
Monday Poster Session
[1530]

Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics

Yanzhao Wu, Ling Liu, Zhongwei Xie, Ka-Ho Chow, Wenqi Wei

Neural network ensembles are gaining popularity by harnessing the complementary wisdom of multiple base models.

0.00
Friday Poster Session
[1531]

Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation

Tong Wu, Junshi Huang, Guangyu Gao, Xiaoming Wei, Xiaolin Wei, Xuan Luo, Chi Harold Liu

Weakly Supervised Semantic Segmentation (WSSS) with image-level annotation uses class activation maps from the classifier as pseudo-labels for semantic segmentation.

0.00
Friday Poster Session
[1532]

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification

Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji

Visible-infrared person re-identification (Re-ID) aims to match the pedestrian images of the same identity from different modalities.

0.00
Tuesday Poster Session
[1533]

Improving the Transferability of Adversarial Samples With Adversarial Transformations

Weibin Wu, Yuxin Su, Michael R. Lyu, Irwin King

Although deep neural networks (DNNs) have achieved tremendous performance in diverse vision challenges, they are surprisingly susceptible to adversarial examples, which are born of intentionally perturbing benign samples in a human-imperceptible fashion.

0.00
Wednesday Poster Session
[1534]

Progressive Unsupervised Learning for Visual Object Tracking

Qiangqiang Wu, Jia Wan, Antoni B. Chan

In this paper, we propose a progressive unsupervised learning (PUL) framework, which entirely removes the need for annotated training videos in visual tracking.

0.00
Tuesday Poster Session
[1535]

Towards Long-Form Video Understanding

Chao-Yuan Wu, Philipp Krahenbuhl

Our world offers a never-ending stream of visual stimuli, yet today's vision systems only accurately recognize patterns within a few seconds.

0.00
Monday Poster Session
[1536]

Improving Transferability of Adversarial Patches on Face Recognition With Generative Models

Zihao Xiao, Xianfeng Gao, Chilin Fu, Yinpeng Dong, Wei Gao, Xiaolu Zhang, Jun Zhou, Jun Zhu

Face recognition is greatly improved by deep convolutional neural networks (CNNs).

0.00
Thursday Poster Session
[1537]

Dynamic Weighted Learning for Unsupervised Domain Adaptation

Ni Xiao, Lei Zhang

Unsupervised domain adaptation (UDA) aims to improve the classification performance on an unlabeled target domain by leveraging information from a fully labeled source domain.

0.00
Thursday Poster Session
[1538]

Space-Time Distillation for Video Super-Resolution

Zeyu Xiao, Xueyang Fu, Jie Huang, Zhen Cheng, Zhiwei Xiong

Compact video super-resolution (VSR) networks can be easily deployed on resource-limited devices, e.g., smart-phones and wearable devices, but have considerable performance gaps compared with complicated VSR networks that require a large amount of computing resources.

0.00
Monday Poster Session
[1539]

You See What I Want You To See: Exploring Targeted Black-Box Transferability Attack for Hash-Based Image Retrieval Systems

Yanru Xiao, Cong Wang

With the large multimedia content online, deep hashing has become a popular method for efficient image retrieval and storage.

0.00
Monday Poster Session
[1540]

Scale-Aware Graph Neural Network for Few-Shot Semantic Segmentation

Guo-Sen Xie, Jie Liu, Huan Xiong, Ling Shao

Few-shot semantic segmentation (FSS) aims to segment unseen class objects given very few densely-annotated support images from the same class.

0.00
Tuesday Poster Session
[1541]

End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution

Wenzhu Xing, Karen Egiazarian

Image denoising, demosaicing and super-resolution are key problems of image restoration well studied in the recent decades.

0.00
Tuesday Poster Session
[1542]

Seeing in Extra Darkness Using a Deep-Red Flash

Jinhui Xiong, Jian Wang, Wolfgang Heidrich, Shree Nayar

We propose a new flash technique for low-light imaging, using deep-red light as an illuminating source.

0.00
Wednesday Poster Session
[1543]

Adaptive Rank Estimate in Robust Principal Component Analysis

Zhengqin Xu, Rui He, Shoulie Xie, Shiqian Wu

Robust principal component analysis (RPCA) and its variants have gained wide applications in computer vision.

0.00
Tuesday Poster Session
[1544]

Consistent Instance False Positive Improves Fairness in Face Recognition

Xingkun Xu, Yuge Huang, Pengcheng Shen, Shaoxin Li, Jilin Li, Feiyue Huang, Yong Li, Zhen Cui

Demographic bias is a significant challenge in practical face recognition systems.

0.00
0
0
0
Monday Poster Session
[1545]

Discrimination-Aware Mechanism for Fine-Grained Representation Learning

Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu

Recently, with the emergence of retrieval requirements for certain individual in the same superclass, e.g., birds, persons, cars, fine-grained recognition task has attracted a significant amount of attention from academia and industry.

0.00
Monday Poster Session
[1546]

Layer-Wise Searching for 1-Bit Detectors

Sheng Xu, Junhe Zhao, Jinhu Lu, Baochang Zhang, Shumin Han, David Doermann

1-bit detectors show great promise for resource-constrained embedded devices but often suffer from a significant performance gap compared with their real-valued counterparts.

0.00
Tuesday Poster Session
[1547]

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning Over Traffic Events

Li Xu, He Huang, Jun Liu

Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles.

0.00
Wednesday Poster Session
[1548]

Towards Accurate Text-Based Image Captioning With Content Diversity Exploration

Guanghui Xu, Shuaicheng Niu, Mingkui Tan, Yucheng Luo, Qing Du, Qi Wu

Text-based image captioning (TextCap) which aims to read and reason images with texts is crucial for a machine to understand a detailed and complex scene environment, considering that texts are omnipresent in daily life.

0.00
Thursday Poster Session
[1549]

A Circular-Structured Representation for Visual Emotion Distribution Learning

Jingyuan Yang, Jie Li, Leida Li, Xiumei Wang, Xinbo Gao

Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks.

0.00
Tuesday Poster Session
[1550]

Bottom-Up Shift and Reasoning for Referring Image Segmentation

Sibei Yang, Meng Xia, Guanbin Li, Hong-Yu Zhou, Yizhou Yu

Referring image segmentation aims to segment the referent that is the corresponding object or stuff referred by a natural language expression in an image.

0.00
Wednesday Poster Session
[1551]

Beyond Short Clips: End-to-End Video-Level Learning With Collaborative Memories

Xitong Yang, Haoqi Fan, Lorenzo Torresani, Larry S. Davis, Heng Wang

The standard way of training video models entails sampling at each iteration a single clip from a video and optimizing the clip prediction with respect to the video-level label.

0.00
Wednesday Poster Session
[1552]

CT-Net: Complementary Transfering Network for Garment Transfer With Arbitrary Geometric Changes

Fan Yang, Guosheng Lin

Garment transfer shows great potential in realistic applications with the goal of transfering outfits across different people images.

0.00
Wednesday Poster Session
[1553]

Defending Multimodal Fusion Models Against Single-Source Adversaries

Karren Yang, Wan-Yi Lin, Manash Barman, Filipe Condessa, Zico Kolter

Beyond achieving high performance across many vision tasks, multimodal models are expected to be robust to single-source faults due to the availability of redundant information between modalities.

0.00
Tuesday Poster Session
[1554]

Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes

Huiting Yang, Liangyu Chai, Qiang Wen, Shuang Zhao, Zixun Sun, Shengfeng He

Generative adversarial networks (GANs) learn to map noise latent vectors to high-fidelity image outputs.

0.00
Thursday Poster Session
[1555]

DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping

Yanchao Yang, Brian Lai, Stefano Soatto

We describe an unsupervised method to detect and segment portions of images of live scenes that, at some point in time, are seen moving as a coherent whole, which we refer to as objects.

0.00
0
0
0
Tuesday Poster Session
[1556]

End-to-End Rotation Averaging With Multi-Source Propagation

Luwei Yang, Heng Li, Jamal Ahmed Rahim, Zhaopeng Cui, Ping Tan

This paper presents an end-to-end neural network for multiple rotation averaging in SfM.

0.00
Thursday Poster Session
[1557]

Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods

Minghan Yang, Dong Xu, Hongyu Chen, Zaiwen Wen, Mengyun Chen

In this paper, we consider stochastic second-order methods for minimizing a finite summation of nonconvex functions.

0.00
0
0
0
Wednesday Poster Session
[1558]

Exploiting Semantic Embedding and Visual Feature for Facial Action Unit Detection

Huiyuan Yang, Lijun Yin, Yi Zhou, Jiuxiang Gu

Recent study on detecting facial action units (AU) has utilized auxiliary information (i.e., facial landmarks, relationship among AUs and expressions, web facial images, etc.), in order to improve the AU detection performance.

0.00
Wednesday Poster Session
[1559]

L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing

Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang

A deep facial attribute editing model strives to meet two requirements: (1) attribute correctness -- the target attribute should correctly appear on the edited face image; (2) irrelevance preservation -- any irrelevant information (e.g., identity) should not be changed after editing.

0.00
Tuesday Poster Session
[1560]

LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity

Cheng-Fu Yang, Wan-Cyuan Fan, Fu-En Yang, Yu-Chiang Frank Wang

When translating text inputs into layouts or images, existing works typically require explicit descriptions of each object in a scene, including their spatial information or the associated relationships.

0.00
Tuesday Poster Session
[1561]

Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking

Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua

Multi-person pose estimation and tracking serve as crucial steps for video understanding.

0.00
Wednesday Poster Session
[1562]

Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis

Karren Yang, Samuel Goldman, Wengong Jin, Alex X. Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler

In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development.

0.00
Tuesday Poster Session
[1563]

Partially View-Aligned Representation Learning With Noise-Robust Contrastive Loss

Mouxing Yang, Yunfan Li, Zhenyu Huang, Zitao Liu, Peng Hu, Xi Peng

In real-world applications, it is common that only a portion of data is aligned across views due to spatial, temporal, or spatiotemporal asynchronism, thus leading to the so-called Partially View-aligned Problem (PVP).

0.00
Monday Poster Session
[1564]

Progressively Complementary Network for Fisheye Image Rectification Using Appearance Flow

Shangrong Yang, Chunyu Lin, Kang Liao, Chunjie Zhang, Yao Zhao

Distortion rectification is often required for fisheye images.

0.00
Tuesday Poster Session
[1565]

Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-View Transformation

Weixiang Yang, Qi Li, Wenxi Liu, Yuanlong Yu, Yuexin Ma, Shengfeng He, Jia Pan

HD map reconstruction is crucial for autonomous driving.

0.00
Thursday Poster Session
[1566]

SelfSAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network

Xu Yang, Cheng Deng, Zhiyuan Dang, Kun Wei, Junchi Yan

Graph convolution networks (GCNs) are a powerful deep learning approach and have been successfully applied to representation learning on graphs in a variety of real-world applications.

0.00
Friday Poster Session
[1567]

StruMonoNet: Structure-Aware Monocular 3D Prediction

Zhenpei Yang, Li Erran Li, Qixing Huang

Monocular 3D prediction is one of the fundamental problems in 3D vision.

0.00
Wednesday Poster Session
[1568]

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection

Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu

Weakly supervised temporal action detection aims to localize temporal boundaries of actions and identify their categories simultaneously with only video-level category labels during training.

0.00
Monday Poster Session
[1569]

Anchor-Free Person Search

Yichao Yan, Jinpeng Li, Jie Qin, Song Bai, Shengcai Liao, Li Liu, Fan Zhu, Ling Shao

Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id).

0.00
Wednesday Poster Session
[1570]

Discrete-Continuous Action Space Policy Gradient-Based Attention for Image-Text Matching

Shiyang Yan, Li Yu, Yuan Xie

Image-text matching is an important multi-modal task with massive applications.

0.00
Wednesday Poster Session
[1571]

Online Learning of a Probabilistic and Adaptive Scene Representation

Zike Yan, Xin Wang, Hongbin Zha

Constructing and maintaining a consistent scene model on-the-fly is the core task for online spatial perception, interpretation, and action.

0.00
0
0
0
Thursday Poster Session
[1572]

Self-Aligned Video Deraining With Transmission-Depth Consistency

Wending Yan, Robby T. Tan, Wenhan Yang, Dengxin Dai

In this paper, we address the problems of rain streaks and rain accumulation removal in video, by developing a self-aligned network with transmission-depth consistency.

0.00
Thursday Poster Session
[1573]

Primitive Representation Learning for Scene Text Recognition

Ruijie Yan, Liangrui Peng, Shanyu Xiao, Gang Yao

Scene text recognition is a challenging task due to diverse variations of text instances in natural scene images.

0.00
Monday Poster Session
[1574]

Unsupervised Hyperbolic Metric Learning

Jiexi Yan, Lei Luo, Cheng Deng, Heng Huang

Learning feature embedding directly from images without any human supervision is a very challenging and essential task in the field of computer vision and machine learning.

0.00
Thursday Poster Session
[1575]

Jo-SRC: A Contrastive Approach for Combating Noisy Labels

Yazhou Yao, Zeren Sun, Chuanyi Zhang, Fumin Shen, Qi Wu, Jian Zhang, Zhenmin Tang

Due to the memorization effect in Deep Neural Networks (DNNs), training with noisy labels usually results in inferior model performance.

0.00
Tuesday Poster Session
[1576]

Joint-DetNAS: Upgrade Your Detector With NAS, Pruning and Dynamic Distillation

Lewei Yao, Renjie Pi, Hang Xu, Wei Zhang, Zhenguo Li, Tong Zhang

We propose Joint-DetNAS, a unified NAS framework for object detection, which integrates 3 key components: Neural Architecture Search, pruning, and Knowledge Distillation.

0.00
Wednesday Poster Session
[1577]

Adversarial Invariant Learning

Nanyang Ye, Jingxuan Tang, Huayu Deng, Xiao-Yun Zhou, Qianxiao Li, Zhenguo Li, Guang-Zhong Yang, Zhanxing Zhu

Though machine learning algorithms are able to achieve pattern recognition from the correlation between data and labels, the presence of spurious features in the data decreases the robustness of these learned relationships with respect to varied testing environments.

0.00
Thursday Poster Session
[1578]

Linguistic Structures As Weak Supervision for Visual Scene Graph Generation

Keren Ye, Adriana Kovashka

Prior work in scene graph generation requires categorical supervision at the level of triplets---subjects and objects, and predicates that relate them, either with or without bounding box information.

0.00
Wednesday Poster Session
[1579]

Iso-Points: Optimizing Neural Implicit Surfaces With Hybrid Representations

Wang Yifan, Shihao Wu, Cengiz Oztireli, Olga Sorkine-Hornung

Neural implicit functions have emerged as a powerful representation for surfaces in 3D.

0.00
Monday Poster Session
[1580]

Towards Efficient Tensor Decomposition-Based DNN Model Compression With Optimization Framework

Miao Yin, Yang Sui, Siyu Liao, Bo Yuan

Advanced tensor decomposition, such as Tensor train (TT) and Tensor ring (TR), has been widely studied for deep neural network (DNN) model compression, especially for recurrent neural networks (RNNs).

0.00
Wednesday Poster Session
[1581]

RaScaNet: Learning Tiny Models by Raster-Scanning Images

Jaehyoung Yoo, Dongwook Lee, Changyong Son, Sangil Jung, ByungIn Yoo, Changkyu Choi, Jae-Joon Han, Bohyung Han

Deploying deep convolutional neural networks on ultra-low power systems is challenging due to the extremely limited resources.

0.00
Thursday Poster Session
[1582]

Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing

Yuanyuan Yuan, Shuai Wang, Mingyue Jiang, Tsong Yueh Chen

Visual question answering (VQA) takes an image and a natural-language question as input and returns a natural-language answer.

0.00
Friday Poster Session
[1583]

Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner

Chong Yu

With the development of deep learning, neural networks tend to be deeper and larger to achieve good performance.

0.00
Tuesday Poster Session
[1584]

Adaptive Weighted Discriminator for Training Generative Adversarial Networks

Vasily Zadorozhnyy, Qiang Cheng, Qiang Ye

Generative adversarial network (GAN) has become one of the most important neural network models for classical unsupervised machine learning.

0.00
Tuesday Poster Session
[1585]

Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval

Yawen Zeng, Da Cao, Xiaochi Wei, Meng Liu, Zhou Zhao, Zheng Qin

Given an untrimmed video and a query sentence, cross-modal video moment retrieval aims to rank a video moment from pre-segmented video moment candidates that best matches the query sentence.

0.00
Monday Poster Session
[1586]

Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces

Alireza Zaeemzadeh, Niccolo Bisagno, Zeno Sambugaro, Nicola Conci, Nazanin Rahnavard, Mubarak Shah

The goal of out-of-distribution (OOD) detection is to handle the situations where the test samples are drawn from a different distribution than the training data.

0.00
0
0
0
Wednesday Poster Session
[1587]

Hyper-LifelongGAN: Scalable Lifelong Learning for Image Conditioned Generation

Mengyao Zhai, Lei Chen, Greg Mori

Deep neural networks are susceptible to catastrophic forgetting: when encountering a new task, they can only remember the new task and fail to preserve their ability to accomplish previously learned tasks.

0.00
Monday Poster Session
[1588]

ABMDRNet: Adaptive-Weighted Bi-Directional Modality Difference Reduction Network for RGB-T Semantic Segmentation

Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han

Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images.

0.00
Monday Poster Session
[1589]

Accurate Few-Shot Object Detection With Support-Query Mutual Guidance and Hybrid Loss

Lu Zhang, Shuigeng Zhou, Jihong Guan, Ji Zhang

Most object detection methods require huge amounts of annotated data and can detect only the categories that appear in the training set.

0.00
Thursday Poster Session
[1590]

Attention-Guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton

Xi Zhang, Xiaolin Wu

We propose a deep learning system for attention-guided dual-layer image compression (AGDL).

0.00
Thursday Poster Session
[1591]

Coarse-To-Fine Person Re-Identification With Auxiliary-Domain Classification and Second-Order Information Bottleneck

Anguo Zhang, Yueming Gao, Yuzhen Niu, Wenxi Liu, Yongcheng Zhou

Person re-identification (Re-ID) is to retrieve a particular person captured by different cameras, which is of great significance for security surveillance and pedestrian behavior analysis.

0.00
Monday Poster Session
[1592]

Confluent Vessel Trees With Accurate Bifurcations

Zhongwen Zhang, Dmitrii Marin, Maria Drangova, Yuri Boykov

We are interested in unsupervised reconstruction of complex near-capillary vasculature with thousands of bifurcations where supervision and learning are infeasible.

0.00
0
0
0
Wednesday Poster Session
[1593]

Cross-View Cross-Scene Multi-View Crowd Counting

Qi Zhang, Wei Lin, Antoni B. Chan

Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution.

0.00
Monday Poster Session
[1594]

Data-Free Knowledge Distillation for Image Super-Resolution

Yiman Zhang, Hanting Chen, Xinghao Chen, Yiping Deng, Chunjing Xu, Yunhe Wang

Convolutional network compression methods require training data for achieving acceptable results, but training data is routinely unavailable due to some privacy and transmission limitations.

0.00
Wednesday Poster Session
[1595]

Cross-View Gait Recognition With Deep Universal Linear Embeddings

Shaoxiong Zhang, Yunhong Wang, Annan Li

Gait is considered an attractive biometric identifier for its non-invasive and non-cooperative features compared with other biometric identifiers such as fingerprint and iris.

0.00
Wednesday Poster Session
[1596]

DeepACG: Co-Saliency Detection via Semantic-Aware Contrast Gromov-Wasserstein Distance

Kaihua Zhang, Mingliang Dong, Bo Liu, Xiao-Tong Yuan, Qingshan Liu

The objective of co-saliency detection is to segment the co-occurring salient objects in a group of images.

0.00
Thursday Poster Session
[1597]

DualGraph: A Graph-Based Method for Reasoning About Label Noise

HaiYang Zhang, XiMing Xing, Liang Liu

Unreliable labels derived from large-scale dataset prevent neural networks from fully exploring the data.

0.00
Wednesday Poster Session
[1598]

Explicit Knowledge Incorporation for Visual Reasoning

Yifeng Zhang, Ming Jiang, Qi Zhao

Existing explainable and explicit visual reasoning methods only perform reasoning based on visual evidence but do not take into account knowledge beyond what is in the visual scene.

0.00
Monday Poster Session
[1599]

Generating Manga From Illustrations via Mimicking Manga Creation Workflow

Lvmin Zhang, Xinrui Wang, Qingnan Fan, Yi Ji, Chunping Liu

We present a framework to generate manga from digital illustrations.

0.00
Tuesday Poster Session
[1600]

Hallucination Improves Few-Shot Object Detection

Weilin Zhang, Yu-Xiong Wang

Learning to detect novel objects with a few instances is challenging.

0.00
Thursday Poster Session
[1601]

iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression

Shifeng Zhang, Chen Zhang, Ning Kang, Zhenguo Li

It is nontrivial to store rapidly growing big data nowadays, which demands high-performance lossless compression techniques.

0.00
Monday Poster Session
[1602]

Flow-Guided One-Shot Talking Face Generation With a High-Resolution Audio-Visual Dataset

Zhimeng Zhang, Lincheng Li, Yu Ding, Changjie Fan

One-shot talking face generation should synthesize high visual quality facial videos with reasonable animations of expression and head pose, and just utilize arbitrary driving audio and arbitrary single face image as the source.

0.00
Tuesday Poster Session
[1603]

Keypoint-Graph-Driven Learning Framework for Object Pose Estimation

Shaobo Zhang, Wanqing Zhao, Ziyu Guan, Xianlin Peng, Jinye Peng

Many recent 6D pose estimation methods exploited object 3D models to generate synthetic images for training because labels come for free.

0.00
Monday Poster Session
[1604]

Learning by Watching

Jimuyang Zhang, Eshed Ohn-Bar

When in a new situation or geographical location, human drivers have an extraordinary ability to watch others and learn maneuvers that they themselves may have never performed.

0.00
Thursday Poster Session
[1605]

Learning a Facial Expression Embedding Disentangled From Identity

Wei Zhang, Xianpeng Ji, Keyu Chen, Yu Ding, Changjie Fan

The facial expression analysis requires a compact and identity-ignored expression representation.

0.00
Tuesday Poster Session
[1606]

Learning Temporal Consistency for Low Light Video Enhancement From Single Images

Fan Zhang, Yu Li, Shaodi You, Ying Fu

Single image low light enhancement is an important task and it has many practical applications.

0.00
Tuesday Poster Session
[1607]

Learning Tensor Low-Rank Prior for Hyperspectral Image Reconstruction

Shipeng Zhang, Lizhi Wang, Lei Zhang, Hua Huang

Snapshot hyperspectral imaging has been developed to capture the spectral information of dynamic scenes.

0.00
Thursday Poster Session
[1608]

Learning To Restore Hazy Video: A New Real-World Dataset and a New Method

Xinyi Zhang, Hang Dong, Jinshan Pan, Chao Zhu, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Fei Wang

Most of the existing deep learning-based dehazing methods are trained and evaluated on the image dehazing datasets, where the dehazed images are generated by only exploiting the information from the corresponding hazy ones.

0.00
Wednesday Poster Session
[1609]

Learning To Aggregate and Personalize 3D Face From In-the-Wild Photo Collection

Zhenyu Zhang, Yanhao Ge, Renwang Chen, Ying Tai, Yan Yan, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang

Non-prior face modeling aims to reconstruct 3D face only from images without shape assumptions.

0.00
Thursday Poster Session
[1610]

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos

Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen

We address the problem of localizing a specific moment from an untrimmed video by a language sentence query.

0.00
Thursday Poster Session
[1611]

Learning a Self-Expressive Network for Subspace Clustering

Shangzhi Zhang, Chong You, Rene Vidal, Chun-Guang Li

State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points.

0.00
Thursday Poster Session
[1612]

MR Image Super-Resolution With Squeeze and Excitation Reasoning Attention Network

Yulun Zhang, Kai Li, Kunpeng Li, Yun Fu

High-quality high-resolution (HR) magnetic resonance (MR) images afford more detailed information for reliable diagnosis and quantitative image analyses.

0.00
Thursday Poster Session
[1613]

Objects Are Different: Flexible Monocular 3D Object Detection

Yunpeng Zhang, Jiwen Lu, Jie Zhou

The precise localization of 3D objects from a single image without depth information is a highly challenging problem.

0.00
Tuesday Poster Session
[1614]

Posterior Promoted GAN With Distribution Discriminator for Unsupervised Image Synthesis

Xianchao Zhang, Ziyang Cheng, Xiaotong Zhang, Han Liu

Sufficient real information in generator is a critical point for the generation ability of GAN. [Expand]

0.00
Tuesday Poster Session
[1615]

Person Re-Identification Using Heterogeneous Local Graph Attention Networks

Zhong Zhang, Haijia Zhang, Shuang Liu

Recently, some methods have focused on learning local relation among parts of pedestrian images for person re-identification (Re-ID), as it offers powerful representation capabilities. [Expand]

0.00
Thursday Poster Session
[1616]

Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging

Feilong Zhang, Xianming Liu, Cheng Guo, Shiyi Lin, Junjun Jiang, Xiangyang Ji

Phase retrieval from intensity-only measurements plays a central role in many real-world imaging tasks. [Expand]

0.00
Wednesday Poster Session
[1617]

PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS With Relationship Recovery

Tianyi Zhang, Jie Lin, Peng Hu, Bin Zhao, Mohamed M. Sabry Aly

Non-maximum Suppression (NMS) is an essential post-processing step in modern convolutional neural networks for object detection. [Expand]

0.00
Friday Poster Session
[1618]

Refining Pseudo Labels With Clustering Consensus Over Generations for Unsupervised Object Re-Identification

Xiao Zhang, Yixiao Ge, Yu Qiao, Hongsheng Li

Unsupervised object re-identification targets at learning discriminative representations for object retrieval without any annotations. [Expand]

0.00
Tuesday Poster Session
[1619]

Robust Bayesian Neural Networks by Spectral Expectation Bound Regularization

Jiaru Zhang, Yang Hua, Zhengui Xue, Tao Song, Chengyu Zheng, Ruhui Ma, Haibing Guan

Bayesian neural networks have been widely used in many applications because of the distinctive probabilistic representation framework. [Expand]

0.00
Tuesday Poster Session
[1620]

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji

Recent progress on visual question answering has explored the merits of grid features for vision language tasks. [Expand]

0.00
Thursday Poster Session
[1621]

RPN Prototype Alignment for Domain Adaptive Object Detector

Yixin Zhang, Zilei Wang, Yushi Mao

Recent years have witnessed great progress in object detection. [Expand]

0.00
Thursday Poster Session
[1622]

Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

Bingfeng Zhang, Jimin Xiao, Terry Qin

Few-shot segmentation has been attracting a lot of attention due to its effectiveness to segment unseen object classes with a few annotated samples. [Expand]

0.00
Wednesday Poster Session
[1623]

Sparse Multi-Path Corrections in Fringe Projection Profilometry

Yu Zhang, Daniel Lau, David Wipf

Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns. [Expand]

0.00
Thursday Poster Session
[1624]

SRDAN: Scale-Aware and Range-Aware Domain Adaptation Network for Cross-Dataset 3D Object Detection

Weichen Zhang, Wen Li, Dong Xu

Geometric characteristic plays an important role in the representation of an object in 3D point clouds. [Expand]

0.00
Tuesday Poster Session
[1625]

TSGCNet: Discriminative Geometric Feature Learning With Two-Stream Graph Convolutional Network for 3D Dental Model Segmentation

Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen

The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning. [Expand]

0.00
Tuesday Poster Session
[1626]

Unbalanced Feature Transport for Exemplar-Based Image Translation

Fangneng Zhan, Yingchen Yu, Kaiwen Cui, Gongjie Zhang, Shijian Lu, Jianxiong Pan, Changgong Zhang, Feiying Ma, Xuansong Xie, Chunyan Miao

Despite the great success of GANs in images translation with different conditioned inputs such as semantic segmentation and edge map, generating high-fidelity images with reference styles from exemplars remains a grand challenge in conditional image-to-image translation. [Expand]

0.00
Thursday Poster Session
[1627]

3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

Tianyi Zhao, Kai Cao, Jiawen Yao, Isabella Nogues, Le Lu, Lingyun Huang, Jing Xiao, Zhaozheng Yin, Ling Zhang

The pancreatic disease taxonomy includes ten types of masses (tumors or cysts) [20, 8]. [Expand]

0.00
0
0
0
Thursday Poster Session
[1628]

Deep Lucas-Kanade Homography for Multimodal Image Alignment

Yiming Zhao, Xinming Huang, Ziming Zhang

Estimating homography to align image pairs captured by different sensors or image pairs with large appearance changes is an important and general challenge for many computer vision applications. [Expand]

0.00
Friday Poster Session
[1629]

Cascaded Prediction Network via Segment Tree for Temporal Video Grounding

Yang Zhao, Zhou Zhao, Zhu Zhang, Zhijie Lin

Temporal video grounding aims to localize the target segment which is semantically aligned with the given sentence in an untrimmed video. [Expand]

0.00
Tuesday Poster Session
[1630]

Distribution-Aware Adaptive Multi-Bit Quantization

Sijie Zhao, Tao Yue, Xuemei Hu

In this paper, we explore the compression of deep neural networks by quantizing the weights and activations into multi-bit binary networks (MBNs). [Expand]

0.00
Wednesday Poster Session
[1631]

Graph-Based High-Order Relation Discovery for Fine-Grained Recognition

Yifan Zhao, Ke Yan, Feiyue Huang, Jia Li

Fine-grained object recognition aims to learn effective features that can identify the subtle differences between visually similar objects. [Expand]

0.00
Thursday Poster Session
[1632]

PhD Learning: Learning With Pompeiu-Hausdorff Distances for Video-Based Vehicle Re-Identification

Jianan Zhao, Fengliang Qi, Guangyu Ren, Lin Xu

Vehicle re-identification (re-ID) is of great significance to urban operation, management, security and has gained more attention in recent years. [Expand]

0.00
Monday Poster Session
[1633]

Prior Based Human Completion

Zibo Zhao, Wen Liu, Yanyu Xu, Xianing Chen, Weixin Luo, Lei Jin, Bohui Zhu, Tong Liu, Binqiang Zhao, Shenghua Gao

We study a very challenging task, human image completion, which tries to recover the human body part with a reasonable human shape from the corrupted region. [Expand]

0.00
Wednesday Poster Session
[1634]

Spk2ImgNet: Learning To Reconstruct Dynamic Scene From Continuous Spike Stream

Jing Zhao, Ruiqin Xiong, Hangfan Liu, Jian Zhang, Tiejun Huang

The recently invented retina-inspired spike camera has shown great potential for capturing dynamic scenes. [Expand]

0.00
Thursday Poster Session
[1635]

Self-Generated Defocus Blur Detection via Dual Adversarial Discriminators

Wenda Zhao, Cai Shang, Huchuan Lu

Although existing fully-supervised defocus blur detection (DBD) models significantly improve performance, training such deep models requires abundant pixel-level manual annotation, which is highly time-consuming and error-prone. [Expand]

0.00
Tuesday Poster Session
[1636]

Deep Compositional Metric Learning

Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie Zhou

In this paper, we propose a deep compositional metric learning (DCML) framework for effective and generalizable similarity measurement between images. [Expand]

0.00
Wednesday Poster Session
[1637]

Deep Convolutional Dictionary Learning for Image Denoising

Hongyi Zheng, Hongwei Yong, Lei Zhang

Inspired by the great success of deep neural networks (DNNs), many unfolding methods have been proposed to integrate traditional image modeling techniques, such as dictionary learning (DicL) and sparse coding, into DNNs for image restoration. [Expand]

0.00
Monday Poster Session
[1638]

High-Speed Image Reconstruction Through Short-Term Plasticity for Spiking Cameras

Yajing Zheng, Lingxiao Zheng, Zhaofei Yu, Boxin Shi, Yonghong Tian, Tiejun Huang

Fovea, located in the centre of the retina, is specialized for high-acuity vision. [Expand]

0.00
Tuesday Poster Session
[1639]

Improving Multiple Object Tracking With Single Object Tracking

Linyu Zheng, Ming Tang, Yingying Chen, Guibo Zhu, Jinqiao Wang, Hanqing Lu

Despite considerable similarities between multiple object tracking (MOT) and single object tracking (SOT) tasks, modern MOT methods have not benefited from the development of SOT ones to achieve satisfactory performance. [Expand]

0.00
Monday Poster Session
[1640]

Patchwise Generative ConvNet: Training Energy-Based Models From a Single Natural Image for Internal Learning

Zilong Zheng, Jianwen Xie, Ping Li

Exploiting internal statistics of a single natural image has long been recognized as a significant research paradigm where the goal is to learn the distribution of patches within the image without relying on external training data. [Expand]

0.00
Tuesday Poster Session
[1641]

Ultra-High-Definition Image Dehazing via Multi-Guided Bilateral Learning

Zhuoran Zheng, Wenqi Ren, Xiaochun Cao, Xiaobin Hu, Tao Wang, Fenglong Song, Xiuyi Jia

During the last couple of years, convolutional neural networks (CNNs) have achieved significant success in the single image dehazing task. [Expand]

0.00
Friday Poster Session
[1642]

Unsupervised Disentanglement of Linear-Encoded Facial Semantics

Yutong Zheng, Yu-Kai Huang, Ran Tao, Zhiqiang Shen, Marios Savvides

We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision. [Expand]

0.00
Tuesday Poster Session
[1643]

Single Image Reflection Removal With Absorption Effect

Qian Zheng, Boxin Shi, Jinnan Chen, Xudong Jiang, Ling-Yu Duan, Alex C. Kot

In this paper, we consider the absorption effect for the problem of single image reflection removal. [Expand]

0.00
Thursday Poster Session
[1644]

Glance and Gaze: Inferring Action-Aware Points for One-Stage Human-Object Interaction Detection

Xubin Zhong, Xian Qu, Changxing Ding, Dacheng Tao

Modern human-object interaction (HOI) detection approaches can be divided into one-stage methods and two-stage ones. [Expand]

0.00
Thursday Poster Session
[1645]

DAP: Detection-Aware Pre-Training With Weak Supervision

Yuanyi Zhong, Jianfeng Wang, Lijuan Wang, Jian Peng, Yu-Xiong Wang, Lei Zhang

This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. [Expand]

0.00
Tuesday Poster Session
[1646]

Neighborhood Contrastive Learning for Novel Class Discovery

Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, Nicu Sebe

In this paper, we address Novel Class Discovery (NCD), the task of unveiling new classes in a set of unlabeled samples given a labeled dataset with known classes. [Expand]

0.00
Wednesday Poster Session
[1647]

Decoupled Dynamic Filter Networks

Jingkai Zhou, Varun Jampani, Zhixiong Pi, Qiong Liu, Ming-Hsuan Yang

Convolution is one of the basic building blocks of CNN architectures. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1648]

Effective Sparsification of Neural Networks With Global Sparsity Constraint

Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang

Weight pruning is an effective technique to reduce the model size and inference time for deep neural networks in real world deployments. [Expand]

0.00
Tuesday Poster Session
[1649]

Embracing Uncertainty: Decoupling and De-Bias for Robust Temporal Grounding

Hao Zhou, Chongyang Zhang, Yan Luo, Yanjun Chen, Chuanping Hu

Temporal grounding aims to localize temporal boundaries within untrimmed videos by language queries, but it faces the challenge of two types of inevitable human uncertainties: query uncertainty and label uncertainty. [Expand]

0.00
Wednesday Poster Session
[1650]

Face Forensics in the Wild

Tianfei Zhou, Wenguan Wang, Zhiyuan Liang, Jianbing Shen

On existing public benchmarks, face forgery detection techniques have achieved great success. [Expand]

0.00
0
0
0
Tuesday Poster Session
[1651]

Human De-Occlusion: Invisible Perception and Recovery for Humans

Qiang Zhou, Shiyin Wang, Yitong Wang, Zilong Huang, Xinggang Wang

In this paper, we tackle the problem of human de-occlusion which reasons about occluded segmentation masks and invisible appearance content of humans. [Expand]

0.00
Tuesday Poster Session
[1652]

Image De-Raining via Continual Learning

Man Zhou, Jie Xiao, Yifan Chang, Xueyang Fu, Aiping Liu, Jinshan Pan, Zheng-Jun Zha

While deep convolutional neural networks (CNNs) have achieved great success on image de-raining task, most existing methods can only learn fixed mapping rules between paired rainy/clean images on a single dataset. [Expand]

0.00
Tuesday Poster Session
[1653]

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

Long-term actions involve many important visual concepts, e.g., objects, motions, and sub-actions, and there are various relations among these concepts, which we call basic relations. [Expand]

0.00
Wednesday Poster Session
[1654]

Learning Placeholders for Open-Set Recognition

Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan

Traditional classifiers are deployed under a closed-set setting, with both training and test classes belonging to the same set. [Expand]

0.00
Tuesday Poster Session
[1655]

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach

Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jiang

Monocular 3D object detection is an important task in autonomous driving. [Expand]

0.00
Wednesday Poster Session
[1656]

Positive Sample Propagation Along the Audio-Visual Event Line

Jinxing Zhou, Liang Zheng, Yiran Zhong, Shijie Hao, Meng Wang

Visual and audio signals often coexist in natural environments, forming audio-visual events (AVEs). [Expand]

0.00
0
0
0
Wednesday Poster Session
[1657]

Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation

Tianfei Zhou, Jianwu Li, Xueyi Li, Ling Shao

This paper addresses the task of unsupervised video multi-object segmentation. [Expand]

0.00
Tuesday Poster Session
[1658]

Prototype Augmentation and Self-Supervision for Incremental Learning

Fei Zhu, Xu-Yao Zhang, Chuang Wang, Fei Yin, Cheng-Lin Liu

Despite the impressive performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning new tasks incrementally. [Expand]

0.00
Tuesday Poster Session
[1659]

Self-Promoted Prototype Refinement for Few-Shot Class-Incremental Learning

Kai Zhu, Yang Cao, Wei Zhai, Jie Cheng, Zheng-Jun Zha

Few-shot class-incremental learning is to recognize the new classes given few samples and not forget the old classes. [Expand]

0.00
Tuesday Poster Session
[1660]

Learning To Reconstruct High Speed and High Dynamic Range Videos From Events

Yunhao Zou, Yinqiang Zheng, Tsuyoshi Takatani, Ying Fu

Event cameras are novel sensors that capture the dynamics of a scene asynchronously. [Expand]

0.00
Monday Poster Session