Cheng-Chun Hsu

About Me

I am a Ph.D. student in Computer Science at the University of Texas at Austin. My research lies at the intersection of robotics and computer vision.

Publications [Google Scholar]

SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation

Cheng-Chun Hsu, Bowen Wen, Jie Xu, Yashraj Narang, Xiaolong Wang, Yuke Zhu, Joydeep Biswas, Stan Birchfield

International Conference on Robotics and Automation (ICRA), 2025.

Project Page Paper Code

We enable robots to learn everyday tasks from human video demonstrations by using an object-centric representation. By predicting future object pose trajectories, SPOT achieves strong generalization capabilities with only eight human video demonstrations.

KinScene: Model-Based Mobile Manipulation of Articulated Scenes

Cheng-Chun Hsu, Ben Abbatematteo, Zhenyu Jiang, Yuke Zhu, Roberto Martín-Martín, Joydeep Biswas

Mobile Manipulation Workshop at ICRA, 2024.
Spotlight Presentation
Project Page Paper

We enable mobile manipulators to perform long-horizon tasks by autonomously exploring a scene and building scene-level models of its articulated objects. Our approach maps the scene, infers object properties, and plans sequential interactions for accurate real-world manipulation.

Ditto in the House: Building Articulated Models of Indoor Scenes through Interactive Perception

Cheng-Chun Hsu, Zhenyu Jiang, Yuke Zhu

International Conference on Robotics and Automation (ICRA), 2023.

Project Page Paper Code

We develop an interactive perception approach for robots to build indoor scene articulation models by efficiently discovering and characterizing articulated objects through coupled affordance prediction and articulation inference.

Ditto: Building Digital Twins of Articulated Objects from Interaction

Zhenyu Jiang, Cheng-Chun Hsu, Yuke Zhu

Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
Oral Presentation
Project Page Paper Code

We develop an approach that builds digital twins of articulated objects by learning their articulation models and 3D geometry from visual observations before and after interaction.

Every Pixel Matters: Center-aware Feature Alignment for Domain Adaptive Object Detector

Cheng-Chun Hsu, Yi-Hsuan Tsai, Yen-Yu Lin, and Ming-Hsuan Yang

European Conference on Computer Vision (ECCV), 2020.

Project Page Paper Code

We propose a domain adaptation framework for object detection that uses pixel-wise objectness and centerness to align features, focusing on foreground pixels for better cross-domain adaptation.

Weakly Supervised Instance Segmentation using the Bounding Box Tightness Prior

Cheng-Chun Hsu*, Kuang-Jui Hsu*, Chung-Chi Tsai, Yen-Yu Lin, and Yung-Yu Chuang

Conference on Neural Information Processing Systems (NeurIPS), 2019.

Paper Code

We propose a weakly supervised instance segmentation method that leverages Multiple Instance Learning (MIL) to address ambiguous foreground separation from bounding box annotations.

What Dress Fits Me Best? Fashion Recommendation on the Clothing Style for Personal Body Shape

Shintami Chusnul Hidayati, Cheng-Chun Hsu, Yu-Ting Chang, Kai-Lung Hua, Jianlong Fu, and Wen-Huang Cheng

ACM International Conference on Multimedia (MM), 2018.
Oral Presentation
Paper

We propose to learn clothing style and body shape compatibility from social big data, offering personalized outfit recommendations by factoring in a user's body shape.

Technical Reports

Center-context-gap Refinement for Weakly Supervised Instance Segmentation

Cheng-Chun Hsu*, Kuang-Jui Hsu*, Chiachen Ho, Yen-Yu Lin, and Yung-Yu Chuang

Technical report, 2019.

Paper

We propose a weakly supervised instance segmentation method using image-level labels, leveraging MIL, semantic segmentation, and a novel refinement module.