Qiong Chang

Assistant Professor
School of Computing, Institute of Science Tokyo
(formerly Tokyo Institute of Technology)

Research Interests

GPU Accelerators

Kernel designs for high-performance computing that leverage GPUs for large-scale data processing.

Hardware-Aware 3D Vision

Accelerating complex computer vision algorithms using FPGAs and GPUs for real-time applications.

Large-Scale Data Processing

Efficient algorithms and systems for large-scale graph and data processing on modern hardware.

About Me

I am an assistant professor at the Institute of Science Tokyo (formerly Tokyo Institute of Technology), School of Computing. I received my Ph.D. in Intelligent Systems Engineering from the University of Tsukuba. Before joining the Institute of Science Tokyo, I was a researcher at the National Institute of Advanced Industrial Science and Technology (AIST).

News

Feb 2026
Paper "Memory Efficient Point Cloud Registration Accelerator on FPGA" accepted to ICRA'26
Sep 2025
Paper "Fast Approximate Aggregation with Error Guarantee Using Encoded Bit-slice Indexing" accepted to iiWAS'25
Jun 2025
Paper "3D GNLM: Efficient 3D Non-Local Means Kernel with Nested Reuse Strategies for Embedded GPUs" accepted to TACO'25
May 2025
Paper "FSAC-IA: A Hierarchical Constructed SAC-IA Algorithm for Point Cloud Alignment Acceleration" accepted to ICIP'25
May 2025
Paper "Unified Schema-Driven Graph Polystore: Achieving Transparency in Multi-Model Integration and Migration" accepted to DEXA'25
Mar 2025
Paper "Faster than Fast: Accelerating Oriented FAST Feature Detection on Low-end Embedded GPUs" accepted to TECS'25
Jan 2025
Paper "Efficient Parallel Implementation of Non-Local Means Algorithm on GPU" accepted to GPGPU'25
Jan 2025
Paper "Accelerating Nearest Neighbor Search in 3D Point Cloud Registration on GPUs" accepted to TACO'25

Selected Research

TACO 2025

Accelerating Nearest Neighbor Search in 3D Point Cloud Registration on GPUs

Qiong Chang, Weimin Wang, Jun Miyazaki
A GPU-accelerated method that speeds up nearest neighbor search in 3D point cloud registration, improving real-time performance on high-density spatial data.
GPU, 3D Vision, Point Cloud
12× speedup
TECS 2025

Faster than Fast: Accelerating Oriented FAST Feature Detection on Low-end Embedded GPUs

Qiong Chang, Xinyuan Chen, Weimin Wang, Xiang Li, Jun Miyazaki
Two methods to accelerate the most time-consuming steps in Oriented FAST feature detection: FAST feature point detection and Harris corner detection.
Embedded GPU, Feature Detection
2.2× speedup
TACO 2025

3D GNLM: Efficient 3D Non-Local Means Kernel with Nested Reuse Strategies for Embedded GPUs

Xiang Li, Qiong Chang*, Yun Li, Jun Miyazaki
An efficient parallel implementation of the 3D Non-Local Means denoising algorithm on GPUs that speeds up high-resolution medical image processing.
GPU, Medical Imaging
5.5× speedup
TACO 2024

An Optimized GPU Implementation for GIST Descriptor

Xiang Li, Qiong Chang*, Aolong Zha, Shijie Chang, Yun Li, Jun Miyazaki
An optimized GPU implementation of the GIST descriptor that accelerates image feature extraction for large-scale visual processing tasks.
GPU, Feature Extraction
6.4× speedup
JPDC 2023

Multi-Directional Sobel Operator Kernel on GPUs

Qiong Chang, Xiang Li, Yun Li, Jun Miyazaki
A GPU-accelerated multi-directional Sobel operator kernel for efficient and parallel edge detection across multiple gradient orientations.
GPU, Edge Detection
11× speedup
IEEE TSMC 2024

TinyStereo: A Tiny Coarse-to-Fine Framework for Vision-based Depth Estimation on Embedded GPUs

Qiong Chang, Xin Xu, Aolong Zha, Yongqing Sun, Yun Li
A lightweight coarse-to-fine stereo matching framework optimized for embedded GPUs, enabling efficient and accurate depth estimation under constrained resources.
Embedded GPU, Stereo Matching, Depth Estimation
22 fps on TX2
JSA 2022

Efficient Stereo Matching on Embedded GPUs with Zero-Means Cross Correlation

Qiong Chang, Aolong Zha, Weimin Wang, Xin Liu, Masaki Onishi, Lei Lei, Tsutomu Maruyama
Fast ZNCC feature matching on embedded GPUs, offering an effective real-time alternative to the traditional Census transform in stereo matching.
Embedded GPU, Stereo Matching
speedup