Publications

Publications related to hardware-aware algorithms

Journal

J10
3D GNLM: Efficient 3D Non-Local Means Kernel with Nested Reuse Strategies for Embedded GPUs
Xiang Li, Qiong Chang*, Yun Li, Jun Miyazaki
ACM Transactions on Architecture and Code Optimization, 2025
J9
Faster than Fast: Accelerating Oriented FAST Feature Detection on Low-end Embedded GPUs
Qiong Chang, Xinyuan Chen, Xiang Li, Weimin Wang, Jun Miyazaki
ACM Transactions on Embedded Computing Systems, 2025
J8
Accelerating Nearest Neighbor Search in 3D Point Cloud Registration on GPUs
Qiong Chang, Weimin Wang, Jun Miyazaki
ACM Transactions on Architecture and Code Optimization, 2025
J7
An Optimized GPU Implementation for GIST Descriptor
Xiang Li, Qiong Chang*, Aolong Zha, Shijie Chang, Yun Li, Jun Miyazaki
ACM Transactions on Architecture and Code Optimization, 2024
J6
TinyStereo: A Tiny Coarse-to-Fine Framework for Vision-based Depth Estimation on Embedded GPUs
Qiong Chang, Xin Xu, Aolong Zha, Yongqing Sun, Yun Li
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2024
J5
High-precision Plant Height Measurement by Drone with RTK-GNSS and Single Camera for Real-time Processing
Yuta Matsuura, Heming Zhang, Kousuke Nakao, Qiong Chang, Firmansyah Iman, Shin Kawai, Yoshiki Yamaguchi, Tsutomu Maruyama, Hisayoshi Hayashi, Hajime Nobuhara
Scientific Reports, 2023
J4
Multi-Directional Sobel Operator Kernel on GPUs
Qiong Chang, Xiang Li, Yun Li, Jun Miyazaki
Journal of Parallel and Distributed Computing, 2023
J3
An Incremental SAT-Based Approach for Solving the Real-Time Taxi-Sharing Service Problem
Aolong Zha, Qiong Chang*, Itsuki Noda
Discrete Applied Mathematics, 2023
J2
Efficient Stereo Matching on Embedded GPUs with Zero-Means Cross Correlation
Qiong Chang, Aolong Zha, Weimin Wang, Xin Liu, Masaki Onishi, Lei Lei, Tsutomu Maruyama
Journal of Systems Architecture, 2022
J1
Real-Time Stereo Vision System: A Multi-Block Matching on GPU
Qiong Chang, Tsutomu Maruyama
IEEE Access, 2018

Conference

C15
Memory Efficient Point Cloud Registration Accelerator on FPGA
Qiong Chang, Dongqi Cai, Ran Dong, Junpei Zhong
IEEE International Conference on Robotics and Automation (ICRA), 2026
C14
FSAC-IA: A Hierarchical Constructed SAC-IA Algorithm for Point Cloud Alignment Acceleration
Ziyang Yu, Qiong Chang, Jun Miyazaki
IEEE International Conference on Image Processing (ICIP), 2025
C13
Efficient Parallel Implementation of Non-Local Means Algorithm on GPU
Xiang Li, Qiong Chang, Yun Li, Jun Miyazaki
17th Workshop on General Purpose Processing Using GPU (GPGPU 2025), 2025
C12
K-way In-place Merge by CPU-GPU Cooperative Processing
Shinya Miura, Qiong Chang, Jun Miyazaki
35th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2024
C11
Extension of Parallel Primitives and Their Applications to Large-Scale Data Processing
Masashi Nakano, Qiong Chang*, Jun Miyazaki
35th International Conference on Database and Expert Systems Applications (DEXA), 2024
C10
Acceleration of Neural Network Inference for Embedded GPU Systems
Kei Terakura, Qiong Chang, Jun Miyazaki
International Conference on Big Data and Smart Computing (BigComp), 2024
C9
GPU Acceleration of Multi-object Tracking with Motion Vector Interpolation and Affine Transformation
Yoshiki Kunimoto, Qiong Chang, Yoshiki Yamaguchi, Tsutomu Maruyama
34th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2023
C8
VAN-ICP: GPU-Accelerated Approximate Nearest Neighbor Search for ICP Registration via Voxel Dilation
Weimin Wang, Qiong Chang*
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
C7
StereoVAE: A Lightweight Stereo-Matching System Using Embedded GPUs
Qiong Chang, Xiang Li, Xin Xu, Xin Liu, Yun Li, Jun Miyazaki
IEEE International Conference on Robotics and Automation (ICRA), 2023
C6
Acceleration of Video Stabilization Using Embedded GPU
Yuzuki Mimura, Qiong Chang, Tsutomu Maruyama
IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2022
C5
Fast SQL/Row Pattern Recognition Query Processing Using Parallel Primitives on GPUs
Tsubasa Ohara, Qiong Chang*, Jun Miyazaki
32nd International Conference on Database and Expert Systems Applications (DEXA), 2021
C4
Z2-ZNCC: ZigZag Scanning-based Zero-means Normalized Cross Correlation for Fast and Accurate Stereo Matching on Embedded GPU
Qiong Chang, Aolong Zha, Weimin Wang, Masaki Onishi, Tsutomu Maruyama
IEEE 38th International Conference on Computer Design (ICCD), 2020
C3
A GPU Accelerator for Domain Transformation-Based Stereo Matching
Qiong Chang, Aolong Zha, Masaki Onishi, Tsutomu Maruyama
2nd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI), 2019
C2
Real-Time High-Quality Stereo Matching System on a GPU
Qiong Chang, Tsutomu Maruyama
IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 2018
C1
Fast Convolution Kernels on Pascal GPU with High Memory Efficiency
Qiong Chang
26th High Performance Computing Symposium (HPC), 2018 (Best Paper Award)

Full paper list on Google Scholar