Seeing in the Dark: Reconstructing 3D Human Pose Using NIR Single-Pixel Imaging
- Carlos Osorio
- 12 minutes ago
- 1 min read
Estimating a person’s 3D pose and body shape from a single image is a fundamental challenge in computer vision, especially when lighting is poor or the subject is partially occluded. Most traditional approaches rely on RGB images, which often fail in real-world scenarios such as nighttime environments or disaster zones. Our recent work takes a different approach: Single-Pixel Imaging (SPI) in the Near-Infrared (NIR) spectrum (850–1550 nm), combined with Time-of-Flight (TOF) ranging.

NIR light can penetrate clothing and is robust to changing illumination, which makes it well suited to human detection in low-visibility conditions. Instead of relying on a high-resolution sensor, our SPI system reconstructs 3D point clouds from a series of single-pixel measurements. These point clouds are then processed by two deep learning models:
- A Vision Transformer (ViT) aligns the reconstructed human poses with a predefined SMPL-X skeleton model.
- A self-supervised PointNet++ network estimates fine-grained attributes such as global rotation, translation, body shape, and pose.
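
To make the single-pixel idea concrete, here is a minimal sketch of how an image can be recovered from a sequence of scalar detector readings, plus the standard conversion from a TOF round trip to a distance. It assumes a full Hadamard pattern basis and ideal, noise-free measurements; the post does not say which patterns or solver the actual system uses, and the function names (`hadamard`, `measure`, `reconstruct`, `tof_depth`) and the toy scene are hypothetical.

```python
# Illustrative sketch only: pattern basis, names, and data are assumptions,
# not the pipeline described in the paper.


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Block rule: H_2n = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def measure(scene, patterns):
    """Simulate the single-pixel detector: one scalar reading per pattern.

    Real hardware projects 0/1 patterns; +1/-1 patterns are usually obtained
    by differencing two complementary exposures (not modelled here).
    """
    return [sum(p * s for p, s in zip(pattern, scene)) for pattern in patterns]


def reconstruct(measurements, patterns):
    """Recover the scene using the Hadamard property H^T H = n I."""
    n = len(patterns)
    return [
        sum(patterns[k][i] * measurements[k] for k in range(n)) / n
        for i in range(n)
    ]


def tof_depth(round_trip_seconds, c=299_792_458.0):
    """Convert a time-of-flight round trip into a one-way distance in metres."""
    return c * round_trip_seconds / 2.0


# A flattened 4x4 "scene" of reflectance values (made-up numbers).
scene = [0, 0, 1, 0,
         0, 2, 3, 0,
         0, 4, 5, 0,
         0, 0, 6, 0]

patterns = hadamard(len(scene))       # 16 patterns for 16 pixels
readings = measure(scene, patterns)   # 16 scalar detector readings
image = reconstruct(readings, patterns)
```

With a complete basis the reconstruction is exact, so `image` matches `scene`. Practical SPI systems typically use fewer patterns than pixels and a compressive or learned solver instead; pairing each recovered pixel with a `tof_depth` value is one way a point cloud could then be formed.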

Our lab experiments, which simulate nighttime environments, demonstrate the system's potential for real-world use, especially in rescue missions where conventional RGB-based vision often fails. Because it does not depend on ambient light and its architecture is tailored to low-SWaP (size, weight, and power) devices, NIR-SPI could become a core technology for search-and-rescue UAVs, surveillance, and nighttime human monitoring.
Osorio Quero, C.; Durini, D.; Martinez-Carranza, J. ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging. Appl. Sci. 2025, 15, 6138. https://doi.org/10.3390/app15116138