NCKU · Dept. of Electrical Engineering · Tainan, Taiwan

Assistive Navigation for Visually Impaired People
via Multi-Sensor Fusion & Smart Camera Networks

Prof. Sok-Ian Sou  蘇淑茵  ·  National Cheng Kung University  ·  sisou@mail.ncku.edu.tw
ViTrack · Multimedia Tools & Applications · vol. 85, no. 70 · Feb. 2026
CamTrack · Neural Computing and Applications · accepted 2026
Demo Snapshots
ViTrack deviation and obstruction examples
Example ViTrack notifications: (a) Deviation — user's direction diverges from the tactile path; (b) Obstruction — a bicycle on the tactile paving triggers an alert.
CamTrack handover zone and camera coverage
Handover zone settings based on camera location in the CamTrack system. Camera A→B handover and the guiding process are shown in the demo video.

Student Achievements · 2023

Competition Recognition

Student teams supervised by Prof. Sou received awards at both national and international competitions.

2nd Place
eYs3D (鈺立微) AI Visual Recognition & Computing Track
28th University Information Application Innovation Competition (大專校院資訊應用服務創新競賽) · 2023
黃湙珵 · 林耕澤 · 吳炯霖
Honorary Mention
Viclusion: A Vision-assisted Tracking & Guiding System for Visually Impaired People
2023 IEEE ComSoc Student Competition · IEEE Communications Society
黃暄閔 · 黃湙珵 · 林耕澤 · 吳炯霖

Motivation

The Navigation Challenge

2.2 billion people worldwide live with vision impairment. Traditional aids — white canes and tactile paving — are easily blocked by obstacles, while single-sensor electronic travel aids suffer signal interference or occlusion. Our research fuses existing surveillance infrastructure with smartphone sensors to deliver reliable, continuous, and unobtrusive guidance with no extra wearable hardware.

2.2B
people with vision impairment worldwide (WHO)
500+
surveillance cameras on NCKU smart campus
0
extra wearable devices required for users

Publication 1 · MTAP 2026 · S.-I. Sou & S.-M. Huang · doi: 10.1007/s11042-026-21198-6

ViTrack — BLE & Camera Navigation System

ViTrack fuses BLE wireless fingerprinting with existing surveillance cameras for privacy-preserving, user-controlled outdoor navigation. A DNN/LSTM network localizes the user via RSSI signals from low-cost sniffers, then hands off to YOLOv7 for visual tracking, deviation detection, and obstacle avoidance — all without extra hardware beyond the user's smartphone.

System Architecture — Layered View
User Layer
📱
Smartphone
BLE advertising · Bluetooth 5.0 · 100 ms interval
🦯
Guidance Feedback
Audio / haptic navigation commands via mobile app
Wireless
Layer
📡
4 × BLE Sniffers
Raspberry Pi 3B+ · RSSI collection · 6 m apart
🧠
DNN / LSTM Localization
Min-max normalized RSSI · T=3 time steps · Softmax output → 8 reference points
📍
User Location Estimate
1 Hz update · 20 m × 5.7 m observation zone
Vision
Layer
📷
Surveillance Camera
NCKU network · 1920×1080 @ 20 fps
👁️
YOLOv7 Detection
Confidence ≥ 0.5 · Real-time person detection
🔗
Temporal Matching
Majority-vote across n frames: O* = argmax S(Oⱼ) — mismatch probability ~αⁿ
Safety
Layer
📐
Deviation Detection
Nested zones R_TP ⊂ R_OZ · Yaw angle θ_TP calibration · 3-tier alerts
🚧
Obstacle Avoidance
Left / front / right detection · Correction commands N = −1, 0, +1
🗺️
Path Guidance
Direction correction delivered via smartphone notifications
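The temporal-matching step above (majority vote over an n-frame window, O* = argmax S(Oⱼ)) can be sketched in a few lines; the window size, candidate IDs, and function name here are illustrative, not from the paper:

```python
from collections import Counter

def temporal_match(frame_candidates):
    """Majority-vote temporal matching over an n-frame window.

    frame_candidates: one candidate ID per frame, e.g. the YOLOv7
    detection closest to the BLE location estimate in that frame.
    Returns the ID with the highest vote count S(O_j); with per-frame
    mismatch probability alpha, the chance a wrong ID wins the vote
    falls roughly as alpha**n for an n-frame window.
    """
    votes = Counter(frame_candidates)
    best_id, _ = votes.most_common(1)[0]
    return best_id

# Example: 5-frame window where one frame mis-associates the user
print(temporal_match(["P3", "P3", "P7", "P3", "P3"]))  # prints P3
```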
Key Innovation: Data Augmentation for Sniffer Failure
  • Randomly zero out one sniffer's RSSI during training to simulate device outage
  • Augmented dataset: 4 groups, each simulating failure of one specific sniffer
  • DNN/LSTM with augmentation consistently outperforms baseline across all sniffer-failure scenarios
r_{i,aug} = [RSSI′_1, … , 0 (sniffer a), … , RSSI′_J]
P_mismatch ≈ α^n → longer window = exponentially better matching
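A minimal sketch of the augmentation above, assuming the normalized RSSI vectors are stored as a (samples × sniffers) array; the function name and random data are illustrative:

```python
import numpy as np

def augment_sniffer_failure(rssi_dataset, num_sniffers=4):
    """Simulate single-sniffer outages for training, as in ViTrack.

    rssi_dataset: array of shape (samples, num_sniffers) holding
    min-max normalized RSSI vectors. For each sniffer a, emit a copy
    of the dataset with column a zeroed:
    r_aug = [RSSI'_1, ..., 0 (sniffer a), ..., RSSI'_J].
    Returns the original data stacked with the 4 failure groups.
    """
    groups = [rssi_dataset]
    for a in range(num_sniffers):
        failed = rssi_dataset.copy()
        failed[:, a] = 0.0          # sniffer a reports nothing
        groups.append(failed)
    return np.concatenate(groups, axis=0)

# 100 samples × 4 sniffers → original + 4 failure groups = 500 samples
data = np.random.rand(100, 4)
print(augment_sniffer_failure(data).shape)  # (500, 4)
```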
Localization Accuracy — Model Comparison (MDE)

Mean Distance Error across DNN, 1D-CNN, and LSTM variants. Lower = better.

DNN
Highest MDE
1D-CNN (T=1)
High
1D-CNN (T=3)
Medium
DNN/LSTM ★
Best ✓

★ Heterogeneous device results are similar — outdoor RSSI variability dominates over hardware differences.

Experiment Deployment Parameters
Component | Specification | Notes
BLE Sniffers | 4 × Raspberry Pi 3B+ | Positioned 6 m apart at 0.7 m height on traffic cones
Reference Points | 8 RPs · 2 m spacing | 20 m × 5.7 m observation zone at NCKU Dept. EE entrance
Surveillance Camera | 1920 × 1080 @ 20 fps | Existing NCKU campus network — no new hardware added
BLE advertising | 100 ms · 0 dBm · BT 5.0 | nRF Connect app; various smartphone brands tested
LSTM time steps T | T = 3 | Best accuracy/latency trade-off for temporal RSSI features
HBOE body orientation | Absolute angle error | Used for guiding system performance evaluation
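The min-max RSSI normalization used before the DNN/LSTM (T = 3 time steps, 4 sniffers) can be sketched as below; the [−100, −30] dBm range and the sample readings are assumptions for illustration:

```python
import numpy as np

def normalize_rssi_window(rssi_window, rssi_min=-100.0, rssi_max=-30.0):
    """Min-max normalize a T×J window of raw RSSI readings into [0, 1]
    before feeding the DNN/LSTM localizer (T = 3 time steps, J = 4
    sniffers here). The dBm range is an illustrative assumption."""
    w = np.clip(np.asarray(rssi_window, dtype=float), rssi_min, rssi_max)
    return (w - rssi_min) / (rssi_max - rssi_min)

window = [[-62, -71, -88, -55],
          [-60, -73, -90, -57],
          [-61, -70, -86, -56]]      # 3 time steps × 4 sniffers (dBm)
x = normalize_rssi_window(window)
print(x.shape)  # (3, 4), all values in [0, 1]
```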

Publication 2 · NCA 2026 · S.-I. Sou, Y.-C. Huang & C.-L. Wu

CamTrack — Multi-Sensor Fusion & Predictive Handover

CamTrack advances ViTrack by adding IMU-derived trajectory prediction for proactive camera-to-camera handover. A Markov-chain probability model explicitly quantifies the cost–benefit trade-off between tracking reliability and camera resource usage, enabling informed system configuration rather than heuristic tuning.

System Architecture — Layered View
Sensor
Layer
📐
IMU (Phone)
Yaw angle · velocity · heading estimation
+
📡
BLE Sniffers
RSSI coarse-grained localization across campus
+
📷
Camera Network
500+ cameras · 69 NVR servers · 1920×1080@20fps
Handover
Layer
🧭
Yaw Mapping
IMU Yaw → T-sample directional verification · camera transition table
🔮
Predictive Camera Selection
Enters handover zone → activates next camera · deactivates current
📊
Markov Handover Model
P_Φ(T_H) = P₁ + P₂ · analytical cost–benefit framework
Tracking
Layer
👁️
YOLOv7
25–30 fps · RTX 3060 · confidence ≥ 0.5
🪪
Stage 1: ReID Matching
High-conf detections · Hungarian algorithm · L2 distance threshold τ=15
📈
Stage 2: SORT Tracking
Kalman filter · IoU = 0.05 · handles occluded / low-confidence detections
Output
Layer
🗺️
MapTalk Interface
Web UI for admin · camera map · live stream access via SSH tunnel
📱
Mobile Navigation
Real-time guiding notifications pushed to user's smartphone
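The Stage-1 ReID matching can be illustrated with a small assignment sketch. A brute-force optimal assignment stands in for a full Hungarian implementation (fine for a handful of tracks), and the 2-D feature vectors below are invented; only the L2 distance threshold τ = 15 comes from the system description:

```python
import numpy as np
from itertools import permutations

def match_detections(track_feats, det_feats, tau=15.0):
    """Assign detections to tracks by minimizing total L2 feature
    distance (Hungarian-style; brute force here for brevity, and
    assuming len(track_feats) <= len(det_feats)), then reject pairs
    whose distance exceeds tau. Returns (track_idx, det_idx) pairs."""
    cost = np.linalg.norm(track_feats[:, None, :] - det_feats[None, :, :], axis=2)
    n = len(track_feats)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(det_feats)), n):
        c = sum(cost[i, j] for i, j in zip(range(n), perm))
        if c < best_cost:
            best, best_cost = perm, c
    return [(i, j) for i, j in zip(range(n), best) if cost[i, j] <= tau]

tracks = np.array([[0.0, 0.0], [10.0, 10.0]])   # stored ReID features
dets = np.array([[9.0, 9.0], [1.0, 0.0]])       # new high-conf detections
print(match_detections(tracks, dets))  # [(0, 1), (1, 0)]
```

Detections left unmatched here (or below the confidence cut) fall through to the Stage-2 SORT/Kalman tracker.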
Tracking Success: 6 vs. 9 Active Cameras
Sparse · 6 cam
~48%
Sparse · 9 cam ★
~87%  +39%
Dense · 6 cam
~74%
Dense · 9 cam ★
~87%  +13%

Sparse zones gain more from additional cameras — analytical and simulation results closely matched.

Expiry Timer T_H Effect (Sparse Zone)
T_H = 10 min
Low
T_H = 30 min
Medium
T_H = 60 min
High
T_H ≥ 100 min
Saturated ✓

Gains from T_H saturate around 100 min, consistent with the ~30-min average idle time. Trade-off: a higher T_H keeps cameras active longer and raises resource cost.

Markov Chain Handover Probability Model
  • State 0 = user in an uncovered gap; state i = user in camera Cᵢ's coverage
  • P₁: immediate transition to an adjacent camera (direct success)
  • P₂: delayed re-entry within the expiry timer T_H (conditional success)
  • P_Φ(T_H) = P₁ + P₂ — closed-form analytical expression
  • Enables quantitative cost–benefit decisions for any campus deployment
P₁ = (1 − P_i,0) · Σ_j p_i,j / (Σ_j p_i,j + Σ_k p_i,k)
P₂ = P_i,0 · (1 − e^(−μ·T_H)) · Σ_j q_j / (Σ_j q_j + Σ_k q_k)
P_Φ(T_H) = P₁ + P₂
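Plugging numbers into the closed-form expression shows how P_Φ grows with the expiry timer T_H. All parameter values below are hypothetical, chosen only so that μ matches a ~30-min mean idle time:

```python
import math

def handover_success(p_i0, p_active, p_inactive, q_active, q_inactive,
                     mu, T_H):
    """Closed-form handover success P_Phi(T_H) = P1 + P2 from the
    Markov model above. p_active / p_inactive are the transition-
    probability sums toward active (j) vs. inactive (k) adjacent
    cameras; q_* are the re-entry analogues; mu is the re-entry rate
    from the uncovered gap (state 0). Values are illustrative."""
    P1 = (1 - p_i0) * p_active / (p_active + p_inactive)
    P2 = p_i0 * (1 - math.exp(-mu * T_H)) * q_active / (q_active + q_inactive)
    return P1 + P2

# Example: 20% chance of dropping into the gap, mu = 1/30 per min,
# timer T_H = 60 min
print(round(handover_success(0.2, 0.7, 0.1, 0.6, 0.2, 1 / 30, 60), 3))
```

Sweeping T_H in this sketch reproduces the saturation behavior reported above: the (1 − e^(−μ·T_H)) factor approaches 1 once T_H is several multiples of 1/μ.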

Lab Expertise

Research Topics

BLE Wireless Fingerprinting · LSTM Localization · Person Re-Identification · Multi-Camera Tracking · Sensor Fusion (IMU+BLE+Camera) · Predictive Handover · Markov Chain Modeling · YOLOv7 Object Detection · Assistive Navigation · Obstacle Avoidance · Deviation Detection · Smart Campus Systems