NCKU · Dept. of Electrical Engineering · Tainan, Taiwan

Assistive Navigation for Visually Impaired People
via Multi-Sensor Fusion & Smart Camera Networks

Prof. Sok-Ian Sou  蘇淑茵  ·  National Cheng Kung University  ·  sisou@mail.ncku.edu.tw
ViTrack · Multimedia Tools & Applications · vol. 85, no. 70 · Feb. 2026
CamTrack · Neural Computing and Applications · accepted 2026
Demo Snapshots
ViTrack deviation and obstruction examples
Example ViTrack notifications: (a) Deviation — user's direction diverges from the tactile path; (b) Obstruction — a bicycle on the tactile paving triggers an alert.
CamTrack handover zone and camera coverage
Handover zone settings based on camera location in the CamTrack system. Camera A→B handover and the guiding process are shown in the demo video.

Student Achievements · 2023

Competition Recognition

Student teams supervised by Prof. Sou received awards at both national and international competitions.

2nd Place
eYs3D (鈺立微) AI Visual Recognition & Computing Track
28th University Information Application Innovation Competition (大專校院資訊應用服務創新競賽) · 2023
黃湙珵 · 林耕澤 · 吳炯霖
Honorary Mention
Viclusion: A Vision-assisted Tracking & Guiding System for Visually Impaired People
2023 IEEE ComSoc Student Competition · IEEE Communications Society
黃暄閔 · 黃湙珵 · 林耕澤 · 吳炯霖

Motivation

The Navigation Challenge

2.2 billion people worldwide live with vision impairment. Traditional aids — white canes and tactile paving — are easily blocked by obstacles, while single-sensor electronic travel aids suffer signal interference or occlusion. Our research fuses existing surveillance infrastructure with smartphone sensors to deliver reliable, continuous, and unobtrusive guidance with no extra wearable hardware.

2.2B
people with vision impairment worldwide (WHO)
500+
surveillance cameras on NCKU smart campus
0
extra wearable devices required for users

Publication 1 · MTAP 2026 · S.-I. Sou & S.-M. Huang · doi: 10.1007/s11042-026-21198-6

ViTrack — BLE & Camera Navigation System

ViTrack fuses BLE wireless fingerprinting with existing surveillance cameras for privacy-preserving, user-controlled outdoor navigation. A DNN/LSTM network localizes the user via RSSI signals from low-cost sniffers, then hands off to YOLOv7 for visual tracking, deviation detection, and obstacle avoidance — all without extra hardware beyond the user's smartphone.

System Architecture — Layered View
User Layer
📱
Smartphone
BLE advertising · Bluetooth 5.0 · 100 ms interval
🦯
Guidance Feedback
Audio / haptic navigation commands via mobile app
Wireless
Layer
📡
4 × BLE Sniffers
Raspberry Pi 3B+ · RSSI collection · 6 m apart
🧠
DNN / LSTM Localization
Min-max normalized RSSI · T=3 time steps · Softmax output → 8 reference points
📍
User Location Estimate
1 Hz update · 20 m × 5.7 m observation zone
Vision
Layer
📷
Surveillance Camera
NCKU network · 1920×1080 @ 20 fps
👁️
YOLOv7 Detection
Confidence ≥ 0.5 · Real-time person detection
🔗
Temporal Matching
Majority-vote across n frames: O* = argmax S(Oⱼ) — mismatch probability ~αⁿ
Safety
Layer
📐
Deviation Detection
Nested zones R_TP ⊂ R_OZ · Yaw angle θ_TP calibration · 3-tier alerts
🚧
Obstacle Avoidance
Left / front / right detection · Correction commands N = −1, 0, +1
🗺️
Path Guidance
Direction correction delivered via smartphone notifications
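The temporal-matching step above (majority vote over an n-frame window, O* = argmax S(Oⱼ)) can be sketched in a few lines; the window size, candidate IDs, and function name here are illustrative, not from the paper:

```python
from collections import Counter

def temporal_match(frame_candidates):
    """Majority-vote temporal matching over an n-frame window.

    frame_candidates: one candidate ID per frame, e.g. the YOLOv7
    detection closest to the BLE location estimate in that frame.
    Returns the ID with the highest vote count S(O_j); with per-frame
    mismatch probability alpha, the chance a wrong ID wins the vote
    falls roughly as alpha**n for an n-frame window.
    """
    votes = Counter(frame_candidates)
    best_id, _ = votes.most_common(1)[0]
    return best_id

# Example: 5-frame window where one frame mis-associates the user
print(temporal_match(["P3", "P3", "P7", "P3", "P3"]))  # prints P3
```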
Key Innovation: Data Augmentation for Sniffer Failure
  • Randomly zero out one sniffer's RSSI during training to simulate device outage
  • Augmented dataset: 4 groups, each simulating failure of one specific sniffer
  • DNN/LSTM with augmentation consistently outperforms baseline across all sniffer-failure scenarios
r_{i,aug} = [RSSI′_1, … , 0 (sniffer a), … , RSSI′_J]
P_mismatch ≈ α^n → longer window = exponentially better matching
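A minimal sketch of the augmentation above, assuming the normalized RSSI vectors are stored as a (samples × sniffers) array; the function name and random data are illustrative:

```python
import numpy as np

def augment_sniffer_failure(rssi_dataset, num_sniffers=4):
    """Simulate single-sniffer outages for training, as in ViTrack.

    rssi_dataset: array of shape (samples, num_sniffers) holding
    min-max normalized RSSI vectors. For each sniffer a, emit a copy
    of the dataset with column a zeroed:
    r_aug = [RSSI'_1, ..., 0 (sniffer a), ..., RSSI'_J].
    Returns the original data stacked with the 4 failure groups.
    """
    groups = [rssi_dataset]
    for a in range(num_sniffers):
        failed = rssi_dataset.copy()
        failed[:, a] = 0.0          # sniffer a reports nothing
        groups.append(failed)
    return np.concatenate(groups, axis=0)

# 100 samples × 4 sniffers → original + 4 failure groups = 500 samples
data = np.random.rand(100, 4)
print(augment_sniffer_failure(data).shape)  # (500, 4)
```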
Localization Accuracy — Model Comparison (MDE)

Mean Distance Error across DNN, 1D-CNN, and LSTM variants. Lower = better.

DNN
Highest MDE
1D-CNN (T=1)
High
1D-CNN (T=3)
Medium
DNN/LSTM ★
Best ✓

★ Heterogeneous device results are similar — outdoor RSSI variability dominates over hardware differences.

Experiment Deployment Parameters
Component | Specification | Notes
BLE Sniffers | 4 × Raspberry Pi 3B+ | Positioned 6 m apart at 0.7 m height on traffic cones
Reference Points | 8 RPs · 2 m spacing | 20 m × 5.7 m observation zone at NCKU Dept. EE entrance
Surveillance Camera | 1920 × 1080 @ 20 fps | Existing NCKU campus network — no new hardware added
BLE advertising | 100 ms · 0 dBm · BT 5.0 | nRF Connect app; various smartphone brands tested
LSTM time steps T | T = 3 | Best accuracy/latency trade-off for temporal RSSI features
HBOE body orientation | Absolute angle error | Used for guiding system performance evaluation
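The min-max RSSI normalization used before the DNN/LSTM (T = 3 time steps, 4 sniffers) can be sketched as below; the [−100, −30] dBm range and the sample readings are assumptions for illustration:

```python
import numpy as np

def normalize_rssi_window(rssi_window, rssi_min=-100.0, rssi_max=-30.0):
    """Min-max normalize a T×J window of raw RSSI readings into [0, 1]
    before feeding the DNN/LSTM localizer (T = 3 time steps, J = 4
    sniffers here). The dBm range is an illustrative assumption."""
    w = np.clip(np.asarray(rssi_window, dtype=float), rssi_min, rssi_max)
    return (w - rssi_min) / (rssi_max - rssi_min)

window = [[-62, -71, -88, -55],
          [-60, -73, -90, -57],
          [-61, -70, -86, -56]]      # 3 time steps × 4 sniffers (dBm)
x = normalize_rssi_window(window)
print(x.shape)  # (3, 4), all values in [0, 1]
```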

Publication 2 · NCA 2026 · S.-I. Sou, Y.-C. Huang & C.-L. Wu

CamTrack — Multi-Sensor Fusion & Predictive Handover

CamTrack advances ViTrack by adding IMU-derived trajectory prediction for proactive camera-to-camera handover. A Markov-chain probability model explicitly quantifies the cost–benefit trade-off between tracking reliability and camera resource usage, enabling informed system configuration rather than heuristic tuning.

System Architecture — Layered View
Sensor
Layer
📐
IMU (Phone)
Yaw angle · velocity · heading estimation
+
📡
BLE Sniffers
RSSI coarse-grained localization across campus
+
📷
Camera Network
500+ cameras · 69 NVR servers · 1920×1080@20fps
Handover
Layer
🧭
Yaw Mapping
IMU Yaw → T-sample directional verification · camera transition table
🔮
Predictive Camera Selection
Enters handover zone → activates next camera · deactivates current
📊
Markov Handover Model
P_Φ(T_H) = P₁ + P₂ · analytical cost–benefit framework
Tracking
Layer
👁️
YOLOv7
25–30 fps · RTX 3060 · confidence ≥ 0.5
🪪
Stage 1: ReID Matching
High-conf detections · Hungarian algorithm · L2 distance threshold τ=15
📈
Stage 2: SORT Tracking
Kalman filter · IoU = 0.05 · handles occluded / low-confidence detections
Output
Layer
🗺️
MapTalk Interface
Web UI for admin · camera map · live stream access via SSH tunnel
📱
Mobile Navigation
Real-time guiding notifications pushed to user's smartphone
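The Stage-1 ReID matching can be illustrated with a small assignment sketch. A brute-force optimal assignment stands in for a full Hungarian implementation (fine for a handful of tracks), and the 2-D feature vectors below are invented; only the L2 distance threshold τ = 15 comes from the system description:

```python
import numpy as np
from itertools import permutations

def match_detections(track_feats, det_feats, tau=15.0):
    """Assign detections to tracks by minimizing total L2 feature
    distance (Hungarian-style; brute force here for brevity, and
    assuming len(track_feats) <= len(det_feats)), then reject pairs
    whose distance exceeds tau. Returns (track_idx, det_idx) pairs."""
    cost = np.linalg.norm(track_feats[:, None, :] - det_feats[None, :, :], axis=2)
    n = len(track_feats)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(det_feats)), n):
        c = sum(cost[i, j] for i, j in zip(range(n), perm))
        if c < best_cost:
            best, best_cost = perm, c
    return [(i, j) for i, j in zip(range(n), best) if cost[i, j] <= tau]

tracks = np.array([[0.0, 0.0], [10.0, 10.0]])   # stored ReID features
dets = np.array([[9.0, 9.0], [1.0, 0.0]])       # new high-conf detections
print(match_detections(tracks, dets))  # [(0, 1), (1, 0)]
```

Detections left unmatched here (or below the confidence cut) fall through to the Stage-2 SORT/Kalman tracker.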
Tracking Success: 6 vs. 9 Active Cameras
Sparse · 6 cam
~48%
Sparse · 9 cam ★
~87%  +39%
Dense · 6 cam
~74%
Dense · 9 cam ★
~87%  +13%

Sparse zones gain more from additional cameras — analytical and simulation results closely matched.

Expiry Timer T_H Effect (Sparse Zone)
T_H = 10 min
Low
T_H = 30 min
Medium
T_H = 60 min
High
T_H ≥ 100 min
Saturated ✓

Gains from T_H saturate around 100 min, consistent with the ~30-min average idle time. Trade-off: a higher T_H keeps cameras active longer and raises resource cost.

Markov Chain Handover Probability Model
  • State 0 = user in an uncovered gap; state i = user in camera Cᵢ's coverage
  • P₁: immediate transition to an adjacent camera (direct success)
  • P₂: delayed re-entry within the expiry timer T_H (conditional success)
  • P_Φ(T_H) = P₁ + P₂ — closed-form analytical expression
  • Enables quantitative cost–benefit decisions for any campus deployment
P₁ = (1 − P_i,0) · Σ_j p_i,j / (Σ_j p_i,j + Σ_k p_i,k)
P₂ = P_i,0 · (1 − e^(−μ·T_H)) · Σ_j q_j / (Σ_j q_j + Σ_k q_k)
P_Φ(T_H) = P₁ + P₂
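Plugging numbers into the closed-form expression shows how P_Φ grows with the expiry timer T_H. All parameter values below are hypothetical, chosen only so that μ matches a ~30-min mean idle time:

```python
import math

def handover_success(p_i0, p_active, p_inactive, q_active, q_inactive,
                     mu, T_H):
    """Closed-form handover success P_Phi(T_H) = P1 + P2 from the
    Markov model above. p_active / p_inactive are the transition-
    probability sums toward active (j) vs. inactive (k) adjacent
    cameras; q_* are the re-entry analogues; mu is the re-entry rate
    from the uncovered gap (state 0). Values are illustrative."""
    P1 = (1 - p_i0) * p_active / (p_active + p_inactive)
    P2 = p_i0 * (1 - math.exp(-mu * T_H)) * q_active / (q_active + q_inactive)
    return P1 + P2

# Example: 20% chance of dropping into the gap, mu = 1/30 per min,
# timer T_H = 60 min
print(round(handover_success(0.2, 0.7, 0.1, 0.6, 0.2, 1 / 30, 60), 3))
```

Sweeping T_H in this sketch reproduces the saturation behavior reported above: the (1 − e^(−μ·T_H)) factor approaches 1 once T_H is several multiples of 1/μ.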

Lab Expertise

Research Topics

BLE Wireless Fingerprinting · LSTM Localization · Person Re-Identification · Multi-Camera Tracking · Sensor Fusion (IMU+BLE+Camera) · Predictive Handover · Markov Chain Modeling · YOLOv7 Object Detection · Assistive Navigation · Obstacle Avoidance · Deviation Detection · Smart Campus Systems