Professor · Researcher · Author

Rowel O. Atienza

Professor at the University of the Philippines Diliman's Electrical and Electronics Engineering Institute. PhD in Robotics from the Australian National University. Inventor of ViTSTR and PARSeq — deployed by NASA on the International Space Station.

  • 1,700+ Citations
  • h-index: 18
  • i10-index: 28
  • Top 2% Scientists Worldwide
Rowel O. Atienza
Professor, EEEI
University of the Philippines Diliman
Computer Vision · Robotics · AI / Deep Learning · Speech Processing · Scene Text Recognition
★ Top 2% Scientists Worldwide 2025

Background & Research

AI, Computer Vision, Robotics and beyond

Research Interests

  • Computer Vision & Scene Text Recognition
  • Robotics & Human-Robot Interaction
  • Speech Synthesis & Signal Processing
  • Embodied AI & Autonomous Agents
  • Deep Learning & Model Efficiency
  • Point Cloud & 3D Vision

Education

  • PhD in Robotics — Australian National University, 2008
  • MEng — National University of Singapore, 1997

Affiliations

  • Professor, EEEI — University of the Philippines Diliman
  • Ubiquitous Computing Laboratory
  • AI Graduate Program, UP Diliman

Highlights

Inventor of ViTSTR and PARSeq, state-of-the-art scene text recognition models integrated into Intel OpenVINO and PaddlePaddle, and deployed by NASA on the Astrobee robot aboard the International Space Station.


Recognized in the Stanford/Elsevier Top 2% Scientists Worldwide ranking (2025). Publishes and reviews at top venues: ECCV, ICRA, ICASSP, ICDAR, CVPR.

Notable Achievements

  • Best of ICCV 2025 — Sari Sandbox
  • NASA ISS deployment — PARSeq on Astrobee
  • Intel OpenVINO integration — ViTSTR
  • PaddlePaddle integration — PARSeq
  • Author of the Packt bestseller Advanced Deep Learning with TensorFlow 2 and Keras

Latest Work

Most recent papers — 2025 & 2026

ICCVW 2025 · Best of ICCV
Sari Sandbox: A Virtual Retail Store Environment for Embodied AI Agents
Janika Deborah Gajo, Gerarld Paul Merales, Jerome Escarcha, Brenden Ashley Molina, Gian Nartea, Emmanuel Maminta, Juan Carlos Roldan, Rowel Atienza
A high-fidelity, photorealistic 3D retail store simulation for benchmarking embodied agents against human performance in shopping tasks. Features over 250 interactive grocery items across three store configurations, controlled via an API. Supports both VR for human interaction and a VLM-powered embodied agent. Introduces SariBench, a dataset of annotated human demonstrations across varied task difficulties. Selected as one of the "Best of ICCV" from over 12,000 global submissions — the first comprehensive retail store simulation for embodied AI agent training.
CVPRW 2026
A Survey of Spatial Memory Representations for Efficient Robot Navigation
Ma. Madecheen S. Pangaliman, Steven S. Sison, Erwin P. Quilloy, Rowel Atienza
A comprehensive survey of spatial memory efficiency for vision-based robot navigation, examining 88 references spanning 52 systems from 1989–2025 — from occupancy grids to neural implicit representations. Introduces α = Mpeak/Mmap, the ratio of peak runtime memory to saved map size, exposing the gap between published map sizes and actual deployment cost. Profiling on an NVIDIA A100 GPU reveals α spans two orders of magnitude within neural methods alone (2.3 for Point-SLAM to 215 for NICE-SLAM). Proposes a standardized evaluation protocol and an α-aware budgeting algorithm for assessing deployment feasibility on embedded platforms (8–16 GB, <30 W). Accepted at the Women in Computer Vision (WiCV) Workshop at CVPR 2026.
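The α metric above reduces to simple arithmetic: multiply a map's stored size by its α to project peak runtime memory, then compare against the device budget. A minimal sketch, where the map sizes and the 8 GB budget are illustrative assumptions (only the α values 2.3 and 215 come from the abstract):

```python
def alpha(peak_runtime_mb: float, map_size_mb: float) -> float:
    """alpha = M_peak / M_map: ratio of peak runtime memory to saved map size."""
    return peak_runtime_mb / map_size_mb

def fits_budget(map_size_mb: float, a: float, budget_mb: float) -> bool:
    """alpha-aware feasibility check: project peak memory from the stored
    map size and compare against the embedded platform's memory budget."""
    return map_size_mb * a <= budget_mb

# Hypothetical 100 MB map on an 8 GB embedded platform, using the
# alpha values quoted for Point-SLAM (2.3) and NICE-SLAM (215):
point_slam_ok = fits_budget(map_size_mb=100.0, a=2.3, budget_mb=8 * 1024)   # True
nice_slam_ok = fits_budget(map_size_mb=100.0, a=215.0, budget_mb=8 * 1024)  # False
```

The two-orders-of-magnitude spread in α is exactly why published map sizes alone understate deployment cost.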
arXiv 2025
Interpretable Open-Vocabulary Referring Object Detection with Reverse Contrast Attention (RCA)
Drandreb Earl Juanico, Rowel Atienza, Jeffrey Kenneth Go
A plug-in method that enhances object localization in vision-language transformers without retraining. RCA reweights final-layer attention by suppressing extremes and amplifying mid-level activations to let semantically relevant but subdued tokens guide predictions. Evaluated on Open Vocabulary Referring Object Detection (OV-RefOD), RCA improves FitAP in 11 out of 15 open-source VLMs, with gains up to +26.6%.
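The core reweighting idea can be sketched in a few lines: clip extreme attention values toward mid-range quantiles so subdued but relevant tokens gain relative weight after renormalization. This is a toy illustration of the reverse-contrast principle, not the paper's implementation; the quantile thresholds are assumptions.

```python
import numpy as np

def reverse_contrast_attention(attn: np.ndarray, low_q: float = 0.25,
                               high_q: float = 0.75) -> np.ndarray:
    """Toy sketch: suppress extreme attention weights and lift the smallest
    by clipping to mid-range quantiles, then renormalize each row so
    mid-level tokens carry relatively more weight."""
    lo, hi = np.quantile(attn, [low_q, high_q])
    w = np.clip(attn, lo, hi)  # damp peaks, raise troughs
    return w / w.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
a = rng.random((1, 8))
a /= a.sum(axis=-1, keepdims=True)  # a normalized attention row
out = reverse_contrast_attention(a)
```

Because it only reweights final-layer attention, a method like this plugs into a frozen VLM without any retraining, matching the paper's plug-in framing.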

Selected Publications

Prioritizing top-cited works and recent 2025–2026 papers · Citation counts from Google Scholar

ICIP 2023 · Recent
Scene Text Recognition Models Explainability Using Local Features
Mark Vincent Ty, Rowel Atienza
We investigate the explainability of scene text recognition models using local feature attribution methods. The study sheds light on which image regions and features drive predictions in state-of-the-art STR models, providing interpretability tools for the research community.
TENCON 2023 · Recent
Fast Data Augmentation for Scene Text Recognition Using CUDA
David Angelo Piscasio, Rowel Atienza
We propose FastSTRAug, a CUDA-based library of 36 augmentation functions designed for STR, significantly faster than its CPU-based counterpart STRAug while maintaining the same augmentation diversity and quality improvements for scene text recognition models.
ICMI 2003 · Foundational
Intuitive Human-Robot Interaction Through Active 3D Gaze Tracking
Rowel Atienza, Alexander Zelinsky
View All Publications on Google Scholar ↗

Authored Books

Practical guides to advanced deep learning

📗

Advanced Deep Learning with TensorFlow 2 and Keras (2nd Ed.)

Updated for TensorFlow 2 with new chapters on object detection (SSD), semantic segmentation (FCN, PSPNet), and unsupervised learning using mutual information. Autoencoders, GANs, VAEs, Deep RL.

Packt Publishing · 2020
📘

Advanced Deep Learning with Keras (1st Ed.)

A comprehensive guide to advanced deep learning techniques including Autoencoders, GANs, VAEs, and Deep Reinforcement Learning that drive today's most impressive AI results.

Packt Publishing · 2018

Get in Touch

Research collaborations, speaking, consulting