Digital-Twin Dispatching for Urban Mobility via Spatio-Temporal Transformers and Offline Reinforcement Learning

Long Zhang; Ruiyan Ma; Peter Greg

doi:10.51903/jtie.v4i2.501

Authors

Long Zhang, Transportation Systems Engineering, Southern Methodist University, TX, USA
Ruiyan Ma, Software Engineering, UC Irvine, CA, USA
Peter Greg, Information Technology, Illinois Tech, IL, USA

Keywords:

Digital Twin, Urban Mobility, Ride-Hailing Dispatch, Demand Forecasting, Spatio-Temporal Transformer

Abstract

This study addresses ride-hailing dispatch and repositioning under data limitations by proposing an end-to-end digital-twin dispatching framework that integrates spatio-temporal demand forecasting with offline reinforcement learning. Using publicly available NYC FOIL ride-hailing data aggregated at the dispatching-base level, it evaluates whether coarse-grained data can still support reliable, reproducible decision-making pipelines. The methodology has two components: (i) multivariate time-series forecasting of next-day demand using baseline models, a temporal convolutional network (TCN), and a spatio-temporal transformer; and (ii) a digital-twin simulation combined with action-constrained offline reinforcement learning, including behavior cloning (BC) and Conservative Q-Learning (CQL), to optimize fleet repositioning decisions. Experimental results show that the TCN achieves the best forecasting accuracy on the test period, although the gains are driven largely by the dominant demand regions. In the control phase, conservative policies such as CQL perform stably with reduced repositioning costs but, owing to limited training data, do not significantly outperform behavior cloning. The findings indicate that, in coarse aggregate settings, operational improvements depend more on controlling policy sensitivity than on marginal forecasting gains. The study contributes a reproducible benchmark pipeline and highlights the importance of conservative control strategies, transparent assumptions, and sensitivity analysis when deploying AI-driven mobility systems based on limited or aggregated data.
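
To make the conservative control component more concrete, the following is a minimal sketch of the discrete-action CQL objective as described by Kumar et al. (2020): a standard Bellman error plus a penalty that lowers Q-values for actions not supported by the logged data. It is not the authors' implementation. The function name `cql_loss`, its argument names, and the default `alpha` are assumptions made for illustration, and the abstract does not state the state, action, or reward definitions used in the digital twin.

```python
import numpy as np


def cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Discrete-action CQL loss averaged over a batch (illustrative sketch).

    q_values:   (batch, n_actions) array of Q(s, a) for every action
    actions:    (batch,) integer actions recorded in the offline data
    td_targets: (batch,) Bellman targets, e.g. r + gamma * max_a' Q_target(s', a')
    alpha:      weight of the conservative penalty (hypothetical default)
    """
    q_values = np.asarray(q_values, dtype=float)
    actions = np.asarray(actions, dtype=int)
    td_targets = np.asarray(td_targets, dtype=float)

    # Q-value of the action actually taken in the logged data.
    q_data = q_values[np.arange(len(actions)), actions]

    # Numerically stable log-sum-exp over the action dimension.
    q_max = q_values.max(axis=1)
    lse = q_max + np.log(np.exp(q_values - q_max[:, None]).sum(axis=1))

    # Conservative term: non-negative, grows when unlogged actions
    # have high Q-values relative to the logged action.
    conservative = (lse - q_data).mean()

    # Standard squared Bellman error on the logged transitions.
    bellman = 0.5 * ((q_data - td_targets) ** 2).mean()

    return alpha * conservative + bellman
```

Setting `alpha=0.0` reduces the loss to the plain Bellman error, while larger values keep the learned policy closer to the logged behavior. This is one way to read the abstract's observation that, with limited data, CQL behaves much like behavior cloning.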
