Integrating Machine Learning and Optimisation to Solve the Capacitated Vehicle Routing Problem

Pedrozo, Daniel Antunes; GUPTA, Prateek; MEIRA, Jorge Augusto; Silva, Fabiano

doi:10.5220/0013165900003893

Paper published in a book (Scientific congresses, symposiums and conference proceedings)

2025 • In Schlosser, Rainer (Ed.) Proceedings of the 14th International Conference on Operations Research and Enterprise Systems

Peer reviewed

Permalink
https://hdl.handle.net/10993/64982

Files

Full Text

ICORES_2025_45_CR.pdf

Author postprint (185.2 kB)

All documents in ORBilu are protected by a user license.

Details

Keywords :

Graph Attention Network; Reinforcement Learning; Vehicle Routing Problem; Computer Science (miscellaneous); Management Science and Operations Research; Control and Optimization; Theoretical Computer Science

Abstract :

The Capacitated Vehicle Routing Problem (CVRP) is a fundamental combinatorial optimisation challenge in logistics. It aims to optimise routes so that a fleet of vehicles can supply customers' demands while minimising costs, which can be measured as total distance travelled or time spent. Traditional techniques (exact algorithms, heuristics and metaheuristics) have been thoroughly studied, but these methods often face limitations in scalability and in their use of computational resources when confronted with larger and more complex instances. Recently, Graph Neural Networks (GNNs) and Graph Attention Networks (GATs) have been used to tackle these more complex instances by capturing the relational structures inherent in graph-based information. Existing methods often rely on the REINFORCE approach with baselines like the Greedy Rollout, which uses a double-actor architecture whose computational overhead could hinder scalability. This paper addresses this problem by introducing a novel approach that uses a GAT network trained using reinforcement learning with the DiCE estimator. By using DiCE, our method eliminates the need for a double-actor structure, which helps lower the computational cost of training without sacrificing solution quality. Our experiments indicate that our model achieves solutions close to the optimal, with a notable decrease in training time and resource utilisation when compared with other techniques. This work provides a more efficient machine learning framework for the CVRP.
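
To make the objective described in the abstract concrete, the following is a minimal sketch of how a candidate CVRP solution might be scored: every customer is served exactly once, no route exceeds the vehicle capacity, and the cost is the total Euclidean distance travelled. It is an illustration only and is not taken from the paper. The function names `route_length` and `evaluate_cvrp`, the single depot at index 0, the Euclidean cost and the toy instance are all assumptions made for this example.

```python
import math


def route_length(route, coords, depot=0):
    """Length of one route: depot -> customers in the given order -> depot."""
    stops = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))


def evaluate_cvrp(routes, coords, demands, capacity, depot=0):
    """Return the total distance of a solution, or raise ValueError if infeasible.

    Assumed (hypothetical) setting: one depot, identical vehicles,
    Euclidean distances, and one list of customer indices per vehicle.
    """
    customers = [i for i in range(len(coords)) if i != depot]
    visited = sorted(c for route in routes for c in route)
    if visited != customers:
        raise ValueError("each customer must be visited exactly once")
    for route in routes:
        load = sum(demands[c] for c in route)
        if load > capacity:
            raise ValueError(f"route {route} carries {load}, capacity is {capacity}")
    return sum(route_length(route, coords, depot) for route in routes)


# Toy instance: depot at the origin and three customers.
coords = [(0, 0), (0, 3), (4, 3), (4, 0)]
demands = [0, 2, 2, 3]
capacity = 4

total = evaluate_cvrp([[1, 2], [3]], coords, demands, capacity)
print(total)  # 20.0
```

In this toy instance a single route through all three customers would be shorter, but it would carry a load of 7 against a capacity of 4, so `evaluate_cvrp` rejects it. A learned policy such as the one the abstract describes would be trained to produce feasible routes whose total distance is low.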

Disciplines :

Computer science

Author, co-author :

Pedrozo, Daniel Antunes ; Department of Informatics, Federal University of Parana, Brazil ; Unilu - University of Luxembourg

GUPTA, Prateek ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN

MEIRA, Jorge Augusto ; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > SEDAN

Silva, Fabiano ; Department of Informatics, Federal University of Parana, Brazil

External co-authors :

yes

Language :

English

Title :

Integrating Machine Learning and Optimisation to Solve the Capacitated Vehicle Routing Problem

Publication date :

25 February 2025

Event name :

Proceedings of the 14th International Conference on Operations Research and Enterprise Systems

Event place :

Porto, Portugal

Event date :

23 to 25 February 2025

Main work title :

Proceedings of the 14th International Conference on Operations Research and Enterprise Systems

Editor :

Schlosser, Rainer

Publisher :

Science and Technology Publications, Lda

ISBN/EAN :

9789897587320

Peer reviewed :

Peer reviewed

Available on ORBilu :

since 19 May 2025
