LLMs for HPO | Leonard Jung

Abstract: A common challenge when training a neural network (NN) is choosing its hyperparameters, e.g., number of layers, learning rate, and dropout rate. We propose combining traditional hyperparameter optimization (HPO) algorithms with several LLM agents that handle three tasks: initializing the hyperparameter search space, guiding its exploration, and monitoring trials in real time.

We evaluated several open-source LLMs, including DeepSeek and Llama, in combination with the Optuna Bayesian HPO library.
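
To make the division of labor between the agents and the HPO algorithm concrete, here is a minimal sketch, not the implementation evaluated here. Everything in it is an assumption made for illustration: the search space, the function names, and the toy loss curve are hypothetical; the two `llm_*` functions are stubs where a real system would call an LLM; and plain random search stands in for the Bayesian sampler so that the example needs only the standard library. The exploration agent is not modeled separately; it would replace or steer `sample_config`.

```python
import math
import random

# Hypothetical search space (name -> (low, high)); not taken from the abstract.
SEARCH_SPACE = {"learning_rate": (1e-5, 1e-1), "dropout": (0.0, 0.5)}


def llm_initial_configs():
    """Stub for the initialization agent; a real system would query an LLM."""
    return [{"learning_rate": 1e-3, "dropout": 0.1}]


def llm_should_stop(losses, best_so_far):
    """Stub for the monitoring agent: after 3 epochs, stop a trial that is
    clearly worse than the best finished trial (the margin is arbitrary)."""
    return len(losses) >= 3 and losses[-1] > best_so_far + 1.0


def sample_config(rng):
    """Stand-in for the HPO sampler: random search, log-uniform in learning rate."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    d_lo, d_hi = SEARCH_SPACE["dropout"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
        "dropout": rng.uniform(d_lo, d_hi),
    }


def toy_training_curve(config, epochs=10):
    """Toy per-epoch loss replacing real NN training (best near lr=1e-3, dropout=0.2)."""
    base = (math.log10(config["learning_rate"]) + 3) ** 2
    base += (config["dropout"] - 0.2) ** 2
    for epoch in range(1, epochs + 1):
        yield base + 1.0 / epoch


def run_search(n_trials=20, seed=0):
    """Run LLM-seeded trials first, then sampled ones; return (best_loss, best_config)."""
    rng = random.Random(seed)
    queue = list(llm_initial_configs())
    best_loss, best_config = math.inf, None
    for _ in range(n_trials):
        config = queue.pop(0) if queue else sample_config(rng)
        losses = []
        for loss in toy_training_curve(config):
            losses.append(loss)
            if llm_should_stop(losses, best_loss):
                break  # the monitoring agent ended this trial early
        if min(losses) < best_loss:
            best_loss, best_config = min(losses), config
    return best_loss, best_config


if __name__ == "__main__":
    print(run_search())
```

With Optuna, the same roles would likely map onto its own mechanisms (for example, queuing the agent's starting points with `study.enqueue_trial` and ending a trial by raising `optuna.TrialPruned`), but that mapping is a guess rather than something stated in the abstract.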