Melih is the founder and PI of the SDU Adaptive Intelligence (ADIN) Lab, which pursues basic research on Bayesian inference and stochastic process modeling with deep neural nets, with applications to reinforcement learning and continual learning. Melih is proud to be an ELLIS Member, a society of top artificial intelligence researchers in Europe.
Melih is fascinated by both the popular and technical aspects of the natural sciences, especially physics and biology. His leisure time goes to wishful thinking about their applications to the development of environmentally friendly technologies.
Melih is in love with his wife Fatos and his daughter Sehnaz.
NeurIPS
Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
A. Akgul, M. Haussmann, and M. Kandemir
In Neural Information Processing Systems, 2024
Current approaches to model-based offline Reinforcement Learning (RL) often incorporate uncertainty-based reward penalization to address the distributional shift problem. While these approaches have achieved some success, we argue that this penalization introduces excessive conservatism, potentially resulting in suboptimal policies through underestimation. We identify the lack of a reliable uncertainty estimator capable of propagating uncertainties in the Bellman operator as an important cause of over-penalization. The common approach to calculating the penalty term relies on sampling-based uncertainty estimation, resulting in high variance. To address this challenge, we propose a novel method termed Moment Matching Offline Model-Based Policy Optimization (MOMBO). MOMBO learns a Q-function using moment matching, which allows us to deterministically propagate uncertainties through the Q-function. We evaluate MOMBO’s performance across various environments and demonstrate empirically that MOMBO is a more stable and sample-efficient approach.
@inproceedings{akgul2024,author={Akgul, A. and Haussmann, M. and Kandemir, M.},year={2024},title={Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning},booktitle={Neural Information Processing Systems},url={https://arxiv.org/abs/2406.04088},}
NeurIPS
Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures
We present improved algorithms with worst-case regret guarantees for the stochastic linear bandit problem. The widely used "optimism in the face of uncertainty" principle reduces a stochastic bandit problem to the construction of a confidence sequence for the unknown reward function. The performance of the resulting bandit algorithm depends on the size of the confidence sequence, with smaller confidence sets yielding better empirical performance and stronger regret guarantees. In this work, we use a novel tail bound for adaptive martingale mixtures to construct confidence sequences which are suitable for stochastic bandits. These confidence sequences allow for efficient action selection via convex programming. We prove that a linear bandit algorithm based on our confidence sequences is guaranteed to achieve competitive worst-case regret. We show that our confidence sequences are tighter than competitors, both empirically and theoretically. Finally, we demonstrate that our tighter confidence sequences give improved performance in several hyperparameter tuning tasks.
@inproceedings{flynn2023improved,author={Flynn, H. and Reeb, D. and Kandemir, M. and Peters, J.},year={2023},title={Improved Algorithms for Stochastic Linear Bandits Using Tail Bounds for Martingale Mixtures},booktitle={Neural Information Processing Systems},url={https://arxiv.org/abs/2309.14298},}
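The "optimism in the face of uncertainty" reduction described above can be sketched with a generic LinUCB-style rule: estimate the reward parameter by ridge regression, then pick the action whose optimistic upper bound is largest. This is a minimal illustration of the principle, not the paper's martingale-mixture confidence sequences; all names are hypothetical and the example is hard-coded to two dimensions for brevity.

```python
import math

def inv2x2(V):
    # Explicit inverse of a 2x2 matrix
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    return [[ V[1][1] / det, -V[0][1] / det],
            [-V[1][0] / det,  V[0][0] / det]]

def ucb_action(actions, V, b, beta):
    """Pick the action maximising mean estimate + beta * confidence width."""
    Vinv = inv2x2(V)
    theta = [Vinv[0][0] * b[0] + Vinv[0][1] * b[1],    # ridge estimate V^{-1} b
             Vinv[1][0] * b[0] + Vinv[1][1] * b[1]]
    best, best_score = None, -math.inf
    for i, x in enumerate(actions):
        mean = x[0] * theta[0] + x[1] * theta[1]
        # width of the confidence ellipsoid in direction x: sqrt(x^T V^{-1} x)
        quad = sum(x[r] * Vinv[r][c] * x[c] for r in range(2) for c in range(2))
        score = mean + beta * math.sqrt(quad)
        if score > best_score:
            best, best_score = i, score
    return best
```

A tighter confidence sequence corresponds to a smaller valid `beta`, which is exactly why smaller confidence sets translate into better regret.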
T-PAMI
PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison
H. Flynn, D. Reeb, M. Kandemir, and J. Peters
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
PAC-Bayes has recently re-emerged as an effective theory with which one can derive principled learning algorithms with tight performance guarantees. However, applications of PAC-Bayes to bandit problems are relatively rare, which is unfortunate: many decision-making problems in healthcare, finance, and the natural sciences can be modelled as bandit problems, and in many of these applications principled algorithms with strong performance guarantees would be highly valuable. This survey provides an overview of PAC-Bayes performance bounds for bandit problems and an experimental comparison of these bounds. Our experimental comparison reveals that the available PAC-Bayes upper bounds on the cumulative regret are loose, whereas the available PAC-Bayes lower bounds on the expected reward can be surprisingly tight. We found that an offline contextual bandit algorithm that learns a policy by optimising a PAC-Bayes bound was able to learn randomised neural network policies with competitive expected reward and non-vacuous performance guarantees.
@article{flyn2022pacbayes,title={PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison},author={Flynn, H. and Reeb, D. and Kandemir, M. and Peters, J.},journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},year={2023},url={https://arxiv.org/abs/2211.16110},}
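The flavour of a PAC-Bayes guarantee can be conveyed with the classic McAllester-style bound: the true risk of a randomised predictor is bounded by its empirical risk plus a slack term driven by the KL divergence between posterior and prior. This is a textbook bound shown for illustration, not one of the bandit-specific bounds the survey compares; the function name is hypothetical.

```python
import math

def mcallester_bound(emp_risk, kl, n, delta):
    """Upper bound on true risk, holding w.p. >= 1 - delta over n samples.

    emp_risk : empirical risk of the posterior in [0, 1]
    kl       : KL(posterior || prior)
    """
    slack = math.sqrt((kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n))
    return emp_risk + slack
```

The bound tightens as the sample size grows and loosens as the posterior drifts away from the prior, which is the trade-off a bound-optimising learner balances.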
ICLR
Evidential Turing Processes
M. Kandemir, A. Akgül, M. Haussmann, and G. Unal
In International Conference on Learning Representations, 2022
A probabilistic classifier with reliable predictive uncertainties i) fits successfully to the target domain data, ii) provides calibrated class probabilities in difficult regions of the target domain (e.g. class overlap), and iii) accurately identifies queries coming from outside the target domain and rejects them. We introduce an original combination of Evidential Deep Learning, Neural Processes, and Neural Turing Machines that provides all three essential properties for total uncertainty quantification. On three image classification benchmarks, we observe our method to consistently improve in-domain uncertainty quantification, out-of-domain detection, and robustness against input perturbations with a single model. Our unified solution delivers an implementation-friendly and computationally efficient recipe for safety clearance and provides intellectual economy to the investigation of the algorithmic roots of epistemic awareness in deep neural nets.
@inproceedings{kandemir2022evidential,title={Evidential Turing Processes},author={Kandemir, M. and Akgül, A. and Haussmann, M. and Unal, G.},booktitle={International Conference on Learning Representations},year={2022},url={https://openreview.net/forum?id=84NMXTHYe-},}
T-PAMI
A Deterministic Approximation to Neural SDEs
A. Look, M. Kandemir, B. Rakitsch, and J. Peters
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022
Neural Stochastic Differential Equations (NSDEs) model the drift and diffusion functions of a stochastic process as neural networks. While NSDEs are known to make accurate predictions, their uncertainty quantification properties have remained unexplored so far. We report the empirical finding that obtaining well-calibrated uncertainty estimates from NSDEs is computationally prohibitive. As a remedy, we develop a computationally affordable deterministic scheme that accurately approximates the transition kernel when the dynamics are governed by an NSDE. Our method introduces a bidimensional moment matching algorithm: vertical along the neural net layers and horizontal along the time direction, which benefits from an original combination of effective approximations. Our deterministic approximation of the transition kernel is applicable to both training and prediction. We observe in multiple experiments that the uncertainty calibration quality of our method can be matched by Monte Carlo sampling only at high computational cost. Thanks to the numerical stability of deterministic training, our method also improves prediction accuracy.
@article{look2022adeterministic,title={A Deterministic Approximation to Neural SDEs},author={Look, A. and Kandemir, M. and Rakitsch, B. and Peters, J.},journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},year={2022},url={https://arxiv.org/abs/2006.08973},publisher={IEEE},}
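The "horizontal" moment matching along the time direction can be sketched on the simplest case, an Ornstein-Uhlenbeck SDE dx = -theta*x dt + sigma dW, where the mean and variance recursions are exact. This is a toy illustration of propagating moments through time steps instead of sampling paths, not the paper's scheme, which handles neural drift and diffusion; the function name is hypothetical.

```python
def ou_moments(mu0, var0, theta, sigma, dt, steps):
    """Propagate mean and variance of dx = -theta*x dt + sigma dW over time."""
    mu, var = mu0, var0
    for _ in range(steps):
        mu = mu + (-theta * mu) * dt                         # drift shrinks the mean
        var = var + (-2.0 * theta * var + sigma * sigma) * dt  # diffusion injects variance
    return mu, var
```

A single deterministic sweep like this replaces averaging over many simulated trajectories; here the variance converges to the known stationary value sigma^2 / (2*theta).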
IJCAI
Deep Active Learning with Adaptive Acquisition
M. Haußmann, F.A. Hamprecht, and M. Kandemir
In International Joint Conference on Artificial Intelligence, 2019
@inproceedings{haussmann2019deep,title={Deep Active Learning with Adaptive Acquisition},author={Hau{\ss}mann, M. and Hamprecht, F.A. and Kandemir, M.},booktitle={International Joint Conference on Artificial Intelligence},year={2019},}
NeurIPS
Evidential Deep Learning to Quantify Classification Uncertainty
M. Sensoy, L. Kaplan, and M. Kandemir
In Neural Information Processing Systems, 2018
@inproceedings{sensoy2018evidential,title={Evidential Deep Learning to Quantify Classification Uncertainty},author={Sensoy, M. and Kaplan, L. and Kandemir, M.},booktitle={Neural Information Processing Systems},pages={3179--3189},year={2018},}
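The core idea of evidential classification can be sketched in a few lines: the network outputs non-negative per-class evidence, which parameterises a Dirichlet distribution, and the leftover mass quantifies total uncertainty. A minimal sketch of this subjective-logic bookkeeping, with a hypothetical function name; the paper additionally covers how to train the evidence outputs.

```python
def edl_uncertainty(evidence):
    """Class probabilities and uncertainty mass from per-class evidence."""
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]  # Dirichlet parameters: evidence + 1
    s = sum(alpha)                       # Dirichlet strength
    probs = [a / s for a in alpha]       # expected class probabilities
    u = k / s                            # uncertainty mass; 1 when no evidence
    return probs, u
```

With zero evidence the prediction is uniform and the uncertainty mass is 1; strong evidence for one class drives the uncertainty mass toward 0.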