Publications - Ishan Durugkar

DM2: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching
Published in AAAI 2023, 2023
Each agent independently estimates and minimizes the Wasserstein distance between its state visitation distribution and a target distribution, yielding decentralized coordination.
Recommended citation: Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone. (2023). "DM2: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching". AAAI, 2023. https://arxiv.org/abs/2206.00233
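
The decentralized recipe here, as the excerpt describes it, is that each agent augments the shared task reward with its own distribution-matching bonus, computed without reference to the other agents. A minimal Python sketch under that reading; the `critics` callables (e.g., AIM-style Wasserstein potentials as in the NeurIPS 2021 entry below) and the mixing coefficient `lam` are illustrative assumptions, not the authors' code:

    # Hypothetical sketch of per-agent reward mixing (not the paper's code).
    def mixed_rewards(env_reward, agent_states, critics, lam=0.5):
        """env_reward: shared scalar task reward for this timestep.
        agent_states: dict agent_id -> next state observed by that agent.
        critics: dict agent_id -> callable estimating a matching bonus
                 (assumed: a Wasserstein potential trained per agent).
        lam: trade-off between task reward and matching bonus (assumed)."""
        return {
            i: env_reward + lam * critics[i](s)  # decentralized: no cross-agent terms
            for i, s in agent_states.items()
        }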

Wasserstein Distance Maximizing Intrinsic Control
Published in Deep RL Workshop, NeurIPS 2021, 2021
Estimating and maximizing the Wasserstein distance from a start state to learn skills in a controlled Markov process.
Recommended citation: Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih. (2021). "Wasserstein Distance Maximizing Intrinsic Control". Deep RL Workshop, NeurIPS, 2021. https://arxiv.org/abs/2110.15331
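
Both this entry and the next lean on the dual form of the Wasserstein-1 distance, which turns distance estimation into training a 1-Lipschitz potential. A worked statement of that standard Kantorovich-Rubinstein duality (my notation, not the papers'):

    % W1 between the agent's state distribution \mu and a reference
    % distribution \nu, as a supremum over 1-Lipschitz potentials f:
    W_1(\mu, \nu) \;=\; \sup_{\|f\|_L \le 1} \;
        \mathbb{E}_{s \sim \mu}[f(s)] \;-\; \mathbb{E}_{s \sim \nu}[f(s)]
    % Skill discovery (this entry) trains the policy to maximize the
    % estimated W_1 from the start-state distribution; goal reaching
    % (AIM, next entry) minimizes the estimated W_1 to a target.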

Adversarial Intrinsic Motivation for Reinforcement Learning
Published in NeurIPS 2021, 2021
Estimating and minimizing the Wasserstein distance to an idealized target distribution to learn a goal-conditioned policy.
Recommended citation: Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone. (2021). "Adversarial Intrinsic Motivation for Reinforcement Learning". NeurIPS, 2021. https://arxiv.org/abs/2105.13345
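
A minimal PyTorch-style sketch of the adversarial loop implied by the duality above: a potential network is trained to separate target states from agent states under a soft Lipschitz penalty, and its increase along a transition serves as an intrinsic reward. Architecture, state dimension, and the gradient-penalty form are illustrative assumptions, not the paper's code:

    import torch
    import torch.nn as nn

    # Illustrative potential network; the architecture and the
    # 4-dimensional state are assumptions for the sketch.
    potential = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(potential.parameters(), lr=1e-3)

    def critic_step(agent_s, target_s, lip_coef=10.0):
        """One dual-critic update: raise f on target states, lower it on
        agent states, penalizing gradient norms above 1 (soft Lipschitz)."""
        gap = potential(target_s).mean() - potential(agent_s).mean()
        x = agent_s.clone().requires_grad_(True)
        g = torch.autograd.grad(potential(x).sum(), x, create_graph=True)[0]
        penalty = ((g.norm(dim=-1) - 1.0).clamp(min=0.0) ** 2).mean()
        loss = -gap + lip_coef * penalty
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    def intrinsic_reward(s, s_next):
        """Reward the agent for moving uphill on the potential,
        i.e., toward the target distribution."""
        with torch.no_grad():
            return (potential(s_next) - potential(s)).squeeze(-1)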

An Imitation from Observation Approach to Sim-to-Real Transfer
Published in NeurIPS 2020, 2020
Using imitation-from-observation techniques to speed up transfer of policies between environments with dynamics mismatch.
Recommended citation: Siddarth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone. (2020). "An Imitation from Observation Approach to Sim-to-Real Transfer". NeurIPS, 2020. https://arxiv.org/pdf/2008.01594.pdf
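
As I understand the approach, the simulator is "grounded" by learning an action transformation so that simulated transitions imitate real ones. A hypothetical gymnasium-style wrapper showing where such a transformation would sit; the `transform` model and its imitation-from-observation training are assumed and not shown:

    import gymnasium as gym

    class GroundedSim(gym.ActionWrapper):
        """Apply a learned action transformation g(s, a) before stepping
        the simulator, so sim dynamics better match the real environment.
        `transform` is a hypothetical model assumed to be trained against
        real-world transitions via imitation from observation."""

        def __init__(self, sim_env, transform):
            super().__init__(sim_env)
            self.transform = transform
            self._last_obs = None

        def reset(self, **kwargs):
            obs, info = self.env.reset(**kwargs)
            self._last_obs = obs
            return obs, info

        def action(self, act):
            # Grounded action: the sim executes g(s, a) instead of a.
            return self.transform(self._last_obs, act)

        def step(self, act):
            obs, r, term, trunc, info = super().step(act)
            self._last_obs = obs
            return obs, r, term, trunc, info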

Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning
Published in International Joint Conference on Artificial Intelligence, 2020
Studies multiagent scenarios in which agents share a task but have individual preferences regarding how to accomplish it.
Recommended citation: Ishan Durugkar, Elad Liebman, Peter Stone. (2020). "Balancing Individual Preferences and Shared Objectives in Multiagent Reinforcement Learning". International Joint Conference on Artificial Intelligence, 2020. https://idurugkar.github.com/files/IJCAI20-ishand.pdf

Reducing Sampling Error in Batch Temporal Difference Learning
Published in International Conference on Machine Learning, 2020
Correcting for the sampling error a finite batch of data introduces into temporal difference learning.
Recommended citation: Brahma S. Pavse, Ishan Durugkar, Josiah P. Hanna, Peter Stone. (2020). "Reducing Sampling Error in Batch Temporal Difference Learning". International Conference on Machine Learning, 2020. https://idurugkar.github.com/files/PSEC-TD.pdf
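
My reading of the sampling-error problem: in a finite batch, the empirical frequency of each action can differ from the policy's true probabilities, biasing batch TD(0). A toy sketch of the kind of correction this suggests, reweighting each update by the ratio of the policy probability to the batch's empirical action probability; treat it as an illustration, not the paper's exact estimator:

    from collections import Counter, defaultdict

    def batch_td0_corrected(batch, pi, alpha=0.1, gamma=0.99, sweeps=100):
        """batch: list of (s, a, r, s_next) tuples from a fixed dataset.
        pi: pi[s][a] = probability the evaluation policy takes a in s.
        Each TD update is weighted by pi(a|s) / pi_hat_D(a|s), where
        pi_hat_D is the empirical action distribution in the batch."""
        counts = defaultdict(Counter)
        for s, a, _, _ in batch:
            counts[s][a] += 1
        V = defaultdict(float)
        for _ in range(sweeps):
            for s, a, r, s_next in batch:
                pi_hat = counts[s][a] / sum(counts[s].values())
                w = pi[s][a] / pi_hat          # sampling-error correction
                V[s] += alpha * w * (r + gamma * V[s_next] - V[s])
        return dict(V)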

Multi-Preference Actor Critic
Published in arXiv, 2019
Using multiple auxiliary tasks as soft preferences on the policy while learning.
Recommended citation: Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine. (2019). "Multi-Preference Actor Critic". arXiv preprint, 2019. https://idurugkar.github.com/files/MultiPref_ActorCritic.pdf
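
I'd gloss "soft preferences" as auxiliary objectives that nudge rather than constrain the policy. A generic actor-critic surrogate under that reading; the weighted-sum aggregation in `prefs` is my assumption, not necessarily the paper's combination rule:

    import torch

    def multi_pref_policy_loss(logp, advantages, prefs):
        """logp: log pi(a|s) for sampled actions, shape [T].
        advantages: dict name -> advantage estimates under that task's
        reward, each shape [T] (main task plus auxiliary tasks).
        prefs: dict name -> soft preference weight (assumed convex)."""
        total = sum(prefs[k] * adv for k, adv in advantages.items())
        return -(logp * total.detach()).mean()  # REINFORCE-style surrogate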

Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning
Published in International Conference on Learning Representations, 2018
Using policy gradient methods to learn to navigate knowledge bases.
Recommended citation: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum. (2018). "Go for a walk and arrive at the answer: Reasoning over paths in knowledge bases using reinforcement learning". International Conference on Learning Representations, 2018. https://idurugkar.github.com/files/MINERVA_ICLR2018.pdf
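
The setup, as the title suggests: start at a query entity, walk edges of the knowledge graph for a few hops, and receive reward 1 for ending at the answer entity, training the walk policy with policy gradient. A tabular toy version; the graph, the softmax-over-edges policy, and all names are simplified stand-ins for the paper's learned agent:

    import math
    import random
    from collections import defaultdict

    def reinforce_kb_walk(graph, start, answer, hops=3, episodes=2000, lr=0.1):
        """graph: dict node -> list of (relation, next_node) edges.
        Tabular softmax policy over outgoing edges per node; terminal
        reward 1 if the walk ends at `answer`. Toy illustration only."""
        theta = defaultdict(float)               # (node, edge_idx) -> logit
        for _ in range(episodes):
            node, path = start, []
            for _ in range(hops):
                edges = graph[node]
                if not edges:                    # dead end: stop the walk
                    break
                logits = [theta[(node, i)] for i in range(len(edges))]
                z = sum(math.exp(l) for l in logits)
                probs = [math.exp(l) / z for l in logits]
                i = random.choices(range(len(edges)), probs)[0]
                path.append((node, i, probs))
                node = edges[i][1]
            R = 1.0 if node == answer else 0.0
            for n, i, probs in path:             # REINFORCE update
                for j in range(len(probs)):
                    grad = (1.0 if j == i else 0.0) - probs[j]
                    theta[(n, j)] += lr * R * grad
        return theta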

Generative Multi-Adversarial Networks
Published in International Conference on Learning Representations, 2017
Improving GAN training with multiple discriminators.
Recommended citation: Ishan Durugkar, Ian Gemp, Sridhar Mahadevan. (2017). "Generative Multi-Adversarial Networks". International Conference on Learning Representations, 2017. https://idurugkar.github.com/files/GMAN_ICLR2017.pdf
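
A sketch of the multi-discriminator generator objective: with N discriminators, the per-critic losses can be aggregated with a temperature-controlled softmax that interpolates between the mean over critics and the single harshest critic. Network definitions are omitted and names are illustrative:

    import torch

    def gman_generator_loss(d_outputs, lam=1.0):
        """d_outputs: list of N tensors, D_i(G(z)) in (0, 1) for a batch.
        Per-discriminator non-saturating losses are combined with a
        softmax over lam * loss: lam -> 0 recovers the mean over
        critics, lam -> inf approaches the harshest discriminator."""
        losses = torch.stack([-torch.log(d + 1e-8).mean() for d in d_outputs])
        weights = torch.softmax(lam * losses, dim=0)
        return (weights * losses).sum()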

Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing
Published in Twenty-Ninth IAAI Conference, 2017
Predicting nonstationarity in changing real-world scenarios for off-policy evaluation.
Recommended citation: Philip S. Thomas, Georgios Theocharous, Mohammad Ghavamzadeh, Ishan Durugkar, Emma Brunskill. (2017). "Predictive off-policy policy evaluation for nonstationary decision problems, with applications to digital marketing". Twenty-Ninth IAAI Conference, 2017. https://idurugkar.github.com/files/predictive_off_policy_evaluation_IAAI2017.pdf
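
The gist, per the title: rather than assume a stationary environment, compute an off-policy estimate of performance for each past episode and extrapolate the trend forward in time. A toy illustration using per-episode importance sampling and an ordinary least-squares trend; the paper's actual estimator and confidence machinery are more involved:

    import numpy as np

    def predicted_future_value(episodes, pi_e, pi_b, horizon_ahead=10):
        """episodes: time-ordered list; each episode is a list of
        (s, a, r) steps generated by behavior policy pi_b.
        Per-episode importance-sampling estimate of pi_e's value,
        then a linear trend extrapolated `horizon_ahead` episodes out."""
        vals = []
        for ep in episodes:
            rho = np.prod([pi_e[s][a] / pi_b[s][a] for s, a, _ in ep])
            vals.append(rho * sum(r for _, _, r in ep))
        t = np.arange(len(vals))
        slope, intercept = np.polyfit(t, vals, 1)  # OLS trend over time
        return intercept + slope * (len(vals) - 1 + horizon_ahead)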

Inverting Variational Autoencoders for Improved Generative Accuracy
Published in arXiv preprint arXiv:1608.05983, 2016
Mapping VAE decoder samples back to latent space for improved accuracy.
Recommended citation: Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan. (2016). "Inverting Variational Autoencoders for Improved Generative Accuracy". arXiv preprint arXiv:1608.05983. https://idurugkar.github.com/files/inverting_variational_ae.pdf
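
One way to read "inverting" here: freeze the trained decoder, then fit a map from decoder outputs back to the latents that generated them, so generated samples can be refined through latent space. A minimal PyTorch-style sketch under that reading, not necessarily the paper's exact training setup:

    import torch
    import torch.nn as nn

    def fit_inverse(decoder, inverse, latent_dim, steps=1000, batch=128):
        """decoder: frozen, maps z -> x. inverse: trainable, maps x -> z.
        Train `inverse` on self-generated pairs (z, decoder(z)) so that
        inverse(decoder(z)) recovers z."""
        opt = torch.optim.Adam(inverse.parameters(), lr=1e-3)
        decoder.eval()
        for _ in range(steps):
            z = torch.randn(batch, latent_dim)   # sample from the prior
            with torch.no_grad():
                x = decoder(z)
            loss = nn.functional.mse_loss(inverse(x), z)
            opt.zero_grad(); loss.backward(); opt.step()
        return inverse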

Deep reinforcement learning with macro-actions
Published in arXiv preprint arXiv:1606.04615, 2016
Extending DQN with temporally extended macro-actions.
Recommended citation: Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan. (2016). "Deep reinforcement learning with macro-actions". arXiv preprint arXiv:1606.04615. https://idurugkar.github.com/files/macro_actions.pdf
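
The simplest macro-action is a primitive action repeated k times; to keep Q-learning consistent, the k intermediate rewards are folded into one discounted reward and the bootstrap target uses gamma^k instead of gamma. A gymnasium-style wrapper illustrating that bookkeeping (illustrative, not the paper's code):

    import gymnasium as gym

    class MacroRepeat(gym.Wrapper):
        """Treat `repeat` copies of a primitive action as one macro-action.
        Returns the discounted sum of intermediate rewards; a DQN on top
        should bootstrap with gamma ** repeat for these transitions."""

        def __init__(self, env, repeat=4, gamma=0.99):
            super().__init__(env)
            self.repeat, self.gamma = repeat, gamma

        def step(self, action):
            total, discount = 0.0, 1.0
            for _ in range(self.repeat):
                obs, r, term, trunc, info = self.env.step(action)
                total += discount * r
                discount *= self.gamma
                if term or trunc:                # stop early at episode end
                    break
            return obs, total, term, trunc, info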