I am a Faculty Fellow at the Center for Data Science at New York University, where I collaborate with Tal Linzen. I received my PhD in computational linguistics from The Ohio State University, where I worked with William Schuler.
My work aims to advance our understanding of language processing in humans and machines by drawing on techniques from psycholinguistics and machine learning. I am particularly interested in developing computational models that capture the real-time processing behavior of human language users, as well as interpretability techniques for studying the predictions and representations of neural networks.
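A common thread across much of this work is surprisal, the negative log probability that a language model assigns to a word given its preceding context. As a rough illustration only (this sketch is not taken from any of the papers below; it assumes the Hugging Face transformers library and the pretrained "gpt2" checkpoint, and the example sentence is made up), per-token surprisal can be estimated along these lines:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Load the small pretrained GPT-2 checkpoint and its tokenizer.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

sentence = "The old man the boats."  # hypothetical example sentence
input_ids = tokenizer(sentence, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, sequence_length, vocabulary_size)

# Log probability of each token given its preceding context; the first
# token has no context under this model, so surprisal starts at position 1.
log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
targets = input_ids[0, 1:]
surprisal_bits = -log_probs[torch.arange(targets.size(0)), targets] / torch.log(torch.tensor(2.0))

for token, s in zip(tokenizer.convert_ids_to_tokens(targets.tolist()), surprisal_bits):
    print(f"{token!r}\t{s.item():.2f} bits")

Token-level estimates of this kind are what the papers below evaluate as predictors of human reading times.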
JML
Dissociable frequency effects attenuate as large language model surprisal predictors improve
Byung-Doh Oh and William Schuler
Journal of Memory and Language, 2025
@article{ohschuler25jml,author={Oh, Byung-Doh and Schuler, William},title={Dissociable frequency effects attenuate as large language model surprisal predictors improve},year={2025},journal={Journal of Memory and Language},volume={143},pages={104645},section={articles},}
EACL
Frequency explains the inverse correlation of large language models’ size, training data amount, and surprisal’s fit to reading times
Byung-Doh Oh, Shisen Yue, and William Schuler
In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
@inproceedings{ohetal24eacl,author={Oh, Byung-Doh and Yue, Shisen and Schuler, William},title={Frequency explains the inverse correlation of large language models' size, training data amount, and surprisal's fit to reading times},year={2024},booktitle={Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics},pages={2644--2663},section={papers},}
ACL
Token-wise decomposition of autoregressive language model hidden states for analyzing model predictions
Byung-Doh Oh and William Schuler
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023
@inproceedings{ohschuler23acl,author={Oh, Byung-Doh and Schuler, William},title={Token-wise decomposition of autoregressive language model hidden states for analyzing model predictions},year={2023},booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},pages={10105--10117},section={papers},}
TACL
Why does surprisal from larger Transformer-based language models provide a poorer fit to human reading times?
Byung-Doh Oh and William Schuler
Transactions of the Association for Computational Linguistics, 2023
@article{ohschuler23tacl,author={Oh, Byung-Doh and Schuler, William},title={Why does surprisal from larger Transformer-based language models provide a poorer fit to human reading times?},year={2023},journal={Transactions of the Association for Computational Linguistics},volume={11},pages={336--350},section={articles},}
EMNLP
Entropy- and distance-based predictors from GPT-2 attention patterns predict reading times over and above GPT-2 surprisal
Byung-Doh Oh and William Schuler
In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
@inproceedings{ohschuler22emnlp,author={Oh, Byung-Doh and Schuler, William},title={Entropy- and distance-based predictors from GPT-2 attention patterns predict reading times over and above GPT-2 surprisal},year={2022},booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},pages={9324--9334},section={papers},}
FAI
Comparison of structural parsers and neural language models as surprisal estimators
Byung-Doh Oh, Christian Clark, and William Schuler
Frontiers in Artificial Intelligence, 2022
@article{ohetal22fai,author={Oh, Byung-Doh and Clark, Christian and Schuler, William},title={Comparison of structural parsers and neural language models as surprisal estimators},year={2022},journal={Frontiers in Artificial Intelligence},volume={5},pages={777963},section={articles},}