Research - YJ Choe

Research Interests

Broadly, I work on topics in statistics, machine learning, and natural language processing. In recent years, I've been excited about a somewhat eclectic set of research areas and topics:

Game-theoretic statistics: sequential inference; anytime-validity; e-values and e-processes; confidence sequences; testing by betting; and evaluation of forecasters and black-box predictors;
Science of large language models: causal representations; geometry of LLM embeddings; Transformers; (mechanistic) interpretability; and alignment.

Preprints

Combining Evidence Across Filtrations Using Adjusters (arXiv, slides)
Y. J. Choe, A. Ramdas
arXiv, 2024
Contributed talk at JSM 2024 (August); Invited talk at ICSDS 2024 (December)

The Linear Representation Hypothesis and the Geometry of Large Language Models (arXiv, code, slides)
K. Park, Y. J. Choe, V. Veitch
arXiv, 2023
Preliminary work presented (oral and poster) at the NeurIPS 2023 Workshop on Causal Representation Learning

Thesis

Comparing Forecasters and Abstaining Classifiers (thesis, slides)
Y. J. Choe
Ph.D. Thesis, Carnegie Mellon University (2023)

Publications

Asterisks (*) denote equal contribution.

Counterfactually Comparing Abstaining Classifiers (proc, arXiv, code, slides, poster)
Y. J. Choe, A. Gangrade, A. Ramdas
Advances in Neural Information Processing Systems (NeurIPS), 2023
Preliminary work presented (oral and poster) at the ICML 2023 Workshop on Counterfactuals in Minds and Machines

Comparing Sequential Forecasters (proc, arXiv, code, slides, poster)
Y. J. Choe, A. Ramdas
Operations Research, 2023
Preliminary work presented at SAVI 2022 and JSM 2021
Student Poster Award (Runner-up) w/ $10K Research Grant, Citadel Securities Inaugural PhD Summit

KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding (proc, arXiv, data)
J. Ham*, Y. J. Choe*, K. Park*, I. Choi, H. Soh
Findings of the Association for Computational Linguistics: EMNLP, 2020

word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs (proc, arXiv, code)
Y. J. Choe*, K. Park*, D. Kim*
Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020

Jejueo Datasets for Machine Translation and Speech Synthesis (proc, arXiv, code)
K. Park, Y. J. Choe, J. Ham
Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020

Predicting Drug–Target Interaction Using a Novel Graph Neural Network with 3D Structure-Embedded Graph Representation (proc, arXiv)
J. Lim, S. Ryu, K. Park, Y. J. Choe, J. Ham, W. Y. Kim
Journal of Chemical Information and Modeling, 2019

A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning (proc, arXiv, code)
Y. J. Choe*, J. Ham*, K. Park*, Y. Yoon*
Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), 2019
Runner-up, ACL Workshop Shared Task (Restricted & Low Resource Tracks)

Discovery of Natural Language Concepts in Individual Units of CNNs (proc, arXiv, poster, code)
S. Na, Y. J. Choe, D. Lee, G. Kim
International Conference on Learning Representations (ICLR), 2019

Local White Matter Architecture Defines Functional Brain Dynamics (proc, arXiv, slides)
Y. J. Choe, S. Balakrishnan, A. Singh, J. M. Vettel, T. Verstynen
IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2018
Franklin V. Taylor Memorial Award (for Best Paper and Oral Presentation)

Miscellaneous

An Empirical Study of Invariant Risk Minimization (arXiv, code, slides)
Y. J. Choe, J. Ham, K. Park
ICML Workshop on Uncertainty and Robustness in Deep Learning, 2020

Probabilistic Interpretations of Recurrent Neural Networks (report)
Collaborators: J. Shin, N. Spencer
Technical Report, 2017

Learning Diverse Overcomplete Dictionaries via Determinantal Priors (abstract)
M. Al-Shedivat, Y. J. Choe, N. Spencer, E. P. Xing
ICML Workshop on Geometry in Machine Learning, 2016

A Statistical Analysis of Neural Networks (report)
Technical Report, 2016

Sparse Additive Models with Shape Constraints (report, slides, code)
Mentors: J. Lafferty, S. Chatterjee, M. Xu
University of Chicago Computer Science REU, 2014