Research - YJ Choe

Research Interests

Broadly, I work on topics in statistics, machine learning, and natural language processing. In recent years, I've been excited about a somewhat eclectic set of research areas and topics:

  • Game-theoretic statistics: sequential inference; anytime-validity; e-values and e-processes; confidence sequences; testing by betting; and evaluation of forecasters and black-box predictors;

  • Science of large language models: causal representations; geometry of LLM embeddings; Transformers; (mechanistic) interpretability; and alignment.

Preprints

  • Combining Evidence Across Filtrations Using Adjusters (arXiv, slides)
    Y. J. Choe, A. Ramdas
    arXiv, 2024
    Contributed talk at JSM 2024 (August); Invited talk at ICSDS 2024 (December)

Thesis

  • Comparing Forecasters and Abstaining Classifiers (thesis, slides)
    Y. J. Choe
    Ph.D. Thesis, Carnegie Mellon University (2023)

Publications

Asterisks (*) denote equal contribution.

  • KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding (proc, arXiv, data)
    J. Ham*, Y. J. Choe*, K. Park*, I. Choi, H. Soh
    Findings of the Association for Computational Linguistics: EMNLP, 2020

  • word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs (proc, arXiv, code)
    Y. J. Choe*, K. Park*, D. Kim*
    Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020

  • Jejueo Datasets for Machine Translation and Speech Synthesis (proc, arXiv, code)
    K. Park, Y. J. Choe, J. Ham
    Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020

  • Predicting Drug–Target Interaction Using a Novel Graph Neural Network with 3D Structure-Embedded Graph Representation (proc, arXiv)
    J. Lim, S. Ryu, K. Park, Y. J. Choe, J. Ham, W. Y. Kim
    Journal of Chemical Information and Modeling, 2019

  • Discovery of Natural Language Concepts in Individual Units of CNNs (proc, arXiv, poster, code)
    S. Na, Y. J. Choe, D. Lee, G. Kim
    International Conference on Learning Representations (ICLR), 2019

Miscellaneous

  • Probabilistic Interpretations of Recurrent Neural Networks (report)
    Collaborators: J. Shin, N. Spencer
    Technical Report, 2017

  • A Statistical Analysis of Neural Networks (report)
    Technical Report, 2016

  • Sparse Additive Models with Shape Constraints (report, slides, code)
    Mentors: J. Lafferty, S. Chatterjee, M. Xu
    University of Chicago Computer Science REU, 2014