Dongha Lee

Assistant Professor
Data & Language Intelligence Lab. @ Yonsei
Department of Artificial Intelligence
Yonsei University

Adjunct Professor
Graduate School of Artificial Intelligence
POSTECH

Email  /  Google Scholar  /  GitHub

Biography

I am an assistant professor in the Department of Artificial Intelligence at Yonsei University. I received my Ph.D. in Computer Science from POSTECH, where I was advised by Prof. Hwanjo Yu. During my Ph.D., I was fortunate to work as a visiting scholar at UT Health under the supervision of Prof. Xiaoqian Jiang. After completing my Ph.D., I worked as a postdoctoral research fellow at UIUC with my advisor, Prof. Jiawei Han.

Research Interests

  • LLMs with Parametric and Non-Parametric Knowledge
  • Information Retrieval & Recommender Systems
  • Data Intelligence for Real-World Applications

Publications

Unsupervised Robust Cross-Lingual Entity Alignment via Neighbor Triple Matching with Entity and Relation Texts
Soojin Yoon, Sungho Ko, Tongyoung Kim, SeongKu Kang, Jinyoung Yeo, Dongha Lee
WSDM, 2025
paper / code
Improving Scientific Document Retrieval with Concept Coverage-based Query Set Generation
SeongKu Kang, Bowen Jin, Wonbin Kweon, Yu Zhang, Dongha Lee, Jiawei Han, Hwanjo Yu
WSDM, 2025
paper / code
Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths
Sangam Lee, Ryang Heo, SeongKu Kang, Susik Yoon, Jinyoung Yeo, Dongha Lee
Preprint (arXiv), 2024
paper / code
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
Seoyeon Kim, Huiseo Kim, Chanjun Park, Jinyoung Yeo, Dongha Lee
Preprint (arXiv), 2024
paper / code
Towards Lifelong Dialogue Agents via Relation-aware Memory Construction and Timeline-augmented Response Generation
{Kai Tzu-iunn Ong, Namyoung Kim}, Minju Gwak, Hyungjoo Chae, Taeyoon Kwon, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo
Preprint (arXiv), 2024
paper / code
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
Seungbeen Lee, Seungwon Lim, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu
Preprint (arXiv), 2024
paper / code
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Hyungjoo Chae, Namyoung Kim, Kai Tzu-iunn Ong, Minju Gwak, Gwanwoo Song, Jihoon Kim, Sunghwan Kim, Dongha Lee, Jinyoung Yeo
Preprint (arXiv), 2024
paper / code
Evaluating Robustness of Reward Models for Mathematical Reasoning
{Sunghwan Kim, Dongjin Kang}, Taeyoon Kwon, Hyungjoo Chae, Jungsoo Won, Dongha Lee, Jinyoung Yeo
Preprint (arXiv), 2024
paper / code
Bag of Tricks for Diabetic Retinopathy and Diabetic Macular Edema Classification in Ultra-Widefield Imaging
Hyeonmin Kim, Chanyang Seo, Wonyoung Seo, Yunnie Cho, Ohhyun Kwon, Dongha Lee
MICCAI Challenge: Ultra-Widefield Fundus Imaging for Diabetic Retinopathy (UWF4DR), 2024 **First Place Award**
paper / code
Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
Yeongbin Seo, Dongha Lee, Jinyoung Yeo
NeurIPS, 2024
paper / code
Taxonomy-guided Semantic Indexing for Academic Paper Search
SeongKu Kang, Yunyi Zhang, Pengcheng Jiang, Dongha Lee, Jiawei Han, Hwanjo Yu 
EMNLP (Oral), 2024
paper / code
Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering
{Sungho Ko, Hyunjin Cho}, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee 
EMNLP (Oral), 2024
paper / code
Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization
Kwangwook Seo, Jinyoung Yeo, Dongha Lee 
EMNLP Findings, 2024
paper / code
Make Compound Sentences Simple to Analyze: Learning to Split Sentences for Aspect-based Sentiment Analysis
{Yongsik Seo, Sungwon Song, Ryang Heo}, Jieyong Kim, Dongha Lee 
EMNLP Findings, 2024
paper / code
CACTUS: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory
{Suyeon Lee, Sunghwan Kim, Minju Kim}, Dongjin Kang, Dongil Yang, Harim Kim, Minseok Kang, Dayi Jung, Min Hee Kim, Seungbeen Lee, Kyoung-Mee Chung, Youngjae Yu, Dongha Lee, Jinyoung Yeo
EMNLP Findings, 2024
paper / code
Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation
Seonghyeon Lee, Suyeon Kim, Joonwon Jang, HeeJae Chon, Dongha Lee, Hwanjo Yu
EMNLP Findings, 2024
paper / code
SC-Rec: Enhancing Generative Retrieval with Self-Consistent Reranking for Sequential Recommendation
Tongyoung Kim, Soojin Yoon, SeongKu Kang, Jinyoung Yeo, Dongha Lee
Preprint (arXiv), 2024
paper / code
Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation
Jieyong Kim, Hyunseo Kim, Hyunjin Cho, SeongKu Kang, Buru Chang, Jinyoung Yeo, Dongha Lee
Preprint (arXiv), 2024
paper / code
Graph Signal Processing for Cross-Domain Recommendation
Jeongeun Lee, SeongKu Kang, Won-Yong Shin, Jeongwhan Choi, Noseong Park, Dongha Lee
Preprint (arXiv), 2024
paper / code
Is Functional Correctness Enough to Evaluate Code Language Models? Exploring Diversity of Generated Codes
{HeeJae Chon, Seonghyeon Lee}, Jinyoung Yeo, Dongha Lee 
Preprint (arXiv), 2024
paper / code
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation
{Dongjin Kang, Sunghwan Kim}, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo
ACL, 2024 **Outstanding Paper**
paper / code
VERIFINER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models
{Seoyeon Kim, Kwangwook Seo}, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee
ACL, 2024
paper / code
PEARL: A Review-driven Persona-Knowledge grounded Conversational Recommendation Dataset
{Minjin Kim, Minju Kim}, Hana Kim, Beong-woo Kwak, Soyeon Jeon, Hyunseo Kim, SeongKu Kang, Youngjae Yu, Jinyoung Yeo, Dongha Lee
ACL Findings, 2024
paper / code / dataset
Self-Consistent Reasoning-based Aspect-Sentiment Quad Prediction with Extract-Then-Assign Strategy
{Jieyong Kim, Ryang Heo}, Yongsik Seo, SeongKu Kang, Jinyoung Yeo, Dongha Lee
ACL Findings, 2024
paper / code
Exploring Language Model’s Code Generation Ability with Auxiliary Functions
Seonghyeon Lee, Sanghwan Jang, Seongbo Jang, Dongha Lee, Hwanjo Yu
NAACL Findings, 2024
paper / code
RTSUM: Relation Triple-based Interpretable Summarization with Multi-level Salience Visualization
Seonglae Cho, Myungha Jang, Jinyoung Yeo, Dongha Lee
NAACL Demo, 2024
paper / code
Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection
Suyeon Kim, Dongha Lee*, SeongKu Kang, Sukang Chae, Sanghwan Jang, Hwanjo Yu*
CVPR, 2024
paper / code
Unbiased, Effective, and Efficient Distillation from Heterogeneous Models for Recommender Systems
SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu
ACM Transactions on Recommender Systems, 2024
paper / code
Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy
SeongKu Kang, Shivam Agarwal, Bowen Jin, Dongha Lee, Hwanjo Yu, Jiawei Han
WWW, 2024
paper / code
Evidentiality-Aware Retrieval for Overcoming Abstractiveness in Open-Domain Question Answering
Yongho Song, Dahyun Lee, Myungha Jang, Seung-won Hwang, Kyungjae Lee, Dongha Lee, Jinyoung Yeo
EACL Findings, 2024
paper / code
Commonsense-augmented Memory Construction and Management in Long-term Conversations via Context-aware Persona Refinement
Hana Kim, Kai Tzu-iunn Ong, Seoyeon Kim, Dongha Lee, Jinyoung Yeo
EACL, 2024
paper / code
Large Language Models are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales
Taeyoon Kwon, Kai Ong, Dongjin Kang, Seungjun Moon, Jeong Ryong Lee, Dosik Hwang, Beomseok Sohn, Yongsik Sim, Dongha Lee, Jinyoung Yeo
AAAI, 2024
paper / code
Multi-Domain Recommendation to Attract Users via Domain Preference Modeling
Hyunjun Ju, SeongKu Kang, Dongha Lee, Junyoung Hwang, Sanghwan Jang, Hwanjo Yu
AAAI, 2024
paper / code
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents
Hyungjoo Chae, Yongho Song, Kai Tzu-iunn Ong, Taeyoon Kwon, Minjin Kim, Youngjae Yu, Dongha Lee, Dongyeop Kang, Jinyoung Yeo
EMNLP, 2023
paper / code
Unsupervised Story Discovery from Continuous News Streams via Scalable Thematic Embedding
Susik Yoon, Dongha Lee, Yunyi Zhang, Jiawei Han
SIGIR, 2023
paper / code
SCStory: Self-supervised and Continual Online Story Discovery
Susik Yoon, Yu Meng, Dongha Lee, Jiawei Han
WWW, 2023
paper / code
Distillation from Heterogeneous Models for Top-K Recommendation
SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu
WWW, 2023
paper / code
Topology-Specific Experts for Molecular Property Prediction
Suyeon Kim, Dongha Lee, SeongKu Kang, Seonghyeon Lee, Hwanjo Yu
AAAI, 2023
paper / code
Topic Taxonomy Expansion via Hierarchy-Aware Topic Phrase Generation
Dongha Lee, Jiaming Shen, Seonghyeon Lee, Susik Yoon, Hwanjo Yu, Jiawei Han
EMNLP Findings, 2022
paper / code
Mitigating Viewpoint Sensitivity of Self-supervised One-class Classifiers
Hyunjun Ju, Dongha Lee, SeongKu Kang, Hwanjo Yu
Information Sciences, 2022
paper / code
Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning
Seonghyeon Lee, Dongha Lee, Seongbo Jang, Hwanjo Yu
ACL, 2022
paper / code
TaxoCom: Topic Taxonomy Completion with Hierarchical Discovery of Novel Topic Clusters
Dongha Lee, Jiaming Shen, SeongKu Kang, Susik Yoon, Jiawei Han, Hwanjo Yu
WWW, 2022
paper / code
Consensus Learning from Heterogeneous Objectives for One-Class Collaborative Filtering
SeongKu Kang, Dongha Lee, Wonbin Kweon, Junyoung Hwang, Hwanjo Yu
WWW, 2022
paper / code
Personalized Knowledge Distillation for Recommender System
SeongKu Kang, Dongha Lee, Wonbin Kweon, Hwanjo Yu
Knowledge-Based Systems, 2022
paper / code
Out-of-Category Document Identification Using Target-Category Names as Weak Supervision
Dongha Lee, Dongmin Hyun, Jiawei Han, Hwanjo Yu
ICDM, 2021
paper / code
Learnable Structural Semantic Readout for Graph Classification
Dongha Lee, Suyeon Kim, Seonghyeon Lee, Chanyoung Park, Hwanjo Yu
ICDM, 2021
paper / code
Weakly Supervised Temporal Anomaly Segmentation with Dynamic Time Warping
Dongha Lee, Sehun Yu, Hyunjun Ju, Hwanjo Yu
ICCV, 2021
paper / code
Out-of-manifold Regularization in Contextual Embedding Space for Text Classification
Seonghyeon Lee, Dongha Lee, Hwanjo Yu
ACL, 2021
paper / code
Bootstrapping User and Item Representations for One-Class Collaborative Filtering
Dongha Lee, SeongKu Kang, Hyunjun Ju, Chanyoung Park, Hwanjo Yu
SIGIR, 2021
paper / code
Learnable Dynamic Temporal Pooling for Time Series Classification
Dongha Lee, Seonghyeon Lee, Hwanjo Yu
AAAI, 2021
paper / code
Multi-class Data Description for Out-of-distribution Detection
Dongha Lee, Sehun Yu, Hwanjo Yu
KDD, 2020
paper / code
Generating Sequential Electronic Health Records using Dual Adversarial Autoencoder
Dongha Lee, Hwanjo Yu, Xiaoqian Jiang, Deevakar Rogith, Meghana Gudala, Mubeen Tejani, Qiuchen Zhang, Li Xiong
Journal of the American Medical Informatics Association (JAMIA), 2020
paper / code
Harmonized Representation Learning on Dynamic EHR Graphs
Dongha Lee, Xiaoqian Jiang, Hwanjo Yu
Journal of Biomedical Informatics, 2020
paper / code
Convolutional Neural Networks with Compression Complexity Pooling for Out-of-Distribution Image Detection
Sehun Yu, Dongha Lee, Hwanjo Yu
IJCAI, 2020
paper / code
PUMAD: PU Metric Learning for Anomaly Detection
Hyunjun Ju, Dongha Lee, Junyoung Hwang, Junghyun Namkung, Hwanjo Yu
Information Sciences, 2020
paper / code
Scalable Disk-based Topic Modeling for Memory Limited Devices
Byungju Kim, Dongha Lee, Jinoh Oh, Hwanjo Yu
Information Sciences, 2020
paper / code
OCAM: Out-of-core Coordinate Descent Algorithm for Matrix Completion
Dongha Lee, Jinoh Oh, Hwanjo Yu
Information Sciences, 2020
paper / code / webpage
Large-Scale Matrix and Tensor Completion based on Out-of-Core Approaches
Dongha Lee
Ph.D. Dissertation, 2020
paper
Semi-Supervised Learning for Cross-Domain Recommendation to Cold-Start Users
SeongKu Kang, Junyoung Hwang, Dongha Lee, Hwanjo Yu
CIKM, 2019
paper
Action Space Learning for Heterogeneous User Behavior Prediction
Dongha Lee, Chanyoung Park, Hyunjun Ju, Junyoung Hwang, Hwanjo Yu
IJCAI, 2019
paper / code
Fast Tucker Factorization for Large-scale Tensor Completion
Dongha Lee, Jaehyung Lee, Hwanjo Yu
ICDM, 2018
paper / code / webpage
Disk-based Matrix Completion for Memory Limited Devices
Dongha Lee, Jinoh Oh, Christos Faloutsos, Byungju Kim, Hwanjo Yu
CIKM, 2018
paper / webpage
DualSentiNet: Dual Prediction of Word and Document Sentiments Using Shared Word Embedding
Dongha Lee, Hyunjun Ju, Jung-Mi Park, Kye-Yoon Kim, Hwanjo Yu
IMCOM, 2018
paper
Compressing Model for Matrix Factorization with Quantization Using k-means Clustering
Junsu Cho, Dongha Lee, Hwanjo Yu
KDBC, 2017
GeoVideoIndex: Indexing for Georeferenced Videos
Dongha Lee, Jinoh Oh, Woong-Kee Loh, Hwanjo Yu
Information Sciences, 2016
paper

Teaching

Undergraduate

  • [CSI3106] Software Engineering, 2023F
  • [AAI3120] Machine Learning, 2023S, 2024S
  • [AIC2110] Introduction to Data Science, 2024S

Graduate

  • [AAI5009] Recommender Systems and Information Filtering, 2023S
  • [AAI5013] Advanced Data Mining, 2023F, 2024F

Work Experience

University of Illinois at Urbana-Champaign (UIUC), United States
Postdoctoral Research Fellow, 2021.07 - 2023.02
Department of Computer Science
Advisor: Prof. Jiawei Han
Pohang University of Science and Technology (POSTECH), South Korea
Postdoctoral Researcher, 2020.03 - 2021.06
Department of Computer Science and Engineering
Advisor: Prof. Hwanjo Yu
University of Texas Health Science Center at Houston (UT Health), United States
Visiting Scholar, 2018.09 - 2019.02
School of Biomedical Informatics
Advisor: Prof. Xiaoqian Jiang

Education

Pohang University of Science and Technology (POSTECH), South Korea
Ph.D. in Computer Science and Engineering, 2015.03 - 2020.02
Large-scale Matrix and Tensor Completion based on Out-of-core Approaches
Advisor: Prof. Hwanjo Yu
Technical University of Berlin (TU Berlin), Germany
B.S. in Computer Science, 2013.10 - 2014.02
Exchange Student
Pohang University of Science and Technology (POSTECH), South Korea
B.S. in Computer Science and Engineering, 2011.03 - 2015.02
Summa Cum Laude (Ranked 1st in the Department)
