Georgetown Grad Student Team Wins International Natural Language Processing Challenge
A team of three current and former Georgetown graduate students in computational linguistics and computer science has won an international competition in natural language processing and will present their research at a global summit on the subject next month.
Zhuoxuan Nymphea Ju (G’26), Jingni Wu (G’25) and Abhishek Purushothama (G’29) placed first in the biennial Discourse Relation Parsing and Treebanking (DISRPT) shared task in discourse relation classification in September. The team developed and demonstrated a decoder-based language processing system, called DeDisCo, that outperformed those developed by competitors from universities in Germany, France, Canada and India.
“[Winning] is definitely an unexpected academic and personal highlight,” wrote Purushothama, a Ph.D. student in computer science. “The team came to the challenge with different perspectives and motivations, and that came to be a good approach and a winning system.”
Along with Purushothama, Ju, a master’s student in computational linguistics, and Wu, a May graduate of the computational linguistics master’s program, are current and former members of the Corpling Lab. That computational linguistics lab is led by associate professor Amir Zeldes and is part of the larger Computation and Language @ Georgetown network.
The Corpling Lab has ranked highly in the DISRPT competition in recent years, including placing first in all five categories in DISRPT 2021 and scoring first and second in various categories of the 2019 competition.
Members of this year’s winning team will present their research virtually at the 2025 EMNLP, or Empirical Methods in Natural Language Processing, conference in Suzhou, China, on Nov. 9.

The winning team and their mentor, from left to right: Abhishek Purushothama (G’29), Zhuoxuan Nymphea Ju (G’26), Jingni Wu (G’25) and Amir Zeldes (photo composite courtesy of Amir Zeldes)
What are Computational Linguistics and Discourse Relation Classification?
Computational linguistics is a key area within artificial intelligence and computer science that centers on teaching computers how to learn, model and communicate in human language, a practice broadly referred to as natural language processing. Discourse relation classification describes how systems identify and process connections between multiple sentences to understand meanings arising from their combinations.
“The propositions in language, such as sentences or clauses, can have all sorts of relations between them, but these aren’t always spelled out explicitly,” wrote Zeldes, who is an organizer of DISRPT but was not involved in judging the competition.
“For example, even in a very short text like ‘Kim fell. Mary pushed her,’ we infer that the falling happened after the pushing (temporal ordering) and that the pushing caused the falling (causality),” Zeldes continued. “These are 2 of the 17 relations that needed to be distinguished in the task.”
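The example above can be framed in code: discourse relation classification is a supervised labeling task over pairs of text units. The sketch below is purely illustrative — the label names, the toy connective heuristic and the `featurize` helper are hypothetical, not DeDisCo's method or DISRPT's actual label inventory (real systems encode each pair with a neural model).

```python
# Discourse relation classification framed as pair labeling.
# Labels and features here are illustrative only.

examples = [
    # (unit 1, unit 2, gold relation label)
    ("Kim fell.", "Mary pushed her.", "causal"),
    ("It started raining,", "so we went inside.", "causal"),
    ("She plays piano", "and he plays violin.", "conjunction"),
]

def featurize(unit1: str, unit2: str) -> dict:
    """Toy features: look for an explicit connective at the start of unit 2.
    Note that the 'Kim fell' example has no connective at all -- the relation
    must be inferred, which is what makes the task hard."""
    connectives = {"so": "causal", "and": "conjunction", "but": "contrast"}
    tokens = unit2.lower().replace(",", "").replace(".", "").split()
    return {"first_token": tokens[0], "connective_hint": connectives.get(tokens[0])}

for u1, u2, gold in examples:
    feats = featurize(u1, u2)
    print(f"{u1!r} + {u2!r} -> gold={gold}, hint={feats['connective_hint']}")
```

Running this shows that only two of the three pairs carry an explicit connective; the first pair's causal relation is left entirely to inference, exactly the kind of implicit relation Zeldes describes.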
The discourse relation classification task challenged student groups to create a system that could parse 39 data sets in 16 languages, including English, Nigerian Pidgin and Thai, and identify the relations between discourse units within them.
DeDisCo’s decoder-based system uses existing data and patterns to generate new information, a process similar to that of large language models like GPT and Claude. Decoder systems work sequentially, interpreting each new input in relation to previously seen data points in a way that mimics how humans process new information.
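The sequential, left-to-right behavior described above comes from causal masking: each position in a decoder may attend only to itself and earlier positions. The sketch below illustrates that masking pattern in isolation, under the general transformer-decoder formulation — it is not a description of DeDisCo's actual architecture.

```python
# A minimal illustration of the "causal mask" that makes decoder models
# sequential: position i (row) may attend only to positions j <= i (column).

def causal_mask(n: int) -> list[list[bool]]:
    """True where position i may attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]

tokens = ["Kim", "fell", ".", "Mary", "pushed", "her"]
mask = causal_mask(len(tokens))

for i, row in enumerate(mask):
    visible = [tokens[j] for j, ok in enumerate(row) if ok]
    print(f"{tokens[i]:>7} attends to: {' '.join(visible)}")
```

Each token's row grows by one: "fell" sees only "Kim fell", while "her" sees the whole preceding sequence — the sense in which each new input is understood in relation to what came before.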
Combining Language and Tech for Good
Purushothama and Ju said they were drawn to studying computational linguistics by a love of language in its many forms.
Ju said her interest in computational linguistics started when she began formally studying languages and their structures as an undergraduate, and she merged that with a growing interest in technology. Purushothama’s affinity took root earlier and arose out of a childhood curiosity for technology and passion for fiction.
“Both the potential and the challenge of using computers to ‘do’ language has been interesting ever since I first started using computers,” Purushothama wrote. “It has only become more exciting [as] I have learned and studied more computation and language.”
The field of computational linguistics is broader than many might think, Ju and Purushothama said. It incorporates aspects of many STEM and language-based fields to solve challenges in language and information processing that affect all areas of society.
“Computational linguistics is a very inclusive field,” Ju wrote. “Whether someone is more drawn to understanding human language and cognition, or to computational systems, there’s always a meaningful way to contribute.”
