My primary research interests lie in several areas: healthcare applications, multimodality, and creative language.
Below, I provide a brief overview of the ongoing projects in which I’m involved; for additional details, refer to the relevant papers listed or feel free to send me a message. My recent work has been conducted in UIC’s Natural Language Processing Laboratory, which I co-direct. On my publications page, you can also find links to papers about earlier lines of research that, while not ongoing, still represent substantial interests of mine.
Healthcare Applications
My work at the intersection of natural language processing and healthcare focuses primarily on aspects of cognitive and mental health, caregiver support, and online behavior. My current collaborators in this area include researchers from UIC, UI Health, and the University of California, San Diego.
In my group’s work related to cognitive and mental health, we have developed approaches for detecting cognitive and mental health conditions from spoken language. In the area of cognitive health, we’ve worked on Alzheimer’s disease detection and prediction of fine-grained cognitive health scores; more recently, we’ve also developed a dataset to facilitate pioneering studies of language patterns in preclinical Alzheimer’s disease. In the area of mental health, we’ve worked on recognizing language characteristics of individuals with bipolar disorder and schizophrenia as they participate in a spoken dialogue social skills performance assessment. Portions of this work were funded by the National Institutes of Health. Some of our recent publications in this area include:
- Shahla Farzana,* Edoardo Stoppa,* Alex Leow, Tamar Gollan, Raeanne Moore, David Salmon, Douglas Galasko, Erin Sundermann, and Natalie Parde. SLaCAD: A Spoken Language Corpus for Early Alzheimer’s Disease Detection. In the Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Turin, Italy, May 20-25, 2024.
- Shahla Farzana* and Natalie Parde. Towards Domain-Agnostic and Domain-Adaptive Dementia Detection from Spoken Language. In the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023). Toronto, Canada, July 9-14, 2023.
- Ankit Aich,* Avery Quynh, Varsha Badal, Amy Pinkham, Philip Harvey, Colin Depp, and Natalie Parde. Towards Intelligent Clinically-Informed Language Analyses of People with Bipolar Disorder and Schizophrenia. In Findings of the Association for Computational Linguistics: EMNLP 2022 (EMNLP Findings 2022). Online, December 7-11, 2022.
In collaboration with researchers in UIC’s CPERL group, my group is investigating ways to support caregivers of children with pediatric rehabilitation needs through the development of smart and connected tools for equitable early intervention service design. This project is funded by the National Science Foundation and involves the development of novel techniques for dialogue systems and caregiver strategy classification. Papers describing the project and its background include:
- Mina Valizadeh,* Vera Kaelin, Mary Khetani, and Natalie Parde. CareCorpus: A Corpus of Real-World Solution-Focused Caregiver Strategies for Personalized Pediatric Rehabilitation Service Design. In the Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Turin, Italy, May 20-25, 2024.
- Vera C. Kaelin, Andrew D. Boyd, Martha M. Werler, Natalie Parde, and Mary A. Khetani. Natural Language Processing to Classify Caregiver Strategies Supporting Participation Among Children and Youth with Craniofacial Microsomia and Other Childhood-Onset Disabilities. In the Journal of Healthcare Informatics Research, 2023.
- Vera Kaelin, Mina Valizadeh, Zurisadai Salgado, Natalie Parde, and Mary Khetani. Artificial Intelligence in Rehabilitation Targeting the Participation of Children and Youth With Disabilities: Scoping Review. Journal of Medical Internet Research 23(11):e25745. November 2021.
In my group’s work pertaining to health-related online behavior, we have examined medical self-disclosure and (more recently) empathy in online health forums. This has resulted in two innovative datasets: one pertaining to medical self-disclosure, and another, AcnEmpathize, a novel dataset of domain-specific health-related online empathy. More information about this work, including details for accessing these datasets, can be found here:
- Gyeongeun Lee* and Natalie Parde. AcnEmpathize: A Dataset for Understanding Empathy in Dermatology Conversations. In the Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Turin, Italy, May 20-25, 2024.
- Mina Valizadeh,* Xing Qian,* Pardis Ranjbar-Noiey,* Cornelia Caragea, and Natalie Parde. What Clued the AI Doctor In? On the Influence of Data Source and Quality for Transformer-Based Medical Self-Disclosure Detection. In the Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023). Dubrovnik, Croatia, May 2-6, 2023.
- Mina Valizadeh,* Pardis Ranjbar-Noiey,* Cornelia Caragea, and Natalie Parde. Identifying Medical Self-Disclosure in Online Communities. In the Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021). Online, June 6-11, 2021.
Multimodality
My work in multimodal natural language processing has primarily centered on language and vision. My group investigates numerous topics falling under this umbrella, recently including visual storytelling and the detection of misogynistic memes. Papers addressing these topics can be found here:
Creative Language
My work on creative language has primarily focused on figurative language. Recently, my group has examined metaphor, idiom, and hyperbole in empathetic online communication (publication forthcoming). We also conducted a focused study on euphemism. Some publications regarding this, along with my earlier work on metaphor novelty and sarcasm in more generalized domains, include the following:
Other Ongoing Research
A complementary focus of my ongoing research has been on the reproducibility and robust evaluation of natural language processing experiments. My research group has conducted a number of high-level and probing studies on this topic, in an effort to promote higher-quality and more open science. A selection of our publications on this topic includes:
- Mohammad Arvan,* A. Seza Doğruöz, and Natalie Parde. Investigating Reproducibility at Interspeech Conferences: A Longitudinal and Comparative Perspective. In the Proceedings of the 24th INTERSPEECH Conference (INTERSPEECH 2023). Dublin, Ireland, August 20-24, 2023.
- Maja Popović, Mohammad Arvan,* Natalie Parde, and Anya Belz. Exploring Variation of Results from Different Experimental Conditions. In Findings of the Association for Computational Linguistics: ACL 2023 (ACL Findings 2023). Online, July 9-14, 2023.
- Mohammad Arvan,* Luís Pina, and Natalie Parde. Reproducibility in Computational Linguistics: Is Source Code Enough? In the Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). Abu Dhabi, United Arab Emirates, December 7-11, 2022.