Merge branch 'gh-pages' of https://github.com/ndnlp/nlplus into gh-pages

Showing 6 changed files with 91 additions and 0 deletions.
---
layout: post
title: Noah Ziems
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Title: Explainable Decision Trees

Abstract: Decision trees offer a strong balance between speed, accuracy, and interpretability for tabular data. However, interpreting decision trees in practice is often difficult, as doing so requires background knowledge in machine learning. In this talk, I will discuss my work exploring how large language models can assist with decision tree explainability, and how these explanations can be evaluated using auto-generated quiz questions.

Bio: Noah Ziems is a second-year PhD student in the Department of Computer Science and Engineering at the University of Notre Dame and a member of Dr. Meng Jiang's DM2 lab. His research focuses on question answering, large language models, and information retrieval.
---
layout: post
title: Kezia Oketch
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Title: When Automated Assessment Meets Automated Content Generation: Examining Text Quality in the Era of GPTs

Abstract: The use of machine learning (ML) models to assess and score textual data has become increasingly pervasive in an array of contexts, including natural language processing, information retrieval, search and recommendation, and credibility assessment of online content. A significant disruption at the intersection of ML and text is the rise of text-generating large language models such as generative pre-trained transformers (GPTs). We empirically assess the differences in how ML-based scoring models trained on human content assess the quality of content generated by humans versus GPTs. To do so, we propose an analysis framework that encompasses essay-scoring ML models, human- and ML-generated essays, and a statistical model that parsimoniously considers the impact of respondent type, prompt genre, and the ML model used for assessment. We utilize a rich testbed encompassing 18,460 human-generated and GPT-based essays. Results of our benchmark analysis reveal that transformer pretrained language models (PLMs) score human essay quality more accurately than CNN/RNN and feature-based ML methods. Interestingly, we find that the transformer PLMs tend to score GPT-generated text 10-15% higher on average, relative to human-authored documents. Conversely, traditional deep learning and feature-based ML models score human text considerably higher. Further analysis reveals that even though the transformer PLMs are fine-tuned exclusively on human text, they attend more prominently to certain tokens appearing only in GPT-generated text, possibly due to familiarity/overlap in pre-training. Our framework and results have implications for text classification settings where automated scoring of text is likely to be disrupted by generative AI.

Bio: Kezia Oketch is a second-year PhD student in the Department of IT, Analytics & Operations at the University of Notre Dame's Mendoza College of Business. She is a member of the Human Analytics Lab (HAL) and is mentored by Professors Ahmed Abbasi and John Lalor. Her research primarily revolves around predicting behavior, personalization, and user modeling.
---
layout: post
title: Zhihan Zhang
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Title: Lessons We Learned Towards Instruction Tuning – Data, Training & Evaluation

Abstract: Instruction-tuned language models have become popular recently due to their ability to solve any NLP task given a natural language instruction. Since the release of the Llama and Alpaca models, many research efforts have explored key aspects of instruction tuning, including data design, the training procedure, and the evaluation protocol. In this presentation, I will briefly summarize some lessons we learned from instruction tuning papers in the past few months: what factors do researchers consider critical for tuning instruction-following models?

Bio: Zhihan Zhang is a third-year PhD student in the DM2 lab at Notre Dame, under the supervision of Dr. Meng Jiang. His recent research mainly focuses on large language models and instruction tuning, and he also has research experience in knowledge-augmented NLP and text retrieval. He has published multiple papers at top-tier NLP venues such as ACL, EMNLP, and TACL.
---
layout: post
title: Annalisa Szymanski
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Title: Leveraging Large Language Models to Assist with Nutrition and Dietary Health: Design Implications from a Study with Registered Dietitians

Abstract: Large Language Models (LLMs) have the potential to contribute significantly to the fields of nutrition and dietetics, particularly in generating food product explanations that facilitate informed food selections. However, the extent to which these models can offer effective and accurate information remains largely unexplored and unverified. This study addresses this gap by examining LLMs' capabilities and limitations in generating accurate nutrition information. We assess the impact of varying levels of specificity in the prompts used to generate LLM explanations. Using a mixed-methods approach, we collaborate with registered dietitians to evaluate the nutrition information generated by the model. From this collaboration, we propose a set of design implications to shape the future use of LLMs in producing nuanced dietary information. When applied, these design implications may enhance the practicality and efficacy of generating comprehensive explanations of food products that provide customized nutrition information.

Bio: Annalisa Szymanski is a third-year PhD student studying Human-Computer Interaction under the supervision of Dr. Ronald Metoyer. She is currently engaged in a project dedicated to enhancing food accessibility within food desert communities. Her research investigates how best to use AI solutions to harness nutrition data for generating personalized food recommendations and for educational resources that cater to the diverse needs of all users.
---
layout: post
title: Ruyuan Wan
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Title: Digital and Historical Exclusivity in Feminine Linguistics: From Nüshu to Xiaohongshu

Abstract: In the evolving landscape of human communication, exclusive spaces have emerged as a powerful tool for marginalized groups, particularly women. Drawing parallels between the ancient Nüshu (women's script) and modern tagging on the social media platform Xiaohongshu, this study examines how women have historically crafted unique communication spaces for unfiltered dialogue, free from external judgment. While inclusivity remains a societal ideal, exclusivity, as demonstrated by strategies like the "baby feeding" tag, can serve as a protective mechanism against deep-rooted gender biases. This research underscores the lasting need for such spaces, revealing the intricate balance between inclusivity and exclusivity in fostering genuine discourse. It further calls for future explorations into the dynamics of language style, user auditing, and digital exclusivity, emphasizing their implications for our digital age.

Bio: Ruyuan Wan is a PhD student at the University of Notre Dame. Standing at the crossroads of HCI, NLP, and Social Computing, she is passionate about exploring the intricacies and nuances of communication dynamics in the evolving digital landscape. She is driven to decode how humans and machines (AI-powered tools) communicate, collaborate, and sometimes conflict, and how these patterns can be harnessed to address pressing societal challenges.
---
layout: post
title: Georgina Curto Rex
---

Lunch at 12:30pm, talk at 1pm, in 148 Fitzpatrick

Abstract: While many types of hate speech and online toxicity have been the focus of extensive research in NLP, toxic language stigmatizing poor people has been mostly disregarded. Yet aporophobia, a social bias against the poor, is a common phenomenon online, one that can be psychologically damaging and can hinder poverty-reduction policy measures. We demonstrate that aporophobic attitudes are indeed present in social media and argue that the existing NLP datasets and models are inadequate to effectively address this problem. Efforts toward designing specialized resources and novel socio-technical mechanisms for confronting aporophobia are needed.

Bio: Georgina Rex is a Postdoctoral Fellow at the ND Technology Ethics Center and is involved with the Economically Sustainable AI for Good Project at the ND-IBM Tech Ethics Lab. She chairs the IJCAI Symposia in the Global South and co-chairs the AI & Social Good Special Track at the International Joint Conference on Artificial Intelligence (IJCAI'23). She has been a Visiting Scholar at the Kavli Center for Ethics, Science and the Public (UC Berkeley). Focusing on issues of poverty mitigation, fairness, and inclusion, Georgina works on the design of AI socio-technical systems that provide new insights to counteract inequality and, more broadly, to advance interdisciplinary research towards the achievement of the UN Sustainable Development Goals (SDGs). Her research contributes to the AI state of the art in Natural Language Processing, Agent-Based Modeling, Social Networks, and Machine Learning, with the ultimate goal of offering insights for innovative interventions to local and global challenges.