Evidence-based Research on D&I in AI

Evidence-based research is a method of inquiry that uses empirical evidence to test or evaluate hypotheses and theories. It involves gathering, analyzing, and interpreting data with rigorous and transparent methods. This approach is important for diversity and inclusion in artificial intelligence (AI) because it provides objective insight into how diverse perspectives affect AI development and applications, and into the real challenges, gaps, and opportunities. By using evidence-based research, we can find and address biases, ensure equal representation, and improve the fairness and effectiveness of AI systems, ultimately leading to more inclusive and socially responsible technological developments.

Our team is leveraging evidence to design impactful resources, guidelines, and tools aimed at enhancing diversity and ethical practices in AI.

  • We have conducted comprehensive systematic literature reviews on “AI and The Quest for Diversity and Inclusion,” “Responsible AI Governance,” and “Inclusive and Explainable AI Systems,” aiming to uncover trends, develop ethical frameworks, and enhance transparency and accessibility in AI practices.
  • We are conducting qualitative research on “Large Language Models” to examine their social and ethical impacts, and to make sure that AI progresses in ways that respect human values. Our empirical study of “Human Value Requirements in AI Systems” aims to find user-oriented design principles that put human needs first.
  • In “Responsible AI for Scientific Research”, we are assessing how AI can help responsibly advance scientific knowledge, ensuring that innovation is based on Ethical and Responsible AI principles.
  • Finally, our project on “Operationalising Diversity and Inclusion in AI” is committed to providing practical strategies that integrate diversity and inclusion into the core of AI development processes, creating an AI ecosystem that is inclusive, fair, and reflective of the diverse range of humanity.

In the rapidly advancing AI landscape, the integration of diversity and inclusion (D&I) principles is not just a moral imperative but a necessity for creating inclusive, unbiased, and trustworthy technology. Despite this, D&I considerations are frequently overlooked in AI’s design, development, and deployment, leading to a proliferation of undesirable outcomes, as highlighted in our latest research article. With AI increasingly becoming an integral part of our social and professional lives, everyone must understand what ethical and responsible AI that addresses D&I concerns looks like, and the repercussions when those concerns are ignored.

Our latest research, a Systematic Literature Review (SLR) of 48 scholarly articles, sheds light on the critical challenges and solutions for embedding D&I in AI. The study investigates two main themes: the impact of D&I on the development and deployment of AI systems, and conversely, how AI can be used to enhance D&I initiatives. Through a comprehensive analysis of the selected academic papers published between 2017 and 2022, the review categorizes and assesses various challenges and solutions associated with integrating D&I into AI. The study identifies 55 unique challenges related to D&I within the context of AI, along with 33 corresponding solutions. Additionally, the review covers 24 unique challenges and presents 23 solutions concerning the application of AI for improving D&I practices.

The insights from our SLR reveal:

  • Disparity in Research Focus: Our findings reveal an imbalance between studies exploring D&I within AI and those leveraging AI to enhance D&I practices.
  • Neglected Diversity Attributes: While gender is the most commonly addressed attribute, other critical factors like ethnicity and language are frequently overlooked, raising concerns about the comprehensiveness of current D&I efforts in AI.
  • Sectoral Bias in Literature: The dominance of the health sector in existing research suggests a narrow focus, leaving crucial areas such as law and education underexplored.
  • Limited Technological Scope: Research concentrates on facial analysis and natural language processing, while technologies such as voice recognition and large language models receive comparatively little attention.
  • Governance and Geographic Diversity: Our review highlights a significant lack of attention to AI governance challenges and solutions, as well as a limited geographic diversity in research contributions, particularly from the Global South.

This review offers a comprehensive roadmap for building a more inclusive AI future, making it an essential read for policymakers, technologists, and anyone committed to ethical AI practices. Its overarching goal is to enhance awareness among researchers and practitioners about the critical need to embed D&I principles comprehensively in AI systems, aiming for more inclusive and equitable technology development and application.

Citation:

Shams, R.A., Zowghi, D. & Bano, M. “AI and the quest for diversity and inclusion: a systematic literature review”. AI and Ethics (2023). https://doi.org/10.1007/s43681-023-00362-w

Contact person: Dr. Rifat Ara Shams, Rifat.Shams@data61.csiro.au

As artificial intelligence transforms a wide range of sectors and drives innovation, it also introduces complex challenges concerning ethics, transparency, bias, and fairness. Integrating Responsible AI (RAI) principles within governance frameworks is paramount to mitigate these emerging risks. While there are many solutions for AI governance, significant questions remain about their effectiveness in practice. Addressing this knowledge gap, this paper examines the existing literature on AI governance, analyzing it to answer key questions: WHO is accountable for AI systems’ governance, WHAT elements are being governed, WHEN governance occurs within the AI development life cycle, and HOW it is executed through various mechanisms such as frameworks, tools, standards, policies, or models. Employing a systematic literature review methodology, we followed a rigorous search and selection process that identified 61 relevant articles on AI governance. Of the 61 studies analyzed, only 5 provided complete responses to all four questions. The findings from this review aid researchers in formulating more holistic and comprehensive RAI governance frameworks, and they highlight the important role of AI governance at various levels, especially the organizational level, in establishing effective and responsible AI practices. They also provide a foundational basis for future research and the development of comprehensive governance models that align with RAI principles.

Citation:

Batool, A., Zowghi, D., & Bano, M. (2023). Responsible AI Governance: A Systematic Literature Review. arXiv preprint arXiv:2401.10896.

Contact person: Professor Didar Zowghi, didar.zowghi@csiro.au

Investigating Diversity and Inclusion Violations in AI Incidents (In Progress)

Diversity Analyser App

In the rapidly advancing Artificial Intelligence (AI) landscape, the integration of Diversity and Inclusion (D&I) principles is not just a moral imperative but a necessity for creating inclusive, unbiased, and trustworthy technology. Despite this, D&I considerations are frequently overlooked in AI’s design, development, and deployment, leading to numerous incidents of bias. Addressing these issues is essential, and the first step is to identify AI incidents that occurred due to D&I issues and investigate their underlying causes. To achieve this goal, we are working on establishing a set of criteria to effectively identify D&I-related AI incidents. Additionally, we aim to develop strategies that detail the steps and techniques needed to prevent future D&I-related AI incidents and limit their negative impacts. For this purpose, we manually analyzed two AI incident databases, the AI Incident Database (AIID) and the AI, Algorithmic, and Automation Incidents and Controversies (AIAAIC) repository, and developed a set of criteria (a decision tree) to identify D&I-related AI incidents. Based on the decision tree, we developed a prototype mobile application that uses GPT-4’s API to detect D&I-related AI incidents. Its overarching goal is to enhance awareness among researchers and practitioners about the critical need to embed D&I principles comprehensively in AI systems, aiming for more inclusive and equitable technology development and application.
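As an illustration only, the sketch below shows how such a GPT-4-backed screening step might look. The criteria prompt and the `is_di_related` helper are hypothetical stand-ins we invented for this page, not the project’s actual decision tree or app code.

```python
# Hypothetical sketch only: screening an incident description for D&I
# relevance via the OpenAI chat API. The criteria prompt is an illustrative
# placeholder, not the project's actual decision tree.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

CRITERIA_PROMPT = (
    "You screen AI incident reports for diversity and inclusion (D&I) "
    "violations. Consider, for example, whether a protected attribute "
    "(gender, ethnicity, age, language, disability) is implicated and "
    "whether a group was harmed or excluded. Answer 'yes' or 'no', then "
    "give a one-sentence justification."
)

def is_di_related(incident_description: str) -> str:
    """Ask the model whether an incident appears to be D&I-related."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep the screening step as repeatable as possible
        messages=[
            {"role": "system", "content": CRITERIA_PROMPT},
            {"role": "user", "content": incident_description},
        ],
    )
    return response.choices[0].message.content

print(is_di_related(
    "A CV-screening tool systematically ranked applicants with "
    "non-English names lower than comparable applicants."
))
```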

Contact person: Dr. Rifat Ara Shams, Rifat.Shams@data61.csiro.au

Explainable AI (XAI) plays a crucial role in enhancing transparency and providing rational explanations to support users of AI systems. Inclusive AI actively seeks to engage and represent individuals with diverse attributes who are affected by and contribute to the AI ecosystem. Both inclusion and XAI advocate for the active involvement of users and stakeholders during the entire AI system lifecycle. However, the relationship between XAI and Inclusive AI has not been explored. In this paper, we present the results of a systematic literature review exploring this relationship in the recent AI research literature. We identified 18 research articles on the topic. Our analysis focused on approaches to (1) human attributes and perspectives, (2) preferred explanation methods, and (3) human-AI interaction. Based on our findings, we identified potential future XAI research directions and proposed strategies for practitioners involved in the design and development of inclusive AI systems.
Citation: Girard, Amelie, Didar Zowghi, Muneera Bano, and Marian-Andrei Rizoiu. “Inclusive and Explainable AI Systems: A Systematic Literature Review.” (2024). https://scholarspace.manoa.hawaii.edu/items/6353f320-f22a-468d-9983-f79574795a20

Contact person: Professor Didar Zowghi, didar.zowghi@csiro.au

  • AI and Human Reasoning

Context: The advent of AI-driven large language models (LLMs), such as ChatGPT 3.5 and GPT-4, has stirred discussions about their role in qualitative research. Some view these models as tools to enrich human understanding, while others perceive them as threats to the core values of the discipline.

Problem: A significant concern revolves around the disparity between AI-generated classifications and human comprehension, prompting questions about the reliability of AI-derived insights. An “AI echo chamber” risks eroding the diversity inherent in qualitative research, and minimal overlap between AI and human interpretations amplifies concerns about the fading human element in research.

Objective: This study aimed to compare and contrast the comprehension capabilities of humans and LLMs, specifically ChatGPT 3.5 and GPT-4.

Methodology: We conducted an experiment with a small sample of Alexa app reviews, initially classified by a human analyst. ChatGPT 3.5 and GPT-4 were then asked to classify these reviews and provide the reasoning behind each classification. We compared the results with the human classification and reasoning.

Results: The research indicated a significant alignment between human and ChatGPT 3.5 classifications in one-third of cases, and a slightly lower alignment with GPT-4 in just over a quarter of cases. The two AI models showed higher alignment with each other, observed in more than half of the instances. However, consensus across all three was seen in only about one-fifth of the classifications. In comparing human and LLM reasoning, it appears that human analysts lean heavily on their individual experiences, whereas LLMs, as expected, base their reasoning on the specific word choices found in app reviews and the functional components of the app itself.
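To make the alignment figures concrete, here is a minimal sketch, with invented toy labels rather than the study’s data, of how such pairwise and three-way agreement proportions can be computed.

```python
# Invented toy data, not the study's: one way to compute the pairwise and
# three-way agreement proportions reported above.
from itertools import combinations

labels = {
    "human":  ["praise", "bug",   "feature", "praise", "bug"],
    "gpt3.5": ["praise", "bug",   "praise",  "praise", "other"],
    "gpt4":   ["praise", "other", "feature", "praise", "bug"],
}
n = len(labels["human"])

def agreement(a: list[str], b: list[str]) -> float:
    """Fraction of items on which two annotators assign the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

for (name_a, labs_a), (name_b, labs_b) in combinations(labels.items(), 2):
    print(f"{name_a} vs {name_b}: {agreement(labs_a, labs_b):.2f}")

# Items on which all three annotators assign the same label.
consensus = sum(len(set(trio)) == 1 for trio in zip(*labels.values())) / n
print(f"three-way consensus: {consensus:.2f}")
```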

Conclusion: Our results highlight the potential for effective human-LLM collaboration, suggesting a synergistic rather than competitive relationship. Researchers must continuously evaluate LLMs’ role in their work, thereby fostering a future where AI and humans jointly enrich qualitative research.

Citation: Bano, M., Zowghi, D., & Whittle, J. (2023). AI and Human Reasoning: Qualitative Research in the Age of Large Language Models. The AI Ethics Journal, 3(1). https://doi.org/10.47289/AIEJ20240122

  • Opportunities and Challenges 

The recent surge in the integration of Large Language Models (LLMs) like ChatGPT into qualitative research in software engineering, much like in other professional domains, demands closer inspection. This vision paper explores the opportunities of using LLMs in qualitative research to address many of its legacy challenges, as well as the potential new concerns and pitfalls arising from their use. We share our vision for the evolving role of the qualitative researcher in the age of LLMs and contemplate how they may utilize LLMs at various stages of their research.

Citation: Bano, M., Hoda, R., Zowghi, D. et al. Large language models for qualitative research in software engineering: exploring opportunities and challenges. Automated Software Engineering 31, 8 (2024). https://doi.org/10.1007/s10515-023-00407-8

 

Contact person: Muneera Bano, muneera.bano@csiro.au

Human Value Requirements in AI Systems: Empirical Analysis of Amazon Alexa

The necessity of integrating human values such as transparency, privacy, social recognition, and tradition into the Requirements Engineering (RE) process is universally recognized. Despite this, there remains a distinct scarcity of empirical research regarding how to efficiently integrate human values into RE. This deficiency becomes even more significant in the context of Artificial Intelligence (AI) systems, taking into account their considerable societal impact. Neglecting or breaching human values in AI systems could lead to user dissatisfaction, adverse socio-economic implications, and under certain circumstances, societal damage. However, there is a lack of guidance on addressing human values within the RE process for specific contexts of AI system development. 

In this work, we conducted an empirical analysis of the Amazon Alexa app as a case study, examining 1003 user reviews to identify relevant human values and assess the extent to which these values are addressed or ignored in the app. We identified 34 values held by end-users of Amazon Alexa. Among them, only one value (self-discipline) is addressed, while 23 (for example, freedom, equality, and obedience) are ignored in the app. Feedback on the remaining ten values reflected mixed experiences (both addressed and ignored). We developed the following insights through this study; a short code sketch of the underlying coding logic appears after the list.

  1. Using user reviews to identify ignored values: The findings reveal that addressed values appear far less often than ignored values in user reviews, so our approach aids developers in identifying overlooked user values. However, it is still advisable to use traditional requirements elicitation techniques (e.g. surveys, workshops, and interviews) to capture essential human value requirements at the outset.
  2. Prioritizing highly-mentioned values: Our data suggest prioritizing frequently referenced values during the development of voice assistant apps, even though all 34 identified values ultimately need to be addressed.
  3. Adopting a human-centered lens on Non-functional Requirements (NFR): While many identified values mirror NFRs, we aim to analyze NFRs from a human-centered perspective, particularly in AI systems.
  4. Aligning human values with AI Ethical Principles: We plan to integrate ethical challenges in AI development with human values, advancing responsible and inclusive AI.
  5. Creating a human value requirements catalog: We have identified 16 values not covered by standard theories. As they range from general to domain-specific, we propose building a catalog that encompasses both system and human perspectives.
  6. Considering the context of AI Systems for human values: From studying the Alexa app, we’ve identified context-specific user values. We aim to find patterns of human values and dependent variables by analyzing various types of AI apps.

Through this analysis, we have tailored an approach for identifying human values from a specific type of AI system. We assert that our approach holds potential applicability across diverse AI systems and in a wide variety of contexts, presenting valuable guidance for developing human value requirements in values-based AI systems. 
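Here is the promised sketch, with invented example data: each review mention codes a value as addressed or ignored, and a value’s overall status (addressed, ignored, or mixed) follows from the set of codes it received. This is our illustration of the coding logic, not the study’s analysis pipeline.

```python
# Invented example data: a minimal sketch of the value-coding logic
# described above, classifying each value as addressed, ignored, or mixed
# based on the codes its review mentions received.
from collections import defaultdict

# (value, code) pairs as they might come out of manual review coding.
coded_mentions = [
    ("self-discipline", "addressed"),
    ("freedom", "ignored"),
    ("equality", "ignored"),
    ("privacy", "addressed"), ("privacy", "ignored"),  # -> mixed
]

codes_per_value: dict[str, set[str]] = defaultdict(set)
for value, code in coded_mentions:
    codes_per_value[value].add(code)

for value, codes in sorted(codes_per_value.items()):
    status = "mixed" if len(codes) > 1 else next(iter(codes))
    print(f"{value}: {status}")
```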

Citation:

R. A. Shams, M. Bano, D. Zowghi, Q. Lu and J. Whittle, “Human Value Requirements in AI Systems: Empirical Analysis of Amazon Alexa,” 2023 IEEE 31st International Requirements Engineering Conference Workshops (REW), Hannover, Germany, 2023, pp. 138-145, doi: 10.1109/REW57809.2023.00030

Contact person: Dr. Rifat Ara Shams, Rifat.Shams@data61.csiro.au

Scientific research organizations that are developing and deploying Artificial Intelligence (AI) systems are at the intersection of technological progress and ethical considerations. The push for Responsible AI (RAI) in such institutions underscores the increasing emphasis on integrating ethical considerations within AI design and development, championing core values like fairness, accountability, and transparency. For scientific research organizations, prioritizing these practices is paramount not just for mitigating biases and ensuring inclusivity, but also for fostering trust in AI systems among both users and broader stakeholders. In this paper, we explore RAI practices at a research organization, aiming to assess awareness and preparedness regarding the ethical risks inherent in AI design and development. We adopted a mixed-method research approach, utilising a comprehensive survey combined with follow-up in-depth interviews with selected participants from AI-related projects. Our results revealed knowledge gaps concerning ethical, responsible, and inclusive AI, limited awareness of the available AI ethics frameworks, and an overarching underestimation of the ethical risks that AI technologies can present, especially when implemented without proper guidelines and governance. Our findings point to the need for a holistic and multi-tiered strategy to uplift capabilities and better support science research teams in responsible, ethical, and inclusive AI development and deployment.
Citation: Bano, M., Zowghi, D., Shea, P., & Ibarra, G. (2023). Investigating Responsible AI for Scientific Research: An Empirical Study. arXiv preprint arXiv:2312.09561.

Contact person: Professor Didar Zowghi, didar.zowghi@csiro.au

The growing presence of Artificial Intelligence (AI) in various sectors necessitates systems that accurately reflect societal diversity. This study seeks to envision the operationalization of the ethical imperatives of diversity and inclusion (D&I) within AI ecosystems, addressing the current disconnect between ethical guidelines and their practical implementation. A significant challenge in AI development is the effective operationalization of D&I principles, which is critical to prevent the reinforcement of existing biases and ensure equity across AI applications. This paper proposes a vision for a framework: a tool that uses persona-based simulation with Generative AI (GenAI) to represent the needs of diverse users in the requirements analysis process for AI software. The proposed framework is expected to lead to a comprehensive persona repository with diverse attributes that inform the development process with detailed user narratives. This research contributes to the development of an inclusive AI paradigm that ensures future technological advances are designed with a commitment to the diverse fabric of humanity.
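Purely as a speculative illustration of the envisioned persona repository, the sketch below shows what a persona entry and a GenAI elicitation prompt built from it might look like; the `Persona` fields and prompt wording are our assumptions, not the paper’s design.

```python
# Speculative sketch: what an entry in the envisioned persona repository
# might look like, and how it could seed a GenAI elicitation prompt. The
# field names and prompt wording are assumptions, not the paper's design.
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    age: int
    language: str
    accessibility_needs: list[str] = field(default_factory=list)
    narrative: str = ""

    def elicitation_prompt(self, feature: str) -> str:
        """Frame a requirements question from this persona's point of view."""
        needs = ", ".join(self.accessibility_needs) or "none"
        return (
            f"You are {self.name}, aged {self.age}, primary language "
            f"{self.language}, accessibility needs: {needs}. {self.narrative} "
            f"What would you need from this AI feature to use it comfortably? "
            f"Feature: {feature}"
        )

amina = Persona(
    name="Amina", age=67, language="Arabic",
    accessibility_needs=["screen reader"],
    narrative="You rarely use smartphones and are wary of automated decisions.",
)
print(amina.elicitation_prompt("a voice assistant that books medical appointments"))
```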

Workshop Link: https://conf.researchr.org/details/icse-2024/raie-2024-papers/6/A-Vision-for-Operationalising-Diversity-and-Inclusion-in-AI 

Citation: Bano, M., Zowghi, D., & Gervasi, V. (2023). A Vision for Operationalising Diversity and Inclusion in AI. Responsible AI Engineering Workshop at ICSE’24. arXiv preprint arXiv:2312.06074.

As Artificial Intelligence (AI) permeates many aspects of society, it brings numerous advantages while at the same time raising ethical concerns and potential risks, such as perpetuating inequalities through biased or discriminatory decision-making. To develop AI systems that cater for the needs of diverse users and uphold ethical values, it is essential to consider and integrate diversity and inclusion (D&I) principles throughout AI development and deployment. Requirements engineering (RE) is a fundamental process in developing software systems, eliciting and specifying relevant needs from diverse stakeholders. This research addresses the lack of research and practice on how to elicit and capture D&I requirements for AI systems. We conducted comprehensive data collection and synthesis from a literature review to extract requirements themes related to D&I in AI. We proposed a tailored user story template to capture D&I requirements and conducted focus group exercises using the themes and the template to write D&I requirements for two example AI systems. Additionally, we investigated the capability of our solution by generating synthetic D&I requirements, captured in user stories, with the help of a Large Language Model.
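As a rough illustration of what such a tailored user story template might look like (the “diversity attribute” slot reflects our reading of the paper, not its exact wording):

```python
# Illustrative D&I-flavoured user story template; the slot names are our
# assumption, not the paper's exact template.
DI_USER_STORY = (
    "As a {user_role} with {diversity_attribute}, "
    "I want {capability}, "
    "so that {benefit}."
)

print(DI_USER_STORY.format(
    user_role="job applicant",
    diversity_attribute="a non-native accent",
    capability="the speech interface to understand my pronunciation",
    benefit="my application is assessed on merit rather than accent",
))
```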

Citation: Bano, M., Zowghi, D., Gervasi, V., & Shams, R. (2023). AI for All: Operationalising Diversity and Inclusion Requirements for AI Systems. arXiv preprint arXiv:2311.14695.

Contact person: Muneera Bano, muneera.bano@csiro.au

Contact for Collaboration

Senior Research Scientist, Team Co-Lead - AI Diversity and Inclusion