AI Apology: An Apologetic Approach to Socially Responsible Agents
When humans and machines collaborate, their behaviours may become misaligned, and the machine may cause harm to the user. This is another aspect of situational awareness: the machine needs to be aware of this misalignment and the potential for harm, and an apology on the part of the machine may be needed to maintain or even restore trust.
This is the topic addressed by Hadassah Harland, a CSIRO R+ top-up scholarship PhD student with Deakin University (supervised by Prof Richard Dazeley, Prof Peter Vamplew, Dr Bahareh Nakisa, Dr Francisco Cruz and Dr Hashini Senaratne). More specifically, Hadassah is designing an apologetic framework for AI and robot agents. Her work also includes developing a tool that enables machine agents to communicate to humans their awareness of misaligned behaviour or harm caused during human-machine collaborative tasks, and to repair trust.
Project Outcomes:
- 20 Dec 2024: The proposed AI Apology framework and a critical review of apology in AI systems were made publicly available (currently under review)
- 10 Dec 2024: Hadassah presented a workshop paper titled “Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI” at the NeurIPS conference