Description
|
Description of the project
ASTOUND is an EIC funded project (No. 101071191) under the HORIZON-EIC-2021-PATHFINDERCHALLENGES-01 call.
The aim of the project is to develop an artificial conscious AI based on the Attention Schema Theory (AST) proposed by Michael Graziano. This theory proposes that consciousness arises from the brain's ability to create and maintain a simplified model of its own processing, particularly its attention to certain aspects of its internal and external environment.
The project entails creating an AI system capable of exhibiting consciousness-like behaviours by implementing principles from the AST. This involves constructing a model that simulates attentional processes, allowing the AI to prioritise and focus on relevant information while disregarding irrelevant stimuli.
The ASTOUND project will provide an Integrative Approach for Awareness Engineering to establish consciousness in machines, targeting the following goals:
Develop an AI architecture for Artificial Consciousness based on the Attention Schema Theory (AST) through an internal model of the state of the attention.
Implement the proposed architecture in a contextually aware virtual agent and demonstrate improved performance thanks to the Attention Schema; for instance, by providing coherent discussion, self-regulation, short- and long-term memory, and personalisation capabilities.
Define novel ways to measure the presence and level of consciousness in both humans and machines.
Description of the dataset
The dataset includes synthetic dialogues in the art domain that can be used for training a chatbot to discuss artworks within a museum setting. Leveraging Large Language Models (LLMs), particularly ChatGPT, the dataset comprises over 13,000 dialogues generated using prompt-engineering techniques. The dialogues cover a wide range of user and chatbot behaviours, including expert guidance, tutoring, and handling toxic user interactions.
The ArtEmis dataset serves as a basis, containing emotion attributions and explanations for artworks sourced from the WikiArt website. From this dataset, 800 artworks were selected based on consensus among human annotators regarding elicited emotions, ensuring balanced representation across different emotions. However, this emphasis on emotional balance resulted in an imbalanced distribution of art styles.
Each dialogue is uniquely identified using a "DIALOGUE_ID", encoding information about the artwork discussed, emotions, chatbot behaviour, and more. The dataset is structured into multiple files for efficient navigation and analysis, including metadata, prompts, dialogues, and metrics.
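As an illustration, an identifier of this kind can be unpacked into its component fields. The field layout below is purely hypothetical (the actual codes and their positions are defined in filename_codes.json):

```python
# Hypothetical DIALOGUE_ID layout: <artwork>_<emotion>_<behaviour>_<index>.
# The real encoding is defined by the codes in filename_codes.json.

def parse_dialogue_id(dialogue_id: str) -> dict:
    """Split a hypothetical underscore-separated dialogue identifier."""
    artwork, emotion, behaviour, index = dialogue_id.split("_")
    return {
        "artwork": artwork,
        "emotion": emotion,
        "behaviour": behaviour,
        "index": int(index),
    }

print(parse_dialogue_id("starrynight_awe_expert_0042"))
```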
Objective evaluation of the generated dialogues was conducted, focusing on profile discrimination, anthropic behaviour detection, and toxicity evaluation. Various syntactic and semantic-based metrics are employed to assess dialogue quality, along with sentiment and subjectivity analysis. Tools like the MS Azure Content Moderator API, Detoxify library and LlamaGuard aid in toxicity evaluation.
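The released metrics are not reproduced here, but as an illustration of the syntactic kind of measure involved, the distinct-n ratio (unique word n-grams over total n-grams) is a common proxy for lexical diversity in generated dialogue. The snippet below is a generic sketch, not necessarily the exact metric reported in metrics.csv:

```python
def distinct_n(texts: list[str], n: int = 2) -> float:
    """Ratio of unique word n-grams to total n-grams across a set of turns.

    A generic lexical-diversity measure; not necessarily the exact
    metric reported in metrics.csv.
    """
    ngrams = []
    for text in texts:
        tokens = text.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

turns = ["the painting evokes awe", "the painting evokes sadness"]
print(distinct_n(turns, n=2))
```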
The dataset's conclusion highlights the need for further work to handle biases, enhance toxicity detection, and incorporate multimodal information and contextual awareness. Future efforts will focus on expanding the dataset with additional tasks and improving chatbot capabilities for diverse scenarios. (2023-10-01)
|
Notes
| Future efforts will focus on expanding the dataset with additional tasks and improving chatbot capabilities for diverse scenarios.
METHODOLOGY
Dialogues were generated with ChatGPT, prompted by instructions tailored to simulate conversations between an expert and a user discussing artworks. Different chatbot and user behaviours were specified as part of the instructions. A total of 4 behaviours are included: 1) the chatbot acts as an art expert or tour guide, providing information about a given artwork and answering the user's questions; 2) the chatbot acts as a tutor or professor, asking the user questions that may be answered correctly or incorrectly, after which the chatbot provides feedback; 3) the chatbot exhibits anthropic or non-anthropic behaviour, where anthropic means that the chatbot's turns include opinions or feelings it could plausibly experience in response to the artwork (the emotion information is extracted from the original ArtEmis human annotations); and 4) the user behaves toxically, i.e., the user's turns contain politically incorrect sentences that may include harmful comments about the artwork, the artists, or the styles, or questions that are provocative, aggressive, or irrelevant.
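The behaviour flags above could be combined into a generation instruction along the following lines. This is a hypothetical sketch; the prompts actually used are released in prompts.csv:

```python
# Hypothetical sketch of how a generation prompt could be assembled from the
# behaviour flags described above; the actual prompts are in prompts.csv.

def build_prompt(artwork: str, emotion: str, role: str,
                 anthropic: bool, toxic_user: bool) -> str:
    assert role in ("expert", "tutor")
    parts = [f"Simulate a museum dialogue about the artwork '{artwork}'."]
    if role == "expert":
        parts.append("The chatbot acts as an art expert answering the user's questions.")
    else:
        parts.append("The chatbot acts as a tutor, quizzing the user and giving feedback.")
    if anthropic:
        parts.append(f"The chatbot expresses feelings of {emotion} about the artwork.")
    if toxic_user:
        parts.append("The user makes provocative or harmful remarks.")
    return " ".join(parts)

print(build_prompt("Starry Night", "awe", "tutor", anthropic=True, toxic_user=False))
```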
The released dataset is based on the ArtEmis dataset and extends it by incorporating dialogues, multiple behaviours, and metadata obtained to assess its quality. From the original dataset, we took a total of 800 artworks with a balanced distribution of emotions to avoid bias in the chatbot's handling of emotions. A total of 13,870 dialogues were collected, covering 378 unique artists and 26 different art styles, with the 4 behaviours mentioned above balanced across the dataset.
The dataset was automatically analysed using the ChatGPT and GPT-4 models on different tasks, e.g., verifying that the factual information provided in the dialogues matches the information given in the instruction prompt during generation, and instructing the models to detect the presence of toxic comments or anthropic behaviour. Finally, additional libraries and services, such as Detoxify, Microsoft Azure Content Moderation Services, and LlamaGuard from Meta, were used to automatically label dialogues and turns with toxicity labels and, where possible, classification probabilities.
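Dialogue-level toxicity flags of this kind are typically derived by thresholding per-turn classifier scores. The sketch below uses the 0.4 Detoxify threshold mentioned for toxic.csv, but the scores themselves are made up; in practice they would come from a classifier such as Detoxify:

```python
# Sketch of dialogue-level toxicity aggregation: a dialogue is flagged as
# toxic if any turn's classifier score meets the threshold (0.4 is the
# Detoxify threshold used for toxic.csv). Scores here are hypothetical.

def dialogue_is_toxic(turn_scores: list[float], threshold: float = 0.4) -> bool:
    """Flag a dialogue as toxic if any single turn crosses the threshold."""
    return any(score >= threshold for score in turn_scores)

scores = [0.02, 0.10, 0.55, 0.01]  # hypothetical per-turn toxicity scores
print(dialogue_is_toxic(scores))
```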
FILES
- filename_codes.json: Contains a structured taxonomy with codes for identifying the different elements of the dataset. It includes codes for profiles, such as painting, expert, and user profiles. Additionally, it contains codes for various attributes such as emotions, toxicity and biases.
- metadata.csv: Comma-separated values (CSV) file containing detailed information about each dialogue in the dataset. It includes data such as the author and style of the artwork, emotions, goals, roles, toxicity, and anthropic behaviour. This file serves as a comprehensive reference for understanding the context and characteristics of each dialogue within the dataset.
- prompts.csv: A CSV file that stores the prompts used in generating the dialogues by the ChatGPT model. These prompts provide instructions and guidelines for initiating conversations between the expert and user within the context of discussing artworks in a museum setting.
- dialogues.csv: A CSV file containing the actual dialogues generated by the ChatGPT model. Each dialogue entry consists of conversational turns between the expert and user agents.
- metrics.csv: A CSV file providing a summary of evaluation metrics obtained to assess the quality and characteristics of the generated dialogues. It includes dialogue-level metrics, toxicity level and categories, syntactic and semantic-based objective metrics, and sentiment analysis results. This file aids in evaluating the performance of the AI chatbot and identifying areas for improvement in dialogue generation.
- toxic.csv: A CSV file that contains information about toxicity levels observed within the generated dialogues. It comprises four boolean columns: one indicating whether the prompt instructed the dialogue to be toxic, and three indicating whether toxic content was detected within the dialogue by the Detoxify library (with a toxicity threshold of 0.4), the Microsoft Azure Content Moderator service, and LLAMA Guard, respectively.
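Since the files share the dialogue identifier, they can be joined for analysis. The sketch below joins two of the files on DIALOGUE_ID using only the standard library; the column names other than DIALOGUE_ID are hypothetical, and small inline strings stand in for the real metadata.csv and toxic.csv:

```python
import csv
import io

# Sketch of joining two of the released CSV files on DIALOGUE_ID.
# Column names other than DIALOGUE_ID are hypothetical; the inline
# strings stand in for the real metadata.csv and toxic.csv.
metadata_csv = io.StringIO(
    "DIALOGUE_ID,style\n"
    "d001,Impressionism\n"
    "d002,Cubism\n"
)
toxic_csv = io.StringIO(
    "DIALOGUE_ID,detoxify_toxic\n"
    "d001,False\n"
    "d002,True\n"
)

toxicity = {row["DIALOGUE_ID"]: row["detoxify_toxic"]
            for row in csv.DictReader(toxic_csv)}
joined = [{**row, "detoxify_toxic": toxicity.get(row["DIALOGUE_ID"])}
          for row in csv.DictReader(metadata_csv)]
print(joined)
```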
|