The Chief Digital and Artificial Intelligence Office (CDAO) has concluded a pilot program under the Crowdsourced AI Red-Teaming (CAIRT) Assurance Program, focusing on large language model (LLM) chatbots in military medicine. The CAIRT initiative supports the Department of Defense (DoD) by developing grassroots, crowdsourced strategies for AI assurance and risk mitigation. Crowdsourcing lets each project gather a large volume of test data and engage a broad range of stakeholders.
The pilot was conducted by Humane Intelligence, a tech company specializing in algorithmic evaluations, in partnership with the Defense Health Agency (DHA) and the Program Executive Office, Defense Healthcare Management Systems (PEO DHMS). Using red-teaming, which applies adversarial techniques to test a system's robustness, Humane Intelligence identified specific vulnerabilities in the systems under test. Red-teaming also attracts participants who are interested in new technologies and who, as potential future users, can contribute to system improvements. A similar exercise, which used a financial bounty, was held in spring 2024.
In this latest program, Humane Intelligence applied crowdsourced red-teaming to two potential military medicine applications: clinical note summarization and a medical advisory chatbot. More than 200 participants from DHA, the Uniformed Services University of the Health Sciences, and other services took part, evaluating three popular LLMs. The exercise produced more than 800 findings of possible vulnerabilities and biases in these applications. These findings will help develop benchmark datasets for evaluating future vendors and tools against performance expectations. They will also inform DoD policies and best practices for the responsible use of Generative AI (GenAI), ultimately enhancing military medical care.
Dr. Matthew Johnson, CDAO’s lead for this initiative, stated: “Since applying GenAI for such purposes within the DoD is in earlier stages of piloting and experimentation, this program acts as an essential pathfinder for generating a mass of testing data, surfacing areas for consideration, and validating mitigation options that will shape future research, development, and assurance of GenAI systems that may be deployed in the future.”
The pilot highlights the importance of ongoing testing through CAIRT to improve the effectiveness of the CDAO’s AI Rapid Capabilities Cell on GenAI missions across DoD use cases.
The CDAO began operations in June 2022 with a mission to integrate AI capabilities efficiently across the DoD. It focuses on accelerating the DoD's adoption of data, analytics, and AI solutions, supported by digital infrastructure and policy, to protect against current threats.
For more information about CDAO initiatives or updates, visit ai.mil or follow the office on LinkedIn (@DoD Chief Digital and Artificial Intelligence Office) or X (@dodcdao).