NIST Releases Two Guidance Documents for Developing Trustworthy AI
On March 16 and 17, 2022, the National Institute of Standards and Technology (NIST) released two documents as part of its effort to establish a voluntary framework for developing trustworthy and responsible artificial intelligence ("AI"). These publications reflect NIST's continued work toward developing the technical standards and related tools needed to measure, evaluate, and enhance the trustworthiness of AI systems.
First, NIST released a special publication outlining standards for identifying and managing bias in AI, titled "Towards a Standard for Identifying and Managing Bias in Artificial Intelligence" (Bias Guidance). The Bias Guidance does not purport to offer a final solution for addressing AI bias; rather, it is intended to "surface the salient issues in the challenging area of AI bias and to provide a first step on the roadmap for developing detailed socio-technical guidance for identifying and managing AI bias."
Second, NIST released for public comment a draft of the AI Risk Management Framework (RMF) that it has been developing with the input of stakeholders since the summer of 2021. A workshop on the RMF will be held March 29-31, 2022, and comments on the draft RMF will be accepted by NIST until April 29, 2022. The agency anticipates release of a second draft and further workshops later this year. The final version is expected in January 2023.
NIST Bias Guidance
NIST's special publication on bias in AI takes a broad view of the different ways in which bias in AI may arise. Specifically, NIST identifies three categories of potential bias:
- 1. Systemic Bias: Defined as biases that "result from procedures and practices of particular institutions that operate in ways which result in certain social groups being advantaged or favored and others being disadvantaged or devalued." As explained by NIST, systemic biases can be present in datasets used to train AI systems.
- 2. Statistical and Computational Biases: Defined as "biases that are the result of sample data not being representative of the population in fact." NIST explains that these biases "often arise when algorithms are trained on one type of data and cannot extrapolate beyond those data." A minimal check for this kind of unrepresentative sampling is sketched after this list.
- 3. Human Bias: Defined as "systematic errors in human thought based on a limited number of heuristic principles and predicting values to simpler judgmental operations." NIST explains that such biases are "a fundamental part of the human mind," and can often be helpful in making a decision or filling in missing or unknown information. However, because these biases are implicit, "simply increasing awareness of bias does not ensure control over it."
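To make the second category more concrete, the short Python sketch below checks whether the group composition of a hypothetical training sample matches a reference population and flags under-represented groups. It is only an illustration of the kind of representativeness check NIST's definition implies; the function, group labels, and 5 percent tolerance are assumptions for illustration, not anything prescribed in the Bias Guidance.

```python
# Hypothetical sketch: flag under-represented groups in training data,
# one narrow slice of the "statistical and computational bias" NIST describes.
from collections import Counter

def representation_gaps(sample_groups, population_shares, tolerance=0.05):
    """Compare group shares in a training sample against reference population shares.

    sample_groups: list of group labels, one per training record (assumed available).
    population_shares: dict mapping group label -> expected share of the population.
    Returns groups whose sample share falls short of the reference by more than `tolerance`.
    """
    counts = Counter(sample_groups)
    total = sum(counts.values())
    gaps = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if expected - observed > tolerance:
            gaps[group] = {"expected": expected, "observed": round(observed, 3)}
    return gaps

# Example: group B makes up 30% of the reference population but only 10% of the sample.
sample = ["A"] * 90 + ["B"] * 10
print(representation_gaps(sample, {"A": 0.7, "B": 0.3}))
# -> {'B': {'expected': 0.3, 'observed': 0.1}}
```

A check like this addresses only dataset composition; as NIST emphasizes, it would not by itself capture systemic or human biases, which is why the agency pairs such technical measures with the socio-technical approach described below.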
NIST points out that when addressing bias in AI, many developers "default to overly technical solutions" that do not "adequately capture the societal impact of AI systems." To remedy this narrow conception of bias in AI, NIST recommends that AI developers adopt a "socio-technical approach" to development, which "takes into account the values and behavior modeled from the datasets, the humans who interact with them, and the complex organizational factors that go into their commission, design, development, and ultimate deployment."
As part of adopting a socio-technical approach, NIST recommends a number of steps that AI developers can take to identify and remediate bias. Specifically, NIST recommends conducting AI impact assessments, engaging a variety of diverse stakeholders, and incorporating concepts of human-centered design throughout the development and deployment process.
NIST also recommends a number of governance principles for combatting bias in AI, including: continual monitoring for bias impacts, providing feedback channels, developing written policies and procedures, and attributing specific accountability for development of AI. Governance focuses not only on the technical aspects of AI systems but also on "organizational processes and cultural competencies that directly impact the individuals involved in training, deploying and monitoring" AI systems.
In particular, the following recommendations are actionable steps organizations should implement to mitigate potential bias in AI systems and tools used within their enterprise:
- 1. Monitoring: The deployment of additional systems that monitor for potential bias issues and can alert the proper personnel when potential problems are detected (a simplified example of such a check is sketched after this list).
- 2. Recourse/Feedback Channels: Systems that allow end users to flag incorrect or potentially harmful results and seek recourse for errors or harms.
- 3. Policies and Procedures: Internal written policies and procedures that address key roles, responsibilities, and processes at all stages of the AI model lifecycle. Policies may:
- a. Define key terms and concepts related to AI systems and the scope of their intended impact;
- b. Address the use of sensitive or potentially risky data;
- c. Detail standards for experimental design, data quality, and model training;
- d. Outline how the risk of bias should be mapped and measured and according to what standards;
- e. Detail processes for model testing and validation;
- f. Detail the process of review by legal or risk functions;
- g. Set forth the periodicity and depth of ongoing auditing and review;
- h. Outline requirements for change management; and
- i. Detail any plans related to incident response for such systems in the event that any significant risks do materialize during development.
- 4. Documentation: The development of standardized, model documents that contain interpretable descriptions of system mechanisms that enable oversight personnel to make informed, risk-based decisions about the system's potential bias.
- 5. Accountability: Ensuring that a specific team or individual is responsible for bias management in AI systems.
- 6. Risk Mitigation, Tiering, and Incentives: Organizations should acknowledge that risk mitigation, not risk avoidance, is often the most effective factor in managing risks of AI bias. Specifically, organizations should acknowledge that incidents can and will occur and should emphasize practical detection and mitigation once they do.
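As a concrete, simplified illustration of the Monitoring recommendation above, the sketch below compares positive-outcome rates across groups in a batch of recent model decisions and logs an alert when one group's rate falls below a chosen fraction of the highest rate. The 0.8 cutoff, the group labels, and the data format are assumptions made for illustration; any real threshold should come from organizational policy and legal review rather than from this rule of thumb.

```python
# Hypothetical monitoring sketch: compare positive-outcome rates across groups
# in recent model decisions and alert when the ratio drops below a threshold.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bias-monitor")

def selection_rates(decisions):
    """decisions: list of (group_label, was_positive_outcome) tuples."""
    totals, positives = {}, {}
    for group, positive in decisions:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if positive else 0)
    return {g: positives[g] / totals[g] for g in totals}

def check_disparity(decisions, threshold=0.8):
    """Log a warning if any group's positive-outcome rate is below `threshold`
    times the highest group rate (an illustrative cutoff, not a legal standard)."""
    rates = selection_rates(decisions)
    best = max(rates.values())
    for group, rate in rates.items():
        if best > 0 and rate / best < threshold:
            log.warning("Potential disparity: group %s rate %.2f vs max %.2f", group, rate, best)

# Example: group B receives positive outcomes far less often than group A.
recent = [("A", True)] * 60 + [("A", False)] * 40 + [("B", True)] * 20 + [("B", False)] * 80
check_disparity(recent)
```

In practice, a check like this would be one signal feeding the recourse, documentation, and accountability processes NIST describes, not a stand-alone determination that a system is or is not biased.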
NIST Draft Risk Management Framework
On March 17, 2022, NIST published an initial draft of its Artificial Intelligence RMF for addressing risks in the design, development, use, and evaluation of AI products and services across a wide spectrum of AI system types, applications, and maturity levels (we have discussed NIST's previous efforts regarding trustworthy AI here). The RMF is intended to provide a voluntary, flexible, structured, and measurable process to address AI risks throughout the AI lifecycle and "offer guidance for the development and use of trustworthy and responsible AI."
The draft RMF is not a "checklist" or compliance mechanism to be used in isolation; rather, it is intended to be considered alongside other critical risk mitigation practices, yielding a more integrated outcome and organizational efficiencies that reduce or mitigate potential risks associated with using AI. To that end, the draft RMF is designed to be risk-based and resource efficient.
In line with this approach, the draft RMF does not prescribe specific risk thresholds for AI systems. Rather, NIST suggests that "risk thresholds should be set through policies and norms that can be established by AI system owners, organizations, industries, communities, or regulators."
The draft RMF uses a three-class taxonomy to classify characteristics that should be considered when identifying and managing risk related to AI systems:
- 1. Technical Characteristics "refer to factors that are under the direct control of AI system designers and developers" and include:
- a. Accuracy: The degree to which the ML model correctly captures a relationship that exists within the training data;
- b. Reliability: Whether a model consistently generates the same results, within the bounds of acceptable statistical error;
- c. Robustness: Whether the model has minimal sensitivity to variations in uncontrollable factors; and
- d. Resilience: The extent to which an AI model can withstand adversarial attacks or, more generally, unexpected changes in its environment or use.
- 2. Socio-Technical Characteristics "refer to how AI systems are used and perceived in individual, group, and societal contexts," and include:
- a. Explainability: Providing a programmatic, sometimes causal, description of how model predictions are generated;
- b. Interpretability: The meaning of an algorithm's output in the context of its designed functional purpose;
- c. Privacy: Generally, the norms and practices that help to safeguard values such as human autonomy and dignity;
- d. Safety: An absence (or minimization) of failures or conditions that render a system dangerous; and
- e. Managing Bias: Programs that manage the three categories of bias identified in NIST's Bias Guidance.
- 3. Guiding Principles refer to broader societal norms and values that indicate societal priorities and include:
- a. Fairness: The absence of harmful bias, which generally should be determined using both technical and broader societal considerations;
- b. Accountability: The idea that individual humans and their organizations should be answerable and held accountable for the outcomes of AI systems, particularly adverse impacts stemming from risks; and
- c. Transparency: The extent to which information is available to a user when interacting with an AI system.
The draft RMF organizes AI risk management activities into high-level functions, each describing outcomes and actions that enable organizations to better manage AI risks:
- Mapping: Context is established and the system's capabilities, targeted usage, goals, and expected benefits and costs are classified; risks and potential harms are identified from individual, organizational, and societal perspectives.
- Measuring: Risks are assessed, analyzed, and tracked; systems are evaluated; and feedback is gathered.
- Managing: Potential harms and results are assessed and actions are prioritized to maximize benefits and minimize harm; findings are communicated to internal and external stakeholders as appropriate.
- Governing: Clear and transparent policies, processes, procedures, and practices are implemented for the development, testing, deployment, use, and auditing of AI systems. Teams and individuals are empowered, responsible, and trained to manage the risks of AI systems, and diversity, equity, inclusion, accessibility, and cultural considerations from potentially impacted individuals and communities are fully considered.
Bias Minimization and Interactions With Other Elements of Trustworthy AI
Although the Bias Guidance and draft RMF share a number of concepts and recommendations, the Bias Guidance notably indicates that reducing bias in AI may come into tension with other components of trustworthy AI identified in the draft RMF, namely accuracy, explainability, and the interaction between human and AI decision making.
First, while both documents recognize that the accuracy of AI models is one of the primary components of trustworthy AI, the Bias Guidance also explains that achieving high accuracy may not be the preferred objective if the resulting system produces biased outcomes. NIST explains: "it is possible to mathematically address statistical bias in a dataset, then develop an algorithm which performs with high accuracy, yet produce outcomes that are harmful to a social class and diametrically opposed to the intended purpose of the AI system." To address these competing objectives, NIST recommends that developers consider the broader context of development and deployment of AI systems to detect potential bias.
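A small, synthetic illustration of this tension: in the sketch below, a hypothetical classifier reaches roughly 96 percent overall accuracy, yet its false positives are concentrated almost entirely in a minority group. The groups, labels, and counts are invented purely to show how an aggregate accuracy figure can mask group-level harm; nothing in the example comes from NIST's documents.

```python
# Hypothetical illustration: a classifier with high overall accuracy can still
# produce sharply worse error rates for one group.
def rates(outcomes):
    """outcomes: list of (group, actual_label, predicted_label) with labels 0/1."""
    stats = {}
    for group, actual, predicted in outcomes:
        s = stats.setdefault(group, {"n": 0, "correct": 0, "neg": 0, "false_pos": 0})
        s["n"] += 1
        s["correct"] += int(actual == predicted)
        if actual == 0:
            s["neg"] += 1
            s["false_pos"] += int(predicted == 1)
    return stats

# Synthetic decisions: group A is 90% of the data and always classified correctly;
# group B is 10% of the data and suffers a much higher false positive rate.
data = ([("A", 0, 0)] * 850 + [("A", 1, 1)] * 50 +
        [("B", 0, 1)] * 40 + [("B", 0, 0)] * 40 + [("B", 1, 1)] * 20)
stats = rates(data)
overall = sum(s["correct"] for s in stats.values()) / sum(s["n"] for s in stats.values())
print(f"Overall accuracy: {overall:.0%}")  # 96%
for g, s in stats.items():
    print(f"Group {g} false positive rate: {s['false_pos'] / s['neg']:.0%}")
# Group A: 0%, Group B: 50% -- high aggregate accuracy, harm concentrated in group B.
```

This is the dynamic behind NIST's recommendation to evaluate systems against the broader context of deployment rather than against aggregate accuracy alone.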
NIST has also emphasized that explainability is a key component of trustworthy AI. That said, NIST acknowledged in its Bias Guidance that simpler AI models, which tend to be more transparent and explainable, can actually "exacerbate statistical biases because restrictive assumptions on the training data often do not hold with nuanced demographics."
Finally, the Bias Guidance highlights a tension with the notion that a "human (expert or otherwise) can effectively and objectively oversee the use of algorithmic decision systems," in light of the fact that "[h]umans carry their own significant cognitive biases and heuristics into the operation of AI systems …." As a result, NIST is seeking "to develop formal guidance about how to implement human-in-the-loop processes that do not amplify or perpetuate the many human, systemic and computational biases that can degrade outcomes in this complex setting."
As this demonstrates, developers of AI systems may have difficulty balancing the various components of NIST's draft RMF and Bias Guidance when developing and deploying AI systems. DWT's AI team will continue to monitor the workshops and NIST's upcoming second draft of the RMF.