Vulnerabilities and risks shall be examined via threat modeling for Data Entry Point Attacks
- Reasoning: Threat modeling helps address and quantify impacts of attacks
- Note: Threat modeling should involve the Cyber Security Expert who is versed in AI security
Regular reviews shall be established for potential attacks and risks related to Data Entry Point Attacks
Events and anomalies related to Data Entry Point Attacks shall be collected and analyzed
A procedure shall be established to classify incidents, analyze impacts, and investigate incidents of Data Entry Point Attacks
A procedure shall be established for containment and communication to stakeholders about incidents of Data Entry Point Attacks
Model access shall be rate limited
- Reasoning: A high rate of access to the model can be used strategically by attackers to launch effective Model Inversion, Model Extraction, and Privacy-related Attacks
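As an illustration of the rate-limiting requirement above, the following is a minimal sketch of a per-client token bucket placed in front of a model endpoint. The class name `ModelRateLimiter`, the `client_id` key, and the capacity and refill values are assumptions chosen for illustration; they are not prescribed by this document, and a real deployment would typically enforce limits at the API gateway as well.

```python
import time


class ModelRateLimiter:
    """Hypothetical per-client token bucket for model queries.

    capacity    -- maximum burst size (tokens) per client
    refill_rate -- tokens restored per second
    clock       -- injectable time source, so the limiter can be tested
    """

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self.clock = clock
        self._buckets = {}  # client_id -> (tokens, time of last request)

    def allow(self, client_id):
        """Return True if the client may query the model now."""
        now = self.clock()
        tokens, last = self._buckets.get(client_id, (self.capacity, now))
        # Refill in proportion to the time elapsed, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[client_id] = (tokens, now)
        return allowed


# Usage with a fake clock: a burst of 3 is allowed, then requests are refused
# until enough time has passed for tokens to refill.
fake_time = [0.0]
limiter = ModelRateLimiter(capacity=3, refill_rate=1.0, clock=lambda: fake_time[0])
burst = [limiter.allow("client-a") for _ in range(5)]
fake_time[0] = 2.0
after_wait = limiter.allow("client-a")
```

Keeping one bucket per client means a single high-volume caller is throttled without affecting others; the limit values themselves should come out of the threat modeling exercise rather than from this sketch.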
Data Entry Point Attacks shall be tested via penetration testing or red teaming for an algorithm
- Reasoning: Red teaming helps address and quantify likelihood of attacks
All known Data Entry Point Attacks specific to the model type should be considered during penetration testing or red teaming
- Reasoning: Different attacks are more severe towards different model architectures and complexities.
Model developers, designers, and researchers shall understand the tradeoffs and risks related to Technical Defenses if they are implemented
- Reasoning: There are tradeoffs inherent in the defense (e.g., Differential Privacy) and/or among multiple defenses (e.g., the effect of reducing overfitting as a defense against Membership Inference vs. against Model Extraction)
The model shall be examined for vulnerabilities associated with Trojaning and Backdooring prior to training
- Reasoning: Trojaning and Backdooring risks arise from the internals (e.g., “sleeper agent” neurons) of the model.
- Note: The phrasing is designed to be agnostic about model source, including:
- Models built by the entity
- Models built by a Third Party
- Models built by the entity on top of an open-source model (e.g., via Transfer Learning)
The training dataset and training labels shall be examined for vulnerabilities associated with Data Poisoning prior to training
- Reasoning: Data Poisoning risks arise from adversarial inputs present in the features or labels of the dataset. In addition, training data can be accessed by attackers in between training periods (e.g., via insecure file storage or untrusted accounts).
- Note: The phrasing is designed to be agnostic about training data source, including:
- Training dataset and training labels obtained by the entity
- Training dataset and training labels obtained by a Third Party (e.g., a data labeling service)
- Open-source datasets with labels already prepared
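Two basic pre-training checks that fit the requirement above can be sketched as follows: a fingerprint that reveals whether a dataset changed between training periods, and a scan for identical feature rows that carry different labels. The function names and the list-of-lists data layout are assumptions for illustration. These checks only catch crude tampering and label flipping, so they complement rather than replace a dedicated Data Poisoning analysis.

```python
import hashlib
import json
from collections import defaultdict


def dataset_fingerprint(features, labels):
    """SHA-256 digest over a canonical JSON serialization of the dataset.

    Store the digest after a vetted training run and compare it before the
    next run; a mismatch means the data changed in between.
    """
    payload = json.dumps({"features": features, "labels": labels}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def conflicting_labels(features, labels):
    """Return groups of row indices with identical features but different labels."""
    rows_by_features = defaultdict(list)
    for index, row in enumerate(features):
        rows_by_features[tuple(row)].append(index)
    return [
        indices
        for indices in rows_by_features.values()
        if len({labels[i] for i in indices}) > 1
    ]


# Usage: row 2 duplicates row 0 but carries a different label.
features = [[1, 2], [3, 4], [1, 2]]
labels = [0, 1, 1]
baseline = dataset_fingerprint(features, labels)
suspect_groups = conflicting_labels(features, labels)

# Flipping one label changes the fingerprint.
tampered = dataset_fingerprint(features, [0, 0, 1])
```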
Detection of malicious intent when accessing the model should be implemented
- Reasoning: Model queries can be used strategically by attackers to gain knowledge about the model and/or training data. There is some evidence that query patterns of malicious actors are different from those of normal users, and can be used to infer intent.
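One way to act on query patterns, shown here only as a simplified heuristic, is to flag clients whose queries cluster unusually tightly, since probing a model with small perturbations of the same input produces many near-duplicate queries. The function names, the Euclidean distance metric, and the `radius` and `max_fraction` thresholds are all assumptions for illustration; suitable features and thresholds depend on the model and its normal traffic.

```python
import math


def near_duplicate_fraction(queries, radius):
    """Fraction of queries (after the first) within `radius` of an earlier query."""
    if len(queries) < 2:
        return 0.0
    close = 0
    for i in range(1, len(queries)):
        # Compare each query against every earlier query from the same client.
        if any(math.dist(queries[i], earlier) <= radius for earlier in queries[:i]):
            close += 1
    return close / (len(queries) - 1)


def looks_suspicious(queries, radius=0.05, max_fraction=0.5):
    """Hypothetical rule: flag a client if most queries are near-duplicates."""
    return near_duplicate_fraction(queries, radius) > max_fraction


# Usage: tightly clustered probing versus widely spread queries.
probing = [[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [0.02, 0.0]]
ordinary = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [5.0, 5.0]]
probing_flagged = looks_suspicious(probing)
ordinary_flagged = looks_suspicious(ordinary)
```

A flag from a rule like this is a signal for review or for tighter rate limits, not proof of malicious intent.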
One or more defenses against Data Access Attacks shall be implemented
One or more defenses against Model Poisoning shall be implemented
One or more defenses against Model Evasion should be implemented
- Reasoning: Although Model Evasion is heavily studied in AML, there
One or more defenses against Attacks Related to Privacy shall be implemented
Differential Privacy should be implemented for any analytics that contain Sensitive Personal Data
Differential Privacy should be implemented in models trained on datasets that contain Sensitive Personal Data
The choice of parameters shall be explained if Differential Privacy is used in analytics or in training the model
- Reasoning: These parameters present a tradeoff between utility (e.g., the accuracy of a model or the true statistic of a dataset) and level of privacy.
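The utility versus privacy tradeoff can be made concrete with the Laplace mechanism applied to a counting query, where the privacy parameter epsilon sets the noise scale (sensitivity divided by epsilon). The sketch below is illustrative only: the function names, the toy dataset, and the epsilon values are assumptions, and the floating-point sampling used here is not hardened for production use.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Release a count via the Laplace mechanism.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(epsilon, trials=2000, seed=0):
    """Average absolute error of the private count over repeated releases."""
    rng = random.Random(seed)
    values = list(range(100))  # toy dataset; exactly 50 values satisfy v < 50
    true_count = 50
    errors = [
        abs(private_count(values, lambda v: v < 50, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / trials


# Smaller epsilon gives stronger privacy but a noisier, less useful statistic.
error_strong_privacy = mean_abs_error(epsilon=0.1)
error_weak_privacy = mean_abs_error(epsilon=1.0)
```

Because the expected absolute error of Laplace noise equals its scale, the error is about 1/epsilon for this query, which is the kind of quantitative justification an explanation of parameter choice can include.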