Data Entry Point Attacks: Open-Source Tools

Non-Technical Tools

Threat Modeling

Adversarial ML Threat Matrix

  • What is it? A knowledge base of adversary tactics, techniques, and case studies for ML systems, based on real-world observations, demonstrations from ML red teams and security groups, and the state of the possible from academic research
  • Who is it for? Security analysts, AI system analysts
  • Features
    • Interactive matrix covering both ML-specific and non-ML-specific threats
    • Case studies, including attacks on real-world ML production systems

Technical Tools

Adversarial ML

Advbox

  • What is it? A toolbox for generating adversarial examples that fool neural networks built with PaddlePaddle, PyTorch, Caffe2, MXNet, Keras, and TensorFlow (see the sketch after this entry)
  • Who is it for? Data scientists, ML/AI security personnel
  • Features
    • Adversarial examples generation
    • Benchmarking tool for ML model robustness
    • CLI to generate adversarial examples with zero coding
  • Language: Python
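
To make concrete what "adversarial example generation" means, the sketch below perturbs an input along the sign of the loss gradient (FGSM). It uses plain PyTorch rather than Advbox's own API, and the toy model, input, and epsilon are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy, untrained classifier standing in for a real model under test.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

x = torch.rand(1, 1, 28, 28, requires_grad=True)  # illustrative input
y = torch.tensor([3])                             # its (assumed) true label

# FGSM: take one step of size eps in the direction that increases the loss.
loss = F.cross_entropy(model(x), y)
loss.backward()
eps = 0.1
x_adv = (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

print("prediction before:", model(x).argmax(dim=1).item())
print("prediction after: ", model(x_adv).argmax(dim=1).item())
```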

Counterfit

  • What is it? Microsoft’s CLI that provides a generic automation layer for assessing the security of ML models
  • Who is it for? Data scientists, ML engineers
  • Frameworks: Azure Cloud Shell, Anaconda Python environment

Adversarial Robustness Toolbox

  • What is it? Tools that enable developers and researchers to defend and evaluate ML models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference (see the example after this entry)
  • Who is it for? Data scientists, ML engineers
  • Frameworks: TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.
  • Features
    • Support for diverse data types: images, tables, audio, video, etc.
    • Support for diverse ML tasks: classification, object detection, speech recognition, generation, certification
  • Language: Python
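
A minimal sketch of an ART evasion attack against an ordinary scikit-learn classifier; the Iris dataset, the LogisticRegression target, and eps=0.3 are illustrative choices, and the calls follow ART's 1.x API.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

# Train an ordinary scikit-learn model on Iris (illustrative target).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap it so ART's attacks can query predictions and gradients.
classifier = SklearnClassifier(model=model, clip_values=(X.min(), X.max()))

# Craft evasion examples with the Fast Gradient Method.
attack = FastGradientMethod(estimator=classifier, eps=0.3)
X_adv = attack.generate(x=X)

print("clean accuracy:      ", (model.predict(X) == y).mean())
print("adversarial accuracy:", (model.predict(X_adv) == y).mean())
```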

TextFooler

  • What is it? A model for natural-language adversarial attacks on text classification and natural language inference models
  • Who is it for? Data scientists, ML engineers
  • Features
    • Benchmarked adversarial attack results on seven datasets against three target models: BERT, LSTM, and CNN
  • Language: Python

CleverHans

  • What is it? An adversarial example library for constructing attacks, building defenses, and benchmarking ML systems’ vulnerabilities to adversarial examples (see the example after this entry)
  • Who is it for? Data scientists, ML engineers
  • Frameworks: JAX, PyTorch, TF2
  • Language: Python
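
A minimal sketch of CleverHans's functional attack API with the PyTorch backend; the untrained toy model, random batch, and attack budgets are illustrative assumptions.

```python
import torch
import torch.nn as nn

from cleverhans.torch.attacks.fast_gradient_method import fast_gradient_method
from cleverhans.torch.attacks.projected_gradient_descent import projected_gradient_descent

# Toy classifier standing in for a trained model under evaluation.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(16, 1, 28, 28)  # a batch of illustrative inputs

# Single-step (FGM) and iterative (PGD) L-infinity evasion attacks.
x_fgm = fast_gradient_method(model, x, eps=0.1, norm=float("inf"))
x_pgd = projected_gradient_descent(
    model, x, eps=0.1, eps_iter=0.01, nb_iter=40, norm=float("inf")
)

print("FGM perturbation max:", (x_fgm - x).abs().max().item())
print("PGD perturbation max:", (x_pgd - x).abs().max().item())
```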

Privacy

Machine Learning Privacy Meter

  • What is it? A tool to quantify the privacy risks of machine learning models with respect to inference attacks, notably membership inference attacks (see the sketch after this entry)
  • Who is it for? Data scientists, ML engineers
  • Features
    • Quantitative assessment of privacy risk
    • Generation of extensive privacy reports on the aggregate and individual risk for data records in the training set, at multiple levels of access to the model
    • Estimation of the amount of information that can be revealed
      • Black-box access: through the predictions of a model
      • White-box access: through both the predictions and parameters of a model
  • Framework: TF2
  • Language: Python
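
To illustrate the kind of black-box risk the tool measures, the sketch below runs a simple confidence-threshold membership inference attack against an overfit scikit-learn model. It is a conceptual example, not the Privacy Meter API; the dataset, model, and threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Train a deliberately overfit model (illustrative target).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

def confidence_on_true_label(model, X, y):
    """Black-box signal: the probability the model assigns to the true class."""
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

# Members (training records) vs. non-members (held-out records).
conf_members = confidence_on_true_label(model, X_train, y_train)
conf_nonmembers = confidence_on_true_label(model, X_test, y_test)

# A simple threshold attack: guess "member" when confidence is high.
threshold = 0.9
guesses = np.concatenate([conf_members, conf_nonmembers]) >= threshold
truth = np.concatenate([np.ones(len(conf_members)), np.zeros(len(conf_nonmembers))])
print("attack accuracy:", (guesses == truth).mean())
```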

Google’s Differential Privacy

  • What is it? Libraries to generate epsilon- and (epsilon, delta)-differentially private statistics over datasets (a conceptual sketch follows this entry)
  • Who is it for? Data scientists, ML engineers
  • Features
    • Privacy on Beam, Google’s end-to-end differential privacy framework built on Apache Beam
    • A stochastic tester, used to help catch regressions that could make the differential privacy property no longer hold
    • Differential privacy accounting library
    • CLI for running differentially private SQL queries with ZetaSQL
  • Language: C++, Go, Java
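
The library itself is written in C++, Go, and Java; the short NumPy sketch below only illustrates the underlying idea of an epsilon-differentially private statistic, a count released through the Laplace mechanism, and is not the library's API.

```python
import numpy as np

def dp_count(values, epsilon, rng=None):
    """Epsilon-DP count via the Laplace mechanism.

    Each user contributes at most one record, so the count has L1
    sensitivity 1 and Laplace noise with scale 1/epsilon suffices.
    """
    rng = rng or np.random.default_rng()
    true_count = len(values)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

purchases = [12.5, 3.0, 7.25, 40.0, 9.99]   # illustrative per-user records
print("true count:", len(purchases))
print("DP count (epsilon=0.5):", dp_count(purchases, epsilon=0.5))
```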

IBM’s Differential Privacy Library

  • What is it? A general-purpose library for experimenting with, investigating, and developing applications in differential privacy (see the example after this entry)
  • Who is it for? Data scientists, ML engineers
  • Features
    • Generic tools for differentially private data analysis (e.g., histograms)
    • Differentially private models: clustering, classification, regression, dimensionality reduction, and pre-processing
    • Privacy budget and total privacy loss calculation using advanced composition techniques
  • Language: Python
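
A minimal sketch of training a differentially private model with diffprivlib and tracking the spent budget with its accountant; the dataset, epsilon, and data_norm bound are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from diffprivlib.models import LogisticRegression
from diffprivlib.accountant import BudgetAccountant

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Track total privacy spend across models and queries.
accountant = BudgetAccountant()

# data_norm bounds each sample's L2 norm; supplying it explicitly avoids
# leaking (and warning about) a bound inferred from the data itself.
clf = LogisticRegression(epsilon=1.0, data_norm=10.0, accountant=accountant)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("privacy spent (epsilon, delta):", accountant.total())
```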

Diffpriv

  • What is it? Tools for privacy-aware data science, providing generic implementations of differential privacy mechanisms
  • Who is it for? Data scientists, ML engineers
  • Language: R

PrivacyRaven

  • What is it? Privacy testing library for deep learning systems
  • Who is it for? Data scientists, ML engineers
  • Features
    • Support for label-only black-box model extraction, membership inference, and (soon) model inversion attacks (a conceptual sketch of label-only extraction follows this entry)

  • Language: Python
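
The sketch below illustrates the label-only model extraction setting PrivacyRaven automates, using plain scikit-learn rather than PrivacyRaven's API: the attacker queries a victim model for hard labels on synthetic inputs and trains a substitute on the responses. The victim, query distribution, and substitute model are illustrative assumptions, and fidelity from purely random queries is modest.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Victim model the attacker can only query for hard labels.
X, y = load_digits(return_X_y=True)
victim = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X, y)

# Attacker synthesizes query inputs (here: random pixel intensities),
# collects the victim's label-only responses, and fits a substitute model.
rng = np.random.default_rng(0)
X_queries = rng.uniform(0, 16, size=(5000, X.shape[1]))
y_queries = victim.predict(X_queries)          # label-only oracle access
substitute = DecisionTreeClassifier(random_state=0).fit(X_queries, y_queries)

# Fidelity: how often the stolen model agrees with the victim on real data.
agreement = (substitute.predict(X) == victim.predict(X)).mean()
print("substitute/victim agreement:", agreement)
```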