A Chain of Custody is the exhaustive traceability (“paper trail,” physical or otherwise) that chronologically records the ownership, viewing, analysis, and transformation of a data record or data source. Although the term is best known from evidence handling in criminal law and police procedure, the same fundamental concepts apply to data provenance under frameworks such as GDPR, Sarbanes-Oxley (SOX) compliance, and others.
A Chain of Custody fulfills two purposes. First, it guarantees a complete record of all Actions performed on the items it tracks, so one can know exactly what has (or has not) been done with those Assets. Second, it shows every Actor (generally a person) that has had access to (“touched”) an Asset while it was being tracked.
A simple yet useful example is customer records at a bank. Using Chain of Custody principles, the bank's record management system should track every change to a customer account, recording who (which Actor), what (which Action), when (a timestamp), and more. Some systems also include a why component, allowing an Actor to add a comment or note. Lastly, and arguably most importantly, such a system should track who has viewed the data, even if no changes to the data have been made.
Definitions
- Assets are the items being tracked as part of a Chain of Custody (for example, a database of personal records, metadata, etc.)
- Actors are any people or systems (“agents”) involved with Assets.
- Actions are transactions (generally verbs) that Actors perform on Assets (updating, deleting, etc.) and can also include passive Actions (e.g. viewing, counting).
- Authentication is the process of validating that an Actor is who they say they are.
- Authorization is the process of assigning permissions to an Actor: certain Actors can only do certain Actions to Assets.
- Logging is just that – an unchangeable log of all Actions that an Actor performed on the tracked Asset.
- Log Entries are individual entries that map to a unique Actor performing an Action on given Asset(s).
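The definitions above can be sketched as a small data model. This is only an illustration, not a prescribed schema: the class names, field names, the three example Actions, and the identifiers used at the bottom are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    """Verbs an Actor can perform; VIEW is a passive Action."""
    VIEW = "view"
    UPDATE = "update"
    DELETE = "delete"


@dataclass(frozen=True)
class Actor:
    actor_id: str        # hypothetical identifier for a person or system
    display_name: str


@dataclass(frozen=True)
class Asset:
    asset_id: str        # hypothetical identifier, e.g. a table or a single row


@dataclass(frozen=True)
class LogEntry:
    actor: Actor         # who
    action: Action       # what
    asset: Asset         # on which Asset
    timestamp: datetime  # when (UTC)
    note: str = ""       # optional "why" comment


def record(actor: Actor, action: Action, asset: Asset, note: str = "") -> LogEntry:
    """Create a Log Entry stamped with the current UTC time."""
    return LogEntry(actor, action, asset, datetime.now(timezone.utc), note)


# Hypothetical example: a teller views one customer account row.
teller = Actor("u-001", "Example Teller")
account = Asset("accounts/row/42")
entry = record(teller, Action.VIEW, account, note="customer phone enquiry")
```

Marking the dataclasses as `frozen=True` prevents accidental reassignment of fields in application code. It does not by itself make stored Log Entries unchangeable; that property has to come from the storage layer.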
Criteria for a good Chain of Custody:
N.B. ‘Must’ statements = sufficient, ‘Should’ statements = mature
- The system must have a list of tracked Assets.
- The system must have an acceptable list of Actors.
- The system must use Authentication to ensure Actors cannot be faked.
- The system must have a Logging system for any Actions performed on Assets.
- The Logging system must be unchangeable, such that the Log Entries cannot be changed afterwards. Furthermore, the system should be able to verifiably show that no tampering has occurred.
- The system should track data Assets at both a macro level (e.g. “database table”) and atomic level (e.g. “individual rows in a database table”).
- The system should have an acceptable list of Actions.
- The system should have an Authorization system in order to limit an Actor’s scope of Actions.
- The Logging system should have an interface for the viewing, searching, and filtering of log entries (of note: such viewing should be considered an Action in its own right).
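As a rough illustration of how several of these criteria fit together, the sketch below keeps a list of tracked Assets, limits each Actor to a set of permitted Actions, logs every attempted Action, and treats viewing the log as an Action in its own right. It assumes Authentication has already happened elsewhere, so Actor names are trusted strings. The class name `CustodySystem`, its methods, the choice to log denied attempts, and the `"custody-log"` Asset name are all hypothetical choices for this sketch.

```python
from datetime import datetime, timezone


class CustodySystem:
    """Toy in-memory system; a real one would use durable, append-only storage."""

    def __init__(self, assets, permissions):
        self._assets = set(assets)             # tracked Assets
        self._permissions = dict(permissions)  # Actor -> set of permitted Actions
        self._log = []                         # Log Entries, appended in order

    def perform(self, actor, action, asset, note=""):
        """Check Asset and permission, and log the attempt either way."""
        allowed = (
            asset in self._assets
            and action in self._permissions.get(actor, set())
        )
        self._log.append({
            "actor": actor, "action": action, "asset": asset,
            "when": datetime.now(timezone.utc),
            "allowed": allowed, "note": note,
        })
        if not allowed:
            raise PermissionError(f"{actor} may not {action} {asset}")

    def view_log(self, actor):
        """Reading the log is itself an Action on the log Asset."""
        self.perform(actor, "view", "custody-log")
        # Return a tuple so callers cannot append to or reorder the internal list.
        return tuple(self._log)


system = CustodySystem(
    assets={"accounts", "custody-log"},
    permissions={"auditor": {"view"}, "teller": {"view", "update"}},
)
system.perform("teller", "update", "accounts", note="address change")
entries = system.view_log("auditor")  # includes the auditor's own view entry
```

This sketch does not make the log tamper-evident: the entries are ordinary dictionaries held in memory, so the unchangeability criterion would need a separate mechanism.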
Sufficient – Enough to pass the audit
Mature – More than enough to pass the audit
Insufficient – Not enough to pass the audit
Definitions and examples in the table below will be specific to this Knowledge Store and should be populated by the Knowledge Store Owner. You are trying to capture the best summary of the criteria that place an audit response, or piece of audit evidence, into these categories.
Sufficient | Essential inclusions: – A Chain of Custody must be Comprehensive, in that the system covers all relevant data sources and types (“Items”), not only a subset – Where appropriate, consideration of Relevant Legal Frameworks: if laws, regulations, or regulatory guidance dictate categories of data or information, those categories must be included in the Chain of Custody – A Chain of Custody must be Exhaustive, in that it covers all Actions and Actors that may interact with the data sources and types – The ability to review, search, and filter aspects of the Chain of Custody logs – To achieve Chain of Custody status, the system must log every interaction and every change without exception Benchmarks for good practice: – That there are no Actors (even root/administrators) with the ability to edit previous Chain of Custody Log Entries – That no Actors, Actions, or Assets fall outside of the coverage of the Chain of Custody – That the same backup / disaster recovery protocols in place for the Assets also cover the Chain of Custody Log and/or metadata |
Sufficient | Examples: – Software version control systems have a rigorous (if ancillary) Chain of Custody system, whereby code and files (Assets) are tracked for any changes included in the System of Record – CanFor’s accreditation to the Forest Stewardship Council: https://www.canfor.com/docs/responsibility/3)-chain-of-custody-documented-control-system.pdf?sfvrsn=2 – Loring Laboratories uses Chain of Custody protocols for the tracking of its lab Assets: https://www.loringlabs.net/chain_of_custody.html |
Mature | Indicators that related controls and practices have reached a mature state: – A workflow engine to ensure standardization across different use cases and departments – Sufficient backup and recovery policies and procedures to ensure redundancy and disaster recovery – If scale is a concern, Log Entries held in a separate (“non-production,” “analytical”) system for reporting – A capability to manage Actors after they are no longer part of the Chain of Custody and/or the relevant organization – The ability to share and/or export aspects of the Chain of Custody Log Entries – Restriction of both the number of Actors and their permitted Actions to the minimum required to perform their roles – A well-documented process and/or swimlanes as part of a corporate policy system – Standard templates to store Chain of Custody information, where automated ingestion and tracking is not possible or reasonable – Where staleness / expiration is a risk, mature systems will ensure that the quality of Assets is maintained, including relevant metadata to measure that quality |
Mature | Examples: – Almost all systems of a litigious nature (either criminal or civil) will have a mature Chain of Custody – The EPA Handbook includes factors to ensure the quality of Assets is maintained: https://www3.epa.gov/ttnamti1/files/ambient/pm25/qa/vol2sec08.pdf – EPIC e-health records include Chain of Custody protocols as part of regulatory compliance (e.g. HIPAA): https://www.harmonyhit.com/securing-chain-of-custody-for-your-electronic-medical-records |
FAQs
Question | Answer |
Can’t people just use manual systems? Why all the overhead? | Theoretically, yes. But a strong Chain of Custody system should be able to prove that the Logs have not been tampered with. Such a task is difficult with manual systems. |
How do you make logs untamperable? | There are a variety of clever and/or complex ways to ensure this (checksums or, at the extreme, blockchain). Fundamentally, the concept is that of proving that such logs haven’t changed since their creation. |
How does chain of custody work for physical items (papers, assets, etc.)? | Most document management and records management systems can handle both electronic and physical assets, usually with a mapping of physical box IDs in a warehouse to a virtual list within the custodial system. For more sensitive assets, locked boxes or sealed bags and signature chains can be used to ensure nothing was tampered with. |
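One common way to apply the checksum idea from the FAQ on tamper-proofing logs is a hash chain, in which each Log Entry stores a hash computed over its own content plus the hash of the previous entry. The sketch below is a minimal illustration using SHA-256 from Python's standard library; the function names, the entry layout, and the `GENESIS` placeholder are assumptions for this example, not a reference design.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry whose hash depends on everything logged before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"actor": "teller", "action": "view", "asset": "accounts/row/42"})
append(log, {"actor": "teller", "action": "update", "asset": "accounts/row/42"})
intact = verify(log)                     # chain checks out

log[0]["payload"]["action"] = "delete"   # simulate after-the-fact tampering
tampered_detected = not verify(log)      # recomputed hash no longer matches
```

A chain like this only detects edits if the most recent hash is also kept somewhere the editor cannot reach, since someone able to rewrite the entire log could recompute every hash. Publishing or separately storing that latest hash is one way to close the gap.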
Linked Knowledge Stores and Content
This is where you list or link to other related knowledge stores e.g. Ethics Curriculum would link to Code of Ethics. It would also link to GDPR articles, AI Audit criteria, Audit Explanatory Notes, relevant laws and regulations.
Content Type | Content Description and Link |
Code of Data Ethics | https://forhumanity.center/bok/code-of-data-ethics |
Risk Management Framework | https://forhumanity.center/bok/risk-management |
AAA Systems Log | AAA refers to Authentication (to identify), Authorization (to give permission) and Accounting (to log an audit trail) [wikipedia] |
Information Quality | “Traditionally, information has been viewed as a by-product of a computer system or an event. From this viewpoint, the focus is on designing and delivering computer systems, rather than designing and delivering information” [springer, for example] |
EU Artificial Intelligence Act | A starting point from the EU website: https://digital-strategy.ec.europa.eu/en/policies/european-approach-artificial-intelligence |
Data Taxonomy (via Sundar N.) |