Article 10 Data and data governance


    1. High-risk AI systems which make use of techniques involving the training of AI models with data shall be developed on the basis of training, validation and testing data sets that meet the quality criteria referred to in paragraphs 2 to 5 whenever such data sets are used.

    2. Training, validation and testing data sets shall be subject to data governance and management practices appropriate for the intended purpose of the high-risk AI system. Those practices shall concern in particular:

      (a) the relevant design choices;

      (b) data collection processes and the origin of data, and in the case of personal data, the original purpose of the data collection;

      (c) relevant data-preparation processing operations, such as annotation, labelling, cleaning, updating, enrichment and aggregation;

      (d) the formulation of assumptions, in particular with respect to the information that the data are supposed to measure and represent;

      (e) an assessment of the availability, quantity and suitability of the data sets that are needed;

      (f) examination in view of possible biases that are likely to affect the health and safety of persons, have a negative impact on fundamental rights or lead to discrimination prohibited under Union law, especially where data outputs influence inputs for future operations;

      (g) appropriate measures to detect, prevent and mitigate possible biases identified according to point (f);

      (h) the identification of relevant data gaps or shortcomings that prevent compliance with this Regulation, and how those gaps and shortcomings can be addressed.

    3. Training, validation and testing data sets shall be relevant, sufficiently representative, and to the best extent possible, free of errors and complete in view of the intended purpose. They shall have the appropriate statistical properties, including, where applicable, as regards the persons or groups of persons in relation to whom the high-risk AI system is intended to be used. Those characteristics of the data sets may be met at the level of individual data sets or at the level of a combination thereof.

    4. Data sets shall take into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, contextual, behavioural or functional setting within which the high-risk AI system is intended to be used.

    5. To the extent that it is strictly necessary for the purpose of ensuring bias detection and correction in relation to the high-risk AI systems in accordance with paragraph (2), points (f) and (g) of this Article, the providers of such systems may exceptionally process special categories of personal data, subject to appropriate safeguards for the fundamental rights and freedoms of natural persons. In addition to the provisions set out in Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680, all the following conditions must be met in order for such processing to occur:

      (a) the bias detection and correction cannot be effectively fulfilled by processing other data, including synthetic or anonymised data;

      (b) the special categories of personal data are subject to technical limitations on the re-use of the personal data, and state-of-the-art security and privacy-preserving measures, including pseudonymisation;

      (c) the special categories of personal data are subject to measures to ensure that the personal data processed are secured, protected, subject to suitable safeguards, including strict controls and documentation of the access, to avoid misuse and ensure that only authorised persons have access to those personal data with appropriate confidentiality obligations;

      (d) the special categories of personal data are not to be transmitted, transferred or otherwise accessed by other parties;

      (e) the special categories of personal data are deleted once the bias has been corrected or the personal data has reached the end of its retention period, whichever comes first;

      (f) the records of processing activities pursuant to Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary to detect and correct biases, and why that objective could not be achieved by processing other data.

    6. For the development of high-risk AI systems not using techniques involving the training of AI models, paragraphs 2 to 5 apply only to the testing data sets.
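
Illustrative note (not part of the Article): the requirement that data sets be sufficiently representative and have appropriate statistical properties as regards the groups of persons concerned could, as one possible first check, be approached by comparing group shares in a data set against reference shares for the intended population. The sketch below is a minimal example under stated assumptions: the function `representation_gaps`, the `age_band` attribute, the reference shares and the 0.05 tolerance are all hypothetical, and the Regulation does not prescribe this method or any threshold.

```python
from collections import Counter


def representation_gaps(records, group_key, reference_shares, tolerance=0.05):
    """Return the groups whose share in `records` differs from the
    reference share by more than `tolerance` (absolute difference).

    All parameter names and the default tolerance are illustrative
    assumptions, not values taken from the Regulation.
    """
    if not records:
        raise ValueError("records must not be empty")
    counts = Counter(record[group_key] for record in records)
    total = len(records)
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            gaps[group] = {"observed": observed, "expected": expected}
    return gaps


# Hypothetical training data: 100 records tagged with an age band.
training_records = (
    [{"age_band": "18-39"}] * 60
    + [{"age_band": "40-64"}] * 37
    + [{"age_band": "65+"}] * 3
)
# Hypothetical shares of each band in the intended user population.
reference = {"18-39": 0.40, "40-64": 0.40, "65+": 0.20}

gaps = representation_gaps(training_records, "age_band", reference)
for group, shares in sorted(gaps.items()):
    print(f"{group}: observed {shares['observed']:.2f}, "
          f"expected {shares['expected']:.2f}")
```

With these made-up figures the 18-39 band is over-represented and the 65+ band under-represented, while the 40-64 band falls within the tolerance. A check of this kind only surfaces candidate gaps; whether a gap matters depends on the intended purpose of the system.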
