What would legislation for data infrastructure and open data look like? 

A useful summary from the Open Data Institute (from last year) of what the legal framework for a sustainable data infrastructure should look like.

Legislation should:

  • Define a set of roles and responsibilities around data infrastructure assets, such as data collectors, maintainers, publishers and regulators. It would give basic requirements for the responsibilities of each of these roles (eg that publishers must make the data available in machine-readable form) but provide for future flexibility by stating that standards and guidelines will be specified through separate materials published outside legislation.

  • Define what it means to be open data within the data infrastructure, what additional roles and responsibilities this incurs, and what kinds of data should be designated as open data. Not all data infrastructure assets will be designated as open data infrastructure assets. For the other data infrastructure assets, legislation should specify what the sharing regime is.

  • Provide a legislative framework that gives someone (for example the Minister from the department responsible for the data, or a Chief Data Officer) the power to designate a particular dataset as a data infrastructure asset using secondary legislation. This enables the list of data infrastructure assets to grow over time. Primary legislation would need to define what secondary legislation needs to say about data assets. For example, secondary legislation might define spending data as data infrastructure. It would need to state what items are included (eg the granularity of the spend items), what information about them must be provided (eg the category of the spend), and under what access regime (eg that it should be published openly). Legislation should not include technical details such as the format in which the data should be published, because technical best practice is likely to change.

  • Designate certain assets as data infrastructure assets and indicate who the collector, maintainer, publisher, access regime, and so on are for each of those assets. These would fall into three general categories:

    • classes of materials, such as public records and national statistics
    • existing data infrastructure assets defined by legislation, including various registers where there is already a designated registrar
    • new data infrastructure assets which have been mandated by policy as part of the government’s open data initiative, such as spending data or election data

  • Set some limits on the removal of data assets from the list of data infrastructure assets, eg that changes to their status (both whether they are listed or not and what their access regime is) can only happen with at least a year’s notice. This ensures that businesses can expect and rely on stability from data assets listed within the data infrastructure.
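
To make the last two points concrete, here is a minimal Python sketch of how a designated asset and the notice rule might be recorded. It is only an illustration, not anything the ODI specifies: the names DataAsset, AccessRegime, MINIMUM_NOTICE and change_allowed are hypothetical, the example organisations are invented, and "at least a year's notice" is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class AccessRegime(Enum):
    """Hypothetical access regimes; the summary names only 'open' explicitly."""
    OPEN = "open"
    SHARED = "shared"


@dataclass(frozen=True)
class DataAsset:
    """One designated data infrastructure asset and the roles attached to it."""
    name: str
    collector: str
    maintainer: str
    publisher: str
    access_regime: AccessRegime


# Assumption: "at least a year's notice" is approximated as 365 days.
MINIMUM_NOTICE = timedelta(days=365)


def change_allowed(announced_on: date, effective_on: date) -> bool:
    """Return True if a status change gives at least the minimum notice."""
    return effective_on - announced_on >= MINIMUM_NOTICE


# Invented example: spending data designated as an openly published asset.
spending = DataAsset(
    name="spending data",
    collector="example department",
    maintainer="example department",
    publisher="example publishing body",
    access_regime=AccessRegime.OPEN,
)
```

A proposal to delist or change the access regime of an asset would be checked with change_allowed(announcement_date, effective_date) before it takes effect. Technical details such as the publication format are left out of the record on purpose, in line with the point that legislation should not fix them.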