Public People: Personal Data in Open Government Databases

Government open data (OGD) programs, in which a government places public data online in a downloadable and typically machine-readable format, have spread widely in recent years. Just over ten years after Washington, D.C. launched the first OGD program in 2006, dozens of OGD programs exist worldwide. The practice of opening government data has spread, but strong norms around what to open do not yet exist.
Open data programs take data that is technically but not practically accessible to the public and place it on the Internet, bringing information out of town hall filing cabinets and into Google search results, civic apps, data-driven journalism and commercial databases. Politicians and advocates laud government open data programs for their economic potential, but alongside these benefits, open people data carries specific privacy implications. The phrase “people data” here refers to any data concerning a specific individual, as opposed to aggregated statistics. The United Kingdom and the United States rank as the first and second most open in terms of government open data, according to the Open Data Barometer, yet the two nations handle personal data in diverging ways.
What data is “embodied” by a specific person has expanded, as recent academic work has shown how easily data can be deanonymized. A dataset no longer needs to list a specific person’s name or contact information to be personal. The definition of sensitive personal information has also expanded in recent years, because open data by definition can be freely reused and linked with other public or private databases. While information from a single dataset may not be sensitive, the privacy implications of its use can increase when it is linked with other information.
The UK closely regulates the publication of people data through policies such as the 1998 Data Protection Act, whereas the US is comparatively lenient in publishing information related to people. Not only has the definition of “people data” in the US and UK evolved over the years, but the differences in how the two nations regulate personal data also have implications for data reuse in each.
The impact of open data can be difficult to measure, but the relative reuse of data can be assessed by comparing the most used datasets in the US federal database data.gov to the most used datasets in the UK open database data.gov.uk. Do the two nations use categorically different data, and how do the differences in the datasets available affect the commercial applications of OGD?
The government open data movement is still in the early stages of development; while the US and UK may have fairly developed open data programs at the national level, other governments are only beginning to consider them. Norms around personal data in OGD should develop in lockstep with the spread of open data itself. The “Public People” presentation will help to explore the role of OGD in bringing personal data out into the public sphere of the Internet.