Data Stewardship in Action: Workshop on Making Data Collaboratives Systematic, Sustainable and Responsible
Instructor: Stefaan Verhulst (@sverhulst), The GovLab, New York University (@thegovlab & @nyu)
-
250 examples of the private and public sector working together with data
-
There is a whole spectrum of data sharing and collaboratives, from fully open data to sharing only the results (with the data fully restricted)
-
There are many ways to run a collaborative, depending on several variables (e.g., openness, timing, type of work, focus, etc.)
-
Is there any evidence of what works better within the spectrum of the conditionality matrix of open-restrictive data versus different levels of collaboratives? It varies case by case. There is a paper in the making (to appear on the @thegovlab site) to guide which strategy is better depending on the variables (e.g., scope, access, data, collaboration) you have for a project
-
Big challenge: making any collaborative sustainable
-
A lot of data that are important for the public reside in the private sector. Reciprocity from the public sector is the main incentive for corporations to share some of their data. Other incentives: reputation, doing research, social responsibility, and retention of talent (data scientists in industry may be more motivated to stay in a company where they have access to data for the public good such as, for example, UNICEF)
-
There is a large cultural gap between accumulating data and sharing it. "Often people do not get promoted for sharing data, but because nothing bad happened with the data"
-
Does data sharing have to be free? We need to have the hard convo about financial sustainability
-
There are also ethical considerations. Multiple risks: privacy, ethical, and competitive risks
-
With the private sector it is helpful to ask for data in a customer-centric framework rather than in a data-centric way. Customer-centric questions: What questions are you interested in answering? Why can't you? Answer: because the data are not available to be analyzed to answer the question. Conclusion: Hey, corp, your customer needs these data.
-
If you don't share any data, but only the insights from that data, how can others trust the insights? This is not solved yet. There is room for trusted organizations to flourish that can audit the insights from data without disclosing the data
Data Stewardship - The role of the chief data steward
-
Make data collaboratives sustainable, systematic, and responsible - a governance model
-
Repository of contracts: Contractual wheel of data collaboration. To help the legal conversation about data and make it more systematic
-
Trying to go beyond prototypes and make data use for public policy sustainable
-
One way is to have a Data Steward, that is, a person within each organization who can connect with the data. Other functions: there is also a need to create the data marketplace and build the community of practice
-
The Cambridge Analytica case made it clear that even partnership with academia can turn into a disaster. It's important to have data stewards to know who to partner with in the corporate world
-
The public sector can help make this work within a self-regulatory framework, by asking the private sector to take this on itself (carrot/stick approach - the stick is the shadow of the public sector, which helps this really happen)
-
Secondary use debate: data were collected for one purpose but insight is usually extracted as a secondary use of that data. The use for secondary analyses is a case GDPR does not talk about
-
We already have chief data security officers. Now we have the same need for data stewardship. National stats agencies are data stewards. There are connections to be made with them and a community to build.
-
Slides will be sent over email to participants
Instructors: Gefion Thuermer (@GefionT) and Johanna Walker, University of Southampton (@unisouthampton); Peter Wells (@peterkwells), Open Data Institute (@ODIHQ); and Kieron O'Hara, University of Southampton
-
Follows on nicely from the previous workshop, but is independent of it. More practical. Includes success cases and tips for sharing data
-
Data Sharing Workshop Resources & Glossary can be found here
-
Slides summarized in the following notes can be found here
-
Motivation for a data sharing economy: "data economy may increase to 739 billion euros by 2020"
-
"Data sharing club" (yes, quite similar to belonging to a club where you pay a bit to belong; entrants may get in for free - like students in professional societies - if they fulfil some requirements). This idea contrasts with open/closed platforms for sharing data
-
Challenges to solve: Incentives, broader than industry sectors, conditions, match supply and demand, trust (to not abuse your data and behave responsibly with your data)
-
Ecosystem examples: data sharing incubators and accelerators, International Data Spaces, BDVA-PPP i-Spaces, ODI - and a bunch of others that I missed
-
Data Pitch: a data sharing case study. European, open, data-driven innovation program. Innovators, entrepreneurs, data providers, ...
-
Data provider challenges: it took a year to get data providers involved. Identifying the data and problem to include in the pitch. 8 months to define the challenge and review the legal aspects of the data. 4 months to agree on a contract. 2 months to obtain a metadata sample
-
Case: Greiner Packaging International ended up matched with five innovators/startups who are now working on advancing different areas of their business. Greiner had a person dedicated to advancing data-driven innovation - we could call that person a data steward
Kieron O'Hara
-
Theoretical approach
-
Results? Often you look down into the water and you just see the risks (i.e., sharks)
-
Ethical data stewardship: regulation is not enough (...) Institutions may be needed for trust (e.g., Data trusts/data sharing clubs)
Jack Springman
-
Practical approach
-
Data mobility infrastructure
-
The data portability growth opportunity for the UK economy - 2018 report:
-
Advantages: economic benefits, accelerate innovation, healthier markets, improve productivity
-
Issues: consumer services and applications, adaptive regulations, infrastructure, and two more I could not note
-
Personal data mobility sandbox. British Telco, British Gas, Barclays, BBC, Facebook + a group of external trusted observers: Centre for Data Ethics and Innovation, Consumers International, DCMS, ICO, University of Southampton. The purpose is to demonstrate that people can share their personal data safely
-
GDPR does not make it easy to share the data/data mobility between individuals and different institutions
-
From data portability to data mobility. A vision: using a data facilitator like "dgme" with APIs from different companies (e.g., Facebook, Spotify, etc.), individuals import/export their data from the different companies in a safe and ethical environment. It was important to test the user experience, in particular the perception of security and privacy
-
What do people/companies get in return? Value
-
A future report about this initiative will be available on the Ctrl-Shift website
-
It has to be safe and valuable to share data
Jack Hardinges, Peter Wells - Open Data Institute
-
Slides can be found here
-
ODI mission: to work with companies and governments to build an open, trustworthy data ecosystem
-
ODI vision: we want a world where data works for everyone. Non-naive idealists. Long-term goal
-
ODI works across the complete data spectrum: small/medium/(meaningless :-)) big data - public, commercial, etc.
-
Data is infrastructure. We see it more when it is not working. Data should be more boring than it usually is
-
Trust and trustworthiness are highly context-dependent
-
Which data access models give people increased access to data while retaining trust?
-
Trust: a general definition of trust (Kieron O'Hara, 2012). Taxonomy: the BURRITOS model (Better Understanding the Responsibilities and Rationales Informing Trustworthy Options for Sharing). A better way to talk about the taxonomy is "The map of data access", which will be published in about three weeks with a travel guide to the map
-
For example, some areas of the map require a lot more exploration than others. If you are in a given part of the map that others explored, the guide will give you hints on what worked for others and what didn't work at a practical level
-
A federation of data explorers (Federation, like the Federation in Star Trek)? UoS, Data Pitch, GovLab, data stewards, big firms, governments, etc.
-
Case study for the trust island in the map. A data trust is a legal structure that provides independent stewardship of data. The trustees/data stewards of the data trust take on responsibility for how data is used and shared
-
Several related examples: Genomics England access committee, Office for National Statistics, Secure Research Service, NHS Health Research Authority, Confidentiality Advisory Group, METADAC, etc.
-
Pilot projects: civic (data about electric vehicles, for instance), food waste (and sales data), illegal wildlife trade (image and acoustic data). Some of them involved personally identifiable data. Multi-disciplinary team over three months.
-
Challenges in increasing access to data: lack of value for business, concerns about reputational impact, who owns and has rights over the data, lack of standards, limited data literacy and skills
-
Challenges in building data trusts: what is a data trust?; determining independence; financial sustainability; making decision making open, participatory and deliberative; lack of ecosystem maturity; how to demonstrate trustworthiness; use of data trusts to steward personal data; avoiding technology-first solutions (putting the cart before the horse?)
-
Building data trusts: there is a 30-page guide on how to get this going, available online at ODI's website