Rozita Dara, IPC-ON
Patricia Brantingham, SFU
Caitlin Hertzman, Population Data BC
This session was intended as four case study presentations of how to manage the security and availability of big data in research.
Stephen's claim is that the responsibility fundamentally falls on the researcher to understand the issues and ensure that the data and results are handled correctly.
Stephen outlined several existing scenarios, including single computers or drives with encryption, external service providers or "cloud" solutions, and private clouds, along with the pluses and minuses of each. All of these are familiar to those of us who work in these research environments, and there were no surprises here.
EmuLab in Utah was cited as an HPC facility that uses VLANs to segregate the various research computing workloads inside a secure data centre. Compute Canada is waiting on a response to a proposal they have submitted to build a similar facility.
When asked how to ensure that data destruction is complete at the end of a research project, Stephen said he takes complete responsibility, buys self-encrypting drives for his servers, and does everything himself, and that is his solution. This approach was challenged from the university management and privacy responsibility perspective. I agree with the challenge: Stephen's approach is at best a stop-gap measure, and is not efficient, institutionally accountable, or scalable.
Rozita spoke to us about the use of virtual tools to protect information freedom. The relationship between the amount of data becoming available, the value of that data, and the legislation and controls available to govern its use was posed to us as an increasing challenge.
Rozita summarised the challenges as:
Over-regulation of data
Privacy in context (one ring will not rule them all)
We were encouraged to check out the site http://PrivacyByDesign.ca for a summary of her research on these challenges and her proposed solution, "SmartData", which, as I understood it, effectively tags data with metadata and builds a parallel architecture to manage it. A SmartData symposium will be held in Toronto for those interested.
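To make the metadata tagging idea concrete for myself, here is a minimal Python sketch of data that carries its own usage policy. This is only my illustration of the general concept, not Rozita's design or anything taken from the SmartData material; the names Policy, TaggedRecord, and release, and the role and purpose labels, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical metadata tag: who may use the data, and for what."""
    allowed_purposes: frozenset
    allowed_roles: frozenset


@dataclass
class TaggedRecord:
    """A piece of data that travels together with its own policy."""
    payload: dict
    policy: Policy


def release(record: TaggedRecord, role: str, purpose: str) -> dict:
    """Return the payload only if the record's own policy permits it."""
    if role not in record.policy.allowed_roles:
        raise PermissionError(f"role {role!r} is not permitted")
    if purpose not in record.policy.allowed_purposes:
        raise PermissionError(f"purpose {purpose!r} is not permitted")
    return record.payload


# Usage: the policy is attached to the record, not held by the application.
record = TaggedRecord(
    payload={"age_band": "30-39", "region": "BC"},
    policy=Policy(
        allowed_purposes=frozenset({"health_research"}),
        allowed_roles=frozenset({"approved_researcher"}),
    ),
)
data = release(record, role="approved_researcher", purpose="health_research")
```

The point of the sketch is only that the access decision reads the tag on the data itself, so the same record can carry different rules in different contexts.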
Population Data BC is a clearing house for health, demographic, occupational, environmental, and educational data from various public bodies. They work out of UVic, SFU, and UBC to provide the data and training on how to use it. They do not conduct research themselves, but provide the data to researchers.
Three models of privacy are suggested: enterprise risk management, information governance, and privacy by design. A best practice is to understand each of these, and apply the aspects that best suit your organisation and the data you collect, store, and use.
The best practices used by PopData BC are:
Physical zoning with fobbed access and alarms
Fortification of walls
Sign in and escort for visitors
Network zoning with two-factor authentication
Dummy terminals (physically separate computers for working with secure data, distinct from those used for general administrative work)
Separation of identifiers from content
Proactive linkage (data anonymization)
Auditing, logging, monitoring
Secure research environment (a VPN for researchers to access data pools)
Encryption (full data lifecycle protection)
Data destruction methods
Data access request formats
Privacy training (and testing after)
Criminal records check
Close working relationship with OCIO et al
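Two of the items above, separation of identifiers from content and linkage, can be sketched in a few lines of Python. This is a minimal sketch of keyed pseudonymisation under my own assumptions, not PopData BC's actual method: the key, the field names, and the functions pseudonym and split_record are hypothetical, and a real linkage key would be held only by the data steward.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would never appear in source code.
LINKAGE_KEY = b"demo-key-held-by-data-steward"


def pseudonym(identifier: str, key: bytes = LINKAGE_KEY) -> str:
    """Derive a stable link ID from an identifier using a keyed hash (HMAC-SHA256)."""
    normalised = identifier.strip().upper()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def split_record(record: dict, id_fields: tuple) -> tuple:
    """Split a record into (identifiers, content), joined only by a link ID."""
    link_id = pseudonym("|".join(str(record[f]) for f in id_fields))
    identifiers = {f: record[f] for f in id_fields}
    identifiers["link_id"] = link_id
    content = {k: v for k, v in record.items() if k not in id_fields}
    content["link_id"] = link_id
    return identifiers, content


# Usage: two source datasets that share the same identifier fields.
health = {"name": "Jane Doe", "dob": "1980-01-01", "diagnosis": "asthma"}
school = {"name": "jane doe", "dob": "1980-01-01", "grade_12": True}

_, health_content = split_record(health, ("name", "dob"))
_, school_content = split_record(school, ("name", "dob"))

# The content rows can be linked to each other without exposing name or dob.
linked = health_content["link_id"] == school_content["link_id"]
```

Researchers would only ever see the content rows; the identifier rows and the key stay inside the secure zone. Note that this is pseudonymisation rather than full anonymisation, since whoever holds the key and the identifiers can re-link.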
Patricia gave us illustrations of the criminology studies, data, and data representations used at SFU. Plans are for a provincial data store of criminology relevant data, and a Commonwealth internetwork to share research.
Lorenzo from SFU gave us a vague but useful explanation of the complexities involved in building the five-layer secure environment for housing this research data.
Location: W Hastings St, Vancouver, Canada