[Resource Topic] 2016/012: Cryptography for Big Data Security

Welcome to the resource topic for 2016/012

Title:
Cryptography for Big Data Security

Authors: Ariel Hamlin, Nabil Schear, Emily Shen, Mayank Varia, Sophia Yakoubov, Arkady Yerukhimovich

Abstract:

As big data collection and analysis becomes prevalent in today’s computing environments there is a growing need for techniques to ensure security of the collected data. To make matters worse, due to its large volume and velocity, big data is commonly stored on distributed or shared computing resources not fully controlled by the data owner. Thus, tools are needed to ensure both the confidentiality of the stored data and the integrity of the analytics results even in untrusted environments. In this chapter, we present several cryptographic approaches for securing big data and discuss the appropriate use scenarios for each. We begin with the problem of securing big data storage. We first address the problem of secure block storage for big data allowing data owners to store and retrieve their data from an untrusted server. We present techniques that allow a data owner to both control access to their data and ensure that none of their data is modified or lost while in storage. However, in most big data applications, it is not sufficient to simply store and retrieve one’s data and a search functionality is necessary to allow one to select only the relevant data. Thus, we present several techniques for searchable encryption allowing database- style queries over encrypted data. We review the performance, functionality, and security provided by each of these schemes and describe appropriate use-cases. However, the volume of big data often makes it infeasible for an analyst to retrieve all relevant data. Instead, it is desirable to be able to perform analytics directly on the stored data without compromising the confidentiality of the data or the integrity of the computation results. We describe several recent cryptographic breakthroughs that make such processing possible for varying classes of analytics. We review the performance and security characteristics of each of these schemes and summarize how they can be used to protect big data analytics especially when deployed in a cloud setting. We hope that the exposition in this chapter will raise awareness of the latest types of tools and protections available for securing big data. We believe better understanding and closer collaboration between the data science and cryptography communities will be critical to enabling the future of big data processing.

ePrint: https://eprint.iacr.org/2016/012

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics .