Clusion v0.2.0

encrypted search

by Seny Kamara and Tarik Moataz

Over the last two years, we have been working on an open source encrypted search library called Clusion. We are using it in several ESL projects, including the Signal Search project we recently released. We are also excited that other groups are experimenting with it. In this post, we wanted to give an overview of Clusion and answer some of the most frequently asked questions we have received about it.

Overview

Clusion is an open source encrypted search library created and maintained by the Encrypted Systems Lab at Brown University. It is written in Java and is available under the GNU General Public License v3. Clusion is designed to be easy to use and modular. It can be used for both fast prototyping and integration into real-world applications. It implements state-of-the-art and highly-efficient searchable symmetric encryption (SSE) schemes with varying levels of expressiveness. This includes schemes that handle single, conjunctive, disjunctive and boolean keyword search.

What is encrypted search? Encrypted search is an area of cryptography that focuses on the design of cryptographic primitives to search on encrypted data. There are a wide range of possible solutions to this problem including fully-homomorphic encryption (FHE), oblivious RAM (ORAM), multi-party computation (MPC), property-preserving encryption (PPE) and structured encryption (STE). Each of these approaches provide different tradeoffs between efficiency, expressiveness and security. For an introduction to the area, see this paper and these lecture notes.

What is searchable symmetric encryption (SSE)? One approach to designing encrypted search solutions is to use structured encryption (STE). STE schemes encrypt data structures in such a way that they can be queried privately. SSE is a special case of STE that focuses on encrypting search structures, e.g., inverted indices, search trees etc.

How did Clusion come about? The idea of building Clusion started in 2015 in a bus ride heading to Saint Annes in Rennes, France. We were discussing the scarcity of encrypted search implementations and the complete lack of open source libraries. At that time, there were some implementations from Microsoft, IBM and Columbia/Bell Labs but none were open source. This was surprising because it had been almost ten years since we had practical techniques to search on encrypted data. The issue became even more pressing when we wanted to implement a new boolean SSE scheme we were working on called IEX, which relied on several building blocks and techniques that had never been implemented. So we decided to implement IEX, along with its underlying building blocks, and to make everything open source. Our hope was that an open source SSE library would help others who wanted to evaluate their constructions and would establish a common framework that researchers and practitioners could refer to and improve on.

Who is Clusion for? We designed Clusion for anyone who is interested in using or experimenting with encrypted search technology. It has already been used in several ESL projects by undergraduate, master's and PhD students.

What kind of applications can use Clusion? Clusion can be used by any application that needs to store and query sensitive data. We use it to evaluate our constructions, to build prototypes and are using it to build a system that will be deployed at scale.

Some examples of applications include cloud backup services like Dropbox, iCloud, Google Drive and OneDrive; and NoSQL databases like MongoDB and Redis. Encrypted search is not only applicable in cloud scenarios, however, but can also be used by desktop and mobile applications when storing sensitive data locally. This was the case, for example, in a recent project where we used Clusion to add a secure message search feature to the Signal app.

An important note on security. While SSE, and encrypted search technologies in general, make the design of secure and privacy-preserving applications easier, it is important to understand their inherent limitations. For starters, encryption technology as a whole can only protect data against certain kinds of threats and provides no guarantees against attacks that target users or input/output devices. Second, it is important to keep in mind that the security of your system may not reduce directly to the security of the underlying SSE scheme. In other words, the proper use and implementation of the SSE scheme may be useless if the larger system in which it is used is not designed or implemented properly. Finally, it is crucial to understand that most encrypted search solutions leak some information. In some cases like ORAM-based solutions this leakage can be very small. In others, like PPE-based solutions this leakage can be very large. SSE solutions are somewhere in the middle: more leakage than ORAM-based solutions but less than PPE-based solutions. A lot of ongoing research is trying to understand the practical impact of this leakage so deciding how and when to properly use these schemes can be tricky. For these reasons, we recommend that any real-world deployments of encrypted search technology be accompanied by a thorough review by experts in the field.

Implementation

The Clusion architecture. Clusion is built in a modular fashion and is split into several components, as shown in Fig. 1. It handles the entire encrypted search pipeline including the parsing and indexing of data, the creation of encrypted data structures, the encrypted query operations and, finally, displaying the result to the user. We give more details about the components below.

Fig. 1: The Clusion components.

Indexing. The Clusion indexer takes as input a folder and indexes the files in it. It can index pdf files, various Microsoft file formats (e.g., .doc and .ppt), media files (e.g., pictures and videos) as well as raw text such as html and text files. The different file formats are handled using PDFBox and POI. Tokenization and stemming is done using Apache Lucene. The indexer outputs two multi-maps. The first associates keywords to document filenames and the second associates filenames to keywords. The default indexer does full-text indexing but it can be modified to index only a user-defined set of keywords. The underlying data structures are built with Google Guava.

Cryptographic primitives. We use Bouncy Castle to instantiate all the underlying cryptographic primitives. In particular, we use AES-CTR for symmetric encryption and HMAC-SHA256/512 and AES-CMAC for PRFs. Key generation is based on PBE PKCS1 and we use SecureRandom as a PRG. We also provide a encrypt-then-mac authenticated encryption and a deterministic version of AES (AES with a synthetic IV). In addition, there is also an implementation of the HCB1 online cipher.

SSE schemes. As of v0.2.0, Clusion includes the following constructions:

  • 2Lev: a single-keyword SSE scheme by [ CJJJKRS14]. 2Lev has optimal search complexity and is designed to be I/O-efficient so it can be disk-resident.

  • Dynamic 2Lev: a dynamic variant of 2Lev that handles additions of new documents. Here, we modified the original dynamic construction of [ CJJJKRS14] to be forward-private. Note that this variant does not handle delete operations.

  • Dynamic pibase: pibase is a single-keyword SSE scheme from [ CJJJKRS14]. It is has optimal search complexity and is designed to be memory-resident. The dynamic variant of pibase handles additions and deletions but search complexity is linear in the number of delete operations which means it should only be used in settings where few deletes occur. We also modified this construction to be forward-private.

  • ZMF: a compact single-keyword SSE scheme from [ KM17]. This construction has linear search complexity but produces very small encrypted indexes. It should be used when space is at a premium.

  • IEX: a (purely) disjunctive SSE scheme from [ KM17]. IEX has sub-linear search complexity and optimal communication complexity. IEX makes black-box use of an underlying single-keyword SSE scheme, which means that it can be instantiated using any single-keyword SSE scheme. The two instantiations implemented in Clusion are IEX-2Lev and IEX-ZMF. The former is optimized for search; achieving fast search at the cost of a large encrypted index. The latter is optimized for space; producing a compact index at the cost of slower search (note that IEX-ZMF still has sub-linear search complexity).

  • BIEX: a boolean SSE scheme from [ KM17]. Like IEX, BIEX also makes black-box use of a single-keyword SSE scheme. The two instantiations implemented in Clusion are BIEX-2Lev and BIEX-ZMF, providing the same tradeoffs between search and space discussed above.

Test code. Clusion includes test code that can serve as examples of how to use its various constructions. All the tests currently run in memory and all the algorithms run on the same machine. That is, the tests do not implement any client/server scenarios. This is because Clusion currently does not include a client/server infrastructure (this is something we are working on and will be adding soon).

Future Work

There are many ways in which Clusion can be improved and we are currently working on several new features. Here are a few that we plan to release in upcoming versions:

  • Client/server infrastructure: as stated earlier, Clusion does not currently have any client/server code. This is a limitation for usage in cloud scenarios where the client encrypts its data before sending it to a cloud provider and the latter executes the search. We are working on this feature and will release something soon.

  • I/O-efficient encrypted indexes: while the 2Lev construction is designed to be I/O-efficient, this is not the case for the rest of the constructions implemented in Clusion. We are currently experimenting with ways to make non-I/O-efficient constructions more I/O-efficient in practice.

Clusion is an open source library and, as for any open source project, we hope that the community will use it and help improve it. If you are interested in helping or if you are using Clusion for a project, please let us know.

Tags// encrypted search, clusion,
More Reading
Newer// Signal Search
Older// Encrypting Distributed Databases with Provable Security

玻璃钢生产厂家新密室内不锈钢玻璃钢人物雕塑济南南海玻璃钢人物雕塑南昌商场美陈费用黑龙江扁鹊人物玻璃钢雕塑制作福州玻璃钢仿真水果雕塑厂家吉安定制玻璃钢雕塑价位湖北京剧豫剧脸谱玻璃钢雕塑厂家南昌人物玻璃钢雕塑公司邯郸标识标牌玻璃钢人物雕塑抚顺玻璃钢雕塑生产厂家重庆商场装饰美陈四月商场布置樱花美陈公众号玻璃钢仿铜小孩人物雕塑河北玻璃钢雕塑雕刻公司平凉卡通玻璃钢雕塑设计海伦镜面玻璃钢雕塑厂玻璃钢雕塑材料制作有哪些玻璃钢人像雕塑联系方式北京玻璃钢伟人雕塑出厂价格南通玻璃钢雕塑武威玻璃钢植物雕塑哪家好富阳区玻璃钢金属雕塑山西户外装饰玻璃钢美陈雕塑赣州佛像玻璃钢雕塑定做价格玻璃钢雕塑修改安阳市玻璃钢雕塑定制玻璃钢雕塑 西安美陈玻璃钢卡通雕塑代理商宁安玻璃钢伟人像雕塑华阴玻璃钢游乐场门头雕塑香港通过《维护国家安全条例》两大学生合买彩票中奖一人不认账让美丽中国“从细节出发”19岁小伙救下5人后溺亡 多方发声单亲妈妈陷入热恋 14岁儿子报警汪小菲曝离婚始末遭遇山火的松茸之乡雅江山火三名扑火人员牺牲系谣言何赛飞追着代拍打萧美琴窜访捷克 外交部回应卫健委通报少年有偿捐血浆16次猝死手机成瘾是影响睡眠质量重要因素高校汽车撞人致3死16伤 司机系学生315晚会后胖东来又人满为患了小米汽车超级工厂正式揭幕中国拥有亿元资产的家庭达13.3万户周杰伦一审败诉网易男孩8年未见母亲被告知被遗忘许家印被限制高消费饲养员用铁锨驱打大熊猫被辞退男子被猫抓伤后确诊“猫抓病”特朗普无法缴纳4.54亿美元罚金倪萍分享减重40斤方法联合利华开始重组张家界的山上“长”满了韩国人?张立群任西安交通大学校长杨倩无缘巴黎奥运“重生之我在北大当嫡校长”黑马情侣提车了专访95后高颜值猪保姆考生莫言也上北大硕士复试名单了网友洛杉矶偶遇贾玲专家建议不必谈骨泥色变沉迷短剧的人就像掉进了杀猪盘奥巴马现身唐宁街 黑色着装引猜测七年后宇文玥被薅头发捞上岸事业单位女子向同事水杯投不明物质凯特王妃现身!外出购物视频曝光河南驻马店通报西平中学跳楼事件王树国卸任西安交大校长 师生送别恒大被罚41.75亿到底怎么缴男子被流浪猫绊倒 投喂者赔24万房客欠租失踪 房东直发愁西双版纳热带植物园回应蜉蝣大爆发钱人豪晒法院裁定实锤抄袭外国人感慨凌晨的中国很安全胖东来员工每周单休无小长假白宫:哈马斯三号人物被杀测试车高速逃费 小米:已补缴老人退休金被冒领16年 金额超20万

玻璃钢生产厂家 XML地图 TXT地图 虚拟主机 SEO 网站制作 网站优化