[Resource Topic] 2024/586: Encryption Based Covert Channel for Large Language Models

Welcome to the resource topic for 2024/586

Title:
Encryption Based Covert Channel for Large Language Models

Authors: Yongge Wang

Abstract:

Transformer neural networks have gained significant traction since their introduction, becoming pivotal across diverse domains. Particularly in large language models like Claude and ChatGPT, the transformer architecture has demonstrated remarkable efficacy. This paper provides a concise overview of transformer neural networks and delves into their security considerations, focusing on covert channel attacks and their implications for the safety of large language models. We present a covert channel utilizing encryption and demonstrate its efficacy in circumventing Claude.ai’s security measures. Our experiment reveals that Claude.ai appears to log our queries and blocks our attack within two days of our initial successful breach. This raises two concerns within the community: (1) The extensive logging of user inputs by large language models could pose privacy risks for users. (2) It may deter academic research on the security of such models due to the lack of experiment repeatability.

ePrint: https://eprint.iacr.org/2024/586

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics .