[Resource Topic] 2024/2062: Two Halves Make a Whole: How to Reconcile Soundness and Robustness in Watermarking for Large Language Models

system · December 23, 2024, 6:50am

Welcome to the resource topic for 2024/2062

Title:
Two Halves Make a Whole: How to Reconcile Soundness and Robustness in Watermarking for Large Language Models

Authors: Lei Fan, Chenhao Tang, Weicheng Yang, Hong-Sheng Zhou

Abstract:

Watermarking techniques have been used to safeguard AI-generated content. In this paper, we study publicly detectable watermarking schemes (Fairoze et al.), and have several research findings.

First, we observe that two important security properties, robustness and soundness, may conflict with each other. We then formally investigate these two properties in the presence of an arguably more realistic adversary that we call the editing-adversary, and we prove an impossibility result: the robustness and soundness properties cannot both be achieved by a single publicly-detectable watermarking scheme.

Second, we demonstrate our main result: we for the first time introduce the new concept of publicly-detectable dual watermarking scheme, for AI-generated content. We provide a novel construction by using two publicly-detectable watermarking schemes; each of the two watermarking schemes can achieve “half” of the two required properties: one can achieve robustness, and the other can achieve soundness. Eventually, we can combine the two halves into a whole, and achieve the robustness and soundness properties at the same time. Our construction has been implemented and evaluated.

ePrint: https://eprint.iacr.org/2024/2062

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.