[Resource Topic] 2024/1424: A Waterlog for Detecting and Tracing Synthetic Text from Large Language Models

Welcome to the resource topic for 2024/1424

Title:
A Waterlog for Detecting and Tracing Synthetic Text from Large Language Models

Authors: Brennon Brimhall, Orion Weller, Matthew Green, Ian Miers

Abstract:

We propose waterlogs, a new direction to detect and trace synthetic text outputs from large language models based on transparency logs. Waterlogs offer major categorical advantages over watermarking: they (1) allow for the inclusion of arbitrary metadata to facilitate tracing, (2) are publicly verifiable by third parties, and (3) operate in a distributed manner while remaining robust and efficient.

Waterlogs rely on a verifiable Hamming distance index, a novel data structure that we describe, to efficiently search multi-dimensional semantic hashes of natural language embeddings in a verifiable manner. This data structure may be of independent interest.

We implement a waterlog, which we call DREDGE, and benchmark it using synthetic text generated by GPT-2 1.5B and OPT-13B; embeddings are generated via OpenAI’s text-embedding-ada-002 model. We provide empirical benchmarks on the efficiency of appending text to the log and querying it for matches. We compare our results to watermarking and outline areas for further research.
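
For readers new to the idea of searching semantic hashes by Hamming distance, here is a minimal sketch of the general technique. It is not the paper's verifiable index and not DREDGE: it assumes a random-hyperplane (sign-of-projection) hash and a plain linear scan, with no verifiability at all. All names (`make_hyperplanes`, `semantic_hash`, `query`) and the toy embeddings are hypothetical and chosen only for illustration.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian hyperplanes in `dim` dimensions (assumed hashing scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def semantic_hash(embedding, hyperplanes):
    """Pack the sign of each projection into an integer, one bit per hyperplane."""
    h = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, embedding))
        h = (h << 1) | (1 if dot >= 0 else 0)
    return h


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def query(log, h, radius):
    """Linear scan: return (entry_id, distance) for every logged hash within `radius` of `h`."""
    matches = []
    for entry_id, stored in log:
        d = hamming(h, stored)
        if d <= radius:
            matches.append((entry_id, d))
    return matches


# Toy usage: made-up 8-dimensional "embeddings", not real model outputs.
planes = make_hyperplanes(dim=8, bits=16)
original = [0.05 + 0.1 * i for i in range(8)]
log = [("entry-0", semantic_hash(original, planes))]
print(query(log, semantic_hash(original, planes), radius=2))
```

The intuition is that embeddings of similar texts tend to fall on the same side of most hyperplanes, so their hashes differ in few bits. A real waterlog would additionally need the lookup to be efficient and publicly verifiable, which this sketch does not attempt.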

ePrint: https://eprint.iacr.org/2024/1424

See all topics related to this paper.

Feel free to post resources that are related to this paper below.

Example resources include: implementations, explanation materials, talks, slides, links to previous discussions on other websites.

For more information, see the rules for Resource Topics.