empirical study

Sandcastles in the Storm: Revisiting the (Im)possibility of Strong Watermarking

Watermarking AI-generated text is critical for combating misuse. Yet recent theoretical work argues that any watermark can be erased via random walk attacks that perturb text while preserving quality. However, such attacks rely on two key …

Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking

Data augmentation techniques apply transformations to existing texts to generate additional data. The transformations may produce low-quality texts, where the meaning of the text is changed and the text may even be mangled beyond human comprehension. …

Sibylvariant Transformations for Robust Text Classification

The vast majority of text transformation techniques in NLP are inherently limited in their ability to expand input space coverage due to an implicit constraint to preserve the original class label. In this work, we propose the notion of sibylvariance …

Is Neuron Coverage a Meaningful Measure for Testing Deep Neural Networks?

Recent effort to test deep learning systems has produced an intuitive and compelling test criterion called neuron coverage (NC), which resembles the notion of traditional code coverage. NC measures the proportion of neurons activated in a neural …