Right-protected data publishing with hierarchical clustering preservation

The emergence of cloud-based storage services is opening up new avenues in data exchange and data dissemination. This has amplified the interest in right-protection mechanisms to establish ownership in case of data leakage. Current right-protection technologies, however, rarely provide strong guarantees on the dataset utility after the protection process. This work presents techniques that explicitly address this shortcoming and provably preserve the outcome of certain mining operations. In particular, we take special care to guarantee that the outcome of hierarchical clustering operations remains the same before and after right protection. We encode data ownership using watermarking principles. In the process, we derive fundamental bounds on the distortion incurred by the watermarking. We leverage our theoretical analysis to design fast algorithms for right protection without exhaustively searching the vast design space.
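
To make the preservation property concrete, the following minimal sketch checks whether a watermark leaves the hierarchical clustering outcome unchanged. It is an illustration under stated assumptions, not the scheme developed in this work: the additive, key-seeded watermark in `embed_watermark`, the choice of single linkage, and all function names are hypothetical stand-ins. The sketch compares only the merge order of the dendrogram, not the merge heights.

```python
import math
import random


def pairwise(points):
    """Return {(i, j): Euclidean distance} for all index pairs i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    }


def single_linkage_merges(points):
    """Naive single-linkage agglomerative clustering.

    Returns the merge sequence as a list of unordered pairs of clusters
    (dendrogram topology only; merge heights are ignored).
    """
    dist = pairwise(points)
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(
                    dist[(min(i, j), max(i, j))]
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merges.append(frozenset([clusters[a], clusters[b]]))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


def embed_watermark(points, key, power):
    """Hypothetical watermark (an assumption for illustration):
    shift every coordinate by +power or -power, chosen by a key-seeded RNG."""
    rng = random.Random(key)
    return [
        tuple(x + power * rng.choice((-1.0, 1.0)) for x in p)
        for p in points
    ]


def preserves_clustering(points, key, power):
    """True if the merge sequence is identical before and after watermarking."""
    marked = embed_watermark(points, key, power)
    return single_linkage_merges(points) == single_linkage_merges(marked)


# Toy dataset with well-separated pairwise distances.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 2.0), (20.0, 20.0)]
print(preserves_clustering(points, key=42, power=0.01))
```

In this toy dataset the pairwise distances that decide each merge differ by at least 1, while a watermark of power 0.01 changes any distance by at most about 0.03, so the merge order cannot change. A larger power may or may not alter the dendrogram, depending on the key. Searching for the largest power that still passes such a check by brute force is the kind of exhaustive exploration that distortion bounds are meant to avoid.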



