Abstract:
Reconstruction of shredded documents remains a significant challenge.
Creating a better document reconstruction system enables not just recovery of
information accidentally lost but also understanding our limitations against
adversaries' attempts to gain access to information. Existing approaches to
reconstructing shredded documents adopt either a predominantly manual (e.g.,
crowd-sourcing) or a near-automatic approach. We describe
\textit{Deshredder}, a visual analytic approach that scales well and
effectively incorporates user input to direct the reconstruction
process. Deshredder represents shredded pieces as time series and uses nearest
neighbor matching techniques that enable matching both the contours of
shredded pieces and the content of the shreds themselves. More
importantly, Deshredder's interface supports visual analytics through user
interaction with similarity matrices as well as higher-level assembly through
more complex stitching functions. We identify a functional task taxonomy
leading to design considerations for constructing deshredding solutions, and
describe how Deshredder applies to problems from the DARPA Shredder Challenge
through expert evaluations.