what is it? Shredded Document Reconstruction via Pairwise Seam Classification. In short, a neural network scores how well each pair of strip edges fits together, and those scores are used to put a shredded document back in order so you don't have to.
accuracy: 82.7%
built with: PyTorch, Streamlit, OpenCV, ResNet-18, Hugging Face
deep-dive nerd stats
the tech stack
- Libraries: PyTorch, Streamlit, OpenCV, NumPy, Plotly, Matplotlib, Seaborn, Hugging Face, Pillow, pdf2image (with Poppler).
- Hardware: Rented from Vast.ai: 2x RTX 3070, an AMD EPYC 7B12 64-core processor, and 128 GB of DDR4 RAM. Accessed over SSH from a terminal, with FileZilla for file transfer.
the model: SeamResNet
- Architecture: ResNet-18 backbone initialized with ImageNet weights, so the pretrained low-level feature extractors (edges, textures) are available from the start. A custom dense block performs the final classification.
- Early Fusion: The two strips are combined into a single input before the first convolution, so each filter's receptive field spans the seam and covers both strips at once. This lets the network detect micro-texture spatial coherence across the cut.
- Adversarial Training: Hard negative mining generates pairs that are semantically similar but spatially misaligned. Training on them forces the model to penalize even small alignment errors, counteracting the translation invariance that convolutional networks otherwise tend toward.
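The early fusion and hard negative ideas above can be sketched on toy data. This is only an illustration, not the project's actual pipeline: plain Python lists stand in for image tensors, and `fuse_seam`, `hard_negative`, `edge_width`, and `max_shift` are hypothetical names. It also assumes a hard negative is the true neighbour shifted vertically by a few rows, which is one plausible way to get "same content, wrong alignment".

```python
import random

def fuse_seam(left_strip, right_strip, edge_width=2):
    """Early fusion: join the right edge of left_strip and the left edge of
    right_strip row by row, so one filter can see both sides of the seam."""
    if len(left_strip) != len(right_strip):
        raise ValueError("strips must have the same height")
    return [l_row[-edge_width:] + r_row[:edge_width]
            for l_row, r_row in zip(left_strip, right_strip)]

def hard_negative(left_strip, right_strip, max_shift=3, rng=random):
    """Assumed hard negative: the true neighbour, cyclically shifted by a few
    rows. Same content, wrong alignment, so the label is 0 (not a seam)."""
    if not 0 < max_shift < len(right_strip):
        raise ValueError("max_shift must be between 1 and strip height - 1")
    shift = rng.choice([s for s in range(-max_shift, max_shift + 1) if s != 0])
    shifted = right_strip[-shift:] + right_strip[:-shift]  # roll rows by `shift`
    return fuse_seam(left_strip, shifted, edge_width=1), 0

# Toy 6x4 "page" with unique pixel values, cut into two vertical strips.
page = [[10 * r + c for c in range(4)] for r in range(6)]
left = [row[:2] for row in page]
right = [row[2:] for row in page]

positive = fuse_seam(left, right, edge_width=1)  # true seam, label 1
negative, label = hard_negative(left, right, max_shift=2, rng=random.Random(0))
```

In the positive example each fused row holds pixels that were adjacent on the original page. In the negative, the left column is unchanged and the right column holds the same pixels in shifted rows, which is the kind of small misalignment the classifier is meant to reject.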
the data
- Streamed in real time from the RVL-CDIP dataset via Hugging Face.
- 400,000 images (320k train, 40k validation, 40k test) split across 16 categories such as forms, emails, and invoices.
- Total size: ~35 GB.
limitations
- Currently limited strictly to vertical strip-cut documents.
- The Multi-start Greedy Heuristic used to order the strips is a bottleneck and is not guaranteed to find the optimal sequence.
- Sensitive to heavy colors and photographs. For standard, lightly colored documents, a quick grayscale preprocessing step restores performance.
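For a feel of what a multi-start greedy ordering does and why it can miss the best answer, here is a minimal sketch. It assumes a precomputed matrix `scores[i][j]` holding the classifier's confidence that strip `j` sits immediately to the right of strip `i`. The function names and the toy scores are hypothetical, not taken from the project.

```python
def greedy_order(scores, start):
    """Grow a left-to-right ordering from `start`, always appending the unused
    strip with the highest seam score against the current rightmost strip."""
    n = len(scores)
    order, used, total = [start], {start}, 0.0
    while len(order) < n:
        last = order[-1]
        best = max((j for j in range(n) if j not in used),
                   key=lambda j: scores[last][j])
        total += scores[last][best]
        order.append(best)
        used.add(best)
    return order, total

def multi_start_greedy(scores):
    """Run one greedy pass per candidate leftmost strip; keep the best total."""
    return max((greedy_order(scores, s) for s in range(len(scores))),
               key=lambda result: result[1])

# Toy scores for 4 strips whose true order is 2, 0, 3, 1.
scores = [
    [0.0, 0.1, 0.1, 0.8],
    [0.1, 0.0, 0.1, 0.1],
    [0.9, 0.1, 0.0, 0.1],
    [0.1, 0.95, 0.1, 0.0],
]
order, total = multi_start_greedy(scores)
print(order)  # prints [2, 0, 3, 1]
```

Each pass commits to the locally best neighbour and never backtracks, so one confident but wrong seam score early on can derail the rest of that pass. Trying every strip as the start reduces the risk without removing it. In this sketch the ordering costs on the order of n^3 score lookups for n strips, on top of the n(n-1) classifier calls needed to fill the matrix.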