ailensgroup/awesome-tampered-text-detection

Awesome Tampered Text Detection

A curated collection of papers, datasets, code, and tools for detecting tampered text and forged documents.

Table of Contents

  1. Papers
  2. Key Datasets
  3. Text Tampering Methods
  4. References

📚 Papers

Paper Year Venue Method Dataset Train. Code Val. Code Infer. Code
Yang et al. [1] : Deep Learning + ELA fusion 2022 ISEEIE Deep learning + ELA fusion Ali Tianchi Competition - - -
Wang et al. [2] : S3R 2022 ECCV S3R T-IC13 - - -
Qu et al. [3] : DTD 2023 CVPR DTD DocTamper
Okamoto et al. [4] : FCN 2023 arXiv FCN (semantic segmentation) FD-VIED - - -
Jain [5] : K-Means, Decision Trees, Logistic Regression 2024 IRJET K-Means, Decision Trees, Logistic Regression - - - -
Chen et al. [6] : FFDN 2024 ECCV FFDN DocTamper - - -
Qu et al. [7] : TextSleuth / Tampered Text Detective 2024 arXiv TextSleuth / Tampered Text Detective ETTD - - -
Shao et al. [8] : DTL-ARob 2024 ECCV DTL-ARob DocTamper, T-SROIE - - -
Wang et al. [9] : SherlockNet 2024 IEEE TMM SherlockNet EN-HA - - -
Li et al. [10] : Dual-path framework 2024 arXiv Dual-path framework Suning Scene Data - - -
Liao et al. [11] : CTP-Net 2023 arXiv CTP-Net Fake Chinese Trademark (FCTM) - - -
Li et al. [12] : MA-Net 2024 IEEE TIFS Forgery trace enhancement + multiscale attention TextTamper
Ren et al. [13] : EMF-Net 2024 Expert Systems With Applications EMF-Net TMI12K - - -
Qu et al. [14] : DAF + Text Jitter 2025 AAAI DAF + Text Jitter OSTF
Duan et al. [15] : TTDMamba 2025 IJCV TTDMamba RealDTT - - -
Wong et al. [16] : ADCD-Net 2025 ICCV ADCD-Net DocTamper -
Li et al. [17] : DCLNet 2025 Signal Processing DCLNet DocTamper - - -
Nguyen et al. [18] : TALIU 2025 IEEE Access TALIU DocTamper - - -
Li et al. [19] : CD-SD 2025 JVCIR CD-SD DocTamper - - -
Li et al. [20] : Spatial-Frequency Fusion + Swin-T 2025 IET Spatial-Frequency Fusion + Swin-T DocTamper - - -
George & Marcel [21] : EdgeDoc 2025 arXiv EdgeDoc FantasyID - - -
Luo et al. [26] : ASC-Former 2025 Pattern Recognition ASC-Former RTM

📂 Key Datasets

A Dataset for Forgery Detection and Spotting in Document Images (2017) [22]

  • Introduction: Character-level ground truth on payslip documents with XML annotations for character bounding boxes and values; 677 images focusing on copy-paste tampering.
  • Link: Dataset Homepage

CMID — Copy-Move ID (2021) [23]

  • Introduction: Pixel-level masks for copy-move forgeries on ID documents. Includes separate genuine (304) and tampered (893) images.
  • Link: Dataset Homepage

FCD — Forged Character Detection Datasets (2022) [24]

  • Introduction: Bounding-box annotations for forged characters across passports, driving licences, and visa stickers; three 15K-image subsets.
  • Link: Implementation code | FCD-P | FCD-D | FCD-V

Tampered-IC13 (2022) [2]

  • Introduction: Bounding-box annotations over ICDAR 2013 scene text images, released alongside the S3R strategy; 229 training and 233 test images. Tampering generated via SRNet.
  • Link: Dataset Homepage

FSD — Forged Scanned Document (2023) [25]

  • Introduction: Pixel-level segmentation masks for forged documents built on FUNSD; covers copy-move, splicing, and resampling.
  • Link: Download Link

DocTamper (2023) [3]

  • Introduction: Large-scale pixel-level dataset for document tampering localization with multiple subsets (contracts, invoices, receipts, noisy office, scanned receipts).
  • Link: Dataset Homepage

ETTD — Explainable Tampered Text Detection Dataset (2024) [7]

  • Introduction: Pixel-level masks with natural language rationales for documents, IDs, and scene texts; includes ETTD-Train, ETTD-Test, and class-agnostic ETTD-CD.
  • Link: Not Available

TextTamper (2024) [12]

  • Introduction: Pixel-level annotations for text tampering localization across certificates, documents, and tables; created with rule-based, Poisson, and deep image blending.
  • Link: Dataset Homepage

OSTF — Open-set Scene Text Forensics (2025) [14]

  • Introduction: Open-set benchmark on scene text with bounding boxes; evaluates multiple generation/erasure models (Derend, SRNet, STEFANN, MOSTEL, DiffSTE, AnyText, Textdiff, UDiffText) on ICDAR 2013, ReCTS, TextOCR, etc.
  • Link: Dataset Homepage

RealDTT — Real-world Comprehensive Dataset of Tampered Text Images (2025) [15]

  • Introduction: Pixel-level segmentation across scene text and documents from MARIO-LAION, FUNSD, ReCTS, LSVT, RCTW; includes Photoshop, STEFANN, MOSTEL, VATr, ViTEraser, SRNet, DiffSTE, AnyText, UDiffText, TextDiffuser subsets.
  • Link: Dataset Homepage

RTM — Real Text Manipulation (2025) [26]

  • Introduction: Pixel-level binary masks for varied manipulations (copy-move, splicing, insertion, inpainting, coverage) across charts, receipts, certificates, scanned docs, table-heavy pages; train/test from SROIE, FUNSD, TNCR, volunteers.
  • Link: Dataset Homepage

Comparison Table

| Dataset | Language | Annotation level | Subset | Domain |
|---|---|---|---|---|
| A Dataset for Forgery Detection and Spotting in Document Images [22] | French | Character-level (XML) | - | Payslips |
| CMID (Copy-Move ID) [23] | French | Pixel-level | Genuine / Tampered | ID documents |
| FCD [24] | English | Bounding box | FCD-P / FCD-D / FCD-V | Passports, Driving Licences, Visa Stickers |
| Tampered-IC13 [2] | English | Bounding box | Train / Test | Scene texts |
| FSD (Forged Scanned Document) [25] | English | Pixel-level (segmentation) | Train / Test / Authentic | Scanned documents |
| DocTamper [3] | English, Chinese | Pixel-level (segmentation) | Train / Test / DocTamper-FCD / DocTamper-SCD | Contracts, Invoices, Receipts, Text pages |
| ETTD [7] | English, Chinese | Pixel-level + Natural language | Train / Test / ETTD-CD | Documents, ID cards, Scene texts |
| TextTamper [12] | Chinese | Pixel-level | Train / Val | Certificates, Documents, Tables |
| OSTF [14] | Chinese | Bounding box | Model-specific splits | Scene text |
| RealDTT [15] | Various | Pixel-level (segmentation) | Photoshop / STEFANN / MOSTEL / VATr / ViTEraser / SRNet / DiffSTE / AnyText / UDiffText / TextDiffuser | Scene text, Documents |
| RTM [26] | English | Pixel-level (binary masks) | Train / Test | Charts, Receipts, Certificates, Scanned documents, Tables |
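
The annotation levels listed above differ in granularity: a pixel-level mask can be reduced to a bounding box, but a box cannot be turned back into a mask. The following is a minimal sketch of that reduction, not code from any of the listed datasets. It assumes a mask has already been loaded as a nested list of 0/1 values (a hypothetical in-memory representation; the datasets themselves ship image files), and the function name `mask_to_box` is illustrative.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # assumed representation: rows of 0/1 pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tight box around all non-zero pixels, or None if the mask is empty."""
    # Rows that contain at least one tampered pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one tampered pixel.
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    return (cols[0], rows[0], cols[-1], rows[-1])


example = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(mask_to_box(example))
```

This sketch yields one box for the whole mask. Separating several tampered regions into individual boxes would additionally need connected-component labelling, which is omitted here.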

🛠️ Text Tampering Methods

Overview Table for Text Tampering

| Model Name | Year | Venue | Category | Key Contribution / Core Idea | Architecture | GitHub / Code Status | Paper Link |
|---|---|---|---|---|---|---|---|
| SRNet [27] | 2019 | ACM MM | GAN-Based Editing | Early GAN-based end-to-end network for text replacement. | GAN-based | Official Repo | Paper |
| EnsNet [28] | 2019 | AAAI | Text Editing, Inpainting, Removal | An early text removal model using a feature-level attention block. | GAN-based (U-Net with attention) | Official Repo | Paper |
| SwapText [29] | 2020 | CVPR | GAN-Based Editing | Robust three-stage GAN for text replacement and background preservation. | Three-stage GAN | N/A | Paper |
| STEFANN [30] | 2020 | CVPR | Character-Level & Font-Aware | Character-level editing preserving font structure and color. | Two-stage GAN (FANnet + Colornet) | Official Repo | Paper |
| dRENDER [31] | 2021 | ICCV | Parametric & De-rendering | Parses rendering parameters of stylized text for artifact-free re-rendering. | Differentiable Text Rendering Model | Official Repo | Paper |
| MOSTEL [32] | 2023 | AAAI | Stroke-Level & Fine-Grained | Stroke-level text editing using guidance maps for high glyph fidelity. | Generates stroke guidance maps; Semi-supervised hybrid learning | Official Repo | Paper |
| DiffUTE [33] | 2023 | NeurIPS | Universal Text Editing & Diffusion | Self-supervised diffusion model for high-fidelity text replacement/modification. | Diffusion Model with glyph/position guidance | Official Repo | Paper |
| DiffSTE [34] | 2023 | arXiv | Universal Text Editing & Diffusion | Diffusion-based scene text editing to modify styles and colors while preserving structure. | Diffusion Model | Official Repo | Paper |
| Magicremover [35] | 2023 | arXiv | Text Editing, Inpainting, Removal | Tuning-free text-guided image inpainting for text/object removal. | Diffusion Model (Stable Diffusion) | N/A | Paper |
| AnyText [36] | 2023 | arXiv | Multilingual & Cross-Language | Pioneering multilingual text generation/editing with an OCR-based text encoder. | Diffusion Model with auxiliary latent module | Official Repo | Paper |
| TextDiff [37] | 2023 | arXiv | Text Super-Resolution | Diffusion model that sharpens text by predicting the high-frequency residual. | Two-module framework: TEM + MRD (Residual Diffusion) | Official Repo | Paper |
| GlyphControl [38] | 2023 | NeurIPS | Character-Aware Synthesis | Glyph-conditional diffusion model for explicit control of text content, location, and size. | Conditional Diffusion Model | Official Repo | Paper |
| UDiffText [39] | 2024 | ECCV | Character-Aware Synthesis | Unified framework for text synthesis and editing using a character-level text encoder. | Fine-tuned Stable Diffusion with char-level encoder | Official Repo | Paper |
| TextCtrl [40] | 2024 | NeurIPS | Style-Preserving & Prior-Guided | Explicitly disentangles style and glyph structure priors for style preservation. | Diffusion Model with Style-Structure guidance | Official Repo | Paper |
| AnyText2 [41] | 2024 | arXiv | Multilingual & Cross-Language | Adds customizable font/color attributes and improves speed over AnyText. | Diffusion Model (WriteNet+AttnX) | Official Repo | Paper |
| TextMaster [43] | 2024 | arXiv | Style-Preserving & Prior-Guided | Universal controllable text editing with adaptive font and style injection. | Diffusion Model with Attention and Perceptual Loss | N/A | Paper |
| TextCrafter [42] | 2025 | arXiv | Character-Aware Synthesis | Precisely renders multiple texts with varying attributes by segmenting and rendering independently. | Diffusion Model with independent region rendering | Official Repo | Paper |
| GlyphMastero [44] | 2025 | CVPR | Stroke-Level & Fine-Grained | Specialized glyph encoder to guide a diffusion model for stroke-level precision. | Diffusion Model + Glyph Encoder | N/A | Paper |
| Qwen-Image [45] | 2025 | arXiv | Character-Aware Synthesis | Image foundation model with strong multilingual text rendering and editing. | Multimodal DiT (MMDiT) | Official Repo | Paper |

📖 References

[1] P. Yang, W. Fang, F. Zhang, L. Bai, Y. Gao, "Document Image Forgery Detection Based on Deep Learning Models," 2022 International Symposium on Electrical, Electronics and Information Engineering (ISEEIE), 2022, pp. 1-5. Paper

[2] Y. Wang, C. Liu, X. Liu, D. Peng, L. Jin, "SR3 for Tampered Text Detection," Proceedings of the European Conference on Computer Vision (ECCV), 2022, pp. 1234-1245. Paper

[3] C. Qu, Y. Liu, X. Liu, D. Peng, F. Guo, L. Jin, "DTD: Document Tampering Detection," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 5937-5946. Paper | Code

[4] Y. Okamoto, G. Osada, I. Yahiro, R. Hasegawa, P. Zhu, H. Kataoka, "Image Generation and Learning Strategy for Deep Document Forgery Detection," arXiv preprint arXiv:2311.03650, Nov. 2023. Paper

[5] J. Jain, "AI-Driven OCR for Fraud Detection in FinTech Income Verification Systems," International Research Journal of Engineering and Technology (IRJET), Vol. 11, Issue 12, Dec. 2024, pp. 850-854. Paper

[6] Z. Chen, S. Chen, T. Yao, K. Sun, S. Ding, X. Lin, L. Cao, R. Ji, "Enhancing Tampered Text Detection through Frequency Feature Fusion and Decomposition," in Proceedings of the European Conference on Computer Vision (ECCV), 2024, pp. 394-411. Paper

[7] C. Qu, J. Liu, H. Chen, B. Yu, J. Liu, W. Wang, L. Jin, "Explainable Tampered Text Detection via Multimodal Large Models," arXiv preprint arXiv:2412.14816v2, Dec. 2024. Paper

[8] H. Shao, Z. Qian, K. Huang, W. Wang, X. Huang, Q. Wang, "Delving into Adversarial Robustness on Document Tampering Localization," Proceedings of the European Conference on Computer Vision (ECCV), 2024. Paper

[9] J. Wang, L. Mou, C. Zheng, W. Gao, "Image-Based Freeform Handwriting Authentication With Energy-Oriented Self-Supervised Learning," IEEE Transactions on Multimedia, vol. 27, pp. 1397-1409, 2025. Paper

[10] G. Li, X. Yang, W. Ma, "A Two-Stage Dual-Path Framework for Text Tampering Detection and Recognition," arXiv preprint arXiv:2402.13545v2, Feb. 2024. Paper

[11] X. Liao, S. Chen, J. Chen, T. Wang, X. Li, "CTP-Net: Character Texture Perception Network for Document Image Forgery Localization," arXiv preprint arXiv:2308.02158v1, Aug. 2023. Paper

[12] B. Li, J. Xu, Y. Wang, Z. Wu, "Robust Text Image Tampering Localization via Forgery Traces Enhancement and Multiscale Attention," IEEE Transactions on Information Forensics and Security (TIFS), 2024. Paper | Code

[13] R. Ren, Q. Hao, F. Gu, S. Niu, J. Zhang, M. Wang, "EMF-Net: An Edge-Guided Multi-Feature Fusion Network for Text Manipulation Detection," Expert Systems with Applications, Vol. 249, Part A, 2024, 123548. Paper

[14] C. Qu, Y. Zhong, F. Guo, L. Jin, "Revisiting Tampered Scene Text Detection in the Era of Generative AI," Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 694–702, 2025. Paper | Code

[15] J. Duan, H. Sun, F. Ji, et al., "RealDTT: Towards A Comprehensive Real-World Dataset for Tampered Text Detection," International Journal of Computer Vision, 2025. Paper

[16] K. A. Wong, J. Zhou, H. Wu, Y.-W. Si, J. Zhou, "ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement," Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025. Paper | Code

[17] W. Li, B. Li, K. Zheng, S. Li, H. Li, "Document image forgery detection and localization in desensitization scenarios," Signal Processing, vol. 238, 110123, 2025. Paper

[18] A. D. Nguyen, H.-Y. Kim, H. N. Nguyen, "TALIU: A Novel Decoder and Augmentation Strategy for Boosting Tampered Document Image Detection," IEEE Access, vol. 13, pp. 70340-70351, 2025. Paper

[19] L. Li, Y. Bai, S. Zhang, M. Emam, "Document forgery detection based on spatial-frequency and multi-scale feature network," Journal of Visual Communication and Image Representation, vol. 107, 104393, 2025. Paper

[20] L. Li, K. Zhang, J. Lu, S. Zhang, N. Chu, "Multiclassification Tampering Detection Algorithm Based on Spatial-Frequency Fusion and Swin-T," IET Image Processing, vol. 19, 2025. Paper

[21] A. George, S. Marcel, "EdgeDoc: Hybrid CNN-Transformer Model for Accurate Forgery Detection and Localization in ID Documents," arXiv preprint arXiv:2508.16284, 2025. Paper

[22] N. Sidere, F. Cruz, M. Coustaty and J.-M. Ogier, "A dataset for forgery detection and spotting in document images," Seventh International Conference on Emerging Security Technologies (EST), Canterbury, UK, 2017, pp. 26–31. Paper

[23] G. Mahfoudi, F. Morain-Nicolier, F. Retraint and M. Pic, "CMID: A New Dataset for Copy-Move Forgeries on ID Documents," IEEE International Conference on Image Processing (ICIP), Anchorage, AK, USA, 2021, pp. 3028–3032. Paper

[24] T. Kumar, M. Turab, S. Talpur, R. Brennan and M. Bendechache, "Forged Character Detection Datasets: Passports, Driving Licences and Visa Stickers," International Journal of Artificial Intelligence & Applications, vol. 13, pp. 21–35, Mar. 2022. Paper

[25] A. K. Jaiswal, S. Singh and S. K. Tripathy, "FSD: A novel forged document dataset and baseline," 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, pp. 1–6, 2023. Paper

[26] D. Luo, Y. Liu, R. Yang, X. Liu, J. Zeng, Y. Zhou and X. Bai, "Toward real text manipulation detection: New dataset and new solution," Pattern Recognition, vol. 157, p. 110828, 2025. Paper

[27] L. Wu, C. Zhang, J. Liu, J. Han, J. Liu, E. Ding and X. Bai, "Editing text in the wild," in Proceedings of the 27th ACM International Conference on Multimedia, 2019, pp. 1500–1508. Paper | Code

[28] S. Zhang, Y. Liu, L. Jin, Y. Huang and S. Lai, "EnsNet: Ensconce text in the wild," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 1, 2019, pp. 801–808. Paper | Code

[29] Q. Yang, J. Huang and W. Lin, "SwapText: Image based texts transfer in scenes," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 14700–14709. Paper

[30] P. Roy, S. Bhattacharya, S. Ghosh and U. Pal, "STEFANN: Scene text editor using font adaptive neural network," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 13228–13237. Paper | Code

[31] W. Shimoda, D. Haraguchi, S. Uchida and K. Yamaguchi, "De-rendering stylized texts," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 1076–1085. Paper | Code

[32] Y. Qu, Q. Tan, H. Xie, J. Xu, Y. Wang and Y. Zhang, "Exploring stroke-level modifications for scene text editing," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 2, 2023, pp. 2119–2127. Paper | Code

[33] H. Chen, Z. Xu, Z. Gu, Y. Li, C. Meng, H. Zhu and W. Wang, "DiffUTE: Universal text editing diffusion model," Advances in Neural Information Processing Systems, vol. 36, pp. 63062–63074, 2023. Paper | Code

[34] J. Ji, G. Zhang, Z. Wang, B. Hou, Z. Zhang, B. Price and S. Chang, "Improving diffusion models for scene text editing with dual encoders," arXiv preprint arXiv:2304.05568, 2023. Paper | Code

[35] L. Yu, J. Yu, "Magicremover: Tuning-free text-guided image inpainting," arXiv preprint arXiv:2310.14428, 2023. Paper

[36] Y. Tuo, W. Xiang, J.-Y. He, Y. Geng and X. Xie, "AnyText: Multilingual visual text generation and editing," arXiv preprint arXiv:2311.03054, 2023. Paper | Code

[37] B. Liu, Z. Yang, P. Wang, J. Zhou, Z. Liu, Z. Song, Y. Liu and Y. Xiong, "TextDiff: Mask-guided residual diffusion models for scene text image super-resolution," arXiv preprint arXiv:2308.06743, 2023. Paper | Code

[38] Y. Yang, D. Gui, Y. Yuan, W. Liang, H. Ding, H. Hu and K. Chen, "GlyphControl: Glyph conditional control for visual text generation," Advances in Neural Information Processing Systems, vol. 36, pp. 44050–44066, 2023. Paper | Code

[39] Y. Zhao and Z. Lian, "UDiffText: A unified framework for high-quality text synthesis in arbitrary images via character-aware diffusion models," in Proceedings of the European Conference on Computer Vision, 2024, pp. 217–233. Paper | Code

[40] W. Zeng, Y. Shu, Z. Li, D. Yang and Y. Zhou, "TextCtrl: Diffusion-based scene text editing with prior guidance control," Advances in Neural Information Processing Systems, vol. 37, pp. 138569–138594, 2024. Paper | Code

[41] Y. Tuo, Y. Geng and L. Bo, "AnyText2: Visual text generation and editing with customizable attributes," arXiv preprint arXiv:2411.15245, 2024. Paper | Code

[42] N. Du, Z. Chen, S. Gao, Z. Chen, X. Chen, Z. Jiang, J. Yang and Y. Tai, "TextCrafter: Accurately rendering multiple texts in complex visual scenes," arXiv preprint arXiv:2503.23461, 2025. Paper | Code

[43] Z. Yan, J. Wang, A. Wang, Y. Li, W. Shang and R. Lin, "TextMaster: A unified framework for realistic text editing via glyph-style dual-control," arXiv preprint arXiv:2410.09879, 2024. Paper

[44] T. Wang, T. Liu, X. Qu, C. Wu, L. Liu and X. Hu, "GlyphMastero: A glyph encoder for high-fidelity scene text editing," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025, pp. 28523–28532. Paper

[45] C. Wu, J. Li, J. Zhou, J. Lin, K. Gao, K. Yan, S. Yin et al., "Qwen-Image technical report," arXiv preprint arXiv:2508.02324, 2025. Paper | Code


🤝 Contributing

Contributions are welcome! Please open an issue or pull request to add more papers, datasets, or implementations.


⭐ Acknowledgements

If you find this list helpful, please star ⭐ the repository to support the project.
