Authors: Travis Hammond (thammo19@jhu.edu) and Quinton Wiley (qtwiley95@gmail.com)
Affiliation: Whiting School of Engineering, Johns Hopkins University
This repository contains the research paper, "Generalized Detection of AI-Generated Images Using Foundation Models," authored for the Advanced Applied Machine Learning course (EN.705.742.8VL.SP25) at Johns Hopkins University under Dr. Guven.
The full paper is available in this repository.
This repository does not contain the source code, only the final paper for the course.
The proliferation of advanced AI image generators poses significant challenges for digital forensics, content authenticity, and information integrity. This paper presents an improved detection model for AI-generated imagery that achieves high accuracy across a range of generation methods. We make two contributions centered on foundation models: an end-to-end finetuning approach for vision foundation models that significantly outperforms classification probe methods while preserving generalization, and a computationally efficient training strategy that uses autoencoder artifacts and transfers to full generative models. Our evaluation across multiple generators shows that DINOv2 achieves better cross-generator generalization than CLIP and AIMv2, and that performance improves further when the model is finetuned with carefully controlled learning rates. These findings challenge prevailing assumptions about feature freezing and provide practical methods for building robust, generalizable detection systems that can adapt to the rapidly evolving landscape of generative AI.
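The paper's code is not part of this repository. As a rough illustration of the distinction the abstract draws between a classification probe (frozen features) and end-to-end finetuning with a reduced learning rate on the pretrained weights, here is a minimal toy sketch. Everything in it is an assumption made for illustration: the "backbone" is a single scalar weight `w`, the "head" is a scalar weight `v` plus bias `b`, and the function name `train`, the learning rates, and the data are hypothetical and unrelated to the paper's actual models or settings.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, mode, head_lr=0.1, backbone_lr=0.001, steps=200):
    """Toy stand-in for probe vs. finetune (hypothetical, not the paper's code).

    'Backbone' = scalar weight w (pretend it is pretrained).
    'Head'     = scalar weight v and bias b (trained from scratch).
    """
    w, v, b = 1.0, 0.0, 0.0
    # Probe: backbone is frozen (learning rate 0).
    # Finetune: backbone is updated, but with a much smaller learning rate
    # than the head, so the pretrained weight moves only slightly.
    lr_w = 0.0 if mode == "probe" else backbone_lr
    n = len(data)
    for _ in range(steps):
        gw = gv = gb = 0.0
        for x, y in data:
            feat = w * x                    # backbone output
            p = sigmoid(v * feat + b)       # head prediction
            err = p - y                     # d(BCE loss)/d(logit)
            gv += err * feat
            gb += err
            gw += err * v * x               # chain rule through the head
        # Plain gradient descent with per-group learning rates.
        v -= head_lr * gv / n
        b -= head_lr * gb / n
        w -= lr_w * gw / n
    return w, v, b


# Tiny separable dataset: label 1 ("generated") for positive x, 0 otherwise.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w_probe, v_probe, b_probe = train(data, "probe")
w_ft, v_ft, b_ft = train(data, "finetune")
print("probe backbone weight:   ", w_probe)
print("finetune backbone weight:", w_ft)
```

In probe mode the backbone weight never leaves its initial value, while in finetune mode it drifts slowly alongside the head. In a real deep learning framework the same idea is usually expressed by giving the backbone and the head separate optimizer parameter groups with different learning rates.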
If you find this work useful in your research, please consider citing the paper:
@techreport{hammondwiley2025,
title = {Generalized Detection of AI-Generated Images Using Foundation Models},
author = {Hammond, Travis and Wiley, Quinton},
institution = {Johns Hopkins University},
year = {2025}
}