
Detect LLM: Paragraph-level Classification of AI-Generated Text


📍 Overview

This repository presents our solution for the Detect AI-generated Text Competition, a national-level data science competition hosted by Dacon in 2025.

The goal of the challenge was to detect whether a paragraph was written by a Large Language Model (LLM), with only document-level labels provided. This required re-engineering the data, designing weak supervision strategies, and building robust classifiers in a highly imbalanced setting.

Competition Link


📍 Competition Information

  • Host: Dacon
  • Track: Detect AI-generated Text
  • Evaluation Metric: ROC-AUC
  • Input: Document-level 'full_text' (Provided as train.csv)
  • Label: 'generated' (0 for human-written, 1 for AI-generated)
  • Goal: Classify each paragraph as human- or AI-generated

✔️ train.csv

| title | full_text | generated |
|---|---|---|
| 카호올라웨섬 | 카호올라웨섬은 하와이 제도를 구성하는 (중략...) | 0 |
| 청색거성 | 천문학에서 청색거성은 광도 분류 (중략...) | 0 |
| 수난곡 | 수난곡은 배우의 연기 없이 무대에 (중략...) | 1 |

✔️ test.csv

| ID | title | paragraph_index | paragraph_text |
|---|---|---|---|
| TEST_0000 | 공중 도덕의 의의와 필요성 | 0 | 도덕이란 원래 개인의 자각... |
| TEST_0001 | 공중 도덕의 의의와 필요성 | 1 | 도덕은 단순히 개인의 문제... |
| TEST_0002 | 공중 도덕의 의의와 필요성 | 2 | 여기에 이른바 공중도덕은... |

✔️ sample_submission.csv

| ID | generated |
|---|---|
| TEST_0000 | 0 |
| TEST_0001 | 0 |

📍 Data Analysis & Preprocessing

Our core challenge was to reconstruct paragraph-level labels from document-level data in an extremely imbalanced and noisy setting. To overcome this, we developed multiple weak supervision and filtering strategies, focusing on data-centric approaches.



➡️ 1) Data Restructuring

  • Each full_text (up to 9 paragraphs) was split into multiple paragraph_text units
  • For long paragraphs, we applied sliding window chunking with overlapping stride
  • Input format:
    "제목: {title} 본문: {paragraph_text}"
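The restructuring above can be sketched as follows. This is a minimal illustration, not the team's exact implementation: the `window` and `stride` sizes are hypothetical and measured in characters here for simplicity.

```python
def chunk_paragraph(text: str, window: int = 256, stride: int = 128) -> list[str]:
    """Split a long paragraph into overlapping windows (sliding-window chunking)."""
    if len(text) <= window:
        return [text]
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + window])
        if start + window >= len(text):  # last window already covers the end
            break
    return chunks


def build_input(title: str, paragraph_text: str) -> str:
    """Format one model input as described above."""
    return f"제목: {title} 본문: {paragraph_text}"
```

With a 50%-overlap stride, every character appears in at most two windows, so no paragraph content is lost at chunk boundaries.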

➡️ 2) Class Imbalance Handling

The dataset exhibited severe class imbalance:
approximately 10:1 ratio between generated=0 and generated=1 labels.

To address this, we performed both data augmentation and filtering for high-confidence positive samples.

  • Oversampling + Label Propagation
  • Filtering using Perplexity and Semantic Similarity

🔁 3) KANANA-based Positive Data Augmentation

We used a pretrained KoGPT model ("kanana") to generate synthetic paragraphs mimicking generated=1 style.
These augmented paragraphs were added to the training set to enrich the positive class.

  • Model: kakaocorp/kanana-1.5-2.1b-instruct-2505 (instruction-tuned KoGPT variant)
  • Prompt-based generation: Input includes preceding ([BEFORE]) and following ([AFTER]) paragraphs. The model generates a coherent middle ([TARGET]) paragraph mimicking generated=1 style.
  • Post-processing: Used kss for sentence segmentation; removed incomplete trailing sentences via regex-based ending-pattern matching (다, 요, 습니다, etc.); applied manual spot-checking to remove low-quality generations.
  • Labeling & Integration: All generated paragraphs were labeled generated=1 and added to the training set to enrich the positive class distribution.
  • Scale: Generated 11,317 synthetic paragraphs for augmentation.
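The ending-pattern filter described above might look like the sketch below. The exact regex the team used is not shown in this document, so the pattern here is a hypothetical stand-in; sentence segmentation with kss is assumed to have happened upstream.

```python
import re

# Hypothetical ending filter: a sentence is "complete" if it ends in a common
# Korean final ending (다 / 요 / 니다), optionally followed by punctuation.
COMPLETE_ENDING = re.compile(r"(다|요|니다)[.!?”']?\s*$")


def drop_incomplete_tail(sentences: list[str]) -> list[str]:
    """Remove trailing sentences that were cut off mid-generation."""
    out = list(sentences)
    while out and not COMPLETE_ENDING.search(out[-1]):
        out.pop()
    return out
```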

🔁 4) Weak Labeling & Filtering Strategies learn more ->

We experimented with various heuristic and unsupervised labeling techniques:

1. Perplexity-based Filtering

  • Used GPT-like models to compute perplexity scores
  • Sentences or paragraphs with low perplexity were considered likely AI-generated
  • Applied thresholding to extract generated=1 candidates
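The perplexity computation and thresholding can be sketched as follows. The threshold value is hypothetical, and `token_logprobs` stands in for per-token log-probabilities from a real LM scorer.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """PPL = exp(-mean token log-probability); lower means more predictable text."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ppl_candidates(paragraph_ppls: list[float], threshold: float = 20.0) -> list[int]:
    """Mark paragraphs whose PPL falls below the threshold as generated=1 candidates."""
    return [1 if ppl < threshold else 0 for ppl in paragraph_ppls]
```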

2. Style Feature + PPL Clustering

  • Extracted syntactic style features (e.g., average sentence length, token diversity)
  • Combined with perplexity
  • Performed KMeans or HDBSCAN clustering to isolate AI-like patterns
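The style features mentioned above can be computed roughly as below; whitespace tokenization and naive sentence splitting are simplifying assumptions. In practice these feature vectors were concatenated with perplexity and fed to KMeans or HDBSCAN.

```python
def style_features(paragraph: str) -> dict[str, float]:
    """Rough stylometric features: average sentence length and token diversity."""
    # Naive sentence split on terminal punctuation; whitespace tokenization.
    for mark in "!?":
        paragraph = paragraph.replace(mark, ".")
    sentences = [s for s in paragraph.split(".") if s.strip()]
    tokens = paragraph.split()
    return {
        "avg_sent_len": len(tokens) / max(len(sentences), 1),
        "token_diversity": len(set(tokens)) / max(len(tokens), 1),
    }
```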

3. HDBSCAN on Paragraph-level PPL

  • Applied sentence-wise perplexity estimation
  • Clustered using HDBSCAN to extract dense positive clusters
  • Label assigned to entire paragraph if one or more sentences clustered as AI
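The paragraph-level aggregation rule described above (one AI-clustered sentence flips the whole paragraph) reduces to a single `any()`:

```python
def aggregate_paragraph(sentence_in_ai_cluster: list[bool]) -> int:
    """generated=1 if at least one sentence fell into a dense AI-like cluster."""
    return int(any(sentence_in_ai_cluster))
```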

4. Perturbation-based Confidence Scoring

  • Modified (perturbed) the sentence input
  • Measured change in log-likelihood (LL) or model confidence
  • Used as a proxy for model “surprise” or fluency robustness
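A minimal sketch of this scoring, assuming `logprob_fn` stands in for a real LM log-likelihood scorer and using random token swaps as the perturbation; the actual perturbation scheme may have differed. The intuition (LLM text sits near a local likelihood optimum, so perturbations hurt it more) follows DetectGPT-style curvature scoring.

```python
import random


def perturbation_score(logprob_fn, sentence: str, n_perturb: int = 5, seed: int = 0) -> float:
    """Average log-likelihood drop after random token swaps."""
    rng = random.Random(seed)
    tokens = sentence.split()
    base = logprob_fn(sentence)
    drops = []
    for _ in range(n_perturb):
        perturbed = tokens[:]
        i, j = rng.sample(range(len(perturbed)), 2)  # swap two random tokens
        perturbed[i], perturbed[j] = perturbed[j], perturbed[i]
        drops.append(base - logprob_fn(" ".join(perturbed)))
    return sum(drops) / n_perturb
```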

📍 Modeling Strategy

➡️ Base Models:

Our first baseline used KLUE-RoBERTa (available on Hugging Face), the model with the best STS performance among the candidates below.

| Model | STS | TC |
|---|---|---|
| mBERT-base | 84.66 | 81.55 |
| XLM-R-base | 89.16 | 83.52 |
| KLUE-RoBERTa-base | 92.50 | 85.07 |

check more ->

After that, we experimented with many other models. learn more ->


➡️ Key Techniques:

  1. Sliding window over tokens with max pooling
  2. Sentence-level Perturbation + Perplexity filtering
  3. KLUE-RoBERTa embedding + cosine similarity for pseudo-labeling
  4. Model ensemble using ranking + voting
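Technique 3 (embedding + cosine similarity for pseudo-labeling) can be sketched as below. The centroid-and-margin scheme is an illustrative assumption; embeddings would come from KLUE-RoBERTa in practice.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pseudo_label(emb, pos_centroid, neg_centroid, margin: float = 0.1) -> int:
    """Pseudo-label only when the embedding is clearly closer to one class
    centroid; return -1 (abstain) inside the margin."""
    gap = cosine(emb, pos_centroid) - cosine(emb, neg_centroid)
    if gap > margin:
        return 1
    if gap < -margin:
        return 0
    return -1
```

Abstaining inside the margin keeps noisy borderline paragraphs out of the pseudo-labeled training set.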

⭐ Final Model:

Our final submission used an ensemble of three models, trained on a paragraph-level relabeled dataset.

➡️ Models

1. KLUE-RoBERTa-large (fine-tuned)

  • Chosen for its strong semantic representation capability, which is crucial for detecting subtle contextual inconsistencies in AI-generated text
  • Check training details here

2. KLUE-RoBERTa-base (fine-tuned)

  • Selected as a lighter and more generalizable counterpart to the large model, reducing overfitting risk on imbalanced data
  • Check training details here

3. CatBoost Classifier

  • Focused on stylometric anomalies often observed in AI-generated text, complementing semantic models
  • Features: unique_ratio, verb_ratio, entropy, polynomial interaction terms (degree=2), Perplexity (PPL) estimated via skt/kogpt2-base-v2
  • Check training details here
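Two of the listed stylometric features can be computed from token counts alone, as sketched below; verb_ratio would additionally need a Korean POS tagger, and PPL an LM (skt/kogpt2-base-v2), so both are omitted here.

```python
import math
from collections import Counter


def catboost_style_features(tokens: list[str]) -> dict[str, float]:
    """unique_ratio and Shannon entropy over the token distribution."""
    counts = Counter(tokens)
    n = len(tokens)
    probs = [c / n for c in counts.values()]
    return {
        "unique_ratio": len(counts) / n,
        "entropy": -sum(p * math.log2(p) for p in probs),
    }
```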

➡️ Ensemble Strategy

A custom Extreme Voting method was adopted to maximize ROC-AUC:

  • If at least two models output a probability ≥ 0.5 → max(probabilities) (optimistic consensus)
  • Otherwise → min(probabilities) (conservative fallback)
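The voting rule above translates directly to code:

```python
def extreme_vote(probs: list[float]) -> float:
    """Extreme Voting: if at least two models agree on the positive side
    (p >= 0.5), push the ensemble score up with max(); otherwise fall back
    to the most conservative score, min()."""
    if sum(p >= 0.5 for p in probs) >= 2:
        return max(probs)
    return min(probs)
```

Because ROC-AUC depends only on the ranking of scores, pushing confident positives toward 1 and uncertain cases toward 0 sharpens the separation between the classes.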

📍 Experiments & Results


📍 Project Structure


📍 Team & Contributions

Our team of 5 members divided responsibilities as follows:

Team Name: PokSSak(폭싹)

| Affiliation | Name | Role | GitHub |
|---|---|---|---|
| Sookmyung Women's University, Computer Science ('22) | 김소영 | Data Engineering, Relabeling & Model Development | soyoung2118 |
| Sookmyung Women's University, Data Science ('23) | 김수빈 | | |
| Sookmyung Women's University, Data Science ('23) | 오현서 | Data Engineering & Experiment Execution | |
| Sookmyung Women's University, Data Science ('23) | 원지우 | Data Engineering & Experiment Execution | |
| Sookmyung Women's University, Software Convergence ('22) | 임소정 | Data Engineering & Model Development | sophia |

We collaborated for 4 weeks.


📍 Stacks

Environment

Development

Communication


📍 Key Takeaways


📍 Appendix