
Detect LLM: Paragraph-level Classification of AI-Generated Text


📍 Overview

This repository presents our solution for the Detect AI-generated Text Competition, a national-level data science competition hosted by Dacon in 2025.

The goal of the challenge was to detect whether a paragraph was written by a Large Language Model (LLM), with only document-level labels provided. This required re-engineering the data, designing weak supervision strategies, and building robust classifiers in a highly imbalanced setting.

Competition Link


📍 Competition Information

  • Host: Dacon
  • Track: Detect AI-generated Text
  • Evaluation Metric: ROC-AUC
  • Input: Document-level 'full_text' (Provided as train.csv)
  • Label: 'generated' (0 for human-written, 1 for AI-generated)
  • Goal: Classify each paragraph as human- or AI-generated

✔️ train.csv

| title | full_text | generated |
|---|---|---|
| 카호올라웨섬 | 카호올라웨섬은 하와이 제도를 구성하는 (중략...) | 0 |
| 청색거성 | 천문학에서 청색거성은 광도 분류 (중략...) | 0 |
| 수난곡 | 수난곡은 배우의 연기 없이 무대에 (중략...) | 1 |

✔️ test.csv

| ID | title | paragraph_index | paragraph_text |
|---|---|---|---|
| TEST_0000 | 공중 도덕의 의의와 필요성 | 0 | 도덕이란 원래 개인의 자각... |
| TEST_0001 | 공중 도덕의 의의와 필요성 | 1 | 도덕은 단순히 개인의 문제... |
| TEST_0002 | 공중 도덕의 의의와 필요성 | 2 | 여기에 이른바 공중도덕은... |

✔️ sample_submission.csv

| ID | generated |
|---|---|
| TEST_0000 | 0 |
| TEST_0001 | 0 |

📍 Data Analysis & Preprocessing

Our core challenge was to reconstruct paragraph-level labels from document-level data in an extremely imbalanced and noisy setting. To overcome this, we developed multiple weak supervision and filtering strategies, focusing on data-centric approaches.



➡️ 1) Data Restructuring

  • Each full_text (up to 9 paragraphs) was split into multiple paragraph_text units
  • For long paragraphs, we applied sliding window chunking with overlapping stride
  • Input format:
    "제목: {title} 본문: {paragraph_text}"
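The restructuring above can be sketched as follows. This is a minimal illustration, not the team's exact implementation: the `window` and `stride` sizes are hypothetical and measured in characters here for simplicity.

```python
def chunk_paragraph(text: str, window: int = 256, stride: int = 128) -> list[str]:
    """Split a long paragraph into overlapping windows (sliding-window chunking)."""
    if len(text) <= window:
        return [text]
    chunks = []
    for start in range(0, len(text), stride):
        chunks.append(text[start:start + window])
        if start + window >= len(text):  # last window already covers the end
            break
    return chunks


def build_input(title: str, paragraph_text: str) -> str:
    """Format one model input as described above."""
    return f"제목: {title} 본문: {paragraph_text}"
```

With a 50%-overlap stride, every character appears in at most two windows, so no paragraph content is lost at chunk boundaries.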

➡️ 2) Class Imbalance Handling

The dataset exhibited severe class imbalance:
approximately 10:1 ratio between generated=0 and generated=1 labels.

To address this, we performed both data augmentation and filtering for high-confidence positive samples.

  • Oversampling + Label Propagation
  • Filtering using Perplexity and Semantic Similarity

🔁 3) KANANA-based Positive Data Augmentation

We used a pretrained KoGPT model ("kanana") to generate synthetic paragraphs mimicking generated=1 style.
These augmented paragraphs were added to the training set to enrich the positive class.

  • Model: kakaocorp/kanana-1.5-2.1b-instruct-2505 (instruction-tuned KoGPT variant)
  • Prompt-based generation: Input includes preceding ([BEFORE]) and following ([AFTER]) paragraphs. The model generates a coherent middle ([TARGET]) paragraph mimicking generated=1 style.
  • Post-processing: Used kss for sentence segmentation; removed incomplete trailing sentences via regex-based ending-pattern matching (다, 요, 습니다, etc.); applied manual spot-checking to remove low-quality generations.
  • Labeling & Integration: All generated paragraphs were labeled generated=1 and added to the training set to enrich the positive class distribution.
  • Scale: Generated 11,317 synthetic paragraphs for augmentation.
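The ending-pattern filter described above might look like the sketch below. The exact regex the team used is not shown in this document, so the pattern here is a hypothetical stand-in; sentence segmentation with kss is assumed to have happened upstream.

```python
import re

# Hypothetical ending filter: a sentence is "complete" if it ends in a common
# Korean final ending (다 / 요 / 니다), optionally followed by punctuation.
COMPLETE_ENDING = re.compile(r"(다|요|니다)[.!?”']?\s*$")


def drop_incomplete_tail(sentences: list[str]) -> list[str]:
    """Remove trailing sentences that were cut off mid-generation."""
    out = list(sentences)
    while out and not COMPLETE_ENDING.search(out[-1]):
        out.pop()
    return out
```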

🔁 4) Weak Labeling & Filtering Strategies learn more ->

We experimented with various heuristic and unsupervised labeling techniques:

1. Perplexity-based Filtering

  • Used GPT-like models to compute perplexity scores
  • Sentences or paragraphs with low perplexity were considered likely AI-generated
  • Applied thresholding to extract generated=1 candidates
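The perplexity computation and thresholding can be sketched as follows. The threshold value is hypothetical, and `token_logprobs` stands in for per-token log-probabilities from a real LM scorer.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """PPL = exp(-mean token log-probability); lower means more predictable text."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ppl_candidates(paragraph_ppls: list[float], threshold: float = 20.0) -> list[int]:
    """Mark paragraphs whose PPL falls below the threshold as generated=1 candidates."""
    return [1 if ppl < threshold else 0 for ppl in paragraph_ppls]
```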

2. Style Feature + PPL Clustering

  • Extracted syntactic style features (e.g., average sentence length, token diversity)
  • Combined with perplexity
  • Performed KMeans or HDBSCAN clustering to isolate AI-like patterns
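The style features mentioned above can be computed roughly as below; whitespace tokenization and naive sentence splitting are simplifying assumptions. In practice these feature vectors were concatenated with perplexity and fed to KMeans or HDBSCAN.

```python
def style_features(paragraph: str) -> dict[str, float]:
    """Rough stylometric features: average sentence length and token diversity."""
    # Naive sentence split on terminal punctuation; whitespace tokenization.
    for mark in "!?":
        paragraph = paragraph.replace(mark, ".")
    sentences = [s for s in paragraph.split(".") if s.strip()]
    tokens = paragraph.split()
    return {
        "avg_sent_len": len(tokens) / max(len(sentences), 1),
        "token_diversity": len(set(tokens)) / max(len(tokens), 1),
    }
```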

3. HDBSCAN on Paragraph-level PPL

  • Applied sentence-wise perplexity estimation
  • Clustered using HDBSCAN to extract dense positive clusters
  • Label assigned to entire paragraph if one or more sentences clustered as AI
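The paragraph-level aggregation rule described above (one AI-clustered sentence flips the whole paragraph) reduces to a single `any()`:

```python
def aggregate_paragraph(sentence_in_ai_cluster: list[bool]) -> int:
    """generated=1 if at least one sentence fell into a dense AI-like cluster."""
    return int(any(sentence_in_ai_cluster))
```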

4. Perturbation-based Confidence Scoring

  • Modified (perturbed) the sentence input
  • Measured change in log-likelihood (LL) or model confidence
  • Used as a proxy for model “surprise” or fluency robustness
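A minimal sketch of this scoring, assuming `logprob_fn` stands in for a real LM log-likelihood scorer and using random token swaps as the perturbation; the actual perturbation scheme may have differed. The intuition (LLM text sits near a local likelihood optimum, so perturbations hurt it more) follows DetectGPT-style curvature scoring.

```python
import random


def perturbation_score(logprob_fn, sentence: str, n_perturb: int = 5, seed: int = 0) -> float:
    """Average log-likelihood drop after random token swaps."""
    rng = random.Random(seed)
    tokens = sentence.split()
    base = logprob_fn(sentence)
    drops = []
    for _ in range(n_perturb):
        perturbed = tokens[:]
        i, j = rng.sample(range(len(perturbed)), 2)  # swap two random tokens
        perturbed[i], perturbed[j] = perturbed[j], perturbed[i]
        drops.append(base - logprob_fn(" ".join(perturbed)))
    return sum(drops) / n_perturb
```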

📍 Modeling Strategy

➡️ Base Models:

Our first baseline used KLUE-RoBERTa (available on Hugging Face), the model with the best STS performance among the candidates below.

| Model | STS | TC |
|---|---|---|
| mBERT-base | 84.66 | 81.55 |
| XLM-R-base | 89.16 | 83.52 |
| KLUE-RoBERTa-base | 92.50 | 85.07 |

check more ->

After that, we experimented with many other models. learn more ->


➡️ Key Techniques:

  1. Sliding window over tokens with max pooling
  2. Sentence-level Perturbation + Perplexity filtering
  3. KLUE-RoBERTa embedding + cosine similarity for pseudo-labeling
  4. Model ensemble using ranking + voting
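Technique 3 (embedding + cosine similarity for pseudo-labeling) can be sketched as below. The centroid-and-margin scheme is an illustrative assumption; embeddings would come from KLUE-RoBERTa in practice.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pseudo_label(emb, pos_centroid, neg_centroid, margin: float = 0.1) -> int:
    """Pseudo-label only when the embedding is clearly closer to one class
    centroid; return -1 (abstain) inside the margin."""
    gap = cosine(emb, pos_centroid) - cosine(emb, neg_centroid)
    if gap > margin:
        return 1
    if gap < -margin:
        return 0
    return -1
```

Abstaining inside the margin keeps noisy borderline paragraphs out of the pseudo-labeled training set.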

⭐ Final Model:

Our final submission used an ensemble of three models, trained on a paragraph-level relabeled dataset.

➡️ Models

1. KLUE-RoBERTa-large (fine-tuned)

  • Chosen for its strong semantic representation capability, which is crucial for detecting subtle contextual inconsistencies in AI-generated text
  • Check training details here

2. KLUE-RoBERTa-base (fine-tuned)

  • Selected as a lighter and more generalizable counterpart to the large model, reducing overfitting risk on imbalanced data
  • Check training details here

3. CatBoost Classifier

  • Focused on stylometric anomalies often observed in AI-generated text, complementing semantic models
  • Features: unique_ratio, verb_ratio, entropy, polynomial interaction terms (degree=2), Perplexity (PPL) estimated via skt/kogpt2-base-v2
  • Check training details here
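Two of the listed stylometric features can be computed from token counts alone, as sketched below; verb_ratio would additionally need a Korean POS tagger, and PPL an LM (skt/kogpt2-base-v2), so both are omitted here.

```python
import math
from collections import Counter


def catboost_style_features(tokens: list[str]) -> dict[str, float]:
    """unique_ratio and Shannon entropy over the token distribution."""
    counts = Counter(tokens)
    n = len(tokens)
    probs = [c / n for c in counts.values()]
    return {
        "unique_ratio": len(counts) / n,
        "entropy": -sum(p * math.log2(p) for p in probs),
    }
```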

➡️ Ensemble Strategy

A custom Extreme Voting method was adopted to maximize ROC-AUC:

  • If at least two models output a probability ≥ 0.5 → max(probabilities) (optimistic consensus)
  • Otherwise → min(probabilities) (conservative fallback)
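The voting rule above translates directly to code:

```python
def extreme_vote(probs: list[float]) -> float:
    """Extreme Voting: if at least two models agree on the positive side
    (p >= 0.5), push the ensemble score up with max(); otherwise fall back
    to the most conservative score, min()."""
    if sum(p >= 0.5 for p in probs) >= 2:
        return max(probs)
    return min(probs)
```

Because ROC-AUC depends only on the ranking of scores, pushing confident positives toward 1 and uncertain cases toward 0 sharpens the separation between the classes.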

📍 Experiments & Results


📍 Project Structure


📍 Team & Contributions

Our team of 5 members divided responsibilities as follows:

Team Name: PokSSak(폭싹)

| Affiliation | Name | Role | GitHub |
|---|---|---|---|
| Sookmyung Women's University, Computer Science ('22) | 김소영 | Data Engineering, Relabeling & Model Development | soyoung2118 |
| Sookmyung Women's University, Data Science ('23) | 김수빈 | | |
| Sookmyung Women's University, Data Science ('23) | 오현서 | Data Engineering & Experiment Execution | |
| Sookmyung Women's University, Data Science ('23) | 원지우 | Data Engineering & Experiment Execution | |
| Sookmyung Women's University, Software Convergence ('22) | 임소정 | Data Engineering & Model Development | sophia |

We collaborated for 4 weeks.


📍 Stacks

Environment

Development

Communication


📍 Key Takeaways


📍 Appendix