A lightweight Java implementation of a TextRank-style algorithm for automatic text summarization. The project builds a graph in which sentences are nodes and edges are weighted by sentence similarity. Instead of the PageRank-style iteration used by the original TextRank, it measures centrality with Dijkstra's shortest-path algorithm and extracts the most central sentences to produce a concise summary.
- Text Input & Processing: Accept multi-sentence text and split it into sentences automatically
- Graph Construction: Build similarity graph between sentences based on word overlap
- Centrality Scoring: Calculate sentence importance using Dijkstra's shortest path algorithm
- Automatic Summarization: Generate summaries by selecting top-ranked sentences
- Keyword Extraction: Extract and rank important keywords from the text
- Sentence Search: Linear search functionality to find sentences containing specific keywords
- Graph Visualization: Display the adjacency list representation of the sentence graph
- BFS Traversal: Explore the graph using Breadth-First Search starting from any sentence
- Statistics Display: Show text statistics including sentence count, word count, and graph metrics
- Stack-based Operations: Find the top-ranked sentence using a stack
- Interactive Menu: User-friendly console interface with numbered menu options
- Java Development Kit (JDK) 8 or higher
- No external dependencies required (uses only standard Java libraries)
- Clone or download the project files
- Navigate to the project directory
- Compile the Java source files:
    javac -d bin src/*.java src/*/*.java
- Run the application:
    java -cp bin Main
- Launch the application using the command above
- Use the interactive menu to:
  - Input Text: Enter your text (end with empty line)
  - Generate Summary: Create automatic summary with top keywords and sentences
  - Display Graph: View the sentence similarity graph
  - Search Sentences: Find sentences containing specific keywords
  - BFS Traversal: Explore graph connectivity
  - View Statistics: Check text and graph metrics
  - Find Top 1: Get most important sentence using stack operations
1. Input Teks (Input Text)
-> Enter your text paragraph here
-> Press Enter on an empty line to finish
2. Generate Ringkasan (Generate Summary)
-> View top keywords
-> See sentence rankings with centrality scores
-> Get automatic summary (top 3 sentences)
- Text Preprocessing:
  - Split input text into individual sentences
  - Tokenize sentences into words
  - Remove stopwords (English and Indonesian)
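The preprocessing steps above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the class name `PreprocessSketch`, the regular expressions, and the tiny stopword list are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PreprocessSketch {
    // Tiny illustrative stopword list (assumption); a real bilingual list is much larger.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "the", "a", "is", "and", "of", "yang", "dan", "di", "ini"));

    // Split text into sentences at '.', '!' or '?' followed by whitespace.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : text.trim().split("(?<=[.!?])\\s+")) {
            if (!part.isEmpty()) {
                sentences.add(part);
            }
        }
        return sentences;
    }

    // Lowercase the sentence, split on non-letter characters, and drop stopwords.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String word : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!word.isEmpty() && !STOPWORDS.contains(word)) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "The graph is small. Kucing dan anjing bermain di taman!";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence + " -> " + tokenize(sentence));
        }
    }
}
```

Splitting on punctuation followed by whitespace is a simple heuristic; abbreviations such as "Dr." would be split incorrectly, which may or may not matter for short inputs.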
- Graph Construction:
  - Each sentence becomes a node
  - Edge weights represent sentence similarity (based on shared words)
  - Distance = 100 - (similarity_score × 10)
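A minimal sketch of how the distance formula above could be applied to tokenized sentences. Several details here are assumptions: the similarity score is taken to be the raw count of distinct shared words, distances are floored at 1 so they stay positive, every pair of sentences gets an edge, and a matrix is used for brevity where the project uses an adjacency list. `GraphSketch` and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphSketch {
    // Number of distinct words that two tokenized sentences have in common.
    static int sharedWords(List<String> a, List<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(new HashSet<>(b));
        return common.size();
    }

    // Distance = 100 - (similarity * 10); the floor of 1 is an added assumption
    // that keeps weights positive, as Dijkstra's algorithm requires.
    static int distance(int similarity) {
        return Math.max(1, 100 - similarity * 10);
    }

    // Symmetric distance matrix; the diagonal stays 0.
    static int[][] buildDistanceMatrix(List<List<String>> sentences) {
        int n = sentences.size();
        int[][] dist = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                int d = distance(sharedWords(sentences.get(i), sentences.get(j)));
                dist[i][j] = d;
                dist[j][i] = d;
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("graph", "nodes", "edges"),
                Arrays.asList("graph", "edges", "weights"),
                Arrays.asList("summary", "sentences"));
        // Prints [[0, 80, 100], [80, 0, 100], [100, 100, 0]]
        System.out.println(Arrays.deepToString(buildDistanceMatrix(sentences)));
    }
}
```

More shared words give a smaller distance, so similar sentences sit "closer" in the graph.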
- Centrality Calculation:
  - Use Dijkstra's algorithm from each node
  - Calculate total distance to all other nodes
  - Lower total distance = higher centrality (more important)
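The centrality step described above might look like the following sketch. It assumes a dense distance matrix in which 0 off the diagonal means "no edge", and it uses `java.util.PriorityQueue` rather than the project's own structures; `CentralitySketch` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class CentralitySketch {
    // Dijkstra's algorithm over a distance matrix (0 off the diagonal = no edge).
    static int[] shortestPaths(int[][] graph, int source) {
        int n = graph.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        // Queue entries are {node, distanceSoFar}, ordered by distance.
        PriorityQueue<int[]> queue =
                new PriorityQueue<>((x, y) -> Integer.compare(x[1], y[1]));
        queue.add(new int[] {source, 0});
        while (!queue.isEmpty()) {
            int[] current = queue.poll();
            int u = current[0];
            if (current[1] > dist[u]) {
                continue; // stale entry: a shorter path to u was already processed
            }
            for (int v = 0; v < n; v++) {
                if (graph[u][v] > 0 && dist[u] + graph[u][v] < dist[v]) {
                    dist[v] = dist[u] + graph[u][v];
                    queue.add(new int[] {v, dist[v]});
                }
            }
        }
        return dist;
    }

    // Sum of shortest-path distances from one node to all others; lower = more central.
    static long totalDistance(int[][] graph, int source) {
        long total = 0;
        for (int d : shortestPaths(graph, source)) {
            total += d;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] graph = {
                {0, 20, 100},
                {20, 0, 30},
                {100, 30, 0}};
        for (int node = 0; node < graph.length; node++) {
            System.out.println("node " + node + " total = " + totalDistance(graph, node));
        }
        // Totals are 70, 50 and 80, so node 1 is the most central.
    }
}
```

In this example the shortest route from node 0 to node 2 goes through node 1 (20 + 30 = 50) instead of the direct edge (100), which is how indirect connections influence the score.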
- Summary Generation:
  - Rank sentences by centrality score
  - Select top N sentences (default: 3)
  - Reorder to maintain original sequence
  - Combine into coherent summary
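The selection and reordering steps above can be illustrated with a short sketch. It assumes each sentence already has a total-distance score (lower is better) and uses `List.sort` from the standard library in place of the project's own merge sort; `SummarySketch.summarize` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SummarySketch {
    static String summarize(List<String> sentences, long[] totalDistance, int topN) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < sentences.size(); i++) {
            order.add(i);
        }
        // Rank: a smaller total distance means higher centrality.
        order.sort(Comparator.comparingLong(i -> totalDistance[i]));
        // Keep the top N indices, then restore the original sentence order.
        List<Integer> chosen =
                new ArrayList<>(order.subList(0, Math.min(topN, order.size())));
        Collections.sort(chosen);
        StringBuilder summary = new StringBuilder();
        for (int index : chosen) {
            if (summary.length() > 0) {
                summary.append(' ');
            }
            summary.append(sentences.get(index));
        }
        return summary.toString();
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
                "Graphs model text.",
                "Sentences become nodes.",
                "This line is filler.",
                "Edges encode similarity.");
        long[] totals = {300, 120, 250, 90};
        // Prints: Sentences become nodes. Edges encode similarity.
        System.out.println(summarize(sentences, totals, 2));
    }
}
```

Sorting the chosen indices before joining is what keeps the summary in reading order even though the fourth sentence ranked first.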
- LinkedList: Custom linked-list implementation used as a dynamically sized list
- Graph: Adjacency list representation
- Priority Queue: For Dijkstra's algorithm
- Stack: For navigation and top-1 finding operations
- Queue: For BFS traversal
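As an illustration of how a queue drives BFS over an adjacency list, here is a minimal sketch. It substitutes `java.util.ArrayDeque` for the project's custom queue and represents sentences by integer indices; `BfsSketch` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class BfsSketch {
    // Returns node indices in the order BFS visits them, starting from 'start'.
    static List<Integer> bfs(List<List<Integer>> adjacency, int start) {
        List<Integer> visitOrder = new ArrayList<>();
        boolean[] visited = new boolean[adjacency.size()];
        Queue<Integer> queue = new ArrayDeque<>();
        visited[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int node = queue.remove();
            visitOrder.add(node);
            for (int neighbor : adjacency.get(node)) {
                if (!visited[neighbor]) {
                    visited[neighbor] = true; // mark on enqueue to avoid duplicates
                    queue.add(neighbor);
                }
            }
        }
        return visitOrder;
    }

    public static void main(String[] args) {
        // Sentence 0 links to 1 and 2; sentence 1 also links to 3.
        List<List<Integer>> adjacency = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(0, 3),
                Arrays.asList(0),
                Arrays.asList(1));
        System.out.println(bfs(adjacency, 0)); // prints [0, 1, 2, 3]
    }
}
```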
TextRank-Lite/
├── .gitignore
├── README.md
├── bin/                          # Compiled class files
└── src/
    ├── app/
    │   └── Main.java             # Main application with menu interface
    ├── algorithms/
    │   ├── Dijkstra.java         # Shortest path algorithm
    │   ├── Searcher.java         # Linear search functionality
    │   └── Sorter.java           # Sorting algorithms (merge sort)
    ├── graph/
    │   ├── Edge.java             # Graph edge representation
    │   └── TextGraph.java        # Graph construction and operations
    ├── nlp/
    │   ├── Sentence.java         # Sentence representation
    │   └── SentenceScore.java    # Sentence scoring
    └── structures/
        ├── Linkedlist.java       # Custom linked list
        ├── NavigationStack.java  # Stack implementation
        └── Queue.java            # Queue implementation
- Counts overlapping words between sentences
- Normalized by sentence length
- Used to determine edge weights in the graph
- Ranks sentences by graph centrality, in the spirit of TextRank
- Uses Dijkstra's algorithm to compute shortest-path distances from each sentence
- Considers both direct and indirect connections
- Frequency-based ranking
- Stopword filtering (bilingual: English + Indonesian)
- Minimum word length filtering
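The three keyword rules above could be combined as in the sketch below. The stopword list, the minimum length of 3, and the alphabetical tie-break are assumptions chosen for illustration; `KeywordSketch.topKeywords` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class KeywordSketch {
    // Tiny illustrative bilingual stopword list (assumption).
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "the", "and", "with", "yang", "dan", "untuk"));
    // Hypothetical threshold; the project's actual minimum length is not stated.
    static final int MIN_LENGTH = 3;

    static List<String> topKeywords(String text, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (word.length() >= MIN_LENGTH && !STOPWORDS.contains(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<String> keywords = new ArrayList<>(counts.keySet());
        // Most frequent first; ties are broken alphabetically for a stable result.
        keywords.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(keywords.subList(0, Math.min(limit, keywords.size())));
    }

    public static void main(String[] args) {
        String text = "The graph links sentences. The graph ranks sentences with the graph.";
        System.out.println(topKeywords(text, 3)); // prints [graph, sentences, links]
    }
}
```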
- I Kadek Mahesa Permana Putra F1D02410052
- Muhammad Ravi Rayvansyah F1D02410078
- Fadila Rosidatul A’la F1D02410042
- Istiqomah Virginia F1D02410116
- Nurhidayah Maulidia F1D02410022
- Islam Ahmed Fouad Abunima F1D02411003
- Mohanad R. M. Abumattar F1D02411002
This project is open source and available under the MIT License.
This implementation was developed as part of the Algorithm and Data Structure course project, demonstrating practical application of graph algorithms, data structures, and algorithmic thinking in natural language processing tasks.