This shared task aims to introduce and regularize the concept of word grouping in Indian languages. Given a plain-text sentence in an Indian language, the task is to output a sequence of word groups, where each group represents a semantically cohesive unit.
Example (words belonging to the same group are joined with a double underscore, `__`):
Input Sentence: कुक आइलैंड्स दक्षिण प्रशांत महासागर के बीच में पोलिनेशिया में स्थित एक द्वीप देश है , जिसका नूजीलैंड के साथ खुला सहयोग है ।
Output Sentence: कुक__आइलैंड्स दक्षिण प्रशांत__महासागर__के बीच में पोलिनेशिया__में स्थित एक द्वीप देश है , जिसका नूजीलैंड__के साथ खुला सहयोग है ।
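The output format in the example can be illustrated with a minimal sketch. It assumes that a word group is simply a list of tokens, that tokens inside a group are joined with `__`, and that groups are separated by single spaces, as the example suggests. The helper names `format_groups` and `parse_groups` are hypothetical and are not part of the task's tooling.

```python
# Assumption: "__" joins tokens inside a group, a single space separates groups.
SEP = "__"


def format_groups(groups):
    """Serialize word groups (each a list of tokens) into one output line."""
    return " ".join(SEP.join(tokens) for tokens in groups)


def parse_groups(line):
    """Split an output line back into word groups (inverse of format_groups)."""
    return [chunk.split(SEP) for chunk in line.split()]


# First few groups of the example sentence above.
groups = [
    ["कुक", "आइलैंड्स"],
    ["दक्षिण"],
    ["प्रशांत", "महासागर", "के"],
    ["बीच"],
    ["में"],
]
line = format_groups(groups)
print(line)  # कुक__आइलैंड्स दक्षिण प्रशांत__महासागर__के बीच में
```

A round trip through `parse_groups(format_groups(groups))` returns the original groups as long as no token itself contains `__`.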
The link to the shared task test phase is:
- Max Team Size: 4
- An individual cannot be part of multiple teams.
- Each team must submit from a single CodaBench account.
- A CodaBench account is required for participation.
Submissions are evaluated by exact matching. The evaluation script for the competition is score.py.
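As a rough illustration of exact-match evaluation, the sketch below counts a prediction as correct only when the whole output sentence equals the reference. This is not the official score.py: the sentence-level granularity, the whitespace stripping, and the function name `exact_match_score` are all assumptions made for the example.

```python
def exact_match_score(predictions, references):
    """Fraction of predicted output sentences identical to the reference.

    Assumption: matching is done per sentence, after stripping leading and
    trailing whitespace. The official score.py may behave differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


# Hypothetical toy data: the second prediction misses one grouping.
preds = ["कुक__आइलैंड्स दक्षिण", "नूजीलैंड के साथ"]
refs = ["कुक__आइलैंड्स दक्षिण", "नूजीलैंड__के साथ"]
print(exact_match_score(preds, refs))  # 0.5
```

Under this reading, a single wrong group boundary makes the whole sentence count as incorrect, so it can be useful to check locally for partially correct outputs as well.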
Your submission should be a .zip archive containing the predictions.csv file. The predictions.csv file should contain two columns named Input Sentence and Output Sentence. Submissions that do not conform to these requirements will not be evaluated by the system.
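One possible way to package a submission is sketched below using only the Python standard library. The column names and the predictions.csv file name come from the requirements above. The archive name `submission.zip`, the UTF-8 encoding, the placement of the CSV at the archive root, and the helper `write_submission` are assumptions made for illustration.

```python
import csv
import io
import tempfile
import zipfile
from pathlib import Path


def write_submission(pairs, out_dir="."):
    """Write predictions.csv from (input, output) pairs and zip it."""
    out_dir = Path(out_dir)
    csv_path = out_dir / "predictions.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Input Sentence", "Output Sentence"])  # required header
        writer.writerows(pairs)
    zip_path = out_dir / "submission.zip"  # assumed archive name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the CSV at the archive root (an assumption about the layout).
        zf.write(csv_path, arcname="predictions.csv")
    return zip_path


# Demo in a temporary directory: build the archive, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = write_submission([("नूजीलैंड के साथ", "नूजीलैंड__के साथ")], tmp)
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
        text = zf.read("predictions.csv").decode("utf-8")
    rows = list(csv.reader(io.StringIO(text, newline="")))

print(names)
print(rows[0])
```

Reading the archive back before uploading is a cheap check that the header row and the file name match what the system expects.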
All participating teams are expected to submit a system paper describing the methodologies they adopted and their findings.
For this task, adaptation of statistical and linguistic methodologies is highly encouraged.