- title: Engligbo
- emoji: ⚡
- colorFrom: purple
- colorTo: pink
- sdk: gradio
- sdk_version: 5.25.2
- app_file: app.py
- pinned: false
- license: mit
- short_description: Summarization and translation from English to Igbo
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Text summarization and translation are major NLP tasks. Most applications perform them separately, so I decided to experiment with combining both features into one. Engligbo is a tool that summarizes English text and translates the summary into Igbo, a major language in Nigeria. It uses Natural Language Processing (NLP) techniques: frequency-based extractive summarization followed by machine translation.
One of the first sets of lectures in this course covered text processing. We also did translation at some point using the encoder-decoder mechanism. Since translators are already available, I decided to integrate one of them into a text summarizer built with the knowledge from the text processing classes. As a Twitter (now X) user, I find it fascinating how bots can summarize entire threads. I have always wondered about the technology behind that, which is why I attempted something similar in this task. Initially, the goal was to summarize Twitter threads and translate them into Igbo, but time constraints ruled that out. Future versions may add Twitter support.

Key Features:
- Text Summarization: Engligbo can condense lengthy English texts into concise summaries, capturing the essential information.
- Igbo Translation: It translates the generated summaries into the Igbo language, making information accessible to a wider audience.
- Versatile Input Methods: You can input text by pasting it directly, uploading a file, or providing a website link.
This task was written in Python in a Google Colab notebook. It will subsequently be moved to a more stable environment. Other tools used include:
- NLTK: For text processing and summarization
- Google Cloud Translator API: For translating from English to Igbo
- Hugging Face Hub: To host the application
- Gradio: For the user interface
- Gemini: For debugging
This application is made up of three major parts: text extraction and summarization, translation, and the user interface.
- Functionality: This phase focuses on extracting text from various sources (pasted text, URL, or uploaded file) and generating a concise summary.
- Functions involved:
summarize_text(text, num_sentences=3): This function takes the input text and the desired number of sentences for the summary as input. It performs the following steps:
- Tokenizes the text into words and sentences.
- Removes stop words (common words like "the," "a," "is," etc.) to focus on important terms.
- Calculates word frequencies to identify the most significant words in the text.
- Scores sentences based on the frequencies of the words they contain.
- Selects the top-ranked sentences to form the summary.
- Returns the generated summary as a string.
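
The summarize_text steps described above can be sketched roughly as follows. This is a minimal illustration and not the app's actual code: it swaps NLTK's tokenizers and stop-word corpus for regular expressions and a tiny hard-coded stop-word set (both assumptions) so that it runs without any downloads. Returning the selected sentences in their original order is also a choice made for this sketch.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (assumption); the app relies on NLTK instead.
STOP_WORDS = {
    "the", "a", "an", "is", "are", "was", "were", "of", "and", "to",
    "in", "it", "that", "this", "on", "for", "with", "as", "by", "at",
}

WORD_RE = re.compile(r"[a-z']+")


def content_words(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in WORD_RE.findall(text.lower()) if w not in STOP_WORDS]


def summarize_text(text, num_sentences=3):
    # Split into sentences after '.', '!' or '?' followed by whitespace
    # (a regex stand-in for a proper sentence tokenizer).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole text, stop words excluded.
    freq = Counter(content_words(text))

    # Score each sentence by summing the frequencies of its content words.
    scores = {i: sum(freq[w] for w in content_words(s)) for i, s in enumerate(sentences)}

    # Keep the highest-scoring sentences, then restore document order.
    top = sorted(sorted(scores, key=scores.get, reverse=True)[:num_sentences])
    return " ".join(sentences[i] for i in top)
```

Because the score is a plain sum, longer sentences tend to rank higher; dividing by sentence length is one possible variation.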
get_text_from_url(url): If the user provides a URL, this function fetches the webpage content, extracts the text from HTML paragraphs, and returns the extracted text.
read_text_from_file(file_input): If the user uploads a file, this function reads the file content and returns it as a string.
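
As a rough idea of what the URL path might look like, the sketch below fetches a page and keeps only the text inside paragraph tags. It uses only the Python standard library, which is an assumption made to keep the example self-contained; the app may use other libraries. ParagraphExtractor and extract_paragraph_text are hypothetical helper names introduced for this illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ParagraphExtractor(HTMLParser):
    """Collects the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_p = False
        self._chunks = []
        self.paragraphs = []

    def flush(self):
        # Join the collected pieces and normalize whitespace.
        text = " ".join("".join(self._chunks).split())
        if text:
            self.paragraphs.append(text)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()  # also handles <p> tags that were never closed
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)


def extract_paragraph_text(html):
    """Return the text of all paragraphs in an HTML string, space-joined."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return " ".join(parser.paragraphs)


def get_text_from_url(url, timeout=10):
    """Fetch a webpage and return the text of its paragraphs."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_paragraph_text(html)
```

Keeping the parsing step separate from the network call makes it possible to check the extraction on a plain HTML string.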
- Functionality: This phase integrates with the Google Cloud Translate API to translate the generated English summary into Igbo.
- Functions involved: translate_text(text, target_language='ig'): This function utilizes the Google Cloud Translate API to translate the input text (the summary) into the target language (Igbo, represented by the language code 'ig'). It returns the translated text.
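
A minimal sketch of translate_text is shown below, assuming the google-cloud-translate package and its v2 client (translate_v2.Client), with credentials already configured in the environment. Which client version the app actually uses is not stated here, so treat this as one plausible shape. The optional client parameter is an addition for this sketch; it lets the function be exercised with a stand-in object.

```python
import html


def translate_text(text, target_language="ig", client=None):
    """Translate text into the target language ('ig' is Igbo)."""
    if client is None:
        # Assumption: google-cloud-translate is installed and application
        # credentials are available in the environment.
        from google.cloud import translate_v2 as translate

        client = translate.Client()

    result = client.translate(text, target_language=target_language)

    # The v2 API can return HTML entities (for example &#39;) in the
    # translated text, so decode them before displaying the result.
    return html.unescape(result["translatedText"])
```

Creating the client once and reusing it across requests would avoid repeated setup work in a long-running app.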
- Functionality: This phase creates the user interface using the Gradio library, allowing users to interact with the app and obtain the summarized and translated text.
- Functions involved: summarize_and_translate(input_method, text_input, link_input, file_input): This function serves as the main logic of the app. Based on the selected input method, it retrieves the input text, summarizes it using the summarize_text function, translates the summary using the translate_text function, and returns both the summary and the Igbo translation.
update_input_visibility(choice): This function dynamically updates the visibility of input fields based on the user's selected input method (text, link, or file). It ensures that only the relevant input field is displayed at a time.

gr.Blocks(), gr.Markdown(), gr.Radio(), gr.Textbox(), gr.File(), gr.Button(), demo.launch(): These functions from the Gradio library create and manage the user interface elements, handle user interactions, and launch the interactive web application.
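
The control flow of summarize_and_translate and update_input_visibility might look roughly like the sketch below. It deliberately leaves Gradio out so that it runs on its own. The input-method labels ("Paste Text", "Paste Link", "Upload File"), the factory function make_summarize_and_translate, the helper visible_fields, and the error messages are all assumptions made for illustration.

```python
def make_summarize_and_translate(summarize, translate, fetch_url, read_file):
    """Build the main handler from the helper functions (hypothetical factory)."""

    def summarize_and_translate(input_method, text_input, link_input, file_input):
        # Pick the text source that matches the selected input method.
        if input_method == "Paste Text":
            text = text_input
        elif input_method == "Paste Link":
            text = fetch_url(link_input)
        elif input_method == "Upload File":
            text = read_file(file_input)
        else:
            return "Unknown input method.", ""

        if not text or not text.strip():
            return "No text provided.", ""

        # Summarize first, then translate only the summary.
        summary = summarize(text)
        return summary, translate(summary)

    return summarize_and_translate


def visible_fields(choice):
    """Return (text_visible, link_visible, file_visible) for the chosen method."""
    return (choice == "Paste Text", choice == "Paste Link", choice == "Upload File")
```

In a Gradio app, each boolean from visible_fields would typically be wrapped as gr.update(visible=...) and returned from the Radio component's change handler, and summarize_and_translate would be attached to the button's click event.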
- Accessing the App: You can access the Engligbo app through a web interface. Open your web browser and navigate to the provided link or run it locally if you have the code.
- Inputting Text: Choose your preferred input method from the options provided:
- Paste Text: Directly paste the text you want to summarize and translate into the designated text box.
- Upload File: Upload a text file containing the content you want to process. Supported file formats may vary, so ensure your file is compatible.
- Paste Link: Provide a link to a webpage containing the text you want to summarize and translate. Engligbo will extract the relevant text from the webpage.
Once you've provided the input, click the "Submit" button. Engligbo will process the text, generate a concise summary, and translate the summary into Igbo. The summarized and translated outputs will be displayed in separate boxes.
- Viewing Results: Review the generated summary and Igbo translation. You can copy the results or download them for further use.
- File Upload Issues: If you encounter problems uploading files, ensure that the file format is supported and that the file size is within the allowed limits.
- Translation Errors: In case of translation inaccuracies, you can try rephrasing the input text or using a different input method.
- Connectivity Issues: Ensure that you have a stable internet connection. Translation goes through the Google Cloud Translate API, and the link input method needs to fetch the webpage, so both require network access.
- Customization: Integration of text-to-speech APIs, so that the text is not only summarized but also read aloud in various languages.
- Integration: Integration into social media applications like Twitter (X).
Engligbo provides a user-friendly platform for summarizing English text and translating it into the Igbo language. By following this guide, you can effectively utilize Engligbo's features to gain insights from large amounts of information and make them accessible to a wider audience.