Repository for the articles and code of the CNPq project "sistema de legendagem de imagens" (image captioning system).

loioladev/cnpq-caption-ia

Image Captioning System

This repository contains the research and the implementation of an Image Captioning System that received an Honorable Mention at the 30th Congress of Scientific Initiation (UnB) and 21st Congress of the Federal District, Brazil. The project combines natural language processing (NLP) and computer vision techniques to generate descriptive captions for images. It explores deep learning methodologies from the field, such as Convolutional Neural Networks (CNNs) for image processing and Recurrent Neural Networks (RNNs), particularly LSTMs, for text generation.

Honorable Mention

Project Overview

The image captioning system is divided into two main parts:

  1. Research: The research section includes a detailed literature review and experiments exploring various image captioning models, architectures, and techniques.
  2. Project: The implementation part focuses on building a working prototype using state-of-the-art deep learning models.

Installation

To run the project locally, follow these steps:

  1. Clone the repository:

    git clone https://github.com/loioladev/cnpq-caption-ia.git
    cd cnpq-caption-ia
  2. Create a virtual environment and install dependencies:

    python -m venv venv
    source venv/bin/activate
    ./requirements.sh
  3. Download and prepare datasets before training the models.
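
As an illustration of the preparation mentioned in step 3, the sketch below converts one annotation line in the widely used YOLO text format (`class x_center y_center width height`, with coordinates normalized to the image size) into pixel coordinates. Whether the datasets in this repository use that exact layout is an assumption, and `yolo_line_to_pixels` and `PixelBox` are hypothetical names introduced here, not part of the project.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Corner-style bounding box in pixel units (hypothetical helper type)."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def yolo_line_to_pixels(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert 'class xc yc w h' (normalized to [0, 1]) into a pixel-space box.

    Assumes the common YOLO label layout; the project's datasets may differ.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:])
    # The format stores the box center, so shift by half the width and height.
    return PixelBox(
        class_id,
        (xc - w / 2) * img_w,
        (yc - h / 2) * img_h,
        (xc + w / 2) * img_w,
        (yc + h / 2) * img_h,
    )


box = yolo_line_to_pixels("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
print(box)
```

A check like this can help catch malformed annotation lines before a long training run starts.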

Usage

The implementation has two main components: YOLO object detection and LSTM text generation. Each component has its own scripts and notebooks for training and evaluation.
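
To make the text generation side more concrete, here is a minimal sketch of greedy caption decoding: the last generated word is fed back in and the highest-scoring next word is kept until an end token appears. The README does not describe how the detector output reaches the generator, so the initial state, the `<start>`/`<end>` tokens, `greedy_caption`, and `toy_step` (a fixed lookup table standing in for a trained LSTM step) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical special tokens; the project's real vocabulary is not shown here.
START, END = "<start>", "<end>"

State = Tuple[str, ...]
StepFn = Callable[[str, State], Tuple[Dict[str, float], State]]


def greedy_caption(step: StepFn, init_state: State, max_len: int = 20) -> List[str]:
    """Feed the last word back in and keep the highest-scoring next word."""
    words: List[str] = []
    token, state = START, init_state
    for _ in range(max_len):
        scores, state = step(token, state)
        token = max(scores, key=scores.get)  # greedy choice, no beam search
        if token == END:
            break
        words.append(token)
    return words


def toy_step(token: str, state: State) -> Tuple[Dict[str, float], State]:
    """Stand-in for one trained LSTM step: a fixed table, not a real model."""
    table = {
        START: {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, END: 0.2},
        "dog": {"runs": 0.6, END: 0.4},
        "runs": {END: 1.0},
    }
    return table[token], state + (token,)


# Assumption: the initial state carries information from the vision side,
# for example the labels of the detected objects.
caption = greedy_caption(toy_step, init_state=("dog",))
print(" ".join(caption))
```

In a real model the step function would wrap the recurrent cell and its output layer. Greedy decoding is the simplest choice, and beam search is a common alternative when caption quality matters more than speed.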

Datasets

YOLO Datasets

The YOLO datasets are used to train the object detection model. They are available at the following links:

LSTM Datasets

The LSTM datasets are used to train the text generation model. They are available at the following links:

License

This project is licensed under the MIT License. See the LICENSE file for details.
