Download training images from https://github.com/GWUvision/Hotels-50K/ (execute download_train.py).
Problems: the dataset is huge (1,027,871 images from 50,000 hotels and 92 major hotel chains), so downloading everything takes a long time. For the prototyping phase I only downloaded part of it.
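A minimal sketch of how the download can be restricted to a subset, assuming the Hotels-50K training metadata is a CSV with an image URL column; the file name, column index, output folder, and sample size here are assumptions, not the repo's actual interface:

```python
import csv
import random
import urllib.request
from pathlib import Path

# Assumed metadata file and column layout; check the Hotels-50K repo for the real ones.
TRAIN_CSV = "input/dataset/train_set.csv"
OUT_DIR = Path("images/train_subset")
SAMPLE_SIZE = 5000  # only download a small sample for prototyping

OUT_DIR.mkdir(parents=True, exist_ok=True)

with open(TRAIN_CSV, newline="") as f:
    rows = list(csv.reader(f))

# Randomly sample a subset of rows instead of downloading the full ~1M images.
for i, row in enumerate(random.sample(rows, min(SAMPLE_SIZE, len(rows)))):
    url = row[2]  # assumed: third column holds the image URL
    try:
        urllib.request.urlretrieve(url, OUT_DIR / f"{i}.jpg")
    except Exception as exc:
        print(f"skipping {url}: {exc}")
```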
create_database.py: create the PostgreSQL database 'hotel_ibs' with the following tables (a schema sketch follows below):
- hotel
- chain
- images
Problems: again, the dataset is huge, so I only created part of the entries.
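A minimal schema sketch for the three tables, assuming psycopg2; the column names and connection parameters are my own placeholders, not necessarily what create_database.py uses:

```python
import psycopg2

# Connect to the 'hotel_ibs' database (connection parameters are placeholders).
conn = psycopg2.connect(dbname="hotel_ibs", user="postgres", password="postgres", host="localhost")
cur = conn.cursor()

# Illustrative columns loosely mirroring the Hotels-50K metadata.
cur.execute("""
    CREATE TABLE IF NOT EXISTS chain (
        chain_id  INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE IF NOT EXISTS hotel (
        hotel_id  INTEGER PRIMARY KEY,
        name      TEXT,
        chain_id  INTEGER REFERENCES chain(chain_id)
    );
    CREATE TABLE IF NOT EXISTS images (
        image_id  SERIAL PRIMARY KEY,
        hotel_id  INTEGER REFERENCES hotel(hotel_id),
        s3_key    TEXT NOT NULL,
        tags      TEXT[]
    );
""")
conn.commit()
cur.close()
conn.close()
```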
tag_images.py: get tags with OpenAI CLIP
- install OpenAI CLIP:
  - git clone https://github.com/openai/CLIP.git
  - cd CLIP
  - pip install -r requirements.txt
  - pip install .
- run images through OpenAI CLIP
- save the tags to the relevant image_id (see the sketch after this list)
Problems:
- with the implementation I found, I have to supply the tag dictionary myself; the tags are not auto-generated from an internal CLIP vocabulary as I had thought.
- the larger the tag dictionary, the more memory CLIP needs to tag an image. My computer quickly ran out of memory for dictionaries above 200 words, and at 200 words it took about 30 s per image, so I decided to move the job to a Google Colab session to use T4 GPUs.
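A minimal sketch of the tagging step using the standard CLIP API; the tag list here is a short example of the kind of dictionary I supply, and the save_tags helper mentioned at the end is hypothetical:

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Example tag dictionary; the real one is larger (memory use grows with its size).
TAGS = ["bed", "bathroom", "pool", "restaurant", "lobby", "balcony", "gym"]
text_tokens = clip.tokenize(TAGS).to(device)

def tag_image(image_path, top_k=3):
    """Return the top_k tags whose CLIP text embeddings best match the image."""
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text_tokens)
        probs = logits_per_image.softmax(dim=-1).squeeze(0)
    best = probs.topk(min(top_k, len(TAGS))).indices.tolist()
    return [TAGS[i] for i in best]

# Hypothetical usage: save_tags(image_id, tag_image(path)) writes the tags to the images table.
```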
I already had a backend from another project and only had to make minor adjustments.
I already had a frontend from another project and only had to make minor adjustments. There is a text box and a "Search" button.
When the page loads, the information from the database is downloaded. When the user clicks the Search button, the tags are searched for the text in the text box. Images tagged with the search term are downloaded from the AWS S3 bucket and displayed. The images are cached to reduce download time on future searches.
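A minimal sketch of the backend side of that search flow, assuming a Flask route, the images table from the schema sketch above, and boto3 pre-signed URLs for the S3 downloads; the route name, bucket name, and column names are assumptions:

```python
import boto3
import psycopg2
from flask import Flask, jsonify, request

app = Flask(__name__)
s3 = boto3.client("s3")
BUCKET = "hotel-ibs-images"  # placeholder bucket name

@app.route("/search")
def search():
    term = request.args.get("q", "").strip().lower()
    conn = psycopg2.connect(dbname="hotel_ibs", user="postgres", host="localhost")
    cur = conn.cursor()
    # Find images whose tag array contains the search term.
    cur.execute("SELECT image_id, s3_key FROM images WHERE %s = ANY(tags)", (term,))
    rows = cur.fetchall()
    cur.close()
    conn.close()
    # Return pre-signed S3 URLs; the frontend caches the downloaded images.
    results = [
        {
            "image_id": image_id,
            "url": s3.generate_presigned_url(
                "get_object", Params={"Bucket": BUCKET, "Key": key}, ExpiresIn=3600
            ),
        }
        for image_id, key in rows
    ]
    return jsonify(results)

if __name__ == "__main__":
    app.run(port=5000)
```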
This is the first working state of the application:

The images from the original dataset only showed the insides of the rooms, and I wanted more photos of the hotel itself and its facilities (pool, restaurant, ...), so I created a script to download 10 Google images per hotel from a pre-filtered hotel list. I also reduced the number of hotels from xx thousand to xx hundred. In the end I had ~4000 images from the original dataset and ~2000 from Google. I retagged them, re-uploaded the new complete set (~6000 images) to AWS, and updated the database accordingly.
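A minimal sketch of the per-hotel download, assuming the icrawler package and a pre-filtered CSV of hotel names; the file name, column layout, and search keyword are assumptions:

```python
import csv
from icrawler.builtin import GoogleImageCrawler

# Assumed pre-filtered hotel list: one hotel name per row, in the first column.
with open("hotels_filtered.csv", newline="") as f:
    hotel_names = [row[0] for row in csv.reader(f) if row]

for name in hotel_names:
    crawler = GoogleImageCrawler(storage={"root_dir": f"images/google/{name}"})
    # Fetch roughly 10 photos per hotel, including facility shots (pool, restaurant, ...).
    crawler.crawl(keyword=f"{name} hotel", max_num=10)
```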