green-db table cannot be joined with scraping table based on id #78

@BigDatalex

Description

Currently it is not possible to relate a row of the scraping table to its corresponding extracted product information in the green-db table via id. To join the two tables, we have to match on timestamp, url and category instead.
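To illustrate the current workaround, here is a minimal sketch using an in-memory SQLite database. The table names `scraping` and `green_db`, the `html` and `name` columns, and the sample values are simplified assumptions for illustration only, not the project's real schema.

```python
import sqlite3

# Hypothetical, simplified schemas; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scraping (
        id INTEGER PRIMARY KEY, timestamp TEXT, url TEXT, category TEXT, html TEXT
    );
    CREATE TABLE green_db (
        id INTEGER PRIMARY KEY AUTOINCREMENT, timestamp TEXT, url TEXT,
        category TEXT, name TEXT
    );
    """
)
conn.execute(
    "INSERT INTO scraping VALUES (?, ?, ?, ?, ?)",
    (17, "2022-01-01T00:00:00", "https://example.com/p/1", "SHIRT", "<html>...</html>"),
)
# The green_db id is autogenerated, so it is unrelated to the scraping id (17).
conn.execute(
    "INSERT INTO green_db (timestamp, url, category, name) VALUES (?, ?, ?, ?)",
    ("2022-01-01T00:00:00", "https://example.com/p/1", "SHIRT", "Example shirt"),
)

# Without a shared id, the join has to match on three columns.
rows = conn.execute(
    """
    SELECT s.id, g.id, g.name, s.html
    FROM scraping AS s
    JOIN green_db AS g
      ON s.timestamp = g.timestamp AND s.url = g.url AND s.category = g.category
    """
).fetchall()
print(rows)
```

The two id values in the result differ, which is why the id alone cannot be used as the join key today.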

We already use the id to retrieve a specific row from the scraping table, but it is not passed along when the extracted product information is written into the green-db, see:

scraped_page = CONNECTION_FOR_TABLE[table_name].get_scraped_page(id=row_id)
if product := extract_product(table_name=table_name, scraped_page=scraped_page):
    green_db_connection.write(product)

The green-db table already has an id column, but this is autogenerated, see:

id = Column(INTEGER, nullable=False, autoincrement=True, primary_key=True)

So integrating this shouldn't be a lot of work, and it would help whenever we want to use information from the scraping table together with the green-db table, for example using the HTML together with the extracted product information for some ML.
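One possible shape of the change, sketched with SQLite rather than the project's actual database layer: the scraping row's id is passed through when the product is written, so both tables share the same key. The helper `write_product`, the table layout, and the sample data are hypothetical and only meant to show the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scraping (id INTEGER PRIMARY KEY, html TEXT);
    -- In this sketch the id is always supplied by the caller.
    CREATE TABLE green_db (id INTEGER PRIMARY KEY, name TEXT);
    """
)


def write_product(connection, row_id, product):
    """Hypothetical writer: store the product under the scraping row's id."""
    connection.execute(
        "INSERT INTO green_db (id, name) VALUES (?, ?)",
        (row_id, product["name"]),
    )


row_id = 42
conn.execute("INSERT INTO scraping VALUES (?, ?)", (row_id, "<html>...</html>"))
write_product(conn, row_id, {"name": "Example shirt"})

# With a shared id, the join needs only a single column.
joined = conn.execute(
    """
    SELECT g.id, g.name, s.html
    FROM green_db AS g
    JOIN scraping AS s ON s.id = g.id
    """
).fetchall()
print(joined)
```

If several scraping tables feed a single green-db table, their ids might collide. In that case the table name would probably need to be stored alongside the id, but that depends on the actual setup.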

Labels: enhancement