@osipovartem
Collaborator

@osipovartem osipovartem commented May 6, 2025

Which issue does this PR close?

Rationale for this change

According to the Postgres docs, information_schema.tables should contain all tables and views defined in the current database, not tables from all catalogs.
The same applies to columns, schemas, and views.

What changes are included in this PR?

Use the resolved table reference to get the catalog name and pass it to the InformationSchemaProvider builder, so that tables are fetched only for that particular catalog instead of for all catalogs.
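
To illustrate the idea, here is a minimal sketch in plain Rust. It does not use DataFusion's actual API: the `Catalogs` map, `TableRow`, `resolve_catalog`, `tables_for_catalog`, `sample_catalogs`, and the catalog names `sales`, `hr`, and `default_db` are all hypothetical stand-ins. It only shows the intended behavior: take the catalog from the resolved table reference, then list rows for that one catalog.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a catalog list:
/// catalog name -> schema name -> table names.
pub type Catalogs = BTreeMap<String, BTreeMap<String, Vec<String>>>;

/// One row of a simplified `information_schema.tables` view.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableRow {
    pub table_catalog: String,
    pub table_schema: String,
    pub table_name: String,
}

/// Pick the catalog part of a reference such as
/// `sales.information_schema.tables`. A reference without a catalog
/// part falls back to the default catalog (an assumption of this sketch).
pub fn resolve_catalog<'a>(table_ref: &'a str, default_catalog: &'a str) -> &'a str {
    let parts: Vec<&str> = table_ref.split('.').collect();
    if parts.len() == 3 {
        parts[0]
    } else {
        default_catalog
    }
}

/// Build rows for a single catalog only, instead of walking every catalog.
pub fn tables_for_catalog(catalogs: &Catalogs, catalog: &str) -> Vec<TableRow> {
    let mut rows = Vec::new();
    if let Some(schemas) = catalogs.get(catalog) {
        for (schema, tables) in schemas {
            for table in tables {
                rows.push(TableRow {
                    table_catalog: catalog.to_string(),
                    table_schema: schema.clone(),
                    table_name: table.clone(),
                });
            }
        }
    }
    rows
}

/// Hypothetical sample data: two catalogs, each with one schema.
pub fn sample_catalogs() -> Catalogs {
    let mut catalogs = Catalogs::new();
    catalogs.entry("sales".to_string()).or_default().insert(
        "public".to_string(),
        vec!["orders".to_string(), "customers".to_string()],
    );
    catalogs.entry("hr".to_string()).or_default().insert(
        "public".to_string(),
        vec!["employees".to_string()],
    );
    catalogs
}
```

With this sample data, querying `sales.information_schema.tables` would yield only the tables under `sales`, and the `hr` tables would not appear.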

Are these changes tested?

Tested locally

Are there any user-facing changes?

No

@osipovartem osipovartem requested review from Vedin and rampage644 May 6, 2025 11:55
@osipovartem osipovartem changed the title Issues/701 information schema per catalog information schema tables, schemas, views per specific catalog May 6, 2025
@osipovartem osipovartem changed the title information schema tables, schemas, views per specific catalog information schema tables, columns, schemas, views per specific catalog May 6, 2025
@rampage644

What about having the information_schema implementation exclusively in the main repo, so that there is no dependency on upstream?

Information schema could be considered part of the catalog, and DataFusion recommends that each database has its own catalog implementation. The information schema in DataFusion could be viewed as a reference implementation, so perhaps it's quite okay to have a separate implementation?

@osipovartem
Collaborator Author

What about having the information_schema implementation exclusively in the main repo, so that there is no dependency on upstream?

Information schema could be considered part of the catalog, and DataFusion recommends that each database has its own catalog implementation. The information schema in DataFusion could be viewed as a reference implementation, so perhaps it's quite okay to have a separate implementation?

In that case we would have to make the list of existing builders public and rewrite just 3-4 of them to fetch data per catalog. We could then also update them to match the Snowflake implementation.
Or do you suggest copying the whole information schema related part into our repo and updating it there? (~1350 lines of code)

@rampage644

Or do you suggest copying the whole information schema related part into our repo and updating it there? (~1350 lines of code)

Yes, I was suggesting something like this. I am not yet sure how complex information schema support could get, but starting with a copy-paste as the initial implementation is perhaps not the craziest idea.

@rampage644

Should it be closed due to Embucket/embucket#718?

@osipovartem osipovartem closed this May 8, 2025