Conversation
class InstallationProgress(create_time_mixin(created_at=False, updated_at=False), Base):
Suggested change:
- class InstallationProgress(create_time_mixin(created_at=False, updated_at=False), Base):
+ class GitHubInstallationProgress(Base):
I guess so, given that we write "github" in the description. This way we don't confuse it with GitLab.
Also, create_time_mixin(created_at=False, updated_at=False) does nothing, right?
    account_created = Column(TIMESTAMP(timezone=True))
    fetch_started = Column(TIMESTAMP(timezone=True))
How is "account_created" different from "fetch_started"?
Moreover, we don't have an "account_created" event yet.
In the case of a re-fetch they will be different.
So for now we can cut this field, I think.
    precompute_started = Column(TIMESTAMP(timezone=True))
    precompute_completed = Column(TIMESTAMP(timezone=True))
This is reposet and Athenian account specific. We have to move those timestamps to the repository_sets table.
So this forces us to have some key in the state DB for github_account_ids.
We need a table that matches github_account_id against athenian_account_id.
You probably mean account_github_accounts?
If it contains both of these IDs, then yes, it is.
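A minimal sketch of the lookup discussed above, assuming that account_github_accounts holds one row per GitHub account with the owning Athenian account next to it. The column names (`id` for the GitHub account, `account_id` for the Athenian account), the helper name `athenian_account_for`, and the use of an in-memory SQLite database are illustrative assumptions, not the project's actual schema.

```python
import sqlite3
from typing import Optional


def athenian_account_for(db: sqlite3.Connection, github_account_id: int) -> Optional[int]:
    """Return the Athenian account that owns the given GitHub account, if any."""
    row = db.execute(
        "SELECT account_id FROM account_github_accounts WHERE id = ?",
        (github_account_id,),
    ).fetchone()
    return row[0] if row is not None else None


# Hypothetical schema and data, only to make the sketch runnable.
mapping_db = sqlite3.connect(":memory:")
mapping_db.execute(
    "CREATE TABLE account_github_accounts ("
    "id INTEGER PRIMARY KEY, account_id INTEGER NOT NULL)"
)
mapping_db.execute("INSERT INTO account_github_accounts (id, account_id) VALUES (1001, 1)")
```

With such a mapping, timestamps keyed by Athenian account can be resolved from a GitHub account ID without duplicating the ID in another table.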
    consistency_completed = Column(TIMESTAMP(timezone=True))
    precompute_started = Column(TIMESTAMP(timezone=True))
    precompute_completed = Column(TIMESTAMP(timezone=True))
    current_status = Column(Text())
Let's delete this for now. I need us to create something absolutely minimal and ASAP. If we add the status, we will have to think about which statuses we can have, what to update and when, consider different edge cases, etc.
No one is forcing us to use all the columns of the table from the beginning.
We can create the table "as is" and add the features to the pipeline and API when we are ready.
Let's add the extra fields with a migration in the future, to avoid temptation.
Sorry, but how is one text field making it harder? We already have a PR to MD that uses it.
In addition, the PR to cloud-common is ready too. If we cut the columns, it will slow down implementing the changes instead of getting some features faster.
And if we delete the precompute timestamps from this table, we will use all the remaining columns immediately after releasing the features that are in progress now.
Therefore, we can update the status field in the metadata whatever way we want, but we will not use it downstream. If it warms your heart to update an unused field, I won't stand in the way 😄
Yes, I'd still like to have that field for the bot.
Agreed about the precomputer. Will we need a second table then?
The precompute_* fields are dropped. The account_created field is dropped. The others (including status) are used by logic that is already written in metadata. The point is that status can hold an arbitrary value (apart from a few statuses that would be fixed), and this can be used for specific messages like "paused", "delayed", etc.
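The status convention described here (a few fixed values plus free-form messages) could be sketched as below. The thread does not enumerate the fixed statuses, so the set `FIXED_STATUSES`, its members, and the helper `describe_status` are assumptions for illustration; only "paused" and "delayed" are quoted from the discussion as examples of free-form messages.

```python
from typing import Optional

# Hypothetical fixed statuses; the real list is not given in the thread.
FIXED_STATUSES = frozenset({"fetching", "done"})


def describe_status(current_status: Optional[str]) -> str:
    """Classify a free-text status value, e.g. before a bot reports it."""
    if current_status is None:
        return "unknown"
    if current_status in FIXED_STATUSES:
        return f"fixed:{current_status}"
    # Anything else is treated as a free-form message such as "paused".
    return f"custom:{current_status}"
```

Keeping the column as plain text means new messages need no schema change, at the cost that consumers must tolerate values they do not recognize.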
@dennwc we have a "precomputed" boolean field in the "repository_sets" table,
so it would be right to add a couple of new fields to it and provide logic to fill these fields from the precomputer directly (not with an event handler).
It makes sense to write to metadata 👍
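A rough sketch of the precomputer filling the repository_sets timestamps directly, without an event handler. It uses an in-memory SQLite table as a stand-in for the state DB; the reduced schema, the function name `run_precompute`, and the placement of the actual work are assumptions. The column names follow the diff under review (precompute_started, precompute_completed).

```python
import sqlite3
from datetime import datetime, timezone


def utc_now() -> str:
    """Timezone-aware timestamp serialized as ISO 8601 text."""
    return datetime.now(timezone.utc).isoformat()


def run_precompute(db: sqlite3.Connection, reposet_id: int) -> None:
    """Record the start, do the work, then record completion for one reposet."""
    db.execute(
        "UPDATE repository_sets SET precompute_started = ? WHERE id = ?",
        (utc_now(), reposet_id),
    )
    # ... the actual precomputation would run here ...
    db.execute(
        "UPDATE repository_sets SET precomputed = 1, precompute_completed = ? WHERE id = ?",
        (utc_now(), reposet_id),
    )
    db.commit()


# Hypothetical reduced schema, only to make the sketch runnable.
state_db = sqlite3.connect(":memory:")
state_db.execute(
    "CREATE TABLE repository_sets ("
    "id INTEGER PRIMARY KEY, "
    "precomputed INTEGER NOT NULL DEFAULT 0, "
    "precompute_started TEXT, "
    "precompute_completed TEXT)"
)
state_db.execute("INSERT INTO repository_sets (id) VALUES (1)")
run_precompute(state_db, 1)
reposet_row = state_db.execute(
    "SELECT precomputed, precompute_started, precompute_completed "
    "FROM repository_sets WHERE id = 1"
).fetchone()
```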
    tracking_re = Column(Text(), nullable=False, default=".*", server_default=".*")
    precomputed = Column(Boolean(), nullable=False, default=False, server_default="false")
    precompute_started = Column(TIMESTAMP(timezone=True))
    precompute_completed = Column(TIMESTAMP(timezone=True))
Suggested change:
- precompute_completed = Column(TIMESTAMP(timezone=True))
+ precompute_finished = Column(TIMESTAMP(timezone=True))
Minor syntax nitpick: start-finish; begin-end; initiate-complete. I know that people don't usually care, but this is just good manners and conformance to the surrounding table conventions :)
Oh, and I see we have the same syntax to fix in installation_progress 🙏
dennwc left a comment
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @Ildyakov and @vmarkovtsev)
server/athenian/api/models/state/versions/5b3dc49a9d7b_installation_progress.py line 20 at r2:
def upgrade():
    op.create_table(
        "installation_progress",
Since we are removing API-related things from the table anyway, does it make sense to move it to MD repo completely? It can be a set of added columns for github.account, for example. This way we can add a data migration as well (set all accounts to "done" status).
This is interesting. Yes, the reason to use the state DB was precomputing.
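The idea of adding columns to the account table together with a data migration (marking every existing account as "done") might look roughly like this. The table name `accounts`, the added column set, and the in-memory SQLite database are assumptions made so the sketch is self-contained; a real change would go through the MD repo's own migration tooling.

```python
import sqlite3

md_db = sqlite3.connect(":memory:")

# Hypothetical pre-existing table with two installed accounts.
md_db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)")
md_db.executemany(
    "INSERT INTO accounts (id, name) VALUES (?, ?)",
    [(1, "first"), (2, "second")],
)

# Schema migration: add the progress columns, nullable by default.
md_db.execute("ALTER TABLE accounts ADD COLUMN fetch_started TEXT")
md_db.execute("ALTER TABLE accounts ADD COLUMN current_status TEXT")

# Data migration: accounts that existed before the change are already installed.
md_db.execute("UPDATE accounts SET current_status = 'done' WHERE current_status IS NULL")
md_db.commit()

statuses = [
    row[0] for row in md_db.execute("SELECT current_status FROM accounts ORDER BY id")
]
```

Running the backfill in the same migration avoids a window where old accounts show no status at all.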
727b2c7 to 94959d3