refactor: Units are Containers #14
740e631 to 9cf955f
|
Maybe it would be more intuitive to just use regular multi-table inheritance then? We'd spawn a bunch of per-container tables, but it'll probably be easier for folks to grok? (Really, I'm just looking for excuses to use every ORM feature out there at this point. 😜) |
|
I actually like the approach in this PR, and think it would work well if we're willing to use a JSONField for each container's metadata. |
I have a gut reaction against JSONFields in general, but maybe I've been overindexing on that. Concretely, what do we lose with Proxy Models + JSONFields, as compared to what exists now in openedx#278? I can think of only two things:
@bradenmacdonald if you've considered both these drawbacks and see them as no big deal, then that resolves my objections. Curious to hear @ormsbee's opinion too. Lastly, I imagine we're going to be in a very similar situation over in Component-land if we ever want to start putting metadata about ProblemBlocks, VideoBlocks, LTIBlocks, etc. into the database. Would we apply this same solution there? If not, why? |
|
Oh, I just realized that MySQL has a proper JSON column type as of 5.7. I'd had it in my head that it was a Postgres-only thing. So queries like this will run performantly? `Unit.objects.filter(metadata__discussions_enabled=True)` If so, then the drawback of JSONField becomes less about efficiency, and more about validation and type-safety of the metadata schema, which we could instead handle decently well at the Python API layer using attrs or dataclasses. |
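A minimal sketch of the "validate at the Python API layer" idea, using only the standard library's dataclasses. `UnitMetadata`, its `from_json`/`to_json` helpers, and the `display_name` field are hypothetical names for illustration; only `discussions_enabled` comes from the query above, and nothing here is taken from the PR itself.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class UnitMetadata:
    """Hypothetical typed view over the JSON blob stored for a unit."""

    discussions_enabled: bool = False
    display_name: str = ""  # hypothetical field, for illustration only

    @classmethod
    def from_json(cls, raw: Mapping[str, Any]) -> "UnitMetadata":
        """Reject unknown keys and wrongly typed values before building the object."""
        expected = {f.name: f.type for f in fields(cls)}
        unknown = set(raw) - set(expected)
        if unknown:
            raise ValueError(f"unknown metadata keys: {sorted(unknown)}")
        for name, value in raw.items():
            if not isinstance(value, expected[name]):
                raise TypeError(
                    f"{name} must be {expected[name].__name__}, "
                    f"got {type(value).__name__}"
                )
        return cls(**raw)

    def to_json(self) -> dict:
        """Return a plain dict suitable for storing in a JSON column."""
        return asdict(self)


meta = UnitMetadata.from_json({"discussions_enabled": True})
stored = meta.to_json()
```

The database would still accept any valid JSON, so this only protects writes that go through the Python API; that trade-off is the one described in the comment above.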
|
@kdmccormick Yes. Queries within JSON fields won't use an index, but they'll run at the database level, pretty efficiently. You can even create an index on some fields inside the JSON value using generated columns or functional indexes, but at that point it's probably better to just use a regular column (Edit: these types of indexes can't be managed easily using Django, but MySQL does support them). I've used PostgreSQL JSON fields for years and found them to be great. I imagine MySQL's are similar. So your second point doesn't apply. I would say the drawbacks of JSON fields for this use case are:
And as you said, the drawback of the proxy models approach is also related to foreign keys: you can't declare any new fields (including foreign keys) on Unit if it's a proxy model, and the DB won't enforce that foreign keys to Unit never point at other container types (but Django will at the Python level, I believe). Also, these approaches (JSON metadata, proxy models) don't have to be coupled. We can use either one without the other. |
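To make the earlier point about indexing a value inside a JSON column concrete, here is a rough sketch using SQLite from the Python standard library as a stand-in. This is an assumption-laden analogue, not MySQL or PostgreSQL syntax: the `container` table, the index name, and the sample data are all hypothetical, and it assumes the local SQLite build includes the JSON functions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout: metadata is stored as JSON text.
conn.execute(
    "CREATE TABLE container (id INTEGER PRIMARY KEY, metadata TEXT NOT NULL)"
)
rows = [(json.dumps({"discussions_enabled": i % 2 == 0}),) for i in range(100)]
conn.executemany("INSERT INTO container (metadata) VALUES (?)", rows)

# SQLite's json_extract returns 1 for a JSON true value.
query = (
    "SELECT COUNT(*) FROM container "
    "WHERE json_extract(metadata, '$.discussions_enabled') = 1"
)


def plan(sql: str) -> str:
    """Return the human-readable query plan for a statement."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index yet, so the table is scanned

# Index on an expression: SQLite's counterpart to a functional index.
conn.execute(
    "CREATE INDEX container_discussions_idx "
    "ON container (json_extract(metadata, '$.discussions_enabled'))"
)
plan_after = plan(query)
enabled_count = conn.execute(query).fetchone()[0]
```

Comparing `plan_before` and `plan_after` shows whether the planner picked up the index. As noted above, once a key needs this treatment a regular column is probably the simpler choice.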
|
@kdmccormick: It will not run efficiently unless you're running PostgreSQL. It "works" in MySQL insofar as it will ensure that it's valid JSON, but the indexing is not helpful. From the docs:
|
|
@ormsbee In today's platform, we don't have any way to create indexes on metadata fields like I'm not sure our particular access patterns will ever warrant indexed metadata fields. If you're filtering on Maybe it would be useful when we're talking about components and not units, but even then I'm not sure. But yeah a query like "of all units in all courses and libraries in the platform, what percent have discussion_enabled" will not be very efficient in a JSON field, but shouldn't be significantly slower than if it were a regular unindexed column. |
|
No, we can't efficiently query for I do share @kdmccormick's misgivings about having a generic catch-all JSON field because it encourages a behavior where the natural pattern is to just lump more stuff in it under a namespace, like how CourseOverview just continues to grow new fields. I do think that we'll want to use JSONField in areas, but I would like the extension pattern to encourage grouping by models. For instance, maybe there's an |
|
I do agree with your concern, though I don't think it's only We also don't need to solve this "modelling data fields for extensions" problem now - we just need to settle on a Units model that will support a nice option in the future. But from my perspective, such a system should make it easy to add new namespaced data fields (i.e. easy to register a new model like What I like about JSONField is that it makes versioning / import-export / serialization / REST API access / Python API access trivial to implement, and it works for all plugins. Implementing the equivalents with proper Django models and plugin registration is also very feasible, but definitely more work (though with benefits like schema, indexing, foreign keys). |
|
Closing in favor of #17. I still like JSONFields though :p |
This PR refactors `Unit` so it's just a proxy model for `Container` and doesn't use a separate table. There is a new `container_type` column on the `Container` object to specify that it's a unit (or not).

One thing to note is that proxy models like `Unit` (after this refactor) cannot declare fields: