The current definition says:
An integer specifying a group of tracks which are designed to be rendered together. Tracks with the same group number SHOULD be rendered simultaneously, are time-aligned and are designed to accompany one another. A common example would be tying together audio and video tracks.
I propose removing "time-aligned" from the definition. In a conference setting, we have multiple content streams that are rendered together, say, in a grid layout, and these are not necessarily time-aligned. We would assign all of these tracks the same renderGroupId.
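
To make the conference case concrete, here is a rough sketch of how a receiver might collect the tracks that share a renderGroupId into one grid. Only the renderGroupId field comes from the discussion above; the track names, the dict layout, and the group_for_rendering helper are hypothetical and not taken from the spec.

```python
from collections import defaultdict

# Hypothetical track entries for a conference; names and layout are
# illustrative only. Only "renderGroupId" is taken from the proposal.
tracks = [
    {"name": "alice-video", "renderGroupId": 1},
    {"name": "bob-video", "renderGroupId": 1},
    {"name": "screen-share", "renderGroupId": 1},
    {"name": "backstage-audio", "renderGroupId": 2},
]


def group_for_rendering(tracks):
    """Map each renderGroupId to the names of the tracks that share it."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track["renderGroupId"]].append(track["name"])
    return dict(groups)


# Tracks in group 1 would be laid out together in the grid, with no
# assumption that their timelines are aligned with one another.
grid_tracks = group_for_rendering(tracks)[1]
```

Nothing in this grouping step depends on the tracks being time-aligned, which is the point of the proposed wording change.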