Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Updated/Added Adult and Pediatric Dashboards
Pull request overview
This PR updates the Bridge2AI Voice Project to version 3.0.0, introducing comprehensive pediatric data support alongside significant improvements to documentation, dashboard features, and dataset statistics. The release expands the project's scope beyond adult participants to include pediatric voice data collection and analysis.
Key Changes:
- Added pediatric study dashboard with new data visualization and collection tracking
- Updated dataset statistics reflecting expanded participant counts (442→833 participants) and revised disorder categorizations
- Enhanced documentation with cohort-specific protocols and improved data collection methodology descriptions
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| src/dashboard.py | Added "Pediatric Study Dashboard" tab and updated version to 3.0.0 |
| src/tabs/study_dashboard_peds.py | New file implementing pediatric-specific dashboard with demographics and data collection visualizations |
| src/tabs/utils.py | Added load_peds_data function and enhanced create_html_table with link formatting support |
| src/tabs/dashboard_data/peds_dashboard_data.json | New pediatric dashboard data file with 300 participants and collection statistics |
| src/tabs/dashboard_data/dashboard_data.json | Updated adult dataset statistics with revised participant counts and disorder categories |
| tables/Validated_Questionnaires.csv | Added Pediatrics column and Example PDF links, plus 4 new pediatric questionnaires |
| tables/Acoustic_Tasks_Protocol.csv | Added 18 new pediatric acoustic tasks (lines 24-41) |
| src/tabs/overview.py | Updated dataset URLs to v3.0.0 and added pediatric dataset access information |
| src/tabs/collection_methods.py | Enhanced with pediatric data collection methods and interactive task/questionnaire links |
| src/tabs/data_pre_processing.py | Expanded folder structure documentation and updated de-identification methods for v3.0.0 |
| src/tabs/study_metadata.py | Updated minimum age criteria to include pediatric range (2-17 years) |
| README.md | Updated version badge from 2.0.5 to 3.0.0 |
| SECURITY.md | Updated supported version to v3.0.0 |
|
|
||
| **Please Note:** The public data releases do not contain pediatric data. It also does not contain an equal distribution | ||
| of these categories of diseases. Further releases will contain additional data. | ||
| **Please Note:** The public data releases do contain an equal distribution of these categories of diseases. Further releases will contain additional data. |
The sentence "The public data releases do contain an equal distribution" is missing the word "not". Based on the PR description and the previous text that was removed, this should likely read "do not contain an equal distribution" to maintain consistency with the message that the dataset does not have equal distribution across disease categories.
| **Please Note:** The public data releases do contain an equal distribution of these categories of diseases. Further releases will contain additional data. | |
| **Please Note:** The public data releases do not contain an equal distribution of these categories of diseases. Further releases will contain additional data. |
|
|
||
| if key and title and chart_type: | ||
| labels, values = get_data(data, key, name_mapping) | ||
|
|
||
| if num_plots <= rows*cols_per_row+col_index: | ||
| if chart_type is None: | ||
| cols[col_index].empty() | ||
| else: | ||
| if chart_type == 'pie': | ||
| fig = create_pie_chart(labels, values, title, plot_props) | ||
| elif chart_type == 'horizontal_bar': | ||
| fig = create_bar_chart(labels, values, title, plot_props) | ||
|
|
||
| elif chart_type == 'vertical_bar': | ||
| fig = create_bar_chart(labels, values, title, plot_props) |
The variables labels and values are only assigned when key, title, and chart_type are all truthy (line 157-158), but they are used unconditionally within the nested if-elif chain (lines 165, 167, 170). If key, title, or chart_type is falsy, these variables will be undefined when referenced, leading to a NameError. Consider restructuring the logic to ensure these variables are only used when properly initialized, or initialize them with default values before the conditional.
| if key and title and chart_type: | |
| labels, values = get_data(data, key, name_mapping) | |
| if num_plots <= rows*cols_per_row+col_index: | |
| if chart_type is None: | |
| cols[col_index].empty() | |
| else: | |
| if chart_type == 'pie': | |
| fig = create_pie_chart(labels, values, title, plot_props) | |
| elif chart_type == 'horizontal_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| elif chart_type == 'vertical_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| if num_plots <= rows * cols_per_row + col_index: | |
| # Only attempt to create a chart when we have a chart type and valid key/title | |
| if chart_type is None or not (key and title): | |
| cols[col_index].empty() | |
| else: | |
| labels, values = get_data(data, key, name_mapping) | |
| if chart_type == 'pie': | |
| fig = create_pie_chart(labels, values, title, plot_props) | |
| elif chart_type == 'horizontal_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| elif chart_type == 'vertical_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| else: | |
| # Unknown chart type; render empty column to avoid errors | |
| cols[col_index].empty() | |
| return |
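A minimal, self-contained sketch of the failure mode described in this comment, alongside a guarded version. The names `render_cell_unsafe` and `render_cell_safe` and the simplified `get_data` stand-in are hypothetical and are not taken from the PR; only the control flow mirrors the reviewed snippet.

```python
def get_data(data, key):
    """Hypothetical stand-in for the dashboard's get_data helper."""
    labels = list(data[key].keys())
    values = list(data[key].values())
    return labels, values


def render_cell_unsafe(data, key, title, chart_type):
    # Mirrors the reviewed code: labels/values are bound only in this branch.
    if key and title and chart_type:
        labels, values = get_data(data, key)
    if chart_type is None:
        return None
    # If key or title was falsy, labels and values are unbound here.
    return (chart_type, title, labels, values)


def render_cell_safe(data, key, title, chart_type):
    # Guard first, then fetch the data inside the branch that uses it.
    if chart_type is None or not (key and title):
        return None
    labels, values = get_data(data, key)
    return (chart_type, title, labels, values)


# Illustrative data only, not values from the dataset.
sample = {"sex": {"Female": 3, "Male": 2}}

try:
    render_cell_unsafe(sample, "sex", "", "pie")
except UnboundLocalError as exc:  # UnboundLocalError is a subclass of NameError
    print("unsafe version failed:", exc)

print(render_cell_safe(sample, "sex", "", "pie"))
print(render_cell_safe(sample, "sex", "Sex", "pie"))
```

Fetching the data inside the guarded branch keeps the assignment and every use of `labels` and `values` on the same code path, which is why the suggested change moves the `get_data` call.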
|
|
||
| Minimum Age<br> | ||
| 18 years (this will change when pediatric cohort is introduced, and metadata will be updated to reflect new eligibility criteria) | ||
| 2 - 17 for pediatric participants and 18+ for adult cohorts. |
The age range text "2 - 17 for pediatric participants and 18+ for adult cohorts" is missing the word "years" after "17". For consistency with "18 years" in the original text and clarity, it should read "2 - 17 years for pediatric participants and 18+ for adult cohorts" or "2 - 17 years for pediatric participants and 18+ years for adult cohorts".
| 2 - 17 for pediatric participants and 18+ for adult cohorts. | |
| 2 - 17 years for pediatric participants and 18+ for adult cohorts. |
| **Clinical validation:** Clinical validation is performed by qualified physician or practitioner based on established gold standards for diagnosis **(Table 1)**. | ||
|
|
||
| **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**. | ||
| **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**. |
The documentation states "A total of 22 acoustic Tasks are recorded through the app" but this count appears to be outdated. Looking at the Acoustic_Tasks_Protocol.csv table, there are now 41 tasks listed (23 original adult tasks + 18 pediatric tasks). The documentation should be updated to reflect the correct total number of acoustic tasks, or clarify that 22 refers only to adult tasks.
| **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**. | |
| **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 23 adult and 18 pediatric acoustic tasks (41 total) are included in the protocol **(Table 2)**. |
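Before hard-coding a figure in the documentation, the count could be derived from the CSV itself. The sketch below is an illustration under stated assumptions: it uses an in-memory stand-in rather than the real `tables/Acoustic_Tasks_Protocol.csv`, it assumes the file has exactly one header row, and the helper name `count_data_rows` is hypothetical. Under that header assumption, a file whose pediatric tasks occupy lines 24-41 would hold 22 adult rows (lines 2-23) and 18 pediatric rows, so the 23 and 41 figures may be worth confirming against the actual file.

```python
import csv
import io


def count_data_rows(csv_text):
    """Count non-empty rows after the header (assumes exactly one header row)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return sum(1 for row in reader if any(cell.strip() for cell in row))


# Hypothetical miniature of the protocol table, not the real data.
sample_csv = "Task,Description\nTask A,First\nTask B,Second\nTask C,Third\n"
print(count_data_rows(sample_csv))

# Line arithmetic for a 41-line file whose first line is a header.
adult_rows = len(range(2, 24))       # lines 2..23
pediatric_rows = len(range(24, 42))  # lines 24..41
print(adult_rows, pediatric_rows, adult_rows + pediatric_rows)
```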
| ├── dataset_description.json | ||
| ├── participants.json | ||
| ├── participants.tsv | ||
| ├── phenotype |
There appears to be an inconsistent indentation level. The "phenotype" line (line 18) should be at the same indentation level as "dataset_description.json" (line 17) to maintain consistent hierarchy in the folder structure diagram. Currently it appears to be indented one extra level.
|
|
||
| # Convert DataFrame to HTML table | ||
| html_table = df.to_html(index=False, classes='table table-striped') | ||
| html_table = df.to_html(index=False, classes='table table-striped',escape=False,formatters=link_formattter) |
There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.
| csv_file_path = "tables/Acoustic_Tasks_Protocol.csv" | ||
| caption = 'Table 2 - Acoustic Tasks in Protocol' | ||
| create_html_table(csv_file_path, caption) | ||
| create_html_table(csv_file_path, caption,link_formattter={"Task": map_tasks_to_videos}) |
There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.
| csv_file_path = "tables/Validated_Questionnaires.csv" | ||
| caption = 'Table 3 - Validated Questionnaires integrated into App' | ||
| create_html_table(csv_file_path, caption, ['X']) | ||
| create_html_table(csv_file_path, caption, ['X'],link_formattter={'Example': map_questionnaire_link}) |
There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.
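Whether this typo is only cosmetic depends on the function definition, which is not shown in these hunks. If the definition and every call site share the same misspelling, the code runs; if they differ, Python rejects the call. A minimal sketch of that behavior follows, where `create_table` and `as_link` are hypothetical stand-ins and not the project's `create_html_table`:

```python
def create_table(rows, link_formatter=None):
    """Hypothetical stand-in for create_html_table; not the project's code."""
    link_formatter = link_formatter or {}
    rendered = []
    for row in rows:
        # Apply the per-column formatter if one is given, else str().
        rendered.append({
            column: link_formatter.get(column, str)(value)
            for column, value in row.items()
        })
    return rendered


def as_link(value):
    # Hypothetical formatter that wraps a value in an anchor tag.
    return f'<a href="#{value}">{value}</a>'


rows = [{"Task": "Task A", "Duration": 5}]

print(create_table(rows, link_formatter={"Task": as_link}))

try:
    # Misspelled keyword (three t's) against a correctly spelled definition.
    create_table(rows, link_formattter={"Task": as_link})
except TypeError as exc:
    print("misspelled keyword rejected:", exc)
```

Renaming the parameter in the definition and at all three call sites in one commit avoids a window where the two spellings disagree.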
| """ | ||
| ) | ||
| st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice-audio/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:") | ||
| st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice/3.0.0/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:") |
The URL for the "Audio included" dataset appears to be the same as the main dataset URL. Both buttons now point to "https://physionet.org/content/b2ai-voice/3.0.0/", but the previous version had a separate URL "https://physionet.org/content/b2ai-voice-audio/" for the audio-included version. If these are indeed separate datasets, the URL should be different to distinguish between them.
| st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice/3.0.0/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:") | |
| st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice-audio/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:") |
| if num_plots <= rows*cols_per_row+col_index: | ||
| if chart_type is None: | ||
| cols[col_index].empty() | ||
| else: | ||
| if chart_type == 'pie': | ||
| fig = create_pie_chart(labels, values, title, plot_props) | ||
| elif chart_type == 'horizontal_bar': | ||
| fig = create_bar_chart(labels, values, title, plot_props) | ||
|
|
||
| elif chart_type == 'vertical_bar': | ||
| fig = create_bar_chart(labels, values, title, plot_props) | ||
| cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig()) | ||
| else: | ||
| cols[col_index].empty() | ||
|
|
The condition if num_plots <= rows*cols_per_row+col_index on line 160 appears to be logically incorrect. It will almost always be true: num_plots is typically less than or equal to rows*cols_per_row, and the right-hand side adds col_index on top of that, so the else branch that renders an empty column is effectively unreachable. The logic should likely be if plot_index < num_plots to check whether there is a plot to display at this index, which would be consistent with the outer check on line 151.
| if num_plots <= rows*cols_per_row+col_index: | |
| if chart_type is None: | |
| cols[col_index].empty() | |
| else: | |
| if chart_type == 'pie': | |
| fig = create_pie_chart(labels, values, title, plot_props) | |
| elif chart_type == 'horizontal_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| elif chart_type == 'vertical_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig()) | |
| else: | |
| cols[col_index].empty() | |
| if plot_index < num_plots: | |
| if chart_type is None: | |
| cols[col_index].empty() | |
| else: | |
| if chart_type == 'pie': | |
| fig = create_pie_chart(labels, values, title, plot_props) | |
| elif chart_type == 'horizontal_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| elif chart_type == 'vertical_bar': | |
| fig = create_bar_chart(labels, values, title, plot_props) | |
| cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig()) | |
| else: | |
| cols[col_index].empty() |
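The difference between the two conditions can be seen by enumerating the cells of a small grid. This is a sketch under assumptions not confirmed by the hunks shown here: `rows` is taken to be the ceiling of `num_plots / cols_per_row`, `plot_index` is taken to be `row * cols_per_row + col_index`, and both helper function names are hypothetical.

```python
def original_condition(num_plots, rows, cols_per_row, col_index):
    # Condition as written in the reviewed snippet.
    return num_plots <= rows * cols_per_row + col_index


def index_condition(plot_index, num_plots):
    # Condition proposed in the review comment.
    return plot_index < num_plots


num_plots, cols_per_row = 5, 3
rows = -(-num_plots // cols_per_row)  # ceiling division

results = []
for row in range(rows):
    for col_index in range(cols_per_row):
        plot_index = row * cols_per_row + col_index
        results.append((
            plot_index,
            original_condition(num_plots, rows, cols_per_row, col_index),
            index_condition(plot_index, num_plots),
        ))

for plot_index, original, proposed in results:
    print(plot_index, original, proposed)
```

With five plots in a 2 x 3 grid, the original condition holds for all six cells, including the sixth cell that has no plot, while the index-based condition is false only for that last cell.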
This pull request updates the Bridge2AI Voice Project to version 3.0.0, expanding support for pediatric data and improving documentation and dashboard features. The most significant changes include the addition of pediatric study data and dashboard views, updates to dataset statistics, enhancements to documentation, and improved presentation of protocol and questionnaire information.
Pediatric Data Integration and Dashboard Updates
peds_dashboard_data.json) containing participant demographics, recording statistics, and task/questionnaire collection info.
Dataset Statistics and Metadata Updates
dashboard_data.json) with new participant counts, expanded demographic breakdowns, revised disorder categories, and updated acoustic task and questionnaire collection numbers to reflect the latest dataset.
Documentation and Protocol Enhancements
README.md and SECURITY.md, and provided cohort-specific protocol documentation including pediatric protocols.
Dashboard Table Presentation Improvements
Let me know if you have any questions about these changes or want to dive deeper into any specific part!