This repository was archived by the owner on Jan 7, 2026. It is now read-only.

v3.0.0 #63

Merged
alexsigaras merged 18 commits into main from develop
Dec 18, 2025

@alexsigaras
Member

This pull request updates the Bridge2AI Voice Project to version 3.0.0, expanding support for pediatric data and improving documentation and dashboard features. The most significant changes include the addition of pediatric study data and dashboard views, updates to dataset statistics, enhancements to documentation, and improved presentation of protocol and questionnaire information.

Pediatric Data Integration and Dashboard Updates

  • Added a new pediatric dashboard data file (peds_dashboard_data.json) containing participant demographics, recording statistics, and task/questionnaire collection info.
  • Updated the main dashboard to include a "Pediatric Study Dashboard" tab and linked the new pediatric dashboard page function. [1] [2] [3]

Dataset Statistics and Metadata Updates

  • Refreshed the main dashboard data (dashboard_data.json) with new participant counts, expanded demographic breakdowns, revised disorder categories, and updated acoustic task and questionnaire collection numbers to reflect the latest dataset.

Documentation and Protocol Enhancements

  • Updated documentation links and references to version 3.0.0 in README.md and SECURITY.md, and provided cohort-specific protocol documentation including pediatric protocols. [1] [2] [3]
  • Clarified pediatric data collection procedures and distinguished between adult and pediatric data collection methods in the collection methods documentation.

Dashboard Table Presentation Improvements

  • Enhanced dashboard tables to include example links for acoustic tasks and PDF links for questionnaires, improving usability and clarity for users reviewing protocol details.

Let me know if you have any questions about these changes or want to dive deeper into any specific part!

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Bridge2AI Voice Project to version 3.0.0, introducing comprehensive pediatric data support alongside significant improvements to documentation, dashboard features, and dataset statistics. The release expands the project's scope beyond adult participants to include pediatric voice data collection and analysis.

Key Changes:

  • Added pediatric study dashboard with new data visualization and collection tracking
  • Updated dataset statistics reflecting expanded participant counts (442→833 participants) and revised disorder categorizations
  • Enhanced documentation with cohort-specific protocols and improved data collection methodology descriptions

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| src/dashboard.py | Added "Pediatric Study Dashboard" tab and updated version to 3.0.0 |
| src/tabs/study_dashboard_peds.py | New file implementing pediatric-specific dashboard with demographics and data collection visualizations |
| src/tabs/utils.py | Added load_peds_data function and enhanced create_html_table with link formatting support |
| src/tabs/dashboard_data/peds_dashboard_data.json | New pediatric dashboard data file with 300 participants and collection statistics |
| src/tabs/dashboard_data/dashboard_data.json | Updated adult dataset statistics with revised participant counts and disorder categories |
| tables/Validated_Questionnaires.csv | Added Pediatrics column and Example PDF links, plus 4 new pediatric questionnaires |
| tables/Acoustic_Tasks_Protocol.csv | Added 18 new pediatric acoustic tasks (lines 24-41) |
| src/tabs/overview.py | Updated dataset URLs to v3.0.0 and added pediatric dataset access information |
| src/tabs/collection_methods.py | Enhanced with pediatric data collection methods and interactive task/questionnaire links |
| src/tabs/data_pre_processing.py | Expanded folder structure documentation and updated de-identification methods for v3.0.0 |
| src/tabs/study_metadata.py | Updated minimum age criteria to include pediatric range (2-17 years) |
| README.md | Updated version badge from 2.0.5 to 3.0.0 |
| SECURITY.md | Updated supported version to v3.0.0 |


- **Please Note:** The public data releases do not contain pediatric data. It also does not contain an equal distribution
- of these categories of diseases. Further releases will contain additional data.
+ **Please Note:** The public data releases do contain an equal distribution of these categories of diseases. Further releases will contain additional data.

Copilot AI Dec 18, 2025

The sentence "The public data releases do contain an equal distribution" is missing the word "not". Based on the PR description and the previous text that was removed, this should likely read "do not contain an equal distribution" to maintain consistency with the message that the dataset does not have equal distribution across disease categories.

Suggested change
- **Please Note:** The public data releases do contain an equal distribution of these categories of diseases. Further releases will contain additional data.
+ **Please Note:** The public data releases do not contain an equal distribution of these categories of diseases. Further releases will contain additional data.

Copilot uses AI. Check for mistakes.
Comment on lines +156 to +170

if key and title and chart_type:
    labels, values = get_data(data, key, name_mapping)

if num_plots <= rows*cols_per_row+col_index:
    if chart_type is None:
        cols[col_index].empty()
    else:
        if chart_type == 'pie':
            fig = create_pie_chart(labels, values, title, plot_props)
        elif chart_type == 'horizontal_bar':
            fig = create_bar_chart(labels, values, title, plot_props)

        elif chart_type == 'vertical_bar':
            fig = create_bar_chart(labels, values, title, plot_props)

Copilot AI Dec 18, 2025

The variables labels and values are only assigned when key, title, and chart_type are all truthy (line 157-158), but they are used unconditionally within the nested if-elif chain (lines 165, 167, 170). If key, title, or chart_type is falsy, these variables will be undefined when referenced, leading to a NameError. Consider restructuring the logic to ensure these variables are only used when properly initialized, or initialize them with default values before the conditional.

Suggested change
- if key and title and chart_type:
-     labels, values = get_data(data, key, name_mapping)
- if num_plots <= rows*cols_per_row+col_index:
-     if chart_type is None:
-         cols[col_index].empty()
-     else:
-         if chart_type == 'pie':
-             fig = create_pie_chart(labels, values, title, plot_props)
-         elif chart_type == 'horizontal_bar':
-             fig = create_bar_chart(labels, values, title, plot_props)
-         elif chart_type == 'vertical_bar':
-             fig = create_bar_chart(labels, values, title, plot_props)
+ if num_plots <= rows * cols_per_row + col_index:
+     # Only attempt to create a chart when we have a chart type and valid key/title
+     if chart_type is None or not (key and title):
+         cols[col_index].empty()
+     else:
+         labels, values = get_data(data, key, name_mapping)
+         if chart_type == 'pie':
+             fig = create_pie_chart(labels, values, title, plot_props)
+         elif chart_type == 'horizontal_bar':
+             fig = create_bar_chart(labels, values, title, plot_props)
+         elif chart_type == 'vertical_bar':
+             fig = create_bar_chart(labels, values, title, plot_props)
+         else:
+             # Unknown chart type; render empty column to avoid errors
+             cols[col_index].empty()
+             return
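
The guard described in this comment can be sketched in isolation. This is a minimal, self-contained illustration rather than the dashboard code: `select_chart`, the `spec` dictionary, and the sample `data` layout are hypothetical names introduced here, and only the chart-type strings are taken from the diff.

```python
SUPPORTED_CHART_TYPES = {"pie", "horizontal_bar", "vertical_bar"}


def select_chart(spec, data):
    """Return (chart_type, labels, values), or None when the cell should stay empty.

    `spec` and `data` are hypothetical stand-ins for one plot definition and
    the loaded dashboard JSON; the real structures are not shown in this PR.
    """
    key = spec.get("key")
    title = spec.get("title")
    chart_type = spec.get("chart_type")

    # Bail out before touching labels/values if anything required is missing.
    if not (key and title and chart_type):
        return None
    # Unknown chart types also yield an empty cell instead of an unbound `fig`.
    if chart_type not in SUPPORTED_CHART_TYPES:
        return None

    series = data.get(key, {})
    labels = list(series.keys())
    values = list(series.values())
    return chart_type, labels, values


demo_data = {"sex": {"Female": 3, "Male": 2}}
print(select_chart({"key": "sex", "title": "Sex", "chart_type": "pie"}, demo_data))
print(select_chart({"key": "sex", "title": "Sex", "chart_type": None}, demo_data))
```

Returning None for every "nothing to draw" case keeps the decision in one place, so the caller only has to choose between rendering a figure and emptying the column.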

Minimum Age<br>
- 18 years (this will change when pediatric cohort is introduced, and metadata will be updated to reflect new eligibility criteria)
+ 2 - 17 for pediatric participants and 18+ for adult cohorts.

Copilot AI Dec 18, 2025

The age range text "2 - 17 for pediatric participants and 18+ for adult cohorts" is missing the word "years" after "17". For consistency with "18 years" in the original text and clarity, it should read "2 - 17 years for pediatric participants and 18+ for adult cohorts" or "2 - 17 years for pediatric participants and 18+ years for adult cohorts".

Suggested change
- 2 - 17 for pediatric participants and 18+ for adult cohorts.
+ 2 - 17 years for pediatric participants and 18+ for adult cohorts.

**Clinical validation:** Clinical validation is performed by qualified physician or practitioner based on established gold standards for diagnosis **(Table 1)**.

- **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**.
+ **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**.

Copilot AI Dec 18, 2025

The documentation states "A total of 22 acoustic Tasks are recorded through the app" but this count appears to be outdated. Looking at the Acoustic_Tasks_Protocol.csv table, there are now 41 tasks listed (23 original adult tasks + 18 pediatric tasks). The documentation should be updated to reflect the correct total number of acoustic tasks, or clarify that 22 refers only to adult tasks.

Suggested change
- **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 22 acoustic Tasks are recorded through the app **(Table 2)**.
+ **Acoustic Tasks:** Voice, breathing, cough, and speech data are recorded with the app for adults, and with reproschema-ui for pediatrics. A total of 23 adult and 18 pediatric acoustic tasks (41 total) are included in the protocol **(Table 2)**.
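
A count like this is easy to get wrong by one when a header row is included. The file summary places the pediatric tasks on lines 24-41 of Acoustic_Tasks_Protocol.csv; if line 1 is a header (an assumption, since the CSV itself is not shown in this thread), lines 2-23 would hold 22 adult rows and the total would be 40 rather than 41. The sketch below counts data rows on either side of a boundary line. `count_task_rows`, the single `Task` column, and the synthetic rows are hypothetical and only mirror the line ranges quoted in the summary.

```python
import csv
import io


def count_task_rows(csv_text, first_pediatric_line):
    """Count (adult, pediatric) data rows, excluding the header on line 1.

    Assumes one record per physical line, so record N sits on file line N.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    adult = pediatric = 0
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if line_no < first_pediatric_line:
            adult += 1
        else:
            pediatric += 1
    return adult, pediatric


# Synthetic stand-in: a header plus 40 placeholder rows, filling lines 1-41.
sample = "Task\n" + "\n".join(f"task {i}" for i in range(1, 41)) + "\n"
adult, pediatric = count_task_rows(sample, first_pediatric_line=24)
print(adult, pediatric, adult + pediatric)
```

Running the same function against the real CSV would settle whether the documentation should say 22 or 23 adult tasks.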

├── dataset_description.json
├── participants.json
├── participants.tsv
├── phenotype

Copilot AI Dec 18, 2025

There appears to be an inconsistent indentation level. The "phenotype" line (line 18) should be at the same indentation level as "dataset_description.json" (line 17) to maintain consistent hierarchy in the folder structure diagram. Currently it appears to be indented one extra level.

# Convert DataFrame to HTML table
- html_table = df.to_html(index=False, classes='table table-striped')
+ html_table = df.to_html(index=False, classes='table table-striped',escape=False,formatters=link_formattter)

Copilot AI Dec 18, 2025

There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.
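
Whether this is a functional bug depends on how the parameter is spelled in the function definition, which is not shown in this thread: Python only requires that the keyword at the call site match the name in the signature. The sketch below illustrates per-column link formatting with a consistently named `link_formatter` keyword and shows that a mismatched spelling fails with a TypeError. `render_table`, `task_link`, and the example.org URL are hypothetical stand-ins, not the project's create_html_table or map_tasks_to_videos, and the table is built with the standard library rather than pandas.

```python
import html
from urllib.parse import quote


def render_table(rows, caption, link_formatter=None):
    """Render a list of dict rows as an HTML table.

    `link_formatter` maps a column name to a function that turns the cell
    text into an HTML fragment; other cells are escaped as plain text.
    """
    link_formatter = link_formatter or {}
    columns = list(rows[0].keys()) if rows else []
    parts = [f"<table><caption>{html.escape(caption)}</caption>"]
    parts.append("<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in columns) + "</tr>")
    for row in rows:
        cells = []
        for col in columns:
            text = str(row[col])
            if col in link_formatter:
                cells.append(f"<td>{link_formatter[col](text)}</td>")
            else:
                cells.append(f"<td>{html.escape(text)}</td>")
        parts.append("<tr>" + "".join(cells) + "</tr>")
    parts.append("</table>")
    return "".join(parts)


def task_link(name):
    # Hypothetical URL scheme, used only for illustration.
    return f'<a href="https://example.org/tasks/{quote(name)}">{html.escape(name)}</a>'


rows = [{"Task": "Glides", "Cohort": "Adult"}]
print(render_table(rows, "Table 2", link_formatter={"Task": task_link}))

try:
    # A keyword that does not match the signature is rejected at call time.
    render_table(rows, "Table 2", link_formattter={"Task": task_link})
except TypeError as exc:
    print("TypeError:", exc)
```

If the definition and every call site share the same misspelling, the code still runs, and the rename becomes a readability fix rather than a bug fix.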

csv_file_path = "tables/Acoustic_Tasks_Protocol.csv"
caption = 'Table 2 - Acoustic Tasks in Protocol'
- create_html_table(csv_file_path, caption)
+ create_html_table(csv_file_path, caption,link_formattter={"Task": map_tasks_to_videos})

Copilot AI Dec 18, 2025

There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.

csv_file_path = "tables/Validated_Questionnaires.csv"
caption = 'Table 3 - Validated Questionnaires integrated into App'
- create_html_table(csv_file_path, caption, ['X'])
+ create_html_table(csv_file_path, caption, ['X'],link_formattter={'Example': map_questionnaire_link})

Copilot AI Dec 18, 2025

There is a typo in the parameter name. The parameter is named "link_formattter" (with three 't's) but should be "link_formatter" (with two 't's). This is inconsistent with the function parameter definition.

"""
)
- st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice-audio/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:")
+ st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice/3.0.0/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:")

Copilot AI Dec 18, 2025

The URL for the "Audio included" dataset appears to be the same as the main dataset URL. Both buttons now point to "https://physionet.org/content/b2ai-voice/3.0.0/", but the previous version had a separate URL "https://physionet.org/content/b2ai-voice-audio/" for the audio-included version. If these are indeed separate datasets, the URL should be different to distinguish between them.

Suggested change
- st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice/3.0.0/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:")
+ st.link_button("Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", type="primary", url="https://physionet.org/content/b2ai-voice-audio/", help="Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information (Audio included)", icon=":material/login:")

Comment on lines +160 to +174
if num_plots <= rows*cols_per_row+col_index:
    if chart_type is None:
        cols[col_index].empty()
    else:
        if chart_type == 'pie':
            fig = create_pie_chart(labels, values, title, plot_props)
        elif chart_type == 'horizontal_bar':
            fig = create_bar_chart(labels, values, title, plot_props)

        elif chart_type == 'vertical_bar':
            fig = create_bar_chart(labels, values, title, plot_props)
        cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig())
else:
    cols[col_index].empty()

Copilot AI Dec 18, 2025

The condition if num_plots <= rows*cols_per_row+col_index on line 160 appears to be logically incorrect. This condition will almost always be true: num_plots is typically less than or equal to rows*cols_per_row, and here it is compared to rows*cols_per_row+col_index, which is larger still, so the else branch that renders an empty column is effectively never reached. The logic should likely be if plot_index < num_plots to check if there's a plot to display at this index, which would be consistent with the outer check on line 151.

Suggested change
- if num_plots <= rows*cols_per_row+col_index:
-     if chart_type is None:
-         cols[col_index].empty()
-     else:
-         if chart_type == 'pie':
-             fig = create_pie_chart(labels, values, title, plot_props)
-         elif chart_type == 'horizontal_bar':
-             fig = create_bar_chart(labels, values, title, plot_props)
-         elif chart_type == 'vertical_bar':
-             fig = create_bar_chart(labels, values, title, plot_props)
-         cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig())
- else:
-     cols[col_index].empty()
+ if plot_index < num_plots:
+     if chart_type is None:
+         cols[col_index].empty()
+     else:
+         if chart_type == 'pie':
+             fig = create_pie_chart(labels, values, title, plot_props)
+         elif chart_type == 'horizontal_bar':
+             fig = create_bar_chart(labels, values, title, plot_props)
+         elif chart_type == 'vertical_bar':
+             fig = create_bar_chart(labels, values, title, plot_props)
+         cols[col_index].plotly_chart(fig, use_container_width=True, config=getPlotlyConfig())
+ else:
+     cols[col_index].empty()
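
The index check proposed in this comment can be illustrated without Streamlit. In this sketch, `grid_layout` is a hypothetical helper that assumes plot_index is computed as row * cols_per_row + col_index (the surrounding loop is not shown in the diff); it returns the plot index for each grid cell, or None where the column should stay empty.

```python
import math


def grid_layout(num_plots, cols_per_row):
    """Map each grid cell to a plot index, or None for cells left empty."""
    rows = math.ceil(num_plots / cols_per_row) if num_plots else 0
    layout = []
    for row in range(rows):
        cells = []
        for col_index in range(cols_per_row):
            # Position of this cell in the flat list of plots.
            plot_index = row * cols_per_row + col_index
            # Only cells whose index falls inside the plot list get a chart.
            cells.append(plot_index if plot_index < num_plots else None)
        layout.append(cells)
    return layout


print(grid_layout(5, 3))
```

With five plots in three columns, the last cell of the second row is the only one left empty, which is the case the else branch is meant to handle.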

alexsigaras merged commit ff41134 into main Dec 18, 2025
12 checks passed
Labels

enhancement New feature or request

4 participants