Version: 1.1
Audience: End users with no coding background
Export App is a simple data export and summary tool for CSV files.
It allows users to upload multiple CSV files (or a ZIP of CSVs), preview their contents, export raw data into Excel, and calculate basic summary statistics (mean and standard deviation). Key features:
- Upload multiple CSV files
- Upload ZIP archives containing CSV files
- Load CSV files from a server folder
- Preview CSV data (first rows and columns) and view original → sanitized column name mapping
- Export raw CSVs into a single Excel workbook
- Define derived columns using simple expression syntax and preview them on a chosen file
- Calculate summary statistics (mean and standard deviation) using selected columns and/or derived columns
- Export summary statistics to Excel or CSV
You can define temporary derived columns that are applied only when calculating summaries (they do not mutate your stored data). Enter one definition per line using the format:
NAME = op(col1,col2,...)
Supported operations:
- sum(A,B,...) — elementwise sum across columns
- mean(A,B,...) | avg(A,B,...) — elementwise mean (ignores NaNs)
- median(A,B,...) — elementwise median
- ratio(num, den) or r(num, den) — elementwise division (denominator zeros become NaN)
- wmean(A:weight, B:weight, ...) — weighted mean; weights are numeric and optional (default weight 1)
Examples (use the UI example buttons to auto-insert these):
- TOTAL = sum(Flow,Volume)
- AVG = mean(Flow,Volume)
- W = wmean(Flow:0.3, Volume:0.7)
- R = ratio(sum(Flow,Volume), Volume)
Notes:
- Column names must match the sanitized names shown in the preview mapping (spaces and special characters are replaced by underscores on load).
- The parser will warn about non-numeric or negative weights and other parse issues.
- Use the "Preview derived columns on preview file" button to see the computed columns for a single file before calculating the full summary.
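For intuition, the operations above correspond to simple elementwise pandas computations. A minimal sketch with made-up data (the column names Flow and Volume are just examples; this is not the app's actual parser):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Flow": [1.0, 3.0, np.nan], "Volume": [2.0, 0.0, 4.0]})

# TOTAL = sum(Flow,Volume): elementwise sum across the listed columns
total = df[["Flow", "Volume"]].sum(axis=1)

# AVG = mean(Flow,Volume): elementwise mean, NaNs ignored
avg = df[["Flow", "Volume"]].mean(axis=1)

# R = ratio(Flow,Volume): zero denominators become NaN
ratio = df["Flow"] / df["Volume"].replace(0, np.nan)

# W = wmean(Flow:0.3, Volume:0.7): weighted mean with numeric weights
w = (0.3 * df["Flow"] + 0.7 * df["Volume"]) / (0.3 + 0.7)
```

Note that because NaNs are skipped, a row with a missing Flow still gets a TOTAL and AVG from the remaining columns.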
When the app opens, the screen is divided into:
- A left sidebar (upload and settings)
- A main panel (file selection, preview, and export)
Follow the steps below to load data and run summaries.
Open the sidebar and locate Upload / Source.
Option A: Multiple CSV Files
- Select Multiple CSV files
- Click Choose one or more CSV files
- Select your files
- Click Load uploaded files in the main panel
Option B: ZIP Archive
- Select ZIP archive (upload)
- Upload a ZIP file containing CSV files
- Click Load ZIP and extract CSVs
Option C: Server Folder
- Select Folder (server)
- Enter the folder path
- Tick Scan folder now
- Click Scan folder and load CSVs
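For the curious, the ZIP option (Option B) boils down to extracting each `.csv` member into a DataFrame, roughly like this sketch (function name assumed; the app's internals may differ):

```python
import io
import zipfile

import pandas as pd

def load_csvs_from_zip(zip_bytes: bytes) -> dict:
    """Read every .csv member of an uploaded ZIP into a DataFrame."""
    frames = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if not info.is_dir() and info.filename.lower().endswith(".csv"):
                with zf.open(info) as fh:
                    frames[info.filename] = pd.read_csv(fh)
    return frames
```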
After loading, the app captures original header names and shows the sanitized names used internally — check the preview pane for the mapping.
- Use the file list to choose which files to include in the summary.
- Use the Select all files / Clear selection buttons for convenience.
- Expand Show file preview to view the first rows of a file, plus a per-file Select columns from button to quickly populate the column selector.
- Use the column select controls to pick which numeric columns to summarize. The app also provides Select all columns / Clear columns buttons.
- Use the Derived columns text area to enter expressions (one per line).
- Use the auto-generated example buttons to insert templates based on detected column names.
- Review parser warnings shown beneath the text area and fix any issues.
- Use Preview derived columns on preview file to validate results for a single file.
- Select files and columns (and/or define derived columns).
- Click Calculate summary for selected files.
- The app computes mean and standard deviation per-file and produces a combined summary preview.
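Conceptually, the per-file step is a plain mean/std aggregation over the selected columns; a rough sketch with toy stand-ins for two loaded files (names and values are illustrative only):

```python
import pandas as pd

# Toy stand-ins for two loaded files
files = {
    "L01_IMV_45min": pd.DataFrame({"Flow": [1.0, 2.0, 3.0], "Volume": [10.0, 20.0, 30.0]}),
    "L02_IMV_45min": pd.DataFrame({"Flow": [2.0, 4.0], "Volume": [5.0, 15.0]}),
}

rows = []
for name, df in files.items():
    # Mean and standard deviation per file for the chosen columns
    stats = df[["Flow", "Volume"]].agg(["mean", "std"])
    rows.append({
        "file": name,
        **{f"{col}_mean": stats.loc["mean", col] for col in stats.columns},
        **{f"{col}_std": stats.loc["std", col] for col in stats.columns},
    })
summary = pd.DataFrame(rows)
```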
Once calculated, summaries can be exported.
Export to Excel
- Click Export summary to Excel — produces a formatted workbook with one sheet per variable, group means on top and std devs below.
Export to CSV
- Click Export summary to CSV — produces a flat summary table.
The app provides flexible filename parsing to extract study, individual, treatment and time information from file names. You can choose the parsing mode in the sidebar:
- Auto (underscore parts): splits the filename stem on underscores and infers fields from right to left.
  - If the last token looks like a time (e.g. 45min, 2h, birth), it is treated as time.
  - The next tokens (moving left) are treated as treatment, then individual.
  - Any remaining left-hand tokens are treated as study (and may contain underscores).
  - Examples:
    - STUDY_L01_IMV_45min → study=STUDY, individual=L01, treatment=IMV, time=45min
    - L01_IMV_45min → individual=L01, treatment=IMV, time=45min
    - STUDY_L01_IMV → study=STUDY, individual=L01, treatment=IMV
- Custom pattern (tokens): use a token template like {study}_{individual}_{treatment}_{time} to map parts directly. Use this when your filenames are consistent and tokens never include the separator.
- Regex (named groups): provide a regular expression with named groups such as (?P<study>...), (?P<individual>...), (?P<treatment>...) and (?P<time>...) for full control over parsing. Use this when tokens may include separators or the overall structure is more complex.
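The Auto mode's right-to-left inference can be pictured like this (the time-token pattern here is a guess at what "looks like a time"; the app's detection may be broader):

```python
import re

# Assumed time-token pattern: digits + unit (min/h/d), or the word "birth"
TIME_RE = re.compile(r"^(?:\d+(?:min|h|d)|birth)$", re.IGNORECASE)

def parse_auto(stem: str) -> dict:
    """Infer study/individual/treatment/time from an underscore-split stem."""
    parts = stem.split("_")
    fields = {"study": "", "individual": "", "treatment": "", "time": ""}
    if parts and TIME_RE.match(parts[-1]):
        fields["time"] = parts.pop()
    if parts:
        fields["treatment"] = parts.pop()
    if parts:
        fields["individual"] = parts.pop()
    fields["study"] = "_".join(parts)  # remainder may contain underscores
    return fields
```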
When exporting files or summaries the app also supports a user-editable filename pattern with tokens. Supported tokens are:
- {date} - current date YYYYMMDD
- {datetime} - current datetime YYYYMMDD_HHMMSS
- {prefix} - fallback prefix (usually default_export_name or default_summary)
- {study} - parsed study from the first selected file name
- {individual} - parsed individual from the first selected file name
- {treatment} - parsed treatment from the first selected file name
- {time} - parsed time from the first selected file name
- {group} - derived group for the first selected file (if available)
- {firstfile} - stem (no extension) of the first selected file
- {count} - number of selected files
- {index} - numeric index (useful for multiple outputs)
The app sanitises and truncates generated names so they are safe for most filesystems.
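Token expansion itself is straightforward string substitution; a hypothetical sketch (the app may expand tokens differently):

```python
from datetime import datetime

def build_name(pattern: str, context: dict) -> str:
    """Replace {token} placeholders in an export filename pattern."""
    tokens = {
        "date": datetime.now().strftime("%Y%m%d"),
        "datetime": datetime.now().strftime("%Y%m%d_%H%M%S"),
        **context,  # study, individual, treatment, time, group, count, ...
    }
    name = pattern
    for key, value in tokens.items():
        name = name.replace("{" + key + "}", str(value))
    return name
```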
You can override the built-in allocation sets by choosing "Custom" under "Individual allocation mapping" in the sidebar. The UI accepts arbitrary group names and their member IDs using a single free-form textarea. Enter one group per line with the format:
GROUP_NAME = ID1, ID2, ID3
Example: GROUP_A = L1, L2, L3
Group names are free-form (not limited to IMV/NIV/NIVe/NIVf) and can be used as the group value in summaries and export filenames depending on your Group source setting (see below).
Definitions are parsed and validated when you click Save settings and are stored in the session for subsequent summary calculations.
If no custom groups are provided, the app falls back to using other sources (e.g. filename fields like treatment) based on the Group source configuration.
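Parsing the textarea amounts to splitting each line on the first "="; a sketch under the stated format (not the app's actual validator):

```python
def parse_allocations(text: str):
    """Parse 'GROUP_NAME = ID1, ID2, ...' lines into a mapping, with warnings."""
    groups, warnings = {}, []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are allowed
        name, sep, members = line.partition("=")
        if not sep or not name.strip():
            warnings.append(f"line {lineno}: expected 'GROUP_NAME = ID1, ID2, ...'")
            continue
        ids = [m.strip() for m in members.split(",") if m.strip()]
        groups[name.strip()] = ids
    return groups, warnings
```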
In the sidebar under Summary options, you can control how the summary/export group value is derived. This makes grouping modular so you can switch between using custom mapping sets and filename-derived fields.
Available options:
- Auto (individual → mapping, else treatment) (default)
  - Try to find the file's individual in your Custom allocation mapping sets.
  - If no match is found, fall back to the parsed treatment from the filename.
- From filename: individual - use the parsed individual token
- From filename: treatment - use the parsed treatment token
- From filename: study - use the parsed study token
- From filename: time - use the parsed time token
- Fixed value (manual) - use a manually entered value from the sidebar
Additional fields:
- Fixed group value: used only when "Fixed value (manual)" is selected.
- Group fallback value: used when the computed group is blank (e.g. missing token).
These settings affect:
- the group column in the summary table
- the {group} token when building export filenames
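Put together, group resolution can be sketched as below (function and argument names are assumptions, not the app's API):

```python
def resolve_group(fields, mapping, source="auto", fixed_value="", fallback=""):
    """Derive the summary/export group value from the configured Group source."""
    if source == "auto":
        # individual -> mapping first, else the parsed treatment
        individual = fields.get("individual", "")
        group = next((g for g, ids in mapping.items() if individual in ids), "")
        if not group:
            group = fields.get("treatment", "")
    elif source == "fixed":
        group = fixed_value
    else:  # "individual", "treatment", "study" or "time"
        group = fields.get(source, "")
    return group or fallback  # blank results fall back to the fallback value
```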
By default the app runs in privacy mode (data stays local and the app will not access server folders). If you disable privacy mode you may choose the "Folder (server)" upload option to read CSVs from a path on the machine running the app. Folder access is powerful but potentially risky on public/shared servers — the app warns you and requires a confirmation checkbox before enabling server folder scanning.
- Column sanitization: headers are converted to letters, numbers and underscores only; see the mapping in the preview to find the correct names for expressions.
- Ratio division by zero yields NaN and is ignored in mean/std computations.
- If derived evaluation fails on a file, detailed errors are shown to help debugging (which derived name, which file, and a short message).
- Large files: the app will warn about files exceeding the configured MAX_FILE_MB and skip them to avoid memory issues.
- Session persistence: derived definitions and column selections persist in the session so you don't lose them on reruns.
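The column sanitization described in the first note above is, in spirit, a one-line regex substitution (an approximation; the app's exact rule may differ):

```python
import re

def sanitize_header(name: str) -> str:
    # Replace any run of characters that is not a letter, digit or underscore
    return re.sub(r"[^0-9A-Za-z_]+", "_", str(name)).strip("_")
```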
If you are concerned about privacy or prefer to keep all data on your machine, you can run the app locally. When run locally the app processes files only on your computer — no data is uploaded to any external server.
Recommended steps (macOS / zsh):

1. Create and activate a virtual environment:

   ```zsh
   python3 -m venv venv
   source venv/bin/activate
   ```

2. Install dependencies:

   ```zsh
   pip install -r requirements.txt
   ```

   Note: This requires internet access once to download packages. If you need a fully offline installation, create a wheelhouse on another machine and install from that.

3. Run the Streamlit app:

   ```zsh
   streamlit run app_streamlit.py
   ```

   By default this opens the app in your browser at http://localhost:8501 and keeps all processing local.

4. Use a local folder as the data source:

   - In the app sidebar choose "Folder (server)" or the equivalent option.
   - Enter an absolute or relative path to the folder on your machine containing the CSV files.
   - Tick "Scan folder now" (or click the scan/load button) to load files. The app will read files from that path on your disk.

5. Tips for strict offline use:

   - Install dependencies before disconnecting from the internet.
   - Do not enable any cloud integrations or remote paths in the app settings.
   - If you need to share the app with colleagues in a closed network, package the virtual environment or provide a requirements wheelhouse.

Quick commands (zsh):

```zsh
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
streamlit run app_streamlit.py
```