Replace values with commands when summary report is large #9

@gadenbuie

Summary reports for large data sets get out of hand very quickly.

── different: Comparison Summary ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
# Dimensions
    set    rows  cols
    ----- ----- -----
    .$x   20437   500
    .$y   20502   378

# Columns
● 165 columns in .$x are not in .$y:
    `TCGA-2A-A8VX-01`, `TCGA-2A-AAYF-01`, `TCGA-2A-AAYO-01`, `TCGA-2A-AAYU-01`, `TCGA-4L-AA1F-01`, 
    `TCGA-CH-5745-01`, `TCGA-EJ-7312-01`, `TCGA-EJ-7325-01`, `TCGA-EJ-A46B-01`, `TCGA-EJ-A46E-01`, `TCGA-EJ-
    A46F-01`, `TCGA-EJ-A46H-01`, `TCGA-EJ-A65B-01`, `TCGA-EJ-A65D-01`, `TCGA-EJ-A65M-01`, `TCGA-EJ-A7NN-01`, 
    ...full list of 165 columns...

● 43 columns in .$y are not in .$x:
    `TCGA-CH-5761-11A-01R-1580-07`, `TCGA-CH-5767-11B-01R-1789-07`, `TCGA-CH-5768-11A-01R-1580-07`, `TCGA-CH-5769-
    11A-01R-1580-07`, `TCGA-EJ-7115-11A-01R-2118-07`, `TCGA-EJ-7123-11A-01R-1965-07`, `TCGA-EJ-7125-11A-01R-1965-07`, `TCGA-
    EJ-7314-11A-01R-2118-07`, `TCGA-EJ-7315-11A-01R-2118-07`, `TCGA-EJ-7317-11A-01R-2118-07`, `TCGA-EJ-7321-11A-01R-2263-07`, 
   ...full list of 43 columns...

● 335 columns appear in both .$x and .$y
  ✔ 2 columns have identical entries: 
    `Entrez_Gene_Id`, `Hugo_Symbol`
  ✖ 333 columns have differences: 
    `TCGA-2A-A8VL-01`, `TCGA-2A-A8VO-01`, `TCGA-2A-A8VT-01`, `TCGA-2A-A8VV-01`, `TCGA-2A-A8W1-01`, 
    `TCGA-2A-A8W3-01`, `TCGA-CH-5737-01`, `TCGA-CH-5738-01`, `TCGA-CH-5739-01`, `TCGA-CH-5740-01`, `TCGA-CH-
    5741-01`, `TCGA-CH-5743-01`, `TCGA-CH-5744-01`, `TCGA-CH-5746-01`, `TCGA-CH-5748-01`, `TCGA-CH-5750-01`, 
    ...full list of 333 columns...

# Differences
✖ There were 1690641 differences across 333 cols and 5077 rows
    variable        type.x  type.y  state n_diff diff                
    -----           -----   -----   -----  ----- ------              
    TCGA-2A-A8VL-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-2A-A8VO-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-2A-A8VT-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-2A-A8VV-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-2A-A8W1-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-2A-A8W3-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-CH-5737-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-CH-5738-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-CH-5739-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
    TCGA-CH-5740-01 numeric numeric diff    5077 <tibble [5,077 × 7]>
     with 533 more rows

When many columns are listed, the report should point to a helper function instead of printing the full list:

# Columns
● 165 columns in .$x are not in .$y:
    Use `diff_cols_unique()` to list unique columns.

● 43 columns in .$y are not in .$x:
    Use `diff_cols_unique()` to list unique columns.

● 335 columns appear in both .$x and .$y
    Use `diff_cols_common()` to list common columns.

or possibly

# Columns
● 165 columns in .$x are not in .$y
● 43 columns in .$y are not in .$x
● 335 columns appear in both .$x and .$y
i Use `diff_cols_unique()` or `diff_cols_common()` to list unique or common columns
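
A minimal sketch of the switching behaviour, in base R. The helper name `format_col_list()`, its arguments, and the `max_show = 10` threshold are assumptions made up for illustration; they are not part of the package, and `diff_cols_unique()` / `diff_cols_common()` are only the names proposed above.

```r
# Hypothetical helper: not part of the package's actual API.
# Returns the text printed under a "columns" bullet in the summary.
format_col_list <- function(cols, hint_fn, max_show = 10) {
  if (length(cols) > max_show) {
    # Too many columns: point the reader to a helper function instead.
    return(sprintf("Use `%s` to list these columns.", hint_fn))
  }
  # Short list: print the backtick-quoted names inline.
  paste(sprintf("`%s`", cols), collapse = ", ")
}

few  <- c("Entrez_Gene_Id", "Hugo_Symbol")
many <- sprintf("TCGA-%03d", 1:165)  # made-up stand-in names

cat(format_col_list(few, "diff_cols_common()"), "\n")
cat(format_col_list(many, "diff_cols_unique()"), "\n")
```

With a threshold like this, short lists such as the two identical columns would still print in full, and only the long lists would collapse to a hint.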
