k_nn calculation #3

@ansonrel

Hi,

First of all, thank you for your package and for the great manuscript that is linked to it!
I'm trying to run Enhance on different datasets and some of them returned an error:

denois <- enhance (data)
[1] "Calculating number of neighbors to aggregate to aim for 2e+05 transcripts"
[1] "Number of neighbors to aggregate: 1"
[1] "Number of principal components to use: 50"

Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) :
  'x' must be an array of at least two dimensions

10. stop("'x' must be an array of at least two dimensions")
9. base::rowSums(x, na.rm = na.rm, dims = dims, ...)
8. rowSums(D[, indices])
7. rowSums(D[, indices]) at enhance.R#147
6. FUN(X[[i]], ...)
5. lapply(X = X, FUN = FUN, ...)
4. sapply(nn, function(indices) {
       rowSums(D[, indices])
   })
3. sapply(nn, function(indices) {
       rowSums(D[, indices])
   }) at enhance.R#146
2. aggregate_nearest_neighbors(D = data_raw, nn = nn_1) at enhance.R#223
1. enhance(data)

Strangely, I only get this error with my bigger datasets (>1e+06 transcripts per cell). I am surprised that k_nn is estimated as 1, and I guess that this is the source of the error.

Looking at the code, I see that k_nn, the number of neighbors to aggregate, is defined as

  med_raw = median(colSums(data_raw))
  k_nn = ceiling(target_transcripts / med_raw)

Maybe I am missing an important point, but shouldn't k_nn be equal to ceiling(med_raw / target_transcripts) instead, so that the number of neighbors to aggregate increases with the number of transcripts per cell?

In any case, do you know how I could avoid the error when k_nn = 1? Should I increase k_nn to 2, or does it mean that the dataset is too big/small for the method to work?
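
One possible explanation, offered as an assumption rather than a confirmed diagnosis: in R, selecting a single column with `D[, indices]` simplifies the result to a plain vector, which `rowSums` rejects, and `drop = FALSE` is the usual way to keep the matrix shape. The sketch below illustrates the same idea in Python on a small list-of-rows matrix. The helper `aggregate_columns` and the toy `counts` data are hypothetical and are not taken from the package.

```python
def aggregate_columns(matrix, indices):
    """Sum the selected columns of a genes x cells matrix (a list of rows).

    Hypothetical helper: a bare integer index is wrapped in a list, so
    selecting a single neighbor is handled the same way as selecting several.
    """
    if isinstance(indices, int):
        indices = [indices]
    # One aggregated value per row (gene), summed over the chosen cells.
    return [sum(row[j] for j in indices) for row in matrix]


# Toy data: 2 genes (rows) x 3 cells (columns).
counts = [
    [1, 0, 3],
    [2, 5, 0],
]

single = aggregate_columns(counts, 2)        # one neighbor only (k = 1)
several = aggregate_columns(counts, [0, 2])  # two neighbors
print(single, several)
```

With a guard like this, the case k = 1 simply returns the cell's own counts instead of failing. Whether aggregating a single neighbor is meaningful for the method is a separate question.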

[EDIT]
Sorry, I posted this issue in the Python repository instead of the R one. However, the calculation of k_nn is the same there, so I guess my questions are still relevant here:

k = int(ceil(target_transcript_count / transcript_count))
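
To make the arithmetic concrete, here is a minimal sketch of that formula applied to a shallow and a deep dataset. The helper name `neighbors_to_aggregate` and the example transcript counts are assumptions made for illustration. Only the formula and the 2e+05 target come from the output above.

```python
import math
from statistics import median


def neighbors_to_aggregate(transcripts_per_cell, target_transcripts=200_000):
    """Hypothetical helper mirroring k = ceil(target / median transcripts per cell)."""
    med_raw = median(transcripts_per_cell)
    return math.ceil(target_transcripts / med_raw)


# Shallow dataset: about 5,000 transcripts per cell (made-up numbers).
k_shallow = neighbors_to_aggregate([4_000, 5_000, 6_000])

# Deep dataset: 1e6 or more transcripts per cell (made-up numbers).
k_deep = neighbors_to_aggregate([1_000_000, 1_200_000, 1_500_000])

print(k_shallow, k_deep)
```

Under this formula, k shrinks as the per-cell depth grows, and it bottoms out at 1 once the median cell already exceeds the target. That matches the "Number of neighbors to aggregate: 1" message reported for the larger datasets.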

Thanks,
Anthony
