Support larger data via ARMA_64BIT_WORD #2

@pol-db4drd2

Description

Models cannot be estimated on $2^{16}$ or more (lower-level) data points, because ARMA_64BIT_WORD is not enabled in the compiled code.

Thanks for your attention. Data sets with more than 65535 units seem likely in practice (I am already working with a set of $\sim2^{17}$ points).
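For context, here is the arithmetic that I believe explains why the limit falls exactly at $2^{16}$. This is a sketch that assumes Armadillo, without ARMA_64BIT_WORD, requires the element count $n \times n$ of the weights matrix to fit in an unsigned 32-bit word (maximum $2^{32} - 1$); I have not checked this against the Armadillo sources.

$$
\begin{aligned}
n = 65535:&\quad n^2 = (2^{16} - 1)^2 = 2^{32} - 2^{17} + 1 = 4294836225 \le 2^{32} - 1 = 4294967295 \\
n = 65536:&\quad n^2 = (2^{16})^2 = 2^{32} = 4294967296 > 2^{32} - 1
\end{aligned}
$$

Under that assumption, the first case passes the size check and the second is the smallest square matrix that fails it, which matches the two calls in the reproducible example.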

Reproducible example

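do_model <- \(rows, cols) {
  size <- rows * cols  # number of lower-level units
  rho <- 0.3           # spatial autoregressive parameter used to simulate y
  
  # Row-standardised queen-contiguity weights on a rows x cols grid
  wlist <- spdep::cell2nb(rows, cols, "queen")
  wtrix <- spdep::nb2mat(wlist)
  wtrix <- Matrix::Matrix(wtrix)
  ident <- Matrix::Diagonal(nrow(wtrix))
  
  # Spatial filter (I - rho * W)
  W <- ident - rho * wtrix
  
  # Covariate x, group id z (8 groups), and group-level effect u
  X <- dplyr::tibble(x = rnorm(size), z = sample(1:8, size, TRUE))
  Z <- dplyr::tibble(z = 1:8, u = rnorm(8))
  X <- dplyr::left_join(X, Z, by = "z")
  # Simulate y by solving (I - rho * W) y = e, with e ~ N(x + u, 1)
  X <- dplyr::mutate(X, y = as.numeric(Matrix::solve(W, rnorm(size, .data$x + .data$u))))
  
  # Group-membership matrix linking each unit to its group
  d <- model.matrix(~ factor(z) + 0, data = X)
  d <- Matrix::Matrix(d)
  
  HSAR::hsar(y ~ x, X, wtrix, Delta = d)
}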

system.time(foo <- do_model(255, 257)) # 65535 data points

system.time(bar <- do_model(256, 256)) # 65536 data points

Expected behavior

Both calls succeed and create objects of class mcmc_hsar_lambda_0 named foo and bar.

Observed behavior

> system.time(foo <- do_model(255, 257)) # 65535 data points
    user   system  elapsed 
1929.525 3004.173  356.218 
> 
> system.time(bar <- do_model(256, 256)) # 65536 data points
Error: SpMat::init(): requested size is too large; suggest to enable ARMA_64BIT_WORD
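
The error message itself suggests the fix. As a minimal sketch, assuming HSAR builds against RcppArmadillo and has (or can be given) a `src/Makevars` file, the macro could be defined at compile time roughly like this. The file location and the choice of a compiler flag rather than a `#define` in the sources are my assumptions, not something I have verified against this repository:

```make
# src/Makevars (hypothetical location; a Makevars.win would need the same line)
# Define ARMA_64BIT_WORD so Armadillo uses 64-bit integers for sizes and indices.
PKG_CPPFLAGS = -DARMA_64BIT_WORD=1
```

The package would then need to be reinstalled from source for the flag to take effect. An alternative with the same intent would be `#define ARMA_64BIT_WORD 1` placed before `RcppArmadillo.h` is included in each source file.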

Session info

> sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 24.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8   
 [6] LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C        
[11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Matrix_1.6-5

loaded via a namespace (and not attached):
 [1] s2_1.1.7           sandwich_3.1-1     utf8_1.2.4         generics_0.1.3     class_7.3-22       LearnBayes_2.15.1 
 [7] KernSmooth_2.23-22 lattice_0.22-5     digest_0.6.37      magrittr_2.0.3     evaluate_1.0.3     grid_4.3.3        
[13] mvtnorm_1.3-3      pkgload_1.4.0      fastmap_1.2.0      e1071_1.7-16       HSAR_0.6.0         DBI_1.2.3         
[19] survival_3.5-8     spatialreg_1.3-6   multcomp_1.4-28    fansi_1.0.6        TH.data_1.1-3      codetools_0.2-19  
[25] cli_3.6.3          rlang_1.1.5        units_0.8-7        splines_4.3.3      yaml_2.3.10        tools_4.3.3       
[31] deldir_2.0-4       coda_0.19-4.1      dplyr_1.1.4        boot_1.3-30        vctrs_0.6.5        R6_2.5.1          
[37] zoo_1.8-13         proxy_0.4-27       lifecycle_1.0.4    classInt_0.4-11    MASS_7.3-60.0.1    spdep_1.3-10      
[43] pkgconfig_2.0.3    pillar_1.9.0       glue_1.8.0         Rcpp_1.0.14        sf_1.0-20          xfun_0.50         
[49] tibble_3.2.1       tidyselect_1.2.1   rstudioapi_0.17.1  knitr_1.49         htmltools_0.5.8.1  nlme_3.1-164      
[55] rmarkdown_2.29     wk_0.9.4           compiler_4.3.3     sp_2.2-0           spData_2.3.4      
