Skip to content

Vectorization of Fortran loops and other optimizations #227

@AlexKurek

Description

@AlexKurek

Just a question / suggestion: this are the results of profiling:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      571 7110.513   12.453 7110.513   12.453 {built-in method posix.waitpid}
        4 1365.157  341.289 2005.113  501.278 wavelet_atrous.py:667(check_islands_for_overlap)
        1  830.799  830.799  830.799  830.799 psf_vary.py:685(tess_simple)
     3407  381.475    0.112  381.497    0.112 necompiler.py:978(re_evaluate)
     1544  312.586    0.202  312.586    0.202 {method 'partition' of 'numpy.ndarray' objects}
       20  305.964   15.298  305.964   15.298 {built-in method scipy.ndimage._nd_image.geometric_transform}
       44  288.652    6.560  288.652    6.560 {built-in method gc.collect}
        5  270.768   54.154 1012.391  202.478 functions.py:2223(bstat)
      214  168.266    0.786  190.097    0.888 wavelet_atrous.py:649(renumber_islands)
   314391  137.325    0.000  225.320    0.001 _methods.py:196(_var)
  5110219  134.055    0.000  134.055    0.000 {method 'reduce' of 'numpy.ufunc' objects}
        6  117.511   19.585  251.401   41.900 make_residimage.py:25(__call__)
    27523  106.675    0.004  106.675    0.004 {method 'flatten' of 'numpy.ndarray' objects}
10056453/7133976   97.635    0.000  817.598    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
    31858   55.493    0.002   55.493    0.002 {built-in method posix.read}

Have you tried to vectorize loops there:
https://github.com/lofar-astron/PyBDSF/blob/master/src/fortran/pytess_simple.f#L16C1-L29C15
?
ChatGPT seems to be doing this:

program vectorized_code
  implicit none
  integer, parameter :: m = 100, n = 100, ngens = 10
  integer :: i, j, k
  real(8), dimension(n, m) :: volrank
  real(8), dimension(ngens) :: xgens, ygens, wts
  real(8), dimension(n, m, ngens) :: dist
  real(8) :: dumr
  integer, dimension(n, m) :: minind

  ! Initialize generators and weights (example values)
  do k = 1, ngens
     xgens(k) = k
     ygens(k) = k
     wts(k) = 1.0d0
  end do

  ! Vectorized computation
  volrank = 0.d0
  do k = 1, ngens
     ! Calculate distance from generator k to all points (i, j)
     dist(:, :, k) = sqrt((reshape(([(i, i = 1, n)], [n, m]) - xgens(k)), [n, m]) ** 2 + &
                          (reshape(([(j, j = 1, m)], [m, n]) - ygens(k)), [m, n]) ** 2) / wts(k)
  end do

  ! Find the minimum distance and corresponding generator index
  do j = 1, m
     do i = 1, n
        dumr = minval(dist(i, j, :))
        minind(i, j) = minloc(dist(i, j, :), dim = 1)
     end do
  end do
end program vectorized_code

but Im bad at Fortran so I cannot verify if this is OK.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions