Skip to content

Conversation

@jayavenkatesh19
Copy link
Contributor

@jayavenkatesh19 jayavenkatesh19 commented Nov 20, 2025

Adding a notebook here which consists of an end-to-end workflow example to load and preprocess Sentinel-2 data, train a Random Forest model to perform LULC classification and perform inference on an unseen Sentinel-2 tile.

Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
@jayavenkatesh19 jayavenkatesh19 requested a review from a team as a code owner November 20, 2025 17:16
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:49Z
----------------------------------------------------------------

Chore: It would be nice to link things like Microsoft Planetary Computer to their docs, the datasets to theirs, etc. Can't have too many links!

Chore: I would also be careful to define acronyms here on there first use, there are a few like AOI that readers may not know.


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:50Z
----------------------------------------------------------------

Line #14.    FEATURES_ZARR = Path("")

Suggestion (non-blocking): I would be tempted to separate out the variables that you expect users to change into a new cell. That way it's more clear what needs to be done.


ncclementi commented on 2025-12-02T14:10:21Z
----------------------------------------------------------------

+1 to this comment, I thought of the same.

@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:51Z
----------------------------------------------------------------

Suggestion (non-blocking): Have you thought about using distributed.progress here to make it a bit more pleasant to watch?


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:51Z
----------------------------------------------------------------

suggestion (non-blocking): you could use cupy-xarray here instead of manually mapping the blocks over to cupy. It would be nice to encourage people to use that library.


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:52Z
----------------------------------------------------------------

thought: This cell is going to take a while right? I wonder if you should add a note to talk about nvdashboard and suggest folks go and have a look at the docs there so they can see the activity of their hardware


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:54Z
----------------------------------------------------------------

nitpick: You say "note how...", this could be a note admonition.


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:54Z
----------------------------------------------------------------

Line #1.    model_path = Path("/raid/jjayabaskar/full-outputs/models/lulc_rf_2022.pkl")

suggestion (non-blocking): Maybe this path variable should be defined at the top with the rest of them


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:55Z
----------------------------------------------------------------

nit: I don't think you need to describe closing the client


@review-notebook-app
Copy link

review-notebook-app bot commented Nov 25, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-25T16:52:56Z
----------------------------------------------------------------

thought: The command to "Run it now" made me chuckle


Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really nice! I left a few comments throughout (I'm trying out conventional comments, let me know what you think).

A few higher level thoughts:

The chunk of text above each code cell feels a little robotic. I'm guessing you used AI to generate some portion of these. It feels a little like that code comment trap where you just describe what the code does. I do appreciate some amount of that because the code is necessarily terse, lots of geo data analysis code is. But I think what it is missing is "why". You start out at the beginning with some high level goals. It would be nice to keep a narrative thread throughout explaining why you are doing each things and how each step is bringing us closer to that goal.

The notebook feels like it ends abruptly. There is no conclusion or recap of what was done. I think you need to wrap the post up in some way, probably by recapping what was done and then talking a little more about what could be improved or done next.

Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
@review-notebook-app
Copy link

review-notebook-app bot commented Nov 26, 2025

View / edit / reply to this conversation on ReviewNB

jacobtomlinson commented on 2025-11-26T17:45:09Z
----------------------------------------------------------------

Suggestion: This reads really nicely now. I would encourage you to add more links though. Any project name like dask, cudf, cupy, etc should be linked to their docs or GitHub repo on first use. I would also link NDVI and NDWI to their wikipedia pages.

I have two personas in my head for the readers of this post:

  1. Geospatial practitioners who are interested in Data Science
  2. Data Scientists who are interested in using geospatial data

At the moment this post reads very well for the 1st persona, but there is some geospatial terminology that it would be good to unpack for the 2nd persona. It's also worth noting that PyData Boston will be heavily weighted towards the 2nd persona.


ncclementi commented on 2025-12-02T14:04:02Z
----------------------------------------------------------------

+1 to Jacob's comment.

Few links to add, and typos:

The prerequisites are a bit confusing to me, I assume that you will probably explain that in the code below, if so maybe add a line that says so. If I try to see before continuing how do I get a Prerequisites 2 and 3, I'm not sure I know how, maybe mentioning that you'll show how this below? (if you show it), or point to a place that shows how to do that.

Copy link
Contributor

+1 to Jacob's comment.

Few links to add, and typos:

The prerequisites are a bit confusing to me, I assume that you will probably explain that in the code below, if so maybe add a line that says so. If I try to see before continuing how do I get a Prerequisites 2 and 3, I'm not sure I know how, maybe mentioning that you'll show how this below? (if you show it), or point to a place that shows how to do that.


View entire conversation on ReviewNB

Copy link
Contributor

+1 to this comment, I thought of the same.


View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:20Z
----------------------------------------------------------------

Do we have instructions on how to set up the environment somewhere, with pinned versions?


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:20Z
----------------------------------------------------------------

if this is the first time you introduce Zarr, then add a link to the library


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:21Z
----------------------------------------------------------------

Do we have any guides to point someone how to define this? most specifically the AOI_GEOJSON and FEATURE_ZARR, as somoene who is not entirely familiar with this, I wouldn't know how to do this.

If we can write as an example somewhere what you did for this notebook, that would be great.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:22Z
----------------------------------------------------------------

When you say "If you need to pin workers to specific GPUs or change memory limits" in this case, why would this happen? If you think it could be needed, add a link to the docs on how to do this, so the user has a guide.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:23Z
----------------------------------------------------------------

Line #1.    # %%

Is this a typo? if so we could remove it, I noticed it appears in several cells, not sure if it's intentional


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:24Z
----------------------------------------------------------------

There is a lot of info here, that only experts will understand, having a guide on how to prepare the AOI geometry as a reference would be helpful for the user.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:24Z
----------------------------------------------------------------

Line #19.    project = pyproj.Transformer.from_crs(4326, target_epsg, always_xy=True).transform

Is this because your CRS is on 4326 right?

It would be helpful to know what choices are made because of your particular case, and what is not, maybe some comments around why the target_epsg is what you defined.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:25Z
----------------------------------------------------------------

"Sentinel-2 tiles also include a special band called" ... It would be nice to have a link to the docs that supports this, in case the user wants to read more about it.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:26Z
----------------------------------------------------------------

Why do we do this step, maybe a small not mentioning why?

In general, I noticed that all the steps have text above on what are we doing, but not why is that needed or part of the process, I think this would be helpful for the audience with less geo knowledge.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:26Z
----------------------------------------------------------------

"data preprocessing steps are done ", At this point, after so many steps, it would be worth adding a sentence or two re-caping what was done, in words.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:27Z
----------------------------------------------------------------

Line #2.    random_state = 42

Is this for the RF? Do we need a random state? Asking for two reasons:

  • setting a random_state will make things slower, to guarantee replicability.
  • What version of cuml are you using? there have been a few updates around random state, not sure if any of this will affect.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:28Z
----------------------------------------------------------------

Add a comment, to expand on what is each plot in parentheses to remind the user: like

"NDVI ( short desc ), NDWI ( short desc ), and WorldCover ( short desc )"


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:29Z
----------------------------------------------------------------

"The strong skew you see here comes" is this s related to class 7? maybe add class 7 = Building area in parentheses.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:30Z
----------------------------------------------------------------

How did you choose those values (n_estimators, n_bins, n_streams), if these are optimized mention how you got there, if these are a good starting point, mentioned that. To give the user a reference.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:31Z
----------------------------------------------------------------

Do we know how long it takes for the hardware you run this on? It is helpful to give the user a reference of time.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 2, 2025

View / edit / reply to this conversation on ReviewNB

ncclementi commented on 2025-12-02T15:15:31Z
----------------------------------------------------------------

Any idea what those warnings are about?


Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>
@jayavenkatesh19 jayavenkatesh19 self-assigned this Dec 6, 2025
@jayavenkatesh19 jayavenkatesh19 changed the title [WIP] Notebooks for Geospatial Analysis using RAPIDS Notebooks for Geospatial Analysis using RAPIDS Dec 6, 2025
@jayavenkatesh19 jayavenkatesh19 changed the title Notebooks for Geospatial Analysis using RAPIDS Workflow Example for Geospatial Analysis using RAPIDS Dec 6, 2025
@jayavenkatesh19 jayavenkatesh19 changed the title Workflow Example for Geospatial Analysis using RAPIDS Add a Workflow Example for Geospatial Analysis using RAPIDS Dec 6, 2025
Copy link
Contributor

@ncclementi ncclementi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jacobtomlinson jacobtomlinson merged commit 78cfd6f into rapidsai:main Dec 8, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants