Docs: Add extended quickstart and installation guides (release + source) #2388

Open
yiseungmi87 wants to merge 4 commits into apache:main from yiseungmi87:docs/extended-quickstart

@yiseungmi87 commented Dec 18, 2025

This PR introduces improved documentation for new users of SystemDS:

Added

  • quickstart_extended.md - Overview page linking installation and execution docs
  • release_install.md - Clean, updated installation guide for release users
  • source_install.md - Updated guide for building SystemDS from source
  • run_extended.md - Comprehensive execution guide (local, Spark, federated)
  • run.md - Slightly modified

Scope

  • Documentation-only changes
  • No changes to SystemDS code or runtime behavior
  • Existing run.md is intentionally left for compatibility

Purpose

These changes provide clearer onboarding for new SystemDS users and consolidate documentation into a consistent structure.

Let me know if adjustments are desired before merging.

@janniklinde (Contributor) left a comment

Thank you @yiseungmi87 for the good first PR, it seems to be quite clear and understandable so far.

I did not manage to set up SystemDS on Ubuntu by following only your guide (which should be the goal of the install guide), so please look into that. You can use a clean Docker image to follow your guide and identify possible points of failure. Similarly, please check that no such weak points exist for the other operating systems (if you have Windows, maybe try the setup as a new user). Also, I realized that cloning the SystemDS source code via GitHub Desktop on Windows might get stuck during the cloning process, so we should provide a solution for that (e.g. use the 'git' CLI for cloning rather than the app). So far, I have not tested the install on Windows / macOS, but I will do so once my current comments are resolved.

@yiseungmi87 (Author) commented

Thanks again for the detailed feedback, @janniklinde!

In addition to the changes in this PR, I verified the guides end-to-end on clean OS environments (Windows, Ubuntu, macOS).
Following only the Quickstart and installation guides, I was able to successfully complete the setup from installation through running the basic Hello World example on all tested systems.

I’ve also addressed the remaining minor issues, such as Markdown rendering problems and duplicate explanations.

Happy to adjust further if anything is still unclear or fails in your setup.

@janniklinde (Contributor) left a comment

Thank you for the improvements, @yiseungmi87. I left you some more feedback on where you could further refine the quickstart guide.

@yiseungmi87 (Author) commented

Hi, thanks a lot for the feedback! @janniklinde

I’ve updated the PR to reflect all points mentioned so far.

Regarding the note that this file feels like a clone of docs/site/install.md: that makes sense. The changes here are mainly meant as clarifications and fixes to the existing install guide, not as a separate alternative. So, what I understood is that the right next step would be to apply these changes directly to install.md and drop the separate file.

Before doing that, I just wanted to double-check that this approach is what you meant.

Thanks again!

@janniklinde (Contributor) left a comment

Thanks again for your contribution, @yiseungmi87, and sorry for the late response. The install instructions now seem to work. I noticed that your references are sometimes inconsistent (sometimes you link to the .html page and sometimes to the .md file).

Please replace the existing files (install.md and run.md) and double check the links/references. Also, the old run.md contains instructions for using Intel MKL native instructions. Is there a specific reason you left that out?

Finally, for this guide to be accessible as HTML, it needs the header (like in the old install.md). Once these issues are addressed, I think we are ready to merge.
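
Editorial note: the mixed .html/.md references mentioned in this review could be caught with a small script. The following is a minimal sketch and not part of this PR; the function name `link_styles` and the grouping into "html", "md", and "bare" are assumptions made for illustration, and the sketch only handles inline Markdown links of the form `[text](target)`.

```python
import re

# Inline Markdown links: [text](target). Reference-style links are not covered.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URI scheme (https:, mailto:, ...) is treated as absolute.
SCHEME = re.compile(r"[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def link_styles(markdown: str) -> dict[str, list[str]]:
    """Group relative link targets by how they reference other doc pages."""
    styles: dict[str, list[str]] = {"html": [], "md": [], "bare": []}
    for target in LINK.findall(markdown):
        if SCHEME.match(target) or target.startswith("#"):
            continue  # skip absolute URLs and in-page anchors
        path = target.split("#", 1)[0]  # ignore a trailing #fragment
        if path.endswith(".html"):
            styles["html"].append(target)
        elif path.endswith(".md"):
            styles["md"].append(target)
        else:
            styles["bare"].append(target)
    return styles
```

Running `link_styles` over each guide and flagging files where more than one of the three lists is non-empty would surface the inconsistency; which style the site generator actually expects is left open here.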

@yiseungmi87 (Author) commented

Thanks for the feedback!
I’ve now merged the extended guides into the existing install.md and run.md.
All links and references were updated accordingly.
Please let me know if anything still looks off.

@janniklinde (Contributor) left a comment

Thank you for the updates and the overall contribution, @yiseungmi87. With this PR, your DIA project is successfully completed. I left some minor notes for us to fix before merging. We will take it from here.

(Optional but recommended) To make SystemDS available in new terminals, add the following lines to your shell configuration (e.g., ~/.bashrc or ~/.profile):
```bash
export SYSTEMDS_ROOT=/absolute/path/to/systemds-<VERSION>
export SYSTEMDS_JAR_FILE=$(find "$SYSTEMDS_ROOT" -maxdepth 1 -type f -name "systemds-*.jar" | head -n 1)
```

Contributor comment:

Putting this line into .bashrc or .profile is usually bad practice because this command then runs every time you start a new terminal. A fixed path like /absolute/path/to/systemds-<VERSION>-bin/systemds-<VERSION>.jar would be preferred.

```
Hello World!
```

On some Ubuntu setups (including clean Docker images), running SystemDS directly may fail with `Invalid or corrupt jarfile hello.dml` Error. Ensuring that `SYSTEMDS_JAR_FILE`points to the SystemDS JAR shipped with the release resolves this issue.
Contributor comment:

This can be removed: the manual already explicitly states to set SYSTEMDS_JAR_FILE, so this note is redundant.


If you simply want to *use* SystemDS without modifying the source code, the recommended approach is to install SystemDS from an official Apache release.

**Full Release Installation Guide:** [SystemDS Install from release](https://apache.github.io/systemds/site/release_install.html)
Contributor comment:

Name this link like the title of release_install.


If you plan to contribute to SystemDS or need to modify its internals, you can build SystemDS from source.

**Full Source Build Guide:** [SystemDS Install from source](https://apache.github.io/systemds/site/source_install.html)
Contributor comment:

Name this link like the title of source_install, and update the reference because the file is now called install.md.

At the time of writing the commands to install R 4.0.2 are:
R can be installed using the CRAN repository.

**Ubuntu 22.04**
Contributor comment:

both

@Baunsgaard (Contributor) left a comment

A few comments, but LGTM.

It would be good to have explicit tests for the docs.

Consider adding them as Python tests inside /src/main/python

The tests should parse the docs, extract script parts, and run.
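
Editorial note: the parse, extract, and run idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's test setup: the helper names `extract_blocks` and `run_block` are hypothetical, only fenced blocks with an exact language tag are considered, and snippets that need `sudo` or network access would have to run inside a disposable container rather than on a developer machine.

```python
import subprocess


def extract_blocks(markdown: str, lang: str = "bash") -> list[str]:
    """Return the bodies of all fenced code blocks tagged with `lang`."""
    blocks: list[str] = []
    current: list[str] = []
    in_block = False
    keep = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            if not in_block:
                # Opening fence: remember whether its tag matches `lang`.
                in_block = True
                keep = stripped[3:].strip() == lang
                current = []
            else:
                # Closing fence: store the body only for matching blocks.
                if keep:
                    blocks.append("\n".join(current))
                in_block = False
            continue
        if in_block:
            current.append(line)
    return blocks


def run_block(script: str, timeout: int = 60) -> subprocess.CompletedProcess:
    """Run one extracted snippet with bash, stopping at the first failing command."""
    return subprocess.run(
        ["bash", "-e", "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

A test could then read a guide, call `extract_blocks` on its text, and assert that `run_block(snippet).returncode == 0` for each snippet that is safe to execute in the test environment.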

Comment on lines +113 to +123
**Ubuntu 22.04**

```bash
sudo apt install dirmngr gnupg apt-transport-https ca-certificates software-properties-common
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
sudo apt update
sudo apt install r-base
```

Optionally, you need to install the R dependencies for integration tests, like this:
(use `sudo` mode if the script couldn't write to local R library)
**Ubuntu 22.04**
Contributor comment:

Both say 22.04, one should be the old.

Setup your environment variables with JAVA_HOME and MAVEN_HOME. Using these variables add the JAVA_HOME/bin and MAVEN_HOME/bin to the path environment variable. An example of setting it for Java can be found here: <https://www.thewindowsclub.com/set-java_home-in-windows-10>

To run the system we also have to setup some Hadoop and spark specific libraries. These can be found in the SystemDS repository. To add this, simply take out the files, or add 'src/test/config/hadoop_bin_windows/bin' to PATH. Just like for JAVA_HOME set a HADOOP_HOME to the environment variable without the bin part, and add the %HADOOP_HOME%/bin to path.
To run the system we also have to setup some Hadoop and Spark specific libraries. These can be found in the SystemDS repository. To add this, simply take out the files, or add 'src/test/config/hadoop_bin_windows/bin' to PATH. Just like for JAVA_HOME set a HADOOP_HOME to the environment variable without the bin part, and add the `%HADOOP_HOME%\bin` to path.
Contributor comment:

use / not \

# 6. Next Steps

To execute dml scripts look at [Execute SystemDS](run), this step is not needed to develop in systemds, but it helps setting up the command-line execution of systemds.
Now everything is setup and ready to go! For running scripts in Spark mode or experimenting with federated workers, see the Execution Guide: [Execute SystemDS](run.html)
Contributor comment:

Not completely sure, but I think we can simply link to run, not run.html.
