Installation and usage instructions updated #17
Closed
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -47,20 +47,152 @@ docker run --rm -v $PWD:/home/work tschijnmo/drudge:gristmill python3 script.py | |
| can also execute the script directly. | ||
|
|
||
|
|
||
| # Downloads and installation (Native) | ||
| # Local Installation (Development) | ||
|
|
||
| For development, the drudge stack can also be downloaded, compiled, and | ||
| installed from source. For most non-developmental users, execution by Docker | ||
| is recommended. | ||
|
|
||
| ## Conda | ||
|
|
||
| Conda is _strongly_ recommended for building and running this program locally. Conda is a utility for managing environments, which work like containers that hold a specific set of software installations. Just as a Docker container can pin exact versions and configurations of dependencies, so can a Conda environment. The install script relies on Conda and will not run without it. To get Conda, follow the [Miniconda installation instructions](https://www.anaconda.com/docs/getting-started/miniconda/install) for your operating system. Once it is installed, you may proceed. | |
|
|
||
| ## Install Script (Linux-x86_64) | ||
|
|
||
| **These instructions work only on Linux x86_64 machines**. If you are using a Mac or a different architecture, skip straight to the manual installation instructions. | |
|
|
||
| The install script lives at `drudge/install/install.sh`. Run it from the root drudge directory with | |
| ``` | ||
| source install/install.sh | ||
| ``` | ||
| You must use `source` because it executes the commands in your current shell (as you) rather than in a subshell, where conda usually breaks. Running this one line should be enough to get everything set up. At this point, if you'd like to use VSCode, you can run | |
|
|
||
| ``` | ||
| code . | ||
| ``` | ||
|
|
||
| to open a vscode window to the drudge location. | ||
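Why `source` matters for the install script can be seen in a small experiment. This is only an illustrative sketch: `DEMO_VAR` and the temporary script are hypothetical and have nothing to do with drudge itself. A variable set by a script run in a child process disappears when the child exits, while the same script read with `.` (the POSIX spelling of `source`) changes the current shell.

```shell
# Write a tiny script that sets a variable, then run it both ways.
demo=$(mktemp)
echo 'DEMO_VAR=set_by_script' > "$demo"

unset DEMO_VAR
sh "$demo"                       # runs in a child process
after_child=${DEMO_VAR:-unset}   # the child's variable did not survive

. "$demo"                        # same as `source` in bash and zsh
after_source=${DEMO_VAR:-unset}  # the variable now lives in this shell

echo "after child process: $after_child"
echo "after source:        $after_source"
rm -f "$demo"
```

The same reasoning applies to `conda activate`, which has to modify the shell you are typing in.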
|
|
||
| Should anything go wrong, work your way through the manual installation instructions below. | |
|
|
||
| ## Manual Installation | ||
|
|
||
| These instructions follow the existing install script step by step, but provide a more interactive experience if needed. Before proceeding, enter the root drudge directory, which is the base directory of the GitHub repository. | |
|
|
||
| ### Create your Conda Environment | ||
| ``` | ||
| conda create --name $ENV_NAME python=3.9 -y | ||
| conda install --name $ENV_NAME --file install/$ENV_TYPE.txt -y | |
|
|
||
| conda init | ||
| conda activate $ENV_NAME | ||
| ``` | ||
|
|
||
| #### Parameters: | ||
| - `$ENV_NAME`: The name of your environment. Replace it with an environment name that does not already exist; check the existing names with `conda env list`. | |
| - `$ENV_TYPE.txt`: The environment dependencies file to use. There are two in the `install/` folder, `env_x86.txt` and `env_arm.txt`. To find out which one you need, run `uname -m`, which returns either `x86_64` or `arm64`: | |
| - `arm64 -> env_arm.txt` | |
| - `x86_64 -> env_x86.txt` | |
| - If you get something other than `arm64` or `x86_64`, feel free to try installing either file anyway and let us know if it doesn't work. | |
| - (UPDATE) Conda has been having issues lately, so if you encounter problems, make sure you're on the latest conda version with `conda update -n base -c defaults conda` and install the packages one at a time with `conda install`. Attempting to install multiple packages in a single transaction sometimes breaks conda. | |
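The architecture-to-file mapping in the parameter list can be sketched as a small shell helper. The function name `pick_env_file` is hypothetical and not part of the install script; it only encodes the two mappings listed above and reports failure for any other architecture.

```shell
# Map a machine architecture string to one of the two dependency files
# named in the parameter list. Prints nothing and returns 1 otherwise.
pick_env_file() {
    case "$1" in
        x86_64) echo "install/env_x86.txt" ;;
        arm64)  echo "install/env_arm.txt" ;;
        *)      return 1 ;;
    esac
}

arch=$(uname -m)
if env_file=$(pick_env_file "$arch"); then
    echo "Use $env_file for $arch"
else
    echo "No dependency file is listed for $arch; try either one"
fi
```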
|
|
||
| ### Clone Submodules | ||
| ``` | ||
| git submodule update --init --recursive | ||
| ``` | ||
| This fetches the GitHub repositories of the necessary dependencies. | |
|
|
||
| ### Set Environment Variables and Build | ||
| ``` | ||
| python3 setup.py build | ||
| python3 setup.py install | ||
|
|
||
| export PYTHONPATH=/PATH/TO/drudge/build | ||
| export DUMMY_SPARK=1 | ||
| ``` | ||
|
|
||
| The first two lines compile the C++ sources into CPython extension modules that our Python program can import and execute. Python cannot import C++ files directly, so this step is necessary for a fully functioning drudge. | |
|
|
||
| The next two lines set the necessary environment variables so that Python knows where to find our imports and uses the local dummy_spark instead of Apache Spark (which isn't quite working on Python 3.9 yet). `PYTHONPATH` should be set to the build directory inside the drudge repository; this directory is created by the `setup.py` `build` and `install` commands in the previous lines. | |
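One way to confirm that the build step produced compiled modules is to list the `.so` files under the build directory. This is a minimal sketch, assuming the build output lands in `./build` as described above; `list_built_extensions` is a hypothetical helper name, and the exact subfolder names under `build/` vary by platform.

```shell
# List compiled extension modules (*.so) under a build directory.
# The directory defaults to ./build, matching the PYTHONPATH example.
list_built_extensions() {
    build_dir=${1:-build}
    if [ ! -d "$build_dir" ]; then
        echo "no build directory at $build_dir" >&2
        return 1
    fi
    find "$build_dir" -type f -name '*.so'
}
```

Running `list_built_extensions` from the root drudge directory after the build should print the paths of the compiled modules; an empty result suggests that the compilation did not complete.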
|
|
||
| <!-- ### Copy the built cpython files | ||
| ``` | ||
| cp build/lib.linux-x86_64-cpython-39/drudge/wickcore.cpython-39-x86_64-linux-gnu.so drudge/ | ||
| cp build/lib.linux-x86_64-cpython-39/drudge/canonpy.cpython-39-x86_64-linux-gnu.so drudge/ | ||
| ``` | ||
| These are _**EXAMPLE**_ lines. Yours might look different. What you're looking for is: | ||
| 1) In the Build Folder inside the root folder | ||
| 2) In the folder that starts with `lib.` | ||
| 3) In its `drudge` folder | ||
| 4) The two files that end with `.so` | ||
|
|
||
| Copy both these files into the `drudge/drudge` folder that contains the `canonpy.cpp` and `wickcore.cpp` files. | ||
|
|
||
| This step is the reason that the script is impossible to generalize across operating systems. I can see little rhyme or reason for the way these files/folders get named so predicting them in code appears impossible. --> | ||
|
|
||
| ### Get Dummy Spark | ||
| Make sure you're in the base drudge directory, then run | |
| ``` | ||
| git clone https://github.com/DrudgeCAS/DummyRDD ../dummyRDD/ | ||
| cp -r ../dummyRDD/dummy_spark . | ||
| rm -rf ../dummyRDD/ | ||
| ``` | ||
|
|
||
| These commands do the following: | |
| 1) Clone the repository that contains dummy_spark just outside of the drudge directory | |
| 2) Copy the relevant `dummy_spark/` folder into the drudge base directory | |
| 3) Delete the now unnecessary `dummyRDD` repository | |
|
|
||
| ### Running It | ||
| Drudge should be completely installed now. You can open your desired location in your desired IDE, but you must ensure three things are true before running the code. | |
|
|
||
| 1) The Python interpreter used by the IDE must be set to the Python executable inside your conda environment, likely at something like `~/miniconda3/envs/drudge/bin/python`. This tells the IDE which Python executable to use to run all your code. | |
| 2) The conda environment must be activated in whatever is running the code. In VSCode there's a terminal which shows the debug or run commands when you click debug or run. This terminal is where the conda environment must be activated. This tells the IDE where to find all the dependencies/packages required for drudge to run. | |
| 3) The `PYTHONPATH` environment variable must be set to the `build` folder in your drudge repository. This build folder is created when you run the `setup.py` steps. This tells the IDE where to find all the drudge files that you'll want to import. | |
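The first two conditions can be checked from a terminal with a short helper. This is a sketch rather than part of the project: `python_in_active_env` is a hypothetical name, and it relies on `CONDA_PREFIX`, the variable conda sets to the directory of the active environment.

```shell
# Report whether `python` on PATH belongs to the active conda environment.
python_in_active_env() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "no conda environment is active"
        return 1
    fi
    py=$(command -v python) || { echo "python is not on PATH"; return 1; }
    case "$py" in
        "$CONDA_PREFIX"/*) echo "ok: $py" ;;
        *) echo "python resolves outside the active environment: $py"; return 1 ;;
    esac
}
```

If the helper reports a path outside the environment, activating the drudge environment again in that terminal is the first thing to try.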
|
|
||
| ## VSCode Specific Instructions | ||
| Some steps are required for drudge to function in its current state in VSCode. Treat this as a checklist every time you open the project. | |
|
|
||
| ### 1) Correct Interpreter | ||
| In the bottom right of VSCode, just to the left of the bell, you should see something like `3.9.20 ('drudge': conda)`. This denotes the currently selected Python interpreter (interpreter = Python executable file). If it does not show up, make sure you have a Python file open. Look through the interpreter list and choose the one that corresponds to the conda environment you made during installation. | |
|
|
||
| ### 2) Conda Environment Activated | ||
| In the terminal in the bottom portion of the window (you can open it with `ctrl/cmd + ~` if it's closed), with the `TERMINAL` tab selected, you should see your command prompt. At the very left there might be some names in parentheses, like `(drudge) (base)`. Make sure that your drudge environment name is among them. If it's not, run | |
|
Comment on lines +159 to +160
Collaborator: Same as above.
||
| ``` | ||
| conda activate drudge | ||
| ``` | ||
| with `drudge` replaced by whatever your environment name is. | |
|
|
||
| NOTE: If you're having import or version issues and have multiple environments listed in parentheses, it can be helpful to deactivate all environments by running | |
| ``` | ||
| conda deactivate | ||
| ``` | ||
| repeatedly until all environment names are cleared, and then activating your drudge environment again. | ||
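The repeated deactivation can also be written as a loop. This is a sketch that relies on `CONDA_SHLVL`, the counter conda keeps for how many environments are stacked in the current shell; the function name `deactivate_all` is hypothetical. It must be defined in, or sourced into, the interactive shell where conda is initialized.

```shell
# Keep deactivating until conda reports no stacked environments.
deactivate_all() {
    while [ "${CONDA_SHLVL:-0}" -gt 0 ]; do
        conda deactivate || return 1
    done
}
```

After `deactivate_all`, run `conda activate drudge` (with your own environment name) to start from a clean state.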
|
|
||
| ### 3) Set Environment Variables | ||
| ``` | ||
| export PYTHONPATH=$(pwd) | ||
| export DUMMY_SPARK=1 | ||
| ``` | ||
| Every time you open a new terminal, for example when: | |
| - You just opened VSCode | |
| - You clicked debug for the first time this session | |
| - You hit the plus sign to the right of the TERMINAL tab | |
|
|
||
| you'll need to make sure that the drudge conda environment is activated and that both environment variables are set. To set `PYTHONPATH` correctly, you need to be in the `drudge/` base directory (the one cloned from GitHub) when you run the command. | |
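Since these variables are easy to forget in a fresh terminal, a quick check can help. The helper below is a sketch with a hypothetical name, `check_drudge_vars`; it assumes `PYTHONPATH` holds a single directory, as in the commands above, and only verifies that `DUMMY_SPARK` is set to something non-empty.

```shell
# Report which of the two drudge environment variables still need setting.
check_drudge_vars() {
    status=0
    if [ -z "${PYTHONPATH:-}" ]; then
        echo "PYTHONPATH is not set"
        status=1
    elif [ ! -d "$PYTHONPATH" ]; then
        echo "PYTHONPATH does not point at a directory: $PYTHONPATH"
        status=1
    fi
    if [ -z "${DUMMY_SPARK:-}" ]; then
        echo "DUMMY_SPARK is not set"
        status=1
    fi
    return "$status"
}
```

The function prints nothing and returns 0 when both variables look right, so it can be run at the start of each terminal session.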
|
|
||
| ### You're set. | ||
| This should be everything you need to do to get drudge running. If there are problems or you encounter other errors, I recommend taking notes on what the problem was and what you did to fix it. | |
|
|
||
| <!-- | ||
| ## Dependencies | ||
|
|
||
| In order to fully take advantage of the latest technology, the drudge/gristmill | |
| stack requires Python 3.6 or later and Apache Spark 2.2 or later. To | |
| compile the binary components, a C++ compiler with good C++14 support is | |
| required. Clang++ later than 3.9 and g++ later than 6.3 are known to work. | |
|
|
||
|
|
||
| --> | ||
| ## Downloads | ||
|
|
||
| All components of the drudge/gristmill stack are hosted on Github. The | ||
|
|
@@ -84,7 +216,7 @@ submodules of | |
| wrapping core C++ native modules for Python with ease. | ||
|
|
||
| As a result, to clone the repositories, `--recurse-submodules` is recommended. | ||
|
|
||
| <!-- | ||
| ## Compilation and installation | ||
|
|
||
| By `setuptools`, inside the root directory of the source tree of drudge or | ||
|
|
@@ -94,4 +226,4 @@ gristmill, the compilation and installation can simply be | |
| python3 setup.py build | ||
| python3 setup.py install | ||
| ``` | ||
|
|
||
| --> | ||
I believe copying the compiled CPython libraries is not required if `python setup.py install` is successfully executed. @Wholinator can you double check?