The purpose of this preparatory assignment is to make sure that you (and your computer) are well prepared for the block course in August, so that technical issues do not hold you up during the course itself. Therefore, this assignment is mandatory and has to be submitted by Friday, August 7th.
The assignment can be accessed and submitted via GitHub Classroom. This requires a GitHub user account. Please create one and register for GitHub Education benefits.
Important notes:
- Please make sure to name the output files exactly as described.
- If you encounter problems which you cannot solve on your own, create a new topic in the GitHub forum of this course and describe your problem.
After you have completed this assignment you will be able to ...
- name different best practice methods for scientific computing,
- install and configure all required software for the course,
- use Jupyter notebook to write Python code and
- use git to track your work progress.
Read the paper by Wilson et al. (2014) on Best Practices for Scientific Computing and answer the following questions. Write your answer in a text file called scientific_programming.txt.
- The paper describes several problems scientists face when performing scientific data analyses. From your experience in performing (GIS) analyses, which of these problems seem familiar to you? Have you faced other problems not mentioned in the paper? (~100 words)
- Which methods described in the paper could help you avoid these problems in the future? (~100 words)
- One of the recommendations by Wilson et al. (2014) for scientists is to use a Version Control System (VCS). Briefly explain in your own words what the benefits of a VCS are in the context of scientific analyses. (~100 words)
Image source: “Piled Higher and Deeper” by Jorge Cham, http://www.phdcomics.com
- Follow the instructions given in the section Software Setup to install and configure all software that is required for the course on your computer.
- Execute the Python file check_environment.py from within your new Python environment to verify that all required packages have been installed successfully. The following command writes the output of the program to a new text file called check_environment_result.txt. Make sure that you don't get any import errors.
$ python check_environment.py > check_environment_result.txt
- Check whether the new environment is also available within PyCharm by executing the file check_environment.py from within PyCharm.
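The contents of check_environment.py are not reproduced here. As a rough sketch of how such a check can work, the following code tries to import each package by name and collects the ones that fail. The function name find_missing_packages and the package names in the example are placeholders chosen for illustration, not the course's actual requirements.

```python
import importlib


def find_missing_packages(package_names):
    """Return the names from package_names that cannot be imported."""
    missing = []
    for name in package_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError,
            # so a package that is not installed also ends up here.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list: replace it with the packages you want to check.
    example_packages = ["json", "math", "not_a_real_package_xyz"]
    missing = find_missing_packages(example_packages)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All packages could be imported.")
```

If a package shows up as missing although you installed it, the script is most likely running in a different Python environment than the one you set up.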
In this last section, you will use git to track the changes of the files that we have created so far. If you are not familiar with git yet, work through the section Introduction to git before continuing with the exercises.
- If you haven't done so already, clone the GitHub repository of this assignment on your computer. Make sure to use the URL of your repository.
$ git clone https://github.com/geoscripting/preparatory-assignment-redfrexx.git
- In the previous two exercises you have created two files. Copy these files into the main folder of the cloned repository. Create a commit for each one of them in order to track them in your local git repository, e.g.
$ git add scientific_programming.txt
$ git commit -m "added scientific_programming.txt"
- Activate the advgeo environment and start a Jupyter Notebook server.
$ jupyter notebook
- Create a new Jupyter notebook using the Python 3 kernel and name it preparatory_assignment.ipynb.
- Import the function check_packages() from check_environment.py and execute it in order to check whether the notebook is using the right Anaconda environment advgeo. If you get a ModuleNotFoundError or ImportError, check whether the right kernel is selected (Jupyter Menu: Kernel → Change kernel).
- If everything works, save your notebook and add it to your git repository by creating a new commit.
- Within the notebook, write a new function called list_sum(), which calculates the sum of all numbers in a list, e.g.
>>> numbers = [1, 2, 3, 4]
>>> numbers_sum = list_sum(numbers)
>>> print(numbers_sum)
10
- Create a new commit containing the changes to the notebook.
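Two small self-checks can help with the notebook steps above. This is only a sketch: the helper names interpreter_path and check_list_sum are made up for illustration, and the assumption that the interpreter path contains the environment name holds for a standard conda layout (such as .../envs/advgeo/...) but may not hold for every installation.

```python
import sys


def interpreter_path():
    """Return the path of the Python interpreter running this code."""
    return sys.executable


def check_list_sum(candidate):
    """Compare candidate(numbers) with Python's built-in sum() on a few lists."""
    test_lists = [[1, 2, 3, 4], [], [-5, 5], [0.5, 1.5]]
    return all(candidate(numbers) == sum(numbers) for numbers in test_lists)


# Run in a notebook cell: the printed path shows which interpreter
# the selected kernel is using.
print(interpreter_path())

# Pass your own list_sum here. The built-in sum is only a stand-in
# so that this sketch runs on its own.
print(check_list_sum(sum))
```

The comparison against the built-in sum() is meant as a check of your own implementation, not as a replacement for writing it.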
The last task of this assignment is to push all your commits to GitHub.
$ git push origin master
One last note: If you make further edits to your files (e.g. editing your answers to the questions in exercise 1) remember that you can always synchronise these changes with your GitHub repository by creating further commits and pushing them to GitHub.
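Because the output files have to be named exactly as described, it can be worth checking the file names before pushing. The following sketch looks for the three files named in this assignment in a given folder. The function name missing_files is hypothetical, and the example assumes that the script is run from the main folder of the cloned repository.

```python
from pathlib import Path

# File names as given in this assignment.
EXPECTED_FILES = [
    "scientific_programming.txt",
    "check_environment_result.txt",
    "preparatory_assignment.ipynb",
]


def missing_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected file names that do not exist in repo_dir."""
    repo = Path(repo_dir)
    return [name for name in expected if not (repo / name).is_file()]


if __name__ == "__main__":
    # Assumption: the current directory is the main folder of the repository.
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Note that this only checks whether the files exist in the folder, not whether they have been committed. On case-insensitive file systems it also does not catch differences in capitalisation.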
