From a557842265cb7cefccade697d62bbcfcb119ccc9 Mon Sep 17 00:00:00 2001 From: Harriet Sands Date: Fri, 15 Dec 2023 09:50:19 +0000 Subject: [PATCH] Spellcheck and ignore vscode files --- .gitignore | 2 ++ 02-easy_to_read.Rmd | 8 +++---- 03-correct_clear_concise.Rmd | 4 ++-- 07-demonstrably_correct.Rmd | 8 +++---- 08-sensible_defaults.Rmd | 6 ++--- 10-data_structure.Rmd | 2 +- R/fix_projects.R | 44 ++++++++++++++++++------------------ README.md | 6 ++++- docs/R/fix_projects.R | 44 ++++++++++++++++++------------------ index.Rmd | 20 ++++++++-------- note_R_at_dhsc.Rmd | 20 ++++++++-------- 11 files changed, 85 insertions(+), 79 deletions(-) diff --git a/.gitignore b/.gitignore index c67c2e8..c76d803 100644 --- a/.gitignore +++ b/.gitignore @@ -31,3 +31,5 @@ rsconnect/ docs/*.html #readme rendered *README.html +#VSCode +.vscode/ diff --git a/02-easy_to_read.Rmd b/02-easy_to_read.Rmd index 62d84f9..620849c 100644 --- a/02-easy_to_read.Rmd +++ b/02-easy_to_read.Rmd @@ -15,7 +15,7 @@ Most languages have several available style guides, which define a set of conventions to produce clean and consistently formatted code. Your style guide will define things like: * How to use indentation and spacing -* Line length +* Line length * Naming conventions & formats * Comment & documentation use @@ -28,7 +28,7 @@ Please use this style; consistency will make it easier for colleagues to underst | Python | [PEP-8](https://www.python.org/dev/peps/pep-0008/) | ## Linters & Code Formatters {#linters} -[Linters](https://en.wikipedia.org/wiki/Lint_%28software%29) are tools that you can use to ensure that you are following a given style guide. Code Formatters will take your code, and format it so that it conforms to a stanard. +[Linters](https://en.wikipedia.org/wiki/Lint_%28software%29) are tools that you can use to ensure that you are following a given style guide. Code Formatters will take your code, and format it so that it conforms to a standard. 
| Language | DHSC Recommended Linter / Formatter | |----------|-------------------------------------------------------| @@ -42,7 +42,7 @@ a comment and make life easier for other readers. This includes your future self Find a balance: avoid meaningless names like `obj` or `foo`; but don't put an entire sentence in a variable name. -Generally, variable names should be nouns and function names should be verbs. You +Generally, variable names should be nouns and function names should be verbs. You Use single-letter variables only where the use or meaning is clear - such as an iterator for a loop, or where the letter represents a well-known mathematical property (think: $e = mc^2$). @@ -52,7 +52,7 @@ If you find yourself attempting to cram data into variable names (e.g. model_201 ## Avoid Overlaps {#dont_overlap} When naming things be wary of overlapping with other meanings. In one context, using $e$ for energy or $i$ for an iterator might be sensible, but in another context might be confused with $e$ for exponent and $i$ for imaginary as in: $e^{iθ} = cos(θ) + i sin(θ)$. -Be conscious of overlapping names with things which are parts of the language, or popular functions. +Be conscious of overlapping names with things which are parts of the language, or popular functions. For example in Python, you probably want to avoid common abbreviated library names (`np` or `pd`), or in R be careful about overwriting things like `c` which is the name of the function used to make a vector. ## Name Formats {#name_formats} diff --git a/03-correct_clear_concise.Rmd b/03-correct_clear_concise.Rmd index 239a129..0ff92f4 100644 --- a/03-correct_clear_concise.Rmd +++ b/03-correct_clear_concise.Rmd @@ -29,10 +29,10 @@ You may find that you have produced code which takes some time to run. If you expect to run it many times, _then_ its time to think about how you could make things faster. But don't fall into the trap of optimising before you need to. 
Ask yourself how much time you are going to save, if it’s a couple of mins, but the optimising takes you several days, is it worth it? -For most languages there are [profiling](https://en.wikipedia.org/wiki/Profiling_%28computer_programming%29) tools you can use to understand resource usage when you need to. +For most languages there are [profiling](https://en.wikipedia.org/wiki/Profiling_%28computer_programming%29) tools you can use to understand resource usage when you need to. ## Concise {#ccfc_concise} Keeping the amount of code you use to achieve a goal at a minimum can often be a good thing. There is less code to go wrong or debug, less to explain, style and document. But, remember that concision is less important than correctness, clarity and speed. -Don't make it shorter than it needs to be, and think of the tradeoff with clarity and flexibility. +Don't make it shorter than it needs to be, and think of the trade-off with clarity and flexibility. diff --git a/07-demonstrably_correct.Rmd b/07-demonstrably_correct.Rmd index 377f5ea..8289dbb 100644 --- a/07-demonstrably_correct.Rmd +++ b/07-demonstrably_correct.Rmd @@ -3,17 +3,17 @@ (ref:demonstrablycorrect-intro) **You Must** - (ref:demonstrablycorrect-must) - + **You Should** - (ref:demonstrablycorrect-should) **You Could** - (ref:demonstrablycorrect-could) - + |Related Areas: | [Version Control](#version_control)
[Be Reproducible](#reproducible) | |--------------- |------------------------------------------------------------| ## Quality Assurance Applies to Code {#qa_code_too} -Just because you have written code rather then making a spreadsheet doesnt mean your analysis is correct. -Code is not exempt from Quality Assurance processes. +Just because you have written code rather than making a spreadsheet doesn't mean your analysis is correct. +Code is not exempt from Quality Assurance processes. As with any other analysis you need to record evidence that your code is: * doing the right thing diff --git a/08-sensible_defaults.Rmd b/08-sensible_defaults.Rmd index 4e794d9..b231cf1 100644 --- a/08-sensible_defaults.Rmd +++ b/08-sensible_defaults.Rmd @@ -14,11 +14,11 @@ ## General Defaults {#general_defaults} * Use 'Tidy' data. See the [tidy data principle](#data_structure)! * Use [git](#git) for version control of code (rather than SVN, Mercurial etc). See the [version control principle](#version_control) -* Use preexisting packages and code before writing your own. +* Use pre-existing packages and code before writing your own. * Use popular, mature and well supported packages in preference to up & coming ones. ## Language Specific Defaults: -In addition to the general principles, please see the language specifc pages: +In addition to the general principles, please see the language specific pages: * [Python at DHSC](#py_at_dhsc) -* [R at DHSC](#r_at_dhsc) \ No newline at end of file +* [R at DHSC](#r_at_dhsc) diff --git a/10-data_structure.Rmd b/10-data_structure.Rmd index 27e8202..eb60ae5 100644 --- a/10-data_structure.Rmd +++ b/10-data_structure.Rmd @@ -28,7 +28,7 @@ Use tidy data structures as part of your work. You should attempt to convert inc Any data that is output that may be used in other projects should be in tidy format as well as any other required formats. 
## Data Types and Structures {#data_types} -Data *types* are the basic units which your language uses to store data, things like integers, doubles, strings and logical data. Typically you are working with data frames, arrays, matricies or lists. These hold multiple items of data in a data *structure*. +Data *types* are the basic units which your language uses to store data, things like integers, doubles, strings and logical data. Typically you are working with data frames, arrays, matrices or lists. These hold multiple items of data in a data *structure*. Different types and structures are used for different things, and have different capabilities. To be effective, know about the data types and structures available to you and use the right ones for the job! diff --git a/R/fix_projects.R b/R/fix_projects.R index c528c4d..feacaa7 100644 --- a/R/fix_projects.R +++ b/R/fix_projects.R @@ -1,6 +1,6 @@ # Creates Rprofile and Renvironment files which fix projects on DHSC Rstudio install -# RStudio startup +# RStudio startup # https://rviews.rstudio.com/2017/04/19/r-for-enterprise-understanding-r-s-startup/ # N.B. To get a list of all the environment variables which have been set you can run: @@ -14,18 +14,18 @@ # 2) set environment variables using R_HOME/etc/Reviron # __Environment Variables ---- -# 3) set _site_ variables using file pointed to by R_ENVIRON, +# 3) set _site_ variables using file pointed to by R_ENVIRON, # or R_HOME/etc/Reviron.site -# 4) set _user_ variables using e pointed to by R_ENVIRON_USER +# 4) set _user_ variables using e pointed to by R_ENVIRON_USER # or .Renviron in current dir # or .Renviron in RHOME # __RProfile---- -# 5) set site profile using file pointed to by R_PROFILE, +# 5) set site profile using file pointed to by R_PROFILE, # or R_HOME/etc/Rprofile.site -# 6) set site profile using file pointed to by R_PROFILE_USER, +# 6) set site profile using file pointed to by R_PROFILE_USER, # or R_HOME/etc/Rprofile.site @@ -49,7 +49,7 @@ # variable. 
That is, we can use a .Renviron read at stage 4), to change which # .Rprofile is read at stage 6). -# Adding an Rprofile to RUser if it doesnt exist --------------------------- +# Adding an Rprofile to RUser if it doesn't exist --------------------------- # check to see if .RProfile file exists and if not create profile_path <- file.path(Sys.getenv("R_USER"), ".RProfile") @@ -57,37 +57,37 @@ profile_path <- file.path(Sys.getenv("R_USER"), ".RProfile") if (!file.exists(profile_path)) { profile_text <- ' # Things you might want to change - + # options(papersize="a4") # options(editor="notepad") # options(pager="internal") - + # set the default help type # options(help_type="text") options(help_type="html") - + # set a site library # .Library.site <- file.path(chartr("\\", "/", R.home()), "site-library") - + # set a CRAN mirror # local({r <- getOption("repos") # r["CRAN"] <- "http://my.local.cran" # options(repos=r)}) - + # Give a fortune cookie, but only to interactive sessions # (This would need the fortunes package to be installed.) - # if (interactive()) + # if (interactive()) # fortunes::fortune() ' cat(profile_text, file = profile_path) - + print('New Rprofile Written') - + } else { - + cat(paste0('\nRprofile already present at: \n', profile_path, '\nNo change made\n'), file = '') - + } @@ -98,14 +98,14 @@ env_path <- file.path(Sys.getenv("R_USER"), ".Renviron") if (!file.exists(env_path)) { env_text <- 'R_PROFILE="${R_USER}/.RProfile"' - + cat(env_text, file = env_path) - + print('New Renviron Written') - + } else { - + cat(paste0('\nRenviron already present at: \n',env_path,'\nNo change made'), file = '') - -} \ No newline at end of file + +} diff --git a/README.md b/README.md index 2f45d4a..8b93d75 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,15 @@ # DHSC Coding Principles + See the principles *[HERE](https://datas-dhsc.github.io/coding_principles_book/)*. ## What is this? 
+ This repository contains a set of R markdown documents which, when rendered produce a bookdown site. -The site contains a set of 10 principles, targeted at the DHSC analytical community, with the aim of raising coding standards across the department. These are based on the [MOJ prinicples](https://github.com/moj-analytical-services/our-coding-standards). +The site contains a set of 10 principles, targeted at the DHSC analytical community, with the aim of raising coding standards across the department. These are based on the [MOJ principles](https://github.com/moj-analytical-services/our-coding-standards). ## Publishing Process + The code contains [build](_build.sh) and [deploy](_deploy.sh) scripts which are run by [travis](https://travis-ci.org/mattm-dhsc/coding_principles_book). These scripts render the book, and push the rendered files to a `gh-pages` branch which is then displayed using [github pages](https://datas-dhsc.github.io/coding_principles_book/). @@ -16,4 +19,5 @@ The scripts adapted from the example scripts provided in the [bookdown example]( This process means that the book hosted on github pages is kept in sync with the code which produces it without manual intervention. ## Ownership + The DHSC Coding Principles are maintained by the Data Science Hub, who sit within the Office of the Chief Analyst Directorate. diff --git a/docs/R/fix_projects.R b/docs/R/fix_projects.R index c528c4d..feacaa7 100644 --- a/docs/R/fix_projects.R +++ b/docs/R/fix_projects.R @@ -1,6 +1,6 @@ # Creates Rprofile and Renvironment files which fix projects on DHSC Rstudio install -# RStudio startup +# RStudio startup # https://rviews.rstudio.com/2017/04/19/r-for-enterprise-understanding-r-s-startup/ # N.B. 
To get a list of all the environment variables which have been set you can run: @@ -14,18 +14,18 @@ # 2) set environment variables using R_HOME/etc/Reviron # __Environment Variables ---- -# 3) set _site_ variables using file pointed to by R_ENVIRON, +# 3) set _site_ variables using file pointed to by R_ENVIRON, # or R_HOME/etc/Reviron.site -# 4) set _user_ variables using e pointed to by R_ENVIRON_USER +# 4) set _user_ variables using e pointed to by R_ENVIRON_USER # or .Renviron in current dir # or .Renviron in RHOME # __RProfile---- -# 5) set site profile using file pointed to by R_PROFILE, +# 5) set site profile using file pointed to by R_PROFILE, # or R_HOME/etc/Rprofile.site -# 6) set site profile using file pointed to by R_PROFILE_USER, +# 6) set site profile using file pointed to by R_PROFILE_USER, # or R_HOME/etc/Rprofile.site @@ -49,7 +49,7 @@ # variable. That is, we can use a .Renviron read at stage 4), to change which # .Rprofile is read at stage 6). -# Adding an Rprofile to RUser if it doesnt exist --------------------------- +# Adding an Rprofile to RUser if it doesn't exist --------------------------- # check to see if .RProfile file exists and if not create profile_path <- file.path(Sys.getenv("R_USER"), ".RProfile") @@ -57,37 +57,37 @@ profile_path <- file.path(Sys.getenv("R_USER"), ".RProfile") if (!file.exists(profile_path)) { profile_text <- ' # Things you might want to change - + # options(papersize="a4") # options(editor="notepad") # options(pager="internal") - + # set the default help type # options(help_type="text") options(help_type="html") - + # set a site library # .Library.site <- file.path(chartr("\\", "/", R.home()), "site-library") - + # set a CRAN mirror # local({r <- getOption("repos") # r["CRAN"] <- "http://my.local.cran" # options(repos=r)}) - + # Give a fortune cookie, but only to interactive sessions # (This would need the fortunes package to be installed.) 
- # if (interactive()) + # if (interactive()) # fortunes::fortune() ' cat(profile_text, file = profile_path) - + print('New Rprofile Written') - + } else { - + cat(paste0('\nRprofile already present at: \n', profile_path, '\nNo change made\n'), file = '') - + } @@ -98,14 +98,14 @@ env_path <- file.path(Sys.getenv("R_USER"), ".Renviron") if (!file.exists(env_path)) { env_text <- 'R_PROFILE="${R_USER}/.RProfile"' - + cat(env_text, file = env_path) - + print('New Renviron Written') - + } else { - + cat(paste0('\nRenviron already present at: \n',env_path,'\nNo change made'), file = '') - -} \ No newline at end of file + +} diff --git a/index.Rmd b/index.Rmd index 508648d..e0f4088 100644 --- a/index.Rmd +++ b/index.Rmd @@ -5,23 +5,23 @@ site: bookdown::bookdown_site documentclass: book output: #bookdown::pdf_book: default - bookdown::gitbook: + bookdown::gitbook: config: - download: null + download: null toc: collapse: section scroll_highlight: yes after: |
  • Coding Principles Book on GitHub
  • mathjax: default - + --- # Disclaimer {-} This is an unapproved draft and does not represent the views of the DHSC. -## Aknowledgements {-} +## Acknowledgements {-} These principles were inspired by the MOJ [Coding Principles and Standards](https://moj-analytical-services.github.io/our-coding-standards/web/) and developed further by a group at DHSC to produce a version targeted at DHSC Analysts. @@ -50,7 +50,7 @@ The principles are designed to be achievable by all DHSC analysts producing code (ref:versioncontrol-should) Use standard tools ([Git](#git) & [GitHub](#github)) to help you version control code. -(ref:versioncontrol-could) With your team, agree and design a version control [workflow](#workflow_vc). Use ([Git](#git) & [GitHub](#github)) collaboratively and [effectively](#effective_vc). +(ref:versioncontrol-could) With your team, agree and design a version control [workflow](#workflow_vc). Use ([Git](#git) & [GitHub](#github)) collaboratively and [effectively](#effective_vc). @@ -65,7 +65,7 @@ The principles are designed to be achievable by all DHSC analysts producing code (ref:easytoread-must) Follow the [DHSC adopted style](#style_guides) for your language, use [meaningful names](#meaningful_names) and avoid [overlaps](#dont_overlap). -(ref:easytoread-should) Use a [linter](https://en.wikipedia.org/wiki/Lint_%28software%29) or code formatter to ensure that your code conforms to the style guide. +(ref:easytoread-should) Use a [linter](https://en.wikipedia.org/wiki/Lint_%28software%29) or code formatter to ensure that your code conforms to the style guide. (ref:easytoread-could) Review your code with colleagues to make ensure your names and style promote understanding. @@ -80,7 +80,7 @@ The principles are designed to be achievable by all DHSC analysts producing code (ref:correctclearconcise-intro) Write code with your colleagues priorities in mind. 
They need your code to work correctly, and they will have to understand and check it before they can benefit from it being fast or concise. -(ref:correctclearconcise-must) Ensure that your code is [correct](#ccfc_correct) and that it is [clear](#ccfc_clear) how it functions. +(ref:correctclearconcise-must) Ensure that your code is [correct](#ccfc_correct) and that it is [clear](#ccfc_clear) how it functions. (ref:correctclearconcise-should) Make your code [fast](#ccfc_fast) and [concise](#ccfc_concise), where possible _without_ sacrificing correctness, clarity or excessive resource! Record choices made to achieve this balance. @@ -101,7 +101,7 @@ The principles are designed to be achievable by all DHSC analysts producing code (ref:flexiblecode-should) Think about, and document the way your code might break with different inputs. Include [input validation](#input_validation) to catch mistakes earlier in your code and make it easier to repurpose. -(ref:flexiblecode-could) Implement and test thorough [error handling](#error_handling). Consider writing and sharing general purposes 'tool' code, especially if you solve a problem someone else might have. +(ref:flexiblecode-could) Implement and test thorough [error handling](#error_handling). Consider writing and sharing general purposes 'tool' code, especially if you solve a problem someone else might have. @@ -114,7 +114,7 @@ The principles are designed to be achievable by all DHSC analysts producing code (ref:comments-intro) Comment your code so that it's function is clear. Well targeted comments make it less likely that avoidable mistakes are made when using or updating your code. Colleagues and your future self will thank you. -(ref:comments-must) Write and maintain accurate comments as you code. +(ref:comments-must) Write and maintain accurate comments as you code. (ref:comments-should) Think carefully about _why_ you are leaving comments, what to capture, and what belongs elsewhere (in documentation). 
@@ -129,7 +129,7 @@ The principles are designed to be achievable by all DHSC analysts producing code -(ref:documentation-intro) Maintain documentation for your code. Code is _not_ self documenting and code without documentation won't be useful later. You need to capture higher level context such as what the code is for, why it is written a certain way and what the inputs and outputs are. +(ref:documentation-intro) Maintain documentation for your code. Code is _not_ self documenting and code without documentation won't be useful later. You need to capture higher level context such as what the code is for, why it is written a certain way and what the inputs and outputs are. (ref:documentation-must) Produce documentation in line with [Aqua book guidance](#aqua). diff --git a/note_R_at_dhsc.Rmd b/note_R_at_dhsc.Rmd index 2d9ac69..fba5bf5 100644 --- a/note_R_at_dhsc.Rmd +++ b/note_R_at_dhsc.Rmd @@ -3,11 +3,11 @@ The following are the DHSC [sensible defaults](#sensible_defaults) for R: ## R Version & IDE -The dominant IDE for R is Rstudio, which comes packaged with R. +The dominant IDE for R is Rstudio, which comes packaged with R. For a new project you should use the latest version of Rstudio available from the software portal. ## General -Default to packages from the [Tidyverse](http://tidyverse.org/).These have been carefully designed to work together effectively as part of a modern data analysis workflow. More info can be found here: [R for Data Science by Hadley Wickham](http://r4ds.had.co.nz). +Default to packages from the [Tidyverse](http://tidyverse.org/). These have been carefully designed to work together effectively as part of a modern data analysis workflow. More info can be found here: [R for Data Science by Hadley Wickham](http://r4ds.had.co.nz). For example: @@ -15,13 +15,13 @@ For example: * Use ggplot2 rather than base graphics * Use the pipe `%>%` rather than nesting function calls. (...but not always e.g. 
see [here](https://twitter.com/hadleywickham/status/603883121197514752)). * Prefer `purrr` to the `apply` family of functions. See [here](http://r4ds.had.co.nz/iteration.html#the-map-functions) - + ## Packages {#r_default_packages} -Recommended Packages: +Recommended Packages: -* Data Manupulation and Tidying +* Data Manipulation and Tidying * [dplyr](https://dplyr.tidyverse.org/) * [tidyr](https://tidyr.tidyverse.org/) * Working with messy spreadsheets @@ -46,7 +46,7 @@ Recommended Packages: * Machine Learning * [tensorflow](https://tensorflow.rstudio.com/) * [h2o](https://github.com/h2oai/h2o-3) - + ## Project Workflow {#r_projects} Always work in a project. See the guide to [Using Projects](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects). @@ -54,8 +54,8 @@ Always work in a project. See the guide to [Using Projects](https://support.rstu Projects functionality is broken in DHSC's packaged version of Rstudio - see the fix [here](R/fix_projects.R) ## Packaging Your Code {#r_package} -Packages are the fundamental unit of reproducible R code. -Therefore, if possible, build an R Package to share and document your code. +Packages are the fundamental unit of reproducible R code. +Therefore, if possible, build an R Package to share and document your code. Hadley's book on [R Packages](http://r-pkgs.had.co.nz/) is an effective guide on how to produce a package. @@ -85,7 +85,7 @@ Simply start your script with: library(checkpoint) checkpoint(snapshotDate = "2015-01-15", - checkpointLocation = getwd()) + checkpointLocation = getwd()) ``` This will download and fetch all the packages as they existed on the given date and install them to a library on your home drive. @@ -105,5 +105,5 @@ Effective error handling in R requires understanding the _conditions_ system. 
Th If you are iterating over many inputs, it is recommended that you use the `safely()` family of functions from `purrr` to create versions which return errors within a list for handling at a later stage. ## Unit Testing {#r_tests} -Use the `testthat` package for performing unit tests. +Use the `testthat` package for performing unit tests. For details see the ['tests' chapter of R Packages](https://r-pkgs.org/tests.html).