2 changes: 1 addition & 1 deletion .github/workflows/publish-javadoc.yml
@@ -21,5 +21,5 @@ jobs:
GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
javadoc-branch: javadoc
java-version: 21
-target-folder: javadoc/1.5
+target-folder: javadoc/1.6
project: maven
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -1,6 +1,6 @@
cff-version: 1.2.0
title: Sugar Removal Utility (SRU)
-version: 1.5.0.0
+version: 1.6.0.0
message: "If you use this software, please cite it as below and also cite the accompanying scientific publication referenced below."
type: software
authors:
@@ -17,7 +17,7 @@ authors:
given-names: "Maria"
orcid: "https://orcid.org/0000-0001-9359-7149"
doi: "10.5281/zenodo.7082113"
-date-released: 2025-03-19
+date-released: 2025-10-01
url: "https://github.com/JonasSchaub/SugarRemoval"
license: MIT
references:
19 changes: 11 additions & 8 deletions README.md
@@ -33,18 +33,20 @@ removal of circular and linear sugars from molecular structures, is described in
C. et al. Too sweet: cheminformatics for deglycosylation in natural products. J Cheminform 12, 67 (2020)](https://doi.org/10.1186/s13321-020-00467-y). There,
you can find all necessary details about the algorithm and its various configuration options. We also published a
[follow-up article](https://doi.org/10.3390/biom11040486) where we used the SRU to analyse sugar moieties in the
-Collection of Open Natural products (COCONUT) database.
+Collection of Open Natural products (COCONUT) database. Recently, we also developed an extension called Sugar Detection
+Utility (SDU) which is described here, for now: [https://github.com/cdk/cdk/pull/1225](https://github.com/cdk/cdk/pull/1225).
* This repository *used to* host the SRU source code, but it has now moved to the
<a href="https://github.com/cdk/cdk">Chemistry Development Kit (CDK)</a> Java library for cheminformatics. If you want to
use the SRU as a Java library, you now need to use the CDK version 2.10 or higher. Information on how to install and use
the CDK can be found in the GitHub repository linked above. You can then use the SRU via CDK's
-[SugarRemovalUtility class](https://github.com/cdk/cdk/blob/main/misc/extra/src/main/java/org/openscience/cdk/tools/SugarRemovalUtility.java).
+[SugarRemovalUtility class](https://github.com/cdk/cdk/blob/main/misc/extra/src/main/java/org/openscience/cdk/tools/SugarRemovalUtility.java)
+or the [SugarDetectionUtility extension](https://github.com/cdk/cdk/blob/main/misc/extra/src/main/java/org/openscience/cdk/tools/SugarDetectionUtility.java).
* This repository now only hosts the SRU command-line application and its source code and it serves as a place for
documentation about the algorithm.
* The SRU's functionalities can also be used in other software tools:
* The SRU web application is available at [https://sugar.naturalproducts.net](https://sugar.naturalproducts.net) and its source code
can be found [here](https://github.com/mSorok/SugarRemovalWeb).
-* The Sugar Removal Utility is also available in the open Java rich client application MORTAR ('MOlecule fRagmenTation
+* The Sugar Removal/Detection Utility is also available in the open Java rich client application MORTAR ('MOlecule fRagmenTation
fRamework') where <i>in silico</i> molecule fragmentation can be easily conducted on a given data set and the results
visualised ([MORTAR GitHub repository](https://github.com/FelixBaensch/MORTAR)
| [MORTAR article](https://doi.org/10.1186/s13321-022-00674-9))
@@ -59,8 +61,8 @@ moiety detection and removal using the SRU.
### Sources
The sources available in <i>/src/main/java/de/unijena/cheminf/deglycosylation/</i> belong to the SRU command-line
application. It makes the various settings for fine-tuning the sugar detection and removal process available through
-command-line arguments. But using the CDK <i>SugarRemovalUtility</i> class directly in your own software project offers some
-additional configuration options and functionalities:
+command-line arguments. But using the CDK <i>SugarRemovalUtility / SugarDetectionUtility</i> classes directly in your
+own software project offers some additional configuration options and functionalities:
* Adding and removing circular and linear sugar patterns for the initial detection steps
* Sugar detection without removal
* Detecting only the number of sugar moieties of a molecule
@@ -69,6 +71,7 @@ The class <i>SugarRemovalUtilityTest</i> can be found in the directory
<i>/src/test/java/de/unijena/cheminf/deglycosylation/</i>. It is a JUnit test class that tests the performance of the
Sugar Removal Utility on multiple specific molecular structures of natural products hand-picked from public databases
(see article linked above). Code examples of how to use and configure the <i>SugarRemovalUtility</i> class can be found here.
+There is also an analogous test class for the <i>SugarDetectionUtility</i>.

### SugarRemovalUtility CMD App
The sub-folder ["SugarRemovalUtility CMD App"](https://github.com/JonasSchaub/SugarRemoval/tree/main/SugarRemovalUtility%20CMD%20App)
@@ -93,12 +96,12 @@ As stated above, the Sugar Removal Utility is now part of the
install the SRU externally, you can use it via CDK's SugarRemovalUtility class. If not, please follow the installation
description in the CDK repository linked above.
<br>
-The Sugar Removal Utility web applcation in *this* repository is hosted as a package/artifact on the sonatype maven
+The Sugar Removal Utility web application in *this* repository is hosted as a package/artifact on the sonatype maven
central repository. See the [artifact page](https://central.sonatype.com/artifact/io.github.jonasschaub/sru/) for installation guidelines using build tools like maven or gradle.
To install it via its JAR archive, you can get it from the [releases](https://github.com/JonasSchaub/SugarRemoval/releases).
Note that other dependencies will need to be installed via JAR archives as well this way.

-### Command line application JAR
+### Command line application JAR
The command-line application JAR has to be downloaded. After that, it can be executed from the command-line
as described in the usage instructions. Java version 17 or higher has to be installed on your machine.

@@ -117,7 +120,7 @@ care of installing all dependencies.
* Apache License, version 2.0

**Managed by Maven:**
-* Chemistry Development Kit (CDK) version 2.10
+* Chemistry Development Kit (CDK) version 2.12-SNAPSHOT
* [Chemistry Development Kit on GitHub](https://cdk.github.io/)
* License: GNU Lesser General Public License 2.1
* JUnit version 5.10.0
Binary file not shown.
42 changes: 31 additions & 11 deletions SugarRemovalUtility CMD App/Usage instructions.txt
@@ -1,20 +1,24 @@
---------------------------------------------------------------------------------------------------
-Sugar Removal Utility Command Line Application Usage instructions v. 1.5
+Sugar Removal Utility Command Line Application Usage instructions v. 1.6
---------------------------------------------------------------------------------------------------

This application can be used to remove sugar moieties from molecules in a given data set, according
to "Schaub, J., Zielesny, A., Steinbeck, C., Sorokina, M. Too sweet: cheminformatics for deglycosylation in natural
-products. J Cheminform 12, 67 (2020). https://doi.org/10.1186/s13321-020-00467-y".
+products. J Cheminform 12, 67 (2020). https://doi.org/10.1186/s13321-020-00467-y". Additionally, it includes the
+"Sugar Detection Utility" extension (https://github.com/cdk/cdk/pull/1225) to extract sugar moieties correctly.

-Accepted input formats: MDL Molfile, MDL Structure data file (SDF), and SMILES file (of
-format: [SMILES string][space][name] in each line, see example file)
+Accepted input formats: MDL Molfile, MDL Structure data file (SDF), and SMILES file
+(determined by file extension; .sdf and .mol will be recognized as SDF and Molfile, respectively; .smi, .smiles, .csv,
+.tsv, and .txt will be interpreted as SMILES file).
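
The extension-based format detection described in the new wording could look roughly like the Java sketch below. It is only an illustration of the rule as stated in the usage instructions: the class name `InputFormatSketch`, the enum, the case-insensitive comparison, and the `UNKNOWN` fallback are assumptions and are not taken from the application's source code.

```java
import java.util.Locale;

/** Hypothetical sketch; not the SRU command-line application's actual code. */
final class InputFormatSketch {

    /** Input kinds named in the usage instructions, plus an assumed fallback. */
    enum InputFormat { MOLFILE, SDF, SMILES, UNKNOWN }

    /** Maps a file name to an input format based on its extension only. */
    static InputFormat detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return InputFormat.UNKNOWN; // no extension present
        }
        // Assumption: extensions are compared case-insensitively.
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (extension) {
            case "mol":
                return InputFormat.MOLFILE;
            case "sdf":
                return InputFormat.SDF;
            case "smi":
            case "smiles":
            case "csv":
            case "tsv":
            case "txt":
                return InputFormat.SMILES;
            default:
                // Assumption: how the application treats other extensions is not documented here.
                return InputFormat.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(detect("smiles_test_file.txt")); // SMILES
        System.out.println(detect("molecules.sdf"));        // SDF
    }
}
```

Because the decision is made from the file name alone, a SMILES file saved with an `.sdf` extension would be read as SDF under this rule.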

-The output is a comma-separated value (CSV) text file containing:
+The output is a comma-separated value (CSV) text file containing (separated by semicolon ;):
- the respective number of each molecule in the input file
- detected identifiers of the molecules
-- SMILES strings of the original molcules
-- SMILES strings of the deglycosylated molecules
-- SMILES strings of the removed sugar moieties
+- SMILES strings of the original molecules
+- SMILES strings of the deglycosylated molecules (aglycone)
+- whether the molecule contained sugar moieties
+- SMILES strings of the removed sugar moieties (variable number of columns here depending on how many sugars were removed from
+the respective molecule)

The output file and a log file will be created in the same directory as the input file.

@@ -29,7 +33,8 @@ the JAR file. These are (-[short name] --[long name] <argument>):
java -jar SugarRemovalUtility-jar-with-dependencies.jar -i <filePath> -t <integer>
[-glyBond <boolean>] [-remTerm <boolean>] [-presMode <integer>] [-presThres <integer>]
[-oxyAtoms <boolean>] [-oxyAtomsThres <number>] [-linSugInRings <boolean>] [-linSugMinSize <integer>]
-[-linSugMaxSize <integer>][-linAcSug <boolean>] [-circSugSpiro <boolean>] [-circSugKetoGroups <boolean>]
+[-linSugMaxSize <integer>] [-linAcSug <boolean>] [-circSugSpiro <boolean>] [-circSugKetoGroups <boolean>]
+[-markR <boolean>] [-postProcSug <boolean>] [-limitPostProcBySize <boolean>]

- Option -h --help: Print usage and help information regarding the required command-line arguments and
options. If this option is used, the application is exited afterwards.
@@ -57,7 +62,7 @@ java -jar SugarRemovalUtility-jar-with-dependencies.jar -i <filePath> -t <intege
- This option and its argument are always required

-> These two options must ALWAYS be given (except if --help or --version are used). The remaining
-arguments are optional. If they are not specified, they will be in their default value.
+arguments are optional. If they are not specified, they will be set to their default value.

- Option -glyBond --detectCircSugOnlyWithGlyBond <boolean>: Either "true" or "false", indicating
whether circular sugars should be detected (and removed) only if they have an O-glycosidic bond
@@ -129,14 +134,29 @@ java -jar SugarRemovalUtility-jar-with-dependencies.jar -i <filePath> -t <intege
whether circular sugar-like moieties with keto groups should be detected.
- Any other value of this argument will be interpreted as "false"
- Default: "false"
+- Option -markR --markAttachmentPointsByR <boolean>: Either "true" or "false", indicating whether the attachment
+points of removed sugar moieties should be marked by R groups (pseudo atoms, * in the exported SMILES codes)
+in the deglycosylated core structure and on the extracted sugars.
+- Any other value of this argument will be interpreted as "false"
+- Default: "false"
+- Option -postProcSug --postProcessSugars <boolean>: Either true or false, indicating whether the extracted sugar
+moieties should be post-processed, i.e. bond splitting (O-glycosidic, ether, ester, peroxide) to separate the
+individual sugars, before being output.
+- Any other value of this argument will be interpreted as "false"
+- Default: "false"
+- Option -limitPostProcBySize --limitPostProcessingBySize <boolean>: Either true or false, indicating whether the
+post-processing of extracted sugar moieties should be limited to structures bigger than a defined size (see
+preservation mode (threshold)) to preserve smaller modifications.
+- Any other value of this argument will be interpreted as "false"
+- Default: "false"
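
The rule that any value other than "true" is interpreted as "false" is the same behaviour Java's standard `Boolean.parseBoolean` provides. The sketch below only illustrates that rule; whether the application actually parses these arguments with this method is an assumption, and the helper name `parseFlag` is hypothetical. Note that `Boolean.parseBoolean` also ignores case, which the usage instructions do not state for the application.

```java
/** Hypothetical sketch of lenient boolean argument parsing; not the application's actual code. */
final class BooleanArgumentSketch {

    /** Returns true only for "true" (ignoring case); every other value, including null, yields false. */
    static boolean parseFlag(String argument) {
        return Boolean.parseBoolean(argument);
    }

    public static void main(String[] args) {
        System.out.println(parseFlag("true"));  // true
        System.out.println(parseFlag("false")); // false
        System.out.println(parseFlag("yes"));   // false
        System.out.println(parseFlag("1"));     // false
    }
}
```

Under this rule a mistyped value such as `-markR "ture"` would silently fall back to `false` rather than raise an error, so it is worth double-checking the spelling of boolean arguments.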

Example usages:
- Removing circular AND linear sugars from all molecules in the given example file, using default settings
implicitly:
java -jar "SugarRemovalUtility-jar-with-dependencies.jar" -i "smiles_test_file.txt" -t "3"
- Removing circular AND linear sugars from all molecules in the given example file, using all default settings
explicitly:
-java -jar "SugarRemovalUtility-jar-with-dependencies.jar" -i "smiles_test_file.txt" -t "3" -glyBond "false" -remTerm "true" -presMode "1" -presThres "5" -oxyAtoms "true" -oxyAtomsThres "0.5" -linSugInRings "false" -linSugMinSize "4" -linSugMaxSize "7" -linAcSug "false" -circSugSpiro "false" -circSugKetoGroups "false"
+java -jar "SugarRemovalUtility-jar-with-dependencies.jar" -i "smiles_test_file.txt" -t "3" -glyBond "false" -remTerm "true" -presMode "1" -presThres "5" -oxyAtoms "true" -oxyAtomsThres "0.5" -linSugInRings "false" -linSugMinSize "4" -linSugMaxSize "7" -linAcSug "false" -circSugSpiro "false" -circSugKetoGroups "false" -markR "false" -postProcSug "false" -limitPostProcBySize "false"
- Printing usage and help information and the version of the application:
java -jar "SugarRemovalUtility-jar-with-dependencies.jar" -h -v
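
Since the output rows are semicolon-separated and end with a variable number of sugar columns, a consumer has to treat everything after the fixed columns as a list. The Java sketch below shows one way to read a single data row. It assumes the columns appear in the order listed in the usage instructions (number, identifier, original SMILES, aglycone SMILES, sugar flag, then sugars), keeps the sugar flag as raw text because its exact representation is not documented here, and leaves any header line to the caller. The class name `SruOutputRowSketch` and the example row are hypothetical and not actual application output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical reader for one data row of the output file; column order is an assumption. */
final class SruOutputRowSketch {

    final String number;
    final String identifier;
    final String originalSmiles;
    final String aglyconeSmiles;
    final String hadSugars; // kept as raw text; its exact format is not documented here
    final List<String> sugarSmiles;

    private SruOutputRowSketch(String[] fields, List<String> sugars) {
        this.number = fields[0];
        this.identifier = fields[1];
        this.originalSmiles = fields[2];
        this.aglyconeSmiles = fields[3];
        this.hadSugars = fields[4];
        this.sugarSmiles = Collections.unmodifiableList(sugars);
    }

    /** Splits a row on semicolons and collects all non-empty trailing columns as sugars. */
    static SruOutputRowSketch parse(String line) {
        String[] fields = line.split(";", -1); // -1 keeps trailing empty columns
        if (fields.length < 5) {
            throw new IllegalArgumentException(
                    "Expected at least 5 columns but found " + fields.length);
        }
        List<String> sugars = new ArrayList<>();
        for (int i = 5; i < fields.length; i++) {
            if (!fields[i].isEmpty()) {
                sugars.add(fields[i]);
            }
        }
        return new SruOutputRowSketch(fields, sugars);
    }

    public static void main(String[] args) {
        // Hand-written illustrative row, not real output of the application.
        String row = "1;example;OCC1OC(Oc2ccccc2)C(O)C(O)C1O;Oc1ccccc1;true;OCC1OC(O)C(O)C(O)C1O";
        SruOutputRowSketch parsed = parse(row);
        System.out.println(parsed.aglyconeSmiles + " | sugars: " + parsed.sugarSmiles.size());
    }
}
```

Splitting on `;` is safe for plain SMILES strings, which do not use that character, but identifiers containing semicolons would need a proper CSV parser instead.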

10 changes: 3 additions & 7 deletions pom.xml
@@ -30,7 +30,7 @@

<groupId>io.github.jonasschaub</groupId>
<artifactId>sru</artifactId>
-<version>1.5.0.0</version>
+<version>1.6.0.0</version>
<name>Sugar Removal Utility</name>
<packaging>jar</packaging>
<url>https://github.com/JonasSchaub/SugarRemoval</url>
@@ -46,7 +46,7 @@
<project.build.source>17</project.build.source>
<project.build.target>17</project.build.target>
<java.version>17</java.version>
-<cdk.version>2.10</cdk.version>
+<cdk.version>2.12-SNAPSHOT</cdk.version>
<junit.version>5.10.0</junit.version>
<hamcrest.version>2.2</hamcrest.version>
<spotless.version>2.40.0</spotless.version>
@@ -100,10 +100,6 @@
</developers>

<distributionManagement>
-<snapshotRepository>
-<id>ossrh</id>
-<url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
-</snapshotRepository>
<repository>
<id>ossrh</id>
<url>https://s01.oss.sonatype.org/service/local/staging/deploy/maven2/</url>
@@ -112,7 +108,7 @@
<repositories>
<repository>
<id>ossrh</id>
-<url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
+<url>https://central.sonatype.com/repository/maven-snapshots/</url>
</repository>
<repository>
<id>oss-sonatype</id>
2 changes: 1 addition & 1 deletion src/main/java/de/unijena/cheminf/deglycosylation/Main.java
@@ -33,7 +33,7 @@
* SugarRemovalUtilityCmdApplication class, calls its execute() function and measures the time it takes for execution.
*
* @author Jonas Schaub
-* @version 1.5
+* @version 1.6
*/
public class Main {
/**