Releases: dan2097/opsin

v2.0.0

10 May 18:20

MAJOR CHANGES

  • Requires Java 1.6 or higher

  • CML (Chemical Markup Language) is now returned as a String rather than a XOM Element

  • OPSIN now attempts to identify whether a chemical name is ambiguous. Names that appear ambiguous are returned with a status of WARNING, and the structure provided is one interpretation of the name

  • Added support for "alcohol esters" e.g. phenol acetate [meaning phenyl acetate]

  • Multiplied unlocanted substitution is now more intelligent e.g. all substituents must connect to same group, and degeneracy of atom environments is taken into account

  • The ester interpretation is now preferred in more cases where a name does not contain a space but the parent is methanoate/ethanoate/formate/acetate/carbamate

  • Inorganic oxides are now interpreted, yielding structures with [O-2] ions

  • Added more trivial names of simple molecules

  • Support for nitrolic acids

  • Fixed parsing issue where a directly substituted acetal was not interpretable

  • Fixed certain groups e.g. phenethyl, not having their suffix attached to a specific location

  • Corrected interpretation of xanthyl, and various trivial names that look systematic

  • Name to structure is now ~20% faster

  • Initialisation time reduced by a third

  • InChI generation is now ~20% faster

  • XML processing dependency changed from XOM to Woodstox

  • Significant internal refactoring

  • Utility functions designed for internal use are no longer on the public API

  • Various minor bug fixes

Internal XML Changes:

  • Groups lacking a labels attribute now have no locants (previously had ascending numeric locants)
  • Syntax for addGroup/addHeteroAtom/addBond attributes changed to be easier to parse and allow specification of whether the name is ambiguous if a locant is not provided

v1.6.0

10 May 18:20

  • Added API/command-line options to generate StdInchiKeys
  • Added support for the IUPAC recommended nomenclature for carbohydrate lactones
  • Added support for boronic acid pinacol esters
  • Added basic support for specifying chalcogen acid tautomer form e.g. thioacetic S-acid
  • Fused ring bridges are now numbered
  • Names with Endo/Exo/Syn/Anti stereochemistry can now be partially interpreted if warnRatherThanFailOnUninterpretableStereochemistry is used
  • The warnRatherThanFailOnUninterpretableStereochemistry option will now assign as much stereochemistry as OPSIN understands (All ignored stereochemistry terms are mentioned in the OpsinResult message)
  • Many minor nomenclature support improvements e.g. succinic imide; hexaldehyde; phenyldiazonium, organotrifluoroborates etc.
  • Added more trivial names that can be confused with systematic names e.g. Imidazolidinyl urea
  • Fixed StackOverflowError that could occur when processing molecules with over 5000 atoms
  • Many minor bug fixes
  • Minor vocabulary improvements
  • Minor speed improvements
  • NOTE: This is the last release to support Java 1.5

v1.5.0

10 May 18:21

  • Command line interface now accepts files to read and write to as arguments
  • Added option to allow interpretation of acids missing the word acid e.g. "acetic" (off by default)
  • Added option to treat uninterpretable stereochemistry as a warning rather than a failure (off by default)
  • Added support for nucleotide chains e.g. guanylyl(3'-5')uridine
  • Added support for parabens, azetidides, morpholides, piperazides, piperidides and pyrrolidides
  • Vocabulary improvements e.g. homo/beta amino acids
  • Many minor bug fixes e.g. fulminic acid correctly interpreted

v1.4.0

10 May 18:21

  • Added support for dialdoses, diketoses, ketoaldoses, alditols, aldonic acids, uronic acids, aldaric acids, glycosides and oligosaccharides, named systematically or from trivial stems, in cyclic or acyclic form
  • Added support for ketoses named using dehydro
  • Added support for anhydro
  • Added more trivial carbohydrate names
  • Added support for sn-glycerol
  • Improved heuristics for phospho substitution
  • Added hydrazido and anilate suffixes
  • Allowed more functional class nomenclature to apply to amino acids
  • Added support for inverting CAS names with substituted functional terms e.g. Acetaldehyde, O-methyloxime
  • Double substitution of a deoxy chiral centre now uses the CIP rules to decide which substituent replaced the hydroxy group
  • Unicode right arrows, superscripts and the soft hyphen are now recognised
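
The Unicode handling in the last item can be pictured as a normalisation pass over the input name. The following is a minimal sketch only, not OPSIN's actual code: the class name `NameNormalisationSketch`, the method `normalise`, and the choice of ASCII replacements ("->" for the arrow, plain digits for superscripts) are all assumptions made for illustration.

```java
// Illustrative sketch only: the ASCII replacements chosen here are assumptions,
// not necessarily what OPSIN does internally.
final class NameNormalisationSketch {

    /** Rewrites a few Unicode characters into ASCII equivalents before parsing. */
    static String normalise(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '\u00AD') {
                // U+00AD soft hyphen: an invisible line-break hint, so drop it
                continue;
            } else if (c == '\u2192') {
                // U+2192 rightwards arrow, as seen in linkage descriptors
                out.append("->");
            } else if (c == '\u00B9') {
                out.append('1'); // superscript one
            } else if (c == '\u00B2') {
                out.append('2'); // superscript two
            } else if (c == '\u00B3') {
                out.append('3'); // superscript three
            } else if (c == '\u2070') {
                out.append('0'); // superscript zero
            } else if (c >= '\u2074' && c <= '\u2079') {
                // U+2074..U+2079 are superscript four to nine, in order
                out.append((char) ('4' + (c - '\u2074')));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalise("guanylyl(3'\u21925')uridine"));
        System.out.println(normalise("meth\u00ADane"));
    }
}
```

A real implementation would likely need to keep superscripts distinguishable from ordinary digits in some contexts; this sketch ignores that.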

v1.3.0

10 May 18:22

  • Added option to output radicals as R groups (* in SMILES)
  • Added support for carbolactone/dicarboximide/lactam/lactim/lactone/olide/sultam/sultim/sultine/sultone suffixes
  • Resolved some cases of ambiguity in the grammar; handling of longer peptide names is improved
  • Allowed one (as in ketone) before yl e.g. indol-2-on-3-yl
  • Allowed primed locants to be used as unprimed locants in a bracket e.g. 2-(4'-methylphenyl)pyridine
  • Vocabulary improvements
  • SMILES writer will no longer reuse ring closures on the same atom
  • Fixed case where a name formed of many words that could be parsed ambiguously would cause OPSIN to run out of memory
  • NameToStructure.getInstance() no longer throws a checked exception
  • Many minor bug fixes

v1.2.0

10 May 18:22

  • OPSIN is now available from Maven Central
  • Basic support for cyclised carbohydrates e.g. alpha-D-glucopyranose
  • Basic support for systematic carbohydrate stems e.g. D-glycero-D-gluco-Heptose
  • Added heuristic for correcting esters with omitted spaces
  • Added support for xanthates/xanthic acid
  • Minor vocabulary improvements
  • Fixed a few minor bugs/limitations in the Cahn-Ingold-Prelog rules implementation and made it more memory efficient
  • Many minor improvements and bug fixes

v1.1.0

10 May 18:22

  • Significant improvements to fused ring numbering code, specifically 3/4/5/7/8 member rings are no longer only allowed in chains of rings
  • Added support for outputting to StdInChI
  • Small improvements to fused ring building code
  • Improvements to heuristics for disambiguating what group is being referred to by a locant
  • Lower case indicated hydrogen is now recognised
  • Improvements to parsing speed
  • Many minor improvements and bug fixes

v1.0.0

10 May 18:23

  • Added native isomeric SMILES output
  • Improved command-line interface. The desired format i.e. CML/SMILES/InChI as well as options such as allowing radicals can now all be specified via flags
  • Debugging is now performed using log4j rather than by passing a verbose flag
  • Added traditional locants to carboxylic acids and alkanes e.g. beta-hydroxybutyric acid
  • Added support for cis/trans indicating the relative stereochemistry of two substituents on rings and fused ring systems
  • Added support for stoichiometry ratios and mixture indicators
  • Added support for alpha/beta stereochemistry on steroids
  • Added support for the method for naming spiro systems described in the 1979 recommendations rule A-42
  • Added detailedFailureAnalysis option to detect the part of a chemical name that fails to parse
  • Added support for deoxy
  • Added open-chain saccharides
  • Improvements to CAS index name uninversion algorithm
  • Added support for isotopes, allowing deuterio/tritio
  • Added support for R/S stereochemistry indicated by a locant which is also used to indicate the point of substitution for a substituent
  • Many minor improvements and bug fixes
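
To illustrate what "uninverting" a CAS index name means, here is a deliberately simplified sketch. It is not OPSIN's algorithm, which handles many more cases; the class `CasUninversionSketch`, the method `uninvert`, and the rule that a segment ending in a hyphen is a substituent prefix while any other segment is a trailing functional term are assumptions for illustration.

```java
// Illustrative sketch only: a much-simplified take on CAS index name uninversion.
final class CasUninversionSketch {

    /**
     * Turns an inverted name such as "Benzene, 1-chloro-2-methyl-" into
     * "1-chloro-2-methylbenzene". Segments are assumed to be separated by ", ".
     */
    static String uninvert(String casName) {
        String[] parts = casName.split(", ");
        if (parts.length == 1) {
            return casName; // nothing to uninvert
        }
        String parent = parts[0];
        StringBuilder prefixes = new StringBuilder();
        StringBuilder functionalTerms = new StringBuilder();
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            if (part.endsWith("-")) {
                // Assumed to be a substituent prefix: strip the trailing hyphen
                // and place it directly in front of the parent.
                prefixes.append(part, 0, part.length() - 1);
            } else {
                // Assumed to be a functional term: keep it as a separate word.
                functionalTerms.append(' ').append(part);
            }
        }
        if (prefixes.length() > 0 && !parent.isEmpty()) {
            // The parent is no longer the first word, so lower-case its initial.
            parent = Character.toLowerCase(parent.charAt(0)) + parent.substring(1);
        }
        return prefixes + parent + functionalTerms;
    }

    public static void main(String[] args) {
        System.out.println(uninvert("Benzene, 1-chloro-2-methyl-"));
        System.out.println(uninvert("Acetaldehyde, O-methyloxime"));
    }
}
```

Locant lists such as "1,2-" contain a comma without a following space, so they are not split by this sketch.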

v0.9.0

10 May 18:23

  • Added transition metals/f-block elements and noble gases
  • Added support for specifying the charge or oxidation number on elements e.g. aluminium(3+), iron(II)
  • Calculations based off a van Arkel diagram are now used to determine whether functional bonds to metals should be treated as ionic or covalent
  • Improved support for prefix functional replacement e.g. hydrazono/amido/imido/hydrazido/nitrido/pseudohalides can now be used for functional replacement on appropriate acids
  • Ortho/meta/para handling improved - can now only apply to six membered rings
  • Added support for methylenedioxy
  • Added support for simple bridge prefixes e.g. methano as in 2,3-methanoindene
  • Added support for perfluoro/perchloro/perbromo/periodo
  • Generalised alkane support to allow alkanes of lengths up to 9999 to be described without enumeration
  • Updated dependency on JNI-InChI to 0.7, hence InChI 1.03 is now used.
  • Improved algorithm for assigning unlocanted hydro terms
  • Improved heuristic for determining the meaning of oxido
  • Improved charge balancing e.g. ionic substance of an implicit ratio 2:3 can now be handled rather than being represented as a net charged 1:1 mixture
  • Grammar is a bit more lenient of placement of stereochemistry and multipliers
  • Vocabulary improvements especially in the area of nucleosides and nucleotides
  • Esters of biochemical compounds e.g. triphosphates are now supported
  • Many minor improvements and bug fixes
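
The charge-balancing item above comes down to simple arithmetic: given a cation and an anion charge, find the smallest whole-number counts that sum to zero. The sketch below shows that arithmetic only. It is not OPSIN's implementation, and the class `ChargeBalanceSketch` and method `neutralRatio` are hypothetical names.

```java
// Illustrative sketch only: the arithmetic behind an implicit 2:3 ionic ratio.
final class ChargeBalanceSketch {

    /** Greatest common divisor by Euclid's algorithm. */
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    /**
     * Returns {cationCount, anionCount}, the smallest positive counts for which
     * cationCount * cationCharge + anionCount * anionCharge == 0.
     */
    static int[] neutralRatio(int cationCharge, int anionCharge) {
        if (cationCharge <= 0 || anionCharge >= 0) {
            throw new IllegalArgumentException(
                    "expected a positive cation charge and a negative anion charge");
        }
        int anionMagnitude = -anionCharge;
        int divisor = gcd(cationCharge, anionMagnitude);
        return new int[] { anionMagnitude / divisor, cationCharge / divisor };
    }

    public static void main(String[] args) {
        // e.g. a 3+ cation such as aluminium(3+) paired with a 2- anion
        int[] ratio = neutralRatio(3, -2);
        System.out.println(ratio[0] + ":" + ratio[1]); // prints "2:3"
    }
}
```

Dividing by the greatest common divisor keeps the ratio minimal, so a 4+/2- pair yields 1:2 rather than 2:4.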

v0.8.0

10 May 18:24

  • NameToStructureConfig can now be used to configure whether radicals e.g. ethyl are output or not.
  • Names like carbon tetrachloride are now supported
  • Glycol ethers e.g. ethylene glycol ethyl ether are now supported
  • Prefix functional replacement support now includes halogens e.g. chlorophosphate
  • Added support for epoxy/epithio/episeleno/epitelluro
  • Added support for hydrazides/fluorohydrins/chlorohydrins/bromohydrins/iodohydrins/cyanohydrins/acetals/ketals/hemiacetals/hemiketals/diketones/disulfones named using functional class nomenclature
  • Improvements to algorithm for assigning and finding atoms corresponding to element symbol locants
  • Added experimental right to left parser (ReverseParseRules.java)
  • Vocabulary improvements
  • Parsing is now even faster
  • Various bug fixes and name interpretation fixes