Great project! Is there any interest in either standardizing the placeholders that appear in the full text of several of the licenses, or annotating them with some metadata? Some of the licenses use square brackets, some use angle brackets, some use underscores, etc. Being able to identify the placeholders reliably is one way to automatically infer the license from the full text. Or maybe there is another approach that I am missing.
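
To make the idea concrete, here is a rough Python sketch of what detecting and normalizing the placeholders could look like. Everything in it is an assumption for illustration only: the regex, the function names (`find_placeholders`, `normalize_placeholders`), and the `{{placeholder}}` token are hypothetical and not part of this project.

```python
import re

# Hypothetical pattern covering the three placeholder styles mentioned above.
# This is an assumption, not a list of what the licenses actually contain.
PLACEHOLDER_RE = re.compile(
    r"\[[^\[\]\n]+\]"   # square brackets, e.g. [year]
    r"|<[^<>\n]+>"      # angle brackets, e.g. <year>
    r"|_{3,}"           # a run of three or more underscores
)


def find_placeholders(text):
    """Return every placeholder-looking span in the text, in order."""
    return [m.group(0) for m in PLACEHOLDER_RE.finditer(text)]


def normalize_placeholders(text, token="{{placeholder}}"):
    """Replace every placeholder-looking span with one canonical token."""
    return PLACEHOLDER_RE.sub(token, text)


if __name__ == "__main__":
    samples = [
        "Copyright (c) [year] [fullname]",
        "Copyright (c) <year> <name of author>",
        "Copyright (c) ____ ____",
    ]
    for sample in samples:
        print(find_placeholders(sample), "->", normalize_placeholders(sample))
```

Once the placeholders are normalized, the three sample lines become identical, so the remaining text could be compared against known license texts. One caveat: the angle-bracket pattern would also match things like `<http://example.com>` or an email address in angle brackets, which is part of why a standard syntax or explicit metadata would be more robust than guessing with a regex.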