URL-encoded characters in external JAR paths cause Label creation to fail
Problem Description
When processing runtime classpath JARs from external repositories (e.g., Maven dependencies hosted on private artifact servers), the JAR paths may contain URL-encoded characters such as %40 (encoded @ symbol). Creating a Bazel Label with these encoded characters fails validation because % is not a valid character in Bazel target names.
Example Case
A JAR path from an external repository:
bazel-out/darwin_arm64-fastbuild/bin/external/minimal_j11_deps/v1/https/username%40artifacts.example.net/repository/maven/javax/activation/javax.activation-api/1.2.0/javax.activation-api-1.2.0.jar
This path contains %40 (URL-encoded @) in the artifact server hostname, which decodes to username@artifacts.example.net.
Current Behavior
When creating a synthetic label for unknown JARs at lines 198-199 in JavaAspectsInfo.java:
targetLabel = Label.create(
    format("@_unknown_jar_//:%s", classJar.getRelativePath()));
The Label.create() method throws an IllegalArgumentException because:
- The % character is not in the allowed characters for Bazel target names (see TargetName.ALLOWED_META)
- Only these special characters are allowed: '+', '_', ',', '=', '-', '.', '@', '~', '#', ' ', '(', ')', '$', '!'
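The rejection can be illustrated with a small stand-alone check. This is only a sketch: TargetNameCheckSketch and firstInvalidIndex are hypothetical names, not the plugin's code, and it assumes the allowed punctuation is the set listed above (including a space) and that '/' is accepted as a path separator.

```java
/** Hypothetical stand-in for the target-name check; not the plugin's actual code. */
final class TargetNameCheckSketch {
    // Assumption: the allowed punctuation from the list above, including a space.
    private static final String ALLOWED_META = "+_,=-.@~# ()$!";

    /** Returns the index of the first rejected character, or -1 if the name passes. */
    static int firstInvalidIndex(String name) {
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            // Assumption: '/' is accepted as a path separator inside the name.
            if (!alnum && c != '/' && ALLOWED_META.indexOf(c) < 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened, illustrative versions of the path from the example case.
        String encoded = "https/username%40artifacts.example.net/repository/maven/a.jar";
        String decoded = "https/username@artifacts.example.net/repository/maven/a.jar";
        System.out.println("encoded: " + firstInvalidIndex(encoded));
        System.out.println("decoded: " + firstInvalidIndex(decoded));
    }
}
```

Under these assumptions the encoded path is rejected at the % character, while the decoded path passes because @ is in the allowed set.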
Expected Behavior
The path should be URL-decoded before being used in a Label:
- %40 → @
- %2F → /
- Other URL-encoded characters → their decoded equivalents
This allows the Label to be created successfully while maintaining proper indexing and lookup functionality.
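A minimal sketch of this decoding, using java.net.URLDecoder (the Charset overload needs Java 10 or later). PathDecodeSketch and decodePreservingPlus are hypothetical names. One caveat worth illustrating: URLDecoder follows form-encoding rules and turns a literal + into a space, so the sketch escapes + first. Whether that matters for real JAR paths is an assumption to verify.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating percent-decoding of a JAR path. */
final class PathDecodeSketch {

    /**
     * Decodes percent-escapes in the path. A literal '+' is escaped as %2B
     * beforehand so that URLDecoder does not convert it into a space.
     * Falls back to the original path if the escapes are malformed.
     */
    static String decodePreservingPlus(String path) {
        try {
            return URLDecoder.decode(path.replace("+", "%2B"), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            // Malformed escape such as "%zz" or a trailing '%': keep the path as-is.
            return path;
        }
    }

    public static void main(String[] args) {
        System.out.println(decodePreservingPlus("https/username%40artifacts.example.net/a.jar"));
        System.out.println(decodePreservingPlus("lib-1.0+build.jar"));
        System.out.println(decodePreservingPlus("bad%zzname.jar"));
    }
}
```

Escaping + before decoding is a design choice of this sketch, not part of the proposal; a plain URLDecoder.decode call would decode %40 and %2F the same way.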
Root Cause
External repository paths from Maven or other artifact sources may contain URL-encoded characters when:
- Artifact server hostnames contain special characters (e.g., user@server.com)
- Repository paths contain spaces or special characters
- Bazel's external repository mechanism preserves these encoded paths
Proposed Solution
Add a sanitizePathForLabel() method to URL-decode the path before Label creation:
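// Requires java.net.URLDecoder and java.io.UnsupportedEncodingException.
private static String sanitizePathForLabel(String path) {
  try {
    // Note: URLDecoder follows form-encoding rules, so it also turns '+' into a space.
    return URLDecoder.decode(path, "UTF-8");
  } catch (UnsupportedEncodingException e) {
    // Not expected for UTF-8; fall back to the original path.
    LOG.warn("Failed to URL-decode path '{}', using original path", path, e);
    return path;
  } catch (IllegalArgumentException e) {
    // Malformed escape sequence, e.g. a trailing '%'.
    LOG.warn("Path '{}' contains invalid URL encoding, using original path", path, e);
    return path;
  }
}
Then use it when creating the synthetic label: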
targetLabel = Label.create(
    format("@_unknown_jar_//:%s", sanitizePathForLabel(classJar.getRelativePath())));
Impact Analysis
Why this fix is safe:
- Index separation: JAR lookups use classJar.getRelativePath() (with URL encoding preserved) via the libraryByJdepsRootRelativePath map, which is separate from the Label
- Label as metadata: The Label is only used as an identifier in TargetKey for:
  - Dependency graph construction
  - Logging and debugging
  - Deduplication of dependencies
- No path operations: The synthetic label @_unknown_jar_// is never used for file system operations
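The separation between the lookup key and the label can be sketched as follows. This is an illustration under assumptions, not the plugin's code: libraryByRawPath is a hypothetical stand-in for the libraryByJdepsRootRelativePath map, and the label is modelled as a plain String rather than the plugin's Label type.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model: lookups keyed by the raw path, label built from the decoded path. */
final class IndexSeparationSketch {
    // Stand-in for the lookup map; keys keep the URL encoding exactly as reported.
    static final Map<String, String> libraryByRawPath = new HashMap<>();

    /** Builds the synthetic label text from the decoded path (identifier only). */
    static String syntheticLabel(String rawPath) {
        return "@_unknown_jar_//:" + URLDecoder.decode(rawPath, StandardCharsets.UTF_8);
    }

    /** Indexes the JAR under its raw path and returns the label used as metadata. */
    static String register(String rawPath) {
        String label = syntheticLabel(rawPath);
        libraryByRawPath.put(rawPath, label);
        return label;
    }

    public static void main(String[] args) {
        String raw = "external/user%40host/a.jar";
        System.out.println(register(raw));
        // The lookup still succeeds with the encoded path, independent of the label text.
        System.out.println(libraryByRawPath.containsKey(raw));
    }
}
```

Because the map key is never derived from the label, decoding the label text does not change which JAR a lookup finds.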
Benefits:
- ✅ Allows Label creation to succeed for URL-encoded paths
- ✅ Improves log readability (decoded paths are more human-readable)
- ✅ Maintains correct indexing functionality
- ✅ No impact on JAR lookup logic
Affected Code Locations
- Primary: JavaAspectsInfo.java lines 198-199
- Similar case: JavaAspectsInfo.java line 225 (same pattern in error handling branch)
Workarounds
Currently, users may encounter build failures when:
- Using private Maven repositories with special characters in hostnames
- Using external repositories with URL-encoded paths
No known workarounds exist without this fix.