Conversation
Summary of ChangesHello @logachev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several performance improvements to the BigQuery JDBC driver. The primary focus is on optimizing startup and query execution by fixing an issue with stateless query configuration, deferring the loading of the default project ID, and re-prioritizing private key parsing to reduce overhead for less common key types. These changes aim to enhance the overall efficiency and responsiveness of the driver. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces several performance improvements, including a fix for stateless queries, lazy loading for the default project ID to improve startup time, and reordering of private key parsing logic. However, a critical security vulnerability was identified in the new key parsing logic. The switch from using an input stream to Files.readAllBytes() for reading key files specified in the connection string introduces a risk of Denial of Service (DoS) through memory exhaustion if a large file is specified, and potentially exposes the system to NetNTLM hash leakage on Windows via UNC paths. Furthermore, this new key parsing logic could also affect authentication using file-based PKCS8 keys.
| if (isJson(keyBytes)) { | ||
| stream = new ByteArrayInputStream(keyBytes); | ||
| } else if (pvtKey != null) { | ||
| key = privateKeyFromPkcs8(pvtKey); | ||
| } else { | ||
| key = privateKeyFromP12Bytes(keyBytes, p12Password); | ||
| } |
There was a problem hiding this comment.
The updated logic for parsing private keys has a flaw. It doesn't correctly handle PKCS8 keys provided via a file path (OAuthPvtKeyPath or OAuthPvtKey as a path).
- If
OAuthPvtKeyPathis used for a PKCS8 key file,pvtKeywill benull, causing theelse if (pvtKey != null)branch to be skipped and the code will incorrectly try to parse the key as P12 format. - If
OAuthPvtKeycontains a path to a PKCS8 key file,privateKeyFromPkcs8(pvtKey)will be called with the file path string instead of its content, which will fail.
This breaks authentication for these scenarios. The logic should attempt to parse the loaded keyBytes in the desired order (JSON -> PKCS8 -> P12), regardless of whether the key came from a string or a file.
Here is a suggested fix that correctly sequences the parsing attempts on the keyBytes.
if (isJson(keyBytes)) {
stream = new ByteArrayInputStream(keyBytes);
} else if (keyBytes != null) {
// Try to parse as PKCS8 private key
key = privateKeyFromPkcs8(new String(keyBytes, java.nio.charset.StandardCharsets.UTF_8));
if (key == null) {
// If not PKCS8, try to parse as P12
key = privateKeyFromP12Bytes(keyBytes, p12Password);
}
}| } | ||
| } else if (isJson(pvtKey)) { | ||
| stream = new ByteArrayInputStream(pvtKey.getBytes()); | ||
| keyBytes = Files.readAllBytes(Paths.get(keyPath)); |
There was a problem hiding this comment.
The use of Files.readAllBytes() to read a key file specified in the connection string can lead to a Denial of Service (DoS) via an OutOfMemoryError. An attacker who can control the connection string (e.g., in applications that allow users to provide their own JDBC URL) can point the OAuthPvtKeyPath or OAuthPvtKey property to a very large file (such as /dev/zero or a large log file) on the server. This will cause the application to attempt to read the entire file into memory, potentially crashing the JVM. Additionally, on Windows systems, providing a UNC path (e.g., \\attacker-ip\share\file) could lead to an information disclosure by leaking the Windows user's NetNTLM hash when the driver attempts to access the remote share.
if (isFileExists(keyPath)) {
try (InputStream is = Files.newInputStream(Paths.get(keyPath))) {
// Limit the read size to a reasonable maximum for a key file (e.g., 1MB)
byte[] buffer = new byte[1024 * 1024];
int bytesRead = is.read(buffer);
if (bytesRead != -1) {
keyBytes = java.util.Arrays.copyOf(buffer, bytesRead);
}
}
}
Various small-ish changes that affect startup or query performance.
setQueryPreviewEnabledinstead ofsetDefaultJobCreationMode, so stateless queries were not used.BigQueryOptions.getDefaultProjectId()is visible in JFR profiling because it calls metadata server if env are not present. I think it is responsible for ~0.2s on startup even whenprojectIdis available, so added lazy defaults for that property.