CDAP-15864. Google Drive sink. Base code. (dvitiiuk)#2
src/main/java/io/cdap/plugin/google/common/GoogleDriveBaseConfig.java
| fileToWrite.setName(fileFromFolder.getFile().getName()); | ||
| fileToWrite.setParents(Collections.singletonList(folderId)); | ||
| ByteArrayContent fileContent = new ByteArrayContent(null, fileFromFolder.getContent()); | ||
| service.files().create(fileToWrite, fileContent).execute(); |
There was a problem hiding this comment.
Please check whether this produces exceptions that are readable by the end user, so that if something goes wrong here the user can diagnose the failure.
There was a problem hiding this comment.
IOException is allowed for GoogleDriveRecordWriter.write(...), so I don't think this needs to be replaced.
There was a problem hiding this comment.
You didn't understand the question. I asked whether it gives messages that the user can easily understand when something goes wrong, such as a full disk, a deleted directory, or something else.
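One way to address the readability concern while still throwing IOException from write(...) is to add context to the low-level exception before rethrowing it. The following is only a minimal sketch under stated assumptions: the class `UploadErrors` and its methods `describe` and `wrap` are hypothetical names that do not appear in this PR, and the message wording is illustrative.

```java
import java.io.IOException;

/** Hypothetical helper (not part of the PR): adds user-facing context to upload failures. */
public class UploadErrors {

  /** Builds a message naming the file and the target folder, followed by the underlying reason. */
  public static String describe(String fileName, String folderId, IOException cause) {
    // Fall back to the exception class name when the original exception has no message.
    String reason = cause.getMessage() == null ? cause.getClass().getSimpleName() : cause.getMessage();
    return String.format("Failed to write file '%s' to Google Drive folder '%s': %s",
                         fileName, folderId, reason);
  }

  /** Wraps the original exception so its stack trace is kept while the message gains context. */
  public static IOException wrap(String fileName, String folderId, IOException cause) {
    return new IOException(describe(fileName, folderId, cause), cause);
  }

  public static void main(String[] args) {
    IOException lowLevel = new IOException("connection reset");
    System.out.println(wrap("report.csv", "folder-123", lowLevel).getMessage());
  }
}
```

In the writer, the `service.files().create(fileToWrite, fileContent).execute()` call could be surrounded by a try/catch that rethrows the result of `UploadErrors.wrap(...)`, so the method signature stays unchanged.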
src/main/java/io/cdap/plugin/google/sink/GoogleDriveSinkConfig.java
pom.xml
| <modelVersion>4.0.0</modelVersion> | ||
|
|
||
| <groupId>io.cdap.plugin</groupId> | ||
| <name>Google Drive plugin</name> |
| } catch (Exception e) { | ||
| collector.addFailure( | ||
| String.format("Exception during authentication/directory properties check: %s", e.getMessage()), | ||
| "Check message and reconfigure the plugin"); |
| } | ||
|
|
||
| private void validateSchemaField(FailureCollector collector, Schema schema, String propertyName, | ||
| String propertyValue, Schema requiredSchema) { |
There was a problem hiding this comment.
You have to check that the field is not a logical type, and if it is nullable, take its non-nullable part before checking the type.
Code example from one of my plugins:
Schema.Field vertexField = inputSchema.getField(vertexType);
if (vertexField == null) {
  failureCollector.addFailure(String.format("Field '%s' is not present in input schema.", vertexType),
                              null).withConfigProperty(VERTEX)
    .withInputSchemaField(VERTEX, null);
} else {
  Schema vertexFieldSchema = vertexField.getSchema();
  if (vertexFieldSchema.isNullable()) {
    vertexFieldSchema = vertexFieldSchema.getNonNullable();
  }
  if (vertexFieldSchema.getLogicalType() != null || vertexFieldSchema.getType() != Schema.Type.STRING) {
    failureCollector.addFailure(String.format("Field '%s' must be of type 'string' but is of type '%s'.",
                                              vertexField.getName(), vertexFieldSchema.getDisplayName()),
                                null).withConfigProperty(VERTEX)
      .withInputSchemaField(VERTEX, null);
  }
}
Also use .withInputSchemaField
docs/GoogleDrive-batchsink.md
| Description | ||
| ----------- | ||
| Sink plugin saves files from the pipeline to Google Drive directory via Google Drive API. | ||
| The minimal input schema contains only byte array with field name configured by **schemaBodyFieldName** property. |
There was a problem hiding this comment.
schemaBodyFieldName and schemaNameFieldName are core properties of the plugin; they define how it works. I think they should be at the top of the widget and the doc, not at the bottom.
Also move this description into the property descriptions instead of the general description.
There was a problem hiding this comment.
Also add information about the required types of the schema fields.
| "display-name": "Google Drive sink", | ||
| "configuration-groups": [ | ||
| { | ||
| "label": "Basic", |
There was a problem hiding this comment.
Need to have two sections:
Basic:
- Reference Name
- Schema field used as name of file
- Schema field used as body of file
Authentication:
everything else.
| try { | ||
| driveClient.isFolderAccessible(directoryIdentifier); | ||
| } catch (GoogleJsonResponseException e) { | ||
| collector.addFailure(e.getMessage(), "Provide an existing folder identifier") |
| try { | ||
| driveClient.checkRootFolder(); | ||
| } catch (GoogleJsonResponseException e) { | ||
| collector.addFailure(e.getMessage(), "Provide valid credentials"); |
|
|
||
| @Nullable | ||
| @Name(SCHEMA_NAME_FIELD_NAME) | ||
| @Description("Name of the schema field which will be used as name of file.") |
| protected String schemaNameFieldName; | ||
|
|
||
| @Name(SCHEMA_BODY_FIELD_NAME) | ||
| @Description("Name of the schema field which will be used as body of file.") |
|
|
||
| } | ||
|
|
||
| private void validateReferenceId(FailureCollector collector) { |
There was a problem hiding this comment.
Use IdUtils.validateReferenceName(referenceName, failureCollector); instead.
docs/GoogleDrive-batchsink.md
| ---------- | ||
|
|
||
| **clientId** OAuth2 client id. | ||
| **File body field** The minimal input schema contains only byte array with field name configured by this property. |
There was a problem hiding this comment.
The description should answer "what does this property do?" and begin with a noun.
There was a problem hiding this comment.
Also don't start with "the" or "a/an". Otherwise all descriptions would start like that.
|
|
||
| @Name(SCHEMA_BODY_FIELD_NAME) | ||
| @Description("Name of the schema field which will be used as body of file.") | ||
| @Description("The minimal input schema contains only byte array with field name configured by this property.\n" + |
6e51944 to 70125b3
CDAP-15864. Google Drive Sink. Updated auth configs.
CDAP-15864. Google Drive Sink. Made authentication process more user-friendly
b4aac47 to fa270a9
fa270a9 to f8f25ca
JIRA: https://issues.cask.co/browse/CDAP-15864
DOC: https://wiki.cask.co/display/CE/Google+Drive+plugins
This is the base PR for the Google Drive batch sink plugin.
TODOs:
Cover the code with tests.
Remove workaround for OAuth2 authentication.
Undocumented points:
As part of the OAuth2 workaround, three new properties were added to the source and sink plugins: "Client Id", "Client secret", "Refresh token".
The "App Id" property was removed because it makes no sense in the scope of Google authentication.
The sink generates a random 16-character file name when "name" is not part of the input schema.
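The description does not say how the random name is produced. The following is a minimal sketch of one possible approach, not the PR's actual code: the class name `RandomFileName`, the alphanumeric alphabet, and the use of `SecureRandom` are all assumptions made for illustration.

```java
import java.security.SecureRandom;

/** Hypothetical sketch (not the PR's code): generates random 16-character file names. */
public class RandomFileName {
  // Assumed alphabet: ASCII letters and digits only, which avoids characters awkward in file names.
  private static final String ALPHABET =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
  private static final int LENGTH = 16;
  private static final SecureRandom RANDOM = new SecureRandom();

  /** Returns a new name made of LENGTH characters drawn uniformly from ALPHABET. */
  public static String generate() {
    StringBuilder name = new StringBuilder(LENGTH);
    for (int i = 0; i < LENGTH; i++) {
      name.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
    }
    return name.toString();
  }

  public static void main(String[] args) {
    System.out.println(generate());
  }
}
```

With 62 possible characters per position, collisions between generated names are very unlikely, though Google Drive also permits several files with the same name in one folder.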
Tested:
File source -> Google Drive sink pipeline