This document describes the complete protosearch API.
protosearch exposes two extensions.
| Extension | Message | Description |
|---|---|---|
| protosearch.field | protosearch.Field | Manage field configuration |
| protosearch.index | protosearch.Index | Manage index configuration |
protosearch.Field is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| name | string | Rename a field in the mapping. |
| mapping | protosearch.FieldMapping | Define mapping field parameters. |
| target | repeated protosearch.Target | Configure a literal mapping for a specific target. |
The protoc-gen-protosearch plugin compiles these message options to a JSON file containing the document mapping.
The simplest way to annotate a field is:

    string uid = 1 [(protosearch.field) = {}];

This will generate a basic field mapping with no parameters except for type. See type inference below.
If you do not annotate a protobuf field with (protosearch.field) options, it will be excluded from the mapping.
The name field lets you rename a protobuf field in the compiled mapping.

    string uid = 1 [(protosearch.field).name = "user_uid"];

This compiles to:

    {
      "properties": {
        "user_uid": {
          "type": "keyword"
        }
      }
    }

In most cases, you will need to use mapping to define field parameters.
FieldMapping supports the most common mapping parameters with one important difference:
- It does not support properties, because the plugin supports defining object and nested fields as protobuf message fields.
Certain fields, namely dynamic, index_options, and term_vector, are enums.
All provide a default UNSPECIFIED value.
The plugin will not output an enum parameter if it has the default UNSPECIFIED value.
If you need to generate a parameter that is not in this list, see target below.
| Field | Type | Description |
|---|---|---|
| type | string | The field type. If omitted, the plugin infers the type from the protobuf field type. |
| analyzer | string | Analyzer used at index time. Applies to text fields. |
| boost | double | Boost a field's score at index time. |
| coerce | bool | Whether to coerce values to the declared mapping type. Applies to numeric and date fields. |
| copy_to | repeated string | Copy this field's value to the named field. |
| doc_values | bool | Whether to store doc values for sorting and aggregation. |
| dynamic | protosearch.Dynamic | How to handle unknown subfields. Applies to object fields. |
| eager_global_ordinals | bool | Whether to load global ordinals at refresh time. |
| enabled | bool | Whether to parse and index the field. |
| fielddata | bool | Whether to use in-memory fielddata for sorting and aggregations. Applies to text fields. |
| fields | map<string, FieldMapping> | A multi-field mapping. |
| format | string | The date format. Applies to date and date_nanos fields. |
| ignore_above | int32 | Do not index strings longer than this length. Applies to keyword fields. |
| ignore_malformed | bool | Ignore invalid values instead of rejecting the document. |
| index_options | protosearch.IndexOptions | Which information to store in the index. Applies to text fields. |
| index_phrases | bool | Whether to index bigrams separately. Applies to text fields. |
| index_prefixes | protosearch.IndexPrefixes | Index term prefixes to speed up prefix queries. Applies to text fields. |
| index | bool | Whether to index the field. |
| meta | map<string, string> | Metadata about the field. |
| normalizer | string | Normalize keyword fields with this normalizer. |
| norms | bool | Whether to store field length norms for scoring. |
| null_value | google.protobuf.Value | Replace explicit null values with this value at index time. |
| position_increment_gap | int32 | A gap inserted between elements in an array to prevent spurious matches. Applies to text fields. |
| search_analyzer | string | Analyzer used at search time. |
| similarity | string | The scoring algorithm. |
| store | bool | Whether to store this field separately from _source. |
| subobjects | bool | Whether dotted field names are interpreted as nested subobjects. |
| term_vector | protosearch.TermVector | Whether to store term vectors. |
protosearch.Dynamic is an enum with the following values:
- DYNAMIC_TRUE
- DYNAMIC_FALSE
- DYNAMIC_STRICT
- DYNAMIC_RUNTIME
protosearch.IndexOptions is an enum with the following values:
- INDEX_OPTIONS_DOCS
- INDEX_OPTIONS_FREQS
- INDEX_OPTIONS_POSITIONS
- INDEX_OPTIONS_OFFSETS
protosearch.IndexPrefixes is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| min_chars | int32 | Minimum prefix length to index. |
| max_chars | int32 | Maximum prefix length to index. |
protosearch.TermVector is an enum with the following values:
- TERM_VECTOR_NO
- TERM_VECTOR_YES
- TERM_VECTOR_WITH_POSITIONS
- TERM_VECTOR_WITH_OFFSETS
- TERM_VECTOR_WITH_POSITIONS_OFFSETS
- TERM_VECTOR_WITH_POSITIONS_PAYLOADS
- TERM_VECTOR_WITH_POSITIONS_OFFSETS_PAYLOADS
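The rule stated earlier, that an enum parameter left at its default UNSPECIFIED value is not written to the mapping, can be illustrated with a small Python sketch. This is not the plugin's code. The `Dynamic` class, the `DYNAMIC_UNSPECIFIED` member name, the numeric values, and the assumption that the JSON value is the lowercased name without its prefix are all hypothetical.

```python
from enum import Enum


class Dynamic(Enum):
    # Hypothetical mirror of protosearch.Dynamic. The UNSPECIFIED member
    # name and every numeric value here are assumptions.
    DYNAMIC_UNSPECIFIED = 0
    DYNAMIC_TRUE = 1
    DYNAMIC_FALSE = 2
    DYNAMIC_STRICT = 3
    DYNAMIC_RUNTIME = 4


def emit_dynamic(params: dict, value: Dynamic) -> dict:
    """Return a copy of params, adding "dynamic" unless value is the default."""
    out = dict(params)
    if value is not Dynamic.DYNAMIC_UNSPECIFIED:
        # Assumed rendering: drop the "DYNAMIC_" prefix and lowercase the rest.
        out["dynamic"] = value.name[len("DYNAMIC_"):].lower()
    return out
```

With these assumptions, passing `Dynamic.DYNAMIC_UNSPECIFIED` leaves the parameters unchanged, while any other value adds a `dynamic` key.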
The target field gives you complete control over how a protobuf field compiles to a mapping property.
It is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| label | string | A human-readable label used to target that particular mapping with --protosearch_opt=target=<label>. |
| json | string | A literal JSON string containing the mapping. |
Use this to define more complex mapping types, or specify parameters that are not supported in FieldMapping.
You can also use this to define mappings for different clusters or vendors.
You can specify this field more than once.
For example, you might want to represent a Point object as a point in Elasticsearch and an xy_point in OpenSearch.
You can create targets for both mappings:

    Point origin = 1 [(protosearch.field) = {
      target: {
        label: "elasticsearch"
        json: '{"type": "point"}'
      }
      target: {
        label: "opensearch"
        json: '{"type": "xy_point"}'
      }
    }];

With --protosearch_opt=target=elasticsearch:

    {
      "properties": {
        "origin": {
          "type": "point"
        }
      }
    }

With --protosearch_opt=target=opensearch:

    {
      "properties": {
        "origin": {
          "type": "xy_point"
        }
      }
    }

If target does not match an existing label, the plugin falls back on the common mapping parameters.
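The selection and fallback behavior described above can be sketched in Python. This is only an illustration, not the plugin's implementation: `select_mapping` and the dictionary shapes are hypothetical, and the sketch assumes a matching target's JSON is used as-is in place of the common parameters.

```python
import json
from typing import Optional


def select_mapping(common: dict, targets: list, label: Optional[str]) -> dict:
    """Pick the literal JSON mapping for label, else the common parameters."""
    if label is not None:
        for target in targets:
            if target["label"] == label:
                # Assumed: a matching target replaces the common mapping.
                return json.loads(target["json"])
    # No label requested, or no target carries that label: fall back.
    return common


targets = [
    {"label": "elasticsearch", "json": '{"type": "point"}'},
    {"label": "opensearch", "json": '{"type": "xy_point"}'},
]
```

Calling `select_mapping({"type": "object"}, targets, "opensearch")` yields the xy_point mapping, while a label that no target declares returns the common parameters unchanged.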
protosearch.Index is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| mapping | protosearch.IndexMapping | Define index mapping parameters. |
protosearch.IndexMapping is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| date_detection | bool | Whether to detect date strings as date fields. |
| dynamic | protosearch.Dynamic | How to handle unknown fields. |
| dynamic_date_formats | repeated string | Date formats to use for dynamic date detection. |
| _field_names | protosearch.IndexFieldNames | Controls the _field_names metadata field. |
| _meta | map<string, string> | Metadata about the index mapping. |
| numeric_detection | bool | Whether to detect numeric strings as numeric fields. |
| _routing | protosearch.IndexRouting | Controls the _routing metadata field. |
| _source | protosearch.IndexSource | Controls the _source metadata field. |
dynamic uses the same protosearch.Dynamic enum as field.mapping.dynamic.
protosearch.IndexFieldNames is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| enabled | bool | Whether to enable the _field_names metadata field. |
protosearch.IndexRouting is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| required | bool | Whether to require routing for all document operations. |
protosearch.IndexSource is a message with the following fields:
| Field | Type | Description |
|---|---|---|
| compress | bool | Whether to compress stored source data. OpenSearch only. |
| compress_threshold | string | Minimum source size to trigger compression. OpenSearch only. |
| enabled | bool | Whether to store the _source field. |
| excludes | repeated string | Fields to exclude from the stored _source. |
| includes | repeated string | Fields to include in the stored _source. |
| mode | protosearch.SourceMode | How to store the _source field. |
protosearch.SourceMode is an enum with the following values:
- SOURCE_MODE_DISABLED
- SOURCE_MODE_STORED
- SOURCE_MODE_SYNTHETIC
If type is not specified, protoc-gen-protosearch will infer a field type from the protobuf type.
| Protobuf | Elasticsearch |
|---|---|
| string | keyword |
| bool | boolean |
| int32, sint32, sfixed32 | integer |
| uint32, fixed32 | long |
| int64, sint64, sfixed64 | long |
| uint64, fixed64 | unsigned_long |
| float | float |
| double | double |
| bytes | binary |
| message | object |
| enum | keyword |
| google.protobuf.Timestamp | date |
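The inference table can be read as a simple lookup in which an explicit type always wins. The Python sketch below restates it that way. `INFERRED_TYPES` and `resolve_type` are hypothetical names, and the use of plain strings for protobuf types is a simplification rather than how the plugin represents them.

```python
# Lookup table copied from the type inference table above.
INFERRED_TYPES = {
    "string": "keyword",
    "bool": "boolean",
    "int32": "integer", "sint32": "integer", "sfixed32": "integer",
    "uint32": "long", "fixed32": "long",
    "int64": "long", "sint64": "long", "sfixed64": "long",
    "uint64": "unsigned_long", "fixed64": "unsigned_long",
    "float": "float",
    "double": "double",
    "bytes": "binary",
    "message": "object",
    "enum": "keyword",
    "google.protobuf.Timestamp": "date",
}


def resolve_type(protobuf_type: str, explicit_type: str = "") -> str:
    """Return the explicit mapping type if set, else the inferred one."""
    if explicit_type:
        return explicit_type
    return INFERRED_TYPES[protobuf_type]
```

For example, `resolve_type("uint32")` gives `long`, while `resolve_type("string", "text")` keeps the explicitly requested `text`.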
The plugin validates some field options and collects diagnostics during compilation.
Errors (EXXX) are fatal; protoc will exit with an error code and will not produce any output.
The plugin prints warnings (WXXX) to standard output.
The specified value is invalid for this parameter. The plugin will report the reason.
target.json is not valid JSON.
target.json is not a JSON object.
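The two target.json checks above can be approximated before running protoc. The following Python sketch is an assumption about what such a check looks like, not the plugin's validation code. `check_target_json` is a hypothetical helper, and the returned strings simply repeat the descriptions above.

```python
import json
from typing import Optional


def check_target_json(raw: str) -> Optional[str]:
    """Return a problem description, or None if raw is a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return "target.json is not valid JSON"
    if not isinstance(value, dict):
        # Arrays, strings, numbers, booleans and null parse but are not objects.
        return "target.json is not a JSON object"
    return None
```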
name is invalid.
Names must match the pattern [@a-z][a-z0-9_]*(\.[a-z0-9_]+)*.
These are all allowed names:
- @timestamp
- foo
- foo_bar
- foo.bar.baz
- foo_123
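To check names against this pattern outside the plugin, a full-string regular expression match is enough. The Python sketch below uses the pattern exactly as documented. `NAME_PATTERN` and `is_valid_name` are hypothetical names chosen for illustration.

```python
import re

# The documented name pattern, applied to the whole string.
NAME_PATTERN = re.compile(r"[@a-z][a-z0-9_]*(\.[a-z0-9_]+)*")


def is_valid_name(name: str) -> bool:
    """Return True if name matches the documented pattern in full."""
    return NAME_PATTERN.fullmatch(name) is not None
```

All five names listed above pass this check. Names such as `Foo` (uppercase letter), `1foo` (leading digit), or `foo.` (trailing dot) do not.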
The target label does not correspond to a known target.
With protoc-gen-protosearch installed on your $PATH, you can compile mappings like so:

    protoc -I proto/ --plugin=protoc-gen-protosearch --protosearch_out=. proto/example/article.proto

Specify --protosearch_opt=target=<label> to compile the mapping for a specific target.

    protoc -I proto/ --plugin=protoc-gen-protosearch --protosearch_out=. --protosearch_opt=target=<label> proto/example/article.proto

The plugin pretty-prints output by default.
Specify --protosearch_opt=pretty=false to disable this.
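If you invoke protoc from a build script, the command line can be assembled programmatically. The Python sketch below only builds the argument list from the flags shown above. `protoc_args` is a hypothetical helper, and passing --protosearch_opt twice assumes that protoc combines repeated option flags for the same plugin.

```python
from typing import List, Optional


def protoc_args(
    proto_file: str,
    include_dir: str = "proto/",
    out_dir: str = ".",
    target: Optional[str] = None,
    pretty: bool = True,
) -> List[str]:
    """Build a protoc argument list using only the documented flags."""
    args = [
        "protoc",
        "-I", include_dir,
        "--plugin=protoc-gen-protosearch",
        f"--protosearch_out={out_dir}",
    ]
    if target is not None:
        args.append(f"--protosearch_opt=target={target}")
    if not pretty:
        # Pretty-printing is the default, so only the opt-out is passed.
        args.append("--protosearch_opt=pretty=false")
    args.append(proto_file)
    return args
```

The resulting list can be handed to `subprocess.run(args, check=True)` so that a fatal diagnostic from the plugin surfaces as an exception in the build script.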