-
Notifications
You must be signed in to change notification settings - Fork 3
ES8 (and OpenSearch) support #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
86812e9 to
3fb4418
Compare
…es8 instance are included in our github action matrices, and tests are passing in es7 and es8 (except ones relying on mapping types, which aren't supported by es8) cleanup (+15 squashed commits) Squashed commits: [3fb4418] start excluding mapping types in es8 tests again [4dc2703] alright separating and cleaning [34590cc] I want to check something cheeky [2fec4a1] flip reverse it so only es7 runs mapping type tests [ce87d3d] getting weird with it [fc17334] ok next level [3ce0564] first pass on a matrix [7a88a3a] bringing back es8 [bb819f6] bringing the port back [224e686] skipping es8 tests, not running its action [0e46124] whatever I'll just skip those tests again [c0afa77] do I get a different error at least [5842e27] I call this throwing stuff at the wall and hoping for the best [364c0e1] container name overlap [eb6d57a] es8 in gh actions
3fb4418 to
b3a97fe
Compare
| def bulk_operation({op_type, struct}), do: bulk_operation_default({op_type, struct, :index}) | ||
|
|
||
| defp bulk_operation_default({op_type, %{__struct__: model} = struct, index}) do | ||
| op = %{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from here until the tests the code should be unchanged from #31
| @@ -0,0 +1,325 @@ | |||
| defmodule ExlasticSearch.Repo.MappingTypesTest do | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here's where the code starts diverging a bit from #31
this module is just tests that used mapping types, which es7 supported but are deprecated. these tests seem to be the original tests for the library, with new ones brought on that use typeless models (and whose tests include "es8+" in the title). those new tests remain in repo_test.exs, and these tests that rely on mapping types and will fail in es8+ have been quarantined here, in a module that I spent very little time thinking of the naming or folder structure of.
this module should align very closely with the original tests, and repo_test.exs should have similar coverage (they probably are the same tests) but using typeless models.
| setup_all do | ||
| Repo.delete_index(TestModel) | ||
| Repo.create_index(TestModel) | ||
| {:ok, %{status_code: 200}} = Repo.create_mapping(TestModel, :index, include_type_name: true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the test setups weren't matching against anything for the return value of Repo.create_mapping(...), but that also covered up that they were silently failing on creating mappings for deprecated mapping types, so I put in matching against {:ok, ...} and that the status code is 200, so if these fail to create we'd at least know about it.
| # can we unskip? fails from mapping types | ||
| @tag :skip | ||
| test "It can perform top_hits aggregations, even when nested" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test (and the other test) were skipped before I got here, they seem to fail in CI so I think a previous commit (before this PR) skipped them. I tried unskipping them after fixing the Repo.create_mapping(...) calls but they still seemed to fail. I suppose we should just delete them if we don't expect them to ever pass, so I'm open on votes to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like Milton added the skip. If he was planning to come back and fix the tests we could leave them. If not, I think deleting them makes sense.
test/exlasticsearch/repo_test.exs
Outdated
| test "It will index an element in es" do | ||
| model = %ExlasticSearch.TestModel{id: Ecto.UUID.generate()} | ||
| {:ok, _} = Repo.index(model) | ||
| @tag :elasticsearch8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these tags aren't functionally needed anymore, I added them because I was originally excluding es8 tests on es7, but es8 tests will pass on es7 (it's es7 tests, i.e. ones relying on mapping types, that will fail in es8).
I'll go ahead and remove these :elasticsearch8 tests unless someone feels strongly otherwise.
| - elixir: "1.18" | ||
| otp: "27" | ||
| lint: true | ||
| elasticsearch: "8.13.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it would be nice to have testing against an opensearch docker container would be nice too, but I'm not sure of an elegant way to add that in here.
any suggestions on how to do that in a simple way, or on how to clean up my heavy-handed matrix approach, would be much appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To start an OpenSearch container for testing, you can follow a pattern like this (adding Kafka container for testing): https://github.com/Frameio/massdriver/pull/18087/files#diff-95557d53bf91069e59d82b9d8fcfaf52ec84a762858983edd5ef3c9b3e4c8191R72
lib/exlasticsearch/repo.ex
Outdated
| match?(%{doc: _}, data) -> | ||
| model | ||
| |> struct(Map.put(data.doc, :id, id)) | ||
| |> index(index) | ||
|
|
||
| true -> | ||
| {:error, "ExlasticSearch only supports doc updates for ES 8+"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hack I put in the original PR based on the way I know massdriver is using Exlasticsearch. In general this is not correct.
An "update" operation will only update fields that are present in data.doc, but an "index" operation will replace the entire indexed document with whatever is in data.doc. If data.doc is an incomplete document, that means the fields that are not included will be deleted. This would be very surprising behavior for someone using this library.
Several options:
- Leave the original Exlasticsearch.Repo.update function in place and update massdriver so that we never call it, but instead always call Exlasticsearch.Repo.index.
- Fork Exlasticsearch
- Submit a PR to Elastix that allows a
nildoc_type to be given toDocument.update. This would be a fairly small PR. This is the function that would need to be updated to construct the correct path when there is anildoc type: https://github.com/werbitzky/elastix/blob/5963511a5b5b5ccc57e2ff70e64021fd3a4b795f/lib/elastix/document.ex#L197
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any downside to option number 1? seems the most straightforward of them, although I see fabian has made some recent commits to elastix so maybe that's not hard to swing 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok good news, massdriver already doesn't use Exlasticsearch.Repo.update anywhere. What it does use is bulk, so we just need to update the bulk_operation_update function in lib/exlasticsearch/bulk.ex to handle the case when doc type is nil.
I recommend reverting the changes to Exlasticsearch.Repo.update. We're not making any claims that Exlasticsearch is "fully compatible with ES8/OpenSearch". I think it'd be better to have near compatibility than "compatibility" with incorrect behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking in to bulk_operation_update, weirdly enough, the typelesstestmodel tests running on es8 pass, even with the new ones that use Repo.bulk. they're still passing _type, just with a value of nil. if I add a fake value like _dummy_value_not_used_in_api: "huh", then the bulk api returns an error. so it doesn't accept unknown fields, but does still accepts _type 🤔 .
still think I'm going to only include the _type field if __doc_type__() is not nil, for sanity's sake, but that's kinda weird isn't it.
…ak stuff without knowing just because the http request succeeds but with an errorful status code
this PR is simply this branch: #31 but rebased off of master, kept as a separate PR for now to retain the original PR's code, and I can verify that there are no rebase artifacts or anything weird.
here is the original PR's description:
what I did in this PR, from that:
masterrepo_test.exs, should be testing the same kind of thing, but since they don't use mapping types, will pass under es7 and es8