Skip to content

Least request lb enable full scan mode#1

Open
jkirschner-hashicorp wants to merge 200 commits intomainfrom
least_request_lb_enable_full_scan_mode
Open

Least request lb enable full scan mode#1
jkirschner-hashicorp wants to merge 200 commits intomainfrom
least_request_lb_enable_full_scan_mode

Conversation

@jkirschner-hashicorp
Copy link
Copy Markdown
Owner

This is a WIP attempt to combine the PRs related to full scan mode (primarily authored by @barroca) into a single PR.

This branch was made relative to Envoy main on Dec 20.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Fix this, since the bool has been converted to an enum

@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 33efb88 to dfaa3b6 Compare December 23, 2023 21:41
jkirschner-hashicorp pushed a commit that referenced this pull request Jan 16, 2024
Commit Message: the probing socket is released when port migration fails. If this happens in response to an incoming packet during an I/O event, the follow socket read could cause use-after-free.

[2024-01-08 16:30:53.386][12][critical][backtrace] [./source/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2024-01-08 16:30:53.387][12][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-01-08 16:30:53.387][12][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 0/1.29.0-dev/test/DEBUG/BoringSSL
[2024-01-08 16:30:53.413][12][critical][backtrace] [./source/server/backtrace.h:96] #0: Envoy::SignalAction::sigHandler() [0x55bb876d499e]
[2024-01-08 16:30:53.413][12][critical][backtrace] [./source/server/backtrace.h:98] #1: [0x7f55fbf92510]
[2024-01-08 16:30:53.440][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#2: Envoy::Network::Utility::readPacketsFromSocket() [0x55bb875de0ef]
[2024-01-08 16:30:53.466][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#3: Envoy::Quic::EnvoyQuicClientConnection::onFileEvent() [0x55bb8663e1eb]
[2024-01-08 16:30:53.492][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#4: Envoy::Quic::EnvoyQuicClientConnection::setUpConnectionSocket()::$_0::operator()() [0x55bb8663f192]
[2024-01-08 16:30:53.518][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#5: std::__invoke_impl<>() [0x55bb8663f151]
[2024-01-08 16:30:53.544][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#6: std::__invoke_r<>() [0x55bb8663f0e2]
[2024-01-08 16:30:53.569][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#7: std::_Function_handler<>::_M_invoke() [0x55bb8663efc2]
[2024-01-08 16:30:53.595][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#8: std::function<>::operator()() [0x55bb85cb8f44]
[2024-01-08 16:30:53.621][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#9: Envoy::Event::DispatcherImpl::createFileEvent()::$_5::operator()() [0x55bb8722560f]
[2024-01-08 16:30:53.648][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#10: std::__invoke_impl<>() [0x55bb872255c1]
[2024-01-08 16:30:53.674][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#11: std::__invoke_r<>() [0x55bb87225562]
[2024-01-08 16:30:53.700][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#12: std::_Function_handler<>::_M_invoke() [0x55bb872253e2]
[2024-01-08 16:30:53.700][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#13: std::function<>::operator()() [0x55bb85cb8f44]
[2024-01-08 16:30:53.726][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#14: Envoy::Event::FileEventImpl::mergeInjectedEventsAndRunCb() [0x55bb872358ec]
[2024-01-08 16:30:53.752][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#15: Envoy::Event::FileEventImpl::assignEvents()::$_1::operator()() [0x55bb87235ed1]
[2024-01-08 16:30:53.778][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#16: Envoy::Event::FileEventImpl::assignEvents()::$_1::__invoke() [0x55bb87235949]
[2024-01-08 16:30:53.804][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#17: event_persist_closure [0x55bb87fab72b]
[2024-01-08 16:30:53.830][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#18: event_process_active_single_queue [0x55bb87faada2]
[2024-01-08 16:30:53.856][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#19: event_process_active [0x55bb87fa56c8]
[2024-01-08 16:30:53.882][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#20: event_base_loop [0x55bb87fa45cc]
[2024-01-08 16:30:53.908][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#21: Envoy::Event::LibeventScheduler::run() [0x55bb8760a59f]
Risk Level: low
Testing: new unit test
Docs Changes: N/A
Release Notes: Yes
Platform Specific Features: N/A

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 9024501 to 0cf6424 Compare January 16, 2024 19:52
By default, the least request load balancer returns the host with the fewest
active requests from a set of N randomly selected hosts.

This introduces a new "full scan" selection method that returns the host with
the fewest number of active requests from all hosts. If multiple hosts are
tied for "least", a host is selected using an approximately random means that
is time and memory efficient. The selection probability among tied hosts may
not be equal, depending on:
- The host count
- How many hosts are tied
- How those tied hosts are distributed within the hosts array

Signed-off-by: Jared Kirschner <jkirschner@hashicorp.com>
Signed-off-by: Leonardo da Mata <ldamata@spotify.com>
@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 0cf6424 to 47e2838 Compare January 16, 2024 19:59
alyssawilk and others added 23 commits January 16, 2024 15:34

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Use the same TIMEOUT_FACTOR when running TSAN, MSAN, and ASAN since running the tests under sanitizers can take longer. This also renames TSAN_TIMEOUT_FACTOR to TIMEOUT_FACTOR to reflect the name.

Risk Level: low (tests only)
Testing: n/a
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Fredy Wijaya <fredyw@google.com>
Signed-off-by: Fredy Wijaya <fredyw@google.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
* Update RELEASES.md to less outdated instructions

Signed-off-by: Raven Black <ravenblack@dropbox.com>
Additional Description: The CryptoMB private key provider only supports RSA at the time, the patch adds ECDSA support to it.
Risk Level: Low (as contrib extension)
Testing: Unit and integration tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: Requires AVX512 or equivalent CPU instruction set

Signed-off-by: Xie Zhihao <zhihao.xie@intel.com>
The OpenTelemetry exporter was previously using client-side streaming for exporting trace data. This appears to work fine with the reference OpenTelemetry collector, but some collectors (DataPrepper for one example) instead seem to expect unary RPC calls for trace exports (in accordance with the OpenTelemetry specification). It seems that Data Prepper appears to rely specifically on the end-stream flag arriving in order for it to properly process the trace exports.

Risk Level: medium
Testing: no new tests

Fixes envoyproxy#31497 

Signed-off-by: Ashish Banerjee <ashish.banerjee@solo.io>
See envoyproxy/go-control-plane#824 for more information

This PR adds the vtprotobuf protoc plugin for Go. This works on top of the existing protoc-gen-go, to add optimized marshal functions that callers can opt in to using. This is not like gogo, which was a very invasive change -- everything is layered and opt-in. See issue for benchmarks, etc.

Additionally, to avoid possible binary size increase, the entire new code is protected under a go build tag. Users will need to opt-in at build time (-tags=vtprotobuf). By default, there is no impact for users at all.

Risk Level: Low - only additional opt-in code
Testing: Manually tested in Istio codebase

Signed-off-by: John Howard <howardjohn@google.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
…red/golang (envoyproxy#31860)

build(deps): bump golang in /examples/shared/golang

Bumps golang from `cbee5d2` to `c4b696f`.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…in /examples/shared/node (envoyproxy#31859)

build(deps): bump node in /examples/shared/node

Bumps node from 21.5-bookworm-slim to 21.6-bookworm-slim.

---
updated-dependencies:
- dependency-name: node
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… initialized (envoyproxy#31834)

* initialize variable request_duration_

Signed-off-by: Zhiyong Yang <gayang@microsoft.com>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
…oxy#31852)

Moves some frame accounting to onBeforeFrameReceived.

Signed-off-by: Biren Roy <birenroy@google.com>
…p2_settings` internally (envoyproxy#31875)

This is refactoring in preparation for removing the nghttp2_callback API layer.

Commit Message: http2: updates CodecImpl to use a more abstract data type than nghttp2_settings internally
Additional Description:
Risk Level: low; refactoring, no functional change intended
Testing: ran unit and integration tests locally
Docs Changes:
Release Notes:
Platform Specific Features:

Signed-off-by: Biren Roy <birenroy@google.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
…31888)

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
…oxy#31891)

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
---------

Signed-off-by: Xuyang Tao <taoxuy@google.com>
Fixes envoyproxy#31856

Signed-off-by: Fredy Wijaya <fredyw@google.com>
gRPC clients expect that request status will be communicated using the gRPC status header, 
and that HTTP status for responses should always be 200 for properly behaving servers.
This commit updates the grpc_http1_reverse_bridge to follow this behavior.

Risk Level: medium
Testing: unit test, manually tested with grpcurl
Docs Changes: n/a
Release Notes: updated
Runtime guard: envoy.reloadable_features.grpc_http1_reverse_bridge_change_http_status

Signed-off-by: Henry Qin <hqin@squareup.com>
Signed-off-by: Henry Qin <root@hq6.me>
dependabot bot and others added 30 commits February 2, 2024 11:50
…hared/postgres (envoyproxy#32158)

build(deps): bump postgres in /examples/shared/postgres

Bumps postgres from `49c276f` to `db2d3c8`.

---
updated-dependencies:
- dependency-name: postgres
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…al_ratelimit with 1 update (envoyproxy#32162)

build(deps): bump the examples-local-ratelimit group

Bumps the examples-local-ratelimit group in /examples/local_ratelimit with 1 update: nginx.


Updates `nginx` from `4c0fdaa` to `6eb9534`

---
updated-dependencies:
- dependency-name: nginx
  dependency-type: direct:production
  dependency-group: examples-local-ratelimit
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Fredy Wijaya <fredyw@google.com>
Signed-off-by: Raven Black <ravenblack@dropbox.com>
)

Currently, if propagate_response_headers option is enabled, either for UDP tunneling and TCP tunneling, and retry is also enabled, the second HTTP response headers will cause a crash in Envoy, since the first filter state object is set with ReadOnly attribute. Fixing this by changing to Mutable with relevant tests that check this use case.

Risk Level: low
Testing: integration tests
Docs Changes: None
Release Notes: updated
Platform Specific Features: None
Signed-off-by: ohadvano <ohadvano@gmail.com>
Add Envoy::ExecutionContext, it can be inherited by subclasses to represent arbitrary information associated with the execution of a piece of code. Its activate/deactivate methods are called when the said execution starts/ends.

This is useful for things like resource accounting - see github issue for more details.

Commit Message: Add Envoy::ExecutionContext.
Additional Description:
Risk Level: Low. It is no op in open source code.
Testing: Unit test in test/common/common/execution_context_test.cc.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Bin Wu <wub@google.com>
It is causing test flakes and friction on CI. I was able to reproduce
this, and the crash happens in
https://github.com/envoyproxy/envoy/blob/faec43168bb50028b4b4351f477dfef40e0fb227/test/integration/fake_upstream.h#L714.

But I'm not sure cause, so will disable it for now until we have
deflaked it.

See envoyproxy#32151 for details.

Signed-off-by: Ali Beyad <abeyad@google.com>
…oyproxy#32105)

* stream info: rename the extended response flag and response flag

Signed-off-by: wbpcode <wbphub@live.com>

* minor update

Signed-off-by: wbpcode <wbphub@live.com>

* address comments

Signed-off-by: wbpcode <wbphub@live.com>

---------

Signed-off-by: wbpcode <wbphub@live.com>
* Exclude hosts in DRAINING state

Signed-off-by: Networking team <networking@lyft.com>

* WIP

Signed-off-by: Networking team <networking@lyft.com>

* fix upstream_impl_test

Signed-off-by: Networking team <networking@lyft.com>

* add more tests

Signed-off-by: Networking team <networking@lyft.com>

* fix lint

Signed-off-by: Networking team <networking@lyft.com>

* fix cluster_handler_test

Signed-off-by: Networking team <networking@lyft.com>

* fix lint

Signed-off-by: Networking team <networking@lyft.com>

* rm excluded_via_eds_draining

Signed-off-by: Chao-Han Tsai <milton0825@gmail.com>

* rename

Signed-off-by: Chao-Han Tsai <milton0825@gmail.com>

* restore

Signed-off-by: Chao-Han Tsai <milton0825@gmail.com>

* remove comment

Signed-off-by: Chao-Han Tsai <milton0825@gmail.com>

* add runtime guard and changelog

Signed-off-by: Networking team <networking@lyft.com>

* fix lint

Signed-off-by: Networking team <networking@lyft.com>

---------

Signed-off-by: Networking team <networking@lyft.com>
Signed-off-by: Chao-Han Tsai <milton0825@gmail.com>
Co-authored-by: Networking team <networking@lyft.com>
…roxy#32193)

Bumps [orjson](https://github.com/ijl/orjson) from 3.9.12 to 3.9.13.
- [Release notes](https://github.com/ijl/orjson/releases)
- [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md)
- [Commits](ijl/orjson@3.9.12...3.9.13)

---
updated-dependencies:
- dependency-name: orjson
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…hared/postgres (envoyproxy#32195)

build(deps): bump postgres in /examples/shared/postgres

Bumps postgres from `db2d3c8` to `4d1b17a`.

---
updated-dependencies:
- dependency-name: postgres
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
envoyproxy#32190)

Bumps redis from `247eb51` to `f44e917`.

---
updated-dependencies:
- dependency-name: redis
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…red/golang (envoyproxy#32191)

build(deps): bump golang in /examples/shared/golang

Bumps golang from `cf1c8a7` to `3efef61`.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…al_ratelimit with 1 update (envoyproxy#32189)

build(deps): bump the examples-local-ratelimit group

Bumps the examples-local-ratelimit group in /examples/local_ratelimit with 1 update: nginx.


Updates `nginx` from `6eb9534` to `5f44022`

---
updated-dependencies:
- dependency-name: nginx
  dependency-type: direct:production
  dependency-group: examples-local-ratelimit
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…xamples/zipkin (envoyproxy#32097)

build(deps): bump openzipkin/zipkin in /examples/zipkin

Bumps openzipkin/zipkin from `24164b3` to `4c7427b`.

---
updated-dependencies:
- dependency-name: openzipkin/zipkin
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…les/skywalking (envoyproxy#32188)

build(deps): bump elasticsearch in /examples/skywalking

Bumps elasticsearch from `60babae` to `2529e4b`.

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The ResetWithBidiTrafficExplicitData was crashing sporadically with
memory errors when run for HTTP3. Running with msan showed that the
crashes were caused in headers_callback.setReady() being called after
the test finished and member variables of the test were already
destroyed.

I noticed that in the test, we didn't call headers_callback.waitReady()
when the test was running for HTTP3.  It wasn't clear to me why, but
that seemed like the cause.  I removed the if-guard so we always call
headers_callback.waitReady(), and the test seems to pass now (over 500
runs).

Fixes envoyproxy#32024

Signed-off-by: Ali Beyad <abeyad@google.com>
Signed-off-by: liang.he@intel.com <liang.he@intel.com>
Lowering API limits so we can more easily test dns refresh.
Adding a bunch of e2e DNS refresh tests.

Risk Level: low
Testing: yes
Docs Changes: inline
Release Notes:
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
…roxy#32183)

We followed the C++ style to register configFactory, but it's not a proper choice for Golang filter.
Here is the Reasons:

Golang introduced the ConfigParser interface to Parse/validate a config, configFactory does not need to parse config.
better performance. By using fiterFactory, we could ommit generate a closure per http request, that may take ~1ns.
Yep, this is a breaking change, people need to change the register API.
So, I suggest we fix it earlier.
Signed-off-by: doujiang24 <doujiang24@gmail.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
…histograms (envoyproxy#32197)

Commit Message: Works around n^2 issue when destructing hundreds of thousands of mocks in the benchmark context. It does thi s by:

only initializing the hosts when per-host stats are enabled in the cluster config
adding a 'disabled' and 'enabled' variant to the histogram-json test so that it's easy to filter using --benchmark_filter=disabled to avoid initializing the hosts.
See also: envoyproxy#32198 -- this doesn't fix that bug but we could in theory revert this PR once that bug is fixed.
Additional Description:
Risk Level: low
Testing: just the one test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <jmarantz@google.com>
…nvoyproxy#32174)

mobile: Fix the singleton implementation to be thread-safe

Signed-off-by: Fredy Wijaya <fredyw@google.com>
…d/node (envoyproxy#32192)

build(deps): bump node in /examples/shared/node

Bumps node from `0ca0e0a` to `5792ef2`.

---
updated-dependencies:
- dependency-name: node
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Jared Kirschner <jkirschner@hashicorp.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.