Skip to content

Conversation

@trozet
Copy link
Contributor

@trozet trozet commented Sep 10, 2021

Need a RWMutex to protect nil client usage during reconnect. Also includes a workaround for https://bugzilla.redhat.com/show_bug.cgi?id=2000375

Mutex is needed to protect client usage when client may be nil during a
reconnect. Otherwise the code will panic in multiple places.

Co-authored-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Signed-off-by: Tim Rozet <trozet@redhat.com>
This is a workaround due to a bug in ovsdb-server where it may silently
drop monitors:
https://bugzilla.redhat.com/show_bug.cgi?id=2000375

Related to ovnkube bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1998614

tl;dr ovsdb-server may drop the monitor to go-ovn silently which will
cause the go-ovn cache to go stale. Subsequent retry txns for the
resource will result in errors from ovsdb-server (since the resource may
have been created in ovsdb, but we are not getting updates) threfore a
txn error is encountered force reconnect to rebuild cache and get a new
monitor.

Co-authored-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>

Signed-off-by: Tim Rozet <trozet@redhat.com>
@trozet
Copy link
Contributor Author

trozet commented Sep 10, 2021

@hzhou8
Copy link
Contributor

hzhou8 commented Sep 17, 2021

Hi @trozet, the problem looks clear but I wonder if this is the right way to fix. There can be different reasons for a transaction to fail - it can be temporary. Are we going to blindly reconnect regardless of the reason?

opsInfo = fmt.Sprintf("%v", ops[i])
}
return nil, fmt.Errorf("Transaction Failed due to an error: %v details: %v in %s",
odbi.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't trigger reconnect if odbi.reconn is not true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants