
Send DDLs to the right catalog.#2

Open
anuragmantri wants to merge 1 commit into vihangk1:catalog_ha from anuragmantri:catalog_ha

Conversation

@anuragmantri

Currently this is added only for CREATE TABLE. Tested manually by
creating different tables. DDL requests are always sent to a single
catalog; if it is not the active one, it responds with the address of
the active catalog and the operation is retried there.

Change-Id: Id34d4e50cd238ebce2a25898ed40aa5d40207619

vihangk1 pushed a commit that referenced this pull request Jul 30, 2020
Fixes the following TSAN data races that come up when running custom
cluster tests. The immediate goal is to fix all remaining data races in
custom cluster tests and then enable custom cluster tests in the TSAN
builds. This patch fixes about half of the remaining data races reported
during a TSAN build of custom cluster tests.

SUMMARY: ThreadSanitizer: data race util/stopwatch.h:186:9 in impala::MonotonicStopWatch::RunningTime() const
  Read of size 8 at 0x7b580000dba8 by thread T342:
    #0 impala::MonotonicStopWatch::RunningTime() const util/stopwatch.h:186:9
    #1 impala::MonotonicStopWatch::Reset() util/stopwatch.h:136:20
    #2 impala::StatestoreSubscriber::Heartbeat(impala::TUniqueId const&) statestore/statestore-subscriber.cc:358:35
  Previous write of size 8 at 0x7b580000dba8 by thread T341:
    #0 impala::MonotonicStopWatch::Reset() util/stopwatch.h:139:21 (impalad+0x1f744ab)
    #1 impala::StatestoreSubscriber::Heartbeat(impala::TUniqueId const&) statestore/statestore-subscriber.cc:358:35

SUMMARY: ThreadSanitizer: data race status.h:220:10 in impala::Status::operator=(impala::Status&&)
  Write of size 8 at 0x7b50002e01e0 by thread T341 (mutexes: write M17919):
    #0 impala::Status::operator=(impala::Status&&) common/status.h:220:10
    #1 impala::RuntimeState::SetQueryStatus(std::string const&) runtime/runtime-state.h:250
    #2 impala_udf::FunctionContext::SetError(char const*) udf/udf.cc:423:47
  Previous read of size 8 at 0x7b50002e01e0 by thread T342:
    #0 impala::Status::ok() const common/status.h:236:42
    #1 impala::RuntimeState::GetQueryStatus() runtime/runtime-state.h:15
    #2 impala::HdfsScanner::CommitRows(int, impala::RowBatch*) exec/hdfs-scanner.cc:218:3

SUMMARY: ThreadSanitizer: data race hashtable.h:370:58
  Read of size 8 at 0x7b2400091df8 by thread T338 (mutexes: write M106814410723061456):
...
    #3 impala::MetricGroup::CMCompatibleCallback() util/metrics.cc:185:40
...
    #9 impala::Webserver::RenderUrlWithTemplate() util/webserver.cc:801:3
    #10 impala::Webserver::BeginRequestCallback(sq_connection*, sq_request_info*) util/webserver.cc:696:5
  Previous write of size 8 at 0x7b2400091df8 by thread T364 (mutexes: write M600803201008047112, write M1046659357959855584):
...
    #4 impala::AtomicMetric<(impala::TMetricKind::type)0>* impala::MetricGroup::RegisterMetric<> >() util/metrics.h:366:5
    #5 impala::MetricGroup::AddGauge(std::string const&, long, std::string const&) util/metrics.h:384:12
    #6 impala::AdmissionController::PoolStats::InitMetrics() scheduling/admission-controller.cc:1714:55

Testing:
* Ran core tests
* Re-ran TSAN tests and made sure issues were resolved
* Ran single_node_perf_run for workload TPC-H scale factor 30;
  no regressions detected

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 7.36    | -1.77%     | 5.01       | -1.61%         |
+----------+-----------------------+---------+------------+------------+----------------+

Change-Id: Id4244c9a7f971c96b8b8dc7d5262904a0a4b77c1
Reviewed-on: http://gerrit.cloudera.org:8080/16079
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
vihangk1 pushed a commit that referenced this pull request Nov 18, 2020
This patch fixes the remaining TSAN errors reported while running custom
cluster tests. After this patch, TSAN can be enabled for custom cluster
tests (currently it is only run for backend (be) tests).

Adds a data race suppression for
HdfsColumnarScanner::ProcessScratchBatchCodegenOrInterpret, which
usually calls a codegen function. TSAN currently does not support
codegen functions, so this warning needs to be suppressed. The call
stack of this warning is:

    #0 kudu::BlockBloomFilter::Find(unsigned int) const kudu/util/block_bloom_filter.cc:257:7
    #1 <null> <null> (0x7f19af1c74cd)
    #2 impala::HdfsColumnarScanner::ProcessScratchBatchCodegenOrInterpret(impala::RowBatch*) exec/hdfs-columnar-scanner.cc:106:10
    #3 impala::HdfsColumnarScanner::TransferScratchTuples(impala::RowBatch*) exec/hdfs-columnar-scanner.cc:66:34
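For reference, a TSAN suppression of this kind is typically a one-line `race:` entry in the suppressions file; the exact file and form used here are illustrative:

```
# Suppress reports whose stack contains this frame; frames in
# codegen'd (JIT-compiled) code cannot be symbolized by TSAN.
race:impala::HdfsColumnarScanner::ProcessScratchBatchCodegenOrInterpret
```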

Fixes a data race in DmlExecState::FinalizeHdfsInsert where a local
HdfsFsCache::HdfsFsMap is unsafely passed between threads of a
HdfsOperationSet. HdfsOperationSet instances are run in a
HdfsOpThreadPool, and each operation is run on one of the threads from
the pool. Each operation uses HdfsFsCache::GetConnection to get a hdfsFs
instance. GetConnection can take a 'local_cache' of hdfsFs instances to
consult before falling back to the global map. The race is that the same
local cache was shared, without synchronization, by all operations in
the HdfsOperationSet.

Testing:
* Re-ran TSAN tests and confirmed the data races have disappeared

Change-Id: If1658a9b56d220e2cfd1f8b958604edcdf7757f4
Reviewed-on: http://gerrit.cloudera.org:8080/16426
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>