This repository was archived by the owner on Dec 28, 2017. It is now read-only.

Index scan cannot read all data #201

@Novemser

There seems to be a bug in IndexScanIterator.java.
The following SQL

scala> spark.sql("select L_ORDERKEY from lineitem where L_ORDERKEY < 10000000 order by l_orderkey").show

should print this result:

+----------+
|L_ORDERKEY|
+----------+
|         1|
|         1|
|         1|
|         1|
|         1|
|         1|
|         2|
|         3|
|         3|
|         3|
|         3|
|         3|
|         3|
|         4|
|         5|
|         5|
|         5|
|         6|
|         7|
|         7|
+----------+
only showing top 20 rows

But we got this instead:

+----------+
|L_ORDERKEY|
+----------+
|    499683|
|    499683|
|    499684|
|    499684|
|    499684|
|    499684|
|    499685|
|    499685|
|    499685|
|    499685|
|    499686|
|    499686|
|    499686|
|    499686|
|    499686|
|    499687|
|    499687|
|    499687|
|    499712|
|    499713|
+----------+
only showing top 20 rows

Plan:

spark.sql("select L_ORDERKEY from lineitem where L_ORDERKEY < 10000000 order by l_orderkey").explain

== Physical Plan ==
*Sort [l_orderkey#18L ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(l_orderkey#18L ASC NULLS FIRST, 200)
   +- TiDB CoprocessorRDD{[table: lineitem] [Index: primary] , Ranges: Start:[1], End: [1], Columns: [L_ORDERKEY], Filter: UnaryNot(IntIsNull([L_ORDERKEY]))}

Code in IndexScanIterator.java

  @Override
  public boolean hasNext() {
    try {
      if (rowIterator == null) {
        TiSession session = snapshot.getSession();
        while (handleIterator.hasNext()) {
          TLongArrayList handles = feedBatch();
          batchCount++;
          completionService.submit(() -> {
            List<RegionTask> tasks = RangeSplitter
                .newSplitter(session.getRegionManager())
                .splitHandlesByRegion(dagReq.getTableInfo().getId(), handles);
            return CoprocessIterator.getRowIterator(dagReq, tasks, session);
          });
        }
        while (batchCount > 0) {
          rowIterator = completionService.take().get();
          batchCount--;

          if (rowIterator.hasNext()) {
            return true;
          }
        }
      }
      if (rowIterator == null) {
        return false;
      }
    } catch (Exception e) {
      throw new TiClientInternalException("Error reading rows from handle", e);
    }
    return rowIterator.hasNext();
  }

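It seems rowIterator cannot retrieve all the results from completionService. Once hasNext() has returned true for the first non-empty iterator, rowIterator is no longer null, so rowIterator = completionService.take().get(); is never executed again. When that first iterator runs out of data, hasNext() returns false and the remaining batches are never read.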
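One possible way to address this is to keep taking completed batches whenever the current iterator is exhausted, instead of only while rowIterator is null. The following is a minimal, self-contained sketch of that idea and not the project's actual fix: BatchDrainingIterator and demo() are hypothetical names, and plain Iterator<Long> batches stand in for the row iterators returned by CoprocessIterator.getRowIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for IndexScanIterator: drains every submitted batch,
// not just the first non-empty one.
final class BatchDrainingIterator<T> implements Iterator<T> {
  private final CompletionService<Iterator<T>> completionService;
  private int pendingBatches;   // batches submitted but not yet taken
  private Iterator<T> current;  // batch currently being read, or null

  BatchDrainingIterator(CompletionService<Iterator<T>> completionService, int pendingBatches) {
    this.completionService = completionService;
    this.pendingBatches = pendingBatches;
  }

  @Override
  public boolean hasNext() {
    try {
      // Advance to the next completed batch whenever the current one is
      // missing or exhausted; stop only when no batches remain.
      while (current == null || !current.hasNext()) {
        if (pendingBatches == 0) {
          return false;
        }
        current = completionService.take().get();
        pendingBatches--;
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for a batch", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Error reading rows from batch", e);
    }
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }

  // Small demonstration: three batches, one of them empty, read to the end.
  static List<Long> demo() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      CompletionService<Iterator<Long>> service = new ExecutorCompletionService<>(pool);
      List<List<Long>> batches = new ArrayList<>();
      batches.add(Arrays.asList(1L, 2L));
      batches.add(new ArrayList<>());
      batches.add(Arrays.asList(3L, 4L, 5L));
      for (List<Long> batch : batches) {
        service.submit(batch::iterator);
      }
      Iterator<Long> rows = new BatchDrainingIterator<>(service, batches.size());
      List<Long> out = new ArrayList<>();
      while (rows.hasNext()) {
        out.add(rows.next());
      }
      Collections.sort(out);  // batch completion order is not deterministic
      return out;
    } finally {
      pool.shutdown();
    }
  }
}
```

In IndexScanIterator the same loop shape would presumably apply: submit the batches once, then let hasNext() fall through to completionService.take().get() each time rowIterator runs out while batchCount is still positive.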
