COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks

## short summary
Proposes COTA v1 (feature-engineering based) and COTA v2 (deep-learning based), two frameworks for making customer support more efficient.

The customer support workflow at Uber consists of two steps:
- contact type identification
  - what the user's ticket is about (wrong food delivered, trip cancellation, etc.)
- reply template selection
  - choosing the appropriate template from a large set of reply templates

Features consulted when deciding the contact type and reply template:
- ticket message
- ticket metadata (time, product type (Uber Eats, UberX, etc.), etc.)
- user information (user type (driver, rider or eater), etc.)
- trip information (city, trip status, arrival time, etc.)

### COTA v1
<img width="455" alt="スクリーンショット 2019-05-02 4 19 02" src="https://user-images.githubusercontent.com/17867677/57037274-b1c44b00-6c91-11e9-87cf-57bdc5a9820d.png">

Ticket messages are processed with tf-idf and LSA.
Using these vectors directly as features, however, causes problems because of their high dimensionality.
→ feature engineering with cosine similarity

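For each contact type and reply template, a bag-of-words representation (p_i) is built from past tickets.
The cosine similarity between p_i and the vector representation of a new ticket (t_j) then gives the fit between class i and ticket j.
Both tf-idf and LSA are used to compute the vector representations.
This procedure greatly reduces the dimensionality.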
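The cosine-similarity feature can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the paper's implementation: the toy history, the whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions, and only the tf-idf variant is shown (the LSA variant would apply the same cosine step to lower-dimensional vectors).

```python
import math
from collections import Counter

# Hypothetical toy history: contact type -> past ticket messages.
HISTORY = {
    "wrong_food_delivered": ["my order had the wrong food", "wrong food was delivered"],
    "trip_cancellation": ["driver cancelled my trip", "charged a fee for a cancelled trip"],
}

def tokenize(text):
    return text.lower().split()

def build_profiles(history):
    """Pool the past tickets of each class into one bag of words (p_i)."""
    return {
        label: Counter(tok for msg in msgs for tok in tokenize(msg))
        for label, msgs in history.items()
    }

def idf_weights(profiles):
    """Smoothed IDF, treating each class profile as one document (an assumption)."""
    n = len(profiles)
    df = Counter(tok for bag in profiles.values() for tok in bag)
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf(bag, idf):
    # Tokens never seen in the history have no IDF weight and are dropped.
    return {tok: tf * idf[tok] for tok, tf in bag.items() if tok in idf}

def cosine(u, v):
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_features(message, profiles, idf):
    """One scalar per class: cosine between the class profile p_i and the ticket t_j."""
    t_j = tfidf(Counter(tokenize(message)), idf)
    return {label: cosine(tfidf(bag, idf), t_j) for label, bag in profiles.items()}

profiles = build_profiles(HISTORY)
idf = idf_weights(profiles)
features = similarity_features("the driver cancelled the trip", profiles, idf)
```

Each ticket-class pair is now described by a single number per representation instead of a vocabulary-sized vector, which is where the dimensionality reduction comes from.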

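As for training, instead of multi-class classification, each ticket-class pair is classified as 0/1 by a binary classifier that combines the cosine-similarity features with the other features; the classes are then sorted by score and the top ones are recommended as candidates (pointwise ranking).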
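A rough sketch of the pointwise ranking step: every ticket-class pair is scored independently, then the classes are sorted by score. The logistic scorer, its hand-set weights, and the feature names below are hypothetical stand-ins for whatever trained binary classifier and features are actually used.

```python
import math

# Hypothetical weights of an already-trained binary scorer; real values would be learned.
WEIGHTS = {"cos_tfidf": 3.0, "cos_lsa": 2.0, "product_match": 1.0}
BIAS = -2.5

def score_pair(features):
    """Probability that this ticket-class pair is a correct match (logistic model)."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def rank_classes(pair_features, top_k=3):
    """Score each candidate class on its own, then keep the top_k by score."""
    scored = [(label, score_pair(feats)) for label, feats in pair_features.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical features for one ticket paired with three candidate contact types.
candidates = {
    "trip_cancellation": {"cos_tfidf": 0.8, "cos_lsa": 0.7, "product_match": 1.0},
    "wrong_food_delivered": {"cos_tfidf": 0.1, "cos_lsa": 0.2, "product_match": 0.0},
    "fare_adjustment": {"cos_tfidf": 0.4, "cos_lsa": 0.5, "product_match": 1.0},
}
top = rank_classes(candidates, top_k=2)
```

Because the scorer only ever sees one pair at a time, the number of classes does not affect the model's input size; it only affects how many pairs have to be scored per ticket.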

### COTA v2
<img width="582" alt="スクリーンショット 2019-05-02 4 52 42" src="https://user-images.githubusercontent.com/17867677/57038946-31ecaf80-6c96-11e9-80fd-78bce8718ed7.png">

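Because metadata and other features are considered in addition to the text, a wide-and-deep style approach is taken.
Each feature is processed by an encoder, the encoder outputs are joined by a combiner, and a decoder produces the output.
Separate decoders are used for contact type identification and reply template selection, and the model is trained with multi-task learning.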

#### Embedding prior knowledge into the model
<img width="368" alt="スクリーンショット 2019-05-02 5 02 31" src="https://user-images.githubusercontent.com/17867677/57039479-aecc5900-6c97-11e9-9f05-7a1e74024d78.png">

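Contact types are defined hierarchically.
An RNN is therefore used to predict a contact type as a path in the tree, with the aim of improving accuracy and of making the remaining errors more plausible. (Confusing a contact type with a distant node in the tree costs the operator a different amount of work than confusing it with a nearby one.)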
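To make the "path in the tree" idea concrete, the sketch below turns a contact type into its root-to-node path (the kind of target sequence a sequence decoder would emit) and measures how far apart two contact types are in the tree. The tree itself and the helper names are hypothetical; the actual hierarchy is not given in this note.

```python
# Hypothetical contact-type tree as child -> parent; None marks a top-level node.
PARENT = {
    "rider": None,
    "eater": None,
    "trip_issue": "rider",
    "trip_cancellation": "trip_issue",
    "fare_adjustment": "trip_issue",
    "order_issue": "eater",
    "wrong_food_delivered": "order_issue",
}

def path_to(node):
    """Root-to-node path, i.e. the label sequence a path-predicting decoder would output."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]

def tree_distance(a, b):
    """Edges between two nodes, treating top-level nodes as children of a virtual root."""
    path_a, path_b = path_to(a), path_to(b)
    shared = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        shared += 1
    return (len(path_a) - shared) + (len(path_b) - shared)

near = tree_distance("trip_cancellation", "fare_adjustment")
far = tree_distance("trip_cancellation", "wrong_food_delivered")
```

With this toy tree, a mistake between two siblings under `trip_issue` is much closer than a mistake that crosses from the rider subtree to the eater subtree, which is the distinction the parenthetical about operator effort is drawing.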

<img width="382" alt="スクリーンショット 2019-05-02 5 07 28" src="https://user-images.githubusercontent.com/17867677/57039656-36b26300-6c98-11e9-83e3-1de3b80389b8.png">
In this setting, reply template selection depends heavily on the contact type, so the dependency is expressed by adding the output of one decoder to the input of the other.
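A minimal sketch of such a decoder dependency, using plain Python lists instead of a deep learning framework. Everything here is an assumption made for illustration: the layer sizes, the random untrained weights, and in particular the choice to pass the contact-type probability distribution (rather than, say, a hidden state) into the reply template decoder.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def random_layer(n_out, n_in, rng):
    """Untrained layer with small random weights, for shape illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
HIDDEN, N_TYPES, N_TEMPLATES = 4, 3, 5  # hypothetical sizes

type_decoder = random_layer(N_TYPES, HIDDEN, rng)
# The template decoder sees the combiner output plus the contact-type distribution.
template_decoder = random_layer(N_TEMPLATES, HIDDEN + N_TYPES, rng)

def predict(combined):
    type_probs = softmax(linear(*type_decoder, combined))
    template_probs = softmax(linear(*template_decoder, combined + type_probs))
    return type_probs, template_probs

combined = [0.2, -0.1, 0.7, 0.4]  # stand-in for the combiner output
type_probs, template_probs = predict(combined)
```

The only structural point is the concatenation `combined + type_probs`: the template decoder's input is wider than the contact type decoder's by exactly the number of contact types.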

### Results
<img width="388" alt="スクリーンショット 2019-05-02 5 10 59" src="https://user-images.githubusercontent.com/17867677/57039857-c0623080-6c98-11e9-8c90-5e461f4598aa.png">

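In COTA v1, introducing feature engineering + ranking raises accuracy substantially.
In COTA v2, introducing the sequence decoder (which predicts the path in the contact type tree) improves accuracy. Moreover, under the setting where the parent node of the correct node is counted as correct, the gap over ordinary softmax multi-class classification widens further, which indicates that the model makes more reasonable errors.
Adding the dependency between the contact type and reply selection decoders also improves accuracy.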
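The relaxed evaluation is not spelled out in this note. One possible reading, sketched below purely as an assumption, is to compare root-to-node paths truncated to a given depth, so that a prediction landing under the same parent as the gold label still counts. The paths, the two toy prediction lists, and the function name are all hypothetical.

```python
# Hypothetical root-to-node paths for a few contact types.
PATHS = {
    "trip_cancellation": ("rider", "trip_issue", "trip_cancellation"),
    "fare_adjustment": ("rider", "trip_issue", "fare_adjustment"),
    "wrong_food_delivered": ("eater", "order_issue", "wrong_food_delivered"),
}

def accuracy_at_depth(gold, predicted, depth=None):
    """Fraction of predictions whose path matches the gold path up to `depth` nodes.

    depth=None compares full paths, which is ordinary exact-match accuracy.
    """
    hits = sum(PATHS[g][:depth] == PATHS[p][:depth] for g, p in zip(gold, predicted))
    return hits / len(gold)

gold = ["trip_cancellation", "trip_cancellation", "wrong_food_delivered", "fare_adjustment"]
# Toy model whose errors stay under the correct parent node.
near_miss_model = ["trip_cancellation", "fare_adjustment", "wrong_food_delivered", "trip_cancellation"]
# Toy model whose errors jump to a different subtree.
far_miss_model = ["trip_cancellation", "wrong_food_delivered", "wrong_food_delivered", "wrong_food_delivered"]

exact_near = accuracy_at_depth(gold, near_miss_model)
exact_far = accuracy_at_depth(gold, far_miss_model)
parent_near = accuracy_at_depth(gold, near_miss_model, depth=2)
parent_far = accuracy_at_depth(gold, far_miss_model, depth=2)
```

In this toy data both models have the same exact-match accuracy, but only the one whose mistakes stay close in the tree gains under the relaxed metric. A widening gap under such a metric is the kind of evidence that would support "more reasonable errors".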

v2 is more accurate than v1.

## author
Piero Molino
Uber AI Labs
San Francisco, California
piero@uber.com
Huaixiu Zheng
Uber Technologies
San Francisco, California
huaixiu.zheng@uber.com
Yi-Chia Wang
Uber Technologies
San Francisco, California
yichia.wang@uber.com

## URL
https://arxiv.org/pdf/1807.01337.pdf
https://eng.uber.com/cota-v2/
The blog post also covers deployment in production.

## year
KDD2018
