Open
319 commits
2aa35b6
lda result dir global and dir err hand
BaekTree Jan 10, 2020
8ef160a
prs del glob ->local
BaekTree Jan 10, 2020
6cb2175
LDA new object ok
BaekTree Jan 10, 2020
b075023
LDA ts update ok
BaekTree Jan 10, 2020
4937040
ready date cont if
BaekTree Jan 10, 2020
3fa621d
LDA contents
BaekTree Jan 10, 2020
1a7c01c
mecab update
BaekTree Jan 10, 2020
dee02b8
make tfidf structure
SongJinBeom Jan 10, 2020
7059ee0
Merge pull request #6 from BaekTree/master
SongJinBeom Jan 10, 2020
35c33f6
app.py dir bug fix
BaekTree Jan 10, 2020
d98a915
Merge pull request #7 from BaekTree/master
SongJinBeom Jan 10, 2020
e7dd79b
error1
SongJinBeom Jan 10, 2020
58f80e2
update
BaekTree Jan 10, 2020
29705ba
Merge branch 'master' of https://github.com/HGUISEL/TIBigdataMiddleware
BaekTree Jan 10, 2020
d6eb7b8
okt remove
BaekTree Jan 10, 2020
f85ceaa
okt -> mecab
BaekTree Jan 10, 2020
439dc4f
server host 0.0.0.0
BaekTree Jan 10, 2020
d0eada5
server host 0.0.0.0
BaekTree Jan 10, 2020
17eb4a1
Merge branch 'master' of https://github.com/BaekTree/TIBigdataMiddleware
BaekTree Jan 10, 2020
b6861de
mecab eunjeon only if windows
BaekTree Jan 10, 2020
dc5a031
mecab for linux : konlpy
BaekTree Jan 10, 2020
02b3c73
os system show
BaekTree Jan 10, 2020
ad2068c
LDA : model save : dir err fix
BaekTree Jan 11, 2020
9d40e30
LDA model files
BaekTree Jan 11, 2020
08284d8
clean comments
BaekTree Jan 12, 2020
4c58976
test cor and model save
BaekTree Jan 12, 2020
bd9bd09
print cur dir
BaekTree Jan 12, 2020
729de2a
remove unness path setting
BaekTree Jan 12, 2020
65a823b
error except topic index temp broker
BaekTree Jan 12, 2020
971c75a
error temp again
BaekTree Jan 12, 2020
8e7c9b6
partial merge from master : os branch eunjeon
BaekTree Jan 12, 2020
915c2f9
clean comments
BaekTree Jan 12, 2020
3b205bf
LDA output commet
BaekTree Jan 12, 2020
bf41bcd
Merge branch 'master' into ldaResult
BaekTree Jan 12, 2020
0de85e7
print save log
BaekTree Jan 12, 2020
0b872cc
lda.save
BaekTree Jan 12, 2020
db48bca
linux dir sign
BaekTree Jan 13, 2020
12b86f5
lda model file miss fix
BaekTree Jan 13, 2020
685b22f
os name comment
BaekTree Jan 13, 2020
3a39f79
apply mecab
SongJinBeom Jan 14, 2020
e56bc37
color and stable
SongJinBeom Jan 20, 2020
1355910
dir bug fix
BaekTree Jan 20, 2020
5ed8990
lda coh
BaekTree Jan 20, 2020
f6bcfb3
add lda coh test
BaekTree Jan 20, 2020
feb5f78
data 548
BaekTree Jan 20, 2020
4aad2e0
lda coh : 548
BaekTree Jan 20, 2020
660c1ce
pandas
BaekTree Jan 20, 2020
c1310f4
tuning result
Jan 21, 2020
1a86a99
prs:load data argument
BaekTree Jan 22, 2020
a05e8d9
cos similar py
BaekTree Jan 22, 2020
e64d791
cos sim ok with index
BaekTree Jan 22, 2020
4297f2d
tfidf documentation
SongJinBeom Jan 22, 2020
9c0ddc5
print top 5 list
BaekTree Jan 23, 2020
cf6a371
Merge pull request #8 from SongJinBeom/master
BaekTree Jan 23, 2020
d9b1867
load data independently possible
BaekTree Jan 23, 2020
f77179a
update
BaekTree Jan 23, 2020
f8319cb
comment delete
BaekTree Jan 23, 2020
70dcf98
id unit
BaekTree Jan 23, 2020
37bb1e7
LDA_model move to ldaResult branch
BaekTree Jan 23, 2020
22c5012
remove cache files
BaekTree Jan 23, 2020
8bddd78
Merge branch 'master' of https://github.com/HGUISEL/TIBigdataMiddleware
BaekTree Jan 23, 2020
f681437
rm unness files
BaekTree Jan 23, 2020
91cf964
add main cosSim
BaekTree Jan 23, 2020
4ac16d4
add app.py cosSim
BaekTree Jan 23, 2020
6b77db7
console if rcmd func ok
BaekTree Jan 27, 2020
c665b98
Merge branch 'master' into recom
BaekTree Jan 27, 2020
caf98f1
update
BaekTree Jan 27, 2020
f816e8b
logic err fix : index till 4
BaekTree Jan 27, 2020
62a811a
remove git repo labs dir
BaekTree Jan 27, 2020
d9ba960
remove unness files
BaekTree Jan 27, 2020
a8697a1
update
BaekTree Jan 28, 2020
613b63e
option re-calc again or not
BaekTree Jan 28, 2020
4c88a77
Merge branch 'recom'
BaekTree Jan 28, 2020
650f58f
app.py lda name
BaekTree Jan 28, 2020
55fa5d2
cosSim folder and data
BaekTree Jan 28, 2020
63ba4c5
cos Sim comment
BaekTree Jan 28, 2020
aa9ccb6
recommand function console
BaekTree Jan 28, 2020
28c9831
debug :(
BaekTree Jan 28, 2020
158a07d
rcmd module and package
BaekTree Jan 28, 2020
93425be
Merge pull request #9 from BaekTree/master
SongJinBeom Jan 30, 2020
e028e4b
update
BaekTree Feb 2, 2020
f3f4a5b
save func console update
BaekTree Feb 2, 2020
b0962ce
new raw file 620
BaekTree Feb 2, 2020
b83472c
lda helper folder
BaekTree Feb 2, 2020
c6fde44
raw raw data from backend
BaekTree Feb 9, 2020
30797d1
prs bug fix
BaekTree Feb 14, 2020
f8a1cc8
up
BaekTree Feb 15, 2020
463bdc3
Created using Colaboratory
BaekTree Feb 15, 2020
2028ca1
Created using Colaboratory
BaekTree Feb 15, 2020
27145a9
Created using Colaboratory
BaekTree Feb 15, 2020
fa9d440
Created using Colaboratory
BaekTree Feb 15, 2020
8aeab1e
Merge pull request #10 from BaekTree/master
BaekTree Feb 17, 2020
ab697e4
rcmd update
BaekTree Feb 18, 2020
0f1c52f
up
BaekTree Feb 19, 2020
1ba6e12
rcmd speed up
BaekTree Feb 19, 2020
22eb339
fix err
BaekTree Feb 19, 2020
920efa9
up
BaekTree Feb 19, 2020
1c44b78
time
BaekTree Feb 19, 2020
f1cd44b
start time add
BaekTree Feb 19, 2020
2fb6cf4
up
BaekTree Feb 19, 2020
011a8d8
load static file always
BaekTree Feb 19, 2020
2d6f2ca
up
BaekTree Feb 19, 2020
7cb5c29
wdrk py
BaekTree Feb 19, 2020
1cc6275
topic keys
BaekTree Feb 19, 2020
ba2f192
port 808
BaekTree Apr 19, 2020
05285a1
es port 9200
BaekTree Apr 21, 2020
e255b7b
Merge branch 'master' of https://github.com/HGUISEL/TIBigdataMiddleware
BaekTree Apr 21, 2020
37027cb
Merge pull request #11 from BaekTree/master
BaekTree Apr 21, 2020
3b79942
create user history sample data
BaekTree Apr 27, 2020
d381e04
Merge pull request #13 from BaekTree/master
BaekTree Apr 30, 2020
825ce86
es put 620 data
BaekTree May 14, 2020
80df1b7
use bulk function and raw raw data
BaekTree May 14, 2020
064dc26
create user history sample data
BaekTree Apr 27, 2020
623642b
check ip and adjust ip address
BaekTree May 15, 2020
bc66f0b
Merge branch 'master' into baek
BaekTree May 15, 2020
5c98b12
clean comment
BaekTree May 18, 2020
4d19030
move tfidf static file to mongodb
BaekTree May 18, 2020
ae2bcb3
migration to nt server adjustion
BaekTree May 19, 2020
6e035a8
Merge pull request #14 from BaekTree/master
BaekTree May 19, 2020
1a43a86
history samples
BaekTree May 31, 2020
9e84dc7
history samples
BaekTree May 31, 2020
d1851ec
lab update
BaekTree May 31, 2020
6f244db
lab update
BaekTree May 31, 2020
58bc13f
file location
BaekTree May 31, 2020
3d91e5e
file location
BaekTree May 31, 2020
889ec4b
es ip
BaekTree Jun 9, 2020
1fc703e
es ip
BaekTree Jun 9, 2020
23a1866
Merge branch 'serverEnvTest' of https://github.com/HGUISEL/TIBigdataM…
Jun 15, 2020
e7746c5
Merge branch 'serverEnvTest' of https://github.com/HGUISEL/TIBigdataM…
Jun 15, 2020
18f2f5e
ignore and bug fix
BaekTree Jun 18, 2020
fa78120
ignore and bug fix
BaekTree Jun 18, 2020
04692eb
wordrank fix upto high top 3
BaekTree Jun 22, 2020
2b2803e
wordrank fix upto high top 3
BaekTree Jun 22, 2020
e9ff562
tfidf main
BaekTree Jun 29, 2020
bcac9a2
tfidf main
BaekTree Jun 29, 2020
aae44a3
esLarger and tweak
BaekTree Jun 29, 2020
96187c0
esLarger and tweak
BaekTree Jun 29, 2020
9185d07
esLarger and tweak
BaekTree Jun 29, 2020
170372b
esLarger and tweak
BaekTree Jun 29, 2020
8167e6e
fix
BaekTree Jun 29, 2020
26ff692
fix
BaekTree Jun 29, 2020
9b680a5
fix
BaekTree Jun 29, 2020
74e10f6
fix
BaekTree Jun 29, 2020
d198e76
fix
BaekTree Jun 29, 2020
178a83a
fix
BaekTree Jun 29, 2020
cb67f7b
rcmd
BaekTree Jun 30, 2020
eea2fe8
rcmd
BaekTree Jun 30, 2020
0914802
update
BaekTree Jul 27, 2020
0912a60
update
BaekTree Jul 27, 2020
8636100
server side update
Jul 27, 2020
5a61a41
server side update
Jul 27, 2020
f25e849
catch up server
BaekTree Jul 27, 2020
3884e65
catch up server
BaekTree Jul 27, 2020
c0646c5
update
BaekTree Aug 11, 2020
34e4b61
update
BaekTree Aug 11, 2020
df53c5e
lstm dummy var file
BaekTree Aug 11, 2020
41632e4
lstm dummy var file
BaekTree Aug 11, 2020
64b9167
bug fix
BaekTree Aug 11, 2020
003cdeb
bug fix
BaekTree Aug 11, 2020
2813b21
download tokenizer fro ma colab
BaekTree Aug 11, 2020
2e7a72f
download tokenizer fro ma colab
BaekTree Aug 11, 2020
a448e7c
update
Aug 11, 2020
98c9d39
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
Aug 11, 2020
8be3d59
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
Aug 11, 2020
7d5336a
first commit
Aug 11, 2020
1998bac
first commit
Aug 11, 2020
f5896ba
XXXXXXXXX:O
BaekTree Aug 11, 2020
6be067c
update
Aug 11, 2020
0a98e03
XXXXXXXXX:O
BaekTree Aug 11, 2020
abb9d4a
XXXXXXXXX:O
BaekTree Aug 11, 2020
eb6a2a8
update
BaekTree Aug 11, 2020
56b9ef8
update
BaekTree Aug 11, 2020
2e47e67
first commit
BaekTree Aug 11, 2020
0ecc70b
first commit
BaekTree Aug 11, 2020
5f71849
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 11, 2020
bb6beaf
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 11, 2020
b87dd99
lstm 1000
BaekTree Aug 11, 2020
b10ad05
lstm 1000
BaekTree Aug 11, 2020
ce66c1e
fix
BaekTree Aug 11, 2020
636fe0b
data analysis files from 1000 tfidf and rcmd
BaekTree Aug 11, 2020
f7637e6
data analysis files from 1000 tfidf and rcmd
BaekTree Aug 11, 2020
498a9fd
update
BaekTree Aug 11, 2020
d47bd99
update
BaekTree Aug 11, 2020
3fb5da0
refactoring
BaekTree Aug 11, 2020
62bd242
refactoring
BaekTree Aug 11, 2020
2055ef1
update
BaekTree Aug 11, 2020
b0ed4d2
update
BaekTree Aug 11, 2020
db0aca0
rcmd in fix
BaekTree Aug 12, 2020
28091d9
rcmd in fix
BaekTree Aug 12, 2020
f149d28
save latest tokenized doc after prs
BaekTree Aug 12, 2020
89126d2
model 10000
BaekTree Aug 14, 2020
9d85001
model files
BaekTree Aug 14, 2020
4370af7
rcmd update
BaekTree Aug 16, 2020
c6a5a82
rcmd update
BaekTree Aug 16, 2020
6abdc25
merge with esLargeData
BaekTree Aug 16, 2020
99443b4
merge with esLargeData
BaekTree Aug 16, 2020
48d79a9
Merge branch 'esLargeInt' of https://github.com/BaekTree/TIBigdataMid…
BaekTree Aug 16, 2020
57f3ee0
Merge branch 'esLargeInt' of https://github.com/BaekTree/TIBigdataMid…
BaekTree Aug 16, 2020
cae72ec
Merge branch 'esLargeData' into tenThousands
BaekTree Aug 16, 2020
721c598
Merge branch 'esLargeData' into tenThousands
BaekTree Aug 16, 2020
0055d3a
rm large size file
BaekTree Aug 16, 2020
fa12976
rm large size file
BaekTree Aug 16, 2020
76c6cbc
large file uploaded
BaekTree Aug 16, 2020
ee4e099
large file uploaded
BaekTree Aug 16, 2020
a687f56
merge with remote esLargeData
BaekTree Aug 16, 2020
67c8d71
merge with remote esLargeData
BaekTree Aug 16, 2020
1d3e291
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 16, 2020
b61979a
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 16, 2020
6e961e4
deleted large files
BaekTree Aug 16, 2020
326acc6
deleted large files
BaekTree Aug 16, 2020
d26d334
new file
BaekTree Aug 16, 2020
c1f8be4
new file
BaekTree Aug 16, 2020
0942e23
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 16, 2020
ea47e7a
Merge branch 'esLargeData' of https://github.com/BaekTree/TIBigdataMi…
BaekTree Aug 16, 2020
a7b85be
update
BaekTree Aug 16, 2020
173ed2a
update
BaekTree Aug 16, 2020
3fcdf16
update tfidf
BaekTree Aug 16, 2020
8bff302
update tfidf
BaekTree Aug 16, 2020
491ae66
merge
BaekTree Aug 16, 2020
18ea6fc
merge
BaekTree Aug 16, 2020
6b90fc8
Merge branch 'esLargeData' into HEAD
BaekTree Aug 16, 2020
a435ecf
Merge branch 'esLargeData' into HEAD
BaekTree Aug 16, 2020
1fdc875
frontend es local
BaekTree Sep 19, 2020
ec86bb7
frontend es local
BaekTree Sep 19, 2020
7d3912e
change es index to 3000
BaekTree Sep 28, 2020
65bcf07
change es index to 3000
BaekTree Sep 28, 2020
5ab8cec
success saving related docs data to mongo with 30 documents
BaekTree Sep 28, 2020
f03b86d
success saving related docs data to mongo with 30 documents
BaekTree Sep 28, 2020
8349a47
rcmd test with 620 data works
BaekTree Sep 28, 2020
73ce17b
rcmd test with 620 data works
BaekTree Sep 28, 2020
6e11a09
related docs feature works with 3000 data
BaekTree Sep 28, 2020
2203b09
related docs feature works with 3000 data
BaekTree Sep 28, 2020
7bb9149
keywords works with docs 3000
BaekTree Sep 29, 2020
a6fa3f5
keywords works with docs 3000
BaekTree Sep 29, 2020
ffe2bb4
prs.py save automatically the lastest prs result in /middleware/lates…
BaekTree Sep 29, 2020
fd0dd09
prs.py save automatically the lastest prs result in /middleware/lates…
BaekTree Sep 29, 2020
392fa45
prs.loadData function saves automatically its latest loadData result
BaekTree Sep 29, 2020
0c9140a
prs.loadData function saves automatically its latest loadData result
BaekTree Sep 29, 2020
7537d28
related docs function works with 3000 docs
BaekTree Sep 29, 2020
a007208
related docs function works with 3000 docs
BaekTree Sep 29, 2020
c7de568
docs topic distribution works with 3000 docs
BaekTree Sep 29, 2020
849ffa6
docs topic distribution works with 3000 docs
BaekTree Sep 29, 2020
5a23462
remove files and change names
BaekTree Sep 29, 2020
181617a
esFunc clean
BaekTree Sep 29, 2020
2df3564
prs clean
BaekTree Sep 29, 2020
686faae
final
BaekTree Oct 6, 2020
5720f11
Merge branch 'master' of https://github.com/BaekTree/TIBigdataMiddleware
BaekTree Oct 6, 2020
75fbbcb
Merge branch 'master' of https://github.com/BaekTree/TIBigdataMiddleware
BaekTree Oct 6, 2020
5b72e48
Merge branch 'master' of https://github.com/BaekTree/TIBigdataMiddleware
BaekTree Oct 6, 2020
6d9937e
readme
BaekTree Oct 6, 2020
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
**/*.pyc
__pycache__
148 changes: 148 additions & 0 deletions Labs/elasticsearch regrex and filter/esRegFilter.py
@@ -0,0 +1,148 @@
from elasticsearch import Elasticsearch
import json

DB_URL = "http://203.252.112.14:9200/" + "nkdb200810"
es = Elasticsearch(DB_URL)
# es = Elasticsearch(timeout=30)

body = {}

"""
Check how many documents have a post_date field
"""
if False:
    body["query"] = {"exists": {"field": "post_date"}}

    #count
    res_isDate = es.count(body)
    body["query"] = {"match_all": {}}
    res_all = es.count(body)
    print(str(res_isDate["count"]) + " docs have post_date field out of " + str(res_all["count"]))

"""
nkdb200810
documents with post_date: 898 out of 9??
documents without:

nkdb200811
documents with post_date: 9999 out of 13???
"""




"""
Try applying a regex
"""
if False:
    # body = {"query" : {"regexp" : {"post_date" : {"[0-9]{4}-[0-9]{2}-[0-9]{2}",}}}}
    # body["query"] = {"regexp": {"post_date": {"value" : "[0-9]{4}-[0-9]{2}-[0-9]{2}",}}}
    body["query"] = {"regexp": {"post_date": {"value" : "[0-9]{4}",}}}
    body["_source"] = ["post_date"]
    body["size"] = 1000
    # body += json.dumps(query)


    """
    Actually print out post_date.
    Then try applying the regex here.
    Bring over Sangwon's logic from app
    """

    #search
    res = es.search(body)["hits"]["hits"]
    # print(res[0])
    # # count_s = 0
    count = 0
    for doc in res:
        data = doc["_source"]["post_date"]
        print(data)
        try:
            print(data)
            # count += 1
            # if count > 10:
            # break
        except:
            count += 1
            print("ERROR! count : ", count)
        # # print(type(date))
    print("error count : ", count)
    # # t = type(date)
    # # if date == "string":



"""
Try using an Elasticsearch aggregation to combine the results
"""
if False:
    aggs1 = {"regexp": {"post_date": {"value" : "[0-9]{4}",}}}

    body["aggs"] = {
        "reg" : aggs1
    }
    body["size"] = 1000




    res = es.search(body)
    print(res)
    # res = es.search(body)["hits"]["hits"]
    # count = 0
    # for doc in res:
    # data = doc["_source"]["post_date"]
    # print(data)
    # try:
    # print(data)
    # # count += 1
    # # if count > 10:
    # # break
    # except:
    # count += 1
    # print("ERROR! count : ", count)
    # # # print(type(date))
    # print("error count : ", count)



"""
Try applying a regex with ...
"""
if False:
    # body = {"query" : {"regexp" : {"post_date" : {"[0-9]{4}-[0-9]{2}-[0-9]{2}",}}}}
    # body["query"] = {"regexp": {"post_date": {"value" : "[0-9]{4}-[0-9]{2}-[0-9]{2}",}}}
    body["query"] = {"regexp": {"post_date": {"value" : "[0-9]{4}",}}}
    body["_source"] = ["post_date"]
    body["size"] = 1000
    # body += json.dumps(query)


    """
    Actually print out post_date.
    Then try applying the regex here.
    Bring over Sangwon's logic from app
    """

    #search
    res = es.search(body)["hits"]["hits"]
    # print(res[0])
    # # count_s = 0
    count = 0
    for doc in res:
        data = doc["_source"]["post_date"]
        print(data)
        try:
            print(data)
            # count += 1
            # if count > 10:
            # break
        except:
            count += 1
            print("ERROR! count : ", count)
        # # print(type(date))
    print("error count : ", count)
    # # t = type(date)
    # # if date == "string":
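The experiments in this file build each request body by hand. A minimal sketch of the same bodies as small helper functions follows, so they can be inspected without a running cluster. The helper names (`exists_query`, `regexp_query`, `filter_agg`) are hypothetical and not part of the elasticsearch client. Two assumptions are worth flagging: Elasticsearch `regexp` patterns match a whole term rather than a substring, so what `[0-9]{4}` matches depends on how `post_date` is mapped; and `regexp` is a query type, so the sketch wraps it in a `filter` aggregation instead of using it as an aggregation directly.

```python
import json


def exists_query(field):
    """Body matching documents that have a value for `field`."""
    return {"query": {"exists": {"field": field}}}


def regexp_query(field, pattern, size=1000):
    """Body returning only `field` for documents with a term matching `pattern`."""
    return {
        "query": {"regexp": {field: {"value": pattern}}},
        "_source": [field],
        "size": size,
    }


def filter_agg(name, field, pattern):
    """Body counting regexp matches through a `filter` aggregation (no hits returned)."""
    return {
        "size": 0,
        "aggs": {name: {"filter": {"regexp": {field: {"value": pattern}}}}},
    }


if __name__ == "__main__":
    # Print one body to check its shape before sending it anywhere.
    print(json.dumps(regexp_query("post_date", "[0-9]{4}-[0-9]{2}-[0-9]{2}"), indent=2))
```

Assuming a client version that still accepts a `body` argument, these would be passed as `es.count(body=exists_query("post_date"))` or `es.search(body=regexp_query("post_date", "[0-9]{4}"))`.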


24 changes: 13 additions & 11 deletions Labs/postElasticSearch/postES.py
@@ -1,19 +1,19 @@
import json

FILE_DIR = "../../raw data sample/rawrawData.json"
FILE_DIR = "../../raw data sample/rawRawData.json"

with open(FILE_DIR,'r', encoding="utf-8") as fp:
    esData = json.load(fp)

meaningful_data = esData['hits']['hits']

indexName = "frontend_test"
body = ""

for i,d in enumerate(meaningful_data):
body += json.dumps({'index' :
{
'_index' : d['_index'],
'_type' : d['_type'],
'_index' : indexName,
'_type' : indexName,
'_id' : d['_id']

}
@@ -36,14 +36,16 @@
print(body)

from elasticsearch import Elasticsearch
import socket
def get_ip_address():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.connect(("8.8.8.8", 80))
    return s.getsockname()[0]

#function that find current ip address
# import socket
# def get_ip_address():
#     s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#     s.connect(("8.8.8.8", 80))
#     return s.getsockname()[0]



DB_URL = "http://localhost:9200/nkdb"
DB_URL = "localhost:9200/"+indexName
es = Elasticsearch(DB_URL)
es.bulk(body)
es.bulk(body)
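The middle of the loop that fills `body` is collapsed in this diff, so its exact shape is not visible. A minimal sketch of how a bulk body is usually assembled follows, assuming each hit carries its document under `_source`: one action line and one source line per document, each terminated by a newline. The function name `build_bulk_body` and the sample field `post_title` are made up for illustration. `_type` is left out here because mapping types are deprecated in Elasticsearch 7, whereas the script in the diff still sets it.

```python
import json


def build_bulk_body(hits, index_name):
    """Return an NDJSON bulk body: one action line and one source line per hit."""
    lines = []
    for hit in hits:
        action = {"index": {"_index": index_name, "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(hit["_source"], ensure_ascii=False))
    # The bulk API expects every line, including the last one, to end with "\n".
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical sample hits; real ones would come from esData['hits']['hits'].
    sample_hits = [
        {"_id": "1", "_source": {"post_title": "a"}},
        {"_id": "2", "_source": {"post_title": "b"}},
    ]
    print(build_bulk_body(sample_hits, "frontend_test"), end="")
```

Keeping the body construction in a pure function makes it possible to check the output locally before calling `es.bulk`.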
2 changes: 1 addition & 1 deletion Labs/sample user history/history.py
@@ -7,7 +7,7 @@
import sys
sys.path.append(str(homeDir))

from common import cmm
from common import config
from common import esFunc
from common import prs

2 changes: 1 addition & 1 deletion Labs/sample user history/tokened_history.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions Labs/save static to mongo/tfidf2Mng.py
@@ -1,12 +1,12 @@
import json
TFIDF_DATA_DIR = "../../../TIBigdataFE/src/assets/entire_tfidf/data.json"
TFIDF_DATA_DIR = "./tfidfTotaldata.json"

with open(TFIDF_DATA_DIR,'r', encoding="utf-8") as fp:
    tfidfData = json.load(fp)

import pymongo
from pymongo import MongoClient
client = MongoClient('localhost',27017)
db = client.analysis
db = client.analysis0919
collection = db.tfidf
collection.insert_many(tfidfData)