Project dependencies may have API risk issues #13
Description
Hi. In NeuralDB, inappropriate dependency version constraints can introduce risks.
Below are the dependencies and version constraints that the project is using:
tqdm
pymongo
numpy
The version constraint == introduces a risk of dependency conflicts, because the allowed dependency range is too strict.
The version constraints "no upper bound" and * introduce a risk of missing-API errors, because the latest versions of the dependencies may remove some APIs.
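To make the bounded-range idea concrete, here is a minimal sketch of checking whether an installed version falls inside such a range. The helper names are hypothetical; real tooling would use `packaging.specifiers.SpecifierSet` rather than hand-rolled tuple comparison.

```python
def version_tuple(v):
    # Convert a dotted version string like "4.42.0" into a comparable tuple.
    return tuple(int(part) for part in v.split("."))

def in_range(installed, lower, upper):
    # True when installed satisfies >=lower,<=upper (numeric versions only;
    # pre-release tags such as "4.42.0rc1" are not handled in this sketch).
    return version_tuple(lower) <= version_tuple(installed) <= version_tuple(upper)

print(in_range("4.50.2", "4.42.0", "4.64.0"))  # → True
print(in_range("4.65.0", "4.42.0", "4.64.0"))  # → False
```

A version inside the suggested bounds passes; anything newer than the tested upper bound is rejected, which is exactly what an unbounded constraint fails to guarantee.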
After further analysis of this project:
The version constraint of dependency tqdm can be changed to >=4.42.0,<=4.64.0.
The version constraint of dependency pymongo can be changed to >=2.9,<=4.1.1.
These modifications reduce the chance of dependency conflicts as much as possible,
while admitting the latest versions that do not trigger API errors in the project.
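Applied to the project's packaging metadata, the suggested bounds might look like the following sketch. The package name and metadata here are placeholders, not the project's actual setup.py.

```python
# Sketch of bounded version constraints in setup.py
# (hypothetical package name; bounds taken from the suggestion above).
from setuptools import setup, find_packages

setup(
    name="neuraldb",  # placeholder
    packages=find_packages(),
    install_requires=[
        "tqdm>=4.42.0,<=4.64.0",
        "pymongo>=2.9,<=4.1.1",
        "numpy",  # unchanged; no constraint suggested for numpy
    ],
)
```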
The project invokes all of the following methods.
The calling methods from tqdm:
itertools.product, tqdm.tqdm
The calling methods from pymongo:
pymongo.MongoClient, pymongo.UpdateOne
The calling methods from all methods:
loaded.extend global_obs.append search_key.split qh2_filtered.append r.split json.load.keys pathlib.Path transformers.integrations.deepspeed_init special_tokens.extend instance.strip.rstrip write_updates os.environ.get collections.defaultdict.items torch.utils.data.DataLoader self.context_tokenizer.items get_longest self.RankArgs fact.split.detok.detokenize.replace.replace.replace pathlib.Path.exists s.startswith try_recovery collections.defaultdict tmp_heights.append idx.by_idx.append logging.getLogger.info self.callback_handler.on_prediction_step model maybe_split matplotlib.pyplot.xticks nltk.tokenize.treebank.TreebankWordDetokenizer.detokenize random.choice.replace map_triples_to_facts f.write resolve_redirect grp.tuple.all_unique.append collections.defaultdict.add all_series.append collections.Counter.most_common torch.stack item.replace.strip.replace neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.save_metrics EvalPredictionWithMetadata get_unit name.replace.replace.replace resolve_later_ref.append read_questions_into_dict.items torch.utils.data.sampler.WeightedRandomSampler collections.defaultdict.values all_grams.extend matplotlib.pyplot.savefig sampled.extend refs.split.replace convert_numeric_hypothesis hdr.replace name.replace.replace.replace.replace.replace.replace.replace json.load.get ax.plot map question_template.split self.model.generate context_outputs.pooler_output.T.question_outputs.pooler_output.torch.matmul.cpu get_generator sizes.append r.final_templates.keys relation_id.additional_subjects.get.keys.set.union name.replace.replace float Exception glob.glob collections.defaultdict.keys generate_positive_question list.append v.to matplotlib.pyplot.plot subj.by_relation.extend self.tokenizer.convert_tokens_to_ids next_actions.tolist.tolist long_questions.extend new_search.append stripped_template.replace.replace is_valid_file population.pop find_matches r.source_mutations.keys generate_answers resolve_first_ref self.maybe_tokenize_db 
result.strip self.database_reader.load_instances loaded.append self.model.prepare_decoder_input_ids_from_labels logging.getLogger.debug relation_id.additional_objects.get.keys.set.union question_template.split.strip aggr.update numpy.concatenate pandas.pivot_table logging.getLogger transformers.integrations.is_deepspeed_zero3_enabled f.read transformers.trainer_utils.is_main_process hasattr subject_name.q.replace.replace pydash.get pandas.DataFrame join normalize_subject resolve_later_ref.split object_id.startswith model.half.to.half added_instances.append matplotlib.pyplot.subplots subj.by_subject.extend rel.subj.by_sub_rel.append join_decoded sro.startswith resolve_later_ref feature.items hdr.replace.replace os.path.exists bool build_questions_for_db batch.append k.replace os.path.isdir tmp_rels.append instance.strip.rstrip.replace.replace.split merge_type read_csv.items self._prepare_inputs rel.split.split_by_relation.extend torch.nn.Softmax partition_questions statement.replace all predicted.split answer_sizes.append dataset.append print threshold.cos_scores.np.nonzero.squeeze map_triples_to_facts.keys torch.no_grad transformers.trainer_utils.denumpify_detensorize.pop isinstance collections.OrderedDict found_sro.hf.set.union generate_derivations.append TFIDFRetriever.closest_docs answer.strip query.active_questions.append search_toks.index element.split tokenizer.pad_token.tokenizer.pad_token.tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.label.replace.replace.replace.strip nltk.word_tokenize itertools.chain ordering.index subject_name.question_template.replace.replace str all_losses.mean.item clean_title actual.split transformers.trainer_pt_utils.find_batch_size dev_examples.append partition_subject k.strip.answers.append derivation.strip.split json.dumps question_template.replace.replace qh1_filtered.append stds.extend sentence_transformers.util.pytorch_cos_sim self._wrap_model operator.itemgetter 
outputs.outputs.dict.outputs.isinstance.mean sample_databases final_questions.append random.uniform logging.getLogger.warning range subj.by_object.extend evaluate_ndb_with_ssg os.getenv repr o.by_object.append statement.replace.replace context_tokens.append pandas.set_option itertools.repeat get_instances_from_file precision ax.fill_between csv.DictReader linearize transformers.HfArgumentParser.parse_args_into_dataclasses item.replace.strip tokenizer.pad_token.tokenizer.pad_token.tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.label.replace.replace.replace.strip.split matplotlib.pyplot.legend dpr.context_model.eval logging.getLogger.critical all_metadata.extend self.collection.find_one wikidata_common.wikpedia.Wikipedia medium_questions.extend tokenizer.batch_decode group_derivations b.count copy.copy process_lists key.startswith setuptools.find_packages json.loads.split partition_relation q.strip numpy.mean question_template.replace object_name.replace TFIDFRetriever.lookup question.split.strip features.keys nltk.ngrams fact.split.detok.detokenize.replace.replace.split tokenizer.add_special_tokens transformers.DPRQuestionEncoder.from_pretrained collections.Counter.update list model.half.to kvp.split hdr.replace.replace.replace.replace.replace v.torch.LongTensor.to new_states.append object_name.replace.replace template.keys.set.difference search_str.clean.strip example.append read_csv fact.split.detok.detokenize.replace.replace torch.matmul dataclasses.field query_obj.range.set.difference majority_vote ndb_data.util.log_helper.setup_logging numpy.argmax tmp_positive_answers.append neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.train self._process_query v.len.by_len.append logging.root.addHandler generate_facts_for_db.append transformers.DPRContextEncoder.from_pretrained wikidata_common.wikpedia.Wikipedia.resolve_redirect set ndb_data.wikidata_common.wikidata.Wikidata prediction.replace.split filter 
neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.save_state neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.save_model setup_logging min lookup_entity all_subjects.set.union self.subsampler.maybe_drop_sample generate_joins_extra neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.log_metrics tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.label.replace.replace.replace subj.by_sub_rel.append round sampled.append neuraldb.dataset.neuraldb_parser.NeuralDBParser.load_instances sampled_fact.strip dpr.question_model.eval in_dict.items dict_flatten self._prepare_inputs.items collection.insert_many pymongo.UpdateOne numpy.count_nonzero num_fact_used.append get_bool_breakdown reader_cls.read sentence_transformers.losses.ContrastiveLoss transformers.DPRContextEncoderTokenizer.from_pretrained neuraldb.evaluation.scoring_functions.average_score numpy.std sorted read_dump matplotlib.pyplot.hlines v.split.strip.strip q.by_qid.append find_longest_match ssg_data.append tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.pred.replace.replace.replace generate_joins_filter transformers.DPRQuestionEncoderTokenizer.from_pretrained matplotlib.pyplot.ylabel self.context_model DPRRetriever.lookup i.values transformers.trainer_pt_utils.nested_truncate ssg_output.remove relation_id.extra_subjects.get.keys matplotlib.pyplot.style.use db.split.rsplit neuraldb.evaluation.scoring_functions.breakdown_score delattr qid.split open.read self.tokenizer.pad.append tokenizer.pad_token.tokenizer.pad_token.tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.pred.replace.replace.replace.strip.split local_obs.append neuraldb.dataset.instance_generator.subsampler.Subsampler fact.split.detok.detokenize.replace torch.cat prediction.split.replace transformers.AutoTokenizer.from_pretrained.tokenize scoring_function self.tokenizer.tokenize short_questions.extend self.features.keys numpy.sum self._generate 
self._process_answer transformers.DPRQuestionEncoder.from_pretrained.to tok.startswith k.strip math.pow subject.get numpy.concatenate.mean generate_negative_bool q.startswith s.strip additional_ids.difference.difference question.strip instance.strip.rstrip.replace random.choices matplotlib.pyplot.xlabel ndb_data.wikidata_common.wikidata.Wikidata.get_by_id_or_uri set.append reader_cls normalize_subject.replace generate_derivations self.concatenate_answer transformers.trainer_utils.denumpify_detensorize.keys actual.join_decoded.lower rel.by_sub_rel.append state.copy.append self.context_tokenizer self.prediction_step transformers.AutoTokenizer.from_pretrained.decode check_match derivation.strip.startswith self.compute_metrics random.choice derivation.rsplit len.keys inputs.outputs.self.label_smoother.mean make_symmetric ndb_data.construction.make_database_initial.normalize_subject transformers.AutoModelForSeq2SeqLM.from_pretrained.resize_token_embeddings super.__init__ matplotlib.pyplot.title question_template.replace.replace.replace self.collection.find get_size_bin b.strip.first_bit.strip template.keys claim.pydash.get.values ef.write q_idx.db_idx.questions_answers.append len random.random numpy.min prediction.replace.replace k.rel_avgs.append get_bool_ans argparse.ArgumentParser.parse_args prediction.split json.loads.replace neuraldb.dataset.seq2seq_dataset.Seq2SeqDataset ValueError prediction.split.replace.replace self.validation_file.split all_stds.append tmp_derivations.append sentence_transformers.evaluation.BinaryClassificationEvaluator.from_input_examples torch.cuda.amp.autocast try_numeric self.test_file.split generate_joins self.instance_generator.generate clean.append name.replace.replace.replace.replace.replace.replace neuraldb.dataset.data_collator_seq2seq.DataCollatorForSeq2SeqAllowMetadata context_outputs.pooler_output.T.question_outputs.pooler_output.torch.matmul.cpu.detach.numpy.argsort pred.replace key.added_q_type_bin.append 
by_subj.keys.set.difference logging.StreamHandler next extended_question_answers.append sentence_transformers.SentenceTransformer self.answer_delimiter.join numpy.where wikidata_common.wikidata.Wikidata.find_custom torch.zeros logging.getLogger.error instance.update format collection.find super.compute_loss context_outputs.pooler_output.T.question_outputs.pooler_output.torch.matmul.cpu.detach ndb_data.wikidata_common.kelm.KELMMongo logging.root.setLevel get_indexable load_experiment drqascripts.retriever.build_tfidf_lines.OnlineTfidfDocRanker extra_negative_facts.append collections.Counter ndb_data.wikidata_common.kelm.KELMMongo.find_entity_rel torch.LongTensor any pydash.get.items sum setuptools.setup wikidata_common.wikidata.Wikidata model.half.to.eval clean second.nested.n_count.add lengths.append hyp.original_for.append self._nested_gather question_template.replace.replace.replace.replace.replace os.path.basename index_dump bulks.append matplotlib.pyplot.fill_between zip search_key.result.n_count.add json.loads.append generate_db_facts self._pad_tensors_to_max_len singleton_questions.extend elem.to transformers.AutoTokenizer.from_pretrained statement.replace.replace.replace compute_f1 set.add transformers.AutoTokenizer.from_pretrained.encode bz2.open similarity.normalized_levenshtein.NormalizedLevenshtein argparse.ArgumentParser.add_argument transformers.trainer_utils.EvalLoopOutput generate_derivations.extend s.copy context_outputs.pooler_output.T.question_outputs.pooler_output.torch.matmul.cpu.detach.numpy.argsort.tolist generate_hypotheses hdr.replace.replace.replace.replace sentence_transformers.SentencesDataset retokenize self.context_delimiter.join out_file.write self._maybe_sample label.replace json.loads transformers.set_seed self.tokenizer.encode_plus type question.strip.replace neuraldb.evaluation.scoring_functions.f1 json.load.items additional_ids.difference.update sentence_transformers.InputExample o.startswith random.shuffle 
transformers.AutoConfig.from_pretrained sentence_transformers.SentenceTransformer.encode re.match prediction.split.replace.replace.lower math.floor additional_subjects.keys.set.union derivation.split datasets.tqdm self._pad_across_processes instance.questions.append name.replace question.replace self.tokenizer.as_target_tokenizer question_template.replace.replace.replace.replace recall read_databases candidate_negatives_1.append plot.append question_template.startswith numpy.max nltk.tokenize.treebank.TreebankWordDetokenizer ctx.insert isinstance.items self._load_instances local_f.append qbin.qtype.all_questions_binned.append generate_facts_for_db outputs.outputs.dict.outputs.isinstance.mean.detach get_numeric_value tmp_types.append normalize_subject.split self.question_types.values random.choice.startswith context_outputs.pooler_output.T.question_outputs.pooler_output.torch.matmul.cpu.detach.numpy collection.bulk_write ndb_data.wikidata_common.kelm.KELMMongo.close transformers.trainer_utils.denumpify_detensorize ssg_utils.read_NDB instance.split.TreebankWordDetokenizer.detokenize.replace.replace data_files.items instance.copy.strip super generator_cls subject_name.question.replace.replace r.final_templates.keys.set.difference self.tokenizer.add_tokens dataset.extend all_experiments.append plot.sort self.num_examples json.dump object.get self._prepare_inputs.pop v.split.strip collection.estimated_document_count expt.update neuraldb.dataset.neuraldb_parser.NeuralDBParser q.replace keys.split final_sets.append db.extend tmp_fact_ids.append hyp.extra_kelm_for.append self.train_file.split kwargs.get argparse.ArgumentParser.error numpy.nonzero similarity.normalized_levenshtein.NormalizedLevenshtein.similarity argparse.ArgumentParser config_kwargs.update refs.split.split startptr.toks.join.clean.split keys.split.strip db_idx.to_add.append ndb_data.generation.question_to_db.generate_answers tuple transformers.utils.logging.set_verbosity_info shutil.rmtree 
additional_objects.keys.set.union pandas.pivot_table.to_records NotImplementedError a.strip super.prediction_step itertools.product tqdm.tqdm derivation.strip subject_name.islower random.choice.split post_process_instances property.property_entity.append matplotlib.pyplot.show os.unlink final_period self.tokenizer.decode instance.strip.rstrip.replace.replace swap_so question.split v.strip.answers.append re.match.group outputs.outputs.dict.outputs.isinstance.mean.detach.repeat relation_id.additional_objects.get.keys line.rstrip instance.split.TreebankWordDetokenizer.detokenize.replace self.tokenizer.pad numpy.cumsum copy.copy.extend tuple.startswith partition_idx partition_subject.keys functools.reduce states.pop.copy hyp.hypotheses_facts.append transformers.utils.logging.enable_default_handler others.append read_questions_into_dict self.maybe_decorate_with_metadata obj.startswith v.split.strip.split neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer.evaluate object_name.islower cos_scores.cpu.cpu transformers.trainer_pt_utils.nested_concat json.load b.strip self.question_tokenizer series.extend partition_subject_relation transformers.trainer_pt_utils.nested_numpify get_file_stats bring_extra_facts pymongo.MongoClient read_csv.lower json.loads.strip postprocess_text numpy.percentile transformers.utils.logging.enable_explicit_format os.listdir random.sample os.makedirs derivation.split.strip hf.keys self.label_smoother sentence_transformers.SentenceTransformer.fit is_valid_folder read_questions_into_dict.keys tokenizer.pad_token.tokenizer.pad_token.tokenizer.eos_token.tokenizer.eos_token.tokenizer.bos_token.tokenizer.bos_token.pred.replace.replace.replace.strip TFIDFRetriever transformers.trainer_utils.get_last_checkpoint collections.Counter.items subject_name.modifier.is_subject.q.replace.replace transformers.DPRContextEncoder.from_pretrained.to out.extend subject_name.fact.replace.replace extract_operator inputs.outputs.self.label_smoother.mean.detach 
ssg_utils.create_dataset self.question_tokenizer.items lookup_relation start.toks.join.startswith r.by_relation.append derv.split max derv.tokenizer.encode.tokenizer.decode.strip tokenizer.bos_token.tokenizer.bos_token.label.replace.replace int remove_lst.append random.randint transformers.AutoModelForSeq2SeqLM.from_pretrained neuraldb.modelling.neuraldb_trainer.NeuralDBTrainer example.self.tokenizer.convert_tokens_to_ids.self.tokenizer.decode.split open set.update derivation.rsplit.strip hdr.replace.replace.replace enumerate k.startswith pandas.DataFrame.select_dtypes states.pop weights.append v.items DPRRetriever name.replace.replace.replace.replace.replace tmp_questions.append of.write main tokenizer.bos_token.tokenizer.bos_token.pred.replace.replace predicted.join_decoded.lower unit_uri.replace relation_id.additional_subjects.get.keys flatten_dicts transformers.HfArgumentParser self.maybe_tokenize_answer convert_comparable self._prepend_prediction_type_answer self.question_model q.q_heights.append train_examples.append name.replace.replace.replace.replace dict set.items evaluation_metrics pydash.get.values virtual_features.extend s.by_subject.append batch_update.append relation_id.extra_objects.get.keys name.replace.replace.replace.replace.replace.replace.replace.replace neuraldb.util.log_helper.setup_logging search_toks.append wikidata_common.wikidata.Wikidata.get_by_id_or_uri
@developer
Could you please help me check this issue?
May I submit a pull request to fix it?
Thank you very much.