Generalized Architecture for Practical Linguistic Intelligence

In the race for Artificial General Intelligence (AGI), Large Language Models (LLMs) have been the crucial advancement in language processing and question answering. We try to answer the question: "Are singular question answering (QA) systems and LLMs capable of accurately answering a diverse set of questions?" In this project, we introduce a generalized, modular architecture with a plug-and-play approach, in which different specialized state-of-the-art models can be plugged in to answer questions in their own domains. We hypothesize that this modular architecture will not sacrifice performance for generalizability and can therefore outperform unified QA models that are trained simultaneously on multiple QA tasks.

Adding a new model to the pipeline

Add the model JSON entry to the repository folder. The entry looks as follows:

{
  "model_name": <model name on HuggingFace>,
  "type": <list of model answer type: abstractive, extractive, etc.>,
  "description": <description of the model>,
  "downloads": <number of downloads on HuggingFace>,
  "dataset": <list of illustrative HuggingFace datasets>,
  "configs": <optional list of config options sometimes required by datasets>,
  "columns": <list of [question, context] pairs for each dataset>,
  "domain": <list of model domains>,
  "task": <model task (only "question-answering" as of now)>,
  "split": <list of splits to draw example queries from in each dataset>
}

Regenerate the model map with python utils.py.
Add model entries to load_models() and load_model() in model_pipelines.py, including tokenizer, model, and task.
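
Before regenerating the model map, it can help to sanity-check a new entry. The following is a minimal sketch, not the project's actual utils.py: the functions validate_entry and build_model_map, the assumption that each entry lives in its own .json file inside the repository folder, and the assumption that "columns" and "split" hold one item per dataset are all illustrative. The model name in the example entry is a placeholder.

```python
import json
import tempfile
from pathlib import Path

# Keys taken from the entry template above; "configs" is documented as optional.
REQUIRED_KEYS = {"model_name", "type", "description", "downloads",
                 "dataset", "columns", "domain", "task", "split"}
# Assumption: these fields carry one item per dataset listed in "dataset".
PER_DATASET_KEYS = ("columns", "split")

# Placeholder entry for illustration only; the model name is hypothetical.
EXAMPLE_ENTRY = {
    "model_name": "example-org/example-qa-model",
    "type": ["extractive"],
    "description": "Placeholder extractive QA model",
    "downloads": 0,
    "dataset": ["squad"],
    "columns": [["question", "context"]],
    "domain": ["general"],
    "task": "question-answering",
    "split": ["validation"],
}


def validate_entry(entry: dict) -> list:
    """Return a list of problems found in one model entry (empty if none)."""
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if entry["task"] != "question-answering":
        problems.append(f"unsupported task: {entry['task']!r}")
    n_datasets = len(entry["dataset"])
    for key in PER_DATASET_KEYS:
        if len(entry[key]) != n_datasets:
            problems.append(
                f"{key!r} has {len(entry[key])} items, expected {n_datasets}")
    for pair in entry["columns"]:
        if len(pair) != 2:
            problems.append(f"{pair!r} is not a [question, context] pair")
    return problems


def build_model_map(folder) -> dict:
    """Collect every valid *.json entry in `folder`, keyed by model_name."""
    model_map = {}
    for path in sorted(Path(folder).glob("*.json")):
        entry = json.loads(path.read_text(encoding="utf-8"))
        problems = validate_entry(entry)
        if problems:
            raise ValueError(f"{path.name}: {'; '.join(problems)}")
        model_map[entry["model_name"]] = entry
    return model_map


if __name__ == "__main__":
    # Write the placeholder entry to a temporary folder and build a map from it.
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "example.json").write_text(json.dumps(EXAMPLE_ENTRY))
        print(sorted(build_model_map(folder)))
```

Running the validator on each new entry surfaces mismatched list lengths early, before the entry reaches load_models() and load_model() in model_pipelines.py.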

