Extended Kendall Tau

Project

This project lets you clean text data, build lists of the letters of the alphabet ranked by number of occurrences, calculate Kendall's Tau distance and Extended Kendall's Tau distance between those lists, and run summary statistics and statistical tests on the results.
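As an illustration of the quantity involved, here is a minimal sketch of the classical Kendall Tau distance between two rankings of the same items: the number of item pairs that the two rankings place in opposite order. The function name `kendall_tau_distance` is hypothetical and is not taken from FUNCTIONS.py. The extended variant used by this project is not reproduced here, since its definition is not given in this README.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings order differently.

    Both rankings must contain the same items, each exactly once.
    """
    if (
        len(set(ranking_a)) != len(ranking_a)
        or len(ranking_b) != len(ranking_a)
        or set(ranking_a) != set(ranking_b)
    ):
        raise ValueError("rankings must be permutations of the same items")
    # Position of every item in the second ranking.
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    # combinations() yields (x, y) with x ranked before y in ranking_a;
    # the pair is discordant when ranking_b places y before x.
    return sum(1 for x, y in combinations(ranking_a, 2) if pos_b[x] > pos_b[y])


# Toy rankings for illustration only (not project data).
print(kendall_tau_distance(list("eaosr"), list("aeors")))  # 2
```

The two discordant pairs in the toy example are (e, a) and (s, r). A distance of 0 means identical rankings, and the maximum for n items is n(n-1)/2, reached when one ranking is the reverse of the other.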

Overview

The project consists of several scripts, each handling different aspects of the data processing and analysis pipeline:

Data Cleaning and Summary Statistics: Cleans provided .txt files and converts them into summarized statistics.
Matrix Calculation: Generates matrices for Kendall Tau and Extended Kendall Tau calculations based on summarized data.
Statistical Analysis: Calculates and aggregates statistics from the generated matrices.
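The first stage can be pictured with a short sketch that turns raw text into a list of letters ranked by occurrences. This is an assumption about what the cleaning involves (lowercasing and discarding non-letters), not the code in FUNCTIONS.py, and `rank_letters` is a hypothetical name.

```python
from collections import Counter


def rank_letters(text):
    """Return the letters in `text` ordered from most to least frequent."""
    # Assumed cleaning step: lowercase and keep alphabetic characters only.
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    # Sort by descending count; ties are broken alphabetically so the
    # result is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [letter for letter, _ in ordered]


print(rank_letters("Banana, band!"))  # ['a', 'n', 'b', 'd']
```

Ranked lists produced this way for two texts are the kind of input a rank distance such as Kendall Tau compares.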

Dependencies

This project requires the following Python libraries:

itertools: For efficient looping.
re: For regular expression matching.
ast: For safely evaluating strings containing Python expressions.
collections: For high-performance container datatypes.
numpy: For scientific computing with Python.
openpyxl: For reading and writing Excel files.
scipy: For scientific and technical computing.
os: For interacting with the operating system.
pandas: For data manipulation and analysis.

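The third-party libraries (numpy, openpyxl, scipy, and pandas) can be installed with pip, for example with pip install numpy openpyxl scipy pandas. The others (itertools, re, ast, collections, and os) are part of the Python standard library and need no installation.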

Getting Started

Data Preparation: Place your .txt files in designated folders named after their respective categories (e.g., 'ES', 'PT', 'IN', etc.).
Execution Steps:
- Run the process_folder function for each folder containing .txt files to clean data and calculate preliminary statistics.
- Use the aggregate function to compile and summarize statistics across all folders.
- Execute the matrix function to fill Excel matrices for KT and EKT data analysis.
- Copy the data from outputkt.xlsx and outputekt.xlsx into the previously formatted outputkt_form.xlsx and outputekt_form.xlsx.
- Perform statistical calculations using the calculate_statistics function for in-depth analysis.
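The matrix step above can be sketched as a pairwise comparison of the ranked lists from each category folder. The signatures of process_folder, aggregate, matrix, and calculate_statistics are not documented here, so this sketch does not call them. `distance_matrix` is a hypothetical helper, and a compact Kendall Tau distance function is included so the block runs on its own.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Number of item pairs ordered differently by the two rankings."""
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    return sum(1 for x, y in combinations(ranking_a, 2) if pos_b[x] > pos_b[y])


def distance_matrix(rankings):
    """Build a symmetric matrix of pairwise distances.

    `rankings` maps a category label (e.g. 'ES') to its ranked list.
    Returns (labels, matrix), where matrix[i][j] is the distance between
    labels[i] and labels[j].
    """
    labels = sorted(rankings)
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = kendall_tau_distance(rankings[labels[i]], rankings[labels[j]])
        matrix[i][j] = matrix[j][i] = d
    return labels, matrix


# Toy rankings for illustration only (not the project's data).
toy = {"ES": list("eaos"), "PT": list("aeos"), "IT": list("eoas")}
labels, matrix = distance_matrix(toy)
print(labels)  # ['ES', 'IT', 'PT']
print(matrix)  # [[0, 1, 1], [1, 0, 2], [1, 2, 0]]
```

A matrix in this shape could be wrapped in a pandas DataFrame and saved with `DataFrame.to_excel`, which uses openpyxl for .xlsx files. Whether the project writes its output that way is an assumption.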

Usage

The main.py script orchestrates the project's workflow. Adjust the folder names and paths as necessary before running it. The original data used in the submitted paper is provided in the category folders. Additional spreadsheets in Excel format can be found in the project's root directory.

Contributing

Contributions that improve the project are welcome. Please follow the project's coding standards and submit a pull request for any enhancement.

License

This project is released into the public domain and is free of licenses. It can be used, modified, and distributed without any restrictions. For more details, please refer to the Creative Commons CC0 declaration.

