P-hacking is the practice of selectively analysing or reporting data until statistical significance is achieved, which inflates the false-positive rate. Despite its prevalence, quantifying how difficult a specific result is to obtain via p-hacking remains cumbersome. This project introduces a novel metric, the estimated P-Hacking Attempt-Insistence-Length (phail), defined as the expected number of times researchers would need to rerun their analyses to obtain the number of significant results reported in a paper. We will apply this measure to a sample of highly cited behavioural and cognitive neuroscience studies. The project will produce a library that computes the phail, providing an accessible resource for researchers and meta-scientists. If time allows, we will begin automating the full workflow from paper to index computation. Ultimately, this project aims to deliver a practical tool for assessing the reliability of specific methodological pipelines and tools, and to support assessments by authors, reviewers, and editors.
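As a rough illustration of the definition above, here is a minimal sketch under strong simplifying assumptions that are not part of the project specification: every test in a paper is treated as an independent null test with false-positive rate `alpha`, and each rerun of the analysis is treated as an independent attempt, so the number of attempts until success follows a geometric distribution. The function names `prob_attempt_succeeds` and `expected_attempts` are hypothetical and do not refer to the planned library.

```python
from math import comb


def prob_attempt_succeeds(n_tests: int, k_significant: int, alpha: float = 0.05) -> float:
    """Probability that at least k of n independent null tests give p < alpha."""
    if not 0 <= k_significant <= n_tests:
        raise ValueError("k_significant must lie between 0 and n_tests")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Upper tail of a Binomial(n_tests, alpha) distribution (assumed independence).
    return sum(
        comb(n_tests, j) * alpha**j * (1.0 - alpha) ** (n_tests - j)
        for j in range(k_significant, n_tests + 1)
    )


def expected_attempts(n_tests: int, k_significant: int, alpha: float = 0.05) -> float:
    """Mean of a geometric distribution: 1 / P(a single attempt succeeds)."""
    p_success = prob_attempt_succeeds(n_tests, k_significant, alpha)
    # Guard against floating-point underflow for very unlikely outcomes.
    return float("inf") if p_success == 0.0 else 1.0 / p_success


if __name__ == "__main__":
    # (number of tests in the paper, number reported as significant)
    for n, k in [(1, 1), (10, 3), (3, 3)]:
        print(f"n_tests={n}, k_significant={k}: {expected_attempts(n, k):.1f} attempts")
```

With a single test at `alpha = 0.05`, this sketch gives 20 expected attempts, matching the intuition that about one in twenty null analyses is significant by chance. Real pipelines have correlated tests and analysis choices, which is why the project plans to estimate the index by simulation instead.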
- Define and formalise the P-Hacking Attempt-Insistence-Length (phail)
- Think of a better name/acronym (BNA) for the phail
- Select a test set of papers that use linear mixed-effects (LME) models
- Code the analysis pipelines of the selected papers and use simulations to estimate BNA
- Create visualisations summarising the BNA distribution across studies (hopefully N > 20)
- Scale up the code to an open-source library for the general computation of BNA on LMEs
- Devise a very cool name and acronym for the library
- Explore automation of the end-to-end paper → BNA computation pipeline
- Draft a short report detailing motivation, methods, findings, and next steps
- Gather a representative sample of highly cited behavioural and cognitive neuroscience papers, and extract their reported statistical results and methodological details
- Code the papers' analysis pipelines so that they are ready for automatic processing
- Translate the conceptual idea of "p-hacking attempt insistence length" into a computable quantitative index
- Find a cooler name and acronym for the index
- Develop and test algorithms to estimate phail from reported statistics and study designs, including iterative simulation or inference methods
- Generate summary statistics and plots
- Implement the phail computation as an open-source library with clear APIs and documentation
- Consolidate the commonly occurring components of the paper-specific analysis pipelines into modules
- Write an introduction to the index and main idea of the project for a potential manuscript
- Write the methods section
- Write a first draft of the results and discussion
- Prepare the slides for the Brainhack project presentations
- Assign teams for the different components of the project
- Define the phail metric conceptually and outline how it will be computed
- Identify and collect a small test set of target papers (e.g., 5–10)
- Code the generation of null-hypothesis data for at least one type of data used in the papers
- Implement the analysis pipeline for at least one study
- Set up a shared repository and code skeleton for the library
- Write the introduction for the manuscript
- Expand the paper dataset to at least 20 papers
- Expand the null-hypothesis data generator to all types of data present in the papers
- Implement the analysis pipeline for at least 10 of the studies
- Extract reported results from all test papers (e.g., sample sizes, p-values, effects)
- Implement a minimal working version of the phail computation pipeline
- Write the methods section of the manuscript
- Refactor the analysis pipelines into a modular library
- Implement the analysis pipelines for the rest of the studies using the library
- Compute phail for all the papers in the dataset
- Produce summary plots and identify preliminary field-level patterns
- Write the draft of the results and discussion sections of the manuscript
- Prepare the slides for the project presentations
- Open science mindset: Understanding p-values, p-hacking, research transparency, and reproducibility issues in cognitive neuroscience
- Metric development: Designing and formalising quantitative indices from conceptual ideas
- Collecting and structuring reported methodological pipelines from published papers
- Python programming and library development: Writing reusable code for computations and interacting with other programming languages from Python
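The null-data generation and simulation steps listed above could be prototyped along the following lines. This is only a sketch under stated assumptions: a two-group z-test on Gaussian noise with known unit variance stands in for the papers' actual LME pipelines, each simulated attempt reruns all tests on fresh null data, and every name and default value (`null_two_group_p_value`, `estimate_expected_attempts`, `n_per_group`, `n_sim`) is hypothetical. It uses only the Python standard library.

```python
import random
from statistics import NormalDist, fmean


def null_two_group_p_value(rng: random.Random, n_per_group: int) -> float:
    """Simulate one null dataset (no true group difference) and return a two-sided p-value."""
    group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    # z-test with known unit variance: the mean difference has variance 2 / n.
    z = (fmean(group_a) - fmean(group_b)) / (2.0 / n_per_group) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def attempt_succeeds(rng, n_tests, k_significant, n_per_group, alpha):
    """One analysis attempt: run n_tests null tests and check for at least k hits."""
    hits = sum(null_two_group_p_value(rng, n_per_group) < alpha for _ in range(n_tests))
    return hits >= k_significant


def estimate_expected_attempts(n_tests, k_significant, n_per_group=20,
                               alpha=0.05, n_sim=20000, seed=0):
    """Monte Carlo estimate of the expected attempts: n_sim / number of successful attempts."""
    rng = random.Random(seed)  # seeded so that the estimate is reproducible
    successes = sum(
        attempt_succeeds(rng, n_tests, k_significant, n_per_group, alpha)
        for _ in range(n_sim)
    )
    # No success observed: the estimate is unbounded at this simulation budget.
    return float("inf") if successes == 0 else n_sim / successes


if __name__ == "__main__":
    print(estimate_expected_attempts(n_tests=1, k_significant=1))
```

In a real pipeline, the null generator and the test would be replaced by the data type and model of each paper. The Monte Carlo structure (simulate null data, run the analysis, count successful attempts) would stay the same.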
We welcome contributions from all participants! Please see CONTRIBUTING.md for information on how to add yourself as a contributor to this project.
To be determined