A CLI tool to visualize the distribution of file types and disk usage in a directory, built with Rich and Typer.
A picture of the file types in a directory and their disk usage helps decide what to add to .gitignore or .gitattributes (for Git LFS). This was the original use case for this tool.
- Tables: Output with icons for common file types.
- Scanning: Directory scanning with a progress bar.
- Tree View: Hierarchical view of file types.
- Example Files: Lists files contributing to each category.
- JSON Export: Save statistics to JSON.
- Filtering: Options to include/exclude hidden files and limit results.
Clone the repository and install using uv:
git clone <repository_url>
cd filetype-stats
uv tool install .
Run the tool in the current directory:
filetype-stats analyze
Analyze a specific directory:
filetype-stats analyze /path/to/directory
Arguments:
DIRECTORY: The directory to analyze. Defaults to the current directory (.).
Options:
--max-ext-length INTEGER: Maximum length for valid file extensions. Default is 10.
--include-hidden: Include hidden files and directories in the analysis. Default is False (hidden files are ignored).
--sort [size|count|name]: Sort results by total size, file count, or extension name. Default is size.
--top INTEGER: Show only the top N file types. If not specified, shows all found types.
--show-examples: Show example files (up to 3) for each file type in the table. Default is False.
--tree: Display results as a tree view instead of a table. Default is False.
--export-json TEXT: Export the analysis results to a JSON file at the specified path.
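The effect of the filtering and sorting options can be pictured with a small Python sketch. This is not the tool's actual implementation: the function names `scan_by_extension` and `top_types` are hypothetical, and two behaviors are assumptions made for illustration, namely that "hidden" means a name starting with a dot and that extensions longer than the limit are grouped with extensionless files.

```python
import os
from collections import defaultdict


def scan_by_extension(root, max_ext_length=10, include_hidden=False):
    """Tally file count and total bytes per extension under root (illustrative only)."""
    stats = defaultdict(lambda: {"count": 0, "size": 0})
    for dirpath, dirnames, filenames in os.walk(root):
        if not include_hidden:
            # Prune hidden directories in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            filenames = [f for f in filenames if not f.startswith(".")]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            # Assumption: missing or overly long extensions share one bucket.
            if not ext or len(ext) - 1 > max_ext_length:
                ext = "(no extension)"
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # e.g. broken symlink or permission error
            stats[ext]["count"] += 1
            stats[ext]["size"] += size
    return dict(stats)


def top_types(stats, sort="size", top=None):
    """Order extensions by size, count, or name, optionally keeping the top N."""
    if sort == "name":
        ordered = sorted(stats.items(), key=lambda kv: kv[0])
    else:
        ordered = sorted(stats.items(), key=lambda kv: kv[1][sort], reverse=True)
    return ordered[:top] if top is not None else ordered
```

Calling `top_types(scan_by_extension("."), sort="count", top=5)` would roughly correspond to `--sort count --top 5` under these assumptions.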
Example with complex options:
filetype-stats analyze ./src --include-hidden --max-ext-length 5 --sort count --export-json stats.json
- If sizes differ from those reported by du, first check that --include-hidden is on. filetype-stats ignores hidden files and directories by default, while du includes everything by default.
- Any remaining discrepancy between the sizes reported by du and filetype-stats arises because the two measure different things. du reports disk usage, including block size overhead, blocks taken up by directories, and any file system metadata. os.path.getsize() reports the actual file size: the number of bytes in each file, ignoring block size overhead. For this reason, du will usually report larger sizes than filetype-stats.
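To see the two measurements side by side for a single file, a minimal Python sketch can read both from `os.stat`. The helper name `apparent_and_allocated` is hypothetical, and the allocated figure relies on `st_blocks`, which is only available on POSIX systems and is counted in 512-byte units.

```python
import os


def apparent_and_allocated(path):
    """Return (apparent bytes, allocated bytes or None) for one file."""
    st = os.stat(path)
    apparent = st.st_size  # the same value os.path.getsize() returns
    # st_blocks counts 512-byte units on POSIX; it is absent on Windows.
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return apparent, allocated
```

On many file systems a one-byte file has an apparent size of 1 but occupies a full block on disk, which is the kind of gap du reflects. Sparse files and file systems that compress or inline small files can reverse the relationship.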