group: Categorical Variables from Logical Conditions

group is a Stata command that creates a categorical variable from multiple logical conditions. Rather than forcing overlapping categories to be mutually exclusive, it assigns each condition a power of two and sums the values of the conditions an observation meets, so every combination of conditions gets its own code. This is useful for data validation, complex categorization, and other cases where mutually exclusive categories fall short.

Features ✨

  • Handles Overlapping Conditions: Assigns a distinct value when an observation meets more than one criterion
  • Robust Categorization: Uses a power-of-two indexing system to ensure every combination of conditions gets a unique, interpretable value
  • Flexible Syntax: Allows custom delimiters to avoid conflicts with condition or label text
  • Descriptive Labels and Notes: Automatically generates clear value labels and detailed notes for combined categories
  • Programmatic Access: Returns detailed results in r() for use in other scripts or validation checks
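As a rough sketch of the programmatic access mentioned in the last bullet, Stata's built-in return list command displays whatever a command has left in r(). The names of the individual results are not listed in this README, so the example assumes none and simply prints what is there. It also assumes group and tuples are already installed as described under Installation.

```stata
* Sketch only: inspect whatever group stores in r()
sysuse auto, clear

group, rules(mpg > 22 :: "Efficient" ||| price < 5000 :: "Affordable") generate(car_cat)

* Built-in Stata command; lists all r() scalars, macros, and matrices
return list
```

Run return list immediately after group, because the next r-class command overwrites r().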

Installation

First, install the required dependency:

ssc install tuples, replace

Then install group directly from GitHub:

net install group, from("https://raw.githubusercontent.com/hafizarfyanto/group/main/")

Updating to Latest Version

To ensure you have the most recent features and bug fixes:

net install group, replace from("https://raw.githubusercontent.com/hafizarfyanto/group/main/")

Uninstalling

If you need to remove the package:

ado uninstall group

Alternative Uninstall Method

If the standard uninstall method doesn't work (e.g., if group was installed multiple times), follow these steps:

  1. Run ado dir group in the Stata Command window

  2. Note all index numbers shown for group installations

  3. Uninstall packages using index numbers in descending order:

    ado uninstall [highest_index]
    ado uninstall [next_index]

Practical Example

* Load sample data
sysuse auto, clear

* Create categories based on overlapping conditions
group, rules(mpg > 22 :: "Efficient" ||| price < 5000 :: "Affordable") generate(car_cat)

* See the results
table car_cat
label list CAR_CAT
notes car_cat

Output explanation:

  • Value 0: Neither efficient nor affordable
  • Value 1: Efficient only (mpg > 22)
  • Value 2: Affordable only (price < 5000)
  • Value 3: Both efficient AND affordable (1 + 2 = 3)

This power-of-two system ensures every possible combination gets a unique, interpretable value.
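To make the arithmetic concrete, the same coding can be reproduced by hand with built-in Stata commands. This is a minimal sketch of the idea, not the package's internal implementation; the variable names manual_cat, is_efficient, and is_affordable are made up for illustration.

```stata
* Manual sketch of power-of-two indexing (not the package's own code)
sysuse auto, clear

* Each condition evaluates to 0 or 1 and is weighted by a distinct power of two.
* Note: Stata treats missing as larger than any number, so "mpg > 22" would be
* true for missing mpg; auto.dta has no missing mpg or price.
generate byte manual_cat = 1 * (mpg > 22) + 2 * (price < 5000)

* Decode the combined value back into its component conditions
generate byte is_efficient  = mod(manual_cat, 2)
generate byte is_affordable = mod(floor(manual_cat / 2), 2)

tabulate manual_cat

* The decoded flags match the original conditions
assert is_efficient  == (mpg > 22)
assert is_affordable == (price < 5000)
```

Because the weights are distinct powers of two, each sum can be formed in only one way. With n conditions there are up to 2^n possible values, from 0 to 2^n - 1.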

Syntax & Options

group, rules(string) [generate(newvar) label pairdelimiter(string) labdelimiter(string)]

Main Options:

  • rules(string): Define conditions and labels using condition :: label ||| condition :: label format
  • generate(newvar): Specify custom variable name (default: _group)
  • label: Use descriptive text for value labels instead of numeric codes
  • pairdelimiter(string): Custom separator between condition-label pairs (default: |||)
  • labdelimiter(string): Custom separator between condition and label (default: ::)
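The two delimiter options might be used as in the following sketch, which is based only on the option descriptions above and has not been checked against the package. The delimiters @@ and =>, and the variable name car_cat2, are arbitrary choices for illustration.

```stata
* Sketch only: custom delimiters, assuming the options behave as described above.
* Run from a do-file, since /// line continuation is not available interactively.
sysuse auto, clear

* "@@" separates condition-label pairs; "=>" separates a condition from its label
group, rules(mpg > 22 => "Efficient" @@ price < 5000 => "Affordable") ///
    generate(car_cat2) pairdelimiter(@@) labdelimiter(=>)

tabulate car_cat2
```

Pick delimiters that do not occur anywhere in the condition expressions or the label text.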

For complete documentation, see: help group

Compatibility

  • Requires Stata 16 or newer
  • Requires the user-written tuples command (install with ssc install tuples)

Support

Report issues or suggest improvements:
GitHub Issues

Author

Hafiz Arfyanto
Email | GitHub

Citation

If you use group in your research, please cite:

Plain Text:

Hafiz Arfyanto (2025). group: Categorical Variables from Logical Conditions. Version 1.0.0.
Retrieved from https://github.com/hafizarfyanto/group

BibTeX Entry:

@misc{arfyanto2025group,
  author = {Hafiz Arfyanto},
  title = {group: Categorical Variables from Logical Conditions},
  version = {1.0.0},
  year = {2025},
  url = {https://github.com/hafizarfyanto/group},
  note = {Stata command for creating categorical variables from multiple logical conditions}
}

For detailed documentation, see the official help file in Stata:

help group
