Conversation

@Ashvin-Ranjan
Contributor

Changes

  • Add IB Optimization tools to ULTK
    • Adds IBStructure to represent a situation (Meanings and Referents)
    • Adds IBLanguage to represent a language (Mapping from Meanings to Expressions)
    • Adds optimization methods from Tishby et al.
  • Add tests for IB Optimization
    • Tests for ib_structure.py, ib_language.py, ib_optimization.py, ib_utils.py
  • Adds utils for conversion from ULTK classes to IB classes
  • Small fixes for Meaning and Grammar
    • Added type annotations for Meaning.__init__
    • Fixed bug where an error message in Grammar was not a proper format string

Notes

  • This PR does not add suboptimal sampling
    • It can be added in this PR if wanted
  • These have been tested against the results from Zaslavsky et al.
    [Image: graph]

mickeyshi-bah and others added 30 commits May 27, 2023 10:53
Added GitHub action to automatically push after linting changes.  Merging and updating while Mickey's on vacation
Renamed altk to ultk in most files that aren't links to documentation
Ashvin-Ranjan and others added 25 commits February 26, 2025 19:06
- Fixed bug where `log_mh_sample` would return `False` instead of just skipping return
- Subtract node counts instead of adding to get proper likelihood
- Fix issue where natural log of node counts was not taken
- Add in two new parameters for weighting the values for `log_mh_accept`
  - Add relevant documentation
- Involved the fact that `mh_generate` would edit `old_tree`, causing `mh_sample` to be wrong
  - No changes to `mh_generate`, instead precalculates `old_tree_likelihood` from `expr`
  - Changes made both to `mh_sample` and `log_mh_sample`
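The acceptance fixes listed in the commits above can be sketched as follows. This is a minimal illustration under stated assumptions, not ULTK's implementation: `log_accept_ratio`, this version of `log_mh_accept`, and all of their parameters are hypothetical, and the node-count term assumes a proposal that picks one node uniformly at random to regenerate. The point is only that the node counts enter as a difference of natural logs, and that the old tree's likelihood is passed in as a number computed before any proposal could modify the tree.

```python
import math
import random


def log_accept_ratio(old_log_like, new_log_like, old_nodes, new_nodes):
    """Log acceptance ratio for a subtree-regeneration proposal (assumed form)."""
    # Likelihood term: log p(data | new tree) - log p(data | old tree).
    likelihood_term = new_log_like - old_log_like
    # Proposal term: picking a node uniformly has probability 1/old_nodes
    # for the forward move and 1/new_nodes for the reverse move, so in log
    # space the natural logs of the counts are subtracted, not added.
    proposal_term = math.log(old_nodes) - math.log(new_nodes)
    return likelihood_term + proposal_term


def log_mh_accept(log_ratio, rng):
    """Accept with probability min(1, exp(log_ratio)), compared in log space."""
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    return math.log(1.0 - rng.random()) <= log_ratio


# Hypothetical numbers: the old likelihood is computed up front and passed in.
rng = random.Random(0)
ratio = log_accept_ratio(old_log_like=-2.0, new_log_like=-1.0, old_nodes=4, new_nodes=4)
accepted = log_mh_accept(ratio, rng)
```

Because the old tree's likelihood is a plain float here, a proposal function that edits the tree in place cannot change it after the fact.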
- Address syntax feedback in grammar.py
  - Remove unneeded braces in the file
- Add new types and rewrite functions in likelihood.py
  - Added Datum type and rewrote proper type signatures
  - Rewrote `noise_match` to use `aggregate_individual_likelihoods`
- Add new tests for likelihood functions
  - This is to hopefully avoid mysterious errors in the future caused by broken likelihoods
- Created new classes for IB handling
  - Created IBStructure for system information
  - Created IBLanguage for individual languages
  - Added relevant functions for complexity and accuracy
- Added utils file
- Made QOL edits to Grammar and Universe
- Most of the code, especially all of the matrix math, is only lightly tested
- Add expected KL divergence to IBLanguage
- Fix math for I(W; U) in IBLanguage
  - Have confirmed that I(M; U) - I(W; U) == E[D[M; M']]
- Added tests for IB things
  - Very rudimentary, will add specific value checks later
  - All tests are passing
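The identity confirmed above, I(M; U) - I(W; U) == E[D[M; M']], can be checked numerically on random distributions. The sketch below is independent of the IBLanguage code: all names are hypothetical, arrays are assumed to be indexed with meanings as rows, and every probability is strictly positive so that the logarithms are defined.

```python
import numpy as np


def mutual_information(p_xy):
    """I(X; Y) in bits for a joint distribution with strictly positive entries."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    return float(np.sum(p_xy * np.log2(p_xy / (p_x * p_y))))


rng = np.random.default_rng(0)
n_m, n_w, n_u = 4, 3, 5
p_m = rng.dirichlet(np.ones(n_m))              # p(m), shape (M,)
p_u_m = rng.dirichlet(np.ones(n_u), size=n_m)  # p(u|m), shape (M, U)
q_w_m = rng.dirichlet(np.ones(n_w), size=n_m)  # q(w|m), shape (M, W)

q_mw = p_m[:, None] * q_w_m                    # joint q(m, w), shape (M, W)
q_w = q_mw.sum(axis=0)                         # q(w), shape (W,)
m_hat = (q_mw / q_w).T @ p_u_m                 # reconstructed meanings q(u|w), shape (W, U)

i_mu = mutual_information(p_m[:, None] * p_u_m)
i_wu = mutual_information(q_w[:, None] * m_hat)

# KL(p(u|m) || m_hat_w) for every (m, w) pair, then its expectation under q(m, w).
kl = np.sum(p_u_m[:, None, :] * np.log2(p_u_m[:, None, :] / m_hat[None, :, :]), axis=2)
expected_kl = float(np.sum(q_mw * kl))

assert np.isclose(i_mu - i_wu, expected_kl)
```

The same check works with natural logs, as long as one base is used for all three quantities.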
- Reformatted various files
- First attempt at writing the optimization function
  - The function appears to be right, but is not producing desired results
- Optimization appears to work
  - There is a strange normalization step in there which should not be needed, but it breaks without it
- TODO: Add error to require structure.mu has no 0s
  - This was one of the main issues which was stopping optimization earlier
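The TODO above asks for an error when `structure.mu` contains zeros. One possible shape for that check is sketched below. It assumes `mu` holds the conditional distributions p(u|m) with one row per meaning, which is a guess based on the later rename to `pum`. `validate_conditional` is a hypothetical helper, not part of ULTK.

```python
import numpy as np


def validate_conditional(mu):
    """Return mu as a float array, or raise if a row is not a strictly positive distribution."""
    mu = np.asarray(mu, dtype=float)
    if mu.ndim != 2:
        raise ValueError("expected a 2-D array with one row per meaning")
    if np.any(mu <= 0):
        # Zeros make log(mu) infinite, which then poisons the KL terms
        # used during optimization.
        raise ValueError("all entries must be strictly positive")
    if not np.allclose(mu.sum(axis=1), 1.0):
        raise ValueError("each row must sum to 1")
    return mu
```

Failing early like this turns a silent divergence during optimization into an explicit error at construction time.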
- Allow dropping expressions during recalculation
  - Languages end up recalculating to the simplest language over time
  - This seems to be in part an issue with the optimization algorithm
- Add calculate optimal
- Add random expression distribution generator
- When being created, the transpose was actually being made
- This causes the convergence to seem to work properly now
  - Normalization is still needed, which is really strange
- Tests need to be updated
- Structure should probably ensure that there are no 0s in mu
- Stop checking based on an incorrect metric
- Verify definitions for complexity and accuracy
  - Move use of the mutual information function into ib_utils
- Change language and structures from taking in Meanings and Dicts to ndarrays
  - There are new helper functions to convert from those into arrays
  - This saves on processing time
- Remove test file
  - Avoid testing errors until after everything is fully implemented
- TODO: Optimization function appears to be doing opposite of what is wanted, investigate
- Fixed issue where one normal was calculated for all meanings
  - This fixes the issue where reconstructed qwm needed to be normalized
- Added divergence_array to IBLanguage to reduce time cost
- Optimizer still does not work, but one step closer
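The normalization fix above corresponds to the per-meaning normalizer in the IB update of Tishby et al., where q(w|m) is proportional to q(w) exp(-beta * D[m || m_hat_w]) and each meaning has its own normalizing constant. A minimal sketch of one update step follows. The function name, argument names, and array orientation (meanings as rows) are assumptions for illustration, not the code in this PR, and all inputs are assumed strictly positive.

```python
import numpy as np


def ib_update(p_m, p_u_m, q_w_m, beta):
    """One IB update of the encoder q(w|m), shape (M, W)."""
    q_mw = p_m[:, None] * q_w_m              # joint q(m, w)
    q_w = q_mw.sum(axis=0)                   # expression prior q(w)
    m_hat = (q_mw / q_w).T @ p_u_m           # reconstructed meanings q(u|w), shape (W, U)
    # KL(p(u|m) || m_hat_w) for every (m, w) pair, in nats.
    log_ratio = np.log(p_u_m[:, None, :]) - np.log(m_hat[None, :, :])
    kl = np.sum(p_u_m[:, None, :] * log_ratio, axis=2)
    unnorm = q_w[None, :] * np.exp(-beta * kl)
    # One normalizer per meaning (per row), not a single global constant.
    z = unnorm.sum(axis=1, keepdims=True)
    return unnorm / z


rng = np.random.default_rng(0)
p_m = rng.dirichlet(np.ones(4))              # p(m)
p_u_m = rng.dirichlet(np.ones(5), size=4)    # p(u|m)
q_w_m = rng.dirichlet(np.ones(3), size=4)    # initial q(w|m)
q_next = ib_update(p_m, p_u_m, q_w_m, beta=2.0)
```

With a single global normalizer (`unnorm.sum()` over the whole array), the rows of the result would not sum to 1, which matches the symptom of the reconstructed encoder needing a separate normalization step.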
- Untested
- Allows for multithreading
- Fix potential issue with change of base log
  - Program seems to optimize for the right hyperparameter
  - Issue is that it finds local minima too fast
    - May be alleviated with log probability (?)
- Move from ultk/language/ib to ultk/ib
  - There is little relation to the ultk/language package anymore
Add new features for Metropolis-Hastings sampler
- Confirm that functions work as expected
- Add normalization to language reconstructed meanings and expression prior
  - This is to avoid floating point rounding error at small numbers
- Add deterministic annealing function
- TODO: Clean up names and add docstrings
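The deterministic annealing function added above is not shown in this log, but the general pattern can be sketched as a warm-started sweep over beta. Everything in this block is hypothetical: `anneal`, its parameters, and the `update` callback (a stand-in for whatever single-step encoder update the optimizer uses) are illustrative names, and the convergence test is one simple choice among many.

```python
import numpy as np


def anneal(q_init, betas, update, max_iter=200, tol=1e-9):
    """Iterate `update` to a fixed point at each beta, warm-starting from the previous one.

    `betas` is visited in the order given, and `update(q, beta)` must return
    an array with the same shape as `q`.
    """
    q = np.asarray(q_init, dtype=float)
    solutions = []
    for beta in betas:
        for _ in range(max_iter):
            q_next = update(q, beta)
            converged = np.max(np.abs(q_next - q)) < tol
            q = q_next
            if converged:
                break
        # Store a copy so later iterations cannot alias earlier results.
        solutions.append((beta, q.copy()))
    return solutions
```

Reusing the previous solution as the starting point is what distinguishes annealing from running independent optimizations at each beta, and it tends to help when a fresh random start would settle into a poor local optimum.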
- Add docstrings to all functions and classes
- Rename structure.mu to structure.pum
  - This is to have it make more sense
- Added IB Tests file
- Made fixes to various files
- Fix recalculate_language test
- Add missing tests for `ib_language.py`
- Add docstrings for get_optimal_languages
@nathimel nathimel force-pushed the main branch 4 times, most recently from b33a0c7 to a17b214 Compare June 24, 2025 23:56
@nathimel nathimel closed this Jun 25, 2025
8 participants