GitHub - agybarra/HypergraphResearch: Senior research project analyzing the social relationships and general trends between adult female elephants located in a nature preserve.

Elephant Social Network Analysis

Overview

This repository contains my senior mathematics research project, which analyzes the social relationships and behavioral trends among adult female elephants in a protected nature preserve. The project applies methods from graph theory, ecology, and hypergraph modeling to better understand changing group dynamics, relationship strength, and patterns of interaction over time.

Objectives:

- Model elephant social interactions using mathematical and computational tools

- Identify patterns in group associations and relationship stability over time

- Explore trends in group size, centrality, and clustering over time

- Visualize and interpret social structures using Python-based data analysis and graph visualization tools

Methods & Tools

Languages: Python

Key Libraries: hypernetx, pandas, networkx, matplotlib

Mathematical Concepts: Graph theory, hypergraphs, modularity analysis, clustering algorithms

Data: Social association data collected from observations within a nature preserve

Results and Script Details

IMPORTANT NOTE

For the sake of simplicity, both algorithms have been run on the 'resident' individuals in the study and any elephants they are connected to. That is, the hypergraph is built from the elephants that had sightings in every year of the study, plus any elephants they were seen with. The scripts have not been run on the full population, as that would have caused confusion while developing this project. In theory, the algorithms would behave the same whether or not they are given the full population, but the results and further analysis in this project focus mainly on the resident hypergraphs.

Main Clustering Algorithm

text is a script that creates and visualizes a hypergraph for each year of the study. Clustering is done with the HypernetX package and its built-in clustering algorithms. Intra-cluster edges are color coded to match the nodes of their cluster, while inter-cluster edges are drawn in gray. There are no duplicate edges in the hypergraph because of the HypernetX function 'collapse_edges', which aggregates duplicate edges and sums their weights. Initially, each edge has a weight of 1 (representing one sighting), so an edge with weight n represents n sightings of the individuals in that hyperedge. Weights equal to 1 are omitted from the visualization for clarity. The visualizations, stored in text, are hard to read unless zoomed in, but once zoomed in they show the inner workings of each cluster in detail, including how individual nodes contribute to the intra-cluster edges. The results, stored in text, indicate that the HypernetX algorithms greatly increased both the modularity and the number of communities detected. Each elephant's cluster assignment per year can be found at text
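The duplicate-edge aggregation described above can be illustrated without HypernetX. The following is a minimal plain-Python sketch, not the HypernetX implementation of 'collapse_edges'; the function name `collapse_duplicate_edges` and the sample sightings are hypothetical and serve only to show how identical hyperedges merge into one edge whose weight counts the sightings.

```python
from collections import Counter


def collapse_duplicate_edges(sightings):
    """Merge sightings with identical membership into one weighted edge.

    Each sighting starts with weight 1, so the resulting weight is simply
    the number of times that exact group of elephants was seen.
    """
    return dict(Counter(frozenset(group) for group in sightings))


# Hypothetical sightings for a single year (not real study data).
sightings = [
    {"A", "B"},
    {"B", "A"},        # same group as above, so it is a duplicate edge
    {"A", "B", "C"},
    {"D"},
]

weighted_edges = collapse_duplicate_edges(sightings)

# Mirror the visualization choice of hiding weights equal to 1.
edge_labels = {edge: w for edge, w in weighted_edges.items() if w > 1}

for members, weight in weighted_edges.items():
    print(sorted(members), weight)
```

Using `frozenset` as the key makes the comparison independent of the order in which elephants are listed in a sighting.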

Condensed Clustering Algorithm

text uses the same initial hypergraph and clustering algorithms as above. There are no duplicate edges in the hypergraph because of the HypernetX function 'collapse_edges', which aggregates duplicate edges and sums their weights. Initially, each edge has a weight of 1 (representing one sighting), so an edge with weight n represents n sightings of the individuals in that hyperedge. Weights equal to 1 are omitted from the visualization for clarity. As a further step, the intra-cluster hyperedges and the inter-cluster hyperedges of each cluster are each aggregated into a single hyperedge whose weight is the sum of the aggregated weights. The weight thus represents the total sightings of all elephants in the respective cluster for that year. For example, if a cluster contains elephants [A, B, C], where A was seen 3 times, B was seen 2 times, and C was seen 2 times, the intra-edge for this cluster would have a weight of 7. The inter-edges are handled similarly: if there is a second cluster [D, E], and A and D were seen together 2 times, then the inter-hyperedge between the two clusters would have a weight of 2. The visualizations, stored in text, are clearer than the main clustering visualizations, but they lose some detail about the behavior of individual elephants. The results, stored in text, indicate that the HypernetX algorithms greatly increased both the modularity (though often a bit less than for the non-condensed clusters) and the number of communities detected. Each elephant's cluster assignment per year can be found at text
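The condensing step can be sketched as follows. This is one possible reading of the description, under the assumption that an edge is "intra" when all of its members share a cluster and "inter" otherwise; the function `condense`, the cluster ids, and the edge weights are hypothetical, chosen so that the totals match the weights of 7 and 2 in the example above.

```python
from collections import defaultdict


def condense(weighted_edges, cluster_of):
    """Sum edge weights into one intra-edge per cluster and one
    inter-edge per set of clusters that an edge spans."""
    intra = defaultdict(int)  # cluster id -> total weight
    inter = defaultdict(int)  # frozenset of cluster ids -> total weight
    for members, weight in weighted_edges.items():
        clusters = frozenset(cluster_of[m] for m in members)
        if len(clusters) == 1:
            (cluster_id,) = clusters
            intra[cluster_id] += weight
        else:
            inter[clusters] += weight
    return dict(intra), dict(inter)


# Hypothetical collapsed hyperedges (weights are sighting counts).
weighted_edges = {
    frozenset({"A", "B"}): 3,
    frozenset({"B", "C"}): 2,
    frozenset({"A", "C"}): 2,
    frozenset({"A", "D"}): 2,  # spans both clusters
    frozenset({"D", "E"}): 1,
}
cluster_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}

intra, inter = condense(weighted_edges, cluster_of)
print(intra)  # weight of the single intra-edge of each cluster
print(inter)  # weight of each inter-edge between clusters
```

Keying the inter-edges by the set of clusters they touch means an edge spanning three clusters gets its own entry instead of being forced into a pairwise one.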

Cluster Comparison Algorithm

This script compares the results of the hypergraph model with those of the regular graph model and checks how consistent they are. The results can be found at text. The adjusted Rand index (ARI) score and the mutual information score are independent of the cluster labels and measure the similarity of the groupings: the ARI score measures pairwise agreement, and the mutual information score measures the information shared between the two groupings. Both metrics show very low similarity between the hypergraph clusterings and the graph clusterings. In addition, the confusion matrices found at and represent the basic pairwise agreement, similar to the ARI. Both the raw counts and the ratios are listed, comparing the true and false positives and negatives for the consistency between the hypergraph clusterings and the regular graph clusterings.
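To make the pairwise view concrete, here is a small from-scratch sketch of a pair confusion count and the ARI computed from it. It is an illustration only and may differ from how the script computes these scores; treating the first labeling as the reference (so that "false negative" means a pair grouped together only in the first labeling) is an arbitrary convention assumed here, and the label lists are made up.

```python
from itertools import combinations


def pair_confusion(labels_a, labels_b):
    """Count element pairs by whether each clustering groups them together."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            tp += 1   # together in both clusterings
        elif same_a:
            fn += 1   # together only in the first
        elif same_b:
            fp += 1   # together only in the second
        else:
            tn += 1   # apart in both
    return tp, fp, fn, tn


def adjusted_rand_index(labels_a, labels_b):
    """ARI from pair counts: 1.0 for identical groupings, about 0 for chance."""
    tp, fp, fn, tn = pair_confusion(labels_a, labels_b)
    if fp == 0 and fn == 0:
        return 1.0  # the two partitions are identical
    return 2.0 * (tp * tn - fn * fp) / (
        (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    )


# Hypothetical labels for four elephants under the two models.
hypergraph_labels = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]      # same grouping, different label names
graph_labels = [0, 1, 0, 1]    # a different grouping

ari_same = adjusted_rand_index(hypergraph_labels, relabelled)
ari_diff = adjusted_rand_index(hypergraph_labels, graph_labels)
print(ari_same, ari_diff)
```

Because only "same cluster or not" is compared for each pair, renaming the clusters leaves both the counts and the ARI unchanged, which is the label independence mentioned above.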

Probability Functions

This script finds occurrence counts, marginal probabilities, and conditional probabilities for triples of elephants. Given elephants A and B that are in the same cluster for n years, and an elephant C that was in the same cluster as A and B in the first year, it asks: what is the probability that C will be in the same cluster as A and B across the rest of the study? The timeframe over which A and B must be together can be changed, as can the number of groups examined overall. Violin plots of P(C|A & B) and P(A & B|C) are saved as well.
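As a rough illustration of the conditional probability involved, the sketch below estimates P(C | A & B) as the fraction of years in which C shares a cluster with A and B, out of the years in which A and B share a cluster. This counting definition, the `membership` layout (year to elephant to cluster id), and the sample values are all assumptions made for the example and may not match the script.

```python
def same_cluster(clusters, *elephants):
    """True if every listed elephant is present and shares one cluster id."""
    ids = [clusters.get(e) for e in elephants]
    return None not in ids and len(set(ids)) == 1


def p_c_given_ab(membership, a, b, c):
    """Fraction of the years with A and B together in which C joins them."""
    years_ab = [y for y, cl in membership.items() if same_cluster(cl, a, b)]
    if not years_ab:
        return None  # A and B are never together, so the ratio is undefined
    years_abc = [y for y in years_ab if same_cluster(membership[y], a, b, c)]
    return len(years_abc) / len(years_ab)


# Hypothetical cluster assignments per year (not real study data).
membership = {
    2007: {"A": 0, "B": 0, "C": 0},
    2008: {"A": 1, "B": 1, "C": 2},
    2009: {"A": 0, "B": 0, "C": 0},
    2010: {"A": 0, "B": 1, "C": 0},
}

prob = p_c_given_ab(membership, "A", "B", "C")
print(prob)
```

Cluster ids are only compared within a single year, so they do not need to be consistent from one year's clustering to the next.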

Probability Time Jump

This script analyzes elephant social associations across time by focusing on pairs of elephants A and B that were in the same cluster in 2007. For each pair, a third elephant C is selected from the same cluster in that year, and the script computes the probability that C appears in the same cluster as A and B both in the baseline year and in subsequent years after a specified time jump. It calculates occurrence counts and marginal and conditional probabilities, including Bayesian probabilities. The number of years to examine after the time jump and the number of elephant pairs analyzed can be adjusted. Results are visualized using violin plots that compare P(C | A & B) in the baseline year with its value after the time jump, and a histogram illustrates the distribution of the differences in probabilities over time.
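The baseline-versus-jump comparison might look like the sketch below. It assumes, purely for illustration, that P(C | A & B) is estimated from sightings (the share of groups containing both A and B that also contain C) and that the years after the jump are pooled; the names `jump` and `span`, the data layout, and the sample sightings are hypothetical.

```python
def p_c_given_ab_sightings(groups, a, b, c):
    """Share of the groups containing both A and B that also contain C."""
    with_ab = [g for g in groups if a in g and b in g]
    if not with_ab:
        return None  # A and B never seen together in these groups
    return sum(1 for g in with_ab if c in g) / len(with_ab)


def time_jump_difference(sightings_by_year, a, b, c, baseline, jump, span):
    """P(C | A & B) pooled over `span` years after the jump, minus baseline."""
    before = p_c_given_ab_sightings(sightings_by_year.get(baseline, []), a, b, c)
    later = []
    for year in range(baseline + jump, baseline + jump + span):
        later.extend(sightings_by_year.get(year, []))
    after = p_c_given_ab_sightings(later, a, b, c)
    if before is None or after is None:
        return None
    return after - before


# Hypothetical sightings per year (each set is one observed group).
sightings_by_year = {
    2007: [{"A", "B", "C"}, {"A", "B", "C"}, {"A", "B"}],
    2009: [{"A", "B"}, {"A", "B", "C"}],
    2010: [{"A", "B"}, {"C"}],
}

diff = time_jump_difference(
    sightings_by_year, "A", "B", "C", baseline=2007, jump=2, span=2
)
print(diff)
```

A negative difference means C was seen with A and B less often after the jump than in the baseline year; collecting these differences over many triples would give the kind of histogram described above.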

Keystone Elephant Removal Experiment

This script examines how often groups of three elephants (A, B, and C) are seen together over the years 2007–2011. It first loads the cluster information and daily sightings for each elephant. For each year, it picks triples where C is likely to be seen with A and B (probability 0.5 or greater). It calculates four probabilities for each triple, including P(C|AB), P(AB|C), and P(A or B | not C). Then, it “removes” C from the sightings and recalculates these probabilities to see how the removal changes the relationships. All the results are saved to a text file in the respective experiment folder within text. The script then generates violin plots comparing the distribution of each probability before and after C is removed, showing how removing a single elephant affects dyadic and triadic co-occurrence patterns over time.
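One way to picture the removal step is sketched below for P(A or B | not C). It assumes, as an interpretation rather than a description of the script, that "removing" C means deleting C from every sighting and discarding sightings left empty; the helper names and the sample sightings are hypothetical.

```python
def p_a_or_b_given_not_c(groups, a, b, c):
    """Share of the groups without C in which A or B appears."""
    without_c = [g for g in groups if c not in g]
    if not without_c:
        return None  # C is in every group, so the ratio is undefined
    return sum(1 for g in without_c if a in g or b in g) / len(without_c)


def remove_elephant(groups, c):
    """Delete C from every group and drop groups that become empty."""
    stripped = [set(g) - {c} for g in groups]
    return [g for g in stripped if g]


# Hypothetical daily sightings (each set is one observed group).
groups = [
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
    {"D"},
    {"C"},
]

before = p_a_or_b_given_not_c(groups, "A", "B", "C")
after = p_a_or_b_given_not_c(remove_elephant(groups, "C"), "A", "B", "C")
print(before, after)
```

Under this interpretation, sightings that previously included C now count toward the "not C" condition, which is why the probability can shift after removal. Other interpretations, such as discarding every sighting that contained C, would give different numbers.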

Results to be updated

Acknowledgements

Special thanks to my mentors and the Mathematics Department at UC San Diego for guidance and support.

