Skip to content

enh(Agent): Allow for periodic historical tracking of X state#231

Merged
nicola-bastianello merged 16 commits intoteam-decent:mainfrom
Simpag:x-tracking
Dec 24, 2025
Merged

enh(Agent): Allow for periodic historical tracking of X state#231
nicola-bastianello merged 16 commits intoteam-decent:mainfrom
Simpag:x-tracking

Conversation

@Simpag
Copy link
Contributor

@Simpag Simpag commented Dec 17, 2025

Adds functionality to customize the period for which agent states are saved, useful when the agent state is large in order to prevent filling user memory. Following is a list of further minor enhancements made:

closes #197, closes #199, closes #188, closes #198, closes #212, closes #230, closes #162, closes #194

Copilot AI review requested due to automatic review settings December 17, 2025 01:07
@Simpag Simpag changed the title ref(Agent): Allow for periodic historical tracking of X state enh(Agent): Allow for periodic historical tracking of X state Dec 17, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds configurable periodic history tracking for agent states to reduce memory usage when agent states are large. The key enhancement allows users to specify a history_period parameter (e.g., 10) so that agent states are only recorded every N iterations instead of at every iteration. Additionally, the PR includes several metric calculation improvements: customizable table formatting, computational cost-based x-axis for plots, performance optimizations for table metric aggregation, and progress bars for metric calculations.

Key Changes

  • Modified Agent class to use dict-based history storage with configurable recording periods
  • Added ComputationalCost dataclass for plot metrics to display computational cost instead of iterations on x-axis
  • Refactored table metric calculation to compute data once and apply multiple statistics, improving performance
  • Added MetricProgressBar for user feedback during metric calculations

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
decent_bench/agents.py Refactored Agent to use dict-based x_history with configurable history_period; added x_updates counter to AgentMetricsView
decent_bench/benchmark_problem.py Added agent_history_period field to BenchmarkProblem and create_regression_problem
decent_bench/networks.py Updated Agent instantiation to pass history_period parameter
decent_bench/benchmark.py Added computational_cost parameter to benchmark function; removed Status wrappers for table/plot generation (now use progress bars)
decent_bench/metrics/metric_utils.py Added MetricProgressBar class; updated x_mean and x_error to work with dict-based history
decent_bench/metrics/table_metrics.py Added customizable fmt parameter for value formatting; refactored to calculate data once per metric; added progress bar
decent_bench/metrics/plot_metrics.py Added ComputationalCost dataclass; updated to handle sparse history and scale x-axis by computational cost; added progress bar
test/test_agents.py Added comprehensive test for in-place operations with history tracking across multiple frameworks
docs/source/user.rst Added example usage of agent_history_period parameter
docs/source/developer.rst Removed trailing whitespace
docs/source/api/decent_bench.metrics.metric_utils.rst Added exclude-members directive to hide MetricProgressBar from API docs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Simpag
Copy link
Contributor Author

Simpag commented Dec 19, 2025

Would it be of use to be able to plots over iterations and over costs at the same time? That is for the default plotmetrics you would get 4 plots instead of 2?

@nicola-bastianello
Copy link
Member

Would it be of use to be able to plots over iterations and over costs at the same time? That is for the default plotmetrics you would get 4 plots instead of 2?

I'm not sure it would be good to have it as default, it might be confusing to casual users. but it definitely sounds interesting to have this as a possibility for "advanced" users; so maybe a boolean toggle?

this makes me think about the layout of the plots: currently, the plots are arranged in maximum two columns, each with as many elements as needed. however, since all plots share the same x-axis, it might be better to have a single column that is as long as needed (maybe a warning of poor readability is emitted when more than 4/5 plots need to be plotted). this would also allow having a single x-label, and customizing only y-labels.

with this new layout, then your suggestion would just require creating two columns, one with iterations for the x-axis the other with computational cost

@nicola-bastianello
Copy link
Member

also, currently a legend is printed for each separate subplot; this is redundant information. as part of the layout redesign, we could print only a single legend. and I would actually place it before the first subplot (like figure 1 in https://arxiv.org/abs/2501.13516)

decent-bench allows you to benchmark decentralized optimization algorithms under various communication constraints,
providing realistic algorithm comparisons in a user-friendly and highly configurable setting.

Report any bugs you *may find* to `GitHub <https://github.com/team-decent/decent-bench/issues>`_.
Copy link
Member

@nicola-bastianello nicola-bastianello Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you expand "report any bugs ..." adding "contributions are welcome, see developer guide on how to get started. Please contact Dr. Nicola Bastianello (with link https://bastianello.me/) for discussions". and with this change we can consider #162 closed

@Simpag
Copy link
Contributor Author

Simpag commented Dec 19, 2025

I have pushed the majority of the fixes. Will post the rest later this evening. You can now test the new plots

@nicola-bastianello
Copy link
Member

The new plots look great, thanks!

@Simpag
Copy link
Contributor Author

Simpag commented Dec 19, 2025

Would it be of use to be able to plots over iterations and over costs at the same time? That is for the default plotmetrics you would get 4 plots instead of 2?

I'm not sure it would be good to have it as default, it might be confusing to casual users. but it definitely sounds interesting to have this as a possibility for "advanced" users; so maybe a boolean toggle?

this makes me think about the layout of the plots: currently, the plots are arranged in maximum two columns, each with as many elements as needed. however, since all plots share the same x-axis, it might be better to have a single column that is as long as needed (maybe a warning of poor readability is emitted when more than 4/5 plots need to be plotted). this would also allow having a single x-label, and customizing only y-labels.

with this new layout, then your suggestion would just require creating two columns, one with iterations for the x-axis the other with computational cost

Will all plotmetrics have the same x-label? If not it might be strange to only show the label for one?

@nicola-bastianello
Copy link
Member

nicola-bastianello commented Dec 19, 2025

yes I think all plotmetrics will have the same x-label (either "iterations" or "computational cost" -- or whatever names we decide to use). so it will be ok to only show one (the x-label of the plot at the bottom). we should use the sharex=True option of matplotlib.subplots, and define the x-label of the bottom plot only

@Simpag
Copy link
Contributor Author

Simpag commented Dec 19, 2025

Something like this? This is using ComputationalCost(proximal=2.0, communication=0.1) and I made a boolean called show_cost_and_iterations but I don't like the name. Do you have any suggestions?

If we decide to use the same x-label for all plots then we should remove the x_label property of plot metrics

image

@nicola-bastianello
Copy link
Member

nicola-bastianello commented Dec 20, 2025

yes, this is exactly what I had in mind, looks great! for the x-label of the first column, I would maybe use "time (computational cost units)" and for the second column "iterations"

yes, let's remove the x-label property from plot metrics. it might be interesting to allow users to customize the x-labels of the two columns, but we can think about that later

maybe instead of show_cost_and_iterations we could have compare_iterations_and_computational_cost? I know it's very long, but this is not a trivial thing to understand so it would benefit from a clear name

another thing: when we have two column, the y-label of the plots in each row is going to be the same. in that case we could keep it only for the first column, and also use sharey=True

@Simpag
Copy link
Contributor Author

Simpag commented Dec 20, 2025

Should we maybe have grid on too? I have turned it on and I think it helps with readability, let me know what you think

@Simpag
Copy link
Contributor Author

Simpag commented Dec 20, 2025

MyPy is failing because it is using networks.py from the main branch for some reason (refers to line 521 which doesnt exist in this PR). This issue is mentioned by Elias in some issue, it passes mypy on my machine. I dont have access to modify the checks so I cannot fix this nor do I have a lot of experience with github checks.

From what I can find the workflow should be updated to add:

with:
      ref: ${{ github.event.pull_request.head.sha }}

after the uses: actions/checkout@v4 for each of the jobs (under the step named name: Check out code). For example mypy would become (all others should be updated too):

mypy:
    runs-on: macos-latest

    steps:
      - name: Check out code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Install tox
        run: pip install tox

      - name: Run mypy
        run: tox -e mypy

@nicola-bastianello
Copy link
Member

Should we maybe have grid on too? I have turned it on and I think it helps with readability, let me know what you think

yes, that's a good idea, let's have it on by default. but maybe we can also have a toggle to disable it

@nicola-bastianello
Copy link
Member

MyPy is failing because it is using networks.py from the main branch for some reason (refers to line 521 which doesnt exist in this PR). This issue is mentioned by Elias in some issue, it passes mypy on my machine. I dont have access to modify the checks so I cannot fix this nor do I have a lot of experience with github checks. ......

this is very annoying.. Elias is not available for fixing this until mid-January I think, and I don't know how to do it. does syncing your branch with the current main work as a temporary fix?


def common_sorted_iterations(agents: Sequence[AgentMetricsView]) -> list[int]:
"""
Get a sorted list of all common iterations reached by agents in *agents*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would maybe add a bit more context, like: "since the agents can sample periodically, and potentially at different times, this function can be used to find the numbers of iterations where all agents have recorded their states, which can then be used to compute the metrics"

subplot.legend()

if len(metrics) > 4:
LOGGER.warning(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's open an issue for the following addition: creating several figures, each with 4 plots, when more than 4 metrics are plotted. alternatively, we could provide a boolean toggle to plot each metric in a separate figure. in any case when we implement storing the results users will be able to create their own custom plots

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we have a sequence of plot metrics to specify what you want to plot, what if we changed it to be nested lists? Then the elements would be figures and then the second list would be which metrics in each figure? Like

plot_metrics=[
    [Regret, GradientNorm],
    [PlotMetric3, PlotMetric4, PlotMetric5],
]

Would yeild 2 figures with the first having Regret and GradientNorm and the second figure having three plots.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer that the user still only specifies a list, and the framework takes care of the plotting details. otherwise it would be a little less user friendly

but again let's discuss in the future, there is not rush to change this

Copy link
Contributor Author

@Simpag Simpag Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why I would like that the user specifies what is contained in one figure is that a user might want certain subplots in one figure. If the framework decides then they would need to run multiple plotting runs (when we save the final state).

Could make it so that both work. Maybe have an int specifying maximum number of plots per figure and list of lists also work to manually split them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's good to leave the users more freedom to decide what to do with the plots. however, I have some doubts about having so many options (and their docs), because I think it might get confusing quite quickly and put off some users. in a sense, if we provide this plotting functionality we are providing a non-trivial interface to matplotlib, which is harder to maintain in the long run. I say non-trivial because it's not just passing the same kwargs of matplotlib. this is the same discussion as networkx in #233

we also need to consider whether users would use these plots directly in papers or if they would want to customize them. I personally would likely look at the plots coming out of decent-bench to decide what to display in papers, and then create my own using the data stored after the benchmark has executed. this is because I often need to control the fontsize, add a reference number like [4] to legen labels, etc. but if we want to provide this level of customizability to the plots, then essentially we are reproducing matplotlib or providing a complex interface to it, which is not a good idea. and we have just seen that creating plots is hard and subplots are especially messy

so to conclude: I'm not against having some additional options to customize the plots, but I don't want them to be too complicated, since I expect users will create their own plots afterwards. I also don't want to have to maintain complex plotting functionality in the code-base. so for now I would keep the warning, and open an issue copy-pasting this discussion

once my colleagues (hopefully) start using the code-base, I can also ask them what they would like on the plotting side

Copy link
Contributor Author

@Simpag Simpag Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe then instead of having customizable plots we could export the data used to create the plots so that users can customize their own plots? Creating the data for the plots is not very trivial as the structurer is rather nested, having the pre-calculated data would be much faster and just export it as a csv. We should probably have a logs folder anyway where we store all of the important console outputs and such from each run

Copy link
Member

@nicola-bastianello nicola-bastianello Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think this is a very good idea. let's aim for two things: 1) a simple plot provided by us at the end of the benchmark execution, with limited customizable options (I would keep the option of having iterations and computational cost), 2) the data is also exported in a log/results folder, so that users can create their own plots; in my experience storing data as ndarrays in a npz file works well, but also csv works, although we likely need to use pandas to import it.

but 2) requires a lot of work, so I would open an issue and leave it for the future (also related to #217)

Copy link
Contributor Author

@Simpag Simpag Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can include this into the metric union (table + plot) issue. Exporting to csv is more general and easier to work with than numpy data imo. A csv also allows users to quickly inspect it, there are (think built in) easy libraries to import csv files as dictionaries in python using the csv module but pandas is much easier to work with and allows for very easy plotting.

The logs folder could include full console log, a file for the table metrics as latex format and the plot data + image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points! sounds good to go for csv files then. and yes, let's include this into #220

@nicola-bastianello
Copy link
Member

nicola-bastianello commented Dec 21, 2025

there are some issues with the figure as rendered on my laptop: 1) the x label half disappears under the bottom border, 2) the y labels are not printed, 3) the legend hangs over the top plot

here is what I see, both figure as it appears and figure maximized to full screen

Figure_1 Figure_2

@Simpag
Copy link
Contributor Author

Simpag commented Dec 21, 2025

MyPy is failing because it is using networks.py from the main branch for some reason (refers to line 521 which doesnt exist in this PR). This issue is mentioned by Elias in some issue, it passes mypy on my machine. I dont have access to modify the checks so I cannot fix this nor do I have a lot of experience with github checks. ......

this is very annoying.. Elias is not available for fixing this until mid-January I think, and I don't know how to do it. does syncing your branch with the current main work as a temporary fix?

It will solve it but there really isnt a point unless there are major merge conflicts, I've already made sure it passes mypy before commiting.

@Simpag
Copy link
Contributor Author

Simpag commented Dec 21, 2025

there are some issues with the figure as rendered on my laptop: 1) the x label half disappears under the bottom border, 2) the y labels are not printed, 3) the legend hangs over the top plot

here is what I see, both figure as it appears and figure maximized to full screen
Figure_1 Figure_2

That is really annoying... I guess matplotlib is dependent on screen resolution. As you can see above they are square for me. I'll have to take a look at it, I assumed the padding and such was ratio based? What is your screen resolution? Could you also provide me with your script

Edit: y-label not showing was because of my logic mistake, I did if use_cost and compare_iterations_and_computational_cost and i % 2 == 0: so it only set if you use cost and I didnt notice it

@nicola-bastianello
Copy link
Member

great that the y-label is fixed.

I have 1920x1080 screen resolution, and I just sent you the script by email

@nicola-bastianello
Copy link
Member

MyPy is failing because it is using networks.py from the main branch for some reason (refers to line 521 which doesnt exist in this PR). This issue is mentioned by Elias in some issue, it passes mypy on my machine. I dont have access to modify the checks so I cannot fix this nor do I have a lot of experience with github checks. ......

this is very annoying.. Elias is not available for fixing this until mid-January I think, and I don't know how to do it. does syncing your branch with the current main work as a temporary fix?

It will solve it but there really isnt a point unless there are major merge conflicts, I've already made sure it passes mypy before commiting.

ok then for now let's solve it this way, and we can talk about it with Elias in January

@Simpag
Copy link
Contributor Author

Simpag commented Dec 22, 2025

I have hopefully solved it now, I made a lot of changes to hopefully make it more stable. The plots might be very big on lower resolutions if more than 2 plot rows are used but I kept this because I wanted the plots to keep the same shape and size independent from how many plots you show. Can make the individual plots smaller but that would require to use a smaller font size and some other minor modifications. Let me know what you think. Also moved away from plt.tight_layout to layout="constrained" as per matplotlibs documentation recommendation as apparently plt.tight_layout was their first layout engine and its not the best.

@nicola-bastianello
Copy link
Member

nicola-bastianello commented Dec 23, 2025

thanks for the hard work, the plots have definitely improved. when I maximize the figure window everything displays as intended. however, the window as it is first displayed (not maximized) is too high for the screen, since it disappears under the Windows task bar at the bottom (the upper edge of the figure is aligned with the top of the screen):

Capture

but when I save the figure, the x-labels are correctly displayed and not cut off

Figure_1

so I think this is a matter of the height of the figure being too high, because the usable screen space is smaller due to the taskbar

@Simpag
Copy link
Contributor Author

Simpag commented Dec 23, 2025

That is still not how it looks on my end. I spent a lot of time making sure that the plots looked the same no matter if you had 1 plot or 4 and that the label box never clipper into the plots. I copied the logic from matplotlib source code that calculates the height of certain parts and did empirical testing for some things that were too complex to find anything on. Are you using any kind of zoom/scaling in windows?

@nicola-bastianello
Copy link
Member

Are you using any kind of zoom/scaling in windows?

no, I'm not using any zoom/scaling (at least as far as I was able to verify in Window's messy settings). one thing to notice is that I'm still on windows 10 on the KTH laptop

anyway, this is a very messy problem and I don't think we can easily solve it ourselves. I was looking a bit more at matplotlib options, and it looks like there might be way to specify legend position with loc="outside upper center", see here. it looks like this is available in the stable version 3.10. if this works I think it's the best way to do it, so that we delegate the messy plotting entirely to matplotlib

@Simpag
Copy link
Contributor Author

Simpag commented Dec 23, 2025

I think I will remove everything that has to do with trying to make the plots have the same size regardless of how many plots you show and instead just have a fixed window size (like how matplotlib does it by default) and fit the plots into that. If these plots are more meant as a quick overview for how well an algorithm works then they dont have to bee too pretty.

I will also take a look at that "outside upper center". I have not seen that before and if it exists they havent updated their loc error because it prints all the possible values and it is not in there. Might have been because I was using tight_layout and it seems like those locations are new for the constrained layout.

@nicola-bastianello
Copy link
Member

sounds good, thank you! I also find that the new loc option is not widely documented, but it seems to be in the stable version.. let's see if it works

@Simpag
Copy link
Contributor Author

Simpag commented Dec 23, 2025

Lets hope this works, let me know! Also, interestingly the outside location is not in their error message, see:
ValueError: 'outsidee upper center' is not a valid value for loc; supported values are 'best', 'upper right', 'upper left', 'lower left', 'lower right', 'right', 'center left', 'center right', 'lower center', 'upper center', 'center'

@nicola-bastianello
Copy link
Member

I confirm that this works as intended, well done! last thing: could you please update the example plot in the user guide? then I'll merge

@nicola-bastianello nicola-bastianello merged commit 607a6a2 into team-decent:main Dec 24, 2025
7 checks passed
@Simpag Simpag deleted the x-tracking branch January 19, 2026 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants