[newchem-cpp] Document Multi-Grain Dust Model & Lack of Generalizability

My apologies if this all comes off as very "rambly." I didn't anticipate that the issue would be this long and at a certain point, I just wanted to move on. With that said, I'm more than happy to clarify anything that doesn't makes sense

-----
# Overview
I'm opening this issue after familiarizing myself with  Appendix A of [Chiaki & Wise 2019](https://ui.adsabs.harvard.edu/abs/2019MNRAS.482.3933C/abstract) (and parts of section 2 from [Chiaki+15](https://ui.adsabs.harvard.edu/abs/2015MNRAS.446.2659C/abstract)), which described the clever physical model that is implemented by the multi-grain-chemistry dust-model (in Grackle's `newchem-cpp` branch). 

First, I will [define the model.](#defining-the-model) For a well-defined volume of space, the model is able to leverage some simple assumptions to model dust properties using a sparse number of variables to model dust growth and dust properties extremely accurately for a given region of space.

Unfortunately, these simple assumptions **severely** require the applicability of the model to a class of simulations that obey a highly restricted set of conditions:
- *Most importantly* I [will discuss](#incapacity-to-efficiently-model-grain-growth-and-destruction) how is **IMPOSSIBLE** to efficiently generalize the model to also model dust destruction.
- Furthermore, the underlying [assumptions will be violated](#violating-assumptions) outside of very specialized simulations.

This purpose of this issue is to:
1. Document the underlying assumptions of the dust grain model, its limitations, and the assumptions that user must obey. (If nothing else, it ensures that all of us are all on the same page, and can be referenced by other issues)
2. Possibly initiate a discussion for the feature-set that Grackle should ultimately support. As we have previously discussed, the current implementation provides lots of "knobs" and customizability. Given the model's inherent limitations, most of the knobs probably provide (at most) marginal value.[^cut-functionality] (At this point, I do think we should transcribe everything to C++, and cut functionality afterwards). probably provide marginal I think we should fully transcribe the model, I think)
3. *Possibly* serve as a source for the narrative documentation that would accompany the multi-grain-chemistry dust-model. While the tone may need to be reframed, I think it's essential that we present all of the limitations to users.

# Defining the Model
Before anything else, let's explicitly clarify that the physical model assumes that we are consider a one-zone model (i.e. we are describing the evolution of volume-averaged properties for a single "zone" or a region of space with finite volume). This obviously makes sense because this is how Grackle works (i.e. it treats chemical evolution for each cell or particle *completely* independently). 

To simplify our notation and explanation, most of this section only considers a single grain species. It is trivial to generalize from 1 to many species.

We organize this section into several parts:
1. Defining grain properties.
2. Modeling how grains evolve.
3. Treatment of grain creation.
4. What Grackle actually implements
## 1. Defining grain properties
In this subsection, we define some physical properties of interest. (Reminder: The discussion assumes a single grain-species). Throughout this discussion, we assume that an individual grain is a sphere with radius $r$, and that every grain has the same density $\zeta$. 

The quantities that get evolved are volume averaged. At a time, $t$, our dust-grain species has a mass density $\rho_{\rm grain}(t)$, a number density $n_{\rm grain}(t)$ and a differential grain-size distribution function $\phi_t(r)$.

It's instructive to briefly elaborate on some details about $\phi_t(r)$:
- The function is normalized such that $\int_0^\infty \phi_t(r) dr = 1$.
- the number density of this dust grains with radii between $r$ and $r+dr$ (of the considered species) is $n_{\rm grain}(t)\phi_t(r)dr$.
- let's define the $p$th order moment as $\langle r^p\rangle_t = \int_0^\infty r^p \phi_t(r)\, {\rm d}r$. For context, the grain's species average cross-section and average volume are $\pi\langle r^2\rangle_t$ and $(4\pi/3)\langle r^3\rangle_t$. 
- the average cross-section is important for heating/cooling, H2 formation, etc.

Finally, $\rho_{\rm grain}(t)$, $n_{\rm grain}(t)$, and $\phi_t(r)$ are related by:
$$n_{\rm grain}(t) = \frac{\rho_{\rm grain}(t)}{(4\pi/3)\, \zeta\, \langle r^3\rangle_t}$$
## 2. Modeling how grains evolve. 
The crux of the model is a simple (and clever) description of how the grain species evolves. 

> [!important]
> Within the context of this subsection, "evolution" is only concerned with what happens **AFTER** grains are formed (the next subsection relates to the "how" of dust creation)

Let's reframe this slightly. At some time $t_0$ we know the properties $\rho_{\rm grain}(t_0)$ and $\phi_0(r)$, and at a subsequent time, $t$, we need to describe the values of $\rho_{\rm grain}(t)$ and $\phi_t(r)$.

The model essentially makes the following 2 assumptions:
1. any difference between $\rho_{\rm grain}(t_0)$ and $\rho_{\rm grain}(t)$ is entirely explained by local dust-evolution processes that do **NOT** change grain number density.
2. the differential grain-size distribution function $\phi_t(r)$ is not allowed to change shape, but it can be translated.

Let's consider the consequences of these assumptions from a mathematical perspectiv. The 1st assumption lets us use the definition of $n_{\rm grain}$ in terms of  $\rho_{\rm grain}(t)$ and $\phi_t(r)$ in order to write:
$$\frac{\rho_{\rm grain}(t_0)}{(4\pi/3)\, \zeta\, \langle r^3\rangle_0} = \frac{\rho_{\rm grain}(t)}{(4\pi/3)\, \zeta\, \langle r^3\rangle_t},$$
or equivalently, $\langle r^3\rangle_t = \langle r^3\rangle_0 \left(\rho_{\rm grain}(t) / \rho_{\rm grain}(t_0)\right)$. The 2nd assumption indicates that $\phi_t(r)=\phi_0(r-\delta r(t))$. We can solve for $\delta r$ by leveraging $\langle r^3\rangle_t = \langle r^3\rangle_0 \left(\rho_{\rm grain}(t) / \rho_{\rm grain}(t_0)\right)$.

What does this mean from a physical perspective? It means that **any change** in $\rho_{\rm grain}(t_0)$ and $\rho_{\rm grain}(t)$ **MUST** be entirely accounted for by a corresponding increase or loss in mass in every existing grain. Importantly, this process changes a grain's radius at rate that is independent  of the current radius (i.e. $\frac{dr}{dt}$ that is constant with respect to $r$). 

It is not a coincidence that the process of grain growth precisely satisfies these requirements.[^unintuitive-growth]

The clever part of all this is that:
- you don't actually need to keep track of the full details of $\phi_0(r)$. In practice, you just need to know the first several moments of the distribution: $\langle r^3\rangle_0$, $\langle r^2\rangle_0$, and $\langle r\rangle_0$.
- This is because:
  - Eqn A3 of  [Chiaki & Wise 2019](https://ui.adsabs.harvard.edu/abs/2019MNRAS.482.3933C/abstract) demonstrates that the value of $\delta r(t)$ is the root of the [cubic equation](https://en.wikipedia.org/wiki/Cubic_equation), where the coefficients are tied to $\langle r^3\rangle_t$, $\langle r^3\rangle_0$, $\langle r^2\rangle_0$, and $\langle r\rangle_0$.
  - And, the quantity that we are primarily interest in deriving from $\phi_t(r)=\phi_0(r-\delta r(t))$ is the cross section $\pi \langle r^2\rangle_t$, and $\langle r^2\rangle_t = \langle r^2\rangle_0 + 2\langle r\rangle_0\, \delta r (t) + \left(\delta r (t)\right)^2$. 
- There is maybe 1 compromise. I *think* we need to assume optical opacity is independent of $\delta r(t)$, but I haven't checked.
## 3.  Treatment of Grain creation

The model assumes that grains are **ONLY** created through one or more well-defined injection pathways. Let's consider `n_inject_path` injection paths. For the most part, each pathway corresponds to a model of a different supernova. However, Grackle also implements a pathway that corresponds to local ISM conditions.

For the $j$th pathway:
- let $\rho_{{\rm inject},j}$ denote the total (non-primoridal) mass density injected into the zone by the pathway
- let $f_{{\rm grain},j}$ specify the fraction of $\rho_{{\rm inject},j}$ that directly contributes to $\rho_{\rm grain}$
- let $\Phi_j(r)$ describe the initial differential grain-size distribution of a grain species immediately after it was injected by the pathway, and let's denote $\langle r^p\rangle_{0,j}$ as the $p$th moment of $\Phi_j(r)$

We can relate these quantities to the variables we previously defined. If all pathways injected material at a time $t_0$, then
- the initial mass density is $\rho_{\rm grain}(t_0) = \sum_j f_{{\rm grain},j} \rho_{{\rm inject},j}$
- the initial number density of the grain species contributed by a pathway is $n_{{\rm grain},j}=\left(3\, f_{{\rm grain},j}\, \rho_{{\rm inject},j}\right) / \left(4\pi\, \zeta\, \langle r^3\rangle_{0,j}\right)$
- The initial grain size distribution is $\phi_0(r) = \left(\sum_j n_{{\rm grain},j} \Phi_j(r)\right)/\left(\sum_j n_{{\rm grain},j}\right)$
## 4. What Grackle actually implements

So how does Grackle actually implement the model? In practice we don't explicitly track the number density of a grain species

Let's focus on the variables used to actually track the state of each grain species 
- For each grain species, Grackle tracks the current mass density. Suppose there are `n_grain_species` different grain species (our discussion thus far has just assumed a single species, but this is easy to generalize)
- It tracks the cumulative (non-primordial) mass density injected by each of the `n_inject_path` injection paths injection pathway $\rho_{{\rm inject},j}$
- for each pathway, Grackle also uses precomputed tables of values. Keep in mind that a table of a single quantity for pathway $j$ requires at least `n_grain_species` different entries (an entry for each grain-species). We list the tabulated quantities of pathway $j$:
  - $\langle r^3\rangle_{0,j}$, $\langle r^2\rangle_{0,j}$, and $\langle r\rangle_{0,j}$ (i.e. moments of the size distribution of the grain when they are initially injected).[^moment-table]
  - $f_{{\rm grain},j}$ (i.e. the fraction of the injected (non-primordial) mass density $\rho_{{\rm inject},j}$ that directly contributed to a grain species's mass density).
  - the optical opacity of the grain-species (it depends on the initial size distribution)
Let's quickly summarize. To model and evolve grain species densities in multiple (independent) one-zone models, simultaneously, we track `n_grain_species + n_inject_path` per zone plus a few tables that are reused between zones

It's instructive to briefly consider a naive alternative approach. A simple alternative that describes grain size distribution might track the mass density of a grain species in multiple size bins. If the number of size bins is `n_size_bins`, then we would need to track `n_grain_species * n_size_bins` size bins.

**The key point of this comparison:** Grackle is implementing a model that uses has `O(n_grain_species + n_inject_path)` memory complexity per zone rather than `O(n_grain_species * n_size_bins)`.  Additionally, `n_size_bins` probably needs to be pretty big for the naive alternative to approach comparable accuracy for characterizing the grain shape distribution. This model almost perfectly[^opacity] models the grain shape distribution, and, at face value, the **only** price seems to be related to extending the API (more on that at the very end).

Unfortunately, there's "no free lunch." As is common in astrophysical/cosmological simulations,[^low-order-hydro], the model we are highlighting achieves comparable accuracy to a simpler alternative with far fewer computational resources by making additional assumptions, and sacrifices robustness when the assumptions are violated .that we've described has assumed:
- we are considering an isolated one-zone model
- all injection pathways introduce dust grains at a single time $t_0$
- injection is the **only** way to change the number density of a given grain species
- other processes can only change the mass density of grain species as long as they don't change the number density, and the shape distributed is not deformed (but it can be translated).
We will discuss below it is hard to efficiently work around these assumptions and the consequences of violating them.
# Incapacity to efficiently model Grain Growth and destruction

The model we have described is fundamentally a model designed to model grain growth. However, dust destruction is extremely important. People often talk about 2 destruction processes:
- sublimation: occurs when grain reach a species-specific threshold temperature
- sputtering: in a hot enough gas, collisions with the gas cause the grain to lose mass (one atom at a time).
While the Chiaki papers primarily mention sublimation (it appears to be most important for the context of Pop III star formation), my impression is that sputtering is the dominant process in almost all other contexts (the Draine textbook doesn't seem to even describe sublimation).

Unfortunately, it is impossible to efficiently include both destruction and growth in the model (it's fundamentally incompatible with the assumption that grain number density is conserved). We will elaborate on this point in the context of both destruction processes.

##  Sublimation

Let's start with the concept of sublimation, since Grackle's multi-species grain model was implemented with an option for modeling sublimation, when growth is disabled; issue #420 highlights that the implementation currently has crippling flaws. The model fundamentally assumes that sublimation is a nearly instantaneous process that destroys a species regardless of size.

 The incompatibility arises because the growth model fundamentally assumes that the tracked $\rho_{{\rm inject},j}$ variables can be used to infer the current number density of the grain species. That assumption is simply wrong if $\rho_{\rm grain}(t)$ goes to zero because of sublimation and will cause lots of issues if you ever consider multiple injection events or mixing between zones.

There *is* a solution to this problem:
- the solution is trivial when we just have a single grain species (`n_grain_species=1`): when we set $\rho_{\rm grain}(t)$ to zero, we also set each of the `n_grain_species`  $\rho_{{\rm inject},j}$ values to zero
- however, Grackle always has `n_grain_species>1`. Because sublimation occurs at different temperatures for each species, we need to replace each of the `n_inject_path` $\rho_{{\rm inject},j}$ variables with a collection of `n_grain_species * n_inject_path` variables (to track the contributions to each grain species by each injection pathway).

I suspect that anybody who wants to all of the dust chemistry machinery and wants to model dust destruction probably wants to model multiple injection pathways. In that case, Grackle uses `n_inject_path >~ 10`. At that point, I think most people would be *MUCH* better off using a simpler model that tracks each grain species's mass densities in 10 different size bins. Given all of the other shortcomings that we will describe in following sections, I suspect that the simpler would deliver much more meaningful results.

## Sputtering

If dust grains are never destroyed, it would actually be fairly trivial to extend the growth model to also include sputtering. This is because sputtering, like grain growth, changes the rate of a grain at a rate independent of the current radius.

However there are 2 major problems if you allow grain destructions.
1. We must introduce modifications (like those described for sublimation) to account for changes in number density. 
2. The shape distribution is distorted because sputtering destroys grains below a minimum size $r_{\min}$. It is described by $\phi_t(r)=A\, H(r-r_{\rm min})\, \phi_0(r-\delta r(t))$, where $A$ is a normalization constant, $H(x)$ is the [Heaviside step function](https://en.wikipedia.org/wiki/Heaviside_step_function), and $\delta r(t)$ has a negative value. We'll discuss the general impact of deforming $\phi_t(r)$ in the next section.

# Violating Assumptions
Outside of very specialized Pop III star formation simulations and perhaps some other highly specialized cases (although, I would note that in most of the other cases, you would probably be served with a much simpler model), you will violate the assumptions of the model.

**_The most important invariant is that the shape distribution isn't deformed._** To avoid violating this invariant, we introduced the model by assuming that
- all dust injections occur at a singular time
- we **ONLY** have a one-zone model that evolves independently of other zones (i.e. there is no zone)

What happens when you violate the invariant? In short: you can't. In other words, if a physical scenario would violate the invariant, the model forces the interpretation of the state variables under the assumption that the invariant isn't violated.  To frame it another way, usage of the model when the invariant should be violated effectively introduces a totally artificial process that redistributes the shape distribution to match the assumption. In practice, this almost means that the model forces an artificial "narrowing" of the true distribution (i.e. deformations commonly make the shape distribution much broader).

We are going to talk through the how violating each of the assumptions can violate the invariant. For simplicity, we'll assume a single injection pathway (i.e. `n_inject_path=1`). Then, we will briefly mention what happens when there are additional pathways.

Be mindful that while any one scenario may not sound too bad, all of the enumerated scenarios will occur in the majority of lower redshift galaxy-scale simulations and the effects all compound.
## Violations with only 1 Injection Pathway

For simplicity, let's start by just assuming that only 1 of the injection pathways is in use.
### More than 1 injection event
Up until now, we have **only** considered injection at a single time, $t_0$. (To be clear, we **STILL** only consider a single zone). What happens if there are multiple injection events? In other words, what if we want to allow the (common) scenario where more than 1 supernova is allowed to inject metals.

Here's a quick timeline:

- $t < t_0$: $\rho_{\rm grain} = 0$
- $t=t_0$: supernova event A injects grains
- $t_0 < t < t_1$, grains are allowed to grow
- $t = t_1$: supernova event B injects grains

At this point, $\phi(r)$ **SHOULD** be a linear combination of $\Phi_j(r)$ and $\Phi_j(r - \Delta_A)$ , where $\Delta_A$ is "translation amount" from grain growth growth between $t_0$ and $t_1$. In practice, the model force $\phi(r)=\Phi_j(r - \Delta_{AB})$, where $0 < \Delta_{AB} < \Delta_A$.

If we allowed $N$ supernova injections at $N$ different times, the actual $\phi(r)$ would be a linear of  $\sum_k^N \Phi_j(r-\Delta_k)$, $\Delta_k$ has a different value. In other words, there would be a fairly wide size distribution. However, the model effectively artificially redistributes the size distribution so that it always has the shape have the shape of $\Phi_j(r)$ with a **single** translation.

### More than 1 zone
Let's go back to considering a single injection event, but this time lets consider having 2 zones. Suppose that a single supernova injected gas in 2 regions of space at $t_0$. Suppose that the regions evolve independently until some time $t_1$. It is very plausible that the growth rates may be different between the regions (e.g. gas temperature could reduce the availability of growth ingredients). The suppose at $t_1$ the 2 regions suddenly mix their gas, and for the sake of argument let's assume mixing is instantaneous.

Let's review the values of $\phi(r)$:
- Just before mixing, zone A has $\phi(r) = \Phi_j(r - \Delta_A)$, while zone B has $\phi(r) = \Phi_j(r - \Delta_B)$.
- Just after mixing, the 2 zones have a $\phi(r)$ that is a linear combination of the 2 definitions
- The model instead forces $\phi(r)=\Phi_j(r - \Delta_{AB})$, where ${\rm min}(\Delta_A,\Delta_B) < \Delta_{AB} < {\rm max}(\Delta_A,\Delta_B)$.
- Just like the last scenario, the model is artificially narrowing the distribution
If we now considered mixing between more and more zones, like the last scenario the actual distribution would get wider and wider, but the model always artificially redistributes the size distribution so that it always has the shape  have the shape of $\Phi_j(r)$ with a **single** translation.

## What if we allow multiple injection pathways?

This makes both of the enumerated scenarios significantly worse since kinds of supernovae can occur in different parts of galaxies at the same time or at different times. The model will always force $\phi(r)$ to be a superposition of each $\Phi_j(r)$ and that they all common translation, while in fact the distributions should almost always be much broader.

# Closing Thoughts

To recap: the Grackle's multi-grain  species dust chemistry model only really works in a highly specialized hydro simulations: 
- It only works if you are running a simulation where dust grains are all injected into a given region of one or more cells (by "region" I mean any grouping of cells that are in close proximity with each other and where hydrodynamic mixing is inevitable)
- Importantly, all cells in such a region **MUST** have physical conditions where dust growth is extremely comparable (i.e. the dust grain size distribution should be very comparable between all cells in the region -- this ensures that the process of mixing shouldn't affect the expected size distribution)
- Dust destruction should entirely unimportant to the simulation.

These conditions seem to describe the conditions a cosmological simulation that you run through the formation of the very first stars, their supernovae, and stop just as the second generation of stars would inject additional dust into the simulation (This assumes that any dust in the ejected by supernovae that is ejected out of the galaxy in hotter gas has no bearing on the simulation's outcome).  It makes a lot of sense that this would work since the underlying model was designed in the context of studying Pop III star formation.

I could maybe imagine running some specialized simulations of the ISM, where (i) all gas is at low enough temperatures that sputtering isn't relevant, (ii) the initial dust distributions are homogenous throughout the simulation, and (iii) all grain growth is consistent between all cells in the simulations. But frankly, if this is all true, it seems to defeat the purpose of tracking all of the grain species as individual fields... (since the grain properties and evolution are constant over the entire domain)... Are there any other obvious use-cases that I'm missing?

Given that the underlying strength of this model is that its extremely accurate about modeling each grain species's shape distributions, it seems a little silly to me to run simulations where violations of the model's invariants mean that the model is simply wrong about the shape. But maybe that's tolerable (after all, I'm not an expert in dust chemistry -- maybe the difference between a true distorted shape distribution and the assumed distribution are relatively negligible). Even if we set aside those issues, the fact that we explicitly can't generalize the model to model both grain growth and destruction is alone a massive limit on the applicability of the model.

All of this is to say:
- The multi-grain-species dust chemistry is a feature that only works for highly specialized sets of simulations.
- There is nothing inherently wrong with this. After all, Grackle has plenty of features that can only be used in specialized cases.
- The main distinction is that the multi-grain-species dust model touches affects an absolutely massive part of the codebase compared to any other specialized feature.[^specialized]

It's definitely worth keeping all of this in mind as we finish integrating the feature into Grackle. It certainly colors my perspective on some issues. For example:
- I might suggest that we eliminate the functionality for modeling grain sublimation rather than fixing all of its current problems.
- It makes me question how easy it needs to be for people to be able to add custom new injection pathways
- I care quite a bit less about the quality of the function used for computing dust temperatures for individual grain species. Yes, we need to make some changes, but I don't think my very robust and built-out solution is really needed.

[^effectively-generalize]: We'll circle back to this, but essentially the only conceivable way that we *could* generalize the model, would sacrifice a dramatic level of computational efficiency. At that level of efficiency there are much simpler alternatives that are almost certainly more accurate.

[^cut-functionality]: At this point, I do think we should transcribe everything to C++, and cut functionality afterwards). probably provide marginal I think we should fully transcribe the model, I think)

[^unintuitive-growth]: At face value that seems unintuitive that grain growth produces a value of $\frac{dr}{dt}$ that is independent of $r$. It is important to understand that this condition arises because $\frac{dm}{dt}\propto r^2$

[^moment-table]: I haven't looked very carefully at the code. We might actually track quantities derived from linear combinations of these moments...

[^opacity]: I haven't taken a very close look at the relevant code, but I suspect that the optical opacity calculations assume that the optical opacity isn't dependent on the grain size distribution.

[^low-order-hydro]: A prominent example is the way that grid-based astrophysics simulations use **MUCH** lower order hydrodynamic schemes than the schemes used in fluid dynamics simulations used for engineering.

[^specialized]: If I had realized just how specialized the feature is, I might have pushed back harder against accepting the giant PR adding the feature. But, no point in dwelling on this.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[newchem-cpp] Document Multi-Grain Dust Model & Lack of Generalizability #444

Overview

Defining the Model

1. Defining grain properties

2. Modeling how grains evolve.

3. Treatment of Grain creation

4. What Grackle actually implements

Incapacity to efficiently model Grain Growth and destruction

Sublimation

Sputtering

Violating Assumptions

Violations with only 1 Injection Pathway

More than 1 injection event

More than 1 zone

What if we allow multiple injection pathways?

Closing Thoughts

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[newchem-cpp] Document Multi-Grain Dust Model & Lack of Generalizability #444

Description

Overview

Defining the Model

1. Defining grain properties

2. Modeling how grains evolve.

3. Treatment of Grain creation

4. What Grackle actually implements

Incapacity to efficiently model Grain Growth and destruction

Sublimation

Sputtering

Violating Assumptions

Violations with only 1 Injection Pathway

More than 1 injection event

More than 1 zone

What if we allow multiple injection pathways?

Closing Thoughts

Footnotes

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions