\chapter{Classical mechanics}
The standard view in physics is that classical mechanics is perfectly understood. It has three different but equivalent formulations, the oldest of which, Newtonian mechanics, is based on three laws. Classical mechanics is the theory of point particles that follow those laws. Unfortunately, this view is incorrect.
We will see that the three formulations are not equivalent, in the sense that there are physical systems that are Newtonian but not Hamiltonian and vice-versa. There are also a number of questions that have been left unanswered, such as the precise nature of the Hamiltonian or the Lagrangian, and what exactly the principle of stationary action represents physically. While shedding light on these issues, we will also find that classical mechanics already contains elements that are typically associated with other theories, such as quantum mechanics/field theories (uncertainty principle, anti-particles), thermodynamics/statistical mechanics (thermodynamic and information entropy conservation) or special relativity (energy as the time component of a four-vector). In other words, the common understanding of classical mechanics is quite shallow, and its foundations are, in fact, not separate from those of classical statistical mechanics or special relativity.
What reverse physics shows is that the central assumption underneath classical mechanics is that of \textbf{infinitesimal reducibility (IR)}: a classical system can be thought of as made of parts, which in turn are made of parts and so on; studying the whole system is equivalent to studying all its infinitesimal parts. This assumption, together with the assumption of \textbf{independence of degrees of freedom (IND)}, is what gives us the structure of classical phase space with conjugate variables. The additional assumption of \textbf{determinism and reversibility (DR)}, the fact that the description of the system at one time is enough to predict its future or reconstruct its past, leads us to Hamiltonian mechanics. On the other hand, assuming \textbf{kinematic equivalence (KE)}, the idea that trajectories in space are enough to reconstruct the state of the system and vice-versa, leads to Newtonian mechanics. The combination of all above assumptions, instead, leads to Lagrangian mechanics and, in particular, to massive particles under (scalar and vector) potential forces.
As a guide to the chapter, here is the list of main points in the order in which they will be presented, one for each section.
\begin{enumerate}
\item Review of classical formulations
\item Lagrangian mechanics is Hamiltonian mechanics and KE
\item Kinematics, in general, is not enough to reconstruct dynamics
\item Hamiltonian mechanics (one DOF) is equivalent to DR
\item Hamiltonian mechanics (multiple DOFs) is equivalent to DR plus IND
\item Differential calculus and its generalization, differential topology, study infinitesimally additive quantities that depend on geometric shapes (i.e. lines, surfaces, volumes)
\item The principle of least action is a consequence of DR, IND and KE
\item Massive particles under potential forces are a consequence of DR, IND and KE
\item Special relativity is a consequence of DR, IND and KE
\item Phase space is the only structure that makes distributions, state counting and entropy frame invariant
\item Newtonian mechanics is a consequence of KE
\item Three dimensional spaces are the only spaces for which distributions over directions are frame invariant
\item Classical particle states as points in phase space are equivalent to IR
\end{enumerate}
\section{Formulations of classical mechanics}
In this section we will briefly review the three main formulations of classical mechanics. Our task is not to present them in detail, but rather to provide a brief summary of the equations so that we can proceed with the comparison. In particular, given that different conventions are used across formulations, within the same formulation and among different contexts (e.g. relativity, symplectic geometry), we will want to make the notation homogeneous to allow easier comparisons.
\subsection{Newtonian mechanics}
For all formulations, the system is modeled as a collection of point particles, though we will mostly focus on the single particle case. For a Newtonian system, the state of the system at a particular time $t$ is described by the position $x^i$ and velocity $v^i$ of all its constituents. Each particle has its mass $m$, not necessarily constant in time, and, for each particle, we define the kinetic momentum as $\Pi^i = m v^i$.\footnote{We will use the letter $t$ for the time variable, $x$ for position and $v$ for velocity, which is a very common notation in Newtonian mechanics. However, we will keep using the same letters in Lagrangian mechanics as well, instead of $q$ and $\dot{q}$, for consistency. Given that the distinction between kinetic and conjugate momentum is an important one, we will denote the former by $\Pi$ and the latter by $p$. The Roman letters $i,j,k,...$ will be used to span the spatial components (e.g. $i \in \{1,2,3\}$ for a particle in three-dimensional space and $i \in \{1,2,\dots, 3n\}$ for $n$ particles), while we will use the Greek letters $\alpha, \beta, \gamma, ...$ to span space-time components (e.g. $\alpha \in \{0,1,2,3\}$ where the $0$ value of the index is used for time). Unlike some texts, $x^i$ do not represent Cartesian coordinates, and therefore they should be understood already as generalized coordinates.}
The evolution of our system is given by Newton's second law:\footnote{For derivatives, we will use the shorthand $d_t$ for $\frac{d}{dt}$ and $\partial_{x^i}$ for $\frac{\partial}{\partial x^i}$. For functions that depend on multiple arguments we use a free index to note that it depends on all elements; each argument will have a different index to highlight that there is no relationship between arguments. }
\begin{equation}\label{rp-cm-NewtonsSecondLaw}
F^i(x^j, v^k, t) = d_t \Pi^i.
\end{equation}
Mathematically, if the forces $F^i$ are locally Lipschitz\footnote{Lipschitz continuity means that the slope of the function is bounded. For example, $\sqrt{x}$ in the neighborhood of $0$ is not Lipschitz continuous as its slope diverges at that point (the graph has a vertical tangent). One can construct examples (e.g. Norton's dome) where the forces are not locally Lipschitz continuous, and therefore the initial position and velocity do not yield a unique solution (i.e. in Norton's dome, the body can stay on the top of the dome indefinitely, or it can fall down after an arbitrary amount of time). In this case, something else, outside the system, must determine the motion of the system, and therefore it is not true that the force and the state of the system fully determine the dynamics of the system.} continuous, then the solution $x^i(t)$ is unique. That is, given position and velocity at a given time, we can predict the position and velocity at future times. We will assume a Newtonian system has this property.
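As a numerical aside, the uniqueness guaranteed by Lipschitz continuity can be illustrated by integrating Newton's second law \ref{rp-cm-NewtonsSecondLaw} directly. The following sketch (the function name and integration scheme are our own choices, not part of the theory) advances a single degree of freedom with the explicit midpoint method; for a Lipschitz force such as the harmonic one, the integrator converges to the unique trajectory determined by $(x_0, v_0)$.

```python
import math

def integrate_newton(force, m, x0, v0, t1, dt=1e-4):
    """Integrate m * dv/dt = F(x, v, t) with the explicit midpoint method.

    `force` is a callable F(x, v, t); if it is locally Lipschitz,
    the trajectory from (x0, v0) is unique.
    """
    x, v, t = x0, v0, 0.0
    while t < t1:
        a = force(x, v, t) / m
        # midpoint estimates of position and velocity
        xm = x + 0.5 * dt * v
        vm = v + 0.5 * dt * a
        am = force(xm, vm, t + 0.5 * dt) / m
        x += dt * vm
        v += dt * am
        t += dt
    return x, v

# Harmonic force F = -x with m = 1 and (x0, v0) = (1, 0): the unique
# solution is x(t) = cos(t), so at t = pi we expect x close to -1.
xf, vf = integrate_newton(lambda x, v, t: -x, 1.0, 1.0, 0.0, math.pi)
```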
An important aspect of Newtonian mechanics is that the equations are not invariant under coordinate transformation. To distinguish between apparent forces (i.e. those dependent on the choice of frame) and the real ones, we assume the existence of inertial frames. In an inertial frame there are no apparent forces, and therefore a free system (i.e. no forces) with constant mass proceeds in a linear uniform motion, or stays still.\footnote{Recall that linear motion simply means that it describes a line in space, while uniform motion means that the speed is constant. Therefore we can have linear non-uniform motion (e.g. an object accelerated along the same direction) or a non-linear uniform motion (e.g. an object going around in a circle at constant speed).}
\subsection{Lagrangian mechanics}
The state for a Lagrangian system is also given by position $x^i$ and velocity $v^i$. The dynamics is specified by a single function $L(x^i, v^j, t)$ called the Lagrangian. For each spatial trajectory $x^i(t)$ we define the action as $\mathcal{A}[x^i(t)] = \int_{t_0}^{t_1} L(x^i(t), d_t x^i(t), t) dt$. The trajectory taken by the system is the one that makes the action stationary:
\begin{equation}
\delta \mathcal{A}[x^i(t)] = \delta \int_{t_0}^{t_1} L\left(x^i(t), d_t x^i(t), t\right) dt=0
\end{equation}
The evolution can equivalently be specified by the Euler-Lagrange equations:
\begin{equation}\label{rp-cm-EulerLagrange}
\partial_{x^i}L=d_t \partial_{v^i} L.
\end{equation}
%TODO: we should improve how the invertibility/Jacobian \neq 0 is handled. Technically, if we only require invertibility, the Jacobian is either positive or negative semi-definite (it can be zero). However, if the relationship is differentiable, the zero case is excluded.
Note that not all Lagrangians lead to a unique solution. For example, $L=0$ will give the same action for all trajectories and therefore, strictly speaking, all trajectories are possible. The stationary action leads to a unique solution if and only if the Lagrangian is hyperregular, which means the Hessian matrix $\partial_{v^i}\partial_{v^j} L$ is invertible. Like in the Newtonian case, we will assume Lagrangian systems satisfy this property.
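A quick way to check this condition in practice is to evaluate the velocity Hessian numerically. The sketch below (a minimal one-degree-of-freedom illustration; the helper name is ours) uses central differences, showing that a Lagrangian quadratic in velocity is regular while one linear in velocity is degenerate.

```python
def velocity_hessian(L, x, v, h=1e-5):
    """Second derivative of L(x, v) in the velocity, by central
    differences (single degree of freedom)."""
    return (L(x, v + h) - 2.0 * L(x, v) + L(x, v - h)) / h**2

m = 2.0
L_regular = lambda x, v: 0.5 * m * v**2 - x**2   # quadratic in v: regular
L_singular = lambda x, v: v                      # linear in v: degenerate

H_reg = velocity_hessian(L_regular, 1.0, 3.0)    # close to m
H_sing = velocity_hessian(L_singular, 1.0, 3.0)  # close to 0
```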
Unlike Newton's second law, both the Lagrangian and the Euler-Lagrange equations are invariant under coordinate transformations. This means that Lagrangian mechanics is particularly suited to study the symmetries of the system.
\subsection{Hamiltonian mechanics}
In Hamiltonian mechanics, the state of the system is given by position $q^i$ and conjugate momentum $p_i$. The dynamics is specified by a single function $H(q^i, p_j, t)$ called the Hamiltonian.\footnote{We use a different symbol for position in Hamiltonian mechanics because, while it is true that $q^i = x^i$, it is also true that $\partial_{q^i} \neq \partial_{x^i}$: the first derivative is taken at constant conjugate momentum while the second is taken at constant velocity. This creates absolute confusion when mixing and comparing Lagrangian and Hamiltonian concepts, which our notation avoids completely.} The evolution is given by Hamilton's equations:
\begin{equation}\label{rp-cm-HamiltonEq}
\begin{aligned}
d_t q^i = \partial_{p_i} H \\
d_t p_i = - \partial_{q^i} H \\
\end{aligned}
\end{equation}
We will again want these equations to yield a unique solution, which means the Hamiltonian must be at least differentiable, and the derivatives must at least be Lipschitz continuous.
Hamilton's equations are invariant as well. The Hamiltonian itself is a scalar function which is often considered (mistakenly as we'll see later) invariant. This formulation is the most suitable for statistical mechanics as volumes of phase space correctly count the number of possible configurations.
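As a concrete illustration of Hamilton's equations \ref{rp-cm-HamiltonEq}, the sketch below (assuming a time-independent Hamiltonian with known partial derivatives; all names are our own) evolves a one-degree-of-freedom system with a symplectic leapfrog step. For the harmonic oscillator $H=(p^2+q^2)/2$, the flow traces circles in phase space and conserves $H$ to high accuracy.

```python
import math

def hamilton_flow(dH_dq, dH_dp, q0, p0, t1, dt=1e-4):
    """Evolve d_t q = dH/dp, d_t p = -dH/dq with a symplectic
    leapfrog step (time-independent Hamiltonian, one DOF)."""
    q, p, t = q0, p0, 0.0
    while t < t1:
        p -= 0.5 * dt * dH_dq(q)  # half kick
        q += dt * dH_dp(p)        # drift
        p -= 0.5 * dt * dH_dq(q)  # half kick
        t += dt
    return q, p

# Harmonic oscillator H = (p^2 + q^2)/2: the flow is a rotation in
# phase space, so after t = 2*pi the state returns near (1, 0).
q1, p1 = hamilton_flow(lambda q: q, lambda p: p, 1.0, 0.0, 2.0 * math.pi)
```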
\section{Inequivalence of formulations}\label{rp-cm-sec-inequivalenceOfFormulations}
It is often stated in physics books that all three formulations of classical mechanics are equivalent. We will look at this claim in detail, and conclude that this is not the case: there are systems that can be described by one formulation and not another. More precisely, the set of Lagrangian systems is exactly the intersection of Newtonian and Hamiltonian systems.
\subsection{Testing equivalence}
We will consider two formalisms equivalent if they can be applied to exactly the same systems. That is, Newtonian and Lagrangian mechanics are equivalent if any system that can be described using Newtonian mechanics can also be described by Lagrangian mechanics and vice-versa. In general, in physics great emphasis is put on systems that can indeed be studied by all three, leaving the impression that this is always doable.\footnote{If one asks the average physicist whether Newtonian and Hamiltonian mechanics are equivalent, the answer most of the time will be enthusiastically positive. If one then asks for the Hamiltonian for a damped harmonic oscillator, the typical reaction is annoyance due to the nonsensical question (damped harmonic oscillators do not conserve energy), followed by a realization and partial retraction of the previous claim. The moral of the story is to never take these claims at face value.} However, just with a cursory glance, we realize that this can't possibly be the case.
The dynamics of a Newtonian system, in fact, is specified by three independently chosen functions of position and velocity, the forces applied to each degree of freedom (DOF). On the other hand, the dynamics of Lagrangian and Hamiltonian systems is specified by a single function of position and velocity/momentum, the Lagrangian/Hamiltonian. Intuitively, there are more choices in the dynamics for Newtonian systems than for Lagrangian and Hamiltonian.
Now, the reality is a bit trickier because the mathematical expression of the forces is not enough to fully characterize the physical system. We need to know in which frame we are, what coordinates are being used and the mass of the system, which is potentially a function of time. On the Lagrangian side, note that the Euler-Lagrange equations are homogeneous in $L$. This means that multiplying $L$ by a constant leads to the same solutions, meaning that the same system can be described by more than one Lagrangian. The converse is also true: if one system is half as massive and is subjected to a force half as intense, the resulting Lagrangian is also simply rescaled by a constant factor. Therefore the map between Lagrangians and Lagrangian systems is not one-to-one: it is many-to-many. This is why we should never look simply at mathematical structures if we want to fully understand the physics they describe.
Regardless, our task is at the moment much simpler: we only need to show that there are Newtonian systems not expressible by Lagrangian or Hamiltonian mechanics. We can therefore limit ourselves to systems with a specific constant mass $m$ in an inertial frame and write $a^i=F^i(x^j, v^k, t)/m$. Given that the force is arbitrary, the acceleration can be an arbitrary function of position, velocity and time. Similarly, we can write the acceleration of a Lagrangian system as $a^i=F^i[L]/m$. That is, the acceleration is going to be some functional of the Lagrangian. Given the Euler-Lagrange equations \ref{rp-cm-EulerLagrange}, the map between the Lagrangian and the acceleration must be continuous in both directions: for a small variation of the Lagrangian we must have a small variation of the equations of motion and therefore of the acceleration, and for a small variation of the equations of motion we must have a small variation of the Lagrangian. But a continuous surjective map from the space of a single function (i.e. the Lagrangian) to the space of multiple functions (i.e. those that specify the acceleration in terms of position and velocity) does not exist,\footnote{Mathematically, the spaces of continuous functions $C(\mathbb{R}, \mathbb{R})$ and $C(\mathbb{R}^n, \mathbb{R})$ are not homeomorphic. Intuitively, the underlying reason is the same as why a map from a volume to a line cannot be continuous: in a volume there are infinitely many directions along which one can move away from a point, while on a line there are only two.} and therefore there must be at least one Newtonian system with constant mass expressed in an inertial frame that is not describable using Lagrangian mechanics. The same argument applies to Hamiltonian mechanics, since the dynamics in this case is also described by a single function of the same number of arguments. We therefore reach the following conclusion:
\begin{insight}
Not all Newtonian systems are Lagrangian and/or Hamiltonian.
\end{insight}
\subsection{Newtonian vs Lagrangian/Hamiltonian}
We now want to understand whether all Lagrangian systems are Newtonian. Given what we discussed, we cannot expect to reconstruct the mass and force uniquely from the expression of the Lagrangian. We consider the mass and the frame fixed by the problem, together with the Lagrangian, and therefore we must only see whether we can indeed find a unique expression for the acceleration. From the Euler-Lagrange equations \ref{rp-cm-EulerLagrange} we can write
\begin{equation}
\begin{aligned}
\partial_{x^i}L&=d_t \partial_{v^i} L=\partial_{x^j} \partial_{v^i} L \, d_t x^j + \partial_{v^k} \partial_{v^i} L \, d_t v^k + \partial_{t} \partial_{v^i} L \, d_t t \\
&= \partial_{x^j} \partial_{v^i} L \, v^j + \partial_{v^k} \partial_{v^i} L \, a^k + \partial_{t} \partial_{v^i} L \\
\partial_{v^k} &\partial_{v^i} L \, a^k = \partial_{x^i}L - \partial_{x^j} \partial_{v^i} L \, v^j - \partial_{t} \partial_{v^i} L.
\end{aligned}
\end{equation}
To be able to write the acceleration explicitly, we must be able to invert the Hessian matrix $\partial_{v^k} \partial_{v^i} L$. As we noted before, this is exactly the condition under which the principle of stationary action leads to a unique solution, and we can now better understand why. If the Hessian is not invertible at a point, its determinant is zero and therefore at least one eigenvalue is zero. The corresponding eigenvector identifies a direction about which the equation tells us nothing: a variation of the acceleration along that direction does not change the action. This is why the invertibility of the Hessian is required to obtain unique solutions.
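The explicit inversion can be checked numerically. In the sketch below (one degree of freedom, time-independent Lagrangian; the helper is our own construction), all derivatives are taken by central differences, and for $L = \frac{1}{2}mv^2 - \frac{1}{2}kx^2$ the recovered acceleration reproduces Newton's $a = -(k/m)x$.

```python
def el_acceleration(L, x, v, h=1e-5):
    """Solve the Euler-Lagrange equation for the acceleration (one DOF,
    time-independent L):  a = (dL/dx - v * d2L/dxdv) / (d2L/dv2),
    with all derivatives taken by central differences."""
    dLdx = (L(x + h, v) - L(x - h, v)) / (2.0 * h)
    d2Ldxdv = (L(x + h, v + h) - L(x + h, v - h)
               - L(x - h, v + h) + L(x - h, v - h)) / (4.0 * h**2)
    d2Ldv2 = (L(x, v + h) - 2.0 * L(x, v) + L(x, v - h)) / h**2
    return (dLdx - v * d2Ldxdv) / d2Ldv2

# For L = m v^2/2 - k x^2/2 the formula reduces to Newton: a = -(k/m) x.
m, k = 2.0, 3.0
a = el_acceleration(lambda x, v: 0.5 * m * v**2 - 0.5 * k * x**2, 1.0, 0.5)
```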
What we find, then, is that for any Lagrangian system, which we assume to have a unique solution, we can explicitly write the acceleration as a function of position, velocity and time. Therefore
\begin{insight}
All Lagrangian systems are Newtonian.
\end{insight}
Now we turn our attention to Hamiltonian mechanics and, similarly, we ask whether we can express the acceleration as a function of the state. We have
\begin{equation}
\begin{aligned}
a^i &= d_t v^i = d_t d_t q^i = d_t \partial_{p_i} H = \partial_{q^j} \partial_{p_i} H d_t q^j + \partial_{p_k} \partial_{p_i} H d_t p_k + \partial_{t} \partial_{p_i} H d_t t\\
&= \partial_{q^j} \partial_{p_i} H \partial_{p_j} H - \partial_{p_k} \partial_{p_i} H \partial_{q^k} H + \partial_{t} \partial_{p_i} H.
\end{aligned}
\end{equation}
This tells us that we can always write an explicit function for the acceleration. However, this is not enough. States in Newtonian mechanics are in terms of position and velocity, not position and momentum. For a Hamiltonian system to be equivalent to a Newtonian system we need to be able to write the momentum as a function of position and velocity and vice versa. Note that Hamilton's equations already give a way to express the velocity in terms of position and momentum. We just need that expression to be invertible, which means the Jacobian must be invertible. We must have:
\begin{equation}
\left|\partial_{p_i} v^j\right| = \left|\partial_{p_i}\partial_{p_j} H\right| \neq 0 .
\end{equation}
To be able to express momentum as a function of position and velocity, then, we need the Hessian of the Hamiltonian to be invertible (i.e. to have non-zero determinant).
Note that we had no such requirement for the Hamiltonian itself. For example, $H=0$ leads to equations $d_t q^i = 0$ and $d_t p_i = 0$, which have unique solutions: both position $q^i(t) = k_{q^i}$ and momentum $p_i(t) = k_{p_i}$ are constants of motion. The Hessian, being the zero matrix, is not invertible, and in fact we cannot write momentum as a function of position and velocity: velocity $d_t q^i$ is always zero in all cases while conjugate momentum can be any value $k_{p_i}$. Though this case may not be physically interesting, it is a perfectly valid Hamiltonian system and shows that we should always check the trivial mathematical case. However, let us go through a more physically meaningful case.
\begin{figure}
\centering
\begin{tikzpicture}
\pgfplotsset{ticks=none}
\begin{axis}[axis lines=middle,
xlabel=$q$,
xlabel style={below=5pt, fill=white},
ylabel=$p$,
ylabel style={above=2pt, right=4pt},
domain=-1.5:1.5,
ymin=-1.8, ymax=1.8,]
\foreach \yvalue in {-1.5,-1,-0.5, 0.5, 1, 1.5} {
\addplot[blue,-stealth,samples=5,
quiver={
u={y/abs(y)},
v={0},
scale arrows=0.2},
] {\yvalue};
\addplot[black,samples=2,opacity=0.2,domain=-1.8:1.8]{\yvalue};}
\addplot[scatter, only marks, mark size=3pt, samples=5, mark=x, color=green]{0};
\end{axis}
\pgfplotsset{ticks=none}
\begin{axis}[
xshift=7.5cm,
width=5cm,
height=7.25cm,
axis lines=middle,
xlabel=$H$,
xlabel style={below=5pt, fill=white},
ylabel=$p$,
ylabel style={above=2pt, right=4pt},
domain=0:1,
xmin=0, xmax=1,
ymin=-1, ymax=1,]
\addplot[black,thick,samples=2,domain=0:1]{x};
\addplot[black,thick,samples=2,domain=0:1]{-x};
\end{axis}
\end{tikzpicture}
\caption {On the left, the phase-space diagram for a photon treated as a point particle. The Hamiltonian $H=c|p|$, on the right, is proportional to the modulus of $p$. Since $H$ is not differentiable when $p=0$, those states are excluded, consistent with the physics. The displacement field has only a $q$ component, which is $+c$ above the horizontal axis and $-c$ below the horizontal axis. } \label{rp-cm-fig-photon}
\end{figure}
\textbf{Photon as a particle}. If we want to treat the photon as a classical particle, we can write the Hamiltonian by expressing the energy as a function of momentum
\begin{equation}
H=\hbar | \omega| = c \hbar |k_i| = c |p_i|.
\end{equation}
If we apply Hamilton's equations, we have
\begin{equation}
\begin{aligned}
d_t q^i &= c \frac{p_i}{|p_i|} \\
d_t p_i &= 0.
\end{aligned}
\end{equation}
That is, the norm of the velocity is always $c$, the momentum decides its direction, and the momentum itself does not change in time, as shown in fig. \ref{rp-cm-fig-photon}. This is indeed the motion of a free photon. One can confirm, through tedious calculation, that the determinant of the Hessian is indeed zero, yet it is easier and more physically instructive to see that we cannot reconstruct the momentum from the velocity. Relativistically, all photons travel along the geodesics at the same speed, therefore two photons that differ only by the magnitude of the momentum will travel the same path.
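The failure to reconstruct momentum from velocity can be made concrete. In the one-dimensional sketch below (our own minimal illustration, with $c=1$), Hamilton's equations for $H=c|p|$ give a velocity that depends only on the sign of $p$, so photons with different momentum magnitudes are kinematically indistinguishable.

```python
def photon_velocity(p, c=1.0):
    """d_t q = c * p/|p| from H = c|p|, in one spatial dimension (p != 0)."""
    return c if p > 0 else -c

# Photons with very different momentum magnitudes move identically,
# so the trajectory cannot tell us |p|: kinematic equivalence fails.
v_a = photon_velocity(0.5)
v_b = photon_velocity(7.0)
```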
Hamiltonian systems that are also Newtonian, then, need to satisfy this extra condition, so let us give it a name.
\renewcommand{\theassump}{KE}
\begin{assump}[Kinematic Equivalence]\label{assum_kineq}
The kinematics of the system is sufficient to reconstruct its dynamics and vice-versa. That is, specifying the motion of the system is equivalent to specifying its state and evolution.
\end{assump}
\renewcommand{\theassump}{\Roman{assump}}
By kinematics we mean the motion in space and time and by dynamics we mean the state and its time evolution in phase space. We will need to analyze the difference between the two in more detail, but we should first finish our comparison between the different formulations.
Summing up, we find that
\begin{insight}
Not all Hamiltonian systems are Newtonian: only those for which \ref{assum_kineq} is valid.
\end{insight}
\subsection{Lagrangian vs Hamiltonian}
We now need to compare Lagrangian and Hamiltonian systems. The task is a lot easier because we already have a precise way to connect the two. If we are given a Lagrangian $L$, we define the conjugate momentum $p_i = \partial_{v^i} L$ and the Hamiltonian $H = p_i v^i - L$. If we are given a Hamiltonian $H$, we can define a Lagrangian $L = p_i v^i - H$ and a velocity $v^i = d_t q^i = \partial_{p_i} H$. However, this is a bit misleading: the above relationships connect the values of the functions for each state $s$. That is, $L(s) = p_i(s) v^i(s) - H(s)$. Both the Lagrangian and the Hamiltonian are functions of specific variables, so we have to make sure we can express them in the appropriate variables.
Going from a Hamiltonian to a Lagrangian, it again means that we can write momentum as a function of position and velocity, and therefore assumption \ref{assum_kineq} must hold. This makes sense: if all Lagrangian systems are Newtonian, and \ref{assum_kineq} was required for a Hamiltonian system to be Newtonian, then it is also required for a Hamiltonian system to be Lagrangian. But the connection is stronger: \ref{assum_kineq} is the \emph{only} additional assumption we need to be able to write a Lagrangian given a Hamiltonian.
Going from a Lagrangian to a Hamiltonian, it means that we can write velocity as a function of position and momentum. Note that since we define conjugate momentum as the derivative of the Lagrangian, we can already express momentum as a function of position and velocity, which means we are simply asking that expression to be invertible. This is, again, assumption \ref{assum_kineq}, just in the opposite direction. We must have
\begin{equation}
0 \neq \left| \partial_{v^i} p_j \right| = \left| \partial_{v^i} \partial_{v^j} L \right|.
\end{equation}
This means that assumption \ref{assum_kineq} is exactly the invertibility of the Hessian, the condition for unique solution of the Lagrangian. All Lagrangian systems that admit unique solutions, then, satisfy assumption \ref{assum_kineq}. In fact, we can see that the Hessian determinants are related
\begin{equation}
\left| \partial_{v^i} \partial_{v^j} L \right| = \left| \partial_{v^i} p_j \right| = \left| \partial_{p_i} v^j \right|^{-1} = \left|\partial_{p_i}\partial_{p_j} H\right|^{-1}.
\end{equation}
This means that every Lagrangian admits a Hamiltonian, but not every Hamiltonian admits a Lagrangian. Only the Hamiltonian systems for which \ref{assum_kineq} is valid will also be Lagrangian systems, with a guaranteed unique solution given that \ref{assum_kineq} is exactly the assumption needed for that as well. Therefore we conclude that
\begin{insight}
Lagrangian systems are exactly those Hamiltonian systems for which \ref{assum_kineq} is valid.
\end{insight}
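The Legendre transform that takes a Lagrangian to a Hamiltonian can be carried out numerically whenever \ref{assum_kineq} holds. The sketch below (one degree of freedom; the helper name is ours) computes $p = \partial_{v} L$ by central differences and evaluates $H = p v - L$ at a state; for $L = \frac{1}{2}mv^2 - \frac{1}{2}kx^2$ this returns $p = mv$ and $H = \frac{p^2}{2m} + \frac{1}{2}kx^2$ as expected.

```python
def lagrangian_to_hamiltonian(L, x, v, h=1e-5):
    """Legendre transform evaluated at a single state (one DOF):
    p = dL/dv (central difference), H = p*v - L(x, v)."""
    p = (L(x, v + h) - L(x, v - h)) / (2.0 * h)
    return p, p * v - L(x, v)

m, k = 2.0, 3.0
L = lambda x, v: 0.5 * m * v**2 - 0.5 * k * x**2
p, H = lagrangian_to_hamiltonian(L, 1.0, 0.5)
# p = m*v = 1.0 and H = p^2/(2m) + k*x^2/2 = 1.75 for this state.
```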
\subsection{Relationship between formulations}
The relationship between the different formulations, then, can be summarized with the Venn diagram in fig. \ref{rp-cm-fig-vennDiagramEarly}.
\begin{figure}[h]
\centering
\begin{tikzpicture}
\node[ellipse, minimum width=10cm, minimum height=5cm, draw, left] (ns){};
\node [align=left, above] at ([xshift=-6mm,yshift=1mm]ns.north west) {Newtonian \\systems};
\node[ellipse, minimum width=6cm, minimum height=4.5cm, draw] (hs) at ([xshift=-1.7cm]ns.east){};
\node [align=right, above right] at ([xshift=8mm, yshift=-4mm]hs.north) {Hamiltonian\\ systems};
\node [align=center, right] at ([xshift=14mm]hs.west) {Lagrangian \\systems};
\end{tikzpicture}
\caption {Not all Hamiltonian systems are Newtonian and not all Newtonian systems are Hamiltonian. All Lagrangian systems are both Newtonian and Hamiltonian.}\label{rp-cm-fig-vennDiagramEarly}
\end{figure}
We have found that \ref{assum_kineq} is a constitutive assumption of Lagrangian mechanics, and that it clearly marks which Hamiltonian systems are Newtonian/Lagrangian. By constitutive assumption we mean an assumption that must be taken, either explicitly or implicitly, for a theory to be valid. But what makes a system Hamiltonian and what makes a system Newtonian? Can we find a full set of constitutive assumptions for classical mechanics?
\section{Kinematics vs dynamics}
We have seen the importance of the connection between kinematics and dynamics. In this section we will explore this link more deeply and come to the following conclusion: the kinematics of a system is not enough to reconstruct its dynamics.
\subsection{Particle under linear drag}
Let us first review exactly what kinematics and dynamics are. Given a system, its kinematics is the description of its motion in space and time. Position, velocity and acceleration are kinematic variables because they describe the motion. Kinematics is what Galileo studied and was the first to treat rigorously. Dynamics, instead, describes the cause of such motion. Force, mass, momentum and energy are dynamic quantities as they are used to describe why a body moves in a particular way. Dynamics is what Newton introduced, and his second law, expressed as $F=ma$, clearly shows the link between the two.
The link between the two concepts seems important given the constitutive role of \ref{assum_kineq} in Lagrangian mechanics. Moreover, while both Newtonian and Hamiltonian mechanics are dynamical theories, in the sense that quantities like force and momentum are intrinsic parts of the respective theories, Lagrangian mechanics seems to be a purely kinematic theory, as it is described only by kinematic variables like position and velocity. Therefore it seems useful to characterize the kinematics-dynamics link as much as possible. Let's analyze a concrete example.
Suppose we are given the following equation:
\begin{equation}\label{rp-cm-frictionEquation}
m a = - b v .
\end{equation}
The equation is in terms of kinematic variables and, given initial conditions $x_0$ and $v_0$, it admits a unique solution, a unique trajectory.
The solution, plotted in fig. \ref{rp-cm-fig-dragEvolution}, is
\begin{equation}
\begin{aligned}
x(t)&= x_0 + v_0 \frac{m}{b} \left( 1 - e^{-\frac{b}{m}t}\right) \\
v(t)&= v_0 e^{-\frac{b}{m}t} \\
a(t)&= - v_0 \frac{b}{m} e^{-\frac{b}{m}t}
\end{aligned}
\end{equation}
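The solution can be verified numerically. The following sketch (Python, with arbitrary illustrative values for $m$, $b$, $x_0$, $v_0$, not taken from the text) checks that the expressions above satisfy $ma = -bv$ and are mutually consistent under differentiation:

```python
import math

# Arbitrary illustrative parameters (not from the text).
m, b, x0, v0 = 1.0, 0.5, 0.25, 1.25

def x(t): return x0 + v0 * (m / b) * (1 - math.exp(-b / m * t))
def v(t): return v0 * math.exp(-b / m * t)
def a(t): return -v0 * (b / m) * math.exp(-b / m * t)

# Check m*a = -b*v along the trajectory, and that v and a are the
# time derivatives of x and v (central finite differences).
h = 1e-6
for t in (0.0, 0.5, 1.0, 2.0):
    assert abs(m * a(t) + b * v(t)) < 1e-9
    assert abs((x(t + h) - x(t - h)) / (2 * h) - v(t)) < 1e-5
    assert abs((v(t + h) - v(t - h)) / (2 * h) - a(t)) < 1e-5
```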
Can we reconstruct the forces acting on this system?
\begin{figure}
\centering
\begin{tikzpicture}
\def\xi{0.25};
\def\vi{1.25};
\def\m{1};
\def\b{1};
\pgfplotsset{ticks=none}
\begin{axis}[
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$x$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=2,
domain=0:4,
]
\addplot[black,samples=30]{\xi + \vi*(\m/\b)*(1-exp((-\b/\m)*x))}
node [pos=0,left] {$x_0$};
\addplot[black,samples=2,dashed,opacity=0.5] {1.5}
node [pos=0,left,opacity=1] {$x_0+ \frac{m}{b}v_0$};
\end{axis}
\begin{axis}[
xshift=5cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$v$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=1.7,
domain=0:4,
]
\addplot[black,samples=30]{\vi*exp((-\b/\m)*x)}
node [pos=0,left] {$v_0$};
\end{axis}
\begin{axis}[
xshift=10cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$a$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=-1.4,ymax=0.4,
domain=0:4,
]
\addplot[black,samples=30]{-\vi*(\b/\m)*exp((-\b/\m)*x)}
node [pos=0,left] {$-\frac{b}{m}v_0$};
\end{axis}
\end{tikzpicture}
\caption {Evolution in time of position, velocity and acceleration for $ma = - bv$. Both acceleration and velocity will tend to zero as time increases. The position will tend to an equilibrium given by initial position and initial velocity.} \label{rp-cm-fig-dragEvolution}
\end{figure}
The obvious answer seems to be that the constant $m$ represents the mass of the system and $F = -bv$ the force. This is the case of a particle under linear drag: the system is subjected to a frictional force that is proportional and opposite to the velocity. If we set the Lagrangian
\begin{equation}\label{rp-cm-frictionLagrangian}
L = \frac{1}{2} m v^2 e^{\frac{b}{m}t}
\end{equation}
and apply the Euler-Lagrange equation \ref{rp-cm-EulerLagrange} we have
\begin{equation}
\begin{aligned}
\partial_x L &= 0 = d_t \partial_v L = d_t \left(m v e^{\frac{b}{m}t} \right)=mae^{\frac{b}{m}t} + \frac{b}{m} m v e^{\frac{b}{m}t} = e^{\frac{b}{m}t}(ma + bv) \\
ma &= - bv.
\end{aligned}
\end{equation}
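The computation can be cross-checked numerically: since $\partial_x L = 0$, the Euler-Lagrange equation says the conjugate momentum $\partial_v L = m v e^{\frac{b}{m}t}$ must be constant along the drag trajectory. A minimal sketch (Python, arbitrary illustrative parameters):

```python
import math

m, b, v0 = 1.0, 0.5, 1.25  # arbitrary illustrative values

def v(t): return v0 * math.exp(-b / m * t)  # drag solution

# Conjugate momentum from L = (1/2) m v^2 exp(b t / m):
def p_conj(t): return m * v(t) * math.exp(b / m * t)

# d_t (dL/dv) = dL/dx = 0, so p_conj must be constant (= m v0).
for t in (0.0, 1.0, 2.0, 5.0):
    assert abs(p_conj(t) - m * v0) < 1e-12
```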
Therefore we have a Lagrangian for the system. We can also find a Hamiltonian
\begin{equation}\label{rp-cm-momentumHamiltonian}
\begin{aligned}
p &= \partial_v L = m v e^{\frac{b}{m}t} \\
v &= \frac{p}{m} e^{-\frac{b}{m}t} \\
H &= p v - L = p \frac{p}{m} e^{-\frac{b}{m}t} - \frac{1}{2} m \left( \frac{p}{m} e^{-\frac{b}{m}t} \right)^2 e^{\frac{b}{m}t} = \frac{p^2}{m} e^{-\frac{b}{m}t} - \frac{1}{2} \frac{p^2}{m} e^{-\frac{b}{m}t} \\
&=\frac{1}{2} \frac{p^2}{m} e^{-\frac{b}{m}t}
\end{aligned}
\end{equation}
and apply Hamilton's equations \ref{rp-cm-HamiltonEq}
\begin{equation}
\begin{aligned}
d_t q &= \partial_p H = \frac{p}{m} e^{-\frac{b}{m}t} \\
d_t p &= - \partial_q H = 0.
\end{aligned}
\end{equation}
The second equation tells us that the conjugate momentum is a constant $p_0$. Substituting this constant into the first equation gives the velocity as a function of time, which we can integrate. We have
\begin{equation}
\begin{aligned}
q(t) &= q_0 + \frac{p_0}{b} \left( 1 - e^{-\frac{b}{m}t}\right) \\
p(t) &= p_0.
\end{aligned}
\end{equation}
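One can check numerically that this pair satisfies Hamilton's equations (a standalone sketch with arbitrary illustrative parameters):

```python
import math

m, b, q0, p0 = 1.0, 0.5, 0.25, 1.25  # arbitrary illustrative values

def q(t): return q0 + (p0 / b) * (1 - math.exp(-b / m * t))
def p(t): return p0  # conjugate momentum is constant

def dH_dp(t): return (p(t) / m) * math.exp(-b / m * t)
def dH_dq(t): return 0.0

# Central finite differences of the solution against Hamilton's equations.
h = 1e-6
for t in (0.0, 0.5, 1.5, 3.0):
    assert abs((q(t + h) - q(t - h)) / (2 * h) - dH_dp(t)) < 1e-5  # d_t q =  dH/dp
    assert abs((p(t + h) - p(t - h)) / (2 * h) + dH_dq(t)) < 1e-5  # d_t p = -dH/dq
```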
The kinematics works perfectly, but the dynamics seems off, as shown in fig. \ref{rp-cm-fig-dragDynamics}. First of all, based on the physics, one would expect the momentum to be decreasing in time
\begin{equation}
p(t)=m v(t) = m v_0 e^{-\frac{b}{m}t}.
\end{equation}
However, conjugate momentum is a constant of motion. For the energy, we would expect the Hamiltonian to match the kinetic energy
\begin{equation}
E(t)=\frac{1}{2} m v^2(t) = \frac{1}{2} m v_0^2 e^{-2\frac{b}{m}t}
\end{equation}
but if we express the Hamiltonian in terms of velocity we have
\begin{equation}
H(t)=\frac{1}{2} \frac{p^2}{m} e^{-\frac{b}{m}t} = \frac{1}{2} \frac{1}{m} \left( m v(t) e^{\frac{b}{m}t} \right)^2 e^{-\frac{b}{m}t}= \frac{1}{2} m v^2(t) e^{\frac{b}{m}t} = \frac{1}{2} m v_0^2 e^{-\frac{b}{m}t}.
\end{equation}
That is, the Hamiltonian decreases more slowly than the kinetic energy: it decays at half the exponential rate. This is not good.
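Numerically, the mismatch in decay rates can be seen in a short sketch (arbitrary illustrative parameters):

```python
import math

m, b, v0 = 1.0, 0.5, 1.25  # arbitrary illustrative values

def E(t): return 0.5 * m * v0**2 * math.exp(-2 * b / m * t)  # kinetic energy
def H(t): return 0.5 * m * v0**2 * math.exp(-b / m * t)      # Hamiltonian

assert abs(E(0.0) - H(0.0)) < 1e-12  # they agree only at t = 0
for t in (0.5, 1.0, 3.0):
    assert H(t) > E(t)
    # H decays at half the exponential rate of E:
    assert abs(H(t) / E(t) - math.exp(b / m * t)) < 1e-9
```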
\begin{figure}
\centering
\begin{tikzpicture}
\def\qi{0.25};
\def\pi{1.25};
\def\vi{1.25};
\def\m{1};
\def\b{1};
\pgfplotsset{ticks=none}
\begin{axis}[
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$q$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=2,
domain=0:4,
]
\addplot[black,samples=30]{\qi + (\pi/\b)*(1-exp((-\b/\m)*x))}
node [pos=0,left] {$q_0$};
\addplot[black,samples=2,dashed,opacity=0.5] {1.5}
node [pos=0,left,opacity=1] {$q_0+\frac{1}{b}p_0$};
\end{axis}
\begin{axis}[
xshift=5cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$p$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=1.7,
domain=0:4,
]
\addplot[black,samples=30]{\m*\vi*exp((-\b/\m)*x)}
node [pos=0.5,above=5pt] {$mv(t)$};
\addplot[black,samples=30]{\pi}
node [pos=0,left] {$p_0$}
node [pos=0.5,above] {$p(t)$};
\end{axis}
\begin{axis}[
xshift=10cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$E$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=1,
domain=0:4,
]
\addplot[black,samples=30]{0.5*\m*pow(\vi,2)*exp((-\b/\m)*2*x)}
node [pos=0.3,below,left=1pt] {$E(t)$};
\addplot[black,samples=30]{0.5*\m*pow(\vi,2)*exp((-\b/\m)*x)}
node [pos=0,left] {$\frac{p_0^2}{2m}$}
node [pos=0.4,above=5pt] {$H(t)$};
\end{axis}
\end{tikzpicture}
\caption {Trying to interpret $L = \frac{1}{2} m v^2 e^{\frac{b}{m}t}$ and $H=\frac{1}{2} \frac{p^2}{m} e^{-\frac{b}{m}t}$ as respectively the Lagrangian and Hamiltonian of a particle under linear drag. While evolution of the position matches, note how the conjugate momentum is constant while the kinetic momentum decreases. Also, the Hamiltonian and the energy do not decrease at the same rate.} \label{rp-cm-fig-dragDynamics}
\end{figure}
Now, it is true that conjugate momentum is not the same as kinetic momentum. But the difference, as we will see much more clearly later, is caused by non-inertial non-Cartesian coordinate systems and/or the presence of vector potential forces.\footnote{The relationship is $p_i = m g_{ij} v^j + \mathfrak{q} A_i$. This reduces to $p_i = m \delta_{ij} v^j$ if and only if we are in an inertial frame with Cartesian coordinates (i.e. $g_{ij}=\delta_{ij}$) and no vector potential ($A_i = 0$).} We are not at all in that case. Also, note that at time $t=0$ the momentum and the energy do match our expectation, but not afterwards. Therefore imagine a situation where friction is non-negligible only in a particular region. We would expect $p=mv$ to be valid before the particle enters that region, but not when it comes out. But wouldn't it come out into another region where we would expect $p=mv$ to work? This is strange. How should we proceed?
\subsection{Variable mass system}
As is typical in reverse physics, we will assume that things work in a reasonable way and that we simply have the wrong connection between physics and math. Recall that we started just with an equation, and we then interpreted $m$ to be the mass of the system. Let's instead assume that $m$ is merely a constant with units of mass, and define the actual mass of the system as the ratio between conjugate momentum and velocity. Looking back at \ref{rp-cm-momentumHamiltonian}, as shown in fig. \ref{rp-cm-fig-dragVariableMass}, we have
\begin{equation}
\begin{aligned}
\hat{m}(t) &= p(t) / v(t) = m e^{\frac{b}{m}t} \\
p(t) &= mv(t)e^{\frac{b}{m}t} = \hat{m}(t) v(t) \\
H(t) &= \frac{1}{2} \frac{p^2(t)}{m} e^{-\frac{b}{m}t} = \frac{1}{2} \frac{p^2(t)}{\hat{m}} = \frac{1}{2} \hat{m}(t) v^2(t) = E(t)
\end{aligned}
\end{equation}
Now everything actually works perfectly: the relationship between velocity and conjugate momentum is respected, and the Hamiltonian matches the kinetic energy. We just have a variable mass system. How and why does this work exactly?
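The bookkeeping can be verified numerically (sketch, arbitrary illustrative parameters):

```python
import math

m, b, v0 = 1.0, 0.5, 1.25  # arbitrary illustrative values

def v(t):     return v0 * math.exp(-b / m * t)
def m_hat(t): return m * math.exp(b / m * t)   # reinterpreted, variable mass
def p(t):     return m_hat(t) * v(t)           # conjugate momentum
def H(t):     return 0.5 * p(t)**2 / m * math.exp(-b / m * t)
def E(t):     return 0.5 * m_hat(t) * v(t)**2  # kinetic energy with m_hat

for t in (0.0, 1.0, 2.5):
    assert abs(p(t) - m * v0) < 1e-12  # p = m_hat v is constant
    assert abs(H(t) - E(t)) < 1e-12    # Hamiltonian now equals the energy
```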
\begin{figure}
\centering
\begin{tikzpicture}
\def\qi{0.25};
\def\pi{1.25};
\def\vi{1.25};
\def\m{1};
\def\b{1};
\pgfplotsset{ticks=none}
\begin{axis}[
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$\hat{m}$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=2,
domain=0:4,
]
\addplot[black,samples=20,domain=0:4]{\m*exp((\b/\m)*x)/30}
node [pos=0,left] {$m$};
\end{axis}
\begin{axis}[
xshift=5cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$p$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=1.7,
domain=0:4,
]
\addplot[black,samples=20]{\pi}
node [pos=0,left] {$p_0$}
node [pos=0.5,above] {$p(t) = \hat{m}(t)v(t) $};
\end{axis}
\begin{axis}[
xshift=10cm,
width=5.3cm,
height=4.5cm,
axis lines=middle,
axis line style={-},
clip=false,
xlabel=$t$,
xlabel style={below=5pt},
ylabel=$E$,
ylabel style={above=2pt, left},
xmin=0,xmax=4,
ymin=0,ymax=1,
domain=0:4,
]
\addplot[black,samples=20]{0.5*\m*pow(\vi,2)*exp((-\b/\m)*x)}
node [pos=0,left] {$\frac{p_0^2}{2m}$}
node [pos=0.2,above,right=5pt] {$H(t) = E(t)$};
\end{axis}
\end{tikzpicture}
\caption {Showing how $L = \frac{1}{2} m v^2 e^{\frac{b}{m}t}$ and $H =\frac{1}{2} \frac{p^2}{m} e^{-\frac{b}{m}t}$ can be interpreted as the Lagrangian and Hamiltonian of a variable mass system. The mass is increasing exponentially in time, while both conjugate and kinetic momentum remain constant. This means the velocity will need to decrease. The energy decreases at the same rate as the Hamiltonian. } \label{rp-cm-fig-dragVariableMass}
\end{figure}
Let us expand Newton's second law for a variable mass system.\footnote{Note that, in general, the variable mass system should take into account the momentum gained or lost by the system when the mass is acquired or ejected. In our case, we are assuming that no momentum is lost, which means that either the mass is acquired/ejected uniformly from all directions or it is just an apparent change that depends on the change of coordinates.} We have:
\begin{equation}
\begin{aligned}
F^i &= d_t (\hat{m}v^i) = d_t \hat{m} \, v^i + \hat{m} a^i \\
\hat{m} a^i &= F^i - d_t \hat{m} v^i
\end{aligned}
\end{equation}
In particular, for our one dimensional case, let us set $F=0$ and substitute $\hat{m}$
\begin{equation}
\begin{aligned}
m e^{\frac{b}{m}t} a &= 0 - d_t \left( m e^{\frac{b}{m}t} \right) v = -\frac{b}{m} m e^{\frac{b}{m}t} v \\
ma &= -bv.
\end{aligned}
\end{equation}
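As a numerical cross-check of the variable mass reading (sketch, arbitrary illustrative parameters): with zero net force the momentum $\hat{m}v$ is conserved, and the same kinematic equation $ma=-bv$ comes out:

```python
import math

m, b, v0 = 1.0, 0.5, 1.25  # arbitrary illustrative values

def v(t):     return v0 * math.exp(-b / m * t)
def a(t):     return -(b / m) * v(t)
def m_hat(t): return m * math.exp(b / m * t)

h = 1e-6
for t in (0.0, 0.7, 2.0):
    # F = d_t(m_hat v) = 0: momentum conserved (central finite difference).
    d_mv = (m_hat(t + h) * v(t + h) - m_hat(t - h) * v(t - h)) / (2 * h)
    assert abs(d_mv) < 1e-5
    # The resulting kinematics is m a = -b v.
    assert abs(m * a(t) + b * v(t)) < 1e-12
```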
Therefore the same equation, the same kinematics, applies to a variable mass system whose mass increases over time. You can imagine, for example, a body that is absorbing mass from all directions, so that the balance of forces on the body is zero. The body, then, is not slowing down because of friction. It is slowing down because momentum is conserved, and if the mass is increasing, the velocity must be decreasing at the same rate. The energy, on the other hand, will decrease because the square of the velocity decreases faster than the mass increases.
In Newtonian mechanics, we can readily distinguish these two cases because we have to be explicit about forces and masses. In Hamiltonian mechanics things are a bit more difficult because, as we will see later more precisely, conjugate momentum is not exactly kinetic momentum and the Hamiltonian is not exactly energy. Yet, conjugate momentum and the Hamiltonian are not kinematic quantities, they are dynamic quantities and therefore we can see that these would be different in different cases. In Lagrangian mechanics this is even more difficult to see because it looks like a purely kinematic theory, while it is not: the Lagrangian itself is not a purely kinematic entity. As we saw, Lagrangian mechanics implicitly assumes \ref{assum_kineq}, which is a condition on the dynamics as well, and the Lagrangian itself is used to reconstruct conjugate momentum and the Hamiltonian. Moreover, if Lagrangian mechanics were a purely kinematic theory, and told us nothing about forces, energy or momentum, it would not be a complete formulation of classical mechanics.
So we have seen that the same kinematic equation can describe a constant mass dissipative system or a variable mass system. Is that it? Not quite. Recall that we mentioned that kinetic and conjugate momentum will differ in non-inertial frames. Note that we implicitly assumed that $x$ and $t$ represented the variables for an inertial observer, in the same way that we originally assumed $m$ was the mass of the system. Could the same equation, then, be describing yet another system but in a non-inertial frame?
\subsection{Non-inertial motion}
Let's compare the motion of a particle traveling at constant velocity in an inertial frame, using $t$ as the time variable, and the motion of a particle decelerating exponentially, using $\hat{t}$ as the time variable
\begin{equation}
\begin{aligned}
x(t) &= x_0 + v_0 t \\
x(\hat{t}) &= x_0 + v_0 \frac{m}{b}\left(1-e^{-\frac{b}{m}\hat{t}}\right).
\end{aligned}
\end{equation}
Note the striking similarity: we can simply set
\begin{equation}
t = \frac{m}{b} \left(1-e^{-\frac{b}{m}\hat{t}}\right)
\end{equation}
which clearly takes us to a non-inertial frame since uniform motion is no longer uniform in the new frame.
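We can confirm the identification numerically: re-reading the uniform motion with the new clock reproduces the decelerating trajectory (sketch, arbitrary illustrative parameters):

```python
import math

m, b, x0, v0 = 1.0, 0.5, 0.25, 1.25  # arbitrary illustrative values

def x_uniform(t):   return x0 + v0 * t  # constant velocity, inertial time t
def t_of(t_hat):    return (m / b) * (1 - math.exp(-b / m * t_hat))
def x_decel(t_hat): return x0 + v0 * (m / b) * (1 - math.exp(-b / m * t_hat))

# Substituting t = t(t_hat) into the uniform motion gives the
# exponentially decelerating trajectory.
for t_hat in (0.0, 0.5, 1.0, 4.0):
    assert abs(x_uniform(t_of(t_hat)) - x_decel(t_hat)) < 1e-12
```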
Let's study how Newton's second law changes if we make a change of time variable while keeping the position variables unchanged
\begin{equation}
\begin{aligned}
\hat{t}&=\hat{t}(t) \\
F^i &= d_t (m \, v^i) = d_t (m \, d_t x^i) = d_t \hat{t} \, d_{\hat{t}} (m \, d_t \hat{t} \, d_{\hat{t}} x^i).
\end{aligned}
\end{equation}
If we set
\begin{equation}
\hat{m} = m \, d_t \hat{t}
\end{equation}
we can express the previous equation in the following form
\begin{equation}
\begin{aligned}
F^i &= d_t \hat{t} \, d_{\hat{t}} (\hat{m} \, d_{\hat{t}} x^i) = d_t \hat{t} \, d_{\hat{t}} (\hat{m} \, \hat{v}^i) = d_t \hat{t} \hat{F}^i.
\end{aligned}
\end{equation}
This tells us that the second observer will see an effective mass rescaled exactly by the ratio between the time variables. Note that this is exactly what happens in special relativity: the clock for a boosted observer is dilated by a factor of $\gamma$ which is exactly the factor used in the relativistic mass.\footnote{It may be surprising to see a proto-relativistic effect showing up given that no assumption on space-time has been made. As we will see, these types of connections between different theories come up often in reverse physics.} If $t$ is the time variable for an inertial frame and $t(\hat{t})$ is a non-linear function, the resulting frame will be non-inertial and the observer will see an effective variable mass system.
If we look at our problem this way, the rescaling of the mass, then, is not due to a truly variable mass, but a variable effective mass due to the slowing down of the clock. The body slows down because the non-inertial time is slowing down and the body appears to stop because the clock becomes infinitely slow. While this might sound like a contrived case,\footnote{On the surface, it sounds similar to what happens in general relativity with a black-hole. An observer that sees someone falling into a black hole will see him gradually slowing down as he approaches the event horizon and asymptotically stop there. The observer falling inside the black hole, instead, will perceive his time flowing uniformly and nothing special will happen as the event horizon is crossed.} these are exactly the type of situations a fully relativistic theory (i.e. one that works for all definitions of time and space variables) needs to take into account.
We can verify that this gives us the correct effective mass
\begin{equation}
\begin{aligned}
d_{\hat{t}} t &=d_{\hat{t}} \left( \frac{m}{b} (1-e^{-\frac{b}{m}\hat{t}}) \right) =\frac{m}{b} d_{\hat{t}} (1-e^{-\frac{b}{m}\hat{t}}) = - \frac{m}{b} d_{\hat{t}} e^{-\frac{b}{m}\hat{t}} = + \frac{m}{b} \frac{b}{m} e^{-\frac{b}{m}\hat{t}} = e^{-\frac{b}{m}\hat{t}} \\
\hat{m} &= m d_t \hat{t} = m (d_{\hat{t}} t)^{-1} = m e^{\frac{b}{m}\hat{t}}.
\end{aligned}
\end{equation}
And we can verify that we get the same equation by plugging in the time transformation in Newton's second law with a zero force
\begin{equation}
\begin{aligned}
0 &= d_t (m \, v) = d_t (m \, d_t x) = d_t \hat{t} \, d_{\hat{t}} (m \, d_t \hat{t} \, d_{\hat{t}} x) \\ &= e^{\frac{b}{m}\hat{t}} \, d_{\hat{t}} (m e^{\frac{b}{m}\hat{t}} \, \hat{v}) = e^{\frac{b}{m}\hat{t}} \left( m \frac{b}{m} e^{\frac{b}{m}\hat{t}} \hat{v} + m e^{\frac{b}{m}\hat{t}} \hat{a} \right) = e^{2\frac{b}{m}\hat{t}} \left( b \hat{v} + m \hat{a} \right) \\
m \hat{a} &= - b \hat{v}.
\end{aligned}
\end{equation}
Note that the expressions for momentum and energy will match the previous case because the system in the non-inertial frame looks like a variable mass system.
\subsection{The relationship between kinematics and dynamics}
%TODO: It may be worth to use the following more clear definitions. Motion is the actual frame-independent object (i.e. the trajectory); kinematics is the representation of the motion in a particular frame; cause of motion is the actual frame-independent object; the dynamics is the expression of the causes in a particular frame (i.e. the forces). In general, one needs to know the actual frame to be able to relate kinematics/motion; dynamics/causes of motion
This last case highlights a more subtle issue. In the two previous cases we were in the same inertial frame, we saw the same trajectory, the same kinematics, but we couldn't tell whether we were looking at a fixed mass system under linear drag or a variable mass system: we couldn't tell the dynamics. Now, we have the same system, a constant mass particle under no forces, described in two different frames, one inertial and one not. The motion of the system will naturally have different representations in the different frames, but this does not mean the motion or the causes of motion are different: it's the same object. Therefore we have the same motion even though we have different expressions for the trajectory. The expression $x(t)$, then, is not enough to define the kinematics if we do not know exactly what $x$ and $t$ represent physically, if the frame is not given.
While typically one proceeds by defining the frame first and then the dynamics (i.e. the forces acting on the system), here we have followed a different approach: we first defined the dynamics (i.e. constant mass system under no forces) and then found the frame that matched the given kinematics (i.e. the trajectory or the relationship between velocity and acceleration). Given that Lagrangian and Hamiltonian mechanics are frame invariant, an intrinsic characterization of the system itself is exactly what we should be looking for. Saying, for example, that a system is subjected to no forces or to a linear drag is not frame invariant because forces are not frame invariant.
It is clear that the type of apparent variable mass due to non-inertial frames is unavoidable if we want to have a consistent theory with invariant laws. Therefore both Lagrangian and Hamiltonian mechanics must include these cases. However, it is not exactly clear what to do for true variable mass systems. From a cursory look, it would seem that everything is fine and there is no harm in including them. Yet again, from a cursory look we seemed to have a Lagrangian for a particle under linear drag. As we will see later, there are implicit connections between Lagrangian/Hamiltonian mechanics on one side and thermodynamics, statistical mechanics and special relativity on the other. Given that it is not clear to us whether these connections hold or not,\footnote{For example, areas of phase space are connected to entropy. Does this connection hold with a variable mass system?} we will concentrate on the constant mass case from now on.
Let's recap what we learned. The biggest point is that we can't simply look at the kinematics and understand the causes of motion. The different formulations have different ways to relate the dynamics and the kinematics. Newtonian mechanics is the clearest about the dynamics, as it makes us spell out explicitly what is going on. This, however, comes at a cost: the equations are not covariant, meaning they have a different expression in different frames. The second law, in fact, is valid only in inertial frames with Cartesian coordinates. It is only in these frames that a body will proceed in uniform motion if no forces are applied to it. If we are in polar coordinates, for example, the trajectory expressed in radius $r$ and angle $\theta$ will not be linear. Even the notion of force is, if one looks closely, a bit ambiguous. In principle, we want to write both the second law $F=ma$ and the expression for work $dW = F dx$. If $dW$ is to be invariant under a change of position variables, the force should be a covector and therefore $dW = F_i dx^i$. But since the acceleration $a$ changes like a vector, we also have $F^i = m a^i$. The notions of force in the second law and in the infinitesimal work are therefore slightly different, and they coincide only in an inertial frame with Cartesian coordinates.
On the other side, Hamiltonian and Lagrangian mechanics are coordinate independent: the laws remain the same if we change position variables. This makes them more useful in many contexts. Lagrangian mechanics is more useful when studying the symmetries of the system. Hamiltonian mechanics is more useful for statistical mechanics and to better separate degrees of freedom. However, this comes at a price: Hamiltonian and Lagrangian mechanics apply in fewer cases than Newtonian mechanics. As we saw, linear drag may look like it has a valid Hamiltonian/Lagrangian, but it doesn't. For quadratic drag or friction due to normal force, one cannot even find a suitable trick, and one is forced to use Rayleigh's dissipation function, which modifies the Euler-Lagrange equations. This is not a coincidence: while Newtonian mechanics links kinematics and dynamics by choosing a particular frame, Hamiltonian and Lagrangian mechanics do so by fixing a type of system. It is the implicit knowledge of the type of system that allows us to reconstruct the dynamics just by looking at the kinematics in an unknown frame. What we need to understand, then, is what exactly this restriction is.
\section{Reversing Hamiltonian mechanics}
We now turn our attention to Hamiltonian mechanics and try to understand exactly what types of systems it focuses on. We will find twelve equivalent formulations of Hamiltonian mechanics that link ideas from vector calculus, differential geometry, statistical mechanics, thermodynamics, information theory and plain statistics. The overall result is that Hamiltonian mechanics focuses on systems that are assumed to be deterministic and reversible. We will see how the physical significance of that assumption differs from mathematically naive characterizations.
\subsection{Mathematical characterizations}
%TODO: finalize syntax for vectors (no commas?)
To simplify our discussion, we will first concentrate on a single degree of freedom. The first characterization of Hamiltonian mechanics is naturally in terms of the equations
\begin{equation}\notag
\begin{aligned}
d_t q &= \partial_p H \\
d_t p &= - \partial_q H.
\end{aligned}
\tag{HM-1D}\label{rp-cm-hm-condEquations}
\end{equation}
We will want to treat phase space as a generic two-dimensional space (i.e. manifold), like we would a plane in physical space. We will reserve the term coordinate for the position variable $q$, while we will refer to the collection of position and momentum as state variables, denoting them by $\xi^a = [q, p]$. We can now define the displacement field
\begin{equation}\label{rp-cm-displacement1d}
S^a = d_t \xi^a = [d_t q, d_t p]
\end{equation}
which is a vector field that defines the evolution of the system in time. Hamilton's equations, then, can be expressed as
\begin{equation}\label{rp-cm-displacementCurl}
\begin{aligned}
S^q &= \partial_p H \\
S^p &= - \partial_q H.
\end{aligned}
\end{equation}
To bring out the geometric meaning of the equations, we introduce the matrix
\begin{equation}\tag{SF-1D}\label{rp-cm-sf-symplectic1d}
\omega_{ab} = \left[\begin{array}{cc}
\omega_{qq} & \omega_{qp} \\
\omega_{pq} & \omega_{pp}
\end{array} \right]= \left[\begin{array}{cc}
0 & 1 \\
-1 & 0
\end{array} \right]
\end{equation}
which rotates a vector by a right angle.\footnote{The notion of angle is technically ill-defined in phase space, but this slight imprecision makes it easier to get the point across.} That is, if $v^a = [v^q, v^p]$, then $v_a = v^b \omega_{ba} = [-v^p, v^q]$.\footnote{The notation is purposely similar to how indexes are raised and lowered in general relativity by the metric tensor $g_{\alpha\beta}$, since $\omega_{ab}$ plays a similar geometric role in phase space. One should be careful, however, that $\omega_{ab}$ is anti-symmetric (i.e. $\omega_{ab} = - \omega_{ba}$), so it matters which side is contracted. In terms of symplectic geometry, the rotated displacement field $S_a$ corresponds to the interior product of the displacement field with the symplectic form, usually noted as $\iota_S \omega$ or $S \lrcorner \, \omega$.} We can rewrite equation \ref{rp-cm-hm-condEquations} as
\begin{equation}\tag{HM-G}\label{rp-cm-hm-condGeneralizedEquations}
\begin{aligned}
S_a = S^b \omega_{ba} &= \partial_a H
\end{aligned}
\end{equation}
which tells us that the displacement field is the gradient of the Hamiltonian rotated by a right angle. Note that the gradient is perpendicular to the lines at constant energy. Therefore, as we can see in fig. \ref{rp-cm-fig-HamiltonianRotation}, a right angle rotation gives us a vector field tangent to those lines, making it geometrically evident that the value of the Hamiltonian is a constant of motion. Condition \ref{rp-cm-hm-condGeneralizedEquations} is just a re-expression of \ref{rp-cm-hm-condEquations}. Though it is already useful, we want to find different mathematical conditions which turn out to be equivalent to the equations.
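The geometric picture can be checked with a short sketch for the harmonic oscillator $H=\frac{p^2}{2m} + \frac{1}{2} k q^2$ (arbitrary illustrative parameter values): the displacement field is the right-angle rotation of the gradient, and consequently has no component along the gradient, i.e. it is tangent to the level sets of $H$:

```python
m, k = 1.0, 2.0  # arbitrary illustrative values

def grad_H(q, p): return (k * q, p / m)   # (dH/dq, dH/dp)
def S(q, p):      return (p / m, -k * q)  # (S^q, S^p) from Hamilton's equations

for q, p in [(1.0, 0.0), (0.3, -0.7), (-1.2, 0.5)]:
    gq, gp = grad_H(q, p)
    sq, sp = S(q, p)
    # Rotating the gradient (gq, gp) by a right angle gives (gp, -gq):
    assert abs(sq - gp) < 1e-12 and abs(sp + gq) < 1e-12
    # Tangency to level sets: zero component along the gradient,
    # so H is a constant of motion.
    assert abs(sq * gq + sp * gp) < 1e-12
```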
\begin{figure}
\centering
\includegraphics{images/rp-cm-fig-HamiltonianRotation.pdf}
\caption {The surface plot shows the value of the Hamiltonian for a harmonic oscillator $H=\frac{p^2}{2m} + \frac{1}{2} k q^2$, red means higher value. The lines are the regions at constant energy $H$. On the left, the gradient of the Hamiltonian is shown. On the right, the displacement field is shown, which is the gradient rotated by a right angle. Note how the displacement is always parallel to the lines at constant energy.} \label{rp-cm-fig-HamiltonianRotation}
\end{figure}
We start by noting that the displacement field as expressed by \ref{rp-cm-displacementCurl} looks very similar to a curl of $H$, except that it is a two dimensional version. In vector calculus, a vector field is the curl of another field if and only if its divergence is zero.\footnote{We will leave for now topological requirements as they would be a distraction from the overall point.} This holds here as well. First, we can verify that
\begin{equation}
\partial_a S^a = \partial_q S^q + \partial_p S^p = \partial_q \partial_p H - \partial_p \partial_q H = 0.
\end{equation}
Geometrically, this means that the net flow of $S^a$ through the boundary of any closed region is zero, as shown in fig. \ref{rp-cm-fig-HamiltonianFlow}. That is, $\oint \left( S^q dp - S^p dq \right) = 0$. Note that, since we are in a two dimensional space, a hyper-surface has dimension $n-1 = 2-1 = 1$, and therefore hyper-surfaces are lines. Therefore we have
\begin{equation}
\oint \left( S^q dp - S^p dq \right) = \oint \left( \partial_p H dp + \partial_q H dq \right) = \oint dH = 0.
\end{equation}
That is, the flow of the displacement field is the line integral of the gradient of $H$, which is zero over a closed curve.
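This can be checked numerically: the flow of $S^a$ through a closed curve in phase space, approximated as a discrete line integral, vanishes. A sketch for the harmonic oscillator (arbitrary illustrative values; midpoint sampling on a circle):

```python
import math

m, k = 1.0, 2.0  # arbitrary illustrative values

# Displacement field for H = p^2/(2 m) + k q^2/2:
def S(q, p): return (p / m, -k * q)

# Flow of S through a closed curve (here a circle in phase space),
# approximated as the line integral of (S^q dp - S^p dq) with
# midpoint sampling.
def loop_flow(qc, pc, r, n=20000):
    total = 0.0
    for i in range(n):
        th0 = 2 * math.pi * i / n
        th1 = 2 * math.pi * (i + 1) / n
        thm = 0.5 * (th0 + th1)
        sq, sp = S(qc + r * math.cos(thm), pc + r * math.sin(thm))
        dq = r * (math.cos(th1) - math.cos(th0))
        dp = r * (math.sin(th1) - math.sin(th0))
        total += sq * dp - sp * dq
    return total

assert abs(loop_flow(0.7, -0.3, 1.2)) < 1e-6
```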
Conversely, we can see that each divergenceless field in two dimensions admits a stream function $H$ that satisfies \ref{rp-cm-hm-condEquations}. Geometrically, we can construct $H$ in the following way. Take a reference state $O$ in phase space and assign $H(O) = 0$. For any other state $P$, consider the flow of $S$ through any two lines that connect $O$ and $P$. Given that the flow through the region contoured by those lines must be zero, the flow through each line must be equal. Therefore the flow through a line that connects $O$ and $P$ only depends on the states, it is path independent. We can assign $H(P) = \int_{OP} \left( S^q dp - S^p dq \right)$. If we expand the differential of $H$ we have
\begin{equation}
dH = \partial_q H dq + \partial_p H dp = - S^p dq + S^q dp.
\end{equation}
If we equate the components, we recover \ref{rp-cm-hm-condEquations}. Geometrically, at least for the one dimensional case, we can understand the difference of the Hamiltonian between two states as the flow of the displacement field between them.
\begin{figure}
\centering
\begin{tikzpicture}[decoration={markings,
mark=between positions 0 and 0.8 step 0.2
with {\draw[blue,-stealth] (0pt,10pt) -- (10pt,-8pt);}}]
\node (p) at (0,5) {p};
\node (q) at (5,0) {q};
\node (O) at (1,1) {O};
\node (P) at (4,4) {P};
\draw (p) -- (0,0) -- (q);
\draw [postaction={decorate}] (O) to[bend right] ++(1.5,1.5) to[bend left] (P);
\end{tikzpicture}
\begin{tikzpicture}
\node (p) at (0,5) {p};
\node (q) at (5,0) {q};
\draw (p) -- (0,0) -- (q);
\path (1.7,1) coordinate (A)
(2.3,2.6) coordinate(B)
(3.7,3.5) coordinate (C)
(3.2,2.4) coordinate (D);
\draw (A) .. controls (0.3,1.6) and (2,2.3) ..
(B) .. controls (2.5,2.8) and (2.7,3.7) ..
(C) .. controls (4.6,3.3) and (3.5,2.8) ..
(D) .. controls (2.9,2) and (3.1,0.4) ..
(A);
\draw [blue,-stealth] (0.9,1.7)--(1.5,1.6);
\draw [blue,-stealth] (1.2,2)--(1.9,1.9);
\draw [blue,-stealth] (1.6,2.4)--(2.3,2.2);
\draw [blue,-stealth] (2,2.7)--(2.6,2.6);
\draw [blue,-stealth] (2.3,3)--(3,3);
\draw [blue,-stealth] (2.6,1.25)--(3.2,1.1);
\draw [blue,-stealth] (2.7,1.7)--(3.3,1.6);
\draw [blue,-stealth] (2.8,2.1)--(3.4,2.1);
\draw [blue,-stealth] (3.1,2.5)--(3.7,2.6);
\draw [blue,-stealth] (3.5,3)--(4.2,3.1);
\end{tikzpicture}
\caption {The flow of the displacement field $S^a$ through a path, shown on the left, is equal to the difference of the Hamiltonian at the two points $\Delta H = \int_{OP} \left( S^q dp - S^p dq \right)$. The net flow of states through a region (i.e. the flow of the displacement field through the boundary) is zero, as shown on the right. This means that $S^a$ is divergenceless and will admit a stream function, a potential, which corresponds to the Hamiltonian $H$.} \label{rp-cm-fig-HamiltonianFlow}
\end{figure}
We conclude that the following condition
\begin{equation}\label{rp-cm-dr-condDivergenceDisplacement}
\tag{DR-DIV}
\eqtext{The displacement field is divergenceless: $\partial_a S^a = 0$}
\end{equation}
is equivalent to \ref{rp-cm-hm-condEquations}. Unlike \ref{rp-cm-hm-condGeneralizedEquations}, this is a truly different mathematical condition.
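Condition \ref{rp-cm-dr-condDivergenceDisplacement} can be checked numerically. The sketch below (an illustration, not from the text; the pendulum Hamiltonian is an assumed example) builds the displacement field from $H$ by finite differences and verifies that its divergence vanishes at a few points.

```python
import math

# Illustrative check: for the (assumed) pendulum Hamiltonian
# H(q, p) = p^2/2 + (1 - cos q), Hamilton's equations give the
# displacement field S^q = dH/dp, S^p = -dH/dq, and DR-DIV says
# its divergence d_q S^q + d_p S^p vanishes everywhere.

def H(q, p):
    return p**2 / 2 + (1 - math.cos(q))

def S(q, p, h=1e-6):
    """Displacement field via central-difference partials of H."""
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2 * h)
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2 * h)
    return dH_dp, -dH_dq  # (S^q, S^p)

def divergence(q, p, h=1e-4):
    dSq_dq = (S(q + h, p)[0] - S(q - h, p)[0]) / (2 * h)
    dSp_dp = (S(q, p + h)[1] - S(q, p - h)[1]) / (2 * h)
    return dSq_dq + dSp_dp

print(abs(divergence(0.7, -1.3)))  # numerically zero
```

The same check fails for any field that is not obtained by rotating the gradient of a potential.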
Having looked at the flow through a region, we turn our attention to how regions themselves are transported by the evolution. Liouville's theorem states that volumes of phase space are preserved during Hamiltonian evolution, which in our case will be areas over the $q-p$ plane. To see this, let us review how variables transform, together with infinitesimal volumes:
\begin{equation}\label{rp-cm-volumeTransformation1d}
\begin{aligned}
\hat{\xi}^a &= \hat{\xi}^a(\xi^b) \\
d\hat{\xi}^a &= \partial_b \hat{\xi}^a d\xi^b \\
d\hat{\xi}^1 \cdots d\hat{\xi}^n &= \left| \partial_b \hat{\xi}^a \right| d\xi^1 \cdots d\xi^n \\
d\hat{q} d\hat{p} &= \left|\begin{array}{ c c }
\partial_q \hat{q} & \partial_p \hat{q} \\
\partial_q \hat{p} & \partial_p \hat{p} \\
\end{array} \right| dq dp \\
\end{aligned}
\end{equation}
This tells us that, mathematically, a transformation is volume preserving if the determinant of the Jacobian $\partial_b \hat{\xi}^a$ is unitary. If $\hat{q}$ and $\hat{p}$ represent the evolution of $q$ and $p$ after an infinitesimal time step $\delta t$, we have
\begin{equation}
\begin{aligned}
\hat{q} &= q + S^q \delta t \\
\hat{p} &= p + S^p \delta t \\
\partial_b \hat{\xi}^a &= \left[\begin{array}{ c c }
1 + \partial_q S^q \delta t & \partial_p S^q \delta t \\
\partial_q S^p \delta t & 1 + \partial_p S^p \delta t \\
\end{array} \right] \\
\left|\partial_b \hat{\xi}^a\right| &= (1 + \partial_q S^q \delta t) (1 + \partial_p S^p \delta t) - \partial_p S^q \, \partial_q S^p \, \delta t^2 = 1 + \left(\partial_q S^q + \partial_p S^p \right) \delta t + O(\delta t^2).
\end{aligned}
\end{equation}
Note that the first-order term is proportional to the divergence of the displacement field; therefore the Jacobian determinant is equal to one if and only if the displacement field is divergenceless. In other words, condition
\begin{equation}\label{rp-cm-dr-condUnitaryJacobian}
\tag{DR-JAC}
\eqtext{The Jacobian of time evolution is unitary: $\left|\partial_b \hat{\xi}^a\right|=1$}
\end{equation}
and condition
\begin{equation}\label{rp-cm-dr-condConservedVolume}
\tag{DR-VOL}
\eqtext{Volumes are conserved through the evolution: $d\hat{\xi}^1 \cdots d\hat{\xi}^n = d\xi^1 \cdots d\xi^n$}
\end{equation}
are equivalent to \ref{rp-cm-dr-condDivergenceDisplacement}. We have found a third and a fourth way to characterize Hamiltonian evolution.
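The first-order relation between the Jacobian determinant and the divergence can be verified numerically. In the sketch below (the two fields are assumed examples), the one-step Jacobian of $\xi \to \xi + S \, \delta t$ is built by finite differences for the divergence-free field $S = (p, -q)$ and for a dissipative field $S = (p, -q - p)$ whose divergence is $-1$.

```python
# Sketch: the determinant of the one-step Jacobian xi -> xi + S dt
# equals 1 + (div S) dt + O(dt^2).

def jac_det(Sq, Sp, q, p, dt, h=1e-6):
    """Determinant of d(xi_hat)/d(xi) for one Euler step of (Sq, Sp)."""
    d_qq = 1 + dt * (Sq(q + h, p) - Sq(q - h, p)) / (2 * h)
    d_qp = dt * (Sq(q, p + h) - Sq(q, p - h)) / (2 * h)
    d_pq = dt * (Sp(q + h, p) - Sp(q - h, p)) / (2 * h)
    d_pp = 1 + dt * (Sp(q, p + h) - Sp(q, p - h)) / (2 * h)
    return d_qq * d_pp - d_qp * d_pq

dt = 1e-3
# Hamiltonian field of H = (q^2 + p^2)/2: S = (p, -q), divergence 0.
det_ham = jac_det(lambda q, p: p, lambda q, p: -q, 0.4, 0.8, dt)
# Dissipative field with linear drag: S = (p, -q - p), divergence -1.
det_drag = jac_det(lambda q, p: p, lambda q, p: -q - p, 0.4, 0.8, dt)

print(det_ham - 1)   # deviates only at second order in dt
print(det_drag - 1)  # deviates by (div S) dt = -dt at first order
```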
\begin{figure}[h]
\centering
\includegraphics{images/rp-cm-fig-areaDensityConservation.pdf}
\caption {On the left side, we see how the displacement field $S^a$ transports areas of phase space to equal areas of phase space. On the right, we see Hamiltonian evolution transports a probability distribution point by point. The value of the probability density remains the same as it moves over phase space.}\label{rp-cm-fig-areaDensityConservation}
\end{figure}
While condition \ref{rp-cm-dr-condConservedVolume} is expressed in terms of areas, similar considerations apply to densities, since a density is a quantity divided by an infinitesimal area. In fact, densities
\begin{equation}\label{rp-cm-densityTransformation1d}
\begin{aligned}
\left| \partial_b \hat{\xi}^a \right| \hat{\rho}(\hat{\xi}^a) &= \rho(\xi^b).
\end{aligned}
\end{equation}
transform in an equal and opposite way with respect to areas (i.e. the Jacobian determinant is on the other side of the equality). The unitarity of the Jacobian determinant, then, is equivalent to requiring that the density at an initial state is always equal to the density at the corresponding final state. Both areas and densities are transported unchanged by Hamiltonian evolution, as shown in fig. \ref{rp-cm-fig-areaDensityConservation}. Therefore
\begin{equation}\label{rp-cm-dr-condConservedDensity}
\tag{DR-DEN}
\eqtext{Densities are conserved through the evolution: $\hat{\rho}(\hat{\xi}^a) = \rho(\xi^b)$ }
\end{equation}
is yet another equivalent characterization.
To get a yet different perspective, we can reframe these arguments in terms of $\omega_{ab}$ and $S_a$. Given two vectors $v^a$ and $w^a$, the area of the parallelogram they form is $v^q w^p - v^p w^q$. This can be rewritten as $v^a \omega_{ab} w^b$, which means we can think of $\omega_{ab}$ as a tensor that, given two vectors, returns the area of the parallelogram they form.\footnote{More properly, $\omega_{ab}$ is a two-form.} If we denote by $\hat{v}^a = \partial_b \hat{\xi}^a \, v^b$ and $\hat{w}^a = \partial_b \hat{\xi}^a \, w^b$ the transformed vectors, the invariance of the area can be written as
\begin{equation}
v^a \omega_{ab} w^b = \hat{v}^c \omega_{cd} \hat{w}^d.
\end{equation}
Since
\begin{equation}
\hat{v}^c \omega_{cd} \hat{w}^d = v^a \, \partial_a \hat{\xi}^c \omega_{cd} \, \partial_b \hat{\xi}^d \, w^b = v^a \, \hat{\omega}_{ab} w^b
\end{equation}
the previous equivalence means that $\omega_{ab} = \hat{\omega}_{ab}$, that is $\omega_{ab}$ remains unchanged. In other words, preserving the area for all possible pairs of vectors is the same as preserving the tensor $\omega_{ab}$ that returns the areas. We now see that $\omega_{ab}$ plays such an important geometric role that
\begin{equation}\label{rp-cm-di-condConservedSymplectic}
\tag{DI-SYMP}
\eqtext{The evolution leaves $\omega_{ab}$ invariant: $\hat{\omega}_{ab} = \omega_{ab}$}
\end{equation}
is yet another equivalent characterization of Hamiltonian mechanics.
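This can be made concrete with a small numerical sketch (the harmonic-oscillator flow is an assumed example). For one degree of freedom the exact flow of $H = (q^2 + p^2)/2$ is a rotation of phase space; its Jacobian $J$ should pull $\omega_{ab}$ back to itself, $\hat{\omega}_{ab} = (J^\mathsf{T} \omega J)_{ab} = \omega_{ab}$.

```python
import math

# Sketch: the Jacobian of a rotation of the q-p plane leaves
# omega_ab = [[0, 1], [-1, 0]] invariant, i.e. the flow is symplectic.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

omega = [[0, 1], [-1, 0]]

t = 0.9  # an arbitrary evolution time
J = [[math.cos(t), math.sin(t)],
     [-math.sin(t), math.cos(t)]]

omega_hat = matmul(transpose(J), matmul(omega, J))
print(omega_hat)  # equals omega up to rounding
```

For a generic $2 \times 2$ matrix $M$, $M^\mathsf{T} \omega M = \det(M)\,\omega$, which is why, for a single degree of freedom, preserving $\omega_{ab}$ and preserving areas are visibly the same condition.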
%\begin{figure}[h]
% \centering
% \begin{tikzpicture}
% \end{tikzpicture}
% \caption {TODO: F4 Areas from vectors.}
%\end{figure}
It is useful to look more closely at the definition of the Poisson bracket
\begin{equation}
\{f, g\} = \partial_q f \, \partial_p g - \partial_p f \, \partial_q g = \left|\begin{array}{ c c }
\partial_q f & \partial_p f \\
\partial_q g & \partial_p g \\
\end{array} \right|.
\end{equation}
For a single degree of freedom, the Poisson bracket coincides with the Jacobian determinant, where $f$ and $g$ are the two new variables. It essentially tells us how the volume changes if we change state variables from $[q, p]$ to $[f, g]$. Canonical transformations, then, are those that do not change the units of area. The Poisson bracket can be expressed\footnote{To see how our definitions and notation map to that used in differential geometry, let us define $\partial^a H = \omega^{ab} \partial_b H$. Note that $\partial^a H$ corresponds to the Hamiltonian vector field of $H$ usually noted $X_H$. The Poisson bracket is usually defined as $\omega(X_f, X_g)$. In our notation this becomes $\partial^a f \, \omega_{ab} \partial^b g = \omega^{ac} \partial_c f \omega_{ab} \omega^{bd} \partial_d g = \omega^{ac} \partial_c f \delta_a^d \partial_d g = \omega^{ac} \partial_c f \partial_a g$. One can see how the notation mimics the Einstein notation of general relativity and avoids the introduction of ad-hoc symbols.} as
\begin{equation}
\{f, g\} = - \partial_a f \omega^{ab} \partial_b g = \partial_b g \omega^{ba} \partial_a f
\end{equation}
where
\begin{equation}
\omega^{ab} = \left[\begin{array}{cc}
\omega^{qq} & \omega^{qp} \\
\omega^{pq} & \omega^{pp}
\end{array} \right]= \left[\begin{array}{cc}
0 & -1 \\
1 & 0
\end{array} \right]
\end{equation}
is the inverse of $\omega_{ab}$. The invariance of the Poisson brackets is equivalent to the invariance of the inverse of $\omega_{ab}$, which is equivalent to \ref{rp-cm-di-condConservedSymplectic}. Therefore
\begin{equation}\label{rp-cm-di-condConservedPoisson}
\tag{DI-POI}
\eqtext{The evolution leaves the Poisson brackets invariant}
\end{equation}
is yet another equivalent characterization. So, again, we see how $\omega_{ab}$ plays a fundamental geometrical role.
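The identification of the Poisson bracket with the Jacobian determinant of the change of variables $(q, p) \to (f, g)$ can also be checked numerically. In the sketch below (the functions are assumed examples), a rotation of the $q$-$p$ plane gives $\{f, g\} = 1$, while rescaling only $q$ gives $\{f, g\} = 2$ and is therefore not canonical.

```python
# Sketch: for one degree of freedom the Poisson bracket {f, g} is the
# Jacobian determinant of (q, p) -> (f, g); it is 1 exactly when the
# transformation preserves areas.

def poisson(f, g, q, p, h=1e-6):
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

# A rotation of the q-p plane is canonical: {f, g} = 1.
print(poisson(lambda q, p: (q + p) / 2**0.5,
              lambda q, p: (p - q) / 2**0.5, 0.3, 1.1))
# Rescaling only q doubles areas: {f, g} = 2, not canonical.
print(poisson(lambda q, p: 2 * q, lambda q, p: p, 0.3, 1.1))
```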
We can also rewrite the flow of the displacement field
\begin{equation}
\int \left( S^q dp - S^p dq \right) = \int S^a \omega_{ab} d\xi^b = \int S_b d\xi^b
\end{equation}
as the line integral of the rotated displacement field $S_a$. We can do that because in two dimensions the flow through a boundary is effectively a line integral along the boundary with the field rotated 90 degrees. This means that the following condition
\begin{equation}\label{rp-cm-di-condCurlRotatedDisplacement}
\tag{DI-CURL}
\eqtext{The rotated displacement field is curl free: $\partial_a S_b - \partial_b S_a = 0$}
\end{equation}
is equivalent to condition \ref{rp-cm-dr-condDivergenceDisplacement}.\footnote{Those familiar with relativistic electromagnetism will recognize the expression $\partial_a S_b - \partial_b S_a$ as the generalization of the curl. More properly, it is the exterior derivative applied to a one-form.} In fact, we can read equation \ref{rp-cm-hm-condGeneralizedEquations} as saying that the rotated displacement field is the gradient of the scalar potential $H$.
We can see that we have found plenty of alternative characterizations of Hamilton's equations \ref{rp-cm-hm-condEquations} (or \ref{rp-cm-hm-condGeneralizedEquations}). Conditions \ref{rp-cm-dr-condDivergenceDisplacement}, \ref{rp-cm-dr-condUnitaryJacobian}, \ref{rp-cm-dr-condConservedVolume} and \ref{rp-cm-dr-condConservedDensity} relate more directly to the displacement field $S^a$, while conditions \ref{rp-cm-di-condConservedSymplectic}, \ref{rp-cm-di-condConservedPoisson} and \ref{rp-cm-di-condCurlRotatedDisplacement} relate more directly to $\omega_{ab}$ and the rotated displacement field $S_a$. Nonetheless, they are all in terms of the mathematical description. While these are useful, the final goal of reverse physics is to find physical assumptions, not just equivalent mathematical definitions. So it is time to step back and try to understand what the math is really about.
\subsection{Physical characterizations}
Let us first reflect on what we just found out: the defining characteristic of Hamiltonian mechanics is not the transport of points, but the transport of areas and densities. If classical Hamiltonian mechanics were really about, and only about, point particles, there would be no reason for it to be characterized by \ref{rp-cm-dr-condDivergenceDisplacement}, \ref{rp-cm-dr-condConservedVolume} or \ref{rp-cm-dr-condConservedDensity}. In fact, there would be no reason for the equations of motion \ref{rp-cm-hm-condEquations} to be differentiable. Differentiable equations are exactly what is needed to define the Jacobian and the transport of areas, or of densities defined on those areas. Classical point particles, then, are more aptly conceived not as points, but as infinitesimal regions of phase space, as distributions so peaked that only the mean value is important.
This, in retrospect, matches how classical mechanics is used in practice: planets, cannonballs, pendulums, beads on a wire, none of the objects we study with classical mechanics are truly point-like. They can be considered point-like if their size is negligible compared to the scale of the problem. If the distance between two celestial bodies is comparable to the sum of their radii, the point particle approximation clearly fails. This is also consistent with fluid dynamics and continuum mechanics, where we are literally studying the motion of infinitesimal parts of a material. It is interesting to see echoes of these considerations present in the mathematics.\footnote{We will want to investigate this link in more detail later.}
If we look at physics more broadly, we realize that in statistical mechanics we already have a physical interpretation for volumes of regions in phase space: they represent the number of states. Hamiltonian mechanics, then, maps regions while preserving the number of states. This means that, for each initial state there is one and only one final state, which leads to the following condition:
\begin{equation}\label{rp-cm-dr-condDetRev}
\tag{DR-EV}
\eqtext{The evolution is deterministic and reversible.}
\end{equation}
Note that by reversible here we mean that given the final state we can reconstruct the initial state. Given that areas measure the number of states, \ref{rp-cm-dr-condDetRev} is equivalent to \ref{rp-cm-dr-condConservedVolume}, which means this is another characterization of Hamiltonian mechanics. We can also see a connection to \ref{rp-cm-dr-condConservedDensity}. If we assign a density to an initial state, and we claim that all and only the elements that start in that initial state will end in a particular final state, we will expect the density of the corresponding final state to match. That is, if the evolution is deterministic and reversible, it may shuffle around a distribution, but it will never be able to spread it or concentrate it.
This makes us understand, at a conceptual level, why a dissipative system, like a particle under linear drag, is not a Hamiltonian system. A dissipative system will have an attractor: a point or a region to which the system will tend given enough time. This means that, in time, the area around the attractor must shrink and the density must concentrate over the attractor, which is exactly what Hamiltonian systems cannot do. Therefore Hamiltonian systems cannot have attractors: they cannot be dissipative. By the same argument, they cannot have unstable points or regions from which the system always moves away.
What may be confusing is that the motion of a particle under linear drag may seem reversible, in the sense that we are able, given the final position and momentum, to reconstruct the initial values. Mathematically, it maps points one-to-one and would seem to satisfy \ref{rp-cm-dr-condDetRev}, even though it is not a Hamiltonian system. This is a perfect example of how focusing on just the points leads to the wrong physical intuition. Physically, we would say that a one meter range of position allows for more configurations than a one centimeter range, even though mathematically they have the same number of points. If we understand that states are infinitesimal areas of phase space, we can see that a dissipative system, though it maps the center points of infinitesimal areas one-to-one, does not map the full infinitesimal areas one-to-one. In this sense dissipative systems fail to be reversible.
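The drag example can be made quantitative. The sketch below (unit mass and a specific drag coefficient assumed) uses the closed-form flow of $\dot{q} = p$, $\dot{p} = -\gamma p$: the map is invertible point by point, yet its Jacobian determinant is $e^{-\gamma t} < 1$, so infinitesimal areas shrink and the evolution is not Hamiltonian in the sense discussed above.

```python
import math

# Sketch: pure linear drag, q' = p, p' = -gamma * p (unit mass assumed).
# The flow is invertible as a point map, but shrinks areas by exp(-gamma*t).

def flow(q0, p0, t, gamma=0.5):
    decay = math.exp(-gamma * t)
    q = q0 + p0 * (1 - decay) / gamma
    p = p0 * decay
    return q, p

def flow_jac_det(q0, p0, t, gamma=0.5, h=1e-6):
    qa, pa = flow(q0 + h, p0, t, gamma)
    qb, pb = flow(q0 - h, p0, t, gamma)
    qc, pc = flow(q0, p0 + h, t, gamma)
    qd, pd = flow(q0, p0 - h, t, gamma)
    dq_dq0, dp_dq0 = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    dq_dp0, dp_dp0 = (qc - qd) / (2 * h), (pc - pd) / (2 * h)
    return dq_dq0 * dp_dp0 - dq_dp0 * dp_dq0

t = 2.0
print(flow_jac_det(1.0, -0.7, t))  # numerically exp(-gamma * t) < 1
print(math.exp(-0.5 * t))          # the predicted shrink factor
```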
Let that sink in: we found that, if the system is deterministic and reversible, it admits a Hamiltonian, a notion of energy, and that energy is conserved over time. This may seem like a surprising and unexpected result. In retrospect, we can make an argument for it based on familiar physics considerations. If a system is deterministic and reversible it means that its evolution only depends on the state of the system itself. This means that it does not depend on the state of anything else. A system whose evolution does not depend on anything else is an isolated system. Therefore a deterministic and reversible system is isolated, and from thermodynamics we know that an isolated system conserves energy. It should not be surprising, then, that a deterministic and reversible system conserves energy. However, we found that not only does it conserve energy, it defines it. Therefore this link between mechanics and thermodynamics is actually deeper than we may think at first, and we should explore it further.
The idea that a dissipative system is not reversible sounds true on thermodynamic grounds. But thermodynamic reversibility is not the ability to reconstruct the initial state, but rather the existence of a process that can undo the change. Alternatively, a process is thermodynamically reversible if it conserves thermodynamic entropy, which is a more precise characterization.\footnote{The actual existence of a reverse process is not something that can always be guaranteed.} We should not, then, confuse the two notions of reversibility, but we can easily show their relationship. The fundamental postulate of statistical mechanics tells us that the thermodynamic entropy $S = k_B \log W$ is the logarithm of the count of states, which corresponds to volume in phase space. Since the logarithm is a bijective function, conservation of areas of phase space is equivalent to conservation of entropy. Therefore
\begin{equation}\label{rp-cm-dr-condThermoRev}
\tag{DR-THER}
\eqtext{The evolution is deterministic and thermodynamically reversible}
\end{equation}
is yet another characterization of Hamiltonian mechanics.
There is another type of entropy that is also fundamental in both statistical mechanics and information theory: the Gibbs/Shannon entropy $I[\rho(\xi^a)]=-\int \rho \log \rho \, d\xi^1 \cdots d\xi^n$ which is defined for each distribution $\rho(\xi^a)$. Recalling the transformation rules for both volumes \ref{rp-cm-volumeTransformation1d} and densities \ref{rp-cm-densityTransformation1d}, we have
\begin{equation}
\begin{aligned}
I[\rho(\xi^a)] &= - \int \rho(\xi^a) \log \rho(\xi^a) \, d\xi^1 \cdots d\xi^n \\
&= - \int \hat{\rho}(\hat{\xi}^b) \left| \partial_a \hat{\xi}^b \right| \log \left( \hat{\rho}(\hat{\xi}^b) \left| \partial_a \hat{\xi}^b \right| \right) \, d\xi^1 \cdots d\xi^n \\
&= - \int \hat{\rho}(\hat{\xi}^b) \log \left( \hat{\rho}(\hat{\xi}^b) \left| \partial_a \hat{\xi}^b \right| \right) \, d\hat{\xi}^1 \cdots d\hat{\xi}^n \\
&= - \int \hat{\rho}(\hat{\xi}^b) \log \hat{\rho}(\hat{\xi}^b) \, d\hat{\xi}^1 \cdots d\hat{\xi}^n - \int \hat{\rho}(\hat{\xi}^b) \log \left| \partial_a \hat{\xi}^b \right| \, d\hat{\xi}^1 \cdots d\hat{\xi}^n \\
&= I[\hat{\rho}(\hat{\xi}^b)] - \int \hat{\rho}(\hat{\xi}^b) \log \left| \partial_a \hat{\xi}^b \right| \, d\hat{\xi}^1 \cdots d\hat{\xi}^n.
\end{aligned}
\end{equation}
Information entropy, then, remains constant if and only if the logarithm of the Jacobian determinant is zero, which means the Jacobian determinant is one. Therefore
\begin{equation}\label{rp-cm-dr-condInformation}
\tag{DR-INFO}
\eqtext{The evolution conserves information entropy}
\end{equation}
is equivalent to \ref{rp-cm-dr-condUnitaryJacobian} and is yet another characterization of Hamiltonian mechanics.
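The Jacobian term in the entropy calculation can be illustrated with a closed-form example (the Gaussian distribution and the specific matrices are assumed for illustration). A Gaussian on phase space has entropy $I = \frac{1}{2}\log\left((2\pi e)^2 \det\Sigma\right)$; under a linear map $A$ the covariance becomes $A \Sigma A^\mathsf{T}$, so the entropy shifts by $\log\left|\det A\right|$, vanishing exactly when the map preserves areas.

```python
import math

# Sketch: entropy of a 2D Gaussian shifts by log|det A| under a linear map.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def gaussian_entropy(Sigma):
    return 0.5 * math.log((2 * math.pi * math.e) ** 2 * det2(Sigma))

def push(A, Sigma):
    """Covariance of the transformed variables: A Sigma A^T."""
    AS = [[sum(A[i][k] * Sigma[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(AS[i][k] * A[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

Sigma = [[2.0, 0.3], [0.3, 1.0]]
shear = [[1.0, 0.7], [0.0, 1.0]]    # det = 1: entropy unchanged
squeeze = [[2.0, 0.0], [0.0, 1.0]]  # det = 2: entropy grows by log 2

print(gaussian_entropy(push(shear, Sigma)) - gaussian_entropy(Sigma))
print(gaussian_entropy(push(squeeze, Sigma)) - gaussian_entropy(Sigma))
```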
The fact that determinism and reversibility is equivalent to conservation of information entropy should not be, in retrospect, surprising. Given a distribution, its information entropy quantifies the average amount of information needed to specify a particular element chosen according to that distribution. If the evolution is deterministic and reversible, giving the initial state is equivalent to giving the final state and therefore the information to describe one or the other must be the same. Determinism and reversibility, then, can be understood as the informational equivalence between past and future descriptions.
Lastly, given that entropy is often associated with uncertainty, it may be useful to understand how Hamiltonian evolution affects uncertainty. Given a multivariate distribution, the uncertainty is characterized by the covariance matrix
\begin{equation}
\operatorname{cov}(\xi^a, \xi^b) = \left[\begin{array}{ c c }
\sigma^2_q & \operatorname{cov}_{q, p} \\
\operatorname{cov}_{p, q} & \sigma^2_p \\
\end{array} \right].
\end{equation}
The determinant of the covariance matrix gives us a coordinate independent quantity to characterize the uncertainty. If the distribution is narrow enough, we can use the linearized transformation to see how the uncertainty evolves after an infinitesimal time step $\delta t$. We have
\begin{equation}
\left| \operatorname{cov}(\hat{\xi}^c, \hat{\xi}^d) \right| = \left| \partial_a \hat{\xi}^c \, \operatorname{cov}(\xi^a, \xi^b) \, \partial_b \hat{\xi}^d \right| = \left| \partial_a \hat{\xi}^c \right| \left| \operatorname{cov}(\xi^a, \xi^b) \right| \left| \partial_b \hat{\xi}^d \right|,
\end{equation}
which means the uncertainty remains unchanged if and only if the Jacobian is unitary. So
\begin{equation}\label{rp-cm-dr-condUncertainty}
\tag{DR-UNC}
\eqtext{The evolution conserves the uncertainty of peaked distributions}
\end{equation}
is equivalent to \ref{rp-cm-dr-condUnitaryJacobian} and is another characterization of Hamiltonian mechanics.
%\begin{figure}[h]
% \centering
% \begin{tikzpicture}
% \end{tikzpicture}
% \caption {TODO: F5 Evolution of covariance matrix.}
%\end{figure}
This connection gives us yet another insight on the nature of determinism and reversibility in physics. Given that all physically meaningful descriptions are finite precision, a system is deterministic and reversible in a physically meaningful sense if and only if the past/future descriptions can be reconstructed/predicted at the same level of precision. This gives us another perspective as to why areas and densities must be conserved.
\subsection{Assumption of determinism and reversibility}
We have found twelve equivalent characterizations that link Hamiltonian mechanics, vector calculus, differential geometry, statistical mechanics, thermodynamics, information theory and plain statistics. Though we have only discussed the case of a single degree of freedom, this gives us a much better idea of which systems Hamiltonian mechanics is supposed to describe: those that satisfy the following
\renewcommand{\theassump}{DR}
\begin{assump}[Determinism and Reversibility]\label{assum_detrev}
The system undergoes deterministic and reversible evolution. That is, specifying the state of the system at a particular time is equivalent to specifying the state at a future (determinism) or past (reversibility) time.
\end{assump}
\renewcommand{\theassump}{\Roman{assump}}
We can see how this concept is implemented mathematically: it is not simply a one-to-one map between points. Classical particles should be more properly thought of as infinitesimal regions of phase space. Conceptually, the count of states, the thermodynamic entropy and information entropy are all conserved, and are all equivalent characterizations of determinism and reversibility. In terms of physical measurement, past and future states are given at the same level of uncertainty. But the most important lesson is that the foundations of classical mechanics are not disconnected from the foundations of all other disciplines we encountered. A full understanding of classical mechanics means understanding those connections as well.
\section{Multiple degrees of freedom}
We have seen how \ref{assum_detrev} is a constitutive assumption for Hamiltonian mechanics, and in fact is equivalent to Hamiltonian mechanics for one degree of freedom. We now turn our attention to the general case, and we will find that \ref{assum_detrev}, by itself, is not enough to recover the equations. We will need an additional assumption, that of the independence of degrees of freedom.
First, let's take Hamilton's equations for multiple degrees of freedom
\begin{equation}\label{rp-cm-hm-condNEquations}
\tag{HM-ND}
\begin{aligned}
d_t q^i &= \partial_{p_i} H \\
d_t p_i &= - \partial_{q^i} H
\end{aligned}
\end{equation}
and re-express them in terms of generalized state variables. These will be noted as $\xi^a = [q^i, p_i]$ and will span a $2n$-dimensional space (i.e. manifold). The displacement field will be
\begin{equation}\label{rp-cm-displacementNd}
S^a = d_t \xi^a = [d_t q^i, d_t p_i]
\end{equation}
which again is the vector field that defines the evolution of the system in time. Hamilton's equations, then, can be expressed as
\begin{equation}
\begin{aligned}
S^{q^i} &= \partial_{p_i} H \\
S^{p_i} &= - \partial_{q^i} H.
\end{aligned}
\end{equation}
Similarly to the previous case, let's introduce the following matrix
\begin{equation}\label{rp-cm-sf-symplecticForm}
\tag{SF-ND}
\omega_{ab} = \left[\begin{array}{cc}
\omega_{q^i q^j} & \omega_{q^i p_j} \\
\omega_{p_i q^j} & \omega_{p_i p_j}
\end{array} \right]= \left[\begin{array}{cc}
0 & I_n \\
- I_n & 0
\end{array} \right] = \left[\begin{array}{cc}
0 & 1 \\
-1 & 0
\end{array} \right] \otimes I_n
\end{equation}
which performs a 90 degree rotation within each degree of freedom, switching the components between position and momentum. That is, if $v^a = [v^{q^i}, v^{p_i}]$, then $v_a = v^b \omega_{ba} = [-v^{p_i}, v^{q^i}]$.\footnote{For those versed in symplectic geometry, $v^a \omega_{ab}$ are the components of the one-form $\omega(v, \cdot)$. However, we are not going to call it a one-form as that assumes that the whole object is a map from a vector field to a scalar field, and we do not know whether that is the correct physical understanding. In other words, we want simply to understand what the quantities are doing without being tied, as much as possible, to a particular way to frame it. Full reverse engineering of differential geometry will be done in a later chapter, once the physics we need to describe is clear.} We can rewrite equation \ref{rp-cm-hm-condNEquations} as
\begin{equation}\label{rp-cm-HamiltonSymp}
\begin{aligned}
S_a = S^b \omega_{ba} &= \partial_a H
\end{aligned}
\end{equation}
which notationally is the same as \ref{rp-cm-hm-condGeneralizedEquations}. The insight that the displacement field is equal to the gradient of $H$ rotated 90 degrees still applies, except there are now multiple ways, in principle, to do that rotation. It is only the one defined by $\omega_{ab}$ that works.
Conditions \ref{rp-cm-dr-condDivergenceDisplacement}, \ref{rp-cm-dr-condUnitaryJacobian}, \ref{rp-cm-dr-condConservedVolume} and \ref{rp-cm-dr-condConservedDensity} are still satisfied and equivalent to each other. In fact, the divergence of the displacement field is zero
\begin{equation}
\partial_a S^a = \partial_{q^i} S^{q^i} + \partial_{p_i} S^{p_i} = \partial_{q^i} \partial_{p_i} H - \partial_{p_i} \partial_{q^i} H = 0
\end{equation}
and the Jacobian is unitary
\begin{equation}
\begin{aligned}
\hat{q}^i &= q^i + S^{q^i} \delta t \\
\hat{p}_i &= p_i + S^{p_i} \delta t \\
\partial_{b} \hat{\xi}^a &= \left[\begin{array}{ c c }
\delta_j^i + \partial_{q^j} S^{q^i} \delta t & \partial_{p_j} S^{q^i} \delta t \\
\partial_{q^j} S^{p_i} \delta t & \delta_i^j + \partial_{p_j} S^{p_i} \delta t \\
\end{array} \right] \\
\left|\partial_{b} \hat{\xi}^a\right| &= \det \left( \delta^a_b + \partial_b S^a \, \delta t \right) = 1 + \operatorname{tr}\left( \partial_b S^a \right) \delta t + O(\delta t^2) \\
&= 1 + \left(\partial_{q^i} S^{q^i} + \partial_{p_i} S^{p_i} \right) \delta t + O(\delta t^2)
\end{aligned}
\end{equation}
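As in the one-dimensional case, the vanishing divergence can be checked numerically for multiple degrees of freedom. The sketch below (the coupled-oscillator Hamiltonian is an assumed example) works in the four-dimensional space $\xi = (q^1, q^2, p_1, p_2)$.

```python
# Sketch: with two degrees of freedom the displacement field
# S^a = (dH/dp_1, dH/dp_2, -dH/dq^1, -dH/dq^2) is still divergenceless.

def H(q1, q2, p1, p2):
    # two coupled oscillators, chosen only as an illustration
    return (p1**2 + p2**2) / 2 + (q1**2 + q2**2) / 2 + 0.3 * q1 * q2

def partial(f, x, i, h=1e-6):
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(*xp) - f(*xm)) / (2 * h)

def S(x):
    """Displacement field over xi = (q1, q2, p1, p2)."""
    return [partial(H, x, 2), partial(H, x, 3),
            -partial(H, x, 0), -partial(H, x, 1)]

def divergence(x, h=1e-4):
    total = 0.0
    for a in range(4):
        xp, xm = list(x), list(x)
        xp[a] += h
        xm[a] -= h
        total += (S(xp)[a] - S(xm)[a]) / (2 * h)
    return total

print(abs(divergence([0.2, -0.5, 1.0, 0.3])))  # numerically zero
```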