Hamiltonian mechanics is conservation of information entropy
TL;DR – Hamiltonian systems are those that conserve information entropy during time evolution.
The idea is the following: suppose you have a distribution over position and momentum $\rho(x, p)$. Suppose you evolve it in time using Hamiltonian evolution and get a final distribution $\hat{\rho}(x, p)$. That is: you take each little area in phase space and move it according to Hamilton’s equations. How does the information entropy change? Answer: it doesn’t. Now suppose you look for all the possible time evolutions that conserve information entropy: what do you get? You get Hamiltonian mechanics.
That is: Hamiltonian systems are the ones and only the ones that preserve the information entropy of a distribution $\rho(x, p)$ over states. Identifying an element in the initial distribution requires exactly the same information as identifying an element in the final distribution. If you know the initial element you know the final element, and vice versa. The evolution is deterministic and reversible.
Let’s see how the math works.
1. Hamiltonian mechanics conserves information entropy
Suppose we have a distribution over position and momentum $\rho(x, p)$. How does it change during Hamiltonian evolution? Start with a little region $dx \,dp$. The fraction of the distribution contained in that region will be $\rho(x, p) dx \,dp$. After an infinitesimal time step the new coordinates, according to Hamiltonian evolution, will be:
\begin{equation}
\begin{aligned}
\hat{x} = x + \frac{dx}{dt} dt = x + \frac{\partial H}{\partial p} dt \\
\hat{p} = p + \frac{dp}{dt} dt = p - \frac{\partial H}{\partial x} dt \\
\end{aligned}
\label{newCoordinates}
\end{equation}
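As a quick numerical aside, here is a minimal Python sketch of this update step, assuming a unit-mass harmonic oscillator $H = p^2/2 + x^2/2$ purely for concreteness (the derivation above does not depend on any particular Hamiltonian):

```python
# Minimal sketch of the infinitesimal update above, for an assumed
# unit-mass harmonic oscillator H(x, p) = p**2 / 2 + x**2 / 2,
# so dH/dp = p and dH/dx = x.

def dH_dp(x, p):
    return p

def dH_dx(x, p):
    return x

def euler_step(x, p, dt):
    """x_hat = x + (dH/dp) dt,  p_hat = p - (dH/dx) dt."""
    x_hat = x + dH_dp(x, p) * dt
    p_hat = p - dH_dx(x, p) * dt
    return x_hat, p_hat

x_hat, p_hat = euler_step(1.0, 0.0, dt=1e-3)
print(x_hat, p_hat)  # ~ (1.0, -0.001): the state starts rotating in phase space
```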
In general, the volume and the density change according to the Jacobian determinant $|J|$ of the transformation:
\begin{equation}
\begin{aligned}
|J| &= \left| \begin{matrix}
\dfrac{\partial \hat{x}}{\partial x} & \dfrac{\partial \hat{x}}{\partial p} \\[2.2ex]
\dfrac{\partial \hat{p}}{\partial x} & \dfrac{\partial \hat{p}}{\partial p} \end{matrix} \right| = \frac{\partial \hat{x}}{\partial x} \frac{\partial \hat{p}}{\partial p} - \frac{\partial \hat{p}}{\partial x} \frac{\partial \hat{x}}{\partial p}\\
d\hat{x}\,d\hat{p} &= |J| dx \,dp \\
\hat{\rho}(\hat{x}, \hat{p}) &= \frac{\rho(x, p)}{|J|} \\
\end{aligned}
\label{newDistribution}
\end{equation}
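To see the change-of-variables rule in action, here is a small sketch with a deliberately non-Hamiltonian toy map of my own: it stretches $x$ by a factor of 2, so $|J| = 2$, the volume doubles and the density drops by the same factor:

```python
# Toy (non-Hamiltonian) map: stretch x by 2. Its Jacobian determinant is 2,
# so d x_hat d p_hat = 2 dx dp and rho_hat = rho / 2.
def transform(x, p):
    return 2.0 * x, p

def jacobian_det(f, x, p, eps=1e-6):
    """Finite-difference estimate of
    |J| = |dx_hat/dx * dp_hat/dp - dp_hat/dx * dx_hat/dp|."""
    xr, pr = f(x + eps, p)
    xl, pl = f(x - eps, p)
    xu, pu = f(x, p + eps)
    xd, pd = f(x, p - eps)
    dxh_dx, dph_dx = (xr - xl) / (2 * eps), (pr - pl) / (2 * eps)
    dxh_dp, dph_dp = (xu - xd) / (2 * eps), (pu - pd) / (2 * eps)
    return abs(dxh_dx * dph_dp - dph_dx * dxh_dp)

print(jacobian_det(transform, 0.3, -0.7))  # -> 2.0, so the density is halved
```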
BTW: if you are familiar with the Poisson bracket, you can see that the Jacobian is simply the Poisson bracket $\{\hat{x}, \hat{p}\}$ of the new position and momentum. But that’s for another time.
Because the evolution is Hamiltonian, Liouville’s theorem applies, so the Jacobian is equal to one. Let’s verify that by combining the Jacobian definition in \eqref{newDistribution} with the new coordinates defined in \eqref{newCoordinates}:
\begin{equation}
\begin{aligned}
|J| &= \left| \begin{matrix}
1 + \dfrac{\partial}{\partial x} \left( \dfrac{dx}{dt} \right) dt & \dfrac{\partial}{\partial p} \left( \dfrac{dx}{dt} \right) dt \\[2.2ex]
\dfrac{\partial}{\partial x} \left( \dfrac{dp}{dt} \right) dt & 1 + \dfrac{\partial}{\partial p} \left( \dfrac{dp}{dt} \right) dt \end{matrix} \right| \\
&= 1 + \left[ \dfrac{\partial}{\partial x} \left( \dfrac{dx}{dt} \right) + \dfrac{\partial}{\partial p} \left( \dfrac{dp}{dt} \right) \right] dt + O(dt^2)\\
&= 1 + \left[ \dfrac{\partial}{\partial x} \frac{\partial H}{\partial p} - \dfrac{\partial}{\partial p} \frac{\partial H}{\partial x} \right] dt + O(dt^2)\\
&= 1 + O(dt^2)\\
\end{aligned}
\label{Jacobian}
\end{equation}
This means the volume $d\hat{x}\,d\hat{p} = dx \,dp$ remains unchanged and the final density at the final state is equal to the initial density at the initial state: $\hat{\rho}(\hat{x}, \hat{p}) = \rho(x, p)$.
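As a sanity check, here is a rough numerical sketch of this statement, again assuming the toy oscillator $H = p^2/2 + x^2/2$: the Jacobian of the finite step deviates from one only at order $dt^2$, consistent with \eqref{Jacobian}.

```python
# Check |J| = 1 + O(dt**2) for the explicit update above, with the assumed
# toy Hamiltonian H(x, p) = p**2/2 + x**2/2 (dH/dp = p, dH/dx = x).
def step(x, p, dt):
    return x + p * dt, p - x * dt

x, p, eps = 0.5, -0.2, 1e-6
for dt in (1e-1, 1e-2, 1e-3):
    # finite-difference partial derivatives of the step
    dxh_dx = (step(x + eps, p, dt)[0] - step(x - eps, p, dt)[0]) / (2 * eps)
    dxh_dp = (step(x, p + eps, dt)[0] - step(x, p - eps, dt)[0]) / (2 * eps)
    dph_dx = (step(x + eps, p, dt)[1] - step(x - eps, p, dt)[1]) / (2 * eps)
    dph_dp = (step(x, p + eps, dt)[1] - step(x, p - eps, dt)[1]) / (2 * eps)
    J = dxh_dx * dph_dp - dph_dx * dxh_dp
    print(dt, J - 1.0)  # the deviation shrinks like dt**2
```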
How does the information entropy change? We calculate the entropy at the final state and then change variable using \eqref{newDistribution}:
\begin{equation}
\begin{aligned}
-\int \hat{\rho} \log \hat{\rho} \; d\hat{x}\,d\hat{p}&= -\int \frac{\rho}{|J|} \log \frac{\rho}{|J|} \; |J| dx \,dp \\
&= -\int \rho \log \frac{\rho}{|J|} \; dx \,dp \\
&= -\int \rho \log \rho \; dx \,dp + \int \rho \log |J| \; dx \,dp \\
\end{aligned}
\label{newEntropy}
\end{equation}
That is: the information entropy of the final distribution is equal to that of the initial distribution plus the expectation of the logarithm of the Jacobian.
But since the evolution is Hamiltonian, the Jacobian is equal to one and its logarithm is therefore zero. The information entropy of the final distribution is that of the initial distribution. The information entropy is conserved.
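A quick way to see this at work numerically, under two assumptions of my own (a Gaussian $\rho$ and the exact flow of a harmonic oscillator, which is a rotation of phase space): the covariance evolves as $C \to A C A^T$ with $\det A = 1$, so the Gaussian entropy $\tfrac{1}{2}\log\!\left((2\pi e)^2 \det C\right)$ stays put.

```python
import numpy as np

# Assumed setup: Gaussian rho(x, p) with covariance C, evolved by the exact
# harmonic-oscillator flow for time t, which is a rotation A of phase space
# with det(A) = 1.  The differential entropy of a 2D Gaussian depends only
# on det(C), and det(A C A^T) = det(C), so the entropy is conserved.
def gaussian_entropy(C):
    return 0.5 * np.log((2 * np.pi * np.e) ** 2 * np.linalg.det(C))

t = 0.7
A = np.array([[np.cos(t), np.sin(t)],
              [-np.sin(t), np.cos(t)]])      # exact flow map, det(A) = 1
C0 = np.array([[2.0, 0.3],
               [0.3, 0.5]])                  # initial covariance of rho(x, p)
C1 = A @ C0 @ A.T                            # evolved covariance

print(gaussian_entropy(C0), gaussian_entropy(C1))  # identical up to rounding
```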
2. Conservation of information entropy gives Hamiltonian mechanics
We can run the argument backwards. Suppose we have a distribution over position and momentum $\rho(x, p)$, and suppose the evolution is such that the information entropy is conserved. How can we characterize the evolution? Looking at \eqref{newEntropy}, for the information entropy to be conserved for every initial distribution, the expectation of the logarithm of the Jacobian must vanish for all $\rho$, which can only happen if the Jacobian is equal to one everywhere. Looking at \eqref{Jacobian}, to first order in $dt$ this happens only if:
\begin{equation}
\dfrac{\partial}{\partial x} \left( \dfrac{dx}{dt} \right) + \dfrac{\partial}{\partial p} \left( \dfrac{dp}{dt} \right) = 0
\label{divergenceFree}
\end{equation}
This means that the vector field formed by the displacement vector $\left[ \dfrac{dx}{dt}, \dfrac{dp}{dt}\right]$ is divergence free. In two dimensions a divergence-free field is (at least locally) the rotated gradient of a scalar function, so it admits a potential $H$ such that:
\begin{equation}
\begin{aligned}
\frac{dx}{dt} &= \frac{\partial H}{\partial p} \\
\frac{dp}{dt} &= - \frac{\partial H}{\partial x}
\end{aligned}
\label{Hamilton}
\end{equation}
So we have recovered Hamilton’s equations.
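As a worked example (with a field I picked for illustration, not one taken from the argument above), here is a symbolic sketch that starts from the divergence-free field $(dx/dt, dp/dt) = (p/m, -kx)$ and recovers a Hamiltonian by direct integration:

```python
import sympy as sp

# Assumed divergence-free field for this example: dx/dt = p/m, dp/dt = -k*x.
# We recover H with dx/dt = dH/dp and dp/dt = -dH/dx by integrating.
x, p, m, k = sp.symbols('x p m k', positive=True)
vx = p / m     # dx/dt
vp = -k * x    # dp/dt

# 1. The field is divergence free: d(vx)/dx + d(vp)/dp = 0.
assert sp.simplify(sp.diff(vx, x) + sp.diff(vp, p)) == 0

# 2. Integrate dH/dp = vx over p, then fix the x-dependent part so that
#    -dH/dx = vp.
H = sp.integrate(vx, p)                   # p**2 / (2*m), plus a function of x
H = H + sp.integrate(-vp - sp.diff(H, x), x)

print(sp.simplify(H))                           # p**2/(2*m) + k*x**2/2
print(sp.diff(H, p) - vx, -sp.diff(H, x) - vp)  # 0 0: Hamilton's equations hold
```

which is, of course, the familiar harmonic-oscillator Hamiltonian.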
3. Conclusion
As we have seen, Hamiltonian mechanics is equivalent to conservation of information entropy. This idea can be generalized to multiple independent degrees of freedom. The math is a bit more involved, but the single degree of freedom already gives the general idea.
This is why isolated systems conserve energy: if a system is isolated, its motion depends only on the information already contained in the system, which therefore cannot change. That gives us Hamiltonian motion. Which means: energy is conserved.