Sufficiently wise

Induction and Yellow Scientific Journalism

Gabriele Carcassi — Thu, 05 Apr 2018 01:19:59 +0000

TL;DR – Scientific investigation is not simply finding patterns and generalizing from them.

Some time ago I met someone who had high standards for what truth is. “Science cannot give any real truth!” “Science uses induction, which is not logically sound.” “All that scientists do is find patterns and generalize them to a rule.” I guess induction is in the eye of the generalizer.

When discussing scientific methodology, induction is often mentioned. Roughly speaking, deduction is when one argues from general to particular. For example: “all men are mortal”; “Socrates is a man”; therefore “Socrates is mortal”. Induction is when one argues from particular to general. For example: “Socrates is a man”; “Socrates is mortal”; therefore “all mortal men are named Socrates”.

Naturally, there is nothing that guarantees that you are making the correct generalization or that the generalization will always hold. And some people argue that this limits the soundness of scientific results. For example: a scientist drops a few lead balls; they always fall to the ground; he makes a rule that lead balls fall to the ground. But there is nothing that in principle guarantees us that tomorrow lead balls will not do something different: like spontaneously transform into chickens. That would be delicious.

But is this how actual science works? Are scientists really sitting around all day looking at things and making gross generalizations? Or do they do something completely different? Let’s explore this question with another inductively deductible thought experiment.

Yellow scientific journalism

You are a journalist. You spend your time searching for news and then rush out to be the first to publish the scoop. Some people may urge you to spend more time “checking your facts” or contend that random people on Twitter do not “constitute a reliable source”. Some people call you a yellow journalist, but that does not bother you. The sun is yellow. Gold is yellow. You consider yourself a golden journalist.

You decide to cover scientific topics. Gravity is a topic that carries some weight, so you decide to investigate. You randomly drop some items off your table, and you notice this interesting phenomenon. If you take two sheets of paper, crumple one, and let them fall at the same time, the crumpled one will reach the ground before the other. So you publish your first scoop: “Spherical objects fall faster than flat objects!”

The response from the “scientific establishment” (i.e. anybody with a basic understanding of physics) to your article is negative. Some suggest comparing the fall of a single sheet of paper with that of a book. You do this, and note that the book reaches the ground before the single sheet of paper. So you write your second scoop: “Heavier objects fall to the ground faster than lighter objects”.

Again the reaction from the establishment is negative. Some say that your articles are gross generalizations and are not well-researched. They say you didn’t take friction into account and that you should try making things fall in a vacuum. You open up your vacuum, but you fail to see how dropping things in it would make any difference.

Yet, the Twitter user GalGal presents an interesting argument. Suppose that you have two objects of different weight. Suppose the heavier falls faster than the lighter. Now suppose you tie them together with a string. Would the lighter drag the heavier, making it fall slower? Or would they count as a single object of heavier mass, making it go even faster? How thick does the string between the two objects have to be for them to become one object of heavier mass? The idea of heavier objects falling faster, then, presents paradoxes. If every object, ignoring friction, falls at the same rate then one does not have these paradoxes.

All this thinking made your head hurt, so you decide to move on to other branches of science. You publish articles such as “all regulations are bad for the economy”, “there are no biological differences between men and women” and so on. While you still get negative reactions from the scientific establishment, your articles are retweeted and shared a lot, providing the readership and ad revenue that you craved so much.

Ruling rules out

When Galileo Galilei dropped the spheres of different weights from the tower of Pisa, he was not really testing a hypothesis. He already knew the answer. They had to fall at the same rate because the opposite premise would lead to contradictions. He didn’t drop things at random and find a pattern. He arrived at the conclusion logically and devised an experiment that clearly conveyed it.

The point is that there is no prescribed recipe for how you get to the correct scientific answer. Indeed, often you simply have some interesting system to play with, you get familiar with it by doing different things, and you start characterizing it intuitively by creating some heuristic. That is, some rule of thumb. Like “when you drop bottles of beer to the ground, the people you took them from become unhappy.”

But this is not at all exclusive to science. It happens in math as well: you may have a new mathematical construct, you play around with it and you start characterizing it intuitively by forming some conjectures. You see: it’s not that Pythagoras started proving theorems at random until he proved that the square of the hypotenuse is equal to the sum of the squares of the other two sides. Egyptians and Babylonians were already familiar with Pythagorean triples, so they knew the theorem as a heuristic. Pythagoras showed that it could not be any other way, that it wasn’t a mere coincidence. Even in math, the proof typically comes after the heuristic: you can’t write a proof for a theorem you haven’t even stated. And if what you stated seems unreasonable, you are not going to try to find a proof for it.

The same goes in science: you look for patterns and form heuristics. But those are not laws: for that you have to show that nothing can violate the rule. That is: it’s not enough to see that energy is conserved in a few cases and generalize. You have to try all possible ways you can think of to create the opposite result. You can’t simply repeat the same thing over and over. Once you have exhausted all ideas, and everybody else has done so as well, and it has been shown that the idea is logically sound, and so on and so forth, then you can say: “energy conservation agrees with experimental evidence”. This is not at all induction.

This also means that not all topics can be studied scientifically equally well. It is somewhat difficult to show conclusively the impact of a particular diet on childhood development as parents do not appreciate when you take away their children and put them in a tightly controlled environment. Therefore disciplines like physics or chemistry are always going to have an easier time providing solid conclusions than macroeconomics or cosmology.

Induction does not play a fundamental role in science. You can use it as a stepping stone in the process. But there is no general prescriptive recipe to reach the correct laws. You use whatever works. Just like in real life.

Quantum uncertainty is not caused by measurements

Gabriele Carcassi — Sun, 25 Mar 2018 02:03:30 +0000

TL;DR – The uncertainty relationship in quantum mechanics exists before and independently of measurements.

Some books still perpetuate the idea that the uncertainty introduced by quantum mechanics is caused by measurements. Either as an observer effect (i.e. the measurement causes the uncertainty) or as an observed relationship (i.e. our measurements are limited by that relationship). Unfortunately, neither of these is correct: the uncertainty exists before and independently of any measurement.

The problem is that in the early days physicists used these incomplete ideas as a stepping stone (e.g. Heisenberg’s microscope). This is not uncommon in physics: Galileo himself argued that the tides of the oceans are a clear indication that the earth is spinning. The difference is that nobody repeats Galileo’s argument (though we keep the conclusion), whereas some early thought experiments in quantum mechanics are still offered as insightful even though they are actually misleading.

So let’s look at the math and see what it tells us.

1. Measurement collapse and the uncertainty

Suppose we have an incoming electron, whose state is described by a wave function $\psi(x)$. Suppose we measure its position. According to the theory, the wave function will collapse into one of the eigenstates of position. That is, we have one wave function coming in and a distribution of wave functions coming out. The probability associated with the eigenstate at a particular $x$ will be given by $\rho(x) = |\psi(x)|^2$. This distribution will have an associated standard deviation $\sigma_x = \sqrt{ \int x^2 \rho dx - (\int x \rho dx)^2}$.

Note that, since the outgoing states are all position eigenstates, the distribution in momentum for all of them is uniform. So the measurement has indeed introduced uncertainty. But all the uncertainty is introduced to momentum, the conjugate of position. The standard deviation $\sigma_x$ is the same because the distribution in position $|\psi(x)|^2$ is the same distribution we had in the wave function before the collapse. That is: the distribution over the observable we measure remains the same. If it didn’t, in fact, we couldn’t measure anything.

It’s not that we can write wave functions for which the uncertainty principle is not satisfied, we collapse them, and then the uncertainty principle is found. Not at all. It’s that we can’t even write wave functions that violate the uncertainty principle. Let me repeat it once more for effect: you can’t write a $\psi(x)$ for which the Fourier transform has an uncertainty on momentum that violates the uncertainty principle. It doesn’t matter whether you are going to collapse it or not.
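
The claim can be probed numerically. The following is only a sketch under stated assumptions: natural units with $\hbar = 1$, a finite grid standing in for the real line, and a handful of trial wave functions picked for illustration (none of them comes from the argument above). The helper `spreads` is a hypothetical name; it computes $\sigma_x$ from $|\psi(x)|^2$ and $\sigma_p$ from the discrete Fourier transform, with no measurement or collapse anywhere in the calculation.

```python
import numpy as np

HBAR = 1.0  # natural units (an assumption of this sketch)

def spreads(psi, x):
    """Return (sigma_x, sigma_p) for a wave function sampled on a uniform grid x."""
    dx = x[1] - x[0]
    psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx)  # normalize on the grid
    rho_x = np.abs(psi) ** 2
    mean_x = np.sum(x * rho_x) * dx
    sigma_x = np.sqrt(np.sum(x**2 * rho_x) * dx - mean_x**2)

    # Momentum distribution from the discrete Fourier transform of psi.
    p = 2 * np.pi * HBAR * np.fft.fftfreq(len(x), d=dx)
    rho_p = np.abs(np.fft.fft(psi)) ** 2
    rho_p /= np.sum(rho_p)  # discrete probabilities over the momentum grid
    mean_p = np.sum(p * rho_p)
    sigma_p = np.sqrt(np.sum(p**2 * rho_p) - mean_p**2)
    return sigma_x, sigma_p

x = np.linspace(-40.0, 40.0, 4096)
examples = {
    "wide Gaussian": np.exp(-x**2 / 4.0),
    "narrow Gaussian": np.exp(-x**2 / (4.0 * 0.2**2)),
    "two bumps": np.exp(-(x - 3.0) ** 2 / 2.0) + np.exp(-(x + 3.0) ** 2 / 2.0),
    "exponential cusp": np.exp(-np.abs(x)),
}
for name, psi in examples.items():
    sx, sp = spreads(psi, x)
    print(f"{name:>16}: sigma_x * sigma_p = {sx * sp:.4f} (hbar/2 = {HBAR / 2})")
```

Squeezing the Gaussian in position only widens it in momentum, so the product stays near $\hbar/2$, while the other shapes sit above it. Whatever function you type in, the bound is already there in the pair $\psi(x)$ and its transform.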

2. Conclusion

It’s not that the uncertainties of our measurements are bound by the uncertainty principle. It’s that we can’t even prepare states that violate the uncertainty principle. Those states are excluded a priori: they do not exist in the theory.

Therefore we can’t consider the uncertainty principle an observer effect: it’s an intrinsic property of a quantum system.

Determinism and quantum mechanics

Gabriele Carcassi — Thu, 15 Mar 2018 15:17:26 +0000

TL;DR – The Schrodinger equation can be seen as the deterministic and reversible limit of the projection (i.e. collapse) associated with a measurement.

Quantum mechanics comes with two ways to go from an initial to a final state. The first is the Schrodinger equation that represents time evolution and the second is the projection (or collapse) that represents what happens during measurements. These are usually presented as two separate entities. In fact, much of the work surrounding various “interpretations” is to reconcile these two types of evolution.

What we want to show here is that there is a very natural way to reconcile them. As we saw in a previous post, a quantum state always has a set of quantities that are well defined (i.e. it is always an eigenstate of some Hermitian operator). Therefore we can regard the Schrodinger equation as a special case of the projection where at each instant a measurement is made. Let’s see how this works.

1. The projection postulate

Let’s first review the projection postulate which supposedly describes what happens during measurements. The idea is that you start with a single well defined initial state $|\psi\rangle$. You pick an observable $A$, which is defined by a set of eigenstates $|\psi_{a_i}\rangle$ and the corresponding eigenvalues $a_i$. After the measurement you end up with what is called a mixed state: a statistical distribution. More precisely, a distribution over the eigenstates, each having probability $|\langle \psi_{a_i} | \psi \rangle | ^2$. So, for example, if we start with spin up $|s^+_z\rangle$ and we measure the horizontal direction, we end up with 50% $|s^+_x\rangle$ and 50% $|s^-_x\rangle$.

Note that we don’t always end up with a mixed state: if the initial state is already an eigenstate of the observable, nothing changes. That is, if we start with spin up $|s^+_z\rangle$ and we measure the vertical direction, we end up with 100% $|s^+_z\rangle$ and 0% $|s^-_z\rangle$. Which is the same as what we started with. So the projection is not always non-deterministic. In fact we can go continuously from a process that is deterministic, where the direction of measurement is the same as the prepared direction, to one that is completely non-deterministic, where the direction of measurement is perpendicular to the prepared direction.
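
To see the continuous passage from deterministic to completely non-deterministic, here is a small sketch for spin 1/2. It assumes the usual two-component representation with the measurement direction tilted by an angle `theta` from $z$ in the $x$-$z$ plane; the function names are made up for this illustration.

```python
import numpy as np

def spin_state(theta):
    """Spin-1/2 'up' state along a direction at angle theta from z, in the x-z plane."""
    return np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

def outcome_probabilities(psi, theta):
    """Probabilities of the + and - outcomes when measuring spin along angle theta."""
    plus = spin_state(theta)
    minus = spin_state(theta + np.pi)  # opposite direction, orthogonal to plus
    return abs(np.vdot(plus, psi)) ** 2, abs(np.vdot(minus, psi)) ** 2

up_z = spin_state(0.0)
for theta in (0.0, np.pi / 8, np.pi / 4, np.pi / 2):
    p_plus, p_minus = outcome_probabilities(up_z, theta)
    print(f"theta = {theta:.3f}: P(+) = {p_plus:.3f}, P(-) = {p_minus:.3f}")
```

At `theta = 0` the outcome is certain, at `theta = pi/2` it is a coin toss, and in between the probability of the + outcome is $\cos^2(\theta/2)$.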

2. Deterministic projections

So the idea is the following: can we perform a measurement such that we are not exactly measuring the same observable (e.g. the same direction of spin) but something so close that the projection is still deterministic (e.g. the direction of spin in an infinitesimally close direction)? That is, we want the final state $|\psi_{t+dt}\rangle$ to be very close to the initial state $|\psi_t\rangle$ so that $|\langle \psi_{t+dt}|\psi_t\rangle|^2=|\langle \psi_t|\psi_{t+dt}\rangle|^2=1$. We can rewrite $|\psi_{t+dt}\rangle = (1 + dt \frac{\partial}{\partial t}) |\psi_t\rangle $. We have:

\begin{align*}
|\langle \psi_t|\psi_{t+dt}\rangle|^2 &= 1 = |\langle \psi_t | (1 + dt \partial_t) |\psi_t\rangle|^2 \\
&= \langle \psi_t | (1 + dt \partial_t)^\dagger |\psi_t\rangle \langle \psi_t | (1 + dt \partial_t) |\psi_t\rangle \\
&= (\langle \psi_t | \psi_t\rangle + \langle \psi_t | dt \partial_t^\dagger |\psi_t\rangle ) (\langle \psi_t | \psi_t\rangle + \langle \psi_t | dt \partial_t |\psi_t\rangle ) \\
&= 1 + dt (\langle \psi_t | \partial_t^\dagger |\psi_t\rangle + \langle \psi_t | \partial_t |\psi_t\rangle) + dt^2 (|\langle \psi_t | \partial_t |\psi_t\rangle|^2) \\
&\approx 1 + dt \langle \psi_t | \partial_t^\dagger + \partial_t |\psi_t\rangle \\
\partial_t &= - \partial_t^\dagger
\end{align*}

We have that deterministic time evolution must be unitary and that its generator $\partial_t$ must be anti-Hermitian. This means we can find a corresponding Hermitian operator $H$ such that $\partial_t = \frac{H}{\imath \hbar}$. This gives us the Schrodinger equation.
\begin{align*}
\frac{\partial}{\partial t} | \psi_t \rangle = \frac{H}{\imath \hbar} |\psi_t\rangle
\end{align*}
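
As a rough numerical check of the derivation, the sketch below uses a random $4 \times 4$ Hermitian matrix as a stand-in for $H$ (an arbitrary choice, with $\hbar = 1$) and compares the generator $H/(\imath\hbar)$ with a generic matrix that is not anti-Hermitian. The names `first_order_term` and `overlap_deviation` are hypothetical helpers, not standard functions.

```python
import numpy as np

HBAR = 1.0  # natural units, an assumption of this sketch
rng = np.random.default_rng(0)

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

M = random_matrix(4)       # generic operator: neither Hermitian nor anti-Hermitian
H = (M + M.conj().T) / 2   # Hermitian matrix standing in for a Hamiltonian
G = H / (1j * HBAR)        # candidate generator of time evolution

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

def first_order_term(gen, state):
    """Coefficient of dt in |<state|(1 + dt gen)|state>|^2."""
    return np.vdot(state, (gen.conj().T + gen) @ state).real

def overlap_deviation(gen, state, dt):
    """Distance of |<psi_t|psi_{t+dt}>|^2 from 1 for psi_{t+dt} = (1 + dt gen) psi_t."""
    stepped = state + dt * (gen @ state)
    return abs(abs(np.vdot(state, stepped)) ** 2 - 1.0)

print("G is anti-Hermitian:", np.allclose(G.conj().T, -G))
print("first-order term with G:", first_order_term(G, psi))
print("first-order term with M:", first_order_term(M, psi))
for dt in (1e-2, 1e-3):
    print(f"dt = {dt:g}: deviation with G = {overlap_deviation(G, psi, dt):.3e}")
```

With the anti-Hermitian generator the term linear in $dt$ vanishes and the deviation shrinks like $dt^2$, which is the sense in which the projection becomes deterministic in the limit. With the generic matrix the linear term survives.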

To put this in more concrete terms, consider spin precession in a magnetic field. We can say that the magnetic field continuously measures the spin of the particle, and this is what causes the motion. The direction where this measurement happens is a function of the particle itself, which makes sense because the motion is deterministic: it only depends on the state of the particle.

3. Conclusion

We have seen that the Schrodinger equation is a continuous process that at every moment is measuring something that is very close to what was measured before. This way there is no difference between measurement and time evolution: it is the same type of process. It also means that there is nothing special about a measurement: whether we cause it or not, it is just the same process that happens all the time.

What this also means is that the deterministic and reversible process is the particular case, not the general one. But this is true anyway: it is only when we are able to isolate a system enough from the environment that we can assume that its evolution is independent from it.

Mathematically, we are associating continuous unitary evolution with deterministic and reversible processes while non-deterministic processes have no such restriction. This is very interesting because it is the same conclusion that we reach from a completely different premise in our work on the assumptions of physics.

Many Worlds and Television Interference

Gabriele Carcassi — Mon, 05 Mar 2018 21:06:29 +0000

TL;DR – There may be infinitely many parallel worlds but there is nothing to be detected.

A friend of mine is really into science fiction. “Did you know that there are infinite parallel worlds?” “And that there are infinitely many you and me, each a little bit different?” “And one day we will be able to travel to these different worlds, and taste all the different variations of pizza?” I wish I were in a universe in which he didn’t go on and on.

The whole idea of parallel worlds is something very very appealing. I mean, to science fiction writers. It allows you to explore a whole range of what-if scenarios without messing up your main story line. Like what if your heroes actually were cat-people or something. This creates a lot of drama, or so I am told.

Unfortunately, one may get the impression that these ideas are somehow physically relevant, because it’s SCIENCE (fiction). It doesn’t help that the best physical theories we have are a bunch of mathematical equations which are then “interpreted” after the fact… like the Oracle of Delphi. Moreover, it seems that some have trouble distinguishing the case in which they forgot whether they parked their car in the front or in the back of the building, from the case in which they sawed it in half and distributed it between the two places. Probability theory can be very confusing.

But let’s suppose there are other universes out there, all parallel parked. Are they relevant for the purpose of physics? Can we detect them? Can we communicate with people, or cat-people, in those realities? Let’s explore the question with a tangibly abstract thought experiment.

Television interference

When you were little you watched way too much sci-fi television. It inspired you to go into science and now you are disappointed because science is a lot less exciting than it looked on the telly. You really wanted a sonic screwdriver.

So, you decide to do something new and exciting. You heard about the SETI program: the Search for Extra Terrestrial Intelligence. This is where people look for signals coming to earth that may have been generated by alien civilizations. Basically, they try to watch alien television. Apparently, they believe we don’t have enough television programs here on earth… Anyway, what you want is the SPWI: the Search for Parallel-World Intelligence. You want to detect television programs coming from alternate realities.

So you start creating a receiver for this purpose. The first problem you find is that it’s picking up all the terrestrial television signals. Normally you’d be interested in those… you watch a lot of TV. But these are not the shows you are looking for. So you start devising a way to remove those unwanted signals.

After a few years of work, you are successful in removing the noise and you finally have a new signal! One that was not created on earth! You are very excited, but you realize that it’s not from a parallel universe… it seems to be coming from Kepler-442b, an exoplanet in orbit of Kepler-442. What a bummer! Plus their transmission is full of commercials. So you start devising ways to remove all these other unwanted signals.

While you are doing that, though, you realize that you may have a problem. To be able to detect the TV signal from one of the parallel universes you need to separate it from all the others. But there are infinite worlds… So what you would actually detect is the sum of the television signals from an infinite number of worlds. These signals would all be statistically uncorrelated. And the sum of infinitely many uncorrelated signals gives a Gaussian distribution. So what you would be detecting is random noise.

Moreover, the strength of the total signal cannot be more powerful than the earth TV signal. I mean: we can watch earth TV fine. So, the total alternate dimension signal must be finite. But there are infinitely many universes, so the signal coming from each one must be undetectably small, basically zero strength.

You decide to abandon science and become a science fiction writer, where your imagination needn’t be constrained by technical problems.

Noise and signal

A person much better at gardening than me once said: a weed is a plant in the wrong place. In the same vein, noise is a signal you are not interested in. Nature does not really distinguish between plants and weeds, between noise and signal. We do. We get both and then, depending on what we want to do, we differentiate and filter.

This is something that engineers and experimental scientists are well aware of, but most other people don’t seem to realize. When you build a giant antenna to detect gravitational waves, you actually build a giant antenna that detects a lot of things. It may detect the train/subway/planes passing by. It may detect the vibration caused by thermal fluctuations. It may detect the imperfection in your laser source. That is, it detects everything including gravitational waves. So you have to be clever and find ways to remove the noise.

But no matter how much you remove, you can’t remove everything. What remains is your background. And you always have a background. It’s like pulling weeds: there will always be some around but you stop when you can finally see the plant you like. So you stop when the background is lower than the signal you are interested in. But if there are infinitely many similar signals all at the same strength, and you want to pick out one among them, your background is always going to be higher than your signal. You can’t distinguish it from the noise.

So, the next time you see a TV show that tells you that there are infinitely many worlds, in which all possible choices are executed, and you see the hero traveling to them, realize this: he wouldn’t be able to physically tell one apart from the other, so he would not be able to travel to a particular one. Ergo, each infinitesimal part of his body would go to one of the infinitely many alternate realities. Now, that’s a show I’d like to see!

All quantum states are equally defined

Gabriele Carcassi — Sun, 25 Feb 2018 13:49:52 +0000

TL;DR – No quantum state is more uncertain than the other. All states can be identified by a set of perfectly prepared quantities.

When studying quantum mechanics, you may get the (wrong) impression that some states are better defined than others. Some are eigenstates while others are just superpositions. Or that Gaussian packets are more determined than other states, since they satisfy the uncertainty principle with an equality. Well, that’s just not the case.

The confusion stems from the idea that quantum states are like classical statistical distributions, in which you have some well defined elements upon which you assign probabilities. Quantum states are not like this at all. We are going to see that any state is always the eigenstate of some Hermitian operator, and therefore it always has a quantity that is perfectly prepared. It also has a symmetry under the transformation generated by that operator, so there is always another quantity for which no value is more likely than the other.

1. Prepared quantities and unprepared conjugates

Suppose that you have a quantum state for which the position is perfectly prepared. This corresponds to the eigenstate $|x\rangle$ of the operator $X$. The transformation generated by $X$ is $1+\frac{Xdp}{\imath\hbar}$ which corresponds to increasing momentum by $dp$. Now, since $|x\rangle$ is an eigenstate of $X$, it is also an eigenstate of the transformation generated by $X$. If we imagine the distribution of $|x\rangle$ over momentum, then, this has to be a distribution that does not change if we increase momentum. But the only distribution that is symmetric under that change is the one that has the same value for all possible values of momentum: the eigenstate $|x\rangle$ is uniformly distributed in momentum. This can be generalized to any quantity. For example, an eigenstate of spin in the $z$ direction will be symmetric along the angle on the $(x,y)$ plane. This will also hold for any function of position and momentum and indeed for any Hermitian operator.
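
A finite-dimensional analogue can make this tangible. The sketch below assumes a periodic grid of `N` points, with the discrete Fourier transform playing the role of the change to the momentum basis; it is a toy model, not the continuum statement itself, and the index 10 and the boost size 3 are arbitrary.

```python
import numpy as np

N = 64  # number of grid points (arbitrary choice for this sketch)

def momentum_distribution(psi):
    """Probability of each discrete momentum value for a state given on the position grid."""
    amplitudes = np.fft.fft(psi) / np.sqrt(len(psi))  # unitary normalization of the DFT
    return np.abs(amplitudes) ** 2

position_eigenstate = np.zeros(N, dtype=complex)
position_eigenstate[10] = 1.0  # all the weight on a single grid point

probs = momentum_distribution(position_eigenstate)
print("min and max momentum probability:", probs.min(), probs.max())

# A discrete momentum boost multiplies the state by a position-dependent phase.
boost = np.exp(2j * np.pi * 3 * np.arange(N) / N)
boosted = boost * position_eigenstate
print("boost only changes the phase:", np.allclose(boosted, boost[10] * position_eigenstate))
```

Every momentum value comes out with probability `1/N`, and the boost leaves the state unchanged up to an overall phase, which is the discrete version of being an eigenstate of the transformation generated by $X$.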

So, for a particular space of quantum states, we can imagine that, instead of specifying a wavefunction, we can give a set of operators and eigenvalues. Given that information, we can identify the corresponding eigenstate. This would not give the value for all quantities since the conjugate quantities will be left completely unspecified. The question is: can all states be specified in this way? Which is equivalent to asking: are all states eigenstates of some Hermitian operator?

Suppose we have a state $|\psi\rangle$. We can construct the operator $O=|\psi\rangle a \langle \psi |$ where $a$ is a real number. $O$ is Hermitian and $|\psi\rangle$ is the eigenstate corresponding to the eigenvalue $a$. Yes, the operator is trivial, but we can indeed construct it. Which shows that for any state there exist Hermitian operators that allow that state as an eigenstate. In fact, there are infinitely many.
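
The construction is easy to check with matrices. In this sketch the state is a random normalized vector in a 5-dimensional space and $a = 2.7$; both are arbitrary illustrative choices, and `projector_operator` is a made-up helper name.

```python
import numpy as np

rng = np.random.default_rng(1)

def projector_operator(psi, a):
    """Build O = |psi> a <psi| for a state psi and a real number a."""
    psi = psi / np.linalg.norm(psi)
    return a * np.outer(psi, psi.conj())

psi = rng.normal(size=5) + 1j * rng.normal(size=5)
psi /= np.linalg.norm(psi)
a = 2.7
O = projector_operator(psi, a)

print("O is Hermitian:", np.allclose(O, O.conj().T))
print("psi is an eigenstate with eigenvalue a:", np.allclose(O @ psi, a * psi))
print("eigenvalues of O:", np.round(np.linalg.eigvalsh(O), 6))
```

All the other eigenvalues are zero, which is what makes the operator trivial. Any other real $a$ gives another operator with the same eigenstate, hence the infinitely many.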

Overall, a state always has some well specified quantities and it also has some unspecified ones. For example, suppose we have an eigenstate for the operator $X + a P$, a linear combination of position and momentum. This means that the quantity $x + ap$ is perfectly prepared while the conjugate quantity, $\frac{1}{2a}(x-ap)$, is a uniform distribution. We can verify the two quantities are conjugates by calculating the commutator.

\begin{equation}
\begin{aligned}
\frac{\left[\frac{1}{2a}(X-aP), X+aP \right]}{\imath \hbar} &= \frac{1}{2a \imath \hbar} \left[X-aP, X+aP\right] \\
&= \frac{1}{2a \imath \hbar} \left( \left[X, X\right] + \left[-aP, X\right] + \left[X, aP\right] + \left[-aP, aP\right] \right) \\
&= \frac{1}{2a \imath \hbar} \left( 0 + a \imath \hbar + a \imath \hbar + 0 \right) \\
&= 1
\end{aligned}
\end{equation}

Now consider the two distributions for that state over position and momentum. Since all the values of $\frac{1}{2a}(x-ap)$ are equally likely, then all values of $x$ and $p$ are also equally likely. That is: the distribution is uniform for both quantities. The variance for both $x$ and $p$ is infinite. If we only knew that, we would think that state to be infinitely less defined than an eigenstate of position. What happens is that for that state the uniform distributions in $x$ and $p$ are strongly correlated, so much so that the distribution over $x+ap$ admits a single value. Therefore this state is no less defined than an eigenstate of $x$: what changes is which quantities are well defined and which aren’t.

2. Conclusion

What is significantly different in quantum mechanics is that all states are distributions but no distribution is more defined than the other. Even if a state has an extremely large spread in a pair of conjugate variables, it will have a quantity somewhere that will be perfectly defined. There is no basis that is mathematically better than the other: all states can be part of a basis and all states are superpositions in some other basis. Whenever we are superposing two states, we are always adding some other correlation such that the new state is as well defined as the other two.

This actually makes a lot of sense conceptually. It’s telling us that each quantum state is a distribution that cannot be decomposed into smaller independent ones. Each quantum system is an irreducible unit. This is the main difference from classical mechanics, where each distribution over phase space can be divided into smaller pieces that can be studied independently.

What are commutators?

Gabriele Carcassi — Thu, 15 Feb 2018 14:35:08 +0000

TL;DR – Commutators, like Poisson brackets, tell us how a quantity changes under a transformation generated by another.

Commutators play a fundamental role in quantum mechanics. Mathematically, they tell us how multiplication between operators behaves but this does not provide any physical insight. What is more interesting physically is that they are intimately related to infinitesimal transformations, which gives a better understanding of why they are related to Poisson brackets and why commutation relationships between position, momentum and spin must be what they are.

1. Conjugate quantities

In classical mechanics, conjugate quantities are pairs of variables that form an independent degree of freedom. If you integrate over a pair, you get a quantity that no longer depends on the units and coordinates used for those quantities.

In quantum mechanics, things work a bit differently mathematically. There you have operators and parameters. If a quantity is an operator (like spin) it means you can write the state as a wave-function over that quantity. If a quantity is a parameter (like angle) it means you can make an infinitesimal transformation along it. The conjugate relationship is between operators and parameters: for each operator $A$ you have one parameter $\alpha$ and we say that $A$ generates the infinitesimal transformation $(1 + \frac{A d\alpha}{\imath \hbar})$, which is a unitary transformation.

For example, the spin component $S_z$ is the operator while the angle $\theta_{xy}$ is the parameter, and $S_z$ generates the rotation $(1 + \frac{S_z d\theta_{xy}}{\imath \hbar})$. The confusing part is that some quantities, like position and momentum, can be both operators and parameters. So we have the $P$ operator and the $x$ parameter and $P$ generates the translation $(1 + \frac{P dx}{\imath \hbar})$, but also the $X$ operator and the $p$ parameter and $X$ generates the change in momentum $(1 + \frac{X dp}{\imath \hbar})$. Note how the upper case denotes a Hermitian operator while the lower case denotes a real number.

2. Commutators

Now we ask: suppose we have two operators $A$ and $B$. Suppose you perform the infinitesimal transformation generated by the second, over the parameter $\beta$. How does the operator $A$ change?

The infinitesimal transformation corresponds to the unitary operator $U=(1 + \frac{B d\beta}{\imath \hbar})$. An operator under a unitary transformation changes as $\hat{A} = U^\dagger A U$. Therefore we have:

\begin{equation}
\begin{aligned}
\hat{A}&=U^\dagger A U = (1 - \frac{B d\beta}{\imath \hbar}) A (1 + \frac{B d\beta}{\imath \hbar}) \\
&=A + A \frac{B d\beta}{\imath \hbar} - \frac{B d\beta}{\imath \hbar} A - \frac{B d\beta}{\imath \hbar} A \frac{B d\beta}{\imath \hbar} \\
&=A + \frac{(AB - BA)}{\imath \hbar}d\beta + \frac{BAB}{\hbar^2}d\beta^2
\end{aligned}
\end{equation}

The first order for the change of $A$ is:

\begin{equation}
\begin{aligned}
\frac{dA}{d\beta}&=\frac{(AB - BA)}{\imath \hbar} \\
&= \frac{[A,B]}{\imath \hbar}
\end{aligned}
\end{equation}

Therefore the commutator between $A$ and $B$ tells us how $A$ changes under a transformation generated by $B$. In a previous post we saw that the Poisson brackets in classical Hamiltonian mechanics had the same role. That is why we can do the formal substitution: they describe the same physical relationship in a different mathematical framework.
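
The first-order formula can be checked with small matrices. The sketch below takes two random $3 \times 3$ Hermitian matrices as stand-ins for $A$ and $B$ (with $\hbar = 1$, both arbitrary choices) and compares the finite difference of $U^\dagger A U$ with $[A,B]/(\imath\hbar)$. The function names are hypothetical.

```python
import numpy as np

HBAR = 1.0  # natural units, an assumption of this sketch
rng = np.random.default_rng(2)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def commutator(a, b):
    return a @ b - b @ a

def transformed(a, b, dbeta):
    """A_hat = U^dagger A U with U = 1 + B dbeta / (i hbar)."""
    u = np.eye(len(b)) + b * dbeta / (1j * HBAR)
    return u.conj().T @ a @ u

A = random_hermitian(3)
B = random_hermitian(3)

dbeta = 1e-6
finite_difference = (transformed(A, B, dbeta) - A) / dbeta
predicted = commutator(A, B) / (1j * HBAR)

print("largest discrepancy:", np.max(np.abs(finite_difference - predicted)))
print("dA/dbeta is Hermitian:", np.allclose(predicted, predicted.conj().T))
```

The leftover discrepancy is of order `dbeta`, coming from the $\frac{BAB}{\hbar^2}d\beta^2$ term dropped at first order. Note also that dividing the commutator by $\imath\hbar$ gives back a Hermitian operator, so the rate of change of an observable is again an observable.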

We can also understand why commutators between notable quantities must be what they are. For example, the infinitesimal transformation generated by the operator momentum $P_j$ is the translation over the position parameter $x_j$. So how will the operator position $X_i$ change under the transformation generated by $P_j$?

\begin{equation}
\begin{aligned}
\frac{[X_i,P_j]}{\imath \hbar} &= \frac{dX_i}{dx_j}=I \delta_{ij}
\end{aligned}
\end{equation}

If $X_i$ and $P_j$ are along the same direction, then the position operator will change by the same amount as the position parameter: they are physically the same thing. If they are along different directions there is no change.

Now, consider the operators $S_x$, $S_y$ and $S_z$ for the spin components. How will they change under the transformation generated by $S_z$, which is the rotation over $\theta_{xy}$?

\begin{equation}
\begin{aligned}
\frac{[S_x,S_z]}{\imath \hbar} &= \frac{dS_x}{d\theta_{xy}} = S_y \\
\frac{[S_y,S_z]}{\imath \hbar} &= \frac{dS_y}{d\theta_{xy}} = - S_x \\
\frac{[S_z,S_z]}{\imath \hbar} &= \frac{dS_z}{d\theta_{xy}} = 0
\end{aligned}
\end{equation}

The last one is the easiest: the $z$ component of spin does not change under a rotation in the $xy$ plane. For the other two, just remember that the $x$ component is rotated toward the $y$ direction and the $y$ component is rotated toward the $-x$ direction.

These are indeed the commutation relationships one has in quantum mechanics. Given the physical meaning of the operators and their transformations, the commutation relationship can’t be anything else.

3. Conclusion

We have seen that commutators have a well defined physical/geometrical meaning, which is the same one Poisson brackets have in classical Hamiltonian mechanics. Personally, I not only find this insightful: it also helps me remember the correct sign for the commutation relationships.

First-Person Experience and the Consciousness Transfer Device

Gabriele Carcassi — Mon, 05 Feb 2018 13:05:34 +0000

TL;DR – Subjective experiences, like consciousness, are outside of what can be experimentally tested.

I was at a bar, tired from a long day, when a person started pestering me with questions: “Doesn’t the brain process electrical signals, which is ultimately explained by quantum physics?” “Doesn’t quantum physics say that things happen only because we observe them and therefore we make cats die just by staring at them?” “And therefore consciousness creates reality?” Sometimes I’d rather be unconscious.

Since the birth of quantum physics, people have started to see potential connections between its mysterious features and other open problems. Is quantum uncertainty what gives us free will? Are quantum effects responsible for the holistic nature of consciousness? It goes like this: here is something (quantum mechanics) that I don’t really understand… and here is something else (consciousness) that I don’t really understand. Since they have something in common, they are probably the same thing.

Now, clearly, once we understand how the brain works, we’ll solve the issue since consciousness happens there. Wait, how do we know that consciousness happens in the brain? Ah, yes, if we poke it we can alter how information is processed and everything. It’s like saying: if I steal your keys, you can’t use your car, so your keys are what powers the car… No, wait, it’s not like that at all. Also, what exactly is consciousness?

By consciousness we typically mean the idea that a person is aware that he is a person, being aware that he is a person being aware of it. It is the fact that we have a first-person experience. What does that mean in experimental terms? Because, and that’s the problem, if we want to study consciousness scientifically we need, at the very least, a way to tell two different consciousnesses apart experimentally. Can we actually do it? Let’s explore the question with a consciously unconscionable thought experiment.

The consciousness transfer device

You are a conscious scientist. In the sense that you study consciousness. You somehow prefer that description over consciousness scientist. You seem to like having to explain the pun every time to new people. You think it’s very clever and makes for a good story… You seem so proud of it nobody has the heart to tell you…

Anyway, you think you have identified what consciousness is. Is it a physical entity? Like the “soulon”, a hypothetical fundamental particle? Or is it merely a configuration of matter? Like the way that neurons connect with each other to form memories? Or is it something external? And the brain is simply like a radio receiver, that takes commands from a different plane of reality? Which one is it? Come on, tell me!

But you are a conscientious conscious scientist, and before blabbering your results, you want to gather experimental evidence. You have created a machine that is able to swap consciousness. And you have two “volunteers” that you pay $10 an hour.

You place a metal colander-shaped hat on each of the subjects. The idea is that, when you flip the switch, the consciousness, and only the consciousness, will be switched. That is, the consciousness of the first person will stop seeing the world from the eyes of the first body and will start experiencing it through the eyes of the second body. And vice-versa.

The volunteers wait in trepidation expecting the machine to start making sparks, and loud noises, and that the air will get wobbly or something. “It’s done” you say. The volunteers are very disappointed when they realize that machines make sparks, loud noises and wobbly things only in the movies.

You are disappointed as well because the volunteers don’t seem to see any difference. You try the machine several times, but they report no change. You pay them their $10 each and dismantle the machine.

Days later, though, you think about it again. The machine only transferred the consciousness from one body to the other. So, even if consciousness A went in body B, it would be able to access only the memories that are available in body B. It would remember always having been in that body: it would not be able to tell that it moved because it left the memories… in the other body.

When it went back to body A, it would also find all the memories that that body accumulated in its absence. It wouldn’t be able to tell that those memories were recorded when consciousness B was in that body.

So, maybe the machine did actually work. Maybe their consciousnesses are switched. But there is no way to tell.

First person experience

The problem of first-person experience is that it is, by definition, subjective. Not in the sense that it is biased. In the sense that only one person can experience it. I can’t experience your first-person experience and you can’t experience mine. In fact, maybe you don’t have one. Maybe you are just a purely mechanical device that is pretending to be conscious. A philosophical zombie! Who knows! Well, you do… and that’s kind of the point: only you can know if you are really conscious.

To be able to study anything scientifically, we need to be able to identify it experimentally and objectively. We can tell an electron from a positron because they have opposite charge and move differently within an electric field. We can tell peanut butter from Nutella because one is delicious and just the smell of the other makes some people die of allergic reactions. But how do you tell consciousnesses apart?

We can note the color of a person’s hair, we can give them puzzles to solve, we can tell them secrets, we can give them brain scans, … But all of these are just testing physical appearance, cognitive abilities, memory and the electromagnetic response. None of these will actually identify a consciousness.

To be sure that a machine can identify consciousness, we’d need to confirm that it detects a change of consciousness when everything else remained the same. But then we have the problem of the thought experiment: the consciousness would have access to all the other features and it may not be able to tell it wasn’t in that body before. So we can’t independently confirm that our machine actually can detect a change of consciousness.

So, while I am sure we are going to be able to investigate more of the brain in the coming decades, and that we will be able to analyze conscious states much better, and we are going to be able to produce drugs that turn on and off those processes very selectively, I am also sure that we are not going to be able to answer the actual question: why is your consciousness in that particular body? And how can you move into a better one?

Not everything we measure is an eigenvalue of a linear operator

Gabriele Carcassi — Thu, 25 Jan 2018 21:55:18 +0000

TL;DR – Statistical quantities (e.g. averages) and angles (e.g. direction of spin) are measurable quantities but are not associated with linear operators, eigenkets and eigenvalues.

When studying quantum mechanics you learn about observables, how to each you associate a Hermitian operator, how the value is only defined on the eigenstates of that operator and how, in general, you will have a distribution over eigenvalues. Position, momentum, energy and spin are all examples. Since one mostly deals with those, one usually gets the impression that that’s all there is. This may not be stated per se in your textbook, yet you may have that impression.

But is that true? Is everything that we measure an eigenvalue of some Hermitian operator? Here I’ll present two quantities that don’t follow the pattern: temperature and direction of spin.

1. Temperature

Suppose we have a box filled with gas in thermodynamic equilibrium. Its temperature $T$ will be proportional to the variance of the velocity of all the elementary constituents of the gas. If we call $|\psi>$ the state of all the particles, $P_i$ the momentum operator for the $i$-th particle and $m_i$ its mass, we’ll have something like:
\begin{equation}
T=\alpha<\psi|\sum_{i=1}^{n} \frac{P_i^2}{m_i}|\psi>
\end{equation}
where $\alpha$ is an appropriate constant.

Now, what’s important here is not the detail of the expression: the important aspect is that temperature is an average. Any state that represents a snapshot of a system (i.e. a pure state) will always have one and only one value of temperature. And that value needs to match what our thermometer says. That is: we are not going to have a statistical distribution over possible values of temperatures.

You may think: but the quantity $\sum_{i=1}^{n} \frac{P_i^2}{m_i}$ is an operator. And indeed it is. But that’s where the connection with temperature ends. Think about the eigenstates of that operator: they correspond to those states for which the magnitude of the momentum is perfectly prepared for all particles. Those are not the only states for which we have a well defined value of temperature. And measuring the temperature does not mean measuring the magnitude of the momentum for each particle. So $\sum_{i=1}^{n} \frac{P_i^2}{m_i}$ is an operator but is not the temperature operator: it’s an operator whose expectation corresponds to the temperature.
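To make the distinction concrete, here is a toy sketch in Python. The three eigenvalues, the value of $\alpha$ and the two states are all made up for illustration. It compares an eigenstate of the operator with a superposition that is not an eigenstate: both have one and only one value of temperature, and in this example it is even the same value.

```python
import math

# Hypothetical eigenvalues of sum_i P_i^2 / m_i, in arbitrary units.
eigenvalues = [1.0, 4.0, 9.0]
ALPHA = 1.0  # stand-in for the constant alpha

def temperature(amplitudes):
    """alpha * <psi|K|psi> for a state written in the eigenbasis of K."""
    norm = sum(abs(c) ** 2 for c in amplitudes)
    weighted = sum(abs(c) ** 2 * k for c, k in zip(amplitudes, eigenvalues))
    return ALPHA * weighted / norm

# An eigenstate of K: the outcome of measuring K is certain.
eigenstate = [0.0, 1.0, 0.0]
# Not an eigenstate: measuring K would give 1 or 9, never 4.
superposition = [math.sqrt(5 / 8), 0.0, math.sqrt(3 / 8) * 1j]

print(temperature(eigenstate))
print(temperature(superposition))  # same temperature, up to rounding
```

The superposition has weights $5/8$ and $3/8$ on the eigenvalues $1$ and $9$, so its expectation is $(5 + 27)/8 = 4$, which is not one of the values a measurement of that operator could return for that state.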

2. Spin direction

Now suppose we have a spin 1/2 system. You may be familiar with $S_x$, $S_y$ and $S_z$ which are the operators for the spin components along the respective directions. As you know, spin represents angular momentum so their conjugate quantity is the angle along the plane perpendicular to their direction. Which leads to the question: where is the spin angle operator? Well, there isn’t one.

All spin 1/2 states can be defined by a unique direction in space. We can write:
\begin{equation}
|\psi>=\cos(\theta/2)|z^+> + \sin(\theta/2)e^{\imath \phi}|z^->
\end{equation}
where $\theta$ and $\phi$ are the polar and azimuthal angle respectively. Note that we just need two states, $|z^+>$ and $|z^->$, to form a basis and those correspond to the two possible values of spin measured along the $z$ direction. An angle, instead, takes a continuum of possible values and therefore we would need an infinite number of eigenkets to form an angle operator (as it is for position and momentum). Since the space is two dimensional, every basis consists of exactly two kets: no angle operator.

But here is the thing: we can nonetheless measure angles. Suppose we have a source of electrons such that their spin always comes out aligned in the same direction. With a Stern–Gerlach type experiment, we can measure the fraction $0\leq f_z \leq 1$ that comes out with $z^+$. We have $f_z = |<z^+|\psi>|^2 = \cos^2(\theta/2)$. So $\theta = 2 \arccos \sqrt{f_z}$ is definitely something we can measure. Similarly, we can find an expression for $\phi$.
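Here is a rough simulation of that measurement in Python. It is only a sketch: each electron is assumed to come out $z^+$ independently with probability $\cos^2(\theta/2)$, and the function name, seed and sample size are made up for the example.

```python
import math
import random

def estimate_theta(theta_true, n_electrons, seed=0):
    """Estimate the polar angle from the fraction of z+ outcomes."""
    rng = random.Random(seed)
    p_up = math.cos(theta_true / 2) ** 2
    # Each electron gives only one of two outcomes: z+ or z-.
    n_up = sum(1 for _ in range(n_electrons) if rng.random() < p_up)
    f_z = n_up / n_electrons
    return 2 * math.acos(math.sqrt(f_z))

theta_true = math.pi / 3
theta_est = estimate_theta(theta_true, 100_000)
print(theta_true, theta_est)  # the estimate approaches theta_true as the sample grows
```

No single electron ever returns an angle: the angle only appears as a property of the whole set of outcomes.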

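Again you may think: all we did was measure the expectation of $|z^+><z^+|$. And indeed we did. But, as with temperature, that’s where the connection ends: that operator only has the eigenvalues $0$ and $1$, while the angle $\theta$ is computed from its expectation.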

3. Conclusion

While it is often useful to think of a quantum state as a distribution over the eigenvalues of some observable, this is not the only way we should think about it. Not all measurable quantities work like that. In particular, note that many macroscopic quantities are averages over a large number of particles and therefore one should always be very careful when extrapolating ideas from the quantum world.

What are complex numbers?

Gabriele Carcassi — Mon, 15 Jan 2018 18:47:57 +0000

TL;DR – Complex numbers should really be called rotation numbers. Whenever they are used in physics and engineering, some type of 2D vector and related rotation is lurking and begging to be understood.

In the past posts we went through a lot of insights in classical mechanics and I want to start doing the same for quantum mechanics. But first we have to talk about complex numbers. Complex numbers are prominent in quantum mechanics and, before we understand why that is, we have to understand what complex numbers are in general.

Unfortunately, complex and imaginary numbers have names that are not very meaningful: they are not complex and they are not imaginary. The other problem is that they are typically introduced in the way they were discovered historically (i.e. solving polynomial equations) and we drag along this idea of $\imath = \sqrt{-1}$. None of this is intuitive or useful to understand why complex numbers are important in physics and engineering.

Briefly: complex numbers describe 2D vectors (i.e. a pair of real numbers) and rotations acting on them. You may sort of know this already. You may not realize, though: that’s the only thing they do. Let’s see how this works.

1. Sets of numbers and bad naming

While we are on the subject, note that complex numbers are not the only horribly named set of numbers in the English language. So, let’s go through all of them.

First, we have the natural numbers $\mathbb{N} = \{0, 1, 2, …\}$. Now, there is nothing “natural” about them or “unnatural” about the other numbers. These are the numbers we use to count: counting numbers would be a better name for $\mathbb{N}$.

Next we have the integer numbers $\mathbb{Z} = \{…, -2, -1, 0, 1, 2, …\}$. The name comes from the same Latin root as “entire”. Whole numbers would be a better English name for $\mathbb{Z}$ (and sometimes they are called just that).

We can then take a pair of integers, a numerator and a denominator, and construct the rational numbers $\mathbb{Q}$. They are not “rational” in the sense of “sensible”. They are rational because they express a ratio, a proportion between two integer values. Same goes for irrationals: they are not crazy numbers, they simply cannot be expressed as a ratio. Fractional numbers would probably be a less confusing name for $\mathbb{Q}$.

And finally we have the real numbers $\mathbb{R}$, which can be defined as equivalence classes of Cauchy sequences of rational numbers. There is nothing “real” about them. What they represent are values that in general can only be approximated by rationals. Physically, we use them when we assume we can measure values with arbitrary precision, so arbitrary precision numbers would be a more precise name for $\mathbb{R}$.

Now, even though the names are not actually that good, you most likely have a good intuitive understanding of these types of numbers because they were introduced with real world examples. Hopefully.

2. Rethinking complex numbers

So how should we define and think about the complex numbers $\mathbb{C}$? As you know, a complex number $c$ is simply a pair of real numbers (i.e. arbitrary precision numbers). Instead of writing it as $a+\imath b$, which makes the two elements look different, we can write it as $c = a\mathbf{1} + b \mathbf{i}$. Instead of calling $a$ the “real part”, let’s call it the “horizontal part”. Instead of calling $b$ the “imaginary part”, let’s call it the “vertical part”.

If we have two complex numbers $c_1$ and $c_2$, we can define the sum as $c_1+c_2 = (a_1+a_2) \mathbf{1} + (b_1+b_2)\mathbf{i}$: the horizontal part is the sum of the horizontal parts and the vertical part is the sum of the vertical parts. Note how the new name makes it seem beyond obvious: it’s just a standard vector with two components. Note that $\mathbf{1}$ and $\mathbf{i}$ are simply the unit vectors along the horizontal and vertical directions. And the norm is given by $|c| = \sqrt{a^2 + b^2}$.

Now suppose we want to characterize the linear operations. Let’s first look at how they work in $\mathbb{R}$. Each number $a \in \mathbb{R}$ can be associated with the operation of rescaling by that number $f_a(x) = a x$. Each complex number $c \in \mathbb{C}$, instead, will be associated with a rescaling by the norm $|c|$ and a rotation by the angle between $c$ and the horizontal direction $\mathbf{1}$.

Since the operation is linear, we really just need to define the multiplication between the basis vectors. We have $\mathbf{1} \cdot \mathbf{1} = \mathbf{1}$ and $\mathbf{1} \cdot \mathbf{i} = \mathbf{i}$: since the angle between $\mathbf{1}$ and $\mathbf{1}$ is zero, the rotation leaves the direction unchanged. We also have $\mathbf{i} \cdot \mathbf{1} = \mathbf{i}$: since $\mathbf{i}$ is in the vertical direction, the corresponding rotation will move the horizontal by 90 degrees to the vertical direction. Lastly, $\mathbf{i} \cdot \mathbf{i} = - \mathbf{1}$: we are rotating the vertical direction by another 90 degrees so we obtain the opposite of the horizontal direction.

Note that we have obtained the usual relationship $\imath^2=-1$, but this time we know what we are talking about and there is nothing imaginary about it. It simply means that if we rotate 90 degrees twice, we get the direction opposite to the one we started with. It should also make it clear why we can express $c=\rho (\cos \theta \mathbf{1} + \sin \theta \mathbf{i})$. Mathematically, there shouldn’t really be anything new about it but I hope that intuitively it clicked that there is nothing mysterious or arbitrary about any of this.
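If you want to see this at work, Python’s built-in `complex` type is enough. The sketch below (the helper `rotate_and_rescale` and the sample values are made up for the example) checks that multiplying by $c$ gives the same result as rotating a 2D vector by the angle of $c$ and rescaling it by $|c|$.

```python
import cmath
import math

ONE = complex(1, 0)  # unit vector along the horizontal direction
I = complex(0, 1)    # unit vector along the vertical direction

print(I * ONE)  # horizontal rotated by 90 degrees: the vertical direction
print(I * I)    # vertical rotated by 90 degrees: opposite of the horizontal

def rotate_and_rescale(x, y, rho, theta):
    """Rotate the 2D vector (x, y) by theta, then rescale it by rho."""
    return (rho * (x * math.cos(theta) - y * math.sin(theta)),
            rho * (x * math.sin(theta) + y * math.cos(theta)))

c = complex(1.0, 1.0)        # norm sqrt(2), at 45 degrees from the horizontal
v = complex(2.0, 0.5)        # the vector being acted on
rho, theta = cmath.polar(c)  # norm and angle of c

product = c * v
by_hand = rotate_and_rescale(v.real, v.imag, rho, theta)
print(product, by_hand)      # the same two components, up to rounding
```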

Complex numbers are just 2D vectors and rotations: rotation numbers would be a much less confusing name for $\mathbb{C}$.

3. An example: linear systems

Let’s make complex numbers a little bit more concrete by quickly showing some highlights from linear control theory and signal processing. Suppose you have a system that takes an input signal $I(t)$ and converts it to an output signal $O(t)$. Suppose the system is linear: if we sum two inputs and feed them into the system we simply get the sum of the independent outputs. If that’s the case, one can show that the relationship between inputs and outputs is of the form:
\begin{equation}
O(t) = \mu_0 I(t) + \mu_1 \frac{d}{dt} I(t) + \mu_2 \frac{d^2}{dt^2} I(t) + … + \mu_n \frac{d^n}{dt^n} I(t) + …
\end{equation}

Now, suppose that the input is of the form $I(\omega, t) = a_\omega \cos \omega t$. The derivative will change cosines to sines and vice-versa, so the result will still be an oscillation of the form $O(\omega, t) = \hat{a}_\omega \cos \omega t + \hat{b}_\omega \sin \omega t$. We can set $\mathbf{1} = \cos \omega t$ and $\mathbf{i} = \sin \omega t$. So we have $I(\omega, t) = a_\omega \mathbf{1} = c_\omega$ and $O(\omega, t) = \hat{a}_\omega \mathbf{1} + \hat{b}_\omega \mathbf{i} = \hat{c}_\omega$. That is: we can express each frequency component with a two dimensional vector and the effect of the linear system is simply a rescaling and a rotation in that space. The linear system may change the magnitude (i.e. the strength of the signal) and the angle (i.e. the phase of the signal) but nothing else (i.e. not the shape of the signal).
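A small sketch in Python may help here. The coefficients $\mu_0$, $\mu_1$, $\mu_2$, the amplitude and the frequency are invented for the example, and the series is cut at the second derivative. With the choice $\mathbf{1} = \cos \omega t$ and $\mathbf{i} = \sin \omega t$, the time derivative sends $\mathbf{1}$ to $-\omega \mathbf{i}$ and $\mathbf{i}$ to $\omega \mathbf{1}$, so in this convention it acts as multiplication by $-\imath \omega$. The whole system then becomes multiplication by a single complex number.

```python
import math

def transfer_factor(mu, omega):
    """Complex number by which the system multiplies the component at omega.

    With 1 = cos(omega*t) and i = sin(omega*t), d/dt acts as
    multiplication by -1j*omega.
    """
    return sum(m * (-1j * omega) ** k for k, m in enumerate(mu))

def output_direct(mu, a, omega, t):
    """mu[0]*I + mu[1]*I' + mu[2]*I'' for I(t) = a*cos(omega*t)."""
    derivatives = [a * math.cos(omega * t),
                   -a * omega * math.sin(omega * t),
                   -a * omega ** 2 * math.cos(omega * t)]
    return sum(m * d for m, d in zip(mu, derivatives))

mu = [1.0, 0.5, 0.1]   # hypothetical coefficients mu_0, mu_1, mu_2
a, omega = 2.0, 3.0    # hypothetical input amplitude and frequency

c_in = complex(a, 0.0)                     # input: a along the horizontal
c_out = transfer_factor(mu, omega) * c_in  # output: rescaled and rotated

t = 0.7
from_vector = (c_out.real * math.cos(omega * t)
               + c_out.imag * math.sin(omega * t))
print(from_vector, output_direct(mu, a, omega, t))  # the two should agree
```

The norm of `transfer_factor(mu, omega)` is the change in strength of the signal and its angle is the change in phase.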

Now, the point here is not to give a full introduction to control theory (which would take a lot more space). Hopefully, this should give you a sense of when and why rotation numbers, sorry, complex numbers are useful in physics and engineering: they will always be useful when we are studying the effect of linear systems, and all systems are, at first approximation, linear.

4. Conclusion

Complex numbers are not that complex to understand: they just describe 2D vectors and rotations. They are unfortunately named and the notation typically used is meaningless for the purpose of physics and engineering. It is important to understand that these are historical accidents, and they have no bearing on the actual concepts and their applications.

When we are solving complex polynomial equations, then, we are not looking for quantities: we are looking for rotations that satisfy a particular property. For example, $x^2=-1$ is asking: what is the rotation that, if applied twice, is equivalent to a reflection? It’s the 90 degree rotation, of course, which is denoted by $\imath$.

Complex numbers are fundamental when studying linear systems. Quantum mechanics, as we’ll see, studies linear systems and that is why complex numbers are prevalent. You just need to understand what physical quantity the 2D vector represents and what is rotating.

Statistics and prayer

Gabriele Carcassi — Fri, 05 Jan 2018 19:40:50 +0000

TL;DR – Statistics can only study correlations between populations and are blind to the effect on individuals.

Since I moved to the United States of America, I have met people with very strong views about religion and science. One would say: “the bible says the world was created 10,000 years ago.” Another would say: “science says God does not exist because of evolution.” Both would say: “science and religion are in direct conflict.” I must have missed the memo.

This whole idea that science and religion are at war is perplexing to me. I mean, yes there have been incidents like when Galileo wrote a book and, inadvertently, made the then Pope look like a fool. Or when Mendel, an Augustinian friar, inappropriately used church resources for scientific activities. But what troubles me is something else: how can science or religion point a weapon at each other? They don’t have opposable thumbs.

I have even found that in the USA there are people who have conducted “scientific” studies on the power of prayer. For example, they took two sets of patients. They instructed one set of relatives to pray for their loved ones and the other not to. And then tried to measure whether one set got healthier than the other. And if they found no correlation… then they would conclude prayer is ineffective? And these are published in journals… of science? While this sounds absurd to me, somehow there are some people who think this is a fruitful endeavor. Just like discussing politics on social media.

Now, I am not going to discuss the theological implications here, which are numerous. I am not even going to discuss whether this constitutes science or not. I’ll overlook the most glaring problems and concentrate on a small detail: can you even conclude anything? That is, if you find no correlation can you say that prayer didn’t do anything? Let’s find out in a seriously ridiculous thought experiment.

The power of prayer

You are God. This is indeed a step up if you read the other thought experiments, where you have been a scientist, a mad statistician or a mathropologist. If you haven’t read the other thought experiments, you should! They are great stuff! And I am not above shameless self-promotion.

Anyway, you have just learned that there are these “scientists” that want to test you. They have divided a set of patients into two groups: one group will be prayed for and the scientists will check whether it survives longer. Now, you distinctly remember saying: (Deut 6:16) “You shall not put the Lord your God to the test.” So, this kind of bugs you. And therefore, in your infinite spite, you decide to intervene and fudge the data.

Now, you care about preserving free will. You don’t want to influence the result of the experiment at all, or people would be forced to conclude that you are answering the prayers and then people would pray only for utilitarian reasons. But you still want to answer the prayers. Can you do that? Can you affect the lives of those who pray without affecting the result of the experiment? Of course, you are God!

The idea is this. You know what would happen if you took no action. For the control group, the one that does not pray, you simply make it happen. For the group that prays, instead, you keep the overall outcomes the same. So, for example, if 12% died after a month, then 12% will die after a month. But, here’s the key: you can switch the outcomes between two people.

For example, without your intervention, Bob would live only 1 week while Rob would live 6 more months. The problem is that Rob would pass those 6 months in agony, while Bob would be only in mild pain and would use that time to make peace with his loved ones. So you switch. Another would survive the illness, but would die anyway in a car crash 4 days later. So you switch him with another guy that would have survived only 4 days. And so on.

You see: because the study is checking only the survival rate distribution, it would see no difference whatsoever in the data. It’s the same data. Just reshuffled. And here is the kicker: the bigger the study, the more people are in the group that prays, the more chances you have to find a better match for the switch. So, if the scientist sees no correlation, he will be more sure that prayer has no effect, while really the effect gets better and better!

You see all that you have done and think that it is good.

Outcomes over individuals or populations

The issue here is that statistical distributions and statistical measurements can only tell us correlations between statistical properties, which are properties of the population as a whole. They cannot tell us anything about the individuals. Two populations may be statistically identical but may be physically different and a statistical measurement cannot tell them apart.
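The reshuffling can be sketched in a few lines of Python. The survival times, group size and seed are made up, and a random shuffle stands in for the divinely chosen swaps. The distribution the study looks at is unchanged, while most individuals end up with a different outcome.

```python
import random
from collections import Counter

rng = random.Random(42)

# Hypothetical survival times, in days, for the group that is prayed for.
without_intervention = [rng.choice([4, 7, 30, 180, 365]) for _ in range(1000)]

# The intervention only swaps outcomes between individuals.
with_intervention = without_intervention[:]
rng.shuffle(with_intervention)

# What the study sees: how many people survived for how long.
same_distribution = Counter(without_intervention) == Counter(with_intervention)

# What the study cannot see: who got which outcome.
changed = sum(1 for before, after
              in zip(without_intervention, with_intervention)
              if before != after)

print(same_distribution)  # True: a shuffle never changes the counts
print(changed)            # number of individuals whose outcome is different
```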

Suppose you have a fair coin, you position it on your hand tails up and you toss it a number of times. You will get 50% chance of getting either heads or tails. Now, suppose you position it on your hand heads up and you toss it a number of times. You will still get 50% chance of getting either heads or tails. We get the same statistical distribution. So it seems switching initial orientation of the coin had no effect. Yet, it may have had the effect that it changed all the tails to heads and all the heads to tails. But this experiment will never be able to tell us. Only one that is much more controlled, that studies the tossing mechanism in a non-statistical way could tell.
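The coin example can be sketched the same way. The toss mechanism below is a deliberately crude assumption: the coin makes a random number of half turns, and lands the way it started when that number is even. Replaying the very same tosses from the other starting orientation flips every single outcome, yet the frequencies stay around 50%.

```python
import random

def toss(starts_heads_up, half_turns):
    """Deterministic toss: returns True for heads, False for tails.

    The coin lands the way it started after an even number of half turns.
    """
    return starts_heads_up == (half_turns % 2 == 0)

rng = random.Random(7)
# Hypothetical mechanism: each toss makes between 10 and 29 half turns.
half_turns = [rng.randint(10, 29) for _ in range(10_000)]

heads_start = [toss(True, n) for n in half_turns]
tails_start = [toss(False, n) for n in half_turns]

print(sum(heads_start) / len(heads_start))  # fraction of heads, near 0.5
print(sum(tails_start) / len(tails_start))  # fraction of heads, near 0.5
print(all(h != t for h, t in zip(heads_start, tails_start)))  # every toss flipped
```

Only by looking at the mechanism toss by toss, as the last line does, can we tell that the starting orientation mattered.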

Statistics only keeps track of the frequency of some attribute within some population, and not which individual has what attribute. So we have to be careful when extrapolating results to individual experience, which is ultimately what religion cares more about. Ah, and about conflict: the memo I got is that conflict is always between people, typically over some kind of power. When the fool points at the enemy, the wise looks at the finger.