Monday, December 30, 2013

Solving Hilbert’s sixth problem (part three of many)


A tale of two products


While thinking about how to proceed with deriving quantum mechanics, I decided to reverse the order of presentation a bit and first review the goal of the derivation: a symmetric and a skew-symmetric product.

Let us start with classical mechanics and good old-fashioned Newton’s second law: F = ma. Consider the simplest case: the one-dimensional motion of a point particle in a potential V(x):

F = -dV/dx
a = dv/dt with “v” the velocity.
Introducing the momentum p = mv and the Hamiltonian as the sum of kinetic and potential energy, H(p,x) = p^2/2m + V(x), we have:

dp/dt = ma = -∂V/∂x = -∂/∂x (p^2/2m + V) = -∂H/∂x
and
dx/dt = v = p/m = ∂/∂p (p^2/2m + V) = ∂H/∂p

In general one does not restrict attention to point particles; it is customary to introduce generalized coordinates q and generalized momenta p to take advantage of the various symmetries of the problem (q = q1, q2, …, qn; p = p1, p2, …, pn). Hamilton’s equations then read:


dp/dt = -∂H/∂q
dq/dt = +∂H/∂p

with H = H(q,p,t)
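To make these equations concrete, here is a minimal numerical sketch. It assumes a harmonic potential V(q) = k q^2/2 with m = k = 1 and uses a leapfrog (kick-drift-kick) integrator; both the potential and the integrator are illustrative choices, not something the derivation depends on.

```python
import math

def grad_V(q, k=1.0):
    """dV/dq for the assumed potential V(q) = k q^2 / 2."""
    return k * q

def hamiltonian(q, p, m=1.0, k=1.0):
    """H(q, p) = p^2/(2m) + V(q)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -dV/dq."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(q, k)  # half step in p
        q += dt * p / m               # full step in q
        p -= 0.5 * dt * grad_V(q, k)  # half step in p
    return q, p

q0, p0 = 1.0, 0.0
dt = 0.001
steps = round(2 * math.pi / dt)       # roughly one period when m = k = 1
q1, p1 = leapfrog(q0, p0, dt, steps)
energy_drift = hamiltonian(q1, p1) - hamiltonian(q0, p0)
print(q1, p1, energy_drift)
```

After about one period the particle should be back near its starting point, and the value of H should be almost unchanged.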

We observe two very important things right away: the equations are first order in time, and the q’s and p’s are in one-to-one correspondence.

Now we can introduce the Poisson bracket of two functions f(p,q) and g(p,q) as follows:

{f,g} = ∂f/∂q ∂g/∂p - ∂f/∂p ∂g/∂q

as a convenience to express the equations of motion like this:

dp/dt = {p, H}
dq/dt = {q, H}

The Poisson bracket defines a skew-symmetric product between any two functions f,g:

{f,g} = f o g = -g o f = -{g,f}

More importantly, this product obeys the so-called Jacobi identity:

{f,{g,h}} + {h,{f,g}} + {g,{h,f}} = 0

This identity follows directly from the definition of the Poisson bracket by expanding and canceling the partial derivatives.

The Jacobi identity and the skew-symmetry property define a Lie algebra.
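The skew-symmetry and the Jacobi identity can be checked mechanically. The sketch below is only an illustration under a simplifying assumption: functions on phase space are restricted to polynomials in one q and one p, stored as dictionaries {(a, b): coeff} for the monomial q^a p^b. The helper names (add, mul, d_dq, d_dp, poisson) and the test polynomials are made up for this example.

```python
from collections import defaultdict
from fractions import Fraction

def clean(poly):
    """Drop monomials with zero coefficient."""
    return {k: c for k, c in poly.items() if c != 0}

def add(f, g, sign=1):
    """Return f + sign * g."""
    out = defaultdict(int)
    for k, c in f.items():
        out[k] += c
    for k, c in g.items():
        out[k] += sign * c
    return clean(out)

def mul(f, g):
    """Ordinary (symmetric) product of two polynomials."""
    out = defaultdict(int)
    for (a, b), c in f.items():
        for (d, e), h in g.items():
            out[(a + d, b + e)] += c * h
    return clean(out)

def d_dq(f):
    return clean({(a - 1, b): a * c for (a, b), c in f.items() if a > 0})

def d_dp(f):
    return clean({(a, b - 1): b * c for (a, b), c in f.items() if b > 0})

def poisson(f, g):
    """{f,g} = df/dq dg/dp - df/dp dg/dq."""
    return add(mul(d_dq(f), d_dp(g)), mul(d_dp(f), d_dq(g)), sign=-1)

q = {(1, 0): 1}
p = {(0, 1): 1}
H = {(0, 2): Fraction(1, 2), (2, 0): Fraction(1, 2)}  # H = p^2/2 + q^2/2

# Arbitrary test polynomials: f = q^2 p, g = q p^3 + q, h = p^2 + q^3
f = {(2, 1): 1}
g = {(1, 3): 1, (1, 0): 1}
h = {(0, 2): 1, (3, 0): 1}

jacobi = add(add(poisson(f, poisson(g, h)), poisson(h, poisson(f, g))),
             poisson(g, poisson(h, f)))
print(poisson(q, H))   # dq/dt = {q, H}
print(poisson(p, H))   # dp/dt = {p, H}
print(jacobi)          # empty dict means the Jacobi sum vanishes
```

An empty dictionary stands for the zero polynomial, so the last line printing {} is the Jacobi identity for this particular choice of f, g, h.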

So in classical mechanics in phase space one defines two products: the regular function multiplication: f(q,p).g(q,p) which is a symmetric product, and the Poisson bracket {f(q,p),g(q,p)} which is a skew-symmetric product.

Now onto quantum mechanics. In quantum mechanics one replaces the Poisson bracket with the commutator [A, B] = AB - BA, which can be understood as a skew-symmetric product between operators on a Hilbert space. There is also a symmetric product: the Jordan product, defined as half the anti-commutator: {A,B} = 1/2 (AB + BA)

The commutator also obeys the Jacobi identity:

[A,[B,C]] + [C,[A,B]] + [B,[C,A]] =

[A, BC-CB] + [C, AB-BA] + [B, CA-AC] =

ABC-ACB-BCA+CBA + CAB-CBA-ABC+BAC + BCA-BAC-CAB+ACB = 0

and the commutator also defines a Lie algebra, just like in classical mechanics.
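The same cancellation can be confirmed on concrete matrices. This is a small sketch with three arbitrary 3x3 integer matrices (chosen only for illustration) and hand-written helpers, so that the arithmetic is exact.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, sign=1):
    """Entrywise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return combine(matmul(A, B), matmul(B, A), sign=-1)

# Arbitrary integer matrices, chosen only for illustration.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, 5], [1, 1, 0]]
C = [[3, 0, 2], [1, 1, 0], [0, 2, 1]]

zero = [[0] * 3 for _ in range(3)]
jacobi = combine(combine(comm(A, comm(B, C)), comm(C, comm(A, B))),
                 comm(B, comm(C, A)))
skew = combine(comm(A, B), comm(B, A))

print(jacobi == zero)  # Jacobi identity
print(skew == zero)    # skew-symmetry: [A,B] + [B,A] = 0
```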

How can we understand the Jordan product? In quantum mechanics operators do not commute and we cannot simply take the function multiplication.

To generate real spectra and positive probability predictions, observable operators must be self-adjoint: O = O†, meaning that in matrix form they are equal to their conjugate transpose. However, because of the transposition, the product of two self-adjoint operators is in general not self-adjoint:

(AB)† = B†A† = BA != AB

However, the Jordan product preserves self-adjointness:

{A,B}† = ½ ( (AB)† + (BA)† ) = ½ (BA + AB) = {A,B}

if A = A† and B = B†

In quantum mechanics the Jordan product is a symmetric product.
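A quick numerical illustration of this point, using two arbitrary 2x2 self-adjoint matrices (picked only for this example): the plain product fails to be self-adjoint, while the Jordan product stays self-adjoint.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def jordan(A, B):
    """Jordan product {A,B} = (AB + BA) / 2."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[0.5 * (x + y) for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

# Two self-adjoint matrices, chosen only for illustration.
A = [[1, 2 + 1j], [2 - 1j, 3]]
B = [[0, 1j], [-1j, 1]]

AB = matmul(A, B)
print(dagger(A) == A, dagger(B) == B)       # both inputs are self-adjoint
print(dagger(AB) == matmul(B, A))           # (AB)† = BA
print(dagger(AB) == AB)                     # the plain product is not self-adjoint
print(dagger(jordan(A, B)) == jordan(A, B)) # the Jordan product is
```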

Both classical and quantum mechanics have a symmetric and a skew-symmetric product:

                 CM                  QM

Symm             f.g                 Jordan product
Skew-Symm        Poisson bracket     Commutator

Both classical and quantum mechanics have dualities:

CM: duality between qs and ps: q <---> p
QM: duality between observables and generators: q <---> -i ħ ∂/∂q = p

So in this post we solved the simple direct problem: extract a symmetric and a skew-symmetric product.

In subsequent posts we will show two important things:
1) we will derive the symmetric and skew-symmetric products of classical and quantum mechanics from composability
2) we will solve the inverse problem: derive classical and quantum mechanics from the two products.

In the meantime: HAPPY NEW YEAR!

Wednesday, December 25, 2013

Solving Hilbert’s sixth problem (part two of many)


Picking the physical principles


We can now try to pick essential physics principles. Suppose we play God and we need to select the building blocks of reality. To avoid infinite regress (who created God?) we need something which is timeless. Outside space and time, the only things which qualify are mathematical relationships. Euclidean geometry existed well before the ancient Greeks, and E=mc^2 was valid before Einstein and before the solar system was formed. The names of the mathematical relationships are just historical accidents.

Fine, but the nature of mathematical relationships is very different than the nature of reality. Sticks and stones may break my bones, but when was the last time you heard that someone was killed by Pythagoras’ theorem? If reality is nothing but mathematical relationships arranged in a way to avoid contradictions, we need to look at the essential differences between mathematics and reality (http://arxiv.org/abs/1001.4586).

One key difference is that of “objective reality”. How can we quantify this? Objective reality means that any two observers can agree on statements about nature. In other words, one can define a universal (non-contextual) notion of truth. In mathematics truth is defined as a consequence of the axioms but in nature truth is defined as the agreement with experiment. Between two incompatible axiomatic systems there is no possible concept of true and false and the same statement can be true in one system, and false in another. Take for example the statement p=”two parallel lines do not intersect”. The same p is true in Euclidean geometry and false in non-Euclidean geometry.

If universal truth is to exist, it implies the possibility to reason consistently and to define probabilities. In a more mundane setting we demand positivity: it is what can define a bit. We take positivity as the first physical principle. We are not specifying what kind of bit we are talking about (classical bit, quantum qubit, or current probability density (zbit)), only that objective reality (it) can generate information such that any two observers can agree.

There is another key difference between the abstract world of math and the concrete real world. In mathematics there is a disjoint set of mathematical structures, and the job of a mathematician is to explore this landscape and find bridges between seemingly isolated areas. Nature, on the other hand, is uniform and the laws of nature are the same (invariant). There are no island universes in our reality (even if the multiverse may exist, we cannot interact with other pockets of reality with different laws of physics). In mathematics two triangles can be combined to form something other than another triangle, but in nature the laws of physics for system A and the laws of physics for system B are the same as the laws of physics for system A+B. For example, the Newtonian laws of motion for the Earth are the same as the Newtonian laws of motion for the Sun, and they do not change when we consider the Earth+Sun system. This may look trivial, but it is an extremely powerful observation, and from it we will derive three kinds of dynamics: classical mechanics, quantum mechanics, and another type of mechanics not present in nature (which we will show violates the positivity condition).

The second physical principle we consider is composition: the laws of nature are invariant under tensor composition.

So if you are God, your requirements for the job are: use timeless mathematical structures as your building blocks, do it in such a way that you create objective reality (the ability to define a context-independent notion of truth), and make sure that the laws of reality (physics) are invariant. If however you are a physicist wanting to solve Hilbert’s sixth problem, your starting physical principles are: positivity and composition. The idea that reality is made of nothing but mathematical structures is known as the “mathematical universe hypothesis”, but Tegmark’s proposal is done incorrectly: it looks at the similarity between mathematics and reality and proposes computability as the physical principle. The right way is to look at the differences, and this leads to composition and positivity. Composition (or composability, which is part of the name of this blog) was initially proposed and explored by Emile Grgin and Aage Petersen, while positivity (the objective reality) as a physical principle was first proposed and explored by the author.


Next time I will start using composability (or the invariance of the laws of nature under tensor composition) to derive three (and only three) possible dynamics (two of which are classical and quantum mechanics) in the Hamiltonian formalism.

Friday, December 20, 2013

Solving Hilbert’s sixth problem (part one of many)


Outside in or inside out?


In 1900 David Hilbert proposed a set of problems to guide mathematics in the 20th century. Among them problem six asks for the axiomatization of physics.

Solving problem six is a huge task, and the current consensus is that it is a pseudo-problem, but I will attempt to prove otherwise in this and subsequent posts. I will also start formulating the beginning of the answer.

Let’s first try to get a feel for the magnitude of the problem. What does axiomatizing physics mean? Suppose the problem is solved and we have the solution on a piece of paper in front of us. Should we be able to answer any physics question without using experiments? Is the answer supposed to be a Theory of Everything? Let’s pause for a second and reflect on what we just stated: eliminate the need for experiments in physics!!! This is huge.

But what about Gödel’s incompleteness theorem? Because of it mathematics is not axiomatizable and has an infinite landscape. Do the laws of physics have an infinite landscape too?

The biggest roadblock for solving Hilbert’s sixth problem turns out to be Gödel’s incompleteness theorem. Let’s get the gist of it. Start with an antinomy (any antinomy will do): this statement is false. If the statement is true, its content is accurate, but its content says that the statement is false. Contradiction. Likewise, if the statement is false, its negation is true, but the negation states that the statement is true. Again we have a contradiction. This was well known a long time before Gödel as the liar’s paradox. But now let’s follow Gödel and replace true and false with provable and unprovable. We get: this statement is unprovable. Suppose the statement is false. Then the statement is provable. Then there exists a proof of a false statement. Therefore the reasoning system is inconsistent. The only way to restore consistency is to have that the statement is true. Hence we just constructed a true but unprovable statement!

Now suppose we start, in a sufficiently powerful axiomatic system S_n, with axioms a_1, a_2, …, a_n (at the minimum S_n must include the arithmetic of natural numbers). Construct a statement P that is not provable in the axiomatic system (Gödel does this using the diagonal argument). Then we can add P to a_1, …, a_n and construct the axiomatic system S_n+1 = {a_1, a_2, …, a_n, P}. We can also construct another axiomatic system S’_n+1 = {a_1, a_2, …, a_n, not P}. Both S_n+1 and S’_n+1 are consistent systems (provided S_n is, and P can be neither proved nor disproved in it), but together they are incompatible (because P and not P cannot both be true at the same time). The process can be repeated forever, and hence in mathematics there is no “Theory of Everything Mathematical”, no unified axiomatic system, and mathematics has an infinite domain.

So it looks like the goal of axiomatizing physics is hopeless. Mathematics is infinite, and mathematicians seem to be able to keep exploring the mathematical landscape. Since mathematicians are part of nature too, axiomatizing physics seems to demand math axiomatization as well. Case closed, Hilbert’s sixth problem must be a pseudo-problem, right?

However, it turns out there is another way to do axiomatization. Let’s start by looking at nature. We see that space-time is four dimensional, we see that nature is quantum at core, we see that the Standard Model has definite gauge symmetries. Nature is written in the language of mathematics. But WHY are some mathematical structures preferred by Nature over others? We cannot say that those mathematical structures are unique; all mathematical structures are unique! We can say that some mathematical structures are distinguished.

Solving Hilbert’s sixth problem demands as a prerequisite finding a mechanism to distinguish a handful of mathematical structures from the infinite world of mathematics.

And in a well known case we know the answer. Consider the special theory of relativity: this is a theory based on a physical principle. Finding essential physical principles is what needs to be done first. Suppose we now have all nature’s physical principles written in front of us. What is the next step? The next step is to use them as filters to select distinguished mathematical structures. If we pick the principles correctly, the accepted mathematical structures will be those and only those which are distinguished by nature as well.

So instead of doing an axiomatization in the traditional way, outside in (from axioms derive statements), we use it inside out: from physical principles (axioms) we reject all mathematical structures but a distinguished few. The axioms are like the fence of a domain establishing its boundary. The orientation of the boundary matters. In everyday life, or in engineering, this way of using the axioms is well known: they are called requirements. When we buy a car we do not start with the Standard Model Lagrangian to arrive at the make and model we will buy, but we start with requirements: the price range, the acceptable colors, etc. In other words we start with the acceptable features. In the special theory of relativity, the relativity principle rejects all Lie groups except the Lorentz and the Galilean groups. An additional postulate, the constancy of the speed of light, picks the final answer.


Whatever gets selected does not need to be a closed form theory of everything, and we bypass the limitation from Gödel’s incompleteness theorem. Now this program is actually very feasible. Next time I will show how to pick the physical principles. We’ll pick two principles, and in subsequent posts I’ll use those principles to derive quantum and classical mechanics step by step in a very rigorous mathematical way (it is rather lengthy to derive quantum mechanics and I don’t know how many posts I’ll need for it). It turns out that quantum and classical mechanics are also theories of nature based on physical principles, just like the theory of relativity is. The role of the constancy of the speed of light postulate will be played in the new case by Bell’s theorem. In the process of deriving quantum mechanics we’ll make great progress towards solving Hilbert’s sixth problem, but we’ll still be far short of a full “theory of everything”.

Saturday, December 14, 2013

Soliton Theory (part 2 of 2)


Besides the KdV equation covered last time, some well known soliton equations are:

-Nonlinear Schrodinger equation (NLSE):

            i ∂_t ψ = - ½ ∂_x^2 ψ + k |ψ|^2 ψ

-Sine-Gordon equation:

            φ_tt - φ_xx + sin φ = 0

-Kadomtsev-Petviashvili (KP) equation:

            ∂_x (∂_t u + u ∂_x u + 2 ε^2 ∂_xxx u) ± ∂_yy u = 0

I am most familiar with the nonlinear Schrodinger equation and its variations. NLSE was proposed some 40 years ago to be used for describing light propagation in optical fibers. People have used signal fires to transmit information since ancient times (or since Middle Earth :) ), but one needs guided light transmission to eliminate interference from the atmosphere (rain, fog, etc). However, using ordinary glass is not practical because ordinary glass is not transparent enough. Imagine looking through a glass 30 miles thick!!! 30 miles is the typical length for an optical fiber because it takes dissipation 30 miles to reduce the intensity in half, and at that point signal amplification is required. Advances in glass manufacturing resulted in ultra-transparent optical fibers (at certain wavelengths) close to the theoretical transparency limit (this limit is due to Rayleigh scattering, which is responsible for the blue color of the sky).

From Maxwell’s equations, one can derive NLSE. The anharmonic electron oscillation generates the nonlinear |ψ|^2 ψ term. The ∂_x^2 ψ term corresponds to ordinary dispersion in the optical fiber, and a soliton is a “light bullet” which carries one bit. Using solitons, a single optical fiber can carry giga (10^9) bits of information per second (the usual soliton pulse is measured in picoseconds (10^-12 s), but femtosecond pulses for shorter distances are possible too). A single telephone call requires 64 kbit/s, so a 1 Gbit/s optical line can carry 15,625 concurrent calls. In 1999 rates of 300 Gbit/s for a single fiber were commercially achieved, and an optical cable bundle has much more than a single optical fiber. Advances in long distance transmission rates were so great that despite Moore’s law computers started to be viewed as the hopeless communication bottleneck.

For all their potential, optical solitons never materialized in practice, and there is a funny disconnect between academic research and industry which I encountered several times. For example, after I graduated I had a job interview at Bell Labs hoping to amaze them with my NLSE research, but using low intensity traditional pulses they had already achieved in practice a transmission rate one order of magnitude higher than what was demonstrated with solitons at that time (based on publications in experimental journals). Solitons are not very practical due to arrival-time instabilities and the need to use expensive active pulse repeaters every 30 miles instead of passive amplification (imagine an active repeater malfunctioning at the bottom of the Pacific Ocean in an undersea cable).
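The call-capacity figure is simple division, spelled out below. The sketch assumes that the quoted line rates are in bits per second and that each call takes exactly 64 kbit/s with no protocol overhead.

```python
CALL_RATE = 64_000                 # bits per second for one telephone call

def concurrent_calls(line_rate_bits_per_s):
    """How many 64 kbit/s calls fit in a line of the given rate."""
    return line_rate_bits_per_s // CALL_RATE

calls_1g = concurrent_calls(1_000_000_000)       # 1 Gbit/s line
calls_300g = concurrent_calls(300_000_000_000)   # 300 Gbit/s line
print(calls_1g)     # 15625
print(calls_300g)   # 4687500
```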

Another academia-industry disconnect I encountered was the one on training a neural network for OCR (optical character recognition). Just like soliton theory generated thousands of research papers but the industry never adopted it, in neural networks there is an extensively studied “back propagation method”. In practice this is all useless and the industry has much more effective techniques for training neural networks, but they are all trade secrets. With the industry knowledge, reading the back propagation papers made for a good laugh.

But solitons are a nice topic in themselves, and they do have very interesting math properties.

In practice there are two standard techniques for computing soliton solutions: the Lax pair and the zero curvature condition. The Lax pair is linked with a Sturm-Liouville equation, and solitonic equations have an infinite number of conservation laws. Discovering a lot of conserved quantities for a new equation is a sure clue that the equation admits soliton solutions.

The KdV equation can be expressed as a Hamiltonian system in two distinct ways. Such a case is called bi-Hamiltonian, and the interplay between the two Hamiltonians generates an infinite number of conservation laws.

Another way one can solve the solitonic equations is by the Riemann-Hilbert problem. Through this problem there is an unexpected link between solitons and renormalization in field theory.

In an initial value problem for solitonic equations, part of the initial condition excites dispersive waves, and (if the energy is large enough) part excites the solitonic pulses. To solve the Riemann-Hilbert problem, given a closed curve on the Riemann sphere (the complex plane + infinity), one has to marry two functions, one on each side of the curve, given some initial value on the curve. In soliton land, the initial value on the curve corresponds to the dispersive waves, and solitons correspond to poles on the Riemann sphere. This is why solitons are robust: once there, the poles cannot be eliminated (in the absence of friction or additional effects which destroy the infinite number of conservation laws).

An excellent review on the Riemann-Hilbert problem and solitons can be found here. The gist of the Riemann-Hilbert problem is: reconstruct an analytic function from its singularities. As Alexander Its points out, in its most general way, integrability means that local properties (singularities) determine global behavior.

Saturday, December 7, 2013

Soliton theory (part 1 of 2)


In the last post I listed the amazing lectures of Mr. Bender on perturbation theory. If you managed to follow the lectures to the end, you got to see the WKB perturbation theory in action. The lecture ending was on extracting a “beyond all orders” behavior.

After watching a powerful movie, don’t you sometimes want a different ending? Take, for example, the movie Inception.

If you have not watched it, I won’t spoil it by saying more, but if you have, you understand what I mean.

So I cannot help myself, and I’ll attempt to give an alternative ending to Mr. Bender’s lectures by venturing into the wonderful area of soliton theory.

In lessons 13, 14 and 15 you got to see how to solve a potential well in Schrodinger’s equation using WKB. At each turning point there is a reflected wave, and one may ask: are there potential wells for which there is no reflection? What would a reflectionless potential look like? This is the starting point of the so-called Inverse Scattering Theory.

But let’s start in historical fashion.

In 1834 a certain gentleman, Mr. John Scott Russell, was amazed by a peculiar wave along a canal:

“I was observing the motion of a boat which was rapidly drawn along a narrow channel by a pair of horses, when the boat suddenly stopped – not so the mass of water in the channel which it had put in motion; it accumulated round the prow of the vessel in a state of great agitation, then suddenly leaving it behind, rolled forward with great velocity, assuming the form of a large solitary elevation, a rounded, smooth and well-defined heap of water, which continued its course along the channel apparently without change of form or diminution of speed. I followed it on horseback, and overtook it still rolling on at a rate of some eight or nine miles an hour, preserving its original figure some thirty feet long and a foot to a foot and a half in height. Its height gradually diminished, and after a chase of one or two miles I lost it in the windings of the channel.”

It was not until 1895 that Diederik Korteweg and Gustav de Vries derived the equation of this wave:

U_t + 6 U U_x + U_xxx = 0

(now called the KdV equation)
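For this normalization of the KdV equation, the standard one-soliton solution is U(x,t) = (c/2) / cosh^2( (√c/2)(x - c t) ), where c > 0 is the speed of the pulse. The formula is not derived in this post; the sketch below simply plugs it into the equation using finite differences (with an arbitrary c = 4, an arbitrary time, and an arbitrary step h) and checks that the residual is small.

```python
import math

def soliton(x, t, c=4.0):
    """One-soliton profile U = (c/2) sech^2( sqrt(c)/2 * (x - c t) )."""
    arg = 0.5 * math.sqrt(c) * (x - c * t)
    return 0.5 * c / math.cosh(arg) ** 2

def kdv_residual(u, x, t, h=1e-3):
    """Central-difference estimate of U_t + 6 U U_x + U_xxx at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t + 6 * u(x, t) * u_x + u_xxx

# Sample the residual on a grid of x values at an arbitrary time t = 0.3.
worst = max(abs(kdv_residual(soliton, i / 10.0, 0.3)) for i in range(-50, 51))
print(worst)  # small, limited only by the finite-difference error
```

Note that the amplitude c/2 and the speed c are tied together: taller solitons are narrower and travel faster, which is what a nonlinear equation allows and a linear one does not.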

Because the equation contains the second power of U it is a nonlinear equation, and in particular it is a nonlinear partial differential equation, an ugly beast of intractable complexity which nobody knew how to solve.

Fast forward to 1952: Fermi, Pasta, and Ulam were doing computer modeling for a certain problem and noticed some odd periodic behavior.

Further investigating this in 1965, Zabusky and Kruskal observed the kind of solitary waves Mr. Russell witnessed. Those solitary waves or pulses were able to pass through one another with no perturbation whatsoever (and hence the term solitons) which was a very bizarre behavior.

The mathematical breakthrough occurred in 1967 when Gardner, Greene, Kruskal and Miura discovered the inverse scattering technique for solving the KdV equation.

But what is inverse scattering? Let us start simpler with linear partial differential equations. How do we solve the initial value problem? The standard technique is that of a Fourier transform.

In a Fourier transform we multiply a function of x, f(x), by exp(ikx) and then integrate over all x. What results is a function F(k). Fine, but what does this have to do with partial differential equations?

Let us take the Fourier transform of the linear partial differential equation. Wherever we have a partial derivative, we integrate by parts and transfer the derivative to the exp(ikx) term. In turn this pulls a factor of ik out of the exponential, and under the integral sign we have transformed the linear partial differential equation into a polynomial equation in k. Wow (applause please)!!!

Solving polynomial equations is MUCH easier than solving partial differential equations.

So the general technique is the following: extract the (Fourier) modes, solve the easy time problem in (Fourier) modes, and perform a reversed (Fourier) transform to obtain the solution at a later time:


K(t=0) ----- solve time evolution in polynomial equation -----> K(t=T)
  ^                                                               |
  |                                                               v
F(K(t=0))                                                     F(K(t=T))
  ^                                                               |
  |                                                               v
Fourier Transform                               Inverse Fourier Transform
  ^                                                               |
  |                                                               v
f(x(t=0))                                                     f(x(t=T))
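The loop in the diagram can be sketched in a few lines. As an illustrative example (my choice, not the author's), take the linear equation u_t + u_xxx = 0 on a 2π-periodic grid: a mode exp(ikx) only picks up the phase exp(i k^3 t), so the time evolution in Fourier space is a multiplication. The sketch uses a naive discrete Fourier transform with the exp(-ikx) sign convention for the forward direction; the grid size and the initial condition are arbitrary.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier sum with kernel exp(sign * 2*pi*i * idx * m / n)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * idx * m / n)
                for idx, v in enumerate(values))
            for m in range(n)]

def evolve(u0, T):
    """Solve u_t + u_xxx = 0 up to time T: transform, rotate phases, invert."""
    n = len(u0)
    modes = dft(u0, -1)                       # forward transform
    for m in range(n):
        k = m if m < n // 2 else m - n        # signed integer wavenumber
        modes[m] *= cmath.exp(1j * k ** 3 * T)  # easy time evolution per mode
    return [z.real / n for z in dft(modes, +1)]  # inverse transform

n, T = 32, 0.7
xs = [2 * math.pi * idx / n for idx in range(n)]
u0 = [math.cos(x) + 0.5 * math.sin(2 * x) for x in xs]
uT = evolve(u0, T)

# Exact solution for this initial condition, used only as a check.
exact = [math.cos(x + T) + 0.5 * math.sin(2 * x + 8 * T) for x in xs]
max_error = max(abs(a - b) for a, b in zip(uT, exact))
print(max_error)
```

The two Fourier components travel at different speeds (the k = 2 component is shifted by 8T, the k = 1 component by T), which is the dispersion that a soliton has to balance with nonlinearity.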

Something similar happens for nonlinear partial differential equations, where the role of the Fourier transform is taken by solving a Schrodinger equation scattering problem. The scattering potential in the Schrodinger equation is the solution to the nonlinear partial differential equation. The typical solitonic solution is of the form 1/cosh. Solving nonlinear partial differential equations takes this form:

S(t=0) ----- solve time evolution in polynomial equation -----> S(t=T)
  ^                                                               |
  |                                                               v
scattering parameters(t=0)                   scattering parameters(t=T)
  ^                                                               |
  |                                                               v
Direct Scattering Problem                    Inverse Scattering Problem
  ^                                                               |
  |                                                               v
f(x(t=0))                                                     f(x(t=T))

Solving the direct scattering problem is straightforward, and here one may use WKB for example, but in practice this does not happen. Using WKB to derive the reflectionless property (this is why solitons pass through each other unperturbed) would yield the 1/cosh solution.

The tricky problem is the inverse scattering by solving a Gelfand-Levitan-Marchenko problem. This is where the computation becomes intensive. The technique makes use of Jost functions to define the boundary conditions, but those are details.

Soliton theory is a very nice and rich area with unexpected links in mathematics and physics. Next time I’ll present some famous solitonic equations, the unexpected link with renormalization theory, and the real world potential usefulness of solitons.