This essay aims to distill the philosophical content of *Space and time in special relativity*. This can be achieved by focusing solely on the time aspect of special relativity, while making essentially no use of formal proofs.

Time is homogeneous and constantly passing. It consists of durations that events can occupy. To identify (measure) durations, clocks are used.

Time is absolute: it permeates the whole universe. All physical objects, even the most remote ones, are connected through time. They all "experience" the very same time.

**(Definition)** A reference frame in which the law of inertia holds is called an inertial frame.

The law of inertia states that the velocity of a particle (= point-like rigid object) may only change if there is a force acting upon the particle.

**Note:** velocity is speed together with its direction; i.e. it's a vector.

Intuitively, this means that if all forces acting upon a particle were "switched off", it would continue to move at the constant velocity it has reached until then. Thus, there is a physical reason to think that not all reference frames are "equal", but some of them are "special", namely the inertial frames.

It is true in both classical mechanics and special relativity that:

**(Theorem)** Given any one inertial frame, exactly those reference frames are inertial frames which are moving at a constant velocity relative to it.

The classical laws of motion are exactly the same in every inertial frame. That is, with respect to classical mechanics, inertial frames are all equivalent.

The following, plausible generalization is called the principle of relativity, and its validity is generally assumed in physics:

**(Principle)** The laws of nature are exactly the same in every inertial frame. That is, inertial frames are all equivalent in describing __any__ physical phenomena.

**(Definition)** The circumstances under which a theory can be considered correct are called the scope of the theory.

The originally intended scope can shrink as new facts come to light. This happens when it turns out that some (often tacit) assumptions cannot be considered correct in all situations. In special relativity, two such assumptions are Euclidean geometry and continuous quantities. Both have already been challenged by later developments in physics.

There are plenty of other, tacit assumptions which we don't even notice. We can rest assured that all those will be challenged some day. Still, it is more constructive to say that the assumptions are correct within limits, rather than saying they are just incorrect.

It is a well-established experimental fact that:

**(Law)** The tip of a ray of light propagates in vacuum at a constant speed c ≈ 300'000'000 m/s, always along a straight line, relative to __every__ inertial frame.

**(Corollary)** The speed of light in an inertial frame is independent of the state of motion of the entity that emits it.

This is contradictory, since according to classical mechanics the speed of a given entity, i.e. that of the tip of the ray of light in this case, should depend on the reference frame in which it is observed. To eliminate the contradiction, it was necessary in special relativity to revisit and challenge our intuitive concepts about space and time.
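The contradiction can be made concrete with a small numerical sketch. It assumes the classical (Galilean) rule that speeds along a common line simply subtract when changing frames. The function name and the sample speed of the second frame are illustrative choices, not taken from the text.

```python
C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def galilean_speed_in_moving_frame(speed_in_k: float, frame_speed: float) -> float:
    """Classical rule: speed of an entity relative to a frame that moves at
    frame_speed along the same line, given the entity's speed relative to K."""
    return speed_in_k - frame_speed


# Hypothetical second inertial frame moving at 10% of c relative to K.
v = 0.1 * C

# What classical mechanics predicts for the tip of a light ray in that frame.
classical_prediction = galilean_speed_in_moving_frame(C, v)

# What the experimental law states: the same c in every inertial frame.
measured = C

print("classical prediction:", classical_prediction)
print("mismatch with the law:", measured - classical_prediction)
```

The mismatch equals the speed of the second frame, so it vanishes only when the two frames are at rest relative to each other.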

In order to ensure that we are not misled by intuition, Einstein suggested that all definitions in physics must be based on measurements which are, in principle, feasible.

Let K denote an inertial frame.

**(Definition)** Two events are simultaneous in K if and only if a symmetrically placed observer in K sees them, by the naked eye (through vacuum), happen simultaneously.

This definition is compatible with the classical conception of time.

**Note:** by saying "observer in K", it is meant that the observer is also at rest relative to K.

Instead of light, other signals could also be used, e.g. pistol bullets or even carrier-pigeons, as long as the symmetry of their propagation can be assumed. Since we are talking about mechanics, it seems reasonable to require only the symmetry of the net forces acting upon the chosen pair of (identical) "messengers". Einstein's original approach eliminates this requirement elegantly: it uses light signals and assumes that __nothing__ whatsoever can make a difference to their propagation in vacuum. The same is definitely not true for e.g. a bullet whose motion is affected by a multitude of gravitational forces at the very least.

**(Definition)** Two clocks (at rest) in K are synchronized if the same positions of their hands are simultaneous events in K.

It can be shown that it's possible to synchronize all clocks of interest in an inertial frame, so that they are all ticking in sync.

**(Definition)** In an inertial frame K, the time read off from its synchronized clocks is called the coordinate time of K.

The coordinate time of an event can be read off from the clock in K that is momentarily co-located with the entity to which the event happens.

As long as there is only one inertial frame considered, time seems no different from that of classical mechanics.

As for the simultaneity of two or more events, naked-eye observers placed at arbitrary (non-equidistant) locations would typically come to different conclusions, due to the finiteness of the speed of light. This suggests that __inertial frame wide simultaneity is not a direct experience but a mere definition__, based on an agreed way of measurement. The same is true for coordinate time, since it also builds on the concept of simultaneity.
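A minimal sketch of why the observer's placement matters, assuming that light travels at c in every direction in K and that the two events share the same coordinate time. The positions and the helper name `time_seen` are hypothetical.

```python
import math

C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def time_seen(event_time: float, event_pos: tuple, observer_pos: tuple) -> float:
    """Moment at which an observer at rest in K sees an event: the event's
    coordinate time plus the light travel time to the observer."""
    return event_time + math.dist(event_pos, observer_pos) / C


# Two events with the same coordinate time, two light-seconds apart.
t_event = 0.0
e1_pos = (0.0, 0.0, 0.0)
e2_pos = (600_000_000.0, 0.0, 0.0)

midpoint = (300_000_000.0, 0.0, 0.0)    # symmetrically placed observer
off_centre = (150_000_000.0, 0.0, 0.0)  # observer closer to the first event

gap_at_midpoint = (time_seen(t_event, e1_pos, midpoint)
                   - time_seen(t_event, e2_pos, midpoint))
gap_off_centre = (time_seen(t_event, e1_pos, off_centre)
                  - time_seen(t_event, e2_pos, off_centre))

print("gap seen at the midpoint:", gap_at_midpoint)
print("gap seen off centre:", gap_off_centre)
```

Only the symmetrically placed observer sees the two events at the same moment. The off-centre observer sees the nearer event first, even though both events have the same coordinate time.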

**Note:** already in the case of duration, two co-located observers need to agree on using an independent device (i.e. a clock) in order to avoid ambiguity.

It does no harm to imagine that the above concepts describe a directly intangible "objective reality" in an inertial frame, but as a matter of fact they are essentially just tools that help us in calculating answers to questions about our (more) direct experiences. The most important question to answer is this: between two events that happen directly to an observer, how much time elapses according to the observer's own clock?

The following suffices for our purposes:

**(Assumption)** Particles can be destroyed: if a particle gets destroyed, its trajectory is discontinued in every inertial frame. That is, if t_{K} denotes the time of destruction relative to inertial frame K, then the trajectory will not exist for any t > t_{K} in K.

**Note:** backed by the fact that the speed of light is finite, similar can be assumed about signals too.

**(Theorem)** The order of two events e_{1} and e_{2} that happen directly to a given particle is the same relative to every inertial frame.

**(Proof)** Let K and K' be two inertial frames, and let e_{1} and e_{2} happen at t_{1} and t_{2} in K, respectively, with t_{1} ≤ t_{2}. Let t_{1}' and t_{2}' denote the corresponding time values in K'. Moreover, let the particle get destroyed in K at t_{2}, which corresponds to t_{2}' in K'. Then, since the particle does exist in K' at t_{1}', t_{1}' ≤ t_{2}' must hold. Finally, t_{1}' = t_{2}' if and only if t_{1} = t_{2}, so t_{1}' < t_{2}' if and only if t_{1} < t_{2}.

The argument can be applied to signals too:

**(Theorem)** The order in which a signal meets two given entities is the same relative to every inertial frame.

**Note:** the tip of the signal plays the role of the particle.

Let K and K' denote two inertial frames, with K' moving along the x-axis of K in the positive direction, at a constant speed v.

**Note:** vectors are written in boldface, e.g. velocity **v**, while scalars in plain text, e.g. speed v = |**v**|.

Let AB be a segment of non-zero length in K, parallel to the x-axis, with x_{A} < x_{B}. Let A' and B' denote the corresponding points in K', marked out at some given time t in K.

In K', let's send a light signal from A' toward B'. It will arrive at B' after some time Δt' > 0. For the corresponding duration in K, Δt > 0 must hold due to causality. (Relative to K', the light signal meets first A' and then B', so it happens in the same order relative to K too.)

Both A' and B' are moving at **v** relative to K, so when the light signal is emitted, B' is ahead of A' in K. Later the light signal meets B', which means it is traveling in K along a straight line parallel to the x-axis, in the positive direction. The fact that it does catch up with B' implies:

**(Theorem)** v < c.
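The catch-up argument can be sketched numerically. The sketch assumes, as the law above states, that the light signal moves at c in K, and that B' starts some distance ahead of A' when the signal is emitted. The function name and sample values are hypothetical.

```python
C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def catch_up_time(gap: float, v: float):
    """Time in K after which a light signal emitted at A' reaches B'.

    B' starts `gap` metres ahead (gap > 0) and recedes at speed v along
    the x-axis. Signal position: c * t. Position of B': gap + v * t.
    Returns None when the signal never reaches B'.
    """
    if v >= C:
        return None  # the gap never closes
    return gap / (C - v)


print(catch_up_time(300_000_000.0, 0.5 * C))  # finite: the signal arrives
print(catch_up_time(300_000_000.0, C))        # None: the signal never arrives
```

Since the signal does arrive at B' in K', and the order of its meetings is the same in every inertial frame, only the first case can occur.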

As for the speed limit of signals (i.e. not inertial frames), it can be shown that due to causality no signal can travel faster than light.

Combining the principle of relativity with the constancy of the speed of light leads to a conclusion about time that is foreign to classical mechanics:

**(Theorem)** If an observer is moving at velocity **v** relative to an inertial frame K, then after Δt time elapsed in K, the observer's own clock will show only a corresponding Δt' = Δt · √(1 – v^{2} / c^{2}) time elapsed.

Thus, in general, the (coordinate) time that elapses between two events depends on the inertial frame in which it is measured. The effect is called time dilation since Δt > Δt' holds whenever v > 0 and Δt' > 0. In other words, to the observer moving relative to K, time in K always appears to pass faster than on the observer's own clock.

**Note:** time dilation is unnoticeable in our everyday life.
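A minimal numerical sketch of the time dilation formula, and of why the effect goes unnoticed at everyday speeds. The function names and the sample speeds (60% of c, and an airliner at roughly 250 m/s) are illustrative assumptions.

```python
import math

C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def own_clock_time(dt: float, v: float) -> float:
    """Time shown by the moving observer's clock after dt has elapsed in K."""
    if not 0.0 <= v < C:
        raise ValueError("speed must satisfy 0 <= v < c")
    return dt * math.sqrt(1.0 - (v / C) ** 2)


def own_clock_lag(dt: float, v: float) -> float:
    """dt minus own_clock_time(dt, v), algebraically rearranged so that
    the result stays accurate in floating point when v is tiny next to c."""
    beta_sq = (v / C) ** 2
    return dt * beta_sq / (1.0 + math.sqrt(1.0 - beta_sq))


# An observer moving at 60% of c while 10 s elapse in K.
fast = own_clock_time(10.0, 0.6 * C)

# An airliner-like speed of 250 m/s kept up for one hour of K time.
airliner_lag = own_clock_lag(3600.0, 250.0)

print("own clock at 0.6 c after 10 s in K:", fast)
print("lag after one hour at 250 m/s:", airliner_lag)
```

At 60% of c the observer's clock shows about 8 s for 10 s in K, whereas the one-hour lag at airliner speed is on the order of a nanosecond.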

The formula works for polygonal trajectories as well, provided that the speed is the same along all edges.

**(Assumption)** Any motion (of a point-like entity) can be approximated with arbitrary accuracy by polygonal motion.

This makes the formula valid for any motion of constant speed; even for circular ones as a matter of fact.
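The step from polygonal to circular motion can be sketched as follows. The sketch assumes a regular polygon inscribed in the circle and applies the time dilation formula edge by edge. The radius, speed and function names are hypothetical.

```python
import math

C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def own_time_on_polygon(radius: float, v: float, n_edges: int) -> float:
    """Own-clock time for one lap around a regular polygon inscribed in a
    circle of the given radius, travelled at constant speed v in K."""
    edge_length = 2.0 * radius * math.sin(math.pi / n_edges)
    dt_edge = edge_length / v                  # coordinate time in K per edge
    factor = math.sqrt(1.0 - (v / C) ** 2)     # same on every edge
    return sum(dt_edge * factor for _ in range(n_edges))


def own_time_on_circle(radius: float, v: float) -> float:
    """The limiting value: one lap around the circle itself."""
    dt_lap = 2.0 * math.pi * radius / v
    return dt_lap * math.sqrt(1.0 - (v / C) ** 2)


radius, v = 1000.0, 0.6 * C
circle = own_time_on_circle(radius, v)
for n in (6, 60, 600, 6000):
    polygon = own_time_on_polygon(radius, v, n)
    print(n, polygon, circle - polygon)
```

The difference from the circular value shrinks as the number of edges grows, which is what the approximation assumption needs.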

Let K and K' again denote two inertial frames, with K' moving along the x-axis of K in the positive direction, at a constant speed v > 0.

A consequence of time dilation is that:

**(Theorem)** At any given time t in K, the corresponding coordinate time t' in K' that is read off by an observer in K decreases as the x-coordinate of the observer's location increases. For an increase of Δx, there is a decrease in t' by Δx · (v / c^{2}) / √(1 – v^{2} / c^{2}).

So two events that are simultaneous in K do not happen at the same (coordinate) time in K', unless they have the very same x-coordinate.
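A numerical sketch of the theorem. The first function is the formula exactly as stated above. The second is the standard Lorentz time transformation, which this essay does not derive and which is used here only as an assumed cross-check, with the clocks at the two origins both showing zero when the origins meet. All names and sample values are hypothetical.

```python
import math

C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def t_prime_decrease(dx: float, v: float) -> float:
    """Decrease in t' for an increase dx of the x-coordinate in K,
    as stated in the theorem: dx * (v / c^2) / sqrt(1 - v^2 / c^2)."""
    return dx * (v / C ** 2) / math.sqrt(1.0 - (v / C) ** 2)


def lorentz_t_prime(t: float, x: float, v: float) -> float:
    """Assumed standard Lorentz time transformation (not derived here)."""
    return (t - v * x / C ** 2) / math.sqrt(1.0 - (v / C) ** 2)


v = 0.6 * C
dx = 300_000_000.0  # two observers in K, one light-second apart along x
t = 5.0             # the same coordinate time in K for both

from_theorem = t_prime_decrease(dx, v)
from_lorentz = lorentz_t_prime(t, 0.0, v) - lorentz_t_prime(t, dx, v)

print("decrease per the theorem:", from_theorem)
print("decrease per the Lorentz formula:", from_lorentz)
```

For these sample values both computations give a decrease of about 0.75 s: the two observers read K' clocks that differ by that amount at the same moment in K.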

One might think of simultaneity as a bond between certain events. If two events in an inertial frame K happen at the same coordinate time, they are connected by "now".

However, two events that happen at different times in K can also be connected by "now", provided there exists another inertial frame relative to which the very same two events happen simultaneously. Moreover, assuming that the bond is transitive, it can be shown that in fact any two events are connected this way, i.e. all the events of the world (ever) are simultaneous. Yet at our outset, non-simultaneous events did seem to exist.
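The chaining argument can be illustrated with three sample events. The sketch assumes the standard Lorentz relation Δt' = (Δt – v · Δx / c^{2}) / √(1 – v^{2} / c^{2}), which is not derived in this essay, and solves Δt' = 0 for v. The function names and the events are hypothetical.

```python
C = 300_000_000.0  # speed of light in vacuum in m/s, rounded as in the text


def frame_speed_for_simultaneity(dt: float, dx: float):
    """Speed along the x-axis of K of an inertial frame in which two events,
    separated by dt and dx in K, are simultaneous (assumed Lorentz relation).
    Returns None if no such frame slower than light exists."""
    if dt == 0.0:
        return 0.0   # already simultaneous in K itself
    if dx == 0.0:
        return None  # co-located in K but at different times
    v = C ** 2 * dt / dx
    return v if abs(v) < C else None


def separation(a: tuple, b: tuple) -> tuple:
    """(dt, dx) from event a to event b, each event given as (t, x) in K."""
    return (b[0] - a[0], b[1] - a[1])


e1 = (0.0, 0.0)            # (t in s, x in m)
e2 = (0.5, 300_000_000.0)  # half a second later, one light-second away
e3 = (1.0, 0.0)            # back at the location of e1, one second after it

v12 = frame_speed_for_simultaneity(*separation(e1, e2))
v23 = frame_speed_for_simultaneity(*separation(e2, e3))
v13 = frame_speed_for_simultaneity(*separation(e1, e3))

print("frame making e1 and e2 simultaneous:", v12)
print("frame making e2 and e3 simultaneous:", v23)
print("frame making e1 and e3 simultaneous:", v13)
```

No single frame makes e1 and e3 simultaneous, yet each of them is simultaneous with e2 in some frame. If the bond were transitive, e1 and e3 would be connected by "now" as well.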

Instead of spending a lot of time figuring out what this really means, we rather throw out this elusive, action-at-a-distance-flavor bond and interpret inertial frame wide simultaneity as a mere definition without any immediate physical content.

Absolute time can be represented by a geometrical straight line T. Points on T correspond to moments, while distances to durations. The idea behind absolute time is that every event can be mapped to exactly one point on this single T, because there exist absolute temporal relationships among events.

However, time dilation revealed that the coordinate time of any one inertial frame is not absolute time, since both simultaneity and duration are inertial frame dependent. To be precise, from the perspective of absolute time what was demonstrated is this: when two inertial frames are moving relative to each other, it cannot be that the coordinate times in __both__ of them measure absolute time. Yet the principle of relativity suggests that the situation of the two inertial frames must be symmetrical. It would not fit the picture if the coordinate time in one frame were absolute while that in the other were not… and to make matters worse, there would be no known way to tell which frame is which.

This is the point where we must realize that time dilation has left us with no tangible evidence but only one thing supporting the idea of absolute time: our imagination, i.e. the intuition that we've gained from the limited spectrum of our day-to-day experiences. Time dilation has refuted the strongest argument we thought we had for absolute time. Namely, it's not true that two clocks always show the same time elapsed between any two of their encounters. To test this, we don't even need to define simultaneity and coordinate time, all we need is the two clocks.

That's why, since it does not seem to help to go back and try to adjust the simple and robust definitions that led us here, the choice has been made in special relativity to rather give up absolute time.

What we've seen so far of the time aspect of special relativity allows us to already formulate quite a few philosophical ramifications.

The major obstacle in coming to terms with relativity theory is the objection from our own intuition. In addition to that, the formulas of special relativity are apparently more complicated than those of classical mechanics. This all happens because we seek an understanding of new phenomena in terms of such mathematical and physical concepts, and even senses, that developed for a very long time against a fundamentally different backdrop.

It does not help either that special relativity discovers more the "what" than the "why". As a logical process, it derives the strange consequences of a counter-intuitive assumption about light. It would be better for understanding if one could derive the same consequences from a more intuitive assumption instead. (It's not as hopeless as it sounds, take for example the many surprising consequences of the non-surprising law of conservation of angular momentum.)

In the future, when relativistic effects become more and more a common experience, the concepts and formulations of the theory, as well as our intuition, will adapt and hopefully make relativity look simpler and more intuitive to grasp. Nevertheless, as suggested by von Neumann, we never really understand a theory. It can only become intuitive at best, once we've managed to get used to it.

In the light of relativity theory, we can say that our perception of absolute time is an illusion: beyond the scope of classical mechanics, empirical evidence does not support it any longer.

The illusion arises due to restrictions prevailing in our environment: the relatively low speeds and short distances of physical objects in our everyday life, coupled with the limited accuracy of our senses and measurements. As soon as the restrictions are relaxed, we cease to perceive time as an absolute entity: there is no such thing anymore as an observer-independent, globally passing time.

In a sense, everything, or rather, every property we perceive is an illusion and ceases to exist as soon as the horizon of our perception and measurements sufficiently broadens. (Another example: is it due to causality that no signal can travel faster than light, or is it because nobody has ever seen a signal traveling faster than light that causality appears as a law of nature?)

Does then special relativity capture the real nature of time? Well, on the one hand it definitely does, in its own scope. But on the other hand it does not, for the real nature of time is that eventually, it doesn't exist.

A.J. Christian (2003), *Mit keresett Isten a nappalimban?*

Y. Lin (1948), *The Wisdom of Laotse*

Although special relativity is a theory of physics, the chief ingredient in deriving its astonishing results about space and time is mere logical thinking. Besides that, only surprisingly few initial experimental facts are needed to develop the theory.

In general, mechanics studies the laws of motion. The basic idea is that physical objects exert forces on one another during their interactions, and it is the forces, or lack thereof, that eventually determine motion. For the most part, classical mechanics was developed by observing solid objects in our environment, including heavenly bodies.

In the following, some of the core concepts, laws and assumptions of classical mechanics are presented. We focus on rigid objects, because it suffices for our purposes.

Space is static, homogeneous and isotropic. It consists of locations that physical objects can occupy. To identify locations, coordinate systems are used.

**(Definition)** A Cartesian coordinate system that is fixed to a rigid object is called a reference frame.

Time is homogeneous and constantly passing. It consists of durations that events can occupy. To identify (measure) durations, clocks are used.

Time is absolute: it permeates the whole universe. All physical objects, even the most remote ones, are connected through time. They all "experience" the very same time.

Reference frames identify locations only momentarily: a millisecond later we can't tell whether a location that is static in a given reference frame still corresponds to the same (absolute) location in space.

Still, it is assumed there exist reference frames that are at rest in space and thus permanently identify the locations of space. There is no known way of finding such a reference frame though.

**(Definition)** A reference frame in which the law of inertia holds is called an inertial frame.

The law of inertia states that the velocity of a particle (= point-like rigid object) may only change if there is a force acting upon the particle.

**Note:** velocity is speed together with its direction; i.e. it's a vector.

Intuitively, this means that if all forces acting upon a particle were "switched off", it would continue to move at the constant velocity it has reached until then. Thus, there is a physical reason to think that not all reference frames are "equal", but some of them are "special", namely the inertial frames.

**(Theorem)** Given any one inertial frame, exactly those reference frames are inertial frames which are moving at a constant velocity relative to it.

**Note:** a practical way of switching off forces is to net them out vectorially.
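Netting out forces vectorially can be sketched as a component-wise sum. The force values below are hypothetical, chosen so that they cancel.

```python
def net_force(forces):
    """Component-wise vector sum of forces given as (Fx, Fy, Fz) tuples,
    all in newtons and all relative to the same reference frame."""
    return tuple(sum(components) for components in zip(*forces))


# Hypothetical particle: its weight is balanced by a supporting thread,
# and two springs pull on it horizontally in opposite directions.
forces = [
    (0.0, 0.0, -9.81),  # weight
    (0.0, 0.0, 9.81),   # thread
    (2.5, 0.0, 0.0),    # spring pulling in the +x direction
    (-2.5, 0.0, 0.0),   # spring pulling in the -x direction
]

print(net_force(forces))
```

With a zero net force the particle behaves, as far as the law of inertia is concerned, as if no force were acting upon it.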

In every inertial frame, the laws of motion are exactly the same. That is, with respect to classical mechanics, inertial frames are all equivalent. The following, plausible generalization is called the principle of relativity, and its validity will be assumed everywhere in this essay:

**(Principle)** The laws of nature are exactly the same in every inertial frame. That is, inertial frames are all equivalent in describing any physical phenomena.

It's a well-established fact that a given ray of light propagates in vacuum at the constant speed c ≈ 300'000'000 m/s, relative to __every__ inertial frame. This is contradictory, since according to classical mechanics the speed of a given entity, i.e. that of the tip of the ray of light in this case, should depend on the reference frame in which it is observed.

**Note:** throughout this essay, light is always meant to travel in vacuum.

To eliminate the contradiction, it seems necessary that our intuitive concepts about space and time are revisited and challenged. In order to ensure that we are not misled by intuition, Einstein suggested that all definitions in physics must be based on measurements which are, in principle, feasible.

In the following, we limit our attention to phenomena that take place in inertial frames, and in which rigid objects, forces acting upon them, as well as rays of light are involved. A clear distinction is made between rigid objects and light: the latter is considered more like a signal, not as an "ordinary" physical object. Elastic solid objects (threads, springs, etc.) are also allowed, but used only as a means to exert forces on rigid objects in a measurable way.

All in all, the intended scope is the most general as far as the motion of rigid objects in inertial frames is concerned: there are no restrictions expected that would limit the applicability of the results obtained later.

**(Definition)** The circumstances under which a theory can be considered correct are called the scope of the theory.

The originally intended scope can shrink as new facts come to light. This happens when it turns out that some (often tacit) assumptions cannot be considered correct in all situations. In special relativity, two such assumptions are Euclidean geometry and continuous quantities. Both have already been challenged by later developments in physics.

**Note:** there are plenty of other, tacit assumptions which we don't even notice. We can rest assured that all those will be challenged some day.

Still, it is more constructive to say that the assumptions are correct within limits, rather than saying they are just incorrect.

The next couple of sections lay, based on Einstein's suggestions, the measurable foundations needed for the discussion of relativistic effects coming afterwards. Although almost all of the concepts and results here seem intuitive (and even banal from time to time), they will be of the utmost importance for clarity and understanding when proceeding further.

It's also demonstrated how cumbersome things can get when the classical ground is cut from under one's feet, i.e. when only those things can be taken for granted that either have been measured or were inferred from the symmetries of a given situation. Putting it another way, we're going to describe what one can say about space and time without blindly assuming anything that isn't supported by tangible evidence, while at the same time making no use of the constancy of the speed of light. The latter will be used only afterwards, in the parts on relativistic effects.

We do make one assumption though to start with: that the geometry in every inertial frame is Euclidean. Keep in mind that our final goal is an adjustment to classical mechanics, as minimal as possible, that eliminates all contradictions posed by the constancy of the speed of light. And the Euclidean-ness of geometry is not among the top suspects to doubt.

Lastly, the word "symmetry" appears in many of the proofs. In the context of an argument, it refers to a kind of reasonless-ness, the common sense that two things must be identical if there is no sensible reason for difference. It indicates a level where we still trust our intuition.

**Note:** the basis of all physical symmetries within a single inertial frame is the principle of relativity, as it ensures that the inertial frame can be considered on its own, while properties like its motion relative to other, maybe even "favored", inertial frames do not matter.

In an inertial frame K, duration can be measured at each single location by using a uniformly ticking clock that is fixed right there.

**Note:** uniformity is ensured by producing all ticks by the very same method, so that we cannot think of any reason why one tick should take longer than the other.

The simultaneity of events that happen directly to entities __at rest__ relative to K is measured as follows:

**(Definition)** Two events are simultaneous in K if and only if a symmetrically placed observer in K sees them, by the naked eye (through vacuum), happen simultaneously.

This definition is compatible with the classical conception of time.

**Notes:** (a) by saying "observer in K", it is meant that the observer is also at rest relative to K, (b) we can imagine that each entity sends a light signal to the observer when its local event happens, (c) here and in the following, locations, events, entities, observers, signals and clocks are, unless explicitly stated otherwise, always meant to be point-like, (d) accordingly, by "light signal" it is understood just the tip of a ray of light.

Instead of light, other signals could also be used, e.g. pistol bullets or even carrier-pigeons, as long as the symmetry of their propagation can be assumed. Since we are talking about mechanics, it seems reasonable to require only the symmetry of the net forces acting upon the chosen pair of (identical) "messengers". Einstein's original approach eliminates this requirement elegantly: it uses light signals and assumes that __nothing__ whatsoever can make a difference to their propagation in vacuum. The same is definitely not true for e.g. a bullet whose motion is affected by a multitude of gravitational forces at the very least.

**(Definition)** Two clocks (at rest) in K are synchronized if the same positions of their hands are simultaneous events in K.

**Notes:** (a) in general, by saying "E in K" it is meant that E is at rest relative to K, (b) we'll always assume that all clocks are of the same construction.

Using mostly symmetry-based reasoning, we infer all the below:

**(Theorem)** If two clocks in K reach one position of their hands simultaneously, then they reach every position of their hands simultaneously.

**(Proof)** There is no such difference in the situations of the two clocks that would explain why a symmetrically placed observer would see them showing different times, ever.

**(Lemma)** If a light signal is sent earlier from one location to another in K, it also arrives earlier.

**(Proof)** If synchronized clocks are used at the two locations, the difference between the time values at sending and arrival, as shown by the corresponding co-located clocks, is always the same. This can be seen if we add another pair of synchronized clocks that, when sending the second light signal, both show the time as it was when the first one was sent. The original and the added clocks run then in parallel, and there is no reason why the latter would not show the same time difference for the second light signal as it was for the first one.

**(Theorem)** All symmetrically placed observers in K judge the simultaneity of two events the same way.

**(Proof)** Let o_{1} and o_{2} be two symmetrically placed observers, and let o_{2} send light signals s_{1} and s_{2} to o_{1} upon seeing the two events e_{1} and e_{2}, respectively. Due to symmetry, o_{1} will observe the same time difference between seeing s_{1} and e_{1} as that between seeing s_{2} and e_{2}. Thus e_{1} and e_{2} are simultaneous to o_{1} if and only if s_{1} and s_{2} are sent by o_{2} at the same moment.

**(Theorem)** Simultaneity in K is transitive.

**(Proof)** Let event e_{1} be simultaneous with e_{2}, and e_{2} with e_{3}. If any two of the events are co-located, the statement is trivial. Otherwise, create an event e_{4} which is simultaneous with e_{2} but not located on any of the e_{1}e_{2}, e_{1}e_{3}, e_{2}e_{3} lines. Then, an observer at the circumcenter of the triangle e_{1}e_{2}e_{4} will see that e_{1} and e_{4} are simultaneous. The same is true for the triangle e_{4}e_{2}e_{3}, i.e. e_{4} and e_{3} are simultaneous too. Finally, an observer at the circumcenter of the triangle e_{1}e_{4}e_{3} will see that e_{1} and e_{3} are simultaneous.
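The role of the circumcenter in the proof can be checked with a short sketch: it is the one point of the triangle's plane that is equidistant from all three vertices, so an observer there is symmetrically placed with respect to every pair of them. The sketch assumes Euclidean geometry and coordinates chosen in the plane of the three event locations. The sample points and function name are hypothetical.

```python
import math


def circumcenter(a: tuple, b: tuple, c: tuple) -> tuple:
    """Circumcenter of the triangle a, b, c given as (x, y) points.
    Raises ValueError if the points are collinear, which is why the proof
    requires that e4 not lie on any of the lines through the other events."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0.0:
        raise ValueError("collinear points have no circumcenter")
    a_sq, b_sq, c_sq = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a_sq * (by - cy) + b_sq * (cy - ay) + c_sq * (ay - by)) / d
    uy = (a_sq * (cx - bx) + b_sq * (ax - cx) + c_sq * (bx - ax)) / d
    return (ux, uy)


# Hypothetical locations of three events, in metres.
p1, p2, p4 = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)

observer = circumcenter(p1, p2, p4)
distances = [math.dist(observer, p) for p in (p1, p2, p4)]

print("observer location:", observer)
print("distances to the three events:", distances)
```

Because the three distances are equal, the light signals from the three events take the same time to reach this observer, who can therefore compare all three events at once.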

The last two theorems basically say that the definition of simultaneity is consistent.

For simplicity, it is assumed in the following that in every inertial frame, all clocks are already ticking in sync… just by coincidence.

So far the classical conception of time was untouched. What we've solely gained is a rather restricted, but rock-solid way of measuring time.

**(Definition)** In an inertial frame K, the time read off from the synchronized clocks is called the coordinate time of K.

The definition of simultaneity in K can now be extended to include those events that happen directly to entities in motion:

**(Definition)** Two events are simultaneous in K if and only if they happen at the same coordinate time.

The coordinate time of an event can be read off from the clock in K that is momentarily co-located with the entity to which the event happens.

As long as there is only one inertial frame considered, time seems no different from that of classical mechanics.

Let K denote an inertial frame.

**(Assumption)** Particle trajectories are continuous in K. That is, the t ↦ (x, y, z) relation is a continuous function, where t denotes the coordinate time in K and (x, y, z) the momentary location of the particle relative to K.

**Notes:** (a) backed by the fact that the speed of light is finite, similar can be assumed about signals too, (b) the assumption implies that particle speeds are always finite.

**(Definition)** In K, the distance between two particles at coordinate time t is the distance between their respective momentary locations.

This definition is compatible with the classical conception of distance, and includes the possibility that the particles are moving relative to K. So we rely on coordinate time to define the distance between moving particles.

As long as there is only one inertial frame considered, space seems no different from that of classical mechanics.

As for the simultaneity of two or more events, naked-eye observers placed at arbitrary (non-equidistant) locations would typically come to different conclusions, due to the finiteness of the speed of light. This suggests that __inertial frame wide simultaneity is not a direct experience but a mere definition__, based on an agreed way of measurement. The same is true for coordinate time and distance, since they both build on the concept of simultaneity. (Already in the case of duration, two co-located observers need to use a clock in order to avoid ambiguity.)

It does no harm to imagine that the above concepts describe a directly intangible "objective reality" in an inertial frame, but as a matter of fact they are essentially just tools that help us in calculating answers to questions about our (more) direct experiences. Later we'll see that the most important question to answer is this: between two events that happen directly to an observer, how much time elapses according to the observer's own clock?

Inertial frames are universal:

**(Assumption)** There is a one-to-one correspondence between the (x, y, z, t) tuples of any two inertial frames.

That is, at any one moment, an observer in an inertial frame encounters exactly one location and sees exactly one clock time of another inertial frame. And if two observers meet for just a moment, they will agree on the two tuples they perceive (i.e. their own one and that of the other's). Moreover, given an inertial frame K, anything that can be labelled by any other inertial frame is visible in K too, hence the term "universal".

Transitivity is assumed as well:

**(Assumption)** If two (x, y, z, t) tuples, of two inertial frames, both correspond to the same (x, y, z, t) tuple of a third inertial frame, they also correspond to each other.

**(Alternative definition)** An inertial frame is a reference frame in which no force is needed to keep a particle at rest.

Intuitively, this means that if all forces acting upon a particle at rest were "switched off", it would continue to stay at rest.

**(Assumption)** The alternative definition and the original definition of inertial frame are equivalent. In other words, the law of inertia holds in all "alternative" inertial frames.

The below theorems characterize the relationship among inertial frames.

**(Theorem)** Let K be an inertial frame. Then every point of another inertial frame K' moves at a constant velocity relative to K.

**(Proof)** Take an arbitrary point P' in K', and place a particle p there that is at rest in K' and upon which no force is acting. Then, due to the law of inertia, p and P' are moving at the same constant velocity in K.

Here we made use of the fact that if there is a force, it must exist in every inertial frame, for it is the "business" of the interacting objects only.

**(Assumption)** Given two points of K', the distance between their corresponding points in K cannot grow arbitrarily large.

**(Theorem)** The constant velocity in the previous theorem is the same for every point of K'.

**(Proof)** Otherwise, the assumption just made would not hold. (I suspect the theorem could be proved without the assumption, but I don't know how.)

In the opposite direction:

**(Theorem)** If a reference frame K' moves at a constant velocity relative to an inertial frame K, then K' is an inertial frame too.

**(Proof)** Let p be a particle that is at rest in K'. Then p is moving at a constant velocity relative to K, and due to the law of inertia it can be assumed that no force is acting upon p. Thus, no force is needed to keep p at rest in K'.

To summarize:

**(Theorem)** The classical theorem characterizing the relationship among inertial frames is also valid in special relativity.

By definition, every reference frame is fixed to a rigid object. That is, the particles of the rigid object are at rest relative to the reference frame.

**Note:** to lend "fixed" a meaning for non-inertial frames too, we can assume that time is "passing" at every single location of any reference frame.

In special relativity, the scope is limited to inertial frames. So let K denote an inertial frame. Then, what was said above means that for every inertial frame, there exists an underlying rigid object that is moving uniformly relative to K. To round off this relationship, it is reasonable to assume the converse too:

**(Assumption)** To every rigid object O that is moving uniformly relative to K, there exists an inertial frame relative to which O is at rest.

The following suffices for our purposes:

**(Assumption)** Particles can be destroyed: if a particle gets destroyed, its trajectory is discontinued in every inertial frame. That is, if t_{K} denotes the time of destruction relative to inertial frame K, then the trajectory will not exist for any t > t_{K} in K.

**Note:** backed by the fact that the speed of light is finite, something similar can be assumed about signals too.

Universality alone would only guarantee that the event of destruction happens in every inertial frame, at exactly one (x, y, z, t) tuple. However, it states nothing about the existence of the particle before or after t. That's what causality tells us in addition.

**(Theorem)** The order of two events e_{1} and e_{2} that happen directly to a given particle is the same relative to every inertial frame.

**(Proof)** Let K and K' be two inertial frames, and let e_{1} and e_{2} happen at t_{1} and t_{2} in K, respectively, with t_{1} ≤ t_{2}. Let t_{1}' and t_{2}' denote the corresponding time values in K'. Moreover, let the particle get destroyed in K at t_{2}, which corresponds to t_{2}' in K'. Then, since the particle does exist in K' at t_{1}', t_{1}' ≤ t_{2}' must hold. From universality, t_{1}' = t_{2}' if and only if t_{1} = t_{2}. Thus, t_{1}' < t_{2}' if and only if t_{1} < t_{2}.

The argument can be applied to signals too:

**(Theorem)** The order in which a signal meets two given entities is the same relative to every inertial frame.

So far, each inertial frame has its own meter bars at rest to measure distances, clocks at rest to measure durations, and elastic solid objects (threads, springs, etc.) to measure forces.

**(Assumption)** Every rigid or elastic solid object can be brought (to be at rest) into any inertial frame.

Backed by the principle of relativity, one can venture to say that:

**(Principle)** A rigid or elastic solid object has the same mechanical (incl. geometrical) properties in every inertial frame.

From this, it naturally follows that:

**(Corollary)** The units of length, time, and force can be synchronized between any two inertial frames K and K'.

To synchronize e.g. length, take a rod that is 1 meter long in K, bring it "up to speed" into K', and adjust the definition of 1 meter in K' accordingly. The same applies to time and force. After that, the measuring tools of K and K' will be interchangeable in any mechanical experiment.

**(Definition)** Two inertial frames are of the same construction if and only if their units of measurement (length, time, and force) are synchronized.

Going forward, K and K' will always denote two inertial frames of the same construction, with K' moving along the x-axis of K in the positive direction, at a constant velocity **v** ≠ **0**. Since K and K' are both inertial frames, K is also moving relative to K' at some constant velocity **v**' ≠ **0**. Both v and v' are finite, for the speed of every particle is finite.

**Notes:** (a) vectors are written in boldface, e.g. velocity **v**, while scalars in plain text, e.g. speed v = |**v**|, (b) we'll always assume that all inertial frames are of the same construction.

In this subsection, "location" is meant in the general sense, i.e. not as point-like location only.

**(Theorem)** If l is a straight line in K parallel to **v**, and l' is the corresponding location in K' marked out at (coordinate) time t in K, then l' is a straight line parallel to **v**'.

**(Proof)** Viewed from K', any point P of l is moving at the constant velocity **v**'. On the other hand, in K all points of l', and only those from K', go through P. Thus, during the course of its movement in K', P meets exactly the points of l'. So l' is a straight line parallel to **v**'.

**(Theorem)** The location l' marked out in K' is the same for every t, and l' corresponds to l at any time t' in K'.

**(Proof)** The points of K' marked out at any t_{1} in K, and only those, are moving along l in K, so marking out at any other t_{2} in K results in the exact same points of K'. Now, switching the roles of K and K', we get that for every t' in K', l' corresponds in K to the very same straight line l_{l'} parallel to **v**. But from the way l' was constructed we also know that l_{l'} and l have common points, so l_{l'} must be equal to l, since both of them are parallel to **v**.

The x'-axis of K' will always be chosen such that it coincides with the x-axis of K, at all times. To obtain the x'-axis, all we have to do is to mark out the location in K' that corresponds to the x-axis at any given time t in K. As for the orientation of the x'-axis, we define the order of its points, and with that the positive and the negative direction as well, by the order of the corresponding points on the x-axis at t. Independently of our choice of t, we'll get the same x'-axis and the same order of its points.

**(Theorem)** In K', **v**' points in the negative direction of the x'-axis.

**(Proof)** In K, a point P of the x-axis meets the points of the x'-axis in the above defined negative direction. If we imagine there is a particle at P, then causality implies that P, while moving at velocity **v**', meets the points of the x'-axis in the same order in K' (and so the above defined "directions" are proper directions in K' too).

For completeness, one last theorem about the x'-axis:

**(Theorem)** If in K', point P_{2}' comes after P_{1}' on the x'-axis, then at any given t', the corresponding point P_{2} comes after P_{1} on the x-axis of K.

**(Proof)** Both P_{1} and P_{2} are moving in the negative direction along the x'-axis in K'. At t', P_{1} is already at P_{1}', so P_{2}' must have met P_{1} before t'. Therefore, P_{2}' meets in K' first P_{1} then P_{2}. Due to causality, the order is the same in K. And since P_{2}' is moving in the positive direction along the x-axis in K, P_{2} comes after P_{1}.

There is only one way an inertial frame can move relative to another at a given velocity:

**(Lemma)** If inertial frame K'' is moving at **v** relative to K, then K' and K'' are at rest relative to each other, and the time values of each pair of co-located clocks of K' and K'' differ by the very same constant.

**(Proof)** Take a point P in K at time t. The corresponding points P' and P'' of K' and K'', respectively, always coincide in K. So, due to universality (transitivity), the constant velocity at which K' and K'' are moving relative to each other must be **0**. This is because at any two t_{1}' and t_{2}' in K', P' corresponds (through transitivity via K) to the same P'' of K'', and similarly, P'' in K'' always corresponds to the same P' of K'. Furthermore, since all clocks are synchronized in both K' and K'', the coordinate times of K' and K'' can differ only by a constant.

Now we can prove an expected result:

**(Theorem)** v' = v, and thus **v**' = –**v**.

**(Proof)** Let K_{w} be an inertial frame moving at some speed 0 ≤ w ≤ v relative to K along the x-axis in the positive direction, and let the x_{w}-axis of K_{w} be defined similarly to the x'-axis of K'. Relative to K_{w}: when w = 0, K' is moving at (positive) speed v, while K at speed 0; when w = v, K' is moving at speed 0, while K at (negative) speed v'. Because of continuity, there exists a w where K and K' are moving relative to K_{w} at equal speed but in opposing directions. From the symmetry of that situation it follows that v' = v.

**Notes:** (a) roughly speaking, "continuity" is the assumption that whenever a parameter is being changed continuously, the result is also changing continuously, (b) the lemma is necessary to establish the symmetry.

In the proof it was tacitly assumed that:

**(Assumption)** K_{w} exists for any w < v.

This is not as obvious as it looks; what if v > c, are we sure that w = c would be possible? Nevertheless, we maintain the assumption, as later it will be shown that independently of the above theorem, v < c holds.

At a given location L in K, at (coordinate) time t, the corresponding time t' in K' is read off from the clock in K' that is momentarily located at L. Since coordinate time is defined separately in each single inertial frame, it's not guaranteed that t and t' are equal, not even if K and K' were at rest relative to each other.

If the observation at L spans a duration Δt = t_{end} – t_{begin} in K, then the corresponding duration in K' is Δt' = t'_{end} – t'_{begin}. If K and K' were at rest relative to each other, it would be guaranteed that Δt = Δt' holds. Because of causality, it is true in any case that:

**(Theorem)** Δt > 0 implies Δt' > 0.

**(Proof)** Let there be a particle at L, at rest relative to K. Then, the beginning and the end (events) of the observation happen to the same particle.

We'll show now that Δt' depends solely on Δt:

**(Theorem)** If two observations, at L_{1} and L_{2} in K, beginning at t_{begin1} and t_{begin2}, respectively, span the same duration Δt, the corresponding durations Δt_{1}' and Δt_{2}' in K' are equal too.

**(Proof)** There must exist an inertial frame K'', also moving at velocity **v** relative to K, such that for the observation at L_{2} in K, the corresponding duration Δt_{2}'' in K'' is equal to Δt_{1}' in K'. The rationale is that if such a K' can exist for L_{1} and t_{begin1} of K, there is no reason why a similar K'' would not exist for L_{2} and t_{begin2}. And as K' and K'' are at rest relative to each other, the durations Δt_{2}'' and Δt_{2}', and thus Δt_{1}' and Δt_{2}' too, must be equal.

This means that for any positive integer n, if an observation takes n · Δt time at L in K, it will have a corresponding duration of n · Δt' in K'. Due to continuity it follows that:

**(Theorem)** Δt' = λ · Δt, where λ > 0 is a constant.

**Notes:** (a) the value of λ can only depend on the choice of K and K', or rather only on v due to symmetry reasons, (b) in classical mechanics, λ = 1; in special relativity, we don't yet know the exact value.
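The step from integer multiples to the general linear relationship can be spelled out as follows. This is only a sketch: the function symbol f, which maps a duration in K to the corresponding duration in K', is notation introduced here for illustration and is not used elsewhere in the essay.

```latex
% f(\Delta t) denotes the duration in K' that corresponds to a duration \Delta t in K
% (hypothetical notation; by the previous theorem it depends on \Delta t only).
\begin{align*}
f(\Delta t_a + \Delta t_b) &= f(\Delta t_a) + f(\Delta t_b)
  && \text{(two back-to-back observations at } L\text{)} \\
f(n \cdot \Delta t) &= n \cdot f(\Delta t)
  && \text{(repeat the above } n \text{ times)} \\
f\!\left(\tfrac{m}{n} \cdot \Delta t\right) &= \tfrac{m}{n} \cdot f(\Delta t)
  && \text{(apply the previous line to } \tfrac{1}{n} \Delta t\text{)} \\
f(\Delta t) &= \lambda \cdot \Delta t, \quad \lambda = f(1) > 0
  && \text{(continuity extends it from rational to real multiples)}
\end{align*}
```

Here f(1) stands for the duration in K' that corresponds to one unit of time in K; it is positive because Δt > 0 implies Δt' > 0.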

Since v = v', the situation of K and K' is symmetrical, so the theorem will be valid with the very same λ if the roles of K and K' are switched.

Let A and B be two points on a straight line l in K parallel to **v**, and let Δx = x_{B} – x_{A} denote the signed distance between them. Then, at any time t in K, for the corresponding time values in K':

**(Theorem)** t'_{B} – t'_{A} = (Δx / v) · (1 / λ – λ).

**(Proof)** Let A' denote the point in K' that corresponds to A at time t_{1} in K. A' needs Δt = t_{2} – t_{1} = Δx / v time to get from A to B in K. An observer at A in K will see that a corresponding λ · Δt time has elapsed in K'. To spell it out, at time t_{2} in K, at location A, the corresponding time in K' is t'_{A} = t_{1}' + λ · Δt. On the other hand, an observer at A' in K' will see that a corresponding Δt time has elapsed in K, and thus that Δt / λ time has elapsed in K'. To spell it out, at time t_{2} in K, at location B, the corresponding time in K' is t'_{B} = t_{1}' + Δt / λ. Subtracting the two expressions gives t'_{B} – t'_{A} = Δt · (1 / λ – λ) = (Δx / v) · (1 / λ – λ).

Therefore, unless λ = 1, the corresponding time t' in K' along l changes linearly with the x-coordinate, i.e. Δt' ~ Δx. This has a curious implication:

**(Theorem)** If λ ≠ 1, the speed of particles has a finite upper bound.

**(Proof)** Otherwise, if e.g. t'_{A} > t'_{B} holds, a fast enough particle could get from A to B within such a short time that on its arrival at B, the corresponding time in K' would still be less than t'_{A}, and that would violate causality.
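To make the bound concrete, here is a hedged sketch for the case λ > 1 and Δx > 0, so that t'_{A} > t'_{B}. The symbol u, the constant speed of a particle travelling from A to B in K, and the label t'_{arrival} are introduced here for illustration only.

```latex
% Departure: at A, at time t in K, the K' clock there shows t'_A.
% Arrival: at B, at time t + \Delta x / u in K; the K' time read off at B
% advances by \lambda per unit of K time.
\begin{align*}
t'_{\text{arrival}} &= t'_B + \lambda \cdot \frac{\Delta x}{u} \\
t'_{\text{arrival}} > t'_A
  &\iff \lambda \cdot \frac{\Delta x}{u} > t'_A - t'_B
   = \frac{\Delta x}{v} \left( \lambda - \frac{1}{\lambda} \right) \\
  &\iff u < v \cdot \frac{\lambda^{2}}{\lambda^{2} - 1}
\end{align*}
```

Under these assumptions, causality requires the arrival to come after the departure in K' as well, so u cannot exceed the finite value on the last line.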

**(Theorem)** Let P and Q be two points in K whose x-coordinates are equal. Then, for any t in K, the x'-coordinates of the corresponding P' and Q' in K' are equal too.

**(Proof)** Due to the symmetrical situation of P and Q in K with respect to **v**, and owing to the fact that there is only one way K' can move at **v** relative to K, there is no reason why the situation of P' and Q' would be asymmetrical in K' with respect to **v**', or in other words that the P'Q' segment would tilt in a non-symmetrical way in K'.

Analogously for time:

**(Theorem)** The corresponding t'_{P} and t'_{Q} values in K' are equal as well.

Putting the above two theorems together (the roles of K and K' can be switched in both):

**(Corollary)** A plane S in K that is perpendicular to the x-axis is perceived in K' at any given time t' as a plane S' that is perpendicular to the x'-axis.

**Notes:** (a) S' is different for different t' values, (b) every point of S is moving at **v**' relative to K', which means that S itself is moving in K' at **v**', (c) strictly speaking, we should say "measured" instead of "perceived".

Next, we'll explore how straight lines on S map onto S'.

**(Theorem)** If M is the middle point of segment PQ on S, then for the corresponding points in K', marked out at any t in K, M' is also the middle point of segment P'Q' on S'.

**(Proof)** Due to the symmetrical situation of PM and MQ in K with respect to **v**, and owing to the fact that there is only one way K' can move at **v** relative to K, there is no reason why the length of P'M' and M'Q' would not be equal in K'. Similarly, there is no reason why M' would fall on one side (and not the other) of the straight line determined by P'Q' on S'.

Because of continuity, we can go on and say (again, meaning "location" in the general sense):

**(Theorem)** If l is a straight line (segment) on S, and l' is the corresponding location in K' marked out at any t in K, then l' is a straight line (segment) on S' too.

**Note:** every point of l is moving at **v**' relative to K', which means that l itself is moving in K' at **v**'.

Finally, we'll show that S and S' look exactly the same.

**(Theorem)** Let r in K be a straight line segment perpendicular to the x-axis. Then, the length of the corresponding segment r' in K', i.e. the perceived length of r in K', is equal to that of r in K.

**(Proof)** Let t' denote the time in K' that corresponds, along r, to a given t in K. Let q be the segment in K such that: (a) it is parallel to r, (b) its middle point coincides with that of r, and (c) its length in K is equal to that of r' in K'. Since the situation of K and K' is symmetrical (for v = v'), the corresponding q' in K', marked out at t, must have the same length (in K') as that of r in K. Due to the symmetrical placement of r and q in K, q contains r or r contains q. However, if e.g. q __properly__ contained r in K, then q' would also properly contain r' in K'; the first part would imply that the length of r' in K' is greater than that of r in K, while the second part would imply just the opposite, which would be a contradiction. So q, and thus r' in K', must be exactly as long as r in K.

**Note:** the letter "r" was used because it is the first letter of the word "rod".

Let d and d' denote distance in K and K', respectively:

**(Corollary)** If P and Q are two points in K whose x-coordinates are equal, then d(P, Q) = d'(P', Q').

Let ∠ and ∠' denote angles in K and K', respectively:

**(Theorem)** If P, Q, and R are three points in K whose x-coordinates are equal, then for the angles at Q and Q', ∠(P, Q, R) = ∠'(P', Q', R') holds.

**(Proof)** The triangles PQR in K and P'Q'R' in K' are congruent because their corresponding sides are equal in length.

The y'- and z'-axes of K' will always be chosen such that they are parallel to, and have the same orientation as, the y- and z-axes of K, respectively, at all times.

As it can be rightly suspected by now, in special relativity the interesting things happen along the x-axis.

If the x-, y-, and z-axes constitute a right-handed coordinate system, then so do the x'-, y'-, and z'-axes when viewed from K. But do the x'-, y'-, and z'-axes have the same handedness when they are viewed from K' instead of K? (Remember that the axes of K' were set up entirely from within K.)

What we can do is to consider handedness a mechanical property of rigid objects (or rather, of the arrangements of their parts). Then, K and K' can be synchronized in this aspect via bringing a rigid object of known handedness from K into K'. So basically, the question is whether a materialized unit cube of coordinate system K can be brought "up to speed" into K' such that it could seamlessly replace the unit cube of coordinate system K'.

**(Theorem)** K and K' are of the same handedness.

**(Proof)** We know from the proof of v = v' that there exists a K_{w} relative to which K and K' are moving at equal speed but in opposing directions. Let the axes of K_{w} be defined similarly to those of K'. Due to causality, K' is moving in the positive while K in the negative direction along the x_{w}-axis. Now, let's reverse the orientation of the x- and the x_{w}-axes of K and K_{w}, respectively. Then, due to symmetry, the unit cube of the original K_{w} can be brought to perfectly match the unit cube of K' if and only if the same can be done between the reversed K_{w} and the reversed K too. Thus, since the reversed and the original K_{w} have opposite handedness, K' and the reversed K must also have opposite handedness; and so K and K' must have the same.

**Note:** handedness will not play any role later in this essay.

Let S_{1} and S_{2} be two planes in K perpendicular to the x-axis, and let Δx = x_{2} – x_{1} denote the signed distance between them, i.e. the difference between their respective x-coordinates. In K', let S_{1}' and S_{2}' be the corresponding planes perpendicular to the x'-axis, marked out at a given t in K.

Using a similar argument to that when λ was introduced, it can be easily seen that for the corresponding signed distance Δx' in K':

**(Lemma)** Δx' = μ · Δx, where μ > 0 is a constant.

**Notes:** (a) the value of μ can only depend on the choice of K and K', or rather only on v due to symmetry reasons, (b) in classical mechanics, μ = 1.

Since distances on planes perpendicular to the x-axis don't change, it follows immediately that (yet again, meaning "location" in the general sense):

**(Theorem)** If l is a straight line (segment) in K, and l' is the corresponding location in K' marked out at time t in K, then l' is a straight line (segment) too.

**(Proof)** It's due to the proportionality between each of Δx and Δx', Δy and Δy', Δz and Δz'.

But how is l __perceived__ in K'? There is no guarantee that at time t in K, all points of l have the very same corresponding t' values in K'. (In classical mechanics it is guaranteed that they all do, since λ = 1.)

**(Theorem)** l is perceived in K' as a straight line (segment).

**(Proof)** Let P be a fixed point on l in K, and P' the corresponding point in K' at a given time t', marked out at an appropriate t in K. Now, let Q be another point on l in K, with Δx = x_{Q} – x_{P} being the difference between the x-coordinates of Q and P. It was shown earlier that for the corresponding time value t'_{Q} in K', read off at t in K, the deviation from t' is proportional to Δx, i.e. t'_{Q} – t' ~ Δx. This also entails t'_{Q} – t' ~ Δx', as Δx = (1 / μ) · Δx'. Thus, to find out where Q was (or will be) in K' at t', we have to shift it away from l', along **v**' by the (signed) amount of -(t' – t'_{Q}) · v'. Since this amount is proportional to Δx', shifting all points on l' accordingly will result in a straight line (segment) in K'.

**Note:** it will result in a "proportional" straight line (segment) in K', meaning that e.g. middle points in K are perceived as middle points in K' too.

Making use of v' = v, the following relationship can be derived:

**(Theorem)** μ = λ.

**(Proof)** An observer in K, located on S_{2}, sees that Δt = Δx / v time elapsed between meeting S_{2}' and S_{1}'. In addition, the observer also sees that a corresponding time Δt' = λ · Δt elapsed in K'. This means that in K', S_{2} travelled for Δt' = λ · Δx / v time from S_{2}' to S_{1}'. And since S_{2} is moving at speed v relative to K' (recall that v' = v), the distance between S_{1}' and S_{2}' is Δx' = v · Δt' = λ · Δx.

Consequently, if there were a rod r' in K', parallel to the x'-axis and reaching from S_{1}' to S_{2}', its length would be Δx' in K', but an observer in K would see it as having length Δx.

**(Corollary)** In K, the perceived length of r' is 1 / λ times its rest length in K'.

**(Theorem)** If a particle is moving at a constant velocity relative to K, it's also moving at a constant velocity relative to K'.

Although this follows immediately from the law of inertia, let me demonstrate it differently and explain the rationale behind afterwards.

**(Proof)** Let P, Q, and M be points on the trajectory of the particle in K, such that d(P, M) = d(M, Q). Let P', Q', and M' denote the corresponding points on the trajectory in K', respectively. Due to symmetry reasons: (a) projecting P, Q, and M, each when the particle is passing by, onto the x'-, y'-, and z'-axes yields the same Δx', Δy', and Δz' values between P' and M' as it does between M' and Q'; (b) the respective Δt' values are the same too. So in K', the (average) velocity of the particle between P' and M' is equal to that between M' and Q'. Thus, due to continuity, the velocity of the particle is constant in K'.

The advantage of this proof is that it's valid not only for particles but also for any moving entity, including the tip of a ray of light, or even just a point-like state propagating through K.

**Note:** the proof does not exclude the possibility that Δt' = 0, i.e. that the speed in K' is "infinite".

In the following, the consistent application of the previously defined concepts, in combination with the constancy of the speed of light, will lead to concrete conclusions about space and time that are foreign to classical mechanics.

**(Law)** A given ray of light propagates at the same c ≈ 300'000'000 m/s speed relative to every inertial frame.

In point-like terms, this means that the tip of the ray of light is moving at c in every inertial frame.

**(Corollary)** The speed of light in an inertial frame is independent of the state of motion of the entity that emits it.

It's important to emphasize here that the means of measurement, i.e. coordinate time and distance, were defined based on symmetry considerations backed by the principle of relativity, without making any use of the physical properties of light. And although the above law is surprising, the fact that it's nevertheless in line with the principle of relativity corroborates that principle.

Let AB be a segment of non-zero length in K, parallel to the x-axis, with x_{A} < x_{B}. We saw before that for the corresponding points A' and B' in K', respectively, marked out at any t in K, x'_{A'} < x'_{B'} holds.

In K', let's send a light signal from A' toward B'. It will arrive at B' after some time Δt' > 0. For the corresponding duration in K, Δt > 0 must hold due to causality. (Relative to K', the light signal meets first A' and then B', so it happens in the same order relative to K too.)

Since the light signal is moving at a constant velocity relative to K', it has to do likewise relative to K as well. Furthermore, both A' and B' are moving at **v** relative to K, and when the light signal is emitted, B' is ahead of A'. Later the light signal meets B', which then means it is traveling in K along a straight line parallel to the x-axis, in the positive direction. And by law, at the constant speed c. The fact that it does catch up with B' implies:

**(Theorem)** v < c.

The relationship between inertial frames and rigid objects entails that particles too have exactly the same speed limit.

Let A'B' be a segment of non-zero length in K', perpendicular to the x'-axis. We saw before that A'B' is perceived in K as being perpendicular to the x-axis, traveling at velocity **v**.

In K', let's send a light signal from A' toward B'. It will arrive at B' after some time Δt' > 0. Let AB and CD denote the segments in K that correspond to A'B' when the light signal is sent and arrives, respectively. Moreover, let e_{1} and e_{2} be two events, both happening at B', the first when the light signal is sent and the second when it arrives. That is, viewed from K, e_{1} happens at B and e_{2} at D. Clearly, Δt' is the time difference between e_{1} and e_{2} in K'. For the corresponding time difference in K, Δt > 0 must hold due to causality. (Imagine there is a particle at B' to which both e_{1} and e_{2} happen.) And since B' is moving at **v** relative to K, the x-coordinate of CD is greater than that of AB.

Again, as the light signal travels at a constant velocity relative to K too, it has to travel in K along a straight line from A toward D, at the constant speed c. It was shown before that d(A, B) = d(C, D) = d'(A', B'). Let d denote this common distance. In K', the light signal then needs

Δt' = d / c

time to travel from A' to B'. We also know that:

d(A, D) = c · Δt

d(B, D) = v · Δt

Due to the Pythagorean theorem:

d^{2} = (c · Δt)^{2} – (v · Δt)^{2}

Replacing d with c · Δt' and rearranging:

Δt' = Δt · √(1 – v^{2} / c^{2}),

which means from the perspective of an observer at B' that:

**(Theorem)** If an observer is at rest for Δt' time in K', they observe a corresponding Δt = Δt' / √(1 – v^{2} / c^{2}) time elapsed in K.
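The geometric argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `light_clock_durations` and the sample values d = 1 m and v = 0.6 c are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def light_clock_durations(d, v, c=C):
    """Return (dt_prime, dt) for a light signal crossing a transverse segment.

    d:        length of the segment A'B', perpendicular to the direction of motion.
    v:        speed of K' relative to K, with 0 <= v < c.
    dt_prime: travel time in K', where the segment is at rest (d / c).
    dt:       travel time in K, solved from d^2 = (c*dt)^2 - (v*dt)^2.
    """
    if not 0 <= v < c:
        raise ValueError("v must satisfy 0 <= v < c")
    dt_prime = d / c
    dt = d / math.sqrt(c * c - v * v)
    return dt_prime, dt


# Illustrative values (assumed): a 1 m segment and v = 0.6 c.
dt_prime, dt = light_clock_durations(d=1.0, v=0.6 * C)
print("dt / dt' =", dt / dt_prime)
print("1 / sqrt(1 - v^2/c^2) =", 1.0 / math.sqrt(1.0 - 0.6 ** 2))
```

The two printed ratios should agree up to rounding, which is just the statement of the theorem for this particular choice of d and v.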

So the duration between events e_{1} and e_{2} is different in K and K', or more generally, the (coordinate) time that elapses between two events depends on the inertial frame in which it is measured.

As a byproduct, we have just determined the value of λ as well:

**(Corollary)** λ = 1 / √(1 – v^{2} / c^{2}).

The effect is called time dilation since λ > 1, and thus Δt > Δt', for any v with 0 < v < c. In other words, to an observer at rest in K', it will always appear as if time in K were passing faster.

**Note:** time dilation is unnoticeable in our everyday life.
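To get a feeling for how small the effect is at everyday speeds, here is a hedged numerical sketch. The function name `time_dilation_excess` and the sample speeds are illustrative assumptions, not taken from the essay; the quantity computed is λ – 1, i.e. the fractional excess of Δt over Δt'.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)
SECONDS_PER_DAY = 86_400.0


def time_dilation_excess(v, c=C):
    """Return λ - 1, where λ = 1 / sqrt(1 - v^2 / c^2).

    Uses the identity λ - 1 = β² / (r * (1 + r)) with β = v / c and
    r = sqrt(1 - β²), which avoids cancellation when v is tiny compared to c.
    """
    if not 0 <= v < c:
        raise ValueError("speed must satisfy 0 <= v < c")
    beta2 = (v / c) ** 2
    r = math.sqrt(1.0 - beta2)
    return beta2 / (r * (1.0 + r))


# Illustrative speeds in m/s (rough, assumed values).
examples = {"walking": 1.5, "car": 30.0, "airliner": 250.0, "0.6 c": 0.6 * C}

for name, v in examples.items():
    excess = time_dilation_excess(v)
    # Extra time elapsed in K per day of the moving observer's own time.
    print(f"{name:>9}: lambda - 1 = {excess:.3e}, "
          f"i.e. {excess * SECONDS_PER_DAY:.3e} s per day")
```

The rewritten form of λ – 1 is used because evaluating 1 / √(1 – v^{2} / c^{2}) – 1 directly in floating point loses nearly all significant digits when v is much smaller than c.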

Due to symmetry reasons, the previous theorem is valid in any direction in K, not only along the x-axis:

**(Theorem)** If an observer is moving at velocity **w** relative to K, then after Δt time has elapsed in K, the observer's own clock will show only a corresponding Δt' = Δt · √(1 – w^{2} / c^{2}).

The formula works for polygonal trajectories as well, provided that the speed is the same along all edges.

**(Assumption)** Any motion (of a point-like entity) can be approximated with arbitrary accuracy by polygonal motion.

This makes the formula valid for any motion of constant speed; even for circular ones as a matter of fact.
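The polygonal approximation can be sketched in code. This is an illustration under stated assumptions only: the helper names `own_time_along_polygon` and `regular_polygon`, the choice of units with c = 1, and the sample speed 0.6 are all introduced here and do not come from the essay.

```python
import math


def own_time_along_polygon(vertices, speed, c=1.0):
    """Return (dt, dt_own) for a polygonal path in K traversed at constant speed.

    vertices: list of (x, y, z) points in K, visited in order.
    speed:    constant speed w along every edge, with 0 < w < c.
    dt:       coordinate time elapsed in K.
    dt_own:   time shown by the traveller's own clock, obtained by applying
              dt' = dt * sqrt(1 - w^2 / c^2) edge by edge and adding up.
    """
    if not 0 < speed < c:
        raise ValueError("speed must satisfy 0 < speed < c")
    factor = math.sqrt(1.0 - (speed / c) ** 2)
    dt = 0.0
    dt_own = 0.0
    for p, q in zip(vertices, vertices[1:]):
        dt_edge = math.dist(p, q) / speed  # time to traverse this edge in K
        dt += dt_edge
        dt_own += dt_edge * factor         # the same factor applies on every edge
    return dt, dt_own


def regular_polygon(n, radius=1.0):
    """Closed regular n-gon in the xy-plane, inscribed in a circle of the given radius."""
    pts = [(radius * math.cos(2 * math.pi * k / n),
            radius * math.sin(2 * math.pi * k / n),
            0.0) for k in range(n)]
    return pts + [pts[0]]


# Polygons with more and more edges approximate one lap around the unit circle.
for n in (4, 16, 256):
    dt, dt_own = own_time_along_polygon(regular_polygon(n), speed=0.6)
    print(n, dt, dt_own)
```

As n grows, dt approaches the time needed for one lap around the circle, while the ratio dt_own / dt stays at √(1 – w^{2} / c^{2}) for every n, which is what the assumption above relies on.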

Knowing the value of λ, we can tell exactly how t' changes along the x-axis:

**(Theorem)** At any given time t in K, the corresponding time t' in K' decreases as the x-coordinate increases. For an increase of Δx, there is a decrease in t' by Δx · (v / c^{2}) / √(1 – v^{2} / c^{2}).
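The amount stated in the theorem can be checked by substituting the value of λ into the earlier relation t'_{B} – t'_{A} = (Δx / v) · (1 / λ – λ). A brief worked derivation, using only symbols already defined above:

```latex
\begin{align*}
\frac{1}{\lambda} - \lambda
  &= \sqrt{1 - \frac{v^{2}}{c^{2}}} - \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
   = \frac{\left(1 - v^{2}/c^{2}\right) - 1}{\sqrt{1 - v^{2}/c^{2}}}
   = -\frac{v^{2}/c^{2}}{\sqrt{1 - v^{2}/c^{2}}} \\
t'_B - t'_A
  &= \frac{\Delta x}{v} \left( \frac{1}{\lambda} - \lambda \right)
   = -\Delta x \cdot \frac{v / c^{2}}{\sqrt{1 - v^{2}/c^{2}}}
\end{align*}
```

The negative sign expresses that t' decreases as the x-coordinate increases.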

So two events that are simultaneous in K do not happen at the same time in K' unless they have the same x-coordinate.

One might think of simultaneity as a bond between certain events. If two events in K happen at the same coordinate time, they are connected by "now".

However, two events that happen at different times in K can also be connected by "now", provided there exists another inertial frame relative to which the very same two events happen simultaneously. Moreover, assuming that the bond is transitive, it can be shown that in fact any two events are connected this way, i.e. all the events of the world (ever) are simultaneous. Yet at our outset, non-simultaneous events did seem to exist.

Instead of spending a lot of time figuring out what this really means, we rather throw out this elusive bond, with its flavor of action at a distance, and interpret inertial-frame-wide simultaneity as a mere definition without any immediate physical content.

The conception of time in special relativity has been much influenced by the classical idea of absolute time. Absolute time can be represented by a geometrical straight line T. Points on T correspond to moments, while distances correspond to durations. Every event is mapped to exactly one point on T. In other words, there exist absolute temporal relationships among events.

Looking back, the tacit motivation has always been to gradually reintroduce absolute time into the new theory, by providing feasible ways of measuring it in increasingly general situations:

**Local time:** multiple observers at the same location in an inertial frame all sense the same objective time passing that can be measured by a co-located (local) clock.

At that point, the possibility was open that all local clocks in fact measure absolute time, identifying points and distances on T.

**Inertial frame time:** the expectation that objective simultaneity must then exist within a single inertial frame led to a well-defined coordinate time, measured by the synchronized local clocks in the inertial frame.

At that point, the possibility was open that coordinate time in fact (necessarily) measures absolute time on T as well.

After that, our unspoken attempt came to a dead end when multiple inertial frames were considered: it was demonstrated via the time dilation effect that coordinate time cannot measure absolute time, since both simultaneity and duration are inertial frame dependent. To be precise, from the perspective of absolute time what was demonstrated is this: when two inertial frames are moving relative to each other, it cannot be that the coordinate time in __both__ of them measures absolute time. Yet the principle of relativity suggests that the situation of the two inertial frames must be symmetrical. It would not fit in the picture if it were possible that the coordinate time in one frame is absolute, while in the other it's not… and to make matters worse, there would be no known way to tell which frame is which.

This is the point where we must realize that time dilation has left us with no tangible evidence but only one thing supporting the idea of absolute time: our imagination, i.e. the intuition that we've gained from the limited spectrum of our day-to-day experiences. Time dilation has refuted the strongest argument we thought we had for absolute time. Namely, it's not true that two clocks of the same construction always show the same time elapsed between any two of their encounters. To test this, we don't even need to define simultaneity and coordinate time, all we need is the two clocks.

That's why, since it does not seem to help to go back and try to adjust the simple and robust definitions that led us here, the choice has been made in special relativity to rather give up absolute time.

Let a rod be at rest in K', lying parallel to the x'-axis. Knowing the value of λ, we can say that:

**(Theorem)** The length of the rod contracts by a factor of √(1 – v^{2} / c^{2}) when viewed from K.

So the distance between the endpoints of the rod is different in K and K', or more generally, the distance that spatially separates two events (i.e. two entities at two given moments, respectively) depends on the inertial frame in which it is measured.

**Note:** length contraction is unnoticeable in our everyday life.

Due to symmetry reasons, the previous theorem is valid in any direction in K, not only along the x-axis:

**(Theorem)** If a rod is moving lengthwise at velocity **w** relative to K, its rest length contracts by a factor of √(1 – w^{2} / c^{2}) when viewed from K.

The theorem is valid for any uniformly moving rigid objects. The contraction happens along the direction of movement, while there is no change in size along perpendicular directions.
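To get a feeling for the magnitudes involved, the contraction factor can be evaluated numerically. The following is only a minimal sketch: the helper name `contracted_length` and the sample speeds are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def contracted_length(rest_length, w):
    """Length in K of a rod moving lengthwise at speed w (hypothetical helper)."""
    if not 0 <= w < C:
        raise ValueError("speed must satisfy 0 <= w < c")
    return rest_length * math.sqrt(1.0 - (w / C) ** 2)


# A 1 m rod at an everyday speed (30 m/s) and at 60% of the speed of light.
everyday = contracted_length(1.0, 30.0)
fast = contracted_length(1.0, 0.6 * C)

print(1.0 - everyday)  # shortening at 30 m/s: far too small to notice
print(fast)            # roughly 0.8 m
```

At 30 m/s the shortening is many orders of magnitude below anything we could perceive, which is why the effect goes unnoticed in everyday life.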

The absolute space is a 3-dimensional geometric space U. Every event is mapped to exactly one point in U. In other words, there exist absolute spatial relationships (e.g. distance) among events.

In this section, "location" is meant again in the general sense, i.e. not as point-like location only.

We saw in classical mechanics that there is no known way to permanently mark the locations of absolute space. We can only make sure that the locations are identified momentarily, via the coordinates of any one reference frame. In spite of this shortcoming, the following argument still strongly supports the idea of absolute space in classical mechanics:

Imagine that time was frozen at an (absolute) moment. Then, for that one moment, all observers in every reference frame would see the very same space. All physical objects, as well as the locations they naturally mark, would have the exact same shapes, sizes and arrangement, for each individual observer. This way, in classical mechanics the absolute space can be exhibited.

In special relativity, however, this thought experiment cannot be carried out. Due to time dilation, freezing the "present moment" in one inertial frame would freeze a continuum of moments in any other inertial frame that is moving relative to it. If we decide not to require that the very same moment (i.e. coordinate time) gets frozen at all points of an inertial frame, it would still not be possible, due to length contraction, that the distance between any two given particles is the same in all (frozen) inertial frames. And even if there was a frozen inertial frame whose spatial relationships did in fact match those of the absolute space, there would be no known way to tell which one it is.

At this point, similarly to the case of absolute time, we must realize that we cannot exhibit absolute space in a tangible way, and thus the only thing left in support of it is our outdated intuition. Again, as the principle of relativity suggests that the situation of all inertial frames must be symmetrical, the choice has been made in special relativity to give up absolute space.

After having introduced the basic effects in Part 1, a couple of more involved problems will be tackled here by applying the previously derived results.

Let a particle be moving at velocity **w'** in K', and let w'_{x'}, w'_{y'}, and w'_{z'} denote the (signed) x', y', and z' components of w', respectively. That is, w'^{2} = w'_{x'}^{2} + w'_{y'}^{2} + w'_{z'}^{2}. To obtain the corresponding w_{x}, w_{y}, and w_{z} in K, we take an arbitrary segment of the particle's trajectory in K', say, of Δt', Δx', Δy', and Δz', and calculate the corresponding Δt, Δx, Δy, and Δz values in K:

Δx = Δx' · √(1 – v^{2} / c^{2}) + v · Δt = w'_{x'} · Δt' · √(1 – v^{2} / c^{2}) + v · Δt

Δt = (Δx' · √(1 – v^{2} / c^{2}) · (v / c^{2}) / √(1 – v^{2} / c^{2}) + Δt') / √(1 – v^{2} / c^{2}) = (1 + w'_{x'} · v / c^{2}) · Δt' / √(1 – v^{2} / c^{2})

Δy = Δy' = w'_{y'} · Δt'

Δz = Δz' = w'_{z'} · Δt'

The resulting velocities are:

w_{x} = Δx / Δt = w'_{x'} · (1 – v^{2} / c^{2}) / (1 + w'_{x'} · v / c^{2}) + v = (w'_{x'} + v) / (1 + w'_{x'} · v / c^{2})

w_{y} = Δy / Δt = w'_{y'} · √(1 – v^{2} / c^{2}) / (1 + w'_{x'} · v / c^{2})

w_{z} = Δz / Δt = w'_{z'} · √(1 – v^{2} / c^{2}) / (1 + w'_{x'} · v / c^{2})

**(Theorem)** w' < c if and only if w < c.

**(Proof)** w' < c means that w'_{y'}^{2} + w'_{z'}^{2} < c^{2} – w'_{x'}^{2}. Then w^{2} = w_{x}^{2} + w_{y}^{2} + w_{z}^{2} < ((w'_{x'} + v)^{2} + (c^{2} – w'_{x'}^{2}) · (1 – v^{2} / c^{2})) / (1 + w'_{x'} · v / c^{2})^{2} = c^{2}. The other direction follows from the interchangeability of K and K'.

This is in line with the earlier established speed limit for particles.

**(Theorem)** If v → c, then w_{x} → c, w_{y} → 0, and w_{z} → 0.

Whenever v, w'_{x'}, w'_{y'}, w'_{z'} ≪ c, the equations yield values very close to the classical ones, i.e. w_{x} ≈ w'_{x'} + v, w_{y} ≈ w'_{y'}, and w_{z} ≈ w'_{z'}.

The formulas are valid also for the tip of a ray of light, i.e. when w' = c. (The formulas can be derived without making any use of the fact that we are talking about a particle.)

**(Theorem)** w' = c if and only if w = c.
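The velocity addition formulas of this section can be tried out numerically. The sketch below is illustrative only: the names `add_velocity` and `speed`, as well as the sample velocities, are assumptions introduced for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def add_velocity(w_prime, v):
    """Map (w'_x', w'_y', w'_z') in K' to (w_x, w_y, w_z) in K.

    K' moves at speed v along the x-axis of K. Hypothetical helper.
    """
    wx_p, wy_p, wz_p = w_prime
    root = math.sqrt(1.0 - v * v / (C * C))
    denom = 1.0 + wx_p * v / (C * C)
    return ((wx_p + v) / denom, wy_p * root / denom, wz_p * root / denom)


def speed(w):
    """Magnitude of a velocity given by its three components."""
    return math.sqrt(sum(component * component for component in w))


# Everyday speeds: practically the classical sum.
slow = add_velocity((10.0, 5.0, 0.0), 20.0)

# Relativistic speeds: 0.8c combined with 0.8c stays below c.
fast = add_velocity((0.8 * C, 0.0, 0.0), 0.8 * C)

# The tip of a ray of light moving along y' in K' still has speed c in K.
light = add_velocity((0.0, C, 0.0), 0.6 * C)

print(slow)
print(speed(fast) / C)
print(speed(light) / C)
```

The three sample cases correspond to the classical limit, the speed limit for particles, and the invariance of the speed of light, respectively.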

Imagine that all points of a directed straight line l' in K' are of black color. Let P'_{0} be a point on l', and let every point P' of l', just by coincidence, switch its color from black to red at coordinate time t'_{P'} = d'(P'_{0}, P') / w', where d' is a signed distance and w' is an arbitrary positive constant. The resulting speed at which the redness property, i.e. the red ray (or the tip of the red ray, in point-like terms), propagates along l' is w'. Curiously, w' can be greater than c, and the formulas for velocity addition remain valid even in that case.

Using the notations of the previous section:

**(Theorem)** w' > c if and only if w > c.

If w'_{x'} = -c^{2} / v, then Δt = 0 for any Δx, which means that w_{x} (and thus w too) is "infinite", at least in the sense that the red color appears at the very same moment along a whole straight line in K. This seems to violate the continuity assumption of trajectories.

If w'_{x'} < -c^{2} / v, then Δt < 0 for any Δt' > 0, which means that the red ray travels "backward in time", in the sense that the points of l' are becoming red in the opposite order when viewed from K. This seems to violate causality at first sight.

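**Note:** whenever w'_{x'} < -c, there always exists a v < c such that w'_{x'} < -c^{2} / v holds. A signal propagating that fast in K' would thus arrive before it was sent when viewed from a suitable K; if causality is to hold, it follows that no signal can travel faster than light.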

However, it's important to emphasize that in these examples neither rigid objects nor signals, but only a state of space, made up of independent point-like properties, is propagating. A red ray that is faster than light cannot be "sent", it can only arise either due to coincidence or pre-arrangement. Nevertheless, the phenomenon is well-defined and can be considered when speculating about hypothetical particles moving faster than light.
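The change of sign of Δt described above can be checked with a few sample values. This is a rough sketch; the helper name `delta_t_factor` and the chosen speeds are assumptions made for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def delta_t_factor(wx_prime, v):
    """Return Δt / Δt' for a propagation with x'-velocity wx_prime, seen from K.

    Uses Δt = (1 + w'_x' · v / c²) · Δt' / √(1 – v² / c²). Hypothetical helper.
    """
    return (1.0 + wx_prime * v / (C * C)) / (1.0 - v * v / (C * C)) ** 0.5


v = 0.5 * C
threshold = -C * C / v  # equals -2c for v = 0.5c

same_order = delta_t_factor(-1.5 * C, v)   # positive: same order in K and K'
all_at_once = delta_t_factor(threshold, v)  # zero up to rounding
reversed_order = delta_t_factor(-3.0 * C, v)  # negative: reversed order in K

print(same_order, all_at_once, reversed_order)
```

Note that the first sample is already faster than light in K', yet its order of events is preserved for this particular v; the reversal only sets in below the threshold -c^{2} / v.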

At (coordinate) time t in K, let **v** be the velocity of a non-uniformly moving particle. Let t' denote the corresponding time in K' that is read off at the particle's momentary location in K at time t.

**(Definition)** K' is called a momentarily comoving inertial frame of the particle at time t.

It is comoving because in K' the particle is momentarily at rest at time t'; one can see that by applying the velocity addition formulas. (The derivation of the formulas can be adjusted in a straightforward manner to account for the case of non-uniform motion too.)

Let the particle's acceleration be **a'** ≠ **0** relative to K' during the time interval [t' – Δt'; t' + Δt'], and let a'_{x'}, a'_{y'}, and a'_{z'} denote the (signed) x', y', and z' components of **a'**, respectively. That is, a'^{2} = a'_{x'}^{2} + a'_{y'}^{2} + a'_{z'}^{2}. To obtain the corresponding a_{x}, a_{y}, and a_{z} in K, we take the [t'; t' + Δt'] segment of the particle's trajectory in K', and calculate the corresponding Δt, w_{x}, w_{y}, and w_{z} values in K:

w'_{x'} = a'_{x'} · Δt'

Δx' = (1 / 2) · a'_{x'} · Δt'^{2} = (1 / 2) · w'_{x'} · Δt'

Δt = (Δx' · √(1 – v^{2} / c^{2}) · (v / c^{2}) / √(1 – v^{2} / c^{2}) + Δt') / √(1 – v^{2} / c^{2}) = (1 + (1 / 2) · w'_{x'} · v / c^{2}) · Δt' / √(1 – v^{2} / c^{2})

w_{x} = (w'_{x'} + v) / (1 + w'_{x'} · v / c^{2})

w'_{y'} = a'_{y'} · Δt'

w_{y} = w'_{y'} · √(1 – v^{2} / c^{2}) / (1 + w'_{x'} · v / c^{2})

w'_{z'} = a'_{z'} · Δt'

w_{z} = w'_{z'} · √(1 – v^{2} / c^{2}) / (1 + w'_{x'} · v / c^{2})

We get to the acceleration components when Δt' → 0:

a_{x} = lim_{Δt'→0} (w_{x} – v) / Δt = a'_{x'} · (1 – v^{2} / c^{2})^{3/2}

a_{y} = lim_{Δt'→0} (w_{y} – 0) / Δt = a'_{y'} · (1 – v^{2} / c^{2})

a_{z} = lim_{Δt'→0} (w_{z} – 0) / Δt = a'_{z'} · (1 – v^{2} / c^{2})

**(Theorem)** If v ≠ 0, then a ≤ a' · (1 – v^{2} / c^{2}) < a', with equality exactly when a'_{x'} = 0.

**(Theorem)** If v → c, then a → 0.
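The limits above can be cross-checked by evaluating the finite-difference expressions for a small but non-zero Δt'. The sketch below is only an illustration under assumed sample values; the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def acceleration_in_K(a_prime, v):
    """Limit formulas: acceleration components in K from those in K'."""
    ax_p, ay_p, az_p = a_prime
    k = 1.0 - v * v / (C * C)
    return (ax_p * k ** 1.5, ay_p * k, az_p * k)


def finite_difference_in_K(a_prime, v, dt_prime):
    """The same components, approximated with a finite Δt' instead of the limit."""
    ax_p, ay_p, az_p = a_prime
    root = math.sqrt(1.0 - v * v / (C * C))
    wx_p, wy_p, wz_p = ax_p * dt_prime, ay_p * dt_prime, az_p * dt_prime
    denom = 1.0 + wx_p * v / (C * C)
    dt = (1.0 + 0.5 * wx_p * v / (C * C)) * dt_prime / root
    wx = (wx_p + v) / denom
    wy = wy_p * root / denom
    wz = wz_p * root / denom
    return ((wx - v) / dt, wy / dt, wz / dt)


a_prime = (10.0, 10.0, 0.0)  # m/s² in the momentarily comoving frame
v = 0.6 * C

exact = acceleration_in_K(a_prime, v)
approx = finite_difference_in_K(a_prime, v, 1.0)
print(exact)
print(approx)
```

For these values the two results agree to several significant digits, and the component along the direction of motion is reduced more strongly than the perpendicular one.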

If we keep accelerating a particle of mass m > 0 in K, starting from rest, by exerting a constant force **F** ≠ **0**, it will have the same magnitude of acceleration, a' = F / m, in each of its momentarily comoving inertial frames along the way. (In classical mechanics, the magnitude of acceleration would be the same F / m in every inertial frame, not only in the momentarily comoving ones.)

Let's divide the motion of the particle in K into infinitely many sections as follows: the first section starts at v = 0; then, iteratively, the next section will always start as soon as the speed has reached v + (c – v) / 2. So the first section is [0; c / 2), the second is [c / 2; c · 3 / 4), and so on. If Δt_{v} denotes the duration of the section in K that started at speed v, and a_{v} the magnitude of the particle's acceleration at the beginning of that section relative to K, then:

Δt_{v} > ((c – v) / 2) / a_{v} = ((c – v) / 2) / (a' · (1 – v^{2} / c^{2})^{3/2}) = c^{3} / (2 · a' · (c – v)^{1/2} · (c + v)^{3/2}) > c / (2^{5/2} · a')

The last expression is a positive constant independent of v, which means that the particle can never reach the speed of light this way, as there are infinitely many sections.
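The argument can be illustrated by computing the duration of the first few sections. The sketch below makes two assumptions that are not part of the essay: a sample value for a', and a closed form for the elapsed time obtained by integrating dt = dv / (a' · (1 – v^{2} / c^{2})^{3/2}) for motion along a straight line.

```python
import math

C = 299_792_458.0  # speed of light in m/s
A_PRIME = 9.81     # assumed value of a' = F / m, in m/s²


def time_to_reach(v):
    """Coordinate time in K needed to reach speed v from rest.

    Closed form of the integral of dv / (a' · (1 – v²/c²)^(3/2)); an
    assumption of this sketch, valid for straight-line motion.
    """
    return v / (A_PRIME * math.sqrt(1.0 - (v / C) ** 2))


lower_bound = C / (2 ** 2.5 * A_PRIME)

v = 0.0
durations = []
for _ in range(10):
    v_next = v + (C - v) / 2.0  # the next section starts halfway to c
    durations.append(time_to_reach(v_next) - time_to_reach(v))
    v = v_next

for duration in durations:
    print(duration / lower_bound)  # each ratio is greater than 1
```

Every section lasts longer than the constant lower bound, so the total time of infinitely many sections cannot be finite.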

Now let's consider the case when the force is not constant but its magnitude increases with v, namely let F = m · (1 m/s^{2}) / (1 – v^{2} / c^{2})^{3/2}, i.e. a function of the speed already reached. Then in K, a = 1 m/s^{2} all the time, so the particle's speed would eventually reach c after having been accelerated during the time interval of [0; c) seconds (taking the numerical value of c in m/s). Within this (half-open) interval, F is always of finite value (although it grows without bound). Therefore, this way of accelerating is feasible in theory; there is no law in mechanics that would limit the value of F.

**Note:** energy is out of scope in this essay.

An event can be fully localized by providing a space-time point that consists of the event's spatial and temporal coordinates, relative to any one inertial frame. Let A = (x_{A}, y_{A}, z_{A}, t_{A}) and B = (x_{B}, y_{B}, z_{B}, t_{B}) be two such points relative to K, and let A' = (x'_{A'}, y'_{A'}, z'_{A'}, t'_{A'}) and B' = (x'_{B'}, y'_{B'}, z'_{B'}, t'_{B'}) denote the corresponding points relative to K', respectively. If Δx ≔ x_{B} – x_{A}, and similar notation is used for the other deltas too, we can write:

Δx' = (Δx – v · Δt) / √(1 – v^{2} / c^{2})

Δt' = (Δt – Δx · v / c^{2}) / √(1 – v^{2} / c^{2})

Δy' = Δy

Δz' = Δz

Now, multiply both sides of the second equation by c, then square the first two equations, and after that subtract the second from the first to get:

Δx'^{2} – (c · Δt')^{2} = Δx^{2} – (c · Δt)^{2}

Finally, adding the squared third and fourth equations to both sides leads to an invariant measure between space-time points:

d^{2}(A, B) ≔ Δx^{2} + Δy^{2} + Δz^{2} – (c · Δt)^{2}

Apart from the fact that it can be negative, it resembles a (squared) distance measure. It is absolute like classical distance in that its value does not change when switching from one inertial frame to the other. That is, d^{2}(A, B) = d^{2}(A', B') always holds, for arbitrary K'. This suggests that space-time points constitute, at least in a mathematical sense, a four-dimensional absolute space, and that very same space is observed from all inertial frames. (In classical mechanics, the spatial and the temporal points form two absolute spaces, a three- and a one-dimensional one, respectively, which are independent of each other.)
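The invariance of d^{2}(A, B) can be verified numerically for a sample pair of events. This is merely a sketch; the function names and the sample deltas are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def to_K_prime(delta, v):
    """Transform (Δx, Δy, Δz, Δt) from K to K', where K' moves at speed v along x."""
    dx, dy, dz, dt = delta
    root = math.sqrt(1.0 - v * v / (C * C))
    return ((dx - v * dt) / root, dy, dz, (dt - dx * v / (C * C)) / root)


def interval_squared(delta):
    """d² = Δx² + Δy² + Δz² – (c · Δt)²."""
    dx, dy, dz, dt = delta
    return dx * dx + dy * dy + dz * dz - (C * dt) ** 2


delta_K = (4.0e8, 1.0e8, 0.0, 1.0)  # metres and seconds, sample values
delta_K_prime = to_K_prime(delta_K, 0.6 * C)

print(interval_squared(delta_K))
print(interval_squared(delta_K_prime))
```

The individual deltas differ considerably between the two frames, while the two printed values agree up to rounding.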

As for the interpretation of d^{2}(A, B): if it's positive, it means that two events at A and B, respectively, cannot have any causal relationship. If e.g. t_{B} > t_{A}, even a light signal that is sent from A would be too slow to influence the event at B.

The results of special relativity suggest that absolute space and absolute time do not exist separately. What seems to exist is an absolute structure that contains both, but in an intrinsically inseparable way. This structure is called spacetime. In a single inertial frame, however, space and time can be treated as if they were fully separate things. For one (inertial) observer alone, space and time look exactly like those of classical mechanics.

Every event identifies a location in spacetime. It was possible to provide a "distance" measure between events (i.e. between their spacetime locations) in a way that it's inertial frame invariant. With that, spacetime took the shape of an absolute, four-dimensional mathematical space. What we still don't know at this point is whether this kind of spacetime would remain tenable if the theory was extended to non-inertial reference frames too.

Let D be a disk rotating in K, at angular velocity ω. Then, the radius r of D has to be smaller than r_{c} = c / ω, otherwise the particles on the periphery would have a speed of v_{r} = r · ω ≥ r_{c} · ω = c.

This sounds counter-intuitive, because from classical mechanics we expect that the magnitude of the centripetal force acting upon a particle of mass m > 0 at the periphery of radius r_{c} would be F_{c} = m · c^{2} / r_{c}, which is a finite quantity. Thus, for an observer on D it would definitely seem possible to gradually extend D until it reaches any target radius r ≥ r_{c}, since the centrifugal force to overcome during the process would be limited.

However, according to special relativity, in a momentarily comoving inertial frame of a particle that is on the periphery of D, the magnitude of the particle's acceleration is a' = a / (1 – v_{r}^{2} / c^{2}), where a = v_{r}^{2} / r is the magnitude of the centripetal acceleration observed in K. So the magnitude of the centripetal force "felt" by the particle is F' = m · a', which grows without bound as r → r_{c}.
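A short numerical sketch shows how quickly a' grows near the critical radius. The angular velocity and the helper name `felt_acceleration` are assumptions made for this example.

```python
C = 299_792_458.0  # speed of light in m/s


def felt_acceleration(r, omega):
    """a' = a / (1 – v_r² / c²) with a = v_r² / r and v_r = r · ω (hypothetical helper)."""
    v_r = r * omega
    if v_r >= C:
        raise ValueError("the periphery would reach or exceed the speed of light")
    a = v_r ** 2 / r
    return a / (1.0 - (v_r / C) ** 2)


omega = 1.0        # rad/s, sample value
r_c = C / omega    # critical radius

values = [felt_acceleration(f * r_c, omega) for f in (0.5, 0.9, 0.99, 0.999)]
for value in values:
    print(value)
```

The classical centripetal acceleration a stays below c · ω throughout, whereas a' increases without limit as the radius approaches r_{c}.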

The major obstacle in coming to terms with relativity theory is the objection from our own intuition. In addition to that, the formulas of special relativity are apparently more complicated than those of classical mechanics. This all happens because we seek an understanding of new phenomena in terms of mathematical and physical concepts, and even senses, that developed over a very long time against a fundamentally different backdrop.

It does not help either that special relativity discovers more the "what" than the "why". As a logical process, it derives the strange consequences of a counter-intuitive assumption about light. It would be better for understanding if one could derive the same consequences from a more intuitive assumption instead. (It's not as hopeless as it sounds, take for example the many surprising consequences of the non-surprising law of conservation of angular momentum.)

In the future, when relativistic effects become more and more a common experience, the concepts and formulations of the theory, as well as our intuition, will adapt and hopefully make relativity look simpler and more intuitive to grasp. Nevertheless, as suggested by the quotation under the title of this essay, we never really understand a theory. It can only become intuitive at best, once we've managed to get used to it.

In the light of relativity theory, we can say that our perception of absolute time is an illusion: beyond the scope of classical mechanics, empirical evidence does not support it any longer. The illusion arises due to restrictions prevailing in our environment: the relatively low speeds and short distances of physical objects in our everyday life, coupled with the limited accuracy of our senses and measurements. The same applies to absolute space, and to the independence of space and time.

As soon as the restrictions are relaxed, we cease to perceive space and time as independent, absolute entities, and a unified, absolute spacetime appears instead. There is no such thing anymore as an observer-independent, globally passing time.

In a sense, everything, or rather, every property we perceive is an illusion and ceases to exist as soon as the horizon of our perception and measurements sufficiently broadens. (On a related note: is it due to causality that no signal can travel faster than light, or is it because nobody has ever seen a signal traveling faster than light that causality appears as a law of nature?)

Does then special relativity capture the real nature of space and time? Well, on the one hand it definitely does, in its own scope. But on the other hand it does not, for the real nature of space and time is that they, eventually, don't exist.

Within the scope of special relativity, space and time appear as follows.

**Time**

Two observers moving relative to each other may judge the simultaneity of the same pair of events differently.

Between two encounters of the observers, their clocks typically measure different durations. In the special case when one of them stays at rest in an inertial frame, the other's clock will always show less time elapsed when they meet again.

From the perspective of an observer moving at velocity **v** ≠ **0** relative to an inertial frame K, the time needed to cover a distance d in K is Δt' = Δt · √(1 – v^{2} / c^{2}), where Δt = d / v is the time elapsed in terms of the coordinate time of K. That is, Δt' < Δt.

Basically, every result of special relativity is the consequence of the previous paragraph.

**Space**

Two observers moving relative to each other may judge the spatial distance between the same pair of events differently.

In an inertial frame K, the faster an object is moving, the more its length contracts relative to K along the direction of movement. On the other hand, there is no change in size along perpendicular directions (and planes).

From the perspective of an observer moving at velocity **v** ≠ **0** relative to an inertial frame K, every distance d in K along **v** becomes d' = d · √(1 – v^{2} / c^{2}). This follows from what was said about time, since d = v · Δt and d' = v · Δt'. That is, d' < d.
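The two summary formulas can be tied together in a small worked example. The sketch below is illustrative only; the helper name `traveler_view` and the sample distance and speed are assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def traveler_view(d, v):
    """Return (Δt, Δt', d') for a distance d in K covered at speed v (hypothetical helper)."""
    root = math.sqrt(1.0 - (v / C) ** 2)
    dt = d / v             # duration in the coordinate time of K
    dt_prime = dt * root   # duration on the moving observer's clock
    d_prime = d * root     # the same distance from the moving observer's perspective
    return dt, dt_prime, d_prime


v = 0.8 * C
d = 8.0 * C  # eight light-seconds
dt, dt_prime, d_prime = traveler_view(d, v)

print(dt, dt_prime)       # about 10 s in K, about 6 s for the observer
print(d_prime / C)        # about 4.8 light-seconds
print(v * dt_prime / C)   # the same value, since d' = v · Δt'
```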

**Scope**

Non-inertial reference frames are out of scope in special relativity.

The perspective of a non-uniformly moving observer is calculated by decomposing their motion into sections s_{i} during which they can be regarded as comoving with a respective inertial frame K_{i}.

The co-located clock of K_{i} then becomes, temporarily along section s_{i}, the observer's "own clock".

A. Einstein (1916, 1920, 1952), R.W. Lawson [trans.] (1920, 1954), *Relativity: The Special and the General Theory*

A. Einstein (1905), M.N. Saha [trans.] (1920), *On the Electrodynamics of Moving Bodies*

Ø. Grøn, A. Næss (2011), *Einstein's Theory: A Rigorous Introduction for the Mathematically Untrained*

N.D. Mermin (2005), *It's About Time: Understanding Einstein's Relativity*

D.J. Morin (2017), *Special Relativity: For the Enthusiastic Beginner*

It is our day-to-day experience that rigid objects, e.g. two books, can be placed right next to each other, in a way that they touch but don't overlap. However, if we think of space as composed of points, the concept of touching becomes different: if two geometrical objects touch each other, they always overlap. For example, when a cube touches a sphere, they have one point in common. This is inconsistent with our intuition that a given piece of space cannot be occupied by multiple rigid objects at the same time.

The geometry developed by G. Veronese in the 19th century avoids the aforementioned inconsistency. The idea is that the continuum is not composed of points.^{(1)} The points do belong to the continuum, but are not part of it. Just like the graph of a computer network: it belongs to the network, but is physically not part of it.

Fig. 1: Point of separation. (Figure not reproduced: a straight line made of a green and a blue half-line, with a point marking where they meet.)

To illustrate the idea, imagine that Fig. 1 depicts a geometrical straight line. It consists of two disjoint, touching half-lines: one is of green color, the other blue. The point is the thing in the middle that marks where the two are separated. Clearly, the point belongs to both half-lines, since it marks where they end. But is it also part of them? If it was part of one, then, due to the symmetry of the situation, it would be part of the other too. With that, the green and the blue half-lines could not be disjoint. Thus, the point is not part of either half-line.

Fig. 2: Two books touching each other. (Figure not reproduced: two planar shapes labeled "Book 1" and "Book 2", placed side by side.)

In general, the contour of a geometrical object belongs to the object, but is not part of it. Fig. 2 shows another example, where two planar books touch each other. Again, the straight line segment in the middle (which would be a rectangular surface in three dimensions) that marks the separation between the books is not part of either book.

In Veronese's geometry, the continuum is investigated by means of a superimposed system of all points. The underlying assumption is that differences among geometrical objects manifest themselves through differences in the respective collections of points that belong to each object. That is, a geometrical object is unambiguously determined, and thus can be defined, by its points.^{(2)}

The superimposed system of all points must be dense enough, so that applying geometrical operations cannot lead out of it. For example, ℤ^{3} would not be appropriate in a 3-dimensional continuum because, among other shortcomings, the straight line that passes through (0, 0, 0) and (100, 1, 0) would not intersect in ℤ^{3} any of the straight lines that pass through (n, 0, 0) and (n, 1, 0), where n is an integer between 1 and 99.
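The ℤ^{3} example can be verified with exact rational arithmetic. This is a minimal sketch; the helper name `intersection_y` is an assumption introduced for illustration.

```python
from fractions import Fraction


def intersection_y(n):
    """y-coordinate at which the line through (0, 0, 0) and (100, 1, 0) crosses x = n.

    The line through (n, 0, 0) and (n, 1, 0) is exactly the set x = n, z = 0,
    so the two lines meet at (n, n / 100, 0). Hypothetical helper.
    """
    return Fraction(n, 100)


# Collect every n between 1 and 99 whose intersection point is not in ℤ³.
missing = [n for n in range(1, 100) if intersection_y(n).denominator != 1]

print(len(missing))
```

All 99 intersection points have a non-integer y-coordinate, so none of them belongs to ℤ^{3}.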

The ordinary analytic form of Euclidean geometry would correspond to the case where the system of all points is ℝ^{3}, which is already dense enough. However, Veronese goes beyond that and postulates a 3-dimensional (number) system that includes infinitesimally small as well as infinitely large quantities. He also provides a method by which similar systems can be constructed for continua of up to infinitely many dimensions.

The paradoxes start with the assumption that motion does exist. Then, an argument is presented that arrives at a contradiction, and the conclusion is drawn that motion cannot exist.

__Problem__

Eventually, Achilles catches up with the slower tortoise. His path will then necessarily include infinitely many (gradually decreasing) consecutive segments. Therefore, he would have to cover an infinite number of distances before he finally reaches the tortoise. But this is impossible, as no one can ever complete an infinite sequence of actions.

__Solution__

The seeming contradiction arises from the unspoken assumption that Achilles must become conscious of each individual segment during his motion. Intuition tells us that humans can only have a finite number of discrete conscious experiences behind them at any point in time. How then is Achilles' motion possible? Well, being conscious of the whole (path) does not imply being conscious of its (infinitely many) parts.

__Remark__

One can eat a bowl of rice without becoming conscious of every single grain of it.

__Problem__

A flying arrow occupies exactly one position at any moment. Thus, at every instant of time, the arrow is at rest. But then the motion of the arrow is impossible, since time is composed of such (motionless) instants. In other words, something that is always at rest cannot be in motion.

__Solution__

The paradox is baffling at the very least, until Veronese's conception of the continuum comes to the rescue. Although instants (points) of time do exist, with exactly one position of the arrow belonging to each of them, time itself is not composed of instants. Time can only be decomposed into intervals of non-zero length, and in none of those is the arrow at rest.

__Remark__

This is a purely mathematical resolution of the paradox. No need to resort to physics.

Imagine people living in an infinitesimally small world. Their chosen unit of length (i.e. their "1 meter") must be comparable to their physical sizes, which is infinitesimally small from our perspective. Analogously, since time resembles the spatial dimensions in many respects, their chosen unit of time (i.e. their "1 second") may also be infinitesimally small from our perspective.

The corollary would be that by the time we notice the tiniest change in our world, in an infinitesimally small world literally eternities would have elapsed.

This would open up interesting possibilities that seem to fall into the sci-fi category. To mention one, if we could mandate somebody in an infinitesimally small world to execute an extremely time-consuming algorithm for us, we would get back the result from them (or from one of their descendants) within the blink of an eye.

Applying Veronese's theory to time entails the existence of both infinitesimally small and infinitely large durations. This alone can be a rich source of riddles and paradoxes, especially when one attempts to reconcile it with human consciousness.

As an example, let's assume that Achilles shoots an arrow whose velocity is infinitesimally small. Will it ever reach its (standing) target? Algebraically, yes, in an actual infinite amount of time. But what does it really mean? Will an immortal observer see it if they wait long enough?

G. Fisher, P. Ehrlich [ed.] (1994), *Veronese's Non-Archimedean Linear Continuum*

J.L. Heiberg [ed.] (1883-1885), R. Fitzpatrick [trans.] (2008), *Euclid's Elements of Geometry*

D. Hilbert (1898-1899), E.J. Townsend [trans.] (1902), *The Foundations of Geometry*

L. Keele (2008), *Theories of Continuity and Infinitesimals: Four Philosophers of the Nineteenth Century*

H.J. Keisler (2011), *Foundations of Infinitesimal Calculus*

P. Lynds (2003), *Zeno's Paradoxes: A Timely Solution*

H. Poincaré (1902), E.V. Huntington [trans.] (1903), *Review of Hilbert's Foundations of Geometry*

W.M. Strong (1898), *Is continuity of space necessary to Euclid's geometry?*

G. Veronese (1891), A. Schepp [trans.] (1894), *Grundzüge der Geometrie*

G. Veronese (1908), P. Ehrlich [ed.], M. Marion [trans.] (1994), *On Non-Archimedean Geometry*

**(1) Composed of points or not? Who said what?**

__Hilbert__

The elements of Hilbert's geometry are points, straight lines, and planes. Any other geometrical object is defined in terms of such elements. For example, the segment (of a straight line) is defined as a "system" of two points, and the circle as a "totality" of points. Obviously, Hilbert's goal was logical soundness rather than coherent storytelling.

__Poincaré__

"There is supposed to be, between the elements of the continuum, a sort of intimate bond which makes a whole of them, in which the point is not prior to the line, but the line to the point."

__Euclid__

In its original form, Euclid's geometry presents a more appealing story than that of Hilbert, albeit logically less sound. The segment and the circle are both defined as "lines", which is in accordance with intuition. The question whether lines, surfaces, and solids are composed of points is left open.

**(2) What is a geometrical object and when does a point belong to it?**

A geometrical object is part of either the continuum in question or a lower-dimensional sub-continuum thereof. A point that belongs to it marks where the object or a part of it may touch another geometrical object. To be able to define geometrical objects by their points, it is important to postulate criteria as to when a collection of points represents part of a (sub-)continuum and when it does not.

For the typical primary school or high school student, the following definition of mathematics would suffice:

**(Naive definition)** Starting from obviously true axioms, use obviously correct inference rules to derive additional truths.

In this sense, mathematics is all about discovering indisputable truths. For example, the theorems proved in geometry would be literally true statements about the physical space. Someone may argue that there is no such physical object as a geometrical point, or a geometrical line, but this is no issue because we can reply that geometrical objects are nothing more than locations in the physical space, and thus they can happily exist even if nobody can see them materialized. As for the exotic topic of complex numbers, they can be viewed as a man-made tool that sometimes comes in handy for mathematicians in describing reality.

Historically, this looked plausible until the early nineteenth century, when non-Euclidean geometries were invented. From then on, it was no longer generally believed that Euclidean geometry was the true description of physical space. Curiously, it turned out to be impossible to decide which geometry, if any, was the true one. This raises the question: if mathematics does not necessarily describe reality, what does it describe? After some meditation, we may adjust our previous definition as follows:

**(Modern definition)** Starting from obviously clear assumptions, use obviously correct inference rules to derive consequences.

Here we acknowledge that mathematics is all about discovering the logical consequences of given assumptions, where it is not required that the assumptions are actually true. It's comparable to making logical deductions based on the "facts" set out in a detective fiction.

Again, this looks plausible until we find out that different groups of mathematicians don't even agree on which logical rules should be permitted when deriving consequences. Most notably, certain groups don't accept the unconditional application of the law of excluded middle. This stems from their divergent views on the existence of mathematical objects.

What has been said so far shows that defining mathematics is far more involved than expected. And we didn't even try to define the scope of mathematics; only its methodology was considered. To move things forward, the attempt to define mathematics must be accompanied by a better understanding of existing approaches to the foundations of mathematics.

One remark before we continue: in the definitions given above, "obviously true", "obviously correct", and "obviously clear" do not mean true, correct, and clear, respectively. Such wording was used solely to indicate things that, for most people out there with a brain, appear or used to appear obvious. It is this entanglement with the obvious that gives mathematics its illusion of indisputability.

Philosophy offers reasonable arguments about topics where, given our current level of knowledge, there is no feasible way of testing or verifying any theory. The foundations of mathematics is a philosophical topic of active research. This suggests that today we are still far away from a definitive answer as to what mathematics really is.

**(Metaphorical definition)** Mathematics is the intellectual discovery of nature's eternal, immutable infrastructure.

By "infrastructure", I mean that the discovery ultimately targets fundamental, ubiquitous ideas that we sense when observing appropriate configurations of things. (I am deliberately imprecise here, in the spirit of the quotation under the title of this essay.) By "intellectual", I mean that the process of discovering happens entirely within one's mind.

All ideas in mathematics are tied to observations. For example, when we look at {apple, apple, apple}, or {orange, orange, orange}, we sense "threeness", denoted by the symbol 3. When we observe the edge of a ruler, we sense a "straight line segment", even in spite of knowing that it would not look smooth through a magnifying glass. Looking at a computer network plan, we sense the idea of a "graph". Imagining a row of natural numbers, starting from 0 and fading away in the distance, can lead us to sense what we'd call the "set of natural numbers", denoted by ℕ. When we throw the dice, we sense "randomness". When we think of all humans ever born, we sense "potential infinity" (the actual set is finite at any given point in time, it's never completed, and may grow indefinitely). When we write down √-1, we sense something weird as if a number whose square is -1 existed (it sounded really weird before complex numbers became widely accepted). When we ask if there may be a positive quantity smaller than any positive real number, we sense the "infinitesimally small".

The ideas in the previous paragraph are sensed with varying vagueness. While "threeness" is pretty clear, √-1 and "infinitesimally small" feel like guessing. Since all the ideas are unclear to some (varying) extent, we need to invent theories in order to describe them and their relationships.^{(1) (2)}

As for the methodology, the starting points of mathematical theories are fundamental, ubiquitous ideas (incl. logic) about which we have accumulated so much consistent day-to-day experience that we can confidently rely on our intuition. We just close our eyes and, in an iterative process, come up with new ideas based on the already available ones, think out or take note of assumptions about the ideas we have, and discover the logical consequences of the assumptions.^{(3)} The only constraint is that the resulting theories must constitute conceivable stories about our world. As a minimum, statements including "clear" ideas (e.g. natural numbers, finite sets) should coincide with our experience, and statements including "unclear" ideas (e.g. √-1, infinitesimals, infinite sets) should not lead to known contradictions.^{(4) (5)}

The reason for mathematics transcending cultures and millennia is that humans have always had very similar experiences about the fundamental, ubiquitous ideas from which mathematics emerged. An alien civilization, if any, might develop a mathematics very different from ours, provided they exist in a very different environment and/or have very different sense organs.

The merits of a mathematical theory are assessed based on its beauty, success, and consistency. With regard to the development of mathematics, the first aspect is the most important guiding principle. The reason is, as somebody once put it, that beauty is felt as a result of sensing a deep law of nature. (I am deliberately imprecise here too.) As opposed to other sciences, the aim in mathematics is thus not to pinpoint and test, but rather to reflect laws of nature. That is, mathematics is both a science and an art.

In the following, the standpoints of various philosophical schools are presented and commented on. They can be used to customize the framework outlined above. It's a matter of personal preference.

According to fictionalism, mathematics is a collection of useful fictions whose statements are, despite their usefulness, actually all false. In these fictions there are recurring "characters" like numbers, straight lines, graphs and many others, all entirely fictitious. Nevertheless, the fictions are useful because they convey (or rather, reflect) truths about our world. Furthermore, discussing our experiences in terms of carefully chosen, representative fictional characters greatly facilitates communication.

Although I agree that mathematics is a collection of stories, I still think that the ideas (i.e. the characters) in those stories are real, in one way or the other, simply because we do sense them. The assumptions the stories make about the ideas, however, may well (all) be fictional. It's like writing a guide about an existing city without knowing it well.

Also, in my view the ideas exist right here with us (just like the city in the previous analogy), not only in a separate "world of ideas" as platonism would suggest. E.g. in a computer network, there is a graph right there belonging to the network; where else could it be? Putting it another way, I don't think the network is more real than the graph.

Loosely speaking, constructivism means seeing is believing. The principle is that only those ideas and properties exist for which we can exhibit an appropriate configuration in terms of an agreed way of representation. For example, if it's agreed to represent real numbers via Cauchy sequences of rational numbers, then the square root of 2 exists only after one has constructed an appropriate Cauchy sequence. Before that, the square root of 2 does not exist, however esoteric this may sound.

The allowed ways of construction differ in different flavors of constructivism. In most cases though, the set of natural numbers is either assumed to exist or allowed to be constructed, either as actual (completed) infinity or as potential (incomplete) infinity. Moreover, instead of carrying out a construction (e.g. that of a square root), it may be agreed that it suffices just to provide a feasible method for the same.

Truth values have to be constructed too. Here the "appropriate configuration" is the concrete proof, and the "agreed way of representation" is the allowed forms of proof. A statement does not have a truth value until it has either been proved or disproved (or until a feasible method has been provided that would certainly result in a proof of truth or falsity).^{(6)} If there is no such truth value to observe, it simply does not exist for a constructivist, leaving the statement undecided.^{(7)}

To prove "A or B", we need to prove at least one of them. This requirement lies at the very heart of constructivism, and follows from a "seeing is believing" interpretation of logical disjunction. Accordingly, to be able to say that "A or (not A)" is true, we need to either prove A or disprove A. This is different from classical mathematics where "A or (not A)" is always true in itself, since there it is taken for granted that every statement has a truth value, even if nobody can actually observe it. The same applies to "not (not A) ⇒ A": in constructivism, the left-hand side only means we've demonstrated that there is no way of disproving A; but it does not necessarily imply that A is true, or that A has a truth value at all.
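The constructive reading of disjunction and negation can be illustrated in a proof assistant. The following is only a sketch in Lean 4, chosen here for illustration and not part of the original argument; none of the three examples uses a classical axiom. It shows that proving "A or B" means committing to one side, that A implies "not (not A)", and that the double negation of "A or (not A)" is provable even though "A or (not A)" itself is not available in general.

```lean
-- Proving a disjunction constructively means picking a side and proving it.
example (A B : Prop) (ha : A) : A ∨ B := Or.inl ha

-- From a proof of A we can always show that A cannot be disproved.
example (A : Prop) (ha : A) : ¬¬A := fun hna => hna ha

-- "A or (not A)" cannot be refuted, yet this is weaker than having
-- a proof of A or a disproof of A in hand.
example (A : Prop) : ¬¬(A ∨ ¬A) :=
  fun h => h (Or.inr (fun a => h (Or.inl a)))
```

The converse direction, from "not (not A)" to A, has no such proof term without appealing to a classical principle.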

Adhering to the principle of constructivism lends constructive mathematics certainty and confidence, and leaves little room for unpleasant surprises like paradoxes or contradictions. After all, it's hard to imagine more obvious and tangible evidence of existence than a constructed representation. The price we pay is that proofs tend to become unusually cumbersome.^{(8)}

Infinity in mathematical theories has always been a major source of controversy. We can only perceive a finite thing in its entirety, and this is true for our imagination too. With respect to intuition, this means we can only guess what a real infinity would look like in terms of our finite perception, and whether there exists infinity of any kind (actual or potential) at all. To eliminate this guesswork, theories in strict finitism are free from infinity.

In classical mathematics, one can state that the formula n^{2}-1=(n+1)·(n-1) is true for all natural numbers. In strict finitism, the corresponding statement is that any concrete equation that matches the above formula is true. In other words, we only state that if someone gets hold of a concrete natural number, say, 19, then the resulting concrete equation, 19^{2}-1=(19+1)·(19-1), will be true. That is, while in classical mathematics the statement refers to all the elements of a (hypothetically) existing infinite set, in strict finitism it's merely an abstraction of concrete individual occurrences.
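The finitist reading can be mimicked in code: a function never quantifies over all natural numbers, it only checks whichever concrete number it is handed. The following is a minimal sketch; the name `instance_holds` is hypothetical and chosen only for this illustration.

```python
def instance_holds(n: int) -> bool:
    """Check the single concrete equation n^2 - 1 = (n + 1) * (n - 1).

    Hypothetical helper for illustration: it says nothing about "all n",
    only about the one natural number it receives.
    """
    if n < 0:
        raise ValueError("expected a natural number")
    return n**2 - 1 == (n + 1) * (n - 1)


# The concrete occurrence from the text: 19^2 - 1 = 360 and 20 * 18 = 360.
print(instance_holds(19))
```

Each call settles one concrete equation; the general formula is the pattern that all such calls share.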

Infinite sets of classical mathematics may have counterparts in strict finitism. For example, the "set" of even numbers is basically defined as the property of being even. If a concrete number has this property, we say that the number is an "element of the set". So it's not really a set in the classical sense, but rather a common property that ties concrete occurrences together.
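As a rough sketch of this idea, a "set" can be represented by nothing more than a membership test. The class name `PropertySet` and the helper `is_even` are assumptions made for this example; the object deliberately offers no way to list or count its "elements".

```python
def is_even(n: int) -> bool:
    """The property of being even, checked for one concrete number."""
    return n % 2 == 0


class PropertySet:
    """A 'set' given only by a property (hypothetical illustration).

    It supports the question "is n an element?" and nothing else:
    there is no iteration and no length, so no completed totality is assumed.
    """

    def __init__(self, predicate):
        self._predicate = predicate

    def __contains__(self, n: int) -> bool:
        return self._predicate(n)


EVENS = PropertySet(is_even)
print(20 in EVENS, 19 in EVENS)
```

Here `n in EVENS` is just another way of writing `is_even(n)`, which matches the remark that the "set" is a common property rather than a collection.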

A "sequence" of rational numbers can be defined as a (finite) method that expects a single input, a natural number, and is guaranteed to produce a rational number as output in finitely many steps. Such methods can then be used to represent real numbers by finitary means.
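To make this concrete, here is a minimal sketch of such a method for the square root of 2, using bisection on exact rationals. The function name `sqrt2_approx` and the choice of bisection are assumptions made for this illustration; other finitary representations are equally possible.

```python
from fractions import Fraction


def sqrt2_approx(n: int) -> Fraction:
    """Return a rational q with q^2 <= 2 < (q + 2^-n)^2.

    Hypothetical illustration: the loop runs exactly n times, so the
    method terminates in finitely many steps for every natural number n.
    """
    lo, hi = Fraction(1), Fraction(2)  # invariant: lo^2 <= 2 < hi^2
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


q = sqrt2_approx(10)
print(q, float(q))
```

Any two outputs `sqrt2_approx(m)` and `sqrt2_approx(n)` differ by less than 2^-min(m, n), which is the Cauchy-style property that lets the method stand in for a real number.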

Strict finitism is usually coupled with constructivism (constructivism taken to the next level, so to speak), but nevertheless it's possible to develop non-constructive theories that do not make use of infinity.^{(9)}

Pure mathematics deals with making discoveries about the ideas we sense, while applied mathematics means modeling real-world phenomena using a mathematical theory.

As an example, developing (the story of) Euclidean geometry, i.e. intellectually discovering the properties of and relationships between ideas like points, straight lines and planes, is pure mathematics. On the other hand, modeling shapes and trajectories of physical objects by means of Euclidean geometry, with the aim of making measurable predictions about them, is applied mathematics.^{(10)}

Another example of applied mathematics is to model asset prices as continuous quantities, while knowing that real prices have a finite number of decimal places, given e.g. in cents.
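The gap between model and reality in this example can be sketched as follows. The growth formula, the function names `continuous_price` and `to_cents`, and the numbers are all assumptions chosen purely for illustration; they are not taken from any particular pricing model.

```python
import math
from decimal import Decimal, ROUND_HALF_UP


def continuous_price(p0: float, rate: float, t: float) -> float:
    """Model price as a continuous quantity (illustrative exponential growth)."""
    return p0 * math.exp(rate * t)


def to_cents(price: float) -> Decimal:
    """Map the model's continuous value to a real-world price in whole cents."""
    return Decimal(price).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


model_value = continuous_price(100.0, 0.05, 1.0)
print(model_value, to_cents(model_value))
```

The model works with the continuous value throughout; only at the point of comparison with real prices is it rounded to two decimal places.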

In summary, mathematics is the intellectual discovery of nature's infrastructure. It consists of theories about ideas that we sense with varying vagueness.

A theory begins with a number of ideas and assumptions, from which its story unfolds via the derivation of more and more logical consequences. The entire process happens within one's mind, relying fully on one's intuition.

The aim in mathematics is to reflect deep laws of nature; that's where its beauty comes from, and that's how it is connected with arts.

What is still missing is a clarification of the terms "logical consequence" and "fundamental idea", which were both used informally all along. Discussing these in detail will be the topic of another essay.

M. Balaguer, E.N. Zalta [ed.] (2011), *Fictionalism in the Philosophy of Mathematics*

E. Bishop, D. Bridges (1985), *Constructive Analysis*

J. Bolyai (1831), F. Kárteszi [ed.] (1987), *Appendix: The Theory of Space*

D. van Dalen [ed.] (1981), *Brouwer's Cambridge Lectures on Intuitionism*

K. Devlin (1993), *The Joy of Sets: Fundamentals of Contemporary Set Theory*

H. Field (2008), *Saving Truth from Paradox*

H. Field (1980), *Science Without Numbers: A Defence of Nominalism*

T. Gowers [ed.] et al. (2008), *The Princeton Companion to Mathematics*

J.L. Heiberg [ed.] (1883-1885), R. Fitzpatrick [trans.] (2008), *Euclid's Elements of Geometry*

D. Hilbert (1926), E. Putnam & G.J. Massey [trans.] (1964), *On the infinite*

D. Hilbert (1898-1899), E.J. Townsend [trans.] (1902), *The Foundations of Geometry*

D.R. Hofstadter (2008), *I Am a Strange Loop*

H.J. Keisler (2011), *Foundations of Infinitesimal Calculus*

E. Nelson (2005), *Completed versus Incomplete Infinity in Arithmetic*

E. Nelson (2010), *Confessions of an Apostate Mathematician*

E. Nelson (1977), *Internal Set Theory: A new approach to nonstandard analysis*

E. Nelson (2000), *Mathematics and Faith*

E. Nelson (1987), *Radically Elementary Probability Theory*

E. Nelson (2002), *Syntax and Semantics*

E. Nelson (2006), *Warning Signs of a Possible Collapse of Contemporary Mathematics*

B. Russell (1919), *Introduction to Mathematical Philosophy*

E. Schechter (2001), *Constructivism Is Difficult*

I. Stewart (2010), *Alien mathematics: is Pi universal?*

P. Suppes (2001), *Finitism in geometry*

P. Suppes (2000), *Quantifier-free axioms for constructive affine plane geometry*

P. Suppes (2002), *Representation and Invariance of Scientific Structures*

P. Suppes (2010), *The nature of probability*

M. Tiles (1989), *The Philosophy of Set Theory: An Historical Introduction to Cantor's Paradise*

E.P. Wigner (1959), *The Unreasonable Effectiveness of Mathematics in the Natural Sciences*

F. Ye (2011), *Strict Finitism and the Logic of Mathematical Applications*

**(1) Isn't "threeness" crystal clear?**

Good question. I don't think it is.

**(2) Is √-1 a ubiquitous idea?**

No, it's neither fundamental nor ubiquitous, but it bears a close relationship to such ideas, which at the end of the day makes it interesting for mathematical discovery. On a related note, the various kinds of mathematical spaces (vector spaces, topological spaces, etc.) are all about discovering what emerges from the interplay between the fundamental, ubiquitous ideas of set, relation, and operation.

**(3) What about discoveries where computers are used?**

There is a philosophical concept of the "extended mind", which for mathematical practice means that the mind can be aided by things like pencil and paper, a calculator, or a computer. Such extensions boost already existing capabilities of the mind, especially memory and the ability to derive logical consequences.

**(4) What if it turns out that a story is not true?**

No theory is true; there are only circumstances under which it cannot be falsified. The story will continue to have its own life in our mind and can be further developed via logical deductions.

**(5) Wouldn't it be more productive to allow experiments in mathematics?**

Thought experiments are allowed, of course. Others would be of very limited use, since they would either be rendered superfluous by logical proofs (due to the "unreasonable effectiveness" of logic in mathematics), or just couldn't be performed at all (e.g. due to infinity involved).

**(6) What does "disproved" mean in constructivism?**

A statement is disproved after a contradiction has been derived from the assumption that it's true (= proved). This is how the truth value "false" is defined. The following are synonymous: "A is false", "A is disproved", "not A". To paraphrase, a statement is false if and only if it has been proved that the situation described by it cannot occur.

**(7) Assuming that the square root of 2 has not yet been constructed, can we say that it does not exist, or only that its existence is undecided?**

If existence is meant literally, then given the assumption, the statement "the square root of 2 exists" is false. However, if by "exists" we actually meant "can be constructed", then the statement is (currently) undecided.

**(8) Why are constructive proofs cumbersome?**

I believe it has to do with our education, namely that we have been trained in classical mathematics from the age of 6 or so. As a result, textbooks on classical mathematics can be written in an informal style where many details that would clutter the main line of thought are omitted. This does not compromise rigor, since it capitalizes on the readers' solid understanding of and intuition about the fundamentals. Alternative mathematical theories don't have this luxury. In their textbooks, theorems and proofs are stated either in great detail at the expense of increased clutter, or with fewer details, risking that the audience gets confused or misunderstands what is written. Either way, the exposition is likely to be difficult to follow.

**(9) Does strict finitism deny the existence of infinity?**

No, not necessarily. It rather says that we don't need to care whether infinity exists or not, because we can still do useful mathematics without it. As of today, the resulting theories seem sufficient for the purpose of modeling the finite world we know about.

**(10) Is "applied mathematics" mathematics at all?**

Yes, because in the models we work with mathematical ideas. However, the application of a model is unlikely to yield deep mathematical discoveries, for the aim is to solve problems of another discipline, not those of mathematics.