To measure or not to measure... a real problem!

This post will be more theoretical than usual. If you are afraid of mathematics, go away now before it is too late!

I will write about a set, the Vitali set, which arises in measure theory, to show that there is no coherent way of assigning a size to every subset of the real line. Because of that I will try to define size for subsets of the real line and then verify that such task is impossible.
I will also enforce a couple of (seemingly reasonable) restrictions on my definition.

I challenge you to read this post and to calmly follow my reasoning. Whenever something doesn't seem obvious, try to make it clear by yourself with a piece of paper and pen/pencil. If any doubts persist, drop me your question(s) in the comments and I will answer gladly!

Let us call $m$ to the function that, given a subset of the real line, returns its size; that is, let us try to define $m: \mathcal{P}(\mathbb{R}) \to [0, \infty]$.

Let us also suppose that the function $m$ satisfies the following restrictions:

If $I$ is an interval with endpoints $a \leq b$, $m(I) = b-a$. That is, the size of an interval is what we intuitively call its length. For example, $m([3,4]) = 4-3 = 1$;
$m(E+x) = m(E)$: the size of a set remains unchanged under translations, i.e. even if a set is "pushed over", its size remains the same;
If we have a finite number of sets, or a numerable quantity of them, and if they are all disjoint, then the size of their union must equal the sum of their individual sizes: $m (\cup_{n=1}^{\infty}E_n) = \sum_{n=1}^{\infty}m (E_n) $ whenever all the $E_n $ are disjoint.

Under these restrictions, can we create the function $m$?

Giuseppe Vitali, an italian mathematician, found out that the answer was "no". Vitali discovered a set $V \subset \mathbb{R}$ that allows one to conclude that we cannot create a function $m$ satisfying the restrictions above. I will now present the construction of the Vitali set:

Let us take a real number $x$ and define the set $C_x = \{y \in \mathbb{R}: y-x \in \mathbb{Q}\}$, i.e. the set $C_x$ contains all numbers that differ of $x$ by a rational amount: $C_x = \{x + q, \forall q \in \mathbb{Q}\}$ (in fact, the set $C_x$ represents the equivalence class of $x$ under the equivalence relation $x \sim y \iff x-y\in\mathbb{Q}$).

Having defined the sets $C_x$ for all real $x$, we pick out of every unique $C_x$, a value $v_x$ such that $v_x \in ]0, 1[$.
I said "out of every unique $C_x$" because, for example, $C_0 = C_1$; for this construction, we do not want repeated sets $C_x$. We will also say that the number $v_x$ is a representative of the set $C_x$ and all its elements. As an example, $0.3$ could be a representative of the set $C_0$ and $\pi - 3.1$ could be a representative of the set $C_\pi$.
(By the axiom of choice) we have a set $V$ - which we will call the Vitali set - where we have, from each unique set $C_x$, a single representative $v_x \in ]0, 1[$.
We now ask what should be the value $m(V)$. Either $m(V) = 0$ or $m(V) > 0$. We will build an auxiliary set which will allow one to conclude that both cases lead to incoherent results.

To build the auxiliary set, let us consider $\{r_1, r_2, \cdots, r_n, \cdots \}$ an enumeration of all the rationals in $]-1, 1[$ and let us define $V_j = V + r_j$: $V_j$ is the translation of the set $V$ by $r_j$. Then we define $$G = \cup_{j=1}^\infty V_j$$ We notice that the sets $V_j$ are all disjoint! Suppose $y$ is an element lying simultaneously in $V_j$ and in $V_k$ (with $j \neq k$). This means that

$x_j = y-r_j$ is in $V$ because $y \in V_j = V + r_j$;
$x_k = y-r_k$ is in $V$ because $y \in V_k = V + r_k$;
From $1.$ and $2.$ we have that $x_j - x_k = (y - r_j) - (y - r_k) = r_k - r_j$ is a rational number, i.e. $x_j - x_k \in \mathbb{Q}$;

From $3.$ we get that $x_k \in C_{x_j}$ but, by definition, $V$ only has a representative from each $C_x$. Thus we obtain a contradiction and there can be no real number $x$ lying in two different $V_j$: all the $V_j$ are disjoint.

Because of the restrictions we imposed on $m$, we have that $m(V_j) = m(V + r_j) = m(V)$ because $m$ is constant under translations; we also have that $$m(G) = m\left(\cup_{j=1}^\infty V_j\right) = \sum_{j=1}^\infty m(V_j) = \sum_{j=1}^\infty m(V)$$ because the $V_j$ are all disjoint and we are performing their union.

Now we can see that if it is the case that $m(V) = 0$, then $m(G) = 0$. But $m(G) = 0$ is absurd because the interval $]0, 1[$ is inside $G$ and $m(]0, 1[) = 1$ implying that $m(G) \geq 1$. To verify that $]0, 1[ \subset G$, let $x \in ]0,1[$. This element $x$ belonged to some initial $C_\bullet$ and therefore has a representative $y$ in $]0, 1[$. If $y$ is a representative of $x$, then $x$ and $y$ differ by a rational number. But $x,y \in ]0, 1[$ means that their distance is smaller than $1$ and, in particular, $x - y \in ]-1, 1[$. Adding everything together, $x-y \in \mathbb{Q}$ and $x-y \in ]-1, 1[$ gives that $x-y = r_k$ for some $k$, as $\{r_1, \cdots, r_n, \cdots\}$ was an enumeration of all the rationals in $]-1, 1[$.

If it was the case that $m(V) > 0$, then $m(G) = \infty$. But $m(G) = \infty$ is another absurd given that $G \subseteq ]-1, 2[$ implying that $m(G) \leq m(]-1, 2[) = 3$. To see that $G \subseteq ]-1, 2[$ we just have to remember that $G$ is a numerable union of the sets $V_j$ and every one of those is $V_j = V + r_j$. We had that $V \subset ]0, 1[$ and $r_j \in ]-1, 1[$ hence $V_j \subseteq ]-1, 2[$.

(the image above has some points from $V$ in red, one of the rationals $r_j$ and the respective set $V_j$ in blue; because every $r_j$ lies in $]-1, 1[$ it is obvious that $V_j \subset ]-1, 2[$)

Looking back at what we have, neither $m(V) = 0$ nor $m(V) > 0$ produce coherent results! We conclude that our function $m$ cannot exist as I envisioned it in the beginning of the post... even though the restrictions I imposed seemed perfectly reasonable!

I understand if this post isn't the easiest to digest. If you have any doubts regarding the construction I described or anything related, drop a line on the comments section and I will be glad to answer!

Este post vai ser um pouco mais "pesado" do que o costume. Se tens medo de matemática, sai agora antes que seja tarde demais.

Vou falar de um conjunto, o conjunto de Vitali, que surge no estudo da teoria da medida, para mostrar que não conseguimos definir um tamanho para todos os subconjuntos da reta real. Por isso, neste post vamos tentar definir uma noção de tamanho para subconjuntos da reta real e verificar que isso é impossível.
Vamos procurar que a nossa definição satisfaça algumas condições que, intuitivamente falando, pareciam ser razoáveis.

O desafio que deixo é que sigam o post e, com calma, acompanhem os raciocínios descritos. Quando algo não parecer óbvio ou não for claro, tentem esclarecer as vossas próprias dúvidas e provar o que eu digo. Se houver alguma dúvida que persista, deixem-me um comentário e eu terei todo o prazer em responder!

Chamemos $m$ à função que, dado um subconjunto da reta real, devolve o seu tamanho; ou seja, vamos procurar definir $m: \mathcal{P}(\mathbb{R}) \to [0, \infty] $.

Vamos supor que a função $m$ satisfaz as seguintes condições:

Se $I $ é um intervalo com extremos $a\leq b $, $m(I)= b-a $. Ou seja, o tamanho de um intervalo é aquilo a que intuitivamente chamamos o seu comprimento. Por exemplo, $m ([3,4])=4-3=1$;
$m (E + x) = m (E) $: o tamanho de um conjunto deve ser constante sob translações, i.e. se um conjunto for "empurrado para o lado", o seu tamanho continua igual;
Se tivermos um número finito de conjuntos, ou uma quantidade numerável deles, e eles forem todos disjuntos, então o tamanho da sua união deve ser igual à soma dos seus tamanhos: $m (\cup_{n=1}^{\infty}E_n) = \sum_{n=1}^{\infty}m (E_n) $ sempre que os $E_n $ são todos disjuntos. Isto pode ser visto como uma formalização de "o todo é a soma das partes".

Postas estas condições, será que conseguimos criar a dita função $m$?

Giuseppe Vitali, um matemático italiano, descobriu que não. Vitali descobriu um conjunto $V \subset \mathbb{R}$ que nos permite concluir que a função $m$ não pode existir e satisfazer as condições que eu listei em cima. Passemos de seguida à construção do conjunto de Vitali:

Tomemos um número real $x$ e defina-se o conjunto $C_x = \{y \in \mathbb{R}: y-x \in \mathbb{Q}\}$, i.e. o conjunto $C_x$ contém todos os números que diferem de $x$ por um número racional: $C_x = \{x + q, \forall q \in \mathbb{Q}\}$ (na verdade, o conjunto $C_x$ representa a classe de equivalência de $x$ sob a relação de equivalência $x \sim y \iff x-y\in\mathbb{Q}$).

Tendo definido os conjuntos $C_x$ para todo o $x$ real, escolhemos de dentro de cada conjunto $C_x$ único, um valor $v_x$ tal que $v_x \in ]0, 1[$.
Digo "de cada conjunto $C_x$ único" porque, por exemplo, $C_0 = C_1$; para esta construção, não nos interessam conjuntos $C_x$ repetidos. Dizemos ainda que esse valor $v_x$ é um representante do conjunto $C_x$ e de todos os elementos que lá estão; por exemplo, $0.3$ podia ser o representante do conjunto $C_0$ e $\pi - 3.1$ podia ser o representante do conjunto $C_\pi$.
(Pelo axioma da escolha) temos um conjunto $V$ - a que chamamos o conjunto de Vitali - onde reunimos, de cada conjunto único $C_x$, apenas um representante $v_x \in ]0, 1[$.
Perguntamo-nos agora sobre qual deveria ser o valor de $m(V)$. Das duas uma, ou $m(V) = 0$, ou $m(V) > 0$. Vamos construir um outro conjunto auxiliar que nos vai ajudar a perceber que, independentemente do caso que se verifique, obtemos uma contradição.

Para construir o conjunto auxiliar, vamos considerar que $\{r_1, r_2, \cdots, r_n, \cdots \}$ é uma enumeração de todos os racionais em $]-1, 1[$ e vamos definir $V_j = V + r_j$: $V_j$ é a translação do conjunto $V$ em $r_j$. Definimos de seguida $$G = \cup_{j=1}^\infty V_j$$ Vamos ainda observar que os conjuntos $V_j$ são todos disjuntos! Suponhamos que $y$ é um elemento que está, simultaneamente, em $V_j$ e em $V_k$ (com $j \neq k$). Isto significa que

$x_j = y-r_j$ está em $V$ porque $y \in V_j = V + r_j$;
$x_k = y-r_k$ está em $V$ porque $y \in V_k = V + r_k$;
De $1.$ e de $2.$ vem que $x_j - x_k = (y - r_j) - (y - r_k) = r_k - r_j$ é um número racional, i.e. $x_j - x_k \in \mathbb{Q}$;

De $3.$ vem que $x_k \in C_{x_j}$ mas, por definição, $V$ tem apenas um elemento de cada $C_x$. Obtemos assim uma contradição e não pode existir um $x$ em dois conjuntos $V_j$ diferentes: todos os $V_j$ são disjuntos.

Pelas condições que impusemos à função $m$, temos por um lado que $m(V_j) = m(V + r_j) = m(V)$ porque a função $m$ é constante sob translações e, por outro lado $$m(G) = m\left(\cup_{j=1}^\infty V_j\right) = \sum_{j=1}^\infty m(V_j) = \sum_{j=1}^\infty m(V)$$ porque a função $m$ respeita a noção de que o todo é a soma das partes (e porque os conjuntos $V_j$ são todos disjuntos!).

Vemos assim que se for o caso que $m(V) = 0$, então $m(G) = 0$. Mas $m(G) = 0$ é absurdo, porque o intervalo $]0, 1[$ está contido dentro de $G$ e $m(]0,1[) = 1$, logo $m(G) \geq 1$. Para vermos que $]0, 1[ \subset G$, seja $x \in ]0, 1[$. Este elemento $x$ pertencia a algum conjunto $C_\bullet$ inicial e portanto tem um representante $y$ em $]0, 1[$. Ora, se $y$ é representante de $x$, quer dizer que $x$ e $y$ diferem por um racional. Mas $x,y \in ]0, 1[$ significa que a distância entre $x$ e $y$ é inferior a $1$ e, em particular, $x - y \in ]-1, 1[$. Mas se $x-y \in \mathbb{Q}$ e $x-y\in]-1,1[$, então $x-y = r_k$ para algum $k$, já que $\{r_1, \cdots, r_n, \cdots\}$ era uma enumeração de todos os racionais em $]-1,1[$.

Se fosse o caso que $m(V) > 0$, então $m(G) = \infty$. Mas $m(G) = \infty$ é absurdo, já que $G \subseteq ]-1, 2[$ e portanto deveríamos ter $m(G) \leq m(]-1, 2[) = 3$. Para vermos que $G \subseteq ]-1, 2[$ basta relembrar que $G$ é a união numerável dos conjuntos $V_j$ e cada um desses $V_j$ é $V + r_j$. Ora $V \subset ]0, 1[$ e $r_j \in ]-1, 1[$ logo $V_j \subseteq ]-1, 2[$.

(na imagem temos alguns pontos de $V$ a vermelho, um dos racionais $r_j$ e o respetivo conjunto $V_j$ a azul; uma vez que $r_j$ tem de estar em $]-1, 1[$ é claro que $V_j \subset ]-1, 2[$)

Temos assim que nem $m(V) = 0$ nem $m(V) > 0$ produzem resultados coerentes! Concluímos assim que a nossa função $m$ não pode existir e satisfazer as condições que eu impus no início... ainda que essas condições parecessem perfeitamente razoáveis do ponto de vista intuitivo!

Acredito que o post não tenha sido trivial de seguir. Se houver alguma dúvida quanto à construção que acabei de repetir aqui ou sobre algo relacionado, deixa um comentário em baixo e eu respondo!

- RGS

Search This Blog

Mathspp: mathematics and programming

To measure or not to measure... a real problem!

Popular posts from this blog

Pocket maths: how to compute averages in your head

Tutorial on programming a memory card game

Markov Decision Processes 02: how the discount factor works