## Posts

Showing posts from September, 2018

### Markov Decision Processes 02: how the discount factor works

In the previous post I defined a Markov Decision Process (MDP) and explained all of its components; now we will explore what the discount factor $\gamma$ really is and how it influences the MDP.

In this MDP the states are Hungry and Thirsty (which we will represent with $H$ and $T$) and the actions are Eat and Drink (which we will represent with $E$ and $D$). The transition probabilities are specified by the numbers on top of the arrows. In the previous post we put forward that the best policy for this MDP was defined as $$\begin{cases} \pi(H) = E\\ \pi(T) = D\end{cases}$$ but I didn't really prove that. I will do that in a second, but first what are all the other possible policies? Well, recall that the policy $\pi$ is the "best strategy" to be followed, and $\pi$ is formally seen as a function from the states to the actions, i.e. $\pi: S \to A$. Because of that, we must know what $\pi(H)$ a…
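Since a policy is just a function $\pi: S \to A$, the candidate policies for this small MDP can be listed mechanically. The snippet below is a minimal sketch of that enumeration; the names `states`, `actions` and `all_policies` are made up for illustration and are not part of the original post.

```python
from itertools import product

# States and actions of the example MDP, using the letters from the text.
states = ["H", "T"]   # Hungry, Thirsty
actions = ["E", "D"]  # Eat, Drink


def all_policies(states, actions):
    """Return every deterministic policy, each as a dict mapping state -> action."""
    return [
        dict(zip(states, choice))
        for choice in product(actions, repeat=len(states))
    ]


policies = all_policies(states, actions)
for policy in policies:
    print(policy)
```

With two states and two actions there are $2^2 = 4$ such policies, and the one claimed to be the best, $\pi(H) = E$, $\pi(T) = D$, is among them.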

### Markov Decision Processes 01: the basics

In this post I will introduce Markov Decision Processes, a common tool in Reinforcement Learning, a branch of Machine Learning. By the end of the post you will be able to make some sense of the figure above!
I will couple the formal details, definitions and maths with an intuitive example that will accompany us throughout this post. In later posts we will make our example more complete and use other examples to explain other properties and characteristics of MDPs.

Let me introduce the context of the example:
From a simplistic point of view, I only have two moods: "hungry" and "thirsty". Thankfully, my parents taught me how to eat and how to drink, so that I can fulfill the needs I mentioned earlier. Of course, eating when I am hungry makes me happy, just as drinking when I am thirsty makes me happy! Not only that, but eating when I am hungry usually satisfies me, much like drinking when I am thirsty usually satisfies me.

Supp…

### Twitter proof: folding my way to the moon

In this twitter proof we will see how the exponential function can mess with objects from our daily lives!

Claim: with fewer than $50$ folds, a piece of paper will be so thick that it will cover the distance from the Earth to the Moon.

Twitter proof: a common sheet of paper is $0.1$mm thick. If we fold it once, it becomes $0.2$mm thick. Folding twice, $0.4$mm. Folding $49$ times, the paper becomes $2^{49}\times 0.1$mm thick, which is around $5.63\times 10^{13}$mm or $5.63\times10^7$km, about $141$ times the distance from the Earth to the Moon ($398818$km).
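The arithmetic in the proof is easy to check numerically. The sketch below uses the same figures as the text ($0.1$mm per sheet and $398818$km to the Moon); the constant and function names are only illustrative.

```python
# Figures taken from the proof above.
PAPER_THICKNESS_MM = 0.1
EARTH_MOON_KM = 398_818
MM_PER_KM = 1_000_000


def thickness_km(folds):
    """Thickness in km after folding the sheet `folds` times (it doubles each time)."""
    return PAPER_THICKNESS_MM * 2 ** folds / MM_PER_KM


def min_folds_to_reach(distance_km):
    """Smallest number of folds whose thickness is at least `distance_km`."""
    folds = 0
    while thickness_km(folds) < distance_km:
        folds += 1
    return folds


ratio = thickness_km(49) / EARTH_MOON_KM
print(f"{thickness_km(49):.3e} km after 49 folds")
print(f"about {ratio:.0f} times the Earth-Moon distance")
print(f"{min_folds_to_reach(EARTH_MOON_KM)} folds already suffice")
```

Under these figures $49$ folds are more than needed: the loop stops at $42$ folds, which is why the claim can safely say "fewer than $50$".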


### Twitter proof: neural networks and the linear activation function

In this post we will see why it is not helpful to have two consecutive layers of neurons with linear activation functions in neural networks. With just a bit of maths we can conclude that $n$ consecutive linear layers compute exactly the same functions as a single layer.

Claim: having two fully connected layers of neurons using linear activation functions is the same as having just one layer with a linear activation function.

We just have to lay down some notation in order for the maths to be doable. Assume the two consecutive layers of linear neurons are preceded by a layer with $n_0$ neurons, whose outputs are $o_1, o_2, \ldots, o_{n_0}$.
Let us say that after that layer, there is a layer of $n_1$ neurons with linear activation functions $f_i(x) = a_ix + b_i$; the weight from neuron $t$ of the previous layer to the neuron $i$ of this layer is $w_{t,i}$.
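The claim can also be checked numerically. The sketch below follows the notation above for the first layer (`weights[t][i]` plays the role of $w_{t,i}$, and neuron $i$ applies $f_i(x) = a_ix + b_i$). The second layer, the function names `linear_layer` and `collapse`, and all the numbers are assumptions made for illustration only.

```python
def linear_layer(inputs, weights, a, b):
    """Fully connected layer whose neuron i outputs a[i] * (weighted input sum) + b[i].

    weights[t][i] is the weight from input t to neuron i.
    """
    return [
        a[i] * sum(weights[t][i] * inputs[t] for t in range(len(inputs))) + b[i]
        for i in range(len(a))
    ]


def collapse(w1, a1, b1, w2, a2, b2):
    """Build one linear layer equivalent to layer 1 followed by layer 2."""
    n0, n1, n2 = len(w1), len(a1), len(a2)
    # Combined weight from input t to output j, summing over the middle neurons.
    w = [
        [sum(w1[t][i] * a1[i] * w2[i][j] for i in range(n1)) for j in range(n2)]
        for t in range(n0)
    ]
    a = list(a2)
    # The biases of layer 1 pass through layer 2 and merge into its biases.
    b = [a2[j] * sum(w2[i][j] * b1[i] for i in range(n1)) + b2[j] for j in range(n2)]
    return w, a, b


# Hypothetical example with n_0 = 3 inputs, 2 middle neurons and 2 output neurons.
o = [1, -2, 3]
w1, a1, b1 = [[1, 2], [0, -1], [3, 1]], [2, -1], [1, 4]
w2, a2, b2 = [[1, -2], [3, 0]], [3, 1], [0, -5]

two_step = linear_layer(linear_layer(o, w1, a1, b1), w2, a2, b2)
one_step = linear_layer(o, *collapse(w1, a1, b1, w2, a2, b2))
print(two_step, one_step)
```

Both lists come out identical, which is the numerical counterpart of the claim: composing two affine maps gives another affine map, so the second linear layer adds no expressive power.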