Posts

Showing posts from October, 2018

Twitter proof: the sum of inverses diverges

In this post I will share with you my favourite proof that the series of the inverses of the natural numbers diverges: $\sum_{i=1}^\infty \frac1i = \infty$. Claim: the series $\sum_i \frac1i$ diverges. Twitter proof: consider the series $$ \begin{align} &\frac12 + \frac12 + \frac12 + \cdots = \\ &\frac12 + 2\times\frac14 + 4\times\frac18 + \cdots = \\ &\frac12 + \frac14 + \frac14 + \frac18 + \frac18 + \frac18 + \frac18 + \cdots \leq \\ &\frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \frac17 + \frac18 + \cdots \end{align}$$ which clearly diverges because it is a series with a constant nonzero term, and the inequality holds term by term. By the comparison test, the series of the inverses also diverges. Comment with your favourite way to prove this fact!
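As a complement (not part of the original proof), here is a quick numerical illustration in Python: the partial sums $H_n = \sum_{i=1}^n \frac1i$ grow without bound, but only about as fast as $\ln n$, which is why the divergence is easy to miss numerically.

```python
import math

# Partial sums of the harmonic series: H_n = 1 + 1/2 + ... + 1/n.
def harmonic(n):
    return sum(1 / i for i in range(1, n + 1))

# H_n diverges, but only logarithmically: H_n is roughly ln(n) + 0.5772.
for n in (10, 1000, 100000):
    print(f"H_{n} = {harmonic(n):.4f}  vs  ln({n}) = {math.log(n):.4f}")
```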

Twitter proof: the Tower of Hanoi

In this post we prove what the minimum number of moves needed to solve the Tower of Hanoi is! Claim: let $T(n)$ denote the minimum number of moves it takes to solve the Tower of Hanoi with $n$ disks; then $T(n) = 2^{n}-1$. Twitter proof: note that to solve the problem with $n$ disks, we first have to move the top $n-1$ disks to one of the two other poles, move the bottom disk (the bigger one) to the remaining pole, and then move the top $n-1$ disks on top of the bigger disk. Each time we move the top $n-1$ disks to another pole it takes at least $T(n-1)$ moves (by definition of $T$), and the strategy just described uses exactly that many, hence $T(n) = 2T(n-1) + 1$. Now just notice that $B(n) = 2^n - 1$ satisfies the same recurrence relation and that $T(0) = B(0) = 0$. If you are having trouble understanding what I mean by "to solve the problem with $n$ disks, we first have to move the top $n-1$ disks to one of the two poles, move the bottom disk (the bigger one) to the remaining pole, and then move t...
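Here is a minimal Python sketch (mine, not from the original post) of the recursive strategy described above; counting its moves confirms the recurrence $T(n) = 2T(n-1) + 1$ and the closed form $2^n - 1$:

```python
def hanoi(n, source, spare, target):
    """Move n disks from source to target; return the number of moves used."""
    if n == 0:
        return 0
    moves = hanoi(n - 1, source, target, spare)    # clear the top n-1 disks
    print(f"move disk {n}: {source} -> {target}")  # move the biggest disk
    moves += 1
    moves += hanoi(n - 1, spare, source, target)   # stack the n-1 disks back on top
    return moves

# The count matches T(n) = 2**n - 1, e.g. 7 moves for 3 disks.
assert hanoi(3, "A", "B", "C") == 2**3 - 1
```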

Markov Decision Processes 03: optimal policy variation

This post is going to follow up on what was discussed here , and I will show you how changing the value of the discount factor may affect what the optimal policy is. For that, consider the MDP defined below. This MDP is inspired by the example I have been using; I added an extra state and stripped the MDP of most of its transitions so we can focus on what is essential here. From the MDP above there is a clear contender for the title of optimal policy: $$\begin{align} &\pi(H) = E\\ &\pi(T) = D\\ &\pi(TU) = D\end{align}$$ Assume that when we enter the MDP there is a $50\%$ chance we start Hungry and a $50\%$ chance we start Thirsty. If we use the policy $\pi$ then the expected reward is $$E[R | \pi] = 0.5E[R_T | \pi] + 0.5E[R_H | \pi]$$ where $E[R_s | \pi]$ is the expected reward we get by following policy $\pi$ starting from state $s$. From my previous post we know that $E[R_T | \pi] = E[R_H | \pi] = \frac{1}{1-\gamma}$ and hence $$E[R | \pi] ...
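The closed form $\frac{1}{1-\gamma}$ is just the geometric series $\sum_{t=0}^\infty \gamma^t$, assuming (as in the earlier posts of this series) that the policy collects a reward of $1$ at every step. Here is a small Python sketch (my own check, not from the post) comparing a truncated sum against the closed form for a few discount factors:

```python
# Numerical check of E[R_s | pi] = 1/(1 - gamma), under the assumption
# that following the policy yields a reward of 1 at every time step.
def discounted_return(gamma, horizon=10_000):
    """Truncated sum of gamma**t for t = 0..horizon-1."""
    return sum(gamma**t for t in range(horizon))

for gamma in (0.5, 0.9, 0.99):
    approx = discounted_return(gamma)
    exact = 1 / (1 - gamma)
    print(f"gamma={gamma}: truncated sum = {approx:.4f}, 1/(1-gamma) = {exact:.4f}")
```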