Monthly Archives: February 2018

A simple guide to Reinforcement Learning

Catch 5 game states

This is the first post in a series of articles on Reinforcement Learning, a subfield of Machine Learning that we have blogged about before.

Machine Learning is all about having a piece of software learn to solve tasks that it was not explicitly programmed for. Instead, it learns by observing data with the goal to find patterns that it can generalize into certain rules. In other words, instead of having to explicitly define rules using traditional programming constructs such as conditions, loops or function calls, we’ll let a machine figure out those rules in an automatic iterative process.

Most forms of Machine Learning still rely on human input though. In particular, humans are needed to select and provide lots of data for the machine to learn from. In contrast, Reinforcement Learning takes the idea another step forward. Instead of providing datasets, we let the machine take actions in the environment that we want it to master and give it feedback on its actions in the form of rewards or punishment. In other words, the machine has no clue about the task it is trying to solve. It basically starts acting randomly at first, but develops a policy to solve the task over time. Doesn’t that sound exciting?

A word of warning

This post aims to provide an easy entry for beginners who are new to Reinforcement Learning or even Machine Learning in general. We may oversimplify or sacrifice performance in certain areas to achieve this goal. We’ll also choose a simpler vocabulary over a more scientific one whenever possible.

In this article we want to learn the basics of Reinforcement Learning using a simple Q-Table approach. Don’t worry if that doesn’t ring a bell yet. We’ll learn what all of this means as we go.

We’ll be taking things further in follow-up articles in which we’ll do Deep Reinforcement Learning using a neural net instead of the Q-Table. Step by step we’ll learn about the explore-exploit dilemma, replay memory and many other exciting things. But let’s start simple!

Defining a simple game

As our main goal is to learn Reinforcement Learning, let’s keep the actual task that we want our machine to learn as simple as possible. In fact let’s just make up a simple math game ourselves! We’ll call it “Catch 5” and these are the rules:

  1. Every game starts by revealing a random number between 1 and 12 with the exception of the number 5 that will never be the starting point.

  2. From that given starting point, it is our goal to get to number 5 by simple addition/subtraction. However, we can only choose from the following six actions: Add 3, Add 2, Add 1, Subtract 1, Subtract 2 and Subtract 3

  3. We can only play a maximum of three turns. If we don’t get to 5 within three turns, we have lost the game. Since the starting number is between 1 and 12, it is always possible to win the game if we don’t screw up the basic math 🙂
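The claim that every starting number can be won is easy to check by brute force. Here is a minimal sketch that tries every sequence of up to three moves; the names `ACTIONS` and `min_turns_to_five` are made up for this illustration and are not part of the game's code.

```python
from itertools import product

# The six moves allowed by rule 2.
ACTIONS = (3, 2, 1, -1, -2, -3)

def min_turns_to_five(start, max_turns=3):
    """Return the fewest turns needed to reach 5, or None if it is impossible."""
    for turns in range(1, max_turns + 1):
        for moves in product(ACTIONS, repeat=turns):
            if start + sum(moves) == 5:
                return turns
    return None

# Every legal starting number: 1 to 12, excluding 5.
turns_needed = {start: min_turns_to_five(start)
                for start in range(1, 13) if start != 5}

for start, turns in turns_needed.items():
    print('start {:2d}: winnable in {} turn(s)'.format(start, turns))
```

Only the starting number 12 needs all three turns, because it is seven steps away from 5 and a single move covers at most three.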

Before we move on, let’s get familiar with the game itself. We’ve created a browser based version that you can play with right here:

Our game may turn out like this and we win in two turns.

Game: Starting number is 9. Catch 5!

Player: playing -3

Game: You played -3, now at 6. Catch 5!

Player: playing -1

Game: Caught 5! You won in 2 turns.

However, keep in mind that our machine has no idea about the rules of the game at first and basically just “sees” six buttons without any description on how to play. It doesn’t even know it’s playing a game, so it may act like the following and lose.

Game: Starting number is 9. Catch 5!

Player: playing +3

Game: You played +3, now at 12. Catch 5!

Player: playing -1

Game: You played -1, now at 11. Catch 5!

Player: playing -3

Game: You played -3 with your last move. Now at 8. You lost!

We didn’t manage to get to number 5 within three turns, which means we have lost the game. But how is our machine going to find out how to win the game if we don’t explicitly teach it how to play?

A Q-Table to the rescue

As we can see our game isn’t that complex and we may wonder what’s the point of such a silly toy task. It turns out that a simple task like this makes it very easy for us to explore the topic and enables us to visualize the states and their possible actions in simple graphics.

Notice that the 5 is a special state here as it is the final winning state from which no further action can be done.

Let’s ignore the fact that our rules would also allow us to reach numbers above 12 or below 1. Even if we allow these states to happen, they don’t change the principles of what we are about to learn.

Our main takeaway for now should be that we can think of our game as a defined space of states where we can take six possible actions from every single state.

The thing that we want our machine to learn is a mapping from every possible state to its best possible action. Let’s assume the best strategy to win this game would be to always move with the biggest possible step towards the number 5. We could visualize the entire state space and their best actions as seen in the following animation.

Catch 5 game states

For instance, starting at 12, the best move we could do is to play -3 which takes us to 9, from where we play -3 again, which takes us to 6, from where we play -1 and win the game in three turns.
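The "biggest possible step towards 5" strategy can be written down in a few lines. This is a hand-coded sketch of that one strategy, not something the machine has learned; the function names are hypothetical.

```python
def biggest_step_towards_five(number):
    """Clamp the distance to 5 into the allowed range of moves (-3 to +3)."""
    distance = 5 - number
    return max(-3, min(3, distance))

def greedy_path(start):
    """All numbers visited when we always take the biggest step towards 5."""
    path = [start]
    while path[-1] != 5:
        path.append(path[-1] + biggest_step_towards_five(path[-1]))
    return path

print(greedy_path(12))  # [12, 9, 6, 5]
print(greedy_path(1))   # [1, 4, 5]
```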

Obviously, our strategy isn’t the only one to win the game. Given a starting point of 9 for instance, we could play -3 followed by -1 or we could play -2 followed by -2 or even -1 followed by -3. Each of these pairs of actions would make us win the game in two turns. That is perfectly fine, in fact, we don’t know yet what kind of policy our machine will develop to figure out how to win this game.

The most important learning up to this point should be that our machine can develop a mapping of states to rated actions. Such a mapping is called a Q-Table.

For instance, looking at state 9 we can easily imagine that the machine develops a strong sympathy towards playing -3 indicated in red with the other possible actions having a less strong indication of being the best action.

Catch 5 game states

The complete Q-Table contains all possible states and their rated possible actions. Once the machine has developed such a mapping, it can simply lookup the current state in the table and perform the action with the highest rating. It repeats this process for each turn until it wins the game.
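To make the lookup idea tangible, here is a small sketch with a hand-written table for just three states. The ratings are invented for illustration and the helper names are hypothetical; a trained table would cover every state.

```python
# Position of each move inside a row of ratings.
ANSWERS = [3, 2, 1, -1, -2, -3]

# Made-up ratings, one row per state, one entry per move.
qtable = {
    9: [-1.0, -1.0, -1.0, 0.2, 0.5, 0.9],   # -3 is rated highest
    7: [-1.0, -0.5, 0.1, 0.3, 1.0, 0.4],    # -2 is rated highest
    6: [-1.0, -1.0, 0.1, 1.0, 0.2, 0.3],    # -1 is rated highest
}

def best_answer(state):
    """Look up the state and return the move with the highest rating."""
    ratings = qtable[state]
    best_index = max(range(len(ratings)), key=lambda i: ratings[i])
    return ANSWERS[best_index]

def play_from(state):
    """Repeat the lookup turn by turn until we reach 5."""
    path = [state]
    while state != 5:
        state += best_answer(state)
        path.append(state)
    return path

print(play_from(9))  # [9, 6, 5]
```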

Obviously, this approach only works if the entire state space is small enough to be represented as such a simple table. But let’s not get ahead of ourselves.

Ok, so we know we want our machine to develop such a Q-Table. But how does that work?

Creating the environment, the game

As we’ve spoiled in the beginning, Reinforcement Learning is about machines acting in an environment and receiving positive or negative rewards to eventually learn a policy to solve a task.

Let’s begin by creating the environment – the game – that our machine will be playing.

The game is already implemented in the demo above. Notice that the interactive demo is written in TypeScript and runs in the browser. For the rest of the article and the follow-up articles, we’ll be using Python since it has a great ecosystem for Machine Learning. For demos we’ll be using a platform that is well suited to these kinds of tasks.

Rest assured, the code is equally simple no matter whether we’ll be using TypeScript or Python!

import random

class Game():
  def __init__(self):
    self.reset()

  def reset(self):
    # Pick a starting number from 1 to 12 (randrange excludes the upper bound).
    self.current_number = random.randrange(1, 13)
    # 5 is the winning state, so it must never be the starting point.
    if (self.current_number == 5):
      self.reset()
    self.turns = 0

  def has_lost(self):
    return self.turns >= 3 and self.current_number != 5

  def has_won(self):
    return self.turns <= 3 and self.current_number == 5

  def is_active(self):
    return not self.has_lost() and not self.has_won()

  def play(self, action: int):
    # action is the signed move to apply, for example -3 or 2
    if (self.turns >= 3):
      raise Exception('Max number of turns reached. Call reset.')

    self.turns += 1
    self.current_number += int(action)

    return self.has_won()

There are really only two basic requirements that we have for the API of the game:

  • A play method that takes an action and applies it to the current state. This is the most important API as it will control the game and move us from one state to another.

  • has_won()/has_lost() methods to figure out if we have won or lost the game. These APIs are important to gather feedback to learn from.

We’ve also added some other APIs for convenience but the listed APIs are really the only crucial ones for our mission.

Building the agent

There are multiple actors in Reinforcement Learning. We’ve already built a simple game, which serves as the environment that we want our machine to act in.

In order to play the game we need an agent that will perform actions in the game with the intention to figure out how to win the game over time.

We’ll start with something really simple and implement two methods, namely play and play_and_train. The play method basically lets the agent play the game for a given number of times in a mode where we can follow the visual output.

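# answers indexed by action: (0, 1, 2, 3, 4, 5) -> (3, 2, 1, -1, -2, -3)
actionMap = [3, 2, 1, -1, -2, -3]

# creating random actions
def get_action(self):
  return random.randrange(0, 6)

# mapping actions (0, 1, 2, 3, 4, 5) to answers (3, 2, 1, -1, -2, -3)
def action_to_answer(self, action):
  return actionMap[action]

# assumes the agent keeps its Game instance in self.game
def play(self, num_times):
  for nb_game in range(1, num_times + 1):
    self.game.reset()
    print('Starting game #{nb_game}'.format(nb_game=nb_game))
    while (self.game.is_active()):
      print('Current number is {current_number}'.format(
        current_number=self.game.current_number))
      action = self.get_action()
      human_readable_answer = self.action_to_answer(action)
      print('Playing {answer}'.format(answer=human_readable_answer))
      self.game.play(human_readable_answer)

      if (self.game.has_won()):
        print('You won!')

      if (self.game.has_lost()):
        print('You lost')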

As you can see, there’s no rocket science behind it. A simple loop for the number of times we want our agent to play and an inner loop for the game itself to make turns as long as the game isn’t either won or lost.

Also notice that we are mapping the human readable answers (+3, +2..-2, -3) to zero based values (0, 1..4, 5) so that we can easily address them as indexes in an array to make our life much easier.

Since our agent is just making random moves, we shouldn’t expect super powers either. Here’s some output from it playing randomly. It even won game #4 by accident. Hooray! For this simple game though we’d really want our agent to win 100% of the games!

Starting game #1
Current number is 7
Playing 1
Current number is 8
Playing 2
Current number is 10
Playing 3
You lost
Starting game #2
Current number is 10
Playing 1
Current number is 11
Playing 1
Current number is 12
Playing -2
You lost
Starting game #3
Current number is 1
Playing 3
Current number is 4
Playing -2
Current number is 2
Playing -2
You lost
Starting game #4
Current number is 7
Playing -2
You won!

Ok, great, feel free to check out this embedded demo to familiarize yourself with the code at this point.

We’ll keep the play method to have an API that we can call that gives us visual feedback when the agent plays the game.

As we mentioned we’ll also implement a play_and_train method which will play in a headless mode, meaning we won’t be able to follow along the moves. Instead, we’ll get live metrics to follow along how the actual training is working out. For instance, we’d like to count the number of won or lost games.

Let’s take a look at how play_and_train is implemented.

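# assumes the agent keeps its Game instance in self.game
def play_and_train(self):
  stats = TrainStats()

  for epoch in range(1, self.config.nb_epoch + 1):
    self.game.reset()
    stats.epoch = epoch

    while (self.game.is_active()):

      state = self.game.current_number

      action = self.get_action()
      human_readable_answer = self.action_to_answer(action)

      self.game.play(human_readable_answer)

      reward = self.get_reward()
      next_state = self.game.current_number
      final = not self.game.is_active()
      self.train(state, action, reward, next_state, final)

      if (self.game.has_won()):
        stats.nb_wins += 1

      if (self.game.has_lost()):
        stats.nb_losses += 1

    stats.p_wins = 100 / epoch * stats.nb_wins
    stats.p_loss = 100 / epoch * stats.nb_losses

    if (epoch % self.config.print_every_n_epoch == 0):
      # same format as the sample stats line shown later in this post
      print('Epoch: {} Wins: {} ({:.2f}%) Losses: {} ({:.2f}%)'.format(
        stats.epoch, stats.nb_wins, stats.p_wins, stats.nb_losses, stats.p_loss))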

It’s really not that much different. Instead of printing out each and every move, we’ll collect stats to print out at a given frequency. Also notice that we slightly changed the wording: From now on, we’ll be calling the number of times that our agent plays the game epochs which is the general term for a complete training cycle.

Collecting the precious bits

The eagle-eyed reader may have spotted that there are in fact some subtle changes that are quite important. We collect five important variables that we are passing to a new train method on our agent.

Let’s take a closer look at what these things are:

  1. The state is just the current number of the game before we take our action. For instance, this may be the number 9.

  2. The action is simply the move that our agent performed on the state. For instance, playing -2 is an action in our game. Notice however that we internally represent these actions with the numbers 0 to 5 and just map them to their human readable values (3..-3) when we invoke play on the game.

  3. The next_state is the current number of the game after we took our action. If the state was 9 and the action was -2 the next_state will be 7.

  4. The final variable is True when the game is either won or lost and False when the game is still active.

  5. The most important variable that we haven’t yet talked about is the reward. At the end of the day, this is what enables our agent to learn at all. We’ll get the reward by calling self.get_reward() and we’ll take a look at the implementation in the next section.
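Here is a tiny worked example of the first four variables for a single move, using the example from the list above. The `step` helper is hypothetical and only mimics what the game does; the reward is left out because it is covered in the next section.

```python
ANSWERS = [3, 2, 1, -1, -2, -3]  # action index -> human readable move

def step(state, action, turns_played, max_turns=3):
    """Apply one action and return (next_state, final)."""
    next_state = state + ANSWERS[action]
    turns_played += 1
    won = next_state == 5
    lost = not won and turns_played >= max_turns
    return next_state, won or lost

state, action = 9, 4   # action index 4 means "play -2"
next_state, final = step(state, action, turns_played=0)
print(state, ANSWERS[action], next_state, final)  # 9 -2 7 False
```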

The reward function

As mentioned, the reward function is one of the most important things to design in a Reinforcement Learning system. It will heavily influence the policy that our agent will learn.

It’s out of the scope of this article to discuss this in detail. Fortunately designing the rewards should be pretty straightforward and easy to follow for our simple task.

Remember that this is the function we call after our agent performed an action.

def get_reward(self):
  if self.game.has_won():
    return 1
  elif self.game.has_lost():
    return -1
  else:
    return -0.1

As we can see, we return a positive reward of 1 when our agent won the game, a negative reward of -1 when it lost the game and a negative reward of -0.1 for every action that didn’t directly lead to winning or losing the game. In other words, every action is penalized with a slightly negative reward of -0.1. This makes sense if we keep in mind that it is our goal to win the game in a maximum of three turns. So even if we consider that we have to make at least one turn per game, turns can be considered costly overall which is essentially what we price in here.
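To see these rewards in action, we can replay the two sample games from the beginning of the post. The sketch below uses a standalone `reward_for` function (a hypothetical variant that takes the number and the turn count as arguments instead of asking the game object) so that it runs on its own.

```python
def reward_for(number, turns_played, max_turns=3):
    """Reward after a move: 1 for a win, -1 for a loss, -0.1 otherwise."""
    if number == 5:
        return 1
    if turns_played >= max_turns:
        return -1
    return -0.1

def rewards_for_game(start, moves):
    """Collect the reward handed out after each move of one game."""
    rewards, number = [], start
    for turns_played, move in enumerate(moves, start=1):
        number += move
        rewards.append(reward_for(number, turns_played))
    return rewards

print(rewards_for_game(9, [-3, -1]))     # [-0.1, 1]
print(rewards_for_game(9, [3, -1, -3]))  # [-0.1, -0.1, -1]
```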

Updating the Q-Table

Ok, so let’s see how we can build up this Q-Table that we talked about before. Remember we are calling self.train(state, action, reward, next_state, final) and by now we should have a clear understanding what each parameter represents.

Before we move on, let’s create a new instance member on our agent called qtable. We’ll initialize it as an empty hash map. Remember that we said the Q-Table basically maps states to rated actions. Naturally these things can well be represented as hash maps.

self.qtable = {}

We will also create a new method ensure_qtable_entry which takes care of creating entries in our qtable in case they don’t exist yet.

def ensure_qtable_entry(self, state):
  if state not in self.qtable:
    self.qtable[state] = np.zeros(6)

Notice that the key is the state itself and the value is a numpy array with six entries initialized to 0, each representing one of our six possible actions. If the term numpy array is new to you, just think of an array with a super powerful API.

With that in place, let’s unveil what happens inside the train method.

def train(self, state, action, reward, next_state, final):
  self.ensure_qtable_entry(state)
  self.ensure_qtable_entry(next_state)

  if final:
    q_value = reward
  else:
    next_state_actions = self.qtable[next_state]
    next_state_max = np.amax(next_state_actions)

    q_value = reward + self.config.discount_factor * next_state_max

  self.qtable[state][action] = q_value

The first two lines are just to ensure we have entries in the qtable for the values at state and next_state as we are about to work with them.

The real hot sauce is in the very last line of the method. Here we can clearly see that we are mutating the rating of the action that we took on the state. This makes perfect sense because, again, we want to build up a map of rated actions for each state so that the agent can lookup the best possible action for each state when it plays the game. We call this value the q_value.

This brings us to the question, how do we calculate the q_value? This is in fact a difficult question to answer because if we think back about our rewards we remember that only the final winning or losing state gives pretty clear rewards of +1 or -1. All the other moves that don’t immediately cause the game to be won or lost, such as playing -3 on a 9, just yield us a negative reward of -0.1.

We somehow have to find a way to consider that when we calculate the q_value. Remember that we want -3 to become an action with a high q-value for state 9. On the other hand we don’t want -3 to get a strong q-value for state 6 as clearly playing -1 would be better and make us win the game from here.

In other words, what we need is a formula that not only takes the immediate reward into account but also the rewards that are yet to follow if we choose the given action from here. We call that the discounted future reward.

The formula for that is hiding in this innocent-looking if / else block.

if final:
  q_value = reward
else:
  next_state_actions = self.qtable[next_state]
  next_state_max = np.amax(next_state_actions)

  q_value = reward + self.config.discount_factor * next_state_max

It says that if we reached a final state (won or lost) the q_value should simply be the reward. This makes a lot of sense because there is no future reward to expect from subsequent actions simply because there are no further actions possible from here.

If, however, we aren’t in a final state we do the following:

  1. We get the highest q-value for the next_state. Remember the next_state is the state that we shifted to as we applied the action to the state. In other words, if the state was 9 and the action was -3 the next_state is 6. We don’t care which action of the next_state has the highest q-value, we simply want to get the value to use it in our formula.

  2. We calculate the new q_value for the action of the state as the reward plus the highest q-value of the next_state multiplied by some mysterious discount_factor. Let’s ignore the discount_factor for a moment and just think of it as being set to 1. Replacing our variables with concrete numbers this may boil down to q_value = -0.1 + 1 * 1.

By now, you may be wondering: “But how does that achieve the learning? There’s no guarantee that the q-value for the next_state makes any sense?!”

You’re kinda right, the q-values won’t be perfect from the beginning. In fact, they may start out completely wrong. What this formula achieves though is that the values get more and more accurate with every iteration.

Getting back to the discount_factor, this thing is less scary than we may think. The discount_factor should be set between 0 and 1 and will influence how much we care about future rewards. A discount_factor of 0 means that we eliminate the right-hand side of our formula entirely and don’t price in any future rewards. On the other hand, a discount_factor of 1 means that we strongly care about future rewards. A value of 0.9 is usually a good start but since our task is completely deterministic we can also set it to 1.
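To get a feeling for the discount_factor, we can apply the formula backwards along the rewards of a game that is won in the third turn. This is a sketch with a hypothetical helper, and it assumes that the agent follows exactly this path so that the highest q-value of each next state is the one on the path.

```python
def discounted_return(rewards, discount_factor):
    """Work backwards from the last reward: value = reward + factor * value."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + discount_factor * total
    return total

# Two costly moves followed by the winning move.
three_turn_win = [-0.1, -0.1, 1]

for discount_factor in (0.0, 0.9, 1.0):
    value = discounted_return(three_turn_win, discount_factor)
    print(discount_factor, round(value, 4))
```

With a factor of 0 the first move is worth just its own reward of -0.1, with 0.9 it is worth 0.62 and with 1 it is worth 0.8.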

The nice thing about using such a terribly simple toy task is that we can make this process perfectly visible. We can simply print out the Q-Table at different stages of the learning.


This is what the Q-Table looks like after our agent played the game 10 times.

    # actions go from +3, +2, + 1, -1, -2, -3
    0: array([0., 0., 0., -0.1, 0., -1.]),
    1: array([0., 0., 0., -0.1, 0., 0.]),
    2: array([0., -0.1, -0.1, 0., -1., 0.]),
    3: array([0., 1., 0., -0.1, 0., -0.1]),
    4: array([0., 0.673289, 0., 0.89, 0., 0.]),
    5: array([0., 0., 0., 0., 0., 0.]),
    # it is in favor of playing -2 from here. Stupid machine!
    6: array([-1., 0., 0., 0., 0.7811, -0.1]),
    7: array([-1., -0.1, 0., 0., 0., 0.]),
    9: array([0., -0.1, -0.1, 0., 0., 0.]),
    10: array([0., -0.1, -0.1, 0., 0., -0.1]),
    11: array([0., -1., 0., 0., 0., 0.]),
    12: array([0., -0.1, 0., 0., 0., 0.]),
    13: array([0., 0., 0., 0., 0., 0.]),
    14: array([0., 0., 0., 0., 0., -1.]),
    -2: array([0., 0., 0., 0., 0., 0.]),
    -3: array([0., 0., 0., 0., 0., 0.]),
    -1: array([0., 0., 0., -1., 0., 0.])

Each line represents a state with the right-hand side being the array of q-values for each action starting at 3 and ending at -3. Notice how the machine “thinks” playing a -2 on a six would be the best move. We need to give it some more training cycles!

And after 100 games the Q-Table looks like this. By now, the agent has figured out how to play perfectly. Also notice that the entire Q-Table grew a bit as it figured out how to get to exotic states such as 20 or -7.

    # actions go from +3, +2, + 1, -1, -2, -3
    0: array([-1., 0.89, 0.7811, 0.7811, -1., -1.09]),
    1: array([0.89, 0.89, 0.89, 0.7811, -1., -1.]),
    2: array([1., 0.89, 0.89, -1., -1., 0.7811]),
    3: array([0.89, 1., 0.89, 0.89, -1., -1.]),
    4: array([0.89, 0.89, 1., 0.89, 0.89, 0.7811]),
    5: array([0., 0., 0., 0., 0., 0.]),
    # it figured out to play -1 from here
    6: array([-1., -1., 0.89, 1., 0.89, 0.89]),
    7: array([-1., 0.7811, 0.89, 0.89, 1., 0.89]),
    8: array([0.7811, 0.7811, 0.7811, 0.89, 0.89, 1.]),
    9: array([0.673289, -1., -1., 0.89, 0.89, 0.89]),
    10: array([0.673289, -1., -1., -1., -1., 0.89]),
    11: array([-0.199, -1., 0.56655611, 0.7811, 0.7811, 0.89]),
    12: array([-1.09, 0.673289, -1., -1., 0.7811, -1.]),
    13: array([-1.09, -1.09, -1., -1., -1., 0.7811]),
    14: array([-0.1, -1.09, -1.09, -1., -1., -1.]),
    15: array([-1., -1., -1., -1., -1., -1.]),
    16: array([-1., -1., -1., -1., -1., -1.]),
    17: array([-1., -1., -1., -1., -1., -1.]),
    18: array([0., 0., 0., 0., 0., 0.]),
    19: array([0., 0., 0., 0., 0., 0.]),
    20: array([0., 0., 0., 0., 0., 0.]),
    -2: array([0.7811, 0.7811, -1.09, -1., -1.09, -1.]),
    -8: array([0., 0., 0., 0., 0., 0.]),
    -7: array([0., 0., 0., 0., 0., 0.]),
    -6: array([0., 0., 0., 0., 0., 0.]),
    -5: array([-1., -1., -1., -1., -1., -1.]),
    -4: array([-1., -1., -1., -1., -1., -1.]),
    -3: array([-1., -1., -1., -1., -1., -1.]),
    -1: array([0.89, -1., -1., -1., -1., -1.09])

Notice that the Q-Table rightfully favors playing -1 on a 6 now. All the other moves seem to make sense as well.
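The recurring values in the table are no coincidence. The post doesn’t state which discount_factor was used for this run, but a value of 0.99 (an inference from the printed numbers, not a documented setting) reproduces them exactly when we apply the update formula step by step, starting from a winning move.

```python
DISCOUNT_FACTOR = 0.99  # assumption: inferred from the printed table
STEP_REWARD = -0.1

value = 1.0  # q-value of a move that wins immediately
chain = [value]
for _ in range(4):
    # one more move away from the win: reward + discount_factor * best next value
    value = STEP_REWARD + DISCOUNT_FACTOR * value
    chain.append(round(value, 8))

print(chain)  # [1.0, 0.89, 0.7811, 0.673289, 0.56655611]
```

Each entry is the rating of a move that is one step further away from the win, which is why 0.89, 0.7811 and 0.673289 show up all over the table.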

Playing with the trained agent

It’s really cool and handy that we are able to validate the learning simply by looking at the Q-Table. But of course we don’t want to stop here. We want to truly measure whether our agent wins every single game.

Notice that our agent explored the game entirely by making random moves so far. This is only possible because the entire state space of our task is very small. For real world tasks this strategy wouldn’t take us very far and we have to refine the approach, but that’s a story for another post.

Still, to measure the performance of our agent, we have to be able to control whether it chooses actions randomly or based on Q-Table lookups.

A simple condition would do the trick but we can do a little better and introduce a randomness_rate to make fine-grained adjustments so that, for instance, 70 % of the actions are chosen randomly and 30 % based on Q-Table lookups. This will already pave the way for other optimizations that we’ll be applying in a future post.

We just have to apply a tiny refactoring to our get_action method to pass in the state and then return an action randomly or based on a Q-Table lookup depending on the randomness_rate.

def get_action(self, state):
  if not self.should_go_random() and state in self.qtable:
    return self.predict_action(state)

  return self.get_random_action()

def should_go_random(self):
  return np.random.rand() <= self.randomness_rate

def get_random_action(self):
  return random.randrange(0, 6)

def predict_action(self, state):
  return np.argmax(self.qtable[state])

A randomness_rate of 0 means that all actions should be based on Q-Table lookups (unless the Q-Table is lacking the entry) whereas a value of 1 means that all actions should be chosen randomly. We can choose any value between 0 and 1 such as 0.3 so that 30 % of the actions are picked randomly.

With that in place we can first perform 100 epochs of training and then have the trained agent play 1000 games.

config = AgentConfig()
config.nb_epoch = 100
agent = Agent(config)
# train by playing 100 games
agent.play_and_train()

#play 1000 games on the trained agent
config.nb_epoch = 1000
agent.randomness_rate = 0
agent.play_and_train()

We can see that it really wins every single game out of 1000.

Epoch: 1000 Wins: 1000 (100.00%) Losses: 0 (0.00%)

We can also see that it’s not always going for the straight path that we would expect it to take.

Starting game #2

Current number is 10
Playing -1
Current number is 9
Playing -1
Current number is 8
Playing -3
You won!

However, this is just because it was trained on 100 randomly played games. We could increase the number of games the agent performs for the training or fine-tune our training strategy to fix that!

You can check out the final code of the agent playing perfectly in this embedded lab.

Play with the code, fork it and try solving other challenges!

Wrapping up

Phew! This was quite a long post. Congrats if you made it this far. We hope you had fun exploring the exciting field of Reinforcement Learning. As mentioned in the beginning, this is part one in a series of articles on Reinforcement Learning. Stay tuned!

Source:: Thoughtram

When to use CQRS?!

By Golo Roden


The formerly exotic architectural pattern CQRS is becoming increasingly suitable for the masses. However, too many developers still know the approach only from hearsay, and can hardly estimate whether it is worth the effort.

Until a few years ago, when searching for CQRS, one was asked by Google whether one might have meant the search term cars. In the course of time, this has developed into a joke that developers familiar with CQRS actually pronounce the acronym CQRS like cars. But what is CQRS anyway?

One of the simplest and most frequently cited explanations is that CQRS is in principle the same as the design pattern CQS, applied to the architectural level of an application. This comparison is quite correct, but for someone who is not yet familiar with CQRS, it is difficult to understand and therefore hardly helpful.

The fundamental question must therefore first of all be what the design pattern CQS actually is. Bertrand Meyer’s approach separates the methods of objects into two categories: Commands and queries. This is where the name comes from, because CQS stands for Command Query Separation.

Commands and queries

A command is a method that either changes the state of an object, has side-effects, or fulfills both criteria at the same time. However, a command deliberately does not return a value, which is why it cannot return any information about the state of an object.

A query, on the other hand, is a method that returns information about the state of an object, but must not influence this state or have any other side effects.

According to CQS, you should be able to classify each method of an object in exactly one of the categories. Methods that change the state and have side-effects, and at the same time return information about the state, should therefore be avoided.

At first glance, meeting the requirement seems trivial. Considering classic get and set methods, it is clear that some are queries and others are commands. In practice, however, there are more advanced methods that can no longer be assigned so easily.

For example, a method that saves a file and at the same time returns the number of bytes written would be questionable. Saving the file is a side effect, so it is a command. However, since it also returns the number of bytes written, this is also a query. How can this case be dealt with if the CQS principle is to be observed?

An interesting approach is suggested by Yegor Bugayenko in his book Elegant Objects: Instead of creating the method as outlined above, you should reduce it to a query that returns a dedicated object representing a one-time save:

// Query
const saver = file.getSaver();

// Command
saver.save();

// Query
const bytesWritten = saver.getBytesWritten();

This new object then has the actual method save, which is now a command, and the method getBytesWritten as query. In this way, the duality of the original method can be resolved into two separate methods as a command and query.
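What such a saver object might look like on the inside can be sketched in a few lines of Python. The class below is a hypothetical illustration, not code from the book, and it writes to an in-memory stream instead of a real file to stay self-contained.

```python
import io

class Saver:
    """Represents a one-time save of some content to a binary stream."""

    def __init__(self, content, stream):
        self._content = content
        self._stream = stream
        self._bytes_written = 0

    def save(self):
        """Command: has a side effect and returns nothing."""
        self._bytes_written = self._stream.write(self._content)

    def get_bytes_written(self):
        """Query: reports state and changes nothing."""
        return self._bytes_written

stream = io.BytesIO()
saver = Saver(b'hello CQS', stream)        # plays the role of file.getSaver()
result = saver.save()                      # command, so result is None
bytes_written = saver.get_bytes_written()  # query
print(result, bytes_written)  # None 9
```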

The reason why the procedure works in the described way is the separation of writing and reading, even in a process that supposedly does both at the same time.

Separating writing and reading

The CQRS design pattern raises the idea of separating writing and reading data from object to system level. This means, for example, that an application has not only one but two APIs to address it: While one API is used for writing data, the other is used for reading.

The separation does not necessarily have to be technical, but at least it should be thoughtfully planned. At first glance, this seems absurd and looks like unnecessary effort. In fact, however, the approach does offer some serious advantages.

A typical problem for applications that are subject to a high load is, for example, normalizing the database. For writing, a strictly normalized database is advantageous because writing operations can be carried out quickly and consistency guaranteed. At the same time, however, this brings with it massive reading problems, because a highly normalized database cannot be read out easily. Instead, it usually requires the use of numerous JOIN statements, which slow down reading dramatically.

On the other hand, if the database is optimized for the most efficient reading, a completely denormalized system should be aimed for. In this case, a single SELECT statement is sufficient for each read access, but writing becomes extremely time-consuming. The scenario also makes it extremely difficult and error-prone to guarantee consistency.

If, on the other hand, CQRS is used and the two aspects are separated on an architectural level, the problems go away. Then it is possible to work with two databases, one of which is normalized and responsible for writing, the other one denormalized and responsible for reading. This way, both writing and reading processes can be done optimally and with the best possible performance.

Thesis 1: CQRS is suitable for systems in which the number of writing and reading accesses differs greatly.
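The idea of a normalized write side and a denormalized read side can be illustrated with a small in-memory sketch. The customer and order domain as well as all class and method names are assumptions made up for this example; a real system would use actual databases.

```python
class WriteSide:
    """Normalized store: customers and orders live in separate tables."""

    def __init__(self):
        self.customers = {}  # customer_id -> name
        self.orders = {}     # order_id -> (customer_id, item)

    def add_customer(self, customer_id, name):
        self.customers[customer_id] = name

    def place_order(self, order_id, customer_id, item):
        self.orders[order_id] = (customer_id, item)

class ReadSide:
    """Denormalized store: one ready-to-display row per order."""

    def __init__(self):
        self.order_rows = {}

    def project(self, write_side):
        """Join orders with customers once, so that reads need no join."""
        self.order_rows = {
            order_id: {'customer': write_side.customers[customer_id], 'item': item}
            for order_id, (customer_id, item) in write_side.orders.items()
        }

    def get_order(self, order_id):
        return self.order_rows[order_id]

write_side = WriteSide()
write_side.add_customer(1, 'Ada')
write_side.place_order(100, 1, 'espresso')

read_side = ReadSide()
read_side.project(write_side)
print(read_side.get_order(100))
```

Writing touches small, separate tables, while reading is a single lookup because the join has already been done during the projection.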

In addition, the separate scaling of an application’s read/write side enables the application to be scaled in a way that it can be optimally adapted to the load of the respective situation as required.

Thesis 2: CQRS is suitable for systems whose read and write sides should be scaled individually.

Eventually consistent

However, this procedure means that the two databases must be synchronized. This in turn raises the question of the guarantees under which this is done. In particular, if the separation of writing and reading actually takes place with the help of physically different databases, it becomes clear that distributed transactions are probably not a very suitable means.

Therefore, in CQRS-based systems, the guaranteed consistency between the read and write sides is often given up in favor of availability: In case of doubt, it is better to get a response from the system, even if it may be slightly outdated, than none at all.

Of course, this does not apply to all scenarios. It is obvious that the approach is not appropriate, for example, for systems that affect people’s lives or health: guaranteed consistency is probably desirable in the case of laser eye surgery, a surgical intervention or the control of a nuclear power plant.

However, many other cases do well with a softer form of consistency. Real life also works in many places with this so-called eventual consistency, i.e. consistency that is only reached after some time: Whoever orders a drink in a café usually receives it before having to pay for it. This means that there is no transaction, which is why, from the café’s point of view, consistency is not guaranteed in the meantime.

Thesis 3: CQRS is suitable for systems where availability is more important than consistency and eventual consistency is not an exclusion criterion.

Asynchronous UIs

Taken to its logical conclusion, this approach means that commands sent to the application do not return anything. This is fully in line with the CQS principle, which stipulates that commands change the state and have side effects, but must not return information about the internal state. But what do you do with the results of the commands, which necessarily exist?

Of course, the user interface can use a query to check regularly whether a result exists, but such a pull-based procedure is cumbersome and time-consuming. It would be better to have a push notification that is delivered automatically as soon as a command has been processed. This is exactly what so-called events are for: they represent the reaction to a command.

Thesis 4: CQRS is suitable for systems that work with commands and (asynchronous) events to map the interaction with the user.

For the user interface, this means that a command is first sent in a fire-and-forget style and the UI then waits for the associated event. Whether you want to prevent the user from performing other tasks during this time is debatable. If you make the user wait, the UI stays in a consistent state, but the user’s patience is often strained unnecessarily.

Therefore, assuming that most of the commands are processed successfully anyway, you can let the UI work asynchronously: As soon as a command is delivered to the backend, only the receipt is acknowledged. The user can then continue working and even navigate to other parts of the application if necessary. The result of the command is then displayed asynchronously at a later time, if this is still relevant. This is often only relevant in the event of an error.
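The interplay of fire-and-forget commands and asynchronous events can be sketched with Python’s asyncio. This is only a rough illustration: the Command and Event classes, the backend coroutine and the names placeOrder and orderPlaced are assumptions made for this example, and two in-process queues stand in for whatever transport a real system would use between UI and backend.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Command:
    id: int
    name: str


@dataclass
class Event:
    command_id: int
    name: str


async def backend(commands: asyncio.Queue, events: asyncio.Queue) -> None:
    """Process commands one by one and publish an event for each of them."""
    while True:
        command = await commands.get()
        await asyncio.sleep(0.01)  # simulated processing time
        await events.put(Event(command_id=command.id, name="orderPlaced"))


async def main() -> list[str]:
    commands: asyncio.Queue = asyncio.Queue()
    events: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(backend(commands, events))
    log = []

    # Fire and forget: the UI only learns that the command was accepted.
    commands.put_nowait(Command(id=1, name="placeOrder"))
    log.append("command accepted")

    # The user is not blocked and can continue with other things.
    log.append("user keeps working")

    # Later, the result arrives as an event and can be displayed if relevant.
    event = await events.get()
    log.append(f"event received: {event.name}")

    worker.cancel()
    return log


log = asyncio.run(main())
```

The important point is the order of the log entries: the acknowledgement comes first, the user continues, and the actual result shows up only when the event arrives.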

Thesis 5: CQRS is suitable for systems whose graphical user interface can or should work asynchronously.

Another option for giving the user quick feedback is to fake the application’s response in the graphical user interface, i.e. to display the probable response right away. This is how most online shops work: they initially confirm receipt of the order and claim that it is now being processed and delivered. In fact, processing often starts only later, which the customer learns only in the event of an error, for example if the desired article is no longer in stock.

Collecting events

Although events were not originally part of CQRS, they are an excellent counterpart to commands. It is therefore advisable to collect these events in a database and use them as the starting point for changing the state. This principle is called event sourcing.

Thesis 6: CQRS is suitable for systems with a persistence layer based on event sourcing.

Event sourcing does not store the current state of the application, but the individual events that have led to it. The current state can then be restored at any later point in time via a replay. A database that stores such events and is optimized for the execution of replays is called an event store.
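A replay can be sketched in a few lines of Python. The example is deliberately simplistic and rests on assumptions: a plain list plays the role of the event store, and the account domain with its deposited and withdrawn events is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    name: str
    amount: int


def apply(balance: int, event: Event) -> int:
    """Return the new state that results from applying one event."""
    if event.name == "deposited":
        return balance + event.amount
    if event.name == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event: {event.name}")


def replay(events: list, upto: Optional[int] = None) -> int:
    """Rebuild the state from scratch by applying the first `upto` events."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


# The store only ever appends events; the balance itself is never saved.
event_store = [
    Event("deposited", 100),
    Event("withdrawn", 30),
    Event("deposited", 5),
]

current_balance = replay(event_store)          # state after all events
earlier_balance = replay(event_store, upto=2)  # state after the first two
```

Because the state is derived rather than stored, any earlier state can be reconstructed simply by stopping the replay sooner.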

The read database can also be filled from these events by semantically interpreting the individual events and mapping them to classic CRUD statements. Since the events contain domain semantics, they can be interpreted differently as required, so that different read tables can be generated from the same raw data.

Since the events do not describe the current state but the way it was reached, such interpretations can also be made after the fact, for example to answer questions that only arose over time. Provided that the semantics contained in the events permit the corresponding evaluation, this is possible without any problems.
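The following Python sketch shows how two different read tables might be derived from the same events. Everything in it is hypothetical: the event types orderPlaced and orderCancelled, their fields, and the two projection functions were chosen only to illustrate the idea, and dictionaries stand in for real read tables.

```python
from collections import Counter

# The same raw events feed every read model.
events = [
    {"type": "orderPlaced", "order": 1, "customer": "alice", "item": "espresso"},
    {"type": "orderPlaced", "order": 2, "customer": "bob", "item": "latte"},
    {"type": "orderCancelled", "order": 2},
    {"type": "orderPlaced", "order": 3, "customer": "alice", "item": "latte"},
]


def project_open_orders(events: list) -> dict:
    """Read table 1: open orders, built by mapping events to CRUD operations."""
    table = {}
    for event in events:
        if event["type"] == "orderPlaced":
            # corresponds to an INSERT
            table[event["order"]] = {
                "customer": event["customer"],
                "item": event["item"],
            }
        elif event["type"] == "orderCancelled":
            # corresponds to a DELETE
            table.pop(event["order"], None)
    return table


def project_cancellations_per_item(events: list) -> Counter:
    """Read table 2: a question that came up later, answered from old events."""
    item_of_order = {}
    cancellations = Counter()
    for event in events:
        if event["type"] == "orderPlaced":
            item_of_order[event["order"]] = event["item"]
        elif event["type"] == "orderCancelled":
            cancellations[item_of_order[event["order"]]] += 1
    return cancellations


open_orders = project_open_orders(events)
cancellations = project_cancellations_per_item(events)
```

The second projection works only because the cancellation was recorded as an event of its own. A table that merely stored the current orders would have lost that information.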

In addition, CQRS can also be perfectly combined with DDD (domain-driven design) as the command- and event-oriented approach fits in well with the concept that puts domain-oriented events at the forefront of software modeling. Of course, CQRS can also be used without event sourcing or DDD, just as these concepts work without CQRS. However, there is no denying that the three concepts complement each other very well.

Thesis 7: CQRS is suitable for systems that use DDD to model the underlying domain.

What about CRUD?

Occasionally, CQRS is also mentioned in connection with CRUD, but usually not as a suitable supplement, but as a contrast. Theoretically, the two approaches do not exclude each other, but in practice there is hardly any benefit from their combination: Since CQRS requires the separation of writing and reading, you work with two databases, or at least with two database schemas, which have to be synchronized, as already mentioned.

This is extremely difficult with pure CRUD, because CRUD provides no semantics for updating the read side. As described above, such semantics can be obtained via domain events, which can then be used both as feedback to the user interface and as data for the event store.

Nevertheless, there are numerous applications where pure CRUD is completely legitimate. This is the case, for example, if an application ultimately only does forms over data, i.e. does not contain any complex domain logic, but merely offers screens with which the raw data from the database can be edited.

Thesis 8: CQRS is suitable for systems whose domain logic is too complex for pure CRUD.


CQRS is an exciting architectural approach that demands an unusual way of handling data. The separation of writing and reading may be unfamiliar to most developers, but it makes sense in terms of scalability, modern asynchronous user interfaces, and the proximity to event sourcing and DDD.

Nevertheless, CQRS is not a silver bullet that solves all problems. CQRS is not particularly suitable for small applications that do not require a high degree of scalability and do not have complex domain logic. For applications that have a direct impact on life or health, CQRS is not suitable either, or only to a very limited extent. Other approaches may be preferable here.

However, CQRS is ideal for most web and cloud applications: here, scalability is often an essential requirement for the software. In addition, much more is usually read than written, which speaks for the individual scalability of both sides. If you add event sourcing and DDD to CQRS, you have an excellent basis for the development of modern web and cloud applications.


DevOps Automation with MongoDB Atlas

By Jay Gordon

Configuration Management

Configuration management tools such as Puppet, Chef and Ansible, which provide the ability to quickly automate config and deployment processes, have become a critical part of many engineering teams’ plans when building new systems. Implementing an additional cloud service should fit alongside the configuration management methods you already use. Luckily, the MongoDB Atlas API provides you with the ability to programmatically launch MongoDB clusters with your pre-existing toolset, ensuring a repeatable and reliable method that can be customized to your needs.

MongoDB Atlas API

The Atlas API follows the principles of the REST architectural style and exposes a number of internal resources which enable programmatic access to Atlas features. Instead of writing additional code for the aforementioned tools, you can call upon this HTTPS API with instructions for the MongoDB cluster you would like to use and a secure key for authentication. If you follow the documentation of your configuration management tool, you should be able to leverage a similar method to submit an HTTPS POST to launch a MongoDB Atlas Cluster.

To use the MongoDB Atlas API from your configuration management tool, you’ll first need to configure API access. This ensures a secure connection is always available between your configuration management server and the MongoDB API. Our documentation also shows you how to generate your API key and specify a whitelist of IP addresses that are permitted to modify your MongoDB Atlas clusters via your API key.

As shown in the screenshot above, MongoDB Atlas grants you the ability to disable or delete API keys as needed; you can also easily see when your API keys were last used.

Cluster Attributes

Let’s build our MongoDB Atlas M30 cluster named DataStore with 40 GB of disk, backups enabled, IOPS of 120, and 3 replica set members in total.

Items required for launching:

  • JSON file atlas.json

    {
      "name" : "DataStore",
      "numShards" : 1,
      "replicationFactor" : 3,
      "providerSettings" : {
        "providerName" : "AWS",
        "regionName" : "US_EAST_1",
        "instanceSizeName" : "M30",
        "diskIOPS" : 120,
        "encryptEBSVolume" : false
      },
      "diskSizeGB" : 40,
      "backupEnabled" : true
    }
  • My API Key
  • My Atlas account username (jay.gordon)
  • My Group ID (Found by going to Settings -> Group Settings at the top of the screen)
  • My AWS server with SSH key to permit ansible to log in
  • An ansible “hosts” file with our inventory
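If you would rather generate atlas.json than write it by hand, a small Python helper could look like the sketch below. The function build_cluster_payload and its parameters are hypothetical. Only the field names and values are taken from the JSON file described above, with diskSizeGB and backupEnabled placed at the top level as in the API response shown later in this post.

```python
import json


def build_cluster_payload(
    name: str,
    instance_size: str = "M30",
    disk_size_gb: int = 40,
    disk_iops: int = 120,
    replication_factor: int = 3,
    backup_enabled: bool = True,
) -> dict:
    """Assemble the cluster description with the attributes used in this post."""
    return {
        "name": name,
        "numShards": 1,
        "replicationFactor": replication_factor,
        "providerSettings": {
            "providerName": "AWS",
            "regionName": "US_EAST_1",
            "instanceSizeName": instance_size,
            "diskIOPS": disk_iops,
            "encryptEBSVolume": False,
        },
        "diskSizeGB": disk_size_gb,
        "backupEnabled": backup_enabled,
    }


payload = build_cluster_payload("DataStore")

# Serialize to the text that would go into atlas.json.
atlas_json = json.dumps(payload, indent=2)
```

Writing atlas_json to a file named atlas.json then gives you the payload that the curl call and the playbook in this post read from disk.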

In this example I’ll use a simple curl from my local computer. I provided the API with some basic info:

bash-3.2$ curl -i -u "jay.gordon:$APIKEY" --digest -H "Content-Type: application/json" -X POST \
 "" --data @atlas.json

In this situation, I’ve used a standard HTTPS curl POST with my JSON payload containing the settings I want for my cluster.

Launch a MongoDB Atlas Cluster with Ansible

Ansible allows you to execute complex playbooks from your local desktop computer; we’ll use it in this example to launch our MongoDB Atlas cluster.

The Ansible uri module can be used to interact with the MongoDB HTTPS API along with the created secure key.

The Ansible documentation for uri provides an example on how to generate a new JIRA ticket via HTTPS post:

- name: Create a JIRA issue
  uri:
    method: POST
    user: your_username
    password: your_pass
    body: "{{ lookup('file','issue.json') }}"
    force_basic_auth: yes
    status_code: 201
    body_format: json

This is exactly the same kind of method we can use with the MongoDB Atlas API to easily build a small playbook for any new Atlas clusters we need.

- hosts: webapp
  remote_user: ec2-user
  become: true
  tasks:
    - name: pip httplib2
      # ansible uri module requires httplib2
      pip: name=httplib2 extra_args="--user"
    - name: setup atlas
      uri:
        url:{{ groupid }}/clusters/
        method: POST
        user: "{{ user }}"
        password: "{{ apikey }}"
        body: "{{ lookup('file','atlas.json') }}"
        body_format: json
        HEADER_Content-Type: "application/json"
        status_code: 201

I’ve created a basic playbook which will do the following:

  1. Log in to your AWS instance and install httplib2, a library required on our Amazon Linux server to use the uri module in Ansible.
  2. Gather the attributes for our requested cluster from the atlas.json file and send the payload in JSON format to the API.
  3. Begin building the cluster within your account.

To execute the command and begin launching your cluster, you can do the following from your command line terminal window:

ansible-playbook -v create-atlas.yml

This will begin the process of installing the required software and making the API call to launch your Atlas cluster. Ansible will notify you that the process is completed by giving you a green “201” status code.

In the example output below, we can see the API confirming our requirements:

TASK [setup atlas] 
ok: [] => {"changed": false, "content_length": "510", "content_type": "application/json", "date": "Wed, 19 Apr 2017 13:15:03 GMT", "json": {"backupEnabled": true, "diskSizeGB": 40.0, "groupId": "588b776f96e82110b163ed93", "links": [{"href": "", "rel": "self"}], "mongoDBMajorVersion": "3.2", "mongoDBVersion": "3.2.12", "mongoURIUpdated": "2017-04-19T13:15:03Z", "name": "DataStore1", "numShards": 1, "providerSettings": {"diskIOPS": 120, "encryptEBSVolume": false, "instanceSizeName": "M30", "providerName": "AWS", "regionName": "US_EAST_1"}, "replicationFactor": 3, "stateName": "CREATING"}, "redirected": false, "status": 201, "strict_transport_security": "max-age=300"}
 *********************************************************************             : ok=3    changed=0    unreachable=0    failed=0

Once the process of creating your cluster is completed, you can add the connection string to your application and begin working with your database.
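If you want a script to wait for the cluster before continuing, a simple polling loop is one option. The sketch below is an assumption-laden illustration: wait_until_ready and fetch_state are hypothetical names, and fetch_state stands for whatever call you use to read the stateName field of your cluster. The only state value taken from this post is "CREATING"; the loop treats any other value as "no longer being created".

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_state: Callable[[], str],
    interval: float = 30.0,
    max_attempts: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the cluster leaves the CREATING state or attempts run out."""
    for _ in range(max_attempts):
        if fetch_state() != "CREATING":
            return True
        sleep(interval)
    return False


# Demonstration with canned states instead of real API calls.
# "IDLE" is only a placeholder for any state other than CREATING.
fake_states = iter(["CREATING", "CREATING", "IDLE"])
ready = wait_until_ready(
    lambda: next(fake_states),
    interval=0,
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
```

Passing the state lookup in as a function keeps the loop independent of how you talk to the API, and it makes the waiting logic easy to try out without a real cluster.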

Get Started Today

Thanks to the MongoDB Atlas API and Ansible, we’ve avoided having to write additional code to build our three-node replica set. You can start working today by going to this GitHub repository and cloning the basic playbook I used in this example. Simply insert these details into your existing Ansible playbook or build a brand new one.


Code Challenge #4: Build An Eye-Tracking Alien with JavaScript

By William Imoh

Hi and welcome to our Code Challenge #4. This week on the challenge we shall be delving into mouse tracking with JavaScript and simple animations with CSS.

The Challenge

The challenge is basically to build this Alien from Mars. We were browsing CodePen and saw this awesome CSS alien built by Eduardo Sada. View the original CodePen here:

We’ve taken the CSS alien and added JavaScript. Notice the cool thing about our alien when you move your cursor around?

The alien’s eye moves and always tracks your mouse cursor.

Try to implement that feature using JavaScript. You can use CodePen or any similar tool to try the challenge, and be sure to share the link with us.


Complete this awesome challenge using JavaScript.

  • Use JavaScript to make the alien’s eye follow your mouse cursor
  • You shouldn’t have to write any extra HTML/CSS for the mouse-tracking. Only JavaScript for this part


Our alien blinks occasionally; you could implement that as well. For this part, you will need to add a CSS animation.


Check out the base code for the HTML and CSS without the eye tracking effect and the blinking.

Feel free to fork this base CodePen as your starting point for this challenge.


  • Join the Slack Group to Interact with others in the Slack Channel #codechallenge.
  • Post your solutions and issues on Twitter with the hashtag #scotchchallenge
  • Post your solutions in the comments.


The solution to this challenge will be released on Monday. Happy coding!