Machine Predicts the FIFA 2018 Outcome with Data Analytics

It’s time for the adrenaline rush. There are gonna be lots of fights, celebrations, and posts all over social media. It’s FIFA time, and this time it’s in Russia. The craze is back all over again. With such an ardent following, people predict their own outcomes with emotion as the base.

We are in the middle of the digital revolution, and probably in the pre-machine age. Predictions from octopuses, cows, crocodiles, parrots and ordinary humans no longer make the news. They are long past and forgotten. Now it’s time for machines and data to take the baton.

Past predictions

Until now, predictions have been made by bookmakers with some statistics and more emotion. Data collected from a team’s previous matches, along with recent stats for some top players, make up their data set. Indians sometimes rely on Vedic methods to predict the outcome. These may turn out true or false, but they are not based on logic. Still, an emotional mind can accept them.

The Rise of Data

Since then, data has seen an exponential rise. In the age of the Digital Revolution, the possibilities are endless. What if we put in all the data, build a model and predict an outcome backed by probability and numbers? After all, numbers are a sounder basis for a decision than emotions.

A research team of Andreas Groll and his colleagues from the Technical University of Dortmund in Germany came up with a potential model. It is based on conventional statistics, Machine Learning methods and, above all, proper data. They used the Random Forest approach to predict the winner.

Random Forest: the basics

“The outcome was predicted after simulating the entire tournament 100,000 times”

Basically, Random Forest is a machine learning algorithm, most often used for classification. I’ll put it simply here.

It’s like a forest, where there are many trees and each tree gives its own answer to your question. You choose the final answer based on the one most of the trees agree on.

Example: You want to go to a restaurant with a budget in mind. You ask a friend for a suggestion. He asks you some questions about your preferences, timing, budget range, number of people accompanying you, location etc. and gives you a suggestion. If you stop here, it’s a Decision Tree. If you ask more friends the same question, you will get multiple suggestions, and you put them to a vote and pick the restaurant suggested most often. This is how Random Forest works.
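For readers who like to see it in code, here is a minimal Python sketch of the restaurant example. It only illustrates the voting step: each "friend" is a small hand-written decision tree, whereas a real Random Forest learns its trees from random samples of the data. The restaurant names, the preference fields and the function names are all made up for illustration.

```python
from collections import Counter

# Hypothetical preferences for one diner; every field name here is illustrative.
preferences = {"budget": 25, "party_size": 4, "wants_spicy": True}

# Each "friend" is a tiny hand-written decision tree: a chain of if/else
# questions that ends in one restaurant suggestion.
def friend_a(p):
    if p["budget"] < 20:
        return "Noodle Bar"
    return "Curry House" if p["wants_spicy"] else "Bistro"

def friend_b(p):
    if p["party_size"] > 6:
        return "Pizza Hall"
    return "Curry House" if p["budget"] >= 15 else "Noodle Bar"

def friend_c(p):
    if p["wants_spicy"]:
        return "Curry House" if p["party_size"] <= 4 else "Pizza Hall"
    return "Bistro"

def friend_d(p):
    return "Bistro" if p["budget"] > 20 else "Noodle Bar"

def forest_vote(trees, p):
    """Ask every tree and return the most common answer plus the vote counts."""
    votes = Counter(tree(p) for tree in trees)
    winner, _ = votes.most_common(1)[0]
    return winner, votes

friends = [friend_a, friend_b, friend_c, friend_d]
choice, votes = forest_vote(friends, preferences)
print(choice, dict(votes))
```

Asking only `friend_a` would be the single Decision Tree case; passing the whole list to `forest_vote` is the forest.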



Let’s not get more technical.


The World Cup 2018 Model

The team used this particular algorithm to model the 2018 FIFA World Cup. The model covers each possible outcome for each team in each game, and the forest is constructed from those values.

They used a lot of external factors that influence the game, like GDP, FIFA ranking, individual player stats, strengths, bookmakers’ data, weather, team value, their property, age, health condition, the experience of players, coach, manager and team, medical issues, stamina etc.

They had to factor in those external data points and pick out the most relevant variables.

It’s more a matter of iterations and outcomes at different stages. The winner tends to differ each time. In the first few iterations, Spain had a higher chance of winning the 2018 cup than Germany.

They ran 100,000 simulations, and the final result differs: it shows that Germany has a better chance than Spain. Still, that rests on another scenario, wherein Germany makes it to the quarter-finals. It would then be in a pool of less dominant teams, which raises its win probability.
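To give a feel for what "simulating the tournament 100,000 times" means, here is a rough Python sketch. It is not the researchers' model: instead of a Random Forest, each match is decided by a simple made-up strength rating, and the tournament is a plain eight-team knockout. The team names, the ratings and the function names are all assumptions for illustration.

```python
import random
from collections import Counter

# Hypothetical strength ratings, invented for this sketch (not real team data).
STRENGTH = {
    "Team A": 9.0, "Team B": 8.5, "Team C": 7.0, "Team D": 6.5,
    "Team E": 6.0, "Team F": 5.0, "Team G": 4.0, "Team H": 3.0,
}

def play_match(a, b, rng):
    """Return the winner; a's win chance is its share of the combined strength."""
    p_a = STRENGTH[a] / (STRENGTH[a] + STRENGTH[b])
    return a if rng.random() < p_a else b

def play_knockout(teams, rng):
    """Play rounds of adjacent pairings until a single team remains."""
    while len(teams) > 1:
        teams = [play_match(teams[i], teams[i + 1], rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]

def simulate(bracket, runs, seed=0):
    """Repeat the tournament `runs` times and return each team's title share."""
    rng = random.Random(seed)
    titles = Counter(play_knockout(list(bracket), rng) for _ in range(runs))
    return {team: titles[team] / runs for team in bracket}

bracket = ["Team A", "Team H", "Team D", "Team E",
           "Team B", "Team G", "Team C", "Team F"]
title_odds = simulate(bracket, runs=100_000)
for team, share in sorted(title_odds.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {share:.1%}")
```

Any single run can throw up a surprise winner, which is why the winner differs from one iteration to the next. Only the share of titles over many runs settles into a stable estimate. Changing the order of `bracket` changes who meets whom, and with it the odds, much like the pool effect described above.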



The machine model suggests it’s Germany this time! Let’s wait and watch!

It’s Digital and Data that are setting trends now. Content at the atomic level is data. Data will survive all odds.