Build realistic artificial intelligence models

Oren Atia
3 min read · Jul 6, 2020

In recent years, we have witnessed a transformation in data science, driven mainly by two developments: powerful computing resources have become accessible to all, and the Internet and mobile revolution has left behind large amounts of digital footprints.

This matters because more and more processes and business workflows are being digitised. As a result, many people, especially in the Western world, leave personalised digital trails. The ability to run mathematical and probabilistic calculations on readily available computing resources has revived the world of statistics and probability. It shattered the fixed frames (a limited number of templates) and made possible calculations that could not have been performed had it not been for artificial intelligence. Data has become the gold of today.

Big companies are building their business plans on data aggregation in order to predict the future, just as professional data scientists try to do. But this approach receives a great deal of opposition, since a prediction is correct and accurate only as long as reality has not changed, and the world is dynamic and generates unexpected events (and data). On the other hand, it can be said that there is almost no event that has not taken place in one form or another in the past. The Coronavirus (COVID-19) events that surprised us all have also been here before in various forms: there have been economic crises, and there have been global epidemics. So how do you do it? How can you calculate a prediction in a dynamic world, where change can affect the outcome, without falling into the trap of yesterday's data?

My suggestions to the data scientist:

1. Keep the model up to date with reality. Everyone who deals with data science knows that the most important thing is the quality of the training data. Preparing a “blind” model that does not “breathe” is a fundamental mistake and a detachment from reality. For the data that was fed into the model, and on which it was built, we must ask ourselves the basic question: is this data affected by a new reality? And if so, how do I update the model so that it stays relevant to that new reality? Today this process still seems to require human involvement. For example, suppose you want to predict road accidents, and one of the important inputs is the road itself. Given a particular road under certain conditions (e.g., high speed combined with bad weather), an accident can be predicted. Now suppose that road has undergone a substantial change (e.g., a reduced speed limit or an increased number of lanes). Obviously, from that point on the model is at best inaccurate, and sometimes irrelevant.

2. Build several models that each tell a different story. Building models that each “hold” a different story is a good answer to a changing reality. For example, build an optimistic model alongside a pessimistic one for the same research question. It is important to make clear to those who commissioned the prediction work that realistic cases may arise that change the model and its predictive power. And how do you do it? For example, if we produce an economic forecast model for next year, two models can be built. One is an optimistic scenario model, in which a coronavirus vaccine or drug has been found. The second is a pessimistic scenario model, in which business is still affected because the coronavirus is still here. After all, we have examples of past data for both scenarios. We may need models that differ in their field structure (as opposed to models that share the same structure but differ in the data itself).
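Both suggestions can be sketched in a few lines of Python. What follows is only a minimal illustration, not a recommended method: the function names (`has_drifted`, `fit_line`, `predict`), the two-standard-deviation threshold, and every number in it are made up for the example. The first part checks whether recent data on a road still looks like the data the model was trained on; the second part fits one simple trend model per scenario and reports both forecasts side by side.

```python
from statistics import mean, stdev


def has_drifted(training_values, recent_values, threshold=2.0):
    """Hypothetical drift check: flag the feature when the recent mean lies
    more than `threshold` training standard deviations from the training mean.
    The threshold of 2.0 is an arbitrary assumption for this sketch."""
    mu = mean(training_values)
    sigma = stdev(training_values)
    if sigma == 0:
        return mean(recent_values) != mu
    return abs(mean(recent_values) - mu) / sigma > threshold


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(model, x):
    intercept, slope = model
    return intercept + slope * x


# Suggestion 1: has the road changed since the model was trained?
# Made-up average speeds (km/h) before and after a speed-limit reduction.
speeds_before = [92, 88, 95, 90, 91, 89, 94]
speeds_after = [71, 69, 73, 70]
if has_drifted(speeds_before, speeds_after):
    print("Speed data has shifted: review and retrain the accident model.")

# Suggestion 2: one model per story, for the same research question.
# Made-up quarterly business index taken from two kinds of past periods.
quarters = [1, 2, 3, 4]
history = {
    "optimistic": [100, 104, 108, 112],   # recovery-like past period
    "pessimistic": [100, 97, 94, 91],     # crisis-like past period
}
models = {name: fit_line(quarters, ys) for name, ys in history.items()}
forecasts = {name: predict(model, 5) for name, model in models.items()}
for name, value in forecasts.items():
    print(f"{name} forecast for quarter 5: {value:.1f}")
```

A straight-line trend is used here only because it is easy to read. In practice each scenario could get whatever model type fits it, and the two models need not share the same input fields.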

In summary, in my opinion, the clear conclusion is that data collection is essential, and probably the most critical part of building predictive artificial intelligence models and keeping them up to date. Adjusting the model to reality is a vital step. When you set out to predict the future for a research question, you should consider building multiple models that account for a dynamic and evolving reality.

Oren Atia

Husband | Father | Data Scientist with passion | Blogger | Bassist | Dreaming & trying to apply | Copy-paste Python programmer