Sometimes you get carried away by all these fancy buzzwords and I am no stranger to that. However, I’m always aiming for feasible and executable solutions.

Big data, in my opinion, has definitely survived the buzzword hype, and we can witness a multitude of use cases, tools and benefits that this technology delivers.

However, when talking to experts in the field of airline operations, we often encounter the challenge that big data or sources for big data are simply not available.

To avoid misunderstandings: I’m not talking about the huge amount of data every aircraft is generating during a flight. I’m talking about process data: data from ground handlers, timestamps, information about process changes, etc.

When analysing available operational data from airlines, one usually has to concentrate on data sets with a very limited number of attributes. Scheduled times, actual times, number of passengers: in short, the basic data.

And the problem gets bigger, since each flight is operated no more than once per day. Accordingly, we have to rely on a maximum of 365 data sets per flight and year, each with a limited set of attributes.

Ultimately, this means that most big data approaches, with all their fancy tools and analytical methods, simply don’t work. The limited data sets available cannot satisfy the big data hunger.

As a result, airline operations are still bound to old-fashioned approaches in terms of data analysis and technological possibilities.

As the title of this blog post already gives away, I’m absolutely fascinated by a new possibility on the horizon which dramatically changes this situation.

A possibility that literally helps airlines to generate big data in the area of operations and creates a whole new source for analytics. No worries, you won’t have to connect a multitude of additional data sources or a vast number of suppliers.

I can see the question marks above your head: “What is he talking about? How does that work?”

The core idea is to create events out of data by applying rules to it. That means that very basic incoming data, such as off-block events, estimated times of arrival, ACARS messages, information from ground handlers, etc., is processed according to rules you define. And by doing so, you can create dozens of events out of one single data attribute.

What can these rules look like?
Here’s a basic one: An incoming ETA is compared to the STA. If the ETA is five minutes or more later than the STA, a “delay” event is generated.
Or a more complex one: An incoming ETA is compared to the STA. If the ETA is five minutes or more later and First Class passengers are on board, a “critical delay” event is generated.
I’m pretty sure you can already think of ten, twenty or hundreds of other rules. Simple and complex ones.
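
To make the two example rules a bit more tangible, here is a minimal sketch in plain Python. It is an illustration only, not the Kafka and Flink component mentioned further down. The names FlightUpdate, Event and apply_rules as well as the first_class_pax attribute are made up for this example, and I assume that “five minutes later” means five minutes or more.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass(frozen=True)
class FlightUpdate:
    """One piece of basic incoming data (hypothetical structure)."""
    flight: str
    sta: datetime          # scheduled time of arrival
    eta: datetime          # estimated time of arrival
    first_class_pax: int   # assumed attribute: First Class passengers on board
    received: datetime     # when the update came in


@dataclass(frozen=True)
class Event:
    flight: str
    kind: str
    created: datetime


# Assumption: a delay of five minutes or more triggers the rules.
DELAY_THRESHOLD = timedelta(minutes=5)

Rule = Callable[[FlightUpdate], Optional[Event]]


def delay_rule(u: FlightUpdate) -> Optional[Event]:
    """Basic rule: ETA is at least five minutes later than STA."""
    if u.eta - u.sta >= DELAY_THRESHOLD:
        return Event(u.flight, "delay", u.received)
    return None


def critical_delay_rule(u: FlightUpdate) -> Optional[Event]:
    """More complex rule: same delay, and First Class passengers are on board."""
    if u.eta - u.sta >= DELAY_THRESHOLD and u.first_class_pax > 0:
        return Event(u.flight, "critical delay", u.received)
    return None


def apply_rules(update: FlightUpdate, rules: List[Rule]) -> List[Event]:
    """Run every rule against one update and collect the generated events."""
    events = []
    for rule in rules:
        event = rule(update)
        if event is not None:
            events.append(event)
    return events
```

Calling apply_rules(update, [delay_rule, critical_delay_rule]) for a single ETA update can already return two events. Every further rule in the list is one more potential event from the same piece of data.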

Of course, that requires you to sit down and define the rules which are important to you. By doing so, you can create a huge amount of data for every single flight. Ultimately, this provides the possibility for much more detailed and sophisticated analysis.

And there are basically no limitations to the rules: You can include data from public sources such as weather, or information about the aircraft, maintenance aspects and so on.
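
As a rough illustration of a rule that pulls in an external source, the following Python sketch combines a delay check with a weather lookup. Everything here is an assumption for the sake of the example: the function name, the dictionary standing in for a public weather feed, the list of adverse conditions and the five-minute threshold.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumed set of conditions that we consider relevant for operations.
ADVERSE_CONDITIONS = {"thunderstorm", "snow", "fog"}


def weather_delay_rule(
    arrival_airport: str,
    sta: datetime,
    eta: datetime,
    weather: Dict[str, str],
    threshold: timedelta = timedelta(minutes=5),
) -> Optional[str]:
    """Return an event name if the flight is delayed and the weather at the
    arrival airport is adverse; otherwise return None.

    `weather` is a hypothetical lookup from airport code to current condition,
    standing in for whatever public weather source is actually used.
    """
    if eta - sta < threshold:
        return None
    condition = weather.get(arrival_airport)
    if condition in ADVERSE_CONDITIONS:
        return f"weather-related delay ({condition})"
    return None
```

The rule itself stays as simple as before. The only difference is that it looks at a second data source before deciding whether to generate an event.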

You can even take this one step further and create events out of events. Again, with defined rules. For example, an event that is generated when three critical delay events are generated within a timeframe of five minutes.
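
The “three critical delays within five minutes” example can be sketched with a small sliding window. Again, this is plain Python for illustration and not how a stream processor like Flink would express it. The class name BurstDetector and the derived event name “critical delay cluster” are invented, and I assume that events arrive in time order.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Deque, Optional


class BurstDetector:
    """Generates a derived event when `threshold` events of one kind
    occur within `window` (assumes events arrive in time order)."""

    def __init__(
        self,
        kind: str = "critical delay",
        threshold: int = 3,
        window: timedelta = timedelta(minutes=5),
    ) -> None:
        self.kind = kind
        self.threshold = threshold
        self.window = window
        self._times: Deque[datetime] = deque()

    def on_event(self, kind: str, at: datetime) -> Optional[str]:
        """Feed one event; return the derived event name or None."""
        if kind != self.kind:
            return None
        self._times.append(at)
        # Drop timestamps that have fallen out of the time window.
        while at - self._times[0] > self.window:
            self._times.popleft()
        if len(self._times) >= self.threshold:
            # Reset so that the same cluster is only reported once.
            self._times.clear()
            return "critical delay cluster"
        return None
```

Resetting the window after a hit is a design choice. Without it, a fourth critical delay shortly afterwards would immediately trigger a second derived event, which may or may not be what you want.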

From first test runs with a large airline and a very limited rule set of only 30 rules, we’ve seen that the amount of data can be increased by a factor of 20 to 30. That means up to 30 times more data to analyse. 30 times more data to detect problems. And 30 times more data to identify the root causes of these problems.

What’s needed for it? Actually, not very much, at least compared to other big data solutions. You do need a software component that generates the events and lets you define all the rules you need.

For the techies: We implemented this with common big data technologies by developing a component based on Kafka and Flink, written in Scala.

And yes, it takes effort. But on the other hand, the potential of this approach is so huge that it makes the effort absolutely worthwhile.

Personally, I really do think that this approach holds the potential to massively improve airline operations by providing new possibilities for analysis and for applying state-of-the-art technologies.


Benjamin Walther, CEO, Information Design
Get in touch with Benjamin directly: benjamin.office@id1.de or connect with him on LinkedIn