How to create big data in airline operations when only a limited set of data is available

Sometimes you get carried away by all these fancy buzzwords and I am no stranger to that. However, I’m always aiming for feasible and executable solutions.

Big data, in my opinion, has definitely survived the buzzword hype, and we can now see a multitude of use cases, tools, and benefits coming out of this technology.

However, when talking to experts in the field of airline operations, we often encounter the challenge that big data or sources for big data are simply not available.

To avoid misunderstandings: I’m not talking about the huge amount of data every aircraft is generating during a flight. I’m talking about process data: data from ground handlers, timestamps, information about process changes, etc.

When analyzing available operational data from airlines, one usually has to concentrate on data sets with a very limited number of attributes: scheduled times, actual times, number of passengers. In other words, just the basic data.

And the problem gets bigger since a given flight is usually operated no more than once per day. Accordingly, we have to rely on a maximum of 365 data sets per flight and year, each with a limited set of attributes.

Ultimately, this means that most big data approaches, all the fancy tools, and many analytical methods simply don’t work. The limited data sets that are available don’t satisfy the big data hunger.

As a result, airline operations are still bound to old-fashioned approaches in terms of data analysis and technological possibilities.

As the title of this blog post already gives away, I’m absolutely fascinated by a new possibility on the horizon that dramatically changes this situation.

A possibility that literally helps airlines to generate big data in the area of operations and creates a whole new source for analytics. No worries, you won’t have to connect a multitude of additional data sources or a vast number of suppliers.

I can see the question marks above your head: “What is he talking about? How does that work?”

The core idea is to create events out of data by applying rules to it. That means that very basic incoming data, such as off-block events, estimated times of arrival, ACARS messages, information from the ground handler, etc., is processed according to rules you define. And by doing so, you can create dozens of events out of one single data attribute.

What can these rules look like?
Here’s a basic one: An incoming ETA is compared to the STA. If the ETA is five minutes later than the STA, a “delay” event is generated.
Or a more complex one: An incoming ETA is compared to the STA. If the ETA is five minutes later and First Class passengers are on board, a “critical delay” event is generated.
I’m pretty sure you can already think of ten, twenty, or hundreds of other rules. Simple and complex ones.
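
To make the two example rules a bit more tangible, here is a minimal sketch in plain Python. It is only an illustration, not the component described later in this post: the names `FlightUpdate`, `Event`, `delay_rule`, `critical_delay_rule`, and `apply_rules` are made up for this example, and I assume that "five minutes later" means "at least five minutes later".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass(frozen=True)
class FlightUpdate:
    """One incoming data record (hypothetical structure)."""
    flight: str
    sta: datetime           # scheduled time of arrival
    eta: datetime           # estimated time of arrival
    received_at: datetime   # when the ETA update came in
    first_class_pax: int = 0


@dataclass(frozen=True)
class Event:
    kind: str
    flight: str
    at: datetime


# Assumption: "five minutes later" is read as "at least five minutes".
DELAY_THRESHOLD = timedelta(minutes=5)

# A rule takes one update and returns an event, or None if it does not fire.
Rule = Callable[[FlightUpdate], Optional[Event]]


def delay_rule(u: FlightUpdate) -> Optional[Event]:
    if u.eta - u.sta >= DELAY_THRESHOLD:
        return Event("delay", u.flight, u.received_at)
    return None


def critical_delay_rule(u: FlightUpdate) -> Optional[Event]:
    if u.eta - u.sta >= DELAY_THRESHOLD and u.first_class_pax > 0:
        return Event("critical delay", u.flight, u.received_at)
    return None


def apply_rules(u: FlightUpdate, rules: List[Rule]) -> List[Event]:
    """Run every rule against one update and collect the events that fire."""
    events = []
    for rule in rules:
        event = rule(u)
        if event is not None:
            events.append(event)
    return events


update = FlightUpdate(
    flight="XY123",
    sta=datetime(2024, 1, 1, 10, 0),
    eta=datetime(2024, 1, 1, 10, 7),
    received_at=datetime(2024, 1, 1, 9, 30),
    first_class_pax=2,
)
for event in apply_rules(update, [delay_rule, critical_delay_rule]):
    print(event.kind)
```

One single ETA update produces two events here. Every additional rule in the list can add another one, which is exactly how the amount of data grows.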

Of course, that requires sitting down and defining the rules which are important to you. By doing so you can create a huge amount of data for every single flight. Ultimately, this provides the possibility for a much more detailed and sophisticated analysis.

And there are basically no limitations to the rules: You can include data from public sources such as weather, or include information about the aircraft, maintenance aspects, and so on.

You can even take this one step further and create events out of events. Again, with defined rules. For example, an event is generated when three critical delay events are generated within a timeframe of five minutes.
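
The "three critical delay events within five minutes" rule can be sketched as a small sliding window. Again, this is plain Python for illustration only, with made-up names (`Event`, `CriticalDelayCluster`, the event kind "critical delay cluster"), and it assumes that events arrive in time order. A stream processing engine would handle windowing and out-of-order data for you.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Deque, Optional


@dataclass(frozen=True)
class Event:
    kind: str
    flight: str
    at: datetime


class CriticalDelayCluster:
    """Emits a derived event when `count` critical delays fall within `window`.

    Assumption: events are passed in ascending time order.
    """

    def __init__(self, count: int = 3, window: timedelta = timedelta(minutes=5)):
        self.count = count
        self.window = window
        self._recent: Deque[datetime] = deque()

    def on_event(self, event: Event) -> Optional[Event]:
        if event.kind != "critical delay":
            return None
        self._recent.append(event.at)
        # Drop timestamps that are older than the window.
        while event.at - self._recent[0] > self.window:
            self._recent.popleft()
        if len(self._recent) >= self.count:
            # Reset so the same events do not trigger a second cluster.
            self._recent.clear()
            # "(multiple)" is a placeholder: the derived event spans flights.
            return Event("critical delay cluster", "(multiple)", event.at)
        return None


detector = CriticalDelayCluster()
start = datetime(2024, 1, 1, 10, 0)
for flight, minutes in [("XY1", 0), ("XY2", 2), ("XY3", 4)]:
    derived = detector.on_event(
        Event("critical delay", flight, start + timedelta(minutes=minutes))
    )
    if derived is not None:
        print(derived.kind, "at", derived.at)
```

The derived event is itself just another event, so it can feed further rules in the same way.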

In the first test runs with a large airline and a very limited set of only 30 rules, we’ve seen that the amount of data can be increased by a factor of 20 to 30. That means up to 30 times more data to analyze. Up to 30 times more data to detect problems. And up to 30 times more data to identify the root causes of these problems.

What’s needed for it? Actually, not very much, at least compared to other big data solutions. Yes, you need a software component that generates the events and lets you define all the rules you need.

For the techies: We implemented this with common big data technologies by developing a component based on Kafka and Flink, written in Scala.

And yes, it takes some effort. But on the other hand, the potential of this approach is so unbelievably huge that it makes the effort absolutely reasonable.

Personally, I really do think that this approach holds the potential to massively improve airline operations by providing new possibilities for analysis and for applying state-of-the-art technologies.

About the author

Benjamin Walther

CEO, Frankfurt

Benjamin is Information Design's CEO and a proven content-maniac. Besides running a successful business and developing pioneering ideas, he's dedicated to writing blog posts and creating content.
