Some details
Challenge
Although the Customer already had a robust analytical system, they believed that it would not be able to satisfy the company’s future needs. The new system had to cope with the continuously growing amount of data, analyze big data faster, and provide insights into media consumption patterns based on the analytics results.
For the new analytical system, the following frameworks were selected:
- Apache Hadoop – for data storage,
- Apache Hive – for data aggregation, query and analysis,
- Apache Spark – for data processing.
Amazon Web Services and Microsoft Azure were selected as cloud computing platforms.
Overall, the solution included five main modules:
- Data preparation
- Staging
- Data warehouse 1
- Data warehouse 2
- Desktop application
Results
At the project closing stage, the new system was able to process several queries up to 100 times faster than the outdated solution. The analysis of almost 30,000 attributes gave the Customer valuable insights into media consumption patterns worldwide and a clear picture of different markets.