Comparative Study, Applications and Challenges of Big Data Analysis Technologies
Subject Areas: Technology Management
Yaser Ghasemi nejad 1 *, Abbass Ketabchi 2
1 -
2 -
Keywords: Big Data Technology, Comparative Analysis of Frameworks, The Applications of Big Data, Challenges
Abstract:
Today, receiving and sharing information is easier and cheaper than ever, and organizations must handle data of high volume, velocity, and variety, collectively known as big data. Big data technology offers many opportunities when its problems are addressed correctly. Traditional data processing technologies are not suited to the large quantities of data now being generated, whereas the frameworks proposed for big data applications help to store, analyze, and process such data. In this study, we first review and summarize the definitions of big data and the challenges of using it, and then study and compare several important big data frameworks (Hadoop, Flink, Storm, Spark, and Samza). The frameworks studied fall into two general categories: (1) batch mode and (2) stream mode. The Hadoop framework processes data in batch mode, while the other frameworks allow stream or real-time processing. Finally, the most important applications of big data technology are described: healthcare applications, recommender systems, smart cities, and social network analysis. Owing to the growth of Internet-connected devices, social networking data is growing rapidly and increasingly requires big data technology. The main challenges of big data applications include confidentiality in storage systems, software deficiencies, the limitations of existing hardware and equipment, the need for a large initial investment, and the lack of technical skills and an expert workforce.
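The distinction between batch mode and stream mode can be illustrated with a minimal sketch in plain Python. It does not use the API of Hadoop, Flink, Storm, Spark, or Samza; the function names and the sample data are hypothetical and chosen only for illustration. The batch function sees the complete dataset before producing one result, while the stream function keeps running state and emits an updated result for each record as it arrives.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_word_count(records: list[str]) -> Counter:
    """Batch mode: the whole dataset is available before processing starts."""
    counts = Counter()
    for line in records:
        counts.update(line.split())
    return counts  # a single result, produced after all input is read


def stream_word_count(records: Iterable[str]) -> Iterator[Counter]:
    """Stream mode: update running state and emit a result per arriving record."""
    counts = Counter()
    for line in records:
        counts.update(line.split())
        yield Counter(counts)  # snapshot of the running totals so far


# Hypothetical sample input, used only for illustration.
data = ["big data", "big frameworks"]
final = batch_word_count(data)
snapshots = list(stream_word_count(iter(data)))
```

In this sketch the last streaming snapshot equals the batch result; the difference lies in when results become available, not in what is eventually computed.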