A Comparative Study, Applications, and Challenges of Big Data Analytics Technologies

Ghasemi Nejad, Yaser; Ketabchi, Seyed Abbasali

doi:10.52547/jstpi.20796.15.60.66

Article code: 13980218179163 Pages: 66 - 77

20.1001.1.17355486.1398.15.60.7.7

Article type: Research

Subject areas: Technology Management

Yaser Ghasemi Nejad ¹*, Seyed Abbasali Ketabchi ²

1 - Imam Hossein University
2 - Azad University, Central Tehran

Received: 1398/02/18 Accepted: 1398/06/13 Published: 1398/07/08

Keywords: big data technology, comparative study, frameworks, big data applications, challenges

Abstract:

Today, because information can be received and shared more easily and cheaply than before, organizations that adopt big data technology can handle large volumes of data arriving at high velocity and in great variety. Big data technology offers many opportunities, provided the problems associated with it are solved properly. Earlier data processing technologies are not suited to the quantities of data now being generated, whereas the frameworks proposed for big data applications help to store, analyze, and process such data. In this study, we first review the definitions and challenges of big data, and then study and compare a number of existing big data frameworks (Hadoop, Flink, Storm, Spark, and Samza). The frameworks studied fall broadly into two categories: (1) batch mode and (2) stream mode. Hadoop processes data in batch mode, while the other frameworks allow stream or real-time processing. Finally, the most important applications of big data analytics are described: healthcare applications, recommender systems, smart cities, and social network analysis. With the growth of Internet-connected devices, social network data is growing rapidly and increasingly calls for big data technology. The most important challenges in applying big data are confidentiality in storage systems, software deficiencies and the limitations of existing tools and hardware, the need for a large initial investment, and the lack of technical skills and expert workforce.
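
The distinction between batch mode and stream mode drawn in the abstract can be illustrated with a minimal sketch. It is written in plain Python and uses none of the frameworks named above; the function names `batch_word_count` and `stream_word_count` and the sample data are hypothetical, chosen only for illustration. In batch mode the whole bounded dataset is available before processing starts and a single result is produced at the end, whereas in stream mode records are consumed one at a time and an updated result is available after each record.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_word_count(records: list[str]) -> Counter:
    """Batch mode: process a complete, bounded dataset and return one final result."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
    return counts


def stream_word_count(records: Iterable[str]) -> Iterator[Counter]:
    """Stream mode: consume records as they arrive and emit a running result each time."""
    counts: Counter = Counter()
    for line in records:
        counts.update(line.split())
        yield counts.copy()  # snapshot of the result so far


if __name__ == "__main__":
    data = ["big data", "big frameworks"]  # illustrative input only
    print(batch_word_count(data))
    for snapshot in stream_word_count(iter(data)):
        print(snapshot)
```

Once the stream has been fully consumed, the last snapshot equals the batch result; the difference lies in when intermediate results become available, not in the final answer.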
