Resource Usage Characterization for Social Network Analytics on Spark
Authors: Irene Manotas (University of Delaware), Rui Zhang (IBM Corporation), Min Li (IBM Corporation), Renu Tewari (IBM Corporation), Dean Hildebrand (IBM Corporation)
Abstract: Platforms for big data analytics such as Hadoop, Spark, and Storm have gained considerable attention for their easy-to-use programming models, scalability, and performance characteristics.
Along with the wide adoption of these big data platforms, Online Social Networks (OSN) have become one of the major sources of information, given the large amount of data generated on a daily basis by online communities such as Twitter and Facebook. However, current benchmarks neither evaluate big data platforms with workloads targeted at OSN analytics nor use OSN data as input. Hence, there are no studies that characterize the resource utilization of algorithms used for OSN analytics on big data platforms. This poster presents the resource characterization of two workloads for OSN data. Our results show the data patterns and major resource demands that can appear when analyzing OSN data.
Two-page extended abstract: pdf