Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/2164
Title: Performance Analysis of RDBMS and Hadoop Components with Their File Formats for the Development of Recommender Systems
Authors: Gupta A
Saxena M
Gill R
Keywords: Avro
Big Data
CSV
Hadoop
HDFS
Hive
Impala
MySQL
ORC
Parquet
RC
Recommender System
Spark
SparkSQL
Issue Date: 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Abstract: A recommender system is software that makes suggestions to users by predicting from their previous usage data in the shortest possible time. Present recommender systems are designed using complex techniques such as collaborative filtering and content-based filtering, but a similar system can be built by running complex queries with different query tools. The performance of these query tools depends on factors such as data size, the file format of the dataset, and aggregate search. In this paper, we compare four query tools, Hive, Impala, SparkSQL, and MySQL, to design a fast and efficient recommender system. The tools are analyzed by comparing the execution time of complex queries on data stored in different file formats: text, CSV, Avro, Parquet, RC, and ORC. The results indicate that a fast recommender system can be built using a query tool such as Impala on a dataset saved in the Avro file format. © 2018 IEEE.
URI: 10.1109/I2CT.2018.8529480
http://hdl.handle.net/123456789/2164
Appears in Collections: Conferences


