One of the biggest problems many companies face today is dealing with the huge volumes of data they generate daily. In a data-driven world, all of this data needs to be stored, organized, and analyzed to extract the information that helps management make the right decisions about the company's next steps. Big Data and Business Intelligence have become very popular terms in the business field. Big Data refers to the tools used to manage huge volumes of data; one of these tools is the Data Warehouse, which is used to manipulate massive amounts of data. Business Intelligence (BI) focuses on how to analyze these huge volumes of data to extract information that supports companies in decision making.
In this thesis, we compare the implementation of Data Warehouse (DW) concepts on a Relational Database Management System (RDBMS), specifically SQL Server, with their implementation on the Hadoop system, and then analyze the resource (CPU and RAM) consumption of each.
We show that using the Hadoop system speeds up the manipulation of these huge volumes of data at very low cost, because Hadoop is efficient at storing and processing massive amounts of data of all kinds, whether structured, semi-structured, unstructured, or raw.
Library of Congress Subject Headings
Data warehousing; Apache Hadoop; Relational databases--Management; Non-relational databases
Information Sciences and Technologies (MS)
Department, Program, or Center
Information Sciences and Technologies (GCCIS)
Al-Wattar, Nazar, "Data Warehouse performance comparing Relational Database Management Systems and the Hadoop-based NoSQL Database system" (2020). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus