Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.11779/1325
Title: | Cloud2HDD: Large-Scale HDD Data Analysis on Cloud for Cloud Datacenters |
Authors: | Zeydan, Engin; Arslan, Şefik Şuayb |
Keywords: | Lifetime; Hadoop; Cloud; Machine learning; Data center; HDDs |
Publisher: | IEEE |
Source: | Zeydan, E. & Arslan, S. S. (February 01, 2020). Cloud2HDD: large-scale HDD data analysis on cloud for cloud datacenters, 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN 2020), Paris, France, IEEE, Article number: 9059482, pp. 243-249, DOI: https://doi.org/10.1109/ICIN48450.2020.9059482 |
Abstract: | The main focus of this paper is to develop a distributed, large-scale data analysis platform for the open-source data of the Backblaze cloud datacenter, which consists of operational hard disk drive (HDD) information collected over an observation period of 2272 days (over 74 months). To carefully analyze the intrinsic characteristics of hard disk behavior, we have exploited a large volume of data and the Hadoop ecosystem as our big data processing engine. In other words, we have utilized a special distributed scheme on cloud for cloud HDD data, which we term Cloud2HDD. To classify the remaining lifetime of hard disk drives based on health indicators such as built-in S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) features, we used several state-of-the-art classification algorithms and compared their accuracy, precision, and recall rates. In addition, the importance of various S.M.A.R.T. features in predicting the true remaining lifetime of HDDs is identified. For instance, our analysis results indicate that, in comparison to other classification approaches based on a specific subset of S.M.A.R.T. features, the Random Forest Classifier (RFC) can yield up to 94% accuracy, with the highest precision and recall, in a reasonable time by classifying the remaining lifetime of drives into one of three classes, namely critical, high, and low ideal states. |
URI: | https://hdl.handle.net/20.500.11779/1325 https://doi.org/10.1109/ICIN48450.2020.9059482 |
ISBN: | 9781728151281 9781728151274 |
ISSN: | 2472-8144 2162-3414 |
Appears in Collections: | Bilgisayar Mühendisliği Bölümü Koleksiyonu / Computer Engineering Department Collection; Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection |
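The abstract describes labeling each drive's remaining lifetime as one of three classes and comparing classifiers by accuracy, precision, and recall. The following is a minimal sketch of only those two steps, not the paper's Random Forest pipeline. The class boundaries (`CRITICAL_MAX_DAYS`, `HIGH_MAX_DAYS`), the short class names, and the sample values are hypothetical, since the record does not state them.

```python
from collections import Counter

# Hypothetical class boundaries in days; the abstract does not give them.
CRITICAL_MAX_DAYS = 30
HIGH_MAX_DAYS = 180


def lifetime_class(days_remaining):
    """Map a remaining lifetime in days to one of three class labels."""
    if days_remaining <= CRITICAL_MAX_DAYS:
        return "critical"
    if days_remaining <= HIGH_MAX_DAYS:
        return "high"
    return "low"


def evaluate(y_true, y_pred):
    """Return overall accuracy plus per-class precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    # Count correct predictions per class (true positives).
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    pred_count = Counter(y_pred)
    true_count = Counter(y_true)
    accuracy = sum(true_pos.values()) / len(y_true)
    precision = {
        c: true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    }
    recall = {
        c: true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    }
    return accuracy, precision, recall


if __name__ == "__main__":
    # Made-up remaining lifetimes (days): observed vs. predicted by some model.
    observed = [5, 20, 90, 150, 400, 700]
    predicted = [10, 45, 100, 200, 500, 600]
    y_true = [lifetime_class(d) for d in observed]
    y_pred = [lifetime_class(d) for d in predicted]
    accuracy, precision, recall = evaluate(y_true, y_pred)
    print("accuracy:", round(accuracy, 3))
    print("precision:", precision)
    print("recall:", recall)
```

In practice the predicted labels would come from a trained classifier fed with S.M.A.R.T. features; here they are derived from invented numbers purely to exercise the metric code.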
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Şefik Şuayb ARSLAN.pdf, until 2040-05-22 | Full Text - Conference Proceeding | 1.15 MB | Adobe PDF | |
Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.