Cloud2HDD: Large-Scale HDD Data Analysis on Cloud for Cloud Datacenters
| dc.contributor.author | Zeydan, Engin | |
| dc.contributor.author | Arslan, Şefik Şuayb | |
| dc.date.accessioned | 2020-05-31T13:51:23Z | |
| dc.date.available | 2020-05-31T13:51:23Z | |
| dc.date.issued | 2020 | |
| dc.description.abstract | The main focus of this paper is to develop a distributed, large-scale data analysis platform for the open-source data of the Backblaze cloud datacenter, which consists of operational hard disk drive (HDD) information collected over an observation period of 2272 days (over 74 months). To carefully analyze the intrinsic characteristics of hard disk behavior, we have exploited a large volume of data and the benefits of the Hadoop ecosystem as our big data processing engine. In other words, we have utilized a special distributed scheme on cloud for cloud HDD data, which we term Cloud2HDD. To classify the remaining lifetime of hard disk drives based on health indicators such as built-in S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) features, we used several state-of-the-art classification algorithms and compared their accuracy, precision, and recall rates. In addition, the importance of various S.M.A.R.T. features in predicting the true remaining lifetime of HDDs is identified. For instance, our analysis results indicate that the Random Forest Classifier (RFC) can yield up to 94% accuracy with the highest precision and recall in a reasonable time by classifying the remaining lifetime of drives into one of three different classes, namely critical, high, and low ideal states, in comparison to other classification approaches based on a specific subset of S.M.A.R.T. features. | |
| dc.description.sponsorship | TÜBİTAK, MINECO | |
| dc.description.sponsorship | Generalitat de Catalunya; TUBITAK, (2232-115C111); Generalitat de Catalunya, (2017SGR1195); Ministerio de Economía y Competitividad, MINECO, (5G-REFINE, TEC2017-88373-R); Türkiye Bilimsel ve Teknolojik Araştırma Kurumu, TÜBİTAK | |
| dc.description.sponsorship | This work was partially funded by The Scientific and Technological Research Council of Turkey (TUBITAK) under the grant number 2232-115C111, Spanish MINECO under the grant number TEC2017-88373-R (5G-REFINE) and by Generalitat de Catalunya under the grant number 2017SGR1195. | |
| dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TUBITAK) [2232-115C111]; Spanish MINECO [TEC2017-88373-R]; Generalitat de Catalunya [2017SGR1195] | |
| dc.identifier.citation | Zeydan, E. & Arslan, S. S. (February 01, 2020). Cloud2HDD: large-scale HDD data analysis on cloud for cloud datacenters, 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN 2020), Paris, France, IEEE, Article number: 9059482, pp. 243-249, DOI: https://doi.org/10.1109/ICIN48450.2020.9059482 | |
| dc.identifier.doi | 10.1109/ICIN48450.2020.9059482 | |
| dc.identifier.isbn | 9781728151281 | |
| dc.identifier.isbn | 9781728151274 | |
| dc.identifier.issn | 2472-8144 | |
| dc.identifier.issn | 2162-3414 | |
| dc.identifier.scopus | 2-s2.0-85084061181 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.11779/1325 | |
| dc.identifier.uri | https://doi.org/10.1109/ICIN48450.2020.9059482 | |
| dc.language.iso | en | |
| dc.publisher | IEEE | |
| dc.relation.ispartof | 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops = ICIN 2020 | |
| dc.relation.ispartofseries | Conference on Innovations in Clouds Internet and Networks | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.subject | Lifetime | |
| dc.subject | Hadoop | |
| dc.subject | Cloud | |
| dc.subject | Machine learning | |
| dc.subject | Data center | |
| dc.subject | HDDs | |
| dc.title | Cloud2HDD: Large-Scale HDD Data Analysis on Cloud for Cloud Datacenters | |
| dc.type | Conference Object | |
| dspace.entity.type | Publication | |
| gdc.author.id | Şuayb Şefik Arslan / 0000-0003-3779-0731 | |
| gdc.author.id | Şuayb Şefik Arslan / K-2883-2015 | |
| gdc.author.id | Arslan, Suayb/0000-0003-3779-0731 | |
| gdc.author.institutional | Arslan, Şuayb Şefik | |
| gdc.author.scopusid | 24315322700 | |
| gdc.author.scopusid | 35955672100 | |
| gdc.author.wosid | Zeydan, Engin/AAI-2467-2019 | |
| gdc.author.wosid | Arslan, Suayb/K-2883-2015 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.access | metadata only access | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | Faculty of Engineering, Department of Computer Engineering | |
| gdc.description.departmenttemp | [Zeydan, Engin] Ctr Technol Telecomunicac Catalunya, Barcelona 08860, Spain; [Arslan, Suayb S.] MEF Univ, Dept Comp Engn, TR-34912 Istanbul, Turkey | |
| gdc.description.endpage | 249 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | |
| gdc.description.scopusquality | N/A | |
| gdc.description.startpage | 243 | |
| gdc.description.woscitationindex | Conference Proceedings Citation Index - Science | |
| gdc.description.wosquality | N/A | |
| gdc.identifier.openalex | W3016192933 | |
| gdc.identifier.wos | WOS:000569984100041 | |
| gdc.index.type | WoS | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.downloads | 12 | |
| gdc.oaire.impulse | 3.0 | |
| gdc.oaire.influence | 2.7317726E-9 | |
| gdc.oaire.isgreen | true | |
| gdc.oaire.keywords | lifetime | |
| gdc.oaire.keywords | machine learning | |
| gdc.oaire.keywords | Hadoop | |
| gdc.oaire.keywords | HDDs | |
| gdc.oaire.keywords | Cloud | |
| gdc.oaire.popularity | 5.3484066E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.oaire.sciencefields | 0211 other engineering and technologies | |
| gdc.oaire.sciencefields | 02 engineering and technology | |
| gdc.oaire.sciencefields | 0101 mathematics | |
| gdc.oaire.sciencefields | 01 natural sciences | |
| gdc.oaire.views | 4 | |
| gdc.openalex.collaboration | International | |
| gdc.openalex.fwci | 1.0676 | |
| gdc.openalex.normalizedpercentile | 0.83 | |
| gdc.opencitations.count | 4 | |
| gdc.plumx.crossrefcites | 2 | |
| gdc.plumx.mendeley | 6 | |
| gdc.plumx.scopuscites | 7 | |
| gdc.publishedmonth | February | |
| gdc.scopus.citedcount | 7 | |
| gdc.virtual.author | Arslan, Şefik Şuayb | |
| gdc.wos.citedcount | 5 | |
| gdc.wos.documenttype | Proceedings Paper | |
| gdc.wos.indexdate | 2020 | |
| gdc.wos.publishedmonth | February | |
| gdc.yokperiod | YÖK - 2019-20 | |
| relation.isAuthorOfPublication | 37152966-5384-4fd7-a0dc-34d1dd8bdc7f | |
| relation.isAuthorOfPublication.latestForDiscovery | 37152966-5384-4fd7-a0dc-34d1dd8bdc7f | |
| relation.isOrgUnitOfPublication | 05ffa8cd-2a88-4676-8d3b-fc30eba0b7f3 | |
| relation.isOrgUnitOfPublication | 0d54cd31-4133-46d5-b5cc-280b2c077ac3 | |
| relation.isOrgUnitOfPublication | a6e60d5c-b0c7-474a-b49b-284dc710c078 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 05ffa8cd-2a88-4676-8d3b-fc30eba0b7f3 |
Files
Original bundle
- Name: Şefik Şuayb ARSLAN.pdf
- Size: 1.13 MB
- Format: Adobe Portable Document Format
- Description: Full Text - Conference Proceeding
License bundle
- Name: license.txt
- Size: 1.44 KB
- Format: Item-specific license agreed upon to submission
