Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11779/1512
Full metadata record
DC Field | Value | Language
dc.contributor.author | Arslan, Şuayb Şefik | -
dc.contributor.author | Zeydan, Engin | -
dc.date.accessioned | 2021-07-09T08:48:36Z | -
dc.date.available | 2021-07-09T08:48:36Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Arslan, S. S., & Zeydan, E. (2021). On the Distribution Modeling of Heavy-Tailed Disk Failure Lifetime in Big Data Centers. IEEE Transactions on Reliability, 70(2), 507–524. https://doi.org/10.1109/tr.2020.3007127 | en_US
dc.identifier.issn | 1558-1721 | -
dc.identifier.issn | 0018-9529 | -
dc.identifier.uri | https://hdl.handle.net/20.500.11779/1512 | -
dc.identifier.uri | https://doi.org/10.1109/TR.2020.3007127 | -
dc.description.abstract | It has become commonplace to observe frequent multiple disk failures in big data centers in which thousands of drives operate simultaneously. Disks are typically protected by replication or erasure coding to guarantee a predetermined reliability. However, in order to optimize data protection, real-life disk failure trends need to be modeled appropriately. The classical approach to modeling is to estimate the probability density function of failures using nonparametric estimation techniques such as kernel density estimation (KDE). However, these techniques are suboptimal in the absence of the true underlying density function. Moreover, insufficient data may lead to overfitting. In this article, we propose to apply a set of transformations to the collected failure data for almost perfect regression in the transform domain. Then, by inverse transformation, we analytically estimated the failure density through the efficient computation of moment generating functions, and hence, the density functions. Moreover, we developed a visualization platform to extract useful statistical information such as model-based mean time to failure. Our results indicate that for other heavy-tailed data, the complex Gaussian hypergeometric distribution and classical KDE approach can perform best if the overfitting problem can be avoided and the complexity burden is overcome. On the other hand, we show that the failure distribution exhibits a less complex Argus-like distribution after performing the Box–Cox transformation up to appropriate scaling and shifting operations. | en_US
dc.description.sponsorship | Turkiye Bilimsel ve Teknolojik Arastirma Kurumu (TUBITAK) 115C111 - 119E235 / Spanish MINEC TEC2017-88373-R / Generalitat de Catalunya 2017SGR1195 | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.rights | info:eu-repo/semantics/openAccess | en_US
dc.subject | Estimation | en_US
dc.subject | Kernel density estimation (KDE) | en_US
dc.subject | Kernel | en_US
dc.subject | Reliability | en_US
dc.subject | Probability density function | en_US
dc.subject | Measurement | en_US
dc.subject | Modeling | en_US
dc.subject | Predictive models | en_US
dc.subject | Hard-disk systems | en_US
dc.subject | Data analytics | en_US
dc.subject | Data models | en_US
dc.subject | Data storage | en_US
dc.title | On the Distribution Modeling of Heavy-Tailed Disk Failure Lifetime in Big Data Centers | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/TR.2020.3007127 | -
dc.identifier.scopus | 2-s2.0-85110818271 | en_US
dc.authorid | Şuayb Şefik Arslan / 0000-0003-3779-0731 | -
dc.authorid | Şuayb Şefik Arslan / K-2883-2015 | -
dc.description.woscitationindex | Science Citation Index Expanded | -
dc.identifier.wosquality | Q1 | -
dc.description.WoSDocumentType | Article | -
dc.description.WoSInternationalCollaboration | Conducted with international collaboration - YES | en_US
dc.description.WoSPublishedMonth | June | en_US
dc.description.WoSIndexDate | 2021 | en_US
dc.description.WoSYOKperiod | YÖK - 2020-21 | en_US
dc.identifier.scopusquality | Q1 | -
dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member | en_US
dc.identifier.startpage | 507 - 524 | en_US
dc.identifier.issue | 2 | en_US
dc.identifier.volume | 70 | en_US
dc.department | Faculty of Engineering, Department of Computer Engineering | en_US
dc.relation.journal | IEEE Transactions on Reliability | en_US
dc.identifier.wos | WOS:000659549200008 | en_US
dc.institutionauthor | Arslan, Şuayb Şefik | -
item.grantfulltext | open | -
item.fulltext | With Fulltext | -
item.languageiso639-1 | en | -
item.openairetype | Article | -
item.openairecristype | http://purl.org/coar/resource_type/c_18cf | -
item.cerifentitytype | Publications | -
crisitem.author.dept | 02.02. Department of Computer Engineering | -
Appears in Collections:
Department of Computer Engineering Collection
Scopus Indexed Publications Collection
WoS Indexed Publications Collection
Files in This Item:
File | Description | Size | Format
On the Distribution Modeling.pdf | Full Text - Article | 1.43 MB | Adobe PDF

Scopus citations: 4 (checked on Nov 16, 2024)
Web of Science citations: 3 (checked on Nov 16, 2024)
Page views: 38 (checked on Nov 18, 2024)
Downloads: 10 (checked on Nov 18, 2024)

Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.