Computer Engineering Department Collection
Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1940
9 results
Search Results
Conference Object
Analyzing Customer Churn: A Comparative Study of Machine Learning Models on Pay-TV Subscribers in Turkey (IEEE, 2023-12-21)
Obalı, Emir; Çalışkan, Sibel Kırmızıgül; Karani Yılmaz, Veysel; Kara, Erkan; Meşe, Yasemin Kürtcü; Çakar, Tuna; Yıldız, Ayşenur; Hataş, Tuğce Aydın
Understanding the reasons for customer churn helps companies retain existing customers, since attrition causes revenue loss and incurs marketing costs for acquiring new customers. In this study, six months of historical data from a Pay-TV company operating in Turkey was used, and because the class labels were imbalanced, oversampling was applied. During the model development phase, various machine learning algorithms (Random Forest, Logistic Regression, K-Nearest Neighbors, Decision Tree, AdaBoost, XGBoost, Extra Tree Classifier) were used and their performances were compared. Based on the evaluation criteria for each model, the tree-based Random Forest, Extra Tree Classifier, and XGBoost achieved the highest performance on this dataset.

Conference Object
Fault Detection Model Using Measurement Data in Fiber Optic Internet Lines (IEEE, 2023-12-21)
Çakar, Tuna; Savaş, Kerem; Battal, Eray; Özkan, Gözde
In this study, a model has been developed to predict potential faults in advance based on performance metrics of various fiber-optic internet lines, as well as alarm (fault) data and performance measurement values from the 5 hours prior to the occurrence of the alarm. Performance metrics that vary over time have been analyzed in a time-series format based on alarm numbers, and anomaly detection methods have been used to label the data for any potential patterns that may occur in the performance metrics specific to the alarm.
The labeled data was then fed into a classification model to detect possible patterns in the relevant performance values for the specific fault type. The best-performing model was the Random Forest Classifier, with accuracy and F1 scores of 0.89 and 0.84, respectively.

Conference Object | Citation - Scopus: 3
Grafraud: Fraud Detection Using Graph Databases and Neural Networks (IEEE, 2023-12-21)
Raina, Ajeet Singh; Çakar, Tuna; Ertuğrul, Seyit; Arslan, Şuayip; Sayar, Alperen
Fraud has become a significant concern for many companies, particularly in the finance sector, and traditional methods of detecting it are no longer adequate. Innovative technologies are necessary to identify complex fraudulent activities, and RedisGraph, a high-performance graph database, may offer a solution. With the assistance of neural networks, RedisGraph can accurately and efficiently detect fraudulent transactions in vast and intricate environments. Companies typically design fraud detection systems with a combination of Python and Oracle Database, which provide robust data management and real-time AI processing capabilities. These technologies make it possible to build fraud detection systems that identify fraudulent activities in real time. However, as fraud methods advance, relying on these systems alone is no longer efficient. This article presents a proof of concept based on an essential use case of RedisGraph-powered neural networks in detecting financial fraud.
It demonstrates the value of carefully employing Python and Oracle Database to construct and deploy real-time systems that can efficiently detect fraudulent activities.

Conference Object
Spine Posture Detection for Office Workers With Hybrid Machine Learning (IEEE, 2023-09-13)
Öke, Deniz; Çakar, Tuna; Yıldız, Ahmet; Mise, Pelin; Terzibaşıoğlu, Aynur Metin
This study aims to detect bad spine posture using an alternative approach that doesn't rely on deep learning or excessive energy. The goal is to improve accuracy and effectiveness without disrupting workflow. A custom dataset was created, numerical inferences were made from posture values, and a hybrid approach using Light Gradient Boosting achieved a 96% success rate.

Conference Object | Citation - Scopus: 3
Segmentation for Factoring Customers: Using Unsupervised Machine Learning Algorithms (IEEE, 2023-10-11)
Yalçuva, Berat; Akçay, Ahmet; Ertuğrul, Seyit; Çakar, Tuna; Sayar, Alperen; Ayyıldız, Nur Seher
Technology now makes data collection easy, which is an important opportunity, but it also makes managing all this data difficult, and the data is of little use unless it is well processed. This stored data is extremely important, and companies use data provided by their customers. Understanding the needs of customer profiles in a changing world is now a necessity and a top priority for companies. As the amount of stored data has grown over time, it has become difficult to establish relationships within the data and to separate data points from one another. At this point, machine learning methods have become more involved in our lives. This study describes what segmentation is and how it has changed over the years, and discusses which machine learning techniques are useful in data selection. Candidate machine learning methods are then applied to a real-life segmentation problem using the customer check data of a domestic factoring company.
Since this study aims to group unlabeled data, unsupervised learning techniques are emphasized. Hierarchical Clustering, DBSCAN, Gaussian Mixture Modeling (GMM), and Fuzzy c-Means were used alongside the most popular algorithm, K-Means. When the clustering results were examined, the optimal number of clusters computed with GMM was very high, DBSCAN could not assign clusters, and Hierarchical Clustering did not produce the expected results. The best results were obtained with the K-Means and Fuzzy c-Means algorithms.

Conference Object | Citation - Scopus: 3
EMG-Based BCI for PiCar Mobilization (IEEE, 2022-09-14)
Yilmaz, Yasin; Günden, Burak Bahri; Ertekin, Efe; Sayar, Alperen; Çakar, Tuna; Arslan, Şefik Şuayb
In this study, the main aim was to develop a brain-computer interface (BCI) using PiCar and EEG/ERP devices, in order to make life easier for people with certain diseases and disabilities. The ultimate goal of the project was to direct and control a BCI-based PiCar according to the signals captured via the EEG/ERP device. With the EEG headset, the EMG signals of the participant's gestures (facial expressions) were captured. Filtering and other preprocessing methods were applied to the collected data to obtain noise-free signals. During preprocessing, detrending was used to bring the data set, which showed a constantly increasing trend, into a fixed range with zero trend. Denoising (Wavelet Denoising) and outlier detection/elimination (OneClassSVM) were used for noise elimination. The SMOTE oversampling method was used for data augmentation. Welch's method was used to obtain band powers from the signals.
Using the augmented data, several machine learning algorithms were applied, including Support Vector Machine, Logistic Regression, Linear Discriminant Analysis, Random Forest Classifier, Gradient Boosting Classifier, Multinomial Naive Bayes, Decision Tree, K-Nearest Neighbor, and a voting classifier. The developed models were used to predict the direction that is passed as an input to PiCar's API. PiCar was then controlled according to the predicted direction with HTTP GET requests. In this project, the OpenBCI headset and the Brainflow library were used for EEG/EMG signal acquisition and processing. The Tkinter library was used for the graphical user interface, and Django was used to run a server on the Raspberry Pi, which serves as PiCar's brain. © 2022 IEEE.

Conference Object | Citation - WoS: 5 | Citation - Scopus: 7
Cloud2HDD: Large-Scale HDD Data Analysis on Cloud for Cloud Datacenters (IEEE, 2020-02-01)
Zeydan, Engin; Arslan, Şefik Şuayb
The main focus of this paper is to develop a distributed large-scale data analysis platform for the open-source data of the Backblaze cloud datacenter, which consists of operational hard disk drive (HDD) information collected over an observable period of 2272 days (over 74 months). To carefully analyze the intrinsic characteristics of hard disk behavior, we have exploited a large volume of data and the benefits of the Hadoop ecosystem as our big data processing engine. In other words, we have utilized a special distributed scheme on cloud for cloud HDD data, which is termed Cloud2HDD. To classify the remaining lifetime of hard disk drives based on health indicators such as built-in S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) features, we used some of the state-of-the-art classification algorithms and compared their accuracy, precision, and recall rates simultaneously. In addition, the importance of various S.M.A.R.T. features in predicting the true remaining lifetime of HDDs is identified.
For instance, our analysis results indicate that, compared with other classification approaches, the Random Forest Classifier (RFC) can yield up to 94% accuracy with the highest precision and recall in a reasonable time by classifying the remaining lifetime of drives into one of three classes, namely critical, high, and low ideal states, based on a specific subset of S.M.A.R.T. features.

Conference Object | Citation - WoS: 1 | Citation - Scopus: 1
Hata Düzeltme Çıktı Kodları: Genel Bakış, Zorluklar ve Gelecek Yönelimler [Error-Correcting Output Codes: Overview, Challenges, and Future Directions] (IEEE, 2019-04-01)
Arslan, Şuayb Şefik; Güney, Osman B.
One of the most effective ways to solve the multi-class classification problem is to use a group of cleverly designed binary classifiers and combine their results according to a certain criterion. Error-Correcting Output Codes (ECOC; HDÇK in Turkish) is one of the successful techniques that divides the work across multiple binary classifications. The aim of this work is to give a brief introduction to modern ECOC types and to present a comparative study of the various decoding methods that combine binary classification results, setting out their challenges, advantages, and disadvantages. In addition, several important applications of the ECOC technique, its performance on the MNIST data set, and some future trends are presented.

Conference Object | Citation - Scopus: 2
A Visualization Platform for Disk Failure Analysis (IEEE, 2018-05-01)
Arslan, Şuayb Şefik; Yiğit, İbrahim Onuralp; Zeydan, Engin
It has become the norm rather than the exception to observe multiple disks malfunctioning or whole disk failures in places like big data centers, where thousands of drives operate simultaneously. Data that resides on these devices is typically protected by replication or erasure coding for long-term durable storage. However, to be able to optimize data protection methods, real-life disk failure trends need to be modeled.
Modeling helps us build insights during the design phase and properly optimize protection methods for a given application. In this study, we developed a visualization platform based on disk failure data provided by Backblaze and extracted useful statistical information such as failure rate and model-based time-to-failure distributions. Finally, simple modeling is performed for disk failure prediction, in order to raise alarms and take the necessary system-wide precautions.
