Segmentation for factoring customers using unsupervised machine learning algorithms
Date
2023
Publisher
MEF Üniversitesi
Abstract
Technology has made data collection far easier, which is a significant opportunity, but it has also made managing all of this data harder, and data has little meaning unless it is processed well. Stored data is extremely valuable, and companies rely on the data their customers provide. Keeping up with the needs of changing customer profiles has become a necessity and a top priority for companies. As the volume of stored data grows, it becomes harder to relate records to one another and to tell them apart, and machine learning methods have increasingly been used to address this. This study describes what segmentation is and how it has changed over the years, and discusses which machine learning techniques are useful for data selection. Candidate methods are then demonstrated on customer check data from a local factoring company. Because the study aims to group unlabeled data, it focuses on unsupervised learning techniques. Alongside K-Means, the most popular of these methods, Hierarchical Clustering, DBSCAN, Gaussian Mixture Models (GMM), and Fuzzy c-Means (FCM) were used. For each algorithm, the success criteria were examined, an appropriate number of clusters was determined, and the results were compared. The optimal number of clusters computed with GMM was very high, DBSCAN could not assign clusters, and hierarchical clustering proved costly in terms of time. The best results were obtained with K-Means and FCM.
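As a rough illustration of choosing a cluster count by examining a success criterion for K-Means, the sketch below scores several values of k and keeps the best one. The abstract does not say which criteria, libraries, or features the study used, so everything here is an assumption: the silhouette coefficient stands in for the criterion, the data is synthetic rather than factoring check data, and the function names (`farthest_first_centers`, `kmeans`, `silhouette`, `make_blobs`) are hypothetical. It uses only the Python standard library (3.8+ for `math.dist`).

```python
import math
import random


def farthest_first_centers(points, k):
    """Deterministic seeding (an assumption, not the study's method):
    start at the first point, then repeatedly add the point that is
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (between -1 and 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def make_blobs(seed=0, per_blob=40):
    """Synthetic stand-in data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [
        (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in blob_centers
        for _ in range(per_blob)
    ]


points = make_blobs()
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(scores)
print("best k by silhouette:", best_k)
```

On real customer data the features would normally be scaled first, and the score curve is rarely this clear-cut, so in practice a criterion like this is usually read alongside others (for example inertia or an information criterion) rather than on its own.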
Keywords
Computer Engineering and Computer Science and Control
Start Page
1
End Page
93
Sustainable Development Goals
16
PEACE, JUSTICE AND STRONG INSTITUTIONS
