Master's Theses (Yüksek Lisans Tezleri)

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1785

  • Master Term Project
    Prediction of Up and Down Signals in Selected Blue Chip Stocks
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2019) Yıldız, Mustafa; Koç, Utku
    Many studies have attempted to predict the direction in which equity prices will move in capital markets. Most of these studies use models based on technical analysis and fundamental analysis. Daily price estimates rely on macroeconomic variables or the financial ratios of financial instruments, whereas intraday price estimates take trade book data into account. This study uses the equity market data analytics that Borsa İstanbul produces as a benchmark for intraday price signals. These analytics are derived from trade and order book data. Intraday price and equity market data analytics data sets are created for 5-minute periods, and different algorithms are tried on these data sets. The study uses one week of data for 4 selected blue chip stocks. The signal is 1 for a price increase, -1 for a decrease and 0 for no change. The decision jungle algorithm turns out to be the most successful algorithm. In addition, the lack of volatility and liquidity in the market caused overfitting problems in the ensemble algorithms. According to the multiclass decision jungle confusion matrix, the true positive results for class 1 (a price increase) are promising, so the algorithm could be meaningful for an investor who uses it only for price-increase signals. The true positive rate for class 1, 54.5%, is high compared with the rate at which class 1 is confused with a decrease (class -1), which is just 13.6%. The difference between these two rates (54.5% - 13.6%) would be the earning ratio for an investor who decides to invest in a price increase of Yapi Kredi stock using the decision jungle algorithm. Although big data algorithms (machine learning techniques) are said to give the best results for the data, domain knowledge related to the data is still very important.
    As the study shows, overcoming the overfitting and bias problems that occur in other studies requires sufficient domain knowledge, obtained in consultation with experts and practitioners of the subject. In addition, more studies on intraday trading, which is thinly covered in the literature, will lead to better results in future work on price forecasts. In line with the literature, the results of this study show that stock price movements are difficult to estimate.
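The signal encoding and the earning-ratio arithmetic described in this abstract can be sketched as follows. This is a minimal illustration, not the project's code: the function names are hypothetical, and it assumes that each signal is obtained by comparing a period's price with the price of the previous 5-minute period, which the abstract does not state.

```python
def label_signals(prices):
    """Label each period against the previous one: 1 = up, -1 = down, 0 = unchanged.

    Assumption: a signal is the sign of the change between consecutive
    5-minute prices. The abstract does not say how signals were derived.
    """
    signals = []
    for previous, current in zip(prices, prices[1:]):
        if current > previous:
            signals.append(1)
        elif current < previous:
            signals.append(-1)
        else:
            signals.append(0)
    return signals


def earning_ratio(true_positive_rate, confusion_rate):
    """Difference between the two percentages quoted in the abstract."""
    return true_positive_rate - confusion_rate


if __name__ == "__main__":
    # Hypothetical 5-minute prices, for illustration only.
    print(label_signals([10.0, 10.1, 10.1, 9.9]))
    # The two rates reported for class 1 in the abstract.
    print(round(earning_ratio(54.5, 13.6), 1))
```

With the two rates reported in the abstract, the difference is 40.9 percentage points.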
  • Master Term Project
    Predicting Customer Perception on Brands Using Functional Near-Infrared Spectroscopy Measurements
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2019) Kemerci, Emre; Koç, Utku
    Customer perception of brands matters to marketing professionals when they make strategic decisions. Traditionally, customer perception of brands is researched through field surveys. The neuromarketing discipline likewise studies customer behavior, perceptions, communication techniques and related topics within the framework of the human decision-making process. In neuromarketing, functional near-infrared spectroscopy (fNIRS) is a technology used to measure oxy- and deoxy-hemoglobin concentration in tissue so that the hemodynamic responses of brain activity can be analyzed. The dataset used in this study consists of a group of participants’ prefrontal cortex activations, that is, the hemodynamic responses collected against a set of stimuli, each of which is a brand logo and an adjective associated with the brand. The measured hemodynamic response metrics are oxygenated hemoglobin (HbO), deoxygenated hemoglobin (HbR), total hemoglobin (HbT) and oxygenation (Oxy), and the dataset includes measurements from 168 participants for 30 stimuli. The dataset also contains the participants’ responses and the common perception of the stimuli (field study results for the same stimuli). The aim of the project is to use machine learning algorithms to predict whether the relation between a brand and the relevant adjective is Positive, Negative or Neutral based on this feature set. Methodologically, the fNIRS measurements are cleaned and null values are handled, the measurements are consolidated per participant and stimulus using two different feature-creation methods, and classification algorithms are applied as supervised learning to predict brand perception. In conclusion, the support vector classifier and XGBoost algorithms performed poorly, reaching only slightly over 50% accuracy despite optimization with different classifier parameters.
    Further work should explore different feature engineering options.
  • Master Term Project
    Analyzing the Drivers of Customer Satisfaction Via Social Media
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2019) Yücel, Kadir Kutlu; Koç, Utku
    Social media has become a major force of influence over the last decade. The population of active social media users has grown with the new generations, and data has started to accumulate in tremendous amounts. Data accumulated through social media offers an opportunity to reach valuable insights and support business decisions. The aim of this project is to understand the drivers of customer satisfaction through public sentiment on Twitter towards a financial institution. Data was extracted from Twitter, the most popular microblogging platform, and sentiment analysis was performed. The unstructured data was classified by sentiment with a lexicon-based model and a machine learning based model. The study showed that the machine learning based model successfully overcame the language-specific problems and made better predictions where the lexicon-based model struggled. Further analysis was performed on the days with extreme daily average sentiment scores to match them with prominent events. The results showed that public sentiment on Twitter is driven by three main themes: complaints related to services, advertisement campaigns, and influencers’ impact.
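A lexicon-based scorer and the daily average sentiment score mentioned in this abstract could look roughly like the sketch below. The tiny English lexicon, the function names and the sample tweets are all hypothetical; the project's actual lexicon, language handling and scoring rules are not given in the abstract.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> polarity. A real lexicon would be far
# larger and would match the language of the tweets.
LEXICON = {"great": 1, "thanks": 1, "good": 1, "bad": -1, "slow": -1, "complaint": -1}


def sentiment_score(text, lexicon=LEXICON):
    """Sum the polarity of every known word; unknown words count as 0."""
    tokens = text.lower().split()
    return sum(lexicon.get(token.strip(".,!?"), 0) for token in tokens)


def daily_average(scored_tweets):
    """Average the scores per day. scored_tweets is an iterable of (day, score)."""
    scores_by_day = defaultdict(list)
    for day, score in scored_tweets:
        scores_by_day[day].append(score)
    return {day: sum(scores) / len(scores) for day, scores in scores_by_day.items()}


if __name__ == "__main__":
    tweets = [
        ("2019-01-01", "Great service, thanks!"),
        ("2019-01-01", "Bad and slow app"),
        ("2019-01-02", "Good campaign"),
    ]
    scored = [(day, sentiment_score(text)) for day, text in tweets]
    print(daily_average(scored))
```

Days whose average lies far from the rest are the candidates for matching against prominent events. A word-lookup scorer of this kind ignores negation and morphology, which is one plausible reason a lexicon-based model struggles where a trained model does not.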
  • Master Term Project
    Football Player Profiling Using Opta Match Event Data: Hierarchical Clustering
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2019) Kalenderoğlu, Uğurcan; Koç, Utku
    The increasing popularity of data analytics has had an impact on the sports industry. As a result of this trend, both the scale of available data and the best practices for using data analytics have grown. Player profiling is one of the emerging hot topics, especially in football. Meanwhile, the balance of transfer income and expenses has been the biggest burden on clubs’ finances, whereas it should be the reverse. Scouting processes are currently dominated by bilateral relations and the intuitive comments of scouting staff. Moving to a data-driven decision framework is an important step in overcoming this situation. It is crucial to replace a player who leaves the team with someone who has potential and a very similar playing style, and player profiling is the first step towards doing this. The data set used in this project is obtained from Opta, a sports-focused data company, and contains all on-ball actions at player level from the Turkish Super League, the English Premier League and the German Bundesliga over three seasons between 2015 and 2018. The dataset initially consists of 2469 players and 271 features, and principal component analysis is applied to reduce its dimensionality to 15 features. The study finds twelve different player clusters within the traditional main positions: three for defenders, four for midfielders and five for forwards. Clubs can enrich and benefit from these clusters in three ways: 1) evaluating a player’s style over a period of time and detecting the best role fit; 2) analyzing the effect of cluster combinations to decide which line-up yields better team results; 3) finding the closest match to a player who is due to be replaced.
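The hierarchical clustering step named in this project's title can be illustrated with a small agglomerative routine. This is a sketch under stated assumptions: it uses average linkage with Euclidean distance, which the abstract does not specify, it takes feature vectors that are assumed to be already reduced (for example by principal component analysis), and the function names and sample points are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns a list of clusters, each a list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                distance = average_linkage(clusters[a], clusters[b], points)
                if best is None or distance < best[0]:
                    best = (distance, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    # Hypothetical 2-D player vectors, for illustration only.
    players = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
    print(agglomerative_clusters(players, 3))
```

This naive version recomputes every pairwise linkage on each merge, so it only suits small inputs; for thousands of players a library implementation would be the practical choice.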
  • Master Term Project
    Mortality Prediction of Countries
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2018) Üşenmez, Elif Efser; Koç, Utku
    In this study, the causes of mortality in countries, broken down by sex and age group, are analyzed, and forecasting models are developed using different machine learning algorithms. The dataset is obtained from the World Health Organization (WHO) Mortality Database, which contains several datasets on the number of deaths by cause for each country. The study uses the dataset that classifies causes of mortality with ICD-10, the 10th revision of the International Statistical Classification of Diseases and Related Health Problems published by the World Health Organization. In addition to the main cause-of-mortality datasets, we add different independent variables and try to find the best features to fit models without bias or overfitting while achieving a high R² and a low mean squared error. To find the best model for forecasting causes of mortality by age group and sex, different machine learning algorithms are fitted and their results are analyzed.
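The two evaluation measures named in this abstract, R² and mean squared error, are computed as in the sketch below. The function names and the sample numbers are hypothetical and are not taken from the study.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares.

    Assumes the actual values are not all identical, otherwise the total
    sum of squares is zero and the ratio is undefined.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    actual = [1, 2, 3, 4]
    predicted = [1, 2, 3, 5]
    print(mean_squared_error(actual, predicted))
    print(r_squared(actual, predicted))
```

A better fit pushes R² towards 1 and the mean squared error towards 0, so the two measures move in opposite directions.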
  • Master Term Project
    Predicting Birth Defects
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2018) Korkut Özer, Selen; Koç, Utku
    Many couples are eager to have a healthy baby. For this reason, pregnant women try to adjust their lives during pregnancy, for example through healthy nutrition, an organic lifestyle and avoiding cosmetics. Even when a woman does all of this, health problems can be observed in the baby at or after birth. These health problems may be caused by genetic factors, the physiological characteristics of the mother or environmental factors. In this paper, we try to answer the question of whether the health problems that occur in babies after childbirth can be estimated before birth. The data comes from the birth records of the United States Centers for Disease Control and Prevention (CDC). Approximately 3 million records were analyzed, and the prediction models were run on this dataset. Boosting, Random Forest, Neural Network, Logistic Regression and SVM models were used to estimate which babies could have a disease at birth. Sick babies were predicted with an accuracy of 69.5%.
  • Master Term Project
    SMS Spam Detection in Turkish Language
    (MEF Üniversitesi, Fen Bilimleri Enstitüsü, 2018) Gürkan, Cem Kaya; Koç, Utku
    Short message service (SMS) is one of the most common communication methods. The growth in the number of mobile phone users has led to a dramatic increase in the use of short messages, and users have started receiving unsolicited text messages. SMS has followed e-mail as a spam channel because it offers direct access to customers and a high response rate from users. These unsolicited short messages disturb users and may even contain content intended to deceive or defraud them (phishing). To date, all of the research carried out on SMS spam detection has focused on the English language. In this study, Turkish datasets tagged with spam information are introduced, and existing methods for English are applied to these datasets. The SMS dataset used in this study is gathered from different people, and all messages are tagged according to whether they are spam or not. Naïve Bayes, Logistic Regression, SGD, SVM and Random Forest classification algorithms are tested with three feature extraction methods, and a number of performance measures are evaluated. The evaluation resulted in an F-measure of 96.4% for the SVM classification algorithm with the TF-IDF (Term Frequency-Inverse Document Frequency) extraction method.
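TF-IDF weighting and the F-measure reported in this abstract can be sketched as follows. This is a minimal illustration and not the study's pipeline: it uses one common TF-IDF variant (relative term frequency multiplied by log(N / document frequency)), whitespace tokenization and hypothetical English sample messages, none of which are specified in the abstract.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Weight = (count of term / tokens in document) * log(N / documents containing term).
    This is one common variant; libraries often add smoothing or normalization.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def f_measure(actual, predicted, positive="spam"):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical messages and labels, for illustration only.
    vectors = tf_idf(["free prize now", "see you now"])
    print(vectors[0])
    print(f_measure(["spam", "spam", "ham", "ham"], ["spam", "ham", "spam", "ham"]))
```

A term that appears in every message, such as "now" in the two sample messages, gets a weight of zero under this variant, so the classifier sees only the terms that distinguish messages from one another.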