Bilgisayar Mühendisliği Bölümü Koleksiyonu
Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1940
Browsing Bilgisayar Mühendisliği Bölümü Koleksiyonu by Title
Now showing 1 - 20 of 142
Article. Citation - WoS: 2; Citation - Scopus: 1.
A Benchmark Dataset for Turkish Data-to-Text Generation (Elsevier, 2022). Demir, Şeniz; Öktem, Seza.
In recent decades, data-to-text (D2T) systems that learn directly from data have gained considerable attention in natural language generation. These systems need data of high quality and large volume, but unfortunately some natural languages suffer from a lack of readily available generation datasets. This article describes our efforts to create a new Turkish dataset (Tr-D2T) that consists of meaning representation and reference sentence pairs without fine-grained word alignments. We utilize Turkish web resources and existing datasets in other languages to produce meaning representations, and we collect reference sentences from native speakers through crowdsourcing. We particularly focus on the generation of single-sentence biographies and dining venue descriptions. In order to motivate future Turkish D2T studies, we present detailed benchmarking results of different sequence-to-sequence neural models trained on this dataset. To the best of our knowledge, this work is the first of its kind to provide preliminary findings and lessons learned from the creation of a new Turkish D2T dataset. Moreover, our work is the first extensive study that presents the generation performance of transformer and recurrent neural network models from meaning representations in this morphologically rich language.

Article. Citation - WoS: 5; Citation - Scopus: 8.
A Data-Assisted Reliability Model for Carrier-Assisted Cold Data Storage Systems (Elsevier, 2020). Arslan, Şuayb Şefik; Göker, Turguy; Peng, James.
Cold data storage systems are used to enable long-term digital preservation of institutional archives. The functionality common to cold and warm/hot data storage is that data is stored on some physical medium to be read back at a later time. In cold storage, however, write and read operations are not necessarily performed in the same geographical location, so third-party assistance is typically utilized to bring the medium and the drive together. The reliability modeling of such a decomposed system poses a few challenges that do not necessarily exist in warm/hot storage alternatives, such as fault detection and carrier absence, all adding up to data unavailability issues. In this paper, we propose a generalized non-homogeneous Markov model that encompasses the aging of the carriers in order to address the requirements of today's cold data storage systems, in which data is encoded and spread across multiple nodes for long-term retention. We derive useful lower/upper bounds on overall system availability. Furthermore, collected field data is used to estimate the parameters of a Weibull distribution to accurately predict the lifetime of the carriers in an example scale-out setting.

Conference Object. Citation - WoS: 3; Citation - Scopus: 2.
A Joint Dedupe-Fountain Coded Archival Storage (2017). Arslan, Şuayb Şefik; Göker, Turguy; Wideman, Rod.
An erasure-coded archival file storage system is presented that uses a chunk-based deduplication mechanism and fountain codes for space/time-efficient operation. Unlike traditional archival storage, this proposal considers the deduplication operation together with correction coding in order to provide a reliable storage solution. The building blocks of the deduplication and fountain coding processes are judiciously interleaved to realize two novel ideas: reducing the memory footprint through weaker hashing while handling the increased collisions with correction coding, and applying unequal error protection to deduplicated chunks for increased availability. The combination of these two ideas makes the performance of the proposed system stand out; for example, it is shown to outperform a replication-based scheme as well as a RAID data protection scheme. The proposed system also addresses some of the fundamental challenges of today's low-cost deduplicated data storage systems, such as hash collisions, the disk bottleneck, and RAM overflow, securing savings of up to 90% of regular RAM use.
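
An aside on the entry above: the sketch below illustrates its first idea, a chunk index keyed by a deliberately weakened (truncated) hash to shrink the in-RAM index, with collisions caught at lookup time. In the paper the extra collisions are absorbed by the fountain-code layer; here a byte-level comparison stands in for that mechanism, and the chunk size, hash width, and all names are illustrative assumptions.

```python
# Minimal sketch: chunk-based deduplication with a weak (truncated) hash.
# A byte comparison stands in for the paper's coding-based collision handling.
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunking; the paper's chunking policy may differ

def weak_hash(chunk: bytes, nbytes: int = 4) -> bytes:
    """Truncate SHA-1 to `nbytes` to shrink the index (at the cost of collisions)."""
    return hashlib.sha1(chunk).digest()[:nbytes]

def dedupe(data: bytes):
    index = {}            # weak hash -> list of (chunk_id, chunk bytes)
    store, recipe = [], []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        h = weak_hash(chunk)
        for cid, existing in index.get(h, []):
            if existing == chunk:          # collision check on hash match
                recipe.append(cid)
                break
        else:                              # genuinely new chunk
            cid = len(store)
            store.append(chunk)
            index.setdefault(h, []).append((cid, chunk))
            recipe.append(cid)
    return store, recipe

store, recipe = dedupe(b"abcd" * 5000)
print(len(store), "unique chunks for", len(recipe), "referenced chunks")
```

The shorter the digest, the smaller the index but the more collisions to resolve; the paper's contribution is resolving them with correction coding rather than with extra comparisons as done here.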
Conference Object.
A Multiobjective Evolutionary Algorithm Approach for Map Sketch Generation (2018). Topcu, Şafak; Etaner-Uyar, A. Şima.
In this paper, we present a method to generate map sketches for strategy games using a state-of-the-art many-objective evolutionary algorithm, namely NSGA-III. The map sketch generator proposed in this study outputs a three-objective Pareto front in which all points are fair and strong in different respects. The generated map sketches can be used by level designers to create real-time strategy maps effectively and/or to help them see multiple aspects of a game map simultaneously. The algorithm can also be utilized as a benchmark generator for testing various cases such as shortest-path algorithms and strategy game bots. The results reported in this paper are very promising and promote further study.

Article.
A New Benchmark Dataset for P300 ERP-Based BCI Applications (Academic Press Inc Elsevier Science, 2023). Çakar, Tuna; Özkan, Hüseyin; Musellim, Serkan; Arslan, Suayb S.; Yağan, Mehmet; Alp, Nihan.
Because of its non-invasive nature, one of the most commonly used event-related potentials in brain-computer interface (BCI) system designs is the P300 electroencephalogram (EEG) signal. The fact that the P300 response can easily be stimulated and measured is particularly important for participants with severe motor disabilities. In order to train and test P300-based BCI speller systems in more realistic high-speed settings, there is a pressing need for a large and challenging benchmark dataset. Various datasets already exist in the literature, but most of them are not publicly available, and they either have a limited number of participants or use relatively long stimulus durations (SD) and inter-stimulus intervals (ISI). They are also typically based on a 36-target (6 x 6) character matrix. The use of a long ISI, in particular, not only reduces the speed and the information transfer rate (ITR) but also oversimplifies P300 detection, leaving limited challenge for state-of-the-art machine learning and signal processing algorithms; in fact, near-perfect P300 classification accuracies are reported on the existing datasets. One therefore needs a large-scale dataset with challenging settings to fully exploit recent advancements in algorithm design and achieve high-performance speller results. To this end, in this article we introduce a new freely and publicly accessible P300 dataset obtained using 32-channel EEG, in the hope that it will lead to new research findings and eventually more efficient BCI designs. The introduced dataset comprises 18 participants performing a 40-target (5 x 8) cued-spelling task, with reduced SD (66.6 ms) and ISI (33.3 ms) for fast spelling. We have also processed, analyzed, and character-classified the introduced dataset, and we present the accuracy and ITR results as a benchmark. The dataset and the code of our experiments are publicly accessible at https://data.mendeley.com/datasets/vyczny2r4w.
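
An aside on the entry above: the information transfer rate (ITR) it reports is conventionally computed with the Wolpaw formula, sketched below. N = 40 matches the paper's 5 x 8 target matrix; the accuracy and selection-time values are illustrative placeholders, not results from the dataset.

```python
# Wolpaw ITR: bits per selection, scaled to selections per minute.
from math import log2

def itr_bits_per_min(n_targets: int, accuracy: float, sel_time_s: float) -> float:
    """Bits/min for a speller with n_targets classes and a given accuracy."""
    p = accuracy
    if p >= 1.0:
        bits = log2(n_targets)               # perfect accuracy: log2(N) bits
    else:
        bits = (log2(n_targets) + p * log2(p)
                + (1 - p) * log2((1 - p) / (n_targets - 1)))
    return bits * (60.0 / sel_time_s)

# hypothetical 90% accuracy with one selection every 3 seconds:
print(f"{itr_bits_per_min(40, 0.90, 3.0):.1f} bits/min")
```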
Article. Citation - WoS: 1; Citation - Scopus: 1.
A Novel Genetic Algorithm-Based Improvement Model for Online Communities and Trust Networks (IOS Press, 2020). Bekmezci, İlker; Çimen, Egemen Berki; Ermiş, Murat.
Social network analysis offers an understanding of our modern world and affords the ability to represent, analyze, and even simulate complex structures. While an unweighted model can be used for online communities, trust or friendship networks should be analyzed with weighted models. To analyze social networks, it is essential to produce realistic social models. However, there are serious differences between social network models and real-life data in terms of their fundamental statistical parameters. In this paper, a genetic algorithm (GA)-based social network improvement method is proposed to produce social networks more similar to real-life datasets. It first creates a social model based on existing studies in the literature and then improves the model with the proposed GA-based approach, based on the similarity of the average degree, the k-nearest neighbors, the clustering coefficient, the degree distribution, and link overlap. This study can be used to model the structural and statistical properties of large-scale societies more realistically. The performance results show that our approach can reduce the dissimilarity between the created social networks and the real-life datasets in terms of their primary statistical properties, and that the proposed GA-based approach can be used effectively not only on unweighted networks but also on weighted ones.

Article. Citation - WoS: 3; Citation - Scopus: 4.
A Novel Graph Transformation Strategy for Optimizing SpTRSV on CPUs (Wiley, 2023). Yılmaz, Buse.
Sparse triangular solve (SpTRSV) is an extensively studied computational kernel. An important obstacle in parallel SpTRSV implementations is that in some parts of a sparse matrix the computation is serial. By transforming the dependency graph, it is possible to increase the parallelism of the parts that lack it. In this work, we present a novel graph transformation strategy to increase the parallelism degree of a sparse matrix and compare it to our previous strategy. Our transformation strategy is shown to provide a speedup as high as 1.42x.

Article. Citation - WoS: 6; Citation - Scopus: 7.
A Reliability Model for Dependent and Distributed MDS Disk Array Units (IEEE Transactions on Reliability, 2018). Arslan, Şuayb Şefik.
Archiving and systematic backup of large digital data creates a rapidly growing demand for multi-petabyte-scale storage systems. As drive capacities grow beyond the few-terabyte range to address the demands of today's cloud, the likelihood of multiple simultaneous disk failures has become a reality. Among the main factors causing catastrophic system failures, correlated disk failures and network bandwidth are reported to be two common sources of performance degradation. The emerging trend is to use efficient, sophisticated erasure codes (EC) equipped with multiple parities and efficient repairs in order to meet the reliability/bandwidth requirements. It is known that the mean time to failure and repair rates reported by disk manufacturers cannot capture the life-cycle patterns of distributed storage systems. In this study, we develop failure models based on generalized Markov chains that can accurately capture correlated performance degradations with multi-parity protection schemes based on modern maximum distance separable (MDS) ECs. Furthermore, we use the proposed model in a distributed storage scenario to quantify two example use cases: the common-sense observation that adding more parity disks is only meaningful if the failure domains of the storage system are sufficiently decorrelated, and the reliability of generic multi-parity, single-dimensional EC-protected storage systems.
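
An aside on the entry above: the classical baseline that its generalized (non-homogeneous, correlation-aware) Markov model extends is a homogeneous birth-death chain over the number of failed disks in an (n, k) MDS-coded array. A minimal sketch of that baseline's mean time to data loss (MTTDL) follows; all rates are illustrative, and the correlation effects the paper captures are deliberately absent.

```python
# Baseline MTTDL of an (n, k) MDS array: a birth-death CTMC over the number
# of failed disks, with per-disk failure rate lam and repair rate mu.
import numpy as np

def mttdl_mds(n: int, k: int, lam: float, mu: float) -> float:
    """Expected time to data loss; absorption at n - k + 1 concurrent failures."""
    m = n - k + 1                        # transient states: 0..n-k failed disks
    Q = np.zeros((m, m))
    for i in range(m):
        fail = (n - i) * lam             # any surviving disk can fail next
        if i + 1 < m:
            Q[i, i + 1] = fail           # rate into the absorbing state stays
        if i > 0:                        #   only on the diagonal below
            Q[i, i - 1] = i * mu         # failed disks repaired in parallel
        Q[i, i] = -(fail + i * mu)
    # Expected absorption times t solve Q t = -1 over the transient states.
    t = np.linalg.solve(Q, -np.ones(m))
    return t[0]

# e.g. per-disk MTTF of ~1e6 hours, 24-hour repairs, a (10, 8) code:
print(f"MTTDL ~ {mttdl_mds(10, 8, 1e-6, 1 / 24.0):.3e} hours")
```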
Conference Object. Citation - Scopus: 2.
A Visualization Platform for Disk Failure Analysis (IEEE, 2018). Arslan, Şuayb Şefik; Yiğit, İbrahim Onuralp; Zeydan, Engin.
It has become the norm rather than the exception to observe multiple disk malfunctions or whole-disk failures in places like big data centers, where thousands of drives operate simultaneously. Data residing on these devices is typically protected by replication or erasure coding for long-term durable storage. However, to be able to optimize data protection methods, real-life disk failure trends need to be modeled. Modeling helps build insight during the design phase and properly optimize protection methods for a given application. In this study, we developed a visualization platform based on the disk failure data provided by Backblaze and extracted useful statistical information such as failure rates and model-based time-to-failure distributions. Finally, simple modeling is performed for disk failure prediction, to raise alarms and take necessary system-wide precautions.

Conference Object. Citation - Scopus: 1.
Adaptive Boosting of DNN Ensembles for Brain-Computer Interface Spellers (IEEE, 2021). Çatak, Yiğit; Aksoy, Can; Özkan, Hüseyin; Güney, Osman Berke; Koç, Emirhan; Arslan, Şuayb Şefik.
Steady-state visual evoked potentials (SSVEP) are commonly used in brain-computer interface (BCI) applications such as spelling systems, due to their advantages over other paradigms. In this study, we develop a method for SSVEP-based BCI speller systems using a known deep neural network (DNN) together with transfer and ensemble learning techniques. We test the performance of our method on the publicly available benchmark and BETA datasets with a leave-one-subject-out procedure. Our method consists of two stages. In the first stage, a global DNN is trained using data from all subjects except the one excluded for testing. In the second stage, the global model is fine-tuned to each subject whose data were used in training. Combining the responses of the trained DNNs with different weights for each test subject, rather than equal weights, provides better performance, as brain signals may differ significantly between individuals. To this end, the weights of the DNNs are learned with the SAMME algorithm using data belonging to the test subject. Our method significantly outperforms the canonical correlation analysis (CCA) and filter bank canonical correlation analysis (FBCCA) methods.
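
An aside on the entry above: a minimal sketch of the SAMME-style weighting it uses to combine per-subject fine-tuned DNNs. Each "model" here is just a callable returning class labels; the data and all names are illustrative stand-ins for the trained networks and the test subject's calibration data.

```python
# SAMME weighting of an ensemble: each model's weight depends on its
# weighted error on calibration data; misclassified samples are boosted.
import numpy as np

def samme_weights(models, X, y, n_classes: int):
    """One SAMME round per model: weight = ln((1-err)/err) + ln(K-1)."""
    w = np.full(len(y), 1.0 / len(y))            # sample weights
    alphas = []
    for predict in models:
        pred = predict(X)
        err = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
        alphas.append(alpha)
        w *= np.exp(alpha * (pred != y))         # up-weight mistakes
        w /= w.sum()
    return np.array(alphas)

def ensemble_predict(models, alphas, X, n_classes: int):
    votes = np.zeros((len(X), n_classes))
    for predict, a in zip(models, alphas):
        votes[np.arange(len(X)), predict(X)] += a
    return votes.argmax(axis=1)

# toy demo: a random-guessing model and an accurate one on 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = rng.integers(0, 3, 200)
noisy = rng.integers(0, 3, 200)
alphas = samme_weights([lambda _: noisy, lambda _: y], X, y, n_classes=3)
print(alphas)   # the accurate model receives a much larger weight
```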
Patent.
Adaptive Erasure Codes (2017). Arslan, Şuayb Şefik; Göker, Turguy.
Methods, apparatus, and other embodiments associated with adaptive use of erasure codes for distributed data storage systems are described. One example method includes accessing a message, where the message has a message size; selecting an encoding strategy as a function of the message size, data storage device failure statistics, data storage device wear periods, data storage space constraints, or overhead constraints, where the encoding strategy includes an erasure code approach; generating an encoded message using the encoding strategy; generating an encoded block, where the encoded block includes the encoded message and metadata associated with the message; and storing the encoded block in the data storage system. Example methods and apparatus may employ Reed-Solomon erasure codes or fountain erasure codes. Example methods and apparatus may display to a user the storage capacity and durability of the data storage system.

Article. Citation - WoS: 12; Citation - Scopus: 21.
Advancements in Distributed Ledger Technology for Internet of Things (Elsevier, 2020). Jurdak, Raja; Arslan, Şuayb Şefik; Krishnamachari, Bhaskar; Jelitto, Jens.
The Internet of Things (IoT) is paving the way for different kinds of devices to be connected and to communicate properly at mass scale. However, conventional mechanisms used to sustain security and privacy cannot be directly applied to IoT, whose topology is increasingly becoming decentralized. Distributed Ledger Technologies (DLT), on the other hand, comprise varying forms of decentralized data structures that provide immutability by cryptographically linking blocks of data. For building reliable, autonomous, and trusted IoT platforms, DLT has the potential to provide security, privacy, and decentralized operation while adhering to the limitations of IoT devices. The marriage of IoT and DLT is not very recent; in fact, many projects have focused on this combination to address the challenges of smart cities, smart grids, the internet of everything, and other decentralized applications, most based on blockchain structures. This special issue focuses on the new and broader technical problems associated with DLT-based security and backend platform solutions for IoT devices and applications.

Book Part.
Affection for Nouvel Architecture: On Contemporary (Islamic) Architecture and Affect (Intellect Ltd., 2022). Yücel, Şebnem.
[No abstract available]

Conference Object. Citation - Scopus: 2.
Alternative Data Sources and Psychometric Scales Supported Credit Scoring Models (IEEE, 2023). Şahin, Türkay; Filiz, Gözde; Çakar, Tuna; Özvural, Özden Gebizlioğlu; Nicat, Şahin.
This study aims to evaluate individuals with limited access to banking services and to enhance credit scoring models with alternative data sources. A psychometric-based credit scoring model was developed and tested. Despite the limited data, findings with significant potential were obtained. However, the distinction between credit payment intention and payment ability needs further clarification, and the results need to be validated with more data.
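
An aside on the entry above: a minimal sketch of the idea of augmenting a credit scoring model with psychometric-scale responses, here as a logistic regression over synthetic data. The feature set, labels, and model choice are assumptions for illustration; the paper's actual variables and model are not described in this listing.

```python
# Credit scoring sketch: traditional features plus Likert-scale psychometric
# items feed one logistic regression. All data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.normal(40, 12, n),     # age (traditional feature)
    rng.normal(0.4, 0.2, n),   # debt-to-income ratio (traditional feature)
    rng.integers(1, 6, n),     # Likert item: "I plan my spending" (psychometric)
    rng.integers(1, 6, n),     # Likert item: impulsivity scale (psychometric)
])
# synthetic default label loosely tied to the features, for illustration only
y = (0.5 * X[:, 1] + 0.2 * X[:, 3] / 5 + rng.normal(0, 0.2, n) > 0.35).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)
print("train accuracy:", model.score(X, y).round(3))
```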
Article. Citation - WoS: 29; Citation - Scopus: 38.
An Efficient Framework for Visible-Infrared Cross-Modality Person Re-Identification (Elsevier, 2020). Gökmen, Muhittin; Başaran, Emrah; Kamasak, Mustafa E.
Visible-infrared cross-modality person re-identification (VI-ReId) is an essential task for video surveillance in poorly illuminated or dark environments. Despite many recent studies on person re-identification in the visible domain (ReId), few studies deal specifically with VI-ReId. Besides the challenges common to both ReId and VI-ReId, such as pose/illumination variations, background clutter, and occlusion, VI-ReId faces the additional challenge that color information is not available in infrared images. As a result, the performance of VI-ReId systems is typically lower than that of ReId systems. In this work, we propose a four-stream framework to improve VI-ReId performance. We train a separate deep convolutional neural network in each stream using different representations of the input images, expecting different and complementary features to be learned from each stream. In our framework, grayscale and infrared input images are used to train the ResNet in the first stream. In the second stream, RGB and three-channel infrared images (created by repeating the infrared channel) are used. In the remaining two streams, we use local pattern maps as input images; these maps are generated using the local Zernike moments transformation. Local pattern maps are obtained from grayscale and infrared images in the third stream, and from RGB and three-channel infrared images in the last stream. We further improve the performance of the proposed framework by employing a re-ranking algorithm for post-processing. Our results indicate that the proposed framework outperforms the current state of the art by a large margin, improving Rank-1/mAP by 29.79%/30.91% on the SYSU-MM01 dataset and by 9.73%/16.36% on the RegDB dataset.

Article. Citation - WoS: 9; Citation - Scopus: 11.
An Efficient Multiscale Scheme Using Local Zernike Moments for Face Recognition (MDPI, 2018). Gökmen, Muhittin; Başaran, Emrah; Kamasak, Mustafa E.
In this study, we propose a face recognition scheme using local Zernike moments (LZM), which can be used for both identification and verification. In this scheme, local patches around facial landmarks are extracted from the complex components obtained by the LZM transformation. Then, phase-magnitude histograms are constructed within these patches to create descriptors for face images. An image pyramid is utilized to extract features at multiple scales, and descriptors are constructed for each image in this pyramid. We used three different public datasets to examine the performance of the proposed method: Face Recognition Technology (FERET), Labeled Faces in the Wild (LFW), and Surveillance Cameras Face (SCface). The results revealed that the proposed method is robust against variations such as illumination, facial expression, and pose, and that it can be used for low-resolution face images acquired in uncontrolled environments or in the infrared spectrum. Experimental results show that our method outperforms state-of-the-art methods on the FERET and SCface datasets.
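
An aside on the entry above: a minimal sketch of its multiscale descriptor pipeline, i.e. an image pyramid, a patch around a landmark at each level, and a phase-magnitude histogram of a complex response. The complex response below is a plain gradient field standing in for the LZM components the paper actually computes, and all parameters are illustrative.

```python
# Multiscale phase-magnitude descriptor sketch; a complex gradient field
# stands in for the local Zernike moment (LZM) components.
import numpy as np

def complex_response(img):
    gy, gx = np.gradient(img.astype(float))
    return gx + 1j * gy                      # stand-in for an LZM component

def phase_mag_hist(patch, nbins=8):
    r = complex_response(patch)
    phase, mag = np.angle(r), np.abs(r)
    hist, _ = np.histogram(phase, bins=nbins, range=(-np.pi, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)        # magnitude-weighted phase histogram

def multiscale_descriptor(img, landmark, levels=3, half=8):
    desc, (x, y) = [], landmark
    for _ in range(levels):
        patch = img[y - half:y + half, x - half:x + half]
        desc.append(phase_mag_hist(patch))
        img = img[::2, ::2]                  # crude 2x pyramid downsampling
        x, y = x // 2, y // 2
    return np.concatenate(desc)

face = np.random.default_rng(1).random((128, 128))   # placeholder image
print(multiscale_descriptor(face, landmark=(64, 64)).shape)  # (levels * nbins,)
```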
Article. Citation - WoS: 19; Citation - Scopus: 27.
An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named Entity Recognition (Elsevier, 2021). Makaroğlu, Didem; Demir, Şeniz; Aras, Gizem; Çakır, Altan.
Named entity recognition (NER) is an extensively studied task that extracts and classifies named entities in a text. NER is crucial not only in downstream language processing applications such as relation extraction and question answering but also in large-scale big data operations such as real-time analysis of online digital media content. Recent research efforts on Turkish, a less studied language with a morphologically rich nature, have demonstrated the effectiveness of neural architectures on well-formed texts and yielded state-of-the-art results by formulating the task as a sequence tagging problem. In this work, we empirically investigate the use of recent neural architectures (bidirectional long short-term memory (BiLSTM) and transformer-based networks) proposed for Turkish NER tagging in the same setting. Our results demonstrate that transformer-based networks, which can model long-range context, overcome the limitations of BiLSTM networks in which different input features at the character, subword, and word levels are utilized. We also propose a transformer-based network with a conditional random field (CRF) layer that achieves the state-of-the-art result (95.95% f-measure) on a common dataset. Our study contributes to the literature that quantifies the impact of transfer learning on processing morphologically rich languages.

Conference Object.
An Exploratory Study on the Effect of Contour Types on Decision Making via Optic Brain Imaging Method (fNIRS) (eScholarship, 2023). Demircioglu, Esin Tuna; Girişken, Yener; Çakar, Tuna.
Decision-making combines our positive anticipations of the future with the contributions of our past experiences, emotions, and what we perceive in the moment. The cues perceived from the environment therefore play an important role in shaping decisions. Contours, the hidden identity of objects, are among these cues. Aesthetic evaluation, on the other hand, has been shown to have a profound impact on decision-making, both as a subjective experience of beauty and through its evolutionary background. The aim of this empirical study is to explain the effect of contour types on preference decisions in the prefrontal cortex through risk-taking and aesthetic appraisal. The obtained findings indicate a relation between preference decisions, contour type, and PFC subregion. The results suggest that contour type is an effective cue in decision-making and, furthermore, that the left OFC and right dlPFC respond differently to contour types.

Article. Citation - WoS: 51; Citation - Scopus: 66.
An Investigation of the Neural Correlates of Purchase Behavior Through fNIRS (2018). Cakir, Murat Perit; Yurdakul, Dicle; Girisken, Yener; Çakar, Tuna.
Purpose: This study aims to explore the plausibility of the functional near-infrared spectroscopy (fNIRS) methodology for neuromarketing applications and to develop a neurophysiologically informed model of purchasing behavior based on fNIRS measurements. Design/methodology/approach: The oxygenation signals extracted from the purchase trials of each subject were temporally averaged to obtain average signals for buy and pass decisions. The obtained data were analyzed via linear mixed models for each of the 16 optodes, to explore their separate roles in the purchasing decision process, and via a discriminant analysis to construct a classifier for buy/pass decisions based on oxygenation measures from multiple optodes. Findings: Positive purchasing decisions significantly increase neural activity in fronto-polar regions, which are closely related to the OFC and vmPFC that modulate the computation of subjective values. The results showed that neural activations can be used to decode buy or pass decisions with 85 per cent accuracy, provided that sensitivity to the budget constraint is included as an additional factor. Research limitations/implications: The study shows that fNIRS measures can provide useful biomarkers for improving the classification accuracy of purchasing tendencies and might be used as a main or complementary method alongside traditional research methods in marketing. Future studies might focus on real-time purchasing processes in a more ecologically valid setting, such as shopping in supermarkets. Originality/value: This paper uses an emerging neuroimaging method in consumer neuroscience, namely fNIRS. The decoding accuracy of the model is 85 per cent, which is an improvement over the accuracy levels reported in previous studies. The research also contributes to existing knowledge by providing insights into understanding individual differences and heterogeneity in consumer behavior through neural activities.
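
An aside on the entry above: a minimal sketch of its discriminant-analysis step, classifying buy vs. pass decisions from trial-averaged oxygenation values of 16 optodes. The data are synthetic placeholders, and the budget-sensitivity factor behind the reported 85 per cent accuracy is not modeled here.

```python
# LDA over per-optode averaged oxygenation features; synthetic stand-in data.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n_trials, n_optodes = 120, 16
X = rng.normal(0, 1, (n_trials, n_optodes))   # avg. oxygenation per optode
y = rng.integers(0, 2, n_trials)              # 0 = pass, 1 = buy
X[y == 1, :4] += 0.8   # pretend fronto-polar optodes activate on "buy" trials

lda = LinearDiscriminantAnalysis()
print("CV accuracy:", cross_val_score(lda, X, y, cv=5).mean().round(3))
```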
Conference Object. Citation - WoS: 14; Citation - Scopus: 40.
An Overview of Blockchain Technologies: Principles, Opportunities and Challenges (IEEE, 2018). Arslan, Şuayb Şefik; Mermer, Gültekin Berahan; Zeydan, Engin.
Blockchain is a recently emerged technology with the potential to revolutionize the way our society communicates and trades. The most significant advantage this technology provides is the ability to exchange value-carrying transactions without the need for a trusted central institution in settings that would otherwise require an intermediary. It can also provide data integrity, built-in authenticity, and user transparency. Blockchain can be seen as the new internet on which many innovative applications will be built. In this work, we present an overview of current blockchain technologies, covering their general working principles, the opportunities they create, and the challenges that may be encountered in the future.
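
An aside on the entry above: a minimal sketch of the core structure it surveys, blocks made tamper-evident by linking each block to the cryptographic hash of its predecessor. Real DLTs add consensus, signatures, and networking on top; everything below is illustrative.

```python
# Hash-linked blocks: changing any block breaks every later link.
import hashlib, json, time

def make_block(data, prev_hash: str) -> dict:
    block = {"time": time.time(), "data": data, "prev": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

def verify(chain) -> bool:
    for prev, cur in zip(chain, chain[1:]):
        body = {k: cur[k] for k in ("time", "data", "prev")}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if cur["prev"] != prev["hash"] or cur["hash"] != recomputed:
            return False
    return True

chain = [make_block("genesis", "0" * 64)]
chain.append(make_block({"device": "sensor-1", "reading": 21.4},
                        chain[-1]["hash"]))
print(verify(chain))            # True
chain[1]["data"] = "tampered"
print(verify(chain))            # False: the stored hash no longer matches
```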
