Computer Engineering Department Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.11779/1940

Search Results

Now showing 1 - 10 of 41
  • Patent
    Artificial Intelligence Augmented Iterative Product Decoding
    (2023) Arslan, Şuayb Şefik; Göker, Turguy
    A method for product decoding within a data storage system includes receiving data to be decoded within a first decoder; performing a plurality of decoding iterations to decode the data utilizing a first decoder and a second decoder; and outputting fully decoded data based on the performance of the plurality of decoding iterations. Each of the plurality of decoding iterations includes (i) decoding the data with the first decoder operating at a first decoder operational mode to generate once decoded data; (ii) sending the once decoded data from the first decoder to the second decoder; (iii) receiving error information from the first decoder with an artificial intelligence system; (iv) selecting a second decoder operational mode based at least in part on the error information that is received by the artificial intelligence system; and (v) decoding the once decoded data with the second decoder operating at the second decoder operational mode to generate twice decoded data.
  • Patent
    Joint Multi-Nanopore Sequencing for Reliable Data Retrieval in Nucleic Acid Storage
    (2023) Arslan, Şuayb Şefik; Göker, Turguy; Doerner, Don
    A nucleic acid storage system (100) that uses nanopore sequencing to read data values chemically embedded in oligonucleotides includes a membrane (102), a voltage source (108), and a nucleic acid strand (110). The membrane (102) has a plurality of nanopores (104) that are stacked upon one another in a multi-nanopore arrangement. The voltage source (108) is configured to direct voltage across the plurality of nanopores (104). The nucleic acid strand (110) including the oligonucleotides is threaded through each of the plurality of nanopores (104) within the membrane (102). A separate base signal (118) is generated from the nucleic acid strand (110) being threaded through each of the plurality of nanopores (104), and Recursive Neural Networks can be used to estimate a signal shape for each oligonucleotide. Recurrent Convolutional Neural Networks and noise predictive data detection algorithms can be used based on the estimated signal shapes to sequence the oligonucleotides.
  • Article
    Minimum Repair Bandwidth LDPC Codes for Distributed Storage Systems
    (IEEE, 2023) Pourmandi, Massoud; Pusane, Ali Emre; Arslan, Şuayb Şefik; Haytaoğlu, Elif
    In distributed storage systems (DSS), an optimal code design must meet the requirements of efficient local data regeneration in addition to reliable data retention. Recently, low-density parity-check (LDPC) codes have been proposed as a promising candidate that can secure high data rates as well as low repair bandwidth while maintaining low complexity in data reconstruction. The main objective of this study is to optimize the repair bandwidth characteristics of LDPC code families for a DSS application while meeting the data reliability requirements. First, a data access scenario in which nodes contact other available nodes randomly to download data is examined. Later, a minimum-bandwidth protocol is considered in which nodes make their selections based on the degree numbers of check nodes. Through formulating optimization problems for both protocols, a fundamental trade-off between the decoding threshold and the repair bandwidth is established for a given code rate. Finally, conclusions are confirmed by numerical results showing that irregular constructions have a large potential for establishing optimized LDPC code families for DSS applications.
  • Article
    Citation - Scopus: 1
    A New Benchmark Dataset for P300 ERP-Based BCI Applications
    (Academic Press Inc Elsevier Science, 2023-04-01) Çakar, Tuna; Özkan, Hüseyin; Musellim, Serkan; Arslan, Suayb S.; Yağan, Mehmet; Alp, Nihan
    Because of its non-invasive nature, one of the most commonly used event-related potentials in brain-computer interface (BCI) system designs is the P300 electroencephalogram (EEG) signal. The fact that the P300 response can easily be stimulated and measured is particularly important for participants with severe motor disabilities. In order to train and test P300-based BCI speller systems in more realistic high-speed settings, there is a pressing need for a large and challenging benchmark dataset. Various datasets already exist in the literature but most of them are not publicly available, and they either have a limited number of participants or utilize relatively long stimulus duration (SD) and inter-stimulus intervals (ISI). They are also typically based on a 36-target (6 x 6) character matrix. The use of long ISI, in particular, not only reduces the speed and the information transfer rates (ITRs) but also oversimplifies the P300 detection. This leaves a limited challenge to state-of-the-art machine learning and signal processing algorithms. In fact, near-perfect P300 classification accuracies are reported with the existing datasets. Therefore, one certainly needs a large-scale dataset with challenging settings to fully exploit the recent advancements in algorithm design (machine learning and signal processing) and achieve high-performance speller results. To this end, in this article we introduce a new freely and publicly accessible P300 dataset obtained using 32-channel EEG, in the hope that it will lead to new research findings and eventually more efficient BCI designs. The introduced dataset comprises 18 participants performing a 40-target (5 x 8) cued-spelling task, with reduced SD (66.6 ms) and ISI (33.3 ms) for fast spelling. We have also processed, analyzed, and character-classified the introduced dataset, and we present the accuracy and ITR results as a benchmark. The introduced dataset and the codes of our experiments are publicly accessible at https://data.mendeley.com/datasets/vyczny2r4w. (c) 2023 Elsevier Inc. All rights reserved.
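The abstract above reports information transfer rates (ITRs) for a 40-target speller. As a rough illustration of how such a figure is commonly computed, the following minimal sketch uses the widely cited Wolpaw formula. It is an assumption that this definition matches the one used in the article; the function names and the `seconds_per_selection` parameter are hypothetical and do not come from the dataset's code.

```python
import math


def bits_per_selection(n_targets: int, accuracy: float) -> float:
    """Wolpaw information per selection, in bits.

    n_targets: number of selectable targets (e.g. 40 for a 5 x 8 matrix).
    accuracy:  probability of a correct selection, between 0 and 1.
    """
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    p = accuracy
    bits = math.log2(n_targets)
    # The two terms below vanish in the limits p -> 0 and p -> 1,
    # so they are skipped there to avoid log2(0).
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits


def itr_bits_per_minute(n_targets: int, accuracy: float,
                        seconds_per_selection: float) -> float:
    """ITR in bits per minute for a given (hypothetical) selection time."""
    if seconds_per_selection <= 0:
        raise ValueError("seconds_per_selection must be positive")
    return bits_per_selection(n_targets, accuracy) * 60.0 / seconds_per_selection


if __name__ == "__main__":
    # Hypothetical operating point: 40 targets, 90% accuracy, 3 s per character.
    print(itr_bits_per_minute(40, 0.90, 3.0))
```

Shorter stimulus durations and inter-stimulus intervals reduce the time per selection, which raises the ITR only if accuracy does not drop too far; the formula makes that trade-off explicit.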
  • Article
    Cooperative Network Coding for Distributed Storage Using Base Stations With Link Constraints
    (arXiv, 2021) Arslan, Şuayb Şefik; Pourmandi, Massoud; Haytaoğlu, Elif
    In this work, we consider a novel distributed data storage/caching scenario in a cellular setting where multiple nodes may fail/depart at the same time. In order to maintain the target reliability, we allow cooperative regeneration of lost nodes with the help of base stations allocated in a set of hierarchical layers. Due to this layered structure, a symbol download from each base station has a different cost, while the link capacities connecting the nodes of the cellular system and the base stations are also limited. In this more practical and general scenario, we present the fundamental trade-off between repair bandwidth cost and the storage space per node. Particularly interesting operating points are the minimum storage as well as bandwidth cost points in this trade-off curve. We provide closed-form expressions for the corresponding bandwidth (cost) and storage space per node for these operating points. Finally, we provide an explicit optimal code construction for the minimum storage regeneration point for a given set of system parameters.
  • Conference Object
    Residual Data Usage in LDPC Codes
    (IEEE, 2022-05-15) Kaya, Erdi; Pourmandi, Massoud; Haytaoglu, Elif; Arslan, Şefik Şuayb
    In distributed storage systems/coded caching systems, padding operations should be performed when the encoded data cannot be divided evenly by the number of storage nodes. Thus, extra zero values are stored in one of the nodes to balance each node's storage content. In this study, distribution of data to storage nodes with no padding was investigated for a distributed caching context in which a base station and devices both store the coded data. In other words, no redundancy (no padding) is included in the encoded data. This approach is called residual data distribution. LDPC codes are selected as the erasure code due to their low-complexity encode/decode operations. Moreover, performance comparisons in terms of repair time were conducted between the traditional data distribution approach (with padding) and the use of residual data (no padding) (standard). The effect of no-padding data usage on the repair time and the ratios of storage savings have also been demonstrated.
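To make the padding versus no-padding distinction in the abstract above concrete, here is a minimal sketch of two ways to spread encoded symbols over storage nodes. The uneven split shown for the no-padding case is only an assumption about what a residual distribution could look like; the paper's actual placement rule is not given in the abstract, and both function names are hypothetical.

```python
def padded_split(symbols, n_nodes, pad=0):
    """Traditional approach: append zeros until the length divides evenly."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    per_node = -(-len(symbols) // n_nodes)  # ceiling division
    padded = list(symbols) + [pad] * (per_node * n_nodes - len(symbols))
    return [padded[i * per_node:(i + 1) * per_node] for i in range(n_nodes)]


def residual_split(symbols, n_nodes):
    """No-padding approach (assumed): the first few nodes hold one extra symbol."""
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    base, extra = divmod(len(symbols), n_nodes)
    shares, start = [], 0
    for i in range(n_nodes):
        size = base + (1 if i < extra else 0)
        shares.append(list(symbols[start:start + size]))
        start += size
    return shares


if __name__ == "__main__":
    data = list(range(1, 11))  # 10 encoded symbols, 4 nodes
    with_padding = padded_split(data, 4)
    without_padding = residual_split(data, 4)
    stored_padded = sum(len(s) for s in with_padding)
    stored_residual = sum(len(s) for s in without_padding)
    print(with_padding)
    print(without_padding)
    print("symbols saved:", stored_padded - stored_residual)
```

The saving is small per object, but it is paid on every stored object and on every repair that has to move the padded symbols.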
  • Conference Object
    Citation - Scopus: 1
    Improved Bounds on the Moments of Guessing Cost
    (IEEE, 2022-06-26) Arslan, Suayb S.; Haytaoglu, Elif
    Guessing a random variable with finite or countably infinite support, in which each selection leads to a positive cost value, has recently been studied within the context of "guessing cost". In those studies, similar to standard guesswork, upper and lower bounds for the rho-th moment of guessing cost are described in terms of the known measure, Rényi entropy. In this study, we non-trivially improve the known bounds using previous techniques along with new notions such as balancing cost. We demonstrate that the novel lower bound proposed in this work achieves 5.84% and 18.47% higher values than the known lower bound for rho = 1 and rho = 5, respectively. As for the upper bound, the novel expression provides 10.93% and 5.54% lower values than the previously presented bounds for rho = 1 and rho = 5, respectively.
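For readers unfamiliar with the "standard guesswork" that the abstract above builds on, the sketch below computes the rho-th moment of the number of guesses under the optimal (most likely first) strategy, together with Arikan's classical bounds expressed through the Rényi entropy of order 1/(1+rho). This covers only the unit-cost baseline; it does not implement the guessing-cost setting or the improved bounds of the paper, and all function names are hypothetical.

```python
import math


def guesswork_moment(probs, rho):
    """E[G^rho] when guessing outcomes in decreasing order of probability."""
    ordered = sorted(probs, reverse=True)
    return sum(p * (i ** rho) for i, p in enumerate(ordered, start=1))


def renyi_entropy(probs, alpha):
    """Rényi entropy of order alpha (alpha != 1), in nats."""
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


def arikan_upper(probs, rho):
    """Classical upper bound: exp(rho * H_{1/(1+rho)}(X))."""
    return math.exp(rho * renyi_entropy(probs, 1.0 / (1.0 + rho)))


def arikan_lower(probs, rho):
    """Classical lower bound: the upper bound divided by (1 + ln M)^rho."""
    support = sum(1 for p in probs if p > 0)
    return arikan_upper(probs, rho) / (1.0 + math.log(support)) ** rho


if __name__ == "__main__":
    dist = [0.5, 0.25, 0.125, 0.125]  # illustrative distribution
    for rho in (1, 5):
        print(rho, arikan_lower(dist, rho),
              guesswork_moment(dist, rho), arikan_upper(dist, rho))
```

The gap between the two classical bounds grows with the support size, which is the slack that tighter bounds of the kind described in the abstract aim to reduce.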
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 2
    Base Station-Assisted Cooperative Network Coding for Cellular Systems With Link Constraints
    (IEEE, 2022-06-26) Arslan, Suayb S.; Pourmandi, Massoud; Haytaoglu, Elif
    We consider a novel distributed data storage/caching scenario in a cellular network, where multiple nodes may fail/depart simultaneously. To meet reliability, we allow cooperative regeneration of lost nodes with the help of base stations allocated in a set of hierarchical layers. Due to this layered structure, a symbol download from each base station has a different cost, while the link capacities between the nodes of the cellular system and the base stations are also constrained. Under such a setting, we formulate the fundamental trade-off with closed-form expressions between repair bandwidth cost and the storage space per node. Particularly, the minimum storage as well as bandwidth cost points are formulated. Finally, we provide an explicit optimal code construction for the minimum storage regeneration point for a special set of system parameters.
  • Article
    Citation - WoS: 12
    Citation - Scopus: 20
    Compress-Store on Blockchain: A Decentralized Data Processing and Immutable Storage for Multimedia Streaming
    (Springer, 2022-03-25) Arslan, Şuayb Şefik; Turguy, Göker; Goker, Turguy
    Decentralization for data storage is a challenging problem for blockchain-based solutions as the block size plays a key role for scalability. In addition, specific requirements of multimedia data call for various changes in the blockchain technology internals. Considering one of the most popular applications of secure multimedia streaming, i.e., video surveillance, it is not clear how to judiciously encode incentivization, immutability, and compression into a viable ecosystem. In this study, we provide a genuine scheme that achieves this encoding for a video surveillance application. The proposed scheme provides a novel integration of data compression and immutable off-chain data storage using a new consensus protocol, namely Proof-of-WorkStore (PoWS), in order to enable fully useful work to be performed by the miner nodes of the network. The proposed idea is the first step towards achieving a greener application of a blockchain-based environment to the video storage business that utilizes system resources efficiently.
  • Article
    Citation - WoS: 3
    Citation - Scopus: 3
    Array BP-XOR Codes for Hierarchically Distributed Matrix Multiplication
    (IEEE, 2022-03-01) Arslan, Şuayb Şefik
    A novel fault-tolerant computation technique based on array Belief Propagation (BP)-decodable XOR (BP-XOR) codes is proposed for distributed matrix-matrix multiplication. The proposed scheme is shown to be configurable and suited for modern hierarchical compute architectures such as Graphical Processing Units (GPUs) equipped with multiple nodes, whereby each has many small independent processing units with increased core-to-core communications. The proposed scheme is shown to outperform a few of the well-known earlier strategies in terms of total end-to-end execution time in the presence of slow nodes, called stragglers. This performance advantage is due to the careful design of array codes, which distributes the encoding operation over the cluster (slave) nodes at the expense of increased master-slave communication. An interesting trade-off between end-to-end latency and total communication cost is precisely described. In addition, to address an identified problem of scaling stragglers, an asymptotic version of array BP-XOR codes based on projection geometry is proposed at the expense of some computation overhead. A thorough latency analysis is conducted for all schemes to demonstrate that the proposed scheme achieves order-optimal computation in both the sublinear and the linear regimes in the size of the computed product from an end-to-end delay perspective.