WoS Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12416/8653

Citation - WoS: 16
Citation - Scopus: 21
SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines
(IEICE-Inst Electronics Information Communication Engineers, 2017) Yumusak, Semih; Dogdu, Erdogan; Kodaz, Halife; Kamilaris, Andreas; Vandenbussche, Pierre-Yves
Linked data endpoints are online query gateways to semantically annotated linked data sources. The SPARQL query language is the standard for querying these data sources. Although a linked data endpoint (i.e. SPARQL endpoint) is a basic Web service, it provides a platform for federated online querying and data linking methods. For linked data consumers, SPARQL endpoint availability and discovery are crucial for live querying and semantic information retrieval. Current studies show that the availability of linked datasets is very low, while the locations of linked data endpoints change frequently. There are linked data repositories that collect and list the available linked data endpoints or resources. It is observed that around half of the endpoints listed in existing repositories are not accessible (temporarily or permanently offline). These endpoint URLs are shared through repository websites such as Datahub.io; however, they are weakly maintained and revised only by their publishers. In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system, named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, the collected search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this method, most of the currently listed SPARQL endpoints in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. We analyze our findings in comparison to the Datahub collection in detail.
SpEnD Portal: Linked Data Discovery Using SPARQL Endpoints
(IEEE, 2017) Yumusak, Semih; Aras, Riza Emre; Uysal, Elif; Dogdu, Erdogan; Kodaz, Halife; Oztoprak, Kasim
We present the project SpEnD, a complete SPARQL endpoint discovery and analysis portal. In a previous study, the SPARQL endpoint discovery and analysis steps of the SpEnD system were explained in detail. In the SpEnD portal, the SPARQL endpoints are extracted from the web by using web crawling techniques, then monitored and analyzed by systematically live-querying the endpoints. After many sustainability improvements in the SpEnD project, the SpEnD system is now online as a portal. The SpEnD portal currently serves 1487 SPARQL endpoints, 911 of which are found only by SpEnD when compared to the other existing SPARQL endpoint repositories. In this portal, the analytic results and the content information are shared for every SPARQL endpoint. The endpoints stored in the repository are monitored and updated continuously.
Citation - WoS: 10
Citation - Scopus: 21
Multi-Label Classification of Text Documents Using Deep Learning
(IEEE, 2020) Mohammed, Hamza Haruna; Dogdu, Erdogan; Gorur, Abdul Kadir; Choupani, Roya
Studies in the field of Natural Language Processing and its related applications continue to grow in number. Machine learning has proven to be predominantly data-driven, in the sense that generic model building methods are used and then tailored to specific application domains. This has proven to be a very effective approach to modeling the complicated data dependencies we frequently experience in practice, making very few assumptions and allowing the data to speak for itself. Examples of these applications can be found in chemical process engineering, climate science, healthcare, and linguistic processing systems for natural languages, to name a few. Text classification is one of the important machine learning tasks used in many digital applications today, such as document filtering, search engines, document management systems, and many more. Text classification is the process of categorizing text documents into a given set of labels. Furthermore, multi-label text classification is the task of categorizing text documents into one or more labels simultaneously. Over the years, many methods for classifying text documents have been proposed, including the popularly known bag of words (BoW) method, support vector machine (SVM), tree induction, and label-vector embedding, to mention a few. These kinds of tools can be used in many digital applications, such as document filtering, search engines, document management systems, etc. Lately, deep learning-based approaches are getting more attention, especially in the extreme multi-label text classification case. Deep learning has proven to be one of the major solutions to many machine learning applications, especially those involving high-dimensional and unstructured data. However, it is of paramount importance in many applications to be able to reason accurately about the uncertainties associated with the predictions of the models.
In this paper, we explore and compare the recent deep learning-based methods for multi-label text classification. We investigate two scenarios: first, a multi-label classification model with an ordinary embedding layer, and second, the given models with GloVe, word2vec, and FastText as pre-trained embeddings. We evaluate the performance of these different neural network models in terms of multi-label evaluation metrics for the two approaches, and compare the results with the previous studies.
Citation - WoS: 8
Citation - Scopus: 11
Sentiment Analysis for the Social Media: a Case Study for Turkish General Elections
(Assoc Computing Machinery, 2017) Yumusak, Semih; Oztoprak, Kasim; Dogdu, Erdogan; Uysal, Elif
The ideas expressed in social media are not always compliant with natural language rules, and the mood and emotion indicators are mostly highlighted by emoticons and emotion-specific keywords. There are language-independent emotion keywords (e.g. love, hate, good, bad), and every language also has its own particular emotion-specific keywords. These keywords can be used for polarity analysis of a particular sentence. In this study, we first created a Turkish dictionary containing emotion-specific keywords. Then, we used this dictionary to detect the polarity of tweets that were collected by querying political keywords right before the Turkish general election in 2015. The tweets were collected based on their relatedness to three main categories: the political leaders, ideologies, and political parties. The polarity of these tweets is analyzed in comparison with the election results.
Citation - WoS: 1
Topic Distribution Constant Diameter Overlay Design Algorithm (td-Cd
(IEEE, 2017) Oztoprak, Kasim; Dogdu, Erdogan; Layazali, Sina
Publish/subscribe communication systems, where nodes subscribe to many different topics of interest, are becoming increasingly common in application domains such as social networks, Internet of Things, etc. Designing overlay networks that connect the nodes subscribed to each distinct topic is hence a fundamental problem in these systems. For scalability and efficiency, it is important to keep the maximum node degree of the overlay in the publish/subscribe system low. Ideally, one would like not only to keep the maximum node degree of the overlay low, but also to ensure that the network has low diameter. We address this problem by presenting the Topic Distribution Constant Diameter Overlay Design Algorithm (TD-CD-ODA), which achieves a minimal maximum node degree in a low-diameter setting. We have shown experimentally that the algorithm performs well on both targets in comparison to the other overlay design algorithms.
Citation - WoS: 40
Citation - Scopus: 77
Malware Classification Using Deep Learning Methods
(Assoc Computing Machinery, 2018) Dogdu, Erdogan; Cakir, Bugra
Malware, short for Malicious Software, is growing continuously in numbers and sophistication as our digital world continues to grow. It is a very serious problem, and many efforts are devoted to malware detection in today's cybersecurity world. Many machine learning algorithms have been used for the automatic detection of malware in recent years. Most recently, deep learning is being used with better performance. Deep learning models are shown to work much better in the analysis of long sequences of system calls. In this paper, a shallow deep learning-based feature extraction method (word2vec) is used for representing any given malware based on its opcodes. The Gradient Boosting algorithm is used for the classification task. Then, k-fold cross-validation is used to validate the model performance without sacrificing a validation split. Evaluation results show up to 96% accuracy with limited sample data.
Citation - Scopus: 2
A Discovery and Analysis Engine for Semantic Web
(Assoc Computing Machinery, 2018) Kamilaris, Andreas; Dogdu, Erdogan; Kodaz, Halife; Uysal, Elif; Aras, Riza Emre; Yumusak, Semih
The Semantic Web promotes common data formats and exchange protocols on the web towards better interoperability among systems and machines. Although Semantic Web technologies are being used to semantically annotate data and resources for easier reuse, the ad hoc discovery of these data sources remains an open issue. Popular Semantic Web endpoint repositories such as SPARQLES, Linking Open Data Project (LOD Cloud), and LODStats do not include recently published datasets and are not updated frequently by the publishers. Hence, there is a need for a web-based dynamic search engine that discovers these endpoints and datasets at frequent intervals. To address this need, a novel web meta-crawling method is proposed for discovering Linked Data sources on the Web. We implemented the method in a prototype system named SPARQL Endpoints Discovery (SpEnD). In this paper, we describe the design and implementation of SpEnD, together with an analysis and evaluation of its operation, in comparison to the aforementioned static endpoint repositories in terms of time performance, availability, and size. Findings indicate that SpEnD outperforms existing Linked Data resource discovery methods.
Citation - WoS: 7
Phishing E-Mail Detection by Using Deep Learning Algorithms
(Assoc Computing Machinery, 2018) Hassanpour, Reza; Dogdu, Erdogan; Choupani, Roya; Goker, Onur; Nazli, Nazli
Citation - WoS: 292
Citation - Scopus: 374
Context-Aware Computing, Learning, and Big Data in Internet of Things: a Survey
(IEEE-Inst Electrical Electronics Engineers Inc, 2018) Dogdu, Erdogan; Ozbayoglu, Ahmet Murat; Sezer, Omer Berat
Internet of Things (IoT) has been growing rapidly due to recent advancements in communications and sensor technologies. Meanwhile, with this revolutionary transformation, researchers, implementers, deployers, and users are faced with many challenges. IoT is a complicated, crowded, and complex field; there are various types of devices, protocols, communication channels, architectures, middleware, and more. Standardization efforts are plentiful, and this chaos will continue for quite some time. What is clear, on the other hand, is that IoT deployments are increasing with accelerating speed, and this trend will not stop in the near future. As the field grows in numbers and heterogeneity, "intelligence" becomes a focal point in IoT. Since data now becomes "big data," understanding, learning, and reasoning with big data is paramount for the future success of IoT. One of the major problems in the path to intelligent IoT is understanding "context," or making sense of the environment, situation, or status using data from sensors, and then acting accordingly in autonomous ways. This is called "context-aware computing," and it now requires both sensing and, increasingly, learning, as IoT systems get more data and better learning from this big data. In this survey, we first review the field from a historical perspective, covering ubiquitous and pervasive computing, ambient intelligence, and wireless sensor networks, and then move to context-aware computing studies. Finally, we review learning and big data studies related to IoT. We also identify the open issues and provide insight into future study areas for IoT researchers.
Citation - WoS: 30
Citation - Scopus: 51
An Artificial Neural Network-Based Stock Trading System Using Technical Analysis and Big Data Framework
(Assoc Computing Machinery, 2017) Ozbayoglu, A. Murat; Dogdu, Erdogan; Sezer, Omer Berat
In this paper, a neural network-based stock price prediction and trading system using technical analysis indicators is presented. The model developed first converts the financial time series data into a series of buy-sell-hold trigger signals using the most commonly preferred technical analysis indicators. Then, a Multilayer Perceptron (MLP) artificial neural network (ANN) model is trained in the learning stage on the daily stock prices between 1997 and 2007 for all of the Dow30 stocks. The Apache Spark big data framework is used in the training stage. The trained model is then tested with data from 2007 to 2017. The results indicate that by choosing the most appropriate technical indicators, the neural network model can achieve results comparable to the Buy and Hold strategy in most of the cases. Furthermore, fine-tuning the technical indicators and/or optimization strategy can enhance the overall trading performance.
