Journal of Artificial Intelligence and Soft Computing Research (Sciendo RSS Feed)
A Very Fast Feedforward Multilayer Neural Networks Training Algorithm<abstract> <title style='display:none'>Abstract</title> <p>This paper presents a novel fast algorithm for feedforward neural networks training. It is based on the Recursive Least Squares (RLS) method commonly used for designing adaptive filters. In addition, it utilizes two techniques of linear algebra, namely the orthogonal transformation method called the Givens Rotations (GR) and the QR decomposition, creating the GQR (symbolically, GR + QR = GQR) procedure for solving the normal equations in the weight update process. In this paper, a novel approach to the GQR algorithm is presented. The main idea revolves around reducing the computational cost of a single rotation by eliminating the square root calculation and reducing the number of multiplications. The proposed modification is based on the scaled version of the Givens rotations, denoted as SGQR. This modification is expected to bring a significant training time reduction compared to the classic GQR algorithm. The paper begins with the introduction and the description of the classic Givens rotation. Then, the scaled rotation and its usage in the QR decomposition are discussed. The main section of the article presents the neural network training algorithm which utilizes scaled Givens rotations and QR decomposition in the weight update process. Next, the experimental results of the proposed algorithm are presented and discussed. The experiments utilize several benchmarks combined with neural networks of various topologies.
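As an illustration, the classic Givens rotation at the core of the GQR procedure above can be sketched as follows; this is the textbook square-root-based rotation, not the paper's SGQR variant (which eliminates the square root), and the matrix is an arbitrary illustrative example:

```python
import numpy as np

def givens(a, b):
    # Classic Givens rotation zeroing b against a; note the square root
    # (np.hypot), which the scaled SGQR variant avoids.
    r = np.hypot(a, b)
    return (1.0, 0.0) if r == 0 else (a / r, b / r)

def qr_givens(A):
    # QR decomposition by rotating each sub-diagonal entry to zero,
    # column by column, bottom to top.
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for j in range(n):
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1, j], R[i, j])
            G = np.eye(m)
            G[[i - 1, i - 1, i, i], [i - 1, i, i - 1, i]] = [c, s, -s, c]
            R = G @ R            # zero out R[i, j]
            Q = Q @ G.T          # accumulate the orthogonal factor
    return Q, R

A = np.array([[4.0, 1.0], [2.0, 3.0], [1.0, 2.0]])
Q, R = qr_givens(A)              # Q @ R reconstructs A, R is upper triangular
```

In a training context, the resulting triangular system is what makes solving the normal equations cheap at every weight update.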
It is shown that the proposed algorithm outperforms several other commonly used methods, including the well-known Adam optimizer.</p> </abstract>
ARTICLE 2022-07-23
Flow-Capture Location Model with Link Capacity Constraint Over a Mixed Traffic Network<abstract> <title style='display:none'>Abstract</title> <p>This paper constructs and solves a charging facility location problem with a link capacity constraint over a mixed traffic network. The motivation for studying this problem is that link capacity constraints are mostly insufficient or missing in studies of traditional user equilibrium models, resulting in an ambiguous definition of the road traffic network status. Adding capacity constraints to the road network is a compromise that enhances the realism of the traditional equilibrium model. In this paper, we provide a bi-level model for evaluating the efficiency of the charging facilities under the link capacity constraint. The upper-level model in the proposed bi-level model is a nonlinear integer programming formulation, which aims to maximize the captured link flows of the battery electric vehicles. The lower-level model is a typical traffic equilibrium assignment model, except that it contains the link capacity constraint and the driving distance constraint of the electric vehicles over the mixed road network. Based on the Frank-Wolfe algorithm, a modified algorithm framework is adopted for solving the constructed problem, and finally, a numerical example is presented to verify the proposed model and solution algorithm.</p> </abstract>
ARTICLE 2022-07-23
Noise Robust Illumination Invariant Face Recognition Via Bivariate Wavelet Shrinkage in Logarithm Domain<abstract> <title style='display:none'>Abstract</title> <p>Recognizing faces under various lighting conditions is a challenging problem in artificial intelligence and its applications.
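The lower-level equilibrium assignment in the Flow-Capture model above is solved with a Frank-Wolfe-type method; a minimal sketch of the basic Frank-Wolfe iteration on a toy convex problem over the probability simplex (the objective and all names are illustrative, not the paper's traffic model):

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps=200):
    # Frank-Wolfe: at each step, move toward the simplex vertex that
    # minimizes the linearized objective (the linear minimization oracle).
    x = x0.copy()
    for k in range(steps):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0       # best vertex of the simplex
        gamma = 2.0 / (k + 2.0)     # standard diminishing step size
        x = (1 - gamma) * x + gamma * s
    return x

# Toy objective: squared distance to a target flow pattern b.
b = np.array([0.6, 0.3, 0.1])
f = lambda x: np.sum((x - b) ** 2)
grad = lambda x: 2 * (x - b)

x0 = np.ones(3) / 3
x = frank_wolfe_simplex(grad, x0)   # stays feasible, objective decreases
```

The appeal in traffic assignment is that each iteration only needs a shortest-path (linear) subproblem, never a projection.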
In this paper, we describe a new face recognition algorithm which is invariant to illumination. We first convert images to the logarithm domain and then transform them using the dual-tree complex wavelet transform (DTCWT), which yields images approximately invariant to changes in illumination. We classify the images with the collaborative representation-based classifier (CRC). We also perform the following sub-band transformations: (i) we set the approximation sub-band to zero if the noise standard deviation is greater than 5; (ii) we then threshold the two highest-frequency wavelet sub-bands using bivariate wavelet shrinkage; (iii) otherwise, we set these two highest-frequency wavelet sub-bands to zero. On the resulting images we perform the inverse DTCWT, which yields illumination-invariant face images. The proposed method is strongly robust to Gaussian white noise. Experimental results show that our proposed algorithm outperforms several existing methods on the Extended Yale Face Database B and the CMU-PIE face database.</p> </abstract>
ARTICLE 2022-07-23
New Event Based State Estimation for Discrete-Time Recurrent Delayed Semi-Markov Jump Neural Networks Via a Novel Summation Inequality<abstract> <title style='display:none'>Abstract</title> <p>This paper investigates the event-based state estimation of discrete-time recurrent delayed semi-Markovian neural networks. An event-triggering protocol is introduced to select measurement outputs with a specific triggering condition so as to lower the burden of data communication. A novel summation inequality is established for proving the asymptotic stability of the estimation error system. The problem addressed here is to construct an <italic>H</italic><sub>∞</sub> state estimator that guarantees asymptotic stability with the novel summation inequality, characterized by event-triggered transmission.
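The bivariate wavelet shrinkage used in the face recognition method above is, in its standard (Sendur-Selesnick) form, a joint child-parent threshold; a minimal sketch of that textbook rule (parameter names are illustrative, and this is not the paper's full pipeline):

```python
import numpy as np

def bivariate_shrink(w1, w2, sigma_n, sigma):
    # Bivariate shrinkage: a child wavelet coefficient w1 is shrunk jointly
    # with its parent w2; small joint magnitudes are zeroed entirely.
    mag = np.sqrt(w1 ** 2 + w2 ** 2)
    factor = np.maximum(mag - np.sqrt(3.0) * sigma_n ** 2 / sigma, 0.0)
    return w1 * factor / np.maximum(mag, 1e-12)

child = np.array([0.1, 10.0])    # wavelet coefficients in one sub-band
parent = np.array([0.1, 10.0])   # corresponding parent-scale coefficients
out = bivariate_shrink(child, parent, sigma_n=1.0, sigma=1.0)
# small coefficient is zeroed, large one is only mildly attenuated
```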
Using the Lyapunov functional technique, explicit expressions for the estimator gain are established. Finally, two numerical examples illustrate the usefulness of the new methodology.</p> </abstract>
ARTICLE 2022-07-23
A Novel Approach to Type-Reduction and Design of Interval Type-2 Fuzzy Logic Systems<abstract> <title style='display:none'>Abstract</title> <p>Fuzzy logic systems, unlike black-box models, are known as transparent artificial intelligence systems that have explainable rules of reasoning. Type-2 fuzzy systems extend the field of application to tasks that require the introduction of uncertainty in the rules, e.g. for handling corrupted data. Most practical implementations use interval type-2 sets and process interval membership grades. The key role in the design of interval type-2 fuzzy logic systems is played by the type-2 inference defuzzification method. In type-2 systems this generally takes place in two steps: type-reduction first, then standard defuzzification. The only exact type-reduction method is the iterative Karnik-Mendel (KM) algorithm, together with its enhancement modifications. The known non-iterative methods deliver only an approximation of the boundaries of a type-reduced set and, in special cases, they diminish the benefits that result from the use of type-2 fuzzy logic systems. In this paper, we propose a novel type-reduction method based on a smooth approximation of the maximum/minimum, and we call this method smooth type-reduction.
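One standard smooth approximation of the maximum, which could serve as the basis of such a smooth type-reduction, is the log-sum-exp function; the abstract does not specify the paper's exact approximation, so this is an illustrative choice only:

```python
import numpy as np

def smooth_max(x, alpha=50.0):
    # Log-sum-exp: a smooth, differentiable surrogate for max(x).
    # It always over-estimates (smooth_max >= max) and the error
    # vanishes as alpha grows; subtracting x.max() keeps it stable.
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log(np.sum(np.exp(alpha * (x - m)))) / alpha

vals = [1.0, 3.0, 2.5]
# smooth_max(vals, alpha) -> 3.0 as alpha increases
```

Being differentiable everywhere, such a surrogate can replace the iterative KM search inside an adaptive (gradient-trained) structure.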
Replacing the iterative KM algorithm by smooth type-reduction, we obtain a structure of an adaptive interval type-2 fuzzy logic system which is non-iterative and approximates the KM algorithm as closely as we like.</p> </abstract>
ARTICLE 2022-07-23
Position-Encoding Convolutional Network to Solving Connected Text Captcha<abstract> <title style='display:none'>Abstract</title> <p>Text-based CAPTCHA is a convenient and effective safety mechanism that has been widely deployed across websites. The efficient end-to-end models of scene text recognition, consisting of a CNN and an attention-based RNN, show limited performance in solving text-based CAPTCHAs. In contrast with street view images and documents, the character sequence in a CAPTCHA is non-semantic. The RNN therefore loses its ability to learn the semantic context and only implicitly encodes the relative position of extracted features. Meanwhile, the security features, which prevent characters from segmentation and recognition, extensively increase the complexity of CAPTCHAs. The performance of such a model is sensitive to different CAPTCHA schemes. In this paper, we analyze the properties of text-based CAPTCHAs and accordingly treat solving them as a highly position-relative character sequence recognition task. We propose a network named PosConv to leverage the position information in the character sequence without an RNN. PosConv uses a novel padding strategy and a modified convolution, explicitly encoding the relative position into the local features of characters. This mechanism makes the features extracted from CAPTCHAs more informative and robust.
We validate PosConv on six text-based CAPTCHA schemes, and it achieves state-of-the-art or competitive recognition accuracy with significantly fewer parameters and faster convergence.</p> </abstract>
ARTICLE 2022-02-23
A Progressive and Cross-Domain Deep Transfer Learning Framework for Wrist Fracture Detection<abstract> <title style='display:none'>Abstract</title> <p>There has been an amplified focus on, and benefit from, the adoption of artificial intelligence (AI) in medical imaging applications. However, deep learning approaches involve training with massive amounts of annotated data in order to guarantee generalization and achieve high accuracy. Gathering and annotating large sets of training images require expertise which is both expensive and time-consuming, especially in the medical field. Furthermore, in health care systems where mistakes can have catastrophic consequences, there is a general mistrust of the black-box aspect of AI models. In this work, we focus on improving the performance of medical imaging applications when limited data is available, while focusing on the interpretability of the proposed AI model. This is achieved by employing a novel transfer learning framework, <italic>progressive transfer learning</italic>, an automated annotation technique, and a correlation analysis experiment on the learned representations.</p> <p><italic>Progressive transfer learning</italic> helps jump-start the training of deep neural networks while improving performance by gradually transferring knowledge from two source tasks into the target task. It is empirically tested on the wrist fracture detection application by first training a general radiology network <italic>RadiNet</italic> and using its weights to initialize <italic>RadiNet<sub>wrist</sub></italic>, which is trained on wrist images to detect fractures.
Experiments show that <italic>RadiNet<sub>wrist</sub></italic> achieves an accuracy of 87% and an AUC ROC of 94%, as opposed to 83% and 92% when it is pre-trained on the ImageNet dataset.</p> <p>This improvement in performance is investigated within an <italic>explainable AI</italic> framework. More concretely, the learned deep representations of <italic>RadiNet<sub>wrist</sub></italic> are compared to those learned by the baseline model by conducting a correlation analysis experiment. The results show that, when transfer learning is <italic>gradually</italic> applied, some features are learned earlier in the network. Moreover, the deep layers in the <italic>progressive transfer learning</italic> framework are shown to encode features that are not encountered when traditional transfer learning techniques are applied.</p> <p>In addition to the empirical results, a clinical study is conducted and the performance of <italic>RadiNet<sub>wrist</sub></italic> is compared to that of an expert radiologist. We found that <italic>RadiNet<sub>wrist</sub></italic> exhibited performance similar to that of radiologists with more than 20 years of experience.</p> <p>This motivates follow-up research to train on more data in order to feasibly surpass radiologists’ performance, and to investigate the interpretability of AI models in the healthcare domain, where the decision-making process needs to be credible and transparent.</p> </abstract>
ARTICLE 2022-02-23
Machine Learning and Traditional Econometric Models: A Systematic Mapping Study<abstract> <title style='display:none'>Abstract</title> <p><italic>Context</italic>: Machine Learning (ML) is a disruptive concept that has given rise to and generated interest in different applications in many fields of study. The purpose of Machine Learning is to solve real-life problems by automatically learning from experience and improving, without being explicitly programmed for a specific problem, but rather for a generic type of problem.
This article reviews the different applications of ML in a series of econometric methods.</p> <p><italic>Objective</italic>: The objective of this research is to identify the latest applications and to carry out a comparative study of the performance of econometric and ML models. The study aimed to find empirical evidence for the performance of ML algorithms being superior to traditional econometric models. The methodology of systematic mapping of the literature has been followed to carry out this research, according to the guidelines established by [39] and [58], which facilitate the identification of studies published on this subject.</p> <p><italic>Results</italic>: The results show that in most cases ML outperforms econometric models, while in other cases the best performance has been achieved by combining traditional methods and ML applications.</p> <p><italic>Conclusion</italic>: Inclusion and exclusion criteria have been applied, and 52 closely related articles have been reviewed. The conclusion drawn from this research is that this is a growing field, and that there is no certainty that the performance of ML is always superior to that of econometric models.</p> </abstract>
ARTICLE 2022-02-23
An Autoencoder-Enhanced Stacking Neural Network Model for Increasing the Performance of Intrusion Detection<abstract> <title style='display:none'>Abstract</title> <p>Security threats, among other intrusions affecting the availability, confidentiality and integrity of IT resources and services, are spreading fast and can cause serious harm to organizations. Intrusion detection has a key role in capturing intrusions. In particular, the application of machine learning methods in this area can enrich the intrusion detection efficiency. Various methods, such as pattern recognition from event logs, can be applied in intrusion detection.
The main goal of our research is to present a possible intrusion detection approach using recent machine learning techniques. In this paper, we suggest and evaluate the usage of stacked ensembles consisting of neural network (SNN) and autoencoder (AE) models, augmented with a tree-structured Parzen estimator hyperparameter optimization approach, for intrusion detection. The main contribution of our work is the joint application of advanced hyperparameter optimization and stacked ensembles.</p> <p>We conducted several experiments to check the effectiveness of our approach. We used the NSL-KDD dataset, a common benchmark dataset in intrusion detection, to train our models. The comparative results demonstrate that our proposed models can compete with and, in some cases, outperform existing models.</p> </abstract>
ARTICLE 2022-02-23
Handling Realistic Noise in Multi-Agent Systems with Self-Supervised Learning and Curiosity<abstract> <title style='display:none'>Abstract</title> <p>Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons.
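The stacked-ensemble idea described in the intrusion-detection abstract above can be sketched with scikit-learn; the estimators below are simplified stand-ins on synthetic data (the paper pairs SNN and autoencoder models with tree-structured Parzen estimator tuning, which this sketch omits):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for an intrusion-detection dataset (e.g. NSL-KDD features).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stacking: base learners' out-of-fold predictions feed a meta-learner.
stack = StackingClassifier(
    estimators=[
        ("mlp1", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
        ("mlp2", MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=1)),
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
```

The meta-learner is trained on cross-validated base predictions, which is what lets the ensemble outperform any single base model.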
Our extensive experiments show that the proposed methods achieve state-of-the-art performance compared against actor-critic methods, ICM, and PPO.</p> </abstract>
ARTICLE 2022-02-23
Energy Associated Tuning Method for Short-Term Series Forecasting by Complete and Incomplete Datasets<abstract><title style='display:none'>Abstract</title><p>This article presents short-term predictions using neural networks tuned by an energy-associated series-based predictor filter, for both complete and incomplete datasets. A benchmark of high-roughness time series from the Mackey-Glass (MG), Logistic (LOG) and Henon (HEN) systems, together with some univariate series chosen from the NN3 Forecasting Competition, is used. An average smoothing technique is used to fill in the data missing from the dataset. The Hurst parameter, estimated through wavelets, is used to estimate the roughness of the real and forecasted series. The validation horizon of the time series is 15 values ahead. The performance of the proposed filter shows that even when a short dataset is incomplete and a linear smoothing technique is employed, the prediction remains fair in terms of the SMAPE index. The main result shows that the predictor system based on the energy associated to the series performs well on several chaotic time series and, in particular, provides a good estimation when the short-term series are taken from single-point observations.</p></abstract>
ARTICLE 2016-12-17
Kernel Analysis for Estimating the Connectivity of a Network with Event Sequences<abstract><title style='display:none'>Abstract</title><p>Estimating the connectivity of a network from events observed at each node has many applications. One prominent example is found in neuroscience, where spike trains (sequences of action potentials) are observed at each neuron, but the way in which these neurons are connected is unknown.
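The SMAPE index used to assess the forecasts in the abstract above has several variants in the literature; a minimal sketch of one common definition (which may differ from the paper's exact formula):

```python
import numpy as np

def smape(actual, forecast):
    # Symmetric mean absolute percentage error, in percent. Variants
    # differ in the denominator; this form uses |a| + |f| per point.
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(2.0 * np.abs(f - a) / (np.abs(a) + np.abs(f)))

perfect = smape([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # exact forecast -> 0
off = smape([100.0], [50.0])                       # 50% under-forecast
```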
This paper introduces a novel method for estimating connections between nodes using a similarity measure between sequences of event times. Specifically, a normalized positive definite kernel defined on spike trains was used. The proposed method was evaluated using synthetic and real data, by comparison with methods based on transfer entropy and the Victor-Purpura distance. Synthetic data were generated using the Coupled Escape-Rate Model (CERM), a model that generates various spike trains. Real data recorded from the visual cortex of an anaesthetized cat were analyzed as well. The results showed that the proposed method provides an effective way of estimating the connectivity of a network when the time sequences of events are the only available information.</p></abstract>
ARTICLE 2016-12-17
Can Learning Vector Quantization be an Alternative to SVM and Deep Learning? - Recent Trends and Advanced Variants of Learning Vector Quantization for Classification Learning<abstract><title style='display:none'>Abstract</title><p>Learning vector quantization (LVQ), intuitively introduced by Kohonen, is one of the most powerful approaches for prototype-based classification of vector data. The prototype adaptation scheme relies on attraction and repulsion during learning, providing an easy geometric interpretability of the learning as well as of the classification decision scheme.
Although deep learning architectures and support vector classifiers frequently achieve comparable or even better results, LVQ models are smart alternatives with low complexity and computational cost, making them attractive for many industrial applications like intelligent sensor systems or advanced driver assistance systems.</p><p>Nowadays, the mathematical theory developed for LVQ delivers sufficient justification of the algorithm, making it an appealing alternative to other approaches like support vector machines and deep learning techniques.</p><p>This review article reports current developments and extensions of LVQ, starting from the generalized LVQ (GLVQ), which is known as the most powerful cost-function-based realization of the original LVQ. The cost function minimized in GLVQ is a soft approximation of the standard classification error, allowing gradient descent learning techniques. The GLVQ variants considered in this contribution cover many aspects like border-sensitive learning, application of non-Euclidean metrics like kernel distances or divergences, relevance learning, as well as optimization of advanced statistical classification quality measures beyond accuracy, including sensitivity and specificity or the area under the ROC curve.</p><p>According to these topics, the paper highlights the basic motivation for these variants and extensions together with the mathematical prerequisites and treatments for integration into the standard GLVQ scheme, and compares them to other machine learning approaches.
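The GLVQ cost function mentioned above is built from a per-sample relative distance measure; a minimal sketch of that measure and its soft error (the sigmoid transfer is one common choice, and prototype values here are illustrative):

```python
import numpy as np

def glvq_mu(x, w_plus, w_minus):
    # GLVQ classifier measure in [-1, 1]: negative iff the nearest
    # prototype of the correct class (w_plus) is closer than the
    # nearest prototype of a wrong class (w_minus).
    d_plus = np.sum((x - w_plus) ** 2)
    d_minus = np.sum((x - w_minus) ** 2)
    return (d_plus - d_minus) / (d_plus + d_minus)

def glvq_loss(mu):
    # Sigmoid of mu: the soft approximation of the 0/1 classification
    # error that makes the cost differentiable for gradient descent.
    return 1.0 / (1.0 + np.exp(-mu))

x = np.array([0.0, 0.0])
mu = glvq_mu(x, w_plus=np.array([0.1, 0.0]), w_minus=np.array([1.0, 0.0]))
loss = glvq_loss(mu)   # correctly classified sample -> mu < 0, loss < 0.5
```

Training moves w_plus toward x and w_minus away from it along the gradient of this loss, which is the attraction/repulsion scheme described above.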
For detailed descriptions and the mathematical theory behind all of these, the reader is referred to the respective original articles.</p><p>Thus, the intention of the paper is to provide a comprehensive overview of the state-of-the-art, serving as a starting point in the search for an appropriate LVQ variant for a given specific classification problem, as well as a reference to recently developed variants and improvements of the basic GLVQ scheme.</p></abstract>
ARTICLE 2016-12-17
A New Mechanism for Data Visualization with Tsk-Type Preprocessed Collaborative Fuzzy Rule Based System<abstract><title style='display:none'>Abstract</title><p>A novel data knowledge representation combining the structure learning ability of preprocessed collaborative fuzzy clustering and the fuzzy expert knowledge of a Takagi-Sugeno-Kang type model is presented in this paper. The proposed method divides a huge dataset into two or more subsets. The subsets interact with each other through a collaborative mechanism in order to find similar properties within each other. The proposed method is useful in dealing with big data issues, since it divides a huge dataset into subsets and finds common features among them. The salient feature of the proposed method is that it uses a small subset of the dataset and some common features instead of the entire dataset and all the features. Before the interactions among the subsets of the dataset, the proposed method applies a mapping technique to granules of data and centroids of clusters. The proposed method uses the information of only about half of the data patterns for the training process, and it provides an accurate and robust model, whereas the other existing methods use the entire information of the data patterns.
Simulation results show that the proposed method performs better than existing methods on some benchmark problems.</p></abstract>
ARTICLE 2016-12-17
A Survey of Artificial Intelligence Techniques Employed for Adaptive Educational Systems within E-Learning Platforms<abstract><title style='display:none'>Abstract</title><p>Adaptive educational systems within e-learning platforms are built in response to the fact that the learning process is different for each and every learner. In order to provide adaptive e-learning services and study materials that are tailor-made for adaptive learning, this type of educational approach seeks to combine the ability to comprehend and detect a person’s specific needs in the context of learning with the expertise required to use appropriate learning pedagogy and enhance the learning process. Thus, it is critical to create accurate student profiles and models based upon analysis of their affective states, knowledge level, and their individual personality traits and skills. The acquired data can then be efficiently used and exploited to develop an adaptive learning environment. Once acquired, these learner models can be used in two ways. The first is to inform the pedagogy proposed by the experts and designers of the adaptive educational system. The second is to give the system dynamic self-learning capabilities, from the behaviors exhibited by the teachers and students, to create the appropriate pedagogy and automatically adjust the e-learning environments to suit the pedagogies. In this respect, artificial intelligence techniques may be useful for several reasons, including their ability to develop and imitate human reasoning and decision-making processes (the learning-teaching model) and to minimize the sources of uncertainty in order to achieve an effective learning-teaching context. These learning capabilities ensure both learner and system improvement over the lifelong learning mechanism.
In this paper, we present a survey of topics related to the field of artificial intelligence techniques employed for adaptive educational systems within e-learning, their advantages and disadvantages, and a discussion of the importance of using those techniques to achieve more intelligent and adaptive e-learning environments.</p></abstract>
ARTICLE 2016-12-17
Performance Analysis of Rough Set-Based Hybrid Classification Systems in the Case of Missing Values<abstract> <title style='display:none'>Abstract</title> <p>The paper presents a performance analysis of a selected few rough set-based classification systems. They are hybrid solutions designed to process information with missing values. Rough set-based classification systems combine various classification methods, such as support vector machines, k-nearest neighbour, fuzzy systems, and neural networks, with the rough set theory. When all input values take the form of real numbers and are available, the structure of the classifier reverts to a non-rough set version. The performance of the four systems has been analysed based on the classification results obtained for benchmark databases downloaded from the machine learning repository of the University of California at Irvine.</p> </abstract>
ARTICLE 2021-10-08
A Novel Fast Feedforward Neural Networks Training Algorithm<abstract> <title style='display:none'>Abstract</title> <p>In this paper a new neural networks training algorithm is presented. The algorithm originates from the Recursive Least Squares (RLS) method commonly used in adaptive filtering. It uses the QR decomposition in conjunction with Givens rotations for solving the normal equation resulting from minimization of the loss function. An important parameter of neural network training is the training time.
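The RLS method from which the training algorithm above originates can be sketched in its basic adaptive-filtering form, fitting a linear model recursively, one sample at a time (this is the textbook RLS recursion, not the paper's QR/Givens-based procedure; data and parameters are illustrative):

```python
import numpy as np

def rls_fit(X, y, lam=1.0, delta=100.0):
    # Recursive Least Squares for y ~ w.x: each sample updates the weights
    # and the inverse-correlation matrix P without re-solving from scratch.
    n = X.shape[1]
    w = np.zeros(n)
    P = delta * np.eye(n)                   # large delta = weak initial prior
    for x, t in zip(X, y):
        k = P @ x / (lam + x @ P @ x)       # gain vector
        e = t - w @ x                       # a-priori prediction error
        w = w + k * e
        P = (P - np.outer(k, x @ P)) / lam  # lam is the forgetting factor
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = rls_fit(X, y)   # converges to the least-squares solution
```

The appeal for training is the same as in filtering: each update costs O(n^2) instead of re-solving the full normal equations.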
Many commonly used algorithms require a large number of iterations in order to achieve a satisfactory outcome, while other algorithms are effective only for small neural networks. The proposed solution is characterized by a very short convergence time compared to the well-known backpropagation method and its variants. The paper contains a complete mathematical derivation of the proposed algorithm. Extensive simulation results are presented using various benchmarks, including function approximation, classification, encoder, and parity problems. The obtained results show the advantages of the featured algorithm, which outperforms commonly used state-of-the-art neural network training algorithms, including the Adam optimizer and Nesterov’s accelerated gradient.</p> </abstract>
ARTICLE 2021-10-08
Decision Making Support System for Managing Advertisers By Ad Fraud Detection<abstract> <title style='display:none'>Abstract</title> <p>Efficient lead management allows substantially enhancing online channel marketing programs. In this paper, we classify website traffic into human- and bot-originated traffic. We use feedforward neural networks with embedding layers, and one-hot encoding for categorical data. The mouse click data come from seven large retail stores, and the lead classification data from three financial institutions. The data are collected by a JavaScript code embedded into HTML pages. The three proposed models achieved relatively high accuracy in detecting artificially generated traffic.</p> </abstract>
ARTICLE 2021-10-08
A Novel Grid-Based Clustering Algorithm<abstract> <title style='display:none'>Abstract</title> <p>Data clustering is an important method used to discover naturally occurring structures in datasets. One of the most popular approaches is the grid-based family of clustering algorithms.
This kind of method is characterized by fast processing times and can also discover clusters of arbitrary shapes in datasets. These properties allow such methods to be used in many different applications. Researchers have created many versions of the clustering method using the grid-based approach. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm which uses a method for automatically determining the number of grid cells. This method is based on the <italic>k<sub>dist</sub></italic> function, which computes the distance between each element of a dataset and its <italic>k</italic>th nearest neighbor. Experimental results have been obtained for several different datasets, and they confirm the very good performance of the newly proposed method.</p> </abstract>
ARTICLE 2021-10-08
A New Statistical Reconstruction Method for the Computed Tomography Using an X-Ray Tube with Flying Focal Spot<abstract> <title style='display:none'>Abstract</title> <p>This paper presents a new image reconstruction method for spiral cone-beam tomography scanners in which an X-ray tube with a flying focal spot is used. The method is based on principles related to the statistical model-based iterative reconstruction (MBIR) methodology. The proposed approach is a continuous-to-continuous data model approach, and the forward model is formulated as a shift-invariant system. This allows for avoiding a nutating reconstruction-based approach, e.g. the advanced single slice rebinning methodology (ASSR), that is usually applied in computed tomography (CT) scanners with X-ray tubes with a flying focal spot. In turn, the proposed approach allows for significantly accelerating the reconstruction processing and, generally, for greatly simplifying the entire reconstruction procedure.
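The k_dist function in the clustering abstract above can be sketched directly from its definition, using brute-force pairwise distances on an illustrative toy dataset:

```python
import numpy as np

def k_dist(data, k):
    # Distance from each point to its k-th nearest neighbour.
    # Column 0 of the sorted distance matrix is each point's distance
    # to itself (zero), so column k is the k-th nearest other point.
    D = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=-1)
    return np.sort(D, axis=1)[:, k]

# Three clustered points and one outlier.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
d2 = k_dist(pts, 2)   # clustered points get small values, the outlier a large one
```

Sorting these values then exposes a "knee" that can drive the choice of grid resolution, which is the role the abstract assigns to this function.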
Additionally, it improves the quality of the reconstructed images in comparison to traditional algorithms, as confirmed by extensive simulations. It is worth noting that the main purpose of introducing statistical reconstruction methods to medical CT scanners is to reduce the impact of measurement noise on the quality of tomography images and, consequently, to reduce the dose of X-ray radiation absorbed by the patient. A series of computer simulations followed by doctors’ assessments have been performed, which indicate how great a reduction of the absorbed dose can be achieved using the reconstruction approach presented here.</p> </abstract>
ARTICLE 2021-10-08