NCTA 2024 Abstracts

Full Papers

Paper Nr:	14
Title:	Combined Depth and Semantic Segmentation from Synthetic Data and a W-Net Architecture
Authors:	Kevin Swingler, Teri Rumble, Ross Goutcher, Paul Hibbard, Mark Donoghue and Dan Harvey
Abstract:	Monocular pixel level depth estimation requires an algorithm to label every pixel in an image with its estimated distance from the camera. The task is more challenging than binocular depth estimation, where two cameras fixed a small distance apart are used. Algorithms that combine depth estimation with pixel level semantic segmentation show improved performance but present the practical challenge of requiring a dataset that is annotated at pixel level with both class labels and depth values. This paper presents a new convolutional neural network architecture capable of simultaneous monocular depth estimation and semantic segmentation and shows how synthetic data generated using computer games technology can be used to train such models. The algorithm performs at over 98% accuracy on the segmentation task and 88% on the depth estimation task.

Paper Nr:	39
Title:	Invertibility of ReLU-Layers: A Practical Approach
Authors:	Hannah Eckert, Daniel Haider, Martin Ehler and Peter Balazs
Abstract:	Invertibility in machine learning models is a pivotal feature that bridges the gap between model complexity and interpretability. For ReLU-layers, the practical verification of invertibility has been shown to be a difficult task that remains unsolved. Recently, a frame theoretic condition has been proposed to verify invertibility on an open or convex set; however, evaluating this condition is computationally infeasible in high dimensions. As an alternative, we propose an algorithm that stochastically samples the dataset to approximately verify the above condition for invertibility and can be efficiently implemented even in high dimensions. We use the algorithm to monitor invertibility and to enforce it during training in standard classification tasks.

Paper Nr:	44
Title:	Anomaly Detection in eSport Games Through Periodical In-Game Movement Analysis with Deep Recurrent Neural Network
Authors:	Mhd Irvan, Franziska Zimmer, Ryosuke Kobayashi, Maharage Nisansala Sevwandi Perera, Roberta Tamponi and Rie Shigetomi Yamaguchi
Abstract:	Detecting anomalies in online video games is important to ensure a fair and secure gaming session. This is particularly important in eSport games, where competitive fairness is crucial. In this paper, we present an approach to anomaly detection in gaming sessions using a variant of the deep recurrent neural network called the Long Short-Term Memory (LSTM) network. Recurrent Neural Networks (RNNs) and their variant, LSTMs, are well-suited for this kind of task due to their ability to capture sequential patterns in gameplay data. The proposed system learns from normal gameplay patterns to identify anomalous behaviors such as impersonation. To confirm the feasibility of our approach, we use the game Counter-Strike: Global Offensive (CSGO) as a case study. We utilize a public CSGO dataset containing in-game movement data, including coordinates, timestamps, and other contextual information. To test the model’s detection capabilities, synthetic data representing anomalous behaviors was injected into the dataset. The data was preprocessed and segmented into sequences, simulating the dynamics of player movements. Our LSTM model was trained to learn temporal dependencies within these sequences, enabling it to distinguish between normal and anomalous behaviors. Performance evaluation demonstrated the model’s robustness and effectiveness in detecting anomalies. The results indicate that our approach is able to detect anomalous activities, highlighting its potential for application in online gaming platforms to foster a more enjoyable gaming experience for all participants.

Paper Nr:	59
Title:	Temporal Complexity of a Hopfield-Type Neural Model in Random and Scale-Free Graphs
Authors:	Marco Cafiso and Paolo Paradisi
Abstract:	The Hopfield network model and its generalizations were introduced as a model of associative, or content-addressable, memory. They were widely investigated both as an unsupervised learning method in artificial intelligence and as a model of biological neural dynamics in computational neuroscience. The complexity features of biological neural networks have attracted the scientific community’s interest for the last two decades. More recently, concepts and tools borrowed from complex network theory were applied to artificial neural networks and learning, thus focusing on the topological aspects. However, the temporal structure is also a crucial property displayed by biological neural networks and investigated in the framework of systems displaying complex intermittency. The Intermittency-Driven Complexity (IDC) approach indeed focuses on the metastability of self-organized states, whose signature is a power-decay in the inter-event time distribution or a scaling behaviour in the related event-driven diffusion processes. The investigation of IDC in neural dynamics and its relationship with network topology is still in its early stages. In this work, we present the preliminary results of an IDC analysis carried out on a bio-inspired Hopfield-type neural network comparing two different connectivities, i.e., scale-free vs. random network topology. We found that random networks can trigger complexity features similar to those of scale-free networks, even if with some differences and for different parameter values, in particular for different noise levels.

Paper Nr:	61
Title:	Finding Strong Lottery Ticket Networks with Genetic Algorithms
Authors:	Philipp Altmann, Julian Schönberger, Maximilian Zorn and Thomas Gabor
Abstract:	According to the Strong Lottery Ticket Hypothesis, every sufficiently large neural network with randomly initialized weights contains a sub-network which – still with its random weights – already performs as well for a given task as the trained super-network. We present the first approach based on a genetic algorithm to find such strong lottery ticket sub-networks without training or otherwise computing any gradient. We show that, for smaller instances of binary classification tasks, our evolutionary approach even produces smaller and better-performing lottery ticket networks than the state-of-the-art approach using gradient information.

Paper Nr:	74
Title:	META: Deep Learning Pipeline for Detecting Anomalies on Multimodal Vibration Sewage Treatment Plant Data
Authors:	Simeon Krastev, Aukkawut Ammartayakun, Kewal Jayshankar Mishra, Harika Koduri, Eric Schuman, Drew Morris, Yuan Feng, Sai Supreeth Reddy Bandi, Chun-Kit Ngan, Andrew Yeung, Jason Li, Nigel Ko, Fatemeh Emdad, Elke Rundensteiner, Heiton M. H. Ho, T. K. Wong and Jolly P. C. Chan
Abstract:	In this paper, we propose a hybrid anomaly detection pipeline, META, which integrates Multimodal-feature Extraction (ME) and a Transformer-based Autoencoder (TA) for predictive maintenance of sewage treatment plants. META uses a three-step approach: First, it employs a signal averaging method to remove noise and improve the quality of signals related to pump health. Second, it extracts key signal properties from three vibration directions (Axial, Radial X, Radial Y), fuses them, and performs dimensionality reduction to create a refined PCA feature set. Third, a Transformer-based Autoencoder (TA) learns pump behavior from the PCA features to detect anomalies with high precision. We validate META with an experimental case study at the Stonecutters Island Sewage Treatment Works in Hong Kong, showing it outperforms state-of-the-art methods in metrics like MCC and F1-score. Lastly, we develop a web-based Sewage Pump Monitoring System hosting the META pipeline with an interactive interface for future use.

Paper Nr:	77
Title:	Searching for Idealized Prototype Learning for Interpreting Multi-Layered Neural Networks
Authors:	Ryotaro Kamimura
Abstract:	The present paper aims to show that neural learning consists of two fundamental phases: prototype and non-prototype learning in an ideal state. The prototype refers to a network with the simplest configuration, ideally determined without the influence of inputs. However, in actual learning, prototype and non-prototype learning are mixed and entangled. To demonstrate the existence of these two phases in an ideal state, it is necessary to explicitly distinguish between networks that are exclusively focused on acquiring the prototype and those that target non-prototype properties. We use different activation functions, combined serially, to represent the prototype and non-prototype learning phases. By combining multiple different activation functions, it is possible to create networks that exhibit both prototype and non-prototype properties in an almost ideal state. This method was applied to a business dataset that required improved generalization as well as interpretation. The experimental results confirmed that the ReLU activation function could identify the prototype with difficulty, while the hyperbolic tangent function could more easily detect the prototype. By combining these two activation functions within one framework, generalization performance could be improved while maintaining representations that are as close as possible to those obtained during prototype learning, thus facilitating easier interpretation.

Paper Nr:	80
Title:	Assessing Forecasting Model Robustness Through Curvature-Based Noise Perturbations
Authors:	Lynda Ayachi
Abstract:	This paper introduces a novel approach to robustness testing of forecasting models through the use of curvature-based noise perturbations. Traditional noise models, such as Gaussian and uniform noise, often fail to capture the complex structural variations inherent in real-world time series data. By calculating the curvature of a time series and selectively perturbing curvature values, we generate a new type of noise that directly alters the shape and smoothness of the data. This method provides a unique perspective on model performance, revealing sensitivities to structural changes that conventional noise types do not address. Our analysis demonstrates the impact of curvature distortions on seasonality, trend, and overall model accuracy, highlighting vulnerabilities in forecasting models that are otherwise masked by standard robustness tests. Results show that curvature-based noise significantly affects the ability of models to accurately predict future values, especially in the presence of cyclical and seasonal patterns. The findings suggest that incorporating curvature perturbations into robustness evaluations can provide deeper insights into model resilience and guide the development of more adaptable forecasting techniques.
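As a rough illustration of perturbing curvature rather than raw values, the sketch below makes several assumptions that the abstract does not specify: curvature is taken to be the discrete second difference, a random subset of curvature values receives Gaussian noise, and the series is rebuilt by integrating twice from the first two points. The function name and the `frac`, `scale`, and `seed` parameters are hypothetical.

```python
import numpy as np

def curvature_perturb(y, frac=0.2, scale=0.5, seed=0):
    """Perturb a fraction of second differences and rebuild the series."""
    y = np.asarray(y, dtype=float)
    if y.size < 3:
        raise ValueError("need at least 3 points to form a second difference")
    rng = np.random.default_rng(seed)

    kappa = np.diff(y, n=2)                      # assumed curvature proxy
    k = max(1, int(frac * kappa.size))           # how many values to perturb
    idx = rng.choice(kappa.size, size=k, replace=False)
    noisy = kappa.copy()
    noisy[idx] += rng.normal(0.0, scale * kappa.std(), size=k)

    # Integrate twice, keeping the first value and first slope unchanged.
    d0 = y[1] - y[0]
    slopes = np.concatenate(([d0], d0 + np.cumsum(noisy)))
    return np.concatenate(([y[0]], y[0] + np.cumsum(slopes)))

t = np.linspace(0.0, 2.0 * np.pi, 50)
clean = np.sin(t)
unchanged = curvature_perturb(clean, scale=0.0)  # zero noise: same series
distorted = curvature_perturb(clean, scale=1.0)  # shape is altered
```

Because each perturbation is integrated twice, a local change in curvature bends the rest of the series. Additive Gaussian noise on the values themselves does not alter the shape in this way.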

Paper Nr:	87
Title:	A Graph-Based Deep Learning Model for the Anti-Money Laundering Task of Transaction Monitoring
Authors:	Nazanin Bakhshinejad, Uyen Trang Nguyen, Shahram Ghahremani and Reza Soltani
Abstract:	Anti-money laundering (AML) refers to a comprehensive framework of laws, regulations, and procedures to prevent bad actors from disguising illegally obtained funds as legitimate income. The AML framework encompasses customer identity verification and risk assessment, monitoring transactions to detect suspicious money laundering activities, and reporting suspicious transactions to regulators. In this paper, we focus on the transaction monitoring task of the AML framework. We propose a graph convolutional network (GCN) model to classify transactions as legitimate or suspected of money laundering. We tested and evaluated the model on a publicly available large dataset to promote reproducibility. The proposed model was trained and evaluated using the classification objectives for AML transaction monitoring per industry standard. We describe in detail our solutions to the class imbalance problem typical of AML datasets. We present comprehensive experiments to demonstrate and justify how the important parameters of the model were optimized and selected. This helps to support reproducibility and comparison with future work.

Paper Nr:	88
Title:	Neuron Labeling for Self-Organizing Maps Using a Novel Example-Centric Algorithm with Weight-Centric Finalization
Authors:	Willem S. van Heerden
Abstract:	A self-organizing map (SOM) is an unsupervised artificial neural network that models training data using a map structure of neurons, which preserves the local topological structure of the training data space. An important step in the use of SOMs for data science is the labeling of neurons, where supervised neuron labeling is commonly used in practice. Two widely-used supervised neuron labeling methods for SOMs are example-centric neuron labeling and weight-centric neuron labeling. Example-centric neuron labeling produces high-quality labels, but tends to leave many neurons unlabeled, thus potentially hampering the interpretation or use of the labeled SOM. Weight-centric neuron labeling guarantees a label for every neuron, but often produces less accurate labels. This research proposes a novel hybrid supervised neuron labeling algorithm, which initially performs example-centric neuron labeling, after which missing labels are filled in using a weight-centric approach. The objective of this algorithm is to produce high-quality labels while still guaranteeing labels for every neuron. An empirical investigation compares the performance of the novel hybrid approach to example-centric neuron labeling and weight-centric neuron labeling, and demonstrates the feasibility of the proposed algorithm.
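The two-stage labeling described above can be sketched as follows. This is a minimal illustration, not the author's implementation. It assumes Euclidean distance, a majority vote among the examples mapped to a neuron in the example-centric stage, and the label of the single nearest training example in the weight-centric stage. The function name and the toy data are hypothetical.

```python
from collections import Counter

import numpy as np

def hybrid_label(weights, X, y):
    """Label SOM neurons: example-centric first, weight-centric for the rest.

    weights: (n_neurons, dim) neuron weight vectors
    X:       (n_examples, dim) labeled training examples
    y:       sequence of n_examples labels
    """
    weights = np.asarray(weights, dtype=float)
    X = np.asarray(X, dtype=float)
    # dist[i, n] = distance between example i and neuron n
    dist = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)

    # Stage 1 (example-centric): each example votes for its best matching unit.
    bmu = dist.argmin(axis=1)
    labels = [None] * len(weights)
    for n in range(len(weights)):
        votes = [y[i] for i in np.flatnonzero(bmu == n)]
        if votes:
            labels[n] = Counter(votes).most_common(1)[0][0]

    # Stage 2 (weight-centric): fill gaps with the nearest example's label.
    nearest_example = dist.argmin(axis=0)
    for n in range(len(weights)):
        if labels[n] is None:
            labels[n] = y[nearest_example[n]]
    return labels

toy_weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_X = [[0.1, 0.0], [0.0, 0.1], [1.0, 1.1], [0.9, 1.0]]
toy_y = ["a", "a", "b", "b"]
toy_labels = hybrid_label(toy_weights, toy_X, toy_y)
```

In the toy data no example maps to the third neuron, so the first stage leaves it unlabeled and the second stage assigns it the label of its closest training example.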

Short Papers

Paper Nr:	24
Title:	Contrastive Learning and Abstract Concepts: The Case of Natural Numbers
Authors:	Daniel N. Nissani (Nissensohn)
Abstract:	Contrastive Learning (CL) has been successfully applied to classification and other downstream tasks related to concrete concepts, such as objects contained in the ImageNet dataset. No attempts seem to have been made so far in applying this promising scheme to more abstract entities. A prominent example of these could be the concept of (discrete) Quantity. CL can be frequently interpreted as a self-supervised scheme guided by some profound and ubiquitous conservation principle (e.g. conservation of identity in object classification tasks). In this introductory work we apply a suitable conservation principle to the semi-abstract concept of natural numbers by which discrete quantities can be estimated or predicted. We experimentally show, by means of a toy problem, that contrastive learning can be trained to count at a glance with high accuracy at both human and super-human ranges. We compare this with the results of a supervised learning (SL) neural network scheme of similar architecture that is trained to count at a glance. We show that both schemes exhibit similarly good performance on baseline experiments, where the distributions of the training and testing stages are equal. Importantly, we demonstrate that in some generalization scenarios, where training and testing distributions differ, CL boasts more robust and much better error performance.

Paper Nr:	28
Title:	FSL-LFMG: Few-Shot Learning with Augmented Latent Features and Multitasking Generation for Enhancing Multiclass Classification on Tabular Data
Authors:	Aviv A. Nur, Chun-Kit Ngan and Rolf Bardeli
Abstract:	In this work, we propose an advancement of ProtoNet that employs latent features (LF) augmented by an autoencoder and multitasking generation (MG) by STUNT within a few-shot learning (FSL) mechanism. Specifically, the contributions of this work are threefold. First, we propose an FSL-LFMG framework to develop an end-to-end few-shot multiclass classification workflow on tabular data. This framework is composed of three main stages that include (i) data augmentation at the sample level utilizing autoencoders to generate augmented LF, (ii) data augmentation at the task level involving self-generating multitasks using the STUNT approach, and (iii) the learning process taking place on ProtoNet, followed by various model evaluations in our FSL mechanism. Second, because K-means clustering is sensitive to outliers and noise and Euclidean distance suffers from the curse of dimensionality, we enhance and customize the STUNT approach by using K-medoids clustering, which is less sensitive to noisy outliers, and Manhattan distance, which is preferable for high-dimensional data. Finally, we conduct an extensive experimental study on datasets from four diverse domains (Net Promoter Score segmentation, Dry Bean type, Wine type, and Forest Cover type) to show that our FSL-LFMG approach to multiclass classification outperforms the Tree Ensemble models and the One-vs-the-rest classifiers by 7.8% in 1-shot and 2.5% in 5-shot learning.

Paper Nr:	45
Title:	LSTM versus Transformers: A Practical Comparison of Deep Learning Models for Trading Financial Instruments
Authors:	Daniel K. Ruiru, Nicolas Jouandeau and Dickson Odhiambo
Abstract:	Predicting stock prices is a difficult but important task in the financial market. Two main methods are often used to predict these prices: fundamental and technical analysis. These methods are not without their limitations, which has led analysts and investors to use machine learning as they try to gain an edge in the market. In this paper, a comparison is made between recurrent neural networks and the Transformer model in predicting five financial instruments: Gold, EURUSD, GBPUSD, S&P500 Index and CF Industries. This comparison starts with base models of LSTM, Bidirectional LSTM and Transformers. In the initial experiments, LSTM and Bidirectional LSTM give consistent results but have more trainable parameters, whereas the Transformer model has fewer trainable parameters but gives inconsistent results. To try to gain an edge from their respective advantages, these models are combined. LSTM and Bidirectional LSTM are thus combined with the Transformer model in different variations and then trained on the same financial instruments. The best models are then trained on the larger datasets of the S&P500 index and CF Industries (1990-2024) and their results are used to make a simple trading agent whose profit and loss margin (P/L) is compared to the 2024 Q1 returns of the S&P500 index.

Paper Nr:	52
Title:	Quantum Neural Network Design via Quantum Deep Reinforcement Learning
Authors:	Anca-Ioana Muscalagiu
Abstract:	Quantum neural networks (QNNs) are a significant advancement at the intersection of quantum computing and artificial intelligence, potentially offering superior performance over classical models. However, designing optimal QNN architectures is challenging due to the necessity for deep quantum mechanics knowledge and the complexity of manual design. To address these challenges, this paper introduces a novel approach to automated QNN design using quantum deep reinforcement learning. Our method extends beyond simple quantum circuits by applying quantum reinforcement learning to design parameterized quantum circuits, integrating them into trainable QNNs. Our method is one of the first to autonomously generate optimal QNN architectures using quantum reinforcement learning. We aim to evaluate these architectures on various machine learning datasets to determine their accuracy and effectiveness, moving towards more efficient quantum computing solutions.

Paper Nr:	56
Title:	Towards Fairness in Machine Learning: Balancing Racially Imbalanced Datasets Through Data Augmentation and Generative AI
Authors:	Anthonie Schaap, Sofoklis Kitharidis and Niki van Stein
Abstract:	Existing AI models trained on facial images are often heavily biased towards certain ethnic groups due to training data containing unrealistic ethnicity splits. This study examines ethnic biases in facial recognition AI models, resulting from skewed dataset representations. Various data augmentation and generative AI techniques were evaluated to mitigate these biases, employing fairness metrics to measure improvements. Our methodology included balancing training datasets with synthetic data generated through Generative Adversarial Networks (GANs), targeting underrepresented ethnic groups. Experimental results indicate that these interventions effectively reduce bias, enhancing the fairness of AI models across different ethnicities. This research contributes practical approaches for adjusting dataset imbalances in AI systems, ultimately improving the reliability and ethical deployment of facial recognition technologies.

Paper Nr:	58
Title:	AI-Based Preliminary Modeling for Failure Prediction of Reactor Protection System in Nuclear Power Plants
Authors:	Hye Seon Jo, Ho Jun Lee, Ji Hun Park and Man Gyun Na
Abstract:	For nuclear power plants (NPPs), which generate electricity from nuclear fission, safe operation is crucial because of the potential risk of exposure to radioactive materials. NPPs contain a variety of safety systems, and this study aims to develop an artificial intelligence-based failure prediction model for the reactor protection system (RPS) that can predict potential failures in advance so that they can be prevented. Failure data for the RPS are currently being collected through a testbed; because the data acquired so far are insufficient, we conducted preliminary modeling using open-source data. The open-source data used are accelerated aging data of insulated gate bipolar transistors (IGBTs), and the remaining useful life of the IGBTs was predicted using long short-term memory and Monte Carlo dropout. Physical rules were also applied to improve prediction performance. Through performance evaluation of the developed prediction models, we searched for the optimal model and confirmed the applicability of the applied methodologies and technologies.

Paper Nr:	69
Title:	Online Match Prediction in Shogi Using Deep Convolutional Neural Networks
Authors:	Jim O’Connor and Melanie Fernández
Abstract:	This paper presents a novel approach to online evaluation of shogi games using Deep Convolutional Neural Networks (DCNNs). Shogi, a complex deterministic abstract strategy game, poses unique challenges due to its extensive game tree and the dynamic nature of piece movement, including the ability to play captured pieces. Traditional methods of game evaluation for shogi either rely on expert knowledge and handcrafted heuristics or incur prohibitively high computational costs and offer limited scalability. Our method uses a unique dataset of shogi game records and SFEN (Shogi Forsyth-Edwards Notation) strings to convert board positions into binary representations, which are then fed into a DCNN. The DCNN architecture, tailored for shogi board analysis, consists of convolutional and fully connected layers culminating in a binary classification output indicating a winning or losing position. Training the DCNN on approximately one million board states resulted in an 82.7% classification accuracy on a validation set. Our approach allows for online single board evaluation, while offering a computationally efficient alternative to traditional methods, paving the way for the development of additional shogi evaluation methods without the need for extensive expert knowledge or computational resources.

Paper Nr:	81
Title:	Stealing Brains: From English to Czech Language Model
Authors:	Petr Hyner, Petr Marek, David Adamczyk, Jan Hůla and Jan Šedivý
Abstract:	We present a simple approach for efficiently adapting pre-trained English language models to generate text in a lower-resource language, specifically Czech. We propose a vocabulary swap method that leverages parallel corpora to map tokens between languages, allowing the model to retain much of its learned capabilities. Experiments conducted on a Czech translation of the TinyStories dataset demonstrate that our approach significantly outperforms baseline methods, especially when using small amounts of training data. With only 10% of the data, our method achieves a perplexity of 17.89, compared to 34.19 for the next best baseline. We aim to contribute to work in the field of cross-lingual transfer in natural language processing, and we propose a simple-to-implement, computationally efficient method tested in a controlled environment.

Paper Nr:	82
Title:	Leveraging Deep Learning for Approaching Automated Pre-Clinical Rodent Models
Authors:	Carl Sandelius, Athanasios Pappas, Arezoo Sarkheyli-Hägele, Andreas Heuer and Magnus Johnsson
Abstract:	We evaluate deep learning architectures for rat pose estimation using a six-camera system, focusing on ResNet and EfficientNet across various depths and augmentation techniques. Among the configurations tested, ResNet 152 with default augmentation provided the best performance when employing a multi-perspective network approach in the controlled experimental setup. It reached a Root Mean Squared Error (RMSE) of 8.74, 8.78, and 9.72 pixels for the different angles. The data augmentation experiments revealed that less aggressive augmentation yields better performance. We propose potential areas for future research, including further refinement of model configurations, more in-depth investigation of inference speeds, and the possibility of transferring network weights to study other species, such as mice. The findings underscore the potential for deep learning solutions to advance preclinical research in behavioral neuroscience. We suggest building on this research to introduce behavioral recognition based on a 3D movement reconstruction, particularly emphasizing the motoric aspects of neurodegenerative diseases. This will allow for the correlation of observable behaviors with neuronal activity, contributing to a better understanding of the brain and aiding in developing new therapeutic strategies.

Paper Nr:	83
Title:	Unsupervised Feature Selection Using Extreme Learning Machine
Authors:	Mamadou Kanouté, Edith Grall-Maës and Pierre Beauseroy
Abstract:	In machine learning, feature selection is an important step in building an inference model with good generalization capacity when the number of variables is large. It can be supervised when the goal is to select features with respect to one or several target variables or unsupervised where no target variable is considered and the goal is to reduce the number of variables by removing redundant variables or noise. In this paper, we propose an unsupervised feature selection approach based on a model that uses a neural network with a single hidden layer in which a regularization term is incorporated to deal with nonlinear feature selection for multi-target regression problems. Experiments on synthetic and real-world data and comparisons with some methods in the literature show the effectiveness of this approach in the unsupervised framework.

Paper Nr:	85
Title:	End-to-End Steering for Autonomous Vehicles via Conditional Imitation Co-Learning
Authors:	Mahmoud M. Kishky, Hesham M. Eraqi and Khaled M. F. Elsayed
Abstract:	Autonomous driving involves complex tasks such as data fusion, object and lane detection, behavior prediction, and path planning. As opposed to the modular approach, which dedicates individual subsystems to each of those tasks, the end-to-end approach treats the problem as a single learnable task using deep neural networks, reducing system complexity and minimizing dependency on heuristics. Conditional imitation learning (CIL) trains the end-to-end model to mimic a human expert while taking into account the navigational commands that guide the vehicle to its destination. CIL adopts specialist network branches, each dedicated to learning the driving task for one navigational command. Nevertheless, the CIL model lacked generalization when deployed to unseen environments. This work introduces the conditional imitation co-learning (CIC) approach to address this issue by enabling the model to learn the relationships between CIL specialist branches via a co-learning matrix generated by gated hyperbolic tangent units (GTUs). Additionally, we propose posing the steering regression problem as classification: we use a classification-regression hybrid loss to bridge the gap between regression and classification, and we propose using co-existence probability to consider the spatial tendency between the steering classes. Our model is demonstrated to improve the autonomous driving success rate in unseen environments by 62% on average compared to the CIL method.

Paper Nr:	46
Title:	A Federated K-Means-Based Approach in eHealth Domains with Heterogeneous Data Distributions
Authors:	Giovanni Paragliola, Patrizia Ribino and Maria Mannone
Abstract:	Healthcare organizations collect and store significant amounts of patient health information. However, sharing or accessing this information outside of their facilities is often hindered by factors such as privacy concerns. Federated Learning (FL) data systems are emerging to overcome the siloed nature of health data and the barriers to sharing it. While federated approaches have been extensively studied, especially in classification problems, clustering-oriented approaches are still relatively few and less widespread, both in formulating algorithms and in their application in eHealth domains. The primary objective of this paper is to introduce a federated K-means-based approach for clustering tasks within the healthcare domain and explore the impact of heterogeneous health data distributions. The evaluation of the proposed federated K-means approach has been conducted on several health-related datasets through comparison with the centralized version and by estimating the trade-off between privacy and performance. The preliminary findings suggest that in the case of heterogeneous health data distributions, the difference between the centralized and federated approach is marginal, with the federated approach outperforming the centralized one on some healthcare datasets.

Paper Nr:	53
Title:	A Comparison of Advanced Machine Learning Models for Food Import Forecasting
Authors:	Corrado Mio and Siddhartha Shakya
Abstract:	Food security concerns food availability, access and price stability. Food import is used to ensure availability when local production is inadequate and diversity when local production is not possible. Food import prediction is one of the tools used to ensure food security. In this case study, we analyze Neural Network Forecasting models applied to a food import dataset to understand whether these models, when applied to small time series, perform better than statistical or regression models, and whether it is better to use short or long forecast horizons.

Paper Nr:	55
Title:	Spatial Learning and Overfitting in Visual Recognition and Route Planning Tasks
Authors:	Margarita Zaleshina and Alexander Zaleshin
Abstract:	Spatial data recognition, navigation based on localized visual cues, and the ability to identify significant elements in the environment and build routes are formed as a result of general spatial learning and then adjusted to a specific location. Modern artificial intelligence (AI), from visual processing applications to autonomous vehicles, also includes this capability. However, excessive learning can lead to overfitting, which significantly reduces the efficiency of spatial actions. In this work we describe typical algorithms for navigation, spatial learning in pigeon flights, and remote sensing recognition in neural networks. We consider learning algorithms based on significant topological elements, and suggest possible methods to expand learning opportunities and reduce the impact of erroneous settings. Our calculation results show how overfitting affects navigation behaviour and visual recognition. The results of this work provide direction for the future development of new algorithms that optimize the efficiency of spatial learning.