Journal information
International journal of machine learning and cybernetics
Springer
Quarterly

1868-8071

Indexed in: EI, SCI
Officially published

    Axiomatic approaches to three types of L-valued rough sets

    Yanan Chen, Xiaowei Wei
    pp. 5469-5493
    Abstract: In this paper, taking L to be a GL-quantale, we further develop the theory of L-valued rough sets with an L-set as the basic universe on which L-valued rough approximation operators are defined. Choosing an L-set as the universe departs from the usual practice of adopting Zadeh’s fuzzy sets as the universe. We first introduce three types of L-valued relations on L-sets, namely inverse serial, mediate, and Euclidean relations, and then characterize them by L-valued rough sets. Adopting the idea of single axiomatic characterizations of L-valued rough sets, we present axiomatic characterizations of L-valued upper and lower rough approximation operators on an L-set with respect to these new L-valued relations, using fuzzy unions and fuzzy intersections. Moreover, in a categorical framework, we introduce the concepts of L-valued Alexander co-topological spaces and L-valued closure spaces on L-sets by fuzzy unions and fuzzy intersections, and prove that the corresponding categories are isomorphic. Using fuzzy unions, we obtain a simplified axiomatic system for L-valued closure spaces, which highlights the advantages of fuzzy unions. Finally, we show that the category of L-valued Alexander co-topological spaces and their continuous mappings is isomorphic to the category of L-valued preordered approximation spaces and their order-preserving mappings.
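
For orientation, the familiar L-valued upper and lower rough approximation operators over a crisp universe X can be written as below. This is the standard form from the fuzzy rough set literature, stated here as an assumption; the paper's own operators are defined over an L-set universe and are not given in the abstract. The symbols are illustrative: R is an L-valued relation on X, A is an L-valued subset of X, and the tensor and arrow denote the quantale multiplication and its residuum.

```latex
\overline{R}(A)(x) = \bigvee_{y \in X} \bigl( R(x,y) \otimes A(y) \bigr),
\qquad
\underline{R}(A)(x) = \bigwedge_{y \in X} \bigl( R(x,y) \rightarrow A(y) \bigr).
```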

    Context-aware generative prompt tuning for relation extraction

    Xiaoyong Liu, Handong Wen, Chunlin Xu, Zhiguo Du...
    pp. 5495-5508
    Abstract: Relation extraction aims to extract semantic relations between predefined entities from text. Recently, prompt tuning has achieved promising results in relation extraction; its core idea is to insert a template into the input and model relation extraction as a masked language modeling (MLM) problem. However, existing prompt tuning approaches ignore the rich semantic information between entities and relations, resulting in suboptimal performance. In addition, since MLM tasks can only identify one relation at a time, the widespread problem of entity overlap in relation extraction cannot be solved. To this end, we propose a novel Context-Aware Generative Prompt Tuning (CAGPT) method, which ensures the comprehensiveness of triplet extraction by modeling relation extraction as a generative task and outputs all triplets related to the same entity at one time to overcome the entity overlap problem. Moreover, we connect entities and relations with natural language and inject entity and relation information into the designed template, which makes full use of the rich semantic information between entities and relations. Extensive experimental results on four benchmark datasets demonstrate the effectiveness of the proposed method.
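
The generative framing described above can be illustrated with a small sketch. The template wording, the `relation: tail` output format, and the names `build_prompt` and `parse_triplets` are all hypothetical; the abstract does not specify CAGPT's actual template or decoding scheme.

```python
def build_prompt(sentence: str, head: str) -> str:
    """Wrap the input sentence in a hypothetical natural-language template."""
    return f"{sentence} List every relation of {head}:"


def parse_triplets(head: str, generated: str) -> list:
    """Parse generated text of the assumed form 'relation: tail; relation: tail'
    into (head, relation, tail) triplets that all share the same head entity."""
    triplets = []
    for chunk in generated.split(";"):
        if ":" not in chunk:
            continue  # skip fragments that do not follow the assumed format
        relation, tail = chunk.split(":", 1)
        triplets.append((head, relation.strip(), tail.strip()))
    return triplets
```

Because one generated string can carry several `relation: tail` pairs, a single pass yields every triplet for the entity, which is the property the abstract contrasts with one-relation-at-a-time MLM prediction.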

    GFD-SSL: generative federated knowledge distillation-based semi-supervised learning

    Ali Karami, Reza Ramezani, Ahmad Baraani Dastjerdi
    pp. 5509-5529
    Abstract: Federated semi-supervised learning (Fed-SSL) algorithms have been developed to address the challenges of decentralized data access, data confidentiality, and costly data labeling in distributed environments. Most existing Fed-SSL algorithms are based on the federated averaging approach, which uses an identical model on all machines and replaces local models during the learning process. However, these algorithms suffer from significant communication overhead when transferring the parameters of local models. In contrast, knowledge distillation-based Fed-SSL algorithms reduce communication costs by transferring only the output of local models on shared data between machines. However, these algorithms assume that all local data on the machines are labeled and that a large set of shared unlabeled data exists for training. These assumptions do not always hold in real-world applications. In this paper, a knowledge distillation-based Fed-SSL algorithm is presented that makes no assumptions about how the data is distributed among machines. Additionally, it artificially generates the shared data required for the learning process. The learning process employs a semi-supervised GAN on local machines and has two stages. In the first stage, each machine trains its local model independently. In the second stage, each machine generates some artificial data in each step and propagates it to the other machines. Each machine trains its discriminator with these data and the average output of all machines on these data. The effectiveness of this algorithm has been examined in terms of accuracy and the amount of communication among machines, using different data sets with different distributions. The evaluations reveal that, on average, the presented algorithm is 15% more accurate than state-of-the-art methods, especially in the case of non-IID data. In addition, in most cases it yields better results than existing studies in terms of the amount of data communicated among machines.
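
The second-stage step of averaging every machine's output on the shared generated data could look roughly like the sketch below. The function name and the nested-list layout are assumptions made for illustration, not the authors' implementation.

```python
def average_outputs(outputs_per_machine):
    """Average class-probability outputs across machines.

    outputs_per_machine[m][i][c] is the probability that machine m assigns
    to class c for shared sample i (layout assumed for this sketch).
    Returns one averaged distribution per sample, usable as a soft target.
    """
    n_machines = len(outputs_per_machine)
    n_samples = len(outputs_per_machine[0])
    n_classes = len(outputs_per_machine[0][0])
    return [
        [
            sum(outputs_per_machine[m][i][c] for m in range(n_machines)) / n_machines
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]
```

Only these per-sample output vectors need to be exchanged, not model parameters, which is where the communication saving described in the abstract comes from.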

    WTGCN: wavelet transform graph convolution network for pedestrian trajectory prediction

    Wangxing Chen, Haifeng Sang, Jinyu Wang, Zishan Zhao...
    pp. 5531-5548
    Abstract: The task of pedestrian trajectory prediction remains challenging due to variable scenarios, complex social interactions, and uncertainty in pedestrian motion. Previous trajectory prediction research models only in the time domain, which makes it difficult to accurately capture the global and detailed features of complex pedestrian social interactions and the uncertainty of pedestrian movement. These methods also ignore the relationship between scene features and the potential motion patterns of pedestrians. Therefore, we propose a wavelet transform graph convolution network to obtain accurate potential motion patterns of pedestrians through time-frequency analysis. We first construct spatial and temporal graphs, then obtain the attention score matrices through the self-attention mechanism in the time domain and combine them with the scene features. Next, we use the two-dimensional discrete wavelet transform to generate low-frequency and high-frequency components that represent the global and detailed features of spatial-temporal interactions. These components are further processed using asymmetric convolution, and the wavelet transform adjacency matrix is obtained through the inverse wavelet transform. We then employ graph convolution to combine the graph and the adjacency matrix to obtain spatial and temporal interaction features. Finally, we design a wavelet transform temporal convolution network to directly predict the two-dimensional Gaussian distribution parameters of the future trajectory. Extensive experiments on the ETH, UCY, and SDD datasets demonstrate that our method outperforms state-of-the-art methods in prediction performance.
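
To make the low-frequency and high-frequency split concrete, here is a minimal one-level two-dimensional Haar transform and its inverse in plain Python. It uses an unnormalised averaging variant and assumes even matrix dimensions; the abstract does not state which wavelet or normalisation the paper uses, so treat the choice of Haar and all function names as assumptions.

```python
def haar_1d(v):
    """Split a sequence of even length into pairwise averages and differences."""
    low = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v), 2)]
    return low, high


def inverse_haar_1d(low, high):
    """Rebuild the original sequence from averages and differences."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def haar_2d(m):
    """One-level 2D transform: filter rows, then columns.
    Returns (ll, lh, hl, hh), where ll is the low-frequency (global) part."""
    row_low, row_high = [], []
    for row in m:
        l, h = haar_1d(row)
        row_low.append(l)
        row_high.append(h)

    def filter_columns(block):
        lows, highs = [], []
        for col in transpose(block):
            l, h = haar_1d(col)
            lows.append(l)
            highs.append(h)
        return transpose(lows), transpose(highs)

    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


def inverse_haar_2d(ll, lh, hl, hh):
    """Invert haar_2d: undo the column step, then the row step."""
    def merge_columns(low_block, high_block):
        cols = [inverse_haar_1d(l, h)
                for l, h in zip(transpose(low_block), transpose(high_block))]
        return transpose(cols)

    row_low = merge_columns(ll, lh)
    row_high = merge_columns(hl, hh)
    return [inverse_haar_1d(l, h) for l, h in zip(row_low, row_high)]
```

In the pipeline the abstract describes, the four sub-bands would be processed (there, by asymmetric convolution) before the inverse transform is applied; this sketch only shows that the decomposition is lossless when the sub-bands are left unchanged.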

    Sequential attention layer-wise fusion network for multi-view classification

    Qing Teng, Xibei Yang, Qiguo Sun, Pingxin Wang...
    pp. 5549-5561
    Abstract: Graph convolutional networks have shown excellent performance in multi-view classification. Currently, to output a fused node embedding representation in multi-view scenarios, existing studies tend to ensure the consistency of embedded node information among multiple views. However, they pay much attention to immediate-neighbor information rather than multi-order node information, which can capture complex relationships and structures to enhance feature propagation. Furthermore, the embedded node information in each convolutional layer has not been fully utilized, because consistency is frequently achieved only at the final convolutional layer. To tackle these limitations, we develop a new end-to-end multi-view learning architecture: the Sequential attention Layer-wise Fusion Network for multi-view classification (SLFNet). Motivated by the fact that, for each view, multi-order node information is hidden in the multiple layer-wise node embedding representations, a set of sequential attentions can be calculated over those layers, which provides a novel fusion strategy from a multi-order perspective. The contributions of our architecture are: (1) capturing multi-order node information instead of using only the immediate neighbors, thereby obtaining more accurate node embedding representations; (2) designing a sequential attention module that allows adaptive learning of the node embedding representation for each layer, thereby attentively fusing these layer-wise node embedding representations. Our experiments, focusing on semi-supervised node classification tasks, highlight the superiority of SLFNet over state-of-the-art approaches. Results with deeper convolutional layers further confirm its effectiveness in addressing the over-smoothing problem.
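
A rough sketch of attention-weighted fusion over layer-wise embeddings follows. It uses a plain softmax over one score per layer, which is a generic stand-in; the paper's sequential attention module is not specified in the abstract, and `fuse_layers` and its arguments are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_embeddings, scores):
    """Fuse one node's embeddings from several GCN layers.

    layer_embeddings[k] is the node's embedding after layer k (all of the
    same dimension); scores[k] is an attention score for that layer, which
    would be learned in practice and is supplied by hand here.
    """
    weights = softmax(scores)
    dim = len(layer_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, layer_embeddings))
        for d in range(dim)
    ]
```

Since the embedding after layer k aggregates information from neighbors up to k hops away, weighting every layer rather than keeping only the last one is one way to retain the multi-order information the abstract refers to.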

    A novel abstractive summarization model based on topic-aware and contrastive learning

    Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou...
    pp. 5563-5577
    Abstract: The majority of abstractive summarization models are based on the Sequence-to-Sequence (Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there is an inconsistency between the objective function and the evaluation metrics of these models. To address these limitations, a novel model named ASTCL is proposed in this paper. It integrates a neural topic model into the Seq2Seq framework, aiming to capture the text’s global semantic information and guide summary generation. Additionally, it incorporates contrastive learning techniques to mitigate the discrepancy between the objective loss and the evaluation metrics by scoring multiple candidate summaries. On the CNN/DM, XSum, and NYT datasets, the experimental results demonstrate that the ASTCL model outperforms other generic models on the summarization task.
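
One common way to align a model's scores with an evaluation metric is a pairwise margin ranking loss over candidate summaries. The sketch below shows that general formulation as an assumption; the abstract does not give ASTCL's exact contrastive objective, and `ranking_loss` and `margin` are illustrative names.

```python
def ranking_loss(model_scores, margin=0.1):
    """Pairwise margin ranking loss over candidate summaries.

    model_scores must be ordered from the best to the worst candidate
    according to the evaluation metric (for example ROUGE). The loss is
    zero when the model scores every better candidate higher than every
    worse one by at least (rank gap * margin).
    """
    loss = 0.0
    n = len(model_scores)
    for i in range(n):
        for j in range(i + 1, n):
            loss += max(0.0, model_scores[j] - model_scores[i] + (j - i) * margin)
    return loss
```

A model that already ranks candidates in metric order incurs no loss, while an inverted ranking is penalised in proportion to how badly it is inverted.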

    EEMNet: an end-to-end efficient model for PCB surface tiny defect detection

    Yuxiang Wu, Liming Zheng, Enze Chen
    pp. 5579-5594
    Abstract: The miniaturization of electronic products has led to denser and more crowded wiring on printed circuit boards (PCBs), which has made PCB defects smaller and more difficult to detect. Moreover, the complex morphology of PCB defects highlights the importance of capturing their contextual information for improved detection accuracy and efficiency. While CNNs can effectively capture local information, their layered convolution-based feature extraction has limitations in capturing contextual information. The transformer structure can capture long-range dependencies effectively, but at the cost of increased computational effort. To address this issue, an end-to-end efficient model (EEMNet) for PCB surface tiny defect detection is proposed, built on a modular design. The model includes a novel and efficient attention mechanism that captures global dependencies without adding much computational effort, along with several plug-and-play modules for enhancing tiny defect features. The model also incorporates a scale-sensitive localization loss function and makes extensive use of Ghost convolution to substantially reduce the number of model parameters. The resulting EEMNet achieves a detection accuracy of 99.1% and a detection speed of 77 FPS on a published PCB dataset, outperforming existing PCB detection algorithms. Overall, the proposed model provides an efficient and effective solution for PCB tiny defect detection.
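
The parameter saving from Ghost convolution can be estimated with simple arithmetic. The sketch below follows the usual Ghost module description (a primary convolution producing a fraction of the output channels, plus cheap depthwise operations producing the rest) and ignores biases; EEMNet's actual layer sizes are not given in the abstract, so the function names, the ratio, and the kernel sizes are assumptions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost module, assuming c_out is divisible by ratio.

    The primary convolution produces c_out / ratio intrinsic feature maps;
    each of them then yields (ratio - 1) extra maps through a cheap
    cheap_k x cheap_k depthwise operation.
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap
```

For a 64-to-128 channel layer with 3 x 3 kernels and a ratio of 2, this gives 37,440 weights against 73,728 for the standard convolution, roughly a twofold reduction.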

    The concept information of graph granule with application to knowledge graph embedding

    Jiaojiao Niu, Degang Chen, Yinglong Ma, Jinhai Li...
    pp. 5595-5606
    Abstract: Knowledge graph embedding (KGE) has become one of the most effective methods for the numerical representation of entities and their relations in knowledge graphs. Traditional methods primarily use triple facts, structured as (head entity, relation, tail entity), as the basic knowledge units in the learning process and use additional external information to improve model performance. Since triples are sometimes inadequate and external information is not always available, obtaining structured internal knowledge from knowledge graphs (KGs) naturally becomes a feasible approach to KGE learning. Motivated by this, this paper employs formal concept analysis (FCA) to mine deterministic concept knowledge in KGs and proposes a novel KGE model that takes the concept information into account. More specifically, triples sharing the same head entity are organised into knowledge structures named graph granules and then transformed into concept lattices, based on which a novel lattice-based KGE model (TransGr) is proposed for knowledge graph completion. TransGr assumes that entities and relations exist in different granules and uses a matrix (obtained by fusing concepts from the concept lattice) to quantitatively depict the graph granule. It then forces entities and relations to meet graph granule constraints when learning vector representations of KGs. Experiments on link prediction and triple classification demonstrate that the proposed TransGr is effective on datasets with relatively complete graph granules.
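
The formal concept analysis step can be illustrated on a toy formal context. In this sketch the objects are tail entities and the attributes are relations, which is only one guess at how a graph granule might be encoded; the abstract does not define the mapping, and the brute-force enumeration is suitable only for very small contexts.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of a formal context.

    context maps each object to its set of attributes. Every attribute
    subset is closed, so the cost is exponential in the attribute count.
    """
    all_attrs = set().union(*context.values())
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            ext = extent(context, set(subset))
            concepts.add((ext, intent(context, ext, all_attrs)))
    return concepts
```

Ordered by inclusion of their extents, the resulting concepts form the concept lattice from which, according to the abstract, TransGr derives its granule matrix.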

    A copy-move forgery detection technique using DBSCAN-based keypoint similarity matching

    Soumya Mukherjee, Arup Kumar Pal, Soham Maji
    pp. 5607-5634
    Abstract: In an era marked by the contrast between information and disinformation, the ability to differentiate between authentic and manipulated images holds immense importance for both security professionals and the scientific community. Copy-move forgery is widely practiced and has thus emerged as a prevalent form of image manipulation among the different types of forgery. In this counterfeiting process, a region of an image is copied and pasted into different parts of the same image to hide or replicate objects. As copy-move forgery is hard to detect and localize, a swift and efficacious detection scheme based on keypoint detection is introduced. Localizing forged areas becomes especially difficult when the forged image is subjected to post-processing and geometrical attacks. In this paper, a robust, translation-invariant, and efficient copy-move forgery detection technique is introduced. To achieve this goal, we developed an AKAZE-driven keypoint-based forgery detection technique. AKAZE is applied to the LL sub-band of the SWT-transformed image to extract translation-invariant features, rather than extracting them directly from the original image. We then use the DBSCAN clustering algorithm and a uniform quantizer on each cluster to form group pairs based on their feature descriptor values. To mitigate false positives, only keypoint pairs separated by a distance greater than a predefined shift vector distance are retained. This process forms a collection of keypoints within each cluster by leveraging the similarities in their feature descriptors. Our clustering-based similarity-matching algorithm effectively locates the forged region. To assess the proposed scheme, we deploy it on different datasets with post-processing attacks including blurring, color reduction, contrast adjustment, brightness change, and noise addition. Our method also withstands geometrical manipulations such as rotation, skewing, and different affine transform attacks. Visual outcomes, numerical results, and comparative analysis show that the proposed model accurately detects the forged area with fewer false positives and is more computationally efficient than other methods.
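
The false-positive filter on matched keypoints can be sketched as a simple distance threshold. The function name, the tuple layout, and the threshold value are hypothetical; the clustering and quantization steps that produce the candidate pairs are not reproduced here.

```python
import math


def filter_pairs(pairs, min_shift):
    """Keep matched keypoint pairs whose spatial separation exceeds min_shift.

    pairs is a list of ((x1, y1), (x2, y2)) coordinate tuples. Keypoints that
    lie very close together tend to have similar descriptors simply because
    they cover the same image patch, so such matches are discarded.
    """
    return [(p, q) for p, q in pairs if math.dist(p, q) > min_shift]
```

Matches that survive the filter point from a source region to a spatially distinct region with similar descriptors, which is the signature of a copied patch.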

    Federated learning-guided intrusion detection and neural key exchange for safeguarding patient data on the internet of medical things

    Chongzhou Zhong, Arindam Sarkar, Sarbajit Manna, Mohammad Zubair Khan...
    pp. 5635-5665
    Abstract: To improve the security of the Internet of Medical Things (IoMT) in healthcare, this paper offers a Federated Learning (FL)-guided Intrusion Detection System (IDS) and an Artificial Neural Network (ANN)-based key exchange mechanism inside a blockchain framework. IDSs are essential for spotting network anomalies and taking preventative action to guarantee the secure and dependable functioning of IoMT systems. The suggested method integrates the FL-based IDS with a blockchain-based ANN key exchange mechanism, providing several important benefits: (1) The FL-based IDS creates a shared ledger that aggregates nearby weights and transmits historical weights that have been averaged, lowering computing effort, eliminating poisoning attacks, and improving data visibility and integrity throughout the shared database. (2) The system uses edge-based detection techniques to protect the cloud in the case of a security breach, enabling quicker threat recognition with less computational and processing resource usage. FL’s effectiveness with fewer data samples contributes to this benefit. (3) The bidirectional alignment of ANNs ensures a strong security framework and facilitates the production of keys inside the IoMT network on the blockchain. (4) Mutual learning approaches synchronize ANNs, making it easier for IoMT devices to distribute synchronized keys. (5) XGBoost and ANN models were tested on the BoT-IoT datasets to gauge how successful the suggested method is. The findings show that the ANN demonstrates greater performance and dependability than the alternative approaches examined in this study when dealing with the heterogeneous data available in IoMT, such as ICU (Intensive Care Unit) data in the medical profession. Overall, this method demonstrates increased security and performance, making it an appealing option for protecting IoMT systems, especially in demanding medical settings like ICUs.
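
The weight aggregation mentioned in point (1) is commonly done by sample-weighted averaging of client parameters. The sketch below shows that generic federated averaging step as an assumption; the paper's ledger-based aggregation and its handling of historical weights are not detailed in the abstract, and `federated_average` is an illustrative name.

```python
def federated_average(client_weights, client_sizes):
    """Average flat parameter vectors, weighting each client by its sample count.

    client_weights[i] is the flattened parameter list from client i;
    client_sizes[i] is the number of local samples that client trained on.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]
```

Weighting by sample count lets clients with more local data pull the shared model further, while raw patient records never leave the edge device.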