Journal Information
Journal of Computer Science and Technology (JCST)

Editor-in-Chief: Guo-Jie Li (李国杰)

Frequency: Bimonthly

ISSN: 1000-9000

E-mail: jcst@ict.ac.cn

Telephone: 010-62610746

Postal code: 100080

Address: Editorial Office, Journal of Computer Science and Technology, No. 6 Kexueyuan South Road, Zhongguancun, Beijing

Journal of Computer Science and Technology. Indexed in: CSCD, CSTPCD, Peking University Core Journals (北大核心), EI, SCI.
Journal of Computer Science and Technology (JCST) is an international academic journal in China's computer science and technology field. Founded in 1986, JCST is published bimonthly and distributed both domestically and internationally, with international publication and distribution handled by Springer Science + Business Media. JCST is the flagship journal of the China Computer Federation and is hosted by the Institute of Computing Technology, Chinese Academy of Sciences. Its editorial board brings together dozens of internationally renowned computer science experts and scholars, tracking the latest development trends in computer science and technology worldwide. JCST publishes guiding and pioneering academic work in computer science and technology from home and abroad, regularly organizes special issues or columns on hot topics, and invites some articles from world-renowned computer scientists.

    SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems

    杜江溯, 李东升, 文英鹏, 江嘉治, et al.
    pp. 384-400
    Abstract: Novel artificial intelligence (AI) technology has expedited various scientific research, e.g., cosmology, physics, and bioinformatics, inevitably becoming a significant category of workload on high-performance computing (HPC) systems. Existing AI benchmarks tend to customize well-recognized AI applications, so as to evaluate the AI performance of HPC systems under a predefined problem size, in terms of datasets and AI models. However, driven by novel AI technology, most AI applications are evolving fast in models and datasets to achieve higher accuracy and become applicable to more scenarios. Due to the lack of scalability of the problem size, static AI benchmarks may be inadequate for understanding the performance trend of evolving AI applications on HPC systems, in particular scientific AI applications on large-scale systems. In this paper, we propose a scalable evaluation methodology (SAIH) for analyzing the AI performance trend of HPC systems by scaling the problem sizes of customized AI applications. To enable scalability, SAIH builds a set of novel mechanisms for augmenting problem sizes. As the data and model constantly scale, we can investigate the trend and range of AI performance on HPC systems and further diagnose system bottlenecks. To verify our methodology, we augment a cosmological AI application to evaluate a real HPC system equipped with GPUs as a case study of SAIH. With data and model augmentation, SAIH can progressively evaluate the AI performance trend of HPC systems, e.g., increasing from 5.2% to 59.6% of the peak theoretical hardware performance. The evaluation results are analyzed and summarized into insightful findings on performance issues. For instance, we find that the AI application constantly consumes the I/O bandwidth of the shared parallel file system while iteratively training its model. If I/O contention exists, the shared parallel file system might become a bottleneck.
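
    Illustrative sketch (not from the paper): the core idea of SAIH is to augment the problem size step by step and record how close the achieved training throughput gets to the machine's theoretical peak. The snippet below mocks that measurement loop; PEAK_TFLOPS, the per-sample FLOP counts, and step_fn are hypothetical stand-ins, and a real run would plug in the actual augmented model and training step.

```python
import time

PEAK_TFLOPS = 312.0  # assumed theoretical peak of one accelerator; illustrative only

def measure_utilization(flops_per_sample, batch_size, step_fn, steps=20):
    """Run a fixed number of training steps and report achieved TFLOPS as a fraction of peak."""
    start = time.perf_counter()
    for _ in range(steps):
        step_fn()  # one training step on the augmented model/data
    elapsed = time.perf_counter() - start
    achieved = flops_per_sample * batch_size * steps / elapsed / 1e12
    return achieved / PEAK_TFLOPS

# Progressively augment the problem size (data volume and model size) and record the trend.
for scale in (1, 2, 4, 8):
    util = measure_utilization(flops_per_sample=1.5e9 * scale,    # hypothetical per-sample FLOPs
                               batch_size=64 * scale,
                               step_fn=lambda: time.sleep(0.01))  # dummy step standing in for real training
    print(f"scale x{scale}: {util:.1%} of theoretical peak")
```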

    AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks

    龚成, 卢冶, 代素蓉, 邓倩, et al.
    pp. 401-420
    Abstract: Exploring the expected quantizing scheme with a suitable mixed-precision policy is the key to compressing deep neural networks (DNNs) with high efficiency and accuracy. This exploration implies heavy workloads for domain experts, and an automatic compression method is needed. However, the huge search space of the automatic method introduces a large computing budget that makes the automatic process challenging to apply in real scenarios. In this paper, we propose an end-to-end framework named AutoQNN for automatically quantizing different layers using different schemes and bitwidths without any human labor. AutoQNN can efficiently seek desirable quantizing schemes and mixed-precision policies for mainstream DNN models by involving three techniques: quantizing scheme search (QSS), quantizing precision learning (QPL), and quantized architecture generation (QAG). QSS introduces five quantizing schemes and defines three new schemes as a candidate set for scheme search, and then uses the Differentiable Neural Architecture Search (DNAS) algorithm to seek the layer- or model-desired scheme from the set. QPL is, to the best of our knowledge, the first method to learn mixed-precision policies by reparameterizing the bitwidths of quantizing schemes. QPL efficiently optimizes both the classification loss and the precision loss of DNNs and obtains a relatively optimal mixed-precision model within a limited model size and memory footprint. QAG is designed to convert arbitrary architectures into corresponding quantized ones without manual intervention, to facilitate end-to-end neural network quantization. We have implemented AutoQNN and integrated it into Keras. Extensive experiments demonstrate that AutoQNN can consistently outperform state-of-the-art quantization methods. For 2-bit weights and activations of AlexNet and ResNet18, AutoQNN achieves accuracies of 59.75% and 68.86%, respectively, improving on state-of-the-art methods by up to 1.65% and 1.74%, respectively. In particular, compared with the full-precision AlexNet and ResNet18, the 2-bit models incur only slight accuracy degradation of 0.26% and 0.76%, respectively, which can fulfill practical application demands.
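
    Illustrative sketch (not the AutoQNN code): the QSS step described above can be pictured as a DNAS-style relaxation in which each layer mixes several candidate quantizers with softmax-weighted architecture parameters and gradually learns which one to keep. The candidate set below (plain uniform quantizers at different bitwidths) and all names are assumptions for illustration only; the paper's eight schemes, straight-through gradients, and the QPL/QAG components are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_quant(x, bits):
    """Symmetric uniform quantizer standing in for one candidate scheme.
    (A real implementation would use a straight-through estimator for gradients.)"""
    scale = x.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    return torch.round(x / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale

class SchemeSearchConv(nn.Module):
    """Conv layer whose weights are a softmax-weighted mix of candidate quantizers (DNAS-style)."""
    def __init__(self, in_ch, out_ch, k=3, candidate_bits=(2, 4, 8)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.alpha = nn.Parameter(torch.zeros(len(candidate_bits)))  # architecture parameters
        self.candidate_bits = candidate_bits

    def forward(self, x):
        probs = F.softmax(self.alpha, dim=0)
        w = sum(p * uniform_quant(self.conv.weight, b)
                for p, b in zip(probs, self.candidate_bits))
        return F.conv2d(x, w, self.conv.bias, padding=self.conv.padding)

layer = SchemeSearchConv(3, 16)
out = layer(torch.randn(1, 3, 32, 32))
print(out.shape, F.softmax(layer.alpha, dim=0))  # after training, the largest weight picks the scheme
```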

    Qubit Mapping Based on Tabu Search

    蒋慧, 邓玉欣, 徐鸣
    pp. 421-433
    Abstract: The goal of qubit mapping is to map a logical circuit to a physical device, introducing as few additional gates as possible in an acceptable amount of time. We present an effective approach called the Tabu Search Based Adjustment (TSA) algorithm to construct the mappings. It consists of two key steps: one makes use of a combination of subgraph isomorphism and completion to initialize candidate mappings, and the other dynamically modifies the mappings by TSA. Our experiments show that, compared with state-of-the-art methods, TSA can generate mappings with a smaller number of additional gates and has better scalability for large-scale circuits.
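
    Illustrative sketch (not the TSA implementation): a tabu search over qubit mappings can be written as a local search that repeatedly swaps two physical assignments, forbids recently used swaps for a fixed tenure, and keeps the best mapping seen. The toy cost below simply counts two-qubit gates whose endpoints are not adjacent on the device; the paper's subgraph-isomorphism initialization and gate-insertion details are not modeled.

```python
import itertools
import random

def cost(mapping, gates, coupling):
    """Count logical 2-qubit gates whose mapped qubits are not adjacent on the device."""
    return sum(1 for a, b in gates if (mapping[a], mapping[b]) not in coupling)

def tabu_search(gates, coupling, n_qubits, iters=200, tenure=7, seed=0):
    rng = random.Random(seed)
    current = list(range(n_qubits))       # identity start; the paper seeds with subgraph isomorphism
    best, best_cost = current[:], cost(current, gates, coupling)
    tabu = {}                             # swap move -> iteration until which it stays forbidden
    for it in range(iters):
        candidates = []
        for i, j in itertools.combinations(range(n_qubits), 2):
            if tabu.get((i, j), -1) >= it:
                continue
            nb = current[:]
            nb[i], nb[j] = nb[j], nb[i]   # neighbor: swap two physical assignments
            candidates.append((cost(nb, gates, coupling), (i, j), nb))
        if not candidates:
            break
        c, move, nb = min(candidates, key=lambda t: t[0])
        current = nb
        tabu[move] = it + tenure
        if c < best_cost:
            best, best_cost = nb[:], c
    return best, best_cost

# Toy example: 4 physical qubits on a line, symmetric coupling.
coupling = {(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)}
gates = [(0, 2), (1, 3), (0, 1)]
print(tabu_search(gates, coupling, n_qubits=4))
```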

    Understanding and Detecting Inefficient Image Displaying Issues in Android Apps

    李文杰, 马骏, 蒋炎岩, 许畅, et al.
    pp. 434-459
    Abstract: Mobile applications (apps for short) often need to display images. However, inefficient image displaying (IID) issues are pervasive in mobile apps and can severely impact app performance and user experience. This paper first establishes a descriptive framework for the image displaying procedures of IID issues. Based on the descriptive framework, we conduct an empirical study of 216 real-world IID issues collected from 243 popular open-source Android apps to validate the presence and severity of IID issues, and then shed light on these issues' characteristics to support research on effective issue detection. With the findings of this study, we propose a static IID issue detection tool, TAPIR, and evaluate it with 243 real-world Android apps. Encouragingly, 49 and 64 previously unknown IID issues reported by TAPIR in two different versions of 16 apps are manually confirmed as true positives, respectively, and 16 previously unknown IID issues reported by TAPIR have been confirmed by developers, with 13 already fixed. We then further evaluate the performance impact of these detected IID issues and the performance improvement if they are fixed. The results demonstrate that the IID issues detected by TAPIR indeed cause significant performance degradation, which further shows the effectiveness and efficiency of TAPIR.
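
    Illustrative sketch (deliberately naive, not TAPIR's analysis): one symptom behind many IID issues is decoding a full-resolution bitmap when only a scaled-down version is displayed. The toy scanner below merely flags Java sources that call BitmapFactory.decode*() without any visible downsampling hint; the real tool performs static analysis over the image displaying procedure framework described in the paper.

```python
import re
import sys
from pathlib import Path

# Naive heuristics: a decode call with no inSampleSize and no image-loading library nearby.
DECODE_CALL = re.compile(r"BitmapFactory\.decode(File|Resource|Stream)\s*\(")
DOWNSAMPLE_HINT = re.compile(r"inSampleSize|Glide|Picasso|createScaledBitmap")

def scan(root):
    """Print source files that decode bitmaps with no sign of downsampling."""
    for path in Path(root).rglob("*.java"):
        text = path.read_text(errors="ignore")
        if DECODE_CALL.search(text) and not DOWNSAMPLE_HINT.search(text):
            print(f"possible IID issue (full-size decode, no downsampling): {path}")

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else ".")
```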

    CAT: A Simple yet Effective Cross-Attention Transformer for One-Shot Object Detection

    林蔚东, 邓玉岩, 高扬, 王宁, et al.
    pp. 460-471
    Abstract: Given a query patch from a novel class, one-shot object detection aims to detect all instances of this class in a target image through semantic similarity comparison. However, due to the extremely limited guidance in the novel class as well as the unseen appearance differences between the query and target instances, it is difficult to appropriately exploit their semantic similarity and generalize well. To mitigate this problem, we present a universal Cross-Attention Transformer (CAT) module for accurate and efficient semantic similarity comparison in one-shot object detection. The proposed CAT utilizes the transformer mechanism to comprehensively capture the bi-directional correspondence between any paired pixels from the query and the target image, which empowers us to sufficiently exploit their semantic characteristics for accurate similarity comparison. In addition, the proposed CAT enables feature dimensionality compression for inference speedup without performance loss. Extensive experiments on three object detection datasets, MS-COCO, PASCAL VOC and FSOD, under the one-shot setting demonstrate the effectiveness and efficiency of our model, e.g., it surpasses CoAE, a major baseline in this task, by 1.0% in average precision (AP) on MS-COCO and runs nearly 2.5 times faster.
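
    Illustrative sketch (assumptions: PyTorch, random features, arbitrary token counts): the bi-directional correspondence described above can be approximated with two standard multi-head attention passes, one where the query-patch tokens attend to the target-image tokens and one in the opposite direction. This is only a schematic of the cross-attention idea, not the CAT module's actual architecture.

```python
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Toy bi-directional cross-attention between query-patch and target-image tokens."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.q2t = nn.MultiheadAttention(dim, heads, batch_first=True)  # query attends to target
        self.t2q = nn.MultiheadAttention(dim, heads, batch_first=True)  # target attends to query

    def forward(self, q_feat, t_feat):
        # q_feat: (B, Nq, C) flattened query-patch tokens; t_feat: (B, Nt, C) target-image tokens
        q_out, _ = self.q2t(q_feat, t_feat, t_feat)
        t_out, _ = self.t2q(t_feat, q_feat, q_feat)
        return q_out, t_out

# Example with random features standing in for backbone outputs.
block = CrossAttentionBlock()
q = torch.randn(2, 49, 256)    # e.g., 7x7 query-patch tokens
t = torch.randn(2, 400, 256)   # e.g., 20x20 target-image tokens
q_enh, t_enh = block(q, t)
print(q_enh.shape, t_enh.shape)
```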

    Random Subspace Sampling for Classification with Missing Data

    曹云浩, 吴建鑫
    pp. 472-486
    Abstract: Many real-world datasets suffer from the unavoidable issue of missing values, and therefore classification with missing data has to be carefully handled, since inadequate treatment of missing values will cause large errors. In this paper, we propose a random subspace sampling method, RSS, which samples missing items from the corresponding feature histogram distributions in random subspaces and is effective and efficient at different levels of missing data. Unlike most established approaches, RSS does not train on fixed imputed datasets. Instead, we design a dynamic training strategy in which the filled values change dynamically through resampling during training. Moreover, thanks to the sampling strategy, we design an ensemble testing strategy that combines the results of multiple runs of a single model, which is more efficient and resource-saving than previous ensemble methods. Finally, we combine these two strategies with the random subspace method, which makes our estimations more robust and accurate. The effectiveness of the proposed RSS method is well validated by experimental studies.
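
    Illustrative sketch (not the authors' code): the dynamic filling step described above can be pictured as building a histogram per feature from the observed values and, at every training epoch, redrawing each missing entry from its feature's histogram instead of committing to a single imputation. The NumPy snippet below shows only that resampling step; the random subspace selection and the ensemble testing strategy are omitted.

```python
import numpy as np

def fit_feature_histograms(X, n_bins=10):
    """Build a (bin_edges, probs) histogram for each feature from its observed values."""
    hists = []
    for j in range(X.shape[1]):
        observed = X[:, j][~np.isnan(X[:, j])]
        counts, edges = np.histogram(observed, bins=n_bins)
        hists.append((edges, counts / counts.sum()))
    return hists

def sample_missing(X, hists, rng):
    """Redraw every missing entry from its feature's histogram (done anew each epoch in RSS)."""
    Xf = X.copy()
    for j, (edges, probs) in enumerate(hists):
        miss = np.isnan(Xf[:, j])
        if miss.any():
            bins = rng.choice(len(probs), size=miss.sum(), p=probs)
            Xf[miss, j] = rng.uniform(edges[bins], edges[bins + 1])  # sample within the chosen bin
    return Xf

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[rng.random(X.shape) < 0.2] = np.nan   # 20% of entries missing completely at random
hists = fit_feature_histograms(X)
X_filled = sample_missing(X, hists, rng)
print(np.isnan(X_filled).any())          # False: every missing value was resampled
```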