Journal of Computer Science and Technology
Journal Information
Journal Title: Journal of Computer Science and Technology (计算机科学技术学报, English edition)

Editor-in-Chief: 李国杰 (Guojie Li)

Frequency: Bimonthly

ISSN: 1000-9000

Email: jcst@ict.ac.cn

Tel: 010-62610746

Postal Code: 100080

Address: Editorial Office of Journal of Computer Science and Technology, No. 6 Kexueyuan South Road, Zhongguancun, Beijing

Journal of Computer Science and Technology (JCST) — indexed by CSCD, CSTPCD, the Peking University Core Journals list (北大核心), EI, and SCI.
Journal of Computer Science and Technology (JCST) is an international academic journal in China's field of computer science and technology. Founded in 1986, JCST is published bimonthly and distributed publicly at home and abroad, with international publication and distribution handled by Springer Science + Business Media. JCST is a journal of the China Computer Federation and is hosted by the Institute of Computing Technology, Chinese Academy of Sciences. It is edited and reviewed by dozens of renowned experts and scholars from the international computer science community, tracking the latest development trends in computer science and technology worldwide. JCST gathers guiding and pioneering academic works in computer science and technology from home and abroad, regularly organizes special issues or topical columns on hot areas, and invites world-renowned computer scientists to contribute some of the articles.

    Video Colorization: A Survey

    彭中正, 杨艺新, 唐金辉, 潘金山, et al.
    pp. 487-508
    Abstract: Video colorization aims to add color to grayscale or monochrome videos. Although existing methods have achieved substantial and noteworthy results in the field of image colorization, video colorization presents more formidable obstacles due to the additional necessity for temporal consistency. Moreover, there is rarely a systematic review of video colorization methods. In this paper, we aim to review existing state-of-the-art video colorization methods. In addition, maintaining spatial-temporal consistency is pivotal to the process of video colorization. To gain deeper insight into the evolution of existing methods in terms of spatial-temporal consistency, we further review video colorization methods from a novel perspective. Video colorization methods can be categorized into four main categories: optical-flow based methods, scribble-based methods, exemplar-based methods, and fully automatic methods. However, optical-flow based methods rely heavily on accurate optical-flow estimation, scribble-based methods require extensive user interaction and modifications, exemplar-based methods face challenges in obtaining suitable reference images, and fully automatic methods often struggle to meet specific colorization requirements. We also discuss the existing challenges and highlight several future research opportunities worth exploring.
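
    As a rough illustration of the optical-flow based family mentioned in the abstract, the sketch below propagates the previous frame's color channels along an estimated flow field. It is a toy example written for this overview (the function name and nearest-neighbor warping are our own simplifications), not an algorithm from the surveyed papers.

```python
import numpy as np

def warp_colors(prev_ab, flow):
    """Backward-warp the previous frame's chrominance channels along an
    estimated optical flow (nearest-neighbor sampling for brevity)."""
    h, w, _ = prev_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip((ys + flow[..., 1]).round().astype(int), 0, h - 1)
    src_x = np.clip((xs + flow[..., 0]).round().astype(int), 0, w - 1)
    return prev_ab[src_y, src_x]

prev_ab = np.random.rand(8, 8, 2)   # previous frame's color (ab) channels
flow = np.zeros((8, 8, 2))          # zero flow: colors are copied unchanged
print(np.allclose(warp_colors(prev_ab, flow), prev_ab))  # True
```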

    A Survey of Multimodal Controllable Diffusion Models

    江锐, 郑光聪, 李藤, 杨天瑞, et al.
    pp. 509-541
    Abstract: Diffusion models have recently emerged as powerful generative models, producing high-fidelity samples across domains. Despite this, they face two key challenges: improving the time-consuming iterative generation process, and controlling and steering the generation process. Existing surveys provide broad overviews of diffusion model advancements. However, they lack comprehensive coverage specifically centered on techniques for controllable generation. This survey seeks to address this gap by providing a comprehensive and coherent review of controllable generation in diffusion models. We provide a detailed taxonomy defining controlled generation for diffusion models. Controllable generation is categorized based on the formulation, methodologies, and evaluation metrics. By enumerating the range of methods researchers have developed for enhanced control, we aim to establish controllable diffusion generation as a distinct subfield warranting dedicated focus. With this survey, we contextualize recent results, provide a dedicated treatment of controllable diffusion model generation, and outline limitations and future directions. To demonstrate applicability, we highlight controllable diffusion techniques for major computer vision tasks. By consolidating methods and applications for controllable diffusion models, we hope to catalyze further innovations in reliable and scalable controllable generation.
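
    Classifier-free guidance is one widely used control technique in this space; the sketch below shows only that guidance step, with a hypothetical noise-prediction network passed in as `model` (interface assumed, not taken from the survey).

```python
import torch

def cfg_noise_prediction(model, x_t, t, cond, guidance_scale=7.5):
    """Combine unconditional and conditional noise predictions so that
    sampling is steered toward the condition (classifier-free guidance)."""
    eps_uncond = model(x_t, t, None)   # model(x, t, cond=None) -> unconditional
    eps_cond = model(x_t, t, cond)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Dummy noise predictor standing in for a trained diffusion model.
dummy = lambda x, t, c: x * 0.1 if c is None else x * 0.1 + 0.01 * c
x = torch.randn(1, 3, 8, 8)
print(cfg_noise_prediction(dummy, x, t=10, cond=torch.ones_like(x)).shape)
```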

    A Survey of LLM Datasets: From Autoregressive Model to AI Chatbot

    杜非, 马新建, 杨婧如, 柳熠, et al.
    pp. 542-566
    Abstract: Since OpenAI opened access to ChatGPT, large language models (LLMs) have become an increasingly popular topic attracting researchers' attention from abundant domains. However, public researchers meet some problems when developing LLMs, given that most LLMs are produced by industry and the training details are typically unrevealed. Since datasets are an important part of the LLM setup, this paper presents a holistic survey of the training datasets used in both the pre-train and fine-tune processes. The paper first summarizes 16 pre-train datasets and 16 fine-tune datasets used in state-of-the-art LLMs. Secondly, based on the properties of the pre-train and fine-tune processes, it comments on pre-train datasets from the perspectives of quality, quantity, and relation with models, and on fine-tune datasets from the perspectives of quality, quantity, and concerns. This study then critically identifies the problems and research trends in current LLM datasets. The study helps public researchers train and investigate LLMs through visual cases and provides useful comments to the research community regarding data development. To the best of our knowledge, this paper is the first to summarize and discuss datasets used in both autoregressive and chat LLMs. The survey offers insights and suggestions to researchers and LLM developers as they build their models, and contributes to LLM research by pointing out the existing problems of LLM studies from the perspective of data.
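
    The dataset-quality concerns discussed above are often addressed with simple corpus-level filters; the toy deduplication step below (our own minimal example, not taken from the survey) drops exact duplicate documents by content hash.

```python
import hashlib

def dedup_exact(texts):
    """Drop exact duplicate documents by content hash, one of the simplest
    quality filters applied when assembling pre-train corpora."""
    seen, kept = set(), []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(text)
    return kept

print(dedup_exact(["a cat sat", "a dog ran", "a cat sat"]))  # drops the repeat
```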

    Advances of Pipeline Model Parallelism for Deep Learning Training: An Overview

    关磊, 李东升, 梁吉业, 王文剑, et al.
    pp. 567-584
    Abstract: Deep learning has become the cornerstone of artificial intelligence, playing an increasingly important role in human production and lifestyle. However, as the complexity of problem-solving increases, deep learning models become increasingly intricate, resulting in a proliferation of large language models with an astonishing number of parameters. Pipeline model parallelism (PMP) has emerged as one of the mainstream approaches to addressing the significant challenge of training "big models". This paper presents a comprehensive review of PMP. It covers the basic concepts and main challenges of PMP. It also comprehensively compares synchronous and asynchronous pipeline schedules for PMP approaches, and discusses the main techniques to achieve load balance for both intra-node and inter-node training. Furthermore, the main techniques to optimize computation, storage, and communication are presented, with potential research directions being discussed.
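
    The synchronous schedule compared in such work can be summarized compactly: with p stages and m micro-batches, stage s starts micro-batch i at step s + i, and the idle "bubble" fraction is (p - 1)/(m + p - 1). The sketch below computes both; it is a scheduling illustration only, not the paper's system.

```python
def gpipe_forward_schedule(num_stages, num_microbatches):
    """Step at which each (stage, microbatch) forward pass runs in a
    synchronous GPipe-style pipeline."""
    return {(s, i): s + i
            for s in range(num_stages)
            for i in range(num_microbatches)}

def bubble_fraction(num_stages, num_microbatches):
    """Fraction of pipeline slots left idle by the synchronous schedule."""
    return (num_stages - 1) / (num_microbatches + num_stages - 1)

print(gpipe_forward_schedule(num_stages=2, num_microbatches=3))
print(bubble_fraction(num_stages=4, num_microbatches=16))  # about 0.158
```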

    Knowledge-Enhanced Conversational Agents

    Fabio Caffaro, Giuseppe Rizzo
    pp. 585-609
    Abstract: Humanity has fantasized for decades about artificial intelligence tools able to converse fluently with human beings. Numerous efforts have been proposed, ranging from ELIZA to modern vocal assistants. Despite the large interest in this research and innovation field, there is a lack of common understanding of the concept of conversational agents, and general over-expectations hide the current limitations of existing solutions. This work proposes a literature review on the subject with a focus on the most promising type of conversational agents, those powered on top of knowledge bases that can offer the ground knowledge to hold conversations autonomously on different topics. We describe a conceptual architecture to define knowledge-enhanced conversational agents and investigate different domains of application. We conclude this work by listing some promising research pathways for future work.
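
    A minimal sketch of the knowledge-grounding idea: look up facts about entities mentioned in the user turn and condition the reply on them. The tiny in-memory KB and substring matching are placeholders for entity linking and a generative model, and are not part of the reviewed architecture.

```python
# Toy knowledge base; real agents query structured KBs or knowledge graphs.
KB = {
    "ELIZA": "ELIZA is an early rule-based conversational program from the 1960s.",
}

def knowledge_grounded_reply(user_turn: str) -> str:
    facts = [fact for entity, fact in KB.items()
             if entity.lower() in user_turn.lower()]
    if not facts:
        return "I do not have background knowledge on that topic yet."
    return "Here is what I know: " + " ".join(facts)

print(knowledge_grounded_reply("Tell me about ELIZA"))
```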

    A Survey and Experimental Review on Data Distribution Strategies for Parallel Spatial Clustering Algorithms

    Jagat Sesh Challa, Navneet Goyal, Amogh Sharma, Nikhil Sreekumar, et al.
    pp. 610-636
    Abstract: The advent of Big Data has led to rapid growth in the usage of parallel clustering algorithms that work over distributed computing frameworks such as MPI, MapReduce, and Spark. An important step for any parallel clustering algorithm is the distribution of data amongst the cluster nodes. This step governs the methodology and performance of the entire algorithm. Researchers typically use a random or a spatial/geometric distribution strategy, like kd-tree based partitioning or grid-based partitioning, as per the requirements of the algorithm. However, these strategies are generic and are not tailor-made for any specific parallel clustering algorithm. In this paper, we give a very comprehensive literature survey of MPI-based parallel clustering algorithms with special reference to the specific data distribution strategies they employ. We also propose three new data distribution strategies, namely Parameterized Dimensional Split for parallel density-based clustering algorithms like DBSCAN and OPTICS, Cell-Based Dimensional Split for dGridSLINK, which is a grid-based hierarchical clustering algorithm that exhibits efficiency for disjoint spatial distribution, and Projection-Based Split, which is a generic distribution strategy. All of these preserve spatial locality, achieve disjoint partitioning, and ensure good data load balancing. The experimental analysis shows the benefits of using the proposed data distribution strategies for the algorithms they are designed for, based on which we give appropriate recommendations for their usage.
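
    For readers unfamiliar with kd-tree style partitioning, the sketch below recursively splits points at the median of alternating dimensions, which keeps partitions disjoint and roughly balanced. It illustrates the generic strategy the survey contrasts against, not the three strategies proposed in the paper.

```python
import numpy as np

def median_dimensional_split(points, depth=0, num_parts=4):
    """Recursively split a point set at the median of alternating dimensions
    (kd-tree style), yielding disjoint, roughly equal-sized partitions."""
    if num_parts <= 1:
        return [points]
    dim = depth % points.shape[1]
    median = np.median(points[:, dim])
    left = points[points[:, dim] <= median]
    right = points[points[:, dim] > median]
    half = num_parts // 2
    return (median_dimensional_split(left, depth + 1, half)
            + median_dimensional_split(right, depth + 1, num_parts - half))

parts = median_dimensional_split(np.random.rand(1000, 2), num_parts=4)
print([len(p) for p in parts])  # roughly 250 points per partition
```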

    Age-of-Information-Aware Federated Learning

    徐殷, 肖明军, 吴晨, 吴杰, et al.
    pp. 637-653
    Abstract: Federated learning (FL) is an emerging privacy-preserving distributed computing paradigm, enabling numerous clients to collaboratively train machine learning models without the necessity of transmitting clients' private datasets to the central server. Unlike most existing research, where the local datasets of clients are assumed to remain unchanged throughout the whole FL process, this paper addresses scenarios where clients' datasets need to be updated periodically, and the server can incentivize clients to employ datasets that are as fresh as possible for local model training. Our primary objective is to design a client selection strategy that minimizes the loss of the global FL model within a constrained budget. To this end, we introduce the concept of "Age of Information" (AoI) to quantitatively assess the freshness of local datasets and conduct a theoretical analysis of the convergence bound in our AoI-aware FL system. Based on the convergence bound, we further formulate our problem as a restless multi-armed bandit (RMAB) problem. Next, we relax the RMAB problem and apply the Lagrangian dual approach to decouple it into multiple subproblems. Finally, we propose a Whittle's Index Based Client Selection (WICS) algorithm to determine the set of selected clients. In addition, comprehensive simulations substantiate that the proposed algorithm can effectively reduce training loss and enhance learning accuracy compared with some state-of-the-art methods.
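
    To make the selection objective concrete, the toy policy below tracks each client's age of information (rounds since its dataset was refreshed) and greedily picks clients with the highest age-per-cost ratio until the budget is exhausted. This greedy index is our own simplified stand-in for the paper's Whittle-index policy, not a reproduction of it.

```python
def select_clients(ages, costs, budget):
    """Greedy budgeted selection of clients by age-of-information per unit cost."""
    order = sorted(range(len(ages)), key=lambda i: ages[i] / costs[i], reverse=True)
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen

# Prints [0, 2, 1]: client 3 holds the stalest data but is too costly for the budget.
print(select_clients(ages=[5, 1, 3, 8], costs=[2.0, 1.0, 1.5, 4.0], budget=5.0))
```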

    Multimodal Dependence Attention and Large-Scale Data Based Offline Handwritten Formula Recognition

    刘汉超, 董兰芳, 张信明
    pp. 654-670
    Abstract: Offline handwritten formula recognition is a challenging task due to the variety of handwritten symbols and two-dimensional formula structures. Recently, deep neural network recognizers based on the encoder-decoder framework have achieved great improvements on this task. However, unsatisfactory recognition performance for formulas with long LaTeX strings is one shortcoming of the existing work. Moreover, lacking sufficient training data also limits the capability of these recognizers. In this paper, we design a multimodal dependence attention (MDA) module to help the model learn visual and semantic dependencies among symbols in the same formula, improving the recognition performance for formulas with long LaTeX strings. To alleviate overfitting and further improve the recognition performance, we also propose a new dataset, the Handwritten Formula Image Dataset (HFID), which contains 25 620 handwritten formula images collected from real life. We conduct extensive experiments to demonstrate the effectiveness of our proposed MDA module and HFID dataset, and achieve state-of-the-art performance: 63.79% and 65.24% expression accuracy on CROHME 2014 and CROHME 2016, respectively.
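
    The dependence modeling rests on attention over symbol states; below is plain scaled dot-product attention as a reference point, using assumed tensor shapes. It is not the paper's MDA module, which fuses visual and semantic streams.

```python
import torch
import torch.nn.functional as F

def symbol_attention(queries, keys, values):
    """Scaled dot-product attention over decoded symbol states: each symbol
    attends to the others, yielding dependence-weighted representations."""
    scores = queries @ keys.transpose(-2, -1) / keys.shape[-1] ** 0.5
    weights = F.softmax(scores, dim=-1)
    return weights @ values

q = k = v = torch.randn(1, 10, 64)        # 10 decoded symbols, 64-dim states
print(symbol_attention(q, k, v).shape)    # torch.Size([1, 10, 64])
```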

    When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading

    刘洋溯, 郑臻哲, 吴帆, 陈贵海, et al.
    pp. 671-690
    Abstract: Large-quantity and high-quality data is critical to the success of machine learning in diverse applications. Faced with the dilemma of data silos, where data is difficult to circulate, emerging data markets attempt to break the dilemma by facilitating data exchange on the Internet. Crowdsourcing, on the other hand, is one of the important methods to efficiently collect large amounts of high-value data in data markets. In this paper, we investigate the joint problem of efficient data acquisition and fair budget distribution across the crowdsourcing and data markets. We propose a new metric of data value, defined as the uncertainty reduction of a Bayesian machine learning model when the data is integrated into model training. Guided by this data value metric, we design a mechanism called Shapley Value Mechanism with Individual Rationality (SV-IR), in which we design a greedy algorithm with a constant approximation ratio to select the most cost-efficient data brokers, and a fair compensation determination rule based on the Shapley value, respecting the individual rationality constraints. We further propose a fair reward distribution method for data holders with various effort levels under the charge of a data broker. We demonstrate the fairness of the compensation determination rule and reward distribution rule by evaluating our mechanisms on two real-world datasets. The evaluation results also show that the selection algorithm in SV-IR approaches the optimal solution and outperforms state-of-the-art methods.
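
    Exact Shapley values require summing over all coalitions, so they are usually estimated; the sketch below uses plain permutation sampling with a toy additive utility. It illustrates the Shapley value itself, not the SV-IR mechanism or the Bayesian uncertainty-reduction metric from the paper.

```python
import random

def monte_carlo_shapley(players, utility, num_samples=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions over
    random orderings of the players."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = players[:]
        rng.shuffle(order)
        coalition, prev_value = frozenset(), 0.0
        for p in order:
            coalition = coalition | {p}
            value = utility(coalition)
            totals[p] += value - prev_value
            prev_value = value
    return {p: total / num_samples for p, total in totals.items()}

# Toy additive utility: each holder's Shapley value equals its own contribution.
contrib = {"A": 3.0, "B": 1.0, "C": 2.0}
print(monte_carlo_shapley(list(contrib), lambda s: sum(contrib[p] for p in s)))
```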

    DCFNet: Discriminant Correlation Filters Network for Visual Tracking

    胡卫明, 王强, 高晋, 李兵, et al.
    pp. 691-714
    Abstract: CNN (convolutional neural network) based real-time trackers usually do not carry out online network updates in order to maintain rapid tracking speed. This inevitably limits their adaptability to changes in object appearance. Correlation filter based trackers, in contrast, can update the model parameters online in real time. In this paper, we present an end-to-end lightweight network architecture, namely the Discriminant Correlation Filter Network (DCFNet). A differentiable DCF (discriminant correlation filter) layer is incorporated into a Siamese network architecture in order to learn the convolutional features and the correlation filter simultaneously. The correlation filter can be efficiently updated online. In previous work, we introduced a joint scale-position space to the DCFNet, forming a scale DCFNet that carries out the predictions of object scale and position simultaneously. We combine the scale DCFNet with a convolutional-deconvolutional network, learning both the high-level embedding space representations and the low-level fine-grained representations for images. The adaptability of the fine-grained correlation analysis and the generalization capability of the semantic embedding are complementary for visual tracking. The back-propagation is derived in the Fourier frequency domain throughout the entire work, preserving the efficiency of the DCF. Extensive evaluations on the OTB (Object Tracking Benchmark) and VOT (Visual Object Tracking Challenge) datasets demonstrate that the proposed trackers run at fast speeds while maintaining tracking accuracy.
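
    The classical DCF building block that DCFNet differentiates through has a closed form in the Fourier domain; the sketch below trains and applies a single-channel filter with ridge regularization (no multi-channel features, no online update), purely as background for the abstract above.

```python
import numpy as np

def train_filter(feature, target, lam=1e-4):
    """Closed-form correlation filter (frequency domain) fitting `feature`
    to a desired Gaussian-shaped `target` response, with ridge term `lam`."""
    X, Y = np.fft.fft2(feature), np.fft.fft2(target)
    return np.conj(X) * Y / (np.conj(X) * X + lam)

def respond(filter_hat, feature):
    """Correlation response map for a new search-region feature."""
    return np.real(np.fft.ifft2(filter_hat * np.fft.fft2(feature)))

coords = (np.arange(64) - 32) ** 2
target = np.exp(-0.5 * (coords[:, None] + coords[None, :]) / 9.0)  # peak at (32, 32)
feature = np.random.rand(64, 64)
response = respond(train_filter(feature, target), feature)
print(np.unravel_index(response.argmax(), response.shape))  # near (32, 32)
```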