Deep Learning Processor Architecture
No. | Title | Type | Authors |
---|---|---|---|
1 | A Deep Learning-Based Visual SLAM Hardware Accelerator (基于深度学习的视觉 SLAM 硬件加速器) | Conference paper | 黄一迪;王语涵;杨雨枫;曹睿智;王锐 |
2 | Zerospy: Exploring Software Inefficiency with Redundant Zeros | Conference paper | Xin You;Hailong Yang;Zhongzhi Luan;Depei Qian;Xu Liu |
3 | DWM: A Decomposable Winograd Method for Convolution Acceleration | Conference paper | Di Huang;Xishan Zhang;Rui Zhang;Tian Zhi;Deyuan He;Jiaming Guo;Chang Liu;Qi Guo;Zidong Du;Shaoli Liu;Tianshi Chen;Yunji Chen |
4 | Cambricon-F: machine learning computers with fractal von Neumann architecture | Conference paper | Yongwei Zhao;Zidong Du;Qi Guo;Shaoli Liu;Ling Li;Zhiwei Xu;Tianshi Chen;Yunji Chen |
5 | Thread-Level Locking for SIMT Architectures | Journal article | Gao Lan;Xu Yunlong;Wang Rui;Luan Zhongzhi;Yu Zhibin;Qian Depei |
6 | Multiple Algorithms Against Multiple Hardware Architectures: Data-Driven Exploration on Deep Convolution Neural Network | Conference paper | Chongyang Xu;Zhongzhi Luan;Lan Gao;Rui Wang;Han Zhang;Lianyi Zhang;Yi Liu;Depei Qian |
7 | Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment | Conference paper | Jiaming Guo;Rui Zhang;Xishan Zhang;Shaohui Peng;Qi Yi;Zidong Du;Xing Hu;Qi Guo;Yunji Chen |
8 | An optimized tensor completion library for multiple GPUs | Conference paper | Ming Dun;Yunchun Li;Hailong Yang;Qingxiao Sun;Zhongzhi Luan;Depei Qian |
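9 | Device and Method for Automatically Correcting Data Accessed from a Storage Device Using Error-Correcting Codes (利用纠错码自动校正访问存储装置数据的装置及方法) | Patent | 张士锦;罗韬;刘少礼;陈云霁 |
10 | A Relative Position Encoding Method and System (一种相对位置编码方法及系统) | Patent | 杜子东;戴文娟;孙正;刘小蒙 |
11 | A Data Fusion Module Accelerator and Method (一种数据融合模块加速器及方法) | Patent | 王锐;刘轶;吕向;黄一迪;钱德沛 |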
12 | SpTFS: Sparse Tensor Format Selection for MTTKRP via Deep Learning | Conference paper | Qingxiao Sun;Yi Liu;Ming Dun;Hailong Yang;Zhongzhi Luan;Lin Gan;Guangwen Yang;Depei Qian |
13 | Processing graphs with barrierless asynchronous parallel model on shared-memory systems | Journal article | Le Luo;Yi Liu |
14 | iBalancer: Load-Aware in-Server Flow Scheduling for Sub-Millisecond Tail Latency | Journal article | Qi Zhang;Yi Liu;Tao Liu |
15 | 2021 CCF Outstanding Doctoral Dissertation Award (2021年CCF优秀博士学位论文奖) | Award | 赵永威 |
16 | BSHIFT: A Low Cost Deep Neural Networks Accelerator | Journal article | Yong Yu;Tian Zhi;Xuda Zhou;Shaoli Liu;Yunji Chen;Shuyao Cheng |
17 | Structure Characteristic-Aware Pruning Strategy for Convolutional Neural Networks | Conference paper | Peixuan Zuo;Rui Wang;Xianya Fu;Hailong Yang;Yi Liu;Lianyi Zhang;Han Zhang;Depei Qian |
18 | Magas: matrix-based asynchronous graph analytics on shared memory systems | Journal article | Le Luo;Yi Liu;Hailong Yang;Depei Qian |
19 | Accelerating in-memory transaction processing using general purpose graphics processing units | Journal article | Lan Gao;Yunlong Xu;Rui Wang;Hailong Yang;Zhongzhi Luan;Depei Qian |
20 | Several Issues in Exascale Computing (E级计算的几个问题) | Journal article | 钱德沛;王锐 |
21 | An Instruction Set Architecture for Machine Learning | Journal article | Yunji Chen;Huiying Lan;Zidong Du;Shaoli Liu;Jinhua Tao;Dong Han;Tao Luo;Qi Guo;Ling Li;Yuan Xie;Tianshi Chen |
22 | PriPro: Towards Effective Privacy Protection on Edge-Cloud System running DNN Inference | Conference paper | Ruiyuan Gao;Hailong Yang;Shaohan Huang;Ming Dun;Mingzhen Li;Zerong Luan;Zhongzhi Luan;Depei Qian |
23 | Vectorizing SpMV by Exploiting Dynamic Regular Patterns | Conference paper | Xin You;Changxi Liu;Hailong Yang;Zhongzhi Luan;Depei Qian |
24 | GraphQ: Scalable PIM-Based Graph Processing | Conference paper | Youwei Zhuo;Chao Wang;Mingxing Zhang;Rui Wang;Dimin Niu;Yanzhi Wang;Xuehai Qian |
25 | A Wearable Device (一种可穿戴设备) | Patent | 华山;伍立平;潘安;张健伟;赵扬波;杜文振;邱康达;黄丹 |
26 | BenchIP: Benchmarking Intelligence Processors | Journal article | Jin-Hua Tao;Zi-Dong Du;Qi Guo;Hui-Ying Lan;Lei Zhang;Sheng-Yuan Zhou;Ling-Jie Xu;Cong Liu;Hai-Feng Liu;Shan Tang;Allen Rush;Willian Chen;Shao-Li Liu;Yun-Ji Chen;Tian-Shi Chen |
27 | A Motion Recognition Method, Device, and Storage Medium (一种运动识别方法、装置及存储介质) | Patent | 华山;伍立平;潘安;张健伟;赵扬波;杜文振;邱康达;黄丹 |
28 | A Deep Learning-Based Visual SLAM Acceleration System and Method (一种基于深度学习的视觉SLAM加速系统及方法) | Patent | 王锐;曹睿智;王语涵;杨雨枫;黄一迪;钱德沛 |
29 | ParaML: A Polyvalent Multicore Accelerator for Machine Learning | Journal article | Zhou Shengyuan;Guo Qi;Du Zidong;Liu Daofu;Chen Tianshi;Li Ling;Liu Shaoli;Zhou Jinhong;Temam Olivier;Feng Xiaobing;Zhou Xuehai;Chen Yunji |
30 | Toward accelerated stencil computation by adapting tensor core unit on GPU | Conference paper | Xiaoyan Liu;Yi Liu;Hailong Yang;Jianjin Liao;Mingzhen Li;Zhongzhi Luan;Depei Qian |
31 | LADet: A Light-weight and Adaptive Network for Multi-scale Object Detection | Conference paper | Jiaming Zhou;Yuqiao Tian;Weicheng Li;Rui Wang;Zhongzhi Luan;Qian Depei |
32 | Guardauto: A Decentralized Runtime Protection System for Autonomous Driving | Journal article | Kun Cheng;Yuan Zhou;Bihuan Chen;Rui Wang;Yuebin Bai;Yang Liu |
33 | swSpAMM: optimizing large-scale sparse approximate matrix multiplication on Sunway Taihulight | Journal article | Xiaoyan Liu;Yi Liu;Bohong Yin;Hailong Yang;Zhongzhi Luan;Depei Qian |
34 | Temperature-Aware DRAM Cache Management—Relaxing Thermal Constraints in 3-D Systems | Journal article | Minxuan Zhou;Andreas Prodromou;Rui Wang;Hailong Yang;Depei Qian;Dean Tullsen |
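35 | A Physical Unclonable Function Hardware Circuit and Method Based on All-Spin Logic (一种基于全自旋逻辑的物理不可克隆函数硬件电路及方法) | Patent | 成元庆;徐康伟;王锐 |
36 | Causality-Driven Hierarchical Reinforcement Learning Framework and Hierarchical Reinforcement Learning Method (因果关系驱动的分层强化学习框架及分层强化学习方法) | Patent | 胡杏;彭少辉;张蕊;郭家明;易琦;张曦珊;杜子东;郭崎;陈天石 |
37 | A Semantic Recognition Method and System Based on Relative Position Encoding (一种基于相对位置编码的语义识别方法和系统) | Patent | 杜子东;庄毅敏;支天 |
38 | Method and System for Automatic Function Generation Based on Truth Tables (基于真值表的函数自动生成方法及系统) | Patent | 支天;贺文凯;胡杏;张曦珊;张蕊;杜子东;郭崎 |
39 | Method and System for Neural Network Model Knowledge Distillation Based on Automatic Gradient Blending (基于自动梯度混合的神经网络模型知识蒸馏方法及系统) | Patent | 张蕊;曹炅宣;杜治兴;陈天石 |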
40 | Adapting Combined Tiling to Stencil Optimizations on Sunway Processor | Conference paper | Biao Sun;Mingzhen Li;Hailong Yang;Jun Xu;Huaitao Zhang;Zhongzhi Luan;Depei Qian |
41 | Cambricon-Q: A Hybrid Architecture for Efficient Training | Conference paper | Yongwei Zhao;Liu Chang;Du Zidong;Guo Qi;Hu Xing;Yimin Zhuang;Zhang Zhenxing;Xinkai Song;Li Wei;Zhang Xishan;Li Ling;Xu Zhiwei;Chen Yunji |
42 | Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach | Conference paper | Xuda Zhou;Zidong Du;Qi Guo;Shaoli Liu;Chengsi Liu;Chao Wang;Xuehai Zhou;Ling Li;Tianshi Chen;Yunji Chen |
43 | DLIR: An Intermediate Representation for Deep Learning Processors | Conference paper | Huiying Lan;Zidong Du |
44 | Dictionary-Guided Editing Networks for Paraphrase Generation | Conference paper | Shaohan Huang;Yu Wu;Furu Wei;Zhongzhi Luan |
45 | QoS-aware dynamic resource allocation with improved utilization and energy efficiency on GPU | Journal article | Qingxiao Sun;Liu Yi;Hailong Yang;Mingzhen Li;Zhongzhi Luan;Depei Qian |
46 | Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture | Journal article | Mingzhen Li;Yi Liu;Hailong Yang;Zhongzhi Luan;Lin Gan;Guangwen Yang;Depei Qian |
47 | Addressing Irregularity in Sparse Neural Networks through a Cooperative Software/Hardware Approach | Journal article | Xi Zeng;Xuehai Zhou;Ling Li;Tianshi Chen;Ninghui Sun;Yunji Chen;Tian Zhi;Xuda Zhou;Zidong Du;Qi Guo;Shaoli Liu;Bingrui Wang;Yuanbo Wen;Chao Wang |
48 | Addressing Sparsity in Deep Neural Networks | Journal article | Xuda Zhou;Yunji Chen;Zidong Du;Shijin Zhang;Lei Zhang;Huiying Lan;Shaoli Liu;Ling Li;Qi Guo;Tianshi Chen |
49 | FLONet: Fewer Labeling Cost Active Learning for Deep Neural Network | Conference paper | Xianya Fu;Rui Wang;Peixuan Zuo;Jiaming Zhou;Jia Zhai;Xiaodan Xie;Zhongzhi Luan;Depei Qian |
50 | CoGNN: efficient scheduling for concurrent GNN training on GPUs | Conference paper | Qingxiao Sun;Yi Liu;Hailong Yang;Ruizhe Zhang;Ming Dun;Mingzhen Li;Xiaoyan Liu;Wencong Xiao;Yong Li;Zhongzhi Luan;Depei Qian |
51 | Neural Network-Based Information Processing Device and Method (基于神经网络的信息处理装置及方法) | Patent | 高钰峰;陈云霁 |
52 | Extremely Low-bit Convolution Optimization for Quantized Neural Network on Modern Computer Architectures | Conference paper | Qingchang Han;Yongmin Hu;Fengwei Yu;Hailong Yang;Bing Liu;Peng Hu;Ruihao Gong;Yanfei Wang;Rui Wang;Zhongzhi Luan;Depei Qian |
53 | Mutual calibration training: Training deep neural networks with noisy labels using dual-models | Journal article | Rui Liu;Yi Liu;Rui Wang;Yucong Zhou |
54 | Enabling One-size-fits-all Compilation Optimization across Machine Learning Computers for Inference | Journal article | Yuanbo Wen;Qi Guo;Zidong Du;Jianxing Xu;Zhenxing Zhang;Xing Hu;Wei Li;Rui Zhang;Chao Wang;Zhou Xuehai;Tianshi Chen |
55 | MIPSGPU: Minimizing Pipeline Stalls for GPUs With Non-Blocking Execution | Journal article | Chao Yu;Yuebin Bai;Rui Wang |
56 | CompactNet: Platform-Aware Automatic Optimization for Convolutional Neural Networks | Conference paper | Weicheng Li;Rui Wang;Depei Qian |
57 | Towards a General and Efficient Linked-List Hash Table on GPUs | Conference paper | Lan Gao;Yunlong Xu;Chongyang Xu;Rui Wang;Hailong Yang;Zhongzhi Luan;Depei Qian |
58 | A New Paradigm for Deep Learning Processor Architecture (深度学习处理器体系结构新范式) | Award | 陈天石 |
59 | An FPGA-Based Hardware Accelerator for Nonlinear Optimization in Visual-Inertial Odometry (基于 FPGA 的视觉惯性里程计非线性优化硬件加速器) | Conference paper | 曹睿智;吕向;黄一迪;刘轶;王锐 |
60 | TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding | Conference paper | Lei Zhang;Shengyuan Zhou;Tian Zhi;Zidong Du;Yunji Chen |
61 | swTVM: Towards Optimized Tensor Code Generation for Deep Learning on Sunway Many-Core Processor | Journal article | Mingzhen Li;Changxi Liu;Jianjin Liao;Xuegui Zheng;Hailong Yang;Rujun Sun;Jun Xu;Lin Gan;Guangwen Yang;Zhongzhi Luan;Depei Qian |
62 | SRAM- and STT-RAM-based hybrid, shared last-level cache for on-chip CPU–GPU heterogeneous architectures | Journal article | Lan Gao;Rui Wang;Yunlong Xu;Hailong Yang;Zhongzhi Luan;Depei Qian;Han Zhang;Jihong Cai |
63 | Efficient detection of silent data corruption in HPC applications with synchronization-free message verification | Journal article | Guozhen Zhang;Yi Liu;Hailong Yang;Depei Qian |
64 | Self-Aware Neural Network Systems: A Survey and New Perspective | Journal article | Zidong Du;Qi Guo;Yongwei Zhao;Tian Zhi;Yunji Chen;Zhiwei Xu |
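65 | Spiking Neural Network Computing Chip and Related Computing Method (脉冲神经网络运算芯片及相关运算方法) | Patent | 张磊;杜子东;陈云霁 |
66 | Spiking Neural Network Conversion Method and Related Conversion Chip (脉冲神经网络转换方法及相关转换芯片) | Patent | 张磊;杜子东;陈云霁 |
67 | An FPGA-Based Interface Conversion System and Method from SDRAM to MRAM (一种基于FPGA的从SDRAM到MRAM的接口转换系统及方法) | Patent | 成元庆;卢诚成;彭笑;张有光;赵巍胜;王锐 |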