Data orchestration in deep learning accelerators / Tushar Krishna, Hyoukjun Kwon, Angshuman Parashar, Michael Pellauer, Ananda Samajdar.
Material type: Text
Series: Synthesis digital library of engineering and computer science | Synthesis lectures on computer architecture ; #52
Publisher: San Rafael, California (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool Publishers
Description: 1 PDF (xvii, 146 pages) : illustrations (some color)
Content type: text
Media type: electronic
Carrier type: online resource
ISBN: 9781681738703
Subject(s): Neural networks (Computer science) | Machine learning | Data flow computing | artificial intelligence (AI) | deep learning | deep neural networks (DNN) | convolutional neural networks (CNN) | general matrix multiplication (GEMM) | hardware/software co-design | deep neural network scheduling (DNN scheduling) | deep neural network mapping (DNN mapping) | dataflow | data orchestration | spatial accelerators | architecture | hardware
Genre/Form: Electronic books.
Additional physical formats: Print version: No title
DDC classification: 006.3/2
LOC classification: QA76.87 .K754 2020eb
Online resources: Abstract with links to resource | Abstract with links to full text
Also available in print.
|Item type|Current library|Call number|Status|Date due|Barcode|
|Ebooks|Indian Institute of Technology Delhi - Central Library||Available|||
Mode of access: World Wide Web.
System requirements: Adobe Acrobat Reader.
Part of: Synthesis digital library of engineering and computer science.
Includes bibliographical references (pages 131-143).
1. Introduction to data orchestration -- 1.1. Deep neural networks (DNNs) -- 1.2. DNN accelerators -- 1.3. Book overview
2. Dataflow and data reuse -- 2.1. Data reuse opportunities -- 2.2. Data reuse in 1D convolution -- 2.3. Dataflows and mappings -- 2.4. Deep dive into dataflows and mappings -- 2.5. Harnessing data reuse via hardware support -- 2.6. Dataflows and data reuse in CONV2D -- 2.7. Convolution as matrix multiplication -- 2.8. Summary
3. Buffer hierarchies -- 3.1. Motivation -- 3.2. Classifying buffering approaches -- 3.3. The buffet storage idiom -- 3.4. Composition of buffer idioms -- 3.5. Other relevant buffering idioms for accelerators -- 3.6. Research needs for accelerator buffer hierarchies -- 3.7. Summary
4. Networks-on-chip -- 4.1. Communication phases -- 4.2. Traditional networks-on-chip -- 4.3. Specialized NoCs for DNN accelerators -- 4.4. Leveraging reuse via the NoC -- 4.5. Tying it together: from dataflow to traffic flow -- 4.6. Summary
5. Putting it together: architecting a DNN accelerator -- 5.1. Design flow -- 5.2. Example design walk-through -- 5.3. Case studies -- 5.4. Summary
6. Modeling accelerator design space -- 6.1. Separating the mapping space from the architecture design space -- 6.2. Representing mappings -- 6.3. Modeling the execution of a mapping on an architecture -- 6.4. Building an automated mapper -- 6.5. Summary
7. Orchestrating compressed-sparse data -- 7.1. Overview -- 7.2. Sparsity in DNNs -- 7.3. Structured vs. unstructured sparsity -- 7.4. Exploiting sparsity -- 7.5. Sparse dataflows -- 7.6. Costs and benefits -- 7.7. Summary
8. Conclusions -- 8.1. Research opportunities in data orchestration -- 8.2. Data orchestration in alternate platforms -- 8.3. Training accelerators -- 8.4. Summary.
Abstract freely available; full-text restricted to subscribers or individual document purchasers.
This Synthesis Lecture focuses on techniques for efficient data orchestration within DNN accelerators. The end of Moore's law, coupled with the rapid growth of deep learning and other AI applications, has led to the emergence of custom Deep Neural Network (DNN) accelerators for energy-efficient inference on edge devices. Modern DNNs have millions of parameters and involve billions of computations, which necessitates extensive data movement from memory to on-chip processing engines. The cost of moving data today exceeds the cost of the computation itself; DNN accelerators therefore require careful orchestration of data across on-chip compute, network, and memory elements to minimize the number of accesses to external DRAM. The book covers DNN dataflows, data reuse, buffer hierarchies, networks-on-chip, and automated design-space exploration. It concludes with the data orchestration challenges posed by compressed and sparse DNNs, and with future trends. The target audience is students, engineers, and researchers interested in designing high-performance, low-energy accelerators for DNN inference.
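As an illustrative sketch of the data-reuse idea the abstract alludes to (this example is not drawn from the book itself), the snippet below counts operand fetches for a 1D convolution under two assumed dataflows: a naive one where every multiply-accumulate reads both operands from DRAM, and a weight-stationary one where the filter is filled into an on-chip buffer once and reused for every output. The function name and access-counting model are hypothetical simplifications.

```python
# Illustrative sketch (not from the book): counting data reuse in a 1D convolution.
# Assumes a simplified cost model where only DRAM reads are counted and no
# input buffering exists; a weight-stationary dataflow fetches each filter
# weight from DRAM once instead of once per output.

def conv1d_access_counts(input_len, filter_len):
    out_len = input_len - filter_len + 1  # valid (no-padding) convolution
    # Naive dataflow: every operand of every MAC comes from DRAM.
    naive_weight_reads = out_len * filter_len
    naive_input_reads = out_len * filter_len
    # Weight-stationary: fill the filter buffer once, reuse it for all outputs.
    ws_weight_reads = filter_len
    ws_input_reads = out_len * filter_len  # inputs still streamed from DRAM
    return {
        "naive_weight_reads": naive_weight_reads,
        "ws_weight_reads": ws_weight_reads,
        "weight_reuse_factor": naive_weight_reads // ws_weight_reads,
    }

counts = conv1d_access_counts(input_len=16, filter_len=3)
# With 16 inputs and a 3-tap filter there are 14 outputs, so each weight
# is reused 14 times under the weight-stationary dataflow.
print(counts)
```

The same counting exercise generalizes to CONV2D and GEMM, where the much larger reuse factors are what make buffer hierarchies and dataflow choice (Chapters 2 and 3) so consequential for DRAM traffic.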
Title from PDF title page (viewed on September 8, 2020).