© 2019 Xilinx
The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability
Vinod Kathail
Xilinx
May 2019
Motivation for AI Engine
Motivation for AI Engine
Dynamic Markets Require Adaptive Compute Acceleration Platform (ACAP)
Applications: AI Everywhere
• Machine Learning
• ADAS / AD
• 5G
• Smart City
• Smart Factory
• Data Center
Workloads
• Compute Intensity
• Real Time Capability
• Power Efficiency
Technology Scaling
• Moore’s Law
• Performance & Power Scaling
• Traditional Single / Multi-core
Versal ACAP Architecture Overview
Adaptable Engines
• 2X compute density
Programmable I/O
• Any interface or sensor
• Includes 4.2Gb/s MIPI
AI Engines
• AI Compute
• Diverse DSP workloads
DDR Memory
• 3200-DDR4, 3200-LPDDR4
• 2X bandwidth/pin
Protocol Engines
• Integrated 600G cores
• 4X encrypted bandwidth
PCIe & CCIX
• 2X PCIe & DMA bandwidth
• Cache-coherent interface to accelerators
Transceivers
• Broad range, 25G → 112G
• 58G in mainstream devices
Scalar Engines
• Platform Control
• Edge Compute
Network-on-Chip
• Guaranteed Bandwidth
• Enables SW Programmability
Introducing the AI Engine
[Diagram: four AI Engine tiles, each pairing an AI core with its own memory]
• Signal Processing
• Artificial Intelligence: CNN, LSTM, MLP
• Computer Vision
• 1GHz+ Multi-precision Vector Processor
• High bandwidth extensible memory
• Up to 400 AI Engines per device
• 8X Compute Density
• 40% Lower Power
SW Programmable
Deterministic
Efficient
Software Programmable: Any Developer
[Diagram: C/C++ and Frameworks inputs pass through Design, Compile (AI Engine Compiler), and Run]
Programming Abstraction Levels
• Domain Specific Architecture
• Data Flow w/ Xilinx libraries
• Data Flow w/ user defined libraries
• Kernel Program
Xilinx libraries
• 4G/5G/Radar Library
• AI Library
• Vision Library
AI Engine Application Performance & Power Efficiency
[Charts: Xilinx 16nm UltraScale+ vs. Xilinx 7nm Versal w/ AI Engine]
• AI Inference Compute, Image Classification (GoogleNet, <1ms): 20x
• 5G Wireless Bandwidth, Massive MIMO Radio (DUC, DDC, CFR, DPD): 5x
• Power Consumption: 40% Less Power
AI Engine Architecture
AI Engine: Tile-Based Architecture
ISA-based Vector Processor
• Software Programmable (e.g., C/C++)
• AI Vector Extensions
• 5G Vector Extensions
Data Mover
• Non-neighbor data communication
• Integrated synchronization primitives
Non-Blocking Interconnect
• High GB/s bandwidth per tile
Local Memory
• Multi-bank implementation
• Shared across neighbor cores
Cascade Interface
• Partial results to next core
[Diagram: AI Engine tile array connected to the PL, PS, and I/O]
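The Cascade Interface item says partial results pass to the next core. One way to picture that is a dot product split across a chain of cores, where each core adds its slice's contribution to the running value it received. The sketch below is only a software model under that assumption: the name `cascade_dot` and the even chunking are illustrative and are not Xilinx APIs.

```python
def cascade_dot(a, b, num_cores):
    """Model a dot product split across a chain of cores.

    Each hypothetical core multiplies its own slice of the inputs, adds
    the partial sum received from the previous core, and forwards the
    running total to the next core in the chain.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    if num_cores < 1:
        raise ValueError("need at least one core")
    chunk = -(-len(a) // num_cores)  # ceiling division: elements per core
    partial = 0  # value carried on the modeled cascade link
    for core in range(num_cores):
        lo = core * chunk
        hi = min(lo + chunk, len(a))
        partial += sum(x * y for x, y in zip(a[lo:hi], b[lo:hi]))
    return partial


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [8, 7, 6, 5, 4, 3, 2, 1]
result = cascade_dot(a, b, num_cores=4)
print(result)
```

The result does not depend on how many cores share the work; only the length of the chain changes.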
AI Engine: Array Architecture
[Diagram: nine AI Engine tiles, each an AI core with memory, connected to the PL, PS, and I/O]
Modular and scalable architecture
• More tiles = more compute
• Up to 400 per device
• Versal AI Core VC1902 device
Distributed memory hierarchy
• Maximize memory bandwidth
Array of AI Engines
• Increase in compute, memory and communication bandwidth
Deterministic Performance & Low Latency
AI Engine: Processor Core
Local, Shareable Memory
• 32KB Local, 128KB Addressable
Scalar Unit
• 32-bit Scalar RISC Processor
• Scalar Register File, Scalar ALU, Non-linear Functions
Vector Unit (Vector Processor)
• 512-bit SIMD Datapath
• Vector Register File, Fixed-Point Vector Unit, Floating-Point Vector Unit
• Up to 128 MACs / Clock Cycle per Core (INT 8)
Memory Interface
• AGU ×3, Load Unit A, Load Unit B, Store Unit
Instruction Fetch & Decode Unit
Stream Interface
Highly Parallel
Instruction Parallelism: VLIW
• 7+ operations / clock cycle
• 2 Vector Loads / 1 Mult / 1 Store
• 2 Scalar Ops / Stream Access
Data Parallelism: SIMD
• Multiple vector lanes
• Vector Datapath
• 8 / 16 / 32-bit & SPFP operands
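The per-core figure of up to 128 INT8 MACs per clock cycle can be turned into a rough peak-throughput estimate. The sketch below is back-of-the-envelope arithmetic, not a published specification: it assumes a clock of exactly 1 GHz (the deck says "1GHz+"), all 400 engines running at peak simultaneously, and the common convention of counting one MAC as two operations.

```python
MACS_PER_CYCLE_INT8 = 128  # from the deck: up to 128 MACs / clock cycle per core
CLOCK_HZ = 1.0e9           # assumption: exactly 1 GHz (the deck says "1GHz+")
NUM_ENGINES = 400          # from the deck: up to 400 AI Engines per device


def peak_macs_per_second(macs_per_cycle, clock_hz, engines=1):
    """Peak multiply-accumulates per second if every engine is fully busy."""
    return macs_per_cycle * clock_hz * engines


per_core = peak_macs_per_second(MACS_PER_CYCLE_INT8, CLOCK_HZ)
per_device = peak_macs_per_second(MACS_PER_CYCLE_INT8, CLOCK_HZ, NUM_ENGINES)

print(f"per core:   {per_core / 1e9:.0f} GMAC/s")
print(f"per device: {per_device / 1e12:.1f} TMAC/s "
      f"({2 * per_device / 1e12:.1f} TOPS counting a MAC as 2 ops)")
```

Sustained throughput on a real kernel will be lower than this ceiling, since data movement and kernel efficiency both matter.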
Data Movement Architecture
Memory Communication
• Dataflow Pipeline
• Dataflow Graph
• Non-Neighbor
Streaming Communication
• Streaming Multicast
• Cascade Streaming
[Diagrams: AI cores exchanging data through shared memory banks (B0 to B3) over the Memory Interface, Stream Interface, and Cascade Interface]
AI Engine Integration in Versal
˃ TB/s of Interface Bandwidth
• AI Engine to Programmable Logic
• AI Engine to NoC
˃ Leveraging NoC connectivity
• Processing System manages Config / Debug / Trace
• AI Engine to DRAM without PL
AI Engine: Multi-Core Compute with Dedicated Memory
Traditional Multi-core (cache-based architecture)
[Diagram: two blocks of three cores, each core with an L0, each block with an L1, sharing an L2 and DRAM; data item D0 appears in four places]
Fixed, shared Interconnect
• Blocking limits compute
• Timing not deterministic
Data Replicated
• Robs bandwidth
• Reduces capacity
AI Engine Array (intelligent engine)
[Diagram: array of AI cores, each paired with local memory]
Dedicated Interconnect
• Non-blocking
• Deterministic
Local, Distributed Memory
• No cache misses
• Higher bandwidth
• Less capacity required
AI Engine Delivers High Compute Efficiency
Vector Processor Efficiency (Peak Kernel Theoretical Performance)
• ML Convolutions: 95%. Block-based Matrix Multiplication, (32×64) × (64×32)
• FFT: 80%. 1024-pt FFT/iFFT
• DPD: 98%. Volterra-based forward-path DPD
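To get a feel for what the 95% figure means for the (32×64) × (64×32) block, the sketch below counts the multiply-accumulates and converts them to cycles. It rests on assumptions the slide does not state: that the kernel uses INT8 operands at the 128 MACs-per-cycle peak quoted for the processor core, and that efficiency means ideal cycles divided by actual cycles.

```python
def matmul_macs(m, k, n):
    """Multiply-accumulates needed for an (m x k) by (k x n) matrix product."""
    return m * k * n


MACS_PER_CYCLE = 128  # assumption: INT8 peak per core quoted in the deck
EFFICIENCY = 0.95     # from the slide: ML convolutions

macs = matmul_macs(32, 64, 32)
ideal_cycles = macs / MACS_PER_CYCLE
estimated_cycles = ideal_cycles / EFFICIENCY  # assumed meaning of "efficiency"

print(f"MACs: {macs}")
print(f"ideal cycles: {ideal_cycles:.0f}")
print(f"estimated cycles at 95% efficiency: {estimated_cycles:.0f}")
```

Under these assumptions the gap between ideal and estimated cycles is small, which is the point the efficiency figure makes.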
˃ Adaptable, non-blocking interconnect
• Flexible data movement architecture
• Avoids interconnect “bottlenecks”
˃ Adaptable memory hierarchy
• Local, distributed, shareable = extreme bandwidth
• No cache misses or data replication
• Extend to PL memory (BRAM, URAM)
˃ Transfer data while AI Engine Computes
• Overlap Compute and Communication
[Diagram: timeline in which each Comm phase runs alongside a Compute phase]
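The benefit of overlapping compute and communication can be estimated with a simple timing model. The sketch below assumes a two-buffer (ping-pong) scheme in which the transfer of the next block runs while the current block is being processed. The function names and the time units are illustrative, not taken from the deck.

```python
def serial_time(n_blocks, comm, compute):
    """Total time when each block is transferred, then processed, in turn."""
    return n_blocks * (comm + compute)


def overlapped_time(n_blocks, comm, compute):
    """Total time with ping-pong buffering.

    The first transfer and the last compute cannot be hidden; in between,
    each step costs whichever of the two phases is slower.
    """
    if n_blocks == 0:
        return 0
    return comm + (n_blocks - 1) * max(comm, compute) + compute


# Illustrative numbers in arbitrary time units.
n, comm, compute = 8, 3, 5
print(serial_time(n, comm, compute))      # 64
print(overlapped_time(n, comm, compute))  # 43
```

When compute takes at least as long as communication, the transfers are almost entirely hidden and the total approaches the pure compute time.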
AI Engine Programming and Applications
Versal ACAP Development Tools
[Table columns: User, Tools, Supported Frameworks]
• AI and Data Scientists: Frameworks
• Software Application Developers: Unified Software Development Environment
• Hardware Developers: Vivado Design Suite
Software Development Environment
˃ Unified development environment
• Full chip programming
˃ SW programmable for whole application
• Heterogeneous SW acceleration
˃ Full system simulation, debug & profiling
• Software development experience
[Diagram: Unified SW Development Environment. An application (e.g. C/C++) plus performance constraints is mapped onto the Processing Sub-system (Scalar), Programmable Logic (Adaptable), and AI Engines (Intelligent), targeting System Simulation or Hardware, with System Debug & Profiling]
AI Engine Programming: Dataflow Model
1. User defines dataflow logic
2. User describes dataflow graph using C/C++ APIs
3. Compiler transparently manages placement & interconnect
[Diagram: dataflow graph with kernels a, b, c, d, and e, and its Physical Mapping to AI Engines, with each kernel placed on a vector core next to memory and a connection to the PL]
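The dataflow model separates describing the graph from placing it on hardware. The sketch below illustrates that split in plain Python: a graph over kernels a to e is declared as edges, and a naive placement assigns kernels to tiles in topological order. Everything here is an assumption for illustration: the edge list is hypothetical (the transcript does not preserve the arrows), `place` is not the Xilinx compiler's algorithm, and the real flow uses C/C++ APIs rather than Python.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical connectivity for kernels a..e; the transcript does not
# preserve the edges drawn on the slide.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("d", "e")]


def build_predecessors(edges):
    """Map each kernel to the set of kernels that feed it."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(src, set())
        preds.setdefault(dst, set()).add(src)
    return preds


def place(edges, tiles):
    """Naive placement: assign kernels to tiles in dependency order."""
    order = list(TopologicalSorter(build_predecessors(edges)).static_order())
    if len(order) > len(tiles):
        raise ValueError("not enough tiles for all kernels")
    return dict(zip(order, tiles))


# A hypothetical 2 x 3 grid of tiles, identified by (row, column).
tiles = [(row, col) for row in range(2) for col in range(3)]
placement = place(edges, tiles)
for kernel, tile in placement.items():
    print(kernel, "->", tile)
```

A production placer would also weigh memory sharing between neighboring tiles and interconnect routing, which this sketch ignores.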
Accelerating AI Inference
1. User works in framework of choice
• Develop & train custom network
• User provides trained model
2. Xilinx DNN Compiler implements network
• Targets AI inference implemented on FPGA and Versal
• Optimizations: Quantize, merge layers, prune
• Compile to AI Engines
3. Scalable across hardware targets
• Start with Alveo boards with FPGAs today
[Diagram: Deep Learning Frameworks feed the Xilinx DNN Compiler, which targets the Xilinx AI Inference Domain Specific Architecture on Alveo U200/U250/U280 and New Versal based Acceleration Cards]
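Of the optimizations listed, quantization is the easiest to show in a few lines. The sketch below uses symmetric per-tensor INT8 quantization, which is one common scheme; the deck does not say which scheme the Xilinx DNN Compiler applies, and the function names are illustrative.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in quantized]


weights = [1.0, -1.0, 0.0, 0.3]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q)
print([round(r, 4) for r in restored])
```

Each restored value differs from the original by at most half a quantization step, which is the accuracy cost traded for INT8 throughput.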
AI Engine Delivers Real-time Inference Leadership
[Chart: ResNet-50 Inference Performance, Images/Sec, axis 0 to 12,000]
• GPU: 1x
• Versal: 3.5x
• Versal With Xilinx Pruning: 4.5x
Sources:
GPU: Nvidia T4 TensorRT 5, Published March 2019 (INT8, Batch=4, 1.5ms Latency)
Versal Card, Projected (INT8, Batch=8, 1.5ms Latency)
AI Engine: Accelerating AI Inference & Signal Processing
Software Programmable
• Frameworks & C/C++
• SW Compile, Debug & Deploy
Deterministic
• Max throughput w/ low latency
• Real-time inference leadership
Efficient
• Up to 8X compute density
• At ~40% lower power
[Graphic: AI Inference and Signal Processing, with speedup labels 10x and 5x]
Additional Resources
For more information on ACAP and Versal, please visit: www.xilinx.com/versal
Please visit EVS Booth #610
Face recognition using Xilinx FPGA
“Multi-object Tracking Systems,” a Presentation from Tryolabs
Edge AI and Vision Alliance
 
“Improved Navigation Assistance for the Blind via Real-time Edge AI,” a Prese...
“Improved Navigation Assistance for the Blind via Real-time Edge AI,” a Prese...“Improved Navigation Assistance for the Blind via Real-time Edge AI,” a Prese...
“Improved Navigation Assistance for the Blind via Real-time Edge AI,” a Prese...
Edge AI and Vision Alliance
 
“Using Vision Systems, Generative Models and Reinforcement Learning for Sport...
“Using Vision Systems, Generative Models and Reinforcement Learning for Sport...“Using Vision Systems, Generative Models and Reinforcement Learning for Sport...
“Using Vision Systems, Generative Models and Reinforcement Learning for Sport...
Edge AI and Vision Alliance
 
“Introduction to Cameras for Embedded Applications,” a Presentation from Sens...
“Introduction to Cameras for Embedded Applications,” a Presentation from Sens...“Introduction to Cameras for Embedded Applications,” a Presentation from Sens...
“Introduction to Cameras for Embedded Applications,” a Presentation from Sens...
Edge AI and Vision Alliance
 
“Introduction to Modern Radar for Machine Perception,” a Presentation from Se...
“Introduction to Modern Radar for Machine Perception,” a Presentation from Se...“Introduction to Modern Radar for Machine Perception,” a Presentation from Se...
“Introduction to Modern Radar for Machine Perception,” a Presentation from Se...
Edge AI and Vision Alliance
 
“Diagnosing Problems and Implementing Solutions for Deep Neural Network Train...
“Diagnosing Problems and Implementing Solutions for Deep Neural Network Train...“Diagnosing Problems and Implementing Solutions for Deep Neural Network Train...
“Diagnosing Problems and Implementing Solutions for Deep Neural Network Train...
Edge AI and Vision Alliance
 
“Seeing Through Machines: A Guide to Image Sensors for Edge AI Applications,”...
“Seeing Through Machines: A Guide to Image Sensors for Edge AI Applications,”...“Seeing Through Machines: A Guide to Image Sensors for Edge AI Applications,”...
“Seeing Through Machines: A Guide to Image Sensors for Edge AI Applications,”...
Edge AI and Vision Alliance
 
“Transformer Networks: How They Work and Why They Matter,” a Presentation fro...
“Transformer Networks: How They Work and Why They Matter,” a Presentation fro...“Transformer Networks: How They Work and Why They Matter,” a Presentation fro...
“Transformer Networks: How They Work and Why They Matter,” a Presentation fro...
Edge AI and Vision Alliance
 
“Removing Weather-related Image Degradation at the Edge,” a Presentation from...
“Removing Weather-related Image Degradation at the Edge,” a Presentation from...“Removing Weather-related Image Degradation at the Edge,” a Presentation from...
“Removing Weather-related Image Degradation at the Edge,” a Presentation from...
Edge AI and Vision Alliance
 
“Seeing the Invisible: Unveiling Hidden Details through Advanced Image Acquis...
“Seeing the Invisible: Unveiling Hidden Details through Advanced Image Acquis...“Seeing the Invisible: Unveiling Hidden Details through Advanced Image Acquis...
“Seeing the Invisible: Unveiling Hidden Details through Advanced Image Acquis...
Edge AI and Vision Alliance
 
“Data-efficient and Generalizable: The Domain-specific Small Vision Model Rev...
“Data-efficient and Generalizable: The Domain-specific Small Vision Model Rev...“Data-efficient and Generalizable: The Domain-specific Small Vision Model Rev...
“Data-efficient and Generalizable: The Domain-specific Small Vision Model Rev...
Edge AI and Vision Alliance
 
“Omnilert Gun Detect: Harnessing Computer Vision to Tackle Gun Violence,” a P...
“Omnilert Gun Detect: Harnessing Computer Vision to Tackle Gun Violence,” a P...“Omnilert Gun Detect: Harnessing Computer Vision to Tackle Gun Violence,” a P...
“Omnilert Gun Detect: Harnessing Computer Vision to Tackle Gun Violence,” a P...
Edge AI and Vision Alliance
 
“Adventures in Moving a Computer Vision Solution from Cloud to Edge,” a Prese...
“Adventures in Moving a Computer Vision Solution from Cloud to Edge,” a Prese...“Adventures in Moving a Computer Vision Solution from Cloud to Edge,” a Prese...
“Adventures in Moving a Computer Vision Solution from Cloud to Edge,” a Prese...
Edge AI and Vision Alliance
 
“Bridging Vision and Language: Designing, Training and Deploying Multimodal L...
“Bridging Vision and Language: Designing, Training and Deploying Multimodal L...“Bridging Vision and Language: Designing, Training and Deploying Multimodal L...
“Bridging Vision and Language: Designing, Training and Deploying Multimodal L...
Edge AI and Vision Alliance
 
Ad

"The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability," a Presentation from Xilinx

  • 1. © 2019 Xilinx The Xilinx AI Engine: High Performance with Future-proof Architecture Adaptability Vinod Kathail Xilinx May 2019
  • 2. Motivation for AI Engine
  • 3. Motivation for AI Engine: Dynamic Markets Require Adaptive Compute Acceleration Platform (ACAP). Compute Intensity; Real Time Capability; Power Efficiency. Technology Scaling: Moore’s Law; Performance & Power Scaling; Traditional Single / Multi-core. Applications: AI Everywhere; Machine Learning; ADAS / AD; 5G; Smart City; Smart Factory; Data Center Workloads.
  • 4. Versal ACAP Architecture Overview. Adaptable Engines: 2X compute density. Programmable I/O: any interface or sensor; includes 4.2Gb/s MIPI. AI Engines: AI compute; diverse DSP workloads. DDR Memory: 3200-DDR4, 3200-LPDDR4; 2X bandwidth/pin. Protocol Engines: integrated 600G cores; 4X encrypted bandwidth. PCIe & CCIX: 2X PCIe & DMA bandwidth; cache-coherent interface to accelerators. Transceivers: broad range, 25G → 112G; 58G in mainstream devices. Scalar Engines: platform control; edge compute. Network-on-Chip: guaranteed bandwidth; enables SW programmability.
  • 5. Introducing the AI Engine. Diagram: four AI Core + Memory tiles. Workloads: Signal Processing; Artificial Intelligence (CNN, LSTM, MLP); Computer Vision. 1GHz+ multi-precision vector processor; high bandwidth extensible memory; up to 400 AI Engines per device; 8X compute density; 40% lower power. SW Programmable; Deterministic; Efficient. Adaptable. Intelligent.
  • 6. Software Programmable: Any Developer. Flow: Design; Compile; Run. Inputs: Frameworks; C/C++. Libraries: 4G/5G/Radar Library; AI Library; Vision Library. AI Engine Compiler. Programming Abstraction Levels (1, 2, 3): Domain Specific Architecture; Data Flow w/ Xilinx libraries; Kernel Program, Data Flow w/ user defined libraries.
  • 7. AI Engine Application Performance & Power Efficiency. Applications: Image Classification (GoogleNet, <1ms); Massive MIMO Radio (DUC, DDC, CFR, DPD). Charts compare Xilinx 16nm UltraScale+ with Xilinx 7nm Versal w/ AI Engine on AI Inference Compute, 5G Wireless Bandwidth, and Power Consumption. Chart labels: 20x; 5x; 40% Less Power.
  • 8. AI Engine Architecture
  • 9. AI Engine: Tile-Based Architecture. Tile diagram: Interconnect; ISA-based Vector Processor (AI Vector Extensions, 5G Vector Extensions); Local Memory; Data Mover. ISA-based Vector Processor: software programmable (e.g., C/C++). Data Mover: non-neighbor data communication; integrated synchronization primitives. Non-Blocking Interconnect: high GB/s bandwidth per tile. Local Memory: multi-bank implementation; shared across neighbor cores. Cascade Interface: partial results to next core. Diagram labels: PL; PS; I/O.
  • 10. AI Engine: Array Architecture. Diagram: grid of Memory + AI Core tiles, with PL, PS, and I/O labels. Modular and scalable architecture: more tiles = more compute; up to 400 per device; Versal AI Core VC1902 device. Distributed memory hierarchy: maximize memory bandwidth. Array of AI Engines: increase in compute, memory and communication bandwidth. Deterministic Performance & Low Latency.
  • 11. AI Engine: Processor Core. Local, Shareable Memory: 32KB local, 128KB addressable. 32-bit Scalar RISC Processor. Up to 128 MACs / clock cycle per core (INT8). Highly Parallel Memory Interface. Block diagram: Scalar Unit (Scalar Register File, Scalar ALU, Non-linear Functions); Vector Unit (Vector Register File, Fixed-Point Vector Unit, Floating-Point Vector Unit); Instruction Fetch & Decode Unit; three AGUs; Load Unit A; Load Unit B; Store Unit; Stream Interface. Vector Processor: 512-bit SIMD datapath. Instruction Parallelism (VLIW): 7+ operations / clock cycle; 2 vector loads / 1 mult / 1 store; 2 scalar ops / stream access. Data Parallelism (SIMD): multiple vector lanes; vector datapath; 8 / 16 / 32-bit & SPFP operands.
  • 12. Data Movement Architecture. Diagram panels: Dataflow Graph; Dataflow Pipeline (memory buffers B0, B1, B2, B3); Streaming Multicast; Non-Neighbor; Cascade Streaming. Categories: Memory Communication; Streaming Communication. Legend: Memory Interface; Stream Interface; Cascade Interface.
  • 13. AI Engine Integration in Versal. TB/s of Interface Bandwidth: AI Engine to Programmable Logic; AI Engine to NOC. Leveraging NOC connectivity: Processing System manages Config / Debug / Trace; AI Engine to DRAM without PL. Diagram labels: PL; PS; I/O.
  • 14. AI Engine: Multi-Core Compute with Dedicated Memory. Traditional Multi-core (cache-based architecture): diagram of cores with L0 caches grouped into blocks with L1, a shared L2, and DRAM, with data D0 replicated. Fixed, shared interconnect: blocking limits compute; timing not deterministic. Data replicated: robs bandwidth; reduces capacity. AI Engine Array (intelligent engine): diagram of MEM + AI Core tiles. Dedicated interconnect: non-blocking; deterministic. Local, distributed memory: no cache misses; higher bandwidth; less capacity required.
  • 15. AI Engine Delivers High Compute Efficiency. Vector Processor Efficiency (Peak Kernel Theoretical Performance): ML Convolutions 95%; FFT 80%; DPD 98%. Kernels: Block-based Matrix Multiplication (32×64) × (64×32); 1024-pt FFT/iFFT; Volterra-based forward-path DPD. Adaptable, non-blocking interconnect: flexible data movement architecture; avoids interconnect “bottlenecks”. Adaptable memory hierarchy: local, distributed, shareable = extreme bandwidth; no cache misses or data replication; extend to PL memory (BRAM, URAM). Transfer data while AI Engine computes: overlap compute and communication.
  • 16. AI Engine Programming and Applications
  • 17. Versal ACAP Development Tools. Users and tools: AI and Data Scientists use Frameworks; Software Application Developers use the Unified Software Development Environment; Hardware Developers use the Vivado Design Suite. Supported frameworks (shown as logos).
  • 18. Software Development Environment. Unified development environment: full chip programming. SW programmable for whole application: heterogeneous SW acceleration. Full system simulation, debug & profiling: software development experience. Diagram labels: Application (e.g. C/C++); Performance Constraints; Unified SW Development Environment; Processing Sub-system; Programmable Logic; AI Engines; System Simulation; Hardware; System Debug & Profiling; Scalar; Adaptable; Intelligent.
  • 19. AI Engine Programming: Dataflow Model. Steps (numbered 1, 2, 3 on the slide): User defines dataflow logic (kernels a, b, c, d, e); User describes dataflow graph using C/C++ APIs; Compiler transparently manages placement & interconnect. Diagram: Physical Mapping to AI Engines, with kernels a to e placed on Vector Core + Memory tiles and PL.
  • 20. Accelerating AI Inference. User works in framework of choice: develop & train custom network; user provides trained model. Xilinx DNN Compiler implements network: targets AI inference implemented on FPGA and Versal; optimizations: quantize, merge layers, prune; compile to AI Engines. Scalable across hardware targets: start with Alveo boards with FPGAs today. Diagram labels: Deep Learning Frameworks; Xilinx DNN Compiler; Xilinx AI Inference Domain Specific Architecture; Alveo U200/U250/U280; New Versal based Acceleration Cards.
  • 21. AI Engine Delivers Real-time Inference Leadership. Chart: Resnet50 Inference Performance in Images/Sec (axis 0 to 12,000): GPU 1x; Versal 3.5x; 4.5x With Xilinx Pruning. Sources: GPU: Nvidia T4 TensorRT 5, Published March 2019 (INT8, Batch=4, 1.5ms Latency); Versal Card, Projected (INT8, Batch=8, 1.5ms Latency).
  • 22. AI Engine: Accelerating AI Inference & Signal Processing. Software Programmable: Frameworks & C/C++; SW compile, debug & deploy. Deterministic: max throughput w/ low latency; real-time inference leadership. Efficient: up to 8X compute density; at ~40% lower power. Chart labels: Signal Processing; AI Inference; 10x; 5x.
  • 23. Additional Resources. For more information on ACAP and Versal, please visit: www.xilinx.com/versal. Please visit EVS Booth #610: Face recognition using Xilinx FPGA.
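
The per-core and per-device figures on slides 10, 11 and 15 can be combined into a rough peak-throughput estimate. The sketch below is illustrative arithmetic only. The 1.0 GHz clock is an assumption (the slides say "1GHz+"), the function names are made up for this example, and applying the INT8 MAC rate and the 95% efficiency figure to the (32×64) × (64×32) matrix multiplication is an assumed reading of slide 15, since the slides do not state that kernel's precision.

```python
# Back-of-the-envelope throughput estimate from the slide figures.
# Assumptions (not stated in the slides): a 1.0 GHz clock as a lower bound
# for "1GHz+", and INT8 operands for the matrix-multiplication kernel.

MACS_PER_CYCLE_INT8 = 128   # slide 11: up to 128 MACs / clock cycle per core
MAX_TILES = 400             # slides 5 and 10: up to 400 AI Engines per device
ASSUMED_CLOCK_HZ = 1.0e9    # assumption: the slides only say "1GHz+"


def peak_macs_per_second(tiles=MAX_TILES,
                         clock_hz=ASSUMED_CLOCK_HZ,
                         macs_per_cycle=MACS_PER_CYCLE_INT8):
    """Peak multiply-accumulates per second for an array of tiles."""
    return tiles * clock_hz * macs_per_cycle


def matmul_macs(m, k, n):
    """MACs needed to multiply an (m x k) matrix by a (k x n) matrix."""
    return m * k * n


def cycles_at_efficiency(macs, efficiency,
                         macs_per_cycle=MACS_PER_CYCLE_INT8):
    """Cycles on one core if it sustains `efficiency` of its peak MAC rate."""
    return macs / (macs_per_cycle * efficiency)


if __name__ == "__main__":
    macs = matmul_macs(32, 64, 32)
    print("device peak MAC/s:", peak_macs_per_second())
    print("MACs in (32x64) x (64x32):", macs)
    print("cycles at 100% efficiency:", cycles_at_efficiency(macs, 1.0))
    print("cycles at 95% efficiency:", cycles_at_efficiency(macs, 0.95))
```

At the assumed 1.0 GHz clock, 400 tiles at 128 MACs per cycle give 5.12e13 MAC/s, and the 65,536-MAC matrix multiplication needs 512 cycles on one core at the ideal rate.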
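
Slide 15's "Overlap Compute and Communication" point, together with the B0 to B3 memory buffers on slide 12, suggests a ping-pong (double-buffering) scheme. A minimal timing model of that idea follows, assuming two buffers per core and fixed per-block transfer and compute times. The function names and the example cycle counts are hypothetical and are not taken from the slides.

```python
# Timing model: serial transfer-then-compute versus double buffering.
# Assumption: each core has two input buffers, so block i reuses the
# buffer that block i-2 occupied.


def serial_cycles(n_blocks, t_comm, t_compute):
    """Total cycles when each block is transferred, then computed, in turn."""
    return n_blocks * (t_comm + t_compute)


def double_buffered_cycles(n_blocks, t_comm, t_compute):
    """Total cycles when the next block is transferred during compute."""
    load_done = [0] * n_blocks
    compute_done = [0] * n_blocks
    for i in range(n_blocks):
        # The transfer channel is busy until the previous load finishes,
        # and the target buffer is busy until block i-2 has been computed.
        prev_load = load_done[i - 1] if i >= 1 else 0
        buffer_free = compute_done[i - 2] if i >= 2 else 0
        load_done[i] = max(prev_load, buffer_free) + t_comm
        # Compute starts once the data has arrived and the core is idle.
        prev_compute = compute_done[i - 1] if i >= 1 else 0
        compute_done[i] = max(load_done[i], prev_compute) + t_compute
    return compute_done[-1] if n_blocks else 0


if __name__ == "__main__":
    # Hypothetical numbers: 8 blocks, 40-cycle transfer, 100-cycle compute.
    print("serial:", serial_cycles(8, 40, 100))
    print("double-buffered:", double_buffered_cycles(8, 40, 100))
```

In this model the total approaches the first transfer plus the slower of the two stages per block, so when compute dominates, the core stays busy after the initial load.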