8:30am | Welcome Talk from Organizers |
Jiaqi Liu from HKUST and Photon Microsystems | |
Jiaqi will talk about the motivation for organizing this event and give an overview of the full day's schedule. He will also share insights and experience on the development of open-source hardware-software co-design from both the industry side and the academic side. |
8:40am | Design Open-source Chip with Open-source EDA tools and IPs |
Prof. Biwei Xie, Institute of Computing Technology, Chinese Academy of Sciences | |
Prof. Xie will introduce their work on an open-source EDA infrastructure that supports the development of high-quality EDA algorithms and tools. He will also discuss their progress on designing and taping out chips with open-source EDA tools and open-source IPs. |
9:20am | Efficient Programming on Heterogeneous Accelerators |
Prof. Peipei Zhou, University of Pittsburgh | |
In this talk, I will first discuss how new mapping solutions, i.e., composing heterogeneous accelerators within a system-on-chip with both FPGAs and AI tensor cores, achieve orders-of-magnitude energy efficiency gains compared to monolithic accelerator mapping designs for deep learning applications. Then, I will apply these mapping solutions to show how design space exploration is performed when composing heterogeneous accelerators in latency-throughput tradeoff analysis. I will further discuss how such mapping and scheduling can be applied to other computing systems, such as GPUs. We released an open-source framework, CHARM, to compile end-to-end deep learning inference models onto the reconfigurable heterogeneous computing platform ACAP: https://github.com/arc-research-lab/CHARM. |
10:00am | Coffee & Tea Break |
10:30am | Open-source Benchmark and Infrastructure for Building Circuit Foundation Models |
Prof. Cunxi Yu, University of Maryland | |
The complexity of modern hardware designs necessitates advanced methodologies for optimizing and analyzing digital systems. Recently, machine learning (ML) methodologies have emerged as potent instruments for assessing design quality-of-results at the Register-Transfer Level (RTL) or Boolean level, aiming to expedite the exploration of advanced RTL configurations. However, unlike the ML communities, the hardware design and Electronic Design Automation (EDA) communities significantly lack open-source, high-quality, and user-friendly benchmarks. In this presentation, we introduce V2PYG, an open-source framework that translates RTL designs into graph representation foundations, which can be seamlessly integrated with the PyTorch Geometric graph learning platform. Furthermore, the V2PYG framework is compatible with the open-source EDA toolchain OpenROAD, facilitating the collection of labeled datasets in a fully open-source manner. Additionally, we will present novel RTL data augmentation methods (incorporated in our framework) that enable functionally equivalent design augmentation for the construction of an extensive graph-based RTL design database. Lastly, we will showcase several use cases of V2PYG with detailed scripting examples, and a novel foundation MLCAD database. V2PYG can be found at https://yu-maryland.github.io/Verilog-to-PyG/. |
11:10am | Cross-Layer Hardware/Software Co-Design and Design Automation for Photonic AI Computing System |
Prof. Jiaqi Gu, Arizona State University | |
The proliferation of big data and artificial intelligence (AI) has motivated the investigation of next-generation AI computing hardware to support massively parallel and energy-hungry machine learning (ML) workloads. Photonic computing, or computing using light, is a disruptive technology that can bring orders-of-magnitude performance and efficiency improvements to AI/ML with its ultra-fast speed, high parallelism, and low energy consumption. There has been growing interest in using nanophotonic processors for optical neural network (ONN) inference, which can make transformative impacts in future datacenters, automotive, smart sensing, and intelligent edge. However, the substantial potential of photonic computing also brings significant design challenges, which necessitate a cross-layer co-design stack where the circuit, architecture, and algorithm are designed and optimized in synergy. In this talk, I will present my exploration to address the fundamental challenges faced by optical AI and to pioneer a hardware/software co-design methodology toward a scalable, reliable, and adaptive photonic neural accelerator design framework. First, I will delve into the critical scalability and flexibility issues of integrated photonic tensor units and present domain-specialized photonic neural engine designs for advanced AI workloads. Next, I will present efficient on-chip training and calibration frameworks to show how to build a self-learnable photonic accelerator and overcome the robustness and adaptability bottlenecks by directly training the photonic circuits in situ. Then, I will introduce AI-assisted electronic-photonic design automation with AI-based ultra-fast photonic simulation and automated circuit topology design. Finally, I will conclude the talk with future research directions for emerging domain-specific photonic AI hardware with an intelligent end-to-end co-design and automation stack, and for deploying it to support real-world applications. |
Noon | Lunch |
1:30pm | Agile hardware development: architectures and tools |
Prof. Cong (Callie) Hao, Georgia Institute of Technology | |
Agile hardware development has two key aspects: high-performance domain-specific architectures, and the design automation tools that facilitate the rapid development of such architectures. In this talk, we will discuss both aspects. First, we introduce two open-source domain-specific architectures: one for graph neural networks (FlowGNN), the other for mixture-of-experts vision transformers (Edge-MoE). Next, we introduce our open-source agile development tool, LightningSim, built on top of Vitis HLS (high-level synthesis), which significantly speeds up C/RTL co-simulation for accurate performance estimation. Third, we briefly introduce our on-FPGA profiling tool, RealProbe, which provides detailed, on-board performance profiling using HLS. |
2:10pm | Toward Practical Quantum Computing Systems with Intelligent Cross-Stack Co-Design |
Dr. Hanrui Wang, Massachusetts Institute of Technology | |
Quantum Computing (QC) has the potential to solve classically hard problems with greater speed and efficiency, and we have witnessed exciting advancements in QC in recent years. However, there remain substantial gaps between the application requirements and the available devices in terms of software framework support, efficiency, and reliability. To close the gaps and fully unleash quantum power, it is critical to perform AI-enhanced co-design across various technology stacks, from algorithm and program design, to compilation, and hardware architecture. In this talk, I will provide an overview of my contributions to architecture and system support for quantum computing. To bridge the software support gap, I will discuss FPQA-C, a compilation framework for the Field-Programmable Qubit Array (FPQA) implemented by the emerging reconfigurable neutral atom arrays. This framework leverages movable atoms for routing two-qubit (2Q) gates and generates atom movements and gate scheduling with high scalability and parallelism. To close the reliability gap, I will introduce QuantumNAS and TorchQuantum, a framework for quantum program structure (ansatz) design for variational quantum algorithms. QuantumNAS adopts an intelligent search engine and utilizes the noisy feedback from quantum devices to search for program structure and qubit mapping tailored for specific hardware, leading to notable resource reduction and reliability enhancements. Finally, I will conclude with an overview of my ongoing work and my research vision toward building software and hardware support for practical quantum advantage. |
3:00pm | Coffee & Tea Break |