Selected Projects

Compression and acceleration for Neuromorphic Computing


Brain-inspired neuromorphic computing aims to understand the cognitive mechanisms of the brain and apply them to advance various areas of computer science. Spiking neural networks (SNNs) have recently attracted extensive attention due to their low power consumption and biological plausibility. Despite these potential benefits, existing systems fail to support SNNs efficiently: they are either software-based frameworks or hardware platforms constrained by time-driven execution. I am currently designing a SW/HW co-design framework dedicated to delivering better accuracy and inference efficiency.
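To make the time-driven vs. event-driven distinction concrete, below is a minimal Python sketch of an event-driven leaky integrate-and-fire (LIF) neuron. The dynamics, threshold, and weights here are illustrative assumptions for the demo, not the framework's actual design.

```python
# Minimal sketch of an event-driven leaky integrate-and-fire (LIF) neuron.
# Parameters (tau, v_th, weights) are illustrative, not the project's values.
import math

def lif_event_driven(spike_times, weights, tau=20.0, v_th=1.0):
    """Process only incoming spike events; membrane potential is decayed
    analytically between events instead of being updated on every
    simulation tick, as a time-driven scheme would do."""
    v, t_last, out_spikes = 0.0, 0.0, []
    for t, w in sorted(zip(spike_times, weights)):
        v *= math.exp(-(t - t_last) / tau)  # closed-form leak between events
        v += w                              # integrate the incoming spike
        t_last = t
        if v >= v_th:                       # fire and reset
            out_spikes.append(t)
            v = 0.0
    return out_spikes

print(lif_event_driven([1.0, 2.0, 3.0, 30.0], [0.6, 0.3, 0.4, 0.5]))
```

The point of the event-driven formulation is that work scales with the number of spikes rather than with simulated time, which is where the efficiency advantage over time-driven execution comes from.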

Compression and acceleration for Large-Scale Models


Model sizes keep growing beyond what a single accelerator can store; for instance, the 175 billion parameters of GPT-3 require 350 GB of memory when stored in 16-bit precision. In addition, the memory required during training for activations, gradients, and optimizer states is at least three times the model's own footprint. When a large-scale model (e.g., a foundation model or LLM) is deployed in practice, it is fine-tuned on task-specific data so that it generalizes to particular downstream tasks. I am currently designing a SW/HW co-design framework dedicated to optimizing storage and execution efficiency.
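The memory figures above follow from simple arithmetic; the sketch below reproduces them, taking the 3x training multiplier as the rough rule of thumb stated above rather than an exact accounting.

```python
# Back-of-the-envelope memory estimate for a 175B-parameter model.
# The 3x training multiplier is the rough rule of thumb from the text;
# real overheads depend on the optimizer and parallelism strategy.
params = 175e9          # GPT-3 parameter count
bytes_per_param = 2     # 16-bit (fp16/bf16) storage

weight_mem_gb = params * bytes_per_param / 1e9
train_mem_gb = 3 * weight_mem_gb  # activations, gradients, optimizer state

print(f"weights:  {weight_mem_gb:.0f} GB")   # -> 350 GB
print(f"training: {train_mem_gb:.0f} GB+")   # -> 1050 GB+
```

Even the inference-only footprint already exceeds any single accelerator's memory, which is what motivates compression and partitioned execution in the first place.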

Compression and acceleration for Databases


There is a pressing demand for storing the extremely large-scale, high-dimensional data that industry and academia generate at ever-increasing speed. Data compression techniques can lower storage costs and reduce maintenance effort. I am currently working on compressing high-dimensional data while supporting an efficient storage and query engine.
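As one illustration of the kind of technique this involves, the sketch below uses product quantization, a standard method for compressing high-dimensional vectors while keeping them directly queryable. The sub-space count, codebook size, and data are hypothetical demo choices, not the engine's actual parameters.

```python
# A minimal product-quantization (PQ) sketch: compress high-dimensional
# vectors into short codes, then answer nearest-neighbor queries on the
# compressed form. All sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def kmeans(x, k, iters=10):
    """Tiny k-means, just enough to build a demo codebook."""
    cent = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((x[:, None] - cent) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (assign == j).any():
                cent[j] = x[assign == j].mean(0)
    return cent

# 1000 vectors of dimension 32, split into 4 sub-spaces of dimension 8;
# with 256 centroids per sub-space, each vector compresses to 4 one-byte codes.
data, m, k = rng.normal(size=(1000, 32)), 4, 256
subs = np.split(data, m, axis=1)
books = [kmeans(s, k) for s in subs]                 # one codebook per sub-space
codes = np.stack([np.argmin(((s[:, None] - b) ** 2).sum(-1), 1)
                  for s, b in zip(subs, books)], 1)  # shape (1000, 4)

def query(q, topn=5):
    """Asymmetric distance: compare the raw query against codebook entries,
    then look distances up per stored code instead of decompressing."""
    qs = np.split(q, m)
    tables = [((b - qs[i]) ** 2).sum(-1) for i, b in enumerate(books)]
    dist = sum(tables[i][codes[:, i]] for i in range(m))
    return np.argsort(dist)[:topn]

print(query(data[0]))  # the query vector itself should rank near the top
```

The design point worth noting is that queries never reconstruct the original vectors: distances are assembled from small per-sub-space lookup tables, so both storage and query cost drop together.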