We conduct research on systems and architecture issues in accelerating applications such as deep learning, compression algorithms, and graph processing, especially on FPGAs and GPUs. Some of our ongoing research topics are listed below; however, you are free to bring your own exciting topic.
Without a doubt, the most popular accelerator for AI today is the GPU. However, the world is heading toward the next step: AI-specific accelerators. There is still much room for improvement in accelerator design, for example, optimizing dataflows, exploiting sparse network structures, or applying processing-in-memory techniques.
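As a toy illustration of why sparse network structures matter to accelerator design, the sketch below (plain Python with NumPy; all names are hypothetical) compares a dense matrix-vector product against one that stores only the nonzero weights in CSR form and skips the zeros, which is the kind of work reduction a sparsity-aware accelerator exploits in hardware.

```python
import numpy as np

def dense_mv(W, x):
    # Dense matrix-vector product: every weight is read and multiplied,
    # even if most of the weights are zero.
    return W @ x

def to_csr(W):
    # Compress a weight matrix into CSR form: per-row column indices
    # and values of the nonzero entries only.
    indptr, indices, values = [0], [], []
    for row in W:
        nz = np.nonzero(row)[0]
        indices.extend(nz)
        values.extend(row[nz])
        indptr.append(len(indices))
    return indptr, indices, np.array(values)

def sparse_mv(indptr, indices, values, x):
    # CSR matrix-vector product: work is proportional to the number of
    # nonzeros rather than the full matrix size.
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += values[k] * x[indices[k]]
    return y

# A ~90%-sparse weight matrix: the sparse kernel touches ~10% of the entries.
rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128)) * (rng.random((128, 128)) < 0.1)
x = rng.standard_normal(128)
assert np.allclose(dense_mv(W, x), sparse_mv(*to_csr(W), x))
```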
Designing a neural architecture, especially one tailored to specialized accelerators (e.g., NPUs), is a difficult and time-consuming task. Neural architecture search (NAS) aims to solve this problem in the way everyone has had in mind: designing DNNs using DNNs.
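To make the idea concrete, here is a minimal, hypothetical sketch of the simplest NAS baseline, random search over a toy search space of depths and layer widths. The evaluation step is a dummy stand-in; a real NAS loop would train each candidate (or use a cheap proxy) and could fold in measured latency on the target NPU.

```python
import random

# Hypothetical toy search space: number of layers and per-layer width.
SEARCH_SPACE = {"depth": [2, 4, 6, 8], "width": [64, 128, 256, 512]}

def sample_architecture():
    depth = random.choice(SEARCH_SPACE["depth"])
    return [random.choice(SEARCH_SPACE["width"]) for _ in range(depth)]

def evaluate(arch):
    # Stand-in for the expensive step: in a real NAS loop this would return
    # validation accuracy, possibly penalized by hardware cost. Here we use
    # a dummy score that mildly penalizes parameter count.
    params = sum(a * b for a, b in zip(arch, arch[1:]))
    return random.random() - 1e-7 * params  # not a real metric

def random_search(trials=100):
    best_arch, best_score = None, float("-inf")
    for _ in range(trials):
        arch = sample_architecture()
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

print(random_search())
```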
To utilize multiple devices (e.g., GPUs) for high-speed DNN training, it is common to employ distributed learning. There are still many ways to improve current distributed training methods: devising new communication algorithms, smartly pipelining jobs, or changing how devices synchronize.
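The sketch below simulates the most common scheme, synchronous data-parallel SGD, in plain NumPy under toy assumptions (a linear model, devices simulated as data shards). Each "device" computes a gradient on its shard, the gradients are averaged (the all-reduce step, the usual target of communication and pipelining optimizations), and every replica applies the same update.

```python
import numpy as np

# Toy regression problem shared across simulated devices.
rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 16))
true_w = rng.standard_normal(16)
y = X @ true_w + 0.01 * rng.standard_normal(1024)

num_devices, lr = 4, 0.1
shards = np.array_split(np.arange(1024), num_devices)
w = np.zeros(16)  # replicated parameters, kept identical on every device

for step in range(200):
    # 1. Each device computes a local gradient of the MSE loss on its shard.
    local_grads = []
    for idx in shards:
        err = X[idx] @ w - y[idx]
        local_grads.append(2 * X[idx].T @ err / len(idx))
    # 2. All-reduce: average gradients across devices. In a real cluster this
    #    is the communication step that new algorithms try to accelerate.
    g = np.mean(local_grads, axis=0)
    # 3. Every device applies the same update, so the replicas stay in sync.
    w -= lr * g

print("parameter error:", np.linalg.norm(w - true_w))
```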
Many model compression techniques have been proposed to reduce the heavy computational burden inherent to DNNs. Most of them rely on the original training data to compensate for accuracy losses; without it, they can suffer significant accuracy degradation. In practice, however, the original training data is often inaccessible due to privacy or copyright issues. Our research therefore focuses on compressing neural networks while maintaining comparable accuracy, even without the original dataset.
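A minimal sketch of the data-free setting, assuming PyTorch and toy models: the original training set is never touched. Synthetic inputs (here simply Gaussian noise; real data-free methods synthesize far better surrogates) query a frozen teacher, and a smaller student is trained to match the teacher's softened outputs via the standard distillation loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Stand-in models: the teacher is treated as pretrained and frozen; only
# its outputs are accessible, never its training data.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 4.0  # distillation temperature

for step in range(1000):
    x = torch.randn(128, 32)          # synthetic stand-in inputs
    with torch.no_grad():
        t_logits = teacher(x)         # teacher's soft labels
    s_logits = student(x)
    # KL divergence between temperature-softened distributions.
    loss = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()
```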