Guangyao Li

PhD Candidate
Renmin University of China



Biography

I am a postdoctoral researcher at Tsinghua University, working with Prof. Wenwu Zhu and Prof. Xin Wang. I received my Ph.D. from Renmin University of China, supervised by Prof. Di Hu at the Gaoling School of Artificial Intelligence. My recent research interests include audio-visual learning, scene understanding, and embodied AI.

News

  • [07-2024] One paper accepted by ACMMM, thanks to all co-authors!
  • [05-2024] One paper accepted by TOMM, thanks to all co-authors!
  • [05-2024] One paper accepted by CVPR Workshop, thanks to all co-authors!
  • [07-2023] One paper accepted by ACMMM, thanks to all co-authors!
  • [05-2023] One paper accepted by INTERSPEECH (Oral), thanks to all co-authors!
  • [03-2022] One paper accepted by CVPR (Oral), thanks to all co-authors!
  • [08-2020] I will join GeWu-Lab to pursue a PhD degree at Renmin University of China!







Selected Publications



  Most recent publications on Google Scholar.

Boosting Audio Visual Question Answering via Key Semantic-Aware Cues
Guangyao Li, Henghui Du, Di Hu
Proc. ACM International Conference on Multimedia (ACM MM), 2024.

[Paper]  [arXiv]  [Code]


AVQA-CoT: When CoT Meets Question Answering in Audio-Visual Scenarios
Guangyao Li, Henghui Du, Di Hu
CVPR Sight and Sound Workshops, 2024.

[Paper]  [arXiv]  [Code]


Towards Long Form Audio-visual Video Understanding
Wenxuan Hou, Guangyao Li, Yapeng Tian, Di Hu
ACM Transactions on Multimedia Computing, Communications and Applications (TOMM), 2024.

[Project]  [arXiv]  [Code]


Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer
Yaoting Wang, Weisong Liu, Guangyao Li, Jian Ding, Di Hu, Xi Li
The 38th Annual AAAI Conference on Artificial Intelligence (AAAI), 2024.

[Paper]  [arXiv]  [Code]  [Video]


Progressive Spatio-temporal Perception for Audio-Visual Question Answering
Guangyao Li, Wenxuan Hou, Di Hu
Proc. ACM International Conference on Multimedia (ACM MM), 2023.

[Paper]  [arXiv]  [Code]


Multi-Scale Attention for Audio Question Answering (Oral)
Guangyao Li, Yixin Xu, Di Hu
Proc. Conference of the International Speech Communication Association (INTERSPEECH), 2023.

[Paper]  [arXiv]  [Code]


Learning to Answer Questions in Dynamic Audio-Visual Scenarios (Oral)
Guangyao Li, Yake Wei, Yapeng Tian, Chenliang Xu, Ji-Rong Wen, Di Hu
Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[Project]  [Paper]  [Supp]  [arXiv]  [Poster]  [YouTube]  [Bilibili]  [Code]


Self-supervised Audiovisual Representation Learning for Remote Sensing Data
Konrad Heidler, Lichao Mou, Di Hu, Pu Jin, Guangyao Li, Chuang Gan, Ji-Rong Wen, Xiao Xiang Zhu
International Journal of Applied Earth Observation and Geoinformation (IJAEOG), 2023.

[Paper]  [arXiv]  [Demo (YouTube)]



Before 2020, my research focused mainly on agricultural artificial intelligence and agricultural informatization.

A review of computer vision technologies for plant phenotyping
Zhenbo Li, Ruohao Guo, Meng Li, Yaru Chen, Guangyao Li
Computers and Electronics in Agriculture (COMPAG), 2020.

[Paper]


Shellfish Detection based on Fusion Attention Mechanism in End-to-End Network
Guangyao Li, Zhenbo Li, Chuyue Zhang, Yaodong Li, Jun Yue
Proc. Conference on Pattern Recognition and Computer Vision (PRCV), 2019.

[Paper]


Sea cucumber image dehazing method by fusion of Retinex and dark channel
Zhenbo Li, Guangyao Li, Bingshan Niu, Fang Peng
IFAC PapersOnLine, 2018.

[Paper]


Service




    Conference Reviewer:
    • IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023, 2024,
    • ACM International Conference on Multimedia (ACMMM) 2024,
    • Annual Conference on Neural Information Processing Systems (NeurIPS) 2024,
    • IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023, 2024,
    • International Joint Conference on Artificial Intelligence (IJCAI) 2023, 2024,
    • Asian Conference on Computer Vision (ACCV) 2022.
    Journal Reviewer:
    • IEEE Transactions on Multimedia (TMM),
    • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT).

Contact

Email: guangyaoli@ruc.edu.cn
Address: Lide Building