About
Pengyuan Li is a Research Staff Member at IBM Research, leading the development of the Granite Vision model, a vision-language model specifically designed for visual document understanding. Previously, he served as the Data Acquisition Lead and collected more than 10 PB of data for building Large Language, Code, and Multimodal Models at IBM.
Pengyuan’s research focuses on machine learning, multimodal data mining, document analysis, and biomedical informatics. He has served as a visiting scholar at UCLA, UBC, JHU, and Tongji University, collaborating with researchers around the world to explore innovative, cross-disciplinary ideas. He is also an Adjunct Faculty member at the Data Science Institute, University of Delaware.
News
- Aug 2025: Pengyuan gave a talk on Granite Vision Models on the OpenCV Live channel
- Jul 2025: Our Granite Vision model reached more than 100K downloads on Hugging Face
- Apr 2025: Pengyuan co-organized the ‘AI and Biodata’ workshop at the 18th Annual International Biocuration Conference
- Apr 2025: Pengyuan presented ‘GeneScribe: Leveraging large language models for gene summary generation’ at the 18th Annual International Biocuration Conference
- Mar 2025: Published Large Vision models on behalf of IBM Research on Hugging Face
- Feb 2023: Pengyuan joined UD’s Data Science Institute as an adjunct faculty member
- Jul 2021: Pengyuan joined IBM Research as a Research Staff Member