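The Vision CAIR Research Group's public GitHub presence showcases its contributions to artificial intelligence and computer vision. The organization's main languages are Python and Jupyter Notebook, and it has developed a number of well-known projects, such as MiniGPT-4 and ChatCaptioner, with applications ranging from video understanding to affective image captioning.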
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
Official Repository of ChatCaptioner
[ICML 2025] Official PyTorch implementation of LongVU
VisualGPT (CVPR 2022): GPT as a decoder for vision-language models
Open-sourced code of MiniGPT-Med
3DCoMPaT++: An improved large-scale 3D vision dataset for compositional recognition
Code for the paper: It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents, CVPR 2025
Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
Official repository for the 3DCoMPaT dataset (ECCV 2022 Oral)
Creative AI for Visual Art and Music slides and demos.
Official repository of Action-Free Guide
Creative Walk Adversarial Networks: Novel Art Generation with Probabilistic Random Walk Deviation from Style Norms
CIZSL++: Creativity Inspired Generative Zero-Shot Learning. T-PAMI under review.
Code for Wölfflin Affective Generative Analysis paper published in ICCC 2021
CS326 Practical assignment #2: few-shot classification
Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation. CVPR 2022 Workshop, ICCC 2022.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
VisionCAIR Zero-Shot Learning Research
VisionCAIR Affective and Creative AI Research
Creativity Inspired Zero-Shot Learning
A generator loss that reduces mode collapse and improves the quality of generated samples.
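Vision-CAIR mainly develops projects related to artificial intelligence and computer vision, including MiniGPT-4 and ChatCaptioner, and is committed to open-sourcing and applying these technologies.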
Vision-CAIR's main programming languages are Python and Jupyter Notebook, with some HTML and JavaScript as well. This combination supports a range of machine learning and data science projects.
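Yes, all of Vision-CAIR's repositories are public, so developers and researchers can access its open-source projects, which supports collaboration and the spread of the technology.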