Paper Review 28
- [Paper Review] SAM 2: Segment Anything in Images and Videos
- [Paper Review] LoRA: Low-Rank Adaptation of Large Language Models
- [Paper Review] RAG, Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- [Paper Review] Rich Human Feedback for Text-to-Image Generation
- [Paper Review] Generative Image Dynamics
- [Paper Review] SimCLR, A Simple Framework for Contrastive Learning of Visual Representations
- [Paper Review] Prompt-to-Prompt Image Editing with Cross Attention Control
- [Paper Review] ControlNet, Adding Conditional Control to Text-to-Image Diffusion Models
- [Paper Review] Med-PaLM M, Towards Generalist Biomedical AI
- [Paper Review] PaLM-E: An Embodied Multimodal Language Model
- [Paper Review] LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LMD)
- [Paper Review] BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing
- [Paper Review] BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
- [Paper Review] GLIP, Grounded Language-Image Pre-training
- [Paper Review] BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
- [Paper Review] Segment Anything (SAM)
- [Paper Review] Scalable Pre-training of Large Autoregressive Image Models (AIM)
- [Paper Review] GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
- [Paper Review] DiT, Scalable Diffusion Models with Transformers
- [Paper Review] DALL-E 2, Hierarchical Text-Conditional Image Generation with CLIP Latents (unCLIP)
- [Paper Review] DINOv2: Learning Robust Visual Features without Supervision
- [Paper Review] ViViT: A Video Vision Transformer
- [Paper Review] BEiT: BERT Pre-Training of Image Transformers
- [Paper Review] DALL-E, Zero-Shot Text-to-Image Generation
- [Paper Review] Learning to Generate Text-grounded Mask for Open World Semantic Segmentation (TCL)
- [Paper Review] Stable Diffusion, High-Resolution Image Synthesis with Latent Diffusion Models
- [Paper Review] CLIP, Learning Transferable Visual Models From Natural Language Supervision
- [Paper Review] MaskGIT: Masked Generative Image Transformer