A curated list of tools, models, datasets, papers, and resources related to generative artificial intelligence.
Generative AI refers to algorithms (like GANs, VAEs, and LLMs) that can generate new content—text, images, music, code, video, and more. This list includes tools for researchers, developers, artists, and entrepreneurs exploring this transformative field.
- Foundations & Concepts
- Text Generation
- Image Generation
- Music & Audio Generation
- Video Generation
- Code Generation
- Multimodal Models
- Frameworks & Libraries
- Datasets
- Learning Resources
- Communities
- Related Awesome Lists
- Introduction to Generative AI (Google) – Overview of what Generative AI is and how it works.
- GANs in 50 Lines of Code – Introductory example for Generative Adversarial Networks.
- The Illustrated Transformer – Visual explanation of the Transformer model, core to modern generative AI.
- GPT-4 – OpenAI’s large language model for text generation.
- Claude – AI assistant from Anthropic, trained with Constitutional AI.
- Bard – Google’s conversational generative AI model (since rebranded as Gemini).
- LLaMA – Meta’s open large language model.
- Mistral – Open-weight LLMs from Mistral AI, built for efficient inference.
- Stable Diffusion – Open-source image synthesis model.
- DALL·E 3 – Text-to-image model by OpenAI.
- Midjourney – High-quality AI art generator from text prompts.
- Deep Dream Generator – Tool for generating dream-like AI art using convolutional neural nets.
- Riffusion – Real-time music generation by running Stable Diffusion on spectrograms.
- Jukebox (OpenAI) – AI model for generating music with vocals.
- Soundraw – AI-powered music generator for creators.
- Magenta – Research project by Google exploring music and art generation with ML.
- Runway ML – Platform for video creation using generative models like Gen-2.
- Pika Labs – AI-generated short-form video from text prompts.
- Synthesia – Create AI-generated videos with avatars.
- Kaiber – Turn audio and prompts into stylized video.
- GitHub Copilot – AI pair programmer powered by OpenAI Codex.
- Code Llama – Meta’s code generation model.
- Replit Ghostwriter – AI coding assistant for Replit users.
- Amazon CodeWhisperer – AI coding companion from AWS.
- Gemini – Google’s multimodal foundation model.
- GPT-4V (Vision) – GPT-4 with visual input capabilities.
- CLIP – Connects vision (images) and language (text) via contrastive learning.
- BLIP – Bootstrapping Language-Image Pre-training from Salesforce.
- Hugging Face Transformers – State-of-the-art models and tools for text, vision, and audio.
- Diffusers – Library for diffusion models from Hugging Face.
- LangChain – Framework for building LLM applications with chaining and memory.
- Transformers.js – Run Hugging Face models in the browser with JavaScript.
- LAION-5B – Open large-scale dataset for training image-text models.
- Common Crawl – Petabyte-scale web crawl data for training LLMs.
- C4 (Colossal Clean Crawled Corpus) – Text corpus for language model training.
- LibriSpeech – ASR corpus with roughly 1,000 hours of read English speech.
- Awesome Prompt Engineering – Prompts, tools, and learning resources.
- Full Stack Deep Learning – Practical course for building and scaling LLM applications.
- DeepLearning.AI Courses – AI/ML courses including prompt engineering and diffusion models.
- OpenAI Cookbook – Recipes and examples for using OpenAI’s APIs.
- Hugging Face Forums – Community for open LLMs and model development.
- r/GenerativeAI – Reddit community on generative models and applications.
- Papers with Code – Research papers linked with code, datasets, and benchmarks.
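
The CLIP entry above mentions connecting images and text via contrastive learning. As a rough illustration of that idea, here is a minimal pure-Python sketch of a symmetric contrastive loss over a batch of paired embeddings. It is not CLIP's actual implementation: the function names, the toy embeddings, and the fixed `temperature=0.07` are assumptions made for this example.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross_entropy(logits, target):
    """Cross-entropy of one row of logits against the index of the correct class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair k is (image_embs[k], text_embs[k]).

    The temperature value is an illustrative assumption, fixed here for simplicity.
    """
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    n = len(images)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [[dot(i, t) / temperature for t in texts] for i in images]
    # Image-to-text: each image should pick out its own caption (row-wise).
    loss_i2t = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    # Text-to-image: each caption should pick out its own image (column-wise).
    loss_t2i = sum(
        cross_entropy([logits[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", contrastive_loss(images, matched_texts))
    print("swapped pairs:", contrastive_loss(images, swapped_texts))
```

Correctly paired embeddings give a loss near zero, while swapped pairs give a large loss, which is the signal that pulls matching image and text embeddings together during training.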
Contributions are welcome!