This is a work in progress (started 2025-02) to document my exploration of Artificial Intelligence, Large Language Models (LLM), Natural Language Models (NLM), Natural Language Processing (NLP), etc.
- Large Language Models (LLM)
  - General-purpose language models
  - Code-specialized models {GitHub Copilot (Codex), Anthropic Claude Code, StarCoder, CodeLlama, DeepSeek Coder}
- Image Generation Models
  - Text-to-Image {Stable Diffusion, Midjourney, DALL-E}
  - Image-to-Image
- Multimodal Models (Text, Image, Audio, Video) {OpenAI Sora, Google Gemini, Meta Seamless}
(Order by Name ↑)
- Alibaba
- Allen Institute for AI
- Amazon AI
- Anthropic
- Cohere
- DeepSeek
- Google AI
- Hugging Face
- IBM
- Meta AI
- Microsoft AI
- Mistral
- OpenAI
- Stability AI
- xAI
(Order by Name ↑)
- AWS Open Data
- Common Crawl
- Google Dataset Search
- Hugging Face Datasets
- Kaggle
- Microsoft Planetary Computer Data Catalog
- SNAP - Stanford Large Network Dataset Collection
- UC Irvine Machine Learning Repository
- USA Data Gov
(Order by Name ↑)
- GAIA (General AI Assistants) - Leaderboard
- GPQA (Graduate-Level Google-Proof Q&A)
- GSM8K - Leaderboard
- LiveBench - Leaderboard
- LMArena - Leaderboard
- MMLU - Leaderboard
- SWE-bench
- WebDev Arena - Leaderboard
- ollama - CLI
- ollama-ui - Simple HTML UI for Ollama. Available as Chrome extension.
- LM Studio - GUI
(Order by Name ↑)
- Allen Institute for AI - Tulu 3:405B
- Anthropic - Claude
- DeepSeek - R1
- GitHub Models
- Google AI Studio
- Granite
- Grok
- Nova
- Qwen
(Order by Name ↑)
- Ai2-Allen Institute
- ApX Machine Learning
- Arsturn
- Epoch AI
- ShinChven's Blog
(Sorted by Surname ↑)
- Dario Amodei - Anthropic
- Amanda Askell - Anthropic
- Lex Fridman - MIT
- Salim Ismail - OpenExO
- Nathan Lambert - Allen Institute for AI
- Emad Mostaque - Stability AI
- Christopher Olah - Anthropic
(Order by Name ↑)
(Sorted by Publication Date ↓)
- 2024-04-09 RULER: What's the Real Context Size of Your Long-Context Language Models?
- 2023-07-27 Universal and Transferable Adversarial Attacks on Aligned Language Models - LLM Attacks
(Sorted by Publication Date ↓)
- 2025-02-12 Unlocking the Effective Context Length: Benchmarking the Granite-3.1-8b Model
- 2025-02-02 A Detailed Analysis of Fine-Tuning, Direct Preference Optimization (DPO), and Reinforcement Learning with Verifiable Rewards (RLVR) on the LLama3.1 405B Model
- 2025-01-31 DeepSeek-V3 Explained 1: Multi-head Latent Attention
- 2025-01-19 Top 5 Mistakes to Avoid When Learning Machine Learning
- 2024-10-02 How to Get Started with Machine Learning: A Beginner’s Step-by-Step Guide
- 2024-07-13 MHA vs MQA vs GQA vs MLA
- 2020-00-00 Over 200 of the Best Machine Learning, NLP, and Python Tutorials — 2018 Edition
(Sorted by Publication Date ↓)
- 2025-02-02 Lex Fridman: DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters | Lex Fridman Podcast #459
- 2025-01-29 Peter H. Diamandis: DeepSeek vs. Open AI - The State of AI w/ Emad Mostaque & Salim Ismail | EP #146
- 2024-11-11 Lex Fridman: Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity | Lex Fridman Podcast #452
- 2020-01-10 Lex Fridman: Deep Learning State of the Art (2020)